Preparing Texts with Python
Overview
Text data is often messy and unstructured, so you may need to invest considerable time preparing files before you can begin analyzing them with computational methods. Python makes it easier to extract and clean text data at scale.
This workshop will walk participants through several common tasks for preparing text data, including:
- extracting metadata from a file,
- cleaning “messy” texts,
- removing stop-words,
- and more.
By the end of the workshop, participants will have a sample dataset that’s ready for text analysis.
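As a rough illustration of the kinds of tasks listed above, here is a minimal sketch using only Python's standard library. It is not the workshop's own material: the sample text, the "Key: value" header layout, the tiny stop-word list, and the function names (`split_metadata`, `clean_text`, `remove_stop_words`) are all assumptions made for the example.

```python
import re
import string

# Hypothetical sample: a header of "Key: value" lines, a blank line, then the body.
RAW_TEXT = """Title: A Sample Letter
Author: Jane Doe

It   was the BEST of times;
it was the worst of times!!
"""

# Tiny illustrative stop-word list (an assumption); real projects use longer lists.
STOP_WORDS = {"it", "was", "the", "of"}


def split_metadata(raw):
    """Separate leading 'Key: value' lines from the body text."""
    header, _, body = raw.partition("\n\n")
    metadata = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip header lines that contain no colon
            metadata[key.strip().lower()] = value.strip()
    return metadata, body


def clean_text(text):
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the whitespace-separated tokens that are not stop-words."""
    return [token for token in text.split() if token not in stop_words]


metadata, body = split_metadata(RAW_TEXT)
tokens = remove_stop_words(clean_text(body))
print(metadata)
print(tokens)
```

Splitting on whitespace and stripping only ASCII punctuation keeps the sketch short; texts with curly quotes, hyphenation, or other languages would likely need a more careful tokenizer.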
This workshop is designed for participants who have taken or worked through the materials for the Python for Humanists workshop series or who have some experience working with a programming language.
Instructor: Joshua Dull (DHLab)
Registration & Requirements
This workshop is open to all Yale students, faculty, and staff, but space is limited. To register, please visit the YUL Instruction Calendar.
Participants must bring a laptop with Anaconda (Python 3.7 version) already installed. If you have trouble with the installation, stop by the Digital Humanities Lab’s Office Hours, held Monday through Thursday at 3:00 p.m., for help.
Be among the first to know when new digital humanities workshops are offered by signing up for the DHLab’s newsletter.