TRIADS Spring 2025 Training Series: Introduction to Text Analysis in Python
This four-session course introduces participants to analyzing textual data using Python. We will begin by learning how to perform simple operations on text and convert text into data. This covers working with strings, text preprocessing, common NLP tasks (e.g., stemming, tokenizing), and representing text as data (e.g., bag-of-words, word embeddings). The course will then introduce methods for measuring concepts using textual data, with an overview of rule-based techniques, supervised learning, and unsupervised learning approaches. Specifically, we will cover dictionary-based methods and the use of Naive Bayes, Random Forests, and SVMs for text classification.
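To give a sense of what "converting text into data" means, here is a minimal sketch of tokenizing and building a bag-of-words representation using only the Python standard library. The function names `tokenize` and `bag_of_words` and the two example sentences are illustrative assumptions, not course materials; the sessions may use dedicated libraries instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    """Return (vocab, matrix), where matrix[i][j] counts vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    # The vocabulary is every distinct token across all documents, sorted.
    vocab = sorted(set().union(*counts))
    matrix = [[c[w] for w in vocab] for c in counts]
    return vocab, matrix


# Hypothetical toy corpus, for illustration only.
docs = ["The cat sat on the mat.", "The dog chased the cat!"]
vocab, matrix = bag_of_words(docs)
print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is one document and each column is one vocabulary word, so word order is discarded and only counts remain.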
This course is intended for graduate students, faculty, and staff from any field at WashU who are interested in quantitative text analysis and would like to become familiar with the main libraries and functions used to work with textual data in Python. Participants are expected to have basic proficiency in Python (completing the Introduction to Python training series 1 and 2 should be sufficient).
This class will be fully in-person, and participants will use their own laptops.
Time: Tuesdays and Wednesdays, 2:00 – 3:30 p.m.
Location: Olin Library Instruction Room 3
Instructor: Ishita Gopal