
OCR (Optical Character Recognition) and Text Mining for Digital Humanities (In-Person)

In this 90-minute to 2-hour workshop, explore the fundamentals and preparatory steps for text mining in digital humanities. Participants can expect a walkthrough of the main steps in a workflow, starting with the collection of data from digital archives like HathiTrust, JSTOR, and Internet Archive, followed by some practical tools for converting images and PDFs to text using OCR (Optical Character Recognition) software and optimizing those images for basic text analysis with platforms like Voyant. This introductory course is designed to provide a clear, accessible overview of the text mining process and the tools useful in that process.

Date:
Monday, March 31, 2025
Time:
4:00pm - 5:30pm
Time Zone:
Eastern Time - US & Canada
Location:
Crosland 2130

Registration is required. There are 40 seats available.

Event Organizer

Alison Valk
