This page was copied and adapted from the Boston College Libraries Text & Data Mining Guide under a Creative Commons Attribution 4.0 License. Our thanks to Boston College for developing this excellent resource and sharing it under the license!
The following is a sampling of projects that use text and data mining methods. Many of them also apply other computational and quantitative methods, as well as visualizations.
America's Public Bible (Lincoln Mullen)
Early Modern Print: Text Mining Early Printed English (Washington University in St. Louis)
Martha Ballard's Diary (Cameron Blevins)
Mining the Dispatch (Digital Scholarship Lab, University of Richmond)
Robots Reading Vogue (Yale University)
Viral Texts (Northeastern University)
HathiTrust Research Center - texts and tools for analysis, mining, and visualization
Text Mining the Novel (NovelTM) - Large-scale cross-cultural study of the novel using quantitative methods
Uses of Scale in Literary Study - Aims to demonstrate new methodologies, reduce barriers to entry for scholars, and share resources for normalizing large collections of texts
Ben M. Schmidt - text mining and data visualization with a focus on history, politics, and current media and social issues
Image Mining (Miriam Posner) - materials and a post on image and text mining (with B. Schmidt) for a medical history workshop at the National Library of Medicine. She also writes on a variety of digital humanities topics and tools
Matthew L. Jockers - exploration of text mining and sentiment analysis with examples and documentation
Ted Underwood - text mining and modeling eighteenth- and nineteenth-century literary texts
Tidy Topic Modeling (Julia Silge & David Robinson) - explores using tidy text principles to create topic models on works by Dickens, Wells, Verne, and Austen