Case Study with John Walsh of HathiTrust

The mission of the HathiTrust Research Center (HTRC) is to provide tools, environments, and services for computational research on the content of the 17-million-volume HathiTrust Digital Library. In this talk, I will provide an overview of the Text Data Mining (TDM) activities and services provided by HTRC, with additional detail on two current initiatives: Scholar Curated Worksets for Analysis, Re-use, and Dissemination (SCWAReD), supported by the Andrew W. Mellon Foundation, and Tools for Open Research and Computation with HathiTrust: Leveraging Intelligent Text Extraction (TORCHLITE), supported by the National Endowment for the Humanities.
John A. Walsh

Director, HathiTrust Research Center

John A. Walsh is the Director of the HathiTrust Research Center and Associate Professor of Information and Library Science in the Luddy School of Informatics, Computing, and Engineering at Indiana University. His research applies computational methods to the study of literary and historical documents. Walsh is an editor of digital scholarly editions, including the Petrarchive, the Algernon Charles Swinburne Project, and the Chymistry of Isaac Newton. He developed Comic Book Markup Language (CBML), for scholarly encoding of comics and graphic novels, and TEI Boilerplate, for publishing documents encoded according to the Text Encoding Initiative (TEI) Guidelines for Electronic Text Encoding and Interchange. He is the founding Technical Editor and a current General Editor of Digital Humanities Quarterly, an open-access online journal published by the Alliance of Digital Humanities Organizations. Walsh’s research interests include computational literary studies; textual studies and bibliography; text technologies; book history; 19th-century British literature, poetry and poetics; and comic books.