NISO Virtual Conference, Text and Data Mining, May 25, 2022

Not so long ago, Text and Data Mining (TDM), the automated detection of patterns and extraction of knowledge from machine-readable content or data, was a particular area of interest. So much so that libraries and content providers developed licensing language and other resources to support researchers wanting to work with and manipulate this material, including a proliferation of LibGuides and APIs. But where are we now in identifying available resources and tools for TDM activities? This virtual conference will provide an “explainer” for information professionals tasked with supporting researchers who are just beginning to engage with TDM and are wondering how to pull the data they need, how it is structured, and how they can expect to engage with it. Our speakers will cover essential technology, how it is deployed and used, the scope of support that the library may be asked to provide, and the spectrum of options for collaboration between information professionals and content and service providers.
54 Videos
NISO Virtual Conferences

These half-day events cover a range of important and timely topics in more depth than our monthly webinars. With expert speakers from across the information community, they include a mix of formats — keynotes, case studies, perspectives, and vision interviews. Recordings are shared immediately with registered participants, and made openly available after two years.
3 Videos
Huajin Wang

Liaison Librarian, Carnegie Mellon University Libraries

Huajin Wang is a Liaison Librarian at Carnegie Mellon University Libraries, and the Program Director for Open Science & Data Collaborations. She is passionate about helping the research community with data needs, fostering collaboration across disciplinary boundaries, facilitating scientific data reuse and reproducibility, and engaging various stakeholders to build a healthy research data ecosystem. Huajin holds a PhD in Cell Biology, and had more than 10 years of research experience in membrane trafficking, lipid metabolism, and bioinformatics before joining the libraries.
2 Videos
John A. Walsh

Director, HathiTrust Research Center

John A. Walsh is the Director of the HathiTrust Research Center and Associate Professor of Information and Library Science in the Luddy School of Informatics, Computing, and Engineering at Indiana University. His research applies computational methods to the study of literary and historical documents. Walsh is an editor of digital scholarly editions, including the Petrarchive, the Algernon Charles Swinburne Project, and the Chymistry of Isaac Newton. He developed Comic Book Markup Language (CBML), for scholarly encoding of comics and graphic novels, and TEI Boilerplate, for publishing documents encoded according to the Text Encoding Initiative (TEI) Guidelines for Electronic Text Encoding and Interchange. He is the founding Technical Editor and a current General Editor of Digital Humanities Quarterly, an open-access online journal published by the Alliance of Digital Humanities Organizations. Walsh’s research interests include computational literary studies; textual studies and bibliography; text technologies; book history; 19th-century British literature, poetry and poetics; and comic books.

1 Video
Nathan Kelber

Director, Text Analysis Pedagogy Institute, JSTOR Labs

Dr. Nathan Kelber is the Constellate Education Manager for JSTOR Labs and Director of the Text Analysis Pedagogy Institute. He is an international speaker and educator on “how to apply text analysis in higher education and cultural heritage institutions.” Part digital humanist and part data scientist, his projects bridge research communities, improve access to open educational resources, and apply machine learning for social good.
2 Videos
Petr Knoth

Senior Research Fellow in Text and Data Mining, Open University

Petr Knoth leads the Big Scientific Data and Text Analytics Group (BSDTAG) undertaking R&D in the domains of text-mining, digital libraries and open access/science. Dr. Knoth is the founder and head of CORE, a large full-text aggregator of open access papers with millions of monthly active users. CORE makes research papers available for people to freely discover and access and for machines to text-mine. Previously, Dr. Knoth worked as a Senior Data Scientist at Mendeley on information extraction and content recommendation for research and has a deep interest in the use of AI to improve research workflows. Dr. Knoth co-founded aiming to go beyond bibliometrics and altmetrics to produce new research evaluation methods that make use of the publication full-texts in research assessment. Dr. Knoth has been involved as a researcher and Principal Investigator (PI) in over 20 European Commission, national and internationally funded research projects in the areas of text-mining, open science and eLearning.
1 Video
Prathik Roy

Product Director, Data Solutions and Strategy, Springer Nature

Dr. Prathik Roy is the Product Director of Data Solutions and Strategy at Springer Nature, New York. He leads product and business development efforts and is an advocate for agile adaptation of data & analytics, ML and AI in the scientific space. Dr. Prathik has been part of the scientific publishing industry for over a decade through various editorial and advisory roles across top publishers. He is a certified Data & Product Strategist from MIT Sloan & Kellogg’s Northwestern University and a former MacDiarmid fellow from University of Canterbury, New Zealand. Dr. Prathik obtained his Ph.D. in Chemistry from National Taiwan University, is widely published, and is ranked among the top 0.5% of scientists in the world in Chemistry & Material Science.
1 Video
Shyama Saha

Senior Machine Learning/Text-mining Scientist, Literature Service, EMBL-EBI

Shyama is currently working as a senior machine learning scientist for the Literature Service Team, EMBL-EBI. She holds a PhD in Bioinformatics. Her research interests center on leveraging machine learning and natural language processing to improve human health. She has years of experience in the use of machine learning and method development for the biomedical domain.