Matthew L. Jockers, Stanford University
Benjamin M. Schmidt, Princeton University
Tim Sherratt, National Museum of Australia
Stéfan Sinclair, McGill University
Session Abstract
Most online historical research is based on close reading of texts (as is most non-digital historical research). In recent years, however, historians and other humanists have begun to consider how to draw inferences from bodies of text far too large for anyone to read in a lifetime. In this roundtable session, five practitioners of text mining will discuss the strengths and limitations of computational tools for several kinds of analysis that can supplement and extend traditional modes of reading. These include (1) automatically classifying documents into categories, (2) measuring the similarity or dissimilarity of texts, (3) identifying and extracting entities such as names, dates, institutions, events, and causal links, (4) inferring social relationships and visualizing networks of interaction and exchange, and (5) automatically summarizing documents. Each panelist has extensive, hands-on experience with the techniques under discussion, and together they represent a variety of disciplinary and national perspectives.