Visualizing Literary History: Topic Modeling, Network Analysis, and the German Novel, 1731–1864

Friday, January 4, 2013: 9:30 AM
Rhythms Ballroom 2 (Sheraton New Orleans)
Matt Erlin, Washington University in St. Louis
My paper will explore the use of network diagrams as a way to visualize relationships across a series of eighteenth- and nineteenth-century German novels. Over the past two years, I have been using a technique called probabilistic topic modeling to test a set of longstanding assumptions about the periodization of German literary history. Scholars have applied a fairly consistent set of period designations to categorize German literature written during the span of roughly one hundred years between 1750 and 1850. Applying the MALLET topic modeling toolkit to a data set of 150 novels written between 1731 and 1864, I have been evaluating whether these novels do in fact cluster together in ways that support the scholarly consensus, or whether there might be hidden thematic structures in these works that point to new ways of thinking about their resemblances to one another.

A key component of the project involves the search for compelling ways to visualize the “proximity” among texts as measured across multiple variables. Network diagrams offer a promising model for such visualizations, allowing one to manipulate edge weights and colors to indicate levels of semantic similarity and node size to highlight differing degrees of connectivity. At the same time, however, the very notion of thinking about a collection of novels (as opposed to individuals) in terms of a “network” raises unique challenges. Moreover, the number of novels involved as well as the number of potential connections can quickly lead to indecipherable visualizations. My remarks will focus in particular on the possibility of addressing this latter problem through ego networks as well as serial images that help reduce complexity by limiting the representations to one variable per frame. While my own interest is in works of literature, the issues addressed are relevant to any text-based historical analysis.
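The kind of graph described above can be sketched in a few lines. This is only an illustration under stated assumptions: each novel is reduced to a vector of topic proportions (for example, the document-topic output of a topic model), proximity is measured by cosine similarity, and an edge is drawn only above a chosen threshold. None of these choices is specified in the abstract, and the labels, numbers, and function names below are invented placeholders rather than project data.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_graph(doc_topics, threshold):
    """Build an undirected weighted graph as {node: {neighbor: weight}}.

    An edge links two texts only if their similarity reaches the threshold,
    which keeps the diagram from becoming a fully connected tangle.
    """
    titles = sorted(doc_topics)
    graph = {t: {} for t in titles}
    for i, a in enumerate(titles):
        for b in titles[i + 1:]:
            w = cosine(doc_topics[a], doc_topics[b])
            if w >= threshold:
                graph[a][b] = w
                graph[b][a] = w
    return graph

def ego_network(graph, ego):
    """Keep only the ego node, its direct neighbors, and edges among them."""
    keep = {ego} | set(graph[ego])
    return {n: {m: w for m, w in graph[n].items() if m in keep} for n in keep}

def weighted_degree(graph):
    """Sum of edge weights per node; one possible basis for node size."""
    return {n: sum(nbrs.values()) for n, nbrs in graph.items()}

# Hypothetical topic proportions for four placeholder texts (not real data).
doc_topics = {
    "novel_a": [0.7, 0.2, 0.1],
    "novel_b": [0.6, 0.3, 0.1],
    "novel_c": [0.1, 0.1, 0.8],
    "novel_d": [0.1, 0.2, 0.7],
}

graph = similarity_graph(doc_topics, threshold=0.9)  # threshold is an assumption
ego = ego_network(graph, "novel_a")
sizes = weighted_degree(graph)
```

Restricting the view to one text's ego network, or redrawing the graph once per topic, are two ways to keep the number of visible edges manageable as the corpus grows.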
