The Foreign Relations of the United States Series as a Case Study in Digital History Methods with the Text Encoding Initiative, eXist, and XQuery

Sunday, January 10, 2016: 12:00 PM
Room A706 (Atlanta Marriott Marquis)
Joseph C. Wicentowski, Office of the Historian, United States Department of State
In 2009 the State Department’s Office of the Historian embarked on a multi-year program to digitize the entire Foreign Relations of the United States (FRUS) series, the official documentary record of U.S. foreign policy, which dates back to 1861 and consists of over 500 volumes and hundreds of thousands of annotated primary sources from the National Archives, presidential libraries, and agency records. While the scope of this project may be beyond what an individual scholar might take on, the approach that the Office took is widely applicable across the profession because of its combination of methodological rigor, low cost, and accessibility to scholars in the humanities. From among many competing formats and standards, the Office selected the Text Encoding Initiative (TEI) as the master digital format for the text. TEI is a flexible, media-neutral, humanities-oriented, non-proprietary, open standard for capturing texts in digital form. It allows scholars to add their own annotations atop a text, creating distinct layers of analysis that together enable forms of research not previously feasible using print editions or even other digital formats. Besides standardizing on TEI, the Office selected eXist, a free, open-source software package, to power its public research portal. This software, which supports the high-level XQuery programming language, allows historians to explore and enrich digital texts. Both within and outside the State Department, historians have put the Foreign Relations digital archive to use, some employing advanced techniques in natural language processing, topic modeling, and data visualization. This paper uses the Foreign Relations digital archive as a case study to show how historians can use these technologies and standards (TEI, eXist, and XQuery) as the core of a research and teaching agenda in digital history.
