Min(d)ing the Gap: Citation Preservation as a Tool to Open Paywalled Sources to Computational Analysis
The historical analysis of memory building in medieval saints' lives at the core of this paper rests on data curation designed to preserve citation data, which is often lost or obscured in topic modeling and natural-language processing. The paper uses this memory-building argument as a case study to demonstrate a process for scraping, cleaning, and importing paywalled sources, and then adding a layer of natural-language processing that preserves the citation data scholars need to participate in historiographic debate.
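As a rough illustration of what citation preservation can mean in practice, the sketch below keeps a citation record attached to each passage through cleaning and tokenization, so that any term found computationally can be traced back to a citable location. It is a minimal sketch under stated assumptions, not the paper's actual pipeline: the `Citation` fields, the function names, and the sample passages are all hypothetical placeholders.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record; the fields are illustrative assumptions."""
    work: str
    edition: str
    page: int


@dataclass(frozen=True)
class Passage:
    """A cleaned passage of text paired with the citation it came from."""
    text: str
    citation: Citation


def clean(raw: str) -> str:
    """Strip leftover HTML tags and collapse whitespace in a scraped passage."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_index(passages: list[Passage]) -> dict[str, list[Citation]]:
    """Map each token to the citations of every passage in which it occurs."""
    index: dict[str, list[Citation]] = defaultdict(list)
    for passage in passages:
        # Use a set so a passage is cited once per token, however often it repeats.
        for token in set(tokenize(passage.text)):
            index[token].append(passage.citation)
    return index


# Placeholder data standing in for scraped source text (not real sources).
cit_a = Citation(work="Vita A (placeholder)", edition="Example edition", page=12)
cit_b = Citation(work="Vita B (placeholder)", edition="Example edition", page=47)

passages = [
    Passage(clean("<p>The memory of the  saint endures.</p>"), cit_a),
    Passage(clean("<p>Her memory was kept by the community.</p>"), cit_b),
]

index = build_index(passages)
```

Looking up `index["memory"]` returns both placeholder citations, which is the property the abstract describes: the computational layer points back to pages a scholar can cite. A topic-modeling step could carry the same `Citation` objects alongside its document identifiers.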
A set of online documentation and resources for replicating the process will accompany the paper.