Saturday, January 7, 2012: 2:30 PM
Chicago Ballroom A (Chicago Marriott Downtown)
I will focus on the issues that need to be resolved before any data are entered. These are generic questions that apply to all types of historical information that can be represented numerically in digital form (public opinion surveys, electoral behavior, vital statistics, economic or demographic data), regardless of whether one samples or uses full counts. I have directed the Guadalajara Census Project since its inception in the 1990s, and it is this experience that informs my presentation. I will briefly outline the initial steps in historical database construction (preliminary studies, university support, outside funding) and then concentrate on the critical questions of database organization and appropriate software programs. In particular, I will address why we chose to enter the data in a rectangular rather than a hierarchical or relational format, why we initially chose Microsoft Excel, and why we replaced it with SPSS (and not SAS or Stata, for example). I will explore newer database software packages (e.g., Pajek) and suggest ways that our procedures might be altered by recent software advances. I will also provide details of several near disasters caused by internal breaches in database security, and explore the lessons learned. Essentially, my job here is to present the conceptual rationales for policy decisions made prior to data entry, and to specify the procedures and policies that had to be changed in light of practical, “real time” considerations. Other panelists will detail how the modifications were arrived at during the various phases of the project, and how they were implemented at the “shop floor” level.
See more of: Digital Research Learning Curve: Practical Lessons from a Seven-Year Historical Census Database Project
See more of: AHA Sessions