1
IASSIST 2007 Montreal, Quebec
Data Preservation Alliance for the Social Sciences: A Model for Collaboration
2
The Odum Institute
The Odum Institute is the oldest institute at America’s first public university. Established in 1924, Odum is a multidisciplinary institute for research in the social sciences. The Institute maintains one of the country’s largest archives of computer-readable social science data. Holdings include national and international economic, electoral, demographic, financial, health, public opinion, and other types of data to meet a variety of research and teaching needs.
4/14/2019
3
Odum Data-PASS Focus Areas
Virtual Data Center Adoption
National Network of State Polls (NNSP)
Harris Polls
Private Research Organizations
Distributed Storage
4
Partnering with the Harvard-MIT Data Center in adopting VDC
Spires/OpenText > DDI/XML
Coordinated Question Level Search Modification
Currently testing the next generation, called the Dataverse Network
5
National Network of State Polls & Harris Polls
Working with polling agencies to close gaps in the historical collection
Assisting NNSP members with the ingest process
Building on existing relationships
Partnering with Harris Interactive to find missing surveys in the Louis Harris Data Center at the Odum Institute
6
Private Research Organizations
The late 1940s and 1950s witnessed the rise of private organizations and firms that deal almost exclusively in the production and analysis of information, knowledge, and public policy. These organizations are potentially a major source of social science research on important public policy issues. Organizations such as Research Triangle Institute (RTI International), the National Opinion Research Center (NORC), Westat, and Abt Associates have played important roles in the advancement of scientific research in the social sciences.
7
Saving the Kennedy Assassination Study
Following 9/11, NORC researchers wanted to replicate questions from the 1963 assassination study
Data could not be found until the old card catalog was found and pointed to holdings of boxes in storage
After six weeks, ten boxes of punched cards were retrieved
Cards were hand-delivered to a New York firm
Card reader was refurbished
Data/documentation needed to be interpreted
Multi-punched to single-punched conversion
With persistent effort a clean data file emerged and was archived

Example of the “Data Rescue” Process

Identify
Just after the tragic events of September 11, 2001, Tom Smith was reminded of the study NORC completed just after another American tragedy, the John F. Kennedy assassination. He remembered the detailed questions gauging how Americans were coping in 1963 and wondered how those strategies compared to the current situation. He and his colleagues set out to replicate these questions in the National Tragedy Study. First, he needed that early survey.

Locate
The archives didn’t have that 1963 study. Internal databases of records were consulted, with no indication that the data or the punched cards existed in the 20,000+ cubic feet of storage. A retired NORC librarian was called in who thought the cards were in storage. Old hard-copy inventories of the materials in bulk storage were reviewed, but they could not definitively confirm or rule out the existence of the cards. A second former employee was contacted who recalled the existence of an older card catalog listing the holdings of some of the boxes in storage. After six weeks, ten boxes were retrieved from an off-site storage facility in Chicago.

Data Conversion
The data had to be interpreted and converted to a current machine-readable medium.
The 38-year-old cards had to be read
The data had to be converted from multiple-punched data to single-punched data
National Data Conversion Institute in NYC could read the cards. Only a single set of the cards existed, so the cards were hand-delivered to New York.
Complications arose in reading the “near perfect” card collection.
The card reader needed refurbishing
The first test file was corrupted
The firm didn’t know how to spread multiple-punched data

Preservation
Ultimately, with senior NORC staff working diligently with the Institute, a clean data file emerged. With the data fully recovered, NORC created a final SPSS system file with detailed labels and archived it with the Roper Center. If Tom Smith had not worked with these data in the mid-1970s, the data would have remained a hidden treasure, because no finding aids pointed researchers to this valuable dataset. Smith and Forstrom concluded “…it took persistent efforts, the assistance of two ex-employees, and a bit of serendipity to unearth the data. Moreover, once recovered we had data on a medium that was so antiquated that it took four months of extensive efforts to convert it to a modern, user-friendly format.”
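The multiple-punch to single-punch conversion mentioned in the notes above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: a 12-row card column (rows 12, 11, 0, 1 through 9) in which each punch records a separate response, with a column given as a set of row labels. The names ROWS, spread_column, and spread_card are hypothetical and do not describe the software NORC or the conversion firm actually used.

```python
# Row labels of one punched-card column, top to bottom (assumed 12-row layout).
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def spread_column(punches):
    """Spread one multiple-punched column into twelve 0/1 fields, one per row."""
    punched = set(punches)
    unknown = punched - set(ROWS)
    if unknown:
        raise ValueError(f"unrecognized row labels: {sorted(unknown)}")
    return [1 if row in punched else 0 for row in ROWS]


def spread_card(columns):
    """Apply spread_column to every column of a card (hypothetical helper)."""
    return [spread_column(column) for column in columns]


# A column punched in rows 1 and 3 becomes two separate "yes" fields.
example = spread_column({"1", "3"})
```

Each output field then holds a single value, which is the form that statistical packages expect for one variable per response.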
8
Data-PASS Efforts
Roper is negotiating with the National Opinion Research Center (NORC) to preserve valuable datasets
Odum has been working with RTI International, a private research organization located in Research Triangle Park, NC, to develop a strategy for requesting data from PROs nationwide.
9
Roper Center Archive of public opinion survey data
Established in 1947 at Williams College
Core historical data collections:
Gallup Polls, 1936-present
Fortune Magazine surveys
American Soldiers Surveys
Data files for over 15,000 surveys
10
Roper Center – NARA
Objective of collaboration
Recover, preserve, document and make accessible the United States Information Agency (USIA) Office of Research surveys
USIA Data Collection
Estimated at over 2,000 surveys
Survey results contributed to the formulation of US foreign and defense policy
Some surveys are the only opinion surveys available from certain countries
11
Roper Center – NARA Leveraging relative strengths (NARA)
NARA provides:
structure for working with the State Department in the context of its mandate to preserve federal electronic records
standards for appraising, cataloging and preserving electronic records
permanent storage and file-level access for all materials related to the collection
access to additional USIA records, reports and related federal government records
12
Roper Center – NARA Leveraging relative strengths (Roper)
Roper Center provides:
potential flexibility in communications and approach (federal agency-to-agency protocols may not be as flexible as required for a project of this type)
experience working with a variety of organizations to acquire data resources
active migration and management of data
more streamlined access to data-based materials
access to related public opinion survey data from the private sector and non-federal public sector
13
Benefits to Cooperation
Preservation of valuable datasets
Many researchers gain access to additional material
PROs gain electronic access to previous work
PROs receive digital curation assistance
Potential to reduce PROs’ storage costs
14
Barriers to Preserving Data
Contract restrictions
High ingest costs
Poor metadata
Labor-intensive operations
Uniqueness requires custom solutions
PROs’ lost opportunity costs
Overhead associated with building relationships
15
Questions from PRO Business Offices
If datasets are assets, what is their value to our PRO?
Can they be used to leverage existing research or identify new areas of interest?
Do datasets have value to other organizations that might be willing to pay for them?
What legal and technical issues are involved?
What costs are associated with dataset archival?
How can we build a business case for preservation at our PRO?
16
Approaching PROs
Do background research
Build on existing relationships
Assess different “PRO” business models
Contractors for hire
PROs with their own research agenda
17
Early research data life cycle intervention
Assist researchers and PROs with preservation requirements during the proposal process
Preparing for preservation at the proposal level can ensure that:
Datasets are “born digital”, making preservation affordable
PRO business models are more affordable
18
Preserving Future Studies
Funding agencies are key
Ultimate owners of research data
Could request and enforce archival of data
Issues are easily addressed early in the life cycle
Single point of contact for many PROs
19
Digital Curation Keys to Dataset Collection Development
Knowing the producers’ & consumers’ needs
Educating producers on preservation requirements
Early involvement in the research data life cycle
Building and maintaining relationships
20
Thank You