IASSIST 2007 Montreal, Quebec


1 IASSIST 2007 Montreal, Quebec
Data Preservation Alliance for the Social Sciences: A Model for Collaboration IASSIST 2007 Montreal, Quebec

2 The Odum Institute
The Odum Institute is the oldest institute at America’s first public university. Established in 1924, Odum is a multidisciplinary institute for research in the social sciences. The Institute maintains one of the country’s largest archives of computer-readable social science data. Holdings include national and international economic, electoral, demographic, financial, health, public opinion, and other types of data to meet a variety of research and teaching needs.
4/14/2019

3 Odum Data-PASS Focus Areas
Virtual Data Center Adoption
National Network of State Polls (NNSP)
Harris Polls
Private Research Organizations
Distributed Storage

4 Partnering with Harvard-MIT Data Center in adopting VDC
Spires/OpenText > DDI/XML
Coordinated Question Level Search Modification
Currently testing next generation called Dataverse Network

5 National Network of State Polls & Harris Polls
Working with polling agencies to close gaps in historical collection
Assisting NNSP members with the ingest process
Building on existing relationships
Partnering with Harris Interactive to find missing surveys in the Louis Harris Data Center at the Odum Institute

6 Private Research Organizations (PROs)
The late 1940s and 1950s witnessed the rise of private organizations and firms that deal almost exclusively in the production and analysis of information, knowledge, and public policy. These organizations are potentially a major source of social science research on important public policy issues. Organizations such as Research Triangle Institute (RTI International), the National Opinion Research Center (NORC), Westat, and Abt Associates have played important roles in the advancement of scientific research in the social sciences.

7 Saving the Kennedy Assassination Study
Following 9/11, NORC researchers wanted to replicate questions from the 1963 assassination study
Data could not be found until an old card catalog was found that pointed to holdings of boxes in storage
After six weeks, ten boxes of punched cards were retrieved
Cards were hand-delivered to a New York firm
Card reader was refurbished
Data/documentation needed to be interpreted
Multi-punched to single-punched conversion
With persistent effort a clean data file emerged and was archived

Example of the “Data Rescue” Process

Identify
Just after the tragic events of September 11, 2001, Tom Smith was reminded of the study NORC completed just after another American tragedy, the John F. Kennedy assassination. He remembered the detailed questions gauging how Americans were coping in 1963 and wondered how those strategies compared to the current situation. He and his colleagues set out to replicate these questions in the National Tragedy Study. First, he needed that early survey.

Locate
The archives didn’t have that 1963 study.
Internal databases of records were consulted with no indication that the data or the punched cards existed in the 20,000+ cubic feet of storage.
A retired NORC librarian was called in who thought the cards were in storage.
Old hard-copy inventories of the materials in bulk storage were reviewed, but they could not definitively confirm or rule out the existence of the cards.
A second former employee was contacted who recalled the existence of an older card catalog listing the holdings of some of the boxes in storage.
After six weeks, ten boxes were retrieved from an off-site storage facility in Chicago.

Data Conversion
The data had to be interpreted and converted to a current machine-readable medium.
The 38-year-old cards had to be read.
The data had to be converted from multiple-punched data to single-punched data.
National Data Conversion Institute in NYC could read the cards.
A single set of the cards existed, so the cards were hand-delivered to NY.
Complications arose in reading the “near perfect” card collection:
  The card reader needed refurbishing
  The first test file was corrupted
  The firm didn’t know how to spread multiple-punched data

Preservation
Ultimately, with senior NORC staff working diligently with the Institute, a clean data file emerged. With the data fully recovered, NORC created a final SPSS system file with detailed labels and archived it with the Roper Center.
If Tom Smith had not worked with these data in the mid-1970s, the data would have remained a hidden treasure: no finding aids pointed researchers to this valuable dataset.
Smith and Forstrom concluded: “…it took persistent efforts, the assistance of two ex-employees, and a bit of serendipity to unearth the data. Moreover, once recovered we had data on a medium that was so antiquated that it took four months of extensive efforts to convert it to a modern, user-friendly format.”
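
The multi-punched to single-punched conversion mentioned on this slide can be illustrated with a minimal sketch. It assumes a standard 12-row card (rows 12, 11, 0, 1-9) and represents each column as the set of rows punched in it; "spreading" then turns every column into twelve single-valued 0/1 fields. The slides do not describe the procedure NORC or the conversion firm actually used, so the function names and the input representation here are hypothetical.

```python
# Row labels of a standard 12-row punched card, top to bottom (assumed layout).
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def spread_column(punches):
    """Spread one (possibly multi-punched) column into 12 single 0/1 fields.

    `punches` is the set of row labels punched in that column.
    """
    unknown = set(punches) - set(ROWS)
    if unknown:
        raise ValueError(f"unexpected punch rows: {sorted(unknown)}")
    return [1 if row in punches else 0 for row in ROWS]


def spread_card(card):
    """Spread every column of a card; `card` is a list of sets, one per column."""
    flat = []
    for punches in card:
        flat.extend(spread_column(punches))
    return flat


if __name__ == "__main__":
    # Hypothetical two-column card: rows 1 and 3 punched in the first column,
    # row 12 alone in the second.
    example = [{"1", "3"}, {"12"}]
    print(spread_card(example))
```

Each original column becomes twelve indicator variables, so no response is lost when a respondent's answers were stacked in one column, at the cost of a wider file.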

8 Data-PASS Efforts
Roper is negotiating with the National Opinion Research Center (NORC) to preserve valuable datasets
Odum has been working with RTI International, a private research organization located in Research Triangle Park, NC, to develop a strategy for requesting data from PROs nationwide

9 Roper Center
Archive of public opinion survey data
Established in 1947 at Williams College
Core historical data collections:
  Gallup Polls, 1936-present
  Fortune Magazine surveys
  American Soldiers Surveys
Data files for over 15,000 surveys

10 Roper Center – NARA
Objective of collaboration:
  Recover, preserve, document and make accessible the United States Information Agency (USIA) Office of Research surveys
USIA Data Collection:
  Estimated at over 2,000 surveys
  Survey results contributed to formulation of US foreign and defense policy
  Some surveys are the only opinion surveys available from certain countries

11 Roper Center – NARA Leveraging relative strengths (NARA)
NARA provides:
structure for working with the State Department in the context of its mandate to preserve federal electronic records
standards for appraising, cataloging and preserving electronic records
permanent storage and file-level access for all materials related to the collection
access to additional USIA records, reports and related federal government records

12 Roper Center – NARA Leveraging relative strengths (Roper)
Roper Center provides:
potential flexibility in communications and approach
  federal government agency-to-agency protocols may not be as flexible as required for a project of this type
experience working with a variety of organizations to acquire data resources
active migration and management of data
more streamlined access to data-based materials
access to related public opinion survey data from the private sector and non-federal public sector

13 Benefits to Cooperation
Preservation of valuable datasets
Many researchers gain access to additional material
PROs gain electronic access to previous work
PROs receive digital curation assistance
Potential to reduce PROs’ storage costs

14 Barriers to Preserving Data
Contract restrictions
High ingest costs
Poor metadata
Labor-intensive operations
Uniqueness requires custom solutions
PROs’ lost opportunity costs
Overhead associated with building relationships

15 Questions from PRO Business Offices
If datasets are assets, what is their value to our PRO?
Can they be used to leverage existing research or identify new areas of interest?
Do datasets have value to other organizations that might be willing to pay for them?
What legal and technical issues are involved?
What costs are associated with dataset archival?
How can we build a business case for preservation at our PRO?

16 Approaching PROs
Do background research
Build on existing relationships
Assess different “PRO” business models
  Contractors for hire
  PROs with their own research agenda

17 Early research data life cycle intervention
Assist researchers and PROs with preservation requirements during the proposal process
Preparing for preservation at the proposal level can ensure that:
  Datasets are “born digital”, making preservation affordable
  PRO business models are more affordable

18 Preserving Future Studies
Funding agencies are key
  Ultimate owners of research data
  Could request and enforce archival of data
  Issues are easily addressed early in the life cycle
  Single point of contact for many PROs

19 Digital Curation Keys to Dataset Collection Development
Knowing the producers’ & consumers’ needs
Educating producers on preservation requirements
Early involvement in the research data life cycle
Building and maintaining relationships

20 Thank You

