Models of Data Archiving Services: The Results of an International Survey; Chuck Humphrey, University of Alberta; IASSIST 2003: Strength in Numbers
Outline Context for the Survey; The Survey Instrument; The Sampling Frame; The Results; Applying the Results in Canada
The National Data Archive Consultation (NDAC) The National Data Archive Consultation, a joint initiative of the National Archives and the Social Sciences and Humanities Research Council, began in October. The Consultation consisted of two phases.
The NDAC: Phase I The objective of Phase I was to demonstrate the need for national data archiving services in Canada.
The NDAC: Phase I Yvette Hackett, “A National Research Data Management Strategy for Canada: The Work of the National Data Archive Consultation Working Group,” IASSIST Quarterly, Vol. 25 (3), pp
The NDAC: Phase I A report was completed in May 2001 and approved by the Council of the Social Sciences and Humanities Research Council and the National Archivist in June. This approval gave the green light to undertake Phase II.
The NDAC: Phase II One objective of Phase II was to recommend the institutional form that a national data archiving service should take in Canada. A Research Sub-Committee was formed to collect evidence about existing institutional models.
The Research Sub-Committee The Research Sub-Committee consisted of Yvette Hackett, Wendy Watkins, Alexandra Bal, Ernie Boyko, Douglas Hodges, Geoffrey Rockwell, and Chuck Humphrey.
The Research Sub-Committee A team of three graduate assistants supported the sub-committee: Allison Sivak, Stephen Carney, and Bojan Korenic.
The Research Sub-Committee A literature review of data archiving services and institutions was conducted. A survey of existing international organizations providing data services and archiving was administered.
The Research Sub-Committee In-depth interviews were undertaken with a few directors of long-standing national data archives.
The Survey The survey of institutions consisted of four stages: – A template of institutional characteristics was identified; – A sampling frame was built; – Information was gathered about those in the sample from Web sites and annual reports;
The Survey – Templates were then e-mailed to contacts at each institution in the sample for verification and, in some instances, for completion of missing information.
The Template Sixteen main characteristics were identified about the institutional nature of data archives and data services. Many of these sixteen consisted of multiple indicators. – For example, the budget covered both the categories of expenditure and the total size of the budget.
The Main Characteristics Institutional source of mandate; Institutional placement; Institutional partnerships; Funding sources; Budget; Total number of staff; Organizational structure; Scope of mandate; Acquisition priorities; Preservation standards & procedures; Types of users; Types of services; Performance metrics; Protection of intellectual property; Privacy & confidentiality; Technology use
The Sampling Frame Thirty-six institutions were identified by the members of the Research Sub-Committee as leading international organizations in data archiving and data services in the social sciences and humanities. – Twenty-eight in the social sciences; – Eight in the humanities.
The Sampling Frame Twenty-two participated in the follow-up survey; for the remaining fourteen, information is available only from their Web sites.
The Results Appear in Appendix C: “International Models of Data Archiving Services” of the Final Report
The Results A typology of organizational models was developed from the survey results. Three generalized models were identified that summarized groupings of the characteristics from the template.
The Results While no single existing institution is necessarily described completely by one of these three models, the typology offers a fair summary of the current mix of organizations.
The Three Models The Topical Data Archive; The Agency-based Data Archive; The Comprehensive Research Data Archive
The First Model
Topical Data Archive Narrowly focused mandate to serve a specific field or a few closely related disciplines; Located in a university setting, usually in an individual academic department; Mandate from host university;
Topical Data Archive Financial base depends on in-kind support from host; Additional financial resources derived from contract work and user fees; Organizational structure consists of one unit staffed by up to 5 FTE and an equal number of PT student employees;
Topical Data Archive The level of data processing is kept to one or two steps; Relies on a few widely accepted formats to minimize the amount of processing; Dissemination strategy depends on self-sufficient users retrieving products over the Internet;
Topical Data Archive Off-the-shelf technology is used to support the unit; The unit tends to be the only one of its kind in a country and is recognized by other institutions as the location for this service.
The Second Model
Agency-based Data Archive Located in an agency likely to be a national institute or government department with a strong research mandate; Receives its mandate from the agency in which it is located, but this mandate is only part of the host agency’s larger mandate;
Agency-based Data Archive Institutionally funded by the host agency, although it may supplement its financial base through cost-recovery charges or user fees; Some operational expenses absorbed by the host agency, such as overhead costs for administration;
Agency-based Data Archive Organizational structure consists of one to three units staffed by 10 to 30 FTE; Number of PT employees is dependent on the proximity of the service to a university and the availability of student labour; The level of data processing involves three or four steps;
Agency-based Data Archive Data received in a variety of formats and converted to the data archive’s standards (these standards are published on the Internet); Access to data available through the Internet and a mediated service, which is required because of confidentiality restrictions on some data;
Agency-based Data Archive One staff position is dedicated to negotiating and establishing privacy assurance on all data; Reference services are offered to assist users in locating data and in fielding questions from users; May not be the only data archive in a country;
Agency-based Data Archive But is recognized as the authoritative service for the type of research associated with its host agency; Staff are recognized nationally and internationally as experts; Senior staff involved in international projects and exchanges relating to common data and research interests.
The Third Model
The Comprehensive Research Data Archive Mandate arises from a shared common interest of a variety of communities, including academic researchers, policy analysts, archivists, librarians, and producers of data; A legislated mandate exists to articulate the interests of these communities;
The Comprehensive Research Data Archive Recognized as a national institution responsible for the principles established in legislation; May have more than one physical location; Financial base is institutional funding that may be supplemented with major contracts with research agencies and data producers;
The Comprehensive Research Data Archive Organizational structure consists of five or six units that support administration, archives/collections, technical support, research & development, reference services/user support, and education outreach; Up to 60 FTE and another 20 PTE;
The Comprehensive Research Data Archive The level of data processing involves up to six steps; Data in a wide range of formats are accepted and converted to an official standard; These standards are published on the Web and workshops are conducted for data producers;
The Comprehensive Research Data Archive A sub-unit is responsible for privacy issues and intellectual property to ensure rights are protected while also enabling valuable research; Conditions of access are negotiated by this sub-unit; A reference unit helps users to access and work with data;
The Comprehensive Research Data Archive Workshops are conducted to prepare users to work with data and to promote data literacy skills generally; Works closely with other national institutions to ensure the preservation of research data; Staff are nationally and internationally recognized;
The Comprehensive Research Data Archive Staff engage in national and international projects, including data exchanges on behalf of their country; The research & development unit is recognized for its adaptation and use of technology to push back the frontiers in the preservation of and access to data.
A Recommended Model
A Canadian Model Established by legislative mandate & reporting to Parliament through the Ministry of Industry or Heritage or a combination of both; Centrally funded through Parliament; Structured as a network of distributed service points with a central service facility;
A Canadian Model The central facility would be responsible for data management, standards development, and data preservation; The service points would be responsible for assisting with the deposit of data, accessing data, and training and user consultation;
A Canadian Model The service points would be located at universities or other institutions interested in providing this kind of data service (a model similar to the Depository Services Program between government publishing and Canadian libraries);
A Canadian Model A management board would oversee the operation of this National Data Archive Network and consist of representatives from the regions in Canada as well as various stakeholders that manage, use, and produce research data;
A Canadian Model Furthermore, this agency would enter into formal co-operative working relationships with other national institutions, such as the Library and Archives of Canada and Statistics Canada;
A Canadian Model The agency would be given authority to act on behalf of the Government of Canada in international negotiations related to research data and its management standards and practices.
The Next Step We await the outcome of the amalgamation of the National Archives and National Library.