DASISH WP4 Data Archiving
Digital Services Infrastructure for Social Sciences and Humanities
Vigdis Kvalheim, Norwegian Social Science Data Services (NSD)
IASSIST Toronto 2014
DASISH PM Distribution and Partners
CESSDA
  NSD, Norwegian Social Science Data Services (15 PM)
  FSD, Finnish Social Science Data Archive (2 PM)
  SND, Swedish National Data Service (5 PM)
  GESIS - Leibniz Institute for the Social Sciences (6 PM)
CLARIN
  MPG, Max Planck Institute for Psycholinguistics (6 PM)
  UiB, University of Bergen (7 PM)
DARIAH
  OEAW, Austrian Academy of Sciences (5 PM)
  DANS, Data Archiving and Networked Services (5 PM)
  UGOE, Goettingen University (6 PM)
ESS
  CITY, City University, London (2 PM)
SHARE
  CentERdata, The Netherlands (7 PM)
NSD©2014
Archiving and Curation - Access and Sharing
“DASISH will rely on common data services offered by a network of strong data centres with national backing”
Purpose:
  Assess and discuss the state of data and deposit services in the SSH domain and identify gaps, bottlenecks and requirements
  Develop and recommend requirements for deposit services that handle various types of data
  Work out and suggest policy rules and guidelines for proper data management that can be taken up by data infrastructures providing long-term preservation and curation services
WP4 Sub-tasks
Task 4.1: State-of-the-art of data preservation and curation
  Current state of data preservation and curation
  Policies and guidelines
  Requirements specification
Task 4.2: Assessment of deposit services
  Analyze and describe
  Recommendations
  Service level agreements
Task 4.3: Deposit service convergence
  Investigate existing deposit offers
  Service level agreements
  PR and training material
Task 4.4: Recommendation of a set of policy rules
  Assess the scope of policy rules and their requirements
  Establish policy rules
  Implement and test the policy framework
D4.1 and D4.2: Fact Sheets (First Year)
D4.1: http://dasish.eu/publications/projectreports/D4.1_-_Roadmap_for_Preservation_and_Curation_in_the_SSH.pdf
D4.2: http://dasish.eu/publications/projectreports/D4.2_-_Report_about_Preservation_Service_Offers.pdf
Five Level Trust Maturity Model (D4.1)
Level 1: OAIS Core Conformance
  Key guideline: Support OAIS Information Model. Acknowledge OAIS Archive responsibilities.
  Guideline source: OAIS Information Model: Section 2.2 of CCSDS 650.0-M-2 / ISO 14721:2012. OAIS Archive Responsibilities: Section 3.1 of CCSDS 650.0-M-2 / ISO 14721:2012.
Level 2: Initial self-assessment, PLATTER/DRAMBORA
  Key guideline: Self-assessment through PLATTER and DRAMBORA.
  Guideline source: PLATTER Key Self-assessment questions. DRAMBORA Key Self-assessment questions.
Level 3: Peer-reviewed self-assessment I, DSA
  Key guideline: Peer-reviewed self-assessment I, DSA.
  Guideline source: Data Seal of Approval Guidelines. Support: NESTOR criteria.
Level 4: Peer-reviewed self-assessment II, ISO 16363/DIN 31644
  Key guideline: Conformance to the OAIS Detailed Functional Model. Self-audit with ISO 16363. Alternatively, self-audit with DIN 31644.
  Guideline source: OAIS Detailed Functional Model: Section 4.1 of CCSDS 650.0-M-2 / ISO 14721:2012. CCSDS 652.0-M-1 / ISO 16363:2012. DIN 31644.
Level 5: Certification and Optimization
  Key guideline: External review and formal certification in conformance with ISO 16363. Alternatively, with DIN 31644.
  Guideline source: CCSDS 652.0-M-1 / ISO 16363:2012. DIN 31644.
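The five levels above form an ordered scale, so statements such as "DSA level or higher" can be expressed as simple comparisons. The following is a minimal sketch of that idea, not part of D4.1: the enum member names, the `at_or_above` helper and the three archives are hypothetical and were invented for illustration.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    """Five trust maturity levels, ordered from lowest (1) to highest (5)."""
    OAIS_CORE_CONFORMANCE = 1
    INITIAL_SELF_ASSESSMENT = 2   # PLATTER / DRAMBORA
    PEER_REVIEWED_DSA = 3         # Data Seal of Approval
    PEER_REVIEWED_ISO_DIN = 4     # self-audit with ISO 16363 or DIN 31644
    CERTIFICATION = 5             # external review and formal certification


def at_or_above(level, threshold):
    """Return True if `level` is at least as high as `threshold`."""
    return level >= threshold


# Hypothetical archives, invented for illustration only.
archives = {
    "Archive A": TrustLevel.PEER_REVIEWED_DSA,
    "Archive B": TrustLevel.INITIAL_SELF_ASSESSMENT,
    "Archive C": TrustLevel.CERTIFICATION,
}

# Names of the archives at DSA level or higher.
dsa_or_higher = sorted(
    name for name, level in archives.items()
    if at_or_above(level, TrustLevel.PEER_REVIEWED_DSA)
)
print(dsa_or_higher)
```

Using `IntEnum` keeps the level numbers from the model while allowing direct ordering comparisons between levels.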
DASISH Data Archive Description Sheet (DADS)
Nr  Functionality
Administrative context
  1  Funding
  2  Depositor Agreements
  3  Usage Agreements, Code of Conduct to be signed
  4  Policies in place
  5  Rights on data claimed by the archive
  6  Data Curation strategy
Pre-Ingest
  7  Primary community in focus for deposits
  8  Secondary communities accepted for deposits
Ingest
  9  Formats accepted and curated
  10 Formats accepted and not curated
  11 Metadata formats accepted
  12 User-based ingest
Archival storage and preservation
  13 Size of current archive in TB
  14 Size of current archive in other means (collections, files, etc.)
  15 Maximal deposit size in TB
  16 Long term guarantees / standards of trust
  17 Checks on quality / quality control
Dissemination
  18 Costs / Conditions for Access
  19 Tools / Interfaces used for Access
Survey on data deposit service arrangements
The questionnaire: based on the results and recommendations of D4.1, D4.2 and the DADS
The purpose: to gain broader and more detailed insight into the organization and state of data archive solutions, and the degree to which they exist, across Europe and across scientific fields
Point of departure for the next steps: in-depth interviews with selected data archive services
Survey key findings
[Charts: Background; Archive service level]
Note on background of respondents: multiple answers were possible
Designated community: “Many data archive services were not able to define one specific community. Instead, they are offering service to a broad range of disciplines”
Survey key findings - Organizational context
Key requirement compliance indicators: documentation on deposit agreements, usage agreements and preservation policies ... Data Seal of Approval (DSA), Service Provider requirements, among others
Overall, 75 % of the services have a licence or depositor agreement
In North-Western Europe, the percentage of respondents confirming the existence of a deposit/licence agreement is somewhat higher (85 %) than in the South and East (53 %)
Codes of conduct / usage agreements are in place among 82 % of the North-Western Europe respondents, and among 41 % in the South and East
Preservation policies are in place among 62 % of the North-Western Europe respondents; for the South and East the figure is 29 %
Survey key findings - Level of Trust
25 of 46 respondents indicate that their services have undertaken activities to determine their trustworthiness
15 respondents from existing data archive services indicate that these services have not yet undertaken any action in this respect
Among the respondents from North-Western Europe, 65 % mentioned certification activities (half of them at the level of peer-reviewed DSA assessment or higher); among those from Southern and Eastern Europe, 27 %
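As a quick check on the counts above, the sketch below converts them into shares of all respondents. It assumes that the two groups (25 and 15) are disjoint and both drawn from the same 46 respondents; the slide does not say how the remaining respondents answered, so they are reported here only as "not stated". The variable and function names are illustrative.

```python
# Counts taken from the slide.
TOTAL_RESPONDENTS = 46
UNDERTAKEN = 25        # services that have undertaken trust activities
NOT_UNDERTAKEN = 15    # existing services with no action yet


def share(count, total):
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


# Assumption: the two groups do not overlap, so the rest is unaccounted for.
unaccounted = TOTAL_RESPONDENTS - UNDERTAKEN - NOT_UNDERTAKEN

shares = {
    "undertaken": share(UNDERTAKEN, TOTAL_RESPONDENTS),
    "not undertaken": share(NOT_UNDERTAKEN, TOTAL_RESPONDENTS),
    "not stated": share(unaccounted, TOTAL_RESPONDENTS),
}

for label, pct in shares.items():
    print(f"{label}: {pct} %")
```

Under these assumptions, slightly more than half of all respondents report some activity to determine trustworthiness.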
Survey findings - Self-reported maturity level of Data Archive Services
We asked the respondents whether they are satisfied with the maturity level of several aspects of their data archive service. We split this item into 5 sub-items (related to the OAIS reference model).
The way ahead: some suggestions
Further steps: the selection and recommendation of appropriate data services depend on further analyses of the survey results
The next step is to complete the DADS for all, or the most promising, data services (except those already included), based on the completed survey and with the help of the data infrastructure/deposit service itself
D4.3: List of recommended data services (trusted centres), to be based on the completed and verified DADS. First step… feed into worldwide registry
Updated version of the ‘Survey Report’, including information on the less mature, emerging/aspiring data archives with institutional/national backing that to various extents meet the requirements recommended in 4.1, 4.2 and 4.3
Policy Rules for Data Management
Deliverable in Month 33: A Comprehensive Set of Policy Rules for Data Management
Partners: NSD, UGOE, FSD, MPG, UiB, GESIS
Procedure:
  Data Policy Description Sheet (DPDS)
  Assess the scope of policy rules and their requirements in collaboration with initiatives in Europe and the US
  Establish policy rules in close collaboration with experts and emerging collaborative data services infrastructure
IFDO Survey on Research Funders’ Data Policies
Country-by-country information on current institutional research data policies
Main focus on formal data policies
  Existence, contents and quality of data sharing requirements
  Type of linkage to funding
IFDO Data Policy Description Sheet
Nr  Topic item
Background information
  1  Name of funder
  2  Homepage
General policy
  3  General conditions
  4  Data Management Plan (DMP) for Proposal
  5  Data Timeframe
  6  Guidance
  7  Compliance/Monitoring
  8  Funding / Costs
  9  Scope of policy
Standards/Documentation
  10 Documentation Requirements
  11 Data Standards
  12 Metadata Standards
Access and preservation
  13 Data Preservation
  14 Scope of preservation provisions
  15 Data Access / Sharing
  16 Data Access / Sharing incentives
  17 Data Sharing Rights (IPR)
  18 Data Embargo / Data Retention
  19 Data Sharing requirements / timeframe
  20 Designated Data Repository
  21 Data Repository Supported
  22 Institutional (data repository) Requirements
Publications
  23 Open Access to Publications
  24 Publication Repository Specified
  25 Publication Repository Supported
Resources/References
  26 Date of policy
  27 Policy link
  28
Data Policy Description Sheet - example
Columns:
  Input, short
  Input, free text (elaborate from previous column)
  Direct quotes / paraphrased information from policy
  Links to documents containing quote(s) / paraphrase(s)
Entries:
  -
  Research Council of Norway
  http://www.forskningsradet.no/en/
  Well described
  Applies to all projects funded totally or partly by the Norwegian Research Council
  Suggested / Not stated
  Refers to 'progress report', not data management plan.
  "With regard to the use of research infrastructure for research involving the processing of large amounts of data (time series, registries, scientific collections, etc.), the progress report shall also show how the data generated are safeguarded through large-scale storage resources, data handling tools and dedicated point-to-point network connections for particularly demanding applications."
  R&D Project Agreement Document
  Not stated
  Suggested
  Applies to all research data
  "As a general rule, the formal applicant to the Research Council is to be a Norwegian institution/enterprise with a specific individual designated as the project administrator".
  General application requirements
  Suggested / Recommended
  All data and documentation to be deposited at designated data centre
  "Unless otherwise agreed with the Research Council, copies of all research-generated data, including requisite documentation, shall be transferred from the Project Owner to the Norwegian Social Science Data Services. This shall be carried out as soon as possible and at the latest two years following the conclusion of the project period."
  See quote in input nr 13
  Not stated / Suggested
  Indirectly and externally, through NSD licence/deposit form.
  Required
  All research-generated data; as soon as possible, max. two years.
  Norwegian Social Science Data Services (NSD)
  Indirectly, NSD (financial support)
  Well described / Required
  "Scientific publications based on R&D projects funded wholly or partially by the Research Council must be made openly accessible to all interested parties".
  The Research Council's Principles for Open Access to Scientific Publications 2009, 2012
Requested information:
  First column (Input, short): ‘controlled vocabulary’; select one or more of a set of pre-defined categories: Not stated / Suggested / Well described and/or Recommended / Required
  Second column: if possible, add free text to elaborate on the previous column
  Third column: if possible/available, add direct quotes / paraphrased information from the policy
  Fourth column: if possible, add direct URL links to documents containing quote(s) / paraphrase(s)
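The controlled vocabulary for the short-input column lends itself to a simple automated check when sheets are collected from many contributors. The following is a minimal sketch under stated assumptions, not a DASISH or IFDO tool: the function name `parse_short_input`, the use of "/" as the separator between categories, and the case-insensitive matching are all choices made here for illustration.

```python
# The five pre-defined categories named on the slide.
ALLOWED_TERMS = (
    "Not stated",
    "Suggested",
    "Well described",
    "Recommended",
    "Required",
)


def parse_short_input(value):
    """Split a short-input cell on '/' and validate each category.

    Returns the categories in their canonical spelling, in input order.
    Raises ValueError for an empty cell or an unknown category.
    """
    canonical = {term.lower(): term for term in ALLOWED_TERMS}
    terms = []
    for part in value.split("/"):
        key = " ".join(part.split()).lower()  # normalize whitespace and case
        if not key:
            continue
        if key not in canonical:
            raise ValueError(f"not in controlled vocabulary: {part.strip()!r}")
        if canonical[key] not in terms:
            terms.append(canonical[key])
    if not terms:
        raise ValueError("at least one category is required")
    return terms


# Values as they appear in the example sheet.
print(parse_short_input("Suggested / Not stated"))
print(parse_short_input("Well described / Required"))
```

A value outside the vocabulary, such as a free-text remark entered in the wrong column, raises `ValueError` and can be flagged for correction.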
Common Challenges and needs
Looking at the overall picture:
  In many countries, high-level policy recommendations have not yet led to specified national policies by key research funders.
  Where SSH funders have formulated open access policies, these are likely to be soft recommendations without well-defined requirements or guidance for follow-up and implementation.
Common Challenges and needs
Looking at the overall picture:
  It is still unusual to require projects to open their data. We need to move from policy statements to policy enforcement and monitoring.
  Too many countries lack sufficient data-sharing infrastructure (trusted centres). We need to move from short-term funding to long-term funding and business models that build trust, confidence and incentives to contribute to the data infrastructure.
Moving towards policy-based data archiving!
Thank you for listening!