Data Publication (in H2020)

Presentation transcript:

Data Publication (in H2020) Dr Sünje Dallmeier-Tiessen CERN Madrid, November 2016

Agenda: Introduction; Research data; Relation to H2020; Data publishing; Examples, developments and lessons learnt from the “real world”; General-purpose and disciplinary repositories; Adding journals to the mix; Adding “reproducibility workflows” to the mix; Lessons learnt

Research Data: What is it? What does it look like? Does it hurt?

Funders’ policies WELCOMES Open Access to scientific publications as the option by default for publishing the results of publicly funded research; […] RECOGNISES that the full scale transition towards Open Access should be based on common principles such as transparency, research integrity, sustainability, fair pricing and economic viability; and […] CALLS on Member States, the Commission and stakeholders to remove financial and legal barriers, and to take the necessary steps for successful implementation in all scientific domains, including specific measures for disciplines where obstacles hinder its progress. See for example: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf

Mandatory Data Management Plans (DMPs) http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf

Journals’ policies Springer Nature Data policy http://www.nature.com/news/announcement-where-are-the-data-1.20541

Data Publishing Paradigms

Data Publishing Concepts: Standalone data (repository); Traditional article-data linking; Data articles/journals

Data Publishing components (RDA endorsed) [2] DOI: 10.1007/s00799-016-0178-2

Another data publishing perspective: establishing context [2] DOI: 10.1007/s00799-016-0178-2

The FAIR Guiding Principles I To be Findable: F1. (meta)data are assigned a globally unique and persistent identifier F2. data are described with rich metadata (defined by R1 below) F3. metadata clearly and explicitly include the identifier of the data it describes F4. (meta)data are registered or indexed in a searchable resource To be Accessible: A1. (meta)data are retrievable by their identifier using a standardized communications protocol A1.1 the protocol is open, free, and universally implementable A1.2 the protocol allows for an authentication and authorization procedure, where necessary A2. metadata are accessible, even when the data are no longer available FAIR data http://www.nature.com/articles/sdata201618
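Principles F1 and A1 can be illustrated with a DOI. The following is a minimal sketch, assuming the identifier is resolved over HTTPS through the doi.org proxy and that machine-readable metadata is requested by content negotiation. The helper names `doi_to_url` and `metadata_request` are hypothetical, the DOI pattern is a loose check rather than a full validator, and the request is only constructed here, not sent.

```python
import re
import urllib.request

# Loose syntactic check ("10.<registrant>/<suffix>"); an assumption, not a full DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable HTTPS URL (F1: persistent identifier)."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def metadata_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request asking for metadata (A1: standard protocol)."""
    return urllib.request.Request(
        doi_to_url(doi),
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


# DOI taken from the slides above.
req = metadata_request("10.1007/s00799-016-0178-2")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; whether a given registration agency answers with this media type is something to verify per DOI.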

The FAIR Guiding Principles II To be Interoperable: I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. I2. (meta)data use vocabularies that follow FAIR principles I3. (meta)data include qualified references to other (meta)data To be Reusable: R1. meta(data) are richly described with a plurality of accurate and relevant attributes R1.1. (meta)data are released with a clear and accessible data usage license R1.2. (meta)data are associated with detailed provenance R1.3. (meta)data meet domain-relevant community standards http://www.nature.com/articles/sdata201618
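A rough way to make some of these principles checkable is to test a metadata record for the fields they imply. This is a sketch only: the field names and their mapping to F1, F2, R1.1 and R1.2 are assumptions made for illustration, not part of the FAIR specification, and the record description is invented.

```python
from typing import Dict, List

# Hypothetical field names, each mapped to the FAIR principle it is meant to support.
REQUIRED_FIELDS = {
    "identifier": "F1",
    "description": "F2",
    "license": "R1.1",
    "provenance": "R1.2",
}


def missing_fair_fields(record: Dict[str, str]) -> List[str]:
    """Return the principles whose (assumed) field is absent or empty in the record."""
    return [
        principle
        for field, principle in REQUIRED_FIELDS.items()
        if not record.get(field, "").strip()
    ]


# Illustrative record: the identifier appears in the slides, the other values are made up.
record = {
    "identifier": "10.5281/ZENODO.30799",
    "description": "Example dataset",
    "license": "CC-BY-4.0",
}
print(missing_fair_fields(record))  # ['R1.2']
```

A presence check like this says nothing about metadata quality; it only flags records that cannot satisfy a principle at all.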

Various solutions: Disciplinary and institutional repositories exist (choose partners: re3data.org). Article-data linking is now easier with DataCite and Crossref. Data/software journals already exist, with partner repositories or with repository recommendations.
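Article-data linking is typically expressed in the dataset's metadata as a typed relation to the article. The sketch below assumes a DataCite-style `relatedIdentifiers` entry with the relation type `IsSupplementTo`; the function name and both DOIs (using the placeholder prefix 10.1234) are hypothetical.

```python
import json


def link_dataset_to_article(dataset_doi: str, article_doi: str) -> dict:
    """Build a minimal metadata fragment stating that the dataset supplements the article."""
    return {
        "doi": dataset_doi,
        "relatedIdentifiers": [
            {
                "relatedIdentifier": article_doi,
                "relatedIdentifierType": "DOI",
                "relationType": "IsSupplementTo",
            }
        ],
    }


# Both DOIs are placeholders for illustration only.
fragment = link_dataset_to_article("10.1234/example.dataset", "10.1234/example.article")
print(json.dumps(fragment, indent=2))
```

A real record needs the other mandatory metadata fields as well; this fragment only shows the link itself.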

Data publishing solutions: examples (there are more!)

re3data.org


All disciplines, institutions: Zenodo (zenodo.org)

All disciplines, institutions: Figshare (figshare.com)

Established disciplinary databases, life sciences: EBI database (http://www.ebi.ac.uk/ena)

Established disciplinary databases, earth & environmental sciences: PANGAEA (pangaea.de)

dataverse.org


http://www.nature.com/sdata

Discipline-specific data journals, e.g. ESSD: http://www.earth-system-science-data.net/

http://joss.theoj.org/

Considerations for choosing the “right service”: future purpose (reuse, reproducibility, preservation); metadata (standards); quality; dependencies (software, methods); versioning; visibility and discoverability (cf. FAIR principles); referencing and data citation capability for all outputs; persistent links and sustainability.

In practical terms: Discuss with researchers what the needs of the group or community are. Are there existing services, or is there a need for more? (See re3data.org.) Don’t shy away from contacting data centres or services directly. Check out what community publishers do: do they recommend repositories? Discuss with partners in the computer centre and/or at community meetings: what do they do and plan to do, and is there anything you can contribute to or profit from?

Moving beyond the individual elements Opening data publishing to reproducible workflows

http://www.economist.com/news/leaders/21588069-scientific-research-has-changed-world-now-it-needs-change-itself-how-science-goes-wrong http://www.economist.com/node/15579717

http://www.nature.com/news/reality-check-on-reproducibility-1.19961

Reproducibility (conditions very discipline specific). Related terms: repeatability, replicability, reproducibility, reusability, repurposing. In order to reuse or repurpose results, you sometimes have to reproduce the original results first (to understand the exact details). “An article about computational science in a scientific publication is not the scholarship itself; it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.” (http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92) We can reserve the term “replicability” for the regeneration of published results from author-provided code and data. Reproducibility is a more general term, implying both replication and the regeneration of findings with at least some independence from the code and/or data associated with the original publication. Both refer to the analysis that occurs after publication. A third term, “repeatability,” is sometimes used in place of reproducibility, but this is more typically used as a term of art referring to the sensitivity of results when underlying measurements are retaken. To summarize, we need replicability, in part, to resolve differences in outcomes that arise from reproduced computational results, regardless of whether the experiments have been repeated.

To reproduce or reuse research results a researcher needs more than “just” the article: context and documentation; links to related research objects (data, code, workflows); understandable methods, processing, software, etc.; and the steps taken during the research process (versions).
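One lightweight way to capture part of this context is a manifest stored next to the results. The sketch below is an assumption about what such a manifest could contain (interpreter version, a code version label, and a checksum per file); the function names, file name and version string are hypothetical.

```python
import hashlib
import json
import platform


def sha256_of(data: bytes) -> str:
    """Checksum that lets a later reader confirm they hold the same bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: dict, code_version: str) -> dict:
    """Record the software context and a checksum for each research object."""
    return {
        "python": platform.python_version(),
        "code_version": code_version,
        "files": {name: sha256_of(content) for name, content in sorted(files.items())},
    }


# Illustrative input: an in-memory stand-in for a results file.
manifest = build_manifest({"results.csv": b"x,y\n1,2\n"}, code_version="v1.0")
print(json.dumps(manifest, indent=2))
```

A manifest like this covers versions and integrity only; documentation of the method and links to related objects still have to be written by people.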

Research Lifecycle http://www.jisc.ac.uk/whatwedo/campaigns/res3/jischelp.aspx

Seamless integration across the research lifecycle: Who? When? Where? Example identifiers: http://doi.org/10.5281/ZENODO.30799; http://orcid.org/0000-0002-4695-7874; http://doi.org/10.5281/ZENODO.30800; 654039; http://cordis.europa.eu/project/rcn/194927_en.html; http://doi.org/10.13039/501100000780. Slide credit to Trisha Cruse, DataCite.
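Linking these objects automatically starts with recognising which kind of identifier each URL carries. The following sketch classifies the identifiers from this slide; the category labels and function names are assumptions, the DOI prefix 10.13039 is treated as the Crossref Funder Registry, and the ORCID check digit is verified with the ISO 7064 MOD 11-2 algorithm.

```python
import re


def orcid_checksum_ok(orcid: str) -> bool:
    """Verify the final character of an ORCID iD using ISO 7064 MOD 11-2."""
    digits = orcid.replace("-", "")
    if not re.fullmatch(r"\d{15}[\dX]", digits):
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    return digits[15] == ("X" if check == 10 else str(check))


def classify(url: str) -> str:
    """Return a coarse, hypothetical label for a persistent-identifier URL."""
    m = re.match(r"https?://orcid\.org/(\S+)$", url)
    if m:
        return "person" if orcid_checksum_ok(m.group(1)) else "invalid ORCID iD"
    m = re.match(r"https?://(?:dx\.)?doi\.org/(10\.\d+)/\S+$", url)
    if m:
        return "funder" if m.group(1) == "10.13039" else "research output"
    return "unknown"


for url in (
    "http://doi.org/10.5281/ZENODO.30799",
    "http://orcid.org/0000-0002-4695-7874",
    "http://doi.org/10.13039/501100000780",
):
    print(url, "->", classify(url))
```

Classifying by prefix is a shortcut; a production service would look up the identifier's registered metadata instead of guessing from its shape.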

Tools: Benchling (https://benchling.com/), Docker, Jupyter (http://jupyter.org/), OSF (https://osf.io/)

Example from CERN: CERN Open Data and CERN Analysis Preservation. Future purpose: reuse, reproducibility, preservation. What are the components of an analysis (and where are they stored now)? How much do these components vary within the collaboration? How is quality defined? What are the dependencies (software, methods)? Versioning. Linking. Size (10-15 TB per analysis). See CERN presentation later.

Future: The big challenge is adoption, which needs all of us to work together. We can help with data curation and services, i.e. guiding researchers to the right services. But we need your expertise to make it an intrinsic process for researchers: integrated in the publishing process; link objects/resources (DataCite!); give data more ❤️ and visibility, and make it discoverable.

Backup slides

References: THOR project, https://project-thor.eu/; ORCID, orcid.org. All icons are kindly provided by freeicon via flaticon.
