Data Publishing Workflows: Strategies and Standards

Presentation transcript:

Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows Group

Outline: Policy pressure. Solutions across disciplines. Standards (Persistent Identifier; Data Citation; Quality Assurance, Peer Review; Licensing). Examples in High-Energy Physics (CERN): INSPIRE, Analysis Preservation Framework, Open Data Portal.

Research data is a first class citizen Royal Society, 1665 and 2012

Towards Open Science: Open Source, Open Access, Open Data & Code, Open Science. We are here now. Slide provided by Patricia Herterich, CERN.

Policy pressure: STFC example https://www.stfc.ac.uk/Resources/pdf/STFC_Scientific_Data_Policy.pdf

Policy pressure: DOE example. DMPs should provide a plan for making all research data displayed in publications resulting from the proposed research open, machine-readable, and digitally accessible to the public at the time of publication. […] the underlying digital research data used to generate the displayed data should be made as accessible as possible to the public in accordance with the principles stated above. Source: http://science.energy.gov/funding-opportunities/digital-data-management/

Expectations: PLOS Data Policy www.plos.org

Concerns across disciplines. Datasets are: not shared or lost; difficult to discover and access; difficult to understand (context missing). Nature, 2009.

How this challenge is addressed

Example: Dedicated Data Repositories www.pangaea.de

Preserving and promoting data reuse www.pangaea.de

International sharing and curation of data www.icgc.org

ICGC: Data Publication Timeline. Time limits for publication moratoriums: All data shall become free of a publication moratorium when either the data is published by the ICGC member project or one year after a specified quantity of data (e.g. genome dataset from 100 tumors per project) has been released via the ICGC database or other public databases. […] In all cases data shall be free of a publication moratorium two years after its initial release. Source: https://icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
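The time limits quoted above can be read as a "whichever comes first" rule. The following is a minimal sketch under that reading, not an ICGC tool: the function names, parameter names, and example dates are all hypothetical, and the policy text itself remains the authority on edge cases.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def moratorium_end(initial_release: date,
                   threshold_release: Optional[date] = None,
                   project_publication: Optional[date] = None) -> date:
    """Earliest date on which data is free of the publication moratorium.

    Assumed reading of the policy excerpt:
      - hard limit: two years after the initial release;
      - one year after the specified quantity of data has been released;
      - the date the member project publishes the data.
    """
    candidates = [add_years(initial_release, 2)]
    if threshold_release is not None:
        candidates.append(add_years(threshold_release, 1))
    if project_publication is not None:
        candidates.append(project_publication)
    return min(candidates)


# Illustrative dates only, not taken from any real project.
print(moratorium_end(date(2012, 3, 1), threshold_release=date(2012, 9, 1)))
```

With these illustrative dates the one-year rule applies before the two-year hard limit, so the call returns 1 September 2013.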

Zenodo – Data Repository www.zenodo.org

How to find a data repository www.re3data.org

Example: A dedicated data journal Nature Scientific Data www.nature.com/sdata/

F1000 http://f1000research.com/

Connecting articles and data: tagged GenBank entry (genetic sequence). Slide provided by H. Koers, Elsevier. Article: doi:10.1016/j.biortech.2010.03.063

Towards Open Science: Open Source, Open Access, Open Data & Code, Open Science. We are here now. Slide provided by Patricia Herterich.

Publish (Citable) Software

More and more examples

Published Software Papers http://openresearchsoftware.metajnl.com/

Standards

Licensing. Enable others to reuse your data and software. Choose the licenses or public domain dedications accordingly: as "open" as possible. Re-use: there are measures to demand citations, so you can track reuse and the impact of your work. If you re-use a dataset, cite it yourself.

DOIs for datasets. URLs are not persistent (e.g. Wren JD: URL decay in MEDLINE - a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5). Digital Object Identifiers (DOI names) offer a solution: they are the most widely used identifier for scientific articles; researchers, authors, and publishers know how to use them; and they put datasets on the same playing field as articles. Dataset example: Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840. Slides by courtesy of Dr. Jan Brase, DataCite.
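To make the persistence argument concrete, here is a small sketch of turning a DOI name as it appears in a citation into a link through the public doi.org resolver. The function name `doi_to_url` and the loose syntax check are assumptions made for illustration; the check is a rough filter, not the formal DOI syntax.

```python
import re

# Loose shape check (an assumption for illustration, not the formal DOI syntax):
# "10.", a numeric registrant code, a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(raw: str) -> str:
    """Normalise 'doi:10.x/y', a bare DOI, or a resolver URL to an https://doi.org/ link."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return "https://doi.org/" + doi


# The dataset DOI quoted on the slide.
print(doi_to_url("doi:10.1594/PANGAEA.587840"))
```

The link stays valid if the repository reorganises its own URLs, because only the resolver record needs updating.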

ORCID id www.orcid.org

Force11 Data Citation Principles. Citation elements: Author, Publication Year, Dataset Title, Data Repository, Version, Unique Identifier. A citation should include a persistent method for identification that is machine actionable and globally unique, and should facilitate identification of, access to, and verification of the specific data that support a claim. www.force11.org
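The citation elements listed above can be assembled mechanically. The sketch below is one possible rendering, not a format prescribed by Force11: the class `DatasetCitation`, its field names, and the punctuation are assumptions, and the example values come from the PANGAEA dataset cited earlier in this talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Author, Publication Year, Dataset Title, Data Repository, Version, Unique Identifier."""
    authors: str
    year: int
    title: str
    repository: str
    identifier: str
    version: Optional[str] = None  # not every dataset is versioned

    def render(self) -> str:
        parts = [f"{self.authors} ({self.year}). {self.title}. {self.repository}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(self.identifier)
        return " ".join(parts)


citation = DatasetCitation(
    authors="Yancheva et al",
    year=2007,
    title="Analyses on sediment of Lake Maar",
    repository="PANGAEA",
    identifier="doi:10.1594/PANGAEA.587840",
)
print(citation.render())
```

Keeping the identifier as its own field, rather than baking it into a free-text string, is what lets the citation remain machine actionable.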

Data Citation in Practice

Quality assurance for data: peer review. Products: data records in data repositories; data journals; data articles. Note: standalone vs. supporting materials. QA workflows: standalone or integrated? Blind and invited peer review; open peer review; citable review reports.

How to publish your data. Decide which dataset should be preserved, or which dataset might be of interest for others to study or reuse. Are there issues which restrict the publishing process, e.g. confidentiality for patient data? Which data product: do I have enough material for a dedicated data article? Which journal or repository works for me? Prepare the documentation/metadata. Publish, and let the others know you did. Cite the dataset in the resulting papers. Track who used and cited your data.
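Part of this checklist can be turned into a simple pre-deposit check. The sketch below is only an illustration: the field names, the `restricted` flag, and the example record are hypothetical and do not follow any particular repository's metadata schema.

```python
# Hypothetical minimum set of descriptive fields; real repositories define their own.
REQUIRED_FIELDS = ("title", "creators", "description", "license", "repository")


def missing_fields(record: dict) -> list:
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def ready_to_publish(record: dict) -> bool:
    """True when documentation is complete and no restriction (e.g. confidentiality) is flagged."""
    return not missing_fields(record) and not record.get("restricted", False)


# Illustrative draft record, not a real dataset.
draft = {
    "title": "Example dataset",
    "creators": ["A. Researcher"],
    "description": "",
    "license": "CC0",
    "repository": "Zenodo",
    "restricted": False,
}
print(missing_fields(draft))
print(ready_to_publish(draft))
```

A check like this covers only the documentation and restriction steps; choosing the data product and tracking citations remain manual decisions.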

HEP High-Energy Physics

Research data in HEP

Research Data on INSPIRE: starting from the paper

The underlying datasets (HEPdata)

Data Citation (Tracking)

Referenced Data arXiv: 1311.1113

Code snippets

… and who gets the credit for sharing data?

Kyle’s profile on INSPIRE

Using author IDs for attributing credit

Excerpt from publication list on Make data publications count - alongside your articles

Focusing on reproducibility and reuse Two important new tools

Capturing the complexity: Analysis Preservation Framework

Open it up: CERN Open Data Portal

Conclusions. Policy pressure nationally and globally: we need data publishing solutions. Considerable advancements in many disciplines: we learn from best practices. HEP has a commitment to data preservation and open data releases. First tools are available to support data preservation and data publishing.

Towards Open Science: Open Source, Open Access, Open Data & Code, Open Science. We are here now. Slide provided by Patricia Herterich.