Adoption of Data Citation Outcomes by BCO-DMO Cynthia Chandler, Adam Shepherd, David Bassendine Biological and Chemical Oceanography Data Management.


Presentation transcript:

Adoption of Data Citation Outcomes by BCO-DMO
Cynthia Chandler, Adam Shepherd, David Bassendine
Biological and Chemical Oceanography Data Management Office, Woods Hole Oceanographic Institution, and Blue Dot Labs
Chandler, CL; A. Shepherd; D. Bassendine (2016) “Adoption of Data Citation Outcomes by BCO-DMO”. Presented at Research Data Alliance Plenary Meeting 8. 16 September 2016. Denver, CO.
This brief presentation was part of a series of adoption reports presented during an RDA P8 plenary session.
15-17 September 2016

A story of success enabled by RDA
An existing repository, curating marine research data since 2006.
Faced with new challenges but no new funding, e.g. data publication practices to support citation.
Used the outcomes from the RDA Data Citation Working Group to improve data publication and citation services.

BCO-DMO curated data are served, published, and archived
BCO-DMO is a thematic, domain-specific repository funded by NSF Ocean Sciences and Polar Programs.
BCO-DMO curated data are:
Served: from the BCO-DMO website (URLs, URIs)
Published: at an Institutional Repository (CrossRef DOI), WHOAS: Woods Hole Open Access Server
Archived: at NCEI, a US National Data Center
Example: one dataset is identified by a Linked Data URI, an NCEI accession, and a DOI.
Also harvested by and discoverable from DataONE.

BCO-DMO Dataset Landing Page (Mar ‘16)
Example: the larval krill dataset, with measurements from 2001 and 2002. The full dataset has been archived at NCEI and assigned an accession number. Data used for a publication may include only a portion of the data. Ideally, the appropriate subset of data would be published and assigned a DOI, enabling subsequent retrieval of the exact data used for a publication.

Initial Architecture Design Considerations (Jan 2016)

Modified Architecture (March 2016) Opportunity to update our information model

BCO-DMO Data Publication System Components
BCO-DMO publishes data to WHOAS and a DOI is assigned. As of June 2016, the BCO-DMO architecture supports data versioning.
NCEI is the appropriate National Data Center in the US for ocean research data.
WHOAS is the local OAIS-compliant institutional repository.

In the BCO-DMO data management system architecture, data are served via the BCO-DMO website, with parallel options for publishing data at the WHOAS Institutional Repository and archiving data at the appropriate NCEI national data center. A Drupal content management system provides access to the metadata catalog and direct access to data via a URL. The Drupal MySQL content is published in a variety of forms, one of which is Linked Data.

When the data are declared ‘Final, no updates expected’, a package of metadata and data can be exported from the BCO-DMO system and submitted to the Woods Hole Open Access Server (WHOAS). WHOAS is an Institutional Repository (IR) hosted by the WHOI Data Library and Archives (DLA), which is part of the larger Marine Biological Laboratory and Woods Hole Oceanographic Institution (MBLWHOI) Library system located in Woods Hole, MA. A Digital Object Identifier (DOI) is assigned when the package is deposited in WHOAS, with reciprocal links entered at BCO-DMO and WHOAS to connect the package at WHOAS with the record in the BCO-DMO catalog. The DOI resolves to the WHOAS dataset landing page, and a Linked Data URI resolves to the BCO-DMO landing page for the same dataset. Before the June 2016 update, the system supported only one version of a dataset.
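The deposit step described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the class names, the `deposit` function, the `example_mint` helper, the example.org URI, and the 10.5555 test DOI prefix are all hypothetical and are not part of the BCO-DMO or WHOAS systems.

```python
from dataclasses import dataclass
from typing import Callable, Optional

FINAL = "Final, no updates expected"  # state named on the slide


@dataclass
class CatalogRecord:
    """Hypothetical stand-in for a record in the data office catalog."""
    dataset_id: str
    linked_data_uri: str
    state: str = "In progress"
    published_doi: Optional[str] = None  # link from catalog to repository


@dataclass
class RepositoryPackage:
    """Hypothetical stand-in for a package held by the institutional repository."""
    doi: str
    source_uri: str  # link from repository back to catalog


def deposit(record: CatalogRecord, mint_doi: Callable[[str], str]) -> RepositoryPackage:
    """Deposit a final dataset and record reciprocal links on both sides."""
    if record.state != FINAL:
        raise ValueError("only datasets declared final are deposited")
    doi = mint_doi(record.dataset_id)
    record.published_doi = doi
    return RepositoryPackage(doi=doi, source_uri=record.linked_data_uri)


def example_mint(dataset_id: str) -> str:
    # Placeholder DOI; a real DOI would come from the repository.
    return f"10.5555/example.{dataset_id}"


record = CatalogRecord("123", "https://example.org/dataset/123", state=FINAL)
package = deposit(record, example_mint)
```

After the call, the catalog record points at the package DOI and the package points back at the catalog URI, which is the reciprocal linking the notes describe.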

BCO-DMO Data Citation System Components
Data managed by BCO-DMO are published at WHOAS (the Institutional Repository (IR) curated by the MBLWHOI Data Library and Archives (DLA)), archived at NCEI, and harvested by DataONE (an NSF-funded harvesting system for environmental data).
A new data version is assigned a new DOI (the handle is versioned if only metadata changes).
New capability (implemented): procedure when a BCO-DMO data set is updated:
A copy of the previous version is preserved.
A DOI is requested for the new version of the data.
The data are published, and a new landing page is created for the new version of the data, with the new DOI assigned.
The BCO-DMO database has links to all versions of the data (archived and published).
Both archive and published dataset landing pages have links back to the best version of the full dataset at BCO-DMO.
The BCO-DMO data set landing page displays links to all archived and published versions.
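The update procedure on this slide could be sketched roughly as follows. This is a minimal illustration, not the BCO-DMO implementation: the `Dataset` and `DatasetVersion` classes, the `example_mint` helper, and the 10.5555 test DOI prefix are hypothetical, and modelling "the handle is versioned" as a revision counter on an unchanged DOI is one possible reading of the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DatasetVersion:
    number: int
    doi: str
    handle_revision: int = 1  # bumped when only the metadata changes


@dataclass
class Dataset:
    dataset_id: str
    # Append-only list, so every earlier version is preserved.
    versions: List[DatasetVersion] = field(default_factory=list)

    def publish_update(self, data_changed: bool,
                       mint_doi: Callable[[str, int], str]) -> DatasetVersion:
        if data_changed or not self.versions:
            # New data: keep the old version and request a new DOI.
            number = len(self.versions) + 1
            self.versions.append(
                DatasetVersion(number, mint_doi(self.dataset_id, number)))
        else:
            # Metadata-only change: same DOI, new handle revision.
            self.versions[-1].handle_revision += 1
        return self.versions[-1]

    def landing_page_links(self) -> List[str]:
        """Links to all published versions, as shown on the landing page."""
        return [f"https://doi.org/{v.doi}" for v in self.versions]


def example_mint(dataset_id: str, number: int) -> str:
    # Placeholder DOI; a real DOI would come from the repository.
    return f"10.5555/example.{dataset_id}.v{number}"


ds = Dataset("123")
ds.publish_update(True, example_mint)   # first publication
ds.publish_update(False, example_mint)  # metadata-only change
ds.publish_update(True, example_mint)   # data change, new DOI
links = ds.landing_page_links()
```

Keeping the version list append-only is what lets the landing page link to every archived and published version.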

BCO-DMO Data Set Landing Page
Versioned dataset example: original indicates the metadata only changed
DOI: 10.1575/1912/6421
Dataset URL: Maas - Pteropod respiration rates
LOD URI:
Data published at WHOAS: doi: 10.1575/1912/bco-dmo.651474

Published dataset DOI NSF award numbers BCO-DMO dataset URI (published out as Linked Data)

BCO-DMO Data Set Landing Page
LOD URI:
Data published at WHOAS: doi: 10.1575/1912/bco-dmo.651474
Linked from BCO-DMO dataset landing page to:

Linked to Publication via DOI Linking data sets with publications using DOIs

New Capabilities … BCO-DMO becoming a DataONE Member Node

New Capabilities … BCO-DMO Data Set Citation
LOD URI:
BCO-DMO dataset landing page with new citation button: CC BY 4.0 license with suggested citation text, based on the DOI.
Data published at WHOAS: doi: 10.1575/1912/bco-dmo.651474
Using the DOI citation formatter service: this doesn’t work yet for our DOIs (because we don’t have enough DOI metadata), but when it does, it will return something like this:
Cite as: Twining, B. (2016). “Element Quotas of Individual Synechococcus Cells Collected During Bermuda Atlantic Time-Series Study (BATS) Cruises Aboard the R/V Atlantic Explorer Between Dates 2012-07-11 and 2013-10-13”. Version 05/06/2016. Biological and Chemical Oceanography Data Management Office (BCO-DMO) Dataset. doi:10.1575/1912/bco-dmo.651474 [access date]
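Until a formatter service can return the text, a suggested citation in the layout shown above could be assembled locally from catalog metadata. The sketch below is an assumption about how that might look; `format_citation` and its parameters are hypothetical names, and only the field order and punctuation are taken from the example on the slide.

```python
def format_citation(author: str, year: int, title: str, version: str,
                    publisher: str, doi: str,
                    access_date: str = "[access date]") -> str:
    """Build a 'Cite as' string in the layout of the slide's example."""
    return (f"{author} ({year}). “{title}”. Version {version}. "
            f"{publisher} Dataset. doi:{doi} {access_date}")


citation = format_citation(
    author="Twining, B.",
    year=2016,
    title=("Element Quotas of Individual Synechococcus Cells Collected "
           "During Bermuda Atlantic Time-Series Study (BATS) Cruises Aboard "
           "the R/V Atlantic Explorer Between Dates 2012-07-11 and 2013-10-13"),
    version="05/06/2016",
    publisher=("Biological and Chemical Oceanography Data Management "
               "Office (BCO-DMO)"),
    doi="10.1575/1912/bco-dmo.651474",
)
print("Cite as:", citation)
```

The access date is left as a placeholder by default so that the person citing the data fills it in.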

Thank you …
To the Data Citation Working Group for their efforts
To RDA US and the MacArthur Foundation for funding this adoption project
TIMELINE:
Redesign/prototype completed by 1 June 2016
New citation recommendation by 1 Sep 2016
Report out at RDA P8 (Denver, CO) September 2016
Final report by 1 December 2016
Cyndy Chandler @cynDC42 @bcodmo ORCID: 0000-0003-2129-1647

EXTRA SLIDES
Removed these to reduce talk to 10-15 minutes

Adoption of Data Citation Outputs
Evaluation: evaluate recommendations (done December 2015)
Trial: try implementation in existing BCO-DMO architecture (work began 4 April 2016)
BCO-DMO: R1-11 fit well with current architecture; R12 doable (test as part of DataONE node membership); R13-14 are consistent with Linked Data approach to data publication and sharing
NOTE: adoption grant received from RDA US (April 2016)

RDA Data Citation (DC) of evolving data
DC goals: to create identification mechanisms that:
allow us to identify and cite arbitrary views of data, from a single record to an entire data set, in a precise, machine-actionable manner
allow us to cite and retrieve that data as it existed at a certain point in time, whether the database is static or highly dynamic
DC outcomes: 14 recommendations and associated documentation:
ensuring that data are stored in a versioned and timestamped manner
identifying data sets by storing and assigning persistent identifiers (PIDs) to timestamped queries that can be re-executed against the timestamped data store
More information is available from the RDA site.
Including BCO-DMO, nine pilot studies have expressed interest in adopting the DC WG recommendations:
ARGO
Austrian Centre for Digital Humanities
BCO-DMO
Center for Biomedical Informatics (CBMI), Washington University, St. Louis
Climate Change Centre Austria
ENVRIplus
Natural History Museum London
Ocean Networks Canada
Virtual Atomic and Molecular Data Centre

RDA Data Citation WG Recommendations
Data Versioning: For retrieving earlier states of datasets the data need to be versioned. Markers shall indicate inserts, updates and deletes of data in the database.
Data Timestamping: Ensure that operations on data are timestamped, i.e. any additions and deletions are marked with a timestamp.
Data Identification: The data used shall be identified via a PID pointing to a time-stamped query, resolving to a landing page.
October 2015 version, with 14 recommendations
DC WG chairs: Andreas Rauber, Ari Asmi, Dieter van Uytvanck
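The three recommendations quoted above can be illustrated together with a toy in-memory store. This is a sketch under stated assumptions, not the working group's reference design: `VersionedStore`, its methods, the integer timestamps, and the `hypothetical-pid:` identifier scheme are all invented for illustration.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Rows are never overwritten; each carries a validity interval."""

    def __init__(self) -> None:
        self.rows: List[dict] = []
        self.queries: Dict[str, Tuple[Optional[float], int]] = {}

    def _close(self, key: str, ts: int) -> None:
        # Mark the current row for this key as superseded at time ts.
        for row in self.rows:
            if row["key"] == key and row["to"] is None:
                row["to"] = ts

    def upsert(self, key: str, value: float, ts: int) -> None:
        self._close(key, ts)
        self.rows.append({"key": key, "value": value, "from": ts, "to": None})

    def delete(self, key: str, ts: int) -> None:
        self._close(key, ts)

    def as_of(self, ts: int, min_value: Optional[float] = None) -> Dict[str, float]:
        """Return the data as it existed at time ts, optionally filtered."""
        result = {}
        for row in self.rows:
            alive = row["from"] <= ts and (row["to"] is None or ts < row["to"])
            if alive and (min_value is None or row["value"] >= min_value):
                result[row["key"]] = row["value"]
        return dict(sorted(result.items()))

    def cite(self, ts: int, min_value: Optional[float] = None) -> str:
        """Store the query with its timestamp and return an identifier."""
        digest = hashlib.sha256(f"{min_value}|{ts}".encode()).hexdigest()[:12]
        pid = f"hypothetical-pid:{digest}"
        self.queries[pid] = (min_value, ts)
        return pid

    def resolve(self, pid: str) -> Dict[str, float]:
        """Re-execute the stored query against the timestamped store."""
        min_value, ts = self.queries[pid]
        return self.as_of(ts, min_value)


store = VersionedStore()
store.upsert("a", 1.0, ts=1)
store.upsert("b", 2.0, ts=2)
pid = store.cite(ts=2)        # cite the data as of time 2
store.upsert("a", 5.0, ts=3)  # later update
store.delete("b", ts=4)       # later delete
cited = store.resolve(pid)
current = store.as_of(4)
```

Because updates and deletes only close validity intervals, resolving the identifier after later changes still returns the subset that was originally cited.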

