Linked Data for Field Deployments
Rolling Deck to Repository (R2R)
R. Arko (1), S. Carbotte (1), C. Chandler (2), S. Smith (3), K. Stocks (4), L. Stolp (2)
(1) LDEO, (2) WHOI, (3) FSU, (4) SIO
Rolling Deck to Repository (R2R) Overview
Mission: Stewardship of environmental sensor data routinely acquired by U.S. academic research vessels
Services:
  - Publish master cruise catalog
  - Document, preserve, and disseminate original field datasets
  - Assess quality of selected data types
  - Create post-field data products
  - Support at-sea event logging
Support: NSF, ONR, NOAA, SOI, UNOLS
Cruise Catalog
  - Inventory of 6,500+ expeditions on 40 U.S. research vessels, mainly since 2000
  - Enables linking of cruise-related data and documents at 16+ partner repositories
Motivation
Old way: Data from U.S. cruises have historically been stovepiped by program/discipline.
New way: Linking data across repositories drives the 3 "R"s:
  - Reproducibility (what journals want)
  - Reuse (what funders want)
  - Recognition (what scientists want)
Motivation
[Diagram nodes: Researcher, Cruise @R2R, Award @NSF, Paper, Dataset @ECL]
Motivation (cont.)
In an ideal world:
[Diagram nodes: Researcher, Cruise @R2R, Award @NSF, Paper, Dataset @ECL, Samples @SESAR, Dataset @BCO-DMO]
Solution
R2R's two-track implementation:
  1. Publish the entire Cruise Catalog as a Linked Data graph (http://data.rvdata.us/), where each Cruise is an RDF Resource.
  2. Publish a DOI for every Cruise (http://search.datacite.org/), where each Cruise has a DataCite XML Record.
In both cases, a Cruise is a Resource of Type=Event, with links to Related Resources.
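To make the two tracks concrete, here is a minimal sketch of one cruise expressed both ways, using only the Python standard library. It is an illustration under assumptions, not R2R's actual implementation: the cruise URI path, the `10.0000/...` DOIs, the choice of `dcterms:relation` and the DCMI `Event` class for the RDF side, and the `References` relation type on the DataCite side are all hypothetical choices. The XML is a fragment and omits fields that a complete DataCite record requires.

```python
import xml.etree.ElementTree as ET

# Well-known vocabulary IRIs; using them here is an assumption of this sketch.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCMI_EVENT = "http://purl.org/dc/dcmitype/Event"
DCT_RELATION = "http://purl.org/dc/terms/relation"
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def cruise_ntriples(cruise_uri, related_uris):
    """Track 1: the cruise as an RDF resource, serialized as N-Triples."""
    lines = [f"<{cruise_uri}> <{RDF_TYPE}> <{DCMI_EVENT}> ."]
    for uri in related_uris:
        lines.append(f"<{cruise_uri}> <{DCT_RELATION}> <{uri}> .")
    return "\n".join(lines)


def cruise_datacite_xml(cruise_doi, title, related_dois):
    """Track 2: a partial DataCite-style record for the same cruise."""
    root = ET.Element("resource", {"xmlns": DATACITE_NS})
    ET.SubElement(root, "identifier", identifierType="DOI").text = cruise_doi
    titles = ET.SubElement(root, "titles")
    ET.SubElement(titles, "title").text = title
    # Type=Event, as stated on the slide.
    ET.SubElement(root, "resourceType", resourceTypeGeneral="Event").text = "Cruise"
    related = ET.SubElement(root, "relatedIdentifiers")
    for doi in related_dois:
        ET.SubElement(
            related,
            "relatedIdentifier",
            relatedIdentifierType="DOI",
            relationType="References",  # illustrative relation type only
        ).text = doi
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(cruise_ntriples(
        "http://data.rvdata.us/cruise/EXAMPLE01",
        ["https://doi.org/10.0000/example-dataset"],
    ))
    print(cruise_datacite_xml(
        "10.0000/EXAMPLE01", "Example cruise", ["10.0000/example-dataset"]
    ))
```

In both outputs the cruise is typed as an Event and carries links to related resources, which is the property the two tracks share.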
Identifiers
For class/type vocabularies, R2R reuses terms from the NERC Vocabulary Server (http://vocab.nerc.ac.uk/):
  L06  Platform Types
  L05  Instrument Types
  L22  Instrument Make/Models
  C77  Shipboard Activity Types
  P02  Data Types
  P06  Unit Types
Note: L05 is the SeaDataNet device categories vocabulary (e.g. current meters).
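The vocabulary list above can be kept as a small lookup table that produces term URIs. The sketch below assumes the NERC Vocabulary Server URL layout `/collection/<ID>/current/<concept>/`; that pattern, the function names, and the `EXAMPLE` concept key are assumptions of this example and do not appear in the slides.

```python
NVS_BASE = "http://vocab.nerc.ac.uk"

# Collection IDs and labels as listed on the slide.
COLLECTIONS = {
    "L06": "Platform Types",
    "L05": "Instrument Types",
    "L22": "Instrument Make/Models",
    "C77": "Shipboard Activity Types",
    "P02": "Data Types",
    "P06": "Unit Types",
}


def collection_uri(collection_id):
    """Return the URI of a vocabulary collection (assumed URL layout)."""
    if collection_id not in COLLECTIONS:
        raise KeyError(f"unknown collection: {collection_id}")
    return f"{NVS_BASE}/collection/{collection_id}/current/"


def concept_uri(collection_id, concept_key):
    """Return the URI of one term within a collection (assumed URL layout)."""
    return f"{collection_uri(collection_id)}{concept_key}/"


if __name__ == "__main__":
    for cid, label in COLLECTIONS.items():
        print(f"{cid}  {label}: {collection_uri(cid)}")
    # "EXAMPLE" is a hypothetical concept key, not a real term.
    print(concept_uri("L05", "EXAMPLE"))
```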
Identifiers (cont.)
For instances, R2R links to registries:
  R2R    - Cruise (e.g. we are the authority for the U.S. academic fleet)
  ICES   - Platform (NVS:C17)
  GRID   - Organization
  ORCID  - Person
  IGSN   - Sample
  DOI    - Dataset, Document, Article
  OFR    - Funder
  RE3    - Repository
  SeaVoX - Waterbody (NVS:C19)
  SDN    - Port (NVS:C38)
Note: OFR is the Open Funder Registry (from CrossRef; IDs for grant-giving organizations).
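One way to read the registry list above is as a mapping from resource type to the registry that supplies its identifier. The sketch below encodes that mapping and tags each link with its registry. The `InstanceLink` class, the `make_link` helper, and the identifier values are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Resource type -> registry, transcribed from the slide.
REGISTRY_FOR_TYPE = {
    "Cruise": "R2R",
    "Platform": "ICES",
    "Organization": "GRID",
    "Person": "ORCID",
    "Sample": "IGSN",
    "Dataset": "DOI",
    "Document": "DOI",
    "Article": "DOI",
    "Funder": "OFR",
    "Repository": "RE3",
    "Waterbody": "SeaVoX",
    "Port": "SDN",
}


@dataclass(frozen=True)
class InstanceLink:
    resource_type: str
    registry: str
    identifier: str


def make_link(resource_type, identifier):
    """Attach the registry that is authoritative for this resource type."""
    registry = REGISTRY_FOR_TYPE.get(resource_type)
    if registry is None:
        raise ValueError(f"no registry configured for type: {resource_type}")
    return InstanceLink(resource_type, registry, identifier)


if __name__ == "__main__":
    # Placeholder identifiers, not real records.
    links = [
        make_link("Cruise", "EXAMPLE01"),
        make_link("Person", "0000-0000-0000-0000"),
        make_link("Dataset", "10.0000/example-dataset"),
    ]
    for link in links:
        print(f"{link.resource_type}: {link.registry}:{link.identifier}")
```

Types for which the community has no agreed registry (Instruments and Projects, for instance) fail loudly here rather than being linked silently.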
Partnerships
R2R has "reciprocal linking" arrangements with other repositories, e.g.:
  - Biological and Chemical Oceanography Data Management Office (BCO-DMO)
  - CLIVAR and Carbon Hydrographic Data Office (CCHDO)
  - Interdisciplinary Earth Data Alliance (IEDA)
  - Index to Marine and Lacustrine Geological Samples (IMLGS)
  - Incorporated Research Institutions for Seismology (IRIS/OBSIP)
  - Shipboard Automated Meteorological and Oceanographic System (SAMOS)
  - System for Earth Sample Registration (SESAR)
Challenges
Work in progress:
  - Community has not reached consensus on all Identifier types, e.g. Instruments, Projects, Organizations.
  - Significant backfill mapping required.
  - Identifier systems are not perfect, e.g. a person can have more than one ORCID.
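The last challenge can be checked mechanically during backfill mapping. The sketch below flags people who end up with more than one ORCID, assuming the mapping is available as (person key, ORCID) pairs. The input format, the function name, and all sample values are hypothetical.

```python
from collections import defaultdict


def people_with_multiple_orcids(records):
    """Return {person_key: sorted ORCIDs} for people mapped to more than one ORCID.

    `records` is an iterable of (person_key, orcid) pairs.
    """
    orcids_by_person = defaultdict(set)
    for person_key, orcid in records:
        orcids_by_person[person_key].add(orcid)
    return {
        person: sorted(orcids)
        for person, orcids in orcids_by_person.items()
        if len(orcids) > 1
    }


if __name__ == "__main__":
    # Placeholder data, not real people or real ORCIDs.
    sample = [
        ("person-a", "0000-0000-0000-0001"),
        ("person-a", "0000-0000-0000-0002"),
        ("person-b", "0000-0000-0000-0003"),
        ("person-b", "0000-0000-0000-0003"),  # repeated pair, not a conflict
    ]
    print(people_with_multiple_orcids(sample))
```

Flagged entries still need human review, since the code cannot tell which ORCID a person actually uses.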