Published by Roderick Burke. Modified over 8 years ago.
1
Development and Management of e-Repositories
8-12 April 2013, IODE Project Office, Oostende, Belgium
Future Repository Trends: Repositories and Published Data
Lisa Raymond, MBLWHOI Library, Woods Hole Oceanographic Institution
lraymond@whoi.edu
2
Use Cases
1. Data related to traditional journal articles are assigned persistent identifiers, referred to in the articles, and stored in institutional repositories
2. Data held by data centers are packaged and served in formats that can be cited
3
Data Citation
The goal of the use cases has been to identify best practices for tracking data provenance and clearly attributing credit to data creators/providers so that researchers will make their data accessible.
The assignment of persistent identifiers, specifically Digital Object Identifiers (DOIs), enables accurate data citation.
4
Digital Object Identifier (DOI)
A digital object identifier (DOI) is a character string used to uniquely identify an object. Metadata about the object is stored in association with the DOI name.
Libraries have been using DOIs for years; they are now the de facto standard for data.
Example: http://dx.doi.org/10.1575/1912/5105
– 10: directory indicator for the DOI registry
– 1575: registrant code, issued through the DOI registration agency (CrossRef)
– 1912: publisher (MBLWHOI)
– 5105: “item” number
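The breakdown of the example DOI can be sketched in a few lines of Python. This is only an illustration, not part of the original presentation: the names `DoiParts` and `parse_doi` are hypothetical, and the sketch assumes the general DOI convention that a name consists of a prefix (the directory indicator `10`, a dot, and a registrant code) and a suffix chosen by the registrant, separated by the first slash.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # always "10" for DOI names
    registrant: str  # registrant code, e.g. "1575"
    suffix: str      # chosen by the registrant, e.g. "1912/5105"


# Resolver and scheme prefixes that are commonly written in front of a DOI name.
_LEADERS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def parse_doi(value: str) -> DoiParts:
    """Split a DOI name (optionally written as a resolver URL) into its parts."""
    doi = value.strip()
    for leader in _LEADERS:
        if doi.lower().startswith(leader):
            doi = doi[len(leader):]
            break
    # The prefix ends at the first slash; the suffix may contain more slashes.
    prefix, slash, suffix = doi.partition("/")
    directory, dot, registrant = prefix.partition(".")
    if not (slash and suffix and dot and registrant) or directory != "10":
        raise ValueError(f"not a DOI name: {value!r}")
    return DoiParts(directory, registrant, suffix)


parts = parse_doi("http://dx.doi.org/10.1575/1912/5105")
print(parts.directory, parts.registrant, parts.suffix)
```

For the slide's example, the suffix `1912/5105` can be split once more on `/` to recover the publisher and "item" numbers listed on the slide; that second split reflects this repository's own naming, not a rule of the DOI system.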
5
Use Case One
Article records link to the dataset records
Dataset records link to the article records
MBLWHOI Library has worked directly with authors as well as with a data management office to deposit data and metadata.
DOIs have successfully been assigned to data associated with articles before and after publication of the article.
6
Use Case Two
The Published Data Library (PDL) is implemented by the British Oceanographic Data Centre. It provides snapshots of specially chosen datasets that are archived using rigorous version management.
The publication process exposes a fixed copy of an object and then manages that copy in such a way that it may be located and referred to over an indefinite period of time.
Using metadata standards adopted across NERC's Environmental Data Centres, the repository assigns DOIs obtained from the British Library/DataCite to appropriate datasets.
7
Sample Record from PDL
8
Other Similar Models
Dryad
PANGAEA
DSpace@MIT
Other major projects also investigating linked data/enhanced publication include DRIVER and OpenAIREplus.
This type of repository is meant to function in addition to national and subject data centers, not as a replacement.
9
DSpace Repository
Accepts both text documents and datasets
Accepts data related to articles as well as data not associated with a paper
10
Using standards is important for interoperability
Get familiar with documentation
11
Lat / Long
12
Bounding Box
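The Lat / Long and Bounding Box slides carry only their titles in this transcript, so the following is a minimal sketch under stated assumptions rather than the repository's actual procedure. It assumes sampling positions are given as (latitude, longitude) pairs in decimal degrees and derives the box that encloses them. The names `BoundingBox` and `bounding_box` are hypothetical, and the sketch does not handle datasets that cross the 180° meridian.

```python
from typing import Iterable, NamedTuple, Tuple


class BoundingBox(NamedTuple):
    south: float
    west: float
    north: float
    east: float


def bounding_box(points: Iterable[Tuple[float, float]]) -> BoundingBox:
    """Return the smallest lat/long box enclosing (lat, lon) points.

    Assumes decimal degrees and no crossing of the 180° meridian.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    for lat, lon in pts:
        if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
            raise ValueError(f"coordinate out of range: {(lat, lon)!r}")
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return BoundingBox(min(lats), min(lons), max(lats), max(lons))


# Three illustrative (made-up) station positions.
stations = [(41.5, -70.7), (42.0, -69.9), (41.2, -70.1)]
print(bounding_box(stations))
```

A single position yields a box whose north and south (and east and west) limits coincide, so the same function covers both the point and the box case.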
13
List of Most Common Metadata Fields
14
ScienceDirect Links to Associated Data in the Woods Hole Open Access Server (WHOAS)
15
Collaboration with the NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Automated the ingestion of metadata from BCO-DMO for deposit, with a copy of each dataset, into the institutional repository WHOAS.
Incorporates functionality for BCO-DMO to request a DOI from the Library.
The partnership allows the Library to work with a trusted data repository to ensure high-quality data, while BCO-DMO uses the Library's services and is assured that a permanent copy of the data is associated with the DOI.
16
BCO-DMO links to the repository
WHOAS links to BCO-DMO
17
ORCID
Open Researcher and Contributor ID
A registry of unique researcher identifiers
Persistent identifiers for names
Can enable linking to other resources created by the researcher
18
ORCID partners
Nature Publishing Group, Faculty of 1000, Elsevier, the Wellcome Trust, and NIH, for the paper/grant submission process.
NIH is testing the efficacy of ORCID iDs in the SciENcv platform.
Nature will soon be publishing ORCID iDs in papers.
The NLM DTD for PMC papers has been edited to support ORCID.
NIH wants to track researchers' progress to see if grants are producing high-quality science: “ORCID will help us link the researchers we support to the things they’ve produced.”
19
ORCID IDs
16-digit number assigned to individuals
Randomly assigned by the ORCID registry
Expressed as an HTTP URI
– http://orcid.org/[#]
– http://orcid.org/0000-0002-3843-3472
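A detail the slide does not mention: the last of the 16 characters in an ORCID iD is a check character computed with the ISO 7064 MOD 11-2 algorithm, so it can be `X` as well as a digit. The sketch below shows how a repository might check an iD before storing it. The function names are hypothetical, and a passing check only means the iD is well formed, not that it is registered to anyone.

```python
import re

# Four groups of four characters; only the final character may be "X".
_ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if value is a well-formed ORCID iD with a correct check character."""
    orcid = value.strip()
    for leader in ("https://orcid.org/", "http://orcid.org/"):
        if orcid.startswith(leader):
            orcid = orcid[len(leader):]
            break
    if not _ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("http://orcid.org/0000-0002-3843-3472"))  # True
print(is_valid_orcid("0000-0002-3843-3473"))                   # False
```

The first call uses the iD shown on the slide; the second changes only the final character, which is exactly the kind of typing error the check character is meant to catch.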
20
Adding works
Automatic search of the CrossRef database for potential matches [last name match]
– Limited to works with DOIs
– LINKING
Import research activities from Scopus [identifier, profile, and publications]
Manual input
21
Other IDs
ResearcherID from Thomson
– Integrates with Web of Knowledge
– ORCID compliant, linking and importing/exporting
ISNI (International Standard Name Identifier)
– Not intended to provide comprehensive information; more of a tool for linking information between systems
– Interoperable with ORCID
– Both recognize their separate goals and will continue to work together
22
Cookbook
Step-by-step directions
Appendix with examples
Small organizations with limited staff can do this!
23
Future
What other metadata fields can we link with?
Who else can we link to?
24
Cookbook
What is data publication?
Why data publication?
Persistent identifiers p.5
Providing a reference to a published dataset p.10
25
Data Citation
http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations/provider_guidelines
Google search: esip data citation for more examples
The core required elements of a citation are:
– Author(s): the people or organizations responsible for the intellectual work to develop the data set. The data creators.
– Release Date: when the particular version of the data set was first made available for use (and potential citation) by others.
– Title: the formal title of the data set.
– Version: the precise version of the data used. Careful version tracking is critical to accurate citation.
– Archive and/or Distributor: the organization distributing or caring for the data, ideally over the long term.
– Locator/Identifier: this could be a URL, but ideally it should be a persistent service, such as a DOI, Handle or ARK, that resolves to the current location of the data in question.
– Access Date and Time: because data can be dynamic and changeable in ways that are not always reflected in release dates and versions, it is important to indicate when on-line data were accessed.
Additional fields can be added as necessary to credit other people and institutions, etc.
Additionally, it is important to provide a scheme for users to indicate the precise subset of data that were used. This could be the temporal and spatial range of the data, the types of files used, a specific query id, or other ways of describing how the data were subsetted.
An example citation:
Cline, D., R. Armstrong, R. Davis, K. Elder, and G. Liston. 2002, Updated 2003. CLPX-Ground: ISA snow depth transects and related measurements ver. 2.0. Edited by M. Parsons and M. J. Brodzik. National Snow and Ice Data Center. Data set accessed 2008-05-14 at http://dx.doi.org/10.5060/D4MW2F23z
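As a rough illustration of how the core elements combine into a citation string, the sketch below assembles the example citation from its parts. The `DataCitation` class, its field names, and the sentence order are assumptions modeled on the single example on this slide, not a format prescribed by the ESIP guidelines. The field values, including the locator, are copied as they appear in this transcript.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    authors: str        # Author(s)
    release_date: str   # Release Date
    title: str          # Title
    version: str        # Version
    archive: str        # Archive and/or Distributor
    locator: str        # Locator/Identifier, ideally persistent (DOI, Handle, ARK)
    access_date: str    # Access Date and Time
    editors: Optional[str] = None  # optional additional credit

    def render(self) -> str:
        """Join the elements in the order used by the slide's example citation."""
        sentences = [
            f"{self.authors}.",
            f"{self.release_date}.",
            f"{self.title} ver. {self.version}.",
        ]
        if self.editors:
            sentences.append(f"Edited by {self.editors}.")
        sentences.append(f"{self.archive}.")
        sentences.append(f"Data set accessed {self.access_date} at {self.locator}")
        return " ".join(sentences)


citation = DataCitation(
    authors="Cline, D., R. Armstrong, R. Davis, K. Elder, and G. Liston",
    release_date="2002, Updated 2003",
    title="CLPX-Ground: ISA snow depth transects and related measurements",
    version="2.0",
    archive="National Snow and Ice Data Center",
    locator="http://dx.doi.org/10.5060/D4MW2F23z",
    access_date="2008-05-14",
    editors="M. Parsons and M. J. Brodzik",
)
print(citation.render())
```

Keeping the elements as separate fields means the same record can feed both a human-readable citation and the repository's metadata, and a subset description (temporal or spatial range, query id) could be added as one more optional field.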
26
Sample Data Sources
NOAA Coastal Water Temps: http://www.nodc.noaa.gov/dsdt/cwtg/catl.html
World Sea Temperatures: http://www.seatemperature.org/
BODC CTD and Underway Data: https://www.bodc.ac.uk/data/online_delivery/amt/
Global Ocean Data: http://www.coriolis.eu.org/Observing-the-ocean/Global-and-regional-views/Global-Ocean
Real Time Arctic Data: http://www.arctic.noaa.gov/data.html
BCO-DMO: http://bcodmo.org/data
27
OceanTeacher Academy