Tobias Weigel, Deutsches Klimarechenzentrum (DKRZ): Persistent Identifiers. Solving a number of problems through a simplistic mechanism.

Presentation transcript:


Agenda
- What are Persistent Identifiers (PIDs)?
- Extended PIDs
- How to use them

What are PIDs?

Persistent identifiers come in various formats
- 10876/abc123
- /WDCC/CMIP5.NCCNMpc
- ark:/13030/tf5p30086k
- /
- urn:lsid:ubio.org:namebank:

Why do we need PIDs? Tracking ID, CIM ID, DRS Syntax, ...

PIDs point to resources: /abc123, via a resolution service, to example.com/xyz567

The resource is a black box. It may be data, metadata, software code, a document, or something else.

PIDs are globally unique: /abc, /abc123

URLs are not persistent over time ("link rot"): a URL that works today may later return "Not found".

PIDs are persistent over time: the same PID (/abc123) remains valid beyond today.

PIDs establish a redirection layer: the PID (/abc123) is stable, while the location it points to is unstable.

Operations on a PID
- Create PID
- Update the URL the PID points to
- (Delete PID)
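
The PID operations listed on this slide (create, update the target URL, delete) together with resolution can be sketched as a toy in-memory registry. This is only an illustration of the redirection idea: the class `PidRegistry`, its method names, and the second URL are hypothetical and do not correspond to any real PID service API.

```python
class PidRegistry:
    """Toy in-memory stand-in for a PID resolution service (hypothetical)."""

    def __init__(self):
        self._records = {}  # PID string -> URL the PID currently points to

    def create(self, pid, url):
        # A PID must be unique, so creating it twice is an error.
        if pid in self._records:
            raise ValueError(f"PID already exists: {pid}")
        self._records[pid] = url

    def update(self, pid, new_url):
        # The PID stays the same; only the location behind it changes.
        if pid not in self._records:
            raise KeyError(pid)
        self._records[pid] = new_url

    def delete(self, pid):
        # Deletion is optional on the slide; persistent systems usually avoid it.
        del self._records[pid]

    def resolve(self, pid):
        return self._records[pid]


registry = PidRegistry()
registry.create("10876/abc123", "http://example.com/xyz567")
first = registry.resolve("10876/abc123")

# The resource moves; the PID itself is unchanged.
registry.update("10876/abc123", "http://example.org/moved/xyz567")
second = registry.resolve("10876/abc123")
```

Anyone who cited `10876/abc123` before the move still reaches the resource afterwards, which is the point of the redirection layer.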

There are many PID systems / infrastructures
- Handle System
- Archival Resource Key (ARK)
- Life Science Identifier (LSID)
- Persistent URL (PURL)
- Uniform Resource Name (URN)

How do PID infrastructures differ? A PID infrastructure comprises:
- Identifier name schema
- Resolver service
- Persistency mechanism
- Additional services

Extended PIDs: We need to go beyond the simple redirection view

Some information must be stored persistently: a checksum (7D01E436) recorded with 10876/A today can be used to verify the object in 2015.
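
The checksum scenario on this slide can be sketched as follows: a checksum is stored in the PID record when the object is registered and compared again at a later date. The slide does not name the checksum algorithm, so SHA-256 is an assumption here, and the record layout and function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Assumption: SHA-256; the slide does not say which algorithm is used.
    return hashlib.sha256(data).hexdigest()


def verify(record: dict, data: bytes) -> bool:
    """Compare the checksum stored with the PID against the current bytes."""
    return record["checksum"] == checksum(data)


# "Today": register the object and store its checksum in the PID record.
original = b"example payload"
record = {"pid": "10876/A", "checksum": checksum(original)}

# "Later": fetch the object again and verify it against the stored checksum.
unchanged_ok = verify(record, original)
modified_ok = verify(record, b"example payload (modified)")
```

Because the checksum lives in the PID record rather than next to the data, it survives even if the data is moved or silently altered.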

Build more complex information structures, for example a list [1, 5, 13, 9, 12] or a key-value map {A: 1, B: 4, C: 3, D: 7}

Collections of PIDs are required for our use cases: 10876/collection1 contains 10876/A and 10876/B

Graphs or trees of PIDs are required as well: 10876/A, 10876/B, 10876/C

The graph nodes and edges may be typed. Nodes: 10876/A, 10876/B, 10876/C. Node types: Data object, Metadata object. Edge types: has metadata, older version.

The graph structure must be stored persistently: 10876/C and 10876/B, a data object and a metadata object linked by "has metadata", form one combined entity.
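
A typed PID graph whose structure is stored in the PID records themselves might look like the sketch below. The `PidRecord` class and the `targets` helper are hypothetical, and the assignment of object types and link directions to 10876/A, 10876/B and 10876/C is an assumption, since the slide text does not make the diagram layout recoverable.

```python
from dataclasses import dataclass, field


@dataclass
class PidRecord:
    pid: str
    object_type: str                            # typed node, e.g. "Data object"
    links: list = field(default_factory=list)   # typed edges: (relation, target PID)


# Assumed arrangement: A and B are data objects, C is a metadata object.
records = {
    "10876/A": PidRecord("10876/A", "Data object",
                         [("has metadata", "10876/C"), ("older version", "10876/B")]),
    "10876/B": PidRecord("10876/B", "Data object", [("has metadata", "10876/C")]),
    "10876/C": PidRecord("10876/C", "Metadata object"),
}


def targets(pid, relation):
    """Follow all edges of one relation type starting at the given PID."""
    return [target for (rel, target) in records[pid].links if rel == relation]


metadata_of_a = targets("10876/A", "has metadata")
older_versions_of_a = targets("10876/A", "older version")
```

Since each link is part of a PID record, the graph persists independently of where the underlying objects are stored.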

Collections can be realized through graphs: 10876/collection links to 10876/A and 10876/B

What must be stored persistently?
- Minimal metadata (key-metadata)
- Checksum
- PID creation time stamp
- Graph structure (links)
- Collection membership
(Diagram labels: static, dynamic)

Levels of preservation: a primary and a secondary level of preservation (diagram labels: 10876/abc123, Minimal metadata)

PIDs are a topic for international collaboration
- Relation types must be standardized.
- Research Data Alliance: WG 'PID Information Types', WG 'Type Registry', collections?
(Diagram: 10876/A, 10876/B, 10876/C; 10876/A, 10876/collection1)

Usage scenario: Provenance as a DAG. Legend: data object, PID, link "was derived from". (Diagram label: cdo t)
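
Provenance as a DAG of "was derived from" links between PIDs can be traversed to find everything a data object was ultimately derived from. The PIDs and the derivation chain below are made up for illustration, and the function name `ancestors` is hypothetical.

```python
# Hypothetical provenance links: PID -> PIDs it "was derived from".
was_derived_from = {
    "10876/C": ["10876/A", "10876/B"],   # C was computed from A and B
    "10876/D": ["10876/C"],              # D was computed from C
}


def ancestors(pid, links):
    """Return all PIDs reachable by following 'was derived from' links."""
    seen = set()
    stack = list(links.get(pid, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(links.get(current, []))
    return sorted(seen)


sources_of_d = ancestors("10876/D", was_derived_from)
sources_of_a = ancestors("10876/A", was_derived_from)
```

The `seen` set keeps the traversal correct when several derivation paths lead to the same source object, which is what distinguishes a DAG from a tree.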

Software: How do we actually use PIDs?

I am biased towards the Handle System
- For technical reasons: key-metadata is a unique feature that is required for PID graphs
- For practical reasons, for example: ARKs and URNs lack wide adoption and support; PURL maintenance is not clear; LSIDs in older literature are not persistent
- Handle System has an operational perspective

What is the Handle System?
- Developed by CNRI (Corporation for National Research Initiatives)
- Registered trademark
- Fee for registering new prefixes (e.g. )
- Customers, e.g. US military, International DOI Foundation

How does the Handle System work? A central resolution service directs a request for 10876/ to the prefix DB, which holds the record fields (URL:, Checksum:).
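
The two-step lookup suggested by this slide (central service by prefix, then the prefix's own database) can be sketched with plain dictionaries. This is a simplified model, not the Handle System protocol or any of its client APIs; the data, the field names `URL` and `CHECKSUM`, and the function `resolve` are hypothetical, and the checksum value is reused from an earlier slide purely as an illustrative entry.

```python
# Hypothetical data: prefix -> that prefix's own handle database.
PREFIX_DATABASES = {
    "10876": {
        "10876/abc123": {"URL": "http://example.com/xyz789", "CHECKSUM": "7D01E436"},
    },
}


def resolve(handle):
    """Resolve a handle in two steps: find the prefix database, then the record."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    database = PREFIX_DATABASES.get(prefix)   # step 1: central lookup by prefix
    if database is None:
        raise LookupError(f"unknown prefix: {prefix}")
    record = database.get(handle)             # step 2: lookup inside the prefix DB
    if record is None:
        raise LookupError(f"unknown handle: {handle}")
    return record


record = resolve("10876/abc123")
```

Splitting the lookup this way lets each prefix owner run its own database while clients only need to know the central entry point.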

What are Digital Objects? 10876/abc123, http://example.com/xyz789

Build a stack of lightweight components: LAPIS API for Persistent Identifier Services (on GitHub)

Further reading
- Weigel et al.: "A framework for extended persistent identification of scientific assets" (submitted to the Data Science Journal)
- Duerr et al., doi: /s

Thank you. All slides available here: redmine.dkrz.de/seminar

The greater plan
- LTA application: Q4 2012, Q
- EUDAT integration: 2014
- CMIP6 + :