Persistent Identifiers (PIDs) & Digital Objects (DOs) Christine Staiger & Robert Verkerk SURFsara.

Presentation transcript:


Persistent Identifiers (PIDs)
- Pointers to data resources
  - Digital resources: data, metadata, documents
  - Real-world objects: species, patient, cell line
- Globally unique
- Exist indefinitely
- Used to identify and retrieve resources
- Examples: ISBNs, BSNs, DOIs, EPIC PIDs, URIs

Digital Object (DO)
- Components: data, PID, metadata
- Synchronise PID, data and metadata during creation, maintenance and deletion of a digital object!
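
The synchronisation requirement can be illustrated with a minimal sketch. The class `DigitalObject`, its fields and the choice of SHA-256 are assumptions made for illustration only; the slides do not prescribe a data model.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical in-memory model: one PID, one data payload, one metadata dict."""
    pid: str
    data: bytes
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Creation: derive the dependent metadata right away.
        self._sync()

    def _sync(self):
        # Keep the metadata that describes the data consistent with the data.
        self.metadata["checksum"] = hashlib.sha256(self.data).hexdigest()
        self.metadata["size"] = len(self.data)

    def update_data(self, new_data: bytes):
        # Maintenance: change the data and refresh the metadata in one step.
        # The PID itself stays the same.
        self.data = new_data
        self._sync()


do = DigitalObject(pid="prefix1/suffix1", data=b"first version")
old_checksum = do.metadata["checksum"]
do.update_data(b"second version")
```

Routing every change through one method is one way to make sure data and metadata cannot drift apart; a real system would also have to update the PID record at the PID service.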

PIDs are static
[Figure: PID 1 to PID 4, each pointing to one of Data 1 to Data 4 within the world of data infrastructure (hardware)]

Workflow 1: Change storage environment
[Figure: PID1 and PID2; Storage site A and Storage site B]

Use Case 1: Digital repositories
- PIDs point to the landing page of the digital repository, which shows the metadata
- The "real" data can be downloaded from this page via another link
- E.g. B2SHARE, 3.TU Datacentrum and DANS repositories
- PID resolves to

Use Case 2: Enabling data flows
- PIDs point to the data directly
- If needed, create another field that specifies the data type so that the right application can be chosen
- Use data in a workflow via its PID, NOT via its actual location!
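
A small sketch of why a workflow should refer to the PID rather than the location. The registry dictionary, the function names and the `example.org` URLs are hypothetical stand-ins for a real PID service.

```python
# Hypothetical PID registry: maps a PID to the current location of the data.
registry = {"prefix1/suffix1": "https://site-a.example.org/data/file1.dat"}


def resolve(pid: str) -> str:
    """Return the location currently registered for a PID."""
    return registry[pid]


def workflow_step(pid: str) -> str:
    # The workflow stores only the PID and looks up the location at run time.
    return f"processing {resolve(pid)}"


before = workflow_step("prefix1/suffix1")

# The data moves from storage site A to storage site B.
# Only the registry entry changes; the workflow definition is untouched.
registry["prefix1/suffix1"] = "https://site-b.example.org/data/file1.dat"
after = workflow_step("prefix1/suffix1")
```

The same call succeeds before and after the move, which is the point of keeping physical locations out of the workflow.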

Resolving PIDs
[Figure: a client, the Global Registry (e.g. the Handle System) and a Local Handle Service with a primary site and secondary sites A (e.g. SURFsara) and B]
- Client gets request to resolve hdl:123/
- Client sends request to Global to resolve 0.NA/123 (prefix handle for 123/456)
- Global responds with service information for 123
- Server responds with handle data
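
The two-stage lookup can be mimicked with plain dictionaries. This is a toy model, not the Handle System protocol: the service name, the record layout and the URL are invented for illustration, and the handle `123/456` is taken from the parenthetical on the slide.

```python
# Toy model of the global registry: prefix handle -> responsible local service.
GLOBAL_REGISTRY = {
    "0.NA/123": "local-service-123",
}

# Toy model of the local handle services: handle -> handle data.
LOCAL_SERVICES = {
    "local-service-123": {
        "123/456": {"URL": "https://repository.example.org/objects/456"},
    },
}


def resolve(handle: str) -> dict:
    """Resolve a handle in two stages: global registry first, then local service."""
    prefix, _, _suffix = handle.partition("/")
    # Stage 1: ask the global registry which service is responsible for the prefix.
    service = GLOBAL_REGISTRY[f"0.NA/{prefix}"]
    # Stage 2: ask that local service for the handle data.
    return LOCAL_SERVICES[service][handle]


record = resolve("123/456")
```

Because the global registry only knows prefixes, each local service can manage its own handles independently.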

Example: Relationships between DOs
PID: prefix1/suffix1
  Metadata:
    key1: …
    key2: prefix2/suffix2
    key3: prefix3/suffix3
PID: prefix2/suffix2
  Metadata:
    key1: …
    key2: prefix1/suffix1
PID: prefix3/suffix3
  Metadata:
    key1: …
    key2: prefix1/suffix1
- Part of / has part relationships
- Model cohort-patient relationship
- Model patient-samples relationship
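
The relationship pattern above might be modelled as follows. The key names `has_part` and `part_of` are assumptions; the slide only shows generic keys (key2, key3) whose values are PIDs.

```python
# PID -> metadata. Relationships are expressed by storing other PIDs as values.
objects = {
    "prefix1/suffix1": {"has_part": ["prefix2/suffix2", "prefix3/suffix3"]},
    "prefix2/suffix2": {"part_of": "prefix1/suffix1"},
    "prefix3/suffix3": {"part_of": "prefix1/suffix1"},
}


def parts_of(pid: str) -> list:
    """Return the PIDs listed as parts of the given object (empty if none)."""
    return objects[pid].get("has_part", [])


def is_consistent(pid: str) -> bool:
    # Every part must point back to its parent, otherwise the two
    # directions of the relationship have drifted apart.
    return all(objects[part].get("part_of") == pid for part in parts_of(pid))


cohort_parts = parts_of("prefix1/suffix1")
```

With a cohort as `prefix1/suffix1` and patients as the other two objects, `parts_of` lists the patients of the cohort, and `is_consistent` checks that each patient refers back to it.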

Guidelines: Characteristics of PIDs
- What should be identifiable by a PID? Define what is data and what is metadata
- Granularity of PIDs: how much information should a PID contain?
  - Location
  - Checksums
  - Other system-specific information
  - Do not put information about the contents of the data here!
- Don't mix PIDs with other IDs, e.g. database IDs
- Opacity: no assumptions about the data context in the PID
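
One common way to obtain opaque identifiers is a random suffix. The sketch below assumes a UUID suffix and a hypothetical record layout (`URL`, `CHECKSUM`); neither is mandated by the guidelines above.

```python
import uuid


def mint_pid(prefix: str) -> str:
    """Build a PID from a prefix and a random suffix that carries no meaning."""
    return f"{prefix}/{uuid.uuid4()}"


pid = mint_pid("prefix1")

# System-specific information lives in the record attached to the PID,
# not in the identifier string itself. The field names are illustrative.
record = {
    "URL": "https://site-a.example.org/data/file1.dat",
    "CHECKSUM": "not-computed-in-this-sketch",
}
```

Since the suffix says nothing about project, file name or database key, the PID stays valid when any of those change.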

Guidelines: Referable data
- How persistent is the data?
- What and how much in a DO may change?
- When should a new DO be instantiated?
- Versioning via PIDs?
- Define PID management processes:
  1. Connecting data, metadata and PID
  2. Handling changes in data and metadata
  3. Handling changes in the storage environment
  4. Deleting data, metadata or PIDs
- Which problem should be addressed with PIDs?

The Handle System
- Offers a resolution service for PIDs
- Gives a lot of freedom for implementation, e.g. PID information types
- Software architecture designed for high availability and scalability
- Basis for several PID providers
- Costs: $50 for registering a prefix plus $50 per year for maintenance
- EPIC PIDs and DOIs build their services upon the Handle System; thus, such a PID is a handle

PID systems
DOIs
- Data registry service
- Library-specific metadata standard incorporated in the PID entry (author info, Dublin Core, …), ensuring interoperability between registered data objects
- Costs: $0.06 to $1 per PID, depending on the service (CrossRef), plus an annual fee
EPIC PIDs
- Data registry service
- Create your own metadata for PIDs for data interoperability
- Only costs for the handle service
- With one prefix one can create as many PIDs as wanted

Example: Python epicclient …

B2SAFE: iRODS and KNMI NFS mount
[Figure: seismic system, KNMI NFS share, iRODS instances, PID, dCache, HPSS and DMF]
- OS: /data/orfeus/data/continuous/...
- iRODS: /ORFEUS/eudat/data/continuous/…
- iRODS: /vzSARA1/eudat/knmi/…

Dataflow KNMI → SURFsara
B2SAFE is implemented as a two-step process:
1. Register a file in iRODS
   - ireg a file at KNMI
   - Create a KNMI PID
2. Replicate a file in iRODS to another node
   - Replicate the registered file to SURFsara
   - Create a SURFsara PID
   - Update the KNMI PID
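
The two-step flow can be sketched in memory. This is not B2SAFE or iRODS code: it assumes that each step creates a PID and that the original record is updated with a reference to the replica, and all paths, prefixes and field names are hypothetical.

```python
import uuid

pid_records = {}  # PID -> record (in-memory stand-in for a PID service)


def register(path: str, prefix: str) -> str:
    """Step 1: register a file and create a PID that points to it."""
    pid = f"{prefix}/{uuid.uuid4()}"
    pid_records[pid] = {"location": path, "replicas": []}
    return pid


def replicate(source_pid: str, target_path: str, target_prefix: str) -> str:
    """Step 2: replicate to another node, create a PID there, update the source PID."""
    replica_pid = register(target_path, target_prefix)
    pid_records[replica_pid]["parent"] = source_pid
    pid_records[source_pid]["replicas"].append(replica_pid)
    return replica_pid


knmi_pid = register("/zoneA/example/file1", "prefix-knmi")
surfsara_pid = replicate(knmi_pid, "/zoneB/example/file1", "prefix-surfsara")
```

After both steps, the original record lists the replica and the replica record points back to its parent, so either copy can be found from the other.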

Example handle
- Structure: domain / prefix / unique identifier
- KNMI: d89d6771dd88?noredirect
- SURFsara: a0369f0b5f26?noredirect
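
The domain / prefix / unique identifier structure can be taken apart with the standard library. The URL below is a made-up example, since the handles on the slide are truncated; `split_handle_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_handle_url(url: str):
    """Split a handle URL into domain, prefix, unique identifier and query string."""
    parts = urlsplit(url)
    # The path has the form /<prefix>/<unique identifier>.
    prefix, _, identifier = parts.path.lstrip("/").partition("/")
    return parts.netloc, prefix, identifier, parts.query


domain, prefix, identifier, query = split_handle_url(
    "https://hdl.example.org/12345/abcd-ef01?noredirect"
)
```

Here the query string `noredirect` is only separated from the identifier; how a resolver interprets it is outside this sketch.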

Installation
- EPIC client, e.g. the Python or Perl client
- Handle server and an EPIC API server
- iRODS and B2SAFE for ingesting data (optional)
- SURFsara provides the Handle server and the EPIC API

How to obtain a handle prefix
- The production prefix has to be purchased from CNRI
- Costs: $50 per year plus a one-time $50 fee for the request
- More information on how to obtain a handle prefix:
- More information on how to make use of SURFsara's PID service: