ABS Governance Dialogues The Role of Documentation in ABS and TK Governance Lima, Peru 21 January 2007 An Overview of Persistent Identifiers George M.


ABS Governance Dialogues The Role of Documentation in ABS and TK Governance Lima, Peru 21 January 2007 An Overview of Persistent Identifiers George M. Garrity Microbiology and Molecular Genetics Michigan State University

The phone call from Peru…

My assignment
- To provide the TEG with an overview of persistent identifiers and digital objects
- Explore both the technical and social/policy issues
- Provide some perspective on how persistent identifiers have been applied in two settings
  - Mature application - CrossRef
  - Evolving application - NamesforLife
- Offer some thoughts on how PIDs might be applied to Certificates of Origin and Traditional Knowledge

Disclaimers
- An end-user of persistent identifiers
- Dual interests and IP in this space

So, what’s the problem?

“…link heterogeneous electronic libraries. The difficulties inherent in this third objective ultimately led to this paper.” Kahn and Wilensky 1993

“But for the bioinformatician concerned with integrating and computing upon distributed information… In second place is perhaps naming (identifying), with all the gloriously idiosyncratic embedded semantics of local identifiers in disparate forms.” Clark 2003

So, what’s the problem?

“Even well-formed and properly applied names can serve as a source of confusion and considerable frustration. This is hardly a new problem.” Garrity and Lyons 2003

“Although used every day, identifiers are a mystery to many people, including people responsible for building complex information systems.” Report of the NISO Identifiers Roundtable 2006

“And now, a much more succinct way to say this: our systems are autistic. They don’t make inferences. When we learn something in one system or one area, it doesn’t carry over to other areas.” McComb 2006

Let’s start with some working definitions

Digital objects
- An instance of an abstract data type that has two components: data and key metadata
- Key metadata includes a handle
- A handle is a globally unique identifier that is bound to the digital object

Other key properties
- Digital objects differ from database records and files, are stored in network-accessible repositories, and are accessed using a repository access protocol.

From: Kahn and Wilensky 2006 Int J. Digit. Lib 6:

Identifiers

Essential elements in
- Human - machine communications
- Machine - machine communications

Ideally…
- Exist as an unambiguous string
- Context and application dependent
- Actionable
- Resolvable

Other points to consider
- Semantically opaque
- Global or local
- Unique or non-unique
- Unanticipated uses

Definitions (continued)

Persistent Identifiers
A name or an identifier for a resource that uniquely identifies that resource and will be forever associated with that resource. It will never be reassigned to any other resource and will not change regardless of where the resource is located or whatever protocol is used to access it. Use of a well-managed persistent identifier rather than a location will ensure that when a document is moved, or its ownership changes, the links to it will remain actionable.

From: Diana Dack, Persistence is a Virtue. Information Online Conference, Sydney. January 2001

Definitions (continued)

Name resolution
The process of mapping a persistent identifier to a URL that retrieves a resource. The URL locates the named resource identified by the persistent identifier (the name).

[Diagram: name resolution maps PID 1, PID 2, and PID 3 to URL 1, URL 2, and URL 3; the PID identifies the resource and the URL locates it.]
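The resolution step defined above can be pictured as a table lookup. The following is a minimal sketch only, assuming an in-memory table; the identifiers, URLs, and function names (`resolve`, `relocate`) are hypothetical and do not come from any real resolver.

```python
# Hypothetical resolver table: persistent identifier (the name) -> current URL.
# The identifiers and URLs below are made up for illustration.
RESOLVER_TABLE = {
    "pid:1": "https://repository.example.org/objects/a",
    "pid:2": "https://repository.example.org/objects/b",
}


def resolve(pid, table=RESOLVER_TABLE):
    """Return the URL that currently locates the resource named by pid."""
    try:
        return table[pid]
    except KeyError:
        raise LookupError(f"unregistered identifier: {pid}") from None


def relocate(pid, new_url, table=RESOLVER_TABLE):
    """Record that the resource moved; the identifier itself stays the same."""
    if pid not in table:
        raise LookupError(f"unregistered identifier: {pid}")
    table[pid] = new_url
```

When a resource moves, only the table entry changes, so any link that cites the identifier rather than the URL keeps working.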

Inherent in the design of such systems….

Name registration & Name resolution

[Diagram: an authority registers PIDs (PID 1, PID 2, PID 3) with their URLs (URL 1, URL 2, URL 3) and key metadata in a global registry; a user resolves a PID to the URL; the PID identifies the resource and its metadata, and the URL locates it.]
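Registration and resolution can be sketched together as one small registry. This is an illustrative sketch under stated assumptions, not a description of any deployed system: the `Registry` and `Record` classes, the `authority/local_name` identifier form, and the rule that an identifier can never be registered twice are all assumptions chosen to mirror the definitions above.

```python
from dataclasses import dataclass


@dataclass
class Record:
    url: str          # where the resource currently lives
    metadata: dict    # key metadata kept alongside the identifier


class Registry:
    """Hypothetical registry run by a single naming authority."""

    def __init__(self, authority):
        self.authority = authority
        self._records = {}

    def register(self, local_name, url, metadata):
        """Issue a new identifier; an identifier is never reassigned."""
        pid = f"{self.authority}/{local_name}"
        if pid in self._records:
            raise ValueError(f"{pid} is already assigned and cannot be reused")
        self._records[pid] = Record(url, dict(metadata))
        return pid

    def resolve(self, pid):
        """Name resolution: identifier -> current URL."""
        return self._records[pid].url

    def metadata(self, pid):
        """Return a copy of the key metadata registered with the identifier."""
        return dict(self._records[pid].metadata)
```

Keeping the uniqueness check inside `register` is one way to express the idea that the registering authority, not the end user, is responsible for persistence.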

[Figure: DOI resolution. A content assigner registers a DOI in the DOI directory, which maps the DOI to a URL. Source: Norman Paskin, International DOI Foundation]

Comparing identifiers

A single unambiguous string
- A label that identifies an entity
- ISBN
- ATCC 27126*
- L-681,

A numbering scheme
- A method of providing consistent syntax to denote class membership of an entity
- A formal standard or industry convention
- An arbitrary internal system
- Key point is establishing a 1:1 correspondence between labels and members

Enumeration
- The number or label is simply a string

Comparing identifiers (cont.)

An infrastructure specification
- A syntax by which an identifier can be expressed in a form suitable for use within a specific infrastructure
- Actionable identifiers
  - URI (URN and URL)
  - ISBN numbers as UPC/EAN identifiers
- Does not mandate a method of creating labels
- Does not create a managed environment

Comparing identifiers (cont.)

A fully implemented identifier system

Includes
- Unique identifiers
- A formalized infrastructure
- Management policies for registration, structured interoperable metadata, policy, and governance mechanisms

Examples
- UPC/EAN barcodes and RFID tags
- Digital object identifiers (digital identifiers of objects)

Desired properties of a candidate PID
- Semantically opaque - avoid the pitfalls of embedded meaning
- Governance - is there a technical and social framework overseeing the development, implementation and “marketing” of the PID?
- Persistence - is there a mechanism in place to guarantee persistence of issued PIDs, when so desired?
- Registration - is there a mechanism for global registration of the PIDs or can anyone issue PIDs?
- Metadata - is there a minimal requirement for metadata associated with each identified object?
- Accepted standard - is there evidence that the PID is an accepted standard?
- Globally unique - are the PIDs globally unique?
- Widespread usage - how many PIDs have been issued and what is the rate of issuance of new PIDs?

Desired properties of a candidate PID (cont.)
- Object/location - what does the PID identify?
- Actionable - are network services attached/embedded?
- Unique - does the resolution service check for uniqueness at the local level?
- Interoperability - can the identifiers be readily incorporated into other applications without modification or permission?
- Granularity - can the identifiers be assigned to subcomponents (nesting of entities within entities)?
- Business model - is there a compelling business need for the PIDs to ensure that the infrastructure can be maintained in a self-supporting manner?

Comparison of identifier properties

What does a Digital Object Identifier look like?
- The prefix is assigned to the content provider by a DOI Registration Agency, or the Handle System directly.
- The suffix is an opaque string supplied by the content provider.
- Handle software stores a mapping of the Handle to one or more locations (or services).
- In virtually all cases today, the Handle is mapped to a location (URL).

[Figure: example DOI ending in "/myownnumbers", labeled prefix, suffix, and subsuffix, with the captions "resolves to" and "Which used to be:"]
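The prefix/suffix structure can be illustrated with a short parsing sketch. It assumes only that the prefix and suffix are separated by the first "/" and that the suffix is opaque, so any later "/" belongs to the suffix. The function name `split_doi` and the prefix `10.9999` are hypothetical, chosen for illustration.

```python
def split_doi(doi):
    """Split a DOI string into (prefix, suffix) at the first '/'.

    The suffix is treated as an opaque string supplied by the content
    provider, so it is returned unchanged even if it contains more slashes.
    """
    prefix, separator, suffix = doi.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix identifier: {doi!r}")
    return prefix, suffix


# Hypothetical identifier; "10.9999" is not a real registrant prefix.
example_prefix, example_suffix = split_doi("10.9999/myownnumbers")
```

Because the suffix is opaque, the sketch deliberately makes no attempt to interpret it; only the registry that issued the identifier can say what it maps to.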

Syntax of some other PIDs in “common” use

::= "/" Persistent URLs LSID Life Science Identifiers ::= / / urn: : : : : h.gov:GenBank/accession:NT_001063:2
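As an illustration of how such a syntax is consumed by software, here is a minimal sketch of an LSID parser. It assumes the commonly documented LSID form `urn:lsid:authority:namespace:object[:revision]`, which is not legible in the text above; the authority `authority.example.org` and the function name `parse_lsid` are hypothetical.

```python
def parse_lsid(lsid):
    """Split an LSID into its colon-separated components.

    Assumed form: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    """
    parts = lsid.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {lsid!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        # The revision component is optional.
        "revision": parts[5] if len(parts) == 6 else None,
    }


# Hypothetical authority; the namespace, object, and revision follow the
# fragment that survives in the slide text.
parsed = parse_lsid("urn:lsid:authority.example.org:GenBank/accession:NT_001063:2")
```

Unlike the opaque DOI suffix, every field of an identifier in this form carries meaning, which is the kind of embedded semantics the earlier slides warn about.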

Two implementations using DOIs

CrossRef
- Independent membership association, founded and directed by STM publishers.
- Mission is to connect users to primary research literature through a DOI RA that performs reference cross-linking, subject to publisher access controls.
- The largest and most successful implementation of DOI services.

NamesforLife
- NamesforLife is a proprietary semantic resolution service developed at MSU.
- It provides a method for persistently linking the occurrence of a biological name or other technical term in third-party content to managed information about its origins, formal definition, current usage, and related goods and services.

Rumsfeld’s axiom and knowledge bleed

“…because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know.”

The knowledge gradient
- Basic and applied research advances knowledge
- Knowledge bleed results in a loss of knowledge that has already been gained
- Semantic resolution provides a mechanism to combat knowledge bleed

[Diagram: gradient labeled Unknown unknowns, Known unknowns, Unknown knowns, Known knowns]


Ramifications of misunderstanding a name

Names trigger specific responses
- But, the concepts to which names apply are not static
- May not always map 1:1
- May require expertise for accurate interpretation

Highly significant
- Wrong assumptions, assertions, or hypotheses
- Misdiagnosis of infectious diseases
- Misapplication of public policies

Significant
- Lost opportunities
- Failure to reach potential customers potentially interested in marketed content, goods, and services at point of need
- The long-tail phenomenon*

Some thoughts on selecting a PID for CO and TK
- The intended use of the identifier
- Syntactic rules governing the form of the identifier
- What the identifier resolves to
- The technical infrastructure that is available to support the identifier and the parties operating it
- Policies governing creation, maintenance, support, and persistence of the identifier
- Information about any metadata related to the identifier that is or must be made available
- A history of the identifier, including any changes in any of the above points over time

Source: Report of the NISO Identifiers Roundtable 2006

Questions?