1
The DOI System: Overview
International DOI Foundation
2
Outline / Key concepts
- Origins of the DOI System
- Current position of DOI System activities
- Persistence
- Actionable identification
- Interoperability
- System components
- Standardisation
- DOI System applications
3
Further reading
- "Digital Object Identifier (DOI®) System", article in: Encyclopedia of Library and Information Sciences, third edition (Taylor & Francis, forthcoming)
4
DOI (Digital Object Identifier) System
- Initially developed from the publishing industry, but now wider
- A non-profit collaboration to develop infrastructure for persistent identification and management of content
- Approx. 2000 user organisations (through agencies): CrossRef (scholarly publishers); EC; science data; major ISBN agencies; etc.
- Currently being standardised in ISO (TC46/SC9), the home of ISBN and other "content identifiers"
- One application of the Handle System®, to which it adds:
  - additional features: social and technical infrastructure, policies, metadata management
  - a focus on one area of interest (content/intellectual property)
  - a specific data model based on indecs (discussed later)
- DOI System technology is equally applicable to parties and licences
5
1966: ISBN began "identification numbering"
"In 1965 the largest British book wholesaler WH Smith announced their intention to move their wholesaling and stock distribution operation to a purpose built warehouse in Swindon [in 1967]. To aid efficiency they would install a computer, and this would necessitate the giving of numbers to all books held in stock…"
"The idea of numbering books is not new. One British publishing house has been giving numbers to its books for nearly a hundred years. What is an entirely new concept, however, is that numbers should be given to all books; that these numbers should be unique and non-changeable; and that they should be allocated according to a standard system…"
(David Whitaker, The Bookseller, May )
6
ISO continues "identification numbering"
Information and Documentation - Identification and Description
- ISO International Standard Book Numbering (ISBN)
- ISO International Standard Serial Number (ISSN)
- ISO International Standard Recording Code (ISRC)
- ISO International Standard Technical Report Number (ISRN)
- ISO International Standard Music Number (ISMN)
- ISO International Standard Audiovisual Number (ISAN)
- ISO International Standard Musical Work Code (ISWC)
- ISO Project Version identifier for Audiovisual Works (V-ISAN)
- ISO Project International Standard Text Code (ISTC)
- ISO Project International Standard Name Identifier
- ISO Project Digital Object Identifier System
Trends:
- Towards identifiers of abstract entities
- All ISO TC46/SC9 identifiers now carry mandatory structured metadata to specify the item identified (either from the start, or when revised)
7
And other things…
- National Bibliography Numbers…
- Government codes…
- Supply chain: bar codes, RFIDs…
- Hyper-G…
8
Persistent identifiers on the web
- 1992: Berners-Lee: "universal document identifier"
- 1994: RFC 1738: Uniform Resource Locator
- IETF consensus process:
  - "The web is not the universe"
  - "people can change the URI when moving documents…" "oh no they can't" "oh yes they can"…
  - "Not just documents"
9
Web-related identifiers
- URI, URL and URN: not sophisticated enough alone for content management
- Additional techniques: PURLs, RDF, SW, ARK, N2T, Handle, etc.
- Related standards:
  - OpenURL: a syntax to create web-transportable packages of metadata and/or identifiers about an information object. Not an identifier, but a complementary technology for appropriate redirection of identifier resolution, used with URLs and Digital Object Identifiers (DOI names)
  - "info" URI Registry: turns legacy identifiers (e.g. info:lccn/ ) into URIs. IETF RFC 4452: The "info" URI Scheme for Information Assets with Identifiers in Public Namespaces
- Note: the DOI System is not designed only for the web, but the web is currently the most common digital environment
10
Terminology: the over-used term "identifier"
- "Identifier" as numbering schemes (registries)
  - Normally central control, commitment
  - Examples: ISBN, EAN bar codes, IANA, ITU phone numbering plans, etc.
  - Normally focus on attributes (metadata)
- "Identifier" as syntax specifications
  - Normally little central control, e.g. URI (URL); MPEG-21 DII
  - Few structured attributes, low barriers to entry
  - Some more structured than others: e.g. URN, info URI
- Other confusions:
  - Some practical systems use both schemes and specifications
  - Representations and interactions between different schemes and specifications: e.g. an ISBN can be expressed as a URL, as an EAN bar code, as a DOI name, etc.
  - Identifier as "system" versus as a "unique label"
  - Schemes begin to be used for things outside scope
11
1995: Armati Report
- Information Identification - a report to STM publishers (Mar 95)
- Uniform File Identifiers - a report to AAP publishers (Oct 95)
- "..need to unify in one scheme music, audiovisual, document management, internet engineering, digital libraries, copyright registration and object based software" [i.e. web was not the focus]
- "..maximise utility of digital objects; enable core interoperability; enable integration of disparate sourced data; ability to trace ownership to manage rights"
- Requirements:
  - protect legacy investments
  - enable interoperability
  - provide link between digital and physical
  - maintain privacy of users
  - have persistence
  - standard syntax
  - global scalability
  - global uniqueness
  - global meaning
- Led to launch of DOI System initiative (AAP committee, Uniform File Identifier)
12
1998: DOI - Digital Object Identifier system
13
(1) DOI System: development in three tracks
[Diagram labels: Initial implementation; Full implementation; Activity tracking; Single redirection (persistent identifier); Metadata; Multiple resolution; Other efforts, standards, etc.]
- A continuing development activity
14
(2) Creation of an organisation
- Key driver: spend on development
- Key driver: cost reduction
[Diagram labels: International DOI Foundation members; Operating Federation; Agencies; Clients &]
15
Current DOI System activity (Oct 2007)
Registration Agency            Prefixes   Registrations Jun-Oct 2007   Registrations to date
CrossRef                       945        2,135,117                    29,517,872
Bowker                         74         3,031                        745,873
TIB                            11         41,583                       540,601
CNRI/default (experimental)    66         190                          143,477
mEDRA                          410        14,111                       126,895
Nielsen BookData               211        9                            36,578
CAL                            270        16                           451
OPOCE                          300        33                           57
Wanfang Data*                  2
TOTAL                          2027       2,592,775                    28,419,009
Source: (restricted access)
16
Current strategy
- Focus on enabling current RAs to generate more DOI names
- New RAs in new areas
- Social infrastructure development (RA policies)
- Business model: IDF -> RA -> C
- Incentive scheme: large discounts per DOI name for large numbers of registrations, e.g. 25% -> 90%+
- IDF has no role in this
17
Persistence
"It is intended that the lifetime of a [persistent identifier] be permanent. That is, the [persistent identifier] will be globally unique forever, and may well be used as a reference to a resource well beyond the lifetime of the resource it identifies or of any naming authority involved in the assignment of its name."
[Persistent Identifier] = URN in IETF RFC 1737: Functional Requirements for Uniform Resource Names.
- Persistence is more a matter of social issues than technical solutions
- Technology can assist
18
Persistence?
e.g. JISC Information Environment Architecture Standards Framework
19
Two principles for persistent identification
1. Obvious: assign ID to resource
   - Once assigned, the number must identify the same resource
   - Beyond the lifetime of the resource, or of the assigner
2. Less obvious: assign resource to ID
   - The resource must be "identified"
   - Must ensure it is always the same thing (bound)
   - Describe the resource "content" [with precision]
   - Failure to do this will ultimately break interoperability
"To an engineer, there is no forever. Instead, there is a fixed lifetime and a mechanism for moving forward before that lifetime expires" (Miller, D-Lib, Nov 1996) characterises the technologist's view. By contrast, a content/rights view would be that an ISBN identifies a book, irrespective of technology, for all time; there is no need for an upgrade to say so. It is not that one is right and one wrong: they are working at different layers. We want the basic engineering design (create unique identifiers and resolve them to current state data) to outlast any implementation.
The idea of associating an identifier with a "data structure" may have more to it than a "bag of bits". Examples: "Today's NY Times"; "the current temperature". The content always changes; the location can certainly change; the technology can change; but they are all valid candidates for an identifier. In the case of the NY Times, everything may change in the next 20 years, including ownership, method of delivery, etc. But it is still useful to have an identifier resolve to some set of type/value pairs, and at some level those type/value pairs may stay the same, at least the types and numbers of pairs. There is a design discipline necessary in assigning identifiers, which is not yet fully articulated.
- Technologists have focussed on (1) [and "bags of bits/data structures"]
- The content/rights world has focussed on (2) [and "intellectual content"]
- Both viewpoints are valid; (2) is now becoming more relevant
20
Back to 1994…
- URL (Uniform Resource Locator) is a location
- Managing by locations alone is not sustainable
- What we need is a solution for redirecting… a name for the object
  - Treating the object as a "first class object"
  - A name that could then be relied on even if the object is moved anywhere
  - Name would resolve to location (N -> L)
  - A name that is easy to automate and uses standard characters
- Hey, didn't those old-fashioned text people do something like that..?
21
1994: Uniform Resource Names: N -> L
22
1995: Persistent URLs – redirection L -> L
23
Digital Object Architecture: N -> L
- Wider scope than "the web": the internet
- Handles
24
1998: "Cool URIs don't change"
25
2001: Persistence on the web??
"One of the web sites I maintain is the Lisweb directory of library homepages. Every week, I run a link checker that contacts each page to see if it is still there, and every week about 20 sites that were in place seven days before have vanished. Across the Internet, the rate at which once-valid links start pointing at non-existent addresses -- a process called "link rot" -- is as high as 16 percent in six months. That means that about one sixth of all links will break."
NetConnect, Thomas Dowling, Library Journal, Fall 2001, p. 36
26
2002: Persistence on the web??
- 19% of links broken in 19 months
28
All references based on
29
From: ISO/TC 46/SC 9 Committee
Sent: 15 April :36
To:
Subject: Change of & Web addresses for ISO/TC46/SC9 work

Dear ISTC Colleagues,
Due to the creation of the new Library and Archives Canada, the Internet domain that hosts ISO/TC46/SC9 and its various Working Groups has been changed from "nlc-bnc.ca" to "lac-bac.gc.ca".
Please update your address books for the following:
- Jane Thacker:
- ISO/TC46/SC9 Secretariat:
And change your bookmarks for the ISO/TC46/SC9 Working Group 3 (ISTC) Web site to:
The address for the server that runs the ISTC-L discussion list has NOT been changed. Continue to send your messages to:
Thank you.
30
"Actionable" identifiers
- Resolution: the process in which an identifier is the input (a request) to a network service, to receive in return a specific output
- Action: "point and click" is what I do (URL model), so "what I point to (resolve to and get) is what is identified", right?
- It may be, but usually isn't. Consider:
  - Point-and-click "actionable" is not the same as referencing
  - We can identify things that are intangible ("works"), or fugitive ("performances")
  - Or that change: "Today's NY Times"
  - People and concepts can be identified but can't be "returned"
  - Pointing and clicking can return different things in different contexts
  - Pointing and clicking can give multiple options
- An identifier identifies an entity. Pointing and clicking is a service about that entity, even if a very simple one like "locate an instance", which often really means "locate a derivation"
- Entities can be physical, abstract, tangible, intangible, things, people, concepts, colours…
31
Interoperability
- We all know our own back yard ("We all know what we mean")
- Q: Why do we want persistent identifiers? A: For interoperability
- Interoperability = the possibility of use in services outside the direct control of the issuing assigner
  - "persistence is interoperability with the future"
- We know what we mean, but others may not
  - Identifiers assigned in one context may be encountered, and may be re-used, in another place (or time), without consulting the assigner
  - You can't assume that your assumptions will be known to someone else
- Interoperability is accelerated through automation. Two key events:
  - 1966: automation of supply chains (ISBN)
  - 1994: automation of sharing resources (WWW)
- Increasing interoperability = increasing chance of breakdown
32
Persistent identifier applications
ISSUES
- What are we identifying with this identifier? [content not just bits]
- What are we resolving to from this identifier?
- What, if any, explicit metadata are we making available?
- How will the cost of providing the infrastructure be met?
THEMES
- Identification of entities of all forms
- To be used in a variety of contexts
- Appropriate use of metadata at appropriate level
- Development of ontology tools to describe entity relationships
[Diagram labels: Persistent; Interoperable; Precise; Automation; Logic]
33
Persistent identifier applications
- DOI name = Digital Object Identifier name
  - An implemented identifier system
  - Packaged system of components
- Principles of persistent identification, including semantically consistent interoperation
- Implemented identifier systems: actionable labels following a specification
  - e.g. bar code system, DOI System
  - "if you use this system, then the label IS actionable"
- Packaged system: offering label + tools + business model
  - A packaged system is not essential, but is convenient
34
- Syntax
- Policies
- Data Model
- Resolution
35
DOI name syntax
- Can include any existing identifier, formal or informal, of any entity
- An identifier "container", e.g. /5678 / / verview-DOI
- NISO standard Z39.84
- First class object: name
- Not "intelligent" as a label: cannot tell what it is from looking at the DOI name
- Redirection through resolution
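The "container" idea can be illustrated with a minimal sketch, assuming only the general shape of a DOI name: a prefix beginning with "10.", a slash, and an opaque suffix. The function name and the second example are hypothetical; "10.123/456" is the sample handle used later in this deck. This is not a full Z39.84 validator.

```python
def split_doi_name(doi_name: str) -> tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = doi_name.partition("/")
    # Assumption: a prefix looks like "10.<something>"; the suffix must be non-empty.
    if not sep or not prefix.startswith("10.") or len(prefix) <= 3 or not suffix:
        raise ValueError(f"not a DOI name: {doi_name!r}")
    return prefix, suffix


# The suffix is opaque: it may contain further slashes or embed another identifier.
print(split_doi_name("10.123/456"))
print(split_doi_name("10.123/ISBN/example"))  # hypothetical embedded identifier
```

Nothing is inferred from the suffix, which matches the point that a DOI name is not "intelligent" as a label.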
36
[Diagram labels: DOI; URL; Assigner; Content; DOI directory]
37
Resolve
- From DOI name to data
- Initially: location (URL) – persistence
- May be to multiple data:
  - Multiple locations
  - Metadata
  - Services
  - Extensible
- Uses the Handle System
  - Implementing URI/URN concept
  - Advantages of granularity, scalability, administrative delegation, security, etc.
- Resolution allows a DOI name to link to any and multiple pieces of current data
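Why resolution gives persistence can be sketched with a toy directory. This is an in-memory stand-in, not the Handle System; the class, its methods, and the example.org URLs are all hypothetical.

```python
class DoiDirectory:
    """Toy stand-in for a resolution directory: DOI name -> current URL."""

    def __init__(self) -> None:
        self._records: dict[str, str] = {}

    def register(self, doi_name: str, url: str) -> None:
        """Record the first location for a newly assigned name."""
        self._records[doi_name] = url

    def update_location(self, doi_name: str, url: str) -> None:
        """Change where an existing name points; the name itself never changes."""
        if doi_name not in self._records:
            raise KeyError(doi_name)
        self._records[doi_name] = url

    def resolve(self, doi_name: str) -> str:
        """Return the current location for the name."""
        return self._records[doi_name]


directory = DoiDirectory()
directory.register("10.123/456", "http://old.example.org/article")
before = directory.resolve("10.123/456")

# The content moves: only the directory entry is edited, not the citing documents.
directory.update_location("10.123/456", "http://new.example.org/article")
after = directory.resolve("10.123/456")
print(before, "->", after)
```

Anything that cites "10.123/456" keeps working after the move, because the citation holds the name rather than the location.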
38
Why do we need "metadata"?
- Having an identifier alone doesn't help: we want to know "what is this thing that's identified?"
  - we want to know precisely
  - precisely enough for automation
- There's lots of metadata already, which should be (re-)used
- People use different schemes: need to map from one scheme to another (e.g. does "owner" in scheme A mean "owner" in scheme B?)
39
DOI System data model
- The underlying model of how data within the DOI System relates to other data
- Two components: Data Dictionary + DOI Application Profile Framework
- Data Dictionary: provides a tool for precise description of an entity through metadata (and mapping to other schemes)
- DOI Application Profile Framework: provides a means of relating entities
  - grouping entities and expressing relationships
  - a mechanism for grouping DOI names with similar properties
- DOIs, APs, and DOI System services built using these:
  - have many-to-many relationships: expressed through multiple resolution (handle)
  - may have precise descriptions: expressed through metadata in the Data Dictionary
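The role of a data dictionary in mapping between schemes can be sketched as follows. The scheme names, element names, and "dd:" terms are all hypothetical and are not entries from the indecs Data Dictionary; the sketch only shows the idea of mapping each scheme's elements to shared terms and comparing those.

```python
# Hypothetical mappings from (scheme, element) to a shared dictionary term.
DICTIONARY_TERM = {
    ("schemeA", "owner"): "dd:RightsHolder",
    ("schemeB", "owner"): "dd:Possessor",
    ("schemeB", "rights_owner"): "dd:RightsHolder",
}


def same_meaning(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """True only if both elements map to the same dictionary term."""
    term_a = DICTIONARY_TERM.get(a)
    return term_a is not None and term_a == DICTIONARY_TERM.get(b)


def translate(record: dict[str, str], source: str, target: str) -> dict[str, str]:
    """Re-key a record from the source scheme to the target scheme via shared terms."""
    target_by_term = {
        term: name
        for (scheme, name), term in DICTIONARY_TERM.items()
        if scheme == target
    }
    translated = {}
    for element, value in record.items():
        term = DICTIONARY_TERM.get((source, element))
        if term in target_by_term:  # unmapped elements are dropped, not guessed
            translated[target_by_term[term]] = value
    return translated


# "owner" in scheme A does not mean "owner" in scheme B in this made-up example.
print(same_meaning(("schemeA", "owner"), ("schemeB", "owner")))
print(translate({"owner": "Acme Press"}, "schemeA", "schemeB"))
```

Mapping through a shared term, rather than matching element names directly, is what answers the "does owner mean owner?" question without human intervention.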
40
Application Profile (AP) Framework
- Entities are identified by DOI names
- The properties of groups of DOI names are defined as APs
- APs have one or more Services
- Services have definitions
[Diagram: DOI names 965, 876, 456, 453, 784, 369, 908, grouped under an Application Profile with a Service Instance and a Service Definition]
41
Application Profile (AP) Framework
- Entities are identified by DOI names
- The properties of groups of DOI names are defined as APs
- APs have one or more Services
- Services have definitions
- New APs and services may be created or made available
- One change to an AP affects all DOI names within that AP
[Diagram: DOI names 965, 876, 456, 453, 784, 369, 908, grouped under several Application Profiles, each with Service Instances and Service Definitions]
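The grouping described on this slide can be sketched as a small data model. The class names, the profile and service names, and the "10.123/" prefix are hypothetical; only the suffix numbers come from the diagram, and which name belongs to which profile is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A service instance together with a short statement of its definition."""
    name: str
    definition: str


@dataclass
class ApplicationProfile:
    """A named group of DOI names that share properties and services."""
    name: str
    doi_names: set[str] = field(default_factory=set)
    services: list[Service] = field(default_factory=list)


def services_for(doi_name: str, profiles: list[ApplicationProfile]) -> list[str]:
    """Sorted names of all services reachable from a DOI name via its profiles."""
    return sorted(
        service.name
        for profile in profiles
        if doi_name in profile.doi_names
        for service in profile.services
    )


articles = ApplicationProfile(
    "articles",
    doi_names={"10.123/965", "10.123/876", "10.123/453"},
    services=[Service("locate", "return the current URL")],
)
datasets = ApplicationProfile(
    "datasets",
    doi_names={"10.123/453", "10.123/784"},
    services=[Service("describe", "return descriptive metadata")],
)
profiles = [articles, datasets]

# A DOI name may belong to more than one AP (many-to-many).
print(services_for("10.123/453", profiles))

# One change to an AP affects every DOI name grouped under it.
articles.services.append(Service("cite", "return a formatted citation"))
print(services_for("10.123/965", profiles))
```

Attaching services to the profile rather than to each name is what makes a single change propagate to the whole group.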
42
Metadata tools
- A data dictionary to define
- A grouping mechanism to relate
- Necessary for interoperability: "Enabling information that originates in one context to be used in another in ways that are as highly automated as possible"
- Able to use existing metadata
- Mapped using a standard dictionary
- Can describe any entity at any level of granularity
- <indecs> Data Dictionary + DOI AP framework
43
DOI System policies
- Allow any business model for practical implementations
- Implementation through IDF
  - Governance and agreed scope, policy, "rules of the road", central tools (dictionary, resolution mechanism)
  - Cost-recovery (self-sustaining)
- Registration agencies ("franchise")
  - Each can develop own applications
  - Use in "own brand" ways appropriate for their community
  - Examples: CrossRef, OPOCE
44
Costs
- For an everyday user:
  - Free: any DOI name may be resolved by anyone
  - No obligations
- For an assigner:
  - Must work through a Registration Agency
  - Cost depends on application: DOI registration is bundled in
  - e.g. CrossRef, crosslinking of citations: for a publisher, from $275 per year (2008)
- For a Registration Agency:
  - Must be a full RA member of the International DOI Foundation
  - Fees based on volume
- Developing, managing, implementing, standardising, etc.:
  - Paid for by the International DOI Foundation (open to anyone)
45
More than an identifier…
- Identify: DOI name syntax can include any existing identifier, formal or informal, of any entity
  - e.g. / / /ISBN /OPOCE_presentation / CENDI-DOI
- Describe: DOI name metadata can be of any type, standard or proprietary
  - e.g. OnixForBooks, OnixForSerials, IEEE/LOM, MARC, Dublin Core, proprietary scheme
  - but if you want to interoperate with anyone else in the DOI System network, you map to the <indecs> Data Dictionary (iDD)
- Resolve: Handle resolution technology allows you to access any kind of Service associated with your DOI name
  - e.g. a package of services is defined for an Application Profile
  - These services depend on metadata
46
Standardisation of DOI System (ISO TC46/SC9)
- DOI System as ISO TC46 standard: entire DOI System
- Refer to component tools (Handle System, Data Dictionary, etc.) as informative references
- Aim to separate existing "DOI Handbook" into formal standard (ISO) and operating manual (IDF)
- Show that the DOI System supports (does not compete with) other TC46/SC9 "identifiers": offers the option of adding Internet actionability and interoperability in a standard way
- Draft now finalised
- Supporting materials (response to comments, FAQ) available
- 2008 standard?
- Recent overview article is based on ISO draft: DOI Handbook to be revised
47
DOI System applications
- The main use of the DOI System is not simply to register an identifier
- It is to make use of the identifier in a SERVICE offered to users
- E.g. CrossRef provides a bibliographic citation pre- and post-production look-up service across hundreds of publishers
  - It uses DOI names as one part of its service
  - It has become a de facto requirement for academic publishing
48
Application issues
- Multiple services may exist for an identifier
  - Don't assume only monopoly services
  - One service may be definitive; some may be better than others
- Multiple identifiers
  - Need to distinguish abstractions, representations, compound objects
  - Relation of DOI names to other identifiers (Bookland DOIs etc.)
- Interoperability becomes more important as an economic feature when there are multiple services or multiple uses, which there will be eventually
  - Don't design only for today
- Common frameworks for naming and meaning (to do all this) become important when services cut across silos; across media; from different sources; etc.
  - Indecs-based approach (like ONIX etc.)
- Multiple resolution: returns multiple results in response to a request (e.g. a choice, an automated service)
  - Need some way of grouping and ordering those results, e.g. Handle value typing
49
DOI names work with existing identifier schemes
- General case: ISO standardisation of DOI System
  - "A DOI name is not intended as a replacement for other identifier schemes, but when used with them may enhance the identification functionality provided by those systems with additional functionality…"
- Incorporate the other identifier into the DOI name syntax, and/or
- Record the other identifier in the DOI name metadata
- Each scheme retains its autonomy but works together
- ISBN and ISSN have already agreed options
50
DOIs can be used to define and declare
- What does this DOI identify (precisely)?
- For interoperable uses: use in services outside the control of the assigner
- Metadata scheme already worked out
  - Kernel plus Application Profiles (extensions)
- Standard ways of declaring simple metadata, e.g. for OpenURL uses
- Interoperability is the key aspect which will tip requirements
51
DOI names work with existing identifier systems
Representations:
- URL:
- MPEG-21: DII (Digital Item Identifier)
- URI schemes: info URI; URI; URN
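The slide's own examples did not survive, so the following is only a sketch of how one DOI name could be rendered in two of these representations. The proxy address "https://doi.org/" and the "info:doi/" namespace are assumptions not stated on the slide, and the function names are hypothetical.

```python
from urllib.parse import quote


def as_proxy_url(doi_name: str, proxy: str = "https://doi.org/") -> str:
    """Render a DOI name as an HTTP URL via a resolver proxy (proxy is an assumption)."""
    # Keep '/' readable; percent-encode characters that are unsafe in a URL path.
    return proxy + quote(doi_name, safe="/")


def as_info_uri(doi_name: str) -> str:
    """Render a DOI name as an "info" URI (namespace assumed to be info:doi/)."""
    return "info:doi/" + quote(doi_name, safe="/")


name = "10.123/456"
print(as_proxy_url(name))
print(as_info_uri(name))
```

The DOI name itself is unchanged in each form; only the wrapper differs, which is the sense in which these are representations rather than new identifiers.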
52
DOI names to define the entity
Suppose I have here a pdf version of Defoe's "Robinson Crusoe" issued by Norton. I find an identifier – is it of:
- All works by Daniel Defoe?
- The work "Robinson Crusoe"?
- The Norton edition of "Robinson Crusoe"?
- The pdf version of the Norton edition of…?
- The pdf version of… held on this server…?
Most digital objects of interest have compound form, simultaneously embodying several referents. Multiple identifiers may be necessary (like music CDs).
Identifiers assigned in one context may be encountered, and may be re-used, in another place or time, without consulting the assigner. You can't assume that your assumptions made on assignment will be known to someone else.
53
DOI names to express relationships
- DOI name of one item may be related to DOI name of another
- Through multiple resolution, metadata, Application Profiles…
- Example: a DOI name of a work could resolve to several available formats, languages, etc.
[Diagram: Article, DOI name 12345; Chinese version, DOI name 56789]
54
DOI names for "non-traditional" entities
Examples:
- Scientific data: TIB (Registration Agency) is an example
- Biological nomenclature: disambiguation and extension of the current taxonomy models: Names-4-Life (IDF member)
- Clinical trials: identifying specific trials and sub-sets of items; UK project currently using DOI names on a pilot basis
55
DOI names for "new" traditional entities
- Example: book fragments (tables, figures, chapters, exercises); interactive e-books
  - Some may use other identifiers which could become DOI names
  - Some may be in scope but not yet widely used (e.g. ISBNs for chapters)
  - Others may require new DOI names
  - Book Industry Study Group (BISG) working on this
- Others: Nature "precedings"; Scirus "topic pages"; some blogs?
56
DOI name multiple resolution
- Significant benefit of Handle System: resolve from one DOI name to several different results
  - One-to-many linkage
  - Resolution request would give: all results, or all results of one type
- Need a framework to build these applications on: group similar uses so that the results are predictable and can be used across applications
  - DOI Application Profile framework
  - Handle System "data value typing"
- CrossRef to use for e.g. location-dependent resolution
- Other business cases? Could express relationships (ISTC to ISBNs etc.)
57
Handles resolve to typed data
Handle        Data type   Index   Handle data
10.123/456    URL         1
              URL         2
              DLS         9       acme/repository
              HS_ADMIN    100     acme.admin/jsmith
              XYZ         12
Rules for data type construction:
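A handle record of this shape, and a request for "all results, or all results of one type", can be sketched in a few lines. This is an in-memory illustration rather than a Handle System client; the example.org URLs are hypothetical stand-ins for the URL values that are missing from the table, and the XYZ row is left out because its data is not shown.

```python
from typing import NamedTuple, Optional


class HandleValue(NamedTuple):
    index: int
    type: str
    data: str


# One handle, several typed values (URL data is hypothetical).
RECORDS = {
    "10.123/456": [
        HandleValue(1, "URL", "http://example.org/copy-a"),
        HandleValue(2, "URL", "http://example.org/copy-b"),
        HandleValue(9, "DLS", "acme/repository"),
        HandleValue(100, "HS_ADMIN", "acme.admin/jsmith"),
    ]
}


def resolve(handle: str, value_type: Optional[str] = None) -> list[HandleValue]:
    """Return all values for a handle, or only those of one type."""
    values = RECORDS.get(handle, [])
    if value_type is None:
        return list(values)
    return [value for value in values if value.type == value_type]


print(len(resolve("10.123/456")))
print([value.data for value in resolve("10.123/456", "URL")])
```

Typing the values is what lets a client ask for just the locations, or just the administrative data, without parsing everything the handle holds.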
58
DOI name contextual resolution
- Resolve DOI name with some additional information to give results depending on context
- OpenURL: resolve to same content at different location (by user)
- Full contextual resolution: Handle System can do this (DVIA)
  - Resolve to different content (by user)
  - Of interest re licensing etc. but not yet part of DOI System
- Steps in evolution:
  - URLs: not useful for long term management
  - naming and resolution: "get me the right thing"
  - contextual resolution: "get me the thing that is right for me" (e.g. "that I have access rights for")
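The simplest case above, the same content at a different location depending on the user, can be sketched as a lookup that takes a context argument. This does not show how OpenURL or the Handle System implement it; the region keys, the function name, and the example.org URLs are hypothetical.

```python
from typing import Optional

# Hypothetical per-region copies of the same content, plus a default location.
LOCATIONS = {
    "10.123/456": {
        "default": "http://example.org/item",
        "eu": "http://eu.example.org/item",
        "asia": "http://asia.example.org/item",
    }
}


def resolve_in_context(doi_name: str, region: Optional[str] = None) -> str:
    """Pick the location that suits the requester, falling back to the default."""
    options = LOCATIONS[doi_name]
    if region is None:
        return options["default"]
    return options.get(region, options["default"])


print(resolve_in_context("10.123/456"))
print(resolve_in_context("10.123/456", region="eu"))
```

The same pattern would extend to "the copy I have access rights for" by passing licence information as context instead of a region, which is the step the slide describes as not yet part of the DOI System.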
59
DOI name tools
- Several DOI name tools have been developed, from a variety of sources
- Such as plug-ins, e.g. Adobe Acrobat plug-in
- At different stages of development or use
62
Summary
- Origins of the DOI System
- Current position of DOI activities
- Persistence
- Actionable identification
- Interoperability
- System components
- Standardisation
- DOI System applications