COINFO Beijing Oct 06: DOI and applications
DOI SYSTEM AND ITS APPLICATIONS
Co-operation and promotion of Information Resources in Science and Technology, Beijing Oct
Norman Paskin, International DOI Foundation
Outline
Naming (identifying) resources on the internet
  The problem
  Handles
  DOIs
Meaning of resources on the internet
  Mapping meanings through metadata
DOI System
  Current position of the DOI system
1. Naming
Assigning an identifier to a referent
Identifier: unique persistent alphanumeric string (“number”, “name”, “lexical token”) specifying a referent.
Unique: an identifier specifies one and only one referent (but a referent may have more than one identifier), so one referent may map to many identifiers, never the reverse.
Persistent: once assigned, an identifier does not change its referent.
Resolution: the process by which an identifier is input to a network service, which returns its associated referent and/or descriptive information about it (metadata).
Referent: the object which is identified by the identifier, whether or not resolution returns that object.
Object: any entity within the scope of the identifier system. It may be abstract, physical or digital, since all these forms of entity are relevant to content management (e.g. creations, resources, agreements, people, organisations).
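The uniqueness and persistence rules above can be illustrated with a small in-memory registry. This is only a sketch: the class name, method names and the example identifiers are hypothetical and are not part of the Handle System or the DOI system.

```python
class IdentifierRegistry:
    """Toy registry (hypothetical): each identifier names exactly one referent."""

    def __init__(self) -> None:
        # identifier -> referent; many identifiers may share one referent
        self._referent_of: dict[str, str] = {}

    def assign(self, identifier: str, referent: str) -> None:
        """Assign an identifier; refuse to re-point it at a different referent."""
        current = self._referent_of.get(identifier)
        if current is not None and current != referent:
            # Persistence: once assigned, an identifier keeps its referent.
            raise ValueError(f"{identifier!r} is already assigned to {current!r}")
        self._referent_of[identifier] = referent

    def resolve(self, identifier: str) -> str:
        """Return the referent named by the identifier."""
        return self._referent_of[identifier]

    def identifiers_for(self, referent: str) -> list[str]:
        """A referent may have more than one identifier."""
        return sorted(i for i, r in self._referent_of.items() if r == referent)


# Hypothetical identifiers and referent, for illustration only.
registry = IdentifierRegistry()
registry.assign("id-A", "Robinson Crusoe (the work)")
registry.assign("id-B", "Robinson Crusoe (the work)")
```

Here two identifiers name the same referent, which is allowed, while an attempt to assign "id-A" to a different referent raises an error.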
Naming
First class naming: Digital Object Architecture
  “Digital information needs to be a first class citizen in the networked environment” (Kahn/Wilensky 1995)
  First class = one that has an identity independent of any other item
Handle System
  Part of the Digital Object Architecture: a system for persistent naming of digital objects and other resources on the Internet, and for efficiently resolving those names to data
DOI (Digital Object Identifier) system
  One application of the Handle System, which adds to it additional features: social and technical infrastructure, policies, metadata management
Internet
  The global information system that is logically linked by a globally unique address space, communicates using TCP/IP (or successors), and provides high level services layered on these
  Not DNS; not the Web (includes P2P, VoIP, etc.)
  DNS: the Domain Name System maps domain names (computer hostnames) to IP addresses
What is being named? Three key problems
Granularity: the extent to which a collection of information has been subdivided for purposes of identification (e.g. a collection; a book; tables and figures)
  Functional granularity: it should be possible to identify an entity whenever it needs to be distinguished
Precisely what is being named?
  The work “Robinson Crusoe”?
  The Norton edition of “Robinson Crusoe”?
  The pdf version of the Norton edition of…. ?
  The pdf version of…held on this server…?
  Most digital objects of interest have compound form, simultaneously embodying several referents
  Resolution of an identifier may give the referent, or only metadata, or a “manifestation”
Resolution of an identifier
  Persistence: “get me the right thing”
  Contextual resolution: “get me the thing that is right for me”
    Appropriate copy resolution (e.g. OpenURL context-sensitive linking): same content in different contexts
    Full contextual resolution (e.g. DVIA): different content in different contexts
Resolution
DNS is the current basis of resolution of web-based identifiers
URL: not a first class name but an attribute: a location of a file on the WWW
  The specification allows addressing by full path to host (IP address); rarely used.
  If the content of the file is moved, the URL link won't find it ("404 not found", or manual redirection, or automated redirection which may not persist).
  If the content, but not the location, of the file is changed, a user may not know this.
URN: naming convention for the content of files
  Specification independent of technologies, but DNS is the only present technique
  No widely standardised ways of using this: URNs can't be typed into browsers except in certain special circumstances.
URI: collective name for URN and URL schemes
  Not the basis of other non-web identifiers, e.g. Skype names
DNS is not a good general-purpose name system
  Does not meet the requirements of first class name + appropriate granularity
  Not first class names: all URIs at one location ultimately have to be managed by the same domain name owner, which makes URLs brittle for any piece of content which could possibly change owners
  No granularity of administration per name by anyone other than a network administrator
  URLs are grouped by domain name and then by some hierarchical structure, originally based on file trees, now possibly unconnected from that but still a hierarchy
  Problems of security, updating and internationalisation
  Potential scalability in the face of new technologies
What is the problem?
Managing information in the Net over very long periods of time: centuries or more
Dealing with very large amounts of information in the Net over time
Information, location(s) and systems may change dramatically over time
Respecting and protecting rights, interests and value
Allow for:
  arbitrary types of information systems
  dynamic formatting and data typing
  interoperability between multiple different information systems
  metadata schema to be identified and typed
A solution to this problem was put forward as the Digital Object Architecture (Kahn/Wilensky 1995+) and has been successfully developed and deployed
Handle System: resolution of unique identifiers
  Maps an identifier into “state information” about the Digital Object
  Identifiers are known as “Handles”
  Format is “prefix/suffix” (e.g. /1234)
  Prefix is unique to a naming authority
  Suffix can be any string of bits assigned by that authority
  The Handle System is a general purpose resolution system
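As a minimal sketch of the “prefix/suffix” format, the function below splits a handle at the first slash, on the assumption that the prefix itself contains no slash while the suffix may contain anything. The function name is hypothetical, and the example handle 10.123/456 is borrowed from the typed-data slide.

```python
def split_handle(handle: str) -> tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first '/'.

    Assumption for this sketch: the prefix (naming authority) contains no
    slash, while the suffix may be any string, including further slashes.
    """
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not in prefix/suffix form: {handle!r}")
    return prefix, suffix


prefix, suffix = split_handle("10.123/456")
```

Splitting only at the first slash keeps the suffix opaque, which matches the statement that the suffix can be any string assigned by the naming authority.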
Handles resolve to typed data
Handle        Data type   Index   Handle data
10.123/456    URL         1
              URL         2
              DLS         9       acme/repository
              HS_ADMIN    100     acme.admin/jsmith
              XYZ         12
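The idea that one handle resolves to several typed values can be sketched as a small data structure. The class and function names are hypothetical, the two URL values are placeholders (the slide's own values are not preserved in this transcript), and the XYZ entry is omitted because its data is missing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    """One typed value stored under a handle (illustrative model only)."""
    index: int
    type: str
    data: str


# Handle -> typed values. The example.org URLs are placeholders, not the
# values from the original slide.
RECORDS: dict[str, list[HandleValue]] = {
    "10.123/456": [
        HandleValue(1, "URL", "https://example.org/copy-a"),
        HandleValue(2, "URL", "https://example.org/copy-b"),
        HandleValue(9, "DLS", "acme/repository"),
        HandleValue(100, "HS_ADMIN", "acme.admin/jsmith"),
    ]
}


def values_of_type(handle: str, value_type: str) -> list[HandleValue]:
    """Return every value of the requested type held under the handle."""
    return [v for v in RECORDS.get(handle, []) if v.type == value_type]


url_values = values_of_type("10.123/456", "URL")
```

A client that only wants a location would ask for the URL values, while an administrative tool would ask for HS_ADMIN; the handle itself stays the same.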
Handle System
Part of the Digital Object Architecture (Bob Kahn)
Basic resolution system for the Internet: identifies objects, not servers
Optimized for speed, reliability, scaling (compared to DNS)
Open, well-defined protocol and data model (RFC 3650, 3651, 3652)
  Free protocol; service at cost (non-profit); freely available to be used as the engine underneath other named identifiers
Separation of control of the handle from who runs the servers
  Distributed administration, granularity at the handle level
Any Unicode character set
  China: CNNIC (.CN registrar) has integrated DNS and handle
All transactions can be secure and certified
  Own PKI as an option
Not all data public: individual values within a handle can be private
No semantics in the identifier
Logically centralized, physically distributed and highly scalable
Does not need DNS, but can work with DNS: deployed via tools, e.g. HTTP proxies, client plug-ins, server software, etc.
Handle System use
Provides infrastructure for application domains, e.g. digital libraries & publishing, network management, id management ...
Library of Congress
DTIC (Defense Technical Information Center)
IDF (International DOI Foundation)
  CrossRef (scholarly journal consortium)
  Office of Publications of the European Community
  CAL (Copyright Agency Ltd, Australia)
  MEDRA (Multilingual European DOI Registration Agency)
  Nielsen BookData (bibliographic data, ISBN)
  R.R. Bowker (bibliographic data, ISBN)
  German National Library of Science and Technology
  etc.
NTIS (National Technical Information Service)
D-Space (MIT + HP)
ADL (DoD Advanced Distributed Learning initiative)
Several Digital Library projects (e.g. ARROW)
In development: Globus Alliance (for GRID computing)
Handle System use
Assigned Prefixes: DOI; DSpace; Other apps 406
Handles: DOI M; Other: additional millions (total per prefix known only to prefix manager; e.g. LANL adding 600M but privately)
Global Handle System
  Core: three service sites (added locations being considered)
  c. 50 million direct resolutions per month
  c. 50 million proxy server resolutions
The DOI System
DOI (Digital Object Identifier) system:
  Initially developed (1998) from the publishing industry but now wider
  Currently being standardised in ISO (TC46/SC9), the home of ISBN etc. “content identifiers”
  One application of the Handle System
    Adds to it additional features: social and technical infrastructure, policies, metadata management
[Diagram] doi> components: Naming scheme and resolution; Policies; Data Model for declaring meaning
Naming scheme and resolution
The Handle System
An identifier “container”, e.g. /NP5678 /ISBN / ISO-DOI
Resolve from DOI to data
  Initially resolve to location (URL): persistence
  May be to multiple data:
    Multiple locations
    Metadata
    Services
  Extensible
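One common way to perform the simple "resolve to a location" step is through an HTTP proxy. The sketch below only builds the proxy URL for a DOI and does not contact the network. It assumes the public proxy at https://doi.org/ (not named on the slide) and percent-encodes characters such as `#` that would otherwise be misread in a URL; the function name is hypothetical.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy; the slide does not name a proxy address.
PROXY = "https://doi.org/"


def doi_to_proxy_url(doi: str) -> str:
    """Build the proxy URL that redirects to the DOI's current location."""
    prefix, separator, suffix = doi.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not in prefix/suffix form: {doi!r}")
    # Keep '/' readable; encode characters that are unsafe inside a URL path.
    return PROXY + quote(doi, safe="/")


url = doi_to_proxy_url("10.123/456")
```

Because the link points at the proxy and not at the content's current host, the content can move without the published link changing, which is the persistence property the slide refers to.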
DOI policies
Implementation through the International DOI Foundation
  Not-for-profit body: federation of appointed agencies
  Governance and agreed scope, policy, “rules of the road”
  Technical infrastructure: resolution mechanism, proxy servers, mirrors, back-up, central dictionary
  Social infrastructure: persistence commitments, fall-back procedures, cost-recovery (self-sustaining), shared use of IDF tools etc.
Registration agencies
  Each can develop own applications
  Any business model
  Use in “own brand” ways appropriate for their community
Data Model for declaring meaning
DOI Data Model = Metadata tools:
  a data dictionary to define
  a grouping mechanism to relate
Necessary for interoperability
  Able to use existing metadata
  Mapped using a standard dictionary
  Can describe any entity at any level of granularity
See “DOI and data dictionaries”
2. Meaning
Assigning metadata to a referent, to enable semantic interoperability: “say what the referent is”
Resolution of an identifier may give the referent, or only metadata, or a “manifestation”
Semantic:
  Do two identifiers from different schemes actually denote the same referent?
  If A says “owner” and B says “owner”, are they referring to the same thing?
  If A says “released” and B says “disseminated”, do they mean different things?
Interoperability: the ability for identifiers to be used in services outside the direct control of the issuing assigner
  Identifiers assigned in one context may be encountered, and may be re-used, in another place or time, without consulting the assigner.
  You can’t assume that your assumptions made on assignment will be known to someone else.
Persistence = interoperability with the future
Tools to ensure meaning
Basis: “Interoperability of Data in E-Commerce Systems” (indecs)
  Focus: generic intellectual property and how to make data about it interoperable
  Who: EC + groups from the content, author, creator, library, publisher and rights communities
  What: pioneered a model of event-based metadata as a solution for integrating management of rights
  Led to: a structured ontology (data dictionary); tools for mapping terms precisely; inference tools etc.: contextual ontology architecture
[Diagram] Metadata scheme (e.g. ONIX) and Metadata scheme (e.g. LOM), linked by an agreed term-by-term mapping or “Crosswalk”
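A term-by-term crosswalk can be sketched as a lookup table from one scheme's terms to another's. The terms below are hypothetical and are not real ONIX or LOM element names; "released" and "disseminated" echo the example question from the Meaning slide, on the assumption that the two communities have agreed the terms are equivalent.

```python
# Hypothetical agreed mapping from scheme A terms to scheme B terms.
CROSSWALK_A_TO_B: dict[str, str] = {
    "released": "disseminated",
    "title": "name",
}


def translate(record: dict[str, str],
              crosswalk: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Rename each field via the crosswalk; report terms with no agreed mapping."""
    translated: dict[str, str] = {}
    unmapped: list[str] = []
    for term, value in record.items():
        if term in crosswalk:
            translated[crosswalk[term]] = value
        else:
            # No agreed equivalent: flag it instead of guessing a meaning.
            unmapped.append(term)
    return translated, unmapped


record_a = {"title": "Robinson Crusoe", "released": "1719", "owner": "unknown"}
record_b, missing = translate(record_a, CROSSWALK_A_TO_B)
```

Reporting unmapped terms instead of silently copying them reflects the slide's concern: a shared word such as "owner" is not evidence of a shared meaning until a mapping has been agreed.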
Tools to ensure meaning
“Contextual Ontology” approach is used in:
  ISO MPEG-21 Rights Data Dictionary
  DOI Data Dictionary
  DDEX: digital data exchange, music industry
  ONIX: book industry (+) messaging schemas
  Rightscom’s OntologyX: licensee of output, plus own work on tools
  Digital Library Federation: communication of licence terms (ERMI: ONIX for licensing terms)
  ACAP: Content Access
  etc.
3. DOI System in application
DOI System solves the problems of:
  Naming: prerequisite for management of digital information entities
  Meaning: prerequisite for enabling digital information entities to interact
And also: building a practical system to do this
[Screenshot] Recent news; link to archive news; e-mail news alert service
Two consistent aims since 1998
(1) DOI: development in three tracks
[Diagram] Labels: Other efforts, standards, etc.; Metadata; Single redirection (persistent identifier); Multiple resolution; Initial implementation; Full implementation; Activity tracking
A continuing development activity
(2) Creation of an organisation
[Diagram] Labels: development spend; cost-reduction; IDF; M; Operating Federation; RA; C
Cumulative DOI Assigned
[Chart] Currently 7 RAs, but one dominates
Cumulative DOI Prefixes: by RA per year
[Chart] But prefix development improving
Increase in RA role
IDF supported by 24 member organisations:
  general members (not RAs)
  operational members (RA = Registration Agency)
[Table] Columns: Year; Number of RAs (end year); % of revenues RAs. Surviving entries: <10; 2006 Forecast
Current strategy
Focus on enabling current RAs to generate more DOIs
New RAs in new areas
Social infrastructure development (RA policies)
Business model: IDF RA C
  Incentive scheme: large discounts per DOI for large numbers of registrations, e.g. 25% -> 90%+
  IDF has no role in this
Implementation of strategy
RAs focus on building applications in their existing sectors
  viability of business models
  lower costs per DOI (for volume)
IDF focus on tools for RAs:
  Resolution, e.g. Acrobat plug-in
  Multiple resolution: DOI-AP framework
  Semantic interoperability: Data Dictionary
  Contextual resolution: OpenURL, DVIA
ISO standardisation
DOI system as an ISO standard
Within ISO TC46 SC9
  ISO/TC 46 = "Information and documentation"
  Subcommittee 9 = "Presentation, identification and description of documents": ISBN, ISSN, ISMN, ISRC, ISAN, V-ISAN, ISWC, ISTC
Aim is to codify the system by reference to components
IDF becomes ISO appointed authority for the DOI standard
ISO standard is the basis of operating procedures (Handbook)
Sept 06: Working Group reviews
Nov 06: Committee Draft
Likely completion 2007 or 2008
Interoperability
“the ability of independent systems to exchange meaningful information and initiate actions from each other, in order to operate together to mutual benefit...allowing for the possibility of their extensible use in services outside the direct control of the issuing assigner...”
Practical consequences:
  Metadata interoperability
  Standard mechanisms for the expression of relationships between the referents of different standard identifiers
  Creation of common services: shared syntax or physical interface for the expression of requests and responses for provision of services and/or data (metadata look-up services, identifier discovery services)
See: “Identifier Interoperability: a report on two recent ISO activities”
DOI Data Model: Application Profiles
Referents are identified by DOIs
The properties of groups of DOIs are defined as APs
APs have one or more Services
Services have definitions
[Diagram] Boxes labelled Entity, Application Profile, Service Instance and Service Definition; numeric labels 965, 876, 456, 453, 784, 369, 908
New APs and services may be created or made available
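The relationships listed on this slide (DOIs grouped into Application Profiles, profiles having services, services having definitions) can be sketched as plain data classes. All class, field and example names here are hypothetical; this is an illustration of the relationships, not the IDF's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceDefinition:
    """What a kind of service does (hypothetical fields)."""
    name: str
    description: str


@dataclass(frozen=True)
class ServiceInstance:
    """A concrete offering of a defined service at some endpoint."""
    definition: ServiceDefinition
    endpoint: str


@dataclass
class ApplicationProfile:
    """Groups DOIs that share properties and the services available to them."""
    name: str
    services: list[ServiceInstance] = field(default_factory=list)
    dois: set[str] = field(default_factory=set)


def services_for(doi: str, profiles: list[ApplicationProfile]) -> list[str]:
    """Names of all services available to a DOI through its profiles."""
    return sorted({s.definition.name
                   for p in profiles if doi in p.dois
                   for s in p.services})


# Hypothetical example data.
lookup = ServiceDefinition("metadata-lookup", "Return descriptive metadata")
locate = ServiceDefinition("locate", "Redirect to a current location")
articles = ApplicationProfile(
    "journal-articles",
    services=[ServiceInstance(lookup, "https://example.org/meta"),
              ServiceInstance(locate, "https://example.org/go")],
    dois={"10.123/456"},
)
datasets = ApplicationProfile(
    "datasets",
    services=[ServiceInstance(locate, "https://example.org/data")],
    dois={"10.123/789"},
)
available = services_for("10.123/456", [articles, datasets])
```

Adding a new profile or attaching a new service to an existing profile changes what is available to every DOI in that group without touching the DOIs themselves.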