The Digital Object Architecture. A presentation at Louisiana State University, Baton Rouge, Louisiana, August 26, 2005. Robert E. Kahn, Corporation for National Research Initiatives.

Presentation transcript:

The Digital Object Architecture A presentation at Louisiana State University Baton Rouge, Louisiana August 26, 2005 Robert E. Kahn Corporation for National Research Initiatives Reston, Virginia

Selected Major Network Issues
- How to get affordable broadband access to homes, businesses, government, etc.
- How to add more dimensionality to the mobile wireless experience
- How to take advantage of many devices/appliances being on the Internet
- Protecting critical elements (including infrastructure elements such as DNS)
- Stifling spam; detecting and fighting viruses

Selected Major Issues (cont'd)
- Identity Management (w/o certificates)
- Trust in the security mechanisms
- Managing Privacy
- How to enable more widespread sharing of important information on the Net
- Trusting your information to the Net
- Managing your information on the Net over very long periods of time

Infrastructure Development
- What is so hard about it?
  - Making it scalable over platforms, size and time
  - Achieving critical mass
  - Getting buy-in
  - Pleasing many essential participants
  - Displacing prior capabilities
  - Structuring matters to deal with concerns about empire building
- It's a lot easier to create brand new capabilities than to affect existing means of operation

Infrastructure Creation is a Subtractive Process
- Infrastructure reduces a common, shared capability to its basic and essential attributes
- These attributes are not always recognized or understood up front
- Upon further scrutiny, capabilities are usually deleted from a well-conceived architecture over time
- Consensus develops when no more can be removed without disabling the infrastructure

What is the Problem?
- Managing information in the Net over very long periods of time, e.g. centuries or more
- Dealing with very large amounts of information in the Net over time
- When information, its location(s) and even the underlying systems may change dramatically over time
- Respecting and protecting rights, interests and value

A Meta-level Architecture
- Allows for arbitrary types of information systems
- Allows for dynamic formatting and data typing
- Can accommodate interoperability between multiple different information systems
- Allows metadata schema to be identified and typed

Digital Object Architecture: Motivation
- To reformulate the Internet architecture around the notion of uniquely identifiable data structures
- Enabling existing and new types of information to be reliably managed and accessed in the Internet environment over long periods of time
- Providing mechanisms to stimulate innovation, the creation of dynamic new forms of expression and to manifest older forms
- While supporting intellectual property protection and fine-grained access control, and enabling well-formed business practices to emerge

Objective of the Framework
[Diagram labels: Internet objective; Best-effort Packet Delivery; Heterogeneous Networks; Information Systems; Seamless Interoperability; Networks; Organizing Heterogeneous Systems]

Digital Object Architecture: Technical Components
- Digital Objects (DOs)
  - Structured data, independent of the platform on which it was created
  - Consisting of “elements” of the form
  - One of which is its unique, persistent identifier
- Resolution of Unique Identifiers
  - Maps an identifier into “state information” about the DO
  - Handle System is a general purpose resolution system
- Repositories
  - From which DOs may be accessed
  - And into which they may be deposited
- Metadata Registries
  - Repositories that contain general information about DOs
  - Supports multiple metadata schemes
  - Can map queries into unique DO specifications (via handles)

What is a Digital Object
- Defined data structure, machine independent
- Consisting of a set of elements
  - Each of the form
  - One of which is the unique identifier
- Identifiers are known as “Handles”
  - Format is “prefix/suffix”
  - Prefix is unique to a naming authority
  - Suffix can be any string of bits assigned by that authority
- Data structure can be parsed; types can be resolved within the architecture
- Associated properties record and transaction record containing metadata and usage information
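
A minimal sketch of the data structure described above. The element form is not shown in the transcript, so the sketch assumes each element is a (type, value) pair; it also assumes a handle splits at its first "/" and, for convenience, keeps the handle in its own field. The names `Handle`, `parse_handle`, `DigitalObject` and the sample handle are hypothetical, not part of any CNRI software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Handle:
    prefix: str  # unique to a naming authority
    suffix: str  # any string assigned by that authority


def parse_handle(text: str) -> Handle:
    """Split "prefix/suffix" at the first slash; the suffix may itself contain slashes."""
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {text!r}")
    return Handle(prefix, suffix)


@dataclass
class DigitalObject:
    handle: Handle
    # Assumption: elements are (type, value) pairs.
    elements: list[tuple[str, bytes]] = field(default_factory=list)

    def values_of(self, element_type: str) -> list[bytes]:
        """Return the values of every element with the given type."""
        return [value for etype, value in self.elements if etype == element_type]


# Hypothetical handle and content, for illustration only.
do = DigitalObject(parse_handle("2304/memo/2004"))
do.elements.append(("text/plain", b"To be, or not to be"))
print(do.handle.prefix, do.handle.suffix)
```

Splitting at the first slash leaves the suffix entirely to the naming authority, which matches the statement that the suffix can be any string assigned by that authority.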

Interoperability & Federated Repositories
- Create a cohesive interoperable collection of repository-based systems
- Initially, perhaps, around a core set of projects, content, applications and/or organizations as in ADL
- Demonstrate interoperability between different repository collections
- Develop procedures to ensure continued accessibility to key archival information

Repository Notion
[Diagram labels: Any Hardware & Software Configuration; Logical External Interface; RAP (Repository Access Protocol)]

Repository
[Diagram labels: Digital Object Repository; RAP; Client; Disseminate; Deposit]
- Provides distributed Digital Object storage.
- May itself be a Digital Object.
- Provides a dynamic acquisition and execution mechanism for the mobile code that implements the content type operations.
- Exclusively accessed using the Repository Access Protocol (RAP).
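
The two operations named on this slide, deposit and disseminate, can be illustrated with an in-memory stand-in. This is only a sketch: it is not the RAP wire protocol, and the class name, method signatures and sample handle are assumptions made for illustration.

```python
class Repository:
    """In-memory stand-in for a repository, reachable only through deposit and disseminate."""

    def __init__(self) -> None:
        # handle -> {element type -> data}
        self._objects: dict[str, dict[str, bytes]] = {}

    def deposit(self, handle: str, elements: dict[str, bytes]) -> None:
        """Store (or replace) the digital object named by the handle."""
        self._objects[handle] = dict(elements)

    def disseminate(self, handle: str, element_type: str) -> bytes:
        """Return one element of a stored digital object."""
        try:
            return self._objects[handle][element_type]
        except KeyError:
            raise LookupError(f"no {element_type!r} element for {handle!r}") from None


repo = Repository()
repo.deposit("2304/hamlet", {"text/plain": b"Who's there?"})  # hypothetical handle
print(repo.disseminate("2304/hamlet", "text/plain"))
```

Keeping the storage dictionary private mirrors the point that the repository is accessed exclusively through its external interface, whatever hardware and software sit behind it.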

Nesting of Repository Functionality
[Diagram labels: Core; Structure; Content; Aggregation & De-aggregation]
- Core Interface must be present at each level
- Other levels could be separately defined later

Repositories & Digital Objects
[Diagram labels: REPOSITORY; IPv6]
- Each Digital Object has its own unique & persistent ID
- Content Providers want to assign IDs
- Could be upwards of trillions of DOs per Repository
- Objects may be replicated in multiple Repositories

Handle System
- Distributed Identifier Service on the Internet
- First General Purpose Resolution system
- Can be used to locate repositories that contain digital objects given their handles, and more!
  - Other indirect references
  - Public Keys, Authentication information for DOs
- Accommodates interoperability between many different information systems; for example
  - DNS was demonstrated on the Handle System in preparation for Y2K
  - Can support ENUM, RFID, and more

Attributes of the Handle System
- The basic architecture of the Handle System is flat, scalable, and extensible
- Logically central, but physically decentralized
- Supports Local Handle Services, if desired
- Handle resolutions return entire “Handle Records” or portions thereof
- Handle Records are also
  - digital objects
  - signed by the servers
  - doubly certificated by the system
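
Returning an entire Handle Record or only a portion of it can be sketched as a lookup with an optional type filter. The record layout (a list of values with `index`, `type` and `data` fields), the sample handle, the URL and the key are assumptions for illustration; signing and certification are left out.

```python
# Hypothetical handle records: handle -> list of typed values.
HANDLE_RECORDS = {
    "2304/hamlet": [
        {"index": 1, "type": "URL", "data": "https://repo.example/hamlet"},
        {"index": 2, "type": "PUBKEY", "data": "hypothetical-public-key"},
    ],
}


def resolve(handle, types=None):
    """Return the whole record, or only the values whose type is in `types`."""
    record = HANDLE_RECORDS.get(handle)
    if record is None:
        raise LookupError(f"handle not found: {handle!r}")
    if types is None:
        return list(record)
    return [value for value in record if value["type"] in types]


print(resolve("2304/hamlet"))            # entire record
print(resolve("2304/hamlet", {"URL"}))   # a portion of the record
```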

Resolution Mechanism
[Diagram labels: Multiple Sites; Multiple Servers; Handle System; Handle Record]
- System is non-nodal
- Scalable & distributed
- Supports global (and local) resolution
- With backup for reliability, mirroring for efficiency

Type Resolution
- Types are resolvable in the Handle System
- Types may be created dynamically
- Types may be locally named, mapped into bit strings without semantics
- Primary prefix zero “0” is used for system identifiers
- 0.type/ is the system handle for type
- Other handles may cross reference this handle (e.g. for international use)

Handle Format
- Prefix (Authority) / Suffix (Item ID, any format)
- In use, a Handle is an opaque string: /1234
- Other examples of Handles: 2304/general info 2304/ HQ/staff /memo Pub/2004

Direct Access and Proxies
[Diagram labels: Direct Access; One or more Proxy Servers; Indirect Access]

Redirection of Handle Requests
[Diagram labels: Direct Access; One or more Local Handle Services; General Registry of all Naming Authorities; Redirection Information]
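
One way to picture the redirection step: a registry of naming authorities maps a handle's prefix to the local handle service responsible for it, and the request is then answered by that service. This is a sketch under assumptions; the class names, the prefix-to-service mapping and the sample data are hypothetical and do not describe the actual Handle System protocol messages.

```python
class LocalHandleService:
    """Holds the handle records for the prefixes it is responsible for."""

    def __init__(self, records):
        self._records = dict(records)

    def resolve(self, handle):
        return self._records[handle]


class GlobalRegistry:
    """Registry of naming authorities: prefix -> responsible local service."""

    def __init__(self):
        self._services = {}

    def register(self, prefix, service):
        self._services[prefix] = service

    def service_for(self, handle):
        prefix = handle.partition("/")[0]
        return self._services[prefix]  # the "redirection information"


def resolve(registry, handle):
    """Ask the registry where to go, then resolve at that local service."""
    return registry.service_for(handle).resolve(handle)


registry = GlobalRegistry()
registry.register("2304", LocalHandleService({"2304/hamlet": "https://repo.example/hamlet"}))
print(resolve(registry, "2304/hamlet"))
```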

[Diagram labels: Literary; Music; Video; Financial; Grid; ENUM; RFID; “Simple Lookup” (URL, IP addresses); “Unfederated Databases”]

Digital Object Overview
[Diagram labels: Digital Object; Content Type(s); Access Requests; Information; Disseminations; Unique Identifier; Handle]

Digital Object Overview
[Diagram labels: Hamlet; It's a Book; Get Page(2)]

Digital Object Overview
[Diagram labels: Hamlet; Data Element; Content Type Operations]
- Digital objects are uniquely identified in a given identifier space.
- Data elements reference sequences of typed data.
- A Digital Object can have zero or more Content Types to reflect intended uses by its creator.
- Content Type Operations are accessible as DOs.
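
The "Hamlet is a Book, Get Page(2)" example can be sketched as a digital object that lists its content types and dispatches requests to the operations those types define. Everything here is an assumption made for illustration: the operation table, the `request` method and the sample handle are hypothetical, and in the architecture the operations would themselves be digital objects fetched on demand rather than local functions.

```python
# Hypothetical content type table: content type -> {operation name -> function}.
CONTENT_TYPES = {
    "Book": {
        "get_page": lambda pages, number: pages[number - 1],  # pages are 1-based
    },
}


class DigitalObject:
    def __init__(self, handle, content_types, data):
        self.handle = handle
        self.content_types = list(content_types)  # zero or more content types
        self.data = data

    def request(self, operation, *args):
        """Run the first matching operation offered by this object's content types."""
        for content_type in self.content_types:
            operations = CONTENT_TYPES.get(content_type, {})
            if operation in operations:
                return operations[operation](self.data, *args)
        raise LookupError(f"{self.handle!r} does not support {operation!r}")


hamlet = DigitalObject("2304/hamlet", ["Book"], ["Act I, page 1", "Act I, page 2"])
print(hamlet.request("get_page", 2))
```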

The Digital Object Identifier (DOI®)
- Used by the International DOI Foundation (IDF) to reference high-quality materials of publishers (and other owners of IP)
- Major commercial user of the Handle System at present, with approximately 12 million handles
- Usage growing at about 4 million per year
- DNS domain names, by comparison, are relatively flat with perhaps 40% churn per year.

Setting up a Local Handle Service...
- Download the software from
- Follow the instructions in the installation script.
- Send your “site bundle”, containing the IP address of your server and your administrator information, to the Global Handle Registry® (GHR) administrator
- Site is under re-development to accommodate widespread use via automated means
- Experimental Repository software also available online

Managing Rights & Interests
- Not just about copyright
- Terms and Conditions (T&Cs) for use may be contained within each DO; also information about intrinsic value, such as monetary value
- T&Cs are intended to indicate clearly what one can and/or cannot do with a given DO, where such clarity is intended by the owner of the DO
- Not an enforcement means, although it may be used by an enforcement system
- Mobile programs that are Digital Objects may apply such terms to themselves and to any digital objects they contain
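
A small sketch of how terms carried inside a DO might be consulted. The representation (a mapping from an action name to allowed or not allowed) and the function name `permitted` are assumptions; the slide does not specify a format. The function only reports what the owner stated and enforces nothing, in line with the point that T&Cs are not an enforcement means.

```python
from typing import Optional


def permitted(terms: dict[str, bool], action: str) -> Optional[bool]:
    """Return True or False when the owner stated a term for the action, None when silent."""
    return terms.get(action)


# Hypothetical T&Cs carried within a digital object.
hamlet_terms = {"read": True, "redistribute": False}

for action in ("read", "redistribute", "translate"):
    print(action, permitted(hamlet_terms, action))
```

Returning `None` for an unstated action keeps "the owner said nothing" distinct from "the owner said no", leaving that decision to whatever system consumes the terms.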

Handle-DNS Integration
- Development Environment
  - C/C++, Linux/Windows
- Additional Modules
  - DNS Interface integrated with handle server
  - Cache/Preload Module
  - Database Connection Pools
  - C-Version Handle-DNS Admin Toolkit
- Performance Improvements
  - Exception Processing
  - Memory Leak Protection
  - Thread Pool Management

Design & Implementation
- Simple Handle Server Workflow (C-Version)
[Diagram labels: Client; Handle Requests; Listener; Thread Pool; Message Processor; Handle Server; Storage Management Interface; Database Connection Pool; DB]
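
The workflow in the diagram (requests handed to a thread pool, each worker running a message processor that borrows a database connection from a pool) can be sketched as follows. The slides describe a C/C++ implementation; this Python version only illustrates the shape of the workflow, and the dictionary standing in for the database, the connection names and all function names are hypothetical.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the database of handle records (hypothetical data).
DATABASE = {"2304/hamlet": "https://repo.example/hamlet"}


class ConnectionPool:
    """Fixed set of reusable 'connections'; workers block until one is free."""

    def __init__(self, size):
        self._free = queue.Queue()
        for number in range(size):
            self._free.put(f"conn-{number}")

    def lookup(self, handle):
        connection = self._free.get()  # borrow a connection
        try:
            return DATABASE.get(handle)  # a real server would query via `connection`
        finally:
            self._free.put(connection)  # always return it to the pool


def process_message(pool, handle):
    """Message processor: turn one handle request into one response."""
    value = pool.lookup(handle)
    return (handle, value if value is not None else "NOT_FOUND")


def serve(requests, workers=4):
    """Listener stand-in: dispatch every request to the thread pool, keep request order."""
    pool = ConnectionPool(size=2)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(lambda handle: process_message(pool, handle), requests))


print(serve(["2304/hamlet", "2304/missing"]))
```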

External Protocol Converter
[Diagram labels: DNS Protocol; Converter; Handle Protocol; Handle Process Module; Handle Server; Latency]

Plug & Play Interfaces
- Integrate DNS Interface with Handle Server
[Diagram labels: DNS Protocol; DNS Message Processor; Handle Protocol; Handle Message Processor; Handle Server]

Cache & Storage Management
- Preload (Cache) Module
  - Preload Handle Records from Database into RAM
  - Reduce Database Access Times
  - Improve Throughput of Handle Server
- Storage Management API
  - User Transparent
  - RAM or Database
  - Combination of RAM and Database
  - Multiple Database Interfaces
    - MySQL, PostgreSQL, etc.
- Features of Cache Module
  - Efficient Query Performance
    - STL RBTree, Hash Table
  - Configurable size of RAM for each Handle Record, or total records
[Diagram labels: Storage Management API; Storage Management Interface; RAM; Operations (Create, Modify, Delete); Data Base; Periodic Update]
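
The preload-and-periodic-update idea can be sketched with a small store that reads from RAM first, falls back to the database, and writes changes back in batches. This is an illustration in Python rather than the C/C++ module the slides describe; the class name `CachedStore`, the dictionary standing in for the database and the `max_records` limit are assumptions, and deletion is omitted for brevity.

```python
class CachedStore:
    """RAM cache in front of a database stand-in, with deferred write-back."""

    def __init__(self, database, max_records):
        self._db = database        # dict standing in for MySQL, PostgreSQL, etc.
        self._max = max_records    # configurable total number of cached records
        self._ram = {}
        self._dirty = set()        # handles changed in RAM but not yet in the database

    def preload(self):
        """Copy records from the database into RAM until the cache is full."""
        for handle, record in self._db.items():
            if len(self._ram) >= self._max:
                break
            self._ram[handle] = record

    def get(self, handle):
        """Serve from RAM when possible, otherwise fall back to the database."""
        if handle in self._ram:
            return self._ram[handle]
        return self._db.get(handle)

    def put(self, handle, record):
        """Create or modify a record; bypass the cache when it is full."""
        if handle in self._ram or len(self._ram) < self._max:
            self._ram[handle] = record
            self._dirty.add(handle)
        else:
            self._db[handle] = record

    def flush(self):
        """Periodic update: write changed records back and report how many."""
        for handle in self._dirty:
            self._db[handle] = self._ram[handle]
        written = len(self._dirty)
        self._dirty.clear()
        return written


db = {"2304/a": "record-a", "2304/b": "record-b"}  # hypothetical records
store = CachedStore(db, max_records=1)
store.preload()
store.put("2304/a", "record-a2")
print(store.get("2304/a"), db["2304/a"], store.flush(), db["2304/a"])
```

Callers use only `get` and `put`, so whether a record lives in RAM, in the database, or in both stays transparent to them.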

Benchmark
- UDP Interface for DNS Protocol
- Compared to BIND 9.3.0

Business Potential
- Selling infrastructure technology
- Providing identification, management and Metadata services
- Enabling third-party value-added capabilities
- Helping organizations manage their own information better & offer new types of services
- Stimulating access to “surface information” and “embedded information” with appropriate access controls and conditions of use

Conclusions
- Managing Digital Objects for long-term access is a key challenge
- Initial Technology Components are available; Industry is expected to generate more over time
- Third-party value-added providers in the private sector will ultimately shape the long-term evolution
- Interoperability and reliable information access is a critical objective
- A diversity of applications (with user-friendly interfaces) need to be developed & deployed
- Application Projects have a central role to play in demonstrating the technology and using it effectively