Alternative Architecture for Information in Digital Libraries Onno W. Purbo



Reference: i/02arms1.html William Y. Arms, Christophe Blanchi, Edward A. Overly, “An Architecture for Information in Digital Libraries,” Corporation for National Research Initiatives, Reston, Virginia, February 1997.

The Structure of Information Digital data → digital library. Digital objects. Metadata. Unique identifier (handle). Group of digital objects → set of digital objects. Different types of material → categories.

Components of the Computer System

Work Flow Example Search Z – list of digital objects identified by handle. Select Retrieval Repository Access Protocol (RAP) Display

Information Architecture

Structure of Information in Digital Libraries Relationships (chapter, index). Formats (SGML, HTML). Versions. Rights & permissions. Computer systems & networks (dial-up vs. broadband).

Basic Principles Users & application programs must be flexible. Collections must be straightforward to manage. The information architecture must reflect the economic, social & legal framework.

Data type, structural metadata Data type – technical properties of data, format & processing. Structural metadata – type, version, relationship of digital material. Meta-object – reference to a set of digital objects.

Guidelines for all categories All data is given an explicit data type. All metadata is encoded explicitly. Handles are given to individual items of intellectual property. Meta-objects are used to aggregate digital objects. Handles are used to identify items listed in meta-objects.

An Example of the Use of Meta-objects Scanned photographs Digital objects for a scanned photograph Digital objects for individual versions Meta-object Handles for scanned photographs Depositing a scanned photograph

Digital objects for a scanned photograph Low resolution “thumbnail” High resolution “reference” image

Digital objects for individual versions Key metadata. Used to manage the object in a networked environment. It includes the handle, and the rights and permissions associated with the digital object. Structural metadata. Includes fields for description, owner, handle of meta-object, data size, data type (e.g., "jpg"), version number, date deposited, use (e.g., "thumbnail"), and the date of last revision. Image data. This is the image data.

Meta-object Key metadata. Includes the handle, and the rights and permissions associated with the digital object. Structural metadata. Includes a description, the owner, the number of versions, the date deposited, the use ("meta-object"), and the date of last revision. Data about each version. For each of the three scanned versions (e.g., the thumbnail), there is a package of information including the handle of the version, and the relationship among the versions.

Handles for scanned photographs Control identifier: 3a16116r.jpg. Replace the control identifiers by handles, which provide a unique, persistent, location-independent name for each item: loc.ndlp.amrlp/3a16116. Terminology to describe handles: "loc.ndlp.amrlp" is the naming authority; "3a16116" is a locally unique string. For convenience in processing, use sequence numbers: loc.ndlp.amrlp/3a loc.ndlp.amrlp/3a
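
The handle terminology above can be illustrated with a minimal Python sketch. It assumes only what the slide states: a handle is a naming authority and a locally unique string joined by a slash. The names `Handle` and `parse_handle` are hypothetical and not part of the handle system itself.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    naming_authority: str  # e.g. "loc.ndlp.amrlp"
    local_name: str        # string unique within that naming authority


def parse_handle(text: str) -> Handle:
    """Split 'authority/local' at the first slash (assumed syntax)."""
    authority, sep, local = text.partition("/")
    if not sep or not authority or not local:
        raise ValueError(f"not a handle: {text!r}")
    return Handle(authority, local)


handle = parse_handle("loc.ndlp.amrlp/3a16116")
print(handle.naming_authority)  # loc.ndlp.amrlp
print(handle.local_name)        # 3a16116
```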

Meta-object identifies 2 images

Depositing a scanned photograph Human machine

Depositing a scanned photograph - human Selection of the material that will be made into each digital object. Specification of the metadata for those fields that require judgment.

Depositing a scanned photograph - machine Creation of the meta-object and the links to other digital objects. Depositing the digital objects in the repository. Registering the handles in the handle system.

Access to a scanned photograph Bibliographic entries in search systems refer to the scanned photograph by the handle of the meta-object. If a user requests a summary of the photograph, the "thumbnail" image is provided. If the user requests access to the photograph without specifying which version, the "access" image is provided.
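
The access rule above can be sketched as a small lookup on a meta-object. This is an illustration under stated assumptions, not the authors' implementation: the class names, the `select` method, the request keyword "summary", and the `example.auth/...` handles are all hypothetical. Only the "thumbnail" and "access" uses come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VersionEntry:
    handle: str  # handle of the digital object holding this version
    use: str     # e.g. "thumbnail" or "access"


@dataclass
class MetaObject:
    handle: str
    description: str
    versions: List[VersionEntry] = field(default_factory=list)

    def select(self, request: str = "default") -> str:
        """Return the handle of the version that answers a request."""
        # A summary request gets the thumbnail; anything else gets "access".
        wanted = "thumbnail" if request == "summary" else "access"
        for entry in self.versions:
            if entry.use == wanted:
                return entry.handle
        raise LookupError(f"no {wanted!r} version in {self.handle}")


photo = MetaObject(
    "example.auth/photo-1",  # hypothetical handle
    "scanned photograph",
    [
        VersionEntry("example.auth/photo-1-t", "thumbnail"),
        VersionEntry("example.auth/photo-1-a", "access"),
    ],
)
print(photo.select("summary"))  # example.auth/photo-1-t
print(photo.select())           # example.auth/photo-1-a
```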

Technical Information

Digital Object

Key-metadata The key-metadata is the information stored in the digital object that is needed to manage the digital object in a networked environment -- for example to store, replicate, or transmit the object without providing access to the content. This includes terms and conditions, and the handle. Digital material The digital material (or data) comprises a set of sequences of bits.

Digital Objects Internal Structure An element is a bit sequence comprising an elementary unit of information. An element has its own ID. A package is a collection of elements and other packages, with its own ID. A digital object is a package with key-metadata for use in a networked environment. The ID is a handle.

Data Element

Data element A data element is any bit-sequence. Element ID The element ID is the internal identifier of the element within the digital object. Unlike a handle, which is unique and known publicly, the element ID is of local importance only. Attributes Attributes are the information that is needed to process the element. They include: a role, which defines the function of the element (such as "DTD" in the SGML world), and a type, which includes technical information (such as "jpeg").

A Package

Packages Packages are used to group or associate elements and other packages. A package has a package ID. If the package is a digital object, the package ID is a handle. Otherwise, it is the internal identifier of the package within the digital object. Unlike a handle, which is unique and known publicly, such a package ID is of local importance only. The content of a package consists of elements and other packages.
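
The element / package / digital object nesting described in the slides above can be sketched with Python dataclasses. The field names and the `elements` helper are assumptions made for illustration, and the handle `example.auth/doc-1` is hypothetical. The slides define only the concepts (element ID, role, type, package ID, key-metadata).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Element:
    element_id: str  # local to the enclosing digital object
    role: str        # function of the element, e.g. "DTD"
    type: str        # technical information, e.g. "jpeg"
    data: bytes      # any bit sequence


@dataclass
class Package:
    package_id: str
    content: List[Union[Element, "Package"]] = field(default_factory=list)

    def elements(self) -> Iterator[Element]:
        """Yield every element, descending into nested packages."""
        for item in self.content:
            if isinstance(item, Package):
                yield from item.elements()
            else:
                yield item


@dataclass
class DigitalObject(Package):
    # For a digital object the package ID is a handle; the key-metadata
    # also carries terms and conditions (modelled here as a plain string).
    terms_and_conditions: str = ""


obj = DigitalObject(
    "example.auth/doc-1",  # hypothetical handle
    [
        Element("e1", "image", "jpeg", b"\xff\xd8"),
        Package("p1", [Element("e2", "DTD", "sgml", b"<!ELEMENT>")]),
    ],
    terms_and_conditions="public",
)
print([e.element_id for e in obj.elements()])  # ['e1', 'e2']
```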

Handle & Handle System

The digital library is assembled from a great variety of components. They include people, computers, networks, repositories, databases, search systems, Web servers, digital objects, elements of objects, bibliographic records, and many more. Keeping track of these components requires a systematic approach to identification.

Typical handle record

Handle record for web

Handle System To resolve a handle is to present a handle to the handle system and receive as a reply information about the item identified. The handle system is a distributed computer system, with many computers distributed across the world. CNRI manages a global handle registry and there are local handle services operated by other organizations, e.g.

Naming Authority Handles are created by naming authorities, administrative units that are authorized to create and edit handles.

The Repository

Structure of a Repository A repository is a system for network-based storage of and access to digital objects. All interaction with the repository uses a simple protocol, known as the Repository Access Protocol (RAP). RAP has a small number of fundamental operations, such as "deposit", which stores a digital object in the repository, and "access", which provides access to a digital object. Thus RAP provides a clearly defined, open interface for the repository that allows others to write clients and higher level interfaces.

Structure of Repository

Repository shell The repository shell is the part of the repository that interfaces with the outside world. It implements RAP. Persistent store Information in the repository is held in the persistent store. The persistent store is completely hidden from the outside. Object management layer The object management layer provides an interface between the services provided by the persistent store and the object oriented functions required by the repository shell.

The Repository Access Protocol (RAP) VerifyHandle. Confirm that a handle has been registered in the handle system. AccessRepoMeta. Access the repository metadata. Verify_DO. Confirm that a repository stores a digital object with a specified handle. AccessMeta. Access the metadata for a specified digital object. Access_DO. Access the digital object. Deposit_DO. Deposit a digital object in a repository. Delete_DO. Delete a digital object from a repository. MutateMeta. Edit the metadata for a digital object. Mutate_DO. Edit a digital object.
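
To make the operation list above concrete, here is a toy in-memory stand-in written in Python. It is a sketch only: the class `ToyRepository`, its method signatures, and the dictionary storage are assumptions, the method names are snake_case renderings of a subset of the operations listed, and the slides do not specify arguments or return values. The handle `example.auth/1` is hypothetical.

```python
from typing import Dict, Set


class ToyRepository:
    """In-memory stand-in; covers a subset of the listed operations."""

    def __init__(self, registered_handles: Set[str]):
        # Stand-in for the handle system: the set of registered handles.
        self.registered = registered_handles
        self.objects: Dict[str, dict] = {}

    def verify_handle(self, handle: str) -> bool:
        return handle in self.registered

    def verify_do(self, handle: str) -> bool:
        return handle in self.objects

    def deposit_do(self, handle: str, metadata: dict, data: bytes) -> None:
        if not self.verify_handle(handle):
            raise ValueError(f"handle not registered: {handle}")
        self.objects[handle] = {"meta": dict(metadata), "data": data}

    def access_meta(self, handle: str) -> dict:
        return dict(self.objects[handle]["meta"])  # copy, not the original

    def access_do(self, handle: str) -> bytes:
        return self.objects[handle]["data"]

    def mutate_meta(self, handle: str, **changes) -> None:
        self.objects[handle]["meta"].update(changes)

    def mutate_do(self, handle: str, data: bytes) -> None:
        self.objects[handle]["data"] = data

    def delete_do(self, handle: str) -> None:
        del self.objects[handle]


repo = ToyRepository({"example.auth/1"})
repo.deposit_do("example.auth/1", {"use": "thumbnail"}, b"image-bytes")
print(repo.verify_do("example.auth/1"))    # True
print(repo.access_meta("example.auth/1"))  # {'use': 'thumbnail'}
```

Keeping every operation keyed by handle mirrors the slide's point that clients address digital objects by handle and never see the persistent store.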

Handle system to access DO

Example RAP Work Flow The handle "loc.ndlp/1234" is sent to the handle system. It resolves to data type "handle" (HDL), value "loc/repos1". This is interpreted as information that the digital object is stored in the repository identified by the given handle. The handle "loc/repos1" is sent to the handle system. It resolves to information of type "RAP". This is information that the repository implements RAP. The corresponding data is a reference to a CORBA Object Request Broker (ORB). The command "Access_DO (loc.ndlp/1234)" is now sent to the repository.

Benefits of Using Handles Since the digital object is identified by a handle, if it is moved to another repository the only change required is to alter the data in the first of the handle records in the figure. Since the repository is identified by a handle, if the repository is moved to a different computer or otherwise changed, but its handle remains the same, altering the single data item in the second handle record in the figure is the only change needed, for all the digital objects stored in the repository.
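
The two-step resolution and the relocation benefit described in the slides above can be sketched with a dictionary standing in for the handle system. The record layout `(type, value)`, the function `locate`, and the placeholder strings such as `orb-reference-A` are assumptions made for illustration. The handles `loc.ndlp/1234` and `loc/repos1` and the types "HDL" and "RAP" come from the example work flow.

```python
# Hypothetical handle records: handle -> (data type, value)
records = {
    "loc.ndlp/1234": ("HDL", "loc/repos1"),
    "loc/repos1": ("RAP", "orb-reference-A"),  # placeholder for the ORB reference
}


def locate(handle, records):
    """Follow HDL records until a RAP record is reached.

    Returns (repository_handle, rap_reference).
    """
    current = handle
    seen = set()
    while True:
        if current in seen:
            raise ValueError(f"cycle while resolving {handle}")
        seen.add(current)
        record_type, value = records[current]
        if record_type == "RAP":
            return current, value
        if record_type != "HDL":
            raise ValueError(f"unexpected record type {record_type!r}")
        current = value


print(locate("loc.ndlp/1234", records))  # ('loc/repos1', 'orb-reference-A')

# The repository moves to another computer: only its own record changes,
# and every object stored there still resolves.
records["loc/repos1"] = ("RAP", "orb-reference-B")
print(locate("loc.ndlp/1234", records))  # ('loc/repos1', 'orb-reference-B')
```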

User Interface

User Interface System

Client via CGI-BIN

DO sets as hierarchies

Hierarchies Level 0: contains the digitized image, sound, text, or other data. Level 1: is a parent of digital objects of Level 0. Upon encountering a digital object of this type, the digital object browser extracts the content of all the child Level 0 digital objects and displays them in an indexed list to the user. This type has been used to display indexes of thumbnail images. Level 2: is a parent of digital objects of Level 1.
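
The Level 1 browsing behaviour above can be sketched as follows. This is a minimal illustration, assuming a simple tree of nodes: the `Node` class, the `index_listing` function, and the `example.auth/...` handles are hypothetical, and the real digital object browser is not described in enough detail here to reproduce.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    handle: str
    level: int
    content: str = ""  # only meaningful for Level 0 objects
    children: List["Node"] = field(default_factory=list)


def index_listing(node: Node) -> List[Tuple[int, str]]:
    """For a Level 1 object, return an indexed list of its Level 0 content."""
    if node.level != 1:
        raise ValueError("index listings are built from Level 1 objects")
    leaves = [child for child in node.children if child.level == 0]
    return list(enumerate((leaf.content for leaf in leaves), start=1))


album = Node(
    "example.auth/album",  # hypothetical handle
    1,
    children=[
        Node("example.auth/t1", 0, "thumbnail A"),
        Node("example.auth/t2", 0, "thumbnail B"),
    ],
)
print(index_listing(album))  # [(1, 'thumbnail A'), (2, 'thumbnail B')]
```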