The Multi-Faceted Use of the OAI-PMH in the LANL Repository. Written by: Henry, Xiaoming, Patrick and Herbert. Presented by: Shashi.


1 The Multi-Faceted Use of the OAI-PMH in the LANL Repository. Written by: Henry, Xiaoming, Patrick and Herbert. Presented by: Shashi Kanth Myadam

2 Challenges faced by digital libraries: Representing complex objects. Making the vast, heterogeneous collection of complex objects accessible to downstream applications. Ingesting and storing the vast assets. Assigning a unique identity to each asset. Defining relationships between assets through complex object models.

3 Introduction: The Los Alamos National Laboratory (LANL) Research Library has resolved these challenges to a great extent. Main goals: Hosting, archiving and accessing vast heterogeneous assets in a consistent and sustainable manner. Making them accessible to downstream applications.

4 Properties of the LANL Library: Use of the MPEG-21 Digital Item Declaration Language (DIDL) to represent complex objects. Natively distributed in nature. XML tapes to store complex objects. Multi-faceted use of the OAI-PMH to access stored content in incremental batches. OpenURL to access data.

5

6 Components in LANL Repository Architecture Ingestion into the LANL Repository. OAI-PMH repositories. XML tapes for storing DIDs. Repository Index. Identifier Resolver. MPEG-21 DIP Engine and table. OAI-PMH federator. OpenURL gateway.

7 Ingestion into the LANL Repository: How to feed complex digital objects into the LANL Repository. Issue: an article may consist of: Metadata describing the article. The article itself in PDF and ASCII. References in XML format. MPEG-21 DIDL provides a standard for representing such complex digital objects.

8 Contd. Ways to feed data into the repository: HTTP, FTP. OAI-PMH harvester. Web crawler. Physical media.

9 Prototype Ingestion Process: Converts the asset into an XML document called a Digital Item Declaration (DID). The DID conforms to the MPEG-21 DIDL specification. The DID also contains relationships, which are generated by the ingestion process. Two items are added during ingestion: DID identifier, a globally unique identifier. DID creation time.
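
The ingestion step on this slide can be sketched roughly as follows. This is a minimal illustration, not the LANL ingestion code: the function name `build_did`, the simplified element layout, the plain-text creation-time descriptor and the example URLs are all assumptions, and the namespace URIs are taken from memory of the 2002 DIDL and DII schemas and should be checked against the specification.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace URIs assumed from the MPEG-21 DIDL/DII 2002 schemas; verify before reuse.
DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"
DII_NS = "urn:mpeg:mpeg21:2002:01-DII-NS"


def build_did(datastreams, did_id=None, created=None):
    """Wrap an asset's datastreams in a simplified DIDL document.

    datastreams: list of (mime_type, url) pairs, e.g. metadata, PDF, references.
    Returns (did_id, created, xml_text).
    """
    did_id = did_id or f"urn:uuid:{uuid.uuid4()}"
    created = created or datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    root = ET.Element(f"{{{DIDL_NS}}}DIDL")
    item = ET.SubElement(root, f"{{{DIDL_NS}}}Item")

    # Descriptor 1: the globally unique DID identifier added at ingestion.
    desc = ET.SubElement(item, f"{{{DIDL_NS}}}Descriptor")
    stmt = ET.SubElement(desc, f"{{{DIDL_NS}}}Statement", {"mimeType": "text/xml"})
    ET.SubElement(stmt, f"{{{DII_NS}}}Identifier").text = did_id

    # Descriptor 2: the DID creation time (plain-text statement, hypothetical layout).
    desc = ET.SubElement(item, f"{{{DIDL_NS}}}Descriptor")
    stmt = ET.SubElement(desc, f"{{{DIDL_NS}}}Statement", {"mimeType": "text/plain"})
    stmt.text = created

    # One Component/Resource pair per datastream of the asset.
    for mime_type, url in datastreams:
        comp = ET.SubElement(item, f"{{{DIDL_NS}}}Component")
        ET.SubElement(comp, f"{{{DIDL_NS}}}Resource", {"mimeType": mime_type, "ref": url})

    return did_id, created, ET.tostring(root, encoding="unicode")
```

The identifier and creation time are returned alongside the XML because, per the slides, they later serve as the OAI-PMH identifier and datestamp of the record.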

10 Characteristics of OAI-PMH repositories: BaseURL(n). Contained records are DIDs only. The DID identifier is used to identify DIDs. The datestamp is the DID creation date. OAI-PMH granularity is at seconds level. Set structure (out of scope).

11 XML tapes for storing DIDs: The characteristics of the OAI-PMH repositories have led to XML tapes as storage for DIDs. XML tapes are created as follows: Each asset is converted to a DID. All DIDs are concatenated into a single well-formed and valid XML file. The XML file is indexed following Google's approach: XML files are gzipped. Gzipped files are indexed with identifier and datestamp as keys.
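
The idea of a compressed tape with an identifier/datestamp index can be illustrated with a small in-memory sketch. It is not the actual XML tape format: here each DID is compressed as its own gzip member and the members are concatenated, which is one assumed way to keep single records retrievable by byte offset. The function names and the index layout are hypothetical.

```python
import gzip
import io


def write_tape(records):
    """Append DIDs to an in-memory tape and build an index.

    records: iterable of (identifier, datestamp, xml_text).
    Returns (tape_bytes, index), where index maps
    identifier -> (datestamp, byte_offset, byte_length).
    """
    tape = io.BytesIO()
    index = {}
    for identifier, datestamp, xml_text in records:
        member = gzip.compress(xml_text.encode("utf-8"))  # one gzip member per DID
        index[identifier] = (datestamp, tape.tell(), len(member))
        tape.write(member)
    return tape.getvalue(), index


def read_record(tape_bytes, index, identifier):
    """Fetch one DID by identifier without decompressing the rest of the tape."""
    _, offset, length = index[identifier]
    return gzip.decompress(tape_bytes[offset:offset + length]).decode("utf-8")


def identifiers_since(index, from_datestamp):
    """Identifiers whose datestamp is at or after from_datestamp.

    ISO 8601 UTC datestamps of equal granularity compare correctly as strings.
    """
    return sorted(i for i, (stamp, _, _) in index.items() if stamp >= from_datestamp)
```

Keying the index on both identifier and datestamp supports the two access patterns the slides rely on: fetching a single DID, and listing DIDs added since a given time.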

12 XML tapes are never updated: Assets are rarely updated. Even when an asset is updated, the corresponding DID in the XML tape is never updated. Instead, a new DID is created for the updated asset and stored in another OAI-PMH repository.

13 Repository Index: The Repository Index contains: Repository BaseURL, a unique and persistent URI. Repository creation time. Metadata of the OAI-PMH repository. The Repository Index can gather newly added DIDs by using a datestamp-based harvesting strategy.
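
Datestamp-based harvesting can be sketched as below, assuming a repository is modelled as a plain list of (identifier, datestamp) pairs. The function name `harvest_since` and the `known_ids` parameter are illustrative; a real harvester would issue OAI-PMH requests with a `from` argument instead of filtering a list.

```python
def harvest_since(records, from_datestamp, known_ids=frozenset()):
    """Select records added at or after from_datestamp.

    records: list of (identifier, datestamp) pairs exposed by one repository.
    known_ids: identifiers already harvested; skipped because the OAI-PMH
        'from' bound is inclusive, so boundary records are seen twice.
    Returns (new_identifiers, next_from).
    """
    selected = [
        (identifier, stamp)
        for identifier, stamp in records
        if stamp >= from_datestamp and identifier not in known_ids
    ]
    # Oldest first, so the harvester processes records in creation order.
    new_ids = [identifier for identifier, _ in sorted(selected, key=lambda r: r[1])]
    # The newest datestamp seen becomes the 'from' value of the next harvest.
    next_from = max((stamp for _, stamp in selected), default=from_datestamp)
    return new_ids, next_from
```

Because DIDs are never updated in place, the datestamp only ever moves forward, which is what makes this kind of incremental harvest sufficient.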

14 Identifier Resolver: A special-purpose service provider that collects the information it requires through recurrent OAI-PMH harvesting. It uses only the DID identifier, content identifier and base URL.

15 Example: If an object with identifier ID-1 is required, we look it up in the Identifier Resolver and learn that it is located at BaseURL(3). With this information, an application can request the DID by issuing an OAI-PMH request.
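
The lookup and request in this example might look like the sketch below. The resolver table, the host name `repo3.example.org` and the `didl` metadata prefix are assumptions made for illustration; `verb`, `identifier` and `metadataPrefix` are the standard OAI-PMH `GetRecord` arguments.

```python
from urllib.parse import urlencode

# Hypothetical resolver table: DID identifier -> baseURL of the holding repository.
RESOLVER = {"ID-1": "http://repo3.example.org/oai"}


def get_record_url(identifier, resolver=RESOLVER, metadata_prefix="didl"):
    """Build the OAI-PMH GetRecord URL for a DID located through the resolver.

    Raises KeyError if the resolver has no entry for the identifier.
    """
    base_url = resolver[identifier]
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,  # assumed prefix name
    })
    return f"{base_url}?{query}"
```

For example, `get_record_url("ID-1")` yields a URL that an application could fetch with any HTTP client to obtain the DID.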

16 OAI-PMH Federator: It acts as a single point of access for harvesters. It makes the harvester feel that there is only one OAI-PMH repository. Harvesters do not need to know the location of each OAI-PMH repository.

17 Functions of the OAI-PMH Federator: The OAI-PMH Federator accepts incoming OAI-PMH requests and translates them into appropriate requests to the components of the LANL Repository. It receives responses from the various parts of the LANL Repository and hands them over to the harvesters as a valid OAI-PMH response.
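
The federator's role can be illustrated with a small in-memory sketch in which each underlying repository is a dictionary. The class and method names are hypothetical, and the real federator speaks OAI-PMH over HTTP and consults the Repository Index and Identifier Resolver, which this sketch replaces with simple loops.

```python
class Federator:
    """Present several repositories to a harvester as if they were one."""

    def __init__(self, repositories):
        # repositories: {base_url: {identifier: (datestamp, record_xml)}}
        self.repositories = repositories

    def list_identifiers(self, from_datestamp=None):
        """Merge identifiers from all repositories, oldest first."""
        merged = []
        for repo in self.repositories.values():
            for identifier, (stamp, _) in repo.items():
                if from_datestamp is None or stamp >= from_datestamp:
                    merged.append((stamp, identifier))
        return [identifier for _, identifier in sorted(merged)]

    def get_record(self, identifier):
        """Return the record from whichever repository holds it."""
        for repo in self.repositories.values():
            if identifier in repo:
                return repo[identifier][1]
        # A real OAI-PMH endpoint would answer with an idDoesNotExist error.
        raise KeyError(identifier)
```

The harvester only ever calls the federator, so repositories can be added without harvesters having to learn new base URLs.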

18 Characteristics of the OAI-PMH Federator: Unique BaseURL(federator). Identifier. Datestamp. Contains DIDs, which are dynamically processed by the digital item processing (DIP) engine. Granularity is at seconds level.

19 Federator Response: The federator responds to the following harvesting requests: ListMetadataFormats. ListSets. GetRecord. ListIdentifiers.

20 MPEG-21 DIP Engine: Capable of processing DIDs upon request of an agent. A DIP Engine is able to respond to service requests in which the following information is conveyed: An actual DID. An identification of the entity of the DID for which the service is requested. An identification of the method (DIM) contained in the DID that implements the service. Reference: Using MPEG-21 DIP and NISO OpenURL for the Dynamic Dissemination of Complex Digital Objects in the Los Alamos National Laboratory Digital Library.
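
The three pieces of information in a service request (a DID, an entity within it, and a method) suggest a simple dispatch, sketched below. The dictionary model of a DID and the use of Python callables as methods are assumptions made for illustration only; this is not how an MPEG-21 DIP engine represents DIDs or methods.

```python
def dip_process(did, entity_id, method_name):
    """Apply a method contained in the DID to one entity of that DID.

    did: {"entities": {entity_id: content}, "methods": {name: callable}}
         (a hypothetical in-memory stand-in for a parsed DID).
    """
    if entity_id not in did["entities"]:
        raise KeyError(f"unknown entity: {entity_id}")
    if method_name not in did["methods"]:
        raise KeyError(f"unknown method: {method_name}")
    method = did["methods"][method_name]
    return method(did["entities"][entity_id])
```

Because the methods travel with the DID, the services available for an object are determined by the object itself and not by a fixed list built into the engine.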

21 OpenURL gateway: The OpenURL Standard introduces the notion of a ContextObject, an information construct that contains descriptions of the various entities involved in the process of providing context-sensitive services. Entities and their definitions:
Referent: The entity about which the ContextObject was created (the referenced resource).
ReferringEntity: The entity that references the Referent.
Requester: The entity that requests services pertaining to the Referent.
ServiceType: The entity that defines the type of service requested.
Resolver: The entity at which a request for services is targeted.
Referrer: The entity that generated the ContextObject.
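
The six entities can be modelled as a small data structure, as in the sketch below. The class and field names are illustrative; the `rft_id`, `rfe_id`, `req_id`, `svc_id`, `res_id` and `rfr_id` keys and the `url_ver` value follow the key/encoded-value (KEV) format of Z39.88-2004 as best understood, and the identifier values in the usage note are made up.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class ContextObject:
    """Identifiers for the six entities of an OpenURL ContextObject."""
    referent: str                           # the referenced resource
    referring_entity: Optional[str] = None  # the entity that references the Referent
    requester: Optional[str] = None         # who requests the service
    service_type: Optional[str] = None      # which service is requested
    resolver: Optional[str] = None          # where the request is targeted
    referrer: Optional[str] = None          # who generated the ContextObject

    def to_kev(self):
        """Serialize to a KEV query string, omitting entities that are not set."""
        pairs = [
            ("url_ver", "Z39.88-2004"),
            ("rft_id", self.referent),
            ("rfe_id", self.referring_entity),
            ("req_id", self.requester),
            ("svc_id", self.service_type),
            ("res_id", self.resolver),
            ("rfr_id", self.referrer),
        ]
        return urlencode([(key, value) for key, value in pairs if value is not None])
```

For instance, `ContextObject(referent="ID-1", service_type="getDID").to_kev()` produces a query string that could be appended to the base URL of an OpenURL gateway.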

22

23 Conclusion: The LANL repository architecture has successfully made a vast and ever-growing data collection available to various downstream applications. It introduces several interacting components to solve a complex problem. It uses a lightweight protocol for interaction that can be supported by many software systems. Together, these make the architecture very attractive for digital libraries.

24 References: Jerez, Henry, Xiaoming Liu, Patrick Hochstenbach, and Herbert Van de Sompel. The Multi-Faceted Use of the OAI-PMH in the LANL Repository. 2004. Draft of an accepted submission to JCDL 2004. Bekaert, Jeroen, Patrick Hochstenbach, and Herbert Van de Sompel. Using MPEG-21 DIDL to Represent Complex Digital Objects in the Los Alamos National Laboratory Digital Library. 2003. D-Lib Magazine.

25 Thank you

