1
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Editors: Carl Lagoze (Cornell University), Herbert Van de Sompel (Los Alamos Laboratory), Michael Nelson (NASA Langley Research Ctr), Simeon Warner (Cornell University)
Presented by: Georges Arnaout, Chaitanya Krishna
CS 791/891-WEB SYNDICATION FORMATS
2
OAI: Open Archives Initiative
Open: the protocol is openly documented, and metadata is "exposed" to at least some peer group.
Archive: defined as a "collection of stuff", or "repository".
OAI is happening at break-neck speed...
3
Definition
OAI-PMH: a protocol that provides an application-independent interoperability framework based on metadata harvesting.
But what is interoperability?
4
What is Interoperability?
It is the ability of two or more applications or systems to exchange information and to use the information that has been exchanged.
5
What is a Harvester?
It is a client application that issues OAI-PMH requests. It is operated in order to collect metadata from repositories.
6
What is a repository?
It is a big database: a place where data is stored and maintained. It is a network-accessible server that can process OAI-PMH requests. The data contained in the repository are the metadata that are exposed to harvesters.
7
Verbs Summary
Verb                 Function
Identify             description of repository
ListMetadataFormats  metadata formats supported by repository
ListSets             sets defined by repository
ListIdentifiers      OAI unique ids contained in repository
ListRecords          listing of N records
GetRecord            listing of a single record
8
OAI-PMH Data Model
OAI-PMH distinguishes between 3 distinct entities related to the exposed metadata:
1- Resource: the object that the metadata is about.
2- Item: a constituent of a repository from which metadata about a resource can be disseminated. That metadata may be disseminated on the fly, cross-walked from some canonical form, or actually stored in the repository.
3- Record: metadata in a specific metadata format.
9
Example
[Figure: a resource (David) and its item, which holds all available metadata about David. Item = identifier. The item's records are available as Dublin Core metadata, MARC, and SPECTRUM. Record = identifier + metadata format + datestamp.]
10
The XML-encoding of records
A record is encoded in XML in three parts:
- Header
- Metadata
- About
11
What happens if a record was deleted from the repository?
A repository declares how it handles deletions in the deletedRecord element of its Identify response.
12
What happens if a record was deleted from the repository?
Repositories must declare one of 3 levels of support:
1- no: the repository does not maintain information about deletions. It MUST NOT reveal a deleted status in any response.
2- persistent (the opposite): the repository maintains information about deletions with no time limit. It MUST persistently keep track of deletions and reveal the status of a deleted record.
3- transient: like persistent, but only for a limited time. Such a repository MAY reveal a deleted status; not revealing the status is acceptable.
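How a harvester might notice a deletion can be sketched in Python. In OAI-PMH 2.0 a deleted record is returned with status="deleted" on its header and without a metadata part. The function name is_deleted and the sample identifier and datestamp are invented for illustration; this is a minimal sketch, not a full harvester.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def is_deleted(record_xml: str) -> bool:
    """Return True if the record header carries status="deleted"."""
    record = ET.fromstring(record_xml)
    header = record.find(f"{OAI_NS}header")
    return header is not None and header.get("status") == "deleted"


# Hypothetical record: identifier and datestamp are made up for this sketch.
DELETED_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2002-02-08</datestamp>
  </header>
</record>
"""

if __name__ == "__main__":
    print(is_deleted(DELETED_RECORD))
```

A harvester talking to a repository with "no" support never sees this status, so it cannot rely on the check alone to keep its copy in sync.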
13
Selective Harvesting (datestamp and set)
Selective harvesting allows harvesters to limit harvest requests to portions of the metadata available from a repository.
14
Selective Harvesting via datestamps
A harvester limits a ListRecords or ListIdentifiers request by datestamp with the optional from and until arguments.
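A datestamp-limited request can be sketched as a small helper that assembles the arguments. The helper name selective_arguments and the sample dates are assumptions made for illustration. The granularity check reflects the OAI-PMH 2.0 rule that from and until must use the same granularity.

```python
from typing import Dict, Optional


def selective_arguments(metadata_prefix: str,
                        from_: Optional[str] = None,
                        until: Optional[str] = None,
                        set_spec: Optional[str] = None) -> Dict[str, str]:
    """Assemble ListRecords arguments for a selective harvest."""
    # YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ differ in length, so comparing
    # lengths is a cheap way to detect mixed granularities.
    if from_ and until and len(from_) != len(until):
        raise ValueError("from and until must use the same granularity")
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_:
        arguments["from"] = from_
    if until:
        arguments["until"] = until
    if set_spec:
        arguments["set"] = set_spec
    return arguments


if __name__ == "__main__":
    # Hypothetical date range, chosen only for the example.
    print(selective_arguments("oai_dc", from_="2002-01-01", until="2002-06-30"))
```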
15
Set membership
A set is an optional construct for grouping items for the purpose of selective harvesting. Think of it as a fraternity: a student (item) may belong to a fraternity, but not all students belong to one.
16
Selective Harvesting via Set
<record>
  <header>
    <identifier>oai:arXiv:cs/ </identifier>
    <datestamp> </datestamp>
    <setSpec>cs</setSpec>
    <setSpec>math</setSpec>
  </header>
  <metadata>
    …..
  </metadata>
</record>
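Reading set membership from a record header can be sketched as follows. The function name set_specs is hypothetical, and because the identifier and datestamp on the slide are incomplete, the sample record uses invented values.

```python
import xml.etree.ElementTree as ET
from typing import List

# Prefix-to-namespace map for OAI-PMH 2.0 responses.
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def set_specs(record_xml: str) -> List[str]:
    """Return every setSpec listed in the record's header."""
    record = ET.fromstring(record_xml)
    return [node.text for node in record.findall("oai:header/oai:setSpec", NS)]


# Hypothetical record: identifier and datestamp are made up for this sketch.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2002-02-08</datestamp>
    <setSpec>cs</setSpec>
    <setSpec>math</setSpec>
  </header>
</record>
"""

if __name__ == "__main__":
    print(set_specs(SAMPLE_RECORD))  # ['cs', 'math']
```

The record belongs to two sets at once, so a harvest limited to either set would return it.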
17
Date/time
1957-03-20T20:30:00Z is UTC 8:30:00 PM on March 20th, 1957.
Encoded in: ISO 8601, Z-notation (UTC).
Request: YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ.
Response: YYYY-MM-DDThh:mm:ssZ.
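The two datestamp forms can be produced and parsed with Python's standard datetime module. The helper names to_datestamp and parse_datestamp are invented for this sketch, which assumes timezone-aware datetime values as input.

```python
from datetime import datetime, timedelta, timezone

SECONDS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # YYYY-MM-DDThh:mm:ssZ
DAY_FORMAT = "%Y-%m-%d"                # YYYY-MM-DD


def to_datestamp(moment: datetime, day_only: bool = False) -> str:
    """Format a timezone-aware datetime as a UTC datestamp."""
    utc_moment = moment.astimezone(timezone.utc)
    return utc_moment.strftime(DAY_FORMAT if day_only else SECONDS_FORMAT)


def parse_datestamp(text: str) -> datetime:
    """Parse either datestamp form into a UTC datetime."""
    chosen_format = SECONDS_FORMAT if "T" in text else DAY_FORMAT
    return datetime.strptime(text, chosen_format).replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    # 3:30 PM at UTC-5 is 8:30 PM UTC, the instant used on the slide.
    utc_minus_5 = timezone(timedelta(hours=-5))
    print(to_datestamp(datetime(1957, 3, 20, 15, 30, tzinfo=utc_minus_5)))
    # 1957-03-20T20:30:00Z
```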
18
The BIG PICTURE
19
Request/Response
Requests are encoded in HTTP (GET or POST); responses are encoded in XML.
20
GET Example
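A GET request is a base URL followed by the verb and its arguments as a query string. The sketch below builds such URLs with the standard library. The helper name build_request_url and the GetRecord identifier are invented for illustration, while the base URL is taken from the Identify example elsewhere in these slides.

```python
from urllib.parse import urlencode

# Base URL taken from the Identify example in these slides.
BASE_URL = "http://memory.loc.gov/cgi-bin/oai2_0"


def build_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Return the GET URL for one OAI-PMH request."""
    # urlencode percent-escapes special characters such as ":" in identifiers.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(build_request_url(BASE_URL, "Identify"))
    # The identifier below is hypothetical.
    print(build_request_url(BASE_URL, "GetRecord",
                            identifier="oai:example.org:item-1",
                            metadataPrefix="oai_dc"))
```

Sending the request (for example with urllib.request.urlopen) is left out so that the sketch runs without network access.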
21
Flow Control
List requests: a number of OAI-PMH requests (ListIdentifiers, ListRecords, ListSets) return a list of entities. The list could be very large, so a repository may partition it among a series of requests and responses.
22
Flow Control Example
Exchange between a harvester and a repository (RDBMS):
harvester:  ListRecords
repository: Records 1-100, resumptionToken=AXad31
harvester:  ListRecords, resumptionToken=AXad31
repository: Records, resumptionToken=pQ22-x
harvester:  ListRecords, resumptionToken=pQ22-x
repository: Records
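The exchange above amounts to a loop that repeats the request with the latest resumptionToken until the repository stops returning one. The sketch below uses an in-memory stand-in for the repository so that it runs without a network. The names harvest and fake_repository, the page size, and the token format are assumptions for illustration only.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One response: the records in this chunk plus the token for the next chunk.
Page = Tuple[List[str], Optional[str]]


def harvest(fetch: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every record, following resumption tokens until none is left."""
    token: Optional[str] = None
    while True:
        records, token = fetch(token)
        yield from records
        if not token:  # a missing or empty token ends the list
            return


def fake_repository(token: Optional[str], page_size: int = 100,
                    total: int = 250) -> Page:
    """Stand-in repository: the token is simply the last record number sent."""
    start = int(token) if token else 0
    end = min(start + page_size, total)
    records = [f"record-{number}" for number in range(start + 1, end + 1)]
    next_token = str(end) if end < total else None
    return records, next_token


if __name__ == "__main__":
    harvested = list(harvest(fake_repository))
    print(len(harvested), harvested[0], harvested[-1])
```

A real harvester should treat the token as opaque and send it back unchanged; only the stand-in repository here knows that it encodes a position.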
23
Response with no errors
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH>
  <responseDate> T08:55:46Z</responseDate>
  <request verb="GetRecord"… …>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv:cs/ </identifier>
        <datestamp> </datestamp>
        <setSpec>cs</setSpec>
        <setSpec>math</setSpec>
      </header>
      <metadata>
        …..
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
24
Response with errors
In the event of an error or exception condition, repositories must indicate OAI-PMH errors by including one or more error elements in the response.
Request: verb=nastyVerb
Response:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns=" xmlns:xsi=" xsi:schemaLocation="
  <responseDate> T19:20:30Z</responseDate>
  <request verb="ListRecords" from=" T02:00:00Z" until=" T03:020:00Z" metadataPrefix="oai_marc">
  <error code="badArgument"/>
</OAI-PMH>
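On the harvester side, error handling can be sketched as a check that runs on every response before any records are read. The class OAIError, the function check_response, and all values in the sample response are hypothetical. The sample assumes that an unknown verb such as nastyVerb is answered with the badVerb error code defined by OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map for OAI-PMH 2.0 responses.
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


class OAIError(Exception):
    """Raised when a response carries an OAI-PMH error element."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def check_response(xml_text: str) -> ET.Element:
    """Parse a response and raise OAIError if it reports an error."""
    root = ET.fromstring(xml_text)
    errors = root.findall("oai:error", NS)
    if errors:
        first = errors[0]
        raise OAIError(first.get("code", ""), first.text or "")
    return root


# Hypothetical response to verb=nastyVerb; date, URL and message are invented.
BAD_VERB_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-05-01T09:18:29Z</responseDate>
  <request>http://example.org/oai</request>
  <error code="badVerb">Illegal OAI verb</error>
</OAI-PMH>
"""

if __name__ == "__main__":
    try:
        check_response(BAD_VERB_RESPONSE)
    except OAIError as error:
        print(error.code)
```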
25
Request Verbs
There are six different request types:
1) GetRecord
2) Identify
3) ListIdentifiers
4) ListMetadataFormats
5) ListRecords
6) ListSets
26
Argument Summary
Verb                 metadataPrefix  from      until     set       resumptionToken  identifier
Identify
ListMetadataFormats                                                                 optional
ListSets                                                           exclusive
ListIdentifiers      required        optional  optional  optional  exclusive
ListRecords          required        optional  optional  optional  exclusive
GetRecord            required                                                       required
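The argument summary can be turned into a small validity check. The rules below follow the OAI-PMH 2.0 specification as understood here: an exclusive resumptionToken must be the only argument besides the verb. The names RULES and check_arguments are invented for this sketch, and the function returns an error code rather than a full error response.

```python
from typing import Dict, Optional

# verb -> (required arguments, optional arguments, accepts resumptionToken)
RULES = {
    "Identify":            (set(), set(), False),
    "ListMetadataFormats": (set(), {"identifier"}, False),
    "ListSets":            (set(), set(), True),
    "ListIdentifiers":     ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "ListRecords":         ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "GetRecord":           ({"identifier", "metadataPrefix"}, set(), False),
}


def check_arguments(verb: str, arguments: Dict[str, str]) -> Optional[str]:
    """Return an OAI-PMH error code, or None if the request is acceptable."""
    if verb not in RULES:
        return "badVerb"
    required, optional, resumable = RULES[verb]
    names = set(arguments)
    if resumable and "resumptionToken" in names:
        # Exclusive: no other argument may accompany the token.
        return None if names == {"resumptionToken"} else "badArgument"
    missing_required = not required <= names
    has_unknown = not names <= (required | optional)
    return "badArgument" if missing_required or has_unknown else None


if __name__ == "__main__":
    print(check_arguments("ListRecords", {"resumptionToken": "AXad31", "set": "cs"}))
    # badArgument
```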
27
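Error Summary
Abbreviations: BA = badArgument, NMF = noMetadataFormats, IDDNE = idDoesNotExist, BRT = badResumptionToken, NSH = noSetHierarchy, CDF = cannotDisseminateFormat, NRM = noRecordsMatch.
Verb                 Possible errors
Identify             BA
ListMetadataFormats  BA, NMF, IDDNE
ListSets             BA, BRT, NSH
ListIdentifiers      BA, BRT, NSH, CDF, NRM
ListRecords          BA, BRT, NSH, CDF, NRM
GetRecord            BA, IDDNE, CDF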
28
Dublin Core
The Dublin Core metadata element set is a standard for cross-domain information resource description. It has been the mandated metadata format since the initial release of the protocol. The purpose of this requirement is to promote interoperability among data providers.
29
Example: http://memory.loc.gov/cgi-bin/oai2_0?verb=Identify
30
Repository explorer and example
We shall walk through the HU-Berlin example in the repository explorer.
31
OAI-PMH service provider
This is a service provider using OAI-PMH.
32
Conclusion
- OAI-PMH allows for any metadata format, so long as it is encoded in XML with an XML schema.
- All repositories must support oai_dc for a minimum level of interoperability.
- OAI-PMH now defines a single XML Schema to validate responses to all OAI-PMH requests.
- In a successful and trend-setting collaboration with the Dublin Core Metadata Initiative (DCMI), an XML Schema for unqualified Dublin Core has been created. It is hosted by the DCMI and used in the delivery of metadata in the mandatory DC format in OAI-PMH.
33
Questions?
- What are the benefits of OAI-PMH?
- Is the Open Archives Initiative only concerned with metadata?
- Why was Dublin Core chosen as the standard metadata format for OAI-PMH?
34
References
[CENDI Meeting, MD (4/3/02)]
[OA Forum Workshop, Pisa, Italy (5/13/02)]