Published by Ethelbert Barber. Modified over 9 years ago.
1
Open Archives Initiative Protocol for Metadata Harvesting
2
Collections in isolation
Some thoughts:
  A wonderful collection is of limited use if it is not well known.
  Highly redundant collections are often wasteful.
3
Virtual collections
Some collections do not contain actual materials, only information about materials and links to the home site.
How do these virtual collections get the information about other collections?
How do they stay up to date?
--> The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
4
OAI-PMH
Protocol -- an agreement to exchange messages and interpret them according to strict rules
Metadata -- data about the data: information about the material in the collection
Harvesting -- gathering the desired part of the collection for further use
5
The protocol
See http://www.openarchives.org/OAI/openarchivesprotocol.html
Two sides: the repository and the harvester
The repository (data provider)
  Prepares the required metadata
  Responds to the harvester's queries
  Acts like a server, responding to queries when they come
The harvester (data gatherer)
  Gathers the metadata from the collections
  Organizes the harvested metadata in a way that serves its purpose
  Acts like a client, requesting service when it needs it
6
Resource, item, record
Resource: the actual content of the collection; the point of the digital library
Item: the part of the repository from which metadata about a resource is generated
Record: metadata in a specific format, available for dissemination
  Encoded in XML
  Unique identifier
  Datestamp
  setSpec
  Optional status
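To make the parts of a record concrete, here is a minimal sketch that reads the header fields out of one record. The sample XML, the identifier `oai:example.org:item-42`, and the helper name `parse_header` are all assumptions made up for illustration; only the element names and the namespace come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses, in ElementTree's {uri}tag form.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical record, shaped like the <record> element of a response.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:example.org:item-42</identifier>
    <datestamp>2004-03-15</datestamp>
    <setSpec>physics:optics</setSpec>
  </header>
</record>
"""


def parse_header(record_xml):
    """Return the header fields of one record as a plain dict."""
    header = ET.fromstring(record_xml.strip()).find(OAI_NS + "header")
    return {
        "identifier": header.findtext(OAI_NS + "identifier"),
        "datestamp": header.findtext(OAI_NS + "datestamp"),
        # A record may belong to several sets, so collect every setSpec.
        "setSpecs": [s.text for s in header.findall(OAI_NS + "setSpec")],
        # The status attribute is optional; get() returns None when absent.
        "status": header.get("status"),
    }


header = parse_header(SAMPLE_RECORD)
print(header)
```

A harvester would typically check `status` first, since a deleted record carries a header but no metadata.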
7
Sets
Repositories may organize items into sets
Allows selective harvesting
Each node in a set organization has
  setSpec -- sets may be hierarchical; if so, the levels are separated by colons
  setName
  setDescription
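The colon-separated hierarchy can be sketched in a few lines. The function names `set_ancestry` and `in_set` and the set names below are hypothetical, chosen only to illustrate how a setSpec such as `physics:optics:lasers` implies membership in its parent sets.

```python
def set_ancestry(set_spec):
    """Expand a colon-separated setSpec into its ancestors plus itself."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def in_set(record_set_specs, wanted):
    """True if any of a record's setSpecs is `wanted` or a descendant of it."""
    return any(wanted in set_ancestry(spec) for spec in record_set_specs)


print(set_ancestry("physics:optics:lasers"))
# ['physics', 'physics:optics', 'physics:optics:lasers']
print(in_set(["physics:optics:lasers"], "physics:optics"))  # True
print(in_set(["physics:optics:lasers"], "chemistry"))       # False
```

Comparing whole path prefixes rather than raw string prefixes avoids treating a set named `phys` as a parent of `physics`.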
8
Requests
Each request is embedded in an HTTP request
Valid OAI-PMH requests:
  GetRecord
  Identify
  ListIdentifiers
  ListMetadataFormats
  ListRecords
  ListSets
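Since every request is just an HTTP request with a `verb` argument plus the arguments for that verb, a request URL can be assembled with the standard library. This is a sketch under assumptions: the base URL `http://example.org/oai` and the helper name `build_request` are invented for illustration.

```python
from urllib.parse import urlencode

# The six request types listed above.
VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def build_request(base_url, verb, arguments=None):
    """Return the GET URL for one OAI-PMH request.

    `arguments` is a dict rather than keyword arguments because one
    OAI-PMH argument name, `from`, is a reserved word in Python.
    """
    if verb not in VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **(arguments or {})})
    return f"{base_url}?{query}"


url = build_request(
    "http://example.org/oai",  # hypothetical repository base URL
    "GetRecord",
    {"identifier": "oai:example.org:item-42", "metadataPrefix": "oai_dc"},
)
print(url)
```

`urlencode` takes care of escaping characters such as the colons in the identifier.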
9
GetRecord
Required arguments
  identifier = unique identifier of the item whose record is requested
  metadataPrefix = prefix identifying the metadata format in which the record should be returned
    Example: oai_dc (the OAI version of the Dublin Core -- standard 15 elements, no extension)
Errors: badArgument, cannotDisseminateFormat, idDoesNotExist
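Errors come back inside the XML response as `error` elements carrying a `code` attribute, so a harvester has to inspect the body of the response. The sketch below turns such an element into a Python exception; the sample response, the `OAIError` class, and the `check_errors` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical response to a GetRecord request for an unknown identifier.
SAMPLE_ERROR = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-03-15T09:18:29Z</responseDate>
  <request verb="GetRecord" identifier="oai:example.org:missing"
           metadataPrefix="oai_dc">http://example.org/oai</request>
  <error code="idDoesNotExist">No item with that identifier</error>
</OAI-PMH>
"""


class OAIError(Exception):
    """Raised when a response carries an OAI-PMH error element."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_errors(response_xml):
    """Parse a response; raise OAIError if it reports an error."""
    root = ET.fromstring(response_xml.strip())
    error = root.find(OAI_NS + "error")
    if error is not None:
        raise OAIError(error.get("code"), error.text or "")
    return root


try:
    check_errors(SAMPLE_ERROR)
except OAIError as exc:
    print(exc.code)  # idDoesNotExist
```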
10
Identify
No arguments
Requests information about the repository
Response includes
  repositoryName
  baseURL
  protocolVersion
  earliestDatestamp
  deletedRecord (how the repository handles deletions: no, transient, or persistent)
  granularity (how finely can the datestamp be specified?)
  adminEmail
Optional
  compression (what schemes are supported)
  description
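The granularity value matters in practice because datestamps sent to the repository should not be finer than it supports. A minimal sketch, assuming the two granularity strings defined by OAI-PMH 2.0 (`YYYY-MM-DD` and `YYYY-MM-DDThh:mm:ssZ`); the helper name `format_datestamp` is hypothetical.

```python
from datetime import datetime, timezone


def format_datestamp(moment, granularity):
    """Format a timezone-aware datetime at the repository's granularity.

    `granularity` is the string reported in the Identify response.
    Datestamps are expressed in UTC, so the moment is converted first.
    """
    utc = moment.astimezone(timezone.utc)
    if granularity == "YYYY-MM-DD":
        return utc.strftime("%Y-%m-%d")
    if granularity == "YYYY-MM-DDThh:mm:ssZ":
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unknown granularity: {granularity}")


moment = datetime(2004, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
print(format_datestamp(moment, "YYYY-MM-DD"))            # 2004-03-15
print(format_datestamp(moment, "YYYY-MM-DDThh:mm:ssZ"))  # 2004-03-15T12:30:00Z
```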
11
ListIdentifiers
Required argument
  metadataPrefix
Optional arguments
  from
  until
  set
Exclusive argument
  resumptionToken (flow control token for resuming a previous incomplete ListIdentifiers request)
Errors: badArgument, badResumptionToken, cannotDisseminateFormat, noRecordsMatch, noSetHierarchy
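"Exclusive" means that resumptionToken, when present, must be the only argument besides the verb. A repository-side check for this rule might look like the following sketch; the function name `validate_list_arguments` is an assumption, and a real repository would also need to validate the argument values.

```python
ALLOWED = {"metadataPrefix", "from", "until", "set", "resumptionToken"}


def validate_list_arguments(arguments):
    """Return None if the argument names are acceptable, else "badArgument".

    `arguments` holds every request argument except `verb`.
    """
    if set(arguments) - ALLOWED:
        return "badArgument"  # unknown argument name
    if "resumptionToken" in arguments:
        # Exclusive: no other argument may accompany the token.
        return None if len(arguments) == 1 else "badArgument"
    if "metadataPrefix" not in arguments:
        return "badArgument"  # required argument missing
    return None


print(validate_list_arguments({"metadataPrefix": "oai_dc", "set": "physics"}))
# None
print(validate_list_arguments({"resumptionToken": "abc", "set": "physics"}))
# badArgument
```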
12
ListMetadataFormats
Optional argument
  identifier (if the metadata formats are needed only for a particular item)
Errors: badArgument, idDoesNotExist, noMetadataFormats
Response includes both the metadataPrefix and the associated schema
13
ListRecords
Required argument
  metadataPrefix -- only records for which the specified metadataPrefix applies should be returned
Optional arguments
  from
  until
  set
Exclusive argument
  resumptionToken
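A large ListRecords result arrives in several partial responses, and the harvester follows the resumptionToken until the repository returns an empty one. The loop can be sketched as below. To keep the example runnable without a network, the HTTP call is replaced by a caller-supplied `fetch` function; `harvest`, `fake_fetch`, `_page`, and the sample identifiers are all hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadata_prefix):
    """Yield the identifier of every record, following resumption tokens.

    `fetch` takes a dict of request arguments and returns the response
    XML as a string; in real use it would issue the HTTP request.
    """
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(arguments))
        for record in root.iter(OAI_NS + "record"):
            yield record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        token = root.findtext(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
        if not token:  # missing or empty token: the list is complete
            return
        # The token is exclusive, so it replaces all other arguments.
        arguments = {"verb": "ListRecords", "resumptionToken": token}


def _page(identifier, token):
    """Build a tiny stand-in response holding one record."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


# Two canned pages keyed by the token that requests them.
PAGES = {
    None: _page("oai:example.org:1", "page2"),
    "page2": _page("oai:example.org:2", ""),
}


def fake_fetch(arguments):
    return PAGES[arguments.get("resumptionToken")]


identifiers = list(harvest(fake_fetch, "oai_dc"))
print(identifiers)  # ['oai:example.org:1', 'oai:example.org:2']
```

The same loop shape applies to ListIdentifiers and ListSets, which use resumptionToken in the same way.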
14
ListSets
Exclusive argument
  resumptionToken (used to continue a previous incomplete response to ListSets)
Errors: badArgument, badResumptionToken, noSetHierarchy
15
Resources
Compliance testing: www.dlib.vt.edu/projects/OAI/repexp/repexp.html
OAI-PMH: www.openarchives.org/OAI/openarchivesprotocol.html
Implementation guidelines: www.openarchives.org/OAI/2.0/guidelines.htm