Published by Kathleen Tyler. Modified over 8 years ago.
1
herbert van de sompel & carl lagoze
the OAI Protocol for Metadata Harvesting: an update
Herbert Van de Sompel, Los Alamos National Laboratory, Research Library
Carl Lagoze, Cornell University, Computer Science
2
origins & evolution of OAI-PMH
process leading to OAI-PMH v.2.0
what's new in OAI-PMH v.2.0?
what's next?
3
evolution towards OAI-PMH v.2.0
  Santa Fe Convention [02/2000]
  OAI-PMH 1.0 [01/2001]
  OAI-PMH 2.0 [06/2002]
4
             Santa Fe convention   OAI-PMH v.1.0/1.1         OAI-PMH v.2.0
nature       experimental          experimental              stable
verbs        Dienst                OAI-PMH                   OAI-PMH
requests     HTTP GET/POST         HTTP GET/POST             HTTP GET/POST
responses    XML                   XML                       XML
transport    HTTP                  HTTP                      HTTP
metadata     OAMS                  unqualified Dublin Core   unqualified Dublin Core
about        eprints               document-like objects     resources
model        metadata harvesting   metadata harvesting       metadata harvesting
5
Santa Fe Convention [02/2000]
  goal: optimize discovery of e-prints
  input:
    the UPS prototype
    RePEc data provider / service provider model
    Dienst protocol
    deliberations at Santa Fe meeting [10/99]
6
Santa Fe Convention [02/2000]
  low-barrier interoperability specification
  metadata harvesting model: data provider / service provider
  focus on eprints (e.g. OAMS format)
  Dienst subset
  HTTP based
  XML responses
  experimental
7
OAI-PMH v.1.0 [01/2001]
  goal: optimize discovery of document-like objects
  input:
    SFC
    DLF meetings on metadata harvesting
    deliberations at Cornell meeting [09/00]
    alpha test group of OAI-PMH v.1.0
8
OAI-PMH v.1.0 [01/2001]
  low-barrier interoperability specification
  metadata harvesting model: data provider / service provider
  focus on document-like objects
  autonomous protocol
  HTTP based
  XML responses
  unqualified Dublin Core
  experimental: 12-18 months
9
OAI-PMH v.2.0 [06/2002]
  goal: recurrent exchange of metadata about resources between systems
  input:
    OAI-PMH v.1.0
    feedback on OAI-implementers
    deliberations by OAI-tech [09/01 -]
    alpha test group of OAI-PMH v.2.0 [03/02 -]
10
OAI-PMH v.2.0 [06/2002]
  low-barrier interoperability specification
  metadata harvesting model: data provider / service provider
  metadata about resources
  autonomous protocol
  HTTP based
  XML responses
  unqualified Dublin Core
  stable
11
process leading to OAI-PMH v.2.0
  creation of OAI-tech
  pre-alpha phase
  alpha phase
  beta phase
12
creation of OAI-tech [06/01]
  created for 1 year period
  charge:
    review functionality and nature of OAI-PMH v.1.0
    investigate extensions
    release stable version of OAI-PMH by 05/02
    determine need for infrastructure to support broad adoption of the protocol
  communication: listserv, SourceForge, conference calls
13
OAI-tech
  US representatives: Thomas Krichel (Long Island U), Jeff Young (OCLC), Tim Cole (U of Illinois at Urbana-Champaign), Hussein Suleman (Virginia Tech), Simeon Warner (Cornell U), Michael Nelson (NASA), Caroline Arms (LoC), Muhammad Zubair (Old Dominion U), Steven Bird (U Penn.)
  European representatives: Andy Powell (Bath U. & UKOLN), Mogens Sandfaer (DTV), Thomas Baron (CERN), Les Carr (U of Southampton)
14
pre-alpha phase [09/01 - 02/02]
  review process by OAI-tech:
    identification of issues
    conference call to filter/combine issues
    white paper per issue
    on-line discussion per white paper
    proposal for resolution of issue by OAI-exec
    discussion of proposal & closure of issue
    conference call to resolve open issues
15
pre-alpha phase [02/02]
  creation of revised protocol document
  in-person meeting: Lagoze, Van de Sompel, Nelson, Warner
  autonomous decisions
  internal vetting of protocol document
16
alpha phase [02/02 - 05/02]
  alpha-1 release to OAI-tech on March 1st 2002
  OAI-tech extended with alpha testers
  discussions/implementations by OAI-tech
  ongoing revision of protocol document
17
OAI-PMH 2.0 alpha testers (1/2)
  The British Library
  Cornell U. -- NSDL project & e-print arXiv
  Ex Libris
  FS Consulting Inc -- harvester for my.OAI
  Humboldt-Universität zu Berlin
  InQuirion Pty Ltd, RMIT University
  Library of Congress
  NASA
  OCLC
18
OAI-PMH 2.0 alpha testers (2/2)
  Old Dominion U. -- ARC, DP9
  U. of Illinois at Urbana-Champaign
  U. of Southampton -- OAIA, CiteBase, eprints.org
  UCLA, Johns Hopkins U., Indiana U., NYU -- sheet music collection
  UKOLN, U. of Bath -- RDN
  Virginia Tech -- repository explorer
19
beta phase [05/02]
  beta release on May 1st 2002 to:
    registered data providers and service providers
    interested parties
  fine tuning of protocol document
  preparation for the release of 2.0 conformant tools by alpha testers
20
what's new in OAI-PMH v.2.0?
  quick recap
  general changes to improve solidity of protocol
  corrections
  new functionality
21
[diagram: the harvester (service provider) sends requests to the repository (data provider), which sends back replies; label "6 OAI-PMH"]
22
Supporting protocol requests: Identify, ListMetadataFormats, ListSets
Harvesting protocol requests: ListRecords, ListIdentifiers, GetRecord
[diagram: harvester (service provider) and repository (data provider)]
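A minimal sketch of how the six requests could be turned into HTTP GET URLs. The function name `build_request_url` and the `http://example.org/oai` base URL are illustrative assumptions, not part of the slides; only the verb names come from the slide above.

```python
from urllib.parse import urlencode

# The six verbs, grouped as on the slide.
SUPPORTING_VERBS = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING_VERBS = ("ListRecords", "ListIdentifiers", "GetRecord")


def build_request_url(base_url, verb, **arguments):
    """Return the HTTP GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in SUPPORTING_VERBS + HARVESTING_VERBS:
        raise ValueError(f"{verb!r} is not one of the six OAI-PMH verbs")
    # urlencode percent-escapes reserved characters in the argument values.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Example: ask an assumed repository to describe itself.
print(build_request_url("http://example.org/oai", "Identify"))
```

The helper rejects unknown verbs on the client side; a repository would report the same problem with a badVerb error.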
23
[diagram: harvester (service provider) and repository (data provider); labels: Records, Datestamp, Identifier, Set]
24
general changes
  clear distinction between protocol and periphery
  fixed protocol document
  extensible implementation guidelines: e.g. sample metadata formats, description containers, about containers
  allows for OAI guidelines and community guidelines
25
general changes
  clear separation of OAI-PMH and HTTP
  OAI-PMH error handling:
    all OK at HTTP level? => 200 OK
    something wrong at OAI-PMH level? => OAI-PMH error (e.g. badVerb)
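The two levels of failure can be sketched as follows, assuming the v.2.0 response is an XML document in the `http://www.openarchives.org/OAI/2.0/` namespace with an `error` element carrying a `code` attribute. The names `check_response` and `OAIError` are hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


class OAIError(Exception):
    """Raised when HTTP succeeded but the repository reported a protocol error."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_response(http_status, body):
    """Return the parsed response root, or raise for either kind of failure."""
    # Level 1: the HTTP transport. Anything but 200 is not an OAI-PMH matter.
    if http_status != 200:
        raise RuntimeError(f"HTTP-level failure: status {http_status}")
    # Level 2: the protocol. A 200 response may still carry an error element.
    root = ET.fromstring(body)
    error = root.find(f"{OAI_NS}error")
    if error is not None:
        raise OAIError(error.get("code"), (error.text or "").strip())
    return root
```

A harvester built this way never has to map HTTP status codes onto protocol conditions such as badVerb.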
26
general changes
  notion of item has become prominent: resource / item / record
  metadata can be disseminated from item
  item == identifier
  record == identifier, datestamp, metadataPrefix
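One way to picture the item/record distinction is a small data structure, sketched here under the assumption that a record is fully identified by the three values named on the slide. The class name `Record` and the helper `records_of_item` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str       # names the item; shared by all of its records
    datestamp: str        # date of the last change to this record
    metadata_prefix: str  # the metadata format this record is expressed in


def records_of_item(records, identifier):
    """Return every record disseminated from the item with this identifier."""
    return [r for r in records if r.identifier == identifier]


# One item, disseminated in two metadata formats, yields two distinct records.
dc = Record("oai:arXiv:cs/0112017", "2001-12-14", "oai_dc")
olac = Record("oai:arXiv:cs/0112017", "2001-12-14", "olac")
```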
27
general changes
  better definitions of harvester, repository, item, unique identifier, record, datestamp, set
  oai_dc schema builds on DCMI XML Schema for unqualified Dublin Core
  usage of must, must not etc. as in RFC2119
  wording on response compression
28
general changes
  all protocol responses can be validated with a single XML Schema
    easier for data providers
    no redundancy in type definitions
    SOAP-ready
    clean for error handling
29
response, no errors
  responseDate: 2002-02-08T08:55:46Z
  request: http://arXiv.org/oai2
  header:
    identifier: oai:arXiv:cs/0112017
    datestamp: 2001-12-14
    setSpec: cs
    setSpec: math
  …..
30
response with error
  responseDate: 2002-02-08T08:55:46Z
  request: http://arXiv.org/oai2
  error: ShowMe is not a valid OAI-PMH verb
31
corrections
  all dates/times are UTC, encoded in ISO8601, Z-notation
  1957-03-20T20:30:00.00Z
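A minimal sketch of producing such a UTC datestamp from a local time. The function name `to_oai_datestamp` is an assumption, and the sketch emits whole seconds (the `YYYY-MM-DDThh:mm:ssZ` form) rather than the fractional seconds shown in the slide's example.

```python
from datetime import datetime, timedelta, timezone


def to_oai_datestamp(moment, day_granularity=False):
    """Convert a timezone-aware datetime to a UTC, Z-notation datestamp."""
    if moment.tzinfo is None:
        # A naive datetime cannot be converted reliably, so refuse it.
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    pattern = "%Y-%m-%d" if day_granularity else "%Y-%m-%dT%H:%M:%SZ"
    return utc.strftime(pattern)


# 15:30 at UTC-5 is 20:30 UTC.
local = datetime(1957, 3, 20, 15, 30, tzinfo=timezone(timedelta(hours=-5)))
print(to_oai_datestamp(local))  # prints 1957-03-20T20:30:00Z
```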
32
corrections
  idempotency of resumptionToken (rT): return the same incomplete list when the rT is reissued
    while no changes occur in the repo: strict
    while changes occur in the repo: all items with unchanged datestamp
  expirationDate attribute for rT
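The harvester side of flow control with a resumptionToken can be sketched as a loop. `fetch_page` is a hypothetical callable, supplied by the caller, that performs one request and returns the parsed headers plus the token of the next incomplete list (empty when the list is complete).

```python
def harvest_headers(fetch_page, metadata_prefix):
    """Yield every header of a ListIdentifiers harvest, following tokens."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        headers, token = fetch_page(params)
        yield from headers
        if not token:
            return  # an empty token marks the end of the complete list
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
```

Because reissuing a token returns the same incomplete list, a harvester that loses a response could call `fetch_page` again with the same `params` instead of restarting the whole harvest, as long as the token has not expired.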
33
new functionality
  harvesting granularity
    mandatory support of YYYY-MM-DD
    optional support of YYYY-MM-DDThh:mm:ssZ
    granularity of from and until must be the same
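A sketch of the granularity rules as a client-side check, run before a request is sent. The function names and the choice of `ValueError` are assumptions; the patterns only check the shape of the value, not whether it is a real calendar date.

```python
import re

DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"
_PATTERNS = {
    DAY: re.compile(r"\d{4}-\d{2}-\d{2}"),
    SECONDS: re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"),
}


def granularity_of(value):
    """Return which of the two granularities a from/until value uses."""
    for name, pattern in _PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    raise ValueError(f"{value!r} matches neither supported granularity")


def check_date_range(from_, until, repository_granularity=DAY):
    """Raise ValueError if from/until break the granularity rules."""
    used = {granularity_of(v) for v in (from_, until) if v is not None}
    if len(used) > 1:
        raise ValueError("from and until must use the same granularity")
    # Seconds granularity is optional, so the repository must advertise it.
    if SECONDS in used and repository_granularity != SECONDS:
        raise ValueError("repository only supports day granularity")
```

The `repository_granularity` value would come from the repository's Identify response.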
34
new functionality
  Identify more expressive
    repositoryName: Library of Congress 1
    baseURL: http://memory.loc.gov/cgi-bin/oai
    protocolVersion: 2.0
    adminEmail: dwoo@loc.gov
    adminEmail: caar@loc.gov
    deletedRecord: transient
    earliestDatestamp: 1990-02-01T00:00:00Z
    granularity: YYYY-MM-DDThh:mm:ssZ
    compression: deflate
35
new functionality
  header contains set membership of item
    identifier: oai:arXiv:cs/0112017
    datestamp: 2001-12-14
    setSpec: cs
    setSpec: math
    …..
36
new functionality
  ListIdentifiers returns headers
    responseDate: 2002-02-08T08:55:46Z
    request: http://arXiv.org/oai2
    header:
      identifier: oai:arXiv:hep-th/9801001
      datestamp: 1999-02-23
      setSpec: physic:hep
    header:
      identifier: oai:arXiv:hep-th/9801002
      datestamp: 1999-03-20
      setSpec: physic:hep
      setSpec: physic:exp
    ……
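A sketch of reading identifier, datestamp and set membership out of a ListIdentifiers response. The XML in `SAMPLE` is a hand-written reconstruction that assumes the v.2.0 element names (`header`, `identifier`, `datestamp`, `setSpec`); only the values are taken from the slide, and `parse_headers` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Assumed shape of the response; values copied from the slide.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="ListIdentifiers">http://arXiv.org/oai2</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:arXiv:hep-th/9801001</identifier>
      <datestamp>1999-02-23</datestamp>
      <setSpec>physic:hep</setSpec>
    </header>
    <header>
      <identifier>oai:arXiv:hep-th/9801002</identifier>
      <datestamp>1999-03-20</datestamp>
      <setSpec>physic:hep</setSpec>
      <setSpec>physic:exp</setSpec>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def parse_headers(xml_text):
    """Return one dict per header, including the sets the item belongs to."""
    root = ET.fromstring(xml_text)
    return [
        {
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "sets": [s.text for s in header.findall("oai:setSpec", NS)],
        }
        for header in root.iterfind(".//oai:header", NS)
    ]


for entry in parse_headers(SAMPLE):
    print(entry["identifier"], entry["sets"])
```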
37
new functionality
  ListIdentifiers mandates metadataPrefix as argument
    http://www.perseus.tufts.edu/cgi-bin/pdataprov?
      verb=ListIdentifiers
      &metadataPrefix=olac
      &from=2001-01-01
      &until=2001-01-01
      &set=Perseus:collection:PersInfo
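The argument rule can be sketched as a small validator. It assumes the v.2.0 argument names shown in the URL above plus `resumptionToken` as an exclusive alternative; the function name `list_identifiers_arguments` is illustrative.

```python
def list_identifiers_arguments(metadata_prefix=None, from_=None, until=None,
                               set_spec=None, resumption_token=None):
    """Return the query arguments for one ListIdentifiers request."""
    if resumption_token is not None:
        # A token continues an earlier request and excludes everything else.
        if any(v is not None for v in (metadata_prefix, from_, until, set_spec)):
            raise ValueError("resumptionToken cannot be combined with other arguments")
        return {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    if metadata_prefix is None:
        raise ValueError("metadataPrefix is required for ListIdentifiers in v.2.0")
    arguments = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    optional = {"from": from_, "until": until, "set": set_spec}
    # Only send the optional arguments that were actually given.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


print(list_identifiers_arguments("olac", from_="2001-01-01", until="2001-01-01",
                                 set_spec="Perseus:collection:PersInfo"))
```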
38
new functionality
  character set for metadataPrefix and setSpec extended to URL-safe characters
    A-Z a-z 0-9 _ ! ' $ ( ) + - . *
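A sketch of checking values against this character list. The list is copied from the slide and may differ in detail from the published v.2.0 schema, so it should be verified before being relied on. Treating the colon as a separator between setSpec levels (as in `physic:hep`) is an assumption, and both function names are hypothetical.

```python
import re

# Character class copied from the slide: A-Z a-z 0-9 _ ! ' $ ( ) + - . *
_TOKEN = re.compile(r"[A-Za-z0-9_!'$()+\-.*]+")


def is_valid_metadata_prefix(value):
    """True if the whole value uses only the characters listed on the slide."""
    return _TOKEN.fullmatch(value) is not None


def is_valid_set_spec(value):
    """True if every colon-separated level is a non-empty valid token."""
    return all(_TOKEN.fullmatch(level) for level in value.split(":"))


print(is_valid_metadata_prefix("oai_dc"), is_valid_set_spec("physic:hep"))
```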
39
in the periphery
  introduction of provenance container to facilitate tracing of harvesting history
    example values: http://an.oa.org, oai:r1:plog/9801001, 2001-08-13T13:00:02Z, oai_dc, 2001-08-15T12:01:30Z … … …
40
in the periphery
  introduction of friends container to facilitate discovery of repositories
    http://cav2001.library.caltech.edu/perl/oai
    http://formations2.ulst.ac.uk/perl/oai
    http://cogprints.soton.ac.uk/perl/oai
    http://wave.ldc.upenn.edu/OLAC/dp/aps.php4
41
in the periphery
  revision of oai-identifier
  guidelines for collection-level and set-level metadata
42
future
  OAI-PMH
  communities
  adoption
43
the OAI-PMH
  release of OAI-PMH v.2.0 [06/2002]
    no backwards compatibility with v.1.0/1.1
    stable
    migration process for registered repos ?
  formal standardization ? ?
  SOAP version ~ web services framework [SOAP, WSDL, UDDI] ?
44
communities
  proliferation of community-specific add-ons for:
    collection & set level metadata
    expressive metadata formats (e.g. qualified DC XML Schema)
    shared set-structures
    machine readable rights (about the metadata)
45
adoption
  evolution
    from talking about OAI-PMH
    to talking about projects that use OAI-PMH
    to talking about projects and failing to mention they use OAI-PMH
  => OAI-PMH becomes part of the infrastructure
46
"I just wanted to report what I consider an OAI success. I discovered that RLG had harvested records for two of the American Memory collections I had made available and integrated them into their Cultural Materials Initiative service without the need for a single e-mail or phone call. They reported that it was working very well for them." [Caroline Arms, Library of Congress]
47
http://www.openarchives.org
openarchives@openarchives.org
48
indicators of adoption of OAI-PMH
  data providers
  service providers
  tools
  structural support
49
data providers
  49 registered repositories [11/2001]
  65 registered repositories [03/2002]
  5+ million records
  many unregistered repositories
50
service providers
  Arc: cross-searching of registered repositories [Old Dominion U] http://arc.cs.odu.edu
  OLAC: cross-searching of Language Archive Community repositories http://www.language-archives.org/index.html
51
service providers
  Scirus: scientific search engine [Elsevier] http://www.scirus.com
  my.OAI: user-tailorable cross-searching of registered repositories [FS Consulting, Inc.] http://www.myoai.com
  growing interest from web search engines
52
OAI-PMH tools
  Repository Explorer: interactive exploration of repositories [Virginia Tech] http://www.purl.org/NET/oai_explorer
  eprints.org: generic OAI-PMH compliant repository software [U of Southampton] http://www.eprints.org
  ALCME: repository and harvester software [OCLC] http://alcme.oclc.org/index.html
53
exploration
  Kepler [Old Dominion U]
    your personal OAI data provider: Kepler archivelet
    the Kepler service provider harvests from archivelets that register
    archivelet downloadable
    http://www.dlib.org/dlib/april01/maly/04maly.html
54
exploration
  DP9 [Old Dominion U]
    provides entry page to repositories for web-crawlers
    provides bookmarkable URL for OAI record
    provides resolution of OAI identifier into metadata
    software downloadable
55
structural support
  CNI & DLF support the day-to-day operation of the OAI Executive
56
structural support
  Metadata Harvesting Initiative of the Mellon Foundation
  NSF funded NSDL
  UK FAIR call for proposals to support disclosure of institutional assets (papers, learning materials, etc.)
  several EC projects exploring/supporting usage of OAI-PMH: TEL, Leaf, Cyclades, OA Forum, Figaro