Published by Cecil Brown. Modified over 8 years ago.
1
The OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting). MetaScholar Initiative All-Project Meeting, Atlanta, GA, 6/18/2002. Edward A. Fox, fox@vt.edu, CS DLRL, Virginia Tech, Blacksburg, VA, USA
2
Acknowledgements. Sponsors: Mellon Foundation, SOLINET, NSF, DLF, CNI, UK’s JISC, Virginia’s CIT, … OAI Team: Steering Committee, Technical Committee, Developers, Data Providers, Service Providers. Emory Team, Partners around Southeast. VT Colleagues: Hussein Suleman, Rohit Kelapure, Ming Luo, Ryan Richardson, Marcos Goncalves, Priya Shivakumar, Baoping Zhang, students working on term projects, …
3
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
4
Open Archives Initiative OAI www.openarchives.org openarchives@openarchives.org
5
Open Archives Initiative (OAI): xxx@LANL, high-energy physics (Ginsparg, 1991). CSTR + WATERS = NCSTRL (Lagoze, 1994). xxx + NCSTRL = CoRR collaboration (1998). Universal Preprint Service protoproto, Oct. 21-22, 1999, Santa Fe, led by LANL, CNI, DLF, Mellon --> OAI. Santa Fe Convention (see Feb 2000 D-Lib Magazine article): Archives -> Open Archives. Support unique archive identifiers. Implement metadata set(s) (DC, using XML). Implement OA harvesting protocol. Register the archive. Build tools, layer other services: linking, searching, …
6
OAI Philosophy: Self-archiving = submission mechanism. Long-term storage system = archive. Open interface = harvesting mechanism. Data provider + service provider. Start with “gray literature”: e-prints/pre-prints, reports, dissertations, …
7
Began as “archives of the world unite!” OAI
8
Open Archives (protoproto): ArXiv & Los Alamos National Lab; CogPrints & U. Southampton; NACA & NASA (reports); NCSTRL & Cornell U.; NDLTD & Virginia Tech; RePEc & U. Surrey. Total of around 200K records.
9
Original Open Archives Members: American Physical Society; California Digital Library; Caltech; Coalition for Networked Information; Cornell University; Harvard University; Library of Congress; Los Alamos National Lab; Mellon Foundation; NASA Langley Research Center; Old Dominion University; Stanford University; U. of Ghent; U. of Surrey; U. of Southampton; Vanderbilt University; Virginia Tech; Washington University
10
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
11
OAI is now a technical umbrella for practical interoperability that can be exploited by different communities: Reference Libraries, Publishers, E-Print Archives, Museums
12
The World According to OAI (diagram): Service Providers (Discovery, Current Awareness, Preservation); Metadata harvesting; Data Providers
13
Aggregation through OAI Harvesting – Black Box Perspective (diagram: open archives OA 1 through OA 7)
14
Aggregation through OAI Harvesting – By Organization (diagram labels: Theology, Emory, GAUGAU FLUTK, AmSo, Library)
15
Aggregation through OAI Harvesting – By Topic (diagram labels: Confederate Constitution Civil War, History, Oral, Sports, Culture, AmSo, Diaries)
16
Approaches to Aggregation: Build By Discipline; Build By Institution
17
Types of Access Possible: Build By Discipline; Build By Institution; Year; Category; Personage; Author; Genre; Query; …
18
OAI Repository. Required: Protocol. (Diagram labels: DO, MDO)
19
Metadata vs. Data: Data refers to digital objects or digital representations of objects. Metadata is information about the objects (e.g., title, author). OAI focuses on metadata, with the implicit understanding that metadata usually contains useful links to the source digital objects.
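The point that metadata usually carries a link back to the digital object can be illustrated with a small sketch. It assumes a simple Dublin Core record in XML; the sample record, its URL, and the function name `dc_fields` are made up for illustration and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, in ElementTree's "{uri}" form.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical record: the metadata describes an object and links to it.
SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Technical Report</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:identifier>http://example.org/reports/tr-1.pdf</dc:identifier>
</metadata>"""


def dc_fields(xml_text):
    """Collect Dublin Core elements into a dict of name -> list of values."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(DC):
            name = element.tag[len(DC):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


record = dc_fields(SAMPLE)
print(record["title"][0], "->", record["identifier"][0])
```

Here the `identifier` field is the link from the metadata (what OAI harvests) to the data (the report itself, which stays at the source).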
20
Metadata: Complex to Simple: MARC (>$50); Dublin Core (DC)
21
(Diagram: repositories, OAI protocol, harvesters; labels: support data harvesting, data items)
22
identifiers: oai-identifier = oai:archive-identifier:record-identifier. “oai” is a registered URI scheme. The archive identifier is registered within OAI. The record identifier is a unique ID within the archive (syntax is archive-specific). Example: oai:ncstrl:ncstrl.cornellcs/TR94-1418. The identifier is a locally unique key for extracting a record from a repository.
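The three-part identifier layout on this slide can be sketched as a small parser. This is a minimal sketch that assumes only what the slide states (scheme, archive identifier, record identifier, separated by colons); the names `OAIIdentifier` and `parse_oai_identifier` are hypothetical.

```python
from typing import NamedTuple


class OAIIdentifier(NamedTuple):
    scheme: str   # always "oai"
    archive: str  # archive identifier, registered within OAI
    record: str   # archive-specific key, unique within the archive


def parse_oai_identifier(identifier: str) -> OAIIdentifier:
    """Split oai:archive-identifier:record-identifier into its parts.

    Only the first two colons are treated as separators, because the
    record identifier syntax is archive-specific and may contain more.
    """
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai-identifier: {identifier!r}")
    return OAIIdentifier(*parts)


parsed = parse_oai_identifier("oai:ncstrl:ncstrl.cornellcs/TR94-1418")
print(parsed.archive, parsed.record)
```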
23
selective harvesting - datestamps (diagram: a harvest from a repository restricted to the records within a date range)
24
selective harvesting - sets (diagram: a harvest from a repository restricted to the records within a set; sets S1, S2)
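Both kinds of selective harvesting come down to extra arguments on a harvesting request. The sketch below builds a ListRecords request URL using the protocol's `from`, `until`, `set`, and `metadataPrefix` arguments. The base URL, the set name `S1`, and the helper name `list_records_url` are placeholders for illustration, not a real repository or library API.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build a ListRecords request limited by datestamp range and/or set."""
    params = [("verb", "ListRecords"), ("metadataPrefix", metadata_prefix)]
    if from_date:
        params.append(("from", from_date))    # lower bound on datestamp
    if until_date:
        params.append(("until", until_date))  # upper bound on datestamp
    if set_spec:
        params.append(("set", set_spec))      # restrict to one set
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL; harvest set S1 for the first half of 2002.
print(list_records_url("http://example.org/oai",
                       from_date="2002-01-01",
                       until_date="2002-06-18",
                       set_spec="S1"))
```

Omitting all three optional arguments gives a full harvest; a harvester that remembers the date of its last run can pass it as `from_date` to pick up only new or changed records.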
25
Summary: Protocol for Metadata Harvesting. Service requests: Identify, ListMetadataFormats, ListSets, GetRecord, ListIdentifiers, ListRecords. Also: metadata multiplicity, date (and time) ranges, resumption tokens.
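Resumption tokens let a repository return a long ListRecords result in pieces. A harvester loop might look like the sketch below. To keep it runnable without a network, the HTTP call is abstracted as a `fetch` function supplied by the caller; `fetch`, `harvest_pages`, and the tiny fake responses are assumptions for illustration, and element names are matched without regard to XML namespace.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def harvest_pages(
    fetch: Callable[[Dict[str, str]], str],
    metadata_prefix: str = "oai_dc",
) -> Iterator[ET.Element]:
    """Yield each ListRecords response, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        yield root
        token = next(
            (el.text for el in root.iter()
             if local_name(el.tag) == "resumptionToken"),
            None,
        )
        if not token or not token.strip():
            return  # no token, or an empty one: the list is complete
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}


# Fake two-page repository standing in for real HTTP responses.
FAKE_PAGES = {
    None: "<r><record/><resumptionToken>t1</resumptionToken></r>",
    "t1": "<r><record/><resumptionToken/></r>",
}


def fake_fetch(params: Dict[str, str]) -> str:
    return FAKE_PAGES[params.get("resumptionToken")]


print(sum(1 for _ in harvest_pages(fake_fetch)), "pages harvested")
```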
26
Harvesting vs. Federation: Competing approaches to interoperability. Federation is when services are run remotely on remote data (e.g., federated searching). Harvesting is when data/metadata is transferred from the remote source to the destination where the services are located (e.g., union catalogues). Federation requires more effort at each remote source but is easier for the local system; the reverse holds for harvesting. OAI (currently) focuses on harvesting.
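The contrast can be made concrete with a toy model in which each archive is just a dictionary of record id to title. This is only a sketch of where the work happens in each approach; the archives, record ids, and function names are invented for illustration.

```python
from typing import Dict, List

Archive = Dict[str, str]  # record id -> title


def matches(title: str, term: str) -> bool:
    return term.lower() in title.lower()


def federated_search(archives: List[Archive], term: str) -> List[str]:
    """Federation: every query is sent to every remote archive."""
    hits: List[str] = []
    for archive in archives:  # each remote source does search work per query
        hits.extend(rid for rid, title in archive.items() if matches(title, term))
    return sorted(hits)


def harvest(archives: List[Archive]) -> Archive:
    """Harvesting: copy the metadata once into a local union collection."""
    union: Archive = {}
    for archive in archives:  # remote sources only hand over their records
        union.update(archive)
    return union


def local_search(union: Archive, term: str) -> List[str]:
    """After harvesting, the service runs entirely on the local copy."""
    return sorted(rid for rid, title in union.items() if matches(title, term))


archives = [
    {"a:1": "Civil War Diaries", "a:2": "Oral History of Sports"},
    {"b:1": "Diaries of a Librarian"},
]
union = harvest(archives)
print(federated_search(archives, "diaries"))
print(local_search(union, "diaries"))
```

Both calls return the same hits; the difference is that the federated version touches every remote archive on every query, while the harvested version touches them only when the union collection is refreshed.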
27
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
28
Example 1: Union Collection of ETDs (Electronic Theses and Dissertations, for Networked Digital Library of Theses and Dissertations, NDLTD)
29
Example 1: Details
30
Example 2: NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup). Diagram labels: referenced items & collections; Special Databases; NSDL Services; Other NSDL Services; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols); Portals & Clients; Usage Enhancement; Collection Building; User Interfaces; NSDL Collections; Core NSDL “Bus”
31
Example 2: CITIDEL -> NSDL. Computing and Information Technology Interactive Digital Education Library: a collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library (NSDL). www.nsdl.nsf.gov www.nsdl.org
32
Example 2: CITIDEL Distributed repository structure
33
Example 2: NSDL Collections (themes relevant to our projects): Discovery of content. Classification and cataloguing. Acquisition and/or linking; referencing. Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged. Software tool suites for analysis, modeling, simulation, or visualization. Reviewed commentary on pedagogy.
34
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
35
Open Digital Libraries, XOAI-PMH: Dissertation work of Hussein Suleman (member of OAI technical committee). Extending the OAI protocol. Supporting rapid development of DLs using networks of components. Demonstrated with NDLTD, CSTC. Described in Dec. 2001 D-Lib Magazine article, and article scheduled for publication.
36
Open Digital Libraries Components. Running now: XML-File (data provider from file system); Union, search, browse, recent, filter; E-journal support system. Class projects: High performance multilingual search; Recommender; User rating. Others discussed: Classification/categorization and browsing.
37
Component System Approach: (Open) DL = Network of Extended OAs. Diagram components: Local Archive, Data Input, Remote Archive, Browse, Metadata Repository, Search, Recommend, Resource Discovery, User Interface. Legend: OAI/ODL archive, OAI/ODL protocol.
38
Example Architecture (NDLTD). Diagram components: Humboldt, Duisburg, MIT, Filter, MIT, Browse, Union Catalog, Search, Recent, User Interface, Virginia Tech, PhysNet, CalTech, Dresden. Legend: OAI/ODL archive, OAI/ODL protocol.
39
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
40
OAI Tools: Related resources, e.g., XML, Unicode. Submission / author support. XML Schema Validator. Servers and utilities, e.g., ARC, Kepler, EPrints. Repository Explorer: interactive browsing; testing of parameters; multiple views of data; multilingual support; automatic test suite.
41
Author’s tools: www.physik.uni-oldenburg.de/EPS/mmm
42
XSV Schema Validator
43
ARC (arc.cs.odu.edu)
46
VT Tool: Repository Explorer. The Repository Explorer, by Hussein Suleman, is a tool for browsing and testing Open Archives. You issue commands and see the results. You can also perform a sequence of automatic tests. http://purl.org/net/oai_explorer
47
VT Tool: RE 1.3
48
VT Tool: Request, Response
49
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
50
What will central service look like? (1 of 2) Harvesting from local sites. Rich content, drawn from all participating sites. Data management: logging and reporting; repository/preservation/mirroring; adding/updating/deleting. User interface and support for digital librarians and data providers.
51
What will central service look like? (2 of 2) Adding value: De-duping. Categorization/classification -> browsing. Normalization/standardization -> authority control. Tools for communication/collaboration/annotation -> security/privacy. User interface for both general users and scholars.
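De-duping matters at a central service because the same item can arrive from more than one harvested site. One simple approach, sketched below, treats two records as duplicates when their normalized title and creator match. This matching rule, the field names, and the sample records are assumptions chosen for illustration; the slides do not specify how de-duping would be done.

```python
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def dedupe(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record seen for each normalized (title, creator) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (normalize(record.get("title", "")),
               normalize(record.get("creator", "")))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


harvested = [
    {"id": "oai:siteA:1", "title": "Civil War Diaries", "creator": "Doe, Jane"},
    {"id": "oai:siteB:9", "title": "civil  war diaries", "creator": "Doe, Jane"},
    {"id": "oai:siteB:10", "title": "Oral Histories", "creator": "Roe, Sam"},
]
print([r["id"] for r in dedupe(harvested)])
```

A production service would likely need fuzzier matching and authority control for names, which is the normalization/standardization item on the same slide.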
52
What are needs at local sites? Increasing OAI expertise. Connecting OAI with local systems. Supporting standards, normalization. Supporting continual updating. Passing enhancements upstream.
53
How can VT help? (1 of 2) Usability studies for central site. Help develop consensus. Help plan system architecture & services. Education/training. Provide and support tools/systems. Help sites engage, become OAI compliant.
54
How can VT help? (2 of 2) Standards: MARC-XML. ODL Suite: download and configure; use in packaged forms, or re-architected. Support: connecting your system into OAI; help with OAI Tools.
55
MARC XML-DTD: XML transport format for US-MARC records. Standardized metadata exchange format for traditional library services joining OAI.
56
Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion
57
Rethink your efforts in terms of providers of Data, Services. Reduced work for data providers: tools available; don’t need to offer services. Reduced work for service providers: others provide the data; can use tools and systems for OAI, XOAI. Results: more data becoming available, to more people, supported by improved services. MetaScholar can be a win-win-win project!
58
Links: Open Archives Initiative (http://www.openarchives.org); OAI Metadata Harvesting Protocol (http://www.openarchives.org/OAI/openarchivesprotocol.htm); Virginia Tech DLRL OAI Projects (http://www.dlib.vt.edu/projects/OAI/ and http://oai.dlib.vt.edu/odl); Repository Explorer (http://purl.org/net/oai_explorer); NDLTD (http://www.ndltd.org)
59
More Links: ARC Cross-Archive Search Service (http://arc.cs.odu.edu/); XML Schema Validator (http://www.w3.org/2001/03/webdata/xsv); Dublin Core Metadata Initiative (http://www.dublincore.org); E-Prints DL-in-a-box (http://www.eprints.org); XML Tools at W3C (http://www.w3.org/XML/#software)