The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002 Edward A. Fox.

Presentation transcript:

The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002 Edward A. Fox CS DLRL Virginia Tech, Blacksburg, VA, USA

Acknowledgements. Sponsors: Mellon Foundation, SOLINET, NSF, DLF, CNI, UK’s JISC, Virginia’s CIT, … OAI Team: Steering Committee, Technical Committee, Developers, Data Providers, Service Providers. Emory Team, Partners around Southeast. VT Colleagues: Hussein Suleman, Rohit Kelapure, Ming Luo, Ryan Richardson, Marcos Goncalves, Priya Shivakumar, Baoping Zhang, students working on term projects, …

Contents: Early history; Key concepts; Examples; ODL, XOAI; OAI Tools; Technical Plan; Conclusion

Open Archives Initiative OAI

Open Archives Initiative (OAI). Background: high-energy physics (Ginsparg, 1991); CSTR + WATERS = NCSTRL (Lagoze, 1994); xxx + NCSTRL = CoRR collaboration (1998); Universal Preprint Service protoproto, Oct 1999, Santa Fe, led by LANL, CNI, DLF, Mellon --> OAI. Santa Fe Convention (see Feb 2000 D-Lib Magazine article), Archives -> Open Archives: support unique archive identifiers; implement metadata set(s) (DC, using XML); implement OA harvesting protocol; register the archive. Build tools, layer other services: linking, searching, …

OAI Philosophy: Self-archiving = submission mechanism; Long-term storage system = archive; Open interface = harvesting mechanism; Data provider + service provider. Start with “gray literature”: e-prints/pre-prints, reports, dissertations, …

Began as “archives of the world unite!” OAI

Open Archives (protoproto): ArXiv & Los Alamos National Lab; CogPrints & U. Southampton; NACA & NASA (reports); NCSTRL & Cornell U.; NDLTD & Virginia Tech; RePEc & U. Surrey. Total of around 200K records.

Original Open Archives Members: American Physical Society; California Digital Library; Caltech; Coalition for Networked Info.; Cornell University; Harvard University; Library of Congress; Los Alamos Nat’l Lab; Mellon Foundation; NASA Langley Research Cntr; Old Dominion University; Stanford University; U. of Ghent; U. of Surrey; U. of Southampton; Vanderbilt University; Virginia Tech; Washington University

Now is a Technical Umbrella for Practical Interoperability that can be exploited by different communities: Reference Libraries, Publishers, E-Print Archives, Museums

The World According to OAI: Data Providers -> Metadata harvesting -> Service Providers (Discovery, Current Awareness, Preservation)

Aggregation through OAI Harvesting – Black Box Perspective [Figure: OA 1, OA 2, OA 3, OA 4, OA 5, OA 6, OA 7]

Aggregation through OAI Harvesting – By Organization [Figure labels: Theology, Emory, GA, UGA, U FL, UTK, AmSo, Library]

Aggregation through OAI Harvesting – By Topic [Figure labels: Confederate Constitution, Civil War, History, Oral, Sports, Culture, AmSo, Diaries]

Approaches to Aggregation: Build By Discipline; Build By Institution

Types of Access Possible: Build By Discipline; Build By Institution; Year; Category; Personage; Author; Genre; Query; …

OAI Repository Required: Protocol DO MDO

Metadata vs. Data: Data refers to digital objects or digital representations of objects. Metadata is information about the objects (e.g., title, author). OAI focuses on metadata, with the implicit understanding that metadata usually contains useful links to the source digital objects.
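
The distinction can be illustrated with a minimal sketch. The record class, its field names, and the example URL below are hypothetical and chosen only for illustration (they loosely follow Dublin Core element names); the point is that the harvested record carries descriptive fields plus a link, while the digital object itself stays at the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataRecord:
    """Hypothetical metadata record: describes an object, does not contain it."""
    title: str
    creator: List[str]
    identifier: List[str] = field(default_factory=list)


def object_links(record: MetadataRecord) -> List[str]:
    """Return the identifiers that look like links to the source digital object."""
    prefixes = ("http://", "https://", "ftp://")
    return [i for i in record.identifier if i.startswith(prefixes)]


# Illustrative record; the URL is a placeholder, not a real resource.
record = MetadataRecord(
    title="An Example Thesis",
    creator=["Doe, Jane"],
    identifier=["etd-1234", "http://example.org/etd/1234.pdf"],
)
links = object_links(record)
```

A service provider working only from harvested metadata would present `title` and `creator` and send the user to one of `links` for the object itself.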

Metadata: Complex to Simple: MARC (>$50); Dublin Core (DC)

[Figure: repositories and harvesters connected by the OAI protocol, which supports harvesting of data items]

identifiers: oai-identifier = oai:archive-identifier:record-identifier. “oai”: registered URI scheme. Archive identifier: registered within OAI. Record identifier: unique ID within archive (syntax is archive-specific), a locally unique key for extracting a record from a repository. Example: oai:ncstrl:ncstrl.cornellcs/TR
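
As a rough sketch of the three-part syntax on this slide, the following splits an identifier into scheme, archive identifier, and record identifier. The function and type names are hypothetical, and the sample identifier is invented for illustration. The split stops after the second colon because the record part is archive-specific and may itself contain colons or slashes.

```python
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    scheme: str
    archive: str
    record: str


def parse_oai_identifier(identifier: str) -> OaiIdentifier:
    """Split 'oai:archive-identifier:record-identifier' into its three parts."""
    # Split on the first two colons only; the record part is archive-specific.
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai-identifier: {identifier!r}")
    return OaiIdentifier(*parts)


# Hypothetical identifier, used only to show the shape of the result.
parsed = parse_oai_identifier("oai:example-archive:reports/12345")
```

Here `parsed.archive` names the registered archive, and `parsed.record` is the key that archive would use to look up the record locally.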

selective harvesting - datestamps: harvest from a repository only the records within a date range
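
A date-range harvest comes down to adding range arguments to the request. The sketch below assumes the `from` and `until` arguments and the `oai_dc` metadata prefix as defined in version 2.0 of the protocol; the base URL and the helper name are placeholders, not part of the talk.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     until_date: Optional[str] = None) -> str:
    """Build a ListRecords request limited to an optional datestamp range."""
    if from_date and until_date and from_date > until_date:
        # ISO dates (YYYY-MM-DD) compare correctly as strings.
        raise ValueError("from_date must not be later than until_date")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


# Placeholder base URL; a real repository publishes its own.
url = list_records_url("http://repository.example.org/oai",
                       from_date="2002-01-01", until_date="2002-06-18")
```

A harvester that remembers the date of its last successful run can pass it as `from_date` and fetch only what changed since then.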

selective harvesting - sets: harvest from a repository only the records within a set (figure shows sets S1 and S2)
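
Set-based selection can be sketched from the repository's side as a membership test. This assumes the version 2.0 convention that a set specification is a colon-separated path, so a request for a set also covers its sub-sets. The record layout and set names below are invented for illustration.

```python
from typing import Dict, Iterable, List


def in_set(record_set_specs: Iterable[str], requested: str) -> bool:
    """True if any of the record's sets is the requested set or one of its sub-sets."""
    return any(spec == requested or spec.startswith(requested + ":")
               for spec in record_set_specs)


def select_by_set(records: Dict[str, List[str]], requested: str) -> List[str]:
    """Return identifiers of records that fall within the requested set."""
    return [rid for rid, specs in records.items() if in_set(specs, requested)]


# Hypothetical repository contents: identifier -> set specifications.
records = {
    "oai:example:1": ["S1"],
    "oai:example:2": ["S2"],
    "oai:example:3": ["S1:theses"],
}
selected = select_by_set(records, "S1")
```

A harvester asks for this selection by adding a `set` argument to a ListRecords or ListIdentifiers request.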

Summary: Protocol for Metadata Harvesting. Service Requests: Identify, ListMetadataFormats, ListSets, GetRecord, ListIdentifiers, ListRecords. Metadata Multiplicity; Date (and Time) Ranges; Resumption Tokens.
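
Resumption tokens let a repository return a long list in pages. The loop below is a minimal sketch of how a harvester might follow them, assuming the version 2.0 response namespace and the rule that a follow-up request carries only the verb and the token. The `fetch` callable is a hypothetical stand-in for whatever performs the HTTP request and returns the response XML as text.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

# Namespace of OAI-PMH 2.0 responses (assumed version).
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str],
                        metadata_prefix: str = "oai_dc") -> Iterator[str]:
    """Yield every record identifier, following resumption tokens page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI_NS + "header"):
            ident = header.find(OAI_NS + "identifier")
            if ident is not None and ident.text:
                yield ident.text.strip()
        token = root.find(".//" + OAI_NS + "resumptionToken")
        # A missing or empty token means the list is complete.
        if token is None or not (token.text or "").strip():
            return
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers",
                  "resumptionToken": token.text.strip()}
```

Passing the transport in as `fetch` keeps the paging logic testable without a network connection; in practice it would wrap an HTTP GET against the repository's base URL.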

Harvesting vs. Federation: Competing approaches to interoperability. Federation is when services are run remotely on remote data (e.g., federated searching). Harvesting is when data/metadata is transferred from the remote source to the destination where the services are located (e.g., union catalogues). Federation requires more effort at each remote source but is easier for the local system; the reverse holds for harvesting. OAI (currently) focuses on harvesting.
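
The contrast can be sketched with in-memory stand-ins for remote sources. Everything here (the data layout, function names, and sample titles) is hypothetical. In the federated version each source must be able to run the search itself at query time; in the harvested version the sources only hand over metadata, and the search runs locally on the union catalogue.

```python
from typing import Dict, List

Records = Dict[str, str]  # record identifier -> title


def search(records: Records, query: str) -> List[str]:
    """Case-insensitive title search; returns matching identifiers, sorted."""
    q = query.lower()
    return sorted(rid for rid, title in records.items() if q in title.lower())


def federated_search(remote_sources: Dict[str, Records], query: str) -> List[str]:
    """Federation: every source runs the search; results are merged per query."""
    hits: List[str] = []
    for records in remote_sources.values():
        hits.extend(search(records, query))  # stands in for a remote search call
    return sorted(hits)


def harvest(remote_sources: Dict[str, Records]) -> Records:
    """Harvesting: copy metadata from every source into one local union catalogue."""
    union: Records = {}
    for records in remote_sources.values():
        union.update(records)
    return union


sources = {
    "a": {"oai:a:1": "Civil War diaries"},
    "b": {"oai:b:1": "Civil engineering reports", "oai:b:2": "Oral history"},
}
federated_hits = federated_search(sources, "civil")
harvested_hits = search(harvest(sources), "civil")
```

Both paths return the same hits here; the difference lies in where the work happens and in what each remote source has to implement.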

Example 1: Union Collection of ETDs (Electronic Theses and Dissertations, for Networked Digital Library of Theses and Dissertations, NDLTD)

Example 1: Details

Example 2: NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup) [Figure labels: referenced items & collections; Special Databases; NSDL Collections; NSDL Services; Other NSDL Services; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols); Portals & Clients; Usage Enhancement; Collection Building; User Interfaces; Core NSDL “Bus”]

Example 2: CITIDEL -> NSDL. CITIDEL: Computing and Information Technology Interactive Digital Educational Library, a collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library (NSDL)

Example 2: CITIDEL Distributed repository structure

Example 2: NSDL Collections (themes relevant to our projects): Discovery of content; Classification and cataloguing; Acquisition and/or linking; referencing. Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged. Software tool suites for analysis, modeling, simulation, or visualization. Reviewed commentary on pedagogy.

Open Digital Libraries, XOAI-PMH: Dissertation work of Hussein Suleman (member of OAI technical committee). Extending the OAI protocol. Supporting rapid development of DLs using networks of components. Demonstrated with NDLTD, CSTC. Described in Dec D-Lib Magazine article, and article scheduled for publication.

Open Digital Libraries Components. Running now: XML-File (data provider from file system); Union, search, browse, recent, filter; E-journal support system; Class projects; High performance multilingual search; Recommender; User rating. Others discussed: Classification/categorization and browsing.

Component System Approach: (Open) DL = Network of Extended OAs [Figure labels: Local Archive; Data Input; Remote Archive; Browse; Metadata Repository; Search; Recommend; Resource Discovery; User Interface; legend: OAI/ODL archive, OAI/ODL protocol]

Example Architecture (NDLTD) [Figure labels: Humboldt; Duisburg; MIT Filter; MIT Browse; Union Catalog; Search; Recent; User Interface; Virginia Tech; PhysNet; CalTech; Dresden; legend: OAI/ODL archive, OAI/ODL protocol]

OAI Tools: Related resources, e.g., XML, Unicode; Submission / author support; XML Schema Validator; Servers and utilities, e.g., ARC, Kepler, EPrints. Repository Explorer: Interactive Browsing; Testing of parameters; Multiple views of data; Multilingual support; Automatic test suite.

Author's tools

XSV Schema Validator

ARC (arc.cs.odu.edu)

VT Tool: Repository Explorer. The Repository Explorer is a tool for browsing and testing Open Archives, by Hussein Suleman. You issue commands and see the results. You can also perform a sequence of automatic tests.

VT Tool: RE 1.3

VT Tool: Request, Response

What will central service look like? (1 of 2) Harvesting from local sites; Rich content, drawn from all participating sites; Data management; Logging and reporting; Repository/preservation/mirroring; Adding/updating/deleting; User interface and support for digital librarians and data providers

What will central service look like? (2 of 2) Adding value: De-duping; Categorization/classification -> browsing; Normalization/standardization -> authority control; Tools for communication/collaboration/annotation -> security/privacy. User interface for both general users and scholars.
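
De-duping at a central service could take many forms; the talk does not specify one. The sketch below shows one simple possibility under stated assumptions: records are plain dictionaries with `title` and `creator` fields (a hypothetical layout), and two records count as duplicates when their normalized title and creator match. The first copy seen is kept.

```python
import re
from typing import Dict, List, Tuple

Record = Dict[str, str]


def _normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def dedup_key(record: Record) -> Tuple[str, str]:
    """Key on normalized title and creator (an assumed, deliberately crude rule)."""
    return (_normalize(record.get("title", "")),
            _normalize(record.get("creator", "")))


def dedupe(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each key, preserving input order."""
    seen: Dict[Tuple[str, str], Record] = {}
    for record in records:
        seen.setdefault(dedup_key(record), record)
    return list(seen.values())


# Invented sample data: the first two differ only in case and punctuation.
harvested = [
    {"id": "oai:a:1", "title": "Digital Libraries: An Overview", "creator": "Doe, Jane"},
    {"id": "oai:b:7", "title": "digital libraries - an overview", "creator": "DOE, JANE"},
    {"id": "oai:b:8", "title": "Metadata Harvesting", "creator": "Doe, Jane"},
]
unique = dedupe(harvested)
```

A production service would likely need authority control for names and fuzzier matching, which is where the normalization item on this slide comes in.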

What are needs at local sites? Increasing OAI expertise; Connecting OAI with local systems; Supporting standards, normalization; Supporting continual updating; Passing enhancements upstream

How can VT help? (1 of 2) Usability studies for central site; Help develop consensus; Help plan system architecture & services; Education/training; Provide and support tools/systems; Help sites engage, become OAI compliant

How can VT help? (2 of 2) Standards: MARC-XML. ODL Suite: Download and configure; Use in packaged forms, or re-architected. Support: Connecting your system into OAI; Help with OAI Tools.

MARC XML-DTD: XML transport format for US-MARC records; standardized metadata exchange format for traditional library services joining OAI

Rethink your efforts in terms of providers of Data, Services. Reduced work for data providers: Tools available; Don’t need to offer services. Reduced work for service providers: Others provide the data; Can use tools and systems for OAI, XOAI. Results: More data becoming available, to more people, supported by improved services. MetaScholar can be a win-win-win project!

Links: Open Archives Initiative; OAI Metadata Harvesting Protocol; Virginia Tech DLRL OAI Projects; Repository Explorer; NDLTD

More Links: ARC Cross-Archive Search Service; XML Schema Validator; Dublin Core Metadata Initiative; E-Prints; DL-in-a-box; XML Tools at W3C