Using OAI-PMH for Resource Exchange
OAI Metadata Harvesting Workshop, JCDL 03
Michael L. Nelson, Terry L. Harrison
Old Dominion University, Norfolk VA



OAI-PRH?
using OAI-PMH for resource extraction / exchange
–yes, OAI-PMH is for metadata, not resources, but it's going to happen anyway…
  mirroring
  preservation (archive “zipping”)
  convergence with OAIS
–assumptions
  a digital resource
  rsync et al. neither appropriate nor possible
  defer metadata vs. data discussion

Possible Approaches
1. Exploit knowledge outside the scope of the OAI-PMH to extract the resource
2. Base64 encode the resource and transmit via OAI-PMH as a separate “metadata” prefix?
3. Separate metadata prefix with instructions on how to extract / scrape the resource
4. Separate metadata format with XML encoded metadata, along with XSLT to decode it

Out of Band Knowledge
direct pdf
1. take url in dc:identifier
2. parse report number
3. append “reportnumber.pdf” to url
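The three steps above could be sketched roughly as follows. This is only an illustration under assumptions the slides do not state: that the dc:identifier URL ends with the report number as its last path segment, and that the PDF sits directly beneath that URL. The function name and the example.org URL are hypothetical.

```python
from urllib.parse import urlsplit


def pdf_url_from_identifier(identifier_url: str) -> str:
    """Guess a direct PDF URL from a dc:identifier URL (hypothetical layout)."""
    # Step 1: take the URL found in dc:identifier.
    path = urlsplit(identifier_url).path
    # Step 2: parse the report number, assumed here to be the last path segment.
    report_number = path.rstrip("/").rsplit("/", 1)[-1]
    # Step 3: append "<reportnumber>.pdf" to the URL.
    return identifier_url.rstrip("/") + "/" + report_number + ".pdf"


if __name__ == "__main__":
    print(pdf_url_from_identifier("http://repository.example.org/reports/TR-2003-12/"))
```

Every repository would need its own variant of this rule, which is the scalability concern with this approach.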

Out of Band Knowledge
pros: tailored, no “accidental” harvesting
cons: not scalable wrt # of repositories & harvesters, false negatives

                  no metadata change    metadata change
no data change    ok                    unnecessary download
data change       missed update!        ok

assumption: change in metadata means a change in data (not always true!)
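The four cases in the table can be written as a small function. This is a sketch that assumes a harvester which re-downloads the resource only when the metadata record has changed; the function name is hypothetical.

```python
def harvest_outcome(metadata_changed: bool, data_changed: bool) -> str:
    """Outcome for a harvester that re-fetches the resource only on a metadata change."""
    if metadata_changed and not data_changed:
        # The harvester fetches the resource again although it is identical.
        return "unnecessary download"
    if data_changed and not metadata_changed:
        # Nothing signals the change, so the harvester keeps a stale copy.
        return "missed update"
    # Either nothing changed, or both changed and the re-fetch is justified.
    return "ok"


if __name__ == "__main__":
    for data_changed in (False, True):
        for metadata_changed in (False, True):
            print(data_changed, metadata_changed,
                  harvest_outcome(metadata_changed, data_changed))
```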

Base64 Encoding
define separate metadata formats
–base64:application/pdf
–base64:application/powerpoint
pros: describable with OAI-PMH semantics, accomplished with standard OAI-PMH tools
cons: heavyweight (could use compression), suitable for simple objects only, accidental harvesting would produce high loads for repositories and harvesters
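A minimal sketch of the base64 idea, using only the Python standard library. The `resource` element and its `mimeType` and `encoding` attributes are made up for illustration; the slides do not define a schema for these metadata formats.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_resource(resource: bytes, mime_type: str) -> str:
    """Repository side: embed the raw resource as base64 text in an XML element."""
    # Hypothetical element and attribute names, not part of any published format.
    root = ET.Element("resource", {"mimeType": mime_type, "encoding": "base64"})
    root.text = base64.b64encode(resource).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap_resource(xml_text: str) -> bytes:
    """Harvester side: recover the original bytes from the XML payload."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.text or "")


if __name__ == "__main__":
    payload = b"%PDF-1.4 example payload" * 100
    record = wrap_resource(payload, "application/pdf")
    assert unwrap_resource(record) == payload
    # Base64 emits 4 characters for every 3 input bytes.
    print(len(payload), len(base64.b64encode(payload)))
```

The roughly one-third size increase from base64 is one reason the approach is described as heavyweight.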

Metadata as Instructions cf.

Metadata as Instructions
the resource described in could be a complex object
–may not be appropriate to:
  “tar” the object into a single file
  expose all constituent objects through OAI-PMH
–define a metadata prefix that provides machine readable instructions on how to extract the complex object
  METS?
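If METS were used for such instructions, a harvester might read the file locations of the constituent objects from the record's file section. The sketch below assumes an abbreviated, hand-written METS fragment (not schema-valid, with example.org URLs) and a hypothetical helper name; the slides only raise METS as a possibility.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"mets": METS_NS}

# Abbreviated, hypothetical record describing a two-file complex object.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="f1" MIMETYPE="application/pdf">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://repository.example.org/obj1/report.pdf"/>
      </mets:file>
      <mets:file ID="f2" MIMETYPE="image/png">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://repository.example.org/obj1/figure1.png"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def constituent_files(mets_xml: str) -> list:
    """Return (MIME type, URL) pairs for every file that has a location."""
    root = ET.fromstring(mets_xml)
    files = []
    for file_el in root.iterfind(".//mets:file", NS):
        locat = file_el.find("mets:FLocat", NS)
        if locat is not None:
            href = locat.get("{%s}href" % XLINK_NS)
            files.append((file_el.get("MIMETYPE"), href))
    return files


if __name__ == "__main__":
    for mime_type, url in constituent_files(SAMPLE):
        print(mime_type, url)
```

The harvester would then fetch each listed URL itself, so the repository neither packs the object into one file nor exposes every part as its own OAI-PMH item.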

Metadata as Instructions

XSLT
if the resource is already XML encoded, include an XSLT to transform into the desired format
–use separate metadata formats or even sets for the harvester to express their transformation preferences?
pros: elegant, limited work for repository
cons: assumes client-side transformation capability, applicable only for XML-encodable resources