The Open Archives Initiative DRIADE Workshop, Durham NC, May 16-17, 2007 Michael L. Nelson The Open Archives Initiative Michael L. Nelson Computer Science,

Presentation transcript:

The Open Archives Initiative
Michael L. Nelson
Computer Science, Old Dominion University
DRIADE Workshop, Durham NC, May 16-17, 2007

Open Archives Initiative Protocol for Metadata Harvesting
data providers / repositories:
  o “A repository is a network accessible server that can process the 6 OAI-PMH requests in the manner described in [the OAI-PMH document]. A repository is managed by a data provider to expose metadata to harvesters.”
service providers / harvesters:
  o “A harvester is a client application that issues OAI-PMH requests. A harvester is operated by a service provider as a means of collecting metadata from repositories.”

Data Providers / Service Providers
data providers (repositories)
service providers (harvesters)

Overview of OAI-PMH Verbs
Verb                  Function                             Group
Identify              description of repository            repository metadata
ListMetadataFormats   metadata formats supported by repo   repository metadata
ListSets              sets defined by repository           repository metadata
ListIdentifiers       OAI unique ids contained in repo     harvesting verbs
ListRecords           listing of N records                 harvesting verbs
GetRecord             listing of a single record           harvesting verbs
most verbs take arguments: dates, sets, ids, metadata formats and resumption token (for flow control)
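As a minimal sketch of how the six verbs and their arguments turn into HTTP requests, the following builds OAI-PMH request URLs with the standard library. The base URL, the record identifier, and the helper name `build_request` are hypothetical and chosen only for illustration; `oai_dc` is the Dublin Core metadata prefix defined by the protocol.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed on the slide.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_request(base_url, verb, **args):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    # "from" is a Python keyword, so callers pass from_ and it is renamed here.
    if "from_" in args:
        args["from"] = args.pop("from_")
    params = {"verb": verb}
    params.update({k: v for k, v in args.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "http://example.org/oai"  # hypothetical baseURL

identify_url = build_request(BASE_URL, "Identify")
get_record_url = build_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:example.org:1234",  # hypothetical identifier
    metadataPrefix="oai_dc",
)
print(identify_url)
print(get_record_url)
```

Keeping the verb check in one place means a typo in a verb name fails locally instead of producing a badVerb error from the repository.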

OAI-PMH data model
resource
item: entry point to all records pertaining to the resource
  o identified by: OAI-PMH identifier
  o grouped by: OAI-PMH sets
records: metadata pertaining to the resource (Dublin Core metadata, MARCXML metadata)
  o identified by: OAI-PMH identifier, metadataPrefix, datestamp
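The data model can be sketched as two small data structures: an item that is the entry point for a resource, and one record per metadata format. This is only an illustration under stated assumptions; the class names, the set name, the `marcxml` prefix, and the identifier are hypothetical, and the metadata payloads are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """Metadata about a resource in one format.

    A record is identified by identifier + metadataPrefix + datestamp.
    """
    identifier: str
    metadata_prefix: str
    datestamp: str
    metadata: str


@dataclass
class Item:
    """Entry point to all records pertaining to one resource."""
    identifier: str
    sets: set = field(default_factory=set)
    records: dict = field(default_factory=dict)  # metadataPrefix -> Record

    def add_record(self, metadata_prefix, datestamp, metadata):
        record = Record(self.identifier, metadata_prefix, datestamp, metadata)
        self.records[metadata_prefix] = record
        return record


# Hypothetical item with two records in different metadata formats.
item = Item("oai:example.org:1234", sets={"physics"})
item.add_record("oai_dc", "2007-05-16", "<oai_dc:dc>...</oai_dc:dc>")
item.add_record("marcxml", "2007-05-16", "<record>...</record>")
print(sorted(item.records))
```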

Complexity Comes to OAI-PMH…
First noticed in how people would populate their Dublin Core records
  o people need the HTML splash page
  o crawlers need the PDF file
Ad-hoc conventions and methods used to expose the repository’s knowledge about the structure of the object
Next three slides taken from “Resource Harvesting Within the OAI-PMH Framework”
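A minimal sketch of why such ad-hoc conventions are fragile, assuming a hypothetical Dublin Core record that carries both the splash page and the PDF as `dc:identifier` values. The record, its URLs, and the file-extension heuristic are illustrative assumptions, not an encoding taken from the talk.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: nothing in the markup says which URL is the resource.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title</dc:title>
  <dc:format>pdf</dc:format>
  <dc:identifier>http://example.org/archive/1234/</dc:identifier>
  <dc:identifier>http://example.org/archive/1234/paper.pdf</dc:identifier>
</oai_dc:dc>"""


def guess_locators(dc_xml):
    """Split dc:identifier values into resource and splash page by guessing."""
    root = ET.fromstring(dc_xml)
    urls = [e.text.strip() for e in root.iter(f"{{{DC_NS}}}identifier") if e.text]
    # The guess: a URL ending in .pdf is the resource, anything else is a splash page.
    resources = [u for u in urls if u.lower().endswith(".pdf")]
    splash_pages = [u for u in urls if u not in resources]
    return {"resource": resources, "splash_page": splash_pages}


locators = guess_locators(SAMPLE)
print(locators)
```

The heuristic breaks as soon as a repository serves the PDF from a URL without the extension, which is the kind of guesswork a harvester is left with when structure is not expressed explicitly.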

Dublin Core Encoding Type 1
A Simple Parallel-Plate Resonator Technique for Microwave Characterization of Thin Resistive Films
Vorobiev, A.
ING-INF/01 Elettronica
A parallel-plate resonator method is proposed for non-destructive characterisation of resistive films used in microwave integrated circuits. A slot made in one...
Microwave engineering Europe
2002
Documento relativo ad una Conferenza o altro Evento
PeerReviewed
pdf
locator of resource
splash page

Dublin Core Encoding Type 2
… …
locator of resource
splash page

Dublin Core Encoding Type 3
… …
locator of resource
splash page

OAI Object Re-Use and Exchange
Develop, identify, and profile extensible standards and protocols to allow repositories, agents, and services to interoperate in the context of use and reuse of compound digital objects beyond the boundaries of the holding repositories.
Aim for more effective and consistent ways:
  o to facilitate discovery of these objects,
  o to reference (link to) these objects (and parts thereof),
  o to obtain a variety of disseminations of these objects,
  o to aggregate and disaggregate these objects,
  o Enable processing by automated agents

The Structure of Compound Objects is Obfuscated When Mapped to the Web

Useful for humans and useful for applications is often different
HTTP LINK HEADER

Through the Resource Map, the Web application sees the compound object

This approach reveals compound objects in the Web graph

OAI: It’s Not Just for Metadata Harvesting Anymore…
OAI-PMH                 OAI-ORE
Repository structure    Object structure
Metadata centric        Resource centric
Metadata harvesting     Object re-use (obtain, harvest, register)
OAI-PMH and OAI-ORE are complementary:
  o you can do one without the other
  o you can do them together

OAI-ORE: Current Status
Ongoing definition of the ORE framework
  o Reach joint problem statement
  o Issues regarding identification
  o Model for ORE resource
  o Publishing ORE resources to the Web
  o Discovering ORE resources
Review of appropriate technologies for ORE Model and Resource Map
  o ATOM
  o DID/DIDL, IMS/CP, METS, Ramlet
  o RDF, RDF/XML
  o Dublin Core Abstract Model
  o …

OAI-ORE: Current Status
Explore demonstrators using these concepts in preparation for the May 2007 ORE Technical Committee meeting
Post May 2007 meeting:
  o Hopefully work towards alpha specs for ORE resource, Resource Map, discovery of ORE resource
  o Experimentation with alpha specs

My research group’s approach to OAI/Preservation integration…

Preservation: Fortress Model
Five Easy Steps for Preservation:
1. Get a lot of $
2. Buy a lot of disks, machines, tapes, etc.
3. Hire an army of staff
4. Load a small amount of data
5. “Look upon my archive ye Mighty, and despair!”

Alternate Models of Preservation
Lazy Preservation
  o Let Google, IA et al. preserve your website
Just-In-Time Preservation
  o Wait for it to disappear first, then find a “good enough” version
Shared Infrastructure Preservation
  o Push your content to sites that might preserve it
Web Server Enhanced Preservation
  o Use Apache modules to create archival-ready resources

Web Site Preservation: 2 Problems
The counting problem
  o How many pages are on that site?
  o To save it you have to find it
The representation problem
  o What’s that page all about?
  o Future use requires understanding
Guess the bean count, win the jar

OAI-PMH Data Model
resource
item: OAI-PMH identifier = entry point to all records pertaining to the resource
records:
  o Dublin Core metadata: simple model
  o MARCXML metadata: more expressive model
  o MPEG-21 DIDL: complex model
  o METS: complex model
Dublin Core and MARCXML: metadata pertaining to the resource
MPEG-21 DIDL and METS: modeled representation of the resource

Integrate OAI-PMH functionality into the web server itself…
mod_oai implementation
1. Use mod_oai
   - an Apache 2.0 module
   - automatically answers OAI-PMH requests for an http server
   - written in C
   - respects values in .htaccess, httpd.conf
2. Install mod_oai on
3. Define baseURL:
Result: web harvesting with OAI-PMH semantics (e.g., from, until, sets)
Using OAI-PMH:
  Give me all resources
  And their preservation metadata
  From site foo, dating from 9/15/2004 through today
  that are MIME type video-MPEG
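The plain-language request on this slide maps onto OAI-PMH selective harvesting arguments (from, until, set). The sketch below shows one way such a request could be assembled. The base URL, the set name `mime:video:mpeg`, and the metadata prefix `oai_didl` are assumptions made for illustration and are not confirmed by the slide; the end date is fixed so the output is reproducible, where a live harvester would use today's date.

```python
from datetime import date
from urllib.parse import urlencode


def selective_harvest_url(base_url, since, until, mime_set, metadata_prefix):
    """Build a ListRecords URL restricted by date range and set."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": since.isoformat(),   # day granularity, YYYY-MM-DD
        "until": until.isoformat(),
        "set": mime_set,
    }
    return f"{base_url}?{urlencode(params)}"


url = selective_harvest_url(
    "http://foo.example/modoai",  # hypothetical baseURL for "site foo"
    date(2004, 9, 15),            # "dating from 9/15/2004"
    date(2007, 5, 16),            # stand-in for "today"
    "mime:video:mpeg",            # assumed set name for the MIME type
    "oai_didl",                   # assumed prefix for resource plus metadata
)
print(url)
```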

Addressing the Counting Problem: ListIdentifiers
CRAWLER:
  o issues a ListIdentifiers request,
  o finds URLs of updated resources
  o does HTTP GET for updated resources only
  o can get URLs of resources with specified MIME types
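A minimal sketch of the crawler side, assuming a ListIdentifiers response whose identifiers are resource URLs, as the slide suggests. The sample response, its URLs, and the function name are made up for illustration; the namespace and the header layout (identifier, datestamp, optional deleted status) follow OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical, abbreviated ListIdentifiers response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>http://foo.example/a.mpg</identifier>
      <datestamp>2004-09-20</datestamp>
    </header>
    <header status="deleted">
      <identifier>http://foo.example/old.mpg</identifier>
      <datestamp>2005-01-02</datestamp>
    </header>
    <header>
      <identifier>http://foo.example/b.mpg</identifier>
      <datestamp>2006-03-11</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def updated_urls(response_xml):
    """Return (url, datestamp) pairs for headers not marked as deleted."""
    root = ET.fromstring(response_xml)
    results = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue  # nothing to fetch for a deleted resource
        url = header.findtext("oai:identifier", namespaces=OAI_NS)
        stamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        results.append((url, stamp))
    return results


to_fetch = updated_urls(SAMPLE_RESPONSE)
print(to_fetch)
```

The crawler would then issue an ordinary HTTP GET for each URL in `to_fetch` and skip everything else on the site.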

Addressing the Representation Problem: ListRecords in DIDL Format
CRAWLER:
  o Makes a ListRecords query,
  o Gets updates as MPEG-21 DIDL records (HTTP headers, resource By Value or By Reference)
  o can get resources with specified MIME types
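A ListRecords harvest is usually delivered in chunks, with a resumption token used for flow control. The loop below is a sketch of that pattern under stated assumptions: `fetch` is a hypothetical callable that takes the request parameters and returns the response XML, the two canned pages stand in for a real repository, and `oai_didl` is an assumed metadata prefix.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadata_prefix):
    """Yield record elements, following resumption tokens until the list ends."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        yield from root.iter(f"{OAI}record")
        token = root.find(f".//{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # an absent or empty token marks the final chunk
        # The resumption token is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


def page(identifier, token):
    """Build one canned response chunk (test data, not a real repository)."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


PAGES = {None: page("http://foo.example/a.mpg", "chunk2"),
         "chunk2": page("http://foo.example/b.mpg", "")}


def fake_fetch(params):
    return PAGES[params.get("resumptionToken")]


identifiers = [
    record.findtext(f"{OAI}header/{OAI}identifier")
    for record in harvest(fake_fetch, "oai_didl")
]
print(identifiers)
```

Passing `fetch` in as a parameter keeps the loop testable without network access; a real harvester would supply a function that performs the HTTP GET.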

CRATE: Preservation Metadata at Dissemination Time
Harnesses web server to support preservation
Moves preservation metadata from “strict validation at ingest” to “best-effort description at dissemination”
Plug-in Name, Executable path

Validation is Subjective
Preservation metadata is like a David Hockney photo collage: each image is both true and incomplete, and while the result is not faithful, it does capture the “essence”