A centre of expertise in digital information management The OAI Protocol for Metadata Harvesting Andy Powell UKOLN,




Presentation transcript:

UKOLN: a centre of expertise in digital information management
The OAI Protocol for Metadata Harvesting
Andy Powell, UKOLN, University of Bath
IVOA Registry Meeting, London, March 2003

2 Contents
a brief history of OAI
10 technical things you should know about the OAI-PMH

3 OAI roots
the roots of OAI lie in the development of eprint archives…
  – arXiv, CogPrints, NACA (NASA), RePEc, NDLTD, NCSTRL
each offered Web interface for deposit of articles and for end-user searches
difficult for end-users to work across archives without having to learn multiple different interfaces
recognised need for single search interface to all archives
  – Universal Pre-print Service (UPS)

4 Searching vs. harvesting
two possible approaches to building a single search interface to multiple eprint archives…
  – cross-searching multiple archives based on protocol like Z39.50
  – harvesting metadata into one or more central services – bulk move data to the user-interface
US digital library experience in this area indicated that cross-searching not preferred approach
  – distributed searching of N nodes viable, but only for small values of N

5 Searching vs. harvesting
[diagram: search service …or…]

6 Harvesting requirements
in order that harvesting approach can work there need to be agreements about…
  – transport protocols – HTTP vs. FTP vs. …
  – metadata formats – DC vs. MARC vs. …
  – quality assurance – mandatory elements, mechanisms for naming of people, subjects, etc., handling duplicated records, best-practice
  – intellectual property and usage rights – who can do what with the records
work in this area resulted in the Santa Fe Convention

7 Development of OAI-PMH
2 year metamorphosis through various names
  – Santa Fe Convention, OAI-PMH versions 1.0, 1.1…
  – OAI Protocol for Metadata Harvesting 2.0
development steered by international technical committee
inter-version stability helped developer confidence
move from focus on eprints to more generic protocol
  – move from OAI-specific metadata schema to mandatory support for DC

8 Bluffer's guide to OAI
1. OAI-PMH is a low-cost mechanism for harvesting metadata records
  – from data providers to service providers
2. allows service provider to say 'give me some or all of your metadata records'
  – where 'some' is based on date-stamps, sets, metadata formats
3. not limited to repositories of eprints
  – images, museum artefacts, learning objects, …
4. based on HTTP and XML
  – simple, Web-friendly, autonomous
  – fast, flexible deployment
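Point 2 (asking for 'some' of the records) can be sketched from the repository's side. This is a minimal illustration, not code from any OAI toolkit: `Header`, `in_set` and `select` are hypothetical names, the records live in memory, and datestamps are assumed to be day-granularity UTC strings so that plain string comparison matches date order. It also assumes the OAI-PMH 2.0 conventions that the from/until bounds are inclusive and that set names form colon-separated hierarchies.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Header:
    """Hypothetical stand-in for the header of one record."""
    identifier: str
    datestamp: str                      # assumed day granularity, e.g. "2003-03-01"
    set_specs: Tuple[str, ...] = ()


def in_set(header: Header, wanted: str) -> bool:
    """True if the record is in `wanted` or in one of its sub-sets ("a:b" is inside "a")."""
    return any(s == wanted or s.startswith(wanted + ":") for s in header.set_specs)


def select(headers: Iterable[Header],
           from_: Optional[str] = None,
           until: Optional[str] = None,
           set_spec: Optional[str] = None) -> List[Header]:
    """Return the headers matching the given date-stamp range and set."""
    chosen = []
    for h in headers:
        if from_ is not None and h.datestamp < from_:
            continue                    # changed before the requested window
        if until is not None and h.datestamp > until:
            continue                    # changed after the requested window
        if set_spec is not None and not in_set(h, set_spec):
            continue                    # not in the requested set
        chosen.append(h)
    return chosen
```

Calling `select(headers)` with no arguments corresponds to 'give me all of your records'; a harvester that remembers the date of its last visit would pass it as `from_` to pick up only what has changed since.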

9 Bluffer's guide to OAI
5. OAI-PMH is not a search protocol
  – but use can underpin search-based services based on Z39.50 or SRW or SOAP or…
6. OAI-PMH carries only metadata
  – content (e.g. full-text or image) made available separately – typically at URL in metadata
7. mandates simple DC as record format
  – but extensible to any XML format – IMS, ONIX, MARC, METS, etc.
8. extensible framework for metadata about
  – repository, resources, items, sets
  – can include rights metadata

10 Bluffer's guide to OAI
9. metadata and content often made freely available – but not a requirement
  – OAI-PMH can be used between closed groups
  – or, can make metadata available but restrict access to content in some way
10. underlying HTTP protocol provides
  – access control – e.g. HTTP BASIC
  – compression mechanisms (for improving performance of harvesters)
  – could, in theory, also provide encryption if required
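Point 10 relies on ordinary HTTP features rather than anything OAI-specific. The sketch below shows, under stated assumptions, what a harvester might send and how it might read the reply: `harvest_headers` and `decode_body` are hypothetical helper names, the credentials are made up, and no network call is made.

```python
import base64
import gzip
from typing import Dict, Optional


def harvest_headers(user: Optional[str] = None,
                    password: Optional[str] = None) -> Dict[str, str]:
    """HTTP headers a harvester might attach to each OAI-PMH request."""
    headers = {"Accept-Encoding": "gzip"}       # ask the repository for a compressed reply
    if user is not None and password is not None:
        # HTTP BASIC: "user:password", base64-encoded (encoded, not encrypted)
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        headers["Authorization"] = "Basic " + token
    return headers


def decode_body(raw: bytes, content_encoding: Optional[str]) -> str:
    """Undo gzip compression if the server applied it, then decode the XML as UTF-8."""
    if content_encoding and content_encoding.lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")
```

Because BASIC credentials are only base64-encoded, a closed group that uses them would normally also want the encryption mentioned in the last bullet.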

11 Resources, items and records
[diagram]
resource
item – all available metadata about David (item = identifier)
records – Dublin Core metadata, MARC metadata, SPECTRUM metadata
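One way to read the diagram is as a small data structure: an item is addressed by a single identifier and can be disseminated as several records, one per metadata format. The sketch below is illustrative only; the `Item` class, the identifier and the format names used as dictionary keys are assumptions, not taken from the slides or from any toolkit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    """All available metadata about one resource, addressed by one identifier."""
    identifier: str
    # metadata format name -> the record (XML text) in that format
    records: Dict[str, str] = field(default_factory=dict)

    def formats(self) -> List[str]:
        """Names of the metadata formats this item can be disseminated in."""
        return sorted(self.records)

    def record(self, metadata_prefix: str) -> str:
        """The record for one format; raises KeyError if the format is unavailable."""
        return self.records[metadata_prefix]


# Hypothetical item describing the resource "David"
david = Item(
    identifier="oai:example.org:david",
    records={
        "oai_dc": "<oai_dc:dc>...</oai_dc:dc>",
        "marc": "<record>...</record>",
        "spectrum": "<spectrum>...</spectrum>",
    },
)
```

The resource itself (the thing being described) never appears in the structure, which matches the earlier point that OAI-PMH carries only metadata.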

12 Protocol requests
six different request types
  – Identify
  – ListMetadataFormats
  – ListSets
  – ListIdentifiers
  – ListRecords
  – GetRecord
harvester need not use all types
repository must implement all types
required and optional arguments
  – on request types
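The six request types are plain HTTP requests with a `verb` parameter plus arguments. The sketch below builds such request URLs and checks the arguments per verb. It is a hedged illustration: `build_request` is a hypothetical helper, the base URL is made up, the required/optional table reflects my reading of OAI-PMH 2.0, and the `resumptionToken` argument used for flow control is deliberately left out.

```python
from urllib.parse import urlencode

# verb -> (required arguments, optional arguments); resumptionToken omitted for brevity
VERBS = {
    "Identify": (set(), set()),
    "ListMetadataFormats": (set(), {"identifier"}),
    "ListSets": (set(), set()),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"}),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}),
    "GetRecord": ({"identifier", "metadataPrefix"}, set()),
}


def build_request(base_url: str, verb: str, **args: str) -> str:
    """Return the GET URL for one OAI-PMH request, validating its arguments."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    if "from_" in args:                      # `from` is a Python keyword, so accept `from_`
        args["from"] = args.pop("from_")
    required, optional = VERBS[verb]
    given = set(args)
    missing = required - given
    unknown = given - required - optional
    if missing or unknown:
        raise ValueError(f"{verb}: missing {sorted(missing)}, not allowed {sorted(unknown)}")
    query = [("verb", verb)] + sorted(args.items())
    return base_url + "?" + urlencode(query)  # urlencode also escapes ":" in identifiers
```

For example, `build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc", from_="2003-01-01")` asks for every Dublin Core record changed since the start of 2003.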

13 Record structure
metadata about a resource in a particular XML format
header (mandatory)
  – identifier (1)
  – datestamp (1)
  – setSpec elements (*)
  – status attribute for deleted item (?)
metadata (mandatory)
  – XML encoded metadata within root tag which provides namespace and schema
  – repositories must support Dublin Core
about (optional)
  – rights statements
  – provenance statements
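The record structure above maps directly onto a small XML document. The following sketch parses one record with Python's standard library; the sample record, its identifier and the `parse_record` helper are invented for illustration, while the namespace URIs are the ones defined for OAI-PMH 2.0, oai_dc and Dublin Core.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hand-written sample record (hypothetical identifier and content)
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:123</identifier>
    <datestamp>2003-03-01</datestamp>
    <setSpec>physics</setSpec>
    <setSpec>physics:hep</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example eprint</dc:title>
    </oai_dc:dc>
  </metadata>
</record>
"""


def parse_record(xml_text: str) -> dict:
    """Pull the header fields and the metadata root tag out of one <record>."""
    record = ET.fromstring(xml_text)
    header = record.find(OAI_NS + "header")
    metadata = record.find(OAI_NS + "metadata")
    has_payload = metadata is not None and len(metadata) > 0
    return {
        "identifier": header.findtext(OAI_NS + "identifier"),              # exactly one
        "datestamp": header.findtext(OAI_NS + "datestamp"),                # exactly one
        "setSpecs": [e.text for e in header.findall(OAI_NS + "setSpec")],  # zero or more
        "deleted": header.get("status") == "deleted",                      # optional attribute
        "metadata_root": metadata[0].tag if has_payload else None,         # namespaced root tag
        "has_about": record.find(OAI_NS + "about") is not None,            # optional container
    }
```

The `metadata_root` value carries the namespace of the format in use, which is how a harvester can tell an oai_dc record from, say, a MARC one.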

14 Dublin Core
OAI-PMH mandates use of simple DC as lowest common denominator
agreed XML schema – oai_dc
  – simple DC – 15 metadata properties
  – all DC properties optional and repeatable
Title        Contributor  Source
Creator      Date         Language
Subject      Type         Relation
Description  Format       Coverage
Publisher    Identifier   Rights
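Because every simple DC property is optional and repeatable, a record fits naturally into a mapping from property name to a list of values. The sketch below serialises such a mapping as an oai_dc element. It is an assumption-laden illustration: `to_oai_dc` is a hypothetical helper, the sample values are invented, and the `xsi:schemaLocation` attribute that a real repository would add is omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 simple DC properties (element names are lower-case in the XML)
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_oai_dc(values: Dict[str, List[str]]) -> str:
    """Serialise {property: [values...]} as an <oai_dc:dc> element."""
    unknown = set(values) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not simple DC properties: {sorted(unknown)}")
    ET.register_namespace("oai_dc", OAI_DC_NS)   # use conventional prefixes in the output
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name in DC_ELEMENTS:                     # any property may be absent...
        for value in values.get(name, []):       # ...or repeated
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

An empty mapping is accepted, since no property is mandatory, while a key such as `author` is rejected because it is not one of the 15 properties.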

15 OAI demonstration
repository explorer demo

16 OAI and Google
[diagram: Web site(s), multimedia database(s), eprint archive(s), OAI, DP9 gateway]
gateway makes harvested metadata available to Google…

17 Implementing OAI
OAI protocol is relatively simple
implementation and deployment tend to be very fast
lots of available toolkits
  – Java, Perl, PHP, etc.
complete tools also available
  – e.g. tools that sit in front of existing databases
see tools area on the OAI Web site…

18 Creative Commons
CC is devoted to expanding the range of creative work available for others to build upon and share
provides standard licences for content
  – attribution
  – noncommercial
  – no derivative works
  – share alike
mechanisms for indicating licence on Web pages
need similar mechanism in OAI

19 Questions…
