The Open Archives Initiative Protocol for Metadata Harvesting
CRIS + Open Access = The Route to Research Knowledge on the GRID, Brussels

Presentation transcript:

The Open Archives Initiative Protocol for Metadata Harvesting
CRIS + Open Access = The Route to Research Knowledge on the GRID
Brussels – 21 September 2004
Andy Powell, UKOLN, University of Bath
A centre of expertise in digital information management

2 Contents
a brief history of OAI
10 technical things you should know about the OAI-PMH
potential impact…
  – institutional context
  – the role of the library?
  – the researcher
current activities/issues
OAI and the semantic Web
note: primary focus is on the technology

3 OAI roots
the roots of OAI lie in the development of eprint archives…
  – arXiv, CogPrints, NACA (NASA), RePEc, NDLTD, NCSTRL
each offered Web interface for deposit of articles and for end-user searches
difficult for end-users to work across archives without having to learn multiple different interfaces
recognised need for single search interface to all archives
  – Universal Pre-print Service (UPS)

4 Searching vs. harvesting
two possible approaches to building a single search interface to multiple eprint archives…
  – cross-searching multiple archives based on protocol like Z39.50
  – harvesting metadata into one or more ‘central’ services – bulk move data to the user-interface
US digital library experience in this area indicated that cross-searching not preferred approach
  – distributed searching of N nodes viable, but only for small values of N
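The harvesting approach on this slide can be sketched in a few lines. This is an illustration only: the archive names, the record fields and the `harvest` and `search` helpers are assumptions made for the example, and in-memory lists stand in for remote eprint archives.

```python
# Hypothetical in-memory stand-ins for remote eprint archives.
ARCHIVES = {
    "archive-a": [{"title": "Metadata harvesting in practice"}],
    "archive-b": [
        {"title": "Cross-searching with Z39.50"},
        {"title": "Harvesting eprint metadata"},
    ],
}


def harvest(archives):
    """Bulk-copy every record into one central index, noting its source."""
    index = []
    for name, records in archives.items():
        for record in records:
            index.append({"source": name, **record})
    return index


def search(index, term):
    """Search the central index only; no archive is contacted at query time."""
    needle = term.lower()
    return [r for r in index if needle in r["title"].lower()]


central_index = harvest(ARCHIVES)
hits = search(central_index, "harvesting")
print([(h["source"], h["title"]) for h in hits])
```

The point of the design is that query cost no longer grows with the number of archives: the N-way work happens once, at harvest time, rather than on every search.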

5 Harvesting requirements
in order that harvesting approach can work there need to be agreements about…
  – transport protocols – HTTP vs. FTP vs. …
  – metadata formats – DC vs. MARC vs. …
  – quality assurance – mandatory elements, mechanisms for naming of people, subjects, etc., handling duplicated records, best-practice
  – intellectual property and usage rights – who can do what with the records
work in this area resulted in the “Santa Fe Convention”

6 Development of OAI-PMH
2 year metamorphosis through various names
  – Santa Fe Convention, OAI-PMH versions 1.0, 1.1…
  – OAI Protocol for Metadata Harvesting 2.0
development steered by international technical committee
inter-version stability helped developer confidence
move from focus on eprints to more generic protocol
  – move from OAI-specific metadata schema to mandatory support for DC

7 Bluffer’s guide to OAI
1. OAI-PMH short for Open Archives Initiative Protocol for Metadata Harvesting
2. a low-cost mechanism for harvesting metadata records
  – from ‘data providers’ to ‘service providers’
3. allows ‘service provider’ to say ‘give me some or all of your metadata records’
  – where ‘some’ is based on date-stamps, sets, metadata formats
4. eprint heritage but widely deployed
  – images, museum artefacts, learning objects, …
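As a rough illustration of point 3, a ‘give me some of your records’ request is an ordinary HTTP GET with a few query arguments. The sketch below only builds the URL and makes no network call. The `ListRecords` verb and the `metadataPrefix`, `from`, `until` and `set` arguments come from OAI-PMH 2.0; the base URL, the set name and the helper name `list_records_url` are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until_date=None, set_spec=None):
    """Build a selective ListRecords request URL (no request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date    # lower bound on record date-stamps
    if until_date:
        params["until"] = until_date  # upper bound on record date-stamps
    if set_spec:
        params["set"] = set_spec      # restrict to one set in the archive
    return base_url + "?" + urlencode(params)


# Hypothetical data provider and set name, used only for illustration.
url = list_records_url("http://example.org/oai",
                       from_date="2004-01-01", set_spec="physics")
print(url)
```

Leaving out the date and set arguments asks for all records available in the given metadata format.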

8 Bluffer’s guide to OAI
5. based on HTTP and XML
  – simple, Web-friendly, fast deployment
6. OAI-PMH is not a search protocol
  – but use can underpin search-based services based on Z39.50 or SRW or SOAP or…
7. OAI-PMH typically carries metadata
  – content (e.g. full-text or image) made available separately – typically at URL in metadata
8. mandates simple DC as record format
  – but extensible to any XML format – IEEE LOM, ONIX, MARC, METS, MPEG-21, etc.
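Points 5, 7 and 8 can be made concrete with a small parsing sketch. It reads a hand-written, abbreviated ListRecords response and pulls out each record's title and the content URL carried in `dc:identifier`. The namespaces are those of OAI-PMH 2.0 and oai_dc; the sample record, its URLs and the `extract_records` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, abbreviated response from a hypothetical data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-09-21T10:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2004-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
          <dc:identifier>http://example.org/eprints/1/paper.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def extract_records(xml_text):
    """Return the OAI identifier, title and content URLs of each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc_path = "oai:metadata/oai_dc:dc/"
        records.append({
            "oai_id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(dc_path + "dc:title", namespaces=NS),
            # The content itself is not in the response, only a pointer to it.
            "content_urls": [e.text for e in rec.findall(dc_path + "dc:identifier", NS)],
        })
    return records


records = extract_records(SAMPLE_RESPONSE)
print(records)
```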

9 Bluffer’s guide to OAI
9. metadata and ‘content’ often made freely available – but not a requirement
  – OAI-PMH can be used between closed groups
  – or, can make metadata available but restrict access to content in some way
10. underlying HTTP protocol provides
  – access control – e.g. HTTP BASIC
  – compression mechanisms (for improving performance of harvesters)
  – could, in theory, also provide encryption if required
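Point 10 relies on plain HTTP features rather than anything in OAI-PMH itself. A minimal sketch, assuming a harvester that authenticates with HTTP BASIC and accepts gzip-compressed responses: the URL, the credentials and the two helper names are hypothetical, and no request is actually sent.

```python
import base64
import gzip
from urllib.request import Request


def build_harvest_request(url, username, password):
    """Prepare a GET request with HTTP BASIC credentials and gzip support."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={
        "Authorization": "Basic " + token,  # access control (closed groups)
        "Accept-Encoding": "gzip",          # ask the server to compress
    })


def decode_body(body, content_encoding):
    """Undo gzip compression if the server applied it."""
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body


# Hypothetical endpoint and credentials, for illustration only.
req = build_harvest_request(
    "http://example.org/oai?verb=Identify", "harvester", "secret")
print(req.header_items())
```

BASIC credentials are only base64-encoded, not encrypted, so a closed-group deployment would normally pair this with HTTPS, which is where the encryption mentioned on the slide would come from.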

10 Dublin Core
OAI-PMH mandates use of simple DC as lowest common denominator
agreed XML schema – ‘oai_dc’
  – simple DC – 15 metadata properties
  – all DC properties optional and repeatable

Title        Contributor  Source
Creator      Date         Language
Subject      Type         Relation
Description  Format       Coverage
Publisher    Identifier   Rights
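To show what ‘optional and repeatable’ means in practice, the sketch below serialises a dictionary of simple DC values as an `oai_dc` fragment. The two namespace URIs are the standard oai_dc and DC element-set namespaces; the `build_oai_dc` helper and the sample values are assumptions made for the example, and the output is not validated against the oai_dc schema.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 simple DC properties listed on the slide.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_oai_dc(properties):
    """Serialise {property: [values...]} as an oai_dc XML fragment.

    Any property may be missing (optional) or carry several values
    (repeatable); names outside the 15 simple DC properties are rejected.
    """
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in properties.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple DC property: {name}")
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
xml_fragment = build_oai_dc({
    "title": ["An example eprint"],
    "creator": ["Author, A.", "Author, B."],  # repeated property
})
print(xml_fragment)
```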

11 OAI and Google
OAI gateway makes harvested metadata available to Google…
[Diagram labels: OAI gateway; eprint archive(s); HTTP; OAI-PMH]
Examples…
  – DSpace and Google
  – OAIster and Yahoo

12 Impact on institutions…
OAI-PMH technology provides an open, relatively stable technical framework
  – allows institution to re-consider management of intellectual output
  – greater confidence in availability of external services (e.g. discovery, access, analysis)
the technical bit is easy
  – eprints.org software (Southampton), DSpace (MIT/HP), Fedora
but, technical solutions are always easy!
  – real problem is cultural change required to get academics to deposit

13 Impact on libraries…
library is natural choice as ‘managing agent’ for the institutional repository
  – quality control
  – metadata enhancement
  – preservation
but libraries often weak technically (not always!)
therefore technical collaboration within institution may be required
beginning to see some evidence of externally ‘hosted’ repository services being offered

14 Impact on researchers…
OAI-PMH technology provides a ‘disruptive’ technical framework that supports
  – new ways for individual researcher to disclose his/her research output
  – development of new kinds of ‘research’ discovery services
can use ‘personal’ OAI repository
but, need to
  – clarify roles of institutional, discipline and personal repositories
  – overcome FUD – IPR, peer-review, ability to ‘publish’, quality control, inertia

15 Current activities/issues
protocol now stable and few changes being discussed
some lightweight noises about re-implementing OAI-PMH using SOAP (Web services) but little enthusiasm for pushing these kinds of changes forward
some work on OAI-rights issues – formalising mechanisms for attaching IPR statements and/or licences to the records being exchanged using the protocol, e.g. Creative Commons

16 Creative Commons
CC is “devoted to expanding the range of creative work available for others to build upon and share”
provides ‘standard’ licences for content
  – attribution
  – noncommercial
  – no derivative works
  – share alike
mechanisms for indicating licence on Web pages

17 Works vs. manifestations
implementers have tended to see ‘eprints’ as single-entity objects
some evidence that this is too simplistic
  – some repositories expose metadata about the ‘work’, others expose metadata about the ‘expressions’
need more consistency in our use of the OAI-PMH to expose metadata about both ‘works’ and ‘manifestations’
complex objects encoded using METS or MPEG-21 DIDL (may include ‘objects’ as well as ‘metadata about objects’)

18 Works vs. manifestations
[Diagram labels: work; manifestations; metadata about the work; metadata about manifestation 1; metadata about manifestation 2; oai_dc]

19 OAI and the SW
most metadata carried by the protocol currently is not RDF
not suitable for processing directly by semantic Web applications
need to build ‘knowledge’ about the structure of the metadata formats in use into the harvesting application
but could use the protocol to carry RDF/XML
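One way to picture the last point is to re-express simple DC values as RDF/XML, which a data provider could then offer as an additional metadata format. This is a sketch under stated assumptions rather than an established OAI convention: the `dc_to_rdf_xml` helper, the resource URI and the sample values are hypothetical, while the RDF and DC namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_to_rdf_xml(resource_uri, properties):
    """Describe one resource in RDF/XML using simple DC properties."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:about names the resource that the statements are about.
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": resource_uri})
    for name, values in properties.items():
        for value in values:
            ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource and values, for illustration only.
rdf_xml = dc_to_rdf_xml(
    "http://example.org/eprints/1",
    {"title": ["An example eprint"], "creator": ["Author, A."]})
print(rdf_xml)
```

Each property element then reads as a triple (resource, DC property, value), so a semantic Web application can process it without format-specific knowledge built into the harvester.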

20 Questions…