National Science Digital Library (NSDL) Core Infrastructure Metadata Repository (“union catalog”) Naomi Dushay Cornell University.



Aggregator Issues: Deleted Records
- indicated but transient
  - reharvested soon enough: no problem, mark our copy “deleted”
  - reharvested as “disappeared”
- not indicated
  - reharvested as “disappeared”
Solution? “Full reharvest”
- Mark all the site’s records in our repository “deleted”
- Do a full harvest
- Ingest each newly retrieved record into our repository, “un-deleting” if we overwrite an old record
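The “full reharvest” steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NSDL implementation: the repository is modeled as an in-memory dict keyed by a hypothetical `(site, identifier)` pair, and `StoredRecord` and `full_reharvest` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class StoredRecord:
    """Hypothetical local copy of a harvested record."""
    metadata: str
    deleted: bool = False


def full_reharvest(repository, site, harvested):
    """Apply the full-reharvest steps for one site.

    repository: dict mapping (site, identifier) -> StoredRecord
    harvested:  iterable of (identifier, metadata) from a complete harvest
    """
    # Step 1: mark every record we hold for this site as deleted.
    for (record_site, _identifier), record in repository.items():
        if record_site == site:
            record.deleted = True

    # Steps 2 and 3: ingest each freshly harvested record. Overwriting an
    # old entry "un-deletes" it; anything the site no longer exposes stays
    # marked deleted.
    for identifier, metadata in harvested:
        repository[(site, identifier)] = StoredRecord(metadata, deleted=False)

    return repository


repo = {
    ("siteA", "oai:a:1"): StoredRecord("old 1"),
    ("siteA", "oai:a:2"): StoredRecord("old 2"),
    ("siteB", "oai:b:1"): StoredRecord("b 1"),
}
full_reharvest(repo, "siteA", [("oai:a:1", "new 1"), ("oai:a:3", "new 3")])
```

After the call, `oai:a:1` is live with fresh metadata, `oai:a:2` remains marked deleted because the site no longer returned it, and the record from `siteB` is untouched.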

Aggregator Issues: Poor Quality Harvested Metadata
What is poor quality?
- OAI protocol problems
- XML problems
- metadata “content” problems
- … it’s a knowledge gap
Solutions?
- Clearer documentation
  - “OAI for Dummies” - details coming up
  - “XML for OAI Dummies” - details coming up
  - “Metadata for Dummies” - details coming up
- More, better self-test tools for sites …
  - error messages for “dummies”
  - stricter, more thorough OAI validation checking
  - more XML schema validation of metadata
- user friendly, extremely low entry
  - OAI static repository
- Normalize metadata locally
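As one illustration of a self-test with error messages for “dummies”, the sketch below checks only that a response is well-formed XML and reports the problem in plain language. It uses Python's standard `xml.etree.ElementTree`; the function name `check_well_formed` and the wording of the messages are assumptions made for this example. It does not perform XML schema validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return a plain-language verdict on whether xml_text parses as XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        line, column = err.position
        return (
            f"Not well-formed XML near line {line}, column {column}: {err}. "
            "Check for unclosed tags and for unescaped & or < characters."
        )
    return "OK: the response is well-formed XML."


print(check_well_formed("<record><title>Tom & Jerry</title></record>"))
print(check_well_formed("<record><title>Tom &amp; Jerry</title></record>"))
```

The first call reports a problem because of the bare `&`; the second passes. A fuller tool would go on to check the response against the OAI-PMH and metadata schemas.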

“OAI for Dummies”
- identifiers (OAI vs. DC; the need for persistence)
- datestamps ( vs. header vs. dc:date; format confusion)
- resumptionTokens (exclusive argument, stateless vs. stateful)
  - chunk size recommendation or rule of thumb
  - “stateless resumption token” general scheme for User Guidelines doc? (To be indicated via Identify response description?)
- about containers and their use (additional examples)
  - distinction between “about the metadata” and “about the resource” concepts (dc:rights vs. rights described in about)
- sets
- multiple metadata formats are allowed (many sites believe OAI means simple DC only)
  - MUST have valid XML schema
- Web service vs. flat file
- HTTP vs. HTML
We offer:
- Donna Bergmark’s OAI validation tool (contact me to get more info)
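The resumptionToken points can be made concrete with a small sketch. The slide leaves the “stateless resumption token” scheme open, so the encoding below (the original request arguments plus a cursor, serialized as JSON and base64-encoded) is only one hypothetical possibility, and `encode_token`, `decode_token`, and `list_records_args` are names invented for this example. The exclusivity check reflects the OAI-PMH rule that resumptionToken may not be combined with other arguments besides the verb.

```python
import base64
import json


def encode_token(request_args, cursor):
    """Pack everything needed to resume into the token itself (stateless)."""
    state = {"args": request_args, "cursor": cursor}
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the request arguments and cursor from a token."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


def list_records_args(resumption_token=None, **first_request_args):
    """Build ListRecords query arguments on the harvester side."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument.
        if first_request_args:
            raise ValueError(
                "badArgument: resumptionToken must be the only argument besides verb"
            )
        return {"verb": "ListRecords", "resumptionToken": resumption_token}
    if "metadataPrefix" not in first_request_args:
        raise ValueError("badArgument: metadataPrefix is required on the first request")
    return {"verb": "ListRecords", **first_request_args}


first = list_records_args(metadataPrefix="oai_dc", **{"from": "2003-01-01"})
token = encode_token({"metadataPrefix": "oai_dc", "from": "2003-01-01"}, cursor=200)
follow_up = list_records_args(resumption_token=token)
```

Because the token carries its own state, the server needs no session storage between requests; the trade-off is that the result set may shift underneath the cursor if records change during the harvest.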

“XML for OAI Dummies”
- encoding
  - XML encoding
  - character encoding (UTF-8, UTF-16, etc.)
  - URL encoding
  - XML vs. URL vs. character
- Namespaces
  - what are they for? how are they used?
  - full syntax explanation
    - declaration, prefix, URI, scope, default, missing …
- XML schemas
  - what are they for? how are they used?
  - xsi:schemaLocation
  - validation: what it will and won’t find
  - validators: what’s there, what’s best for “my” site?
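The three kinds of encoding are easy to confuse, so a short Python comparison may help. The sample title is invented for this example; the functions are from the Python standard library.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

title = "Café & Crêpes <draft>"

# XML encoding: escape characters that have special meaning in XML markup.
xml_escaped = escape(title)  # "Café &amp; Crêpes &lt;draft&gt;"

# Character encoding: how the characters are stored as bytes.
utf8_bytes = title.encode("utf-8")       # é and ê take two bytes each
utf16_bytes = title.encode("utf-16-le")  # every character here takes two bytes

# URL encoding: percent-escape bytes that are unsafe in a URL.
url_encoded = quote(title, safe="")  # "Caf%C3%A9%20%26%20Cr%C3%AApes%20%3Cdraft%3E"

print(xml_escaped)
print(len(title), len(utf8_bytes), len(utf16_bytes))
print(url_encoded)
```

All three are applied at different layers: character encoding when writing the bytes of the XML document, XML escaping inside element content, and URL encoding only for values placed in a request URL.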

“Metadata for Dummies”
- simple DC vs. qualified DC
- What refers to metadata, what refers to resource?
  - Think identifiers
  - Think rights
  - other …
We offer:
- Metadata Primer (currently being revised)
  - contact me to get URL

Normalize Metadata Locally
- Aim to improve services (e.g. search results)
- Improve quality when possible
  - Supply missing information, if known
    - site is about Math; add “Mathematics”
  - Correct wrong information, when possible
    - “text/pdf” → “application/pdf” in
- for further details, read our paper Analyzing Metadata for Effective Use and Re-use, submitted to DC 2003
  - contact me to get URL for draft
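The two normalization examples on this slide can be sketched as follows. This is an illustration under assumptions, not the NSDL code: a record is modeled as a dict mapping element names to lists of values, the MIME type is assumed to live in a `format` field and the topic in a `subject` field, and `FORMAT_FIXES` and `normalize_record` are names invented here.

```python
# Known-wrong values mapped to their corrections (from the slide's example).
FORMAT_FIXES = {"text/pdf": "application/pdf"}


def normalize_record(record, site_subjects=()):
    """Return a normalized copy of record; the harvested original is untouched."""
    normalized = {name: list(values) for name, values in record.items()}

    # Correct wrong information when we recognize it.
    if "format" in normalized:
        normalized["format"] = [
            FORMAT_FIXES.get(value.strip().lower(), value)
            for value in normalized["format"]
        ]

    # Supply missing information known about the whole site.
    subjects = normalized.setdefault("subject", [])
    for subject in site_subjects:
        if subject not in subjects:
            subjects.append(subject)

    return normalized


harvested = {"title": ["Intro to Topology"], "format": ["text/pdf"]}
cleaned = normalize_record(harvested, site_subjects=["Mathematics"])
print(cleaned)
```

Working on a copy keeps the record as harvested available alongside the normalized version, which is useful when a later reharvest needs to be compared against what the site actually sent.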