Improving Metadata Quality: Augmentation and Recombination
Diane I. Hillmann, Naomi Dushay, Jon Phipps
National Science Digital Library

Introduction
- Useful services depend on good metadata, but most metadata not very good
- Human created metadata is expensive
- Automated crawling strategies limited by:
  – Accessibility barriers (rights issues, technical issues)
  – Variability of crawling technologies for non-text
- Best metadata does not rely solely on information contained within the resource itself
  – Ex.: Controlled vocabularies, descriptions, links

The NSDL Environment
- Functions as a metadata aggregator
  – Simple, two-level hierarchy (Collections & items)
  – Based on OAI-PMH harvest model
  – Each harvested item associated with a collection
- Collection records managed via internal system that also drives automated harvest/ingest processes
  – Harvested records split into elements for storage and reassembled for output
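The split-and-reassemble step can be pictured with a minimal Python sketch. It is not the NSDL implementation: the row layout, the function names `split_record` and `reassemble`, and the identifiers `item-1` and `coll-1` are assumptions made for illustration. Only the `oai_dc` and Dublin Core namespaces are standard.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# A small oai_dc payload such as an OAI-PMH harvest might return.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Introduction to Surface Chemistry</dc:title>
  <dc:creator>Nix, Roger</dc:creator>
  <dc:type>Text</dc:type>
  <dc:format>text/html</dc:format>
</oai_dc:dc>"""


def split_record(xml_text, item_id, collection_id):
    """Split one harvested record into element-level rows (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:
        # Keep only Dublin Core children; strip the namespace from the tag.
        if child.tag.startswith("{" + DC_NS + "}"):
            rows.append({
                "item": item_id,
                "collection": collection_id,
                "element": child.tag.split("}", 1)[1],
                "value": (child.text or "").strip(),
            })
    return rows


def reassemble(rows):
    """Rebuild an oai_dc document from stored element rows."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for row in rows:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, row["element"]))
        child.text = row["value"]
    return ET.tostring(root, encoding="unicode")


rows = split_record(SAMPLE, "item-1", "coll-1")
rebuilt = reassemble(rows)
```

Storing rows rather than whole documents is what lets individual elements be cleaned or supplemented before output, which the later slides rely on.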

Why Transform Metadata at All?
- Four categories of problems associated with decreased user capability
  – Missing data: elements not present
  – Incorrect data: values not conforming to proper usage
  – Confusing data: embedded HTML tags, improper separation of multiple elements, etc.
  – Insufficient data: no indication of controlled vocabularies
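Two of these categories (missing and confusing data) lend themselves to mechanical checks. The sketch below is illustrative only: the `REQUIRED` element list, the problem labels, and the rule that a semicolon inside a subject value signals packed values are all assumptions, not rules stated in the presentation.

```python
import re

# Assumption: which elements count as required is a local policy choice.
REQUIRED = ("title", "identifier")

# Matches something that looks like an opening or closing HTML tag.
TAG_RE = re.compile(r"</?[a-zA-Z][^>]*>")


def diagnose(record):
    """Return sorted problem labels for a record given as {element: [values]}."""
    problems = set()
    for name in REQUIRED:
        # Missing data: element absent or present only with blank values.
        if not any(v.strip() for v in record.get(name, [])):
            problems.add("missing:" + name)
    for name, values in record.items():
        for value in values:
            # Confusing data: markup embedded in a value.
            if TAG_RE.search(value):
                problems.add("confusing:html-in-" + name)
            # Confusing data: several subjects packed into one element.
            if name == "subject" and ";" in value:
                problems.add("confusing:packed-" + name)
    return sorted(problems)


report = diagnose({
    "title": ["<b>Surface</b> Chemistry"],
    "subject": ["colloids; surface chemistry"],
})
```

Incorrect and insufficient data are harder to detect this way, since they depend on knowing the proper usage or the vocabulary a value was drawn from.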

Transforming Metadata “Safely”
- Enhance original data with no risk of degradation
- Provide low cost, scalable way to improve the quality and predictability of data
  – Remove “noise”: empty elements, useless values
  – Detect and identify controlled vocabularies: DCMIType and IMT values
  – Normalize presentation: clean up values, remove double XML encodings, extra whitespace, etc.
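A rough sketch of what such transforms could look like follows. The `NOISE_VALUES` set, the media-type pattern, and the `(element, value, scheme)` tuple shape are assumptions for illustration. The type list is the DCMI Type Vocabulary, and the sketch assumes values were already parsed from XML once, so any entity still present is treated as a double encoding.

```python
import html
import re

# Terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Assumption: a loose pattern for Internet Media Types (IMT).
IMT_RE = re.compile(
    r"^(application|audio|image|message|model|multipart|text|video)/[\w.+-]+$"
)

# Assumption: which placeholder values count as "useless" is a local choice.
NOISE_VALUES = {"", "n/a", "none", "unknown"}


def normalize_value(value):
    """Undo one level of leftover entity encoding and collapse whitespace."""
    value = html.unescape(value)
    return re.sub(r"\s+", " ", value).strip()


def safe_transform(statements):
    """Map (element, value) pairs to (element, cleaned value, scheme or None)."""
    result = []
    for element, value in statements:
        cleaned = normalize_value(value)
        if cleaned.lower() in NOISE_VALUES:
            continue  # drop empty or useless values
        scheme = None
        if element == "type" and cleaned in DCMI_TYPES:
            scheme = "dcterms:DCMIType"
        elif element == "format" and IMT_RE.match(cleaned.lower()):
            scheme = "dcterms:IMT"
        result.append((element, cleaned, scheme))
    return result


cleaned = safe_transform([
    ("title", "  Surface   Chemistry "),
    ("subject", ""),
    ("type", "Text"),
    ("format", "text/html"),
    ("creator", "Smith &amp; Jones"),
])
```

Each rule either removes a value that carries no information or adds a label to one that is otherwise unchanged, which is the sense in which the transform carries no risk of degradation.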

Replacing Safe Transforms with Metadata Augmentation
- Managing each "record" separately made automated maintenance and enhancement difficult
- Many sources of data required better definitions of “quality”
- “Augmentation” makes the knowledge and expertise of NSDL data managers available to consumers of the data

From Records to Elements
- Metadata record: “a series of statements about resources” which can be aggregated to build a more complete profile of a resource
- Statements come with source information, and links to detail about the service that created them
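One way to picture the statement model is the sketch below. The `Statement` class, the `build_profile` function, and the resource URL are hypothetical, and the attribution of each value to ASDL or iVia is illustrative rather than taken from the actual record.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, tagged with the service that made it."""
    resource: str
    element: str
    value: str
    source: str


def build_profile(statements, resource):
    """Aggregate statements from all sources into one profile per element."""
    profile = defaultdict(list)
    for s in statements:
        if s.resource != resource:
            continue
        entry = (s.value, s.source)
        # Keep every distinct (value, source) pair so provenance is never lost.
        if entry not in profile[s.element]:
            profile[s.element].append(entry)
    return dict(profile)


URL = "http://example.org/surface-chemistry"  # hypothetical resource URL
statements = [
    Statement(URL, "title", "An Introduction to Surface Chemistry", "ASDL"),
    Statement(URL, "subject", "colloids", "iVia"),
    Statement(URL, "subject", "surface chemistry", "iVia"),
    Statement("http://example.org/other", "title", "Unrelated", "ASDL"),
]
profile = build_profile(statements, URL)
```

Because each value keeps its source, a consumer can choose to trust, weight, or ignore statements by origin instead of receiving a single merged record.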

Exposing Quality Information
- Metadata statements vary in quality, and may be subjective
- Quality of statements can be determined by knowledge of the source, and knowledge of the methodology used to create it
- Detailed provenance itself is an indicator of quality metadata

Exposing Data to Downstream Users
- Two major issues:
  – Linking statements to particular harvested source records (including the datestamp of the harvest)
  – Linking records to the services that provided them (including descriptions of those services and the methods used to create the metadata)
- Required the creation and exposure of service records and a service vocabulary to categorize them
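The two links described above can be sketched as a pair of lookups. All class names, identifiers, and the datestamp below are hypothetical, and the service vocabulary holds only the two values that appear in the example records (collection and augmentation). The real vocabulary may be larger.

```python
from dataclasses import dataclass

# Assumption: only the two service types visible in the example records.
SERVICE_TYPES = {"collection", "augmentation"}


@dataclass(frozen=True)
class ServiceRecord:
    service_id: str
    name: str
    service_type: str
    description: str


@dataclass(frozen=True)
class SourceRecord:
    record_id: str
    service_id: str
    harvested: str  # datestamp of the harvest, ISO 8601


@dataclass(frozen=True)
class Statement:
    element: str
    value: str
    record_id: str


def provenance(statement, records, services):
    """Follow statement -> source record -> service and return the trail."""
    record = records[statement.record_id]
    service = services[record.service_id]
    if service.service_type not in SERVICE_TYPES:
        raise ValueError("unknown service type: " + service.service_type)
    return {
        "value": statement.value,
        "harvested": record.harvested,
        "service": service.name,
        "service_type": service.service_type,
    }


services = {
    "svc-ivia": ServiceRecord("svc-ivia", "iVia", "augmentation",
                              "Provides subject keywords and LCSH headings"),
}
records = {
    "rec-1": SourceRecord("rec-1", "svc-ivia", "2005-01-01T00:00:00Z"),
}
trail = provenance(Statement("subject", "colloids", "rec-1"), records, services)
```

A downstream user who receives the trail alongside the value can judge a statement by the method that produced it, not only by its content.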

Example metadata record (element markup lost in extraction):
Title: An Introduction to Surface Chemistry
Creator: Nix, Roger
Description: Theoretical and descriptive material for an introductory surface science course. Topics covered include structure of surfaces and detailed information on a variety of surface analytical techniques.
Type: Text
Format: text/html
Subject: colloids surface chemistry

oai:nsdl.org:316878:oai:asdlib.org:asdl T15:19:15Z

Analytical Sciences Digital Library (ASDL)
  The ASDL is an electronic library that collects, catalogs and links web-based information or discovery material...
  Service type: collection
iVia
  The iVia metadata augmentation service provides subject keyword and LCSH subject headings...
  Service type: augmentation

Conclusions
- New role for “metadata aggregators”: providing enhanced metadata for other services to re-use
  – Integrating fragmentary metadata created by automated services
  – Improving metadata in standard ways
  – Exposing all relevant data in ways that allow consumers to evaluate quality and usefulness