Lecture 12 Why metadata? CS 502 Computing Methods for Digital Libraries Cornell University – Computer Science Herbert Van de Sompel herbertv@cs.cornell.edu.



Notes
• Carl Lagoze on Wednesday
• No lab on Friday, but Paul Ginsparg on 04/03
• XML Schema & XSLT: later

Content – Data – Metadata
• Content refers to digital library materials as information that is of interest to a user.
• Data emphasizes the bits and bytes to be processed by a computer.
• Metadata: data about data.

Metadata – focus on description/discovery
• data about data
• origins in library cataloguing and A&I (abstracting and indexing) databases
• now: an amplification of traditional bibliographic cataloguing practices in an electronic environment
• now: any data used to aid the identification, description and location of networked electronic resources
• actually, it is more than that

Metadata – broader
• descriptive: facilitating resource discovery and identification (e.g., a record in an OPAC system)
• administrative: facilitating resource management within a collection (e.g., a loan record in an OPAC system)
• structural: binding together the components of complex information objects (e.g., the series title in a record in an OPAC system)
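The three categories above can be pictured as labels attached to individual metadata statements about one object. The following is a minimal sketch under that assumption; the class names, the identifier "demo:1" and all field values are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataType(Enum):
    DESCRIPTIVE = "descriptive"        # discovery and identification
    ADMINISTRATIVE = "administrative"  # management within a collection
    STRUCTURAL = "structural"          # binding components together


@dataclass
class MetadataStatement:
    kind: MetadataType
    name: str
    value: str


@dataclass
class DigitalObject:
    identifier: str
    metadata: list = field(default_factory=list)

    def select(self, kind):
        """Return the name/value pairs of one metadata category."""
        return {m.name: m.value for m in self.metadata if m.kind is kind}


# Hypothetical object: the values are made up for the example.
book = DigitalObject("demo:1", [
    MetadataStatement(MetadataType.DESCRIPTIVE, "title", "An example title"),
    MetadataStatement(MetadataType.ADMINISTRATIVE, "loan_status", "on loan"),
    MetadataStatement(MetadataType.STRUCTURAL, "series_title", "An example series"),
])

print(book.select(MetadataType.DESCRIPTIVE))
```

A discovery service would only need the descriptive subset, while a circulation system would read the administrative one.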

Metadata – evolution
[Figure. Labels: descriptive/discovery metadata for library objects; descriptive/discovery, administrative and structural metadata for library objects and networked resources.]

Metadata
• Traditionally stored separately from the objects that it describes.
• For digital objects, it is sometimes embedded in the objects (cf. KWF).
• Usually the metadata is a set of text fields.
• Textual metadata can be used to describe non-textual objects, e.g., software, images, music, …

Metadata – why?
• Some methods of information discovery search descriptive metadata about the objects.
• More generally, metadata enables digital library services, either explicitly (discovery metadata) or implicitly (terms and conditions).
• It helps to impose order on chaos.
• It enables automated discovery and manipulation of objects.

Metadata – generation (traditional)
[Figure. Labels: object, cataloguing rules, reference data, metadata record.]

Metadata – generation (traditional)
Advantage:
• Human expertise leads to high-quality catalogs and indexes
Disadvantages:
• Expensive ($50+ per record)
• Time consuming
• Requires cumbersome cataloguing rules
• Slow to adapt to new formats and types of digital objects
Human cataloguing and indexing is too expensive to apply to more than a small proportion of digital objects => automatic generation of metadata
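One very simple form of automatic generation is to harvest whatever a digital object already says about itself. The sketch below assumes the object is an HTML page and collects its title and its named meta tags using only the Python standard library; the sample page and the helper names are hypothetical, and real systems would go well beyond this.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name, content = attrs.get("name"), attrs.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html_text):
    """Return a flat attribute/value record derived from the page itself."""
    parser = MetadataExtractor()
    parser.feed(html_text)
    parser.close()
    record = dict(parser.meta)
    title = "".join(parser._title_parts).strip()
    if title:
        record["title"] = title
    return record


# Hypothetical sample page, made up for the example.
SAMPLE_PAGE = """<html><head>
<title>Why metadata?</title>
<meta name="author" content="Herbert Van de Sompel">
<meta name="keywords" content="metadata, digital libraries">
</head><body><p>Lecture notes.</p></body></html>"""

print(extract_metadata(SAMPLE_PAGE))
```

The result is cheap to produce but only as good as what the page author embedded, which is the trade-off against human cataloguing noted above.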

Metadata – roots (library cataloguing)
Anglo-American Cataloguing Rules (AACR2)
• rules for what goes into each field of a catalog record
MARC format
• an exchange format for catalog records
"MARC catalog"
• a catalog in MARC format, where the content of each field follows AACR2

Citation: a monograph -- book! Caroline R. Arms, editor, Campus strategies for libraries and electronic information. Bedford, MA: Digital Press, 1990.

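[Figure: MARC record for the citation above. Labels: MARC tags, MARC indicator, MARC subfield code, MARC subfield, MARC field.]

[Figure: the same record. Labels: ISBN; title statement; imprint (location, publisher, year); collation; series title.]

[Figure: the same record in exchange format. Labels: leader, directory, field terminator, 001 field.]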
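The labels on the preceding slides (leader, directory, field terminator, tags, indicators, subfield codes) can be made concrete with a small reader for the MARC exchange layout: a 24-character leader, a directory of 12-character entries (tag, field length, start offset), then the fields themselves. This is a minimal sketch, not a full MARC implementation: it handles ASCII only, the control number "demo0001" is made up, and the indicator values and punctuation of the sample fields are illustrative assumptions rather than the record shown in the lecture.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter, followed by a one-character subfield code


def field_body(indicators, subfields):
    """Build a variable data field: 2 indicators + delimited subfields."""
    body = indicators.encode("ascii")
    for code, value in subfields:
        body += SF + code.encode("ascii") + value.encode("ascii")
    return body


def build_record(fields):
    """fields: list of (tag, body bytes without terminator)."""
    directory, data = b"", b""
    for tag, body in fields:
        body += FT
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FT
    base = 24 + len(directory)          # base address of data
    total = base + len(data) + 1        # +1 for the record terminator
    leader = f"{total:05d}nam  22{base:05d}   4500".encode("ascii")
    return leader + directory + data + RT


def parse_record(raw):
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])
    directory = raw[24:base - 1]        # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length - 1]
        if tag < "010":                 # control fields: plain data only
            fields.append((tag, body.decode("ascii")))
        else:
            subfields = [(chunk[:1].decode("ascii"), chunk[1:].decode("ascii"))
                         for chunk in body[2:].split(SF)[1:]]
            fields.append((tag, (body[:2].decode("ascii"), subfields)))
    return leader, fields


RECORD = build_record([
    ("001", b"demo0001"),  # hypothetical control number
    ("245", field_body("00", [
        ("a", "Campus strategies for libraries and electronic information /"),
        ("c", "Caroline R. Arms, editor."),
    ])),
    ("260", field_body("  ", [
        ("a", "Bedford, MA :"), ("b", "Digital Press,"), ("c", "1990."),
    ])),
])

leader, fields = parse_record(RECORD)
for tag, content in fields:
    print(tag, content)
```

The offsets in the directory are what made the format workable on magnetic tape: a reader can jump to any field without scanning the whole record.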

MARC: the good news
A great achievement:
• Developed in the 1960s
• Magnetic tape exchange format for printing catalog records
• At the dawn of computing, it already offered: mixed upper and lower case; variable-length fields and repeated fields; non-Roman scripts
• 100(?) million records with standard content and format
• Thousands of trained librarians (millions?)

MARC: the bad news
A great problem:
• Not designed for computer algorithms
• One record per item (poor links between records)
• Tied to traditional materials and traditional practices
• Not Unicode
• 100 million records at $50+/record
A classic legacy system!

Metadata – simplicity/complexity
Variety of metadata formats for description/discovery:
• basic, proprietary records used in global Internet search services;
• simple attribute/value records such as the ROADS templates used in eLib subject services; unqualified Dublin Core (15 elements only);
• the more structured TEI and MARC formats; qualified Dublin Core;
• detailed formats such as CIMI and EAD, typically applied to museum and archival material.
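To give a feel for the simple end of this spectrum, the sketch below expresses the monograph citation from the earlier slide as an unqualified Dublin Core attribute/value record. The element names are those of the Dublin Core Metadata Element Set; the helper function and the choice to map the editor to "contributor" are assumptions made for the example.

```python
# Element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def make_dc_record(pairs):
    """Build a record from (element, value) pairs.

    Every element is optional and repeatable, so values are kept in lists.
    """
    record = {}
    for element, value in pairs:
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        record.setdefault(element, []).append(value)
    return record


citation = make_dc_record([
    ("title", "Campus strategies for libraries and electronic information"),
    ("contributor", "Arms, Caroline R."),  # editor; this mapping is an assumption
    ("publisher", "Digital Press"),
    ("date", "1990"),
    ("type", "Text"),
])

for element, values in citation.items():
    print(element, values)
```

Note what the simple format leaves out: the place of publication (Bedford, MA) has no obvious element of its own, whereas a MARC imprint field records it in a dedicated subfield.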

Metadata – one-size-fits-all/application profiles
There is an evolution from a “one size fits all” concept for metadata towards:
• the use of a specific format depending on the purpose;
• the co-existence of formats in relation to an object;
• combining metadata elements from various formats.
Choice of format can depend on:
• the functional purpose of the metadata: [description/discovery/location]; [administration]; [structuring]
• the level of detail required to fulfill the purpose
• the discipline/domain/audience of the objects that are described
• legacy issues
• interoperability requirements
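The idea of combining metadata elements from various formats can be sketched as a small profile that names which elements, from which format, a record may or must contain. Everything in this example is hypothetical: the "admin" and "struct" prefixes, the element names and the profile itself are invented for illustration and do not correspond to a published profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileElement:
    scheme: str            # format the element is borrowed from
    name: str
    required: bool = False

    @property
    def key(self):
        return f"{self.scheme}:{self.name}"


# Hypothetical profile mixing descriptive, administrative and structural elements.
PROFILE = [
    ProfileElement("dc", "title", required=True),
    ProfileElement("dc", "creator"),
    ProfileElement("admin", "rights_holder", required=True),
    ProfileElement("struct", "part_of"),
]


def validate(record, profile):
    """Return a list of problems; an empty list means the record conforms."""
    allowed = {e.key for e in profile}
    required = {e.key for e in profile if e.required}
    problems = [f"unexpected element: {k}" for k in record if k not in allowed]
    problems += [f"missing required element: {k}"
                 for k in sorted(required - set(record))]
    return problems


good = {"dc:title": "An example title", "admin:rights_holder": "An example owner"}
bad = {"dc:title": "An example title", "marc:245": "An example title"}

print(validate(good, PROFILE))
print(validate(bad, PROFILE))
```

The prefixes keep track of where each element comes from, which matters for the interoperability requirements listed above.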

Metadata – interoperability
[Figure: an "Internet Commons" shared by several communities. Labels: Commerce, Home Pages, Geo, Library, Museums, Scientific Data, Whatever...]

Metadata – descriptive/other
There is an evolution towards the creation of standards for non-discovery-related metadata formats:
• Preservation metadata [NedLib, CEDARS, …] (see the OCLC overview document: http://www.oclc.org/digitalpreservation/presmeta_wp.pdf)
• "Data Dictionary for Technical Metadata for Digital Still Images" (http://www.niso.org/pdfs/DataDict.pdf)
• Book e-commerce [ONIX]
• Resource administration: Circulation Interchange Protocol (NCIP) Standard, see http://www.niso.org/drafts/Z3982v1.html
• Electronic resources (cf. Adam Chandler)