UCLA School of Education & Information Visual Materials: Metadata, Standards, and Best Practices for Digital Libraries Howard Besser UCLA School of Education & Information http://www.gseis.ucla.edu/~howard
Metadata for Digital Libraries Models for Digital Libraries Importance of Metadata Standards Types and Uses of Metadata Discovery Metadata: The Dublin Core Administrative and Structural Metadata: The Making of America II Project Longevity Metadata Identification/Provenance The 4/99 NISO/DLF Image Metadata Workshop Various other Metadata
Key problems we’re facing Discovery Longevity Interoperability
Serious Longevity Problems What we know from prior widespread digital file formats Images separating from their metadata Inaccessibility of software needed to view an image Inability to even decode the file format of an image
Traditional Digital Library Model DL DL DL DL search & presentation search & presentation search & presentation search & presentation user user
Ideal Digital Library Model DL DL DL DL search & presentation user user
For Interoperability Digital Libraries Need Standards Discovery Metadata for finding Administrative Metadata for viewing and maintaining Structural Metadata for navigation ... IP Rights Management Metadata for controlling access...
Why are Standards and Metadata consensus important? Managing digital files over time Longevity Interoperability Veracity Recording in a consistent manner Will give vendors incentive to create applications that support this
Why Standards? Why do we need standards? To make information universally available to users To facilitate sharing and interchange of information To preserve information (make it safe from changes in hardware and software) Standards are the work of communities They are necessary so that communities can work.
Why are you Managing this Information? Organizational mission & type Users Uses
Questions to Ask What communities is this standard designed for? What type of information is this standard designed to handle? What functions is this standard designed to serve? What previous standards is it built upon? Does the standard prescribe how to create new records (or parts of records), or how to map from existing records? How far does the standard go? Semantics: Does it define element sets? Rules? Syntax?
What is Metadata Structured data describing other data used to find or help manage information resources Aids in interoperability Titles, dates, captions, cataloging and indexing data, file headers, rights info, provenance, code books, transaction logs, ... One person’s metadata is another’s data
Sorting through the Standards Morass Data Structures (DC, CDWA, MARC, VRA Core, TEI, EAD, MESL data dict) Data Interchange (Z39.50) Data Values/vocabularies (LCSH, AAT, ULAN, TGN) Data Content/syntax (AACR2)
Semantics/Syntax/Structure Semantics meaning, as defined by a community to meet their particular needs (DC) Syntax a systematic arrangement of data elements for machine processing facilitates the exchange and use of metadata among various applications (HTML, XML, RDF) Structure a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
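The date 1/11/99 in the slide above shows why rules for field contents matter: the same string means different days under different conventions. The following is a minimal sketch of such a rule in Python. The function name `normalize_date` and the convention labels `month_first` and `day_first` are hypothetical, chosen only for illustration.

```python
from datetime import datetime

# Hypothetical labels for two common conventions for writing short dates.
FORMATS = {
    "month_first": "%m/%d/%y",  # 1/11/99 read as January 11, 1999
    "day_first": "%d/%m/%y",    # 1/11/99 read as 1 November 1999
}

def normalize_date(value, convention):
    """Parse a short date under a stated convention and return ISO 8601 text."""
    parsed = datetime.strptime(value, FORMATS[convention])
    return parsed.date().isoformat()

print(normalize_date("1/11/99", "month_first"))  # 1999-01-11
print(normalize_date("1/11/99", "day_first"))    # 1999-11-01
```

Without an agreed rule, two systems can hold the identical field value and still disagree about what it means.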
What is Metadata Types & Uses lots of different ways of dividing the clusters
Uses of Metadata Discovery & Retrieval Identification/Provenance Rights Management Viewing Integrity Longevity Content rating
Types of Metadata Descriptive Discovery & Retrieval Structural Administrative Intellectual Other Metadata
Metadata -- Detailed Types Identification metadata Instance or Fixation metadata Source image metadata Content metadata Subject metadata Form and format metadata Context metadata Structure metadata Relationships metadata Terms & Conditions metadata Use history metadata
Containers and Packages of Metadata Warwick, not MARC modular overlapping extensible community-based designed for a networked world to aid commonality btwn communities while still providing full functionality within each community
Some different schemes where Metadata is kept embedded within the object (HTML tags) in a separate related DB maintained by same organization (OPAC, MOA II) in a separate DB maintained by a separate organization (Books in Print, ratings systems) derived on-the-fly from a different scheme (MARC-to-DC)
Some Standards/Metadata Efforts Dublin Core Visual Resources Association (VRA) Core Encoded Archival Description (EAD) Computerized Interchange of Museum Information (CIMI) Records Export for Art and Cultural Heritage (REACH)
Dublin Core (3/95) improve resource discovery anticipate precision problems of Web Crawler-based searching tools existing metadata could be “dumbed down” elements should be simple to understand and use, so that any individual should be able to assign terms him/herself software might eventually automatically generate very base-level metadata
Dublin Core Title Creator Subject Description Publisher Contributors Date Type Format Identifier Source Language Relation Coverage Rights
Dublin Core every element is both optional and repeatable elements are cross-disciplinary elements are extensible by organized communities can employ a syntax such as html’s <META> tagset
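As a minimal sketch of the `<META>` approach mentioned above, the following Python renders a Dublin Core record as HTML meta tags. The `DC.` name prefix reflects a common convention for embedding Dublin Core in HTML, assumed here rather than taken from the slides. The function name `dc_to_meta` and the sample record are hypothetical.

```python
from html import escape

# The fifteen element names as listed on the Dublin Core slide.
DC_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributors", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]

def dc_to_meta(record):
    """Render a dict of element name -> list of values as <META> tags.

    Every element is optional (missing keys are skipped) and
    repeatable (each value gets its own tag).
    """
    lines = []
    for element in DC_ELEMENTS:
        for value in record.get(element, []):
            lines.append(
                '<META NAME="DC.%s" CONTENT="%s">'
                % (element, escape(value, quote=True))
            )
    return "\n".join(lines)

# Hypothetical sample record, for illustration only.
sample = {
    "Title": ["Half Dome at Sunset"],
    "Subject": ["Yosemite", "Landscape photography"],
}
print(dc_to_meta(sample))
```

The record here carries one title and two subjects and omits everything else, which is valid because no element is required.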
DC Qualifiers allows one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under “title”
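The "dumbing down" of qualified elements can be sketched as follows. This is an illustration under stated assumptions, not a defined Dublin Core syntax: the dotted `Element.Qualifier` key notation, the qualifier names, and the function name `dumb_down` are all hypothetical.

```python
def dumb_down(qualified_record):
    """Collapse 'Element.Qualifier' keys to the bare element name.

    Values from every qualified form are merged under the base element,
    so a simple search on the base element still finds all of them.
    """
    simple = {}
    for key, values in qualified_record.items():
        base = key.split(".", 1)[0]  # "Title.Main" -> "Title"
        simple.setdefault(base, []).extend(values)
    return simple

# Hypothetical record from a community that distinguishes kinds of title.
record = {
    "Title.Main": ["War and Peace"],
    "Title.Transliterated": ["Voina i mir"],
    "Title.Alternative": ["War & Peace"],
    "Creator": ["Tolstoy, Leo"],
}

simple = dumb_down(record)
print(simple["Title"])
```

The specialized community keeps its distinctions in the qualified record, while a simple search tool only needs to look at the flattened `Title` list.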
Discovery Metadata: Recent History Dublin Core (3/95) Warwick Framework (4/96) Image Metadata Workshop (9/96) Canberra, Helsinki, ... DC (98) Digital Library Collaboratory (97-) DC-8, Frankfurt 10/99
Dublin Core--further work Warwick Framework metadata packages for extensible functions laid groundwork for RDF Canberra Qualifiers refining the semantics of the element set to provide more precise info SUBELEMENT, SCHEME, LANG Granularity no hierarchical relationships w/i a given DC record; only one record per discrete object (collection or item-level), and relationship field plus qualifier links them
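The granularity rule above (flat records, one per discrete object, linked through the relation field plus a qualifier) can be sketched like this. The qualifier name `IsPartOf`, the dotted key notation, the identifiers, and the function name `members_of` are assumptions made for illustration.

```python
# One flat record per discrete object: one collection, two items.
# No record nests inside another; the link lives in the Relation field.
records = {
    "coll-1": {"Title": ["Sample Photograph Collection"]},
    "img-1": {"Title": ["Photograph 1"], "Relation.IsPartOf": ["coll-1"]},
    "img-2": {"Title": ["Photograph 2"], "Relation.IsPartOf": ["coll-1"]},
}

def members_of(all_records, collection_id):
    """Return the ids of records whose qualified Relation points at the collection."""
    return sorted(
        record_id
        for record_id, record in all_records.items()
        if collection_id in record.get("Relation.IsPartOf", [])
    )

print(members_of(records, "coll-1"))
```

The hierarchy is reconstructed by following links between records at query time, not by storing item descriptions inside the collection record.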
The Research Process and Functional Categories of Metadata Discovery Retrieval Collation Analysis Re-presentation
Making of America II Background of the DLF Project Administrative Metadata Structural Metadata
Other Types of Metadata Longevity Identification/Provenance Rights Management
The Short Life of Digital Info: Digital Longevity Problems Disappearing Information The Viewing Problem The Scrambling Problem The Inter-relation Problem The Custodial Problem The Translation Problem
Identification/Provenance (Images) The number of variant forms of a work can be enormous Image Families A digital image frequently has many layers of parentage Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation") how to deal with different versions derived from the same scan or different encoding schemes Vocabulary Standards to express this
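The layers of parentage described above can be sketched as a chain of "source" links that is followed from a derivative back to the original object. The image family, the field name `source`, and the function name `lineage` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical image family: each entry names the thing it was derived from.
images = {
    "painting": {"source": None},            # the original object
    "slide": {"source": "painting"},         # photographic surrogate
    "tiff-master": {"source": "slide"},      # scan of the slide
    "jpeg-access": {"source": "tiff-master"},
    "thumbnail": {"source": "jpeg-access"},
}

def lineage(family, image_id):
    """Follow source links from a derivative back to the original."""
    chain = []
    seen = set()
    current = image_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in source links at %r" % current)
        seen.add(current)
        chain.append(current)
        current = family[current]["source"]
    return chain

print(lineage(images, "thumbnail"))
```

The length and content of the chain is the kind of parentage information that can help a user judge how far an image is from the original and how much to trust it.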
NISO/DLF Image Metadata Workshop Possible Goals Metadata fields Rules for Field Contents (authority control) Core set of necessary fields Syntax for expressing fields and contents (headers)
Other Metadata Description of depiction/surrogate (What VRA calls its "Surrogate Categories") Description of original object Rights and Reproduction Information Location Information
Data Structures: The VRA Core 28 elements specifically for visual resource collections Work Description Categories Visual Document Description Categories http://www.oberlin.edu/~art/vra/dsc.html
Data Value Metadata (vocabularies) LCSH TGM AAT ULAN TGN VRA Core
Metadata for Digital Commerce DOI <indecs>
Metadata Mapping Crosswalks Resource Description Framework (RDF)
Metadata Philosophies Minimalists vs. Structuralists From Pidgin to Creole (add structure and tenses)
Collaborative Metadata Projects OCLC CORC Project Computerized Interchange of Museum Information (CIMI)
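A crosswalk is essentially a lookup table from the fields of one scheme to the elements of another. The sketch below maps a handful of MARC tags to Dublin Core elements. The table is a simplified, illustrative subset (for example, MARC 260 holds place and date as well as publisher), and the function name `crosswalk` and the sample fields are hypothetical.

```python
# Simplified, illustrative subset of a MARC-to-Dublin-Core mapping.
MARC_TO_DC = {
    "100": "Creator",      # main entry, personal name
    "245": "Title",        # title statement
    "260": "Publisher",    # publication information (simplified)
    "520": "Description",  # summary note
    "650": "Subject",      # topical subject heading
}

def crosswalk(marc_fields):
    """Map (tag, value) pairs to a Dublin Core record.

    Returns the DC record and a list of fields that had no mapping,
    so that information lost in translation is visible.
    """
    dc_record = {}
    unmapped = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append((tag, value))
        else:
            dc_record.setdefault(element, []).append(value)
    return dc_record, unmapped

# Hypothetical input, for illustration only.
fields = [
    ("245", "Introduction to Metadata"),
    ("650", "Metadata"),
    ("650", "Digital libraries"),
    ("300", "1 v."),
]
dc_record, unmapped = crosswalk(fields)
print(dc_record)
print(unmapped)
```

Returning the unmapped fields makes the cost of mapping to a simpler scheme explicit: the richer record carries details the target has no place for.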
Baca, Murtha (ed). Introduction to Metadata, Los Angeles: Getty Information Institute, 1998 http://www.gseis.ucla.edu/~howard/image-meta.html http://www.gseis.ucla.edu/~howard http://sunsite.Berkeley.EDU/Imaging/Databases/#standards http://sunsite.Berkeley.EDU/moa2/ http://sunsite.Berkeley.EDU/Longevity/ http://www.gii.getty.edu/timeandbits/ http://www.nlc-bnc.ca/ifla/II/metadata.htm http://purl.oclc.org/metadata/dublin_core/ http://purl.oclc.org/corc// http://lcweb.loc.gov/ead/ http://www.cimi.org/