METS: Implementing a metadata standard in the digital library
  Richard Gartner, Oxford University Library Services
The digital library: a status report
  Digitisation technology is now well established and well understood
  Standards for digitisation processes have settled down and are widely recognised
  There is still a disparity in approaches to metadata: the digital library has no equivalent of the MARC standard
Approaches to metadata: some examples from Oxford
  Ad-hoc databases, e.g. Allegro
  SGML, including:
    TEI alone
    TEI + EAD
    Ad-hoc DTDs
  Proprietary databases, e.g. Olive
The lack of a standard: what it means for the digital library
  poor cross-searching
  limited interchange facilities
  metadata tied to proprietary packages
  consequent obsolescence and costs of conversion
What is needed?
  A standard for metadata content: analogous to AACR2
  A standardised framework for holding and exchanging metadata: analogous to the MARC record
Three types of metadata (defined by DLF)
  Descriptive: information about intellectual content (analogous to a standard catalogue record)
  Administrative: information for handling, maintenance and archiving of the object
  Structural: description of the internal structure of the object
METS: Metadata Encoding and Transmission Standard
  Produced by Library of Congress Standards Office and Digital Library Federation
  Provides a framework for holding all types of metadata for a digital object
  Written in XML
  Does not prescribe the content of metadata, but recommends a number of schemes for this
Why XML?
  An open standard (a W3C Recommendation, derived from the ISO standard SGML), not dependent on any given application
  Interchangeability with other applications
  Handles structural metadata easily
  Easy to integrate cataloguing information with text transcription, images etc.
Features of a METS file
  All metadata (descriptive, administrative and structural) is encoded in a single document
  Each type is held in a separate section, linked by identifiers
  All metadata and external data (e.g. images, text, video) is either referenced from the METS file or can be held internally
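To illustrate how sections can be linked by identifiers, here is a minimal Python sketch that follows the pointers from the structural map to the file inventory. The element names (fileSec, file, FLocat, structMap, div, fptr) come from the METS schema; the sample document, its IDs and the example.org URL are invented for illustration, and the fragment is not claimed to be a complete, valid METS file.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical sample: one page whose fptr points at one file by ID.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="file-1" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="page" LABEL="page 1">
      <mets:fptr FILEID="file-1"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def resolve_files(root):
    """Map each div LABEL to the URLs of the files its fptr elements point to."""
    # Index every file element in the inventory by its ID attribute.
    files = {f.get("ID"): f for f in root.iterfind(".//mets:file", NS)}
    href = "{%s}href" % NS["xlink"]
    resolved = {}
    for div in root.iterfind(".//mets:div", NS):
        urls = []
        for fptr in div.findall("mets:fptr", NS):
            # FILEID on the fptr matches ID on a file in the fileSec.
            flocat = files[fptr.get("FILEID")].find("mets:FLocat", NS)
            urls.append(flocat.get(href))
        resolved[div.get("LABEL")] = urls
    return resolved


pages = resolve_files(ET.fromstring(SAMPLE))
```

The same lookup pattern would apply to other ID-based links in a METS file, such as a div pointing at its descriptive metadata section.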
The structure of a METS file
  METS
    dmdSec       descriptive metadata
    amdSec       administrative metadata
    behaviorSec  behaviour metadata
    structMap    structural map
    fileSec      file inventory
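The five sections can be sketched as an empty skeleton built with Python's standard library. Note that the METS schema names the administrative section amdSec. The section IDs used here ("-dmd-0001", "-amd-0001") are hypothetical, and the skeleton is only an outline: a valid METS file needs content in these sections.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name of a METS element."""
    return "{%s}%s" % (METS_NS, tag)


def build_skeleton(object_id):
    """Build an empty METS outline with the five sections in schema order."""
    root = ET.Element(q("mets"), {"OBJID": object_id})
    # Assumption: the ID suffixes below are invented for this sketch.
    ET.SubElement(root, q("dmdSec"), {"ID": object_id + "-dmd-0001"})
    ET.SubElement(root, q("amdSec"), {"ID": object_id + "-amd-0001"})
    ET.SubElement(root, q("fileSec"))
    ET.SubElement(root, q("structMap"))
    ET.SubElement(root, q("behaviorSec"))
    return root


skeleton = build_skeleton("munahi010-aaa")
xml_text = ET.tostring(skeleton, encoding="unicode")
```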
The structure of a METS file: structMap
  [Diagram: the structMap contains nested div elements; a div can hold fptr elements that point to files in the fileSec]
[Example structural map for a book]
  Title Page: title page
  Preface: page i, page ii
  Chapter 1: page 1, page 2, page 3, page 4, page 5, page 6
  Chapter 2: page 7
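The book example above could be encoded as nested div elements along the following lines. This is a sketch only: the grouping of pages under sections is read off the slide, and the TYPE values and the "file-NNNN" identifiers are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)
DIV = "{%s}div" % METS_NS
FPTR = "{%s}fptr" % METS_NS

# Sections of the example book and the pages each one contains.
BOOK = [
    ("Title Page", ["title page"]),
    ("Preface", ["page i", "page ii"]),
    ("Chapter 1", ["page %d" % n for n in range(1, 7)]),
    ("Chapter 2", ["page 7"]),
]


def build_struct_map(sections):
    """Build a structMap: book div > section divs > page divs with fptr."""
    struct_map = ET.Element("{%s}structMap" % METS_NS, {"TYPE": "logical"})
    book = ET.SubElement(struct_map, DIV, {"TYPE": "book"})
    order = 0
    for label, pages in sections:
        section = ET.SubElement(book, DIV, {"TYPE": "section", "LABEL": label})
        for page_label in pages:
            order += 1
            page = ET.SubElement(
                section, DIV,
                {"TYPE": "page", "LABEL": page_label, "ORDER": str(order)},
            )
            # Hypothetical file ID; it must match a file ID in the fileSec.
            ET.SubElement(page, FPTR, {"FILEID": "file-%04d" % order})
    return struct_map


struct_map = build_struct_map(BOOK)
```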
The structure of a METS file: fileSec
  [Diagram: the fileSec contains fileGrp elements, each of which groups file elements]
[Example file group]
  Page 1
    image 1 (thumbnail)
    image 1 (master)
    image 1 (delivery)
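A file group of this kind, one page with thumbnail, master and delivery images, might be generated as follows. The fileGrp ID follows the "-fgrp-0001" pattern used in the project; the file ID pattern, the image formats and the URL layout are hypothetical choices for this sketch.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

# Assumption: (use, file extension, MIME type) for each version of a page image.
VARIANTS = [
    ("thumbnail", "gif", "image/gif"),
    ("master", "tif", "image/tiff"),
    ("delivery", "jpg", "image/jpeg"),
]


def build_file_group(item_id, page_no, base_url):
    """Build one fileGrp holding every version of a single page image."""
    group = ET.Element(
        "{%s}fileGrp" % METS_NS,
        {"ID": "%s-fgrp-%04d" % (item_id, page_no)},
    )
    for use, ext, mime in VARIANTS:
        file_el = ET.SubElement(
            group, "{%s}file" % METS_NS,
            # Hypothetical file ID pattern, not taken from the project.
            {"ID": "%s-%04d-%s" % (item_id, page_no, use),
             "MIMETYPE": mime, "USE": use},
        )
        ET.SubElement(
            file_el, "{%s}FLocat" % METS_NS,
            {"LOCTYPE": "URL",
             "{%s}href" % XLINK_NS: "%s/%s/%04d.%s" % (base_url, use, page_no, ext)},
        )
    return group


page_one = build_file_group("munahi010-aaa", 1, "http://example.org/images")
```

Grouping by page is only one option; the same files could equally be grouped by type, with one fileGrp for all thumbnails, one for all masters and so on.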
Descriptive and administrative metadata
  Descriptive and administrative metadata may be handled in two ways:
    embedded directly within the METS file, inside an mdWrap element (using any namespace)
    held in an external file and referenced from the METS file using an mdRef element
The structure of a METS file: mdWrap
  [Diagram: a dmdSec or amdSec containing an mdWrap element, which can wrap any XML metadata]
Études sur les glaciers atlas Agassiz, Louis, Glaciers Plates accompanying a study of glaciers by 19th century glaciologist Louis Agassiz Dessinés d'après nature et lithographiés par Jph. Bettannier Neuchâtel, Lithographie de H. Nicolet Neuchâtel (Switzerland): Jent et Gassmann Bettannier, Joseph 1840 Image.Graphic.Map 480 x plates: ill. OUM:E. 52 fre Alps ODL:munahi010-aaa
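Embedding a record like this one with mdWrap could look roughly as follows. The sketch uses unqualified Dublin Core for brevity; which value belongs to which Dublin Core element is an assumption, since the element names are not visible in the record above, and the dmdSec ID is hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)

# Assumption: a guess at how some of the values map to DC elements.
RECORD = [
    ("title", "Études sur les glaciers"),
    ("creator", "Agassiz, Louis"),
    ("subject", "Glaciers"),
    ("date", "1840"),
    ("language", "fre"),
    ("identifier", "ODL:munahi010-aaa"),
]


def wrap_dublin_core(dmd_id, fields):
    """Embed Dublin Core elements in a dmdSec via mdWrap and xmlData."""
    dmd_sec = ET.Element("{%s}dmdSec" % METS_NS, {"ID": dmd_id})
    md_wrap = ET.SubElement(dmd_sec, "{%s}mdWrap" % METS_NS, {"MDTYPE": "DC"})
    xml_data = ET.SubElement(md_wrap, "{%s}xmlData" % METS_NS)
    for name, value in fields:
        # Each DC element lives in its own namespace inside xmlData.
        ET.SubElement(xml_data, "{%s}%s" % (DC_NS, name)).text = value
    return dmd_sec


# Hypothetical section ID for the example item.
dmd = wrap_dublin_core("munahi010-aaa-dmd-0001", RECORD)
```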
The structure of a METS file: mdRef
  [Diagram: a dmdSec or amdSec containing an mdRef element, a reference to an external file containing the metadata]
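Referencing an external metadata file with mdRef might be sketched like this. The LOCTYPE, MDTYPE and xlink:href attributes are defined by the METS schema; the section ID and the example.org URL are placeholders invented for the sketch.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def reference_metadata(dmd_id, url, md_type="MODS"):
    """Build a dmdSec that points at an external metadata file via mdRef."""
    dmd_sec = ET.Element("{%s}dmdSec" % METS_NS, {"ID": dmd_id})
    ET.SubElement(
        dmd_sec, "{%s}mdRef" % METS_NS,
        {"LOCTYPE": "URL",            # the location is given as a URL
         "MDTYPE": md_type,           # the scheme used by the external record
         "{%s}href" % XLINK_NS: url},
    )
    return dmd_sec


# Hypothetical ID and URL, for illustration only.
external = reference_metadata(
    "munahi010-aaa-dmd-0001",
    "http://example.org/metadata/munahi010-aaa.xml",
)
```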
IDs and METS
  All components of a METS file need to be identified with a logical (and easily generated) set of identifiers
    Project ID          munahi010
    Item ID             munahi010-aaa
    Technical metadata  munahi010-aaa-tmd-0001
    File groups         munahi010-aaa-fgrp-0001
    File IDs            munahi010-aaa
    divs                munahi010-aaa-div.1
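Identifiers following these patterns are straightforward to generate. A minimal sketch, with hypothetical function names; the dotted form for nested divs (e.g. "div.1.2") is an assumption extrapolated from the single "div.1" example.

```python
def item_id(project_id, item_code):
    """Join a project ID and an item code, e.g. munahi010 + aaa."""
    return "%s-%s" % (project_id, item_code)


def component_id(item, kind, number):
    """ID for a numbered component such as 'tmd' or 'fgrp', zero-padded to 4 digits."""
    return "%s-%s-%04d" % (item, kind, number)


def div_id(item, path):
    """ID for a div, given its position in the structural map.

    Assumption: nested divs append further dotted numbers (div.1.2).
    """
    return "%s-div.%s" % (item, ".".join(str(n) for n in path))


item = item_id("munahi010", "aaa")
technical = component_id(item, "tmd", 1)
file_group = component_id(item, "fgrp", 1)
first_div = div_id(item, [1])
```

Because each ID starts with a letter and uses only letters, digits, hyphens and full stops, the results are valid XML ID values.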
What to put in a METS file?
  METS does not prescribe the content (particularly the descriptive metadata) which it can contain
  However, the METS board does endorse some schemas as recommended:
    Descriptive Metadata
      Dublin Core
      MODS (Metadata Object Description Schema)
      MARCXML MARC 21 Schema (MARCXML)
    Administrative Metadata
      Schema for Technical Metadata for Text (NYU)
      Library of Congress Audio-Visual Prototyping Project
      NISO Technical Metadata for Digital Still Images
      Schema for Rights Declaration
METS Profiles
  METS is very flexible in its application; there are multiple ways of encoding everything:
    metadata and data can be embedded or referenced
    any scheme can be used for this metadata
    the file inventory can be organised in multiple ways (by referenced object, by type of file etc.)
  This all reduces the interchangeability of METS records.
METS Profiles
  This can be countered to some extent by METS Profiles:
    XML documents describing the application of METS in a given project or institution
    each profile follows the METS Profile schema and has to validate against it
    profiles are registered with a central repository at the Library of Congress
  But profiles do not allow automated cross-mapping of METS files: this remains to be explored
METS in action
  Oxford Digital Library: a collection of collections of material held in Oxford libraries
    METS files are generated by an automated webform-based cataloguing system
    descriptive metadata is qualified Dublin Core following strict cataloguing guidelines (intended to map to AACR2); a move to MODS is being investigated
    METS files are easily converted to the formats of digital library systems (currently investigating Greenstone)