November 22, 2003, DASER Conference. Copyright MIT
METS: Metadata Encoding & Transmission Standard
Part One: Problem definition
Digital (Library) Objects
  Reformatted to digital
    – scanned photographs, books and journals
    – digitized audio/video files
  Born digital
    – TEI-encoded texts
    – digital images, audio, video files
    – GIS, statistical datasets
    – interactive content
Digital (Library) Objects
  Simple Objects
    – single files, e.g.
        visual TIFF images
        MP3 files
        TEI-encoded text
    – objects stand alone
        no relationships to other objects
Digital (Library) Objects
  Complex Objects
    – multiple related files, e.g.
        page images from books or articles
        multiple channels in digital audio files
        related sound and text files (multimedia)
        statistical dataset and codebook
    – objects cannot stand alone
        multiple files required to interpret the object
        requires structural metadata to model
Structural metadata
  Maps physical files (digital assets) to logical items (complex digital objects)
  Examples
    – Scanned print material
        complex publication structures (e.g. journal runs)
        ordered relationship between digital page images
    – A/V material
        multiple resolutions of an image
        multiple channels of an audio file
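The mapping from physical files to logical items can be pictured with a small data model. This is only an illustrative sketch: the class names `Asset` and `LogicalItem`, the function `reading_order`, and the file names are hypothetical and are not part of METS.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A physical file (digital asset). Hypothetical helper type."""
    file_id: str
    path: str


@dataclass
class LogicalItem:
    """A node in the logical structure (article, page, track, ...)."""
    label: str
    assets: list = field(default_factory=list)
    children: list = field(default_factory=list)


def reading_order(item):
    """Walk the logical tree depth-first and return file paths in order."""
    paths = [asset.path for asset in item.assets]
    for child in item.children:
        paths.extend(reading_order(child))
    return paths


# Assumed example: a two-page scanned article, one TIFF per page.
article = LogicalItem("Article", children=[
    LogicalItem("Page 1", assets=[Asset("f1", "page0001.tif")]),
    LogicalItem("Page 2", assets=[Asset("f2", "page0002.tif")]),
])
print(reading_order(article))
```

The ordered relationship between page images lives in the tree, not in the file names, which is the point of keeping structural metadata separate from the files themselves.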
Structural metadata
  Examples, continued
    – Multimedia presentations
        relationship between images, text, sound, video, etc. (time-based or other)
    – Web sites
        linkages between web pages
        sitemaps
    – Databases
        table models and ER diagrams
Digital (Library) Objects
  Also have other (non-structural) metadata
    – descriptive
        MARC, DC, FGDC, VRA core, other ontologies
    – administrative
        rights, provenance
    – technical
        format details, OAIS representation information
  Standards exist or are emerging for these
Part Two: Introduction to METS
METS Scope
  Supports
    – Structural metadata
        complex reformatted or born-digital objects
    – Metadata wrapper framework
        descriptive, administrative, structural, etc.
        structural required
        others use namespaces to reference extension schemas
Brief History
  Making of America II project
    – Funded by DLF and NEH
    – Included Berkeley, Cornell, NYPL, Penn State, Stanford, U of Michigan
    – Designed for scanned archival collections
    – SGML DTD included pre-defined descriptive, administrative, structural metadata
  February 2001 DLF workshop on structural metadata produced the METS framework
METS metadata buckets
  METS Header (optional)
  Administrative metadata (optional)
  File Inventory (optional)
  Structure map (required)
  Descriptive metadata (optional)
  Behavioral metadata (optional)
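As a rough illustration of how these buckets sit inside one XML document, the sketch below builds an empty shell with Python's standard `xml.etree.ElementTree`. The element names follow the published METS schema; the function name `build_skeleton`, the IDs, and the label are assumptions, and the result is a placeholder that has not been validated against the schema (the optional buckets are left empty here).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(name):
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_skeleton(label):
    """Create a METS shell with one element per metadata bucket."""
    root = ET.Element(q("mets"), {"LABEL": label})
    ET.SubElement(root, q("metsHdr"))                # METS Header
    ET.SubElement(root, q("dmdSec"), {"ID": "dmd1"})  # descriptive metadata
    ET.SubElement(root, q("amdSec"), {"ID": "amd1"})  # administrative metadata
    ET.SubElement(root, q("fileSec"))                # file inventory
    struct_map = ET.SubElement(root, q("structMap"))  # the required bucket
    ET.SubElement(struct_map, q("div"), {"LABEL": label})
    ET.SubElement(root, q("behaviorSec"))            # behavioral metadata
    return root


skeleton = build_skeleton("Example object")
section_names = [child.tag.split("}")[1] for child in skeleton]
print(section_names)
```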
METS metadata
  XML extension schemas
    – descriptive metadata
        Dublin Core, MARC, FGDC, VRA, etc.
        Berkeley's GDM schema (from MOA2)
    – administrative/technical metadata
        NISO image technical metadata
        LC schemas for A/V technical metadata
        Rights metadata (e.g. PRISM, XrML, etc.)
        Provenance metadata
METS Descriptive Metadata Section
  Metadata Reference (mdRef): A link to external descriptive metadata. The type of link (URN/Handle/etc.) is included as an attribute, as is the metadata type.
  Metadata Wrapper (mdWrap): Included descriptive metadata, as either binary data (Base64 encoded) or arbitrary XML using the namespace mechanism. The metadata type is specified as an attribute.
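The two options can be contrasted in a short sketch. It assumes the attribute names `LOCTYPE` and `MDTYPE` and the `xmlData` child element from the published METS schema; the helper names, the example URL, and the choice of a Dublin Core title are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("mets", METS_NS), ("xlink", XLINK_NS), ("dc", DC_NS)):
    ET.register_namespace(prefix, uri)


def dmd_by_reference(sec_id, href):
    """dmdSec that links to an external record (mdRef)."""
    sec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": sec_id})
    ET.SubElement(sec, f"{{{METS_NS}}}mdRef", {
        "LOCTYPE": "URL",              # type of link
        "MDTYPE": "MARC",              # type of metadata behind the link
        f"{{{XLINK_NS}}}href": href,
    })
    return sec


def dmd_wrapped_dc(sec_id, title):
    """dmdSec that embeds Dublin Core XML inside the document (mdWrap)."""
    sec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": sec_id})
    wrap = ET.SubElement(sec, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title
    return sec


by_ref = dmd_by_reference("dmd1", "http://example.org/record.xml")
wrapped = dmd_wrapped_dc("dmd2", "Example title")
print(ET.tostring(by_ref, encoding="unicode"))
print(ET.tostring(wrapped, encoding="unicode"))
```

Referencing keeps the METS file small and leaves the record where it is maintained; wrapping makes the package self-contained, which matters more for transfer and archiving.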
METS Administrative Metadata Section
  Technical Metadata (techMD): technical metadata regarding content files
  IP Rights Metadata (rightsMD): rights metadata regarding content files or primary source material
  Source Metadata (sourceMD): provenance information for content files
  Preservation Metadata (digiprovMD, "digital provenance", in the published schema): metadata to assist in preservation of digital content
  All sections use generic metadata reference and wrapper subelements.
METS File Inventory
  File Group (fileGrp): provides a mechanism for hierarchically subdividing physical files, for example by type
  File (file): provides a pointer to an external file (FLocat) or includes file content internally (FContent) in Base64 encoding
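A minimal sketch of a file inventory, grouping files by use and pointing to each one externally with FLocat. The `USE`, `ID`, `MIMETYPE`, and `LOCTYPE` attributes are taken from the published METS schema; the function name `build_file_sec`, the group label, and the URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_file_sec(groups):
    """Build a fileSec from {group label: [(file_id, mimetype, url), ...]}."""
    file_sec = ET.Element(f"{{{METS_NS}}}fileSec")
    for use, files in groups.items():
        grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": use})
        for file_id, mimetype, url in files:
            entry = ET.SubElement(
                grp, f"{{{METS_NS}}}file", {"ID": file_id, "MIMETYPE": mimetype})
            # Pointer to the external file; FContent would embed it instead.
            ET.SubElement(entry, f"{{{METS_NS}}}FLocat", {
                "LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
    return file_sec


file_sec = build_file_sec({
    "page images": [
        ("f1", "image/tiff", "http://example.org/page0001.tif"),
        ("f2", "image/tiff", "http://example.org/page0002.tif"),
    ],
})
print(ET.tostring(file_sec, encoding="unicode"))
```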
METS Structural Map
  The Structural Map provides a tree structure describing the original document. Each division (div) element is a node in that tree, and can identify content files associated with that division by a METS Pointer (mptr) or a File Pointer (fptr).
METS Pointer and File Pointer
  METS Pointer (mptr): xlink to another METS file containing the content for the associated div. Useful for breaking up large objects (e.g., a journal run) into a series of smaller METS documents.
  File Pointer (fptr): Identifies one or more entries in the File Inventory section containing the content for the associated div element. Can also limit the link from a div element to a portion of a content file (e.g., a segment of an audio or video file, a subarea of an image or video file, etc.).
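To make the difference concrete, here is a small sketch that walks a structural map and reports, for each div, which file IDs (fptr) and which external METS documents (mptr) it points to. The sample document, its labels, and the URL are invented; `FILEID` is the attribute name assumed from the published schema.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical journal run: one volume lives in its own METS file.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:structMap>
    <mets:div LABEL="Journal run">
      <mets:div LABEL="Volume 1">
        <mets:mptr LOCTYPE="URL" xlink:href="http://example.org/vol1.mets.xml"/>
      </mets:div>
      <mets:div LABEL="Cover">
        <mets:fptr FILEID="f1"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def outline(div, depth=0):
    """Return (depth, label, file ids, external METS links) for each div."""
    file_ids = [p.get("FILEID") for p in div.findall("mets:fptr", NS)]
    links = [p.get(XLINK_HREF) for p in div.findall("mets:mptr", NS)]
    rows = [(depth, div.get("LABEL"), file_ids, links)]
    for child in div.findall("mets:div", NS):
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(SAMPLE)
rows = outline(root.find("mets:structMap/mets:div", NS))
for depth, label, file_ids, links in rows:
    print("  " * depth + label, file_ids, links)
```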
METS File Pointer Mechanisms
  File Pointer (fptr): Can identify a single file in the File Inventory using ID/IDREF linking
  Parallel/Sequential (par/seq): Allows a div to be associated with several content files that should be played/displayed in parallel (video with separate audio track file) or sequentially.
  Area (area): identifies a point, linear segment, or 2D area within a content file that corresponds with the associated div element.
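A sketch of how a display application might tell these mechanisms apart. It assumes the nesting fptr > par or seq > area and the `FILEID` attribute from the published schema; the function name `playback_plan`, the sample div, and the file IDs are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical div: a video file played together with a separate audio file.
SAMPLE_DIV = """
<mets:div xmlns:mets="http://www.loc.gov/METS/" LABEL="Interview, part 1">
  <mets:fptr>
    <mets:par>
      <mets:area FILEID="video1"/>
      <mets:area FILEID="audio1"/>
    </mets:par>
  </mets:fptr>
</mets:div>
"""


def playback_plan(fptr):
    """Classify an fptr as a single file, a parallel set, or a sequence."""
    if fptr.get("FILEID"):
        return ("single", [fptr.get("FILEID")])
    for mode in ("par", "seq"):
        group = fptr.find(f"mets:{mode}", NS)
        if group is not None:
            ids = [area.get("FILEID") for area in group.findall("mets:area", NS)]
            return (mode, ids)
    return ("empty", [])


div = ET.fromstring(SAMPLE_DIV)
plans = [playback_plan(fptr) for fptr in div.findall("mets:fptr", NS)]
print(plans)
```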
METS Area Element Attributes
  FILEID: ID for File element in File Inventory
  SHAPE: As in HTML Area element
  COORDS: As in HTML Area element
  BEGIN: A start point within a file for defining a segment
  END: An end point within a file for defining a segment
  BETYPE: Begin/End type: IDREF, byte offset, or SMPTE time code
  EXTENT: Length or duration of segment
  EXTYPE: Extent type: bytes or SMPTE
Structure Example
  urn:x-nyu:violet42
  <area FILEID="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"/>
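The segment in this example can be turned into a duration with a few lines of code. The sketch below assumes the time codes are non-drop-frame SMPTE values of the form HH:MM:SS:FF and assumes 30 frames per second; the slide states neither. It also assumes the file attribute is spelled `FILEID`, as in the published schema, and `smpte_to_seconds` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def smpte_to_seconds(code, fps=30):
    """Convert HH:MM:SS:FF to seconds (non-drop-frame; fps is an assumption)."""
    hours, minutes, seconds, frames = (int(part) for part in code.split(":"))
    return hours * 3600 + minutes * 60 + seconds + frames / fps


area = ET.fromstring(
    '<area FILEID="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"/>'
)
begin = smpte_to_seconds(area.get("BEGIN"))
end = smpte_to_seconds(area.get("END"))
duration = end - begin
print(f"file {area.get('FILEID')}: {duration} seconds")
```

Because both time codes end in frame 00, the 21-second result does not depend on the assumed frame rate.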
Related standards: SMIL (W3C), MPEG-7 (ISO)
  Created for multimedia structural encoding
  SMIL has time-based orientation
    – for playing multimedia presentations
  Very complex
  May eventually be incorporated
Related standards: RDF (W3C)
  Also a metadata wrapper framework
  Structural metadata could be supported, but RDF doesn't specify how…
  Opaque to use
    – No element semantics provided
    – element names deliberately meaningless
  Originally designed for descriptive metadata
Related standards: OAIS framework
METS and OAIS framework
  Submission Information Package (SIP)
    – METS as transfer syntax
  Dissemination Information Package (DIP)
    – METS as transfer syntax
    – METS as input to display applications
  Archival Information Package (AIP)
    – METS stored internally in an archive
Library Applications
  Digital Object transfer syntax
    – between systems: enables interoperability
    – between institutions: enables collection sharing
    – implements OAIS SIP/DIP/AIP
Library Applications
  Input to Digital Object delivery systems (aka disseminators)
    – Simple bit-streaming
    – XSL stylesheet
    – Custom program for complex digital object display
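As a rough idea of what the "custom program" option involves, the sketch below reads a METS document, resolves each file pointer in the structural map against the file inventory, and returns an ordered delivery list. The sample document, labels, and URLs are invented, and `delivery_list` is a hypothetical name; a real disseminator would also handle par/seq, area, and mptr.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="page images">
      <mets:file ID="f1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/p1.tif"/>
      </mets:file>
      <mets:file ID="f2" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="http://example.org/p2.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div LABEL="Book">
      <mets:div LABEL="Page 1"><mets:fptr FILEID="f1"/></mets:div>
      <mets:div LABEL="Page 2"><mets:fptr FILEID="f2"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def delivery_list(root):
    """Return (div label, file URL) pairs in structural-map order."""
    # Index the file inventory: file ID -> external location.
    locations = {}
    for entry in root.iter(f"{{{METS_NS}}}file"):
        flocat = entry.find("mets:FLocat", NS)
        if flocat is not None:
            locations[entry.get("ID")] = flocat.get(XLINK_HREF)
    # Walk the divs in document order and resolve each file pointer.
    pairs = []
    struct_map = root.find("mets:structMap", NS)
    for div in struct_map.iter(f"{{{METS_NS}}}div"):
        for fptr in div.findall("mets:fptr", NS):
            pairs.append((div.get("LABEL"), locations.get(fptr.get("FILEID"))))
    return pairs


pages = delivery_list(ET.fromstring(SAMPLE))
print(pages)
```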
Part Three: METS Summary
METS summary
  Descriptive/technical/administrative metadata
    – not defined internally
    – points to external standard schemas
        Dublin Core, MARC, MPEG-7, etc.
        AES audio metadata
    – set of best practice schemas being identified
METS summary
  Structural metadata
    – defined internally and required
    – SMIL-lite
        simple support for multimedia, audio/visual
        SMIL may eventually replace it
METS summary
  Current users include
    – UC Berkeley (archival collections)
    – Harvard (scanned print publications, e-journals)
    – Library of Congress (audio/visual collections)
    – British Library
    – RLG and OCLC
    – EU METAe project (historic newspapers)
    – Michigan State (oral history collections)
    – Univ of Virginia (FEDORA digital objects)
    – more daily...
METS summary
  Tools under development for
    – metadata capture
    – transformation
    – transfer
    – dissemination/display
  Profiles necessary for interoperation
    – Which extension schemas are used?
    – How are structure maps organized?
METS summary
  Current status
    – version 1.3 available from LC
    – editorial board in place
    – LC standards office as maintenance agency
    – DLF and RLG underwriting
        RLG will host editorial board, offer documentation and training, develop tools
    – Several extension schemas available
    – Opening Day in October 2004
METS summary
  METS is not all things to all people…
    – Designed for local institutional application support
        Solving an immediate local problem
        Common to many institutions
        Flexible framework supports many institutional situations
    – Profiling necessary to interoperate
        For OAIS packages
        For shared tools
        For other kinds of interoperation (e.g. cross-repository search)