1 UCSD Digital Library Program Working Group February 6, 2002 METS: Metadata Encoding & Transmission Standard

2 Part One: Problem definition

3 Digital (Library) Objects
Reformatted to digital
  – scanned photographs, books and journals
  – digitized audio/video files
“Born digital”
  – TEI-encoded texts
  – digital images, audio, video files
  – GIS, statistical datasets
  – interactive content

4 Digital (Library) Objects
Simple
  – single files, e.g.
      visual TIFF images
      MP3 files
      TEI-encoded text
  – objects stand alone
      no relationships to other objects

5 Digital (Library) Objects
Complex
  – multiple related files, e.g.
      page images from books or articles
      multiple channels in digital audio files
      related sound and text files (multimedia)
      statistical dataset and codebook
  – objects cannot stand alone
      one or more related files required to interpret the object
  – requires structural metadata to model

6 Structural metadata
Maps physical files (digital assets) to logical items (complex digital objects)
Examples
  – Scanned print material
      complex publication structures (e.g. journal runs)
      ordered relationship between digital page images
  – A/V material
      multiple resolutions of an image
      multiple channels of an audio file

7 Structural metadata
Examples, continued
  – Multimedia presentations
      relationship between images, text, sound, video, etc. (time-based or other)
  – Web sites
      linkages between web pages
      sitemaps
  – Databases
      table models and ER diagrams

8 Digital (Library) Objects
Also have other (non-structural) metadata
  – descriptive
      MARC, DC, FGDC, VRA core, other ontologies
  – administrative
      rights, provenance
  – technical
      format details, OAIS “representation information”
Standards exist or are emerging for these

9 Part Two: Introduction to METS

10 METS Scope
Supports
  – Structural metadata
      complex reformatted or born digital objects
  – Metadata wrapper framework
      descriptive, administrative, structural, etc.
      structural required
      others use namespaces to reference “extension schemas”

11 Evolved from MOA2
Making of America II project
  – Developed November 1997-January 2000
  – Funded by DLF and NEH; participants Cornell, NYPL, Penn State, Stanford, Berkeley
  – Designed for scanned archival collections
  – XML DTD defining explicit descriptive, administrative and structural metadata

12 Evolved from MOA2
February 2001 DLF workshop on structural metadata
  – Harvard, LC, MOA2 participants, others
Outcome: METS definition
  – emphasis on structural metadata
  – wider scope of participants, content types
  – change to XML schema, framework architecture

13 METS metadata “buckets”
[Diagram labels: METS header, descriptive metadata, administrative metadata, file inventory, structure map, behavioral metadata; marked optional, required, optional]

14 METS metadata
XML “extension schemas”
  – descriptive metadata
      Dublin Core, MARC, FGDC, VRA, etc.
      Berkeley’s GDM schema (from MOA2)
  – administrative/technical metadata
      NISO image technical metadata
      LC schemas for A/V technical metadata
      Rights metadata (e.g. PRISM, XrML, etc.)
      Provenance metadata

15 METS Descriptive Metadata Section
Metadata Reference (mdRef): A link to external descriptive metadata. The type of link (URN/Handle/etc.) is included as an attribute, as is the metadata type.
Metadata Wrapper (mdWrap): Included descriptive metadata, as either binary data (Base64 encoded) or arbitrary XML using the namespace mechanism. The metadata type is specified as an attribute.
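
The two options on this slide can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration only: the element and attribute names (`dmdSec`, `xmlData`, `LOCTYPE`, `MDTYPE`) follow the published METS schema and may differ from the draft discussed in the slides, and the URN and the title are made-up placeholders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Prefixes used when serializing; purely cosmetic.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def q(ns, tag):
    """Build a namespace-qualified name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def dmd_by_reference(sec_id, href):
    """Descriptive metadata kept elsewhere: mdRef carries only a link."""
    sec = ET.Element(q(METS_NS, "dmdSec"), {"ID": sec_id})
    ET.SubElement(sec, q(METS_NS, "mdRef"), {
        "LOCTYPE": "URN",            # type of link
        "MDTYPE": "MARC",            # type of metadata at the target
        q(XLINK_NS, "href"): href,   # hypothetical identifier
    })
    return sec


def dmd_wrapped(sec_id, title):
    """Descriptive metadata embedded as namespaced XML inside mdWrap."""
    sec = ET.Element(q(METS_NS, "dmdSec"), {"ID": sec_id})
    wrap = ET.SubElement(sec, q(METS_NS, "mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    ET.SubElement(xml_data, q(DC_NS, "title")).text = title
    return sec


ref = dmd_by_reference("dmd1", "urn:x-example:record1")
wrapped = dmd_wrapped("dmd2", "Sample title")
print(ET.tostring(wrapped, encoding="unicode"))
```

The reference form keeps the METS document small and leaves the record under the control of its home system; the wrapper form makes the package self-contained, which matters when it is used as a transfer syntax.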

16 METS Administrative Metadata Section
Technical Metadata (techMD): technical metadata regarding content files
IP Rights Metadata (rightsMD): rights metadata regarding content files or primary source material
Source Metadata (sourceMD): provenance information for content files
Preservation Metadata (preservationMD): metadata to assist in preservation of digital content
All sections use generic metadata reference and wrapper subelements.

17 METS File Inventory
File Group (fileGrp): provides a mechanism for hierarchically subdividing physical files, for example by type
File (file): provides a pointer to an external file (FLocat) or includes file content internally (FContent) in Base64 encoding
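
A minimal sketch of walking a nested file inventory, assuming the element names of the published METS schema (`fileSec`, `fileGrp`, `file`, `FLocat`). The sample fragment, its IDs and its example.org URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical inventory: one group per file type, nested under "images".
SAMPLE = """<fileSec xmlns="http://www.loc.gov/METS/"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileGrp ID="images">
    <fileGrp ID="masters">
      <file ID="f1" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.tif"/>
      </file>
    </fileGrp>
    <fileGrp ID="thumbnails">
      <file ID="f2" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </file>
    </fileGrp>
  </fileGrp>
</fileSec>"""


def list_files(group, path=()):
    """Yield (group path, file ID, location) for every file below `group`."""
    for child in group:
        tag = child.tag.split("}")[1]  # drop the {namespace} prefix
        if tag == "fileGrp":
            yield from list_files(child, path + (child.get("ID"),))
        elif tag == "file":
            loc = child.find("m:FLocat", NS)
            href = loc.get(XLINK_HREF) if loc is not None else None
            yield path, child.get("ID"), href


inventory = list(list_files(ET.fromstring(SAMPLE)))
for group_path, file_id, href in inventory:
    print("/".join(group_path), file_id, href)
```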

18 METS Structural Map
The Structural Map provides a tree structure describing the original document. Each division (div) element is a node in that tree, and can identify content files associated with that division by a METS Pointer (mptr) or a File Pointer (fptr).
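
The tree described here can be traversed with a short recursive function. The sketch below assumes the published schema's `structMap`, `div` and `fptr` names and the `FILEID` attribute on `fptr`; the book, chapter and page labels are invented.

```python
import xml.etree.ElementTree as ET

M = "{http://www.loc.gov/METS/}"

# Hypothetical structural map: book > chapter > pages.
SAMPLE = """<structMap xmlns="http://www.loc.gov/METS/">
  <div TYPE="book" LABEL="Sample book">
    <div TYPE="chapter" LABEL="Chapter 1">
      <div TYPE="page" LABEL="Page 1"><fptr FILEID="f1"/></div>
      <div TYPE="page" LABEL="Page 2"><fptr FILEID="f2"/></div>
    </div>
  </div>
</structMap>"""


def outline(div, depth=0):
    """Yield (depth, label, file IDs) for `div` and all nested divs."""
    files = [p.get("FILEID") for p in div.findall(M + "fptr")]
    yield depth, div.get("LABEL"), files
    for child in div.findall(M + "div"):
        yield from outline(child, depth + 1)


root_div = ET.fromstring(SAMPLE).find(M + "div")
rows = list(outline(root_div))
for depth, label, files in rows:
    print("  " * depth + label, files)
```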

19 METS Pointer and File Pointer
METS Pointer (mptr): xlink to another METS file containing the content for the associated div. Useful for breaking up large objects (e.g., a journal run) into a series of smaller METS documents.
File Pointer (fptr): Identifies one or more entries in the File Inventory section containing the content for the associated div element. Can also limit the link from a div element to a portion of a content file (e.g., a segment of an audio or video file, a subarea of an image or video file, etc.).

20 METS File Pointer Mechanisms
File Pointer (fptr): Can identify a single file in the File Inventory using ID/IDREF linking.
Parallel/Sequential (par/seq): Allows a div to be associated with several content files that should be played/displayed in parallel (video with separate audio track file) or sequentially.
Area (area): Identifies a point, linear segment, or 2D area within a content file that corresponds with the associated div element.
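
The ID/IDREF linking and the par/seq grouping can be illustrated by resolving a div's pointers back to file entries. This is a stripped-down sketch, not a schema-valid document: the sample is hypothetical, and it uses `FILEID`, the attribute name in the published schema, where the slides list the area attribute as `FILE`.

```python
import xml.etree.ElementTree as ET

M = "{http://www.loc.gov/METS/}"

# Hypothetical object: a video with a separate audio track, played in parallel.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp>
      <file ID="video1" MIMETYPE="video/mpeg"/>
      <file ID="audio1" MIMETYPE="audio/mpeg"/>
    </fileGrp>
  </fileSec>
  <structMap>
    <div LABEL="Scene 1">
      <fptr>
        <par>
          <area FILEID="video1"/>
          <area FILEID="audio1"/>
        </par>
      </fptr>
    </div>
  </structMap>
</mets>"""


def resolve(div, files_by_id):
    """Return (mode, [MIME types]) for each fptr directly under `div`."""
    result = []
    for fptr in div.findall(M + "fptr"):
        for mode in ("par", "seq"):
            group = fptr.find(M + mode)
            if group is not None:
                ids = [a.get("FILEID") for a in group.findall(M + "area")]
                break
        else:
            # No par/seq child: the fptr points at a single file.
            mode, ids = "single", [fptr.get("FILEID")]
        result.append((mode, [files_by_id[i].get("MIMETYPE") for i in ids]))
    return result


root = ET.fromstring(SAMPLE)
files_by_id = {f.get("ID"): f for f in root.iter(M + "file")}
scene = root.find(f"{M}structMap/{M}div")
resolved = resolve(scene, files_by_id)
print(resolved)
```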

21 METS Area Element Attributes
FILE: ID for File element in File Inventory
SHAPE: As in HTML Area element
COORDS: As in HTML Area element
BEGIN: A start point within a file for defining a segment
END: An end point within a file for defining a segment
BETYPE: Begin/End type: IDREF, Byte Offset, or SMPTE time code
EXTENT: Length/duration of segment
EXTYPE: Extent type: Bytes, or SMPTE

22 Structure Example
urn:x-nyu:violet42
<area FILE="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"/>
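
To show what the BEGIN and END values in this example denote, the sketch below converts SMPTE time codes (hours:minutes:seconds:frames) to frame counts and computes the segment length. The 30 frames-per-second, non-drop-frame rate is an assumption; the slide does not state a frame rate.

```python
def smpte_to_frames(timecode, fps=30):
    """Convert an HH:MM:SS:FF time code to a frame count.

    Assumes non-drop-frame time code at `fps` frames per second.
    """
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def segment_seconds(begin, end, fps=30):
    """Length in seconds of the segment between two time codes."""
    return (smpte_to_frames(end, fps) - smpte_to_frames(begin, fps)) / fps


# BEGIN and END values taken from the area element on the slide.
duration = segment_seconds("00:23:17:00", "00:23:38:00")
print(duration)  # 21.0 seconds
```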

23 Related standards: SMIL (W3C), MPEG-7 (ISO)
Created for multimedia structural encoding
SMIL has “time-based” orientation
  – for playing multimedia presentations
Very complex
May eventually be incorporated

24 Related standards: RDF (W3C)
Also metadata wrapper framework
Structural metadata could be supported, but doesn’t specify how…
Opaque to use
  – No element semantics provided
  – element names deliberately meaningless
Originally intended for descriptive metadata

25 Related standards: OAIS framework

26 METS and OAIS framework
Submission Information Package (SIP)
  – METS as transfer syntax
Dissemination Information Package (DIP)
  – METS as transfer syntax
  – METS as input to display applications
Archival Information Package (AIP)
  – METS stored internally in an archive

27 Part Three: Library Applications of METS

28 Library Applications
Digital Object transfer syntax
  – between systems
      enables interoperability
  – between institutions
      enables collection sharing
  – implements OAIS SIP/DIP/AIP

29 Library Applications
Input to Digital Object delivery systems (aka “disseminators”)
  – Simple bit-streaming
  – XSL stylesheet
  – Custom program for complex digital object display

30 Harvard’s Page Delivery Service (PDS)
Range of publication types supported
  – 0-4 levels of hierarchy
      simple 3 page letter, 20 page article
      diary with entries
      book containing chapters containing sections
      report run containing reports containing sections
      journal bound in volumes containing issues containing articles
Implemented as METS “tree”
  – example on METS web site

31 Harvard’s PDS
[Diagram labels: Letter; citation level and leaf level METS; TIFF]

32 Harvard’s PDS
[Diagram labels: Diary, Entry; citation level METS; leaf level METS; TIFF]

33 Harvard’s PDS
[Diagram labels: Journal, Volume, Issue, Article, TIFF; citation level METS; intermediate level METS; leaf level METS]

34 Harvard’s PDS
“Page turner” system
  – implemented as a web application
      Java servlet, SAX parser
  – minimal descriptive metadata
      display only (not for discovery)
  – no administrative metadata
  – file inventory only for “leaf” nodes

36 Harvard’s PDS
METS maintenance system
  – implemented as a web application
      Java servlet, DOM parser
  – supports structure updates
      add a missing volume to a run
      add a missing page to a scanned manuscript
      switch two page images
  – supports cascading deletes
      entire logical object including all underlying digital assets

37 Harvard’s E-Journal Archive
Capture e-journals of three scholarly journal publishers
  – Wiley, Blackwell, University of Chicago Press
Accept normative data formats
  – descriptive, administrative metadata
  – article text, images, figures, etc.
  – reference links
  – other supplementary material

38 Harvard’s E-Journal Archive
OAIS Submission Information Package
  – received from publishers for each journal issue and article, along with digital content files
OAIS Archival Information Package
  – stored in Digital Repository Service
OAIS Dissemination Information Package
  – delivered to subscribers on demand

39 Harvard’s E-Journal Archive
Issue-level metadata includes
  – METS header
  – descriptive (i.e. bibliographic) metadata
  – administrative (e.g. rights, provenance, technical) metadata
  – structural metadata
      issue-level content: masthead, editorial board, etc.
      issue content: articles, correspondence, reviews, editorials, errata, etc.

40 OAIS Article-level metadata
  – METS header
  – descriptive (i.e. bibliographic) metadata
  – administrative (e.g. rights, provenance, technical) metadata
  – structural metadata
      article content: XML-encoded text plus images, figures, links, etc., and/or PDF

41 Example Issue SIP
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:ejar="http://hul.harvard.edu/EJAR/METADATA/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      TYPE="EJARISSUE-major.minor" OBJID="issueid"
      LABEL="issue bibliographic citation" PROFILE="EJAR">
content provider
issue descriptive metadata

42 Example Issue SIP
issue copyright metadata
issue content technical metadata
content file checksum

43 Example Issue SIP
cover image technical metadata
cover image copyright metadata
cover image checksum

44 Example Issue SIP
<file ID="file:issue-content" ADMID="admin:issue-content" CREATED="yyyy-mm-dd" MIMETYPE="text/xml" OWNERID="id" SIZE="n">
<file ID="file:1" ADMID="admin:1" CREATED="yyyy-mm-dd" MIMETYPE="image/tiff" OWNERID="id" SIZE="n">
...

45 Example Issue SIP
<div TYPE="EJARISSUE" ADMID="admin:issue" DMD="descr:issue" LABEL="issue bibliographic citation">
  <div TYPE="EJARITEM" LABEL="item bibliographic citation" ORDERLABEL="n">
......

46 GenDL (Generic Digital Library)
Focus of METS-based tools
  – Specify how files and parts of files fit together
  – Coordinate external and internal descriptive and administrative metadata with object structure
  – Mitigate complexity of METS for users
Efficiency and coherence through standardization
  – Automatic generation of digital objects
  – Presentation of disparate digital material through coherent tools

47 METS tools at UC Berkeley
GenDB: Generic database to capture structural, descriptive and administrative metadata for digital reformatting projects
GenX: Java program to extract metadata from GenDB database and package it up into METS
GenView: Java programs for end user navigation of METS objects
GenRep: Repository for METS objects

48 [Architecture diagram labels]
Gathering Metadata: GenDB
  – GenDB Client (browser/servlet)
  – GenDB Database Server
  – Database (SQL Server)
Creating METS Objects: GenX
  – METS Generator
Viewing METS Objects: GenView
  – GenView Client (browser/servlet)
  – GenView Repository Server
  – Digital Object Repository (Unix file system)

49 GenDB
Tool to capture structural, descriptive and administrative metadata
First implemented as an MS Access DB
Now implemented as a SQL server with web front end
Java client?

55 GenDB Key Features
Exposes Digital Object’s structure
  – UI enables easy visualization to build object structure
Highly configurable
  – Project manager specifies what fields should appear and how they should be tagged
Layered architecture enhances flexibility
  – UI doesn’t know underlying DB table structure
  – Different UIs can be layered over same middle layer

56 GenView
Tool to view and navigate METS objects
Web-based user interface (Java)

71 GenView: Key Features
Exposes Digital Object’s structure
  – Table of Contents for navigation
  – Select from multiple manifestations of currently selected TOC entry (including side by side display)
  – Link to descriptive/administrative metadata for
      highest-level object
      currently selected TOC entry
Supports non-Roman text (beyond ISO-8859)

72 Part Four: METS Summary

73 METS summary
Descriptive/technical/administrative metadata
  – not defined internally
  – points to external standard schemas
      Dublin Core, MARC, MPEG-7, etc.
      AES audio metadata
  – set of “best practice” schemas being identified

74 METS summary
Structural metadata
  – defined internally and required
  – SMIL-lite
      simple support for multimedia, audio/visual
      SMIL may replace eventually

75 METS summary
Current users include
  – UC Berkeley (archival collections)
  – Harvard (scanned print publications, e-journals)
  – Library of Congress (audio/visual collections)
  – EU MetaE project (historic newspapers)
  – Michigan State (oral history collections)
  – Univ of Virginia (FEDORA digital objects)
  – National Library of Australia
  – more daily...

76 METS summary
Tools under development for
  – metadata capture
  – transformation
  – transfer
  – dissemination/display
Profiles necessary for interoperation
  – Which extension schemas used?
  – How structure maps are organized…

77 METS summary
Current status
  – version 1.0 due out in February
  – editorial board being set up
  – LC standards office for maintenance agency
  – DLF and RLG underwriting
      RLG will host editorial board, offer documentation and training, develop tools, seek funding

78 METS summary
METS is not all things to all people…
  – Designed for local institutional application support
      Solving an immediate local problem
      Common to many institutions
      Flexible framework supports many institutional situations
  – Profiling necessary to interoperate
      For OAIS packages
      For shared tools
      For other kinds of interoperation (e.g. cross repository search)

