Using METS and MODS to Create XML Standards-based Digital Library Applications Morgan Cundiff & Nate Trail Network Development and MARC Standards Office (NDMSO)


Using METS and MODS to Create XML Standards-based Digital Library Applications
Morgan Cundiff & Nate Trail
Network Development and MARC Standards Office (NDMSO), Library of Congress

XML is the lingua franca of the Web
» Web pages increasingly use XHTML
» Business use for data exchange and messaging
» Family of technologies can be leveraged
  - XML Schema, XSLT, XPath, and XQuery
» Software tools widely available (open source)
  - Storage, editing, parsing, validating, transforming, and publishing XML
  - Constantly and actively improved
» Microsoft Office 2003 supports XML as document format (WordML and ExcelML)
» Web 2.0 applications based on XML (AJAX, Semantic Web, Web Services, etc.)

XML (Extensible Markup Language)
“XML has become the de-facto standard for representing metadata descriptions of resources on the Internet.”
Dr. Jane Hunter, University of Queensland, Australia
Working towards MetaUtopia – A Survey of Current Metadata Research

Interoperability and Standards
“In moving from dispersed digital collections to interoperable digital libraries, the most important activity we need to focus on is standards… most important is the wide variety of metadata standards [including] descriptive metadata… administrative metadata…, structural metadata, and terms and conditions metadata…”
Dr. Howard Besser, New York University
The Next Stage: Moving from Isolated Digital Collections to Interoperable Digital Libraries

XML and Digital Libraries
» Family of XML data standards
  - METS – Metadata Encoding and Transmission Standard
  - MODS – Metadata Object Description Schema
  - MIX – Metadata for Images in XML
  - PREMIS – PREservation Metadata Implementation Strategies
  - TEI – Text Encoding Initiative
  - EAD – Encoded Archival Description

XML and Digital Libraries
» METS Implementors
  - Library of Congress, OCLC, RLG, California Digital Library (CDL), Harvard, Princeton, National Library of Portugal, National Library of Wales, Indiana University, Stanford, New York University, University of Göttingen, Oxford University, and more …
» METS Software Tools
  - METS Toolkit & DRS METS Archive Tool (Dmart) for Audio Deposit (Harvard)
  - 7train METS Generation Tool (CDL)
  - MEX Authoring Tools (Das Bundesarchiv)
  - ContentE (Biblioteca Nacional Digital, Portugal)
  - METS Navigator (Indiana University DL Program)
  - ResCarta Metadata Creation Tool (ResCarta Foundation)
  - and more …
» METS listserv: 550 subscribers

XML at LC: A Historical Perspective
» 1995 – American Memory released (not XML-based)
» 1998 – XML 1.0 becomes W3C Recommendation
» 2002 – METS and MODS released
» 2002 – Digital Audio-Visual Preservation Prototyping Project (first use of METS, MODS, and MIX at LC)
» 2003 – Patriotic Melodies (first use of METS and MODS in production at LC; later added to I Hear America Singing)
» 2003 – Veterans History Project database released, MINERVA project (MODS)
continued…

XML at LC: A Historical Perspective
» 2004 – I Hear America Singing released (since renamed to LC Presents)
» 2004 – Justice Blackmun Papers collection released
» 2006 – National Digital Newspaper Program as repository submission package at LC (LC and partners, 1st use of METS, MODS, MIX, PREMIS)
» 2006 – Ser2Dig (Digital Serials workgroup, METS for multi-volume monographs)
» 2006 – Draft METS profile for “article-level” historical newspapers

What is METS?
» Metadata Encoding and Transmission Standard
» An XML Schema for the purpose of creating XML document instances that express…
  - the hierarchical structure of digital library objects
  - the names and locations of the files that comprise the digital object
  - the associated metadata (e.g., MODS)
» METS can be used as a tool for modeling real world objects, such as specific document types

What is MODS?
» Metadata Object Description Schema
» An XML Schema designed for expressing bibliographic data
  - Can be viewed as an alternative to the MARC format
  - Especially useful for XML-based digital library projects
  - Can be used as an extension schema to METS
» Note to catalogers: MODS does not make you obsolete! The same knowledge and skills needed for traditional cataloging (AACR, controlled vocabularies, etc.) still apply. You will only need to learn a different syntax (i.e., different from MARC) for expressing bibliographic information in machine-readable form.

Structure of METS
» There are 7 sections in a METS document
  - METS header (document talks about itself)
  - Descriptive metadata (MODS, etc.)
  - Administrative metadata (copyright info., etc.)
  - File section (names and locations of files)
  - Structural map (relationships of the parts)
  - Linking information
  - Binding executables/actions to object
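The seven sections listed above map onto seven top-level elements in the METS schema. A minimal sketch in Python follows, assuming the standard METS namespace URI. It only creates empty placeholder elements to show the overall shape, so the output is not a schema-valid METS document.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# METS element names for the seven sections, in the order the slide lists them.
SECTIONS = [
    "metsHdr",      # METS header
    "dmdSec",       # descriptive metadata
    "amdSec",       # administrative metadata
    "fileSec",      # file section
    "structMap",    # structural map
    "structLink",   # linking information
    "behaviorSec",  # binding executables/actions to the object
]

def build_skeleton():
    """Return a mets root element holding one empty child per section."""
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root

skeleton = build_skeleton()
# Strip the namespace part of each tag to get the local element names back.
section_names = [child.tag.split("}")[1] for child in skeleton]
print(ET.tostring(skeleton, encoding="unicode"))
```

A real document would fill each section and add the attributes the schema requires, such as an ID on each dmdSec.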

Wrap Descriptive Metadata in METS
» Use mdWrap to embed descriptive metadata within a METS document
… …
» Metadata wrap section acts as a “socket” to hold metadata from other XML schemas or “vocabularies”

with MODS Extension Schema
… …
» Descriptive metadata section
» MODS data contained inside the metadata wrap section
» Use of prefixes before element names to identify schema
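The wrapping described on these two slides can be sketched in Python as follows. This is an illustration only, assuming the standard METS and MODS namespace URIs; the helper name `wrap_mods` and the ID value `DMD1` are hypothetical, and the title is borrowed from the example later in the deck.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
# Registering prefixes makes the serialized output show which schema
# each element belongs to (mets:... versus mods:...).
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)

def wrap_mods(title, dmd_id="DMD1"):
    """Build dmdSec > mdWrap > xmlData > mods:mods with a single title."""
    dmd_sec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    md_wrap = ET.SubElement(dmd_sec, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    mods = ET.SubElement(xml_data, f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    return dmd_sec

dmd_sec = wrap_mods("Bernstein conducts Beethoven")
serialized = ET.tostring(dmd_sec, encoding="unicode")
print(serialized)
```

The mdWrap element is the "socket": everything inside xmlData comes from another schema, here MODS.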

with … … The MODS relatedItem element can be nested and can be used to express a hierarchy.

Bernstein conducts Beethoven (Bernstein, Leonard)
  Symphony No. 5 (Beethoven, Ludwig van)
    Allegro con moto
    Adagio

MODS relatedItem type=“constituent”
» Child element to MODS
» relatedItem element uses MODS content model
  - titleInfo, name, subject, physicalDescription, note, etc.
» Makes it possible to create rich analytics for contained works within a MODS record
» Repeatable and nestable recursively
  - Making it possible to build a hierarchical tree structure
» Makes it possible to associate descriptive data with any structural element
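Recursive nesting of relatedItem can be sketched in Python with the titles and names from the Bernstein example. The hierarchy used here (a symphony containing two movements) is an assumption drawn from the order of the text on the slide, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def q(name):
    """Qualify a local element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"

def add_description(parent, title, name=None):
    """Attach titleInfo/title and, optionally, name/namePart to parent."""
    title_info = ET.SubElement(parent, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title
    if name:
        name_el = ET.SubElement(parent, q("name"))
        ET.SubElement(name_el, q("namePart")).text = name

def add_constituent(parent, title, name=None):
    """Add a relatedItem type="constituent" child that can itself be nested."""
    item = ET.SubElement(parent, q("relatedItem"), {"type": "constituent"})
    add_description(item, title, name)
    return item

mods = ET.Element(q("mods"))
add_description(mods, "Bernstein conducts Beethoven", "Bernstein, Leonard")
symphony = add_constituent(mods, "Symphony No. 5", "Beethoven, Ludwig van")
add_constituent(symphony, "Allegro con moto")
add_constituent(symphony, "Adagio")

def outline(element, depth=0):
    """Walk the nested relatedItems and return indented title lines."""
    lines = ["  " * depth + element.find(q("titleInfo")).find(q("title")).text]
    for child in element.findall(q("relatedItem")):
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(mods)))
```

Because relatedItem shares the MODS content model, the same `add_description` helper serves the top-level record and every nested level.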

METS 2 Hierarchies: Logical & Physical
» Hierarchy to represent “logical” structure (nested relatedItems)
» Hierarchy to represent “physical” structure (nested div elements)

Linking in METS Documents (XML ID/IDREF links)
[Diagram: DescMD (mods, relatedItem); AdminMD (techMD (mix), sourceMD, digiprovMD, rightsMD); fileGrp (file); StructMap (div, fptr; div, fptr)]
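Because the sections are tied together by ID/IDREF links, a simple consistency check is to confirm that every reference points at an ID that exists. The sketch below is illustrative only: the sample document is pared down and not meant to be schema-valid, all ID values are hypothetical, and only the DMDID, ADMID, and FILEID attributes are checked.

```python
import xml.etree.ElementTree as ET

# Pared-down sample with two deliberately broken links (DMD2 and FILE9).
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <dmdSec ID="DMD1"/>
  <amdSec><techMD ID="TECH1"/></amdSec>
  <fileSec><fileGrp><file ID="FILE1" ADMID="TECH1"/></fileGrp></fileSec>
  <structMap>
    <div>
      <div DMDID="DMD1"><fptr FILEID="FILE1"/></div>
      <div DMDID="DMD2"><fptr FILEID="FILE9"/></div>
    </div>
  </structMap>
</mets>"""

REFERENCE_ATTRIBUTES = ("DMDID", "ADMID", "FILEID")

def dangling_references(root):
    """Return (attribute, target) pairs whose target ID is not declared."""
    declared = {el.get("ID") for el in root.iter() if el.get("ID")}
    missing = []
    for el in root.iter():
        for attr in REFERENCE_ATTRIBUTES:
            # DMDID and ADMID may hold several space-separated references.
            for target in el.get(attr, "").split():
                if target not in declared:
                    missing.append((attr, target))
    return missing

missing = dangling_references(ET.fromstring(SAMPLE))
print(missing)
```

A schema-aware validator performs this ID/IDREF check as part of XML Schema validation; the sketch only makes the linking mechanism visible.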

What is a METS Profile?
» Description of a class of METS documents
  - Provides document authors and programmers guidance to create and process conformant METS documents
» XML document using a schema
  - Expresses the requirements that a METS document must satisfy
» “Data standard” in its own right
  - A sufficiently explicit METS Profile may be considered a “data standard”
» METS Profiles are human-readable prose and not intended to be “machine actionable”

METS Profile Excerpt » Recorded Event – structMap requirement

METS Profiles Used in LC Presents » Sheet Music » Musical Score (score, score and parts, or a set of parts only) » Print Material (books, pamphlets, etc) » Music Manuscript (score or sketches) » Recorded Event (audio or video) » PDF Document » Bibliographic Record » Photograph » Compact Disc » Collection

Multiple Inputs to Common Data Format New Digital Objects Legacy Database Profile-based METS Object A common data format for searching and display Harvest of American Memory Objects

Example 1: New Digital Object » METS Musical Score Profile » Library of Congress March by John Philip Sousa » Musical score and parts

Example 2: New Digital Object » METS Recorded Event Profile » Juilliard String Quartet » Sound Recording

Example 3: Legacy Database
» METS Bibliographic Record Profile
» Duke Ellington & His Orchestra (1962) [Motion Picture]
» Bibliographic Information
  - Convert database from FileMaker Pro to a single XML file.
  - XSLT stylesheet creates 14,000 METS/MODS records.
  - XSL-FO stylesheet creates a single PDF document.

Example 4: American Memory Harvest
» METS Photograph Profile
» William P. Gottlieb Collection: Portrait of Louis Armstrong
» Photographic object
  - Convert a file of 1600 MARC records, using marc4j, to an XML modsCollection (single file).
  - Use an XSLT stylesheet to create 1600 records conforming to the METS photograph profile.
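The second step above, splitting one modsCollection file into one METS record per MODS record, was done with an XSLT stylesheet. The sketch below uses Python's ElementTree as a stand-in for XSLT, purely to show the shape of the transformation. The sample collection is invented (the second title is hypothetical), and the output records omit everything the actual photograph profile would require beyond the wrapped MODS.

```python
import copy
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)

# Invented two-record collection standing in for the 1600-record file.
COLLECTION = f"""<modsCollection xmlns="{MODS_NS}">
  <mods><titleInfo><title>Portrait of Louis Armstrong</title></titleInfo></mods>
  <mods><titleInfo><title>Hypothetical second photograph</title></titleInfo></mods>
</modsCollection>"""

def split_collection(collection_xml):
    """Return one mets element per mods record found in the collection."""
    collection = ET.fromstring(collection_xml)
    records = []
    for mods in collection.findall(f"{{{MODS_NS}}}mods"):
        mets = ET.Element(f"{{{METS_NS}}}mets")
        dmd_sec = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": "DMD1"})
        md_wrap = ET.SubElement(dmd_sec, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "MODS"})
        xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
        # Copy so the source collection is left untouched.
        xml_data.append(copy.deepcopy(mods))
        records.append(mets)
    return records

records = split_collection(COLLECTION)
print(len(records))
```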

Logical & Physical Relationships
» Logical (MODS): Original Work, Derivative Work 1, Derivative Work 2
  - mods:mods and mods:relatedItem type="otherVersion" elements create a sequence of 3 nodes
» Physical (METS structMap)
  - div TYPE=“photo:version” elements correspond to the 3 nodes using a logical sequence of ID to DMDID relationships
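The ID to DMDID correspondence can be sketched in Python as a lookup from each structMap div to the MODS node it points at. The sample below is an assumption built from the labels on the slide: the ID values `ver1` to `ver3` are hypothetical, and placing the two relatedItem elements as siblings is a guess at the layout.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/", "mods": "http://www.loc.gov/mods/v3"}

SAMPLE = """<mets xmlns="http://www.loc.gov/METS/" xmlns:mods="http://www.loc.gov/mods/v3">
  <dmdSec ID="DMD1"><mdWrap MDTYPE="MODS"><xmlData>
    <mods:mods ID="ver1">
      <mods:titleInfo><mods:title>Original Work</mods:title></mods:titleInfo>
      <mods:relatedItem type="otherVersion" ID="ver2">
        <mods:titleInfo><mods:title>Derivative Work 1</mods:title></mods:titleInfo>
      </mods:relatedItem>
      <mods:relatedItem type="otherVersion" ID="ver3">
        <mods:titleInfo><mods:title>Derivative Work 2</mods:title></mods:titleInfo>
      </mods:relatedItem>
    </mods:mods>
  </xmlData></mdWrap></dmdSec>
  <structMap><div>
    <div TYPE="photo:version" DMDID="ver1"/>
    <div TYPE="photo:version" DMDID="ver2"/>
    <div TYPE="photo:version" DMDID="ver3"/>
  </div></structMap>
</mets>"""

def titles_by_div(root):
    """Pair each div's DMDID with the title of the MODS node carrying that ID."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    pairs = []
    for div in root.iterfind(".//mets:div[@DMDID]", NS):
        node = by_id[div.get("DMDID")]
        title = node.findtext("mods:titleInfo/mods:title", namespaces=NS)
        pairs.append((div.get("DMDID"), title))
    return pairs

pairs = titles_by_div(ET.fromstring(SAMPLE))
print(pairs)
```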

Validation in METS Profiles
» 3 levels of validation for METS objects
  - Validation of XML (well-formed)
  - Validation of METS/MODS (XML Schema)
  - Validation of METS Profile
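The first and third levels can be sketched with the Python standard library. The second level (XML Schema validation) needs a schema-aware validator and is not shown. The profile rule here, that the structMap must contain a div with a given TYPE, is a hypothetical stand-in for whatever a real profile requires.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

def is_well_formed(text):
    """Level 1: does the text parse as XML at all?"""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def profile_problems(text, required_div_type):
    """Level 3 (hypothetical rule): require a div with the given TYPE."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{METS_NS}}}mets":
        problems.append("root element is not mets")
    types = {div.get("TYPE") for div in root.iter(f"{{{METS_NS}}}div")}
    if required_div_type not in types:
        problems.append(f"no div with TYPE={required_div_type!r}")
    return problems

GOOD = f'<mets xmlns="{METS_NS}"><structMap><div TYPE="photo:version"/></structMap></mets>'
BAD = f'<mets xmlns="{METS_NS}"><structMap><div TYPE="other"/></structMap></mets>'

print(is_well_formed("<mets><div></mets>"))
print(profile_problems(GOOD, "photo:version"))
print(profile_problems(BAD, "photo:version"))
```

Each level assumes the previous one passed: a profile check on a document that is not well-formed would fail at parse time.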

Example 1: Aggregation » METS Song Collection Object » Hierarchy of METS documents Collection members include sheet music, an audio recording, a manuscript, and a biography of the composer.

Example 2: Aggregation » MODS relatedItem type=“host” » memberOf:Baseball sheet music Objects can be related to a virtual aggregate – in this case “Baseball sheet music”
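Membership in a virtual aggregate can be read back out of the records by grouping them on the title of their relatedItem type="host". A minimal Python sketch follows; the record titles are invented placeholders, and only the aggregate name "Baseball sheet music" comes from the slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

MODS_NS = "http://www.loc.gov/mods/v3"
NS = {"mods": MODS_NS}

def record(title, host=None):
    """Build a tiny MODS record string, optionally pointing at a host aggregate."""
    host_xml = (
        f'<relatedItem type="host"><titleInfo><title>{host}</title></titleInfo></relatedItem>'
        if host else ""
    )
    return f'<mods xmlns="{MODS_NS}"><titleInfo><title>{title}</title></titleInfo>{host_xml}</mods>'

# Placeholder records; only the aggregate name is taken from the slide.
RECORDS = [
    record("Sample song A", "Baseball sheet music"),
    record("Sample song B", "Baseball sheet music"),
    record("Sample song C"),
]

def group_by_host(records):
    """Map each host title to the titles of the records that name it."""
    groups = defaultdict(list)
    for text in records:
        mods = ET.fromstring(text)
        title = mods.findtext("mods:titleInfo/mods:title", namespaces=NS)
        for host in mods.findall("mods:relatedItem[@type='host']", NS):
            host_title = host.findtext("mods:titleInfo/mods:title", namespaces=NS)
            groups[host_title].append(title)
    return dict(groups)

groups = group_by_host(RECORDS)
print(groups)
```

The aggregate is "virtual" in the sense that no separate collection record is needed; it exists wherever records share the same host title.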

Example 3: Aggregation » “See also” reference » MODS relatedItem (no type)

Example: Administrative Metadata » PREMIS and MIX for digital images

Software/Tools for METS/MODS » Emacs – text editor (used to edit MODS) » nxml-mode – plug-in for schema-aware XML editing » XML Schemas for METS, MODS, MIX, PREMIS

Software/Tools for METS/MODS
» cygwin – bash shell command line and tools
» Saxon – XSLT transformations
» Xerces – XML validation
» mysql-jdbc-connector – connect to MySQL
» SRU – retrieve records from ILS
» Cocoon – facilities to retrieve and load records, retrieve XML version of a file system, etc.
» Ant – used to automate all of the above tasks and create pipelines of multiple tasks (runs from Emacs)
continued…

Advantages of METS/MODS Approach
» Ability to model complex library objects
» Ease of change and extension of both the data and the application
» Use of modern, non-proprietary software tools
» Use of XSLT for…
  - Legacy data conversion
  - Batch METS creation and editing
  - Web displays and behaviors
» Use of a common syntax – XML
  - For data creation, editing, storage and searching
continued…

Advantages of METS/MODS Approach
» Creation of multiple outputs from XML
  - HTML/XHTML for Web display; PDF for printing
» Ease of editing
  - Single records or selected batches of records
» Ability to validate data
» Ability to aggregate disparate data sources
» Ease of data management and publishing
» Excellent positioning for the future
  - New web applications (Web 2.0)
  - Repository submission and OAI harvesting
  - Cooperative projects (test interoperability)