A METS Application Profile for Historical Newspapers

A METS Application Profile for Historical Newspapers Morgan Cundiff Network Development and MARC Standards Office Library of Congress

Outline
XML and Standards
Definition of METS
Definition of METS Profiles
Use of MODS relatedItem element
Draft METS Profile for Historical Newspapers
Parting Thoughts

XML
“XML has become the de-facto standard for representing metadata descriptions of resources on the Internet.”
Jane Hunter, Working towards MetaUtopia - A Survey of Current Metadata Research

The Importance of Standards
“In moving from dispersed digital collections to interoperable digital libraries, the most important activity we need to focus on is standards… most important is the wide variety of metadata standards [including] descriptive metadata… administrative metadata…, structural metadata, and terms and conditions metadata…”
Howard Besser, The Next Stage: Moving from Isolated Digital Collections to Interoperable Digital Libraries

What is METS? METS is an XML Schema designed for the purpose of creating XML document instances that express the hierarchical structure of digital library objects, the names and locations of the files that comprise those objects, and the associated metadata. METS can, therefore, be used as a tool for modeling real world objects, such as particular document types.

What are the 7 Sections of a METS Document?
<mets>
  <metsHdr/>
  <dmdSec/>
  <amdSec/>
  <fileSec/>
  <structMap/>
  <structLink/>
  <behaviorSec/>
</mets>
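As a rough illustration of the seven sections listed above, the following Python sketch counts how often each one occurs at the top level of a METS document. The sample document and the function name count_sections are made up for this example; only the METS namespace URI and the section names come from the standard.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
SECTIONS = ["metsHdr", "dmdSec", "amdSec", "fileSec",
            "structMap", "structLink", "behaviorSec"]


def count_sections(mets_xml):
    """Count direct children of the root <mets> element, per section name."""
    root = ET.fromstring(mets_xml)
    return {name: len(root.findall(f"{{{METS_NS}}}{name}")) for name in SECTIONS}


# Hypothetical minimal document, used only to exercise the function.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1"/>
  <mets:fileSec/>
  <mets:structMap><mets:div/></mets:structMap>
</mets:mets>"""

counts = count_sections(SAMPLE)
for name in SECTIONS:
    print(f"{name}: {counts[name]}")
```

Sections that are absent simply report a count of zero, which makes it easy to see at a glance which parts of the METS structure a given document actually uses.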

The Descriptive Metadata Section with mdWrap
<mets>
  <dmdSec>
    <mdWrap>
      <xmlData>
        <!-- insert data from different namespace here -->
      </xmlData>
    </mdWrap>
  </dmdSec>
  <fileSec></fileSec>
  <structMap></structMap>
</mets>

The Descriptive Metadata Section with MODS as extension schema
<mets:mets>
  <mets:dmdSec>
    <mets:mdWrap>
      <mets:xmlData>
        <mods:mods></mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <mets:fileSec></mets:fileSec>
  <mets:structMap></mets:structMap>
</mets:mets>

The Descriptive Metadata Section with MODS and relatedItem elements
<mets:mets>
  <mets:dmdSec>
    <mets:mdWrap>
      <mets:xmlData>
        <mods:mods>
          <mods:relatedItem type="constituent">
            <mods:relatedItem type="constituent"></mods:relatedItem>
          </mods:relatedItem>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <mets:fileSec></mets:fileSec>
  <mets:structMap></mets:structMap>
</mets:mets>

MODS relatedItem element
The relatedItem element has the same content model as the top-level mods element (titleInfo, name, subject, physicalDescription, note, etc.).
It makes it possible to create very rich analytic descriptions for works contained within a MODS record.
It is repeatable and can be nested recursively, which makes it possible to build a hierarchical tree structure.
It also makes it possible to associate descriptive data with any structural element.

Use of MODS relatedItem element to express logical structure
<mods:mods>
  <mods:titleInfo><mods:title>Baltimore Sun</mods:title></mods:titleInfo>
  <mods:relatedItem type="constituent">
    <mods:titleInfo><mods:title>Sports</mods:title></mods:titleInfo>
    <mods:relatedItem type="constituent">
      <mods:titleInfo><mods:title>O’s Split Beantown Twi-niter</mods:title></mods:titleInfo>
    </mods:relatedItem>
    <mods:relatedItem type="constituent">
      <mods:titleInfo><mods:title>Chisox Nip Tribe</mods:title></mods:titleInfo>
    </mods:relatedItem>
  </mods:relatedItem>
</mods:mods>
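Because relatedItem can be nested recursively, the logical structure can be read back with a short recursive walk. The Python sketch below is only an illustration: the function name outline and the sample record are invented here, and the sample assumes that each title sits inside a titleInfo element, as MODS requires.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"


def outline(node, depth=0):
    """Return (depth, title) pairs for a MODS node and its nested constituents."""
    title = node.findtext(f"{MODS}titleInfo/{MODS}title", default="(untitled)")
    rows = [(depth, title)]
    for child in node.findall(f"{MODS}relatedItem"):
        # Only follow "constituent" links; other relatedItem types are ignored.
        if child.get("type") == "constituent":
            rows.extend(outline(child, depth + 1))
    return rows


# Hypothetical sample record, modelled on the slide's newspaper example.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Baltimore Sun</mods:title></mods:titleInfo>
  <mods:relatedItem type="constituent">
    <mods:titleInfo><mods:title>Sports</mods:title></mods:titleInfo>
    <mods:relatedItem type="constituent">
      <mods:titleInfo><mods:title>Chisox Nip Tribe</mods:title></mods:titleInfo>
    </mods:relatedItem>
  </mods:relatedItem>
</mods:mods>"""

rows = outline(ET.fromstring(SAMPLE))
for depth, title in rows:
    print("  " * depth + title)
```

The indentation of the printed titles mirrors the nesting depth of the relatedItem elements, that is, issue, then section, then article.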

METS document with two hierarchies (logical and physical)
<mets:mets>
  <mets:dmdSec>
    <mets:mdWrap>
      <mets:xmlData>
        <mods:mods>
          <mods:relatedItem>
            <mods:relatedItem></mods:relatedItem>
          </mods:relatedItem>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
  <mets:fileSec></mets:fileSec>
  <mets:structMap>
    <mets:div>
      <mets:div></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>

Linking in METS Documents (XML ID/IDREF links)
DescMD: mods, relatedItem
AdminMD: techMD (mix), sourceMD, digiprovMD, rightsMD
fileGrp: file
StructMap: div, fptr
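The ID/IDREF links between these sections can be checked mechanically. The Python sketch below looks for references that point at no ID in the document. It assumes the standard METS attribute names DMDID (on div) and FILEID (on fptr); the function name dangling_refs and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"


def dangling_refs(mets_xml):
    """List (attribute, value) pairs whose target ID is missing from the document."""
    root = ET.fromstring(mets_xml)
    ids = {el.get("ID") for el in root.iter() if el.get("ID")}
    missing = []
    for div in root.iter(f"{METS}div"):
        # DMDID is an IDREFS attribute: a space-separated list of IDs.
        for ref in div.get("DMDID", "").split():
            if ref not in ids:
                missing.append(("DMDID", ref))
    for fptr in root.iter(f"{METS}fptr"):
        ref = fptr.get("FILEID")
        if ref and ref not in ids:
            missing.append(("FILEID", ref))
    return missing


# Hypothetical sample: FILEID "F9" deliberately points at nothing.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1"/>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="F1"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div DMDID="DMD1">
      <mets:fptr FILEID="F1"/>
      <mets:fptr FILEID="F9"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

missing = dangling_refs(SAMPLE)
print(missing)
```

A validating XML parser would catch the same problem through the schema's ID/IDREF typing; this version is only meant to make the linking idea concrete.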

What is a METS Application Profile? “METS Profiles are intended to describe a class of METS documents in sufficient detail to provide both document authors and programmers the guidance they require to create and process METS documents conforming with a particular profile.” A profile is expressed as an XML document. There is a schema for this purpose. The profile expresses the requirements that a METS document must satisfy. A sufficiently explicit METS Profile may be considered a “data standard”. Note: A METS Profile is a human-readable prose document and is not intended to be “machine actionable”.

METS Profile for Historical Newspapers [draft] The METS Profile for Historical Newspapers specifies how METS documents representing digitized historical newspapers should be encoded. Note that the profile is to be used to represent a single issue of a newspaper. The profile uses MODS to express the logical structure of a newspaper issue, and uses the METS structMap to express the physical structure of the newspaper issue. [draft abstract] URL to find Profile and related documents: http://www.loc.gov/standards/mets/test/ndnp/profile_notes.html mcundiff@loc.gov

METS Profile (features) Represents one issue of a newspaper.

METS Profile (features) The Profile presumes the use of ALTO files (or some equivalent) in which the zones on the digital image (expressed as coordinates) are correlated to the corresponding logical entity (e.g. article or paragraph) and also to the corresponding OCR text.

METS Profile (features) The Profile maintains a strict separation between logical entities and physical entities.

METS Profile (features) The primary logical entities are issue, issue section, article, article section, illustration, and advertisement. The top-level MODS record describes the issue. The other primary logical entities (issue section, article, article section, illustration, and advertisement) are described in a hierarchy of MODS relatedItem elements.

METS Profile (features) Logical structure is represented using MODS in the METS dmdSec. It is necessary to use the latest version (version 3.2) of MODS.

Hierarchy of Logical Entities
issue
issue section
article (or article-like entity)
paragraph
illustration (photograph, drawing, map, table)
article section
illustration
advertisements
article

METS Profile (features) The primary logical entities are expressed as values of the MODS genre element.

Use of MODS relatedItem element to express logical structure
<mods:mods>
  <mods:titleInfo><mods:title>Baltimore Sun</mods:title></mods:titleInfo>
  <mods:genre>newspaper</mods:genre>
  <mods:relatedItem type="constituent">
    <mods:titleInfo><mods:title>Sports</mods:title></mods:titleInfo>
    <mods:genre>section</mods:genre>
    <mods:relatedItem type="constituent">
      <mods:titleInfo><mods:title>O’s Split Beantown Twi-niter</mods:title></mods:titleInfo>
      <mods:genre>article</mods:genre>
      <mods:relatedItem type="constituent">
        <mods:titleInfo><mods:title>Aparicio puts tag on Jensen to end 7th</mods:title></mods:titleInfo>
        <mods:genre>photograph</mods:genre>
      </mods:relatedItem>
    </mods:relatedItem>
  </mods:relatedItem>
</mods:mods>

METS Profile (features) The allowable genre values (for Profile compliance) are listed in Newspaper Genre Terms [draft].
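A compliance check against the allowable genre values could look roughly like the Python sketch below. The vocabulary in ALLOWED_GENRES is a placeholder built from the genre values used in the example slides, not the actual Newspaper Genre Terms draft; the function name unknown_genres and the sample record are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Placeholder vocabulary (assumption): the real list is defined in the
# "Newspaper Genre Terms" draft, which is not reproduced here.
ALLOWED_GENRES = {"newspaper", "section", "article", "photograph"}


def unknown_genres(mods_xml, allowed=ALLOWED_GENRES):
    """Return the sorted genre values that are not in the allowed vocabulary."""
    root = ET.fromstring(mods_xml)
    found = {(g.text or "").strip() for g in root.iter(f"{MODS}genre")}
    return sorted(found - set(allowed))


# Hypothetical record with one genre value outside the placeholder list.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:genre>newspaper</mods:genre>
  <mods:relatedItem type="constituent">
    <mods:genre>section</mods:genre>
    <mods:relatedItem type="constituent">
      <mods:genre>editorial cartoon</mods:genre>
    </mods:relatedItem>
  </mods:relatedItem>
</mods:mods>"""

bad = unknown_genres(SAMPLE)
print(bad)
```

Because the check walks every genre element regardless of depth, it covers the issue-level record and all nested relatedItem descriptions in one pass.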

METS Profile (features) It is also possible to tag subparts of the primary logical entities. The typical example of this is tagging the paragraph. This is accomplished using the MODS part element.

METS Profile (features) There are only three physical entities. They are: issue, page, and pageRegion.

METS Profile (features) The physical entities are represented in the structMap section of the METS document as div elements distinguished by their TYPE attribute (e.g. div TYPE="news:page"). There is only one structMap.

METS Profile (features) Page regions are correlated to the corresponding logical entity by means of an IDREF link. Note that one or more page regions may correspond to a single logical entity. This makes it possible to make the necessary associations when the logical entity is split into more than one physical entity, e.g. when a paragraph is continued on the next column or an article is continued on a different page.
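The one-to-many relationship between a logical entity and its page regions can be sketched in Python as follows. This is an illustration under assumptions: it supposes the IDREF link is carried by the DMDID attribute of each region's div, and that regions are typed "news:pageRegion" by analogy with "news:page". The IDs and the function name regions_by_entity are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

METS = "{http://www.loc.gov/METS/}"


def regions_by_entity(mets_xml):
    """Map each logical entity ID to the page-region div IDs that point at it."""
    root = ET.fromstring(mets_xml)
    groups = defaultdict(list)
    for div in root.iter(f"{METS}div"):
        # Assumed TYPE value; the profile draft may spell it differently.
        if div.get("TYPE") == "news:pageRegion":
            for ref in div.get("DMDID", "").split():
                groups[ref].append(div.get("ID"))
    return dict(groups)


# Hypothetical structMap: article ART1 continues from page 1 onto page 2.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap>
    <mets:div TYPE="news:issue">
      <mets:div TYPE="news:page" ID="P1">
        <mets:div TYPE="news:pageRegion" ID="R1" DMDID="ART1"/>
        <mets:div TYPE="news:pageRegion" ID="R2" DMDID="ART2"/>
      </mets:div>
      <mets:div TYPE="news:page" ID="P2">
        <mets:div TYPE="news:pageRegion" ID="R3" DMDID="ART1"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

groups = regions_by_entity(SAMPLE)
print(groups)
```

Grouping by the referenced ID rather than by page is what lets an article that jumps to a later page be reassembled from its separate regions.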

METS Profile (features) Example document http://memory.loc.gov/cocoon/diglib/loc.news.sr.1002/default.html

Parting Thoughts
Agreement on a data standard (such as a METS profile) will facilitate interoperability.
Interoperability can be between any two agents (digital library applications, preservation repositories, search and retrieval systems, etc.).
The newspaper community has a “quality vs. quantity” dilemma.
The large volume of material to be digitized necessitates automatic processing.
Automatic processing produces dirty data and less satisfying results.
High-quality processing (requiring more human intervention) is more expensive but produces far better results and pays dividends far into the future (the data will be used over and over without additional cost).