METS Case Study: The NYU Digital Library Team METS Opening Day 27 October, 2003 Leslie Myrick.


Projects at NYU using METS
- EAD Finding Aid Project
- Tokyo Tribunal Proceedings
- Afghanistan Digital Library
- CRL Political Web Archiving Project
- DRAM *
- Hemispheric Institute *
- REPO History Sign Project *

Why METS? (1)
METS was formulated to serve as a:
- Submission Information Package (SIP)
- Archival Information Package (AIP)
- Dissemination Information Package (DIP)

Why METS? (2)
In other words, it’s a …
- Transfer Syntax
- Archival Syntax
- Functional Syntax

METS and Complex Digital Objects
- Finding aid + images with multiple scans/versions
- Page turner for photo albums, documents, books
  - Edisto Album, Tokyo Tribunal brief, Afghanistan Digital Library
- Multimedia/Time-Based Media Navigators: Hemispheric Institute; SMIL Viewer
- Web Site Navigator
  - CRL Political Communications Web Archiving Project

Using METS as a SIP
- Berol Collection Finding Aid -- in negotiations with RLG Cultural Materials Project
- METS will be bundled with objects; EAD

METS as a Functional Syntax
- METS is designed not only for transfer and archival management, but also for giving access to and navigating an object
- METS + XSLT can create dynamic interfaces with links to resources and their metadata
- METS can be loaded into Oracle, then indexed and searched using context-aware queries
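To make the "functional syntax" idea concrete, here is a minimal sketch of how a page-turner could read its page order out of a METS structMap. The slides describe doing this with XSLT; Python's standard library is swapped in here only for illustration. The sample document, file names, and the `page_sequence` helper are all hypothetical and are not taken from the NYU implementation.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"mets": METS_NS}

# Hypothetical two-page object; divs are deliberately listed out of order.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="reference">
      <mets:file ID="F1"><mets:FLocat LOCTYPE="URL" xlink:href="page001.jpg"/></mets:file>
      <mets:file ID="F2"><mets:FLocat LOCTYPE="URL" xlink:href="page002.jpg"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="physical">
    <mets:div TYPE="album">
      <mets:div TYPE="page" ORDER="2" LABEL="Page 2"><mets:fptr FILEID="F2"/></mets:div>
      <mets:div TYPE="page" ORDER="1" LABEL="Page 1"><mets:fptr FILEID="F1"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def page_sequence(mets_xml):
    """Return [(label, href), ...] for page divs, sorted by their ORDER attribute."""
    root = ET.fromstring(mets_xml)

    # Map each file ID in the fileSec to the location it points at.
    hrefs = {}
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find("mets:FLocat", NS)
        if flocat is not None:
            hrefs[file_el.get("ID")] = flocat.get(f"{{{XLINK_NS}}}href")

    # Walk the structMap divs and resolve each page's file pointer.
    pages = []
    for div in root.iter(f"{{{METS_NS}}}div"):
        if div.get("TYPE") != "page":
            continue
        fptr = div.find("mets:fptr", NS)
        if fptr is not None:
            pages.append((int(div.get("ORDER")), div.get("LABEL"), hrefs[fptr.get("FILEID")]))

    return [(label, href) for _, label, href in sorted(pages)]


if __name__ == "__main__":
    for label, href in page_sequence(SAMPLE_METS):
        print(label, href)
```

The same two lookups (structMap div to file ID, file ID to location) are what an XSLT stylesheet would perform when rendering previous/next links.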

METS Plays Well With Others
We have …
- EAD Finding Aids pointing to METS
- METS pointing to Finding Aids and MARCXML records
- METS pointing to and manipulating TEI

METS and Extensions at NYU
- MODS and DC for descriptive
- MIX for images/technical
- textMD for text/technical
- LC A/V Prototype + smptetechMD + AES
- Missing links: overall preservation schema plugin (PREMIS); rights MD schema

Ingredients (so far)
- Perl
- MySQL and some Oracle
- Tomcat
- Servlets and JSP
- Saxon and XT
- XSLT

Tools for Creation
- zeroDB Database
- Input via interface as well as batch loading of metadata extracted by scripts, e.g. ImageMagick identify, arcscraper.pl
- Outputs METS using Perl DBI

Tools for Dissemination
- Page-turners
- Multimedia Viewers
- Thumbnail Browsers

Typical METS Creation Workflow
- ImageMagick extraction of image metadata
- Database input (batch and manual entry) of descriptive and technical metadata
- Generation of METS using Perl DBI against MySQL
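The last step of this workflow, generating METS from database rows, might look roughly like the sketch below. The original used Perl DBI against MySQL; this illustration substitutes Python with an in-memory SQLite database so that it runs without a server. The `image_file` table, its columns, the second file name, and the `OBJID` value are assumptions made for the example, not the zeroDB schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_demo_db():
    """Create a hypothetical one-table database standing in for the real one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_file (file_id TEXT, href TEXT, mimetype TEXT, seq INTEGER)"
    )
    conn.executemany(
        "INSERT INTO image_file VALUES (?, ?, ?, ?)",
        [
            ("F1", "taqw-fr001s.jpg", "image/jpeg", 1),
            ("F2", "taqw-fr002s.jpg", "image/jpeg", 2),  # hypothetical second page
        ],
    )
    return conn


def mets_from_db(conn, object_id):
    """Build a skeletal METS document (fileSec + structMap) from the table rows."""
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "reference"})
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    top_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "object"})

    rows = conn.execute(
        "SELECT file_id, href, mimetype, seq FROM image_file ORDER BY seq"
    )
    for file_id, href, mimetype, seq in rows:
        # One <file> entry per row, pointing at the image by URL.
        file_el = ET.SubElement(
            file_grp, f"{{{METS_NS}}}file", {"ID": file_id, "MIMETYPE": mimetype}
        )
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href},
        )
        # One page <div> per row, linked back to its file.
        page = ET.SubElement(
            top_div, f"{{{METS_NS}}}div", {"TYPE": "page", "ORDER": str(seq)}
        )
        ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": file_id})

    return ET.tostring(mets, encoding="unicode")


if __name__ == "__main__":
    print(mets_from_db(build_demo_db(), "demo-object"))
```

A production generator would also emit descriptive and technical metadata sections (dmdSec, amdSec); they are left out here to keep the sketch short.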

ImageMagick Verbose Dump
Image: taqw_001s.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 625x886
Class: DirectClass
Type: true color
Depth: 8 bits-per-pixel component
Colors:
Profile-color: 552 bytes
Profile-iptc: 5636 bytes
unknown: êëÿ
Resolution: 100x100 pixels/inch
Filesize: 210kb
Interlace: None
Background Color: white
Border Color: #dfdfdf
Matte Color: grey74
Iterations: 0
Compression: JPEG
signature: 8c37d0b82374d8eaa6b4d6b062699a9b8d7d86f2ba1d4e320f d
Tainted: False
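As a rough illustration of how a verbose dump like this could feed technical metadata, the sketch below splits "Key: value" lines into a dictionary and derives a few numeric fields. It assumes the dump has one field per line in the layout shown on this slide; other ImageMagick versions format these lines differently. The output keys are hypothetical names chosen for the example, not MIX element names.

```python
# A shortened copy of the slide's dump, one field per line.
VERBOSE_DUMP = """\
Image: taqw_001s.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 625x886
Depth: 8 bits-per-pixel component
Colors:
Resolution: 100x100 pixels/inch
Compression: JPEG
"""


def parse_verbose(dump):
    """Split each 'Key: value' line at the first colon; skip empty values."""
    fields = {}
    for line in dump.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def technical_summary(fields):
    """Derive a small technical-metadata record (hypothetical key names)."""
    width, height = (int(n) for n in fields["Geometry"].split("x"))
    x_res, y_res = (int(n) for n in fields["Resolution"].split()[0].split("x"))
    return {
        "file": fields["Image"],
        "format": fields["Format"].split()[0],
        "width": width,
        "height": height,
        "bits_per_sample": int(fields["Depth"].split()[0]),
        "x_resolution": x_res,
        "y_resolution": y_res,
        "compression": fields["Compression"],
    }


if __name__ == "__main__":
    print(technical_summary(parse_verbose(VERBOSE_DUMP)))
```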

ImageMagick non-Verbose Dump
taqw-fr001.tif TIFF 6500x6817 DirectClass 8-bit 126mb 4.3u 0:06
taqw-fr001s.jpg[1] JPEG 625x886 DirectClass 8-bit 191kb 0.0u 0:01
taqw-fr001t.jpg[2] JPEG 100x142 DirectClass 8-bit 9954b 0.0u 0:01
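The one-line-per-image form is the easier of the two to batch-load. A minimal sketch of parsing it is shown below, assuming the exact column layout on this slide (name, optional scene index, format, geometry, class, depth, size); newer ImageMagick releases print different columns, so the regular expression is an assumption tied to this sample. The function and key names are hypothetical.

```python
import re

# Pattern for the slide's layout only, e.g.
#   taqw-fr001s.jpg[1] JPEG 625x886 DirectClass 8-bit 191kb 0.0u 0:01
IDENTIFY_LINE = re.compile(
    r"^(?P<name>\S+?)(?:\[(?P<scene>\d+)\])?\s+"
    r"(?P<format>[A-Z0-9]+)\s+"
    r"(?P<width>\d+)x(?P<height>\d+)\s+"
    r"(?P<image_class>\S+)\s+"
    r"(?P<depth>\d+)-bit\s+"
    r"(?P<size>\d+(?:\.\d+)?)(?P<unit>mb|kb|b)\b"
)

SAMPLE = """\
taqw-fr001.tif TIFF 6500x6817 DirectClass 8-bit 126mb 4.3u 0:06
taqw-fr001s.jpg[1] JPEG 625x886 DirectClass 8-bit 191kb 0.0u 0:01
taqw-fr001t.jpg[2] JPEG 100x142 DirectClass 8-bit 9954b 0.0u 0:01
"""


def parse_identify_line(line):
    """Return a dict of fields for one identify line, or None if it does not match."""
    match = IDENTIFY_LINE.match(line.strip())
    if match is None:
        return None
    g = match.groupdict()
    return {
        "name": g["name"],
        "scene": int(g["scene"]) if g["scene"] is not None else None,
        "format": g["format"],
        "width": int(g["width"]),
        "height": int(g["height"]),
        "image_class": g["image_class"],
        "depth_bits": int(g["depth"]),
        # Size is kept as reported; no unit conversion is attempted.
        "size": float(g["size"]),
        "size_unit": g["unit"],
    }


if __name__ == "__main__":
    for row in SAMPLE.splitlines():
        print(parse_identify_line(row))
```

Rows parsed this way could be inserted straight into a technical-metadata table before METS generation.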

Extracting METS from a DB
- doWebArchive.cgi
- MODS for homepage; DC for pages
- MIX for images/technical
- textMD for web page/technical

METS for Discovery
- Dump METS files into Oracle as CLOB
- Create Oracle interMedia index
  - XML-aware full-text search
- Example: CRL political web archiving project
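The idea of an XML-aware search, where a term is matched only inside a named element, can be sketched without Oracle. The example below is a stand-in, not the Oracle interMedia setup: it stores each METS document whole in SQLite (playing the role of the CLOB) and keeps a side table of element name plus text so that queries can be scoped to an element. The table names, the helpers, and the sample documents are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_index():
    """Create the document store and the per-element text table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mets_doc (doc_id TEXT PRIMARY KEY, xml TEXT)")
    conn.execute("CREATE TABLE mets_text (doc_id TEXT, element TEXT, text TEXT)")
    return conn


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def index_document(conn, doc_id, mets_xml):
    """Store the whole document and one row per element that carries text."""
    conn.execute("INSERT INTO mets_doc VALUES (?, ?)", (doc_id, mets_xml))
    for element in ET.fromstring(mets_xml).iter():
        text = (element.text or "").strip()
        if text:
            conn.execute(
                "INSERT INTO mets_text VALUES (?, ?, ?)",
                (doc_id, local_name(element.tag), text),
            )


def search_within(conn, element, term):
    """Return IDs of documents where `term` occurs inside the given element."""
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM mets_text "
        "WHERE element = ? AND text LIKE ? ORDER BY doc_id",
        (element, f"%{term}%"),
    )
    return [row[0] for row in rows]


def sample_mets(title, note):
    """Build a tiny METS wrapper around a MODS title and note (demo only)."""
    return (
        '<mets:mets xmlns:mets="http://www.loc.gov/METS/" '
        'xmlns:mods="http://www.loc.gov/mods/v3">'
        '<mets:dmdSec ID="DMD1"><mets:mdWrap MDTYPE="MODS"><mets:xmlData>'
        f"<mods:mods><mods:titleInfo><mods:title>{title}</mods:title></mods:titleInfo>"
        f"<mods:note>{note}</mods:note></mods:mods>"
        "</mets:xmlData></mets:mdWrap></mets:dmdSec>"
        "<mets:structMap><mets:div/></mets:structMap>"
        "</mets:mets>"
    )


if __name__ == "__main__":
    conn = make_index()
    index_document(conn, "site-a", sample_mets("Party manifesto", "Harvested home page"))
    index_document(conn, "site-b", sample_mets("Home page", "Links to a manifesto"))
    print(search_within(conn, "title", "manifesto"))
```

A dedicated XML text index does this scoping natively; the side table only imitates the behavior for the purpose of the example.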

CRL Political Web Archive
- Collaboration between Stanford, Cornell, Texas, NYU, IA under aegis of CRL, Mellon
- Sub-Saharan Africa, South East Asia, Latin America, Western Europe
- Testbed: 400 URLs; websites from radical groups, NGOs
- Internet Archive .arc files

.arc file
- 100 MB aggregate of harvested files, along with HTTP headers and crawler-generated header for each file
- Fine as a simple SIP, but basically unmanageable as an AIP or DIP
- At present accessed using byte offsets to grab content from aggregate file
- Only searchable by URL (Wayback Machine)
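Byte-offset access to an aggregate file can be illustrated with a toy container. The sketch below concatenates records into one byte string and keeps an index of URL to (offset, length), then seeks directly to a record. The one-line header used here is a simplification invented for the example and is not the real ARC record layout; the URLs and function names are likewise hypothetical.

```python
import io


def build_aggregate(records):
    """Concatenate (url, body) pairs; return the bytes and a URL -> (offset, length) index."""
    buffer = io.BytesIO()
    index = {}
    for url, body in records:
        # Simplified per-record header: URL and body length on one line.
        header = f"{url} {len(body)}\n".encode("ascii")
        # The body starts immediately after the header.
        index[url] = (buffer.tell() + len(header), len(body))
        buffer.write(header)
        buffer.write(body)
        buffer.write(b"\n")
    return buffer.getvalue(), index


def fetch(aggregate, index, url):
    """Seek to the recorded offset and read exactly one record body."""
    offset, length = index[url]
    stream = io.BytesIO(aggregate)
    stream.seek(offset)
    return stream.read(length)


if __name__ == "__main__":
    demo = [
        ("http://example.org/", b"<html>home</html>"),
        ("http://example.org/about", b"<html>about us</html>"),
    ]
    aggregate, index = build_aggregate(demo)
    print(fetch(aggregate, index, "http://example.org/about"))
```

The sketch also shows the limitation noted on the slide: without the index, nothing in the aggregate can be found except by scanning, and the index itself is keyed only by URL.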

Automated extraction of text-based metadata, e.g. web pages
- arcscraper.pl
  - Descriptive and technical MD for object
- datscraper.pl
  - Checksums, titles
  - Links from each object
- makeLinkTable.pl
  - Creates link-to-object relationships
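The division of labor among these scripts can be approximated in a few lines. The sketch below is not a port of the Perl scripts, whose internals the slides do not show. It computes a checksum, pulls the title and outgoing links from a page, and then builds a link table restricted to targets that are themselves archived objects. MD5 is an assumed checksum algorithm, and every function name and URL is hypothetical.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScraper(HTMLParser):
    """Collect the <title> text and every <a href> value from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scrape(url, html_bytes):
    """Return checksum, title, and absolute outgoing links for one object."""
    parser = PageScraper()
    parser.feed(html_bytes.decode("utf-8", errors="replace"))
    return {
        "url": url,
        "md5": hashlib.md5(html_bytes).hexdigest(),  # assumed algorithm
        "title": parser.title.strip(),
        "links": [urljoin(url, href) for href in parser.links],
    }


def make_link_table(pages):
    """Keep only (source, target) pairs whose target is also an archived object."""
    known = {page["url"] for page in pages}
    return [
        (page["url"], target)
        for page in pages
        for target in page["links"]
        if target in known
    ]


if __name__ == "__main__":
    home = scrape(
        "http://example.org/",
        b'<html><head><title>Home</title></head><body>'
        b'<a href="b.html">B</a> <a href="http://other.example/">X</a></body></html>',
    )
    page_b = scrape("http://example.org/b.html", b"<html><title>B</title></html>")
    print(make_link_table([home, page_b]))
```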

Go to Videotape

The Future?
- Persistent Identifiers
- Preservation Metadata Schema
- Java development
- Move from Oracle to Cheshire II