IMPLEMENTATION ISSUES

How PREMIS can be used
• For systems in development: as a basis for metadata definition
• For existing repositories: as a checklist for evaluation

“It seems that often people say they aren't ready to implement PREMIS yet, but they don't seem to realise they are already collecting some of the same information that PREMIS describes. The metadata is the same because it is often common sense that it is needed in a repository system. PREMIS can be useful to point out a few extra areas they perhaps hadn't thought of yet.”
Deborah Woodyard-Robinson

Implementation issues: models
• Reconciling data models
  - The PREMIS data model is for convenience of aggregation
  - Many decisions are arbitrary, e.g. is an anomaly discovered during validation a property of the object or an outcome of the validation event?
  - Other data models are equally valid, e.g. NLNZ has Process, Object, File, Metadata
  - However, PREMIS encourages consistent application of preservation metadata across different categories of objects (representation, file, bitstream)
• Implementation in relational databases
  - The PREMIS data model is not an entity-relationship model

Implementation issues: obtaining values
• How to create or obtain metadata values?
  - Most can be populated by program, but tools would help (JHOVE, NLNZ Metadata Extraction Tool; tool page under development)
  - Registries are needed for format and environment information (PRONOM, GDFR)
• What values to use for controlled vocabularies?
  - PREMIS does not have a “scheme” element but probably ought to
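As an illustration of a value that "can be populated by program", the sketch below derives fixity and size information for a file using only the Python standard library. The function name `describe_file`, the default originator string, and the choice of dictionary keys (borrowed from PREMIS semantic unit names) are assumptions made for this example; it is not a substitute for tools such as JHOVE.

```python
import hashlib
from pathlib import Path


def describe_file(path, algorithm="sha1", originator="local-repository"):
    """Return values a program can derive from a file on disk.

    The keys reuse PREMIS semantic unit names; the originator value
    is a hypothetical placeholder.
    """
    data_path = Path(path)
    digest = hashlib.new(algorithm)
    # Read in chunks so large files do not have to fit in memory.
    with data_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "messageDigestAlgorithm": algorithm,
        "messageDigest": digest.hexdigest(),
        "messageDigestOriginator": originator,
        "size": data_path.stat().st_size,
    }
```

Format and environment values are deliberately left out here, since the slide notes that those depend on registries rather than on the file alone.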

Implementation issues: conformance
• Conformance is defined in the PREMIS Final Report
  - If you use the name, use the definition
  - Local metadata can supplement but not modify PREMIS
  - You can define more stringent repeatability and obligation, but not more liberal
• Meaning of mandatory: you have to know it, and you have to be able to supply it if exporting for exchange
  - You don’t have to record it in the repository
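The "more stringent but not more liberal" rule can be expressed as a small check. This is only a sketch: the representation of a rule as a dictionary with `mandatory` and `repeatable` flags, and the function name `is_conformant`, are assumptions for illustration and do not come from the PREMIS Final Report.

```python
def is_conformant(premis_rule, local_rule):
    """Check that a local rule is no more liberal than the PREMIS rule.

    Each rule is a dict with two booleans: 'mandatory' and 'repeatable'
    (a hypothetical encoding chosen for this sketch).
    """
    # Making a mandatory unit optional would be more liberal.
    if premis_rule["mandatory"] and not local_rule["mandatory"]:
        return False
    # Allowing repeats where PREMIS allows only one would be more liberal.
    if local_rule["repeatable"] and not premis_rule["repeatable"]:
        return False
    return True
```

For example, a local profile may make an optional, repeatable unit mandatory and non-repeatable, but may not do the reverse.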

Implementation issues: need for additional metadata
• Preservation metadata not considered core
  - core = all objects, all preservation strategies
  - example of non-core = installation requirements
• More detailed information on Rights and Agents
• Metadata describing the Intellectual Entity
• Format-specific technical metadata
• Business rules of the repository
• Information about the metadata itself (e.g. who obtained or recorded a value, when it was last changed)

XML issues

PREMIS XML schemas
• One schema for each PREMIS entity in the data model
  - Allows the user to choose which parts of PREMIS to use
• PREMIS container schema
  - References the schema for each entity type
  - Provides a container if it is desirable to keep some or all PREMIS metadata together
  - Using the container requires at least one object, which in turn requires objectIdentifier and objectCategory
  - Individual schemas may be used alone or with the container
• Semantic units in PREMIS schemas
  - The XML is faithful to the data dictionary
  - Only those units mandatory for all categories of objects are mandatory in the object schema
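A minimal sketch of the container requirement described above: one object carrying objectIdentifier and objectCategory. Element names follow the semantic unit names, on the slide's statement that the XML is faithful to the data dictionary. The namespace URI, the `premis` root element name, the two identifier sub-elements, and the sample values are assumptions for illustration; the output has not been validated against any PREMIS schema version.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; check the schema version actually in use.
PREMIS_NS = "http://www.loc.gov/standards/premis/v1"


def minimal_premis(identifier_type, identifier_value, category):
    """Build a container holding one object with the two required units."""
    ET.register_namespace("premis", PREMIS_NS)
    root = ET.Element(f"{{{PREMIS_NS}}}premis")
    obj = ET.SubElement(root, f"{{{PREMIS_NS}}}object")
    ident = ET.SubElement(obj, f"{{{PREMIS_NS}}}objectIdentifier")
    ET.SubElement(ident, f"{{{PREMIS_NS}}}objectIdentifierType").text = identifier_type
    ET.SubElement(ident, f"{{{PREMIS_NS}}}objectIdentifierValue").text = identifier_value
    ET.SubElement(obj, f"{{{PREMIS_NS}}}objectCategory").text = category
    return root


if __name__ == "__main__":
    # Hypothetical identifier values.
    doc = minimal_premis("local", "obj-0001", "file")
    print(ET.tostring(doc, encoding="unicode"))
```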

PREMIS Schemas
• Container schema
• Object schema
• Event schema
• Agent schema
• Rights schema

Proposed schema changes for new version
• Define an abstract object type to allow for better validation of object category (representation, file, bitstream)
• Define main elements globally to allow for reuse
• Implement an extensibility mechanism to provide for further structure when needed
• Implement a mechanism to use controlled vocabularies
• Adjust schemas to support changes in version 2 of the data dictionary

Implementing PREMIS using XML in METS

METS introduction
• METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
• A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
• METS is extensible and modular
• METS uses extension “wrappers” or “sockets” where elements from other schemas can be plugged in
• METS uses the XML Schema facility for combining vocabularies from different namespaces
• The METS Editorial Board has endorsed PREMIS as an extension schema
• Many institutions are trying to use PREMIS within the METS context

The structure of a METS file
[Diagram: a METS document and its main sections]
• dmdSec: descriptive metadata
• amdSec: administrative metadata
• fileSec: file inventory
• structMap: structural map
• behaviorSec: behaviour metadata
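The section layout can be sketched as an empty skeleton built with Python's standard library. This only shows the top-level arrangement: the sections are left empty, so the result is not a schema-valid METS document, and the function name `mets_skeleton` is made up for this example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Top-level sections named on the slide, in METS document order.
SECTIONS = ["dmdSec", "amdSec", "fileSec", "structMap", "behaviorSec"]


def mets_skeleton():
    """Return a mets root element with one empty child per section."""
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


if __name__ == "__main__":
    print(ET.tostring(mets_skeleton(), encoding="unicode"))
```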

Inserting technical metadata in a METS Document

Linking in METS Documents (XML ID/IDREF links)
[Diagram: DescMD (mods, relatedItem); AdminMD (techMD, sourceMD, digiprovMD, rightsMD); fileGrp (file); StructMap (div, fptr; div, fptr)]
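A small sketch of how ID/IDREF links can be followed in practice: a structMap `fptr` points to a `file` through FILEID, and the `file` points to administrative metadata sections through ADMID. The sample document and its ID values are invented for this example and are stripped down to the linking attributes only, so it is not a complete METS record.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced METS fragment showing only the links.
METS_XML = """
<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1"/>
    <digiprovMD ID="DP1"/>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1 DP1"/>
    </fileGrp>
  </fileSec>
  <structMap>
    <div>
      <fptr FILEID="FID1"/>
    </div>
  </structMap>
</mets>
"""

NS = {"m": "http://www.loc.gov/METS/"}


def resolve_links(xml_text):
    """Map each file referenced from the structMap to its metadata sections."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an ID attribute.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    links = {}
    for fptr in root.iterfind(".//m:fptr", NS):
        file_el = by_id[fptr.get("FILEID")]
        # ADMID is a space-separated list of IDREFs.
        admids = file_el.get("ADMID", "").split()
        links[file_el.get("ID")] = [by_id[i].tag.split("}")[1] for i in admids]
    return links


if __name__ == "__main__":
    print(resolve_links(METS_XML))
```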


METS extension schemas
• “Wrappers” or “sockets” where elements from other schemas can be plugged in
• Provides extensibility
• Uses the XML Schema facility for combining vocabularies from different namespaces
• Endorsed extension schemas:
  - Descriptive: MODS, DC, MARCXML
  - Technical metadata: MIX (image); textMD (text)
  - Preservation related: PREMIS

Issues in using PREMIS with METS
• Which METS sections to use and how many
• Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
• How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
• Recording structural relationships
• How to deal with locally controlled vocabularies
• Whether to use the PREMIS container

PREMIS and METS sections
• Flexibility of METS requires implementation decisions
• You can’t put all PREMIS metadata directly under amdSec
• What sections to use for PREMIS metadata?
  - Alternative 1: Object in techMD; Event in digiprovMD; Rights in rightsMD; Agent with event or rights
  - Alternative 2: everything in digiprovMD
  - Alternative 3: everything in techMD
• How many administrative MD sections to use?
• Experimentation will result in best practices
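The three alternatives can be written down as a lookup table, which makes the implementation decision explicit in code. The table and the helper `section_for` are a sketch for illustration; in particular, the `agent_travels_with` parameter is an assumed way of expressing "Agent with event or rights".

```python
# METS section chosen for each PREMIS entity under each alternative.
PLACEMENT = {
    1: {"object": "techMD", "event": "digiprovMD", "rights": "rightsMD"},
    2: {"object": "digiprovMD", "event": "digiprovMD", "rights": "digiprovMD"},
    3: {"object": "techMD", "event": "techMD", "rights": "techMD"},
}


def section_for(entity, alternative, agent_travels_with="event"):
    """Return the METS section name for a PREMIS entity type."""
    table = PLACEMENT[alternative]
    if entity == "agent":
        # An agent is stored alongside the event or rights it relates to.
        return table[agent_travels_with]
    return table[entity]
```

For example, `section_for("agent", 1, agent_travels_with="rights")` gives `"rightsMD"`, while under Alternative 2 every entity resolves to `"digiprovMD"`.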

Elements defined in both METS and PREMIS:
Example fragment (markup lost in extraction): SHA bc65c5b d09ad373eefd147382ecbf EchoDep/messageDigestOriginator>
• METS: CHECKSUM and CHECKSUMTYPE attributes of <file>
  - not repeatable
• PREMIS: fixity
  - also includes messageDigestOriginator
  - allows multiples

Elements defined both in METS and PREMIS:
Example fragment (markup partly lost in extraction): <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg" <techMD ID="TMD1PREMIS" image/jpeg 1.02
• METS: MIMETYPE attribute of <file>
  - optional
• PREMIS: more granular; includes name and version (although name may be the MIME type)
  - mandatory

Elements defined both in METS and PREMIS
Example fragment (markup lost in extraction): ECHODEP Hub Event echo12345 ECHODEP Hub Event echo12345 ingestion T15:12:53
• METS ID/IDREF: used to associate metadata in different sections and for different files
• PREMIS identifiers: explicit linking between entity types

Elements defined both in METS and PREMIS:
Example fragment (markup lost in extraction): structural is sibling of UCB FID2 1
• METS: structMap details structural relationships and is the heart of the METS document
  - hierarchical, so may be more expressive than PREMIS semantic units
  - links the elements of the structure to content files and metadata
• PREMIS: details all kinds of relationships, including structural
  - the data dictionary says that implementations may record relationships by other means

Should semantic units be recorded redundantly?
• Various options are possible when there is overlap between PREMIS and METS or PREMIS and other technical metadata schemas
  - Record only in METS
  - Record only in PREMIS
  - Record in both
• Are there advantages in using PREMIS semantic units?
• Is it important to keep PREMIS metadata together as a unit?
  - There may be an advantage for reuse and maintenance purposes

How to record elements from 2 different technical metadata schemas
• Format-specific metadata may be included in addition to PREMIS general technical metadata
• Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
  - e.g. MDTYPE="NISOIMG" or "PREMIS"
  - Give the MIX schema declaration in the METS document
• MIX was recently revised to correspond with the revision of the Z39.87 standard (technical metadata for digital still images); names were harmonized with the corresponding PREMIS semantic units
• For digital still images, best practice may be to use PREMIS for the general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
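The "multiple techMD sections" approach might look like the following sketch, which builds an amdSec with one techMD for PREMIS and one for MIX, each wrapped in mdWrap with the matching MDTYPE. The ID `TMD1PREMIS` appears in the example fragments of this presentation; `TMD1MIX` and the function name are invented here, and the xmlData elements are left empty rather than filled with real metadata.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def amd_with_two_techmd():
    """Build an amdSec holding separate techMD sections for PREMIS and MIX."""
    ET.register_namespace("mets", METS_NS)
    amd = ET.Element(f"{{{METS_NS}}}amdSec")
    # TMD1MIX is a hypothetical ID chosen to parallel TMD1PREMIS.
    for sec_id, mdtype in (("TMD1PREMIS", "PREMIS"), ("TMD1MIX", "NISOIMG")):
        tech = ET.SubElement(amd, f"{{{METS_NS}}}techMD", ID=sec_id)
        wrap = ET.SubElement(tech, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
        # The schema-specific metadata would be placed inside xmlData.
        ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    return amd


if __name__ == "__main__":
    print(ET.tostring(amd_with_two_techmd(), encoding="unicode"))
```

A file would then list both IDs in its ADMID attribute so that general and format-specific metadata are each reachable without being duplicated.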

Examples of PREMIS in XML
• PREMIS in METS:
  - Portrait of Louis Armstrong (Library of Congress)
  - Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
• MATHARC implementation: set_descr_mets_premis_02v2.xml

MPEG-21 Digital Item Declaration (DID)
• ISO/IEC : Digital Item Declaration
• A promising alternative for representing Digital Objects
• Starting to be supported by some repositories, e.g. aDORe, DSpace, Fedora
• A flexible and expressive model that easily represents compound objects (recursive “item”)
• Attach well-formed XML from persistent namespaces as metadata

Abstract Model for MPEG-21 DID
[Diagram labels: resource, component, descriptor/statement, item, container]
• resource: datastream
• component: binding of descriptor/statements to datastreams
• item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
• container: grouping of items and descriptor/statement constructs pertaining to the container
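The abstract model can be sketched as a handful of data classes, which makes the recursive "item" explicit. The class and field names below are chosen for this example and are not taken from the MPEG-21 specification; `count_resources` is a hypothetical helper showing how the recursion is walked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """A datastream, identified here by a reference string."""
    ref: str


@dataclass
class Descriptor:
    """A descriptor/statement construct carrying metadata as text."""
    statement: str


@dataclass
class Component:
    """Binds descriptor/statements to a datastream."""
    resource: Resource
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    """A Digital Item; items may nest to represent compound objects."""
    descriptors: List[Descriptor] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)


@dataclass
class Container:
    """Groups items plus descriptors pertaining to the container itself."""
    descriptors: List[Descriptor] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def count_resources(item):
    """Count the datastreams in an item, including nested items."""
    return len(item.components) + sum(count_resources(sub) for sub in item.items)
```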

Mapping
[Diagram labels: DID, DIDInfo, premis:premis, premis:object, object1, object2, object3, object4, resource]
Diagram annotations:
• All rights, events, and agents go here.
• The top level object goes here.
• Other objects may be duplicated here or linked here.

Partial Implementation in DID
[Diagram labels: DID, DIDInfo, premis:premis, premis:format, premis:significantProperties, premis:creatingApplication, object1, object2, object3, object4, resource]
• When metadata are not sufficient to form the top level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined.

Example of PREMIS in MPEG DID
• PREMIS in MPEG DID: aDORe example (LANL)

Summary: container formats
• A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
• Use of a container is compatible with, and an implementation of, the OAIS information package concept
• Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
• Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
• Development of registries (informal or formal) for controlled vocabularies will benefit implementation
• Tools are being developed to facilitate implementation

Summary: METS vs. MPEG-21 DID
• METS and MPEG DID are similar types of container formats: both are expressed in XML, both represent the structure of digital objects, and both include metadata
• MPEG DID doesn’t segment metadata into sections as METS does, so that implementation decision need not be made in DID
• METS is open source and developed by open discussion, mainly in the cultural heritage community
• MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
• It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations

Implementers’ panel
• What types of objects are you preserving?
• Has your institution implemented a preservation repository?
• What preservation metadata are you recording?
• How are you recording it, e.g. database, METS/XML, other?
• Do you plan to exchange preservation metadata with other repositories?
• Are you planning to use, or already using, PREMIS?
• Which semantic units are most useful?
• Which semantic units are least useful?
• What difficulties have you had applying PREMIS units?