ITHAKA Preservation Metadata 2.0: Revising the Event Model A last-minute presentation on work currently in progress Evan Owens VP, Content Management ITHAKA.


1 ITHAKA Preservation Metadata 2.0: Revising the Event Model
A last-minute presentation on work currently in progress
Evan Owens, VP, Content Management, ITHAKA (JSTOR / Portico)
evan.owens@ithaka.org

2 Background
Portico Preservation Metadata designed & implemented in 2002-2003
–Inspired by PREMIS working group participation
–Operational before PREMIS was completed!
Portico Archive as of October 2009
–More than 14 million e-journal articles plus other content
–~150 million files
–~1 billion events
–Only 1K manual events; 99.999% system generated
–Over 1 TB of preservation metadata
Portico / JSTOR / Ithaka merger in 2009

3 PMD 2.0 Revision Project
Begun in 2008; implementation now underway
Design goals for revision to events:
–Consistent editorial/coding practices (capitalization, verb tenses, etc.)
–Clarify which event goes with which object and why
–Eliminate redundant information where possible
–Make explicit all data constraints not currently expressed in our schemas
–Synchronize event metadata with the high-level preservation metadata so that the events properly document changes in the core metadata
–Establish a clean baseline for future expansion of events metadata

4 PMD 2.0 Design Choices
Use our own data model / information architecture
–Optimized for Java, Oracle, and XML instantiations
–XML designed to reduce future versioning:
  XSD schema for frame (syntax) only
  All business rules (semantics) expressed in Schematron
–Not METS, not DIDL, not PREMIS XML
–PREMIS compliant
Optimized for size and speed
–Fully relationally normalized
–Inheritable attributes / metadata
–Events attached to objects
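
The "inheritable attributes / metadata" choice can be illustrated with a minimal sketch. The slides do not say how inheritance is implemented, so everything here is an assumption: the `ArchiveObject` class, the journal / article / file hierarchy, and the attribute names are hypothetical. The idea shown is only that a value stored once on a parent object need not be repeated on each of its children.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ArchiveObject:
    """Hypothetical archive object with an optional parent (not the real PMD 2.0 model)."""
    name: str
    parent: Optional["ArchiveObject"] = None
    attributes: Dict[str, str] = field(default_factory=dict)

    def resolve(self, key: str) -> Optional[str]:
        """Return the nearest value for key, walking up the parent chain."""
        node: Optional["ArchiveObject"] = self
        while node is not None:
            if key in node.attributes:
                return node.attributes[key]
            node = node.parent
        return None


# Assumed example hierarchy: the preservation level is stored once, on the journal.
journal = ArchiveObject("journal", attributes={"preservation_level": "full"})
article = ArchiveObject("article", parent=journal)
pdf_file = ArchiveObject("file", parent=article, attributes={"format": "PDF"})
```

Here `pdf_file.resolve("preservation_level")` finds the value on the journal, while `pdf_file.resolve("format")` finds the file's own value; at the scale of hundreds of millions of files, not repeating shared values is one plausible way to save space.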

5 Processing Record
“master” for each processing pass
Bring together information common to all the events from a given processing pass; e.g., initial ingest, future migration, etc.
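
A rough sketch of the processing-record idea follows: details common to a whole pass are stored once, and each event carries only a reference to them. The class names, field names, and sample values are hypothetical and are not taken from the PMD 2.0 schema.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ProcessingRecord:
    """Hypothetical 'master' record: facts shared by every event in one pass."""
    record_id: str
    pass_type: str    # e.g. "initial ingest" or "migration"
    started_at: str   # timestamp stored once per pass (assumed field)


@dataclass(frozen=True)
class Event:
    """Hypothetical event: holds only what is specific to this event."""
    event_type: str
    object_id: str
    outcome: str
    processing_record_id: str   # reference instead of repeating pass details


def describe(event: Event, records: Dict[str, ProcessingRecord]) -> str:
    """Join an event with its processing record to produce a full description."""
    rec = records[event.processing_record_id]
    return f"{event.event_type} on {event.object_id} during {rec.pass_type}: {event.outcome}"


# Assumed sample data: two events from the same ingest pass share one record.
records = {"pr-1": ProcessingRecord("pr-1", "initial ingest", "2009-10-01T00:00:00Z")}
events = [
    Event("Check: Virus", "file-001", "pass", "pr-1"),
    Event("Check: Fixity", "file-001", "pass", "pr-1"),
]
```

With around a billion events, keeping pass-level details in one record rather than on each event is in line with the stated goal of eliminating redundant information.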

6 Not a real event! Example XML serialization showing all possible child elements to illustrate the information model

7 Event Types
Check: Virus, Fixity, …
Characterize: File, …
Generate: Desc. MD, Tech. MD, Fixity, …
Edit: Desc. MD, …
Set: Status, Format, Preservation Level, …
Ingest: into Archive
Add, Create, Remove File
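
The verb-plus-target pattern in this list could be captured as a small controlled vocabulary. The sketch below includes only the types named on the slide; the slide's lists are open-ended ("…"), so the mapping is deliberately partial. Reading "Add, Create, Remove File" as three separate actions on a file is an assumption, and the names `EVENT_TYPES` and `is_known_event` are hypothetical.

```python
from typing import Dict, List

# Partial vocabulary: only the action/target pairs named on the slide.
EVENT_TYPES: Dict[str, List[str]] = {
    "Check": ["Virus", "Fixity"],
    "Characterize": ["File"],
    "Generate": ["Desc. MD", "Tech. MD", "Fixity"],
    "Edit": ["Desc. MD"],
    "Set": ["Status", "Format", "Preservation Level"],
    "Ingest": ["into Archive"],
    # Assumed reading of "Add, Create, Remove File": three actions on a file.
    "Add": ["File"],
    "Create": ["File"],
    "Remove": ["File"],
}


def is_known_event(action: str, target: str) -> bool:
    """Return True if the action/target pair is in the partial vocabulary."""
    return target in EVENT_TYPES.get(action, [])
```

A check such as `is_known_event("Check", "Fixity")` is the kind of constraint that the design goals describe making explicit rather than leaving to convention.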

8 Mapping PMD 2.0 to PREMIS

9 Observations
Large-scale automated events feel very different from human events
ITHAKA archive will quadruple in 2010
–Likely 3-5 billion events...
Every bit of metadata has to be justified by need
Events have proved their value
–An entire talk on that subject alone
Nothing is easy in quantities of billions
We still have to work on full lifecycle events
THIS IS STILL A WORK IN PROGRESS!

