The PREMIS Working Group: Preservation Metadata for Digital Repositories
DLF Fall Forum, October 26, 2004
Rebecca Guenther, LC/NDMSO

Oct. 26, 2004 DLF-PREMIS 2
Preservation Metadata Functions
Information that supports and documents the digital preservation process:
- Establishes provenance: tracks chain of custody and alterations over time
- Details authenticity
- Documents technical processes the object has undergone
- Describes technical details of the object
- Describes the environment from which it originated
- Specifies rights management information

Preservation Metadata Functions (cont.)
Provide information to maintain resources over the long term:
- Viability: the object's bitstream is intact
- Renderability: the object can be translated to a form that can be viewed or used
- Understandability: the rendered content can be interpreted and understood

Background
- March 2000: OCLC and RLG jointly sponsor international working group on preservation metadata
  - Identify key issues/challenges
  - Seek consensus on recommendations and best practice
- White paper (January 2001)
  - Defined preservation metadata; role in preservation process
  - Reviewed/synthesized existing preservation metadata schemes
- Preservation metadata framework (June 2002)
  - Comprehensive description of types of information constituting preservation metadata
  - Based on OAIS information model
  - Set of prototype preservation metadata elements

Aftermath …
- Framework …
  - Consolidated expertise
  - Provided foundation for developing formal preservation metadata specifications
  - Common departure point for different schema implementations
- But... further scope for collaboration in preservation metadata
  - Needed best practices/recommendations for implementing preservation metadata in real-world digital archiving systems

Issues unresolved in WG
- How minimal is a core preservation metadata element set?
- How much metadata can be generated automatically?
- Is it useful to apply metadata elements by object type or object behavior?
- Levels of granularity not addressed
- Need to provide less abstract view of preservation metadata for implementation

PREMIS
- June 2003: OCLC and RLG sponsored new working group: PREMIS
  - Preservation Metadata: Implementation Strategies
- Objectives
  - Define core set of preservation metadata elements, with supporting data dictionary, applicable to broad range of digital preservation activities
  - Identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata

Membership
- Priscilla Caplan, FCLA (Chair)
- Rebecca Guenther, LC (Chair)
- Michael Alexander, British Library
- George Barnum, GPO
- Charles Blair, U. of Chicago
- Olaf Brandt, U. of Gottingen
- Adam Farquhar, British Library
- David Gewirtz, Yale
- Kevin Glavash, MIT/DSpace
- Cathy Hartman, U. of N. Texas
- Helen Hodgart, British Library
- Nancy Hoebelheinrich, Stanford
- Roger Howard/Sally Hubbard, Getty Museum
- Pam Kircher, OCLC
- John Kunze, Calif. Digital Library
- Brian Lavoie, OCLC liaison
- Robin Dale, RLG liaison
- Vicky McCarger, LA Times
- Jerry McDonough, NYU/METS
- Evan Owens, JSTOR
- Erin Rhodes, NARA
- Madi Solomon, Walt Disney Co.
- Angela Spinazze, ATSPIN
- Stefan Strathmann, U. of Gottingen
- Gunter Waibel, RLG
- Lisa Weber, NARA
- Robin Wendler, Harvard
- Hilde van Wijngaarden, KB
- Andrew Wilson, NAA

Advisory Committee
- Howard Besser, UCLA
- Liz Bishoff, OCLC (via Colorado Digitization Program)
- Gerard Clifton, National Library of Australia
- Gail Hodge, CENDI
- Steve Knight, National Library of New Zealand
- Maggie Jones, Digital Preservation Coalition
- Nancy McGovern, Cornell
- Cliff Morgan, Wiley UK
- Richard Rinehart, U. of California, Berkeley

PREMIS Subgroups
- Core elements
  - Establish core metadata elements and data dictionary
  - Developed a data model
  - Has had 2 face-to-face meetings
  - Weekly conference calls
- Implementation
  - Examine alternative strategies for encoding, storage and management of preservation metadata
  - Conducted a survey of practices
  - Monthly conference call
- Expect to complete activities by end of 2004

Core elements subgroup
- Development of data model
  - Objects
  - Events
  - Agents
  - Intellectual entities
  - Rights
- Data dictionary structured according to entities

Core Elements
- Conducting element-by-element review of prototype elements from metadata framework
  - Is the element core?
  - How is it being used at WG members' institutions?
  - How should it be implemented/populated?
- Elements not covered by the framework?

Objects
- Identifiers
- Location
- Descriptive metadata out of scope
- Technical metadata not specific to particular file format
- Levels of objects: representation, file, filestream, bitstream

Objects: Technical metadata
- Object characteristics
- Fixity
- Size
- Format (including link to format registry)
- Inhibitors
- Significant properties
- Creating application information
- Environment (software, hardware)
- Externally defined technical metadata (e.g. Z39.87/MIX)
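
To make fixity and size concrete, here is a minimal Python sketch of how a repository might record both for a file and later use them to check viability (that the bitstream is intact). The class and function names (`ObjectCharacteristics`, `characterize`, `is_intact`) and the choice of SHA-256 are assumptions made for illustration; they are not taken from the PREMIS data dictionary.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectCharacteristics:
    """Hypothetical record holding two of the technical properties listed above."""
    size_bytes: int          # Size
    digest_algorithm: str    # Fixity: which message digest algorithm was used
    digest: str              # Fixity: the digest value, hex encoded


def characterize(data: bytes, algorithm: str = "sha256") -> ObjectCharacteristics:
    """Compute size and fixity information for a bitstream."""
    hasher = hashlib.new(algorithm)
    hasher.update(data)
    return ObjectCharacteristics(len(data), algorithm, hasher.hexdigest())


def is_intact(data: bytes, recorded: ObjectCharacteristics) -> bool:
    """Recompute the characteristics and compare them with the stored record."""
    return characterize(data, recorded.digest_algorithm) == recorded


original = b"PREMIS"
record = characterize(original)          # stored at ingest
print(is_intact(original, record))       # unchanged bitstream
print(is_intact(b"PREMIX", record))      # one byte altered
```

Recording the algorithm name next to the digest lets the check be repeated later without guessing how the stored value was produced.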

Events
- Digital provenance/process information
- Actions that involve one or more objects
- May be related to one or more agents
- Semantic units
  - Event identifier
  - Event type
  - Event outcome
  - Event detail
  - Event date/time
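
The five event semantic units can be pictured as fields of a simple record. The Python sketch below is illustrative only: the field names, the identifier strings, the example event types, and the `provenance` helper are assumptions, not definitions from the data dictionary. It shows an event linked to one or more objects and, optionally, to agents, and how an object's events can be ordered into a provenance history.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Event:
    """Hypothetical event record mirroring the semantic units listed above."""
    event_identifier: str
    event_type: str
    event_outcome: str
    event_detail: str
    event_datetime: datetime
    object_ids: List[str] = field(default_factory=list)  # one or more objects
    agent_ids: List[str] = field(default_factory=list)   # zero or more agents

    def __post_init__(self) -> None:
        # An event is an action that involves at least one object.
        if not self.object_ids:
            raise ValueError("an event must involve at least one object")


def provenance(events: List[Event], object_id: str) -> List[Event]:
    """Return the events involving an object, oldest first."""
    related = [e for e in events if object_id in e.object_ids]
    return sorted(related, key=lambda e: e.event_datetime)


events = [
    Event("evt-2", "migration", "success", "TIFF to JPEG 2000",
          datetime(2004, 6, 1, tzinfo=timezone.utc), ["obj-1"], ["agent-repo"]),
    Event("evt-1", "ingest", "success", "Initial deposit",
          datetime(2004, 1, 15, tzinfo=timezone.utc), ["obj-1"], ["agent-repo"]),
]
print([e.event_type for e in provenance(events, "obj-1")])  # ['ingest', 'migration']
```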

Agents
- Agent descriptions out of scope
- Attributes of agents associated with preservation events and rights management
  - May carry out, authorize, or compel one or more events
  - May create or act upon one or more objects
  - May hold or grant one or more rights
- Semantic units
  - Agent identifier
  - Agent name

Rights and relationships
- Rights
  - Only in context of right to preserve
  - Collecting rights use cases
- Relationships
  - Data model expresses relationships between entities
  - Relationships between objects
    - Derivative, dependency, structural
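
As a rough illustration of the three kinds of relationship between objects, the Python sketch below stores typed links and follows derivative links back to the object a version came from. Everything here is hypothetical: the names (`RelationshipType`, `Relationship`, `derivation_chain`), the object identifiers, and the convention that a derivative link points from the derived object to its source are assumptions made for the example.

```python
from enum import Enum
from typing import List, NamedTuple


class RelationshipType(Enum):
    DERIVATIVE = "derivative"   # one object was produced from another
    DEPENDENCY = "dependency"   # one object needs another in order to be used
    STRUCTURAL = "structural"   # one object is part of another


class Relationship(NamedTuple):
    source_id: str              # assumption: the derived, dependent, or part object
    target_id: str              # assumption: the object it points back to
    kind: RelationshipType


def derivation_chain(object_id: str, relationships: List[Relationship]) -> List[str]:
    """Follow derivative links from an object back to its earliest known source."""
    parent = {r.source_id: r.target_id
              for r in relationships if r.kind is RelationshipType.DERIVATIVE}
    chain = [object_id]
    while chain[-1] in parent:
        nxt = parent[chain[-1]]
        if nxt in chain:        # guard against accidental cycles in the data
            break
        chain.append(nxt)
    return chain


links = [
    Relationship("access-jpg", "master-jp2", RelationshipType.DERIVATIVE),
    Relationship("master-jp2", "original-tiff", RelationshipType.DERIVATIVE),
    Relationship("page-1", "book-1", RelationshipType.STRUCTURAL),
]
print(derivation_chain("access-jpg", links))
# ['access-jpg', 'master-jp2', 'original-tiff']
```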

Implementation Strategies subgroup
- Conducted survey of preservation repositories to explore the state of the art
- Questions about policies, governance, funding, system architecture, preservation strategies, metadata implementation
- 70 surveys sent
- Responses from 28 libraries, 7 archives, 14 other in 13 different countries
  - 10 national libraries, 6 national archives
- Survey published Oct. 2004

Survey findings
- Little experience with digital preservation
  - Most didn't have an active preservation strategy
  - Many not yet in production
  - Cannot assess adequacy of metadata
- Lack of common vocabulary and conceptual framework
  - Informed by OAIS reference model
  - Difference of opinion as to meaning of OAIS compliance

Survey findings (cont.)
- Metadata: many repositories record rights, provenance, technical, administrative, descriptive and structural metadata
- Consistent roles in preservation scope and policies (academic libraries, archives, national libraries)
- Substantial use of METS, Z39.87/MIX, OCLC sets
- Most repositories serve goals of both preservation and access

Trends
- Store metadata redundantly in XML or relational database and with content data objects
- Use METS for structural metadata and as container for descriptive and administrative; MIX for images
- Use OAIS as framework and starting point
- Maintain multiple versions (originals, some normalized or migrated) in repository with complete metadata for all versions
- Choose multiple strategies for digital preservation

Looking ahead
- Finalize core preservation metadata elements set
- Complete data dictionary
- XML schemas to support exchange of core elements for digital provenance/process and technical metadata
- Final PREMIS report by end of 2004
- Community outreach: opportunities for public comment
- Follow-on activities?

More information…
- PREMIS Web site:
- Implementing Metadata in Digital Preservation Systems: The PREMIS Activity, D-Lib (April 04)
- Rebecca Guenther:
- Priscilla Caplan: