Published by Eric Perkins. Modified over 8 years ago.
1
2/26/2004 Dan Swaney
Preservation Metadata and the OAIS Information Model
A Metadata Framework to Support the Preservation of Digital Objects
A review of the report by the OCLC/RLG Working Group on Preservation Metadata, June 2002
http://www.rlg.org/longterm/pm_framework.pdf
(Report: http://www.oclc.org/research/pmwg/)
Presented by Dan Swaney
2
The OCLC/RLG Working Group
March 2000
–Working Group was formed by
  OCLC – Online Computer Library Center, Inc.
  RLG – Research Libraries Group, Inc.
Started with a white paper entitled
–“Preservation Metadata for Digital Objects: A Review of the State of the Art”
–Introduced concepts that were followed by the development of the actual framework discussed later.
3
What is OAIS?
Open Archival Information System
–May 1999 (Original Model)
  Supported the space community
–June 2001 (Revised Model)
  Extended to support libraries/cultural heritage institutions, gov’t agencies, and the private sector
–Information Model embedded in OAIS
  Direct relevance to preservation metadata
4
OAIS Information Model: The Bottom -- From Data to Information
Information Object
–Data Object: a Digital Object OR a Physical Object
–Representation Information: describes the Data Object’s bits
  1010101001 = sound file, paragraph of text, or an image
Knowledge Base
–External to the Archival System
–Example: programmers must have the knowledge base to understand Java source
5
OAIS Information Model: Moving from the Bottom to the Top
Information Object
–Data Object: a Digital Object OR a Physical Object
–Representation Information: describes the Data Object’s bits
  1010101001 = sound file, paragraph of text, or an image
Knowledge Base
–External to the Archival System
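The point that the same bits can mean different things can be illustrated with a short Python sketch. The byte values and the three interpreter functions are made up for illustration; they stand in for Representation Information and are not taken from the report.

```python
import struct

# A Data Object: four bytes that mean nothing on their own.
data_object = bytes([0x42, 0x48, 0x00, 0x00])


# Hypothetical Representation Information: each function is one way
# of telling a system how to interpret the same bits.
def as_ascii_text(raw: bytes) -> str:
    """Treat the bytes as NUL-padded ASCII text."""
    return raw.rstrip(b"\x00").decode("ascii")


def as_big_endian_float(raw: bytes) -> float:
    """Treat the bytes as one IEEE 754 single-precision number."""
    return struct.unpack(">f", raw)[0]


def as_big_endian_uint(raw: bytes) -> int:
    """Treat the bytes as one unsigned 32-bit integer."""
    return int.from_bytes(raw, byteorder="big")


for interpret in (as_ascii_text, as_big_endian_float, as_big_endian_uint):
    print(interpret.__name__, "->", interpret(data_object))
```

Without a record of which interpretation applies, an archive holds the bits but not the information.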
6
OAIS Information Model: The Top -- From Object to Package
Information Object
Information Package
–Types: Archival (AIP), Submission (SIP), Dissemination (DIP)
–Components: Content Information, Preservation Description Information, Packaging Information, Descriptive Information
7
Three Types of Information Packages
–Submission Information Package (SIP): the Information Producer submitting an Information Object to the Archive
–Archival Information Package (AIP): the package held within the Archive
–Dissemination Information Package (DIP): the Archive responding to a Query Request
8
Inside the Information Package
Information Package
Content Information (CI)
- ‘Content’ Data Object
- Representation Info
Preservation Description Information (PDI)
- Info to manage preservation of Content Info
- Reference Info
- Provenance Info
- Context Info
- Fixity Info
Packaging Information
- Header block of info that binds together an Archival Information Package
- Binds together:
  - digital object +
  - assoc. metadata
Descriptive Information
- Metadata for Resource Discovery
- Assists finding aids
- An Abstract?
- Derived from: CI & PDI
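One way to picture how the four parts fit together is as nested records. The Python sketch below is only an illustration: the class names, field names, and sample values are assumptions and are not defined by OAIS or the working group report.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    content_data_object: bytes   # the raw bits
    representation_info: dict    # how to interpret those bits


@dataclass
class PreservationDescriptionInformation:
    reference: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    fixity: dict = field(default_factory=dict)


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging_info: dict = field(default_factory=dict)

    def descriptive_info(self) -> dict:
        """Derive discovery metadata from CI and PDI (hypothetical fields)."""
        return {
            "identifier": self.pdi.reference.get("global_id"),
            "format": self.content.representation_info.get("format"),
            "size_bytes": len(self.content.content_data_object),
        }


# Sample values are invented for the example.
package = InformationPackage(
    content=ContentInformation(b"Hello", {"format": "text/plain"}),
    pdi=PreservationDescriptionInformation(
        reference={"global_id": "urn:example:1"},
    ),
)
print(package.descriptive_info())
```

Computing the Descriptive Information rather than storing it mirrors the slide's note that it is derived from CI and PDI.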
9
Implementing Two of the Components of the OAIS Model
First: Content Information (CI)
–‘Content’ Data Object (CDO)
  Raw data bits
–Representation Info (2 components)
  Structure Info – technical desc/spec
  –Example: format, data structs, encoding
  –Makes CDO understandable by machines/systems
  Semantic Info – explains the data
  –Example: interpret as English or temperatures delimited by tabs
  –Makes CDO understandable by humans
10
Content Information (CI) Attributes
Content Information (CI) Package
–‘Content’ Data Object
–Representation Information
  ‘Content’ Data Object Description
  Environment Description
‘Content’ Data Object Description
-Details for rendering/viewing in human-readable form
-Defines Attributes:
  1. Abstract of Steps
     -Steps to restore a ZIP file back to files/folders
     -Steps to restore into a DBMS
  2. Structural Type
  3. Technical infrastructure (Web Page and all its req’d files)
  4. File Description
  5. Installation requirements
  6. Size
  7. Access Inhibitors
  8. Access Facilitators
  9. Significant Properties (whether to enable special features)
  10. Functionality (Web Page requires JavaScript)
  11. Desc of Rendered Content
  12. Quirks (Lost Features)
  13. Documentation
11
Content Information (CI) Attributes
Content Information (CI) Package
–‘Content’ Data Object
–Representation Information
  ‘Content’ Data Object Description
  Environment Description
    Software Environment: Rendering Programs, Operating System
    Hardware Environment
Rendering Programs
-Rendering is a two-step process:
  1. Transform
  2. Display/Access
-Defines Attributes:
  1. Transform Process
     + Transformer Engine
     + Params
     + Input Format
     + Output Format
     + Location
     + Documentation
  2. Display/Access App
     + Input Format
     + Output Format
     + Location
     + Documentation
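The two-step rendering process can be sketched in Python as a pair of functions. This assumes, purely for illustration, that the stored object is gzip-compressed UTF-8 text; the function names and the metadata dictionary are hypothetical and only mirror the attribute names listed on the slide.

```python
import gzip


def transform(stored: bytes) -> bytes:
    """Step 1 (Transform): turn the stored form into a displayable form."""
    return gzip.decompress(stored)


def display(transformed: bytes, encoding: str = "utf-8") -> str:
    """Step 2 (Display/Access): present the transformed bytes as text."""
    return transformed.decode(encoding)


# Hypothetical metadata describing the two steps for this object.
rendering_programs = {
    "transform_process": {
        "transformer_engine": "gzip",
        "params": {},
        "input_format": "application/gzip",
        "output_format": "text/plain",
    },
    "display_access_app": {
        "input_format": "text/plain",
        "output_format": "on-screen text",
    },
}

stored_object = gzip.compress("Chapter 1".encode("utf-8"))
print(display(transform(stored_object)))
```

Recording the input and output formats of each step makes it possible to check that the output of the transform matches what the display application expects.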
12
Content Information (CI) Attributes
Content Information (CI) Package
–‘Content’ Data Object
–Representation Information
  ‘Content’ Data Object Description
  Environment Description
    Software Environment: Rendering Programs, Operating System
    Hardware Environment
Operating System
-Defines Attributes:
  + OS Name
  + OS version
  + Location
  + Documentation
Lacks/Needs:
- Recommended Env. or
- Minimum Env.
- It’s easier to define the environment in terms of recommended or minimum.
13
Content Information (CI) Attributes
Content Information (CI) Package
–‘Content’ Data Object
–Representation Information
  ‘Content’ Data Object Description
  Environment Description
    Software Environment
    Hardware Environment: Computational Resources, Storage, Peripherals
Hardware Environment
-Defines Attributes:
  1. Computational Resources
     + Microprocessor Required (e.g. Pentium 4 1 GHz)
     + Memory Required
     + Documentation
     + Location (URL)
  2. Storage
     + Storage Information (req’s 10 GB disk space)
     + Documentation
     + Location (URL)
  3. Peripherals
     + Peripheral Requirements (Sound card, Monitor Resolution)
     + Documentation
     + Location (URL)
14
Content Information (CI) Attributes
Content Information (CI) Package
–‘Content’ Data Object
–Representation Information
  ‘Content’ Data Object Description
  Environment Description
    Software Environment
    Hardware Environment: Computational Resources, Storage, Peripherals
Hardware Environment
-Defines Attributes:
  4. Hardware Environment as a Whole
     + Location (e.g. the machine is in a ‘technology museum’ or available through an emulation program like VMware)
15
Implementing Two of the Components of the OAIS Model
Second: Preservation Description Information (PDI)
–Focuses on the information to track a history of the ‘Content’ Data Object
  How it was added/scanned into digital form
  Who did it
  Who took care of it at some point in time
  Like a library index card in the back of a book tracking who checked it out
16
PDI’s Four Categories
Preservation Description Information (PDI)
–Reference Info
–Context Info
–Provenance Info
–Fixity Info
Reference Info
Describes mechanisms for assigning an ID to represent the Data Object both:
-Locally (within the archive) (and)
-Globally (referenced by an external system)
Defines Attributes:
1. Archival System ID
   + Value
   + Constr. Method
   + Resp. Agency
2. Global ID (ISBN, URL)
   + Value
   + Constr. Method
   + Resp. Agency
3. Resource Description
   + Existing Metadata (MARC bibl. record)
   + Existing Records (bibliographic record in WorldCat)
17
PDI: 3 Types of Reference Info
Preservation Description Information (PDI)
–Reference Information: Archival System Identification, Global Identification, Resource Description
–Context Information
–Provenance Information
–Fixity Information
Defines Attributes:
1. Archival System ID
   + Value
   + Constr. Method
   + Resp. Agency
2. Global ID (ISBN, URL)
   + Value
   + Constr. Method
   + Resp. Agency
3. Resource Description
   + Existing Metadata (MARC bibl. record)
   + Existing Records (bibliographic record in WorldCat)
18
PDI: Types of Context Information
Preservation Description Information (PDI)
–Reference Information
–Context Information: Reason for Creation, Relationships (Manifestation, Intellectual Content)
–Provenance Information
–Fixity Information
Defines Attributes:
1. Reason for Creation (TIFF file created to save a rare book)
2. Relationships (Part of a Collection) (Chapters in a Book)
   + Manifestation (Change History, Recording outcome of a migration)
     + Relationship Type (Translated to HTML)
     + Identification (ID/Link to Description of Object)
   + Intellectual Content (Relates a chapter to a book)
     + Relationship Type (Web Page, Collection)
     + Identification (ID/Link to Description of ‘related’ object)
19
PDI: Types of Provenance Information
Preservation Description Information (PDI)
–Reference Information
–Context Information
–Provenance Information
–Fixity Information
There are 5 Event Types defined as Attributes:
1. Origin (Event)
   - Describes the process by which the object was created.
2. Pre-Ingest (Event)
   - Chain of Custody or Audit Trail.
   - Tracks history of the content before it was digitized or added to the archive.
3. Ingest (Event)
   - Tracks how the object was added to the archive
4. Archival Retention
   - Tracks migration history of what happened since its original ingest/add into the archive. If transformed, records what was lost.
5. Rights Management (Event)
   - Access Permissions
   - Legal Deposit Responsibilities (if sensitive)
*. Event
   + Designation
     - Change in Custody
     - Migration
   + Procedure
   + Date
   + Resp. Agency
   + Outcome
   + Note
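The Event attributes lend themselves to a simple record type. The Python sketch below is a minimal illustration, assuming an audit trail is just a list of such records; the class, the helper function, and every sample value are hypothetical.

```python
import datetime
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    designation: str            # e.g. "Change in Custody", "Migration"
    procedure: str
    date: datetime.date
    responsible_agency: str
    outcome: str
    note: str = ""


def audit_trail(events: list) -> list:
    """Return event designations in chronological order."""
    return [e.designation for e in sorted(events, key=lambda e: e.date)]


# Invented sample history for one archived object.
history = [
    Event("Migration", "Convert TIFF to PNG", datetime.date(2003, 5, 1),
          "Example Archive", "Success", "Embedded colour profile dropped"),
    Event("Change in Custody", "Transfer from donor", datetime.date(2001, 3, 9),
          "Example Library", "Received"),
]
print(audit_trail(history))
```

Keeping the records immutable (`frozen=True`) reflects the idea that a provenance history is appended to, not edited.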
20
PDI: Types of Fixity Information
Preservation Description Information (PDI)
–Reference Information
–Context Information
–Provenance Information
–Fixity Information
Goal: To not have something altered and not know when, how, or why.
Defined Attributes:
1. Object Authentication
   - Digital Signature
   - Watermark
   - Checksum
   + Auth Type (Signed using 160-bit one-way SHA-1 hash)
   + Auth Procedure (Pointer to software capable of generating a new SHA-1 hash for comparison)
   + Auth Date (Last time this procedure was run)
   + Auth Result (Latest result of running this procedure).
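A checksum-based authentication check can be sketched with Python's standard `hashlib` module. The function names and the shape of the returned record are assumptions chosen to mirror the four Auth attributes; they are not prescribed by the framework.

```python
import hashlib
from datetime import datetime, timezone


def sha1_digest(content: bytes) -> str:
    """Return the SHA-1 digest of the content as a hex string."""
    return hashlib.sha1(content).hexdigest()


def authenticate(content: bytes, stored_digest: str) -> dict:
    """Recompute the digest and compare it with the one recorded earlier."""
    matches = sha1_digest(content) == stored_digest
    return {
        "auth_type": "SHA-1 checksum",
        "auth_procedure": "sha1_digest",  # pointer to the routine used
        "auth_date": datetime.now(timezone.utc).isoformat(),
        "auth_result": "unchanged" if matches else "altered",
    }


original = b"scanned page 1"
stored_digest = sha1_digest(original)  # recorded when the object was ingested

print(authenticate(original, stored_digest)["auth_result"])           # unchanged
print(authenticate(b"scanned page 2", stored_digest)["auth_result"])  # altered
```

SHA-1 is used here only because the slide names it; an archive would choose the hash function according to its own policy.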
21
Review of the PDI
Content Information Package
Preservation Description Information (PDI)
- Info to manage preservation of Content Info
-Reference Info
  -Identifiers both internal and external to the archive (e.g. ISBN, URN)
-Provenance Info
  -Documents history of the CI (simulates a library checkout card that shows who checked out the book)
-Context Info
  -Relates CI to why it was created, relations to other objects
-Fixity Info
  -Data Integrity (Checksum, Hash, Signature)
  -History of Changes
  -Keeps content from being altered without knowing when or why
22
Inside the Information Package
Information Package
Content Information (CI)
- ‘Content’ Data Object
- Representation Info
Preservation Description Information (PDI)
- Info to manage preservation of Content Info
- Reference Info
- Provenance Info
- Context Info
- Fixity Info
Packaging Information
- Header block of info that binds together an Archival Information Package
- Binds together:
  - digital object +
  - assoc. metadata
Descriptive Information
- Metadata for Resource Discovery
- Assists finding aids
- An Abstract?
- Derived from: CI & PDI
23
Conclusion
The working group extended the OAIS Information Model to define a framework of metadata elements that implement the concept.
It focused on only the 2 areas critical to preserving a Data Object: Content Information (CI) and Preservation Description Information (PDI).
24
What’s Next to Do?
Develop ‘best practices’ toward populating a database archive.
–Assess degree of technical richness
–Develop automated algorithms
–Determine scope of sharing
Later move from ‘best practices’ to a formalized standard of processes.