The OAIS Reference Model and Trustworthy Repositories
  Josh Lubell
  Manufacturing Engineering Laboratory, NIST
The problem
  Too much digital data!
    –It takes about 15 minutes for the world to churn out new digital information equivalent to the entire collection of the US Library of Congress
  Proprietary file formats
    –Expected lifetime of a typical manufacturing software application: only 3 years
  Short-lived computing hardware and software
    –Expected lifetime of today’s storage/retrieval technologies: only 10 years
  Products often outlive computer software/hardware by an order of magnitude
    –Aircraft can last 50 years or more
    –Healthcare records should be preserved through the patient’s lifetime, and perhaps beyond
It’s not just about preservation
  How will the repository be accessed in the future?
    –Reference, reuse, rationale?
    –Should drive present-day records management policies
  Is the repository trustworthy?
    –Organizational infrastructure
    –Digital object management
    –Technical infrastructure, security
What is OAIS?
  International standard: ISO 14721:2003
    –Online at
  Reference model for an Open Archival Information System
  Domain-general
  Implementation-agnostic
  Widely used
    –Digital libraries
    –Scientific data repositories
    –Product data engineering repositories
OAIS functional entities
  SIP = Submission Information Package
  AIP = Archival Information Package
  DIP = Dissemination Information Package
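The package flow can be illustrated with a minimal Python sketch. It assumes the usual OAIS reading in which Ingest turns a producer's SIP into an AIP and Access derives a DIP for a consumer. The class fields, the `ingest` and `access` helpers, and the sample values are hypothetical illustrations, not part of the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package: what a producer hands to the archive."""
    producer: str
    payload: bytes


@dataclass(frozen=True)
class AIP:
    """Archival Information Package: what the archive preserves."""
    identifier: str
    producer: str
    payload: bytes


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package: what a consumer receives."""
    identifier: str
    payload: bytes


def ingest(sip: SIP, identifier: str) -> AIP:
    # Hypothetical ingest step: assign an archive identifier, keep the producer name.
    return AIP(identifier=identifier, producer=sip.producer, payload=sip.payload)


def access(aip: AIP) -> DIP:
    # Hypothetical access step: hand out the content under its archive identifier.
    return DIP(identifier=aip.identifier, payload=aip.payload)


# Made-up sample values, for illustration only.
sip = SIP(producer="design-office", payload=b"part geometry")
aip = ingest(sip, identifier="aip-0001")
dip = access(aip)
```

A real repository would do far more at each step (validation, metadata generation, format migration); the sketch only shows which package type each step consumes and produces.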
Information = Data + Interpretation
  Data Object + Representation Information (metadata) = Information Object
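A small Python sketch of why the data object alone is not information: the same four bytes yield different values depending on the representation information attached to them. The `RepresentationInfo` and `InformationObject` classes are hypothetical names chosen for this illustration, and using a `struct` format string as representation information is an assumption made to keep the example short.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class RepresentationInfo:
    struct_format: str  # how to decode the bytes (Python struct format string)
    description: str    # human-readable meaning of the encoding


@dataclass(frozen=True)
class InformationObject:
    data_object: bytes
    representation_info: RepresentationInfo

    def interpret(self):
        # Decode the raw bytes according to the attached representation information.
        (value,) = struct.unpack(
            self.representation_info.struct_format, self.data_object
        )
        return value


raw = bytes([0x00, 0x00, 0x80, 0x3F])

as_float = InformationObject(
    raw, RepresentationInfo("<f", "little-endian IEEE 754 32-bit float")
)
as_int = InformationObject(
    raw, RepresentationInfo("<I", "little-endian unsigned 32-bit integer")
)

print(as_float.interpret())  # 1.0
print(as_int.interpret())    # 1065353216
```

If the representation information is lost, a future consumer cannot tell which of the two readings was intended.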
An information package
  Information Objects
    –Content Information
    –Preservation Description Information
  Sub-categories of Preservation Description Information
    –Reference
    –Provenance
    –Context
    –Fixity
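The package structure can be sketched as a plain data structure. This is a minimal illustration, assuming simple string and list fields; the class names, field types, sample values, and the `missing_pdi_categories` helper are hypothetical and not prescribed by OAIS.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentInformation:
    data_object: bytes
    representation_info: str  # e.g. a format name or a pointer to a format spec


@dataclass
class PreservationDescriptionInformation:
    reference: str = ""
    provenance: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    fixity: Dict[str, str] = field(default_factory=dict)


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation

    def missing_pdi_categories(self) -> List[str]:
        # Report which of the four PDI sub-categories are still empty.
        categories = {
            "reference": self.pdi.reference,
            "provenance": self.pdi.provenance,
            "context": self.pdi.context,
            "fixity": self.pdi.fixity,
        }
        return [name for name, value in categories.items() if not value]


# Made-up sample values, for illustration only.
package = InformationPackage(
    content=ContentInformation(b"part geometry", "hypothetical CAD exchange format"),
    pdi=PreservationDescriptionInformation(
        reference="urn:example:part-0001",
        provenance=["created by design office"],
    ),
)

print(package.missing_pdi_categories())  # ['context', 'fixity']
```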
Preservation Description Information (PDI)
  Reference info identifies the content information using a unique ID or a bibliographic attribute
  Provenance info specifies history, including chain of custody
    –Guides consumers in judging trustworthiness
  Context info relates the content to other information outside the package
  Fixity info helps ensure authenticity using methods such as checksums or digital signatures
  PDI is the part of the OAIS reference model most closely pertaining to identity management
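As an illustration of fixity information, here is a minimal checksum sketch using Python's standard `hashlib` module. SHA-256 is an assumption made for the example; OAIS does not mandate a particular algorithm. The function names `compute_fixity` and `verify_fixity` and the sample content are hypothetical.

```python
import hashlib
import hmac


def compute_fixity(data: bytes) -> str:
    # Record a SHA-256 digest of the content at ingest time.
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    # Recompute the digest later and compare it with the recorded value.
    return hmac.compare_digest(compute_fixity(data), recorded_digest)


original = b"part geometry, revision A"
recorded = compute_fixity(original)

unchanged_ok = verify_fixity(original, recorded)
altered_ok = verify_fixity(b"part geometry, revision B", recorded)

print(unchanged_ok)  # True
print(altered_ok)    # False
```

A checksum detects accidental or unauthorized change but does not by itself show who produced the content; digital signatures, which the slide also mentions, address that separately.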
So what’s next?
Develop metrics for OAIS
  Goal is to measure how well a repository conforms to the OAIS reference model
  ISO working group addressing this
    –Starting point: Trustworthy Repositories Audit & Certification: Criteria and Checklist, Center for Research Libraries, URL:
  NIST MEL developing metrics specific to long-term management of product-related engineering data
    –Sustaining Engineering Informatics: Toward Methods and Metrics for Digital Curation, 3rd International Digital Curation Conference, December 2007, URL:
Tailor OAIS to specific domains
  Emphasis
    –Producer/consumer interfaces
    –Metadata
  Functional entities
    –Ingest
    –Access
  Packaging
    –PDI
    –Packaging standards
      METS (Metadata Encoding and Transmission Standard)
      PREMIS (Preservation Metadata: Implementation Strategies) schemas