Don Sawyer NASA/National Space Science Data Center (NSSDC) Lou Reich

1 ISO “Reference Model For an Open Archival Information System (OAIS)” Tutorial Presentation
Don Sawyer, NASA/National Space Science Data Center (NSSDC)
Lou Reich, Computer Sciences Corporation (CSC)
Library of Congress, June 13, 2003

2 Outline
History
Reference Model overview
Some Applications
Follow-on Activities
  Producer-Archive Ingest Methodology Abstract Standard
  Standard Submission Information Package
  Archive Certification

3 NASA Role
National Space Science Data Center
  NASA’s first digital archive
  Experienced many technology changes since 1966
Consultative Committee for Space Data Systems
  International group of space agencies
  Developed a variety of science-discipline-independent standards
  Became working body for ISO TC 20/SC 13 about 1990
    TC 20: Aircraft and Space Vehicles
    SC 13: Space Data and Information Transfer Systems

4 Initial Archive Standards Proposal
ISO suggested that SC 13 should develop archive standards
  Address data used in conjunction with space missions
  Address intermediate and indefinite long term storage of digital data

5 Response
Response to Consultative Committee for Space Data Systems (CCSDS) and ISO TC 20/SC 13
  No framework widely recognized for developing specific digital archive standards
  Begin by developing a ‘Reference Model’ to establish common terms and concepts
  Ensure broad participation, including traditional archives (not restricted to space communities; all participation is welcome!)
  Focus on data in electronic forms, but recognize that other forms exist in most archives
  Follow up with additional archive standards efforts as appropriate

6 What is a Reference Model?
A framework for understanding significant relationships among the entities of some environment, and for the development of consistent standards or specifications supporting that environment.
A reference model
  is based on a small number of unifying concepts
  is an abstraction of the key concepts, their relationships, and their interfaces both to each other and to the external environment
  may be used as a basis for education and explaining standards to a non-specialist

7 Organizational Approach
Organize US contribution under a framework with NASA lead
  Established liaison with Federal Geographic Data Committee (FGDC) and National Archives and Records Administration (NARA)
  Agency archives and users must be represented in this process
An “Open” process
  Important to stimulate dialogue with broad archive/user communities
  Results of US and international workshops put on the Web
  Support comments/critiques
  Broad international workshops also held in the UK and France
  Issue resolution at ISO/Consultative Committee for Space Data Systems international workshops
Broad US participation is viewed as essential
  Cannot be viewed as a “NASA show”
  NISO is credible with private and academic communities
  FGDC has a federal role by executive order

8 Technical Approach
Investigate other Reference Models
  ISO “Seven Layer” Communications Reference Model
  ISO Reference Model for Open Distributed Processing
  ISO TC211 Reference Model for Geomatics
Define what is meant by ‘archiving of data’
Break ‘archiving’ into a few functional areas (e.g., ingest, storage, access, and preservation planning)
Define a set of interfaces between the functional areas
Define a set of data classes for use in archiving
Choose formal specification techniques
  Data flow diagrams for functional models and interfaces
  Unified Modeling Language (UML) for data classes

9 Results
Reference Model targeted to several categories of reader
  Archive designers
  Archive users
  Archive managers, to clarify digital preservation issues and assist in securing appropriate resources
  Standards developers
Adopted terminology that crosses various disciplines
  Traditional archivists
  Scientific data centers
  Digital libraries

10 Reference Model Status
Already widely adopted as starting point in digital preservation efforts
  Digital libraries (e.g., Netherlands National Library)
  Traditional archives (e.g., US National Archives)
  Scientific data centers (e.g., National Space Science Data Center)
  Commercial organizations (e.g., Aerospace Industries Association preservation working team)
Published as final CCSDS standard (Blue Book) available from:
Recently published as a final ISO standard: ISO 14721:2003

11 Reference Model for an Open Archival Information System Technical Overview

12 Open Archival Information System (OAIS)
Open
  Reference Model standard(s) are developed using a public process and are freely available
Information
  Any type of knowledge that can be exchanged
  Independent of the forms (i.e., physical or digital) used to represent the information
  Data are the representation forms of information
Archival Information System
  Hardware, software, and people who are responsible for the acquisition, preservation and dissemination of the information
This reference model is called the Reference Model for an OAIS (Open Archival Information System). These terms have to be explained: a glossary was defined while the archiving reference model was being developed, and it is crucial for a common understanding of archival concepts.

13 Document Organization
Introduction
  Purpose and Scope, Applicability, Rationale, Road Map for Future Work, Document Structure, and Definitions of Terms
OAIS Concepts and Responsibilities
  High level view of OAIS functionality and information models
  OAIS external environment
  Minimum responsibilities to become an “OAIS”
Detailed Models
  Functional model descriptions and information model perspectives
Preservation perspectives
  Media migration, compression, format conversions, and access service preservation
Archive Interoperability
  Criteria to distinguish types of cooperation among archives
Annexes
  Scenarios of existing archives, compatibility with other standards
The current version of the Reference Model document is organised as follows. The parts in blue will not be detailed today.

14 Purpose, Scope, and Applicability
Framework for understanding and applying concepts needed for long-term digital information preservation
Long-term is long enough to be concerned about changing technologies
Starting point for model addressing non-digital information
Provides set of minimal responsibilities to distinguish an OAIS from other uses of ‘archive’
Framework for comparing architectures and operations of existing and future archives
Basis for development of additional related standards
Addresses a full range of archival functions
Applicable to all long-term archives and those organizations and individuals dealing with information that may need long-term preservation
Does NOT specify an implementation

15 Model View of an OAIS Environment
[Diagram: the OAIS (archive) surrounded by Producer, Management, and Consumer]
Producer is the role played by those persons, or client systems, who provide the information to be preserved
Management is the role played by those who set overall OAIS policy as one component in a broader policy domain
Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest
The environment surrounding the OAIS is given by this simple model: Producers, Management, and Consumers are all outside the OAIS.

16 OAIS Responsibilities
Negotiates and accepts Information from information producers
Obtains sufficient control to ensure long-term preservation
Determines which communities (designated) need to be able to understand the preserved information
Ensures the information to be preserved is independently understandable to the Designated Communities
Follows documented policies and procedures which ensure the information is preserved against all reasonable contingencies
Makes the preserved information available to the Designated Communities in forms understandable to those communities

17 OAIS Information Definition
Information is always expressed (i.e., represented) by some type of data
Data interpreted using its Representation Information yields Information
Information Object preservation requires clear identification and understanding of the Data Object and its associated Representation Information
[Diagram: Data Object, interpreted using its Representation Information, yields Information Object]
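
The relationship "Data Object interpreted using its Representation Information yields Information" can be sketched in a few lines of Python. This is only an illustration, not something the Reference Model specifies: the class names, the use of a `struct` format string as Representation Information, and the field names are all assumptions made for the example.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DataObject:
    """A digital Data Object: nothing more than a sequence of bits."""
    bits: bytes


@dataclass(frozen=True)
class RepresentationInformation:
    """Hypothetical, minimal Representation Information."""
    struct_format: str            # structure: how the bits map to data types
    field_names: Tuple[str, ...]  # semantics: what each value means


def interpret(data: DataObject, rep: RepresentationInformation) -> Dict[str, object]:
    """Interpret the Data Object using its Representation Information."""
    values = struct.unpack(rep.struct_format, data.bits)
    return dict(zip(rep.field_names, values))


# Without `rep`, `data` is only six opaque bytes.
data = DataObject(struct.pack(">hf", 42, 21.5))
rep = RepresentationInformation(">hf", ("sample_id", "temperature_celsius"))
info = interpret(data, rep)
```

Preserving only `data` would keep the bits but lose the information, which is why the model asks for both to be identified and kept together.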

18 Information Package Definition
An Information Package is a conceptual container holding two types of information
  Content Information
  Preservation Description Information (PDI)
[Diagram: Information Package containing Content Information and Preservation Description Information]

19 Information Package Variants
Submission Information Package
  Negotiated between Producer and OAIS
  Sent to OAIS by a Producer
Archival Information Package
  Information Package used for preservation
  Includes complete set of Preservation Description Information (PDI) for the Content Information
Dissemination Information Package
  Includes part or all of one or more Archival Information Packages
  Sent to a Consumer by the OAIS
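
As a rough sketch of how the three variants differ, the Python below models each as a small class. The Reference Model does not prescribe any implementation, so every name here is an assumption for illustration; the only rules taken from the slides are that an AIP carries the complete set of PDI (using the four PDI categories the model names: Provenance, Context, Reference, Fixity) and that a DIP is built from one or more AIPs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The four PDI categories named by the OAIS model.
PDI_CATEGORIES = ("provenance", "context", "reference", "fixity")


@dataclass
class SubmissionInformationPackage:
    """Sent to the OAIS by a Producer; its PDI may still be incomplete."""
    content: bytes
    pdi: Dict[str, str] = field(default_factory=dict)


@dataclass
class ArchivalInformationPackage:
    """Used for preservation; must carry the complete set of PDI."""
    content: bytes
    pdi: Dict[str, str]

    def __post_init__(self) -> None:
        missing = [c for c in PDI_CATEGORIES if c not in self.pdi]
        if missing:
            raise ValueError(f"AIP is missing PDI: {missing}")


@dataclass
class DisseminationInformationPackage:
    """Sent to a Consumer; holds part or all of one or more AIPs."""
    contents: List[bytes]


def to_aip(sip: SubmissionInformationPackage,
           archive_pdi: Dict[str, str]) -> ArchivalInformationPackage:
    """Hypothetical step: merge Producer PDI with PDI added by the archive."""
    return ArchivalInformationPackage(sip.content, {**sip.pdi, **archive_pdi})


def to_dip(aips: List[ArchivalInformationPackage]) -> DisseminationInformationPackage:
    """Hypothetical step: package the content of the requested AIPs."""
    return DisseminationInformationPackage([aip.content for aip in aips])


sip = SubmissionInformationPackage(b"orbit data", {"provenance": "Producer X"})
aip = to_aip(sip, {"context": "mission archive", "reference": "id-001",
                   "fixity": "checksum recorded at ingest"})
dip = to_dip([aip])
```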

20 External Data Flow View
[Diagram: the Producer sends Submission Information Packages to the OAIS; the OAIS holds Archival Information Packages; the Consumer sends queries and orders to the OAIS and receives result sets and Dissemination Information Packages]

21 Detailed Models Overview

22 Overview of Detailed Models
It was decided to do both a functional and an information model of the OAIS
Both models were tasked to:
  Use the models to better communicate OAIS concepts
  Use a well established, formal modeling technique
  Stay as implementation independent as possible
  Avoid detailed designs

23 Detailed Models Information Model

24 General Principles
Define classes of “information objects” that illustrate information necessary to enable long-term storage and access to archives
The class definitions should be implementation independent
Use a subset of Unified Modeling Language (UML)

25 UML Notation

26 Information Object
[UML diagram: an Information Object contains a Data Object, interpreted using one or more (1+) Representation Information objects; a Data Object is either a Physical Object or a Digital Object; a Digital Object contains one or more (1+) Bit Sequences]

27 Representation Information
The Representation Information accompanying a physical object, like a moon rock, may give additional meaning
  It typically is a result of some analysis of the physically observable attributes of the rock
The Representation Information accompanying a digital object, or sequence of bits, is used to provide additional meaning
  It typically maps the bits into commonly recognized data types such as character, integer, and real and into groups of these data types
  It associates these with higher level meanings which can have complex inter-relationships that are also described

28 Recursive Nature of Representation Information
[UML diagram: Representation Information consists of Structure Information and Semantic Information, where Semantic Information adds meaning to Structure Information; Representation Information is itself interpreted using zero or more (*) Other Representation Information]
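
Because Representation Information may itself need other Representation Information to be understood, an archive ends up with a network to walk. The Python below is a minimal sketch of that walk under stated assumptions: the dependency table, the item names, and the idea that recursion stops at whatever the intended readers already understand are all illustrative choices, not text from the Reference Model.

```python
from typing import Dict, Iterable, List


def required_representation_info(start: str,
                                 depends_on: Dict[str, List[str]],
                                 already_understood: Iterable[str]) -> List[str]:
    """Collect every piece of Representation Information needed to
    interpret `start`, following 'interpreted using' links recursively.

    Items in `already_understood` end the recursion. The `seen` set also
    guards against cycles in the network.
    """
    needed: List[str] = []
    seen = set(already_understood)
    stack = [start]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        needed.append(item)
        stack.extend(depends_on.get(item, []))
    return needed


# Hypothetical network: a format description that relies on two other documents.
depends_on = {
    "format description": ["data dictionary", "PDF specification"],
    "data dictionary": ["English"],
    "PDF specification": ["English"],
}
needed = required_representation_info(
    "format description", depends_on, already_understood={"English"})
```

In this example three items must be preserved alongside the data, and "English" is left out because the readers are assumed to know it already.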

29 Types of Information Used in OAIS
[Diagram: Information Object specialized into Content Information, Preservation Description Information, Packaging Information, Descriptive Information, ...]

30 Content Information
The information which is the primary object of preservation
An instance of Content Information is the information that an archive is tasked to preserve
Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer
The Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., a physical sample, microfilm)

31 Preservation Description Information
Provenance Information
  Describes the source of Content Information, who has had custody of it, what is its history
Context Information
  Describes how the Content Information relates to other information outside the Information Package
Reference Information
  Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified
Fixity Information
  Protects the Content Information from undocumented alteration
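
Fixity Information is the PDI category that maps most directly onto code. The sketch below assumes a SHA-256 digest is used as the fixity value, which is one common choice and not something the Reference Model mandates; the class, function, and field names are likewise hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescriptionInformation:
    provenance: str     # source, custody, and history of the content
    context: str        # relation to information outside the package
    reference: str      # identifier for the content
    fixity_sha256: str  # digest used to detect undocumented alteration


def describe(content: bytes, provenance: str, context: str,
             reference: str) -> PreservationDescriptionInformation:
    """Build PDI for `content`, computing the fixity digest at that moment."""
    digest = hashlib.sha256(content).hexdigest()
    return PreservationDescriptionInformation(provenance, context, reference, digest)


def fixity_ok(content: bytes, pdi: PreservationDescriptionInformation) -> bool:
    """Return True if `content` still matches the recorded digest."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity_sha256


original = b"plasma density measurements"
pdi = describe(original, provenance="Instrument team",
               context="Part of a mission data set", reference="example-id-001")
unchanged = fixity_ok(original, pdi)
altered = fixity_ok(original + b"!", pdi)
```

A digest only detects alteration; deciding whether a change was documented is left to Provenance Information and archive policy.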

32 PDI Examples

33 Descriptive Information
Contains the data that serves as the input to documents or applications called Access Aids
Access Aids can be used by a consumer to locate, analyze, retrieve, or order information from the OAIS

34 Packaging Information
Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media
Examples of Packaging Information include tape marks, directory structures and filenames

35 OAIS Archival Information Package
[Diagram: the Archival Information Package (AIP) is delimited by Packaging Information, and a Package Description is derived from it; within the AIP, Content Information is further described by Preservation Description Information (PDI)]
Package Description: e.g., information supporting customer searches for AIP
Packaging Information: e.g., how to find Content Information and PDI on some medium
Content Information: e.g.,
  Hardcopy document
  Document as an electronic file together with its format description
  Scientific data set consisting of image file, text file, and format descriptions file describing the other files
Preservation Description Information (PDI): e.g., how the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured

36 AIP Types
Archival Information Unit (AIU) contains a single Data Object as the Content Object
Archival Information Collection (AIC) contains multiple AIPs in its Content Object
  Each member of an AIC is an AIP containing Content Information and PDI
  The AIC contains unique PDI on the collection process
[Diagram: Archival Information Package specialized into Archival Information Unit and Archival Information Collection]
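
Since each member of an AIC is itself an AIP, and an AIP is either an AIU or an AIC, collections can nest. A small composite structure in Python shows the idea; the class and function names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ArchivalInformationUnit:
    """An AIP whose Content Information holds a single Data Object."""
    content: bytes
    pdi: Dict[str, str]


@dataclass
class ArchivalInformationCollection:
    """An AIP whose Content Information is a set of other AIPs."""
    members: List[Union["ArchivalInformationUnit", "ArchivalInformationCollection"]]
    pdi: Dict[str, str]  # PDI about the collection process itself


def count_units(aip: Union[ArchivalInformationUnit,
                           ArchivalInformationCollection]) -> int:
    """Count the AIUs reachable from an AIP, descending into nested AICs."""
    if isinstance(aip, ArchivalInformationUnit):
        return 1
    return sum(count_units(member) for member in aip.members)


image = ArchivalInformationUnit(b"image bits", {"provenance": "camera A"})
notes = ArchivalInformationUnit(b"text bits", {"provenance": "operator log"})
calibration = ArchivalInformationCollection(
    [ArchivalInformationUnit(b"cal bits", {"provenance": "lab"})],
    {"provenance": "collected per calibration run"})
mission = ArchivalInformationCollection(
    [image, notes, calibration],
    {"provenance": "collected per mission"})
total_units = count_units(mission)
```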

37 Package Descriptions and Access Aids
Package Descriptions are needed by an OAIS to provide visibility and access to the OAIS holdings
Package Descriptions contain 1 or more Associated Descriptions which describe the AIP Content Information from the point of view of a single Access Aid
Some examples of Access Aids include:
  Finding Aids - assist the consumer in locating information of interest
  Ordering Aids - allow the consumer to discover the cost of and order AIUs of interest
  Retrieval Aids - enable authorized users to retrieve the AIU described by the Unit Descriptor from Archival Storage

38 Information Model Summary
Presented a model of information objects as containing data objects and representation objects
Classified information required for long-term archiving into 4 classes: Content Information, PDI, Packaging Information and Descriptive Information
Described how these classes would be aggregated and related in an AIP to fully describe an instance of Content Information
Presented information needed for Access, in addition to that needed for long-term preservation
Put the Access oriented structures in the context of the other data needed to operate an OAIS

39 Detailed Models Functional View

40 General Principles
Highlight the major functional areas important to digital archiving
Use functional decomposition to clarify the range of functionality that might be encountered
Don't decompose beyond two levels to avoid becoming too implementation dependent
Provide a useful set of terms and concepts
Do not imply that all archives need to implement all the sub-functions
Identify some common services which are likely to be needed, and are assumed to be available, as underlying support

41 Common Services
Modern, distributed computing applications assume a number of supporting services
Examples of Common Services include:
  inter-process communication
  name services
  temporary storage allocation
  exception handling
  security
  file and directory services

42 Open Archival Information System: Six Functional Entities
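[Diagram: the six functional entities of the OAIS. The Producer submits SIPs to Ingest; Ingest passes AIPs to Archival Storage and Descriptive Info. to Data Management; the Consumer sends queries and orders to Access and receives result sets and DIPs; Preservation Planning and Administration complete the archive, with Management outside it]
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package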

43 Functional Entities In An OAIS
Ingest: This entity provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive
Archival Storage: This entity provides the services and functions for the storage, maintenance and retrieval of Archival Information Packages
Data Management: This entity provides the services and functions for populating, maintaining, and accessing both descriptive information which identifies and documents archive holdings and internal archive administrative data
Administration: This entity manages the overall operation of the archive system
Preservation Planning: This entity monitors the environment of the OAIS and provides recommendations to ensure that the information stored in the OAIS remains accessible to the Designated User Community over the long term even if the original computing environment becomes obsolete
Access: This entity supports consumers in determining the existence, description, location and availability of information stored in the OAIS and allowing consumers to request and receive information products
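
To make the division of labour concrete, here is a toy Python archive in which Ingest, Archival Storage, Data Management, and Access each have a visible place. It is a deliberately small sketch: the class, the dictionary layouts, and the identifier scheme are invented for the example, and Administration and Preservation Planning are not modelled at all.

```python
from typing import Dict, List


class ToyArchive:
    """Illustrative only; not an implementation defined by the Reference Model."""

    def __init__(self) -> None:
        self._archival_storage: Dict[str, dict] = {}  # AIP id -> AIP
        self._data_management: Dict[str, str] = {}    # AIP id -> descriptive info
        self._next_id = 1

    def ingest(self, sip: dict) -> str:
        """Ingest: accept a SIP, build an AIP, hand parts to the other entities."""
        aip_id = f"aip-{self._next_id}"
        self._next_id += 1
        aip = {"id": aip_id,
               "content": sip["content"],
               "pdi": {"provenance": sip["producer"]}}
        self._archival_storage[aip_id] = aip               # to Archival Storage
        self._data_management[aip_id] = sip["description"]  # to Data Management
        return aip_id

    def query(self, keyword: str) -> List[str]:
        """Access: answer a Consumer query from Data Management (a result set)."""
        return sorted(aip_id
                      for aip_id, description in self._data_management.items()
                      if keyword.lower() in description.lower())

    def order(self, aip_id: str) -> dict:
        """Access: fetch the AIP from Archival Storage and return a DIP."""
        aip = self._archival_storage[aip_id]
        return {"source_aip": aip_id, "content": aip["content"]}


archive = ToyArchive()
first = archive.ingest({"producer": "Instrument team",
                        "description": "Plasma physics measurements",
                        "content": b"raw samples"})
archive.ingest({"producer": "Survey team",
                "description": "Star catalogue",
                "content": b"catalogue rows"})
hits = archive.query("plasma")
dip = archive.order(hits[0])
```

Keeping descriptive information separate from stored packages mirrors the model's split between Data Management and Archival Storage: queries never touch the preserved content itself.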

44 Ingest Data Flow Diagram

45 Preservation Planning

46 Preservation Perspectives

47 Preservation Description
[Diagram: Migration Context, with labels Content Information Identifier, Data Management and Access View, Descriptive Information, Mapping, AIP Identifier, Archival Storage View, Archival Storage Mapping, Packaging Information, Preservation Description Information, Content Information]

48 Digital Migration
Digital Migration is defined to be the transfer of digital information, while intending to preserve it, within the OAIS
  Focus on preservation of the full information content
  New information implementation replaces the old
  OAIS has full control and responsibility over all aspects of the transfer

49 Migration Motivators
Motivators driving digital migrations
  Media Decay
    Often this is superseded by escalating media drive maintenance costs
  Increased Cost Effectiveness
    More cost-effective media types with higher volumes and lower drive maintenance costs
  New User/Consumer Service Requirements
    New formats more compatible with user’s technology and applications
  Proprietary software evolution
    New software versions used to ‘upgrade’ formats of the information objects being preserved

50 Digital Migration Approaches
Four primary types of digital migration in response to motivators, ordered by increasing risk of information loss:
  Refreshment
    Media replacement with no bit changes
  Replication
    No change to Packaging Information or Content Information bits
  Repackaging
    Some bit changes in Packaging Information
  Transformation
    Reversible: Bit changes in Content Information are reversible by an algorithm
    Non-reversible: Bit changes in Content Information are not reversible by an algorithm
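
The ordering by risk can be expressed as a small classifier in Python. The three boolean inputs are an assumption made for this sketch: the slide distinguishes the types by which bits change, and here any change outside Packaging Information and Content Information (for example, in how the archive maps packages onto storage) is what separates Replication from Refreshment.

```python
from enum import IntEnum


class Migration(IntEnum):
    """Migration types; a higher value means a higher risk of information loss."""
    REFRESHMENT = 1
    REPLICATION = 2
    REPACKAGING = 3
    TRANSFORMATION = 4


def classify(content_bits_changed: bool,
             packaging_bits_changed: bool,
             other_bits_changed: bool = False) -> Migration:
    """Return the riskiest migration type implied by the changes made.

    `other_bits_changed` is a hypothetical flag for changes outside
    Packaging Information and Content Information.
    """
    if content_bits_changed:
        return Migration.TRANSFORMATION
    if packaging_bits_changed:
        return Migration.REPACKAGING
    if other_bits_changed:
        return Migration.REPLICATION
    return Migration.REFRESHMENT


tape_copy = classify(content_bits_changed=False, packaging_bits_changed=False)
new_container = classify(content_bits_changed=False, packaging_bits_changed=True)
format_upgrade = classify(content_bits_changed=True, packaging_bits_changed=True)
```

Using `IntEnum` lets the risk ordering be compared directly, for example `format_upgrade > new_container`.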

51 Access Preservation
Effective access to digital information requires the use of software
Application Programming Interfaces (APIs) may be cost-effectively maintained across time by an OAIS when:
  API is not too complex
  API is applicable to a wide variety of AIUs
  API source code may be ported to new environments
Extensive testing is needed to ensure against information loss
Preservation of executables by full emulation of underlying hardware is problematic
  Hard to know what is the information being preserved
  May not be possible to fully emulate associated devices

52 Archive Interoperability

53 Archive Interoperability Motivators
Users of multiple OAIS archives have reasons to wish for some interoperability or cooperation among the OAISs
Consumers
  Common finding aids to aid in locating information over several OAIS archives
  Common Package Descriptor schema for access
  Common DIP schema for dissemination, or a single global access site
Producers
  Common SIP schema for submission to different archives
  A single depository for all their products
Managers
  Cost reduction through sharing of expensive hardware
  Increasing the uniformity and quality of user interactions with the OAIS

54 Categories of Archive Interactions
Independent: no knowledge by one OAIS of standards implemented at another
Cooperating: Potentially common submission standards, and common dissemination standards, but no common access. One archive may make subscription requests for key data at the cooperating archive
Federated: Access to all federated OAISs is provided through a common set of access aids that provide visibility into all participating OAISs. Global dissemination and Ingest are options
Shared resources: An OAIS in which Management has entered into agreements with other OAISs to share resources to reduce cost. This requires various standards internal to the archive (such as ingest-storage and access-storage interface standards), but does not alter the community’s view of the archive

55 Federated Archives
[Diagram: Federated Archives, with labels Local Consumer, Global, OAIS 1, OAIS 2, Common Catalog, Dissemination Information Package (Optional), Administration, Access, Ingest]

56 3 Levels of Autonomy in Associated Archives
No interactions and therefore no association
Associations that maintain your autonomy. You have to do certain things to participate, but you can leave the association without notice or impact to you.
Associations that bind you by contract. To change the nature of this association you will have to re-negotiate the contract. The amount of autonomy retained depends on how difficult it is to negotiate the changes.

57 Reference Model Summary
Reference model is to be applicable to all digital archives, and their Producers and Consumers
Identifies a minimum set of responsibilities for an archive to claim it is an OAIS
Establishes common terms and concepts for comparing implementations, but does not specify an implementation
Provides detailed models of both archival functions and archival information
Discusses OAIS information migration and interoperability among OAISs

58 Some Applications

59 Selected OAIS Usage Examples
Networked European Deposit Library (NEDLIB)
Royal Library of the Netherlands
  IBM is developing an ‘OAIS like’ implementation
British National Library
  Asking IBM to extend its ‘OAIS like’ implementation
Research Libraries Group and Online Computer Library Center
  Developed an OAIS based approach to ‘trusted repositories’
  Web page to track OAIS implementation efforts/issues
Library of Congress
  Hosting METS XML data packaging approach
  National Digital Information Infrastructure and Preservation Program (NDIIPP)

60 Selected OAIS Usage Examples-2
InterPARES
  Body of National Archives from many countries, adopted OAIS as a starting point for their modeling work
France set up a working group within ARISTOTE interested in archive of digital information, including libraries and Dept of Justice (in French)
  “astonishing unifying role” from OAIS reference model
System for Preservation and Access to Data and Information (SIPAD)
  French space agency plasma physics archive used the OAIS as a basis for design
National Space Science Data Center (NSSDC)
  Evolving our archive using OAIS as a basis for a new architecture

61 Selected OAIS Usage Examples-3
National Archives and Records Administration contracted preservation work with San Diego Supercomputer Center
  Both parties claimed use of the OAIS RM saved several weeks of effort in the specification of the task
Similar experiences between:
  National Library of France and French space agency (CNES) representatives
  National Center for Supercomputing Applications HDF format developers and DNA researchers
  Life Sciences Archive developer and micro-gravity researchers
  United States Department of Agriculture and digital preservation experts

62 Follow-on Activities
Research Libraries Group has established a web page to track OAIS implementation efforts and issues
CCSDS Certification Coordination Function
  Will track and summarize various archive certification efforts
  Will attempt to extract high-level model/checklist
RLG is organizing a group to establish certification approaches

63 Follow-on Activities - 2
Standard Submission Information Package
  Just getting started under CCSDS Archive Ingest Working Group
CCSDS/ISO Producer-Archive Interface Methodology Standard
  Provides framework for Producer/Archive interactions
  Identifies steps and types of information exchanged during the ‘negotiation’
  May be used as a checklist by archives

64 Producer-Archive Interface Methodology Abstract Standard
CCSDS/ISO Producer-Archive Interface Methodology Abstract Standard Overview

65 Model View of an OAIS Environment
[Diagram: the OAIS (archive) surrounded by Producer, Management, and Consumer]
Producer is the role played by those persons, or client systems, who provide the information to be preserved
Management is the role played by those who set overall OAIS policy as one component in a broader policy domain
Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest
The environment surrounding the OAIS is given by this simple model: Producers, Management, and Consumers are all outside the OAIS.

66 Purpose
Standardize the relationships and interactions between an information Producer and an Archive
  Abstract Model
  Terms and Concepts
Define a methodology
  Allows all actions to be structured within this context
  Covers times from first contact by Producer until all information objects are received by the Archive
Provide guidance on the specialization of the methodology to meet the needs of classes of archives, or of specific archives

67 Scope
Identifies different phases in process of transferring information between Producer and Archive
Defines objectives of each phase
Actions to be carried out during each phase
Expected results from end of each phase
General framework able to be re-used for all processes related to Producer-Archive interactions
Basis for development of additional related standards
Basis for development of software tools to assist in different stages of the interactions between Producer and Archive

68 Applicability
All archives conformant to OAIS Reference Model
May be of interest to archives not conformant to OAIS Reference Model
Relevant to archives holding physical as well as digital materials

69 Methodology Conformance
When methodology is used by an archive for a particular ‘archive project’ (acquiring a set of information)
  Usage conforms when all actions in this standard have been considered and implemented as appropriate
When methodology has been specialized or extended to be a Community Standard, it conforms when:
  All actions have been considered and incorporated appropriately, AND
  Methodology for creating the Community Standard has addressed the various work phases defined in section 4

70 Document Organization
Section 1
  Purpose, Scope, Applicability, Conformance
Section 2
  Overview of methodology, players, their relationships, activity phases
Section 3
  Detailed analysis of the four phases defined
    Preliminary definition phase
    Formal phase
    Transfer phase
    Validation phase
Section 4
  Work stages leading to a Community Standard
Annex: Overview of OAIS Reference Model applicable to this standard
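
The four phases named in Section 3 can be written down as an ordered enumeration in Python. Treating them as a strictly linear sequence is a simplifying assumption of this sketch, and the function name is invented; the standard itself may allow phases to overlap or repeat.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Producer-Archive interaction phases, in the order the slide lists them."""
    PRELIMINARY = "Preliminary definition phase"
    FORMAL = "Formal phase"
    TRANSFER = "Transfer phase"
    VALIDATION = "Validation phase"


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the last one."""
    order = list(Phase)  # Enum members iterate in definition order
    index = order.index(current)
    return order[index + 1] if index + 1 < len(order) else None


# Walk a hypothetical archive project through every phase.
walk = []
phase: Optional[Phase] = Phase.PRELIMINARY
while phase is not None:
    walk.append(phase.value)
    phase = next_phase(phase)
```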

71 Overview Schematic

72 Preliminary Phase Outline
First contact
Preliminary definition, feasibility and assessment
  Information to be archived
  Digital objects and standards applied
  Quantification
  Object references
  Security conditions
  Legal and contractual aspects
  Transfer operations
  Validation
  Schedule
  Permanent impact on archive
  Summary of cost, risks
  Critical points
Preliminary agreement

73 Example Actions

74 Formal Phase Outline

75 Transfer Phase Actions

76 Validation Phase Actions

77 Creating a Community “Producer-Archive” Standard
Examples of communities creating such a standard
  National or international standards bodies
  National or international organizations
  An individual archive to guide interactions with its Producers
Work stages to be considered:
  Definition of terminology
  Information model for community
  Standards and tools available or required
  Address actions defined in the Abstract Standard
Best practices
  Broad definition of the community
  Include diverse representation on the writing committee
  Publicize and seek comments from the community
  Submit to standards body as appropriate

78 Status
Track versions of the document from
Register to participate
Version “R-1, April 2003” just released for formal review
Review Site
Document

