LOC 13 June 2003 1 NSSDC Role and OAIS Implementation Brief Overview Don Sawyer.


1 LOC 13 June 2003 1 NSSDC Role and OAIS Implementation Brief Overview Don Sawyer

2 LOC 13 June 2003 2 NSSDC Roles NSSDC is the NASA Office of Space Science (OSS) permanent archive — Astronomy, Solar & Space Plasma Physics, Planetary & Lunar data — Digital and film data spanning 1958-2002 from >1300 instruments flown on >375 spacecraft — Distinguished from OSS Active Archives (AA) Interacts in a timely manner with all distributed OSS active archives in space physics, solar physics, astrophysics, and planetary science disciplines to acquire the OSS data and supporting metadata needed for long term preservation and understanding; — interact directly with projects when mediated by an active archive; — interact with PI's and related individuals when they have data needing long-term preservation.

3 LOC 13 June 2003 3 OSS Archive Relationships [Diagram labels: Planetary AAs; Solar AAs; SEC AAs; Astrophysics AAs; Various OSS S/C Projects; NSSDC Permanent Archive; DLTs, Tapes, CD/DVDs, Film, Paper; Anonymous FTP; OSS Researchers, Non-OSS Researchers, Education Community, General Public; PDS and SEC data on media]

4 LOC 13 June 2003 4 NSSDC Roles (concl’d) NASA's lead for Consultative Committee for Space Data Systems (CCSDS) Archiving and Data Packaging/Registry Working Groups (on-ground data management) — Led development of CCSDS/ISO Open Archival Information System reference model standard Comprehensive information base about all launched spacecraft (~6000) Host of World Data Center for Satellite Information — Part of worldwide World Data Center infrastructure established ~1958

5 LOC 13 June 2003 5 NSSDC’s Permanent Archive Environment - Legacy View ~20 TB in ~2,300 digital data sets on ~40,000 offline media — Most on tape — Most newly arriving media are CD's or DVD's "Data set" is all data from a given source (e.g., instrument on a spacecraft) at a given "processing level." Wide range of data characteristics (e.g., documented binaries specific to now-obsolete computers) Also, ~2,000 data sets on large number of film media of various form factors. — Gradually being digitized into TIFF via scanning.

6 LOC 13 June 2003 6 Initial Drivers for OAIS Re-engineering Needed to solve a migration problem — Remove dependencies of VAX VMS files on the operating system — Include record defining attributes in a standard form to accompany the data file content — Result was package of data/metadata Had software, based on CCSDS/ISO packaging standard, that could be augmented OAIS reference model provided an architectural view

7 LOC 13 June 2003 7 Created Archival Information Package Single File (binary/ascii content) Uses CCSDS/ISO packaging (SFDU) to hold multiple data objects — NSSDC defined attribute object expressed in CCSDS/ISO Parameter Value Language (PVL) — NSSDC data file content in one of four canonical forms Two flavors each of binary and ascii — 20-byte SFDU ascii labels to separate data objects
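
The single-file packaging described on this slide can be sketched roughly as follows. This is a minimal illustration, not NSSDC's software: it assumes a simplified 20-byte ASCII label (a 4-character authority ID, a version character, a class character, two filler characters, a 4-character description ID, and an 8-digit length), and the authority ID `NSSD`, the class letters, and the description IDs are all placeholders invented for the example.

```python
from typing import List

LABEL_LEN = 20  # the slide states that labels are 20 bytes of ASCII


def sfdu_label(class_id: str, ddid: str, length: int, caid: str = "NSSD") -> bytes:
    """Build a simplified 20-byte ASCII label (field values are hypothetical)."""
    if len(caid) != 4 or len(ddid) != 4 or len(class_id) != 1:
        raise ValueError("caid and ddid must be 4 chars, class_id 1 char")
    if not 0 <= length <= 99_999_999:
        raise ValueError("length must fit in 8 ASCII digits")
    # 4 (authority) + 1 (version) + 1 (class) + 2 (filler) + 4 (ddid) + 8 (length)
    return f"{caid}3{class_id}A0{ddid}{length:08d}".encode("ascii")


def package(attribute_object: bytes, data_object: bytes) -> bytes:
    """Wrap an attribute object and a data object in one labelled package."""
    ao = sfdu_label("K", "0001", len(attribute_object)) + attribute_object
    sdo = sfdu_label("I", "0002", len(data_object)) + data_object
    body = ao + sdo
    return sfdu_label("Z", "0000", len(body)) + body


def unpack(aip: bytes) -> List[bytes]:
    """Recover the inner objects by walking the labels inside the package."""
    outer_len = int(aip[12:LABEL_LEN])
    body = aip[LABEL_LEN:LABEL_LEN + outer_len]
    objects, pos = [], 0
    while pos < len(body):
        n = int(body[pos + 12:pos + LABEL_LEN])
        objects.append(body[pos + LABEL_LEN:pos + LABEL_LEN + n])
        pos += LABEL_LEN + n
    return objects
```

Because every object is preceded by a fixed-size label that carries its length, a reader can separate the attribute object from the data without knowing anything about the data format itself.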

8 LOC 13 June 2003 8 NSSDC Attribute Object — Object identification and version — Archival Storage Id (unique) — Collection Id — Checksum over rest of attribute object — Attributes for original data stream Date/time created, operating system, size in bytes, record format, binary/ascii flag, file name, checksum, etc. — Attributes for canonical form of data stream Date/time created, operating system, size in bytes, record format, binary/ascii flag, file name, checksum, processing report, format identifier (ADID), etc. — Order of applied encodings (e.g., tar, gzip) — Start date/time of data observations
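
As a rough illustration of an attribute object with a "checksum over rest of attribute object", the sketch below writes attributes as simple `NAME = value;` statements in the style of PVL and prepends a checksum of everything that follows. The keyword names, the `AO_CHECKSUM` statement, and the use of CRC-32 are assumptions made for this example; the slides do not name the checksum algorithm or the actual keywords.

```python
import zlib
from typing import Dict, Union


def to_pvl(attrs: Dict[str, Union[str, int]]) -> str:
    """Serialize attributes as PVL-style 'NAME = value;' statements."""
    lines = []
    for key, value in attrs.items():
        if isinstance(value, str):
            value = f'"{value}"'  # quote strings, leave integers bare
        lines.append(f"{key} = {value};")
    return "\n".join(lines) + "\n"


def build_attribute_object(attrs: Dict[str, Union[str, int]]) -> str:
    """Prepend a checksum line that covers the rest of the attribute object."""
    body = to_pvl(attrs)
    checksum = zlib.crc32(body.encode("ascii")) & 0xFFFFFFFF  # CRC-32 is an assumption
    return f"AO_CHECKSUM = {checksum};\n" + body


def verify(ao_text: str) -> bool:
    """Recompute the checksum over everything after the first line."""
    first, rest = ao_text.split("\n", 1)
    stored = int(first.split("=")[1].strip().rstrip(";"))
    return stored == (zlib.crc32(rest.encode("ascii")) & 0xFFFFFFFF)
```

Example use, with hypothetical attribute names:

```python
ao = build_attribute_object({
    "ARCHIVAL_STORAGE_ID": "AS-0001",
    "ORIGINAL_RECORD_FORMAT": "VARIABLE",
    "CANONICAL_SIZE_BYTES": 1024,
})
print(verify(ao))
```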

9 LOC 13 June 2003 9 NSSDC Permanent Archive - New Direction Bundle data files (objects) with data-file-descriptive attribute file (object) and pointers to further documentation into OAIS "Archival Information Package (AIP)" — Write to Digital Linear Tape (DLT)-based jukebox in Unix environment — Write data files and attribute files to RAID disk for ftp-based access by external customers [Diagram labels: AIP Structure; Attribute Object (AO); Label; Sensor Data Object (SDO); CCSDS/ISO Label for Packaging; CCSDS/ISO Label for Attribute Object; CCSDS/ISO Label for Sensor Data Object; Globally Unique Registry Identifiers; Expressed using CCSDS/ISO language]

10 LOC 13 June 2003 10 “New Direction”

11 LOC 13 June 2003 11 Migrating Data into AIPs Have created AIPs for data previously on NSSDC's newly retired 12" WORM data dissemination jukebox — VMS-based, so some attributes placed in attribute objects compensate for loss of VMS/Files-11 support — Modified data files in cases of variable-length records, and introduced "CR/LF" for appropriate ASCII data Now creating multi-data-file AIP and upgrading software to accommodate data migrating from legacy offline tapes — Will start ingest from tape imminently
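
The record conversion mentioned on this slide might look something like the sketch below. It assumes the input is a raw byte stream in which each variable-length record is preceded by a 2-byte little-endian length and padded to an even number of bytes; that layout is an assumption for illustration and is not described in the slides. The function names are hypothetical.

```python
import struct
from typing import List


def records_from_varlen(raw: bytes) -> List[bytes]:
    """Split a length-prefixed byte stream into individual records."""
    records, pos = [], 0
    while pos + 2 <= len(raw):
        (n,) = struct.unpack_from("<H", raw, pos)  # 2-byte record length
        pos += 2
        records.append(raw[pos:pos + n])
        pos += n + (n & 1)  # skip the pad byte after odd-length records
    return records


def to_crlf(records: List[bytes]) -> bytes:
    """Write records as ASCII lines terminated by CR/LF."""
    return b"".join(record + b"\r\n" for record in records)
```

Once the record boundaries are carried by CR/LF in the file content itself, the data no longer depends on the original file system to delimit records.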

12 LOC 13 June 2003 12 Facilitating Archiving via Data Supplier Support NSSDC has provided software to the IMAGE spacecraft project — Generates attribute objects and bundles these with data files into Archival Information Packages (AIPs) — IMAGE script transmits these to NSSDC Looking for other opportunities to support NASA spacecraft projects equivalently — Cost-effective data ingest [Diagram labels: Data files; Configuration information; NSSDC Package Generator; AIPs; IMAGE Script; ftp; IMAGE Science Operations Centre; National Space Science Data Center]

13 LOC 13 June 2003 13 NSSDC Architecture Summary The system architecture is: — Compliant with the OAIS functional model: separates different functions (ingest, archival storage, data management, access) — Compliant with the OAIS information model: defines an Archival Information Package (AIP) for preservation in Archival Storage Data are being migrated into Archival Information Packages for long-term storage on DLTs New data either arrive as AIPs (e.g., from the IMAGE project) or are put into AIPs during the Ingest process

14 LOC 13 June 2003 14 Current Activities Developing a better integration of our metadata databases — Many have grown up over the years — Taking advantage of Java and web capabilities Developing an Archival Information Package type that allows multiple ‘canonical data files’ in a single package file. — Needed for the migration of legacy data on magnetic tape — Needed to put small files together for ease of management Planning a better overall integration of our architecture — E.g., tighter coupling between AIPs and other information bases

15 LOC 13 June 2003 15 Backups

16 LOC 13 June 2003 16 NSSDC AIP Schematic

17 LOC 13 June 2003 17 NSSDC Archive - Logical Architecture

18 LOC 13 June 2003 18 Archive Challenges Making the most cost-effective judgements on modernization of low-access-potential older data sets. — Convert vendor-specific binaries to IEEE-binary? Via EAST? Convert to ASCII? Implementing an efficient production process for migrating data from ~10,000 tapes through AIP-creation software to nearline DLT-based permanent archive Defining the post-DLT permanent archive environment Ensuring existence of all material needed to make data correctly and independently usable — Couple such material to the data being supported

19 LOC 13 June 2003 19 NSSDC Metadata Environment Information base (JEDS) about — All launched spacecraft, — Instruments on space science spacecraft, — NSSDC-held data sets therefrom. — Underlies "NSSDC Master Catalog" interface. Information base (DIOnAS) about data files — Written to new nearline permanent archive — Written to anonymous nssdcftp/spacecraft_data/ Attribute objects with technical information about data files Information base (JIN) about data media

20 LOC 13 June 2003 20 NSSDC Metadata Environment (concl’d) Information base (CAOIS) of CCSDS-registered data set-descriptive information (e.g., formats) — Assigns globally-unique registry identifiers — Relevant to growing fraction of NSSDC data plus other data Array of "data set catalogs" with detailed information on NSSDC-held legacy data sets — Presently on CD's as TIFF and PDF images Other special purpose information bases and metadata collections NSSDC data set ID's are primary mechanism currently linking these "metadata modules"

21 LOC 13 June 2003 21 NSSDC’s Metadata Challenges To ensure flow to NSSDC of material needed for the correct and independent use of data along with the flow of data to NSSDC To optimally integrate metadata modules to support: — Users' finding, retrieval and use of data, — NSSDC staffers' archive management activities To ensure that all relevant supporting material is visible to and readily retrievable by NSSDC's data-accessing customers.

22 LOC 13 June 2003 22 Software NSSDC has growing amount of low-processing-level (lpl) data — Started archiving such data only in past decade NSSDC has very little data set-specific READ/PROCESS software — This greatly limits usability of lpl data Lpl data handled by systems/formats like SDDAS/IDFS and IMAGE_Archive/UDF Major need for software standards/approaches to accompany lpl data into archives — Ensure long-term usability of such data Archiving of relevant software source code a minimal requirement

