Data Management with HDF5
Quincey Koziol, Director of Core Software Development and HPC, The HDF Group
September 10, 2012, NASA Digital Twin Workshop
HDF5 is…
- A file format for managing any kind of data
- A software system to manage data in the format
- Designed for high-volume or complex data
- Designed for every size and type of system
- Open format and software
Data Life Cycle
[Figure: data life cycle diagram. Stage labels: Acquisition, Planning, Cleaning, Transformation, Packaging, Use, Reuse, Distribution, Processing, Analysis, Repurposing, Archival, Products]
NASA Earth Observing System (EOS)
[Figure: EOS satellites and their instruments; image label "Aqua (6/01)"]
- Aura: TES, HIRDLS, MLS, OMI
- Terra: CERES, MISR, MODIS, MOPITT
- Aqua: CERES, MODIS, AMSR
Aberdeen Test Center
Aberdeen Test Center
Hybrid: HDF5 and DB side-by-side
[Diagram labels: Application; SQL query; Query results; RDBMS (relational tables, indexes); HDF5; Analysis; Analysis results]
- RDBMS used for SQL queries; points to objects in HDF5 files
- Direct access through the HDF5 API for scientific analysis
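The hybrid arrangement on this slide can be illustrated with a small sketch. It is not the Aberdeen Test Center implementation; it only shows the general pattern of a relational catalog whose rows point to objects inside HDF5 files. The table name `test_runs`, its columns, the file names, and the HDF5 object paths are all hypothetical, and SQLite stands in for whatever RDBMS is actually used.

```python
import sqlite3


def build_catalog(conn):
    """Create a hypothetical catalog table whose rows point into HDF5 files."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS test_runs (
               run_id    INTEGER PRIMARY KEY,
               test_item TEXT NOT NULL,   -- what was tested (hypothetical field)
               test_date TEXT NOT NULL,   -- ISO 8601 date string
               hdf5_file TEXT NOT NULL,   -- which HDF5 file holds the data
               hdf5_path TEXT NOT NULL    -- object path inside that file
           )"""
    )
    # The index lets the RDBMS answer "which runs?" without touching bulk data.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_test_item ON test_runs(test_item)"
    )


def register_run(conn, test_item, test_date, hdf5_file, hdf5_path):
    """Record where one run's data lives; returns the new row id."""
    cur = conn.execute(
        "INSERT INTO test_runs (test_item, test_date, hdf5_file, hdf5_path) "
        "VALUES (?, ?, ?, ?)",
        (test_item, test_date, hdf5_file, hdf5_path),
    )
    return cur.lastrowid


def find_runs(conn, test_item):
    """SQL query side: return (hdf5_file, hdf5_path) pointers, oldest first."""
    return conn.execute(
        "SELECT hdf5_file, hdf5_path FROM test_runs "
        "WHERE test_item = ? ORDER BY test_date",
        (test_item,),
    ).fetchall()


# In-memory database with made-up entries, for illustration only.
conn = sqlite3.connect(":memory:")
build_catalog(conn)
register_run(conn, "item-A", "2012-04-15", "runs_2012_04.h5", "/item-A/run_007/accel")
register_run(conn, "item-A", "2012-03-01", "runs_2012_03.h5", "/item-A/run_001/accel")
register_run(conn, "item-B", "2012-03-02", "runs_2012_03.h5", "/item-B/run_002/accel")

pointers = find_runs(conn, "item-A")
```

Each returned pointer names a file and an object path. The analysis side would then open that object directly through an HDF5 binding, for example `h5py.File(hdf5_file, "r")[hdf5_path]` in h5py; that step is left out here so the sketch runs with only the Python standard library.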
NARA – TWR Collection
Goal: Using NARA’s TWR collection, investigate the possibilities and limitations of using HDF5 as a container for archiving heterogeneous collections of records, with special attention to STEP data.
NARA – TWR Collection
Activities
- Use files, datatypes, structures in NARA TWR collection – STEP files, photos, schematics, etc.
- Map these to HDF5 objects and structures, exploiting features of HDF5
- Assess benefits and costs in terms of storage efficiency and accessibility
- Investigate use of HDF5 as container for collection
NARA – TWR Collection
[Screenshots: TWR files from NARA; converted to HDF5, displayed in HDFView]
Protecting Access to Your Data
[Figure labels: Parallel File System, Cloud, Exascale, Alternate File Formats, Remote Access, API, Multimedia, Portability, Flash Storage, Programming Languages, Processor Architecture, Vendor Lock-in, Open Source, REST, OPeNDAP, Supercomputer, PC, Embedded Device, XML, Database, HDF5, File Format Evolution, Scalability, Performance, Technological Fad Insurance, Backward/Forward Compatibility, Extensibility]
HDF5 Wrap-Up
- For all scientific and engineering data
- Provides flexible, efficient storage and I/O
- Supports long-term data access
- Data platform for mission critical operations
- Big solutions today and tomorrow
Questions?