Moving from HDF4 to HDF5/netCDF-4

Presentation transcript:

Moving from HDF4 to HDF5/netCDF-4
Elena Pourmal, Kent Yang, Joe Lee
epourmal@hdfgroup.org, myang6@hdfgroup.org, hyoklee@hdfgroup.org
The HDF Group
This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C

Outline
* Differences between HDF4 and HDF5
  * Data model and capabilities
* Moving data and applications from HDF4 to HDF5
* Taking advantage of HDF5 when converting data
* Creating compatibility with netCDF-4 when migrating data from HDF4 to HDF5

HDF4 and HDF5 Data Models

HDF4 objects, each followed by the corresponding HDF5 object:

* A scientific dataset (SD), a multidimensional array with dimension scales
  HDF5: a dataset, a multidimensional array of records; no built-in dimension scales (they are provided by the HL library)
* An 8-bit raster image (DFR8), a 2-dimensional array of 8-bit pixels
* A 24-bit raster image (DF24), a 2-dimensional array of 24-bit pixels
* A general raster image (GR), a 2-dimensional array of multi-component pixels
* An 8-bit color lookup table or palette (DFP), a 256 by 3 array of 8-bit integers
  HDF5 (for the raster images and the palette): an HDF5 dataset with attributes
* A table (Vdata), a sequence of records
  HDF5: a one-dimensional HDF5 dataset of the records
* An annotation (AN), a stream of text that can be attached to any object
  HDF5: attributes, a scaled-down version of an HDF5 dataset
* A group (Vgroup), a structure for grouping objects
  HDF5: a group, a structure for grouping objects

HDF4 and HDF5 Capabilities

HDF4 | HDF5
2 GB limit on file size | No limit on file size
Limit on number of objects (~20000) | No limit on number of objects
One unlimited dimension; dataset cannot be compressed | Up to 32 unlimited dimensions; dataset can be compressed
Compression does not require chunked storage | Compression requires chunked storage
Limited number of compression methods | Custom compression methods supported
Limited number of supported datatypes | Datatypes of any complexity
Numerical data is always in big-endian (BE) format | User-defined endianness
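The byte-order difference in the comparison above can be illustrated without any HDF library. The following is a minimal sketch using only Python's standard `struct` module; the sample value is arbitrary and the variable names are illustrative, not part of either format's API. It shows what "always big-endian" means at the byte level and why a reader on a little-endian machine has to swap bytes for such data.

```python
import struct

value = 1000  # arbitrary sample value, 0x000003E8 as a 32-bit integer

# Big-endian layout: most significant byte first (the order HDF4 always uses
# for numerical data, according to the comparison above).
big_endian = struct.pack(">i", value)

# Little-endian layout: least significant byte first (an order an HDF5 writer
# may choose, for example to match the native order of an x86 machine).
little_endian = struct.pack("<i", value)

# Reading back requires knowing which order was used when writing.
assert struct.unpack(">i", big_endian)[0] == value
assert struct.unpack("<i", little_endian)[0] == value

# Converting one layout to the other is a byte swap of each element.
swapped = big_endian[::-1]
```

When the stored order matches the machine's native order, no such swap is needed on read or write, which is one reason the choice of endianness matters when converting data.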

Moving data and applications from HDF4 to HDF5

* H4h5tools conversion toolkit
  * Mapping specification: https://support.hdfgroup.org/HDF5/doc/ADGuide/H4toH5Mapping.pdf
  * Library
  * Command-line tools h4toh5 and h5toh4
* Moving applications
  * Software has to be rewritten if it uses the HDF library directly
  * HDF-EOS2 and netCDF based applications require minimal rework
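As a rough illustration of scripting the command-line converter, the sketch below wraps `h4toh5` in a small Python helper. It assumes the tool is installed and on the `PATH`, and that it accepts the input HDF4 file followed by the output HDF5 file as positional arguments; check the tool's own help output before relying on this. The function names and file names are hypothetical.

```python
import shutil
import subprocess


def build_h4toh5_command(hdf4_path, hdf5_path):
    """Build the argument list for one conversion (assumed positional form)."""
    return ["h4toh5", hdf4_path, hdf5_path]


def convert(hdf4_path, hdf5_path):
    """Run h4toh5 if it is available; return False when the tool is missing."""
    if shutil.which("h4toh5") is None:
        return False
    # check=True raises CalledProcessError if the converter reports a failure.
    subprocess.run(build_h4toh5_command(hdf4_path, hdf5_path), check=True)
    return True


# Hypothetical file names, used only to show the shape of the command.
command = build_h4toh5_command("granule.hdf", "granule.h5")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.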

Taking advantage of HDF5 and avoiding pitfalls

* Data endianness
* Chunked storage for compression and data extensibility
  * Contiguous vs. chunked storage
  * Chunk sizes
* Compression methods in HDF4 and HDF5
* Using strings in HDF5
  * HDF4 fixed character arrays vs. HDF5 strings
* Working with dimension scales
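To make the chunk-size item concrete, here is a small arithmetic sketch in plain Python. It does not call HDF5; it only computes how many chunks a given chunk shape produces and how large one uncompressed chunk is. The dataset shape, chunk shape, and element size are made-up example values, and `chunk_layout` is a hypothetical helper.

```python
import math


def chunk_layout(shape, chunk_shape, itemsize):
    """Return (chunks per dimension, total chunks, bytes per uncompressed chunk)."""
    if len(shape) != len(chunk_shape):
        raise ValueError("shape and chunk_shape must have the same rank")
    # Integer ceiling division: a partially filled edge chunk still counts.
    per_dim = [-(-n // c) for n, c in zip(shape, chunk_shape)]
    total = math.prod(per_dim)
    chunk_bytes = math.prod(chunk_shape) * itemsize
    return per_dim, total, chunk_bytes


# Example values: a 1000 x 2000 array of 4-byte elements in 100 x 256 chunks.
per_dim, total, chunk_bytes = chunk_layout((1000, 2000), (100, 256), 4)
```

Because a chunk is the unit that is compressed and read from disk, very small chunks multiply the number of pieces to manage, while very large chunks force a reader to fetch and decompress far more data than a small subset needs.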

Creating compatibility with netCDF-4

HDF5 files can be read by the netCDF-4 library and tools unless they use features that are not supported by netCDF-4.

* See the unsupported HDF5 features in netCDF-4: http://www.unidata.ucar.edu/software/netcdf/docs/faq.html#fv15
* To ensure maximum interoperability, do not use:
  * Hierarchical HDF5 structure (nested groups)
  * HDF5 user-defined datatypes
  * HDF5 compound datatypes

From http://www.unidata.ucar.edu/software/netcdf/docs/interoperability_hdf5.html:

NetCDF-4 intentionally supports a simpler data model than HDF5, which means there are HDF5 files that cannot be converted to netCDF-4, including files that make use of features in the following list:

* Multidimensional data that doesn't use shared dimensions implemented using HDF5 "dimension scales". (This restriction was eliminated in netCDF 4.1.1, permitting access to HDF5 datasets that don't use dimension scales.)
* Non-hierarchical organizations of Groups, in which a Group may have multiple parents or may be both an ancestor and a descendant of another Group, creating cycles in the subgroup graph. In the netCDF-4 data model, Groups form a tree with no cycles, so each Group (except the top-level unnamed Group) has a unique parent.
* HDF5 "references" which are like pointers to objects and data regions within a file. The netCDF-4 data model does not support references.
* Additional primitive types not included in the netCDF-4 data model, including H5T_TIME, H5T_BITFIELD, and user-defined atomic types.
* Multiple names for data objects such as variables and groups. The netCDF-4 data model requires that each variable and group have a single distinguished name.
* Attributes attached to user-defined types.
* Stored property lists
* Object names that begin or end with a space

If you know that an HDF5 file conforms to the netCDF-4 enhanced data model, either because it was written with netCDF function calls or because it doesn't make use of HDF5 features in the list above, then it can be accessed using netCDF-4, and analyzed, visualized, and manipulated through other applications that can access netCDF-4 files.
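Two of the restrictions listed above (groups must form a tree, and names must not begin or end with a space) are easy to check once the group structure is known. The sketch below works on a hypothetical in-memory description of a file, a dictionary mapping each group name to its child group names; it does not read HDF5 files and is not part of the netCDF or HDF5 APIs. It only walks groups reachable from the root.

```python
def find_group_problems(children, root="/"):
    """Return (group, reason) pairs for a hypothetical group graph.

    `children` maps each group name to the list of its child group names.
    """
    problems = []

    # Collect the parents of every group that appears as a child.
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(parent)
    for group in sorted(parents):
        if len(parents[group]) > 1:
            problems.append((group, "multiple parents"))

    # Depth-first walk from the root; meeting a group that is still on the
    # current path means the subgroup graph contains a cycle.
    on_path, finished = set(), set()

    def walk(group):
        if group in finished:
            return
        if group in on_path:
            problems.append((group, "cycle"))
            return
        on_path.add(group)
        for kid in children.get(group, []):
            walk(kid)
        on_path.discard(group)
        finished.add(group)

    walk(root)

    # Names that begin or end with a space are on the unsupported list.
    for group in sorted(set(children) | set(parents)):
        if group.startswith(" ") or group.endswith(" "):
            problems.append((group, "name begins or ends with a space"))
    return problems


# A plain tree passes; a group shared by two parents does not.
tree_report = find_group_problems({"/": ["a", "b"], "a": ["c"]})
shared_report = find_group_problems({"/": ["a", "b"], "a": ["c"], "b": ["c"]})
```

An empty report here says nothing about the other items in the list (references, unsupported primitive types, stored property lists), which would need their own checks against the actual file.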

Creating compatibility with netCDF-4 (to be expanded)

Table columns: Tools | NetCDF-3 Format | NetCDF-3 Format following CF Conventions | NetCDF-4 Format following netCDF-4 generic model | NetCDF-4 Format following netCDF-4 classic model | NetCDF-4 Format following netCDF-4 classic model and CF

* HDF-EOS5 augmentation tool: No; Yes. Note: Users have flexibility to specify dimension scales. Tested with NASA Aura files.
* HDF-EOS5 to netCDF-4 converter: Yes. Note: Users have no control. The converter tries to map the HDF-EOS5 dimension information provided by the file to the netCDF-4 enhanced model.

In addition to using the tools, one can create netCDF-4 files directly through the HDF5 API. Appendix B, "Creating a valid netCDF4 file using the HDF5 API", of the NASA file format standard document (https://cdn.earthdata.nasa.gov/conduit/upload/497/ESDS-RFC-022v1.pdf) can serve as a reference.

This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C