Presentation is loading. Please wait.

Presentation is loading. Please wait.

Moving from HDF4 to HDF5/netCDF-4

Similar presentations


Presentation on theme: "Moving from HDF4 to HDF5/netCDF-4"— Presentation transcript:

1 Moving from HDF4 to HDF5/netCDF-4
Elena Pourmal, Kent Yang, Joe Lee , The HDF Group This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C

2 Outline Difference between HDF4 and HDF5 Data model and capabilities
Moving data and applications from HDF4 to HDF5 Taking advantage of HDF5 when converting data Creating compatibility with netCDF-4 when migrating data from HDF4 to HDF5

3 HDF4 and HDF5 Data Models HDF4 Objects HDF5 Objects
A scientific dataset (SD), a multidimensional array with dimension scales An 8-bit raster image (DFR8), a 2- dimensional array of 8-bit pixels A 24-bit raster image (DF24), a 2- dimensional array of 24-bit pixels A general raster image (GR), a 2- dimensional array of multi-component pixels An 8-bit color lookup table or palette (DFP), a 256 by 3 array of 8 bit integers A table (Vdata), a sequence of records An annotation (AN), a stream of text that can be attached to any object A group (Vgroup), a structure for grouping objects A dataset, a multidimensional array of records; no dimension scales (HL library) HDF5 dataset with attributes HDF5 one-dim dataset of the records Attributes, scale down version of HDF5 dataset A group, a structure for grouping objects

4 HDF4 and HDF5 Capabilities
2GB limit on file size Limit on number of objects (~20000) One unlimited dimension; dataset cannot be compressed Compression doesn’t require chunking storage Limited number of compression methods Limited number of supported datatypes Numerical data is always in BE format No limit on file size No limit on number of objects Up to 32 unlimited dimensions; dataset can be compressed Compression requires chunking storage Custom compression methods supported Datatypes of any complexity User-defined endianess

5 Moving data and applications from HDF4 and HDF5
H4h5tools conversion toolkit Mapping Spec Library Command-line tools h4toh5 and h5toh4 Moving Applications Software has to be rewritten if using HDF library HDF-EOS2 and netCDF based applications require minimum rework

6 Taking advantages of HDF5 and avoiding pitfalls
Data endianess Chunked storage for compression and data extensibility Contiguous vs. chunked storage Chunk sizes Compression methods in HDF4 and HDF5 Using strings in HDF5 HDF4 fixed character arrays vs HDF5 strings Working with dimension scales

7 Creating compatibility with netCDF-4
HDF5 files can be read by netCDF-4 library and tools unless they use features that are not supported by netCDF-4 See unsupported HDF5 features in netCDF-4 To assure maximum interoperability do not use Hierarchical HDF5 structure (nested groups) HDF5 user-defined datatypes HDF5 compound datatypes   NetCDF-4 intentionally supports a simpler data model than HDF5, which means there are HDF5 files that cannot be converted to netCDF-4, including files that make use of features in the following list: * Multidimensional data that doesn't use shared dimensions implemented using HDF5 "dimension scales". (This restriction was eliminated in netCDF 4.1.1, permitting access to HDF5 datasets that don't use dimension scales.) * Non-hierarchical organizations of Groups, in which a Group may have multiple parents or may be both an ancestor and a descendant of another Group, creating cycles in the subgroup graph. In the netCDF-4 data model, Groups form a tree with no cycles, so each Group (except the top-level unnamed Group) has a unique parent. * HDF5 "references" which are like pointers to objects and data regions within a file. The netCDF-4 data model does not support references. * Additional primitive types not included in the netCDF-4 data model, including H5T_TIME, H5T_BITFIELD, and user-defined atomic types. * Multiple names for data objects such as variables and groups. The netCDF-4 data model requires that each variable and group have a single distinguished name. * Attributes attached to user-defined types. * Stored property lists * Object names that begin or end with a space If you know that an HDF5 file conforms to the netCDF-4 enhanced data model, either because it was written with netCDF function calls or because it doesn't make use of HDF5 features in the list above, then it can be accessed using netCDF-4, and analyzed, visualized, and manipulated through other applications that can access netCDF-4 files.

8 Creating compatibility with netCDF-4 (to be expanded)
Tools NetCDF-3 Format NetCDF-3 Format following CF Conventions NetCDF-4 Format following netCDF-4 generic model NetCDF-4 Format following netCDF-4 classic model NetCDF-4 Format following netCDF-4 classic model and CF HDF-EOS5 augmentation tool No Yes. Note: Users have flexibility to specify dimension scales. Tested with NASA Aura files. HDF-EOS5 to netCDF-4 converter Yes. Note: Users have no control. The converter tries to map the HDF-EOS5 dimension information provided by the file to netCDF-4 enhanced model. In addition to using the tools, one may follow some instructions to create netCDF-4 files via HDF5 APIs, At NASA File Format standard document,  Appendix B  Creating a valid netCDF4 file using the HDF5 API may be a reference for people to check.

9 This work was supported by NASA/GSFC under Raytheon Co
This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C


Download ppt "Moving from HDF4 to HDF5/netCDF-4"

Similar presentations


Ads by Google