NetCDF-4 Interoperability with HDF4 and HDF5 Ed Hartnett Unidata, 8/4/9.


NetCDF-4 Interoperability with HDF4 and HDF5 Ed Hartnett Unidata, 8/4/9

Purpose of Interoperability Features: World Conquest The purpose of the interoperability features is to allow users to use netCDF programs on non-netCDF data archives. NetCDF-Java can read many data formats; the idea is to bring some of this functionality to the C/Fortran/C++ libraries.

Warning and Request HDF4 and HDF5 interoperability features are still being tested. They are not ready for operational use yet. The interoperability features are available in the netCDF daily snapshot release. Please use them and send feedback to:

Overview HDF4 Interoperability – What is HDF4 and why bother with it? – Reading HDF4 files with netCDF. – Limitations and request for help. HDF5 Interoperability – What is HDF5 and why bother with it? – Reading HDF5 files with netCDF. – Limitations.

What is HDF4? The original HDF format, superseded by HDF5. HDF4 has built-in 32-bit limits that make it unattractive for new data sets. It is still actively supported by The HDF Group, but no new features are added. Get more info about HDF4 at:

Why Read HDF4? Some important data sets are distributed in HDF4, for example the Aqua/Terra satellite data.

HDF4 Background HDF4 has several different APIs. The one of greatest interest to netCDF users is the SD (Scientific Data) API. The SD API is (intentionally) very similar to the netCDF classic data model.

Confusing: HDF4 Includes NetCDF v2 API A netCDF V2 API is provided with HDF4 which writes SD data files. This must be turned off at HDF4 install-time if netCDF and HDF4 are to be linked in the same application. There is no easy way to use both HDF4 with netCDF API and netCDF with HDF4 read capability in the same program.

Reading HDF4 SD Files Starting with version 4.1, netCDF will be able to read HDF4 files created with the “Scientific Dataset” (SD) API. This is read-only: NetCDF can't write HDF4! The intention is to make netCDF software work automatically with important HDF4 scientific data collections.

Building NetCDF to Read HDF4 This is only available for those who also build netCDF with HDF5. HDF4, HDF5, zlib, and other compression libraries must exist before netCDF is built. Build like this: ./configure --with-hdf5=/home/ed --enable-hdf4

Compiling with HDF4 Include the netcdf header file as usual. Include locations of netCDF, HDF5, and HDF4 include directories: -I/loc/of/netcdf/include -I/loc/of/hdf5/include -I/loc/of/hdf4/include

Linking with HDF4 The HDF4 and HDF5 libraries (and associated libraries) are needed and must be linked into all netCDF applications. The locations of the lib directories must also be provided: -L/loc/of/netcdf/lib -L/loc/of/hdf5/lib -L/loc/of/hdf4/lib -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz

Use nc-config to Help with Compile Flags The nc-config utility is provided to help with compiler flags:
    $ ./nc-config --cflags
    -I/usr/local/include
    $ ./nc-config --libs
    -L/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4
    $ ./nc-config --flibs
    -M/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4

Implementation Notes You don't need to identify the file as HDF4 when opening it with netCDF, but you do have to open it read-only. The HDF4 SD API provides a named, shared dimension, which fits easily into the netCDF model. The HDF4 SD API uses other HDF4 APIs (like vgroups) to store metadata. This can be confusing when using the HDF4 data dumping tool hdp.

C Code to Read HDF4 SD File
    /* Create a file with one SDS, containing our phony data. */
    sd_id = SDstart(FILE_NAME, DFACC_CREATE);
    sds_id = SDcreate(sd_id, PRES_NAME, DFNT_INT32, DIMS_2, dim_size);
    SDwritedata(sds_id, start, NULL, edge, (void *)data_out);
    if (SDendaccess(sds_id)) ERR;
    if (SDend(sd_id)) ERR;
    /* Now open with netCDF and check the contents. */
    if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
    if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
    ...

ncdump and HDF4 SD Files With HDF4 reading enabled, ncdump works on HDF4 files. Sample MODIS file:
    ../ncdump/ncdump -h MOD29.A hdf
    netcdf MOD29.A {
    dimensions:
        Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice = 406 ;
        Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice = 271 ;
        Along_swath_lines_1km\:MOD_Swath_Sea_Ice = 2030 ;
        Cross_swath_pixels_1km\:MOD_Swath_Sea_Ice = 1354 ;
    variables:
        float Latitude(Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice, Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice) ;
            Latitude:long_name = "Coarse 5 km resolution latitude" ;
            Latitude:units = "degrees" ;
    ...
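
The dimension names in the sample output appear to combine a dimension name and a swath name, joined by a colon (the backslash is assumed here to be only ncdump's escaping of the colon). A minimal sketch of separating the two parts, using only the C standard library, might look like this. The function name split_swath_dim and its calling convention are hypothetical and are not part of netCDF or HDF4.

```c
#include <string.h>

/* Hypothetical helper: split "dim:swath" at the first ':' into two
 * caller-supplied buffers.
 * Returns 0 on success, -1 if there is no ':' or a part does not fit. */
static int split_swath_dim(const char *full, char *dim, size_t dim_cap,
                           char *swath, size_t swath_cap)
{
    const char *colon = strchr(full, ':');
    if (colon == NULL)
        return -1;

    size_t dim_len = (size_t)(colon - full);
    size_t swath_len = strlen(colon + 1);
    if (dim_len >= dim_cap || swath_len >= swath_cap)
        return -1;

    memcpy(dim, full, dim_len);
    dim[dim_len] = '\0';
    memcpy(swath, colon + 1, swath_len + 1);  /* copies the trailing NUL too */
    return 0;
}
```

Called on "Coarse_swath_lines_5km:MOD_Swath_Sea_Ice", this fills the two buffers with "Coarse_swath_lines_5km" and "MOD_Swath_Sea_Ice"; a name without a colon, such as "Latitude", is rejected with -1.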

HDF-EOS Not Understood Many HDF4 data sets of interest follow the HDF-EOS metadata standard. Stored as a long text string in global attributes, the HDF-EOS metadata looks messy.
    // global attributes:
        :HDFEOSVersion = "HDFEOS_V2.9" ;
        :StructMetadata.0 = "GROUP=SwathStructure\n\tGROUP=SWATH_1\n\t\tSwathName=\"MOD_Swath_Sea_Ice\"\n\t\tGROUP=Dimension\n\t\t\tOBJECT=Dimension_1\n\t\t\t\tDimensionName=\"Coarse_swath_lines_5km\"\n\t\t\t\tSize=406\n\t\t\tEND_OBJECT=Dimension_1\n\t\t\tOBJECT=Dimension_2\n\t\t\t\tDimensionName=\"Coarse_swath_pixels_5km\"\n\t\t\t\tSize=271\n\t\t\t...
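
Since netCDF does not interpret this attribute, an application that needs the HDF-EOS dimension list has to pick it out of the text itself. The following is a rough sketch of that, assuming the metadata has already been read into a C string (for example from the StructMetadata.0 global attribute) and that each DimensionName="..." entry is followed by its Size=... entry with no spaces around the equals sign, as in the sample. The names eos_dim, parse_eos_dims and MAX_NAME are hypothetical. This is plain text matching, not a full parser for the HDF-EOS metadata syntax, and it is not how the HDF-EOS toolkit works.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 256

/* One dimension recovered from the metadata text (hypothetical type). */
struct eos_dim {
    char name[MAX_NAME];
    long size;
};

/* Scan a StructMetadata string for DimensionName/Size pairs.
 * Stores at most max_dims entries in dims and returns the number found. */
static int parse_eos_dims(const char *meta, struct eos_dim *dims, int max_dims)
{
    static const char name_key[] = "DimensionName=\"";
    static const char size_key[] = "Size=";
    const char *p = meta;
    int n = 0;

    while (n < max_dims && (p = strstr(p, name_key)) != NULL) {
        const char *start = p + strlen(name_key);
        const char *end = strchr(start, '"');
        if (end == NULL)
            break;                      /* unterminated name: stop */

        size_t len = (size_t)(end - start);
        if (len >= MAX_NAME)
            len = MAX_NAME - 1;         /* truncate over-long names */
        memcpy(dims[n].name, start, len);
        dims[n].name[len] = '\0';

        const char *s = strstr(end, size_key);
        if (s == NULL)
            break;                      /* name without a size: stop */
        dims[n].size = strtol(s + strlen(size_key), NULL, 10);

        n++;
        p = s;                          /* continue after this Size entry */
    }
    return n;
}
```

Run on the fragment shown in the slide, this would yield two entries, Coarse_swath_lines_5km with size 406 and Coarse_swath_pixels_5km with size 271. A robust reader would track the GROUP and OBJECT nesting rather than rely on the order of keys.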

HDF4 Read Testing Tested in libsrc4/tst_interops2.c, which creates some HDF4 files with the SD API, and then reads them with netCDF. If --enable-hdf4-file-tests is used with netCDF configure, some Aura/Terra satellite data files are downloaded from the Unidata FTP site, then read by libsrc4/tst_interops3.c.

HDF4 Interoperability Limitations File must be opened read-only. Only HDF4 SD data files are currently understood. This feature cannot be used at the same time as HDF4's netCDF v2 API, because HDF4 steals the netCDF v2 API function names. So you must use --disable-netcdf when building HDF4. (It might also work to use --disable-v2 for the netCDF build.)

Future HDF4 Work More tests. Support for HDF4 image types. Test support for compressed data. Add some support for HDF-EOS metadata in the libcf library, using the HDF-EOS toolkit.

Request for User Help – What Data to Read? Please send me pointers to scientifically important HDF4 datasets. The intention is not to read every HDF4 data set, just those of wide scientific interest.

Contribute Code to Write HDF4? Some programmers use the netCDF v2 API to write HDF4 files. It would not be too hard to write the glue code to allow the v2 API -> HDF4 output from the netCDF library. The next step would be to allow netCDF v3/v4 API code to write HDF4 files. Writing HDF4 seems like a low priority to our users. I would be happy to help any user who would like to undertake this task.

What is HDF5? HDF5 is an extremely general data storage format with many advanced features: on-the-fly compression, parallel I/O, a rich data model, etc. Starting with netCDF-4.0, netCDF has been able to use HDF5 as a storage layer, exposing some of the advanced features. But, until version 4.1, only HDF5 files created with netCDF-4 could be understood by netCDF-4.

Why Read HDF5 Files? Many important datasets are available in HDF5 format, including data from the Aqua satellite.

Rules for Reading HDF5 Files NetCDF-4.1 provides read-only access to existing HDF5 files if they do not violate some rules: – Must not use circular group structure. – HDF5 reference type (and some other obscure types) are not understood. – Write access still only possible with netCDF-4/HDF5 files.

HDF5 Version 1.8 Background In version 1.8, HDF5 introduced “dimension scales” as a way of supporting shared dimensions. Also in version 1.8, HDF5 introduced ordering by creation, rather than ordering alphabetically. Most data providers don't use these features, however, and instead use HDF5 1.6.

NetCDF-4.1 Relaxes Some Restrictions for HDF5 Files Before netCDF-4.1, HDF5 files had to use creation ordering and dimension scales in order to be understood by netCDF-4. Starting with netCDF-4.1, read-only access is possible to HDF5 files with alphabetical ordering and no dimension scales. (Created by HDF5 1.6 perhaps.) An HDF5 file may have dimension scales for all dimensions or for none of them, but not for just some of them.
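
The all-or-none rule for dimension scales can be written down as a small predicate. The sketch below assumes the caller has already determined, by whatever means, whether each dimension in the file carries a dimension scale. The function name scales_all_or_none and the flag array are hypothetical, and this is only an illustration of the rule as stated, not the check that the netCDF library itself performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if every dimension has a dimension
 * scale, or if none does. A mix of the two returns false. */
static bool scales_all_or_none(const bool *has_scale, size_t ndims)
{
    size_t with_scale = 0;

    for (size_t i = 0; i < ndims; i++) {
        if (has_scale[i])
            with_scale++;
    }
    return with_scale == 0 || with_scale == ndims;
}
```

A file whose two dimensions both lack scales (as HDF5 1.6 output might) passes, as does one where both have scales; a file with a scale on only one of the two does not.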

HDF5 C Code to Write HDF5 File
    /* Create file. */
    if ((fileid = H5Fcreate(FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                            H5P_DEFAULT)) < 0) ERR;
    /* Create the space for the dataset. */
    dims[0] = LAT_LEN;
    dims[1] = LON_LEN;
    if ((pres_spaceid = H5Screate_simple(DIMS_2, dims, dims)) < 0) ERR;
    /* Create a variable. It will not have dimension scales. */
    if ((pres_datasetid = H5Dcreate(fileid, PRES_NAME, H5T_NATIVE_FLOAT,
                                    pres_spaceid, H5P_DEFAULT)) < 0) ERR;
    if (H5Dclose(pres_datasetid) < 0 ||
        H5Sclose(pres_spaceid) < 0 ||
        H5Fclose(fileid) < 0) ERR;

NetCDF C Code to Read HDF5 File
    /* Read the data with netCDF. */
    if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
    if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
    if (ndims_in != 2 || nvars_in != 1 || natts_in != 0 ||
        unlimdim_in != -1) ERR;
    if (nc_close(ncid)) ERR;

Future Plans for HDF5 Interoperability More testing. Proper handling of reference types. This will probably require an extension of the netCDF APIs. Better handling of strange group structures, if this proves necessary to read important data.

Summary With the 4.1 release, the netCDF C/Fortran/C++ libraries allow read-only access to some existing HDF4 and HDF5 data archives. The intention is not to develop a completely general translation, but instead to focus on datasets of significance to the Earth science community. Write capability is quite possible, but we don't plan on providing it because the demand for this is low.