Introduction to NetCDF4
MuQun Yang, The HDF Group
11/6/2007, HDF and HDF-EOS Workshop XI, Landover, MD



Notes
- Requires basic knowledge of HDF5 and netCDF3
- Covers general NetCDF4 concepts: several new features and their performance
- Covers some NetCDF4 APIs but does not review all new APIs
- Is not a netCDF3 tutorial

Contents
- History review
- Overview of NetCDF4 features, builds, etc.
- Performance issues
- Suggestions for users

History Review
- Funded by the NASA ESTO AIST Program
- Joint project between Unidata and The HDF Group
- Uses HDF5 as the storage layer of NetCDF

NetCDF-4/HDF5 Goals
- Combine desirable characteristics of netCDF and HDF5, while taking advantage of their separate strengths:
  - Widespread use and simplicity of netCDF
  - Generality and performance of HDF5
- Preserve format and API compatibility for netCDF users
- Demonstrate benefits of the combination in advanced Earth science modeling efforts
(From the talk by Russ Rew et al. at HDF and HDF-EOS Workshop VII)

NetCDF-4 Architecture
[Figure: architecture diagram. Components shown: netCDF-3 applications, netCDF-4 applications, HDF5 applications; netCDF-3 interface, netCDF-4 library, HDF5 library; netCDF files, netCDF-4 HDF5 files, HDF5 files.]
(From the talk by Russ Rew et al. at HDF and HDF-EOS Workshop VII)



Current Status
- beta 1, based on HDF5 1.8 beta 1, was released in April
- beta 2 release is coming soon

Compilers, platforms, and language support
- Platforms: Linux, IBM AIX, Sun OS, HP-UX, OSF1, IRIX, Cygwin
- Programming languages: C/C++ and Fortran
- Compilers: vendor compilers on the supported platforms
Watch for snapshots.

Configuration
- Only NetCDF3 will be built if you just type ./configure
- Before building NetCDF4, one must
  - install HDF5 1.8 beta 1 or later (note: parallel HDF5 needs a separate build)
  - install the zlib library if using data compression
- To build the sequential version:
  ./configure --enable-netcdf-4 --with-hdf5=/HDF5path --with-zlib=/zlibpath
- To build the parallel version:
  ./configure --enable-netcdf-4 --enable-parallel --disable-shared --with-hdf5=/parallelHDF5path --with-zlib=/zlibpath
- Parallel NetCDF4 needs more work. It has been tested on IBM AIX.

API Changes
- Existing APIs: essentially no differences, but with new flags
  NetCDF3: nc_create(FILE_NAME, NC_NOCLOBBER, &ncid);
  NetCDF4: nc_create(FILE_NAME, NC_NETCDF4, &ncid);
- New APIs are added for new features, such as:
  nc_def_var_deflate(ncid, varid, shuffle, deflate, deflate_level)
(In the original slides, blue text in an API marks an output parameter.)

Overview of NetCDF4 new features
- Data types
  - Compound data type
  - Variable length type
- Groups
- Multiple unlimited dimensions
- Compression
- Parallel IO

A compound datatype example
types:
  compound wind_vector_t {
    float eastward ;
    float northward ;
  }
dimensions:
  lat = 18 ;
  lon = 36 ;
  pres = 15 ;
  time = 4 ;
variables:
  wind_vector_t gwind(time, pres, lat, lon) ;
    gwind:long_name = "geostrophic wind vector" ;
    gwind:standard_name = "geostrophic_wind_vector" ;
data:
  gwind = {1, -2.5}, {-1, 2}, {20, 10}, {1.5, 1.5}, ...;

Variable length type
Simple example: ragged array
types:
  float(*) row_of_floats;
dimensions:
  m = 50;
variables:
  row_of_floats ragged_array(m);

An example: variable length and compound datatype
struct sea_sounding {
    int sounding_no;
    nc_vlen_t temp_vl;
} data[DIM_LEN];
/* 1. Create a netcdf-4 file. */
nc_create(FILE_NAME, NC_NETCDF4, &ncid);
/* 2. Create the vlen type, with a float base type. */
nc_def_vlen(ncid, "temp_vlen", NC_FLOAT, &temp_typeid);
/* 3. Create the compound type to hold a sea sounding. */
nc_def_compound(ncid, sizeof(struct sea_sounding), "sea_sounding", &sounding_typeid);
nc_insert_compound(ncid, sounding_typeid, "sounding_no",
                   NC_COMPOUND_OFFSET(struct sea_sounding, sounding_no), NC_INT);
nc_insert_compound(ncid, sounding_typeid, "temp_vl",
                   NC_COMPOUND_OFFSET(struct sea_sounding, temp_vl), temp_typeid);
/* 4. Define a dimension, and a 1D var of sea sounding compound type. */
nc_def_dim(ncid, DIM_NAME, DIM_LEN, &dimid);
nc_def_var(ncid, "fun_soundings", sounding_typeid, 1, &dimid, &varid);
/* 5. Write our array of sounding data to the file, all at once. */
nc_put_var(ncid, varid, data);
/* 6. Close the file. */
nc_close(ncid);
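The NC_COMPOUND_OFFSET calls in the example pass the byte offset of each struct member to the library. The sketch below illustrates that idea with the standard offsetof macro only; it does not call netCDF. It assumes the macro behaves like offsetof, and vlen_sketch_t, sea_sounding_sketch, and fill_sounding are hypothetical stand-ins introduced here (vlen_sketch_t mimics a length-plus-pointer variable length element).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a variable length element: count plus data pointer. */
typedef struct {
    size_t len;
    void *p;
} vlen_sketch_t;

/* Same shape as the sea_sounding struct in the slide. */
struct sea_sounding_sketch {
    int sounding_no;
    vlen_sketch_t temp_vl;
};

/* Byte offsets one would hand over when registering each compound member. */
size_t sounding_no_offset(void)
{
    return offsetof(struct sea_sounding_sketch, sounding_no);
}

size_t temp_vl_offset(void)
{
    return offsetof(struct sea_sounding_sketch, temp_vl);
}

/* Fill one record with n made-up temperatures.
   Returns 0 on success, -1 if allocation fails. The caller frees temp_vl.p. */
int fill_sounding(struct sea_sounding_sketch *s, int no, size_t n)
{
    float *t;
    assert(s != NULL);
    t = malloc((n ? n : 1) * sizeof *t);
    if (t == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        t[i] = 20.0f - 0.5f * (float)i;
    s->sounding_no = no;
    s->temp_vl.len = n;
    s->temp_vl.p = t;
    return 0;
}
```

Because the compiler may insert padding after sounding_no, the offset of temp_vl is generally larger than sizeof(int), which is why the offsets are computed rather than written by hand.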

Group
- Use of Groups is optional, with backward compatibility maintained by putting everything in the top-level unnamed Group.
- Unlike HDF5, netCDF-4 requires that Groups form a strict hierarchy.
- Potential uses for Groups include
  o Factoring out common information
  o Containers for data within regions, ensembles
  o Organizing a large number of variables
  o Providing name spaces for multiple uses of same names for dimensions, variables, attributes
  o Modeling large hierarchies

Group APIs
APIs for creating a group (define APIs)
  nc_def_grp(parent_group_id, group_name, &group_id)
  Examples:
    nc_def_grp(ncid, HENRY_VII, &henry_vii_id)
    nc_def_grp(henry_vii_id, MARGARET, &margaret_id)
APIs for inquiring information from a group (inquiry APIs)
  Number of groups: nc_inq_grps(group_id, &num_grps, NULL);
  Child group ID list: nc_inq_grps(group_id, NULL, group_id_list);
  Child group name: nc_inq_grpname(group_id_list[0], children_group_name);

Multiple Unlimited Dimension APIs
APIs for defining multiple unlimited dimensions
  Old API with the same flag: nc_def_dim(ncid, dimension_name, NC_UNLIMITED, int *idp)
  Examples:
    nc_def_dim(ncid, dimname_1, NC_UNLIMITED, &dimid[0])
    nc_def_dim(ncid, dimname_2, NC_UNLIMITED, &dimid[1])
APIs for inquiring about unlimited dimensions
  Old API: nc_inq_unlimdim(ncid, int *idp)
  New API: nc_inq_unlimdims(ncid, int *nunlimdimsp, int unlimdimid[])
How to use the new API
  1) First obtain the number of unlimited dimensions:
     nc_inq_unlimdims(ncid, &nunlimdims, NULL)
  2) Then obtain the list of unlimited dimension IDs:
     nc_inq_unlimdims(ncid, &nunlimdims, unlimdimid)

Compression
- Deflate now; Scaleoffset, N-bit, and maybe szip in the future
- Only one routine needs to be added:
  nc_def_var_deflate(int ncid, int varid, int shuffle, int deflate, int deflate_level);

Compression example code
Data writing
  1. Define the variable
     nc_def_var(ncid, VAR_BYTE_NAME, NC_BYTE, 2, dimids, &byte_varid);
  2. Set deflate compression
     nc_def_var_deflate(ncid, byte_varid, 0, 1, DEFLATE_LEVEL_3);
  3. Write the data
     nc_put_var_schar(ncid, byte_varid, (signed char *)byte_out);
Data reading
     nc_get_var_schar(ncid, byte_varid, (signed char *)byte_in);
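The third argument of nc_def_var_deflate is the shuffle flag, which is 0 (off) in the example. As a rough illustration of what a byte shuffle does ahead of deflate, here is a generic stdlib-only sketch. It is not the library's filter code, and the function names shuffle_bytes and unshuffle_bytes are hypothetical. It stores the first byte of every element together, then the second bytes, and so on, which tends to give the compressor longer runs of similar bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Reorder n elements of elem_size bytes each so that byte k of
   every element is stored contiguously in the output. */
void shuffle_bytes(const unsigned char *in, unsigned char *out,
                   size_t n, size_t elem_size)
{
    assert(in != NULL && out != NULL && in != out);
    for (size_t k = 0; k < elem_size; k++)
        for (size_t i = 0; i < n; i++)
            out[k * n + i] = in[i * elem_size + k];
}

/* Inverse of shuffle_bytes: restore the original element order. */
void unshuffle_bytes(const unsigned char *in, unsigned char *out,
                     size_t n, size_t elem_size)
{
    assert(in != NULL && out != NULL && in != out);
    for (size_t k = 0; k < elem_size; k++)
        for (size_t i = 0; i < n; i++)
            out[i * elem_size + k] = in[k * n + i];
}
```

For three 2-byte elements whose second byte is always 0xA0, the shuffled buffer ends with three consecutive 0xA0 bytes, the kind of repetition deflate handles well.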

Parallel IO
- Supports either collective or independent I/O
- Supports MPI-IO or MPI-POSIX IO via parallel HDF5
- Special functions are used to create/open a netCDF file in parallel.

New APIs to do parallel IO
nc_create_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
  - "mode" must be NC_NETCDF4|NC_MPIIO or NC_NETCDF4|NC_MPIPOSIX
nc_var_par_access(int ncid, int varid, int data_access)
  - data_access can be either NC_COLLECTIVE or NC_INDEPENDENT
nc_open_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
  - "mode" must be either NC_MPIIO or NC_MPIPOSIX

Parallel IO Programming Model
Data writing:
/* 1. Initialize MPI. */
MPI_Init(&argc, &argv);
/* 2. Create a parallel netcdf-4 file. */
nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid);
nc_var_par_access(ncid, v1id, NC_COLLECTIVE);
/* 3. Write data. */
nc_put_vara_int(ncid, v1id, start, count, data);
/* 4. Close the file. */
nc_close(ncid);
/* 5. Shut down MPI. */
MPI_Finalize();
Data reading: use nc_open_par instead of nc_create_par
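In this model each process passes its own start and count to nc_put_vara_int. One common way to choose them is an even block decomposition along one dimension. The sketch below is stdlib-only and makes no MPI or netCDF calls; the helper name block_range is hypothetical, and rank and nprocs would normally come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>
#include <stddef.h>

/* Split `total` elements across `nprocs` ranks as evenly as possible.
   The first (total % nprocs) ranks each receive one extra element.
   On return, *start is the first index owned by `rank` and *count
   is how many elements it owns. */
void block_range(size_t total, int nprocs, int rank,
                 size_t *start, size_t *count)
{
    size_t base, extra, r;
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(start != NULL && count != NULL);
    base = total / (size_t)nprocs;
    extra = total % (size_t)nprocs;
    r = (size_t)rank;
    *count = base + (r < extra ? 1 : 0);
    *start = r * base + (r < extra ? r : extra);
}
```

With 10 elements on 4 processes, the ranks own the index ranges 0-2, 3-5, 6-7, and 8-9, so the blocks are contiguous and cover the dimension exactly once.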

Other features
- Datatypes
  - More atomic datatypes: unsigned integers (1, 2, 4, and 8 bytes)
  - Strings: replace character arrays
  - Enums, opaque types
  - User-defined datatypes
- Fletcher32 checksum filter
- UTF-8 support
- Reader-Makes-Right conversion
- Uses HDF5 dimension scales


NetCDF4 Data Compression: Size
[Chart not reproduced; the only surviving annotation is "<2 %".]

NetCDF4 Data Compression: Data Write Time
[Chart not reproduced.]

NetCDF4 Data Compression: Data Read Time
[Chart not reproduced.]

WRF Output in HDF5: File Size
[Chart not reproduced.]

WRF Output in HDF5: Data Writing Time
[Chart not reproduced.]

EUMETNET OPERA Report in 2006
They evaluated the following data formats:
- FM 92 GRIB, NORDRAD, Universal Format
- netCDF, HDF4, HDF5
- XML and Scalable Vector Graphics (SVG), and GeoTIFF
Their recommendation: Based on the results of the detailed evaluation, HDF5 is recommended for consideration as an official European standard format for weather radar data and products.
Why? Compared to other formats, HDF5's compression algorithm (ZLIB) is more efficient… A file format with efficient compression and platform independence is essential.
PyTables: One of the beauties of PyTables is that it supports compression on tables and arrays.

Evaluation of Parallel NetCDF4 Performance
- Regional Oceanographic Modeling System (ROMS)
- History file writer in parallel NetCDF4 (PnetCDF4)
- History file writer in parallel NetCDF from Argonne (PnetCDF)
- Data: 60 1D-4D double-precision float and integer arrays

PnetCDF4 and PnetCDF performance comparison
- Fixed problem size = 995 MB
- Performance of PnetCDF4 is close to PnetCDF
[Chart: bandwidth (MB/s) versus number of processors; series: PnetCDF collective, NetCDF4 collective.]

ROMS Output with Parallel NetCDF4
- The I/O performance improves as the file size increases.
- It can provide decent I/O performance for large problem sizes.

Chunking
- Use chunking wisely
- Review the chunking tips for HDF5
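One reason chunk shape matters is that a read has to fetch every chunk the requested region touches. As a rough illustration only (stdlib-only, 2-D case, hypothetical helper names chunks_along and chunks_touched_2d), the sketch below counts the chunks intersected by a rectangular selection, so different chunk shapes can be compared for a given access pattern. It says nothing about cache behavior or compression cost.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks of length `chunk` overlapped by the index range
   [start, start + count) along one dimension. */
size_t chunks_along(size_t start, size_t count, size_t chunk)
{
    assert(chunk > 0);
    if (count == 0)
        return 0;
    return (start + count - 1) / chunk - start / chunk + 1;
}

/* Chunks touched by a 2-D rectangular selection: the product of the
   per-dimension counts. */
size_t chunks_touched_2d(const size_t start[2], const size_t count[2],
                         const size_t chunk[2])
{
    return chunks_along(start[0], count[0], chunk[0]) *
           chunks_along(start[1], count[1], chunk[1]);
}
```

For a single-column read of a 1000 x 1000 array, row-shaped chunks of 1 x 1000 touch 1000 chunks, 100 x 100 chunks touch 10, and column-shaped chunks of 1000 x 1 touch 1.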


NetCDF Classic Model

Using the NetCDF Classic Model
- NetCDF-4 files can be created with the CLASSIC_MODEL flag. This enforces the rules of the classic netCDF data model on this file.
  nc_create(FILE_NAME, NC_NETCDF4|NC_CLASSIC_MODEL, &ncid)
- Once a classic model file, always a classic model file. The setting stays with the file, and there is no way to change it within the netCDF API.
- Classic model files don't use any elements of the expansion of the data model in netCDF-4. They don't have groups, user-defined types, multiple unlimited dimensions, or the new atomic types.
- Since they conform to the classic model, they can be read and understood by any existing netCDF software (as soon as that software upgrades to netCDF-4 and HDF5).
- NetCDF-4 features which don't affect the data model are still available: compression, parallel I/O.

HDF5 Features not in current NetCDF4.0
- No Scaleoffset, N-bit, or szip filters (planned for the 4.1 release)
- No support for user-defined filters
- Can only read HDF5 files that have dimension scales
- Can only write data in chunked storage
- No Fortran 90 APIs
- No corresponding APIs for optimizations: cache, MPI-IO

NetCDF 4.1 Plan
4/req_4_1.html

NetCDF4 or HDF5: which one should I use?
Evaluate the following:
- Familiarity
- Features
- Performance
- Compatibility
- Release/feature lags

Recommendation by priority (based on stability of NetCDF4)
Priority: High performance plus many advanced HDF5 features
  Recommendation: HDF5, definitely
Priority: Care about performance; possibly need to use many new advanced features
  Recommendation: HDF5, maybe
Priority: NetCDF4: avoid transition cost from NetCDF to HDF5
  Recommendation: NetCDF4, maybe
Priority:
  1. Just need one or two HDF5 features for intensive NetCDF applications: NetCDF4/CLASSIC_MODEL (compression, parallel IO)
  2. Existing NetCDF software or applications that don't care about performance
  Recommendation: NetCDF4, definitely

More NetCDF4 information
- Release and snapshot: 4/ 4/
- Tutorial in 2007 NetCDF workshop: ps/2007/
- Paper in 2006 AMS annual meeting: ams.pdf

Acknowledgements
- Thanks to Russ Rew and Ed Hartnett from Unidata for generously allowing me to use their slides and for sharing their compression performance results in this workshop.
- Some content describing the new features is copied from the 2007 Unidata NetCDF workshop.
- The radar NetCDF data compression performance results were provided by Ed Hartnett at Unidata.