Integrating netCDF and OPeNDAP (The DrNO Project)
Dr. Dennis Heimbigner, Unidata
Go-ESSP Workshop, Seattle, WA, Sept
Overview
- Primary goal: integrate the client-side DAP protocol into the netCDF C library
  - Access any DAP data source (through a DAP server) using the netCDF API
  - Initially netCDF-3, later netCDF-4
- Rationale: combine two commonly used APIs for access to scientific datasets
- Issues:
  - Data model translation
  - DAP dataset URL support
  - (Server side: transparent access to netCDF-4 data)
DAP Data Model
- Primitive types: byte, (u)int16, (u)int32, float32, float64, string
- Arrays: FORTRAN style rectangular arrays with bounded dimensions
  - Limited naming of dimensions
- Structure: heterogeneous collection of fields
  - Analog to C/C++ structs
- Sequence: variable length array of Structures
  - Allows relational constraints
DAP Data Model (cont.)
- Grid: combination of an n-dimensional array with n 1-dimensional mapping arrays
  - In effect a structure for an array plus its coordinate variables (in netCDF-speak)
- Structures, Grids, and Sequences may be arbitrarily nested with each other
- All types are “singletons”
  - Type reuse requires repeating the definition
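
The Grid definition above can be pictured as a small data structure. The following is only a sketch: the class names `DapArray` and `DapGrid` and the `validate` method are hypothetical illustrations, not part of any DAP or netCDF library. It checks the one property the slide states, namely that an n-dimensional array is paired with n one-dimensional maps.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DapArray:
    """Hypothetical stand-in for a rectangular DAP array."""
    name: str
    shape: Tuple[int, ...]   # one bounded size per dimension
    data: Sequence[float]    # flattened values


@dataclass
class DapGrid:
    """Hypothetical stand-in for a DAP Grid: an array plus its maps."""
    array: DapArray
    maps: List[DapArray]

    def validate(self) -> None:
        # A Grid pairs an n-dimensional array with exactly n maps.
        if len(self.maps) != len(self.array.shape):
            raise ValueError("expected one map per array dimension")
        for size, m in zip(self.array.shape, self.maps):
            # Each map is 1-dimensional and as long as the dimension it describes.
            if m.shape != (size,):
                raise ValueError(
                    f"map {m.name!r} does not match dimension size {size}"
                )


grid = DapGrid(
    array=DapArray("temp", (2, 3), [0.0] * 6),
    maps=[
        DapArray("lat", (2,), [10.0, 20.0]),
        DapArray("lon", (3,), [0.0, 1.0, 2.0]),
    ],
)
grid.validate()  # raises ValueError if the maps do not fit the array
```

In netCDF terms, `lat` and `lon` here play the role of coordinate variables for `temp`.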
Specifying a DAP Data Source
- A DAP data source is specified using an extended URL syntax that refers to the DAP server containing that data
- Format: ? &
- Client parameters: [name=value]…
  - URL extension specific to the DAP/netCDF integration
- Base URL: e.g.
  - Points to the DAP server
Specifying a DAP source (cont.)
- DAP URL also specifies constraints on the data to be returned by the server
- Projection: variable-name[first:stride:last]
  - Returns a slice of a rectangular array
- Selection: boolean expression over variables
  - E.g. x > 5 or y < 6
  - Only applies to sequences
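
To make the projection form `variable-name[first:stride:last]` concrete, here is a minimal sketch of what such a slice selects along one dimension. The helper names `project` and `projection_clause` are hypothetical, and the sketch assumes that `last` is an inclusive index, as the name suggests. It runs locally on a Python list and does not contact any server.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def project(values: Sequence[T], first: int, stride: int, last: int) -> List[T]:
    """Return values[first], values[first + stride], ... up to index `last`."""
    if stride < 1 or first < 0 or last < first or last >= len(values):
        raise ValueError("invalid projection bounds")
    # Python slices exclude the end index, so add 1 to keep `last` inclusive.
    return list(values[first:last + 1:stride])


def projection_clause(name: str, first: int, stride: int, last: int) -> str:
    """Format the constraint text in the variable-name[first:stride:last] form."""
    return f"{name}[{first}:{stride}:{last}]"


data = list(range(10))
subset = project(data, 1, 3, 8)            # picks indices 1, 4, 7
clause = projection_clause("temp", 1, 3, 8)  # "temp[1:3:8]"
```

The point of the inclusive bound is that `[0:1:9]` on a ten-element dimension returns all ten values, unlike a Python slice `0:9`.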
netCDF-3 (aka classic) Data Model
- Primitive types: char, byte, short, int, float, double
- Named shared dimensions
- N-dimensional FORTRAN style arrays
- Single unlimited dimension
  - May only occur as first (slowest changing) dimension
  - E.g. int var(unlimited,lat,long)
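
The unlimited-dimension rule can be expressed as a small check on a variable's shape. This is only an illustrative sketch: the `UNLIMITED` marker and the function `is_legal_classic_shape` are hypothetical names, and the check looks at one variable at a time rather than at a whole file.

```python
from typing import Optional, Sequence

UNLIMITED = None  # hypothetical marker standing in for an unlimited size


def is_legal_classic_shape(shape: Sequence[Optional[int]]) -> bool:
    """True if the shape has no unlimited dimension, or only a leading one."""
    unlimited_positions = [
        i for i, size in enumerate(shape) if size is UNLIMITED
    ]
    # Legal cases: no unlimited dimension, or exactly one in first position.
    return unlimited_positions in ([], [0])


ok = is_legal_classic_shape([UNLIMITED, 180, 360])   # like var(unlimited,lat,long)
bad = is_legal_classic_shape([180, UNLIMITED, 360])  # unlimited is not first
```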
netCDF-3 Translation Issues
- Result must conform to legal classic model
  - E.g. no nested sequences or arrays of sequences
- Synthesize shared dimensions
  - Infer from DAP dimension name and value
- Convert grids to equivalent netCDF-3 coordinate variable convention
  - Coordinate variable = 1-d variable with same name as a dimension
  - Contains coordinate values for that dimension
- Flatten non-dimensioned structures and grids
- Sequence = unlimited dimension 1-d array
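
One way to picture "infer from DAP dimension name and value" is the sketch below. It is not the translation the netCDF library performs: the function name, the `dim_<size>` naming for anonymous dimensions, and the `_<size>` suffix used to resolve name clashes are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Each DAP variable lists its dimensions as (name or None, size) pairs.
DapDims = List[Tuple[Optional[str], int]]


def synthesize_shared_dimensions(
    variables: Dict[str, DapDims],
) -> Tuple[Dict[str, int], Dict[str, List[str]]]:
    """Map per-variable DAP dimensions onto named, shared dimensions."""
    shared: Dict[str, int] = {}           # shared dimension name -> size
    var_dims: Dict[str, List[str]] = {}   # variable -> shared dimension names
    for var, dims in variables.items():
        names: List[str] = []
        for dim_name, size in dims:
            # Anonymous dimensions get a hypothetical size-based name.
            key = dim_name if dim_name is not None else f"dim_{size}"
            # Same name but a different size: append the size until the
            # name is either unused or already bound to this size.
            while key in shared and shared[key] != size:
                key = f"{key}_{size}"
            shared[key] = size
            names.append(key)
        var_dims[var] = names
    return shared, var_dims


shared, var_dims = synthesize_shared_dimensions({
    "temp": [("lat", 2), ("lon", 3)],
    "pres": [("lat", 2), ("lon", 3)],
    "mask": [(None, 3)],
})
```

Here `temp` and `pres` end up sharing the same `lat` and `lon` dimensions because both name and size agree, while the anonymous dimension of `mask` receives a synthesized name.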
netCDF-4 (aka enhanced) Data Model
- Derived from the HDF5 data model
- netCDF-3 model plus:
  - More primitives: ubyte, ushort, uint, (u)int64, string, enums, opaque (fixed length byte strings)
  - Named user defined types:
    - Compound (= Structure)
    - Vlen: variable length 1-d array
  - Arbitrary use of unlimited dimensions
  - Groups: similar to file system directory tree
    - Group can contain types, dimensions, and variables
netCDF-4 Translation Issues
- netCDF-4 is effectively a superset of the current OPeNDAP data model
- Carryover issues from netCDF-3:
  - Inference of shared dimensions
  - Grid translation to coordinate variable convention
- Translate structures, grids, and sequences to compound types or maybe groups?
- Explore DAP data model extensions to include selected netCDF enhanced concepts
  - Esp. groups and shared dimensions
Server-side issues
- Desirable: pass a netCDF-4 file through a DAP server to a DAP client, then through the translation, and get the same file back
- Information is currently lost in translation
- Solutions:
  - Add various attribute tags to restore missing information
  - Extend OPeNDAP data model
Status
- netCDF 4.1-alpha: available now
  - Libdap+libnc-dap version integrated into current netCDF snapshot build
  - Supports translation of subset of the DAP protocol to netCDF-3
  - Requires C++
- netCDF 4.1-beta: end of 2008
  - Utilizes Ocapi + modified netCDF => no C++
  - Limited translation similar to libnc-dap
- netCDF 4.1: 2009
  - Utilizes Ocapi + modified netCDF
  - Complete support for translating DAP to netCDF-4
- Java version also exists now
  - Uses somewhat different translation rules
Acknowledgement
- NSF Award #
- Title: SDCI NMI Improvement: OPeNDAP and NetCDF Integration
- Principal Investigators: James Gallagher (opendap.org) and Russell Rew (Unidata)