Published by Kristian Brooks. Modified over 8 years ago.
1
NetCDF Data Model Details
Russ Rew, UCAR Unidata
NetCDF 2009 Workshop, 2009-08-04
2
NetCDF and HDF5 Data Models
The netCDF classic data model: simple and flat
    Dimensions
    Variables
    Attributes
The netCDF enhanced data model added
    More primitive types
    Hierarchical groups
    User-defined datatypes
    Multiple unlimited dimensions
The HDF5 data model also has
    Hard- and soft-links (providing multiple names for things)
    User-defined primitive datatypes
    References (pointers to objects and data regions in a file)
    Attributes attached to user-defined types
    A few other miscellaneous features
3
The Enhanced NetCDF Data Model
    Additions to classic netCDF data model
    Still a subset of HDF5 data model
    Made possible by adding a few things to HDF5 so netCDF could fit within it
    Criteria for additions to classic model: handling identified classic limitations
[Figure: nested data models, with netCDF classic inside netCDF enhanced inside HDF5]
4
Classic netCDF data model
A file has variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One dimension may be of unlimited length. Variables and attributes use one of six primitive data types.
[Diagram: classes of the classic data model]
    File
        location: Filename
        create( ), open( ), …
    Dimension
        name: String
        length: int
        isUnlimited( )
    Attribute
        name: String
        type: DataType
        values: 1D array
    Variable
        name: String
        shape: Dimension[ ]
        type: DataType
        array: read( ), …
    DataType
        PrimitiveType: char, byte, short, int, float, double
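The class diagram on this slide can be mirrored in a few lines of plain Python. This is only an illustrative sketch: the class names follow the diagram, but `ClassicFile`, `validate`, and the use of `None` to mark an unlimited length are assumptions made for the example, not part of any netCDF library.

```python
from dataclasses import dataclass, field
from typing import Optional

# The six primitive types of the classic model, as listed on the slide.
CLASSIC_TYPES = ("char", "byte", "short", "int", "float", "double")


@dataclass
class Dimension:
    name: str
    length: Optional[int]  # None marks the unlimited dimension (illustrative choice)

    def is_unlimited(self) -> bool:
        return self.length is None


@dataclass
class Attribute:
    name: str
    type: str
    values: list  # always a 1-D array of values


@dataclass
class Variable:
    name: str
    type: str
    shape: list  # Dimension objects, possibly shared with other variables
    attributes: list = field(default_factory=list)


@dataclass
class ClassicFile:  # hypothetical container, not a library class
    location: str
    dimensions: list = field(default_factory=list)
    variables: list = field(default_factory=list)
    attributes: list = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the file breaks a classic-model rule."""
        unlimited = [d for d in self.dimensions if d.is_unlimited()]
        if len(unlimited) > 1:
            raise ValueError("classic model allows at most one unlimited dimension")
        typed = list(self.attributes) + list(self.variables)
        for var in self.variables:
            typed.extend(var.attributes)
        for item in typed:
            if item.type not in CLASSIC_TYPES:
                raise ValueError(f"{item.name}: {item.type} is not a classic type")


time = Dimension("time", None)
lat = Dimension("lat", 73)
lon = Dimension("lon", 144)
temp = Variable("temp", "float", [time, lat, lon],
                [Attribute("units", "char", ["K"])])
pres = Variable("pres", "float", [time, lat, lon])  # same dimensions, same grid
f = ClassicFile("example.nc", [time, lat, lon], [temp, pres],
                [Attribute("title", "char", ["demo"])])
f.validate()  # passes: one unlimited dimension, classic types only
```

Because `temp` and `pres` list the same `Dimension` objects in `shape`, a generic tool can tell that they lie on a common grid without any extra convention.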
5
Variables versus Attributes
Characteristics of variables:
    For data
    May be too large for memory
    May be multidimensional
    Support partial access
    Individual values may be changed
    More data may be appended
    May have associated attributes
    Shape specified with shared dimensions
Characteristics of attributes:
    Intended for metadata
    For single values, strings, or small 1-D arrays
    Accessed atomically (written or read all at once)
    Typically values don’t change after creation
    May not have attributes
    Length specified when created
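The contrast between the two can be sketched with two small Python classes. `SketchVariable` and `SketchAttribute` are hypothetical names invented for this illustration; they model the access patterns listed on the slide and do not reflect how any netCDF library stores data.

```python
class SketchVariable:
    """Holds data; supports partial reads, in-place updates, and appends."""

    def __init__(self, name):
        self.name = name
        self.records = []     # grows along a single unlimited dimension
        self.attributes = {}  # variables may carry attributes

    def append(self, record):
        # More data may be appended after creation.
        self.records.append(list(record))

    def read(self, start, stop):
        # Partial access: only the requested records are returned.
        return [list(r) for r in self.records[start:stop]]

    def write_value(self, record_index, position, value):
        # Individual values may be changed.
        self.records[record_index][position] = value


class SketchAttribute:
    """Holds small metadata; read as a whole and has no attributes of its own."""

    def __init__(self, name, values):
        self.name = name
        self._values = tuple(values)  # length is fixed when created

    def get(self):
        # Atomic access: the full value is returned at once.
        return self._values


temp = SketchVariable("temp")
temp.attributes["units"] = SketchAttribute("units", ["K"])
temp.append([280.0, 281.5])
temp.append([279.0, 282.0])
temp.write_value(1, 0, 279.5)  # change one value in place
last_record = temp.read(1, 2)  # read only the second record
```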
6
Characteristics of the classic data model
Strengths
    Simple to explain
    Good for discussing data representation issues
    Efficient implementation is possible
    Writing generic applications is practical
    For gridded data, good data representations available
    Shared dimensions are useful
Weaknesses
    Multiple variable-length data structures hard to represent
    Additional conventions required for earth science, e.g. coordinate systems
    Lacks compound data structures
    Lacks nested data structures
7
Enhanced netCDF data model, for netCDF-4
A file has a top-level unnamed group. Each group may contain one or more named subgroups, user-defined types, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length. Variables and attributes have one of twelve primitive data types or one of four user-defined types.
[Diagram: classes of the enhanced data model]
    File
        location: Filename
        create( ), open( ), …
    Group
        name: String
    Dimension
        name: String
        length: int
        isUnlimited( )
    Attribute
        name: String
        type: DataType
        values: 1D array
    Variable
        name: String
        shape: Dimension[ ]
        type: DataType
        array: read( ), …
    DataType
        PrimitiveType: char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
        UserDefinedType (typename: String): Compound, VariableLength, Enum, Opaque
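A minimal sketch of the group hierarchy described on this slide, in plain Python. The `Group` and `UserDefinedType` classes, the `walk` helper, and the group names `observations`, `model`, and `run1` are assumptions made for illustration; only the type names and the containment rules come from the slide.

```python
from dataclasses import dataclass, field

# The twelve primitive types and four user-defined kinds listed on the slide.
PRIMITIVE_TYPES = (
    "char", "byte", "short", "int", "int64", "float", "double",
    "unsigned byte", "unsigned short", "unsigned int", "unsigned int64",
    "string",
)
USER_TYPE_KINDS = ("Compound", "VariableLength", "Enum", "Opaque")


@dataclass
class UserDefinedType:
    typename: str
    kind: str  # one of USER_TYPE_KINDS


@dataclass
class Group:
    name: str
    dimensions: dict = field(default_factory=dict)  # name -> length, None = unlimited
    types: dict = field(default_factory=dict)       # typename -> UserDefinedType
    variables: dict = field(default_factory=dict)   # name -> (type name, dimension names)
    attributes: dict = field(default_factory=dict)
    groups: dict = field(default_factory=dict)      # name -> Group

    def add_group(self, name):
        child = Group(name)
        self.groups[name] = child
        return child

    def walk(self, prefix="/"):
        """Yield (path, group) for this group and every descendant."""
        yield prefix, self
        for name, child in self.groups.items():
            yield from child.walk(f"{prefix}{name}/")


root = Group("")  # the top-level group is unnamed
root.dimensions.update({"time": None, "station": None})  # two unlimited dimensions
root.types["wind_vector_type"] = UserDefinedType("wind_vector_type", "Compound")
obs = root.add_group("observations")
obs.variables["wind"] = ("wind_vector_type", ("time", "station"))
model = root.add_group("model")
model.add_group("run1")
paths = [path for path, _ in root.walk()]
```

A file that uses only the root group, one unlimited dimension, and the six classic types is still a valid instance of this structure, which is how the enhanced model contains the classic one.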
8
Characteristics of the enhanced data model
Strengths
    Simpler than HDF5, with similar representational power
    Completely contains and is backward compatible with classic model
    Efficient implementation available
    Fixes identified weaknesses of netCDF classic model
    Incremental adoption of model features possible
Potential weaknesses
    Writing generic applications more difficult
    Types must be defined and named separately from use, even if not shared
    No attributes allowed on compound members
9
Some details of the enhanced data model
    No attributes permitted for compound type members (because HDF5 doesn’t allow such attributes):
        compound wind_vector_type {
            float eastward;
            float northward;
        }
    Inclusion of user-defined opaque types (why not just use variable-length array of bytes?)
    Type definitions as first-class objects
    Type containment in groups, but global scope for use
    Inheritance through group hierarchy of only dimensions (why not coordinate variables or attributes?)
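The last point, that only dimensions are inherited through the group hierarchy, can be sketched as a lookup that walks up through parent groups for dimensions but not for attributes. The `Group` class and its `find_dimension` and `find_attribute` methods are hypothetical names used only for this illustration.

```python
class Group:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.dimensions = {}  # name -> length
        self.attributes = {}  # name -> value

    def find_dimension(self, name):
        """Look in this group, then in each ancestor, for a dimension."""
        group = self
        while group is not None:
            if name in group.dimensions:
                return group.dimensions[name]
            group = group.parent
        raise KeyError(name)

    def find_attribute(self, name):
        """Attributes are looked up in this group only; they are not inherited."""
        return self.attributes[name]


root = Group("")
root.dimensions["station"] = 10
root.attributes["institution"] = "example"
obs = Group("obs", parent=root)

station_length = obs.find_dimension("station")  # found in the parent group
try:
    obs.find_attribute("institution")
    attribute_inherited = True
except KeyError:
    attribute_inherited = False  # the parent's attribute is not visible here
```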
10
Natural convention for assigning attributes to members of a compound type
    types:
        compound wind_vector_t {
            float eastward ;
            float northward ;
        }
        compound wind_vector_units_t {
            string eastward ;
            string northward ;
        }
    variables:
        wind_vector_t wind(station) ;
        wind_vector_units_t wind:units = {"m/s", "m/s"} ;
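A reader of such a file could apply the convention by matching member names between the data type and the attribute type. The sketch below models the two compound types as Python named tuples; `WindVector`, `WindVectorUnits`, `check_parallel`, and `member_attribute` are hypothetical names, and the sample values are made up for illustration.

```python
from typing import NamedTuple


class WindVector(NamedTuple):  # mirrors compound wind_vector_t
    eastward: float
    northward: float


class WindVectorUnits(NamedTuple):  # mirrors compound wind_vector_units_t
    eastward: str
    northward: str


def check_parallel(data_type, attr_type):
    """The convention relies on both types having the same member names."""
    return data_type._fields == attr_type._fields


def member_attribute(attr_value, member):
    """Return the attribute value that applies to one member of the compound."""
    return getattr(attr_value, member)


wind_units = WindVectorUnits("m/s", "m/s")  # wind:units = {"m/s", "m/s"}
sample = WindVector(eastward=3.2, northward=-1.5)  # made-up data value

labelled = {
    member: f"{getattr(sample, member)} {member_attribute(wind_units, member)}"
    for member in WindVector._fields
}
```

The attribute is attached to the variable as a whole, so the restriction on compound members is never violated; the per-member meaning comes only from the shared member names.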
11
Enhancing a Data Model with Backward Compatibility
Benefits
    Data in archives don’t have to change
    Client program sources don’t have to change
    Software can access archived data without being aware of format version
Costs
    Effort required to support older interfaces and formats
    Can’t easily fix mistake in released interfaces
    Comprehensive compatibility testing needed
Implementation
    Evolve data model incrementally
    Add or grow abstractions, instead of modifying or removing them
    Ensures previous data model is included in enhanced data model
12
NetCDF-4 classic-model: a transitional format
netCDF-3
    Compatible with existing applications
    Simplest data model and API
netCDF-4 classic model
    Uses classic API for compatibility
    Uses netCDF-4/HDF5 storage for compression, chunking, performance
    To use, just recompile, relink
netCDF-4
    Not compatible with some existing applications
    Enhanced data model and API, more complex, powerful
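Whether a dataset can stay within the classic model comes down to which enhanced features it relies on. The check below is a rough sketch over a plain dictionary summary of a dataset; the function name `classic_model_violations` and the summary keys are assumptions invented for this example, not an interface of the netCDF library.

```python
CLASSIC_TYPES = {"char", "byte", "short", "int", "float", "double"}


def classic_model_violations(summary):
    """List the enhanced-model features that a dataset summary relies on."""
    problems = []
    if summary.get("subgroups"):
        problems.append("uses groups below the root")
    if summary.get("user_defined_types"):
        problems.append("defines user-defined types")
    if summary.get("unlimited_dimensions", 0) > 1:
        problems.append("has more than one unlimited dimension")
    extra = set(summary.get("types_used", ())) - CLASSIC_TYPES
    if extra:
        problems.append("uses non-classic primitive types: " + ", ".join(sorted(extra)))
    return problems


# Hypothetical summaries of two datasets.
gridded = {"unlimited_dimensions": 1, "types_used": ["float", "char"]}
observations = {
    "subgroups": ["stations"],
    "user_defined_types": ["wind_vector_t"],
    "unlimited_dimensions": 2,
    "types_used": ["float", "string", "int64"],
}

gridded_problems = classic_model_violations(gridded)
observation_problems = classic_model_violations(observations)
```

An empty list suggests the data could be written in the netCDF-4 classic model and still benefit from the HDF5-based storage features; any entry points to the full netCDF-4 enhanced model.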
13
Concluding remarks
    Serious use of netCDF enhanced data model just beginning
    Future adjustments to model, if any, will be made by addition, not modification or deletion of existing features
    Preserves previous programming interfaces
    Supports access to previous format variants transparently