Published by Sabina Ward. Modified over 9 years ago.
1
Better data management through NetCDF
Jaison Kurian, CAOS, IISc
2
Introduction
There are many formats for geoscience data: HDF, CDF, NetCDF, Binary, GRIB, ASCII and many more.
Is NetCDF the best one?
3
Let us start with a few examples: the story of a rainfall dataset.
4
Features of NetCDF
1. self-describing: all data and metadata are encapsulated in one file
2. machine independent: works on almost all platforms
3. direct access: efficiently read subsets of large datasets
4. appendable: data can be quickly added to old files
5. sharable: one writing process and several reading processes can occur at once
6. easy to learn
7. freely available, well documented and well supported
8. supported by a variety of data analysis, processing, and visualization tools
5
NetCDF conventions
There are several conventions; COARDS (Cooperative Ocean/Atmosphere Research Data Service) is the most widely accepted one.
Filename extension: .cdf or .nc? Use .nc.
6
Commands: ''ncdump'' & ''ncgen''
ncdump can be used to get a CDL (network Common Data form Language) file.
A CDL file is the ASCII representation of a NetCDF file.
ncdump / CDL provides an easy way to look at the structure and contents of a NetCDF file (the -c option prints the header and the coordinate variable data only).
[user@machine]$ ncdump -c ncfile.nc | less
OR
[user@machine]$ ncdump -c ncfile.nc > ncfile.cdl
[user@machine]$ man ncdump
7
ncgen can be used
1. to check the syntax of an input CDL file.
2. to make a Fortran/C program that writes the NetCDF file described in the input CDL file.
3. to make a binary NetCDF file from a given CDL file.
We will see the details later.
8
Components of a NetCDF file are:
1. Header part
   1.1 Dimensions
   1.2 Variables
   1.3 Attributes
2. Data part
9
Rules
1. A dimension/variable name should start with a letter and can contain letters, digits and '_'. ''temp'' is valid; ''1temp'' is not.
2. Names are case sensitive: ''temp'', ''Temp'' and ''TEMP'' are three different names.
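The two naming rules can be checked mechanically. The sketch below is only an illustration of the rules as stated on this slide; the helper name `is_valid_name` is hypothetical and is not part of any NetCDF library, and real NetCDF versions may accept a wider set of characters.

```python
import re

# Rule 1 from the slide: a letter first, then letters, digits or underscores.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True if `name` follows the naming rule stated on the slide."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_name("temp"))    # starts with a letter
print(is_valid_name("1temp"))   # starts with a digit
# Rule 2: names are case sensitive, so these are three distinct names.
print(len({"temp", "Temp", "TEMP"}))
```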
10
1.1 Dimensions
Maximum number of dimensions for a file is 512 (netcdf-3.5.1).
Maximum number of dimensions for a variable is 4 (COARDS convention).
Each dimension has a name and a length/size:
dimensions:
    time = 31 ;
    height = 1 ;
    latitude = 122 ;
    longitude = 182 ;
    ......................... ;
    ......................... ;
11
dimensions:
    time = UNLIMITED ; // (31 currently)    <= unlimited dimension
    height = 1 ;
    latitude = 122 ;
    longitude = 182 ;
    ......................... ;
    ......................... ;
A NetCDF dataset can have at most one unlimited dimension, but need not have any.
The NetCDF model does not cater for variables with several changeable dimension sizes. Variables should have rectangular shapes.
12
1.2 Variables
Maximum number of variables for a file is 4096 (netcdf-3.5.1).
Each declaration gives a type, a name, and a list of dimensions that defines the shape:
    short wind_speed(time, height, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
        wind_speed:units = "m/s" ;
        .......................................
    short zonal_wind_speed(time, height, latitude, longitude) ;
        zonal_wind_speed:long_name = "zonal wind speed" ;
        zonal_wind_speed:units = "m/s" ;
        .......................................
13
Variable data ''types''
Type            Fortran              NetCDF               Bits
byte            BYTE                 NF_BYTE              8
char            CHARACTER            NF_CHAR              8
short           INTEGER*2            NF_SHORT             16
long            INTEGER*4            NF_LONG              32
float (real)    REAL*4               NF_FLOAT, NF_REAL    32
double          DOUBLE PRECISION,    NF_DOUBLE            64
                REAL*8
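The bit widths in the table can be cross-checked with Python's `struct` module. This is a sketch only: the pairing of each CDL type name with a `struct` format code is an assumption made here for illustration, and the dictionary `EXTERNAL_TYPES` is a hypothetical name, not part of NetCDF.

```python
import struct

# CDL type name -> struct format code. The ">" prefix selects big-endian,
# platform-independent "standard" sizes. This pairing is an assumption of
# this sketch, chosen to match the bit widths listed in the table.
EXTERNAL_TYPES = {
    "byte": ">b",
    "char": ">c",
    "short": ">h",
    "long": ">i",
    "float": ">f",
    "double": ">d",
}


def bits(cdl_type):
    """Number of bits occupied by one value of the given type."""
    return 8 * struct.calcsize(EXTERNAL_TYPES[cdl_type])


for name in EXTERNAL_TYPES:
    print(name, bits(name))

# Largest value a signed 16-bit short can hold.
SHORT_MAX = 2 ** (bits("short") - 1) - 1
print("largest short:", SHORT_MAX)
```

The largest short is the value that later slides use as a missing-data flag for `wind_speed`.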
14
Which dimension varies fastest?
CDL / C : short wind_speed(time, height, latitude, longitude)
          time (first) is the slowest varying dimension; longitude (last) is the fastest varying dimension.
Fortran : INTEGER*2 wind_speed(longitude, latitude, height, time)
          longitude (first) is the fastest varying dimension; time (last) is the slowest varying dimension.
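To see that the two declarations describe the same layout, one can compute the position of an element in the flat sequence of values under each ordering. This is a minimal sketch with hypothetical helper names (`c_offset`, `fortran_offset`); indices are 0-based in both helpers to keep the comparison simple, whereas Fortran itself counts from 1.

```python
def c_offset(index, shape):
    """Linear offset in C/CDL order: the last index varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def fortran_offset(index, shape):
    """Linear offset in Fortran order: the first index varies fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


shape_c = (31, 1, 122, 182)           # (time, height, latitude, longitude)
shape_f = tuple(reversed(shape_c))    # (longitude, latitude, height, time)

idx_c = (2, 0, 5, 7)                  # time=2, height=0, latitude=5, longitude=7
idx_f = tuple(reversed(idx_c))        # same element, Fortran index order

# Both orderings point at the same position in the file.
print(c_offset(idx_c, shape_c), fortran_offset(idx_f, shape_f))
```

Stepping the longitude index by one moves the offset by one in both cases, which is what "fastest varying" means.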
15
[Figure: data grid with axes Lon (X), Lat (Y), Depth / Height and Time, with time running from T0 = 1900-01-01 00:00:00 to Tn = t_end]
16
Coordinate / Independent Variables (with same name as dims)
dimensions:
    time = 31 ;
    height = 1 ;
    latitude = 122 ;
    longitude = 182 ;
variables:
    float time(time) ;
        time:units = "hours since 1900-1-1 0:0:0" ;
        time:time_of_day = "12:00" ;
    float height(height) ;
        height:units = "meters" ;
        height:positive = "up" ;
    float latitude(latitude) ;
        latitude:units = "degrees_N" ;
    float longitude(longitude) ;
        longitude:units = "degrees_E" ;
    short wind_speed(time, height, latitude, longitude) ;
17
Coordinate variables have no special meaning to the NetCDF library, but for software using the library a coordinate variable typically defines the physical coordinate corresponding to that dimension.
Software packages that make use of coordinate variables commonly assume they are numeric vectors and strictly monotonic: all values are different, either increasing or decreasing, with no missing/fill values.
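A quick check for this assumption could look like the sketch below. The function name `is_strictly_monotonic` is hypothetical; the sample latitude values are the first few from the data listing later in these slides.

```python
def is_strictly_monotonic(values):
    """True if values are strictly increasing or strictly decreasing.

    A NaN anywhere makes every comparison False, so a vector containing
    NaN placeholders for missing data is rejected as well.
    """
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


latitude = [30.25, 29.75, 29.25, 28.75]       # decreasing: acceptable
print(is_strictly_monotonic(latitude))
print(is_strictly_monotonic([0.0, 0.5, 0.5])) # repeated value: not acceptable
```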
18
Primary / Dependent Variables
dimensions:
    time = 31 ;
    depth = 1 ;
    latitude = 122 ;
    longitude = 182 ;
variables:
    ............................
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
        ............................
    short zonal_wind_speed(time, depth, latitude, longitude) ;
        zonal_wind_speed:long_name = "zonal wind speed" ;
        ............................
19
1.3 Attributes
Variable attributes provide information about a particular variable.
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
An attribute is written as variable name, ':', attribute name, '=', attribute data. Here the attribute data is a character string.
        wind_speed:missing_value = 32767s ;
Here the attribute data is a numeric value.
20
Character Variable Attributes
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;    <= title
        wind_speed:units = "m/s" ;               <= OR "ms-1"
long_name and units are recognized by tools like Ferret.
We can add any other attributes if needed, but these will not be recognized by any tools. Example:
        wind_speed:var_desc = "scalar wind speed" ;
        wind_speed:dataset = "quikscat_01_2001.nc" ;
        wind_speed:level_desc = "Surface" ;
        wind_speed:statistic = "3 day Mean" ;
        wind_speed:parent_stat = "Satellite Observation" ;
        wind_speed:history = "no processing" ;
21
Character Variable Attributes
    float latitude(latitude) ;
        latitude:long_name = "Latitude in Degrees" ;
        latitude:units = "degrees_N" ;
        latitude:point_spacing = "even" ;    <= performance improvement
    float longitude(longitude) ;
        longitude:long_name = "Longitude in Degrees" ;
        longitude:units = "degrees_E" ;
        longitude:modulo = " " ;
        longitude:point_spacing = "even" ;
Accepted longitude units: degrees_E / degrees_east / degree_E / degree_east
Accepted latitude units: degrees_N / degrees_north / degree_N / degree_north
22
Character Variable Attributes
    float depth(depth) ;                     <= for ocean
        depth:long_name = "Depth wrt sea surface" ;
        depth:units = "meters" ;
        depth:positive = "down" ;
    float height(height) ;                   <= for atmosphere
        height:long_name = "Height wrt Ground" ;
        height:units = "meters" ;
        height:positive = "up" ;
23
Character Variable Attributes
    float time(time) ;
        time:long_name = "Time" ;
        time:units = "hours since 1900-1-1 0:0:0" ;
        time:time_of_day = "12:00" ;
        time:calendar = "JULIAN" ;    <= OR calendar_type
Recommended time units are seconds, minutes, hours and days. Months and years are not of equal length.
Calendar (tool specific)    Days per year    Notes
GREGORIAN or STANDARD       365.2425         default calendar
JULIAN                      365.25           with leap years
NOLEAP or COMMON_YEAR       365              no leap years
360_DAY                     360              each month is 30 days
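A time value in these units is simply an offset from the reference date. The sketch below decodes one value with Python's `datetime`, which assumes the Gregorian calendar; other calendars in the table would need different arithmetic. The function name `decode_hours_since` is hypothetical, and the sample value 902891.9 is the first time value in the data listing later in these slides.

```python
from datetime import datetime, timedelta


def decode_hours_since(hours, origin=datetime(1900, 1, 1)):
    """Convert 'hours since <origin>' to a calendar date (Gregorian only)."""
    return origin + timedelta(hours=hours)


first = decode_hours_since(902891.9)
print(first.strftime("%Y-%m-%d %H:%M"))

# A whole number of days is easy to verify by hand:
print(decode_hours_since(24.0))
```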
24
Character Variable Attributes: climatological time axis
    float time(time) ;
        time:long_name = "Climatological Time" ;
        time:units = "hours since 0000-1-1 0:0:0" ;
        time:modulo = " " ;
25
Numeric Variable Attributes
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
        ............................
        wind_speed:valid_min = 0s ;
        wind_speed:valid_max = 60s ;
    OR
        wind_speed:valid_range = 0s, 60s ;
The numeric ''type'' of the attribute should be the same as that of the variable (here short, hence the ''s'' suffix).
26
Numeric Variable Attributes
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
        ............................
        wind_speed:scale_factor = 0.01f ;
        wind_speed:add_offset = 0.f ;
scale_factor and add_offset together provide ''packing'' of data.
When a tool ''reads'' packed data: first multiply by scale_factor, then add add_offset.
When ''packing'' data: first subtract add_offset, then divide by scale_factor.
scale_factor and add_offset should be of the type of the unpacked data (float or double).
''Packed'' data is typically of type byte or short.
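The packing arithmetic can be sketched as follows, using the scale_factor and add_offset values from this slide. The function names `pack` and `unpack` are hypothetical, and rounding to the nearest integer when packing is an assumption of this sketch; the slide does not say how the fractional part is handled.

```python
SCALE_FACTOR = 0.01   # wind_speed:scale_factor from the slide
ADD_OFFSET = 0.0      # wind_speed:add_offset from the slide


def pack(value, scale_factor=SCALE_FACTOR, add_offset=ADD_OFFSET):
    """Subtract the offset, divide by the scale factor, round to an integer."""
    packed = round((value - add_offset) / scale_factor)
    if not -32768 <= packed <= 32767:
        raise ValueError("value does not fit in a 16-bit short")
    return packed


def unpack(packed, scale_factor=SCALE_FACTOR, add_offset=ADD_OFFSET):
    """Multiply by the scale factor, then add the offset."""
    return packed * scale_factor + add_offset


stored = pack(12.34)           # integer written to the file
restored = unpack(stored)      # value a reading tool would present
print(stored, restored)
```

With a scale factor of 0.01, a 16-bit short covers roughly -327.68 to 327.67 at a resolution of 0.01, using half the disk space of a 32-bit float.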
27
Numeric Variable Attributes
    short wind_speed(time, depth, latitude, longitude) ;
        wind_speed:long_name = "wind speed" ;
        ............................
        wind_speed:missing_value = 32767s ;
        wind_speed:_FillValue = 32767s ;
_FillValue : value used to pre-fill disk space allocated to the variable; a scalar of the same ''type'' as the variable.
missing_value : value or values indicating missing data; a scalar or vector of the same ''type'' as the variable.
These values should all be outside the valid_range.
If the variable is ''packed'', the missing_value/_FillValue flags are likewise packed.
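Because the flags are stored in packed form, a reader has to compare against them before applying scale_factor and add_offset. The sketch below illustrates that order of operations. The name `unpack_with_mask` is hypothetical, and the packed valid range of 0 to 6000 (0 to 60 m/s at a scale factor of 0.01) is an assumption made for this example, not a value from the slides.

```python
MISSING = 32767            # wind_speed:missing_value / _FillValue (packed)
VALID_RANGE = (0, 6000)    # assumed packed range: 0..60 m/s at scale 0.01


def unpack_with_mask(packed_values, scale_factor=0.01, add_offset=0.0):
    """Unpack shorts, returning None for flagged or out-of-range values."""
    low, high = VALID_RANGE
    result = []
    for p in packed_values:
        # Test the packed value first: the flag is stored in packed form.
        if p == MISSING or not low <= p <= high:
            result.append(None)
        else:
            result.append(p * scale_factor + add_offset)
    return result


# The flag must lie outside the valid range, as the slide requires.
assert not VALID_RANGE[0] <= MISSING <= VALID_RANGE[1]

print(unpack_with_mask([1234, 32767, 0]))
```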
28
Global Attributes
- provide information about the NetCDF dataset as a whole, such as title, processing history, instrument
- can be of character or numeric type
- a good option to store all the necessary details about the dataset to make it ''really self-describing''
29
Global Attributes
// global attributes:
    :WOCE_Version = "3.0" ;
    :CONVENTIONS = "COARDS/WOCE" ;
    :long_name = "QuikSCAT daily mean wind fields" ;
    :producer_agency = "IFREMER" ;
    :producer_institution = "CERSAT" ;
    :product_version = "1.0" ;
    :time_resolution = "one day mean" ;
    :spatial_resolution = "0.5 degrees" ;
    :platform_id = "QuikSCAT" ;
    :instrument = "QuikSCAT" ;
    :objective_method = "kriging" ;
    :data_processing = "data missing dates are filled with dummy _FillValue-s" ;
    :time_modification = "to avoid the problems with 12:00 hrs in ferret" ;
30
2. Data
    time = 902891.9, 902915.9, 902939.9, 902963.9, 902987.9, 903011.9, 903035.9, 903059.9, 903083.9, 903107.9, 903131.9, 903155.9, ... ;
    height = 10 ;
    latitude = 30.25, 29.75, 29.25, 28.75, 28.25, 27.75, 27.25, 26.75, 26.25, ... ;
    longitude = 29.75, 30.25, 30.75, 31.25, 31.75, 32.25, 32.75, 33.25, 33.75, ... ;
    wind_speed = -129, -129, -129, -129, -129, -129, -129, -129, -129, -129, -129, -129, ... ;
31
Fortran Interface
How to read/write NetCDF files using Fortran?
- Use the ''include'' header file to define NetCDF related variables:
      INCLUDE 'netcdf.inc'
- Explicitly specify the NetCDF ''include'' and ''lib'' directories if the files ''netcdf.inc'' and ''libnetcdf.a'' are not in the default search directories for the compiler (like /usr/include and /usr/lib):
      [user@machine]$ f77 mync_pgm.f -I/home/pkgs/netcdf-3.5.1/include -L/home/pkgs/netcdf-3.5.1/lib -lnetcdf
32
Fortran Interface
Steps to create a new NetCDF file:
1. create a new NetCDF file (it opens in define mode)
      err = NF_CREATE ( 'let_me_learn.nc', NF_CLOBBER, ncid )
2. define all the required dimensions
      err = NF_DEF_DIM( ncid, 'latitude', 180, dimid_lat )
3. define all the required variables
      err = NF_DEF_VAR( ncid, 'latitude', NF_REAL, 1, dimid_lat, varid_lat )
4. define all attributes
      err = NF_PUT_ATT_TEXT( ncid, varid_lat, 'units', 9, 'degrees_N' )
5. leave define mode (and enter ''data'' mode)
      err = NF_ENDDEF (ncid)      <== Very Important
6. write data
      err = NF_PUT_VARA_REAL( ncid, varid_lat, 1, 180, lat )
7. close the NetCDF file
      err = NF_CLOSE (ncid)
33
Fortran Interface
Steps to read an existing NetCDF file:
1. open the existing NetCDF file
      err = NF_OPEN ( 'let_me_learn.nc', NF_NOWRITE, ncid )
2. get all the required variable ''id''s
      err = NF_INQ_VARID( ncid, 'latitude', varid_lat )
3. get the variable ''data''
      err = NF_GET_VARA_REAL( ncid, varid_lat, start, count, lat )
4. close the NetCDF file
      err = NF_CLOSE (ncid)
34
Fortran Interface
CMODE / OMODE Flags
For NF_CREATE:
NF_CLOBBER   : overwrite any existing dataset with the same file name
NF_NOCLOBBER : do not overwrite (clobber) an existing dataset
For NF_OPEN:
NF_NOWRITE   : open dataset with read-only access
NF_WRITE     : open dataset with read-write access
               - add/change dim, var, att & data
               - delete att
For either:
NF_SHARE     : use when one process may be writing the dataset and one or more other processes reading the dataset concurrently
35
Fortran Interface
How to write the program in an efficient way?
1. Use the IMPLICIT NONE option.
2. Use a HANDLE_ERR subroutine:
      err = NF_CREATE ( 'let_me_learn.nc', NF_CLOBBER, ncid )
      if (err .NE. NF_NOERR) call HANDLE_ERR(err)

      SUBROUTINE HANDLE_ERR(ERR)
      IMPLICIT NONE
      INCLUDE 'netcdf.inc'
      INTEGER ERR
      PRINT *, 'netcdf error : ', NF_STRERROR(ERR)
      STOP 'Stopped'
      END
3. If the program stops with ''Segmentation fault (core dumped)'', check the number of arguments in each NetCDF call.
36
Fortran Interface
Let us see a few examples.
37
ncgen
1. to check the syntax of an input CDL file.
2. to make a Fortran/C program that writes the NetCDF file described in the input CDL file.
3. to make a binary NetCDF file from a given CDL file.
[user@machine]$ man ncgen
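For the third use, the input is just a text file, so it can be produced by any script. The sketch below builds a very small CDL description as a string; the helper name `make_cdl` and the three latitude values are illustrative assumptions, and the layout follows the CDL fragments shown on the earlier slides.

```python
def make_cdl(name, latitudes):
    """Return CDL text describing one latitude coordinate variable."""
    values = ", ".join(f"{v:g}" for v in latitudes)
    lines = [
        f"netcdf {name} {{",
        "dimensions:",
        f"\tlatitude = {len(latitudes)} ;",
        "variables:",
        "\tfloat latitude(latitude) ;",
        '\t\tlatitude:units = "degrees_N" ;',
        "data:",
        f"\tlatitude = {values} ;",
        "}",
    ]
    return "\n".join(lines) + "\n"


cdl = make_cdl("let_me_learn", [30.25, 29.75, 29.25])
print(cdl)

# Saved as let_me_learn.cdl, this text could be converted to a binary
# NetCDF file with:  ncgen -o let_me_learn.nc let_me_learn.cdl
```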
38
Installing NetCDF (netcdf-3.5.1; RH Linux-9)
Get netcdf-3.5.1.tar.Z from Unidata's download site.
Login as root (if needed).
Set the following environment variables (''export'' for the bash shell; use ''setenv'' for csh):
    export CC=/usr/bin/c99
    export CPPFLAGS='-DNDEBUG -Df2cFortran'
    export CFLAGS=-O
    export FC=/usr/bin/f77
    export FFLAGS='-O -w'
    export CXX=/usr/bin/c++
[root@machine] # zcat ./netcdf-3.5.1.tar.Z | tar xvf -
[root@machine] # ./configure
[root@machine] # make
[root@machine] # make test
[root@machine] # make install
[root@machine] # make clean
39
Installing NetCDF
Let us test whether this new library is working fine.
40
What about curvilinear data ???
41
Limitations of NetCDF
1. File size increases in some cases even with missing data.
2. Only one UNLIMITED dimension is possible.
3. Limited number of external data types, which can mean inefficient use of disk space.
4. Maximum file size of 2 GB.
5. The extent to which data can be completely self-describing is limited: there is always some assumed context without which sharing and archiving data would be impractical.
6. No support for multiple concurrent writers.
7. Dimensions, variables and data cannot be DELETED.
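To get a feel for the 2 GB limit, one can estimate how many time steps of a single packed variable would fit. This is a rough sketch: it uses the grid size from the earlier dimension slides (height 1, latitude 122, longitude 182), treats 2 GB as 2 x 1024^3 bytes, and ignores the header, the coordinate variables and any other variables in the file. The function names are hypothetical.

```python
BYTES_PER_SHORT = 2
LIMIT_BYTES = 2 * 1024 ** 3   # the 2 GB limit, taken here as 2 GiB


def bytes_per_time_step(height, latitude, longitude,
                        type_size=BYTES_PER_SHORT):
    """Storage needed for one time step of one variable."""
    return height * latitude * longitude * type_size


def max_time_steps(step_bytes, limit=LIMIT_BYTES):
    """How many whole time steps fit under the size limit."""
    return limit // step_bytes


step = bytes_per_time_step(1, 122, 182)
print(step, max_time_steps(step))
```

Storing the same field as 32-bit floats doubles the size of each time step and halves the number of steps that fit, which is one reason packing into shorts is popular.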
42
So..... how is NetCDF ???????
43
Good ?????
44
Beware!
Data being in NetCDF format does not guarantee that it is better than data in other formats, unless it is supplied in proper shape with all necessary details and information.
Here is an example from the Argo data archive.
45
Questions Please..............
46
Some useful sites
NetCDF home page : http://my.unidata.ucar.edu/content/software/netcdf/index.html
Why NetCDF : http://www.cgd.ucar.edu/ccr/bettge/CSM-netCDF/csm_why_netcdf.html
Software for manipulating or displaying NetCDF data : http://www.unidata.ucar.edu/packages/netcdf/software.html
Documentation : http://www.unidata.ucar.edu/packages/netcdf/docs.html
COARDS NetCDF convention : http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html
47
NetCDF man pages:
[user@machine]$ man ncdump
[user@machine]$ man ncgen
[user@machine]$ man netcdf
48
THANK YOU