NetCDF-Java Overview John Caron Oct 29, 2010. Contents Data Models / Shared Dimensions Coordinate Systems Feature Types NetCDF Markup Language (NcML)

Slides:



Advertisements
Similar presentations
1 NASA CEOP Status & Demo CEOS WGISS-25 Sanya, China February 27, 2008 Yonsook Enloe.
Advertisements

Complex Scientific Analytics in Earth Science at Extreme Scale John Caron University Corporation for Atmospheric Research Boulder, CO Oct 6, 2010.
Reading HDF family of formats via NetCDF-Java / CDM
Recent Work in Progress
The Model Output Interoperability Experiment in the Gulf of Maine: A Success Story Made Possible By CF, NcML, NetCDF-Java and THREDDS Rich Signell (USGS,
THREDDS Status John Caron Unidata 5/7/2013. Outline Release schedule Aggregations -> featureCollections / NCSS GRIB refactor Discrete Sampling Geometry.
A Common Data Model In the Middle Tier Enabling Data Access in Workflows … HDF/HDF-EOS Workshop XIV September 29, 2010 Doug Lindholm Laboratory for Atmospheric.
Streaming NetCDF John Caron July What does NetCDF do for you? Data Storage: machine-, OS-, compiler-independent Standard API (Application Programming.
® OGC Web Services Initiative, Phase 9 (OWS-9): Innovations Thread - OPeNDAP James Gallagher and Nathan Potter, OPeNDAP © 2012 Open Geospatial Consortium.
THREDDS, CDM, OPeNDAP, netCDF and Related Conventions John Caron Unidata/UCAR Sep 2007.
7 +/- 2 Maybe Good Ideas John Caron June (1) NetCDF-Java (aka CDM) has lots of functionality, but only available in Java – NcML Aggregation – Access.
The Future of NetCDF Russ Rew UCAR Unidata Program Center Acknowledgments: John Caron, Ed Hartnett, NASA’s Earth Science Technology Office, National Science.
NextGen Network-Enabled Weather (NNEW) Concepts Aaron Braeckel.
Unidata TDS Workshop THREDDS Data Server Overview October 2014.
THREDDS Data Server, OGC WCS, CRS, and CF Ethan Davis UCAR Unidata 2008 GO-ESSP, Seattle.
1 The NOAA Weather and Climate Toolkit Steve Ansari, Stephen Del Greco (NOAA / NCDC) Mark Phillips (UNC-Asheville / NEMAC) Bill Hankins (STG Inc.)
THREDDS Data Server, OGC WCS, CRS, and CF Ethan Davis UCAR Unidata 2008 GO-ESSP, Seattle.
The use of standard OGC web services in integrating distributed model, satellite and in-situ datasets Alastair Gemmell Jon Blower Keith Haines Environmental.
Status of netCDF-3, netCDF-4, and CF Conventions Russ Rew Community Standards for Unstructured Grids Workshop, Boulder
John Caron Unidata October 2012
OPeNDAP and the Data Access Protocol (DAP) Original version by Dave Fulker.
© University of Reading 2008www.reading.ac.uk Reading e-Science Centre September 10, 2015 Integrating a Web Map Service into the THREDDS Data Server Jon.
Quick Unidata Overview NetCDF Workshop 25 October 2012 Russ Rew.
Implementation of Model Data Interoperability for IOOS: Successes and Lessons Learned Rich Signell USGS Woods Hole, MA / NOAA Silver Spring USA Model Data.
The HDF Group ESIP Summer Meeting HDF OPeNDAP update Kent Yang The HDF Group 1 July 8 – 11, 2014.
NetCDF for Developers and Data Providers Russ Rew, UCAR Unidata ICTP Advanced School on High Performance and Grid Computing 14 April 2011.
Unidata’s TDS Workshop TDS Overview – Part II October 2012.
Unidata TDS Workshop TDS Overview – Part I XX-XX October 2014.
Unidata’s Common Data Model John Caron Unidata/UCAR Nov 2006.
THREDDS Data Server Ethan Davis GEOSS Climate Workshop 23 September 2011.
Coverages and the DAP2 Data Model James Gallagher.
NcML Aggregation vs Feature Collections. NcML functionality 1.Modify the objects found in CDM files – Especially Attributes – Don’t have to rewrite the.
Unidata’s TDS Workshop TDS Overview – Part II Unidata July 2011.
Mid-Course Review: NetCDF in the Current Proposal Period Russ Rew
Accomplishments and Remaining Challenges: THREDDS Data Server and Common Data Model Ethan Davis Unidata Policy Committee Meeting May 2011.
The netCDF-4 data model and format Russ Rew, UCAR Unidata NetCDF Workshop 25 October 2012.
THREDDS Data Server Unidata’s Common Data Model Background / Summary John Caron Unidata/UCAR Mar 2007.
Integrating netCDF and OPeNDAP (The DrNO Project) Dr. Dennis Heimbigner Unidata Go-ESSP Workshop Seattle, WA, Sept
DAP4 James Gallagher & Ethan Davis OPeNDAP and Unidata.
IOOS Modeling Testbed Cyberinfrastructure Rich Signell, USGS, Woods Hole, MA IOOS-RA-Briefing, Feb 14, 2012.
Unidata TDS Workshop THREDDS Data Server Overview
Recent developments with the THREDDS Data Server (TDS) and related Tools: covering TDS, NCML, WCS, forecast aggregation and not including stuff covered.
Unidata’s Common Data Model and the THREDDS Data Server John Caron Unidata/UCAR, Boulder CO Jan 6, 2006 ESIP Winter 2006.
IOOS Data Services with the THREDDS Data Server Rich Signell USGS, Woods Hole IOOS DMAC Workshop Silver Spring Sep 10, 2013 Rich Signell USGS, Woods Hole.
May 2003National Coastal Data Development Center Brief Introduction Two components Data Exchange Infrastructure (DEI) Spatial Data Model (SDM) Together,
Unidata’s TDS Workshop TDS Overview – Part I July 2011.
HDF4 OPeNDAP Project Progress Report MuQun Yang and Hyo-Kyung Lee 1 HDF Developers' Meeting11/24/2015.
The HDF Group Data Interoperability The HDF Group Staff Sep , 2010HDF/HDF-EOS Workshop XIV1.
Unidata’s Common Data Model and NetCDF Java Library API Overview John Caron Unidata/UCAR Nov 2008.
The HDF Group Introduction to netCDF-4 Elena Pourmal The HDF Group 110/17/2015.
Unidata's Involvement in Developing and Supporting Climate Science Infrastructure Russ Rew UCAR Unidata April 2010.
NetCDF-4: Software Implementing an Enhanced Data Model for the Geosciences Russ Rew, Ed Hartnett, and John Caron UCAR Unidata Program, Boulder
NetCDF and Scientific Data Durability Russ Rew, UCAR Unidata ESIP Federation Summer Meeting
Data File Formats: netCDF by Tom Whittaker University of Wisconsin-Madison SSEC/CIMSS 2009 MUG Meeting June, 2009.
Advances in the NetCDF Data Model, Format, and Software Russ Rew Coauthors: John Caron, Ed Hartnett, Dennis Heimbigner UCAR Unidata December 2010.
GIS for Atmospheric Sciences and Hydrology By David R. Maidment University of Texas at Austin National Center for Atmospheric Research, 6 July 2005.
Weathertop Consulting, LLC Server-side OPeNDAP Analysis – Concrete steps toward a generalized framework via a reference implementation using F-TDS Roland.
Common Data Model Scientific Feature Types John Caron UCAR/Unidata July 8, 2008.
Unidata Technologies Relevant to GO-ESSP: An Update Russ Rew
CF 2.0 Coming Soon? (Climate and Forecast Conventions for netCDF) Ethan Davis ESO Developing Standards - ESIP Summer Mtg 14 July 2015.
Rich Signell Roland Viger Curtis Price USGS Community for Data Integration Feb 15, 2012.
1 2.5 DISTRIBUTED DATA INTEGRATION WTF-CEOP (WGISS Test Facility for CEOP) May 2007 Yonsook Enloe (NASA/SGT) Chris Lynnes (NASA)
NetCDF: Data Model, Programming Interfaces, Conventions and Format Adapted from Presentations by Russ Rew Unidata Program Center University Corporation.
Update on Unidata Technologies for Data Access Russ Rew
THREDDS Data Server (TDS) and Data Discovery John Caron Unidata/UCAR May 15, 2006.
Unidata Infrastructure for Data Services Russ Rew GO-ESSP Workshop, LLNL
NetCDF-Java version 2.2 Common Data Model John Caron Unidata/UCAR Dec 10, 2004.
Efficiently serving HDF5 via OPeNDAP
What is NetCDF ? And what are its plans for world domination?
Recent Work in Progress
Presentation transcript:

NetCDF-Java Overview John Caron Oct 29, 2010

Contents Data Models / Shared Dimensions Coordinate Systems Feature Types NetCDF Markup Language (NcML) THREDDS Data Server (TDS)

NetCDF-3 data model Multidimensional arrays of primitive values – byte, char, short, int, float, double Key/value attributes Shared dimensions Fortran77

NetCDF-4 Data Model based on HDF5

NetCDF (extended) NetCDF, HDF5, OPeNDAP Data Models NetCDF (classic) OPeNDAP HDF5 Shared dimensions NetCDF (classic)

Gridded Data float gridData(t,z,y,x); float t(t); float y(y); float x(x); float z(z); Cartesian coordinates Data is 2,3,4D All dimensions have 1D coordinate variables netCDF: coordinate variables OPeNDAP: approx grid map variables HDF: dimension scales

Point Observation Data Set of measurements at the same point in space and time = obs Collection of obs = dataset Sample dimension not connected float obs1(sample); float obs2(sample); float lat(sample); float lon(sample); float z(sample); float time(sample); netCDF-CF: auxiliary coordinate variables OPeNDAP: grid map variables (?) HDF: dimension -> dimension scales must be 1-N

Swath float swathData( track, xtrack) float lat(track, xtrack) float lon(track, xtrack) float alt(track, xtrack) float time(track) two dimensional track and cross-track not separate time dimension aka curvilinear coordinates netCDF-CF: auxiliary coordinate variables OPeNDAP: not in DAP-2 HDF: no

Shared Dimensions Summary netCDF – Shared dimension plus conventions is general solution for coordinates – :coordinates = “lat lon alt time” OPeNDAP – No shared dimensions in current data model – Shared dimensions will be added to DAP-4 (still waiting) – :coordinates = “lat lon alt time” HDF5 – No shared dimensions in current data model – HDF-EOS added shared dimensions in metadata (NPOESS took them out) – NetCDF-4 adds a workaround NetCDF-4 not a subset of HDF-5 – NetCDF-4 does not read all HDF-5 objects HDF-5 not a subset of NetCDF-4

Editorial We’ve “solved” the syntactic issues – Hardware, OS, application independence – Encapsulate file storage details – Efficient strided array access – TODO: parallel I/O, very large files, adapting to new hardware (SSD, etc) The next 5 years are about adding “semantic” support – Unidata : Earth Science domain – Georeferencing (space / time subsetting) – Virtual datasets: Encapsulate file collection details

Unidata’s Common Data Model Scientific Feature Types Grid Point Radial Trajectory Swath StationProfile Coordinate Systems Data Access netCDF-3, HDF5, OPeNDAP BUFR, GRIB1, GRIB2, NEXRAD, NIDS, McIDAS, GEMPAK, GINI, DMSP, HDF4, HDF-EOS, DORADE, GTOPO, ASCII Storage format, multidimensional arrays Georeferencing, topology Objects, How the User sees the data

count[0] = 1; count[1] = NLVL; count[2] = NLAT; count[3] = NLON; start[1] = 0; start[2] = 0; start[3] = 0; /* Read and check one record at a time. */ for (rec = 0; rec < NREC; rec++) { start[0] = rec; nc_get_vara_float(ncid, pres_varid, start, count, &pres_in[0][0][0]))); // process data } C

int[] origin = new int[4]; int[] shape= precVar.getShape(); shape[0] = 1; // only one rec per read // loop over the rec dimension for (int rec = 0; rec < shape[0]; rec++) { origin[0] = rec; // read this index Array presArray = presVar.read(origin, shape); // process data } Java

OPeNDAP NAM_CONUS_80km_ _1200.grib1.dods? Precipitable_water[5][5:1:30][0:2:77] Netcdf-Java NetcdfFile ncfile = NetcdfFile.open(“NAM_CONUS_80km_ _1200.grib1”); Variable v = ncfile.findVariable(“Precipitable_water”); Array data = v.read(“(5)(5:30)(0:77:2)”); “Index Space” Data Access

“Coordinate Space” Access float[] level = new float[]{23.6, 78.0}; float[] lat = new float[]{40.0, 42.0}; float[] lon = new float[]{-103.6, -78.0}; for (float time : timeCoords) { Array data=gridData.read(time,level,lat,lon); // process } float gridData(t,lev,lat,lon); float t(t); // coordinate variables float lev(lev); // must be monotonic float lat(lat); float lon(lon);

Netcdf Subset Service: NAM_CONUS_80km_ _1200.grib2? var=Precipitable_water& time= T12:00:00Z& north=40&south=22&west=-110&east=-80 WMS: NAM_CONUS_80km_ _1200.grib2? SERVICE=WMS&VERSION=1.3.0&REQUEST=GetMap& CRS=EPSG:4326&WIDTH=800&HEIGHT=400&FORMAT=image/png& STYLES=boxfill/redblue&COLORSCALERANGE=-10,10& LAYERS=Precipitable_water& BBOX=-40,22,-110,-80& TIME= T12:00:00Z “Georeferenced” access

LatLonRect bb=new LatLonRect(new LatLonPointImpl(40.0, ), new LatLonPointImpl(42.0, )); Date start= DateFormatter.getISODate(“ T00:00:00”); Date end = DateFormatter.getISODate(“ T12:00:00”); DateRange dr = new DateRange(start, end); PointFeatureCollection points = stationTimeSeriesCollection.subset(bb, range); while( points.hasNext() ) { PointFeature pointFeature = points.next();... } TimeSeries “Feature Type”

LatLonRect bb = new LatLonRect(new LatLonPointImpl(40.0, ), new LatLonPointImpl(42.0, )); Date start= DateFormatter.getISODate(“ T00:00:00”); Date end = DateFormatter.getISODate(“ T12:00:00”); DateRange dr = new DateRange(start, end); Array data = grid.subset(bb, dateRange); while(data.hasNext() ) { double d = data.next();... } Grid “Feature Type”

NetcdfDataset Application Scientific Feature Types NetCDF-Java/ CDM architecture OPeNDAPNetCDF-3 HDF4 I/O service provider GRIB GINI NIDS NetcdfFile NetCDF-4 … Nexrad DMSP CoordSystem Builder Datatype Adapter NcML Georeferencing Access Index Space Access NcML

Coordinate System UML

Netcdf-Java Library parses these Conventions CF Conventions (preferred) COARDS, NCAR-CSM, ATD-Radar, Zebra, GEIF, IRIDL, NUWG, AWIPS, WRF, M3IO, IFPS, ADAS/ARPS, MADIS, Epic, RAF-Nimbus, NSSL National Reflectivity Mosaic, FslWindProfiler, Modis Satellite, Avhrr Satellite, Cosmic, …. Write your own CoordSysBuilder Java class

Projections (CF) albers_conical_equal_area lambert_azimuthal_equal_area lambert_conformal_conic mcidas_area mercator orthographic rotated_pole stereographic (including polar) transverse_mercator UTM (ellipsoidal) vertical_perspective

Vertical Transforms (CF) atmosphere_sigma atmosphere_hybrid_sigma_pressure atmosphere_hybrid_height ocean_s ocean_sigma existing3DField

NOAA's Weather and Climate Toolkit

WMS Clients NASA World Wind Cadcorp SIS Google Earth 3rd-party clients can’t use the custom WMS extensions

Unidata’s Integrated Data Viewer

Servlet Container THREDDS Data Server Datasets catalog.xml motherlode.ucar.edu THREDDS Server NetCDF-Java library Remote Access Client IDD Data HTTPServer WMS WCS OPeNDAP configCatalog.xml

THREDDS Data Server (TDS) Web server for scientific data Provides remote data access – HTTP file transfer – OPeNDAP (index space subsetting) – Open Geospatial Consortium (OGC) WMS and WCS (coordinate space subsetting) – Experimental data access protocols NetCDF Subset Service CdmRemote

OGC Web Map Service WMS 1.3 Jon Blower’s (Reading, UK) ncWMS integrated with TDS Coordinate Space subsetting Produces JPEG output Fast generation of images Reproject images into large number of coordinate systems (uses GeoToolkit)

OGC Web Coverage Service WCS 1.0 Coordinate Space subsetting Returns data, not pictures – GeoTIFF floating point, grayscale – NetCDF-CF No reprojections, resamplings Restricted to CDM files that have Grid coordinate system – evenly spaced x,y

NetCDF Markup Language (NcML) Modify (“fix”) existing datasets without rewriting them Often its possible to add missing coordinate information that allows georeferencing by standard tools

NetCDF-Java Summary NetCDF-Java Library is widely used, because: – Handles multiple file formats, single API – Grungy details of Coordinate System encodings So many standards to choose from! – NcML allows dataset modification without rewrite – NcML aggregation for virtual datasets – Remote access is built in

Issues HDF5 != netCDF-4 – Should it? OPeNDAP hasn’t delivered DAP4 – Can’t transport NetCDF-4 extendd data model Build libraries on top of data access libraries – Coordinate systems: standardize? – Feature Types : Grids, Swaths, Radial, Discrete Sampling, Unstructured Grids, Polylines

How to get this functionality into netCDF C library? Java has to run in its own process, cant be linked into C Reimplement CDM functionality in C library – 200K+ LOC – OO, inheritence – OTOH, avoid blind alleys, 3 rd party libraries Leave Java in its own process, communicate across processes

Local Client CdmRemote Server Local File System Local Client CdmRemote NetCDF-Java Library Cdmremote protocol socket Scientific Features Coordinate Systems Data Access

Possibility: CdmRemote Lightweight server for CDM datasets – Zero configuration – Local filesystem – Cache expensive objects Uses ncstream for on-the-wire protocol, probably over HTTP Java and C clients – Allow non-Java applications access to CDM stack – Coordinate space queries – Virtual datasets – Feature Types

Possibility: ncstream Write-optimized encoding of CDM datasets – append only Streaming data across network Binary encoding using Google's Protobuf Protobuf – Efficient, extensible Ncstream NetCDF4

Summary We should build higher level abstractions on top of existing data access libraries Standardize semantic conventions for coordinate systems, discovery metadata, etc. Deal with very large collections of scientific data files Keep pushing common tasks into libraries, let scientists do science

Thank You