THREDDS Data Server Unidata’s Common Data Model Background / Summary John Caron Unidata/UCAR Mar 2007.



THREDDS Data Server
[Diagram, host motherlode.ucar.edu. Components shown: Tomcat Server (HTTP); THREDDS Server Application; catalog.xml; NetCDF-Java library; Datasets; IDD Data; services: HTTPServer, NetcdfSubset, WCS, OPeNDAP.]

THREDDS Catalogs XML over HTTP Hierarchical listing of online resources (datasets) Container for arbitrary search metadata –Standard set maps to DC, GCMD, ADN –Unidata/CDP Metadata can be inherited Design goal: Make it easy for data providers TDS uses for configuration –Client view vs. server view Data Access URLs –“Crossing the protocol boundary”

catalog.xml

Motherlode catalog example

THREDDS WCS 1.0 Server
Each (gridded) Dataset is WCS
Each Grid is a Coverage
Return formats
  –GeoTIFF: floating point, greyscale
  –NetCDF / CF-1.0 (same as NetcdfSubset Service)
No reprojections, resampling
GALEON 2
  –upgrade to WCS 1.1
  –Try returning point datasets

THREDDS OPeNDAP Server
Current version 2.0; NASA ESE standard
  –Working on new 4.0 protocol spec
Based on Java-OPeNDAP library
  –shared development by Unidata/opendap.org
Any CDM dataset can be served
Server4 (Hyrax):
  –latest version of opendap.org C++ library
  –uses THREDDS catalog generation code
  –THREDDS Catalogs replace dods_dir

Common Data Model
[Diagram, host hostname.edu. Components shown: Tomcat Server (HTTP); THREDDS Server Application; catalog.xml; NetCDF-Java library; Datasets; IDD Data; services: HTTPServer, NetcdfSubset, WCS, OPeNDAP. Annotation: "Then a miracle happens".]

NetCDF-Java version 2.2 architecture
[Diagram. Components shown: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NcML; NetcdfFile; I/O service provider; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …; OPeNDAP; ADDE; THREDDS; Catalog.xml.]

I/O Service Provider Implementations
General: NetCDF, HDF5, OPeNDAP
Gridded: GRIB-1, GRIB-2
Radar: NEXRAD level 2 and 3, DORADE, Chinese NEXRAD
Point: BUFR, ASCII
Satellite: DMSP, GINI, McIDAS AREA
In development / tentative
  –NOAA CLASS legacy files
  –Barrodale DataBlade

Common Data Model Layers
[Diagram. Layers shown: Data Access; Coordinate Systems; Scientific Datatypes (Grid, Point, Radial, Trajectory, Swath, StationProfile).]

NetCDF-4 and Common Data Model (Data Access Layer)

NetCDF-4 C library
4.0 Beta implements CDM access layer
  –complete, but waiting for HDF5 release 1.8 to finalize file format (Maybe this month, 1.5 years late!)
  –Persistence format for complete CDM
4.1: adding Coordinate Systems
  –Optional layer, focus on CF-1 (libcf)
4.?: merge OPeNDAP access (pending funding)

Coordinate Systems UML

NcML: NetCDF Markup Language
XML representation of netCDF metadata
Core: netCDF data access model
Coordinate System: general and georeferencing coordinate system
Dataset: redefine, aggregate, subset
Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew

NcML
NcML Coordinate Systems further developed into NcML-G by Stefano et al.
NcML Core and Dataset combined into single schema to allow dataset modification
Aggregation:
  –Union
  –Syntactic join on (existing or new) outer dimension
  –Semantic aggregation of (runtime, forecast time) = Forecast Model Run Collection
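The syntactic join can be pictured on plain arrays. The following is a minimal sketch, not NcML or NetCDF-Java code: the class and method names (`JoinAggregationDemo`, `joinExisting`, `joinNew`) are hypothetical, and each "file" is assumed to hold a small in-memory array whose outer dimension is time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a "join" aggregation on plain arrays.
// None of these names come from the NetCDF-Java library.
class JoinAggregationDemo {

    /** Join on an existing outer dimension: each input is [time][x]; rows are appended in list order. */
    static float[][] joinExisting(List<float[][]> files) {
        List<float[]> rows = new ArrayList<>();
        int inner = -1;
        for (float[][] file : files) {
            for (float[] row : file) {
                if (inner < 0) inner = row.length;
                if (row.length != inner) {
                    throw new IllegalArgumentException("inner dimension mismatch");
                }
                rows.add(row.clone());
            }
        }
        return rows.toArray(new float[0][]);
    }

    /** Join on a new outer dimension: each input is one [x] slice; its list position becomes the new index. */
    static float[][] joinNew(List<float[]> files) {
        List<float[][]> wrapped = new ArrayList<>();
        for (float[] slice : files) {
            wrapped.add(new float[][] {slice});
        }
        return joinExisting(wrapped);
    }

    public static void main(String[] args) {
        List<float[][]> files = new ArrayList<>();
        files.add(new float[][] {{1f, 2f}, {3f, 4f}}); // file 1: two time steps
        files.add(new float[][] {{5f, 6f}});           // file 2: one time step
        float[][] joined = joinExisting(files);
        System.out.println(joined.length + " time steps after the join");
    }
}
```

A union, by contrast, would merge the variable lists of several files without concatenating any dimension.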

<netcdf xmlns=" location=“/data/nids/N0R_ _2147"> NcML example

TDS / NcML example

TDS / NcML aggregation

Datasets vs. Files
Must hide actual location of data files on your server
Would like to hide actual file format
Must encapsulate collections of files into logical datasets
  –Homogeneous metadata
  –Hide arbitrary storage decisions
  –Minimize number of datasets

Forecast Model Run Collection (FMRC)

Data Model: Sampled Functions
Our phenomena are continuous functions: $F: \text{Domain} \to \text{Range}$, where
  Domain = subset of space-time (3 spatial, time), $E^4$
  Range = $\mathbb{R}^n$ (product set of real numbers)
Our measurements are sampled functions
  Domain is a point subset $\{p : p \in E^4\}$
  $M: E^4 \to \mathbb{R}^n$

Variables
Variable is a container for an Array of values
  dimensions: lat = 64; lon = 128;
  variables: float temperature(lat, lon);
Domain is a set of points in index space:
  Temperature: $\{[0..63] \times [0..127]\} \to \mathbb{R}$
  Temperature: $I^2 \to \mathbb{R}$
  Variable: $I^m \to \mathbb{R}^n$
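As an illustration of a Variable as a function on index space, here is a minimal sketch that maps an index tuple to a storage offset. The class `IndexSpaceDemo` and its methods are hypothetical and are not the NetCDF-Java `Variable` API. Row-major layout is assumed, which is how netCDF classic files order array data.

```java
// Hypothetical sketch: a variable as a function from index space to values.
// IndexSpaceDemo is not part of NetCDF-Java; row-major storage is assumed.
class IndexSpaceDemo {

    /** Number of points in the index-space domain with the given shape. */
    static int size(int[] shape) {
        int n = 1;
        for (int s : shape) {
            n *= s;
        }
        return n;
    }

    /** Row-major offset of an index tuple within an array of the given shape. */
    static int offset(int[] shape, int[] index) {
        if (shape.length != index.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        int off = 0;
        for (int d = 0; d < shape.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("index out of range in dimension " + d);
            }
            off = off * shape[d] + index[d];
        }
        return off;
    }

    public static void main(String[] args) {
        int[] shape = {64, 128};                       // lat = 64, lon = 128
        float[] temperature = new float[size(shape)];  // backing storage
        temperature[offset(shape, new int[] {10, 20})] = 287.5f;
        System.out.println(temperature[offset(shape, new int[] {10, 20})]);
    }
}
```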

Coordinate Systems
Coordinate Axis: $I^m \to \mathbb{R}$
{Axis} = Coordinate System: $I^m \to E^4$
$V: I^m \to \mathbb{R}^n$
$CS: I^m \to E^4$
$V \circ CS^{-1}: E^4 \to \mathbb{R}^n$
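The composition of V with the inverse of CS amounts to "look up a value by coordinate instead of by index". A minimal sketch follows, assuming one-dimensional lat and lon axes and a nearest-neighbor inverse. Both are simplifying assumptions, and `CoordLookupDemo` is a hypothetical name, not library code.

```java
// Hypothetical sketch of V o CS^-1 for a 2D variable with 1D coordinate axes.
// Nearest-neighbor inversion is an assumption made for illustration only.
class CoordLookupDemo {

    /** Inverse of a 1D coordinate axis: index of the axis value nearest to coord. */
    static int nearestIndex(double[] axis, double coord) {
        int best = 0;
        for (int i = 1; i < axis.length; i++) {
            if (Math.abs(axis[i] - coord) < Math.abs(axis[best] - coord)) {
                best = i;
            }
        }
        return best;
    }

    /** Value of data[lat][lon] at the grid point nearest to the requested coordinates. */
    static float valueAt(float[][] data, double[] latAxis, double[] lonAxis,
                         double lat, double lon) {
        return data[nearestIndex(latAxis, lat)][nearestIndex(lonAxis, lon)];
    }

    public static void main(String[] args) {
        double[] latAxis = {30.0, 35.0, 40.0};
        double[] lonAxis = {-110.0, -105.0, -100.0, -95.0};
        float[][] data = new float[latAxis.length][lonAxis.length];
        for (int j = 0; j < latAxis.length; j++) {
            for (int i = 0; i < lonAxis.length; i++) {
                data[j][i] = 10 * j + i; // synthetic values, not real observations
            }
        }
        System.out.println(valueAt(data, latAxis, lonAxis, 36.0, -101.0));
    }
}
```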

Scientific Data Types
Trying to go beyond index-space subsetting
Trying to satisfy $V \circ CS^{-1}: E^4 \to \mathbb{R}^n$
  –I.e. support subsetting using Space, Time “queries”
Based on datasets Unidata is familiar with
  –APIs are evolving
Intended to scale to large, multifile collections
Corresponding “standard” NetCDF file format conventions

Implementations
Datatype: Grid, PointObs, RadialSweep, Swath
Dataset: GridDataset, FMRCDataset, CollectionOfPointObs, StationCollectionOfPointObs, StationCollectionOfRadialSweep

Conclusions
CDM is our implementation data model
Map to data access models such as OGC
Current work is to serve collections instead of individual files.
Dataset is desired level of granularity
Scientific data types are implementations with specialized access

Datatype Collection GridDataset collection of GridDatatype


Gridded Datatype
  float gridData(t,z,y,x);
  float time(t);
  float y(y);
  float x(x);
  float lat(y,x);
  float lon(y,x);
  float z(z);
  float height(t,z,y,x);
Cartesian coordinates
All dimensions are connected
horizontal: lat,lon or projection x,y time(time) orthogonal 1D separable: (x, y) × time × z

GridDatatype methods
  CoordinateAxis getTaxis();
  CoordinateAxis getXaxis();
  CoordinateAxis getYaxis();
  CoordinateAxis getZaxis();
  Projection getProjection();
  int[] findXYindexFromCoord( double x_coord, double y_coord);
  LatLonRect getLatLonBoundingBox();
  Array getDataSlice (Range[] …)
  GridDatatype makeSubset (Range[] …)
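To make the Range-based methods concrete, here is a minimal sketch of index-space subsetting on a 2D array. `IndexRange` is a hypothetical stand-in for the library's Range type. Its (first, last, stride) fields with an inclusive `last` are an assumption of this sketch, and `subset` is not the actual `makeSubset` implementation.

```java
// Hypothetical sketch of Range-style subsetting in index space.
// IndexRange and subset are illustrative names, not NetCDF-Java classes.
class RangeSubsetDemo {

    /** A strided index range; 'last' is inclusive (an assumption of this sketch). */
    static final class IndexRange {
        final int first;
        final int last;
        final int stride;

        IndexRange(int first, int last, int stride) {
            if (first < 0 || last < first || stride < 1) {
                throw new IllegalArgumentException("bad range");
            }
            this.first = first;
            this.last = last;
            this.stride = stride;
        }

        /** Number of indices selected by this range. */
        int length() {
            return (last - first) / stride + 1;
        }
    }

    /** Copy the selected rows (y) and columns (x) of data[y][x] into a new array. */
    static float[][] subset(float[][] data, IndexRange yRange, IndexRange xRange) {
        float[][] out = new float[yRange.length()][xRange.length()];
        for (int j = 0; j < yRange.length(); j++) {
            for (int i = 0; i < xRange.length(); i++) {
                out[j][i] = data[yRange.first + j * yRange.stride][xRange.first + i * xRange.stride];
            }
        }
        return out;
    }

    /** Synthetic test grid with data[y][x] = 10 * y + x. */
    static float[][] ramp(int ny, int nx) {
        float[][] data = new float[ny][nx];
        for (int y = 0; y < ny; y++) {
            for (int x = 0; x < nx; x++) {
                data[y][x] = 10 * y + x;
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Every second row starting at row 1, every second column starting at column 0.
        float[][] sub = subset(ramp(4, 6), new IndexRange(1, 3, 2), new IndexRange(0, 5, 2));
        System.out.println(sub.length + " x " + sub[0].length);
    }
}
```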

Radial Data
  radialData(radial, gate):
    distance(gate)
    azimuth(radial)
    elevation(radial)
    time(radial)
Polar coordinates
All dimensions are connected
Not separate time dimension

Swath
  swathData(line,cell)
    lat(line,cell)
    lon(line,cell)
    time(line)
    z(line,cell) ??
lat/lon coordinates
not separate time dimension
all dimensions are connected

Unstructured Grid
  float unstructGrid(t,z,pt);
  float lat(pt);
  float lon(pt);
  float time(t);
  float height(z);
Pt dimension not connected
Looks the same as point data
Need to specify the connectivity explicitly

Point Observation Data
  Structure { lat, lon, z, time; v1, v2,... } obs( pt);
Set of measurements at the same point in space and time
Point dimension not connected
  float obs1(pt);
  float obs2(pt);
  float lat(pt);
  float lon(pt);
  float z(pt);
  float time(pt);

PointObsDataset Methods
  // Iterator
  Iterator getData( LatLonRect boundingBox, Date start, Date end);
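A space and time query of this shape can be sketched as a simple filter. The example below is illustrative only: `Obs`, `BoundingBox` and this `getData` are hypothetical stand-ins, the bounding box ignores dateline wrapping, and both time bounds are treated as inclusive.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a bounding-box plus time-range query over point observations.
// These classes are illustrative and are not the NetCDF-Java PointObsDataset API.
class PointObsDemo {

    /** One measurement at a single point in space and time. */
    static final class Obs {
        final double lat;
        final double lon;
        final Date time;
        final double value;

        Obs(double lat, double lon, Date time, double value) {
            this.lat = lat;
            this.lon = lon;
            this.time = time;
            this.value = value;
        }
    }

    /** Lat/lon rectangle; no dateline wrapping (a simplification of this sketch). */
    static final class BoundingBox {
        final double latMin;
        final double latMax;
        final double lonMin;
        final double lonMax;

        BoundingBox(double latMin, double latMax, double lonMin, double lonMax) {
            this.latMin = latMin;
            this.latMax = latMax;
            this.lonMin = lonMin;
            this.lonMax = lonMax;
        }

        boolean contains(double lat, double lon) {
            return lat >= latMin && lat <= latMax && lon >= lonMin && lon <= lonMax;
        }
    }

    /** Iterate over observations inside the box with start <= time <= end. */
    static Iterator<Obs> getData(List<Obs> all, BoundingBox box, Date start, Date end) {
        List<Obs> hits = new ArrayList<>();
        for (Obs o : all) {
            if (box.contains(o.lat, o.lon) && !o.time.before(start) && !o.time.after(end)) {
                hits.add(o);
            }
        }
        return hits.iterator();
    }

    /** Count the remaining elements of an iterator. */
    static int count(Iterator<?> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Obs> all = new ArrayList<>();
        all.add(new Obs(40.0, -105.0, new Date(1000L), 1.0)); // inside box and time window
        all.add(new Obs(10.0, -105.0, new Date(1000L), 2.0)); // outside box
        all.add(new Obs(40.0, -105.0, new Date(9000L), 3.0)); // outside time window
        BoundingBox box = new BoundingBox(30, 50, -110, -100);
        System.out.println(count(getData(all, box, new Date(0L), new Date(5000L))));
    }
}
```

A real server would not scan every observation. An index on space and time is what makes this kind of query efficient for large collections.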

Time series Station Data
  Structure {
    name;
    lat, lon, z;
    Structure {
      time;
      v1, v2,...
    } obs(*); // connected
  } stn(stn); // not connected

StationObs Methods
  // List
  List getStations( LatLonRect boundingBox);
  // Iterator
  Iterator getData( Station s, Date start, Date end);

Trajectory Data
  Structure {
    lat, lon, z, time;
    v1, v2,...
  } obs(pt); // connected
pt dimension is connected
  Structure {
    name;
    Structure {
      lat, lon, z, time;
      v1, v2,...
    } obs(*); // connected
  } traj(traj) // not connected
Collection dimension not connected

Profiler/Sounding Station Data
  Structure {
    name;
    lat, lon, time;
    Structure {
      z;
      v1, v2,...
    } obs(*); // connected
  } loc(nloc); // not connected

  Structure {
    name;
    lat, lon;
    Structure {
      time,
      Structure {
        z;
        v1, v2,...
      } obs(*); // connected
    } time(*); // connected
  } stn(stn); // not connected

Data Types Summary
Data access through a standard API
Convenient georeferencing
Specialized subsetting methods
  –Efficiency for large datasets

Payoff
N + M instead of N * M things on your TODO List!
[Diagram. Components shown: File Format #1, File Format #2, File Format #N; CDM; Visualization & Analysis; NetCDF file; OPeNDAP Server; WCS Service; Web Service.]

Next: DataType Aggregation
Work at the CDM DataType level, know (some) data semantics
Forecast Model Collection
  –Combine multiple model forecasts into single dataset with two time dimensions
  –With NOAA/IOOS (Steve Hankin)
Point/Station/Trajectory/Profile Data
  –Allow space/time queries, return nested sequences
  –Start from / standardize “Dapper conventions”
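The two time dimensions of a forecast model collection can be illustrated with a small table of valid times. This is a minimal sketch under stated assumptions: times are integer hours from a common reference, every run has the same forecast offsets, and `FmrcDemo` and its methods are hypothetical names rather than the FMRC implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the (run time, forecast time) structure of a model run collection.
// Times are plain integer hours; nothing here is NetCDF-Java or TDS code.
class FmrcDemo {

    /** validTime[r][f] = runHours[r] + offsetHours[f]. */
    static int[][] validTimes(int[] runHours, int[] offsetHours) {
        int[][] valid = new int[runHours.length][offsetHours.length];
        for (int r = 0; r < runHours.length; r++) {
            for (int f = 0; f < offsetHours.length; f++) {
                valid[r][f] = runHours[r] + offsetHours[f];
            }
        }
        return valid;
    }

    /** All {runIndex, offsetIndex} pairs whose forecast is valid at validHour. */
    static List<int[]> forecastsValidAt(int[] runHours, int[] offsetHours, int validHour) {
        List<int[]> hits = new ArrayList<>();
        int[][] valid = validTimes(runHours, offsetHours);
        for (int r = 0; r < valid.length; r++) {
            for (int f = 0; f < valid[r].length; f++) {
                if (valid[r][f] == validHour) {
                    hits.add(new int[] {r, f});
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] runHours = {0, 6, 12};        // model runs started at hours 0, 6 and 12
        int[] offsetHours = {0, 6, 12, 18}; // forecast offsets within each run
        List<int[]> hits = forecastsValidAt(runHours, offsetHours, 12);
        System.out.println(hits.size() + " forecasts are valid at hour 12");
    }
}
```

Several runs usually overlap at the same valid time, which is why the collection keeps both time dimensions instead of flattening them into one.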

Forecast Model Collections

Coordinate Systems: implicit/explicit
NetCDF, OPeNDAP, HDF data models do not have explicit coordinate systems, so georeferencing is not part of the API
  –Need conventions to specify (e.g. CF-1, COARDS, etc.)
GRIB, HDF-EOS (e.g.) are explicit
  –But no uniform API

NetCDF-4 C Library
[Diagram. Components shown: netCDF-3 Interface; netCDF-4 Library; HDF5 Library.]

Conclusion
Standardized Data Access in good shape
  –HDF5, NetCDF, OPeNDAP
  –Write an IOSP for proprietary formats (Java)
But that’s not good enough!
To do:
  –Standard representations of coordinate systems
  –Classifications of data types, standard services for them