Recent Work in Progress

Slides:



Advertisements
Similar presentations
1 WCS Encoding Format Profiles netCDF Example Stefano Nativi, Lorenzo Bigagli, Ben Domenico, John Caron March 2006 Draft based mainly on presentations.
Advertisements

Introduction to the BinX Library eDIKT project team Ted Wen Robert Carroll
Unidata Seminar Series - 30 January 2004 OPeNDAP and THREDDS: Access and Discovery of Distributed Scientific Data Yuan Ho Ethan Davis UCAR Unidata.
A PLFS Plugin for HDF5 for Improved I/O Performance and Analysis Kshitij Mehta 1, John Bent 2, Aaron Torres 3, Gary Grider 3, Edgar Gabriel 1 1 University.
Reading HDF family of formats via NetCDF-Java / CDM
Dynamic Memory Allocation I Topics Basic representation and alignment (mainly for static memory allocation, main concepts carry over to dynamic memory.
Streaming NetCDF John Caron July What does NetCDF do for you? Data Storage: machine-, OS-, compiler-independent Standard API (Application Programming.
THREDDS, CDM, OPeNDAP, netCDF and Related Conventions John Caron Unidata/UCAR Sep 2007.
7 +/- 2 Maybe Good Ideas John Caron June (1) NetCDF-Java (aka CDM) has lots of functionality, but only available in Java – NcML Aggregation – Access.
The Future of NetCDF Russ Rew UCAR Unidata Program Center Acknowledgments: John Caron, Ed Hartnett, NASA’s Earth Science Technology Office, National Science.
Unidata TDS Workshop THREDDS Data Server Overview October 2014.
THREDDS Data Server, OGC WCS, CRS, and CF Ethan Davis UCAR Unidata 2008 GO-ESSP, Seattle.
THREDDS Data Server, OGC WCS, CRS, and CF Ethan Davis UCAR Unidata 2008 GO-ESSP, Seattle.
Status of netCDF-3, netCDF-4, and CF Conventions Russ Rew Community Standards for Unstructured Grids Workshop, Boulder
OPeNDAP and the Data Access Protocol (DAP) Original version by Dave Fulker.
Implementation of Model Data Interoperability for IOOS: Successes and Lessons Learned Rich Signell USGS Woods Hole, MA / NOAA Silver Spring USA Model Data.
Unidata’s TDS Workshop TDS Overview – Part II October 2012.
OOI CyberInfrastructure: Technology Overview - Hyrax January 2009 Claudiu Farcas OOI CI Architecture & Design Team UCSD/Calit2.
A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets Vignesh Santhanagopalan Graduate Student Department Of CSE.
Unidata TDS Workshop TDS Overview – Part I XX-XX October 2014.
Unidata’s Common Data Model John Caron Unidata/UCAR Nov 2006.
THREDDS Data Server Ethan Davis GEOSS Climate Workshop 23 September 2011.
Coverages and the DAP2 Data Model James Gallagher.
NetCDF-Java Overview John Caron Oct 29, Contents Data Models / Shared Dimensions Coordinate Systems Feature Types NetCDF Markup Language (NcML)
NcML Aggregation vs Feature Collections. NcML functionality 1.Modify the objects found in CDM files – Especially Attributes – Don’t have to rewrite the.
Unidata’s TDS Workshop TDS Overview – Part II Unidata July 2011.
Mid-Course Review: NetCDF in the Current Proposal Period Russ Rew
Accomplishments and Remaining Challenges: THREDDS Data Server and Common Data Model Ethan Davis Unidata Policy Committee Meeting May 2011.
The netCDF-4 data model and format Russ Rew, UCAR Unidata NetCDF Workshop 25 October 2012.
1PeopleDocumentsData Catalog Generation Tools Analysis and Visualization Tools Data Services Discovery and Publication Tools Discovery and Publication.
THREDDS Data Server Unidata’s Common Data Model Background / Summary John Caron Unidata/UCAR Mar 2007.
Integrating netCDF and OPeNDAP (The DrNO Project) Dr. Dennis Heimbigner Unidata Go-ESSP Workshop Seattle, WA, Sept
DAP4 James Gallagher & Ethan Davis OPeNDAP and Unidata.
A/WWW Enterprises 28 Sept 1995 AstroBrowse: Survey of Current Technology A. Warnock A/WWW Enterprises
Unidata TDS Workshop THREDDS Data Server Overview
1 HDF5 Life cycle of data Boeing September 19, 2006.
Recent developments with the THREDDS Data Server (TDS) and related Tools: covering TDS, NCML, WCS, forecast aggregation and not including stuff covered.
NetCDF Data Model Issues Russ Rew, UCAR Unidata NetCDF 2010 Workshop
Unidata’s Common Data Model and the THREDDS Data Server John Caron Unidata/UCAR, Boulder CO Jan 6, 2006 ESIP Winter 2006.
Web Portal Design Workshop, Boulder (CO), Jan 2003 Luca Cinquini (NCAR, ESG) The ESG and NCAR Web Portals Luca Cinquini NCAR, ESG Outline: 1.ESG Data Services.
NIEeS Workshop, Cambridge (UK), Sep 2002 Luca Cinquini for the Earth System Grid METADATA DEVELOPMENT for the EARTH SYSTEM GRID Luca Cinquini (SCD/NCAR)
IOOS Data Services with the THREDDS Data Server Rich Signell USGS, Woods Hole IOOS DMAC Workshop Silver Spring Sep 10, 2013 Rich Signell USGS, Woods Hole.
THREDDS Catalogs Ethan Davis UCAR/Unidata NASA ESDSWG Standards Process Group meeting, 17 July 2007.
Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES
Unidata’s TDS Workshop TDS Overview – Part I July 2011.
The HDF Group Introduction to netCDF-4 Elena Pourmal The HDF Group 110/17/2015.
NetCDF-4: Software Implementing an Enhanced Data Model for the Geosciences Russ Rew, Ed Hartnett, and John Caron UCAR Unidata Program, Boulder
NetCDF and Scientific Data Durability Russ Rew, UCAR Unidata ESIP Federation Summer Meeting
Data File Formats: netCDF by Tom Whittaker University of Wisconsin-Madison SSEC/CIMSS 2009 MUG Meeting June, 2009.
Advances in the NetCDF Data Model, Format, and Software Russ Rew Coauthors: John Caron, Ed Hartnett, Dennis Heimbigner UCAR Unidata December 2010.
Weathertop Consulting, LLC Server-side OPeNDAP Analysis – Concrete steps toward a generalized framework via a reference implementation using F-TDS Roland.
11/8/2007HDF and HDF-EOS Workshop XI, Landover, MD1 Software to access HDF5 Datasets via OPeNDAP MuQun Yang, Hyo-Kyung Lee The HDF Group.
Semantic Web underpinnings of the IRI Data Library Semantic Web as a Framework for Multiple Metadata IRI Data Library: presenting Data in multiple frameworks.
Unidata Technologies Relevant to GO-ESSP: An Update Russ Rew
Developing Conventions for netCDF-4 Russ Rew, UCAR Unidata June 11, 2007 GO-ESSP.
NetCDF: Data Model, Programming Interfaces, Conventions and Format Adapted from Presentations by Russ Rew Unidata Program Center University Corporation.
Update on Unidata Technologies for Data Access Russ Rew
THREDDS Data Server (TDS) and Data Discovery John Caron Unidata/UCAR May 15, 2006.
Unidata Infrastructure for Data Services Russ Rew GO-ESSP Workshop, LLNL
NetCDF Data Model Details Russ Rew, UCAR Unidata NetCDF 2009 Workshop
Copyright © 2010 The HDF Group. All Rights Reserved1 Data Storage and I/O in HDF5.
Other Projects Relevant (and Not So Relevant) to the SODA Ideal: NetCDF, HDF, OLE/COM/DCOM, OpenDoc, Zope Sheila Denn INLS April 16, 2001.
NetCDF-Java version 2.2 Common Data Model John Caron Unidata/UCAR Dec 10, 2004.
Moving from HDF4 to HDF5/netCDF-4
Efficiently serving HDF5 via OPeNDAP
Unidata & NetCDF BoF Scientific File Formats
Recent Work in Progress
Remote Data Access Update
Status for Endeavor 6: Improved Scientific Data Access Infrastructure
OPeNDAP/Hyrax Interfaces
Presentation transcript:

Recent Work in Progress John Caron, June 3, 2003 THREDDS development Dynamic Catalogs: DQC, Resolvers IDD Data Server ADDE Cataloger NetCDF development NetCDF Markup Language (NcML) More efficient Java I/O (NIO) NetCDF/DODS/HDF5 Data Models 90% done – last 10% take another 90%.

THREDDS Catalogs Catalog Generator Data Server Client Application HTTP Server CatalogRef.xml Catalog.xml Catalog Generator Data Server DODS, ADDE, FTP, HTTP Client Application Catalogs are lists of dataset URLs. Static case. Catgen talks to server or file system. Catalog might be static or dynamically generated. CatalogRef allows dynamic partial generation (benno) Datasets hostname.edu

Dynamic Catalogs = Services HTTP Tomcat Server Catalog.xml Catalog Service Catalog Generator CatalogRef.xml Query Resolver Service DQC.xml Client Application Resolver Service URI URL Need more dynamic system for real time and very large datasets. Catalog is a file, but these are services, that is, code. Show IDD Server catalog – show sattellite DQC, then show radar DQC Data Server DODS, ADDE, FTP, HTTP Datasets hostname.edu

Dataset Query Capability (DQC) XML document. Describes what the user can ask for as a set of orthogonal “selections”. On the client, a “query URL” is formed based on the user’s choices, and sent to the server. The “query resolver” server finds which datasets satisfy the query and returns a list of real dataset URLs. The DQC describes the queries that the server is capable of responding to. In response to Level 3 radar data Generality of it was Ethan’s idea. Robb implemented the radar server. Note its an indirection: a dataset which you use to find other datasets.

Resolver Services Logical Dataset, eg “latest ETA model run” Dataset with Service type “Resolver” On the client, the URI of the logical dataset is sent to the server The server finds what is available and returns a list of real dataset URLs. DQC as a fancy kind of Resolver Service. URI to URL Credit Jeff and Don.

ADDE Cataloger Catalog.xml Client Application ADDE Data Server HTTP Tomcat Server Catalog.xml Catalog Service ADDE Cataloger CatalogRef.xml Query Resolver Service DQC.xml Xxxxx Xxxx Client Application Try to improve response to ADDE Servers. Poll / cache results Add backdoor to ADDE servers. Will try on gozer. If time, show ADDE Cataloger – client. ADDE Data Server hostname.edu Datasets IDD

Summary IDD Data Server Get as much of the IDD Data feeds available via THREDDS as possible. NCEP model data (catgen) (DODS) Level 3 NEXRAD (custom server/DQC) (ADDE) SSEC/Unidata Satellite data (ADDE Cataloger) (ADDE) Text Data: Metars, Surface Obs, etc (DQC/custom server), returns text or XML. Profiler Data (custom server/DQC) (ADDE) Credit Robb

NetCDF 3 NetCDF-3 library API Client Application NetCDF OpenDAP File Dataset protocol NcML Dataset XML Virtual dataset Local file HTTP protocol NetCDF-3 library API HTTP, Dataset Java-only Client Application

NetCDF Markup Language XML representation of netCDF metadata, uses XML Schema Core: existing netCDF data model Coordinate System: general and georeferencing coordinate system Dataset: redefine, aggregate, subset Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew

NcML Coordinate Systems Convention Parser ATDRadar AWIPS COARDS CF CSM GDV NUWG WRF Zebra NetCDF File OpenDAP Dataset Netcdf Dataset NcML Dataset XML

GeoGrids, GeoTiffs, Geowhiz! NetCDF File OpenDAP Dataset Convention Parser Netcdf Dataset GeoGrid factory VisAD / IDV GeoGrid Dataset WCS Server Have sent GeoTIFFS to ESRI using this (credit Roland) using ccm2 files. Limited to lat/lon, want to add projections. Yuan will continue work on this. Stefano work on WCS. GeoTiff Writer Strange land of GIS OpenGIS WCS GeoTiff File

NcML Dataset : “virtual view” NetCDF File OpenDAP Dataset NcML Dataset XML Dataset XML Parser Java-netCDF 2.1 CS output format of Convention parsing Dataset is an input format. Client Application NetCDF Dataset

NcML Dataset Use NcML like CDL, to declare the contents of a netCDF file. Add, delete or rename Variables, Attributes, and Dimensions Subset Variables Reorder a Variable’s dimensions Aggregate multiple netCDF files, a la DODS Aggregation Server NcML Dataset is a “virtual view” or can make copy to a local netCDF file. Declarative vs procedural Subset AWIPS Investigate Integrate with thredds catalogs? Demo copy dods file from Ken Keiser’s UAH “passive microwave” data.

2: NcML Datasets on a Server Catalog.xml DODS Agg/Netcdf Server DODS, ADDE, FTP, HTTP Dataset XML Parser Client Application NcML Dataset XML Datasets hostname.edu

3: NcML Datasets via Catalogs Catalog.xml NcML Dataset XML NetCDF File OpenDAP Dataset Catalog/Dataset XML Parser Java-netCDF v 2.1.1 Third party metadata. Client Application

NIO Rewrite ucar.nc2 I/O layer using java.nio package (currently using ucar.netcdf) Uses memory mapping, bulk I/O transfer Prototype has 7x speedup on large files. Requires JDK 1.4+ HTTP access must be rewritten

NIO vs current Java NIO Current old/new First access small (3.9 Mb) 281 671 2.4 large (240 Mb) 3334 28221 8.5 Average next 5 accesses small 54 290 5.4 large 2239 16367 7.3 Time in millisecs to sequentially read entire file Wintel 2GHz, 1 GB main memory Java 1.4.2 -client

NIO vs optimized C NIO C C/NIO First access small (3.9 Mb) 281 370 1.3 large (240 Mb) 3334 19348 5.8 Average next 5 accesses small 54 24 .44 large 2239 953 .43 Java 1.4.2 –client vs. VC 6.0 /O2 Probably OS / JDK dependent.

NetCDF Data Model NetcdfFile Dimension Variable Attribute Attribute DataType byte char short int float double

OpenDAP Data Model Dataset BaseType primitive (8) string array grid structure sequence Dataset array Dimension BaseType BaseType Attribute Attribute structure / sequence Allow DB like queries on sequences, eg all records with x < 3.0 Attribute BaseType Attribute

HDF5 Data Model Datatype Dataset Groups Datatype DataSpace Attribute Fixed point floating point date/time string bit field Opaque Compound Reference Enumeration Variable length Array Dataset DataSpace Attribute Datatype Groups File directory structure inside HDF file. Data storage Compact External Layout Indexed Striped Neither HDF5 or OpenDAP has shared global dimensions.

Possible Extensions to netCDF data model Add new data types: Strings: variable length arrays of bytes, plus an encoding attribute. Structures: collections of any other element types, allow nested structures. Vector: a variable length 1D array of any type. Allow reusable structure definition = user defined data type. Allow unnamed, undeclared dimensions = anonymous dimensions. Allow multiple unlimited dimensions (outer dimension only) Compression. Push scale/offset into library, allow variable bit sizes. Explicit support for coordinate variables/axes. All probably feasible on top of HDF5.

New NetCDF Data Model NetcdfFile Variable Dimension Attribute Structure DataType byte short int long float double String Structure Vector Dimension DataType DataType DataType Deprecate char, replace with String. Vector Length Attribute DataType Attribute

NetCDF 4 NetCDF 4 library API Client Application NetCDF V.1 and 2 File OpenDAP Dataset HDF5 File OpenDAP 4.0 protocol Local file or HTTP protocol NcML Dataset XML NetCDF 4 library API Influence both OpenDAP and HDF5 Ambitious, but useful for THREDDS clients Unidata Priorities ?? The most compelling reason…. Virtual dataset Client Application