Serving unstructured grids using OPeNDAP: Using server-side operations to subset and subsample data Christopher Barker NOAA Office of Response & Restoration.

James Gallagher OPeNDAP 1/10/14
Presentation transcript:

Serving unstructured grids using OPeNDAP: Using server-side operations to subset and subsample data. Christopher Barker, NOAA Office of Response & Restoration, Emergency Response Division; James Gallagher, OPeNDAP, Inc. NOAA’s National Ocean Service, Office of Response and Restoration

NOAA Emergency Response Division National Contingency Plan specifies NOAA’s role in supporting the Coast Guard: “Provide scientific expertise to support an incident response for Oil and Chemical Spills”

Key Role: Trajectory Modeling Where is the oil (or chemical) going?

Primary Tool: GNOME (General NOAA Operational Modeling Environment) Lagrangian element (particle) model Forcing from external sources: – Winds – Currents Currents: – In house model – External operational models

GOODS GNOME Online Operational Data Server

Example: Deepwater Horizon Ocean models utilized: – NOAA CSDL: NGOM – Navy models: NCOM, HYCOM, IASNFS – USF: West Florida Shelf ROMS – TGLO/TAMU: TX shelf ROMS – NC State: SABGOM – All structured grid models

Unstructured Grid Models? Unstructured Grids: – Allow resolution to vary spatially – Conform to boundaries. Nice for oil spills and particle tracking. Many more UGRID models coming online: – Many papers at this conference

Some Models of Interest FVCOM: – nGOMOFS (NOAA CSDL) – Gulf of Maine/Mass Bay (UMASS) – Salish Sea (PNNL) SELFE: – Columbia River (OHSU) – Texas Estuaries models (UT) ADCIRC: – Gulf of Mexico / Southern LA and Texas grid 9,108,128 nodes--18,061,765 elements

90,310 Nodes 174,550 Elements V6 nGOMOFS (NOAA CSDL)

Mobile Bay, AL detail grid. About 300 m grid resolution along a 13 m deep navigation channel What if I just need Mobile Bay?

FVCOM-GoM/GB for Mass Bay and Nantucket Sounds/Shoals Boston Inner Harbor

ADCIRC: Gulf of Mexico / Southern LA and Texas grid (SL18TX): 9,108,128 nodes, 18,061,765 elements. Just surface currents: – 275 MB per time step (plus the grid specs)

Obstacles to using UGRID models: No standard for data/results on UGRIDs: – Informal working group for (quite!) a few years – Recent draft standard (netCDF3) – Work on the netCDF-Java library to support it (SURA modeling testbed project). Big grids: – Need server-side subsetting

How to get it done? NOAA/ORR post-DWH funding: – Better able to respond to large spills. We started talking to folks about server-side subsetting options. But we’re clients: – We’re not going to run a server. We needed something that would become an accepted standard/tool.

How to get it done? NOAA/NESDIS noted assorted issues: – netCDF/OPeNDAP development funding limited – Multiple diverging implementations: “Unfunded Mandate”. NESDIS coordinated funding from: – Technology, Planning and Integration for Observations (TPIO) Program – OR&R – National Climatic Data Center (NCDC)

OPeNDAP-Unidata Linked Servers (OPULS): A NOAA/BAA grant supports this important collaboration between Unidata & OPeNDAP. First goal: conformance between OPeNDAP & Unidata servers, through which access is gained to growing amounts of NOAA & related data. Other short-term goals include: – Asynchronous modes, such as are needed for (delayed) access to near-line data, perhaps stored on tape, for example – Improved access (with server-side subsetting) to data organized on non-rectangular meshes, such as in coastal modeling. Work began in Boulder during October & will be influenced by an advisory committee (yet to be appointed).

OPeNDAP: the Data Access Protocol. DAP2 combines a simple data model with a general set of operators. – Data Model: Atomic types (e.g., ‘Integer’); Arrays; Structures; Grids; and Sequences. – Operators: These provide ways to subset all but the atomic types. – Domain neutral: By keeping the semantics of the model clean, we ensure that it can be applied to many different types of data.

But how is it used? DAP is generally used as a ‘web service’. DAP requests are made using a URL. DAP responses are ‘documents’: – Text that contains metadata – A combination of text/metadata and binary data. Applications read these responses and use them in whatever ways they see fit: – the netCDF client library makes legacy applications believe they are reading from a local file

About Array and Grid Selection: In addition to requesting a whole Grid or Array, the Selection can be used to subset it in index space.
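A minimal sketch of index-space subsetting, assuming the usual DAP2 hyperslab form `name[start:stride:stop]` in which the stop index is inclusive. The variable name `u` and the helper names `hyperslab` and `to_slice` are hypothetical and not taken from the slides.

```python
def hyperslab(name, ranges):
    """Build a DAP2 array selection such as u[0:1:9][5:2:21].

    ranges is a list of (start, stride, stop) tuples, one per dimension;
    stop is inclusive, as in DAP2.
    """
    for start, stride, stop in ranges:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError("invalid range: %r" % ((start, stride, stop),))
    return name + "".join("[%d:%d:%d]" % r for r in ranges)


def to_slice(start, stride, stop):
    """Equivalent Python slice (Python's stop is exclusive, so add one)."""
    return slice(start, stop + 1, stride)


# Hypothetical variable "u" with two dimensions.
selection = hyperslab("u", [(0, 1, 9), (5, 2, 21)])
print(selection)
print(list(range(30))[to_slice(5, 2, 21)])
```

The string returned by `hyperslab` would be appended to a request URL after the `?`; `to_slice` only shows which indices such a range selects.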

About Functions: A Constraint Expression can contain functions. These functions can perform any operation that can be programmed, so they provide a good way to extend a data server to perform new operations. These include operations that are not domain neutral. In Hyrax they are written in C++.

Example URLs The base URL: “ To get metadata: – Dataset variables: – … attributes: – Or less readable in XML: To get data: – Just the variables u and v: – … in ASCII so it’s easy to read: With subsetting: – Here’s a function: – 60,”1000<TIME<3000”) – This is an example of how functions can enable domain-specific behavior; this function will return an error if the Grid is not ‘geospatial’
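The URLs from this slide are not preserved in the transcript. As a rough illustration of the request pattern it describes, the sketch below builds DAP2 request URLs by appending a response suffix and an optional constraint expression to a base URL. The base URL is a made-up placeholder, and `dap_url` is a hypothetical helper; the `.dds`, `.das`, `.ddx`, `.dods` and `.ascii` suffixes are the ones commonly served by DAP2 servers such as Hyrax.

```python
from urllib.parse import quote

# Hypothetical dataset URL, used only for illustration.
BASE_URL = "http://example.org/opendap/model/output.nc"

# Response suffixes commonly offered by DAP2 servers such as Hyrax.
SUFFIXES = {
    "variables": ".dds",   # dataset structure (variables and shapes)
    "attributes": ".das",  # attribute metadata
    "xml": ".ddx",         # structure plus attributes as XML
    "binary": ".dods",     # data response (text metadata plus binary values)
    "ascii": ".ascii",     # data as readable text
}


def dap_url(base, kind, constraint=""):
    """Return base + suffix, followed by an optional constraint expression."""
    url = base + SUFFIXES[kind]
    if constraint:
        # Percent-encode everything except common constraint punctuation.
        url += "?" + quote(constraint, safe=",[]:()")
    return url


print(dap_url(BASE_URL, "variables"))
print(dap_url(BASE_URL, "ascii", "u,v"))
```

Keeping the suffix table separate from the base URL mirrors how a client typically issues several requests against one dataset: metadata first, then data for only the variables it needs.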

Challenges: Unstructured Grids are not a specific type in DAP. We must choose a way, or set of ways, to represent these data. Datasets are often too large to download, so subsetting must be done server-side. Because the subsetting operations are complex, we will need to use server-side functions to implement them.

Requirements Must enable subsetting by polygonal regions The result must be an unstructured grid itself A subset must preserve the topological and geometric relationships present in the whole: – we can’t just regrid everything to a more convenient form.
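To make these requirements concrete, here is a small sketch of polygon subsetting on a triangular mesh. It is not the Gridfields implementation: it assumes a simple rule (keep nodes inside the polygon and only the triangles whose three nodes are all kept), and the names `point_in_polygon` and `subset_mesh` are hypothetical. Coordinates are treated as plain planar (x, y) pairs.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # implicit edge back to the first vertex
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def subset_mesh(nodes, elements, polygon):
    """Keep nodes inside the polygon and the elements they fully define.

    nodes: list of (x, y); elements: list of node-index triples.
    Connectivity is renumbered so the result is itself a valid mesh.
    """
    keep = [i for i, (x, y) in enumerate(nodes)
            if point_in_polygon(x, y, polygon)]
    old_to_new = {old: new for new, old in enumerate(keep)}
    new_nodes = [nodes[i] for i in keep]
    new_elements = [tuple(old_to_new[n] for n in elem)
                    for elem in elements
                    if all(n in old_to_new for n in elem)]
    return new_nodes, new_elements


# Tiny illustrative mesh: nodes 0 and 1 lie outside the polygon.
nodes = [(3, 0), (3, 1), (0, 0), (1, 0), (1, 1), (0, 1)]
elements = [(2, 3, 4), (2, 4, 5), (3, 0, 4), (0, 1, 4)]
polygon = [(-0.5, -0.5), (1.5, -0.5), (1.5, 1.5), (-0.5, 1.5)]
sub_nodes, sub_elements = subset_mesh(nodes, elements, polygon)
print(sub_nodes)
print(sub_elements)
```

The surviving triangles keep their original shapes and neighbor relationships; only the node numbering changes. That is the sense in which a subset can remain an unstructured grid without any regridding.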

Proposed Solution: Add subsetting through a server-side function. Adopt the proposed unstructured grid encoding using netCDF3. The result of the function will be a DAP2 response: – Input is netCDF3 with some additional ‘conventions’: it can be represented in DAP2 – There are existing clients that can read DAP2. If they understand netCDF in the new convention, they will understand the results.

The server-side function Ugrid(Mesh, ) – is a comma-separated list of latitude and longitude points – However, there is an arbitrary limit to the number of characters in a URL, so we will also support POST when OPULS makes the transition to DAP4 – It will likely take more than a year for all of DAP4 to be realized, but POST for constraint expressions will be set in the first year.

Example ugrid() calls – When ugrid() is called with two points, it will assume the polygon is a box. 45,-60, 20,-60, 20,-80) – Here the polygon is the same box as above. – There’s an understood edge connecting the first and last points. – Point order is important: self-intersecting polygons will raise an error.
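The rules on this slide can be sketched in a few lines. This is an illustration only, not the server's code: the helper names are hypothetical, points are (latitude, longitude) pairs treated as planar coordinates, and the corner order used to expand two points into a box is an assumption.

```python
def expand_box(points):
    """Two opposite corners -> four-corner box; otherwise return a copy."""
    if len(points) == 2:
        (lat1, lon1), (lat2, lon2) = points
        return [(lat1, lon1), (lat1, lon2), (lat2, lon2), (lat2, lon1)]
    return list(points)


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 properly cross each other."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def check_polygon(points):
    """Return the polygon vertices, raising ValueError if edges cross."""
    poly = expand_box(points)
    n = len(poly)
    if n < 3:
        raise ValueError("need two corner points or at least three vertices")
    # The last edge closes the ring back to the first vertex.
    edges = [(poly[i], poly[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share a vertex
            if _cross(*edges[i], *edges[j]):
                raise ValueError("self-intersecting polygon")
    return poly


box = check_polygon([(45, -80), (20, -60)])
print(box)
```

Listing the same four vertices in a different order, for example `[(0, 0), (1, 1), (1, 0), (0, 1)]`, produces a bow-tie whose edges cross, which is why point order matters. This simple check only detects edges that properly cross, not edges that merely touch or overlap.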

, 42.38, , 42.37, , 42.36, , 42.35, , , 42.34, , 42.35, , 42.38)

Implementation: We will use the Gridfields library [Howe 05]. The library will be extended to work with the new netCDF3 file format (“Deltares CF proposal for Unstructured Grid data model”) and to work with DAP. [Howe 05] Bill Howe, David Maier, “Algebraic Manipulation of Scientific Datasets,” VLDB Journal, 14(4), 2005.

Progress so far: Gridfields has already been used to build a simpler server-side demonstration function. The Gridfields code has adopted GNU’s autotools to streamline its build. We will factor out the C++ code into its own project, separate from the Python layer. This will simplify moving Gridfields into the Linux community builds.

Summary: UGRID models are seeing wide deployment. Subsetting UGRIDs on the server is critical to the wide use of model results. UGRIDs will be encoded in netCDF3. We will use a widely available open-source library to perform the actual operations. The results will be valid UGRIDs, in DAP. The work has begun.

Use for Curvilinear grids, too? Capture an arbitrary polygon subset. A rectangle in geo-coordinates is not a rectangle in grid coordinates: – We generally oversample – But that’s not always a good solution for highly deformed grids – What would the result look like? – A new structured grid? – An unstructured grid?
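The oversampling problem can be seen with a small, made-up curvilinear grid. The sketch below finds the smallest index rectangle that covers every grid point inside a geographic box; the sheared 4 x 6 grid and the name `index_bounding_box` are assumptions for illustration.

```python
def index_bounding_box(lat, lon, south, north, west, east):
    """Smallest (i, j) index rectangle covering all grid points in the box.

    lat and lon are 2-D lists of the same shape. Returns
    (i_min, i_max, j_min, j_max) with inclusive bounds, or None if no
    grid point falls inside the geographic box.
    """
    hits = [(i, j)
            for i, row in enumerate(lat)
            for j, la in enumerate(row)
            if south <= la <= north and west <= lon[i][j] <= east]
    if not hits:
        return None
    rows = [i for i, _ in hits]
    cols = [j for _, j in hits]
    return min(rows), max(rows), min(cols), max(cols)


# Hypothetical sheared grid: each row is shifted one unit east of the last.
lat = [[float(i)] * 6 for i in range(4)]
lon = [[float(i + j) for j in range(6)] for i in range(4)]

box = index_bounding_box(lat, lon, south=0, north=3, west=3, east=4)
i0, i1, j0, j1 = box
returned = (i1 - i0 + 1) * (j1 - j0 + 1)
print(box, returned)
```

On this grid the index rectangle holds 20 points although only 8 of them lie inside the requested box, and the ratio gets worse as the grid becomes more deformed.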

Further Discussion, etc. Meet here at ECM: – Lunch Wed? Discussion on UGRID Google group: OPeNDAP Wiki: