GADS: A Web Service for accessing large environmental data sets
Jon Blower, Keith Haines, Adit Santokhee
Reading e-Science Centre, University of Reading



Background
• At Reading we hold copies of various datasets (~2TB)
  – Mainly from models of oceans and atmosphere
  – Also some observational data (e.g. satellite data)
  – From Met Office, SOC, ECMWF and others
• We serve these datasets to many end users
  – Scientists (1000s of hits per year)
  – Industry (e.g. British Maritime Technology)
• Datasets are in a variety of formats
  – netCDF, GRIB, HDF, HDF5 …
• Data do not conform to naming conventions
  – E.g. “temp” instead of “sea_water_potential_temperature”

Background (2)
• There is a clear need to make access to these datasets easier
  – Users shouldn’t have to know details of how data are stored
• Hence development of GADS (Grid Access Data Service)
• Developed as part of the GODIVA project
  – Grid for Ocean Diagnostics, Interactive Visualisation and Analysis
  – NERC e-Science pilot project
• Originally developed by Woolf et al. (2003)
• Allows richer queries and more flexibility than the DODS standard
  – Although we plan to implement a DODS translation layer

GODIVA Web Portal
• Allows users to interactively select data for download using a GUI
• Users can create movies on the fly
• cf. Live Access Server

Advantages of GADS
• Users don’t need to know anything about storage details
• Can expose data with conventional names without changing data files
• Users can choose their preferred data format, irrespective of how data are stored
• Behaves as an aggregation server
  – Delivers a single file, even if the original data spanned several files
• Deployed as a Web Service
  – Can be called from any platform/language
  – Can be called programmatically (easily incorporated into larger systems and workflows)
  – Java / Apache Axis / Tomcat
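
The point about exposing conventional names without touching the files can be pictured as a lookup layer between the request and the storage. The following is only a minimal sketch, not the GADS implementation: the `ALIASES` table, the `stored_name` helper and the pairing of “temp” with the FOAM_NINTH dataset are all assumptions made for illustration.

```python
# Hypothetical alias table: conventional name -> name actually used in the files.
# The mapping shown here is illustrative and not taken from the GADS metadata.
ALIASES = {
    "FOAM_NINTH": {
        "sea_water_potential_temperature": "temp",
    },
}


def stored_name(dataset, variable):
    """Return the variable name used inside the data files.

    Names without an alias are passed through unchanged, so the
    data files themselves never need to be edited.
    """
    return ALIASES.get(dataset, {}).get(variable, variable)
```

A client would ask for the conventional name and the service would translate it just before opening the files, e.g. `stored_name("FOAM_NINTH", "sea_water_potential_temperature")`.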

Architecture
[Diagram. Labels: Client; GADS Web Service (dataQuery, dataRequest); Metadata Interface; Metadata Manager Utility; Metadata; Files]

Metadata structure

GADS Methods
• dataQuery() is used for querying the data holdings
  – “What datasets are there?”
  – “What variables are there in the dataset X?”
• dataRequest() is used for downloading data
  – User can choose the data format
  – Can easily download subsets of data
  – Uses start-stride-count semantics (familiar in community)
• dataRequestNatural()
  – Same as dataRequest() but in natural units (degrees, metres …)
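
Start-stride-count semantics can be illustrated by expanding one triple into explicit axis indices. This is a sketch under stated assumptions rather than the GADS code: the function name `expand_indices` is hypothetical, and treating a count of -1 as "everything up to the end of the axis" is an assumption.

```python
def expand_indices(start, stride, count, axis_length):
    """Expand a (start, stride, count) triple into explicit axis indices.

    Assumption: count == -1 means "as many points as fit on the axis".
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    if not 0 <= start < axis_length:
        raise ValueError("start is outside the axis")
    # Number of points reachable from `start` in steps of `stride`.
    available = (axis_length - start + stride - 1) // stride
    if count == -1:
        count = available
    if count < 1 or count > available:
        raise ValueError("count does not fit on the axis")
    return [start + i * stride for i in range(count)]
```

For example, `expand_indices(100, 4, 3, 1000)` selects every fourth point starting at index 100, giving indices 100, 104 and 108.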

dataQuery – examples of use
• dataQuery(dataset, variable, axis) – general form
• dataQuery("", "", "") – gets all dataset names in the catalogue
• dataQuery("FOAM_NINTH", "", "") – gets all the variable names in the FOAM_NINTH dataset
• dataQuery("FOAM_NINTH", "temperature", "") – gets the details of the grid for the temperature variable
• dataQuery("FOAM_NINTH", "temperature", "z") – gets all values that the z coordinate can take
• dataQuery("", "temperature", "") – gets all datasets that contain the "temperature" variable
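
The way empty arguments widen the query can be mimicked with a toy in-memory catalogue. This is a hedged sketch, not the real service: `CATALOGUE`, its contents, the `OTHER_MODEL` dataset and the shape of the "grid details" result (axis name mapped to number of points) are all invented for illustration.

```python
# Toy catalogue: dataset -> variable -> axis -> coordinate values.
# All values are invented; OTHER_MODEL is a hypothetical second dataset.
CATALOGUE = {
    "FOAM_NINTH": {
        "temperature": {"z": [5.0, 15.0, 30.0], "t": [0, 1]},
        "salinity": {"z": [5.0, 15.0, 30.0], "t": [0, 1]},
    },
    "OTHER_MODEL": {
        "temperature": {"z": [10.0, 20.0]},
    },
}


def data_query(dataset="", variable="", axis=""):
    """Mimic the five dataQuery forms; an empty string means "not specified"."""
    if not dataset and not variable:
        return sorted(CATALOGUE)  # all dataset names
    if not dataset:
        # all datasets that contain the variable
        return sorted(d for d, vs in CATALOGUE.items() if variable in vs)
    if not variable:
        return sorted(CATALOGUE[dataset])  # all variable names in the dataset
    grid = CATALOGUE[dataset][variable]
    if not axis:
        # stand-in for "details of the grid": axis name -> number of points
        return {name: len(values) for name, values in grid.items()}
    return list(grid[axis])  # all values the axis can take
```

Calling `data_query("", "temperature", "")` on this toy catalogue lists both datasets, because each of them holds a "temperature" variable.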

dataRequest – example of use
• dataRequest("FOAM_NINTH", "temperature", "CDF", "t", 0, 1, 20, "z", 0, 1, -1, "y", 100, 4, 400, "x", 300, 4, 600)
• dataRequestNatural("FOAM_NINTH", "temp", "CDF", "t", " :00:00", " :00:00", "z", "0", "10", "y", "42", "64", "x", "-26", "9")
• Returns URL to extracted dataset
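
A request in natural units has to be turned into grid indices at some point. One way to do that for a regular, ascending coordinate axis is a binary search, sketched below. The helper name `natural_to_index_range` and the sample latitude values are assumptions, and the approach is not claimed to be what GADS does internally; it would not work for rotated grids.

```python
from bisect import bisect_left, bisect_right


def natural_to_index_range(coords, low, high):
    """Return (first_index, count) of the grid points with low <= coord <= high.

    Assumes `coords` is sorted in ascending order.
    """
    first = bisect_left(coords, low)
    last = bisect_right(coords, high)  # one past the final matching index
    if first >= last:
        raise ValueError("no grid points in the requested range")
    return first, last - first


# Invented latitude axis in degrees north, for illustration only.
latitudes = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0]
first, count = natural_to_index_range(latitudes, 42.0, 64.0)
```

The resulting pair can be combined with a stride of 1 to form an index-based request for the same region.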

Metadata manager (in progress)
• e.g. adding a dataset: can “harvest” metadata from netCDF file headers

Limitations
• Assumes one timestep per file
  – Hence doesn’t handle timeseries well
• Long queries can cause problems (requests are synchronous)
  – Needs a queuing system
• Rotated grids are a problem (esp. for dataRequestNatural())
• Could have richer metadata queries
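
One possible shape for the queuing system mentioned above is a job queue with a background worker: the client gets a job identifier immediately and polls for the result URL later. This is purely a sketch of that general pattern using the Python standard library; `submit`, `start_worker`, the `results` table and the `extract` callback are hypothetical names, and nothing here is taken from GADS.

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # pending extraction requests
results = {}           # job id -> status record that clients can poll


def submit(request):
    """Queue a request and return a job id straight away."""
    job_id = str(uuid.uuid4())
    results[job_id] = {"status": "queued", "url": None}
    jobs.put((job_id, request))
    return job_id


def _worker(extract):
    """Process queued jobs one at a time, recording success or failure."""
    while True:
        job_id, request = jobs.get()
        try:
            results[job_id] = {"status": "done", "url": extract(request)}
        except Exception as exc:
            results[job_id] = {"status": "failed", "url": None, "error": str(exc)}
        finally:
            jobs.task_done()


def start_worker(extract):
    """Start a daemon thread that runs `extract` on each queued request."""
    thread = threading.Thread(target=_worker, args=(extract,), daemon=True)
    thread.start()
    return thread
```

Here `extract` stands in for whatever function performs the long-running extraction and returns the URL of the result.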

Application: Search and Rescue
• Search And Rescue Information System (SARIS)
  – British Maritime Technology (BMT)
• Used by Coastguard to locate people who have fallen overboard
• Runs a model using wind and surface current data
  – Forecasts where person will be by the time rescue arrives
• By incorporating GADS, SARIS can consume up-to-date Met Office forecasts on demand
  – Should improve quality of prediction

Spatial Databases
• Database systems now include the capability to store geospatial data
  – IBM Informix, Oracle 10g, PostgreSQL, MySQL …
• ReSC is evaluating some of these
  – Informix with Grid DataBlade looks promising (
• We need the capability to store raster data (i.e. gridded data)
  – Many only store vector data
  – Gotcha: some vendors use “raster” to mean “photograph”, not “model data”
• We also need to store 3-D data
  – Some only have native understanding of 2-D data

Future plans
• Interact more with GIS community
  – There are already some relevant initiatives out there (e.g. MarineGIS)
  – Use of databases may help (some are OGC compliant)
  – But the problem is that GIS tends to talk in 2-D
• Develop a DODS (= OPeNDAP) layer
• Encourage others to install GADS
  – We don’t want to hold lots of data in Reading!
  – POL, Met Office, ECMWF have all expressed interest
  – Software needs “hardening” first…
• Find more applications!