DATA ACCESS, QUERYING, ANALYSIS AND DATA MINING IN A DISTRIBUTED FRAMEWORK FOR EARTH SYSTEM SCIENCE SUPPORT Menas Kafatos * Center for Earth Observing.

Presentation transcript:

DATA ACCESS, QUERYING, ANALYSIS AND DATA MINING IN A DISTRIBUTED FRAMEWORK FOR EARTH SYSTEM SCIENCE SUPPORT Menas Kafatos * Center for Earth Observing and Space Research (CEOSR) George Mason University *on Behalf of the SIESIP Team GeoComputation 99

SCIENCE Seasonal to Interannual Earth Science Information Partner (SIESIP) Science Driver: Seasonal-Interannual Climate Variations, Predictability and Prediction

Seasonal-Interannual Climate
- Multidisciplinary/Interdisciplinary Research
  - Coupled atmosphere/ocean
  - Effects on Biosphere
  - Connection to Hydrological Cycle (tropical rainfall, convection, etc.)
- Multiple Phenomena
  - ENSO
  - Monsoons
  - Teleconnections (effects at continental & sub-continental levels)
  - Relation to Droughts, Event-driven Phenomena, etc.
- Multiple Time Scales
  - Spans short-scale weather and longer-term climate variability
- Multi-Agency Data Sets (NASA, NOAA, …)
- Communities of Scientists (Data Providers and Users)
  - Input being provided by Advisory Board with representation from S-I, TRMM, NSIPP, SCSMEX & IDS communities

SIESIP Management Committee Science Advisory Board Federation Management & Members

SIESIP Federation Architecture
[Diagram. Labels: User (Web); Internet; GMU; COLA; GDAAC; Exchange Protocols; Data Ingest; Data Orders; Other Data Sources (e.g. NOAA); Interactive Operations; Batch Operations; Data Delivery; Data Archiving]

VDADC ENGINE (Current GMU Prototype)
[Diagram. Labels: User Web Browser; Local Storage; Search Engine; World Wide Web; VDADC Engine; Query Conversion; Data Conversion (Images, Time Series, etc.); Data Retrieval; Data Center 1; Data Center 2; Data Center N; User Interface Java Applet; SQL Query; RDBMS (COTS); Goddard DAAC; Result Interface; DISCCD]
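The Query Conversion, SQL Query, and RDBMS labels in the VDADC diagram suggest that a user selection is translated into a SQL statement. The following is a minimal sketch of that idea only, not the prototype's actual code: the `datasets` table, its columns, the demo rows, and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for the COTS RDBMS.

```python
import sqlite3


def build_query(parameter, year):
    """Convert a user selection into a parameterized SQL query.

    The table and column names are hypothetical, not the SIESIP schema.
    """
    sql = (
        "SELECT name FROM datasets "
        "WHERE parameter = ? AND start_year <= ? AND end_year >= ? "
        "ORDER BY name"
    )
    return sql, (parameter, year, year)


def search(conn, parameter, year):
    """Run the converted query and return the matching data set names."""
    sql, params = build_query(parameter, year)
    return [row[0] for row in conn.execute(sql, params)]


def make_demo_catalog():
    """Build a tiny in-memory catalog with made-up entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datasets "
        "(name TEXT, parameter TEXT, start_year INTEGER, end_year INTEGER)"
    )
    conn.executemany(
        "INSERT INTO datasets VALUES (?, ?, ?, ?)",
        [
            ("demo_sst_a", "sea_surface_temperature", 1980, 1990),
            ("demo_sst_b", "sea_surface_temperature", 1995, 1999),
            ("demo_precip", "precipitation", 1979, 1999),
        ],
    )
    return conn
```

Using bound parameters (`?`) rather than string concatenation keeps user input out of the SQL text, which matters when queries originate from a web form.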

Current SIESIP Data Sets

El Niño Effects on the U.S.

SIESIP Supports SCSMEX Data Analysis
- SIESIP provides TRMM gridded, satellite coincidence data subsets, and GMS data for Field Campaign, seasonal & inter-annual analyses
  - Data available at /TRMM_FE/scsmex/scsmex.html
- SIESIP is producing TRMM SCSMEX data CD for international distribution at SCSMEX Science Team’s request

Tropical Cyclone Leo, 4/29/99 (TSDIS/GMU Orbit Viewer)

Climatology Interdisciplinary Data Collection (CIDC)
(click on "Interdisciplinary" under DISCIPLINE SPECIFIC INFORMATION)
- Comes as a 4-CD-ROM set; in addition, all data is available free by electronic transfer.
- Over 70 Monthly Mean Global Climate Parameters: Land, Ocean, Sun, Cryosphere, Biosphere, Atmosphere.
- The CD-ROM set was produced in collaboration with the Center for Earth Observing and Space Research (CEOSR) at George Mason University, with GrADS developed at the Center for Ocean Land Atmosphere Studies (COLA).

AVERAGE SEASONAL-CYCLE ESTIMATES FOR THE WORLD
Archived are: climatologically averaged values of monthly and annual air temperature (T) and total precipitation (P) reinterpolated to a 0.5x0.5 degree grid, their associated cross-validation fields, and the climatic water balance computed at each grid point from T and P.
Gridded datasets are archived on the SIESIP site, as well as on "climate.geog.udel.edu" under the userid "siesip" (password available on request).
AVERAGE SEASONAL-CYCLE ESTIMATES FOR SOUTH AMERICA
Archived are: climatologically averaged values of monthly and annual air temperature (T) and total precipitation (P) interpolated to a 0.5x0.5 degree grid, and their associated cross-validation fields.
Genesis of Available Gridded Datasets
a) Average monthly station T and P drawn from station climatology archives, spatially interpolated to each grid point.
b) Average monthly station T drawn from station climatology archives, spatially interpolated to each grid point using DEM-aided interpolation.
MONTHLY TIME-SERIES ESTIMATES FOR SOUTH AMERICA
Archived are: monthly total precipitation (P) and average air temperature (T) interpolated to a 0.5x0.5 degree grid, and associated cross-validation fields.
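To make the 0.5x0.5 degree grid concrete, here is a minimal sketch of how a latitude/longitude pair could map to a grid cell. It assumes a global grid of 360 rows by 720 columns with row 0 starting at 90°S and column 0 starting at 180°W; the slide does not state the archive's actual origin or ordering, and `cell_index` and `cell_center` are hypothetical helper names.

```python
def cell_index(lat, lon, res=0.5):
    """Return the (row, col) of the grid cell containing (lat, lon).

    Assumed layout: row 0 starts at 90S, column 0 starts at 180W.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    n_rows = int(round(180.0 / res))
    n_cols = int(round(360.0 / res))
    # Points on the northern or eastern edge fall into the last cell.
    row = min(int((lat + 90.0) / res), n_rows - 1)
    col = min(int((lon + 180.0) / res), n_cols - 1)
    return row, col


def cell_center(row, col, res=0.5):
    """Return the (lat, lon) at the center of a grid cell."""
    return -90.0 + (row + 0.5) * res, -180.0 + (col + 0.5) * res
```

For example, a station at 38.83°N, 77.31°W would fall in the cell whose center is 38.75°N, 77.25°W under these assumptions.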

INFORMATION TECHNOLOGY STRATEGY
- Development of science scenarios to serve particular user communities
- Web accessibility
- Development of user queries
- Integration of tool accessibility with data set accessibility to allow meaningful, user-specified queries
- Integration of a freely and easily accessible analysis tool (GrADS), on-line visualization, and data mining (pyramid) with metadata searches (XML and relational database management systems)

Three-Phase Data Access Model
- Phase 1: A user browses and searches the “static” (or description) metadata and content-based metadata provided by the SIESIP system
- Phase 2: The user gets a quick look at the contents of the data through on-line data analysis
- Phase 3: The user has located the data of interest and then orders the data
- It is an interactive and iterative process
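The three phases can be sketched as three small functions plus a loop that iterates over candidates. This is an illustrative sketch under stated assumptions, not the SIESIP implementation: the in-memory `CATALOG`, its entries and values, and all function names are made up for the example.

```python
from statistics import mean

# Hypothetical catalog: description metadata plus a few sample values.
CATALOG = {
    "demo_sst": {
        "description": "sea surface temperature, monthly",
        "values": [26.1, 27.4, 28.9, 27.0],
    },
    "demo_rain": {
        "description": "tropical rainfall, monthly",
        "values": [3.2, 8.5, 12.1, 4.0],
    },
}


def search_metadata(keyword):
    """Phase 1: find data sets whose description mentions the keyword."""
    return sorted(
        name
        for name, entry in CATALOG.items()
        if keyword.lower() in entry["description"].lower()
    )


def quick_look(name):
    """Phase 2: summarize the contents without transferring the data."""
    values = CATALOG[name]["values"]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def order(name):
    """Phase 3: place an order for the chosen data set."""
    return {"dataset": name, "status": "ordered"}


def find_and_order(keyword, accept):
    """Iterate over candidates until a quick look satisfies the user."""
    for name in search_metadata(keyword):
        if accept(quick_look(name)):
            return order(name)
    return None
```

Here `accept` stands in for the user's judgment after a quick look; in an interactive system the user would refine the search and repeat rather than receive `None`.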

COLA IT: GrADS
- Integrated User Interface Already in Place for
  - Selecting, Accessing, and Sampling Data Sets (grids, stations; in future, images)
  - Computing and Deriving New Quantities
  - Quantitatively Visualizing Results
- Designed to Handle Geophysical Data Sets
- Thousands of Users Worldwide

El Niño
1982/83 El Niño Event in March 1983
- Sea Surface Temperature Anomaly (SSTA) and Wind Field
- High values of SSTA are found near the west coast of S. America
- Trade winds have dissipated
- Display using GrADS

SIESIP: Distributed Seasonal-Interannual Data System (Implementation Example)
[Diagram. Labels: GrADS Server (three instances); NOAA Data; NASA Data; SIESIP Data Sets; GrADS Analysis Workbench; Class Libraries; J-GrADS Class Libraries; SWIL; Local NOAA Server; Internet; DODS; MetaData Server; Data Pyramid Server; Datamining Interface Applet/Plug-In; Content Browsing; Analysis; Data Order Applet/Plug-In; Data Order GUI; HTML; Data Order Server; Data Pyramid; Metadata; User Interface Driver 1; Interoperability Wrapper; MetaData Search HTML/CGI; Data and Metadata Systems on the Internet Outside of SIESIP]

E-R Diagram for SIESIP
[Diagram. Labels, in extraction order: Phenomenon Instance; Predefined Region; Cell Value; Specific Parameter; Data Product; Contact; Data File; Parameter; Platform; Instrument; Data Format; Temporal Coverage; Altitude Coverage; Cell]
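A few of the entities named in the E-R diagram can be sketched as plain data classes. The relationships between entities are not recoverable from the transcript, so the ones below (a data product has one platform and instrument, and lists of parameters and files) are assumptions made for illustration, as are the field and function names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str


@dataclass
class DataFile:
    path: str
    data_format: str


@dataclass
class DataProduct:
    # Assumed relationships; the slide's actual cardinalities are unknown.
    name: str
    platform: str
    instrument: str
    parameters: List[Parameter] = field(default_factory=list)
    files: List[DataFile] = field(default_factory=list)


def products_with_parameter(products, parameter_name):
    """Return the names of the products that carry a given parameter."""
    return [
        product.name
        for product in products
        if any(p.name == parameter_name for p in product.parameters)
    ]
```

In a relational database each class would typically become a table, with the list-valued fields expressed through foreign keys or join tables.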

Pyramid Data Model
- Motivation: to support interactive content-based browsing of large volumes of data
- For example, queries on the statistical properties of the data can be used in a content-based browsing process
- Challenge: query processing performance for large data volumes
- Solution: to speed up query evaluation by precomputing intermediate results that contribute to answering user queries
- What kind of precomputations? How to apply them?
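One common way to build a data pyramid is to store successively coarser versions of a grid, each obtained by aggregating 2x2 blocks of the level below. The slide does not say how the SIESIP pyramid is constructed, so the sketch below is an assumption: it uses block means and requires a square grid whose side length is a power of two. The function names are hypothetical.

```python
def coarsen(grid):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_pyramid(grid):
    """Return the levels from full resolution down to a single cell.

    Assumes a square grid whose side length is a power of two.
    """
    levels = [grid]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels
```

A browsing query can then be answered from a coarse level first and refined at finer levels only where the user zooms in.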

Precomputed Data Attributes
- Query evaluation performance can be improved through precomputation (i.e., precomputing the predefined data attributes that contribute to query evaluations) and approximation (i.e., deriving query answers approximately from the precomputed data attributes)
- The choice of precomputed data attributes varies with the types of queries to be answered, which in turn depend on the specific domain application
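As a sketch of how precomputation and approximation can work together, the example below stores the count, minimum, and maximum of each block of values, then uses only those attributes to bound the answer to "how many values exceed a threshold?" without reading the raw data. The choice of attributes, the block layout, and the function names are assumptions for illustration; the slide does not specify which attributes SIESIP precomputes.

```python
def precompute(values, block_size):
    """Precompute per-block attributes: count, minimum, and maximum."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"n": len(chunk), "min": min(chunk), "max": max(chunk)})
    return blocks


def approx_count_above(blocks, threshold):
    """Bound the number of values above a threshold using block attributes.

    A block whose minimum exceeds the threshold contributes all of its
    values; a block whose maximum does not exceed it contributes none;
    any other block is uncertain and only widens the upper bound.
    """
    lower = sum(b["n"] for b in blocks if b["min"] > threshold)
    upper = sum(b["n"] for b in blocks if b["max"] > threshold)
    return lower, upper
```

The gap between the two bounds shows how much precision is traded for speed; only the uncertain blocks would need to be read to get an exact answer.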

SIESIP GUI

Data Interoperability
- SIESIP is one of the DODS data server sites.
- GrADS has been added to the DODS suite of client software.
- DODS data access is enabled through the SIESIP GUI.
- COLA ftp data access is enabled through the SIESIP GUI.
- GrADS as part of the DODS server:
  - To manipulate DODS data before transferring
  - To support more data types and data formats
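DODS (now OPeNDAP) clients request subsets by appending a constraint expression to the dataset URL, with each array dimension sliced as `[start:stride:stop]`. The sketch below only builds such a URL string and does not contact a server. The host, dataset path, and variable name are hypothetical, and the function names are made up for the example.

```python
def hyperslab(start, stop, stride=1):
    """Format one dimension of a DAP2 array subset as [start:stride:stop]."""
    return f"[{start}:{stride}:{stop}]"


def dods_url(base_url, variable, slabs, suffix="dods"):
    """Build a DODS request URL with a constraint expression.

    slabs is a list of (start, stop) or (start, stop, stride) tuples,
    one per array dimension.
    """
    constraint = variable + "".join(hyperslab(*slab) for slab in slabs)
    return f"{base_url}.{suffix}?{constraint}"


# Hypothetical server and variable, for illustration only.
EXAMPLE_URL = dods_url("http://example.org/dods/demo", "sst", [(0, 11), (100, 120, 2)])
```

Because the subset is expressed in the URL, the server can cut the data down before transfer, which matches the slide's point about manipulating DODS data before transferring.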