Advisor: Professor Fukuda Student: Jason Woodring A Multi-Agent distributed approach to analyzing large climate data sets.

Slides:



Advertisements
Similar presentations
Introduction to modelling extremes
Advertisements

A Workflow Engine with Multi-Level Parallelism Supports Qifeng Huang and Yan Huang School of Computer Science Cardiff University
EHarmony in Cloud Subtitle Brian Ko. eHarmony Online subscription-based matchmaking service Available in United States, Canada, Australia and United Kingdom.
Climate Change: Science and Modeling John Paul Gonzales Project GUTS Teacher PD 6 January 2011.
Advisor: Professor Fukuda Student: Jason Woodring Climate analysis software to assist climate researchers in the detection of extreme weather events.
INTRODUCTORY PHYSICAL GEOGRAPHY. This is NOT a class about remembering the names, locations, or measures of physical features and natural phenomena around.
1 Preparing Washington for a Changing Climate An Integrated Climate Change Response Strategy Department of Ecology Hedia Adelsman, Executive Policy Advisor.
NCAR GIS Program : Bridging Gaps
Alan F. Hamlet Eric P. Salathé Matt Stumbaugh Se-Yeun Lee Seshu Vaddey U.S. Army Corps of Engineers JISAO Climate Impacts Group Dept. of Civil and Environmental.
* Finally, along the lines of predicting system behavior, researchers may want to know what conditions will lead to an optimal outcome of some property.
Dennis P. Lettenmaier Alan F. Hamlet JISAO Center for Science in the Earth System Climate Impacts Group and Department of Civil and Environmental Engineering.
Alan F. Hamlet JISAO/CSES Climate Impacts Group Dept. of Civil and Environmental Engineering University of Washington Hydrologic Implications of Climate.
Recent Climate Change Modeling Results Eric Salathé Climate Impacts Group University of Washington.
Regional Climate Change Water Supply Planning Tools for Central Puget Sound Austin Polebitski and Richard Palmer Department of Civil and Environmental.
Sergey Belov, LIT JINR 15 September, NEC’2011, Varna, Bulgaria.
Determining the Local Implications of Global Warming Professor Clifford Mass, Eric Salathe, Patrick Zahn, Richard Steed University of Washington.
Implications of 21st century climate change for the hydrology of Washington October 6, 2009 CIG Fall Forecast Meeting Climate science in the public interest.
Alan F. Hamlet Marketa McGuire Elsner Ingrid Tohver Kristian Mickelson JISAO/CSES Climate Impacts Group Dept. of Civil and Environmental Engineering University.
Developing Tools to Enable Water Resource Managers to Plan for & Adapt to Climate Change Amy Snover, PhD Climate Impacts Group University of Washington.
Washington State Climate Change Impacts Assessment: Implications of 21 st century climate change for the hydrology of Washington Marketa M Elsner 1 with.
Planning for Climate Change in the Pacific Northwest Amy Snover, PhD Climate Impacts Group Center for Science in the Earth System University of Washington.
Water Management Presentations Summary Determine climate and weather extremes that are crucial in resource management and policy making Precipitation extremes.
Sergey Belov, Tatiana Goloskokova, Vladimir Korenkov, Nikolay Kutovskiy, Danila Oleynik, Artem Petrosyan, Roman Semenov, Alexander Uzhinskiy LIT JINR The.
U.S. Department of the Interior U.S. Geological Survey David V. Hill, Information Dynamics, Contractor to USGS/EROS 12/08/2011 Satellite Image Processing.
TeraGrid Gateway User Concept – Supporting Users V. E. Lynch, M. L. Chen, J. W. Cobb, J. A. Kohl, S. D. Miller, S. S. Vazhkudai Oak Ridge National Laboratory.
Climate Forecasting Unit Prediction of climate extreme events at seasonal and decadal time scale Aida Pintó Biescas.
School of Information Technologies The University of Sydney Australia Spatio-Temporal Analysis of the relationship between South American Precipitation.
Introduction to Hands On Training in CORDEX South Asia Data Analysis
DOE BER Climate Modeling PI Meeting, Potomac, Maryland, May 12-14, 2014 Funding for this study was provided by the US Department of Energy, BER Program.
A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets Vignesh Santhanagopalan Graduate Student Department Of CSE.
© UKCIP 2006 UKCP09 and the West Midlands region West Midlands Regional Climate Change Adaptation Partnership, 8th July 2009 Chris Thomas, UK Climate Impacts.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
GHP and Extremes. GHP SCIENCE ISSUES 1995 How do water and energy processes operate over different land areas? Sub-Issues include: What is the relative.
Forecasting Streamflow with the UW Hydrometeorological Forecast System Ed Maurer Department of Atmospheric Sciences, University of Washington Pacific Northwest.
CCGrid 2014 Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression Tekin Bicer, Jian Yin and Gagan Agrawal Ohio State.

CDC Cover. NOAA Lab roles in CCSP Strategic Plan for the U.S. Climate Change Science Program: Research Elements Element 3. Atmospheric Composition Aeronomy.
_______________________________________________________________CMAQ Libraries and Utilities ___________________________________________________Community.
CCGrid 2014 Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression Tekin Bicer, Jian Yin and Gagan Agrawal Ohio State.
Graphene So what’s the most efficient way to spam all your Facebook friends? Team Adith Tekur (System Architect/Tester) Neha Rastogi (System Integrator)
GEON2 and OpenEarth Framework (OEF) Bradley Wallet School of Geology and Geophysics, University of Oklahoma
1 Critical Water Information for Floods to Droughts NOAA’s Hydrology Program January 4, 2006 Responsive to Natural Disasters Forecasts for Hazard Risk.
Alan F. Hamlet, Philip W. Mote, Dennis P. Lettenmaier JISAO/CSES Climate Impacts Group Dept. of Civil and Environmental Engineering University of Washington.
Project18’s Communication Drawing Design By: Camilo A. Silva BIOinformatics Summer 2008.
TeraGrid Gateway User Concept – Supporting Users V. E. Lynch, M. L. Chen, J. W. Cobb, J. A. Kohl, S. D. Miller, S. S. Vazhkudai Oak Ridge National Laboratory.
00/XXXX 1 Data Processing in PRISM Introduction. COCO (CDMS Overloaded for CF Objects) What is it. Why is COCO written in Python. Implementation Data Operations.
1 Adventures in Web Services for Large Geophysical Datasets Joe Sirott PMEL/NOAA.
Alan F. Hamlet Andy Wood Dennis P. Lettenmaier JISAO Center for Science in the Earth System Climate Impacts Group and the Department.
Climate change impacts and adaptation in the Pacific Northwest Dennis P. Lettenmaier Department of Civil and Environmental Engineering and Climate Impacts.
The HDF Group January 8, ESIP Winter Meeting Data Container Study: HDF5 in a POSIX File System or HDF5 C 3 : Compression, Chunking,
Eric Salathé JISAO Climate Impacts Group University of Washington Rick Steed UW Yongxin Zhang CIG, NCAR Cliff Mass UW Regional Climate Modeling and Projected.
A Cyberinfrastructure for Drought Risk Assessment An Application of Geo-Spatial Decision Support to Agriculture Risk Management.
Application Specific Module Tutorial Zoltán Farkas, Ákos Balaskó 03/27/
By Russ Frith University of Alaska at Anchorage Civil Engineering Department Estimating Alaska Snow Loads.
CF 2.0 Coming Soon? (Climate and Forecast Conventions for netCDF) Ethan Davis ESO Developing Standards - ESIP Summer Mtg 14 July 2015.
UBC/UW 2011 Hydrology and Water Resources Symposium Friday, September 30, 2011 DIAGNOSIS OF CHANGING COOL SEASON PRECIPITATION STATISTICS IN THE WESTERN.
WINTER 2016 – TERM PRESENTATION MICHAEL O’KEEFE. PAST RESEARCH - SUMMER 2015 Continued Jason Woodring’s research on UWCA Main issue with UWCA is the slow.
Big Data is a Big Deal!.
Hadoop MapReduce Framework
RCM workshop, Meteo Rwanda, Kigali
Hydrologic implications of 20th century warming in the western U.S.
EPANET-MATLAB Toolkit An Open-Source Software for Interfacing EPANET with MATLAB™ Demetrios ELIADES, Marios KYRIAKOU, Stelios VRACHIMIS and Marios POLYCARPOU.
Recent Climate Change Modeling Results
Parallel NetCDF + MASS Development
Recent Climate Change Modeling Results
Trends in Runoff and Soil Moisture in the Western U.S
JISAO Center for Science in the Earth System and the
Hydrologic Changes in the Western U.S. from
Charles Tappert Seidenberg School of CSIS, Pace University
Scientific Workflows Lecture 15
Presentation transcript:

Advisor: Professor Fukuda Student: Jason Woodring A Multi-Agent distributed approach to analyzing large climate data sets.

World Climate …. Is changing Rising Temperatures… It’s our fault We need to simulate the future changes

Local sea level rise Decreased snowpack Glaciers in decline Stream flow peaks earlier in year Winter Floods Summer Droughts Ocean acidification The Impact

What's being done about it? Simulated World Climate Data is produced using Many different models Climate Scientists Analyze this Data in different ways to try to predict extreme weather events

Climate Analysis Large climate science facilities may have dedicated computing resources to analyze these climate models. Small to medium size labs often are resource constrained, and create custom tools.

Climate Science Programming Climate scientists typically use domain specific scripting languages (CDO / NCL). They use the scripting languages to do analysis on climate models produced by different scientific agencies. These environments can be difficult to install and work with for the average person.

When the average conditions consistently exceed some threshold extreme weather events are more likely to happen. An essential calculation for predicting climate behavior. Time of Emergence (ToE)

UWCA Web Based Application Performs Time of Emergence calculations Parameterizes input Utilizes UW-320 Linux lab computing cluster Using MASS Agents to travel MASS Places / computing Nodes to analyze NetCDF data

Workflow Based 1. Select ToE variable to calculate 2. Select climate model(s) to use 3. Select parameters for ToE calculations 4. Then wait….. And view job status 5. Download files when done

MASS Library Designed for this type of simulation. Abstracts difficult parallel / distributed programming. Brings the programmer closer to the data they are working with rather than the boiler plate code that is usually necessary in order to achieve these types of simulations.

Input Data NetCDF Software [4] A set of libraries and self-describing, machine independent data formats that support the creation, access, and sharing of array-oriented scientific data. Free software available All sorts of software packages and APIs for working with NetCDF from different environments.

Provenance (to come from…) Records input climate model files used Tracking all calculation steps, and files produced between them. Tracking parameters used in calculations

Implementation Used NetBeans and Glassfish Enterprise Web Application Maven Archetype project Utilizing MASS Places and Agents

Architecture – Hardware View

Architecture – SAVN Notation

Class Diagram

Infrastructure Setup Deployed on Juno Port 8080 opened Large size shared NFS storage for climate models NetBeans Project folder setup on file share.

GUI Application Header Job Status Viewer Job Creator

Algorithms Architecture Using Template Pattern Provides a skeleton for an algorithm while deferring some steps to sub classes. Following Hollywood Principle “Don’t call us, we’ll call you”

ToE calculation 5 step calculation 1. Find days over threshold 2 Find historical tolerances 3. Find climatology 4. Least Squared Regression 5. Find ToE

Step 1: Find days over Threshold Use Temp Threshold parameter to find days over threshold. Every 365 days temp values collapse to a days over threshold integer value in a year index.

Step 1: Code

Step 2: Find Historical Tolerances Agents spawn at year 1950 and travel to year 1999 gathering days over threshold variables. User parameter specifies Min / Max %, and values are calculated

Step 2: Code

Step 3: Find Climatology Agents move to year 1980 and travel along the z axis until 2010 gathering days over threshold values The Climatology is the average of those values collected

Step 3: Code

Step 4: Least Squared Regression All agents move to year 2006, and travel all the way down to year 2099 gathering days over threshold values For each x && y column sets of values, slopes and confidence intervals are calculated.

Step 4: Code

Step 4 results in 3 artifacts 3 2-dimensional arrays These array values represent year 2001

Step 5: Find ToE The arrays from step 4 get expanded by 200 years

Step 5: Code

ToE Found! Start at year 2001, and go down each longitude && longitude column to find the year in which the value exceeds the min or max values.

Final ToE output 3 2-dimensional array NetCdf files with each cell containing year values representing the ToE

Performance Best Timing 1 node 8 thread with current algorithm

Algorithm challenges Reading in the 22gb data is time consuming Significant Heap considerations Had to experiment with many different methods of reading in the data Current method reads in all data at master node and distributes over the network to slave nodes Debugging large data sets Debugging remote nodes

Provenance File Displays all times for steps of the job Which input files were used What output files were produced What steps were executed for the job What parameters were used for the calculations

Usability Analysis Questionnaire Learning how to use the tool was quite simple Adaptability of the tool was rated low, as it requires software developer skillsets. The amount of provenance and detail was rated highly Climate researcher appreciated the workflow logging qualities of the provenance file, it provides information on what steps, files, and parameters were used. The fact that it is a web tool allows access to various interested parties, including funding agencies to perform climate science analysis.

Data Visualization The web application features file downloading. The user can use their software of choice for visualization Some free options are: Panopoly Viewer ncBrowse

Next Steps Experiment with Places reading Algorithm Add test cases for main classes Add new climate models More issues recorded in Redmine for ongoing work

References [1] Fukuda, M., Stiber, M., Salathe, E., & Kim, W. (2013). CDS&E: Small: Multi-Agent-Based Parallelization of Scientific Data Analysis and Simulation. [2] Michalakes, J., Dudhia, J., Henderson, T., Klemp, J., Skamarock, W., & Wang, W. (n.d.). The Weather Research and Forecast Model: Software Architecture and Performance. [3] Fukuda, M. (2010). MASS: Parallel-Computing Library for Multi-Agent Spatial Simulation. [4] Yasutake, B., Simonson, N., Asuncion, H., Fukuda, M., & Salathe, E. (2014). Supporting Provenance in Climate Research. [5] Zender, C., & Mangalam, H. (2007). Scaling Properties of Common Statistical Operators for Gridded Datasets. International Journal of High Performance Computing Applications,21. [6] Dalton, M., Mote, P., & Snover, A. (2013). Climate Change In The Northwest Implications for our Landscapes, Waters, and Communities. Island Press. [7] Salathe, E., Hamlet, A., & Mass, C. (2014). Estimates of 21st Century Flood Risks in the Pacific Northwest. [8] Climate Change Impacts and Adaptation in Washington State. (2013). Climate Impacts Group University of Washington. [9] Climate Change. (2014, January 1). Retrieved July 11, 2014, from [10] NetCDF FAQ. (n.d.). Retrieved July 12, 2014, from [11] Software for Manipulating or Displaying NetCDF Data. (n.d.). Retrieved July 13, 2014, from [12] Muir, L., Brown, J., Risbey, J., & Wijffels, S. (n.d.). Determining the time of emergence of the climate change signal at regional scales. The Center for Australian Weather and Climate Research, Hobart, Australia. [13] Hawkins, E., & Sutton, R. (2012). Time of emergence of climate signals. American Geophysical Union. [14] Keller, K., Joos, F., & Raible, C. (2014). Time of emergence of trends in the ocean biogeochemistry. [15] Time of Emergence of Climate Change Signals in the Puget Sound Basin Quality Assurance Project Plan. (2013). Climate Impacts Group University of Washington. [16] Mahlstein, I., Knutti, R., Solomon, S., & Portmann, R. (2011). Early onset of significant local warming in low latitude countries. [17] Maraun, D. (2013). When will trends in European mean and heavy daily precipitation emerge? [18] Ho, C., Hawkins, E., & Shaffrey, L. (2012). Statistical decadal predictions for sea surface temperatures: A benchmark for dynamical GCM predictions. [19] Capalbo, S., Eigenbrode, S., Glick, P., Littell, J., Raymondi, R., & Reeder, S. (2014). NORTHWEST. In Climate Change Impacts in the United States.