
A Novel Multi-Target Prediction Framework and its Application to Flood Forecasting
By Anusha Nemilidinne, Christoph F. Eick, Yue Cao, Christariny Hutapea, and Khadija Khaldi
University of Houston

OUTLINE
MOTIVATION
MULTI-TARGET PREDICTION
DAG-based MULTI-TARGET PREDICTION APPROACH (DBMTP)
CASE STUDY ON ADDICKS RESERVOIR
EXPERIMENTS
RESULTS
CHALLENGES
CURRENT RESEARCH WORK
CONCLUSION

Motivation: Build a flood-resilient society
To better cope with flooding, some U.S. cities, such as Houston, Austin, and Seattle, have developed sensor-based flood warning systems that collect flooding-related data (e.g., rainfall and water levels) and provide web interfaces for downloading the collected data.
Our work takes advantage of these data: we develop data-driven approaches for flood prediction that can help warn people in advance about the potential dangers of flooding and impending damage.
Data Analysis and Intelligent Systems Lab (UH-DAIS)

Existing Flood Early Warning Systems
Houston Harris County Flood Warning System
Austin Flood Early Warning System
Dallas Flood Warning System
State of Iowa: Iowa Flood Center
National Oceanic and Atmospheric Administration (NOAA)

AUSTIN FLOOD EARLY WARNING SYSTEM

Hydrology approaches vs. data-driven approaches
Hydrology approaches:
Simulate the water cycle with mathematical representations of physical processes such as snowmelt, infiltration, and the movement of water.
Make many assumptions, which are often violated in real-world prediction scenarios, and are therefore very complex.
Examples: National Water Model, HEC-RAS.
Data-driven approaches:
Extrapolate the past into the future by utilizing training data.
Use methods of computational intelligence and machine learning to describe the physics of the system.
Make relatively few assumptions and are therefore less complex.
Examples: Linear Regression, VAR, Recurrent Neural Networks, etc.

What is Multi-Target Prediction?
Prediction problems in which we consider more than one target variable.
The multi-target approach lets us take into account dependencies and relationships among the sensors/targets, which can be represented, for example, as a directed acyclic graph (DAG).
For example, in water level prediction, upstream and downstream dependencies can be modelled as a DAG.

DAG of Addicks Reservoir Watershed (figure; measuring points 2130, 2140, 2120, 2110, 2160, 2150)
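As a rough illustration of how upstream/downstream dependencies can be held in a DAG, the sketch below stores, for each measuring point, the set of its upstream predecessors and derives an upstream-first processing order. The node IDs come from the slide, but the edges are purely hypothetical (the transcript does not preserve the arrows of the figure), and the helper name `upstream_first_order` is an assumption made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Measuring-point IDs are taken from the Addicks Reservoir DAG slide.
# The edges are HYPOTHETICAL: the transcript does not show the real ones.
# Each key maps a measuring point to the set of its upstream predecessors.
predecessors = {
    "2110": set(),
    "2120": {"2110"},
    "2130": set(),
    "2140": {"2130"},
    "2150": {"2120", "2140"},
    "2160": {"2150"},
}


def upstream_first_order(preds):
    """Return the nodes so that every predecessor precedes its successors."""
    return list(TopologicalSorter(preds).static_order())


order = upstream_first_order(predecessors)
print(order)  # one valid upstream-to-downstream order
```

Any order produced this way respects the partial order of the DAG, which is the property a chained prediction scheme relies on.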

Why is flooding an MTP problem?
Real-world applications such as flood forecasting involve numerous uncertainties and challenges: noise, missing data, and, more critically, compound dependencies among targets or multiple features.
In multi-target prediction (MTP) problems, targets often exhibit specific relationships with each other, such as a tree-shaped hierarchy, a directed acyclic graph, or a parent-child relationship. Moreover, target variables may be correlated.
MTP approaches yield better predictive results than single-target prediction (STP) approaches in dealing with these challenges.
Hence we propose a new MTP approach for flood forecasting, DAG-based Multi-Target Prediction: we apply problem transformation methods to turn the MTP problem into a set of independent STP problems, each of which is then solved using an STP approach.

DAG-based Multi-Target Prediction Approach (DBMTP)
Our goal is to predict future water levels at a set of measuring points.
The inputs of the proposed architecture are:
A directed acyclic graph (DAG) of measuring points
A dataset describing various observations for the measuring points in the DAG
A prediction scenario, which specifies which independent variable(s) are used for predicting water levels
The output of DBMTP consists of a chained set of single-target prediction models; the framework “calls” these models to predict future water levels.

Execution Framework in an Example
After collecting all the information for the input components of the DBMTP architecture, we designed and developed an execution framework that learns a DBMTP model for a DAG and a prediction scenario.
We employed a problem transformation approach that transforms the multi-target prediction problem into a set of independent single-target prediction (STP) problems, and finally chained the resulting STP models based on the partial order defined by the DAG.
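The problem transformation and chaining described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each STP model is a plain least-squares linear regression written in pure Python, the features for a node are its own water level at T plus each predecessor's water level at T+1, training uses the observed predecessor values while prediction feeds in the predecessors' predicted values, and the two-node DAG (`up`, `down`) with its data is invented for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def fit_linear(X, y):
    """Least-squares fit of y ~ b0 + b.x via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict_linear(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def train_chain(preds, level):
    """Fit one independent STP model per measuring point.

    preds: node -> set of predecessor nodes (the DAG)
    level: node -> list of observed water levels, one per time step
    """
    models = {}
    for n in preds:
        ps = sorted(preds[n])
        steps = range(len(level[n]) - 1)
        X = [[level[n][t]] + [level[p][t + 1] for p in ps] for t in steps]
        y = [level[n][t + 1] for t in steps]
        models[n] = (ps, fit_linear(X, y))
    return models


def predict_chain(preds, models, current):
    """Predict T+1 levels in DAG order; predecessor outputs feed successors."""
    out = {}
    for n in TopologicalSorter(preds).static_order():
        ps, coef = models[n]
        out[n] = predict_linear(coef, [current[n]] + [out[p] for p in ps])
    return out


# Hypothetical two-node DAG and synthetic data: "up" drains into "down".
preds = {"up": set(), "down": {"up"}}
up = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0]
down = [1.0]
for t in range(5):
    down.append(0.5 * down[-1] + up[t + 1])  # known synthetic relationship

models = train_chain(preds, {"up": up, "down": down})
forecast = predict_chain(preds, models, {"up": up[-1], "down": down[-1]})
print(forecast)
```

Because the synthetic downstream series is generated from a known linear rule, the fitted coefficients for `down` should recover that rule, which makes the chaining step easy to check.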

Case study of ADDICKS RESERVOIR

DAG of Addicks Reservoir Watershed (figure; measuring points 2130, 2140, 2120, 2110, 2160, 2150)

Data Collection and Preprocessing
Data acquisition from HCFWS (Harris County Flood Warning System):
We collected historical rainfall and water level data from the HCFWS website.
We downloaded datasets for each measuring point and used them to learn and evaluate our DBMTP models.
Data preprocessing:
Format translation and data sampling
Interpolation of missing values using averaging
Merging rainfall and water level data into one file
Once the datasets were pre-processed, we passed them to the next stage, which learns the DAG-based Multi-Target Prediction models.
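Two of the preprocessing steps listed above, interpolating missing values by averaging and merging rainfall with water level data, could look roughly like the sketch below. The exact averaging rule is not stated in the slides; this example assumes a missing value is replaced by the mean of the nearest known values before and after it. The function names, timestamps, and readings are all hypothetical.

```python
def fill_missing_by_averaging(values):
    """Replace None entries with the mean of the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is None:
            before = [j for j in known if j < i]
            after = [j for j in known if j > i]
            neighbours = []
            if before:
                neighbours.append(values[before[-1]])
            if after:
                neighbours.append(values[after[0]])
            # Leave the gap if there is no known value at all.
            filled[i] = sum(neighbours) / len(neighbours) if neighbours else None
    return filled


def merge_by_timestamp(rainfall, water_level):
    """Join two {timestamp: value} mappings into (timestamp, rain, level) rows."""
    stamps = sorted(set(rainfall) | set(water_level))
    return [(t, rainfall.get(t), water_level.get(t)) for t in stamps]


# Hypothetical hourly readings for one measuring point.
rain = {"2018-01-01 00:00": 0.0, "2018-01-01 01:00": 0.4, "2018-01-01 02:00": 0.1}
level = {"2018-01-01 00:00": 95.0, "2018-01-01 02:00": 96.0}

rows = merge_by_timestamp(rain, level)
levels = fill_missing_by_averaging([r[2] for r in rows])
print(rows)
print(levels)
```

The merged rows can then be written to a single file with the standard `csv` module, matching the "one file" step in the list.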

HCFWS website
https://www.harriscountyfws.org/
Interface example for a query request for site 2060

For example, a subset of the rainfall data for measuring site 2060 for the past year, that is, from 10/31/2017 11:59 PM to 10/8/2018 6:01 AM CDT, looks like:

More on prediction scenario…
There are 5 types of prediction scenarios in our research, and we name them by abbreviations. The prediction scenario tells the execution framework which observations of predecessors we chain into successors' prediction models.
Scenario: Description
nc: Do not chain any observations from predecessors
wc: Chain predecessor's water level prediction output W_pre,(T+1) into successor's prediction models
wcm: Chain predecessor's current water level W_pre,T and water level prediction output W_pre,(T+1) into successor's prediction models
rc: Chain predecessor's rainfall R_pre,(T+1) and R_pre,T into successor's prediction models
rwc: Chain predecessor's rainfall and water level at T and T+1, namely W_pre,(T+1), W_pre,T, R_pre,(T+1), and R_pre,T, into successor's prediction models
Because the predecessors' predictions or observations and the historical data of the measuring point currently being learned are stored in CSV files, the next step is combining the CSV files.
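The five scenarios amount to a lookup from an abbreviation to the predecessor observations that get chained in. A minimal sketch of that lookup follows; the dictionary layout, the function `chained_columns`, and the column-naming scheme (`W_<site>_<step>`) are assumptions made for illustration, not the framework's actual CSV layout.

```python
# Which predecessor observations each scenario chains into a successor's model.
# "W" = water level, "R" = rainfall; "T" = current step, "T+1" = next step.
SCENARIO_FEATURES = {
    "nc": [],
    "wc": [("W", "T+1")],
    "wcm": [("W", "T+1"), ("W", "T")],
    "rc": [("R", "T+1"), ("R", "T")],
    "rwc": [("W", "T+1"), ("W", "T"), ("R", "T+1"), ("R", "T")],
}


def chained_columns(scenario, predecessors):
    """List the (hypothetical) column names to pull from predecessors' data."""
    return [
        f"{kind}_{p}_{step}"
        for p in predecessors
        for kind, step in SCENARIO_FEATURES[scenario]
    ]


print(chained_columns("wcm", ["2110"]))
# ['W_2110_T+1', 'W_2110_T']
```

Keeping the scenarios in one table like this means the CSV combination step only needs the scenario name and the list of predecessors from the DAG.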

Experiments
In our research, we used three different regression models (Linear Regression, Support Vector Regression (SVR), and SVR with parameter improvement) in conjunction with different prediction scenarios.
We learned 14 single-target regression models: LR, LRwc, LRwcm, LRrc, LRrwc, SVR, SVRwc, SVRwcm, SVRrc, SVRrwc, SVRI, SVRIwc, SVRIrc, and SVRIrwc.
We used RMSE and MAE as evaluation metrics.
We employed 4-fold cross-validation on the past 1.5 years of data, separating the data into 2 subsets, one for training and the other for testing.
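The slides do not say how the folds were formed. One common reading of 4-fold cross-validation on time-ordered data is sketched below: the samples are cut into four contiguous blocks and each block serves once as the test set while the rest is used for training. The function `k_fold_indices` and the choice of contiguous blocks are assumptions, not a description of the authors' setup.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train_idx, test_idx); each contiguous block is the test set once."""
    bounds = [i * n_samples // k for i in range(k + 1)]
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(0, bounds[i])) + list(range(bounds[i + 1], n_samples))
        yield train, test


for train, test in k_fold_indices(10):
    print("train:", train, "test:", test)
```

Averaging RMSE and MAE over the four test blocks then gives one score per model and scenario.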

The table shows the 14 STP models used in our research. W_i,T represents the water level at time T at sensor i; R_i,T represents the amount of rainfall at time T at sensor i; pre refers to a sensor's predecessors.

Evaluation metrics for the experiments: RMSE (root mean squared error) and MAE (mean absolute error)
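Assuming the standard definitions, RMSE is the square root of the mean squared difference between observed and predicted values, and MAE is the mean absolute difference. The short sketch below computes both; the function names and sample values are illustrative only.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: sqrt(mean((actual - predicted)^2))."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: mean(|actual - predicted|)."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical observed vs. predicted water levels.
observed = [1.0, 2.0, 3.0]
forecast_levels = [1.0, 2.0, 5.0]
print(rmse(observed, forecast_levels), mae(observed, forecast_levels))
```

RMSE penalizes large errors more heavily than MAE, so a gap between the two values hints at occasional large misses, such as during flood peaks.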

Results
In our case study, LRwcm always performed better than LR: LRwcm's average RMSE was 0.22 versus 0.614 for LR, and LRwcm's average MAE was 0.12 versus 0.27 for LR.
All linear regression chaining methods performed better than LR with respect to RMSE.
LRrc always performed better than traditional LR and LRwc with respect to both RMSE and MAE.
These results suggest that chaining with future rainfall yields better results than chaining only with future water levels. However, this assumes that accurate rain predictions are available: in the experiment we were restricted to using actual rainfall amounts rather than predicted ones, so this result has to be interpreted with caution.
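To put the reported averages in relative terms, the error reduction of LRwcm over LR can be computed directly from the four figures quoted above. The helper `relative_reduction` is introduced here only for the arithmetic.

```python
def relative_reduction(baseline, improved):
    """Fraction by which the error drops relative to the baseline."""
    return (baseline - improved) / baseline


rmse_gain = relative_reduction(0.614, 0.22)  # LR vs. LRwcm average RMSE
mae_gain = relative_reduction(0.27, 0.12)    # LR vs. LRwcm average MAE
print(f"RMSE reduction: {rmse_gain:.1%}")  # RMSE reduction: 64.2%
print(f"MAE reduction: {mae_gain:.1%}")    # MAE reduction: 55.6%
```

These percentages are derived from the averages on the slide and inherit the same caveats about the evaluation setup.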

Challenges
In this experiment we had difficulties in getting real-time data (e.g., hourly data for water level and rainfall), as the sampling was irregular and for some days we had very few observations.
At the moment, evaluating the quality of these results is challenging: designing experiments that compare them with hydrology models is difficult and very time-consuming because data-driven and hydrology models work differently.
To the best of our knowledge, there is no commonly accepted water level prediction benchmark.

Current research work
Currently, we are exploring ways to collect soil moisture datasets. Soil moisture is believed to be an important factor for water level prediction.
We are exploring the USGS website for datasets and contacting institutions in various states; the goal is to obtain a high-quality dataset with high sampling rates for a complete watershed, that is, we are trying to create a water level prediction benchmark.
We are exploring Gated Recurrent Units (GRU), a recurrent neural network architecture, for the domain of flood prediction; here, too, our goal is to produce a water level prediction benchmark.

Conclusion
For this technology to be successful, we need datasets that have complete information about the watershed and highly regular sampling rates.
This research centered on the design, implementation, and evaluation of a DAG-based Multi-Target Prediction approach for water level prediction on various Harris County datasets: we learned models based not only on each measuring point's historical data but also on dependencies between targets.
In general, we believe these challenges made it difficult to provide a comprehensive evaluation of the proposed DAG-based multi-target prediction framework.
Moreover, our current approach does not consider distances between measuring points, soil moisture, or stream velocity.