STIFF: A Forecasting Framework for Spatio-Temporal Data Zhigang Li, Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist.

Presentation transcript:

STIFF: A Forecasting Framework for Spatio-Temporal Data Zhigang Li, Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist University Dallas, Texas USA

May 6, 2002, Li & Dunham, PAKDD

Our goal
- In this paper, we present a novel forecasting framework for spatio-temporal data, in which not only spatial but also temporal characteristics of the data are considered to obtain a more appropriate result.

Presentation Outline
- Motivation
- Prior Research
- Our Approach: STIFF
  - Combining two approaches to achieve better results: Time Series Analysis and ANNs
- Performance
- Future Work

Why
- There are many application fields which require spatio-temporal forecasting:
  - river hydrology, biological patterns, housing price research, rainfall distribution, waste monitoring, fishery, hotel pickup rate, etc.
- In spatio-temporal forecasting, both spatial and temporal properties, as well as their mutual correlation, are taken into account.

What work has been done
- [Jothityangkoon, Sivapalan, and Viney, 2000]
  - Rainfall forecasting
  - Hidden Markov Model
  - De-aggregate high level to lower level
  - Large error
- [Pokrajac and Obradovic, 2001]
  - Current event assumed to be impacted only by immediate temporal ancestors.

More related research
- [Cressie and Majure, 1997]
  - Model livestock waste in a river basin
  - Condensed time into a “three day area of influence”
  - “large variation of the predicted values”
- [Deutsch et al., 1986]; [Kelly et al., 1998]; [Pfeifer et al., 1990]
  - Extended time series analysis with a spatial correlation from a simple distance matrix.
  - Relying on a pure distance measurement alone is too arbitrary.

Flood Forecasting (Our Motivating Application)
- Catchment
- Many different types of sensors
- Predict at one sensor location
- Water level or flow rate
- May not be interested in actual prediction of value

Our approach: Problem definition
- $\Delta = \{\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_n\}$ is the research field, composed of $n + 1$ spatially separated subcomponents, each named $\alpha_i$.
- WLOG, $\alpha_0$ is assumed to be the target place where forecasting is to be carried out.
- For each $\alpha_i$ in $\Delta$, there are $j$ observations with equal time intervals between consecutive ones, denoted by $Л_i = \{\alpha_{i1}, \alpha_{i2}, \alpha_{i3}, \ldots, \alpha_{ij}\}$.

Problem definition (Cont.)
- Given $\Delta = \{\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_n\}$, $Л = \{Л_1, Л_2, \ldots, Л_n\}$, the number of observations $j$, and the number of look-ahead steps $\iota$, we are expected to find the best possible forecasting relationship $f$, defined as follows.

Our approach: Algorithm sketch
1) Describe the forecasting problem according to the problem definition.
2) Build a time series (ARIMA) model for each $\alpha_i$. Call the forecast from the $\alpha_0$ time series model $f_T$.
3) Construct and train an ANN to capture the spatial correlation and influence over the target subcomponent $\alpha_0$. Call the forecast from the neural network $f_S$.
4) Combine $f_T$ and $f_S$ via a statistical regression mechanism.

Time Series Data Transformation
- Convert non-stationary data to stationary data to prevent skewness as much as possible.
- Box and Cox proposed a family of transformations, namely the Box-Cox transformation:
- The key is to determine the right value for λ so as to find the appropriate transformation. For example, when λ = 0 or 0.5 the transformation is in effect the log or the square root, respectively. But how?
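
As a minimal sketch, assuming the standard Box-Cox form, a positive value y maps to (y^λ − 1)/λ for λ ≠ 0 and to ln y for λ = 0. The function names and the sample flow values below are illustrative assumptions, not taken from the paper.

```python
import math


def box_cox(values, lam):
    """Apply the standard Box-Cox power transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if abs(lam) < 1e-12:
        # Limiting case lambda -> 0 is the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(transformed, lam):
    """Map transformed values back to the original scale."""
    if abs(lam) < 1e-12:
        return [math.exp(t) for t in transformed]
    return [(lam * t + 1.0) ** (1.0 / lam) for t in transformed]


# Made-up flow values, used only to exercise the functions.
flows = [1.0, 4.0, 9.0]
log_flows = box_cox(flows, 0.0)
sqrt_like = box_cox(flows, 0.5)
```

With λ = 0.5 the result is the square root up to a shift and rescale, 2(√y − 1), which is why the slide treats it as the square-root case.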

Data transformation (cont’d)
- Box and Cox proposed a large-sample maximum-likelihood approach.
- Wei proposed to use the λ that minimizes
- The former requires much computation, while the latter may cause problems because it does not consider the difference from the real observations.
- We therefore propose the following way to determine λ.

Time series Model
- A time series model is chosen because of its proven capability to describe and capture temporal dependencies and relationships.
- Our work focused on the ARIMA technique, which can be embodied in the following formula.
- Roughly speaking, the building process can be divided into three main steps:
  - Model identification
  - Parameter estimation
  - Diagnostic checking
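
The three building steps can be illustrated with a deliberately small stand-in: an ARIMA(1,1,0) model fitted by hand. This is a sketch under stated assumptions, not the authors' model. The order (1,1,0) is fixed rather than identified from the data, the intercept is omitted, and all function names and sample numbers are hypothetical.

```python
def difference(series):
    """First-order differencing (the 'I' part with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Parameter estimation: least-squares phi in w_t = phi * w_{t-1} + e_t."""
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    if den == 0:
        raise ValueError("differenced series carries no information")
    return num / den


def residuals(diffs, phi):
    """Diagnostic checking: one-step residuals, which should look like noise."""
    return [cur - phi * prev for prev, cur in zip(diffs, diffs[1:])]


def forecast_arima_110(series, steps):
    """Forecast `steps` values ahead by iterating the fitted AR(1) on differences."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    last_value, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff
        last_value = last_value + last_diff
        forecasts.append(last_value)
    return forecasts


# Made-up daily flow values, for illustration only.
daily_flow = [10.0, 12.0, 15.0, 14.0, 13.0, 13.5, 15.0]
next_three_days = forecast_arima_110(daily_flow, steps=3)
```

In practice the identification step would compare several candidate orders, and a statistics package would handle estimation; the hand-rolled version only makes the sequence of steps concrete.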

Find the spatial influence
- Normally it is much harder to find than its temporal counterpart in the problem.
- There is no precise way to convert a spatial measurement into the change in value it may cause.
- Time has only 1 dimension, while space has 3 (or 2) dimensions.
- A simple “distance” measure is not enough; other factors are important.

Artificial Neural Network (ANN)
- Why is ANN used for finding spatial influence?
  - It is a “black-box”, non-linear technology used to find hidden patterns.
  - Like the human brain, it can self-adjust and learn automatically even if the problem is not very well defined.
  - Practice proves its usefulness.
  - [See, 1997] found ANN was especially useful in “… situations where the underlying physical relationships are not fully understood …”

ANN Construction
- Simple 3-layer back-propagation MLP
- One input node for each sensor value except $\alpha_0$.
- Actual input shifted by predicted time lag.
- The hidden layer has a number of neurons that has to be decided by experiment.
- The output layer has only one neuron, which corresponds to the target subcomponent $\alpha_0$.
- We also employ a pruning strategy to keep the ANN structure as simple as possible without harming its efficacy much.
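
The input shifting can be sketched as follows. This is an illustrative assumption about how lagged training pairs might be assembled, not the authors' code; the station names, lag values, and readings are hypothetical.

```python
def build_training_pairs(upstream, target, lags):
    """Pair each target reading with upstream readings taken `lag` steps earlier.

    upstream: dict mapping station name -> list of readings
    target:   list of readings at the target station
    lags:     dict mapping station name -> lag in time steps (non-negative int)
    """
    stations = sorted(upstream)
    max_lag = max(lags[s] for s in stations)
    inputs, outputs = [], []
    # Start at max_lag so every shifted index is valid.
    for t in range(max_lag, len(target)):
        inputs.append([upstream[s][t - lags[s]] for s in stations])
        outputs.append(target[t])
    return inputs, outputs


# Hypothetical stations "a" and "b" with made-up readings.
upstream = {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]}
target = [100, 200, 300, 400]
lags = {"a": 1, "b": 2}
inputs, outputs = build_training_pairs(upstream, target, lags)
```

Each row of `inputs` would feed one input node per upstream station, and the matching entry of `outputs` is what the single output neuron is trained to reproduce.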

Integrate the two forecasts
- We have two forecasts so far at the target subcomponent $\alpha_0$: $f_T$, from the time series model, and $f_S$, from the ANN. We may
  - either dynamically select one of the two as the current forecast;
  - or fuse them together, since they contribute to the overall forecast from two different aspects. (This is the approach taken in the paper.)
- The two forecasts are integrated via a very simple linear regression mechanism. More advanced alternatives could be used instead and might give better results.

A case study (National River Flow Archive – Great Britain)
- Here we present a practical case study to demonstrate how the framework works.
- We conduct the spatio-temporal forecasting at the outlet gauging station for the river water flow rate (m³/s). The basin is shown as follows.
- The target station is while its siblings are lying upstream.
- Derwent Catchment
- Daily mean flow values

Data transformation
- Checking the water flow rate data at station tells us the data is not very stable. Abrupt changes are obvious and present roughly 25% of the time.
- We therefore first apply the data transformation according to the approach proposed earlier.
- We empirically vary the value of λ from –1.0 to 1.0 in steps of 0.1. It turns out that λ = 0.0 is (relatively) the best. In other words, we log-transform the original water flow rate data.
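
The λ sweep can be sketched as below. The selection criterion proposed in the paper is not reproduced here: `score` is a placeholder argument, and the absolute-skewness score in the example is an assumption chosen only for illustration. The sample data are made up.

```python
import math


def box_cox(values, lam):
    """Standard Box-Cox transform for strictly positive values."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Sample skewness (third standardized moment, population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def lambda_grid():
    """Candidate values -1.0, -0.9, ..., 1.0 (21 points)."""
    return [round(-1.0 + 0.1 * k, 1) for k in range(21)]


def choose_lambda(values, score):
    """Return the grid value whose transformed data has the lowest score."""
    return min(lambda_grid(), key=lambda lam: score(box_cox(values, lam)))


# Made-up data that is symmetric on the log scale.
sample = [math.exp(x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
best = choose_lambda(sample, lambda transformed: abs(skewness(transformed)))
```

Passing the criterion in as a function keeps the sweep itself separate from whichever goodness measure is actually used.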

Actual Flow at Derwent

Case Study ANN
- 6 input nodes
- 1 output node
- 6 chosen as number of hidden nodes based on experimentation
- Number of links pruned based on river topology
- Lag time used for input based on expected flow lag time

Building models
- Following the framework specification, we then build a time series model based upon the dataset collected from each gauging station.
- An ANN is constructed after that, with the spatially-induced pruning strategy applied to remove as many unnecessary links as possible while sacrificing little forecasting accuracy.
- The final overall spatio-temporal forecast is then generated by this simple regression:

STIFF Model
- Inputs: $f_S$ and $f_T$
- Combined forecast: $x_1 f_T + x_2 f_S + C$
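
A minimal sketch of fitting $x_1$, $x_2$, and $C$ by ordinary least squares, assuming past forecasts from both models and the matching observations are available. The solver, function names, and numbers are illustrative assumptions and are not taken from the paper.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: forecasts may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fusion(f_t, f_s, observed):
    """Least-squares (x1, x2, C) for observed ~ x1 * f_T + x2 * f_S + C."""
    rows = [[t, s, 1.0] for t, s in zip(f_t, f_s)]
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, observed)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


def fuse(f_t, f_s, weights):
    """Apply the fitted regression to a pair of forecast series."""
    x1, x2, c = weights
    return [x1 * t + x2 * s + c for t, s in zip(f_t, f_s)]


# Made-up forecasts and observations, for illustration only.
f_t = [1.0, 2.0, 3.0, 4.0]
f_s = [2.0, 1.0, 4.0, 3.0]
observed = [2.0, 2.25, 3.5, 3.75]
weights = fit_fusion(f_t, f_s, observed)
combined = fuse(f_t, f_s, weights)
```

The weights are fitted on historical data and then applied to new pairs of forecasts from the two models.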

Performance Analysis
- Compared STIFF to pure time series ($C_{TS}$) and pure ANN ($C_{ANN}$)
- Data starting at 10/01/75
- 30, 60, 120 days
- Normalized Absolute Ratio Error (NARE)

Forecasting result
- The forecasting comparison result, measured in NARE, is outlined in the following table. The other two models, built to the best of our knowledge, are used for comparison with STIFF.
- Here “Over” means overestimation and “Under” means underestimation.

Result 30 Days

Conclusion
- STIFF has better forecast accuracy than the single time series model and the ANN model, and is more balanced (over- vs. underestimation).
- Compared with other related work, it avoids oversimplification.
- It does not have the large variation problem.
- STIFF requires much human intervention and interpretation.
- STIFF is promising for future research.

Future work
- Extend to multivariate forecasting
  - So far, it can only deal with univariate forecasting.
- Use more sophisticated fusing techniques
- Test on more flood data
- Compare to other techniques
- Examine different ANN structures
- Extend to other application domains
- …
