Bangladesh Short-term Discharge Forecasting

Presentation transcript:

Bangladesh Short-term Discharge Forecasting: time series forecasting. Tom Hopson. A project supported by USAID.

Forecasting Probabilities
[Figure: rainfall probability and discharge probability forecast plots. Axes: Rainfall [mm], Discharge [m^3/s]. Above-danger-level probability 36%. Greater than climatological seasonal risk?]

Data-Based Modeling: Linear Transfer Function Approach (used for the lumped model)

For a single linear store S, the mass balance is dS/dt = u - Q, and the storage relation is Q = S/T; combining the two gives T dQ/dt = u - Q.

For a catchment composed of linear stores in series and in parallel (using finite differences):

Q_t = a1*u_{t-1} + a2*u_{t-2} + ... + a_m*u_{t-m} + b1*Q_{t-1} + b2*Q_{t-2} + ... + b_n*Q_{t-n}

where u is effective catchment-averaged rainfall, derived from the non-linear rainfall filter u_t = (Q_t)^c * R_t.

Reference: Beven, 2000
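The single linear store above can be sketched numerically: discretising T dQ/dt = u - Q with a forward difference gives exactly the first-order transfer-function form Q_t = a1*u_{t-1} + b1*Q_{t-1}. A minimal illustration (the time constant T and input u below are made-up values, not catchment parameters):

```python
def linear_store_step(q_prev, u_prev, T, dt=1.0):
    """One finite-difference step of T dQ/dt = u - Q.
    Discretising gives Q_t = a1*u_{t-1} + b1*Q_{t-1} with
    a1 = dt/T and b1 = 1 - dt/T: a first-order transfer function."""
    a1 = dt / T
    b1 = 1.0 - dt / T
    return a1 * u_prev + b1 * q_prev

# With constant input u, the store relaxes toward Q = u (steady state
# of the mass balance). T = 4 days is an illustrative time constant.
q = 0.0
for _ in range(200):
    q = linear_store_step(q, u_prev=5.0, T=4.0)
print(q)  # approaches 5.0
```

Higher-order forms (more a's and b's) arise when several such stores are combined in series and parallel.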

Linear Transfer Function Approach (cont.)

For a 3-day forecast, say:

Q_{t+3} = a1*u_{t+2} + a2*u_{t+1} + ... + a_m*u_{t-m} + b1*Q_{t-1} + b2*Q_{t-2} + ... + b_n*Q_{t-n}

Our approach: for each day and forecast lead, use the AIC (Akaike information criterion) to optimize the a's, m, the b's, n, c, and the precipitation smoothing. Residuals (model biases) are then corrected using an ARMA (auto-regressive moving average) model, which is readily available in R.
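Once the coefficients are chosen, the 3-day forecast above is just a weighted sum of forecast rainfall and past discharge. A minimal sketch in Python; the series and coefficient values are hypothetical stand-ins, not AIC-fitted values:

```python
def forecast_3day(u, q, a, b, t):
    """Sketch of the 3-day-ahead transfer-function forecast
    Q_{t+3} = a1*u_{t+2} + a2*u_{t+1} + ... + b1*Q_{t-1} + ...
    u[t+1] and u[t+2] must come from a rainfall forecast."""
    rain = sum(a[k] * u[t + 2 - k] for k in range(len(a)))  # a1 pairs with u_{t+2}
    flow = sum(b[j] * q[t - 1 - j] for j in range(len(b)))  # b1 pairs with Q_{t-1}
    return rain + flow

# Illustrative (hypothetical) series and coefficients:
u = [1.0] * 10   # effective rainfall, including forecast values beyond t
q = [2.0] * 10   # observed discharge up to time t
print(forecast_3day(u, q, a=[0.5, 0.25], b=[0.3], t=5))  # 0.75 + 0.6 = 1.35
```

In practice the coefficients, the orders m and n, and the rainfall-filter exponent c would all be selected jointly by minimising the AIC, as the slide describes.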

Autoregressive Integrated Moving Average (ARIMA)

In time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalisation of an autoregressive moving average (ARMA) model. These models are fitted to time series data either to better understand the data or to predict future points in the series. They are applied in cases where the data show evidence of non-stationarity, where an initial differencing step (corresponding to the "integrated" part of the model) can be applied to remove the non-stationarity. The model is generally referred to as ARIMA(p, d, q), where p, d, and q are non-negative integers giving the order of the autoregressive, integrated, and moving average parts of the model, respectively. ARIMA models form an important part of the Box-Jenkins approach to time-series modelling.

Reference: Chatfield, C. (1996). The Analysis of Time Series: An Introduction. Texts in Statistical Science.
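The differencing step that turns an ARMA model into ARIMA can be shown in a few lines. A minimal sketch: one pass of differencing removes a linear trend (leaving a constant series), and a second pass removes that constant too:

```python
def difference(x, d=1):
    """Apply d-th order differencing (the 'I' in ARIMA(p, d, q)):
    each pass replaces the series with its successive differences
    x[i] - x[i-1], removing trends that cause non-stationarity."""
    for _ in range(d):
        x = [x[i] - x[i - 1] for i in range(1, len(x))]
    return x

trend = [2 * t for t in range(6)]   # [0, 2, 4, 6, 8, 10]: a linear trend
print(difference(trend, d=1))       # [2, 2, 2, 2, 2]: constant, stationary
print(difference(trend, d=2))       # [0, 0, 0, 0]
```

An ARMA(p, q) model is then fitted to the differenced series; forecasts are mapped back by undoing the differencing (cumulative summation).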

Semi-distributed Model

A 2-layer model for the soil moisture states S1 and S2. Parameters (time constants t_s1, tp, t_s2; reservoir depths r_s1, r_s2) are estimated from the FAO Soil Map of the World. The equations are solved on a 6-hour time step (for daily 0Z discharge) using a 4th-order Runge-Kutta semi-implicit scheme.
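For illustration, a plain explicit 4th-order Runge-Kutta step for a system of states is sketched below. The slide's solver is a semi-implicit variant and its right-hand side involves the t_s1, tp, t_s2, r_s1, r_s2 parameters, which are not specified here, so this shows only the integrator family, not the actual model equations:

```python
import math

def rk4_step(f, y, t, dt):
    """One classical explicit 4th-order Runge-Kutta step for the
    system dy/dt = f(t, y), where y is a list of states (e.g. the
    two soil-moisture stores S1, S2)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# Check against linear decay dS/dt = -S/T, exact solution S0*exp(-t/T):
T = 4.0
(s,) = rk4_step(lambda t, y: [-y[0] / T], [1.0], 0.0, 0.25)
print(abs(s - math.exp(-0.25 / T)))  # tiny: RK4 local error ~ dt^5
```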

Model selection -- Akaike information criterion

Akaike's information criterion (AIC), developed by Hirotugu Akaike under the name "an information criterion" in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. It is grounded in the concept of entropy, in effect offering a relative measure of the information lost when a given model is used to describe reality; it can be said to describe the trade-off between bias and variance in model construction, or, loosely speaking, between the precision and the complexity of the model.

The AIC is not a test of the model in the sense of hypothesis testing; rather, it is a tool for model selection. Given a data set, several competing models may be ranked according to their AIC, with the one having the lowest AIC being the best. From the AIC values one may infer, for example, that the top three models are in a near tie and the rest are far worse, but one should not assign a threshold above which a given model is 'rejected'.

Model selection -- AIC and BIC

Akaike information criterion: AIC = 2k - 2 ln(L)
where k = number of model parameters and L = the maximized value of the likelihood function (for a least-squares fit, computed from the sum of squared errors).

Bayesian information criterion: BIC = ln(n) k - 2 ln(L)
where n = number of data points.

The BIC penalty function is more demanding than the AIC's.
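Under the Gaussian-error assumption, both criteria can be computed directly from the squared-error sum, since -2 ln(L) then equals n ln(SSE/n) up to an additive constant that cancels when comparing models on the same data. A minimal sketch; the SSE, k, and n values are hypothetical:

```python
import math

def aic_bic(sse, k, n):
    """AIC = 2k - 2 ln(L) and BIC = ln(n) k - 2 ln(L), using the
    Gaussian least-squares identity -2 ln(L) = n ln(SSE/n) + const."""
    neg2_loglik = n * math.log(sse / n)
    return 2 * k + neg2_loglik, math.log(n) * k + neg2_loglik

# Hypothetical fits: the larger model lowers the SSE a little, but its
# 3 extra parameters cost 2*3 = 6 under AIC and 3*ln(100) ~ 13.8 under BIC.
aic2, bic2 = aic_bic(sse=10.0, k=2, n=100)
aic5, bic5 = aic_bic(sse=9.3, k=5, n=100)
print(aic5 < aic2, bic5 > bic2)  # True True: AIC keeps the extra terms, BIC does not
```

This is the sense in which the BIC penalty is "more demanding": for n >= 8, ln(n) > 2, so each added parameter costs more under BIC than under AIC.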

Model selection -- Cross-validation

Cross-validation is the most robust approach, but also the most computationally demanding. Set aside part of the data for testing and 'train' on the remainder; it is best to cycle through the folds so that all of the data are eventually used for testing. For example, dividing the data in halves (the minimum) already requires 2x the computation.
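The train/test cycling described above can be sketched as a k-fold index splitter (indices only; the model fitting itself is omitted):

```python
def kfold_splits(n, k):
    """Yield (train, test) index lists for k-fold cross-validation:
    each fold is held out once for testing while the model is
    'trained' on the remainder, so every point is used for testing."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# Dividing 6 points into halves (k = 2, the minimum) means fitting the
# model twice -- hence roughly 2x the computation:
for train, test in kfold_splits(6, 2):
    print(train, test)
```

Note that for time series such as discharge records, contiguous blocks (or a rolling-origin scheme) are usually preferred over random folds, so the training set does not leak information from the future into the past.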