1 Bangladesh Short-term Discharge Forecasting: time series forecasting. Tom Hopson. A project supported by USAID

2 Forecasting Probabilities -- [Figure: forecast probability distributions of rainfall (mm) and discharge (m^3/s); probability of exceeding the danger level: 36%.] Greater than the climatological seasonal risk?
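The 36% on the slide is an exceedance probability; with an ensemble forecast it can be estimated simply as the fraction of ensemble members above the danger level. A minimal R sketch, with made-up ensemble values and a made-up danger level:

# Estimate P(discharge > danger level) from an ensemble forecast.
# The ensemble values and the threshold below are illustrative assumptions.
set.seed(1)
q_ensemble   <- rnorm(51, mean = 48000, sd = 4000)   # hypothetical ensemble [m^3/s]
danger_level <- 50000                                # hypothetical danger level [m^3/s]
p_exceed     <- mean(q_ensemble > danger_level)      # fraction of members above the level
p_exceed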

3

4 Data-Based Modeling -- Linear Transfer Function Approach
Linear store: Q = S/T; mass balance: dS/dt = u - Q; combine to get T dQ/dt = u - Q.
For a catchment composed of linear stores in series and in parallel (using finite differences):
Q_t = a_1 u_{t-1} + a_2 u_{t-2} + … + a_m u_{t-m} + b_1 Q_{t-1} + b_2 Q_{t-2} + … + b_n Q_{t-n}
where u is effective catchment-averaged rainfall, derived from the non-linear rainfall filter u_t = (Q_t)^c R_t.
Reference: Beven, 2000 (used for the lumped model)
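As an illustration of how the a's and b's might be estimated, the sketch below builds lagged predictors from synthetic rainfall and discharge and fits them by ordinary least squares. The data, the orders m and n, and the exponent c are all placeholders, not the values used in the project.

# Fit Q_t = a_1 u_{t-1} + ... + a_m u_{t-m} + b_1 Q_{t-1} + ... + b_n Q_{t-n},
# with u_t = (Q_t)^c R_t, on synthetic data (illustrative only).
set.seed(42)
R <- pmax(rnorm(400, 5, 5), 0)                                    # synthetic daily rainfall [mm]
Q <- as.numeric(stats::filter(R, 0.8, method = "recursive")) + 1  # synthetic discharge
c_exp <- 0.5
u <- (Q^c_exp) * R                                                # non-linear effective rainfall filter

m <- 3; n <- 2
lagmat <- function(x, lags) sapply(lags, function(k) c(rep(NA, k), head(x, -k)))
X <- cbind(lagmat(u, 1:m), lagmat(Q, 1:n))                        # u_{t-1..t-m}, Q_{t-1..t-n}
colnames(X) <- c(paste0("u_lag", 1:m), paste0("Q_lag", 1:n))

fit <- lm(Q ~ X - 1)                                              # least-squares a's and b's
coef(fit)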

5 Linear Transfer Function Approach (cont.)
For a 3-day forecast, say:
Q_{t+3} = a_1 u_{t+2} + a_2 u_{t+1} + … + a_m u_{t-m} + b_1 Q_{t-1} + b_2 Q_{t-2} + … + b_n Q_{t-n}
Our approach: for each day and forecast, use the AIC (Akaike information criterion) to optimize the a's, m, the b's, n, c, and the precipitation smoothing.
Residuals (model biases) are then corrected using an ARMA (auto-regressive moving average) model => something available in R.
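The residual-correction step maps onto base R's arima() function. A minimal sketch, assuming the transfer-function residuals are already available and that an ARMA(1,1) is adequate (in practice the orders would themselves be chosen by AIC); the series and the raw forecast value are stand-ins:

# Fit an ARMA(1,1) to the model residuals and use its forecast to correct
# a 3-day-ahead discharge forecast (stand-in values, illustrative only).
set.seed(7)
resid_series    <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)  # stand-in residuals
q_forecast_3day <- 42000                                                 # stand-in raw forecast [m^3/s]

arma_fit   <- arima(resid_series, order = c(1, 0, 1))   # ARMA(1,1) = ARIMA(1,0,1)
bias_ahead <- predict(arma_fit, n.ahead = 3)$pred[3]    # expected residual 3 days out
q_forecast_3day + bias_ahead                            # bias-corrected forecast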

6 Autoregressive integrated moving average (ARIMA)
In time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalisation of an autoregressive moving average (ARMA) model. These models are fitted to time series data either to better understand the data or to predict future points in the series. They are applied in some cases where the data show evidence of non-stationarity, where an initial differencing step (corresponding to the "integrated" part of the model) can be applied to remove the non-stationarity. The model is generally referred to as an ARIMA(p,d,q) model, where p, d, and q are integers greater than or equal to zero giving the order of the autoregressive, integrated, and moving average parts of the model respectively. ARIMA models form an important part of the Box-Jenkins approach to time-series modelling.
Reference: Chatfield, C. (1996). The Analysis of Time Series: An Introduction. Texts in Statistical Science.
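A short R illustration of the d = 1 differencing idea: the series below is non-stationary (a random-walk-like trend), and fitting an ARIMA(1,1,1) lets the model difference it internally. The series and the orders are arbitrary choices for the example.

# ARIMA(p,d,q) on a non-stationary series; d = 1 is the "integrated" differencing step.
set.seed(123)
x <- cumsum(arima.sim(model = list(ar = 0.5), n = 250))   # non-stationary series

fit <- arima(x, order = c(1, 1, 1))    # AR order 1, one difference, MA order 1
fit
predict(fit, n.ahead = 5)$pred         # forecast the next five points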

7 Semi-distributed Model
-- 2-layer model for soil moisture states S1, S2
-- Parameters to be estimated from the FAO soil map of the world
-- Solved with a 6-hr time-step (for daily 0Z discharge) using a 4th-order Runge-Kutta semi-implicit scheme
-- t_s1, tp, t_s2: time constants; r_s1, r_s2: reservoir depths
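The transcript does not preserve the state equations themselves, so the sketch below is only an illustration of a 4th-order Runge-Kutta integration at a 6-hour step, applied to a generic two-store linear-reservoir system; the dynamics, time constants, and forcing are placeholders, and an explicit (not semi-implicit) RK4 is used for simplicity.

# Illustrative RK4 integration of a two-layer linear-store system, 6-hour step.
# Placeholder dynamics: dS1/dt = P - S1/t_s1 ; dS2/dt = S1/t_s1 - S2/t_s2
deriv <- function(S, P, t_s1 = 20, t_s2 = 60) {        # assumed time constants [days]
  c(P - S[1] / t_s1,
    S[1] / t_s1 - S[2] / t_s2)
}
rk4_step <- function(S, P, dt) {                       # classical 4th-order Runge-Kutta
  k1 <- deriv(S, P)
  k2 <- deriv(S + dt / 2 * k1, P)
  k3 <- deriv(S + dt / 2 * k2, P)
  k4 <- deriv(S + dt * k3, P)
  S + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}
dt <- 0.25                                             # 6 hours, in days
P  <- rep(c(5, 0, 0, 0), 50)                           # synthetic 6-hourly rainfall forcing
S  <- c(0, 0)                                          # initial states S1, S2
for (i in seq_along(P)) S <- rk4_step(S, P[i], dt)
S                                                      # states after 50 days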

8 Model selection -- Akaike information criterion
Akaike's information criterion, developed by Hirotsugu Akaike under the name "an information criterion" (AIC) in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. It is grounded in the concept of entropy, in effect offering a relative measure of the information lost when a given model is used to describe reality, and can be said to describe the tradeoff between bias and variance in model construction, or, loosely speaking, between the precision and the complexity of the model. The AIC is not a test of the model in the sense of hypothesis testing; rather, it is a tool for model selection. Given a data set, several competing models may be ranked according to their AIC, with the one having the lowest AIC being the best. From the AIC values one may infer, for example, that the top three models are roughly tied and the rest are far worse, but one should not assign a threshold value above which a given model is 'rejected'.

9 Model selection -- Akaike information criterion
AIC = 2k - 2 ln(L)
k = number of model parameters; L = maximized value of the likelihood (for least-squares fitting, related to the sum of squared errors)
Bayesian information criterion
BIC = ln(n) k - 2 ln(L)
n = number of data points
The BIC penalty term is more demanding than the AIC's (it grows with n).
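In R these quantities can be computed directly from a fitted model's log-likelihood and checked against the built-in AIC() and BIC() functions; the regression below is a made-up example.

# AIC and BIC from the definitions, checked against R's built-ins (simulated data).
set.seed(11)
x   <- rnorm(200); y <- 2 * x + rnorm(200)
fit <- lm(y ~ x)

k    <- length(coef(fit)) + 1          # parameters, including the residual variance
logL <- as.numeric(logLik(fit))        # maximized log-likelihood
n    <- length(y)

c(aic_manual = 2 * k - 2 * logL, aic_builtin = AIC(fit))
c(bic_manual = log(n) * k - 2 * logL, bic_builtin = BIC(fit))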

10 Model selection -- Cross-validation
-- The most robust approach, but the most computationally demanding!
-- Set aside part of the data for testing and 'train' on the other part; best to cycle through so that all the data are used for testing.
e.g. If the data are divided into halves (the minimum), then 2X the computations are required!
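A minimal sketch of the 'divide in halves' case in R, using simulated data: fit on one half, score on the other, then swap.

# Two-fold cross-validation on simulated data (illustrative only).
set.seed(3)
n <- 200
x <- rnorm(n); y <- 3 * x + rnorm(n)
folds <- rep(1:2, length.out = n)

cv_err <- sapply(1:2, function(f) {
  train <- folds != f
  fit   <- lm(y ~ x, subset = train)                        # train on one half
  pred  <- predict(fit, newdata = data.frame(x = x[!train]))
  sum((y[!train] - pred)^2)                                 # test on the held-out half
})
sum(cv_err)   # total out-of-sample squared error over both folds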

11 Model selection -- R commands
Fitting an auto-regressive model:
> zz <- ar(x, order.max = 100, method = "yule-walker")
Fitting an ARIMA model:
> zz <- arima(x, order = c(p, d, q))
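A quick illustration of these two commands on a simulated series (the simulated AR coefficients and the chosen orders are arbitrary):

# Both fitting commands on a simulated AR(2) series.
set.seed(5)
x <- arima.sim(model = list(ar = c(0.5, 0.2)), n = 500)

zz_ar    <- ar(x, order.max = 100, method = "yule-walker")  # order selected by AIC internally
zz_arima <- arima(x, order = c(2, 0, 0))                    # ARIMA(2,0,0), i.e. an AR(2)

zz_ar$order                          # order chosen by ar()
predict(zz_arima, n.ahead = 3)$pred  # 3-step-ahead forecast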

12 Model selection -- Try this out in R! (a sketch follows below)
1) Create three random normal vectors of data of length 200; call them p, r1, r2.
2) Create a 4th vector such that q = 10*p + r1.
3) Set aside ½ of the data in each vector.
4) Using linear regression, solve for q as a function of p using the first ½ of the data, and calculate the square error.
5) Next, again using linear regression, solve for q as a function of p and r2 using the first ½ of the data, and calculate the square error.
6) Compare the square errors of 4) and 5). Which one did you expect to be smaller?
7) Next, calculate the AIC of 4) and 5). Which one did you expect to be smaller?
8) Finally, with the coefficients determined in steps 4) and 5), estimate q for each model using the other ½ of the data, and calculate the square error.
9) Comparing the square errors, which one is smaller? Does that make sense?
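A sketch of one way to work through the exercise (results will vary with the random seed):

# Steps 1-9 of the exercise; one possible realisation.
set.seed(2024)
n <- 200
p <- rnorm(n); r1 <- rnorm(n); r2 <- rnorm(n)     # step 1
q <- 10 * p + r1                                  # step 2
idx <- 1:(n / 2)                                  # step 3: first half for fitting

fit1 <- lm(q ~ p,      subset = idx)              # step 4
fit2 <- lm(q ~ p + r2, subset = idx)              # step 5: adds an irrelevant predictor
sum(resid(fit1)^2); sum(resid(fit2)^2)            # step 6: in-sample square errors

AIC(fit1); AIC(fit2)                              # step 7

pred1 <- predict(fit1, newdata = data.frame(p = p[-idx]))
pred2 <- predict(fit2, newdata = data.frame(p = p[-idx], r2 = r2[-idx]))
sum((q[-idx] - pred1)^2); sum((q[-idx] - pred2)^2)  # steps 8-9: out-of-sample square errors

Typically the larger model's in-sample error is slightly smaller, since adding a predictor can only reduce it, while its AIC and its out-of-sample error are usually worse: the extra parameter fits noise, which is the point of the exercise.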

