Statistical Forecasting

Presentation on theme: "Statistical Forecasting". Presentation transcript:

1 Statistical Forecasting
Jan Verkade November 3, 2016

2 Statistical Forecasting = forecasting from data
What does that mean? What other types of forecasting do you know?

3 Regression analysis
Regression analysis: predicting future values of a variable using information about other variables.
Predictand: the variable that you want to forecast.
Predictor: the variable that you use as input.
What we hope to find is that the different variables do not vary independently (in a statistical sense), but that they tend to vary together.
We assume that the future will behave like the past.

4 Regression models
A predictand may depend on predictor(s) in varying ways:
y ~ x
y ~ a + bx
y ~ x^2
…

5 The linear (regression) model
$Y_t = b_0 + b_1 X_{1t} + b_2 X_{2t} + \dots + b_k X_{kt}$
The prediction for Y is a straight-line function of each of the X variables.
The contributions of different X variables to the prediction are additive.
Slopes b1, b2, etc.: the coefficients of the variables.
Intercept: b0.
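A minimal sketch of how the linear model turns predictor values into a prediction. The function name `predict_linear` and the coefficient values are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
def predict_linear(b0, slopes, x_values):
    """Return b0 + b1*x1 + ... + bk*xk for a single time step."""
    if len(slopes) != len(x_values):
        raise ValueError("need exactly one slope per predictor")
    # Each predictor contributes its own slope * value term; the terms add up.
    return b0 + sum(b * x for b, x in zip(slopes, x_values))


# Hypothetical coefficients: intercept b0 and slopes b1, b2.
b0 = 1.0
slopes = [2.0, -0.5]

y_hat = predict_linear(b0, slopes, [3.0, 4.0])
# Raising the first predictor by one unit changes the prediction by b1 only.
y_hat_shifted = predict_linear(b0, slopes, [4.0, 4.0])
```

Because the contributions are additive, the difference `y_hat_shifted - y_hat` equals the first slope regardless of the value of the second predictor.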

6 Justification of linear model for regression assumptions
Why should we assume that relationships between variables are linear?
Because linear relationships are the simplest non-trivial relationships that can be imagined (hence the easiest to work with), and...
Because the "true" relationships between our variables are often at least approximately linear over the range of values that are of interest to us, and...
Even if they're not, we can often transform the variables in such a way as to linearize the relationships.
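As an illustration of the last point, the sketch below assumes a hypothetical exponential relationship y = a * exp(c * x). Taking the logarithm of y gives log(y) = log(a) + c * x, which is linear in x and can be fitted as a straight line. The data, the constants and the helper `fit_line` are assumptions made for this example only.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical, noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Transform the predictand, then fit a straight line in the transformed space.
b0, b1 = fit_line(xs, [math.log(y) for y in ys])

a_est = math.exp(b0)  # back-transform the intercept to recover a
c_est = b1            # the slope in log space is the exponent c
```

With real, noisy data the recovered constants would only be approximate, and a prediction made in log space has to be back-transformed with `math.exp` before it is used.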

7 Fitting a linear model
We fit a linear model through an objective function: minimise the mean squared error (MSE).
Steps:
Standardize variables: convert them to units of standard-deviations-from-the-mean
Calculate the average product of the standardized values
Minimize the mean squared error
Substitute, re-arrange and solve for b0 and b1

8 Fitting a linear model
Standardize variables: convert them to units of standard-deviations-from-the-mean
$X_t^* = \frac{X_t - \mathrm{mean}(X)}{\mathrm{stdev}(X)}$
$Y_t^* = \frac{Y_t - \mathrm{mean}(Y)}{\mathrm{stdev}(Y)}$

9 Fitting a linear model
Standardize variables: convert them to units of standard-deviations-from-the-mean
Calculate the average product of the standardized values
$r_{XY} = \frac{1}{n}\left(X_1^* Y_1^* + X_2^* Y_2^* + \dots + X_n^* Y_n^*\right)$

10 Fitting a linear model
Standardize variables: convert them to units of standard-deviations-from-the-mean
Calculate the average product of the standardized values
Minimize the mean squared error
$Y_t^* = r_{XY} X_t^*$

11 Fitting a linear model
Standardize variables: convert them to units of standard-deviations-from-the-mean
Calculate the average product of the standardized values
Minimize the mean squared error
Substitute, re-arrange and solve for b0 and b1
$Y_t^* = r_{XY} X_t^*$
$\frac{Y_t - \mathrm{mean}(Y)}{\mathrm{stdev}(Y)} = r_{XY} \frac{X_t - \mathrm{mean}(X)}{\mathrm{stdev}(X)}$
$Y_t = b_0 + b_1 X_{1t}$
$b_1 = r_{XY} \frac{\mathrm{stdev}(Y)}{\mathrm{stdev}(X)}$
$b_0 = \mathrm{mean}(Y) - b_1 \, \mathrm{mean}(X)$
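The four steps can be sketched in a few lines of Python. The data below are hypothetical, and the population standard deviation (dividing by n) is an assumption made so that the 1/n average product equals the usual correlation coefficient; the function name `fit_by_standardizing` is likewise only illustrative.

```python
from statistics import mean, pstdev


def fit_by_standardizing(xs, ys):
    """Return (r_xy, b0, b1) following the steps on the slides."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    # Population standard deviations (divide by n), matching the 1/n below.
    sd_x, sd_y = pstdev(xs), pstdev(ys)

    # Step 1: standardize both variables.
    x_star = [(x - mean_x) / sd_x for x in xs]
    y_star = [(y - mean_y) / sd_y for y in ys]

    # Step 2: the average product of standardized values is the correlation.
    r_xy = sum(a * b for a, b in zip(x_star, y_star)) / n

    # Steps 3 and 4: Y* = r_xy * X* minimizes the MSE in standardized units;
    # substituting the definitions back gives the slope and the intercept.
    b1 = r_xy * sd_y / sd_x
    b0 = mean_y - b1 * mean_x
    return r_xy, b0, b1


# Hypothetical data, not taken from the presentation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r_xy, b0, b1 = fit_by_standardizing(xs, ys)
```

The slope b1 does not depend on whether the population or the sample standard deviation is used, because the same divisor appears in both stdev(Y) and stdev(X); only the intermediate value of r_xy is affected by that choice.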

12 Exercise: piezometric head within a levee

13 Exercise: piezometric head within a levee
river water level water pressure sensor

14 Exercise: piezometric head within a levee
Use voorhavendijk.xls
Explore the data by building a scatter (x,y) plot
Determine mean and standard deviations
Determine standardized values; then explore…
marginal distributions (ecdf of either variable)
joint distribution (scatter plot)
Determine the coefficient of correlation
Determine the coefficients of the regression equation
Verify by using Excel's built-in function to show the regression line

15 Exercise: piezometric head within a levee

16 Exercise: piezometric head within a levee
Discuss: is the linear model a good model?

17 Exercise: piezometric head within a levee
How to use / interpret the regression line?

18 Exercise: piezometric head within a levee
Use voorhavendijk.xls
Explore the data by building a scatter (x,y) plot
Determine mean and standard deviations
Determine standardized values; then explore…
marginal distributions (ecdf of either variable)
joint distribution (scatter plot)
Determine the coefficient of correlation
Determine the coefficients of the regression equation
Verify by using Excel's built-in function to show the regression line
Explore the residuals by plotting an empirical cumulative distribution function. What is the mean value? How are the residuals distributed?
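The residual step of the exercise can also be sketched outside Excel. The numbers below are hypothetical stand-ins, not the contents of voorhavendijk.xls, and the helpers `fit_line` and `ecdf` are assumptions made for this illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return mean_y - b1 * mean_x, b1


def ecdf(values):
    """Return (value, cumulative fraction) pairs of the empirical CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical river levels (predictor) and piezometric heads (predictand).
river_level = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
piezo_head = [0.62, 0.81, 1.05, 1.18, 1.43, 1.60]

b0, b1 = fit_line(river_level, piezo_head)

# Residual = observed value minus the value on the regression line.
residuals = [y - (b0 + b1 * x) for x, y in zip(river_level, piezo_head)]
mean_residual = sum(residuals) / len(residuals)
residual_ecdf = ecdf(residuals)
```

For a least-squares line with an intercept, the residuals average to zero up to floating-point rounding, so a reported mean of order e-18 is effectively zero. Plotting the pairs in `residual_ecdf` shows how the residuals are distributed around that mean.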

19 LM-model: residuals

20 LM-model: residuals mean: e-18 stdev:

21 Exercise: piezometric head within a levee
How to use / interpret the regression line?

22 Forecasting errors
Intrinsic risk: signal v noise
Parameter risk: uncertain parameter values
Model risk: the risk of choosing the wrong model (linear model v quadratic model, for example)

23 Confidence Intervals v Prediction Intervals
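As a reminder of the distinction, the standard textbook expressions for simple linear regression are given below. They assume independent, normally distributed errors with constant variance; the symbols $X_0$ (the predictor value at which we predict), $\hat{Y}_0 = b_0 + b_1 X_0$, $s$ (residual standard error) and $t_{n-2,\,1-\alpha/2}$ (Student-t quantile) are notation introduced here, not taken from the slides.

```latex
% Residual standard error
s^2 = \frac{1}{n-2} \sum_{t=1}^{n} \left( Y_t - b_0 - b_1 X_t \right)^2

% Confidence interval: uncertainty about the regression line (mean response) at X_0
\hat{Y}_0 \pm t_{n-2,\,1-\alpha/2} \; s \,
  \sqrt{ \frac{1}{n} + \frac{\left( X_0 - \mathrm{mean}(X) \right)^2}{\sum_{t=1}^{n} \left( X_t - \mathrm{mean}(X) \right)^2} }

% Prediction interval: uncertainty about a single new observation at X_0
\hat{Y}_0 \pm t_{n-2,\,1-\alpha/2} \; s \,
  \sqrt{ 1 + \frac{1}{n} + \frac{\left( X_0 - \mathrm{mean}(X) \right)^2}{\sum_{t=1}^{n} \left( X_t - \mathrm{mean}(X) \right)^2} }
```

The extra 1 under the square root accounts for the noise in the new observation itself (intrinsic risk), so the prediction interval is always wider than the confidence interval, which reflects parameter risk only.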

24 An alternative statistical technique: Quantile Regression
Principles:
QR is a method for describing conditional quantiles
Rather than minimising the mean squared error (MSE), QR is based on minimising the mean absolute error (MAE)
This yields not the sample mean but the sample median
Other quantiles may be derived by adding weights to the errors
E.g. weight = 0.1 for positive errors and 0.9 for negative errors, which targets the 0.1 quantile when the error is defined as observed minus predicted
Fitting models may be done in transformed space to account for heteroscedasticity
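The weighting idea can be illustrated with the simplest possible model, a single constant, in which case the minimizer of the weighted absolute error is a sample quantile. This is a sketch under that simplifying assumption, not a full quantile regression with predictors; the names `pinball_loss` and `best_constant` and the data are hypothetical.

```python
def pinball_loss(errors, tau):
    """Mean weighted absolute error, with error = observed - predicted.

    Positive errors get weight tau, negative errors get weight (1 - tau).
    """
    return sum(tau * e if e >= 0 else (tau - 1) * e for e in errors) / len(errors)


def best_constant(observations, tau):
    """Constant prediction that minimizes the pinball loss.

    The loss is piecewise linear and convex in the constant, so a minimum
    is attained at one of the observed values; we simply try them all.
    """
    return min(
        observations,
        key=lambda c: pinball_loss([y - c for y in observations], tau),
    )


# Hypothetical observations: the values 1.0, 2.0, ..., 19.0.
obs = [float(v) for v in range(1, 20)]

median_fit = best_constant(obs, 0.5)  # equal weights: mean absolute error
low_fit = best_constant(obs, 0.1)     # weight 0.1 positive, 0.9 negative errors
high_fit = best_constant(obs, 0.9)    # weight 0.9 positive, 0.1 negative errors
```

With equal weights the best constant is the median of the observations; shifting the weights moves the fitted value towards the lower or upper tail. In a real quantile regression the constant is replaced by a function of the predictors and the same loss is minimized over its coefficients.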

25

26

27 Application in real-time hydrologic forecasting: post-processing
Ensemble techniques
Post-processing techniques

28 Application in real-time hydrologic forecasting: post-processing
Once a record of forecasts is in place
This record can be analysed for 'forecast errors'
And these errors can be assumed to occur in future forecasts also

29 1. Find a relationship between forecast and observations
5 December 2017

30 2. Apply that relation to new forecasts
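The two steps can be sketched as follows, using a linear regression of observations on past forecasts as the relationship. The archive values and the helper names are hypothetical; an operational post-processor could equally use quantile regression or another technique.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return mean_y - b1 * mean_x, b1


def mse(predictions, observations):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(predictions, observations)) / len(observations)


# Step 1: a hypothetical archive of raw forecasts and matching observations.
past_forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
past_observations = [1.3, 2.1, 3.4, 4.2, 5.5]
b0, b1 = fit_line(past_forecasts, past_observations)


def post_process(raw_forecast):
    """Step 2: apply the fitted relationship to a new raw forecast."""
    return b0 + b1 * raw_forecast


new_raw_forecast = 6.0
corrected_forecast = post_process(new_raw_forecast)

# On the archive itself, the corrected forecasts cannot do worse than the raw ones.
raw_mse = mse(past_forecasts, past_observations)
corrected_mse = mse([post_process(f) for f in past_forecasts], past_observations)
```

The improvement on the archive is guaranteed by construction, so it says little about future skill; the relationship should be checked on forecasts that were not used to fit it.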

31 And here's your forecast

32 Famous forecasting quotes
"I have seen the future and it is very much like the present, only longer." --Kehlog Albran, The Profit
→ A pretty concise description of statistical forecasting:
We search for statistical properties of a time series that are constant in time (levels, trends, seasonal patterns, correlations and autocorrelations, etc.)
We then predict that those properties will describe the future as well as the present

33 Famous forecasting quotes
"Prediction is very difficult, especially if it's about the future." --Niels Bohr, Nobel laureate in Physics
This is a warning of the importance of validating a forecasting model out-of-sample. It's often easy to find a model that fits the past data well (perhaps too well!) but quite another matter to find a model that correctly identifies those patterns in the past data that will continue to hold in the future.

