Statistical Forecasting


Statistical Forecasting Jan Verkade November 3, 2016

Statistical Forecasting = forecasting from data
What does that mean?
What other types of forecasting do you know?

Regression analysis
Regression analysis: predicting future values of a variable using information about other variables.
Predictand: the variable that you want to forecast.
Predictor: a variable that you use as input.
What we hope to find is that the different variables do not vary independently (in a statistical sense), but that they tend to vary together.
We assume that the future will behave like the past.

Regression models
A predictand may depend on predictor(s) in varying ways: y ~ x, y ~ a + bx, y ~ x^2, …

The linear (regression) model
$Y_t = b_0 + b_1 X_{1t} + b_2 X_{2t} + \dots + b_k X_{kt}$
The prediction for Y is a straight-line function of each of the X variables.
The contributions of the different X variables to the prediction are additive.
Slopes $b_1$, $b_2$, etc.: coefficients of the variables.
Intercept: $b_0$.

Justification of linear model for regression assumptions
Why should we assume that relationships between variables are linear?
Because linear relationships are the simplest non-trivial relationships that can be imagined (hence the easiest to work with), and
because the "true" relationships between our variables are often at least approximately linear over the range of values that are of interest to us, and
even if they're not, we can often transform the variables in such a way as to linearize the relationships.
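The last point, transforming variables to linearize a relationship, can be sketched as follows. This is only an illustration: the data are synthetic, and the exponential form y = a * exp(c * x) with the values of `a_true` and `c_true` is an assumption made for the example, not something taken from the slides.

```python
import numpy as np

# Synthetic data following y = a * exp(c * x); a_true and c_true are made-up values.
a_true, c_true = 2.0, 0.5
x = np.linspace(0.0, 4.0, 9)
y = a_true * np.exp(c_true * x)

# Taking logs gives log(y) = log(a) + c * x, which is linear in x,
# so an ordinary straight-line fit can be applied to (x, log(y)).
c_hat, log_a_hat = np.polyfit(x, np.log(y), 1)
a_hat = np.exp(log_a_hat)

print(f"a = {a_hat:.3f}, c = {c_hat:.3f}")
```

With real, noisy data the fit in log space minimises squared errors of log(y) rather than of y, so the back-transformed curve is not the least-squares curve in the original units.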

Fitting a linear model
We fit a linear model through an objective function: minimise the mean squared error (MSE).
Steps:
1. Standardize variables: convert them to units of standard-deviations-from-the-mean.
2. Calculate the average product of the standardized values.
3. Minimize the mean squared error.
4. Substitute, re-arrange and solve for b0 and b1.

Fitting a linear model
Step 1. Standardize variables: convert them to units of standard-deviations-from-the-mean.
$X_t^* = \dfrac{X_t - \mathrm{mean}(X)}{\mathrm{stdev}(X)}$, $\quad Y_t^* = \dfrac{Y_t - \mathrm{mean}(Y)}{\mathrm{stdev}(Y)}$

Fitting a linear model
Step 2. Calculate the average product of the standardized values.
$r_{XY} = \dfrac{1}{n}\left(X_1^* Y_1^* + X_2^* Y_2^* + \dots + X_n^* Y_n^*\right)$

Fitting a linear model
Step 3. Minimize the mean squared error. In standardized units the minimising line is
$Y_t^* = r_{XY} X_t^*$

Fitting a linear model
Step 4. Substitute, re-arrange and solve for b0 and b1.
Start from $Y_t^* = r_{XY} X_t^*$ and substitute the definitions of the standardized values:
$\dfrac{Y_t - \mathrm{mean}(Y)}{\mathrm{stdev}(Y)} = r_{XY}\,\dfrac{X_t - \mathrm{mean}(X)}{\mathrm{stdev}(X)}$
Re-arrange into the form $Y_t = b_0 + b_1 X_{1t}$ (with a single predictor, $X_{1t} = X_t$), which gives
$b_1 = r_{XY}\,\dfrac{\mathrm{stdev}(Y)}{\mathrm{stdev}(X)}$
$b_0 = \mathrm{mean}(Y) - b_1\,\mathrm{mean}(X)$
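The four steps can be sketched in a few lines of NumPy. This is a minimal illustration with made-up data, not the exercise solution; the function name `fit_by_standardizing` is hypothetical. It assumes the population standard deviation (divide by n), which is what makes the 1/n average product equal to the correlation coefficient.

```python
import numpy as np

def fit_by_standardizing(x, y):
    """Fit y = b0 + b1 * x via standardized values (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: standardize (np.std defaults to the population form, ddof=0).
    x_star = (x - x.mean()) / x.std()
    y_star = (y - y.mean()) / y.std()

    # Step 2: average product of standardized values = correlation coefficient.
    r_xy = np.mean(x_star * y_star)

    # Steps 3 and 4: Y* = r_xy * X*, re-arranged into original units.
    b1 = r_xy * y.std() / x.std()
    b0 = y.mean() - b1 * x.mean()
    return r_xy, b0, b1

# Made-up data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

r_xy, b0, b1 = fit_by_standardizing(x, y)

# Cross-check against NumPy's own least-squares line fit.
slope_ref, intercept_ref = np.polyfit(x, y, 1)
print(r_xy, b0, b1, slope_ref, intercept_ref)
```

If a sample standard deviation (divide by n-1) is used instead, as some spreadsheet functions do, the slope is unchanged because the ratio of standard deviations is the same, but the average product in step 2 must then be divided by n-1 rather than n to remain the correlation coefficient.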

Exercise: piezometric head within a levee

Exercise: piezometric head within a levee
[Figure labels: river water level; water pressure sensor]

Exercise: piezometric head within a levee
Use voorhavendijk.xls
1. Explore the data by building a scatter (x,y) plot.
2. Determine the means and standard deviations.
3. Determine the standardized values; then explore the marginal distributions (ecdf of either variable) and the joint distribution (scatter plot).
4. Determine the coefficient of correlation.
5. Determine the coefficients of the regression equation.
6. Verify by using Excel’s built-in function to show the regression line.

Exercise: piezometric head within a levee

Exercise: piezometric head within a levee Discuss: is the linear model a good model?

Exercise: piezometric head within a levee How to use / interpret the regression line?

Exercise: piezometric head within a levee
Use voorhavendijk.xls
1. Explore the data by building a scatter (x,y) plot.
2. Determine the means and standard deviations.
3. Determine the standardized values; then explore the marginal distributions (ecdf of either variable) and the joint distribution (scatter plot).
4. Determine the coefficient of correlation.
5. Determine the coefficients of the regression equation.
6. Verify by using Excel’s built-in function to show the regression line.
7. Explore the residuals by plotting an empirical cumulative distribution function. What is the mean value? How are the residuals distributed?

LM-model: residuals

LM-model: residuals
mean: -6.16922e-18
stdev: 0.2790263
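For a least-squares line with an intercept, the residuals average to zero by construction, so a mean of about -6e-18 is zero up to floating-point rounding. The sketch below reproduces that behaviour and builds an empirical cumulative distribution function. The numbers are made up and merely stand in for river level and piezometric head; they are not the voorhavendijk.xls data, and the helper `ecdf` is hypothetical.

```python
import numpy as np

def ecdf(values):
    """Return sorted values and their empirical cumulative probabilities i/n."""
    v = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, v.size + 1) / v.size
    return v, p

# Made-up data standing in for river level (x) and piezometric head (y).
x = np.array([0.2, 0.5, 0.9, 1.4, 1.8, 2.3, 2.9])
y = np.array([1.1, 1.2, 1.6, 1.7, 2.2, 2.3, 2.8])

# Least-squares line and its residuals (observed minus fitted).
b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

# Sorted residuals against cumulative probability form the ecdf to plot.
res_sorted, probs = ecdf(residuals)

print("mean residual:", residuals.mean())
print("stdev residual:", residuals.std())
```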

Exercise: piezometric head within a levee How to use / interpret the regression line?

Forecasting errors
Intrinsic risk: signal vs. noise.
Parameter risk: uncertain parameter values.
Model risk: the risk of choosing the wrong model (linear model vs. quadratic model, for example).

Confidence Intervals v Prediction Intervals
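For reference, the usual textbook expressions for simple linear regression make the distinction concrete. The notation below (a new predictor value x_0, residual standard error s, sum of squares S_xx, Student-t quantile) is introduced here for illustration and does not come from the slides; the formulas assume independent, normally distributed errors with constant variance.

```latex
\hat{Y}_0 = b_0 + b_1 x_0, \qquad
s^2 = \frac{1}{n-2}\sum_{t=1}^{n} \left(Y_t - b_0 - b_1 X_t\right)^2, \qquad
S_{xx} = \sum_{t=1}^{n} \left(X_t - \mathrm{mean}(X)\right)^2

\text{Confidence interval for the mean response at } x_0:\quad
\hat{Y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{\left(x_0 - \mathrm{mean}(X)\right)^2}{S_{xx}}}

\text{Prediction interval for a new observation at } x_0:\quad
\hat{Y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{\left(x_0 - \mathrm{mean}(X)\right)^2}{S_{xx}}}
```

The extra 1 under the square root accounts for the noise in the new observation itself, which is why a prediction interval is always wider than the confidence interval at the same x_0.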

An alternative statistical technique: Quantile Regression
Principles:
QR is a method for describing conditional quantiles.
Rather than minimising the mean squared error (MSE), QR is based on minimising the mean absolute error (MAE).
This yields not the (conditional) mean but the (conditional) median.
Other quantiles may be derived by weighting positive and negative errors differently, e.g. weight = 0.1 for positive errors and 0.9 for negative errors.
Fitting models may be done in transformed space to account for heteroscedasticity.
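The weighting idea can be illustrated with the simplest possible model, a single constant. The sketch below is not a full quantile regression: it searches a grid of candidate constants for the one that minimises the weighted absolute error (often called the pinball loss). The data are made up, the function name `pinball_loss` is hypothetical, and the error is taken as observation minus prediction, which is an assumed sign convention.

```python
import numpy as np

def pinball_loss(y, y_pred, tau):
    """Mean weighted absolute error: weight tau on positive errors
    (observation above prediction) and 1 - tau on negative errors."""
    err = np.asarray(y, dtype=float) - y_pred
    return np.mean(np.where(err >= 0, tau * err, (tau - 1.0) * err))

# Made-up sample with one large outlier.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Candidate constant predictions on a grid with step 0.01.
candidates = np.linspace(0.0, 100.0, 10001)

def best_constant(tau):
    losses = [pinball_loss(y, c, tau) for c in candidates]
    return candidates[int(np.argmin(losses))]

best_median = best_constant(0.5)   # equal weights: plain mean absolute error
best_low = best_constant(0.1)      # weight 0.1 on positive, 0.9 on negative errors

print("mean:", y.mean(), "tau=0.5:", best_median, "tau=0.1:", best_low)
```

With equal weights the minimiser is the sample median, which the outlier barely affects, whereas the sample mean is pulled far upward. Under the assumed sign convention, weights of 0.1 and 0.9 target the lower 0.1 quantile.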

Application in real-time hydrologic forecasting: post-processing
Ensemble techniques
Post-processing techniques

Application in real-time hydrologic forecasting: post-processing
Once a record of forecasts is in place, this record can be analysed for ‘forecast errors’.
These errors can then be assumed to occur in future forecasts as well.

1. Find a relationship between forecast and obs (5 December 2017)

2. Apply that relation to new forecasts
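The two steps can be sketched with a simple linear relation between archived forecasts and the matching observations. The numbers, the variable names and the helper `post_process` are all hypothetical, and the use of an ordinary least-squares line is an assumption made for the example; the slides do not say which relation is fitted here.

```python
import numpy as np

# Hypothetical archive of past forecasts and the matching observations.
past_forecasts = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
past_obs = np.array([1.2, 1.5, 2.1, 2.2, 2.9, 3.1])

# Step 1: find a relationship obs ~ b0 + b1 * forecast (least squares).
b1, b0 = np.polyfit(past_forecasts, past_obs, 1)

def post_process(forecast):
    """Apply the fitted relation to one or more raw forecasts."""
    return b0 + b1 * np.asarray(forecast, dtype=float)

# Step 2: apply that relation to a new raw forecast.
new_forecast = 2.8
corrected = post_process(new_forecast)

print("raw forecast:", new_forecast, "post-processed:", float(corrected))
```

The same pattern applies when the relation is a quantile regression instead of a least-squares line; fitting several quantiles would then give an uncertainty band around the new forecast rather than a single corrected value.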

And here’s your forecast

Famous forecasting quotes
"I have seen the future and it is very much like the present, only longer." (Kehlog Albran, The Profit)
A pretty concise description of statistical forecasting:
We search for statistical properties of a time series that are constant in time (levels, trends, seasonal patterns, correlations and autocorrelations, etc.).
We then predict that those properties will describe the future as well as the present.

Famous forecasting quotes
"Prediction is very difficult, especially if it's about the future." (Niels Bohr, Nobel laureate in Physics)
A warning of the importance of validating a forecasting model out-of-sample.
It's often easy to find a model that fits the past data well (perhaps too well!), but quite another matter to find a model that correctly identifies those patterns in the past data that will continue to hold in the future.
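Out-of-sample validation can be sketched by fitting on the early part of a record and scoring on the held-out remainder. Everything below is an assumption made for illustration: the synthetic data (a straight line plus noise), the random seed, the 20/10 split and the choice of polynomial degrees 1 and 5.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic record: linear signal plus noise (made-up values).
x = np.linspace(0.0, 1.0, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)

# Fit on the first 20 points, hold out the last 10 as the "future".
x_fit, y_fit = x[:20], y[:20]
x_out, y_out = x[20:], y[20:]

def mse(coeffs, xs, ys):
    """Mean squared error of a polynomial with the given coefficients."""
    return np.mean((ys - np.polyval(coeffs, xs)) ** 2)

results = {}
for degree in (1, 5):
    coeffs = np.polyfit(x_fit, y_fit, degree)
    results[degree] = (mse(coeffs, x_fit, y_fit), mse(coeffs, x_out, y_out))

for degree, (mse_in, mse_out) in results.items():
    print(f"degree {degree}: in-sample MSE {mse_in:.4f}, out-of-sample MSE {mse_out:.4f}")
```

The more flexible polynomial can never fit the past worse than the straight line, since the line is a special case of it. Whether it also describes the held-out data better is exactly what the out-of-sample score is meant to reveal.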