Lecture 9: Forecasting
Introduction to Forecasting

There are many forecasting models available. Which model performs better?
[Figure: the actual time series plotted against the forecasts of Model 1 and Model 2]
A forecasting method can be selected by evaluating its forecast accuracy against the actual time series. The two most commonly used measures of forecast accuracy are:
–Mean Absolute Deviation (MAD)
–Sum of Squares for Forecast Error (SSE)
Measures of Forecast Accuracy

Choose SSE if it is important to avoid (even a few) large errors; otherwise, use MAD. A useful procedure for model selection:
–Use some of the observations to develop several competing forecasting models.
–Run the models on the rest of the observations.
–Calculate the accuracy of each model.
–Select the model with the best accuracy measure.
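The two accuracy measures can be sketched in a few lines. The holdout data and model forecasts below are made up for illustration, not taken from the lecture's example:

```python
# MAD = mean of |actual - forecast|; SSE = sum of squared forecast errors.

def mad(actual, forecast):
    """Mean Absolute Deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def sse(actual, forecast):
    """Sum of Squares for Forecast Error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast))

# Illustrative holdout observations and two competing models' forecasts.
actual  = [129, 142, 151, 166]
model_1 = [136, 148, 150, 175]
model_2 = [118, 141, 158, 163]

# Model 2 wins on MAD (5.5 vs 5.75) but loses on SSE (180 vs 167):
# model 1 avoids the occasional large miss, as the slide's advice suggests.
results = {
    "model_1": (mad(actual, model_1), sse(actual, model_1)),
    "model_2": (mad(actual, model_2), sse(actual, model_2)),
}
```

Note how the two measures can disagree, which is exactly why the choice between them depends on whether large errors are especially costly.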
Selecting a Forecasting Model Annual data from 1970 to 1996 were used to develop three forecasting models. Use MAD and SSE to determine which model performed best for 1997, 1998, 1999, and 2000.
Solution
–For model 1
–Summary of results
[Table: actual y and the forecast for y in each year]
Forecasting Models

The choice of a forecasting technique depends on the components identified in the time series. The techniques discussed next are:
–Seasonal indexes
–Exponential smoothing
–Autoregressive models (a brief discussion)
Forecasting with Seasonal Indexes

Use linear regression together with seasonal indexes to forecast a time series that combines trend with seasonality. The model:
F_t = [b0 + b1*t] × SI_t
where b0 + b1*t is the linear trend value for period t, obtained from the linear regression, and SI_t is the seasonal index for period t.
The procedure:
–Use simple linear regression to find the trend line.
–Use the trend line to calculate the seasonal indexes.
–To calculate F_t, multiply the trend value for period t by the seasonal index for period t.
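The three-step procedure can be sketched as follows. The quarterly series, the period length, and all numbers below are illustrative assumptions, not the lecture's data:

```python
# Trend-plus-seasonal-index forecasting: fit a trend line, average the
# ratios y_t / trend_t within each season, rescale the averages so they
# average to 1, then forecast F_t = trend_t * SI_t.

def fit_trend(y):
    """Least-squares line y = b0 + b1*t with t = 1, 2, ..., n."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar, y_bar = sum(t) / n, sum(y) / n
    b1 = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
          / sum((ti - t_bar) ** 2 for ti in t))
    return y_bar - b1 * t_bar, b1

def seasonal_indexes(y, period):
    """Return (b0, b1, [SI_1, ..., SI_period])."""
    b0, b1 = fit_trend(y)
    ratios = [yi / (b0 + b1 * (i + 1)) for i, yi in enumerate(y)]
    raw = [sum(ratios[s::period]) / len(ratios[s::period])
           for s in range(period)]
    scale = period / sum(raw)          # rescale so indexes average to 1
    return b0, b1, [r * scale for r in raw]

def forecast(t, b0, b1, si):
    """F_t = (trend value for period t) * (seasonal index of t's season)."""
    return (b0 + b1 * t) * si[(t - 1) % len(si)]

# Two years of made-up quarterly data: upward trend, Q1 high, Q3 low.
y = [122.4, 93.6, 84.8, 118.8, 132.0, 100.8, 91.2, 127.6]
b0, b1, si = seasonal_indexes(y, period=4)
f9 = forecast(9, b0, b1, si)           # forecast for the next Q1
```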
Forecasting with Exponential Smoothing

The exponential smoothing model can be used to produce forecasts when the time series…
–exhibits a gradual (not a sharp) trend
–has no cyclical effects
–has no seasonal effects
The forecast for period t+k is computed by
F_{t+k} = S_t
where t is the current period and
S_t = w*y_t + (1 − w)*S_{t−1}
with smoothing constant w, 0 < w < 1.
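A minimal sketch of the smoothing recursion, with an assumed smoothing constant w = 0.5 and made-up data:

```python
# Simple exponential smoothing: S_t = w*y_t + (1 - w)*S_{t-1}, with
# S_1 initialized to the first observation; the forecast for every
# future period is the last smoothed value, F_{t+k} = S_t.

def smooth(y, w):
    s = [y[0]]                      # S_1 = y_1
    for yt in y[1:]:
        s.append(w * yt + (1 - w) * s[-1])
    return s

series = [20, 22, 19, 23, 24]       # illustrative data
s = smooth(series, w=0.5)           # [20, 21.0, 20.0, 21.5, 22.75]
forecast_next = s[-1]               # F_{t+k} = S_t for all k >= 1
```

A larger w tracks recent observations more closely; a smaller w smooths more heavily.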
Regression Diagnostics

The three conditions required for the validity of the regression analysis are:
–The error variable is normally distributed.
–The error variance is constant for all values of x.
–The errors are independent of each other.
How can we diagnose violations of these conditions?
Positive First Order Autocorrelation

Positive first order autocorrelation occurs when consecutive residuals tend to be similar.
[Figure: residuals plotted over time, drifting smoothly above and below 0]
Negative First Order Autocorrelation

Negative first order autocorrelation occurs when consecutive residuals tend to differ markedly.
[Figure: residuals plotted over time, alternating in sign from one period to the next]
Durbin–Watson Test: Are the Errors Autocorrelated?

This test detects first order autocorrelation between consecutive residuals in a time series. The test statistic is
d = Σ(e_t − e_{t−1})² / Σe_t²
which ranges from 0 to 4. If autocorrelation exists, the error variables are not independent.
One-Tail Test for Positive First Order Autocorrelation

–If d < d_L, there is enough evidence to show that positive first-order correlation exists.
–If d > d_U, there is not enough evidence to show that positive first-order correlation exists.
–If d is between d_L and d_U, the test is inconclusive.
[Diagram: the d axis divided at d_L and d_U into "positive first order correlation exists", "inconclusive test", and "positive first order correlation does not exist"]
One-Tail Test for Negative First Order Autocorrelation

–If d > 4 − d_L, negative first order correlation exists.
–If d < 4 − d_U, negative first order correlation does not exist.
–If d falls between 4 − d_U and 4 − d_L, the test is inconclusive.
[Diagram: the d axis divided at 4 − d_U and 4 − d_L into "does not exist", "inconclusive test", and "exists"]
Two-Tail Test for First Order Autocorrelation

–If d < d_L or d > 4 − d_L, first order autocorrelation exists.
–If d falls between d_L and d_U, or between 4 − d_U and 4 − d_L, the test is inconclusive.
–If d falls between d_U and 4 − d_U, there is no evidence of first order autocorrelation.
[Diagram: the d axis divided at d_L, d_U, 4 − d_U, and 4 − d_L into "exists", "inconclusive", "does not exist", "inconclusive", and "exists"]
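The statistic and the one-tail decision rule can be sketched as below. The residuals are made up, and the critical values d_L and d_U are illustrative; in practice they come from the Durbin–Watson table for the actual n and k:

```python
# Durbin-Watson statistic: d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_t e_t^2.
# d ranges from 0 to 4; values near 2 suggest no first-order autocorrelation.

def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(et ** 2 for et in e)

def one_tail_positive(d, dL, dU):
    """One-tail decision rule for positive first-order autocorrelation."""
    if d < dL:
        return "positive autocorrelation"
    if d > dU:
        return "no evidence of positive autocorrelation"
    return "inconclusive"

# Slowly drifting residuals (consecutive values similar) push d toward 0.
residuals = [3.0, 2.5, 2.0, 1.0, -1.0, -2.0, -2.5, -3.0]
d = durbin_watson(residuals)                    # about 0.17
verdict = one_tail_positive(d, dL=1.10, dU=1.54)
```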
Testing the Existence of Autocorrelation, Example

Example 18.3 (Xm18-03)
–How does the weather affect the sales of lift tickets in a ski resort?
–Data on the past 20 years' ticket sales, along with the total snowfall and the average temperature during Christmas week in each year, were collected.
–The model hypothesized was TICKETS = β0 + β1·SNOWFALL + β2·TEMPERATURE + ε
–Regression analysis yielded the following results:
Diagnostics: The Error Distribution

[Figure: histogram of the residuals]
The errors may be normally distributed.
Diagnostics: Heteroscedasticity

[Figure: residuals plotted against predicted y]
It appears there is no problem of heteroscedasticity (the error variance seems to be constant).
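As a rough numeric companion to the residual-vs-predicted plot: if the spread of the residuals grows with the predicted value, the correlation between |e_i| and ŷ_i will be noticeably positive. This is an informal sketch with made-up numbers, not a formal test such as Breusch–Pagan:

```python
# Informal heteroscedasticity check: correlate |residual| with fitted value.

def corr(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

y_hat     = [10, 20, 30, 40, 50]          # made-up fitted values
residuals = [0.5, -1.0, 2.0, -3.5, 5.0]   # spread widening with y_hat

# A value near 0 is consistent with constant error variance;
# a clearly positive value suggests heteroscedasticity.
r = corr(y_hat, [abs(e) for e in residuals])
```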
Diagnostics: First Order Autocorrelation

[Figure: residuals e_t plotted over time t, showing a clear pattern]
The errors are not independent!
Diagnostics: First Order Autocorrelation

Test for positive first order autocorrelation: n = 20, k = 2. From the Durbin–Watson table we have d_L = 1.10 and d_U = 1.54. The statistic d =
Conclusion: Because d < d_L, there is sufficient evidence to infer that positive first order autocorrelation exists.
The Modified Model: Time Included

The modified regression model:
TICKETS = β0 + β1·SNOWFALL + β2·TEMPERATURE + β3·TIME + ε
All the required conditions are met for this model. The fit of this model is high: R² =
The model is valid: Significance F =
SNOWFALL and TIME are linearly related to ticket sales; TEMPERATURE is not linearly related to ticket sales.
Autoregressive Models

Autocorrelation among the errors of the regression model provides an opportunity to produce accurate forecasts. In a stationary time series (no trend and no seasonality), correlation between consecutive residuals leads to the following autoregressive model:
y_t = β0 + β1·y_{t−1} + ε_t
The values for periods 1, 2, 3, … are predictors of the values for periods 2, 3, 4, …, respectively. The estimated model has the form
ŷ_t = b0 + b1·y_{t−1}
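A least-squares sketch of fitting the first-order autoregressive model and producing a one-step-ahead forecast; the series below is made up for illustration:

```python
# Fit y_t = b0 + b1 * y_{t-1} by regressing each observation on its
# predecessor, then forecast the next period from the latest observation.

def fit_ar1(y):
    """Return (b0, b1) for the estimated model y_hat_t = b0 + b1*y_{t-1}."""
    x, z = y[:-1], y[1:]                 # predictor y_{t-1}, response y_t
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b1 = (sum((a - mx) * (b - mz) for a, b in zip(x, z))
          / sum((a - mx) ** 2 for a in x))
    return mz - b1 * mx, b1

y = [10, 12, 13, 15, 16, 18]             # illustrative stationary-ish data
b0, b1 = fit_ar1(y)
next_forecast = b0 + b1 * y[-1]          # one-step-ahead forecast
```

Forecasts further than one step ahead would plug each forecast back in as the next period's predictor.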