slide 1 DSCI 5340: Predictive Modeling and Business Forecasting Spring 2013 – Dr. Nick Evangelopoulos Lecture 7: Box-Jenkins Models – Part II (Ch. 9) Material based on: Bowerman-O’Connell-Koehler, Brooks/Cole
slide 2 DSCI 5340 FORECASTING Page 438 Ex 9.2, Ex 9.3, Ex 9.4 Homework in Textbook
slide 3 Ex 9.2 Page 438
slide 4 Ex 9.3 Page 438 Part a: Autocorrelations die down slowly, so the series is not stationary.
slide 5 Ex 9.3b Page 438
slide 6 Ex 9.3c Page 438
slide 7 Ex 9.3d Page 438: Autocorrelations cut off quickly, so the series is stationary.
slide 8 Ex 9.3e Page 438 Interpret SAC & SPAC
slide 9 Ex 9.3e Page 438 Interpret SAC & SPAC: The SAC dies down exponentially and the SPAC cuts off after lag 1, therefore…
slide 10 Ex 9.4a Page 438 …the series is AR(1)
slide 11 Ex 9.4b Page 438
slide 12 Ex 9.4c Page 439
slide 13 Ex 9.4d Page 439 part 1
$\hat{y}_3 = (\ )\,y_2 - 0.64774\,y_1$
$\hat{y}_3 = (\ )\cdot(\ ) - 0.64774 \cdot 235$
$\hat{y}_3 = (\ )$
$y_3 - \hat{y}_3 = (\ ) - (\ ) = (\ )$
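slide 14 Ex 9.4d Page 439 part 2
At time origin 90:
$\hat{y}_{91} = (\ )\,y_{90} - 0.64774\,y_{89}$
$\hat{y}_{91} = (\ )\cdot(\ ) - 0.64774\cdot(\ )$
$\hat{y}_{91} = (\ )$
$\hat{y}_{92} = (\ )\,\hat{y}_{91} - 0.64774\,y_{90}$
$\hat{y}_{92} = (\ )\cdot(\ ) - 0.64774\cdot(\ )$
$\hat{y}_{92} = (\ )$
$\hat{y}_{93} = (\ )\,\hat{y}_{92} - 0.64774\,\hat{y}_{91}$
$\hat{y}_{93} = (\ )\cdot(\ ) - 0.64774\cdot(\ )$
$\hat{y}_{93} = (\ )$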
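The recursion used in Ex 9.4d can be sketched in Python. This is only a minimal sketch under stated assumptions: the lead coefficient that did not survive on the slide is assumed to be 1 + 0.64774 = 1.64774 (what an AR(1) model fitted to first differences with coefficient 0.64774 would imply), and the two observations in `history` are made-up placeholders, not the textbook data.

```python
PHI = 0.64774  # coefficient shown on the slide


def forecast_from_origin(history, steps, phi=PHI):
    """Recursive point forecasts for y_t = (1 + phi) * y_{t-1} - phi * y_{t-2}.

    Assumption: the (1 + phi) lead coefficient is inferred, not read from the slide.
    Beyond one step ahead, unknown observations are replaced by their forecasts.
    """
    values = list(history[-2:])  # the last two known observations
    forecasts = []
    for _ in range(steps):
        y_hat = (1 + phi) * values[-1] - phi * values[-2]
        forecasts.append(y_hat)
        values.append(y_hat)  # feed the forecast back into the recursion
    return forecasts


# Hypothetical values standing in for y_89 and y_90 (placeholders only)
history = [100.0, 110.0]
f91, f92, f93 = forecast_from_origin(history, steps=3)
print(f91, f92, f93)
```

Note how the forecast for period 92 uses the forecast for period 91 in place of the unobserved value, exactly as on the slide.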
slide 15 Ex 9.4d Page 439 part 3
slide 16 Chapter 9 General Nonseasonal Models
slide 17 Autoregressive Moving Average Models: A time series that is a linear function of p past values plus a linear combination of q past errors is called an autoregressive moving average process of order (p,q), denoted ARMA(p,q). It is also denoted ARIMA(p,0,q).
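Written out, the ARMA(p,q) definition corresponds to an equation of the following form. The notation here is an assumption borrowed from the usual Box-Jenkins convention ($z_t$ for the stationary series, $a_t$ for the random shocks, $\delta$ for a constant, minus signs on the moving average terms); some texts write the moving average terms with plus signs instead.

```latex
z_t = \delta + \phi_1 z_{t-1} + \dots + \phi_p z_{t-p}
      + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}
```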
slide 18 Box-Jenkins ARIMAX Models
- ARIMAX: AutoRegressive Integrated Moving Average with eXogenous variables
- AR (Autoregressive): The time series is a function of its own past.
- MA (Moving Average): The time series is a function of past shocks (deviations, innovations, errors, and so on).
- I (Integrated): Differencing accounts for stochastic trend and seasonal components, so forecasting requires integration (undifferencing).
- X (Exogenous): The time series is influenced by external factors. (These input variables can actually be endogenous or exogenous.)
slide 19 Formulas for TACs
slide 20 Formulas for TACs
slide 21 Determine Whether the SAC or the SPAC is Cutting Off More Abruptly
slide 22 What if SAC and SPAC Are Not Significant for Any Lags? This could happen if the time series is white noise.
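To see what the white-noise case looks like numerically, here is a small simulation sketch. The helper `sample_acf`, the seed, the series length, and the rough two-standard-error limit 2/sqrt(n) are all choices made for illustration, not part of the lecture.

```python
import numpy as np


def sample_acf(y, nlags):
    """Sample autocorrelations r_0 .. r_nlags of a one-dimensional series."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[k:] * dev[:len(y) - k]) / denom
                     for k in range(nlags + 1)])


rng = np.random.default_rng(0)
noise = rng.normal(size=500)        # simulated white noise
r = sample_acf(noise, nlags=20)
bound = 2 / np.sqrt(len(noise))     # rough two-standard-error limit
n_outside = int(np.sum(np.abs(r[1:]) > bound))
print(f"lags outside +/-{bound:.3f}: {n_outside} of 20")
```

For white noise, only about one lag in twenty is expected to fall outside the limits by chance, so no lag stands out as significant.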
slide 23 The Backshift Operator
The backshift operator $B^k$ (sometimes $L^k$ is used) shifts a time series by k time units.
Shift 1 time unit: $B y_t = y_{t-1}$
Shift 2 time units: $B^2 y_t = y_{t-2}$
Shift k time units: $B^k y_t = y_{t-k}$
The backshift operator notation is a convenient way to write ARMA models.
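A minimal sketch of the backshift operator acting on an array may help make the notation concrete. The function name `backshift` and the toy series are illustrative assumptions; undefined leading positions are filled with NaN.

```python
import numpy as np


def backshift(y, k=1):
    """Apply B^k: position t of the result holds y[t-k]; the first k entries are NaN."""
    y = np.asarray(y, dtype=float)
    shifted = np.full(len(y), np.nan)
    if k < len(y):
        shifted[k:] = y[:len(y) - k]
    return shifted


y = np.array([3.0, 5.0, 4.0, 8.0, 9.0])
first_diff = y - backshift(y, 1)    # (1 - B) y_t = y_t - y_{t-1}
print(backshift(y, 2))
print(first_diff)
```

The last line shows why the notation is convenient: first differencing is simply the operator (1 - B) applied to the series.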
slide 24 ACF and PACF after 1-Lag Differencing: Indication of MA(1) or MA(2), with a sharp cut-off after lag 2. The damping pattern eliminates the AR possibility.
slide 25 Autocorrelation Plots for an AR(2) Time Series
slide 26 Classical Decomposition (Box-Jenkins) Procedure
1. Verify the presence of any seasonal or time-based trends.
2. Achieve data stationarity using techniques such as differencing, where you difference consecutive data points (up to lag N).
3. Use the sample Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) to see whether the data follow a Moving Average (MA) or Autoregressive (AR) process, respectively.
4. Use goodness-of-fit measures (e.g., the Akaike Information Criterion) to compare the candidate model orders and choose the best-fitting model.
Model orders: p = AR order, d = differencing order, q = MA order.
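The differencing and ACF/PACF inspection steps can be sketched with NumPy alone. Everything below is illustrative: the simulated series, the seed, and the helper names `sample_acf` and `pacf_from_acf` are assumptions, and the partial autocorrelations are computed with the Durbin-Levinson recursion, which is one standard way to obtain them.

```python
import numpy as np


def sample_acf(y, nlags):
    """Sample autocorrelations r_0 .. r_nlags."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[k:] * dev[:len(y) - k]) / denom
                     for k in range(nlags + 1)])


def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r[0..K] (Durbin-Levinson)."""
    r = np.asarray(r, dtype=float)
    nlags = len(r) - 1
    pacf = np.zeros(nlags + 1)
    pacf[0] = 1.0
    prev = np.zeros(0)  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, nlags + 1):
        num = r[k] - np.sum(prev * r[k - 1:0:-1])
        den = 1.0 - np.sum(prev * r[1:k])
        phi_kk = num / den
        prev = np.concatenate((prev - phi_kk * prev[::-1], [phi_kk]))
        pacf[k] = phi_kk
    return pacf


# Hypothetical nonstationary series: cumulative sum of an AR(1) process
rng = np.random.default_rng(42)
n, phi = 300, 0.6
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi * z[t - 1] + rng.normal()
y = np.cumsum(z)

d1 = np.diff(y)                 # first difference to remove the stochastic trend
r = sample_acf(d1, nlags=8)     # expected to die down gradually
p = pacf_from_acf(r)            # expected to cut off after lag 1
print(np.round(r, 2))
print(np.round(p, 2))
```

With an AR(1) structure in the differences, the SAC should die down while the SPAC cuts off after lag 1, which points to p = 1, d = 1, q = 0 for this simulated series.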
slide 27 AIC and SBC/BIC – Information Criteria: Smaller is Better
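For reference, the usual definitions of the two criteria are sketched below, assuming $L$ is the maximized likelihood, $k$ the number of estimated parameters, and $n$ the number of residuals used in fitting. Exact definitions vary slightly across software packages, so treat this as the general form rather than the formula of any one program.

```latex
\mathrm{AIC} = -2\ln L + 2k, \qquad
\mathrm{SBC} = -2\ln L + k\ln n
```

Both criteria reward fit through the likelihood term and penalize extra parameters; SBC penalizes them more heavily once $\ln n > 2$.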
slide 28 Stationarity of the AR process: If an AR model is not stationary, previous values of the error term have a non-declining effect on the current value of the dependent variable. Equivalently, the coefficients of the corresponding MA representation do not converge to zero as the lag length increases. For an AR model to be stationary, the coefficients of the corresponding MA process must decline with lag length, converging to 0.
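The AR(1) case makes the link to the MA representation explicit. The sketch below assumes a zero-mean AR(1) with shocks written $a_t$ (the symbol is an assumption) and substitutes the model into itself repeatedly.

```latex
y_t = \phi_1 y_{t-1} + a_t
    = a_t + \phi_1 a_{t-1} + \phi_1^2 a_{t-2} + \dots
    = \sum_{j=0}^{\infty} \phi_1^{\,j} a_{t-j}
```

The MA coefficients are $\phi_1^{\,j}$, which converge to 0 as $j$ grows only when $|\phi_1| < 1$. For $\phi_1 = 1$ every past shock keeps a weight of 1, which is the non-declining effect described above.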
slide 29 Not Stationary
slide 30 AR Process: The test for stationarity in an AR model (with p lags) is that the roots of the characteristic equation lie outside the unit circle (i.e., have modulus greater than 1), where the characteristic equation is: $1 - \phi_1 z - \phi_2 z^2 - \dots - \phi_p z^p = 0$
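The root condition can be checked numerically. The sketch below assumes the characteristic equation has the form 1 - phi_1 z - ... - phi_p z^p = 0 and uses `numpy.roots`; the function name `ar_is_stationary`, the small tolerance, and the example coefficients are illustrative choices.

```python
import numpy as np


def ar_is_stationary(phi):
    """Return (flag, roots): flag is True when every root of
    1 - phi_1 z - ... - phi_p z^p = 0 lies outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects coefficients ordered from the highest power down to the constant
    coeffs = np.concatenate((-phi[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0 + 1e-8)), roots


print(ar_is_stationary([0.5]))        # AR(1) with phi_1 = 0.5
print(ar_is_stationary([1.0]))        # AR(1) with phi_1 = 1 (unit root)
print(ar_is_stationary([0.5, 0.3]))   # AR(2)
```

The tolerance treats a root sitting exactly on the unit circle as non-stationary, so floating-point noise cannot tip the result.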
slide 31 Unit Root: Testing any variable for stationarity is described as testing for a ‘unit root’, which is based on this same idea. The most basic AR model is the AR(1) model, on which most tests for stationarity, such as the Dickey-Fuller test, are based.
slide 32 Unit Root Test (L is the backshift operator) (This is the characteristic equation)
slide 33 Unit Root Test: For the AR(1) model, the characteristic equation (1 − z) = 0 has a root of z = 1. This lies on the unit circle rather than outside it, so we conclude that the process is non-stationary. As we increase the number of lags in the AR model, the potential number of roots increases: with 2 lags we have a quadratic equation producing 2 roots, and for the model to be stationary both need to lie outside the unit circle.
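The steps behind the characteristic equation can be written out as follows. This is a sketch under assumptions: the AR(1) case is taken to be the random walk (coefficient equal to 1), and the shock symbol $a_t$ is a notational choice.

```latex
% AR(1) with unit coefficient
y_t = y_{t-1} + a_t
\;\Longrightarrow\; (1 - L)\,y_t = a_t
\;\Longrightarrow\; 1 - z = 0
\;\Longrightarrow\; z = 1

% AR(2): a quadratic with two roots
(1 - \phi_1 L - \phi_2 L^2)\,y_t = a_t
\;\Longrightarrow\; 1 - \phi_1 z - \phi_2 z^2 = 0
\;\Longrightarrow\; z = \frac{-\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}}{2\phi_2}
```

For example, $\phi_1 = 0.5$ and $\phi_2 = 0.3$ give roots of roughly $1.17$ and $-2.84$, both outside the unit circle, so that AR(2) model would be stationary.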
slide 34 One Example: The Dickey-Fuller Single Mean Test Model: Null Hypothesis: Alternative Hypothesis:
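One common way to write the single mean version of the test is sketched below. The symbols ($\mu$ for the mean, $\rho$ for the autoregressive coefficient, $\varepsilon_t$ for the error) are assumptions, since the equations themselves are not shown here.

```latex
\text{Model:}\quad y_t - \mu = \rho\,(y_{t-1} - \mu) + \varepsilon_t

H_0:\ \rho = 1 \quad (\text{unit root, non-stationary})
\qquad
H_a:\ |\rho| < 1 \quad (\text{stationary around } \mu)
```

Rejecting $H_0$ supports treating the series as stationary around a single nonzero mean.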
slide 35 Mean of an AR(1) Process: The (unconditional) mean of an AR(1) process with a constant (μ) is given by $E(y_t) = \mu / (1 - \phi_1)$. For $\phi_1 = 1$, the mean drifts to infinity and the process is non-stationary.
slide 36 Variance of an AR(1) Process: The (unconditional) variance of an AR process of order 1 (excluding the constant) is $\mathrm{Var}(y_t) = \sigma^2 / (1 - \phi_1^2)$, where $\sigma^2$ is the variance of the error term. For $\phi_1 = 1$, the variance drifts to infinity and the process is non-stationary.
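A quick simulation can be used to sanity-check the AR(1) mean mu / (1 - phi_1) and variance sigma^2 / (1 - phi_1^2). This is a minimal sketch: the parameter values, seed, burn-in length, and function name `simulate_ar1` are illustrative assumptions.

```python
import numpy as np


def simulate_ar1(mu, phi, sigma, n, burn_in=500, seed=0):
    """Simulate y_t = mu + phi * y_{t-1} + a_t with a_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(scale=sigma, size=n + burn_in)
    y = np.empty(n + burn_in)
    prev = mu / (1 - phi)           # start at the theoretical mean
    for t in range(n + burn_in):
        prev = mu + phi * prev + shocks[t]
        y[t] = prev
    return y[burn_in:]              # drop the burn-in period


mu, phi, sigma = 2.0, 0.5, 1.0
y = simulate_ar1(mu, phi, sigma, n=200_000)
theory_mean = mu / (1 - phi)              # 4.0
theory_var = sigma ** 2 / (1 - phi ** 2)  # about 1.333
print(y.mean(), theory_mean)
print(y.var(), theory_var)
```

As phi approaches 1, both denominators shrink toward zero, which is the numerical face of the statement that the mean and variance drift to infinity.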
slide 37 ADF – Augmented Dickey-Fuller Test for Unit Root
    /* ADF test on the first difference of y, with 2 augmenting lags */
    proc arima data=TowelSales;
       identify var=y(1) nlag=15 stationarity=(adf=(2));
       title "ARIMA Stationarity Analysis";
    run;
Output columns: Type, Lags, Rho, Pr < Rho, Tau, Pr < Tau, F, Pr > F
Zero Mean: <.0001
Single Mean: <
Trend: <
Reject the unit root: conclude that AR(2) is stationary.
slide 38 Scan Procedure – Use for Preliminary Estimate
    /* SCAN option: tentative ARMA order identification for the differenced series */
    proc arima data=TowelSales;
       identify var=y(1) nlag=15 scan;
       title "ARIMA Analysis";
    run;
Model notation: In this example, ARMA(2,2) is the simplest model that yields insignificant terms:
slide 39 Tentative Model from Output – MA(1) or ARIMA(0,0,1)
The simplest model that has a high probability is MA(1):
- AR(1) has a low probability
- AR(2) is more complex
- ARMA(1,1) is more complex
slide 40 Forecast Model Building: Fit and Holdout Samples
Fit Sample:
- Used to estimate model parameters for accuracy evaluation
- Used to forecast values in holdout sample
Holdout Sample:
- Used to evaluate model accuracy
- Simulates retrospective study
Full = Fit + Holdout data is used to fit deployment model
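The fit/holdout workflow can be sketched end to end with a simple model. Everything here is an assumption made for illustration: the simulated series, the holdout size of 12, the least-squares AR(1) fit, and mean absolute error as the accuracy measure. Any forecasting model could take the place of the AR(1).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical series: simulated AR(1) fluctuating around a level of 50
n, phi_true = 120, 0.7
y = np.empty(n)
y[0] = 50.0
for t in range(1, n):
    y[t] = 50.0 * (1 - phi_true) + phi_true * y[t - 1] + rng.normal()

holdout_size = 12
fit, holdout = y[:-holdout_size], y[-holdout_size:]


def fit_ar1(series):
    """Least-squares estimates (c, phi) for y_t = c + phi * y_{t-1} + a_t."""
    X = np.column_stack((np.ones(len(series) - 1), series[:-1]))
    c, phi = np.linalg.lstsq(X, series[1:], rcond=None)[0]
    return c, phi


def forecast_ar1(c, phi, last_value, steps):
    """Recursive multi-step forecasts starting from the last observed value."""
    out, prev = [], last_value
    for _ in range(steps):
        prev = c + phi * prev
        out.append(prev)
    return np.array(out)


# 1) estimate on the fit sample, 2) forecast the holdout period, 3) score
c, phi = fit_ar1(fit)
pred = forecast_ar1(c, phi, fit[-1], holdout_size)
mae = np.mean(np.abs(holdout - pred))
print(f"holdout MAE: {mae:.3f}")

# 4) refit on the full series (fit + holdout) for the deployment model
c_full, phi_full = fit_ar1(y)
```

The holdout values are never used in step 1, which is what lets the holdout error stand in for the accuracy of genuine future forecasts.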
slide 41 Page Ex 9.5, Ex 9.6, Ex 9.7 Homework in Textbook