Econ 240C Lecture 13
Outline
- Exponential Smoothing
- Back of the envelope formula (geometric distributed lag): L(t) = a*y(t-1) + (1-a)*L(t-1); F(t) = L(t)
- ARIMA (p,d,q) = (0,1,1): ∆y(t) = e(t) - (1-a)*e(t-1)
- Error correction: L(t) = L(t-1) + a*e(t-1)
- Intervention Analysis
Part I: Exponential Smoothing
- Exponential smoothing is a technique that is useful for forecasting short time series, where there may not be enough observations to estimate a Box-Jenkins model.
- Exponential smoothing can be understood from many perspectives; one perspective is a formula that could be calculated by hand.
Three Rates of Growth (figure)
Simple exponential smoothing
- Simple exponential smoothing, also known as single exponential smoothing, is most appropriate for a time series that is a random walk with a first-order moving average error structure.
- The levels term, L(t), is a weighted average of the observation lagged once, y(t-1), and the previous level, L(t-1):
- L(t) = a*y(t-1) + (1-a)*L(t-1)
Single exponential smoothing
- The parameter a is chosen to minimize the sum of squared errors, where the error is the difference between the observation and the levels term: e(t) = y(t) - L(t)
- The forecast for period t+1 is given by the formula: L(t+1) = a*y(t) + (1-a)*L(t)
- Example from John Hanke and Arthur Reitsch, Business Forecasting, 6th Ed.
(Table: observations and sales)
Single exponential smoothing
- For observation #1, set L(1) = Sales(1) = 500 as an initial condition.
- As a trial value, use a = 0.1.
- So L(2) = 0.1*Sales(1) + 0.9*L(1) = 0.1*500 + 0.9*500 = 500
- And L(3) = 0.1*Sales(2) + 0.9*L(2) = 0.1*350 + 0.9*500 = 485
(Table: observations, sales, level)
(Table: observations, sales, level; a = 0.1)
Single exponential smoothing
- So the formula can be used to calculate the rest of the levels values, observations #4 through #24.
- This can be set up on a spreadsheet; a minimal code version is sketched below.
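A minimal sketch of the same spreadsheet recursion in Python. The sales numbers are placeholders standing in for the textbook data, except that the first two (500, 350) match the example above:

```python
import numpy as np

def ses_levels(y, a):
    """Levels L(t) = a*y(t-1) + (1-a)*L(t-1), with L(1) = y(1) as the
    initial condition. Returns the levels and the one-step-ahead forecast."""
    levels = np.empty(len(y))
    levels[0] = y[0]                               # L(1) = Sales(1)
    for t in range(1, len(y)):
        levels[t] = a * y[t - 1] + (1 - a) * levels[t - 1]
    forecast = a * y[-1] + (1 - a) * levels[-1]    # L(T+1)
    return levels, forecast

# Placeholder sales series (only the first two values match the example)
sales = np.array([500.0, 350, 250, 400, 450, 350, 200, 300, 350, 200])
levels, f_next = ses_levels(sales, a=0.1)
errors = sales - levels                            # e(t) = y(t) - L(t)
print(levels.round(1), round(f_next, 1), round((errors ** 2).sum(), 1))
```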
(Table: observations, sales, level; a = 0.1)
Single exponential smoothing
- The forecast for observation #25 is: L(25) = 0.1*Sales(24) + 0.9*L(24)
- Forecast(25) = Levels(25) = 0.1*Sales(24) + 0.9*449 = 469.1
Single exponential smoothing
- The errors can now be calculated: e(t) = sales(t) - levels(t)
(Table: observations, sales, level, error; a = 0.1)
19 observationsSalesLevelerror error squared a = 0.1
(Table: observations, sales, level, error, error squared, sum of squared residuals; a = 0.1)
Single exponential smoothing
- For a = 0.1, the sum of squared errors is: Σ e(t)^2 = 582,281.2
- A grid search can be conducted for the parameter a, to find the value between 0 and 1 that minimizes the sum of squared errors (see the sketch below).
- The calculations of levels, L(t), and errors, e(t) = sales(t) - L(t), are then repeated for a = 0.6.
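A sketch of such a grid search, again on the placeholder sales series rather than the textbook data:

```python
import numpy as np

def ses_sse(y, a):
    """Sum of squared errors e(t) = y(t) - L(t) for a given smoothing parameter a."""
    L = y[0]                                   # L(1) = y(1), so e(1) = 0
    sse = 0.0
    for t in range(1, len(y)):
        L = a * y[t - 1] + (1 - a) * L
        sse += (y[t] - L) ** 2
    return sse

sales = np.array([500.0, 350, 250, 400, 450, 350, 200, 300, 350, 200])  # placeholder data
grid = np.arange(0.05, 1.0, 0.05)              # candidate values of a between 0 and 1
sse = [ses_sse(sales, a) for a in grid]
best = grid[int(np.argmin(sse))]
print(f"best a = {best:.2f}, SSE = {min(sse):,.1f}")
```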
(Table: observations, sales, levels; a = 0.6)
Single exponential smoothing
- Forecast(25) = Levels(25) = 0.6*Sales(24) + 0.4*Levels(24) = 0.6*Sales(24) + 0.4*465 = 776
(Table: observations, sales, levels, error, error squared, sum of squared residuals; a = 0.6)
Single exponential smoothing
- Grid search plot (figure)
Single Exponential Smoothing
- EVIEWS: algorithmic search for the smoothing parameter a
- In EVIEWS, select the time series sales(t) and open it
- In the sales window, go to the PROCS menu and select exponential smoothing
- Select single
- The best parameter is a = 0.26; the reported root mean square error equals (sum of squared errors / 24)^(1/2)
- The forecast, or end-of-period levels mean, = 532.4
Forecast = L(25) = 0.26*Sales(24) + 0.74*L(24) = 532.4
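Outside EVIEWS, the same algorithmic search can be done with a library routine; for example, statsmodels' SimpleExpSmoothing minimizes the squared one-step errors over a. Apart from initialization details its recursion matches the formula above, though the estimated parameter need not match the hand calculation exactly. A sketch on the placeholder data:

```python
import pandas as pd
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# Placeholder sales series (not the textbook data)
sales = pd.Series([500.0, 350, 250, 400, 450, 350, 200, 300, 350, 200])

fit = SimpleExpSmoothing(sales, initialization_method="estimated").fit(optimized=True)
print(fit.params["smoothing_level"])   # estimated smoothing parameter a
print(fit.forecast(1))                 # one-step-ahead forecast (end-of-period level)
```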
Part II. Three Perspectives on Single Exponential Smoothing
- The formula perspective:
- L(t) = a*y(t-1) + (1 - a)*L(t-1)
- e(t) = y(t) - L(t)
- The Box-Jenkins perspective
- The updating forecasts perspective
Box-Jenkins Perspective
- Use the error equation to substitute for L(t) in the formula L(t) = a*y(t-1) + (1 - a)*L(t-1):
- L(t) = y(t) - e(t)
- y(t) - e(t) = a*y(t-1) + (1 - a)*[y(t-1) - e(t-1)]
- y(t) = e(t) + y(t-1) - (1-a)*e(t-1)
- or ∆y(t) = y(t) - y(t-1) = e(t) - (1-a)*e(t-1)
- So y(t) is a random walk plus MA(1) noise, i.e. y(t) is a (0,1,1) process, where (p,d,q) are the orders of AR, differencing, and MA.
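A quick simulation check of this equivalence (a sketch, not part of the lab): generate ∆y(t) = e(t) - (1-a)*e(t-1), run the smoothing recursion with the same a, and the recovered errors y(t) - L(t) converge to the simulated e(t) once the effect of the initial condition dies out.

```python
import numpy as np

rng = np.random.default_rng(0)
a, n = 0.26, 500
e = rng.normal(size=n)                    # white noise
dy = e - (1 - a) * np.roll(e, 1)          # delta y(t) = e(t) - (1-a)*e(t-1)
dy[0] = e[0]                              # first observation has no lagged error
y = 100 + np.cumsum(dy)                   # integrate to get the (0,1,1) series

# Smoothing recursion L(t) = a*y(t-1) + (1-a)*L(t-1), with L(1) = y(1)
L = np.empty(n)
L[0] = y[0]
for t in range(1, n):
    L[t] = a * y[t - 1] + (1 - a) * L[t - 1]

resid = y - L                             # recovered errors
print(np.corrcoef(resid[50:], e[50:])[0, 1])   # close to 1 after burn-in
```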
Box-Jenkins Perspective
- In Lab Seven, we will apply simple exponential smoothing to retail sales, which can be modeled as (0,1,1).
Box-Jenkins Perspective
- If the smoothing parameter approaches one, then y(t) is a random walk:
- ∆y(t) = y(t) - y(t-1) = e(t) - (1-a)*e(t-1)
- If a = 1, then ∆y(t) = y(t) - y(t-1) = e(t)
- In Lab Seven, we will use the price of gold to make this point.
Box-Jenkins Perspective
- The levels or forecast, L(t), is a geometric distributed lag of past observations of the series, y(t), hence the name "exponential" smoothing (Z is the lag operator: Z*L(t) = L(t-1)):
- L(t) = a*y(t-1) + (1 - a)*L(t-1)
- L(t) = a*y(t-1) + (1 - a)*Z*L(t)
- L(t) - (1 - a)*Z*L(t) = a*y(t-1)
- [1 - (1-a)Z]*L(t) = a*y(t-1)
- L(t) = {1/[1 - (1-a)Z]}*a*y(t-1)
- L(t) = [1 + (1-a)Z + (1-a)^2*Z^2 + …]*a*y(t-1)
- L(t) = a*y(t-1) + (1-a)*a*y(t-2) + (1-a)^2*a*y(t-3) + …
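A small numerical check of the geometric-lag expansion (a sketch on simulated data): the truncated sum a*y(t-1) + (1-a)*a*y(t-2) + … reproduces the recursively computed L(t) almost exactly, since the omitted weight (1-a)^t on the initial level is negligible.

```python
import numpy as np

a = 0.3
rng = np.random.default_rng(1)
y = 100 + rng.normal(size=200).cumsum()        # any series will do for the check

# Recursion
L = np.empty(len(y))
L[0] = y[0]
for s in range(1, len(y)):
    L[s] = a * y[s - 1] + (1 - a) * L[s - 1]

# Truncated geometric distributed lag evaluated at t = 150
t = 150
weights = a * (1 - a) ** np.arange(t)          # a, a(1-a), a(1-a)^2, ...
lagged = y[t - 1::-1]                          # y(t-1), y(t-2), ..., y(1)
print(L[t], weights @ lagged)                  # nearly identical
```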
The Updating Forecasts Perspective
- Use the error equation to substitute for y(t-1) in the formula L(t) = a*y(t-1) + (1 - a)*L(t-1):
- y(t) = L(t) + e(t)
- L(t) = a*[L(t-1) + e(t-1)] + (1 - a)*L(t-1)
- So L(t) = L(t-1) + a*e(t-1),
- i.e. the forecast for period t is equal to the forecast for period t-1 plus a fraction a of the forecast error from period t-1.
Part III. Double Exponential Smoothing
- With double exponential smoothing, one estimates a "trend" term, R(t), as well as a levels term, L(t), so it is possible to forecast, f(t), out more than one period:
- f(t+k) = L(t) + k*R(t), k >= 1
- L(t) = a*y(t) + (1-a)*[L(t-1) + R(t-1)]
- R(t) = b*[L(t) - L(t-1)] + (1-b)*R(t-1)
- So the trend, R(t), is a geometric distributed lag of the change in levels, L(t) - L(t-1).
Part III. Double Exponential Smoothing
- If the smoothing parameters a = b, then we have double exponential smoothing.
- If the smoothing parameters are different, then it is the simplest version of Holt-Winters smoothing (see the sketch below).
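A sketch of the level-and-trend recursions (Holt's linear method when a ≠ b). The initial level and trend below are one common heuristic choice, and the data are placeholders:

```python
import numpy as np

def holt(y, a, b, horizon=4):
    """L(t) = a*y(t) + (1-a)*[L(t-1)+R(t-1)];
    R(t) = b*[L(t)-L(t-1)] + (1-b)*R(t-1);
    forecasts f(t+k) = L(t) + k*R(t), k = 1..horizon."""
    L, R = y[0], y[1] - y[0]                 # heuristic initial level and trend
    for t in range(1, len(y)):
        L_prev = L
        L = a * y[t] + (1 - a) * (L + R)
        R = b * (L - L_prev) + (1 - b) * R
    return np.array([L + k * R for k in range(1, horizon + 1)])

y = np.array([112.0, 118, 132, 129, 121, 135, 148, 148, 136, 119])  # placeholder data
print(holt(y, a=0.5, b=0.3))                 # forecasts for t+1 ... t+4
```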
Part III. Double Exponential Smoothing
- Holt-Winters can also be used to forecast seasonal time series, e.g. monthly:
- f(t+k) = L(t) + k*R(t) + S(t+k-12), k >= 1
- L(t) = a*[y(t) - S(t-12)] + (1-a)*[L(t-1) + R(t-1)]
- R(t) = b*[L(t) - L(t-1)] + (1-b)*R(t-1)
- S(t) = c*[y(t) - L(t)] + (1-c)*S(t-12)
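For monthly data, the additive seasonal recursions above are implemented in statsmodels' ExponentialSmoothing; a sketch on a synthetic monthly series (not the lab data):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly series with trend and seasonality, standing in for real data
rng = np.random.default_rng(0)
idx = pd.date_range("1990-01", periods=120, freq="MS")
y = pd.Series(100 + 0.5 * np.arange(120)
              + 10 * np.sin(2 * np.pi * np.arange(120) / 12)
              + rng.normal(0, 2, 120), index=idx)

fit = ExponentialSmoothing(
    y, trend="add", seasonal="add", seasonal_periods=12,
    initialization_method="estimated",
).fit()
print(fit.summary())       # shows the estimated smoothing parameters a, b, c
print(fit.forecast(12))    # f(t+1) ... f(t+12) = level + k*trend + seasonal
```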
Part V. Intervention Analysis
Intervention Analysis
- The approach to intervention analysis parallels Box-Jenkins in that the actual estimation is conducted after pre-whitening, to the extent that nonstationarity such as trend and seasonality is removed.
- Example: preview of Lab 7
Telephone Directory Assistance
- A telephone company was receiving increased demand for free directory assistance, i.e. subscribers asking operators to look up numbers. This was increasing costs, and the company changed policy, providing a number of free assisted calls to subscribers per month but charging a price per call after that number.
Telephone Directory Assistance
- This policy change occurred at a known time, March 1974.
- The time series is calls to directory assistance per month.
- Did the policy change make a difference?
The simple-minded approach
- Difference in mean calls per month, pre-event minus post-event, = 387 (thousand)
Principle
- The event may cause a change and affect the time series characteristics.
- Consequently, consider the pre-event period, January 1962 through February 1974; the event, March 1974; and the post-event period, April 1974 through December 1976.
- First difference and then seasonally difference the entire series.
Analysis: Entire Differenced Series
Analysis: Pre-Event Differences
So Seasonal Nonstationarity
- It was masked in the entire sample by the variance contributed by the level shift at the event.
- The seasonality was revealed in the pre-event differenced series.
Pre-Event Analysis
- Seasonally differenced, differenced series
Pre-Event Box-Jenkins Model
- [1 - Z^12][1 - Z]Assist(t) = WN(t) - a*WN(t-12)
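In ARIMA notation this is a (0,1,0)×(0,1,1)12 specification. A sketch of how it might be estimated with statsmodels, using a synthetic stand-in for the pre-event series (the real lab data would be loaded instead):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Placeholder for the pre-event series, Jan 1962 - Feb 1974 (not the lab data)
rng = np.random.default_rng(0)
idx = pd.date_range("1962-01", "1974-02", freq="MS")
n = len(idx)
assist_pre = pd.Series(
    300 + 1.5 * np.arange(n) + 30 * np.sin(2 * np.pi * np.arange(n) / 12)
    + rng.normal(0, 10, n), index=idx)

# (1-Z^12)(1-Z) Assist(t) = WN(t) - a*WN(t-12)  ->  ARIMA(0,1,0)x(0,1,1)_12
res = SARIMAX(assist_pre, order=(0, 1, 0),
              seasonal_order=(0, 1, 1, 12)).fit(disp=False)
print(res.summary())   # the seasonal MA coefficient (ma.S.L12) corresponds to -a above
```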
Modeling the Event
- Step function (equal to zero before the policy change and one thereafter)
Entire Series
- Assist and Step
- Dassist and Dstep
- Sddast and Sddstep
Model of Series and Event
- Pre-event model: [1 - Z^12][1 - Z]Assist(t) = WN(t) - a*WN(t-12)
- In levels plus event: Assist(t) = [WN(t) - a*WN(t-12)] / {[1 - Z][1 - Z^12]} + (-b)*Step(t)
- Estimate: [1 - Z^12][1 - Z]Assist(t) = WN(t) - a*WN(t-12) + (-b)*[1 - Z^12][1 - Z]*Step(t)
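One way to estimate the combined model is regression with SARIMA errors, which should be equivalent to differencing both the series and the step as on the slide: statsmodels' SARIMAX with the step as an exogenous regressor then estimates the level shift -b directly. A sketch on synthetic placeholder data with a built-in shift of -390:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Placeholder full-sample series, Jan 1962 - Dec 1976 (the real lab data
# would be loaded here); the step is 0 before March 1974 and 1 thereafter.
rng = np.random.default_rng(0)
idx = pd.date_range("1962-01", "1976-12", freq="MS")
n = len(idx)
step = pd.Series((idx >= "1974-03").astype(float), index=idx, name="step")
assist = pd.Series(300 + 1.5 * np.arange(n)
                   + 30 * np.sin(2 * np.pi * np.arange(n) / 12)
                   - 390 * step.values + rng.normal(0, 10, n), index=idx)

# Regression with (0,1,0)x(0,1,1)_12 errors; the step coefficient estimates -b
res = SARIMAX(assist, exog=step, order=(0, 1, 0),
              seasonal_order=(0, 1, 1, 12)).fit(disp=False)
print(res.params["step"], res.bse["step"])   # level shift and its standard error
```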
Policy Change Effect
- Simple approach: decrease of 387 (thousand) calls per month
- Intervention model: decrease of 397, with a standard error of 22
Stochastic Trends: Random Walks with Drift
- We have discussed earlier in the course how to model the total return to the Standard and Poor's 500 Index.
- One possibility is that this time series could be a random walk around a deterministic trend:
- Sp500(t) = exp{a + d*t + WN(t)/[1-Z]}
- And taking logarithms,
Stochastic Trends: Random Walks with Drift
- lnsp500(t) = a + d*t + WN(t)/[1-Z]
- lnsp500(t) - a - d*t = WN(t)/[1-Z]
- Multiplying through by the difference operator, ∆ = [1-Z]:
- [1-Z][lnsp500(t) - a - d*t] = WN(t)
- [lnsp500(t) - a - d*t] - [lnsp500(t-1) - a - d*(t-1)] = WN(t)
- ∆lnsp500(t) = d + WN(t)
- So the fractional change in the total return to the S&P 500 is drift, d, plus white noise.
- More generally,
- y(t) = a + d*t + {1/[1-Z]}*WN(t)
- [y(t) - a - d*t] = {1/[1-Z]}*WN(t)
- [y(t) - a - d*t] - [y(t-1) - a - d*(t-1)] = WN(t)
- [y(t) - a - d*t] = [y(t-1) - a - d*(t-1)] + WN(t)
- Versus the possibility of an AR(1):
- [y(t) - a - d*t] = b*[y(t-1) - a - d*(t-1)] + WN(t)
- y(t) = a + d*t + b*[y(t-1) - a - d*(t-1)] + WN(t)
- or y(t) = [a*(1-b) + b*d] + [d*(1-b)]*t + b*y(t-1) + WN(t)
- Subtracting y(t-1) from both sides:
- ∆y(t) = [a*(1-b) + b*d] + [d*(1-b)]*t + (b-1)*y(t-1) + WN(t)
- So the coefficient on y(t-1) is once again interpreted as b-1, and we can test the null that this is zero against the alternative that it is significantly negative. Note that we specify the equation with both a constant, [a*(1-b) + b*d], and a trend, [d*(1-b)]*t.
Part IV. Dickey-Fuller Tests: Trend
Example
- lnsp500(t) from Lab 2
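A sketch of the trend Dickey-Fuller regression derived above, plus the corresponding library test, using a simulated random walk with drift as a stand-in for lnsp500 (so the unit-root null is true by construction). No augmentation lags are included in the hand-rolled version:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Stand-in for lnsp500: a random walk with drift
rng = np.random.default_rng(0)
lnsp500 = pd.Series(np.cumsum(0.01 + 0.04 * rng.normal(size=500)))

# Hand-rolled test regression: delta y(t) on constant, trend, and y(t-1)
dy = lnsp500.diff().dropna()
X = sm.add_constant(pd.DataFrame({"trend": np.arange(1, len(lnsp500)),
                                  "ylag": lnsp500.shift(1).dropna()}))
print(sm.OLS(dy, X).fit().tvalues["ylag"])   # compare with Dickey-Fuller critical values, not normal ones

# Library version with constant and trend ("ct")
stat, pvalue, *_ = adfuller(lnsp500, regression="ct")
print(stat, pvalue)
```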