Chapter 6 Autocorrelation
What is in this Chapter? How do we detect this problem? What are the consequences? What are the solutions?
Regarding the problem of detection, we start with the Durbin-Watson (DW) statistic and discuss its limitations and extensions. We discuss Durbin's h-test for models with lagged dependent variables and tests for higher-order serial correlation. We discuss (in Section 6.5) the consequences of serially correlated errors for OLS estimators.
The solutions to the problem of serial correlation are discussed in Section 6.3 (estimation in levels versus first differences), Section 6.9 (strategies when the DW test statistic is significant), and Section 6.10 (trends and random walks). This chapter is very important, and its ideas have to be understood thoroughly.
6.1 Introduction
The order of autocorrelation
In the following sections we discuss how to:
1. Test for the presence of serial correlation.
2. Estimate the regression equation when the errors are serially correlated.
6.2 Durbin-Watson Test
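As a reminder of what the test computes, here is a minimal sketch of the DW statistic, d = Σ(ût − ût-1)² / Σût². The helper names `durbin_watson` and `ar1_series` are illustrative, and for brevity the statistic is applied directly to simulated error series rather than to OLS residuals, which is an assumption of this sketch.

```python
import random

def durbin_watson(residuals):
    """DW statistic: sum of squared successive differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def ar1_series(n, rho, rng):
    """Generate u_t = rho * u_{t-1} + e_t with standard normal e_t (u_0 starts at 0)."""
    u, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        u.append(prev)
    return u

rng = random.Random(0)
d_white = durbin_watson(ar1_series(2000, 0.0, rng))  # no serial correlation
d_pos = durbin_watson(ar1_series(2000, 0.9, rng))    # strong positive correlation
print("DW, rho = 0.0:", round(d_white, 2))
print("DW, rho = 0.9:", round(d_pos, 2))
```

Since d is approximately 2(1 − ρ̂), values near 2 suggest no first-order serial correlation and values near 0 suggest strong positive correlation.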
6.3 Estimation in Levels Versus First Differences
Simple solutions to the serial correlation problem: first differences
If the DW test rejects the hypothesis of zero serial correlation, what is the next step? In such cases one estimates the regression after transforming all the variables by ρ-differencing (quasi-first differences) or by first differencing.
When comparing equations in levels and first differences, one cannot compare the R² values because the explained variables are different. One can compare the residual sums of squares, but only after making a rough adjustment. (Please refer to p. 231.)
Since we have comparable residual sums of squares (RSS), we can get a comparable R² as well, using the relationship RSS = Syy(1 − R²).
Illustrative Examples
Usually, with time-series data, one gets high R² values if the regressions are estimated in the levels yt and xt, but low R² values if the regressions are estimated in first differences (yt − yt-1) and (xt − xt-1). Since a high R² is usually considered proof of a strong relationship between the variables under investigation, there is a strong tendency to estimate the equations in levels rather than in first differences. This is sometimes called the "R² syndrome."
However, if the DW statistic is very low, it often implies a misspecified equation, no matter what the value of the R² is. In such cases one should estimate the regression equation in first differences; if the R² is then low, this merely indicates that the variables y and x are not related to each other.
Granger and Newbold present some examples with artificially generated data where y, x, and the error u are each generated independently, so that there is no relationship between y and x. But the correlations between yt and yt-1, xt and xt-1, and ut and ut-1 are very high. Although there is no relationship between y and x, the regression of y on x gives a high R² but a low DW statistic.
When the regression is run in first differences, the R² is close to zero and the DW statistic is close to 2. This demonstrates that there is indeed no relationship between y and x and that the R² obtained earlier is spurious. Thus regressions in first differences might often reveal the true nature of the relationship between y and x. Further discussion of this problem is in Sections 6.10 and 14.7.
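The levels-versus-differences contrast can be reproduced with a small simulation. The sketch below is not Granger and Newbold's design: it assumes, for illustration, that y and x are independent driftless random walks with standard normal shocks, and the helper names (`ols_r2_dw`, `random_walk`, `diff`, `mean_of`) are hypothetical.

```python
import random

def ols_r2_dw(y, x):
    """OLS of y on a constant and x; returns (R^2, DW statistic of the residuals)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    resid = [b - my - slope * (a - mx) for a, b in zip(x, y)]
    rss = sum(e * e for e in resid)
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
    return 1.0 - rss / syy, dw

def random_walk(n, rng):
    """Driftless random walk with standard normal increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def diff(z):
    """First differences z_t - z_{t-1}."""
    return [z[t] - z[t - 1] for t in range(1, len(z))]

def mean_of(pairs, k):
    return sum(p[k] for p in pairs) / len(pairs)

rng = random.Random(1)
reps, n = 200, 200
levels, diffs = [], []
for _ in range(reps):
    y, x = random_walk(n, rng), random_walk(n, rng)  # unrelated by construction
    levels.append(ols_r2_dw(y, x))
    diffs.append(ols_r2_dw(diff(y), diff(x)))

print("levels:      mean R2 =", round(mean_of(levels, 0), 3),
      " mean DW =", round(mean_of(levels, 1), 3))
print("differences: mean R2 =", round(mean_of(diffs, 0), 3),
      " mean DW =", round(mean_of(diffs, 1), 3))
```

Averaging over replications avoids reading too much into a single draw; the pattern to look for is a sizeable R² with a very low DW in levels, and an R² near zero with a DW near 2 in first differences.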
Homework
Find the data:
Y is the Taiwan stock index
X is the U.S. stock index
Run two equations:
The equation in levels (log prices)
The equation in first differences
Compare the two equations:
The beta estimate and its significance
The R²
The value of the DW statistic
Q: Should we adopt the equation in levels or in first differences?
For instance, suppose that we have quarterly data. It is then possible that the errors in any quarter this year are most highly correlated with the errors in the corresponding quarter last year rather than with the errors in the preceding quarter. That is, ut could be uncorrelated with ut-1 but highly correlated with ut-4. If this is the case, the DW statistic will fail to detect it. What we should be using is a modified statistic defined as
d4 = Σ(ût − ût-4)² / Σût²
where ût are the OLS residuals, the sum in the numerator runs over t = 5, ..., n, and the sum in the denominator runs over t = 1, ..., n.
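A minimal sketch of the quarterly case: a fourth-order analogue of the DW statistic compares each residual with the one four periods earlier. The function name `dw_lag` and the simulated error process u_t = 0.8·u_{t-4} + e_t are assumptions made for illustration, and the statistic is applied to the simulated errors directly rather than to OLS residuals.

```python
import random

def dw_lag(residuals, lag):
    """Generalized DW statistic: sum_{t>lag} (e_t - e_{t-lag})^2 / sum_t e_t^2."""
    num = sum((residuals[t] - residuals[t - lag]) ** 2
              for t in range(lag, len(residuals)))
    return num / sum(e * e for e in residuals)

rng = random.Random(2)
rho4 = 0.8
u = [0.0] * 4                      # four start-up values, dropped below
for _ in range(2000):
    u.append(rho4 * u[-4] + rng.gauss(0.0, 1.0))
u = u[4:]

d1 = dw_lag(u, 1)   # ordinary DW statistic
d4 = dw_lag(u, 4)   # fourth-order version
print("d1 =", round(d1, 2), " d4 =", round(d4, 2))
```

Here the ordinary statistic should sit near 2 even though the errors are strongly correlated at lag 4, while the lag-4 version falls well below 2.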
6.4 Estimation Procedures with Autocorrelated Errors
GLS (generalized least squares)
In actual practice ρ is not known. There are two types of procedures for estimating ρ:
1. Iterative procedures.
2. Grid-search procedures.
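The two approaches can be sketched for a simple regression y_t = a + b·x_t + u_t with AR(1) errors. This is a minimal illustration under stated assumptions, not the textbook's program: the iterative routine follows the Cochrane-Orcutt idea (alternate between estimating ρ from residuals and re-estimating the ρ-differenced equation), the grid search follows the Hildreth-Lu idea (scan ρ and keep the value with the smallest transformed RSS), both drop the first observation, and all function names and the simulated data are hypothetical.

```python
import random

def ols(y, x):
    """OLS of y on a constant and x; returns (intercept, slope, RSS)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((p - mx) * (q - my) for p, q in zip(x, y))
         / sum((p - mx) ** 2 for p in x))
    a = my - b * mx
    rss = sum((q - a - b * p) ** 2 for p, q in zip(x, y))
    return a, b, rss

def quasi_diff(z, rho):
    """rho-differencing: z_t - rho * z_{t-1}; the first observation is dropped."""
    return [z[t] - rho * z[t - 1] for t in range(1, len(z))]

def hildreth_lu(y, x, step=0.01):
    """Grid search over rho in [-0.99, 0.99]; keep the rho with the smallest RSS."""
    best = None
    k = int(round(0.99 / step))
    for i in range(-k, k + 1):
        rho = i * step
        a_star, b, rss = ols(quasi_diff(y, rho), quasi_diff(x, rho))
        if best is None or rss < best[3]:
            # the transformed intercept equals a * (1 - rho)
            best = (rho, a_star / (1.0 - rho), b, rss)
    return best[:3]

def cochrane_orcutt(y, x, tol=1e-8, max_iter=100):
    """Iterate between rho (from residuals) and (a, b) (from the transformed OLS)."""
    a, b, _ = ols(y, x)
    rho = 0.0
    for _ in range(max_iter):
        e = [q - a - b * p for p, q in zip(x, y)]
        new_rho = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(v * v for v in e[:-1]))
        a_star, b, _ = ols(quasi_diff(y, new_rho), quasi_diff(x, new_rho))
        a = a_star / (1.0 - new_rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return rho, a, b

# Simulated data (assumed values): a = 1, b = 2, rho = 0.7
rng = random.Random(4)
n, rho_true = 500, 0.7
x = [rng.gauss(0.0, 2.0) for _ in range(n)]
u, prev = [], 0.0
for _ in range(n):
    prev = rho_true * prev + rng.gauss(0.0, 1.0)
    u.append(prev)
y = [1.0 + 2.0 * p + e for p, e in zip(x, u)]

rho_co, a_co, b_co = cochrane_orcutt(y, x)
rho_hl, a_hl, b_hl = hildreth_lu(y, x)
print("Cochrane-Orcutt: rho =", round(rho_co, 3), " b =", round(b_co, 3))
print("Hildreth-Lu:     rho =", round(rho_hl, 3), " b =", round(b_hl, 3))
```

Both routines minimize the same transformed residual sum of squares, so when that function has a single minimum they should agree up to the grid spacing. The grid search is the safer choice when several local minima are possible.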
Homework
Redo the example in the textbook (see Table 3.11 for the data) using:
OLS
The C-O procedure
The H-L procedure with a grid interval of 0.01
Compare the R² values (Note: please calculate the comparable R² from the levels equation)
6.5 Effect of AR(1) Errors on OLS Estimates
In Section 6.4 we described different procedures for the estimation of regression models with AR(1) errors. We will now answer two questions that might arise with the use of these procedures:
1. What do we gain from using these procedures?
2. When should we not use these procedures?
First, in the case we are considering (i.e., the case where the explanatory variable xt is independent of the error ut), the OLS estimates are unbiased. However, they will not be efficient. Further, the tests of significance we apply, which will be based on the wrong covariance matrix, will be wrong.
In the case where the explanatory variables include lagged dependent variables, we will have some further problems, which we discuss in Section 6.7. For the present, let us consider the simple regression model
An alternative way to verify the above properties: simulation
Use the simulation method shown in Chapter 5.
Write your program in GAUSS: take the program from Chapter 5 and modify it.
Thus the consequences of autocorrelated errors are:
1. The least squares estimators are unbiased but not efficient. Sometimes they are considerably less efficient than the procedures that take account of the autocorrelation.
2. The sampling variances are biased and sometimes likely to be seriously understated. Thus R² as well as t and F statistics tend to be exaggerated.
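These two consequences can be checked by simulation. The sketch below is a Monte Carlo under assumed settings (β = 2, n = 100, and both xt and ut following AR(1) processes with coefficient 0.8), with hypothetical helper names. It compares the average OLS slope with the true β, and the average conventional variance estimate with the actual sampling variance of the slope across replications.

```python
import random

def ar1(n, rho, rng):
    """Generate z_t = rho * z_{t-1} + e_t with standard normal e_t."""
    z, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        z.append(prev)
    return z

def ols_slope_and_var(y, x):
    """Slope of y on a constant and x, plus the conventional variance
    estimate RSS / ((n - 2) * Sxx), which ignores the serial correlation."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((p - mx) ** 2 for p in x)
    b = sum((p - mx) * (q - my) for p, q in zip(x, y)) / sxx
    rss = sum((q - my - b * (p - mx)) ** 2 for p, q in zip(x, y))
    return b, rss / ((n - 2) * sxx)

rng = random.Random(3)
beta, n, reps = 2.0, 100, 1000
slopes, conv_vars = [], []
for _ in range(reps):
    x = ar1(n, 0.8, rng)           # autocorrelated regressor, independent of u
    u = ar1(n, 0.8, rng)           # AR(1) errors
    y = [beta * p + e for p, e in zip(x, u)]
    b, v = ols_slope_and_var(y, x)
    slopes.append(b)
    conv_vars.append(v)

mean_b = sum(slopes) / reps
true_var = sum((b - mean_b) ** 2 for b in slopes) / (reps - 1)
mean_conv_var = sum(conv_vars) / reps
print("mean OLS slope:              ", round(mean_b, 3))
print("sampling variance of slope:  ", round(true_var, 4))
print("mean conventional var. est.: ", round(mean_conv_var, 4))
```

The average slope should be close to β (unbiasedness), while the conventional variance estimate should come out well below the actual sampling variance, which is why the t statistics look better than they are.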
2. The discussion above assumes that the true errors are first-order autoregressive. If they have a more complicated structure (e.g., second-order autoregressive), it might be thought that it would still be better to proceed on the assumption that the errors are first-order autoregressive rather than ignore the problem completely and use the OLS method. Engle shows that this is not necessarily true (i.e., sometimes one can be worse off making the assumption of first-order autocorrelation than ignoring the problem completely).
6.7 Tests for Serial Correlation in Models with Lagged Dependent Variables
In previous sections we considered explanatory variables that were uncorrelated with the error term. This will not be the case if we have lagged dependent variables among the explanatory variables and we have serially correlated errors. There are several situations under which we would be considering lagged dependent variables as explanatory variables. These could arise through expectations, adjustment lags, and so on.
The various situations and models are explained in Chapter 10. For the present we will not be concerned with how the models arise. We will merely study the problem of testing for autocorrelation in these models. Let us consider a simple model
@ Case 1: x independent of u, small sample @
new;
format /m1 /rd 9,3;
beta=2;
T=30;                  @ sample size @
u=Rndn(T,1);
x=Rndn(T,1)+0*u;       @ x is uncorrelated with u @
y=beta*x+u;
@ OLS @
Beta_OLS=olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;

@ Case 2: x independent of u, large sample @
new;
format /m1 /rd 9,3;
beta=2;
T=50000;               @ sample size @
u=Rndn(T,1);
x=Rndn(T,1)+0*u;       @ x is uncorrelated with u @
y=beta*x+u;
@ OLS @
Beta_OLS=olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;

@ Case 3: x correlated with u, large sample @
new;
format /m1 /rd 9,3;
beta=2;
T=50000;               @ sample size @
u=Rndn(T,1);
x=Rndn(T,1)+0.5*u;     @ x is correlated with u: plim of OLS is 2+0.5/1.25=2.4 @
y=beta*x+u;
@ OLS @
Beta_OLS=olsqr(y,x);
print " OLS beta estimate ";
Beta_OLS;
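For reference, Durbin's h-statistic is h = ρ̂·sqrt(n / (1 − n·V̂(α̂))), where α̂ is the OLS coefficient on yt-1, V̂(α̂) is its estimated variance, and ρ̂ can be approximated by 1 − d/2 from the DW statistic d. Under the null of no serial correlation h is asymptotically standard normal. The sketch below is a minimal illustration; the function name `durbin_h` and the input numbers are hypothetical.

```python
import math

def durbin_h(dw, n, var_alpha):
    """Durbin's h from the DW statistic, the sample size, and the estimated
    variance of the coefficient on the lagged dependent variable.
    Returns None when n * var_alpha >= 1, where the test cannot be computed."""
    rho_hat = 1.0 - dw / 2.0
    denom = 1.0 - n * var_alpha
    if denom <= 0.0:
        return None
    return rho_hat * math.sqrt(n / denom)

# Hypothetical inputs: d = 1.6, n = 50, V(alpha_hat) = 0.01
h = durbin_h(dw=1.6, n=50, var_alpha=0.01)
print("h =", h)

# The test breaks down when n * V(alpha_hat) >= 1
print(durbin_h(dw=1.6, n=50, var_alpha=0.03))
```

With these inputs h is about 2, slightly above the 5% one-sided normal critical value of 1.645. The second call returns None, which is the situation where an alternative such as Durbin's second test is needed.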
6.8 A General Test for Higher-Order Serial Correlation: The LM Test
The h-test we have discussed is, like the Durbin-Watson test, a test for first-order autoregression. Breusch and Godfrey discuss some general tests that are easy to apply and are valid for very general hypotheses about the serial correlation in the errors. These tests are derived from a general principle called the Lagrange multiplier (LM) principle. A discussion of this principle is beyond the scope of this book. For the present we will explain what the test is. The test is similar to Durbin's second test that we have discussed.
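In its commonly used form the Breusch-Godfrey LM test can be summarized as follows. This is a sketch in standard notation, which is assumed here rather than taken from the slides: k regressors x_it (possibly including lagged dependent variables), AR(p) errors, and sample size n.

```latex
\begin{align*}
% Model with AR(p) errors
y_t &= \sum_{i=1}^{k} \beta_i x_{it} + u_t, \qquad
u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_p u_{t-p} + e_t \\
% Null hypothesis of no serial correlation
H_0 &: \ \rho_1 = \rho_2 = \cdots = \rho_p = 0 \\
% Auxiliary regression of the OLS residuals on the x's and lagged residuals
\hat{u}_t &= \sum_{i=1}^{k} \gamma_i x_{it}
           + \sum_{j=1}^{p} \rho_j \hat{u}_{t-j} + \eta_t \\
% Test statistic from the R^2 of the auxiliary regression
LM &= n R^2 \ \sim \ \chi^2_p \quad \text{asymptotically under } H_0
\end{align*}
```

In words: estimate the original equation by OLS, regress the residuals on all the original regressors plus p lagged residuals, and compare n times the R² of that auxiliary regression with the χ² distribution with p degrees of freedom.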
6.9 Strategies When the DW Test Statistic is Significant
The DW test is designed as a test for the hypothesis ρ = 0 if the errors follow a first-order autoregressive process. However, the test has been found to be robust against other alternatives such as AR(2), MA(1), ARMA(1, 1), and so on. Further, and more disturbingly, it catches specification errors like omitted variables that are themselves autocorrelated, and misspecified dynamics (a term that we will explain). Thus the strategy to adopt, if the DW test statistic is significant, is not clear. We discuss three different strategies:
1. Assume that the significant DW statistic is an indication of serial correlation but may not be due to AR(1) errors.
2. Test whether serial correlation is due to omitted variables.
3. Test whether serial correlation is due to misspecified dynamics.
Serial correlation due to misspecified dynamics
6.10 Trends and Random Walks
Both models exhibit a linear trend, but the appropriate method of eliminating the trend differs. To test the hypothesis that a time series belongs to the TSP (trend-stationary process) class against the alternative that it belongs to the DSP (difference-stationary process) class, Nelson and Plosser use a test developed by Dickey and Fuller:
Three Types of RW
RW without drift: Yt=1*Yt-1+ut
RW with drift: Yt=alpha+1*Yt-1+ut
RW with drift and time trend: Yt=alpha+beta*t+1*Yt-1+ut
ut~iid(0,sigma)
RW or Unit Root Tests in EViews
Additional slides: Augmented D-F tests
Yt=a1*Yt-1+ut
Yt-Yt-1=(a1-1)*Yt-1+ut
ΔYt=(a1-1)*Yt-1+ut
ΔYt=λ*Yt-1+ut
H0: a1=1 ≡ H0: λ=0
Augmented form: ΔYt=λ*Yt-1+Σγi*ΔYt-i+ut
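The basic (non-augmented) Dickey-Fuller regression ΔYt = λ*Yt-1 + ut can be sketched as follows. This is an illustration without a constant, trend, or lagged differences; the function names and simulated series are assumptions. The t-ratio on λ must be compared with Dickey-Fuller critical values (about −1.95 at the 5% level for this no-constant case in large samples), not with the usual t tables.

```python
import math
import random

def df_regression(y):
    """OLS of dY_t on Y_{t-1} without a constant; returns (lambda_hat, t-ratio)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    ylag = y[:-1]
    sxx = sum(v * v for v in ylag)
    lam = sum(a * b for a, b in zip(ylag, dy)) / sxx
    rss = sum((d - lam * v) ** 2 for d, v in zip(dy, ylag))
    se = math.sqrt(rss / (len(dy) - 1) / sxx)
    return lam, lam / se

def ar1_path(n, a1, rng):
    """Y_t = a1 * Y_{t-1} + u_t with Y_0 = 0 and standard normal u_t."""
    y = [0.0]
    for _ in range(n):
        y.append(a1 * y[-1] + rng.gauss(0.0, 1.0))
    return y

rng = random.Random(5)
lam_rw, t_rw = df_regression(ar1_path(500, 1.0, rng))   # random walk, a1 = 1
lam_st, t_st = df_regression(ar1_path(500, 0.5, rng))   # stationary, a1 = 0.5
print("random walk: lambda =", round(lam_rw, 3), " t =", round(t_rw, 2))
print("stationary:  lambda =", round(lam_st, 3), " t =", round(t_st, 2))
```

For the stationary series λ̂ should be close to a1 − 1 = −0.5 with a large negative t-ratio, so H0: λ = 0 is rejected. For the random walk λ̂ should be close to zero.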
As an illustration, consider the example given by Dickey and Fuller. For the logarithm of the quarterly Federal Reserve Board Production Index, 1950-1 through 1977-4, they assume that the time series is adequately represented by the model
6. Regression of one random walk on another, with time included for trend, is strongly subject to the spurious regression phenomenon. That is, the conventional t-test will tend to indicate a relationship between the variables when none is present.
The main conclusion is that using a regression on time has serious consequences when, in fact, the time series is of the DSP type and, hence, differencing is the appropriate procedure for trend elimination. Plosser and Schwert also argue that with most economic time series it is always best to work with differenced data rather than data in levels. The reason is that if indeed the data series are of the DSP type, the errors in the levels equation will have variances increasing over time.
Under these circumstances many of the properties of least squares estimators as well as tests of significance are invalid. On the other hand, suppose that the levels equation is correctly specified. Then all differencing will do is produce a moving average error, and at worst ignoring it will give inefficient estimates. For instance, suppose that we have the model
Differencing and Long-Run Effects: The Concept of Cointegration
One drawback of the procedure of differencing is that it results in a loss of valuable "long-run information" in the data. Recently, the concept of cointegrated series has been suggested as one solution to this problem. First, we need to define the term "cointegration." Although we do not need the assumption of normality and independence, we will define the terms under this assumption.
Yt~I(1):
Yt is a random walk
ΔYt is white noise, or iid
No one can predict the future price change
The market is efficient
The impact of a previous shock on the price persists and does not decay to zero
Cointegration
Run the VECM (vector error correction model) in EViews (additional slides)
Lead-lag relation obtained with the VECM
If beta_A is significant and beta_U is insignificant, the price adjustment mainly depends on the ADR market:
ADR prices converge to UND prices
UND prices lead ADR prices in the price discovery process
UND prices provide an information advantage
If beta_U is significant and beta_A is insignificant, the price adjustment mainly depends on the UND market:
UND prices converge to ADR prices
ADR prices lead UND prices in the price discovery process
ADR prices provide an information advantage
If both beta_U and beta_A are significant, this suggests a bidirectional error correction:
The equilibrium prices lie between the ADR and UND prices
Both ADR and UND prices converge to the equilibrium prices
If both beta_U and beta_A are significant, but beta_U is greater than beta_A in absolute value:
It is the UND price that makes the greater adjustment in order to reestablish the equilibrium
That is, most of the price discovery takes place in the ADR market
Homework
Find the spot and futures prices (daily data, at least 5 years)
Run the cointegration test
Run the VECM
Discuss the lead-lag relationship
6.11 ARCH Models and Serial Correlation
We saw in Section 6.9 that a significant DW statistic can arise through a number of misspecifications. We will now discuss one other source. This is the ARCH model suggested by Engle, which has, in recent years, been found useful in the analysis of speculative prices. ARCH stands for "autoregressive conditional heteroskedasticity."
GARCH(p,q) Model:
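A GARCH(p,q) specification for the conditional variance can be written as below. This is a sketch in the usual Bollerslev-style notation, which is assumed here: ε_t is the regression error, Ω_{t-1} the information available at time t−1, h_t the conditional variance, q the number of ARCH terms, and p the number of GARCH terms.

```latex
\begin{align*}
% Conditional distribution of the error
\varepsilon_t \mid \Omega_{t-1} &\sim N(0, h_t) \\
% GARCH(p,q) conditional variance
h_t &= \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2
              + \sum_{j=1}^{p} \beta_j h_{t-j} \\
% GARCH(1,1) special case; persistence is measured by alpha_1 + beta_1
h_t &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}
\end{align*}
```

In the GARCH(1,1) case the sum α_1 + β_1 measures how persistent shocks to the variance are, and a sum close to one means shocks die out very slowly.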
The high level of persistence in GARCH models: the sum of the two GARCH parameter estimates is close to unity in most cases.
Li and Lin (2003): This finding provides some support for the notion that GARCH models are handicapped by their inability to account for structural changes during the estimation period and thus suffer from a high-persistence problem in the variance.
Find the stock returns (daily data, at least 5 years)
Run the GARCH(1,1) model
Check the parameter estimates and the sum of the two GARCH parameter estimates
Graph the time-varying variance estimates
Could we identify a RW?
Low test power of the DF test
What is the power of the test? It is the probability of rejecting H0 when H0 is not true
Low power: H0 is not true, but we fail to reject H0
The data series is I(0), but we conclude it is I(1)
Several Key Problems for Unit Root Tests
Low test power
Structural change problem
Size distortion
RW or non-stationary or I(1): Yt=1*Yt-1+ut
Stationary process or I(0):
Yt=0.99*Yt-1+ut, T=1,000
Yt=0.98*Yt-1+ut, T=50 or 1,000
Spurious Regression
RW 1: Yt=0.05+1*Yt-1+ut
RW 2: Xt=0.03+1*Xt-1+vt
new;
format /m1 /rd 9,3;
@ Data Generation Process: two independent random walks with drift @
Y=zeros(1000,1);
u=2*Rndn(1000,1);
X=zeros(1000,1);
v=1*Rndn(1000,1);
i=2;
do until i>1000;
    Y[i,1]=0.05+1*Y[i-1,1]+u[i,1];
    X[i,1]=0.03+1*X[i-1,1]+v[i,1];
    i=i+1;
endo;
@ Write the two series side by side to an output file @
Output file=d:\Courses\Enclass\Unit\YX_Spur.out reset;
Y~X;
Output off;