1
Autocorrelation in Regression Analysis
What is Autocorrelation?
What causes Autocorrelation?
Tests for Autocorrelation
Examples
Durbin-Watson Tests
Modeling Autoregressive Relationships
April 11, 2006 Bush 632 Lecture 12a
2
What is Autocorrelation?
Correlation between values of the same variable across observations
Violation of the assumption: $E(e_i e_j) = 0$, where: $i \neq j$
In the presence of autocorrelation, Y can be expressed as the function: $Y_t = a + bX_t + e_t$, where $e_t$ is defined as: $e_t = \rho e_{t-1} + u_t$
3
What is Autocorrelation?
4
Where do we find Autocorrelation?
Autocorrelation is most often a problem in time series or geographic data.
It reflects changes in data that are a function of proximity in time or space.
Examples:
Energy market price shocks: transitions depend on prior states
Economic consequences of LULUs (locally unwanted land uses): distance from the hazard influences the magnitude of the price effect
5
Federal Budget Example:
Incrementalists argue that the federal budget shifts only incrementally from the prior year’s budget.
Partial Effects
Calculating partial effects; interpretation
Variable selection and model building
Risks in model building
6
Two types of Autocorrelation
Positive autocorrelation
This is what we normally find. If the autocorrelation is positive, then we expect the sign of the residual at t to be the same as at t-1.
7
Negative Autocorrelation
We find that the sign of the residual at t tends to be the opposite of that at t-1.
Example: a drunken amble
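The sign patterns described for positive and negative autocorrelation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the errors are generated as e_t = rho * e_{t-1} + u_t with standard normal u_t, and the function names, the rho values of +0.8 and -0.8, the sample size, and the seed are all made up for illustration rather than taken from the lecture.

```python
import random

def simulate_ar1_errors(rho, n, seed=0):
    """Generate n errors with e_t = rho * e_{t-1} + u_t, u_t ~ N(0, 1) (assumed form)."""
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        errors.append(rho * errors[-1] + rng.gauss(0.0, 1.0))
    return errors

def share_same_sign(errors):
    """Fraction of adjacent pairs (t-1, t) whose signs agree."""
    pairs = list(zip(errors, errors[1:]))
    same = sum(1 for prev, cur in pairs if (prev > 0) == (cur > 0))
    return same / len(pairs)

# Illustrative values only: rho = +0.8 and rho = -0.8, 500 observations each.
positive = share_same_sign(simulate_ar1_errors(0.8, 500))
negative = share_same_sign(simulate_ar1_errors(-0.8, 500))
print(f"rho = +0.8: {positive:.2f} of adjacent errors share a sign")
print(f"rho = -0.8: {negative:.2f} of adjacent errors share a sign")
```

With positive rho, well over half of the adjacent pairs should share a sign; with negative rho, well under half should.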
8
What causes autocorrelation?
Misspecification
Data manipulation (before receipt, after receipt)
Event inertia
Spatial ordering
9
Checking for Autocorrelation
Test: Durbin-Watson statistic: $d = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$
0 to d-lower: positive autocorrelation (autocorrelation is clearly evident)
d-lower to d-upper: zone of indecision (ambiguous, cannot rule out autocorrelation)
d-upper to 4 - d-upper: zone of no autocorrelation (autocorrelation is not evident)
4 - d-upper to 4 - d-lower: zone of indecision (ambiguous, cannot rule out autocorrelation)
4 - d-lower to 4: negative autocorrelation (autocorrelation is clearly evident)
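As a rough sketch of how the Durbin-Watson check works in practice, the statistic can be computed directly from the ordered residuals and then placed in one of the zones. The function names are hypothetical, and the residuals and the bounds passed in below are made-up inputs; real bounds come from a Durbin-Watson table for the sample size and number of regressors at hand.

```python
def durbin_watson(residuals):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2, from ordered residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def dw_zone(d, d_lower, d_upper):
    """Place d in one of the zones between 0 and 4."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "indecision"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "indecision"
    return "negative autocorrelation"

# Made-up residuals with long runs of the same sign, and illustrative bounds.
example_residuals = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
d = durbin_watson(example_residuals)
print(round(d, 3), dw_zone(d, d_lower=1.20, d_upper=1.41))
```

Values of d near 2 indicate no first-order autocorrelation, values near 0 indicate positive autocorrelation, and values near 4 indicate negative autocorrelation.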
10
Consider the following regression:
From the Statistics option in SPSS
11
Find the D-upper and D-lower
Check a Durbin-Watson table for the values of d-upper and d-lower. In Hamilton that’s on pp
For n = 20, k = 2, and α = .05 the values are:
Lower = 1.20
Upper = 1.41
Because our value falls between zero and d-lower, we have positive autocorrelation.
12
The Runs Test
An alternative to the D-W test is a formalized examination of the signs of the residuals. We would expect that the signs of the residuals will be random in the absence of autocorrelation.
The first step is to estimate the model and predict the residuals.
Next, order the signs of the residuals against time (or spatial ordering in the case of cross-sectional data) and see if there are excessive “runs” of positives or negatives. Alternatively, you can graph the residuals and look for the same trends.
13
Runs test continued
The final step is to use the expected mean and standard deviation of the number of runs in a standard t-test.
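The runs-test steps can be sketched as follows. The formulas for the slide are not recoverable from this transcript, so the sketch assumes the standard Wald-Wolfowitz expressions for the expected number of runs and its variance. The function name and the residuals are made up for illustration.

```python
import math

def runs_test(residuals):
    """Count sign runs in ordered residuals and compare with the expected count.

    Assumes the standard Wald-Wolfowitz formulas:
      expected = 2 * n_pos * n_neg / n + 1
      variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n^2 * (n - 1))
    Needs at least one positive and one negative residual.
    """
    signs = [e > 0 for e in residuals if e != 0]  # drop exact zeros
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts every time the sign changes.
    runs = 1 + sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    std_dev = math.sqrt(variance)
    statistic = (runs - expected) / std_dev
    return runs, expected, std_dev, statistic

# Made-up residuals: four positives followed by four negatives (too few runs).
runs, expected, std_dev, statistic = runs_test([0.9, 0.4, 0.7, 0.2, -0.3, -0.8, -0.5, -0.1])
print(runs, expected, round(std_dev, 3), round(statistic, 3))
```

Too few runs (a strongly negative statistic) points toward positive autocorrelation. Too many runs points toward negative autocorrelation.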
14
More on the D-W
D-W is not appropriate for auto-regressive (AR) models, where a lagged value of Y appears on the right-hand side. In this case, we use the Durbin alternative test.
For AR models, we need to explicitly estimate the correlation between Y at t and Y at t-1 as a model parameter.
Techniques:
AR1 models (closest to regression; 1st order only)
ARIMA (any order)
15
Dealing with Autocorrelation
There are several approaches to resolving problems of autocorrelation:
Lagged dependent variables
Differencing the dependent variable
GLS
ARIMA
16
Lagged dependent variables
The most common solution
Simply create a new variable that equals Y at t-1, and use it as a RHS variable
This correction should be based on a theoretic belief for the specification
Can, at times, cause more problems than it solves
Also costs a degree of freedom (lost observation)
There are several advanced techniques for dealing with this as well
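A minimal sketch of the lagged dependent variable approach, assuming ordinary least squares fitted through the normal equations in plain Python. The helper names (`solve`, `ols`, `lagged_dv_fit`) and the data are hypothetical. The series is built so that Y depends on X and on its own previous value, with no noise, so the fit recovers the generating coefficients up to rounding.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def ols(columns, y):
    """Least squares via the normal equations; the caller supplies the constant column."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)

def lagged_dv_fit(x, y):
    """Regress Y_t on a constant, X_t, and Y_{t-1}; the first observation is lost."""
    y_now, y_lag, x_now = y[1:], y[:-1], x[1:]
    constant = [1.0] * len(y_now)
    return ols([constant, x_now, y_lag], y_now)

# Made-up data: Y_t = 1 + 2 * X_t + 0.5 * Y_{t-1}, with no noise.
x_demo = [0, 1, 0, 2, 1, 3, 0, 2, 4, 1]
y_demo = [1.0]
for t in range(1, len(x_demo)):
    y_demo.append(1 + 2 * x_demo[t] + 0.5 * y_demo[-1])

intercept, x_coef, lag_coef = lagged_dv_fit(x_demo, y_demo)
print(round(intercept, 4), round(x_coef, 4), round(lag_coef, 4))
```

Note that ten observations yield only nine usable rows, which is the lost observation mentioned above.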
17
Differencing
Differencing is simply the act of subtracting the previous observation value from the current observation.
This process is effective; however, it is an EXPENSIVE correction
This technique “throws away” long-term trends
Assumes that rho = 1 exactly
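A short sketch of first differencing, using a made-up trending series and a hypothetical helper name. It shows both costs noted above: the series loses one observation, and a steady long-term trend collapses to a constant.

```python
def first_difference(series):
    """Return Y_t - Y_{t-1} for t = 2..n (one observation shorter than the input)."""
    return [cur - prev for prev, cur in zip(series, series[1:])]

# Made-up series with a steady upward trend of 3 per period.
trend = [10 + 3 * t for t in range(6)]
print(first_difference(trend))  # prints [3, 3, 3, 3, 3]
```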
18
GLS and ARIMA
GLS approaches use maximum likelihood to estimate rho and correct the model
These are good corrections, and can be replicated in OLS
ARIMA is an acronym for Autoregressive Integrated Moving Average
This process is a univariate “filter” used to cleanse variables of a variety of pathologies before analysis
19
Corrections based on Rho
There are several ways to estimate rho, the simplest being to calculate it from the residuals.
We then estimate the regression by transforming the regressors so that: $Y_t^* = Y_t - \rho Y_{t-1}$ and $X_t^* = X_t - \rho X_{t-1}$
This gives the regression: $Y_t^* = a(1 - \rho) + bX_t^* + u_t$
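The rho-based correction can be sketched in three steps: fit the ordinary regression, estimate rho from its residuals, then refit on the transformed variables. This is a minimal sketch that assumes a single regressor and estimates rho by regressing each residual on the previous one without an intercept, which is only one of the several ways to estimate it. All function names and data are hypothetical.

```python
def simple_ols(x, y):
    """Intercept and slope of a one-regressor least squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

def estimate_rho(residuals):
    """rho = sum(e_t * e_{t-1}) / sum(e_{t-1}^2), one common estimator (assumed here)."""
    numerator = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals[:-1])
    return numerator / denominator

def quasi_difference(series, rho):
    """Transform a series to Z_t - rho * Z_{t-1}; the first observation is lost."""
    return [cur - rho * prev for prev, cur in zip(series, series[1:])]

def rho_corrected_fit(x, y):
    """Estimate rho from OLS residuals, then refit on the transformed variables."""
    a, b = simple_ols(x, y)
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    rho = estimate_rho(residuals)
    a_star, b_star = simple_ols(quasi_difference(x, rho), quasi_difference(y, rho))
    # The transformed intercept is a * (1 - rho), so divide to recover a (needs rho != 1).
    return rho, a_star / (1 - rho), b_star

# Made-up data: Y = 1 + 2 * X plus an error that decays by half each period.
x_demo = [float(t) for t in range(10)]
y_demo = [1 + 2 * xt + 0.5 ** t for t, xt in enumerate(x_demo)]
rho_hat, intercept_hat, slope_hat = rho_corrected_fit(x_demo, y_demo)
print(round(rho_hat, 3), round(intercept_hat, 3), round(slope_hat, 3))
```

Setting rho to exactly 1 in `quasi_difference` reduces the transformation to plain first differencing.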
20
Estimating the relationship between X and Y
First, we can estimate the lagged dependent variable model.
21
Now the regression correcting for Rho
We can estimate rho by calculating it: ρ = .587
22
Final thoughts
Each correction has a “best” application.
If we wanted to evaluate a mean shift (a model with only a dummy variable), calculating rho will not be a good choice. Then we would want to use the lagged dependent variable.
Also, where we want to test the effect of inertia, it is probably better to use the lag.
In small-N samples, calculating rho tends to be more accurate.
23
Homework
Using the data that accompany this lecture, estimate the effect of X on Y. Run the regular regression, run a lagged dependent variable model, and calculate rho.
Next, test the effect of dummy variable X2 on the series Y2. Run a regular regression, then run a regression with a lagged dependent variable.
Write a brief description of the problems that neglecting the effect of time in the second model might cause for a decision-maker.
24
Break
Coming up…
Review for Exam
Exam Posting: available on Wednesday morning, 10am