Marietta College Week 14
Tuesday, April 12
Exam 3: Monday, April 25, 12:00-2:30 PM
Bring your laptops to class on Thursday too
Collect Asst 21
Use the data set FISH in Chapter 8 (P 274) to run the following regression equation: F = f (PF, PB, Yd, P, N)
1) Conduct all 3 tests for imperfect multicollinearity and report your results.
2) If you find evidence of imperfect multicollinearity, suggest and implement a reasonable solution.
Use EViews
– Open FISH in Chapter 8
– Run P = f (PF, PB, Yd, N)
– Click on View in the regression output
– Click on Actual, Fitted, Residual
– Click on Residual Graph
Do you suspect the residuals to be autocorrelated?
This is what you should have got
[Figure: residual graph from the regression of P on PF, PB, Yd, N]
A positive residual tends to be followed by another positive residual: possible positive autocorrelation
Causes of Impure Serial Correlation
1. Wrong functional form – Example: effect of the age of a house on its price
2. Omitted variables – Example: not including wealth in the consumption equation
3. Data error
Cause of Pure Serial Correlation
Lingering shock over time
– War
– Natural disaster
– Stock market crash
Consequences of Pure Autocorrelation
Unbiased coefficient estimates but wrong standard errors
– With positive autocorrelation, OLS tends to underestimate the standard errors of the estimated coefficients
– Consequences for the t-test of significance?
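A small simulation can make the standard-error point concrete. The sketch below is not from the course materials: the true slope (2), the sample size (50), the number of replications (500), and ρ = 0.8 for both the regressor and the error term are all assumed values chosen for illustration. It compares the average standard error that OLS reports with the actual spread of the slope estimates across replications.

```python
import numpy as np

def ar1(n, rho, rng):
    """Generate an AR(1) series z_t = rho * z_{t-1} + shock_t."""
    shocks = rng.normal(size=n)
    z = np.zeros(n)
    z[0] = shocks[0]
    for t in range(1, n):
        z[t] = rho * z[t - 1] + shocks[t]
    return z

def ols_slope_and_se(x, y):
    """Return the OLS slope and its conventional (reported) standard error."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)      # estimated error variance
    cov = s2 * np.linalg.inv(X.T @ X)      # conventional OLS covariance matrix
    return beta[1], np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
slopes, reported_se = [], []
for _ in range(500):                       # assumed number of replications
    x = ar1(50, 0.8, rng)                  # autocorrelated regressor (assumption)
    eps = ar1(50, 0.8, rng)                # positively autocorrelated errors
    y = 1.0 + 2.0 * x + eps                # assumed true model
    b, se = ols_slope_and_se(x, y)
    slopes.append(b)
    reported_se.append(se)

mean_slope = np.mean(slopes)               # close to 2: estimates stay unbiased
true_sd = np.std(slopes)                   # actual sampling spread of the slope
avg_reported_se = np.mean(reported_se)     # what OLS claims the spread is
print(mean_slope, true_sd, avg_reported_se)
```

In this setup the reported standard error comes out smaller than the actual spread, so t-statistics look larger than they should.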
Consequences of Impure Autocorrelation
Biased estimates
Plus wrong standard errors
Let’s look at first-order serial correlation
ε_t = ρ ε_{t-1} + u_t
– ρ (rho) is the first-order autocorrelation coefficient
– It takes a value between -1 and +1
– u_t is a normally distributed error with a mean of zero and constant variance
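To see what such an error process looks like, here is a minimal sketch that generates first-order autocorrelated errors and then recovers ρ from them. The value ρ = 0.8, the series length, and the random seed are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)
rho = 0.8                                  # assumed autocorrelation coefficient
n = 5000                                   # assumed series length

u = rng.normal(0.0, 1.0, size=n)           # well-behaved error: mean 0, constant variance
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]       # eps_t = rho * eps_{t-1} + u_t

# Regress eps_t on eps_{t-1} without an intercept to estimate rho
rho_hat = (eps[1:] @ eps[:-1]) / (eps[:-1] @ eps[:-1])
print(rho_hat)
```

With a long series the estimate lands close to the ρ used to generate the data.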
A Formal Test for First-Order Autocorrelation
Durbin-Watson test
– Estimate the regression equation
– Save the residuals, e
– Then calculate the Durbin-Watson statistic (dstat)
dstat ≈ 2 (1 - ρ)
– What is dstat under perfect positive correlation? ρ = +1, so d = 0
– What is dstat under perfect negative correlation? ρ = -1, so d = 4
– What is dstat under no autocorrelation? ρ = 0, so d = 2
– What is the range of values for dstat? 0 to 4
dstat = 0: perfect positive autocorrelation
dstat = 4: perfect negative autocorrelation
dstat = 2: no autocorrelation
If 2 > dstat > 0, then suspect (test for) positive autocorrelation
If 4 > dstat > 2, then suspect (test for) negative autocorrelation
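The statistic itself is the sum of squared changes in consecutive residuals divided by the sum of squared residuals. A minimal sketch of that standard formula follows; the two residual series are made up to show the extremes and are not taken from any data set in the course.

```python
def durbin_watson(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2), using the standard DW formula."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: every residual equals the one before it
same_sign = [1.0] * 20
# Hypothetical residuals: the sign flips every period
alternating = [(-1.0) ** t for t in range(20)]

print(durbin_watson(same_sign))     # 0.0, the perfect positive case
print(durbin_watson(alternating))   # 3.8, approaching 4 as the sample grows
```

The alternating series gives 4(n - 1)/n rather than exactly 4, which is why the relation d ≈ 2(1 - ρ) is stated as an approximation.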
EViews calculates dstat automatically
It is included in your regression output
Run P = f (PF, PB, Yd, N)
Do you see the dstat?
[EViews output: Dependent Variable: P; Method: Least Squares; Date: 04/12/11; Included observations: 25; regressors C, PF, PB, YD, N; numeric values lost except the fragment "-5.54E" for N]
What type of serial correlation shall we test for? Positive
If dstat < 2, test for positive autocorrelation.
Null and alternative hypotheses
– H0: ρ ≤ 0 (no positive autocorrelation)
– HA: ρ > 0 (positive autocorrelation)
Choose the level of significance (say 5%)
Find the critical dstat values d_L and d_U (PP )
Decision rule
– If dstat < d_L, reject H0: there is significant positive first-order autocorrelation
– If dstat > d_U, don’t reject H0: there is no evidence of significant autocorrelation
– If dstat is between d_L and d_U, the test is inconclusive.
[Same EViews output: Dependent Variable: P; Included observations: 25]
N = 25, K = 4
At the 5% level, d_L = 1.04 and d_U = 1.77
dstat is between d_L and d_U, so the test is inconclusive
DWstat = 0: perfect positive autocorrelation
DWstat = 4: perfect negative autocorrelation
DWstat = 2: no autocorrelation
H0: ρ ≤ 0 (no positive autocorrelation)
HA: ρ > 0 (positive autocorrelation)
Level of significance = 5%
Critical dstat: d_L = 1.04, d_U = 1.77
[Figure: number line from 0 to 2 with regions "Reject H0" below d_L, "inconclusive" between d_L and d_U, and "Fail to reject H0" above d_U; 1.5 is marked on the axis]
Decision: dstat is between d_L and d_U, so the test is inconclusive
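The decision rule for positive autocorrelation can be written as a short function. This is a sketch only: the function name is made up, and d = 1.5 is used as an illustrative value inside the inconclusive range, together with the critical values d_L = 1.04 and d_U = 1.77 quoted for N = 25, K = 4 at the 5% level.

```python
def dw_test_positive(d, d_lower, d_upper):
    """One-sided Durbin-Watson decision for H0: rho <= 0 vs HA: rho > 0."""
    if d < d_lower:
        return "reject H0"            # significant positive autocorrelation
    if d > d_upper:
        return "do not reject H0"     # no evidence of positive autocorrelation
    return "inconclusive"             # d lies between d_L and d_U

print(dw_test_positive(1.5, 1.04, 1.77))   # inconclusive
```

A value below 1.04 would lead to rejecting H0, and a value above 1.77 would lead to not rejecting it.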
If dstat > 2, you will need to test for negative autocorrelation.
Null and alternative hypotheses
– H0: ρ ≥ 0 (no negative autocorrelation)
– HA: ρ < 0 (negative autocorrelation)
Choose the level of significance (1% or 5%)
Find the critical dstat values d_L and d_U (page )
Decision rule
– If dstat > 4 - d_L, reject H0: there is significant negative first-order autocorrelation
– If dstat < 4 - d_U, don’t reject H0: there is no evidence of significant autocorrelation
– If dstat is between 4 - d_U and 4 - d_L, the test is inconclusive.
Example
[EViews output: Dependent Variable: CONSUMPTION; Method: Least Squares; Date: 11/09/08; Sample: 1 30; Included observations: 30; regressors C, INCOME, WEALTH; Sum squared resid 4.33E+09; other numeric values lost]
dstat > 2, so test for negative autocorrelation
Let’s test for autocorrelation at the 1% level in our example
H0: ρ ≥ 0 (no negative autocorrelation)
HA: ρ < 0 (negative autocorrelation)
1% level of significance, k = 2, n = 30
d_L = 1.07, d_U = 1.34
4 - d_L = 2.93, 4 - d_U = 2.66
dstat < 4 - d_U, so don’t reject H0
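The mirror-image rule for negative autocorrelation compares dstat with 4 - d_L and 4 - d_U. In the sketch below the function name is made up and d = 2.3 is a hypothetical value (the actual dstat from the consumption example is not reproduced here); the critical values are the 1% values for k = 2, n = 30.

```python
def dw_test_negative(d, d_lower, d_upper):
    """One-sided Durbin-Watson decision for H0: rho >= 0 vs HA: rho < 0."""
    if d > 4 - d_lower:
        return "reject H0"            # significant negative autocorrelation
    if d < 4 - d_upper:
        return "do not reject H0"     # no evidence of negative autocorrelation
    return "inconclusive"             # d lies between 4 - d_U and 4 - d_L

# Hypothetical d = 2.3 with d_L = 1.07 and d_U = 1.34
print(dw_test_negative(2.3, 1.07, 1.34))   # do not reject H0
```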
Asst 22: Due Thursday
Use the data on Soviet defense spending (Page 335, Data set: DEFEND, Chapter 9) to regress SDH on SDL, USD and NR only.
1. Conduct a Durbin-Watson test for serial correlation at the 5% level of significance.
2. If you find evidence of autocorrelation, is it more likely to be pure or impure autocorrelation? Why?
Thursday, April 14
Exam 3: Monday, April 25, 12:00-2:30 PM
Bring your laptops to class next Tuesday
Collect Asst 22
Use the data on Soviet defense spending (Page 335, Data set: DEFEND, Chapter 9) to regress SDH on SDL, USD and NR only.
1. Conduct a Durbin-Watson test for serial correlation at the 5% level of significance.
2. If you find evidence of autocorrelation, is it more likely to be pure or impure autocorrelation? Why?
Solutions for the Autocorrelation Problem
If the D-W test indicates an autocorrelation problem, what should you do?
1. Adjust the functional form
Sometimes autocorrelation arises because we use a linear form when we should have used a non-linear form
[Figure: scatter of revenue against price with five observations and a fitted straight line]
– With a straight line, the errors form a pattern
– The first 3 observations have positive errors
– The last 2 observations have negative errors
– The revenue curve is not linear (it is bell shaped)
– What should we use?
2. Add other relevant (missing) variables
Sometimes autocorrelation is caused by omitted variables.
[Figure: scatter of consumption against income with five observations]
– We forgot to include wealth in our model
– In year one (obs. 1) wealth goes up drastically: big positive error
– The effect of the increase in wealth in year 1 lingers for 3 years
– The errors form a pattern
– We should include wealth in our model
3. Examine the data
Any systematic error in the collection or recording of data may result in autocorrelation.
After you make adjustments 1, 2 and 3
Test for autocorrelation again
If autocorrelation is still a problem, then suspect pure autocorrelation
– Follow the Cochrane-Orcutt procedure
– Say what?????
Suppose our model is
Y_t = β_0 + β_1 X_t + ε_t (1)
And the error terms in Equation 1 are correlated
ε_t = ρ ε_{t-1} + u_t (2)
Where u_t is not autocorrelated.
Rearranging 2 we get 3
ε_t - ρ ε_{t-1} = u_t (3)
Let’s lag Equation 1
Y_{t-1} = β_0 + β_1 X_{t-1} + ε_{t-1} (4)
Now multiply Equation 4 by ρ
ρ Y_{t-1} = ρ β_0 + ρ β_1 X_{t-1} + ρ ε_{t-1} (5)
Now subtract 5 from 1 to get 6
  Y_t = β_0 + β_1 X_t + ε_t
- ρ Y_{t-1} = - (ρ β_0 + ρ β_1 X_{t-1} + ρ ε_{t-1})
___________________________________
Y_t - ρ Y_{t-1} = β_0 - ρ β_0 + β_1 X_t - ρ β_1 X_{t-1} + ε_t - ρ ε_{t-1} (6)
Note that the last two terms in Equation 6 are equal to u_t (Equation 3)
So 6 becomes
Y_t - ρ Y_{t-1} = β_0 - ρ β_0 + β_1 (X_t - ρ X_{t-1}) + u_t (7)
What is so special about the error term in Equation 7?
It is not autocorrelated
So, instead of Equation 1 we can estimate Equation 7
Define Z_t = Y_t - ρ Y_{t-1} and W_t = X_t - ρ X_{t-1}
Then 7 becomes
Z_t = M + β_1 W_t + u_t (8)
Where M is a constant = β_0 (1 - ρ)
Notice that the slope coefficient of Equation 8 is the same as the slope coefficient of our original Equation 1.
The Cochrane-Orcutt Method: So our job will be
Step 1: Apply OLS to the original model (Equation 1) and find the residuals e_t
Step 2: Use the e_t to estimate Equation 2 and find ρ^ (Note: this equation does not have an intercept.)
Step 3: Multiply ρ^ by Y_{t-1} and X_{t-1} and find Z_t and W_t
Step 4: Estimate Equation 8
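The four steps can be sketched in a few lines on simulated data. Everything numeric here is an assumption made for illustration (true β_0 = 5, β_1 = 2, ρ = 0.7, 2000 observations), and the sketch makes a single pass through the steps rather than iterating them.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X b + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated data with AR(1) errors (all values below are assumptions)
rng = np.random.default_rng(1)
n, beta0, beta1, rho = 2000, 5.0, 2.0, 0.7
x = rng.normal(0.0, 2.0, size=n)
u = rng.normal(size=n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]
y = beta0 + beta1 * x + eps

# Step 1: OLS on the original model, keep the residuals e_t
X = np.column_stack([np.ones(n), x])
e = y - X @ ols(X, y)

# Step 2: regress e_t on e_{t-1} with no intercept to get rho-hat
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# Step 3: build Z_t = Y_t - rho-hat * Y_{t-1} and W_t = X_t - rho-hat * X_{t-1}
z = y[1:] - rho_hat * y[:-1]
w = x[1:] - rho_hat * x[:-1]

# Step 4: estimate Z_t = M + beta_1 * W_t + u_t
m_hat, b1_hat = ols(np.column_stack([np.ones(n - 1), w]), z)
b0_hat = m_hat / (1 - rho_hat)      # recover beta_0 from M = beta_0 (1 - rho)

print(rho_hat, b1_hat, b0_hat)
```

The slope from Step 4 estimates the original β_1 directly, while the original intercept has to be backed out from M.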
Luckily, EViews does this (steps 1-4) automatically
All you need to do is add AR(1) to the set of your independent variables.
The estimated coefficient of AR(1) is ρ^
Let’s apply this procedure to Asst 22
[EViews output 1: Dependent Variable: SDH; regressors C, SDL, USD, NR; numeric values lost]
[EViews output 2: Dependent Variable: SDH; regressors C, SDL, USD, NR, AR(1); numeric values lost except the fragment "6.71E" for USD]
What is the coefficient on AR(1)? It is ρ^
What happened to the standard errors as we corrected for serial correlation? They went up
Positive autocorrelation → understated standard errors
Return and discuss Asst 21
Use the data set FISH in Chapter 8 (P 274) to run the following regression equation: F = f (PF, PB, Yd, P, N)
1) Conduct all 3 tests for imperfect multicollinearity and report your results.
2) If you find evidence of imperfect multicollinearity, suggest and implement a reasonable solution.
[Correlation matrix of F, P, PB, PF, YD, N; values lost except the fragments "YD 1 0.93" and "N 1"]
First test
– PF is more correlated with PB than with F: PF is a problem
– Yd is more correlated with PB and PF than with F: Yd is a problem
– N is more correlated with PB, PF and Yd than with F: N is a problem
– PB is more correlated with PF than with F: PB is a problem
– P is more correlated with everything else than with F: P is a problem
[Same correlation matrix of F, P, PB, PF, YD, N]
Second test: problem areas
– PF and PB
– PF and Yd
– PF and N
– PB and Yd
– Yd and N
Note: F being highly correlated with the independent variables is a good thing, not a bad thing
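Scanning a correlation matrix for problem pairs is easy to automate. The sketch below uses made-up data, not the FISH data set: the three series and their names A, B, C are hypothetical, and the 0.8 cutoff is a rule of thumb assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)     # nearly a copy of a
c = rng.normal(size=n)               # unrelated to a and b

names = ["A", "B", "C"]              # hypothetical regressor names
data = np.column_stack([a, b, c])
corr = np.corrcoef(data, rowvar=False)   # columns are variables

threshold = 0.8                      # assumed rule-of-thumb cutoff
flagged = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
print(flagged)                       # [('A', 'B')]
```

Only pairs of independent variables belong in this check; the dependent variable is left out because a high correlation with it is desirable.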
Test 3
Need 5 regression equations
1. PF = f (P, Yd, PB, N)
2. P = f (PF, Yd, PB, N)
3. Yd = f (P, PF, PB, N)
4. PB = f (PF, Yd, P, N)
5. N = f (PF, Yd, PB, P)
For each, find R², then find VIF = 1 / (1 - R²)
In all five, VIF > 5
Each independent variable is highly correlated with the rest
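The auxiliary regressions behind the VIF can be run in a loop. The following sketch uses simulated regressors rather than the FISH variables; the function name and the data are assumptions, and the formula is the usual VIF = 1 / (1 - R²).

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress it on the other columns plus a constant."""
    n, k = X.shape
    result = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        total = (y - y.mean()) @ (y - y.mean())
        r2 = 1 - (resid @ resid) / total     # R-squared of the auxiliary regression
        result.append(1 / (1 - r2))
    return result

# Simulated regressors (assumption): x2 tracks x1 closely, x3 is unrelated
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.3 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

Here the first two values exceed 5 while the third stays near 1, so only the x1 and x2 pair would be flagged.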
Solutions
1. Increase sample size
– Note: we want df of at least 30; we have df = 19
2. Do we have an irrelevant variable?
– Seth argued N is not needed
– What is N? (P 273)
– Seth, what was your argument?
3. Generate a new variable that measures the ratio of prices
– Makes sense but doesn’t solve the high correlation between Yd and N
– Note: make sure your transformed variable makes sense, that is, the estimated coefficient has a meaning that people can understand
– The ratio PF/Yd makes no sense