1
Returning to Consumption
2
More on Consumption We return to the consumption problem to illustrate the issue of heteroscedasticity. It turns out that OLS may NOT give us the best estimate of the MPC. The reason is that one of the assumptions of the Gauss-Markov (GM) theorem is probably violated in the consumption model: the data are probably not homoscedastic, i.e. $\mathrm{Var}(u_i \mid x_i) \neq \sigma^2$.
3
Homoscedastic
4
Heteroscedastic
5
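Characteristics of Heteroscedasticity
A systematic pattern exists in the variance of the residuals: $\mathrm{Var}(u_i \mid x_i) = \sigma_i^2 = \sigma^2 f(x_i)$, or more generally $\sigma_i^2 = f(\alpha_1 + \alpha_2 Z_2 + \dots + \alpha_k Z_k)$.
The variance has different values for different observations or groups of observations.
Intuition: if the random part of each observation comes from a roll of dice, then homoscedasticity means every observation uses the same dice and heteroscedasticity means different observations use different dice.
Evident in cross-section or time-series data.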
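The dice intuition can be made concrete with a small simulation. The sketch below is illustrative only: the sample size, the range of x and the choice sd(u_i) = x_i are assumptions made for the example, not part of the slides. It compares the error variance in the high-x half of the sample with that in the low-x half.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

n = 2000
# Regressor spread evenly over [1, 10]; already in ascending order.
x = [1 + 9 * i / (n - 1) for i in range(n)]

# Homoscedastic errors: every draw comes from the "same dice" (sd = 1).
u_homo = [random.gauss(0, 1) for _ in x]
# Heteroscedastic errors: the "dice" change with x (assumed sd = x_i).
u_hetero = [random.gauss(0, xi) for xi in x]


def variance_ratio(u):
    """Sample variance of the high-x half divided by that of the low-x half."""
    half = len(u) // 2
    return statistics.variance(u[half:]) / statistics.variance(u[:half])


homo_ratio = variance_ratio(u_homo)
hetero_ratio = variance_ratio(u_hetero)
print(f"homoscedastic ratio:   {homo_ratio:.2f}")
print(f"heteroscedastic ratio: {hetero_ratio:.2f}")
```

With the same dice the ratio should be close to 1; with different dice it should be well above 1.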
6
Consequences OLS is still unbiased. OLS is still consistent.
OLS is no longer efficient. The variance formula used previously is incorrect, so the usual significance tests, confidence intervals, etc. cannot be used. Aside: a corrected (heteroscedasticity-robust) variance formula can be used instead. Stata: regress y x, robust. We don't bother with this because we can do better with an alternative estimator.
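For the simple regression $y_i = \beta_1 + \beta_2 x_i + u_i$, the corrected variance formula can be written out explicitly. What follows is the standard textbook result, given as a sketch rather than taken from the slides; $b_2$ denotes the OLS slope and $e_i$ the OLS residuals.

```latex
\mathrm{Var}(b_2)
  = \frac{\sum_i (x_i - \bar{x})^2 \, \sigma_i^2}{\left[ \sum_i (x_i - \bar{x})^2 \right]^2},
\qquad
\widehat{\mathrm{Var}}(b_2)
  = \frac{\sum_i (x_i - \bar{x})^2 \, e_i^2}{\left[ \sum_i (x_i - \bar{x})^2 \right]^2}
```

When $\sigma_i^2 = \sigma^2$ for every observation, the first expression collapses to the usual $\sigma^2 / \sum_i (x_i - \bar{x})^2$. The second expression replaces the unknown $\sigma_i^2$ with the squared residuals; statistical packages typically also apply a small-sample degrees-of-freedom scaling.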
7
Testing for Heteroscedasticity
Plot of residuals: sort the residuals by an explanatory variable and plot them against that variable, then look for a pattern. Do this for each explanatory variable. This is not a formal test, but it can give an idea of what is going on. A plot with no visible pattern can be used to reject the idea of heteroscedasticity.
8
An example of Het
9
Consumption Example
10
Goldfeld-Quandt test
Used when $\sigma_i^2 = \sigma^2 f(x_i)$, i.e. the variance is related to one variable only.
State the hypotheses: $H_0: \sigma_i^2 = \sigma^2$ against $H_1: \sigma_i^2 \neq \sigma^2$. Note: the null is homoscedasticity.
Sort the observations in ascending order of $x_i$.
Omit the middle 20% of observations ($c$ observations), so that $(n-c)$ observations remain.
Estimate the original model separately for the two samples: the first $(n-c)/2$ observations (keep $RSS_1$) and the last $(n-c)/2$ observations (keep $RSS_2$).
Compute $g = RSS_2 / RSS_1$.
If $g > F_c(df, df)$, reject the null hypothesis of homoscedasticity at the chosen significance level.
The test can be carried out for each $x_i$.
11
Intuition of GQ If heteroscedasticity does exist, then we can split the sample into a low-variance part and a high-variance part. Run the regression separately for the two samples. Calculate the ratio of the residual variances (remember $s^2 = RSS/df$). If this ratio is 1, the variances are equal and the data are homoscedastic. So reject the null of homoscedasticity if the ratio is sufficiently bigger than 1. How much bigger? Bigger than the F critical value.
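The Goldfeld-Quandt steps can be sketched in a few lines of Python for a model with a single regressor. This is a minimal illustration under stated assumptions: the function names `rss_simple_ols` and `goldfeld_quandt` are hypothetical, the data are simulated with sd(u_i) = x_i, and the degrees of freedom are taken as the sub-sample size minus the two estimated coefficients.

```python
import random


def rss_simple_ols(x, y):
    """Residual sum of squares from regressing y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, drop_share=0.2):
    """Return (g, df): the ratio of residual variances and the df of each sub-sample."""
    pairs = sorted(zip(x, y))        # sort observations by ascending x
    n = len(pairs)
    c = int(n * drop_share)          # number of middle observations to omit
    m = (n - c) // 2                 # size of each retained sub-sample
    low, high = pairs[:m], pairs[-m:]
    rss1 = rss_simple_ols([p[0] for p in low], [p[1] for p in low])
    rss2 = rss_simple_ols([p[0] for p in high], [p[1] for p in high])
    df = m - 2                       # two estimated coefficients per regression
    return (rss2 / df) / (rss1 / df), df


# Simulated example (assumed data): error sd grows with x.
random.seed(0)
n = 400
x = [1 + 9 * i / (n - 1) for i in range(n)]
y = [10 + 0.8 * xi + random.gauss(0, xi) for xi in x]
g, df = goldfeld_quandt(x, y)
print(f"g = {g:.2f} with ({df}, {df}) degrees of freedom")
```

The value of g is then compared with the F critical value for (df, df) degrees of freedom, taken from tables or a statistics package.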
12
Consumption Example Test it for nmwage. State the hypotheses:
$H_0: \sigma_i^2 = \sigma^2$ against $H_1: \sigma_i^2 \neq \sigma^2$. Note: the null is homoscedasticity.
Sort the observations in ascending order of nmwage. Stata command: sort nmwage
Omit the middle 20% of observations: $(n-c)$ observations remain.
Two samples: estimate the original model separately for each.
Compute: $g = RSS_2 / RSS_1$; g = / =
If $g > F_c(df, df)$, reject the null hypothesis of homoscedasticity at the chosen significance level.
At the 5% significance level the critical value is F(550,550) = 1.15, so we cannot reject the null at the 5% significance level.
The test can be carried out for each $x_i$.
13
. sort nmwage
. regress cons nmwage if _n<=550
(Stata regression output for the first sub-sample: F(1, 548); coefficients on nmwage and _cons; numeric values not recoverable)
. regress cons nmwage if _n>
(Stata regression output for the second sub-sample: F(1, 548); coefficients on nmwage and _cons; numeric values not recoverable)
14
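White's Test
A more general test that allows more than one variable to influence the variance of the residuals.
Estimate the model $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + u_i$.
Run the auxiliary regression of the squared residuals on the X variables, their squares and their cross products: $e_i^2 = \alpha_1 + \alpha_2 x_{2i} + \alpha_3 x_{3i} + \alpha_4 x_{2i}^2 + \alpha_5 x_{3i}^2 + \alpha_6 x_{2i} x_{3i} + v_i$.
The null hypothesis is homoscedastic errors, i.e. $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$, i.e. $e_i^2$ is constant apart from noise.
Calculate $nR^2 \sim \chi^2_{df}$ from the auxiliary regression, where df is the number of regressors in the auxiliary regression excluding the constant (here 5).
Test: if $nR^2$ exceeds the $\chi^2_{df}$ critical value, reject the null hypothesis.
Comment: why not an F-test?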
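White's procedure for the two-regressor model can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the helper names `solve`, `ols` and `white_test` are hypothetical, the data are simulated with sd(u_i) = x_{2i}, and the regressions are fitted by solving the normal equations directly, which is adequate for small, well-scaled examples only.

```python
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for j in range(col, k + 1):
                m[r][j] -= factor * m[col][j]
    z = [0.0] * k
    for i in reversed(range(k)):
        z[i] = (m[i][k] - sum(m[i][j] * z[j] for j in range(i + 1, k))) / m[i][i]
    return z


def ols(columns, y):
    """OLS of y on a constant plus the given columns; returns (coefficients, residuals, R^2)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sum(e ** 2 for e in resid) / tss
    return beta, resid, r2


def white_test(x2, x3, y):
    """Return (n * R^2, df) from White's auxiliary regression."""
    _, e, _ = ols([x2, x3], y)                       # original model
    e2 = [ei ** 2 for ei in e]                       # squared residuals
    aux_cols = [x2, x3,
                [a * a for a in x2], [b * b for b in x3],
                [a * b for a, b in zip(x2, x3)]]     # levels, squares, cross product
    _, _, r2_aux = ols(aux_cols, e2)
    return len(y) * r2_aux, len(aux_cols)


# Simulated example (assumed data): error sd equals x2.
random.seed(0)
n = 500
x2 = [random.uniform(1, 5) for _ in range(n)]
x3 = [random.uniform(1, 5) for _ in range(n)]
y = [1 + 2 * a + 3 * b + random.gauss(0, a) for a, b in zip(x2, x3)]
stat, df = white_test(x2, x3, y)
print(f"nR^2 = {stat:.2f}, df = {df}")
```

The statistic is compared with the chi-squared critical value for df degrees of freedom; for df = 5 at the 5% level that value is about 11.07.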
15
Consumption Example Test it for nmwage. State the hypotheses:
$H_0: \sigma_i^2 = \sigma^2$ against $H_1: \sigma_i^2 \neq \sigma^2$. Note: the null is homoscedasticity.
Estimate the model and generate the squared residuals.
Regress the squared residuals on all of the variables that may cause heteroscedasticity.
Form the test statistic: $nR^2 = 0.266$.
Find the critical value: chi-squared with df = 2 at $\alpha = 0.05$ is 5.99.
We cannot reject the null at the 5% significance level.
16
. predict u, residual
. gen u2=u^2
. gen nmwage2=nmwage^2
. regress u2 nmwage nmwage2
(Stata regression output: F(2, 1327) = 0.11; coefficients on nmwage, nmwage2 and _cons; remaining numeric values not recoverable)
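The arithmetic behind the reported test statistic $nR^2 = 0.266$ can be reconstructed from the figures that survive in the output. The sample size is an inference: F(2, 1327) with three estimated coefficients implies $n = 1330$, which in turn implies an auxiliary $R^2$ of about 0.0002.

```latex
n = 1327 + 3 = 1330,
\qquad
R^2 = \frac{nR^2}{n} = \frac{0.266}{1330} = 0.0002,
\qquad
nR^2 = 0.266 < \chi^2_{0.05}(2) = 5.99
```

The squared residuals are essentially unrelated to nmwage and its square, which is why the null of homoscedasticity is not rejected.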
17
Efficient Estimation If we find heteroscedasticity, we know that OLS will be inefficient. Remember why this might be a problem (see over). Can we do better? Yes. There is an efficient estimator called Generalised Least Squares (GLS). It has two steps: remove the heteroscedasticity from the data, then do OLS on the transformed data.
18
Prob of error is lower for efficient estimator at any sample size
Same sample size, different estimator
19
The GLS Procedure Assume that $\sigma_i^2$ is known. Basic model:
$Y_i = \beta_1 + \beta_2 X_{2i} + u_i$, with $E(u_i^2) = \sigma_i^2$ (not constant). Create new data with each observation divided by the heteroscedastic standard deviation: $Y_i^* = Y_i / \sigma_i$, $X_{2i}^* = X_{2i} / \sigma_i$, $X_{1i}^* = 1 / \sigma_i$.
20
The GLS Procedure Then run the regression on the transformed data:
$Y_i^* = \beta_1 X_{1i}^* + \beta_2 X_{2i}^* + u_i^*$. The slope estimates are the BLUE of the coefficients of the original model. Note that the intercept term is slightly different (it has now become the coefficient on a variable).
21
How it Works GLS eliminates heteroscedasticity. To see this, note that
$\mathrm{var}(u_i^*) = E(u_i^{*2}) = E[(u_i/\sigma_i)^2] = (1/\sigma_i^2)\,E(u_i^2) = (1/\sigma_i^2)\,\sigma_i^2 = 1$. The variance of the transformed error term is homoscedastic: it is constant. NB: this model does not have a constant now; it has two explanatory variables, $1/\sigma_i$ and $X_{2i}/\sigma_i$. We cannot apply GLS if the exact form of the heteroscedasticity is unknown. So do FGLS (Feasible GLS) and replace $\sigma_i$ with an estimate of $\sigma_i$ from White's test.
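The transformation can be sketched in Python for the case where $\sigma_i$ is known. This is a minimal illustration, not the procedure used for the consumption data: the function name `gls_known_sigma` is hypothetical, the data are simulated with an assumed sd(u_i) = x_i, and the no-constant regression on the two transformed variables is solved from its 2x2 normal equations.

```python
import random


def gls_known_sigma(x, y, sigma):
    """GLS for y_i = b1 + b2 * x_i + u_i when sd(u_i) = sigma_i is known."""
    # Transformed data: divide every variable, including the constant, by sigma_i.
    y_s = [yi / s for yi, s in zip(y, sigma)]
    x1_s = [1.0 / s for s in sigma]                 # transformed constant
    x2_s = [xi / s for xi, s in zip(x, sigma)]      # transformed regressor
    # OLS without an intercept on (x1_s, x2_s): solve the 2x2 normal equations.
    a11 = sum(v * v for v in x1_s)
    a12 = sum(v * w for v, w in zip(x1_s, x2_s))
    a22 = sum(w * w for w in x2_s)
    c1 = sum(v * t for v, t in zip(x1_s, y_s))
    c2 = sum(w * t for w, t in zip(x2_s, y_s))
    det = a11 * a22 - a12 * a12
    b1 = (c1 * a22 - c2 * a12) / det
    b2 = (a11 * c2 - a12 * c1) / det
    return b1, b2


# Simulated example (assumed values): true intercept 10, true slope 0.8.
random.seed(0)
n = 500
x = [1 + 9 * i / (n - 1) for i in range(n)]
sigma = x[:]                                        # assumed known sd of each error
y = [10 + 0.8 * xi + random.gauss(0, s) for xi, s in zip(x, sigma)]
b1_hat, b2_hat = gls_known_sigma(x, y, sigma)
print(f"GLS intercept: {b1_hat:.3f}, GLS slope: {b2_hat:.3f}")
```

The coefficient on the transformed constant is the estimate of the original intercept, and the coefficient on the transformed regressor is the estimate of the original slope. For FGLS, `sigma` would be replaced by an estimate, which must be strictly positive for the division to make sense.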
22
The Consumption Example
Transform the data to eliminate the heteroscedasticity. Use the estimate of $\sigma_i^2$ from White's test.
Stata commands:
predict white
generate c=cons/sqrt(white)
generate y=nmwage/sqrt(white)
The GLS estimate of the MPC is given by the regression on the transformed data.
23
. predict white
(option xb assumed; fitted values)
. gen sigma=white^0.5
. gen c=cons/sigma
. gen y=nmwage/sigma
. regress c y
(Stata regression output: F(1, 1328); coefficients on y and _cons; numeric values not recoverable)
24
Conclusion The example didn’t appear to have heteroscedasticity.
When heteroscedasticity does exist, the difference between GLS and OLS can be substantial. Both are unbiased and consistent, but GLS is preferable because it is efficient, so there is a lower probability of substantial error.
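The efficiency claim can be checked with a small Monte Carlo experiment. The sketch below rests on assumptions chosen for illustration: simulated data with known sd(u_i) = x_i, a hypothetical intercept of 10 and slope of 0.8, and a hypothetical helper `wls_slope`. Weighting each observation by $1/\sigma_i^2$ gives the same estimate as OLS on data divided by $\sigma_i$, so equal weights reproduce OLS and weights $1/\sigma_i^2$ reproduce GLS.

```python
import random
import statistics


def wls_slope(x, y, w):
    """Slope from weighted least squares of y on a constant and x with weights w."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


random.seed(1)
n, reps = 100, 500
beta1, beta2 = 10.0, 0.8                  # hypothetical true intercept and slope
x = [1 + 9 * i / (n - 1) for i in range(n)]
sigma = x[:]                              # assumed known: sd(u_i) = x_i
ols_w = [1.0] * n                         # OLS: equal weights
gls_w = [1 / s ** 2 for s in sigma]       # GLS: weights 1 / sigma_i^2

ols_draws, gls_draws = [], []
for _ in range(reps):
    # Draw a fresh sample from the same model and estimate the slope both ways.
    y = [beta1 + beta2 * xi + random.gauss(0, s) for xi, s in zip(x, sigma)]
    ols_draws.append(wls_slope(x, y, ols_w))
    gls_draws.append(wls_slope(x, y, gls_w))

ols_var = statistics.variance(ols_draws)
gls_var = statistics.variance(gls_draws)
print(f"mean slope  OLS: {statistics.mean(ols_draws):.3f}  GLS: {statistics.mean(gls_draws):.3f}")
print(f"variance    OLS: {ols_var:.4f}  GLS: {gls_var:.4f}")
```

Both averages should sit close to the true slope, reflecting unbiasedness, while the GLS estimates should be noticeably less spread out than the OLS estimates.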