Cross section and panel method
Lecture 10 (Ch8) Heteroskedasticity
Understanding the problem of heteroskedasticity
Heteroskedasticity means that Var(u|X) ≠ σ²: the variance of u depends on X. Consider the following model.
y = β0 + β1x1 + β2x2 + … + βkxk + u
Now recall the series of assumptions.
MLR1. Linear in parameters
MLR2. Random sampling
MLR3. No perfect collinearity
MLR4. Zero conditional mean: E(u|X) = 0
MLR4’. Uncorrelatedness of x and u: Cov(xj, u) = 0
MLR5. Homoskedasticity: Var(u|X) = σ²
MLR6. u follows a normal distribution.
MLR4’ is a weaker assumption than MLR4.
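MLR1~MLR4: the OLS estimators of β1, …, βk are unbiased and consistent.
MLR1~MLR4’: the OLS estimators of β1, …, βk are consistent.
MLR1~MLR4 and MLR5: the OLS estimators of β1, …, βk are approximately normal in large samples, so the t-test is valid.
MLR1~MLR4’ and MLR5: the OLS estimators of β1, …, βk are approximately normal in large samples, so the t-test is valid.
MLR1~MLR4, MLR5 and MLR6: the t-statistics for β1, …, βk follow the t-distribution exactly. Therefore, the t-test is valid in any sample size.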
Note that MLR1~MLR4 (or MLR4’) are sufficient conditions for consistency. But in order to conduct the “usual” t-test, you need MLR5, the homoskedasticity assumption. This is because the standard error formula we used before is not a consistent estimator of the standard deviation of the OLS estimator without MLR5. If MLR5 is not satisfied but you apply the usual standard error anyway, you may mistakenly judge a parameter to be significant when it actually is not.
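The warning above can be checked with a small simulation. The following is a minimal Python sketch and is not part of the original lecture. The data-generating process (x standard normal, u = x·e so that Var(u|x) = x²), the sample size, the number of replications and the function names are all assumptions chosen for illustration. The true slope is zero, so every rejection is a false positive.

```python
import math
import random


def usual_t_stat(x, y):
    """t-statistic for H0: beta1 = 0 using the usual (homoskedasticity-only) standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)              # usual estimate of sigma^2
    se_usual = math.sqrt(sigma2_hat / sst_x)
    return b1 / se_usual


def rejection_rate(reps=500, n=100, seed=0):
    """Share of simulated samples in which a true H0: beta1 = 0 is rejected at the nominal 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Assumed error process: u = x * e, so Var(u|x) = x^2 (heteroskedastic).
        y = [1.0 + xi * rng.gauss(0.0, 1.0) for xi in x]   # true beta1 = 0
        if abs(usual_t_stat(x, y)) > 1.96:
            rejections += 1
    return rejections / reps
```

Calling `rejection_rate()` under these assumed settings should give a false-rejection rate well above the nominal 5%, which is the sense in which the usual t-test is not valid without MLR5.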
Heteroskedasticity-robust standard errors: one explanatory variable case
Heteroskedasticity means that Var(u|X) depends on X. In such a case, we have to modify the standard error. Fortunately, there is a method to deal with heteroskedasticity. For illustration, I use the case of one explanatory variable.
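Consider the following model.
yi = β0 + β1xi + ui
Using the argument for asymptotic normality in handout 4, we can show that
$$\sqrt{n}\,(\hat\beta_1 - \beta_1) \xrightarrow{d} N\!\left(0,\ \frac{E[(x_i-\mu_x)^2 u_i^2]}{(\sigma_x^2)^2}\right) \qquad (1)$$
where $\mu_x = E(x_i)$ and $\sigma_x^2 = \operatorname{Var}(x_i)$.
(1) means that the variance of $\hat\beta_1$ is approximately given by:
$$\operatorname{Var}(\hat\beta_1) \approx \frac{1}{n}\,\frac{E[(x_i-\mu_x)^2 u_i^2]}{(\sigma_x^2)^2} \qquad (2)$$
Of course, you do not know $E[(x_i-\mu_x)^2 u_i^2]$ and $\sigma_x^2$. But, fortunately, you can consistently estimate them.
To see this, notice that the OLS estimators of the coefficients are consistent even under heteroskedasticity. So you can replace $u_i$ with the OLS residual $\hat u_i$. The estimator of the heteroskedasticity-robust variance of $\hat\beta_1$ is then given by:
$$\widehat{\operatorname{Var}}(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i-\bar x)^2\, \hat u_i^2}{SST_x^2}, \qquad SST_x = \sum_{i=1}^{n} (x_i-\bar x)^2 \qquad (3)$$
The square root of (3) is the heteroskedasticity-robust standard error.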
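The robust variance in (3), the sum of (xi − x̄)²·ûi² divided by SSTx², can be computed by hand next to the usual formula. The following is a minimal Python sketch for illustration only. The function name and the toy data in the usage note are assumptions, and no degrees-of-freedom correction is applied, so software that uses a small-sample adjustment may report slightly different numbers.

```python
import math


def simple_ols_standard_errors(x, y):
    """Return (b1, usual_se, robust_se) for the slope in y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    sst_x = sum(d * d for d in dev)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Usual formula: sigma^2_hat / SST_x, valid only under homoskedasticity (MLR5).
    usual_var = (sum(u * u for u in resid) / (n - 2)) / sst_x
    # Robust formula (3): sum (x_i - x_bar)^2 * uhat_i^2 / SST_x^2.
    robust_var = sum(d * d * u * u for d, u in zip(dev, resid)) / sst_x ** 2
    return b1, math.sqrt(usual_var), math.sqrt(robust_var)
```

For example, `simple_ols_standard_errors([0, 0, 1, 1], [0, 2, 1, 5])` uses a made-up four-point sample; the two standard errors differ because the squared residuals are larger where x = 1.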
Heteroskedasticity-robust standard errors: multiple explanatory variable case
Now, consider the following regression.
y = β0 + β1x1 + β2x2 + … + βkxk + u …….. (4)
A valid estimator of $\operatorname{Var}(\hat\beta_j)$ under assumptions MLR1~MLR4 (or MLR4’) is given by
$$\widehat{\operatorname{Var}}(\hat\beta_j) = \frac{\sum_{i=1}^{n} \hat r_{ij}^2\, \hat u_i^2}{SSR_j^2} \qquad (5)$$
This is the heteroskedasticity-robust variance, where $\hat u_i$ is the OLS residual from the original regression model (4), $\hat r_{ij}$ is the i-th OLS residual from regressing xj on all other explanatory variables, and $SSR_j$ is the sum of squared residuals from that regression.
The heteroskedasticity-robust standard error is the square root of (5). Sometimes, this is simply called the robust standard error.
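The two-step recipe behind (5) can be sketched in code: take the residuals from the full regression, then the residuals from regressing xj on the other regressors. The following is a minimal Python sketch, assuming a small hand-written OLS routine (normal equations solved by Gaussian elimination). The function names are hypothetical and, as before, no degrees-of-freedom correction is applied.

```python
def ols_residuals(columns, y):
    """Residuals from regressing y on an intercept plus the given regressor columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return [y[i] - sum(b * v for b, v in zip(beta, X[i])) for i in range(n)]


def robust_variance(columns, y, j):
    """Formula (5) for the coefficient on columns[j]: sum(r_ij^2 * u_i^2) / SSR_j^2."""
    u_hat = ols_residuals(columns, y)                  # residuals from model (4)
    others = [col for m, col in enumerate(columns) if m != j]
    r_hat = ols_residuals(others, columns[j])          # x_j with the other regressors partialled out
    ssr_j = sum(r * r for r in r_hat)
    return sum(r * r * u * u for r, u in zip(r_hat, u_hat)) / ssr_j ** 2
```

With a single regressor there are no other variables to partial out, so the auxiliary residual is xi − x̄ and SSRj equals SSTx. The function then reduces to the one-variable formula.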
Heteroskedasticity-robust t-statistic
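Once the heteroskedasticity-robust standard error is computed, the heteroskedasticity-robust t-statistic is computed as
$$t = \frac{\text{estimate} - \text{hypothesized value}}{\text{heteroskedasticity-robust standard error}}$$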
Heteroskedasticity-robust F-statistic
When the error term is heteroskedastic, the usual F-test is not valid. A heteroskedasticity-robust F-statistic needs to be computed for testing a joint hypothesis. The heteroskedasticity-robust F-statistic is also called the heteroskedasticity-robust Wald statistic.
The heteroskedasticity-robust F-statistic involves fairly complex matrix notation, so the details are not covered in this class. However, STATA computes it automatically.
Heteroskedasticity-robust inference with STATA
STATA computes the heteroskedasticity-robust standard errors and F-statistic automatically. Just use the ‘robust’ option when running a regression. The next slide shows the log salary regression of academic economists in Japan.
Homoskedasticity version
reg lsalary female fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer
Heteroskedasticity version
reg lsalary female fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer, robust
As you can see, the standard errors in the heteroskedasticity-robust version are slightly higher for most of the coefficients. However, the statistical significance of the majority of the variables is not altered.
Now, suppose that you want to test whether there is a gender salary gap for those with experience greater than 20. Then you estimate the following model,
Log(salary) = β0 + β1female + β2female×(exp>20) + …
and test H0: β1 + β2 = 0.
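Because the hypothesis involves the sum of two coefficients, its t-statistic needs the standard error of that sum, which depends on the covariance between the two estimators as well as on their variances. A sketch of the calculation follows. The assumption here is that every variance and covariance term is taken from the heteroskedasticity-robust variance-covariance matrix of the estimators.

```latex
t = \frac{\hat\beta_1 + \hat\beta_2}{\operatorname{se}(\hat\beta_1 + \hat\beta_2)},
\qquad
\operatorname{se}(\hat\beta_1 + \hat\beta_2)
  = \sqrt{\widehat{\operatorname{Var}}(\hat\beta_1)
        + \widehat{\operatorname{Var}}(\hat\beta_2)
        + 2\,\widehat{\operatorname{Cov}}(\hat\beta_1, \hat\beta_2)}
```

The two individual robust standard errors are therefore not enough on their own, because the covariance term can be of either sign.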
reg lsalary female female_exp20 fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer, robust
We fail to reject the null hypothesis. Thus, we do not find evidence of a gender gap for those with experience greater than 20. But notice that, for those with experience less than 20, there is a gender salary gap of 8.7%. Thus, the gender gap is concentrated among less experienced workers.
Note
The heteroskedasticity-robust standard error is valid under any form of heteroskedasticity. Since homoskedasticity is a special case of heteroskedasticity, it is valid under homoskedasticity as well. Most empirical research uses heteroskedasticity-robust standard errors. It is highly recommended that students always use them.
Testing for heteroskedasticity
Although the heteroskedasticity-robust standard errors work for any type of heteroskedasticity, including the special case of homoskedasticity, there are some reasons for having a simple test that can detect heteroskedasticity.
The basic idea is to test whether u² is correlated with the explanatory variables.
Consider the following regression, where $\hat u$ is the OLS residual from the original model.
$$\hat u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + \text{error} \qquad (6)$$
If the error terms are homoskedastic, all the slope coefficients should be zero. Thus, consider the following null hypothesis.
H0: δ1 = 0, δ2 = 0, …, δk = 0
If we reject the null hypothesis, this is an indication that heteroskedasticity is present.
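We can test the null using either the F-statistic or the LM statistic:
$$F = \frac{R^2_{\hat u^2}/k}{(1 - R^2_{\hat u^2})/(n-k-1)}, \qquad LM = n \cdot R^2_{\hat u^2}$$
where $R^2_{\hat u^2}$ is the R-squared from the regression (6). Note that, under the null hypothesis, the F-statistic has approximately an F(k, n−k−1) distribution and the LM statistic is asymptotically distributed as χ² with k degrees of freedom. The LM version of the test is called the Breusch-Pagan test for heteroskedasticity.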
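The test can be traced step by step for the simplest case of one explanatory variable (k = 1). The following is a minimal Python sketch for illustration, not the procedure any particular software package implements. The function names and the toy data in the usage note are assumptions. The p-value line uses the fact that a χ² variable with 1 degree of freedom is a squared standard normal, so it applies only when k = 1.

```python
import math


def fit_line(x, y):
    """OLS of y on an intercept and x; returns (residuals, R-squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    sst_y = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sum(u * u for u in resid) / sst_y
    return resid, r2


def breusch_pagan_one_regressor(x, y):
    """Return (F, LM, LM p-value) for H0: delta1 = 0 with k = 1 regressor."""
    n, k = len(x), 1
    u_hat, _ = fit_line(x, y)                       # step 1: original regression
    _, r2 = fit_line(x, [u * u for u in u_hat])     # step 2: squared residuals on x
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    lm = n * r2
    # For 1 degree of freedom, P(chi2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return f_stat, lm, p_value
```

For instance, `breusch_pagan_one_regressor([-1, -1, 0, 0, 1, 1], [-2, 0, -2, 2, -1, 3])` runs the test on a made-up six-point sample. Such a small sample is for tracing the arithmetic only, since the χ² approximation is a large-sample result.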
STATA implements a slightly different version of the Breusch-Pagan test, and it is done automatically. You can either compute the LM statistic as described in the previous slide, or use the STATA command to run the test automatically. To use the STATA command, first run OLS without the robust option, then type the following command.
estat hettest
The null hypothesis that the error is homoskedastic is rejected at the 5% significance level.