Heteroskedasticity
What does it mean? The variance of the error term is not constant.
What are its consequences? The least squares results are no longer efficient, and t-test and F-test results may be misleading.
How can you detect the problem? Plot the residuals against each of the regressors, or use one of the more formal tests.
How can you remedy the problem? Respecify the model: look for missing variables; perhaps take logs or choose some other appropriate functional form; or make sure relevant variables are expressed “per capita”.
Consumption function example (cross-section data): creditworthiness as a missing variable?
The Homoskedastic Case
The Heteroskedastic Case
Causes
Model misspecification: omitted variable or improper functional form.
Learning behaviors across time.
Changes in data collection or definitions.
Outliers or breakdown in model.
Frequently observed in cross-sectional data sets where demographics are involved (population, GNP, etc.).
The consequences of heteroskedasticity
OLS estimators are still unbiased (unless there are also omitted variables).
However, OLS estimators are no longer efficient (minimum variance).
The formulae used to estimate the coefficient standard errors are no longer correct, so the t-tests will be misleading. (If the error variance is positively related to an independent variable, then the estimated standard errors are biased downwards and hence the t-values will be inflated.)
Confidence intervals based on these standard errors will be wrong.
Detecting heteroskedasticity
Visual inspection of the scatter diagram or the residuals.
Goldfeld-Quandt test: suitable for a simple form of heteroskedasticity.
Breusch-Pagan test: a test of more general forms of heteroskedasticity.
Plot residuals against one variable at a time
Goldfeld-Quandt test
Order the n cases by the X that you think is correlated with $e_i^2$.
Drop a section of c cases out of the middle (one-fifth of the cases is a reasonable number).
Run separate regressions on both upper and lower samples.
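Goldfeld-Quandt test (cont.)
Do an F-test for the difference in error variances: $F = RSS_2 / RSS_1$, where $RSS_2$ is the residual sum of squares from the subsample expected to have the larger error variance.
F has (n - c - 2k)/2 degrees of freedom for both numerator and denominator, where k is the number of estimated parameters.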
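The Goldfeld-Quandt steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a single regressor plus intercept (so k = 2), made-up data whose error magnitude grows with x, and hypothetical helper names (`rss_of_line`, `goldfeld_quandt`) that do not come from the slides.

```python
def rss_of_line(x, y):
    """Fit y = a + b*x by OLS and return the residual sum of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, drop_frac=0.2, k=2):
    """Return (F, df) for a one-regressor model with k estimated parameters."""
    pairs = sorted(zip(x, y))            # order the cases by X
    n = len(pairs)
    c = round(n * drop_frac)             # cases dropped from the middle
    if (n - c) % 2:                      # keep the two subsamples the same size
        c += 1
    m = (n - c) // 2
    x_lo, y_lo = zip(*pairs[:m])         # lower sample
    x_hi, y_hi = zip(*pairs[n - m:])     # upper sample
    df = m - k                           # equals (n - c - 2k) / 2
    # Equal degrees of freedom, so the variance ratio is just the RSS ratio.
    f_stat = rss_of_line(x_hi, y_hi) / rss_of_line(x_lo, y_lo)
    return f_stat, df


# Made-up data (an assumption): the error magnitude is proportional to x.
x = list(range(1, 31))
y = [2.0 + 0.5 * xi + 0.1 * xi * (-1) ** xi for xi in x]
f_stat, df = goldfeld_quandt(x, y)
print(f"F = {f_stat:.2f} with ({df}, {df}) degrees of freedom")
```

The statistic would then be compared with the critical value of the F distribution with (df, df) degrees of freedom; a large value points to a larger error variance in the upper sample.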
Breusch-Pagan-Godfrey Test
Estimate the model with OLS.
Obtain $\tilde{\sigma}^2 = \sum e_i^2 / n$.
Construct the variables $p_i = e_i^2 / \tilde{\sigma}^2$.
May 21, 2019
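Breusch-Pagan-Godfrey Test (cont.)
Regress $p_i$ on the X (and other?!) variables.
Calculate $\Theta = \tfrac{1}{2}\,\mathrm{ESS}$, half the explained sum of squares from this regression.
Note that $\Theta \sim \chi^2_{m-1}$ asymptotically under homoskedasticity, where m is the number of coefficients in this regression (including the constant).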
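A rough sketch of the Breusch-Pagan-Godfrey calculation follows, assuming a single regressor so that the auxiliary regression has two coefficients and the statistic has 1 degree of freedom. The data and the function names (`fit_line`, `breusch_pagan`) are invented for illustration. The statistic computed is half the explained sum of squares from regressing the scaled squared residuals on x.

```python
def fit_line(x, y):
    """OLS fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def breusch_pagan(x, y):
    """Return half the explained sum of squares of the auxiliary regression."""
    n = len(x)
    a, b = fit_line(x, y)                          # estimate the model by OLS
    e2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    sigma2 = sum(e2) / n                           # sum of squared residuals / n
    p = [v / sigma2 for v in e2]                   # scaled squared residuals
    g0, g1 = fit_line(x, p)                        # regress p on x
    p_bar = sum(p) / n                             # equals 1 by construction
    ess = sum((g0 + g1 * xi - p_bar) ** 2 for xi in x)
    return ess / 2.0


# Made-up data (an assumption): one series with errors growing in x,
# one with errors of constant magnitude.
x = list(range(1, 31))
y_het = [2.0 + 0.5 * xi + 0.1 * xi * (-1) ** xi for xi in x]
y_hom = [2.0 + 0.5 * xi + 0.5 * (-1) ** xi for xi in x]

CHI2_5PCT_1DF = 3.841  # 5% critical value of chi-squared with 1 degree of freedom
theta_het = breusch_pagan(x, y_het)
theta_hom = breusch_pagan(x, y_hom)
print(f"heteroskedastic data: {theta_het:.2f}, homoskedastic data: {theta_hom:.2f}")
```

With more than one variable in the auxiliary regression, the same steps apply but the regression needs a multivariate solver and the degrees of freedom rise accordingly.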
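White’s Generalized Heteroskedasticity test
Estimate the model with OLS and obtain the residuals $e_i$.
Run the following auxiliary regression (shown here for two regressors): $e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + v_i$
Higher powers may also be used, along with more X’s.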
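White’s Generalized Heteroskedasticity test (cont.)
Note that $n R^2 \sim \chi^2$ asymptotically, where $R^2$ comes from the auxiliary regression.
The degrees of freedom equal the number of coefficients estimated in the auxiliary regression minus 1.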
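White's test can be sketched as below, under a simplifying assumption: the model has one regressor, so the auxiliary regression contains only a constant, x and x squared (no cross-products), giving 2 degrees of freedom. The small linear solver and all function names are illustrative helpers rather than part of the slides, and the data are made up.

```python
def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):                   # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def ols_fitted(rows, y):
    """OLS via the normal equations; rows include the constant. Returns fitted values."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [sum(bj * v for bj, v in zip(beta, r)) for r in rows]


def white_test(x, y):
    """Return (n * R^2, df) from the auxiliary regression of e^2 on 1, x, x^2."""
    n = len(x)
    fitted = ols_fitted([[1.0, xi] for xi in x], y)
    e2 = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    aux_rows = [[1.0, xi, xi ** 2] for xi in x]
    aux_fit = ols_fitted(aux_rows, e2)
    mean_e2 = sum(e2) / n
    tss = sum((v - mean_e2) ** 2 for v in e2)
    rss = sum((v - f) ** 2 for v, f in zip(e2, aux_fit))
    r2 = 1.0 - rss / tss
    return n * r2, len(aux_rows[0]) - 1


# Made-up data (an assumption): the error magnitude is proportional to x.
x = [float(v) for v in range(1, 31)]
y = [2.0 + 0.5 * xi + 0.1 * xi * (-1) ** int(xi) for xi in x]
n_r2, df = white_test(x, y)
CHI2_5PCT_2DF = 5.991  # 5% critical value of chi-squared with 2 degrees of freedom
print(f"n*R^2 = {n_r2:.2f} with {df} degrees of freedom")
```

Solving the normal equations directly keeps the sketch dependency-free; for real work a numerically more careful least squares routine would be preferable.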
Remedies
Respecification of the model:
Include relevant omitted variable(s).
Express the model in log-linear form or some other appropriate functional form.
Where respecification won’t solve the problem, use robust heteroskedasticity-consistent standard errors (due to Hal White, Econometrica, 1980).
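To make the robust standard error remedy concrete, here is a minimal sketch for a one-regressor model. It uses the simplest variant of White's estimator (often labelled HC0), in which each squared residual stands in for its own error variance; that choice of variant, the function name and the data are assumptions made for illustration.

```python
import math


def slope_with_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    d = [xi - mx for xi in x]                      # deviations of x from its mean
    sxx = sum(di ** 2 for di in d)
    b = sum(di * (yi - my) for di, yi in zip(d, y)) / sxx
    a = my - b * mx
    e = [yi - a - b * xi for xi, yi in zip(x, y)]  # OLS residuals
    # Conventional formula: one common error variance s^2 for every case.
    s2 = sum(ei ** 2 for ei in e) / (n - 2)
    se_conventional = math.sqrt(s2 / sxx)
    # HC0: each case contributes its own squared residual, weighted by d_i^2.
    se_robust = math.sqrt(sum((di * ei) ** 2 for di, ei in zip(d, e))) / sxx
    return b, se_conventional, se_robust


# Made-up data (an assumption): the error magnitude grows with x squared.
x = list(range(1, 31))
y = [2.0 + 0.5 * xi + 0.01 * xi ** 2 * (-1) ** xi for xi in x]
slope, se_conv, se_rob = slope_with_standard_errors(x, y)
print(f"slope = {slope:.3f}, conventional SE = {se_conv:.4f}, robust SE = {se_rob:.4f}")
```

On data like these, where the error variance rises with the regressor, the conventional standard error comes out smaller than the robust one, which is the downward bias described under the consequences of heteroskedasticity. The coefficient estimates themselves are unchanged; only the standard errors differ.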
Iteratively weighted least squares (IWLS)
1. Obtain estimates of $e_i^2$ using OLS.
2. Use these to get "1st round" estimates of $\sigma_i$.
3. Using the formula above, replace $w_i$ with $1/s_i$ and obtain new estimates for $\alpha$ and $\beta$.
4. Adjust the data.
5. Use these to re-estimate.
6. Repeat Steps 3-5 until $\alpha$ and $\beta$ converge.
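The IWLS loop might look roughly like the sketch below for a model with intercept α and slope β. The slides do not say how the estimates of σ_i are formed, so the sketch assumes one possibility: regress the absolute residuals on x and use the fitted values as s_i. That variance model, the small floor on s_i, the tolerance and the function names are all assumptions for illustration.

```python
def weighted_line(x, y, w):
    """Fit y = a + b*x minimising sum((w_i * residual_i)**2); returns (a, b)."""
    big_w = [wi ** 2 for wi in w]
    sw = sum(big_w)
    mx = sum(W * xi for W, xi in zip(big_w, x)) / sw
    my = sum(W * yi for W, yi in zip(big_w, y)) / sw
    sxx = sum(W * (xi - mx) ** 2 for W, xi in zip(big_w, x))
    sxy = sum(W * (xi - mx) * (yi - my) for W, xi, yi in zip(big_w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def iwls(x, y, tol=1e-8, max_iter=50):
    """Return (alpha, beta, iterations used)."""
    n = len(x)
    ones = [1.0] * n
    a, b = weighted_line(x, y, ones)               # start from plain OLS
    for iteration in range(1, max_iter + 1):
        abs_e = [abs(yi - a - b * xi) for xi, yi in zip(x, y)]
        # Assumed variance model: sigma_i is linear in x_i.
        g0, g1 = weighted_line(x, abs_e, ones)
        floor = 0.01 * sum(abs_e) / n              # guard against s_i <= 0
        s = [max(g0 + g1 * xi, floor) for xi in x]
        w = [1.0 / si for si in s]                 # weights w_i = 1 / s_i
        a_new, b_new = weighted_line(x, y, w)      # re-estimate on weighted data
        change = max(abs(a_new - a), abs(b_new - b))
        a, b = a_new, b_new
        if change < tol:                           # estimates have settled
            break
    return a, b, iteration


# Made-up data (an assumption): true alpha = 2, beta = 0.5, errors growing with x.
x = list(range(1, 31))
y = [2.0 + 0.5 * xi + 0.1 * xi * (-1) ** xi for xi in x]
alpha, beta, n_iter = iwls(x, y)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, iterations used = {n_iter}")
```

Weighting each case by 1/s_i is equivalent to dividing the dependent variable, the regressor and the constant by s_i and running OLS on the adjusted data, which is the "adjust data" step in the list above.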