
8.4 Weighted Least Squares Estimation
-Before heteroskedasticity-robust statistics existed, one needed to know the form of the heteroskedasticity
-Het was then corrected using WEIGHTED LEAST SQUARES (WLS)
-This method is still useful today: if the heteroskedasticity can be correctly modeled, WLS is more efficient than OLS
-ie: WLS becomes BLUE

8.4 Known Heteroskedasticity
-Assume first that the form of heteroskedasticity is known and expressed as:
Var(u|X) = σ²h(X)
-Where h(X) is some function of the independent variables
-since variance must be positive, h(X) > 0 for all valid combinations of X
-given a random sample, we can write:
Var(u_i|X_i) = σ²h(x_i) = σ²h_i

8.4 Known Het Example
-Assume that sanity is a function of econometrics knowledge and other factors
-However, by studying econometrics two things happen: either one becomes more sane as one understands the world, or one becomes more crazy as one is pulled into a never-ending vortex of causal relationships
-Therefore the variance of the error term grows with econometrics knowledge

8.4 Known Heteroskedasticity
-Since h_i is a function of X_i, we know that:
E(u_i/√h_i | X_i) = 0
-Therefore:
Var(u_i/√h_i | X_i) = E(u_i²|X_i)/h_i = σ²h_i/h_i = σ²
-So dividing the model through by √h_i solves the heteroskedasticity

8.4 Fixing Het – And Stay Down!
-We therefore have the modified equation:
y_i/√h_i = β_0/√h_i + β_1(x_i1/√h_i) + ... + β_k(x_ik/√h_i) + u_i/√h_i
-Or alternately:
y_i* = β_0x_i0* + β_1x_i1* + ... + β_kx_ik* + u_i*     (8.26)
-Note that although our estimates for β_j will change (and their standard errors become valid), their interpretation is the same as in the straightforward OLS model (don't try to bring h into your interpretation)

8.4 Het Fixing – “I am the law”
-(8.26) is linear and satisfies MLR.1
-if the original sample was random, nothing changes, so MLR.2 is satisfied
-if no perfect collinearity existed before, MLR.3 is still satisfied now
-E(u_i*|X_i*) = 0, so MLR.4 is satisfied
-Var(u_i*|X_i*) = σ², so MLR.5 is satisfied
-if u_i has a normal distribution, so does u_i*, so MLR.6 is satisfied
-Thus if the original model satisfies everything but homoskedasticity, the new model satisfies MLR.1 to MLR.6

8.4 Het Fix – Control the Het Pop
-These β_j* estimates are different from typical OLS estimates and are examples of GENERALIZED LEAST SQUARES (GLS) ESTIMATORS
-this GLS estimation provides standard errors, t statistics and F statistics that are valid
-Since these estimators satisfy all 6 CLM assumptions, they are BLUE, so GLS is more efficient than OLS
-Note that OLS is a special case of GLS where h_i = 1

8.4 Het Fix – Who broke it anyhow?
-Note that the R² obtained from this regression is useful for F statistics but is NOT useful for its typical interpretation
-this is because it measures how much of the variation in y* is explained by the x*'s, not how much of the variation in y is explained by the x's
-when GLS estimators are used to correct for heteroskedasticity, they are called WEIGHTED LEAST SQUARES (WLS) ESTIMATORS
-most econometric programs have commands to minimize the weighted sum of squared residuals:
Σ_i (y_i − b_0 − b_1x_i1 − ... − b_kx_ik)²/h_i
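As a sketch (not from the original slides), the equivalence between the two routes to WLS can be illustrated in NumPy on synthetic data: OLS on the starred (transformed) variables gives the same estimates as directly solving the weighted normal equations. The data-generating process and the assumed form h(x) = x are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with known heteroskedasticity: Var(u|x) = sigma^2 * h(x), h(x) = x
n = 200
x = rng.uniform(1.0, 5.0, n)
h = x                                   # assumed (hypothetical) form of heteroskedasticity
u = rng.normal(0.0, 1.0, n) * np.sqrt(h)
y = 2.0 + 3.0 * x + u

X = np.column_stack([np.ones(n), x])    # design matrix with intercept

# Route 1: OLS on the transformed ("starred") variables, dividing by sqrt(h_i)
w = 1.0 / np.sqrt(h)
beta_transformed, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)

# Route 2: directly minimize the weighted sum of squared residuals
# sum_i (y_i - x_i'b)^2 / h_i, via the normal equations (X'WX)b = X'Wy, W = diag(1/h_i)
W = np.diag(1.0 / h)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

print(np.allclose(beta_transformed, beta_wls))  # -> True: the two routes coincide
```

The transformed-variable route is exactly what (8.26) describes; the weighted-SSR route is what most software implements.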

8.4 Incorrect Correcting?
-What happens if h(X) is misspecified and WLS is run (ie: if one expects x_1 to cause het but x_3 actually causes het)?
1) WLS is still unbiased and consistent (similar to OLS)
2) Standard errors (and thus t and F tests) are no longer valid
-to avoid this, one can always apply fully robust inference for WLS (as we saw for OLS in 8.2)
-this can be tedious

8.4 Incorrect Correcting?
-WLS is often criticized as being better than OLS ONLY IF the form of het is correctly chosen
-one may argue that making some correction for het is better than none at all
-there is always the option of using robust WLS estimation
-in cases of doubt, both robust WLS and robust OLS results can be reported

8.4 Averages and Het
-Heteroskedasticity will always exist when AVERAGES over groups of different sizes are used
-when using averages, each observation is the sum of all individual observations in the group divided by the group size m_i:
ȳ_i = (1/m_i)Σ_j y_ij
-Therefore in our true regression, our error term is the sum of all individual observations' error terms divided by group size:
ū_i = (1/m_i)Σ_j u_ij

8.4 Averages and Het
-If the individual model is homoskedastic, and no correlation exists among the individual errors, then the average equation is heteroskedastic with h_i = 1/m_i
-In this way larger groups receive more weight in the regression; this is because their averages are measured more precisely:
Var(ū_i) = σ²/m_i
-For example, assume that we run a regression on how math knowledge impacts grades in econ classes. Bigger classes (Econ 299) would be weighted to give more information than smaller classes (Econ – Love and Econ)
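One consequence worth sketching: when x is constant within each group and the individual-level errors are homoskedastic, OLS on the individual data and WLS on the group averages (with weights m_i, since h_i = 1/m_i) give exactly the same estimates. A hypothetical NumPy illustration (group sizes and coefficients invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Individual-level data: homoskedastic errors, x constant within each group
group_sizes = np.array([5, 20, 50, 8, 100])          # m_i
x_groups = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x_ind = np.repeat(x_groups, group_sizes)
y_ind = 1.0 + 0.5 * x_ind + rng.normal(0.0, 1.0, group_sizes.sum())

def lstsq(X, y, w=None):
    """Least squares; optional weights w multiply each squared residual."""
    if w is not None:
        sw = np.sqrt(w)
        X, y = X * sw[:, None], y * sw
    return np.linalg.lstsq(X, y, rcond=None)[0]

# OLS on the individual data
X_ind = np.column_stack([np.ones_like(x_ind), x_ind])
beta_ind = lstsq(X_ind, y_ind)

# WLS on the group averages, weighting each group by its size m_i
# (Var(u_bar_i) = sigma^2 / m_i, so h_i = 1/m_i and the WLS weight is 1/h_i = m_i)
y_bar = np.array([y_ind[x_ind == g].mean() for g in x_groups])
X_bar = np.column_stack([np.ones_like(x_groups), x_groups])
beta_avg = lstsq(X_bar, y_bar, w=group_sizes.astype(float))

print(np.allclose(beta_ind, beta_avg))  # -> True
```

The identity holds because the individual-level sum of squared residuals splits into a within-group part (constant in the coefficients) and a between-group part, which is exactly the m_i-weighted SSR of the averages.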

8.4 Feasible GLS
-In the previous section we assumed that we knew the form of the heteroskedasticity, h(X)
-often this is not the case and we need to use data to estimate ĥ_i
-this yields an estimator called FEASIBLE GLS (FGLS) or ESTIMATED GLS (EGLS)
-Although h(X) can be modeled many ways, we assume that:
Var(u|X) = σ²exp(δ_0 + δ_1x_1 + ... + δ_kx_k)     (8.30)

8.4 Feasible GLS
-Note that while the BP test for het assumed het was linear, here we allow for non-linear het
-although testing for linear het is effective, correcting for het has issues with linear models, as a linear h(X) could be negative, making Var(u|X) negative
-since the δ's are unknown, they must be estimated
-using (8.30),
u² = σ²exp(δ_0 + δ_1x_1 + ... + δ_kx_k)v
-Where v, conditional on X, has a mean of unity

8.4 Feasible GLS
-If we assume v is independent of X,
log(u²) = α_0 + δ_1x_1 + ... + δ_kx_k + e
-Where e has zero mean and is independent of X
-note that the intercept changes, which is unavoidable but not drastically important
-as usual, we only have residuals, not errors, so we run the regression of log(û²) on the x's and obtain fitted values ĝ
-To obtain:
ĥ = exp(ĝ)

8.4 FGLS
To use FGLS to correct for heteroskedasticity:
1) Regress y on all x's and obtain residuals û
2) Create log(û²)
3) Regress log(û²) on all x's and obtain fitted values ĝ
4) Estimate ĥ = exp(ĝ)
5) Run WLS using weights 1/ĥ
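The five steps above can be sketched in NumPy on synthetic data. The data-generating process is invented for illustration, and a real analysis would use an econometrics package; the point is only that each step is a plain regression.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented data with heteroskedasticity of (pretend-)unknown form
n = 500
x1 = rng.uniform(0.0, 2.0, n)
x2 = rng.uniform(0.0, 2.0, n)
u = rng.normal(0.0, 1.0, n) * np.exp(0.5 * x1)   # Var(u|x) grows with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: OLS residuals
uhat = y - X @ ols(X, y)
# Step 2: log of squared residuals
log_u2 = np.log(uhat ** 2)
# Step 3: regress log(uhat^2) on all x's, obtain fitted values ghat
ghat = X @ ols(X, log_u2)
# Step 4: hhat = exp(ghat)
hhat = np.exp(ghat)
# Step 5: WLS with weights 1/hhat (divide each observation by sqrt(hhat))
w = 1.0 / np.sqrt(hhat)
beta_fgls = ols(X * w[:, None], y * w)
print(beta_fgls.round(2))   # should be near the true (1, 2, -1)
```

Exponentiating ĝ in step 4 is what guarantees the estimated variances stay positive, which is why the log model is used instead of a linear one.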

8.4 FGLS
-If we used the actual h(X), our estimator would be unbiased and BEST
-since h(X) is estimated using the same data, FGLS is biased and therefore not BEST
-however, FGLS is consistent and asymptotically more efficient than OLS
-therefore FGLS is a good alternative to OLS in large samples
-note that FGLS estimates are interpreted the same as OLS estimates
-note also that heteroskedasticity-robust standard errors can always be calculated in cases of doubt

8.4 FGLS Alternative
-One alternative is to estimate ĝ by regressing log(û²) on the fitted values ŷ and ŷ² from the OLS equation
-This changes step 3 above, but the remaining steps are the same
-Note that the Park (1966) test is based on FGLS but is inferior to our previous tests because FGLS is only consistent, not unbiased

8.4 F Tests and WLS
When conducting F tests using WLS:
1) First estimate the restricted and unrestricted models using OLS
2) After determining the weights, use the SAME weights on both the restricted and unrestricted models
3) Conduct the F test as usual
-Luckily most econometric programs have commands for joint tests
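A sketch of the weighted F test, assuming a known (hypothetical) weight function h(x) = x_1 and invented data. Both models are estimated with the same weights, and the usual F formula is applied to the weighted sums of squared residuals:

```python
import numpy as np

rng = np.random.default_rng(3)

# Test H0: beta2 = beta3 = 0, under WLS with assumed h(x) = x1
n = 300
x1 = rng.uniform(1.0, 4.0, n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
y = 1.0 + 1.5 * x1 + rng.normal(0.0, 1.0, n) * np.sqrt(x1)  # x2, x3 truly irrelevant

w = 1.0 / np.sqrt(x1)   # same weights used for both models

def weighted_ssr(X, y, w):
    """SSR from OLS on the weighted (starred) variables."""
    Xs, ys = X * w[:, None], y * w
    b = np.linalg.lstsq(Xs, ys, rcond=None)[0]
    return np.sum((ys - Xs @ b) ** 2)

X_ur = np.column_stack([np.ones(n), x1, x2, x3])   # unrestricted model
X_r = np.column_stack([np.ones(n), x1])            # restricted model
ssr_ur = weighted_ssr(X_ur, y, w)
ssr_r = weighted_ssr(X_r, y, w)

q, dof = 2, n - 4                                   # 2 restrictions; n - k - 1
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / dof)
print(F)   # compare against the F(q, dof) critical value
```

Since the restricted model is nested in the unrestricted one and the weights match, ssr_r ≥ ssr_ur always holds, so F is never negative.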

8.4 WLS vs. OLS – Cage Match
-In general, WLS and OLS estimates will always differ somewhat due to sampling error
-However, some differences are problematic:
1) If significant variables change signs
2) If significant variables drastically change magnitudes
-This usually indicates a violation of a Gauss-Markov assumption, generally the zero conditional mean assumption (MLR.4)
-this violation would cause bias
-the Hausman (1978) test exists to test for this, but “eyeballing” is generally sufficient

8.5 Linear Probability Model
-We've already seen that the Linear Probability Model (LPM), where y is a dummy variable, is subject to heteroskedasticity
-the simplest way to deal with this het is to use OLS estimation with heteroskedasticity-robust standard errors
-since OLS estimators are generally inefficient in the LPM, we can instead use FGLS:

8.5 LPM and FGLS
-We know that:
Var(y|X) = p(X)[1 − p(X)]
-Where p(X) is the response probability: the probability that y = 1
-OLS gives us fitted values ŷ and estimates the variance using:
ĥ = ŷ(1 − ŷ)
-Given that we now have ĥ, we can apply FGLS, except for one catch…

8.5 LPM and FGLS
-If our fitted values, ŷ, are outside the (0,1) range, ĥ becomes negative or zero
-if this happens WLS cannot be done, as each observation i is multiplied by 1/√ĥ_i
-The easiest way to fix this is to use OLS with heteroskedasticity-robust statistics
-One alternative is to modify ŷ to fit in the range; for example, let ŷ = 0.01 if ŷ is too low and ŷ = 0.99 if ŷ is too high
-unfortunately this adjustment is arbitrary, so results differ across different choices of adjustment

8.5 LPM and FGLS
To estimate the LPM using FGLS:
1) Estimate the model using OLS to obtain fitted values ŷ
2) If some values of ŷ are outside the unit interval (0,1), adjust those ŷ values
3) Estimate the variance using:
ĥ = ŷ(1 − ŷ)
4) Perform WLS estimation using weights 1/ĥ
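These four steps can be sketched in NumPy on an invented binary-outcome data set (the true response probability and the 0.01/0.99 adjustment bounds are illustrative choices, not prescriptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Linear probability model: y is a 0/1 dummy
n = 400
x = rng.uniform(0.0, 1.0, n)
p = 0.2 + 0.6 * x                      # true response probability, inside (0,1)
y = (rng.uniform(size=n) < p).astype(float)
X = np.column_stack([np.ones(n), x])

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: OLS fitted values
yhat = X @ ols(X, y)
# Step 2: pull fitted values into the unit interval (an arbitrary but common fix)
yhat = np.clip(yhat, 0.01, 0.99)
# Step 3: estimated variance hhat = yhat * (1 - yhat), now guaranteed positive
hhat = yhat * (1.0 - yhat)
# Step 4: WLS with weights 1/hhat
w = 1.0 / np.sqrt(hhat)
beta_wls = ols(X * w[:, None], y * w)
print(beta_wls.round(2))   # should be near the true (0.2, 0.6)
```

The clipping in step 2 is exactly the arbitrariness the slide warns about: different bounds give different weights and hence different estimates.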

8. Heteroskedasticity Review
1) Heteroskedasticity does not affect unbiasedness or consistency, but does affect standard errors and all tests
2) 2 ways to test for het are:
a) Breusch-Pagan Test
b) White Test
3) If the form of het is known, WLS is superior to OLS
4) If the form of het is unknown, FGLS can be run and is asymptotically superior to OLS
5) Failing 3 or 4, heteroskedasticity-robust standard errors can be used with OLS