Cross section and panel method

Lecture 10 (Ch8) Heteroskedasticity

Understanding the problem of heteroskedasticity

Heteroskedasticity means that Var(u|X) ≠ σ2: the variance of u depends on X. Consider the following model:

y = β0 + β1x1 + β2x2 + … + βkxk + u

Now recall the series of assumptions:

MLR1. Linear in parameters
MLR2. Random sampling
MLR3. No perfect collinearity
MLR4. Zero conditional mean: E(u|X) = 0
MLR4'. Uncorrelatedness of x and u: Cov(xj, u) = 0
MLR5. Homoskedasticity: Var(u|X) = σ2
MLR6. u follows a normal distribution

MLR4' is a weaker assumption than MLR4.
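To make Var(u|X) ≠ σ2 concrete, here is a small simulation sketch (hypothetical data, numpy only, not from the lecture): the error's standard deviation is made to grow with x, so the conditional variance of u depends on x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(1.0, 5.0, size=n)

# Heteroskedastic errors: sd(u|x) = 0.5*x, so Var(u|x) = 0.25*x^2
u = rng.normal(0.0, 0.5 * x)

# Compare the empirical error variance for small vs large x
var_low = u[x < 2.0].var()    # x between 1 and 2: small variance
var_high = u[x > 4.0].var()   # x between 4 and 5: large variance
print(var_low, var_high)      # variance rises with x: heteroskedasticity
```

Under homoskedasticity (MLR5) the two printed variances would be the same up to sampling noise; here they differ by roughly a factor of nine, which is exactly the failure of MLR5.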

MLR1~MLR4: β̂1, …, β̂k are unbiased and consistent.
MLR1~MLR4': β̂1, …, β̂k are consistent.
MLR1~MLR4 and MLR5: β̂1, …, β̂k are approximately normal in large samples, so the t-test is valid.
MLR1~MLR4' and MLR5: β̂1, …, β̂k are approximately normal in large samples, so the t-test is valid.
MLR1~MLR4, MLR5 and MLR6: the t-statistics have the exact t-distribution, so the t-test is valid.

Note that MLR1~MLR4 (or MLR4') are sufficient conditions for consistency. But in order to conduct the "usual" t-test, you need MLR5, the homoskedasticity assumption. This is because the standard error formula we used before is not consistent without MLR5. If MLR5 fails but you apply the usual standard errors anyway, you may mistakenly judge a parameter to be significant when it is actually not.

Heteroskedasticity-robust standard errors: the one-explanatory-variable case

Heteroskedasticity means that Var(u|X) depends on X. In that case, we have to modify the standard error. Fortunately, there is a method for dealing with heteroskedasticity. For illustrative purposes, I start with the one-explanatory-variable case.

Consider the following model:

yi = β0 + β1xi + ui

Using the asymptotic-normality argument in handout 4, we can show that the OLS estimator satisfies

β̂1 = β1 + Σ(xi − x̄)ui / SSTx, where SSTx = Σ(xi − x̄)2 ……(1)

Equation (1) means that the variance of β̂1 (conditional on the xi) is given by:

Var(β̂1) = Σ(xi − x̄)2 σi2 / SSTx2, where σi2 = Var(ui|xi) ……(2)

Of course, you do not know the σi2 and the ui. But, fortunately, you can consistently estimate this variance.

To see this, notice that the OLS estimators of the coefficients are consistent even under heteroskedasticity. So you can replace ui with the OLS residual ûi. The estimator of the heteroskedasticity-robust variance of β̂1 is then given by:

V̂ar(β̂1) = Σ(xi − x̄)2 ûi2 / SSTx2 ……(3)

The square root of (3) is the heteroskedasticity-robust standard error.
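A minimal numpy sketch of formula (3), on hypothetical data: fit OLS, form residuals, and plug them into (3). The usual homoskedasticity-only standard error is computed alongside for comparison; with an error variance that grows with x, the two differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(0.0, 10.0, n)
u = rng.normal(0.0, 0.05 * x ** 2)   # sd(u|x) grows with x: heteroskedastic
y = 1.0 + 2.0 * x + u

# OLS slope and intercept
xd = x - x.mean()
sst_x = (xd ** 2).sum()              # SST_x = sum (x_i - xbar)^2
b1 = (xd * (y - y.mean())).sum() / sst_x
b0 = y.mean() - b1 * x.mean()
uhat = y - b0 - b1 * x               # OLS residuals

# Usual (homoskedasticity-only) standard error: sqrt(sigma2_hat / SST_x)
sigma2 = (uhat ** 2).sum() / (n - 2)
se_usual = np.sqrt(sigma2 / sst_x)

# Heteroskedasticity-robust standard error, equation (3):
# sqrt( sum (x_i - xbar)^2 * uhat_i^2 / SST_x^2 )
se_robust = np.sqrt(((xd ** 2) * (uhat ** 2)).sum() / sst_x ** 2)

print(se_usual, se_robust)
```

In this design the variance of u is largest where (xi − x̄)2 is large, so the robust standard error comes out larger than the usual one; using the usual formula here would overstate significance.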

Heteroskedasticity-robust standard errors: the multiple-explanatory-variable case

Now consider the following regression:

y = β0 + β1x1 + β2x2 + … + βkxk + u ……(4)

A valid estimator of Var(β̂j) under assumptions MLR1~MLR4 (or MLR4') is given by:

V̂ar(β̂j) = Σ r̂ij2 ûi2 / SSRj2 ……(5)

This is the heteroskedasticity-robust variance.

where ûi is the OLS residual from the original regression model (4), r̂ij is the OLS residual from regressing xj on all the other explanatory variables, and SSRj is the sum of squared residuals from that regression. The heteroskedasticity-robust standard error is the square root of (5). Sometimes this is simply called the robust standard error.
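As a sanity check on formula (5), the following sketch (hypothetical data) computes the robust variance of β̂1 by partialling out: regress x1 on the other regressors, keep the residuals r̂i1, and form Σ r̂i12 ûi2 / SSR12. The same number comes out of the matrix "sandwich" form (X'X)⁻¹ X' diag(û2) X (X'X)⁻¹, which is what software actually computes.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # correlated regressors
u = rng.normal(0.0, 0.3 + np.abs(x1))       # heteroskedastic error
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
uhat = y - X @ beta                          # OLS residuals

# Equation (5): regress x1 on the other regressors (constant and x2),
# keep the residuals r1, and plug them in
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
ssr1 = (r1 ** 2).sum()
var_b1_partial = ((r1 ** 2) * (uhat ** 2)).sum() / ssr1 ** 2

# Same quantity from the sandwich form (X'X)^-1 X' diag(uhat^2) X (X'X)^-1
XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ (X.T * uhat ** 2) @ X @ XtX_inv
var_b1_sandwich = V[1, 1]

print(var_b1_partial, var_b1_sandwich)   # the two agree (same formula)
```

The agreement is an algebraic identity (Frisch-Waugh-Lovell), not an approximation, so the two numbers match to machine precision.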

Heteroskedasticity-robust t-statistic

Once the heteroskedasticity-robust standard error is computed, the heteroskedasticity-robust t-statistic is computed as:

t = (estimate − hypothesized value) / (heteroskedasticity-robust standard error)

Heteroskedasticity-robust F-statistic

When the error term is heteroskedastic, the usual F-test is not valid. A heteroskedasticity-robust F-statistic needs to be computed for testing joint hypotheses. The heteroskedasticity-robust F-statistic is also called the heteroskedasticity-robust Wald statistic.

The heteroskedasticity-robust F-statistic involves fairly complex matrix notation, so the details are not covered in this class. However, STATA computes it automatically.
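For the curious, the matrix form is not covered in class but is short: for H0: Rβ = r, the robust Wald statistic is W = (Rβ̂ − r)' [R V̂ R']⁻¹ (Rβ̂ − r), where V̂ is the robust covariance matrix; under H0, W is asymptotically χ2 with q = (number of restrictions) degrees of freedom. A numpy sketch on hypothetical data, testing the joint hypothesis β1 = β2 = 0 (which is true in this simulated design):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1, x2 = rng.normal(size=(2, n))
u = rng.normal(0.0, 0.5 + np.abs(x1))    # heteroskedastic error
y = 1.0 + u                               # beta1 = beta2 = 0 holds here

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
uhat = y - X @ beta

# Robust covariance matrix: (X'X)^-1 X' diag(uhat^2) X (X'X)^-1
XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ (X.T * uhat ** 2) @ X @ XtX_inv

# Wald statistic for H0: R beta = 0, with R selecting beta1 and beta2
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
Rb = R @ beta
W = Rb @ np.linalg.inv(R @ V @ R.T) @ Rb
print(W)   # compare with the chi2(2) 5% critical value, 5.991
```

Since H0 is true in the simulated data, W should usually fall below the χ2(2) critical value of 5.991.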

Heteroskedasticity-robust inference with STATA

STATA computes the heteroskedasticity-robust standard errors and F-statistics automatically: just add the 'robust' option when running a regression. The next slide shows the log-salary regression of academic economists in Japan.

Homoskedasticity version:
reg lsalary female fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer

Heteroskedasticity version:
reg lsalary female fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer, robust

As you can see, the standard errors in the heteroskedasticity-robust version are slightly larger for most of the coefficients. However, the statistical significance of the majority of the variables is not altered.

Now suppose you want to test whether there is a gender salary gap for those with more than 20 years of experience. Then you estimate the following model:

Log(salary) = β0 + β1female + β2female×(exp>20) + …

and test H0: β1 + β2 = 0.

reg lsalary female female_exp20 fullprof assocprof expacademic expacademicsq evermarried kids6 phdabroad extgrant privuniv phdoffer, robust

We fail to reject the null hypothesis, so we do not find evidence of a gender gap for those with more than 20 years of experience. Notice, however, that for those with less than 20 years of experience there is a gender salary gap of 8.7%. Thus the gender gap is concentrated among less experienced workers.
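The test of H0: β1 + β2 = 0 can also be done by hand from the robust covariance matrix V̂, since se(β̂1 + β̂2) = sqrt(V̂11 + V̂22 + 2V̂12). A sketch with simulated data (the lecture's salary data is not reproduced here; the variable names and the 8.7% figure below are just borrowed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 600
female = rng.integers(0, 2, n).astype(float)
exp_gt20 = rng.integers(0, 2, n).astype(float)
female_exp20 = female * exp_gt20

# Made-up coefficients: an 8.7% gap for the less experienced that
# disappears (beta1 + beta2 = 0) for those with experience > 20
lsalary = 8.0 - 0.087 * female + 0.087 * female_exp20 + rng.normal(0, 0.2, n)

X = np.column_stack([np.ones(n), female, female_exp20])
beta = np.linalg.lstsq(X, lsalary, rcond=None)[0]
uhat = lsalary - X @ beta

# Robust covariance matrix
XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ (X.T * uhat ** 2) @ X @ XtX_inv

b_sum = beta[1] + beta[2]                        # gap for experience > 20
se_sum = np.sqrt(V[1, 1] + V[2, 2] + 2 * V[1, 2])
t_stat = b_sum / se_sum
print(b_sum, se_sum, t_stat)   # |t| small: fail to reject beta1 + beta2 = 0
```

In STATA the same linear-combination test is what the post-estimation `test` command does after `reg ..., robust`.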

Note

The heteroskedasticity-robust standard error is robust to any form of heteroskedasticity. Since homoskedasticity is a special case, it is valid under homoskedasticity as well. The majority of empirical research uses heteroskedasticity-robust standard errors, and students are strongly encouraged to always use them.

Testing for heteroskedasticity

Although the heteroskedasticity-robust standard errors work for any type of heteroskedasticity, including the special case of homoskedasticity, there are still reasons to have simple tests that can detect heteroskedasticity.

The basic idea is to test whether u2 is correlated with the explanatory variables. Consider the following auxiliary regression:

û2 = δ0 + δ1x1 + δ2x2 + … + δkxk + v ……(6)

If the errors are homoskedastic, all the slope coefficients should be zero. Thus, consider the following null hypothesis:

H0: δ1 = 0, δ2 = 0, …, δk = 0

If we reject the null hypothesis, this is an indication that heteroskedasticity is present.

We can test the null using either an F-statistic or an LM statistic:

F = (R2û2 / k) / ((1 − R2û2) / (n − k − 1)),  LM = n × R2û2

where R2û2 is the R-squared from regression (6). Note that the F-statistic has an F(k, n−k−1) distribution and the LM statistic is distributed as χ2k. The LM version of the test is called the Breusch-Pagan test for heteroskedasticity.
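The LM version can be computed by hand in a few lines (hypothetical data, numpy only): run OLS, regress the squared residuals on the regressors as in (6), take the R-squared, and compare n × R2 with the χ2(k) critical value (5.991 for k = 2 at the 5% level).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x1, x2 = rng.normal(size=(2, n))
u = rng.normal(0.0, np.exp(0.3 * x1))   # Var(u|x) depends on x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

# Original regression
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
uhat2 = (y - X @ beta) ** 2              # squared OLS residuals

# Auxiliary regression (6): uhat^2 on a constant, x1, x2
delta = np.linalg.lstsq(X, uhat2, rcond=None)[0]
resid = uhat2 - X @ delta
r2 = 1.0 - (resid ** 2).sum() / ((uhat2 - uhat2.mean()) ** 2).sum()

lm = n * r2                              # Breusch-Pagan LM statistic
print(lm)   # compare with the chi2(2) 5% critical value, 5.991
```

Because the simulated error variance really does depend on x1, the LM statistic here lands far above 5.991 and the test correctly rejects homoskedasticity.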

STATA implements a slightly different version of the Breusch-Pagan test, and it is done automatically. You can either compute the LM statistic as described on the previous slide, or use the STATA command. To use it, first run OLS without the robust option, then type the following command:

estat hettest

The null hypothesis that the error is homoskedastic is rejected at the 5% significance level.