Endogenous Regressors and Instrumental Variables Estimation Adapted from Vera Tabakova, East Carolina University

10.1 Linear Regression with Random x's
10.2 Cases in which x and e are Correlated
10.3 Estimators Based on the Method of Moments
10.4 Specification Tests

The assumptions of the simple linear regression are:
SR1. $y_i = \beta_1 + \beta_2 x_i + e_i$
SR2. $E(e_i) = 0$
SR3. $\operatorname{var}(e_i) = \sigma^2$
SR4. $\operatorname{cov}(e_i, e_j) = 0$
SR5. The variable $x_i$ is not random, and it must take at least two different values.
SR6. (optional) $e_i \sim N(0, \sigma^2)$

The purpose of this chapter is to discuss regression models in which $x_i$ is random and correlated with the error term $e_i$. We will:
- Discuss the conditions under which having a random x is not a problem, and how to test whether our data satisfy these conditions.
- Present cases in which the randomness of x causes the least squares estimator to fail.
- Provide estimators that have good properties even when $x_i$ is random and correlated with the error $e_i$.

A10.1. $y_i = \beta_1 + \beta_2 x_i + e_i$ correctly describes the relationship between $y_i$ and $x_i$ in the population, where $\beta_1$ and $\beta_2$ are unknown (fixed) parameters and $e_i$ is an unobservable random error term.
A10.2. The data pairs $(x_i, y_i)$, $i = 1, \ldots, N$, are obtained by random sampling. That is, the data pairs are collected from the same population, by a process in which each pair is independent of every other pair. Such data are said to be independent and identically distributed.

A10.3. The expected value of the error term $e_i$, conditional on the value of $x_i$, is zero: $E(e_i \mid x_i) = 0$. This assumption implies that we have (i) omitted no important variables, (ii) used the correct functional form, and (iii) there exist no factors that cause the error term $e_i$ to be correlated with $x_i$.
If $E(e_i \mid x_i) = 0$, then we can show that it is also true that $x_i$ and $e_i$ are uncorrelated, so that $\operatorname{cov}(x_i, e_i) = 0$.
Conversely, if $x_i$ and $e_i$ are correlated, then $\operatorname{cov}(x_i, e_i) \neq 0$ and we can show that $E(e_i \mid x_i) \neq 0$.
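
Why does the conditional-mean assumption imply zero correlation? A one-line derivation via the law of iterated expectations (a standard argument, included here for completeness):
$\operatorname{cov}(x_i, e_i) = E(x_i e_i) - E(x_i)E(e_i) = E\big[x_i\,E(e_i \mid x_i)\big] - E(x_i)\,E\big[E(e_i \mid x_i)\big] = 0,$
since $E(e_i \mid x_i) = 0$ makes both terms vanish.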

A10.4. In the sample, $x_i$ must take at least two different values.
A10.5. The variance of the error term, conditional on $x_i$, is a constant: $\operatorname{var}(e_i \mid x_i) = \sigma^2$.
A10.6. The distribution of the error term, conditional on $x_i$, is normal.

- Under assumptions A10.1-A10.4 the least squares estimator is unbiased.
- Under assumptions A10.1-A10.5 the least squares estimator is the best linear unbiased estimator of the regression parameters, conditional on the x's, and the usual estimator of $\sigma^2$ is unbiased.

Under assumptions A10.1-A10.6 the distributions of the least squares estimators, conditional upon the x's, are normal, and their variances are estimated in the usual way. Consequently the usual interval estimation and hypothesis testing procedures are valid.

Figure 10.1 An illustration of consistency

Remark: Consistency is a “large sample” or “asymptotic” property. We have stated another large sample property of the least squares estimators in Chapter 2.6. We found that even when the random errors in a regression model are not normally distributed, the least squares estimators still have approximate normal distributions if the sample size N is large enough. How large must the sample size be for these large sample properties to be valid approximations of reality? In a simple regression 50 observations might be enough. In multiple regression models the number might be much higher, depending on the quality of the data.

A10.3*. $E(e_i) = 0$ and $\operatorname{cov}(x_i, e_i) = 0$ (a weaker assumption than A10.3).

- Under assumption A10.3* the least squares estimators are consistent. That is, they converge to the true parameter values as $N \to \infty$.
- Under assumptions A10.1, A10.2, A10.3*, A10.4 and A10.5, the least squares estimators have approximate normal distributions in large samples, whether the errors are normally distributed or not. Furthermore our usual interval estimators and test statistics are valid, if the sample is large.

If assumption A10.3* is not true, and in particular if $\operatorname{cov}(x_i, e_i) \neq 0$ so that $x_i$ and $e_i$ are correlated, then the least squares estimators are inconsistent. They do not converge to the true parameter values even in very large samples. Furthermore, none of our usual hypothesis testing or interval estimation procedures are valid.

Figure 10.2 Plot of correlated x and e

The true model is $y_i = \beta_1 + \beta_2 x_i + e_i$, but when $x_i$ and $e_i$ are correlated least squares attributes some of the variation in y that is due to e to the regressor x instead, so there is substantial bias…

Figure 10.3 Plot of data, true and fitted regressions, in the case of a positive correlation between x and the error. Example: Wages = f(intelligence).

When an explanatory variable and the error term are correlated, the explanatory variable is said to be endogenous, meaning "determined within the system."

[Path diagram relating X, Y and the error terms u, v, e.]

Omitted factors: experience, ability and motivation. These are captured by the error term, so we expect that $\operatorname{cov}(x_i, e_i) > 0$. We will focus on this case. [Diagram: an unobserved W affects both X and Y.]

Other examples:
- Y = crime, X = marriage, W = "marriageability"
- Y = divorce, X = "shacking up", W = "good match"
- Y = crime, X = watching a lot of TV, W = "parental involvement"
- Y = sexual violence, X = watching porn, W = any unobserved factor that would affect both X and Y
The list is endless!

There is a feedback relationship between $P_i$ and $Q_i$. Because of this feedback, which results because price and quantity are jointly, or simultaneously, determined, we can show that $\operatorname{cov}(P_i, e_i) \neq 0$. The resulting bias (and inconsistency) is called the simultaneous equations bias.
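
To see where this correlation comes from, write down a generic demand and supply pair and solve for the equilibrium price (a sketch of the standard argument; the supply coefficients $\alpha_1, \alpha_2$ and error $u_i$ are illustrative):
Demand: $Q_i = \beta_1 + \beta_2 P_i + e_i$. Supply: $Q_i = \alpha_1 + \alpha_2 P_i + u_i$. Equating the two,
$P_i = \dfrac{\beta_1 - \alpha_1}{\alpha_2 - \beta_2} + \dfrac{e_i - u_i}{\alpha_2 - \beta_2},$
so $P_i$ contains $e_i$ directly, and $\operatorname{cov}(P_i, e_i) = \operatorname{var}(e_i)/(\alpha_2 - \beta_2) \neq 0$ (taking $e_i$ and $u_i$ to be uncorrelated).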

In this case (a lagged dependent variable as a regressor, combined with serially correlated errors) the least squares estimator applied to the lagged dependent variable model will be biased and inconsistent.

When all the usual assumptions of the linear model hold, the method of moments leads us to the least squares estimator. If x is random and correlated with the error term, the method of moments leads us to an alternative, called instrumental variables estimation, or two-stage least squares estimation, that will work in large samples.
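
For reference, the two population moment conditions behind least squares, and their sample counterparts, are:
$E(e_i) = 0 \;\Rightarrow\; \tfrac{1}{N}\sum_{i=1}^{N}\big(y_i - b_1 - b_2 x_i\big) = 0$
$E(x_i e_i) = 0 \;\Rightarrow\; \tfrac{1}{N}\sum_{i=1}^{N} x_i\big(y_i - b_1 - b_2 x_i\big) = 0$
Solving these two equations for $b_1$ and $b_2$ reproduces exactly the least squares normal equations.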

Suppose that there is another variable, z, such that:
- z does not have a direct effect on y, and thus it does not belong on the right-hand side of the model as an explanatory variable once x is in the model.
- z should not be simultaneously affected by y either, of course.
- z is not correlated with the regression error term $e_i$. Variables with this property are said to be exogenous.
- z is strongly (or at least not weakly) correlated with x, the endogenous explanatory variable.
A variable z with these properties is called an instrumental variable.

- An instrument is a variable z correlated with x but not with the error e.
- In addition, the instrument does not directly affect y and thus does not belong in the actual model as a separate regressor (of course it should affect y through the instrumented regressor x).
- It is common to have more than one instrument for x (just not good ones!).
- These instruments, $z_1, z_2, \ldots, z_s$, must be correlated with x, but not with e.
- Consistent estimation is obtained through the instrumental variables or two-stage least squares (2SLS) estimator, rather than the usual OLS estimator.

Using Z, an "instrumental variable" for X, is one solution to the problem of omitted variables bias. To be a valid instrument for X, Z must be:
- Relevant: correlated with X
- Exogenous: not correlated with Y except through its correlation with X
[Diagram: Z affects X, which affects Y; the unobserved W and the error e affect both X and Y.]

Based on the method of moments: replace the troublesome condition $E(x_i e_i) = 0$ with the instrument's moment condition $E(z_i e_i) = 0$, and impose the sample analogues
$\tfrac{1}{N}\sum_{i=1}^{N}\big(y_i - \hat\beta_1 - \hat\beta_2 x_i\big) = 0$
$\tfrac{1}{N}\sum_{i=1}^{N} z_i\big(y_i - \hat\beta_1 - \hat\beta_2 x_i\big) = 0$

Solving the previous system, we obtain a new estimator that cleanses the endogeneity of X and exploits only the component of the variation of X that is not correlated with e. We do that by ensuring that we only use the predicted value of X from its regression on Z in the main regression.
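
In the one-instrument case the solution has a familiar closed form (the standard simple-IV formulas):
$\hat\beta_2^{IV} = \dfrac{\sum_i (z_i - \bar z)(y_i - \bar y)}{\sum_i (z_i - \bar z)(x_i - \bar x)}, \qquad \hat\beta_1^{IV} = \bar y - \hat\beta_2^{IV}\,\bar x$
The denominator is the sample covariance between z and x, so a weak instrument (one barely correlated with x) makes the estimator unstable.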

These new estimators have the following properties:
- They are consistent, if $E(z_i e_i) = 0$ and $\operatorname{cov}(z_i, x_i) \neq 0$.
- In large samples the instrumental variable estimators have approximate normal distributions. In the simple regression model,
$\hat\beta_2^{IV} \sim N\!\left(\beta_2,\ \dfrac{\sigma^2}{r_{zx}^2 \sum_i (x_i - \bar x)^2}\right)$
where $r_{zx}$ is the sample correlation between z and x.

The error variance is estimated using the estimator
$\hat\sigma_{IV}^2 = \dfrac{\sum_i \big(y_i - \hat\beta_1^{IV} - \hat\beta_2^{IV} x_i\big)^2}{N - 2}$
Note the $r_{zx}^2$ in the variance above: the stronger the correlation between the instrument and X, the better!

Using instrumental variables is less efficient than OLS (because you must throw away some of the information on the variation of X), so it leads to wider confidence intervals and less precise inference. The bottom line is that when instruments are weak, instrumental variables estimation is not reliable: you throw away too much information when trying to avoid the endogeneity bias.

- Not all of the available variation in X is used.
- Only that component of X that is "explained" by Z is used to explain Y.
[Venn diagram: X = endogenous variable, Y = response variable, Z = instrumental variable.]

[Venn diagrams comparing two scenarios.]
Realistic scenario: very little of X is explained by Z, and/or what is explained does not overlap much with Y.
Best-case scenario: a lot of X is explained by Z, and most of the overlap between X and Y is accounted for.

- The IV estimator is BIASED in finite samples: $E(b_{IV}) \neq \beta$. But it is consistent: $\operatorname{plim}(b_{IV}) = \beta$ as $N \to \infty$. So IV studies must often have very large samples.
- But with endogeneity, $E(b_{LS}) \neq \beta$ and $\operatorname{plim}(b_{LS}) \neq \beta$ anyway…
- Asymptotic behavior of IV: $\operatorname{plim}(b_{IV}) = \beta + \operatorname{Cov}(Z,e)/\operatorname{Cov}(Z,X)$
- If Z is truly exogenous, then $\operatorname{Cov}(Z,e) = 0$ and the second term vanishes.

Three different models to be familiar with:
- First stage ("reduced form" for the endogenous regressor): $EDUC = \alpha_0 + \alpha_1 Z + \omega$
- Structural model: $WAGES = \beta_0 + \beta_1 EDUC + \varepsilon$
- Reduced form for the outcome: $WAGES = \delta_0 + \delta_1 Z + \xi$
An interesting equality: $\delta_1 = \alpha_1 \times \beta_1$, so $\beta_1 = \delta_1 / \alpha_1$.
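
The equality follows by substituting the first stage into the structural model (one line of algebra):
$WAGES = \beta_0 + \beta_1(\alpha_0 + \alpha_1 Z + \omega) + \varepsilon = \underbrace{(\beta_0 + \beta_1\alpha_0)}_{\delta_0} + \underbrace{\alpha_1\beta_1}_{\delta_1}\,Z + \underbrace{(\beta_1\omega + \varepsilon)}_{\xi}$
so matching coefficients gives $\delta_1 = \alpha_1\beta_1$, and hence $\beta_1 = \delta_1/\alpha_1$ (the indirect least squares idea).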

Check it out: the IV estimate is, as expected, much lower than the one from OLS!!! We hope for a high t ratio on the instrument in the first stage.

Use mroz.dta:

summarize
drop if lfp==0
gen lwage = log(wage)
gen exper2 = exper^2
* Basic OLS estimation
reg lwage educ exper exper2
estimates store ls
* IV estimation, done manually in two stages
reg educ exper exper2 mothereduc
predict educ_hat
reg lwage educ_hat exper exper2

But note that the latter will give you wrong standard errors, so it is not recommended to run 2SLS manually. Include in z all G exogenous variables plus the available instruments: the exogenous variables instrument themselves!
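
Rather than running the two stages by hand, the whole estimation can be done in one step. A minimal sketch using the same mroz.dta variables as above (ivregress is the modern Stata command; older texts use ivreg):

* One-step IV estimation: educ is endogenous, mothereduc is the instrument.
* Point estimates match the manual two-stage run, but the standard errors are correct.
ivregress 2sls lwage (educ = mothereduc) exper exper2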

In econometrics, two-stage least squares (TSLS or 2SLS) and instrumental variables (IV) estimation are often used interchangeably. The "two-stage" terminology comes from the time when the easiest way to estimate the model was to actually use two separate least squares regressions. With better software, the computation is now done in a single step, which ensures the other model statistics are computed correctly.

First stage or "reduced form" equation: regress educ on the exogenous variables and the instrument. The coefficient on mothereduc is clearly significant. Good news!!

Second stage equation: regressing lwage on educ_hat, exper and exper2 by OLS gives the IV point estimates, but wrong standard errors!!!

Second stage equation, estimated with proper IV software: the same point estimates, now with correct standard errors!!!

Second stage equation: even better standard errors, with a small-sample correction and robust to heteroskedasticity!!!

A 2-step process:
- Regress x on a constant term, z and all other exogenous variables G, and obtain the predicted values $\hat{x}$.
- Use $\hat{x}$ as an instrumental variable for x.

Two-stage least squares (2SLS) estimator:
- Stage 1 is the regression of x on a constant term, z and all other exogenous variables G, to obtain the predicted values $\hat{x}$. This first stage is called the reduced form model estimation.
- Stage 2 is ordinary least squares estimation of the simple linear regression $y_i = \beta_1 + \beta_2 \hat{x}_i + \text{error}_i$.

If we regress education on all the exogenous variables and the TWO instruments, these look promising. In fact the F test says:
Null hypothesis: the regression parameters are zero for the variables mothereduc, fathereduc
Test statistic: F(2, 423), with a p-value on the order of 1e-22
Rule of thumb for strong instruments: F > 10

ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2, small
estimates store iv

With the additional instrument, we achieve a significant result in this case.
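
Because the OLS and IV results were stored under the names ls and iv, a side-by-side comparison takes one line (standard Stata post-estimation syntax):

* Display coefficients and standard errors of both estimators next to each other
estimates table ls iv, b se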

The case with more than one endogenous regressor is a bit more complex, but the idea is that you need at least as many instruments as you have endogenous variables. You cannot use the F test we just saw to test for the weakness of the instruments.

When testing the null hypothesis $H_0\!: \beta_k = c$, use of the test statistic $t = (\hat\beta_k - c)/\operatorname{se}(\hat\beta_k)$ is valid in large samples. It is common, but not universal, practice to use critical values, and p-values, based on the Student-t distribution rather than the more strictly appropriate N(0,1) distribution. The reason is that tests based on the t-distribution tend to work better in samples of data that are not large.

When testing a joint hypothesis, such as $H_0\!: \beta_2 = c_2,\ \beta_3 = c_3$, the test may be based on the chi-square distribution with the number of degrees of freedom equal to the number of hypotheses (J) being tested. The test itself may be called a "Wald" test, or a likelihood ratio (LR) test, or a Lagrange multiplier (LM) test. These testing procedures are all asymptotically equivalent. Another complication: you might need to use tests that are robust to heteroskedasticity and/or autocorrelation…
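
A minimal sketch of such a robust test in Stata, reusing the running mroz example: after a 2SLS fit with a heteroskedasticity-robust variance matrix, the test command delivers the corresponding robust Wald test.

* 2SLS with robust standard errors, then a joint Wald test on the experience terms
ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2, vce(robust)
test exper exper2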

Unfortunately $R^2$ can be negative when based on IV estimates. Therefore the use of measures like $R^2$ outside the context of least squares estimation should be avoided (even if GRETL and Stata produce one!).

- Can we test for whether x is correlated with the error term? This might give us a guide of when to use least squares and when to use IV estimators. TEST OF EXOGENEITY: (DURBIN-WU-)HAUSMAN TEST
- Can we test whether our instrument is sufficiently strong to avoid the problems associated with "weak" instruments? TESTS FOR WEAK INSTRUMENTS
- Can we test if our instrument is valid, and uncorrelated with the regression error, as required? SOMETIMES ONLY, WITH A SARGAN TEST

- If the null hypothesis is true, both OLS and the IV estimator are consistent. In that case, use the more efficient estimator, OLS.
- If the null hypothesis is false, OLS is not consistent and the IV estimator is consistent, so use the IV estimator.

Let $z_1$ and $z_2$ be instrumental variables for x.
1. Estimate the reduced form $x = \gamma_1 + \theta_1 z_1 + \theta_2 z_2 + v$ by least squares, and obtain the residuals $\hat{v}$. If more than one explanatory variable is being tested for endogeneity, repeat this estimation for each one, using all available instrumental variables in each regression.

2. Include the residuals computed in step 1 as an extra explanatory variable in the original regression: $y = \beta_1 + \beta_2 x + \delta \hat{v} + e$. Estimate this "artificial regression" by least squares, and employ the usual t-test of $H_0\!: \delta = 0$. Rejecting the null indicates that x is endogenous.

3. If more than one variable is being tested for endogeneity, the test will be an F-test of joint significance of the coefficients on the included residuals.
Note: This is a test for the exogeneity of the regressors $x_i$ and not for the exogeneity of the instruments $z_i$. If the instruments are not valid, the Hausman test is not valid either.

* Hausman test, regression based
reg educ exper exper2 mothereduc fathereduc
predict vhat, residuals
reg lwage educ exper exper2 vhat
reg lwage educ exper exper2 vhat, vce(robust)

There are several different ways of computing this test, so don't worry if your result from other software packages differs from the one you compute manually using the above script.

The t-test on the coefficient of vhat is on the margin of significance, so education is only barely exogenous, at about the 10% level.

The same conclusion follows from the canned version of the test:

hausman iv ls, constant sigmamore

The sigmamore option forces Stata to use the OLS residuals when calculating the error variance in both estimators. Again, education is only barely exogenous, at about the 10% level.

If we have L > 1 instruments available, then the reduced form equation is
$x = \gamma_1 + \theta_1 z_1 + \theta_2 z_2 + \cdots + \theta_L z_L + v$

Then test with an F test whether the instruments help to determine the value of the endogenous variable. Rule of thumb: F > 10 for one endogenous regressor.

As we saw before:

* Testing for weak instruments
reg educ exper exper2 mothereduc
reg educ exper exper2 fathereduc
reg educ exper exper2 mothereduc fathereduc
test mothereduc fathereduc

* Testing for weak instruments using estat
ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2, small
estat firststage

The first-stage F statistic is well above 10, so we are quite alright here.

* Robust tests
reg educ exper exper2 mothereduc fathereduc, vce(robust)
test mothereduc fathereduc

Still alright.

* Robust tests using ivregress
ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2, small vce(robust)
estat firststage

Still alright.

We want to check that the instrument is itself exogenous… but you can only do it for the surplus instruments, if you have them (overidentified equation)…
1. Compute the IV estimates using all available instruments, including the G variables $x_1 = 1, x_2, \ldots, x_G$ that are presumed to be exogenous, and the L instruments $z_1, \ldots, z_L$.
2. Obtain the residuals $\hat{e}_i$ from the estimated model.

3. Regress $\hat{e}_i$ on all the available instruments described in step 1.
4. Compute $NR^2$ from this regression, where N is the sample size and $R^2$ is the usual goodness-of-fit measure.
5. If all of the surplus moment conditions are valid, then $NR^2 \sim \chi^2_{(L-B)}$, where L − B is the number of surplus instruments (instruments minus endogenous regressors; here 2 − 1 = 1). If the value of the test statistic exceeds the 100(1−α)-percentile from the $\chi^2_{(L-B)}$ distribution, then we conclude that at least one of the surplus moment conditions is not valid.

* Sargan test of overidentification
ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2, small
predict ehat, residuals
quietly reg ehat mothereduc fathereduc exper exper2
scalar nr2 = e(N)*e(r2)
scalar chic = invchi2tail(1,.05)
scalar pvalue = chi2tail(1,nr2)
di "NR^2 test of overidentifying restriction = " nr2
di "Chi-square critical value, 1 df, .05 level = " chic
di "p-value for overidentifying test, 1 df, .05 level = " pvalue

You should learn about the Cragg-Donald test if you end up having more than one endogenous variable.
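
For comparison, Stata can also compute the overidentification test directly after ivregress; a minimal sketch (the built-in Sargan and Basmann statistics may differ slightly from the manual NR^2 version, which is expected):

* Built-in tests of overidentifying restrictions after 2SLS
ivregress 2sls lwage (educ=mothereduc fathereduc) exper exper2
estat overid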

Sargan test results: the NR^2 statistic is below the chi-square critical value (p-value well above .05), so we are quite safe here…

Keywords: asymptotic properties; conditional expectation; endogenous variables; errors-in-variables; exogenous variables; finite sample properties; Hausman test; instrumental variable; instrumental variable estimator; just-identified equations; large sample properties; overidentified equations; population moments; random sampling; reduced form equation; sample moments; simultaneous equations bias; test of surplus moment conditions; two-stage least squares estimation; weak instruments.