Instrumental Variables and Two Stage Least Squares

Presentation transcript:

The endogeneity problem is common in the social sciences and economics.
Consider the regression equation $y_i = \beta_0 + \beta_1 x_i + u_i$.
Endogeneity exists when $\mathrm{cov}(u_i, x_i) \neq 0$.
The OLS estimate of $\beta_1$ has bias with the same sign as $\mathrm{cov}(u_i, x_i)$.
In many cases important variables cannot be observed, leading to endogeneity bias in OLS. Examples:
- education in an earnings equation
- price in a demand equation
- marital status in an earnings equation
- 401k dummy in a savings equation
- matching dummy in a savings equation for those eligible for a 401k
Measurement error in a control variable may also lead to endogeneity.
Solutions to endogeneity problems:
- Proxy variables method for omitted regressors
- Fixed effects methods, if panel data are available, the endogeneity is time-constant, and the regressors are not time-constant
- Instrumental variables method (IV)
IV is the most well-known method to address endogeneity problems.

Definition of an instrumental variable for an endogenous variable x:
1) It does not appear in the regression (i.e. it does not directly affect the dependent variable)
2) It is highly correlated with the endogenous variable (x)
3) It is uncorrelated with the error term (u)

Assume existence of an instrumental variable $z$:
- $\mathrm{Cov}(z, u) = 0$ (but $\mathrm{Cov}(x, u) \neq 0$)
- $\mathrm{Cov}(z, x) \neq 0$: the instrumental variable is correlated with the explanatory variable
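As an illustration of these two conditions, the sketch below simulates data in which x is endogenous and compares the OLS slope with the simple IV estimator, computed here as the sample ratio cov(z, y) / cov(z, x). The data-generating process, the coefficient values, and the function name `simulate_simple_iv` are assumptions made for this example only and do not come from the slides.

```python
import numpy as np

def simulate_simple_iv(n=20000, beta1=0.5, seed=0):
    """Simulate y = 1 + beta1*x + u with endogenous x and a valid instrument z."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # instrument: independent of u by construction
    a = rng.normal(size=n)                 # unobserved factor (hypothetical, e.g. ability)
    x = 0.8 * z + a + rng.normal(size=n)   # x depends on z and on the unobserved factor
    u = a + rng.normal(size=n)             # error contains a, so cov(x, u) > 0
    y = 1.0 + beta1 * x + u

    # OLS slope: cov(x, y) / var(x); biased upward here because cov(x, u) > 0
    b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    # Simple IV slope: cov(z, y) / cov(z, x)
    b_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    return b_ols, b_iv

b_ols, b_iv = simulate_simple_iv()
print(f"true beta1 = 0.5, OLS = {b_ols:.3f}, IV = {b_iv:.3f}")
```

In this assumed design the OLS slope should land well above the true value, while the IV slope should land close to it.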

Example: Father's education as an IV for education
OLS: Return to education probably overestimated.
Is the education of the father a good IV?
- Should it directly affect wage?
- Is it significantly correlated with educ?
- Is it uncorrelated with the error (?)
IV: The estimated return to education decreases (which is to be expected). It is also much less precisely estimated.

Other IVs for education that have been used in the literature:
The number of siblings
- Not a direct determinant of wages
- Correlated with education because of resource constraints in the household
- Uncorrelated with innate ability (?)
College proximity when 16 years old
- Not a direct determinant of wages
- Correlated with education because those who lived near a college get more education
- Uncorrelated with error (?)
Month of birth
- Correlated with education because of compulsory school attendance laws

Properties of IV with a weak instrumental variable
IV may be much more inconsistent than OLS if the instrumental variable is not completely exogenous and only weakly related to x.
There is no problem if the instrumental variable is really exogenous [Corr(z,u) = 0]. If not, the asymptotic bias will be larger the weaker the correlation with x (i.e. with a weak IV).
IV is worse than OLS if: $\left| \mathrm{Corr}(z,u) / \mathrm{Corr}(z,x) \right| > \left| \mathrm{Corr}(x,u) \right|$
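The comparison between IV and OLS follows from the standard probability limits of the two estimators in the simple regression model. The block below states them and works one numeric case. The correlation values in that case are hypothetical and chosen only to show the arithmetic.

```latex
\begin{align*}
\operatorname{plim}\,\hat\beta_{1,\mathrm{OLS}} &= \beta_1 + \mathrm{Corr}(x,u)\,\frac{\sigma_u}{\sigma_x} \\
\operatorname{plim}\,\hat\beta_{1,\mathrm{IV}}  &= \beta_1 + \frac{\mathrm{Corr}(z,u)}{\mathrm{Corr}(z,x)}\,\frac{\sigma_u}{\sigma_x}
\end{align*}
% IV has the larger asymptotic bias when
%   |Corr(z,u) / Corr(z,x)| > |Corr(x,u)|.
% Hypothetical numbers: Corr(z,u) = 0.03, Corr(z,x) = 0.2, Corr(x,u) = 0.1
%   => 0.03 / 0.2 = 0.15 > 0.1, so IV would be worse than OLS in this case.
```

Even a small correlation between z and u is magnified when Corr(z,x) is small, which is why instrument strength matters as much as instrument exogeneity.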

IV estimation in the multiple regression model
(1) $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + u$
$y_2$ is endogenous; $x_1, \dots, x_k$ are exogenous.
Conditions for $z$ to be a valid IV for $y_2$:
- $z$ does not belong in the regression equation (i.e. it does not directly affect $y_1$)
- $z$ is uncorrelated with the error term $u$
- $z$ is correlated with the endogenous explanatory variable: in the reduced form equation ($y_2$ as a linear function of all exogenous variables), the coefficient on $z$ must be non-zero and statistically significant.
(2) $y_2 = \pi_0 + \pi_1 x_1 + \dots + \pi_k x_k + \pi_{k+1} z + v$

Two Stage Least Squares (2SLS) estimation (equivalent to the IV estimator)
First stage: estimate the reduced form equation (2) and obtain predicted values $\hat{y}_2$ for the endogenous variable.
$y_2 = \pi_0 + \pi_1 x_1 + \dots + \pi_k x_k + \pi_{k+1} z + v$
The test for "weak instruments" is a test of whether z has statistically significant explanatory power in the first stage regression.
Second stage: estimate the structural form equation (1) with the predicted value of the endogenous variable replacing the actual value.
$y_1 = \beta_0 + \beta_1 \hat{y}_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + \text{error}$
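The two stages can be sketched directly with ordinary least squares fits. This is a minimal illustration on simulated data, not the procedure behind the slides' results. The function names, the data-generating process, and all coefficient values are assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y1, y2, x_exog, z):
    """Manual 2SLS with one endogenous regressor y2 and instrument(s) z."""
    const = np.ones(len(y1))
    # First stage: regress y2 on all exogenous variables (constant, x, z)
    W = np.column_stack([const, x_exog, z])
    pi_hat = ols(W, y2)
    y2_hat = W @ pi_hat
    # Second stage: replace y2 by its first-stage prediction
    X2 = np.column_stack([const, y2_hat, x_exog])
    beta_hat = ols(X2, y1)
    return pi_hat, beta_hat

# Simulated data (assumed design): u and v share the component c,
# so y2 is correlated with u and OLS on equation (1) would be biased.
rng = np.random.default_rng(1)
n = 20000
x1 = rng.normal(size=n)
z = rng.normal(size=n)
c = rng.normal(size=n)
v = c + rng.normal(size=n)
u = c + rng.normal(size=n)
y2 = 0.5 + 0.3 * x1 + 0.7 * z + v
y1 = 1.0 + 0.5 * y2 + 0.2 * x1 + u

pi_hat, beta_hat = two_stage_least_squares(y1, y2, x1, z)
print("first-stage coefficient on z:", round(pi_hat[-1], 3))
print("2SLS coefficient on y2:", round(beta_hat[1], 3))
```

The point estimates from this manual second stage are the 2SLS estimates, but the standard errors an OLS routine would report for it are not valid.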

Why does Two Stage Least Squares work?
- All variables in the second stage regression are exogenous because y2 was replaced by a prediction based on only exogenous information.
- By using the prediction based on exogenous information, y2 is purged of its endogenous part (the part that is related to the error term).
Properties of Two Stage Least Squares
- The standard errors from the OLS second stage regression are wrong. However, it is not difficult to compute correct standard errors. Stata does it automatically (ivreg).
- If there is one endogenous variable and one instrument, then 2SLS = IV.
- The 2SLS estimation can also be used if there is more than one endogenous variable and/or many instruments.
- 2SLS estimates are generally less precise than OLS estimates, but they eliminate potential bias.
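To make the point about second-stage standard errors concrete, the sketch below computes both versions under homoskedasticity. The naive version uses residuals built from the predicted y2, as an OLS routine would. The corrected version uses residuals built from the actual y2 with the 2SLS coefficients. The function name `tsls_with_se` and the simulated design are assumptions for this example. It is not a reproduction of what Stata does internally.

```python
import numpy as np

def tsls_with_se(y1, y2, x_exog, z):
    """2SLS coefficients with naive and corrected homoskedastic standard errors."""
    n = len(y1)
    const = np.ones(n)
    W = np.column_stack([const, x_exog, z])    # all exogenous variables
    X = np.column_stack([const, y2, x_exog])   # structural regressors (actual y2)

    pi_hat, *_ = np.linalg.lstsq(W, y2, rcond=None)
    X_hat = X.copy()
    X_hat[:, 1] = W @ pi_hat                   # replace y2 by its prediction

    beta, *_ = np.linalg.lstsq(X_hat, y1, rcond=None)
    dof = n - X.shape[1]
    xtx_inv_diag = np.diag(np.linalg.inv(X_hat.T @ X_hat))

    resid_naive = y1 - X_hat @ beta            # what second-stage OLS would use
    resid_correct = y1 - X @ beta              # residuals with the actual y2
    se_naive = np.sqrt(resid_naive @ resid_naive / dof * xtx_inv_diag)
    se_correct = np.sqrt(resid_correct @ resid_correct / dof * xtx_inv_diag)
    return beta, se_naive, se_correct

# Simulated data (assumed design, same structure as equations (1) and (2))
rng = np.random.default_rng(2)
n = 5000
x1 = rng.normal(size=n)
z = rng.normal(size=n)
c = rng.normal(size=n)
v = c + rng.normal(size=n)
u = c + rng.normal(size=n)
y2 = 0.5 + 0.3 * x1 + 0.7 * z + v
y1 = 1.0 + 0.5 * y2 + 0.2 * x1 + u

beta, se_naive, se_correct = tsls_with_se(y1, y2, x1, z)
print("coefficient on y2:", round(beta[1], 3))
print("naive SE:", round(se_naive[1], 4), "corrected SE:", round(se_correct[1], 4))
```

The two standard errors share the same coefficient vector and the same matrix term, and differ only in the residual variance estimate.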

Example: 2SLS in a wage equation using two instruments
First stage regression (regress educ on all exogenous variables):
Education is significantly partially correlated with the education of the parents (i.e. these are not weak instruments).
Two Stage Least Squares estimation results:
The return to education is much lower but also much more imprecise than with OLS (the bias in the OLS estimate of the education coefficient was positive).

Testing exogeneity
$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + \delta_1 \hat{v} + e$
Here $\hat{v}$ is the residual from the reduced form equation (2).
Variable $y_2$ is exogenous if and only if $v$ is uncorrelated with $u$, i.e. if the parameter $\delta_1$ is zero in this regression.
Use a t-test to test the null hypothesis of exogeneity of $y_2$ ($H_0: \delta_1 = 0$).
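The test can be sketched as a regression of y1 on y2, the exogenous regressors, and the first-stage residual, followed by a t-statistic on the residual's coefficient. The code below is a minimal illustration with homoskedastic standard errors on simulated data. The function names and both simulated designs are assumptions made for the example.

```python
import numpy as np

def endogeneity_t_stat(y1, y2, x_exog, z):
    """t-statistic on the first-stage residual added to the structural equation."""
    n = len(y1)
    const = np.ones(n)
    # Reduced form (2): y2 on all exogenous variables; keep the residual v_hat
    W = np.column_stack([const, x_exog, z])
    pi_hat, *_ = np.linalg.lstsq(W, y2, rcond=None)
    v_hat = y2 - W @ pi_hat
    # Augmented structural equation: add v_hat as an extra regressor
    X = np.column_stack([const, y2, x_exog, v_hat])
    coef, *_ = np.linalg.lstsq(X, y1, rcond=None)
    resid = y1 - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef[-1] / se[-1]

def simulate(endogenous, n=5000, seed=0):
    """Assumed design: u and v share a common shock only if endogenous=True."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    z = rng.normal(size=n)
    c = rng.normal(size=n) if endogenous else np.zeros(n)
    v = c + rng.normal(size=n)
    u = c + rng.normal(size=n)
    y2 = 0.5 + 0.3 * x1 + 0.7 * z + v
    y1 = 1.0 + 0.5 * y2 + 0.2 * x1 + u
    return y1, y2, x1, z

t_endog = endogeneity_t_stat(*simulate(True))
t_exog = endogeneity_t_stat(*simulate(False))
print("t-stat with endogenous y2:", round(t_endog, 2))
print("t-stat with exogenous y2:", round(t_exog, 2))
```

A large absolute t-statistic leads to rejecting exogeneity of y2, in which case 2SLS is preferred over OLS. A small one gives no evidence against using OLS.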