Instrumental Variables and Two Stage Least Squares

Presentation on theme: "Instrumental Variables and Two Stage Least Squares"— Presentation transcript:

1 Instrumental Variables and Two Stage Least Squares
The endogeneity problem is common in the social sciences and economics.
Consider the regression equation $y_i = \beta_0 + \beta_1 x_i + u_i$.
- Endogeneity exists when $\mathrm{cov}(u_i, x_i) \neq 0$.
- The OLS estimate of $\beta_1$ has a bias with the same sign as $\mathrm{cov}(u_i, x_i)$.
In many cases important variables cannot be observed, leading to endogeneity bias in OLS. Examples:
- education or a job training program in an earnings equation
- price in a demand or supply equation
- charter school in an educational outcome equation
Measurement error in a control variable may also lead to endogeneity.
Solutions to endogeneity problems:
- Proxy variables method for omitted regressors
- Fixed effects methods if panel data are available. These require that:
  - the source of endogeneity is a fixed effect
  - the other regressors are not time-constant
- Instrumental variables (IV) method
IV is the most well-known method to address endogeneity problems.
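
The claim about the sign of the OLS bias can be checked with a small simulation. The following is a minimal sketch that is not part of the original slides: the data-generating process, the parameter values and the helper names (`cov`, `simulate_ols_slope`) are illustrative assumptions.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate_ols_slope(rho, n=20000, beta0=1.0, beta1=2.0, seed=0):
    """Simulate y = beta0 + beta1*x + u with cov(x, u) = rho; return the OLS slope.

    Assumed design: x = e + rho*u with e and u independent standard normals,
    so var(x) = 1 + rho**2 and the OLS slope converges to
    beta1 + rho / (1 + rho**2).
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        e = rng.gauss(0, 1)
        u = rng.gauss(0, 1)
        xi = e + rho * u
        x.append(xi)
        y.append(beta0 + beta1 * xi + u)
    return cov(x, y) / cov(x, x)


# The true slope is 2.0 in every run; only cov(x, u) changes.
for rho in (-0.5, 0.0, 0.5):
    print(f"cov(x,u) = {rho:+.1f}  ->  OLS slope = {simulate_ols_slope(rho):.3f}")
```

With a positive `rho` the estimated slope should land above the true value, with a negative `rho` below it, and with `rho = 0` close to it.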

2 Instrumental Variables and Two Stage Least Squares
Definition of an instrumental variable z for an endogenous variable x:
1) It does not directly affect the dependent variable. The only relationship between z and y is through the relationship of z with x. Corr(z,u) = 0
2) It is highly correlated with the endogenous variable x. Corr(z,x) ≠ 0

3 Instrumental Variables and Two Stage Least Squares
Assume existence of an instrumental variable z: Cov(z,u) = 0 (but Cov(z,x) ≠ 0)
The instrumental variable must be correlated with the explanatory variable.
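
The two requirements on z lead directly to the simple IV estimator. The derivation below is a sketch of the standard textbook argument for the model $y = \beta_0 + \beta_1 x + u$, written with covariances (Cov(z,u) = 0 and Cov(z,x) ≠ 0); the superscript IV on the estimator is notation introduced here.

```latex
\begin{align*}
\operatorname{Cov}(z, y)
  &= \operatorname{Cov}(z,\ \beta_0 + \beta_1 x + u) \\
  &= \beta_1 \operatorname{Cov}(z, x) + \operatorname{Cov}(z, u) \\
  &= \beta_1 \operatorname{Cov}(z, x)
     && \text{since } \operatorname{Cov}(z, u) = 0, \\[4pt]
\beta_1
  &= \frac{\operatorname{Cov}(z, y)}{\operatorname{Cov}(z, x)}
     && \text{requires } \operatorname{Cov}(z, x) \neq 0, \\[4pt]
\hat{\beta}_1^{IV}
  &= \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}
          {\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})}.
\end{align*}
```

Setting z = x turns the last line into the OLS slope, which is why OLS is the special case in which x serves as its own instrument.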

4 Instrumental Variables and Two Stage Least Squares
Example: Father's education as an IV for education
OLS: The return to education is probably overestimated.
Is the education of the father a good IV?
- Should it directly affect the wage?
- Is it uncorrelated with the error (?)
- Is it significantly correlated with educ?
IV: The estimated return to education decreases (which is to be expected). It is also much less precisely estimated.

5 Instrumental Variables and Two Stage Least Squares
Other IVs for education that have been used in the literature:
- The number of siblings
  - Not a direct determinant of wages
  - Correlated with education because of resource constraints in the household
  - Uncorrelated with innate ability (?)
- College proximity at age 16
  - Not a direct determinant of wages
  - Correlated with education because those who lived near a college obtain more education
  - Uncorrelated with the error (?)
- Month of birth
  - Correlated with education because of compulsory school attendance laws

6 Instrumental Variables and Two Stage Least Squares
Properties of IV with a weak instrumental variable
IV may be much more inconsistent than OLS if the instrumental variable is not completely exogenous and only weakly related to x.
There is no problem if the instrumental variable is truly exogenous [Corr(z,u) = 0]. If it is not, the asymptotic bias is greater the weaker the correlation with x (i.e. with a weak IV).
IV is worse than OLS if: |Corr(z,u) / Corr(z,x)| > |Corr(x,u)|
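
The comparison between IV and OLS follows from the probability limits of the two estimators in the simple model $y = \beta_0 + \beta_1 x + u$. The expressions below are the standard textbook results; $\sigma_u$ and $\sigma_x$ denote the standard deviations of u and x, and the numbers in the worked example are hypothetical values chosen only for illustration.

```latex
\begin{align*}
\operatorname{plim} \hat{\beta}_1^{OLS}
  &= \beta_1 + \operatorname{Corr}(x, u)\,\frac{\sigma_u}{\sigma_x}, \\[4pt]
\operatorname{plim} \hat{\beta}_1^{IV}
  &= \beta_1 + \frac{\operatorname{Corr}(z, u)}{\operatorname{Corr}(z, x)}\,
     \frac{\sigma_u}{\sigma_x}.
\end{align*}
% Hypothetical example: Corr(z,u) = 0.05 and Corr(z,x) = 0.10 give
% Corr(z,u)/Corr(z,x) = 0.5, so IV has the larger asymptotic bias
% whenever |Corr(x,u)| < 0.5.
```

A small violation of exogeneity is therefore magnified by a factor of 1/Corr(z,x), which becomes large when the instrument is weak.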

7 Instrumental Variables and Two Stage Least Squares
IV estimation in the multiple regression model
(1) $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + u$
$y_2$ is endogenous; $x_1, \dots, x_k$ are exogenous.
Conditions for $z$ to be a valid IV for $y_2$:
- z does not belong in the regression equation (i.e. it does not directly affect $y_1$)
- z is uncorrelated with the error term $u$
- z is correlated with the endogenous explanatory variable: in the reduced form equation ($y_2$ as a linear function of all exogenous variables), the coefficient on z must be non-zero and statistically significant
(2) $y_2 = \pi_0 + \pi_1 x_1 + \dots + \pi_k x_k + \pi_{k+1} z + v$

8 Instrumental Variables and Two Stage Least Squares
Two Stage Least Squares (2SLS) estimation (equivalent to the IV estimator)
First stage: estimate the reduced form equation (2) and obtain the predicted values $\hat{y}_2$ of the endogenous variable $y_2$.
$y_2 = \pi_0 + \pi_1 x_1 + \dots + \pi_k x_k + \pi_{k+1} z + v$
The test for "weak instruments" is a test of whether z has statistically significant explanatory power in the first-stage regression.
Second stage: estimate the structural equation (1) with the predicted value $\hat{y}_2$ replacing the actual value $y_2$.
$y_1 = \beta_0 + \beta_1 \hat{y}_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + \text{error}$
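
The two stages can be sketched in plain Python on simulated data. This is a minimal illustration under stated assumptions, not the procedure used in the slides: the data-generating process (one exogenous control `x1`, one instrument `z`, an unobserved confounder), all coefficient values, and the helper names `ols` and `simulate` are hypothetical.

```python
import math
import random


def ols(X, y):
    """Return (coefficients, inverse of X'X) using Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | I]; row operations turn it into [I | (X'X)^-1].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    return beta, inv


def simulate(n=5000, seed=0):
    """Assumed DGP: true coefficient on y2 is 0.5; 'a' makes y2 endogenous."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.gauss(0, 1)            # unobserved confounder
        x1 = rng.gauss(0, 1)           # exogenous control
        z = rng.gauss(0, 1)            # instrument
        y2 = 1.0 + 0.5 * x1 + 0.8 * z + a + rng.gauss(0, 1)
        y1 = 1.0 + 0.5 * y2 + 1.0 * x1 + a + rng.gauss(0, 1)
        rows.append((y1, y2, x1, z))
    return rows


rows = simulate()
n = len(rows)
y1 = [r[0] for r in rows]
y2 = [r[1] for r in rows]

# First stage: regress y2 on all exogenous variables (constant, x1, z).
Z = [[1.0, r[2], r[3]] for r in rows]
pi_hat, ZtZ_inv = ols(Z, y2)
y2_hat = [sum(p * w for p, w in zip(pi_hat, row)) for row in Z]

# Weak-instrument check: t statistic on z in the first stage.
v_hat = [obs - fit for obs, fit in zip(y2, y2_hat)]
s2 = sum(v * v for v in v_hat) / (n - len(pi_hat))
t_z = pi_hat[2] / math.sqrt(s2 * ZtZ_inv[2][2])

# Second stage: regress y1 on (constant, y2_hat, x1).
X_hat = [[1.0, fit, r[2]] for fit, r in zip(y2_hat, rows)]
beta_2sls, _ = ols(X_hat, y1)

# Plain OLS on the actual y2, for comparison.
X = [[1.0, r[1], r[2]] for r in rows]
beta_ols, _ = ols(X, y1)

print(f"first-stage t on z : {t_z:.1f}")
print(f"OLS  coefficient on y2: {beta_ols[1]:.3f}")
print(f"2SLS coefficient on y2: {beta_2sls[1]:.3f}  (true value 0.5)")
```

In this assumed design the OLS coefficient should sit well above 0.5 because the confounder raises both y2 and y1, while the 2SLS coefficient should be close to 0.5.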

9 Instrumental Variables and Two Stage Least Squares
Why does Two Stage Least Squares work?
- All variables in the second-stage regression are exogenous because y2 was replaced by a prediction based only on exogenous information.
- By using the prediction based on exogenous information, y2 is purged of its endogenous part (the part that is related to the error term).
Properties of Two Stage Least Squares
- The standard errors from the OLS second-stage regression are wrong. However, it is not difficult to compute correct standard errors. Stata does it automatically (ivreg).
- If there is one endogenous variable and one instrument, then 2SLS = IV.
- 2SLS estimation can also be used if there is more than one endogenous variable and/or many instruments.
- 2SLS estimates are generally less precise than OLS estimates, but they eliminate the potential endogeneity bias.
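
Two of these properties can be illustrated in the simplest case of one endogenous regressor, one instrument and no further controls. The sketch below uses simulated data; the data-generating process and the helper names (`cov`, `simple_ols`, `tsls_one_instrument`) are assumptions made for illustration. The corrected standard error uses residuals computed with the actual x rather than the first-stage fitted values, under homoskedasticity.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simple_ols(x, y):
    """Intercept and slope from regressing y on a single regressor x."""
    slope = cov(x, y) / cov(x, x)
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return intercept, slope


def tsls_one_instrument(y, x, z):
    """2SLS slope with the naive and the corrected standard error."""
    n = len(y)
    # First stage: regress x on z and keep the fitted values.
    p0, p1 = simple_ols(z, x)
    x_hat = [p0 + p1 * zi for zi in z]
    # Second stage: regress y on the fitted values.
    b0, b1 = simple_ols(x_hat, y)
    sst_hat = cov(x_hat, x_hat) * (n - 1)
    # Naive residuals (what a second-stage OLS printout uses): based on x_hat.
    ssr_naive = sum((yi - b0 - b1 * xh) ** 2 for yi, xh in zip(y, x_hat))
    # Corrected residuals: same coefficients, but the actual x.
    ssr_correct = sum((yi - b0 - b1 * xi) ** 2 for yi, xi in zip(y, x))
    se_naive = (ssr_naive / (n - 2) / sst_hat) ** 0.5
    se_correct = (ssr_correct / (n - 2) / sst_hat) ** 0.5
    return b1, se_naive, se_correct


# Assumed DGP: the confounder 'a' enters both x and y; the true slope is 0.5.
rng = random.Random(1)
z, x, y = [], [], []
for _ in range(5000):
    a = rng.gauss(0, 1)
    zi = rng.gauss(0, 1)
    xi = 1.0 + 0.8 * zi + a + rng.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(1.0 + 0.5 * xi + a + rng.gauss(0, 1))

b1_2sls, se_naive, se_correct = tsls_one_instrument(y, x, z)
b1_iv = cov(z, y) / cov(z, x)   # simple IV estimator

print(f"2SLS slope: {b1_2sls:.4f}   IV slope: {b1_iv:.4f}")
print(f"naive SE: {se_naive:.4f}   corrected SE: {se_correct:.4f}")
```

The two slopes agree up to floating-point error, which is the "2SLS = IV" property. The two standard errors differ, and in this particular design the naive one is the larger; the direction of the error depends on the data, which is why the second-stage printout should not be used for inference.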

10 Instrumental Variables and Two Stage Least Squares
Example: 2SLS in a wage equation using two instruments
First-stage regression (regress educ on all exogenous variables):
- Education is significantly partially correlated with the education of the parents (i.e. these are not weak instruments).
Two Stage Least Squares estimation results:
- The return to education is much lower but also much less precisely estimated than with OLS (the bias in the OLS estimate of the education coefficient was positive).
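
The same steps with two instruments can be sketched on simulated data. Nothing below reproduces the wage data or the estimates from the slide: the variables `z1` and `z2` merely stand in for the two parental education variables, and the data-generating process, coefficient values and helper names (`ols`, `fit`) are hypothetical. Joint relevance of the instruments is checked with the usual F statistic for excluding both from the first stage.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def fit(X, y):
    """Return coefficients, fitted values and the sum of squared residuals."""
    beta = ols(X, y)
    fitted = [sum(b * w for b, w in zip(beta, row)) for row in X]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, fitted, ssr


# Assumed DGP: the true coefficient on y2 is 0.5; 'a' is unobserved.
rng = random.Random(2)
n = 5000
y1, y2, x1, z1, z2 = [], [], [], [], []
for _ in range(n):
    a = rng.gauss(0, 1)
    c, i1, i2 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    endog = 1.0 + 0.5 * c + 0.5 * i1 + 0.5 * i2 + a + rng.gauss(0, 1)
    x1.append(c)
    z1.append(i1)
    z2.append(i2)
    y2.append(endog)
    y1.append(1.0 + 0.5 * endog + 1.0 * c + a + rng.gauss(0, 1))

# First stage with both instruments, and a restricted version without them.
Z_full = [[1.0, c, i1, i2] for c, i1, i2 in zip(x1, z1, z2)]
Z_restricted = [[1.0, c] for c in x1]
_, y2_hat, ssr_u = fit(Z_full, y2)
_, _, ssr_r = fit(Z_restricted, y2)

# F statistic for H0: both instrument coefficients are zero (q = 2).
q, k = 2, len(Z_full[0])
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))

# Second stage: replace y2 by its first-stage fitted values.
X_hat = [[1.0, yh, c] for yh, c in zip(y2_hat, x1)]
beta_2sls, _, _ = fit(X_hat, y1)

print(f"first-stage F for (z1, z2): {f_stat:.1f}")
print(f"2SLS coefficient on y2: {beta_2sls[1]:.3f}  (true value 0.5)")
```

A large first-stage F statistic plays the role of the "significantly partially correlated" statement above. Standard errors for the second stage still need the correction applied by dedicated IV routines.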

11 Instrumental Variables and Two Stage Least Squares
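Testing exogeneity
$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 x_1 + \dots + \beta_{k+1} x_k + \delta_1 \hat{v} + e$
Here $\hat{v}$ is the residual from the estimated reduced form equation (2).
Variable $y_2$ is exogenous if and only if $v$ is uncorrelated with $u$, i.e. if the parameter $\delta_1$ is zero in this regression.
Use a t-test to test the null hypothesis of exogeneity of $y_2$.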
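
The test can be sketched as follows: save the first-stage residuals, add them to the structural equation, and inspect the t statistic on their coefficient. This is a minimal illustration on simulated data with the usual homoskedastic OLS t statistic; the data-generating process, the `confounding` parameter and the helper names (`ols`, `exogeneity_t`) are assumptions, not part of the slides.

```python
import math
import random


def ols(X, y):
    """Return (coefficients, inverse of X'X) using Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    return beta, inv


def exogeneity_t(confounding, n=5000, seed=0):
    """t statistic on the first-stage residual in the augmented regression.

    Assumed DGP: an unobserved 'a' always enters y2 and enters the structural
    error with weight `confounding`, so y2 is exogenous when confounding == 0.
    """
    rng = random.Random(seed)
    y1, y2, x1, z = [], [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        x1i, zi = rng.gauss(0, 1), rng.gauss(0, 1)
        y2i = 1.0 + 0.5 * x1i + 0.8 * zi + a + rng.gauss(0, 1)
        u = confounding * a + rng.gauss(0, 1)
        y1.append(1.0 + 0.5 * y2i + 1.0 * x1i + u)
        y2.append(y2i)
        x1.append(x1i)
        z.append(zi)

    # Reduced form (2): y2 on all exogenous variables; keep the residuals.
    Z = [[1.0, c, i] for c, i in zip(x1, z)]
    pi_hat, _ = ols(Z, y2)
    v_hat = [yi - sum(p * w for p, w in zip(pi_hat, row))
             for yi, row in zip(y2, Z)]

    # Augmented structural equation: add v_hat as an extra regressor.
    X = [[1.0, e, c, v] for e, c, v in zip(y2, x1, v_hat)]
    b, XtX_inv = ols(X, y1)
    resid = [yi - sum(bj * w for bj, w in zip(b, row)) for yi, row in zip(y1, X)]
    s2 = sum(r * r for r in resid) / (n - len(b))
    return b[3] / math.sqrt(s2 * XtX_inv[3][3])


print(f"t on v_hat, endogenous y2: {exogeneity_t(1.0):.2f}")
print(f"t on v_hat, exogenous y2 : {exogeneity_t(0.0):.2f}")
```

A t statistic far from zero rejects exogeneity of y2, in which case 2SLS is preferred over OLS. A small t statistic gives no evidence against using OLS.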

