Identification: Instrumental Variables


Identification: Instrumental Variables Ziyodullo Parpiev, PhD Delivered at Summer school 2017 Tashkent, Uzbekistan June 16, 2017

Why do we need IV?
Internal validity problems: the independent variables are correlated with the error term. Three types are relevant here: errors-in-variables, omitted variable bias (OVB), and simultaneous causality (endogeneity). The first two are usually solved by adding the omitted variable or correcting the error, but what if no additional data are available?
Errors-in-variables: error in the measurement of one of the variables. Example: systematically incorrect answers on a survey, or a systematic data-coding problem. Ideally you would just correct the error, but if you can't, you might be able to use an IV.
OVB: a variable that determines Y and is correlated with X is excluded from the regression equation. This is usually solved by including the omitted variable, but when that is impossible, you can use an IV.
Simultaneous causality: X → Y AND Y → X. Simple OLS picks up both effects and produces a biased estimate of the causal effect. Example: Yi = B0 + B1 Xi + Ui and Xi = A0 + A1 Yi + Vi. If Ui is negative, this decreases Yi in the first equation, which in turn affects the value of Xi in the second equation. If A1 is positive, a low Yi will lead to a low Xi. So if A1 is positive, Xi and Ui will be correlated.
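
The simultaneous causality example above can be checked with a small simulation. This is only a sketch: the parameter values (B1 = -0.5, A1 = 0.5, standard normal errors) are assumptions chosen for illustration and do not come from the slides. With these values the population OLS slope works out to B1 + cov(X, U)/var(X) = -0.5 + 0.4/0.8 = 0, so OLS misses a true effect of -0.5.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
B0, B1 = 1.0, -0.5   # Y equation: Yi = B0 + B1*Xi + Ui
A0, A1 = 2.0, 0.5    # X equation: Xi = A0 + A1*Yi + Vi

rng = np.random.default_rng(0)
n = 50_000
u = rng.normal(size=n)   # error in the Y equation
v = rng.normal(size=n)   # error in the X equation

# Solve the two equations jointly (reduced form) so that both hold for every i.
x = (A0 + A1 * B0 + A1 * u + v) / (1.0 - A1 * B1)
y = B0 + B1 * x + u

# Simple OLS slope of Y on X: cov(X, Y) / var(X).
ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
corr_xu = np.corrcoef(x, u)[0, 1]

print(f"true B1 = {B1}, OLS slope = {ols_slope:.3f}, corr(X, U) = {corr_xu:.3f}")
```

Because A1 is positive, the printed correlation between X and U is positive, which is exactly the channel the slide describes.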

What is the IV Technique?
When you have an endogeneity problem, you want to somehow separate out the part of the independent variable that is correlated with the error term. Once that part is separated out, you can get a consistent causal estimate of the effect of the “uncorrelated portion” of the independent variable on the dependent variable of interest. Now Hongyi will lead us through a more detailed presentation of how this technique works.

IV: basic idea
Consider the following regression model: yi = β0 + β1 Xi + ei
Variation in the endogenous regressor Xi has two parts: the part that is uncorrelated with the error (“good” variation) and the part that is correlated with the error (“bad” variation).
The basic idea behind instrumental variables regression is to isolate the “good” variation and disregard the “bad” variation.

IV: conditions for a valid instrument
The first step is to identify a valid instrument. A variable Zi is a valid instrument for the endogenous regressor Xi if it satisfies two conditions:
1. Relevance: corr (Zi , Xi) ≠ 0
2. Exogeneity: corr (Zi , ei) = 0
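
A short worked derivation may help show why both conditions are needed. It takes the single-regressor model yi = β0 + β1 Xi + ei from the basic-idea slide and uses only the two conditions above; the covariance form of the estimator is a standard result, not something stated on the slides.

```latex
\begin{aligned}
\operatorname{cov}(Z_i, y_i)
  &= \operatorname{cov}(Z_i,\ \beta_0 + \beta_1 X_i + e_i) \\
  &= \beta_1 \operatorname{cov}(Z_i, X_i) + \operatorname{cov}(Z_i, e_i) \\
  &= \beta_1 \operatorname{cov}(Z_i, X_i)
     && \text{(exogeneity: } \operatorname{cov}(Z_i, e_i) = 0\text{)} \\
\Longrightarrow\quad
\beta_1 &= \frac{\operatorname{cov}(Z_i, y_i)}{\operatorname{cov}(Z_i, X_i)}
     && \text{(relevance: } \operatorname{cov}(Z_i, X_i) \neq 0\text{)}
\end{aligned}
```

Exogeneity removes the error term from the numerator, and relevance keeps the denominator away from zero. When the denominator is close to zero, the ratio becomes unstable, which is the weak-instrument problem.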

IV: two-stage least squares
The most common IV method is two-stage least squares (2SLS).
Stage 1: Decompose Xi into the component that can be predicted by Zi and the problematic component: Xi = π0 + π1 Zi + vi
Stage 2: Use the predicted value of Xi from the first-stage regression to estimate its effect on yi: yi = β0 + β1 X-hati + ei
Note: software packages like Stata perform the two stages in a single regression, producing the correct standard errors.
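
The two stages can be sketched by hand on simulated data. Everything specific here is an assumption made for illustration: the data-generating process, the true slope of 1.5, and the helper name `ols`. Running the second stage manually like this gives the 2SLS point estimate, but the standard errors a naive second-stage regression would report are not the correct 2SLS ones, which is why the single-command approach mentioned in the note is preferred in practice.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ b + error."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)                                   # instrument
e = rng.normal(size=n)                                   # structural error
x = 1.0 + 0.8 * z + 0.6 * e + 0.5 * rng.normal(size=n)   # endogenous: depends on e
y = 2.0 + 1.5 * x + e                                    # assumed true slope: 1.5

const = np.ones(n)

# Plain OLS of y on x, for comparison (biased because x is correlated with e).
b_ols = ols(np.column_stack([const, x]), y)

# Stage 1: regress x on the instrument and keep the fitted values.
Z = np.column_stack([const, z])
x_hat = Z @ ols(Z, x)

# Stage 2: regress y on the fitted values from stage 1.
b_2sls = ols(np.column_stack([const, x_hat]), y)

print(f"OLS slope:  {b_ols[1]:.3f}")
print(f"2SLS slope: {b_2sls[1]:.3f}")
```

In this setup the OLS slope is pulled upward by the positive correlation between x and e, while the 2SLS slope lands close to the assumed value of 1.5.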

[Figure: Z as an instrument for X]

Clear?

Evaluating Instruments Two conditions: Instrument Relevance – IV is correlated with the problematic independent variable: corr (Zi , Xi) ≠ 0 Instrument Exogeneity – IV is NOT correlated with the error term: corr (Zi , ei) = 0

Evaluating Instruments
# POLICE → CRIME (Steven Levitt 1997)
Simple OLS gives a positive result: increase the number of police, increase crime. Why?
The problem with simple OLS is that there is a policy response to crime (hire more police), which causes a reverse causality effect.

Evaluating Instruments
# POLICE → CRIME (Steven Levitt 1997)
Simple OLS gives a positive result: increase the number of police, increase crime. Why?
Instrument: Was there a mayoral election in the year the measurements were taken?
IV regression gives the expected negative result: increase the number of police, decrease crime.
Why is this a good instrument? Mayors hire more police in an election year, so it is correlated with the independent variable. But whether there is a mayoral election does NOT affect the level of crime.

IV: example
Two-stage least squares:
Stage 1: Decompose police hires into the component that can be predicted by the electoral cycle and the problematic component: policei = π0 + π1 electioni + vi
Stage 2: Use the predicted value of policei from the first-stage regression to estimate its effect on crimei: crimei = β0 + β1 police-hati + ei
Finding: an increased police force reduces violent crime (but has little effect on property crime).

IV: number of instruments
There must be at least as many instruments as endogenous regressors.
Let k = number of endogenous regressors and m = number of instruments.
The regression coefficients are:
exactly identified if m = k (OK)
overidentified if m > k (OK)
underidentified if m < k (not OK)

IV: testing instrument relevance
How do we know if our instruments are valid? Recall our first condition for a valid instrument:
1. Relevance: corr (Zi , Xi) ≠ 0
Stock and Watson’s rule of thumb: the first-stage F-statistic testing the hypothesis that the coefficients on the instruments are jointly zero should be at least 10 (for a single endogenous regressor).
A small F-statistic means the instruments are “weak” (they explain little of the variation in X) and the estimator is biased.
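
The first-stage F-statistic can be computed by comparing a first stage with the instruments against one with only a constant. The sketch below uses the homoskedasticity-only F formula and assumes there are no other exogenous controls; the function names, the simulated data, and the first-stage coefficients of 1.0 (strong) and 0.01 (weak) are hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from a least-squares fit of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    return float(resid @ resid)

def first_stage_f(x, instruments):
    """Homoskedasticity-only F for H0: all instrument coefficients are zero.

    Assumes the first stage contains only a constant and the instruments.
    """
    n, m = instruments.shape
    const = np.ones((n, 1))
    ssr_restricted = ssr(const, x)                            # constant only
    ssr_unrestricted = ssr(np.hstack([const, instruments]), x)
    return ((ssr_restricted - ssr_unrestricted) / m) / (ssr_unrestricted / (n - m - 1))

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=(n, 1))
noise = rng.normal(size=n)

x_strong = 1.0 * z[:, 0] + noise    # instrument explains much of x
x_weak = 0.01 * z[:, 0] + noise     # instrument explains almost nothing

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
print(f"strong instrument F = {f_strong:.1f}, weak instrument F = {f_weak:.2f}")
```

Comparing each value with the threshold of 10 applies the rule of thumb from the slide.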

IV: testing instrument exogeneity
Recall our second condition for a valid instrument:
2. Exogeneity: corr (Zi , ei) = 0
If you have the same number of instruments and endogenous regressors, it is impossible to test for instrument exogeneity.
But if you have more instruments than endogenous regressors, you can use the overidentifying restrictions test: regress the residuals from the 2SLS regression on the instruments (and any exogenous control variables) and test whether the coefficients on the instruments are all zero.
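
The overidentifying restrictions test described above can be sketched as follows, for one endogenous regressor and two instruments. The statistic used is J = m × F, where F is the homoskedasticity-only F-statistic on the instruments in the residual regression. Under the null it is compared with a chi-squared distribution with m - k degrees of freedom, and the 5% critical value for one degree of freedom is 3.84. The simulated data, the function names, and the way the second instrument is made invalid are all assumptions for illustration.

```python
import numpy as np

def fit(X, y):
    """Return least-squares coefficients and residuals."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b, y - X @ b

def overid_j(y, x, instruments):
    """J = m * F from regressing 2SLS residuals on the instruments."""
    n, m = instruments.shape
    const = np.ones((n, 1))
    Z = np.hstack([const, instruments])

    # 2SLS point estimates.
    pi, _ = fit(Z, x)
    x_hat = Z @ pi
    b, _ = fit(np.column_stack([np.ones(n), x_hat]), y)

    # 2SLS residuals use the actual x, not the fitted x_hat.
    u_hat = y - b[0] - b[1] * x

    # F-test that the instrument coefficients are all zero.
    _, r_unres = fit(Z, u_hat)
    _, r_res = fit(const, u_hat)
    ssr_u, ssr_r = r_unres @ r_unres, r_res @ r_res
    f = ((ssr_r - ssr_u) / m) / (ssr_u / (n - m - 1))
    return m * f

rng = np.random.default_rng(3)
n = 5_000
z = rng.normal(size=(n, 2))
e = rng.normal(size=n)
x = z[:, 0] + z[:, 1] + 0.5 * e + rng.normal(size=n)

y_valid = 1.0 + 2.0 * x + e               # both instruments excluded from y
y_invalid = y_valid + 1.0 * z[:, 1]       # second instrument affects y directly

j_valid = overid_j(y_valid, x, z)
j_invalid = overid_j(y_invalid, x, z)
print(f"J (valid instruments) = {j_valid:.2f}, J (one invalid) = {j_invalid:.2f}")
```

A J value above 3.84 rejects the hypothesis that all instruments are exogenous, but it does not say which instrument is the invalid one.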

IV: drawbacks of this method
It can be difficult to find an instrument that is both relevant (not weak) and exogenous.
Assessment of instrument exogeneity can be highly subjective when the coefficients are exactly identified.
IV can be difficult to explain to those who are unfamiliar with it.

Closing Comments about Instrumental Variables Studies
In general, a lagged value of the endogenous regressor is not a good instrument.
The traditional structural equation model uses lagged values of X and Y as instruments to break the simultaneity between the current values of X and Y.
[Diagram: X1, X2, Y1, Y2]
These models impose the awfully strong assumption that lagged values of X and Y affect the outcomes only through current values.

Closing Comments about Instrumental Variables Studies
Good IV models are generally interesting in their own right, and should not be treated as “tack on” analyses.
Practice varies widely across disciplines. Some researchers write papers about their discovery and application of a “clever” IV for some problem. Other researchers “tack on” IV models at the end of their analysis, often poorly, as a way to convince readers that their results are robust.

Rules for Good Practice with Instrumental Variables Models
IV models can be very informative, but it’s your job to convince your audience.
Show the first-stage model diagnostics: even the most clever IV might not be sufficiently strongly related to X to be a useful source of identification.
Report test(s) of overidentifying restrictions: an invalid IV is often worse than no IV at all.
Report LS endogeneity (DWH) test.
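
One common regression-based form of the Durbin-Wu-Hausman (DWH) endogeneity test adds the first-stage residuals to the original regression and checks whether their coefficient differs from zero. The sketch below assumes that this is the version meant by the slide. The simulated data and the helper `ols_with_se` are hypothetical, and the t-statistic uses homoskedasticity-only standard errors.

```python
import numpy as np

def ols_with_se(X, y):
    """Least-squares coefficients and homoskedasticity-only standard errors."""
    n, p = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sigma2 = resid @ resid / (n - p)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return b, np.sqrt(np.diag(cov))

rng = np.random.default_rng(4)
n = 5_000
z = rng.normal(size=n)
e = rng.normal(size=n)
x = 0.8 * z + 0.6 * e + 0.5 * rng.normal(size=n)   # endogenous by construction
y = 2.0 + 1.5 * x + e

const = np.ones(n)

# First stage: residuals hold the part of x not explained by the instrument.
Z = np.column_stack([const, z])
pi = np.linalg.lstsq(Z, x, rcond=None)[0]
v_hat = x - Z @ pi

# Augmented regression: y on x and the first-stage residuals.
b, se = ols_with_se(np.column_stack([const, x, v_hat]), y)
t_endog = b[2] / se[2]

print(f"coefficient on x = {b[1]:.3f}, t-stat on first-stage residual = {t_endog:.1f}")
```

A large absolute t-statistic on the residual term indicates that OLS and IV estimates differ systematically, that is, that x is endogenous. In this augmented regression the coefficient on x equals the 2SLS point estimate.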

Rules for Good Practice with Instrumental Variables Models
Most importantly, TELL A STORY about why a particular IV is a “good instrument”.
Something to consider when thinking about whether a particular IV is “good”: does the IV, for all intents and purposes, randomize the endogenous regressor?