Identification: Instrumental Variables

Identification: Instrumental Variables. Ziyodullo Parpiev, PhD. Delivered at Summer School 2017, Tashkent, Uzbekistan, June 16, 2017.

Why do we need IV?
Internal validity problems: independent variables are correlated with the error term. Three types are relevant here. The first two are usually solved by correcting the error or adding the omitted variable, but what if no additional data are available?
1. Errors-in-variables: error in the measurement of one of the variables. Example: systematically incorrect answers on a survey, or a systematic data-coding problem. Ideally you would just correct the error, but if you can't, you might be able to use an IV.
2. Omitted variable bias (OVB): a variable that determines Y and is correlated with X is excluded from the regression equation. When including the omitted variable is impossible, you can use an IV.
3. Simultaneous causality (endogeneity): X → Y and Y → X. Simple OLS picks up both effects and produces a biased estimate of the causal effect.
Example of simultaneous causality: $Y_i = B_0 + B_1 X_i + U_i$ and $X_i = A_0 + A_1 Y_i + V_i$. If $U_i$ is negative, this decreases $Y_i$ in the first equation, which in turn affects the value of $X_i$ in the second equation. If $A_1$ is positive, a low $Y_i$ leads to a low $X_i$. So if $A_1$ is positive, $X_i$ and $U_i$ are correlated.
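The simultaneous-causality example can be checked numerically. The sketch below is illustrative only: the parameter values (B1 = 0.5, A1 = 0.5), the unit-variance normal errors, and the sample size are assumptions made here, not values from the slides. It solves the two equations for X, simulates data, and shows that X is correlated with U and that the OLS slope drifts away from the true B1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed (hypothetical) structural parameters for illustration.
B0, B1 = 1.0, 0.5   # Y = B0 + B1*X + U
A0, A1 = 0.0, 0.5   # X = A0 + A1*Y + V

U = rng.normal(size=n)
V = rng.normal(size=n)

# Substitute the first equation into the second and solve for X.
X = (A0 + A1 * B0 + A1 * U + V) / (1.0 - A1 * B1)
Y = B0 + B1 * X + U

# OLS slope of Y on X equals cov(X, Y) / var(X).
cov_xy = np.cov(X, Y)
ols_slope = cov_xy[0, 1] / cov_xy[0, 0]

# Correlation between the regressor and the structural error.
corr_xu = np.corrcoef(X, U)[0, 1]

print(f"true B1 = {B1}, OLS slope = {ols_slope:.3f}, corr(X, U) = {corr_xu:.3f}")
```

With these assumed values the population OLS slope is B1 + cov(X, U)/var(X) = 0.5 + 0.3 = 0.8, so the estimate overstates the causal effect even in a very large sample.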

What is the IV Technique?
When you have an endogeneity problem, you want to separate out the part of the independent variable that is correlated with the error term. Once that part is removed, you can get a consistent causal estimate of the effect of the "uncorrelated portion" of the independent variable on the dependent variable of interest. Now Hongyi will lead us through a more detailed presentation of how this technique works.

IV: basic idea
Consider the following regression model: $y_i = \beta_0 + \beta_1 X_i + e_i$.
Variation in the endogenous regressor $X_i$ has two parts: the part that is uncorrelated with the error ("good" variation) and the part that is correlated with the error ("bad" variation).
The basic idea behind instrumental variables regression is to isolate the "good" variation and disregard the "bad" variation.

IV: conditions for a valid instrument
The first step is to identify a valid instrument. A variable $Z_i$ is a valid instrument for the endogenous regressor $X_i$ if it satisfies two conditions:
1. Relevance: $corr(Z_i, X_i) \neq 0$
2. Exogeneity: $corr(Z_i, e_i) = 0$
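A short derivation shows why exactly these two conditions are needed. It is a sketch for the single-instrument, single-regressor case, using the model $y_i = \beta_0 + \beta_1 X_i + e_i$; the covariance notation is introduced here for illustration.

```latex
\begin{align*}
\operatorname{cov}(Z_i, y_i)
  &= \operatorname{cov}(Z_i,\; \beta_0 + \beta_1 X_i + e_i) \\
  &= \beta_1 \operatorname{cov}(Z_i, X_i) + \operatorname{cov}(Z_i, e_i) \\
  &= \beta_1 \operatorname{cov}(Z_i, X_i)
     && \text{(exogeneity: } \operatorname{cov}(Z_i, e_i) = 0\text{)} \\[4pt]
\beta_1
  &= \frac{\operatorname{cov}(Z_i, y_i)}{\operatorname{cov}(Z_i, X_i)}
     && \text{(relevance: } \operatorname{cov}(Z_i, X_i) \neq 0\text{)}
\end{align*}
```

Exogeneity removes the error term from the numerator, and relevance keeps the denominator away from zero. If the denominator is close to zero, small violations of exogeneity are magnified, which is the weak-instrument problem.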

IV: two-stage least squares
The most common IV method is two-stage least squares (2SLS).
Stage 1: Decompose $X_i$ into the component that can be predicted by $Z_i$ and the problematic component: $X_i = \pi_0 + \pi_1 Z_i + v_i$
Stage 2: Use the predicted value of $X_i$ from the first-stage regression to estimate its effect on $y_i$: $y_i = \beta_0 + \beta_1 \hat{X}_i + \varepsilon_i$
Note: software packages like Stata perform the two stages in a single regression, producing the correct standard errors.
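The two stages can be sketched by hand with NumPy on simulated data. Everything in the data-generating process below (the coefficients, the unobserved confounder `u`, the sample size) is an assumption made for illustration. The helper `ols` is a hypothetical convenience wrapper around `numpy.linalg.lstsq`.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 50_000

# Assumed data-generating process: u makes x endogenous, z is a valid instrument.
z = rng.normal(size=n)
u = rng.normal(size=n)                   # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)     # endogenous regressor
e = u + rng.normal(size=n)               # structural error, correlated with x
y = 1.0 + 2.0 * x + e                    # true slope is 2.0

const = np.ones(n)

# Naive OLS of y on x (biased because cov(x, e) != 0).
beta_ols = ols(y, np.column_stack([const, x]))

# Stage 1: regress x on the instrument and keep the fitted values.
Z1 = np.column_stack([const, z])
x_hat = Z1 @ ols(x, Z1)

# Stage 2: regress y on the fitted values from stage 1.
beta_2sls = ols(y, np.column_stack([const, x_hat]))

print(f"OLS slope  = {beta_ols[1]:.3f}")
print(f"2SLS slope = {beta_2sls[1]:.3f} (true value 2.0)")
```

The point estimate from this manual procedure matches 2SLS, but the standard errors a naive second-stage regression would report are not valid, which is why dedicated IV routines are preferred in practice.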

Z as an instrument for X

Clear?

Evaluating Instruments
Two conditions:
1. Instrument relevance: the IV is correlated with the problematic independent variable, $corr(Z_i, X_i) \neq 0$
2. Instrument exogeneity: the IV is NOT correlated with the error term, $corr(Z_i, e_i) = 0$

Evaluating Instruments
Number of police → crime (Steven Levitt 1997)
Simple OLS gives a positive result: increase the number of police, increase crime. Why?
The problem with simple OLS is that there is a policy response to crime (hire more police), which causes a reverse causality effect.

Evaluating Instruments
Number of police → crime (Steven Levitt 1997)
Simple OLS gives a positive result: increase the number of police, increase crime. Why?
Instrument: was there a mayoral election in the year the measurements were taken?
IV regression gives the expected negative result: increase the number of police, decrease crime.
Why is this a good instrument? Mayors hire more police in election years, so the instrument is correlated with the independent variable. But whether there is a mayoral election does NOT affect the level of crime, other than through the number of police.

IV: example
Two-stage least squares:
Stage 1: Decompose police hires into the component that can be predicted by the electoral cycle and the problematic component: $\text{police}_i = \pi_0 + \pi_1 \text{election}_i + v_i$
Stage 2: Use the predicted value of $\text{police}_i$ from the first-stage regression to estimate its effect on $\text{crime}_i$: $\text{crime}_i = \beta_0 + \beta_1 \widehat{\text{police}}_i + \varepsilon_i$
Finding: an increased police force reduces violent crime (but has little effect on property crime).

IV: number of instruments
There must be at least as many instruments as endogenous regressors.
Let k = number of endogenous regressors and m = number of instruments.
The regression coefficients are:
exactly identified if m = k (OK)
overidentified if m > k (OK)
underidentified if m < k (not OK)

IV: testing instrument relevance
How do we know if our instruments are valid? Recall our first condition for a valid instrument:
1. Relevance: $corr(Z_i, X_i) \neq 0$
Stock and Watson's rule of thumb: the first-stage F-statistic testing the hypothesis that the coefficients on the instruments are jointly zero should be at least 10 (for a single endogenous regressor).
A small F-statistic means the instruments are "weak" (they explain little of the variation in X) and the estimator is biased.
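The first-stage F-statistic can be computed by comparing the first-stage regression with and without the instruments. The sketch below assumes homoskedastic errors and no additional exogenous controls; the function name `first_stage_f` and the simulated coefficients are hypothetical choices made for illustration.

```python
import numpy as np

def first_stage_f(x, Z):
    """F-statistic for H0: all instrument coefficients are zero in x ~ 1 + Z.

    Assumes homoskedastic errors and no other exogenous controls, so the
    restricted model contains only the constant.
    """
    n, m = Z.shape
    X_u = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(X_u, x, rcond=None)
    rss_u = np.sum((x - X_u @ coef) ** 2)      # unrestricted residual sum of squares
    rss_r = np.sum((x - x.mean()) ** 2)        # restricted: constant only
    return ((rss_r - rss_u) / m) / (rss_u / (n - m - 1))

rng = np.random.default_rng(2)
n = 2_000
z = rng.normal(size=(n, 1))

# Assumed first stages: one strong instrument, one very weak instrument.
x_strong = 0.5 * z[:, 0] + rng.normal(size=n)
x_weak = 0.02 * z[:, 0] + rng.normal(size=n)

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)

print(f"strong instrument: F = {f_strong:.1f}")
print(f"weak instrument:   F = {f_weak:.1f}")
print("rule of thumb: F should be at least 10")
```

When the model includes exogenous controls, they belong in both the restricted and the unrestricted first-stage regressions, so that F tests only the excluded instruments.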

IV: testing instrument exogeneity
Recall our second condition for a valid instrument:
2. Exogeneity: $corr(Z_i, e_i) = 0$
If you have the same number of instruments and endogenous regressors, it is impossible to test for instrument exogeneity.
But if you have more instruments than endogenous regressors, you can use the overidentifying restrictions test: regress the residuals from the 2SLS regression on the instruments (and any exogenous control variables) and test whether the coefficients on the instruments are all zero.
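One common form of the overidentifying restrictions test is the Sargan statistic, n times the R-squared from regressing the 2SLS residuals on the instruments. Under homoskedasticity and valid instruments it is approximately chi-squared with m - k degrees of freedom. The sketch below uses that n·R² form, which is a choice made here rather than something the slides specify; the simulated data, the function name `sargan_j`, and the helper `ols` are likewise assumptions for illustration.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def sargan_j(y, x, Z):
    """Sargan statistic for one endogenous regressor x and instruments Z (n x m)."""
    n = len(y)
    const = np.ones(n)
    Z1 = np.column_stack([const, Z])
    x_hat = Z1 @ ols(x, Z1)                                  # first stage
    beta = ols(y, np.column_stack([const, x_hat]))           # second stage
    resid = y - beta[0] - beta[1] * x                        # residuals use actual x
    fitted = Z1 @ ols(resid, Z1)
    r2 = 1.0 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)
    return n * r2

rng = np.random.default_rng(3)
n = 5_000
Z = rng.normal(size=(n, 2))                                  # m = 2 instruments, k = 1
u = rng.normal(size=n)
x = 0.7 * Z[:, 0] + 0.7 * Z[:, 1] + u + rng.normal(size=n)
e = u + rng.normal(size=n)

y_valid = 1.0 + 2.0 * x + e                # both instruments excluded from y
y_invalid = y_valid + 1.0 * Z[:, 1]        # second instrument affects y directly

CRIT_5PCT_DF1 = 3.841                      # chi-squared(1) critical value at 5%

j_valid = sargan_j(y_valid, x, Z)
j_invalid = sargan_j(y_invalid, x, Z)

print(f"valid instruments:   J = {j_valid:.2f}")
print(f"invalid instrument:  J = {j_invalid:.2f}")
print(f"reject exogeneity at 5% if J > {CRIT_5PCT_DF1}")
```

A failure to reject does not prove exogeneity: the test has no power if all instruments are invalid in the same direction.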

IV: drawbacks of this method
It can be difficult to find an instrument that is both relevant (not weak) and exogenous.
Assessment of instrument exogeneity can be highly subjective when the coefficients are exactly identified.
IV can be difficult to explain to those who are unfamiliar with it.

Closing Comments about Instrumental Variables Studies
In general, a lagged value of the endogenous regressor is not a good instrument.
Traditional structural equation models use lagged values of X and Y as instruments to break the simultaneity between the current values of X and Y. (Diagram: X1, X2, Y1, Y2.)
These models impose the awfully strong assumption that lagged values of X and Y only affect the outcomes through current values.

Closing Comments about Instrumental Variables Studies
Good IV models are generally interesting in their own right, and should not be treated as "tack on" analyses.
Practice varies widely across disciplines. Some researchers write papers about their discovery and application of a "clever" IV for some problem. Other researchers "tack on" IV models at the end of their analysis, often poorly, as a way to convince readers that their results are robust.

Rules for Good Practice with Instrumental Variables Models
IV models can be very informative, but it's your job to convince your audience.
Show the first-stage model diagnostics: even the most clever IV might not be sufficiently strongly related to X to be a useful source of identification.
Report test(s) of overidentifying restrictions: an invalid IV is often worse than no IV at all.
Report the LS endogeneity (Durbin-Wu-Hausman, DWH) test.
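The endogeneity (DWH) test has a regression-based form: add the first-stage residuals to the outcome regression and test whether their coefficient is zero. The sketch below assumes homoskedastic errors, one endogenous regressor, and one instrument; the simulated data and the helper `ols_with_se` are hypothetical and only meant to illustrate the idea.

```python
import numpy as np

def ols_with_se(y, X):
    """OLS coefficients, homoskedastic standard errors, and residuals."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - p)
    se = np.sqrt(np.diag(sigma2 * xtx_inv))
    return coef, se, resid

rng = np.random.default_rng(4)
n = 20_000

# Assumed data-generating process with an endogenous regressor x.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u + rng.normal(size=n)     # true slope is 2.0

const = np.ones(n)

# Step 1: first stage, keep the residuals v_hat.
_, _, v_hat = ols_with_se(x, np.column_stack([const, z]))

# Step 2: augmented regression of y on x and v_hat.
coef, se, _ = ols_with_se(y, np.column_stack([const, x, v_hat]))
t_vhat = coef[2] / se[2]

print(f"coefficient on x (equals the 2SLS estimate): {coef[1]:.3f}")
print(f"t-statistic on first-stage residuals: {t_vhat:.1f}")
print("reject exogeneity of x at 5% if |t| > 1.96")
```

A large t-statistic on the first-stage residuals indicates that OLS and IV estimates differ systematically, so x should be treated as endogenous.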

Rules for Good Practice with Instrumental Variables Models
Most importantly, TELL A STORY about why a particular IV is a "good instrument".
Something to consider when thinking about whether a particular IV is "good": does the IV, for all intents and purposes, randomize the endogenous regressor?