IV/2SLS models

Presentation transcript:

1 IV/2SLS models

2

3

4 Vietnam era service
Defined as 1964-1975
Estimated 8.7 million served during era
3.4 million were in SE Asia
2.6 million served in Vietnam
1.6 million saw combat
203K wounded in action, 153K hospitalized
58,000 deaths
n%20war%20casualty.htm#t7

5 Vietnam Era Draft
In the 1st part of the war, the draft operated like those of WWII and the Korean War
At age 18, men reported to local draft boards
Could receive a deferment for a variety of reasons (kids, attending school)
If available for service, took a pre-induction physical and tests
Military needs determined how many were drafted

6 Everyone drafted went to the Army
Local draft boards filled the Army's needs. Priorities:
–Delinquents, volunteers, non-volunteers
–For non-volunteers, determined by age
College enrollment was a powerful way to avoid service
–Men with a college degree were 1/3 less likely to serve

7 Draft Lottery
Proposed by Nixon
Passed in Nov 1969, 1st lottery Dec 1, 1969
1st lottery for men age on 1/1/70
–Men born
Randomly assigned number 1-365, the Draft Lottery number (DLN)
Military estimates needs, sets threshold T
If DLN<=T, drafted
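The eligibility rule on this slide (drafted if DLN<=T) is what later serves as the instrument. A minimal sketch, written in Python rather than the deck's Stata so that it is self-contained; the function name, the threshold and the lottery numbers are all hypothetical, not the historical values:

```python
def draft_eligible(dln, threshold):
    """Return 1 if the draft lottery number (DLN) is at or below the threshold T, else 0."""
    if not 1 <= dln <= 365:
        raise ValueError("DLN must be between 1 and 365")
    return 1 if dln <= threshold else 0

# Hypothetical threshold and lottery numbers, for illustration only.
T = 100
dlns = [12, 100, 101, 365]
z = [draft_eligible(d, T) for d in dlns]
print(z)
```

Because the DLN is randomly assigned, the 0/1 indicator z is unrelated to individual characteristics by construction, which is what makes it a candidate instrument for military service.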

8 Questions? What are the research questions? Why can we NOT obtain estimates from observational data?

9 If volunteer, could get better assignment
Thresholds for service
Draft | Year of Birth | Threshold
Draft suspended in 1973

10

11

12

13

14 Angrist/Evans

15

16

17

18

19

20

21

22

23 Correlation coefficient

24 Ratio of variances = ( / )^2 =

25 R2 = / =
βiv = / =

26 Reduced form, just identified model

27 First stage, just identified model

28 2SLS, just identified model
βiv = / =
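With one binary instrument and no covariates, the just identified estimate is the reduced form divided by the first stage (the Wald estimator). The sketch below uses Python instead of Stata so it runs on its own; the data are hypothetical toy values, not the Angrist/Evans sample, and the function name is illustrative. With covariates in the model, as on the slides, the ratio holds for the regression coefficients rather than for raw differences in means.

```python
from statistics import mean

def wald_iv(y, x, z):
    """IV estimate with one binary instrument z: (reduced form) / (first stage)."""
    y1 = mean(yi for yi, zi in zip(y, z) if zi == 1)
    y0 = mean(yi for yi, zi in zip(y, z) if zi == 0)
    x1 = mean(xi for xi, zi in zip(x, z) if zi == 1)
    x0 = mean(xi for xi, zi in zip(x, z) if zi == 0)
    reduced_form = y1 - y0   # effect of z on the outcome
    first_stage = x1 - x0    # effect of z on the endogenous variable
    if first_stage == 0:
        raise ValueError("instrument has no first stage")
    return reduced_form / first_stage, reduced_form, first_stage

# Hypothetical toy data: z = instrument, x = endogenous variable, y = outcome.
z = [1, 1, 1, 1, 0, 0, 0, 0]
x = [1, 1, 1, 0, 1, 0, 0, 0]
y = [0, 0, 1, 1, 1, 1, 1, 0]

beta_iv, rf, fs = wald_iv(y, x, z)
print(rf, fs, beta_iv)
```

In the slides' notation, x would be morekids, z would be samesex and y would be workedm.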

29 1st stage, over identified model

30 ivreg2
Download from www
Within Stata, type ssc install ivreg2, replace and hit return
Does all the tests seamlessly

31 * the syntax is ivreg2 y w (x=z), first endog(x);
* the first option asks stata to report the 1st stage, and;
* endog(x) asks stata to do the Hausman-Wu test of endogeneity;
ivreg2 workedm boy1st boy2nd agem1 agefstm black hispan othrace (morekids=samesex), first endog(morekids);
Annotations on the command:
–workedm: outcome of interest
–boy1st boy2nd agem1 agefstm black hispan othrace: W's (exogenous covariates)
–(morekids=samesex): endogenous variable and instruments
–first: ask for 1st stage
–endog(morekids): test for endogeneity of morekids in model

32 IV (2SLS) estimation
Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only
[ivreg2 output; most numeric values were lost in the transcript]
F( 8,254645)
workedm | Coef. Std. Err. z P>|z| [95% Conf. Interval]
Rows: morekids, boy1st, boy2nd, agem1, agefstm, black, hispan, othrace, _cons
Underidentification test (Anderson canon. corr. LM statistic): Chi-sq(1)
Weak identification test (Cragg-Donald Wald F statistic)
Stock-Yogo weak ID test critical values by maximal IV size: only the last value, 5.53, survives
Source: Stock-Yogo (2005). Reproduced by permission
Sargan statistic (overidentification test of all instruments): 0.000

33 OLS estimation
Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only
[Stata output; numeric values were lost in the transcript]
F( 8,254645)
morekids | Coef. Std. Err. t P>|t| [95% Conf. Interval]
Rows: boy1st, agem1, agefstm, black, hispan, othrace, twoboys, twogirls, _cons
Included instruments: boy1st agem1 agefstm black hispan othrace twoboys twogirls
F test of excluded instruments: F( 2,254645)
Angrist-Pischke multivariate F test of excluded instruments: F( 2,254645)
Annotation: 1st stage F

34 Summary results for first-stage regressions
Variable | F( 2,254645) P-val | (Underid) AP Chi-sq( 2) P-val | (Weak id) AP F( 2,254645)
morekids | | |

35 IV (2SLS) estimation
Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only
[ivreg2 output; most numeric values were lost in the transcript]
F( 7,254646)
workedm | Coef. Std. Err. z P>|z| [95% Conf. Interval]
Rows: morekids, boy1st, agem1, agefstm, black, hispan, othrace, _cons
Underidentification test (Anderson canon. corr. LM statistic): Chi-sq(2)
Weak identification test (Cragg-Donald Wald F statistic)
Stock-Yogo weak ID test critical values by maximal IV size: only the last value, 7.25, survives
Source: Stock-Yogo (2005). Reproduced by permission
Sargan statistic (overidentification test of all instruments): Chi-sq(1)
-endog- option: Endogeneity test of endogenous regressors: Chi-sq(1)
Regressors tested: morekids
Instrumented: morekids
Included instruments: boy1st agem1 agefstm black hispan othrace
Excluded instruments: twoboys twogirls
Annotations: Test of over id.; Hausman endo test

36 . * output residuals and do the tests of overid;
. * and hausman test by brute force;
. predict res_2sls_worked, res;
. * test of overid;
. reg res_2sls_worked twoboys twogirls boy1st agem1 agefstm black hispan othrace;
[Stata output; most numeric values were lost in the transcript]
F( 8,254645) = 0.77
res_2sls_w~d | Coef. Std. Err. t P>|t| [95% Conf. Interval]
Rows: twoboys, twogirls, boy1st, agem1, agefstm, black, hispan, othrace, _cons

37 SSM =
SST =
R2 = SSM/SST = 2.43E-5
N =
NR2 = 6.18
Distributed as χ2(1)
P-value of 6.18 is
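The NR2 calculation on this slide can be checked by hand. The sketch below is in Python rather than Stata so it is self-contained. It uses the fact that the upper tail of a χ2(1) variable equals erfc(sqrt(x/2)). The sample size is an assumption: it is inferred from the F( 8,254645) degrees of freedom reported for the residual regression (8 regressors plus a constant), not read off the slide.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail probability P(X > stat) for a chi-square variable with 1 degree of freedom."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))

r2 = 2.43e-5              # R2 from the regression of 2SLS residuals on all instruments
n = 254645 + 8 + 1        # ASSUMPTION: N inferred from F(8,254645), i.e. residual df + 9 parameters
nr2 = n * r2              # close to the slide's 6.18 (R2 is rounded on the slide)

p_value = chi2_sf_1df(6.18)   # p-value for the statistic as reported on the slide
print(round(nr2, 2), round(p_value, 4))
```

The resulting p-value is roughly 0.013, so the overidentifying restriction would be rejected at the 5 percent level but not at the 1 percent level.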

38 . * Run Hausman's test of endogeneity, two instrument case;
. * add residual from 1st stage regression to OLS of structural model;
. reg workedm morekids boy1st agem1 agefstm black hispan othrace res_1st_2zs;
[Stata output; most numeric values were lost in the transcript]
F( 8,254645)
workedm | Coef. Std. Err. t P>|t| [95% Conf. Interval]
Rows: morekids, boy1st, agem1, agefstm, black, hispan, othrace, res_1st_2zs, _cons
. * notice that OLS of this model generates 2SLS estimates of the other;
. * variables in the model (morekids, boy1st, etc.);
. test res_1st_2zs;
( 1) res_1st_2zs = 0
F( 1,254645) = 3.81
Prob > F =
Annotation: Do Hausman test brute force

39 Can reject at the 5.1 percent level the null that the coefficients are the same
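The "brute force" Hausman procedure rests on an algebraic fact noted in the Stata comments: adding the 1st stage residual to the structural OLS regression reproduces the 2SLS coefficient on the endogenous variable. A minimal sketch of that fact, in Python rather than Stata so it is self-contained. The data are simulated with hypothetical parameters, there is one instrument and no covariates, and the small `ols` helper is written only for this illustration.

```python
import random

def ols(columns, y):
    """OLS coefficients via the normal equations (Gauss-Jordan elimination).
    `columns` is a list of regressor lists and must include the constant."""
    k = len(columns)
    aug = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           + [sum(a * b for a, b in zip(columns[i], y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Simulated data (hypothetical): u is an unobserved confounder that makes x endogenous.
random.seed(0)
n = 2000
const = [1.0] * n
z = [float(random.random() < 0.5) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [0.5 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 - 0.3 * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

# 1st stage: regress x on the instrument, keep fitted values and residuals.
a0, a1 = ols([const, z], x)
xhat = [a0 + a1 * zi for zi in z]
resid = [xi - xh for xi, xh in zip(x, xhat)]

# 2SLS: regress y on the fitted values.
beta_2sls = ols([const, xhat], y)[1]

# Control-function regression: y on x plus the 1st stage residual.
coefs = ols([const, x, resid], y)
beta_cf, resid_coef = coefs[1], coefs[2]
print(beta_2sls, beta_cf, resid_coef)
```

The test on the slide is then the t or F test that the coefficient on the residual (`resid_coef` here) is zero; the standard error calculation is omitted from this sketch.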

40 Angrist/Krueger

41 Example
Suppose a school district requires that a child turn 6 by October 31 to enter the 1st grade
Has compulsory education until age 18
Consider two kids
–One born Oct 1, 1960
–Another born Nov 1, 1960

42 Born Oct 1, 1960
–Starts school in 1966 (age 5)
–Turns 6 a few months into school
–Starts senior year in 1977 (age 16)
–Does not turn 18 until after high school is over
Born Nov 1, 1960
–Starts school in 1967 (age 6)
–Turns 7 a few months into school
–Starts senior year in 1978 (age 17)
–Turns 18 midway through senior year
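The two timelines above can be reproduced with date arithmetic. This is a sketch in Python rather than Stata. The Oct 31 cutoff comes from the example; the Sept 1 start and June 1 end of the school year are assumptions made here for illustration, as is the absence of grade repetition. The function and key names are hypothetical.

```python
from datetime import date

def age_on(birth, day):
    """Completed years of age on a given day."""
    return day.year - birth.year - ((day.month, day.day) < (birth.month, birth.day))

def school_path(birth, cutoff=(10, 31), start=(9, 1), end=(6, 1)):
    """Entry year and ages for a child who must turn 6 by the cutoff to enter 1st grade.
    ASSUMPTIONS: school year runs Sept 1 to June 1; 12 grades, no repetition.
    Feb 29 birthdays are not handled."""
    entry_year = birth.year + 6
    if date(entry_year, birth.month, birth.day) > date(entry_year, *cutoff):
        entry_year += 1  # misses the cutoff and waits a year
    senior_start = date(entry_year + 11, *start)
    senior_end = date(entry_year + 12, *end)
    eighteenth = date(birth.year + 18, birth.month, birth.day)
    return {
        "entry_year": entry_year,
        "age_at_entry": age_on(birth, date(entry_year, *start)),
        "senior_year_starts": senior_start.year,
        "age_at_senior_start": age_on(birth, senior_start),
        "turns_18_before_graduation": eighteenth <= senior_end,
    }

october_child = school_path(date(1960, 10, 1))
november_child = school_path(date(1960, 11, 1))
print(october_child)
print(november_child)
```

The November child reaches 18 while still enrolled, so compulsory schooling keeps that child in school longer; this is the variation in schooling that quarter of birth generates.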

43

44

45

46

47 1st stage
Reduced form
βiv = / =

48 Correlation coefficient: z and x

49

50

51

52

53

54

55

56 Overidentified model
10 years of birth
3 quarters of birth
30 instruments

57 The xi command with i.m*i.n generates dummies for i.m and i.n, then all the unique interactions of m and n

58 YOB effects
QOB main effects and QOB x YOB interactions as instruments
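One way to see where the 30 instruments and the chi2(29) overidentification test come from is to enumerate the dummies that an i.m*i.n expansion produces when the first level of each variable is dropped as the base. The sketch below is Python, not Stata. The variable names it prints are illustrative and are not the names xi actually creates, and the year labels are hypothetical placeholders for 10 consecutive birth years.

```python
from itertools import product

def xi_expand(m_name, m_levels, n_name, n_levels):
    """Main-effect dummies for m and n (first level omitted as the base)
    plus all pairwise interactions of the remaining levels."""
    m_dummies = [f"{m_name}_{a}" for a in m_levels[1:]]
    n_dummies = [f"{n_name}_{b}" for b in n_levels[1:]]
    interactions = [f"{m_name}X{n_name}_{a}_{b}"
                    for a, b in product(m_levels[1:], n_levels[1:])]
    return m_dummies, n_dummies, interactions

quarters = [1, 2, 3, 4]
years = list(range(10))   # hypothetical labels for 10 birth years

qob_main, yob_main, qob_x_yob = xi_expand("qob", quarters, "yob", years)

instruments = qob_main + qob_x_yob   # excluded from the structural equation
covariates = yob_main                # YOB effects stay in the model
overid_df = len(instruments) - 1     # one endogenous regressor

print(len(qob_main), len(yob_main), len(qob_x_yob), len(instruments), overid_df)
```

Under these assumptions there are 3 QOB main effects and 27 interactions, giving 30 instruments and 29 overidentifying restrictions.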

59 . estat overid;
Tests of overidentifying restrictions:
Sargan (score) chi2(29) = (p = )
Basmann chi2(29) = (p = )

60 1st stage F: lots of concerns about finite sample bias

61 In columns (4) and (8), age and agesq reduce the information contained in the instrument.
1st stage F falls to 1.6.
Compare 2SLS to OLS in these cases.
In this instance, with a low F (poor 1st stage fit), results collapse to OLS

62 Generate instruments by interacting
–3 QOB x 10 YOB dummies (30)
–3 QOB x 50 YOB dummies (147)
177 instruments, 176 DOF in NR2 test
Notice how close the 2SLS and OLS are

63