Microeconometric Modeling William Greene Stern School of Business New York University New York NY USA 1.1 Descriptive Statistics and Linear Regression
Linear Regression Model
Data description: basic statistics, tables, histogram, box plot, kernel density estimator
Linear regression model: specification and estimation, nonlinearities, interactions, inference and testing (Wald, F, LM), prediction and model fit
Endogeneity: 2SLS, control function, Hausman test
Cornwell and Rupert Panel Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
EXP = work experience
WKS = weeks worked
OCC = occupation, 1 if blue collar
IND = 1 if manufacturing industry
SOUTH = 1 if resides in south
SMSA = 1 if resides in a city (SMSA)
MS = 1 if married
FEM = 1 if female
UNION = 1 if wage set by union contract
ED = years of education
LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155.
Objective: Impact of Education on (log) Wage
Specification: What is the right model to use to analyze this association?
Estimation
Inference
Analysis
Simple Linear Regression LWAGE = 5.8388 + 0.0652*ED
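As a small worked example of reading the fitted line, the sketch below plugs two education levels into the equation above and converts the slope into an approximate percentage effect on the wage. The education levels (12 and 16 years) and the function name are illustrative choices, not taken from the slides.

```python
import math

# Coefficients copied from the fitted line LWAGE = 5.8388 + 0.0652*ED.
intercept = 5.8388
slope_ed = 0.0652

def predicted_lwage(ed_years):
    """Predicted log wage at a given number of years of education."""
    return intercept + slope_ed * ed_years

# Illustrative education levels (assumed for the example).
pred_12 = predicted_lwage(12)
pred_16 = predicted_lwage(16)

# In a log-linear model, one more year of education multiplies the
# predicted wage by exp(slope); this is the implied percentage change.
pct_per_year = 100 * (math.exp(slope_ed) - 1)

print(f"Predicted LWAGE at ED=12: {pred_12:.4f}")
print(f"Predicted LWAGE at ED=16: {pred_16:.4f}")
print(f"Implied wage change per year of education: {pct_per_year:.2f}%")
```

The percentage figure describes the fitted association only. Whether it can be read as a causal return to schooling is the specification question raised in the objective.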
Multiple Regression
Nonlinear Specification: Quadratic Effect of Experience
Partial Effects
Coefficients do not tell the story.
Education: .05654
Experience: .04045 - 2*.00068*Exp
FEM: -.38922
Effect of Experience = .04045 - 2*.00068*Exp. The effect is positive up to about 29.7 years of experience (.04045/(2*.00068)) and negative after.
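The turning point of the quadratic can be checked directly. The sketch below uses only the two experience coefficients reported above; the function and variable names are illustrative.

```python
# Coefficients on EXP and EXP squared, as reported on the slide.
b_exp = 0.04045
b_exp_sq = -0.00068

def experience_effect(exp_years):
    """Partial effect of one more year of experience on E[LWAGE]."""
    return b_exp + 2 * b_exp_sq * exp_years

# The effect changes sign where b_exp + 2*b_exp_sq*Exp = 0.
turning_point = -b_exp / (2 * b_exp_sq)

print(f"Turning point: {turning_point:.2f} years")
for years in (1, 10, 20, 29, 30, 40):
    print(f"Exp = {years:2d}: partial effect = {experience_effect(years):+.5f}")
```

Evaluating the effect on a grid like this makes the point of the slide concrete: the coefficient on EXP alone says little, because the partial effect depends on the level of experience.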
Model Implication: Effect of Experience and Male vs. Female
Interaction Effect Gender Difference in Partial Effects
Partial Effect of a Year of Education
$\partial E[\log Wage]/\partial ED = \beta_{ED} + \beta_{ED \times FEM} \cdot FEM$
Note, the effect is positive. Effect is larger for women.
Gender Effect Varies by Years of Education
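A short derivation shows why one interaction term produces both results. It assumes the model contains ED, FEM, and their product; the symbols $\beta$ and $\gamma$ are labels chosen here, and $\mathbf{x}'\gamma$ stands in for the remaining regressors.

```latex
\begin{aligned}
E[\log Wage \mid ED, FEM, \mathbf{x}]
  &= \beta_0 + \beta_{ED}\,ED + \beta_{FEM}\,FEM
     + \beta_{ED \times FEM}\,(ED \cdot FEM) + \mathbf{x}'\gamma \\[4pt]
\frac{\partial E[\log Wage]}{\partial ED}
  &= \beta_{ED} + \beta_{ED \times FEM}\,FEM \\[4pt]
E[\log Wage \mid FEM = 1] - E[\log Wage \mid FEM = 0]
  &= \beta_{FEM} + \beta_{ED \times FEM}\,ED
\end{aligned}
```

The same coefficient $\beta_{ED \times FEM}$ appears in both lines: it shifts the education effect by gender, and it makes the gender difference a linear function of years of education.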
Endogeneity
y = Xβ + ε. Definition: E[ε|x] ≠ 0
Why would E[ε|x] = 0 fail? The most common reasons:
Omitted variables
Unobserved heterogeneity (equivalent to omitted variables)
Measurement error on the RHS (equivalent to omitted variables)
Endogenous sampling and attrition
The Effect of Education on LWAGE
An Exogenous Influence
Instrumental Variables
Structure: LWAGE(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION); ED(MS, FEM)
Reduced Form: LWAGE[ED(MS, FEM), EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION]
Two Stage Least Squares Strategy
Reduced Form: LWAGE[ED(MS, FEM, X), EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION]
Strategy:
(1) Purge ED of the influence of everything but MS, FEM (and the other exogenous variables). Predict ED using all exogenous information in the sample (X and Z).
(2) Regress LWAGE on this prediction of ED and everything else.
Standard errors must be adjusted because the second-stage regression uses predicted ED, not actual ED.
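The two-step strategy above can be sketched with NumPy on simulated data. This is a minimal illustration, not the Cornwell and Rupert estimates: the data-generating process, the coefficient values, and the names `z1`, `z2`, and `x1` (stand-ins for the two dummy instruments and one exogenous control) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (assumed for illustration, NOT the Cornwell-Rupert sample).
z1 = rng.integers(0, 2, n).astype(float)   # dummy instrument (stand-in for MS)
z2 = rng.integers(0, 2, n).astype(float)   # dummy instrument (stand-in for FEM)
x1 = rng.normal(size=n)                    # exogenous control (stand-in for EXP etc.)
u = rng.normal(size=n)                     # disturbance in the ED equation
eps = 0.8 * u + rng.normal(size=n)         # wage disturbance, correlated with u
ed = 12 + 1.5 * z1 - 1.0 * z2 + 0.5 * x1 + u
lwage = 5.0 + 0.06 * ed + 0.1 * x1 + eps   # true ED coefficient is 0.06

ones = np.ones(n)
X = np.column_stack([ones, ed, x1])        # regressors in the wage equation
Z = np.column_stack([ones, z1, z2, x1])    # all exogenous variables

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

b_ols = ols(X, lwage)                      # inconsistent here, for comparison

# Step (1): predict ED from all exogenous information.
ed_hat = Z @ ols(Z, ed)

# Step (2): regress LWAGE on predicted ED and everything else.
X_hat = np.column_stack([ones, ed_hat, x1])
b_2sls = ols(X_hat, lwage)

# Adjusted standard errors: residuals use ACTUAL ED with the 2SLS coefficients.
resid = lwage - X @ b_2sls
s2 = resid @ resid / (n - X.shape[1])
XhXh_inv = np.linalg.inv(X_hat.T @ X_hat)
se_2sls = np.sqrt(np.diag(s2 * XhXh_inv))

# Unadjusted second-stage standard errors, shown only for contrast.
resid_naive = lwage - X_hat @ b_2sls
s2_naive = resid_naive @ resid_naive / (n - X.shape[1])
se_naive = np.sqrt(np.diag(s2_naive * XhXh_inv))

print(f"OLS  coefficient on ED: {b_ols[1]:.4f}")
print(f"2SLS coefficient on ED: {b_2sls[1]:.4f} (adjusted se {se_2sls[1]:.4f})")
print(f"Unadjusted second-stage se on ED: {se_naive[1]:.4f}")
```

The only difference between the two sets of standard errors is which residuals enter the variance estimate. Computing them from the second-stage regression directly treats predicted ED as if it were data, which is the adjustment the slide warns about.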
OLS
The extreme result for the coefficient on ED is probably because the instruments, MS and FEM, are dummy variables. There is not enough variation in these variables.
Source of Endogeneity
LWAGE = f(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + ε
ED = f(MS, FEM, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + u
Remove the Endogeneity by Using a Control Function
LWAGE = f(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + u + ε
Strategy: Estimate u. Add the estimate of u to the equation. ED is uncorrelated with ε when u is in the equation.
Auxiliary Regression for ED to Obtain Residuals
OLS with Residual (Control Function) Added; 2SLS
A Warning About Control Functions
The sum of squares is not computed correctly because the estimated residual U is in the regression. This is a general result: control function estimators usually require a correction to the estimated covariance matrix for the estimator.
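The control function steps can be sketched in the same way. The example below uses simulated data (an assumption, not the Cornwell and Rupert sample; `z1`, `z2`, and `x1` are hypothetical stand-ins) and checks that adding the first-stage residual to the OLS regression reproduces the 2SLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (assumed for illustration only).
z1 = rng.integers(0, 2, n).astype(float)   # dummy instrument
z2 = rng.integers(0, 2, n).astype(float)   # dummy instrument
x1 = rng.normal(size=n)                    # exogenous control
u = rng.normal(size=n)                     # disturbance in the ED equation
eps = 0.8 * u + rng.normal(size=n)         # wage disturbance, correlated with u
ed = 12 + 1.5 * z1 - 1.0 * z2 + 0.5 * x1 + u
lwage = 5.0 + 0.06 * ed + 0.1 * x1 + eps

ones = np.ones(n)
X = np.column_stack([ones, ed, x1])
Z = np.column_stack([ones, z1, z2, x1])

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Auxiliary regression for ED: fitted values and residuals.
ed_hat = Z @ ols(Z, ed)
u_hat = ed - ed_hat

# Control function regression: OLS of LWAGE on X plus the residual.
X_cf = np.column_stack([X, u_hat])
b_cf = ols(X_cf, lwage)

# 2SLS for comparison: replace ED by its fitted value.
X_hat = np.column_stack([ones, ed_hat, x1])
b_2sls = ols(X_hat, lwage)

print("Control function coefficients (const, ED, x1):", np.round(b_cf[:3], 4))
print("2SLS coefficients             (const, ED, x1):", np.round(b_2sls, 4))
print(f"Coefficient on the residual u_hat: {b_cf[3]:.4f}")
# Caution: the standard errors reported by a plain OLS routine for X_cf
# are not valid as they stand, because u_hat is itself estimated.
```

The coefficients agree because ED equals its fitted value plus the residual, and the residual is orthogonal to every column of Z. The sketch deliberately stops at the coefficients, since the covariance matrix is the part that needs the correction described above.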
An Endogeneity Test? (Hausman)

          Exogenous                  Endogenous
OLS       Consistent, Efficient      Inconsistent
2SLS      Consistent, Inefficient    Consistent

Base a test on $d = b_{2SLS} - b_{OLS}$.
Use a Wald statistic, $d'[\mathrm{Var}(d)]^{-1} d$.
What to use for the variance matrix? Hausman: $V_{2SLS} - V_{OLS}$
Hausman Test Chi squared with 1 degree of freedom
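A minimal sketch of the Hausman statistic with one degree of freedom follows. It uses simulated data (an assumption; `z1`, `z2`, and `x1` are hypothetical stand-ins) and applies the statistic in scalar form to the ED coefficient only, with each estimator's own residual variance. Both of those are choices made for this sketch rather than details given on the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (assumed for illustration only); ED is endogenous by design.
z1 = rng.integers(0, 2, n).astype(float)
z2 = rng.integers(0, 2, n).astype(float)
x1 = rng.normal(size=n)
u = rng.normal(size=n)
eps = 0.8 * u + rng.normal(size=n)
ed = 12 + 1.5 * z1 - 1.0 * z2 + 0.5 * x1 + u
lwage = 5.0 + 0.06 * ed + 0.1 * x1 + eps

ones = np.ones(n)
X = np.column_stack([ones, ed, x1])
Z = np.column_stack([ones, z1, z2, x1])
k = X.shape[1]

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

# OLS estimate and its covariance matrix.
b_ols = ols(X, lwage)
e_ols = lwage - X @ b_ols
V_ols = (e_ols @ e_ols / (n - k)) * np.linalg.inv(X.T @ X)

# 2SLS estimate and its covariance matrix (residuals use actual ED).
ed_hat = Z @ ols(Z, ed)
X_hat = np.column_stack([ones, ed_hat, x1])
b_2sls = ols(X_hat, lwage)
e_2sls = lwage - X @ b_2sls
V_2sls = (e_2sls @ e_2sls / (n - k)) * np.linalg.inv(X_hat.T @ X_hat)

# Scalar Hausman statistic on the ED coefficient (index 1).
d = b_2sls[1] - b_ols[1]
var_d = V_2sls[1, 1] - V_ols[1, 1]
hausman = d**2 / var_d

CHI2_1_CRIT_5PCT = 3.841   # 5% critical value, chi-squared with 1 d.f.
print(f"d = {d:.4f}, Var(d) = {var_d:.6f}, H = {hausman:.1f}")
print("Reject exogeneity of ED at 5%:", hausman > CHI2_1_CRIT_5PCT)
```

In finite samples the variance difference is not guaranteed to be positive, which is one of the complications with this form of the test. The sketch does not handle that case.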
Endogeneity Test: Wu
Considerable complication in the Hausman test (Greene (2012), pp. 234-237).
Simplification: Wu test. Regress y on X and the predicted values, X̂, for the endogenous part of X. Then use an ordinary Wald test on the added variables. This is a variable addition test.
Wu Test
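The variable addition form of the test can be sketched as follows, again on simulated data (an assumption, with hypothetical stand-ins `z1`, `z2`, and `x1`). Predicted ED is added to the original regression and its coefficient is tested with an ordinary Wald statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (assumed for illustration only); ED is endogenous by design.
z1 = rng.integers(0, 2, n).astype(float)
z2 = rng.integers(0, 2, n).astype(float)
x1 = rng.normal(size=n)
u = rng.normal(size=n)
eps = 0.8 * u + rng.normal(size=n)
ed = 12 + 1.5 * z1 - 1.0 * z2 + 0.5 * x1 + u
lwage = 5.0 + 0.06 * ed + 0.1 * x1 + eps

ones = np.ones(n)
X = np.column_stack([ones, ed, x1])
Z = np.column_stack([ones, z1, z2, x1])

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Predicted values for the endogenous part of X.
ed_hat = Z @ ols(Z, ed)

# Variable addition: regress LWAGE on X and predicted ED.
X_aug = np.column_stack([X, ed_hat])
b_aug = ols(X_aug, lwage)
e_aug = lwage - X_aug @ b_aug
s2 = e_aug @ e_aug / (n - X_aug.shape[1])
V_aug = s2 * np.linalg.inv(X_aug.T @ X_aug)

# Ordinary Wald test that the coefficient on the added variable is zero.
t_stat = b_aug[-1] / np.sqrt(V_aug[-1, -1])
wald = t_stat**2

CHI2_1_CRIT_5PCT = 3.841   # 5% critical value, chi-squared with 1 d.f.
reject = bool(wald > CHI2_1_CRIT_5PCT)
print(f"Coefficient on predicted ED: {b_aug[-1]:.4f}")
print(f"t = {t_stat:.2f}, Wald = {wald:.1f}, reject exogeneity at 5%: {reject}")
```

Adding the first-stage residual instead of the predicted value gives the same Wald statistic, because the two added variables differ only by ED, which is already in the regression.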