Econometric Analysis of Panel Data

Presentation transcript:

Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Short Term Agenda for Simple Effects Models Models with individual effects Interpretation of models Computation (practice) and estimation (theory) Extensions Nonstandard panels: Rotating, Pseudo-, Nested Generalizing the regression model Alternative estimators Methods Least squares: OLS, GLS, FGLS MLE and Maximum Simulated Likelihood

Fixed and Random Effects Unobserved individual effects in regression: E[yit | xit, ci] Notation: Linear specification: Fixed Effects: E[ci | Xi ] = g(Xi). Cov[xit,ci] ≠0 effects are correlated with included variables. Random Effects: E[ci | Xi ] = μ; effects are uncorrelated with included variables. If Xi contains a constant term, μ=0 WLOG. Common: Cov[xit,ci] =0, but E[ci | Xi ] = μ is needed for the full model

Convenient Notation Fixed Effects – the ‘dummy variable model’ Random Effects – the ‘error components model’ Individual specific constant terms. Compound (“composed”) disturbance

Balanced and Unbalanced Panels Distinction: Balanced vs. Unbalanced Panels A notation to help with mechanics: zi,t, i = 1,…,N; t = 1,…,Ti The role of the assumption Mathematical and notational convenience: Balanced, n = NT; Unbalanced, n = Σi Ti Is the fixed Ti assumption ever necessary? Almost never. (Baltagi chapter 9 is about algebra, not different models!) Is unbalancedness due to nonrandom attrition from an otherwise balanced panel? This will require special considerations.

Unbalanced Panels and Attrition ‘Bias’ Test for ‘attrition bias.’ (Verbeek and Nijman, Testing for Selectivity Bias in Panel Data Models, International Economic Review, 1992, 33, 681-703.) Variable addition test using covariates of presence in the panel Nonconstructive – what to do next? Do something about attrition bias. (Wooldridge, Inverse Probability Weighted M-Estimators for Sample Stratification and Attrition, Portuguese Economic Journal, 2002, 1: 117-139) Stringent assumptions about the process Model based on probability of being present in each wave of the panel

Inverse Probability Weighting

An Unbalanced Panel: RWM’s GSOEP Data on Health Care N = 7,293 Households Some households exited then returned

Unbalanced Panel Data – Not Attrition (First 10 households in healthcare data)

Exogeneity Contemporaneous exogeneity E[εit|xit,ci]=0  Not sufficient for regression Doesn’t imply how to estimate β Strict exogeneity – the most common assumption E[εit|xi1, xi2,…,xiT,ci]=0 Can use first difference or fixed effects Cannot hold if xit contains lagged values of yit Sequential exogeneity? E[εit|xi1, xi2,…,xit,ci] = 0 These assumptions are not testable. They are part of the model.

Assumptions for Asymptotics Convergence of moments involving cross section Xi. N increasing, T or Ti assumed fixed. “Fixed T asymptotics” (see text, p. 175) Time series characteristics are not relevant (may be nonstationary) If T is also growing, need to treat as multivariate time series. Ranks of matrices. X must have full column rank. (Xi may not, if Ti < K.) Strict exogeneity and dynamics. If xit contains yi,t-1 then xit cannot be strictly exogenous. xit will be correlated with the unobservables in period t-1. (To be revisited later.) Empirical characteristics of microeconomic data

Estimating β β is the partial effect of interest Can it be estimated (consistently) in the presence of (unmeasured) ci? Does pooled least squares “work?” Strategies for “controlling for ci” using the sample data Using a proxy variable.

The Pooled Regression Presence of omitted effects Potential bias/inconsistency of OLS – depends on ‘fixed’ or ‘random’

A Popular Misconception If only one variable in X is correlated with ε, the other coefficients are consistently estimated. False. The problem is “smeared” over the other coefficients.
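A small simulation can make the "smearing" concrete. The sketch below is illustrative only: the data generating process, the coefficient values, and the function name `simulate_smearing` are assumptions, not taken from the slides. Only x1 is correlated with the disturbance, yet the OLS coefficient on the exogenous x2 is also pulled away from its true value of 1 because x2 is correlated with x1.

```python
import numpy as np

def simulate_smearing(n=200_000, seed=0):
    """Pooled OLS when only x1 is correlated with the disturbance (hypothetical DGP)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    x2 = rng.normal(size=n)                          # exogenous: independent of eps
    x1 = 0.8 * x2 + 0.5 * eps + rng.normal(size=n)   # endogenous, and correlated with x2
    y = 1.0 + 1.0 * x1 + 1.0 * x2 + eps              # true slopes are both 1
    X = np.column_stack([np.ones(n), x1, x2])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b

b = simulate_smearing()
print("constant, b1, b2:", b)
```

Under this design the probability limits work out to about 1.4 for x1 and 0.68 for x2, so the coefficient on the exogenous variable is inconsistent as well. If x2 were uncorrelated with x1, its coefficient would be unaffected.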

OLS with Individual Effects Bias turns on the group means

Mundlak’s Estimator Mundlak, Y., “On the Pooling of Time Series and Cross Section Data,” Econometrica, 46, 1978, pp. 69-85.
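The slide's formula did not survive extraction, so the following is a minimal sketch of the usual reading of Mundlak's approach, assuming the effect is modeled as a linear function of the group means of the regressors: add the group means to the pooled regression. The simulated data and the helper names (`group_means`, `mundlak_ols`, `within_ols`) are hypothetical. In this form the coefficient on x coincides numerically with the within (fixed effects) estimator.

```python
import numpy as np

def group_means(x, ids):
    """For each row, the mean of x (2-D array) within that row's group."""
    _, inv = np.unique(ids, return_inverse=True)
    inv = inv.ravel()
    counts = np.bincount(inv)
    sums = np.zeros((counts.size, x.shape[1]))
    np.add.at(sums, inv, x)
    return (sums / counts[:, None])[inv]

def mundlak_ols(y, x, ids):
    """Pooled OLS of y on [1, x, group means of x]."""
    xbar = group_means(x, ids)
    X = np.column_stack([np.ones(len(y)), x, xbar])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    k = x.shape[1]
    return coef[1:1 + k], coef[1 + k:]      # slopes on x, slopes on the group means

def within_ols(y, x, ids):
    """OLS on deviations from group means (fixed effects / within estimator)."""
    xd = x - group_means(x, ids)
    yd = y - group_means(y[:, None], ids)[:, 0]
    return np.linalg.lstsq(xd, yd, rcond=None)[0]

# Hypothetical balanced panel: c_i = 2 * xbar_i + w_i, true beta = 0.5
rng = np.random.default_rng(0)
N, T = 2000, 5
xm = rng.normal(size=(N, 1)) + rng.normal(size=(N, T))
c = 2.0 * xm.mean(axis=1) + rng.normal(size=N)
ym = 1.0 + 0.5 * xm + c[:, None] + rng.normal(size=(N, T))
ids = np.repeat(np.arange(N), T)
y, x = ym.ravel(), xm.reshape(-1, 1)

beta_m, gamma_m = mundlak_ols(y, x, ids)
beta_w = within_ols(y, x, ids)
print(beta_m, gamma_m, beta_w)
```

A coefficient on the group means that is far from zero is evidence against the random effects assumption E[ci | Xi] = μ.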

Chamberlain’s (1982) Approach Use a linear projection, not necessarily the conditional mean. This “regression” can be computed T times, using one year at a time. How would we reconcile the multiple estimators of each parameter?

Chamberlain’s (1982) Approach

Proxy Variables Proxies for unobserved effects: e.g., Test score for unobserved ability Interest is in δ(xit,ci)=∂E[yit|xit,ci]/∂xit Since ci is unobserved, we seek APE = Ec[δ(xit,ci)] Proxy has two characteristics Ignorable in the model: E[yit|xit,zi,ci] = E[yit|xit,ci] ‘Explains’ ci in that E[ci|zi,xit] = E[ci|zi]. In the presence of zi, xit does not further ‘explain ci.’ Then, Ec[δ(xit,ci)] = Ez{∂E[yit|xit,zi]/∂xit} Proof: See Wooldridge, pp. 23-24. Loose ends: Where do you get the proxy? What is E[yit|xit,zi]? Use the linear projection and hope for the best.

Estimating the Sampling Variance of bOLS s²(X′X)⁻¹? Correlation across observations => No Possible heteroscedasticity Find a “robust” covariance matrix Robust estimation (in general) The White estimator A Robust estimator for OLS.

A ‘Cluster’ Estimator

Cluster Estimator (cont.)
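The formulas on the two cluster estimator slides were images and are not in the transcript. As a hedged sketch, the code below implements the common textbook sandwich form, (X′X)⁻¹ [Σi Xi′ei ei′Xi] (X′X)⁻¹, with no finite-sample correction; whether the slides include a correction factor is not known. The function names and the simulated data are hypothetical.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and residuals."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b, y - X @ b

def cluster_cov(X, e, ids):
    """Sandwich estimator with one score vector X_i' e_i per cluster."""
    XtX_inv = np.linalg.inv(X.T @ X)
    _, inv = np.unique(ids, return_inverse=True)
    inv = inv.ravel()
    scores = np.zeros((inv.max() + 1, X.shape[1]))
    np.add.at(scores, inv, X * e[:, None])    # row g accumulates X_g' e_g
    return XtX_inv @ (scores.T @ scores) @ XtX_inv

def white_cov(X, e):
    """White heteroscedasticity-robust estimator (HC0)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    Xe = X * e[:, None]
    return XtX_inv @ (Xe.T @ Xe) @ XtX_inv

# Hypothetical panel with a common group effect in the disturbance
rng = np.random.default_rng(0)
N, T = 300, 4
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + np.repeat(rng.normal(size=N), T) + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])

b, e = ols(y, X)
V_cluster = cluster_cov(X, e, ids)
V_white = white_cov(X, e)
V_singletons = cluster_cov(X, e, np.arange(N * T))   # every observation its own cluster
print(np.sqrt(np.diag(V_cluster)), np.sqrt(np.diag(V_white)))
```

When every observation is its own cluster, the cluster estimator collapses to the White estimator, which is a quick check on the implementation.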

Cornwell and Rupert Data Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years Variables in the file are EXP = work experience WKS = weeks worked OCC = occupation, 1 if blue collar, IND = 1 if manufacturing industry SOUTH = 1 if resides in south SMSA = 1 if resides in a city (SMSA) MS = 1 if married FEM = 1 if female UNION = 1 if wage set by union contract ED = years of education LWAGE = log of wage = dependent variable in regressions These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155.  See Baltagi, page 122 for further analysis.  The data were downloaded from the website for Baltagi's text.

Application: Cornwell and Rupert

Bootstrapping
Some assumptions that underlie it - the sampling mechanism
Method:
1. Estimate using full sample: --> b
2. Repeat R times: Draw n observations from the n, with replacement. Estimate β with b(r).
3. Estimate variance with W = (1/R) Σr [b(r) - b][b(r) - b]′
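The three steps of the method can be sketched in a few lines. This is an illustration under assumed inputs (simulated cross-section data, a hypothetical function name `bootstrap_cov`, R = 200), not the procedure used to produce the results in the slides.

```python
import numpy as np

def bootstrap_cov(y, X, reps=200, seed=0):
    """Naive (observation-level) bootstrap covariance for OLS."""
    rng = np.random.default_rng(seed)
    n = len(y)
    b = np.linalg.lstsq(X, y, rcond=None)[0]            # step 1: full-sample b
    draws = np.empty((reps, X.shape[1]))
    for r in range(reps):                               # step 2: R replications
        idx = rng.integers(0, n, size=n)                # n draws with replacement
        draws[r] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    dev = draws - b                                     # deviations from full-sample b
    W = dev.T @ dev / reps                              # step 3: (1/R) sum of outer products
    return b, W

# Hypothetical data
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

b, W = bootstrap_cov(y, X)
se_boot = np.sqrt(np.diag(W))
print(b, se_boot)
```

Note that W is centered on the full-sample b, as in step 3, rather than on the mean of the replications.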

Bootstrap Application
matr;bboot=init(7,21,0.)$              Store results here
name;x=one,occ,…,exp$                  Define X
regr;lhs=lwage;rhs=x$                  Compute b
calc;i=0$                              Counter
Proc                                   Define procedure
regr;lhs=lwage;rhs=x;quietly$          … Regression
matr;{i=i+1};bboot(*,i)=b$             ... Store b(r)
Endproc                                Ends procedure
exec;n=20;bootstrap=b$                 20 bootstrap reps
matr;list;bboot' $                     Display results

Results of Bootstrap Procedure

Bootstrap Replications Full sample result Bootstrapped sample results

Bootstrap variance for a panel data estimator Panel Bootstrap = Block Bootstrap Data set is N groups of size Ti Bootstrap sample is N groups of size Ti drawn with replacement.
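A minimal sketch of the block (panel) bootstrap follows: whole groups are drawn with replacement, so each group's Ti observations stay together. The function name `panel_bootstrap_cov` and the simulated data are assumptions for illustration. The demo uses a regressor that is constant within groups and a common group effect in the disturbance, a case in which the conventional s²(X′X)⁻¹ standard errors are too small.

```python
import numpy as np

def panel_bootstrap_cov(y, X, ids, reps=200, seed=0):
    """Block bootstrap for OLS: resample whole groups, keeping each group's rows together."""
    rng = np.random.default_rng(seed)
    groups = [np.flatnonzero(ids == g) for g in np.unique(ids)]   # rows of each group
    N = len(groups)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    draws = np.empty((reps, X.shape[1]))
    for r in range(reps):
        picked = rng.integers(0, N, size=N)             # N groups, with replacement
        idx = np.concatenate([groups[g] for g in picked])
        draws[r] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    dev = draws - b
    return b, dev.T @ dev / reps

# Hypothetical data: group-level regressor, common effect in the disturbance
rng = np.random.default_rng(1)
N, T = 200, 5
ids = np.repeat(np.arange(N), T)
x = np.repeat(rng.normal(size=N), T)
c = np.repeat(rng.normal(size=N), T)
y = 1.0 + 0.5 * x + c + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])

b, W = panel_bootstrap_cov(y, X, ids)
se_block = np.sqrt(np.diag(W))
e = y - X @ b
se_conv = np.sqrt(np.diag(e @ e / (len(y) - 2) * np.linalg.inv(X.T @ X)))
print(se_block, se_conv)
```

Because groups are stored as lists of row indices, the same function works for unbalanced panels with differing Ti.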

Bootstrapping Naïve bootstrap: Why is it naïve? Cases when it fails Time series “Clustered data” Order statistics Parameters on the edge of the parameter space Alternatives Block bootstrap “Wild” bootstrap (injects extra randomness)

Using First Differences Eliminating the heterogeneity

OLS with First Differences With strict exogeneity of (Xi,ci), OLS regression of Δyit on Δxit is unbiased and consistent but inefficient. GLS is unpleasantly complicated. In order to compute a first step estimator of σε2 we would use fixed effects. We should just stop there. Or, use OLS in first differences and use Newey-West with one lag.
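As an illustration of the last option on the slide, the sketch below runs OLS of Δyit on Δxit for a simulated panel in which ci is correlated with xit. The data, the coefficient values, and the function name `first_difference_ols` are hypothetical. The robust (Newey-West) covariance step is not shown; only the point estimate is computed.

```python
import numpy as np

def first_difference_ols(y, x, ids, t):
    """OLS of (y_it - y_i,t-1) on (x_it - x_i,t-1), using consecutive periods only."""
    order = np.lexsort((t, ids))                  # sort by unit, then by period
    y, x, ids, t = y[order], x[order], ids[order], t[order]
    same = (ids[1:] == ids[:-1]) & (t[1:] - t[:-1] == 1)
    dy = (y[1:] - y[:-1])[same]
    dx = (x[1:] - x[:-1])[same]
    return np.linalg.lstsq(dx, dy, rcond=None)[0]   # c_i has been differenced out

# Hypothetical panel: x_it is correlated with c_i, true beta = 0.5
rng = np.random.default_rng(0)
N, T = 2000, 4
ids = np.repeat(np.arange(N), T)
t = np.tile(np.arange(T), N)
c = np.repeat(rng.normal(size=N), T)
x = (c + rng.normal(size=N * T))[:, None]
y = 0.5 * x[:, 0] + c + rng.normal(size=N * T)

b_fd = first_difference_ols(y, x, ids, t)
b_pooled = np.linalg.lstsq(np.column_stack([np.ones(N * T), x]), y, rcond=None)[0][1]
print("first differences:", b_fd[0], "pooled OLS:", b_pooled)
```

In this design pooled OLS converges to about 1.0 rather than 0.5, while the first-difference estimator is close to the true value. The differenced disturbances Δεit are serially correlated by construction, which is why the slide suggests a robust covariance estimator with one lag.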

Two Periods With two periods and strict exogeneity, This is a classical regression model. If there are no regressors,

Difference-in-Differences Model With two periods and strict exogeneity of D and T, This is a linear regression model. If there are no regressors,
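The equations on this slide were not captured in the transcript. One standard way to write the two-period difference-in-differences model, offered here as an assumed reconstruction rather than the slide's own notation, uses a period dummy T_t (0 before, 1 after) and a treatment-group dummy D_i:

```latex
\begin{aligned}
y_{it} &= \beta_1 + \beta_2 T_t + \beta_3 D_i + \delta\, T_t D_i + \varepsilon_{it}, \qquad t = 1, 2,\\
\Delta y_i &= y_{i2} - y_{i1} = \beta_2 + \delta D_i + \Delta\varepsilon_i \qquad (T_1 = 0,\ T_2 = 1),\\
\hat{\delta} &= \overline{\Delta y}_{D=1} - \overline{\Delta y}_{D=0}.
\end{aligned}
```

With no other regressors, the least squares estimate of δ is the change in the treated group's mean outcome minus the change in the control group's mean outcome.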

Difference in Differences

http://dera.ioe.ac.uk/14610/1/oft1416.pdf

Outcome is the fees charged. Activity is collusion on fees.

Treatment Schools: Treatment is an intervention by the Office of Fair Trading Control Schools were not involved in the conspiracy Treatment is not voluntary

Treatment (Intervention) Effect = δ

In order to test robustness two versions of the fixed effects model were run. The first is Ordinary Least Squares, and the second is heteroscedasticity and auto-correlation robust (HAC) standard errors in order to check for heteroscedasticity and autocorrelation.

Natural Experiment: Do Motorcycle Helmets Save Lives? In the three years after Michigan repealed a mandatory motorcycle helmet law, deaths and head injuries among bikers rose sharply, according to a recent study. Deaths at the scene of the crash more than quadrupled, while deaths in the hospital tripled for motorcyclists. Head injuries have increased overall, and more of them are severe, the researchers report in the American Journal of Surgery.

D-in-D Model: Natural Experiment With two periods and strict exogeneity, This is a classical regression model. If there are no regressors,

D-i-D – Natural Experiment Card and Krueger: “Minimum Wages and Employment: A Case Study of the Fast Food Industry in New Jersey and Pennsylvania,” AER, 84(4), 1994, 772-793. Pennsylvania vs. New Jersey. In April 1992, NJ raises its minimum wage. Compare the change in employment in PA after the change to the change in employment in NJ after the change. Differencing cancels out other things specific to each state that would explain the change in employment.

A Tale of Two Cities A sharp change in policy can constitute a natural experiment The Mariel boatlift from Cuba to Miami (May-September, 1980) increased the Miami labor force by 7%. Did it reduce wages or employment of non-immigrants? Compare Miami to Los Angeles, a comparable (assumed) city. Card, David, “The Impact of the Mariel Boatlift on the Miami Labor Market,” Industrial and Labor Relations Review, 43, 1990, pp. 245-257.

Difference in Differences

Applying the Model
c = M for Miami, L for Los Angeles
Immigration occurs in Miami, not Los Angeles
T = 1979, 1981 (pre- and post-)
Sample moment equations: E[Yi|c,t,T]
E[Yi|M,79] = β79 + γM
E[Yi|M,81] = β81 + γM + δ
E[Yi|L,79] = β79 + γL
E[Yi|L,81] = β81 + γL
It is assumed that unemployment growth in the two cities would be the same if there were no immigration.

Implications for Differences
Neither city exposed to migration
E[Yi,0|M,81] - E[Yi,0|M,79] = [β81 + γM] – [β79 + γM] (Miami)
E[Yi,0|L,81] - E[Yi,0|L,79] = [β81 + γL] – [β79 + γL] (LA)
Both cities exposed to migration
E[Yi,1|M,81] - E[Yi,1|M,79] = [β81 + γM] – [β79 + γM] + δ (Miami)
E[Yi,1|L,81] - E[Yi,1|L,79] = [β81 + γL] – [β79 + γL] + δ (LA)
One city (Miami) exposed to migration: the difference in differences is Miami change - Los Angeles change
{E[Yi,1|M,81] - E[Yi,1|M,79]} – {E[Yi,0|L,81] - E[Yi,0|L,79]} = δ
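The algebra above can be checked numerically. In the sketch below the values of β79, β81, γM, γL and δ are made up for illustration and the function names are hypothetical. The data are simulated from the four cell means, and δ is recovered two ways: from the four sample means, and as the coefficient on the interaction term in a saturated regression.

```python
import numpy as np

def did_from_means(y, treated, post):
    """(treated change) minus (control change), from the four cell means."""
    m = lambda d, t: y[(treated == d) & (post == t)].mean()
    return (m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0))

def did_from_regression(y, treated, post):
    """Coefficient on post*treated in y = a + b*post + g*treated + delta*post*treated."""
    X = np.column_stack([np.ones(len(y)), post, treated, post * treated])
    return np.linalg.lstsq(X, y, rcond=None)[0][3]

# Hypothetical parameter values (not from the Mariel study)
beta79, beta81, gamma_M, gamma_L, delta = 5.0, 6.0, 1.0, 0.0, 2.0
rng = np.random.default_rng(0)
n = 4000
treated = rng.integers(0, 2, size=n)     # 1 = Miami, 0 = Los Angeles
post = rng.integers(0, 2, size=n)        # 1 = 1981, 0 = 1979
mean = (np.where(post == 1, beta81, beta79)
        + np.where(treated == 1, gamma_M, gamma_L)
        + delta * post * treated)
y = mean + 0.5 * rng.normal(size=n)

d_means = did_from_means(y, treated, post)
d_reg = did_from_regression(y, treated, post)
print(d_means, d_reg)
```

The two numbers agree exactly because the regression is saturated in the two dummies. The period effects and the city effects both cancel, leaving only δ plus sampling noise.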

The Tale 1979 1980 1981 1982 1983 1984 1985 In 79, Miami unemployment is 2.0% lower In 80, Miami unemployment is 7.1% lower From 79 to 80, Miami gets 5.1% better In 81, Miami unemployment is 3.0% lower In 82, Miami unemployment is 3.3% higher From 81 to 82, Miami gets 6.3% worse

Application of a Two Period Model “Hemoglobin and Quality of Life in Cancer Patients with Anemia,” Finkelstein (MIT), Berndt (MIT), Greene (NYU), Cremieux (Univ. of Quebec) 1998 With Ortho Biotech – seeking to change labeling of already approved drug ‘erythropoietin.’ r-HuEPO

QOL Study Quality of life study i = 1,… 1200+ clinically anemic cancer patients undergoing chemotherapy, treated with transfusions and/or r-HuEPO t = 0 at baseline, 1 at exit. (interperiod survey by some patients was not used) yit = self administered quality of life survey, scale = 0,…,100 xit = hemoglobin level, other covariates Treatment effects model (hemoglobin level) Background – r-HuEPO treatment to affect Hg level Important statistical issues Unobservable individual effects The placebo effect Attrition – sample selection FDA mistrust of “community based” – not clinical trial based statistical evidence Objective – when to administer treatment for maximum marginal benefit

Regression-Treatment Effects Model

Effects and Covariates Individual effects that would impact a self reported QOL: Depression, comorbidity factors (smoking), recent financial setback, recent loss of spouse, etc. Covariates Change in tumor status Measured progressivity of disease Change in number of transfusions Presence of pain and nausea Change in number of chemotherapy cycles Change in radiotherapy types Elapsed days since chemotherapy treatment Amount of time between baseline and exit

First Differences Model

Finding Optimal treatment. Conventional wisdom and assumption of policy. Study finding Note the implication of the study for the location of the optimal point for the treatment. Largest marginal benefit moves from the left tail to the center.

Most Helpful Customer Reviews 31 of 39 people found the following review helpful Too theoretical and poorly written By Doktor Faustus on May 7, 2013 Format: Hardcover "Econometric Analysis" by William Greene is one of the more widely used graduate-level textbooks in econometrics. I used it in my first year PhD econometrics course. This is unfortunate for several reasons. The book states that its first objective is to introduce students to applied econometrics, especially the basic techniques of linear regression. When reading the book, however, what the reader notices first is that the applications are essentially just footnotes; the meat of each chapter is dense econometric theory. An applied textbook would focus on working with data, but Greene's book has exercises that focus on proving obscure statistical properties (i.e. prove that the asymptotic variance of various estimators goes to zero). Useful for theorists, but not for applied work, which is what the book advertises itself as. Another problem with the book is its impenetrable text. Reading this book is drudgery even when not trying to make sense of the absurdly huge matrix equations. Greene uses academic, elevated language that does not belong in a technical textbook. Where the student needs clear explanation, he instead reads sentences like the following found in a chapter introduction: "We first consider the consequences for the least squares estimator of the more general form of the regression model. This will include assessing the effect of ignoring the complication of the generalized model and of devising an appropriate estimation strategy, still based on least squares". After reading that second sentence several times I still don't understand what Greene is trying to convey. Finally the book is much too large and expensive for a class textbook. The book is 1200 pages long and includes numerous asides in every chapter.
If the objective of the book is to teach econometrics to graduate students (as it says in the book), then it would be better off focusing on important topics and applications, not on topics that are never used by the vast majority of economists. I do not recommend this book for anyone; there are better econometrics textbooks available for undergraduates, graduate students, and professionals.

October 13, 2014 By Daniel Pulido This review is from: Econometric Analysis (7th Edition) (Hardcover) The delivery was fine. But the book itself is the worst Econometric Analysis book I have ever come across. No examples. Only a continuous list of theorems. I would not recommend anyone this book.