Chengyuan Yin School of Mathematics

1 Econometrics
Chengyuan Yin, School of Mathematics

2 17. Linear Models for Panel Data
Econometrics

3 Panel Data Sets
Longitudinal data
  National Longitudinal Survey of Youth (NLS)
  British Household Panel Survey (BHPS)
  Panel Study of Income Dynamics (PSID)
Cross section time series
  Grunfeld’s investment data
  Penn World Tables
Financial data by firm, year
  rit - rft = βi(rmt - rft) + εit, i = 1,…,many; t = 1,…,many
  Exchange rate data, essentially infinite T, large N
  Effects: βi = β + vi

4 Terms of Art
Cross sectional vs. time series variation (history: consumption function studies)
Heterogeneity
Group effects (individual effects)
Fixed effects and/or random effects
Substantive differences? Is it possible to tell them apart in observed data?

5 Panel Data
Rotating panels: Spanish household survey, Spanish income study
Efficiency analysis: “Efficiency measurement in rotating panel data,” Heshmati, A., Applied Economics, 30, 1998
Hierarchical (nested) data sets: student outcome, by year, district, school, teacher

6 Nested Panel Data
Antweiler, W., “Nested Random Effects…,” Journal of Econometrics, 101, 2001

7 Balanced and Unbalanced Panels
Distinction
A notation to help with mechanics: zi,t, i = 1,…,N; t = 1,…,Ti
The role of the assumption: mathematical and notational convenience
  Balanced: NT observations
  Unbalanced: Σi Ti observations
Is the fixed Ti assumption ever necessary? SUR models.

8 Benefits of Panel Data
Time and individual variation in behavior unobservable in cross sections or aggregate time series
Observable and unobservable individual heterogeneity
Rich hierarchical structures
Dynamics in economic behavior

9 Fixed and Random Effects
Unobserved individual effects in regression: E[yit | xit, ci]
Notation: Xi is the Ti × K matrix whose rows are xit′, t = 1,…,Ti
Linear specification: yit = xit′β + ci + εit
Fixed Effects: E[ci | Xi] = g(Xi); effects are correlated with included variables. Common: Cov[xit,ci] ≠ 0
Random Effects: E[ci | Xi] = μ; effects are uncorrelated with included variables. If Xi contains a constant term, μ = 0 WLOG. Common: Cov[xit,ci] = 0, but E[ci | Xi] = μ is needed for the full model

10 Convenient Notation
Fixed Effects: individual specific constant terms.
Random Effects: compound (“composed”) disturbance; “error components”.
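The equations for this slide are missing from the transcript. A plausible reconstruction, assuming the standard linear panel notation (α_i for the individual specific constants and u_i for the random component are labels chosen here, matching the v(i,t) = e(i,t) + u(i) form printed in the later output slides):

```latex
\begin{align*}
\text{Fixed effects:}\quad  & y_{it} = \alpha_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it},\\
\text{Random effects:}\quad & y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it} + u_i .
\end{align*}
```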

11 Assumptions for Asymptotics
Convergence of moments involving cross section Xi.
N increasing, T or Ti assumed fixed: “fixed T asymptotics” (see text, p. 196).
  Time series characteristics are not relevant (may be nonstationary).
  If T is also growing, need to treat as multivariate time series.
Ranks of matrices: X must have full column rank. (Xi may not, if Ti < K.)
Strict exogeneity and dynamics: if xit contains yi,t-1 then xit cannot be strictly exogenous; xit will be correlated with the unobservables in period t-1. (To be revisited later.)
Empirical characteristics of microeconomic data

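12 The Pooled Regression
Presence of omitted effects
Potential bias/inconsistency of OLS depends on whether the effects are ‘fixed’ (correlated with xit: OLS is inconsistent) or ‘random’ (uncorrelated with xit: OLS is consistent but inefficient)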
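A small simulation can make the fixed/random distinction concrete. This is a sketch under assumed values, not something from the slides: one regressor, true slope 1, and a hypothetical parameter `rho` that controls how strongly x_it loads on the omitted effect c_i.

```python
import random

def slope(x, y):
    """OLS slope of y on x (constant included), single regressor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(rho, n=2000, t=5, beta=1.0, seed=0):
    """Pooled OLS slope when x_it = rho * c_i + noise and c_i is omitted."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)            # unobserved individual effect
        for _ in range(t):
            x = rho * c + rng.gauss(0, 1)
            xs.append(x)
            ys.append(beta * x + c + rng.gauss(0, 1))
    return slope(xs, ys)

b_uncorrelated = simulate(rho=0.0)     # 'random' effects case
b_correlated = simulate(rho=1.0)       # 'fixed' effects case
print(b_uncorrelated, b_correlated)
```

In this design the pooled slope converges to 1 + Cov[x,c]/Var[x], which is 1 when rho = 0 and 1.5 when rho = 1.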

13 Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
  EXP = work experience
  WKS = weeks worked
  OCC = occupation, 1 if blue collar
  IND = 1 if manufacturing industry
  SOUTH = 1 if resides in south
  SMSA = 1 if resides in a city (SMSA)
  MS = 1 if married
  FEM = 1 if female
  UNION = 1 if wage set by union contract
  ED = years of education
  BLK = 1 if individual is black
  LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

14 Application: Cornwell and Rupert

15 Using First Differences
Eliminating the heterogeneity

16 OLS with First Differences
With strict exogeneity of (Xi,ci), OLS regression of Δyit on Δxit is unbiased and consistent but inefficient, because the differenced disturbances Δεit are correlated across adjacent periods. GLS is unpleasantly complicated: to compute a first step estimator of σε² we would use fixed effects, and we should just stop there. Or, use OLS in first differences and use Newey-West with one lag.
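As an illustration of how differencing removes ci, here is a minimal sketch with one regressor and simulated data. The data generating values (true slope 2, x correlated with c) are assumptions for the example, and the Newey-West standard errors mentioned on the slide are not computed.

```python
import random

def fd_slope(panel):
    """OLS (no constant) of dy_it on dx_it.

    panel is a list with one entry per individual, each a time-ordered
    list of (x, y) pairs.
    """
    num = den = 0.0
    for obs in panel:
        for (x0, y0), (x1, y1) in zip(obs, obs[1:]):
            dx, dy = x1 - x0, y1 - y0   # c_i cancels in both differences
            num += dx * dy
            den += dx * dx
    return num / den

rng = random.Random(1)
beta = 2.0
panel = []
for _ in range(1000):
    c = rng.gauss(0, 1)                 # effect correlated with x below
    obs = []
    for _ in range(4):
        x = c + rng.gauss(0, 1)
        obs.append((x, beta * x + c + rng.gauss(0, 1)))
    panel.append(obs)

b_fd = fd_slope(panel)
print(b_fd)
```

Pooled OLS in levels would be inconsistent for this design, while the first difference slope stays close to the assumed value of 2.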

17 Two Periods
With two periods and strict exogeneity,
This is a classical regression model. If there are no regressors,

18 Application of a Two Period Model
“Hemoglobin and Quality of Life in Cancer Patients with Anemia,” Finkelstein (MIT), Berndt (MIT), Greene (NYU), Cremieux (Univ. of Quebec), 1998
With Ortho Biotech: seeking to change labeling of already approved drug ‘erythropoietin’ (r-HuEPO)

19 QOL Study
Quality of life study
  i = 1,… clinically anemic cancer patients undergoing chemotherapy, treated with transfusions and/or r-HuEPO
  t = 0 at baseline, 1 at exit (interperiod survey by some patients was not used)
  yit = self administered quality of life survey, scale = 0,…,100
  xit = hemoglobin level, other covariates
Treatment effects model (hemoglobin level)
Background: r-HuEPO treatment to affect Hg level
Important statistical issues
  Unobservable individual effects
  The placebo effect
  Attrition: sample selection
  FDA mistrust of “community based” (not clinical trial based) statistical evidence
Objective: when to administer treatment for maximum marginal benefit

20 Regression-Treatment Effects Model

21 Effects and Covariates
Individual effects that would impact a self reported QOL: depression, comorbidity factors (smoking), recent financial setback, recent loss of spouse, etc.
Covariates
  Change in tumor status
  Measured progressivity of disease
  Change in number of transfusions
  Presence of pain and nausea
  Change in number of chemotherapy cycles
  Change in radiotherapy types
  Elapsed days since chemotherapy treatment
  Amount of time between baseline and exit

22 First Differences Model

23 Dealing with Attrition
The attrition issue: Appearance for the second interview was low for people with initial low QOL (death or depression) or with initial high QOL (don’t need the treatment). Thus, missing data at exit were clearly related to values of the dependent variable.
Solutions to the attrition problem
  Heckman selection model (used in the study)
    Prob[Present at exit|covariates] = Φ(z’θ) (Probit model)
    Additional variable added to difference model: λi = φ(zi’θ)/Φ(zi’θ)
  The FDA solution: fill with zeros. (!)
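The added variable in the Heckman correction is the inverse Mills ratio, the standard normal density divided by the standard normal CDF, evaluated at the probit index. A minimal sketch of that calculation follows; the index values are hypothetical, since the study's probit estimates are not reported in the slides.

```python
import math

def norm_pdf(a):
    """Standard normal density."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def norm_cdf(a):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def inverse_mills(index):
    """lambda = phi(z'theta) / Phi(z'theta) for an observation present at exit."""
    return norm_pdf(index) / norm_cdf(index)

# Hypothetical probit index values z_i'theta for three patients.
for index in (-1.0, 0.0, 1.0):
    print(index, inverse_mills(index))
```

The ratio falls as the index rises: patients who were very likely to appear at exit receive a small correction, and unlikely completers a large one.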

24 Estimation with Fixed Effects
The fixed effects model: ci is arbitrarily correlated with xit but E[εit|Xi,ci] = 0
Dummy variable representation

25 Assumptions for the FE Model
yi = Xiβ + diαi + εi, for each individual
E[ci | Xi] = g(Xi); effects are correlated with included variables. Common: Cov[xit,ci] ≠ 0

26 Useful Analysis of Variance Notation
Total variation = Within groups variation + Between groups variation
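The decomposition can be checked numerically on any grouped variable. The sketch below uses a small made-up unbalanced data set; the function name and the numbers are illustrative only.

```python
def variation(groups):
    """Return (total, within, between) sums of squares for a list of groups."""
    all_vals = [v for g in groups for v in g]
    grand = sum(all_vals) / len(all_vals)
    total = sum((v - grand) ** 2 for v in all_vals)
    # Within: deviations of each observation from its own group mean.
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    # Between: group means around the grand mean, weighted by group size.
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    return total, within, between

total, within, between = variation(
    [[1.0, 2.0, 3.0], [4.0, 6.0], [10.0, 11.0, 12.0, 13.0]]
)
print(total, within + between)
```

The two printed numbers agree up to rounding, and the identity holds for unbalanced groups as well as balanced ones.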

27 WHO Data

28 Baltagi and Griffin’s Gasoline Data
World Gasoline Demand Data, 18 OECD Countries, 19 years
Variables in the file are
  COUNTRY = name of country
  YEAR = year
  LGASPCAR = log of consumption per car
  LINCOMEP = log of per capita income
  LRPMG = log of real price of gasoline
  LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on which the analysis is based is Baltagi, B. and Griffin, J., "Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983. The data were downloaded from the website for Baltagi's text.

29 Analysis of Variance

30 Analysis of Variance

31 Estimating the Fixed Effects Model
The FEM is a linear regression model but with many independent variables.
Least squares is unbiased, consistent, efficient, but inconvenient if N is large.

32 Fixed Effects Estimator (cont.)

33 The Within Transformation Removes the Effects

34 Least Squares Dummy Variable Estimator
b is obtained by ‘within’ groups least squares (group mean deviations).
Normal equations for a are D’Xb + D’Da = D’y, so a = (D’D)⁻¹D’(y - Xb).
Notes:
  This is simple algebra: the estimator is just OLS.
  Least squares is an estimator, not a model. (Repeat twice.)
  Note what ai is when Ti = 1. Follow this with yit - ai - xit’b = 0 if Ti = 1.
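The two steps above (within regression for b, then group means for a) can be sketched for a single regressor as follows. The data are made up so that y = ai + 2x holds exactly, and the third individual has Ti = 1 to illustrate the note about single-observation groups.

```python
def within_estimator(panel):
    """Within (LSDV) estimates for one regressor.

    panel is a list with one entry per individual, each a list of (x, y).
    Returns (b, a) where a[i] is the estimated constant for individual i.
    """
    num = den = 0.0
    means = []
    for obs in panel:
        xbar = sum(x for x, _ in obs) / len(obs)
        ybar = sum(y for _, y in obs) / len(obs)
        means.append((xbar, ybar))
        num += sum((x - xbar) * (y - ybar) for x, y in obs)
        den += sum((x - xbar) ** 2 for x, _ in obs)
    b = num / den
    # a = (D'D)^-1 D'(y - Xb) reduces to group means of y - x*b.
    a = [ybar - b * xbar for xbar, ybar in means]
    return b, a

panel = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],   # a_1 = 1
    [(1.0, 7.0), (3.0, 11.0)],              # a_2 = 5
    [(4.0, 0.0)],                           # T_i = 1
]
b, a = within_estimator(panel)
print(b, a)
```

The Ti = 1 individual contributes nothing to b (its deviations from the group mean are all zero), and its constant is set so that its residual is exactly zero.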

35 Inference About OLS
Assume strict exogeneity: Cov[εit,(xjs,cj)] = 0. Every disturbance in every period for each person is uncorrelated with variables and effects for every person and across periods.
Now, it’s just least squares in a classical linear regression model.
Asy.Var[b] =

36 Application Cornwell and Rupert

37 LSDV Results

38 The Effect of the Effects

39 The Random Effects Model
ci is uncorrelated with xit for all t; E[ci |Xi] = 0; E[εit|Xi,ci] = 0

40 Error Components Model
Generalized Regression Model
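The covariance structure behind the "generalized regression" label is missing from the transcript. A sketch of the usual error components result, assuming homoscedastic and mutually uncorrelated components ε_it and u_i (the symbols σ_ε², σ_u² and the column of ones **i** are notation chosen here):

```latex
\begin{align*}
v_{it} &= \varepsilon_{it} + u_i,\\
\operatorname{Var}[v_{it}\mid \mathbf{X}_i] &= \sigma_\varepsilon^2 + \sigma_u^2,\\
\operatorname{Cov}[v_{it}, v_{is}\mid \mathbf{X}_i] &= \sigma_u^2 \quad (t \neq s),\\
\boldsymbol{\Omega}_i &= \sigma_\varepsilon^2\,\mathbf{I}_{T_i} + \sigma_u^2\,\mathbf{i}\mathbf{i}' .
\end{align*}
```

The implied within-person correlation is σ_u² / (σ_ε² + σ_u²), which is the Corr[v(i,t),v(i,s)] quantity named in the random effects output later in the deck.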

41 Notation

42 Notation

43 Convergence of Moments

44 Random vs. Fixed Effects
Random Effects
  Small number of parameters
  Efficient estimation
  Objectionable orthogonality assumption (ci ⊥ Xi)
Fixed Effects
  Robust: generally consistent
  Large number of parameters

45 Ordinary Least Squares
Standard results for OLS in a GR model
  Consistent
  Unbiased
  Inefficient
True Variance

46 Estimating the Variance for OLS

47 Mechanics

48 Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
  EXP = work experience, EXPSQ = EXP²
  WKS = weeks worked
  OCC = occupation, 1 if blue collar
  IND = 1 if manufacturing industry
  SOUTH = 1 if resides in south
  SMSA = 1 if resides in a city (SMSA)
  MS = 1 if married
  FEM = 1 if female
  UNION = 1 if wage set by union contract
  ED = years of education
  BLK = 1 if individual is black
  LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

49 OLS Results
[Regression output; numeric values are missing from the transcript. Summary rows: Residuals Sum of squares, Standard error of e, Fit R-squared, Adjusted R-squared. Columns: Variable, Coefficient, Standard Error, b/St.Er., P[|Z|>z], Mean of X. Rows: Constant, EXP, EXPSQ, OCC, SMSA, MS, FEM, UNION, ED.]

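50 Alternative Variance Estimators
[Regression output; numeric values are missing from the transcript. Columns: Variable, Coefficient, Standard Error, b/St.Er., P[|Z|>z]. Two panels, the first unlabeled and the second labeled Robust, each with rows Constant, EXP, EXPSQ, OCC, SMSA, MS, FEM, UNION, ED.]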

51 Generalized Least Squares

52 GLS (cont.)

53 Estimators for the Variances

54 Feasible GLS
x´ does not contain a constant term in the preceding.
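The FGLS formulas for this slide are missing from the transcript. A minimal sketch of the usual random effects transformation, assuming variance estimates are already in hand: each variable is quasi-demeaned by subtracting θ times its group mean, with θ = 1 - sqrt(σε² / (σε² + T σu²)). The function names and the variance values below are hypothetical.

```python
import math

def theta(var_e, var_u, t):
    """Quasi-demeaning weight for a group with t observations."""
    return 1.0 - math.sqrt(var_e / (var_e + t * var_u))

def quasi_demean(values, th):
    """Subtract th times the group mean from each observation in one group."""
    mean = sum(values) / len(values)
    return [v - th * mean for v in values]

# Hypothetical variance estimates: var_e = var_u = 1 with T = 3.
th = theta(var_e=1.0, var_u=1.0, t=3)
print(th, quasi_demean([1.0, 2.0, 3.0], th))
```

With var_u = 0 the weight is 0 and FGLS collapses to pooled OLS; as T·var_u grows relative to var_e the weight approaches 1 and the transformation approaches the within (fixed effects) transformation. OLS on the transformed y, x and constant then gives the FGLS estimates.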

55 Practical Problems with FGLS
x´ does not contain a constant term in the preceding.

56 Computing Variance Estimators

57 Application
[Random effects output; numeric values are missing from the transcript. Random Effects Model: v(i,t) = e(i,t) + u(i). Reported estimates: Var[e], Var[u], Corr[v(i,t),v(i,s)], Sum of Squares, R-squared. (High (low) values of H favor FEM (REM).) Columns: Variable, Coefficient, Standard Error, b/St.Er., P[|Z|>z], Mean of X. Rows: EXP, EXPSQ, OCC, SMSA, MS, FEM, UNION, ED, Constant.]

58 Testing for Effects: LM Test

59 Application: Cornwell-Rupert

60 Testing for Effects
Regress ; lhs=lwage ; rhs=fixedx,varyingx ; res=e $
Matrix ; tebar=7*gxbr(e,person) $
Calc ; list ; lm=595*7/(2*(7-1))*(tebar'tebar/sumsqdev - 1)^2 $
LM =
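The commands above compute LM = NT/(2(T-1)) × [Σi(Σt eit)² / ΣiΣt eit² - 1]² from the pooled OLS residuals, with N = 595 and T = 7. A sketch of the same calculation in Python for a balanced panel; the function name and the toy residuals are illustrative, not taken from the slides.

```python
def breusch_pagan_lm(resid_by_group):
    """LM statistic for random effects in a balanced panel.

    resid_by_group[i] holds the T pooled-OLS residuals of individual i.
    """
    n = len(resid_by_group)
    t = len(resid_by_group[0])
    if any(len(g) != t for g in resid_by_group):
        raise ValueError("balanced panel required")
    # Sum over individuals of (T * mean residual)^2, i.e. tebar'tebar.
    sum_sq_group_sums = sum(sum(g) ** 2 for g in resid_by_group)
    # Residual sum of squares from the pooled regression.
    sum_sq = sum(e * e for g in resid_by_group for e in g)
    return n * t / (2.0 * (t - 1)) * (sum_sq_group_sums / sum_sq - 1.0) ** 2

# Toy residuals: perfectly persistent within each individual.
lm = breusch_pagan_lm([[1.0, 1.0], [-1.0, -1.0]])
print(lm)
```

Under the null of no effects the statistic is compared with a chi-squared distribution with 1 degree of freedom.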

61 Hausman Test for FE vs. RE
Estimator             | Random Effects: E[ci|Xi] = 0 | Fixed Effects: E[ci|Xi] ≠ 0
FGLS (Random Effects) | Consistent and Efficient     | Inconsistent
LSDV (Fixed Effects)  | Consistent, Inefficient      | Consistent, Possibly Efficient

62 Hausman Test for Effects
β does not contain the constant term in the preceding.
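The statistic itself is missing from the transcript. The usual form of the Hausman statistic, written here as a reconstruction under the assumption that the slopes (constant excluded) are compared across the two estimators:

```latex
\[
H = (\hat{\boldsymbol{\beta}}_{FE} - \hat{\boldsymbol{\beta}}_{RE})'
\left[\widehat{\operatorname{Var}}(\hat{\boldsymbol{\beta}}_{FE})
    - \widehat{\operatorname{Var}}(\hat{\boldsymbol{\beta}}_{RE})\right]^{-1}
(\hat{\boldsymbol{\beta}}_{FE} - \hat{\boldsymbol{\beta}}_{RE})
\;\xrightarrow{d}\; \chi^2[K]
\quad \text{under } H_0: E[c_i \mid \mathbf{X}_i] = 0 ,
\]
```

where K is the number of slope coefficients being compared. Large values of H favor the fixed effects model.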

63 Computing the Hausman Statistic
β does not contain the constant term in the preceding.

64 Hausman Test
[Random effects output; numeric values are missing from the transcript. Random Effects Model: v(i,t) = e(i,t) + u(i). Reported estimates: Var[e], Var[u], Corr[v(i,t),v(i,s)]. Lagrange Multiplier Test vs. Model (3), 1 df. (High values of LM favor FEM/REM over CR model.) Fixed vs. Random Effects (Hausman), 4 df. (High (low) values of H favor FEM (REM).)]

65 Wu (Variable Addition) Test
Under the FE assumptions, the common effect is correlated with the group means. Add the group means to the RE model. If statistically significant, this suggests that the RE model is inappropriate.
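A minimal sketch of the augmented regression, assuming one time-varying regressor and simulated data in which ci is correlated with xit. The `ols` helper, the data generating values and the variable names are all illustrative. For brevity the group mean is added to a pooled OLS regression rather than to an FGLS random effects fit, and the sketch stops at the coefficient estimates; the Wu test itself is the Wald test that the group-mean coefficients are jointly zero.

```python
import random

def ols(columns, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    m = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(columns[i], y))]
         for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(m[r][i]))   # partial pivoting
        m[i], m[p] = m[p], m[i]
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(k):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return [row[k] for row in m]

rng = random.Random(2)
x, xbar, y = [], [], []
for _ in range(500):
    c = rng.gauss(0, 1)                          # effect, correlated with x
    xi = [c + rng.gauss(0, 1) for _ in range(4)]
    mean_i = sum(xi) / len(xi)
    for v in xi:
        x.append(v)
        xbar.append(mean_i)                      # group mean repeated over t
        y.append(1.0 + 2.0 * v + c + rng.gauss(0, 1))

const = [1.0] * len(y)
b_const, b_x, b_xbar = ols([const, x, xbar], y)
print(b_x, b_xbar)
```

In this design the coefficient on x recovers the assumed slope of 2, and the clearly nonzero coefficient on the group mean is the signal that the random effects orthogonality assumption fails.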

66 Mundlak (Augmented) Regression
[Regression output; numeric values are missing from the transcript. Columns: Variable, Coefficient, Standard Error, b/St.Er., P[|Z|>z], Mean of X. Significance flags as printed:]
Variable | Flag
EXPBAR   | ***
OCCBAR   | ***
SMSABAR  | ***
MSBAR    | ***
UNYNBAR  | **
WKSBAR   | **
INDBAR   |
SOUTHBAR |
EXP      | ***
EXPSQ    | ***
OCC      |
SMSA     | **
MS       |
FEM      | ***
UNION    | **
ED       | ***
BLK      | ***
WKS      |
IND      |
SOUTH    |
Constant | ***

67 Wu Test
--> matr;bm=b(1:8);vm=varb(1:8,1:8)$
--> matr;list;wutest=bm'<vm>bm$
Matrix WUTEST has 1 rows and 1 columns.
--> calc;list;ctb(.95,8)$
Listed Calculator Results
Result =

