Modelling multiple sample selection in intergenerational occupational mobility Cheti Nicoletti ISER, University of Essex Marco Francesconi Department of.


1 Modelling multiple sample selection in intergenerational occupational mobility Cheti Nicoletti ISER, University of Essex Marco Francesconi Department of Economics, University of Essex

2 Main aims of the paper
1. Estimation of intergenerational occupational mobility in Britain.
2. Correcting for potential sample selection problems in short panels using different estimation methods.

3 Sample selection problems
Labour market selection: intergenerational occupational mobility can be estimated only for people who are employed.
Coresidence selection: children must be living with their parents in at least one wave of the panel.

4 BHPS 1991-1993: child age by birth cohort in 1991, 1997 and 2003
Cohort  Child age in 1991  Child age in 1997  Child age in 2003
1988    3                  9                  15
1983    8                  14                 20
1978    13                 19                 25
1973    18                 24                 30
1968    23                 29                 35
1963    28                 34                 40
1958    33                 39                 45

5 Taking account of coresidence selection
Francesconi and Nicoletti (2006) find that intergenerational mobility in occupational prestige is underestimated when using the subsample of sons born between 1966 and 1985.
They try different estimation methods to correct for sample selection and find that only inverse propensity score weighting is able to attenuate the selection problem.
This evaluation of sample selection is possible because all BHPS respondents are asked to report the occupational characteristics of their parents when the respondents were 14.

6 Taking account of selection into employment for children
Blanden (2005) and Ermisch et al. (2005) consider two-step estimation procedures and find lower and unchanged βs.
Couch and Lillard (1998) and Nicoletti and Francesconi (2006) consider imputation methods and find lower βs.
Minicozzi (2003) uses a partial identification approach to produce bounds rather than point estimates of intergenerational mobility, and finds higher βs when unemployed and part-time workers are included.

7 Contributions of the paper
Propose new estimation methods to take account of sample selection problems in intergenerational mobility models, which are very parsimonious.
Take account of both coresidence and employment selection bias.

8 Selection models
If $\varepsilon_i$ and $u_i$ are not independent, then we have selection due to unobservables.
If $\varepsilon_i$ depends on $Z_i$, then we have selection due to observables.
If $\varepsilon_i$ depends on $Z_i$ and $u_i$, then we have selection due to both observables and unobservables.

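9 Selection due to unobservables
$y = \alpha + x\beta + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
Let $E(\varepsilon \mid x) = 0$, $\varepsilon \perp Z$, and let $(\varepsilon, u)$ be jointly normal with zero means, variances $\sigma^2$ and 1, and covariance $\rho$.
Then $E(y - \alpha - x\beta \mid x, d=1) \neq 0$ and the OLS estimator is biased:
$E(y \mid x, d=1) = \alpha + x\beta + E(\varepsilon \mid x, d=1) = \alpha + x\beta + \rho\lambda$
The error $v = \varepsilon - \rho\lambda$ is such that $E(v \mid d=1, x) = 0$, so we can include $\lambda$ as an additional correction term (Heckman 1979, Vella 1998).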
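The correction above can be checked on simulated data. The sketch below is illustrative only and is not the authors' code: the sample size, parameter values, the extra selection covariate `w` and the helper names are all assumptions. It takes $\lambda$ to be the inverse Mills ratio $\phi(Z\gamma)/\Phi(Z\gamma)$ from Heckman (1979), estimates $\gamma$ with a small hand-written probit, and compares OLS on the selected sample with and without the correction term.

```python
import math
import numpy as np

def norm_pdf(t):
    return np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)

_erf = np.vectorize(math.erf)

def norm_cdf(t):
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))

def fit_probit(Z, d, iters=25):
    """Probit coefficients by Fisher scoring (hypothetical helper)."""
    g = np.zeros(Z.shape[1])
    for _ in range(iters):
        idx = Z @ g
        p = np.clip(norm_cdf(idx), 1e-10, 1 - 1e-10)
        phi = norm_pdf(idx)
        score = Z.T @ (phi * (d - p) / (p * (1 - p)))
        info = (Z * (phi**2 / (p * (1 - p)))[:, None]).T @ Z
        g = g + np.linalg.solve(info, score)
    return g

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Simulated data; every number here is an assumption made for illustration.
rng = np.random.default_rng(0)
n = 20000
alpha, beta, rho = 1.0, 0.25, 0.7
x = rng.normal(size=n)
w = rng.normal(size=n)  # selection covariate excluded from the outcome equation
u = rng.normal(size=n)
eps = rho * u + math.sqrt(1 - rho**2) * rng.normal(size=n)  # Var 1, Cov(eps, u) = rho
y = alpha + beta * x + eps
Z = np.column_stack([np.ones(n), x, w])
d = (Z @ np.array([0.0, 1.0, 1.0]) + u > 0).astype(float)

sel = d == 1
gamma_hat = fit_probit(Z, d)
idx = Z[sel] @ gamma_hat
lam = norm_pdf(idx) / norm_cdf(idx)  # inverse Mills ratio for the selected units

ones = np.ones(sel.sum())
b_naive = ols(np.column_stack([ones, x[sel]]), y[sel])
b_heck = ols(np.column_stack([ones, x[sel], lam]), y[sel])
print("OLS on selected sample, beta:", round(b_naive[1], 3))
print("with correction term, beta:", round(b_heck[1], 3), "rho:", round(b_heck[2], 3))
```

In this design the uncorrected slope should fall well below the true value of 0.25, while the corrected slope should be close to it and the coefficient on `lam` should be close to $\rho$.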

10 Selection due to observables
$y = \alpha + x\beta + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
Let $E(\varepsilon \mid x) = 0$ and $\varepsilon \perp u$, but $\varepsilon$ not independent of $Z$.
Then $E(\varepsilon \mid x, d=1) \neq 0$ and the OLS estimator is biased because of selection on observables.
Since $\varepsilon \perp d \mid x, Z$, we can adopt (1) propensity score methods, (2) regression adjustment methods, or (3) a combination of the two (see Rosenbaum and Rubin, 1983; Robins and Rotnitzky, 1995; Hirano et al., 2003).

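11 Propensity score weighting method
Let $\Pr(d=1 \mid x, Z) = \Pr(d=1 \mid Z) = p(Z)$.
Then $E(\varepsilon d \mid x) \neq 0$ but $E(\varepsilon d\, p(Z)^{-1} \mid x) = 0$:
$E(\varepsilon d\, p(Z)^{-1} \mid x) = E_Z E(\varepsilon d\, p(Z)^{-1} \mid x, Z)$
$= E_Z[E(\varepsilon \mid x, Z, d=1) \Pr(d=1 \mid x, Z)\, p(Z)^{-1}]$
$= E_Z[E(\varepsilon \mid x, Z) \Pr(d=1 \mid x, Z)\, p(Z)^{-1}]$ (since $\varepsilon \perp d \mid x, Z$)
$= E_Z[E(\varepsilon \mid x, Z)] = E(\varepsilon \mid x) = 0$
This holds even if some of the variables in $Z$ are erroneously omitted from the main equation.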
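A small simulation can make the weighting argument concrete. This is a sketch under assumed values, not the paper's specification: `Z` is a hypothetical three-level variable (for example, education) that is omitted from the main equation, drives both `y` and selection, and is correlated with `x`. The propensity score is estimated by the share of selected units within each `Z` cell.

```python
import numpy as np

def wls_slope(x, y, w=None):
    """Slope from a (weighted) least-squares regression of y on a constant and x."""
    if w is None:
        w = np.ones_like(x)
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)[1]

# Simulated population; all values are assumptions for illustration.
rng = np.random.default_rng(1)
n = 100_000
Z = rng.integers(0, 3, size=n)           # observable that drives selection
x = 1.0 * Z + rng.normal(size=n)         # x is correlated with Z
y = 1.0 + 0.15 * x + 0.8 * Z + rng.normal(size=n)
p_true = np.array([0.9, 0.5, 0.1])[Z]    # Pr(d=1 | x, Z) depends on Z only
d = rng.random(n) < p_true

# Estimated propensity score: selection rate within each Z cell.
p_hat_by_cell = np.array([d[Z == k].mean() for k in range(3)])
weights = 1.0 / p_hat_by_cell[Z[d]]

b_full = wls_slope(x, y)                       # benchmark without selection
b_restricted = wls_slope(x[d], y[d])           # selected sample, unweighted
b_weighted = wls_slope(x[d], y[d], weights)    # selected sample, inverse propensity weights
print(round(b_full, 3), round(b_restricted, 3), round(b_weighted, 3))
```

With these assumed values the unweighted slope on the selected sample should sit noticeably below the full-sample slope, and the weighted slope should be close to the full-sample one even though `Z` never enters the regression.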

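12 Regression adjustment
$y = \alpha + x\beta + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
To take account of the fact that $\varepsilon$ is not independent of $Z$, estimate the extended model
$y = \alpha_N + x\beta_N + Z\delta + \omega$
If the linearity assumption is satisfied, then $E(\omega \mid x, Z, d=1) = E(\omega \mid x, Z) = E(\omega \mid x) = 0$ and $\beta_N$ is consistently estimated.
The coefficient of interest is recovered from
$\beta = \mathrm{Cov}(x,y)/\mathrm{Var}(x) = \beta_N + \mathrm{Var}(x)^{-1}\mathrm{Cov}(x,Z)\,\delta$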
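The two steps of regression adjustment can be sketched as follows. The data-generating values and variable names are assumptions made for illustration. The extended model is estimated on the selected sample only, and $\beta$ is then recovered with moments of $(x, Z)$ from the full sample, which presupposes that $x$ is observed for everyone. The adjustment term divides by $\mathrm{Var}(x)$, as implied by $\beta = \mathrm{Cov}(x,y)/\mathrm{Var}(x)$.

```python
import numpy as np

# Simulated population; all values are assumptions for illustration.
rng = np.random.default_rng(2)
n = 100_000
Z = rng.integers(0, 3, size=n).astype(float)
x = Z + rng.normal(size=n)
y = 1.0 + 0.15 * x + 0.8 * Z + rng.normal(size=n)   # beta_N = 0.15, delta = 0.8
d = rng.random(n) < np.array([0.9, 0.5, 0.1])[Z.astype(int)]

# Step 1: extended model y = a_N + x*b_N + Z*delta + omega on the selected sample.
X_ext = np.column_stack([np.ones(d.sum()), x[d], Z[d]])
_, beta_N, delta = np.linalg.lstsq(X_ext, y[d], rcond=None)[0]

# Step 2: map back to beta with full-sample moments of x and Z.
cov_xZ = np.cov(x, Z)[0, 1]
var_x = np.var(x, ddof=1)
beta_adjusted = beta_N + cov_xZ / var_x * delta

# Benchmark: the slope that would be obtained without any selection.
beta_full = np.cov(x, y)[0, 1] / var_x
print(round(beta_N, 3), round(beta_adjusted, 3), round(beta_full, 3))
```

If `x` were missing for the non-selected units, `cov_xZ` and `var_x` could only be computed on the selected sample and step 2 would inherit the selection problem.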

13 Combining regression adjustment and propensity score method
Estimate the extended model $y = \alpha + x\beta + Z\delta + \omega$ by inverse propensity score weighting:
$E[(y - \alpha - x\beta - Z\delta)\, d\, p(Z)^{-1} \mid x] = E_Z E[(y - \alpha - x\beta - Z\delta)\, d\, p(Z)^{-1} \mid x, Z]$
$= E_Z[E(y - \alpha - x\beta - Z\delta \mid x, Z, d=1) \Pr(d=1 \mid x, Z)\, p(Z)^{-1}]$
Notice that this expression is 0 if either $E(y - \alpha - x\beta - Z\delta \mid x, Z, d=1) = E(\omega \mid x, Z, d=1) = 0$ or $\Pr(d=1 \mid x, Z) = p(Z)$ holds; both are not required.

14 Selection due to both observables and unobservables
$y = \alpha + x\beta + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
where $\varepsilon$ depends on both $Z$ and $u$, and $(\varepsilon, u)$ is jointly normal with zero means, variances $\sigma^2$ and 1, and covariance $\rho$.
$v = (\varepsilon - \rho\lambda) \perp d \mid x, Z$
We can use: (1) Heckman correction and propensity score weighting, or (2) Heckman correction and regression adjustment.

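15 Heckman correction & propensity score weighting
$E[(y - \alpha - x\beta - \rho\lambda)\, d\, p(Z)^{-1} \mid x] = E_Z E[(y - \alpha - x\beta - \rho\lambda)\, d\, p(Z)^{-1} \mid x, Z]$
$= E_Z[E(y - \alpha - x\beta - \rho\lambda \mid x, Z, d=1) \Pr(d=1 \mid x, Z)\, p(Z)^{-1}]$
$= E_Z[E(y - \alpha - x\beta - \rho\lambda \mid x, Z) \Pr(d=1 \mid x, Z)\, p(Z)^{-1}]$ (since $(y - \alpha - x\beta - \rho\lambda) \perp d \mid x, Z$)
$= E_Z[E(y - \alpha - x\beta - \rho\lambda \mid x, Z)] = E(y - \alpha - x\beta - \rho\lambda \mid x) = 0$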

16 Heckman correction & regression adjustment
Estimate the extended model with additional variables $Z$ and correction term $\lambda$:
$y = \alpha + x\beta + Z\delta + \rho\lambda + \omega$
$d^* = Z\gamma + u$
$\rho\lambda$ controls for the dependence of $\varepsilon_1$ on $u$.
$Z\delta$ controls for the dependence of $\varepsilon_2$ on $Z$.

17 How can the BHPS help us?
All BHPS respondents are asked to report the occupational characteristics of their parents when the respondents were 14.
THEREFORE
We know the occupational prestige even for daughters and fathers living apart during the panel.
We can estimate intergenerational mobility without any coresidence selection.
We can consider the subsample of daughters coresident with their fathers at least once during the panel and assess the relevance of coresidence selection.
We can then compare different methods to correct for coresidence selection.

18 BHPS Samples
FULL SAMPLE: 2691 women (daughters) born between 1966 and 1985 with at least one valid interview over the first 13 waves of the BHPS (aged 16 to 37, average age 24).
RESTRICTED SAMPLE: 745 individuals from the full sample who can be matched with their father (aged 16 to 37, average age 21).
For daughters, we use average occupational prestige over all available waves.
For fathers, we instead use the occupational prestige reported retrospectively by daughters (average age 46).

19 Estimation Methods Used
Inverse propensity score weighting (Weights)
Regression adjustment
Regression adjustment & weights
Heckman correction method (Heckman)
Heckman & weights
Regression adjustment & Heckman

20 Coresidence selection model
$y = \alpha + x\beta + A\mu + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
where
y is the daughter's occupational prestige (log Hope-Goldthorpe score)
x is her father's occupational prestige
A contains age and age squared
d = 1 for daughters living with their father in at least one wave, and 0 otherwise
Z contains dummies for education, age, region, ethnicity and religiosity, and two house price indexes

21 The intergenerational equation is too parsimonious
$y = \alpha + x\beta + A\mu + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
Education dummies are important in explaining both the daughters' occupational prestige and their probability of being coresident.
The assumption that $\varepsilon \perp d$ is not acceptable.

22 Regression adjustment when x is missing
$y = \alpha_N + x\beta_N + Z\delta + \omega$
If the linearity assumption is satisfied, $\beta_N$ is consistently estimated.
$\beta = \mathrm{Cov}(x,y)/\mathrm{Var}(x) = \beta_N + \mathrm{Var}(x)^{-1}\mathrm{Cov}(x,Z)\,\delta$
If $x$ is missing, it is not possible to estimate $\mathrm{Cov}(x,Z)$ consistently.

23 Correcting for coresidence selection only
Method                             β      SE
Full sample                        0.250  0.028
Restricted sample                  0.147  0.044
Weights                            0.208  0.084
Heckman                            0.145  0.043
Heckman and weights                0.206  0.083
Regression adjustment              0.135  0.043
Regression adjustment & Heckman    0.132  0.043
Regression adjustment & weights    0.206  0.063
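To read the coresidence results at a glance, the snippet below expresses each estimate as a percentage shortfall from the full-sample β of 0.250. It uses only the point estimates reported on this slide; the variable names are illustrative, and the calculation ignores the standard errors.

```python
full = 0.250
estimates = {
    "Restricted sample": 0.147,
    "Weights": 0.208,
    "Heckman": 0.145,
    "Heckman and weights": 0.206,
    "Regression adjustment": 0.135,
    "Regression adjustment & Heckman": 0.132,
    "Regression adjustment & weights": 0.206,
}

# Relative shortfall of each estimate from the full-sample benchmark.
shortfall = {name: (full - b) / full for name, b in estimates.items()}
for name, s in shortfall.items():
    print(f"{name:33s} {100 * s:5.1f}% below the full-sample beta")
```

The restricted-sample estimate is about 41% below the benchmark. The three estimators that use weights are about 17 to 18% below it, while the estimators without weights stay 42 to 47% below.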

24 Employment selection model
$y = \alpha + x\beta + A\mu + \varepsilon$
$d^* = Z\gamma + u$, $d = 1(d^* > 0)$
where
y is the daughter's occupational prestige (log Hope-Goldthorpe score)
x is her father's occupational prestige
A contains age and age squared
d = 1 for daughters who are employed in at least one wave, and 0 otherwise
Z contains the father's occupational prestige, dummies for education, age, region, ethnicity and religiosity, a house price index, marital status, and the number of children aged 0-2, 3-4, 5-11, 12-15 and 16-18

25 Correcting for employment selection only
Method                             β      SE
Full sample                        0.250  0.028
Weights                            0.265  0.031
Heckman                            0.209  0.041
Heckman and weights                0.227  0.032
Regression adjustment              0.249  0.028
Regression adjustment & Heckman    0.255  0.029
Regression adjustment & weights    0.253  0.030

26 Correcting for employment and sample selection simultaneously
Method                                          β      SE
Full sample                                     0.250  0.028
Restricted sample                               0.147  0.044
Weights, bivariate selection                    0.208  0.084
Regression adj & weights, bivariate selection   0.145  0.043
Weights                                         0.206  0.083
Regression adjustment & weights                 0.135  0.043
Heckman                                         0.132  0.043
Heckman & Regression adjustment                 0.132  0.043
Heckman & weights                               0.206  0.063

27 Matching selection in quantile regressions
Quantile  Full   Restricted  Weights
10        0.386  0.125       0.384
          0.069  0.100       0.133
25        0.257  0.219       0.281
          0.052  0.077       0.131
50        0.248  0.164       0.215
          0.063  0.071       0.109
75        0.240  0.109       0.138
          0.045  0.054       0.096
90        0.079  0.059       0.141
          0.033  0.070       0.067

28 Employment selection in quantile regressions
Quantile  Full   Weights
10        0.386  0.311
          0.069  0.070
25        0.257  0.266
          0.052  0.046
50        0.248  0.292
          0.063  0.049
75        0.240  0.279
          0.045  0.041
90        0.079  0.195
          0.033  0.040

29 Double selection in quantile regressions
Quantile  Full   Restricted  Weights
10        0.386  0.125       0.389
          0.069  0.100       0.132
25        0.257  0.219       0.389
          0.052  0.077       0.136
50        0.248  0.164       0.244
          0.063  0.071       0.146
75        0.240  0.109       [not shown]
          0.045  0.054       0.131
90        0.079  0.059       0.008
          0.033  0.070       0.112

30 Conclusions
The intergenerational equation is too parsimonious and there are probably omitted variables such as education dummies.
In this situation, correcting for selection on observables is much more important than correcting for selection on unobservables.
Coresidence selection seems to cause an underestimation of β.
Selection into employment does not seem to cause a large bias in β.

