The European Statistical Training Programme (ESTP)


1 The European Statistical Training Programme (ESTP)

2 Chapter 9: Propensity scores
Handbook: chapter 11
- The propensity score
- Adjustment for nonresponse bias with the propensity score
- An example

3 The propensity score
- Rosenbaum and Rubin (1983): the propensity score ρk(X) is the conditional probability of assignment to a particular treatment given a vector of observed variables X.
- Adjusting for the one-dimensional propensity score is sufficient to remove the bias due to all observed auxiliary variables X.
- Two underlying assumptions:
  - MAR
  - Matching assumption

4 The propensity score
In the survey context:
- The propensity score ρk is the conditional probability that a person with characteristics X responds to the survey: ρk = P(Rk = 1 | X).
- Assume that within subpopulations defined by X all persons have the same response probability: the Missing At Random assumption (MAR).
- Among sample elements with the same response propensity, respondents and nonrespondents have the same distribution of the auxiliary variables (balancing condition).
The propensity scores can be used to adjust for nonresponse bias in various ways:
- Propensity score weighting (directly)
- In combination with linear weighting (directly)
- Propensity score stratification (indirectly)

5 Adjustment with the propensity score
- Propensity score weighting
  - Replace the unknown response propensities in the adapted Horvitz-Thompson estimator by the estimated response propensities
- Combination with linear weighting
  - Linear weighting with adjusted inclusion probabilities
  - Linear weighting including a propensity score strata variable
- Propensity score stratification
  - Divide the sample into strata based on the estimated propensity scores
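The weighting idea can be sketched in a few lines of Python. This is only a minimal illustration, not the course's implementation: it assumes the ratio form of the adapted Horvitz-Thompson estimator (weights 1 / (inclusion probability × estimated response propensity), normalised by their sum), and the function name and toy numbers are hypothetical.

```python
def propensity_weighted_mean(y, incl_prob, resp_prop):
    """Estimate a population mean from respondents only.

    y         : observed values of the respondents
    incl_prob : inclusion probabilities of the respondents
    resp_prop : estimated response propensities of the respondents
    Assumption: ratio-type estimator, weights normalised by their sum.
    """
    weights = [1.0 / (pi * rho) for pi, rho in zip(incl_prob, resp_prop)]
    return sum(w * v for w, v in zip(weights, y)) / sum(weights)


# Hypothetical toy data: respondents with y = 1 respond twice as readily.
y = [1, 0, 1, 0]
incl_prob = [0.1, 0.1, 0.1, 0.1]
resp_prop = [0.8, 0.4, 0.8, 0.4]

unweighted = sum(y) / len(y)
estimate = propensity_weighted_mean(y, incl_prob, resp_prop)
```

In this toy case the unweighted respondent mean is 0.5, while the weighted estimate is pulled down because the easy-to-reach respondents (propensity 0.8) receive smaller weights.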

6 The propensity score – An example
- The theory applies to the true response propensities.
- These are not known in practice and have to be estimated.
- This can be done with, for instance, a logit model.
- Estimation must be done so that the balancing property of the propensity score still holds.

7 The propensity score – An example Estimating response propensities
- Rk is the dependent variable in a logistic regression model, and all the auxiliary variables are candidates for the independent variables Xk in the response propensity model.
- The response propensities are obtained by ρ(Xk) = exp(Xk'β) / (1 + exp(Xk'β)).
- Now ρ(Xi) = ρ(Xj) if Xi = Xj, meaning that persons i and j have the same probability of response if they are in the same stratum defined by X.
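A small, dependency-free sketch of such a logit model follows. It is an illustration under stated assumptions rather than the method used for the course example: the model is fitted by plain gradient ascent on the mean log-likelihood, and the function names, learning rate, iteration count and toy data are all hypothetical.

```python
import math


def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, r, lr=0.5, n_iter=5000):
    """Fit P(R = 1 | x) = sigmoid(b0 + b'x) by gradient ascent.

    X : list of covariate rows (without intercept), r : 0/1 response indicators.
    Returns the coefficient list [b0, b1, ...].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, rk in zip(X, r):
            row = [1.0] + list(x)  # prepend the intercept term
            resid = rk - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += resid * v / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


def propensity(beta, x):
    """Estimated response propensity for covariate row x."""
    row = [1.0] + list(x)
    return sigmoid(sum(b * v for b, v in zip(beta, row)))


# Hypothetical sample: one 0/1 auxiliary variable, 8 sample persons.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
r = [1, 0, 0, 0, 1, 1, 1, 0]
beta = fit_logit(X, r)
```

With a single binary auxiliary variable the model is saturated, so the fitted propensities reproduce the observed response rates in the two groups (0.25 and 0.75 here). Persons with the same X receive the same estimated propensity.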

8 The propensity score – An example Estimating response propensities
Building the logistic regression model in two steps:
- Evaluate bivariate relations between X-variables.
- Build a multivariate model with stepwise inclusion of variables until no significant variables remain.
Check whether the balancing condition holds.
Application to the General Population Survey (GPS).

9 The propensity score – An example Estimating response propensities
Cramér's V statistic for bivariate tests of independence between auxiliary variables and response behaviour:

Auxiliary variable                                    V
Region of the country                                 0.163
Degree of urbanization                                0.153
Has listed phone number                               0.150
Percentage non-natives in neighborhood                0.138
Percentage non-western non-natives in neighborhood    0.133
Average house value in neighborhood                   0.115
Type of non-native                                    0.112
Type of household                                     0.106
Size of the household                                 0.099
Marital status                                        0.097
Is married
Is non-native                                         0.087
Has social allowance                                  0.077
Age in 13 classes                                     0.061
Has an allowance
Children in household                                 0.056
Has a job                                             0.037
Age in 3 classes                                      0.030
Has disability allowance                              0.021
Gender                                                0.011
Has unemployment allowance                            0.000
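For reference, Cramér's V for a contingency table of an auxiliary variable against the response indicator can be computed as sketched below. This uses the standard definition V = sqrt(χ² / (n · min(rows − 1, columns − 1))). The function name and the two toy tables are hypothetical, and the sketch assumes every row and column total is positive.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes all row and column totals are positive.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Rows: categories of an auxiliary variable; columns: nonresponse, response.
independent = [[10, 10], [20, 20]]   # same response rate in both categories
perfect = [[30, 0], [0, 30]]         # category fully determines response

v_independent = cramers_v(independent)
v_perfect = cramers_v(perfect)
```

V is 0 when response behaviour does not depend on the auxiliary variable and 1 when the variable determines it completely, so larger values in the table above point to stronger candidates for the response model.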

10 The propensity score – An example Estimating response propensities
Wald χ2 per variable:

Region: 817.31 190.83 159.02 174.30 163.31 159.96 159.17 160.22 162.57 163.70 163.62 164.79
Degree of urbanization: 89.90 50.00 22.31 19.97 16.36 15.56 15.38 15.10 16.04 15.93 16.28
Having a listed phone: 415.61 344.05 274.83 251.62 251.18 244.48 233.54 238.21 241.78 242.04
Average house value: 96.00 73.83 37.47 34.43 30.87 26.84 24.64 25.71 25.29
Ethnic background: 69.58 83.89 96.79 107.71 92.42 90.91 96.25 93.65
Type of household: 116.56 31.06 16.25 16.74 15.34 14.08 23.61
Size of household: 52.50 52.19 52.78 52.58 52.38
Marital status: 49.91 51.30 65.01 76.66 74.12
Has a social allowance: 14.76 8.08 8.50 8.62
Has a job: 29.32 17.89 23.70
Age (3 categories): 13.05 10.97
Gender: 14.72

Model fit:

pseudo R2: 0.019 0.022 0.031 0.033 0.035 0.038 0.039 0.040 0.041 0.042
χ2: 842.6 932.6 1347.7 1443.6 1514.5 1631.3 1684.0 1733.6 1748.5 1777.8 1790.9 1805.6
df: 4 8 9 20 24 28 32 35 36 37 39 40

11 The propensity score – An example Estimating response propensities
Construct strata of the response propensities:
- Cochran (1968) suggests that it is enough to use 5 strata.
- Strata must be homogeneous with respect to response behaviour.
- Base the strata on equal-width intervals, NOT on an equal number of observations within strata!

[Figure: density of the estimated response propensities (horizontal axis: response propensity, vertical axis: density), with strata 1 to 5 marked]

stratum   range   nh       nr,h
1         0.10    303      63
2         0.24    1,913    609
3         0.38    5,385    2,504
4         0.52    14,690   8,777
5         0.66    9,728    6,839
Total             32,019   18,792
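The stratum construction and the counts in the table can be sketched as follows. The stratum counts nh and nr,h are taken from the slide. Reading the "range" column as lower bounds of equal-width intervals of width 0.14 (so that stratum 5 ends at 0.80) is an assumption, as are the function name and the clamping of values outside that range.

```python
def stratum_index(p, lo=0.10, width=0.14, n_strata=5):
    """Assign an estimated response propensity p to an equal-width stratum.

    Assumption: strata start at `lo` and each has the given width;
    values outside the covered range are clamped to the first or last stratum.
    """
    idx = int((p - lo) / width) + 1
    return max(1, min(n_strata, idx))


# Counts from the slide: sample size and number of respondents per stratum.
n_h = [303, 1913, 5385, 14690, 9728]
n_rh = [63, 609, 2504, 8777, 6839]

# Observed response rate within each stratum.
response_rates = [r / n for r, n in zip(n_rh, n_h)]
```

Under this reading each observed stratum response rate (about 0.21, 0.32, 0.46, 0.60 and 0.70) falls inside its own propensity interval, which is what one would hope for if the strata are homogeneous with respect to response behaviour.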

12 The propensity score – An example Estimating response propensities
Check the balancing condition: within response propensity strata, respondents and nonrespondents have the same distribution of the auxiliary variables.
Test for a bivariate relationship between the response indicator and the auxiliary variables within each of the strata, and compare it to the strength of the overall relationship: smaller is better!

Variable      Stratum     Cramér's V
Region        (overall)   0.163
              1           0.133
              2           0.063
              3           0.041
              4           0.020
              5           0.014
House value   (overall)   0.116
              1           0.180
              2           0.058
              3           0.039
              4           0.024
              5           0.031
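The comparison can be written down as a short check. The Cramér's V values are those reported on the slide. Flagging a stratum when its within-stratum V is at least as large as the overall V is a simplifying assumption made for this sketch, and the function name is hypothetical.

```python
# Cramér's V values from the slide: overall, and within strata 1..5.
overall_v = {"Region": 0.163, "House value": 0.116}
within_v = {
    "Region": [0.133, 0.063, 0.041, 0.020, 0.014],
    "House value": [0.180, 0.058, 0.039, 0.024, 0.031],
}


def unbalanced_strata(overall, within):
    """Return the strata (numbered from 1) where stratification did not
    weaken the relationship between the variable and response."""
    return [h for h, v in enumerate(within, start=1) if v >= overall]


flags = {name: unbalanced_strata(overall_v[name], within_v[name])
         for name in overall_v}
```

For these two variables the check flags only stratum 1 for house value, where the within-stratum association (0.180) exceeds the overall one (0.116).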

13 The propensity score – An example Estimating response propensities
For the first stratum, the response propensity stratification did not lead to a better balanced distribution with respect to degree of urbanization, average house value, ethnic background, household size, marital status, social allowance, age in 3 categories and gender.
Possible solutions:
- Delete observations with a very low (or very high) response propensity
- Build strata based on the balancing condition (Imbens and Rubin, 2012)

14 The propensity score – An example Results
Propensity weighting and propensity stratification compared to the regular GREG estimator for output variables.

variable, category: response mean, se, propensity weighting, propensity stratification, GREG
PC in household, yes: 57.8 0.36 55.2 0.38 55.5 55.3 0.32
PC in household, no: 42.6 44.8 44.6 44.7

Persons with a PC in the household seem to be over-represented in the response.
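As a worked check on these numbers, the sketch below builds normal-approximation 95% confidence intervals for the "PC in household: yes" estimates. Two things are assumptions here: the pairing of each estimate with the standard error that follows it in the flattened table (response mean 57.8 with 0.36, propensity weighting 55.2 with 0.38, GREG 55.3 with 0.32, no standard error recoverable for the stratification estimate 55.5), and the use of the factor 1.96.

```python
Z_95 = 1.96  # normal-approximation multiplier for a 95% interval (assumption)


def conf_interval(estimate, se, z=Z_95):
    """Return (lower, upper) bounds of estimate ± z * se."""
    return (estimate - z * se, estimate + z * se)


def inside(value, interval):
    """True if value lies within the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


# Percentages for "PC in household: yes", paired as read from the slide.
response_mean = 57.8
weighting = (55.2, 0.38)   # (estimate, se)
stratification = 55.5      # se not recoverable from the slide
greg = (55.3, 0.32)        # (estimate, se)

ci_weighting = conf_interval(*weighting)
ci_greg = conf_interval(*greg)
```

Under these assumptions the unadjusted response mean lies outside both intervals, while each adjusted estimate lies inside the intervals of the other methods.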

15 The propensity score – An example Results

variable, category: response mean, se, propensity weighting, propensity stratification, GREG
Owns a house, yes: 62.5 0.35 58.4 0.37 58.8 58.3 0.30
Owns a house, no: 37.5 41.6 41.2 41.7

Persons who own a house seem to be more inclined to respond to the surveys than persons who do not own their own house.

16 The propensity score – An example Results

variable, category: response mean, se, propensity weighting, propensity stratification, GREG
Job level, very low: 3.6 0.14 3.8
Job level, low: 13.3 0.25 13.0 0.26 13.1 0.24
Job level, middle: 22.8 0.31 21.9 0.32 22.1 22.0 0.28
Job level, high: 11.2 0.23 11.0 0.22
Job level, academic: 4.2 0.15 4.1
Job level, no job: 44.8 0.36 46.2 0.37 46.0 46.1

Persons with a very low job level or no job seem to be underrepresented.

17 The propensity score – An example Results

variable, category: response mean, se, propensity weighting, propensity stratification, GREG
Level of education, primary: 20.0 0.29 21.3 21.1 0.28 21.2
Level of education, junior secondary: 9.2 0.21 0.22
Level of education, prevocational: 17.4 17.0
Level of education, senior secondary: 6.9 0.19 7.1 0.18
Level of education, post secondary: 28.0 0.33 26.9 27.1 0.32
Level of education, higher professional: 13.3 0.25 0.26
Level of education, university: 5.2 0.16 5.3 0.17 5.4

Persons with a primary or senior secondary level, as well as persons with a university degree, seem to be underrepresented.

18 The propensity score – An example Conclusions
- The estimates from the three nonresponse bias adjustment methods are very similar.
- The three methods always agree on the direction in which the estimate must be adjusted.
- The GREG estimator usually adjusts the most, followed by response propensity weighting.
- All estimates differ significantly from the response mean, but fall within the confidence intervals of the estimates made with the other methods.
- The adjustment technique has a small impact on the size of the adjustment. Apparently, the information that is used for the adjustment matters most.

