Guillaume Osier
Institut National de la Statistique et des Etudes Economiques (STATEC)
Social Statistics Division
Construction of sample weights for the EU SAfety SUrvey (EU-SASU)
Task Force on Victimization
Eurostat, February 2012
2 Why calculate sample weights? – 1/2
Sample weights are needed to correct for imperfections in the sample that might lead to bias and other departures between the sample and the reference population:
–the selection of units with unequal probabilities,
–non-coverage of the population,
–unit non-response.
When sample weights are adjusted to external data sources (calibration weighting), they can also help improve sampling accuracy by reducing the sampling variance.
3 Why calculate sample weights? – 2/2
Sample weighting is therefore an important step, which must be taken very seriously.
In the following, we provide a set of recommendations for computing and adjusting sample weights in the EU-SASU. Two types of sampling designs are distinguished:
–(C1) A sample of households is selected from a list of addresses/dwellings through a single or multi-stage sampling design. One individual is then randomly selected from every household.
–(C2) A sample of individuals is selected from a population register using single or multi-stage sampling. All the households the sampled individuals belong to are eligible for inclusion in the sample.
4 The step-by-step procedure
(C1) Selection of households
1. Calculation of the household design weights
2. Re-weighting for household non-response
3. Adjustment to external data sources (calibration)
4. Calculation of the individual weights
(C2) Selection of individuals
1. Calculation of the individual design weights
2. Re-weighting for individual non-response
3. Adjustment to external data sources (calibration)
4. Calculation of the household weights
5 Design weights
Computed for each sample unit (household or individual) as the inverse of the selection probability of the unit. For example, if the selection probability is 0.25 then the design weight is 4.
Intuitively, the design weight of unit i can be interpreted as the number of population units represented by i (assuming complete response).
Still assuming complete response, design weights yield unbiased estimators of linear parameters such as totals, and are the basis for estimating means and proportions.
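The rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the selection probabilities and the victimization indicator are hypothetical and do not come from the EU-SASU.

```python
def design_weight(selection_probability: float) -> float:
    """Design weight = inverse of the selection probability of the unit."""
    if not 0.0 < selection_probability <= 1.0:
        raise ValueError("selection probability must lie in (0, 1]")
    return 1.0 / selection_probability


def estimate_total(values, probabilities):
    """Weighted sum of the sample values, each weighted by its design weight."""
    return sum(y * design_weight(p) for y, p in zip(values, probabilities))


# Hypothetical sample of four units (complete response assumed).
probs = [0.25, 0.25, 0.5, 0.1]   # selection probabilities
victim = [1, 0, 1, 0]            # 1 = reported a victimization, 0 = did not

weights = [design_weight(p) for p in probs]
total_victims = estimate_total(victim, probs)   # estimated number of victims
proportion = total_victims / sum(weights)       # estimated victimization rate
```

The proportion is computed here as a ratio of two weighted totals, which is the usual way design weights are applied to means and proportions.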
6 Re-weighting for unit non-response – 1/6
Unit non-response is the failure to collect any information at all from a sample unit (household or individual). The reasons are diverse: failure to make contact with the unit, refusal or inability of the unit to participate, lack of quality of the collected data, accidental loss of the questionnaire…
Non-response is a serious issue because it affects the quality of the data, particularly when the non-responding units differ from the responding units with respect to key survey characteristics. This may create bias in estimates.
7 Re-weighting for unit non-response – 2/6
The correction strategy: re-weight the sample of respondents
–Division of the design weights of the responding units by their response probabilities, that is, multiplication by the inverse of those probabilities (the probabilities need to be estimated)
–Another solution: calibration (see later)
Let x be a vector of auxiliary characteristics assumed to be known for both the responding and the non-responding units.
It is of crucial importance to identify responding and non-responding units correctly. Selected units which turn out to be non-eligible or non-existent must be excluded and not counted as non-responding.
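8 Re-weighting for unit non-response – 3/6
Let $p_i$ be the response probability of unit i. That probability can be estimated by a logistic model:
$p_i = \dfrac{1}{1 + \exp(-a^\top x_i)}$
where a is a vector of parameters which is estimated from the sample.
This re-weighting strategy may cause severe dispersion of the final weights. An alternative is to form Response Homogeneous Groups (RHG) and to estimate the probability by the response rate in the group: for every unit i in group g,
$\hat{p}_i = |r_g| \, / \, |s_g|$
where $s_g$ is the set of sample units in group g and $r_g$ the set of respondents among them.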
9 Re-weighting for unit non-response – 4/6
A variant of the RHG method includes the design weights $d_k$: for every unit i in group g,
$\hat{p}_i = \sum_{k \in r_g} d_k \Big/ \sum_{k \in s_g} d_k$
where $s_g$ is the set of sample units in group g and $r_g$ the set of respondents among them.
The choice between the weighted and the non-weighted form is often a matter of taste. If the sample is (nearly) self-weighted, then there should not be much difference between the two. Otherwise, the effect of extreme weights might in some cases lead to unexpected results, hence a small preference for the non-weighted approach.
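A minimal sketch of the RHG adjustment, in both its non-weighted and weighted forms, might look as follows. The record layout (keys `group`, `weight`, `responded`) and the data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rhg_adjusted_weights(units, use_design_weights=False):
    """Divide each respondent's design weight by the response rate of its group.

    units: list of dicts with keys 'group', 'weight' (design weight) and
    'responded' (bool). Non-respondents get None.
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for u in units:
        # Non-weighted form counts units; weighted form sums design weights.
        contribution = u["weight"] if use_design_weights else 1.0
        sampled[u["group"]] += contribution
        if u["responded"]:
            responded[u["group"]] += contribution

    adjusted = []
    for u in units:
        if not u["responded"]:
            adjusted.append(None)
            continue
        rate = responded[u["group"]] / sampled[u["group"]]
        adjusted.append(u["weight"] / rate)
    return adjusted


# Hypothetical sample: two response homogeneous groups, A and B.
sample = [
    {"group": "A", "weight": 10.0, "responded": True},
    {"group": "A", "weight": 10.0, "responded": True},
    {"group": "A", "weight": 20.0, "responded": True},
    {"group": "A", "weight": 20.0, "responded": False},
    {"group": "B", "weight": 5.0, "responded": True},
    {"group": "B", "weight": 5.0, "responded": False},
]

unweighted = rhg_adjusted_weights(sample)
weighted = rhg_adjusted_weights(sample, use_design_weights=True)
```

In this toy sample the weighted form reproduces the sum of the design weights within each group, whereas the non-weighted form does not; with nearly equal design weights the two forms would give almost the same result.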
10 Re-weighting for unit non-response – 5/6
An essential step toward non-response correction: collect as much information as possible on the non-respondents
–This way, a powerful model for estimating the response probabilities can be constructed
–Possible sources of information on the non-respondents:
  The sampling frame
  Administrative sources (if available)
  Ad-hoc surveys on the non-respondents: a few questions are put to the non-respondents
  Data about the collection process (paradata)
  Experience from other surveys, common sense
11 Re-weighting for unit non-response – 6/6
Potential non-response adjustment variables:
Household non-response (selection of households)
–Household size
–Household composition
–Geographical location (NUTS region)
–Characteristics of the household’s reference person: age, gender, education, activity status…
Individual non-response (selection of individuals)
–Age, gender, education, citizenship, activity status…
–Characteristics of the household: size, composition, location…
12 Calibration to external sources – 1/8
We seek to modify (calibrate) the non-response adjusted design weights so that the weighted sample totals conform to population totals for a given set of auxiliary variables.
Let $\tilde{d}_i$ be the design weights corrected for unit non-response. Let x be a vector of auxiliary variables for which the population total X is known to high precision (from census data, administrative sources, large surveys…). We seek new weights $w_i$ that solve the problem:
$\min_{w} \sum_{i \in r} D(w_i, \tilde{d}_i)$ subject to $\sum_{i \in r} w_i \, x_i = X$
where r denotes the set of responding units.
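13 Calibration to external sources – 2/8
D is a distance function. The idea is to keep the final weights as close as possible to the initial weights.
The calibration variables can be either quantitative (calibration on population totals) or qualitative (calibration on population distributions).
It can be shown that the final calibrated weights are given by:
$w_i = \tilde{d}_i \, F(x_i^\top \lambda)$
where $\tilde{d}_i$ is the initial (non-response adjusted) weight of unit i, the function F is derived from the distance D and $\lambda$ is an unknown parameter vector. $\lambda$ is the solution to the calibration equation:
$\sum_{i \in r} \tilde{d}_i \, F(x_i^\top \lambda) \, x_i = X$
where r denotes the set of responding units.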
14 Calibration to external sources – 3/8
Examples of distance functions
15 Calibration to external sources – 4/8
Provided the sample size is large enough, all distance functions lead to the same results in terms of bias and variance:
–The bias is negligible
–Calibration generally improves precision: the stronger the correlation between the study variable and the calibration variables, the lower the sampling variance
However, « bounded » methods like the logistic or the truncated linear methods are commonly used because they avoid extreme weights: the ratios (final weight/initial weight) are kept between two bounds L and U which are fixed in advance.
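To make the mechanics concrete, here is a minimal sketch of the plain (unbounded) linear method, for which F(u) = 1 + u and the calibration equation has a closed-form solution. It is not the bounded logistic or truncated linear method named above, which require an iterative solver. The function names and the data are hypothetical, and the small Gaussian elimination routine is only there to keep the example free of external libraries.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("calibration variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_calibration(initial_weights, x_rows, totals):
    """Linear method: w_i = d_i * (1 + x_i' lambda)."""
    p = len(totals)
    pairs = list(zip(initial_weights, x_rows))
    # T = sum_i d_i x_i x_i'
    t = [[sum(d * x[j] * x[k] for d, x in pairs) for k in range(p)]
         for j in range(p)]
    # Gap between the known totals and the current weighted totals.
    gap = [totals[j] - sum(d * x[j] for d, x in pairs) for j in range(p)]
    lam = solve_linear_system(t, gap)
    return [d * (1.0 + sum(xj * lj for xj, lj in zip(x, lam)))
            for d, x in pairs]


# Hypothetical data: x_i = (1, male dummy); known totals: 60 persons, 25 males.
d = [10.0] * 5
x = [[1, 1], [1, 1], [1, 0], [1, 0], [1, 0]]
w = linear_calibration(d, x, totals=[60.0, 25.0])
ratios = [wi / di for wi, di in zip(w, d)]   # final weight / initial weight
```

Inspecting `ratios` is a simple way to see why bounded methods are attractive: with the plain linear method nothing prevents a ratio from becoming very large, or even negative.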
16 Calibration to external sources – 5/8
Potential calibration variables for the EU-SASU: they have to be correlated with victimization items
–Household level: household size, household composition, NUTS region, dwelling type, household income…
–Individual level: age, gender, education, citizenship, activity status…
The calibration approach is implemented in many software packages: Calmar (INSEE France), Clan (Statistics Sweden), G-calib (Statistics Belgium), Bascula (Statistics Netherlands), R (package ‘sampling’)…
17 Calibration to external sources – 6/8
In practice, calibration information may be available at both household and individual level. For consistency reasons, an « integrative » calibration is recommended.
In case of a selection of households, the idea is to use both household and individual information in a single-shot calibration at household level. The individual variables are aggregated at household level. The calibration is then done at household level using household variables and individual variables summed up as household variables.
As we will see, this technique ensures « consistency » between household and individual estimates.
18 Calibration to external sources – 7/8
For instance, suppose we want to calibrate the sample to:
–the total number of private households in the population (information at household level),
–the population distribution by age and gender (information at individual level).
The dummy variables $1_C$ for all the age and gender categories C are transformed into household variables by multiplying by the household size $N_h$ (in number of members aged 16 or more):
$x_h^{(C)} = N_h \cdot 1_C(i_h)$
where $i_h$ is the individual selected in household h, and $1_C(i_h)$ equals 1 if that individual belongs to category C and 0 otherwise.
19 Calibration to external sources – 8/8
The calibration is then done at household level using the variable equal to one for all the households (to calibrate to the total number of private households in the population) and the aggregated dummies $x^{(C)}$ at household level.
In case of a selection of individuals, the approach is similar except that the calibration variables at household level are disaggregated at individual level by dividing by the number of household members aged 16 or more. The calibration is implemented at individual level using the individual variables and the disaggregated household variables.
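The construction of the calibration variables for the two designs can be sketched as below. The category labels and function names are hypothetical; only the aggregation rule (multiply by the household size for C1, divide by it for C2) is taken from the slides.

```python
# Hypothetical age-by-gender categories used for calibration.
CATEGORIES = ["M16-39", "M40+", "F16-39", "F40+"]


def household_row_c1(n_members_16plus, selected_category, categories=CATEGORIES):
    """Design C1: calibration variables of one sampled household.

    First entry: constant 1 (calibrates to the number of private households).
    Other entries: dummy of the selected individual times the household size.
    """
    row = [1.0]
    for c in categories:
        row.append(float(n_members_16plus) if c == selected_category else 0.0)
    return row


def individual_row_c2(n_members_16plus, category, categories=CATEGORIES):
    """Design C2: calibration variables of one sampled individual.

    First entry: household constant disaggregated as 1 / household size.
    Other entries: plain age-by-gender dummies of the individual.
    """
    row = [1.0 / n_members_16plus]
    for c in categories:
        row.append(1.0 if c == category else 0.0)
    return row


# A household of three members aged 16+, where a woman aged 16-39 was selected.
row_c1 = household_row_c1(3, "F16-39")
# The same person as a sampled individual under design C2.
row_c2 = individual_row_c2(3, "F16-39")
```

Either set of rows can then be passed to any calibration routine together with the known population totals (number of private households, population counts by age and gender).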
20 Calibration and non-response – 1/6
Although calibration was originally introduced as a way to reduce sampling variance, the approach can also reduce non-response bias, on condition that the calibration variables are correlated with the probability of response.
Thus, by a single-shot calibration of the design weights, one can expect to reduce both non-response bias and sampling variance (one step instead of two).
Equation for single-shot calibration:
$\sum_{i \in r} d_i \, F(x_i^\top \lambda) \, x_i = X$
where r is the set of respondents, $d_i$ is the design weight of unit i (not adjusted for non-response) and the final weight is $w_i = d_i \, F(x_i^\top \lambda)$.
21 Calibration and non-response – 2/6
Advantages of the calibration approach:
–The non-response adjustment variables are not required to be known for the non-respondents
–It can reduce bias and variance at the same time (powerful)
–Simplicity (there is no longer any need to construct a model for estimating the response probabilities)
Problem: the non-response variables must be calibration variables as well, that is, their population totals must be known. This condition is a serious obstacle, especially when non-response is non-ignorable (i.e. depends on the variables of interest measured within the same survey).
22 Calibration and non-response – 3/6
Generalised calibration approach: we still expect to correct non-response bias and reduce sampling variance by a single-shot calibration of the design weights. We seek to determine new weights $w_i$ as a function of the design weights $d_i$ and a vector z of non-response variables:
$w_i = d_i \, F(z_i^\top \lambda)$
F is a « calibration » function and z is a vector of non-response explanatory variables for which the only prerequisite is to know their values for the respondents. In particular, there is no longer any need to have their values for the non-respondents or to know their totals over the population.
23 Calibration and non-response – 4/6
The vector $\lambda$ is determined by solving the calibration equations based on a vector x of calibration variables:
$\sum_{i \in r} d_i \, F(z_i^\top \lambda) \, x_i = X$
where r is the set of respondents and X the known population total of x.
The advantage of this generalised calibration is that, contrary to the classical approach in which the non-response variables have to be calibration variables too, all the variables that are collected during the survey, and which are observed only for the respondents, can be used for non-response correction.
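A deliberately small sketch of the idea, restricted to the linear function F(u) = 1 + u with a single calibration variable x and a single non-response variable z, so that the calibration equation can be solved in closed form. This is an illustrative simplification and not the Calmar2 implementation; the function name and the data are hypothetical.

```python
def generalised_linear_calibration(design_weights, x, z, total_x):
    """Scalar generalised calibration with F(u) = 1 + u.

    Solves sum_i d_i * (1 + z_i * lam) * x_i = total_x for lam and returns
    the weights w_i = d_i * (1 + z_i * lam). x is the calibration variable
    (known population total), z is observed on the respondents only.
    """
    current = sum(d * xi for d, xi in zip(design_weights, x))
    cross = sum(d * xi * zi for d, xi, zi in zip(design_weights, x, z))
    if abs(cross) < 1e-12:
        raise ValueError("x and z are not related enough to solve for lambda")
    lam = (total_x - current) / cross
    return [d * (1.0 + zi * lam) for d, zi in zip(design_weights, z)]


# Hypothetical respondents: x = 1 for everyone (calibration on a population
# size of 60), z = 1 if the respondent reported a victimization.
d = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 1.0, 1.0, 1.0]
z = [1.0, 0.0, 0.0, 1.0]
w = generalised_linear_calibration(d, x, z, total_x=60.0)
```

In this toy case the whole shortfall in the population size is attributed to respondents with z = 1, which is exactly the behaviour one accepts when assuming that non-response is driven by z.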
24 Calibration and non-response – 5/6
The bias of the generalised calibration estimator is near zero as long as the sample size is large enough and the inverse of the response probability (called the influence) has a good linear relationship with the non-response variables z.
In terms of precision (variance), we still benefit from the calibration variables x: the stronger the correlation between the study variable and the calibration variables, the lower the sampling variance. However, if the calibration variables x are poorly correlated with the non-response variables z, the precision can deteriorate.
25 Calibration and non-response – 6/6
In fact, the main difficulty in using the generalised calibration approach is that we need to be strongly convinced that non-response is caused by variables of interest of the survey, whose values are only observed for the respondents. In the absence of any information on the non-respondents which could help check this assumption, we have no other choice but to « believe » in it.
The generalised calibration is implemented in the new version of the SAS macro Calmar (Calmar2), developed by France’s Statistics Office (INSEE).
26 Calculation of the final weights – 1/3
Case 1: Selection of households
The individual weights are calculated by multiplying the household weighting factor (i.e. the household design weights adjusted for non-response and calibrated to external data sources) by the size of the household in number of individuals aged 16 or more.
This formula is justified by the fact that one member aged 16 or more is randomly selected in every sampled household.
Thanks to the “integrative” calibration at household level using both household and individual calibration information, the individual weights are already calibrated to population characteristics at individual level.
27 Calculation of the final weights – 2/3
Case 2: Selection of individuals
The household weights are derived by dividing the individual weighting factor (i.e. the individual design weights adjusted for non-response and calibrated to external data sources) by the size of the household in number of individuals aged 16 or more.
More generally, if there is more than one sampled individual per household, the household weight is calculated by summing up the weights of all the sampled individuals in the household and dividing the result by the size of the household.
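The two rules for deriving the final weights can be written down directly. The function names and the numbers in the example are hypothetical.

```python
def individual_weight_c1(household_weight, n_members_16plus):
    """Case 1: one member aged 16+ is selected at random in each household,
    so the individual weight is the household weight times the household size."""
    return household_weight * n_members_16plus


def household_weight_c2(sampled_individual_weights, n_members_16plus):
    """Case 2: sum the weights of all sampled individuals of the household
    and divide by the household size (members aged 16+)."""
    return sum(sampled_individual_weights) / n_members_16plus


# Case 1: calibrated household weight 250, three members aged 16 or more.
w_individual = individual_weight_c1(250.0, 3)

# Case 2: one sampled individual with weight 600 in a household of 3,
# and a household of 4 in which two individuals were sampled.
w_household_single = household_weight_c2([600.0], 3)
w_household_double = household_weight_c2([600.0, 600.0], 4)
```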
28 Calculation of the final weights – 3/3
A final remark: the distribution of the final weights (household and individual) should be examined in order to detect any extreme values.
Extreme weights should be treated (trimming, top-coding…) as they can lead to unexpected estimates, particularly over sub-populations.
However, this should not happen if the dispersion of the weights is properly taken care of at each step of the procedure, using for instance Response Homogeneous Groups and “bounded” calibration methods.
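One possible way to treat extreme weights is sketched below: weights are top-coded at an upper bound and the removed mass is spread proportionally over the remaining weights so that the overall sum is preserved. This is only one of several trimming schemes, the bound is an arbitrary choice of the analyst, and the function name and data are hypothetical.

```python
def trim_weights(weights, upper, max_iter=100):
    """Top-code weights at `upper`, redistributing the excess proportionally
    over the weights still below the bound so that the total is preserved."""
    total = sum(weights)
    if upper * len(weights) < total:
        raise ValueError("upper bound too low to preserve the sum of weights")
    w = list(weights)
    for _ in range(max_iter):
        capped = [min(x, upper) for x in w]
        excess = total - sum(capped)
        free_total = sum(x for x in capped if x < upper)
        if excess <= 1e-12 or free_total <= 0.0:
            return capped
        scale = 1.0 + excess / free_total
        # Rescaled weights may exceed the bound; the next pass caps them again.
        w = [x * scale if x < upper else x for x in capped]
    return [min(x, upper) for x in w]


# Hypothetical final weights with one extreme value.
final_weights = [10.0, 10.0, 10.0, 70.0]
trimmed = trim_weights(final_weights, upper=40.0)
```

Changing the weights after calibration will in general disturb the calibrated totals, so the calibration constraints should be checked again after such a treatment.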