An assessment of the robustness of weights in the Famille et Employeurs survey
Nicolas Razafindratsima & Elisabeth Morand, INED
European Conference on Quality in Official Statistics, Rome, July 2008
Outline
- Introduction
- The Familles et Employeurs survey
- Weight calculation methodology
- Results
- Conclusion
Introduction
In indirect sampling, the generalized weight share method (Lavallée, 2002 & 2007) is a simple strategy for producing unbiased estimates.
Objective of this study: to present the implementation of this method for the Familles et Employeurs survey (Ined/Insee, ), the difficulties encountered, and the robustness of the weights obtained.
The Familles et Employeurs survey
Main purpose: to study how work and family life are reconciled:
- from the employees' point of view
- from the employers' point of view
Can behaviours within the family be explained by employers' rules and characteristics?
A matched survey: employees and their employers
The household survey
Sampling strategy:
- a sample of dwellings drawn from the INSEE master sample of them answered
- interview of 1 or 2 individuals aged 20-49 in each household (random selection if more than 2 are eligible) respondents
Face-to-face interview. Questionnaire on demographic events (births, unions), family life organization, etc., in relation to the individuals' professional context
The establishment survey
Target: establishments with 20 employees or more
Individual respondents were asked to give their employer's name, address and national identification number, as well as its size and sector.
A self-administered questionnaire was sent to the establishments, which could answer on paper or via the Internet.
respondents.
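Weighting methodology
The basis of the establishment weighting methodology is the generalized weight share method:
$W_i = \frac{1}{L_i} \sum_{j} W_j$, the sum being taken over the sampled individuals $j$ employed in establishment $i$
where:
$W_i$ = weight of establishment $i$
$W_j$ = weight of individual $j$ (those in the sample)
$L_i$ = number of eligible employees (aged 20-49) in establishment $i$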
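The weight share step can be sketched in a few lines of Python: each establishment receives the sum of the weights of its sampled employees, divided by its number of eligible employees. The function name, the data layout and the numbers below are illustrative assumptions, not the survey's actual code or data.

```python
from collections import defaultdict

def share_weights(individuals, eligible_counts):
    """Generalized weight share sketch.

    individuals: iterable of (establishment_id, individual_weight) pairs,
                 one per sampled respondent (hypothetical layout).
    eligible_counts: dict mapping establishment_id to L_i, the number of
                     eligible employees in that establishment.
    Returns a dict mapping establishment_id to its weight W_i.
    """
    totals = defaultdict(float)
    for est_id, w_j in individuals:
        totals[est_id] += w_j  # sum of W_j over sampled employees of i
    return {est_id: total / eligible_counts[est_id]
            for est_id, total in totals.items()}

# Made-up example: two sampled employees in A, one in B.
sample = [("A", 1200.0), ("A", 800.0), ("B", 1500.0)]
eligible = {"A": 50, "B": 30}
weights = share_weights(sample, eligible)
print(weights)
```

The sketch makes the two sources of fragility visible: the result depends directly on the individual weights in the numerator and on the link variable in the denominator.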
Weighting methodology
Difficulties:
- the choice of the individuals' weights (non-response, calibration)
- the estimation of the number of employees aged 20-49 in the establishment
[- establishment non-response]
Weighting the sample of individuals
Non-response variations:
– At the household level (non-response higher in urban areas, for collective dwellings, etc.)
– At the individual level (non-response higher for males, single persons, less-educated people, etc.)
Non-response adjustments:
– At the household level: using corrective response rates within response homogeneity groups
– At the individual level: calibration (on Labor Force Survey data)
  - on a single variable: gender*age
  - or on 7 variables: gender*age, employment status, nationality, region, highest diploma, urbanization status, household size
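The two adjustment steps can be illustrated with a minimal Python sketch. It assumes a weighted response rate within each homogeneity group, and it treats calibration on a single categorical variable as post-stratification (which is what single-variable calibration reduces to). Calibration on 7 variables needs an iterative procedure and is not shown. All names, groups and figures are hypothetical.

```python
from collections import defaultdict

def adjust_for_nonresponse(units):
    """units: list of dicts with keys 'group', 'weight', 'responded'.
    Returns the adjusted weights of respondents only, each divided by the
    weighted response rate of its homogeneity group (an assumption: the
    slides do not say whether the rate is weighted)."""
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for u in units:
        sampled[u["group"]] += u["weight"]
        if u["responded"]:
            responded[u["group"]] += u["weight"]
    return [u["weight"] * sampled[u["group"]] / responded[u["group"]]
            for u in units if u["responded"]]

def poststratify(respondents, population_totals):
    """respondents: list of (cell, weight); population_totals: cell -> known
    total (e.g. from an external source). Rescales weights so that each
    cell sums to its known total."""
    cell_sums = defaultdict(float)
    for cell, w in respondents:
        cell_sums[cell] += w
    return [w * population_totals[cell] / cell_sums[cell]
            for cell, w in respondents]

# Made-up example: half of the urban households respond, all rural ones do.
households = [
    {"group": "urban", "weight": 100.0, "responded": True},
    {"group": "urban", "weight": 100.0, "responded": True},
    {"group": "urban", "weight": 100.0, "responded": False},
    {"group": "urban", "weight": 100.0, "responded": False},
    {"group": "rural", "weight": 100.0, "responded": True},
    {"group": "rural", "weight": 100.0, "responded": True},
]
adjusted = adjust_for_nonresponse(households)

persons = [("F 20-29", 200.0), ("F 20-29", 200.0), ("M 20-29", 100.0)]
calibrated = poststratify(persons, {"F 20-29": 500.0, "M 20-29": 300.0})
print(adjusted, calibrated)
```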
Distribution of individuals' weights
Calibration variables | N | Mean | CV (%) | Min | Median | Max | Max/Min
Gender*age ,148,0557, ,5
7 variables ,151,9440, ,8
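The indicators used to describe a set of weights (N, mean, coefficient of variation, minimum, median, maximum and max/min ratio) can be computed as in the sketch below. The use of the sample standard deviation (n - 1 denominator) for the CV is an assumption, and the input values are made up.

```python
import statistics

def weight_summary(weights):
    """Return the dispersion indicators used to compare weight sets."""
    mean = statistics.fmean(weights)
    sd = statistics.stdev(weights)  # sample standard deviation (assumption)
    return {
        "n": len(weights),
        "mean": mean,
        "cv_pct": 100 * sd / mean,
        "min": min(weights),
        "median": statistics.median(weights),
        "max": max(weights),
        "max_over_min": max(weights) / min(weights),
    }

summary = weight_summary([1.0, 2.0, 3.0, 4.0, 10.0])  # made-up weights
print(summary)
```

A high CV or a large max/min ratio signals that a few units carry a disproportionate share of the estimates.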
Estimation of the number of eligible employees (aged 20-49)
The number of employees (total and by age group) was asked in the establishment questionnaire.
But the number of employees aged 20-49 is not available for 16% of the establishments (item non-response).
Imputation of the number aged 20-49 in case of non-response: using the coefficients of a regression model fitted among respondents, linking the number aged 20-49 to total size (one model per activity sector).
For 2/3 of the establishments, total size is also available through the SIRENE directory. This allows another estimate of the number aged 20-49, for evaluation purposes.
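A minimal sketch of regression imputation by sector follows. It assumes a simple linear model with intercept, fitted by ordinary least squares among establishments that reported the number of eligible employees. The slides do not specify the model form, so this is one plausible reading, and the field names and data are hypothetical.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def impute_eligible(establishments):
    """establishments: list of dicts with 'sector', 'total' and 'eligible'
    ('eligible' is None when the item is missing). Returns the list of
    eligible counts, with missing values replaced by the prediction of
    the model fitted on the respondents of the same sector."""
    by_sector = defaultdict(list)
    for e in establishments:
        if e["eligible"] is not None:
            by_sector[e["sector"]].append((e["total"], e["eligible"]))
    models = {s: fit_line(*zip(*pairs)) for s, pairs in by_sector.items()}
    result = []
    for e in establishments:
        if e["eligible"] is None:
            a, b = models[e["sector"]]
            result.append(a + b * e["total"])
        else:
            result.append(e["eligible"])
    return result

# Made-up data: one missing value per sector.
data = [
    {"sector": "industry", "total": 100, "eligible": 60},
    {"sector": "industry", "total": 200, "eligible": 120},
    {"sector": "industry", "total": 300, "eligible": 180},
    {"sector": "industry", "total": 150, "eligible": None},
    {"sector": "services", "total": 50, "eligible": 40},
    {"sector": "services", "total": 150, "eligible": 100},
    {"sector": "services", "total": 100, "eligible": None},
]
imputed = impute_eligible(data)
print(imputed)
```

Replacing the declared total by the SIRENE total in the `total` field would give the alternative estimate used for evaluation.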
Distribution of total size
N | Mean | CV (%) | Min | Median | Max | Max/Min
Declared ,5264,620,0152, ,8
SIRENE (after imputation) ,6285,020,0151, ,7
Distribution of the number of employees aged 20-49
Nb of missing | Mean | CV (%) | Min | Median | Max | Max/Min
Declared | 433 | 419,5 | 259,0 | 3,0 | 110, ,0
Imputed using declared | 0 | 398,0 | 256,5 | 3,0 | 109, ,0
Imputed using SIRENE | 0 | 430,7 | 286,5 | 13,9 | 113, ,0
Summary of the options for establishment weighting
Calibration variables for individual weights | Number of employees aged 20-49: imputation using declared size | Number of employees aged 20-49: imputation using SIRENE size
Gender*age only | P1 | P2
7 variables | P3; P3tr: P3 trimmed at P5 & P95, then adjusted on the total | P4
Employees' weight = establishment weight * number of employees
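The trimmed variant and the derivation of employee weights can be sketched as follows. The sketch assumes that "trimmed" means clipping weights to the 5th and 95th percentiles, and that "adjusted on the total" means rescaling so that the sum of weights is unchanged. Both readings, the percentile definition and the numbers are assumptions.

```python
import statistics

def trim_and_rescale(weights):
    """Clip weights to [P5, P95], then rescale to the original total."""
    cuts = statistics.quantiles(weights, n=20, method="inclusive")
    low, high = cuts[0], cuts[-1]  # 5th and 95th percentiles
    clipped = [min(max(w, low), high) for w in weights]
    factor = sum(weights) / sum(clipped)
    return [w * factor for w in clipped]

def employee_weights(est_weights, eligible_counts):
    """Employee weight = establishment weight * number of employees."""
    return [w * n for w, n in zip(est_weights, eligible_counts)]

# Made-up establishment weights with one very small and one very large value.
raw = [0.2] + [40.0] * 18 + [600.0]
trimmed = trim_and_rescale(raw)
emp = employee_weights([40.0, 50.0], [50, 30])
print(max(raw) / min(raw), max(trimmed) / min(trimmed), emp)
```

Because the rescaling factor is applied after clipping, the final weights can lie slightly outside the percentile bounds; only the max/min ratio is guaranteed to shrink.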
Distribution of establishment weights
Weights | Mean | CV (%) | Min | Max | Max/Min | Nb of weights < 1
P1 | 41,8 | 138,6 | 0,3 | 646,0 | 2 583,9 | 51
P2 | 41,1 | 128,7 | 0,1 | 476,4 | 7 217,4 | 55
P3 | 41,3 | 140,8 | 0,2 | 592,0 | 3 235,0 | 60
P4 | 40,7 | 134,0 | 0,1 | 459,9 | 8 516,3 | 67
P3tr | 41, ,7155,291,20
Distribution of employees' weights
Weights | Mean | CV (%) | Min | Max | Max/Min
P1 | 4 019,6 | 81,6 | 819, ,985,1
P2 | 3 835,3 | 62,1 | 819, ,643,1
P3 | 3 926,0 | 81,4 | 586, ,2113,1
P4 | 3 751,2 | 66,0 | 586, ,263,9
P3tr | 4 409,0 | 82,1 | 980, ,7 77,7
Percentage of establishments having a crèche
Percentage of employees working in an establishment with a crèche
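The two indicators above differ only in the weights applied: establishment weights for the share of establishments, employee weights (establishment weight * number of employees) for the share of employees. A minimal sketch with made-up numbers and hypothetical names:

```python
def weighted_percentage(flags, weights):
    """Weighted share (in %) of units for which the flag is true."""
    total = sum(weights)
    return 100 * sum(w for f, w in zip(flags, weights) if f) / total

# Made-up example: three responding establishments.
has_creche = [True, False, False]
est_weights = [10.0, 40.0, 50.0]
n_employees = [500, 50, 20]
emp_weights = [w * n for w, n in zip(est_weights, n_employees)]

pct_establishments = weighted_percentage(has_creche, est_weights)
pct_employees = weighted_percentage(has_creche, emp_weights)
print(pct_establishments, pct_employees)
```

In this made-up case the only establishment with a crèche is a large one, so the employee-level percentage is far higher than the establishment-level one; it also shows how a single large weight can drive the employee-level estimate.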
Conclusion
Weight share method:
– Simple and easy to implement in principle
– However, difficult to implement in the Familles et Employeurs survey, due to:
  - non-response on the ‘link’ variable
  - high dispersion of the size variable
The weights are more sensitive to the link variable than to the specification of the individual weights.
All the computed weights show great variability.
Weight trimming may improve precision at the establishment level, but fails to do so at the employee level.
References
Lavallée, P. (2002): Le sondage indirect, ou la méthode généralisée du partage des poids, Bruxelles, Presses de l’Université libre de Bruxelles et Paris, Ellipses, 242 pages.
Lavallée, P. (2007): Indirect sampling, New York, Springer (Springer Series in Statistics), 245 pages.
Thank you for your attention!