Estimation of preliminary unemployment rates by means of multiple imputation. UN/ECE Work Session on Data Editing, Vienna, April 2008. Thomas Burg, Statistics Austria.
S T A T I S T I K A U S T R I A April
Outline: Description of the problem; Methods of estimation; Preliminary estimation using MI; Results.
Timeliness of results: Today, policy makers want to receive results as early as possible. This is challenging for official statistics, because final results are available only after field work is completed. Can figures be provided earlier?
Austrian Labor Force Survey: The survey is performed quarterly on a rotating sample of households; every quarter, one fifth of the sample is exchanged. Data collection is spread over the 13 weeks of a quarter, and respondents are asked about their labour status with reference to the preceding week. Most important figures: unemployment rates.
Situation during field work: At the end of the quarter, estimation is based on the data available on the first day after the quarter ends.
The problem: Is it possible to estimate preliminary unemployment rates on the basis of the data already received? Available data: ~70% of records; missing records: ~30%. Target: unemployment figures.
Missing records: For missing records, not everything is missing. Rotating sample: basic socio-demographic information (age, sex, etc.). Information from the sampling frame: assumed household size, residence, etc.
Estimation methods: (1) Weighting on the basis of the available data, using a raking procedure involving marginal distributions of the Austrian population. (2) Imputing labour status for the records still to come; this requires an assumption about the set of records expected.
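The raking step named above can be illustrated with a minimal sketch of iterative proportional fitting over two margins. The choice of margins (sex and age group), the function and argument names, and the toy targets are all assumptions made for illustration; the slides do not say which margins or software Statistics Austria uses.

```python
import numpy as np

def rake(weights, sex, age_group, target_sex, target_age, max_iter=100, tol=1e-9):
    """Iterative proportional fitting (raking) over two margins.

    All argument names are hypothetical. Every category in the targets is
    assumed to occur at least once in the sample.
    """
    w = np.asarray(weights, dtype=float).copy()
    sex = np.asarray(sex)
    age_group = np.asarray(age_group)
    for _ in range(max_iter):
        # Rescale so the weighted totals per sex match the population totals.
        for cat, total in target_sex.items():
            mask = sex == cat
            w[mask] *= total / w[mask].sum()
        # Rescale so the weighted totals per age group match as well.
        for cat, total in target_age.items():
            mask = age_group == cat
            w[mask] *= total / w[mask].sum()
        # The age margin is now exact; stop once the sex margin has settled too.
        drift = max(abs(w[sex == cat].sum() - total)
                    for cat, total in target_sex.items())
        if drift < tol:
            break
    return w

# Toy data: six received records with design weight 1 and made-up targets.
sex = ["m", "m", "f", "f", "m", "f"]
age_group = ["young", "old", "young", "old", "old", "young"]
raked = rake([1.0] * 6, sex, age_group,
             target_sex={"m": 480.0, "f": 520.0},
             target_age={"young": 400.0, "old": 600.0})
```

After raking, the weighted counts of the received records reproduce both sets of population totals, so the available ~70% of records stand in for the whole sample.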
Imputing labour status: Available data ~70%, missing records ~30%. To impute values on a record, I definitely need a record on which I can impute! These records are built from information from prior rotations and from the sampling frame.
Multiple imputation: Not very common in official statistics, where one prefers authentic databases with stored values. Multiple imputation focuses instead on concrete estimation problems => here I have a concrete estimation problem!
Multiple imputation – single imputation step, analysis (I): Labour status has 4 possible values (1 = 'employed', 2 = 'unemployed', 3 = 'not relevant for employment', 4 = 'military person'). Distributional differences in labour status between known and expected records were analysed using poststratification by sex, age group, and citizenship.
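With the four status codes above, a weighted unemployment rate could be computed along the following lines. This is only a sketch under an assumption: the rate is taken as unemployed divided by employed plus unemployed, with codes 3 and 4 left out of the labour force. The slides do not state how code 4 is treated in the official figures, and the function name is hypothetical.

```python
EMPLOYED, UNEMPLOYED, NOT_RELEVANT, MILITARY = 1, 2, 3, 4

def unemployment_rate(statuses, weights):
    """Weighted unemployment rate from coded labour status.

    Assumption: labour force = employed + unemployed; codes 3 and 4 are
    excluded from both numerator and denominator.
    """
    employed = sum(w for s, w in zip(statuses, weights) if s == EMPLOYED)
    unemployed = sum(w for s, w in zip(statuses, weights) if s == UNEMPLOYED)
    labour_force = employed + unemployed
    if labour_force == 0:
        raise ValueError("no employed or unemployed records")
    return unemployed / labour_force

# Made-up example: five persons with their (hypothetical) weights.
example_rate = unemployment_rate([1, 1, 2, 3, 4], [100, 100, 50, 80, 10])
```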
Multiple imputation – single imputation step, analysis (II): Results also depended on the quarter. Even after incorporating the quarter, the figures were not satisfactory => there must be an additional factor => including the weight of a person delivered the desired result.
Multiple imputation – single imputation step
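A single imputation step of this kind can be sketched as a random draw of labour status from the received records in the same cell. This is a simplified stand-in, not the model used in the presentation: the cell is assumed to be any hashable key (for example a tuple of sex, age group, citizenship, and quarter), the person weight is ignored, and the function name is hypothetical.

```python
import random
from collections import defaultdict

def impute_once(received, missing_cells, rng):
    """Draw one labour status per missing record.

    received:      list of (cell, status) pairs for records already collected
    missing_cells: list of cells, one per record still to come
    rng:           a random.Random instance, so that draws are reproducible
    """
    donors = defaultdict(list)
    for cell, status in received:
        donors[cell].append(status)
    all_statuses = [status for _, status in received]
    imputed = []
    for cell in missing_cells:
        # Draw from the empirical distribution in the same cell; fall back
        # to all received records when the cell has no donors.
        pool = donors.get(cell) or all_statuses
        imputed.append(rng.choice(pool))
    return imputed

# Made-up example with cells given as (sex, age group) tuples.
received = [(("f", "15-24"), 1), (("f", "15-24"), 1),
            (("m", "25-54"), 2), (("m", "25-54"), 2)]
missing_cells = [("f", "15-24"), ("m", "25-54"), ("m", "55+")]
draws = impute_once(received, missing_cells, random.Random(2008))
```

Repeating the draw with different random states gives the several completed data sets that multiple imputation works with.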
Multiple imputation: Multiple imputation smooths out the variability of the estimators.
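The smoothing comes from averaging over several imputed data sets. A minimal sketch using the standard combining rules for multiple imputation (Rubin's rules) is given below; the slides do not say which combining rule was applied, so this is an assumption, and the function name and example numbers are made up.

```python
from statistics import mean, variance

def combine_rubin(estimates, within_variances):
    """Combine M point estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = mean within-imputation variance + (1 + 1/M) * between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(within_variances)
    between = variance(estimates)  # sample variance with divisor M - 1
    total = within + (1 + 1 / m) * between
    return pooled, total

# Made-up example: three imputed data sets giving three unemployment rates.
pooled_rate, total_var = combine_rubin([0.040, 0.044, 0.042], [1e-6, 1e-6, 1e-6])
```

The pooled estimate is the plain average of the individual estimates, which is why a single unlucky draw has less influence than it would under single imputation.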
Results (I): Results of the MI estimation of preliminary figures compared with the real data.
Results (II): Comparison of estimates of the unemployment rate: MI, grossing up, and real data.
Conclusions – critical remarks: Multiple imputation is a possible estimation strategy for preliminary figures. The assumptions concerning the expected records are problematic. The time series available so far are still very short.