An Active Collection using Intermediate Estimates to Manage Follow-Up of Non-Response and Measurement Errors
Jeannine Claveau, Serge Godbout and Claude Turmelle
Statistics Canada
International Total Survey Error Workshop
Québec, June 20, 2011
Outline
- Introduction
- Quality Indicators (QI)
- Measure of Impact (MI) Scores
- Future Work
Statistics Canada • Statistique Canada 20/11/2018
Unified Enterprise Surveys (UES)
- UES consists of 58 annual business surveys integrated in terms of content, collection and data processing
- Collects information on enterprise financial variables
- Collection period: February to early October
- Telephone pre-contact used for new units in the sample
- Mail questionnaires for initial data gathering
- Telephone follow-up conducted to collect data from non-respondents and to resolve failed edits
Unified Enterprise Surveys (UES)
- A score function is used to prioritize telephone follow-up for non-response
  - Score based on weighted sampling revenue
- For most of the UES surveys, no score function is used for failed-edit follow-up
- Collection Processing System: Blaise
  - Paradata in Blaise Transaction History files
Integrated Business Statistics Program (IBSP)
- IBSP is under development to redesign and expand UES so that it integrates other enterprise surveys and sub-annual surveys
- Goals:
  - Reduce operating costs
  - Enhance quality assurance
- IBSP will integrate 120 surveys by 2016 (phase 1: 2014)
- Electronic questionnaire (electronic data collection) will be the principal collection mode offered to enterprises
Current UES – Processing Model
- Collection, processing and analysis are run sequentially
- Estimates are produced at the very end only
- Collection ends at a set date
[Diagram: Sampling → Collection → Processing → Analysis → Dissemination]
IBSP – Estimates Model
- Collection, processing and analysis will be run in parallel
- Estimates will be produced and re-run periodically
- Collection could end earlier when a pre-specified quality target has been met
[Diagram: Sampling → Collection, Processing and Analysis (in parallel) → Dissemination]
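The early cut-off rule can be illustrated with a minimal sketch. It assumes that the quality target is a maximum CV per key variable and that CVs are recomputed at each estimation cycle; the variable names, the target values and the choice of CV as the target are hypothetical and are not specified in the slides.

```python
from typing import Optional

def targets_met(cvs: dict[str, float], targets: dict[str, float]) -> bool:
    """True when every key variable's CV is at or below its target."""
    return all(cvs[var] <= target for var, target in targets.items())

def first_stopping_cycle(
    cv_history: list[dict[str, float]], targets: dict[str, float]
) -> Optional[int]:
    """Index of the first estimation cycle at which all targets are met.

    Returns None when no cycle in the history meets the targets, in which
    case collection would run to its scheduled end date.
    """
    for cycle, cvs in enumerate(cv_history):
        if targets_met(cvs, targets):
            return cycle
    return None

# Hypothetical CVs from three periodic estimation runs.
history = [
    {"revenue": 0.08, "expenses": 0.10},
    {"revenue": 0.04, "expenses": 0.07},
    {"revenue": 0.03, "expenses": 0.05},
]
print(first_stopping_cycle(history, {"revenue": 0.05, "expenses": 0.05}))
```

In practice the target could be any of the quality indicators discussed later in the deck; CV is used here only because it is the simplest to state.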
Active Collection
- Role: Manage follow-up of non-response and measurement errors (failed edits)
- Responsive design (Laflamme and Karaganis, 2010) or dynamic adaptive approach (Schouten, Calinescu and Luiten, 2011) that uses data available during collection to modify the collection strategy
- Estimates and quality indicators will be produced periodically throughout collection (e.g., monthly)
- Scores measuring the impact on estimates and on quality indicators are then calculated to allocate and prioritize telephone follow-up
Basic Collection Strategy
[Diagram: starting from the initial sample S, successive designs d0, d1, d2, ..., di-2, di-1 are applied. Each step yields observed non-respondents (NR1, NR2, NR3, ..., NRi-1, NRi) and respondents (R1, R2, R3, ..., Ri-1, Ri), which feed the production of intermediate estimates.]
Parameter and Estimator
- Variables of interest: set of I key variables
- Parameters of interest: [equation not reproduced]
- Stratified expansion estimators: [equation not reproduced]
- Sampling variances (under a stratified Bernoulli design): [equation not reproduced]
- Where i, k and h identify respectively the I variables, the Nh units and the H strata
  - Nh = stratum population size
  - ph = unit sampling probability within stratum
  - nh = stratum sample size
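As an illustration of the quantities named on this slide, here is a minimal sketch of a stratified expansion estimator and its estimated sampling variance. It assumes the textbook Horvitz-Thompson forms under Bernoulli sampling, with weight 1/ph and variance estimator Σh (1 − ph)/ph² Σk yk². The slides' own formulas are not reproduced in this text and may differ, for example by weighting with Nh/nh. The `Stratum` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Stratum:
    p: float                # assumed Bernoulli sampling probability p_h
    y_sample: list[float]   # reported values y_k of the sampled units

def expansion_total(strata: list[Stratum]) -> float:
    """Expansion estimate of the total: each sampled value weighted by 1/p_h."""
    return sum(sum(s.y_sample) / s.p for s in strata)

def bernoulli_variance(strata: list[Stratum]) -> float:
    """Horvitz-Thompson variance estimator under stratified Bernoulli sampling."""
    return sum(
        (1.0 - s.p) / s.p**2 * sum(y * y for y in s.y_sample) for s in strata
    )

def cv(strata: list[Stratum]) -> float:
    """Estimated coefficient of variation of the expansion total."""
    return math.sqrt(bernoulli_variance(strata)) / expansion_total(strata)

# Hypothetical data: one sampled stratum (p = 0.5) and one take-all stratum.
strata = [Stratum(p=0.5, y_sample=[10.0, 20.0]), Stratum(p=1.0, y_sample=[100.0])]
print(expansion_total(strata), bernoulli_variance(strata), cv(strata))
```

A take-all stratum (ph = 1) contributes nothing to the variance, which is why large enterprises in such strata matter for the estimate but not for the sampling CV.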
Non-Response
- Response propensity model: [equation not reproduced]
  - Estimation: Auxiliary data and paradata would be used to estimate response propensities
- Estimation: In case of non-response, we will use either imputation or reweighting to account for missing data
  - Response propensities could be used to form homogeneous imputation or reweighting classes to reduce the non-response bias (Haziza and Beaumont, 2007)
- Stratified expansion estimators: [equation not reproduced]
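The idea of forming reweighting classes from response propensities can be sketched as follows. This is a simplified illustration, not the method of Haziza and Beaumont (2007) in full: it assumes the propensities have already been estimated, forms equal-sized classes from their ranks, and inflates respondent weights by the inverse of the weighted response rate within each class. All function names are hypothetical.

```python
def propensity_classes(propensities: list[float], n_classes: int) -> list[int]:
    """Assign each unit to one of n_classes classes by rank of its propensity."""
    n = len(propensities)
    order = sorted(range(n), key=lambda k: propensities[k])
    classes = [0] * n
    for rank, k in enumerate(order):
        classes[k] = rank * n_classes // n
    return classes

def reweight(
    weights: list[float], responded: list[bool], classes: list[int]
) -> list[float]:
    """Adjust respondent weights by the inverse weighted class response rate.

    Non-respondents receive weight 0. A class with no respondents is left
    unrepresented here; a production system would collapse it with a neighbour.
    """
    total: dict[int, float] = {}
    resp: dict[int, float] = {}
    for w, r, c in zip(weights, responded, classes):
        total[c] = total.get(c, 0.0) + w
        if r:
            resp[c] = resp.get(c, 0.0) + w
    return [
        w * total[c] / resp[c] if r else 0.0
        for w, r, c in zip(weights, responded, classes)
    ]

# Hypothetical example: four units, two propensity classes.
classes = propensity_classes([0.2, 0.9, 0.3, 0.8], n_classes=2)
print(classes)
print(reweight([2.0, 2.0, 2.0, 2.0], [True, True, False, True], classes))
```

Grouping units with similar propensities means the single adjustment factor applied within a class is close to the inverse propensity of each unit in it, which is what limits the non-response bias.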
Quality Indicators (QI)
- Role:
  - Monitor collection progress
  - Help to allocate and prioritize collection efforts
- Can be item-based
  - Specific to a variable of interest
  - Variance, CV
  - Item response rate of a variable of interest
  - Bias
  - MSE
Quality Indicators (QI)
- Can be covariate-based
  - Derived from statistics on the estimated response propensities given the covariates X
  - Independent of the variables of interest
- Examples of covariate-based QIs (Schouten, 2011):
  - Mean response propensity
  - R-indicator
  - Standardized Maximal Bias
  - Standardized Maximal Variance
  - Standardized Maximal MSE
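Three of these covariate-based indicators can be computed directly from the estimated propensities. The sketch below assumes the commonly published definitions: R-indicator R = 1 − 2·S(ρ) and standardized maximal bias S(ρ)/ρ̄, where S(ρ) is the weighted standard deviation of the propensities and ρ̄ their weighted mean. The variance denominator used here (sum of weights rather than N − 1) is a simplifying assumption, and the maximal variance and maximal MSE indicators are left out because their formulas are not given in this text.

```python
import math
from typing import Optional

def covariate_qis(
    propensities: list[float], weights: Optional[list[float]] = None
) -> dict[str, float]:
    """Mean propensity, R-indicator and standardized maximal bias.

    weights are design weights; equal weights are assumed when omitted.
    """
    if weights is None:
        weights = [1.0] * len(propensities)
    w_sum = sum(weights)
    mean = sum(w * p for w, p in zip(weights, propensities)) / w_sum
    var = sum(w * (p - mean) ** 2 for w, p in zip(weights, propensities)) / w_sum
    sd = math.sqrt(var)
    return {
        "mean_propensity": mean,
        "r_indicator": 1.0 - 2.0 * sd,   # 1 means perfectly homogeneous response
        "max_std_bias": sd / mean,       # upper bound on standardized bias
    }

# Hypothetical propensities: heterogeneous response lowers the R-indicator.
print(covariate_qis([0.2, 0.8]))
print(covariate_qis([0.5, 0.5, 0.5]))
```

Both examples have the same mean propensity, so a response rate alone would not distinguish them; the R-indicator and maximal bias do.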
Measure of Impact (MI) Scores
- Types of scores
  - Common types: edit-related and estimate-related score functions
  - Example: predicted difference in estimates (Hedlin, 2008)
- Proposal: Generalize the MI Score to include quality-related score functions
  - For an estimated parameter (estimate or quality indicator)
- Definition: [equation not reproduced]
  - The score compares the estimated parameter with its value after the reported values and/or covariates of unit k are replaced by alternative values, divided by a scaling factor
Measure of Impact (MI) Scores
- MI Score for an estimated total: [equation not reproduced]
  - Requires predicted values to compare to reported values
  - Proposal: Use imputation to obtain predicted values
  - Used to prioritize units for failed-edit follow-up
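A minimal sketch of an MI score for an estimated total follows. It assumes the score is the absolute weighted difference between the reported and predicted value of unit k, divided by a scaling factor, in the spirit of the predicted difference in estimates. The default scaling factor (the current estimated total) is an assumption, since the slide's formula is not reproduced here, and the function names are hypothetical.

```python
from typing import Optional

def mi_scores_total(
    weights: list[float],
    reported: list[float],
    predicted: list[float],
    scale: Optional[float] = None,
) -> list[float]:
    """MI score per unit: w_k * |y_k - yhat_k| / scale.

    predicted holds the imputed (predicted) value for each unit. When scale
    is omitted, the estimated total from the reported values is used.
    """
    if scale is None:
        scale = abs(sum(w * y for w, y in zip(weights, reported)))
    if scale == 0:
        raise ValueError("scaling factor must be non-zero")
    return [
        w * abs(y - y_hat) / scale
        for w, y, y_hat in zip(weights, reported, predicted)
    ]

def follow_up_order(scores: list[float]) -> list[int]:
    """Unit indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)

# Hypothetical data: unit 1 reports a value far from its imputed prediction.
scores = mi_scores_total(
    weights=[10.0, 1.0, 5.0],
    reported=[100.0, 1000.0, 20.0],
    predicted=[110.0, 400.0, 20.0],
)
print(scores, follow_up_order(scores))
```

The weight matters as much as the discrepancy: a small error on a heavily weighted unit can outrank a larger error on a take-all unit.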
Measure of Impact (MI) Scores
- MI Score for item-based quality indicators
  - MI Score for the estimated sampling variance of expansion estimators
  - Specific to a variable of interest
  - Also uses imputation to obtain predicted values
  - Linked directly to the quality of output estimates
  - Prioritizes units for failed-edit follow-up
Measure of Impact (MI) Scores
- MI Score for item-based quality indicator
- MI Score for covariate-based quality indicator
- Used to prioritize units for both non-response and failed-edit follow-up
Active Collection Management
- A large number of variables to monitor
  - Monitoring all of them will be a challenge
  - Not all are equally important
- Identify a limited number of key variables
- For each key variable
  - Quality monitored using item-based QIs and MI Scores
- For the non-key variables
  - Quality controlled using covariate-based QIs
Active Collection Management
- MI scores for each estimated parameter and quality indicator are considered local scores
- To prioritize units for telephone follow-up, a global score per unit is needed
- Derive a global MI Score (Hedlin, 2008)
  - Sum, maximum or Euclidean distance could be used
- Some QIs are appropriate for evaluating the impact of non-response and others for the impact of edit failures
  - Derive one global score for non-response follow-up and one global score for failed-edit follow-up
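The three aggregation options named on this slide can be sketched as follows. The sketch assumes that local scores are non-negative and already on comparable scales; how they are scaled is not specified here. The unit identifiers and function names are hypothetical.

```python
import math

def global_score(local_scores: list[float], method: str = "euclidean") -> float:
    """Combine a unit's local MI scores into one global score."""
    if method == "sum":
        return sum(local_scores)
    if method == "max":
        return max(local_scores)
    if method == "euclidean":
        return math.sqrt(sum(s * s for s in local_scores))
    raise ValueError(f"unknown method: {method}")

def rank_units(local_by_unit: dict[str, list[float]], method: str) -> list[str]:
    """Unit identifiers ordered from highest to lowest global score."""
    return sorted(
        local_by_unit,
        key=lambda unit: global_score(local_by_unit[unit], method),
        reverse=True,
    )

# Hypothetical local scores for three units on two estimated parameters.
local = {"A": [3.0, 4.0], "B": [6.0, 0.0], "C": [1.0, 1.0]}
for method in ("sum", "max", "euclidean"):
    print(method, rank_units(local, method))
```

The choice of aggregation changes priorities: the sum favours unit A, which has moderate impact on both parameters, while the maximum and the Euclidean distance favour unit B, which has a large impact on a single parameter.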
Control Quality with Covariate-Based QIs
- Goal: Increase the average of the response propensities while improving their homogeneity.
Summary: Current Approach vs. Proposed Approach
- Follow-up and editing
  - Current: A score function with no link with estimates; prioritization based on frame (static) information
  - Proposed: Follow-up and editing for influential units based on estimates and quality; prioritization based on frame, paradata (dynamic) and estimates
- Processing
  - Current: Results (and quality measures) are known only at the end of the process
  - Proposed: Produce results (and quality measures) during collection to manage collection
- Cut-off of collection
  - Current: Based on weighted response rate
  - Proposed: Based on achieved quality of estimates
Summary
- Quality Indicators (QI)
  - Quality (accuracy) specific to a domain and an estimate
  - Monitor collection and analysis progress
  - Proactively identify problems
  - Assess quality of produced estimates
  - Close active collection
- Measure of Impact (MI) Scores
  - Impact of a unit on an estimate or on a quality indicator
  - Allocate and prioritize collection and analysis efforts
  - Non-response and failed-edit follow-up
Summary
- Covariate-based QIs
  - Independent of survey variables
  - Used for all variables
  - Mean response propensity
  - R-indicator
  - Standardized Maximal Bias, Variance and MSE
  - Other…
- Item-based QIs
  - Related to survey variables
  - Used with MI Scores to monitor specified key variables
  - Item response rate
  - Variance, CV
  - Bias, MSE
Summary: MI Scores
[Diagram: local MI scores are combined into one global score for non-response follow-up and one for failed-edit follow-up. Parameters shown: item response rate; mean response propensity; R-indicator; variance, CV; item-based MSE; Standardized Maximal Bias, Variance and MSE; estimated total; estimated sampling variance.]
Future Work
- Methodology development
  - Response propensity model: development of a model based on data and paradata
  - Item-based and covariate-based QIs
- Validation of the proposed strategy
  - Conduct simulation studies and develop prototypes using the current UES environment
  - Summer 2011 prototype: response rates, imputation rate, CV and MI scores
  - Next prototype: other local and global MI scores and QIs
Discussion
- What quality indicators are appropriate to measure the risk of potential bias in the estimates?
- What is the best way to use a quality indicator (e.g., the R-indicator) to monitor collection for highly skewed business surveys?
- The proposed approach obviously affects the response propensities throughout collection. Although we can adjust the estimator later on to take this into account, is it something we should move away from? Or should we take advantage of it?
- In the proposed approach, are there any additional aspects that should be considered?
Thank You (Merci)
For more information, please contact:
- Jeannine Claveau, jeannine.claveau@statcan.gc.ca
- Serge Godbout, serge.godbout@statcan.gc.ca
- Claude Turmelle, claude.turmelle@statcan.gc.ca