
A Theoretical Framework for Adaptive Collection Designs
Jean-François Beaumont, Statistics Canada
David Haziza, Université de Montréal
International Total Survey Error Workshop, Québec, June 19-22, 2011

Overview
- Selected literature review
- Framework
  - Definition of the problem
  - Choice of quality indicator and cost function
  - Mathematical formulation of the problem
- Solution and discussion
- Conclusion

Literature review: Groves & Heeringa (2006, JRSS, Series A)
- Responsive designs: use paradata to guide changes in the features of data collection in order to achieve higher-quality estimates per unit cost
  - Paradata: data about the data collection process
  - Examples of features: mode of data collection, use of incentives, …
  - Need to define quality and determine quality indicators
  - Two main concepts: phase and phase capacity

Literature review: Groves & Heeringa (2006, JRSS, Series A)
- Phase: period of data collection during which the same set of methods is used
  - Phase 1: gather information about design features
  - Phases 2+: alter features (e.g., subsampling of nonrespondents, larger incentives, …)
- A phase is continued until its phase capacity is reached
  - Judged by the stability of an indicator as the phase matures

Literature review: Schouten, Cobben & Bethlehem (2009, SM)
- Goal: determine an indicator of nonresponse bias as an alternative to response rates
- Proposed a quality indicator, called the R-indicator: $R(\rho) = 1 - 2S(\rho)$
  - The population standard deviation $S(\rho)$ of the response probabilities must be estimated
  - The response probabilities, $\rho_i$, must be estimated using some model
- An issue: the indicator depends on the proper choice of model (choice of auxiliary variables)
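As a rough illustration of the R-indicator described on this slide, the sketch below computes $R = 1 - 2S(\rho)$ from a list of estimated response probabilities. The function name `r_indicator`, the optional design weights and the use of a plain weighted standard deviation (dividing by the total weight rather than by $N - 1$) are assumptions made for this example, not details taken from the slides.

```python
import math


def r_indicator(propensities, weights=None):
    """Return R = 1 - 2 * S(rho) for estimated response probabilities.

    S(rho) is computed here as a weighted standard deviation with the
    total weight as divisor (a simplifying assumption for this sketch).
    """
    if weights is None:
        weights = [1.0] * len(propensities)
    total_weight = sum(weights)
    mean = sum(w * p for w, p in zip(weights, propensities)) / total_weight
    variance = sum(
        w * (p - mean) ** 2 for w, p in zip(weights, propensities)
    ) / total_weight
    return 1.0 - 2.0 * math.sqrt(variance)


# Equal propensities give a perfectly "representative" response (R = 1);
# propensities split between 0 and 1 give the lowest possible value (R = 0).
balanced = r_indicator([0.5, 0.5, 0.5, 0.5])
polarized = r_indicator([0.0, 1.0, 0.0, 1.0])
```

Because the propensities come from a fitted model, the value obtained this way inherits whatever misspecification that model has, which is the issue raised in the last bullet of the slide.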

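Literature review: Schouten, Cobben & Bethlehem (2009, SM)
- Another issue: the indicator does not depend on the variables of interest, but nonresponse bias does
- Maximal bias of the unadjusted estimator of the population mean: $B_m(\rho, y) = \frac{(1 - R(\rho))\,S(y)}{2\bar{\rho}}$, where $R(\rho)$ is the R-indicator, $S(y)$ is the population standard deviation of the variable of interest and $\bar{\rho}$ is the mean response probability
- Two limitations of the maximal bias (and of the R-indicator):
  - The unadjusted estimator is rarely used in practice
  - It depends on proper specification of the response probability model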
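The maximal bias can be illustrated with a small sketch. It assumes the standard approximation that the bias of the unadjusted respondent mean is $\mathrm{Cov}(\rho, y)/\bar{\rho}$, so that the Cauchy-Schwarz inequality gives the bound $S(\rho)S(y)/\bar{\rho} = (1 - R(\rho))S(y)/(2\bar{\rho})$. The function names and the toy data are hypothetical, and population-style divisors are used throughout for simplicity.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def approximate_bias(propensities, y_values):
    """Approximate bias of the unadjusted respondent mean: Cov(rho, y) / mean(rho)."""
    rho_bar = _mean(propensities)
    y_bar = _mean(y_values)
    cov = _mean([(p - rho_bar) * (y - y_bar) for p, y in zip(propensities, y_values)])
    return cov / rho_bar


def maximal_bias(propensities, y_values):
    """Upper bound on the absolute bias: S(rho) * S(y) / mean(rho)."""
    rho_bar = _mean(propensities)
    y_bar = _mean(y_values)
    s_rho = math.sqrt(_mean([(p - rho_bar) ** 2 for p in propensities]))
    s_y = math.sqrt(_mean([(y - y_bar) ** 2 for y in y_values]))
    return s_rho * s_y / rho_bar


# Toy population: the bound holds whatever the relationship between rho and y.
rho = [0.2, 0.4, 0.6, 0.8]
y = [10.0, 30.0, 20.0, 40.0]
bias = approximate_bias(rho, y)
bound = maximal_bias(rho, y)
```

The bound is attained only when the variable of interest is perfectly correlated with the response probabilities, which is why it says little about an estimator that has already been adjusted for nonresponse.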

Literature review: Peytchev, Riley, Rosen, Murphy & Lindblad (2010, SRM)
- Goal: reduce nonresponse bias through case prioritization
- Suggest targeting individuals with lower estimated response probabilities
  - For instance, give them larger incentives or give interviewer incentives
  - Their approach is basically equivalent to trying to increase the R-indicator (or achieving a more balanced sample)
- Recommend using auxiliary variables that are associated with the variables of interest

Literature review: Laflamme & Karaganis (2010, ECQ)
- Development and implementation of responsive designs for CATI surveys at Statistics Canada
- Planning phase: before data collection starts (determination of strategies, analyses of previous data, …)
- Initial collection phase: evaluate different indicators to determine when the next phase should start
- Two responsive design (RD) phases

Literature review: Laflamme & Karaganis (2010, ECQ)
- RD phase 1: prioritize cases (based on paradata or other information) with the objective of improving response rates
  - Increases the number of respondents (desirable)
- RD phase 2: prioritize cases with the objective of reducing the variability of response rates between domains of interest (increasing the R-indicator)
  - Likely reduces the variability of weight adjustments (desirable)

Literature review: Schouten, Calinescu & Luiten (2011, Stat. Netherlands)
- First paper to propose a theoretical framework for adaptive survey designs
- Suggest:
  - Maximizing quality for a given cost; or
  - Minimizing cost for a given quality
- Requires a quality indicator (e.g., overall response rate, R-indicator, maximal bias, …)
  - Which one to use?

Definition of the problem
- Adaptive collection design: any procedure of call prioritization or resource allocation that is dynamic as data collection progresses
  - Uses paradata (or other information) to adapt itself to what is observed during data collection
  - Focus on call prioritization
- Our objective: maximize quality for a given cost
- Context: CATI surveys

Choice of quality indicator
- Focus of the literature: find collection designs that reduce nonresponse bias (or maximize the R-indicator) of an unadjusted estimator
- We think the focus should not be on nonresponse bias. Why?
  - Any bias that can be removed at the collection stage can also be removed at the estimation stage
- We suggest reducing the nonresponse variance of an estimator adjusted for nonresponse

Quality indicator
- Suppose we want to estimate the total:
- Assuming that nonresponse is uniform within cells, an asymptotically unbiased estimator is:
- Quality indicator: the nonresponse variance
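The estimator's formula is not preserved in the transcript. As a hedged illustration of what an estimator "adjusted for nonresponse" under uniform nonresponse within cells can look like, the sketch below implements a standard weighting-class estimator: each respondent's design weight is divided by the weighted response rate of its cell. The function name `adjusted_total` and the record layout are assumptions made for this example; the authors' exact estimator may differ.

```python
from collections import defaultdict


def adjusted_total(sample):
    """Weighting-class estimate of a population total.

    `sample` is a list of (design_weight, cell, responded, y) tuples;
    y is ignored for nonrespondents. Cells without any respondent
    contribute nothing, so they should be collapsed beforehand.
    """
    weight_all = defaultdict(float)
    weight_resp = defaultdict(float)
    for weight, cell, responded, _ in sample:
        weight_all[cell] += weight
        if responded:
            weight_resp[cell] += weight

    total = 0.0
    for weight, cell, responded, y in sample:
        if responded:
            # Estimated response probability in the cell (weighted response rate).
            p_hat = weight_resp[cell] / weight_all[cell]
            total += weight * y / p_hat
    return total


sample = [
    (10.0, "a", True, 2.0),
    (10.0, "a", False, None),
    (5.0, "b", True, 4.0),
    (5.0, "b", True, 6.0),
]
estimate = adjusted_total(sample)
```

In cell "a" only half of the weighted sample responds, so the single respondent's weight is doubled; cell "b" needs no adjustment. The fewer respondents a cell has, the larger and more variable these adjustments become, which is the nonresponse variance the slide proposes as the quality indicator.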

Overall cost
- Overall cost:

Expected overall cost
- Expected overall cost:

Mathematical formulation
- Objective: find the target response probabilities that minimize the nonresponse variance subject to a fixed expected overall cost
- Solution:
- Note: equivalent to maximizing the R-indicator only in a very special scenario
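The variance and cost formulas, and therefore the solution, are not preserved in the transcript. To show the shape of such a constrained problem, the sketch below solves a deliberately simplified toy version: minimize $\sum_g a_g / p_g$ subject to $\sum_g c_g p_g = C$. The coefficients `a` and `c`, the linear cost and the function names are all assumptions made for this example and are not the authors' variance or cost function. Setting the derivative of the Lagrangian to zero gives $p_g \propto \sqrt{a_g / c_g}$, scaled so that the budget is met exactly.

```python
import math


def toy_target_probabilities(a, c, budget):
    """Minimize sum(a_g / p_g) subject to sum(c_g * p_g) == budget.

    Toy model only: closed-form Lagrangian solution p_g ~ sqrt(a_g / c_g).
    """
    scale = budget / sum(math.sqrt(a_g * c_g) for a_g, c_g in zip(a, c))
    probs = [math.sqrt(a_g / c_g) * scale for a_g, c_g in zip(a, c)]
    if any(p > 1.0 for p in probs):
        raise ValueError("budget too large for this toy model: a probability exceeds 1")
    return probs


def toy_variance(a, probs):
    """Objective of the toy problem."""
    return sum(a_g / p_g for a_g, p_g in zip(a, probs))


a = [4.0, 1.0]   # hypothetical variance contributions per cell
c = [1.0, 1.0]   # hypothetical cost per unit of response probability
optimal = toy_target_probabilities(a, c, budget=0.9)
equal = [0.45, 0.45]  # same budget spent evenly across cells
```

In this toy model the target probabilities are equal across cells, and the R-indicator is therefore maximal, only when the ratio $a_g / c_g$ is the same in every cell. That gives a feel for why the two objectives coincide only in a special scenario, without claiming to reproduce the authors' condition.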

Implementation
- Find the effort (number of attempts) necessary to achieve the target response probability
- Procedure: select cases to be interviewed with probability proportional to the effort
- Issues:
  1) Avoid small estimated probabilities, which would lead to an unduly large effort
  2) Might want to ensure that a certain time has elapsed between two consecutive calls
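The procedure on this slide can be sketched as follows. The effort formula itself is missing from the transcript, so the example assumes that call attempts are independent and each succeeds with the same per-attempt probability $q$; the probability of responding within $m$ attempts is then $1 - (1 - q)^m$, and the effort needed to reach a target $p$ is $\log(1 - p) / \log(1 - q)$. That model, the floor `min_per_attempt_prob` and both function names are assumptions made for illustration.

```python
import math
import random


def effort_needed(target_prob, per_attempt_prob, min_per_attempt_prob=0.05):
    """Number of attempts (real-valued) needed to reach `target_prob`.

    Assumes independent attempts with a constant success probability.
    Small per-attempt probabilities are floored to avoid an unduly large effort.
    """
    if not 0.0 <= target_prob < 1.0:
        raise ValueError("target_prob must be in [0, 1)")
    if target_prob == 0.0:
        return 0.0
    q = max(per_attempt_prob, min_per_attempt_prob)
    if q >= 1.0:
        return 1.0
    return math.log(1.0 - target_prob) / math.log(1.0 - q)


def select_case(efforts, rng):
    """Pick one case with probability proportional to its (non-negative) effort."""
    case_ids = list(efforts)
    weights = [max(effort, 0.0) for effort in efforts.values()]
    if sum(weights) <= 0.0:
        raise ValueError("no case has a positive effort")
    return rng.choices(case_ids, weights=weights, k=1)[0]


# Hypothetical cases: (target response probability, per-attempt probability).
cases = {"case_1": (0.75, 0.5), "case_2": (0.60, 0.2), "case_3": (0.60, 0.01)}
efforts = {cid: effort_needed(p, q) for cid, (p, q) in cases.items()}
next_call = select_case(efforts, random.Random(2011))
```

Clipping negative efforts to zero in `select_case` simply takes those cases out of the draw. A scheduler built on this would also need to enforce a minimum delay between two consecutive calls to the same case, which the sketch leaves out.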

Graph of variance vs cost
[Figure: minimum nonresponse variance plotted against expected overall cost]

Revised solution
- Solution of the optimization problem is found before data collection starts
- May be a good idea to revise the solution periodically (e.g., daily)
  - Some parameters might need to be modified
  - Update remaining budget and expected overall cost
  - The revised optimization problem is similar to the initial one

Revised solution
- Solution (same as before):
- Revised target response probability:
- Effort (could be negative):

Conclusion
- Next steps:
  - Simulation study
  - Adapt the theory for practical applications
  - Test in a real production environment
- Which quality indicator? Nonresponse variance? Others?
- Reduction of nonresponse bias: subsampling of nonrespondents
  - Our approach could be used within the subsample

Thanks
- For more information, please contact: Jean-François Beaumont, David Haziza