Two Approaches to the Use of Administrative Records to Reduce Respondent Burden and Data Collection Costs John L. Eltinge Office of Survey Methods Research.

Presentation transcript:

Two Approaches to the Use of Administrative Records to Reduce Respondent Burden and Data Collection Costs
John L. Eltinge
Office of Survey Methods Research, U.S. Bureau of Labor Statistics
12th Meeting of the Group of Experts on Business Registers, September 14-15, 2011

Acknowledgements and Disclaimer
The author thanks Tony Barkume, Rick Clayton, Mike Davern, Bob Fay, Jenna Fulton, Gerry Gates, Pat Getz, Bill Iwig, Shelly Martinez, Bill Mockovak, Polly Phipps, John Ruser, and members of the FCSM Subcommittee on Administrative Records for many helpful discussions of the topics considered here.
The views expressed here are those of the author and do not necessarily reflect the policies of the U.S. Bureau of Labor Statistics, nor those of the FCSM Subcommittee on Statistical Uses of Administrative Records.

Overview
I. Conceptual Background
II. Two Approaches to Integration of Survey and Administrative Record Data
  A. Survey Core
  B. Administrative Record Core
III. Methodological Issues
IV. Empirical Issues
V. Managerial Issues

I. Conceptual Background
A. Primary Question
  For a specified resource base, can we improve the balance of quality/cost/burden/risk in official statistics by integrating survey and administrative record data?

I. Conceptual Background (continued)
B. Possible Example: U.S. Consumer Expenditure Survey
  1. Goal: Collect data on a wide range of consumer expenditures and related demographics
  2. Current approach
    a. Household sample survey with a complex design
    b. Personal visit and telephone collection

I. Conceptual Background (continued)
  3. Issues regarding cost and perceived burden (60+ minutes average interview time; cognitive complexity)
  4. BLS is currently exploring a wide range of redesign options
  5. Prospective use of administrative-record data: what is its long-term impact on the balance of quality, cost, burden and risk?

I. Conceptual Background (continued)
C. Possible Cases
  1. Sales data from retailers, other sources
    - Aggregated across customers, by item
    - Possible basis for imputation of missing items or disaggregation of global reports
  2. Collection of some data (with permission) through administrative records (e.g., grocery loyalty cards), linked with sample consumer units
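As a rough illustration of the "disaggregation of global reports" idea in case C.1, the sketch below splits one household's global report across items in proportion to aggregated retailer sales. The function name, item categories and amounts are hypothetical, and proportional allocation is only one simple working assumption, not a method stated in the slides.

```python
from typing import Dict


def disaggregate_global_report(
    global_amount: float, item_sales: Dict[str, float]
) -> Dict[str, float]:
    """Allocate a single reported total across items using aggregate sales shares.

    global_amount: one respondent's global report (e.g. "total grocery spending").
    item_sales: retailer sales aggregated across customers, by item (hypothetical).
    """
    total_sales = sum(item_sales.values())
    if total_sales <= 0:
        raise ValueError("aggregate sales must sum to a positive amount")
    # Each item receives the share of the report that it holds in aggregate sales.
    return {
        item: global_amount * sales / total_sales
        for item, sales in item_sales.items()
    }


# Hypothetical aggregated retailer data and one household's global report.
retailer_sales = {"dairy": 200.0, "produce": 300.0, "bakery": 500.0}
allocation = disaggregate_global_report(120.0, retailer_sales)
```

The allocation preserves the respondent's reported total by construction; whether aggregate shares are a reasonable proxy for any one consumer unit is exactly the kind of question raised later under aggregation effects.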

I. Conceptual Background (continued)
D. Framework: Population of Consumer Purchases Defined by Cross-Classification of:
  - Classification of product/service, time, geography
  - Characteristics of purchaser (Consumer? Demographics?)
  - Admin: Outlet, intermediaries (financial, other)
E. How to modify estimation methods to incorporate administrative data?
  - Weighting and imputation for CPI cost weights, commonly produced tables
  - Construction of public-use datasets

II. Two Approaches to Integration of Survey and Administrative Record Data
A. Survey Core
  1. Relatively standard sample survey design
    a. Possible use of administrative record data for frames, selection probabilities, weights
    b. Primary data collection through standard survey methodology
  2. Supplement survey data with administrative records
    a. Problematic variables (burden, data quality)
       Current example: U.S. National Immunization Survey
    b. Quality checks (microdata or aggregate levels)
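One familiar way to use administrative data for weights under a survey-core design is a ratio adjustment to a known administrative total. The following is a minimal sketch, assuming the administrative total of an auxiliary variable x is known without error and that x is also observed for each sampled unit. All names and numbers are illustrative and do not come from the presentation.

```python
from typing import Sequence


def ratio_estimate_total(
    y: Sequence[float],
    x: Sequence[float],
    weights: Sequence[float],
    admin_total_x: float,
) -> float:
    """Ratio estimator of the population total of y.

    y: survey variable of interest for sampled units.
    x: auxiliary variable for the same units.
    weights: design weights (inverse selection probabilities).
    admin_total_x: population total of x taken from administrative records.
    """
    weighted_y = sum(w * yi for w, yi in zip(weights, y))
    weighted_x = sum(w * xi for w, xi in zip(weights, x))
    if weighted_x == 0:
        raise ValueError("weighted auxiliary total is zero; ratio undefined")
    # Scale the survey-based ratio y/x up to the administrative total of x.
    return admin_total_x * weighted_y / weighted_x


# Hypothetical sample of three units, each with design weight 100.
estimate = ratio_estimate_total(
    y=[10.0, 20.0, 30.0],
    x=[5.0, 10.0, 15.0],
    weights=[100.0, 100.0, 100.0],
    admin_total_x=3300.0,
)
```

The gain from such an adjustment depends on how strongly y and x are related and on the quality of the administrative total, which is why the evaluation of prospective sources matters before they enter the weights.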

II. Two Approaches to Integration of Survey and Administrative Record Data (Continued)
B. Administrative Record Core
  1. Access administrative record data (at microdata or partially aggregated levels)
  2. Per Lessler (2006), supplement as needed for inferential goals
    a. Fill in for incomplete population unit coverage
    b. Collect variables not captured in administrative records
    c. Adjust for data quality issues (e.g., timeliness or aggregation effects)

II. Two Approaches to Integration of Survey and Administrative Record Data (Continued)
C. Design Features
  1. Design features generally differ substantially between the survey-core and administrative-record-core approaches
  2. Need to consider both methodological and managerial components of design

III. Methodological Issues
Comparison and contrast of the “Survey Core” and “Administrative Record Core” approaches will involve a wide range of methodological issues.
A. Methods for Evaluation of Properties of Prospective Administrative Record Sources
  1. Population aggregates (means, totals)
  2. Variable relationships (regression, GLM)
  3. Cross-sectional and temporal stability of (1) and (2)

III. Methodological Issues (Continued)
B. Methods for Integration of Sample Survey and Administrative Record Data: Adaptation of Methods from:
  1. Partitioned designs (“multiple matrix sampling”) in education, health statistics
  2. Multiple-frame designs (e.g., Lohr and Rao, 1999, 2003)
    - Frames may capture subpopulations through fundamentally different classification structures
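To make the multiple-frame idea concrete, here is a minimal sketch of the classical Hartley dual-frame estimator of a total, which combines two overlapping frames by splitting the overlap domain with a mixing weight theta. It is offered as a textbook illustration rather than as the method the slides have in mind. The interpretation of frame A as a survey frame and frame B as an administrative list, and all the numbers, are assumptions.

```python
def hartley_dual_frame_total(
    total_a_only: float,
    total_overlap_from_a: float,
    total_overlap_from_b: float,
    total_b_only: float,
    theta: float = 0.5,
) -> float:
    """Hartley-type dual-frame estimator of a population total.

    total_a_only: estimated total for units covered only by frame A.
    total_overlap_from_a: overlap-domain total estimated from the frame A sample.
    total_overlap_from_b: overlap-domain total estimated from the frame B sample.
    total_b_only: estimated total for units covered only by frame B.
    theta: weight in [0, 1] given to the frame A estimate of the overlap domain.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    # The overlap domain is estimated twice; theta averages the two estimates
    # so that overlap units are not double counted.
    overlap = theta * total_overlap_from_a + (1.0 - theta) * total_overlap_from_b
    return total_a_only + overlap + total_b_only


# Hypothetical domain estimates from the two frames.
dual_frame_estimate = hartley_dual_frame_total(
    total_a_only=400.0,
    total_overlap_from_a=900.0,
    total_overlap_from_b=1100.0,
    total_b_only=250.0,
    theta=0.5,
)
```

The sketch presumes that each unit can be assigned to a domain (A only, overlap, B only). The slide's caveat about fundamentally different classification structures is a warning that this assignment may itself be error-prone when one frame is administrative.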

III. Methodological Issues (Continued)
C. Importance of Clarity on Sources of Variability Considered in Evaluation of Bias, Variance and Other Properties
  1. Sources:
    - Superpopulation effects
    - Sample design (e.g., subsampling, matching)
    - Unit, wave and item missingness or time lags
    - Aggregation effects (temporal, cross-sectional)
    - Reporting error (definitional, temporal, other)
    - Imputation effects (including model lack of fit)
  2. Conditioning and integration

III. Methodological Issues (Continued)
D. Working Model for Methodological Properties
  X = Frame, weight information
  Y = Sample survey data
  Z = Additional administrative record data
  Properties of estimator based on variability from:
  1. Population structure
  2. Administrative and survey collection processes (“filters”)
  3. Homogeneity of (1) and (2) across cases

III. Methodological Issues (Continued)
E. Formal Evaluation of Properties
  - Evaluate expected mean squared error with respect to each component in (D.1) and (D.2)
  - Current information available at conceptual, empirical levels?
  - Critical importance of understanding the underlying processes for collection and reporting of administrative data
    Ex: Propensity of a household or business to provide informed consent to link?
    Ex: Homogeneity of data quality characteristics over time?
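One way to write out the evaluation in (E) is the standard conditional decomposition of mean squared error. The notation is an assumption introduced here for illustration and does not appear in the slides: $\hat{\theta}$ is an estimator built from $(X, Y, Z)$, $\theta$ is the target quantity (a function of the population), $P$ denotes the population structure in (D.1), and $F$ denotes the collection processes or "filters" in (D.2).

```latex
\begin{align*}
\mathrm{MSE}(\hat{\theta})
  &= E_{P}\, E_{F}\!\left[ (\hat{\theta} - \theta)^{2} \,\middle|\, P \right] \\
  &= E_{P}\!\left[ \mathrm{Var}_{F}(\hat{\theta} \mid P) \right]
   + E_{P}\!\left[ \bigl( E_{F}(\hat{\theta} \mid P) - \theta \bigr)^{2} \right].
\end{align*}
```

The first term is the variance contributed by the collection processes, averaged over populations. The second is the squared conditional bias, which is where effects such as differential consent propensities or drifting data quality would be expected to appear.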

III. Methodological Issues (Continued)
F. Prior Literature (Examples)
  - Davern (2007, 2009)
  - Demers (2009)
  - Federal Committee on Statistical Methodology (1980)
  - Fulton et al. (2009)
  - Herzog, Winkler and Scheuren (2007)
  - Jabine and Scheuren (1985)
  - Jeskanen-Sundstrom (2007)
  - Ord and Iglarsh (2007)
  - Penneck (2007)
  - Royce (2007)
  - Winkler (2009)

III. Methodological Issues (Continued)
G. Prior literature: Two concepts of data quality
  1. Per Davern (2007), extend usual ideas of “total survey error” (TSE) to administrative data:
     (Estimator) – (True value) = (frame error) + (sampling error) + (incomplete-data effects) + (measurement error) + (processing effects)

III. Methodological Issues (Continued)
  2. Broader definitions of data quality, e.g., Brackstone (1999): Accuracy (all components of TSE) AND: Timeliness, Relevance, Interpretability, Accessibility, Coherence
  3. Risk: Degradation in any component of data quality
    a. Aggregate risk: Historical focus of quantitative work
    b. Systemic risk: Often important for statistical programs
      - cf. “complex and tightly coupled systems” (Perrow, 1984, 1999)

III. Methodological Issues (Continued)
H. Cost Structures
  1. Statistical products (including surveys and administrative records) require substantial investments (often in intangibles)
    a. Data originators:
      - Initial administrative purpose
      - Accommodate statistical agency (data quality, learning curve, systems)
    b. Statistical agencies:
      - Learning curves
      - Systems for acquisition, edit/imputation
      - Disclosure limitation

III. Methodological Issues (Continued)
  2. Broad acknowledgement of substantial costs
  3. Less empirical information generally available on:
    a. Relative magnitudes of specific cost components
    b. Extent of homogeneity of results from (a) with respect to:
      - Type of administrative/business organization
      - Type of administrative records
      - Subpopulation
      - Other factors

III. Methodological Issues (Continued)
  4. Level of precision available on cost information:
    a. Purely qualitative
    b. Order of magnitude
    c. Relatively precise
  5. Practical uses of cost information
    a. Qualitative decisions among options
    b. Fine-tuning specific procedures
  6. Sources of information (F. LaFlamme, 2008)
    a. Special studies (risks: Hawthorne effects, incomplete accounting)
    b. Cost-recovery contract accounting

III. Methodological Issues (Continued)
I. Burden
  1. Respondent burden
    a. Elapsed time for collection, related activities
    b. Cognitive complexity
    c. Perceived sensitivity
    d. Informed consent
      - Direct access and linkage with survey
      - Obtained during original administrative-record work?

III. Methodological Issues (Continued)
  2. Organizational burden
    a. Informed consent
    b. Record linkage
    c. Data management
    d. Data quality evaluation and adjustment
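The interplay of informed consent and record linkage can be sketched with a simple deterministic link on a shared identifier, restricted to consenting units. This is a minimal illustration under strong assumptions: a clean common identifier exists, consent is recorded on the survey side, and all field names and values are hypothetical. Production linkage typically requires probabilistic matching and disclosure safeguards.

```python
from typing import Dict, List, Optional


def link_with_consent(
    survey_units: List[dict], admin_records: Dict[str, float]
) -> List[dict]:
    """Attach an administrative value to each survey unit that consented.

    survey_units: dicts with an "id" and a boolean "consent" field (hypothetical).
    admin_records: mapping from unit id to an administrative amount (hypothetical).
    """
    linked = []
    for unit in survey_units:
        amount: Optional[float] = None
        if unit["consent"]:
            # Consenting units may still fail to link if no record exists.
            amount = admin_records.get(unit["id"])
        linked.append({**unit, "admin_amount": amount})
    return linked


def link_rate(linked_units: List[dict]) -> float:
    """Share of survey units that ended up with an administrative value."""
    if not linked_units:
        return 0.0
    matched = sum(1 for u in linked_units if u["admin_amount"] is not None)
    return matched / len(linked_units)


survey_units = [
    {"id": "A1", "consent": True},
    {"id": "B2", "consent": False},  # record exists but may not be used
    {"id": "C3", "consent": True},   # consented, but no record on file
]
admin_records = {"A1": 52.10, "B2": 18.75}
linked = link_with_consent(survey_units, admin_records)
rate = link_rate(linked)
```

Both failure paths (no consent, no matching record) leave the unit unlinked, so the resulting missingness mixes respondent behaviour with coverage of the administrative source, and the two would generally need separate treatment in estimation.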

IV. Empirical Issues
A. Properties of Input Data and Final Estimators
B. Cost Structures
  1. Obtaining data:
    - Contractual costs with provider
    - Agency personnel (expertise)
  2. Modification and maintenance of production systems
C. Case studies are important, but may not allow inference to broader populations, variables

V. Managerial Issues
A. Central Issue: Management of Costs and Risks
  - Methodological risks (commonly studied)
  - Operational risks (“execution risks”)
B. Contractual Structure:
  1. Performance Criteria and Incentives for Data Provider (Timely Delivery, Quality, Notice on Changes)
  2. Stability of Prospective Sources (AOL in 1999, 2011)
  3. Changes in Agency Requests (New Products, New Channels)
C. Agency Personnel: Skills, Incentives and Institutional Culture

V. Managerial Issues (Continued)
D. Contrast Between
  1. Incremental risks (per standard statistical methodology)
  2. Systemic risks
     cf. literature from Perrow (1984, 1999) and others on risks in “complex and tightly coupled systems”

VI. Closing Remarks
A. Design Issues in Integration of Survey and Administrative Record Data
B. Goal: Improve Balance Among Quality, Cost, Burden and Risk
C. Distinction Between “Survey Core” and “Administrative Record Core” Approaches
D. Impact on Methodological Design and Management Design
E. Importance of Development of a Spectrum of Empirical Results

Contact Information
John L. Eltinge
Associate Commissioner, Office of Survey Methods Research
U.S. Bureau of Labor Statistics