© John M. Abowd 2005, all rights reserved Assessing Data Quality John M. Abowd April 2005.



Outline
  Summary
  Primary collection objective
  Secondary data analysis
  In-scope population(s)
  Sampling frame(s)
  Completed data
  Benchmarking
  Publications

Summary
  Data quality can be given a precise definition (simplified here).
  For a given hypothesis, the utility associated with a given data collection is a function of the accuracy, -abs(bias), and the precision, inv(var), for a given estimand Q.
  The resources associated with generating the estimand Q can be used to reduce bias or increase precision.
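The two quality measures named on this slide can be sketched in a few lines. This is only an illustration, assuming the true value of Q is known and that several replicate estimates of it are available; the function names, the replicate values, and the use of the population variance formula are assumptions made for the example, not part of the slides.

```python
from statistics import mean, pvariance


def accuracy(estimates, true_q):
    """Accuracy as written on the slide: -abs(bias)."""
    bias = mean(estimates) - true_q
    return -abs(bias)


def precision(estimates):
    """Precision as written on the slide: inverse of the variance.

    Uses the population variance of the replicates (an assumption).
    """
    return 1.0 / pvariance(estimates)


# Hypothetical replicate estimates of an estimand whose true value is 10.
estimates = [9.0, 11.0, 10.0, 12.0]
acc = accuracy(estimates, true_q=10.0)
prec = precision(estimates)
print(acc, prec)
```

Values of accuracy closer to zero mean less bias, and larger values of precision mean less variance, so both measures increase as quality improves.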

Efficient data quality
  [Figure: axes labeled Precision [1/V] and Accuracy [-abs(b)]; regions labeled "Too much accuracy, not enough precision" and "Too much precision, not enough accuracy"]

Primary Data Collection
  Bias is controlled by
    – Instrument design
    – Sampling frame maintenance
    – Editing
    – Confidentiality protections
  Precision is controlled by
    – Sampling scheme
    – Complexity of the concept being measured

Secondary Data Analysis
  Most bias considerations are hidden in the preparation of the analysis file
    – Non-response
    – Item missing data and editing
    – Accuracy of the instrument
  Most precision gains are illusory
    – Sampling design is given
    – Assessments of precision often ignore design
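One way to see why ignoring the design can make precision look better than it is: compare the variance of a sample mean computed as if the data were a simple random sample with the variance computed from cluster means. The sketch below is a hedged illustration only. It assumes equal-sized clusters selected with a negligible sampling fraction, and the data and function names are hypothetical.

```python
from statistics import mean, variance


def naive_variance_of_mean(clusters):
    """Variance of the sample mean, treating every observation as an
    independent draw (i.e. ignoring the clustered design)."""
    values = [y for cluster in clusters for y in cluster]
    return variance(values) / len(values)


def cluster_variance_of_mean(clusters):
    """Variance of the sample mean, treating clusters as the sampling units.

    Assumes equal-sized clusters and a negligible sampling fraction.
    """
    cluster_means = [mean(cluster) for cluster in clusters]
    return variance(cluster_means) / len(cluster_means)


# Hypothetical sample: three clusters whose members resemble each other.
clusters = [[1.0, 1.0, 2.0], [5.0, 6.0, 5.0], [9.0, 10.0, 9.0]]
naive = naive_variance_of_mean(clusters)
design_based = cluster_variance_of_mean(clusters)
design_effect = design_based / naive
print(naive, design_based, design_effect)
```

When observations within a cluster are alike, the design effect exceeds 1, so the naive calculation overstates precision.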

QA of In-scope Population
  Recall our discussion of the Decennial Census
    – Original frame
    – Initial Census results
    – Census 2000 Supplemental Survey
    – Independent demographic analysis

QA of Frame Maintenance
  Diverging trends
  Household measures of employment vs. establishment measures of employment
  How do you determine which is correct?

QA of Data Completion
  Basic non-response (missing from frame or sample)
  Item non-response
  Incomplete response
  Confidentiality protections

Benchmarking
  For household surveys: uses independent population estimates
  For establishment surveys: uses independent business size estimates
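The slide says only that benchmarking uses independent estimates. One common way to do this, assumed here for illustration, is a ratio adjustment that rescales survey weights within each cell so that they sum to the independent control total. The cell labels, weights, control totals, and the function name `benchmark_weights` are all hypothetical.

```python
def benchmark_weights(records, control_totals):
    """Rescale weights within each cell so they sum to the control total.

    records: list of (cell, weight) pairs from the survey.
    control_totals: dict mapping cell -> independent estimate for that cell.
    """
    # Sum the current survey weights in each cell.
    weighted_sums = {}
    for cell, weight in records:
        weighted_sums[cell] = weighted_sums.get(cell, 0.0) + weight

    # Apply the cell-level ratio (control total / weighted sum) to each weight.
    return [
        (cell, weight * control_totals[cell] / weighted_sums[cell])
        for cell, weight in records
    ]


# Hypothetical household survey weights and independent population estimates.
records = [
    ("age_under_40", 100.0),
    ("age_under_40", 150.0),
    ("age_40_plus", 200.0),
]
control_totals = {"age_under_40": 300.0, "age_40_plus": 180.0}
adjusted = benchmark_weights(records, control_totals)
print(adjusted)
```

For an establishment survey the same sketch would apply with cells defined by business size class and control totals taken from the independent business size estimates.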

Publications
  How do the factors we have identified here affect official publications?
  Often unknown but very important.