Performance of Resampling Variance Estimation Techniques with Imputed Survey data.


Two resampling variance estimators are compared: the jackknife variance estimator based on adjusted imputed values, proposed by Rao and Shao (1992), and the bootstrap procedure proposed by Shao and Sitter (1996).
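To make the idea of "adjusted imputed values" concrete, here is a minimal sketch of the adjusted jackknife for the simplest setting: the sample mean under mean imputation, with equal weights and no finite population correction. These simplifications, and the function names, are assumptions made for illustration and are not taken from the slides. When a respondent is deleted, the imputed values are recomputed from the remaining respondents; when a non-respondent is deleted, they are left unchanged. A naive jackknife that treats imputed values as observed is included for contrast.

```python
def adjusted_jackknife_mean_imputation(y):
    """Jackknife variance of the mean-imputed sample mean, with imputed
    values adjusted in each delete-one replicate. Missing values are None.
    Simplified: equal weights, no finite population correction."""
    n = len(y)
    respondents = [v for v in y if v is not None]
    r = len(respondents)
    total_resp = sum(respondents)
    y_bar_r = total_resp / r
    theta_full = sum(y_bar_r if v is None else v for v in y) / n

    sq_dev = 0.0
    for j in range(n):
        if y[j] is not None:
            # A respondent is deleted: re-impute from the other respondents.
            fill = (total_resp - y[j]) / (r - 1)
        else:
            # A non-respondent is deleted: imputed values stay as they were.
            fill = y_bar_r
        rest = [fill if y[i] is None else y[i] for i in range(n) if i != j]
        theta_j = sum(rest) / (n - 1)
        sq_dev += (theta_j - theta_full) ** 2
    return (n - 1) / n * sq_dev


def naive_jackknife(y_imputed):
    """Ordinary delete-one jackknife that treats imputed values as real data."""
    n = len(y_imputed)
    total = sum(y_imputed)
    theta = total / n
    return (n - 1) / n * sum(((total - v) / (n - 1) - theta) ** 2 for v in y_imputed)


y = [1.0, 2.0, 3.0, None]
v_adjusted = adjusted_jackknife_mean_imputation(y)
v_naive = naive_jackknife([1.0, 2.0, 3.0, 2.0])  # same data after mean imputation
```

In this toy example the naive estimate is smaller than the adjusted one, which is the kind of understatement the adjustment is meant to correct.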

Performance: We carry out a Monte Carlo study. For each replication, we compute the relative bias, the relative mean square error, and the 95% confidence interval based on the normal distribution.
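The three performance measures can be sketched as follows. The slides do not give formulas, so the definitions here are assumptions: the reference ("true") variance is taken to be the Monte Carlo variance of the point estimates, the relative mean square error is scaled by the square of that variance, and coverage is the share of normal-theory intervals that contain the true value. The function name `performance` is hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def performance(estimates, var_estimates, true_value, z=1.96):
    """Summarise a variance estimator over Monte Carlo replications.

    estimates      point estimate from each replication
    var_estimates  variance estimate from each replication
    true_value     known population value of the parameter
    """
    # Assumed reference: Monte Carlo variance of the point estimator.
    true_var = pvariance(estimates)
    rel_bias = (mean(var_estimates) - true_var) / true_var
    rel_mse = mean((v - true_var) ** 2 for v in var_estimates) / true_var ** 2
    # Share of intervals estimate +/- z * sqrt(v) that contain the true value.
    coverage = mean(
        1.0 if abs(e - true_value) <= z * sqrt(v) else 0.0
        for e, v in zip(estimates, var_estimates)
    )
    return rel_bias, rel_mse, coverage


rel_bias, rel_mse, coverage = performance(
    estimates=[9.0, 11.0, 10.0, 10.0],
    var_estimates=[0.25, 0.25, 0.75, 0.75],
    true_value=10.0,
)
```

Other conventions exist (for example, reporting the square root of the relative mean square error), so the numbers are only comparable across studies that use the same definitions.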

Imputation methods: ratio and mean imputation. For each method we consider several fractions of missing data, with and without covariates.
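As a minimal illustration of the two imputation methods, the sketch below fills in missing values (marked `None`) in an unweighted sample. Mean imputation uses the respondent mean; ratio imputation multiplies each missing unit's auxiliary value by the ratio of respondent totals. The function names and the toy data are hypothetical, and survey weights are ignored.

```python
from statistics import mean


def mean_impute(y):
    """Replace each missing value with the mean of the observed values."""
    y_bar = mean(v for v in y if v is not None)
    return [y_bar if v is None else v for v in y]


def ratio_impute(y, x):
    """Replace a missing y_i with R * x_i, where R = sum(y_obs) / sum(x_obs)
    is computed over the responding units only."""
    observed = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    ratio = sum(yi for yi, _ in observed) / sum(xi for _, xi in observed)
    return [ratio * xi if yi is None else yi for yi, xi in zip(y, x)]


turnover = [10.0, None, 30.0, None]   # variable to impute
expenses = [5.0, 10.0, 15.0, 20.0]    # auxiliary variable, fully observed

mean_imputed = mean_impute(turnover)
ratio_imputed = ratio_impute(turnover, expenses)
```

Here the respondent ratio is 40 / 20 = 2, so ratio imputation gives different values to the two missing units, while mean imputation gives both the same value.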

Case 1: Structural Business Survey. Population: the Annual Industrial Business Survey (a complete enumeration of enterprises with 20 or more employees), of size N = 16,438. Variable to impute: turnover. Auxiliary variable: total expenses.

Case 2: Retail Trade Index Survey. Population: a sample of businesses from the Retail Trade Index Survey, of size N = 9,414. Variable to impute: turnover. Auxiliary variable: turnover in the same month of the previous year.

Case 1: Monte Carlo study. Simple random samples without replacement of sizes n = 100, 500, 1000 and 5000. Non-response in the turnover variable is randomly generated (uniform response mechanism). A loss of about 30% is simulated.
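One replication of this design might be generated as in the sketch below. The population is synthetic and purely illustrative (the real survey data are not available here), and the uniform response mechanism is assumed to mean that each sampled unit is missing independently with probability 0.30. The function name and parameters are hypothetical.

```python
import random


def draw_sample_with_nonresponse(population, n, missing_rate=0.30, seed=None):
    """Draw a simple random sample without replacement of (y, x) pairs and
    set y to None independently with probability missing_rate."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)  # sampling without replacement
    return [(None if rng.random() < missing_rate else y, x) for y, x in sample]


# Hypothetical skewed population of N = 16,438 (turnover, expenses) pairs.
pop_rng = random.Random(1)
population = []
for _ in range(16438):
    expenses = pop_rng.lognormvariate(0.0, 1.0)
    turnover = 1.2 * expenses * pop_rng.lognormvariate(0.0, 0.3)
    population.append((turnover, expenses))

sample = draw_sample_with_nonresponse(population, n=1000, seed=42)
missing_fraction = sum(1 for y, _ in sample if y is None) / len(sample)
```

The auxiliary variable is kept for every unit, so ratio imputation can be applied to each replicate.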

Case 2: Monte Carlo study. Stratified random samples without replacement of sizes n = 800, 1500, 2200 and 3000. Non-response in the turnover variable is randomly generated (uniform response mechanism). Missing data are generated following a distribution similar to the true missing-value pattern observed in the survey.

Monte Carlo study: the number of replications is 200,000 for each auxiliary variable, imputation method and sample size.

Results (I): The performance of the jackknife variance estimator is better for larger sample sizes and for ratio imputation. The jackknife variance estimator performs poorly. This shows that strong skewness and kurtosis of the imputed variable can considerably influence the results.

Results (II): The relative bias is large for small sample sizes, then decreases, and increases again when the sampling fraction becomes non-negligible. The coverage rate is not close to the nominal one, even for large samples, because of the skewed and heavy-tailed distributions of the variables.

Conclusions (I): Ratio imputation should be used instead of mean imputation whenever auxiliary variables are available. In these examples, stratification of the sample does not improve the quality of the jackknife variance estimator.

Conclusions (II): The percentile bootstrap performs better than the jackknife in terms of the coverage rate of the confidence intervals, while the reverse is true for the mean square error and bias of the variance estimator.
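A percentile bootstrap interval for imputed data can be sketched as follows. The sketch resamples units (respondents and non-respondents together) with replacement and re-imputes each bootstrap sample with the same method used on the original sample. That re-imputation step is the idea behind the Shao and Sitter procedure, but this simplified version (equal weights, resamples of size n, ratio imputation, no finite population adjustment) is an assumption and not the authors' exact implementation. All function names are hypothetical.

```python
import random
from statistics import mean, variance


def ratio_imputed_mean(sample):
    """Mean of y after ratio imputation; sample is a list of (y, x) pairs
    with y set to None for non-respondents."""
    observed = [(y, x) for y, x in sample if y is not None]
    ratio = sum(y for y, _ in observed) / sum(x for _, x in observed)
    return mean(ratio * x if y is None else y for y, x in sample)


def bootstrap_with_reimputation(sample, n_boot=1000, alpha=0.05, seed=0):
    """Return the bootstrap variance and the percentile interval of the
    ratio-imputed mean, re-imputing inside every bootstrap sample."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    while len(replicates) < n_boot:
        boot = rng.choices(sample, k=n)  # resample units with replacement
        if all(y is None for y, _ in boot):
            continue  # no respondents drawn, imputation is impossible
        replicates.append(ratio_imputed_mean(boot))
    replicates.sort()
    lower = replicates[round(alpha / 2 * n_boot)]
    upper = replicates[round((1 - alpha / 2) * n_boot) - 1]
    return variance(replicates), (lower, upper)


# Hypothetical skewed sample with roughly 30% of y values missing.
data_rng = random.Random(7)
sample = []
for _ in range(200):
    x = data_rng.lognormvariate(0.0, 1.0)
    y = 1.2 * x * data_rng.lognormvariate(0.0, 0.3)
    sample.append((None if data_rng.random() < 0.30 else y, x))

boot_var, (ci_low, ci_high) = bootstrap_with_reimputation(sample)
```

The percentile interval uses the empirical quantiles of the replicates directly, so it does not have to be symmetric around the point estimate, unlike the normal-theory interval.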