Complex Surveys Sunday, April 16, 2017.

Assembling Design Components. Building blocks: probability sampling, either simple random sampling (SRS) or unequal probability sampling; stratification, whose purpose is to increase the precision of estimates by grouping similar items together; cluster sampling, whose purpose is convenience; and ratio estimation.

Simple Random Sampling (without replacement). There are (N choose n) possible samples, each with probability 1/(N choose n). Topics: point estimate and confidence interval (C.I.); sample size (n) calculation.
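The SRS quantities named on this slide can be sketched in Python as follows. This is a minimal illustration that assumes the usual textbook formulas with a finite population correction; the function names and the normal multiplier z = 1.96 are illustrative choices, not taken from the slides.

```python
import math
from statistics import mean, variance

def number_of_samples(N, n):
    """Number of distinct SRS samples of size n from N units: (N choose n)."""
    return math.comb(N, n)

def srs_mean_ci(sample, N, z=1.96):
    """Point estimate and approximate C.I. for a population mean from an SRS."""
    n = len(sample)
    ybar = mean(sample)
    s2 = variance(sample)            # sample variance, divisor n - 1
    fpc = 1 - n / N                  # finite population correction
    se = math.sqrt(fpc * s2 / n)
    return ybar, (ybar - z * se, ybar + z * se)

def srs_sample_size(S, e, N, z=1.96):
    """Sample size for margin of error e, given a guess S of the population SD."""
    n0 = (z * S / e) ** 2            # size needed if the fpc is ignored
    return math.ceil(n0 / (1 + n0 / N))
```

Each of the (N choose n) samples is equally likely, so every unit has selection probability n/N.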

Stratification (Ch 3). The estimate of the population total and its variance. Sample size allocation: proportional or optimal. In general, stratified sampling with proportional allocation is more efficient than SRS. The more the stratum means differ, the greater the benefit.
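A short Python sketch of the stratified estimator and the two allocation rules, assuming an SRS within each stratum. The function names are illustrative, and the "optimal" rule shown is the equal-cost (Neyman) special case, which is an assumption because the slide does not say which cost model it uses.

```python
from statistics import mean, variance

def stratified_total(strata):
    """strata: list of (N_h, sample_values). Returns (t_hat, estimated variance)."""
    t_hat, var_hat = 0.0, 0.0
    for N_h, y in strata:
        n_h = len(y)
        t_hat += N_h * mean(y)                                   # N_h * ybar_h
        var_hat += (1 - n_h / N_h) * N_h ** 2 * variance(y) / n_h
    return t_hat, var_hat

def proportional_allocation(N_sizes, n):
    """n_h proportional to the stratum size N_h."""
    total = sum(N_sizes)
    return [n * N_h / total for N_h in N_sizes]

def neyman_allocation(N_sizes, S_guesses, n):
    """n_h proportional to N_h * S_h (optimal allocation with equal costs)."""
    products = [N_h * S_h for N_h, S_h in zip(N_sizes, S_guesses)]
    return [n * p / sum(products) for p in products]
```

The returned allocations are not rounded; in practice they would be rounded to integers with at least two units per stratum so that a variance can be estimated.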

Ratio estimation (Ch 4). Biased, but may result in a smaller MSE. Useful when the variables are linearly correlated. Related: regression estimation.
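As a minimal sketch, the ratio estimator of a population total under SRS can be written in a few lines of Python. It assumes the population total of the auxiliary variable x is known; the names `y`, `x`, and `t_x` are illustrative.

```python
def ratio_estimate_total(y, x, t_x):
    """Ratio estimator of the total of y, using the known total t_x of x."""
    b_hat = sum(y) / sum(x)      # estimated ratio B = t_y / t_x
    return b_hat * t_x

# Hypothetical data: y is exactly twice x, so the estimate is 2 * t_x.
print(ratio_estimate_total([2, 4, 6], [1, 2, 3], t_x=100))
```

The estimator is biased because it is a ratio of two random quantities, but when y is roughly proportional to x the reduction in variance usually outweighs the bias.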

Cluster Sampling (Ch 5 & 6). Usually less efficient than other methods. Its efficiency relative to SRS depends on the intra-class correlation coefficient: the larger the coefficient, the less efficient cluster sampling is. It can reduce cost and bring administrative convenience. Topics: one-stage and two-stage sampling, with equal or unequal probabilities; point estimate, variance, and C.I.; allocation of m and n for two-stage cluster sampling.

Cluster Sampling without Replacement. Select a sample of n clusters without replacement based on the selection probabilities. Estimate each cluster total and its variance. Estimate the population total. The variance can be estimated by the formulas in Ch 5 and 6 or by resampling methods.

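Cluster Sampling with Replacement. Select a sample of n clusters with replacement based on the selection probabilities. Estimate each cluster total. Calculate the ratio of each estimated cluster total to its selection probability. Estimate the population total and its variance.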
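The with-replacement steps can be sketched in Python as below. This assumes the standard textbook estimator for unequal-probability sampling with replacement (often called the Hansen-Hurwitz estimator); the names `cluster_totals` and `psi` (draw-by-draw selection probabilities) are illustrative.

```python
from statistics import mean, variance

def pps_wr_total(cluster_totals, psi):
    """Estimate a population total from n clusters drawn with replacement.

    cluster_totals: (estimated) total of each drawn cluster, one per draw
    psi: probability of selecting that cluster on a single draw
    Returns (t_hat, estimated variance of t_hat).
    """
    u = [t / p for t, p in zip(cluster_totals, psi)]   # one estimate per draw
    n = len(u)
    return mean(u), variance(u) / n                    # average, and s_u^2 / n
```

A cluster drawn twice appears twice in the inputs. Because the draws are independent, the variance estimate needs only the spread of the per-draw values.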

In practice. Most large surveys combine several of these techniques and use different types of estimators.

An example: background. Malaria is a common public health problem in tropical and subtropical regions. It is infectious: people get it by being bitten by a kind of female mosquito. Without timely and proper treatment, the death rate can be very high. It can be prevented by using mosquito nets, but prevention is only effective if the nets are in widespread use.

Summary. Goal: to estimate the prevalence of bed net use in rural areas. Sampling frame: all rural villages of <3,000 people in The Gambia.

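The survey in Gambia (1991). 3000 rural villages, sampled top-down in three stages:
Stage 1. Sampling unit: district. Stratified by region (eastern, central, western). Sampling method: probability proportional to district size, 5 districts per region.
Stage 2. Sampling unit: village. Stratified by PHC (PHC, non-PHC). Sampling method: probability proportional to village size, 4 villages per district.
Stage 3. Sampling unit: compound. Sampling method: SRS, 6 compounds per village.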

The survey in Gambia (1991). 3000 rural villages. Regions (eastern, central, western): stratified sampling. Districts: sampling with unequal probabilities, two-stage cluster (Ch 6). PHC and non-PHC: stratified sampling. Villages: sampling with unequal probabilities, two-stage cluster (Ch 6). Compounds: SRS (average number of nets per compound). Sampling runs top-down; estimation runs bottom-up.

The survey in Gambia (1991). Calculating the estimated total and its variance looks complicated, and it gets worse if we include ratio estimators. In practice, we can use sampling weights to obtain point estimates and use computer-intensive methods, such as the jackknife and the bootstrap, to obtain standard errors (Ch 9).
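One computer-intensive option is a delete-one-PSU jackknife. The Python sketch below shows that idea for a weighted mean; it is a simplified illustration, not the survey's actual procedure. It assumes at least two sampled PSUs per stratum, and the data layout and function names are hypothetical.

```python
from collections import Counter

def weighted_mean(records):
    """records: list of (weight, y) pairs."""
    return sum(w * y for w, y in records) / sum(w for w, _ in records)

def jackknife_variance(psus, estimator=weighted_mean):
    """psus: list of (stratum, records), one entry per sampled PSU.

    Drops one PSU at a time, reweights the other PSUs in the same stratum
    by n_h / (n_h - 1), and accumulates (n_h - 1) / n_h * (replicate - full)^2.
    Returns (full-sample estimate, jackknife variance estimate).
    """
    n_h = Counter(stratum for stratum, _ in psus)
    full = estimator([r for _, recs in psus for r in recs])
    var = 0.0
    for k, (stratum_k, _) in enumerate(psus):
        nh = n_h[stratum_k]
        replicate = []
        for j, (stratum_j, recs) in enumerate(psus):
            if j == k:
                continue                                   # drop PSU k
            scale = nh / (nh - 1) if stratum_j == stratum_k else 1.0
            replicate.extend((w * scale, y) for w, y in recs)
        var += (nh - 1) / nh * (estimator(replicate) - full) ** 2
    return full, var
```

With one stratum and one equally weighted observation per PSU, this reduces to the familiar s²/n for a sample mean, which is a convenient sanity check.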

Sampling weights. The sampling weight is the reciprocal of Pr(being selected). Each sampled unit “represents” a certain number of units in the population, and the whole sample “represents” the whole population.
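A small Python sketch of how weights turn into point estimates, assuming each sampled unit's selection probability is known. The function and variable names are illustrative.

```python
def weighted_estimates(y, pi):
    """y: observed values; pi: selection probability of each sampled unit.

    Returns (estimated total, estimated population size, estimated mean).
    """
    w = [1 / p for p in pi]                        # weight = 1 / Pr(selected)
    t_hat = sum(wi * yi for wi, yi in zip(w, y))   # each unit stands for w_i units
    n_hat = sum(w)                                 # weights sum to estimated N
    return t_hat, n_hat, t_hat / n_hat

# Hypothetical sample: three units with unequal selection probabilities.
print(weighted_estimates([1, 2, 3], [0.5, 0.25, 0.25]))
```

The sum of the weights estimates the population size, which is the sense in which the whole sample "represents" the whole population.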

Sampling weights. Weights are used to deal with the effects of stratification and clustering on point estimates. Stratified sampling.
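For stratified sampling with an SRS inside each stratum, every sampled unit in stratum h has selection probability n_h/N_h, so its weight is N_h/n_h. A minimal Python sketch, with hypothetical stratum labels and sizes:

```python
def stratified_weights(N_h, n_h):
    """N_h, n_h: dicts mapping stratum label to population and sample size."""
    return {h: N_h[h] / n_h[h] for h in N_h}

# Hypothetical strata: "a" is sampled at 10%, "b" at 50%.
weights = stratified_weights({"a": 100, "b": 50}, {"a": 10, "b": 25})
print(weights)
```

Units from the lightly sampled stratum carry larger weights, which is how the point estimate corrects for unequal sampling fractions.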

Sampling weights Cluster sampling with equal probabilities
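A sketch of the weight for a two-stage cluster sample with equal probabilities, assuming an SRS of n out of N clusters followed by an SRS of m_i out of the M_i units in cluster i. The symbols follow common textbook notation and are assumptions here, since the slide's formula was not preserved.

```python
def cluster_weight(N, n, M_i, m_i):
    """Weight of a unit in sampled cluster i: (N / n) * (M_i / m_i)."""
    return (N / n) * (M_i / m_i)

# Hypothetical design: 10 of 100 clusters, then 8 of the 40 units in a cluster.
print(cluster_weight(N=100, n=10, M_i=40, m_i=8))
```

For one-stage cluster sampling every unit in a sampled cluster is observed (m_i = M_i), so the weight reduces to N/n.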

Sampling weights. For three-stage sampling (p: primary, s: secondary, t: tertiary). Very large weights are often truncated; this biases the results but reduces the mean squared error.
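In a three-stage design, the overall selection probability is the product of the stage-wise conditional probabilities, so the weight is the reciprocal of that product. A hedged Python sketch; the parameter names and the cap value are hypothetical, and the simple `min` rule is only one possible way to truncate weights.

```python
def three_stage_weight(pi_p, pi_s_given_p, pi_t_given_ps):
    """Weight = 1 / (Pr(primary) * Pr(secondary | primary) * Pr(tertiary | both))."""
    return 1.0 / (pi_p * pi_s_given_p * pi_t_given_ps)

def truncate_weights(weights, cap):
    """Replace any weight above cap by cap (introduces bias, lowers variance)."""
    return [min(w, cap) for w in weights]

print(three_stage_weight(0.5, 0.5, 0.25))      # 16.0
print(truncate_weights([1, 5, 20], cap=10))    # [1, 5, 10]
```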

Sampling weights. Weights contain the information needed to construct point estimates, but they do not contain enough information for computing variances, because calculating a variance requires the probability that each pair of units is selected. Computer-intensive methods can be used to find variances.

Sampling weights: the malaria example Pr(a compound in central region PHC villages is selected)=
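The structure of this probability can be sketched in Python as a product over the three stages. All population sizes below are hypothetical. The number of villages drawn per PHC stratum (2) is an assumption, since the slides only give 4 villages per district. Taking each inclusion probability as (number drawn) × size / total is the usual target for probability-proportional-to-size sampling rather than something stated in the transcript.

```python
def compound_selection_prob(district_size, region_size,
                            village_size, stratum_size_in_district,
                            compounds_in_village,
                            n_districts=5, n_villages=2, n_compounds=6):
    """Pr(compound) = Pr(district) * Pr(village | district) * Pr(compound | village)."""
    p_district = n_districts * district_size / region_size          # PPS, stage 1
    p_village = n_villages * village_size / stratum_size_in_district  # PPS, stage 2
    p_compound = n_compounds / compounds_in_village                 # SRS, stage 3
    return p_district * p_village * p_compound

# Hypothetical sizes, for illustration only.
p = compound_selection_prob(district_size=10_000, region_size=100_000,
                            village_size=500, stratum_size_in_district=5_000,
                            compounds_in_village=60)
print(p, 1 / p)   # selection probability and the corresponding sampling weight
```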

Self-weighting and Non-self-weighting. Self-weighting: the sampling weights for all observation units are equal. A self-weighting sample is representative of the population if nonsampling errors are ignored. Most large self-weighting samples are not SRSs. Standard software, with its usual iid assumption, gives correct estimates of means, proportions, and percentiles but erroneous estimates of variances.

Ratio Estimation in Complex Surveys. Ratio estimation is part of the analysis, not the design. It can be used at any level, and is usually used near the top.

Ratio Estimation in the Malaria example Region level: Above the region level

Ratio Estimation in Complex Surveys. The bias of ratio estimation can be large when sample sizes are small. Separate ratio estimator for a population total: improves efficiency when the ratios vary from stratum to stratum, but works poorly when stratum sample sizes are small. Combined ratio estimator for a population total: has less bias when stratum sample sizes are small, but works poorly when the ratios vary from stratum to stratum.
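The difference between the two estimators is easy to see in code. In this minimal Python sketch each stratum supplies its estimated totals of y and x plus the known total of x; the dictionary keys and the numbers are hypothetical.

```python
def separate_ratio_total(strata):
    """Apply a ratio adjustment within each stratum, then add the results."""
    return sum(s["t_x"] * s["t_y_hat"] / s["t_x_hat"] for s in strata)

def combined_ratio_total(strata):
    """Add the stratum estimates first, then apply one overall ratio adjustment."""
    t_y_hat = sum(s["t_y_hat"] for s in strata)
    t_x_hat = sum(s["t_x_hat"] for s in strata)
    t_x = sum(s["t_x"] for s in strata)
    return t_x * t_y_hat / t_x_hat

strata = [
    {"t_y_hat": 20.0, "t_x_hat": 10.0, "t_x": 12.0},   # stratum ratio 2.0
    {"t_y_hat": 30.0, "t_x_hat": 30.0, "t_x": 27.0},   # stratum ratio 1.0
]
print(separate_ratio_total(strata), combined_ratio_total(strata))
```

The separate estimator computes one ratio per stratum, so each ratio rests on a smaller sample and its bias can accumulate across strata. The combined estimator uses a single ratio, which is more stable but ignores differences between stratum ratios.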

Estimating a Distribution Function. Historically, sampling theory was developed to find population means, totals, and ratios. What about other quantities, such as Pr(Statistics > means or totals), the median, the 95th percentile, or the probability mass function? Sampling weights can be used to construct an empirical distribution of the population.

Population quantities and functions Probability mass function (pmf) Distribution function

Empirical Functions. Empirical probability mass function. Empirical distribution function. Empirical functions can be used to estimate population quantities such as the mean, median, percentiles, variance, etc.
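A minimal Python sketch of the weighted empirical functions. It assumes the empirical pmf at a value is the sum of the weights of units with that value divided by the sum of all weights; the quantile rule (smallest observed value whose empirical distribution function reaches q) is one common convention among several.

```python
from collections import defaultdict

def empirical_pmf(y, w):
    """Weighted share of the estimated population at each observed value."""
    total = sum(w)
    pmf = defaultdict(float)
    for yi, wi in zip(y, w):
        pmf[yi] += wi / total
    return dict(pmf)

def empirical_cdf(y, w, value):
    """Weighted share of the estimated population with y <= value."""
    return sum(wi for yi, wi in zip(y, w) if yi <= value) / sum(w)

def weighted_quantile(y, w, q):
    """Smallest observed y whose empirical distribution function is >= q."""
    total = sum(w)
    cum = 0.0
    for yi, wi in sorted(zip(y, w)):
        cum += wi
        if cum / total >= q:
            return yi
    return max(y)

# Hypothetical data: the unit with y = 4 represents 7 of 10 population units.
y, w = [1, 2, 3, 4], [1, 1, 1, 7]
print(empirical_cdf(y, w, 3), weighted_quantile(y, w, 0.5))
```

Ignoring the weights in this example would put the median between 2 and 3, whereas the weighted median is 4.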

Plotting data from a complex survey. For an SRS, the usual plots work: histograms and smoothed density estimates, scatterplots and scatterplot matrices. In a complex sampling design, simple plots can be misleading.

Incorporating weights

Plotting data from a complex survey. The 1987 Survey of Youth in Custody covered family background, previous criminal history, and drug and alcohol use. The 206 PSUs (facilities) were divided into 16 strata. SSUs were the 1985 Children in Custody (CIC).

The 1987 Survey of Youth in Custody. The two figures are very similar because the survey was designed to be self-weighting. Youths aged 15 were undersampled because of unequal selection probabilities and nonresponse; youths aged 17 were oversampled.

Design effects. Cornfield’s ratio (1951): measure the efficiency of a sampling plan by the ratio of the variance that would be obtained from an SRS of k observation units to the variance obtained from the complex sampling plan with k observation units. The design effect (deff, Kish 1965): the reciprocal of Cornfield’s ratio.

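The design effects. The design effect provides a measure of the precision gained or lost by using the more complex design instead of SRS. For estimating a mean, it is the variance of the estimated mean under the complex plan divided by the variance of the mean from an SRS with the same number of observation units.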

The design effects Stratified Cluster
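A short Python sketch of the design effect as a variance ratio. The cluster approximation 1 + (M - 1) × ICC is the widely used rule for equal-sized clusters and is an assumption here, since the slide's own formulas were not preserved; all names are illustrative.

```python
def design_effect(var_complex, var_srs):
    """deff = variance under the complex plan / variance under SRS of the same size."""
    return var_complex / var_srs

def cluster_deff_approx(cluster_size, icc):
    """Approximate deff for equal-sized clusters: 1 + (M - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n, deff):
    """Size of the SRS that would give the same precision as n complex-design units."""
    return n / deff

print(cluster_deff_approx(cluster_size=10, icc=0.1))   # about 1.9
print(effective_sample_size(n=1000, deff=2.0))         # 500.0
```

A deff below 1, which is typical of stratification with proportional allocation, means the complex design is more precise than an SRS of the same size. A deff above 1, which is typical of clustering with positive intra-class correlation, means it is less precise.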