Two-Phase Sampling (Double Sampling)


1 Two-Phase Sampling (Double Sampling)

2 Motivating Example
Suppose X is a useful auxiliary variable for Y, the measure that we are interested in. We have shown that ratio estimation is very useful when X and Y are highly correlated. Here we don’t know X in advance, but X is relatively cheap (or easy) to obtain (or measure). What should we do?

3 Two-Phase Sampling (Double Sampling)
Neyman (1938). Phase 1: sample n(1) units from the population and measure X. Phase 2: subsample n(2) units and measure Y.

4 Two-Phase Sampling (Double Sampling)
Neyman (1938). Phase 1: sample n(1) units from the population and measure X.

5 Two-Phase Sampling (Double Sampling)
Neyman (1938). Phase 1: sample n(1) units and measure X. Phase 2: subsample n(2) units and measure Y. When conducting phase 2, we treat the phase 1 sample as the “population”.

6 Two-Phase vs Two-Stage

7 Theory of Two-Phase Sampling
S(1): phase I sample
Indicators:
Sampling weights:
k-dimensional auxiliary variables:
Estimated population total for the jth auxiliary variable:
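The quantities listed on this slide can be sketched in standard two-phase notation. The symbols below ($Z_i$, $w_i^{(1)}$, $x_{ij}$) are assumed here for illustration and may differ from the notation used in the lecture:

```latex
\begin{align*}
Z_i &= \begin{cases} 1 & \text{if unit } i \in S^{(1)} \\ 0 & \text{otherwise} \end{cases} \\
w_i^{(1)} &= \frac{1}{P(Z_i = 1)} \\
\mathbf{x}_i &= (x_{i1}, \ldots, x_{ik}) \\
\hat{t}_{x_j}^{(1)} &= \sum_{i \in S^{(1)}} w_i^{(1)} x_{ij}
\end{align*}
```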

8 Theory of Two-Phase Sampling
S(2): phase II sample

9 Theory of Two-Phase Sampling
Phase 1: sample n(1) units and measure X.
Phase 2: measure Y on n(2) units, treating phase I as the “population”:
This quantity is unknown, as we don’t measure Y for all the units sampled in phase 1.
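A sketch of the phase II quantities, under assumed notation: $D_i$ is the phase II inclusion indicator and $w_i^{(2)}$ the conditional phase II weight. The first line is the estimator we would use if Y were observed on all of S(1); the last line is the estimator we can actually compute.

```latex
\begin{align*}
\hat{t}_y^{(1)} &= \sum_{i \in S^{(1)}} w_i^{(1)} y_i
  && \text{(not computable: } y_i \text{ is observed only on } S^{(2)}\text{)} \\
w_i^{(2)} &= \frac{1}{P(D_i = 1 \mid \text{phase I sample})} \\
\hat{t}_y^{(2)} &= \sum_{i \in S^{(2)}} w_i^{(1)} w_i^{(2)} y_i
\end{align*}
```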

10 Theory of Two-Phase Sampling
Unbiasedness:
Variance:
Wait a minute, we haven’t used X!
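Both properties follow from conditioning on the phase I sample. A sketch, assuming the notation $\hat{t}_y^{(1)}$ for the (unobservable) phase I estimator, $\hat{t}_y^{(2)}$ for the two-phase estimator, and $\mathbf{Z}$ for the vector of phase I indicators:

```latex
\begin{align*}
E\big[\hat{t}_y^{(2)}\big]
  &= E\Big[ E\big(\hat{t}_y^{(2)} \mid \mathbf{Z}\big) \Big]
   = E\big[\hat{t}_y^{(1)}\big] = t_y \\
V\big(\hat{t}_y^{(2)}\big)
  &= V\Big[ E\big(\hat{t}_y^{(2)} \mid \mathbf{Z}\big) \Big]
   + E\Big[ V\big(\hat{t}_y^{(2)} \mid \mathbf{Z}\big) \Big] \\
  &= V\big(\hat{t}_y^{(1)}\big)
   + E\Big[ V\big(\hat{t}_y^{(2)} \mid \mathbf{Z}\big) \Big]
\end{align*}
```

The first term is the variance we would have with Y measured on all of phase I; the second is the extra variance from subsampling.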

11 Two-Phase Sampling with Ratio Estimation
X is highly correlated with Y. X is inexpensive to measure. We measure X for all the sampled units in phase I. Unlike the ratio estimation we learned earlier, here we don’t know X for all units in the population, so we have to estimate tx using the phase I sample.

12 California Schools: estimate api00
Phase 1: SRS of 2,000 schools
Sample mean of api99:

13 California Schools: estimate api00
Phase 2: SRS of 200 schools
Sample mean of api99:
Sample mean of api00:
Slope=

14 California Schools: estimate api00
Phase 1: SRS of 2,000 schools
Sample mean of api99:
Phase 2: SRS of 200 schools
Sample mean of api99:
Sample mean of api00:
How should we use all of this information to estimate the population mean of api00?

15 Two-Phase Sampling with Ratio Estimation
Estimate the population total tx:
api00:

16 Two-Phase Sampling with Ratio Estimation
The ratio estimator:
Linearize it using Taylor’s theorem:
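A minimal sketch of the two-phase ratio estimator of a total: estimate the ratio of Y to X from phase II, then multiply by the phase I estimate of tx. The function name, argument names, and toy numbers are hypothetical, chosen only for illustration.

```python
def two_phase_ratio_total(x_phase1, w_phase1, x_phase2, y_phase2, w_phase2):
    """Two-phase ratio estimate of the population total of y.

    x_phase1, w_phase1: x values and phase I weights for every unit in S(1).
    x_phase2, y_phase2: x and y values for the units in S(2).
    w_phase2: overall weights for S(2), i.e. phase I weight times phase II weight.
    """
    # Estimate of t_x from the larger phase I sample.
    t_x_1 = sum(w * x for w, x in zip(w_phase1, x_phase1))
    # Estimates of t_x and t_y from the phase II subsample.
    t_x_2 = sum(w * x for w, x in zip(w_phase2, x_phase2))
    t_y_2 = sum(w * y for w, y in zip(w_phase2, y_phase2))
    # Estimated ratio, scaled up by the phase I estimate of t_x.
    b_hat = t_y_2 / t_x_2
    return b_hat * t_x_1


# Toy example (made-up numbers): N = 100, SRS of 10 in phase I, SRS of 5 in phase II.
x1 = list(range(1, 11))
w1 = [10.0] * 10                 # N / n1
x2 = [2, 4, 6, 8, 10]
y2 = [4, 8, 12, 16, 20]
w2 = [20.0] * 5                  # (N / n1) * (n1 / n2)
estimate = two_phase_ratio_total(x1, w1, x2, y2, w2)
```

Under SRS at both phases the weights cancel and the estimator of the mean reduces to (phase I mean of x) times (phase II mean of y) divided by (phase II mean of x).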

17 Two-Phase Sampling with Ratio Estimation
Variance of the ratio estimator

18 Two-Phase Sampling with Ratio Estimation
The variance

19 Two-Phase Sampling with Ratio Estimation
The variance:
An estimator of the variance:
California schools:
Alternatively, we can use jackknife estimation for the variance.
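A sketch of one common linearization variance estimator for the two-phase ratio estimator of a mean, assuming an SRS at both phases. The formula used here, (1 - n1/N) s_y^2 / n1 + (1 - n2/n1) s_e^2 / n2 with e_i = y_i - B x_i, is the standard textbook form and is an assumption about what the slide shows; the function and argument names are hypothetical.

```python
from statistics import mean, variance


def two_phase_ratio_mean_and_var(xbar_phase1, n1, x2, y2, N):
    """Ratio estimate of the mean of y and its estimated variance.

    Assumes an SRS of n1 units from N in phase I and an SRS of
    len(y2) units from those n1 in phase II.
    """
    n2 = len(y2)
    b_hat = mean(y2) / mean(x2)
    ybar_ratio = b_hat * xbar_phase1
    # Sample variance of y in phase II (denominator n2 - 1).
    s2_y = variance(y2)
    # Residuals from the fitted ratio; they sum to zero under SRS,
    # so their sample variance equals sum(e^2) / (n2 - 1).
    residuals = [y - b_hat * x for x, y in zip(x2, y2)]
    s2_e = variance(residuals)
    var_hat = (1 - n1 / N) * s2_y / n1 + (1 - n2 / n1) * s2_e / n2
    return ybar_ratio, var_hat


# Toy example (made-up numbers), with y exactly proportional to x.
est, var_hat = two_phase_ratio_mean_and_var(5.5, 10, [2, 4, 6, 8, 10], [4, 8, 12, 16, 20], 100)
```

When Y is nearly proportional to X the residual term is small, so the variance is driven mostly by the phase I sample size.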

20 Jackknife Estimation For Variance
The jackknife is a resampling method, useful for estimating variance and bias. A very simple example: let X1, …, Xn be iid.
Sample mean:
Deleting the ith observation:
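A small numerical check of this example: compute the n leave-one-out means and the delete-one jackknife variance, using the usual multiplier (n - 1)/n. The function name and data are hypothetical.

```python
from statistics import variance


def jackknife_variance_of_mean(xs):
    """Delete-one jackknife variance estimate for the sample mean."""
    n = len(xs)
    total = sum(xs)
    xbar = total / n
    # Mean of the sample with the ith observation deleted.
    loo_means = [(total - x) / (n - 1) for x in xs]
    return (n - 1) / n * sum((m - xbar) ** 2 for m in loo_means)


data = [2.0, 4.0, 4.0, 5.0, 9.0]
v_jack = jackknife_variance_of_mean(data)
v_usual = variance(data) / len(data)   # s^2 / n
```

For the sample mean the two numbers agree, which is why the jackknife is trusted as a starting point for estimators with no closed-form variance.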

21 Jackknife Estimation For Variance
Note that Therefore, As a result,
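The steps can be sketched as follows, assuming the usual delete-one multiplier $(n-1)/n$ and writing $\bar{X}_{(i)}$ for the mean with the ith observation deleted:

```latex
\begin{align*}
\bar{X}_{(i)} &= \frac{1}{n-1} \sum_{k \neq i} X_k = \frac{n\bar{X} - X_i}{n-1} \\
\bar{X}_{(i)} - \bar{X} &= \frac{\bar{X} - X_i}{n-1} \\
\hat{V}_{JK}(\bar{X})
  &= \frac{n-1}{n} \sum_{i=1}^{n} \big(\bar{X}_{(i)} - \bar{X}\big)^2
   = \frac{1}{n(n-1)} \sum_{i=1}^{n} \big(X_i - \bar{X}\big)^2
   = \frac{s^2}{n}
\end{align*}
```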

22 Jackknife Estimation For Variance
The ratio estimator:
Delete unit j in phase I:
The jackknife estimator of the variance:
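A minimal sketch of a delete-one jackknife over phase I units for the ratio estimator of a mean, assuming an SRS at both phases, no finite population correction, and the multiplier (n1 - 1)/n1. Deleting unit j removes it from phase I and, if it was subsampled, from phase II as well. All names are hypothetical.

```python
def jackknife_two_phase_ratio(x_phase1, y_phase2):
    """Jackknife variance for the two-phase ratio estimator of a mean.

    x_phase1: x for each phase I unit.
    y_phase2: y for each phase I unit, or None if the unit is not in phase II.
    Returns (estimate, jackknife variance estimate).
    """
    n1 = len(x_phase1)

    def estimate(skip):
        # Phase I mean of x with unit `skip` deleted (skip=None keeps all units).
        xs1 = [x for i, x in enumerate(x_phase1) if i != skip]
        # Phase II totals, dropping the deleted unit if it was subsampled.
        pairs = [(x, y) for i, (x, y) in enumerate(zip(x_phase1, y_phase2))
                 if y is not None and i != skip]
        sum_x2 = sum(x for x, _ in pairs)
        sum_y2 = sum(y for _, y in pairs)
        return (sum(xs1) / len(xs1)) * sum_y2 / sum_x2

    theta = estimate(None)
    replicates = [estimate(j) for j in range(n1)]
    v_jack = (n1 - 1) / n1 * sum((r - theta) ** 2 for r in replicates)
    return theta, v_jack


# Toy example (made-up numbers): 10 phase I units, every second one in phase II.
xs = [float(v) for v in range(1, 11)]
ys = [2.0 * x if i % 2 == 1 else None for i, x in enumerate(xs)]
theta_hat, v_hat = jackknife_two_phase_ratio(xs, ys)
```

This loop is quadratic in n1, which is fine for a demonstration; for large samples the replicates would normally be computed through modified weights instead.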

23 California Schools: estimate api00

24 Jackknife Estimation For Variance
Modified weights in Phase I Jackknife

25 Jackknife Estimation For Variance
Modified weights in Phase II Jackknife

26 Designing a Two-Phase Sample
Optimal allocation for ratio estimation
Fixed total cost:
Optimal allocation:
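A sketch of the allocation calculation under two assumptions that are not spelled out on the slide: a linear cost model C = c1 n1 + c2 n2, and the approximate variance (S_y^2 - S_d^2)/n1 + S_d^2/n2 (finite population corrections ignored), where S_d^2 is the variance of the residuals y - Bx. Minimizing that variance subject to the cost constraint gives n2/n1 = sqrt(c1 S_d^2 / (c2 (S_y^2 - S_d^2))). The function and parameter names are hypothetical.

```python
import math


def optimal_two_phase_allocation(total_cost, c1, c2, s2_y, s2_d):
    """Phase sizes (n1, n2) minimizing the approximate variance for a fixed cost.

    c1, c2: per-unit costs of measuring x (phase I) and y (phase II).
    s2_y: population variance of y; s2_d: variance of residuals y - B*x.
    Returns real-valued sizes; round them in practice.
    """
    if not 0 < s2_d < s2_y:
        raise ValueError("need 0 < s2_d < s2_y for two-phase sampling to help")
    # Optimal subsampling fraction n2 / n1, capped at 1 (n2 cannot exceed n1).
    fraction = min(1.0, math.sqrt(c1 * s2_d / (c2 * (s2_y - s2_d))))
    # Spend the whole budget: total_cost = c1 * n1 + c2 * fraction * n1.
    n1 = total_cost / (c1 + c2 * fraction)
    return n1, fraction * n1


# Toy example (made-up numbers): y costs 9 times as much as x to measure.
n1_opt, n2_opt = optimal_two_phase_allocation(1000, c1=1, c2=9, s2_y=100, s2_d=20)
```

The subsampling fraction shrinks when Y is expensive relative to X or when X predicts Y well (small residual variance).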

27 Example: California Schools
We already know that api99 and api00 are highly correlated, so the ratio estimator is helpful in estimating mean api00. Let’s pretend that api99 scores are not available before sampling and are cheaper to obtain than api00. Conduct two-phase sampling (homework).

28 Homework 3: Compare Two-Phase Sampling and SRS
Data: California schools
Two sampling methods:
One-phase sampling: an SRS of 200 schools; collect api00 scores.
Two-phase sampling: Phase 1: conduct an SRS of n schools and collect api99 scores. Phase 2: subsample an SRS of 200 schools from the n schools sampled in phase 1 and collect api00 scores.
Evaluate the efficiency of two-phase sampling relative to one-phase sampling. Try several values of n, such as 500, 1000, and 2000.
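One way to set up such a comparison is a small Monte Carlo study. The sketch below uses a synthetic population, not the California schools data: the population size, the distribution of x, and the relationship between x and y are all made-up assumptions, and the function name is hypothetical. Efficiency is measured as the ratio of mean squared errors.

```python
import random


def compare_designs(n1, n2=200, N=6000, reps=300, seed=1):
    """Monte Carlo MSE of the SRS mean and the two-phase ratio estimator."""
    rng = random.Random(seed)
    # Synthetic population (an assumption): y roughly proportional to x.
    x = [rng.gauss(630, 130) for _ in range(N)]
    y = [1.05 * xi + rng.gauss(0, 25) for xi in x]
    true_mean = sum(y) / N

    sq_err_srs = sq_err_two = 0.0
    for _ in range(reps):
        # One-phase design: SRS of n2 units, measure y.
        srs = rng.sample(range(N), n2)
        est_srs = sum(y[i] for i in srs) / n2
        # Two-phase design: SRS of n1 units (measure x), then SRS of n2 of them (measure y).
        phase1 = rng.sample(range(N), n1)
        phase2 = rng.sample(phase1, n2)
        xbar1 = sum(x[i] for i in phase1) / n1
        est_two = xbar1 * sum(y[i] for i in phase2) / sum(x[i] for i in phase2)
        sq_err_srs += (est_srs - true_mean) ** 2
        sq_err_two += (est_two - true_mean) ** 2
    return sq_err_srs / reps, sq_err_two / reps


mse_srs, mse_two = compare_designs(n1=1000, reps=200)
relative_efficiency = mse_srs / mse_two   # values above 1 favor two-phase sampling
```

Repeating the call with n1 set to 500, 1000, and 2000 shows how the gain changes with the phase I sample size. Note that this comparison holds the number of y measurements fixed, so the two-phase design costs more in total.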

29 Two-Phase Sampling With Stratification

30 Two-Phase Sampling With Stratification
What if we don’t have the stratification information at the beginning? Phase I: collect information that can be used for stratification. Phase II: take a stratified sample.

31 Two-Phase Sampling With Stratification
Indicators for phase 1:
Strata membership indicators:
Observe the strata membership indicators for all units sampled in phase 1.

32 Two-Phase Sampling With Stratification
S(2): phase II sample, an SRS in each stratum
Sampling weights:

33 Two-Phase Sampling With Stratification
Two-phase-sampling stratified estimator
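A minimal sketch of the estimator of a mean, assuming an SRS in phase I and an SRS within each stratum in phase II. The stratum shares are estimated from phase I (n_h / n1) and the stratum means from phase II, giving the sum over strata of (n_h / n1) times the phase II mean of y in stratum h. The function name, stratum labels, and numbers are hypothetical.

```python
from collections import Counter, defaultdict


def two_phase_stratified_mean(phase1_strata, phase2_strata, phase2_y):
    """Two-phase stratified estimate of the population mean of y.

    phase1_strata: stratum label of every phase I unit.
    phase2_strata, phase2_y: stratum label and y value of every phase II unit.
    """
    n1 = len(phase1_strata)
    n_h = Counter(phase1_strata)        # phase I counts per stratum
    m_h = Counter(phase2_strata)        # phase II counts per stratum
    y_sum = defaultdict(float)
    for h, y in zip(phase2_strata, phase2_y):
        y_sum[h] += y
    missing = [h for h in n_h if m_h[h] == 0]
    if missing:
        raise ValueError(f"no phase II units in strata: {missing}")
    # Weight each phase II stratum mean by the phase I estimate of the stratum share.
    return sum(n_h[h] / n1 * y_sum[h] / m_h[h] for h in n_h)


# Toy example (made-up numbers): 6 elementary and 4 high schools in phase I.
strata1 = ["E"] * 6 + ["H"] * 4
ybar_str = two_phase_stratified_mean(strata1, ["E", "E", "H", "H"], [10, 20, 100, 200])
```

Compared with ordinary stratified sampling, the only change is that the known stratum proportions are replaced by their phase I estimates.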

34 Stratified vs Two-Phase Stratified
Variance where

35 Stratified vs Two-Phase Stratified

36 Stratified vs Two-Phase Stratified
Estimated variance

37 Example: California Schools
Goal: estimate the mean school size (enrollment).
Two-phase sampling: Phase 1: an SRS of 1,500 schools; record the type of each school. Phase 2: a stratified sample of 300 schools, using either equal allocation or proportional allocation.

38 Example: California Schools
n1=1500, n2=300
var(srs.est): , var(eqstr.2phase): , var(propstr.2phase):

