1
Two-Phase Sampling (Double Sampling)
2
Motivating Example
Suppose X is a useful auxiliary variable for Y, the measure we are interested in. We have shown that ratio estimation is very useful when X and Y are highly correlated. Here we don’t know X in advance, but X is relatively cheap (or easy) to obtain (or measure). What should we do?
3
Two-Phase Sampling (Double Sampling)
Neyman (1938). Population. Phase 1: sample n(1) units and measure X. Phase 2: measure Y on n(2) units.
4
Two-Phase Sampling (Double Sampling)
Neyman (1938). Population. Phase 1: sample n(1) units and measure X.
5
Two-Phase Sampling (Double Sampling)
Neyman (1938). Phase 1: sample n(1) units and measure X. Phase 2: measure Y on n(2) units. When conducting phase 2, we treat the phase 1 sample as the “population”.
6
Two-Phase vs Two-Stage
7
Theory of Two-Phase Sampling
S(1): phase I sample. Defined on it are the inclusion indicators, the sampling weights, the k-dimensional auxiliary variables, and the estimated population total for the jth auxiliary variable.
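The phase I quantities listed on this slide are commonly written as follows. This is a sketch in standard textbook notation; the symbols $Z_i$, $w_i^{(1)}$, $\mathbf{x}_i$ and $\hat{t}_{x_j}^{(1)}$ are assumptions and may differ from the slide's own.

```latex
\begin{align*}
Z_i &= \begin{cases} 1 & \text{if unit } i \in S^{(1)} \\ 0 & \text{otherwise} \end{cases}
  && \text{(phase I indicator)} \\
w_i^{(1)} &= \frac{1}{P(Z_i = 1)}
  && \text{(phase I sampling weight)} \\
\mathbf{x}_i &= (x_{i1}, \dots, x_{ik})
  && \text{(auxiliary variables)} \\
\hat{t}_{x_j}^{(1)} &= \sum_{i \in S^{(1)}} w_i^{(1)} x_{ij}
  && \text{(estimated total of the $j$th auxiliary variable)}
\end{align*}
```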
8
Theory of Two-Phase Sampling
S(2): phase II sample
9
Theory of Two-Phase Sampling
Phase 1: sample n(1) units and measure X. Phase 2: measure Y on n(2) units, treating the phase I sample as the “population”. The phase I estimate of the total of Y is unknown, as we don’t measure Y for all the units sampled in phase 1.
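One common way to write the two estimators is sketched below. The notation is assumed rather than taken from the slide: $w_i^{(1)}$ is the phase I sampling weight, $D_i$ indicates selection into the phase II sample, and $w_i^{(2)}$ is the phase II weight conditional on the phase I sample.

```latex
\begin{align*}
\hat{t}_y^{(1)} &= \sum_{i \in S^{(1)}} w_i^{(1)} y_i
  && \text{(not computable: $y_i$ is observed only for $i \in S^{(2)}$)} \\
w_i^{(2)} &= \frac{1}{P(D_i = 1 \mid \mathbf{Z})} \\
\hat{t}_y^{(2)} &= \sum_{i \in S^{(2)}} w_i^{(1)} w_i^{(2)} y_i
  && \text{(computable from the phase II sample)}
\end{align*}
```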
10
Theory of Two-Phase Sampling
Unbiasedness. Variance. Wait a minute, we haven’t used X!
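Both properties follow from conditioning on the phase I sample. The sketch below assumes the notation $\hat{t}_y^{(1)}$ for the (unobservable) phase I estimator, $\hat{t}_y^{(2)}$ for the two-phase estimator, and $\mathbf{Z}$ for the vector of phase I indicators; it also assumes that phase II is unbiased for the phase I estimator, i.e. $E[\hat{t}_y^{(2)} \mid \mathbf{Z}] = \hat{t}_y^{(1)}$.

```latex
\begin{align*}
E\big[\hat{t}_y^{(2)}\big]
  &= E\big[\, E[\hat{t}_y^{(2)} \mid \mathbf{Z}] \,\big]
   = E\big[\hat{t}_y^{(1)}\big] = t_y \\
V\big[\hat{t}_y^{(2)}\big]
  &= V\big[\, E[\hat{t}_y^{(2)} \mid \mathbf{Z}] \,\big]
   + E\big[\, V[\hat{t}_y^{(2)} \mid \mathbf{Z}] \,\big] \\
  &= V\big[\hat{t}_y^{(1)}\big]
   + E\big[\, V[\hat{t}_y^{(2)} \mid \mathbf{Z}] \,\big]
\end{align*}
```

The first variance term is what a single-phase sample of size n(1) would give; the second is the extra variability from subsampling in phase II.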
11
Two-Phase Sampling with Ratio Estimation
X is highly correlated with Y. X is inexpensive to measure. We measure X for all the sampled units in phase I. Unlike the ratio estimation we learned earlier, here we don’t know X for all units in the population, so we have to estimate tx using the phase I sample.
12
California Schools: estimate api00
Phase 1: SRS of 2,000 schools
Sample mean of api99:
13
California Schools: estimate api00
Phase 2: SRS of 200 schools
Sample mean of api99:
Sample mean of api00:
Slope =
14
California Schools: estimate api00
Phase 1: SRS of 2,000 schools
Sample mean of api99:
Phase 2: SRS of 200 schools
Sample mean of api99:
Sample mean of api00:
How should we use all of this information to estimate the population mean of api00?
15
Two-Phase Sampling with Ratio Estimation
Estimate the population total tx:
api00:
16
Two-Phase Sampling with Ratio Estimation
The ratio estimator
Linearize it using Taylor’s theorem
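For an SRS in both phases, the two-phase ratio estimator of the population mean reduces to the phase II ratio of sample means scaled by the phase I mean of X. The sketch below illustrates that special case only. The function name `two_phase_ratio_mean` and the synthetic population are hypothetical; the numbers are not the California schools data.

```python
import random


def two_phase_ratio_mean(x_phase1, x_phase2, y_phase2):
    """Two-phase ratio estimate of the population mean of y (SRS in both phases)."""
    if len(x_phase2) != len(y_phase2):
        raise ValueError("phase II x and y must be measured on the same units")
    xbar1 = sum(x_phase1) / len(x_phase1)  # phase I estimate of the mean of x
    xbar2 = sum(x_phase2) / len(x_phase2)  # phase II sample mean of x
    ybar2 = sum(y_phase2) / len(y_phase2)  # phase II sample mean of y
    return xbar1 * ybar2 / xbar2           # estimated ratio times phase I mean of x


# Synthetic population (hypothetical values chosen only for illustration).
rng = random.Random(0)
N = 6000
pop_x = [rng.gauss(630, 130) for _ in range(N)]
pop_y = [1.05 * x + rng.gauss(0, 25) for x in pop_x]

phase1 = rng.sample(range(N), 2000)  # phase I: SRS, measure x only
phase2 = rng.sample(phase1, 200)     # phase II: SRS from phase I, measure y

estimate = two_phase_ratio_mean(
    [pop_x[i] for i in phase1],
    [pop_x[i] for i in phase2],
    [pop_y[i] for i in phase2],
)
true_mean = sum(pop_y) / N
print(f"two-phase ratio estimate: {estimate:.1f}, true mean: {true_mean:.1f}")
```

Multiplying the result by the population size N gives the corresponding estimate of the population total.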
17
Two-Phase Sampling with Ratio Estimation
Variance of the ratio estimator
18
Two-Phase Sampling with Ratio Estimation
The variance
19
Two-Phase Sampling with Ratio Estimation
The variance
An estimator of the variance
California schools:
Alternatively, we can use jackknife estimation for the variance.
20
Jackknife Estimation For Variance
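The jackknife is a resampling method, useful for estimating variance and bias. A very simple example: let X1, …, Xn be iid. Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Deleting the ith observation: $\bar{X}_{(i)} = \frac{1}{n-1}\sum_{j \ne i} X_j$.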
21
Jackknife Estimation For Variance
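Note that $\bar{X}_{(i)} - \bar{X} = \frac{\bar{X} - X_i}{n-1}$, where $\bar{X}$ is the sample mean and $\bar{X}_{(i)}$ is the mean with the ith observation deleted. Therefore, $\sum_{i=1}^{n} (\bar{X}_{(i)} - \bar{X})^2 = \frac{1}{(n-1)^2}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{s^2}{n-1}$. As a result, the jackknife variance estimator $\frac{n-1}{n}\sum_{i=1}^{n} (\bar{X}_{(i)} - \bar{X})^2$ equals $s^2/n$, the usual variance estimator of the sample mean.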
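For the sample mean, the delete-one jackknife variance coincides with the usual s²/n, which can be checked numerically. The sketch below is a generic delete-one jackknife for iid data; the helper name `jackknife_variance` and the simulated sample are hypothetical, and the leave-one-out estimates are centered at their own average (for the sample mean this equals the full-sample estimate).

```python
import random
import statistics


def jackknife_variance(data, estimator):
    """Delete-one jackknife variance estimate for `estimator` applied to iid data."""
    n = len(data)
    # Recompute the estimator n times, each time leaving out one observation.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)


rng = random.Random(1)
sample = [rng.gauss(50, 10) for _ in range(30)]

v_jack = jackknife_variance(sample, statistics.mean)
v_classic = statistics.variance(sample) / len(sample)  # s^2 / n
print(f"jackknife: {v_jack:.6f}, s^2/n: {v_classic:.6f}")
```

The same function accepts any estimator of a list of observations, which is what makes the jackknife useful when no closed-form variance is available.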
22
Jackknife Estimation For Variance
The ratio estimator:
Delete unit j in phase I:
The jackknife estimator of the variance
23
California Schools: estimate api00
24
Jackknife Estimation For Variance
Modified weights in Phase I Jackknife
25
Jackknife Estimation For Variance
Modified weights in Phase II Jackknife
26
Designing a Two-Phase Sample
Optimal allocation for ratio estimation
Fixed total cost:
Optimal allocation:
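A minimal sketch of the allocation calculation follows, under assumptions that are not from the slide: SRS in both phases, finite population corrections ignored, a linear cost C = c1·n1 + c2·n2, and the approximate variance (S_y² − S_d²)/n1 + S_d²/n2 for the ratio-estimated mean, where S_d² is the variance of the residuals d = y − Bx. Minimizing that variance subject to the cost constraint gives n2/n1 = sqrt(c1·S_d² / (c2·(S_y² − S_d²))). All function and parameter names are hypothetical.

```python
import math


def optimal_two_phase_allocation(total_cost, c1, c2, s2_y, s2_d):
    """Return (n1, n2) minimizing the approximate variance for a fixed total cost."""
    if not 0 < s2_d < s2_y:
        raise ValueError("need 0 < s2_d < s2_y for a two-phase design to make sense")
    ratio = math.sqrt(c1 * s2_d / (c2 * (s2_y - s2_d)))  # optimal n2 / n1
    n1 = total_cost / (c1 + c2 * ratio)                  # spend the whole budget
    return n1, ratio * n1


def approx_variance(n1, n2, s2_y, s2_d):
    """Approximate variance of the two-phase ratio mean, ignoring fpc terms."""
    return (s2_y - s2_d) / n1 + s2_d / n2


# Hypothetical inputs: measuring y costs 9 times as much as measuring x.
n1, n2 = optimal_two_phase_allocation(total_cost=1300, c1=1, c2=9, s2_y=100, s2_d=36)
v_two_phase = approx_variance(n1, n2, s2_y=100, s2_d=36)
v_srs = 100 / (1300 / 9)  # same budget spent on a one-phase SRS measuring y only
print(f"n1 = {n1:.0f}, n2 = {n2:.0f}")
print(f"two-phase variance = {v_two_phase:.3f}, one-phase SRS variance = {v_srs:.3f}")
```

In practice n1 and n2 would be rounded to integers, and S_y² and S_d² would come from a pilot study or earlier survey.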
27
Example: California Schools
We already know that api99 and api00 are highly correlated, and that the ratio estimator is helpful in estimating mean api00. Let’s pretend that api99 scores are not available before sampling, but are cheaper to obtain than api00. Conduct two-phase sampling (homework).
28
Homework 3: Compare Two-Phase Sampling and SRS
Data: California schools
Two sampling methods:
One-phase sampling: an SRS of 200 schools, collecting api00 scores.
Two-phase sampling. Phase 1: conduct an SRS of n schools and collect api99 scores. Phase 2: subsample an SRS of 200 schools from the n schools sampled in phase 1 and collect api00 scores.
Evaluate the efficiency of two-phase sampling relative to one-phase sampling. Try several values of n, such as 500, 1000, and 2000.
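One way to set up such a comparison is a small Monte Carlo experiment that repeats both designs many times and compares mean squared errors. The sketch below uses a synthetic population in place of the California schools data, so the population parameters, the function names, and the number of replications are all assumptions made for illustration.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def srs_mean(pop_y, n, rng):
    """One-phase estimate: mean of y over an SRS of n units."""
    idx = rng.sample(range(len(pop_y)), n)
    return mean(pop_y[i] for i in idx)


def two_phase_ratio_mean(pop_x, pop_y, n1, n2, rng):
    """Two-phase ratio estimate: SRS of n1 for x, then SRS of n2 from it for y."""
    phase1 = rng.sample(range(len(pop_x)), n1)
    phase2 = rng.sample(phase1, n2)
    xbar1 = mean(pop_x[i] for i in phase1)
    xbar2 = mean(pop_x[i] for i in phase2)
    ybar2 = mean(pop_y[i] for i in phase2)
    return xbar1 * ybar2 / xbar2


def relative_efficiency(n1_values, n2=200, reps=300, seed=0):
    """MSE of the one-phase SRS mean divided by MSE of the two-phase estimate."""
    rng = random.Random(seed)
    # Synthetic, strongly correlated population (hypothetical values).
    pop_x = [rng.gauss(630, 130) for _ in range(6000)]
    pop_y = [1.05 * x + rng.gauss(0, 25) for x in pop_x]
    truth = mean(pop_y)

    def mse(estimates):
        return mean((e - truth) ** 2 for e in estimates)

    mse_srs = mse([srs_mean(pop_y, n2, rng) for _ in range(reps)])
    return {
        n1: mse_srs
        / mse([two_phase_ratio_mean(pop_x, pop_y, n1, n2, rng) for _ in range(reps)])
        for n1 in n1_values
    }


efficiency = relative_efficiency([500, 1000, 2000])
for n1, eff in efficiency.items():
    print(f"n = {n1}: MSE(one-phase) / MSE(two-phase) = {eff:.2f}")
```

A ratio above 1 means the two-phase design is more efficient for the same number of api00-type measurements; the comparison ignores the cost of the phase 1 measurements.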
29
Two-Phase Sampling With Stratification
30
Two-Phase Sampling With Stratification
What if we don’t have the stratification information at the beginning? Phase I: collect information that can be used for stratification. Phase II: use a stratified sample.
31
Two-Phase Sampling With Stratification
Indicators for phase 1
Strata membership indicators:
Observe the strata membership indicators for all units sampled in phase 1.
32
Two-Phase Sampling With Stratification
S(2): phase II sample
SRS in each stratum
Sampling weights
33
Two-Phase Sampling With Stratification
Two-phase-sampling stratified estimator
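When phase I is an SRS of n(1) units and phase II is an SRS within each stratum, the two-phase stratified estimate of the population mean weights each stratum's phase II sample mean by that stratum's share of the phase I sample. The sketch below covers that special case only; the function name, the stratum labels, and the enrollment figures are hypothetical.

```python
from collections import Counter


def two_phase_stratified_mean(phase1_strata, phase2_y_by_stratum):
    """Estimate the population mean of y from a two-phase stratified sample.

    phase1_strata: stratum label of every unit in the phase I SRS.
    phase2_y_by_stratum: maps each stratum label to the y values observed
        in the phase II SRS drawn within that stratum.
    """
    n1 = len(phase1_strata)
    counts = Counter(phase1_strata)  # n_h: phase I units falling in stratum h
    weighted_sum = 0.0
    for stratum, n_h in counts.items():
        ys = phase2_y_by_stratum.get(stratum)
        if not ys:
            raise ValueError(f"no phase II observations in stratum {stratum!r}")
        weighted_sum += n_h * sum(ys) / len(ys)  # n_h times the stratum mean
    return weighted_sum / n1


# Toy data: 10 phase I schools classified by type, 5 of them measured in phase II.
phase1_strata = ["E"] * 6 + ["M"] * 3 + ["H"] * 1
phase2_enroll = {"E": [400, 500], "M": [800, 1000], "H": [2000]}

estimate = two_phase_stratified_mean(phase1_strata, phase2_enroll)
print(estimate)  # → 740.0
```

Here the weights 6/10, 3/10 and 1/10 are themselves estimates of the stratum proportions, which is the source of the extra variance relative to ordinary stratified sampling.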
34
Stratified vs Two-Phase Stratified
Variance where
35
Stratified vs Two-Phase Stratified
36
Stratified vs Two-Phase Stratified
Estimated variance
37
Example: California Schools
Goal: to estimate the mean school size (enrollment)
Two-phase sampling
Phase 1: an SRS of 1,500 schools. Find out the types of schools.
Phase 2: stratified sampling of 300 schools, using either equal allocation or proportional allocation.
38
Example: California Schools
n1 = 1500, n2 = 300
var(srs.est): , var(eqstr.2phase): , var(propstr.2phase):