1 Multiple Frame Surveys Tracy Xu Kim Williamson Department of Statistical Science Southern Methodist University.


1 Multiple Frame Surveys
Tracy Xu, Kim Williamson
Department of Statistical Science, Southern Methodist University

2 Multiple Frame Surveys
Introduction
– What is a Multiple Frame Survey
Different estimators for population total
Variance estimators for those estimators
Conclusion
References

3 Introduction
Hartley (1962)
A multiple frame survey draws samples from two or more frames that together cover a target population
Very useful for sampling rare or hard-to-reach populations
A dual frame design may result in considerable cost savings over a single frame design with comparable precision

4 Example 1 – Cost Reduction
Agriculture [Hartley 1962, 1974]
+ List frame (incomplete; names, addresses): less costly
+ Area frame (complete, insensitive to changes): expensive to sample
+ Can achieve the same precision
Linear Cost Function
$C = n_A c_A + n_B c_B$
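The linear cost function can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name, the per-unit costs, and the sample sizes are hypothetical, and it assumes frame A is the expensive area frame and frame B the cheaper list frame.

```python
def dual_frame_cost(n_a, c_a, n_b, c_b):
    """Linear cost C = n_A * c_A + n_B * c_B for a dual frame design."""
    return n_a * c_a + n_b * c_b


# Hypothetical per-unit costs: area frame 50, list frame 5.
area_only = dual_frame_cost(n_a=200, c_a=50, n_b=0, c_b=5)
dual = dual_frame_cost(n_a=100, c_a=50, n_b=400, c_b=5)
print(area_only, dual)
```

Whether the cheaper dual frame allocation reaches the same precision depends on the variances in each domain, which this sketch does not model.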

5 Example 2 – Rare Populations
AIDS [Kalton and Anderson 1986]
+ Using a general population frame as well as STD clinics, drug treatment centers, and hospitals
Homeless [Iachan and Dennis 1993]
+ Frames: homeless shelters, soup kitchens, and street areas
Alzheimer’s
+ Frames: general population and adult day-care centers

6 Issues to Consider
Statisticians must address the following issues
+ How should the information from the samples be combined to estimate population quantities?
+ How should variance estimates be calculated?

7 Notation
Universe: $U = A \cup B = a \cup ab \cup b$
$N$ = # of elements in the population
$N_A$ = # of elements in Frame A
$N_B$ = # of elements in Frame B
$N_a$ = # of elements in Frame A, but not Frame B
$N_b$ = # of elements in Frame B, but not Frame A
$N_{ab}$ = # of elements in both Frame A and Frame B
$P\{i\text{th element is in } S_A\} = \pi_i^A$, where $S_A$ is the sample drawn from Frame A
$Y$ = population total = $Y_a + Y_b + Y_{ab}$
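Because the domains a, ab, and b are disjoint, the counts above are tied together by simple identities: N_a = N_A - N_ab, N_b = N_B - N_ab, and N = N_a + N_ab + N_b. A minimal sketch, with a hypothetical function name and made-up frame sizes:

```python
def domain_sizes(N_A, N_B, N_ab):
    """Return (N_a, N_b, N) from the two frame sizes and the overlap size."""
    N_a = N_A - N_ab          # in Frame A only
    N_b = N_B - N_ab          # in Frame B only
    N = N_a + N_ab + N_b      # equivalently N_A + N_B - N_ab
    return N_a, N_b, N


print(domain_sizes(N_A=1000, N_B=400, N_ab=300))
```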

8 Estimators
Hartley (H)
Fuller and Burmeister (FB)
Single Frame estimators
Pseudo-Maximum Likelihood (PML)

9 Hartley & FB Estimator
Minimize the variance among the class of linear unbiased estimators of Y
Have minimum variance for a single response
Use a different set of weights for each response variable
Disadvantages: increased amount of calculation (they use covariances estimated from the data) and possible inconsistencies
Estimators are not in general linear functions of y
FB has the greatest asymptotic efficiency

10 Hartley & FB Estimator
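For orientation, Hartley's dual frame estimator is usually written as Ŷ_H(θ) = Ŷ_a + θ Ŷ_ab^A + (1 − θ) Ŷ_ab^B + Ŷ_b, where Ŷ_ab^A and Ŷ_ab^B estimate the overlap total from the Frame A and Frame B samples. The sketch below assumes independent samples from the two frames and weighted (Horvitz-Thompson style) domain totals; all function and argument names are hypothetical. The θ helper minimizes the variance of Ŷ_H(θ) under that independence assumption, taking the variances and covariances as given inputs. The FB estimator is not shown.

```python
import numpy as np


def hartley_total(y_A, w_A, in_ab_A, y_B, w_B, in_ab_B, theta):
    """Hartley-type total: Y_a + theta*Y_ab^A + (1-theta)*Y_ab^B + Y_b.

    y_*     : observed values in each frame's sample
    w_*     : design weights (inverse inclusion probabilities)
    in_ab_* : True where the sampled unit belongs to both frames
    """
    y_A, w_A, ab_A = np.asarray(y_A, float), np.asarray(w_A, float), np.asarray(in_ab_A, bool)
    y_B, w_B, ab_B = np.asarray(y_B, float), np.asarray(w_B, float), np.asarray(in_ab_B, bool)
    Y_a = np.sum(w_A[~ab_A] * y_A[~ab_A])      # domain a, from sample A
    Y_ab_A = np.sum(w_A[ab_A] * y_A[ab_A])     # domain ab, from sample A
    Y_ab_B = np.sum(w_B[ab_B] * y_B[ab_B])     # domain ab, from sample B
    Y_b = np.sum(w_B[~ab_B] * y_B[~ab_B])      # domain b, from sample B
    return float(Y_a + theta * Y_ab_A + (1 - theta) * Y_ab_B + Y_b)


def optimal_theta(var_ab_A, var_ab_B, cov_a_ab_A, cov_b_ab_B):
    """Theta minimizing Var(Y_H(theta)) when the two samples are independent."""
    return (var_ab_B + cov_b_ab_B - cov_a_ab_A) / (var_ab_A + var_ab_B)


total = hartley_total([1, 2, 3], [10, 10, 10], [False, True, True],
                      [4, 5], [5, 5], [True, False], theta=0.5)
print(total)
```

In practice the variances and covariances fed to `optimal_theta` are estimated from the data, so θ differs by response variable.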

11 Single Frame Estimators
Bankier (1986), Kalton & Anderson (1986) and Skinner (1991)
Treat all observations as if they had been sampled from a single frame, with modified weights for observations in the intersections of frames
Do not use any auxiliary information about the population totals
Linear in y
Other techniques may be applied: Regression Estimation and Raking Ratio Estimation
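One common way to build the modified weights, assuming independent samples from the two frames, is to give each sampled unit the weight 1 / (π_i^A + π_i^B), with the inclusion probability set to 0 for a frame the unit does not belong to. The sketch below illustrates that idea only; the function names and the numbers are hypothetical, and sampled units are treated as separate records.

```python
import numpy as np


def single_frame_weights(pi_A, pi_B):
    """Weight 1 / (pi_A + pi_B); a probability is 0 where the unit is not in that frame."""
    pi_A = np.asarray(pi_A, dtype=float)
    pi_B = np.asarray(pi_B, dtype=float)
    return 1.0 / (pi_A + pi_B)


def single_frame_total(y, pi_A, pi_B):
    """Weighted total over the pooled sample from both frames."""
    return float(np.sum(single_frame_weights(pi_A, pi_B) * np.asarray(y, dtype=float)))


# Pooled sample: unit 1 is in Frame A only, unit 2 in both frames, unit 3 in Frame B only.
weights = single_frame_weights([0.1, 0.1, 0.0], [0.0, 0.15, 0.25])
print(weights)
```

The same weights apply to every response variable, which is why the resulting estimator is linear in y.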

12 Pseudo-Maximum Likelihood Estimator
Skinner and Rao (1996) derived a pseudo-ML (PML) estimator for dual frame surveys that uses the same set of weights for all items of y, similar to “single frame” estimators, and maintains efficiency.
The idea of pseudo-ML estimation is discussed in Roberts, Rao, Kumar (1987) and Skinner, Holt, and Smith (1989), in which an ML estimator under simple random sampling is modified to achieve consistent estimation under complex designs.

13 Pseudo-Maximum Likelihood Estimator
The main advantages of the PMLE are that it is design consistent and typically has a simple form.
The potential disadvantage is that it may not be asymptotically efficient, although it may be hoped that any loss of efficiency will tend to be small in practice.

14 Pseudo-Maximum Likelihood Estimator
The pseudo-MLE of Y is derived in terms of an estimate of $N_{ab}$ that is the smallest root of a quadratic equation.
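A sketch of how such an estimator can be computed, following the form in which the Skinner and Rao (1996) estimator is usually stated for known N_A and N_B. The notation and function names here are assumptions rather than the slide's own: N̂_ab^A and N̂_ab^B are the two sample estimates of the overlap size, θ is a fixed mixing weight in [0, 1], and the PML overlap size is the smaller root of a x² − b x + c = 0 with the coefficients shown in the code.

```python
import math


def pml_overlap_size(N_A, N_B, Nab_hat_A, Nab_hat_B, theta):
    """Smaller root of a*x^2 - b*x + c = 0, the PML estimate of N_ab."""
    a = theta / N_B + (1 - theta) / N_A
    b = 1 + theta * Nab_hat_A / N_B + (1 - theta) * Nab_hat_B / N_A
    c = theta * Nab_hat_A + (1 - theta) * Nab_hat_B
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("quadratic has no real root for these inputs")
    return (b - math.sqrt(disc)) / (2 * a)


def pml_total(N_A, N_B, Y_a, Na_hat, Y_b, Nb_hat,
              Yab_A, Nab_hat_A, Yab_B, Nab_hat_B, theta):
    """PML-type total: domain means scaled by PML-adjusted domain sizes."""
    N_ab = pml_overlap_size(N_A, N_B, Nab_hat_A, Nab_hat_B, theta)
    mu_a = Y_a / Na_hat                      # estimated mean in domain a
    mu_b = Y_b / Nb_hat                      # estimated mean in domain b
    mu_ab = ((theta * Yab_A + (1 - theta) * Yab_B)
             / (theta * Nab_hat_A + (1 - theta) * Nab_hat_B))
    return (N_A - N_ab) * mu_a + (N_B - N_ab) * mu_b + N_ab * mu_ab


print(pml_overlap_size(1000, 500, 200, 200, 0.5))
```

Because the adjusted domain sizes do not depend on y, the same weights serve every response variable.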

15 Comparison of All Estimators
Extensive simulation was done to evaluate the performance of all the estimators in the Lohr and Rao (2005) paper.
Findings:
In all the simulations, the PML method had either the smallest EMSE or an EMSE close to the minimum value. With its high efficiency and ease of computation, as well as the practical advantage of using the same set of weights for all response variables, the PML method appears to be a good choice for estimation in multiple frame surveys.

16 Comparison of All Estimators
Findings
When Q ≥ 3 (Q being the number of frames), the theoretically optimal Fuller-Burmeister and Hartley methods became unstable, because they require solving systems of equations using a large estimated covariance matrix.

17 Asymptotic Variance
Under some conditions, the H, FB and PML estimators are all consistent estimators of the total.
But neither the H estimator nor the PML estimator is necessarily more efficient than the other.

18 Asymptotic Variance
Lohr and Rao (2005) give a general formula for the asymptotic variance of all the above estimators, which can be used to construct optimal designs for multiple frame surveys.

19 Variance Estimators
Two Methods:
Skinner and Rao (1996) described a method for estimating the variance of the estimator using Taylor linearization.
Lohr and Rao (2000) defined a jackknife variance estimator for estimators from dual frame surveys and showed that it is asymptotically equivalent to the Taylor linearization variance estimator.
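The jackknife idea can be sketched as follows, under simplifying assumptions that are not taken from the slides: unstratified, independent samples from the two frames, no finite population correction, and one sampled unit deleted at a time from one frame while the other sample is left intact. Each sample is a dict of equal-length arrays with the design weights under the key "w"; the function names are hypothetical, and `estimator` is any callable that returns a point estimate from the two samples.

```python
import numpy as np


def _drop_unit(sample, i):
    """Delete unit i and rescale the remaining weights by n / (n - 1)."""
    n = len(sample["w"])
    keep = np.arange(n) != i
    reduced = {key: values[keep] for key, values in sample.items()}
    reduced["w"] = reduced["w"] * n / (n - 1)
    return reduced


def dual_frame_jackknife(estimator, sample_A, sample_B):
    """Delete-one jackknife variance, summed over the two frames."""
    sample_A = {k: np.asarray(v) for k, v in sample_A.items()}
    sample_B = {k: np.asarray(v) for k, v in sample_B.items()}
    full = estimator(sample_A, sample_B)
    n_A, n_B = len(sample_A["w"]), len(sample_B["w"])
    ss_A = sum((estimator(_drop_unit(sample_A, i), sample_B) - full) ** 2
               for i in range(n_A))
    ss_B = sum((estimator(sample_A, _drop_unit(sample_B, j)) - full) ** 2
               for j in range(n_B))
    return (n_A - 1) / n_A * ss_A + (n_B - 1) / n_B * ss_B


def weighted_total(s_A, s_B):
    """Toy estimator: sum of the weighted totals from both samples."""
    return float(np.sum(s_A["w"] * s_A["y"]) + np.sum(s_B["w"] * s_B["y"]))


v_jack = dual_frame_jackknife(
    weighted_total,
    {"y": [1.0, 2.0, 3.0, 4.0], "w": [25.0] * 4},
    {"y": [2.0, 4.0, 6.0], "w": [10.0] * 3},
)
print(v_jack)
```

For the toy weighted total with equal weights, this reduces to the familiar N² s² / n in each frame, which is a convenient check on the replicate weights.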

20 Variance Estimators
Simulation results (Lohr and Rao 2000) comparing the linearization, full jackknife, and modified jackknife estimators showed:
1. The jackknife estimator exhibited smaller bias than the linearization estimator.
2. The relative bias of all three estimators of the variance tends to decrease as the sample size increases.
3. For the smaller sample sizes, the linearization and modified jackknife methods underestimate the EMSE.
4. Coverage probabilities, though similar for the three variance estimators, were slightly higher for the full jackknife.

21 Variance Estimators
5. The jackknife methods are less stable than the linearization estimator of the variance, as judged by the values of relative standard error.
6. For the single frame estimator, the jackknife and linearization estimates of the variance coincide.
7. For the other estimators, both the linearization and modified jackknife estimates of the variance are biased downward.
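The summary measures behind findings like these can be computed from simulation replicates roughly as below. The definitions are assumptions made for illustration: EMSE is taken as the empirical mean squared error of the point estimates around the true total, relative bias as (mean variance estimate − EMSE) / EMSE, and coverage as the share of nominal 95% normal-theory intervals that contain the true total. The function name and the numbers are hypothetical.

```python
import numpy as np


def simulation_summary(estimates, var_estimates, true_total, z=1.96):
    """Return (EMSE, relative bias of the variance estimator, coverage)."""
    estimates = np.asarray(estimates, dtype=float)
    var_estimates = np.asarray(var_estimates, dtype=float)
    errors = estimates - true_total
    emse = float(np.mean(errors ** 2))
    rel_bias = float((np.mean(var_estimates) - emse) / emse)
    half_width = z * np.sqrt(var_estimates)
    coverage = float(np.mean(np.abs(errors) <= half_width))
    return emse, rel_bias, coverage


emse, rel_bias, coverage = simulation_summary(
    estimates=[98, 102, 100, 104], var_estimates=[4, 4, 4, 4], true_total=100
)
print(emse, rel_bias, coverage)
```

A negative relative bias corresponds to a variance estimator that underestimates the EMSE.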

22 Conclusion
Multiple frame surveys can be extremely beneficial when sampling rare populations and when a complete frame is very expensive to sample.
Different estimators of the total have been proposed. The choice of estimator will depend on survey design and complexity: FB is the most efficient, but because of its additional calculations and complexity, PML may be preferred.

23 References
H. O. Hartley (1974), “Multiple Frame Methodology and Selected Applications”, Sankhya, the Indian Journal of Statistics, Series C, 36, 99-118.
C. J. Skinner and J. N. K. Rao (1996), “Estimation in Dual Frame Surveys with Complex Designs”, Journal of the American Statistical Association, 91, 349-356.
Sharon L. Lohr and J. N. K. Rao (2000), “Inference from Dual Frame Surveys”, Journal of the American Statistical Association, 95, 271-280.
Sharon L. Lohr and J. N. K. Rao (2006), “Estimation in Multiple-Frame Surveys”, Journal of the American Statistical Association (under revision).
J. Lessler and W. Kalsbeek (1992), Non-sampling Error in Surveys, John Wiley & Sons, Inc.

