Cluster Sampling: The Basics
What are we trying to achieve in a survey?
- A sample that is representative of the larger population
EPI Method
- Samples groups of persons rather than individuals
- 30 clusters with 7 persons per cluster = 210 persons
- Based on smallpox immunization surveys in West Africa in '68 and '69
- Is this as precise as a simple random sample (SRS)?
Design Effect
Cluster sampling is commonly used, rather than simple random sampling, mainly as a means of saving money when, for example, the population is spread out, and the researcher cannot sample from everywhere. However, “respondents in the same cluster are likely to be somewhat similar to one another”. As a result, in a clustered sample “Selecting an additional member from the same cluster adds less new information than would a completely independent selection”. Thus, for example, in single stage cluster samples, the sample is not as varied as it would be in a random sample, so that the effective sample size is reduced. The loss of effectiveness by the use of cluster sampling, instead of simple random sampling, is the design effect. The design effect is basically the ratio of the actual variance, under the sampling method actually used, to the variance computed under the assumption of simple random sampling.
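To make the variance ratio concrete: for clusters of equal size m with intracluster correlation rho, the design effect is often approximated as deff = 1 + (m - 1) * rho (Kish's approximation), and the effective sample size is n / deff. The sketch below applies this to a 30 x 7 design. The function names are illustrative, and the rho value of 0.15 is a made-up figure for demonstration, not a result from the surveys mentioned here.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Kish approximation for equal-sized clusters: deff = 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, deff: float) -> float:
    """Size of the simple random sample that would give the same precision."""
    return n / deff


def deff_from_variances(var_actual: float, var_srs: float) -> float:
    """Design effect as defined above: actual variance over SRS variance."""
    return var_actual / var_srs


n_clusters, per_cluster = 30, 7
n = n_clusters * per_cluster          # 210 persons in total
icc = 0.15                            # ASSUMED value, for illustration only

deff = design_effect(per_cluster, icc)
n_eff = effective_sample_size(n, deff)

print(f"total sample size:     {n}")
print(f"design effect:         {deff:.2f}")
print(f"effective sample size: {n_eff:.1f}")
```

Under this assumed rho, the 210 respondents carry roughly the information of about 110 independently sampled persons. With rho = 0 the design effect is 1 and nothing is lost relative to SRS.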
Comparison of 2 cluster sample designs
Some problems with EPI cluster sampling
- Communities selected by probability proportional to size (PPS) using inaccurate population data
- Households not selected from a sampling frame (selection bias)
- Possibility of non-response bias
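For readers unfamiliar with PPS (probability proportional to size) selection, here is a minimal sketch of one common variant, systematic PPS sampling along a cumulative population list. The function name, the `start_fraction` parameter, and the population figures are all hypothetical. It also shows why inaccurate population data is a problem: the selection probabilities come directly from the listed sizes.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(populations, n_clusters, start_fraction):
    """Pick n_clusters community indices with probability proportional to size.

    populations    -- listed population of each community (e.g. from a census)
    start_fraction -- random number in [0, 1) that fixes the starting point
    """
    if n_clusters <= 0 or not populations:
        raise ValueError("need at least one community and one cluster")
    total = sum(populations)
    interval = total / n_clusters            # sampling interval
    start = start_fraction * interval        # random start within first interval
    cumulative = list(accumulate(populations))

    chosen = []
    for k in range(n_clusters):
        point = start + k * interval
        # First community whose cumulative population exceeds the point.
        idx = bisect_right(cumulative, point)
        chosen.append(min(idx, len(populations) - 1))  # guard against rounding
    return chosen


# Hypothetical community sizes; a community larger than the sampling
# interval can be selected more than once.
sizes = [100, 200, 300, 400]
print(pps_systematic(sizes, 2, 0.5))

rng = random.Random(2024)
print(pps_systematic(sizes, 2, rng.random()))
```

With `start_fraction = 0.5` the sampling points fall at 250 and 750 on the cumulative scale, which land in the second and fourth communities. If the listed sizes are wrong, communities are over- or under-selected accordingly.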
Compact cluster sampling
- Still select clusters based on PPS from census data
- Clusters then divided into segments with equal number of households (HHs)
- One segment randomly chosen and all HHs in that segment surveyed
- Partially addresses selection and non-response bias
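The segmentation step can be sketched as follows, assuming a complete ordered listing of households exists for the selected cluster. In the field, segments are usually drawn on a map, so the household IDs, the helper names, and the target segment size of 7 are hypothetical choices for illustration.

```python
import random


def split_into_segments(households, n_segments):
    """Divide an ordered household list into n_segments of near-equal size."""
    base, extra = divmod(len(households), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        size = base + (1 if i < extra else 0)   # spread any remainder evenly
        segments.append(households[start:start + size])
        start += size
    return segments


def choose_compact_segment(households, target_size, rng):
    """Segment the cluster, then pick one segment at random.

    Every household in the returned segment is to be surveyed.
    """
    n_segments = max(1, round(len(households) / target_size))
    segments = split_into_segments(households, n_segments)
    return rng.choice(segments)


# Hypothetical cluster with 35 listed households.
cluster = [f"HH-{i:03d}" for i in range(1, 36)]
rng = random.Random(1)
segment = choose_compact_segment(cluster, target_size=7, rng=rng)
print(len(segment), "households to survey:", segment)
```

Because every household in the chosen segment is surveyed, interviewers have no discretion over which households to include, which is how this design reduces selection bias.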