1 Aspects of Sampling for Household Surveys
Kathleen Beegle
Workshop 17, Session 1c: Designing and Implementing Household Surveys
March 31, 2009
2 Overview
Households should be selected through a documented process that gives each household in the population of interest a probability of being chosen that is positive and known.
This permits making inferences from the sample to the entire population with known margins of error.
Household samples are generally not simple random samples. They are instead:
  Stratified by region, by urban/rural, by intervention/control, …
  Selected in two or more stages:
    Area units in the first stage(s) (enumeration areas, clusters)
    Households in the last stage (randomly drawn within areas)
3 Outline
Sampling error
Stages & stratification
Design effect
Implementing a sample design
Non-response
Non-sampling error
Sampling for impact evaluation
4 Sampling Error
Sampling error is the result of observing a sample of n households (the sample size) rather than all N households in the country.
The standard error e is a measure of a sample’s precision. There is about a 95 percent chance that the true value of an indicator lies within 2e of its sampling estimate (and only about a 5 percent chance that it lies farther away).
The standard error e decreases with the square root of the sample size n. To cut the error in half, the sample size must be quadrupled.
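The square-root rule can be checked numerically. The sketch below assumes simple random sampling and a hypothetical standard deviation of 100 for the indicator; the function name and the values are illustrative only.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean under simple random sampling."""
    return sd / math.sqrt(n)

sd = 100.0  # hypothetical standard deviation of the indicator (an assumption)
for n in (500, 2000, 8000):
    e = standard_error(sd, n)
    # About 95% of samples give an estimate within 2e of the true value.
    print(f"n={n:>5}  e={e:.3f}  approx. 95% margin: +/- {2 * e:.3f}")
```

Each row uses four times the sample size of the previous one, and the standard error is cut in half each time.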
5 Sampling Error, cont.
The size of the population N has almost no influence on the size of the sample that is needed to achieve a given precision!
To obtain national estimates, big countries and small countries require samples of about the same size.
Increasing the sample size will generally reduce sampling error.
However, it is also likely to increase non-sampling error.
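One way to see why N matters so little is to include the finite population correction in the standard error. This is a sketch assuming simple random sampling without replacement; the standard deviation, sample size, and population sizes are hypothetical.

```python
import math

def standard_error_fpc(sd: float, n: int, N: int) -> float:
    """SE of a mean under simple random sampling without replacement,
    including the finite population correction sqrt((N - n) / (N - 1))."""
    return sd / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

sd, n = 100.0, 2000  # hypothetical values
for N in (500_000, 5_000_000, 50_000_000):
    print(f"N={N:>11,}  e={standard_error_fpc(sd, n, N):.4f}")
```

With the same n, the standard error changes by well under one percent as the population grows a hundredfold.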
6 Stages & stratification
Why two stages?
  An updated list of all households in the country is generally unavailable.
  A single-stage sample would be too scattered across the territory.
  Two-stage sampling solves these problems, but the sample becomes less precise as a result of clustering.
Why stratification?
  To potentially improve precision, by gaining control over the composition of the sample.
  To provide estimates for subgroups that would otherwise be poorly represented (small regions, female-headed households, etc.).
Most stratified samples select households with unequal probabilities. This implies that the survey needs to be analyzed with weights.
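A minimal sketch of why weights are needed, assuming each household's weight is the inverse of its selection probability. The data, the probabilities, and the function name are hypothetical.

```python
def weighted_mean(values, selection_probs):
    """Estimate a population mean, weighting each household by
    1 / (its probability of selection)."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical data: two households from an oversampled small region
# (selected with probability 1/100) and two from the rest of the
# country (selected with probability 1/1000).
consumption = [10.0, 12.0, 20.0, 22.0]
probs = [0.01, 0.01, 0.001, 0.001]

unweighted = sum(consumption) / len(consumption)
weighted = weighted_mean(consumption, probs)
print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

The unweighted mean is pulled toward the oversampled region; the weighted mean gives each sampled household the number of population households it stands for.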
7 Design Effect
Any two households from the same area/location tend to be more homogeneous than any two from different locations (across areas).
The combined result of clustering and stratification is called the design effect.
The design effect (deff) depends on the cluster size: the number of households in each enumeration area/cluster.
  LSMS and HIES surveys try not to exceed households per cluster.
  Demographic surveys may occasionally do more.
Tradeoff between the number of households per cluster (more per cluster lowers survey costs) and error (which goes up with clustering).
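The dependence on cluster size can be sketched with a common textbook approximation, deff = 1 + (m - 1) * rho, where m is the number of households per cluster and rho is the intra-cluster correlation. This formula is not given on the slide, it covers only the clustering part of the design effect (not stratification), and the values of rho and the total sample size are hypothetical.

```python
def design_effect(m: int, rho: float) -> float:
    """Approximate design effect from clustering alone:
    deff = 1 + (m - 1) * rho (an assumed textbook approximation)."""
    return 1.0 + (m - 1) * rho

def effective_sample_size(n: int, m: int, rho: float) -> float:
    """Size of a simple random sample with the same precision."""
    return n / design_effect(m, rho)

n_total = 2400  # hypothetical total number of households
rho = 0.05      # hypothetical intra-cluster correlation
for m in (8, 12, 24):
    deff = design_effect(m, rho)
    n_eff = effective_sample_size(n_total, m, rho)
    print(f"{n_total // m:>3} clusters x {m:>2} households: "
          f"deff={deff:.2f}, effective n={n_eff:.0f}")
```

For a fixed total sample, fewer and larger clusters are cheaper to visit but yield a smaller effective sample size.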
8 Implementing a sample design
An adequate sample frame needs to be available before a sample can be selected. A sample frame is a list of all units in the population.
1st stage: the sample frame is usually the most recent list of census enumeration areas (EAs).
  It needs to be linked to cartography. It usually comes from the national statistics office.
2nd stage: the sample frame is usually developed specifically for each survey, by way of a household listing operation conducted in all sampled EAs.
The time and budget of household listing are:
  Small enough to be considered a marginal part of the overall data collection effort
  Large enough to be a headache if they are forgotten or underestimated
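The two frames can be illustrated with a small selection sketch. It assumes equal-probability selection of EAs in the first stage and of households within each selected EA in the second; real designs often differ (for example, selecting EAs with probability proportional to size). The frame, identifiers, and function name are hypothetical.

```python
import random

def two_stage_sample(ea_listings, n_eas, n_households, seed=0):
    """Select EAs, then households within each selected EA.

    ea_listings maps an EA id to the household list produced by the
    listing operation. Returns (ea, household, selection_probability).
    """
    rng = random.Random(seed)
    # Stage 1: draw EAs from the first-stage frame (the list of EAs).
    selected_eas = rng.sample(sorted(ea_listings), n_eas)
    sample = []
    for ea in selected_eas:
        # Stage 2: draw households from that EA's listing.
        listing = ea_listings[ea]
        take = min(n_households, len(listing))
        # Overall probability = P(EA selected) * P(household selected | EA).
        prob = (n_eas / len(ea_listings)) * (take / len(listing))
        for household in rng.sample(listing, take):
            sample.append((ea, household, prob))
    return sample

# Hypothetical frame: 50 EAs with between 80 and 129 listed households.
frame = {
    f"EA{i:03d}": [f"EA{i:03d}-HH{j:03d}" for j in range(80 + i)]
    for i in range(50)
}
sample = two_stage_sample(frame, n_eas=10, n_households=12)
print(len(sample), "households selected")
print(sample[0])
```

Because the EAs differ in size, the selection probabilities differ across households, which is why the probabilities are recorded for later weighting.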
9 Implementing a sample design, cont.
Parts of the country may need to be excluded because of:
  Being outside the program (geographically, organizationally, …)
  Security reasons
  Accessibility
  Nomads
  Etc.
That can be OK, as long as:
  Decisions are properly documented
  Results are not extrapolated later to the whole country
10 Implementing a sample design, cont.
Base the sample design on some pre-identified indicator of impact.
  For example, for a survey whose main focus is poverty, you may estimate the standard error of consumption expenditure for a given sample design.
“Power calculations” for sample design:
  Need to narrow down the focus to a small set of indicators (or even one) to design a sample.
  Need to make assumptions about the mean/distribution of that indicator in the country/region of the survey, usually from existing data.
  Sometimes, no such data exist.
Avoid launching surveys that have no power calculations to determine the sample design.
Invest in hiring a sampling expert.
Document!!!
11 Non-response
Non-response: when a household selected for inclusion is not then interviewed.
Sources of non-response: outdated listing (e.g. household moved to a new dwelling, mortality), refusal, inability to locate the household.
None of the following is a solution for non-response:
  Replace non-respondents with “similar” households
  Increase the sample size to compensate for it
  Use correction formulas
  Use imputation techniques (hot-deck, cold-deck, warm-deck, etc.) to simulate the answers of non-respondents
The best way to deal with non-response is to prevent it.
12 Sample Size
The required sample size n is determined by:
  The variability of the indicator of interest, Var(X). This is unknown, so it is proxied by existing data/evidence.
  The maximum margin of error E we are willing to accept. E is the acceptable error (a value or percentage points).
  How confident we want to be that the error of our estimate will not exceed that maximum. For each confidence level α there is a coefficient t_α.
  The size of the population N: not very important.
Sample size for averages
Sample size for proportions
Correction for finite population (smaller than the original)
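The three cases named on this slide can be sketched with the standard textbook formulas for simple random sampling. These are an assumption about what the slide showed: n = t_α² · Var(X) / E² for averages, n = t_α² · p(1 − p) / E² for proportions, and n / (1 + n / N) as an approximate finite population correction. The function names and example values are illustrative, and the results ignore any design effect.

```python
import math

def n_for_mean(variance: float, E: float, t_alpha: float = 1.96) -> float:
    """Sample size for estimating an average: t^2 * Var(X) / E^2."""
    return t_alpha ** 2 * variance / E ** 2

def n_for_proportion(p: float, E: float, t_alpha: float = 1.96) -> float:
    """Sample size for a proportion, where Var(X) = p * (1 - p)."""
    return t_alpha ** 2 * p * (1 - p) / E ** 2

def finite_population_correction(n: float, N: int) -> float:
    """Approximate corrected sample size for a population of size N."""
    return n / (1 + n / N)

# Hypothetical example: a proportion near 0.5, a margin of error of
# 5 percentage points, and 95% confidence (t_alpha of about 1.96).
n0 = n_for_proportion(p=0.5, E=0.05)
print("uncorrected:", math.ceil(n0))
for N in (10_000, 10_000_000):
    print(f"N={N:>10,}:", math.ceil(finite_population_correction(n0, N)))
```

The corrected size is always smaller than the uncorrected one, and the two are practically identical once N is large, which is why the population size is "not very important".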
13 Non-sampling error
The quality of the data depends on both sampling error (the integrity of the sample design) and non-sampling error.
Non-sampling error reflects the quality of the data collected: completeness of the listing exercise, reliability of responses, mis-reporting, recoding errors, mis-measurement in general.
Larger samples, larger teams, and a shorter time frame for field work can all lead to more non-sampling error, because field work becomes harder to supervise and monitor.
14 Sampling for impact evaluation (1)
Sample design usually aims to produce a sample that will measure, with a certain degree of confidence, the difference between participants and non-participants with respect to some indicator/outcome.
The indicator of interest for designing the sample is usually the expected size of the effect (on the outcome of interest for the program).
  Minimum detectable effect
  Picking effect sizes that are too optimistic (a higher effect assumed) will usually result in a sample that is too small.
Still want/need pre-existing data for an estimate of the variance of the outcome of interest.
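The link between the assumed effect size and the sample size can be sketched with the usual two-group formula for individual-level randomization: n per group = 2 · (z_{1−α/2} + z_{power})² · sd² / effect². The formula, the 5% significance level, the 80% power, and the effect sizes below are assumptions for illustration, not values from the slide.

```python
import math
from statistics import NormalDist

def n_per_arm(effect: float, sd: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Households needed in each of two equal-sized groups to detect
    a difference in means of size `effect` (two-sided test)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * (z * sd / effect) ** 2)

sd = 1.0  # hypothetical standard deviation, taken from pre-existing data
for effect in (0.4, 0.2, 0.1):
    print(f"assumed effect {effect:.1f} sd -> {n_per_arm(effect, sd)} per group")
```

Halving the assumed effect roughly quadruples the required sample, so a design based on an optimistic effect will be too small to detect the true, smaller one.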
15 Sampling for impact evaluation (2)
If treatment is randomized across villages (clustered design), statistical power or precision is lower than for individual randomization, often by a lot, due to the design effect.
Generally, for the sample, the number of individuals in each cluster (village) matters less than the number of clusters.
For a program implemented at the village level, it is difficult to compensate for too few villages (clusters) in the treatment/control groups by surveying a large number of households in each village.
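A rough sketch of this point compares the minimum detectable effect (MDE) of several village-randomized designs. It assumes equal-sized clusters, two equal arms, the approximation deff = 1 + (m − 1) · rho, and hypothetical values for rho and the standard deviation; none of these come from the slides.

```python
import math
from statistics import NormalDist

def mde_clustered(clusters_per_arm: int, m: int, rho: float, sd: float = 1.0,
                  alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate minimum detectable effect when whole clusters are
    randomized, with m households surveyed per cluster."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    deff = 1.0 + (m - 1) * rho      # assumed design-effect approximation
    n_per_arm = clusters_per_arm * m
    return z * sd * math.sqrt(2 * deff / n_per_arm)

rho = 0.10  # hypothetical intra-cluster correlation
designs = [(20, 50), (20, 200), (100, 10)]  # (villages per arm, households per village)
for villages, m in designs:
    print(f"{villages:>3} villages x {m:>3} households per arm: "
          f"MDE = {mde_clustered(villages, m, rho):.3f} sd")
```

Under these assumptions, quadrupling the households surveyed in 20 villages improves the MDE only slightly, while spreading the original 1,000 households per arm over 100 villages improves it substantially.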