Weighting data
Weighting: survey respondents are "adjusted" to better represent the target population. To do so, the weight given to each respondent is adjusted to represent the number of similar respondents in the target population.
Example
You do a survey in country X. You have stratified the country into 2 regions, A and B. You find a prevalence of food insecurity of 10% in region A and 20% in region B. What is the prevalence of food insecurity in the whole country?
Exercise 4 cont.
You now know that there are 10,000 households in region A and 40,000 in region B. What is the prevalence of food insecurity in the whole country?
Exercise 4 cont.
In this same example, you now consider that you sampled 100 households from each region. If you analyze the data as they are, what will SPSS tell you the prevalence of food insecurity is in the entire country? Is this accurate?
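The arithmetic behind this exercise can be sketched in a few lines of Python. This is only an illustration using the figures given above (10% and 20% prevalence, 10,000 and 40,000 households, 100 households sampled per region); the variable names are made up for the sketch, and it assumes the regional prevalences are the sample estimates.

```python
# Figures taken from the exercise; the dictionary layout is illustrative only.
regions = {
    "A": {"households": 10_000, "sampled": 100, "prevalence": 0.10},
    "B": {"households": 40_000, "sampled": 100, "prevalence": 0.20},
}

# Unweighted: every sampled household counts the same, so each region
# contributes in proportion to its sample size (100 vs 100).
total_sampled = sum(r["sampled"] for r in regions.values())
unweighted = sum(r["sampled"] * r["prevalence"] for r in regions.values()) / total_sampled

# Weighted: each region contributes in proportion to its real number of households.
total_households = sum(r["households"] for r in regions.values())
weighted = sum(r["households"] * r["prevalence"] for r in regions.values()) / total_households

print(f"Unweighted prevalence: {unweighted:.1%}")
print(f"Weighted prevalence:   {weighted:.1%}")
```

The unweighted figure treats the two regions as equally large, while the weighted figure lets region B, with four times as many households, count four times as much.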
Weighting in analysis
So, as we have seen in the exercise, a stratified sample can lead to serious bias in our results if it is analyzed without weights! Why did this happen? The same number of households was sampled from each region, so the households in region A, which has fewer households, had a better chance of being selected than the households in region B, which has more households.
How do we fix this bias? We can tell SPSS to "count" some households more or less than others, making it AS IF every household had the same probability of being selected (as in a simple random sample, SRS). This is called WEIGHTING.
The equation to calculate these weights can be complicated (example in Excel). The important point to remember is that the potential NEED for weights should be considered. Keep in mind, and always carefully record, the information the analyst will need to calculate these weights:
- Total population size (or number of hhs)
- Population (or number of hhs) for each stratum
- Detailed description of the sampling plan
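For the simplest case, the weight calculation can be sketched as follows. This is an assumption, not the Excel example referred to above: it uses the common design weight for a stratified sample in which households are selected with equal probability within each stratum (households in the stratum divided by households sampled in the stratum). The function name `design_weights` and the figures for regions A and B, reused from the earlier exercise, are for illustration only. Multi-stage designs need additional terms that are not shown here.

```python
def design_weights(population, sample):
    """Return (raw, normalized) weights per stratum.

    population: dict of stratum -> number of households in the stratum
    sample:     dict of stratum -> number of households sampled in the stratum
    Assumes equal-probability selection within each stratum.
    """
    total_pop = sum(population.values())
    total_sample = sum(sample.values())

    # Raw weight: how many households in the population each sampled household represents.
    raw = {s: population[s] / sample[s] for s in population}

    # Normalized weight: rescaled so the weighted sample size equals the actual sample size.
    normalized = {s: raw[s] * total_sample / total_pop for s in population}
    return raw, normalized


raw, normalized = design_weights(
    population={"A": 10_000, "B": 40_000},
    sample={"A": 100, "B": 100},
)
print(raw)         # households represented by each sampled household
print(normalized)  # relative weights that sum to the sample size
```

Both inputs come directly from the three items listed above, which is why they need to be recorded during the survey.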
In-depth weighting exercise
A survey is completed in country X. This country comprises 3 states, and the country director as well as the MOH would like to get both country-level and state-level estimates of food insecurity status. To do so, 1000 hhs per state (30x30 two-stage cluster sample) are surveyed, meaning that 3,000 households are surveyed nationwide.
State 1: total pop = 10,000 hhs; food insecurity rate = 35%
State 2: total pop = 3,000 hhs; food insecurity rate = 10%
State 3: total pop = 15,000 hhs; food insecurity rate = 30%
1. Would this be considered a stratified sample? If so, what kind (disproportionate or proportionate)?
2. In which state is the probability of a hh being sampled the highest?
3. In which state is the probability of a hh being sampled the lowest?
4. Without weighting the data, what is the national prevalence of food insecurity?
5. When the data are weighted, what is the national prevalence of food insecurity?
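One way to check the numerical answers to this exercise is sketched below. It uses only the figures stated in the exercise and makes two simplifying assumptions: the state rates are the sample estimates, and every household within a state had the same chance of selection (the clustering within states is ignored). The variable names are illustrative.

```python
# Figures from the exercise; 1,000 households sampled per state.
states = {
    "State 1": {"households": 10_000, "sampled": 1_000, "rate": 0.35},
    "State 2": {"households": 3_000, "sampled": 1_000, "rate": 0.10},
    "State 3": {"households": 15_000, "sampled": 1_000, "rate": 0.30},
}

# Probability that a given household is sampled, per state.
prob = {name: s["sampled"] / s["households"] for name, s in states.items()}
highest = max(prob, key=prob.get)
lowest = min(prob, key=prob.get)

# Unweighted national prevalence: pooled sample, each household counted once.
total_sampled = sum(s["sampled"] for s in states.values())
unweighted = sum(s["sampled"] * s["rate"] for s in states.values()) / total_sampled

# Weighted national prevalence: each state counted by its number of households.
total_households = sum(s["households"] for s in states.values())
weighted = sum(s["households"] * s["rate"] for s in states.values()) / total_households

for name, p in prob.items():
    print(f"{name}: sampling probability {p:.3f}")
print(f"Highest: {highest}, lowest: {lowest}")
print(f"Unweighted national prevalence: {unweighted:.1%}")
print(f"Weighted national prevalence:   {weighted:.1%}")
```

Because the sampling probabilities differ between states, the sample is disproportionate, and the unweighted and weighted national figures do not agree.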