From Sample to Population Often we want to understand the attitudes, beliefs, opinions or behaviour of some population, but only have data on a sample.


1 From Sample to Population Often we want to understand the attitudes, beliefs, opinions or behaviour of some population, but only have data on a sample from that population. E.g. we want to know the proportion of U.S. adults confident in President Obama's handling of the economy, but only have survey data from n=1027 respondents. Gallup April 2009 data: 71% of those surveyed have a “great deal/fair amount of confidence”. How do we move from the known sample statistic, call it s, to the unknown population parameter, call it p? Can we say p=71%? How accurate? How reliable? How confident? Error margin? What sorts of errors may be involved?

2 From Sample to Population We discuss: Parameter versus Statistic; Bias and Variability; Margin of Error; Confidence Statements; Error types/sources (sampling errors and non-sampling errors); Sampling designs.

3 From Sample to Population Parameter: a fixed, unknown number that describes some characteristic of the population. Statistic: a known value calculated from a sample; a statistic is used to estimate a parameter. Two major issues in estimating p from s: Bias: in repeated samples, the sample statistic consistently misses the population parameter in the same direction (e.g. the sampling frame is wrong, or there is undercoverage). Variability: different samples from the same population may yield different values of the sample statistic. We want to minimize both.
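The difference between bias and variability can be seen in a small simulation. The sketch below is only an illustration: the population, its two groups and their "yes" rates are invented, and undercoverage is modelled by a sampling frame that lists only group A.

```python
import random
import statistics

def sample_proportions(frame, n, num_samples, rng):
    """Draw num_samples simple random samples of size n from frame;
    return the proportion of 1s in each sample."""
    return [sum(rng.sample(frame, n)) / n for _ in range(num_samples)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 people (1 = yes, 0 = no).
# Group A: 6,000 people, 70% yes. Group B: 4,000 people, 40% yes.
group_a = [1] * 4200 + [0] * 1800
group_b = [1] * 1600 + [0] * 2400
population = group_a + group_b
p = sum(population) / len(population)  # the parameter, 0.58 here

# Frame covers everyone: statistics scatter around p (variability only).
good = sample_proportions(population, 100, 1000, rng)
# Frame covers only group A: statistics scatter around the wrong value (bias).
bad = sample_proportions(group_a, 100, 1000, rng)

bias_good = statistics.mean(good) - p
bias_bad = statistics.mean(bad) - p
spread_good = statistics.stdev(good)
spread_bad = statistics.stdev(bad)

print(f"full frame:    bias {bias_good:+.3f}, spread {spread_good:.3f}")
print(f"undercoverage: bias {bias_bad:+.3f}, spread {spread_bad:.3f}")
```

Both sets of samples show a similar spread, but only the undercovered frame misses the parameter in the same direction every time. Taking more samples or larger samples from that frame would not remove the miss.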

4 Bias and Variability Figure 3.3 Bias and variability in shooting arrows at a target. Bias means the archer systematically misses in the same direction. Variability means that the arrows are scattered.

5 Bias and Variability

6 To reduce bias, use random sampling: we've seen how “bad” samples can result from convenience sampling and voluntary response samples, leading to bias in estimation results. To reduce variability, use larger samples: an estimate from a random sample tends to be closer to the true population parameter if the sample is larger. (In the limit it would be a census.) Estimates from larger samples differ less from one another (in the limit there is no variation).

7 The Effect of Sample Size: Sampling Distribution for n=100 Figure 3.1 The results of many SRSs have a regular pattern. Here, we draw 1000 SRSs of size 100 from the same population. The population proportion is p = 0.5. The sample proportions vary from sample to sample, but their values center at the truth about the population.

8 The Effect of Sample Size: Sampling Distribution for n=2527 Figure 3.2 Draw 1000 SRSs of size 2527 from the same population as in Figure 3.1. The 1000 values of the sample proportion are much less spread out than was the case for smaller samples.
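The pattern in Figures 3.1 and 3.2 can be reproduced approximately by simulation. This is a minimal sketch, assuming the population is large enough that each respondent can be treated as an independent draw with probability p = 0.5; the function name and the seed are illustrative choices, not part of the slides.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, rng):
    """Return the sample proportion from each of num_samples samples of size n,
    treating every respondent as an independent yes/no draw with P(yes) = p."""
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

rng = random.Random(1)  # fixed seed so the sketch is repeatable

small = simulate_sample_proportions(0.5, 100, 1000, rng)    # as in Figure 3.1
large = simulate_sample_proportions(0.5, 2527, 1000, rng)   # as in Figure 3.2

mean_small, sd_small = statistics.mean(small), statistics.stdev(small)
mean_large, sd_large = statistics.mean(large), statistics.stdev(large)

print(f"n=100:  center {mean_small:.3f}, spread {sd_small:.4f}")
print(f"n=2527: center {mean_large:.3f}, spread {sd_large:.4f}")
```

Both sets of sample proportions center near 0.5, but the spread for n=2527 is roughly one fifth of the spread for n=100, in line with the square-root relation between sample size and variability.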

9 Margin of Error The sample statistic is unlikely to be identical to the population parameter. What's the error margin? (Two elements: the error, and the confidence.)

10 Margin of Error Assuming random sampling, two components: (1) Variation in the population (we'll come back to this later on). The larger the variation, the less accurate is the sample statistic as an estimate under a given sample size. (2) Sample size. Relation: error margin proportional to 1/sqrt(n). (The quick approximate method, error margin ≈ 1/sqrt(n) at 95% confidence, is nearly exact for p=1/2.)
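As a worked illustration of the sample-size relation, the sketch below compares the quick 1/sqrt(n) rule with the usual large-sample formula z * sqrt(p(1-p)/n) at z = 1.96. The formula and the function names are not taken from the slides; they are the standard normal-approximation version, assumed here for comparison.

```python
import math

def quick_margin_of_error(n):
    """Quick approximate 95% margin of error for a sample proportion."""
    return 1 / math.sqrt(n)

def margin_of_error(p_hat, n, z=1.96):
    """Large-sample margin of error for a proportion (assumed standard formula)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

n = 1027  # Gallup April 2009 sample size from slide 1

quick = quick_margin_of_error(n)
at_half = margin_of_error(0.5, n)     # worst case, p = 1/2
at_gallup = margin_of_error(0.71, n)  # using the observed 71%

print(f"quick method:      {quick:.4f}")
print(f"formula, p = 0.50: {at_half:.4f}")
print(f"formula, p = 0.71: {at_gallup:.4f}")

# Quadrupling the sample size halves the margin of error.
ratio = quick_margin_of_error(4 * n) / quick
```

For n=1027 the quick method gives about 3 percentage points. The formula agrees closely at p=1/2 and gives a somewhat smaller margin at 71%, so the quick method is slightly conservative away from 1/2.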

11 Confidence Statement “95%” confidence is the standard, but other levels, such as 99%, can also be used. (What can we do, in terms of error margin and sample size, to increase the confidence level?) Exactly how do we get the confidence statement? We need knowledge of the sampling distribution. (More later.)
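To make the trade-off between confidence level, error margin and sample size concrete, here is a minimal sketch that builds a normal-approximation interval for the Gallup figures from slide 1. The interval formula is the standard large-sample one, assumed here because the slides defer the details; the function name is illustrative.

```python
import math
from statistics import NormalDist

def confidence_interval(p_hat, n, level=0.95):
    """Normal-approximation confidence interval for a proportion
    (assumed standard formula, not taken from the slides)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

p_hat, n = 0.71, 1027  # sample statistic and sample size from slide 1

low95, high95 = confidence_interval(p_hat, n, 0.95)
low99, high99 = confidence_interval(p_hat, n, 0.99)
low95_4n, high95_4n = confidence_interval(p_hat, 4 * n, 0.95)

width95 = high95 - low95
width99 = high99 - low99
width95_4n = high95_4n - low95_4n

print(f"95%, n={n}:   ({low95:.3f}, {high95:.3f})")
print(f"99%, n={n}:   ({low99:.3f}, {high99:.3f})")
print(f"95%, n={4 * n}: ({low95_4n:.3f}, {high95_4n:.3f})")
```

With the same data, moving from 95% to 99% confidence widens the interval. To raise confidence without accepting a wider margin, the sample size has to grow.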

12 What the Margin of Error Doesn't Say The margin of error reflects only the chance variation of random sampling. It does not account for other sampling errors, such as undercoverage or the bias of convenience and voluntary response samples, nor for non-sampling errors, such as non-response, problems in survey question construction and response errors.

13 Non-response

14 Some Issues in Survey Design Induced bias: “If you found a wallet with $20 in it, would you do the right thing and return the money?” Question ordering: “How often do you normally go out on a date? About ___ times a month.” followed by “How happy are you with life in general?” (This induces an association between the questions.) Complex question: “Do you sometimes find that you have arguments with your family members and co-workers?” (If one has arguments only with family members, should one answer “yes” or “no”?)

15 Questions to Ask Before You Believe a Poll Who carried out the survey? What was the population? How was the sample selected? How large was the sample and what was the margin of error? What was the response rate? How were the subjects contacted? When was the survey conducted? What were the exact questions asked? See, e.g. Pew Research Center: http://people-press.org/methodology

16 Example: “University Students are Healthier than You Think” “Random undergraduate classroom survey of n=810 students was conducted by the Office of Health Promotion within the University Student Health Services, Division of Student Affairs. Statistics from this survey led to the following conclusions: - most students (67%) have 0-4 drinks when they go out - most (69%) have had 0-1 sex partners in the past year - most (76%) either don’t drink, or use designated drivers if they do” What questions should you ask to help you assess the credibility of these results?

17 Probability Sampling Plans So far we've been focusing on simple random sampling. In the real world, many surveys use more complex sampling designs (in order to save resources, ensure representation of certain groups, etc.). E.g. stratify on race for a survey on racial relations on campus (you might draw 10% of black students and 1% of white students). Simple random sampling and stratified sampling are both examples of probability sampling, in which the probability of each individual being selected is known, even though the probabilities may not be equal. Weighting may be used to make the sample from a complex plan mimic a simple random sample.
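One common way to weight is to give each respondent a weight equal to one over their selection probability. The sketch below applies this to the 10% / 1% stratified example. The campus sizes (1,000 black students, 10,000 white students) and the response counts are invented for illustration, and the function name is hypothetical.

```python
def weighted_proportion(responses, weights):
    """Proportion of 1s, with each response counted in proportion to its weight."""
    total_weight = sum(weights)
    return sum(r * w for r, w in zip(responses, weights)) / total_weight

# Hypothetical campus: 1,000 black students sampled at 10% (100 respondents),
# 10,000 white students sampled at 1% (100 respondents).
# Invented answers (1 = yes): 60 of 100 in the first stratum, 30 of 100 in the second.
black_responses = [1] * 60 + [0] * 40
white_responses = [1] * 30 + [0] * 70

responses = black_responses + white_responses
# Weight = 1 / selection probability: each respondent "stands for" that many students.
weights = [1 / 0.10] * len(black_responses) + [1 / 0.01] * len(white_responses)

unweighted = sum(responses) / len(responses)
weighted = weighted_proportion(responses, weights)

print(f"unweighted estimate: {unweighted:.3f}")
print(f"weighted estimate:   {weighted:.3f}")
```

The unweighted figure treats the oversampled stratum as half the campus, while the weighted figure restores each stratum to its share of the population.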

18 Multistage Sample Another example of probability sampling. Divide the population of interest into groups. Randomly select some of those groups. Divide the resulting collection of individuals into smaller groups. Randomly select some of those groups. Continue dividing the resulting collection of individuals into groups and randomly selecting some of those groups until you can simply list all of the resulting individuals and randomly select n of them for your sample.

19 Example: Selecting 1500 registered U.S. voters [Use multistage sampling since we don't have a sampling frame (list) of all registered U.S. voters.] Randomly select five U.S. states; obtain a list of all counties/cities in those states; randomly select 20 of those counties/cities; obtain a list of all registered voters in those 20 counties/cities; randomly select 1500 voters from that list.
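The five steps above can be sketched in code. This is only an illustration on synthetic data: the nested frame, its sizes (50 states, 10 counties each, 200 voters each) and all names are invented, and in practice the county and voter lists would be obtained only for the units selected at the previous stage.

```python
import random

def multistage_sample(frame, n_states, n_counties, n_voters, rng):
    """Multistage sample from a nested dict {state: {county: [voter, ...]}}."""
    # Stage 1: randomly select states.
    states = rng.sample(sorted(frame), n_states)
    # Stage 2: list all counties in those states, then randomly select some.
    counties = [(s, c) for s in states for c in sorted(frame[s])]
    chosen_counties = rng.sample(counties, n_counties)
    # Stage 3: list all voters in the chosen counties, then randomly select n.
    voters = [v for s, c in chosen_counties for v in frame[s][c]]
    return rng.sample(voters, n_voters)

# Synthetic frame (hypothetical identifiers of the form voter_<state>_<county>_<k>).
frame = {
    f"state_{i}": {
        f"county_{i}_{j}": [f"voter_{i}_{j}_{k}" for k in range(200)]
        for j in range(10)
    }
    for i in range(50)
}

rng = random.Random(2)  # fixed seed so the sketch is repeatable
sample = multistage_sample(frame, n_states=5, n_counties=20, n_voters=1500, rng=rng)

states_in_sample = {v.split("_")[1] for v in sample}
counties_in_sample = {tuple(v.split("_")[1:3]) for v in sample}

print(len(sample), "voters from", len(counties_in_sample),
      "counties in", len(states_in_sample), "states")
```

Every voter in the sample comes from one of at most 20 counties in at most five states, which is what makes the design cheap to field, at the cost of more variability than a simple random sample of the same size.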

