Sampling Variability Sampling Distributions

Presentation on theme: "Sampling Variability Sampling Distributions" - Presentation transcript:

1 Sampling Variability Sampling Distributions
Chapter 8 Sampling Variability Sampling Distributions Created by Kathy Fritz

2 Suppose that you are interested in learning about the proportion of women in the group of students pictured below and that this group is the entire population of interest. The proportion of women is 0.56. You can compute this proportion because the picture provides complete information on gender for the entire population (a census).

3 But suppose that the population information is not available.
To learn about the proportion of women in the population, you decide to select a sample from the population by choosing 5 students at random. Here is one possible sample. The proportion of women is 3/5 = 0.6. Here is a different sample. The proportion of women is 2/5 = 0.4. Notice that the proportions from the two different samples are NOT the same AND that neither of these proportions equals the population proportion. In this chapter, you will learn about how the value of the sample proportion varies from sample to sample (sampling variability) AND about the long-run behavior of sample proportions (the sampling distribution).

4 Statistics and Sampling Variability

5 Statistic
A number computed from the values in a sample is called a statistic. The observed value of a statistic varies from sample to sample depending on the particular sample selected. This variability is called sampling variability. Recall the notation for population characteristics: the population mean $\mu$, the population standard deviation $\sigma$, and the population proportion $p$. Statistics, such as the sample mean $\bar{x}$, the sample median, the sample standard deviation $s$, and the sample proportion $\hat{p}$, provide information about population characteristics.

6 Consider a small population consisting of 100 students in an introductory psychology course.
Students in the class completed a survey on academic procrastination. Suppose that on the basis of their responses, 40% of the students were identified as severe procrastinators, so $p = 40/100 = 0.40$. Let's investigate what happens if random samples of size 20 are selected from this population. To do this, write the numbers 1 to 100 on slips of paper, where 1 to 40 represent students who are severe procrastinators. Mix the slips well, then select 20 slips without replacement.

7 Looking at some additional samples will provide some insight.
One sample of size 20 might be: 78 16 31 86 5 67 39 28 97 70 34 65 47 89 26 79 52 4 82 6. The nine highlighted numbers (those from 1 to 40) correspond to students who were identified as severe procrastinators, so $\hat{p} = 9/20 = 0.45$. This value is 0.05 larger than the population proportion of 0.40. Is this difference typical, or is this particular sample proportion unusually far away from $p$? Looking at some additional samples will provide some insight.
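The slips-of-paper procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_proportion` and the random seed are hypothetical choices, not part of the original slides. The sketch first recomputes the proportion for the sample listed on the slide, then draws one new random sample of 20 slips without replacement.

```python
import random

POPULATION_SIZE = 100      # slips numbered 1 to 100
NUM_PROCRASTINATORS = 40   # slips 1 to 40 stand for severe procrastinators
SAMPLE_SIZE = 20

def sample_proportion(rng):
    """Draw SAMPLE_SIZE slips without replacement and return the sample proportion."""
    slips = rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE)
    successes = sum(1 for slip in slips if slip <= NUM_PROCRASTINATORS)
    return successes / SAMPLE_SIZE

# The sample listed on the slide: nine of its numbers are 40 or less.
slide_sample = [78, 16, 31, 86, 5, 67, 39, 28, 97, 70,
                34, 65, 47, 89, 26, 79, 52, 4, 82, 6]
slide_successes = sum(1 for slip in slide_sample if slip <= NUM_PROCRASTINATORS)
slide_p_hat = slide_successes / SAMPLE_SIZE
print(slide_successes, slide_p_hat)  # 9 0.45

# One fresh random sample (the seed is an arbitrary assumption).
rng = random.Random(1)
print(sample_proportion(rng))
```

Running `sample_proportion` repeatedly gives a different value most of the time, which is the sampling variability the slide describes.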

8 The histogram shows the sampling variability in the statistic $\hat{p}$.
Severe procrastinators, continued. Let's look at 50 samples of size 20.
[Table of $\hat{p}$ for samples 1 to 50. Legible entries: sample 1: 0.45; sample 3: 0.35; sample 12: 0.50; sample 13: 0.20; sample 16: 0.40; sample 21: 0.25; sample 23: 0.40; sample 26: 0.35; sample 31: 0.30; sample 32: 0.55; sample 36: 0.65.]
It is difficult to see, by looking at this table, if a sample proportion of 0.45 is typical or unusual. Let's look at a histogram of these sample proportions. The histogram shows the sampling variability in the statistic $\hat{p}$. It also provides an approximation to the distribution of $\hat{p}$ values that would have been observed if you had considered EVERY different possible sample of size 20 from this population.
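A rough text version of such a histogram can be produced by simulation. The sketch below assumes the procrastination population from the earlier slides (40 of 100 students) and samples of size 20; the seed and the helper name `draw_p_hat` are hypothetical. The simulated values will not match the 50 values in the slide's table.

```python
import random
from collections import Counter

rng = random.Random(2024)            # arbitrary seed, for repeatability only
population = [1] * 40 + [0] * 60     # 1 = severe procrastinator, 0 = not

def draw_p_hat(n=20):
    """Sample n students without replacement and return the sample proportion."""
    return sum(rng.sample(population, n)) / n

p_hats = [draw_p_hat() for _ in range(50)]

# Text histogram: one row per observed value, one star per sample.
counts = Counter(p_hats)
for value in sorted(counts):
    print(f"{value:.2f} {'*' * counts[value]}")
```

Each row shows how many of the 50 samples produced that proportion, so the tallest rows indicate which values are typical.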

9 Sampling Distribution
The distribution formed by the values of a sample statistic for every possible different sample of a given size is called its sampling distribution.
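For the small procrastination population, the sampling distribution over every possible sample can be computed exactly rather than approximated. The sketch below is an illustration, not part of the original slides: it assumes that sampling 20 of 100 students without replacement makes the number of successes follow a hypergeometric distribution, and the names `N`, `K`, `n`, and `prob_successes` are hypothetical.

```python
from math import comb

N = 100   # population size
K = 40    # severe procrastinators in the population
n = 20    # sample size

def prob_successes(k):
    """Probability that a random sample of size n contains exactly k successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Sampling distribution of the sample proportion: value -> probability.
distribution = {k / n: prob_successes(k) for k in range(n + 1)}

total_probability = sum(distribution.values())
mean_p_hat = sum(value * prob for value, prob in distribution.items())

print(f"number of different samples: {comb(N, n)}")
print(f"total probability: {total_probability:.6f}")
print(f"mean of the sample proportion: {mean_p_hat:.4f}")
print(f"P(p-hat = 0.45): {distribution[0.45]:.4f}")
```

The mean of this exact distribution equals the population proportion 0.40, which is the behavior the simulated histograms only approximate.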

10 Sampling Distribution of a Sample Proportion

11 In the fall of 2008, there were 18,516 students enrolled at California Polytechnic State University, San Luis Obispo. Of these students, 8091 (43.7%) were female. What would you expect to see for the sample proportion of females if you were to take a random sample of size 10 from this population? With success denoting a female student, the proportion of successes in this population is $p = 0.437$. Using a statistical software package, we will generate 500 samples of size 10 and compute the proportion of females for each sample.

12 The relative frequency histogram displays the 500 values of $\hat{p}$. Notice that there is a lot of sample-to-sample variability in the value of $\hat{p}$, with values from 0.0 to 0.9. This tells you that a sample of size 10 from this population of students may not provide very accurate information about the proportion of women in the population. How would selecting a larger size sample affect the distribution of $\hat{p}$?

13 California Polytechnic State University Continued . . .
We will generate 500 samples of each of the following sample sizes: n = 10, n = 25, n = 50, n = 100 and compute the proportion of females for each sample. The following histograms display the distributions of the sample proportions for the 500 samples of each sample size. Compare the center, spread, and shape of these histograms.
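The slides do not name the statistical software package, so the following is only one possible way to run the same experiment, here in plain Python. It assumes the population counts given earlier (8091 females among 18,516 students); the seed and the function name `simulate_p_hats` are hypothetical, and the simulated numbers will differ from the slide's histograms.

```python
import random
from statistics import mean, pstdev

TOTAL_STUDENTS = 18516
FEMALE_STUDENTS = 8091
NUM_SAMPLES = 500

# 1 = female student (a "success"), 0 = otherwise.
population = [1] * FEMALE_STUDENTS + [0] * (TOTAL_STUDENTS - FEMALE_STUDENTS)
rng = random.Random(8)  # arbitrary seed, for repeatability only

def simulate_p_hats(n, num_samples=NUM_SAMPLES):
    """Return the sample proportions from num_samples random samples of size n."""
    return [sum(rng.sample(population, n)) / n for _ in range(num_samples)]

summary = {}
for n in (10, 25, 50, 100):
    p_hats = simulate_p_hats(n)
    summary[n] = (mean(p_hats), pstdev(p_hats))
    print(f"n = {n:3d}: mean = {summary[n][0]:.3f}, sd = {summary[n][1]:.3f}")
```

Comparing the printed rows addresses center and spread: the means should sit near 0.437 for every sample size, while the standard deviations should shrink as n grows.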

14 What do you notice about the shape of these distributions?
Are these histograms centered around the population proportion p = 0.437? What do you notice about the standard deviation of these distributions? What do you notice about the shape of these distributions?

15 General Properties for Sampling Distributions of $\hat{p}$
$\hat{p}$ is the proportion of successes in a random sample of size n from a population where the proportion of successes is p. The mean value of $\hat{p}$ is denoted by $\mu_{\hat{p}}$, and the standard deviation of $\hat{p}$ is denoted by $\sigma_{\hat{p}}$.
Rule 1: $\mu_{\hat{p}} = p$. This means that the $\hat{p}$ values from many different random samples will tend to cluster around the actual value of the population proportion.
Rule 2: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$. This rule is exact if the population is infinite, and is approximately correct if the population is finite and no more than 10% of the population is included in the sample. This means that for larger samples, $\hat{p}$ values tend to cluster more tightly around the actual value of the population proportion.
Rule 3: When n is large and p is not too near 0 or 1, the sampling distribution of $\hat{p}$ is approximately normal. Does this mean that the sampling distribution of $\hat{p}$ will always be approximately normal? For what values of n and p is the sampling distribution of $\hat{p}$ approximately normal?
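Rules 1 and 2 are simple enough to evaluate directly. The sketch below applies them to the Cal Poly example (p = 0.437, population of 18,516) for the four sample sizes used earlier; the function names `mean_p_hat`, `sd_p_hat`, and `within_ten_percent` are hypothetical labels for illustration.

```python
from math import sqrt

def mean_p_hat(p):
    """Rule 1: the mean of the sample proportion equals p."""
    return p

def sd_p_hat(p, n):
    """Rule 2: the standard deviation of the sample proportion."""
    return sqrt(p * (1 - p) / n)

def within_ten_percent(n, population_size):
    """Check that the sample is no more than 10% of the population."""
    return n <= 0.10 * population_size

p = 0.437
population_size = 18516
for n in (10, 25, 50, 100):
    print(f"n = {n:3d}: mean = {mean_p_hat(p):.3f}, "
          f"sd = {sd_p_hat(p, n):.4f}, "
          f"10% condition met: {within_ten_percent(n, population_size)}")
```

Because n appears in the denominator under the square root, quadrupling the sample size halves the standard deviation.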

16 When is the sampling distribution of $\hat{p}$ approximately normal?
The farther the value of p is from 0.5, the larger n has to be in order for $\hat{p}$ to have a sampling distribution that is approximately normal. A conservative rule of thumb is that if both $np \ge 10$ and $n(1 - p) \ge 10$, then the sampling distribution of $\hat{p}$ is approximately normal. An equivalent way of stating this rule is to say that the sampling distribution of $\hat{p}$ is approximately normal if the sample size is large enough to expect at least 10 successes and at least 10 failures in the sample.
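The rule of thumb can be written as a small check. The sketch below applies it to the Cal Poly proportion p = 0.437 at the sample sizes used earlier; the function name `approximately_normal` is a hypothetical label for illustration.

```python
def approximately_normal(n, p, threshold=10):
    """Rule of thumb: expect at least `threshold` successes and failures."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

p = 0.437
for n in (10, 25, 50, 100):
    print(f"n = {n:3d}: np = {n * p:.2f}, n(1 - p) = {n * (1 - p):.2f}, "
          f"approximately normal: {approximately_normal(n, p)}")
```

With p = 0.437 the check fails for n = 10, where only about 4.4 successes are expected, and passes from n = 25 upward. A value of p farther from 0.5 would need a larger n to pass.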

17 How Sampling Distributions Support Learning from Data

18 What role do sampling distributions play in learning about a population characteristic?
In an estimation situation, you need to understand sampling variability to assess how close an estimate is likely to be to the actual value of the corresponding population characteristic. Sample data can also be used to evaluate whether or not a claim about a population is believable. There are two reasons why a sample statistic may not equal the value of the claim:
1. Sampling variability.
2. There really is a difference between the corresponding population characteristic and the claim value.

19 How accurate is this estimate likely to be?
Each person in a random sample of U.S. adults was asked the following question, "Do you believe that extraterrestrial beings have visited Earth at some time in the past?" Of the 1255 people in the survey, 442 answered yes to this question, resulting in a sample proportion of $\hat{p} = 442/1255 = 0.35$. The population proportion who believe that extraterrestrial beings have visited Earth probably isn't exactly 0.35. How accurate is this estimate likely to be? Let's look at the sampling distribution of $\hat{p}$.
Rule 1: $\mu_{\hat{p}} = p$, so we know that the sampling distribution of $\hat{p}$ is centered at p.
Rule 2: $\sigma_{\hat{p}} \approx \sqrt{\frac{(0.35)(0.65)}{1255}} = 0.013$, so the $\hat{p}$ values will be tightly clustered around the actual value of the population proportion.
Rule 3: The sampling distribution of $\hat{p}$ is approximately normal since there are 442 successes and 813 failures, both of which are at least 10.
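The arithmetic behind the three rules for this survey can be checked with a short sketch. It rests on two assumptions worth stating: the unknown p is replaced by the sample proportion when estimating the standard deviation (which is why the slide writes "approximately"), and the final step uses the standard normal curve as an approximation. The helper name `normal_cdf` is hypothetical.

```python
from math import erf, sqrt

yes_answers = 442
sample_size = 1255

# Sample proportion and estimated standard deviation (p-hat substituted for p).
p_hat = yes_answers / sample_size
sd_estimate = sqrt(p_hat * (1 - p_hat) / sample_size)

# Rule 3 check: at least 10 successes and at least 10 failures.
failures = sample_size - yes_answers
large_enough = yes_answers >= 10 and failures >= 10

def normal_cdf(z):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Under the normal approximation, the chance that a sample proportion
# lands within 2 standard deviations of the population proportion.
within_two_sd = normal_cdf(2) - normal_cdf(-2)

print(f"p-hat = {p_hat:.2f}")
print(f"estimated sd = {sd_estimate:.3f}")
print(f"successes = {yes_answers}, failures = {failures}, large enough: {large_enough}")
print(f"P(within 2 sd) is about {within_two_sd:.3f}")
```

Read together, the outputs suggest that an estimate from a sample this large would usually fall within about 0.026 of the actual population proportion, though that reading depends on the normal approximation holding.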

