Chapter 7 Sampling Distributions


1 Chapter 7 Sampling Distributions
7.1 What is a Sampling Distribution?

2 Big Idea #1 Statistic vs. Parameter

3 Parameter
A parameter is a number that describes some characteristic of the population. A parameter always exists, but in practice we rarely know its value because a census is so difficult to carry out.
Ex: If we wanted to compare the IQs of all American and all Asian males, measuring everyone would be impossible, but it’s important to realize that μ_Americans and μ_Asians exist.
Ex: If we were interested in whether a greater percentage of women than men eat broccoli, we would want to know whether p_women > p_men.

4 Statistic
A statistic is a number that describes some characteristic of a sample. It’s important to realize that a statistic can change from sample to sample. We often use statistics to estimate an unknown parameter.
Ex: I take a random sample of 500 American males and find their IQs. We find that x̄ = [value missing].
Ex: I take a random sample of 200 women and find that 40 like broccoli. Then p̂ = 40/200 = 0.2.

5 Symbols Used
              Sample Statistic    Population Parameter
Proportions   p̂                   p
Means         x̄                   μ

6 Example
Identify the population, the parameter, the sample, and the statistic in each of the following settings.
A pediatrician wants to know the 75th percentile for the distribution of heights of 10-year-old boys, so she takes a sample of 50 patients and calculates Q3 = 56 inches.
Population: all 10-year-old boys
Parameter: 75th percentile of heights
Sample: the 50 patients
Statistic: Q3 = 56 inches

7 Example
Identify the population, the parameter, the sample, and the statistic in each of the following settings.
A Pew Research Center poll asked 12- to 17-year-olds in the United States if they have a cell phone. Of the respondents, 71% said yes.
Population: all 12- to 17-year-olds in the United States
Parameter: the proportion p who have a cell phone
Sample: the 12- to 17-year-olds who responded to the poll
Statistic: p̂ = 0.71

8 Check your understanding p. 417
Each boldface number is the value of either a parameter or a statistic. In each case, state which it is and use appropriate notation to describe the number.
On Tuesday, the bottles of Arizona Iced Tea filled in a plant were supposed to contain an average of 20 ounces of iced tea. Quality control inspectors sample 50 bottles at random from the day’s production. These bottles contained an average of 19.6 ounces of iced tea.
Parameter: μ = 20 ounces of iced tea
Statistic: x̄ = 19.6 ounces

9 Check your understanding p. 417
Each boldface number is the value of either a parameter or a statistic. In each case, state which it is and use appropriate notation to describe the number.
On a New York-to-Denver flight, 8% of the 125 passengers were selected for random security screening before boarding. According to the Transportation Security Administration, 10% of passengers at this airport are chosen for random screening.
Parameter: p = 0.10, or 10% of passengers
Statistic: p̂ = 0.08, or 8% of the sample passengers

10 Big Idea #2 Statistics Vary

11 Sampling Variability
Sampling variability: the value of a statistic varies in repeated random sampling.
What would happen if we took many samples? Here’s how to answer that question:
1. Take a large number of samples from the same population.
2. Calculate the statistic for each sample.
3. Make a graph of the values of the statistic.
4. Examine the distribution displayed in the graph for SOCS (shape, outliers, center, spread).
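The four steps above can be sketched as a small simulation. This is a minimal illustration, not part of the original slides: the population (10,000 individuals, 20% of whom have some characteristic), the sample size of 50, and the 1,000 repetitions are all assumed values chosen for the example.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = has the characteristic, 0 = does not (true p = 0.20)
population = [1] * 2000 + [0] * 8000

def sample_proportion(pop, n):
    """Take one SRS of size n (without replacement) and return p-hat."""
    return sum(random.sample(pop, n)) / n

# Steps 1 and 2: take many samples and calculate the statistic for each
p_hats = [sample_proportion(population, 50) for _ in range(1000)]

# Step 3: a crude text "dotplot" (one * per 5 samples at each value of p-hat)
counts = Counter(round(p, 2) for p in p_hats)
for value in sorted(counts):
    print(f"{value:.2f} {'*' * (counts[value] // 5)}")

# Step 4: summarize center and spread
mean_p_hat = sum(p_hats) / len(p_hats)
print("mean of p-hat values:", round(mean_p_hat, 3))
print("smallest and largest p-hat:", min(p_hats), max(p_hats))
```

The printed values of p-hat differ from sample to sample even though the population never changes, which is exactly what sampling variability means.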

12 Big Idea #3 Difference Between:
Population Distribution, Distribution of Sample Data, and Sampling Distribution

13 Sampling Distribution
The sampling distribution of a statistic is the distribution of all values taken by the statistic in all possible samples of the same size from the same population. In practice, we use simulation to take many, many samples and obtain an approximation to the sampling distribution. When we take repeated samples, each sample is returned to the population before the next one is drawn (the samples are taken with replacement), so every sample comes from the same full population. A sampling distribution plays the role of a sample space: it describes everything that can happen when we sample.
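For a very small population, every possible sample can be listed, which gives the exact sampling distribution rather than a simulated approximation. The sketch below is an illustration only; the five population values and the sample size of 2 are assumed for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical tiny population (values invented for illustration)
population = [2, 4, 6, 8, 10]
n = 2

# Every possible SRS of size n (order ignored, no individual repeated)
all_samples = list(combinations(population, n))

# The statistic (the sample mean) for each possible sample
sample_means = [Fraction(sum(s), n) for s in all_samples]

# Exact sampling distribution: each value of x-bar with its probability
sampling_distribution = {
    value: Fraction(count, len(all_samples))
    for value, count in sorted(Counter(sample_means).items())
}

for value, prob in sampling_distribution.items():
    print(f"x-bar = {float(value):4.1f}   probability = {prob}")
```

With realistic population sizes the number of possible samples is far too large to list, which is why simulation is used instead.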

14 Population Distribution
The population distribution gives the values of the variable for all the individuals in the population. It describes individuals.

15 Distribution of Sample Data
The distribution of sample data shows the values of the variable for the individuals in the sample. It describes individuals.

16 Example
A simulation was conducted choosing 500 SRSs of size n = 20 from a population of 200 chips, 100 red and 100 blue. To the left is a dotplot of the values of p̂, the sample proportion of red chips, from these 500 samples.
Is this the sampling distribution of p̂? Justify your answer.
No. It doesn’t show the values of the statistic in all possible samples; it is only an approximation based on 500 samples.
Suppose your teacher prepares a similar bag and claims half of the chips are red. A classmate takes an SRS of 20 chips; 17 of them are red. What would you conclude about your teacher’s claim? Explain.
p̂ = 17/20 = 0.85, or 85%. This value never occurred in the 500 simulated samples. This gives strong evidence against the teacher’s claim.
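The chip simulation described above can be reproduced roughly as follows. This is a sketch, not the software used for the slide: it draws 500 SRSs of size 20 without replacement from 100 red and 100 blue chips, and it also computes the exact chance of getting 17 or more red chips from the hypergeometric counting formula. The seed and the variable names are arbitrary choices.

```python
import random
from math import comb

random.seed(7)  # arbitrary seed so the run is repeatable

chips = ["red"] * 100 + ["blue"] * 100

def red_count(n=20):
    """Draw one SRS of n chips (without replacement) and count the red ones."""
    return random.sample(chips, n).count("red")

# 500 simulated samples, as in the slide
red_counts = [red_count() for _ in range(500)]
p_hats = [count / 20 for count in red_counts]
n_extreme = sum(count >= 17 for count in red_counts)

print("largest simulated p-hat:", max(p_hats))
print("simulated samples with 17 or more red chips:", n_extreme)

# Exact P(17 or more red chips in an SRS of 20) if half the chips are red
prob_17_or_more = sum(
    comb(100, k) * comb(100, 20 - k) for k in range(17, 21)
) / comb(200, 20)
print("exact probability of 17 or more red:", prob_17_or_more)
```

The exact probability is well below 1%, which agrees with the slide's reasoning: a result of 17 red chips would be very surprising if the teacher's claim were true.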

17 Let’s Compare

18 Check your understanding p. 420
Mars, Incorporated, says that the mix of colors in its M&M’s Milk Chocolate Candies is 24% blue, 20% orange, 16% green, 14% yellow, 13% red, and 13% brown. Assume that the company’s claim is true. We want to examine the proportion of orange M&M’s in repeated random samples of 50 candies.
Graph the population distribution. Identify the individuals, the variable, and the parameter of interest.
Individuals: M&M’s Milk Chocolate Candies. Variable: color. Parameter of interest: proportion of orange M&M’s.

19 Check your understanding p. 420
Mars, Incorporated, says that the mix of colors in its M&M’s Milk Chocolate Candies is 24% blue, 20% orange, 16% green, 14% yellow, 13% red, and 13% brown. Assume that the company’s claim is true. We want to examine the proportion of orange M&M’s in repeated random samples of 50 candies.
Imagine taking an SRS of 50 M&M’s. Make a graph showing a possible distribution of the sample data. Give the value of the appropriate statistic for this sample.
Answers will vary. p̂ = (number of orange M&M’s in the sample)/50. For this sample, there are 11 orange M&M’s, so p̂ = 11/50 = 0.22.

20 Check your understanding p. 420
Which of the graphs at the top of p. 421 could be the approximate sampling distribution of the statistic? Explain your choice.
The middle graph. Because the population proportion of orange M&M’s is p = 0.20, the center of the distribution of p̂ should be at approximately 0.20.
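A quick simulation can illustrate why the sampling distribution of p̂ should be centered near 0.20. This sketch assumes the company's stated color mix is true and treats each candy as an independent draw from that mix (a reasonable approximation when the population of candies is very large). The 2,000 simulated samples and the seed are arbitrary choices.

```python
import random

random.seed(3)  # arbitrary seed so the run is repeatable

colors = ["blue", "orange", "green", "yellow", "red", "brown"]
weights = [24, 20, 16, 14, 13, 13]  # percentages from the company's claim

def orange_proportion(n=50):
    """Simulate one sample of n candies and return the proportion that are orange."""
    sample = random.choices(colors, weights=weights, k=n)
    return sample.count("orange") / n

# Approximate sampling distribution of p-hat from many simulated samples
p_hats = [orange_proportion() for _ in range(2000)]
center = sum(p_hats) / len(p_hats)

print("mean of simulated p-hat values:", round(center, 3))
print("smallest and largest p-hat:", min(p_hats), max(p_hats))
```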

21 Big Idea #4 How to Describe a Sampling Distribution

22 Center: Biased and unbiased estimators
A statistic is called an unbiased estimator of a parameter if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
Unbiased doesn’t mean perfect. An unbiased estimator will almost always miss the parameter in any single sample; it is called “unbiased” because, in repeated samples, the estimates are not consistently too high or too low but center on the true value.
The mean of the sampling distribution of the sample mean x̄ always equals the population mean μ, for any sample size.
Biased Estimator: a statistic is a biased estimator if the mean of its sampling distribution is not equal to the true value of the parameter, so its estimates are consistently too high or too low.
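The definition can be checked directly on a tiny population by listing every possible sample. In the sketch below (population values and sample size are assumed for illustration), the sample mean turns out to be an unbiased estimator of the population mean, while the sample range is a biased estimator of the population range: its sampling distribution is centered well below the true value.

```python
from itertools import combinations
from statistics import mean

# Hypothetical tiny population (values invented for illustration)
population = [2, 4, 6, 8, 10]
n = 2

# Every possible SRS of size n
all_samples = list(combinations(population, n))

# True parameter values
pop_mean = mean(population)
pop_range = max(population) - min(population)

# Mean of the sampling distribution of each statistic
mean_of_sample_means = mean(mean(s) for s in all_samples)
mean_of_sample_ranges = mean(max(s) - min(s) for s in all_samples)

print("population mean:", pop_mean, " mean of all sample means:", mean_of_sample_means)
print("population range:", pop_range, " mean of all sample ranges:", mean_of_sample_ranges)
```

The two numbers on the first printed line agree, and the two on the second line do not, which is the difference between an unbiased and a biased estimator.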

23 Spread: Low variability is better!
The variability of a statistic is described by the spread of its sampling distribution. The spread is determined mainly by the sample size, not the population size (as long as the population is at least 10 times as large as the sample). Larger samples give smaller spread, that is, less variability. Note that a larger sample does not eliminate bias!
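The effect of sample size and population size on spread can be seen in a simulation. The sketch below is illustrative only: the two population sizes, the 50% success rate, the sample sizes of 25 and 100, and the 2,000 repetitions are all assumed values.

```python
import random
from statistics import stdev

random.seed(11)  # arbitrary seed so the run is repeatable

def spread_of_p_hat(pop_size, n, reps=2000):
    """Standard deviation of p-hat over many SRSs of size n from a 50/50 population."""
    successes = pop_size // 2
    population = [1] * successes + [0] * (pop_size - successes)
    p_hats = [sum(random.sample(population, n)) / n for _ in range(reps)]
    return stdev(p_hats)

small_n = spread_of_p_hat(pop_size=10_000, n=25)
large_n = spread_of_p_hat(pop_size=10_000, n=100)
bigger_pop = spread_of_p_hat(pop_size=100_000, n=100)

print("n = 25,  population 10,000: ", round(small_n, 3))
print("n = 100, population 10,000: ", round(large_n, 3))
print("n = 100, population 100,000:", round(bigger_pop, 3))
```

Quadrupling the sample size roughly halves the spread, while making the population ten times larger leaves the spread essentially unchanged.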

24 Check Your Understanding p. 426
The histogram above shows the intervals (in minutes) between eruptions of the Old Faithful geyser for all 222 recorded eruptions during a particular month. For this population, the median is 75 minutes. We used Fathom software to take 500 SRSs of size 10 from the population. The 500 values of the sample median are displayed in the histogram above right. The mean of the 500 sample median values is 73.5.
Is the sample median an unbiased estimator of the population median? Justify your answer.
The sample median does not appear to be an unbiased estimator of the population median. The mean of the 500 sample medians is 73.5, whereas the median of the population is 75.

25 Check Your Understanding p. 426
Suppose we had taken samples of size 20 instead of size 10. Would the spread of the sampling distribution be larger, smaller, or about the same? Justify your answer.
With larger samples, the spread of the sampling distribution is smaller, so increasing the sample size from 10 to 20 will decrease the spread of the sampling distribution.
Describe the shape of the sampling distribution. Explain what this means in terms of overestimating or underestimating the population median.
The sampling distribution is skewed to the left. This means that, in general, underestimates of the population median tend to miss by more than overestimates do.

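26 Bias and Variability
Think of the true value of the population parameter as the bull’s-eye on a target and the sample statistic as an arrow fired at the target. Bias means the arrows consistently land off the bull’s-eye in the same direction; variability describes how widely the arrows are scattered.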

27 Match the description to the sampling distribution.
High bias, low variability: D
High bias, high variability: C
Low bias, high variability: B
No bias, low variability: A

