Homework Read! Try all problems! Check odds!! Read pp

Presentation on theme: "Homework Read! Try all problems! Check odds!! Read pp"— Presentation transcript:

1 Homework Read! Try all problems! Check odds!! Read pp p. 439 # 27-30, 33, 35, 37, 41, 43-46 7.2 Quiz tomorrow 2nd Chance MC Ch 6 – Th & Fri during lunch Ch 6 Bonus Test – Th after school Ch 7 Test next Monday 1/14 Unit 2 Celebration – next Tuesday Unit 2 Test – next Thursday

2 Simulations
To illustrate the general behavior of samples of fixed size n, samples each of size 30, 60 and 120 were generated from this uniform distribution and the means calculated. Probability histograms were created for each of these (simulated) sampling distributions. Notice that all three look essentially normally distributed. Further, note that the variability decreases as the sample size increases.
[Figure: probability histograms of the sample means for n=30, n=60 and n=120, each on a horizontal axis running from 2 to 4]
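The simulation described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the uniform distribution is taken to run from 2 to 4 (read off the axis labels), and the number of repetitions (1000), the seed, and the function name `simulate_sample_means` are all hypothetical choices, not values from the slide.

```python
import random
import statistics

def simulate_sample_means(n, reps=1000, low=2.0, high=4.0, seed=0):
    """Return `reps` sample means, each computed from n draws of Uniform(low, high).

    low/high, reps and seed are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.uniform(low, high) for _ in range(n)])
        for _ in range(reps)
    ]

spreads = {}
for n in (30, 60, 120):
    means = simulate_sample_means(n)
    spreads[n] = statistics.stdev(means)  # spread of the simulated sampling distribution
    print(f"n={n:3d}  center={statistics.fmean(means):.3f}  spread={spreads[n]:.4f}")
```

The printed spread should shrink as n grows, which is the pattern the three histograms show.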

3 Simulations
To further illustrate the general behavior of samples of fixed size n, samples each of size 4, 16 and 32 were generated from the positively skewed distribution pictured below.
[Figure: skewed distribution]
Notice that these sampling distributions are all skewed, but as n increased the sampling distributions became more symmetric and eventually appeared to be almost normally distributed.
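A rough sketch of the same experiment follows. The slide does not say which skewed distribution was used, so an exponential distribution stands in for it here; that choice, the 10,000 repetitions, the seed, and the helper names are all assumptions made for illustration.

```python
import random
import statistics

def skewness(values):
    """Mean of cubed z-scores: near 0 for a symmetric distribution, positive for right skew."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps=10000, seed=0):
    """Means of `reps` samples of size n from an exponential population (assumed stand-in)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

skews = {n: skewness(sample_means(n)) for n in (4, 16, 32)}
for n, value in skews.items():
    print(f"n={n:2d}  skewness of the sample means = {value:.2f}")
```

A skewness that moves toward 0 as n increases is the numerical counterpart of the histograms becoming more symmetric.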

4 What Is a Sampling Distribution?
Describing Sampling Distributions
Spread: Low variability is better!
To get a trustworthy estimate of an unknown population parameter, start by using a statistic that’s an unbiased estimator. This ensures that you won’t tend to overestimate or underestimate. Unfortunately, using an unbiased estimator doesn’t guarantee that the value of your statistic will be close to the actual parameter value.
[Figure: sampling distributions for n=100 and n=1000]
Larger samples have a clear advantage over smaller samples. They are much more likely to produce an estimate close to the true value of the parameter.
Variability of a Statistic
The variability of a statistic is described by the spread of its sampling distribution. This spread is determined primarily by the size of the random sample. Larger samples give smaller spread. The spread of the sampling distribution does not depend on the size of the population, as long as the population is at least 10 times larger than the sample.

5 Sampling Distributions
Why does population size have little influence on the behavior of statistics from random samples? Imagine sampling harvested corn by thrusting a scoop into a lot of corn kernels. The scoop does not know whether it is surrounded by a bag of corn or by an entire truckload. As long as the corn is well mixed (so the scoop selects a random sample), the variability of the result depends only on the size of the scoop: a larger scoop means decreased variability. Thus, you do not need larger samples to get good estimates for larger populations.
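The corn-scoop argument can be checked with a small simulation. The sketch below draws simple random samples of the same size from a "bag" and from a "truckload"; the population sizes, the 30% success proportion, the sample size of 100, and the function name `sd_of_phat` are hypothetical values picked for the illustration.

```python
import random
import statistics

def sd_of_phat(pop_size, n=100, p=0.3, reps=2000, seed=1):
    """Simulated SD of the sample proportion for SRSs of size n from a population of pop_size."""
    rng = random.Random(seed)
    cutoff = round(p * pop_size)  # individuals numbered below cutoff count as "successes"
    phats = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)  # SRS without replacement
        phats.append(sum(i < cutoff for i in sample) / n)
    return statistics.stdev(phats)

sd_bag = sd_of_phat(10_000)        # the "bag of corn"
sd_truck = sd_of_phat(1_000_000)   # the "truckload"
print(f"population 10,000:    sd = {sd_bag:.4f}")
print(f"population 1,000,000: sd = {sd_truck:.4f}")
```

Both spreads should come out close to each other even though one population is 100 times larger.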

6 What Is a Sampling Distribution?
Describing Sampling Distributions
The fact that statistics from random samples have definite sampling distributions allows us to answer the question, “How trustworthy is a statistic as an estimator of the parameter?” To get a complete answer, we consider the center, spread, and shape.
Center: Biased and unbiased estimators
In the chips example, we collected many samples of size 20 and calculated the sample proportion of red chips. How well does the sample proportion estimate the true proportion of red chips, p = 0.5? Note that the center of the approximate sampling distribution is close to 0.5. In fact, if we took ALL possible samples of size 20 and found the mean of those sample proportions, we’d get exactly 0.5.
Definition: A statistic used to estimate a parameter is an unbiased estimator if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
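The "ALL possible samples" claim can be illustrated numerically. The sketch below assumes the chips are drawn independently, so the count of red chips in a sample of 20 is binomial; that modeling assumption, the 500 simulated samples, the seed, and the function name are not from the slide.

```python
import math
import random
import statistics

def exact_mean_of_phat(n, p):
    """Mean of the sample proportion under a binomial model, summed over every possible count."""
    return sum(
        (k / n) * math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )

n, p = 20, 0.5
exact_mean = exact_mean_of_phat(n, p)

# A modest simulation, comparable to collecting many samples by hand.
rng = random.Random(2)
phats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(500)]
simulated_mean = statistics.fmean(phats)

print(f"exact mean of p-hat     = {exact_mean:.6f}")
print(f"simulated mean of p-hat = {simulated_mean:.3f}")
```

The exact mean equals p, while the simulated mean is only close to p, which mirrors the distinction between "close to 0.5" and "exactly 0.5" on the slide.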

7 What Is a Sampling Distribution?
Describing Sampling Distributions
Center: Biased and unbiased estimators
“Unbiased” does not mean perfect. An unbiased estimator will almost always provide an estimate that is not equal to the value of the population parameter. It is called “unbiased” because in repeated samples, the estimates won’t consistently be too high or too low.
Also, when we talk about biased and unbiased estimators, we are assuming that the sampling process we are using has no bias. That is, there are no sampling or nonsampling errors present, just sampling variability. These types of errors also introduce bias and would lead to our estimates being consistently too low or consistently too high, even if we are using what is otherwise considered an unbiased estimator.

8 High/low? Variability/bias?

9 What Is a Sampling Distribution?
Describing Sampling Distributions
Bias, variability, and shape
We can think of the true value of the population parameter as the bull’s-eye on a target and of the sample statistic as an arrow fired at the target. Both bias and variability describe what happens when we take many shots at the target. Did you notice that the heading “Bias, variability, and shape” corresponds to the characteristics of a distribution: center, spread, and shape?
Bias means that our aim is off and we consistently miss the bull’s-eye in the same direction. Our sample values do not center on the population value.
High variability means that repeated shots are widely scattered on the target. Repeated samples do not give very similar results.
The lesson about center and spread is clear: given a choice of statistics to estimate an unknown parameter, choose one with no or low bias and minimum variability.

10 High/low? Variability/bias?

11 What Is a Sampling Distribution?
Describing Sampling Distributions
Bias, variability, and shape
Sampling distributions can take on many shapes. The same statistic can have sampling distributions with different shapes depending on the population distribution and the sample size. Be sure to consider the shape of the sampling distribution before doing inference.
[Figure: sampling distributions for different statistics used to estimate the number of tanks in the German Tank problem. The blue line represents the true number of tanks.]
Note the different shapes. Which statistic gives the best estimator? Why?
Mathematicians in D.C. suggested the partition method, which uses the statistic (5/4)(maximum). Why did they suggest this?

12 What Is a Sampling Distribution?
Describing Sampling Distributions
Here are five more methods for estimating the total number of tanks:
Partition = (5/4)(maximum)
Max = maximum
MeanMedian = mean + median
SumQuartiles = Q1 + Q3
TwiceIQR = 2(IQR)
The graph below shows the approximate sampling distribution for each of these statistics when taking 250 samples of size 4 from a population of 342 tanks.
Which of these statistics appear to be biased estimators? Explain.
Of the unbiased estimators, which is the best? Justify your answer.
Explain why a biased estimator might be preferred over an unbiased estimator. (Why might we want to use Max?)
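The sampling distributions in the graph can be approximated with a short simulation. The sketch below follows the slide's setup (250 samples of size 4 from tanks numbered 1 to 342) but makes two assumptions of its own: tanks are sampled without replacement, and the quartiles of a 4-value sample are taken as the medians of the lower and upper halves. The seed and the helper name `estimators` are also hypothetical.

```python
import random
import statistics

N_TANKS = 342     # true number of tanks, from the slide
SAMPLE_SIZE = 4
REPS = 250

def estimators(sample):
    """Compute the five statistics from the slide for one sample of serial numbers."""
    s = sorted(sample)
    q1 = (s[0] + s[1]) / 2   # assumed quartile rule: median of the lower half
    q3 = (s[2] + s[3]) / 2   # assumed quartile rule: median of the upper half
    return {
        "Partition": 1.25 * s[-1],
        "Max": s[-1],
        "MeanMedian": statistics.fmean(s) + statistics.median(s),
        "SumQuartiles": q1 + q3,
        "TwiceIQR": 2 * (q3 - q1),
    }

rng = random.Random(3)
results = {}
for _ in range(REPS):
    sample = rng.sample(range(1, N_TANKS + 1), SAMPLE_SIZE)  # SRS without replacement
    for name, value in estimators(sample).items():
        results.setdefault(name, []).append(value)

summary = {
    name: (statistics.fmean(values), statistics.stdev(values))
    for name, values in results.items()
}
for name, (center, spread) in summary.items():
    print(f"{name:13s} center={center:7.1f}  spread={spread:6.1f}")
```

Comparing each center with 342 suggests which statistics are biased, and comparing the spreads among the roughly unbiased ones helps answer which is best.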

13 Goal for Good Estimator
We want low bias and low variability. Since we are using unbiased statistics (the sample mean and the sample proportion), we have low bias! Thus, we need to focus on creating low variability by choosing a sufficiently large simple random sample.

14 Answers to last night’s evens
7.18 (a) A larger sample does not reduce the bias of a poll result. If the sampling technique results in bias, simply increasing the sample size will not reduce the bias. (b) A larger sample will reduce the variability of the result. More people means more information, which means less variability.
7.20 (a) If we choose many samples, the average of the x-bar values from these samples will be close to μ. In other words, the sampling distribution is centered at the population mean we are trying to estimate. (b) A larger sample will give more information and, therefore, more precise results. The variability in the distribution of the sample average decreases as the sample size increases.
e b

15 Chapter 7: Sampling Distributions
Section 7.2 Sample Proportions Day 3 / 7

16 Chapter 7 Sampling Distributions
7.1 What is a Sampling Distribution? – 2 days
7.2 Sample Proportions – 1 day
7.3 Sample Means – 2 days

17 Section 7.2 Sample Proportions
Learning Objectives
After this section, you should be able to…
FIND the mean and standard deviation of the sampling distribution of a sample proportion
DETERMINE whether or not it is appropriate to use the Normal approximation to calculate probabilities involving the sample proportion
CALCULATE probabilities involving the sample proportion
EVALUATE a claim about a population proportion using the sampling distribution of the sample proportion

18 Recall
Parameter (population): mean: μ; standard deviation: σ; proportion: p. Sometimes we call the parameters “true”: true mean, true proportion, etc.
Statistic (sample): mean: x-bar; standard deviation: s; proportion: p-hat. Sometimes we call the statistics “sample”: sample mean, sample proportion, etc.

19 Chapter 7 Sample Proportions
What proportion of U.S. teens know that 1492 was the year in which Columbus “discovered” America? A Gallup Poll found that 210 out of a random sample of 501 American teens aged 13 to 17 knew this historically important date. The sample proportion p-hat = 210/501 = 0.42 is the statistic that we use to gain information about the unknown population parameter p. Since another random sample of 501 teens would likely result in a different estimate, we can only say that “about” 42% of U.S. teenagers aged 13 to 17 know this fact. But what does “about” mean?

20 Form teams of three. Each team will be given a brown bag containing 100 Skittles. Assume that each sample is independent and representative of all Skittles produced. From your bag, choose four random samples of n = 25 and calculate the proportion of yellow Skittles for each sample. Add your data to the board. Now, assume your bag of Skittles is a sample of n = 100. Find the value of p-hat and record your data on the board. Calculate the mean and standard deviation of each sampling distribution.

21 Sample proportions
We charted the proportion of yellow Skittles. How would you describe the two sampling distributions we created? According to Wrigley, the actual proportion of yellow Skittles is approximately p = .20; that is, about 20% of all Skittles are yellow. How did the mean of our distribution compare to p (the population proportion)? What changed when our sample size changed?
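The class activity can be imitated in code for comparison with the data on the board. This sketch assumes p = 0.20 as quoted above and treats each Skittle as an independent draw; the 2000 repetitions, the seed, and the function name `simulate_phats` are hypothetical.

```python
import random
import statistics

P_YELLOW = 0.20   # approximate population proportion quoted on the slide

def simulate_phats(n, reps=2000, seed=0):
    """Simulate `reps` sample proportions of yellow Skittles for samples of size n."""
    rng = random.Random(seed)
    return [sum(rng.random() < P_YELLOW for _ in range(n)) / n for _ in range(reps)]

stats_by_n = {}
for n in (25, 100):
    phats = simulate_phats(n)
    stats_by_n[n] = (statistics.fmean(phats), statistics.stdev(phats))
    print(f"n={n:3d}  mean of p-hat={stats_by_n[n][0]:.3f}  sd of p-hat={stats_by_n[n][1]:.3f}")
```

The mean should stay near 0.20 for both sample sizes while the standard deviation drops for n = 100.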

22 3 important facts about the sampling distribution of p-hat
Its mean is very close to p.
Its shape is close to normal if your sample size is large enough.
Its standard deviation gets smaller as the sample size gets larger.

23 The Sampling Distribution of p-hat
What did you notice about the shape, center, and spread of each sampling distribution?

24 As sample size increases, the variability decreases.
The Sampling Distribution of p-hat
In Chapter 8, we learned that the mean and standard deviation of a binomial random variable X are μ_X = np and σ_X = √(np(1 − p)). Since p-hat = X/n, the mean and standard deviation of p-hat are μ_p-hat = p and σ_p-hat = √(p(1 − p)/n).
These formulas are included on your formula sheet for the AP exam. They are listed in the section with the formulas for the binomial distribution.
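For reference, the step from the binomial formulas to the formulas for p-hat can be written out as follows. This is the standard derivation, assuming X counts successes in n independent trials with success probability p.

```latex
\begin{align*}
\hat{p} &= \frac{X}{n}, \qquad X \sim \text{Binomial}(n, p) \\
\mu_{\hat{p}} &= \frac{1}{n}\,\mu_X = \frac{np}{n} = p \\
\sigma_{\hat{p}} &= \frac{1}{n}\,\sigma_X = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}
\end{align*}
```

Because n sits in the denominator of the last expression, the standard deviation shrinks as the sample size grows.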

25 Standard Deviations of Sample Proportions
The standard deviation of p-hat gets smaller as the sample size n increases. That is, p-hat is less variable in larger samples. The formula for the standard deviation of p-hat does not apply when the sample is a large part of the population. You cannot use this formula if you choose an SRS of 50 from a population of 100. In practice, we usually take a sample only when the population is large, because if the population is relatively small, we can use a census.

26 How large is large enough?
So, what size of population requires sampling? When can we accurately use the standard deviation formula for a sampling distribution of a proportion? Rule of Thumb #1: Only use the formula for the standard deviation of p-hat when the population is at least 10 times as large as the sample: 10n ≤ population
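To see why the rule matters, the usual formula can be compared with the exact standard deviation for sampling without replacement, which includes the finite population correction factor √((N − n)/(N − 1)). The sketch below uses the SRS of 50 from a population of 100 mentioned earlier; the value p = 0.5, the second population size, and the function names are assumptions made for illustration.

```python
import math

def sd_formula(n, p):
    """The usual standard deviation formula for p-hat."""
    return math.sqrt(p * (1 - p) / n)

def sd_exact_srs(n, p, population):
    """Exact SD of p-hat for an SRS without replacement (finite population correction)."""
    fpc = math.sqrt((population - n) / (population - 1))
    return sd_formula(n, p) * fpc

n, p = 50, 0.5   # p is a hypothetical value for illustration
ratios = {}
for population in (100, 10_000):
    ratios[population] = sd_exact_srs(n, p, population) / sd_formula(n, p)
    print(f"population={population:6d}  10n <= population: {10 * n <= population}  "
          f"exact/formula = {ratios[population]:.3f}")
```

When the rule of thumb fails, the formula noticeably overstates the spread; when it holds, the two values nearly agree.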

27 Normality of Sample Proportions
The sampling distribution of p-hat is approximately normal and is closer to a normal distribution when the sample size n is large. The accuracy of the normal approximation improves as the sample size n increases. For a fixed sample size n, the normal approximation is most accurate when p is close to ½. The normal approximation is no good at all when p = 1 or p = 0. Rule of Thumb #2: Only use the normal approximation to the sampling distribution of p-hat for values of n and p that satisfy np ≥ 10 and n(1 − p) ≥ 10

28 Sampling Distribution of a Sample Proportion
As n increases, the sampling distribution of p-hat becomes approximately Normal. Before you perform Normal calculations, check that the Normal condition is satisfied: np ≥ 10 and n(1 – p) ≥ 10.

29 Summary: Rules of Thumb
When dealing with proportions, before using the rules for standard deviation and assuming normality to estimate probabilities, you MUST ALWAYS check BOTH rules of thumb.
Rule of Thumb #1: 10n ≤ population (Ensures we can assume individuals are independent)
Rule of Thumb #2: np ≥ 10 and n(1 − p) ≥ 10 (Ensures the sample is sufficiently large)
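Both checks are simple enough to wrap in a helper. The function below is a hypothetical convenience, not part of the course materials; its name and the dictionary keys are made up for this sketch.

```python
def check_rules_of_thumb(n, p, population):
    """Return whether each rule of thumb holds for sample size n, proportion p, and population size."""
    return {
        "rule_1_population": 10 * n <= population,
        "rule_2_counts": n * p >= 10 and n * (1 - p) >= 10,
    }

# Example values: an SRS of 1500 from 1.7 million with p = 0.35,
# and an SRS of 50 from a population of 100 with p = 0.5.
print(check_rules_of_thumb(1500, 0.35, 1_700_000))
print(check_rules_of_thumb(50, 0.5, 100))
```

Only when both entries are True should the standard deviation formula and the normal approximation be used together.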

30 APPLYING TO COLLEGE
A polling organization asks an SRS of 1500 first-year college students whether they applied for admission into any other college. In fact, 35% of all first-year students applied to colleges besides the one they are attending. There are over 1.7 million first-year college students.
What is the probability that the random sample of students will give a result within 2 percentage points of this true value?
Identify and label all variables: n = 1500; p = .35; population = 1.7 million
Want normal curve, need std dev; check rules of thumb
Rule of thumb 1: (10)(1500) = 15,000 ≤ 1,700,000 ✓
Rule of thumb 2: (1500)(0.35) = 525 ≥ 10 and (1500)(0.65) = 975 ≥ 10 ✓

31 APPLYING TO COLLEGE
What is the probability that the random sample of students will give a result within 2 percentage points of this true value?
Find the mean and standard deviation: mean = p = .35 and standard deviation = √((.35)(.65)/1500) ≈ .0123, so p-hat is approximately N(.35, .0123)
Draw a picture, write a probability statement, use your calculators or z-charts: P(.33 ≤ p-hat ≤ .37)
Answer the question: approximately 90%
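The same calculation can be reproduced with Python's standard library in place of a calculator or z-chart. This is a sketch of the normal-approximation step only; the variable names are arbitrary.

```python
import math
from statistics import NormalDist

n, p = 1500, 0.35

mean = p                          # mean of the sampling distribution of p-hat
sd = math.sqrt(p * (1 - p) / n)   # standard deviation of p-hat

sampling_dist = NormalDist(mu=mean, sigma=sd)
prob = sampling_dist.cdf(0.37) - sampling_dist.cdf(0.33)   # P(.33 <= p-hat <= .37)

print(f"sd = {sd:.4f}")
print(f"P(.33 <= p-hat <= .37) = {prob:.4f}")
```

The result should agree with the slide's answer of about 90%.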

32 Helsinki Heart Study
The Helsinki Heart Study asks whether the anti-cholesterol drug gemfibrozil will reduce heart attacks. In planning such an experiment, the researchers must be confident that the sample sizes are large enough to enable them to observe enough heart attacks. The Helsinki study plans to give gemfibrozil to 2000 men and a placebo to another group of men. The probability of a heart attack during the 5-year period of the study for men this age is about 0.04. We can think of the study participants as an SRS from a large population, of which the proportion p = .04 will have heart attacks. What is the probability that the group will suffer at least 75 heart attacks?
Identify and label all variables
Want normal curve, need std dev, check rules of thumb
Find the mean and standard deviation
Draw a picture, write a probability statement, use your calculators or z-charts
Answer the question
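One way to work through these steps in code is sketched below. It assumes "the group" means the 2000 men given gemfibrozil and that 4% of them would have heart attacks, as stated in the problem; rule of thumb 1 is taken as satisfied because the problem describes the population as large. Variable names are arbitrary.

```python
import math
from statistics import NormalDist

n, p = 2000, 0.04
threshold = 75 / n   # "at least 75 heart attacks" expressed as a sample proportion

# Rule of thumb 2: expected counts of successes and failures are both at least 10.
counts_ok = n * p >= 10 and n * (1 - p) >= 10

mean = p
sd = math.sqrt(p * (1 - p) / n)
prob_at_least_75 = 1 - NormalDist(mu=mean, sigma=sd).cdf(threshold)

print(f"rule of thumb 2 satisfied: {counts_ok}")
print(f"threshold p-hat = {threshold:.4f}, mean = {mean:.2f}, sd = {sd:.5f}")
print(f"P(p-hat >= {threshold:.4f}) = {prob_at_least_75:.3f}")
```

Because the threshold lies below the mean of 0.04, the probability comes out greater than one half.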

33 More
Suppose you are going to roll a fair six-sided die 60 times and record p-hat, the proportion of times that a 1 or a 2 is showing.
1. Where should the distribution of the 60 p-hat values be centered? Justify your answer.
2. What is the standard deviation of the sampling distribution of p-hat, the proportion of all rolls of the die that show a 1 or a 2?
3. Describe the shape of the sampling distribution of p-hat. Justify your answer.
Power companies kill trees growing near their lines to avoid power failures due to falling limbs in storms. Applying a chemical to slow the growth of the trees is cheaper than trimming, but the chemical kills some of the trees. Suppose that one such chemical would kill 20% of sycamore trees. The power company tests the chemical on 250 sycamores. Consider these an SRS from the population of all sycamore trees.
4. What are the mean and standard deviation of the proportion of trees that are killed?
5. What is the probability that at least 60 trees (24% of the sample) are killed? (Remember to check that you can use the Normal approximation.)


