Published by Johnathan Goodwin. Modified over 8 years ago.
1
Chapters 9.1, 9.3, 10.2
Sampling Distributions in general; Sampling Distributions for means in particular; Hypothesis Testing
2
Sampling Distribution
Def: A sampling distribution is the distribution of a particular statistic taken from all possible samples of a particular size from a particular population.
Know this definition!
3
But nobody’s ever seen one!
Sampling distributions aren’t real. They’re imaginary. No one has ever taken all possible groups of ten babies, measured their average weights, then plotted them on a graph.
4
Why the big fuss?
We really would like to be able to infer something about a population from a sample. (Recall: this is how statistics was first described, as opposed to probability.)*
* McGrath: draw the drawing on the board!
Hold that thought…
5
Why the big fuss? (Confidence Intervals and such)
We really would like to be able to infer (or estimate) something about a population based on a sample. (Recall: this is how statistics was first described, as opposed to probability.)*
* McGrath: draw the drawing on the board!
6
Why the big fuss?
Also, we would like to be able to answer questions like this: “If I find a sample of babies whose average weight is 10 pounds, is this unusual, or is it just chance variation?”
Hypothesis Testing!! Hold these thoughts…
7
Sampling Distribution
Example: Consider the population of the weights of babies.
Now consider the population of the mean weights of all possible groups of ten babies!
This distribution of sample means would be (you guessed it) a sampling distribution.
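Nobody can take all possible groups of ten babies, but a computer can take a lot of them. The sketch below is only an approximation of the idea: it assumes a made-up population of baby weights (normal, mean 7.5 lb, s.d. 1.2 lb, numbers chosen purely for illustration), draws 5000 random groups of ten, and collects each group's mean. The function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(population_mean, population_sd, sample_size,
                          num_samples, seed=0):
    """Draw many random samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(population_mean, population_sd)
                  for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population of baby weights in pounds (illustrative only).
sample_means = simulate_sample_means(7.5, 1.2, sample_size=10,
                                     num_samples=5000)

print(f"number of sample means:   {len(sample_means)}")
print(f"mean of the sample means: {statistics.fmean(sample_means):.3f}")
print(f"s.d. of the sample means: {statistics.stdev(sample_means):.3f}")
```

A histogram of `sample_means` would be a rough picture of the sampling distribution for n = 10.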
8
Properties of Sampling Distributions (Focus on Means, not Proportions or other sample statistics)
The mean of a sampling distribution is the same as the mean of the population distribution: $\mu_{\bar{x}} = \mu$
9
Properties of Sampling Distributions (Focus on Means, not Proportions or other sample statistics)
The standard deviation of a sampling distribution is the population s.d. divided by the square root of the sample size: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
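For a population small enough, "all possible samples" can actually be listed. The following sketch assumes a hypothetical four-value population and samples of size 2 drawn with replacement (so the draws are independent), then checks both properties: the mean of the sample means against the population mean, and their s.d. against the population s.d. over the square root of n.

```python
import itertools
import math
import statistics

population = [6.0, 7.0, 8.0, 9.0]  # hypothetical tiny population
n = 2                               # sample size

# Every possible ordered sample of size n, drawn with replacement.
all_samples = list(itertools.product(population, repeat=n))
sample_means = [statistics.fmean(sample) for sample in all_samples]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population s.d.

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"population mean = {mu:.4f}, mean of sample means = {mean_of_means:.4f}")
print(f"sigma / sqrt(n) = {sigma / math.sqrt(n):.4f}, "
      f"s.d. of sample means = {sd_of_means:.4f}")
```

Sampling without replacement from a small population would give a slightly smaller s.d., which is why this sketch samples with replacement.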
10
What if the “parent” population is normal?
If the population distribution (the distribution of X) is normal with mean μ and standard deviation σ, then the distribution of sample means is normal with mean μ and standard deviation σ/√n, REGARDLESS OF SAMPLE SIZE.
“Normal begets Normal”
11
But what if the parent population is not normal?
Even if the parent population is skewed, uniform, bizarro... who cares?
The sampling distribution will be approximately normal, as long as the sample size is big enough.
12
The Central Limit Theorem
Even if the parent population is skewed, uniform, bizarro... who cares?
The sampling distribution will be approximately normal, as long as the sample size is big enough.
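One rough way to watch the Central Limit Theorem at work is to start from a badly skewed parent population and measure how lopsided the sample means still are as n grows. The sketch below assumes an exponential parent population (chosen here only because it is strongly right-skewed) and uses a simple skewness measure, where 0 means symmetric. The helper names are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Average cubed z-score: about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def sample_means_from_exponential(sample_size, num_samples, rng):
    """Means of many samples drawn from a right-skewed parent population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


rng = random.Random(1)  # fixed seed so the run is repeatable
skew_by_n = {}
for n in (1, 5, 30):
    means = sample_means_from_exponential(n, 4000, rng)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: skewness of the sample means = {skew_by_n[n]:.2f}")
```

The skewness shrinks toward 0 as n increases, which is the "approximately normal, as long as the sample size is big enough" claim in miniature.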
13
We’re Saved!!! Yea!
Even if we know nothing whatsoever about the shape of the parent population, we can do groovy, Table A, Normalcdf-type analysis involving samples!
14
But there’s a catch…
The more non-normal the parent distribution is, the bigger the sample size required to achieve approximate normality.
In practice, a simple random sample of 25 is usually enough. Some say 30.
Furthermore, finding a true SRS? That’s the hard part!
15
To summarize (for means)
1. All sampling distributions have mean μ and standard deviation σ/√n.
2. Normal populations guarantee a normal sampling distribution (exactly), regardless of sample size.
3. The CLT says that, as long as n is big, we have an approximately normal sampling distribution.
16
Testing a Hypothesis (10.2)
Statistical Significance: “A finding is statistically significant if it is so rare that it would be unlikely to be found:
1. By chance alone
2. If the null hypothesis is true”
17
Testing a Hypothesis (10.2)
“By Chance Alone” means you are working with a Simple Random Sample, i.e., your sample was just as likely to be used as any other sample of that same size.
“Give chance a chance!”
18
Testing a Hypothesis (10.2)
“If the null hypothesis is true” means that the middle of your sampling distribution really is where you assumed it to be according to your null hypothesis,
i.e., $\mu_{\bar{x}} = \mu_{H_0}$
19
P-value
The probability of finding your sample mean or something more extreme (assuming that the middle of your sampling distribution is at the value stated in Ho) is called a P-value.
20
α-level
α is your ‘line in the sand’ for making a decision.
If P ≤ α, reject Ho.
If P > α, fail to reject Ho.
Typical values for α: .05, .01, .001
21
The Logic of the Test
Assume your null hypothesis is true. (IOW, your sampling distribution really is centered at the Ho value of μ.)
Look at your sample. Take its mean.
Find the probability of finding your x-bar or something more extreme. (z-score or normalcdf)
If that P-value is small, reject Ho. Else, “fail to reject” Ho.
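The logic above can be sketched as one small function. This is an illustration rather than a standard library routine: the name `one_sample_z_test` and its parameters are made up here, it assumes σ is known and the sampling distribution is (approximately) normal, and it stands in for the calculator's normalcdf with Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n,
                      alternative="greater", alpha=0.05):
    """Return (z, p_value, decision) for a z-test on a mean, sigma known."""
    # Assume Ho is true: sampling distribution centered at mu_0.
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu_0) / standard_error

    # Probability of this x-bar or something more extreme.
    lower_tail = NormalDist().cdf(z)
    upper_tail = 1.0 - lower_tail
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = lower_tail
    elif alternative == "two-sided":
        p_value = 2.0 * min(lower_tail, upper_tail)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

    # Compare against the line in the sand.
    decision = "reject Ho" if p_value <= alpha else "fail to reject Ho"
    return z, p_value, decision


# Hypothetical numbers: ten babies averaging 10 lb, when Ho says the mean
# is 7.5 lb with a known s.d. of 1.2 lb.
z, p_value, decision = one_sample_z_test(10.0, mu_0=7.5, sigma=1.2, n=10)
print(f"z = {z:.2f}, P-value = {p_value:.2e}, decision: {decision}")
```

Keeping the direction of Ha as an explicit argument forces the one-tailed versus two-tailed choice to be made before looking at the P-value.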
22
The Hypothesis Test Toolbox (Required on all such problems involving a hypothesis test*)
1. State the population of interest. State the parameter of interest in words and symbols. State the Null & Alternate Hypotheses in words and symbols.
2. Identify the Test you’re using (state it by name). Satisfy Conditions (or Make Assumptions).
3. Perform Calculations (including a drawing), determine your P-value, and state whether you Reject or Fail to Reject Ho, and why.
4. Write your conclusion in context.
* If you have to ask, the answer is yes!
23
Example 1: “Kids, Learn your Stats!”
The long-term average of the number of students taking AP Statistics in US high schools is 42, with a standard deviation of 10. After a presidential announcement that American students need better understanding of research techniques, a simple random sample of 36 schools is taken the following year by the USDOE. The average number of students in the sample is 44.2.
Is this sufficient evidence that the president’s announcement had a statistically significant effect?
24
Example: “Kids, Learn your Stats!” Step 1
Population: US high schools
Parameter: Mean number of students at each school taking AP Statistics, μ
Ho: The mean number of AP Stats students per school is 42 (μ = 42)
Ha: The mean is now greater than 42 (μ > 42)
25
Example: “Kids, Learn your Stats!” Step 2 (Test and Conditions)
Test: 1-sample z-test for means (because σ is known)
Conditions: SRS? Satisfied per the problem statement.
26
Example: “Kids, Learn your Stats!” Step 3
Calculations, P-value, and Reject/Fail to Reject
[Drawing: sampling distribution centered at 42 with standard deviation 10/√36; sample mean 44.2 marked]
P(x-bar ≥ 44.2) = normalcdf(left boundary = 44.2, right boundary = 1E99, mean = 42, standard dev = 10/√36) = .0934
Or…
P(x-bar ≥ 44.2) = P(z ≥ (44.2 - 42)/(10/√36)) = P(z ≥ 1.32) = .0934
Because this sample mean (or one more extreme) could be found more than 9 times out of 100 by chance alone if the true mean is 42, fail to reject Ho.
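For anyone without a TI calculator handy, the same Step 3 arithmetic can be checked in Python. This is a sketch that uses `statistics.NormalDist` in place of normalcdf; the variable names are illustrative, and the α = .05 cutoff is an assumption, since the slide does not name one.

```python
import math
from statistics import NormalDist

mu_0 = 42           # mean under Ho
sigma = 10          # known population s.d.
n = 36              # number of schools sampled
sample_mean = 44.2  # observed x-bar

# Sampling distribution of x-bar, assuming Ho is true.
sampling_dist = NormalDist(mu=mu_0, sigma=sigma / math.sqrt(n))

z = (sample_mean - mu_0) / sampling_dist.stdev
p_value = 1 - sampling_dist.cdf(sample_mean)  # upper tail, since Ha is mu > 42

alpha = 0.05  # assumed cutoff, not stated on the slide
decision = "reject Ho" if p_value <= alpha else "fail to reject Ho"

print(f"z = {z:.2f}")
print(f"P-value = {p_value:.4f}")
print(decision)
```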
27
Example: “Kids, Learn your Stats!” Step 4
Conclusion in Context
This sample does not provide statistically significant evidence that the mean number of students per high school taking AP Statistics rose as a result of the presidential announcement.
28
Example: Campagnolo Bottom Bracket Spindle Manufacture
The specified diameter of a Campagnolo bottom bracket spindle is 15.000 mm. Lately, the factory has been getting complaints about out-of-spec spindles. They seem to be coming out sometimes too big, sometimes too small. Historically, the standard deviation of the process is 0.005 mm.
Ernesto from the Quality Control department takes a sample of 36 spindles from the production line and finds an average diameter of 14.996 mm.
Is this statistically significant evidence that the process is too “sloppy”?
29
Example: Campagnolo BB spindle Step 1
Population: Campagnolo bottom bracket spindles
Parameter: Mean diameter, μ
Ho: The mean diameter is 15.000 mm; μ = 15.000
Ha: The mean diameter is different from 15.000 mm; μ ≠ 15.000
Note: This is a two-tailed Alternate Hypothesis!
30
Example: Campagnolo BB spindle Step 2 (Test and Conditions)
Test: 1-sample z-test for means (because σ is known)
Conditions: SRS? Problem says merely “sample.” We must assume this to be an SRS and proceed with caution.
31
Example: Campagnolo BB spindle Step 3
Calculations, P-value, and Reject/Fail to Reject
[Drawing: sampling distribution centered at 15.000 with standard deviation .005/√36; sample mean 14.996 marked]
P(x-bar ≤ 14.996) = normalcdf(left boundary = -1E99, right boundary = 14.996, mean = 15, standard dev = .005/√36) ≈ 7.9 × 10^-7
P-value is twice this because it’s two-tailed: ≈ 1.6 × 10^-6
Because this value is extremely small, we reject Ho robustly!
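The two-tailed calculation can be checked the same way. This sketch again uses `statistics.NormalDist` as a stand-in for normalcdf, with illustrative variable names; the only step that differs from a one-tailed test is the doubling at the end.

```python
import math
from statistics import NormalDist

mu_0 = 15.000         # specified diameter (mm), the Ho value
sigma = 0.005         # historical process s.d. (mm)
n = 36                # spindles sampled
sample_mean = 14.996  # observed x-bar (mm)

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu_0) / standard_error

# One tail: probability of an x-bar this far below 15.000 or farther.
lower_tail = NormalDist().cdf(z)

# Ha is two-tailed (mu != 15.000), so count the matching upper tail too.
p_value = 2 * min(lower_tail, 1 - lower_tail)

print(f"z = {z:.2f}")
print(f"one-tail probability = {lower_tail:.2e}")
print(f"two-tailed P-value   = {p_value:.2e}")
```

The sample mean sits 4.8 standard errors below the target, which is why the P-value is so small even though the difference is only 0.004 mm.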
32
Example: Campagnolo BB spindle Step 4
Conclusion in Context
Evidence from this sample suggests strongly that Campy’s bottom bracket spindle diameters are, on average, different from 15.000 mm. Adjustment and tuning of the process are clearly needed!