Sampling Distribution of a Sample Mean Lecture 28 Section 8.4 Fri, Nov 5, 2004
Sampling Distribution of the Sample Mean Sampling distribution of the sample mean: the distribution of sample means over all possible samples of size n from a population.
With or Without Replacement? If the sample size is small in relation to the population size (< 5%), then it does not matter whether we sample with or without replacement. The calculations are simpler if we sample with replacement.
Example Suppose a population consists of the numbers {1, 2, 3}. If we select samples of size n = 3, find the sampling distribution of x̄. Draw a tree diagram showing all possibilities.
The Tree Diagram [Figure: tree diagram of the 27 ordered samples of size 3 drawn from {1, 2, 3}; each leaf is labeled with its sample mean, from 1 for (1, 1, 1) to 3 for (3, 3, 3).]
The Sampling Distribution Of the 27 possible samples (equally likely), we find that 1 sample has an average of 1. 3 samples have an average of 4/3 = 1.333. 6 samples have an average of 5/3 = 1.667. 7 samples have an average of 2. 6 samples have an average of 7/3 = 2.333. 3 samples have an average of 8/3 = 2.667. 1 sample has an average of 3.
The Sampling Distribution The sampling distribution of x̄ is
x̄        P(x̄)
1        1/27
1.333    3/27
1.667    6/27
2        7/27
2.333    6/27
2.667    3/27
3        1/27
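The counts above can be checked by brute force. The following is a minimal sketch in Python (not part of the original lecture, which uses a tree diagram); the variable names are illustrative. It lists every ordered sample of size 3 drawn with replacement from {1, 2, 3} and tallies the sample means as exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3]
n = 3

# Every ordered sample of size n drawn with replacement (3**3 = 27 of them).
samples = list(product(population, repeat=n))

# Tally how many samples share each sample mean, using exact fractions.
counts = Counter(Fraction(sum(s), n) for s in samples)

# Each sample is equally likely, so probability = count / number of samples.
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m in distribution:
    print(f"{float(m):.3f}   {counts[m]}/{len(samples)}")
```

The printed counts are left unreduced (for example 3/27 rather than 1/9) so that they line up with the table.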
Probability Histogram [Figure: probability histogram of x̄; horizontal axis from 1 to 3, vertical axis from 1/27 to 7/27.]
Experiment We may simulate this on the TI-83 using the function randInt. The expression randInt(1, 3, 3) will select 3 random numbers from the set {1, 2, 3}, with replacement, and put them in a list.
Experiment For example, randInt(1, 3, 3) gives {3, 3, 2} which has a sample mean of 2.667. Generate a list of 100 such sample means, each from a sample of size 3.
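For readers without a TI-83, the same experiment can be sketched in Python. This is an illustrative stand-in, not the lecture's method: random.Random.randint(1, 3) plays the role of randInt, and the function name simulate_sample_means and the seed value are assumptions made for this example.

```python
import random
from collections import Counter

def simulate_sample_means(num_samples, n, seed=None):
    """Return num_samples sample means, each from n draws (with replacement) from {1, 2, 3}."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    means = []
    for _ in range(num_samples):
        draws = [rng.randint(1, 3) for _ in range(n)]  # like randInt(1, 3, n)
        means.append(sum(draws) / n)
    return means

# 100 sample means, each from a sample of size 3 (the seed is arbitrary).
means = simulate_sample_means(100, 3, seed=2004)

# Frequency of each observed sample mean, smallest first.
for value, freq in sorted(Counter(round(m, 3) for m in means).items()):
    print(f"{value:.3f}  {freq}")
```

Changing the second argument reruns the experiment with a different sample size.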
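Experiment [Figure: frequency histogram of the 100 simulated sample means for n = 3; horizontal axis from 1 to 3, vertical axis from 10 to 50.]
Fit a Normal Curve [Figure: the same histogram with a normal curve fitted over it.]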
Observations The distribution appears to be centered at 2, the population mean. The distribution appears to be approximately normal.
Experiment Now generate a list of 100 sample means, each from a sample of size 30. Draw the distribution.
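Experiment [Figure: frequency histogram of the 100 simulated sample means for n = 30; horizontal axis from 1 to 3, vertical axis from 10 to 50.]
Fit a Normal Curve [Figure: the same histogram with a normal curve fitted over it.]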
The Central Limit Theorem Begin with a population that has mean μ and standard deviation σ. For sample size n, the sampling distribution of the sample mean x̄ is approximately normal with mean μ_x̄ = μ and standard deviation σ_x̄ = σ/√n.
The Central Limit Theorem The approximation gets better as the sample size gets larger. For many populations, the distribution is almost exactly normal when n ≥ 10. For almost all populations, if n ≥ 30, then the distribution is almost exactly normal.
The Central Limit Theorem If the original population is exactly normal, then the sampling distribution of the sample mean is exactly normal for any sample size. This is all summarized on page 500.
Example The population {1, 2, 3} has Mean 2. Variance 2/3. Standard deviation √(2/3) = 0.8165.
Example When n = 3, the sample mean is (very) approximately normal with Mean 2. Standard deviation 0.8165/√3 = 0.4714.
Example When n = 30, the sample mean is approximately (almost exactly) normal with Mean 2. Standard deviation 0.8165/√30 = 0.1491.
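The arithmetic in these examples can be reproduced with a short Python sketch. The helper name standard_error is an assumption introduced here for illustration; the formulas are the population mean, the population variance (dividing by the population size), and the standard deviation of the sample mean, σ/√n.

```python
import math

population = [1, 2, 3]

# Population mean and variance (divide by N, since this is the whole population).
mu = sum(population) / len(population)
variance = sum((x - mu) ** 2 for x in population) / len(population)
sigma = math.sqrt(variance)

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

print(f"mean = {mu}, variance = {variance:.4f}, sd = {sigma:.4f}")
for n in (3, 30):
    print(f"n = {n}: sd of sample mean = {standard_error(sigma, n):.4f}")
```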
Example If I collect (with replacement) a sample of 30 values from this population, what is the probability that the mean will be at least 2.2?
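One way to answer this is with the normal approximation from the Central Limit Theorem. The sketch below is an illustration under that assumption (mean 2, standard deviation √(2/3)/√30, no continuity correction); the function name upper_tail is hypothetical. It uses the identity P(Z ≥ z) = erfc(z/√2)/2 for a standard normal Z.

```python
import math

mu = 2.0                         # population mean of {1, 2, 3}
sigma = math.sqrt(2 / 3)         # population standard deviation
n = 30
se = sigma / math.sqrt(n)        # standard deviation of the sample mean

def upper_tail(x, mean, sd):
    """P(X >= x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

z = (2.2 - mu) / se
p = upper_tail(2.2, mu, se)
print(f"z = {z:.2f}, P(sample mean >= 2.2) is approximately {p:.3f}")
```

Under these assumptions the z-score is about 1.34 and the probability comes out to roughly 0.09.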
Let’s Do It! Let’s do it! 8.9, p. 502 – Probability of Accepting the Shipment. Let’s do it! 8.10, p. 503 –Mean Grocery Expenditure.
Estimating the Population Mean Example 8.12, p. 504 – Estimating the Population Mean Grocery Expenditure. The sampling distribution of x̄ is approximately normal with μ_x̄ = $60 and σ_x̄ = $35/√100 = $3.50. Based on the Empirical Rule, about 95% of all samples have a mean within $7.00 (two standard deviations) of $60, that is, between $53 and $67.
Estimating a Population Proportion Read the article “Water on airlines often unacceptable, finds EPA.” They found that the water in 20 out of 158 airliners contained coliform. So p^ = 20/158 = 0.1266 = 12.66%. What is a good estimate of p, the planes whose water contains coliform as a proportion of all planes?
Estimating a Population Proportion Based on our theory, there is a 95% chance that p^ is within 2 standard deviations of p. Therefore, there is a 95% chance that p is within 2 standard deviations of p^. That is, there’s a 95% chance that p is between p^ - 2σ_p^ and p^ + 2σ_p^, where σ_p^ is the standard deviation of p^.
Estimating a Population Proportion Compute σ_p^ ≈ √(p^(1 - p^)/n) = √((0.1266)(0.8734)/158) = 0.0265 = 2.65%. Therefore, we are 95% sure that the true proportion is between 7.36% and 17.96%. This is called a 95% confidence interval. Is it reasonable to believe that the water on airliners is at least as clean as the water in municipal water systems?
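The interval can be recomputed with a few lines of Python. This is a sketch, not the textbook's procedure: the function name two_sd_interval is made up for the example, and the standard deviation of p^ is estimated by substituting p^ for the unknown p.

```python
import math

def two_sd_interval(successes, n):
    """Return (p_hat, sd, low, high) for the 'p_hat plus or minus 2 sd' interval."""
    p_hat = successes / n
    sd = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated sd of p_hat
    return p_hat, sd, p_hat - 2 * sd, p_hat + 2 * sd

# 20 of the 158 airliners tested had coliform in the water.
p_hat, sd, low, high = two_sd_interval(20, 158)
print(f"p_hat = {p_hat:.4f}, sd = {sd:.4f}")
print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

Because the code carries full precision instead of rounding p^ and its standard deviation to four places first, the endpoints may differ from the slide's 7.36% and 17.96% in the last digit.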
Estimating a Population Mean The Lancet story on civilian casualties in Iraq. A comment on the story.
Let’s Do It! Let’s do it! 8.11, p. 504 – Testing Hypotheses about the Mean Weight of Nuts.