+ The Practice of Statistics, 4th edition – For AP* STARNES, YATES, MOORE Chapter 8: Estimating with Confidence Section 8.3 Estimating a Population Mean
+ 1-PropZInt STAT, TESTS, A: “1-PropZInt” X = Number of successes in the sample; N = Sample size; C-Level = Confidence level
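For anyone who wants to see what the calculator is doing with those three inputs, here is a minimal sketch. It assumes the standard one-sample z interval for a proportion, p̂ ± z* · √(p̂(1 − p̂)/n). The function name and the sample counts (385 successes out of 500) are made up for illustration and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, c_level):
    """Return (lower, upper) for a one-sample z interval for a proportion.

    x: number of successes, n: sample size, c_level: confidence level (e.g. 0.95).
    """
    p_hat = x / n
    # z* leaves (1 - C)/2 in each tail of the standard Normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical inputs (not from the slides): 385 successes in a sample of 500.
low, high = one_prop_z_interval(x=385, n=500, c_level=0.95)
print(f"({low:.4f}, {high:.4f})")
```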
+ Warm Up The Pew Internet & American Life Project asked a random sample of US teens, “Do you have a cell phone…or a Blackberry, iPhone, or other device that is also a cell phone?” Based on this poll, the 95% confidence interval for the proportion of all US teens that say they have a cell phone is (0.72, 0.82). Source: the 2011 Teens and Digital Citizenship Survey sponsored by the Pew Research Center’s Internet and American Life Project. A) Interpret the confidence interval. B) What is the point estimate that was used to create the interval? What is the margin of error? C) Based on this poll, a newspaper report claims that more than 75% of US teens have a cell phone. Use the confidence interval to evaluate this claim.
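One way to check parts B and C: the point estimate is the midpoint of the interval and the margin of error is half its width. The sketch below uses only the interval (0.72, 0.82) from the warm-up; the function name is made up for illustration.

```python
def summarize_interval(lower, upper):
    """Recover the point estimate and margin of error from interval endpoints."""
    point_estimate = (lower + upper) / 2
    margin_of_error = (upper - lower) / 2
    return point_estimate, margin_of_error


p_hat, moe = summarize_interval(0.72, 0.82)
print(round(p_hat, 2), round(moe, 2))

# The claim "more than 75%" is supported only if every plausible value
# in the interval is above 0.75, i.e. the lower endpoint exceeds 0.75.
claim = 0.75
claim_supported = 0.72 > claim
print(claim_supported)
```

Because the interval includes values at or below 0.75, the poll does not give convincing evidence for the newspaper's claim.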
+ Z-Interval for µ So, the formula for a confidence interval for a population mean when σ is known is x̄ ± z* · σ/√n. To be honest, σ is never known, so this formula isn’t used very much. The only homework problem regarding this formula is #55, a problem where you are asked to find the sample size for a desired margin of error for a CI for µ. Let’s practice that first.
+ Finding n for a desired MOE for a CI for μ High school students who take the SAT Math exam a second time generally score higher than on their first try. Past data suggest that the score increase has a standard deviation of 50 points. How large a sample of high school students would be needed to estimate the mean change in SAT score to within 2 points with 95% confidence?
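A sketch of the calculation, assuming we set the margin of error z* · σ/√n to be at most 2 and solve for n, which gives n ≥ (z* · σ / ME)². The function name is made up for illustration; σ = 50, ME = 2, and 95% come from the problem.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, c_level):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    # Always round UP: rounding down would give a margin of error that is too big.
    return ceil((z_star * sigma / margin) ** 2)


n_needed = sample_size_for_mean(sigma=50, margin=2, c_level=0.95)
print(n_needed)
```

Using the table value z* = 1.96 by hand gives (1.96 · 50 / 2)² = 49² = 2401, the same answer.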
Standard Error We don’t know σ. Therefore, we will estimate σ with s, the standard deviation of the sample. When the standard deviation of a statistic is estimated from the data, the result is called the standard error of the statistic. For the sample mean, the standard error is s/√n.
Finding the standard error of the mean (like #59 in tonight’s homework) Standard error of the mean is often abbreviated SEM. Suppose that an SRS of 20 people yields an average IQ of 107 with a standard deviation of ___. What is the SEM?
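The SEM is just s/√n. The sample standard deviation for this problem is not shown here, so the sketch below plugs in s = 15 purely as a hypothetical value; swap in the real number from the problem.

```python
from math import sqrt


def standard_error_of_mean(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / sqrt(n)


s_hypothetical = 15  # NOT from the slide; placeholder for the missing value
sem = standard_error_of_mean(s_hypothetical, n=20)
print(round(sem, 3))
```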
+ Degrees of freedom Unlike the standard Normal distribution, the shape of the t-distribution changes based on the sample size. t(k) stands for a t-distribution with k degrees of freedom. So, if our sample size is 30, and σ is unknown, the distribution is a t-distribution with 29 degrees of freedom. This would be written t(29).
+ The t-distribution What happens to the t-distribution as n increases? How many degrees of freedom are there if n=3? If n=10? What do I mean by n=∞ (normal)?
+ Characteristics of the t-distributions They are similar to the normal distribution. They are symmetric, bell-shaped, and are centered around 0. The t-distributions have more spread than a normal distribution. They have more area in the tails and less in the center than the normal distribution. That’s because using s to estimate σ introduces more variation. As the degrees of freedom increase, the t-distribution more closely resembles the normal curve. As n increases, s becomes a better estimator of σ.
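To put numbers on "more spread": a t-distribution with k degrees of freedom (k > 2) has standard deviation √(k/(k − 2)), compared with exactly 1 for the standard Normal. This fact is not on the slide; it is brought in here only as an illustration, and the function name is made up.

```python
from math import sqrt


def t_std_dev(df):
    """Standard deviation of a t-distribution with df degrees of freedom."""
    if df <= 2:
        raise ValueError("the standard deviation is only finite for df > 2")
    return sqrt(df / (df - 2))


# The spread shrinks toward 1 (the standard Normal value) as df grows.
for df in (3, 5, 10, 30, 100, 1000):
    print(df, round(t_std_dev(df), 4))
```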
+ Using Table B Now that we are studying a different distribution, we’ll use a different table. Table B gives the t-distribution critical values. Let’s look for the t* value for an upper tail area of 0.05 with n = 15. What is the t* value for 18 degrees of freedom with probability 0.90 to the left of t*? Suppose you want to construct a 95% confidence interval for the mean of a population based on an SRS of size n = 12. What critical value t* should you use?
+ Using Table B to Find Critical t* Values Suppose you want to construct a 95% confidence interval for the mean µ of a Normal population based on an SRS of size n = 12. What critical t* should you use? In Table B, we consult the row corresponding to df = n – 1 = 11. We move across that row to the entry that is directly above the 95% confidence level. The desired critical value is t* = 2.201. [Table B excerpt: rows are df (with a final z* row), columns are upper-tail probability p, and the bottom row gives the confidence level C.]
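If you want to check a Table B lookup without the table, the sketch below finds t* numerically using only the Python standard library. It is one possible approach, not how the table was built: it integrates the t density with Simpson's rule and then bisects for the value with the right area to its left. All function names are made up for illustration, and the `gamma` calls will overflow for very large df (a few hundred or more).

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical(c_level, df):
    """Critical value t* with central area c_level between -t* and t*."""
    target = 1 - (1 - c_level) / 2  # area to the left of t*
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 11), 3))  # 95% confidence, n = 12
```

An upper-tail area of 0.05 corresponds to `c_level=0.90`, so the same function covers lookups stated either way.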
+ One-Sample t Interval for a Population Mean The one-sample t interval for a population mean is similar in both reasoning and computational detail to the one-sample z interval for a population proportion. As before, we have to verify three important conditions before we estimate a population mean.
The One-Sample t Interval for a Population Mean: Choose an SRS of size n from a population having unknown mean µ. A level C confidence interval for µ is x̄ ± t* · s/√n, where t* is the critical value for the t distribution with n – 1 degrees of freedom. Use this interval only when: (1) the population distribution is Normal or the sample size is large (n ≥ 30), and (2) the population is at least 10 times as large as the sample.
Conditions for Inference about a Population Mean
Random: The data come from a random sample of size n from the population of interest or a randomized experiment.
Normal: The population has a Normal distribution or the sample size is large (n ≥ 30).
Independent: The method for calculating a confidence interval assumes that individual observations are independent. To keep the calculations reasonably accurate when we sample without replacement from a finite population, we should check the 10% condition: verify that the sample size is no more than 1/10 of the population size.
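Once the conditions are checked and t* has been read from Table B, the interval itself is a two-line calculation. The sketch below takes t* as an input rather than computing it; the function name and the summary statistics (x̄ = 50, s = 6, n = 12) are hypothetical, while t* = 2.201 is the Table B value for df = 11 at 95% confidence.

```python
from math import sqrt


def one_sample_t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical summary statistics (not from the slides).
# t_star = 2.201 is the critical value for df = 11 and 95% confidence.
low, high = one_sample_t_interval(x_bar=50.0, s=6.0, n=12, t_star=2.201)
print(f"({low:.3f}, {high:.3f})")
```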
+ Example: Video Screen Tension Read the Example on page 508.
Parameter: µ = the true mean tension of all the video terminals produced this day.
Conditions:
Random: We are told that the data come from a random sample of 20 screens from the population of all screens produced that day.
Normal: Since the sample size is small (n < 30), we must check whether it’s reasonable to believe that the population distribution is Normal. Examine the distribution of the sample data. These graphs give no reason to doubt the Normality of the population.
Independent: Because we are sampling without replacement, we must check the 10% condition: we must assume that at least 10(20) = 200 video terminals were produced this day.
+ Example: Video Screen Tension Example on page 508. We want to estimate the true mean tension µ of all the video terminals produced this day at a 90% confidence level.
DO: Using our calculator, we find that the mean and standard deviation of the 20 screens in the sample are x̄ = ___ and s = ___ (in mV). Since n = 20, we use the t distribution with df = 19 to find the critical value. From Table B, we find t* = 1.729. Therefore, the 90% confidence interval for µ is x̄ ± 1.729 · s/√20. [Table B excerpt: critical values by df and upper-tail probability p, with confidence level C along the bottom row.]
CONCLUDE: We are 90% confident that the interval from ___ to ___ mV captures the true mean tension in the entire batch of video terminals produced that day.
+ Using t Procedures Wisely The stated confidence level of a one-sample t interval for µ is exactly correct when the population distribution is exactly Normal. No population of real data is exactly Normal. The usefulness of the t procedures in practice therefore depends on how strongly they are affected by lack of Normality. Definition: An inference procedure is called robust if the probability calculations involved in the procedure remain fairly accurate when a condition for using the procedures is violated. Fortunately, the t procedures are quite robust against non-Normality of the population except when outliers or strong skewness are present. Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal.
+ Using t Procedures Wisely Except in the case of small samples, the condition that the data come from a random sample or randomized experiment is more important than the condition that the population distribution is Normal. Here are practical guidelines for the Normal condition when performing inference about a population mean.
Using One-Sample t Procedures: The Normal Condition
Sample size less than 15: Use t procedures if the data appear close to Normal (roughly symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t.
Sample size at least 15: The t procedures can be used except in the presence of outliers or strong skewness.
Large samples: The t procedures can be used even for clearly skewed distributions when the sample is large, roughly n ≥ 30.
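The three guidelines can be written as a small decision rule. This is only a sketch: the function and argument names are made up, and the yes/no inputs stand in for judgments you make by looking at a graph of the data. The guidelines do not say what to do about outliers in large samples, so the sketch assumes (as a simplification) that n ≥ 30 is enough.

```python
def t_procedures_reasonable(n, looks_roughly_normal, has_outliers, strongly_skewed):
    """Apply the Normal-condition guidelines for one-sample t procedures."""
    if n >= 30:
        # Large samples: OK even for clearly skewed data.
        # Assumption: outliers are not addressed by the guideline, so not checked here.
        return True
    if n >= 15:
        # Usable unless there are outliers or strong skewness.
        return not (has_outliers or strongly_skewed)
    # n < 15: data must appear close to Normal, with no outliers or clear skew.
    return looks_roughly_normal and not has_outliers and not strongly_skewed


# The video screen example: n = 20, graphs show no outliers or strong skewness.
print(t_procedures_reasonable(20, True, False, False))
```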
+ Confidence Intervals – puzzles for math nerds
Columns: Parameter | Statistic | Standard Deviation of the Statistic | Standard Error of the Statistic
Rows: Mean; Proportion