Section 7.4 Estimating a Population Mean: σ Not Known
Learning Targets: In this section we present methods for estimating a population mean when the population standard deviation σ is not known. With σ unknown, we use the Student t distribution, assuming that the relevant requirements are satisfied.
Student t Distribution If the distribution of a population is essentially normal, then the distribution of t = (x̄ – μ) / (s / √n) is a Student t distribution for all samples of size n. It is usually referred to as the t distribution and is used to find critical values denoted by tα/2 (t*).
The number of degrees of freedom for a collection of sample data is the number of sample values that can vary after certain restrictions have been imposed on all data values. The number of degrees of freedom is often abbreviated df. In this section, degrees of freedom = n – 1. Other statistical procedures may have different formulas for degrees of freedom.
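The idea that only n – 1 values can vary once a restriction is imposed can be illustrated with a minimal sketch. It assumes the restriction is a fixed sample mean; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def last_value_given_mean(free_values, target_mean):
    """Return the single value that is forced once the mean of all n values is fixed."""
    n = len(free_values) + 1              # the free values plus the one forced value
    return n * target_mean - sum(free_values)


# Hypothetical data: 3 of the n = 4 values are chosen freely, and the mean must be 6.0.
free = [4.0, 7.0, 10.0]
forced = last_value_given_mean(free, target_mean=6.0)

print("forced 4th value:", forced)        # this value has no freedom left
print("degrees of freedom:", len(free))   # n - 1
```

Whatever three values are chosen first, the fourth is determined by the restriction, which is why df = n – 1 here.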
Important Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see the following slide, for the cases n = 3 and n = 12).
2. The Student t distribution has the same general symmetric bell shape as the standard normal distribution, but it reflects the greater variability (with wider distributions) that is expected with small samples.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.
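Properties 4 and 5 can be checked numerically. The sketch below relies on a standard result that the slides do not state: for df > 2, the standard deviation of the t distribution is √(df / (df – 2)). The function name is hypothetical. For df ≤ 2 (for example n = 3) the standard deviation is not finite, so the function rejects those cases.

```python
import math


def t_standard_deviation(n):
    """Standard deviation of the t distribution with df = n - 1 (finite only for df > 2)."""
    df = n - 1
    if df <= 2:
        raise ValueError("the standard deviation is finite only when df > 2")
    return math.sqrt(df / (df - 2))


# The value stays above 1 and shrinks toward 1 as the sample size grows.
for n in (4, 12, 30, 1000):
    print(n, round(t_standard_deviation(n), 4))
```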
Student t Distributions for n = 3 and n = 12 Student t distributions have the same general shape and symmetry as the standard normal distribution, but they reflect the greater variability that is expected with small samples.
Choosing the Appropriate Distribution
SUMMARY How do we know when to use zα/2 or tα/2 (z* or t*)?
* If you are working with a categorical variable (estimating a population proportion, p), always use zα/2 (z*).
* If you are working with a quantitative variable (estimating a population mean, µ) and you DO know σ, use zα/2 (z*).
* If you are working with a quantitative variable (estimating a population mean, µ) and you DO NOT know σ, use tα/2 (t*).
**Remember that the population distribution must be normal or n must be large for quantitative variables.**
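The summary can be written as a small decision function. This is only a sketch: the function and parameter names are hypothetical, and the cutoff n > 30 for a "large" sample is an assumed rule of thumb, since the slides say only that n must be large.

```python
def choose_critical_value(variable, sigma_known=False, population_normal=False,
                          n=0, large_n=30):
    """Return "z", "t", or "neither" following the summary rules.

    large_n = 30 is an assumed cutoff for a "large" sample, not a value from the slides.
    """
    if variable == "categorical":
        return "z"                         # proportions always use z*
    if variable != "quantitative":
        raise ValueError("variable must be 'categorical' or 'quantitative'")
    if not (population_normal or n > large_n):
        return "neither"                   # requirement for a mean is not met
    return "z" if sigma_known else "t"


# Hypothetical cases, for illustration only.
print(choose_critical_value("categorical"))
print(choose_critical_value("quantitative", sigma_known=True, population_normal=True, n=15))
print(choose_critical_value("quantitative", sigma_known=False, n=40))
print(choose_critical_value("quantitative", sigma_known=False, n=12))
```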
Example 1: Assume we want to construct a confidence interval using the given confidence level. Do one of the following, as appropriate: (i) find the critical value zα/2 (z*), (ii) find the critical value tα/2 (t*), (iii) state that neither the normal nor the t distribution applies.
a) 95%; n = 34; σ is unknown; population appears to be normally distributed.
b) 90%; n = 72; σ is known; population appears to be normally distributed.
c) 98%; n = 22; σ is unknown; population appears to be very skewed.
d) 94%; n = 50; σ is unknown; population appears to be skewed.
e) 95%; n = 200; σ = 24.5; population appears to be skewed.
f) 99.5%; n = 20; σ is unknown; population appears to be normally distributed.
Margin of Error E for Estimate of μ (With σ Not Known) E = tα/2 · s / √n, where tα/2 has n – 1 degrees of freedom. NOTE: tα/2 = t*
Notation
μ = population mean
x̄ = sample mean
s = sample standard deviation
n = number of sample values
E = margin of error
tα/2 = t* = critical t value separating an area of α/2 in the right tail of the t distribution
Smaller samples will have means that are likely to vary more. The greater variation is accounted for by the t distribution.
Confidence Interval for the Estimate of μ (With σ Not Known) x̄ – E < μ < x̄ + E, where E = tα/2 · s / √n and df = n – 1
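Critical Values for a Population Mean μ (With σ Not Known) When finding a critical value, use the following calculator command: invT(area to the left, df). For a confidence level of 1 – α, the area to the left of tα/2 is 1 – α/2; for example, 95% confidence gives an area to the left of 0.975.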
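For readers without the calculator, the same critical value can be approximated in code. The sketch below is a standard-library stand-in for invT, not the calculator's own algorithm: it integrates the t density with Simpson's rule and solves for the critical value by bisection. All function names and the sample numbers are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 0.5 + half_area if x >= 0 else 0.5 - half_area


def inv_t(area_left, df):
    """Approximate invT(area to the left, df) by bisection on t_cdf."""
    if not 0 < area_left < 1:
        raise ValueError("area_left must be strictly between 0 and 1")
    if area_left < 0.5:
        return -inv_t(1 - area_left, df)   # use symmetry for the left tail
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < area_left:       # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(confidence, n, xbar, s):
    """Return (t*, E, (lower, upper)) with E = t* * s / sqrt(n) and df = n - 1."""
    alpha = 1 - confidence
    t_star = inv_t(1 - alpha / 2, n - 1)
    margin = t_star * s / math.sqrt(n)
    return t_star, margin, (xbar - margin, xbar + margin)


# Hypothetical sample, for illustration only: n = 12, x-bar = 50.0, s = 6.0, 95% confidence.
t_star, margin, (lower, upper) = t_interval(0.95, 12, 50.0, 6.0)
print(round(t_star, 3), round(margin, 3), round(lower, 2), round(upper, 2))
```

Bisection is slower than a dedicated library routine but keeps the sketch dependency-free. A statistics package with a t quantile function would normally be the better choice.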
Example 2: Use the given confidence level and sample data to find (i) the margin of error and (ii) the confidence interval for the population mean µ. a) 91% confidence; n = 212, x̄ = $4002 and s = $284. b) 93% confidence; n = 4, x̄ = 0.54 and s = 0.09.
Example 3: A common claim is that garlic lowers cholesterol levels. In a test of the effectiveness of garlic, 49 subjects were treated with doses of raw garlic, and their cholesterol levels were measured before and after the treatment. The changes in their levels of LDL cholesterol (in mg/dL) have a mean of 0.4 and a standard deviation of 21.0. a) What is the best point estimate of the population mean net change in LDL cholesterol after the garlic treatment? b) Construct a 95% confidence interval estimate of the mean net change in LDL cholesterol after the garlic treatment.
Example 4: Data Set 2 in Appendix B includes 106 body temperatures for which x̄ = 98.20°F and s = 0.62°F. a) What is the best point estimate of the mean body temperature of all healthy humans? b) Using the sample statistics, construct a 99% confidence interval estimate of the mean body temperature of all healthy humans.
Example 5: In a study designed to test the effectiveness of Echinacea for treating upper respiratory tract infections in children, 337 children were treated with Echinacea and 370 other children were given a placebo. The numbers of days of peak severity of symptoms for the Echinacea treatment group had a mean of 6.0 days and a standard deviation of 2.3 days. The numbers of days of peak severity of symptoms for the placebo group had a mean of 6.1 days and a standard deviation of 2.4 days (based on data from “Efficacy and Safety of Echinacea in Treating Upper Respiratory Tract Infections in Children,” by Taylor et al., Journal of the American Medical Association, Vol. 290, No. 21). a) Construct the 95% confidence interval for the mean number of days of peak severity of symptoms for those who receive Echinacea treatment. b) Construct the 95% confidence interval for the mean number of days of peak severity of symptoms for those who are given a placebo.
Example 6: In a study designed to test the effectiveness of magnets for treating back pain, 20 patients were given a treatment with magnets and also a sham treatment without magnets. Pain was measured using a standard Visual Analog Scale (VAS). After being given the magnet treatments, the 20 patients had VAS scores with a mean of 5.0 and a standard deviation of 2.4. After being given the sham treatments, the 20 patients had VAS scores with a mean of 4.7 and a standard deviation of 2.9. a) Construct the 95% confidence interval estimate of the mean VAS score for patients given the magnet treatment. b) Construct the 95% confidence interval estimate of the mean VAS score for patients given the sham treatment.
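Finding the Point Estimate and E from a Confidence Interval Point estimate of μ: x̄ = (upper confidence limit + lower confidence limit) / 2. Margin of error: E = (upper confidence limit – lower confidence limit) / 2.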
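Because a t-based interval is centered on the sample mean, the point estimate is the midpoint of the two confidence limits and the margin of error is half the interval's width. A minimal sketch with a hypothetical function name and hypothetical limits:

```python
def point_estimate_and_margin(lower, upper):
    """Recover (x-bar, E) from the limits of an interval x-bar - E < mu < x-bar + E."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    xbar = (upper + lower) / 2     # midpoint of the interval
    margin = (upper - lower) / 2   # half the width of the interval
    return xbar, margin


# Hypothetical limits, for illustration only.
xbar, margin = point_estimate_and_margin(46.0, 54.0)
print(xbar, margin)
```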
Example 7: Use the given data and corresponding display to express the confidence interval in the format of x̄ – E < μ < x̄ + E. 94% confidence; n = 74, x̄ = 84.72, s = 5.46.