Section 10.1 ~ t Distribution for Inferences about a Mean Introduction to Probability and Statistics Ms. Young
Objective
After this section you will understand when it is appropriate to use the t distribution rather than the normal distribution for constructing confidence intervals or conducting hypothesis tests for population means, and you will know how to make proper use of the t distribution.
t Distribution for Inferences about a Mean
When dealing with confidence intervals (ch. 8) and hypothesis testing (ch. 9), we worked with samples that were large enough to assume a normal distribution, which allowed us to use standard scores (z-scores) to find the probabilities of certain values occurring.
Recall that in order to find the z-score, the population standard deviation is needed. In real applications, the population standard deviation is typically not available, so to find the confidence interval or conduct the hypothesis test we would estimate it using the sample standard deviation.
Many statisticians believe that this is not the best approach, and they use what is known as a t distribution (or Student's t distribution) in place of the normal distribution.
As long as the sample size is at least 30 or the population is normally distributed, a t distribution can be used to find a confidence interval and/or conduct a hypothesis test.
The t distribution is similar in shape and symmetry to the normal distribution, but it accounts for the greater variability that is expected with small samples.
Note ~ when you know the population standard deviation and the sample size is greater than 30 or the population is normally distributed, the normal distribution is best to use.
The following diagram compares the standard normal distribution with two t distributions, for sample sizes n = 3 and n = 12.
[Diagram: standard normal curve and t distribution curves for n = 3 and n = 12]
As you can see, the curves are very similar in shape, and as the sample size increases, the t distribution gets closer and closer to the normal distribution.
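The comparison in the diagram can be checked numerically. The sketch below is only an illustration: it assumes the two t curves use n - 1 degrees of freedom (2 and 11), and the helper names `normal_pdf` and `t_pdf` are made up for this example. It evaluates the standard density formulas at a few points so you can see the t curves sitting lower at the center and higher in the tails.

```python
import math

def normal_pdf(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Height of the t distribution curve with df degrees of freedom at x."""
    # lgamma (log of the gamma function) avoids overflow for large df.
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

# Assumption: sample sizes n = 3 and n = 12 correspond to df = 2 and df = 11.
for x in (0.0, 1.0, 2.0, 3.0):
    print(f"x = {x:.0f}: t(df=2) = {t_pdf(x, 2):.4f}, "
          f"t(df=11) = {t_pdf(x, 11):.4f}, normal = {normal_pdf(x):.4f}")
```

At x = 0 the heights increase from the df = 2 curve to the normal curve, while at x = 3 the order reverses, which is the "greater variability" of small samples showing up as heavier tails.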
Confidence Intervals Using the t Distribution
When determining a confidence interval using a t distribution, we use t values rather than z-scores to determine significance.
A t value is a number that represents the number of standard deviations a value falls from the mean on a t distribution.
Recall that to write a confidence interval, you must first calculate the margin of error. The formula for the margin of error using a t distribution is:
$E = t \cdot \dfrac{s}{\sqrt{n}}$
where
t = t value, found by looking up the value that corresponds to the appropriate number of degrees of freedom (Table 10.1 on p. 412)
n = sample size
s = standard deviation of the sample
Critical Values of t
[Table 10.1: critical values of t, listed by degrees of freedom]
Use column 2 for a 97.5% confidence level (0.025 significance level) for a one-tailed test.
Use column 3 for a 95% confidence level (0.05 significance level) for a one-tailed test.
Use column 2 for a 95% confidence level (0.05 significance level) for a two-tailed test (or confidence interval).
Use column 3 for a 90% confidence level (0.10 significance level) for a two-tailed test (or confidence interval).
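If the printed table is not at hand, the two columns can be approximated numerically. The following is a rough sketch, not the method used to build the textbook table: it integrates the t density with Simpson's rule and searches for the t value by bisection. The function names and the step counts are choices made for this illustration.

```python
import math

def t_pdf(x, df):
    """Height of the t distribution curve with df degrees of freedom at x."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_area(x, df, steps=2000):
    """Area under the t curve to the right of x (x >= 0), via Simpson's rule."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the total area lies to the right of 0; remove the part up to x.
    return 0.5 - total * h / 3

def t_critical(one_tail_area, df):
    """Positive t value that leaves one_tail_area in the right tail."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection: halve the search interval each pass
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > one_tail_area:
            lo = mid  # tail still too large, so the t value must be bigger
        else:
            hi = mid
    return (lo + hi) / 2

# Column 2 uses 0.025 in one tail (0.05 in two tails); column 3 uses 0.05 in one tail.
for df in (4, 9):
    print(df, round(t_critical(0.025, df), 3), round(t_critical(0.05, df), 3))
```

The results should agree with the table values quoted in this section (2.776 for 4 degrees of freedom, and 2.262 and 1.833 for 9 degrees of freedom).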
Recall that the standard form for a confidence interval when dealing with means is:
$\bar{x} - E < \mu < \bar{x} + E$
Example 1 ~ Diastolic Blood Pressure
Here are five measures of diastolic blood pressure from randomly selected adult men: 78, 54, 81, 68, 66. These five values result in these sample statistics: n = 5, $\bar{x} = 69.4$, and s = 10.7. Using this sample, construct the 95% confidence interval estimate of the mean diastolic blood pressure level for the population of all men.
Note ~ we are using the t distribution because the population standard deviation is not known and it is reasonable to assume that blood pressure levels are normally distributed.
Before finding the margin of error, we must first find the t value from the table that corresponds to 4 degrees of freedom (since the sample size was 5, the degrees of freedom is 5 - 1, or 4). For the 95% confidence level, 4 degrees of freedom corresponds to a t value of t = 2.776.
Note ~ for confidence intervals, we use the t values for the "area in two tails" because the margin of error extends both below and above the sample mean.
Example 1 Cont’d…
Now that we know that t = 2.776, we can find the margin of error:
$E = t \cdot \dfrac{s}{\sqrt{n}} = 2.776 \cdot \dfrac{10.7}{\sqrt{5}} \approx 13.3$
To construct the confidence interval, subtract the margin of error from the sample mean ($\bar{x} = 69.4$) and add it to the sample mean:
$69.4 - 13.3 < \mu < 69.4 + 13.3$, or $56.1 < \mu < 82.7$
Based on the five sample measurements, we can be 95% confident that the true mean diastolic blood pressure for adult men is between 56.1 and 82.7.
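The arithmetic in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `t_confidence_interval` is hypothetical, and the t value 2.776 is simply taken from the table as in the example rather than computed.

```python
import math
import statistics

def t_confidence_interval(data, t_value):
    """Return (lower, upper) limits: sample mean minus/plus t * s / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_value * s / math.sqrt(n)
    return mean - margin, mean + margin

# Five diastolic blood pressure readings from Example 1.
pressures = [78, 54, 81, 68, 66]
low, high = t_confidence_interval(pressures, t_value=2.776)  # 4 df, 95% level
print(f"{low:.1f} to {high:.1f}")  # 56.1 to 82.7
```

Here the unrounded standard deviation is used, so the margin of error differs slightly from the hand calculation with s = 10.7, but the interval rounds to the same limits.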
Hypothesis Tests Using the t Distribution
When a t distribution is used to conduct a hypothesis test, the t value plays the role that the z-score played when we worked with the normal distribution.
Recall that we determined statistical significance by comparing the z-score to critical values or by using the z-score to determine the P-value.
Use the following formula to calculate the t value:
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
where $\bar{x}$ is the sample mean, s is the standard deviation of the sample, n is the sample size, and $\mu$ is the population mean claimed by the null hypothesis.
This t value is then compared to the "Critical Values of t" table to determine significance.
Note ~ a P-value can be calculated, but this is usually done with the aid of statistical software, so we will not be calculating P-values using a t distribution in this course.
Once you calculate t, you can decide whether or not to reject the null hypothesis by using the following criteria:
Right-tailed test: reject the null hypothesis if the t value that you found is ≥ the t value from the table (for the appropriate degrees of freedom). Use column 2 as a comparison if you want a 97.5% confidence level and column 3 if you want a 95% confidence level.
Left-tailed test: reject the null hypothesis if the t value that you found is ≤ the negative of the t value from the table (for the appropriate degrees of freedom). Use column 2 as a comparison if you want a 97.5% confidence level and column 3 if you want a 95% confidence level.
Two-tailed test: reject the null hypothesis if the absolute value of the t value that you found is ≥ the t value from the table (for the appropriate degrees of freedom). Use column 2 as a comparison if you want a 95% confidence level and column 3 if you want a 90% confidence level.
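The three criteria above can be written as one small function. This is a sketch for illustration only: the name `reject_null` and the string labels for the tail type are invented here, and the t values passed in below are hypothetical, chosen just to exercise each branch.

```python
def reject_null(t_value, table_value, tail):
    """Apply the rejection criteria for a t test.

    t_value     -- the t value calculated from the sample
    table_value -- the positive t value read from the table
    tail        -- "right", "left", or "two"
    """
    if tail == "right":
        return t_value >= table_value
    if tail == "left":
        return t_value <= -table_value
    if tail == "two":
        return abs(t_value) >= table_value
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Hypothetical t values, compared with table values for 9 degrees of freedom.
print(reject_null(2.10, 1.833, "right"))  # True: 2.10 >= 1.833
print(reject_null(-1.20, 1.833, "left"))  # False: -1.20 is not <= -1.833
print(reject_null(-2.50, 2.262, "two"))   # True: |-2.50| >= 2.262
```

Note that the caller is still responsible for picking the table value from the correct column (column 3 for a one-tailed test at the 0.05 level, column 2 for a two-tailed test at the 0.05 level).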
Example 2 ~ Right-Tailed Hypothesis Test for a Mean
Listed below are ten randomly selected IQ scores of statistics students: [IQ scores not shown]
Using methods from Chapter 4, you can confirm that these data have the following sample statistics: n = 10, $\bar{x} = 109.5$, and s = 5.2. Using a 0.05 significance level, test the claim that statistics students have a mean IQ score greater than 100, which is the mean IQ score of the general population.
Step 1: $H_0: \mu = 100$ (null hypothesis), $H_a: \mu > 100$ (alternative hypothesis)
Step 2: Sample size: n = 10. Sample mean: $\bar{x} = 109.5$. Standard deviation of the sample: s = 5.2
Step 3: The t value is
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{109.5 - 100}{5.2 / \sqrt{10}} \approx 5.777$
Since this is a one-tailed test, the t value that we compare against is found in the 3rd column of the table, in the row for 9 degrees of freedom (10 - 1); it is 1.833. Since this is a right-tailed test, the result is statistically significant if the t value that we found is greater than or equal to 1.833. Our t value of 5.777 is greater than 1.833, so this is statistically significant at the 0.05 level.
Step 4: Since this is statistically significant at the 0.05 level, we have enough evidence to reject the null hypothesis and support the claim that the mean IQ score of statistics students is greater than 100.
Example 3 ~ Two-Tailed Hypothesis Test for a Mean
Using the same data from Example 2 and the same significance level of 0.05, test the claim that the mean IQ score is equal to 100.
Step 1: $H_0: \mu = 100$ (null hypothesis), $H_a: \mu \neq 100$ (alternative hypothesis)
Step 2: Sample size: n = 10. Sample mean: $\bar{x} = 109.5$. Standard deviation of the sample: s = 5.2. The t value is the same as in Example 2: $t = \dfrac{109.5 - 100}{5.2 / \sqrt{10}} \approx 5.777$
Step 3: Since this is a two-tailed test, we look at column 2 for a 0.05 significance level. The degrees of freedom is 9, so the t value in the table is 2.262. Because this is a two-tailed test, the result is statistically significant at the 0.05 level if the absolute value of our t value (5.777) is greater than or equal to 2.262.
Step 4: Since the absolute value of the t value that we found (5.777) is greater than 2.262, this is statistically significant at the 0.05 level, and we therefore reject the null hypothesis that the mean IQ score is equal to 100. In other words, there is sufficient evidence to support the alternative hypothesis that the mean IQ score is not equal to 100.
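Examples 2 and 3 can be checked together, since they share the same sample statistics. The sketch below makes one assumption that should be stated plainly: the sample mean of 109.5 is inferred from the reported t value of 5.777 together with n = 10, s = 5.2, and the null mean of 100. The function name `t_statistic` is hypothetical.

```python
import math

def t_statistic(sample_mean, null_mean, s, n):
    """t value: distance of the sample mean from the null mean, in units of s / sqrt(n)."""
    return (sample_mean - null_mean) / (s / math.sqrt(n))

# Assumption: sample mean 109.5, inferred from the reported t value of 5.777.
t = t_statistic(sample_mean=109.5, null_mean=100, s=5.2, n=10)
print(round(t, 3))  # 5.777

# Example 2 (right-tailed, 0.05 level): column 3, 9 degrees of freedom.
example2_significant = t >= 1.833
# Example 3 (two-tailed, 0.05 level): column 2, 9 degrees of freedom.
example3_significant = abs(t) >= 2.262
print(example2_significant, example3_significant)  # True True
```

Both comparisons lead to rejecting the null hypothesis, matching Step 4 of each example.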