Published by Leslie McDowell. Modified over 9 years ago.
1
Copyright © Cengage Learning. All rights reserved. 9 Inferences Involving One Population
2
9.1 Inferences about the Mean $\mu$ ($\sigma$ Unknown)
3
3 Floor to Door
4
4 Think about how long it takes you to get ready in the morning—that is, from the time your feet hit the floor until you are going out the door, having showered, groomed, eaten breakfast, and fully dressed. Some will say they get ready in as little as 5 minutes, but when timed, it is hard to do everything in less than 15 minutes, and even in that case, only if your routine is very well orchestrated.
5
5 Floor to Door Here is one morning routine: up at 6:55, in shower by 7:05, out of shower by 7:15, makeup on and dressed by 7:30, bookbag packed and breakfast grabbed by 7:45, out the door by 7:46, and at class by 8:00. That is a total of 51 minutes “floor to door.” If you were given the job of estimating the “floor-to-door” time for the typical college woman, what information would you need and how would you use it to determine the estimate? Inferences about the population mean are based on the sample mean and information obtained from the sampling distribution of sample means.
6
6 Floor to Door We know that the sampling distribution of sample means has a mean $\mu$ and a standard error of $\sigma/\sqrt{n}$ for all samples of size n, and that it is normally distributed when the sampled population has a normal distribution, or approximately normally distributed when the sample size is sufficiently large. This means the test statistic $z^{\star} = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ has a standard normal distribution. However, when $\sigma$ is unknown, the standard error $\sigma/\sqrt{n}$ is also unknown.
7
7 Floor to Door Therefore, the sample standard deviation s will be used as the point estimate for $\sigma$. As a result, an estimated standard error of the mean, $s/\sqrt{n}$, will be used and our test statistic will become $(\bar{x} - \mu)/(s/\sqrt{n})$. When a known $\sigma$ is being used to make an inference about the mean $\mu$, a sample provides one value for use in the formulas; that one value is $\bar{x}$. When the sample standard deviation s is also used, the sample provides two values: the sample mean $\bar{x}$ and the estimated standard error $s/\sqrt{n}$.
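As a rough illustration of the estimated standard error and the statistic built on it, the sketch below uses a made-up sample. The function names and the numbers are assumptions chosen for easy arithmetic; they do not come from the slides.

```python
import math

def estimated_standard_error(s, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def t_statistic(x_bar, mu, s, n):
    """Number of estimated standard errors that x_bar lies from mu."""
    return (x_bar - mu) / estimated_standard_error(s, n)

# Hypothetical sample: n = 25 times with mean 51.0 min and s = 10.0 min,
# compared against a hypothetical population mean of 48.0 min.
se = estimated_standard_error(10.0, 25)
t = t_statistic(51.0, 48.0, 10.0, 25)
print(se, t)
```

Both values, the sample mean and the estimated standard error, come from the sample itself, which is the point the slide makes about using s in place of a known standard deviation.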
8
8 Floor to Door As a result, the z-statistic will be replaced with a statistic that accounts for the use of an estimated standard error. This new statistic is known as Student’s t-statistic. In 1908, W. S. Gosset, an Irish brewery employee, published a paper about this t-distribution under the pseudonym “Student.” In deriving the t-distribution, Gosset assumed that the samples were taken from normal populations. Although this might seem to be restrictive, satisfactory results are obtained when large samples are selected from many nonnormal populations.
9
9 Floor to Door Figure 9.1 presents a diagrammatic organization for the inferences about the population mean. [Figure 9.1: Do I Use the z-Statistic or the t-Statistic?]
10
10 Floor to Door Two situations exist: $\sigma$ is known, or $\sigma$ is unknown. As stated before, $\sigma$ is almost never a known quantity in real-world problems; therefore, the standard error will almost always be estimated by $s/\sqrt{n}$. The use of an estimated standard error of the mean requires the use of the t-distribution. Almost all real-world inferences about the population mean will be made with Student’s t-statistic.
11
11 Floor to Door Notes to Figure 9.1: 1. Is n large? Samples as small as n = 15 or 20 may be considered large enough for the central limit theorem to hold if the sample data are unimodal, nearly symmetrical, short-tailed, and without outliers. Samples that are not symmetrical require larger sample sizes, with 50 sufficing except for extremely skewed samples. 2. Requires the use of a nonparametric technique.
12
12 Floor to Door The t-distribution has the following properties (see also Figure 9.2): [Figure 9.2: Student’s t-Distributions]
13
13 Floor to Door Properties of the t-distribution (df > 2) 1. t is distributed with a mean of zero. 2. t is distributed symmetrically about its mean. 3. t is distributed so as to form a family of distributions, a separate distribution for each different number of degrees of freedom (df $\geq$ 1). 4. The t-distribution approaches the standard normal distribution as the number of degrees of freedom increases.
14
14 Floor to Door 5. t is distributed with a variance greater than 1, but as the degrees of freedom increase, the variance approaches 1. 6. t is distributed so as to be less peaked at the mean and thicker at the tails than is the normal distribution. Degrees of freedom, df: A parameter that identifies each different distribution of Student’s t-distribution. For the methods presented in this chapter, the value of df will be the sample size minus 1: df = n – 1.
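The properties listed above can be checked numerically. The sketch below is a minimal illustration. It relies on two standard results that the slides do not state: the density formula for Student's t and the variance df/(df – 2) for df > 2. The function names are illustrative.

```python
import math

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def z_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def t_variance(df):
    """Variance of Student's t; only defined for df > 2."""
    if df <= 2:
        raise ValueError("variance requires df > 2")
    return df / (df - 2)

# Columns: df, variance, height at the mean, height out in the tail (t = 3).
for df in (3, 10, 30, 100):
    print(df, round(t_variance(df), 3), round(t_pdf(0, df), 4), round(t_pdf(3, df), 5))
print("z", 1.0, round(z_pdf(0), 4), round(z_pdf(3), 5))
```

Reading down the printed columns, the variance shrinks toward 1, the peak rises toward the normal peak, and the tail thins toward the normal tail as df grows.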
15
15 Floor to Door The number of degrees of freedom associated with $s^2$ is the divisor (n – 1) used to calculate the sample variance $s^2$; that is, df = n – 1. The sample variance is the mean of the squared deviations. The number of degrees of freedom is the “number of unrelated deviations” available for use in estimating $\sigma^2$. We know that the sum of the deviations, $\sum (x - \bar{x})$, must be zero. From a sample of size n, only the first n – 1 of these deviations have freedom of value. That is, the last, or nth, value of $(x - \bar{x})$ must make the sum of the n deviations total exactly zero.
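A small numerical check may make the "freedom of value" argument concrete. The five times below are hypothetical data invented for this sketch. It shows that the first n – 1 deviations fix the last one, and that the sample variance divides by n – 1.

```python
data = [51, 38, 45, 62, 44]   # hypothetical floor-to-door times, in minutes
n = len(data)
x_bar = sum(data) / n
deviations = [x - x_bar for x in data]

# The n deviations must sum to zero, so the last one is determined
# by the first n - 1 of them.
last_from_others = -sum(deviations[:-1])

# Sample variance: sum of squared deviations divided by df = n - 1.
sample_variance = sum(d * d for d in deviations) / (n - 1)

print(deviations, last_from_others, sample_variance)
```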
16
16 Floor to Door As a result, variance is said to average n – 1 unrelated squared deviation values, and this number, n – 1, was named “degrees of freedom.” Although there is a separate t-distribution for each number of degrees of freedom (df = 1, df = 2, ..., df = 20, ..., df = 40, and so on), only certain key critical values of t will be necessary for our work. Consequently, the table for Student’s t-distribution (Table 6 in Appendix B) is a table of critical values rather than a complete table like Table 3, which covers the standard normal distribution for z.
17
17 Floor to Door As you look at Table 6, you will note that the left side of the table is identified by “df,” degrees of freedom. This left-hand column starts at 3 at the top and lists consecutive df values to 30, then jumps to 35, ..., to “df > 100” at the bottom. As we stated, as the degrees of freedom increase, the t-distribution approaches the characteristics of the standard normal z-distribution. Once df is “greater than 100,” the critical values of the t-distribution are taken to be the same as the corresponding critical values of the standard normal distribution as given in Table 4A in Appendix B.
18
18 Using the t-Distribution Table
19
19 Using the t-Distribution Table The critical values of Student’s t-distribution that are to be used both for constructing a confidence interval and for hypothesis testing will be obtained from Table 6 in Appendix B. To find the value of t, you will need to know two identifying values: (1) df, the number of degrees of freedom (identifying the distribution of interest), and (2) $\alpha$, the area under the curve to the right of the right-hand critical value.
20
20 Using the t-Distribution Table A notation much like that used with z will be used to identify a critical value. t(df, $\alpha$), read as “t of df, $\alpha$,” is the symbol for the value of t with df degrees of freedom and an area of $\alpha$ in the right-hand tail, as shown in Figure 9.3. [Figure 9.3: t-Distribution Showing t(df, $\alpha$)]
21
21 Using the t-Distribution Table For the values of t on the left side of the mean, we can use one of two notations. The t-value shown in Figure 9.4 could be named t(df, 0.95), because the area to the right of it is 0.95, or it could be identified by –t(df, 0.05), because the t-distribution is symmetrical about its mean, zero. [Figure 9.4: t-Value on Left Side]
22
22 Example 3 – t-Values that Bound a Middle Percentage Find the values of the t-distribution that bound the middle 0.90 of the area under the curve for the distribution with df = 17. Solution: The middle 0.90 leaves 0.05 for the area of each tail. The value of t that bounds the right-hand tail is t (17, 0.05) = 1.74, as found in Table 6.
23
23 Example 3 – Solution (cont’d) The value that bounds the left-hand tail is –1.74 because the t-distribution is symmetrical about its mean, zero.
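When Table 6 is not at hand, a critical value t(df, α) can be approximated numerically. The sketch below is one possible approach, not the textbook's method. It integrates the t density with Simpson's rule and then bisects on the right-tail area. The names `t_pdf`, `right_tail_area`, and `t_critical` are illustrative.

```python
import math

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def right_tail_area(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    middle = total * h / 3            # area between 0 and |t|
    return 0.5 - middle if t >= 0 else 0.5 + middle

def t_critical(df, alpha):
    """t(df, alpha): the t-value with area alpha to its right."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if alpha > 0.5:                   # left of the mean: t(df, alpha) = -t(df, 1 - alpha)
        return -t_critical(df, 1 - alpha)
    lo, hi = 0.0, 1.0
    while right_tail_area(hi, df) > alpha:   # widen until the root is bracketed
        lo, hi = hi, 2 * hi
    for _ in range(60):               # bisection on the tail area
        mid = (lo + hi) / 2
        if right_tail_area(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(17, 0.05), 2))   # right-hand bound in Example 3
print(round(t_critical(17, 0.95), 2))   # left-hand bound, by symmetry
```

Rounded to two decimals, the two printed values should agree with the ±1.74 found from Table 6 in Example 3.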
24
24 Confidence Interval Procedure
25
25 Confidence Interval Procedure We are now ready to make inferences about the population mean $\mu$ using the sample standard deviation. As we mentioned earlier, use of the t-distribution has a condition. The assumption for inferences about the mean $\mu$ when $\sigma$ is unknown: The sampled population is normally distributed. The procedure to make confidence intervals using the sample standard deviation is very similar to that used when $\sigma$ is known. The difference is the use of Student’s t in place of the standard normal z and the use of s, the sample standard deviation, as an estimate of $\sigma$.
26
26 Confidence Interval Procedure The central limit theorem (CLT) implies that this technique can also be applied to nonnormal populations when the sample size is sufficiently large.
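The confidence interval formula (9.1) cited in Example 4 does not appear in this transcript. Judging from the way Example 4 computes the maximum error E and the confidence limits, it presumably takes the following form (a reconstruction, not a quotation):

```latex
\bar{x} - t(\mathrm{df}, \alpha/2)\,\frac{s}{\sqrt{n}}
\quad\text{to}\quad
\bar{x} + t(\mathrm{df}, \alpha/2)\,\frac{s}{\sqrt{n}},
\qquad \mathrm{df} = n - 1
```

Here the product $t(\mathrm{df}, \alpha/2)\,(s/\sqrt{n})$ is the maximum error of estimate E.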
27
27 Example 4 – Confidence Interval for $\mu$ With $\sigma$ Unknown A random sample of 20 weights is taken from babies born at Northside Hospital. A mean of 6.87 lb and a standard deviation of 1.76 lb were found for the sample. Estimate, with 95% confidence, the mean weight of all babies born in this hospital. Based on past information, it is assumed that weights of newborns are normally distributed. Solution: Step 1 The Set-Up: Describe the population parameter of interest. $\mu$, the mean weight of newborns at Northside Hospital
28
28 Example 4 – Solution (cont’d) Step 2 The Confidence Interval Criteria: a. Check the assumptions. Past information indicates that the sampled population is normal. b. Identify the probability distribution and the formula to be used. The value of the population standard deviation, $\sigma$, is unknown. Student’s t-distribution will be used with formula (9.1). c. State the level of confidence: $1 - \alpha = 0.95$.
29
29 Example 4 – Solution (cont’d) Step 3 The Sample Evidence: Collect the sample information: n = 20, $\bar{x}$ = 6.87, and s = 1.76. Step 4 The Confidence Interval: a. Determine the confidence coefficients. Since $1 - \alpha = 0.95$, $\alpha = 0.05$, and therefore $\alpha/2 = 0.025$. Also, since n = 20, df = 19.
30
30 Example 4 – Solution (cont’d) At the intersection of row df = 19 and the one-tailed column $\alpha = 0.025$ in Table 6, we find $t(\mathrm{df}, \alpha/2) = t(19, 0.025) = 2.09$. See the figure.
31
31 Example 4 – Solution (cont’d) b. Find the maximum error of estimate. $E = t(\mathrm{df}, \alpha/2)\,(s/\sqrt{n}) = t(19, 0.025)\,(1.76/\sqrt{20}) = (2.09)(0.394) = 0.82$
32
32 Example 4 – Solution (cont’d) c. Find the lower and upper confidence limits. $\bar{x} - E$ to $\bar{x} + E$; 6.87 – 0.82 to 6.87 + 0.82; 6.05 to 7.69. Step 5 The Results: State the confidence interval. 6.05 to 7.69 is the 95% confidence interval for $\mu$. That is, with 95% confidence we estimate the mean weight of babies born at Northside Hospital to be between 6.05 and 7.69 lb.
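The arithmetic of Example 4 can be retraced in a few lines. This sketch takes the critical value 2.09 as given from Table 6 rather than computing it; the variable names are illustrative.

```python
import math

n, x_bar, s = 20, 6.87, 1.76      # sample evidence from Example 4
t_crit = 2.09                     # t(19, 0.025), read from Table 6

standard_error = s / math.sqrt(n)     # estimated standard error of the mean
E = t_crit * standard_error           # maximum error of estimate
lower, upper = x_bar - E, x_bar + E   # confidence limits

print(f"E = {E:.2f}; interval {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, the result matches the 6.05 to 7.69 interval stated in Step 5.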
33
33 Hypothesis-Testing Procedure
34
34 Hypothesis-Testing Procedure In hypothesis-testing situations, we use formula (9.2) to calculate the value of the test statistic $t^{\star}$: $t^{\star} = (\bar{x} - \mu)/(s/\sqrt{n})$, with df = n – 1 (9.2). The calculated t is the number of estimated standard errors that $\bar{x}$ is from the hypothesized mean $\mu$.
35
35 Hypothesis-Testing Procedure As with confidence intervals, the CLT indicates that the t-distribution can also be applied to nonnormal populations when the sample size is sufficiently large.
36
36 Example 6 – Two-Tailed Hypothesis Test for $\mu$ With $\sigma$ Unknown On a popular self-image test that results in normally distributed scores, the mean score for public-assistance recipients is expected to be 65. A random sample of 28 public-assistance recipients in Emerson County is given the test. They achieve a mean score of 62.1, and their scores have a standard deviation of 5.83. At the 0.02 level of significance, do the Emerson County public-assistance recipients test differently, on average, from what is expected?
37
37 Example 6 – Solution Step 1 The Set-Up: a. Describe the population parameter of interest. $\mu$, the mean self-image test score for all Emerson County public-assistance recipients. b. State the null hypothesis ($H_o$) and the alternative hypothesis ($H_a$). $H_o$: $\mu = 65$ (mean is 65); $H_a$: $\mu \neq 65$ (mean is different from 65)
38
38 Example 6 – Solution (cont’d) Step 2 The Hypothesis Test Criteria: a. Check the assumptions. The test is expected to produce normally distributed scores; therefore, the assumption has been satisfied; $\sigma$ is unknown. b. Identify the probability distribution and the test statistic to be used. The t-distribution with df = n – 1 = 27, and the test statistic is $t^{\star}$, formula (9.2). c. Determine the level of significance: $\alpha = 0.02$ (given in statement of problem).
39
39 Example 6 – Solution (cont’d) Step 3 The Sample Evidence: a. Collect the sample information: n = 28, $\bar{x}$ = 62.1, and s = 5.83. b. Calculate the value of the test statistic. Use formula (9.2): $t^{\star} = (\bar{x} - \mu)/(s/\sqrt{n}) = (62.1 - 65.0)/(5.83/\sqrt{28}) = -2.632 = -2.63$
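As a quick check on Step 3, the test statistic can be recomputed directly from the sample evidence. The variable names are illustrative.

```python
import math

n, x_bar, s = 28, 62.1, 5.83   # sample evidence from Example 6
mu_0 = 65                      # mean stated in the null hypothesis

# Distance of the sample mean from mu_0, in estimated standard errors.
t_star = (x_bar - mu_0) / (s / math.sqrt(n))
print(round(t_star, 3))
```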
40
40 Example 6 – Solution (cont’d) Step 4 The Probability Distribution: Using the p-value procedure: a. Calculate the p-value for the test statistic. Use both tails because $H_a$ expresses concern for values related to “different from.” $P = P(t < -2.63) + P(t > 2.63) = 2\,P(t > 2.63)$, with df = 27, as shown in the figure.
41
41 Example 6 – Solution (cont’d) To find the p-value, use one of three methods: 1. Use Table 6 in Appendix B to place bounds on the p-value: $0.01 < P < 0.02$. 2. Use Table 7 in Appendix B to place bounds on the p-value: $0.012 < P < 0.016$. 3. Use a computer or calculator to calculate the p-value: P = 0.0140.
42
42 Example 6 – Solution (cont’d) b. Determine whether or not the p-value is smaller than $\alpha$. The p-value is smaller than the level of significance, $\alpha$. Using the classical procedure: a. Determine the critical region and critical value(s). The critical region is both tails because $H_a$ expresses concern for values related to “different from.”
43
43 Example 6 – Solution (cont’d) The critical value is found at the intersection of the df = 27 row and the one-tailed 0.01 column of Table 6: t(27, 0.01) = 2.47. b. Determine whether or not the calculated test statistic is in the critical region. $t^{\star}$ is in the critical region, as shown in red in the preceding figure.
44
44 Example 6 – Solution (cont’d) Step 5 The Results: a. State the decision about $H_o$: Reject $H_o$. b. State the conclusion about $H_a$. At the 0.02 level of significance, we do have sufficient evidence to conclude that the Emerson County assistance recipients test significantly differently, on average, from the expected 65.
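The two decision rules in Example 6 reach the same verdict, which the short sketch below makes explicit. All four numbers are taken from the example as given (rounded test statistic, reported p-value, table critical value); the variable names are illustrative.

```python
alpha = 0.02       # level of significance
t_star = -2.63     # calculated test statistic (Step 3)
p_value = 0.0140   # two-tailed p-value reported in Step 4
t_crit = 2.47      # t(27, 0.01) from Table 6; alpha/2 in each tail

# p-value procedure: reject Ho when the p-value is smaller than alpha.
reject_by_p_value = p_value < alpha

# Classical procedure: reject Ho when t_star falls in either tail
# beyond the critical value.
reject_by_critical_region = abs(t_star) > t_crit

decision = "Reject Ho" if reject_by_p_value else "Fail to reject Ho"
print(reject_by_p_value, reject_by_critical_region, decision)
```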
45
45 Hypothesis-Testing Procedure Calculating the p-value when using the t-distribution. Method 1: Using Table 6, find 2.63 between two entries in the df = 27 row and read the bounds for P from the two-tailed heading at the top of the table: $0.01 < P < 0.02$. Method 2: Generally, bounds found using Table 7 will be narrower than bounds found using Table 6.
46
46 Hypothesis-Testing Procedure The following table shows you how to read the bounds from Table 7; find t = 2.63 between two rows and df = 27 between two columns, and locate the four intersections of these columns and rows. The value of P is bounded by the upper left and the lower right of these table entries.
47
47 Hypothesis-Testing Procedure Method 3: If you are doing the hypothesis test with the aid of a computer or calculator, it will most likely calculate the p-value for you; that value already accounts for both tails, so do not double it. Alternatively, you may use the cumulative probability distribution commands.
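For readers without statistical software, the two-tailed p-value of Example 6 can be approximated using only the Python standard library. This is a sketch under assumptions: it integrates the t density with Simpson's rule, which is a generic numerical method and not the procedure any particular calculator uses. The function names are illustrative.

```python
import math

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def two_tailed_p(t_star, df, steps=2000):
    """P(T < -|t_star|) + P(T > |t_star|) for Student's t with df."""
    a = abs(t_star)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    middle = total * h / 3        # area between 0 and |t_star|
    one_tail = 0.5 - middle       # area beyond |t_star| on one side
    return 2 * one_tail           # both tails, by symmetry

p = two_tailed_p(-2.63, 27)
print(round(p, 4))
```

Because this function works from a single tail, the doubling is done explicitly inside it. The result should fall within the Table 7 bounds of 0.012 and 0.016 and close to the 0.0140 reported in Example 6.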