Copyright © Cengage Learning. All rights reserved. 9 Inferences Based on Two Samples
9.5 Inferences Concerning Two Population Variances
Methods for comparing two population variances (or standard deviations) are occasionally needed, though such problems arise much less frequently than those involving means or proportions. For the case in which the populations under investigation are normal, the procedures are based on a new family of probability distributions.
The F Distribution
The F probability distribution has two parameters, denoted by $\nu_1$ and $\nu_2$. The parameter $\nu_1$ is called the number of numerator degrees of freedom, and $\nu_2$ is the number of denominator degrees of freedom; here $\nu_1$ and $\nu_2$ are positive integers. A random variable that has an F distribution cannot assume a negative value. Since the density function is complicated and will not be used explicitly, we omit the formula. There is an important connection between an F variable and chi-squared variables.
If $X_1$ and $X_2$ are independent chi-squared rv's with $\nu_1$ and $\nu_2$ df, respectively, then the rv

$$F = \frac{X_1/\nu_1}{X_2/\nu_2} \tag{9.8}$$

(the ratio of the two chi-squared variables divided by their respective degrees of freedom) can be shown to have an F distribution.
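The chi-squared connection can be checked by simulation. The sketch below is illustrative only: the function name `simulate_f`, the seed, and the number of draws are assumptions, and it relies on the standard fact that a chi-squared variable with k df is a gamma variable with shape k/2 and scale 2.

```python
import random

def simulate_f(v1, v2, n_draws, seed=0):
    """Draw F variables as ratios of independent chi-squared variables,
    each divided by its own degrees of freedom."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x1 = rng.gammavariate(v1 / 2.0, 2.0)  # chi-squared with v1 df
        x2 = rng.gammavariate(v2 / 2.0, 2.0)  # chi-squared with v2 df
        draws.append((x1 / v1) / (x2 / v2))
    return draws

draws = simulate_f(6, 10, 20000)
# Share of simulated values beyond the tabled critical value 3.22 (v1 = 6, v2 = 10)
share_above = sum(d > 3.22 for d in draws) / len(draws)
print(min(draws) >= 0.0, share_above)
```

With 6 and 10 df, the share of draws beyond 3.22 should land near .05, and no draw is negative.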
Figure 9.8 illustrates the graph of a typical F density function.

[Figure 9.8: An F density curve and critical value]
Analogous to the notation $t_{\alpha,\nu}$ and $\chi^2_{\alpha,\nu}$, we use $F_{\alpha,\nu_1,\nu_2}$ for the value on the horizontal axis that captures $\alpha$ of the area under the F density curve with $\nu_1$ and $\nu_2$ df in the upper tail. The density curve is not symmetric, so it would seem that both upper- and lower-tail critical values must be tabulated. This is not necessary, though, because of the fact that $F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$.
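The reciprocal relationship between lower- and upper-tail critical values follows from (9.8): interchanging the two chi-squared variables shows that the reciprocal of an F variable is again an F variable with the degrees of freedom swapped. A brief sketch of the argument:

```latex
\begin{aligned}
F \sim F_{\nu_1,\nu_2}
  \;&\Longrightarrow\;
  \frac{1}{F} = \frac{X_2/\nu_2}{X_1/\nu_1} \sim F_{\nu_2,\nu_1}, \\
\alpha = P\!\left(F \le F_{1-\alpha,\nu_1,\nu_2}\right)
  &= P\!\left(\frac{1}{F} \ge \frac{1}{F_{1-\alpha,\nu_1,\nu_2}}\right)
  \;\Longrightarrow\;
  \frac{1}{F_{1-\alpha,\nu_1,\nu_2}} = F_{\alpha,\nu_2,\nu_1}.
\end{aligned}
```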
Appendix Table A.9 gives $F_{\alpha,\nu_1,\nu_2}$ for $\alpha$ = .10, .05, .01, and .001, and various values of $\nu_1$ (in different columns of the table) and $\nu_2$ (in different groups of rows of the table). For example, $F_{.05,6,10} = 3.22$ and $F_{.05,10,6} = 4.06$. The critical value $F_{.95,6,10}$, which captures .95 of the area to its right (and thus .05 to the left) under the F curve with $\nu_1 = 6$ and $\nu_2 = 10$, is $F_{.95,6,10} = 1/F_{.05,10,6} = 1/4.06 = .246$.
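The tabled values can be checked numerically. The sketch below is not the source's method: it assumes the standard relation between the F upper-tail area and an incomplete beta integral, evaluates that integral with Simpson's rule, and is restricted to df of at least 2 so that the integrand stays bounded. The name `f_upper_tail` is hypothetical.

```python
import math

def f_upper_tail(f, v1, v2, steps=2000):
    """Approximate P(F > f) for an F rv with v1 numerator and v2 denominator df.

    Uses P(F > f) = I_x(v2/2, v1/2) with x = v2 / (v2 + v1*f), where I_x is the
    regularized incomplete beta function, integrated by Simpson's rule.
    """
    if v1 < 2 or v2 < 2:
        raise ValueError("this sketch assumes both df are at least 2")
    if f <= 0:
        return 1.0  # an F variable cannot be negative
    a, b = v2 / 2.0, v1 / 2.0
    x = v2 / (v2 + v1 * f)
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

    def g(t):
        # Unnormalized beta density
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    h = x / steps  # steps must be even for Simpson's rule
    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return norm * total * h / 3.0

print(f_upper_tail(3.22, 6, 10))        # area to the right of F_.05,6,10
print(f_upper_tail(1 / 4.06, 6, 10))    # area to the right of 1/F_.05,10,6
```

The first area should be close to .05 and the second close to .95, in line with the reciprocal relationship used to obtain lower-tail critical values.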
The F Test for Equality of Variances
A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is based on the following result.

Theorem. Let $X_1, \ldots, X_m$ be a random sample from a normal distribution with variance $\sigma_1^2$, let $Y_1, \ldots, Y_n$ be another random sample (independent of the $X_i$'s) from a normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the two sample variances. Then the rv

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \tag{9.9}$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.
This theorem results from combining (9.8) with the fact that the variables $(m - 1)S_1^2/\sigma_1^2$ and $(n - 1)S_2^2/\sigma_2^2$ each have a chi-squared distribution with $m - 1$ and $n - 1$ df, respectively. Because F involves a ratio rather than a difference, the test statistic is the ratio of sample variances. The claim that $\sigma_1^2 = \sigma_2^2$ is then rejected if the ratio differs by too much from 1.
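Written out, the combination of (9.8) with the two chi-squared variables is a one-line calculation: dividing each chi-squared variable by its df cancels the factors $m - 1$ and $n - 1$.

```latex
F = \frac{\left[(m-1)S_1^2/\sigma_1^2\right]/(m-1)}
         {\left[(n-1)S_2^2/\sigma_2^2\right]/(n-1)}
  = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2},
\qquad\text{which reduces to } \frac{S_1^2}{S_2^2}
\text{ when } \sigma_1^2 = \sigma_2^2 .
```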
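Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
Test statistic value: $f = s_1^2/s_2^2$

Alternative Hypothesis              Rejection Region for a Level $\alpha$ Test
$H_a: \sigma_1^2 > \sigma_2^2$      $f \ge F_{\alpha,m-1,n-1}$
$H_a: \sigma_1^2 < \sigma_2^2$      $f \le F_{1-\alpha,m-1,n-1}$
$H_a: \sigma_1^2 \ne \sigma_2^2$    either $f \ge F_{\alpha/2,m-1,n-1}$ or $f \le F_{1-\alpha/2,m-1,n-1}$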
Since critical values are tabled only for $\alpha$ = .10, .05, .01, and .001, the two-tailed test can be performed only at levels .20, .10, .02, and .002. Other F critical values can be obtained from statistical software.
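As one illustration of how software can supply critical values that the table lacks, the sketch below inverts a numerically computed upper-tail area by bisection. It is an assumption-laden example rather than the routine any particular package uses: `f_upper_tail` and `f_critical` are hypothetical names, the tail area comes from Simpson's rule applied to the incomplete beta integral, and both df are assumed to be at least 2. The small tail-area helper is included so the block stands alone.

```python
import math

def f_upper_tail(f, v1, v2, steps=2000):
    """Approximate P(F > f) via the incomplete beta integral (df >= 2 assumed)."""
    if v1 < 2 or v2 < 2:
        raise ValueError("this sketch assumes both df are at least 2")
    if f <= 0:
        return 1.0
    a, b = v2 / 2.0, v1 / 2.0
    x = v2 / (v2 + v1 * f)
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    g = lambda t: t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    h = x / steps
    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return norm * total * h / 3.0

def f_critical(alpha, v1, v2, tol=1e-6):
    """Find the value with upper-tail area alpha by bisection."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, v1, v2) > alpha:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, v1, v2) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

# A critical value for a two-tailed level .05 test, which the table cannot supply
print(f_critical(0.025, 6, 10))
```

Calling `f_critical(0.05, 6, 10)` should reproduce the tabled 3.22 to two decimals, which gives a quick sanity check before trusting untabled levels.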
Example 14. On the basis of data reported in the article “Serum Ferritin in an Elderly Population” (J. of Gerontology, 1979: 521–524), the authors concluded that the ferritin distribution in the elderly had a smaller variance than in the younger adults. (Serum ferritin is used in diagnosing iron deficiency.) For a sample of 28 elderly men, the sample standard deviation of serum ferritin (mg/L) was $s_1 = 52.6$; for 26 young men, the sample standard deviation was $s_2 = 84.2$. Does this data support the conclusion as applied to men?
Example 14 (cont'd). Let $\sigma_1^2$ and $\sigma_2^2$ denote the variance of the serum ferritin distributions for elderly men and young men, respectively. The hypotheses of interest are $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 < \sigma_2^2$. At level .01, $H_0$ will be rejected if $f \le F_{.99,27,25}$. To obtain the critical value, we need $F_{.01,25,27}$. From Appendix Table A.9, $F_{.01,25,27} = 2.54$, so $F_{.99,27,25} = 1/2.54 = .394$. The computed value of F is $(52.6)^2/(84.2)^2 = .390$. Since $.390 \le .394$, $H_0$ is rejected at level .01 in favor of $H_a$, so variability does appear to be greater in young men than in elderly men.
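The arithmetic of Example 14 can be reproduced in a few lines. This is only a sketch of the lower-tailed decision rule using the tabled value quoted in the example; the function name `lower_tailed_f_test` and its argument names are illustrative.

```python
def lower_tailed_f_test(s1, s2, upper_crit_swapped_df):
    """Lower-tailed F test from two sample standard deviations.

    upper_crit_swapped_df is the tabled upper-tail critical value with the
    degrees of freedom interchanged (here F_.01,25,27); its reciprocal is the
    lower-tail critical value (here F_.99,27,25).
    """
    f = (s1 / s2) ** 2                      # ratio of sample variances
    lower_crit = 1.0 / upper_crit_swapped_df
    return f, lower_crit, f <= lower_crit

f, crit, reject = lower_tailed_f_test(52.6, 84.2, 2.54)
print(round(f, 3), round(crit, 3), reject)
```

The margin is narrow (.390 against .394), so the conclusion at level .01 is sensitive to rounding in the reported standard deviations.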
P-Values for F Tests
Recall that the P-value for an upper-tailed t test is the area under the relevant t curve (the one with appropriate df) to the right of the calculated t. In the same way, the P-value for an upper-tailed F test is the area under the F curve with appropriate numerator and denominator df to the right of the calculated f.
Figure 9.9 illustrates this for a test based on $\nu_1 = 4$ and $\nu_2 = 6$.

[Figure 9.9: A P-value for an upper-tailed F test]
Tabulation of F-curve upper-tail areas is much more cumbersome than for t curves because two df's are involved. For each combination of $\nu_1$ and $\nu_2$, our F table gives only the four critical values that capture areas .10, .05, .01, and .001.
Figure 9.10 shows what can be said about the P-value depending on where f falls relative to the four critical values.

[Figure 9.10: Obtaining P-value information from the F table for an upper-tailed F test]
For example, for a test with $\nu_1 = 4$ and $\nu_2 = 6$,

f = [value missing] ⇒ .01 < P-value < .05
f = 2.16 ⇒ P-value > .10
f = [value missing] ⇒ P-value < .001

Only if f equals a tabulated value do we obtain an exact P-value (e.g., if f = 4.53, then P-value = .05).
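The table lookup amounts to finding which pair of tabled critical values brackets f. A minimal sketch follows; the helper name `bracket_p_value` is hypothetical, and of the four critical values for 4 and 6 df only 4.53 is quoted in the text, so the other three are assumed from a standard F table.

```python
# Upper-tail area -> critical value for v1 = 4, v2 = 6.
# 4.53 appears in the text; 3.18, 9.15, and 21.92 are assumed table entries.
TABLE_4_6 = {0.10: 3.18, 0.05: 4.53, 0.01: 9.15, 0.001: 21.92}

def bracket_p_value(f, table):
    """Return (lower, upper) bounds on the P-value of an upper-tailed F test.

    If f is at or beyond a critical value, the P-value is at most that area;
    otherwise the P-value exceeds that area.
    """
    lower, upper = 0.0, 1.0
    for area, crit in table.items():
        if f >= crit:
            upper = min(upper, area)
        else:
            lower = max(lower, area)
    return lower, upper

print(bracket_p_value(2.16, TABLE_4_6))  # P-value > .10
print(bracket_p_value(5.82, TABLE_4_6))  # .01 < P-value < .05
```

When f coincides with a tabled value, the upper bound returned is the exact P-value.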
Once we know that .01 < P-value < .05, $H_0$ would be rejected at a significance level of .05 but not at a level of .01. When P-value < .001, $H_0$ should be rejected at any reasonable significance level. The F tests discussed in succeeding chapters will all be upper-tailed. If, however, a lower-tailed F test is appropriate, then lower-tailed critical values should be obtained as described earlier so that a bound or bounds on the P-value can be established.
In the case of a two-tailed test, the bound or bounds from a one-tailed test should be multiplied by 2. For example, if f = 5.82 when $\nu_1 = 4$ and $\nu_2 = 6$, then since 5.82 falls between the .05 and .01 critical values, 2(.01) < P-value < 2(.05), giving .02 < P-value < .10. $H_0$ would then be rejected if $\alpha$ = .10 but not if $\alpha$ = .01. In this case, we cannot say from our table what conclusion is appropriate when $\alpha$ = .05 (since we don't know whether the P-value is smaller or larger than this).
However, statistical software shows that the area to the right of 5.82 under this F curve is .029, so the P-value is .058 and the null hypothesis should therefore not be rejected at level .05 (.058 is the smallest $\alpha$ for which $H_0$ can be rejected, and our chosen $\alpha$ is smaller than this). Various statistical software packages will, of course, provide an exact P-value for any F test.
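For this particular pair of df the software result can be reproduced by hand. The sketch below is a special case, not a general routine: for 4 and 6 df the upper-tail area reduces to the polynomial $4x^3 - 3x^4$ with $x = 6/(6 + 4f)$, which comes from the incomplete beta integral with integer parameters 3 and 2. The function name is hypothetical.

```python
def upper_tail_f_4_6(f):
    """Exact P(F > f) for v1 = 4, v2 = 6 only (closed form of I_x(3, 2))."""
    x = 6.0 / (6.0 + 4.0 * f)
    return 4.0 * x ** 3 - 3.0 * x ** 4

tail = upper_tail_f_4_6(5.82)
# Two-tailed P-value: double the smaller of the two tail areas
p_two_tailed = 2.0 * min(tail, 1.0 - tail)
print(round(tail, 3), round(p_two_tailed, 3))
```

The same function returns about .05 at f = 4.53, matching the tabled critical value.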
A Confidence Interval for $\sigma_1/\sigma_2$
The CI for $\sigma_1^2/\sigma_2^2$ is based on replacing F in the probability statement

$$P\left(F_{1-\alpha/2,\nu_1,\nu_2} < F < F_{\alpha/2,\nu_1,\nu_2}\right) = 1 - \alpha$$

by the F variable (9.9) and manipulating the inequalities to isolate $\sigma_1^2/\sigma_2^2$. An interval for $\sigma_1/\sigma_2$ results from taking the square root of each limit.
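The manipulation can be sketched as follows. The limits shown are reconstructed from the probability statement and (9.9), with $\nu_1 = m - 1$ and $\nu_2 = n - 1$, rather than quoted from the source.

```latex
\begin{aligned}
1-\alpha
 &= P\!\left(F_{1-\alpha/2,\nu_1,\nu_2}
      < \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
      < F_{\alpha/2,\nu_1,\nu_2}\right) \\
 &= P\!\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2,\nu_1,\nu_2}}
      < \frac{\sigma_1^2}{\sigma_2^2}
      < \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2,\nu_1,\nu_2}}\right),
\qquad
\frac{1}{F_{1-\alpha/2,\nu_1,\nu_2}} = F_{\alpha/2,\nu_2,\nu_1}.
\end{aligned}
```

Substituting the observed $s_1^2$ and $s_2^2$ gives the limits for $\sigma_1^2/\sigma_2^2$; their square roots are the limits for $\sigma_1/\sigma_2$.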