McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Statistical Inferences Based on Two Samples Chapter 9
9-2 Statistical Inferences Based on Two Samples
9.1 Comparing Two Population Means by Using Independent Samples: Variances Known
9.2 Comparing Two Population Means by Using Independent Samples: Variances Unknown
9.3 Paired Difference Experiments
9.4 Comparing Two Population Proportions by Using Large Independent Samples
9.5 Comparing Two Population Variances by Using Independent Samples
9-3 Comparing Two Population Means by Using Independent Samples: Variances Known
Suppose a random sample has been taken from each of two different populations
Suppose that the populations are independent of each other
Then the random samples are independent of each other
Then the sampling distribution of the difference in sample means is normal (or approximately normal when both sample sizes are large)
9-4 Sampling Distribution of the Difference of Two Sample Means #1
Suppose one population, call it population 1, has mean $\mu_1$ and variance $\sigma_1^2$
From population 1, a random sample of size $n_1$ is selected, which has mean $\bar{x}_1$ and variance $s_1^2$
Suppose another population, call it population 2, has mean $\mu_2$ and variance $\sigma_2^2$
From population 2, a random sample of size $n_2$ is selected, which has mean $\bar{x}_2$ and variance $s_2^2$
Then …
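9-5 Sampling Distribution of the Difference of Two Sample Means #2
The sampling distribution of the difference of two sample means, $\bar{x}_1 - \bar{x}_2$:
1. Is normal if each of the sampled populations is normal, and approximately normal if the sample sizes $n_1$ and $n_2$ are large
2. Has mean $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
3. Has standard deviation $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$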
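The mean and standard deviation of $\bar{x}_1 - \bar{x}_2$ follow from two standard facts: expectations subtract, and variances of independent quantities add. A short derivation sketch, assuming the usual result $\operatorname{Var}(\bar{x}) = \sigma^2 / n$ for a sample mean:

```latex
\begin{aligned}
E(\bar{x}_1 - \bar{x}_2) &= E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \\
\operatorname{Var}(\bar{x}_1 - \bar{x}_2) &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \\
\sigma_{\bar{x}_1 - \bar{x}_2} &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\end{aligned}
```

The variances add (rather than subtract) because the two samples are independent.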
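9-6 z-Based Confidence Interval for the Difference in Means (Variances Known) #2
Then a $100(1 - \alpha)$ percent confidence interval for the difference in population means $\mu_1 - \mu_2$ is
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$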
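The z-based interval can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `z_interval_diff_means` and the sample numbers are made up for the example and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_means(xbar1, xbar2, var1, var2, n1, n2, conf_level=0.95):
    """Return (lower, upper) for mu1 - mu2 when both population variances are known."""
    alpha = 1.0 - conf_level
    # z point with right-hand tail area alpha/2 under the standard normal curve
    z_half_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # standard deviation of the sampling distribution of xbar1 - xbar2
    std_error = sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    margin = z_half_alpha * std_error
    return diff - margin, diff + margin


# Hypothetical inputs, for illustration only.
lower, upper = z_interval_diff_means(10.0, 8.0, 4.0, 9.0, 100, 100)
print(f"95% CI for mu1 - mu2: [{lower:.3f}, {upper:.3f}]")
```

If the interval excludes 0, the data suggest the two population means differ at the chosen confidence level.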
9-7 z-Based Test About the Difference in Means (Variances Known)
Test the null hypothesis $H_0: \mu_1 - \mu_2 = D_0$
$D_0$ is the claimed difference between the population means, $\mu_1 - \mu_2$
$D_0$ is a number whose value varies depending on the situation
Often $D_0 = 0$, and the null means that there is no difference between the population means
Use the notation from the confidence interval statement on the prior slide
Assume that each sampled population is normal or that the sample sizes $n_1$ and $n_2$ are large
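9-8 Test Statistic (Variances Known)
The test statistic is
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
The sampling distribution of this statistic is a standard normal distribution
If the populations are normal and the samples are independent...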
9-9 z-Based Test About the Difference in Means (Variances Known) #2
Alternative | Reject $H_0$ if: | p-value
$H_a: \mu_1 - \mu_2 > D_0$ | $z > z_\alpha$ | Area under standard normal to the right of $z$
$H_a: \mu_1 - \mu_2 < D_0$ | $z < -z_\alpha$ | Area under standard normal to the left of $z$
$H_a: \mu_1 - \mu_2 \neq D_0$ | $\lvert z \rvert > z_{\alpha/2}$ * | Twice the area under standard normal to the right of $\lvert z \rvert$
* either $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
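The rejection rules and p-values above can be sketched as follows. This is an illustrative helper, not a reference implementation; the name `z_test_diff_means` and the `alternative` labels ("greater", "less", "two-sided") are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_diff_means(xbar1, xbar2, var1, var2, n1, n2, d0=0.0, alternative="two-sided"):
    """Return (z, p_value) for H0: mu1 - mu2 = d0 with known population variances."""
    z = ((xbar1 - xbar2) - d0) / sqrt(var1 / n1 + var2 / n2)
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: mu1 - mu2 > d0
        p_value = 1.0 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: mu1 - mu2 < d0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: mu1 - mu2 != d0
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical inputs: the standard error works out to 1, so z equals the difference.
z, p = z_test_diff_means(6.96, 5.0, 50.0, 50.0, 100, 100)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

Rejecting $H_0$ when the p-value is below $\alpha$ is equivalent to comparing $z$ with the critical value in the table.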
9-10 Comparing Two Population Means by Using Independent Samples: Variances Unknown
Generally, the true values of the population variances $\sigma_1^2$ and $\sigma_2^2$ are not known
They have to be estimated from the sample variances $s_1^2$ and $s_2^2$, respectively
Also need to estimate the standard deviation of the sampling distribution of the difference between sample means
Two approaches:
If it can be assumed that $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then calculate the “pooled estimate” of $\sigma^2$
If $\sigma_1^2 \neq \sigma_2^2$, then use approximate methods
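9-11 Pooled Estimate of $\sigma^2$
Assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$
The pooled estimate of $\sigma^2$ is the weighted average of the two sample variances, $s_1^2$ and $s_2^2$
The pooled estimate of $\sigma^2$ is denoted by $s_p^2$
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The estimate of the standard deviation of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$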
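9-12 t-Based Confidence Interval for the Difference in Means (Variances Unknown)
Select two independent random samples from two normal populations with equal variances
Then a $100(1 - \alpha)$ percent confidence interval for the difference in population means $\mu_1 - \mu_2$ is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ degrees of freedom (df)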
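A minimal sketch of the pooled interval, using only the Python standard library. Because the standard library has no t distribution, the t point $t_{\alpha/2}$ is passed in by the caller (for example, read from a t table); the function names and the numbers are hypothetical.

```python
from math import sqrt


def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Weighted average of the two sample variances (weights are n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_t_interval(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_half_alpha):
    """Return (lower, upper) for mu1 - mu2, assuming equal population variances.

    t_half_alpha is the t point for n1 + n2 - 2 degrees of freedom,
    supplied by the caller (e.g. from a t table).
    """
    sp_sq = pooled_variance(s1_sq, s2_sq, n1, n2)
    std_error = sqrt(sp_sq * (1.0 / n1 + 1.0 / n2))
    diff = xbar1 - xbar2
    margin = t_half_alpha * std_error
    return diff - margin, diff + margin


# Hypothetical inputs; 2.101 is the tabled t point for alpha/2 = 0.025 and 18 df.
lower, upper = pooled_t_interval(12.0, 10.0, 4.0, 6.0, 10, 10, t_half_alpha=2.101)
print(f"95% CI for mu1 - mu2: [{lower:.3f}, {upper:.3f}]")
```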
9-13 Test Statistic (Variances Unknown)
The test statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $D_0$ is the claimed difference between the population means, $\mu_1 - \mu_2$
The sampling distribution of this statistic is a t distribution with $(n_1 + n_2 - 2)$ degrees of freedom
9-14 t-Based Test About the Difference in Means (Variances Unknown) #3
Alternative | Reject $H_0$ if: | p-value
$H_a: \mu_1 - \mu_2 > D_0$ | $t > t_\alpha$ | Area under t distribution to the right of $t$
$H_a: \mu_1 - \mu_2 < D_0$ | $t < -t_\alpha$ | Area under t distribution to the left of $t$
$H_a: \mu_1 - \mu_2 \neq D_0$ | $\lvert t \rvert > t_{\alpha/2}$ * | Twice the area under t distribution to the right of $\lvert t \rvert$
* either $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
where $t_\alpha$, $t_{\alpha/2}$, and p-values are based on $(n_1 + n_2 - 2)$ degrees of freedom
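The pooled t statistic and the rejection rules can be sketched as below. The critical value is supplied by the caller ($t_\alpha$ for a one-sided alternative, $t_{\alpha/2}$ for a two-sided one), since the Python standard library does not provide t quantiles; the helper names and inputs are hypothetical.

```python
from math import sqrt


def pooled_t_statistic(xbar1, xbar2, s1_sq, s2_sq, n1, n2, d0=0.0):
    """Return (t, df) for H0: mu1 - mu2 = d0, assuming equal population variances."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    t = ((xbar1 - xbar2) - d0) / sqrt(sp_sq * (1.0 / n1 + 1.0 / n2))
    return t, df


def reject_h0(t, t_crit, alternative):
    """Apply the rejection rule; t_crit is t_alpha (one-sided) or t_{alpha/2} (two-sided)."""
    if alternative == "greater":
        return t > t_crit
    if alternative == "less":
        return t < -t_crit
    if alternative == "two-sided":
        return abs(t) > t_crit
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Hypothetical inputs; tabled t points for 18 df: t_0.05 = 1.734, t_0.025 = 2.101.
t, df = pooled_t_statistic(12.0, 10.0, 4.0, 6.0, 10, 10)
print(f"t = {t:.3f} with {df} df")
print("reject (greater, alpha = 0.05):", reject_h0(t, 1.734, "greater"))
print("reject (two-sided, alpha = 0.05):", reject_h0(t, 2.101, "two-sided"))
```

With these made-up numbers the one-sided test rejects while the two-sided test does not, which shows why the alternative has to be chosen before looking at the data.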
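9-15 Small Sample Intervals and Tests about Differences in Means When Variances are Not Equal
If sampled populations are both normal, but sample sizes and variances differ substantially, small-sample estimation and testing can be based on the following “unequal variance” procedure
Confidence interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
For both the interval and test, the degrees of freedom are equal to
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$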
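The degrees-of-freedom expression is the fiddly part of the unequal variance procedure, so a small sketch may help. It assumes the usual Welch-Satterthwaite form of the approximation; the function names are hypothetical, and rounding the result down before using a t table is a common convention rather than something stated on the slide.

```python
from math import floor, sqrt


def unequal_variance_df(s1_sq, s2_sq, n1, n2):
    """Approximate degrees of freedom for the unequal variance procedure."""
    v1 = s1_sq / n1
    v2 = s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def unequal_variance_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2, d0=0.0):
    """Return (t, df) without pooling the two sample variances."""
    t = ((xbar1 - xbar2) - d0) / sqrt(s1_sq / n1 + s2_sq / n2)
    return t, unequal_variance_df(s1_sq, s2_sq, n1, n2)


# Hypothetical inputs with very different sample variances.
t, df = unequal_variance_t(12.0, 10.0, 1.0, 16.0, 10, 10)
print(f"t = {t:.3f}, df = {df:.2f} (table lookup with {floor(df)} df)")
```

When the sample sizes and sample variances are equal, the formula returns $n_1 + n_2 - 2$, the same degrees of freedom as the pooled procedure.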
9-16 Paired Difference Experiments
Before, drew random samples from two different populations
Now, have two different processes (or methods)
Draw one random sample of units and use those units to obtain the results of each process
For instance, use the same individuals for the results from one process vs. the results from the other process
E.g., use the same individuals to compare “before” and “after” treatments
By using the same individuals, we eliminate any differences in the individuals themselves and compare just the results from the two processes
9-17 Paired Difference Experiments Continued
Let $\mu_d$ be the mean of the population of paired differences
$\mu_d = \mu_1 - \mu_2$, where $\mu_1$ is the mean of population 1 and $\mu_2$ is the mean of population 2
Let $\bar{d}$ and $s_d$ be the mean and standard deviation of a sample of paired differences that has been randomly selected from the population
$\bar{d}$ is the mean of the differences between pairs of values from both samples
9-18 t-Based Confidence Interval for Paired Differences in Means
If the sampled population of differences is normally distributed with mean $\mu_d$, then a $100(1 - \alpha)$ percent confidence interval for $\mu_d$ is
$$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$
where for a sample of size $n$, $t_{\alpha/2}$ is based on $n - 1$ degrees of freedom
9-19 Test Statistic for Paired Differences
The test statistic is
$$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$$
$D_0$ is the claimed or actual difference between the population means, $\mu_1 - \mu_2$
$D_0$ varies depending on the situation
Often $D_0 = 0$, and the null means that there is no difference between the population means
The sampling distribution of this statistic is a t distribution with $(n - 1)$ degrees of freedom
9-20 Paired Differences Testing Rules
Alternative | Reject $H_0$ if: | p-value
$H_a: \mu_d > D_0$ | $t > t_\alpha$ | Area under t distribution to the right of $t$
$H_a: \mu_d < D_0$ | $t < -t_\alpha$ | Area under t distribution to the left of $t$
$H_a: \mu_d \neq D_0$ | $\lvert t \rvert > t_{\alpha/2}$ * | Twice the area under t distribution to the right of $\lvert t \rvert$
* either $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
where $t_\alpha$, $t_{\alpha/2}$, and p-values are based on $(n - 1)$ degrees of freedom
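A paired analysis reduces to a one-sample analysis of the differences. The sketch below computes the paired t statistic and the matching confidence interval from two lists of paired measurements; the function names and data are hypothetical, and the t point is supplied by the caller because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(sample1, sample2):
    """Return (d_bar, s_d, n) for the paired differences sample1[i] - sample2[i]."""
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample1, sample2)]
    return mean(diffs), stdev(diffs), len(diffs)


def paired_t_statistic(sample1, sample2, d0=0.0):
    """Return (t, df) for H0: mu_d = d0."""
    d_bar, s_d, n = paired_summary(sample1, sample2)
    return (d_bar - d0) / (s_d / sqrt(n)), n - 1


def paired_t_interval(sample1, sample2, t_half_alpha):
    """Return (lower, upper) for mu_d; t_half_alpha is the t point for n - 1 df."""
    d_bar, s_d, n = paired_summary(sample1, sample2)
    margin = t_half_alpha * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical "before" and "after" measurements on the same four units.
before = [10.0, 12.0, 14.0, 16.0]
after = [9.0, 10.0, 11.0, 12.0]
t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {df} df")
# 3.182 is the tabled t point for alpha/2 = 0.025 and 3 df.
print(paired_t_interval(before, after, t_half_alpha=3.182))
```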
9-21 Comparing Two Population Proportions
Select a random sample of size $n_1$ from a population, and let $\hat{p}_1$ denote the proportion of units in this sample that fall into the category of interest
Select a random sample of size $n_2$ from another population, and let $\hat{p}_2$ denote the proportion of units in this sample that fall into the same category of interest
Suppose that $n_1$ and $n_2$ are large enough: $n_1 p_1 \geq 5$, $n_1(1 - p_1) \geq 5$, $n_2 p_2 \geq 5$, and $n_2(1 - p_2) \geq 5$
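9-22 Comparing Two Population Proportions Continued
Then the population of all possible values of $\hat{p}_1 - \hat{p}_2$:
Has approximately a normal distribution if each of the sample sizes $n_1$ and $n_2$ is large
Here, $n_1$ and $n_2$ are large enough if $n_1 p_1 \geq 5$, $n_1(1 - p_1) \geq 5$, $n_2 p_2 \geq 5$, and $n_2(1 - p_2) \geq 5$
Has mean $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$
Has standard deviation $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$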
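9-23 Confidence Interval for the Difference of Two Population Proportions
If the random samples are independent of each other, then a $100(1 - \alpha)$ percent confidence interval for $p_1 - p_2$ is
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$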
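A minimal sketch of the large-sample interval for $p_1 - p_2$, using only the standard library. It assumes the standard error divides by $n_1$ and $n_2$ (some texts divide by $n - 1$ instead); the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prop_diff_interval(x1, n1, x2, n2, conf_level=0.95):
    """Return (lower, upper) for p1 - p2 from success counts x1, x2 and sample sizes n1, n2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    z_half_alpha = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    # Assumption: plug the sample proportions into the standard deviation formula.
    std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_half_alpha * std_error
    return diff - margin, diff + margin


# Hypothetical counts: 60 of 100 in sample 1 and 50 of 100 in sample 2.
lower, upper = prop_diff_interval(60, 100, 50, 100)
print(f"95% CI for p1 - p2: [{lower:.4f}, {upper:.4f}]")
```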
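9-24 Test Statistic for the Difference of Two Population Proportions
The test statistic is
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sigma_{\hat{p}_1 - \hat{p}_2}}$$
where $\sigma_{\hat{p}_1 - \hat{p}_2}$ is the standard deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$, estimated from the sample data
$D_0$ is the claimed or actual difference between the population proportions, $p_1 - p_2$
$D_0$ is a number whose value varies depending on the situation
Often $D_0 = 0$, and the null means that there is no difference between the population proportions
The sampling distribution of this statistic is a standard normal distribution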
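One way to compute this statistic is sketched below. The slide does not say how the standard deviation is estimated, so the sketch makes an explicit assumption: when $D_0 = 0$ it uses the pooled proportion (a common textbook choice), and otherwise it plugs in the two sample proportions separately. The function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, d0=0.0):
    """Return (z, two_sided_p_value) for H0: p1 - p2 = d0."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    if d0 == 0.0:
        # Assumption: under H0 both populations share one proportion, so pool the samples.
        p_pooled = (x1 + x2) / (n1 + n2)
        std_error = sqrt(p_pooled * (1 - p_pooled) * (1.0 / n1 + 1.0 / n2))
    else:
        # Assumption: estimate each proportion separately when d0 is not zero.
        std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = ((p1_hat - p2_hat) - d0) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 of 100 in sample 1 and 50 of 100 in sample 2.
z, p = two_proportion_z_test(60, 100, 50, 100)
print(f"z = {z:.3f}, two-sided p-value = {p:.4f}")
```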
9-25 Comparing Two Population Variances Using Independent Samples
Population 1 has variance $\sigma_1^2$ and population 2 has variance $\sigma_2^2$
The null hypothesis $H_0$ is that the variances are the same: $H_0: \sigma_1^2 = \sigma_2^2$
The alternative is that one of them is smaller than the other
That population has less variable, more consistent, measurements
Suppose $\sigma_1^2 > \sigma_2^2$
More usual to normalize: test $H_0: \sigma_1^2 / \sigma_2^2 = 1$ vs. $H_a: \sigma_1^2 / \sigma_2^2 > 1$
9-26 Comparing Two Population Variances Using Independent Samples Continued
Reject $H_0$ in favor of $H_a$ if $s_1^2 / s_2^2$ is significantly greater than 1
$s_1^2$ is the variance of a random sample of size $n_1$ from a population with variance $\sigma_1^2$
$s_2^2$ is the variance of a random sample of size $n_2$ from a population with variance $\sigma_2^2$
To decide how large $s_1^2 / s_2^2$ must be to reject $H_0$, describe the sampling distribution of $s_1^2 / s_2^2$
When the populations are normal and $H_0$ is true, the sampling distribution of $s_1^2 / s_2^2$ is the F distribution with $n_1 - 1$ numerator and $n_2 - 1$ denominator degrees of freedom
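The variance-ratio test can be sketched as follows. The F point is passed in by the caller (read from an F table for the chosen right-hand tail area), because the Python standard library has no F distribution; the function name and data are hypothetical.

```python
from statistics import variance


def f_test_variances(sample1, sample2, f_crit):
    """Test H0: var1 / var2 = 1 against Ha: var1 / var2 > 1.

    Returns (F, df1, df2, reject), where f_crit is the tabled F point
    for df1 = n1 - 1 and df2 = n2 - 1 degrees of freedom.
    """
    f_stat = variance(sample1) / variance(sample2)  # ratio of sample variances
    df1 = len(sample1) - 1
    df2 = len(sample2) - 1
    return f_stat, df1, df2, f_stat > f_crit


# Hypothetical data; 6.39 is approximately the tabled 0.05 F point for 4 and 4 df.
sample1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample2 = [2.0, 2.5, 3.0, 3.5, 4.0]
f_stat, df1, df2, reject = f_test_variances(sample1, sample2, f_crit=6.39)
print(f"F = {f_stat:.2f} with ({df1}, {df2}) df, reject H0: {reject}")
```

Here the sample variance ratio is well above 1, but with only four degrees of freedom in each sample it is still not large enough to reject $H_0$ at the 0.05 level.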
9-27 F Distribution
Shape depends on two parameters: the numerator number of degrees of freedom ($df_1$) and the denominator number of degrees of freedom ($df_2$)
The F distribution is skewed to the right
9-28 F Distribution
The F point $F_\alpha$ is the point on the horizontal axis under the curve of the F distribution that gives a right-hand tail area equal to $\alpha$
The value of $F_\alpha$ depends on $\alpha$ (the size of the right-hand tail area) and $df_1$ and $df_2$
Different F tables for different values of $\alpha$
See:
Table A.5 for $\alpha = 0.10$
Table A.6 for $\alpha = 0.05$
Table A.7 for $\alpha$ =
Table A.8 for $\alpha = 0.01$