Slide 1: © 2008 Thomson South-Western. All Rights Reserved. Slides by John Loucks, St. Edward's University
Slide 2: Chapter 10, Part A. Statistical Inferences About Means and Proportions with Two Populations
- Inferences About the Difference Between Two Population Means: σ1 and σ2 Known
- Inferences About the Difference Between Two Population Means: σ1 and σ2 Unknown
- Inferences About the Difference Between Two Population Means: Matched Samples
Slide 3: Inferences About the Difference Between Two Population Means: σ1 and σ2 Known
- Interval Estimation of μ1 − μ2
- Hypothesis Tests About μ1 − μ2
In this chapter we show how interval estimates and hypothesis tests can be developed for situations involving two populations, when the difference between the two population means or the two population proportions is of prime importance. We use statistical inference to draw conclusions about these differences.
Slide 4: Estimating the Difference Between Two Population Means
- Let μ1 equal the mean of population 1 and μ2 equal the mean of population 2.
- The difference between the two population means is μ1 − μ2.
- To estimate μ1 − μ2, we select a simple random sample of size n1 from population 1 and a simple random sample of size n2 from population 2.
- Let x̄1 equal the mean of sample 1 and x̄2 equal the mean of sample 2.
- The point estimator of the difference between the means of populations 1 and 2 is x̄1 − x̄2.
Slide 5: Estimating the Difference Between Two Population Means
- We are focusing on inferences about the difference between the means: μ1 − μ2.
- The two samples, taken separately and independently, are referred to as independent simple random samples.
- We show how to compute a margin of error and develop an interval estimate.
Slide 6: Sampling Distribution of x̄1 − x̄2
- Expected Value
\[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]
- Standard Deviation (Standard Error)
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
where:
σ1 = standard deviation of population 1
σ2 = standard deviation of population 2
n1 = sample size from population 1
n2 = sample size from population 2
Slide 7: Interval Estimation of μ1 − μ2: σ1 and σ2 Known
- Interval Estimate
\[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
where 1 − α is the confidence coefficient.
Slide 8: Interval Estimation of μ1 − μ2: σ1 and σ2 Known
- Example: Par, Inc.
Par, Inc. is a manufacturer of golf equipment and has developed a new golf ball that has been designed to provide "extra distance." In a test of driving distance using a mechanical driving device, a sample of Par golf balls was compared with a sample of golf balls made by Rap, Ltd., a competitor. The sample statistics appear on the next slide.
Slide 9: Interval Estimation of μ1 − μ2: σ1 and σ2 Known
- Example: Par, Inc.
                 Sample #1      Sample #2
                 Par, Inc.      Rap, Ltd.
Sample Size      120 balls      80 balls
Sample Mean      275 yards      258 yards
Based on data from previous driving distance tests, the two population standard deviations are known, with σ1 = 15 yards and σ2 = 20 yards.
Slide 10: Interval Estimation of μ1 − μ2: σ1 and σ2 Known
- Example: Par, Inc.
Let us develop a 95% confidence interval estimate of the difference between the mean driving distances of the two brands of golf ball.
Slide 11: Estimating the Difference Between Two Population Means
[Diagram]
Population 1 (Par, Inc. golf balls): μ1 = mean driving distance of Par golf balls. A simple random sample of n1 Par golf balls gives x̄1 = sample mean distance for the Par golf balls.
Population 2 (Rap, Ltd. golf balls): μ2 = mean driving distance of Rap golf balls. A simple random sample of n2 Rap golf balls gives x̄2 = sample mean distance for the Rap golf balls.
μ1 − μ2 = difference between the mean distances; x̄1 − x̄2 = point estimate of μ1 − μ2.
Slide 12: Point Estimate of μ1 − μ2
Point estimate of μ1 − μ2 = x̄1 − x̄2 = 275 − 258 = 17 yards
where:
μ1 = mean distance for the population of Par, Inc. golf balls
μ2 = mean distance for the population of Rap, Ltd. golf balls
Slide 13: Interval Estimation of μ1 − μ2: σ1 and σ2 Known
Point estimate ± Margin of error:
\[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 17 \pm 1.96 \sqrt{\frac{(15)^2}{120} + \frac{(20)^2}{80}} \]
17 ± 5.14, or 11.86 yards to 22.14 yards
We are 95% confident that the difference between the mean driving distances of Par, Inc. balls and Rap, Ltd. balls is 11.86 to 22.14 yards.
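The interval on this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function name `two_mean_z_interval` is hypothetical, and the z value comes from the standard library's `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_interval(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Interval estimate of mu1 - mu2 when sigma1 and sigma2 are known."""
    point = x1 - x2                                       # point estimate
    std_err = sqrt(sigma1**2 / n1 + sigma2**2 / n2)       # standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # z_{alpha/2}
    margin = z * std_err                                  # margin of error
    return point, margin, (point - margin, point + margin)


# Par, Inc. example: sample means 275 and 258 yards, sigma 15 and 20, n 120 and 80
point, margin, (low, high) = two_mean_z_interval(275, 258, 15, 20, 120, 80)
print(f"{point} ± {margin:.2f}: {low:.2f} to {high:.2f} yards")
```

Passing a different `confidence` (for example 0.90) changes only the z multiplier; the point estimate and standard error stay the same.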
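Slide 14: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- Hypotheses
        Left-tailed         Right-tailed        Two-tailed
H0:     μ1 − μ2 ≥ D0        μ1 − μ2 ≤ D0        μ1 − μ2 = D0
Ha:     μ1 − μ2 < D0        μ1 − μ2 > D0        μ1 − μ2 ≠ D0
D0 is the hypothesized difference between μ1 and μ2.
- Test Statistic
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]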
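The three forms of the test differ only in which tail of the standard normal distribution supplies the p-value. A small sketch of that mapping follows; the helper name `z_test_p_value` and the `tail` labels are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist


def z_test_p_value(z, tail):
    """p-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":         # Ha: mu1 - mu2 < D0
        return std_normal.cdf(z)
    if tail == "right":        # Ha: mu1 - mu2 > D0
        return 1 - std_normal.cdf(z)
    if tail == "two":          # Ha: mu1 - mu2 != D0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")
```

In each case H0 is rejected when the returned p-value is less than or equal to the chosen level of significance α.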
Slide 15: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- Example: Par, Inc.
Can we conclude, using α = .01, that the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls?
Slide 16: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: μ1 − μ2 ≤ 0
Ha: μ1 − μ2 > 0
where:
μ1 = mean distance for the population of Par, Inc. golf balls
μ2 = mean distance for the population of Rap, Ltd. golf balls
2. Specify the level of significance. α = .01
Slide 17: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- p-Value and Critical Value Approaches
3. Compute the value of the test statistic.
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(275 - 258) - 0}{\sqrt{\dfrac{(15)^2}{120} + \dfrac{(20)^2}{80}}} = \frac{17}{2.62} = 6.49 \]
Slide 18: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- p-Value Approach
4. Compute the p-value. For z = 6.49, the p-value < .0001.
5. Determine whether to reject H0. Because p-value < α = .01, we reject H0.
At the .01 level of significance, the sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.
Slide 19: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Known
- Critical Value Approach
4. Determine the critical value and rejection rule. For α = .01, z.01 = 2.33. Reject H0 if z > 2.33.
5. Determine whether to reject H0. Because z = 6.49 > 2.33, we reject H0.
The sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.
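Both approaches for the Par, Inc. test can be reproduced in one short script. This is a sketch under the example's stated values; the variable names are illustrative. Without intermediate rounding the statistic comes out near 6.48, while the slides report 6.49 because the standard error is rounded to 2.62 before dividing.

```python
from math import sqrt
from statistics import NormalDist

# Par, Inc. example values
x1, x2 = 275, 258          # sample means (yards)
sigma1, sigma2 = 15, 20    # known population standard deviations
n1, n2 = 120, 80           # sample sizes
d0 = 0                     # hypothesized difference under H0
alpha = 0.01               # level of significance

std_err = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = ((x1 - x2) - d0) / std_err

# p-value approach (right-tailed test)
p_value = 1 - NormalDist().cdf(z)

# critical value approach
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```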
Slide 20: Inferences About the Difference Between Two Population Means: σ1 and σ2 Unknown
- Interval Estimation of μ1 − μ2
- Hypothesis Tests About μ1 − μ2
Slide 21: Interval Estimation of μ1 − μ2: σ1 and σ2 Unknown
When σ1 and σ2 are unknown, we will:
- use the sample standard deviations s1 and s2 as estimates of σ1 and σ2,
- replace z_{α/2} with t_{α/2}, using the t distribution rather than the standard normal distribution, and
- compute a margin of error and develop an interval estimate of the difference between the two population means.
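Slide 22: Interval Estimation of μ1 − μ2: σ1 and σ2 Unknown
- Interval Estimate
\[ \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
where the degrees of freedom for t_{α/2} are:
\[ df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{1}{n_1 - 1} \left( \dfrac{s_1^2}{n_1} \right)^2 + \dfrac{1}{n_2 - 1} \left( \dfrac{s_2^2}{n_2} \right)^2} \]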
Slide 23: Interval Estimation of μ1 − μ2: σ1 and σ2 Unknown
- In most applications of the interval estimation and hypothesis testing procedures, random samples with n1 ≥ 30 and n2 ≥ 30 are adequate.
- In cases where either or both sample sizes are less than 30, the distributions of the populations become important considerations.
- With smaller sample sizes, it is more important for the analyst to be satisfied that it is reasonable to assume that the distributions of the two populations are at least approximately normal.
Slide 24: Difference Between Two Population Means: σ1 and σ2 Unknown
- Example: Specific Motors
Specific Motors of Detroit has developed a new automobile known as the M car. 24 M cars and 28 J cars (from Japan) were road tested to compare miles-per-gallon (mpg) performance. The sample statistics are shown on the next slide.
Slide 25: Difference Between Two Population Means: σ1 and σ2 Unknown
- Example: Specific Motors
                   Sample #1      Sample #2
                   M Cars         J Cars
Sample Size        24 cars        28 cars
Sample Mean        29.8 mpg       27.3 mpg
Sample Std. Dev.   2.56 mpg       1.81 mpg
Slide 26: Difference Between Two Population Means: σ1 and σ2 Unknown
- Example: Specific Motors
Let us develop a 90% confidence interval estimate of the difference between the mpg performances of the two models of automobile.
Slide 27: Point Estimate of μ1 − μ2
Point estimate of μ1 − μ2 = x̄1 − x̄2 = 29.8 − 27.3 = 2.5 mpg
where:
μ1 = mean miles-per-gallon for the population of M cars
μ2 = mean miles-per-gallon for the population of J cars
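Slide 28: Interval Estimation of μ1 − μ2: σ1 and σ2 Unknown
The degrees of freedom for t_{α/2} are:
\[ df = \frac{\left( \dfrac{(2.56)^2}{24} + \dfrac{(1.81)^2}{28} \right)^2}{\dfrac{1}{24 - 1} \left( \dfrac{(2.56)^2}{24} \right)^2 + \dfrac{1}{28 - 1} \left( \dfrac{(1.81)^2}{28} \right)^2} = 40.59 \]
Always round non-integer degrees of freedom down to provide a larger t-value and a more conservative interval estimate, so df = 40.
With α/2 = .05 and df = 40, t_{α/2} = 1.684.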
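The degrees-of-freedom formula is easy to mistype by hand, so a short check may help. The sketch below assumes the Specific Motors sample statistics (s1 = 2.56, n1 = 24, s2 = 1.81, n2 = 28); the function name `unequal_variance_df` is hypothetical.

```python
from math import floor


def unequal_variance_df(s1, n1, s2, n2):
    """Degrees of freedom for the two-sample t procedure (sigma1, sigma2 unknown)."""
    v1 = s1**2 / n1   # estimated variance of the first sample mean
    v2 = s2**2 / n2   # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


df = unequal_variance_df(2.56, 24, 1.81, 28)
df_rounded = floor(df)   # round down for a more conservative t value
print(f"df = {df:.2f}, rounded down to {df_rounded}")
```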
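Slide 29: Interval Estimation of μ1 − μ2: σ1 and σ2 Unknown
With df = 40 and t.05 = 1.684:
\[ \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 29.8 - 27.3 \pm 1.684 \sqrt{\frac{(2.56)^2}{24} + \frac{(1.81)^2}{28}} \]
2.5 ± 1.052, or 1.448 to 3.552 mpg
We are 90% confident that the difference between the miles-per-gallon performances of M cars and J cars is 1.448 to 3.552 mpg.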
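The interval arithmetic can be sketched as follows. The critical value `t_crit=1.684` is an assumption taken from a standard t table for 40 degrees of freedom and an upper-tail area of .05; it is passed in rather than computed because the Python standard library has no t distribution. The function name is illustrative.

```python
from math import sqrt


def two_mean_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval estimate of mu1 - mu2 when sigma1 and sigma2 are unknown."""
    point = x1 - x2                               # point estimate
    std_err = sqrt(s1**2 / n1 + s2**2 / n2)       # estimated standard error
    margin = t_crit * std_err                     # margin of error
    return point, margin, (point - margin, point + margin)


# Specific Motors example; t_crit is a table lookup (df = 40, upper-tail area .05)
point, margin, (low, high) = two_mean_t_interval(
    29.8, 2.56, 24, 27.3, 1.81, 28, t_crit=1.684
)
print(f"{point:.1f} ± {margin:.3f}: {low:.3f} to {high:.3f} mpg")
```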
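Slide 30: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- Hypotheses
        Left-tailed         Right-tailed        Two-tailed
H0:     μ1 − μ2 ≥ D0        μ1 − μ2 ≤ D0        μ1 − μ2 = D0
Ha:     μ1 − μ2 < D0        μ1 − μ2 > D0        μ1 − μ2 ≠ D0
D0 is the hypothesized difference between μ1 and μ2.
- Test Statistic
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]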
Slide 31: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- Example: Specific Motors
Can we conclude, using a .05 level of significance, that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars?
Slide 32: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: μ1 − μ2 ≤ 0
Ha: μ1 − μ2 > 0
where:
μ1 = mean mpg for the population of M cars
μ2 = mean mpg for the population of J cars
Slide 33: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- p-Value and Critical Value Approaches
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(29.8 - 27.3) - 0}{\sqrt{\dfrac{(2.56)^2}{24} + \dfrac{(1.81)^2}{28}}} = 4.003 \]
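Slide 34: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- p-Value Approach
4. Compute the p-value. The degrees of freedom for t are:
\[ df = \frac{\left( \dfrac{(2.56)^2}{24} + \dfrac{(1.81)^2}{28} \right)^2}{\dfrac{1}{24 - 1} \left( \dfrac{(2.56)^2}{24} \right)^2 + \dfrac{1}{28 - 1} \left( \dfrac{(1.81)^2}{28} \right)^2} = 40.59, \text{ rounded down to } 40 \]
Because t = 4.003 > t.005 = 2.704, the p-value < .005.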
Slide 35: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- p-Value Approach
5. Determine whether to reject H0. Because p-value < α = .05, we reject H0.
We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.
Slide 36: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- Critical Value Approach
4. Determine the critical value and rejection rule. For α = .05 and df = 40, t.05 = 1.684. Reject H0 if t > 1.684.
5. Determine whether to reject H0. Because 4.003 > 1.684, we reject H0.
We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.
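The test statistic and the critical-value decision for the Specific Motors example can be sketched as below. The critical value 1.684 is an assumed table lookup (upper-tail area .05, 40 degrees of freedom); the variable names are illustrative.

```python
from math import sqrt

# Specific Motors sample statistics
x1, s1, n1 = 29.8, 2.56, 24   # M cars: mean, std. dev., sample size
x2, s2, n2 = 27.3, 1.81, 28   # J cars: mean, std. dev., sample size
d0 = 0                        # hypothesized difference under H0
t_crit = 1.684                # assumed t-table value for alpha = .05, df = 40

std_err = sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((x1 - x2) - d0) / std_err
reject = t_stat > t_crit      # right-tailed rejection rule

print(f"t = {t_stat:.3f}, reject H0: {reject}")
```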
Slide 37: Hypothesis Tests About μ1 − μ2: σ1 and σ2 Unknown
- In most applications, equal or nearly equal sample sizes such that n1 + n2 is at least 20 can be expected to provide very good results even if the populations are not normal.
- Larger sample sizes are recommended if the distributions of the populations are highly skewed or contain outliers.
- Whenever possible, equal sample sizes, n1 = n2, are recommended.
- The t procedure does not require the assumption of equal population standard deviations and can be applied whether the population standard deviations are equal or not.
Slide 38: Inferences About the Difference Between Two Population Means: Matched Samples
- With a matched-sample design, each sampled item provides a pair of data values.
- This design often leads to a smaller sampling error than the independent-sample design because variation between sampled items is eliminated as a source of sampling error.
- We assume that the two populations have the same mean. Thus, the null hypothesis is H0: μ1 − μ2 = 0.
- The key to the analysis of the matched-sample design is to realize that we consider only the column of differences in our tests.
Slide 39: Inferences About the Difference Between Two Population Means: Matched Samples
- We need to assume that the population of differences has a normal distribution if the sample size is small (20 or less), because we will use the t distribution with n − 1 degrees of freedom for hypothesis testing and interval estimation procedures.
- A matched-sample procedure for inferences about two population means generally provides better precision than the independent-sample approach.
Slide 40: Inferences About the Difference Between Two Population Means: Matched Samples
- In choosing the sampling procedure to collect data and test the hypothesis, we have two alternatives.
The first alternative is the independent-sample design: in the case of two different populations, simple random samples are selected from each population. The difference between the population means is tested using the sample means.
Slide 41: Inferences About the Difference Between Two Population Means: Matched Samples
The second alternative is the matched-sample design: in the case of two different treatments on the same population, one simple random sample is selected. Each subject receives both treatments. The order of the two treatments is assigned randomly to each subject. Each subject provides a pair of values, one for the first treatment and one for the second treatment.
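Randomly assigning the order of the two treatments to each subject can be sketched in a few lines. Everything named here (the function `assign_treatment_order`, the treatment labels "A" and "B", the subject identifiers) is hypothetical and only illustrates the design step described above.

```python
import random


def assign_treatment_order(subjects, treatments=("A", "B"), seed=None):
    """Give every subject both treatments, in an independently randomized order."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    orders = {}
    for subject in subjects:
        order = list(treatments)
        rng.shuffle(order)         # random order for this subject only
        orders[subject] = tuple(order)
    return orders


orders = assign_treatment_order(["subject-1", "subject-2", "subject-3"], seed=1)
for subject, order in orders.items():
    print(subject, "receives", order[0], "first, then", order[1])
```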
Slide 42: Inferences About the Difference Between Two Population Means: Matched Samples
- Example: Express Deliveries
A Chicago-based firm has documents that must be quickly distributed to district offices throughout the U.S. The firm must decide between two delivery services, UPX (United Parcel Express) and INTEX (International Express), to transport its documents.
Slide 43: Inferences About the Difference Between Two Population Means: Matched Samples
- Example: Express Deliveries
In testing the delivery times of the two services, the firm sent two reports to a random sample of its district offices, with one report carried by UPX and the other report carried by INTEX. Do the data on the next slide indicate a difference in mean delivery times for the two services? Use a .05 level of significance.
Slide 44: Inferences About the Difference Between Two Population Means: Matched Samples
[Table: Delivery Time (Hours). Columns: District Office, UPX, INTEX, Difference. District offices: Seattle, Los Angeles, Boston, Cleveland, New York, Houston, Atlanta, St. Louis, Milwaukee, Denver.]
Slide 45: Inferences About the Difference Between Two Population Means: Matched Samples
- p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: μd = 0
Ha: μd ≠ 0
Let μd = the mean of the difference values for the two delivery services for the population of district offices.
Slide 46: Inferences About the Difference Between Two Population Means: Matched Samples
- p-Value and Critical Value Approaches
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
\[ \bar{d} = \frac{\sum d_i}{n}, \qquad s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}, \qquad t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \]
Slide 47: Inferences About the Difference Between Two Population Means: Matched Samples
- p-Value Approach
4. Compute the p-value. For t = 2.94 and df = 9, the p-value is between .01 and .02. (This is a two-tailed test, so we double the upper-tail areas of .005 and .01.)
5. Determine whether to reject H0. Because p-value < α = .05, we reject H0.
We are at least 95% confident that there is a difference in mean delivery times for the two services.
Slide 48: Inferences About the Difference Between Two Population Means: Matched Samples
- Critical Value Approach
4. Determine the critical value and rejection rule. For α = .05 and df = 9, t.025 = 2.262. Reject H0 if t > 2.262 or t < −2.262.
5. Determine whether to reject H0. Because t = 2.94 > 2.262, we reject H0.
We are at least 95% confident that there is a difference in mean delivery times for the two services.
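The matched-sample calculation works only on the column of differences, which the sketch below makes explicit. The delivery times used here are hypothetical values invented for illustration; they are not the Express Deliveries data, and the function name `paired_t_statistic` is likewise an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second, mu_d0=0.0):
    """Return (d_bar, s_d, t, df) for matched samples of equal length."""
    if len(first) != len(second):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]   # column of differences
    n = len(diffs)
    d_bar = mean(diffs)                              # mean difference
    s_d = stdev(diffs)                               # sample std. dev. (n - 1 divisor)
    t_stat = (d_bar - mu_d0) / (s_d / sqrt(n))
    return d_bar, s_d, t_stat, n - 1


# Hypothetical delivery times in hours (NOT the data from the slides)
service_a = [10, 12, 9, 14, 11]
service_b = [8, 11, 9, 10, 9]
d_bar, s_d, t_stat, df = paired_t_statistic(service_a, service_b)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, df = {df}")
```

The resulting t value is then compared with the t-table critical value for n − 1 degrees of freedom, exactly as in steps 4 and 5 above.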
Slide 49: End of Chapter 10, Part A