Chapter 10 Inferences from Two Samples

Inferences about Two Means: Independent and Large Samples
Inferences about Two Means: Independent and Small Samples
Inferences about Two Means: Matched Pairs
Inferences about Two Proportions

Overview
There are many important and meaningful situations in which it becomes necessary to compare two sets of sample data. (page 438 of text; see the examples in the discussion there)

10-2 Inferences about Two Means: Independent and Large Samples

Two Samples: Independent
Definitions
Two samples are independent if the sample values selected from one population are not related to, or somehow paired with, the sample values selected from the other population.
If the values in one sample are related to the values in the other sample, the samples are dependent. Such samples are often referred to as matched pairs or paired samples. The text uses the wording 'matched pairs'. (Example at bottom of page)

Assumptions
1. The two samples are independent.
2. The two sample sizes are large. That is, n1 > 30 and n2 > 30.
3. Both samples are simple random samples.
(page 439 of text)

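Characteristics of the Sampling Distribution of $\bar{x}_A - \bar{x}_B$
The mean of the sampling distribution of all possible values of $\bar{x}_A - \bar{x}_B$ is $\mu_A - \mu_B$. The standard deviation of the sampling distribution of all possible values of $\bar{x}_A - \bar{x}_B$ is $\sqrt{\sigma_A^2/n_A + \sigma_B^2/n_B}$.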

Hypothesis Tests
Test Statistic for Two Means: Independent and Large Samples

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

σ1 and σ2: If σ1 and σ2 are not known, use s1 and s2 in their places, provided that both samples are large.
P-value: Use the computed value of the test statistic z to find the P-value.
Critical values: Based on the significance level α, find the critical values.
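The large-sample test statistic and its two-tailed P-value can be sketched in Python. This is a minimal illustration, not the text's own procedure: the function names and the summary statistics below are hypothetical, and s1 and s2 stand in for σ1 and σ2 as the slide allows for large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(x1, s1, n1, x2, s2, n2, diff0=0.0):
    """z statistic for H0: mu1 - mu2 = diff0 with large independent samples."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((x1 - x2) - diff0) / standard_error


def two_tailed_p_value(z):
    """Area in both tails beyond |z| under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical summary statistics (not taken from the slides).
z = two_sample_z(x1=50.0, s1=4.0, n1=64, x2=48.0, s2=3.0, n2=36)
p = two_tailed_p_value(z)
print(f"z = {z:.3f}, P-value = {p:.4f}")
```

With either approach the decision is the same: a P-value below α corresponds to a test statistic beyond the critical values.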

Coke Versus Pepsi
Sample statistics (n, $\bar{x}$, and s) are shown for regular Coke and regular Pepsi; both samples have n = 36. Use the 0.01 significance level to test the claim that the mean weight of regular Coke is different from the mean weight of regular Pepsi. (Example on page 440 of text)


Claim: µ1 ≠ µ2
H0: µ1 = µ2
H1: µ1 ≠ µ2
α = 0.01
Critical values: z = -2.575 and z = 2.575. Reject H0 if z < -2.575 or z > 2.575; otherwise fail to reject H0. The center of the distribution corresponds to µ1 - µ2 = 0, or z = 0.
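The critical values for a two-tailed test at α = 0.01 can be checked with Python's standard library. This is a small sketch; the helper names are hypothetical, and the table value 2.575 is a rounded form of the number computed here.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """Positive critical value that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_h0(z, alpha):
    """True when z falls in either rejection region."""
    return abs(z) > two_tailed_critical_z(alpha)


z_crit = two_tailed_critical_z(0.01)
print(f"critical values: -{z_crit:.3f} and {z_crit:.3f}")
```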

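Test Statistic for Two Means: Independent and Large Samples

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

With µ1 - µ2 = 0 under H0, n1 = n2 = 36, and s1 and s2 used in place of σ1 and σ2:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{36} + \dfrac{s_2^2}{36}}}$$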

The test statistic z from the sample data is then located relative to the critical values z = -2.575 and z = 2.575.

Reject H0: the test statistic from the sample data falls in the rejection region. There is significant evidence to support the claim that there is a difference between the mean weight of Coke and the mean weight of Pepsi. Further explanation of the interpretation is given in the text. The magnitude of the difference in the weights is not anything that consumers would notice. Also, this test simply indicates that the Coke ingredients weigh less, which does not indicate that there is less volume of the product.

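Confidence Intervals
(The discussion begins at the bottom of page 441 of the text; the formulas are on page 442.)

$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$

where

$$E = z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Find an 80% confidence interval for the difference between the mean weights of Coke and Pepsi. For Coke versus Pepsi, z = 1.28 and the square-root term is .00157, so E = 1.28(.00157) ≈ .00201. The value of $\bar{x}_1 - \bar{x}_2$ is not preserved in this transcript, and only one endpoint of the resulting interval, .00527, survives.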

t-Distribution Model

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The degrees of freedom are n1 + n2 - 2. A pooled variance is the weighted mean of the sample variances and is used when the two populations are assumed to have equal variances.
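A short Python sketch may make the pooled variance concrete. It assumes the usual weighting by degrees of freedom, (n1 - 1) and (n2 - 1), which the slide does not spell out; the function names and the numbers are hypothetical.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Weighted mean of the sample variances, weighted by degrees of freedom."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def t_separate(x1, s1, n1, x2, s2, n2, diff0=0.0):
    """t statistic using each sample's own variance."""
    return ((x1 - x2) - diff0) / sqrt(s1**2 / n1 + s2**2 / n2)


def t_pooled(x1, s1, n1, x2, s2, n2, diff0=0.0):
    """t statistic using the pooled variance in place of both sample variances."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    return ((x1 - x2) - diff0) / sqrt(sp2 / n1 + sp2 / n2)


# Hypothetical summary statistics (not taken from the slides).
sp2 = pooled_variance(s1=2, n1=10, s2=4, n2=10)
t_a = t_separate(12.0, 2, 10, 10.0, 4, 10)
t_b = t_pooled(12.0, 2, 10, 10.0, 4, 10)
print(f"pooled variance = {sp2}, t (separate) = {t_a:.4f}, t (pooled) = {t_b:.4f}")
```

When the two sample sizes are equal, as in this example, the two statistics coincide; they differ once n1 and n2 differ.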

Two groups were tested to see whether calcium reduces blood pressure. The following data were collected (raw values for Group 1, calcium, and Group 2, placebo, with a summary table of n, $\bar{x}$, and s for each treatment). Is there evidence at the .1 level that calcium reduces blood pressure?

H0: µ1 - µ2 ≥ 0
HA: µ1 - µ2 < 0
3. t critical = -1.328 (one-tailed t test, n < 30, n1 + n2 - 2 = 19 d.f.); the test statistic is t = 1.604.
5. There is not enough evidence at the .1 level that calcium reduces blood pressure.
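The decision step for this left-tailed test can be written out explicitly. The sketch below reads the two surviving numbers, -1.328 and 1.604, as the critical value and the test statistic respectively; that reading is an assumption, and the function name is hypothetical.

```python
def left_tailed_decision(t_stat, t_crit):
    """For HA: mu1 - mu2 < 0, reject H0 only when t falls below the critical value."""
    return "reject H0" if t_stat < t_crit else "fail to reject H0"


# Values as they appear on the slide (assumed roles: critical value, test statistic).
decision = left_tailed_decision(t_stat=1.604, t_crit=-1.328)
print(decision)
```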

Inferences about Two Proportions
Assumptions
1. We have proportions from two independent simple random samples.
2. For both samples, the conditions np ≥ 5 and nq ≥ 5 are satisfied.
(page 458 of text)

Notation for Two Proportions
For population 1, we let:
π1 = population proportion
n1 = size of the sample
x1 = number of successes in the sample
p1 = x1/n1 (the sample proportion)
q1 = 1 - p1
The corresponding meanings are attached to π2, n2, x2, p2, and q2, which come from population 2.
(page 459 of text) The value of x1 is sometimes given, but sometimes it must be calculated from the information in the problem; see the example below the definition box on that page. Computed values of x1 should be whole numbers, with rounding possibly necessary.

Test Statistic for Two Proportions
For
H0: π1 = π2, H0: π1 ≥ π2, H0: π1 ≤ π2
HA: π1 ≠ π2, HA: π1 < π2, HA: π1 > π2
the test statistic is computed with π1 - π2 = 0 (assumed in the null hypothesis), p1 = x1/n1, and p2 = x2/n2. (Example given at the bottom of the page)
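The formula itself does not survive in this transcript. A common form of the two-proportion z statistic pools the two samples under H0, and the sketch below assumes that form; the pooled proportion `p_bar`, the function names, and the counts are assumptions introduced here rather than taken from the slides.

```python
from math import sqrt


def conditions_met(x, n):
    """Check the np >= 5 and nq >= 5 conditions using the observed counts."""
    return x >= 5 and (n - x) >= 5


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: pi1 - pi2 = 0 using a pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled estimate of the common proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p1 - p2) / standard_error


# Hypothetical counts (not taken from the slides).
z = two_proportion_z(x1=60, n1=200, x2=40, n2=200)
print(f"z = {z:.4f}")
```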

Confidence Interval Estimate of π1 - π2
(p1 - p2) - E < (π1 - π2) < (p1 - p2) + E
If 0 is not in the interval, one may be C% confident that the two population proportions are different, where C% is the confidence level.

A sample of households in urban and rural homes displayed the following data for preference of artificial or natural Christmas trees (a table of n, X, and p = X/n for population 1, urban, and population 2, rural). Using a 90% confidence interval, is there a difference in preference between urban and rural homes?

(-.139, .021)
We are 90% confident that the difference in proportions is between -.14 and .02. Because the interval contains 0, we cannot conclude that either group has a stronger preference for natural trees than the other group.
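An interval of this kind can be sketched in Python. The slides do not define E for proportions, so the unpooled standard error used below is an assumption, as are the function name and the counts; the example is not the Christmas tree data, which this transcript does not preserve.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    e = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - e, (p1 - p2) + e


# Hypothetical counts (not the slide's data).
low, high = two_proportion_interval(x1=60, n1=200, x2=70, n2=200, confidence=0.90)
contains_zero = low < 0 < high
print(f"({low:.3f}, {high:.3f}), contains 0: {contains_zero}")
```

As in the slide's example, an interval that contains 0 does not support a difference between the two population proportions.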

Inferences about Two Means: Matched Pairs
Assumptions
1. The sample data consist of matched pairs.
2. The samples are simple random samples.
3. If the number of pairs of sample data is small (n ≤ 30), then the population of differences in the paired values must be approximately normally distributed.
(page 449 of text)

Notation for Matched Pairs
µd = mean value of the differences d for the population of paired data
$\bar{d}$ = mean value of the differences d for the paired sample data (equal to the mean of the x - y values)
$s_d$ = standard deviation of the differences d for the paired sample data
n = number of pairs of data
(page 450 of text) Table 8-1 on page 449 shows that it is the differences between the measured pairs of data that are investigated. Looking at the individual sample means would waste important information about the paired data.

Test Statistic for Matched Pairs of Sample Data

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

where degrees of freedom = n - 1 (page 450 of text)
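The matched-pairs statistic works on the differences alone, which a short Python sketch can show. The function name and the paired values below are hypothetical and are not the data from Table 8-1.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y, mu_d=0.0):
    """Return (t, degrees of freedom) for matched pairs, with d = x - y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)              # sample standard deviation of the differences
    t = (d_bar - mu_d) / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical paired measurements.
first = [70, 68, 72, 69, 71]
second = [69, 68, 70, 68, 70]
t, df = paired_t(first, second)
print(f"t = {t:.4f} with {df} degrees of freedom")
```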

Critical Values
If n ≤ 30, critical values are found in Table A-3 (t-distribution).
If n > 30, critical values are found in Table A-2 (normal distribution).
(page 451 of text; a hypothesis test example is given on this page)

Confidence Intervals (page 452 of text)

$$\bar{d} - ME < \mu_d < \bar{d} + ME$$

where $ME = t_{\alpha/2}\,\dfrac{s_d}{\sqrt{n}}$ and degrees of freedom = n - 1

How Much Do Male Statistics Students Exaggerate Their Heights?
Using the sample data from Table 8-1 with the outlier excluded, construct a 95% confidence interval estimate of µd, which is the mean of the differences between reported heights and measured heights of male statistics students. (Example on page 453 of text)

Reported and Measured Heights (in inches) of Male Statistics Students
The table (page 449 of text) lists the reported height, measured height, and difference for students A through L. A discussion of outlier possibilities accompanies the table. The outlier data pair should be discarded before developing the confidence interval.

t = 2.201 (found from Table A-3 with 11 degrees of freedom and 0.05 in two tails)

$$E = t_{\alpha/2}\,\frac{s_d}{\sqrt{n}} = (2.201)\left(\frac{2.244}{\sqrt{12}}\right) = 1.426$$
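The arithmetic for this margin of error is easy to check in Python. The sketch uses only the three values on the slide (t = 2.201, sd = 2.244, n = 12); the function names are hypothetical, and the sample mean difference is left as a parameter because it is not preserved in this transcript.

```python
from math import sqrt


def paired_margin_of_error(t_crit, s_d, n):
    """E = t * s_d / sqrt(n) for matched pairs."""
    return t_crit * s_d / sqrt(n)


def interval_contains_zero(d_bar, e):
    """True when 0 lies inside (d_bar - E, d_bar + E)."""
    return d_bar - e < 0 < d_bar + e


E = paired_margin_of_error(t_crit=2.201, s_d=2.244, n=12)
print(f"E = {E:.3f}")
```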

In the long run, 95% of such samples will lead to confidence intervals that actually do contain the true population mean of the differences. Since the interval does contain 0, the true value of µd is not significantly different from 0. There is not sufficient evidence to support the claim that there is a difference between the reported heights and the measured heights of male statistics students.
Some students will need the following explanation: If the measurements are not different, then the differences between the two measurements should have an average (mean) value of 0. A 95% confidence interval for the mean difference that did not contain 0 would indicate that the differences are significantly different from 0, and it would then appear that male statistics students exaggerate their heights. Here the interval does contain 0, so that conclusion is not supported.
