Download presentation
Presentation is loading. Please wait.
1
AP Statistics Comparing Two Proportions
Chapter 22 AP Statistics Comparing Two Proportions
2
Objectives Sampling distribution of the difference between two proportions 2-proportion z-interval Pooling 2-proportion z-test
3
Comparing Two Proportions
Comparisons between two percentages(two population proportions) are much more common than questions about isolated percentages. And they are more interesting. We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year’s results are better than last year’s.
4
Another Ruler In order to examine the difference between two proportions, we need another ruler—the standard deviation of the sampling distribution model for the difference between two proportions. Recall that standard deviations don’t add, but variances do. In fact, the variance of the sum or difference of two independent random quantities is the sum of their individual variances.
5
Assumptions and Conditions
Independence Assumptions: Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment. The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population. Independent Groups Assumption: The two groups we’re comparing must be independent of each other.
6
Assumptions and Conditions
Sample Size Assumption: Each of the groups must be big enough… Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
7
The Standard Deviation of the Difference Between Two Proportions
Proportions observed in independent random samples are independent. Thus, we can add their variances. So… The standard deviation of the difference between two sample proportions is Thus, the standard error is
8
The Sampling Distribution
We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution. The same is true of their difference.
9
The Sampling Distribution
Provided that the sampled values are independent, the samples are independent, and the samples sizes are large enough, the sampling distribution of is modeled by a Normal model with Mean: Standard deviation:
10
The Sampling Distribution
The approximately normal sampling distribution for ( 1 – 2):
11
Two-Proportion z-Interval
When the conditions are met, we are ready to find the confidence interval for the difference of two proportions: The confidence interval is where The critical value z* depends on the particular confidence level, C, that you specify.
12
Procedure: 2-Proportion z-Interval
P A N I C Parameters Assumptions Name Interval Conclusion in context
13
Example: 2-Proportion z-Interval
A recent study of 1000 randomly chosen residents in each of two randomly selected states indicated that the percent of people living in those states who were born in foreign countries was 6.5% for State A and 1.7% for State B. Find a 99% confidence interval for the difference between the proportions of foreign-born residents for these two states.
14
Solution – P A N I C Parameter Assumptions
p1 – p2: the difference between the foreign born people in state A and state B. Assumptions Independence Group Condition: Groups from different randomly selected states should be independent. Randomization Condition: Each sample was drawn randomly from its respective state. 10% Condition: Reasonable that the population of both states is greater than 10,000. Success/Failure Condition:
15
Solution Name Two proportion z-interval Interval
16
Solution Conclusion in context
I am 99% confident that the percent difference of foreign-born residents for states A and B is between 2.5% and 7.1%.
17
Your Turn Suppose the AGA would like to compare the proportion of homes heated by gas in the South with the corresponding proportion in the North. AGA selected a random sample of 60 homes located in the South and found that 34 of the homes use gas for heating fuel. AGA also randomly sampled 80 homes in the North and found 42 used gas for heating. Construct a 90% confidence interval for the difference between the proportions of Southern homes and Northern homes which are heated by gas.
18
TI-84: 2-Proportion z-Interval
Press STAT key, choose TESTS, and then choose B: 2-PropZInterval… Adjust the settings; x1: the number selected sample 1 n1: sample 1 sample size x2: the number selected sample 2 n2: sample 2 sample size C-level: confidence level Select “Calculate”
19
Example – Using the TI-84 How much does the cholesterol-lowering drug Gemfibrozil help reduce the risk of heart attack? We compare the incidence of heart attack over a 5-year period for two random samples of middle-aged men taking either a placebo or the drug. Find a 90% Confidence Interval for the decrease in the percentage of middle-aged men having heart attacks on the cholesterol-lowering drug. H. attack n Drug 56 2051 2.73% Placebo 84 2030 4.14%
20
Solution 90% Confidence Interval Conclusion (.0047, .02345)
We are 90% confident that the percent difference of middle-aged men who suffer a heart attack is 0.47% to 2.3% lower when taking the cholesterol-lowering drug.
21
Two Proportion z-Test
22
Everyone into the Pool The typical hypothesis test for the difference in two proportions is the one of no difference. In symbols, H0: p1 – p2 = 0. Since we are hypothesizing that there is no difference between the two proportions, that means that the standard deviations for each proportion are the same. Since this is the case, we combine (pool) the counts to get one overall proportion.
23
Everyone into the Pool The pooled proportion is where and
If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)
24
Everyone into the Pool We then put this pooled value into the Standard Error formula, substituting it for both sample proportions in the formula:
25
Compared to What? We’ll reject our null hypothesis if we see a large enough difference in the two proportions. How can we decide whether the difference we see is large? Just compare it with its standard deviation. Unlike previous hypothesis testing situations, the null hypothesis doesn’t provide a standard deviation, so we’ll use a standard error (here, pooled).
26
Two-Proportion z-Test
The conditions for the two-proportion z-test are the same as for the two-proportion z-interval. We are testing the hypothesis H0: p1 – p2 = 0, or, equivalently, H0: p1 = p2. The alternative hypothesis is either; Ha: p1≠p2, Ha: p1<p2, or Ha: p1>p2. Because we hypothesize that the proportions are equal, we pool them to find
27
Two-Proportion z-Test
We use the pooled value to estimate the standard error: Now we find the test statistic: When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.
28
Procedure: 2-Proportion z-Test
P H A N T O M S Parameter (p1 – p2) Hypothesis Assumptions Name the test Test statistic Obtain p-value Make decision State conclusion in context
29
Example: 2-Proportion z-Test
A researcher wants to know whether there is a difference in AP-Statistics exam failure rates between rural and suburban students. She randomly selects 107 rural students and 143 suburban students who took the exam. Thirty rural students failed to pass their exam, while 45 suburban students failed the exam. Is there a significant difference in failure rates for these groups?
30
Solution – P H A N T O M S Parameter Hypothesis
pr – ps: the difference in the proportion of rural and suburban students who fail the AP Statistics exam. Hypothesis Null hypothesis – H0: pr=pS (There is no difference in failure rates.) Alternative hypothesis – Ha: pr≠pS (There is a difference in failure rates.)
31
Solution Assumptions Independent Groups Condition: Rural and Suburban student groups are independent. Randomization Condition: Samples were randomly drawn. 10% Condition: It is reasonable to believe that the population of rural exam takers is greater than 1070 and the population of suburban exam takers is greater than Success/Failures Condition:success1=30>10, success2=45>10, failure1=( )=77>10, failure2=(143-45)=98>10
32
Solution Name the test Two proportion z-test Test Statistic
33
Solution Obtain p-value Make decision Conclusion
z = -.59 and a two-tailed test P-value = 2P(z > .59) = 2(.2776) P-value = .5552 Make decision p-value > .05, therefore fail to reject the null hypothesis. Conclusion The p-value is too high, thus we fail to reject the null hypothesis. There is insufficient evidence to show a difference between the failure rates of rural and suburban students on the AP Statistics exam.
34
Your Turn: 2-Proportion z-Test
In recent years there has been a trend toward both parents working outside the home. Do working mothers experience the same burdens and family pressures as their spouses? A popular belief is that the proportion of working mothers who feel they have enough spare time for themselves is significantly less than the corresponding proportion of working fathers. In order to test this claim, independent random samples of 100 working mothers and 100 working fathers were selected and their views on spare time for themselves were recorded. A summary of the data is given in the table below. Is the belief that the proportion of working mothers who feel they have enough spare time for themselves less than the corresponding proportion of working fathers supported at significance level α=.01 by the sample information? Working Mothers Working Fathers Number sampled 100 Number who feel they have enough spare time 37 56
35
TI-84: 2-Proportion z-Test
Press STAT key, choose TESTS, and then choose 6: 2-PropZTest… Adjust the settings; x1: the number selected sample 1 n1: sample 1 sample size x2: the number selected sample 2 n2: sample 2 sample size p1: select type of test (≠p2, <p2, >p2) Select “Calculate”
36
Example Using TI-84 To market a new white wine, a winemaker decides to use two different advertising agencies, one operating in the east, one in the west. After the white wine has been on the market for 8 months, independent random samples of wine drinkers are taken from each of the two regions and questioned concerning their white wine preference. The numbers favoring the new brand are shown in the table. Is there evidence that the proportion of wine drinkers in the west who prefer the new wine is greater than the corresponding proportion of wine drinkers in the east. Test at the .02 significance level. East West Number sampled 516 438 Number who prefer the new white wine 18 23
37
Solution H0:pW=pE Ha:pW>pE 2-PropZTest
x1: 23 n1: 438 x2: 18 n2: 516 p1: >p2 z = and P-value = .0905 Conclusion The P-value is greater than α =.02, thus we fail to reject the null hypothesis. There is insufficient evidence at the .02 significance level to conclude that the proportion of wine drinkers in the west who prefer the new white wine is greater than the corresponding proportion of wine drinkers in the east.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.