Chapter 22 – Comparing Two Proportions
Difference Between Proportions
Sometimes we want to see whether there is a significant difference between two independent groups, for example: control group vs. treatment group or placebo group; men vs. women; last year vs. this year.
Assumptions and Conditions
Independence Assumption: Randomization Condition (within each group); 10% Condition (within each group).
Independent Groups Assumption: the two groups must be independent of each other. If we compare the same group before and after some treatment, the variance is affected and the formulas won’t apply.
Sample Size: Success/Failure Condition (within each group), meaning at least 10 successes and at least 10 failures in each group.
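The numeric conditions above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function names, the threshold of 10, and the counts are illustrative assumptions, and the randomization and independent-groups assumptions still have to be judged from the study design.

```python
def success_failure_ok(successes, n, threshold=10):
    """True when a group has at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


def ten_percent_ok(n, population_size):
    """True when the sample is no more than 10% of its population."""
    return n <= 0.10 * population_size


# Hypothetical counts, chosen only for illustration.
group_a = {"successes": 45, "n": 120, "population": 5000}
group_b = {"successes": 6, "n": 80, "population": 5000}

for name, g in (("A", group_a), ("B", group_b)):
    print(name,
          "success/failure:", success_failure_ok(g["successes"], g["n"]),
          "10% condition:", ten_percent_ok(g["n"], g["population"]))
```

Both conditions must hold within each group separately; here the hypothetical group B fails the Success/Failure check because it has only 6 successes.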
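Sampling Distribution
We know that for large enough samples, each of our proportions has a roughly Normal sampling distribution. So does the difference of the proportions.
Sampling Distribution Model for a Difference Between Two Independent Proportions: when the conditions are met, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with mean $\mu = p_1 - p_2$ and standard deviation $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, where $q_i = 1 - p_i$.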
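A quick simulation can make the Normal model for the difference plausible. The sketch below is not from the slides; the true proportions, sample sizes, repetition count, and seed are all hypothetical. It compares the simulated mean and spread of the differences with a mean of p1 - p2 and a standard deviation of sqrt(p1*q1/n1 + p2*q2/n2).

```python
import math
import random
import statistics


def simulate_differences(p1, n1, p2, n2, reps=2000, seed=1):
    """Draw `reps` pairs of independent samples; return phat1 - phat2 for each."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        phat1 = sum(rng.random() < p1 for _ in range(n1)) / n1
        phat2 = sum(rng.random() < p2 for _ in range(n2)) / n2
        diffs.append(phat1 - phat2)
    return diffs


# Hypothetical population proportions and sample sizes.
p1, n1, p2, n2 = 0.60, 200, 0.50, 250

diffs = simulate_differences(p1, n1, p2, n2)
model_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
sim_mean = statistics.mean(diffs)
sim_sd = statistics.stdev(diffs)

print(f"model mean {p1 - p2:.4f}, simulated mean {sim_mean:.4f}")
print(f"model SD   {model_sd:.4f}, simulated SD   {sim_sd:.4f}")
```

With a few thousand repetitions the simulated values should land close to the model values, though not exactly on them.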
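Two-Proportion Z-Interval
When the conditions are met, we can find a confidence interval for the difference of two proportions, $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$, where $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$ and $z^*$ is the critical value from the standard Normal model for the chosen confidence level.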
Example: HS graduation by gender
In October 2000 the US Department of Commerce reported the results of a large-scale survey on high school graduation. Researchers contacted more than 25,000 Americans aged 24 years to see if they had finished high school; 84.9% of the 12,460 males and 88.1% of the 12,678 females indicated they had high school diplomas.
Are the assumptions and conditions satisfied?
Create a 95% confidence interval for the difference in graduation rates between males and females.
Does this provide evidence that girls are more likely than boys to complete high school?
(Example from DeVeaux, Intro to Stats)
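The interval asked for in this example can be computed with a few lines of Python. This is a sketch under the usual two-proportion z-interval formula (difference of sample proportions plus or minus z* times the unpooled standard error); the function name `two_prop_z_interval` is an illustrative choice, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(p_hat1, n1, p_hat2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from two independent samples."""
    # Unpooled standard error of the difference in sample proportions.
    se = math.sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    # Critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_hat1 - p_hat2
    return diff - z_star * se, diff + z_star * se


# Survey figures from the example: females listed first, so a positive
# difference means a higher graduation rate among females.
low, high = two_prop_z_interval(0.881, 12678, 0.849, 12460)
print(f"95% CI for p_female - p_male: ({low:.4f}, {high:.4f})")
```

The interval comes out at roughly 2.4 to 4.0 percentage points. Because it lies entirely above zero, it is consistent with a higher graduation rate among females, provided the assumptions and conditions hold.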
Testing for a Difference between Proportions
If we want to see whether there is a statistically significant difference between $p_1$ and $p_2$, we could check whether $p_1 = p_2$. But what we usually do is check whether the difference is zero:
$H_0: p_1 - p_2 = 0$
against one of the alternatives
$H_A: p_1 - p_2 \neq 0$, $H_A: p_1 - p_2 > 0$, or $H_A: p_1 - p_2 < 0$.
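Pooling
Since we’re assuming in our null hypothesis that $p_1$ and $p_2$ are the same, we can pool the 2 groups together: $\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$.
When we don’t have the number of successes in each group, we can use: $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$ (rounded to whole numbers).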
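As a small illustration of pooling when only percentages are reported, the sketch below recovers approximate success counts from the graduation survey figures quoted earlier (84.9% of 12,460 males, 88.1% of 12,678 females) and pools them. The helper names are hypothetical, and rounding the recovered counts to whole numbers is an assumption of this sketch.

```python
def successes_from_proportion(p_hat, n):
    """Recover an approximate whole-number success count from a reported proportion."""
    return round(p_hat * n)


def pooled_proportion(successes1, n1, successes2, n2):
    """Overall success proportion when both groups are combined."""
    return (successes1 + successes2) / (n1 + n2)


males = successes_from_proportion(0.849, 12460)
females = successes_from_proportion(0.881, 12678)
p_pooled = pooled_proportion(males, 12460, females, 12678)

print(males, females, round(p_pooled, 4))
```

The pooled value always falls between the two group proportions, weighted toward the larger group.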
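Two-Proportion Z-Test
Conditions are the same as for the two-proportion confidence interval.
We are testing: $H_0: p_1 - p_2 = 0$
Then we find: $\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$
And find the standard error: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$, where $\hat{q}_{pooled} = 1 - \hat{p}_{pooled}$.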
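Two-Proportion Z-Test Continued
We find the test statistic: $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$
And then use this statistic to find our P-value from the standard Normal model.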
Example: Depression and Cardiac Disease
A study published in the Archives of General Psychiatry in March 2001 examined the impact of depression on a patient’s ability to survive cardiac disease. Researchers identified 450 people with cardiac disease, evaluated them for depression, and followed them for 4 years. Of the 361 patients with no depression, 67 died. Of the 89 patients with major or minor depression, 26 died.
Among people who suffer from cardiac disease, are depressed patients more likely to die than non-depressed ones?
(Example from DeVeaux, Intro to Stats)
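One way to carry out the test for this example is sketched below in Python. It assumes the pooled two-proportion z-test with the one-sided alternative that the death rate is higher among depressed patients; the function name `two_prop_z_test` and its `alternative` argument are illustrative choices, not a library API.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z-test of H0: p1 - p2 = 0. Returns (z, P-value)."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    # Under H0 the groups share one proportion, so pool the successes.
    p_pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / se_pooled
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # HA: p1 - p2 > 0
        p_value = 1 - cdf
    elif alternative == "less":       # HA: p1 - p2 < 0
        p_value = cdf
    elif alternative == "two-sided":  # HA: p1 - p2 != 0
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Group 1: depressed patients (26 of 89 died).
# Group 2: non-depressed patients (67 of 361 died).
z, p_value = two_prop_z_test(26, 89, 67, 361, alternative="greater")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
```

The statistic is about z = 2.22 with a one-sided P-value near 0.013, which would usually be read as evidence that depressed cardiac patients are more likely to die, assuming the conditions are satisfied.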