20. Comparing two proportions The Practice of Statistics in the Life Sciences Third Edition © 2014 W.H. Freeman and Company
Objectives (PSLS Chapter 20)
Comparing two proportions
  Comparing 2 independent samples
  Confidence interval for 2 proportions
    Large-sample method
    Plus four method
  Test of statistical significance
  Treatment and risk reduction
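Comparing 2 independent samples
We often need to compare 2 treatments with 2 independent samples. For large enough samples, the sampling distribution of p̂1 – p̂2 is approximately Normal, with mean p1 – p2 and standard deviation
$$ \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} $$
However, neither p1 nor p2 is known, so they must be estimated from the samples.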
Large-sample CI for 2 proportions
For 2 independent SRSs of sizes n1 and n2 with sample proportions of successes p̂1 and p̂2 respectively, an approximate level C confidence interval for p1 – p2 is
$$ (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} $$
where C is the area under the standard Normal curve between -z* and z*. Use this method when the number of successes and the number of failures are each at least 10 in each sample.
Cholesterol and heart attacks
How much does the cholesterol-lowering drug Gemfibrozil help reduce the risk of heart attack? We compare the incidence of heart attack over a 5-year period for 2 random samples of middle-aged men taking either the drug or a placebo.

           H. attack      n       p̂
Drug           56       2051    2.73%
Placebo        84       2030    4.14%

Standard error of the difference p̂1 – p̂2 (placebo minus drug):
$$ SE = \sqrt{\frac{0.0414(1-0.0414)}{2030} + \frac{0.0273(1-0.0273)}{2051}} \approx 0.0057 $$
So the 90% CI is (0.0414 – 0.0273) ± 1.645 × 0.0057 = 0.014 ± 0.009. We are 90% confident that the percent of middle-aged men who suffer a heart attack is 0.5 to 2.3 percentage points lower when taking the cholesterol-lowering drug than when taking a placebo.
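The arithmetic on this slide can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `large_sample_ci` and its parameters are illustrative assumptions, not part of the textbook.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(x1, n1, x2, n2, level=0.90):
    """Large-sample CI for p1 - p2 (hypothetical helper for illustration).

    x1, x2 are counts of successes; n1, n2 are sample sizes.
    Returns (lower, upper, standard_error).
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Standard error of the difference between the two sample proportions
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves area `level` between -z* and z* under the standard Normal curve
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se, se


# Placebo is sample 1 and drug is sample 2, matching the slide's subtraction order
low, high, se = large_sample_ci(84, 2030, 56, 2051, level=0.90)
print(f"SE = {se:.4f}, 90% CI = ({low:.4f}, {high:.4f})")
```

The interval should agree with the slide's 0.014 ± 0.009 up to rounding, since the script uses unrounded proportions and z* rather than 1.645.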
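“Plus four” CI for 2 proportions
The “plus 4” method again produces more accurate confidence intervals. We act as if we had 4 additional observations: 1 success and 1 failure in each of the 2 samples. The new combined sample size is n1 + n2 + 4 and, writing X1 and X2 for the observed counts of successes, the proportions of successes are:
$$ \tilde{p}_1 = \frac{X_1 + 1}{n_1 + 2}, \qquad \tilde{p}_2 = \frac{X_2 + 1}{n_2 + 2} $$
An approximate level C confidence interval is:
$$ (\tilde{p}_1 - \tilde{p}_2) \pm z^* \sqrt{\frac{\tilde{p}_1(1-\tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1-\tilde{p}_2)}{n_2 + 2}} $$
Use this when C is at least 90% and both sample sizes are at least 5.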
Researchers compared oral health in 46 young adult males wearing a tongue piercing (TP) and a control group of 46 young adult males without tongue piercing. They found that 38 individuals in the TP group and 26 in the control group had enamel cracks. We want to estimate with 95% confidence the difference between the proportions of individuals with enamel cracks among young adult males with and without TP.
The lowest count is 8 (the TP individuals without enamel cracks), which is too low for the large-sample method, so we use the plus-four method:
$$ \tilde{p}_1 = \frac{38 + 1}{46 + 2} = 0.8125, \qquad \tilde{p}_2 = \frac{26 + 1}{46 + 2} = 0.5625 $$
$$ (0.8125 - 0.5625) \pm 1.96 \sqrt{\frac{0.8125(0.1875)}{48} + \frac{0.5625(0.4375)}{48}} \approx 0.25 \pm 0.18 $$
We are 95% confident that the percent of young adult males with a tongue piercing who have enamel cracks is about 7 to 43 percentage points greater than the percent with enamel cracks among young adult males without a tongue piercing.
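The plus-four interval for the tongue piercing data can be reproduced as follows. This is a sketch under the plus-four definition (1 success and 1 failure added to each sample); the helper name `plus_four_ci` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_ci(x1, n1, x2, n2, level=0.95):
    """Plus-four CI for p1 - p2 (hypothetical helper for illustration)."""
    # Add 1 success and 1 failure to each sample
    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    se = sqrt(p1_tilde * (1 - p1_tilde) / (n1 + 2)
              + p2_tilde * (1 - p2_tilde) / (n2 + 2))
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1_tilde - p2_tilde
    return diff - z_star * se, diff + z_star * se


# 38 of 46 with enamel cracks in the TP group, 26 of 46 in the control group
tp_low, tp_high = plus_four_ci(38, 46, 26, 46, level=0.95)
print(f"95% plus-four CI: ({tp_low:.3f}, {tp_high:.3f})")
```

The endpoints should round to about 0.07 and 0.43, the 7 to 43 percentage points quoted on the slide.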
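Test of significance
We test H0: p1 = p2 = p. If H0 is true, we are sampling twice from the same population and we can pool the information from both samples to estimate p:
$$ \hat{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{\text{total successes}}{\text{total observations}} $$
Because p1 – p2 = 0 under H0, the test statistic is
$$ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$
Appropriate when all counts (successes and failures in each sample) are 5 or more.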
Gastric Freezing
Gastric freezing was once a treatment for ulcers. Patients would swallow a deflated balloon with tubes to cool the stomach for an hour in the hope of reducing acid production and relieving ulcer pain. The treatment was shown to be safe and to significantly reduce ulcer pain, and it was widely used for years. A randomized comparative experiment later compared the outcome of gastric freezing with that of a placebo: 28 of the 82 patients subjected to gastric freezing improved, while 30 of the 78 in the control group improved.
H0: pgf = pplacebo
Ha: pgf > pplacebo
Results:
28 of the 82 patients subjected to gastric freezing improved (34.1%)
30 of the 78 patients in the control group improved (38.5%)
H0: pgf = pplacebo
Ha: pgf > pplacebo
The sample proportion is lower in the gastric freezing group than in the placebo group, so z is negative (z ≈ -0.57) and the one-sided P-value is greater than 50% (P ≈ 0.71). Gastric freezing was not significantly better than a placebo (P-value > 0.1), and this treatment was abandoned.
ALWAYS USE A CONTROL!!!
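The pooled two-proportion z test for the gastric freezing data can be sketched in a few lines. The function name `pooled_z_test` is a hypothetical helper, and the one-sided P-value is computed for the alternative that the first proportion is greater than the second.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z_test(x1, n1, x2, n2):
    """One-sided pooled z test of H0: p1 = p2 vs Ha: p1 > p2 (illustrative)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0 both samples share one p, so pool successes and observations
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Upper-tail area, because Ha says p1 is greater than p2
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Sample 1: gastric freezing (28 of 82); sample 2: placebo (30 of 78)
z, p_value = pooled_z_test(28, 82, 30, 78)
print(f"z = {z:.2f}, one-sided P-value = {p_value:.2f}")
```

A negative z with an upper-tail alternative always gives a P-value above 0.5, which is why the slide can say the P-value exceeds 50% without a table lookup.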
Treatment and risk reduction
In the health sciences, we often compare a given health risk in the treatment group with the same risk in the control group. One measure of this is the Relative Risk Reduction (RRR), which indicates how much better off you would be relative to receiving a placebo or control treatment:
$$ RRR = \frac{\hat{p}_{control} - \hat{p}_{treatment}}{\hat{p}_{control}} $$
Cholesterol and heart attacks
How much does the cholesterol-lowering drug Gemfibrozil help reduce the risk of heart attack? We compare the incidence of heart attack over a 5-year period for two random samples of middle-aged men taking either the drug or a placebo.

           H. attack   No H.A.      n       p̂
Drug           56        1995     2051    2.73%
Placebo        84        1946     2030    4.14%

$$ RRR = \frac{0.0414 - 0.0273}{0.0414} \approx 0.34 $$
The drug Gemfibrozil reduces the risk of a heart attack in middle-aged men by about 34% over a 5-year period of continuous treatment, compared with middle-aged men taking a placebo (RRR = 34%). That is, the risk of a heart attack over that period is 34% smaller in the Gemfibrozil group than in the placebo group.
The Absolute Risk Reduction (ARR) is simply the absolute difference in outcome rates between the control and treatment groups:
$$ ARR = \hat{p}_{control} - \hat{p}_{treatment} $$
The Number Needed to Treat (NNT) is the number of patients that would need to be treated to prevent one additional negative outcome:
$$ NNT = \frac{1}{ARR} $$
ARR and NNT are better indicators of treatment efficacy than RRR.
           H. attack   No H.A.      n       p̂
Drug           56        1995     2051    2.73%
Placebo        84        1946     2030    4.14%

The group taking Gemfibrozil had a rate of heart attack 1.4 percentage points lower than that of the placebo group (ARR = 4.14% – 2.73% ≈ 1.4%). Pharmaceutical companies typically report the “relative risk reduction,” which makes the treatment effect appear much more impressive. There is growing recognition that NNT is the most intuitive summary of the three. The website www.thennet.com offers lots of examples. On average, we need to treat 71 men for 5 years with Gemfibrozil to avoid 1 heart attack (NNT = 1 / 0.0141 ≈ 71).
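The three risk measures can be computed together from the counts in the table. This is a minimal sketch; the function name `risk_measures` and its argument names are assumptions made for illustration.

```python
def risk_measures(events_control, n_control, events_treated, n_treated):
    """Return (RRR, ARR, NNT) from event counts (hypothetical helper)."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    # Absolute risk reduction: difference in outcome rates
    arr = risk_control - risk_treated
    # Relative risk reduction: ARR as a fraction of the control risk
    rrr = arr / risk_control
    # Number needed to treat; undefined (division by zero) when ARR is 0
    nnt = 1 / arr
    return rrr, arr, nnt


# Placebo: 84 heart attacks in 2030 men; Gemfibrozil: 56 in 2051 men
rrr, arr, nnt = risk_measures(84, 2030, 56, 2051)
print(f"RRR = {rrr:.0%}, ARR = {arr:.1%}, NNT = {nnt:.0f}")
```

Seeing the three numbers side by side makes the contrast concrete: the same data give a 34% relative reduction but only a 1.4 percentage point absolute reduction.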