June 26, 2008, Stat Lecture 15

Two-Sample Inference for Proportions
Statistics Lecture 15

Administrative Notes
HW 5 due on July 1 (next Wednesday)
Exam is from 10:40-12:10 on July 2 (next Thursday)
  Will focus on material covered after the midterm
  You should expect a question or two on topics covered before the midterm

Count Data and Proportions
Last class, we re-introduced count data:
  $X_i = 1$ with probability $p$ and $X_i = 0$ with probability $1 - p$
Example: Pennsylvania Primary
  $X_i = 1$ if you favor Obama, $X_i = 0$ if not
  What is the proportion $p$ of Obama supporters at Penn?
We derived confidence intervals and hypothesis tests for a single population proportion $p$

Two-Sample Inference for Proportions
Today, we will look at comparing the proportions between two samples from distinct populations
Two tools for inference:
  Hypothesis test for a significant difference between $p_1$ and $p_2$
  Confidence interval for the difference $p_1 - p_2$
Population 1: $p_1$   Sample 1: $\hat{p}_1$
Population 2: $p_2$   Sample 2: $\hat{p}_2$

Example: Vitamin C study
Study done by Linus Pauling in 1971
Does vitamin C reduce the incidence of the common cold?
279 people were randomly given vitamin C or a placebo
Is there a significant difference in the proportion of colds between the vitamin C and placebo groups?

Group       Colds   Total
Vitamin C   17      139
Placebo     31      140

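Hypothesis Test for Two Proportions
For two different samples, we want to test whether or not the two proportions are different:
  $H_0: p_1 = p_2$ versus $H_a: p_1 \neq p_2$
The test statistic for testing the difference between two proportions is:
  $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{SE_p}$
$SE_p$ is called the pooled standard error and has the following formula:
  $SE_p = \sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
$\hat{p}$ is called the pooled sample proportion:
  $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$
Here $Y_1, Y_2$ are the counts of successes and $n_1, n_2$ are the sample sizes in the two groups, so $\hat{p}_1 = Y_1/n_1$ and $\hat{p}_2 = Y_2/n_2$
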
Example: Vitamin C study

Vitamin C group   $Y_1 = 17$   $n_1 = 139$
Placebo group     $Y_2 = 31$   $n_2 = 140$

We need the following three sample proportions:
  $\hat{p}_1 = 17/139 = .12$
  $\hat{p}_2 = 31/140 = .22$
  $\hat{p} = 48/279 = .17$
Next, we calculate the pooled standard error:
  $SE_p = \sqrt{.17 \times .83 \times (1/139 + 1/140)} = .045$
Finally, we calculate our test statistic:
  $Z = (.12 - .22)/.045 = -2.22$

Hypothesis Test for Two Proportions
We use the standard normal distribution to calculate a p-value for our test statistic
Since we used a two-sided alternative, our p-value is
  $2 \times P(Z < -2.22) = 2 \times 0.0132 = 0.0264$
At the $\alpha = 0.05$ level, we reject the null hypothesis
Conclusion: the proportion of colds is significantly different between the Vitamin C and placebo groups
[Figure: standard normal curve with the lower-tail area below Z = -2.22 shaded]

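The pooled two-proportion test can be sketched in a few lines of Python. This is an illustration only, using the standard library; the function name `two_proportion_z_test` is hypothetical and not part of the lecture. Because it works with unrounded proportions, the statistic comes out near -2.19 rather than the -2.22 obtained from the two-decimal values on the slide. The conclusion at the 0.05 level is the same.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(y1, n1, y2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled standard error.

    y1, y2 are success counts and n1, n2 are sample sizes.
    (Hypothetical helper name, chosen for this illustration.)
    """
    p1_hat = y1 / n1
    p2_hat = y2 / n2
    # Pooled sample proportion: all successes over all observations.
    p_pooled = (y1 + y2) / (n1 + n2)
    se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    # Two-sided p-value: twice the tail area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Vitamin C study: 17 colds out of 139 versus 31 colds out of 140.
z, p_value = two_proportion_z_test(17, 139, 31, 140)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```
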
Confidence Interval for Difference
We use the two sample proportions to construct a confidence interval for the difference in population proportions $p_1 - p_2$ between two groups:
  $(\hat{p}_1 - \hat{p}_2) \pm Z^* \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
The interval is centered at the difference of the two sample proportions
As usual, the multiple $Z^*$ you use depends on the confidence level that is needed
  e.g., for a 95% confidence interval, $Z^* = 1.96$

Example: Vitamin C study
We want a C.I. for the difference in the proportion of colds $p_1 - p_2$ between Vitamin C and placebo
We need the sample proportions from before:
  $\hat{p}_1 = 17/139 = .12$
  $\hat{p}_2 = 31/140 = .22$
Now, we construct a 95% confidence interval:
  $(.12 - .22) \pm 1.96 \times \sqrt{.12 \times .88/139 + .22 \times .78/140} = (-.19, -.01)$
We are 95% confident that Vitamin C decreases the proportion of colds by between 1 and 19 percentage points

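As a check on the arithmetic, the interval can be computed with a short Python sketch. The function name `diff_proportion_ci` is made up for this illustration. It uses the unpooled standard error, and it takes $Z^*$ from `statistics.NormalDist` instead of a table, which gives about 1.96 at the 95% level.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(y1, n1, y2, n2, level=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error.

    (Hypothetical helper name, chosen for this illustration.)
    """
    p1_hat = y1 / n1
    p2_hat = y2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value Z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Vitamin C study: 17/139 colds versus 31/140 colds.
lower, upper = diff_proportion_ci(17, 139, 31, 140)
print(f"95% CI for p1 - p2: ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the endpoints agree with the interval (-.19, -.01) computed by hand.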
Another Example
Has Shaq gotten worse at free throws over his career?
Free throws are uncontested shots given to a player when they are fouled. Shaquille O’Neal is notoriously bad at them
Two samples: the first three years of Shaq’s career vs. a later three years of his career

Group         Free Throws Made   Free Throws Attempted
Early Years   $Y_1 = 1353$       $n_1 = 2425$
Later Years   $Y_2 = 1121$       $n_2 = 2132$

Another Example: Shaq’s Free Throws
We calculate the sample and pooled proportions:
  $\hat{p}_1 = 1353/2425 = .558$
  $\hat{p}_2 = 1121/2132 = .526$
  $\hat{p} = 2474/4557 = .543$
Next, we calculate the pooled standard error:
  $SE_p = \sqrt{.543 \times .457 \times (1/2425 + 1/2132)} = .015$
Finally, we calculate our test statistic:
  $Z = (.558 - .526)/.015 = 2.13$

Another Example: Shaq’s Free Throws
We use the standard normal distribution to calculate a p-value for our test statistic
Since we used a two-sided alternative, our p-value is
  $2 \times P(Z > 2.13) = 2 \times 0.0166 = 0.0332$
At the $\alpha = 0.05$ level, we reject the null hypothesis
Conclusion: Shaq’s free throw success is significantly different later in his career than it was early in his career
[Figure: standard normal curve with the upper-tail area above Z = 2.13 shaded]

Confidence Interval: Shaq’s FT
We want a confidence interval for the difference in Shaq’s free throw proportion:
  $\hat{p}_1 = 1353/2425 = .558$
  $\hat{p}_2 = 1121/2132 = .526$
Now, we construct a 95% confidence interval:
  $(.558 - .526) \pm 1.96 \times \sqrt{.558 \times .442/2425 + .526 \times .474/2132} = (.003, .061)$
We are 95% confident that Shaq’s free throw percentage has decreased by between 0.3 and 6.1 percentage points

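Both Shaq calculations can be reproduced together in one short Python sketch. It is a rough illustration with made-up variable names, not the lecture's own code. It shows the two different standard errors side by side: pooled for the test, unpooled for the interval. With unrounded proportions the statistic is about 2.17 rather than 2.13, because the hand calculation rounds the standard error to .015.

```python
from math import sqrt
from statistics import NormalDist

# Free throws made / attempted (early years versus later years).
y1, n1 = 1353, 2425
y2, n2 = 1121, 2132

p1_hat, p2_hat = y1 / n1, y2 / n2
diff = p1_hat - p2_hat

# Hypothesis test: the pooled standard error assumes H0: p1 = p2.
p_pooled = (y1 + y2) / (n1 + n2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = 2 * NormalDist().cdf(-abs(z))

# Confidence interval: the unpooled standard error makes no such assumption.
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_star = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
lower, upper = diff - z_star * se_unpooled, diff + z_star * se_unpooled

print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```
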
Is Shaq still bad at Free Throws?