Published by Estella Turner. Modified over 9 years ago.
1
Why are multiple tests a problem? Rafael A. Irizarry
2
Other names Multiple comparisons Data snooping Others?
3
References
H. Scheffé (1953), “A method for judging all contrasts in the analysis of variance”, Biometrika 40:87-104.
D.B. Duncan (1965), “A Bayesian approach to multiple comparisons”, Technometrics 7:171-222.
J.W. Tukey (1953), “The problem of multiple comparisons”, reprinted in CWJWT Vol. VIII (1994).
R.G. Miller, Simultaneous Statistical Inference, 2nd ed. (Springer 1981).
4
Thanks to Yoav Benjamini. Benjamini and Hochberg (1995), “Controlling the false discovery rate: a practical and powerful approach to multiple testing”, J. R. Stat. Soc. Ser. B.
5
Example: E. Giovannucci, A. Ascherio, E. Rimm, M. Stampfer, G. Colditz, W. Willett, “Intake of Carotenoids and Retinol in Relation to Risk of Prostate Cancer”, Journal of the National Cancer Institute 87(23):1767-1776 (6 Dec 1995).
6
“Using responses to a validated, semiquantitative food frequency questionnaire mailed to participants in the Health Professionals Follow-up Study in 1986, we assessed dietary intake for a 1-year period for a cohort of 47,894 eligible subjects initially free of diagnosed cancer.... We calculate the relative risk (RR) for each of the upper categories of intake of a specific food or nutrient by dividing the incidence of prostate cancer among men in each of these categories by the rate among men in the lowest intake level....
7
“Of 46 vegetables and fruits or related products, four were significantly associated with lower prostate cancer risk; of the four, tomato sauce (P for trend = 0.001), tomatoes (P for trend = 0.03), and pizza (P for trend = 0.05), but not strawberries, were primary sources of lycopene.”
8
BUT the Methods section one page later states: “For each of 131 food and beverage items listed...” And the (presumably strongest) carotenoids and p-values are listed in Table 2 (p. 1770):

Food            p-value
Tomato sauce    0.001
Tomatoes        0.03
Tomato juice    0.67
Pizza           0.05

“Our findings... suggest that tomato-based foods may be especially beneficial regarding prostate cancer risk.”
9
What is a p-value again? When no food or nutrient is protective (every null hypothesis is true), we expect about 131 × 0.05 ≈ 7 foods/nutrients to have p-values < 0.05 by chance alone.
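The arithmetic on this slide can be checked with a short simulation. This is only a sketch under two assumptions that the slides do not state: the 131 tests are independent, and each p-value is uniform on [0, 1] when its null hypothesis is true. The function names are made up for illustration.

```python
import random

def expected_false_positives(m, alpha):
    """Expected number of p-values below alpha when all m nulls are true."""
    return m * alpha

def simulate_false_positives(m, alpha, n_sims=2000, seed=0):
    """Average count of p-values below alpha over simulated all-null studies.

    Assumption: tests are independent and null p-values are Uniform(0, 1).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sims):
        # One simulated study: m null p-values, count those below alpha.
        total += sum(1 for _ in range(m) if rng.random() < alpha)
    return total / n_sims

# 131 food items tested at the 0.05 level, none truly protective.
print("expected: ", expected_false_positives(131, 0.05))
print("simulated:", simulate_false_positives(131, 0.05))
```

The simulated average should land close to 131 × 0.05, that is, between 6 and 7 "significant" foods per study even though nothing is protective.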
10
Microarrays: When no genes are changing between two groups, we expect 20,000 × 0.01 = 200 genes to have a p-value < 0.01. However, false positives are not as costly here as in other fields.
11
What can we do? P-values no longer mean what they used to… no argument. A histogram of p-values is a useful plot. What can we do… lots of argument.
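To see why a histogram of p-values is informative, one can simulate it. This sketch assumes two-sided z-tests with unit variance, where a hypothetical fraction of the tests has a true shift `effect`; none of these settings come from the slides. When every null is true the histogram is roughly flat, and true effects show up as an excess of small p-values in the first bin.

```python
import random
from statistics import NormalDist

def simulate_p_values(m, m1, effect, seed=0):
    """Two-sided z-test p-values: m1 tests have mean `effect`, the rest mean 0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for i in range(m):
        mean = effect if i < m1 else 0.0
        z = rng.gauss(mean, 1.0)
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values

def histogram(p_values, n_bins=20):
    """Counts of p-values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        # A p-value of exactly 1.0 goes into the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts

def print_histogram(counts, width=50):
    """Crude text histogram, one row per bin."""
    largest = max(counts) or 1
    n_bins = len(counts)
    for i, count in enumerate(counts):
        lo, hi = i / n_bins, (i + 1) / n_bins
        bar = "#" * round(width * count / largest)
        print(f"[{lo:.2f}, {hi:.2f}) {count:6d} {bar}")

# Hypothetical setting: 20,000 genes, 2,000 of them truly changed.
print_histogram(histogram(simulate_p_values(20000, 2000, effect=3.0)))
```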
12
Multiple Hypothesis Testing

                    Called Significant    Not Called Significant    Total
Null True           V                     m0 - V                    m0
Alternative True    S                     m1 - S                    m1
Total               R                     m - R                     m

Null = Equivalent Expression; Alternative = Differential Expression
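The table entries can be computed directly when the truth is known, as it is in a simulation. A minimal sketch, with a hypothetical helper `outcome_table` and a made-up five-hypothesis example:

```python
def outcome_table(is_null, called_significant):
    """Tabulate V, S, R, m0, m1, m from known truth and test decisions.

    is_null[i] is True when hypothesis i is truly null;
    called_significant[i] is True when test i was called significant.
    """
    pairs = list(zip(is_null, called_significant))
    V = sum(1 for null, sig in pairs if null and sig)       # false positives
    S = sum(1 for null, sig in pairs if not null and sig)   # true positives
    m0 = sum(1 for null in is_null if null)                 # true nulls
    m = len(pairs)
    return {"V": V, "S": S, "R": V + S, "m0": m0, "m1": m - m0, "m": m}

# Made-up example: three true nulls, two true alternatives.
truth = [True, True, True, False, False]
calls = [True, False, False, True, False]
t = outcome_table(truth, calls)
print("Null true:       ", t["V"], t["m0"] - t["V"], t["m0"])
print("Alternative true:", t["S"], t["m1"] - t["S"], t["m1"])
print("Total:           ", t["R"], t["m"] - t["R"], t["m"])
```

In real data only R and m are observed; V, S, m0 and m1 are unknown, which is why the error rates are defined as expectations or probabilities.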
13
Error Rates
Per comparison error rate (PCER): the expected number of Type I errors divided by the number of hypotheses. PCER = E(V)/m
Per family error rate (PFER): the expected number of Type I errors. PFER = E(V)
Family-wise error rate (FWER): the probability of at least one Type I error. FWER = Pr(V ≥ 1)
False discovery rate (FDR): the rate at which false discoveries occur. FDR = E(V/R; R>0) = E(V/R | R>0) Pr(R>0)
Positive false discovery rate (pFDR): the rate at which discoveries are false. pFDR = E(V/R | R>0)
Many others
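These definitions can be compared side by side by Monte Carlo. The sketch below assumes independent two-sided z-tests, each run at a fixed per-test level `alpha`, with hypothetical values for m, m1 and the effect size; it is an illustration of the definitions, not a procedure from the slides.

```python
import random
from statistics import NormalDist

def simulate_error_rates(m, m1, effect, alpha, n_sims=2000, seed=0):
    """Monte Carlo estimates of PCER, PFER, FWER, FDR and pFDR.

    Assumption: m independent z-tests, m1 with true mean `effect`,
    each called significant when its two-sided p-value is below alpha.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    m0 = m - m1
    sum_V = 0          # total Type I errors across simulations
    n_any_V = 0        # simulations with at least one Type I error
    sum_ratio = 0.0    # sum of V/R over simulations with R > 0
    n_R_pos = 0        # simulations with R > 0
    for _ in range(n_sims):
        V = sum(1 for _ in range(m0) if abs(rng.gauss(0.0, 1.0)) > z_crit)
        S = sum(1 for _ in range(m1) if abs(rng.gauss(effect, 1.0)) > z_crit)
        R = V + S
        sum_V += V
        if V >= 1:
            n_any_V += 1
        if R > 0:
            sum_ratio += V / R
            n_R_pos += 1
    return {
        "PCER": sum_V / n_sims / m,
        "PFER": sum_V / n_sims,
        "FWER": n_any_V / n_sims,
        "FDR": sum_ratio / n_sims,  # V/R counted as 0 when R = 0
        "pFDR": sum_ratio / n_R_pos if n_R_pos else float("nan"),
    }

# Hypothetical setting: 100 tests, 10 true effects, per-test level 0.05.
for name, value in simulate_error_rates(100, 10, 3.0, 0.05).items():
    print(f"{name:5s} {value:.3f}")
```

With these settings the per-test level holds PCER below 0.05, yet the FWER comes out close to 1, which is the multiple testing problem in a single number. The estimates also satisfy FDR = pFDR × Pr(R > 0) by construction.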
14
Conclusions: Let's do a multiple comparison of the different beers sold by the IF