False Discovery Rates for Discrete Data Joseph F. Heyse Merck Research Laboratories Graybill Conference June 13, 2008.

Presentation transcript:


Graybill.ppt. Introduction
- Almost all multiplicity considerations in clinical trial applications are designed to control the Familywise Error Rate (FWER).
- Benjamini and Hochberg (1995) argued that in certain settings, requiring control of the FWER is often too conservative.
- They suggested that controlling the “False Discovery Rate” (FDR) is a more powerful alternative.
- Accounting for discrete endpoints can further improve the power of FDR (and FWER) methods.

Outline
1. Definition and properties of FDR
2. FDR for discrete data
3. Application: Genetic variants of HIV
4. Summary of simulation results
5. Application: Rodent carcinogenicity study
6. Concluding remarks

Familywise Error Rate (FWER)
- Let $F = \{H_1, H_2, \dots, H_K\}$ denote a family of K hypotheses.
- FWER = Pr(any true $H_i \in F$ is rejected).
- The procedures currently used for clinical studies are intended to control the FWER $\le \alpha$.
- Benjamini & Hochberg (1995) proposed controlling the “False Discovery Rate” (FDR) as a more powerful alternative to FWER.
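To illustrate why FWER control matters, the following minimal sketch computes Pr(any true hypothesis is rejected) when K true hypotheses are tested independently, each at level α. It assumes continuous P-values (so each test has size exactly α) and independence; the function names and the Bonferroni comparison are illustrative choices, not taken from the slides.

```python
def fwer_unadjusted(num_hypotheses, alpha):
    """Pr(at least one rejection) when all hypotheses are true,
    tests are independent, and each is run at level alpha (assumed exact)."""
    return 1.0 - (1.0 - alpha) ** num_hypotheses


def bonferroni_level(num_hypotheses, alpha):
    """Per-test level under a Bonferroni split of alpha."""
    return alpha / num_hypotheses


K, alpha = 15, 0.05
# Without adjustment, the chance of at least one false rejection grows with K.
print(round(fwer_unadjusted(K, alpha), 3))
# Testing each hypothesis at alpha / K keeps the FWER at or below alpha.
print(round(fwer_unadjusted(K, bonferroni_level(K, alpha)), 3))
```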

False Discovery Rate (FDR) (Benjamini & Hochberg, 1995)
- FDR = E(V/R), where R is the number of hypotheses rejected and V is the number of true hypotheses rejected.
- When R=0, FDR is defined to be 0.

False Discovery Rate (FDR) (cont’d) (Benjamini & Hochberg)
- Example (K=4)

Properties of FDR Control
- The B&H sequential procedure controls the FDR at $\alpha$ for independent hypotheses.
- FDR $\le$ FWER, and equality holds if $K = K_0$ (all K hypotheses are true).
- The Hochberg (1988) stepwise procedure compares $P_{(i)}$ with $\alpha/(K-i+1)$, while the FDR procedure compares $P_{(i)}$ with $i\alpha/K$.
- FDR is potentially more powerful than FWER controlling procedures.
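The difference between the two step-up rules can be sketched in a few lines. The sketch below assumes the standard critical values, α/(K−i+1) for Hochberg (1988) and iα/K for Benjamini and Hochberg (1995), applied to the ordered P-values; the four P-values are hypothetical and are not the K=4 example from the slides.

```python
def step_up_rejections(pvalues, critical_values):
    """Step-up rule: find the largest rank i with P_(i) <= c_i
    and reject the hypotheses with the i smallest P-values."""
    ordered = sorted(pvalues)
    n_reject = 0
    for rank, (p, c) in enumerate(zip(ordered, critical_values), start=1):
        if p <= c:
            n_reject = rank
    return n_reject


def hochberg_critical_values(K, alpha):
    """FWER-controlling step-up critical values alpha / (K - i + 1)."""
    return [alpha / (K - i + 1) for i in range(1, K + 1)]


def bh_critical_values(K, alpha):
    """FDR-controlling step-up critical values i * alpha / K."""
    return [i * alpha / K for i in range(1, K + 1)]


pvals = [0.010, 0.020, 0.030, 0.200]  # hypothetical P-values
alpha = 0.05
n_hochberg = step_up_rejections(pvals, hochberg_critical_values(len(pvals), alpha))
n_bh = step_up_rejections(pvals, bh_critical_values(len(pvals), alpha))
print(n_hochberg, n_bh)
```

Because iα/K is at least α/(K−i+1) at every rank, the FDR rule never rejects fewer hypotheses than the Hochberg rule on the same P-values.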

Comparing FDR and FWER
- FDR adjusted P-values $\le$ FWER adjusted P-values
- Example (K=4): Unadjusted P-values, FDR-adjusted P-values, FWER-adjusted P-values
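A comparison of this kind can be reproduced with step-up adjusted P-values. The sketch below assumes the FWER adjustment is Hochberg's (multiplier K−j+1 at rank j) and the FDR adjustment is Benjamini and Hochberg's (multiplier K/j); the slide does not say which FWER method it used, and the input P-values are hypothetical.

```python
def step_up_adjust(pvalues, multiplier):
    """Step-up adjusted P-values: working from the largest P-value down,
    take the running minimum of multiplier(rank, K) * P_(rank), capped at 1."""
    K = len(pvalues)
    order = sorted(range(K), key=lambda i: pvalues[i])
    adjusted = [0.0] * K
    running = 1.0
    for rank in range(K, 0, -1):
        idx = order[rank - 1]
        running = min(running, multiplier(rank, K) * pvalues[idx])
        adjusted[idx] = running
    return adjusted


def bh_multiplier(rank, K):
    return K / rank            # FDR (Benjamini-Hochberg)


def hochberg_multiplier(rank, K):
    return K - rank + 1        # FWER (Hochberg), an assumed choice here


pvals = [0.010, 0.020, 0.030, 0.200]  # hypothetical P-values
fdr_adjusted = step_up_adjust(pvals, bh_multiplier)
fwer_adjusted = step_up_adjust(pvals, hochberg_multiplier)
for p, f, w in zip(pvals, fdr_adjusted, fwer_adjusted):
    print(f"{p:.3f}  FDR-adjusted {f:.3f}  FWER-adjusted {w:.3f}")
```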

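Modified FDR for Discrete Data
- Adjusted P-values for FDR: $\tilde{P}_{(K)} = P_{(K)}$ and $\tilde{P}_{(j)} = \min\{\tilde{P}_{(j+1)},\ (K/j)\,P_{(j)}\}$ for $j = K-1, \dots, 1$.
- For discrete data, $K\,P_{(j)}$ is replaced by $\sum_{i=1}^{K} P^{*}_{i}$, where $P^{*}_{i}$ is the largest P-value achievable for hypothesis i that is less than or equal to $P_{(j)}$.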

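FDR for Discrete Data
- Gain in power for the discrete data FDR comes from the difference $K\,P_{(j)} - \sum_{i=1}^{K} P^{*}_{i}$, where $P^{*}_{i}$ is the largest P-value achievable for endpoint i that is at most $P_{(j)}$.
- If endpoint i is not able to achieve a P-value ≤ $P_{(j)}$, then $P^{*}_{i} = 0$ and the dimensionality is reduced.
- If endpoint i is able to achieve a P-value ≤ $P_{(j)}$, then $P^{*}_{i} \le P_{(j)}$ and a smaller quantity adds to $\sum_{i} P^{*}_{i}$.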
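The discrete adjustment described above can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: each hypothesis is assumed to come with the full list of P-values its exact test can attain (`supports`, a hypothetical input), the sum of largest achievable P-values replaces K·P_(j) at every rank including the largest, and the observed P-values and supports are invented for the example.

```python
def largest_achievable(support, threshold):
    """Largest attainable P-value that is <= threshold, or 0.0 if none exists."""
    eligible = [p for p in support if p <= threshold + 1e-12]
    return max(eligible) if eligible else 0.0


def bh_adjust(pvalues):
    """Standard B-H adjusted P-values: running minimum of (K / j) * P_(j)."""
    K = len(pvalues)
    order = sorted(range(K), key=lambda i: pvalues[i])
    adjusted, running = [0.0] * K, 1.0
    for rank in range(K, 0, -1):
        idx = order[rank - 1]
        running = min(running, K * pvalues[idx] / rank)
        adjusted[idx] = running
    return adjusted


def discrete_fdr_adjust(pvalues, supports):
    """Discrete variant: K * P_(j) is replaced by the sum over hypotheses
    of the largest attainable P-value not exceeding P_(j)."""
    K = len(pvalues)
    order = sorted(range(K), key=lambda i: pvalues[i])
    adjusted, running = [0.0] * K, 1.0
    for rank in range(K, 0, -1):
        idx = order[rank - 1]
        total = sum(largest_achievable(s, pvalues[idx]) for s in supports)
        running = min(running, total / rank)
        adjusted[idx] = running
    return adjusted


# Hypothetical observed P-values and attainable P-values for K = 3 endpoints.
observed = [0.01, 0.04, 0.30]
supports = [[0.01, 0.20, 1.0], [0.04, 0.50, 1.0], [0.30, 1.0]]
print(bh_adjust(observed))
print(discrete_fdr_adjust(observed, supports))
```

In this example the third endpoint cannot attain a P-value at or below 0.04, so it contributes nothing at ranks 1 and 2, which is where the reduction in dimensionality shows up.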

Other Approaches for Discrete Data
- Tarone (1990) proposed a modified Bonferroni procedure for discrete data by removing those endpoints unable to reach that level of statistical significance.
- Gilbert (2005) proposed a two-step FDR method for discrete data.
  1. Apply Tarone’s method to identify endpoints suitable for adjustment.
  2. Apply B-H FDR to those endpoints.
- Calculating the FDR adjusted P-value is expected to improve upon these approaches by using the complete exact distribution.
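The two-step idea can be sketched as below. The sketch assumes the usual statement of Tarone's rule (find the smallest k such that at most k endpoints have a minimum attainable P-value ≤ α/k) and then applies the B-H step-up rule to the retained endpoints only. The use of ≤ rather than a strict inequality, the function names, and all numbers are assumptions made for illustration.

```python
def tarone_eligible(min_pvalues, alpha):
    """Return (k, eligible): the smallest k such that at most k endpoints
    have a minimum attainable P-value <= alpha / k, and those endpoints."""
    K = len(min_pvalues)
    for k in range(1, K + 1):
        eligible = [i for i, p_min in enumerate(min_pvalues) if p_min <= alpha / k]
        if len(eligible) <= k:   # always true by k = K at the latest
            return k, eligible
    return 0, []                 # only reached when there are no endpoints


def gilbert_two_step(pvalues, min_pvalues, alpha):
    """Step 1: keep Tarone-eligible endpoints. Step 2: B-H on those only."""
    _, eligible = tarone_eligible(min_pvalues, alpha)
    m = len(eligible)
    ranked = sorted(eligible, key=lambda i: pvalues[i])
    n_reject = 0
    for rank, i in enumerate(ranked, start=1):
        if pvalues[i] <= rank * alpha / m:
            n_reject = rank
    return sorted(ranked[:n_reject])


# Hypothetical observed and minimum attainable P-values for 5 endpoints.
observed = [0.004, 0.020, 0.030, 0.60, 0.45]
minimum_attainable = [0.001, 0.005, 0.010, 0.20, 0.30]
k, kept = tarone_eligible(minimum_attainable, 0.05)
rejected = gilbert_two_step(observed, minimum_attainable, 0.05)
print(k, kept, rejected)
```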

Example: Genetic Variants of HIV
Gilbert (2005) compared the mutation rates at 118 positions in HIV amino-acid sequences of 73 patients with subtype C to 73 patients with subtype B. The B-H FDR procedure identified 12 significant positions. The Tarone modified FDR procedure reduced the dimensionality to 25 and identified 15 significant positions. The fully discrete FDR identified 20 significant positions.

Simulation Study for Independent Hypotheses
- A simulation study was conducted to evaluate the statistical properties of the FDR controlling methods for discrete data using Fisher’s Exact Test.
- Simulation parameters
  - Number of hypotheses: K = 5, 10, 15, 20
  - Varying numbers of false hypotheses ($K - K_0$)
  - Background rates chosen randomly from U(.01, .5)
  - Odds ratios for effect size: OR = 1.5, 2, 2.5, 3
  - Sample sizes: N = 10, 25, 50, 100
  - $\alpha$ = …, …-tailed
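The discrete methods depend on which P-values Fisher's Exact Test can actually attain for a given endpoint. As a minimal sketch, the function below lists the attainable one-sided (upper-tail) P-values for a 2×2 table with fixed margins, using only the hypergeometric distribution. The choice of the upper tail and the example group sizes are assumptions; the slides do not preserve the sidedness or α used in the simulation.

```python
from math import comb


def fisher_upper_tail_support(n_treated, n_control, total_events):
    """Map each possible event count x in the treated group to the
    one-sided P-value Pr(X >= x) under the hypergeometric null,
    with both margins of the 2x2 table held fixed."""
    n_total = n_treated + n_control
    lowest = max(0, total_events - n_control)
    highest = min(n_treated, total_events)
    denom = comb(n_total, total_events)
    support, tail = {}, 0.0
    for x in range(highest, lowest - 1, -1):
        tail += comb(n_treated, x) * comb(n_control, total_events - x) / denom
        support[x] = tail
    return support


# Hypothetical endpoint: 10 subjects per group and 3 events in total.
support = fisher_upper_tail_support(10, 10, 3)
for x in sorted(support):
    print(x, round(support[x], 4))
# The smallest attainable P-value is 120/1140, which exceeds 0.05, so this
# endpoint cannot reach significance at the 0.05 level however the events fall.
```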

Rate of Rejecting True Hypotheses When All Hypotheses are True ($K_0 = K$)

Rate of Rejecting True Hypotheses When Some Hypotheses are False ($K_0 < K$)

Rate of Rejecting False Hypotheses

Other Applications
- Analysis of Tier II clinical trial adverse experiences
- Trend test analysis of rodent carcinogenicity data
- Similar modification applied to Bonferroni adjustment for discrete endpoints

Rodent Carcinogenicity Studies
- Long-term carcinogenicity studies typically test candidate drugs in several graded doses and use a vehicle control group.
- 50 male and 50 female rodents are randomly assigned to each drug treated group with 100 rodents of each sex assigned to control. Male and female studies are considered separately.
- Treatment is administered daily and a terminal necropsy is performed at the end of the study.

Rodent Carcinogenicity Studies (cont’d)
- Each individual tumor site encountered is described by a combination of organ or tissue with tumor type.
- A statistical analysis of trend is performed for all tumor sites encountered.
- An exact test uses the permutation distribution of the trend statistic.
- Exact tests can account for age at necropsy and lethality of tumor.
- For illustration purposes, this analysis only considered dose and presence of tumor.

Summary of Statistical Results from a Long-Term Carcinogenicity Study in Male Mice

Columns: Tumor Site; Number of Mice on Study (Control, Test Agent Dose); Trend P-value

Tumor sites:
- Liver, Hepatocellular Carcinoma
- P.S.U., Hemangiosarcoma
- Adrenal Cortex, Adenoma
- P.S.U., Sarcoma
- P.S.U., Lymphoma
- Lung, Adenoma
- Liver, Hepatocellular Adenoma
- Liver, Hemangiosarcoma
- Harderian Gland, Adenoma
- Skin, Fibroma
- Thyroid, Follicular Cell Carcinoma
- P.S.U., Leukemia
- Lung, Adenocarcinoma
- Testes, Interstitial Cell Tumor
- Stomach, Papilloma

Trend P-value is reported 1-tailed using exact permutational distribution. N indicates 1-tailed P-value for negative trend.

Available Methods
- Adjusting P-values to account for multiple tumor types (Mantel 1980, Brown and Fears 1981, Mantel et al. 1982)
- Adapting $\alpha$ for interpreting unadjusted P-values (Haseman 1983 and 1990, Lin and Rahman 1998)
- Resampling methods to adjust P-values (Heyse and Rom 1988, Westfall and Young 1989, Westfall and Soper 1998)
- Bayesian methods using historical control priors (Westfall and Soper 2001)

Multiplicity of Statistical Tests
- Liver, hepatocellular carcinoma was only 1 of K=15 tumor sites encountered.
- $P_{(1)}$ was the most extreme individual trend P-value.
- Interest is in the likelihood of observing $P_{(1)}$ as the most extreme P-value among the K=15 in this study.
- Need to consider the discrete nature of the data since several tumor sites may not be able to achieve significance levels of $P_{(1)}$.

P-value Adjustment Methods
- Mantel (1980), attributed to J.W. Tukey: adjusted P-value $= 1 - (1 - P_{(1)})^{k}$, where k = number of tumor sites that could yield P-values as extreme as $P_{(1)}$.
- Mantel et al. (1982): adjusted P-value $= 1 - \prod_{i=1}^{K} (1 - P^{*}_{i})$, where $P^{*}_{i}$ is the largest P-value achievable for tumor site i that is less than or equal to $P_{(1)}$.
- Discrete FDR adjustment: adjusted P-value = 0.264
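The two Mantel adjustments can be sketched in code. The sketch assumes the forms usually quoted for these methods: 1 − (1 − P_(1))^k for the Tukey adjustment reported by Mantel (1980), and 1 − Π(1 − P*_i) for Mantel et al. (1982), with P*_i set to 0 for tumor sites that cannot attain a P-value as small as P_(1). The input values are hypothetical and are not the study's data, so the output does not reproduce the 0.264 reported on the slide.

```python
def tukey_mantel_adjust(p_min, k):
    """Mantel (1980), attributed to Tukey: chance that the smallest of k
    independent P-values is as extreme as p_min."""
    return 1.0 - (1.0 - p_min) ** k


def mantel_et_al_adjust(p_star):
    """Mantel et al. (1982): uses, for each site, the largest attainable
    P-value not exceeding p_min (0.0 when no such value exists)."""
    prob_none = 1.0
    for p in p_star:
        prob_none *= 1.0 - p
    return 1.0 - prob_none


# Hypothetical values: p_min = 0.01, and three sites whose largest attainable
# P-values at or below p_min are 0.01, none (0.0), and 0.008.
p_min = 0.01
p_star = [0.01, 0.0, 0.008]
k = sum(1 for p in p_star if p > 0.0)   # sites able to reach p_min
print(round(tukey_mantel_adjust(p_min, k), 5))
print(round(mantel_et_al_adjust(p_star), 5))
```

Because every P*_i is at most P_(1), the Mantel et al. value is never larger than the Tukey value computed with the same set of eligible sites.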

Conclusions
- FDR control provides higher power than FWER control when some hypotheses are false.
- Proposed procedure based on exact analysis of binomial data controls FDR at $\alpha$.
- Discrete nature of data results in slightly conservative FDR control. FDR is less conservative for increasing sample size and increasing numbers of hypotheses.
- Accounting for discrete endpoints increases the power of FDR.