Xin Huang, Yan Sun, V. Devanarayan Exploratory Statistics AbbVie Inc.

Presentation transcript:

Paths to precision medicine: Subgroup Identification in Clinical Trials. Xin Huang, Yan Sun, V. Devanarayan, Exploratory Statistics, AbbVie Inc. April 15, 2015

Why Personalized Medicine? Source: Edward Abrahams and Mike Silver (2009) The Case for Personalized Medicine. Journal of Diabetes Science and Technology, vol. 3, issue 4.

Some statistical challenges
- For ease of implementation in clinical practice, we need cut-points on biomarkers for predicting responders and non-responders, i.e., threshold-based biomarker signatures. E.g., patients with Gene X1 > …, Gene X2 < … are likely responders.
- The signature should be multivariate.
- It is typically derived from a large panel of candidate markers, and often from full-genome data (e.g., > 30,000 genes).
- It needs to account for linear and nonlinear trends.
- After a promising threshold-based signature is identified, we need to predict its performance in a future dataset, i.e., predict the treatment effect in the "responder" subgroup, or predict the signature effect among patients receiving treatment.

Biomarker signatures for subgroup identification
- Prognostic signature: identifies a subgroup of patients that are more likely to experience an outcome of interest (efficacy, toxicity, disease progression, etc.), independent of treatment.
- Predictive signature: identifies a subgroup of patients that respond better to a specific treatment.

Some existing methods
Prognostic signatures (predict the disease outcome irrespective of the treatment):
- CART (Breiman et al., 1984)
- MARS (Friedman, 1991)
- RuleFit (Friedman and Popescu, 2008)
Predictive signatures (predict the response to a specific treatment compared to other treatments):
- Interaction Trees (Su et al., 2008, 2009)
- Virtual Twins (Foster et al., 2011)
- SIDES method (Lipkovich et al., 2011, 2014)
- Bayesian approaches (Berger et al., 2014)

Objective functions
Consider a supervised learning problem with data $(\boldsymbol{x}_i, y_i)$, $i = 1, 2, \ldots, n$, where $\boldsymbol{x}_i$ is a $p$-vector of predictors and $y_i$ is an outcome variable. Consider three major applications:
- Linear regression for a continuous response
- Logistic regression for a binary response, where $y_i \in \{0, 1\}$
- Cox regression for a survival response $y_i = (T_i, \delta_i)$, where $T_i$ is a right-censored survival time and $\delta_i$ is the censoring indicator
Denote the log-likelihood or log partial likelihood by $\ell(\eta; \boldsymbol{X}, \boldsymbol{y})$, where $\eta$ is the usual linear combination of predictors: the continuous response in simple linear regression, the log odds in logistic regression, and the log hazard ratio in proportional hazards regression.
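For reference, the three objective functions can be written out in their standard textbook forms. This is a sketch under assumptions that the slide does not state: a Gaussian error model with variance $\sigma^2$ in the linear case, $\delta_i = 1$ marking an observed event, no tied event times in the Cox partial likelihood, and $\eta_i$ denoting the linear predictor for subject $i$.

```latex
\begin{align*}
\text{Linear:}\quad & \ell(\eta; \boldsymbol{X}, \boldsymbol{y})
  = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \eta_i)^2 \\
\text{Logistic:}\quad & \ell(\eta; \boldsymbol{X}, \boldsymbol{y})
  = \sum_{i=1}^{n}\left[\, y_i\,\eta_i - \log\left(1 + e^{\eta_i}\right) \right] \\
\text{Cox:}\quad & \ell(\eta; \boldsymbol{X}, \boldsymbol{y})
  = \sum_{i:\,\delta_i = 1}\left[\, \eta_i - \log\sum_{j:\,T_j \ge T_i} e^{\eta_j} \right]
\end{align*}
```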

Objective functions, contd.
Consider the following model for prognostic signatures (predict the outcome, irrespective of the treatment):
$\eta = \alpha + \beta \cdot \omega(\boldsymbol{X})$, (1)
where $\omega(\boldsymbol{X}) \in \{0, 1\}$ is the signature rule returning a grouping indicator for each subject.
Consider the following model for predictive signatures (predict the response to a specific treatment compared to the other treatment):
$\eta = \alpha + \beta \cdot [\omega(\boldsymbol{X}) \times r] + \gamma \cdot r$, (2)
where $r$ is the treatment indicator.
Our algorithms derive signature rules $\omega(\boldsymbol{X})$ by searching for the grouping that optimizes the significance of $\beta$ in (1) and (2).
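To make "significance of β" concrete, here is a minimal Python sketch for model (2) with a continuous response. It relies on the fact that the three design columns (intercept, ω×r, r) span the same space as the indicators of three cells (control, treated Sig-, treated Sig+), so the least-squares estimate of β is the difference between the treated Sig+ and treated Sig- means. The function name `beta_test_model2` is hypothetical, and the p-value uses a normal approximation rather than the exact t distribution; neither comes from the slides.

```python
from statistics import NormalDist, mean

def beta_test_model2(y, omega, r):
    """Test beta in eta = alpha + beta*(omega*r) + gamma*r for a continuous y.

    y: responses; omega: 0/1 signature indicator; r: 0/1 treatment indicator.
    Returns (beta_hat, test statistic, two-sided p-value, normal approximation).
    """
    cells = {"ctrl": [], "trt_neg": [], "trt_pos": []}
    for yi, wi, ri in zip(y, omega, r):
        if ri == 0:
            cells["ctrl"].append(yi)        # model (2) gives one mean to all controls
        elif wi == 1:
            cells["trt_pos"].append(yi)
        else:
            cells["trt_neg"].append(yi)
    means = {name: mean(values) for name, values in cells.items()}
    # Residual variance pooled over the three cells, with n - 3 degrees of freedom
    rss = sum((yi - means[name]) ** 2 for name, values in cells.items() for yi in values)
    sigma2 = rss / (len(y) - 3)
    beta_hat = means["trt_pos"] - means["trt_neg"]
    se = (sigma2 * (1 / len(cells["trt_pos"]) + 1 / len(cells["trt_neg"]))) ** 0.5
    stat = beta_hat / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(stat)))  # assumption: normal approx.
    return beta_hat, stat, p_two_sided

# Toy data: 4 controls, 4 treated Sig-, 4 treated Sig+
y = [0, 1, 0, 1, 0, 1, 0, 1, 2, 3, 2, 3]
omega = [0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]
r = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
beta_hat, stat, p = beta_test_model2(y, omega, r)
```

For binary or survival outcomes the same idea applies, with the statistic taken from a logistic or Cox fit instead of least squares.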

Bootstrapping & Aggregating of Thresholds from Trees (BATTing) (Devanarayan, 1999)
[Diagram: Original Data → bootstrapping (sampling with replacement) → Data 1, Data 2, …, Data B → Tree 1 (>= C1 vs. < C1), Tree 2 (>= C2 vs. < C2), …, Tree B (>= CB vs. < CB) → aggregate thresholds (C1, C2, …, CB) → BATTing threshold (median).]
The resulting threshold is robust to small perturbations in the data, outliers, etc.

BATTing, contd.
We use the same simulation model as described in the simulation section for prognostic signatures, except that the one true predictor is the only candidate predictor. Under the simulated model, the overall response equals zero; the signature-positive group (predictor >= cutoff of 0) has a positive response, while the signature-negative group (predictor < 0) has a negative response. The difference between the two signature groups is determined by a predetermined effect size. Figure 1 shows the distribution of BATTing threshold estimates from 500 simulation runs across different numbers of bootstrap samples for sample size = 100 and effect size = 0.2, with the true cutoff being 0 (red dashed vertical line). As Figure 1 demonstrates, BATTing helps reduce the influence of perturbations in the dataset and thus stabilizes the threshold estimate. In our experience, at least 50 bootstrap samples are recommended.
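The BATTing procedure (draw bootstrap samples, fit a one-split tree to each, take the median of the resulting thresholds) can be sketched in plain Python as follows. This is an illustration under assumptions: the split criterion here is the usual regression-tree between-group sum of squares for a continuous response, standing in for whatever criterion the authors' trees use, and the function names are hypothetical.

```python
import random
import statistics

def best_stump_threshold(x, y):
    """Cutoff c that best separates y into the groups x < c and x >= c.

    Maximizes left_sum^2/n_left + right_sum^2/n_right, which is equivalent to
    maximizing the between-group sum of squares. Returns None if x is constant.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total = sum(v for _, v in pairs)
    best_c, best_gain = None, float("-inf")
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot place a cutoff between tied predictor values
        right_sum = total - left_sum
        gain = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if gain > best_gain:
            best_gain = gain
            best_c = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint cutoff
    return best_c

def batting_threshold(x, y, n_boot=50, seed=0):
    """Median of stump thresholds over n_boot bootstrap resamples."""
    rng = random.Random(seed)
    n = len(x)
    cuts = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        c = best_stump_threshold([x[i] for i in idx], [y[i] for i in idx])
        if c is not None:
            cuts.append(c)
    return statistics.median(cuts)
```

The default of 50 resamples follows the recommendation in the text; the seed argument only makes the sketch reproducible.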

Sequential BATTing
[Diagram: starting from the whole population, successive splits on individual markers (Marker 7, Marker 3, and Marker 9 in the illustration), each dividing the current Sig+ group into a new Sig+ group and a Sig- group.]
Model growing within the potential Sig+ group:
- Get the BATTing threshold for each unused marker.
- Select the best marker to split the current Sig+ group.
- Continue this procedure in the new Sig+ group.
Stopping rule: each newly added predictor must pass a likelihood ratio test for significance.

Adaptive Index Model
AIM (Tian & Tibshirani, 2010) can be used for selecting markers and thresholds.
Output: the AIM score, an index predictor counting the number of satisfied rules:
$\text{score} = \sum_{k=1}^{K} I(X_k \le c_k)$
Model to get the AIM score:
- Prognostic: $\eta^* = \theta_0 + \theta \times \text{score}$
- Predictive: $\eta^* = \theta_0 + \gamma \cdot T + \theta \times T \times \text{score}$, where $T$ here denotes the treatment indicator
A fast, information-matrix-based algorithm performs score tests to select the threshold for each marker. Markers are selected one at a time (forward selection). The optimal number of markers is determined via cross-validation.

AIM-BATTing
Step 1: Obtain the AIM score. For each patient, AIM computes a score of the form I(X1 >= c1) + I(X2 <= c2) + … + I(Xk >= ck), giving Score 1, Score 2, …, Score n for Patients 1, 2, …, n.
Step 2: Use BATTing to derive an optimal AIM score threshold j based on Models (1) and (2). The threshold is then used to stratify the population: patients with Score >= j form the Sig+ group, and the remaining patients form the Sig- group.
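The scoring and stratification steps are simple to express in code. The sketch below assumes the rules and the score threshold j have already been selected; the marker names, cut-points, and patient values are hypothetical and chosen only for illustration.

```python
import operator

# Hypothetical rules: (marker name, comparison, cut-point c_k)
RULES = [
    ("X1", operator.ge, 1.5),
    ("X2", operator.le, 0.3),
    ("X3", operator.ge, 10.0),
]

def aim_score(markers, rules=RULES):
    """Number of threshold rules that a patient's marker values satisfy."""
    return sum(1 for name, compare, cut in rules if compare(markers[name], cut))

def assign_group(markers, j, rules=RULES):
    """Sig+ when the AIM score reaches the score threshold j, otherwise Sig-."""
    return "Sig+" if aim_score(markers, rules) >= j else "Sig-"

patients = [
    {"X1": 2.0, "X2": 0.1, "X3": 12.0},  # satisfies all three rules
    {"X1": 1.0, "X2": 0.2, "X3": 4.0},   # satisfies only the X2 rule
]
scores = [aim_score(p) for p in patients]
groups = [assign_group(p, j=2) for p in patients]
```

In the full method, j would come from applying BATTing to the score itself rather than being fixed by hand.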

Some refinements to the AIM-BATTing algorithm
- MC-AIM-BATTing: a Monte Carlo procedure to get a more stable estimate of the "optimal number of markers", i.e., use the median of the estimated "optimal number of markers" across multiple cross-validation runs with different random seeds.
- MC-AIM-RULE-BATTing: use BATTing directly on the rules (Xi > c), instead of on scores, and get a cutoff on the rule list. Patients meeting all the rules within the cutoff are assigned to the Sig+ group.
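Both refinements can be illustrated in a few lines of Python. This is a sketch under assumptions: `select_n_markers` is a hypothetical callable standing in for one cross-validated AIM run with a given seed, `statistics.median_low` is used so that the result is always an integer that some run actually selected (the slides only say "median"), and the rule list is assumed to be ordered already.

```python
import operator
import statistics

def mc_optimal_n_markers(select_n_markers, seeds):
    """Median of the CV-selected number of markers over several random seeds."""
    picks = [select_n_markers(seed) for seed in seeds]
    return statistics.median_low(picks)

def rule_list_sig_plus(markers, ordered_rules, cutoff):
    """True (Sig+) if the patient meets every one of the first `cutoff` rules."""
    return all(compare(markers[name], cut)
               for name, compare, cut in ordered_rules[:cutoff])

# Stand-in for repeated cross-validation runs (hypothetical results per seed)
fake_cv_results = {0: 2, 1: 3, 2: 3, 3: 4, 4: 2}
n_markers = mc_optimal_n_markers(fake_cv_results.get, seeds=range(5))

# Hypothetical ordered rule list and one patient
rules = [("X1", operator.gt, 1.0), ("X2", operator.le, 5.0), ("X3", operator.gt, 0.0)]
patient = {"X1": 2.0, "X2": 4.0, "X3": -1.0}
in_sig_plus_2 = rule_list_sig_plus(patient, rules, cutoff=2)
in_sig_plus_3 = rule_list_sig_plus(patient, rules, cutoff=3)
```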

Performance evaluation: common mistakes in practice
- Using the entire dataset to build a model
- Selecting "important" variables by associating markers with outcomes (e.g., stepwise regression)
- Testing, and relying on, a lack-of-fit assessment of the resulting model
- Assuming the resulting model is correct and making inferences using the same dataset
Result: over-fitting.

Predictive significance via cross-validation
[Diagram: the data are split into training and test folds; a signature (Sig.) is derived on each training set and applied to the corresponding test fold to assign group labels; the group labels from all test folds are combined to compute a CV p-value pi. The procedure is repeated multiple times.]
The cross-validated p-values from M iterations (p1, p2, …, pM) are aggregated, and the predictive significance is the median p-value.
Note: other performance statistics (e.g., sensitivity, specificity, PPV, NPV, hazard ratio, odds ratio) can be calculated similarly.
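The cross-validation loop can be sketched generically in Python. This is an illustration under assumptions: `fit_signature` and `p_value` are hypothetical callables supplied by the user (the first derives a signature rule from training records only, the second tests the pooled held-out group labels against the outcome), and the fold construction is one simple choice among many.

```python
import random
import statistics

def cv_predictive_significance(data, fit_signature, p_value,
                               n_folds=5, n_repeats=10, seed=0):
    """Median cross-validated p-value over repeated k-fold splits.

    fit_signature(train_records) -> rule, where rule(record) -> group label
    p_value(data, labels)        -> p-value computed from held-out labels
    """
    rng = random.Random(seed)
    n = len(data)
    p_values = []
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[k::n_folds] for k in range(n_folds)]
        labels = [None] * n
        for test_idx in folds:
            held_out = set(test_idx)
            train = [data[i] for i in order if i not in held_out]
            rule = fit_signature(train)      # signature sees training folds only
            for i in test_idx:
                labels[i] = rule(data[i])    # label each held-out subject
        # One p-value per repeat, computed from the pooled held-out labels
        p_values.append(p_value(data, labels))
    return statistics.median(p_values), p_values
```

Because every subject is labelled by a signature that never saw that subject, the resulting p-value is not inflated by the signature search itself.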

Simulation Design
- Simulation model similar to that of Lipkovich et al. (2011, 2014), but with each predictor continuous rather than dichotomized
- Small to large trials (n = 100, 300, 500)
- Number of candidate predictors k = 10 and 18, with different correlation structures
- Effect size of 0.2 (low), 0.5 (medium), or 0.8 (high), where effect size = E(Y|Trt, sig+) - E(Y|ctrl, sig+)
[Figure: illustration of an effect size E(Y|Trt, sig+) - E(Y|ctrl, sig+) = 0.5.]

Simulation Results
- For the small effect size, none of the methods yields many testing p-values below 0.05 at any sample size from 100 to 500.
- Our proposed methods outperform SIDES in selection accuracy: the accuracy of SIDES is around 50%, while that of our proposed algorithms is 60% to 70% for large sample sizes.
- For effect sizes greater than medium (0.5) and sample sizes larger than 300, our proposed methods have most testing p-values below 0.05 and accuracy around 90%.
- The SIDES method underperforms in all scenarios.

Clinical Trial Case Study
- Data simulated based on a Phase III clinical trial
- Efficacy of a novel treatment is compared to the standard of care (Control) in patients with severe sepsis
- Treatment arm (n = 317) vs. Control arm (n = 153)
- Outcome: binary (survival)
- Available markers:
  - demographic and clinical covariates, i.e., age, time from first sepsis organ failure to start of drug, sum of baseline SOFA scores (cardiovascular, hematology, hepatic, renal, and respiration scores), number of baseline organ failures, pre-infusion APACHE II score, baseline Glasgow Coma Scale score, baseline activity of daily living score
  - laboratory markers, i.e., baseline local platelets, creatinine, serum IL-6 concentration, local bilirubin
- The overall outcome was not statistically significant (1-tailed p-value = 0.08), with survival rates of 40.7% and 34% in the treatment and control arms, respectively
- The original data were randomly split into two parts (training + testing)

Clinical Trial Case Study, contd.
Signature rules for the positive subgroup:
- Sequential-BATTing and AIM-RULE: pre-infusion APACHE II score <= 27
- AIM: meet at least two of the three thresholds: (1) pre-infusion APACHE II score < 27; (2) age < 54; (3) local bilirubin > 0.8
- SIDES: creatinine <= 1.1 and baseline Glasgow Coma Scale score > 11
[Table: 1-tailed p-values for the sepsis trial example]
Seq-BATTing has the most promising CV performance, and its signature is validated in the test dataset.

Summary
- The proposed subgroup identification algorithms perform well in simulations and in the case-study illustration.
- These algorithms provide threshold-based multivariate biomarker signatures.
- Variable selection is automatically built into these algorithms.
- Personalized medicine is a paradigm shift in drug development, which requires:
  - Advanced subgroup identification and subgroup analysis methods
  - Enrichment design and simulations
  - Smart diagnostic test development and clinical development strategy to overcome operational challenges
  - Collaboration between functional areas

References
Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (corr. 7th printing, 2013). Springer
Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and Regression Trees, 1st ed. Chapman and Hall/CRC
Friedman JH (1991) Multivariate Adaptive Regression Splines. Ann Stat 19:1–67. doi: 10.1214/aos/1176347963
Friedman JH, Popescu BE (2008) Predictive learning via rule ensembles. Ann Appl Stat 2:916–954. doi: 10.1214/07-AOAS148
Liu X, Minin V, Huang Y, et al. (2004) Statistical methods for analyzing tissue microarray data. J Biopharm Stat 14:671–685. doi: 10.1081/BIP-200025657
Chen G, Zhong H, Belousov A, Devanarayan V (2015) A PRIM approach to predictive-signature development for patient stratification. Stat Med 34:317–342. doi: 10.1002/sim.6343
Su X, Zhou T, Yan X, et al. (2008) Interaction Trees with Censored Survival Data. Int J Biostat. doi: 10.2202/1557-4679.1071
Su X, Tsai C-L, Wang H, et al. (2009) Subgroup Analysis via Recursive Partitioning. J Mach Learn Res 10:141–158.
Lipkovich I, Dmitrienko A, Denne J, Enas G (2011) Subgroup identification based on differential effect search: a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 30:2601–2621. doi: 10.1002/sim.4289
Lipkovich I, Dmitrienko A (2014) Strategies for identifying predictive biomarkers and subgroups with enhanced treatment effect in clinical trials using SIDES. J Biopharm Stat 24:130–153. doi: 10.1080/10543406.2013.856024
Berger JO, Wang X, Shen L (2014) A Bayesian approach to subgroup identification. J Biopharm Stat 24:110–129. doi: 10.1080/10543406.2013.856026
Devanarayan V, Cummins D, Tanzer L (1999) Application of GAM and tree models for assessing the role of drug resistance proteins in leukemia chemotherapy.
Tian L, Tibshirani R (2011) Adaptive index models for marker-based risk stratification. Biostatistics 12:68–86. doi: 10.1093/biostatistics/kxq047
Tian L, Alizadeh A, Gentles A, Tibshirani R (2012) A Simple Method for Detecting Interactions between a Treatment and a Large Number of Covariates. arXiv
Tibshirani R, Efron B (2002) Pre-validation and inference in microarrays. Stat Appl Genet Mol Biol. doi: 10.2202/1544-6115.1000
Foster JC, Taylor JM, Ruberg SJ (2011) Subgroup identification from randomized clinical trial data. Stat Med 30(24):2867–2880