New designs and paradigms for science-based oncology clinical trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute

Prognostic & Predictive Biomarkers in Oncology
Many cancer treatments benefit only a minority of the patients to whom they are administered – a large number needed to treat (NNT)
Being able to predict which patients are likely to benefit can
– Help patients get effective treatment
– Help control medical costs
– Improve the success rate of drug development
We now have better tools for understanding variability in outcome and response to treatment

Prognostic biomarker or signature – identifies patients with very good or very poor outcome on standard treatment
Predictive biomarker – identifies patients likely or unlikely to respond to a specific treatment

Major problems with prognostic studies of gene expression signatures
Inadequate focus on intended use – patient selection for the study and the approach to data analysis are not guided by an intended use
Reporting biased estimates of predictive value

Validation of a Prognostic Model
Completely independent validation dataset
Splitting the dataset into training and test sets – evaluate one completely specified model on the test set
Complete cross-validation

Leave-One-Out Cross-Validation for Estimating the Misclassification Rate of a Binary Classifier
Full dataset P = {1, 2, …, n}
Omit case 1: V1 = {1}; T1 = {2, 3, …, n}
Develop the classifier using training set T1
Classify the cases in V1 and record whether the classification is correct
Repeat for cases 2, 3, …
Total the number of misclassified cases
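The leave-one-out loop above can be sketched in a few lines. The nearest-centroid classifier with built-in feature selection used here is a hypothetical stand-in for whatever classifier-development algorithm is actually specified; note that feature selection is repeated inside every loop, as complete cross-validation requires.

```python
import numpy as np

def nearest_centroid_fit(X, y, n_features=10):
    """Select the n_features with the largest mean difference between
    the two classes, then compute per-class centroids on those features."""
    diff = np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))
    feats = np.argsort(diff)[-n_features:]
    centroids = np.array([X[y == c][:, feats].mean(axis=0) for c in (0, 1)])
    return feats, centroids

def nearest_centroid_predict(x, feats, centroids):
    d = np.linalg.norm(centroids - x[feats], axis=1)
    return int(np.argmin(d))

def loocv_error(X, y):
    """Leave-one-out cross-validated misclassification rate.
    All model development (including feature selection) is redone
    from scratch within each loop."""
    n = len(y)
    errors = 0
    for i in range(n):                      # V_i = {i}, T_i = the rest
        mask = np.ones(n, dtype=bool)
        mask[i] = False
        feats, centroids = nearest_centroid_fit(X[mask], y[mask])
        pred = nearest_centroid_predict(X[i], feats, centroids)
        errors += int(pred != y[i])
    return errors / n
```

With well-separated simulated classes the cross-validated error is close to zero; with no class signal it hovers near 0.5.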

Complete Cross-Validation
Cross-validation simulates the process of separately developing a model on one set of data and predicting for a test set of data not used in developing the model
All aspects of the model development process must be repeated within each loop of the cross-validation
– Feature selection
– Tuning-parameter optimization

The cross-validated estimate of misclassification error is an estimate of the prediction error of the model obtained by applying the specified algorithm to the full dataset

Partition the dataset D into K equal parts D1, D2, …, DK
First training set T1 = D − D1
Develop a completely specified prognostic model M1 using only the data in T1
Using M1, compute the risk group for the cases in D1
Repeat for the new training set T2 = D − D2: develop model M2 using only T2 and then score the cases in D2

Repeat for T3, …, TK
Calculate the Kaplan-Meier survival curve for each cross-validated risk group
Calculate the time-dependent ROC curve using the cross-validated risk groups and compute the AUC

The cross-validated estimate of the survival curves for the risk groups is an estimate of the future performance of the model obtained by applying the specified algorithm to the full dataset
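A minimal sketch of the K-fold procedure above for survival risk groups. The prognostic model used inside each fold (a correlation-weighted score that ignores censoring, split at the training-set median) is a deliberately simplified placeholder for the completely specified model-building algorithm; the point is that it is refit from scratch on each training set and only the held-out patients receive cross-validated risk-group labels.

```python
import numpy as np

def cv_risk_groups(X, time, K=5, seed=0):
    """K-fold cross-validated risk-group assignment.

    Each training fold T_k = D - D_k refits the prognostic model from
    scratch; the held-out patients in D_k are then scored and split at
    the training-set median.  Returns one cross-validated label per
    patient (0 = low risk, 1 = high risk); Kaplan-Meier curves would
    then be computed on these labels.
    """
    n = len(time)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)       # D_1, ..., D_K
    group = np.empty(n, dtype=int)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)    # T_k = D - D_k
        Xt, tt = X[train], time[train]
        # Model M_k: correlation-weighted score, fit on T_k only
        # (a simplification that ignores censoring).
        w = np.array([np.corrcoef(Xt[:, j], tt)[0, 1]
                      for j in range(X.shape[1])])
        risk_train = -Xt @ w            # higher score = shorter survival
        cut = np.median(risk_train)
        group[held_out] = (-X[held_out] @ w > cut).astype(int)
    return group
```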

To evaluate significance, the log-rank test cannot be used for cross-validated Kaplan-Meier curves because the survival times are not independent

Statistical significance can be properly evaluated by approximating the null distribution of the cross-validated log-rank statistic
– Permute the survival times and repeat the entire cross-validation procedure to generate new cross-validated Kaplan-Meier curves for the low-risk and high-risk groups
– Compute the log-rank statistic for the curves
– Repeat for many permutations
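The permutation scheme can be written generically. `statistic` is assumed to be a function that reruns the entire cross-validation procedure internally, which is what makes the resulting null distribution valid here despite the dependence among the cross-validated survival times.

```python
import numpy as np

def permutation_pvalue(X, time, statistic, n_perm=200, seed=0):
    """Permutation null distribution for a cross-validated statistic.

    statistic(X, time) must redo the ENTIRE cross-validation each
    time it is called.  Permuting the survival times and recomputing
    it approximates the null distribution of the cross-validated
    statistic, from which a p-value is obtained.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(X, time)
    null = np.array([statistic(X, rng.permutation(time))
                     for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```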

In many positive phase III clinical trials comparing a new treatment to control, most of the patients treated with the new treatment do not benefit
– Adjuvant breast cancer: 70% long-term disease-free survival on control, 80% disease-free survival on the new treatment. 70% of patients don't need the new treatment; of the remaining 30%, only one-third benefit.

Standard Paradigm of Phase III Clinical Trials
Broad eligibility
Base the primary analysis on the ITT eligible population
Don't size for subset analysis, allocate alpha for subset analysis, or trust subset analysis
– Only do subset analysis if the overall treatment effect is significant and the interaction is significant

Standard Paradigm Sometimes Leads to Large NNT
Small average treatment effects
Inconsistent results among studies
False-negative studies

Predictive Biomarkers
Cancers of a primary site often represent a heterogeneous group of diverse molecular entities which vary fundamentally with regard to
– the oncogenic mutations that cause them
– their responsiveness to specific treatments
Molecularly targeted drugs are likely to be effective only for tumors driven by de-regulation of the pathway that is the target of the drug

Neither conventional approaches to post-hoc subset analysis nor the broad-eligibility paradigm is adequate for genomics-based oncology clinical trials

How can we develop new drugs in a manner more consistent with modern disease biology and obtain reliable information about what regimens work for what kinds of patients?

When the Biology Is Clear, the Development Path Is Straightforward
Develop a classifier that identifies the patients likely to benefit from the new drug
Develop an analytically validated test – one that measures what it should, accurately and reproducibly
Design a focused clinical trial to evaluate the effectiveness of the new treatment in test-positive patients (i.e. those predicted likely to benefit)

Enrichment design
Using phase II data, develop a predictor of benefit (response) to the new drug
Patients predicted responsive → randomize between new drug and control
Patients predicted non-responsive → off study

Successful uses of the targeted enrichment design
Trastuzumab, pertuzumab, and ado-trastuzumab emtansine for HER2 over-expressed or amplified breast cancer
Vemurafenib, dabrafenib, and trametinib for BRAF-mutated melanoma
Crizotinib and ceritinib for ALK-translocated NSCLC
Afatinib for EGFR-mutated NSCLC

Advantages of the enrichment design
Targets a larger treatment effect, less diluted by non-sensitive tumors
Avoids exposing patients unlikely to benefit to the adverse effects of the drug until the drug is shown effective for those it is supposed to benefit
Clarity of interpretation

Regulatory Pathway for Test Companion diagnostic test with intended use of identifying patients who have disease subtype for which the drug is proven effective

If the drug is effective in test-positive patients, it can later be evaluated in test-negative patients
– Spares test-negative patients toxicity until the drug is shown effective in the target population it should work in

All-comers design
Invites poor design
– Too few test-positive patients
– Too many test-negative patients
– Failure to have a specific analysis plan
Invites inappropriate analysis
– Inappropriate requirement of not doing subset analysis unless the ITT test is significant and the interaction is significant

Evaluating the Efficiency of the Targeted Design
Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10: , 2004; correction and supplement 12:3229, 2006.
Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24: , 2005.

Two Clinical Trial Designs
Standard design – randomized comparison of the new drug E to control C, without using the test to screen patients
Enrichment design – test patients and randomize only test-positive patients
Size each design for power 0.90 at significance level 0.05

RandRat = n_untargeted / n_targeted
TE+ = treatment effect in the marker-positive stratum
TE− = treatment effect in the marker-negative stratum
p+ = proportion of patients who are marker-positive

RandRat = n_untargeted / n_targeted
If TE− = 0: RandRat = 1/(p+)²
– if p+ = 0.5, RandRat = 4

RandRat = n_untargeted / n_targeted
If TE− = TE+/2: RandRat = 4/(p+ + 1)²
– if p+ = 0.5, RandRat = 16/9 ≈ 1.78
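Both special cases follow from one dilution argument: the untargeted trial's average treatment effect is p+·TE+ + (1−p+)·TE−, and the required number of randomized patients scales inversely with the squared effect. A sketch:

```python
def rand_ratio(p_pos, te_neg_frac):
    """Ratio of randomized patients required by the untargeted design
    to those required by the enrichment design, under a simple dilution
    model: the untargeted trial's average effect is
    p+ * TE+ + (1 - p+) * TE-, and sample size scales as 1 / effect^2.

    te_neg_frac = TE- / TE+ (the marker-negative effect as a fraction
    of the marker-positive effect).
    """
    avg_effect = p_pos + (1.0 - p_pos) * te_neg_frac   # in units of TE+
    return 1.0 / avg_effect ** 2
```

With te_neg_frac = 0 this reduces to 1/(p+)², and with te_neg_frac = 0.5 to 4/(p+ + 1)², matching the two slides.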

Comparing E vs C on Survival or DFS: 5% two-sided significance and 90% power

% Reduction in Hazard    Number of Events Required
25%                      509
30%                      332
35%                      227
40%                      162
45%                      118
50%                      88
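The table is closely reproduced (to within rounding; the slides may use a slightly different approximation) by Schoenfeld's formula for the number of events in a 1:1 randomized log-rank comparison, E = 4(z_{α/2} + z_β)² / (ln HR)²:

```python
import math

Z_ALPHA = 1.959964   # z for two-sided alpha = 0.05
Z_BETA = 1.281552    # z for power = 0.90

def events_required(hazard_reduction):
    """Approximate number of events for a log-rank comparison with 1:1
    randomization (Schoenfeld's formula).  A 30% reduction in hazard
    corresponds to a hazard ratio HR = 0.70."""
    hr = 1.0 - hazard_reduction
    return 4 * (Z_ALPHA + Z_BETA) ** 2 / math.log(hr) ** 2
```

Note that the requirement is on the number of *events*, not patients; the accrual needed to observe these events depends on the event rate and follow-up.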

RandRat = n_untargeted / n_targeted
B+ = treatment effect in biology-positive patients
B− = treatment effect in biology-negative patients
TE+ = treatment effect in test-positive patients
TE+ = (PPV) B+ + (1 − PPV) B−

Sensitivity = Pr[Test is + | Biology is +] Specificity = Pr[Test is - | Biology is -] PPV = Pr[Biology is + | Test is +]
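These definitions combine via Bayes' rule, and the resulting PPV drives the dilution of the treatment effect in test-positive patients (TE+ = PPV·B+ + (1−PPV)·B−). A small sketch:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule:
    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev)),
    where prevalence = Pr[biology is +] (p+ in the slides)."""
    tp = sensitivity * prevalence              # Pr[test + and biology +]
    fp = (1.0 - specificity) * (1.0 - prevalence)  # Pr[test + and biology -]
    return tp / (tp + fp)

def diluted_effect(ppv_value, b_pos, b_neg=0.0):
    """Treatment effect in test-positive patients when the test is an
    imperfect surrogate for the biology: TE+ = PPV*B+ + (1-PPV)*B-."""
    return ppv_value * b_pos + (1.0 - ppv_value) * b_neg
```

For example, a test with 90% sensitivity and specificity at 50% prevalence has PPV = 0.9, so even with no effect in biology-negative patients, TE+ is only 90% of B+.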

[Table: RandRat as a function of test sensitivity, specificity, p+, B−/B+, and PPV – numeric values not recovered]

Implications for Early Phase Studies Need to design and size early phase studies to define predictive biomarker Need to establish an analytically validated test

Early Phase Studies
May need to evaluate several candidate markers
May need to evaluate several candidate tests – e.g. protein expression of the target, or gene amplification
Phase II trials sized for adequate numbers of test-positive patients and to determine the appropriate cut-point of positivity

Umbrella Design

Run-in
Short term (e.g. 4 weeks)
On test treatment or control
Occurs before randomization

Pharmacodynamic measure of treatment effect
Immunologic measure of treatment effect
Short-term quantitative imaging of change in tumor size
PSA or quantitative CTC
Run-in "response" used as a predictive biomarker, not as a surrogate endpoint

Cancer biology is complex, and it is not always possible to have the right single predictive classifier identified, with an appropriate cut-point, by the time the phase III trial of a new drug is ready to start accrual

Provides a general framework for adaptive enrichment, i.e. restricting the eligibility criteria during the course of the trial based on interim results
The framework includes threshold-based enrichment and enrichment based on multi-marker modeling
The framework handles multiple types of endpoints (continuous, binary, time-to-event)

One primary statistical significance test is performed at the end of the trial, including all randomized patients, of the strong null hypothesis that the new treatment is uniformly ineffective
Fixed sample size regardless of changes in eligibility
The framework identifies classes of significance tests which preserve the type I error even with time-dependent and data-dependent changes to the outcome distributions of patients

Example: single binary biomarker
If at an interim analysis the posterior probability that the test treatment is better than control for marker-negative patients is < ε, stop enrolling marker-negative patients
If the same is true for marker-positive patients, stop enrolling all patients

Example: single quantitative biomarker
Single biomarker B with K candidate cut-points b1, …, bK
Use a uniform prior distribution that the optimal cut-point is 0, or b1, …, bK, or bK+1
– cut-point 0 means that all patients benefit from the test treatment
– cut-point bK+1 means that no patients benefit from the test treatment
– cut-point bk means no treatment effect for patients with B ≤ bk; patients with B > bk do benefit
At an interim analysis, compute the posterior distribution of the true cut-point
If Pr_post[b_true > bk] > 1 − ε, stop accrual of patients with B ≤ bk
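A sketch of the interim posterior computation, under the simplifying (and here assumed) condition that the response probabilities p0 and p1 are known, so only the cut-point is uncertain:

```python
import numpy as np

def cutpoint_posterior(B, treated, response, cuts, p0, p1):
    """Posterior over candidate cut-points for a single quantitative
    biomarker, assuming (for illustration) known response rates:
    control responds with probability p0 everywhere; the test
    treatment responds with probability p1 above the true cut-point
    and p0 at or below it.  `cuts` should include 0 (everyone
    benefits) and a value above max(B) (no one benefits).  A uniform
    prior is placed over the candidate cut-points.
    """
    loglik = []
    for c in cuts:
        # Response probability for each patient under cut-point c.
        p = np.where(treated & (B > c), p1, p0)
        loglik.append(np.sum(np.log(np.where(response, p, 1.0 - p))))
    loglik = np.array(loglik)
    post = np.exp(loglik - loglik.max())   # stabilize before normalizing
    return post / post.sum()
```

The interim rule then sums the posterior over cut-points above bk and stops accrual of patients with B ≤ bk when that sum exceeds 1 − ε.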

Example: K binary biomarkers
At an interim analysis, evaluate whether any of the K biomarkers appears sufficiently predictive
If not, continue to enroll all patients and do not use any biomarker in the final analysis
If so, use the most predictive biomarker to restrict further accrual to marker-positive patients

Simulation Example: adaptive threshold enrichment
Single biomarker B uniformly distributed on (0,1)
K candidate thresholds
Binary response with probability p0(b) for control and p1(b) for the test treatment
– p0(b) = p0 for all b
– p1(b) = p0 for b ≤ x*; p1(b) = p1 for b > x*
True threshold x*
Single interim analysis

Simulation settings: p0 = 0.2, p1 = 0.5, K = 5, Ntot = 200, accrual 100 patients/yr
[Table: power and accrual for the adaptive vs non-adaptive design by true cut-point – numeric values not recovered]

Significance tests that preserve type I error with group-sequential adaptation

Survival Data
Use an intermediate endpoint – PFS or response – for adapting eligibility
tk = standardized log-rank statistic for patients accrued in period k but evaluated at the final analysis time

Adaptive enrichment designs
We have subsequently extended the analysis plans to
– Test the null hypothesis of no treatment effect for the eligible population selected at the final interim analysis
– Obtain an unbiased estimate of the treatment effect for the eligible population selected at the final interim analysis

Designs for when there are many candidate markers and multi-marker classifiers are of interest

The indication classifier is not a binary classifier of whether a patient has good prognosis or poor prognosis It is a predictive classifier that indicates whether the prognosis of a patient on E is better than the prognosis of the patient on C

The classifier need not use all the covariates, but variable selection must be determined using only the training set
– Variable selection may be based on selecting variables with apparent interactions with treatment, with the cut-off for variable selection determined by cross-validation within the training set for optimal classification

Example 1: Predictive Classifier
Optimize the cut-point of a pre-specified biomarker in the training set: if B > c, assign E; otherwise C
Optimize c to
– Exclude patients unlikely to benefit from E
– Maximize the E − C treatment effect for B > c
– Maximize the chance of obtaining statistical significance in the B > c subset of the validation set

Example 2: Predictive Classifier
Select one of K candidate biomarkers B1, …, BK based on training-set performance, and optimize the cut-point of that marker
Select Bk*; if Bk* > ck*, treat with E; otherwise C

Example 3: Predictive Classifier
Generate an optimal decision tree using the K candidate biomarkers and their candidate cut-points to optimize training-set performance
e.g. if IHC EGFR > 1 and KRAS = WT, then treat with E; otherwise C

Example 4: Predictive Classifier
Fit penalized logistic regression models relating response to a linear combination of log gene expression values, separately for each treatment group E and C
If the predicted response probability for a patient on E exceeds the predicted response probability on C by ≥ Δ, treat with E; otherwise C
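A numpy-only sketch of Example 4. The ridge-penalized logistic fit by gradient descent is a minimal stand-in for whatever penalized fitting procedure is actually used; the decision rule is the one on the slide.

```python
import numpy as np

def fit_ridge_logistic(X, y, lam=1.0, iters=500, lr=0.1):
    """Ridge-penalized logistic regression fit by gradient descent
    (a simple stand-in for the penalized fits described above)."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(iters):
        pred = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (pred - y) / n + lam * w   # penalized gradient
        grad_b = np.mean(pred - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def recommend_E(X_new, model_E, model_C, delta=0.1):
    """Treat with E when the predicted response probability on E
    exceeds the predicted response probability on C by at least delta."""
    wE, bE = model_E
    wC, bC = model_C
    pE = 1.0 / (1.0 + np.exp(-(X_new @ wE + bE)))
    pC = 1.0 / (1.0 + np.exp(-(X_new @ wC + bC)))
    return pE - pC >= delta
```

Two separate models are fit, one per treatment arm, so the recommendation reflects a predicted difference in outcome between E and C rather than prognosis alone.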

Key Idea
Replace multiple significance testing by the development of one predictive classifier
Obtain an unbiased estimate of the treatment effect in classifier-positive patients, and test the null hypothesis that the treatment effect in classifier-positive patients is 0

The cross-validated adaptive signature "design" can be used as a method for re-analysis of previously conducted phase III clinical trials, to find a predictive classifier and to internally validate that classifier.

At the conclusion of the trial, randomly partition the patients into K approximately equally sized sets D1, …, DK
Let D−i denote the full dataset minus the data for patients in Di
Omit the patients in D1; apply the defined algorithm to analyze the data in D−1 to obtain a classifier M−1
Classify each patient j in D1 using model M−1 and record the treatment recommendation, E or C

Repeat the above steps for all K loops of the cross-validation (developing the classifier from scratch in each loop and classifying the omitted patients)
When cross-validation is complete, each patient has been classified once according to which treatment is predicted optimal for them

Let P denote the set of patients for whom treatment E is predicted optimal
Compare outcomes for patients in P who actually received E to those in P who actually received C
– Compute Kaplan-Meier curves for those receiving E and those receiving C; the difference between these curves is our measure of treatment effect
– e.g. let z = the standardized log-rank statistic

Test of Significance for Effectiveness of E vs C
Compute the statistical significance of z by randomly permuting the treatment labels and repeating the entire cross-validation procedure to obtain a new set P′ and a new log-rank statistic z′
– Do this 1000 or more times to generate the permutation null distribution of the treatment effect for the patients in P

The size of the E vs C treatment effect for the indicated population is estimated by the Kaplan-Meier survival curves of E and of C in P
The predictive classifier to be used in the future is obtained by applying the classifier-development algorithm to the full dataset
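The cross-validated adaptive signature procedure can be sketched end-to-end. This version substitutes a binary response and a response-rate difference for the survival endpoint and log-rank statistic of the slides, and uses a deliberately simple single-split classifier, purely to keep the sketch self-contained; the key structural points (classifier developed from scratch inside each fold, every patient classified exactly once) are preserved.

```python
import numpy as np

def cv_adaptive_signature(X, treated, response, K=5, seed=0):
    """Cross-validated adaptive signature with a binary response.

    In each fold, a classifier recommending E is developed from
    scratch on the training folds; here the "classifier" picks the
    single feature whose high/low split shows the largest apparent
    E-vs-C effect difference, and recommends E above its training-set
    median.  Returns the boolean set P of patients for whom E is
    predicted optimal, and the E-vs-C response-rate difference in P.
    """
    n = len(response)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)
    prefer_E = np.zeros(n, dtype=bool)
    for held in folds:
        tr = np.setdiff1d(np.arange(n), held)
        best, best_j, best_cut = -np.inf, 0, 0.0
        for j in range(X.shape[1]):          # training-set-only development
            cut = np.median(X[tr, j])
            hi = tr[X[tr, j] > cut]
            eff = (response[hi][treated[hi]].mean()
                   - response[hi][~treated[hi]].mean())
            if eff > best:
                best, best_j, best_cut = eff, j, cut
        prefer_E[held] = X[held, best_j] > best_cut
    P = prefer_E
    diff = (response[P & treated].mean() - response[P & ~treated].mean())
    return P, diff
```

Significance would then be assessed by permuting the treatment labels and rerunning this whole function, exactly as in the permutation test described above.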

Adaptive Treatment Selection

Thoughts for future development
Need to study lots of combinations
– Need efficient new phase I combination designs, and perhaps greater use of phase II selection designs
– Use doses such that all components reach the serum levels necessary for full activity

Thoughts for future development
Need to reconsider when to do a phase III trial
– Different considerations for companies and for non-commercial research organizations
– It may not be optimal to do phase III trials targeting small effects, or to restrict combinations to drugs which have demonstrated prolongation of survival

How many drugs do we need to combine?
Naïve thoughts about tumor heterogeneity may give an overly pessimistic view
Many tumors may have only a small number (e.g. < 5) of "founder" mutations
These mutations are present in all sub-clones of the tumor and may be the best targets
The tumor is heterogeneous with regard to the other "driver" and passenger mutations

How many drugs do we need to combine? The other driver mutations develop in the context of the founder mutations and may only be viable in that context Reversing the effect of the founder mutations may lead to apoptosis of the tumor cells if the tumor is treated early enough

How many drugs do we need to combine?
We need to hit each founder mutation with either
– A high enough concentration of the inhibitor to inhibit even partially resistant cells
– Two inhibitors with distinct binding sites, or
– An inhibitor of the mutated target and an inhibitor "downstream" of the de-regulated pathway

We need to
Encourage development of kinase inhibitors more specific for the mutated protein, as in the case of vemurafenib
Systematically map and model the key "pathways" de-regulated in tumors
Develop synthetic-lethality strategies for targeting cells with de-regulated tumor suppressors and apoptosis controls

We need to Carry out more synthetic lethality studies on panels of tumor cell lines and provide public access Carry out deep single molecule sequencing studies of tumors and candidate precursors to identify cell of origin and founder mutations

We need to
Focus academic developmental therapeutics on a strategy complementary to that of industry
Fund some academic developmental therapeutics to address problems too risky for industry or for R01 small-lab research
Solicit the cooperation of industry to facilitate the conduct of academic studies of combinations involving the drugs of several companies

We need to
Provide oversight of regulatory agencies to ensure that they do not block sound adaptive biomarker-based designs

Acknowledgements Boris Freidlin (NCI) Wenyu Jiang (Queens U) Aboubakar Maitournam (U Niger) Noah Simon (U Washington)