Predictive Biomarkers and Their Use in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute

Slides:



Advertisements
Similar presentations
Patient Selection Markers in Drug Development Programs
Advertisements

A New Paradigm for the Utilization of Genomic Classifiers for Patient Selection in the Critical Path of Medical Product Development Richard Simon, D.Sc.
New Paradigms for Clinical Drug Development in the Genomic Era Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
It is difficult to have the right single completely defined predictive biomarker identified and analytically validated by the time the pivotal trial of.
Breakout Session 4: Personalized Medicine and Subgroup Selection Christopher Jennison, University of Bath Robert A. Beckman, Daiichi Sankyo Pharmaceutical.
Biomarker Analyses in CLEOPATRA: A Phase III, Placebo-Controlled Study of Pertuzumab in HER2- Positive, First-Line Metastatic Breast Cancer (MBC) Baselga.
Synopsis of FDA Colorectal Cancer Endpoints Workshop Michael J. O’Connell, MD Director, Allegheny Cancer Center Associate Chairman, NSABP Pittsburgh, PA.
Federal Institute for Drugs and Medical Devices | The Farm is a Federal Institute within the portfolio of the Federal Ministry of Health (Germany) How.
Transforming Correlative Science to Predictive Personalized Medicine Richard Simon, D.Sc. National Cancer Institute
Statistical Issues in Incorporating and Testing Biomarkers in Phase III Clinical Trials FDA/Industry Workshop; September 29, 2006 Daniel Sargent, PhD Sumithra.
Clinical Trial Designs for the Evaluation of Prognostic & Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer.
Use of Archived Tissue in Evaluating the Medical Utility of Prognostic & Predictive Biomarkers Richard Simon, D.Sc. Chief, Biometric Research Branch National.
Targeted (Enrichment) Design. Prospective Co-Development of Drugs and Companion Diagnostics 1. Develop a completely specified genomic classifier of the.
Clinical Trial Design Considerations for Therapeutic Cancer Vaccines Richard Simon, D.Sc. Chief, Biometric Research Branch, NCI
Estimation and Reporting of Heterogeneity of Treatment Effects in Observational Comparative Effectiveness Research Prepared for: Agency for Healthcare.
Statistical Issues in the Evaluation of Predictive Biomarkers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Moving from Correlative Science to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
New designs and paradigms for science- based oncology clinical trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Model and Variable Selections for Personalized Medicine Lu Tian (Northwestern University) Hajime Uno (Kitasato University) Tianxi Cai, Els Goetghebeur,
Use of Prognostic & Predictive Biomarkers in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Predictive Classifiers Based on High Dimensional Data Development & Use in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch.
Richard Simon, D.Sc. Chief, Biometric Research Branch
Moving from Correlative Studies to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute brb.nci.nih.gov.
Statistical Challenges for Predictive Onclogy Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Predictive Analysis of Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Guidelines on Statistical Analysis and Reporting of DNA Microarray Studies of Clinical Outcome Richard Simon, D.Sc. Chief, Biometric Research Branch National.
Re-Examination of the Design of Early Clinical Trials for Molecularly Targeted Drugs Richard Simon, D.Sc. National Cancer Institute linus.nci.nih.gov/brb.
Using Predictive Biomarkers in the Design of Adaptive Phase III Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute.
Use of Genomics in Clinical Trial Design and How to Critically Evaluate Claims for Prognostic & Predictive Biomarkers Richard Simon, D.Sc. Chief, Biometric.
Thoughts on Biomarker Discovery and Validation Karla Ballman, Ph.D. Division of Biostatistics October 29, 2007.
Novel Clinical Trial Designs for Oncology
Trastuzumab [Genentech Inc.] Labeling Supplement to Include FISH Testing as a Method to Select Patients for Treatment FDA Clinical Review December 5, 2001.
Predictive Analysis of Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Prospective Subset Analysis in Therapeutic Vaccine Studies Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Personalized Predictive Medicine and Genomic Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Use of Prognostic & Predictive Biomarkers in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Some Statistical Aspects of Predictive Medicine
Cancer Clinical Trials in the Genomic Era Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Validation of Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Development and Use of Predictive Biomarkers Dr. Richard Simon.
Use of Prognostic & Predictive Genomic Biomarkers in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute.
EDRN Approaches to Biomarker Validation DMCC Statisticians Fred Hutchinson Cancer Research Center Margaret Pepe Ziding Feng, Mark Thornquist, Yingye Zheng,
Personalized Predictive Medicine and Genomic Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
Steps on the Road to Predictive Oncology Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Moving from Correlative Studies to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Therapeutic Equivalence & Active Control Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute.
Use of Candidate Predictive Biomarkers in the Design of Phase III Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer.
The Use of Predictive Biomarkers in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Steps on the Road to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Integration of Diagnostic Markers into the Development Process of Targeted Agents Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer.
Adaptive Designs for Using Predictive Biomarkers in Phase III Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute.
Federal Institute for Drugs and Medical Devices The BfArM is a Federal Institute within the portfolio of the Federal Ministry of Health (BMG) The use of.
Using Predictive Classifiers in the Design of Phase III Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute.
Personalized Predictive Medicine and Genomic Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
New Approaches to Clinical Trial Design Development of New Drugs & Predictive Biomarkers Richard Simon, D.Sc. Chief, Biometric Research Branch National.
Introduction to Design of Genomic Clinical Trials Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Organization of statistical research. The role of Biostatisticians Biostatisticians play essential roles in designing studies, analyzing data and.
Sample Size Determination
Steps on the Road to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Advanced Clinical Trial Educational Session Richard Simon, D.Sc. Biometric Research Branch National Cancer Institute
Design & Analysis of Phase III Trials for Predictive Oncology Richard Simon Chief, Biometric Research Branch National Cancer Institute
BIOSTATISTICS Lecture 2. The role of Biostatisticians Biostatisticians play essential roles in designing studies, analyzing data and creating methods.
Moving From Correlative Science to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute
Response, PFS or OS – what is the best endpoint in advanced colorectal cancer? Marc Buyse IDDI, Louvain-la-Neuve & Hasselt University
© 2010 Jones and Bartlett Publishers, LLC. Chapter 12 Clinical Epidemiology.
Uses of Diagnostic Tests Screen (mammography for breast cancer) Diagnose (electrocardiogram for acute myocardial infarction) Grade (stage of cancer) Monitor.
 Adaptive Enrichment Designs for Confirmatory Clinical Trials Specifying the Intended Use Population and Estimating the Treatment Effect Richard Simon,
Sample Size Determination
Björn Bornkamp, Georgina Bermann
Presentation transcript:

Predictive Biomarkers and Their Use in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute

BRB Website Powerpoint presentations and audio files Reprints & Technical Reports BRB-ArrayTools software BRB-ArrayTools Data Archive Sample Size Planning for Targeted Clinical Trials

Many cancer treatments benefit only a small proportion of the patients to whom they are administered –Many early stage patients don’t need systemic treatment –Many tumors are not sensitive to the drugs administered Targeting treatment to the right patients –Benefits patients –May reduce health care costs –May improve the success rate of clinical development

Conducting a phase III trial in the traditional way with tumors of a specified site/stage/pre-treatment category may result in a false negative trial –Unless a sufficiently large proportion of the patients have tumors driven by the targeted pathway

Positive results in traditionally designed broad eligibility phase III trials may result in subsequent treatment of many patients who do not benefit

“Biomarkers” Surrogate endpoints –A measurement made before and after treatment to determine whether the treatment is working Prognostic markers –A measurement made before treatment to indicate long-term outcome for patients untreated or receiving standard treatment Predictive classifiers –A measurement made before treatment to select good patient candidates for the specific treatment

Surrogate Endpoints It is very difficult to properly validate a biomarker as a surrogate for clinical outcome. It requires a series of randomized trials with both the candidate biomarker and clinical outcome measured –Must demonstrate that treatment vs control differences for the candidate surrogate are concordant with the treatment vs control differences for clinical outcome –It is not sufficient to demonstrate that the biomarker responders survive longer than the biomarker non- responders

Biomarkers for use as endpoints in phase I or II studies need not be validated as surrogates for clinical outcome Unvalidated biomarkers can also be used for early “futility analyses” in phase III trials

Prognostic Factors Most prognostic factors are not used because they are not therapeutically relevant Most prognostic factor studies use a convenience sample of patients for whom tissue is available. Often the patients are too heterogeneous to support therapeutically relevant conclusions Prognostic factors in a focused population can be therapeutically useful –Oncotype DX

Validation=Fit for Purpose FDA terminology of “valid biomarker” and “probable valid biomarker” are not applicable to predictive classifiers “Validation” has meaning only as fitness for purpose and the purpose of predictive classifiers are completely different than for surrogate endpoints

The Roadmap 1.Develop a completely specified predictive classifier of the patients likely to benefit from a new drug 2.Establish reproducibility of measurement of the classifier 3.Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan.

Guiding Principle The data used to develop the classifier must be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier –Developmental studies are exploratory –Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier

Predictive Classifier Based on biological measurements of one or more genes, transcripts, or protein products If multivariate, includes a specified form for combining measurements of components to provide a binary prediction Weights and cut-off for positivity specified

Predictive Index Based on biological measurements of one or more genes, transcripts, or protein products If multivariate, includes a specified form for combining measurements of components to provide a multi-level or quantitative index Weights specified

Development of Genomic Classifiers Single gene or protein based on knowledge of therapeutic target –Indicates whether drug can inhibit targeted gene or protein and whether tumor progression is driven by the targeted pathway Empirically determined based on evaluation of a set of candidate genes or assays –e.g. EGFR assays Empirically determined based on genome-wide correlating gene expression to response

Developing Predictive Classifiers During phase II development or After failed phase III trial using archived specimens. Adaptively during early portion of phase III trial.

Developing Predictive Classifiers To predict response from new drug using response data for single arm phase II trials To predict non-response from control regimen using response data for control treated patients To predict preferential response or delayed progression from randomized phase II (or phase III) trial data of new drug vs control

New Drug Developmental Strategy (I) Develop a predictive classifier that identifies the patients likely to benefit from the new drug Develop a reproducible assay for the classifier Use the classifier to restrict eligibility to a prospectively planned evaluation of the new drug Demonstrate that the new drug is effective in the prospectively defined set of patients determined by the classifier

Using phase II data, develop predictor of response to new drug Develop Predictor of Response to New Drug Patient Predicted Responsive New Drug Control Patient Predicted Non-Responsive Off Study

Applicability of Design I Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug –eg Herceptin With a strong biological basis for the classifier, it may be unacceptable to expose classifier negative patients to the new drug Without strong biological basis or adequate phase II data, FDA may have difficulty approving the test based on this phase III design

We don’t think that this drug will help you because your tumor is test negative. But we need to show the FDA that a drug we don’t think will help test negative patients actually doesn’t

Evaluating the Efficiency of Strategy (I) Simon R and Maitnourim A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10: , 2004; Correction and supplement 12:3229, 2006 Maitnourim A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24: , reprints and interactive sample size calculations at

Compared two Clinical Trial Designs Standard design –Randomized comparison of T to C without screening or selection using classifier Targeted design –Obtain tissue and evaluate classifier on candidate patients –Randomize only classifier + patients –Classifier – patients not further studied

Efficiency of targeted design relative to standard design depends on –proportion of patients test positive –effectiveness of new drug (compared to control) for test negative patients When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients The targeted design may require fewer or more screened patients than the standard design

No treatment Benefit for Assay - Patients n std / n targeted Proportion Assay Positive RandomizedScreened

Treatment Benefit for Assay – Pts Half that of Assay + Pts n std / n targeted Proportion Assay Positive RandomizedScreened

Trastuzumab Metastatic breast cancer 234 randomized patients per arm 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided.05 level If benefit were limited to the 25% assay + patients, overall improvement in survival would have been 3.375% –4025 patients/arm would have been required If assay – patients benefited half as much, 627 patients per arm would have been required

Treatment Hazard Ratio for Marker Positive Patients Number of Events for Targeted Design Number of Events for Traditional Design Percent of Patients Marker Positive 20%33%50% Comparison of Targeted to Untargeted Design Simon R, Development and Validation of Biomarker Classifiers for Treatment Selection, JSPI

Web Based Software for Comparing Sample Size Requirements

Developmental Strategy (II) Develop Predictor of Response to New Rx Predicted Non- responsive to New Rx Predicted Responsive To New Rx Control New RXControl New RX

Developmental Strategy (II) Do not use the diagnostic to restrict eligibility, but to structure a prospective analysis plan Having a prospective analysis plan is essential “Stratifying” (balancing) the randomization is not sufficient but ensures that all randomized patients will have tissue available The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets; not to modify or refine the classifier The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier

Analysis Plan A (confidence in classifier) Compare the new drug to the control for classifier positive patients –If p + >0.05 make no claim of effectiveness –If p +  0.05 claim effectiveness for the classifier positive patients and Compare new drug to control for classifier negative patients using 0.05 threshold of significance

Sample size for Analysis Plan A 88 events in classifier + patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power If 25% of patients are positive, then when there are 88 events in positive patients there will be about 264 events in negative patients –264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level

Study-wise false positivity rate is limited to 5% with analysis plan A It is not necessary or appropriate to require that the treatment vs control difference be significant overall before doing the analysis within subsets

Analysis Plan B (confidence in overall effect) Compare the new drug to the control overall for all patients ignoring the classifier. –If p overall  0.03 claim effectiveness for the eligible population as a whole Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients –If p subset  0.02 claim effectiveness for the classifier + patients.

This analysis strategy is designed to not penalize sponsors for having developed a classifier It provides sponsors with an incentive to develop genomic classifiers

Sample size for Analysis Plan B To have 90% power for detecting uniform 33% reduction in overally hazard at 3% two-sided level requires 297 events (instead of 263 for similar power at 5% level) If 25% of patients are positive, when there are 297 total events there will be approximately 75 events in positive patients –75 events provides 75% power for detecting 50% reduction in hazard at 2% two-sided significance level –By delaying evaluation in test positive patients, 80% power is achieved with 84 events and 90% power with 109 events

Analysis Plan C Test for interaction between treatment effect in test positive patients and treatment effect in test negative patients If interaction is significant at level  int then compare treatments separately for test positive patients and test negative patients Otherwise, compare treatments overall

Sample Size Planning for Analysis Plan C 88 events in classifier + patients needed to detect 50% reduction in hazard at 5% two- sided significance level with 90% power If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients –264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level

Simulation Results for Analysis Plan C Using  int =0.10, the interaction test has power 93.7% when there is a 50% reduction in hazard in test positive patients and no treatment effect in test negative patients A significant interaction and significant treatment effect in test positive patients is obtained in 88% of cases under the above conditions If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases

Web Based Software for Designing Stratified Trials Using Predictive Biomarkers

The Roadmap 1.Develop a completely specified genomic classifier of the patients likely to benefit from a new drug 2.Establish reproducibility of measurement of the classifier 3.Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan.

Guiding Principle The data used to develop the classifier must be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier –Developmental studies are exploratory And not closely regulated by FDA –Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier

Test How does this approach differ from conducting a RCT comparing a new treatment to a control and then performing numerous post-hoc subset analyses?

Use of Archived Samples Develop a binary classifier of the patients most likely to benefit from the new treatment using archived specimens from a “negative” phase III clinical trial Evaluate the new treatment compared to control treatment in the classifier positive subset in a separate clinical trial –Prospective targeted type I trial –Using archived specimens from a second previously conducted clinical trial

Biomarker Adaptive Threshold Design Wenyu Jiang, Boris Freidlin & Richard Simon JNCI 99: , 2007

Biomarker Adaptive Threshold Design Randomized phase III trial comparing new treatment E to control C Survival or DFS endpoint

Biomarker Adaptive Threshold Design Have identified a predictive index B thought to be predictive of patients likely to benefit from E relative to C Eligibility not restricted by biomarker No threshold for biomarker determined

Analysis Plan S(b)=log likelihood ratio statistic for treatment versus control comparison in subset of patients with B  b Compute S(b) for all possible threshold values Determine T=max{S(b)} Compute null distribution of T by permuting treatment labels –Permute the labels of which patients are in which treatment group –Re-analyze to determine T for permuted data –Repeat for 10,000 permutations

If the data value of T is significant at 0.05 level using the permutation null distribution of T, then reject null hypothesis that E is ineffective Compute point and bootstrap confidence interval estimates of the threshold b

Adaptive Biomarker Threshold Design Sample size planning methods described by Jiang, Freidlin and Simon, JNCI 99: , 2007

Adaptive Signature Design An adaptive design for generating and prospectively testing a gene expression signature for sensitive patients Boris Freidlin and Richard Simon Clinical Cancer Research 11:7872-8, 2005

Adaptive Signature Design End of Trial Analysis Compare E to C for all patients at significance level 0.03 –If overall H 0 is rejected, then claim effectiveness of E for eligible patients –Otherwise

Otherwise: –Using only the first half of patients accrued during the trial, develop a binary classifier that predicts the subset of patients most likely to benefit from the new treatment E compared to control C –Compare E to C for patients accrued in second stage who are predicted responsive to E based on classifier Perform test at significance level 0.02 If H 0 is rejected, claim effectiveness of E for subset defined by classifier

Treatment effect restricted to subset. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients. TestPower Overall.05 level test46.7 Overall.04 level test43.1 Sensitive subset.01 level test (performed only when overall.04 level test is negative) 42.2 Overall adaptive signature design85.3

Conclusions New technology makes it increasingly feasible to identify which patients are likely or unlikely to benefit from a specified treatment Targeting treatment can benefit patients, reduce health care costs and improve the success rate of new drug development

Conclusions Some of the conventional wisdom about “biomarkers”, how to develop predictive classifiers and how to use them in clinical trials is seriously flawed Prospectively specified analysis plans for phase III studies are essential to achieve reliable results –Biomarker analysis does not mean exploratory analysis except in developmental studies

Conclusions Achieving the potential of new technology requires paradigm changes in “correlative science” and in important aspects of design and analysis of clinical trials

Collaborators Boris Freidlin Aboubakar Maitournam Kevin Dobbin Wenu Jiang Yingdong Zhao