Moving from Correlative Science to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute


1 Moving from Correlative Science to Predictive Medicine Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute http://linus.nci.nih.gov/brb

2 BRB Website brb.nci.nih.gov Powerpoint presentations and audio files Reprints & Technical Reports BRB-ArrayTools software BRB-ArrayTools Data Archive –100+ published cancer gene expression datasets with clinical annotations Sample Size Planning for Clinical Trials with Predictive Biomarkers

3 Kinds of Biomarkers Surrogate endpoint –Pre & post rx, early measure of clinical outcome Pharmacodynamic –Pre & post rx, measures an effect of rx on disease Prognostic –Which patients need rx Predictive –Which patients are likely to benefit from a specific rx Product characterization For biological rx

4 Surrogate Endpoints It is extremely difficult to properly validate a biomarker as a surrogate for clinical outcome. It requires a series of randomized trials with both the biomarker and clinical outcome measured and demonstrating correlated differences Even the concept of surrogate is dubious because often a large treatment effect on PFS corresponds to a small treatment effect on survival

5 Pharmacodynamic biomarkers used as endpoints in phase I or II studies need not be validated surrogates of clinical outcome Unvalidated biomarkers can be used for early “futility analyses” in phase III trials

6 Prognostic Biomarkers Most prognostic factors are not used because they are not therapeutically relevant Most prognostic factor studies are poorly designed –They are not focused on a clear therapeutic decision context –They use a convenience sample of patients for whom tissue is available. Generally the patients are too heterogeneous to support therapeutically relevant conclusions –They address statistical significance rather than predictive accuracy relative to standard prognostic factors

7 Pusztai et al. The Oncologist 8:252-8, 2003 939 articles on “prognostic markers” or “prognostic factors” in breast cancer in past 20 years ASCO guidelines only recommend routine testing for ER, PR and HER-2 in breast cancer “With the exception of ER or progesterone receptor expression and HER-2 gene amplification, there are no clinically useful molecular predictors of response to any form of anticancer therapy.”

8 Prognostic Biomarkers Can be Therapeutically Relevant 3-5% of node negative ER+ breast cancer patients require or benefit from systemic rx other than endocrine rx Prognostic biomarker development should focus on specific therapeutic decision contexts

9 Key Features of OncotypeDx Development Identification of important therapeutic decision context –Prognostic marker development was based on patients with node negative ER positive breast cancer receiving tamoxifen as only systemic treatment –Use of patients in NSABP clinical trials Staged development and validation –Separation of data used for test development from data used for test validation Development of robust assay with rigorous analytical validation –21 gene RTPCR assay for FFPE tissue –Quality assurance by single reference laboratory operation

10 Predictive Classifiers Most cancer treatments benefit only a minority of patients to whom they are administered –Particularly true for molecularly targeted drugs Being able to predict which patients are likely to benefit would –save patients from unnecessary toxicity, and enhance their chance of receiving a drug that helps them –Help control medical costs –Improve the success rate of clinical drug development

11 Cancers of a primary site are often a heterogeneous grouping of diverse molecular diseases The molecular diseases vary enormously in their responsiveness to a given treatment It is feasible (but difficult) to develop prognostic markers that identify which patients need systemic treatment and which have tumors likely to respond to a given treatment –e.g. breast cancer and ER/PR, Her2

12 Conducting a phase III trial in the traditional way with tumors of a specified site/stage/pre-treatment category may –Result in a false negative trial Unless a sufficiently large proportion of the patients have tumors driven by the targeted pathway –Require a very large number of randomized patients to detect the small average treatment effect

13 Positive results in traditionally designed broad eligibility phase III trials may result in subsequent treatment of many patients who do not benefit

14

15 Predictive Biomarkers In the past often studied as un-focused post-hoc subset analyses of RCTs. –Numerous subsets examined –Same data used to define subsets for analysis and for comparing treatments within subsets –Multiple comparisons with no control of type I error Led to conventional wisdom –Only for hypothesis generation –Only valid if overall treatment difference is significant

16

17 The Roadmap 1.Develop a completely specified genomic classifier of the patients likely to benefit from a new drug 2.Establish analytical and pre-analytical validity of the classifier 3.Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan that preserves the overall type-I error of the study.

18 Guiding Principle The data used to develop the classifier must be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier –Developmental studies are exploratory –Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier

19 New Drug Developmental Strategy I Restrict entry to the phase III trial based on the binary predictive classifier, i.e. targeted design

20 Using phase II data, develop predictor of response to new drug. Patient predicted responsive: New Drug vs Control. Patient predicted non-responsive: Off Study.

21 Applicability of Design I Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug –eg trastuzumab With a strong biological basis for the classifier, it may be unacceptable to expose classifier negative patients to the new drug Analytical validation, biological rationale and phase II data provide basis for regulatory approval of the test Phase III study focused on test + patients to provide data for approving the drug

22 Evaluating the Efficiency of Strategy (I) Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006 Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005. Reprints and interactive sample size calculations at http://linus.nci.nih.gov

23 Relative efficiency of targeted design depends on –proportion of patients test positive –effectiveness of new drug (compared to control) for test negative patients When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients The targeted design may require fewer or more screened patients than the standard design
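The dependence on the test-positive proportion and on the effect in test-negative patients can be sketched with a commonly used approximation: the required number of randomized patients scales with the inverse square of the average treatment effect. The function names and the symbols `gamma`, `delta_pos` and `delta_neg` below are illustrative assumptions, not notation from the slides, and the sketch ignores differences in outcome variance between designs.

```python
def randomized_ratio(gamma, delta_pos, delta_neg):
    """Approximate (standard design n) / (targeted design n) for randomized patients.

    gamma     : proportion of patients who are test positive
    delta_pos : treatment effect in test-positive patients
    delta_neg : treatment effect in test-negative patients
    Assumption: sample size is proportional to 1 / (effect size)^2.
    """
    diluted_effect = gamma * delta_pos + (1.0 - gamma) * delta_neg
    return (delta_pos / diluted_effect) ** 2


def screened_ratio(gamma, delta_pos, delta_neg):
    """Approximate (standard design n) / (patients screened for the targeted design).

    The targeted design must screen about n_targeted / gamma patients
    to randomize n_targeted test-positive patients.
    """
    return gamma * randomized_ratio(gamma, delta_pos, delta_neg)


if __name__ == "__main__":
    # 25% test positive, no benefit in test negatives
    print(randomized_ratio(0.25, 1.0, 0.0), screened_ratio(0.25, 1.0, 0.0))
    # 25% test positive, half the benefit in test negatives
    print(randomized_ratio(0.25, 1.0, 0.5), screened_ratio(0.25, 1.0, 0.5))
```

A screened ratio above 1 means the targeted design screens fewer patients than the standard design enrolls; below 1 it screens more, which is the second scenario here.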

24 Trastuzumab (Herceptin) Metastatic breast cancer 234 randomized patients per arm 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level If benefit were limited to the 25% assay + patients, overall improvement in survival would have been 3.375% –4025 patients/arm would have been required
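The arithmetic on this slide can be checked with a standard two-proportion sample size formula. The sketch below assumes a normal-approximation formula with a Fleiss-type continuity correction; the slides do not say which formula was used, so the results should land near the quoted 234 and 4025 per arm rather than match them exactly. The function name `n_per_arm` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.90):
    """Patients per arm to compare two proportions (two-sided test).

    Assumption: pooled-variance normal approximation with a
    Fleiss-type continuity correction.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    delta = abs(p_treated - p_control)
    p_bar = (p_control + p_treated) / 2
    n = (
        z_a * sqrt(2 * p_bar * (1 - p_bar))
        + z_b * sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2 / delta ** 2
    # Continuity correction inflates n slightly
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n_cc)


baseline = 0.67
full_effect = 0.135                  # improvement if every patient benefits
diluted_effect = 0.25 * full_effect  # benefit limited to the 25% assay-positive patients

n_full = n_per_arm(baseline, baseline + full_effect)
n_diluted = n_per_arm(baseline, baseline + diluted_effect)

if __name__ == "__main__":
    print(diluted_effect, n_full, n_diluted)
```

Diluting the effect by a factor of 4 inflates the required sample size by roughly a factor of 16 to 17, which is the point of the slide.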

25 Web Based Software for Comparing Sample Size Requirements http://linus.nci.nih.gov/brb/

26

27

28

29 Developmental Strategy (II) Develop Predictor of Response to New Rx. Predicted Responsive to New Rx: New Rx vs Control. Predicted Non-responsive to New Rx: New Rx vs Control.

30 Developmental Strategy (II) Do not use the diagnostic to restrict eligibility, but to structure a prospective analysis plan Having a prospective analysis plan is essential “Stratifying” (balancing) the randomization is useful to ensure that all randomized patients have tissue available but is not a substitute for a prospective analysis plan The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets; not to modify or refine the classifier The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier

31 Validation of EGFR biomarkers for selection of EGFR-TK inhibitor therapy for previously treated NSCLC patients 2nd line NSCLC with specimen FISH Testing FISH + (~ 30%) FISH − (~ 70%) Erlotinib Pemetrexed Erlotinib Pemetrexed Outcome 1° PFS 2° OS, ORR PFS endpoint –90% power to detect 50% PFS improvement in FISH+ –90% power to detect 30% PFS improvement in FISH− Evaluate EGFR IHC and mutations as predictive markers Evaluate the role of RAS mutation as a negative predictive marker 957 patients 4 years accrual, 1196 patients 1-2 years minimum additional follow-up

32 Analysis Plan B (Limited confidence in test) Compare the new drug to the control overall for all patients ignoring the classifier. –If p_overall ≤ 0.03 claim effectiveness for the eligible population as a whole Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients –If p_subset ≤ 0.02 claim effectiveness for the classifier + patients.
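The two-step rule of Analysis Plan B can be written as a small decision function. This is a minimal sketch: the function name and the returned strings are hypothetical, and the thresholds are read as upper bounds of 0.03 and 0.02. Because the two thresholds sum to 0.05, the chance of any false claim under the global null is bounded by 0.05.

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Return the claim supported under Analysis Plan B.

    p_overall : p-value for new drug vs control in all randomized patients
    p_subset  : p-value for new drug vs control in classifier-positive patients
    """
    if p_overall <= alpha_overall:
        return "effective in eligible population as a whole"
    # The subset test is performed only if the overall test fails.
    if p_subset <= alpha_subset:
        return "effective in classifier-positive patients"
    return "no effectiveness claim"


if __name__ == "__main__":
    print(analysis_plan_b(0.01, 0.50))
    print(analysis_plan_b(0.20, 0.01))
    print(analysis_plan_b(0.04, 0.03))
```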

33 This analysis strategy is designed to not penalize sponsors for having developed a classifier It provides sponsors with an incentive to develop genomic classifiers Incentives are appropriate because developing new drugs with companion diagnostics increases the complexity and cost of the drug development process

34 Analysis Plan C (adaptive) Test for difference (interaction) between treatment effect in test positive patients and treatment effect in test negative patients If interaction is significant at level α_int then compare treatments separately for test positive patients and test negative patients Otherwise, compare treatments overall
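The branching in Analysis Plan C can be sketched as follows. The function name, the dictionary keys and the 0.05 level used for the follow-up comparisons are assumptions made for illustration; the slide specifies only the interaction level.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Decide which treatment comparisons to report under Analysis Plan C.

    alpha_int : significance level for the treatment-by-test interaction
    alpha     : level for the follow-up comparisons (assumed, not from the slide)
    """
    if p_interaction <= alpha_int:
        # Treatment effect appears to differ by test result: analyze separately.
        return {
            "analysis": "separate",
            "test_positive_significant": p_positive <= alpha,
            "test_negative_significant": p_negative <= alpha,
        }
    # No evidence of differential effect: a single overall comparison.
    return {"analysis": "overall", "overall_significant": p_overall <= alpha}


if __name__ == "__main__":
    print(analysis_plan_c(0.02, 0.001, 0.60, 0.04))
    print(analysis_plan_c(0.45, 0.03, 0.08, 0.004))
```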

35 Sample Size Planning for Analysis Plan C 88 events in test + patients needed to detect 50% reduction in hazard at 5% two- sided significance level with 90% power If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients –264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
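These event counts are consistent with the usual approximation for a log-rank comparison with 1:1 randomization, in which the required number of events is 4(z_{1-α/2} + z_{power})² / (ln HR)². The sketch below assumes that formula, reads "33% reduction in hazard" as a hazard ratio of 0.67, and assumes equal event rates in test positive and test negative patients; the function names are hypothetical.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Events required to detect a hazard ratio (two-sided test, 1:1 randomization)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return 4 * z ** 2 / log(hazard_ratio) ** 2


def power_for_events(events, hazard_ratio, alpha=0.05):
    """Approximate power of the two-sided test given a number of events."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(sqrt(events) / 2 * abs(log(hazard_ratio)) - z_a)


events_positive = ceil(events_needed(0.5))           # 50% reduction in hazard
# With 25% of patients test positive, negatives contribute about 3x as many events
events_negative = events_positive * (0.75 / 0.25)
power_negative = power_for_events(events_negative, 0.67)

if __name__ == "__main__":
    print(events_positive, events_negative, round(power_negative, 3))
```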

36 Simulation Results for Analysis Plan C Using α_int = 0.10, the interaction test has power 93.7% when there is a 50% reduction in hazard in test positive patients and no treatment effect in test negative patients A significant interaction and significant treatment effect in test positive patients is obtained in 88% of cases under the above conditions If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases
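The reported power of the interaction test can be roughly reproduced without simulation. The sketch below is a normal approximation, not the simulation behind the slide: it assumes 88 events in test positive and 264 events in test negative patients, a variance of 4/events for each estimated log hazard ratio, and a one-sided interaction test at the 0.10 level. All three are assumptions made for this illustration.

```python
from math import log, sqrt
from statistics import NormalDist


def interaction_power(events_pos, events_neg, hr_pos, hr_neg, alpha_int=0.10):
    """Approximate power of a one-sided test for treatment-by-test interaction.

    Each log hazard ratio estimate is treated as normal with variance 4 / events.
    """
    se_difference = sqrt(4 / events_pos + 4 / events_neg)
    z_crit = NormalDist().inv_cdf(1 - alpha_int)   # one-sided critical value (assumed)
    effect = abs(log(hr_pos) - log(hr_neg))
    return NormalDist().cdf(effect / se_difference - z_crit)


# 50% hazard reduction in test positives, no effect in test negatives
power = interaction_power(88, 264, hr_pos=0.5, hr_neg=1.0)

if __name__ == "__main__":
    print(round(power, 3))
```

Under these assumptions the approximation comes out close to the 93.7% quoted on the slide.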

37 Development of Genomic Classifiers Single gene or protein based on knowledge of therapeutic target Single gene or protein culled from set of candidate genes identified based on imperfect knowledge of therapeutic target Empirically determined based on correlating gene expression to patient outcome after treatment –Pusztai, Anderson, Hess. Clin Cancer Res 2007;13:6080

38 Myth Huge sample sizes are needed to develop effective predictive classifiers

39 Sample Size Planning References K Dobbin, R Simon. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38, 2005 K Dobbin, R Simon. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics (In Press)

40 Sample size as a function of effect size (log-base 2 fold-change between classes divided by standard deviation). Two different tolerances shown. Each class is equally represented in the population. 22000 genes on an array.

41 Development of Genomic Classifiers During phase II development or After failed phase III trial using archived specimens. Adaptively during early portion of phase III trial.

42

43 Prognostic and Predictive Classifiers for Guiding Use of Approved Drugs

44 Developmental Studies vs Validation Studies Validation studies use prognostic or predictive biomarkers or composite classifiers that have been completely defined in previous developmental studies

45 Types of Validation for Prognostic and Predictive Biomarkers Analytical validation –Pre-analytical, analytical, post-analytical robustness Clinical validation –Does the biomarker predict what it’s supposed to predict for independent data Not whether independent studies produce the same predictive biomarkers Clinical utility –Does use of the biomarker result in patient benefit

46 Clinical Utility Benefits patient by improving treatment decisions Depends on context of use of the biomarker –Treatment options and practice guidelines – Other prognostic factors

47 Establishing Clinical Utility of a Prognostic Biomarker Classifier Identify patients for whom – practice standards imply cytotoxic chemotherapy –who have good prognosis without chemotherapy Prospective trial using pre-defined classifier to identify good risk patients and withhold chemotherapy TAILORx, MINDACT Analysis of archived specimens from previous clinical trial in which patients did not receive chemotherapy Pre-defined classifier Prospective analysis plan developed before doing assay Establish analytical and pre/post-analytical validity of assay Large fraction of patients with adequate archived tissue

48 Establishing Clinical Utility of a Predictive Classifier of Benefit from Regimen T Randomized trial of treatment with T versus control –Include both test + and test – patients and size trial to evaluate T vs control separately for the two groups of patients –Or include only test – patients if T is an established standard therapy Prospective trial may not be feasible –“Prospective analysis” of archived specimens from previous trial

49 Myth of “Gold Standard” Design for Establishing Clinical Utility of a Predictive Classifier of Benefit from Regimen T Randomize patients to whether or not to have classifier measured or to use standard of care –Standard of care group receive T and don’t have classifier measured –Patients randomized to have classifier measured If test + (ie predicted to benefit from T) receive T If test - receive control regimen C Very inefficient –many patients get same treatment regardless of randomized arm –Since classifier is not measured in SOC arm, the trial must be huge to detect very small overall difference in outcome

50 Microarray Myths That the greatest challenge is managing the mass of microarray data

51 Greater Challenges Are Designing, conducting and analyzing key experiments that effectively utilize microarray technology to bridge the gap between basic research and clinical development

52

53 Major Flaws Found in 40 Studies Published in 2004 50% of studies contained one or more major flaws Cluster Analysis –13/28 studies invalidly claimed that expression profiles could predict outcome based on clustering samples with regard to differentially expressed genes Finding genes correlated with outcome –9/23 studies had inadequate methods to deal with false positives 10,000 genes × .05 significance level = 500 false positives Supervised prediction –12/28 reported a misleading estimate of prediction accuracy
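The false-positive arithmetic on this slide can be illustrated with a short simulation. The sketch assumes every gene is truly null, so that each p-value is uniform on (0, 1); the function names and the seed are arbitrary choices for illustration.

```python
import random


def expected_false_positives(n_genes, alpha):
    """Expected number of null genes declared significant at level alpha."""
    return n_genes * alpha


def simulate_null_hits(n_genes, alpha, seed=0):
    """Count 'significant' genes when no gene is truly associated with outcome."""
    rng = random.Random(seed)
    # Under the null hypothesis each p-value is uniform on (0, 1).
    return sum(rng.random() < alpha for _ in range(n_genes))


expected = expected_false_positives(10_000, 0.05)
observed = simulate_null_hits(10_000, 0.05)

if __name__ == "__main__":
    print(expected, observed)
```

Even with no real signal, hundreds of genes pass an unadjusted .05 threshold, which is why gene-finding analyses need explicit control of false positives.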

54

55 Solution

56 BRB-ArrayTools http://linus.nci.nih.gov Contains analysis tools that I have selected as valid and useful Targeted to biomedical scientists with analysis wizard and numerous help screens Imports data from all platforms and major databases Extensive built-in gene annotation and linkage to gene annotation websites Extensive gene-set enrichment tools for integrating gene expression with pathways, transcription factor targets, microRNA targets, protein domains and other biological information Extensive tools for the development and validation of predictive classifiers with binary outcome or survival outcome data

57 Development and Validation of Predictive Classifiers using Gene Expression Profiles To be continued Monday Thank you

58 Conclusions New technology makes it feasible to identify which patients are likely or unlikely to benefit from a specified treatment Targeting treatment can greatly improve the therapeutic ratio of benefit to adverse effects –Treated patients benefit –Economic benefit –Greater chance of success in drug development

59 Conclusions Some of the conventional wisdom about how to develop prognostic and predictive classifiers and how to use them in clinical trial design is flawed Prospectively specified analysis plans for phase III studies are essential to achieve reliable results –Biomarker analysis does not mean exploratory analysis except in developmental studies

60 Conclusions Achieving the potential of new technology requires paradigm changes in “correlative science.” Effective interdisciplinary research requires increased emphasis on cross education of laboratory, clinical and statistical/computational scientists

61 Acknowledgements Kevin Dobbin Alain Dupuy Boris Freidlin Wenyu Jiang Aboubakar Maitournam Yingdong Zhao

