Impact of censoring on the statistical methods for handling non-proportional hazards in Immuno-Oncology studies Yifan Huang, Luping Zhao, Jiabu Ye, and.


Impact of censoring on the statistical methods for handling non-proportional hazards in Immuno-Oncology studies
Yifan Huang, Luping Zhao, Jiabu Ye, and Pralay Mukhopadhyay
Duke Industry Statistical Symposium, 08 September 2017

Outline
- Introduction: statistical challenges in IO trials
- Objective: assess the performance of different statistical analysis methods
  - With various types of non-proportional hazards patterns
  - With different censoring mechanisms
- Methods
  - Statistical analysis methods evaluated
  - Simulation setup
- Preliminary results
- Summary

Delayed treatment effect
- CM-141 (2L SCCHN)
- CM-017 (2L squamous NSCLC)

Crossing survival functions
- CM-026 (1L NSCLC)
- CM-057 (non-squamous NSCLC)

Biomarker subgroups
- CM-057 (non-squamous NSCLC)

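Example of proportional hazards (PH)
- HR = 0.6 at all time points
- Constant hazard in each arm
- median 1 / median 2 = 0.6, where arm 1 is the arm with the higher hazard (with a constant hazard the median is ln 2 / hazard, so the ratio of medians is the inverse of the ratio of hazards)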
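The link between the hazard ratio and the medians on this slide can be checked numerically. The following is a minimal sketch, assuming exponential survival in both arms; the 12-month control median is an illustrative value and the function names are hypothetical.

```python
import math

def hazard_from_median(median):
    """Constant hazard of an exponential distribution with the given median."""
    return math.log(2) / median

def median_from_hazard(hazard):
    """Median of an exponential distribution with the given constant hazard."""
    return math.log(2) / hazard

control_median = 12.0   # months; illustrative assumption, not taken from this slide
hazard_ratio = 0.6      # treatment vs control, constant over time (PH)

control_hazard = hazard_from_median(control_median)
treatment_hazard = hazard_ratio * control_hazard
treatment_median = median_from_hazard(treatment_hazard)

# Under exponential survival the ratio of medians is the inverse ratio of hazards.
median_ratio = control_median / treatment_median
print(round(treatment_median, 2), round(median_ratio, 2))
```

Under these assumptions a 12-month control median and HR = 0.6 correspond to a 20-month treatment median, so the control-to-treatment ratio of medians equals the HR.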

Example of non-PH
- Increasing hazard in one arm, decreasing hazard in the other
- HR increases over time

Statistical analysis methods evaluated
- Based on the hazard function (using data up to the maximum event time)
  - Log-rank test
  - Weighted log-rank test
- Based on the survival function (using data up to a specific time point)
  - Difference in RMST (unweighted)
  - Weighted KM
- Based on a point on the survival function
  - Median
  - Landmark
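The results slides label the log-rank variants with pairs such as (0,0), (1,0), (0,1) and (1,1). The sketch below assumes these are Fleming-Harrington (rho, gamma) weights, w(t) = S(t-)^rho * (1 - S(t-))^gamma with S the pooled Kaplan-Meier estimate; the slides do not state this explicitly, and the function name and toy data are hypothetical. With rho = gamma = 0 it reduces to the ordinary log-rank test.

```python
import math

def fh_logrank_z(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of a Fleming-Harrington weighted log-rank test.

    groups are coded 0 (control) and 1 (treatment). A positive Z means
    more events than expected in group 1.
    """
    data = sorted(zip(times, events, groups))
    n = len(data)
    at_risk = n
    at_risk1 = sum(g for _, _, g in data)
    s_pooled = 1.0          # pooled Kaplan-Meier just before the current time
    num = var = 0.0
    i = 0
    while i < n:
        t = data[i][0]
        d = d1 = removed = removed1 = 0
        while i < n and data[i][0] == t:   # collect ties at time t
            _, ev, g = data[i]
            d += ev
            d1 += ev * g
            removed += 1
            removed1 += g
            i += 1
        if d > 0:
            w = s_pooled ** rho * (1.0 - s_pooled) ** gamma
            p1 = at_risk1 / at_risk
            num += w * (d1 - d * p1)
            if at_risk > 1:                # hypergeometric variance term
                var += w * w * d * p1 * (1.0 - p1) * (at_risk - d) / (at_risk - 1)
            s_pooled *= 1.0 - d / at_risk
        at_risk -= removed
        at_risk1 -= removed1
    return num / math.sqrt(var) if var > 0 else 0.0

# Toy data for illustration only.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 1, 1]
groups = [1, 0, 1, 0]
z_logrank = fh_logrank_z(times, events, groups)            # (0,0): unweighted
z_late = fh_logrank_z(times, events, groups, 0.0, 1.0)     # (0,1): late weights
```

With gamma > 0 the weight is zero at the first event and grows as the pooled survival falls, which is why such weights emphasise late differences.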

Restricted mean survival time (RMST)
- The shaded region (the area under the survival curve) represents the RMST with a truncation time of 5 months
- The calculated RMST is 4.3 months
- Statistical interpretation: "restricted to the first 5 months, the mean survival time of a patient is 4.3 months"
- Clinical interpretation: "the life expectancy of the patient over the next 5 months is 4.3 months"
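The area-under-the-curve definition can be sketched as follows: estimate the Kaplan-Meier step function and integrate it from 0 to the truncation time. This is a minimal illustration with hypothetical function names and toy data, not the software used for the slides.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # collect ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function between 0 and tau."""
    area, last_t, last_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += last_s * (t - last_t)
        last_t, last_s = t, s
    return area + last_s * (tau - last_t)

# Toy data: without censoring and with all deaths before tau,
# the RMST equals the plain mean survival time.
rmst_complete = rmst([1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], tau=5.0)
rmst_censored = rmst([2.0, 3.0, 4.0], [1, 0, 1], tau=4.0)
```

The difference in RMST between two arms is then the difference of two such areas computed with the same truncation time.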

Restricted mean survival time (RMST)
- RMST in the first year since treatment start: 6.6 vs 8.3 months
- RMST in the first 2 years since treatment start: 11.1 vs 12.8 months
- Difference in RMST in the first 2 years: 1.7 months
- Clinical interpretation: patients receiving nivolumab have a 1.7-month longer life expectancy during the first 2 years from treatment start
Ref: Borghaei 2015, The New England Journal of Medicine. Data extracted by Wenmei Huang using an R Shiny tool developed by Monika Huhn.

Simulation setup - scenarios
- Non-PH assumptions
  - Delayed treatment effect
  - Diminishing treatment effect
  - Crossing survival functions
- Censoring mechanism
  - Type 1 censoring: data cut-off (DCO) at 24 m after last patient in (LPI) (DCO1)
  - Type 2 censoring: data cut-off at 70% maturity (DCO2)
  - Type 1 censoring: data cut-off at 12 m after LPI (DCO3)
  - Early censoring (interim analysis): data cut-off at 50% maturity
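The two kinds of data cut-off can be sketched on calendar time as below. This is an illustration under stated assumptions: "maturity" is taken to mean the fraction of randomized patients who have died, which the slide does not define, and the function names and four-patient data set are hypothetical.

```python
def cutoff_after_lpi(enroll_times, months_after_lpi):
    """Type 1 style cut-off: fixed follow-up after the last patient in (LPI)."""
    return max(enroll_times) + months_after_lpi

def cutoff_at_maturity(enroll_times, survival_times, maturity):
    """Type 2 style cut-off: calendar time at which the target number of deaths occurs."""
    death_dates = sorted(e + s for e, s in zip(enroll_times, survival_times))
    # Assumed definition: target events = maturity * number of patients, rounded.
    target_events = max(1, round(maturity * len(death_dates)))
    return death_dates[target_events - 1]

def apply_cutoff(enroll_times, survival_times, cutoff):
    """Return (observed_time, event_indicator) pairs as seen at the data cut-off."""
    observed = []
    for enroll, surv in zip(enroll_times, survival_times):
        if enroll > cutoff:
            continue                          # not yet enrolled at the cut-off
        follow_up = cutoff - enroll
        if surv <= follow_up:
            observed.append((surv, 1))        # death observed before the cut-off
        else:
            observed.append((follow_up, 0))   # administratively censored
    return observed

# Toy data: enrollment and survival times in months.
enroll = [0.0, 6.0, 12.0, 18.0]
survival = [10.0, 40.0, 5.0, 30.0]

dco_fixed = cutoff_after_lpi(enroll, 24.0)               # 24 m after LPI
dco_events = cutoff_at_maturity(enroll, survival, 0.5)   # 50% maturity
data_fixed = apply_cutoff(enroll, survival, dco_fixed)
data_events = apply_cutoff(enroll, survival, dco_events)
```

An earlier cut-off leaves more patients censored and shortens the observed follow-up, which is the mechanism the simulations vary.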

Simulation setup – generating non-PH survival data
- Piecewise exponential distribution
- Piecewise failure rate
- Different parameters of the same distribution
- Different distributions
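One common way to draw from a two-piece exponential distribution is to invert its cumulative hazard at a unit-exponential draw. The sketch below takes that approach; it is not necessarily how the authors generated their data. The rates, the change point, and the function names are illustrative assumptions.

```python
import math
import random

def piecewise_exp_time(unit_exp, rate1, rate2, change_point):
    """Map a unit-exponential draw to a survival time by inverting the cumulative hazard.

    The hazard is rate1 before change_point and rate2 afterwards.
    """
    hazard_at_change = rate1 * change_point
    if unit_exp <= hazard_at_change:
        return unit_exp / rate1
    return change_point + (unit_exp - hazard_at_change) / rate2

def simulate_arm(n, rate1, rate2, change_point, rng):
    """Draw n survival times from the two-piece exponential distribution."""
    return [piecewise_exp_time(rng.expovariate(1.0), rate1, rate2, change_point)
            for _ in range(n)]

rng = random.Random(2017)                 # fixed seed for reproducibility
control_rate = math.log(2) / 12.0         # illustrative: control median of 12 months

# Illustrative delayed effect: HR = 1 before month 6 and HR = 0.5 afterwards.
control = simulate_arm(200, control_rate, control_rate, 6.0, rng)
treatment = simulate_arm(200, 1.0 * control_rate, 0.5 * control_rate, 6.0, rng)
```

Passing the same rate for both pieces gives an ordinary exponential arm, so one function covers the PH and non-PH cases.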

Delayed treatment effect – simulation
- Sample size: 200 patients per arm, 1:1 randomization ratio
- Median OS in control: 12 m
- Treatment effect: HR1 = 1, HR2 = 0.5, HR change point: 6 m (a smaller treatment effect was also evaluated)
- Piecewise exponential distribution; HR1: HR in the first piece; HR2: HR in the second piece
- Accrual (uniform): 18 m
- DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity
DCO     Follow up after LPI (month)
DCO1    41.6
DCO2    36.1
DCO3    29.7
(censoring increases from DCO1 to DCO3)
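The survival curves implied by this scenario can be written down directly. The sketch below evaluates the two-piece exponential survival function for both arms using the parameters on the slide (control median 12 m, HR1 = 1, HR2 = 0.5, change point 6 m); the function name is hypothetical.

```python
import math

def piecewise_exp_survival(t, rate1, rate2, change_point):
    """Survival function when the hazard is rate1 before change_point and rate2 after."""
    if t <= change_point:
        return math.exp(-rate1 * t)
    return math.exp(-rate1 * change_point - rate2 * (t - change_point))

control_rate = math.log(2) / 12.0        # control median of 12 months
hr1, hr2, change_point = 1.0, 0.5, 6.0   # delayed-effect scenario

for month in (6, 12, 18, 24):
    s_control = piecewise_exp_survival(month, control_rate, control_rate, change_point)
    s_treatment = piecewise_exp_survival(month, hr1 * control_rate,
                                         hr2 * control_rate, change_point)
    print(month, round(s_control, 3), round(s_treatment, 3))
```

Under these parameters the two curves coincide up to month 6 and the treatment-arm median is 18 months, against 12 months in control.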

Delayed treatment effect (impact of FU on HR estimate)
- DCO1: average HR = 0.66
- DCO2: average HR = 0.68
- DCO3: average HR = 0.70
- DCO_IA: average HR = 0.76

Delayed treatment effect (impact on power and bias)
- Within each method, power decreases as censoring increases
- Log-rank (0,0) and (1,0) and RMST are sensitive to censoring: their power drops faster, whereas log-rank (1,1) and (0,1) are more robust
- Landmark is sensitive to the selection of the analysis time point, but robust to the amount of censoring
- Log-rank (0,0) and RMST had similar performance
- Log-rank (0,0) had similar power to log-rank (1,1) and (0,1) given sufficient follow-up; in contrast, log-rank (1,1) and (0,1) are more powerful at the interim analysis, with shorter follow-up
- Bias is generally small, but relatively larger with more censoring
DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity

Diminishing treatment effect – simulation
- Sample size: 200 patients per arm, 1:1 randomization ratio
- Median OS in control: 12 m
- Treatment effect: HR1 = 0.5, HR2 = 1.1, HR change point: 12 m (a smaller treatment effect was also evaluated)
- Piecewise exponential distribution; HR1: HR in the first piece; HR2: HR in the second piece
- Accrual (uniform): 18 m
- DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity
DCO     Follow up after LPI (month)
DCO1    41.2
DCO2    32.8
DCO3    29.6
(censoring increases from DCO1 to DCO3)

Diminishing treatment effect (impact of FU on HR estimate)
- DCO1: average HR = 0.76
- DCO2: average HR = 0.71
- DCO3: average HR = 0.68
- DCO_IA: average HR = 0.62

Diminishing treatment effect (impact on power and bias)
- Within each method, power increases as censoring increases
- Log-rank (1,1) and (0,1) are sensitive to censoring: their power drops faster, whereas log-rank (0,0) and (1,0) and RMST are more robust
- Landmark is sensitive to the selection of the analysis time point, but robust to the amount of censoring
- Log-rank (0,0) and RMST had similar performance
- Bias is generally small, but relatively larger with more censoring
DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity

Crossing survival functions – simulation
- Sample size: 200 patients per arm, 1:1 randomization ratio
- Median OS in control: 15 m
- Treatment effect: HR1 = 1.8, HR2 = 0.5, HR change point: 6 m (a different treatment effect was also evaluated)
- Piecewise exponential distribution; HR1: HR in the first piece; HR2: HR in the second piece
- Accrual (uniform): 18 m
- DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity
DCO     Follow up after LPI (month)
DCO1    41.6
DCO2    39.2
DCO3    29.7
(censoring increases from DCO1 to DCO3)

Crossing survival functions (impact of FU on HR estimate) – simulation
- DCO1: average HR = 0.90
- DCO2: average HR = 0.91
- DCO3: average HR = 1.02
- DCO_IA: average HR = 1.16

Crossing survival functions – simulation
- Power is generally low
- Log-rank (1,1) and (0,1) are more powerful than RMST because there is a (late) difference in the hazard function, but not in the survival function (AUC)
- Bias is generally small, but relatively larger with more censoring
DCO1 (Type 1 censoring): 24 m after LPI; DCO2 (Type 2 censoring): 70% maturity; DCO3 (Type 1 censoring): 12 m after LPI; Interim (early censoring): 50% maturity

Summary
- Non-PH creates challenges in designing and analyzing IO trials
- The performance of various statistical methods for non-PH time-to-event data was assessed by simulation under different censoring mechanisms
- The log-rank test (unweighted) performs well with sufficient FU
- The weighted log-rank test can be more powerful and more robust to censoring if the correct non-PH pattern is known
- RMST had similar power to the log-rank test (unweighted) if the truncation time is similar to the max event time used by the log-rank test
- RMST could be less powerful than the log-rank test (unweighted) if the difference in survival (AUC) is small while there is still a difference in hazard; however, RMST may reflect the survival benefit more directly
- Bias is generally small, but relatively larger with more censoring
Ongoing and future work
- More simulation scenarios for better generalisability
- More methods, such as weighted RMST
- Models beyond a one-number summary, e.g., a change point model