Using multiple imputation and delta adjustment to implement sensitivity analyses for time-to-event data. Michael O’Kelly, Quintiles; Ilya Lipkovich, Quintiles. 4/16/2017
Acknowledgements. DIA Scientific Working Group (SWG) for Missing Data. This presentation stems from work with Bohdana Ratitch (inVentiv Health). The authors of these slides are members of the SWG. Chair: Craig Mallinckrodt, Eli Lilly. James Roger and Mouna Akacha, speakers at this session, are also members. Great downloadable SAS macros for control-based imputation and other MNAR approaches are available on the SWG webpage at www.missingdata.org.uk. SWG members have a growing interest in discrete endpoints with missing data. Gary Koch (University of North Carolina) gave regular advice; a paper in press, with Zhao and others, describes the approach used in this presentation. Taylor and others (2002) showed how to implement multiple imputation for time-to-event outcomes. Michael Hughes (Harvard School of Public Health) kindly shared the example time-to-event data.
ACTG 175: HIV study*. Subjects were randomized to four antiretroviral regimes in equal proportions. Primary event analysed: 50% decline in CD4 count or death. Study start Dec 1991; enrolment ended Oct 1992; follow-up until end Nov 1994; max follow-up just 4 years. For this presentation, we examine two treatment arms: zidovudine and zidovudine+didanosine. * Lu and Tsiatis (2008); Hammer et al. (1996); the analyses are by O’Kelly and are not the responsibility of the authors of the cited papers.
ACTG 175: HIV study*
                              Zidovudine   Zidovudine+Didanosine
Enrolled                         619              613
Event: 50% decline in CD4        182               98
Censored                         437              515
  Completed study                313              384
  Other reasons                  124              131
Kaplan-Meier analysis. Logrank statistic: 46.12; standard error: 7.726; p-value: <0.0001. Assumes censoring at random (CAR). CAR is analogous to missing at random.
How robust is this result to the assumption of CAR? One way to assess this: tipping point analysis. Tipping point for a continuous variable: add an unfavourable quantity δ to the efficacy score when imputed for the experimental arm; make δ more extreme until the p-value from the primary analysis is no longer significant: the “tipping point”. Was the “tipping point” δ clinically plausible for subjects who withdrew early? If not, the primary result may be judged robust to the missing-at-random assumption.
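The search described above can be sketched as a simple loop over a grid of δ values. This is a minimal illustration, not the presenters' code: `find_tipping_point`, `toy_p_value` and the grid are hypothetical names and values, and `toy_p_value` is only a stand-in for "impute with δ, then rerun the primary analysis".

```python
from typing import Callable, Iterable, Optional


def find_tipping_point(p_value_at: Callable[[float], float],
                       deltas: Iterable[float],
                       alpha: float = 0.05) -> Optional[float]:
    """Return the first delta (in the order given) whose p-value is no
    longer significant at level alpha, or None if none is found."""
    for delta in deltas:
        if p_value_at(delta) >= alpha:
            return delta
    return None


def toy_p_value(delta: float) -> float:
    """Hypothetical stand-in: a p-value that grows as delta gets more extreme.
    In practice this would impute with delta and rerun the primary analysis."""
    return min(1.0, 0.001 * 2.0 ** delta)


grid = [0.5 * k for k in range(21)]        # delta = 0.0, 0.5, ..., 10.0
tipping = find_tipping_point(toy_p_value, grid)
print(tipping)
```

The judgement of whether the resulting δ is clinically plausible stays outside the code; the loop only locates the value to be judged.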
Tipping point for time to event, Kaplan-Meier (KM) analysis. Impute time of event using some hazard worse by δ than that estimated by Kaplan-Meier. Make δ more extreme until the p-value from the primary analysis is no longer significant: the “tipping point”. Was the “tipping point” δ clinically plausible for subjects who withdrew early? If not, the primary result may be judged robust to the CAR assumption. Note the unstatistical terminology in the following slides: “p(no event)” = p(T>t); “p(event)” = p(T<=t).
How to make p(event) worse than KM in a statistically principled way? Inversion method. Case 1: assuming CAR. p(event) = 1 - p(no event). For a censored subject, p(event) is missing. To impute, first calculate p(no event) associated with the time of censoring; interpolate between events, if necessary. Imputed p(event|T>t) = U(1 - p(no event), 1).
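The inversion step under CAR can be sketched in Python as follows. This is a sketch under stated assumptions, not the presenters' implementation: the function names, the toy data and the choice of linear interpolation between KM values are all assumptions. A draw of 1 - U(1 - p(no event), 1) is uniform on (0, p(no event)), and the imputed time is where the interpolated KM curve reaches that value; if the draw falls below the end of the curve, the subject stays censored.

```python
import numpy as np


def km_curve(time, event):
    """Kaplan-Meier estimate: grid of event times (with 0 prepended) and
    p(no event) just after each of them (with 1 prepended)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    t_event = np.unique(time[event])
    at_risk = np.array([(time >= t).sum() for t in t_event], dtype=float)
    n_events = np.array([(event & (time == t)).sum() for t in t_event], dtype=float)
    surv = np.cumprod(1.0 - n_events / at_risk)
    return np.concatenate(([0.0], t_event)), np.concatenate(([1.0], surv))


def impute_car(c, t_grid, s_grid, rng, max_follow_up):
    """Impute an event time for a subject censored at time c, assuming CAR.
    Returns (time, event_observed)."""
    s_c = np.interp(c, t_grid, s_grid)      # p(no event) at censoring, interpolated
    u = rng.uniform(0.0, s_c)               # same distribution as 1 - U(1 - s_c, 1)
    if u < s_grid[-1] or c >= t_grid[-1]:
        return float(max_follow_up), False  # imputation results in censoring
    # Invert the (decreasing) curve: np.interp needs increasing x, so reverse.
    return float(np.interp(u, s_grid[::-1], t_grid[::-1])), True


# Toy data, invented for illustration only (not the ACTG 175 data).
time = np.array([2, 3, 4, 5, 6, 8, 9, 10], dtype=float)
event = np.array([1, 0, 1, 1, 0, 1, 0, 0], dtype=bool)
t_grid, s_grid = km_curve(time, event)

rng = np.random.default_rng(0)
draws = [impute_car(3.0, t_grid, s_grid, rng, max_follow_up=10.0) for _ in range(200)]
```

Each element of `draws` is one possible imputation for the subject censored at time 3: either an event time no earlier than 3, or censoring at the end of follow-up.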
ACTG 175: HIV study Impute event for this censored subject.
ACTG 175: HIV study 1 – U[1-p(no event), 1]
ACTG 175: HIV study. Imputed time of event, case 1: 1 - U(1 - p(no event), 1)
ACTG 175: HIV study. Imputed time of event, case 2: 1 - U(1 - p(no event), 1)
ACTG 175: HIV study. Case 3: imputation results in censoring: 1 - U(1 - p(no event), 1)
How to make p(event) worse than KM in a statistically principled way? Inversion method. Case 2: assuming CAR + some δ. p(event) = 1 - p(no event). For a censored subject, p(event) is missing. To impute, first calculate p(no event) associated with the time of censoring; interpolate between events, if necessary. Imputed p(event|T>t) = U(1 - p(no event)^δ, 1).
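A minimal sketch of the δ-adjusted draw follows. It rests on an assumption the slides do not spell out: the draw 1 - U(1 - p(no event)^δ, 1) is inverted on the reference curve p(no event)^δ, which is equivalent to multiplying the hazard after censoring by δ. The function name and the survival grid are hypothetical.

```python
import numpy as np


def impute_with_delta(c, t_grid, s_grid, delta, rng, max_follow_up):
    """Impute an event time for a subject censored at c with delta adjustment.
    Assumption of this sketch: inversion is done on the curve S(t)**delta."""
    s_c = np.interp(c, t_grid, s_grid)       # p(no event) at censoring time
    u = rng.random() * s_c ** delta          # distributed as 1 - U(1 - s_c**delta, 1)
    target = u ** (1.0 / delta)              # S(t)**delta = u  <=>  S(t) = u**(1/delta)
    if target < s_grid[-1] or c >= t_grid[-1]:
        return float(max_follow_up), False   # imputation results in censoring
    return float(np.interp(target, s_grid[::-1], t_grid[::-1])), True


# Hypothetical survival grid for illustration (not estimated from ACTG 175).
t_grid = np.array([0.0, 2.0, 4.0, 5.0, 8.0])
s_grid = np.array([1.0, 0.875, 0.729, 0.583, 0.389])

# Same random seed with delta = 1 and delta = 2, to compare like with like.
pairs = []
for seed in range(50):
    t1, _ = impute_with_delta(3.0, t_grid, s_grid, 1.0, np.random.default_rng(seed), 10.0)
    t2, _ = impute_with_delta(3.0, t_grid, s_grid, 2.0, np.random.default_rng(seed), 10.0)
    pairs.append((t1, t2))
```

With δ = 1 this reduces to the CAR draw. For the same underlying uniform draw, the time imputed with δ = 2 is never later than the time imputed with δ = 1.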
ACTG 175: HIV study p(no event)
ACTG 175: HIV study. Reference line for p(no event)^δ, δ = 2
ACTG 175: HIV study. Imputed time of event, δ = 2: 1 - U(1 - p(no event)^δ, 1)
ACTG 175: HIV study. Imputed time of event, no δ: 1 - U(1 - p(no event), 1)
ACTG 175: HIV study. Imputed time of event, δ = 2: 1 - U(1 - p(no event)^δ, 1). Imputed event times tend to be shorter as δ increases.
ACTG 175: HIV study. Imputed time of event, δ = 2: 1 - U(1 - p(no event)^δ, 1). Note: this is just single imputation!
How to use multiple imputation here? Bootstrap the original data set. Calculate p(no event)^δ associated with the time of censoring, using the bootstrap KM estimates of p(no event). Use inversion to find the corresponding time on the original data set.
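The bootstrap-based multiple imputation can be sketched as below. Several things here are assumptions rather than the presenters' method: the function names, the synthetic data, the linear interpolation of the KM curve, and in particular the reading that both the look-up of p(no event) and the inversion use the bootstrap KM curve, with the imputed times then written into a copy of the original data set.

```python
import numpy as np


def km_curve(time, event):
    """Kaplan-Meier grid: event times (0 prepended) and p(no event) (1 prepended)."""
    t_event = np.unique(time[event])
    at_risk = np.array([(time >= t).sum() for t in t_event], dtype=float)
    n_events = np.array([(event & (time == t)).sum() for t in t_event], dtype=float)
    surv = np.cumprod(1.0 - n_events / at_risk)
    return np.concatenate(([0.0], t_event)), np.concatenate(([1.0], surv))


def impute_once(time, event, delta, rng, max_follow_up):
    """One imputed copy of the original data set, using a bootstrap KM curve."""
    n = len(time)
    idx = rng.integers(0, n, size=n)                  # bootstrap resample of subjects
    t_grid, s_grid = km_curve(time[idx], event[idx])  # KM on the bootstrap sample
    new_time, new_event = time.copy(), event.copy()
    for i in np.flatnonzero(~event):                  # censored subjects, original data
        s_c = np.interp(time[i], t_grid, s_grid)
        u = rng.random() * s_c ** delta               # 1 - U(1 - s_c**delta, 1)
        target = u ** (1.0 / delta)
        if target < s_grid[-1] or time[i] >= t_grid[-1]:
            new_time[i] = max_follow_up               # imputation results in censoring
        else:
            new_time[i] = np.interp(target, s_grid[::-1], t_grid[::-1])
            new_event[i] = True
    return new_time, new_event


# Synthetic data for illustration only (not the ACTG 175 data).
rng = np.random.default_rng(2017)
n = 200
true_event = rng.exponential(scale=3.0, size=n)
censor = rng.uniform(0.5, 4.0, size=n)
time = np.minimum(true_event, censor)
event = true_event <= censor
max_follow_up = 4.0

imputations = [impute_once(time, event, 2.0, rng, max_follow_up) for _ in range(20)]
```

Each imputed data set differs both through the uniform draws and through the bootstrap resample. Each would then be analysed with the primary method and the results combined across imputations.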
ACTG 175: HIV study Bootstrapped data set #1 Bootstrapped data set #2
ACTG 175: HIV study. Bootstrap approximates variability of draws from posterior distribution needed for MI. p(no event) across bootstrapped data sets: 0.958, 0.947, 0.952, 0.950.
ACTG 175: HIV study Imputations include variability from U() and from the differences in bootstrapped data sets 1 – U(1-p(no event), 1)
ACTG 175: HIV study Imputed p(no event) is applied to the original data set
ACTG 175: HIV study 1 - U(1 - p(no event)^δ, 1)
ACTG 175: HIV study Imputed p(no event) is applied to the original data set, with δ applied
ACTG 175: HIV study Sample imputations with and without δ might look like this...
ACTG 175: HIV study 1 – U(1-p(no event), 1)
ACTG 175: HIV study 1 - U(1 - p(no event)^δ, 1)
Result of tipping point analysis for HIV study
δ      Logrank statistic*   Standard error+   p-value
1      5.58                 1.031             <0.0001
1.5    5.17                 1.057
2      4.82                 1.102
2.5    4.50                 1.090
3      4.06                 1.121             0.0003
3.5    3.68                 1.108             0.0010
4      3.53                 1.131             0.0019
4.5    3.27                 1.211             0.0076
5      2.90                 1.130             0.0105
5.5    2.54                 1.233             0.0413
6      2.35                 1.199             0.0516
6.5    2.17                 1.183             0.0674
7      1.99                 1.210             0.1019
* chi-squared statistic transformed to normal using Wilson-Hilferty transformation
+ transformed statistic has variance = 1; standard error includes between-imputation variability
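The footnotes to the table can be illustrated with a short sketch of how chi-squared statistics from several imputed data sets might be combined. The Wilson-Hilferty formula is standard; the rest is an assumption of this sketch: the function names, the example statistics (invented, not from the study) and the use of a normal reference distribution for the p-value. The slides do not say which reference distribution was used, so this need not reproduce the p-values in the table.

```python
import math
from statistics import mean, variance


def wilson_hilferty(chi2: float, df: int = 1) -> float:
    """Transform a chi-squared statistic with df degrees of freedom to an
    approximately standard normal value (Wilson-Hilferty)."""
    k = 2.0 / (9.0 * df)
    return ((chi2 / df) ** (1.0 / 3.0) - (1.0 - k)) / math.sqrt(k)


def combine(chi2_stats, df: int = 1):
    """Combine statistics across imputations with Rubin's rules.
    Within-imputation variance of the transformed statistic is taken as 1."""
    z = [wilson_hilferty(x, df) for x in chi2_stats]
    m = len(z)
    z_bar = mean(z)
    between = variance(z)                          # between-imputation variance
    total_var = 1.0 + (1.0 + 1.0 / m) * between
    se = math.sqrt(total_var)
    # Two-sided p-value from a normal reference (a simplification of this sketch).
    p_normal = math.erfc(abs(z_bar / se) / math.sqrt(2.0))
    return z_bar, se, p_normal


# Hypothetical logrank chi-squared statistics from five imputed data sets.
example_stats = [30.1, 28.4, 33.0, 27.5, 31.2]
z_bar, se, p_normal = combine(example_stats)
```

Because the within-imputation variance is fixed at 1, the combined standard error is always at least 1, and it grows with the spread of the statistic across imputations.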
What if the primary analysis is Cox proportional hazards or parametric? Implementation of the MI version of Cox proportional hazards is similar to that of KM. Other implementations of MI for time-to-event analysis are in progress by Lipkovich and Ratitch: logistic regression (suggested by Carpenter and Kenward (2013)); piecewise exponential. SAS macros for all four approaches are planned to be available on the DIA SWG web page at www.missingdata.org.uk; these tasks are undertaken as part of the DIA SWG “New Tools” subgroup. The above methods can also be used to implement “control based imputation” for missing time-to-event outcomes.
References
Carpenter J, Kenward M (2013) Multiple imputation and its application. Chichester: Wiley.
Hammer S, Katzenstein D, Hughes M, Gundaker H, Schooley R, Haubrich R, Henry W, Lederman M, Phair J, Niu M, Hirsch M, Merigan T, for the AIDS Clinical Trials Group Study 175 Study Team (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine 335 1081-1089.
Lu X, Tsiatis A (2008) Improving the efficiency of the log-rank test using auxiliary covariates. Biometrika 95 679-694.
Taylor J, Murray S, Hsu C-H (2002) Survival estimation and testing via multiple imputation. Statistics and Probability Letters 6 77-91.
Zhao Y, Herring A, Zhou H, Ali M, Koch G (submitted) A multiple imputation method for sensitivity analyses of time-to-event data with possibly informative censoring.
Questions?