
1 Statistical Leadership in Clinical Trials
Opportunities from the Draft Estimand Guidance
Jonathan Siegel
JSM Vancouver BC – July 31, 2018

2 Disclaimer
The views presented in this presentation are those of the author alone and do not represent the views of Bayer AG or any of its constituent companies, partners, or affiliates, or any other organization.
31st July, 2018 Joint Statistical Meetings - Abstract #329467

3 Outline
Motivation: Estimand guidance
Missing data vs. intercurrent events
Summary of strategies for addressing intercurrent events
Leadership opportunities
Examples
Opportunities for research
Statistics as an engineering discipline
Acknowledgments and references

4 Motivation: Estimand Guidance
The draft estimand guidance addresses ways that study design and on-study conduct can influence conclusions, and strategies for study design, conduct, analysis, and interpretation that can better ensure accurate conclusions
It introduces a series of new concepts and terms, including “estimand” and “intercurrent events”
It requires tradeoffs between clinical and statistical considerations and more in-depth thought in defining endpoints, selecting strategies for dealing with intercurrent events, and designing studies
This opens leadership opportunities for statisticians

5 Key terms
An estimand attempts to address how the outcome of treatment observed in the study compares to what would have happened under different treatment conditions
  It requires defining a population of inference, a variable or endpoint, a specification of how to account for intercurrent events, and a population-level summary (statistic) serving as the basis for comparison
Intercurrent events are events occurring during the course of a study that complicate the description and interpretation of treatment effects
  They either preclude observation of a variable or affect its interpretation
  They include events that result in “missing data”, confounding, and more

6 Changing assumptions
Traditionally, the statistics profession has embraced methods that assume “missing data” is essentially noise, irrelevant to outcomes of interest
  Survival analysis assumes censoring is non-informative
  Mixed model multiple imputation assumes missing at random
  Many statistics simply ignore missing data
Clinical trial researchers have challenged this assumption, arguing that “missing data” is often highly informative
  Treatment withdrawal and loss to follow-up are often due to safety events or perceived lack of treatment efficacy
  People in the worst condition (e.g. greatest pain) may have the most difficulty filling out questionnaires
  If safety events delay assessments, they can lengthen time-to-event (TTE) efficacy indicators and make the treatment appear more efficacious
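The questionnaire point can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not something taken from the presentation: pain scores are drawn from a clipped normal distribution, and the probability of completing the questionnaire falls linearly with pain. The function name `simulate_pain_scores` and every numeric value are hypothetical. Under those assumptions, the mean computed from completed questionnaires alone understates the true mean.

```python
import random


def simulate_pain_scores(n=10_000, seed=0):
    """Simulate true pain scores and the subset actually reported.

    Hypothetical model: scores lie on a 0-10 scale, and the chance that a
    patient completes the questionnaire drops from 1.0 at score 0 to 0.2
    at score 10.
    """
    rng = random.Random(seed)
    true_scores, reported = [], []
    for _ in range(n):
        score = min(10.0, max(0.0, rng.gauss(5.0, 2.0)))
        true_scores.append(score)
        p_complete = 1.0 - 0.08 * score  # missingness depends on the outcome
        if rng.random() < p_complete:
            reported.append(score)
    return true_scores, reported


def mean(values):
    return sum(values) / len(values)


true_scores, reported = simulate_pain_scores()
print(f"true mean pain:          {mean(true_scores):.2f}")
print(f"complete-case mean pain: {mean(reported):.2f}")
print(f"questionnaires missing:  {len(true_scores) - len(reported)}")
```

The gap between the two means is driven entirely by who is missing, which is the sense in which such "missing data" is informative rather than noise.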

7 Informativeness as default
The estimand guidance embraces the view that intercurrent events should be presumed informative
  Discourages use of methods that assume otherwise
  Requires justification to use them
    Pre-hoc scientific justification that the clinical context really warrants assuming non-informativeness
    Post-hoc sensitivity analyses to establish that MAR and non-informativeness assumptions really are reasonable
This represents a sea change in attitudes from what the statistics profession has traditionally assumed and embraced

8 Types of informativeness
Positively informative intercurrent events tend to provide qualitative information about the event of interest
  Scientific question is what actually happened, including the intercurrent event
  Goal of improvement is to better incorporate the intercurrent event into the analysis
  Strategies (discussed below) include composite events
Counterfactual intercurrent events tend to confound the event of interest
  Scientific question is what would have happened if the intercurrent event had not occurred
  Intercurrent events rendered uninformative conditioned on a model
  Strategies include hypothetical and principal stratum
Uninformative events permit traditional missing-data assumptions
  Intercurrent events can be treated as irrelevant to the scientific question
  Strategies include
    Treatment policy strategy (intercurrent event represents noise)
    While-on-treatment strategy (occurrence of intercurrent event renders scientific question irrelevant)

9 Summary of Strategies
The guidance discusses 5 general strategies
Strategies integrate study design (how and how long patients will be followed) with endpoint definition and statistical analysis strategy to create a coherent approach.
1) Treatment Policy Strategy: Simply ignore and assess through intercurrent events.
  Assumes intercurrent events are not informative
  Example: Continue assessing patients after end of treatment, start of subsequent therapy, etc.
  Scientific question often needs to be carefully defined/limited
    Intercurrent events only need to be non-informative with respect to it
  Need sensitivity analyses to demonstrate non-informativeness
  Feasibility: Patients have to be willing and able to continue.
2) Composite Strategy: Combine events of interest with intercurrent events
  Appropriate for directly informative intercurrent events
  Many standard methods use this strategy implicitly
  Example: EFS (combines multiple events), Binomial (intercurrent events classified as non-event)
  Logistically simple (no need to observe patients past combined events)

10 Summary of Strategies Cont.
3) Hypothetical Strategy: Define scientific question as what would have occurred if intercurrent event had not happened.
  Appropriate when intercurrent events are counterfactual to the event of interest.
  Need causal inference methods with strong assumptions to evaluate
    Causal methods assume intercurrent events are non-informative conditioned on the variables in the model, i.e. model contains all sources of information (exchangeability)
  Need sensitivity analyses to demonstrate assumptions are reasonable.
4) Principal Stratum Strategy: Define scientific question by limiting the population of inference to patients in whom intercurrent events don’t occur
  Example: Since PRO data can only be evaluated in survivors, to compare treatment vs. placebo without confounding effect of death, evaluate only in patients who would survive whether or not treated.
  Membership in stratum modeled based on baseline data
  Also requires strong assumptions

11 Summary of Strategies Cont.
5) While on Treatment Strategy: Limit scientific question to the time period prior to any intercurrent event.
  Examples: AEs (assessed while on treatment), Palliative treatment (assessed while alive)
  Appropriate if intercurrent events are non-informative and assessment of patients past event is irrelevant or infeasible.
    Intercurrent event has to be irrelevant to the scientific question
    Palliative treatments not expected to impact survival and only question is symptomatic/QoL benefit while patient is alive.
  Logistically simple, patients need not be observed past intercurrent events.
  Often used by default. Often over-used.
    Example: immunotherapies may have delayed safety effects occurring after treatment and traditional observation periods.

12 Some example traditional endpoints reframed as strategies
Endpoint | Strategy | Rationale/Comment
--- | --- | ---
Binomial (generally) | Composite | Intercurrent events preventing observation treated as non-events
Treatment-Emergent AEs | On treatment | Observed from treatment start to fixed period after last treatment
OS | Treatment policy | Traditionally observation continues despite any intercurrent events
PFS, Event-Free Survival | Composite | Composite of multiple events
PFS & EFS censoring | On treatment, treatment policy, composite | Discussed below
Patient-Reported Outcomes (PROs) | On treatment | Traditionally only observable during clinic visits
ePROs | | Electronic device capture can make independent of visits
OS adjusted for therapy switch (e.g. IPCW, RPSFT) | Hypothetical | Causal analysis of what would have happened if treatment hadn’t switched
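The binomial entry, where intercurrent events that prevent observation are treated as non-events, can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function `response_rates` and the ten-patient arm are hypothetical, and the complete-case rate is shown purely as a contrast, not as a recommended analysis.

```python
def response_rates(outcomes):
    """Return (composite_rate, complete_case_rate) for one arm.

    outcomes: list of True (responder), False (non-responder), or None
    (response not observed because of an intercurrent event).
    """
    n_total = len(outcomes)
    n_observed = sum(1 for o in outcomes if o is not None)
    n_responders = sum(1 for o in outcomes if o is True)
    composite = n_responders / n_total         # unobserved counted as non-events
    complete_case = n_responders / n_observed  # unobserved silently dropped
    return composite, complete_case


# Hypothetical arm: 4 responders, 3 non-responders, and 3 patients who
# discontinued before the response assessment.
arm = [True] * 4 + [False] * 3 + [None] * 3
composite, complete_case = response_rates(arm)
print(f"composite: {composite:.2f}, complete-case: {complete_case:.2f}")
# prints "composite: 0.40, complete-case: 0.57"
```

The composite rate answers "what fraction of randomized patients achieved an observed response", while the complete-case rate implicitly assumes the three unobserved patients resemble the observed ones.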

13 Opportunities for Leadership: Clinical Trials
The strategies for addressing intercurrent events outlined in the estimand guidance require tradeoffs between
  Optimizing the scientific question and the label claim it will support
  The appropriateness of the assumptions made by the methodology
  The difficulties and feasibility issues of the study context
Statistician provides crucial input and guidance to help manage the disconnect between
  The scientific question(s) the sponsor would like to answer, and
  The scientific question(s) that can feasibly be answered without significant confounding in the actual study context
Precise and appropriate formulation of research questions to align with feasible assumptions and data collection methods is often critical to success

14 Opportunities for Leadership: Clinical Trials (Cont)
A clinically-aware statistician is in a unique position to provide expertise to assist in making these tradeoffs, including guidance in:
  How the specific formulation of a research question affects the label claim that can be made
  The assumptions required for methods and their reasonableness in the particular clinical trial context
  The effect of the proposed strategy on the demands made on patients
  The robustness of a proposed strategy – how it will be impacted by greater-than-expected patient dropout or other intercurrent events, deviations from assumptions, etc.
Statistician can also help in ways to optimize the feasibility of and compliance with the assessment schedule
Statistician can help identify where non-compliance etc. will have the greatest impact on results and hence where the most resources and improvement strategies need to be directed.

15 Opportunities for Leadership: Clinical Trial Strategy
Clinically-aware statisticians can also provide expertise on strategic issues including:
  Tradeoffs between cost of conducting clinical trials and the value of the resulting label claims
  Robustness to and tradeoffs against additional needs outside specific narrowly-defined research questions and label claims, for example:
    Payers
    Medical research community
    Physicians and patients seeking guidance on the effectiveness and value of a treatment in “plain English”
Note: The needs of the regulatory community are not the only needs to consider. Strategies that optimize a design for regulatory purposes may limit its usefulness to other users and audiences, and tradeoffs or multiple parallel strategies may sometimes be appropriate.

16 Opportunities for Leadership: Statistical Research
The statistics community has traditionally classified intercurrent-event situations as “missing data” and focused on methods treating such events as essentially independent of treatment outcomes, either unconditionally or conditioned on additional variables.
The estimand discussion reframes our way of looking at intercurrent events. These events are often informative of events of interest and can represent:
  Qualitative treatment outcomes
  Counterfactual to treatment outcomes
  An intermediate status
This reframing presents an opportunity for developing new methods and advancing existing methods that better reflect the clinical model, for example:
  Time to event approaches that work with multiple states and classes of events
    Competing risks, multi-state models, and more
  Causal and counterfactual methods more robust to the possibility of external confounding variables.

17 Examples
We’ll look at a couple of examples of issues in practical clinical trials where applying the estimand framework may lead to
  Increased statistical involvement in the planning of clinical trials
  A rethinking of common assumptions and methods

18 PFS
Progression-free survival (PFS) is a primary endpoint in many Phase II and some Phase III oncology trials, and a common secondary endpoint.
It requires clinic visits, so assessments occur at discrete times and end when the patient stops coming to the clinic.
It is traditionally censored at last tumor assessment, so the assessment schedule is directly related to censoring strategy.
The reliability of assessments and relatedness to treatment outcomes can be influenced by a number of factors including
  Changes in reviewer (assessments can be subjective and variable by reviewer)
  Changes in therapy (e.g. patient enters new study after end of treatment)
The timing of assessments can be related to treatment
  For example, adverse events related to treatment can result in delays, which can delay the patient coming to the clinic
  At the extreme, delays can make a less safe treatment appear more efficacious

19 PFS Cont
Decisions to permanently stop assessments are often related to treatment effect
  Decisions to end treatment are often related to perceived lack of treatment benefit.
  Patients are often less compliant with clinic visits once they go off treatment
  It may be logistically infeasible for a patient to return to the clinic (e.g. new study with separate demands, dying patient in hospice, etc.)
In the next slides, we will reframe standard strategies for addressing PFS into an estimand framework.
From an estimand point of view, traditional strategies for handling visit schedules, assessments, and censoring can be put in 3 categories:
  On-treatment strategy
  Treatment policy strategy
  Composite strategy

20 Reframing PFS: On-Treatment Strategy
Tumor assessments conform to treatment schedule
Progression must be radiologically documented per specified criteria
  Preferably by a central review committee
PFS is censored at end of treatment or start of new therapy
The research objective, based on a clinical focus, is to provide an objective, clinically meaningful evaluation of the efficacy benefit separately for each individual patient
  This research objective supports ready interpretation by the general medical community and application to patient care
  The focus on objective response based on specified criteria and central review reduces variation
  Censoring for subsequent therapy ensures assessment is limited to effects of the treatment, with subsequent therapy considered confounding (essence of a while-on-treatment strategy)
This is the traditional approach.

21 Reframing PFS: Treatment Policy Strategy
Treatment policy strategy: Tumor assessment visits should be scheduled independently of treatment visits, and patients are assessed until documented progression regardless of ending treatment, new therapy, etc.
This is the approach recommended by Fleming et al. (2009) and by the EMA in its 2012 PFS Methodological Considerations Appendix.
Fleming et al.’s fundamental idea was to change the research objective:
  Instead of attempting to estimate each individual patient’s efficacy, focus strictly on making a population comparison between the arms
  Making assessments independent of treatment schedule, and continuing through subsequent therapy etc., results in an unbiased comparison even though individual-arm estimates may be biased or uninterpretable.
  In a randomized, blinded trial, individual-patient and individual-arm bias gets washed out in the comparison
  Results in greater power, due to longer follow-up and including events that would traditionally result in censoring.
Reframing the research question in light of study-feasibility or statistical considerations is a standard tool in an estimand framework.

22 PFS: Treatment Policy Strategy Drawbacks
Compliance rates may be low.
  Patients may not be motivated to comply following treatment, and compliance may be infeasible
  Continuing to assess past subsequent therapy or enrollment in a subsequent trial may often be infeasible.
  Making assessments fully independent of treatment is often not feasible either
The likely practical effect is a mixture of patients assessed independently or past subsequent therapy, and patients who end assessments and are censored earlier.
  No guarantee mixture selection is treatment-independent
  Non-compliance may result in selection bias
  Tends to further diminish interpretability of point estimates
  Endpoint actually estimated may be a mixture of traditional PFS and OS
Fleming’s idea makes intercurrent events uninformative with respect to a specific research question (comparison between the arms) and this question alone, but
  Point estimates may be less interpretable,
  Usability of trial data to answer other cognate questions may be diminished.

23 PFS: Treatment Policy Strategy: Importance of feasibility assessment
Critical to assess likelihood of compliance under study conditions
  Line of therapy: After treatment, will patients be receiving standard therapy at the clinic, or will they be entering new clinical trials or not receiving further treatment?
    Patients entering new trials may be unable to comply
    Patients not receiving any therapy may have less incentive to comply
  Does protocol permit long treatment delays? Can patients feasibly come to clinic in between?
    If appropriate, intermediate safety assessments in the event of treatment delay may better approximate treatment independence.
If compliance is not likely feasible, Fleming’s unbiased approach is likely infeasible
  It might be better either to use a more conventional approach and admit its limitations, or use OS if too many patients’ only admissible events will be death events.

24 PFS: Composite Strategy
A number of intercurrent events have traditionally been incorporated into PFS sensitivity analyses
  These events include clinical progression (based on symptom deterioration or radiological documentation by means other than the protocol criteria), change in therapy, and sometimes others.
Regulatory authorities, backed by past research, have disapproved of incorporating these events into the primary PFS endpoint
  Variation: Events like “clinical progression” are often ill defined, subjective, and varying between evaluators
  Bias: Potential evaluator bias, especially if evaluator knows or guesses treatment assignment.
As noted above, the on-treatment strategy generally censors for these events, while the treatment policy strategy attempts to continue measuring through them
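To make the contrast between the three PFS strategies concrete, the following sketch derives a (time, event) pair for one patient under each of them. It is a deliberately simplified illustration: the `Patient` record, the `derive_pfs` function, and the strategy labels are all hypothetical, start of new therapy is the only intercurrent event modeled, and real censoring rules (for example, backdating to the last adequate tumor assessment) are more detailed than shown here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Patient:
    """Hypothetical per-patient record; all times are weeks from randomization."""
    last_assessment_week: float
    progression_week: Optional[float] = None  # protocol-defined radiological PD
    death_week: Optional[float] = None
    new_therapy_week: Optional[float] = None  # the intercurrent event


def _earliest(*weeks: Optional[float]) -> Optional[float]:
    known = [w for w in weeks if w is not None]
    return min(known) if known else None


def derive_pfs(p: Patient, strategy: str) -> Tuple[float, bool]:
    """Return (time, is_event) for one patient under the named strategy."""
    pfs_event = _earliest(p.progression_week, p.death_week)

    if strategy == "treatment_policy":
        # Keep assessing through new therapy; only PD or death is an event.
        if pfs_event is not None:
            return pfs_event, True
        return p.last_assessment_week, False

    if strategy == "on_treatment":
        # Censor at start of new therapy when it precedes PD or death.
        cutoff = p.new_therapy_week
        if pfs_event is not None and (cutoff is None or pfs_event <= cutoff):
            return pfs_event, True
        return _earliest(cutoff, p.last_assessment_week), False

    if strategy == "composite":
        # Start of new therapy is itself counted as an event.
        any_event = _earliest(pfs_event, p.new_therapy_week)
        if any_event is not None:
            return any_event, True
        return p.last_assessment_week, False

    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical patient: new therapy at week 20, documented PD at week 30.
patient = Patient(last_assessment_week=36, progression_week=30, new_therapy_week=20)
for name in ("on_treatment", "treatment_policy", "composite"):
    print(name, derive_pfs(patient, name))
```

For this patient the on-treatment rule gives a censored time of 20 weeks, the treatment policy rule gives an event at 30 weeks, and the composite rule gives an event at 20 weeks, so the same record contributes differently to the analysis depending on the strategy chosen.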

25 PFS: Composite Strategy Revisited
Under the estimand framework, the inappropriateness of a composite strategy incorporating these events might be worth reconsidering
A composite strategy is often the optimal strategy under the estimand framework when
  Intercurrent events are highly positively correlated with the event in question
  It is infeasible to continue assessment through them
If intercurrent events are sufficiently correlated with the event of interest (formal radiological PD), the benefits of incorporating them into a composite endpoint might outweigh the risks associated with increased variability and bias potential.
When this is so may be highly dependent on clinical context
  Perhaps limited to contexts where e.g.
    Evidence shows intercurrent events highly correlated with the primary event
    Compliance with assessment beyond intercurrent events is expected to be low
It might be worth reconsidering the role of composite PFS endpoints in appropriate contexts

26 PFS – Opportunities for leadership
Clinical trial planning
  Forge an appropriate compromise between desired strategy and patient compliance and feasibility needs
  Take into account the specific needs of the study, indication, and patient population
  Work in collaboration with, and help devise strategy with (and sell it to), regulatory, clinicians, researchers, and monitors
  Ensure visit schedule is appropriate for assessments
Research opportunities
  Evaluate tradeoffs between bias/variation caused by including intercurrent events in composite endpoints, and bias/variation caused by informative censoring when censoring for these events.
  Past recommendations by Dodd et al., Fleming et al., and others on these issues, which have been accepted by regulatory authorities and incorporated into existing guidance, were generally based on strong assumptions, such as noninformative censoring.
  Are these recommendations still valid if we relax these assumptions?

27 PFS – Example
The following example, an extreme one, shows the importance of visit schedule to reliable point estimation
In a Phase III 3-arm trial of ipilimumab monotherapy or ipilimumab + gp100 vs. gp100 monotherapy (Hodi et al. 2010), the first visit was scheduled at 12 weeks.
As the next slide shows, all 3 arms reached median PFS prior to the first assessment
Reported median PFS would be virtually identical for the 3 treatment arms
  Reflected only the identical 12-week first visit
  No actual relation to efficacy
Reported CIs would be very narrow
  Would reflect only variation in scheduling patient appointments around the 12-week target, not actual variation in real PFS.
  No actual relation to estimation reliability
The log-rank test remained interpretable, but point estimates were useless
  As the article mentions, the team did not report them

28 Example: PFS Median Issues
Source: Hodi et al. N Engl J Med 363: (2010)
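The effect of a late first assessment on the median can be reproduced in a toy simulation. This sketch does not use any trial data: it assumes exponentially distributed true progression times with made-up medians of 6, 8, and 10 weeks, assessments exactly every 12 weeks, and no deaths or censoring. The names `observed_pfs_weeks` and `simulate_arm` are hypothetical.

```python
import math
import random
import statistics

VISIT_INTERVAL_WEEKS = 12  # assumed: first assessment at week 12, then every 12 weeks


def observed_pfs_weeks(true_weeks):
    """Progression is only documented at the first visit on or after it occurs."""
    visits_needed = max(1, math.ceil(true_weeks / VISIT_INTERVAL_WEEKS))
    return visits_needed * VISIT_INTERVAL_WEEKS


def simulate_arm(true_median_weeks, n=2000, seed=0):
    """Return (sample median of true times, median of observed times) for one arm."""
    rng = random.Random(seed)
    rate = math.log(2) / true_median_weeks  # exponential with the given median
    true_times = [rng.expovariate(rate) for _ in range(n)]
    observed = [observed_pfs_weeks(t) for t in true_times]
    return statistics.median(true_times), statistics.median(observed)


# Three hypothetical arms whose true medians all fall before the first visit.
results = {m: simulate_arm(m, seed=i) for i, m in enumerate((6, 8, 10))}
for m, (true_med, obs_med) in results.items():
    print(f"design median {m:2d} wk: true {true_med:5.1f} wk, observed {obs_med} wk")
```

Because more than half of each arm progresses before week 12, the observed median lands on the week-12 visit in every arm even though the underlying medians differ, which is the pattern described for the reported medians. A comparison of the full distributions can still separate the arms, since differences show up at later visits.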

29 Example: PFS and Immunotherapy
As an example trial, we’ll use the Bristol-Myers Squibb CA pivotal trial of ipilimumab+dacarbazine vs. placebo+dacarbazine in metastatic melanoma (Robert et al. 2011)
This trial began with PFS as primary endpoint
During the course of the trial, a number of potentially confounding intercurrent events became better known, including
  Immune-mediated tumor swelling that could mimic progression (Wolchok et al. 2009).
  Potential for delayed effects that could potentially reduce power in a faster-maturing endpoint (Siegel 2010)
Study design problem: First assessment occurred at week 12, after historical median for dacarbazine (Robert et al. 2011), similar issues to previous slide (Hodi et al. 2010)

30 Example: Immunotherapy (Cont)
Primary endpoint was switched to OS mid-study (ClinicalTrials.gov disclosure April 20, 2009)
Study based on OS was successful (HR 0.72, p<0.001)
Secondary PFS was also significant (HR 0.76, p=0.006)
As with Hodi et al. (2010), median PFS estimates were nearly identical
  Both estimated at approximately 12-week first assessment
  Assessment schedule could not result in a reliable estimate
  Although treatment effect was sufficiently strong that PFS benefit remained significant, switch to OS appears to have been prudent
Identity of medians is an example of need to ensure visit schedule can support needed estimates.
Subsequent research has addressed specialized effects of immunotherapy, including immune response criteria and study designs addressing delayed effects (Siegel 2015) and iRECIST (Seymour et al. 2017)
Immunotherapy has provided a strong example of the role statistical evaluation of intercurrent events can have in innovative design

31 Statistics as an Engineering Discipline
Statistics is often taught as a branch of mathematics
  Statisticians become experts in particular theories and techniques and think in terms of mathematical objects
  Maximal generality is best
An engineering discipline thinks in terms of classes of applied problems and develops techniques appropriate to solve them
  Thinking tends closer to empirical observation
  Requires effective understanding of clinical terms and concepts
Effective solutions to the estimand problems require bridging theory and practice, understanding issues like
  How visit schedules, and the practicalities of what patients can feasibly be asked to do, affect statistical assumptions
  How limitations on the definition of a research question affect the medical or commercial value of a label claim
  How clinical context affects likelihood of compliance and ability to achieve particular research goals
Can benefit from an engineering approach

32 Conclusions
The draft estimand guidance reflects an interplay between statistical theory and clinical knowledge and experience
Knowledge of how patient behavior and study design decisions affect endpoint definitions, analysis models, and interpretations is critical
Statisticians could benefit from greater exposure to both disease mechanisms and treatment and study behavior in order to inform assumptions and models
Clinically informed applied statisticians have a greater opportunity to assert leadership positions in both study design and clinical research
Research statisticians have an opportunity to design new methods and tools better meeting clinical needs.

33 Questions?
Thank You!

34 References
ClinicalTrials.gov NCT (History of changes. April 9, 2009; April 20, 2009; October 24, 2014).
European Medicines Agency, Committee for Human Medicinal Products. Draft ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials, step 2b - Revision 1 (2017).
Food and Drug Administration. Draft E9(R1) Statistical Principles for Clinical Trials: Addendum: Estimands and Sensitivity Analysis in Clinical Trials (2017).
Frangakis C and Rubin D. Principal stratification in causal inference. Biometrics 58:21–29 (2002).
Dodd L et al. Blinded independent central review of progression-free survival in phase III clinical trials: important design element or unnecessary expense? J Clin Oncol 26: (2008).
Fleming T et al. Issues in using progression-free survival when evaluating oncology products. J Clin Oncol 27: (2009).
Hodi et al. Improved survival with ipilimumab in patients with metastatic melanoma. N Engl J Med 363: (2010).
Robert C et al. Ipilimumab plus dacarbazine for previously untreated metastatic melanoma. N Engl J Med 364: (2011).
Seymour L et al. iRECIST: guidelines for response criteria for use in trials testing immunotherapeutics. Lancet Oncol 18:e143–e152 (2017).
Wolchok J et al. Guidelines for the evaluation of immune therapy activity in solid tumors: immune-related response criteria. Clin Cancer Res 15:7412–20 (2009).


