Imputation Strategies When a Continuous Outcome is to be Dichotomized for Responder Analysis: A Simulation Study Lysbeth Floden, PhD1 Melanie Bell, PhD2.

Slides:



Advertisements
Similar presentations
Treatment of missing values
Advertisements

CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Chap 8-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 8 Estimation: Single Population Statistics for Business and Economics.
By Trusha Patel and Sirisha Davuluri. “An efficient method for accommodating potentially underpowered primary endpoints” ◦ By Jianjun (David) Li and Devan.

Chapter 8 Estimation: Single Population
Modeling Achievement Trajectories When Attrition is Informative Betsy J. Feldman & Sophia Rabe- Hesketh.
Maximum likelihood (ML)
Statistical Methods for Missing Data Roberta Harnett MAR 550 October 30, 2007.
Multiple imputation using ICE: A simulation study on a binary response Jochen Hardt Kai Görgen 6 th German Stata Meeting, Berlin June, 27 th 2008 Göteborg.
ANCOVA Lecture 9 Andrew Ainsworth. What is ANCOVA?
Determining Sample Size
Essentials of survival analysis How to practice evidence based oncology European School of Oncology July 2004 Antwerp, Belgium Dr. Iztok Hozo Professor.
PROBABILITY (6MTCOAE205) Chapter 6 Estimation. Confidence Intervals Contents of this chapter: Confidence Intervals for the Population Mean, μ when Population.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
1 Multiple Imputation : Handling Interactions Michael Spratt.
Biostatistics Case Studies 2007 Peter D. Christenson Biostatistician Session 3: Incomplete Data in Longitudinal Studies.
Imputation for Multi Care Data Naren Meadem. Introduction What is certain in life? –Death –Taxes What is certain in research? –Measurement error –Missing.
SW 983 Missing Data Treatment Most of the slides presented here are from the Modern Missing Data Methods, 2011, 5 day course presented by the KUCRMDA,
1 Updates on Regulatory Requirements for Missing Data Ferran Torres, MD, PhD Hospital Clinic Barcelona Universitat Autònoma de Barcelona.
A REVIEW By Chi-Ming Kam Surajit Ray April 23, 2001 April 23, 2001.
Simulation Study for Longitudinal Data with Nonignorable Missing Data Rong Liu, PhD Candidate Dr. Ramakrishnan, Advisor Department of Biostatistics Virginia.
1 Handling of Missing Data. A regulatory view Ferran Torres, MD, PhD IDIBAPS. Hospital Clinic Barcelona Autonomous University of Barcelona (UAB)
Sample Size Determination
Tutorial I: Missing Value Analysis
Armando Teixeira-Pinto AcademyHealth, Orlando ‘07 Analysis of Non-commensurate Outcomes.
Rerandomization to Improve Covariate Balance in Randomized Experiments Kari Lock Harvard Statistics Advisor: Don Rubin 4/28/11.
Approaches to quantitative data analysis Lara Traeger, PhD Methods in Supportive Oncology Research.
Statistics for Business and Economics 8 th Edition Chapter 7 Estimation: Single Population Copyright © 2013 Pearson Education, Inc. Publishing as Prentice.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Statistics for Business and Economics 7 th Edition Chapter 7 Estimation: Single Population Copyright © 2010 Pearson Education, Inc. Publishing as Prentice.
DATA STRUCTURES AND LONGITUDINAL DATA ANALYSIS Nidhi Kohli, Ph.D. Quantitative Methods in Education (QME) Department of Educational Psychology 1.
Research and Evaluation Methodology Program College of Education A comparison of methods for imputation of missing covariate data prior to propensity score.
Theme 6. Linear regression
Missing data: Why you should care about it and what to do about it
Chapter 7 Confidence Interval Estimation
Sample Size Determination
Chapter 6 Inferences Based on a Single Sample: Estimation with Confidence Intervals Slides for Optional Sections Section 7.5 Finite Population Correction.
ESTIMATION.
Logistic Regression APKC – STATS AFAC (2016).
Multiple Imputation using SOLAS for Missing Data Analysis
Sampling Distributions and Estimation
MISSING DATA AND DROPOUT
THE LOGIT AND PROBIT MODELS
Confidence Intervals and p-values
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
The Centre for Longitudinal Studies Missing Data Strategy
Lecture 4: Meta-analysis
Applied Statistical Analysis
Combining Random Variables
Multiple Imputation.
Multiple Imputation Using Stata
How to handle missing data values
Analyzing Intervention Studies
THE LOGIT AND PROBIT MODELS
Chapter 9 Hypothesis Testing.
Comparisons among methods to analyze clustered multivariate biomarker predictors of a single binary outcome Xiaoying Yu, PhD Department of Preventive Medicine.
Ying shen Sse, tongji university Sep. 2016
What is Regression Analysis?
CH2. Cleaning and Transforming Data
Missing Data Mechanisms
Analysis of missing responses to the sexual experience question in evaluation of an adolescent HIV risk reduction intervention Yu-li Hsieh, Barbara L.
Parametric Methods Berlin Chen, 2005 References:
Handling Missing Not at Random Data for Safety Endpoint in the Multiple Dose Titration Clinical Pharmacology Trial Li Fan*, Tian Zhao, Patrick Larson Merck.
Clinical prediction models
Stanford University School of Medicine
MGS 3100 Business Analysis Regression Feb 18, 2016
Yu Du, PhD Research Scientist Eli Lilly and Company
Considerations for the use of multiple imputation in a noninferiority trial setting Kimberly Walters, Jie Zhou, Janet Wittes, Lisa Weissfeld Joint Statistical.
2019 Joint Statistical Meetings at Denver
Presentation transcript:

Imputation Strategies When a Continuous Outcome is to be Dichotomized for Responder Analysis: A Simulation Study Lysbeth Floden, PhD1 Melanie Bell, PhD2 Joint Statistical Meeting, July 30 2019, Denver, Co Clinical Outcomes Solutions, Tucson AZ Mel and Enid College of Public Health, University of Arizona, Tucson, AZ

Responder analysis Compares proportions of patients achieving successful response Patients are classified as responders, often by dichotomizing a continuous outcome, if they improve by a specified threshold In clinical trials, responder analysis is important because: Produces interpretable results Conceptually appealing to many stakeholders Recommended by regulatory agencies Increasingly used because of recent legislation for patient-reported outcome Responder analysis compares proportions of subjects who have responded to treatment. Usually this is done by dichotomizing a continuous variable based on whether or not a change score has met a threshold. It’s a straightforward, simple approach, which is partly the appeal. We know that you lose important information from the continuous variable when dichotomizing, so ti’s not an efficient approach. (statisticians don’t like it) But it is very important for a few reasons: if you’re instead comparing mean difference of groups, you can achieve a statistically significant results, even is the group difference is small. And, a small difference might not be clinically relevant (think about a weight loss intervention where on average the treatment group lost 2 pounds more). So the threshold used to generate responder status is important, and almost always represents the minimum amount of change that is meaningful. Now, instead of assessing a trial’s success based on the group-level, we have the proportion of individuals who have reached a meaningful response. Results that can be expressed as, for example “50% of patients in the treatment arm responded compared to 10% in the control arm” are easily interpreted by a wide audience. For this reason, results from responder analysis lends labeling language, so is favored by pharm companies/sponsors And lastly, it is recommended (in published guidance, which essentially means you have to do it) when analyzing PRO in a regulated setting, now required of all trials. This is an area in which I work, and this dissertations is largely informed by my experiences.

Responder analysis Let 𝑌 𝑖𝑗 represent a continuous repeated measure for the 𝑖 𝑡ℎ individual at the 𝑗 𝑡ℎ time where 𝑗=1,…,𝑇, and 𝑇 is the end of study The response threshold, 𝜆, often represents the smallest amount of change that is meaningful Responder status is 𝑅 𝑖 = 1 𝑖𝑓 𝑌 𝑖𝑇 − 𝑌 𝑖1 ≥𝜆 0 𝑖𝑓 𝑌 𝑖𝑇 − 𝑌 𝑖1 <𝜆 The difference in proportions, 𝜃, is evaluated via 𝜒 2 test For simplicity throughout this work, I’m assuming that responder analysis is conducted at the end of the study. Trials generally measure an outcome repeatedly over the course of the study, and responder analysis uses the change of this value at the end fo the study compared to a person’s baseline value. Do you use Ci again? If not, why the intermediate step? Just define Ri using Yi’s

Missing data in responder analysis Lack of missing data methods specific to responder analysis in the literature Often complete case analysis Non-response Imputation (NRI) Missing observations = non-responder Recommended in regulatory setting Thought to be conservative Will never overestimate true within group rate of response Missing data defined here is when the outcome measure is not available at the end of the study. We assume that there is not missing at baseline. When 𝑌 𝑖𝑇 is not observed, responder status cannot be determined IN the literature, usually complete cases Also, seen is when missing observations are imputed as non–responders. This is recommended in regulatory guidance, and promoted by prominent trial statisticians. So you will see a theme in this work where we tackle this recommendation.

Motivation When a continuous outcome is ultimately dichotomized, the specifications of multiple imputation (MI) come into question. Practitioners can either impute the missing outcome before dichotomizing the response (IBD) or dichotomize the outcome then impute the response (DTI). The evidence on which is best is conflicting, for example: Demirtas (2007) concluded that DTI was superior across most scenarios Yoo (2010) concluded MI with GEEs performs better when IBD is used Demirtas evaluated efficiency and accuracy of the estimated proportions of responders using IBD under the multivariate normal assumption compared to DTI using a saturated binomial model for the dichotomous response indicator, and concluded that DTI was superior across most scenarios.64 This finding is in contrast to Yoo’s work that concluded MI with GEEs performs better when the underlying continuous outcome is imputed prior to dichotomizing.81 More generally, Von Hippel’s work supports the use of just-another-variable (JAV), analogous to DTI, to impute a quadratic and interaction term under a linear regression analysis model with a conceptual argument extending to the logistic setting.63 Others demonstrated poor performance using JAV when data were MAR particularly with logistic regression82, prompting some researchers to discourage this practice.80

Goal Provide recommendations for handling of missing data in responder analysis that is: Statistically principled Straightforward Implemented in standard statistical software Clarify inconsistent results in the performance of multiply imputing the IBD or DTI in responder analysis Challenge a currently recommended method to impute missing observations as non-responder In virtually every longitudinal trial, there are missing data.

Missing mechanisms Let 𝑌 be a 𝑖×𝑗 matrix of data and 𝑍 a 𝑖×𝑗 matrix indicating whether the (𝑖,𝑗) 𝑡ℎ element is missing or observed. Data are: Missing completely at random (MCAR) if ~𝑍 is independent of the unobserved values of 𝑌 𝑃 𝑍 𝑌 =𝑃(𝑍) for all 𝑌 Missing at random (MAR) if ~𝑍 is conditionally independent of the unobserved values of 𝑌 𝑃 𝑍 𝑌 =𝑃(𝑍| 𝑌 𝑜𝑏𝑠 ) for all 𝑌 Missing not at random (MNAR) if ~𝑍 is dependent on unobserved 𝑌 MCAR: this assumption states that missingness is not related to any factor, known or unknown MAR: missingness depends only on observed quantities, which may include outcomes and predictors MNAR: missingness cannot be simplified (i.e. it depends on unobserved quantities)

𝑌 𝑖𝑗 = 𝛽 0 + 𝑏 𝑖 + 𝛽 𝑗 + 𝛿 𝑗 ∗ 𝑥 𝑡𝑟𝑡 +𝜖 𝑖𝑗 Simulation We simulated a randomized, controlled, two-arm trial, N=200 Let 𝑌 𝑖𝑗 represent a continuous measure for the 𝑖 𝑡ℎ individual at the 𝑗 𝑡ℎ time where 𝑗=1,…,4 Higher scores represent a better outcome Data were simulated according to the underlying model: 𝑌 𝑖𝑗 = 𝛽 0 + 𝑏 𝑖 + 𝛽 𝑗 + 𝛿 𝑗 ∗ 𝑥 𝑡𝑟𝑡 +𝜖 𝑖𝑗 where 𝑥 𝑡𝑟𝑡 =1 for treatment arm A and 0 for treatment arm B, 𝛽 𝑗 denotes the effect of the 𝑗 𝑡ℎ timepoint and 𝛿 𝑗 ∗ 𝑥 𝑡𝑟𝑡 is the interaction of treatment group and the timepoint. a continuous outcome measured at baseline and three subsequent time points Higher scores represent better outcomes Here, 𝑏 𝑖 ~𝑁(0, 𝜎 𝑏 2 ) represents the random subject effect and the error term, 𝜖 𝑖𝑗 ~𝑁 0, 𝜎 𝜖 2 represents the within-subject error. Compound symmetric **Linear trajectory, where only Arm A shows an effect.***

Creating missingness MAR, lower scores more likely to be missing, same for both arms: 𝑃 𝑍 𝑖𝑗 =0 ∝ 1−Φ 𝑌 𝑗−1 , 𝜃 𝑌 𝑗−1 , 𝜎 𝑌 𝑗−1 2 where Φ is the normal cumulative distribution function with mean 𝜃 𝑌 𝑗−1 and standard deviation 𝜎 𝑌 𝑗−1 2 estimated from the data MAR, differing mechanism, differential dropout: Treatment group: 𝑃 𝑍 𝑖𝑗 =0 ∝ 1−Φ 𝑌 𝑗−1 , 𝜃 𝑌 𝑗−1 , 𝜎 𝑌 𝑗−1 2 Control group: 𝑃 𝑍 𝑖𝑗 =0 ∝ Φ 𝑌 𝑗−1 , 𝜃 𝑌 𝑗−1 , 𝜎 𝑌 𝑗−1 2 creating monotone pattern of missing 30% and 50% missing at 𝑌 4 We generated incomplete datasets consistent with two MAR mechanisms. We assumed that all baseline values were observed. The probability of a missing observation at 𝑦 𝑖𝑗 , 𝑗>1, depended on the value of 𝑦 𝑗−1 , generating a MAR mechanism. Same: Data are more likely to be missing when the outcome values are low for both groups. b Diff: For the treatment group, data are more likely to be missing when the outcome values are low and for the control group missing data are more likely when the outcome values are high. *Created monotone missing, such that if time J is missing, all J+1 are missing also

Imputation Methods NRI: Multiple Imputation: IBD MI DTI MI 𝑅 𝑖 =0 if 𝑌 𝑖4 was missing Estimated 𝜃 Multiple Imputation: Using fully conditional specification, and Imputation model included repeated continuous outcomes used as auxiliary variables IBD MI Imputed the missing continuous outcomes 𝑌 𝑖𝑗 Used 𝑌 𝑖4 − 𝑌 𝑖1 to calculate 𝑅 𝑖 Estimated 𝜃 for the 𝑀 datasets DTI MI Imputed partially observed 𝑅 For IBD MI and DTI MI, all imputation models contained the group indicator, 𝑋 𝑡𝑟𝑡 , and the continuous outcomes 𝑌 𝑗 . In some imputation models, we included 𝐶𝑉, a variable representing a correlated covariate to evaluate the utility of including an auxiliary variable. For DTI MI, the imputation model included the binary response variable, 𝑅. Scenarios using dropout model 6 also evaluated the use of AE status at 𝑗=2,3,4 in the imputation model. The 𝑀=30 or 𝑀=50 estimates 62 of the difference in proportions and respective standard errors when 30% or 50% of responses at 𝑗=4 were missing, respectively, were combined using Rubin’s Rules.15

Methods Percent bias: 𝑃𝑒𝑟𝑐𝑒𝑛𝑡 𝑏𝑖𝑎𝑠 𝑜𝑓 𝜃= 1 𝑛 𝑠𝑖𝑚 𝑖=1 𝑛 𝑠𝑖𝑚 𝜃 𝑖 −𝜃 𝜃 ∗100 Coverage of the 95% CI the proportion of results where the 95% CI contained the true value Power percentage of statistically significant group differences at 𝑝≤0.05 % Bias: Positive values represent overestimates of the difference in proportions. Coverage probability as the proportion of MI results where the true value was contained within 95 CI Power was calculated as the percentage of statistically significant group differences at 𝑝≤0.05. T 1 Error rate as the percentage of statistically significant group differences when simulating a scenario with no between group difference. MSE is a combined measure of variance and bias. SEmod is the average standard error of each 𝜃 𝑖 , and SEemp, is the standard error of 𝜃 , measuring the efficiency of 𝜃 .

Difference in proportions (95% CI) Results NRI, IBD MI and DTI MI with a linear response trajectory, 30% missing. % responders in Treatment A and B was 25.6 and 10.6, respectively, 𝜃=15.0 Imputation method % Responders Trt A % Responders Trt B Difference in proportions (95% CI) % Bias Coverage, 95% CI Power 1: Lower scores more likely to be missing   NRI 17.6 6.9 10.6 (1.7, 19.5) -29.2 81.3 0.64 DTI MI 26.5 10.7 15.9 (5.4, 26.4) 6.0 95.2 0.77 IBD MI 25.7 10.8 14.9 (4.5, 25.3) -0.6 0.70 2: Differential dropout 21.5 5.3 16.2 (7.2, 25.3) 8.5 93.8 0.94 26.0 15.2 (4.8, 25.7) 1.8 0.71 25.6 10.9 14.7 (4.3, 25.1) -1.8 94.5 0.69 When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5 to -36.7% for NRI IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative for 4 out of the 5 dropout models.

Difference in proportions (95% CI) Results NRI, IBD MI and DTI MI with a linear response trajectory, 30% missing. % responders in Treatment A and B was 25.6 and 10.6, respectively, 𝜃=15.0 Imputation method % Responders Trt A % Responders Trt B Difference in proportions (95% CI) % Bias Coverage, 95% CI Power 1: Lower scores more likely to be missing   NRI 17.6 6.9 10.6 (1.7, 19.5) -29.2 81.3 0.64 DTI MI 26.5 10.7 15.9 (5.4, 26.4) 6.0 95.2 0.77 IBD MI 25.7 10.8 14.9 (4.5, 25.3) -0.6 0.70 2: Differential dropout 21.5 5.3 16.2 (7.2, 25.3) 8.5 93.8 0.94 26.0 15.2 (4.8, 25.7) 1.8 0.71 25.6 10.9 14.7 (4.3, 25.1) -1.8 94.5 0.69 When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5 to -36.7% for NRI IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative for 4 out of the 5 dropout models.

Difference in proportions (95% CI) Results NRI, IBD MI and DTI MI with a linear response trajectory, 50% missing. % responders in Treatment A and B was 25.6 and 10.6, respectively, 𝜃=15.0 Imputation method % Responders Trt A % Responders Trt B Difference in proportions (95% CI) % Bias Coverage, 95% CI Power 1: Lower scores more likely to be missing   NRI 12.8 4.8 8.0 (0.3, 15.7) -46.8 55.6 0.52 DTI MI 27.5 11.2 16.3 (5.7, 26.9) 8.8 91.5 0.72 IBD MI 25.8 11.1 14.8 (4.3, 25.2) -1.5 94.1 0.59 2: Differential dropout 18.3 1.8 16.5 (8.5, 24.4) 10.0 93.9 0.99 26.2 14.5 11.7 (0.9, 22.5) -21.8 77.5 0.48 25.7 11.8 13.9 (3.4, 24.5) -6.9 92.8 0.49 When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5 to -36.7% for NRI IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative for 4 out of the 5 dropout models.

Difference in proportions (95% CI) Results NRI, IBD MI and DTI MI with a linear response trajectory, 50% missing. % responders in Treatment A and B was 25.6 and 10.6, respectively, 𝜃=15.0 Imputation method % Responders Trt A % Responders Trt B Difference in proportions (95% CI) % Bias Coverage, 95% CI Power 1: Lower scores more likely to be missing   NRI 12.8 4.8 8.0 (0.3, 15.7) -46.8 55.6 0.52 DTI MI 27.5 11.2 16.3 (5.7, 26.9) 8.8 91.5 0.72 IBD MI 25.8 11.1 14.8 (4.3, 25.2) -1.5 94.1 0.59 2: Differential dropout 18.3 1.8 16.5 (8.5, 24.4) 10.0 93.9 0.99 26.2 14.5 11.7 (0.9, 22.5) -21.8 77.5 0.48 25.7 11.8 13.9 (3.4, 24.5) -6.9 92.8 0.49 When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5 to -36.7% for NRI IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative for 4 out of the 5 dropout models.

Difference in proportions (95% CI) Results NRI, IBD MI and DTI MI with a linear response trajectory, 50% missing. % responders in Treatment A and B was 25.6 and 10.6, respectively, 𝜃=15.0 Imputation method % Responders Trt A % Responders Trt B Difference in proportions (95% CI) % Bias Coverage, 95% CI Power 1: Lower scores more likely to be missing   NRI 12.8 4.8 8.0 (0.3, 15.7) -46.8 55.6 0.52 DTI MI 27.5 11.2 16.3 (5.7, 26.9) 8.8 91.5 0.72 IBD MI 25.8 11.1 14.8 (4.3, 25.2) -1.5 94.1 0.59 2: Differential dropout 18.3 1.8 16.5 (8.5, 24.4) 10.0 93.9 0.99 26.2 14.5 11.7 (0.9, 22.5) -21.8 77.5 0.48 25.7 11.8 13.9 (3.4, 24.5) -6.9 92.8 0.49 When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5 to -36.7% for NRI IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative for 4 out of the 5 dropout models.

Summary Both methods of multiple imputation are slightly biased when there are moderate amounts of missing data (30%) Imputing the continuous outcome prior to dichotomizing was less biased with higher rates of missingness (50%) Non-response imputation was both positively and negatively biased