Imputation Strategies When a Continuous Outcome is to be Dichotomized for Responder Analysis: A Simulation Study Lysbeth Floden, PhD1 Melanie Bell, PhD2.

Imputation Strategies When a Continuous Outcome is to be Dichotomized for Responder Analysis: A Simulation Study
Lysbeth Floden, PhD (1); Melanie Bell, PhD (2)
Joint Statistical Meeting, July , Denver, CO
(1) Clinical Outcomes Solutions, Tucson, AZ
(2) Mel and Enid College of Public Health, University of Arizona, Tucson, AZ

Responder analysis
Compares proportions of patients achieving a successful response.
Patients are classified as responders, often by dichotomizing a continuous outcome, if they improve by a specified threshold.
In clinical trials, responder analysis is important because it:
- Produces interpretable results
- Is conceptually appealing to many stakeholders
- Is recommended by regulatory agencies
- Is increasingly used because of recent legislation for patient-reported outcomes

Responder analysis compares the proportions of subjects who have responded to treatment. Usually this is done by dichotomizing a continuous variable based on whether or not a change score has met a threshold. It's a straightforward, simple approach, which is partly the appeal. We know that you lose important information from the continuous variable when dichotomizing, so it's not an efficient approach (statisticians don't like it).

But it is very important for a few reasons. If you instead compare the mean difference between groups, you can achieve a statistically significant result even if the group difference is small, and a small difference might not be clinically relevant (think about a weight loss intervention where, on average, the treatment group lost 2 pounds more). So the threshold used to generate responder status is important, and it almost always represents the minimum amount of change that is meaningful. Now, instead of assessing a trial's success at the group level, we have the proportion of individuals who have reached a meaningful response.

Results that can be expressed as, for example, "50% of patients in the treatment arm responded compared to 10% in the control arm" are easily interpreted by a wide audience. For this reason, results from responder analysis lend themselves to labeling language, so the approach is favored by pharmaceutical companies and sponsors. And lastly, it is recommended (in published guidance, which essentially means you have to do it) when analyzing patient-reported outcomes (PROs) in a regulated setting, now required of all trials.

This is an area in which I work, and this dissertation is largely informed by my experiences.

Responder analysis
Let $Y_{ij}$ represent a continuous repeated measure for the $i$th individual at the $j$th time, where $j = 1, \dots, T$ and $T$ is the end of study.
The response threshold, $\lambda$, often represents the smallest amount of change that is meaningful.
Responder status is

$$R_i = \begin{cases} 1 & \text{if } Y_{iT} - Y_{i1} \ge \lambda \\ 0 & \text{if } Y_{iT} - Y_{i1} < \lambda \end{cases}$$

The difference in proportions, $\theta$, is evaluated via a $\chi^2$ test.

For simplicity throughout this work, I'm assuming that responder analysis is conducted at the end of the study. Trials generally measure an outcome repeatedly over the course of the study, and responder analysis uses the change in this value at the end of the study compared to a person's baseline value.
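As an illustration of the definition above, here is a minimal sketch that classifies responders from baseline and end-of-study values and compares the two arms with a Pearson chi-square test. The data, the threshold value, and the function names are hypothetical and are not taken from the presentation; the test is computed without a continuity correction, which is an assumption about how the comparison might be run.

```python
import math

def responder_status(baseline, final, threshold):
    """Return 1 if the change from baseline meets the threshold, else 0."""
    return 1 if (final - baseline) >= threshold else 0

def chi_square_2x2(resp_a, n_a, resp_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) for two proportions.

    Assumes every expected cell count is positive.
    """
    table = [[resp_a, n_a - resp_a], [resp_b, n_b - resp_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [resp_a + resp_b, total - resp_a - resp_b]
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            stat += (table[r][c] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical (baseline, end-of-study) pairs; higher scores are better.
arm_a = [(50, 58), (47, 49), (52, 60), (55, 61)]
arm_b = [(51, 52), (49, 55), (50, 50), (53, 54)]
lam = 5  # hypothetical response threshold

r_a = [responder_status(b, f, lam) for b, f in arm_a]
r_b = [responder_status(b, f, lam) for b, f in arm_b]
theta_hat = sum(r_a) / len(r_a) - sum(r_b) / len(r_b)
stat, p = chi_square_2x2(sum(r_a), len(r_a), sum(r_b), len(r_b))
print(r_a, r_b, theta_hat, stat, round(p, 4))
```

With samples this small a chi-square approximation would not be used in practice; the toy data only show the mechanics.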

Missing data in responder analysis
There is a lack of missing data methods specific to responder analysis in the literature.
Often, complete case analysis is used.
Non-response imputation (NRI):
- Missing observations = non-responder
- Recommended in the regulatory setting
- Thought to be conservative
- Will never overestimate the true within-group rate of response

Missing data is defined here as the outcome measure not being available at the end of the study. We assume that there are no missing values at baseline. When $Y_{iT}$ is not observed, responder status cannot be determined. In the literature, the analysis usually uses complete cases. Also seen is the approach where missing observations are imputed as non-responders. This is recommended in regulatory guidance and promoted by prominent trial statisticians, so you will see a theme in this work where we tackle this recommendation.
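To make the contrast between complete case analysis and non-response imputation concrete, the sketch below computes a within-arm response rate both ways. The data, threshold, and function names are hypothetical; a missing end-of-study value is represented as None.

```python
def nri_status(baseline, final, threshold):
    """Non-response imputation: a missing final value (None) counts as a non-responder."""
    if final is None:
        return 0
    return 1 if (final - baseline) >= threshold else 0

def nri_rate(pairs, threshold):
    """Response rate with every randomized subject in the denominator."""
    statuses = [nri_status(b, f, threshold) for b, f in pairs]
    return sum(statuses) / len(statuses)

def complete_case_rate(pairs, threshold):
    """Response rate among subjects whose final value was observed."""
    observed = [(b, f) for b, f in pairs if f is not None]
    return nri_rate(observed, threshold)

# Hypothetical arm: two of five subjects are missing the end-of-study value.
arm = [(50, 58), (47, None), (52, 60), (55, None), (49, 50)]
nri = nri_rate(arm, 5)
cc = complete_case_rate(arm, 5)
print(nri, cc)
```

Because missing subjects can only add zeros, the NRI rate within an arm cannot exceed the rate that would have been seen with complete data. The same guarantee does not carry over to the difference between arms.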

Motivation
When a continuous outcome is ultimately dichotomized, the specifications of multiple imputation (MI) come into question.
Practitioners can either impute the missing outcome before dichotomizing the response (IBD) or dichotomize the outcome then impute the response (DTI).
The evidence on which is best is conflicting, for example:
- Demirtas (2007) concluded that DTI was superior across most scenarios
- Yoo (2010) concluded that MI with GEEs performs better when IBD is used

Demirtas evaluated the efficiency and accuracy of the estimated proportions of responders using IBD under the multivariate normal assumption, compared to DTI using a saturated binomial model for the dichotomous response indicator, and concluded that DTI was superior across most scenarios [64]. This finding is in contrast to Yoo's work, which concluded that MI with GEEs performs better when the underlying continuous outcome is imputed prior to dichotomizing [81]. More generally, Von Hippel's work supports the use of just-another-variable (JAV), analogous to DTI, to impute a quadratic and interaction term under a linear regression analysis model, with a conceptual argument extending to the logistic setting [63]. Others demonstrated poor performance using JAV when data were MAR, particularly with logistic regression [82], prompting some researchers to discourage this practice [80].

Goal
Provide recommendations for the handling of missing data in responder analysis that are:
- Statistically principled
- Straightforward
- Implemented in standard statistical software
Clarify inconsistent results on the performance of multiply imputing with IBD or DTI in responder analysis.
Challenge a currently recommended method, which imputes missing observations as non-responders.

In virtually every longitudinal trial, there are missing data.

Missing mechanisms
Let $Y$ be an $i \times j$ matrix of data and $Z$ an $i \times j$ matrix indicating whether the $(i,j)$th element is missing or observed. Data are:
- Missing completely at random (MCAR) if $Z$ is independent of the unobserved values of $Y$: $P(Z \mid Y) = P(Z)$ for all $Y$
- Missing at random (MAR) if $Z$ is conditionally independent of the unobserved values of $Y$: $P(Z \mid Y) = P(Z \mid Y_{obs})$ for all $Y$
- Missing not at random (MNAR) if $Z$ is dependent on unobserved $Y$

MCAR: this assumption states that missingness is not related to any factor, known or unknown.
MAR: missingness depends only on observed quantities, which may include outcomes and predictors.
MNAR: missingness cannot be simplified (i.e., it depends on unobserved quantities).

Simulation
We simulated a randomized, controlled, two-arm trial, N = 200.
Let $Y_{ij}$ represent a continuous measure for the $i$th individual at the $j$th time, where $j = 1, \dots, 4$ (baseline and three subsequent time points).
Higher scores represent a better outcome.
Data were simulated according to the underlying model:

$$Y_{ij} = \beta_0 + b_i + \beta_j + \delta_j \, x_{trt} + \epsilon_{ij}$$

where $x_{trt} = 1$ for treatment arm A and 0 for treatment arm B, $\beta_j$ denotes the effect of the $j$th timepoint, and $\delta_j \, x_{trt}$ is the interaction of treatment group and timepoint.

Here, $b_i \sim N(0, \sigma_b^2)$ represents the random subject effect, and the error term, $\epsilon_{ij} \sim N(0, \sigma_\epsilon^2)$, represents the within-subject error. The resulting covariance structure is compound symmetric.
Linear trajectory, where only Arm A shows an effect.
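A data-generating step for the random-intercept model above could be sketched as follows. All parameter values (intercept, time effects, treatment-by-time effects, standard deviations), the equal 100/100 allocation, and the function name are assumptions chosen for illustration; the presentation does not report the values used.

```python
import random

def simulate_trial(n_per_arm=100, beta0=50.0,
                   beta_time=(0.0, 0.0, 0.0, 0.0),
                   delta=(0.0, 1.0, 2.0, 3.0),
                   sd_subject=5.0, sd_error=5.0, seed=1):
    """Simulate Y_ij = beta0 + b_i + beta_j + delta_j * x_trt + e_ij for j = 1..4.

    Hypothetical defaults: flat time effects and a linear treatment-by-time
    effect, so only arm A (x_trt = 1) changes over time.
    """
    rng = random.Random(seed)
    subjects = []
    for x_trt in (1, 0):  # 1 = arm A, 0 = arm B
        for _ in range(n_per_arm):
            b_i = rng.gauss(0.0, sd_subject)  # random subject effect
            y = [beta0 + b_i + beta_time[j] + delta[j] * x_trt
                 + rng.gauss(0.0, sd_error)  # within-subject error
                 for j in range(4)]
            subjects.append({"trt": x_trt, "y": y})
    return subjects

trial = simulate_trial()
print(len(trial), trial[0]["trt"], [round(v, 1) for v in trial[0]["y"]])
```

Sharing one draw of b_i across a subject's four measurements is what induces the equal correlation between any two time points.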

Creating missingness
MAR, lower scores more likely to be missing, same for both arms:

$$P(Z_{ij} = 0) \propto 1 - \Phi\left(Y_{j-1};\ \theta_{Y_{j-1}},\ \sigma^2_{Y_{j-1}}\right)$$

where $\Phi$ is the normal cumulative distribution function with mean $\theta_{Y_{j-1}}$ and variance $\sigma^2_{Y_{j-1}}$ estimated from the data.

MAR, differing mechanism, differential dropout:
- Treatment group: $P(Z_{ij} = 0) \propto 1 - \Phi\left(Y_{j-1};\ \theta_{Y_{j-1}},\ \sigma^2_{Y_{j-1}}\right)$
- Control group: $P(Z_{ij} = 0) \propto \Phi\left(Y_{j-1};\ \theta_{Y_{j-1}},\ \sigma^2_{Y_{j-1}}\right)$

Both mechanisms create a monotone pattern of missingness, with 30% and 50% missing at $Y_4$.

We generated incomplete datasets consistent with two MAR mechanisms. We assumed that all baseline values were observed. The probability of a missing observation at $y_{ij}$, $j > 1$, depended on the value of $y_{j-1}$, generating a MAR mechanism.
Same: data are more likely to be missing when the outcome values are low, for both groups.
Diff: for the treatment group, data are more likely to be missing when the outcome values are low; for the control group, missing data are more likely when the outcome values are high.
Missingness was monotone, such that if time $j$ is missing, all later times are missing as well.
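The dropout mechanism described above can be sketched as below. The proportionality constant `scale`, the function name, and the toy data are hypothetical; the presentation does not say how the probabilities were calibrated to reach 30% or 50% missing at the final time point. Setting `low_scores_drop=False` gives the reversed mechanism used for the control group under differential dropout.

```python
import random
from statistics import NormalDist, mean, pstdev

def apply_mar_dropout(rows, scale=0.3, low_scores_drop=True, seed=2):
    """Return copies of `rows` (4 values each) with None marking missing values.

    Missingness at time j depends only on the observed value at time j - 1 (MAR),
    and once a subject drops out all later values are missing (monotone).
    Assumes the observed values at each time are not all identical.
    """
    rng = random.Random(seed)
    out = [list(y) for y in rows]
    for j in range(1, 4):  # baseline (index 0) is always observed
        prev = [y[j - 1] for y in out if y[j - 1] is not None]
        dist = NormalDist(mean(prev), pstdev(prev))  # estimated from the data
        for y in out:
            if y[j - 1] is None:
                y[j] = None  # already dropped out
                continue
            cdf = dist.cdf(y[j - 1])
            p_missing = scale * ((1.0 - cdf) if low_scores_drop else cdf)
            if rng.random() < p_missing:
                y[j] = None
    return out

# Hypothetical complete data for 20 subjects.
complete = [[40.0 + i, 41.0 + i, 42.0 + i, 43.0 + i] for i in range(20)]
incomplete = apply_mar_dropout(complete, scale=0.6)
print(sum(y[3] is None for y in incomplete), "of", len(incomplete), "missing at the last time")
```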

Imputation Methods
NRI:
- $R_i = 0$ if $Y_{i4}$ was missing
- Estimated $\theta$
Multiple imputation, using fully conditional specification; the imputation model included the repeated continuous outcomes as auxiliary variables:
IBD MI
- Imputed the missing continuous outcomes $Y_{ij}$
- Used $Y_{i4} - Y_{i1}$ to calculate $R_i$
- Estimated $\theta$ for the $M$ datasets
DTI MI
- Imputed the partially observed $R$

For IBD MI and DTI MI, all imputation models contained the group indicator, $X_{trt}$, and the continuous outcomes $Y_j$. In some imputation models, we included $CV$, a variable representing a correlated covariate, to evaluate the utility of including an auxiliary variable. For DTI MI, the imputation model included the binary response variable, $R$. Scenarios using dropout model 6 also evaluated the use of AE status at $j = 2, 3, 4$ in the imputation model. The $M = 30$ or $M = 50$ estimates [62] of the difference in proportions and their standard errors, used when 30% or 50% of responses at $j = 4$ were missing, respectively, were combined using Rubin's Rules [15].
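For reference, the pooling step under Rubin's Rules can be sketched as follows: the pooled estimate is the mean of the per-dataset estimates, and the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance. The function name and the three input values are hypothetical; a real analysis would pass the M = 30 or M = 50 estimates.

```python
import math

def rubin_combine(estimates, std_errors):
    """Pool M point estimates and their standard errors with Rubin's Rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m                           # pooled estimate
    within = sum(se ** 2 for se in std_errors) / m       # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total_var = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total_var)

# Hypothetical differences in proportions from three imputed datasets.
pooled, pooled_se = rubin_combine([0.14, 0.16, 0.15], [0.05, 0.05, 0.05])
print(round(pooled, 4), round(pooled_se, 4))
```

The pooled standard error is always at least as large as the average within-imputation standard error, which is how the extra uncertainty from the missing data enters the inference.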

Methods
Percent bias:

$$\text{Percent bias of } \hat{\theta} = \frac{1}{n_{sim}} \sum_{i=1}^{n_{sim}} \frac{\hat{\theta}_i - \theta}{\theta} \times 100$$

Coverage of the 95% CI: the proportion of results where the 95% CI contained the true value.
Power: the percentage of statistically significant group differences at $p \le 0.05$.

% Bias: positive values represent overestimates of the difference in proportions.
Coverage probability was calculated as the proportion of MI results where the true value was contained within the 95% CI.
Power was calculated as the percentage of statistically significant group differences at $p \le 0.05$.
Type I error rate was calculated as the percentage of statistically significant group differences when simulating a scenario with no between-group difference.
MSE is a combined measure of variance and bias.
SEmod is the average standard error of each $\hat{\theta}_i$, and SEemp is the standard error of $\hat{\theta}$, measuring the efficiency of $\hat{\theta}$.
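The three headline performance measures can be computed from a set of simulation results as in the sketch below. The function names and the small input lists are hypothetical and stand in for the per-replicate estimates, confidence intervals, and p-values.

```python
def percent_bias(estimates, truth):
    """Mean relative deviation of the estimates from the true value, in percent."""
    return 100.0 * sum((e - truth) / truth for e in estimates) / len(estimates)

def coverage(intervals, truth):
    """Percentage of (lower, upper) intervals that contain the true value."""
    return 100.0 * sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)

def power(p_values, alpha=0.05):
    """Percentage of replicates with a statistically significant difference."""
    return 100.0 * sum(p <= alpha for p in p_values) / len(p_values)

# Hypothetical results from three or four replicates, true difference 0.15.
pb = percent_bias([0.10, 0.15, 0.14], truth=0.15)
cov = coverage([(0.05, 0.25), (0.16, 0.30), (0.00, 0.20)], truth=0.15)
pw = power([0.01, 0.20, 0.05, 0.50])
print(round(pb, 2), round(cov, 2), pw)
```

Applying `power` to replicates simulated with no between-group difference gives the Type I error rate instead.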

Results
NRI, IBD MI and DTI MI with a linear response trajectory, 30% missing. The % responders in Treatment A and B was 25.6 and 10.6, respectively; $\theta = 15.0$.

Method    % Resp. Trt A   % Resp. Trt B   Difference (95% CI)   % Bias   Coverage, 95% CI   Power
1: Lower scores more likely to be missing
NRI       17.6            6.9             10.6 (1.7, 19.5)      -29.2    81.3               0.64
DTI MI    26.5            10.7            15.9 (5.4, 26.4)      6.0      95.2               0.77
IBD MI    25.7            10.8            14.9 (4.5, 25.3)      -0.6     --                 0.70
2: Differential dropout
NRI       21.5            5.3             16.2 (7.2, 25.3)      8.5      93.8               0.94
DTI MI    26.0            --              15.2 (4.8, 25.7)      1.8      --                 0.71
IBD MI    25.6            10.9            14.7 (4.3, 25.1)      -1.8     94.5               0.69

A dash (--) marks a value that is absent from the transcript.

When the response profile was linear with 30% of responses missing, bias was less than 7.3% for all MI approaches and ranged from 8.5% to -36.7% for NRI. IBD MI had slightly lower or equal bias relative to DTI MI for all scenarios, and bias was conservative in direction, i.e., negative, for 4 out of the 5 dropout models.

Results
NRI, IBD MI and DTI MI with a linear response trajectory, 50% missing. The % responders in Treatment A and B was 25.6 and 10.6, respectively; $\theta = 15.0$.

Method    % Resp. Trt A   % Resp. Trt B   Difference (95% CI)   % Bias   Coverage, 95% CI   Power
1: Lower scores more likely to be missing
NRI       12.8            4.8             8.0 (0.3, 15.7)       -46.8    55.6               0.52
DTI MI    27.5            11.2            16.3 (5.7, 26.9)      8.8      91.5               0.72
IBD MI    25.8            11.1            14.8 (4.3, 25.2)      -1.5     94.1               0.59
2: Differential dropout
NRI       18.3            1.8             16.5 (8.5, 24.4)      10.0     93.9               0.99
DTI MI    26.2            14.5            11.7 (0.9, 22.5)      -21.8    77.5               0.48
IBD MI    25.7            11.8            13.9 (3.4, 24.5)      -6.9     92.8               0.49

Summary
- Both methods of multiple imputation are slightly biased when there are moderate amounts of missing data (30%)
- Imputing the continuous outcome prior to dichotomizing was less biased with higher rates of missingness (50%)
- Non-response imputation was both positively and negatively biased
