Published by Byron Butler. Modified over 9 years ago.
1
Session 1 An Introduction to Efficacy and Mechanisms Evaluation (EME)
Evaluation of Potential Mediators in Randomized Trials of Complex Interventions (Psychotherapies)
Graham Dunn, Methodology Research Group
Research funded by: MRC Methodology Grants G G , G , G
MHRN Methodology Research Group
2
Plan for this Session
Efficacy estimation
  Correlation and causality
  The role of random allocation
  Efficacy in the presence of departures from randomisation (non-compliance)
  Estimation via Intention-To-Treat, Per Protocol, and As Treated approaches
  The Complier-Average Causal Effect (CACE)
  CACE estimation via Principal Stratification
  CACE estimation via Instrumental Variable methods
Mechanisms evaluation
  Introduction to mediation
  ‘Traditional’ Baron & Kenny methods to assess mediation
  Extending instrumental variable methods to allow for omitted variables (hidden confounding)
  Therapeutic alliance and treatment-effect heterogeneity
3
Correlation and Causality
A typical problem
Observation: From routine clinical records, receipt of CBT (A) is correlated (associated) with better clinical outcomes (B).
What can we infer? Either
1. A causes B (the ‘nice’ explanation), or
2. B causes A (unlikely because A is the intervention), or
3. There is a common cause, C (confounding by prognostic indicators), or
4. Any combination of some or all of the above.
4
A Path Diagram
[Path diagram: A (receipt of treatment, CBT) -> B (clinical outcome, BDI score, say); C -> A and C -> B.]
C = Confounders: factors that influence both selection for treatment and outcome
5
The Role of Random Allocation
[Path diagram: A (receipt of treatment, CBT) -> B (clinical outcome, BDI score, say); the path from C to A is blocked by random allocation.]
No confounding: correlation now implies a treatment effect
6
Efficacy Estimation We have carried out a randomized controlled trial (RCT) for the treatment of depression: Treatment As Usual (TAU) – Control versus TAU plus Cognitive Behaviour Therapy (CBT) Everyone in the CBT arm receives the allocated treatment (and none in the Control arm). Everyone in the trial provides a measure of outcome (a BDI score). Efficacy is estimated by comparing the average BDI in the CBT arm with the average BDI in the controls. It is the effect of receiving treatment.
7
Efficacy Estimation Now let the randomised trial be a bit more realistic! Let’s assume that a fairly large proportion (30-50%, say) of those allocated to the CBT arm do not turn up for their therapy. We still have an outcome measurement (BDI score) for everyone. Comparison of average outcomes for participants as randomised (the Intention-To-Treat or ITT estimate) now provides us with an estimate of the effect of offering treatment (it is a measure of the Effectiveness of the treatment offer). It’s a very sensible thing to do, but what about treatment efficacy? What’s the effect of getting the treatment?
8
Efficacy Estimation The Per Protocol (PP) estimate compares the average outcome in the Controls with the average outcome of those in the CBT arm who adhered to the protocol (i.e. received treatment). Based on an implicit assumption that the treatment compliers are comparable to those excluded participants who did not adhere to their allocated treatment. However, this is very unlikely to be true – and the PP estimate will be subject to selection effects/confounding (i.e. biased). The As Treated (or On Treatment) estimate ignores randomisation and compares the average outcome in the participants who received treatment with the average of those who did not. Again, very likely to be biased (subject to selection effects/confounding).
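The contrast between the three estimates can be illustrated with a small simulation. This is only a sketch under invented assumptions: the true effect of -4 BDI points, the "severity" variable and the compliance probabilities are all hypothetical and are not taken from any trial discussed here.

```python
import random
from statistics import mean

random.seed(1)
TRUE_EFFECT = -4.0  # hypothetical effect of receiving CBT on the BDI score

rows = []  # (allocated to CBT, received CBT, BDI outcome)
for _ in range(20000):
    severity = random.gauss(0, 1)  # prognostic factor, unobserved by the analyst
    # Assumption: more severely ill participants are less likely to turn up.
    would_comply = random.random() < (0.8 if severity < 0 else 0.5)
    allocated = random.random() < 0.5
    received = allocated and would_comply  # controls cannot access CBT
    bdi = 15 + 3 * severity + (TRUE_EFFECT if received else 0.0) + random.gauss(0, 5)
    rows.append((allocated, received, bdi))

def mean_bdi(keep):
    """Mean BDI over participants for whom keep(allocated, received) is true."""
    return mean(bdi for allocated, received, bdi in rows if keep(allocated, received))

controls = mean_bdi(lambda a, r: not a)
itt = mean_bdi(lambda a, r: a) - controls             # as randomised
per_protocol = mean_bdi(lambda a, r: a and r) - controls  # adherers vs controls
as_treated = mean_bdi(lambda a, r: r) - mean_bdi(lambda a, r: not r)

print(f"ITT {itt:.2f}  PP {per_protocol:.2f}  AT {as_treated:.2f}  true effect {TRUE_EFFECT}")
```

In this particular setup the ITT estimate is diluted towards zero (it measures the effect of the offer), while the Per Protocol and As Treated estimates overstate the benefit because those who attend have a better prognosis to begin with.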
9
Efficacy Estimation The Complier-Average Causal Effect (CACE) estimate is the comparison of the average outcome of the compliers in the CBT arm with the average outcome of the comparable group of would-be compliers* in the Control arm. This is a randomisation-respecting estimate. It is the ITT effect in the sub-group of participants who would always comply with their treatment allocation. It is not subject to confounding. But how is it calculated? First, we make some explicit assumptions.
* Those people in the Control group who would have complied with their treatment allocation had they, contrary to fact, been allocated to receive CBT.
10
Efficacy (CACE) Estimation
Assumptions
1. There are two latent classes of participants (Principal Strata): Compliers and Non-compliers. Compliers get therapy if and only if allocated to the treatment. Non-compliers never get the therapy, regardless of allocation.
2. As a consequence of randomisation, on average, the proportion of Compliers is the same in the two arms of the trial.
3. In the absence of treatment (i.e. for the Non-compliers) there is no effect of randomisation (i.e. treatment arm) on outcome. This assumption is often called an exclusion restriction.
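A short derivation may help to show why these assumptions are enough. The notation is introduced here for illustration and is not taken from the slides: \pi_c is the proportion of Compliers, and \mu^{C}_{z} and \mu^{N}_{z} are the mean outcomes of Compliers and Non-compliers when allocated to arm z (1 = treatment, 0 = control).

```latex
\begin{align*}
\text{ITT} &= \pi_c\,(\mu^{C}_{1} - \mu^{C}_{0}) + (1 - \pi_c)\,(\mu^{N}_{1} - \mu^{N}_{0})
  && \text{(assumptions 1 and 2)} \\
           &= \pi_c \cdot \text{CACE} + (1 - \pi_c) \cdot 0
  && \text{(assumption 3, exclusion restriction)} \\
\Rightarrow \quad \text{CACE} &= \frac{\text{ITT}}{\pi_c}
\end{align*}
```

Under these assumptions the ITT effect is simply the CACE diluted by the proportion of participants who comply.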
11
Example: The ODIN trial
Trial of 2 psychological interventions to reduce depression (Dowrick et al, 2000)
Randomised individuals: 236 to the psychological interventions (E), 128 to treatment as usual (S)
Outcome: Beck Depression Inventory (BDI) at 6 months, recorded on 317 randomised individuals
ITT results: mean (SD) BDI at 6 months (BDI6)
                            E (n=177)       S (n=140)       Difference (std error)
Unadjusted                  13.29 (9.85)    15.16 (10.42)   -1.87 (1.14)
Adjusted for baseline BDI                                   -2.28 (1.02)
12
CACE analysis (complete cases)
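[Table: number of participants and mean BDI for Compliers, Never-takers (Non-compliers) and All, in the Therapy (E) and Control (S) arms. Cell values were not preserved; two cells are marked "?" as unknown.]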
13
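CACE analysis (2)
[Table: number of participants and mean BDI for Compliers, Never-takers and All, in the Therapy (E) and Control (S) arms. Most cell values were not preserved.]
Randomisation balance: Never-takers in the Control arm = 59*140/177 = 46.7
Exclusion restriction: mean BDI of Never-takers = 13.22 in both arms
Difference in mean BDI between Compliers in the two arms = complier-average causal effect (CACE)
CACE = -2.81 (cf ITT = 13.29 – 15.16 = -1.87)
Note: 66.7% compliance (118/177)
ITT / 0.667 = CACE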
14
CACE Estimation
CACE estimate = (ITT estimate for outcome) / (ITT estimate for treatment received)
              = (ITT estimate for outcome) / (Proportion of Compliers)
              = -1.87/0.667 = -2.81
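The ODIN arithmetic can be checked in a few lines. The sketch below uses only figures quoted on the slides (arm sizes, arm means, 118 of 177 compliers, and a Never-taker mean BDI of 13.22); reading 13.22 as the observed mean of the Never-takers in the therapy arm is an assumption made here. It computes the CACE two ways, from the stratum means and from the ratio formula.

```python
# Figures quoted on the slides (complete cases).
n_e, mean_e = 177, 13.29       # therapy arm (E): size and mean BDI at 6 months
n_s, mean_s = 140, 15.16       # control arm (S): size and mean BDI at 6 months
compliers_e = 118              # compliers observed in E
never_e = n_e - compliers_e    # never-takers observed in E
mean_never = 13.22             # ASSUMPTION: mean BDI of never-takers in E

# Randomisation balance: same proportion of never-takers expected in S.
never_s = never_e * n_s / n_e
compliers_s = n_s - never_s

# Complier mean in E, by subtracting the never-takers from the arm total.
mean_compliers_e = (n_e * mean_e - never_e * mean_never) / compliers_e
# Exclusion restriction: never-takers in S have the same mean as those in E.
mean_compliers_s = (n_s * mean_s - never_s * mean_never) / compliers_s

cace_from_strata = mean_compliers_e - mean_compliers_s
itt = mean_e - mean_s
cace_from_ratio = itt / (compliers_e / n_e)

print(round(never_s, 1), round(cace_from_strata, 3), round(cace_from_ratio, 3))
```

The two routes agree exactly, and both give a value of about -2.8, which matches the -2.81 on the slide up to rounding of the inputs.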
15
CACE vs. PP
[Table: number of participants and mean BDI for Compliers, Never-takers and All, in the Therapy (E) and Control (S) arms. Cell values were not preserved; the slide marked the cells assumed equal under each approach as "CACE equal" and "PP equal".]
CACE is based on the “exclusion restriction” assumption
Per-protocol analysis estimates the CACE under the “random non-compliance” assumption
16
Instrumental Variables
We wish to estimate the effect of treatment received on outcome. We suspect that treatment received and outcome are confounded (i.e. there are omitted variables that influence both treatment receipt and outcome). If we can assume that
1. randomisation (treatment allocation) has an effect on outcome, but only through its effect on treatment receipt, and
2. randomisation is independent of all confounders,
then randomisation is an instrumental variable or instrument (IV) and instrumental variable estimation will solve the problem.
17
Instrumental Variables
Complete mediation of the effect of treatment allocation by treatment receipt.
[Path diagram: Randomised Allocation (rgroup) -> Treatment Received (CBT) -> Outcome (BDI); Omitted variables (Confounders) -> Treatment Received and Outcome.]
18
Instrumental Variable (IV) Regression
Most general packages contain instrumental variable (two stage least squares or 2SLS) routines. These include SPSS, SAS and Stata. Here, we will illustrate their use through the ivregress command in Stata Version 11:
    ivregress 2sls BDI (CBT=rgroup)
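For readers without Stata, the two stages can be written out by hand. The following is a minimal Python sketch on simulated data, not a substitute for ivregress: the data-generating model, the true effect of -3 and all variable names are assumptions made for illustration, and the second-stage regression as coded here does not give valid standard errors.

```python
import random
from statistics import mean

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(z, d, y):
    """Stage 1: regress treatment d on instrument z. Stage 2: regress y on fitted d."""
    a1, b1 = ols(z, d)
    d_hat = [a1 + b1 * zi for zi in z]
    return ols(d_hat, y)

random.seed(2)
TRUE_EFFECT = -3.0  # hypothetical effect of receiving CBT on BDI
rgroup, cbt, bdi = [], [], []
for _ in range(20000):
    u = random.gauss(0, 1)       # omitted variable (hidden confounder)
    zi = random.randint(0, 1)    # randomised allocation
    # CBT is received only if allocated, and more often when u is low.
    di = 1 if (zi == 1 and u + random.gauss(0, 1) < 0.5) else 0
    yi = 15 + TRUE_EFFECT * di + 3 * u + random.gauss(0, 4)
    rgroup.append(zi)
    cbt.append(di)
    bdi.append(yi)

_, naive_slope = ols(cbt, bdi)                         # ignores confounding
_, iv_slope = two_stage_least_squares(rgroup, cbt, bdi)
print(f"naive {naive_slope:.2f}  2SLS {iv_slope:.2f}  true {TRUE_EFFECT}")
```

With a single binary instrument the 2SLS slope equals the ITT effect on outcome divided by the ITT effect on treatment receipt, which is the ratio form of the CACE estimate.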
19
Efficacy (CACE) Estimation (IV regression)
    ivregress 2sls bdi6 (complya = rgroup)
Instrumental variables (2SLS) regression
[Stata output: header statistics (Number of obs, Wald chi2(1), Prob > chi2, R-squared, Root MSE) and a coefficient table for bdi6 with rows complya and _cons (Coef., Std. Err., z, P>|z|, 95% Conf. Interval); numeric values were not preserved.]
Instrumented: complya
Instruments: rgroup
CACE estimate (s.e. 1.72)
20
Missing Outcome Data (loss to follow-up)
Loss to follow-up strongly related to non-compliance with allocated treatment. Possible to extend estimation procedures to allow for a credible missing data mechanism: Missing data jointly determined by allocation, baseline covariates, treatment received (Missing at Random or MAR) or Missing data jointly determined by allocation and the latent would-be compliance status (Latently Ignorable or LI).
21
ODIN Follow-up rates and Outcomes
22
CACE analysis under MAR (Outcome data Missing At Random)
Number of participants:
                 Compliers    Never-takers    All
Therapy (E)      128          108             236
Control (S)      103.6        87.4            191
Randomisation balance: Never-takers in the Control arm = 108*191/236 = 87.4
Mean BDI values preserved from the slide: 16.80 and 13.22 (exclusion restriction)
Difference in mean BDI between Compliers in the two arms = complier-average causal effect (CACE)
CACE (MAR): value not preserved; cf CACE (CC) = -2.81
23
Mechanisms Evaluation
Compliance with allocated treatment
  Does the participant turn up for any therapy? How many sessions does she attend?
Fidelity of therapy
  How close is the therapy to that described in the treatment manual? Is it a cognitive-behavioural intervention, for example, or merely emotional support?
Quality of the therapeutic relationship
  What is the strength of the therapeutic alliance? Is it associated with the effect of treatment?
24
Mechanisms Evaluation
What is the concomitant medication?
  Does psychotherapy improve compliance with medication which, in turn, leads to better outcome? What is the direct effect of psychotherapy? Is there any?
What is the concomitant substance abuse?
  Does psychotherapy reduce cannabis use, which in turn leads to improvements in psychotic symptoms?
What are the participant’s beliefs?
  Does psychotherapy change attributions (beliefs), which, in turn, lead to better outcome? How much of the treatment effect is explained by changes in attributions?
25
Mechanisms Evaluation Treatment Effect Mediation
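[Path diagram: CBT -> Change in Medication (α); Change in Medication -> Depressed Mood (β); CBT -> Depressed Mood (γ); Omitted variables (Confounders) -> Change in Medication and Depressed Mood.]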
26
The Mediation Industry
Baron RM & Kenny DA (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology 51, 1173-1182.
February 2010: over 13,000 citations, 2000 over the last year alone!
Depends on the implicitly-assumed absence of hidden confounding (non-ignorable selection). The assumptions are very rarely stated, let alone their validity discussed. One suspects that the majority of investigators are oblivious of the assumptions and of their implications. Results are of unknown and questionable value.
27
The Mediation Literature
It’s unfortunate that the 1986 paper by Baron & Kenny has been so influential. They were fully aware of the omitted variables problem, as is shown by reference to an earlier (much better) paper:
Judd, C. M. and Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review 5, 602–619.
They just didn’t think to mention it! David Kenny himself admits that this is problematic (see his Mediation website), but the paper now has a life of its own!
28
The Baron & Kenny approach
1. Evaluate the effect of therapy on outcome (ITT: OK, no confounding).
2. Evaluate the effect of therapy on the potential mediator.
3. Evaluate the effect of mediator on outcome, and of therapy on outcome, conditioning on mediator. Only valid if there is no confounding (i.e. no omitted variables)*.
* The problem is not solved by replacing multiple regression by structural equation models (sem).
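A simulation shows how step 3 goes wrong when a confounder of the mediator and the outcome is left out. This is a sketch only: the path coefficients (α = 1, β = 0.5, γ = 1), the hidden variable u and the sample size are invented for illustration. With a randomised binary treatment, the regression of outcome on treatment and mediator can be computed by centring within arms, which is what the code does.

```python
import random
from statistics import mean

random.seed(3)
ALPHA, BETA, GAMMA = 1.0, 0.5, 1.0  # hypothetical true path coefficients

z, m, y = [], [], []
for _ in range(20000):
    zi = random.randint(0, 1)       # randomised therapy
    u = random.gauss(0, 1)          # hidden confounder of mediator and outcome
    mi = ALPHA * zi + u + random.gauss(0, 1)
    yi = GAMMA * zi + BETA * mi + u + random.gauss(0, 1)
    z.append(zi)
    m.append(mi)
    y.append(yi)

def arm_means(values):
    """Mean of values in the control arm (index 0) and therapy arm (index 1)."""
    return [mean(v for v, zi in zip(values, z) if zi == arm) for arm in (0, 1)]

m_bar, y_bar = arm_means(m), arm_means(y)

total_hat = y_bar[1] - y_bar[0]     # step 1: therapy -> outcome (ITT)
alpha_hat = m_bar[1] - m_bar[0]     # step 2: therapy -> mediator

# Step 3: regress outcome on therapy and mediator. With a binary therapy
# indicator the mediator coefficient is the pooled within-arm slope.
m_c = [mi - m_bar[zi] for mi, zi in zip(m, z)]
y_c = [yi - y_bar[zi] for yi, zi in zip(y, z)]
beta_hat = sum(a * b for a, b in zip(m_c, y_c)) / sum(a * a for a in m_c)
gamma_hat = total_hat - beta_hat * alpha_hat   # estimated direct effect

print(f"alpha {alpha_hat:.2f}  beta {beta_hat:.2f} (true {BETA})  "
      f"gamma {gamma_hat:.2f} (true {GAMMA})")
```

Steps 1 and 2 are protected by randomisation and recover the total effect and α. In this setup step 3 roughly doubles β and halves γ, because u drives both the mediator and the outcome.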
29
The Baron & Kenny model:
[Path diagram: CBT -> Change in Medication (α); Change in Medication -> Depressed Mood (β); CBT -> Depressed Mood (γ).]
There are no omitted variables. Is this realistic? No way!
30
The Solution?
Design: Try to think of and measure all of the potential confounders. Make sure there are no omitted variables to worry about. In addition, build in convincing instrumental variables.
Analysis: Extend instrumental variable, principal stratification and other models to allow for hidden confounding. Need to find additional instrumental variables. These methods depend on alternative assumptions. Are the necessary assumptions realistic?
Cast doubt on your results: It is probably not realistic to think we have the right model, so try several and check whether the results depend on the assumptions that we need, i.e. carry out a thorough sensitivity analysis.
31
Mechanisms Evaluation: IV Models
[Path diagram with nodes: CBT (Random), Treatment Centre, Treatment Centre by CBT interaction, Beliefs, Psychotic symptoms, and C.]
Use centre differences (or trial differences) as Instrumental Variables (IVs)
32
Mediation – the PROSPECT trial
Results from the analysis of a US suicide prevention trial (PROSPECT): psychotherapy for depression in the elderly. The therapy influenced compliance with antidepressant medication. Did this explain the results? (From Table 4 of Emsley, Dunn & White, 2010.)
ITT effect: −3.15 (0.82)
Analytical method                               Direct effect, γ (s.e.)   Indirect effect, β (s.e.)
Regression (B&K)                                −2.66 (0.93)              −1.24 (1.09)
IV (ivregress)                                  −2.38 (1.35)              −1.95 (2.71)
Principal stratification (with monotonicity)    −2.62 (1.38)*             −1.37 (2.97)*
*Bootstrap standard errors.
33
Treatment-effect Modification
Treatment effects vary from one individual to another, i.e. there is treatment-effect heterogeneity.
Treatment effects may be influenced by baseline (pre-randomisation) covariates such as gender, age, prior history of illness, personality, insight, treatment centre, and so on. This source of treatment-effect heterogeneity is called treatment-effect moderation. [Aside: this is an essential component of the evaluation of treatment mediation as described above]
Treatment effects may be influenced by therapist characteristics (Chris’ session this afternoon).
Perhaps the most interesting sources of treatment-effect heterogeneity are process measures such as the therapeutic alliance (another potential mediator).
34
Role of the Therapeutic Alliance
An RCT: CBT versus controls (no therapy) The therapeutic alliance can only be measured in people who receive CBT (but let’s assume we have 100% adherence to allocated treatment) Alliance scores are subject to measurement errors (as are measurements of most mediators, for that matter!) The effect of alliance on outcome is highly likely to be subject to confounding (a patient capable of forming a strong therapeutic alliance is also likely to have the better outcome, even in the absence of therapy).
35
Evaluating the effects of the Therapeutic Alliance: A typical analysis strategy
Ignore the control group and anyone else who has not received treatment Look at the correlation between alliance score and outcome (BDI score, for example) Infer that this correlation (if found) tells us something reliable about the relationship between the strength of the therapeutic alliance and the effect of therapy. This analysis is flawed! These data cannot be used to distinguish treatment effects from treatment-free prognosis.
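The flaw can be made concrete with a simulation in which therapy helps everyone by exactly the same amount, so alliance has no bearing on the treatment effect, yet alliance and outcome are still clearly correlated among the treated. All quantities are invented for illustration: the "prognosis" variable, the constant effect of -5 and the noise levels are assumptions.

```python
import random
from statistics import mean

random.seed(4)
TREATMENT_EFFECT = -5.0  # hypothetical, identical for every participant

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

alliance, bdi_if_untreated, bdi_if_treated = [], [], []
for _ in range(5000):
    prognosis = random.gauss(0, 1)             # unobserved; higher is better
    a = prognosis + random.gauss(0, 1)         # alliance reflects prognosis
    y0 = 20 - 4 * prognosis + random.gauss(0, 3)   # treatment-free outcome
    alliance.append(a)
    bdi_if_untreated.append(y0)
    bdi_if_treated.append(y0 + TREATMENT_EFFECT)   # constant treatment effect

r_treated = pearson(alliance, bdi_if_treated)
r_untreated = pearson(alliance, bdi_if_untreated)
print(f"r(alliance, outcome | treated) = {r_treated:.2f}")
print(f"r(alliance, treatment-free outcome) = {r_untreated:.2f}")
```

The two correlations are the same, because the whole association comes from treatment-free prognosis. Data from the treated group alone cannot tell this situation apart from one in which a stronger alliance really does produce a larger treatment effect.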
36
Evaluating the effects of Alliance: What are we trying to evaluate?
The treatment effect on a particular individual is the difference between the outcome after treatment and the outcome after experiencing the control condition. We cannot observe individual treatment effects, but we can use outcome means, together with randomisation of treatment, to estimate average treatment effects. For simplicity, let’s make alliance binary: strong versus weak.
37
Principal Strata defined by Alliance
Two strata: Strong: Strong alliance if allocated to receive CBT, not recorded (latent) otherwise. Weak: Weak alliance if allocated to receive CBT, unrecorded (latent) otherwise. The proportion of participants in the strong alliance stratum is, on average, the same in the treated and control groups (c.f. CACE estimation). Stratum membership is independent of treatment allocation (no confounding). We would like to compare the effect of randomisation (i.e. the ITT effect) in these two strata. It is this that would provide us with evidence of treatment-effect heterogeneity.
38
Analysis of Treatment-effect heterogeneity
Unlike CACE estimation, we cannot assume that there is no effect of allocation in one of the groups (i.e. the exclusion restriction is unlikely to be valid). In order to proceed we need strong baseline predictors of stratum membership or, in the general case, strong predictors of potential strength of the therapeutic alliance. Statistical methods used involve extensions of CACE analysis or IV estimation (as in the evaluation of mediation).
39
An Example: The SoCRATES Trial
SoCRATES was a multi-centre RCT designed to evaluate the effects of cognitive behaviour therapy (CBT) and supportive counselling (SC) on the outcomes of an early episode of schizophrenia. Participants were allocated to one of three conditions: Treatment as Usual (TAU), CBT + TAU, SC + TAU. For our illustrative purposes, we ignore the distinction between CBT and SC, using a binary variable to distinguish treatment and control. The explanations will involve repetition of some of the stuff I’ve already covered above. I hope it helps!
40
SoCRATES (contd.) 3 treatment centres: Liverpool, Manchester and Nottinghamshire. Other baseline covariates include logarithm of duration of untreated psychosis and years of education. Outcome (a psychotic symptoms score) was obtained using the Positive and Negative Syndrome Scale (PANSS). We consider the 18 month PANSS total score here. From an ITT analysis of 18 month follow-up data, both psychological treatment groups had a superior outcome in terms of symptoms (as measured using the PANSS) compared to the control group. There were no differences in the effects of CBT and SC, but there was a strong centre effect, with outcomes for the psychological therapies at one of the centres (Liverpool) being significantly better than at the remaining two.
41
SoCRATES (contd.) Post-randomization variables that have a potential explanatory role in exploring the therapeutic effects include the total number of sessions of therapy actually attended and the quality or strength of the therapeutic alliance. Therapeutic alliance was measured at the 4th session of therapy, early in the time-course of the intervention, but not too early to assess the development of the relationship between therapist and patient. We use a patient rating of alliance based on the CALPAS (California Therapeutic Alliance Scale). Total CALPAS scores (ranging from 0, indicating low alliance, to 7, indicating high alliance) were used in some of the analyses reported below, but we also use a binary alliance variable (1 if CALPAS score ≥5, otherwise 0).
42
SoCRATES (contd.) 182 (88.3%) out of 206 patients in the treated groups provided data on the number of sessions attended. 56 patients from the CBT group and 58 from the SC group completed CALPAS forms at session 4 (overall 55.34%). The analysis presented here is based on all control participants but only those from the treated groups who provided both a CALPAS score and a record of the number of sessions (missing sessions/alliance data are another potential source of bias that will be ignored here).
43
SoCRATES - Summary Statistics
Mean (SD)              Centre 1 - Liv                  Centre 2 - Man                  Centre 3 - Nott
                       Control N=39    Treated N=29    Control N=35    Treated N=49    Control N=26    Treated N=23
Baseline PANSS         80.0 (12.36)    77.7 (13.93)    97.9 (16.6)     100.5 (16.3)    84.9 (14.91)    83.4 (10.84)
18 month PANSS         69.5 (13.55)    50.2 (13.48)    73.2 (22.4)     74.4 (20.00)    54.5 (10.07)    49.1 (7.25)
CALPAS                 -               5.73 (0.81)     -               5.07 (0.88)     -               5.15 (1.47)
Sessions               -               18.14 (3.60)    -               16.16 (4.58)    -               13.87 (4.95)
High Alliance: N(%)    -               23 (79.3)       -               30 (61.2)       -               13 (56.5)
# of observed 18m PANSS: 23 25 39 21 22 (only five of the six values were preserved)
44
SoCRATES “dose”-response model: complete mediation
(direct effect assumed to be absent)
[Path diagram: Offer of Treatment (random) -> Sessions Attended -> Psychotic Symptoms, with U influencing both Sessions Attended and Psychotic Symptoms.]
What’s the role of the therapeutic alliance?
Does Alliance modify the effect of randomisation on sessions attended?
Does Alliance modify the effect of treatment received on outcome?
45
Does the quantity (sessions) and quality (alliance) of the therapy influence the treatment effect?
Does the contrast between the treatment outcome and the counterfactual treatment-free outcome increase with the number of sessions attended? Does the contrast between the treatment outcome and the counterfactual treatment-free outcome increase with increasing alliance? Is there an interaction between sessions and alliance?
46
Randomisation-respecting inference
Estimate effects of post-randomisation variables that involve the comparison of randomised sub-groups of patients (within-class Intention-to-Treat or ITT effects). For example, we compare (or would like to compare) the outcome of treatment in those participants who develop a given level of alliance with the outcome in the control patients who would have developed the same level of alliance if they had been allocated to receive therapy.
47
Individual Treatment Effects
48
The identification problem
Our model is not identified (i.e. our data are not rich enough to allow us to estimate the parameters of interest). We need to be able to find variables which influence sessions and alliance but have no direct effect on outcome (instrumental variables). We need multiple instruments.
49
Multiple IVs Where do we get them from?
Randomisation involving more than one active treatment, i.e. to interventions specifically targeted at particular intermediate variables.
Randomisation-by-baseline variable interactions: Randomisation-by-Centre, for example.
Randomisation-by-trial (multiple trials).
Genetic markers (Mendelian Randomisation) used together with randomisation: not relevant to most psychotherapy trials but could be very useful if used in conjunction with randomisation in pharmacotherapy research.
50
Estimation
SMM / G-estimation (not discussed here)
IV regression - Two-Stage Least Squares (2SLS):
    ivregress 2sls panss i.centre (s as = i.centre*rgroup)
ML using structural equation modelling software (easier to cope with missing outcome models)
51
Stata ivregress results (over-simplified complete case analysis).
Note: A has been rescaled so that maximum=0.
βs = -2.40 (se 0.70); βsa = -1.28 (se 0.48)
When A=0 (i.e. maximum alliance) the slope for effect of Sessions is -2.40.
When A=-7 (i.e. minimum alliance) the slope is -2.40 + 7*1.28 = +6.56.
This suggests that when alliance is very poor attending more sessions makes the outcome worse!
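The way the Sessions slope changes with alliance can be written as a one-line function. In this sketch the value -2.40 is the slope quoted for A=0; the interaction coefficient of -1.28 is an assumption, chosen because it is the value implied by the two quoted slopes (-2.40 at A=0 and +6.56 at A=-7) if the slope is linear in A.

```python
BETA_S = -2.40    # slope for Sessions at A = 0 (maximum alliance), as quoted
BETA_SA = -1.28   # ASSUMPTION: Sessions-by-Alliance coefficient implied by the quoted slopes

def sessions_slope(a):
    """Change in 18-month PANSS per extra session at rescaled alliance a (-7 to 0)."""
    return BETA_S + BETA_SA * a

# Alliance level at which the estimated effect of an extra session changes sign.
crossover = -BETA_S / BETA_SA

for a in (0, -2, -4, -7):
    print(f"A = {a:>2}: slope = {sessions_slope(a):+.2f}")
print(f"slope is zero at A = {crossover:.3f}")
```

On these figures extra sessions are estimated to reduce symptoms only when alliance is within about two points of its maximum. Given the standard errors quoted on the slide, this should be read as illustrative.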
52
Principal Stratification
Clinically interesting questions concern the modifying effects of variables that can only be measured once treatment has been initiated (i.e. post-randomisation). Such variables include compliance with active treatment and strength of the therapeutic alliance between patient and therapist.
53
Principal Stratification (example)
For simplicity we assume that everyone allocated to psychotherapy actually receives it, i.e. everyone is a complier.** We have one sub-group of participants who receive no therapy if allocated to the control condition but receive therapy with a low alliance if allocated to the treatment group. We have a second sub-group who receive no therapy if allocated to the control condition but receive therapy with a high alliance if allocated to the treatment group. Principal stratum membership is independent of treatment allocation. We can stratify by stratum membership and evaluate the effects of treatment allocation within each stratum.
** But we could easily add a third stratum: non-compliers.
54
Model Identification – Principal Strata
We need baseline covariates that are good predictors of stratum membership. With two principal strata (high vs low alliance), we would construct a logistic regression (latent class) model to predict stratum membership using baseline covariates, X (particularly treatment centre, for example). This approach (predicting principal strata from baseline covariates) is analogous to using the baseline covariate-randomization interactions as instrumental variables in 2SLS. We simultaneously model the ITT effects on outcome within the two principal strata. Estimation proceeds by specifying a full probability model, here, for example, using ML.
55
Model Identification – Principal Strata
It is possible to fit the latent class model for stratum membership and simultaneously a further regression model for the ITT effects of treatment within each of the principal strata, usually allowing for the same baseline covariates, for example when using the finite mixture model option in Mplus (Muthén & Muthén). If we have missing outcome data (with missing outcome indicator, Ri) we can also simultaneously fit a third model predicting missing outcomes, based on the assumptions of latent ignorability. In our SoCRATES examples, we use treatment centre, logDUP, Years Education and baseline PANSS to predict stratum membership. We use the same covariates plus the effect of randomisation to model outcome within principal strata, assuming that there are no covariate by randomisation interactions in this part of the model. Bootstrapping is used to get standard errors.
56
Extensions: explanatory models nested within principal strata
The basic idea of principal stratification is the estimation of ITT effects within principal strata. Typically we are interested in a univariate response, but we could investigate the advantages of simultaneously estimating effects for two or more different outcomes (i.e. multivariate responses). It is possible to look at binary outcomes and, of course, one of these binary outcomes might be a missing value indicator, as in models assuming latent ignorability (Frangakis and Rubin, 1999). In the context of treatment compliance, Jo and Muthén have investigated the use of latent growth curve/trajectory models for longitudinal outcome data. We will illustrate the idea by looking at the effect of sessions attended on the effects of therapy.
57
SoCRATES - results Estimated ITT effects on 18 month PANSS
Estimate (s.e.); the High alliance estimates were not preserved, only their standard errors.
                                         Low alliance     High alliance
Missing data ignorable (MAR)             +7.50 (8.18)     (4.60)
Missing data latently ignorable (LI)     +6.49 (7.26)     (5.95)
58
SoCRATES – effect of Sessions
Missing data assumption: MAR. The estimates were not preserved, only their standard errors.
Standard Structural Equation Model (uncorrelated errors, no hidden confounding)
        Low alliance    High alliance
α       (0.96)          (0.45)
β       (0.38)          (0.23)
IV Structural Equation Model (with correlated errors, hidden confounding)
        Low alliance    High alliance
α       (0.97)          (0.46)
β       (0.47)          (0.29)
α - effect of randomisation on sessions
β - effect of sessions on 18-month PANSS
59
Evaluation of Mediation and Treatment-effect Heterogeneity CONCLUSIONS
It’s not at all straightforward! It’s not easy to find convincing instrumental variables. Need improvements in trial design. It’s not easy to use several of the relevant statistical methods. The results can be very unstable. Often finish up with low power and low precision. The key is scepticism! Don’t believe the results of simple, often naïve analyses.
60
Efficacy & Mechanisms Evaluation Further Reading
Emsley, R., Dunn, G. & White, I.R. (2010). Modelling mediation and moderation of treatment effects in randomised controlled trials of complex interventions. Statistical Methods in Medical Research 19,
Dunn, G., Maracy, M. & Tomenson, B. (2005). Estimating treatment effects from randomized clinical trials with non-compliance and loss to follow-up: the role of instrumental variable methods. Statistical Methods in Medical Research 14, (Part of a Special Issue on this theme)
Maracy, M. & Dunn, G. (2010). Estimating dose-response effects in psychological treatment trials: the role of instrumental variables. Statistical Methods in Medical Research. In press: first published online on November 26, 2008 as doi: /
61
Have we thought of everything?
Measurement errors: IV methods cope with random measurement errors in the intermediate variables. Principal Stratification assumes classification is not error-prone.
Douglas Adams wrote that the most misleading assumptions are the ones you don’t even know you’re making**.
** In Adams, D. & Carwardine, M. (1990) Last Chance to See, p77. London: Heinemann.
62
Efficacy & Mechanisms Evaluation
See websites: and for downloads of Trial data & Software scripts Details of the MHRN MRG can be found at: