Challenges of statistical analysis in surgical trials

1 Challenges of statistical analysis in surgical trials
Professor Catherine Hewitt, Senior statistician, BOA Orthopaedic Surgery Research Centre

2 Non-compliance/crossovers

3 Perfect trial participants
Everyone asked is willing to take part
Everyone recruited is randomised and receives their allocated intervention
Everyone randomised completes their allocated intervention and provides outcome data

4

5 What happens in practice?
Some people are not willing to take part
Everyone recruited is randomised and some receive their allocated intervention
Some participants complete their allocated intervention and some provide outcome data

6 Problems during analysis
Even in well-designed and well-conducted RCTs, non-compliance is almost inevitable
This creates problems at the analysis stage
How should we deal in the analysis with participants who do not comply with their allocated treatment?

7 Non-compliance
[Diagram labels: Randomisation, Follow-up, ?, ?, Analysis]

8 Intention To Treat
[Diagram labels: Randomisation, Follow-up, Analysis]

9 Advantages of ITT
Why use ITT?
Maintains baseline comparability
Completely objective
It mirrors what would happen in practice
ITT answers an important (pragmatic) question
E.g. “What would be the reduction in colorectal cancer (CRC) mortality if we offered population screening?”

10 Disadvantage of ITT
In the presence of non-compliance, ITT does not answer the question: what would be the impact if people attended screening?
For example, using ITT: on average, screening would reduce your risk of CRC mortality by 20%, whether you take up the offer of screening or not

11 How do researchers answer the question: what would be the impact if people attended screening?

12 Per protocol
[Diagram labels: Randomisation, Follow-up, Analysis]

13 Drawbacks of PP
Loss of statistical power, due to a sample size reduction

14 On Treatment
[Diagram labels: Randomisation, Follow-up, Analysis]

15 Example: On treatment

16

17 Main problems
For both the per protocol (PP) and on treatment (OT) approaches to be valid, participants who do not comply have to be a random sample of ALL participants who were offered treatment
This is rarely true in research involving humans
Participants self-select into groups:
  Breaks the initial randomisation
  Violates the basis for statistical inference
  Groups have different characteristics

18 What’s the alternative to using PP or OT?
Complier average causal effect (CACE) approach

19 CACE
[Diagram labels: Randomisation, Follow-up, Analysis]

20 CACE
General problem: we need to identify the compliance status of the control group
If this is possible, then compare the compliant group in the intervention arm with the potentially compliant group in the control arm

21 CACE – assumptions
We need some measure of the compliance status in the intervention arm
If the control group were offered the treatment, the same proportion would not comply
  Should be true due to randomisation
Being offered the treatment has no effect on outcomes
  This needs more thought!

22 Example: Screening trial
Trial of faecal-occult-blood screening for the prevention of colorectal cancer
53% of the people invited for screening attended
In the publication:
  ITT was used to measure the average benefit of screening in invitees
  PP was used to attempt to measure the average benefit of screening in attendees

23 CACE – what is observed?
Status                Intervention (n = 75,253)    Control (n = 74,998)
                      Deaths ÷ n       ER %        Deaths ÷ n       ER %
Compliers (53%)       138 ÷ 40,214     0.34
Non-compliers (47%)   222 ÷ 35,039     0.63
Total outcome         360 ÷ 75,253     0.48        420 ÷ 74,998     0.56

24 CACE – assumption 1
Status                Intervention (n = 75,253)    Control (n = 74,998)
                      Deaths ÷ n       ER %        Deaths ÷ n       ER %
Compliers (53%)       138 ÷ 40,214     0.34
Non-compliers (47%)   222 ÷ 35,039     0.63
Total outcome         360 ÷ 75,253     0.48        420 ÷ 74,998     0.56
Assuming the same proportion of non-compliers in the control arm: 47% × 74,998 = 35,249

25 CACE – assumption 2
Status                Intervention (n = 75,253)    Control (n = 74,998)
                      Deaths ÷ n       ER %        Deaths ÷ n       ER %
Compliers (53%)       138 ÷ 40,214     0.34
Non-compliers (47%)   222 ÷ 35,039     0.63            ÷ 35,249     0.63
Total outcome         360 ÷ 75,253     0.48        420 ÷ 74,998     0.56
Assuming the offer of treatment has no effect on outcome: 35,249 × 0.63% = 222

26 CACE – what is estimated?
Status                Intervention (n = 75,253)    Control (n = 74,998)
                      Deaths ÷ n       ER %        Deaths ÷ n       ER %
Compliers (53%)       138 ÷ 40,214     0.34
Non-compliers (47%)   222 ÷ 35,039     0.63        222 ÷ 35,249     0.63
Total outcome         360 ÷ 75,253     0.48        420 ÷ 74,998     0.56
Control compliers: 420 – 222 = 198 deaths and 74,998 – 35,249 = 39,749 people, so 198 ÷ 39,749 = 0.50%

27 CACE – complete table
Status                Intervention (n = 75,253)    Control (n = 74,998)
                      Deaths ÷ n       ER %        Deaths ÷ n       ER %
Compliers (53%)       138 ÷ 40,214     0.34        198 ÷ 39,749     0.50
Non-compliers (47%)   222 ÷ 35,039     0.63        222 ÷ 35,249     0.63
Total outcome         360 ÷ 75,253     0.48        420 ÷ 74,998     0.56
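
The arithmetic that fills in the control arm can be reproduced in a few lines. This is a minimal Python sketch rather than the presenter's own code: the function and argument names are hypothetical, and it follows the slides in using the rounded 47% non-complier proportion and rounding the non-complier event rate to 0.63% before applying it to the control arm.

```python
def derive_control_compliers(deaths_ctrl, n_ctrl, deaths_nc_int, n_nc_int, prop_noncomp):
    """Split the control arm into would-be non-compliers and would-be compliers."""
    # Assumption 1: the same proportion of the control arm would not comply.
    n_nc_ctrl = round(prop_noncomp * n_ctrl)
    # Assumption 2: the offer itself has no effect on outcome, so control
    # non-compliers get the intervention non-complier event rate
    # (rounded to two decimals of a percent, as on the slides).
    er_nc_pct = round(100 * deaths_nc_int / n_nc_int, 2)
    deaths_nc_ctrl = round(n_nc_ctrl * er_nc_pct / 100)
    # Whatever remains in the control arm is attributed to would-be compliers.
    deaths_c_ctrl = deaths_ctrl - deaths_nc_ctrl
    n_c_ctrl = n_ctrl - n_nc_ctrl
    er_c_pct = 100 * deaths_c_ctrl / n_c_ctrl
    return n_nc_ctrl, deaths_nc_ctrl, n_c_ctrl, deaths_c_ctrl, er_c_pct


# Figures taken from the screening trial table.
n_nc, d_nc, n_c, d_c, er_c = derive_control_compliers(420, 74_998, 222, 35_039, 0.47)
print(f"Control non-compliers: {d_nc} / {n_nc}")
print(f"Control compliers:     {d_c} / {n_c} (ER {er_c:.2f}%)")
```

Using the unrounded non-complier rate instead would shift the derived death counts by about one, so the rounding choice matters little here but is worth stating.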

28 Example: CACE results
Compare the compliers in the intervention arm with the potential compliers in the control arm
That is, a 31% reduction in CRC mortality

29 Example: conclusions
Offering screening to the whole population:
  ITT shows a 15% reduction in CRC deaths
  We know this is conservative
For those who accepted screening:
  CACE shows a 31% reduction
  NOT a 39% reduction as suggested by PP
  NOR a 71% reduction as suggested by OT
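
The ITT, PP and CACE percentages quoted here can be checked against the counts in the screening trial table. The sketch below is illustrative only: `relative_reduction` is a hypothetical helper, and it assumes the PP figure compares attendees in the intervention arm with the whole control arm, which is the comparison that reproduces the quoted 39%. The OT figure is not reproduced because its inputs do not appear in the transcript.

```python
def relative_reduction(deaths_a, n_a, deaths_b, n_b):
    """Return 1 minus the risk ratio of group A versus group B."""
    return 1 - (deaths_a / n_a) / (deaths_b / n_b)


# ITT: everyone invited versus everyone in the control arm.
itt = relative_reduction(360, 75_253, 420, 74_998)
# PP (assumed definition): attendees versus the whole control arm.
pp = relative_reduction(138, 40_214, 420, 74_998)
# CACE: attendees versus the derived would-be attendees in the control arm.
cace = relative_reduction(138, 40_214, 198, 39_749)

for label, value in [("ITT", itt), ("PP", pp), ("CACE", cace)]:
    print(f"{label}: {value:.0%} reduction in CRC mortality")
```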

30 GROUPING IN INDIVIDUALLY RANDOMISED TRIALS

31 Grouping in iRCTs
Methods for the design and analysis of trials where participants are allocated to treatment clusters are now well established
Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators, such as surgeons
These operators form a hidden sample, whose effect is usually ignored

32 Grouping in iRCTs
Katherine J Lee and Simon G Thompson, BMJ 2005;330: ©2005 by British Medical Journal Publishing Group

33 Grouping in iRCTs
In a review of 42 trials randomising individuals, published in the BMJ during 2002, they found:
  38/42 (90%) had some form of clustering
  17/42 (40%) had clustering by health professional imposed by design
  6/38 (16%) mentioned clustering as a potential issue
  4 of 6 allowed for clustering in the analysis
  3 did not recognise multiple sources of clustering

34 Grouping in iRCTs
There is a new awareness of potential operator effects
  Including sample size adjustments using an ICC, as if for a cluster randomised trial
Groupings can occur in all trial arms or only in one of them
There are different ways to analyse these trials, depending on the number of groupings and whether they occur in all trial arms
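
One common way to make such a sample size adjustment is the standard design effect for equal cluster sizes, 1 + (m − 1) × ICC, where m is the number of participants per operator. The Python sketch below illustrates that formula only; the function names and all input numbers are hypothetical and are not taken from any trial in these slides.

```python
import math


def design_effect(patients_per_operator, icc):
    """Variance inflation factor, assuming equal-sized operator clusters."""
    return 1 + (patients_per_operator - 1) * icc


def inflated_sample_size(n_unadjusted, patients_per_operator, icc):
    """Inflate a sample size calculated as if outcomes were independent."""
    return math.ceil(n_unadjusted * design_effect(patients_per_operator, icc))


# Hypothetical inputs: 200 participants needed ignoring clustering,
# 11 patients per surgeon, ICC of 0.05.
deff = design_effect(11, 0.05)
n_required = inflated_sample_size(200, 11, 0.05)
print(f"Design effect {deff:.2f}, adjusted sample size {n_required}")
```

When grouping occurs in only one arm, the inflation would apply to that arm alone, so a single overall design effect is a simplification.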

35 Example 1: Grouping in iRCTs
Orthopaedic surgeons were eligible to take part if they performed knee replacements
Surgeons differed in which comparisons they would contribute
116 surgeons in 34 UK centres
Surgeon was included as a random effect in the analysis to account for potential surgeon effects

36 Grouping in one trial arm
How do we analyse a trial with clustering in one arm but not the other?
One simple option is to allocate every participant to a surgeon, irrespective of whether they are in the intervention group or not
If they are in the no surgery group, then neither they nor the surgeon knows about this allocation
Each surgeon then has their own cluster of controls and surgery patients

37 Grouping in one trial arm
For each surgeon, we will then have a mean outcome score for surgery and a mean for controls
We could do a paired t test on these summary statistics, as we are aggregating by surgeon
We could also adjust for baseline scores using analysis of covariance with surgeon as a random effect
More simply (but less powerfully), we could use the means of change in scores from baseline for surgery patients and controls in a paired t analysis
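
The paired t test on surgeon-level summaries can be sketched with the Python standard library alone. The surgeon means below are invented purely for illustration, and the function name is hypothetical; the sketch returns the t statistic and degrees of freedom, leaving the P value to a statistics package.

```python
import math
import statistics


def paired_t(surgery_means, control_means):
    """Paired t statistic on per-surgeon mean outcomes (one pair per surgeon)."""
    diffs = [s - c for s, c in zip(surgery_means, control_means)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference across surgeons.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical mean outcome scores for five surgeons.
surgery = [40, 42, 38, 45, 41]
control = [37, 40, 37, 41, 39]
t_stat, df = paired_t(surgery, control)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Because the unit of analysis is the surgeon, the degrees of freedom depend on the number of surgeons, not the number of patients.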

38 Example 2: Grouping in iRCTs

39 Grouping in iRCTs
The problem with this approach is that more complex clustering cannot be accounted for
Covariates need to be aggregated to the “surgeon” level
Other options are available to account for more complex scenarios

40 Example 3: Grouping in iRCTs
To quantify the impact of grouping, case manager identifiers were included as a random effect in the linear mixed model, nested within treatment arm
Participants in usual care were coded as their own case managers
The covariance structure was estimated separately for each treatment arm to account for the differences in variability of the random effect

41 Example 3: Grouping in iRCTs
Group differences remained significant
Accounting for clustering by case manager reduced the size of the treatment effect slightly

42 Subgroup analyses

43 Subgroup analyses
In a trial we might want to know if the observed treatment difference is the same across different groups
  E.g. young and old; those with a baseline preference for one of the treatments
This is viewed as the examination of heterogeneity of an observed treatment effect across subsets of individuals
The statistical term for heterogeneity of this type is interaction

44 Subgroup analyses
While it may be of interest to look for an interaction, this is not always wise
There are numerous subgroups which could be compared
  Splitting by socio-demographic or clinical characteristics
  Many ways to create groups, e.g. by age
Examination of many subgroups is likely to find some spurious significant interactions
We cannot tell if a specific interaction is real or spurious
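
How quickly spurious findings accumulate can be illustrated with a simple calculation. The sketch below assumes the interaction tests are independent and that no true interaction exists, which is a simplification; the function name and the choice of a 5% significance level are illustrative assumptions.

```python
def chance_of_spurious_finding(n_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 10, 20):
    print(f"{k:2d} subgroup tests: {chance_of_spurious_finding(k):.0%} chance of at least one false positive")

p_ten = chance_of_spurious_finding(10)
```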

45 Subgroup analyses
By contrast, when there is a specific prior suspicion of an interaction, it is perfectly reasonable to examine it
Results of tests for interactions are likely to be convincing only if they were specified at the start of the study
In any study that presents subgroup analyses, it is important to specify when and why the subgroups were chosen
  More recently, the prior direction of effect is expected to be specified too

46 Subgroup analyses
Sometimes authors will carry out significance tests in the different subgroups
They will then conclude that the difference exists only or mainly in the subgroups where a significant difference was found
This is incorrect!

47 Subgroup analyses
A statement such as P=0.57 does not mean there is no difference
  Merely that we have found no evidence of a difference
A P value is a composite
  It depends on the size of an effect but also on how precisely the effect has been estimated
Differences in P values can arise because of differences in effect sizes, differences in standard errors, or a combination of the two
The correct approach is to include an interaction in the analysis
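
The point that P values mix effect size with precision can be shown with two made-up subgroups that have an identical effect but different standard errors. This Python sketch uses a large-sample normal approximation and a simple test on the difference between two independent subgroup estimates as a stand-in for an interaction term in a model; the function names and all numbers are hypothetical.

```python
import math


def two_sided_p(estimate, se):
    """Two-sided P value from a normal approximation."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2))


def interaction_p(est_a, se_a, est_b, se_b):
    """Test whether two independent subgroup effects differ from each other."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return two_sided_p(diff, se_diff)


# Hypothetical subgroups: same effect (5 points), different precision.
p_a = two_sided_p(5.0, 2.0)   # precisely estimated subgroup
p_b = two_sided_p(5.0, 4.0)   # imprecisely estimated subgroup
p_int = interaction_p(5.0, 2.0, 5.0, 4.0)
print(f"Subgroup A P = {p_a:.3f}, subgroup B P = {p_b:.3f}, interaction P = {p_int:.3f}")
```

Only subgroup A falls below 0.05, yet the interaction test finds no evidence that the effect differs between the two, which is exactly the trap described above.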

48 Example: ProFHER
Baseline preference   Surgery (n=125)   Not surgery (n=125)
Surgery               41 (32.8)         31 (24.8)
Not surgery           32 (25.6)         28 (22.4)
No preference         52 (41.6)         63 (50.4)
Missing               0 (0.0)           3 (2.4)
OSS scores improved for all groups over time
Patients who expressed a preference for surgery generally reported the lowest shoulder functioning
Patients who expressed a preference for no surgery reported the highest shoulder functioning scores

49 Example: ProFHER
This model reduced the magnitude of the treatment effect
  From an overall group difference of 0.75 to 0.50 points
  Reflecting additional variability in OSS scores explained by patient preference
The interaction was not statistically significant (F=0.29, p=.751)

50 Example: ProFHER
[Figure panels labelled N=72, N=60, N=115]

51 Quick summary
If you are interested in estimating treatment effectiveness among those who comply, consider using the CACE approach
Consider whether there may be hidden clustering in your trial, and make appropriate plans to account for this in the analysis and possibly in the sample size calculation
If you want to explore whether treatment effects vary across subgroups, then define them in advance, keep the numbers low, pre-specify the direction of effect and use an interaction in the analysis

52 Any questions?

