Confidence Intervals and p-values
Clinical trials aim to generate new knowledge on the effectiveness of healthcare interventions.
This involves measuring the effect size, that is, how large the effect of the new treatment is.
Treatment effect can be measured in various ways: absolute risk reduction (ARR), relative risk reduction (RRR), relative risk (RR), odds ratio (OR) and number needed to treat (NNT).
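As a rough illustration of how these measures relate to one another, the sketch below computes each of them from the crude cardiovascular event counts in the ramipril trial table later in these notes. The variable names are illustrative only, and the crude figures will not exactly match the published estimates, which come from the trial's own analysis.

```python
# Crude event counts from the ramipril example in these notes
# (cardiovascular events). Variable names are illustrative assumptions.
events_treated, n_treated = 651, 4645   # ramipril group
events_control, n_control = 826, 4652   # placebo group

risk_treated = events_treated / n_treated
risk_control = events_control / n_control

# Absolute risk reduction: difference in risk between the groups.
arr = risk_control - risk_treated
# Relative risk: risk in the treated group relative to control.
rr = risk_treated / risk_control
# Relative risk reduction: proportion of the control risk removed.
rrr = 1 - rr
# Odds ratio: odds of an event in treated versus control.
odds_treated = events_treated / (n_treated - events_treated)
odds_control = events_control / (n_control - events_control)
odds_ratio = odds_treated / odds_control
# Number needed to treat: patients treated to prevent one event.
nnt = 1 / arr

print(f"ARR = {arr:.3f}, RRR = {rrr:.3f}, RR = {rr:.3f}")
print(f"OR = {odds_ratio:.3f}, NNT = {nnt:.1f}")
```

NNT is conventionally rounded up to a whole number of patients when reported.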
We need to be able to assess how trustworthy, or robust, the findings of the study are.
Are the findings from this study and sample likely to be true?
We need to address 2 issues before we believe the findings:
Is there bias, or a systematic error, in the way the research was conducted?
Could the results just be a chance finding?
3 common sorts of bias
Poor randomisation, which leads to unbalanced groups.
Poor blinding, which leads to unfair treatments and distorted assessments.
Large numbers lost to follow-up.
Chance: even in a good study, random variation can affect results.
P-values
A p-value assesses the role of chance in the findings: are the findings significantly different, or not, from “no effect”?
The “null hypothesis” assumes there is in fact no difference between the treatments.
The p-value is the probability of seeing a difference at least as large as the one observed if the null hypothesis were true.
P<0.05 means that, if there were truly no difference, a result this extreme would arise by chance less than 5% of the time.
Such findings are called statistically significant: they are unlikely to have arisen by chance alone.
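To make the calculation concrete, here is a minimal sketch of one common way to obtain a p-value for a difference between two event rates: a two-proportion z-test with a normal approximation. The function name is hypothetical, the counts are taken from the ramipril example in these notes, and the trial's published analysis may well have used a different method, so this is an illustration rather than a reproduction.

```python
import math

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for the null hypothesis of equal event risk.

    Uses the pooled two-proportion z-test (normal approximation).
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    # Under the null hypothesis both groups share one pooled risk.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (risk_a - risk_b) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Cardiovascular events: ramipril 651/4645 versus placebo 826/4652.
p_cardiovascular = two_proportion_p_value(651, 4645, 826, 4652)
# Non-cardiovascular deaths: ramipril 200/4645 versus placebo 192/4652.
p_non_cardiovascular = two_proportion_p_value(200, 4645, 192, 4652)

print(f"cardiovascular events: p = {p_cardiovascular:.2g}")
print(f"non-cardiovascular death: p = {p_non_cardiovascular:.2g}")
```

Under these assumptions the first comparison falls well below 0.05 and the second does not, which matches the pattern described for statistically significant and non-significant findings.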
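Confidence intervals
A confidence interval provides additional information: a range of plausible values around the observed effect size.
It has a specified confidence level, usually 95%. Loosely, we can be 95% confident that the real treatment effect lies within the range; strictly, 95% of intervals calculated in this way would contain the real effect.
The confidence level is usually 95% but might be 99%, in the same way that the significance threshold for a p-value is usually 5% (p<0.05) but might be 1% (p<0.01).
The end points of the confidence interval are the confidence limits.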
Confidence intervals can easily show whether or not statistical significance has been reached.
If the interval captures the value of “no effect” (for example, a relative risk of 1), it implies statistical non-significance.
But the confidence limits also show the largest and smallest effects that are likely.
Example:
Randomised double-blind controlled trial: ramipril in patients at high risk of cardiovascular events

Outcome                                  Ramipril (N=4,645)   Placebo (N=4,652)   Relative risk
                                         Number (%)           Number (%)          (95% CI)
Cardiovascular event (including death)   651 (14.0)           826 (17.8)          0.78 (0.70 to 0.86)
Death from non-cardiovascular cause      200 (4.3)            192 (4.1)           1.03 (0.85 to 1.26)
Death from any cause                     482 (10.4)           569 (12.2)          0.84 (0.75 to 0.95)
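The link between a confidence interval and statistical significance can be checked with a short sketch. It assumes the standard log-scale (Wald) approximation for a crude relative risk; the function name is hypothetical. Because the published figures come from the trial's own analysis, the crude intervals computed here are close to, but not identical with, those in the table.

```python
import math

def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Crude relative risk with an approximate 95% CI (log method)."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent groups.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# (events on ramipril, events on placebo) for each outcome in the table.
outcomes = {
    "Cardiovascular event": (651, 826),
    "Non-cardiovascular death": (200, 192),
    "Death from any cause": (482, 569),
}

results = {}
for name, (treated, control) in outcomes.items():
    rr, lower, upper = relative_risk_ci(treated, 4645, control, 4652)
    # An interval that contains 1 ("no effect") is non-significant.
    includes_no_effect = lower <= 1.0 <= upper
    results[name] = (rr, lower, upper, includes_no_effect)
    print(f"{name}: RR {rr:.2f} ({lower:.2f} to {upper:.2f}), "
          f"includes 1: {includes_no_effect}")
```

Only the interval for non-cardiovascular death spans 1, so it is the only one of the three outcomes that is not statistically significant at the 5% level.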
The upper and lower limits of the CI indicate how big or small the true effect is likely to be.
If the CI is narrow, we can be confident that any effects far from this range have been ruled out by the study.
Large studies with high “power” tend to generate narrow CIs.
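The effect of study size on interval width can be seen in a small sketch. The data here are hypothetical: the event risks are held at roughly 14% and 17.8% while the number of patients per group is varied, and the interval uses the same log-scale approximation for a relative risk (an assumption, not the method of any particular trial).

```python
import math

def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Crude relative risk with an approximate 95% CI (log method)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical trials with the same underlying risks but different sizes.
intervals = []
for n_per_group in (100, 1000, 10000):
    events_treated = round(0.14 * n_per_group)
    events_control = round(0.178 * n_per_group)
    rr, lower, upper = relative_risk_ci(
        events_treated, n_per_group, events_control, n_per_group
    )
    intervals.append((n_per_group, lower, upper))
    print(f"n = {n_per_group:>5} per group: "
          f"RR {rr:.2f} ({lower:.2f} to {upper:.2f})")

# Width of each interval, smallest study first.
widths = [upper - lower for _, lower, upper in intervals]
```

With the same observed effect, the smallest hypothetical study gives an interval wide enough to include 1, while the larger ones do not.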
Pitfalls in interpretation
Type 1 error: a study may suggest a true difference between treatments when none exists. With a threshold of p<0.05, a study of a treatment with no real effect still has a 5% chance of producing a spurious “significant” finding.
The more comparisons made in a study, the greater the chance that at least one of the results will be spurious.
Statistical significance might not mean clinical significance.
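The multiple-comparisons point can be put into numbers with a simple sketch. It assumes the comparisons are independent and that no true effects exist, which is a simplification; the function name is hypothetical.

```python
def chance_of_any_false_positive(n_comparisons, alpha=0.05):
    """Probability of at least one spurious 'significant' result.

    Assumes independent comparisons, each tested at level alpha,
    with no true difference in any of them.
    """
    return 1 - (1 - alpha) ** n_comparisons

false_positive_risk = {
    k: chance_of_any_false_positive(k) for k in (1, 5, 10, 20)
}
for k, risk in false_positive_risk.items():
    print(f"{k:>2} comparisons: {risk:.0%} chance of a spurious finding")
```

Under these assumptions, 20 comparisons give roughly a 64% chance of at least one spurious result, compared with 5% for a single comparison.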
Type 2 error: assuming that non-significance means there is no true difference, when the study may simply have failed to detect a real one.
A non-significant CI (one that includes “no effect”) is consistent with there being no true difference between the 2 groups, but it doesn’t necessarily mean that there is no benefit. Confidence limits provide more information than a p-value here.
External validity: will the results apply to my patients, or to a particular individual?