Design and Analysis of Cluster Randomization Trials in Health Research Allan Donner, Ph.D., Professor and Chair Department of Epidemiology & Biostatistics The University of Western Ontario, London, Ontario, Canada Neil Klar, Ph.D., Senior Biostatistician Division of Preventive Oncology Cancer Care Ontario, Toronto, Ontario, Canada
Dr. Allan Donner
Dr. Neil Klar
Learning Objectives To distinguish experimental trials based on the unit of randomization (e.g. individual, family, community). To appreciate the consequences of cluster randomization on sample size estimation and data analysis. To identify key features of a cluster randomization trial which need to be included in reports.
What Are Cluster Randomization Trials? Cluster randomization trials are experiments in which clusters of individuals rather than independent individuals are randomly allocated to intervention groups.
Example 1: Study Purpose: To evaluate the effectiveness of Vitamin A supplements on childhood mortality. 450 villages in Indonesia were randomly assigned to either participate in a Vitamin A supplementation scheme, or serve as a control. One year mortality rates were compared in the two groups. Sommer et al. (Lancet, 1986)
Example 2: Study Purpose: To promote smoking cessation using community resources. 11 paired communities were selected and one member of each pair was randomly assigned to the intervention group. 5-year smoking cessation rates were compared in the two groups. Communities were matched on demographic characteristics (e.g. size, population density) and geographical proximity. COMMIT Research Group (Am J Public Health, 1995)
Example 3: Study Purpose: To evaluate the effectiveness of treated nasal tissues versus standard tissues. 90 families were selected and randomized to one of the two intervention groups separately in each of three family size strata (2, 3, or 4 members per family). 24-week incidence of respiratory illness were compared in the two groups. Farr et al. (Am. J. Epid., 1988)
Reasons for Adopting Cluster Randomization Administrative convenience To obtain cooperation of investigators Ethical considerations To enhance subject compliance To avoid treatment group contamination Intervention is naturally applied at the cluster level
Unit of Randomization vs. Unit of Analysis A key property of cluster randomization trials is that inferences are frequently intended to apply at the individual level while randomization is at the cluster or group level. Thus the unit of randomization may be different from the unit of analysis. In this case, the lack of independence among individuals in the same cluster, i.e. between- cluster variation, creates special methologic challenges in both design and analysis.
Implications of Between-Cluster Variation Presence of between-cluster variation implies: (i) Reduction in effective sample size. Extent depends on degree of within-cluster correlation and on average cluster size. (ii) Standard approaches for sample size estimation and statistical analysis do not apply. Application of standard sample size approaches leads to an underpowered study. Application of standard statistical methods generally tends to bias p-values downwards, i.e. could lead to spurious statistical significance.
Possible Reasons for Between- Cluster Variation 1. Subjects frequently select the clusters to which they belong e.g., Patient characteristics could be related to age or sex differences among physicians 2. Important covariates at the cluster level affect all individuals within the cluster in the same manner e.g. Differences in temperature between nurseries may be related to infection rates 3. Individuals within clusters frequently interact and, as a result, may respond similarly e.g. Education strategies or therapies provided in a group setting 4. Tendency of infectious diseases to spread more rapidly within than among families or communities.
Quantifying the Effect of Clustering Consider a trial in which k clusters of size m are randomly assigned to each of an experimental and control group. Also assume the response variable Y is normally distributed with common variance 2 Aim is to test H 0 : 1 = 2. Then appropriate estimates of 1 and 2 are given by Y 1, Y 2, the usual sample means. Also: V(Y i ) = ( 2 /km) [1 + (m - 1) ], i =1,2 where is the coefficient of intracluster correlation. Then IF = 1 + (m- 1) is the “variance inflation factor” or “design effect” associated with cluster randomization.
Variance Component Interpretation of The overall response variance 2 may be expressed as the sum of two components, i.e., 2 = 2 A + 2 W, where 2 A = between-cluster component of variance 2 W = within-cluster component of variance then = 2 A / ( 2 A + 2 W )
Sample Size Requirements for Completely Randomized Designs Comparison of Means: Suppose k clusters of size m are to be assigned to each of two intervention groups. Then the number of subjects required per intervention group to test H 0 : 1 = 2 is given by n = {(Z /2 + Z ) 2 (2 2 ) [1 + (m – 1) ]} / ( 1 - 2 ) 2 where 2 = 2 A + 2 W Equivalently, the number of required clusters is given by k = n/m.
Example Hsieh (1988) reported on the results of a pilot study for a planned 5-year trial examining cardiovascular risk factors, obtaining cholesterol levels from 754 individuals in 4 worksites. Estimated variance components were S 2 W = 2209, S 2 A = 93. value of assessed as = 93 / ( ) = 0.04 Assuming = 70 subjects/worksite, IF = 1 + (70 - 1) 0.04 = 3.76
To obtain 80% power at =.05 (2 sided) for detecting a mean difference of 20 mg/dl between intervention groups, the number of required worksites per group is given by k ={( ) 2 2(2302) (3.76)} / 70 (20) 2 = 4.8 5 To adjust for the use of normal distribution critical values, and possible loss of follow-up, might enroll 7 clusters per group.
Impact on Power of Increasing the Number of Clusters vs. Increasing Cluster Size: Let d = mean difference between intervention groups then, Var (d) = (2 2 / km) [ 1 + ( m – 1) ] As the number of clusters k , Var(d) 0 but, as the cluster size size m , Var (d) (2 2 ) / k = 2 2 A /k Trial randomizing between 30 and 50 individuals will tend to have almost the same statistical power as trials randomizing the same number of much larger units. But clusters of larger size are often recruited for very practical reasons (to reduce contamination, to avoid logistic or ethical problems, etc.).
Factors Influencing Loss of Precision 1.Interventions often applied on a group basis with little or no attention given to individual study participants. 2.Some studies permit the immigration of new subjects after baseline. 3.Entire clusters, rather than just individuals, may be lost to follow-up. 4.Over-optimistic expectations regarding effect size.
Strategies for Improving Precision in Cluster Randomization Trials 1. Establish cluster-level eligibility criteria so as to reduce between-cluster variability e.g., geographical restrictions. 2.Consider increasing the number of clusters randomized, even if only in the control group. 3.Consider matching or stratifying in the design by baseline variable having prognostic importance. 4.Obtain baseline measurements on other potentially important prognostic variables. 5.Take repeated assessments over time from the same clusters or from different clusters of subjects. 6.Develop a detailed protocol for ensuring compliance and minimizing loss to follow-up.
The Importance of Cluster Level Replication Some investigators have designed community intervention trials in which exactly one cluster has been assigned to each intervention group. Such trials invariably result in interpretational difficulties caused by the total confounding of two sources of variation: –the variation in response due to the effect of intervention, and –the natural variation that exists between the two communities (clusters) even in the absence of an intervention effect. Analysis is only possible under the untenable assumption that there is no clustering of individuals responses within communities. More attention to the effects of clustering when determining sample size might help to eliminate designs which lack replication.
Analysis of Binary Outcomes Objective: To assess the effects of interventions on individuals when clusters are the sampling unit. Statistical Issue: Responses on individuals within the same cluster tend to be positively correlated, violating the assumption of independence required for the application of standard statistical methods.
Example: Data obtained from a study evaluating the effect of school-based interventions in reducing adolescent tobacco use. 12 school units were randomly assigned to each of four conditions, including three intervention conditions and a control condition (existing curriculum). We compare here the effect of the SFG (Smoke Free Generation) intervention to the existing curriculum (EC) with respect to reducing the proportion of children who report using smokeless tobacco after four years of follow-up.
Data Overall rates of tobacco use Control: 91/1479 =.062 Experimental:58/1341 =.043
Procedures Reviewed 1. Standard chi-squared test 2. Two sample t-test on school-specific event rates 3. Direct adjustment of standard chi- square test 4. Generalized estimating equations approach
Hypothesis to be tested:
Two-sample T-test Comparing Average Values of the Event Rates Pooled variance = ( ) / 2= t 22 = ( ) / .00095(1/12 + 1/12) = 1.66 (p =.11)
Adjusted Chi-square Test An adjustment which depends on computing clustering correction factors for each group, is given by: C 1 = { m 1j [1 + (m 1j - 1) ]}/ m 1j, C 2 = { m 2j [1 + (m 2j - 1) ]}/ m 2j, where m 1j, m 2j are the cluster sizes within groups 1 and 2, respectively, and is an estimate of within-cluster correlation.
For Data in the Example: =.011, C 1 = 2.599,C 2 = Then the adjusted one degree of freedom chi-square statistic is given by: 2 A = {M 1 (P 1 – P) 2 / [C 1 P (1 – P)]} + {M 2 (P 2 - P) 2 / [C 2 P (1 - P)]} = {1479 ( ) 2 / [2.599 (.053) ( )]} + {1341 ( ) 2 / [2.536 (.053) ( )]} = 1.83 (p =.18) Comments: 1. Reduces to standard Pearson chi-square test when = 0 2. Imposes no distributional assumptions on the responses within a cluster. 3. Assumes the C i, i = 1,2 are not significantly different, i.e. estimate the same population design effect.
Generalized Estimating Equations Approach The generalized estimating equations (GEE) approach, developed by Liang and Zeger (1986), can be used to construct an extension of standard logistic regression which adjusts for the effect of clustering, and does not require parametric assumptions. Consider the logistic-regression model given by log [P / (1 - P)] = 0 + 1 where = 1 if experimental and = 0 if control The odds ratio for the effect of intervention is given by exp ( 1 )
Result: The resulting robust one degree of freedom chi-square statistic, constructed using a working exchangeable correlation matrix is given by X 2 LZ = 2.62 ( = 0.11). Comments: 1) The GEE extensions of logistic regression were fit using the statistical package SAS. These procedures are also available in several other statistical packages (e.g., STATA, SUDAAN). 2) Robust statistical inference are valid so long as there are large numbers of clusters even if the working correlation is not correctly specified. 3) Can be extended to model dependence on individual level covariates.
Summary of Results
Reporting Cluster Randomization Trials: Reporting of study design: Justify the use of cluster randomization. Provide a clear definition of the unit of randomization. Explain how the chosen sample size or statistical power calculations accounts for between-cluster variation.
Reporting of study results: Provide the number of clusters randomized, the average cluster size and the number of subjects selected for study from each cluster. Provide the values of the intracluster correlation coefficient as calculated for the primary outcome variables. Explain how the reported statistical analyses account for between-cluster variations.
References & Links References: 1. Donner A, Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. Arnold, London Statistics notes: Sample size in cluster randomisation Sally M Kerry and J Martin Bland BMJ 1998; 316: 549.
3. Cluster randomised trials: time for improvement Marion K Campbell and Jeremy M Grimshaw BMJ 1998; 317: Ethical issues in the design and conduct of cluster randomised controlled trials Sarah J L Edwards, David A Braunholtz, Richard J Lilford, and Andrew J Stevens BMJ 1999; 318:
Links: The last three articles are available at Arnold Publishers