Analysis for Designs with Assignment of Both Clusters and Individuals Cristofer Price SREE - March 2017
The Problem
How to analyze data from designs where:
- Part of sample has assignment of clusters
  e.g., random assignment (RA) of districts, schools, classes
- Part of sample has assignment of individuals within clusters
  e.g., RA of schools within districts
  e.g., RA of students within schools
Example
Special education intervention in rural schools
- In some schools, there are enough eligible students to fill treatment slots and also have students in the business-as-usual control condition
  So there is within-school assignment
- In other schools, there are not enough eligible students to fill the slots
  So there is between-school assignment (T schools and C schools)
Prototypical Design with RA of Schools and RA of Students within Schools
Desired Outcome of Analysis Approach
Produce a point estimate and standard error of the average treatment effect where:
- Effect is averaged across both the RA of schools and the RA of students within schools designs
- Impact estimate is unbiased
- Impact standard error is unbiased
This Presentation is NOT About
- Partially nested RCTs
  Individuals are randomized, but T group individuals are placed into clusters and C group individuals are not
  Roberts & Roberts (2005); Bauer et al. (2008); Baldwin et al. (2011); Lohr et al. (2014); Sterba et al. (2014)
- Individually randomized grouped treatment designs
  Inability to get SEs exactly right in individual RA designs where T and C individuals are placed into clusters (post-RA), and similarity of experiences within clusters induces correlation within clusters
  Weiss et al. (2015); Lee & Thompson (2005)
This Presentation IS About
Two approaches to analyzing the mixed assignment design to produce a point estimate and standard error of the impact
- Approach 1: Perform two separate analyses, get estimates and SEs, and calculate a weighted average of the two estimates
- Approach 2: Analyze all data in a single multilevel model that has two separate sets of intercept terms
  One set of random intercept terms for schools in the school RA design
  One set of intercept terms for schools in the RA of students within schools design
Take Away Points
- I show how to implement the two approaches
- I show how I assessed the extent to which they produced unbiased impact estimates and standard errors
- I conclude that both approaches:
  Work as well as each other
  Work about as well as standard approaches to analyzing the two samples separately
  Produce more precise estimates than those obtained in separate analyses
Approach 1: Model for Sample with RA of Schools
- Two-level model with students (level 1) nested in schools (level 2)
- And written in combined form
(For simplicity, I’m showing models with no covariates)
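A sketch of the standard two-level random-intercept model for a school-randomized design with no covariates. The notation ($Y_{ij}$, $T_j$, $\gamma_{00}$, $\gamma_{01}$, $u_{0j}$, $e_{ij}$) is an assumption and may differ from the symbols on the original slide:

```latex
\begin{align*}
\text{Level 1 (students):}\quad & Y_{ij} = \beta_{0j} + e_{ij}, & e_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2 (schools):}\quad  & \beta_{0j} = \gamma_{00} + \gamma_{01} T_j + u_{0j}, & u_{0j} &\sim N(0, \tau^2) \\
\text{Combined form:}\quad      & Y_{ij} = \gamma_{00} + \gamma_{01} T_j + u_{0j} + e_{ij}
\end{align*}
```

Here $T_j = 1$ for schools assigned to treatment and $0$ for control schools, so $\gamma_{01}$ is the treatment effect and $u_{0j}$ is the random school intercept.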
Approach 1: Model for Sample with RA of Students within Schools I’m going to describe three commonly used models for designs with random assignment of students within schools
Approach 1: Model for Sample with RA of Students within Schools
- Fixed treatment effect
- Fixed dummies for schools
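A sketch of what a model with a fixed treatment effect and fixed school dummies typically looks like, with no covariates. The symbols are assumed notation, not necessarily the presenter's:

```latex
\[
Y_{ij} = \alpha_j + \beta\, T_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2)
\]
```

Here $\alpha_j$ is the fixed intercept (dummy-variable coefficient) for school $j$, $T_{ij} = 1$ if student $i$ in school $j$ was assigned to treatment, and $\beta$ is the single treatment effect common to all schools.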
Approach 1: Model for Sample with RA of Students within Schools
- Fixed dummies for schools
- Treatment by school interaction terms
This model will produce a separate impact estimate for each school.
To get an overall estimate, need to calculate an average of the individual estimates (perhaps a weighted average)
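A sketch of the interaction model and of the averaging step, under assumed notation. The weights $w_j$ are left generic because the slide does not say which weights are used, and the standard error formula assumes the school-level estimates are independent:

```latex
\begin{align*}
Y_{ij} &= \alpha_j + \beta_j\, T_{ij} + e_{ij} \\
\hat{\beta} &= \sum_{j=1}^{J} w_j\, \hat{\beta}_j, \qquad \sum_{j=1}^{J} w_j = 1 \\
SE(\hat{\beta}) &= \sqrt{\sum_{j=1}^{J} w_j^{2}\, SE(\hat{\beta}_j)^{2}}
\end{align*}
```

Setting $w_j = 1/J$ gives a simple average across schools; weights proportional to school sample size or to precision are other common choices.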
Approach 1: Model for Sample with RA of Students within Schools
- Random intercepts for schools
- Random treatment effects
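A sketch of a model with random school intercepts and random treatment effects, in combined form with no covariates. The notation is assumed:

```latex
\[
Y_{ij} = \gamma_{00} + u_{0j} + (\gamma_{10} + u_{1j})\, T_{ij} + e_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N\!\left(\mathbf{0}, \boldsymbol{\Sigma}\right),
\quad e_{ij} \sim N(0, \sigma^2)
\]
```

Here $\gamma_{10}$ is the average treatment effect across schools and $u_{1j}$ is school $j$'s deviation from that average.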
Approach 1: Get the Relevant Estimates from Models 1 and 2
Approach 1: Create Weights Create weights that are proportional to the inverse of the variance of the estimates
Approach 1: Calculate the Combined Treatment Effect Estimate I call this the “meta-analytic” approach
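A sketch of the usual inverse-variance weighting used in meta-analysis, which is one way to read the weighting step. Subscript 1 denotes the school RA sample and subscript 2 the student RA sample; this labeling and the symbols are assumptions:

```latex
\begin{align*}
w_k &= \frac{1 / SE(\hat{\beta}_k)^{2}}{1 / SE(\hat{\beta}_1)^{2} + 1 / SE(\hat{\beta}_2)^{2}}, \qquad k = 1, 2 \\
\hat{\beta}_{\text{combined}} &= w_1 \hat{\beta}_1 + w_2 \hat{\beta}_2
\end{align*}
```

The weights sum to one, and the more precise of the two estimates receives the larger weight.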
Approach 1: Calculate the Combined Standard Error
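The calculation can be sketched in a few lines of Python. This is a minimal illustration that assumes the two estimates are independent and that weights are normalized inverse variances; the function name and the input numbers are hypothetical and are not results from the presentation.

```python
import math


def combine_estimates(estimates, std_errors):
    """Inverse-variance weighted average of independent impact estimates.

    Returns (combined_estimate, combined_se, weights).
    Assumes the estimates are statistically independent.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need equally many estimates and standard errors")
    # Weights proportional to 1 / variance, normalized to sum to 1.
    inv_var = [1.0 / se ** 2 for se in std_errors]
    total = sum(inv_var)
    weights = [v / total for v in inv_var]
    combined = sum(w * b for w, b in zip(weights, estimates))
    # Variance of a weighted sum of independent estimates.
    combined_se = math.sqrt(sum((w * se) ** 2 for w, se in zip(weights, std_errors)))
    return combined, combined_se, weights


# Hypothetical inputs: school RA sample first, student RA sample second.
est, se, w = combine_estimates([0.18, 0.22], [0.10, 0.05])
print(f"weights={w}, estimate={est:.3f}, SE={se:.4f}")
```

With normalized inverse-variance weights, the combined SE is never larger than the smaller of the two input SEs, which is consistent with the precision gain reported later in the talk.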
Approach 2: Analysis of Both Samples in a Single Model
I will show:
- Model 3: A single model that produces separate estimates for the two samples
  This is to facilitate comparisons of the single-model results to analyzing the data from the two samples separately
- Model 4a: A combined model that produces a single combined fixed treatment estimate
- Model 4b: A combined model that has a random treatment effect for the sample where students were randomized within schools
Fit a Single Model to Both Samples, Get Two Separate Treatment Estimates
Combined model for full sample with separate treatment estimates for the subsets with RA of schools and RA of students
- Random intercept terms for part of sample where schools were RA’d
- Fixed intercept terms for part of sample where students were RA’d
- Treatment effect for part of sample where students were RA’d
- Difference between the treatment effect for the part of sample where schools were RA’d and the part of sample where students were RA’d
- Indicator: = 1 if schools RA’d, = 0 if students RA’d
I’ll show you that this model produces the same estimates and SEs as Models 1 and 2a
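One way to write a model with these components, as a sketch under assumed notation. $S_j$ stands for the indicator that equals 1 if school $j$ is in the school RA sample and 0 if it is in the student RA sample:

```latex
\[
Y_{ij} = S_j\,(\gamma_0 + u_j) + (1 - S_j)\,\alpha_j
       + \beta_1\, T_{ij} + \beta_2\,(T_{ij} \times S_j) + e_{ij},
\qquad u_j \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)
\]
```

In this form $\beta_1$ is the treatment effect where students were RA'd, $\beta_2$ is the difference between the two treatment effects, and $\beta_1 + \beta_2$ is the treatment effect where schools were RA'd. The random intercept $u_j$ applies only when $S_j = 1$, and the fixed school intercept $\alpha_j$ applies only when $S_j = 0$.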
Approach 2: Obtain a Combined Estimate from a Single Model
- Fixed treatment effect
- Random intercepts for schools where schools were RA’d
- Fixed dummies (intercepts) for schools where students were RA’d
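A sketch of a model with these three components, under assumed notation. $S_j = 1$ if school $j$ is in the school RA sample and $0$ if it is in the student RA sample:

```latex
\[
Y_{ij} = S_j\,(\gamma_0 + u_j) + (1 - S_j)\,\alpha_j + \beta\, T_{ij} + e_{ij},
\qquad u_j \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)
\]
```

The single coefficient $\beta$ is the combined treatment effect, estimated from both parts of the sample at once.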
Approach 2: Using Software to Fit Model 4a
e.g., in SAS:
- Two sets of school IDs
- “class” statement creates dummy variables from the IDs
- Fixed dummies (intercepts) for schools where students were RA’d
- Random intercepts for schools where schools were RA’d
Approach 2: How to Code Data to Get the Two Sets of Intercepts Here are the original school IDs
Approach 2: How to Code Data to Get the Two Sets of Intercepts Create two new sets of school IDs
Approach 2: How to Code Data to Get the Two Sets of Intercepts
For the part of the sample where schools were RA’d to T & C:
- Assign the original IDs to “Schid1”
- And assign zeros to “Schid2”
Approach 2: How to Code Data to Get the Two Sets of Intercepts
For the part of the sample where students were RA’d to T & C within schools:
- Assign the original IDs to “Schid2”
- And assign zeros to “Schid1”
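The recoding rule can be sketched in Python. This is an illustration only: the row layout, the key name `schid` for the original ID, and the use of `SchoolAssign` as the 1/0 indicator of which design a school belongs to are assumptions, and the example rows are made up. It also assumes no original school ID is 0, since 0 is used as the placeholder value.

```python
def code_school_ids(rows):
    """Return copies of rows with the two new ID columns added.

    Each row is a dict with:
      'schid'        - original school ID (assumed nonzero)
      'SchoolAssign' - 1 if the school was RA'd, 0 if students were RA'd
    """
    coded = []
    for row in rows:
        new_row = dict(row)  # leave the input untouched
        if row["SchoolAssign"] == 1:
            # School RA sample: keep the ID in Schid1, zero out Schid2.
            new_row["Schid1"] = row["schid"]
            new_row["Schid2"] = 0
        else:
            # Student RA sample: keep the ID in Schid2, zero out Schid1.
            new_row["Schid1"] = 0
            new_row["Schid2"] = row["schid"]
        coded.append(new_row)
    return coded


# Hypothetical example: schools 101 and 102 were RA'd, school 201 RA'd students.
example = [
    {"schid": 101, "SchoolAssign": 1},
    {"schid": 102, "SchoolAssign": 1},
    {"schid": 201, "SchoolAssign": 0},
]
for r in code_school_ids(example):
    print(r)
```

After this step, every student record carries both IDs, so one ID can drive the random intercepts and the other the fixed school dummies in a single model fit.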
Approach 2: Obtain a Combined Estimate from a Single Model
Model 4b
- Has random treatment effects for the part of the sample where students were RA’d within schools
- And two sets of random intercepts
- Indicator: = 1 if schools RA’d, = 0 if students RA’d
- Average treatment effect
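One possible form of a model with these components, as a sketch under assumed notation. $S_j = 1$ if school $j$ is in the school RA sample and $0$ if it is in the student RA sample:

```latex
\[
Y_{ij} = S_j\,(\gamma_0 + u_j) + (1 - S_j)\,(\gamma_1 + r_j)
       + \bigl[\delta + (1 - S_j)\, v_j\bigr]\, T_{ij} + e_{ij}
\]
```

Here $u_j$ and $r_j$ are the two sets of random intercepts, $v_j$ is the random treatment effect that applies only where students were RA'd, and $\delta$ is the average treatment effect. Whether the two samples share a common intercept mean, and how the random terms covary, are details this sketch does not settle.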
How do these Approaches Perform?
Used simulated data to assess the extent to which each approach produces treatment effect estimates and standard errors that are unbiased.
Three sets of simulations (10,000 replications each):
- Set 1: Fixed set of schools and students. In each replication, schools and students are assigned to different conditions (T or C).
- Set 2: Fixed set of schools, but different students within each school in each replication. The distribution of students’ underlying abilities within each school stays the same across replications.
- Set 3: Infinite population of schools. In each replication, different schools and students are selected and randomly assigned.
Simulated Data
In each simulation:
- True average treatment effect = 0.20
  1/5 of schools or students: true effect = 0.10
  1/5 of schools or students: true effect = 0.15
  1/5 of schools or students: true effect = 0.20
  1/5 of schools or students: true effect = 0.25
  1/5 of schools or students: true effect = 0.30
- ICC = 0.20
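A minimal Python sketch of how data with these properties could be generated for the school RA part of the sample. The data-generating process here is an assumption, since the slides do not give it: total outcome variance is set to 1 so that a between-school variance of 0.20 yields ICC = 0.20, effects are cycled across schools so that each value applies to one fifth of them, and half of the schools are assigned to treatment.

```python
import random

EFFECTS = [0.10, 0.15, 0.20, 0.25, 0.30]  # true effects, each for 1/5 of schools
ICC = 0.20
TOTAL_VAR = 1.0                 # assumption: outcome variance standardized to 1
TAU2 = ICC * TOTAL_VAR          # between-school variance
SIGMA2 = TOTAL_VAR - TAU2       # within-school (student) variance


def simulate_school_ra(n_schools, n_students, seed=0):
    """Simulate one replication with random assignment of whole schools."""
    rng = random.Random(seed)
    # Assign half of the schools (rounded down) to treatment at random.
    treated = [1] * (n_schools // 2) + [0] * (n_schools - n_schools // 2)
    rng.shuffle(treated)
    rows = []
    for j in range(n_schools):
        u_j = rng.gauss(0.0, TAU2 ** 0.5)        # random school intercept
        effect_j = EFFECTS[j % len(EFFECTS)]     # this school's true effect
        for _ in range(n_students):
            e_ij = rng.gauss(0.0, SIGMA2 ** 0.5)  # student-level residual
            y = u_j + effect_j * treated[j] + e_ij
            rows.append({"school": j, "treat": treated[j], "y": y})
    return rows


data = simulate_school_ra(n_schools=20, n_students=30, seed=1)
print(len(data), "student records")
```

The five effect values average to 0.20, which matches the stated true average treatment effect.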
Mean of impact estimates over the 10,000 simulations. True impact is 0.20
Mean of standard error estimates over the 10,000 simulations.
True SE = square root of variance of impact estimates over the 10,000 simulations
Difference between mean of SE estimates and True SEs
Difference between mean of SE estimates and True SEs as a percent of the True SE
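The summary quantities described on these slides can be computed as in the following Python sketch. The function and key names are hypothetical, the example inputs are made up, and the use of the sample variance (dividing by n - 1) is an assumption; with 10,000 replications the choice between sample and population variance makes almost no difference.

```python
import math
import statistics


def summarize_replications(estimates, se_estimates):
    """Summarize impact estimates and SE estimates across replications."""
    mean_estimate = statistics.fmean(estimates)
    mean_se = statistics.fmean(se_estimates)
    # "True SE": empirical standard deviation of the impact estimates.
    true_se = math.sqrt(statistics.variance(estimates))
    se_diff = mean_se - true_se
    return {
        "mean_estimate": mean_estimate,
        "mean_se": mean_se,
        "true_se": true_se,
        "se_diff": se_diff,
        "se_diff_pct": 100.0 * se_diff / true_se,
    }


# Hypothetical three-replication example (a real run would use 10,000).
summary = summarize_replications([0.1, 0.2, 0.3], [0.11, 0.11, 0.11])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A mean estimate close to the true impact indicates an unbiased impact estimate, and a percent difference close to zero indicates that the model-based SEs track the actual sampling variability.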
Same
The models that produce a single estimate from the two samples are more precise than the models that produce separate impact estimates in each sample
But the extent to which there is bias in the SEs is about the same
Take Away Points
- I presented two easy-to-implement approaches to analyzing data from a mixed assignment design
- I showed how I assessed the extent to which they produced unbiased impact estimates and standard errors
- I concluded that both approaches:
  Work as well as each other
  Work about as well as standard approaches to analyzing the two samples separately
  Produce more precise estimates than those obtained in separate analyses
Which to Use
Either seems good
- Meta-analysis approach would estimate different relationships between covariates and outcomes for the two samples
- Single model combined approach would have common covariate effects across the two samples (unless the covariates were interacted with the “SchoolAssign” dummy variable)
Thank you!