A trial of incentives to attend adult literacy classes
Carole Torgerson, Greg Brooks, Jeremy Miles, David Torgerson
Classes were randomised to incentive or no incentive, in two groups of 14 classes, labelled "X" and "Y" in this data set and blinded for analysis. Group X: 77 students; Group Y: 86 students.
Outcome variable: number of sessions attended.
Compare mean number of sessions ignoring clustering:

. ttest sessions, by(group)

[Stata 8 output: two-sample t test with equal variances, 150 degrees of freedom; Ho: mean(X) - mean(Y) = 0. The numerical columns (Obs, Mean, Std. Err., Std. Dev., 95% CI, t, P) were lost in extraction.]

The P value showed a highly significant difference! But it is wrong: it ignores the clustering!
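The uncorrected analysis above can be sketched in plain Python. This is a hedged illustration: the attendance counts below are invented, not the trial's data, and the function simply reproduces the pooled-variance calculation behind `ttest sessions, by(group)`.

```python
# Pooled-variance two-sample t test, as in Stata's ttest with equal variances.
# Data are invented for illustration only.
import math
from statistics import mean, variance

def two_sample_t(x, y):
    """Return (t statistic, degrees of freedom) for a two-sample t test."""
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, nx + ny - 2

group_x = [10, 12, 9, 14, 11, 13]   # hypothetical sessions attended, group X
group_y = [7, 8, 6, 9, 10, 7]       # hypothetical sessions attended, group Y
t, df = two_sample_t(group_x, group_y)
print(round(t, 3), df)
```

Note the degrees of freedom here count students, which is exactly the mistake: students in the same class are not independent.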
Compare mean number of sessions ignoring clustering, regression:

. regress sessions group

[Stata output: Number of obs and coefficient table values were lost in extraction; F(1, 150) = 7.78.]

The P value is identical to the two-sample t method. It is still wrong: it ignores the clustering!
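Why the regression gives an identical answer: for a 0/1 group dummy, the OLS slope is simply the difference in group means, so the t test on the slope is the two-sample t test. A minimal sketch, with invented data:

```python
# For a binary predictor, the OLS slope equals the difference in group means.
# Data are invented for illustration only.
from statistics import mean

y = [10, 12, 9, 14, 7, 8, 6, 9]   # hypothetical sessions attended
g = [1, 1, 1, 1, 0, 0, 0, 0]      # 1 = group X, 0 = group Y

slope = mean(yi for yi, gi in zip(y, g) if gi) - mean(
    yi for yi, gi in zip(y, g) if not gi)
print(slope)
```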
Compare mean number of sessions including clustering, two-sample t method on cluster means:

. ttest sessions, by(group)

[Stata output: two-sample t test with equal variances on the 28 class means, 26 degrees of freedom; Ho: mean(1) - mean(2) = 0. Numerical columns were lost in extraction.]

The difference is not significant. Almost correct: it takes the data structure into account, but not the variation in class size.
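The cluster-level analysis above can be sketched as: collapse each class to its mean attendance, then run an ordinary two-sample t test on those class means, so the degrees of freedom come from the number of classes, not the number of students. The class data below are invented for illustration.

```python
# Two-sample t test on cluster (class) means. Data are invented.
import math
from statistics import mean, variance

def two_sample_t(x, y):
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, nx + ny - 2

# classes_x[i] is the list of attendance counts for one class in group X
classes_x = [[10, 12, 9], [14, 11], [13, 10, 12, 11]]
classes_y = [[7, 8], [6, 9, 10], [7, 8, 9]]

means_x = [mean(c) for c in classes_x]
means_y = [mean(c) for c in classes_y]
t, df = two_sample_t(means_x, means_y)   # df = number of classes - 2
print(df)
```

Each class contributes one observation regardless of its size, which is why this method still ignores variation in class size.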
Compare mean number of sessions including clustering, regression method on cluster means, weighted by class size:

. regress session group [aweight=learner]

[Stata output: coefficient table values were lost in extraction; F(1, 26) = 2.77.]

The difference is not significant. Correct: it takes the data structure into account, including the variation in class size.
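With a binary predictor, the analytically weighted regression coefficient reduces to the difference of the class-size-weighted group means. A sketch under that simplification, with invented class means and sizes:

```python
# Weighted comparison of cluster means, weights = class sizes.
# Class means and sizes are invented for illustration.
def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# (class mean attendance, class size) for each class
x_classes = [(10.3, 3), (12.5, 2), (11.5, 4)]
y_classes = [(7.5, 2), (8.3, 3), (8.0, 3)]

wx = weighted_mean([m for m, _ in x_classes], [n for _, n in x_classes])
wy = weighted_mean([m for m, _ in y_classes], [n for _, n in y_classes])
diff = wx - wy   # the `group` coefficient in the weighted regression
print(round(diff, 3))
```

Larger classes now pull their group's estimate more strongly, which is the point of the weighting.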
Compare individual number of sessions including clustering, robust standard error method (Huber-White sandwich method):

. regress sessions group, cluster(class)

[Stata output: regression with robust standard errors; Number of obs = 152, Number of clusters (class) = 28, F(1, 27) = 2.79. Coefficient table values were lost in extraction.]

The difference is not significant. Correct: it takes the data structure into account, with a very similar estimate and P value to the method using means. "I can do that using SPSS. So what is the advantage?" We can use subject-level covariates.
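A sketch of the sandwich estimator itself, for the single group dummy, with invented data. It omits the finite-sample correction Stata applies, and a real analysis would use a statistics package; the point is only that residuals are summed within each class before being squared, so within-class correlation inflates the standard error.

```python
# Cluster-robust (Huber-White sandwich) SE for y = b0 + b1*x, x binary.
# Data are invented for illustration; no finite-sample correction applied.
def cluster_robust_ols(data):
    """data: list of (y, x, cluster_id); returns (b1, robust SE of b1)."""
    n = len(data)
    sx = sum(x for _, x, _ in data)
    sxx = sum(x * x for _, x, _ in data)
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]   # (X'X)^-1
    sy = sum(y for y, _, _ in data)
    sxy = sum(y * x for y, x, _ in data)
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    # "Meat": sum residuals within each cluster first, then square
    clusters = {}
    for y, x, c in data:
        u = y - b0 - b1 * x
        g = clusters.setdefault(c, [0.0, 0.0])
        g[0] += u          # sum of residuals in cluster
        g[1] += u * x      # sum of x * residuals in cluster
    m00 = sum(g[0] * g[0] for g in clusters.values())
    m01 = sum(g[0] * g[1] for g in clusters.values())
    m11 = sum(g[1] * g[1] for g in clusters.values())
    # Sandwich variance of b1: row [a, b] of (X'X)^-1 around the meat
    a, b = inv[1][0], inv[1][1]
    var_b1 = a * (a * m00 + b * m01) + b * (a * m01 + b * m11)
    return b1, var_b1 ** 0.5

rows = [(10, 1, "x1"), (12, 1, "x1"), (14, 1, "x2"), (11, 1, "x2"),
        (7, 0, "y1"), (8, 0, "y1"), (6, 0, "y2"), (9, 0, "y2")]
b1, se = cluster_robust_ols(rows)
print(round(b1, 3))
```

Unlike the cluster-means method, this keeps one row per student, which is what makes student-level covariates possible.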
Mid-score = reading score before randomisation.
Compare individual number of sessions including clustering, robust standard error method, adjusting for mid-score:

. regress sessions group midscl, cluster(class)

[Stata output: regression with robust standard errors; Number of obs = 152, Number of clusters (class) = 28. Coefficient table values were lost in extraction.]

The difference is now significant. Correct: it takes the data structure into account, and adjustment for mid-score reveals a true significant difference.