Center for Biofilm Engineering
Marty Hamilton
Professor Emeritus of Statistics
Montana State University

Statistical design & analysis for assessing the efficacy of instructional modules
CS 580
April 24, 2006
Why Statistics?
- Provide convincing results
- Improve communication

"...I do not mean to suggest that computers eliminate stupidity---they may in fact encourage it."
-- Robert P. Abelson, in Statistics as Principled Argument (cited on Rocky Ross's CS 580 home page)
What is Statistics?
- Data
- Design
- Uncertainty assessment
Statistical Thinking
- Data
- Design
- Uncertainty assessment
Data: choosing the quantity to measure
- Reliable test of knowledge
- Quantitative response
Statistical thinking
- Data
- Design
- Uncertainty assessment
After-treatment score
A student used the modules, then scored 80% on the test.
Conclusion: modules have high efficacy
Data: choosing the quantity to measure
- Reliable tests of knowledge: a before-treatment test and an after-treatment test
- Quantitative response: difference in test scores, after-treatment minus before-treatment
After-treatment score
[Plot: a single after-treatment test score on a Low-to-High scale]
Before- and after-treatment scores
[Plot: before- and after-treatment test scores on a Low-to-High scale; the Response is the gain from Before to After]
Difference between before- and after-treatment scores
A student used the modules, then scored 50 points higher on the after-treatment test than on the before-treatment test (Response = 50).
Conclusion: modules have high efficacy
Anticipating criticism: "natural" improvement
[Plot: the same before/after scores, showing the test score the student would have earned without the treatment; part of the Response may be natural improvement]
Anticipating criticism
- Before/after observations for just the "treated" student may not accurately represent the treatment effect
- May need treated and untreated students (i.e., a control)
Control or comparison
The control can be either a negative control (placebo) or a positive control (best conventional practice).
A student taking a conventional classroom lecture/recitation course would provide a positive control or comparison.
Difference scores for each of 12 students, 6 per group
[Plot: difference scores (after - before) for the treated group and the control group]
Of practical importance?
Study design
Before and after test scores for each student in both the treated and control groups
Good study design
- Control or comparison
- Replication
- Randomization
- Anticipate criticism
Data: 20 students per group (randomly assigned?)
[Worksheet layout: two columns, Treatment and Response; 20 rows with Treatment = C and 20 rows with Treatment = T (response values not shown)]
Analysis via Minitab 14
Minitab: FirstStudy_CS580.MTW
Show data layout... matrix
Stat > Basic Statistics > Display Descriptive Statistics... (ask for individual value plot)
Stat > Basic Statistics > 2-Sample t...

Minitab output:
Two-Sample T-Test and CI: Response, Treatment
Two-sample T for Response
Treatment   N   Mean   StDev   SE Mean
C
T
Difference = mu (C) - mu (T)
Estimate for difference:
95% CI for difference: ( , )
T-Test of difference = 0 (vs not =): T-Value =   P-Value =   DF = 38
Both use Pooled StDev =

Null hypothesis: true mean response for Treatment = true mean response for Control
Conclusions:
1. Reject the null hypothesis because it is discredited by the data (p-value < 0.001)
2. 95% confident that the true treatment mean response is between 38.6 and 61.5 points larger than the true control mean response
3. Is this efficacy repeatable?
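The pooled two-sample t computation behind Minitab's 2-Sample t output can be sketched in plain Python. The helper function and the critical value below are illustrative assumptions, not part of the Minitab session, and no scores from FirstStudy_CS580.MTW are reproduced here.

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of the t distribution with 38 df
# (20 + 20 - 2), as used in the output above.
T_CRIT_DF38 = 2.024

def pooled_two_sample_t(x, y, t_crit):
    """Return (estimate of difference, 95% CI, t-value) using a pooled StDev,
    mirroring Minitab's 2-Sample t with 'Assume equal variances' checked."""
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    se = math.sqrt(sp2 * (1 / nx + 1 / ny))
    diff = mean(x) - mean(y)
    return diff, (diff - t_crit * se, diff + t_crit * se), diff / se
```

With the 20 treated and 20 control responses placed in two lists, `pooled_two_sample_t(treated, control, T_CRIT_DF38)` would reproduce the "Estimate for difference", CI, and T-Value lines of the output above.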
Analysis via Minitab 14 (more)
Minitab: SixStudies_CS580.MTW
Show data layout... matrix
Stat > Tables > Descriptive Statistics

Minitab output:
Tabulated statistics: Replicate, Treatment
Rows: Replicate   Columns: Treatment (C, T, All)
Cell contents: Response mean and count for each Replicate x Treatment combination
Analysis via Minitab 14 (more)
Stat > ANOVA > General Linear Model...

Minitab output:
General Linear Model: Response versus Treatment, Replicate
Factor                 Type    Levels  Values
Treatment              fixed        2  C, T
Replicate(Treatment)   random      12  1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6

Analysis of Variance for Response, using Adjusted SS for Tests
Source                 DF   Seq SS   Adj SS   Adj MS   F   P
Treatment
Replicate(Treatment)
Error
Total
S =

Variance Components, using Adjusted SS
Source                 Estimated Value
Replicate(Treatment)    -- variance among replicate studies
Error                   -- variance among students in the same study and treatment
Total                   -- total variance
(row annotations added by Marty)

Repeatability Standard Deviation = 20.5 (single student)
Repeatability Standard Deviation = 9.9 (mean of 20 treated students minus mean of 20 control students)

Stat > Basic Statistics > Normality Test... applied to the residuals provides an evaluation of a key statistical assumption underlying the ANOVA
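The two repeatability figures follow directly from the variance components. As a sanity check, the arithmetic can be reproduced as below; the two component values are assumptions back-solved to match the slide's 20.5 and 9.9, since the actual Minitab estimates are not shown above.

```python
import math

# Assumed variance components (back-solved, not from the Minitab listing):
var_replicate = 29.5   # variance among replicate studies
var_error = 390.8      # variance among students within a study and treatment
n = 20                 # students per group

# Single student: both components contribute in full.
sd_single = math.sqrt(var_replicate + var_error)

# Difference of two independent 20-student group means: the student-to-student
# component is averaged down by n, and the factor 2 comes from subtracting
# two independent means.
sd_diff_of_means = math.sqrt(2 * (var_replicate + var_error / n))
```

Note how little averaging over 20 students helps once the between-study component dominates: this is the point of the trade-off slide below.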
Analysis via Minitab 14 (more)
Data copied from the Tables output and pasted into the worksheet:
Rep   CntrlMean   TrtMean   Mean (Treatment minus Control)
Stat > Basic Statistics > 1-Sample t... analysis of the 6 means
Conclusions:
1. Reject the null hypothesis because it is discredited by the data (p-value < 0.001)
2. Estimated difference in mean responses =
3. 95% confident that the true treatment mean response is between 36.9 and 54.9 points larger than the true control mean response
4. 95% confident that the true treatment mean response is at least 38.6 points larger than the true control mean response
5. The efficacy measure is repeatable
Note: this straightforward analysis of the six means, one for each of the 6 repeated studies, using a 1-sample t-test provides nearly the same results as the ANOVA variance-component approach.
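The 1-sample t shortcut on the six study-level differences is easy to reproduce by hand. The six differences below are invented stand-ins (the real values come from the pasted Tables output), and the critical value is the standard two-sided 95% t quantile for 5 df.

```python
import math
from statistics import mean, stdev

# Hypothetical treatment-minus-control mean differences, one per replicate study.
diffs = [43.1, 47.5, 50.2, 44.8, 41.9, 48.6]

# Two-sided 95% critical value of the t distribution with 5 df (6 - 1).
T_CRIT_DF5 = 2.571

n = len(diffs)
est = mean(diffs)                       # estimated difference in mean responses
se = stdev(diffs) / math.sqrt(n)        # standard error of the six-study mean
t_stat = est / se                       # tests H0: true mean difference = 0
ci = (est - T_CRIT_DF5 * se, est + T_CRIT_DF5 * se)
```

Because the six study-level means already absorb both the between-study and the within-study variability, this simple t interval is nearly the same as the ANOVA variance-component interval on the slide above.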
Trade-offs: What is the main source of variability? It is often more important to repeat the study than to expend time and materials finding a precise efficacy estimate for a single study.
Fin