Robert E. Slavin
Institute for Effective Education, University of York
Negative correlation noted in other fields
Reasons:
◦ Underpowered studies with null results disappear
◦ Small studies of lower methodological quality
◦ Superrealization bias
◦ Measures aligned with treatments
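The first reason above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, none of which come from the talk: a true effect size of +0.20 in every study, a normal approximation with standard error sqrt(2/n) for two equal groups of n students, and a hypothetical rule that a study "survives" only if its result is significantly positive (z > 1.96). The function name and all parameters are illustrative.

```python
import math
import random


def simulate_published_mean(n_per_group, true_effect=0.2, n_studies=2000, seed=0):
    """Mean effect size among simulated studies that survive a significance filter.

    Assumptions (hypothetical, for illustration only):
    - every study estimates the same true effect
    - observed ES ~ Normal(true_effect, sqrt(2 / n_per_group))
    - only studies with z = ES / SE > 1.96 are "published"
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)
    published = []
    for _ in range(n_studies):
        observed = rng.gauss(true_effect, se)
        if observed / se > 1.96:  # null and negative results disappear
            published.append(observed)
    if not published:
        return float("nan")
    return sum(published) / len(published)


small = simulate_published_mean(n_per_group=25)
large = simulate_published_mean(n_per_group=1000)
print(f"small studies: {small:+.2f}  large studies: {large:+.2f}")
```

Under these assumptions every surviving small study must exceed roughly +0.55 just to pass the filter, while nearly all large studies survive and their mean stays close to the true +0.20. The filter alone is enough to produce a negative relationship between sample size and reported effect size.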
Elementary and secondary math
◦ 185 qualifying studies
◦ Studies with inherent measures, brief durations, big pretest differences excluded
Table 1. Total Sample Size
Columns: Recode; Range; Number of Studies
◦ 1: Up to
◦ or more: 23
◦ TOTAL: 185
Overall correlation: -.28, p<.001
◦ Sample sizes ≤100: ES=
◦ Sample sizes > 2000: ES=
◦ Random: ES=+0.24
◦ Randomized quasi-experiments: ES=+0.29
◦ Matched: ES=
Difference disappears when sample size considered
◦ Weight by sample size
◦ Require minimum sample size for high ratings
◦ BEE requires 500 students in 2+ studies
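Weighting by sample size can be sketched as follows. The study list is hypothetical and is not data from the review; the function names are illustrative. Meta-analyses often use inverse-variance weights instead, so treat this as the simplest possible version of the idea.

```python
def weighted_mean_effect_size(studies):
    """Sample-size-weighted mean of effect sizes.

    `studies` is a list of (sample_size, effect_size) pairs.
    """
    total_n = sum(n for n, _ in studies)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(n * es for n, es in studies) / total_n


def unweighted_mean_effect_size(studies):
    """Simple average that treats every study equally, whatever its size."""
    return sum(es for _, es in studies) / len(studies)


# Hypothetical studies: (total sample size, effect size)
hypothetical_studies = [(60, 0.70), (90, 0.55), (400, 0.30), (2500, 0.10)]

print(f"unweighted: {unweighted_mean_effect_size(hypothetical_studies):+.2f}")
print(f"weighted:   {weighted_mean_effect_size(hypothetical_studies):+.2f}")
```

With these made-up numbers the unweighted mean is about +0.41 and the weighted mean is about +0.15, because the one large study dominates the weighted average.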
◦ Results from large studies should be preferred, all else being equal
◦ Such results tend to be modest. We should look for outcomes of to +0.30, at best
◦ Experimenter-made
◦ Assess outcomes emphasized in experimental but not control group
◦ Usually standardized tests
◦ May be experimenter-made if experimental and control groups received the same content
◦ What Works Clearinghouse includes
◦ Best Evidence Encyclopedia excludes
◦ Legitimate need to measure and report outcomes emphasized in experimental group
◦ But, potential bias introduced if inherent measures averaged with independent measures
◦ How much bias?
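The averaging problem raised above can be made concrete with a small sketch. The outcome values are hypothetical and are not taken from any study in the tables; the function name and the "inherent" / "independent" labels are illustrative.

```python
from collections import defaultdict


def mean_effect_by_measure_type(outcomes):
    """Average effect sizes separately for each measure type.

    `outcomes` is a list of (measure_type, effect_size) pairs.
    """
    grouped = defaultdict(list)
    for measure_type, effect_size in outcomes:
        grouped[measure_type].append(effect_size)
    return {kind: sum(values) / len(values) for kind, values in grouped.items()}


# Hypothetical outcomes from a single study
hypothetical_outcomes = [
    ("inherent", 0.60),
    ("inherent", 0.50),
    ("independent", 0.10),
    ("independent", 0.20),
]

by_type = mean_effect_by_measure_type(hypothetical_outcomes)
pooled = sum(es for _, es in hypothetical_outcomes) / len(hypothetical_outcomes)
print(by_type, f"pooled: {pooled:+.2f}")
```

With these made-up values the pooled mean (+0.35) sits well above the mean for independent measures alone (+0.15), which is why reporting the two types separately matters.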
Table 1. Comparison of Effect Sizes for Mathematics Studies with Treatment-Inherent and Treatment-Independent Measures: What Works Clearinghouse
Columns: Study; Program; Measures; Effect Sizes (Treatment-Inherent, Treatment-Independent)
◦ Carroll (1998); Everyday Mathematics; Researcher-developed geometry test
◦ Ridgeway et al. (2002); Connected Mathematics; ITBS, Balanced assessment test
◦ Williams (1986); Saxon Math; End-of-course test; +0.65
Table 1 (continued)
◦ Peters (1992); UCSMP; Orleans-Hanna, Understanding of algebraic components
◦ Hedges et al. (1986); Transition Mathematics (UCSMP); Orleans-Hanna, HSST: General math, Geometry readiness; +0.29
Table 1 (continued)
◦ Thompson et al. (2005); Transition Mathematics (UCSMP); HSST: General math, Algebra readiness, Geometry readiness, Problem solving and understanding
◦ Thompson et al. (2005); UCSMP Algebra; HSST: Algebra, Algebra readiness, Problem solving and understanding
◦ Mean
Table 3. Comparison of Effect Sizes for Beginning Reading Studies with Treatment-Inherent and Treatment-Independent Measures: What Works Clearinghouse
Columns: Study; Program; Measures; Effect Sizes (Treatment-Inherent, Treatment-Independent)
◦ Ross et al. (2004); Accelerated Reader; STAR Reading, STAR Early Literacy
◦ Barker & Torgerson (1995) (means of two comparisons); Daisy Quest; Phonological awareness (5 measures), Phonics (4 measures)
◦ Foster et al. (1995) (means of two comparisons); Daisy Quest; Phonological awareness (4 measures)
◦ Mitchell & Fox (2001); Daisy Quest; Phonological awareness (4 measures, compared to untreated), Phonological awareness (4 measures, compared to teacher instruction); -0.46
Table 3 (continued)
◦ Taylor et al. (1991); Early Intervention in Reading; Gates-MacGinitie, Segmentation & blending, Vowel sounds
◦ Mathes & Babyak (2001); PALS; Oral reading fluency, Phonological awareness
◦ Mathes et al. (1998); PALS; Oral reading fluency
◦ Mathes et al. (2003) (mean of two comparisons); PALS; Woodcock Word ID, Woodcock Passage Comp, Oral reading fluency; +0.13
Table 3 (continued)
◦ Hancock (2002); Read Naturally; Peabody Picture Vocabulary Test, Oral reading fluency, Word use fluency, CBM: Cloze
◦ Mesa (2004); Read Naturally; Oral reading fluency
◦ Mean
◦ Treatment-inherent measures must be excluded from reviews, or at least reported separately
◦ Clear distinction between inherent and independent measures can be made
◦ Random assignment cannot be the only criterion of evaluation excellence
◦ Effect sizes from large, extended studies of school and classroom interventions with independent measures are modest (+0.20 to at best). These are the effects we should be looking for.