Published by Roy McCarthy. Modified over 9 years ago.
1
© 2012, Richard P PHELPS, International Test Commission, 8th Conference, Amsterdam, July, 2012
The effect of testing on student achievement: 1910-2010
Richard P. PHELPS
2
Meta-analysis
A method for summarizing a large research literature with a single, comparable measure.
3
The effect of testing on student achievement
12-year-long study
analyzed close to 700 separate studies, and more than 1,600 separate effects
2,000 other studies were reviewed and found incomplete or inappropriate
lacking sufficient time and money, hundreds of other studies will not be reviewed
4
Looking for studies to include in the meta-analyses
1. Included only those studies that found an effect from testing on student achievement or on teacher instruction…
5
Studies included in the meta-analyses
2. …when:
– a test is newly introduced, or newly removed
– quantity of testing is increased or reduced
– test stakes are introduced or increased, or removed or reduced
6
Studies included in the meta-analyses
3. …plus previous research summaries (e.g.)
Kulik, Kulik, Bangert-Drowns, & Schwalb (1983-1991) on:
– mastery testing,
– frequency of testing, and
– programs for high-risk university students
Basol & Johanson (2009) on testing frequency
Jaekyung Lee (2007) on cross-state studies
W.J. Haynie (2007) in career-tech ed
7
Number of studies and effects, by methodology type

Methodology type                                  Number of studies   Number of effects
Quantitative                                      177                 640
Surveys and public opinion polls (US & Canada)    247                 813
Qualitative                                       245                 245
TOTAL                                             669                 1,698
8
Effect size: Cohen’s d
$$d = \frac{Y_E - Y_C}{S_{pooled}}$$
$Y_E$ = mean, experimental group
$Y_C$ = mean, control group
$S_{pooled}$ = pooled standard deviation
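As a minimal sketch of the formula on this slide, the snippet below computes Cohen's d from two groups of raw scores. It assumes the pooled standard deviation is the usual degrees-of-freedom-weighted combination of the two sample variances, which the slide does not spell out. The function name `cohens_d` and the sample scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(experimental, control):
    """Cohen's d: difference in group means divided by the pooled SD.

    Assumption (not stated on the slide): the pooled variance weights each
    group's sample variance by its degrees of freedom (n - 1).
    """
    n_e, n_c = len(experimental), len(control)
    pooled_variance = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_variance)


# Hypothetical scores, for illustration only.
experimental_scores = [78, 85, 90, 72, 88]
control_scores = [70, 75, 80, 68, 82]
print(round(cohens_d(experimental_scores, control_scores), 2))
```

A positive value means the experimental group scored higher on average, measured in units of the pooled standard deviation.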
9
Effect size: Other formulae
$$d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$
$$d = \frac{2r}{\sqrt{1 - r^2}}$$
$$d = \frac{(Y_E^{post} - Y_E^{pre}) - (Y_C^{post} - Y_C^{pre})}{S_{pooled}^{post}}$$
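The three conversions on this slide can be sketched as small helper functions. This is an illustration, not the presenter's code: the function names are hypothetical, and the pre-post version assumes the usual convention of gain = post minus pre, so that a larger gain in the experimental group gives a positive d.

```python
from math import sqrt


def d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to d."""
    return t * sqrt((n1 + n2) / (n1 * n2))


def d_from_r(r):
    """Convert a correlation coefficient r (|r| < 1) to d."""
    return 2 * r / sqrt(1 - r ** 2)


def d_from_gains(pre_e, post_e, pre_c, post_c, sd_pooled_post):
    """Difference in mean gains (post minus pre), experimental minus control,
    divided by the pooled posttest standard deviation."""
    return ((post_e - pre_e) - (post_c - pre_c)) / sd_pooled_post


# Hypothetical inputs, for illustration only.
print(d_from_t(2.0, 50, 50))
print(d_from_r(0.6))
print(d_from_gains(50, 60, 50, 54, 12))
```

Converting every study's reported statistic to d in this way is what lets studies with different designs be summarized on one scale.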
10
Effect size: Interpretation
d between 0.25 and 0.50: weak effect
d between 0.50 and 0.75: medium effect
d more than 0.75: strong effect
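The bands on this slide can be written as a small lookup. The slide leaves two things open, so the sketch makes assumptions and labels them: a value exactly at 0.50 is treated as medium, a value exactly at 0.75 is treated as medium (the slide says "more than 0.75" for strong), and anything under 0.25 gets a hypothetical label because the slide gives none. The function name is also hypothetical.

```python
def interpret_effect_size(d):
    """Map an effect size to the slide's verbal labels.

    Assumptions: the magnitude |d| is classified; 0.50 and 0.75 fall in the
    medium band; values under 0.25 have no label on the slide, so a
    placeholder string is returned.
    """
    magnitude = abs(d)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.25:
        return "weak"
    return "below the weak band"


for value in (0.30, 0.55, 0.88):
    print(value, interpret_effect_size(value))
```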
11
Quantitative studies
(population coverage ≈ 7 million persons)
12
Quantitative studies: Effect size
“Bare bones” calculation: d ≈ +0.55 …a medium effect
Bare bones effect size adjusted for measurement error: d ≈ +0.71 …a stronger effect
Using same-study-author aggregation: d ≈ +0.88 …a strong effect
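The slide does not show how the "bare bones" figure or its adjustment was computed. As one common reading (an assumption here, following the Hunter and Schmidt usage of the term), a bare-bones estimate is a sample-size-weighted mean of the study effect sizes, and a measurement-error adjustment divides an effect size by the square root of the outcome measure's reliability. The sketch below illustrates both steps. The function names, study values, and reliability are hypothetical and are not taken from the presentation.

```python
from math import sqrt


def bare_bones_mean(effects):
    """Sample-size-weighted mean effect size.

    `effects` is a list of (d, n) pairs: effect size and study sample size.
    """
    total_n = sum(n for _, n in effects)
    return sum(d * n for d, n in effects) / total_n


def correct_for_unreliability(d, reliability):
    """Disattenuate d for measurement error in the outcome measure."""
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return d / sqrt(reliability)


# Hypothetical studies: (effect size, sample size).
studies = [(0.40, 120), (0.70, 60), (0.55, 200)]
mean_d = bare_bones_mean(studies)
print(round(mean_d, 3))
print(round(correct_for_unreliability(mean_d, 0.80), 3))  # 0.80 is assumed
```

Because reliability is at most 1, the adjusted value is never smaller in magnitude than the unadjusted one, which matches the direction of the change reported on the slide.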
13
Which predictors matter?

Treatment group …                                             Mean effect size
… is made aware of performance, and control group is not     +0.98
… receives targeted instruction (e.g., remediation)          +0.96
… is tested with higher stakes than control group            +0.87
… is tested more frequently than control group               +0.85
14
More Moderators – Source of Test

Source of test           Number of studies   Mean effect size
Researcher or Teacher    87                  0.93
National                 24                  0.87
Commercial               38                  0.82
State or District        11                  0.72
Total                    160
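Since each of the 160 studies falls into exactly one source category, a study-count-weighted mean of the category means gives an overall mean effect size. The sketch below works through that arithmetic using the figures from this slide. It is offered as a consistency check only; the presentation does not say that its overall figures were derived this way, and the variable names are hypothetical.

```python
# (number of studies, mean effect size) per source of test, from the slide.
source_of_test = {
    "Researcher or Teacher": (87, 0.93),
    "National": (24, 0.87),
    "Commercial": (38, 0.82),
    "State or District": (11, 0.72),
}

total_studies = sum(count for count, _ in source_of_test.values())
overall_mean = (
    sum(count * d for count, d in source_of_test.values()) / total_studies
)

print(total_studies)
print(round(overall_mean, 2))
```

The weighted mean comes out close to the d ≈ +0.88 reported for the same-study-author aggregation of the quantitative studies.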
15
More Moderators – Sponsor of Test

Sponsor of test   Number of studies   Mean effect size
International     5                   1.02
Local             99                  0.93
National          45                  0.81
State             11                  0.64
Total             160
16
More Moderators – Study Design

Study design                   Number of studies   Mean effect size
Pre-post                       12                  0.97
Experiment, Quasi-experiment   107                 0.94
Multivariate                   26                  0.80
Experiment, posttest only      7                   0.60
Pre-post (with shadow test)    8                   0.58
Total                          160
17
More Moderators – Scale of Analysis

Scale of analysis   Number of studies   Mean effect size
Aggregated          9                   1.60
Small-scale         118                 0.91
Large-scale         33                  0.57
Total               160
18
More Moderators – Scale of Administration

Scale of administration   Number of studies   Mean effect size
Classroom                 115                 0.95
Mid-scale                 6                   0.72
Large-scale               39                  0.71
Total                     160
19
Surveys and opinion polls
20
Percentage of survey items, by respondent group and type of survey
21
Number and percent of survey items, by test stakes and target group

Test stakes   Number   %
High          507      62
Medium        184      23
Low           33       4
Unknown       89       11
TOTAL         813

Target group   Number   %
Students       393      46
Schools        281      33
Teachers       116      14
No stakes      64       7
TOTAL          854
22
Opinion polls, by year
244 polls between 1958 and 2008, in the U.S. & Canada
813 unique question-response combinations
close to 700,000 individual respondents
23
Surveys and opinion polls: Regular standardized tests, performance tests
Respondent opinion, d: regular tests (N ≈ 125) / performance tests (N ≈ 50)
Achievement is increased: 1.2 / 1.0
…weighted by size of study population: 1.9 / 0.5
Instruction is improved: 1.0 / 1.4
…weighted by size of study population: 0.9
Tests help align instruction: 1.0
…weighted by size of study population: 0.5 / 0.9
24
Qualitative studies: Summary
(One cannot calculate an effect size.)
25
Qualitative studies, by methodology type

Methodology                                       Number of studies   %
Case study                                        120                 43
Experiment or pre-post study                      21                  7
Interviews (individual or group)                  75                  27
Journal                                           2                   1
Review of official records, documents, reports    33                  12
Research review                                   8                   3
Survey                                            22                  8
TOTAL                                             281                 100
26
Qualitative studies: Effect on student achievement

Direction of effect   Number of studies   Percent of studies   Percent without the inferred
Positive              204                 84                   93
Positive inferred     24                  10
Mixed                 5                   2                    2
No change             8                   3                    4
Negative              3                   1                    1
TOTAL                 244                 100

244 studies conducted in the past century in over 30 countries
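The two percentage columns differ only in their denominator: the first uses all 244 studies, the second drops the 24 "positive inferred" studies and uses the remaining 220. A short sketch of that arithmetic, using the counts from this slide (the function and variable names are hypothetical):

```python
# Study counts by direction of effect, from the slide.
counts = {
    "Positive": 204,
    "Positive inferred": 24,
    "Mixed": 5,
    "No change": 8,
    "Negative": 3,
}


def percent_table(counts, exclude=()):
    """Whole-number percentages of the total, optionally dropping categories."""
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    return {name: round(100 * n / total) for name, n in kept.items()}


all_studies = percent_table(counts)
without_inferred = percent_table(counts, exclude=("Positive inferred",))
print(all_studies)
print(without_inferred)
```

Dropping the inferred cases raises the share of clearly positive studies because the same 204 studies are divided by a smaller total.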
27
Qualitative studies: Testing improves student achievement and teacher instruction

Achievement is improved   Number of studies   %
Yes                       200                 95
Mixed results             1                   <1
No                        10                  5
TOTAL                     211                 100

Instruction is improved   Number of studies   %
Yes                       158                 96
No                        7                   4
TOTAL                     165                 100
28
Qualitative studies: Variation by rigor and test stakes

                      Level of rigor
Direction of effect   high   medium   low   Total
Positive              95     67       42    204
Positive inferred     10     8        6     24
Mixed                 3      1        1     5
No change             4      3        1     8
Negative              1      1        1     3
TOTAL                 113    80       51    244

                      Stakes
Direction of effect   high   medium   low   unknown   Total
Positive              133    27       38    6         204
Positive inferred     12     5        7     –         24
Mixed                 4      –        1     –         5
No change             2      1        5     –         8
Negative              3      –        –     –         3
TOTAL                 154    33       51    6         244
29
Qualitative studies: Regular standardized tests and performance tests
Study results, %: regular tests (N = 176) / performance tests (N = 69)
Generally positive: 93 / 95
High-stakes tests: 71 / 42
High level of study rigor: 46 / 48
Student attitudes toward test positive: 60 / 71
Teacher attitudes toward test positive: 55 / 80
Student achievement improved: 95
Instruction improved: 92 / 100
Large-scale testing: 86 / 68
30
An enormous research literature
But assertions that it does not exist at all are common
– Some claims are made by those who oppose standardized testing, and may be wishful thinking
– Others are “firstness” claims
31
Dismissive research reviews
With a dismissive research literature review, a researcher assures all that no other researcher has studied the same topic
32
Firstness claims
With a firstness claim, a researcher insists that he or she is the first ever to study a topic
33
Social costs are enormous
Research conducted by those without power or celebrity is dismissed: ignored and lost
Public policies are skewed, based exclusively on the research results of those with power or celebrity
Society pays again and again for research that has already been done
34
The effect of testing on student achievement: 1910-2010
Richard P. PHELPS