
Evaluation in Education: 'new' approaches, different perspectives, design challenges. Camilla Nevill, Head of Evaluation, Education Endowment Foundation.


1 Evaluation in Education: 'new' approaches, different perspectives, design challenges. Camilla Nevill, Head of Evaluation, Education Endowment Foundation. 24th January. @EducEndowFoundn

2 Introduction The EEF is an independent charity dedicated to breaking the link between family income and educational achievement. The Education Endowment Foundation was set up in 2011 by the Sutton Trust, as lead charity, in partnership with the Impetus Trust. The EEF is funded by a Department for Education grant of £125m and will spend over £220m over its fifteen-year lifespan. In 2013, the EEF was designated, together with the Sutton Trust, as the government's 'What Works' centre for improving education outcomes for school-aged children.

3 The EEF Two aims: 1. Break the link between family income and school attainment. 2. Build the evidence base on the most promising ways of closing the attainment gap.

4 The Teaching and Learning Toolkit
A meta-analysis of education research containing c.10,000 studies. Cost, impact and security are included to aid comparison.
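For a flavour of the kind of pooling that sits behind an entry like this, here is a minimal fixed-effect (inverse-variance) meta-analysis sketch in Python; the study effect sizes and standard errors are invented for illustration, and this is not the Toolkit's actual methodology.

```python
import math

# Illustrative only: invented (effect size, standard error) pairs for a handful of studies.
studies = [(0.25, 0.10), (0.10, 0.08), (0.40, 0.15), (0.18, 0.06)]

# Fixed-effect (inverse-variance) pooling: weight each study by 1 / SE^2.
weights = [1 / se**2 for _, se in studies]
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect size: {pooled:.2f} "
      f"(95% CI {pooled - 1.96*pooled_se:.2f} to {pooled + 1.96*pooled_se:.2f})")
```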

5 133 projects funded to date
EEF, March 2016: 133 projects funded to date; £82m funding awarded to date; 66 published reports; 100 RCTs; 26 independent evaluation teams; 7,500 schools currently participating in projects; 750,000 pupils currently involved in EEF projects; £220m estimated spend over the lifetime of the EEF.

6 New approach, different perspectives, design challenges
Design with the end user in mind. There is no one right answer – communicate and compromise.

7 New approach: evaluate projects through rigorous, independent evaluations
Longitudinal outcomes; robust counterfactual (RCTs); impact and process evaluations.

8

9 Education v other fields
How does this compare to evaluation in your field?

10 Trials: Education v public health
Education: some independent evaluation. Public health / development: usually not independently funded?
Education: mostly cluster and multi-site trials, with clearly defined clusters. Public health: mostly cluster and multi-site trials, with less clearly defined clusters?
Education: high ICC. Public health: low ICC?
Education: obtaining consent can be easy. Public health: obtaining consent can be complex and difficult.
Education: follow-up is in theory easy, as children must attend school. Public health: follow-up can be harder.
Education: administrative data (NPD). Public health: depends on the outcome.
Education: unfamiliarity with the method. Public health: more familiar, and the method commands more respect in medicine.
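The ICC contrast matters for design: clustering inflates the variance by the design effect 1 + (m − 1) × ICC, so the same number of pupils buys less information in a high-ICC setting. A minimal sketch with illustrative numbers (none of them taken from the presentation):

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_n(total_pupils: int, cluster_size: int, icc: float) -> float:
    """Effective sample size after accounting for clustering."""
    return total_pupils / design_effect(cluster_size, icc)

# Illustrative values only: 1,200 pupils in clusters of 30,
# under an education-style ICC versus a lower public-health-style ICC.
for label, icc in [("education-style ICC 0.15", 0.15), ("public-health-style ICC 0.02", 0.02)]:
    print(label, "-> effective n of", round(effective_n(1200, cluster_size=30, icc=icc)))
```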

11 Main messages Design with the end user in mind
There is no right answer – communicate and compromise

12 Process for appointing evaluators
Grants team identify projects; 1st Grants Committee shortlist. Evaluation teams receive 1-page project descriptions and submit a 2-page expression of interest (EoI). Teams chosen to submit a proposal; teams submit an 8-page proposal; 2nd Grants Committee shortlist; teams chosen to evaluate projects. First set-up meeting with the evaluation team, project team and EEF: share understanding of the intervention logic; decide the overall design, timeline, sample size and control group condition; developer (and evaluator) budgets set. Second set-up meeting with the evaluation team, project team and EEF: finalise the evaluation design; decide on eligibility criteria, details of the protocol, and process evaluation measures linked to the logic model.

13 Different perspectives
Diagram: the EEF, the evaluator and the developer, brought together at the set-up meeting.

14 Different perspectives
EEF: useful results; quick results; keep costs down. Evaluator: publications; funding to do research; personal interests. Developer: funding to deliver the programme; demonstrate impact; good relationships with schools; publications? All three meet at the set-up meeting.

15 Design challenges Improving Working Memory
Teaching memory strategies by playing computer games. For 5-year-olds struggling at maths. Delivered by Teaching Assistants. Developed by Oxford University educational psychologists. Evidence of improvement in WM from two small (30 and 150 children) controlled studies.

16 Design challenges: how many arms?
Possible arms: Working Memory (WM); WM blended with maths; matched time maths support; business as usual (BAU).

17 Design challenges When would you randomise?
School recruited → identify TAs and link teacher → identify pupils (bottom 1/3) → one-day training for TAs (Oxford University) → deliver programme (10 hours: computer games for 5 hours; 1:1 support for … mins, 5 hours in total) → improved working memory → maths attainment.

18 Design challenges
The same flow, with randomisation and measurement points added: school recruited → identify TAs and link teacher → identify pupils (bottom 1/3) → randomisation → one-day training for TAs (Oxford University) → deliver programme (10 hours: computer games for 5 hours; 1:1 support for … mins, 5 hours in total) [delivery log; survey, observations, interviews] → improved working memory [WM test] → maths attainment [maths test].
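One plausible way to implement the randomisation step in a pupil-randomised design like this is to allocate eligible pupils within each school, so every school contributes to both arms. A minimal sketch with a hypothetical roster; this is not the trial's actual allocation procedure:

```python
import random

# Hypothetical roster: school -> eligible pupils (bottom third on the maths screen).
eligible = {
    "School A": ["p01", "p02", "p03", "p04", "p05", "p06"],
    "School B": ["p07", "p08", "p09", "p10"],
}

def randomise_within_school(roster, seed=2016):
    """Shuffle each school's eligible pupils and split them half/half
    between intervention and business-as-usual control."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible and auditable
    allocation = {}
    for school, pupils in roster.items():
        pupils = pupils[:]
        rng.shuffle(pupils)
        half = len(pupils) // 2
        allocation[school] = {"intervention": pupils[:half], "control": pupils[half:]}
    return allocation

print(randomise_within_school(eligible))
```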

19 Design challenges: Catch Up Numeracy
For 4- to 11-year-olds struggling at maths. Delivered by Teaching Assistants. 10 modules of tailored support. Flexible delivery model (no fixed length). Evidence from an EEF pupil-randomised efficacy trial: Catch Up v BAU control – 108 pupils, effect size 0.21 (0.01, 0.42), estimated +3 months' progress; matched time support v BAU control – 102 pupils, effect size 0.27 (0.06, 0.49), estimated +4 months' progress.
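The effect sizes quoted here are standardised mean differences with 95% confidence intervals. As a reminder of how such a figure is built from summary statistics, here is a minimal Cohen's d sketch with a large-sample confidence interval; the inputs are invented, not the Catch Up trial's data:

```python
import math

def cohens_d_with_ci(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Cohen's d) with an approximate 95% CI."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd
    # Large-sample approximation to the standard error of d.
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c)))
    return d, d - 1.96 * se, d + 1.96 * se

# Invented summary statistics for illustration only.
print(cohens_d_with_ci(mean_t=102.3, sd_t=14.8, n_t=54, mean_c=99.1, sd_c=15.2, n_c=54))
```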

20 Design challenges What control group would you use?

21 Design challenges: Catch Up Numeracy
150 schools recruited. Identify TAs and ~8 children in years 3–5 who are behind in maths. Randomise: 75 schools (600 children) to a business-as-usual control group; 75 schools (600 children) to the flexible Catch Up delivery model. Follow-up maths test.
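A design like this stands or falls on its minimum detectable effect size (MDES). A rough two-arm cluster-randomised approximation, ignoring gains from covariates such as a pre-test and assuming an ICC of 0.15 (illustrative only, not the trial's actual power calculation):

```python
import math

def mdes_cluster(n_per_arm_pupils, pupils_per_school, icc, alpha_z=1.96, power_z=0.84):
    """Approximate MDES for a two-arm cluster-randomised trial:
    (z_alpha/2 + z_power) * sqrt(design_effect * (1/n1 + 1/n2)),
    with no adjustment for covariates."""
    deff = 1 + (pupils_per_school - 1) * icc
    se = math.sqrt(deff * (2 / n_per_arm_pupils))
    return (alpha_z + power_z) * se

# 75 schools x ~8 pupils per arm (600 pupils), assumed ICC of 0.15.
print(round(mdes_cluster(n_per_arm_pupils=600, pupils_per_school=8, icc=0.15), 2))
```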

22 Problems with interpretation
What if we see no effect of Catch Up and the control group has received a lot more support? What if we see a big effect of Catch Up and the control group has received a lot less support?

23 A radical idea: Pre-specify interpretation!
A 3 × 3 grid, empty at this point: rows for whether the active control took more time than, the same time as, or less time than Catch Up; columns for a positive effect, no effect, or a negative effect.

24 A radical idea: Pre-specify interpretation!
The same 3 × 3 grid (control longer than Catch Up / matched time / control shorter than Catch Up, by positive / no / negative effect), now with one cell marked with an x.

25 A radical idea: Pre-specify interpretation!
Control longer than Catch Up. Positive effect: Catch Up is more effective, even with more active control time → do Catch Up (continuing active control without appropriate stopping may have a harmful effect). No effect: both did or both did not work, probably did given existing evidence? → do Catch Up, because it gives the same effect with less time. Negative effect: Catch Up is less effective than providing longer active control → assess the cost of each and do active control if it is not much more expensive.
Matched time. Positive effect: Catch Up is more effective than active control. No effect: both did or both did not work, probably did given existing evidence? → do Catch Up or active control. Negative effect: Catch Up is less effective than active control → do active control.
Control shorter than Catch Up. Positive effect: Catch Up is more effective than less active control time → do Catch Up, because the structure is needed to stop TAs stopping too early. No effect: → do active control, as it gives the same effect with less time. Negative effect: Catch Up is less effective than providing less active control.
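Pre-specifying the interpretation amounts to writing the decision rule down before the results arrive. One way to make that concrete is a simple lookup keyed on what could be observed; the sketch below encodes the grid above, with the recommendations paraphrased and the two cells that give no explicit recommendation filled in with the obvious reading (an illustration, not an EEF artefact):

```python
# Keys: (how control time compares to Catch Up, direction of the estimated effect).
# Values: the pre-specified recommendation, fixed before the results are known.
DECISION_TABLE = {
    ("control longer",  "positive"): "Do Catch Up",
    ("control longer",  "none"):     "Do Catch Up (same effect in less time)",
    ("control longer",  "negative"): "Cost both; do active control if not much more expensive",
    ("matched time",    "positive"): "Do Catch Up",
    ("matched time",    "none"):     "Do Catch Up or active control",
    ("matched time",    "negative"): "Do active control",
    ("control shorter", "positive"): "Do Catch Up",
    ("control shorter", "none"):     "Do active control (same effect in less time)",
    ("control shorter", "negative"): "Do active control",
}

def pre_specified_decision(control_time: str, effect: str) -> str:
    """Return the recommendation agreed in advance for an observed result."""
    return DECISION_TABLE[(control_time, effect)]

print(pre_specified_decision("matched time", "none"))
```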

26 Design challenges Boarding school Teenage Sleep
Boarding school: children in need, at risk of going into care, referred by Local Authorities. Teenage Sleep: changing school start times to later; positive effects from US trials (8am start v 11am start).

27 Main messages (and sub-messages)
Design with the end user in mind: test the right intervention; make sure your comparison is relevant; measure implementation and cost. There is no right answer – communicate and compromise: use the logic model to understand the intervention; pre-specify the interpretation to aid decision making; not all interventions can be randomised.

28 Thank you
camilla.nevill@eefoundation.org.uk www @EducEndowFoundn

29 Measuring the security of trials
Summary of the security of evaluation findings. 'Padlocks' developed in consultation with evaluators. Five categories – combined to create an overall rating.
Example: literacy intervention – 550 pupils, effect size 0.10 (0.03, 0.18), estimated +2 months' progress.
Rating criteria by category (design; power, as MDES; attrition; balance; threats to validity):
5 – fair and clear experimental design (RCT); MDES < 0.2; attrition < 10%; well-balanced on observables; no threats to validity.
4 – fair and clear experimental design (RCT, RDD); MDES < 0.3; attrition < 20%.
3 – well-matched comparison (quasi-experiment); MDES < 0.4; attrition < 30%.
2 – matched comparison (quasi-experiment); MDES < 0.5; attrition < 40%.
1 – comparison group with poor or no matching; MDES < 0.6; attrition < 50%.
0 – no comparator; MDES > 0.6; attrition > 50%; imbalanced on observables; significant threats to validity.
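The slide lists the category thresholds but not the exact rule for combining them into one rating. A minimal sketch under the assumption that the weakest category caps the overall score, covering only the two numeric categories (MDES and attrition); this is not EEF's published algorithm:

```python
def rating_from_mdes(mdes: float) -> int:
    """Map MDES at randomisation to a 0-5 category score (thresholds from the slide)."""
    for score, limit in [(5, 0.2), (4, 0.3), (3, 0.4), (2, 0.5), (1, 0.6)]:
        if mdes < limit:
            return score
    return 0

def rating_from_attrition(attrition: float) -> int:
    """Map attrition (as a proportion) to a 0-5 category score (thresholds from the slide)."""
    for score, limit in [(5, 0.10), (4, 0.20), (3, 0.30), (2, 0.40), (1, 0.50)]:
        if attrition < limit:
            return score
    return 0

def overall_padlocks(mdes: float, attrition: float) -> int:
    # Assumption: the weakest category caps the overall rating.
    return min(rating_from_mdes(mdes), rating_from_attrition(attrition))

# MDES of 0.25 scores 4, attrition of 12% scores 4, so the overall rating is 4.
print(overall_padlocks(mdes=0.25, attrition=0.12))
```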




