Evaluation of Complex Whole School Interventions
Jake Anders, Rebecca Allen, Chris Brown, Melanie Ehren, Toby Greany, Jessica Heal, Bibi Groot, Rebecca Nelson, Michael Sanders UCL Institute of Education / Behavioural Insights Team / Education Datalab
Evaluation of Complex Whole School Interventions (CWSIs)
Design of evaluations of CWSIs:
- RCTs and consideration of alternatives
- Use of multi-stage evaluation protocols
- Designs of RCTs for evaluation of CWSIs
- Use of Quasi-Experimental Designs (QEDs) for evaluation of CWSIs
- Considerations for IPE (implementation and process evaluation)
- Alternative approaches to collection of attainment data
What do we mean by a CWSI? Complex Whole School Intervention
Not the same as just being "complicated": significant factors working in combination, at different levels, with variations dependent on context.
Complexity is characterised by three additional, interlinked features (Rogers, 2008):
- Recursive causality (reinforcing loops)
- Disproportionate relationships (tipping points)
- Emergent outcomes
This review was guided by the considerations needed to ensure such features are captured in evaluation work.
This workshop

Multi-stage evaluation protocols
- Particularly addresses the issue of emergent outcomes
- Revision of the evaluation protocol to add testable hypotheses identified through the IPE during the period of implementation, but before outcomes data are available
- Discussion of this in practice

Quasi-experimental design (QED) impact evaluation
- When these might be needed, because an RCT is not feasible or desirable
- Important considerations when conducting a QED
- Potential of matched difference-in-differences
- Discussion of projects where these might be needed
Multi-Stage Evaluation Protocols for CWSIs
Multi-stage evaluation protocols: The problem
The use of pre-registered protocols for evaluations in education is, as in other fields, an increasingly recognised part of conducting rigorous research. Analysis not specified in the protocol is not strictly prohibited, but it is treated as exploratory only. In the case of CWSIs, however, this purist approach is unlikely to be practical: conducting a second, perhaps five-year-long study to confirm a finding from exploratory analysis would mean waiting perhaps 12 years from a study being initially commissioned to being able to talk confidently about its findings. Moreover, it will be difficult to predict in advance which sub-groups (if any), or which outcomes, are likely to be particularly influenced by the intervention.
Multi-stage evaluation protocols: A potential solution?
We suggest the use of multi-stage evaluation protocols to help alleviate this problem in the context of CWSIs while, importantly, preserving the principle of making analytical decisions before interacting with the data.
- These principles may be applied to both RCT evaluations (at either the efficacy or effectiveness stage) and QED evaluations.
- They use a mixed-methods approach in which insights from the IPE inform the impact evaluation more iteratively, addressing some limitations of "single-phase" mixed-methods approaches.
- They do not solve all the issues of design flexibility commonly raised by critics of these models of impact evaluation, but they attempt to go a significant way while preserving the features of the design that are vital to its credibility.
Multi-stage evaluation protocols: Process
Under this model, the following process would be adopted:
1. An evaluation protocol is written and published as usual, in advance of the evaluation launching. This protocol remains the main document relating to the evaluation, but states at its conclusion that a second-stage protocol will be published later, ideally giving a date.
2. At the end of the experimental period, the IPE will have been conducted; it should be analysed before the impact analysis is conducted, and before the quantitative data for that analysis are even available to the evaluators.
3. If designed and conducted well, the IPE should provide insight into the mechanisms and channels through which the intervention is working, in a way that was not possible before the evaluation.
4. The findings from the IPE should be used to form hypotheses testable using the quantitative data for the impact analysis. These should be written up and published as the second-stage protocol, which can also reflect any changes to the evaluation that occurred due to its duration and/or complexity.
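For instance, suppose the IPE suggests the intervention works mainly for pupils with low prior attainment. A minimal sketch, in Python, of how such a second-stage-protocol hypothesis might later be tested against the quantitative data; the scenario, data file, and all column names are hypothetical, not taken from the report:

```python
# Hypothetical sketch: testing a hypothesis pre-registered in a
# second-stage protocol. Assumes pupil-level data with columns 'score'
# (outcome), 'treated' (0/1 allocation), 'baseline' (prior attainment),
# 'low_prior' (0/1 subgroup flag suggested by the IPE), and 'school_id'.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("trial_outcomes.csv")  # hypothetical outcomes file

# The second-stage hypothesis is an interaction: does the treatment
# effect differ for the subgroup the IPE flagged? Standard errors are
# clustered by school, the unit of allocation.
model = smf.ols("score ~ treated * low_prior + baseline", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["school_id"]}
)
print(model.summary())  # 'treated:low_prior' carries the subgroup test
```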
DISCUSSION
How would this process apply to your context?
- For which projects could you imagine this being suitable?
- What do you see as the benefits and drawbacks of this approach?
- How might you overcome hurdles to this approach?
Using Quasi-Experimental Design Impact Evaluation for CWSIs
QED Impact Evaluation

We argue that the range of situations where an RCT is technically impossible is relatively small, as the technique is robust to a multitude of challenges. Nevertheless, when evaluating CWSIs there will be situations where an RCT is not desirable (it is not the best way to answer the question at hand given budgetary constraints) or simply not feasible. In such cases, a well-designed quasi-experimental design (QED) may be able to provide an impact estimate where one would otherwise simply not exist (Petticrew et al., 2005), or a more credible impact estimate than an RCT with implementation issues. In general, the more complex the environment, the more complex the intervention, and the longer the period over which the outcome is to be measured, the more challenging an RCT will be to run.
QED Impact Evaluation: Considering the feasibility of RCTs
An RCT is likely to be difficult where:
- The intervention would require dramatic organisational changes (e.g. differences in staff deployment) at short notice, depending on the outcome of the randomisation, making it difficult to recruit schools to be randomly allocated.
- The programme requires groups of schools to work together, meaning that:
  - the sample required for the RCT to be sufficiently powered makes it infeasibly expensive and difficult to manage (see the back-of-envelope sketch after this list);
  - it may not be feasible to expect schools formed into these groups before randomisation not to work together if allocated to the control group, preventing a true "business as usual" control condition.
- The intervention requires a lengthy period of implementation, over which it will be difficult to preserve the integrity of a control group.
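To make the power point concrete, here is a back-of-envelope sketch using the standard design-effect formula for cluster-randomised trials, deff = 1 + (m - 1) * ICC. All parameter values are illustrative assumptions, not figures from the review:

```python
# Illustrative arithmetic: why RCTs randomising whole schools (or groups
# of schools) quickly become expensive. All values below are assumptions.
import math
from scipy.stats import norm

mdes = 0.15              # minimum detectable effect size, in SD units
alpha, power = 0.05, 0.80
icc = 0.15               # intra-cluster correlation for school outcomes
pupils_per_school = 50   # pupils contributing outcomes per school

# Pupils per arm under simple random sampling (two-sample z approximation)
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
n_srs = 2 * (z / mdes) ** 2

# Inflate by the design effect for clustering: 1 + (m - 1) * ICC
deff = 1 + (pupils_per_school - 1) * icc
n_clustered = n_srs * deff
schools_per_arm = math.ceil(n_clustered / pupils_per_school)

print(f"Pupils per arm (no clustering): {n_srs:.0f}")
print(f"Design effect: {deff:.2f} -> pupils per arm: {n_clustered:.0f}")
print(f"Schools per arm: {schools_per_arm}")
```

On these illustrative numbers, the design effect pushes the requirement to over a hundred schools per arm; where pre-formed groups of schools are the unit of allocation, the effective cluster size, and hence the recruitment cost, grows further still.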
QED Impact Evaluation: General design considerations
EEF-funded evaluations (including those using QEDs) should remain focussed on pre-planned impact evaluations, not opportunistic application of these methods to already-collected observational data on interventions that have already been carried out.
It is useful to adopt some practices from RCT design when planning a QED impact evaluation:
- a full evaluation protocol and analysis plan, published in advance;
- a quasi-randomisation date, set for defining the intention-to-treat sample.
Robustness checking of the preferred approach is also important: we should not have much confidence in results that are highly dependent on the specific method selected, rather than being stable under small deviations from it (a sketch of such checking follows).
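A minimal sketch of the kind of robustness checking meant here: re-estimate the headline impact under several reasonable specifications and confirm the estimate does not swing with the choice. The data file, column names, and specifications are all hypothetical:

```python
# Hypothetical sketch: checking that a QED impact estimate is not an
# artefact of one specification choice. Assumes school-level data with
# columns 'score', 'treated', 'baseline', 'fsm_pct', and 'region'.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("qed_analysis_sample.csv")  # hypothetical ITT sample

specs = {
    "baseline only": "score ~ treated + baseline",
    "plus demographics": "score ~ treated + baseline + fsm_pct",
    "plus region effects": "score ~ treated + baseline + fsm_pct + C(region)",
}

for name, formula in specs.items():
    fit = smf.ols(formula, data=df).fit()
    est, se = fit.params["treated"], fit.bse["treated"]
    print(f"{name:>20}: effect = {est:.3f} (SE {se:.3f})")

# Broadly similar estimates across specifications support the headline
# result; large swings suggest it is driven by the method, not the data.
```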
QED Impact Evaluation: Potential for matched diff-in-diff
We are not suggesting this is the only possible QED; however, we do think it has significant applicability in the context of EEF evaluations using the administrative data available from the National Pupil Database.
It combines:
- matching approaches, to assemble a credible comparison group;
- difference-in-differences, to deal with certain kinds of unobserved differences.
We suggest that these are more robust in combination than either approach is likely to be in isolation (a sketch follows this list).
We do not advocate for or against one specific approach to matching, because this will depend on the specifics of the evaluation; however, it is important to be transparent about the approach taken.
Particularly important: assembling the matched comparison group before any access to outcomes.
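A minimal sketch of the matched difference-in-differences combination, assuming hypothetical school-level panel data in the style of an NPD extract (one row per school per year, one pre- and one post-intervention observation). All column names are illustrative, and nearest-neighbour matching is shown only as one option; as noted above, the choice of matching method is deliberately left open:

```python
# Hypothetical sketch of matched diff-in-diff. Assumes columns
# 'school_id', 'treated' (0/1), 'post' (0 = pre year, 1 = post year),
# 'score', and pre-intervention covariates. All names illustrative.
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("npd_school_panel.csv")  # hypothetical extract

# 1. Assemble the comparison group by matching on PRE-intervention
#    covariates only, before any access to outcome data.
covars = ["baseline_score", "fsm_pct", "school_size"]
pre = df[df["post"] == 0]
treated = pre[pre["treated"] == 1]
pool = pre[pre["treated"] == 0]

scaler = StandardScaler().fit(pre[covars])
nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(pool[covars]))
_, idx = nn.kneighbors(scaler.transform(treated[covars]))  # with replacement
matched_ids = pool.iloc[idx.ravel()]["school_id"]

keep = set(treated["school_id"]) | set(matched_ids)
sample = df[df["school_id"].isin(keep)]

# 2. Difference-in-differences on the matched sample: the coefficient on
#    'treated:post' is the impact estimate, netting out fixed differences
#    between the groups and shocks common to both over time.
did = smf.ols("score ~ treated * post", data=sample).fit(
    cov_type="cluster", cov_kwds={"groups": sample["school_id"]}
)
print(did.params["treated:post"], did.bse["treated:post"])
```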
DISCUSSION
Are there other settings you can think of where an RCT would be infeasible?
- For which projects could you imagine this being suitable?
- What do you see as the benefits and drawbacks of this approach?
- How might you overcome hurdles to this approach?
Evaluation of Complex Whole School Interventions: Methodological and Practical Considerations
Jake Anders, Chris Brown, Melanie Ehren, Toby Greany, Rebecca Nelson, Jessica Heal, Bibi Groot, Michael Sanders, Rebecca Allen (2017)
Report available from the EEF website under Evaluation Resources soon…