Narrowing the evaluation gap

Narrowing the evaluation gap
Session 2.B: Beyond Randomised Controlled Trials – How non-experimental methods can contribute to educational research
2018 EEF Evaluators' Conference
#EEFeval18 @EducEndowFound

Claire Crawford
University of Warwick and Institute for Fiscal Studies

What do we mean by non-experimental methods?
Methods that take advantage of "natural" experiments which produce close-to-random assignment of treatment (conditional on covariates), or that can account for some types of unobserved heterogeneity, e.g.:
- RDD (sharp or fuzzy, i.e. partnered with IV)
- DiD
- (panel data methods?)
(OLS/matching not typically included in this group)
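To make the DiD idea concrete, here is a minimal sketch (not from the slides) of a two-group, two-period difference-in-differences estimate; the data frame and its column names (outcome, treated, post) are illustrative assumptions.

```python
# Minimal difference-in-differences sketch: the DiD estimate is the coefficient
# on the interaction between treatment-group membership and the post-period.
# Column names are illustrative assumptions, not taken from the presentation.
import pandas as pd
import statsmodels.formula.api as smf

def did_estimate(df: pd.DataFrame) -> float:
    """Return the DiD estimate from a two-group, two-period dataset."""
    # treated: 1 if the unit belongs to the treatment group, 0 otherwise
    # post:    1 if the observation falls after the policy change, 0 otherwise
    model = smf.ols("outcome ~ treated * post", data=df).fit()
    # In practice one would cluster standard errors (e.g. at school level)
    # and check pre-policy trends before trusting this estimate.
    return model.params["treated:post"]
```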

How might non-experimental methods be used?
Initial piloting of policies (DiD), e.g. EMA, free school meals
Geographic rollout of policies (DiD), e.g. expansion of free nursery places
Administrative rules used for implementation of policies (RDD), e.g. date-of-birth cut-offs determining entitlement to start nursery/school; rules regulating class sizes in Israel
The key assumption required (vs. RCTs) is that these "natural" experiments mimic random assignment; credibility relies on this assumption
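The date-of-birth example can be illustrated with a sharp-RDD sketch: a local linear regression either side of the cutoff, with the jump at the threshold as the estimate. The column names (outcome, dob_days), the cutoff and the bandwidth are assumptions for illustration; a fuzzy design would instead use the cutoff as an instrument for actual take-up.

```python
# Hedged sharp-RDD sketch around a date-of-birth cutoff. The estimate is the
# discontinuity in the outcome at the threshold; column names, cutoff and
# bandwidth are illustrative assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def rdd_estimate(df: pd.DataFrame, cutoff: float, bandwidth: float) -> float:
    """Return the estimated jump in the outcome at the cutoff (sharp RDD)."""
    d = df.copy()
    d["running"] = d["dob_days"] - cutoff          # centre the running variable
    d["above"] = (d["running"] >= 0).astype(int)   # which side of the cutoff
    local = d[d["running"].abs() <= bandwidth]     # keep observations near the cutoff
    # Allow separate slopes on each side; the coefficient on `above` is the
    # estimated discontinuity at the threshold.
    model = smf.ols("outcome ~ above * running", data=local).fit()
    return model.params["above"]
```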

Richard Dorsett
University of Westminster

A role for non-experimental methods in EEF research
Where an RCT is impractical or impossible:
- perhaps logistical challenges
- programmes already underway or scaling up
Where some early/quick/rough evidence is needed
How much of a change is this?
- will involve a different approach to impact estimation
- may require different data collection
- may leave the process study unchanged (if prospective, but not if retrospective)
- not clear what happens to costs

RCT vs non-experimental estimators
RCTs
- Gold standard in the hierarchy of evidence
- Face validity: easy to understand, communicate and persuade with
Non-experimental estimators
- Cover a range of approaches, which vary in their underlying assumptions (some credible, others less so)
- Often lower face validity (are the data rich enough? is the instrument valid?), but no issue with randomisation bias (a possible external validity advantage)
In practice, a spectrum
- Practical issues can reduce the purity of an RCT design (drop-out, attrition, fidelity)
- Non-experimental estimators may form part of RCT analysis (e.g. CACE, imputation)
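CACE is one place where the two approaches already meet. As a hedged illustration, the simplest complier average causal effect estimate is the Wald ratio: the intention-to-treat effect scaled by the difference in take-up between arms. The column names (outcome, assigned, took_up) are assumptions for the sketch.

```python
# Hedged sketch of a CACE estimate via the Wald/IV ratio, one example of a
# non-experimental estimator used inside an RCT analysis. Column names are
# illustrative assumptions.
import pandas as pd

def cace_estimate(df: pd.DataFrame) -> float:
    """Scale the intention-to-treat effect by the compliance difference."""
    z1 = df[df["assigned"] == 1]                                # randomised to treatment
    z0 = df[df["assigned"] == 0]                                # randomised to control
    itt = z1["outcome"].mean() - z0["outcome"].mean()           # effect of assignment
    compliance = z1["took_up"].mean() - z0["took_up"].mean()    # first stage
    return itt / compliance
```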

Going forward - some considerations
Non-experimental estimators need careful consideration of underlying assumptions
- How to assess non-experimental estimators' security alongside RCT security?
- When is a careful non-experimental estimator better than a flawed RCT?
Take best-practice elements from RCTs
- Protocol/SAP: need justification for the choice of approach
- Power calculations: not traditionally a focus of non-experimental analysis
- Blinding: where feasible
- Greater role for sensitivity analysis?
Understand more about non-experimental estimators in this space
- Revisit completed RCTs using non-experimental estimators
- Understand the general suitability of commonly used data (NPD), e.g. stability of school trends over time; situations where selection on observed characteristics is plausible
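On the power-calculation point, a minimal sketch of a minimum detectable effect size calculation is shown below using statsmodels; the alpha, power and sample-size values are placeholders, and a clustered school-level design would need a design-effect adjustment that this sketch ignores.

```python
# Hedged sketch of a power calculation: the standardised minimum detectable
# effect size for a two-arm comparison. All numbers are placeholders, and
# clustering (common in education trials) is ignored here.
from statsmodels.stats.power import TTestIndPower

def minimum_detectable_effect(n_per_arm: int, alpha: float = 0.05,
                              power: float = 0.8) -> float:
    """Return the effect size detectable with the given sample size and power."""
    return TTestIndPower().solve_power(nobs1=n_per_arm, alpha=alpha,
                                       power=power, ratio=1.0)

# Example: effect size detectable with 100 pupils per arm (roughly 0.4 SD).
# print(minimum_detectable_effect(100))
```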

Table Discussions
1.- How can we assess the quality and credibility of non-experimental impact estimates?
- How credible are non-experimental estimates? Is there a hierarchy of evidence?
- How can we assess the validity and credibility of non-experimental estimates?
- How can we assess the stability of results within one study? What is the role of using multiple specifications and methods? How can we combine those results?
2.- How can we ensure that non-experimental estimates are comparable – to each other and to experimental estimates?
- Is it possible to specify a prospective protocol/SAP for non-experimental methods? To what extent will these overlap or differ by method?
- What level of detail should be expected in prospective documents? Should they include robustness checks and falsification tests?
- Should prospective documents and reporting follow any specific standards? If yes, which ones?
- Under what circumstances should analyses that deviate from those set out in prospective documents be considered? Are these circumstances the same as for RCTs or different?
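As one concrete example of the falsification tests raised above, a placebo difference-in-differences check re-estimates the model on pre-policy data with a fake treatment date; a sizeable placebo "effect" would cast doubt on the parallel-trends assumption. The column names (outcome, treated, year) and the year arguments are assumptions for illustration.

```python
# Hedged placebo/falsification sketch for a DiD design: estimate the model on
# pre-policy years only, with an invented post-period. A result close to zero
# supports the parallel-trends assumption. Column names are illustrative.
import pandas as pd
import statsmodels.formula.api as smf

def placebo_did(df: pd.DataFrame, policy_year: int, placebo_year: int) -> float:
    """DiD estimate on pre-policy data only, using a fake 'post' indicator."""
    pre = df[df["year"] < policy_year].copy()                   # drop genuinely treated years
    pre["fake_post"] = (pre["year"] >= placebo_year).astype(int)
    model = smf.ols("outcome ~ treated * fake_post", data=pre).fit()
    return model.params["treated:fake_post"]                    # should be near zero
```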