Session 4: Analysis and reporting Steve Higgins (Chair) Paul Connolly Stephen Gorard.


Analysis of Randomised Controlled Trials (RCTs)
Paul Connolly, Centre for Effective Education, Queen’s University Belfast
Conference of EEF Evaluators: Building Evidence in Education, Training Day, 11 July 2013

Main Analysis of Simple RCT
These slides provide an introductory overview of one approach to analysing RCTs
Assume we are dealing with a continuous outcome variable that is broadly normally distributed
Three variables:
– Pre-test score “score1” (centred so that mean = 0)
– Post-test score “score2”
– Group membership “intervention” (coded 0 = control group; 1 = intervention group)
Basic analysis via linear regression:
$\text{predicted score2} = b_0 \cdot \text{constant} + b_1 \cdot \text{intervention} + b_2 \cdot \text{score1}$

Main Analysis of Simple RCT
$\text{predicted score2} = b_0 \cdot \text{constant} + b_1 \cdot \text{intervention} + b_2 \cdot \text{score1}$
$b_0$ = adjusted mean post-test score for those in the control group
$b_0 + b_1$ = adjusted mean post-test score for those in the intervention group
Estimate standard deviations for post-test mean scores using the s.d. of predicted score2 for the control and intervention groups separately*
Significance of $b_1$ = significance of the difference between post-test mean scores for intervention and control groups
Effect size, Cohen’s $d = b_1 / \text{s.d.}(\text{predicted score2})$
95% confidence interval for effect size: $\dfrac{b_1 \pm 1.96 \cdot \text{SE}(b_1)}{\text{s.d.}(\text{predicted score2})}$
*Most statistical software packages provide the option of creating a new variable comprising the predicted scores of the model. This new variable is the one to use to estimate standard deviations for adjusted post-test scores.
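The analysis on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the presenter's own code: the helper names (`solve`, `ols`, `effect_size_ci`) and the toy data are invented, and pooling the two group s.d.s of the predicted scores is an assumption, since the slide does not say how they are combined.

```python
import math
from statistics import stdev


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Ordinary least squares: return (coefficients, standard errors)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coef, r)) for r in rows]
    sigma2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (len(y) - k)
    se = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        # solve(xtx, unit)[i] is the i-th diagonal element of (X'X)^-1
        se.append(math.sqrt(sigma2 * solve(xtx, unit)[i]))
    return coef, se


def effect_size_ci(b1, se_b1, sd, z=1.96):
    """Return (d, lower, upper): d = b1 / sd, bounds = (b1 -/+ z * se_b1) / sd."""
    return b1 / sd, (b1 - z * se_b1) / sd, (b1 + z * se_b1) / sd


# Invented toy data: (intervention, centred score1, score2).
data = [(0, -4, 46), (0, -1, 49), (0, 2, 53), (0, 3, 52),
        (1, -3, 52), (1, -2, 54), (1, 1, 55), (1, 4, 59)]
rows = [[1.0, g, s1] for g, s1, _ in data]  # constant, intervention, score1
score2 = [s2 for _, _, s2 in data]
(b0, b1, b2), (_, se_b1, _) = ols(rows, score2)

# s.d. of predicted score2 within each group, then pooled (assumption).
predicted = [b0 + b1 * g + b2 * s1 for g, s1, _ in data]
sd_control = stdev([p for p, row in zip(predicted, data) if row[0] == 0])
sd_interv = stdev([p for p, row in zip(predicted, data) if row[0] == 1])
sd_pooled = math.sqrt((sd_control ** 2 + sd_interv ** 2) / 2)  # equal group sizes

d, lower, upper = effect_size_ci(b1, se_b1, sd_pooled)
print(f"b1 = {b1:.2f}, d = {d:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

The denominator is passed in as `sd`, so a different choice (for example the pooled s.d. of the observed post-test scores, which many analysts use for Cohen's d) only changes that one argument.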

Exploratory Analysis of Mediating Effects for RCT
Take example of gender differences (variable “boy”, coded as: 0 = girls; 1 = boys)
Analysis via extension of basic linear regression model:
$\text{predicted score2} = b_0 \cdot \text{constant} + b_1 \cdot \text{intervention} + b_2 \cdot \text{score1} + b_3 \cdot \text{boy} + b_4 \cdot \text{boy} \cdot \text{intervention}$
Significance of $b_4$ indicates whether there is evidence of an interaction effect (i.e. in this case that the intervention has differential effects for boys and girls)
Same approach when your contextual variable is continuous rather than binary as here

Exploratory Analysis of Mediating Effects for RCT
$\text{predicted score2} = b_0 \cdot \text{constant} + b_1 \cdot \text{intervention} + b_2 \cdot \text{score1} + b_3 \cdot \text{boy} + b_4 \cdot \text{boy} \cdot \text{intervention}$
Use the model to estimate adjusted mean post-test scores*:
– $b_0$ = girls in control group
– $b_0 + b_3$ = boys in control group
– $b_0 + b_1$ = girls in intervention group
– $b_0 + b_1 + b_3 + b_4$ = boys in intervention group
Estimate standard deviations by calculating s.d. for predicted score2 for each subgroup separately
*When dealing with a continuous contextual variable, it is often still useful to calculate adjusted mean post-test scores to illustrate any interaction effects found. This can be done by using the model to predict the adjusted post-test mean scores for those participants in the control and intervention groups who have a score for the contextual variable concerned that is one standard deviation below the mean and then doing the same for those who have a score one standard deviation above the mean.
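As a small sketch of how the subgroup means follow from the interaction model, the function below evaluates the regression equation at the mean pre-test score (score1 = 0, because score1 is centred). The function name and the coefficient values are hypothetical, chosen only for illustration.

```python
def adjusted_mean(b, intervention, boy, score1=0.0):
    """Predicted score2 from the interaction model.

    b is the tuple (b0, b1, b2, b3, b4); score1 is centred, so 0 is its mean.
    """
    b0, b1, b2, b3, b4 = b
    return b0 + b1 * intervention + b2 * score1 + b3 * boy + b4 * boy * intervention


# Hypothetical coefficients, for illustration only.
coefficients = (50.0, 4.0, 0.8, -2.0, 1.5)

for boy, label in ((0, "girls"), (1, "boys")):
    control = adjusted_mean(coefficients, intervention=0, boy=boy)
    treated = adjusted_mean(coefficients, intervention=1, boy=boy)
    print(f"{label}: control {control:.1f}, intervention {treated:.1f}, "
          f"difference {treated - control:.1f}")
```

The intervention effect is b1 for girls and b1 + b4 for boys, which is why the significance of b4 is the test of a differential effect. For a continuous contextual variable, the same function could be evaluated at one standard deviation below and above that variable's mean.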

Extending the Analysis
For trials with binary or ordinal outcome measures, the same approach can be used but with generalised linear regression models:
– Binary logistic regression (binary outcomes)
– Ordered logistic regression (ordinal outcomes)
For cluster randomised trials (with >30 clusters), the same models can be used but extended to create two-level models
For quasi-experimental designs, either:
– Same models as above but adding in a number of additional covariates (all centred) to control for pre-test differences
– Propensity score matching
For repeated measures designs, the above can also be extended using multilevel models with observations (level 1) clustered within individuals (level 2)

Discussion (2 mins) Write on post-it notes: What are the key issues or questions for evaluators? Have you found any solutions?

Analysis Stephen Gorard

What is N?
– How many cases were assessed for eligibility?
– How many of those assessed did not participate, and for what reasons (not meeting criteria, refused etc.)?
– How many then agreed to participate?
– How many were allocated to each group (if relevant)?
– How many were lost or dropped out after agreeing to participate (and after allocation to a group, if relevant)?
– How many were analysed, and why were any further cases excluded from the analysis?

An example of reporting problems with a sample
Pupils allocated to groups but with no gain score, and reason for omission:
Allocation       Pre-test score  Post-test score  Reason
Treatment group  78              -                Left school, not traced
Treatment group  73              -                Long-term sick during post-test
Control          74              -                Left school, new school would not test
Control          75              -                Withdrawn, personal reasons
Control          -               70               Pre-test not recorded, technical reasons
Control          73              -                Permanently excluded by school
In total, 314 individual Year 7 pupils took part in the study. 157 pupils were assigned to treatment and 157 to control. The sample included students from a disadvantaged background (eligible for free school meals), those with a range of learning disabilities (SEN) and those for whom English was a second language. By the final analysis six students had dropped out or could not be included in the gain score analysis. One took the pre-test (repeatedly) but his school were unable to record the score. His post-test score was 78, and he would have been in the control. Five others took the pre-test but did not sit the post-test. One left the school and could not be traced, initially scored 78 and would have been in treatment. One left the school and their new school was not able to arrange the post-test, initially scored 64 and would have been control. One changed schools, one could not get their score saved at pre-test, one refused to cooperate and one was persistently absent at post-test (perhaps excluded). Although this loss of data, and the reduction of the sample to 308 pupils, is unfortunate, there is no specific reason to believe that this dropout was biased or favoured one group over the other.
Source: Gorard, S., Siddiqui, N. and See, BH (2013) Process and summative evaluation of the Switch-On literacy transition programme, Report to the Education Endowment Foundation
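The "What is N?" accounting can be checked mechanically. The sketch below tallies the figures reported for this example (157 pupils allocated to each arm, and the six rows of the omission table); the variable names are illustrative only.

```python
from collections import Counter

# Pupils allocated to each arm, as reported in the example.
allocated = {"treatment": 157, "control": 157}

# One entry per pupil without a gain score, from the table's Allocation column.
no_gain_score = ["treatment", "treatment",
                 "control", "control", "control", "control"]

lost = Counter(no_gain_score)
analysed = {group: n - lost[group] for group, n in allocated.items()}
attrition = {group: lost[group] / n for group, n in allocated.items()}

for group in allocated:
    print(f"{group}: allocated {allocated[group]}, lost {lost[group]}, "
          f"analysed {analysed[group]}, attrition {attrition[group]:.1%}")
print("total analysed:", sum(analysed.values()))
```

Reporting the per-arm counts alongside the total makes it easier for a reader to judge whether dropout differed between groups.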

Discussion (2 mins) Write on post-it notes: What are the key issues or questions for evaluators? Have you found any solutions?

Calculating effect sizes and the toolkit meta-analysis – implications for evaluators
Steve Higgins, School of Education, Durham University
EEF Evaluators Conference, June 2013

Sutton Trust/EEF Teaching and Learning Toolkit
Comparative evidence
Aims to identify ‘best buys’ for schools
Based on meta-analysis

What is meta-analysis?
A way of combining the results of quantitative research:
– To accumulate evidence from smaller studies
– To compare results of similar studies (consistency)
– To investigate patterns of association in the findings of different studies (explaining variation)
‘Surveys’ research studies

Why meta-analysis?
Cumulative – synthesis of evidence
Based on size of effect and confidence intervals rather than significance testing – patterns in the data
Identifying and understanding variation helps develop explanatory models
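One common way to combine effect sizes and their confidence intervals is inverse-variance weighting under a fixed-effect model. The sketch below is a generic illustration of that technique, not the Toolkit's own procedure; the function name and the input values are hypothetical.

```python
import math


def pool_fixed_effect(effects, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled effect, lower bound, upper bound) of the confidence interval.
    """
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled


# Hypothetical study effect sizes and standard errors, for illustration only.
effects = [0.10, 0.25, 0.40]
std_errors = [0.20, 0.10, 0.15]

pooled, lower, upper = pool_fixed_effect(effects, std_errors)
print(f"pooled ES = {pooled:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

More precise studies (smaller standard errors) receive larger weights, so the pooled interval is narrower than that of any single study.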

What is an “effect size”?
Standardised way of looking at difference
Different methods for calculation:
– Binary (risk difference, odds ratio, risk ratio)
– Continuous:
  – Correlational (Pearson’s r)
  – Standardised mean difference (d, g, Δ)
  – Difference between control and intervention group as a proportion of the dispersion of scores
  – (Intervention group score – control group score) / standard deviation of scores
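For binary outcomes, the three measures named above can all be computed from the same 2×2 table of counts. This is a minimal sketch with invented counts; the function name is hypothetical.

```python
def binary_effect_sizes(events_t, n_t, events_c, n_c):
    """Risk difference, risk ratio and odds ratio for intervention vs control."""
    risk_t = events_t / n_t
    risk_c = events_c / n_c
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return {
        "risk_difference": risk_t - risk_c,
        "risk_ratio": risk_t / risk_c,
        "odds_ratio": odds_t / odds_c,
    }


# Invented counts: 60 of 100 intervention pupils and 40 of 100 control pupils
# reach some binary outcome (e.g. passing a threshold).
print(binary_effect_sizes(60, 100, 40, 100))
```

The odds ratio and risk ratio diverge as the outcome becomes more common, so a report should state which one is being used.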

Examples of Effect Sizes: ES = 0.2
“Equivalent to the difference in heights between 15 and 16 year old girls”
58% of control group below mean of experimental group
Probability you could guess which group a person was in = 0.54
Change in the proportion above a given threshold: from 50% to 58% or from 75% to 81%

ES = 0.8
“Equivalent to the difference in heights between 13 and 18 year old girls”
79% of control group below mean of experimental group
Probability you could guess which group a person was in = 0.66
Change in the proportion above a given threshold: from 50% to 79% or from 75% to 93%
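The percentages and probabilities quoted for ES = 0.2 and ES = 0.8 can be reproduced from the standard normal distribution, assuming both groups are normally distributed with equal variance (an assumption of this sketch, not stated on the slides). The function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, s.d. 1


def control_below_treatment_mean(d):
    """Proportion of the control group below the intervention group mean."""
    return STD_NORMAL.cdf(d)


def prob_correct_guess(d):
    """Chance of guessing a person's group using a cut-off midway between means."""
    return STD_NORMAL.cdf(d / 2)


def proportion_above_after(d, proportion_before):
    """Proportion above a fixed threshold once scores shift upwards by d s.d."""
    threshold = STD_NORMAL.inv_cdf(1 - proportion_before)
    return 1 - STD_NORMAL.cdf(threshold - d)


for d in (0.2, 0.8):
    print(f"ES = {d}: "
          f"{control_below_treatment_mean(d):.0%} of control below intervention mean, "
          f"P(correct guess) = {prob_correct_guess(d):.2f}, "
          f"50% -> {proportion_above_after(d, 0.50):.0%}, "
          f"75% -> {proportion_above_after(d, 0.75):.0%}")
```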

The rationale for using effect sizes
Traditional quantitative reviews focus on statistical significance testing:
– Highly dependent on sample size
– Null finding does not carry the same “weight” as a significant finding
Meta-analysis focuses on the direction and magnitude of the effects across studies:
– From “Is there a difference?” to “How big is the difference?” and “How consistent is the difference?”
– Direction and magnitude represented by “effect size”

Issues and challenges in meta-analysis
Conceptual:
– Reductionist - the answer is .42
– Comparability - apples and oranges
– Atheoretical - ‘flat-earth’
Technical:
– Heterogeneity
– Publication bias
– Methodological quality
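Heterogeneity is usually quantified with Cochran's Q and the I² statistic. The sketch below shows one standard way to compute them from study effect sizes and standard errors; the function name and the example values are hypothetical.

```python
def heterogeneity(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared as a percentage)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributable to between-study differences.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical study results, for illustration only.
q, df, i_squared = heterogeneity([0.05, 0.30, 0.60], [0.10, 0.12, 0.15])
print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```

A large I² suggests the studies are not all estimating the same effect, which is one reason a single pooled figure can be misleading.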

Comparative meta-analysis
Theory testing
Emphasises practical value
Incorporate EEF findings in new Toolkit meta-analyses
Ability grouping:
– Slavin 1990b (secondary low attainers): -0.06
– Lou et al 1996 (on low attainers): -0.12
– Kulik & Kulik 1982 (secondary - all): 0.10
– Kulik & Kulik 1984 (elementary - all): 0.07
Meta-cognition and self-regulation strategies:
– Abrami et al
– Haller et al
– Klauer & Phye
– Higgins et al
– Chiu
– Dignath et al

Calculating effect sizes
The difference between the two means, expressed as a proportion of the standard deviation
$ES = (M_e - M_c) / SD$
Cohen's d
Glass’ Δ
Hedges' g
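The three named variants differ only in the standard deviation used and in a small-sample correction. The sketch below uses the usual textbook definitions (pooled s.d. for Cohen's d, control-group s.d. for Glass' Δ, and the common approximate correction factor for Hedges' g); the scores are invented.

```python
import math
from statistics import mean, stdev


def cohens_d(experimental, control):
    """Mean difference divided by the pooled standard deviation."""
    n_e, n_c = len(experimental), len(control)
    pooled_var = ((n_e - 1) * stdev(experimental) ** 2 +
                  (n_c - 1) * stdev(control) ** 2) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_var)


def glass_delta(experimental, control):
    """Mean difference divided by the control group's standard deviation."""
    return (mean(experimental) - mean(control)) / stdev(control)


def hedges_g(experimental, control):
    """Cohen's d with the approximate small-sample correction 1 - 3 / (4 df - 1)."""
    df = len(experimental) + len(control) - 2
    return cohens_d(experimental, control) * (1 - 3 / (4 * df - 1))


# Invented post-test scores, for illustration only.
experimental = [12, 14, 16, 18]
control = [10, 12, 14, 16]
print(f"d = {cohens_d(experimental, control):.3f}, "
      f"delta = {glass_delta(experimental, control):.3f}, "
      f"g = {hedges_g(experimental, control):.3f}")
```

With samples this small the correction in Hedges' g is noticeable; with hundreds of pupils per arm the three values are usually close unless the group s.d.s differ.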

Reporting effect sizes: RCTs
Post-test standardised mean difference with confidence intervals
Fixed effect ok for individual randomisation
Not for clusters…
Cluster analysis
MLM
Equivalent measure
Other comparisons
Matched, Regression discontinuity
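One way to see why a pupil-level analysis is not adequate for cluster-randomised trials is the design effect, which shrinks the effective sample size when pupils in the same cluster resemble one another. This is a standard formula for equal-sized clusters rather than something stated on the slide; the function names and the numbers are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_pupils, cluster_size, icc):
    """Number of independent pupils the clustered sample is worth."""
    return n_pupils / design_effect(cluster_size, icc)


# Hypothetical trial: 580 pupils in classes of 25, intra-cluster correlation 0.2.
print(round(design_effect(25, 0.2), 2))
print(round(effective_sample_size(580, 25, 0.2)))
```

Ignoring clustering treats all pupils as independent and so understates standard errors, which is what a multilevel model is meant to avoid.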

Discussion task
What analyses are you intending to undertake?
How do you plan to calculate effect size(s)?
What statistical techniques:
1. Are you confident to undertake?
2. Would you be happy to advise other evaluation teams on?
3. Would you appreciate advice and/or support with?

Key requirement: be explicit…
Describe analysis decisions (e.g. ITT and missing data)
Report clusters separately
Submit complete data-set in case different analysis is required for comparability

References, further readings and information
Books and articles:
– Borenstein, M., Hedges, L.V., Higgins, J.P.T. & Rothstein, H.R. (2009) Introduction to Meta Analysis (Statistics in Practice) Oxford: Wiley Blackwell.
– Chambers, E.A. (2004). An introduction to meta-analysis with articles from the Journal of Educational Research ( ). Journal of Educational Research, 98, pp
– Cooper, H.M. (1982) Scientific Guidelines for Conducting Integrative Research Reviews. Review of Educational Research 52; 291.
– Cooper, H.M. (2009) Research Synthesis and meta-analysis: a step-by-step approach London: SAGE Publications (4th Edition).
– Cronbach, L. J., Ambron, S. R., Dornbusch, S. M., Hess, R.O., Hornik, R. C., Phillips, D. C., Walker, D. F., & Weiner, S. S. (1980). Toward reform of program evaluation: Aims, methods, and institutional arrangements. San Francisco, Ca.: Jossey-Bass.
– Eldridge, S. & Kerry, S. (2012) A Practical Guide to Cluster Randomised Trials in Health Services Research London: Wiley Blackwell
– Glass, G.V. (2000). Meta-analysis at 25. Available at: (accessed 9/9/08)
– Lipsey, Mark W., and Wilson, David B. (2001). Practical Meta-Analysis. Applied Social Research Methods Series (Vol. 49). Thousand Oaks, CA: SAGE Publications.
– Torgerson, C. (2003) Systematic Reviews and Meta-Analysis (Continuum Research Methods) London: Continuum Press.
Websites:
– What is an effect size?, by Rob Coe:
– The meta-analysis of research studies:
– The Meta-Analysis Unit, University of Murcia:
– The PsychWiki: Meta-analysis:
– Meta-Analysis in Educational Research:

Discussion (2 mins) Write on post-it notes: What are the key issues or questions for evaluators? Have you found any solutions?

Interpreting and Reporting Findings and Managing Expectations
Paul Connolly, Centre for Effective Education, Queen’s University Belfast
Conference of EEF Evaluators: Building Evidence in Education, Training Day, 11 July 2013

Interpreting Findings
Findings:
– only relate to the outcomes measured
– represent effects of programme compared to what those in the control group currently receive
– usually only relate to sample recruited (and thus are context- and time-specific)
Dangers of:
– ‘fishing exercises’ characterised by post-hoc decisions to consider other outcomes and/or differences in effects for differing sub-groups
– hypothesising regarding the causes of the effects (or reasons for the non-effects)

Reporting Findings
Being clear:
– Option of using adjusted post-test scores
– Conversion of effect sizes into more readily understandable measures (e.g. ‘improvement index’)
Being transparent:
– Identify outcomes at the beginning and stick to these; register the trial
– Report methods fully (CONSORT statement)
Being tentative:
– Acknowledge limitations
– Move from evidence of “what works” to evidence of “what works for specific pupils, in a particular context and at a particular time”

Example: Adjusted post-test scores
Source: Connolly, P., Miller, S. & Eakin, A. (2010) A Cluster Randomised Controlled Trial Evaluation of the Media Initiative for Children: Respecting Difference Programme. Belfast: Centre for Effective Education (p. 31). See:

Example: Improvement index
Take the effect size and convert it to Cohen’s U3 index (either by using statistical tables or effect size calculators online)
The improvement index represents the increase/decrease in the percentile rank for an average student in the intervention group (assuming at pre-test they are at the 50th percentile)
Effect size of 0.30 → U3 of 62%, i.e. the intervention is likely to result in an average student in the intervention group being ranked 12 percentile points higher compared to the average student in the control group (who would remain at the 50th percentile)
Other effect sizes:
– 0.10 → 4 percentile points
– 0.20 → 8 percentile points
– 0.40 → 16 percentile points
– 0.50 → 19 percentile points
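Instead of statistical tables, the conversion can be done with the standard normal distribution function, assuming normally distributed scores. This is a minimal sketch; the function name is illustrative.

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Percentile-point gain for a student starting at the 50th percentile.

    Cohen's U3 is the standard normal CDF at the effect size; the improvement
    index is U3 minus 50, expressed in percentile points.
    """
    u3 = NormalDist().cdf(effect_size)
    return 100 * (u3 - 0.5)


for es in (0.10, 0.20, 0.30, 0.40, 0.50):
    print(f"ES {es:.2f}: improvement index {improvement_index(es):.0f} percentile points")
```

A negative effect size gives a negative index, i.e. a fall in percentile rank.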

Managing Expectations
Regular and ongoing communication is the key
Importance of logic models and agreement of outcomes with programme developers/providers at the outset:
– Careful consideration of the intervention and associated activities and clear link between these and expected outcomes
– Ensure outcomes are domain-specific
Include sufficient time to discuss findings with programme developers/providers:
– Talk through possible interpretations
– Discuss further potential analyses (but be clear that these are exploratory)

Discussion (2 mins) Write on post-it notes: What are the key issues or questions for evaluators? Have you found any solutions?

Group discussion and feedback
Tables will be arranged by theme. Evaluators should move to the table with a theme which either they are able to contribute expertise on or which they are struggling with.
Tables should discuss:
– What are the key issues or questions for evaluators?
– What are the solutions?
– How can the EEF help?
Feedback from tables.