Rigour of evaluation Dr Carole Torgerson Senior Research Fellow Institute for Effective Education University of York

“A careful look at randomized experiments will make clear that they are not the gold standard. But then, nothing is. And the alternatives are usually worse.” Berk RA. (2005) Journal of Experimental Criminology 1,

Characteristics of a rigorous trial
- Once randomised, all participants are included within their allocated groups.
- Random allocation is undertaken by an independent third party.
- Outcome data are collected blindly.
- Sample size is sufficient to exclude an important difference.
- A single analysis is pre-specified before data analysis.
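
The sample-size requirement can be made concrete. The sketch below (illustrative, not from the slides; the function name and defaults are invented) applies the standard normal-approximation formula for a two-arm trial comparing means, n = 2(z_{1-α/2} + z_{power})² / d², where d is the standardised effect size:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per arm for a two-arm trial comparing means,
    using the normal-approximation formula n = 2 * (z_{1-a/2} + z_power)^2 / d^2,
    where d is the standardised effect size (difference in means / SD)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # e.g. 1.96 for a two-sided 5% test
    z_beta = z.inv_cdf(power)            # e.g. 0.84 for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2
    return math.ceil(n)
```

For a medium effect size of d = 0.5 this gives roughly 63 children per group; detecting a small effect of d = 0.2 with the same power needs nearly 400 per group, which is why small trials risk Type II errors.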

Education: comparison with health education
Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomised controlled trials in health and education. British Educational Research Journal, 31: (based on n = 168 trials)

Problems with RCTs
- Failure to keep to random allocation – can introduce selection bias
- Attrition – can introduce selection bias
- Unblinded ascertainment – can lead to ascertainment bias
- Small samples – can lead to Type II error
- Multiple statistical tests – can give Type I errors
- Poor reporting of uncertainty (e.g., lack of confidence intervals)
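
The multiple-testing problem is easy to demonstrate by simulation. In this sketch (illustrative; the function name is invented) each simulated trial has no true effect, so each test's p-value is Uniform(0, 1); counting trials with at least one "significant" result shows how the family-wise Type I error rate grows with the number of tests:

```python
import random

random.seed(1)

def familywise_error(n_tests: int, alpha: float = 0.05, n_sims: int = 100_000) -> float:
    """Simulate trials with no true effect; under the null each test's p-value
    is Uniform(0, 1). Return the fraction of simulated trials in which at
    least one of the n_tests comparisons is 'significant' at level alpha."""
    hits = 0
    for _ in range(n_sims):
        if any(random.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_sims
```

With a single pre-specified test the error rate stays near 5%, but with ten outcome measures tested at the 5% level roughly 40% of null trials report at least one spurious "effect" (theoretically 1 − 0.95¹⁰ ≈ 0.40), which is why a single analysis should be pre-specified.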

Which are RCTs?
1. “We took two groups of schools – one group had high ICT use and the other low ICT use – we then took a random sample of pupils from each school and tested them.”
2. “We put the students into two groups, we then randomly allocated one group to the intervention whilst the other formed the control.”
3. “We formed the two groups so that they were approximately balanced on gender and pre-test scores.”
4. “We identified 200 children with a low reading age and then randomly selected 50 to whom we gave the intervention. They were then compared to the remaining 150.”
5. “Of the eight [schools] two randomly chosen schools served as a control group.”

Mixed allocation “Students were randomly assigned to either Teen Outreach participation or the control condition either at the student level (i.e., sites had more students sign up than could be accommodated and participants and controls were selected by picking names out of a hat or choosing every other name on an alphabetized list) or less frequently at the classroom level” Allen et al, Child Development 1997;64:

Is it randomised? “The groups were balanced for gender and, as far as possible, for school. Otherwise, allocation was randomised.” Thomson et al. Br J Educ Psychology 1998;68:

Is it randomised? “The students were assigned to one of three groups, depending on how revisions were made: exclusively with computer word processing, exclusively with paper and pencil or a combination of the two techniques.” Greda and Hannafin, J Educ Res 1992;85:144.

Non-random assignment confused with random allocation “Before mailing, recipients were randomized by rearranging them in alphabetical order according to the first name of each person. The first 250 received one scratch ticket for a lottery conducted by the Norwegian Society for the Blind, the second 250 received two such scratch tickets, and the third 250 were promised two scratch tickets if they replied within one week.” Finsen V, Storeheier, AH (2006) Scratch lottery tickets are a poor incentive to respond to mailed questionnaires. BMC Medical Research Methodology 6, 19. doi: /

What is the problem here? “Pairs of students in each classroom were matched on a salient pretest variable, Rapid Letter Naming, and randomly assigned to treatment and comparison groups.” “The original sample – those students were tested at the beginning of Grade 1 – included 64 assigned to the SMART program and 63 assigned to the comparison group.” Baker S, Gersten R, Keating T. (2000) When less may be more: A 2-year longitudinal evaluation of a volunteer tutoring program requiring minimal training. Reading Research Quarterly 35,

What is wrong here? “the remaining 4 classes of fifth-grade students (n = 96) were randomly assigned, each as an intact class, to the [4] prewriting treatment groups;” Brodney et al. J Exp Educ 1999;68,5-20.

Misallocation issues “23 offenders from the treatment group could not attend the CBT course and they were then placed in the control group”.

Independent assignment “Randomisation by centre was conducted by personnel who were not otherwise involved in the research project” [1] Distant assignment was used to: “protect overrides of group assignment by the staff, who might have a concern that some cases receive home visits regardless of the outcome of the assignment process”[2] [1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48, [2] Davis RG, Taylor BG. (1997) Criminology 35,

Attrition
Attrition can lead to bias; a high-quality trial will have maximal follow-up after allocation. It can be difficult to ascertain the amount of attrition and whether or not attrition rates are comparable between groups. A good trial reports low attrition with no between-group differences.
Rule of thumb: 0–5%, not likely to be a problem; 6–20%, worrying; > 20%, selection bias likely.
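
The rule of thumb above can be written down directly. This small helper (an illustrative sketch; the function name and wording of the labels are invented) classifies a trial's overall attrition rate into the slide's three bands:

```python
def attrition_concern(randomised: int, followed_up: int) -> str:
    """Apply the rule of thumb for overall attrition:
    0-5% not likely to be a problem, 6-20% worrying, >20% selection bias likely."""
    rate = 1 - followed_up / randomised
    if rate <= 0.05:
        return "not likely to be a problem"
    if rate <= 0.20:
        return "worrying"
    return "selection bias likely"
```

Note that a low overall rate is not sufficient on its own: as the next slide shows, unequal attrition between arms matters even when the total looks acceptable.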

Poorly reported attrition
In an RCT of foster carers, extra training was given.
“Some carers withdrew from the study once the dates and/or location were confirmed; others withdrew once they realized that they had been allocated to the control group”
“117 participants comprised the final sample”
No split between groups is given except in one table, which shows 67 in the intervention group and 50 in the control group – 25% more in the intervention group. Unequal attrition is a hallmark of potential selection bias. But we cannot be sure. Macdonald & Turner, Brit J Social Work (2005) 35, 1265

What is the problem here?
Random allocation: 160 children in 20 schools (8 per school), 80 in each group.
76 children allocated to control; 76 allocated to intervention group.
1 school (8 children) withdrew.
N = 17 children replaced following discussion with teachers.

What about matched pairs? We can only match on observable variables and we trust to randomisation to ensure that unobserved covariates or confounders are equally distributed between groups.
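
This can be illustrated with a small simulation (illustrative, not from the slides; the function name and parameters are invented). Pairs are matched on gender only; an unobserved "high/low" covariate is assigned by coin flip, and within each pair a coin flip decides who goes to control. On average the covariate balances out, but any single trial can be noticeably imbalanced:

```python
import random

def covariate_imbalance(n_pairs: int = 50, n_sims: int = 2000, seed: int = 42) -> tuple[float, float]:
    """Within each gender-matched pair, randomly assign one member to control
    and the other to the intervention. Each child carries an unobserved 'high'
    covariate with probability 0.5. Returns (mean signed difference, mean
    absolute difference) in the number of 'high' children between groups."""
    rng = random.Random(seed)
    signed_total = abs_total = 0
    for _ in range(n_sims):
        diff = 0
        for _ in range(n_pairs):
            a = rng.random() < 0.5   # unobserved covariate of pair member A
            b = rng.random() < 0.5   # unobserved covariate of pair member B
            if rng.random() < 0.5:   # randomise which member goes to control
                diff += int(a) - int(b)
            else:
                diff += int(b) - int(a)
        signed_total += diff
        abs_total += abs(diff)
    return signed_total / n_sims, abs_total / n_sims
```

Over many simulated trials the signed difference averages close to zero (randomisation is unbiased), while the typical absolute imbalance in a single 50-pair trial is around four children, which is exactly the situation the matched-pairs tables below depict.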

Matched Pairs on Gender
Control (unknown covariate) | Intervention (unknown covariate)
Boy (high)  | Boy (low)
Girl (high) | Girl (high)
Girl (low)  | Girl (high)
Boy (high)  | Boy (low)
Girl (low)  | Girl (high)
3 girls and 3 highs | 3 girls and 3 highs

Drop-out of 1 girl
Control | Intervention
Boy (high)  | Boy (low)
Girl (high) | Girl (high)
Girl (low)  | Girl (high)
Boy (high)  | Boy (low)
            | Girl (high)
2 girls and 3 highs | 3 girls and 3 highs

Removing the matched pair does not balance the groups!
Control | Intervention
Boy (high)  | Boy (low)
Girl (high) | Girl (high)
Girl (low)  | Girl (high)
Boy (high)  | Boy (low)
2 girls and 3 highs | 2 girls and 2 highs

Blinding of Outcome Assessment
Ascertainment bias can result when the assessor is not blind to group assignment. For example, a homeopathy study of histamine showed an effect when researchers were not blind to the assignment, but no effect when they were.
Example of outcome assessment blinding: the study “was implemented with blind assessment of outcome by qualified speech language pathologists who were not otherwise involved in the project”[1]
[1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48,

ITT analysis: examples
Seven participants allocated to the control condition (1.6%) received the intervention, whilst 65 allocated to the intervention failed to receive treatment (15%). The authors, however, analysed by randomised group – CORRECT approach. Davis RG, Taylor BG. (1997) Criminology 35,
“It was found in each sample that approximately 86% of the students with access to reading supports used them. Therefore, one-way ANOVAs were computed for each school sample, comparing this subsample with subjects who did not have access to reading supports.” – INCORRECT. Feldman SC, Fish MC. (1991) Journal of Educational Computing Research 7,
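
Why analysing "as treated" rather than "as randomised" is incorrect can be shown by simulation. In this sketch (illustrative; the scenario, function name and parameters are invented) the treatment has no true effect, but lower-ability students allocated to the intervention are less likely to take it up. The intention-to-treat estimate stays near zero, while the as-treated comparison manufactures a spurious benefit:

```python
import random
from statistics import fmean

def itt_vs_as_treated(n: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Simulate a trial with NO true treatment effect, where lower-ability
    students allocated to the intervention are less likely to take it up.
    Returns (ITT estimate, as-treated estimate) of the effect on test score."""
    rng = random.Random(seed)
    intervention, control, compliers = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        score = ability                      # outcome driven by ability alone
        if rng.random() < 0.5:               # allocated to intervention
            intervention.append(score)
            # uptake depends on ability: nearly all able students comply,
            # but only 20% of the less able do
            if ability > -0.5 or rng.random() < 0.2:
                compliers.append(score)
        else:
            control.append(score)
    itt = fmean(intervention) - fmean(control)
    as_treated = fmean(compliers) - fmean(control)
    return itt, as_treated
```

Under these assumptions the ITT estimate is close to zero, but comparing compliers with controls suggests an "effect" of roughly a third of a standard deviation that is entirely selection, not treatment.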

The CONSORT guidelines, adapted for trials in educational research
- Was the target sample size adequately determined?
- Was intention to teach analysis used? (i.e. were all children who were randomised included in the follow-up and analysis?)
- Were the participants allocated using random number tables, coin flip, or computer generation?
- Was the randomisation process concealed from the investigators? (i.e. were the researchers who were recruiting children to the trial blind to the child’s allocation until after that child had been included in the trial?)
- Were follow-up measures administered blind? (i.e. were the researchers who administered the outcome measures blind to treatment allocation?)
- Was precision of the effect size estimated (confidence intervals)?
- Were summary data presented in sufficient detail to permit alternative analyses or replication?
- Was the discussion of the study findings consistent with the data?

Flow Diagram In health care trials reported in the main medical journals, authors are required to produce a CONSORT flow diagram. The trial by Hatcher et al. clearly shows the fate of the participants from randomisation until analysis.

Flow Diagrams
Hatcher et al. J Child Psych Psychiatry: online

Year 7 Pupils, N = 155, randomised
  ICT group, N = 77: 3 left school; 70 valid pre-tests; 67 valid post-tests; 63 valid pre- and post-tests
  No ICT group, N = 78: 1 left school; 75 valid pre-tests; 71 valid post-tests; 67 valid pre- and post-tests