Assessing Intervention Fidelity in RCTs: Models, Methods and Modes of Analysis David S. Cordray & Chris Hulleman Vanderbilt University Presentation for the IES Research Conference Washington, DC June 9, 2009

Overview
Fidelity and achieved relative strength:
– Definitions, distinctions, and illustrations
– Conceptual foundation for assessing fidelity in RCTs
– Achieved relative strength: a special case in RCTs
Modes of analysis:
– Approaches and challenges
Chris Hulleman: Assessing implementation fidelity and achieved relative strength indices, the single core component case
Questions and discussion

Distinguishing Implementation Assessment from the Assessment of Implementation Fidelity
Two ends of a continuum of intervention implementation/fidelity:
A purely descriptive model:
– Answers the question "What transpired as the intervention was put in place (implemented)?"
A model based on an a priori intervention model, with explicit expectations about implementation of program components:
– Fidelity is the extent to which the realized intervention (t_Tx) is faithful to the pre-stated intervention model (T_Tx)
– Infidelity = T_Tx − t_Tx
Most implementation fidelity assessments involve both descriptive and model-based approaches.

Dimensions of Intervention Fidelity
Aside from agreement at the extremes, there is little consensus on what is meant by the term "intervention fidelity". The most frequent definitions:
– True fidelity = adherence or compliance: program components are delivered/used/received as prescribed, with stated criteria for success or full adherence. The specification of these criteria is relatively rare.
– Intervention exposure: the amount of program content, processes, and activities delivered to or received by all participants (aka receipt, responsiveness). This notion is the most prevalent.
– Intervention differentiation: the unique features of the intervention are distinguishable from other programs, including the control condition. A unique application within RCTs.

Linking Intervention Fidelity Assessment to Contemporary Models of Causality
Rubin's Causal Model:
– The true causal effect of X for individual i is (Y_i^Tx − Y_i^C)
– RCT methodology is the best approximation to this true effect
– In RCTs, the difference between conditions, on average, is the causal effect
Fidelity assessment within RCTs entails examining the difference between causal components in the intervention and control conditions. Differencing the causal conditions can be characterized as the achieved relative strength of the contrast.
– Achieved Relative Strength (ARS) = t_Tx − t_C
– ARS is a default index of fidelity

[Figure: treatment strength plotted against outcome. Expected relative strength is the difference between the intended conditions, T_Tx − T_C; achieved relative strength is the difference between the realized conditions, t_Tx − t_C = (85) − (70) = 15, i.e., an achieved relative strength of .15. The gaps between T_Tx and t_Tx, and between t_C and T_C, represent infidelity.]

Why Is This Important? Statistical conclusion validity:
– Unreliability of treatment implementation: variations across participants in the delivery/receipt of the causal variable (e.g., the treatment) increase error and reduce the size of the effect, decreasing the chances of detecting covariation.
– The result is a reduction in statistical power, or the need for a larger study.

The Effects of Structural Infidelity on Power [Figure: statistical power as a function of fidelity.]

Influence of Infidelity on Study Size [Figure: required study size as a function of fidelity.]
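The attenuation mechanism behind these power and study-size relationships can be illustrated with a short simulation (a sketch, not from the presentation): when only a fraction of the treatment group actually receives the treatment, the observed effect shrinks roughly in proportion to fidelity, and the required sample size grows with the inverse square of the observed effect. The specific numbers (true d = 0.5, 80% power, alpha = .05) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def observed_effect(true_d=0.5, fidelity=1.0, n=100_000):
    """Simulate a two-arm trial in which only a `fidelity` fraction of the
    treatment group actually receives the treatment; non-receivers behave
    like controls. Returns the observed standardized mean difference."""
    received = rng.random(n) < fidelity
    tx = np.where(received, rng.normal(true_d, 1.0, n), rng.normal(0.0, 1.0, n))
    c = rng.normal(0.0, 1.0, n)
    pooled_sd = np.sqrt((tx.var(ddof=1) + c.var(ddof=1)) / 2)
    return (tx.mean() - c.mean()) / pooled_sd

for f in (1.0, 0.8, 0.5):
    d_obs = observed_effect(fidelity=f)
    # n per arm for 80% power at alpha = .05 (normal approximation)
    n_req = 2 * ((1.96 + 0.84) / d_obs) ** 2
    print(f"fidelity={f:.1f}  observed d={d_obs:.2f}  n/arm for 80% power ~ {n_req:.0f}")
```

Halving fidelity roughly halves the observed effect, which quadruples the required sample size, which is the pattern the two figures above depict.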

If That Isn't Enough….
Construct validity:
– Which is the cause: (T_Tx − T_C) or (t_Tx − t_C)?
– Poor implementation: essential elements of the treatment are incompletely implemented.
– Contamination: essential elements of the treatment are found in the control condition (to varying degrees).
– Pre-existing similarities between T and C on intervention components.
External validity: generalization is about (t_Tx − t_C).
– This difference needs to be known for proper generalization and future specification of the intervention components.

So What Is the Cause? …The achieved relative difference between conditions across components. [Figure: component-by-component contrast showing infidelity in the treatment condition and augmentation of the control condition. PD = professional development; Asmt = formative assessment; Diff Inst = differentiated instruction.]

Review: Concepts and Definitions [Figure: treatment strength vs. outcome, locating T_Tx, t_Tx, t_C, and T_C and labeling the key concepts: intervention exposure, true fidelity, Tx contamination, augmentation of C, positive infidelity, and intervention differentiation. Achieved relative strength = .15.]

Some Sources and Types of Infidelity
If delivery or receipt can be dichotomized (yes or no):
– Simple fidelity involves compliers;
– Simple infidelity involves "no-shows" and cross-overs.
Structural flaws in implementing the intervention:
– Missing or incomplete resources or processes
– External constraints (e.g., snow days)
Incomplete delivery of core intervention components:
– Implementer failures or incomplete delivery

A Tutoring Program: Variation in Exposure
Expectations: 4-5 tutoring sessions per week, 25 minutes each, for 11 weeks. [Figure: timeline from random assignment of students across three tutoring cycles, showing the average number of sessions delivered and the range for each cycle.]

Variation in Exposure: Tutor Effects [Figure: average number of tutoring sessions delivered by each individual tutor.] The other fidelity question: how faithful to the tutoring model is each tutor?

In Practice….
– Identify core components in the intervention group (e.g., via a model of change)
– Establish benchmarks (if possible) for T_Tx and T_C
– Measure core components to derive t_Tx and t_C (e.g., via a "logic model" based on the model of change)
– Measurement (deriving indicators)
– Convert to achieved relative strength and implementation fidelity scales
– Incorporate into the analysis of effects

What Do We Measure? What Are the Options?
(1) Essential or core components (activities, processes);
(2) Necessary, but not unique, activities, processes, and structures (supporting the essential components of T); and
(3) Ordinary features of the setting (shared with the control group).
Focus on (1) and (2).

Fidelity Assessment Starts With a Model or Framework for the Intervention [Figure: intervention model, from Gamse et al. 2008.]

Core Reading Components for Local Reading First Programs
Design and implementation of research-based reading programs (after Gamse et al. 2008):
– Use of research-based reading programs, instructional materials, and assessments, as articulated in the LEA/school application
– Teacher professional development in the use of materials and instructional approaches
1) Teacher use of instructional strategies and content based on the five essential components of reading instruction
2) Use of assessments to diagnose student needs and measure progress
3) Classroom organization and supplemental services and materials that support the five essential components

From Major Components to Indicators… [Figure: hierarchy from major components (professional development; reading instruction; support for struggling readers; assessment) to sub-components (instructional time, instructional materials, instructional activities/strategies), facets (block, actual time), and indicators (scheduled block?, reported time).]

Reading First Implementation: Specifying Components and Operationalization (adapted from Moss et al. 2008)
Component: sub-component (facets / indicators, with indicators per facet in parentheses):
– Reading Instruction: Instructional Time (2 / 2; 1), Instructional Materials (4 / 12; 3), Instructional Activities/Strategies (8 / 28; 3.5)
– Support for Struggling Readers (SR): Intervention Services (3 / 12; 4), Supports for Struggling Readers (2 / 16; 8), Supports for ELL/SPED (2 / 5; 2.5)
– Assessment: Selection/Interpretation (5 / 12; 2.4), Types of Assessment (3 / 9; 3), Use by Teachers (1 / 7; 7)
– Professional Development: Improved Reading Instruction (6.1; 4)

Reading First Implementation: Some Results (adapted from Moss et al. 2008)
Performance levels (RF vs. non-RF) and achieved relative strength, ARSI (U3):
– Reading Instruction: instructional time (minutes): ARSI (63%)
– Reading Instruction: support: 79% vs. 58%; ARSI 0.50 (69%)
– Struggling Readers: more Tx time and supplemental services: 83% vs. 74%; ARSI 0.20 (58%)
– Professional Development: hours of PD: ARSI (66%)
– Professional Development: five reading dimensions: 86% vs. 62%; ARSI 0.55 (71%)
– Assessment: grouping, progress, needs: 84% vs. 71%; ARSI 0.32 (63%)
– Overall: ARSI 0.39 (65%)

So What Do I Do With All This Data?
Start with:
– Scale construction: aggregation over facets, sub-components, and components
Use as:
– Descriptive analyses
– Explanatory (aka exploratory) analyses
– There are a lot of options
In this section we describe a hierarchy of analyses, from higher to lower levels of causal inference.
Caveat: except for descriptive analyses, most approaches are relatively new and not fully tested.

Hierarchy of Approaches to Analysis
ITT (intent-to-treat) estimates (e.g., ES), plus:
– an index of true fidelity: ES = .50, Fidelity = 96%
– an index of achieved relative strength (ARS); Hulleman's initial analysis: ES = 0.45, ARS = 0.92.
LATE (Local Average Treatment Effect): if treatment receipt/delivery can be meaningfully dichotomized and there is experimentally induced receipt or non-receipt of treatment:
– adjust the ITT estimate by the T and C treatment receipt rates.
– This simple model can be extended to an instrumental variables analysis (see Bloom's 2005 book).
ITT retains causal status; LATE can approximate causal statements.
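The LATE adjustment just described can be sketched in a few lines, assuming a dichotomized receipt measure; the example numbers (ITT effect of 0.20 SD, 80% receipt in the treatment arm, 10% crossover in control) are hypothetical.

```python
def late_estimate(itt_effect, receipt_rate_tx, receipt_rate_c):
    """Local Average Treatment Effect: rescale the intent-to-treat effect
    by the difference in treatment receipt rates between the two arms
    (the Bloom / instrumental-variables adjustment)."""
    uptake_contrast = receipt_rate_tx - receipt_rate_c
    if uptake_contrast <= 0:
        raise ValueError("Treatment arm must have higher receipt than control.")
    return itt_effect / uptake_contrast

# Hypothetical numbers: ITT = 0.20 SD, 80% receipt in Tx, 10% crossover in C.
print(late_estimate(0.20, 0.80, 0.10))  # 0.20 / 0.70 = 0.2857...
```

Note that the smaller the receipt contrast between arms, the larger the rescaling, which is exactly why achieved relative strength matters for interpreting an ITT estimate.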

More on Fidelity-to-Outcome Linkages
TOT (treatment-on-treated):
– Simple: the ITT estimate adjusted for the compliance rate in Tx; no randomization.
– Two-level linear production function: model the effects of implementation factors in Tx, and the factors affecting C, in separate level-2 equations.
– Regression-based model: exchange implementation fidelity scales for the treatment exposure variable.

Descriptive Analyses
Fidelity is often examined in the intervention group only.
– Dose-response relationship
– Partition intervention sites into "high" and "low" implementation fidelity. In my review of some ATOD prevention studies, ES_HIGH = 0.13 to 0.18 while ES_LOW = 0.00 to 0.03.
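The high/low partition can be sketched with simulated site-level data (all values below are hypothetical; this is not the ATOD data): sites are split at the median fidelity score and an effect size is computed within each half.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical site-level data: fidelity scores in [0.2, 1.0] and outcomes in
# which the treatment effect is proportional to fidelity (a dose-response).
n_sites = 5000
fidelity = rng.uniform(0.2, 1.0, n_sites)
outcome_tx = rng.normal(0.4 * fidelity, 1.0)   # treated-site outcomes (SD = 1)
outcome_c = rng.normal(0.0, 1.0, n_sites)      # matched control-site outcomes

# Partition sites at the median fidelity and compute an effect size per half.
median_f = np.median(fidelity)
for label, mask in (("high", fidelity >= median_f), ("low", fidelity < median_f)):
    es = outcome_tx[mask].mean() - outcome_c[mask].mean()  # SDs are 1 by design
    print(f"{label}-fidelity sites: ES ~ {es:.2f}")
```

Because the simulated effect scales with fidelity, the high-fidelity half shows roughly twice the effect size of the low-fidelity half, mirroring the ES_HIGH vs. ES_LOW pattern above. Remember this partition is descriptive, not causal: sites were not randomized to fidelity levels.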

Some Challenges
– Interventions are rarely clearly specified;
– Measurement involves novel constructs;
– How should components be weighted, if at all?
– Fidelity assessment occurs at multiple levels;
– Fidelity indicators are used at the 2nd and 3rd levels of HLM models, where there are few degrees of freedom;
– There is uncertainty about the psychometric properties of fidelity indicators; and
– The functional form of the fidelity-outcome relationship is not always known.
But despite these challenges, Chris Hulleman has a dandy example…

Assessing Implementation Fidelity in the Lab and in Classrooms: The Case of a Motivation Intervention

The Theory of Change [Figure: manipulated relevance leads to perceived utility value, which in turn predicts interest and performance. Model adapted from Eccles et al. (1983) and Hulleman et al. (2009).]

Methods (Hulleman & Cordray, 2009)
– Sample: Laboratory: N = 107 undergraduates. Classroom: N = 182 ninth-graders in 13 classes, 8 teachers, 3 high schools.
– Task: Laboratory: mental multiplication technique. Classroom: biology, physical science, physics.
– Treatment manipulation: Laboratory: write about how the mental math technique is relevant to your life. Classroom: pick a topic from science class and write about how it relates to your life.
– Control manipulation: Laboratory: write a description of a picture from the learning notebook. Classroom: pick a topic from science class and write a summary of what you have learned.
– Number of manipulations: Laboratory: 1. Classroom: 2 to 8.
– Length of study: Laboratory: 1 hour. Classroom: 1 semester.
– Dependent variable: perceived utility value.

Motivational Outcome [Figure: classroom treatment vs. control contrast on the motivational outcome: g = 0.05 (p = .67). Why so small?]

Fidelity Measurement and Achieved Relative Strength
A simple intervention with one core component. Intervention fidelity:
– Exposure: "quality of participant responsiveness"
– Rated on a scale from 0 (none) to 3 (high)
– 2 independent raters, 88% agreement

Exposure [Table: distribution of quality-of-responsiveness ratings (0-3) by condition (treatment, control) in the laboratory and classroom samples: N, %, total, mean, and SD per cell.]

Indexing Fidelity
– Absolute: compare observed fidelity (t_Tx) to the absolute or maximum level of fidelity (T_Tx)
– Average: mean level of observed fidelity (t_Tx)
– Binary: yes/no treatment receipt based on fidelity scores; requires selection of a cut-off value
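The three indices can be computed directly from a set of exposure ratings. A minimal sketch assuming the 0-3 responsiveness scale; the cut-off of 2 for the binary index and the example ratings are illustrative choices, not values from the study.

```python
import numpy as np

def fidelity_indices(ratings, max_rating=3, cutoff=2):
    """Three ways to index fidelity from exposure ratings (here, a 0-3
    quality-of-responsiveness scale). `max_rating` and `cutoff` are
    illustrative parameters, not the study's values."""
    ratings = np.asarray(ratings, dtype=float)
    return {
        "absolute": ratings.mean() / max_rating,  # observed vs. maximum (t_Tx vs. T_Tx)
        "average": ratings.mean(),                # mean observed fidelity
        "binary": (ratings >= cutoff).mean(),     # proportion counted as "received"
    }

# Hypothetical ratings for one treatment group:
print(fidelity_indices([3, 2, 2, 1, 3, 0, 2, 3]))
```

Notice that the binary index is the most sensitive to the cut-off choice, which is the practical drawback the slide flags.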

Fidelity Indices [Table: absolute, average, and binary fidelity indices for the treatment and control conditions in the laboratory and classroom studies.]

Indexing Fidelity as Achieved Relative Strength
Intervention strength = Treatment − Control.
The Achieved Relative Strength (ARS) index:
– the standardized difference in a fidelity index across Tx and C,
– based on Hedges' g (Hedges, 2007),
– corrected for clustering in the classroom (ICCs from .01 to .08).
See Hulleman & Cordray (2009).

Average ARS Index
ARS_avg = [(t̄_Tx − t̄_C) / S_T] × J × sqrt(1 − 2(n − 1)ρ / (N − 2))
where:
t̄_Tx = mean fidelity for the treatment group (t_Tx)
t̄_C = mean fidelity for the control group (t_C)
S_T = pooled within-groups standard deviation
J = small-sample adjustment
n = average cluster size
ρ = intra-class correlation (ICC)
N = total sample size (n_Tx + n_C)
The three factors are, respectively, the group difference, the sample-size adjustment, and the clustering adjustment.
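The average ARS computation can be sketched as follows, using the Hedges (2007) cluster-adjusted g with the usual small-sample correction J = 1 − 3/(4(N − 2) − 1). This is one common form, not the authors' exact code, and the pooled SD (0.53), cluster size, and ICC below are assumed values chosen only to roughly reproduce the reported ARS of 1.32.

```python
import math

def average_ars(mean_tx, mean_c, sd_pooled, n_tx, n_c, icc, avg_cluster_size):
    """Average ARS index: Hedges' g on the fidelity measure, with a
    small-sample correction and a Hedges (2007)-style clustering
    adjustment. A sketch of one common form, not the study's code."""
    n_total = n_tx + n_c
    g = (mean_tx - mean_c) / sd_pooled                       # group difference
    j = 1 - 3 / (4 * (n_total - 2) - 1)                      # sample-size adjustment
    cluster = math.sqrt(1 - 2 * (avg_cluster_size - 1) * icc / (n_total - 2))
    return g * j * cluster

# Assumed inputs: fidelity means 0.74 (Tx) and 0.04 (C), pooled SD 0.53,
# 90 students per arm in clusters of ~14, ICC = .05.
print(round(average_ars(0.74, 0.04, 0.53, 90, 90, 0.05, 14), 2))  # prints 1.31
```

With modest cluster sizes and small ICCs, the clustering adjustment shaves only a few percent off g; it grows quickly as either n or ρ increases.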

Absolute and Binary ARS Indices
The same form, with proportions in place of means:
p_Tx = proportion for the treatment group (t_Tx)
p_C = proportion for the control group (t_C)
n_Tx = treatment sample size; n_C = control sample size
n = average cluster size; ρ = intra-class correlation (ICC); N = total sample size
Again the index combines a group difference, a sample-size adjustment, and a clustering adjustment.

[Figure: treatment strength vs. outcome for the classroom study. Average ARS index = (0.74) − (0.04) = 0.70 on the fidelity scale, an achieved relative strength of 1.32 in standardized units; the gaps between T_Tx and t_Tx, and between t_C and T_C, represent infidelity.]

Achieved Relative Strength Indices [Table: observed fidelity (Tx, C) and achieved relative strength g for the absolute, average, and binary indices in the lab and classroom, plus the lab-minus-class contrast.]

Linking Achieved Relative Strength to Outcomes

Sources of Infidelity in the Classroom
– Student behaviors were nested within teacher behaviors: teacher dosage; frequency of student exposure.
– Student and teacher behaviors were used to predict treatment fidelity (i.e., quality of responsiveness/exposure).

Sources of Infidelity: Multi-level Analyses, Part I: Baseline Analyses
Identified the amount of residual variability in fidelity due to students and teachers.
– Due to missing data, we estimated a 2-level model (153 students, 6 teachers)
Student level: Y_ij = b_0j + b_1j(TREATMENT)_ij + r_ij
Teacher level: b_0j = γ_00 + u_0j ; b_1j = γ_10 + u_1j
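The baseline variance partition can be approximated with a one-way random-effects ANOVA on simulated data (a method-of-moments sketch, not the authors' HLM analysis; the variance components used to generate the data are assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated fidelity (responsiveness) scores for students nested in teachers.
# Teacher SD 0.5 and student SD 1.0 are assumed generating values.
n_teachers, n_per = 6, 25
teacher_effects = rng.normal(0.0, 0.5, n_teachers)
scores = teacher_effects[:, None] + rng.normal(0.0, 1.0, (n_teachers, n_per))

def variance_components(scores):
    """One-way random-effects ANOVA by the method of moments: partition
    the variance in fidelity into between-teacher and within-teacher
    (student) components, echoing the unconditional baseline model."""
    k, n = scores.shape
    group_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    ms_between = n * ((group_means - grand_mean) ** 2).sum() / (k - 1)
    ms_within = ((scores - group_means[:, None]) ** 2).sum() / (k * (n - 1))
    var_teacher = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    var_student = ms_within
    icc = var_teacher / (var_teacher + var_student)
    return var_teacher, var_student, icc

vt, vs, icc = variance_components(scores)
print(f"teacher variance ~ {vt:.2f}, student variance ~ {vs:.2f}, ICC ~ {icc:.2f}")
```

With only 6 teachers, the between-teacher component is estimated very imprecisely, which is the few-degrees-of-freedom problem flagged under "Some Challenges" above.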

Sources of Infidelity: Multi-level Analyses, Part II: Explanatory Analyses
Predicted residual variability in fidelity (quality of responsiveness) with frequency of responsiveness and teacher dosage.
Student level: Y_ij = b_0j + b_1j(TREATMENT)_ij + b_2j(RESPONSE FREQUENCY)_ij + r_ij
Teacher level:
b_0j = γ_00 + u_0j
b_1j = γ_10 + γ_11(TEACHER DOSAGE)_j + u_1j
b_2j = γ_20 + γ_21(TEACHER DOSAGE)_j + u_2j

Sources of Infidelity: Multi-level Analyses [Table: residual variance components for the baseline and explanatory models at level 1 (student) and level 2 (teacher), with % of total and % reduction; the explanatory model reduced level-1 (student) variance by < 1%. * p < .001.]

Case Summary
– The motivational intervention was more effective in the lab (g = 0.45) than in the field (g = 0.05).
– Using 3 indices of fidelity, and in turn achieved relative treatment strength, revealed that: classroom fidelity < lab fidelity; achieved relative strength was about 1 SD less in the classroom than in the laboratory.
– Differences in achieved relative strength paralleled differences in the motivational outcome, especially in the lab.
– Sources of infidelity: teacher (not student) factors.

Key Points and Issues
Identification and measurement should include, at a minimum, the model-based core and necessary components.
Collaboration among researchers and practitioners (e.g., developers and implementers) is essential for specifying:
– intervention models;
– core and essential components;
– benchmarks for T_Tx (e.g., an educationally meaningful dose; what level of X is needed to instigate change);
– tolerable adaptation.

Key Points and Issues
Fidelity assessment serves two roles:
– estimating the average causal difference between conditions; and
– using fidelity measures to assess the effects of variation in implementation on outcomes.
It also supports post-experimental (re)specification of the intervention.

Thank You Questions and Discussion