
Defining, Conceptualizing, and Measuring Fidelity of Implementation and Its Relationship to Outcomes in K–12 Curriculum Intervention Research
Prepared by Carol O’Donnell, Institute of Education Sciences
The opinions expressed are those of the presenter and do not necessarily represent the views of IES or the U.S. Department of Education. This webinar is based on a paper published in the Review of Educational Research (O'Donnell, 2008) and on findings reported in O'Donnell (2007), which were published before the presenter joined IES.

A little about my background…
– I was a classroom teacher for 10 years (grades 4–8) and now teach undergraduates part-time. “I know it’s not always easy to teach with fidelity.”
– I was a curriculum developer for 11 years, developing curriculum materials for science teachers. “I believed in the materials I developed. They were field tested. I knew they could be effective if taught with fidelity.”
– I was a researcher at GWU for 5 years, managing a large scale-up study on the effectiveness of middle school science curriculum units. “How can we be certain the Treatment is being implemented as planned?”
– This led to my role as a program officer and to the Review of Educational Research article (O’Donnell, 2008) on fidelity of implementation: How can I help other researchers define, conceptualize, and measure fidelity of implementation in their efficacy and effectiveness studies?

Motivation: What problems exist? Teachers have difficulty teaching with fidelity when creativity, variability, and local adaptations are encouraged. Developers often fail to identify the critical components of an intervention. Researchers often fail to measure whether components are delivered as intended.

What problems exist? If we want to determine the effectiveness of an intervention, we need to define the treatment and its counterfactual. If “it” works, what is “it,” and how do we distinguish “it” from what is happening in the comparison classroom? Most fidelity studies are correlational and do not involve impact analysis. Implementation under ideal conditions (efficacy studies) may yield higher effects than implementation under routine conditions (effectiveness studies) (Lipsey, 1999; Petrosino & Soydan, 2005; Weisz, Weiss, & Donenberg, 1992).

Points I Will Address In This Webinar
A. How do teachers, developers, and researchers define fidelity of implementation?
B. How is fidelity of implementation conceptualized within efficacy and effectiveness studies?

Points I Will Address In This Webinar
C. How do we measure fidelity of implementation?
D. How do we analyze fidelity data to determine how it impacts program effectiveness?
E. An example from my own research (if time).

A. How do teachers, developers, and researchers define fidelity of implementation?

Teachers

What does fidelity of implementation mean to a teacher?
As a teacher, I would ask:
“Can I modify the program to meet the needs of my diverse students (SPED, ESOL, etc.)?”
“How do I meet state indicators (e.g., vocabulary) not covered by the new program?”
“Can I use instructional practices that I typically use in the classroom (e.g., exit cards, warm-ups) even if they aren’t promoted by the intervention?”
“Can I add supplemental readings?”
Source: O’Donnell, Lynch, & Merchlinsky, 2004

To a teacher, fidelity is…
Adhering to program purpose, goals, and objectives.
Applying the program’s pedagogical approaches.
Following the program’s sequence.
Using the recommended equipment or materials.
Making an adaptation to the program that does NOT change its nature or intent.
Source: O’Donnell, Lynch, & Merchlinsky, 2004

To a teacher, fidelity is NOT…
Reducing or modifying program objectives.
Gradually replacing parts of the new program with previous practices.
Varying the grouping strategies outlined in the program.
Changing the program’s organizational patterns.
Substituting other curriculum materials or lessons for those described by the program.
Source: O’Donnell, Lynch, & Merchlinsky, 2004

Developers

What does fidelity of implementation mean to a developer?
As a developer, I would ask:
What are the critical components of the program?
If the teacher skips part of the program, why does that happen, and what effect will it have on outcomes?
Is the program feasible (practical) for a teacher to use? Is it usable (are the program goals clear)? If not, what changes should I make to the program? What programmatic support must be added?
What ancillary components are part of the program (e.g., professional development) and must be scaled up with it?

Why should developers collect fidelity of implementation data?
To distinguish between the effects of pre-existing good teaching practices and those prompted by the instructional program. If the program doesn’t add value, why spend money on it?
To understand why certain aspects of instructional delivery are consistently absent, despite curricular support (e.g., skipping lesson closure).
Source: O’Donnell, Lynch, & Merchlinsky, 2004

Researchers

What does fidelity of implementation mean to a researcher?
Determination of how well a program is implemented in comparison with the original program design during an efficacy and/or effectiveness study (Mihalic, 2002).
The extent to which the delivery of an intervention adheres to the program model originally developed; confirms that the independent variable in outcome research occurred as planned (Mowbray et al., 2003).

Why do researchers study fidelity of implementation?
To explore how effective programs might be scaled up across many sites (i.e., if implementation is a moving target, the generalizability of research may be imperiled).
To gain confidence that the observed student outcomes can be attributed to the program.
To gauge the wide range of fidelity with which an intervention might be implemented.
Source: Lynch, O’Donnell, Ruiz-Primo, Lee, & Songer

Definitions: Summary
Fidelity of implementation is:
the extent to which a program (including its content and process) is implemented as designed;
how it is implemented (by the teacher);
how it is received (by the students);
how long it takes to implement (duration); and
what it looks like when it is implemented (quality).

Questions?

B. How is fidelity of implementation conceptualized within efficacy and effectiveness studies?

Definition: Efficacy Study
Efficacy is the first stage of program research following development.
Efficacy is defined as “the ability of an intervention to produce the desired beneficial effect in expert hands and under ideal circumstances” (typically RCTs) (Dorland’s Illustrated Medical Dictionary, 1994, p. 531).
Failure to achieve desired outcomes in an efficacy study "give[s] evidence of theory failure, not implementation failure" (Raudenbush, 2003, p. 4).

Fidelity in Efficacy Studies
Internal validity – determines that the program will result in successful achievement of the instructional objectives, provided the program is “delivered effectively as designed” (Gagne et al., 2005, p. 354).
Efficacy entails continuously monitoring and improving implementation to ensure the program is implemented with fidelity (Resnick et al., 2005).
Fidelity data explain why innovations succeed or fail (Dusenbury et al., 2003) and help determine which features of the program are essential and require high fidelity, and which may be adapted or deleted (Mowbray et al., 2003).

Definition: Effectiveness Study
Interventions with demonstrated benefit in efficacy studies are then transferred into effectiveness studies.
An effectiveness study is not simply a replication of an efficacy study with more subjects and more diverse outcome measures conducted in a naturalistic setting (Hohmann & Shear, 2002).
Effectiveness is defined as “the ability of an intervention to produce the desired beneficial effect in actual use” under routine conditions (Dorland, 1994, p. 531), where mediating and moderating factors can be identified (Aron et al., 1997; Mihalic, 2002; Raudenbush, 2003; Summerfelt & Meltzer, 1998).

Fidelity in Effectiveness Studies
External validity – fidelity in effectiveness studies helps to generalize results and provides “adequate documentation and guidelines for replication projects adopting a given model” (Mowbray et al., 2003; Bybee, 2003; Raudenbush, 2003).
The role of the developer and researcher is minimized.
The focus is not on monitoring and controlling levels of fidelity; instead, variations in fidelity are measured in a natural setting and accounted for in outcomes.

Questions?

C. How do we measure fidelity of implementation?

Multiple Dimensions
Adherence – strict adherence to structural components and methods that conform to theoretical guidelines.
Dose (Duration) – completeness and amount of the program delivered.
Quality of Delivery – the way in which a program is implemented.
Participant Responsiveness – the degree to which participants are engaged.
Program Differentiation – the degree to which elements that would distinguish one type of program from another are present or absent.
Adapted from: Dane & Schneider (1998); Dusenbury, Brannigan, Falco, & Hansen (2003)
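A minimal sketch of how these five dimensions might be recorded for a single classroom observation. The field names and scales here are illustrative assumptions, not part of any published protocol:

```python
from dataclasses import dataclass

@dataclass
class FidelityObservation:
    """One classroom observation scored on the five fidelity dimensions.

    Scales are assumptions for illustration: proportions for adherence
    and dose, 0-3 observer ratings for the process dimensions.
    """
    classroom_id: str
    adherence: float                 # proportion of critical components delivered (0-1)
    dose: float                      # proportion of intended lessons/time completed (0-1)
    quality_of_delivery: int         # observer rating, e.g. 0-3
    participant_responsiveness: int  # observer rating of student engagement, e.g. 0-3
    program_differentiation: bool    # were the program's distinguishing elements present?

obs = FidelityObservation("T-101", adherence=0.82, dose=0.90,
                          quality_of_delivery=2, participant_responsiveness=3,
                          program_differentiation=True)
print(obs)
```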

O’Donnell (2008): Steps in Measuring Fidelity
1. Start with a curriculum profile or analysis; review program materials and consult with the developer. Determine the intervention’s program theory. What does it mean to teach it with fidelity?
2. Using the developer’s and past implementers’ input, outline the critical components of the intervention, divided into structure (adherence, duration) and process (quality of delivery, program differentiation, participant responsiveness), and outline the range of variations for acceptable use.
O’Donnell, C. L. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K–12 curriculum intervention research. Review of Educational Research, 78, 33–84.

O’Donnell (2008): Steps in Measuring Fidelity
3. Develop checklists and other instruments to measure implementation of the components (in most cases the unit of analysis is the classroom).
4. Collect multi-dimensional data in both treatment and comparison conditions: questionnaires, classroom observations, self-reports, student artifacts, interviews. Self-report data typically yield higher levels of fidelity than what is observed in the field.
5. Adjust outcomes if fidelity falls outside the acceptable range.
O’Donnell, C. L. (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K–12 curriculum intervention research. Review of Educational Research, 78, 33–84.
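A hedged sketch of the bookkeeping behind steps 3–5: scoring a per-classroom adherence checklist and flagging classrooms whose fidelity falls outside an acceptable range. The 0.60 threshold, column names, and data are assumptions for illustration, not values from the paper:

```python
import pandas as pd

# Hypothetical observation checklist: one row per classroom,
# one column per critical component (1 = observed, 0 = not observed).
checklist = pd.DataFrame(
    {"comp_1": [1, 1, 0, 1],
     "comp_2": [1, 0, 0, 1],
     "comp_3": [1, 1, 1, 1],
     "comp_4": [0, 1, 0, 1]},
    index=["class_A", "class_B", "class_C", "class_D"],
)

# Structural fidelity (adherence) as the proportion of components observed.
adherence = checklist.mean(axis=1)

# Assumed acceptable range; outcomes for flagged classrooms would be
# examined or adjusted in the impact analysis (step 5).
ACCEPTABLE_MIN = 0.60
flags = adherence < ACCEPTABLE_MIN

print(pd.DataFrame({"adherence": adherence, "below_range": flags}))
```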

Measuring Fidelity of Implementation
Psychometricians should be involved in the development of fidelity measures (validity and reliability).
Fidelity to structure (adherence) is easy to measure. Fidelity to process (quality) is less reliable to measure, but has higher predictive utility.
“Global classroom quality” should be considered separately from implementation fidelity, unless the global items are promoted by the program.
Measure adaptation separately from fidelity. Adapting a program is different from supplementing the program, which has been shown to enhance outcomes as long as fidelity is high (Blakely et al., 1987).
Fidelity measures are not universal; they are program-specific. As a field, we need to standardize the methods, not the measures. See the Hulleman et al. SREE 2010 papers.

Questions?

D. How do we analyze fidelity data to determine how it impacts program effectiveness?

Analyzing the Impact of Fidelity on Outcomes
Descriptive – frequency or percentage of fidelity.
Associative – simple correlation; relationship between percentage of fidelity and outcomes.
Predictive – fidelity explains a percentage of variance in outcomes in the treatment group.
Causal – requires randomization of teachers to high- and low-fidelity groups; fidelity causes outcomes; rarely done in research (Penuel).
Impact – fidelity as a third variable: e.g., fidelity moderates the relationship between intervention and outcomes, or the effect of the intervention on outcomes is mediated by the level of fidelity.
Adjusting Outcomes – achieved relative strength; fidelity vs. infidelity (Hulleman & Cordray, 2009).
Source: O’Donnell, 2008
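A minimal sketch of the associative and predictive levels above, assuming a hypothetical data frame with one row per treatment classroom, a fidelity score, and a classroom mean outcome (column names and values are illustrative):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical treatment-group data: one row per classroom.
df = pd.DataFrame({
    "fidelity": [0.55, 0.70, 0.62, 0.85, 0.91, 0.78],
    "outcome":  [48.0, 55.5, 51.0, 63.0, 66.5, 60.0],
})

# Associative: simple correlation between fidelity and outcomes.
r = df["fidelity"].corr(df["outcome"])
print(f"r(fidelity, outcome) = {r:.2f}")

# Predictive: share of outcome variance explained by fidelity
# within the treatment group.
model = smf.ols("outcome ~ fidelity", data=df).fit()
print(f"R^2 = {model.rsquared:.2f}")
```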

Analyzing the Impact of Fidelity on Outcomes
Correlational studies provide a useful foundation for impact analysis, but it is impact analysis that asks whether implementation fidelity changes the program’s effects.
Multiple correlations between fidelity items and outcomes are often disparate; what does it all mean? We need a more complete fidelity assessment to better understand construct validity and generalizability.

Questions?

Okay. So, how do you go from identifying the program theory (using a logic model to define and conceptualize fidelity in your own study), to measuring fidelity, to analyzing its effects on outcomes?

An Example
O’Donnell, C. L. (2007). Fidelity of implementation to instructional strategies as a moderator of curriculum unit effectiveness in a large-scale middle school science experiment. Dissertation Abstracts International, 68(08). (UMI No. AAT )

Step 1: Identify the critical components
First, we worked with the curriculum developers to identify the program’s critical components, which weren’t always explicit to users. We then separated the components into structure (adherence, duration) and process (quality of implementation, program differentiation). We hired a third-party evaluator to conduct a curriculum analysis to determine whether the components were present and, if they were, to what degree. Sharon Lynch will talk more about this work, which was part of the larger SCALE-uP study.

Curriculum Profile
(CTA = Chemistry That Applies*; ARIES = ARIES** Motion & Forces; MMH = Macmillan/McGraw-Hill Science*)

Instructional Category                              CTA   ARIES   MMH
III. Engaging Students with Relevant Phenomena
    Providing a variety of phenomena                 ●      ●      X
    Providing vivid experiences                      ●      ●      X
IV. Developing and Using Scientific Ideas
    Introducing terms meaningfully                   ●      ◒      X
    Representing ideas effectively                   ◒      ◒      X
    Demonstrating use of knowledge                   ◕      X      X
    Providing practice                               ●      X      X

● = Excellent, ◕ = Very Good, ◒ = Satisfactory, X = Poor
*Source:
**Available from www.gwu.edu/~scale-up under Reports

Step 2: Define the intervention a priori using a logic model
I then created a logic model of implementation to illustrate the theory of change. I used the model to theorize a priori what should happen in the classroom relative to outcomes. I kept the counterfactual in mind as I conceptualized the logic model, because I hypothesized that teachers’ fidelity to the instructional practices identified by the curriculum analysis was moderating outcomes, and I knew I would have to collect fidelity data in both the comparison and treatment classrooms (structure vs. process).
O’Donnell, C. L. (2007).

My Logic Model
I hypothesized that the presence of the curriculum materials in the teachers’ hands, relative to the comparison condition (business as usual), would have a direct effect on students’ understanding of motion and forces, but that this relationship would be moderated by a teacher’s use of the instructional practices identified by the curriculum analysis. In other words, I hypothesized that the causal relationship between the IV and DV would vary as a function of fidelity as a moderator (Cohen et al., 2003).
O’Donnell, C. L. (2007).
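Written out, the hypothesized moderation takes the form of a standard interaction model (this is the generic specification described by Cohen et al., 2003, not an equation reproduced from the dissertation), where $Y_j$ is classroom mean achievement for classroom $j$, $T_j$ is the treatment indicator, and $F_j$ is the classroom's fidelity score:

$$Y_j = \beta_0 + \beta_1 T_j + \beta_2 F_j + \beta_3 (T_j \times F_j) + \varepsilon_j$$

The moderation hypothesis is that $\beta_3 \neq 0$, so the treatment effect, $\beta_1 + \beta_3 F_j$, depends on the level of fidelity.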

Step 3: Measure fidelity
We developed the Instructional Strategies Classroom Observation Protocol (O’Donnell, Lynch, & Merchlinsky, 2007), using the critical components identified by the curriculum analysis as our foundation. 24 items were developed, some dichotomous (Yes/No) and some polytomous (Likert-like scale, 0–3), to measure the degree of fidelity to each item. The problem was that the items were not on an interval scale and were not additive: subjects receiving the same fidelity score had different implementation profiles.
O’Donnell, C. L. (2007).

Instructional Strategies Classroom Observation Protocol (O’Donnell, Lynch, & Merchlinsky, 2007)

Step 4: Analyze fidelity data
I knew that 24 items analyzed separately would complicate the model conceptually and structurally, because multiple measures often inflate the standard errors of parameter estimates. I needed parsimony. So I computed a unidimensional fidelity score for each classroom using Rasch analysis. I mean-centered the fidelity score and entered it into my model. This avoided the dangers of removing low-fidelity implementers from the sample, or of creating a median split between high- and low-fidelity users (which loses continuous data).
O’Donnell, C. L. (2007).
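A hedged sketch of that step, assuming a hypothetical data frame with one row per classroom, a treatment indicator, a unidimensional (e.g., Rasch-scaled) fidelity score, and a classroom mean outcome. The Rasch scaling itself is done beforehand; the variable names and values here are illustrative:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical classroom-level data; `fidelity` is already a
# unidimensional score (e.g., on a Rasch logit metric).
df = pd.DataFrame({
    "treatment": [1, 1, 1, 0, 0, 0, 1, 0],
    "fidelity":  [1.2, -0.4, 0.8, -0.9, 0.3, -1.1, 2.0, 0.5],
    "outcome":   [64, 52, 60, 49, 55, 47, 70, 54],
})

# Mean-center fidelity so the treatment coefficient is the effect
# at average fidelity rather than at fidelity = 0.
df["fidelity_c"] = df["fidelity"] - df["fidelity"].mean()

# Moderation model: treatment, centered fidelity, and their interaction.
model = smf.ols("outcome ~ treatment * fidelity_c", data=df).fit()
print(model.params)
```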

R² = .616. There were no statistically significant differences between treatment and comparison classroom means for the observed instructional strategies, except for “Assisting teachers in identifying own students’ ideas.” However, 5 of the 8 criteria rated highly in the program were positively and significantly correlated with classroom mean achievement in the treatment classrooms; there were no positive correlations in the comparison classrooms.

Regression analysis testing for interaction effects showed that treatment classrooms were predicted to score points higher on the final assessment than comparison classrooms when their fidelity measure was high (2.40), t = , p < .05. There was no statistically significant difference in classroom mean achievement when classrooms’ fidelity measures were low (-.85) or medium (.78). (O’Donnell, 2007)
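One way to probe such an interaction, as a sketch: re-estimate the treatment effect with fidelity re-centered at low, medium, and high values. The -0.85, 0.78, and 2.40 values echo the fidelity levels reported on the slide; the data frame and column names are the same illustrative assumptions as in the earlier sketch:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same hypothetical classroom-level data as the earlier sketch.
df = pd.DataFrame({
    "treatment":  [1, 1, 1, 0, 0, 0, 1, 0],
    "fidelity_c": [0.9, -0.7, 0.5, -1.2, 0.0, -1.4, 1.7, 0.2],
    "outcome":    [64, 52, 60, 49, 55, 47, 70, 54],
})

# Simple-slopes probe: re-center fidelity at a given level so that the
# `treatment` coefficient is the predicted T-C difference at that level.
for label, level in [("low", -0.85), ("medium", 0.78), ("high", 2.40)]:
    df["fidelity_at"] = df["fidelity_c"] - level
    probe = smf.ols("outcome ~ treatment * fidelity_at", data=df).fit()
    print(f"{label:6s} fidelity: treatment effect = "
          f"{probe.params['treatment']:.2f}, p = {probe.pvalues['treatment']:.3f}")
```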

Item maps in the Rasch analysis showed that it was harder for teachers to teach the more reform-oriented practices with fidelity (items at the top of the map, e.g., accurate representations, justifying ideas); it was easier to teach the more traditional practices with fidelity (items at the bottom of the map, e.g., using terms appropriately).
O’Donnell, C. L. (2007).

Questions?

Conclusions

Know when & how to use fidelity data
Development – Use fidelity results to inform revisions. Decide now what components are required to deliver the intervention as intended when implemented at scale.
Efficacy – Monitor fidelity and relate it to outcomes to gain confidence that outcomes are due to the program (internal validity).
Replication – Determine whether levels of fidelity and program results under a specific structure replicate under other organizational structures.
Scale-up – Understand the implementation conditions, tools, and processes needed to reproduce positive effects under routine practice on a large scale (external validity). Are methods for establishing high fidelity financially feasible?

Bottom Line: If the intervention can be implemented with adequate fidelity under conditions of routine practice and yield positive results, scale it up. Source: O’Donnell, 2008

Questions & Discussion

Please feel free to contact me at any time:
Dr. Carol O’Donnell
Research Scientist
National Center for Education Research
Institute of Education Sciences
U.S. Department of Education
555 New Jersey Ave., NW, Room 610c
Washington, DC – 5521