1
IES Summer Research Training Institute: Single-Case Intervention Design and Analysis
August 18-22, 2014
The Lowell Center, Madison, Wisconsin
2
Institute Faculty, Participants, and Institute of Education Sciences
Assumptions/Goals and Objectives
Schedule and Logistics
Small Group Research Planning Sessions and Individual Consultation Sessions
Resources on Single-Case Design and Analysis
Follow Up with Institute Faculty
3
Thomas R. Kratochwill, PhD (University of Wisconsin-Madison)
Joel R. Levin, PhD (University of Arizona)
John Ferron, PhD (University of South Florida)
Erin Barton, PhD (Vanderbilt University)
Wendy Machalicek, PhD (University of Oregon)
William Shadish, PhD (University of California, Merced)
Carnac the Magnificent Statistician, PhD, PsyD, JD, MD, DDS (but…ABD) (Psychic University)
5
Scholars with some knowledge and expertise in traditional and single-case research methods (some basic and many advanced). Some with less experience, and maybe some skepticism about utility of single-case research methods. Commitment to the role of science in society and the importance of intervention research in psychology, education, and related fields.
6
Review Logic and Foundations of Single-Case Design Intervention Research
Review "pilot" Single-Case Design Standards: WWC design standards (Single-Case Designs) and WWC evidence criteria (Visual Analysis)
Summarize proposed approaches to visual and statistical analysis and effect size measures within single-case intervention research
Define criteria for documenting evidence-based practices using single-case intervention research methods
7
Agenda for the Institute (The Lowell Center) Breaks, Lunch, and Dinner Break-out Sessions Individual Project Consultations (see agenda schedule)
8
Format for Small Group Activities
Small Group Break-Out Rooms (Rooms will be B1B, 115, 116, and 117; reserved for the week)
Graduate Student Assistants: Elizabeth Hagermoser, MS; Megan Willes, MS
Schedule for Individual Project Consultations
9
Personal Design Consultation: Each participant leaves the Institute with a single-case research study (or program of study) that fits their grant/research agenda. Group Activity: Each participant leaves the Institute with the experience of working in a collaborative group to build a single-case study, and apply visual analysis, statistical analysis, and effect size measures.
10
20 min meetings with faculty (additional time available per schedule)
Specify design or analysis questions
Product: Personal Research Plan
Research Question(s)
Conceptual Model
Single-Case Design and anticipated data
Analysis plan
NOTE: Personal research plans are not required to be shared or disseminated. We want to respect the intellectual contributions of each scholar, yet provide a context where each participant leaves the institute with at least one new study that will meet WWC criteria.
11
Goal Each participant leaves the Institute with the experience of working in a collaborative group to build a single case study, and apply visual analysis, statistical analysis, and effect size measures.
12
Required Readings Recommended Readings Additional Resources Books and Monographs Faculty as a Resource
13
Thomas R. Kratochwill, PhD Wisconsin Center for Education Research 1025 West Johnson Street University of Wisconsin-Madison Madison, Wisconsin 53706 E-Mail: tomkat@education.wisc.edu
15
DAY 1 Logic and Foundations of Single-Case Intervention Research Joel R. Levin and Thomas R. Kratochwill University of Wisconsin-Madison
17
Tom Kratochwill
Logic and Foundations of Single-Case Intervention Research
Purposes and Fundamental Assumptions of Single-Case Intervention Research Methods
Defining features of SCDs
Core design types
Internal validity and the role of replication
Characteristics of Scientifically Credible Single-Case Intervention Studies
"True" Single-Case Applications and the WWC Standards (design and evidence credibility)
Classroom-Based Applications (design and evidence credibility)
18
Features of Single-Case Research Methods
Single-case research has four features:
Independent variable
Dependent variable
Focus on a functional relation (causal effect)
Dimension(s) of predicted change (e.g., level, trend, variability, score overlap)
19
Operational definition of dependent variable (DV) Measure of DV is valid, reliable, and addresses the dimension(s) of concern. Operational definition of independent variable (IV) Core features of IV are defined, and if necessary measured to document fidelity (see Sanetti & Kratochwill, 2014). Unit of IV implementation Group versus individual unit.
20
Background on Single-Case Designs: Defining Features
Design: Repeated measurement of an outcome before, during, and/or after active manipulation of the independent variable
Often Used in Applied and Clinical Fields:
Allows study of low-prevalence disorders where a large sample would otherwise be needed for statistical power (Odom et al., 2005).
Sometimes more palatable to service providers because SCDs do not include a no-treatment comparison group.
Hammond and Gast (2010): Descriptive analysis of "single subject" research designs, 1983-2007.
Shadish and Sullivan (2011): Characteristics of single-case designs used to assess intervention effects in 2008.
21
Hammond and Gast (2010) reviewed 196 randomly identified journal issues (from 1983-2007) containing 1,936 articles (a total of 556 single-case designs were coded). Multiple baseline designs were reported more often than withdrawal designs, and these were more often reported across individuals and groups.
22
Sullivan and Shadish (2011) assessed the WWC pilot Standards related to implementation of the intervention, acceptable levels of observer agreement/reliability, opportunities to demonstrate a treatment effect, and acceptable numbers of data points in a phase. In published studies in 21 journals in 2008, they found that nearly 45% of the research met the strictest WWC standards of design and 30% met with some reservations. So, it can be concluded from this sample that around 75% of the published research during a sampling year of major journals that publish single-case intervention research would meet (or meet with reservations) the WWC design standards.
23
Types of Research Questions that Can be Answered with Single-Case Design Types
Evaluate Intervention Effects Relative to Baseline: Does Multi-systemic Therapy reduce the level of problem behavior for students with emotional behavior disorders?
Compare Relative Effectiveness of Interventions: Is "function-based behavior support" more effective than "non-function-based support" at reducing the level and variability of problem behavior for this participant?
Compare Single- and Multi-Component Interventions: Does adding Performance Feedback to Basic Teacher Training improve the fidelity with which instructional skills are used by new teachers in the classroom?
24
Is a certain teaching procedure functionally related to an increase in the level of social initiations by young children with autism?
Is time delay prompting or least-to-most prompting more effective in increasing the level of self-help skills performed by young children with severe intellectual disabilities?
Is the pacing of reading instruction functionally related to increased level and slope of reading performance (as measured by ORF) for third graders?
Is Adderall (at clinically prescribed dosage) functionally related to increased level of attention performance on the Attention Network Test for elementary-age students with Attention Deficit Disorder?
25
Like RCTs, purpose is to document causal relationships Control for major threats to internal validity Document effects for specific individuals / settings Replication (across studies) required to enhance external validity Can be distinguished from case studies
26
Ambiguous Temporal Precedence
Selection
History
Maturation
Testing
Instrumentation
Additive and Interactive Effects of Threats
27
Examples:
Cluster Selection
Cluster Composition
Within-Cluster Variability
Attrition of Within-Cluster Participants and of Clusters
Within-Cluster Extraneous Variables
Across-Cluster Contagion Effects
29
Often characterized by narrative description of case, treatment, and outcome variables. Typically lack a formal design with replication but can involve a basic design format (e.g., A/B). Methods have been suggested to improve the drawing of valid inferences from case study research [e.g., Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York: Oxford University Press].
30
Type of data Assessment occasions Planned vs. ex post facto Projections of performance Treatment effect size Treatment effect impact Number of participants/replication Standardization of treatment Integrity of treatment
34
Professional agreement on the criteria for design and analysis of single-case research Better standards (materials) for training in single-case methods More precision in RFP stipulations and grant reviews Established expectations for reviewers Better standards for reviewing single-case intervention research Journal editors Reviewers Development of effect size and meta-analysis technology Consensus on what is required to identify “evidence-based practices.”
35
Publication criteria for peer reviewed journals Design, Analysis, Interpretation Grant review criteria (e.g., IES, NSF, NIMH/NIH) RFP stipulations, grant reviewer criteria Documentation of “Evidence-based Practices” Professional agreement Training expectations for new scholars Visual Analysis; Statistical Analysis Meta-analyses procedures that will allow single-case research content to reach broader audiences
36
Single-case researchers have a number of conceptual and methodological standards to guide their synthesis work. These standards, alternatively referred to as “guidelines,” have been developed by a number of professional organizations and authors interested primarily in providing guidance for reviewing the literature in a particular content domain (Smith, 2012; Wendt & Miller, 2012). The development of these standards has also provided researchers who are designing their own intervention studies with a protocol that is capable of meeting or exceeding the proposed standards.
37
Wendt and Miller (2012) identified seven "quality appraisal tools" and compared these standards to the single-case research criteria advanced by Horner et al. (2005). Smith (2012) reviewed research design and various methodological characteristics of single-case designs in peer-reviewed journals, primarily from the psychological literature (over the years 2000-2010). Based on his review, six standards for appraisal of the literature were identified (some of which overlap with the Wendt and Miller review).
38
National Reading Panel
American Psychological Association (APA) Division 12/53
American Psychological Association (APA) Division 16
What Works Clearinghouse (WWC)
Consolidated Standards of Reporting Trials (CONSORT)
Guidelines for N-of-1 Trials (the CONSORT Extension for N-of-1 Trials [CENT])
39
Single-case methods developed and traditionally used within Applied Behavior Analysis
Shavelson & Towne, 2002
Claims that Visual Analysis is Unreliable
Emergence of "Evidence-based" practices
IES commitment to Rigorous Education Research
40
Single-case methods developed and used within Applied Behavior Analysis
Shavelson and Towne (2002): Established RCT as "gold standard"; formally dismissed single-case research
Recent Investment by IES: Funding of grants focused on single-case methods; formal policy that single-case studies are able to document experimental control; inclusion of single-case options in IES RFPs
What Works Clearinghouse: Pilot SCD Standards White Paper; training IES/WWC reviewers; Single-Case Design Institute to Educate Researchers
41
Single-Case Intervention Research Design Standards Panel
Thomas R. Kratochwill, Chair, University of Wisconsin-Madison
John H. Hitchcock, Ohio University
Robert H. Horner, University of Oregon
Joel R. Levin, University of Arizona
Samuel M. Odom, University of North Carolina at Chapel Hill
David M. Rindskopf, City University of New York
William R. Shadish, University of California, Merced
Available at: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf
42
What Works Clearinghouse Standards: Design Standards, Evidence Criteria, Social Validity
43
Evaluate the Design (Is it possible to document experimental control?): Meets Design Standards / Meets with Reservations / Does Not Meet Design Standards (Stop)
Evaluate the Evidence (Do the data document experimental control?): Strong Evidence / Moderate Evidence / No Evidence (Stop)
Effect-Size Estimation
Social Validity Assessment (Is the effect something we should care about?)
46
Issues to Consider in Selecting a Single-Case Intervention Design
47
1. Experimental control: The design allows documentation of causal (e.g., functional) relations between independent and dependent variables.
2. Individual as unit of analysis: The individual provides their own control; a "group" or cluster can be treated as a participant with focus on the group as a single unit.
3. Independent variable is actively manipulated.
4. Repeated measurement of dependent variable: Direct observation at multiple points in time is often used; inter-observer agreement is used to assess "reliability" of the dependent variable.
5. Baseline: To document the social problem, and control for confounding variables.
48
6. Design controls for threats to internal validity: Opportunity for replication of the basic effect at 3 different points in time.
7. Visual Analysis/Statistical Analysis: Visual analysis documents the basic effect at three different points in time; statistical analysis options are emerging and presented during the Institute.
8. Replication: Within a study to document experimental control; across studies to document external validity; across studies, researchers, contexts, and participants to document Evidence-Based Practices.
9. Experimental flexibility: Designs may be modified or changed within a study (sometimes called response-guided research).
49
Reversal/Withdrawal Designs Multiple Baseline Designs Alternating Treatment Designs Others: Changing Criterion Non-Concurrent Multiple Baseline Multiple Probe
50
Useful in the iterative development of interventions.
Documentation of experimental effects that help define the mechanism for change, not just the occurrence of change.
Analysis of interventions targeting low-incidence populations (e.g., individual as unit of implementation, individuals with disabilities).
Useful for pilot research to assess the effect size needed for other research methods (RCTs).
Useful for fine-grained analysis of "weak and non-responders" (Negative Results; to be discussed later in the Institute).
Useful in group research when unique participants (e.g., non-responders) are further assessed to determine what modifications in the intervention may need to occur.
51
52
Evaluate the Design: Meets Design Standards / Meets with Reservations / Does Not Meet Design Standards
Evaluate the Evidence: Strong Evidence / Moderate Evidence / No Evidence
Effect-Size Estimation
Social Validity Assessment
53
WWC Single-Case Pilot Design Standards
Four Standards for Design Evaluation:
1. Systematic manipulation of the independent variable
2. Inter-assessor agreement
3. Three attempts to demonstrate an effect at three different points in time
4. Minimum number of phases and data points per phase, for phases used to demonstrate an effect
Standard 3 Differs by Design Type: Reversal/Withdrawal Designs (ABAB and variations), Alternating Treatments Designs, Multiple Baseline Designs
54
Standard 1: Systematic Manipulation of the Independent Variable Researcher Must Determine When and How the Independent Variable Conditions Change. If Standard Is Not Met, Study Does Not Meet Design Standards. 54
55
Examples of Manipulation that is Not Systematic Teacher/Consultee Begins to Implement an Intervention Prematurely Because of Parent Pressure. Researcher Looks Retrospectively at Data Collected during an Intervention Program. 55
56
Standard 2: Inter-Assessor Agreement Each Outcome Variable for Each Case Must be Measured Systematically by More than One Assessor. Researcher Needs to Collect Inter-Assessor Agreement: In each phase On at least 20% of the data points in each condition (i.e., baseline, intervention) Rate of Agreement Must Meet Minimum Thresholds: (e.g., 80% agreement or Cohen’s kappa of 0.60) If No Outcomes Meet These Criteria, Study Does Not Meet Design Standards. 56
57
In Current WWC Reviews: Author Queries Occur When Study Provides Insufficient IOA Information Determine if Standard is Met Based on Response If the result of the query indicates that the study does not meet standards, treat it as such. If No Response, Assume Standard is Met if: The minimum level of agreement is reached. The study assesses IOA at least once in each phase. The study assesses IOA on at least 20% of all sessions. Footnote is added to WWC Product Indicating that IOA Not Fully Determined. 57
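The Standard 2 requirements lend themselves to a simple mechanical check. The sketch below is illustrative only (not WWC software) and assumes a hypothetical per-session record format; it tests whether IOA was collected in every phase, on at least 20% of sessions in each condition, and whether the reported agreement meets the 80% agreement / 0.60 kappa thresholds.

```python
# Minimal sketch (not WWC software) of the Standard 2 checks described above.
# The session records and field names are hypothetical.

def ioa_sampling_ok(sessions, min_fraction=0.20,
                    min_percent_agreement=80.0, min_kappa=0.60):
    """sessions: list of dicts such as
    {"phase": "A1", "condition": "baseline",
     "ioa": None or {"percent_agreement": 87.0, "kappa": 0.62}}"""
    phases = {s["phase"] for s in sessions}
    conditions = {s["condition"] for s in sessions}

    # 1. IOA assessed at least once in every phase.
    for phase in phases:
        if not any(s["ioa"] for s in sessions if s["phase"] == phase):
            return False

    # 2. IOA assessed on at least 20% of sessions in each condition.
    for cond in conditions:
        cond_sessions = [s for s in sessions if s["condition"] == cond]
        assessed = [s for s in cond_sessions if s["ioa"]]
        if len(assessed) / len(cond_sessions) < min_fraction:
            return False

    # 3. Every reported agreement meets the minimum thresholds.
    for s in sessions:
        if s["ioa"] and not (s["ioa"].get("percent_agreement", 0) >= min_percent_agreement
                             or s["ioa"].get("kappa", 0) >= min_kappa):
            return False
    return True
```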
58
Standard 3: Three Attempts to Demonstrate an Intervention Effect at Three Different Points in Time “Attempts” Are about Phase Transitions Designs that Could Meet This Standard Include: ABAB design Multiple baseline design with three baseline phases and staggered introduction of the intervention Alternating treatment design (other designs to be discussed during the Institute) Designs Not Meeting this Standard Include: AB design ABA design Multiple baselines with three baseline phases and intervention introduced at the same time for each case 58
59
Standard 4: Minimum Number of Phases and Data Points per Phase (for Phases in Standard 3)
Meets Standards:
Reversal design: at least 4 phases, with at least 5 data points per phase
Multiple baseline design: at least 6 phases, with at least 5 data points per phase
Alternating treatment design: phases n/a; at most 2 data points per phase, with at least 5 data points per condition
Meets Standards with Reservations:
Reversal design: at least 4 phases, with at least 3 data points per phase
Multiple baseline design: at least 6 phases, with at least 3 data points per phase
Alternating treatment design: phases n/a; at most 2 data points per phase, with at least 4 data points per condition
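A minimal sketch of how the phase and data-point thresholds above translate into a rating for reversal/withdrawal and multiple baseline designs; alternating treatment designs follow the separate per-condition rule in the table. The function and its inputs are illustrative assumptions, and Standard 3 (three attempts at three different points in time) is assumed to have been verified separately.

```python
# Illustrative sketch of the Standard 4 thresholds for reversal/withdrawal and
# multiple baseline designs (hypothetical function, not WWC software).

def rate_design(design_type, phase_lengths):
    """design_type: 'reversal' or 'multiple_baseline'
    phase_lengths: number of data points in each phase used to show an effect."""
    min_phases = {"reversal": 4, "multiple_baseline": 6}[design_type]
    if len(phase_lengths) < min_phases:
        return "Does Not Meet Design Standards"
    shortest = min(phase_lengths)
    if shortest >= 5:
        return "Meets Design Standards"
    if shortest >= 3:
        return "Meets Design Standards with Reservations"
    return "Does Not Meet Design Standards"

print(rate_design("reversal", [6, 5, 5, 7]))  # Meets Design Standards
print(rate_design("reversal", [4, 3, 5, 4]))  # Meets Design Standards with Reservations
```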
61
Meets Design Standards with Reservations (MDSWR): 3 Attempts at 3 Different Points in Time; 4 Phases with At Least 3 Data Points per Phase. Adapted from Horner and Spaulding, 2010.
62
Meets Design Standards (MDS): 3 Attempts at 3 Different Points in Time; 6 Phases with At Least 5 Data Points per Phase. Source: Kern et al., 1994.
63
Ratings Differ by Research Question with ATDs; For Example:
Intervention 1 vs. Intervention 2: Meets Design Standards with Reservations (MDSWR)
Intervention 1 vs. Intervention 3: Does Not Meet Design Standards
Intervention 2 vs. Intervention 3: Does Not Meet Design Standards
Source: Horner and Spaulding, 2010
64
Extensions of Core Designs; To be Discussed on Day 2 of the Institute
Changing Criterion Designs: Researcher pre-schedules changes in the intervention criterion or intensity of the intervention; can meet evidence standards with at least 3 criterion shifts (for Standard 3)
Non-concurrent Multiple Baseline: Completely non-concurrent MBDs have baselines that do not overlap when examined vertically; designs with NO vertical overlap at baseline do not meet design standards because of the history threat
Multiple Probe: Multiple Probe (Days), Multiple Probe (Conditions)
65
Meets Design Standards:
IV manipulated directly
IOA documented (e.g., 80% agreement; .60 kappa) on at least 20% of data points in each phase
Design allows opportunity to assess the basic effect at three different points in time
Five data points per phase (or design equivalent); ATD (four comparison option)
Meets Design Standards with Reservations: All of the above, except at least three data points per phase
Does Not Meet Design Standards
66
[Diagram: multiple baseline design across four students (Students 1-4), with staggered A-to-B phase changes plotted against actual time]
69
ABAB Designs Multiple Baseline Designs Alternating Treatment Designs
70
Simple phase change designs [e.g., ABAB; BCBC design]. (In the literature, ABAB designs are sometimes referred to as withdrawal designs, intrasubject replication designs, within-series designs, or reversal designs)
71
ABAB Reversal/Withdrawal Designs In these designs, estimates of level, trend, and variability within a data series are assessed under similar conditions; the manipulated variable is introduced and concomitant changes in the outcome measure(s) are assessed in the level, trend, and variability between phases of the series, with special attention to the degree of overlap, immediacy of effect, and similarity of data patterns across similar phases (e.g., all baseline phases).
73
ABAB Reversal/Withdrawal Designs
Some Design Limitations:
Behavior must be reversible in the ABAB…series (e.g., return to baseline).
There may be ethical issues involved in reversing behavior back to baseline (the second A phase).
The study may be complex when multiple conditions need to be compared.
There may be order effects in the design.
74
Multiple baseline design. The design can be applied across units (participants), across behaviors, and across situations.
75
Multiple Baseline Designs In these designs, multiple AB data series are compared and introduction of the intervention is staggered across time. Comparisons are made both between and within a data series. Repetitions of a single simple phase change are scheduled, each with a new series and in which both the length and timing of the phase change differ across replications.
77
Some Design Limitations: The design is generally limited to demonstrating the effect of one independent variable on some outcome. The design depends on the “independence” of the multiple baselines (across units, settings, and behaviors). There can be practical as well as ethical issues in keeping individuals on baseline for long periods of time (as in the last series).
78
Alternating Treatment Designs
Alternating treatments (in the behavior analysis literature, alternating treatment designs are sometimes referred to as part of a class of multi-element designs)
79
In these designs, estimates of level, trend, and variability in a data series are assessed on measures within specific conditions and across time. Changes/differences in the outcome measure(s) are assessed by comparing the series associated with different conditions.
81
Some Design Limitations:
Behavior must be reversible during alternation of the interventions.
There is the possibility of interaction/carryover effects as conditions are alternated.
Comparing more than three treatments may be very challenging.
83
Does the design allow for the opportunity to assess experimental control?
Baseline
At least five data points per phase (3 w/reservation)
Opportunity to document at least 3 basic effects, each at a different point in time.
84
Basic Effect: Change in the pattern of responding after manipulation of the independent variable (level, trend, variability). Experimental Control: At least three demonstrations of basic effect, each at a different point in time.
85
[Graph annotations: Intervention X; first, second, and third demonstrations of the basic effect]
1. Baseline.
2. Each phase has at least 5 data points (3 w/reservation).
3. Design allows for assessment of the "basic effect" at three different points in time.
86
[Graph: Intervention X. Does Not Meet Standard]
87
[Graph: Intervention X, Intervention Y. Does Not Meet Standard]
88
[Graph: Intervention X. Does Not Meet Standard]
89
[Graph: Intervention X. Meets Standard with Reservation]
90
[Graph annotations: first, second, and third demonstrations of the basic effect]
91
Does Not Meet Standard
92
Meets Standard
94
Research Question: Is there a DIFFERENCE between the effects of two or more treatment conditions on the dependent variable?
Methodological Issues:
How many data points to show a functional relation? No current agreement in the field.
WWC Standards: Specify phase order to address sequence effects; at most 2 data points prior to alternating treatment.
Data points in each condition: At least 4 data points per condition to Meet Design Standards with Reservations; preferably 5 data points per condition to Meet Design Standards.
But…the lower the separation, or the higher the overlap, the more data points are needed to document experimental control.
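For alternating treatment designs specifically, a hedged sketch of the data-point rule from the Standard 4 table (at most 2 data points per phase; at least 5 per condition to meet standards, at least 4 with reservations). The function and counts are illustrative assumptions, not WWC software.

```python
# Hedged sketch of the alternating treatment design rule described above.
# points_per_condition: data points collected in each condition (hypothetical).

def rate_atd(points_per_condition, max_points_per_phase):
    if max_points_per_phase > 2:
        return "Does Not Meet Design Standards"
    fewest = min(points_per_condition)
    if fewest >= 5:
        return "Meets Design Standards"
    if fewest >= 4:
        return "Meets Design Standards with Reservations"
    return "Does Not Meet Design Standards"

print(rate_atd([6, 5], max_points_per_phase=2))  # Meets Design Standards
print(rate_atd([4, 4], max_points_per_phase=2))  # Meets Design Standards with Reservations
```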
95
[Graph legend: Attention, Escape, Play, Food conditions]
96
[Graph legend: Tangible, Escape, Attention, Control conditions]
97
[Graph legend: Attention, Escape, Play, Food conditions]
98
For each of the following graphs, note:
1) The design type
2) The highest possible rating (Meets Standards, Meets Standards with Reservations, Does Not Meet Standards).
99
Example 1. Source: Dunlap et al., 1994
100
Example 2. Source: Cunningham et al., 1998
101
Example 3. WCPM = words read correctly per minute. Source: Begeny, J. C., Daly III, E. J., and Valleley, R. J. (2006).
102
Example 4. Source: Ingram et al., 2005
103
Example 5. Source: Todd et al., 1999
104
For each example, note the following about IOA collection:
1) Collected for each case on each outcome variable?
2) Collected at least once in all phases?
3) Collected on at least 20% of the baseline sessions and on at least 20% of the intervention sessions?
4) Does IOA meet minimum acceptable values for each case on each outcome?
105
March and Horner (2002)
Note: Problem Behavior and Academic Engagement meet protocol screening requirements. Participants are Andy, Bill, and Cathy.
Interobserver agreement data were collected for problem behavior and academic engagement on at least 22% of observation periods for each phase for each participant. Two independent observers, using synchronized earphones to match observation intervals, monitored the behavior of a student. Interobserver agreement was calculated on an interval-by-interval basis by dividing the number of intervals with perfect agreement by the total number of intervals observed and multiplying by 100%. In addition, kappa was computed to assess reliability when chance agreement was controlled. Interobserver agreements for problem behavior for Andy, Bill, and Cathy were 87%, 80%, and 83%, respectively. Corresponding kappa scores were .60, .48, and .49. Interobserver agreements for academic engagement for Andy, Bill, and Cathy were 82%, 87%, and 88%, respectively, with kappa scores of .51, .59, and .58.
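The interval-by-interval calculation described in this example can be reproduced in a few lines. The sketch below uses hypothetical 0/1 interval codes from two observers; percent agreement follows the formula quoted above, and kappa applies the standard two-category chance correction. This is illustrative, not the authors' software.

```python
# Sketch of interval-by-interval percent agreement plus Cohen's kappa for
# two-category (occurrence/nonoccurrence) coding; observer data are hypothetical.

def percent_agreement(obs1, obs2):
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100.0 * agreements / len(obs1)

def cohens_kappa(obs1, obs2):
    n = len(obs1)
    p_observed = sum(a == b for a, b in zip(obs1, obs2)) / n
    # Chance agreement from each observer's marginal rate of coding "1".
    p1, p2 = sum(obs1) / n, sum(obs2) / n
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return (p_observed - p_chance) / (1 - p_chance)

observer_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
observer_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
print(percent_agreement(observer_a, observer_b))       # 90.0
print(round(cohens_kappa(observer_a, observer_b), 2))  # 0.8
```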
106
Reliability data for student and teacher behaviors were collected across 14 sessions for Patricia (19%), and across 2 sessions for Michael (8%). For disruptive behaviors, the mean percentage agreement across conditions was 92% (59–100), for academic compliance 94% (69–100), for praise 84% (0–100), and for reprimands 93% (0–100). Low rates reflected low incidence during sessions. On-task reliability was collected for six sessions (8%) for Patricia (M = 93%) and two sessions (8%) for Michael (M = 96%). 106
108
Joel Levin
110
WWC Standards Evaluating Single-Case Design Outcomes With Visual Analysis: Evidence Criteria 110
111
Evaluate the Design: Meets Design Standards / Meets with Reservations / Does Not Meet Design Standards
Evaluate the Evidence: Strong Evidence / Moderate Evidence / No Evidence
Effect-Size Estimation
Social Validity Assessment
112
Visual Analysis of Single-Case Evidence
Traditional Method of Data Evaluation for SCDs:
Determine whether evidence of a causal relation exists
Characterize the strength or magnitude of that relation
Singular approach used by the WWC for rating SCD evidence
Methods for Effect-Size Estimation:
Several parametric and non-parametric methods proposed
Some SCD WWC panel members are among those developing these methods, but the methods are still being tested and most are not comparable with group-comparison studies
WWC standards for effect size are being developed as the field reaches greater consensus on appropriate statistical approaches
113
Goal, Rationale, Advantages, and Limitations of Visual Analysis
Goal is to Identify Intervention Effects:
A basic effect is a change in the dependent variable in response to researcher manipulation of the independent variable.
"Subjective" determination of evidence, but practice and a common framework for applying visual analysis can help to improve agreement rates.
Evidence criteria are met by examining effects that are replicated at different points.
Encourages Focus on Interventions with Strong Effects:
Strong effects are generally desired by applied researchers and clinicians.
Weak results are filtered out because effects should be clear from looking at the data (viewed as an advantage).
Statistical evaluation can be more sensitive than visual analysis in detecting intervention effects.
114
Goal, Rationale, Advantages, Limitations (cont'd)
Statistical Evaluation and Visual Analysis Have Some Conceptual Similarities (Kazdin, 2011): Both attempt to avoid Type I and Type II errors
Type I: Concluding the intervention produced an effect when it did not
Type II: Concluding the intervention did not produce an effect when it did
Possible Limitations of Visual Analysis:
Lack of concrete decision-making rules (e.g., in contrast to the p < 0.05 criterion used in statistical analysis)
Multiple influences need to be analyzed simultaneously
115
Multiple Influences Need to be Considered in Applying Visual Analysis
Level: Mean of the data series within a phase
Trend: Slope of the best-fit line within a phase
Variability: Deviation of data around the best-fit line
Percentage of Overlap: Percentage of data from an intervention phase that enters the range of data from the previous phase
Immediacy: Magnitude of change between the last 3 data points in one phase and the first 3 in the next
Consistency: Extent to which data patterns are similar in similar phases
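As a rough illustration of how these features can be quantified, the sketch below computes them for a hypothetical baseline and intervention phase. The helper names and data are assumptions; variability is simplified to the standard deviation (the slide describes deviation around the best-fit line), and consistency, which compares like phases (e.g., the two baseline phases of an ABAB design), is omitted.

```python
# Illustrative computation of within- and between-phase features; data are hypothetical.
import statistics

def trend(y):
    """Slope of the ordinary least-squares line fit to a phase."""
    n = len(y)
    x_mean, y_mean = (n - 1) / 2, statistics.mean(y)
    num = sum((i - x_mean) * (yi - y_mean) for i, yi in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def describe_phase(y):
    return {"level": statistics.mean(y),        # mean of the series
            "trend": trend(y),                  # slope of the best-fit line
            "variability": statistics.stdev(y)} # spread of the data (simplified)

def percent_overlap(previous_phase, intervention):
    """Percentage of intervention data falling within the previous phase's range."""
    lo, hi = min(previous_phase), max(previous_phase)
    return 100.0 * sum(lo <= y <= hi for y in intervention) / len(intervention)

def immediacy(previous_phase, intervention):
    """Change between the last 3 points of one phase and the first 3 of the next."""
    return statistics.mean(intervention[:3]) - statistics.mean(previous_phase[-3:])

baseline = [12, 14, 13, 15, 14]
treatment = [8, 6, 5, 4, 5]
print(describe_phase(baseline), describe_phase(treatment))
print(percent_overlap(baseline, treatment), immediacy(baseline, treatment))
```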
116
Applied Outcome Criteria and Visual Analysis Decision Criteria in Visual Analysis Standards for Visual Analysis
117
Research on Visual Analysis Research on visual analysis contains a number of methodological limitations. These limitations have been recognized by Brossart et al. (2006, p. 536) in offering the following recommendations for improvement of visual-analysis research: Graphs should be fully contextualized, describing a particular client, target behavior(s), time frame, and data collection instrument. Judges should not be asked to predict the degree of statistical significance (i.e., a significance probability p-value) of a particular statistic, but rather should be asked to judge graphs according to their own criteria of practical importance, effect, or impact. 117
118
Research on Visual Analysis (Contd.) Judges should not be asked to make dichotomous yes/no decisions, but rather to judge the extent or amount of intervention effectiveness. No single statistical test should be selected as “the valid criterion”; rather, several optional statistical tests should be tentatively compared to the visual analyst’s judgments. Only graphs of complete SCD studies should be examined (e.g., ABAB, Alternating Treatment, and Multiple-Baseline Designs). 118
119
Some Recent Research Findings Lieberman, R. G., Yoder, P. J., Reichow, B., & Wolery, M. (2010). Visual analysis of multiple baseline across participants graphs when change is delayed. School Psychology Quarterly, 25, 28-44. Kahng, S. W., Chung, K-M., Gutshall, K., Pitts, S. C., Kao, J., & Girolami, K. (2010). Consistent visual analysis of intrasubject data. Journal of Applied Behavior Analysis, 43, 35-45. 119
120
Lieberman, Yoder, Reichow, and Wolery (2010) tested various characteristics of multiple-baseline designs to determine whether the data features affected the judgments of visual-analysis experts (N = 36 editorial board members of journals that publish SCDs) regarding the presence of a functional relation and agreement on the outcomes. It was found that graphs with steep slopes (versus shallow slopes) when the intervention was introduced were judged as more often having a functional relation. Nevertheless, there was still some disagreement on whether the functional relation had been established. Lieberman et al. (2010) noted that training visual judges to address conditions in which there is change long after the intervention, and where there is inconsistent latency of change across units, may be helpful in reviewers' concurrence about a functional relation.
121
Kahng, Chung, Gutshall, Pitts, Kao, and Girolami (2010) replicated and extended earlier research on visual analysis by including editorial board members of the Journal of Applied Behavior Analysis as participants in the study. Board members were asked to judge 36 ABAB design graphs on a 100-point scale while rating the degree of experimental control. These authors reported high levels of agreement among judges, noting that the reliability of visual analysis has improved over the years, due in part to better training in visual-analysis methods. 121
122
123
Erin Barton Training Protocols in Visual Analysis Overview of Visual Analysis of Single-Case Data Parameters Associated with Visual Analysis Four steps in visual analysis Six features considered in visual analysis Additional considerations for MBL and ATD Concerns about Visual Analysis
124
Evaluate the Design: Meets Design Standards / Meets with Reservations / Does Not Meet Design Standards
Evaluate the Evidence: Strong Evidence / Moderate Evidence / No Evidence
Effect-Size Estimation
Social Validity Assessment
125
Evidence Criteria: Strong
Baseline: Documentation of the research question "problem"; documentation of a predictable pattern (≥ 5 data points)
Each Phase of the Analysis: Documentation of a predictable pattern (≥ 5 data points)
Basic effects: Documentation of predicted change in the DV when the IV is manipulated
Experimental Control: Three demonstrations of the basic effect, each at a different point in time; no demonstrations of intervention failure
126
Evidence Criteria: Moderate
All of the "Strong" criteria, with these exceptions:
Only 3-4 data points per phase
Three demonstrations of effect, but with additional demonstrations of failure-to-document effect
No Evidence: A misnomer; the evidence does not meet the Moderate level.
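A minimal sketch of how the Strong/Moderate/No Evidence rules on these two slides could be expressed as a decision function. The inputs are counts taken from a reviewer's visual analysis and the function is hypothetical; the full WWC evidence criteria involve judgments not captured here.

```python
# Illustrative sketch of the Strong/Moderate/No Evidence decision rules above.

def rate_evidence(n_effects, n_failures, min_points_per_phase):
    if n_effects < 3:
        return "No Evidence"
    if n_failures == 0 and min_points_per_phase >= 5:
        return "Strong Evidence"
    if min_points_per_phase >= 3:
        return "Moderate Evidence"
    return "No Evidence"

print(rate_evidence(3, 0, 5))  # Strong Evidence
print(rate_evidence(3, 1, 5))  # Moderate Evidence (effects plus a failure)
print(rate_evidence(3, 0, 4))  # Moderate Evidence (only 3-4 points per phase)
print(rate_evidence(2, 0, 5))  # No Evidence
```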
127
Visual Analysis
Baseline: Document the "problem" requiring intervention; typically 5 or more data points; documentation of a pattern of responding that allows prediction into the future.
Each Phase: Documents a clear pattern of responding; typically 5 or more data points.
Adjacent phases: Do the data document a "basic effect"?
Whole study: Do the phases document experimental control (e.g., at least three demonstrations of a basic effect, each at a different point in time)?
128
Documenting Experimental Control Three demonstrations of a “basic effect” at three different points in time. A “basic effect” is a predicted change in the dependent variable when the independent variable is actively manipulated. To assess a “basic effect” Visual Analysis includes simultaneous assessment of: Level, Trend, Variability, Immediacy of Effect, Overlap across Adjacent Phases, Consistency of Data Pattern in Similar Phases. (Parsonson & Baer, 1978; Kratochwill & Levin, 1992) Visual Analysis within Single-Case Design
129
Assessing within-phase "pattern" and between-phase "basic effect"
Within Phase: Level, Trend, Variability
Between Phases: the above, plus Overlap, Immediacy of Effect, and Consistency across similar phases
Other: vertical analysis; intercept gap
131
Special Topics in Single-Case Intervention Research 131
132
Tom Kratochwill and Erin Barton Overview of Special Topics 132 Negative Results Effect Size Applications Applications of the WWC Standards in Literature Reviews
133
Negative Results in Single-Case Intervention Research 133
134
Negative Results in Single-Case Intervention Research The Legacy of Negative Results and its Relationship to Publication Bias The Importance of Negative Results in Developing Evidence-Based Practices (Kratochwill, Stoiber, & Gutkin, 2000) Negative Results in Single-Case Intervention Research Examples using the WWC Standards 134
135
Negative Results Definition The term negative results traditionally has meant that there are either: (a) no statistically significant differences between groups that receive different intervention conditions in randomized controlled trials; or (b) no documented differences (visually and/or statistically) between baseline and intervention conditions in experimental single-case designs. 135
136
Negative Results in Single-Case Design In the domain of SCD research, negative results reflect findings of (a) no difference between baseline (A) and intervention (B) phases (A = B), (b) a difference between baseline and intervention phases but in the opposite direction to what was predicted (A > B, where B was predicted to be superior to A), (c) no difference between two alternative interventions, B and C (B = C), or (d) a difference between two alternative interventions, but in the direction opposite to what was predicted (B > C, where C was predicted to be superior to B). 136
137
Negative Effects
Negative results/findings in SCD intervention research should be distinguished from negative effects in intervention research (i.e., iatrogenic effects). Some interventions may actually produce negative effects on participants (i.e., participants get worse or show negative side effects from an intervention); see, for example, Barlow (2010).
138
Selective Results Selective results refer to the withholding of any findings in a single study or in a replication series (i.e., a series of single-case studies in which the treatment is replicated several times in independent experiments; see also our discussion below for selective results issues in replication series) and can be considered as a part of the domain of negative results. 138
139
Erroneous Results Erroneous results have been considered in traditional “group” research in situations where various statistical tests are incorrectly conducted or interpreted to yield findings that are reported as statistically significant but are found not to be when the correct test or interpretation is applied (e.g., Levin, 1985). Also included in the erroneous results category are “spurious” findings that are produced in various research contexts. 139
140
Erin Barton Example Negative Results Research 140
141
142
Applications of the WWC Standards in Literature Reviews 142
143
Five studies documenting experimental control (i.e., MDS or MDSWR)
Conducted by at least three research teams with no overlapping authorship at three different institutions
The combined number of cases totals at least 20
Each study demonstrates an effect size of ___ ??
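A hedged sketch of checking these thresholds against a set of reviewed studies. The study records and field names are assumptions, with "team" standing in for the non-overlapping authorship/institution requirement.

```python
# Illustrative check of the "5 studies / 3 teams / 20 cases" thresholds above.

def documents_practice(studies):
    """studies: list of dicts like {"rating": "MDS", "team": "A", "n_cases": 4}."""
    credible = [s for s in studies if s["rating"] in ("MDS", "MDSWR")]
    return (len(credible) >= 5
            and len({s["team"] for s in credible}) >= 3
            and sum(s["n_cases"] for s in credible) >= 20)

studies = ([{"rating": "MDS", "team": "A", "n_cases": 4}] * 3
           + [{"rating": "MDSWR", "team": "B", "n_cases": 5},
              {"rating": "MDS", "team": "C", "n_cases": 6}])
print(documents_practice(studies))  # True: 5 credible studies, 3 teams, 23 cases
```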
144
A systematic evaluation of token economies as a classroom management tool for students with challenging behavior (Maggin, Chafouleas, Goddard, & Johnson, 2011)
Studies documenting experimental control: n = 7/3 (MDS: student/classroom), 4/0 (MDSWR: student/classroom)
At least three settings/scholars: yes
At least 20 participants: no
EVIDENCE CRITERIA: Strong evidence (n = 1 at the student level and n = 3 at the classroom level); Moderate evidence (n = 8 at the student level and n = 0 at the classroom level); No evidence (n = 2 at the student level and n = 0 at the classroom level)
145
An application of the What Works Clearinghouse Standards for evaluating single-subject research: Synthesis of the self-management literature base (Maggin, Briesch, & Chafouleas, 2013)
Studies documenting experimental control: n = 37 (MDS), n = 31 (MDSWR)
At least three settings/scholars: yes
At least 20 participants: yes
EVIDENCE CRITERIA: Strong evidence (n = 25); Moderate evidence (n = 30); No evidence (n = 13)
146
Evidence-based is not enough
In addition to the features of the practice: define what outcomes, when/where used, by whom, with what target populations, and at what fidelity.
The innovative practice needs to be not only evidence-based, but dramatically easier and better than what is already being used.
The practice should be defined conceptually as well as procedurally, to allow guidance for adaptation.
147
Role for single-case research in the development of PROGRAMS of intervention research
Iterative development of interventions
Documentation of effective practices
Documentation of modifications for weak and non-responders
148
Increase precision of Research Questions
Define the conceptual logic for the research question
Define the research question with greater precision: Is the IV related to change in level, trend, or variability?
"Is there a functional relation between self-management interventions and reduction in the level and variability of problem behavior?"
Measures: Define assumptions; Distribution (counts)
149
Baseline: 5 data points; document the "problem" under study; document predictable patterns
Data points per phase: At least 5 points per phase (maybe more for some effect size measures); more data points when the data indicate elevated trend and/or variability
Combination of visual and statistical analysis: Visual analysis confirmed with statistical analysis
Need for effect size measures in single-case designs: For individual studies; for meta-analyses
150
Single-case methods are an effective and efficient approach for documenting experimental effects.
A need exists for more precise standards for training and using visual analysis, and for combinations of visual analysis with statistical analysis.
There are encouraging (but still emerging) approaches for statistical analysis that will improve meta-analysis options.
More precision in review stipulations establishes expectations for reviewers.
151
Barlow, D. H. (2010). Negative effects from psychological treatments: A perspective. American Psychologist, 65, 13-20.
Brossart, D. F., Parker, R. I., Olson, E. A., & Mahadevan, L. (2006). The relationship between visual analysis and five statistical analyses in a simple AB single-case research design. Behavior Modification, 30, 531-563.
152
Hammond, D. & Gast, D. L. (2010). Descriptive analysis of single-subject research designs: 1983-2007. Education and Training in Autism and Developmental Disabilities, 45, 187-202. Hartmann, D. P., Barrios, B. A., & Wood, D. D. (2004). Principles of behavioral observation. In S. N. Haynes and E. M. Hieby (Eds.), Comprehensive handbook of psychological assessment (Vol. 3, Behavioral assessment) (pp. 108-127). New York: John Wiley & Sons.
153
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165-179. Horner, R., & Spaulding, S. (2010). Single-case research designs (pp. 1386-1394). In N. J. Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage Publications.
154
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single-case designs technical documentation. In What Works Clearinghouse: Procedures and standards handbook (version 2.0). Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_procedures_v2_standards_handbook.pdf
Maggin, D. M., Chafouleas, S. M., Goddard, K. M., & Johnson, A. H. (2011). A systematic evaluation of token economies as a classroom management tool for students with challenging behavior. Journal of School Psychology, 49, 529-554.
155
Maggin, D. M., Briesch, A. M., & Chafouleas, S. M. (2013). An application of the What Works Clearinghouse Standards for evaluating single-subject research: Synthesis of the self-management literature base. Remedial and Special Education, 34, 44-58.
Kratochwill, T. R., Stoiber, K. C., & Gutkin, T. B. (2000). Empirically supported interventions in school psychology: The role of negative results in outcome research. Psychology in the Schools, 37, 399-413.
Levin, J. R. (1985). Some methodological and statistical "bugs" in research on children's learning. In M. Pressley & C. J. Brainerd (Eds.), Cognitive learning and memory in children (pp. 204-233). New York, NY: Springer-Verlag.
156
Odom, S.L., Brantlinger, E., Gersten, R., Horner, R. H., Thompson, B., & Harris, K. (2005). Research in special education: Scientific methods and evidence-based practices. Exceptional Children, 71, 137–148. Parsonson, B., & Baer, D. (1978). The analysis and presentation of graphic data. In T. Kratochwill (Ed.) Single Subject Research (pp. 101–166). New York: Academic Press. Reichow, B., Barton, E. E., Sewell, J. N., Good, L., & Wolery, M. (2010). Effects of weighted vests on the engagement of children with developmental delays and autism. Focus on Autism and Other Developmental Disabilities, 25, 3-11.
157
Shadish, W. R., & Sullivan, K. J. (2011). Characteristics of single-case designs used to assess treatment effects in 2008. Behavior Research Methods, 43, 971-980. DOI 10.3758/s13428-011-0111-y
Smith, J. D. (2012). Single-case experimental designs: A systematic review of published research and current standards. Psychological Methods, 17, 510-550.
Sullivan, K. J., & Shadish, W. R. (2011). An assessment of single-case designs by the What Works Clearinghouse.
158
Wendt, O., & Miller, B. (2012). Quality appraisal of single-subject experimental designs: An overview and comparison of different appraisal tools. Education and Treatment of Children, 35, 235–268.