Non-Overlap Methods in Single Case Research Methodology Erin E. Barton, PhD, BCBA-D.

Presentation transcript:


Non-overlap Methods 1. PND 2. PEM (ECL) 3. PEM-T 4. PAND 5. R-IRD 6. NAP 7. Tau-U

Rationale: Non-overlap Methods 1. Need to aggregate across studies to determine evidence for practice. Meta-analysis is a well-established practice for group experiments: –Magnitude –Aggregate findings –Moderator analyses

Rationale: Non-overlap Methods 1. Need to aggregate across studies to determine evidence for practice –Many have argued that SCRD will not be included in reviews of evidence-based practices unless an effect size estimator is used –SCRD studies are often left out of reviews in disciplines outside of special education

Rationale: Non-overlap Methods 2.Emphasis on “effect sizes” in education research to quantify the magnitude –Standardized effect sizes are particularly valued –Reviewers compare results across studies having different outcome measures that otherwise could not be easily compared

Rationale: Non-overlap Methods 3. Meta-analytic techniques are compromised when data are serially dependent. SCR data are collected: –On a single individual –Using the same definitions –Using the same data collection procedures –In the same context –Under the same procedures –Often with short intervals between observations

Rationale: Non-overlap Methods 4.Current meta-analytic techniques are inappropriate for aggregating SCR data  The data patterns must shift consistently in the predicted (therapeutic) direction with each change in experimental condition  Replication logic is used to make judgments about functional relations  The design needs an adequate number of replications of the experimental conditions (internal validity)

Rationale: Non-overlap Methods 5.Non-parametric techniques needed Short data sets or few data points Non-normal or unknown distributions Unknown parameters

PND: Percent of Non-overlapping Data

One of the oldest of the overlap methods (Scruggs & Mastropieri, 1998; Scruggs, Mastropieri, & Casto, 1987). Used extensively. Easily calculated. Does not assume data are independent.

Calculating PND 1. Identify the intended change 2. Draw a straight line from the highest (or lowest) point in Phase A through Phase B and count the number of data points in Phase B above (or below) the line 3. PND = (# of Phase B points above or below the line / total # of points in Phase B) × 100
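The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the function name `pnd`, the `increase_expected` flag, and the example data are hypothetical, and a Phase B point that ties the most extreme baseline point is assumed to count as overlapping.

```python
def pnd(phase_a, phase_b, increase_expected=True):
    """Percent of Phase B points beyond the most extreme Phase A point."""
    if not phase_a or not phase_b:
        raise ValueError("both phases need at least one data point")
    if increase_expected:
        reference = max(phase_a)  # highest baseline point
        beyond = sum(1 for y in phase_b if y > reference)
    else:
        reference = min(phase_a)  # lowest baseline point
        beyond = sum(1 for y in phase_b if y < reference)
    # Ties with the reference line are treated as overlap (an assumption).
    return 100.0 * beyond / len(phase_b)


# Hypothetical data: four of five Phase B points exceed the baseline maximum.
print(pnd([2, 3, 1, 4], [5, 6, 4, 7, 8]))
# One baseline outlier (9) is enough to drive PND to zero.
print(pnd([2, 3, 1, 9], [5, 6, 4, 7, 8]))
```

The second call shows why a single extreme baseline point can dominate the index: the whole calculation depends on one datum.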

Interpreting PND: >70% is effective, 50% to 70% is questionable effectiveness, and <50% is no observed effect (Scruggs & Mastropieri, 1998)

Practice: Calculating PND Schilling (2004), David In Seat A1 to B1: B1 to A2: A2 to B2: Engaged A1 to B1: B1 to A2: A2 to B2:

Practice: Calculating PND Schilling (2004), David In Seat A1 to B1: 22% B1 to A2: 100% A2 to B2: 100% AVERAGE: 74% Engaged A1 to B1: 100% B1 to A2: 100% A2 to B2: 100% AVERAGE: 100%

Practice: Calculating PND Vaughn (2002) Disruptive Behavior Arrival: 100% Mealtime: 75% Departure: 100% AVERAGE: 92% Engaged Arrival: 100% Mealtime: 75% Departure: 100% AVERAGE: 92%

Outliers PND = 0% Functional Relation?

Trends PND = 0% Functional Relation?

Magnitude PND = 100% Functional Relation?

Outliers PND = 0% Functional Relation?

BUT NO ONE USES A-B-A-B….. You might be thinking….

Functional Relation? PND = 68% PND = 43% PND = 86% Social Initiations to Peers

PND = 68% PND = 43% PND = 86% PND = 66% TREND?

Functional Relation? PND = 0% Social Initiations to Peers

PND = 0% MAGNITUDE?

Functional Relation? PND = 100% Social Initiations to Peers

WHAT DOES THAT MEAN? You SHOULD be thinking….

PND Flaws Compared with consensus visual analysis, PND resulted in an error in about 1 of 5 condition changes—high rate of errors (Wolery, Busick, Reichow, & Barton, 2010) Compromised by: –Longer data sets, # of data points –Variability –Outliers –Trends Should not be used (Brossart et al., 2013; Kratochwill et al., 2010; Parker & Vannest, 2009)

PND Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Level 2.Trend 3.Variability 4.Immediacy 5.Overlap 6.Consistency  Vertical analysis

PAND: Percent of All Non-overlapping Data

Not compromised by serial dependency or other data assumptions Percentage of data remaining after determining the fewest data points that must be removed to eliminate all between-phase overlap

Calculating PAND 1. Count the total number of data points in the comparison 2. Identify how many data points need to be removed to eliminate overlap 3. Count the number of remaining data points 4. Divide the count in step 3 by the count in step 1 5. Multiply by 100
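One way to sketch these steps in Python is to try every possible cut point between the phases and keep the one that removes the fewest points. The function name `pand`, the cut-point search, and the rule that tied values count as overlap are assumptions made for illustration; the data are hypothetical.

```python
def pand(phase_a, phase_b, increase_expected=True):
    """Percent of all data remaining after the fewest removals that end overlap."""
    if not phase_a or not phase_b:
        raise ValueError("both phases need at least one data point")
    if not increase_expected:
        # Flip signs so that "higher" always means improvement.
        phase_a = [-y for y in phase_a]
        phase_b = [-y for y in phase_b]
    total = len(phase_a) + len(phase_b)            # step 1
    cuts = sorted(set(phase_a) | set(phase_b))
    cuts.append(cuts[-1] + 1)                      # a cut above every value
    fewest_removed = total
    for cut in cuts:                               # step 2
        # Keep Phase A points below the cut and Phase B points at or above it.
        removed = sum(1 for y in phase_a if y >= cut)
        removed += sum(1 for y in phase_b if y < cut)
        fewest_removed = min(fewest_removed, removed)
    remaining = total - fewest_removed             # step 3
    return 100.0 * remaining / total               # steps 4 and 5


# Hypothetical data: removing the single baseline point 9 eliminates all overlap.
print(pand([2, 3, 4, 5, 9], [6, 7, 8, 10, 11]))
```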

Practice: Calculating PAND Schilling (2004), David In Seat A1 to B1: 94% B1 to A2: 100% A2 to B2: 100% AVERAGE: 98% (PND Average was 74%) Engaged A1 to B1: 100% B1 to A2: 100% A2 to B2: 100% AVERAGE: 100% (PND Average was 100%)

Practice: Calculating PAND Vaughn (2002) Disruptive Behavior Arrival: 100% Mealtime: 85% Departure: 100% AVERAGE: 95% PND was 92% Engaged Arrival: 100% Mealtime: 85% Departure: 100% AVERAGE: 95% PND was 92% Adding more data points Extinction burst Variability

PAND = 96% PAND = 93% Social Initiations to Peers

PAND = 96% PAND = 93% PAND = 94% MAGNITUDE?

PAND = 100% Social Initiations to Peers

PAND Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Level 2.Trend 3.Variability 4.Immediacy 5.Overlap 6.Consistency  Vertical analysis Should not be used (Brossart et al., 2013; Manolov et al., 2010)

PEM: Percent Exceeding the Median (Ma, 2006)

Not compromised by serial dependency and other data assumptions Designed to eliminate problem with baseline datum point being at floor or ceiling –Designed to not rely on the most extreme datum point –Less influenced by variability in baseline

Calculating PEM 1. Draw a line at the median of the Phase A data and extend it through the Phase B data 2. Count the number of data points in Phase B above (or below) the line, divide by the total number of data points in Phase B, and multiply by 100
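A minimal Python sketch of the two steps follows. The name `pem` and the example data are hypothetical, and Phase B points that fall exactly on the median line are assumed not to count as exceeding it.

```python
from statistics import median


def pem(phase_a, phase_b, increase_expected=True):
    """Percent of Phase B points on the improved side of the Phase A median."""
    if not phase_a or not phase_b:
        raise ValueError("both phases need at least one data point")
    line = median(phase_a)  # step 1: horizontal line at the baseline median
    if increase_expected:
        exceeding = sum(1 for y in phase_b if y > line)
    else:
        exceeding = sum(1 for y in phase_b if y < line)
    # Points exactly on the line are not counted (an assumption).
    return 100.0 * exceeding / len(phase_b)  # step 2


# Hypothetical data with a baseline outlier (9): the median is 2.5,
# so the outlier does not affect the result.
print(pem([2, 3, 1, 9], [5, 6, 4, 7, 8]))
```

Because the line sits at the median rather than at the most extreme baseline point, a single outlier in Phase A does not change the result.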

PEM Flaws Compared with consensus visual analysis, PEM resulted in an error in about 1 of 6 condition changes—high rate of errors (Wolery et al., 2010) Should not be used (Parker, Vannest, & Davis, 2011)

PEM Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Level 2.Immediacy 3.Overlap 4.Consistency  Vertical analysis

PEM-T: Percent Exceeding the Median Trend Line ECL: Extended Celeration Line

PEM-T: Percent Exceeding the Median Trend Line (Wolery et al., 2010) Not compromised by serial dependency and other data assumptions Designed to eliminate problem with baseline datum point being at floor or ceiling –Designed to not rely on the most extreme datum point –Less influenced by variability in baseline –Less influenced by trends in data

Calculating PEM-T 1. Graph data on a semi-logarithmic chart 2. Calculate and draw a split-middle line of trend estimation for the Phase A data and extend it through Phase B 3. Count the # of Phase B data points above/below the split-middle line of trend estimation 4. Divide the count from Step 3 by the # of data points in Phase B and multiply the quotient by 100
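The sketch below illustrates steps 2 to 4 under simplifying assumptions: sessions are numbered 1, 2, 3, ... across both phases, the data are used on their raw scale (the semi-logarithmic charting in step 1 is skipped), and the split-middle line is drawn through the median point of each half of Phase A without any further adjustment. The names `split_middle_line` and `pem_t` and the data are hypothetical.

```python
from statistics import median


def split_middle_line(values):
    """Return (slope, intercept) of a line through the median point of each half."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two baseline points")
    half = n // 2  # the middle point is dropped when n is odd
    x1 = median(range(1, half + 1))
    y1 = median(values[:half])
    x2 = median(range(n - half + 1, n + 1))
    y2 = median(values[n - half:])
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def pem_t(phase_a, phase_b, increase_expected=True):
    """Percent of Phase B points beyond the extended Phase A trend line."""
    if not phase_b:
        raise ValueError("Phase B needs at least one data point")
    slope, intercept = split_middle_line(phase_a)
    count = 0
    for i, y in enumerate(phase_b, start=len(phase_a) + 1):
        predicted = slope * i + intercept  # trend line extended into Phase B
        if (y > predicted) if increase_expected else (y < predicted):
            count += 1
    return 100.0 * count / len(phase_b)


# Hypothetical data: the baseline rises by 1 per session, so Phase B points
# must exceed the projected trend (7, 8, 9, 10) to count.
print(pem_t([1, 2, 3, 4, 5, 6], [8, 9, 8, 12]))
```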

PEM-T Flaws Compared with consensus visual analysis, PEM-T resulted in an error in about 1 of 8 condition changes—high rate of errors (Wolery et al., 2010)

PEM-T Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Level 2.Trend 3.Variability 4.Immediacy 5.Overlap 6.Consistency  Vertical analysis

So what, right?

Social Stories for Children with ASD 20 studies met design standards with or without reservations, exceeding the WWC minimum of five studies. The studies were conducted across 3 research groups. Qi & Barton, under review

Social Stories for Children with ASD Using Non-overlap indices: –28 participants (51%) had a PND score higher than 70; 41 (75%) had a PEM score higher than 70; 40 (73%) had a PEM-T score higher than 70; and 50 (91%) had a PDO 2 score higher than 70. Using visual analyses: –only 13 participants were included across the one study that provided strong evidence and in the six studies that provided moderate evidence. Qi & Barton, under review

Social Stories for Children with ASD Based on visual analysis, social stories interventions were not considered an EBP according to WWC criteria. Based on non-overlap indices, social stories interventions were considered an EBP according to WWC criteria. Qi & Barton, under review

R-IRD: Robust Improvement Rate Difference

Not compromised by serial dependency and other data assumptions. Not about rate. The IRD calculation begins as PAND, but in a second step converts the results to two improvement rates (IR), for Phases A and B respectively. The two IR values are finally subtracted to obtain the “Improvement Rate Difference” (IRD). R-IRD requires rebalancing (by hand) of a 2 x 2 matrix.

R-IRD: Robust Improvement Rate Difference The original IRD article recommended that in the first step, data point ‘removal’ “should be balanced across the contrasted phases” (Parker et al., 2009, p. 141) for more robust results. A better robust IRD solution was later described and formalized as “Robust IRD” (R-IRD). R-IRD requires rebalancing (by hand) of a 2 x 2 matrix. IRD is interpreted as the difference in the proportion of high or “improved” scores between Phases B and A.

R-IRD: Robust Improvement Rate Difference The superior robust version of IRD (R-IRD) requires that quadrants be balanced. –Balancing is needed when a large number of data points are removed arbitrarily from one side and only a few from the other… –Does not allow bias in removal of data points from A versus B, as some datasets provide two or more equally good removal solutions.

Calculating R-IRD 1.Determine the fewest data points that must be removed to eliminate overlap 2.Balance quadrant W and Z 3.Then balance Y = A –Phase A: W / (W + Y) 4.Then balance X = B –Phase B: X / (X + Z) 5.R - IRD = B – A
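The arithmetic in steps 3 to 5 can be sketched as follows. The quadrant labels are read here as an assumption: W and Y are the Phase A points counted as improved and not improved, and X and Z are the Phase B points counted as improved and not improved, so that W + Y is the Phase A total and X + Z is the Phase B total. Balancing is assumed to mean splitting the removed points equally between W and Z. The function names and the example counts are hypothetical.

```python
def balanced_quadrants(n_a, n_b, n_removed):
    """Split the removed points equally between W (Phase A) and Z (Phase B)."""
    w = z = n_removed / 2   # balance quadrants W and Z (assumed reading)
    y = n_a - w             # remaining Phase A points
    x = n_b - z             # remaining Phase B points
    return w, x, y, z


def ird(w, x, y, z):
    """Improvement rate difference from the four quadrant counts."""
    ir_a = w / (w + y)      # Phase A improvement rate
    ir_b = x / (x + z)      # Phase B improvement rate
    return ir_b - ir_a      # R-IRD = B - A


# Hypothetical case: 5 baseline points, 8 intervention points, and
# 2 points that must be removed to eliminate overlap.
w, x, y, z = balanced_quadrants(5, 8, 2)
print(round(ird(w, x, y, z), 3))
```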

Calculating R-IRD ors/ird Flaws: Length of data can impact (Brossart et al., 2013; Manolov et al., 2011)

NAP: Non-overlap of All Pairs

The percentage of data that improve from A to B or operationally, the percentage of all pairwise comparisons from Phase A to B which show improvement or growth (Parker & Vannest, 2009)

Calculating NAP 1. NAP begins with all pairwise comparisons (#Pairs = n_A × n_B) between phases. 2. Each paired comparison has one of three outcomes: improvement over time (Pos), deterioration (Neg), or no change over time (Tie). 3. NAP is calculated as (Pos + 0.5 × Tie) / #Pairs.
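A short Python sketch of these steps is given below. The function names are hypothetical, and the example data are invented so that a 5-point Phase A and an 8-point Phase B yield 34 Pos, 4 Neg, and 2 Tie pairs; they are not taken from any study.

```python
def pairwise_counts(phase_a, phase_b, increase_expected=True):
    """Compare every Phase A point with every Phase B point."""
    pos = neg = tie = 0
    for a in phase_a:
        for b in phase_b:
            diff = (b - a) if increase_expected else (a - b)
            if diff > 0:
                pos += 1    # improvement from A to B
            elif diff < 0:
                neg += 1    # deterioration from A to B
            else:
                tie += 1    # no change
    return pos, neg, tie


def nap(phase_a, phase_b, increase_expected=True):
    """Non-overlap of All Pairs: (Pos + 0.5 * Tie) / #Pairs."""
    pos, neg, tie = pairwise_counts(phase_a, phase_b, increase_expected)
    return (pos + 0.5 * tie) / (pos + neg + tie)


# Illustrative data only.
phase_a = [1, 2, 3, 4, 7]
phase_b = [5, 6, 7, 8, 9, 10, 11, 3]
print(pairwise_counts(phase_a, phase_b))
print(nap(phase_a, phase_b))
```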

Practice: Calculating NAP Phase A: Phase B:

Phase A: N = 5, Phase B: N = 8 # of Pairs = 5 × 8 = 40; #Pos = 34, #Neg = 4, #Tie = 2 NAP = (#Pos + 0.5 × #Tie) / #Pairs = (34 + 0.5 × 2) / 40 = 0.875

NAP Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Level 2.Trend 3.Variability 4.Immediacy 5.Consistency 6.Overlap  Vertical analysis

Tau-U Extension of NAP – but can control for trend.

Tau-U: (Kendall’s Tau + Mann-Whitney U) NAP’s major limitation, insensitivity to data trend, led to the development of a new index that integrates non-overlap and trend: Tau-U (Parker, Vannest, Davis, & Sauber, 2011). It melds Kendall’s Rank Correlation (KRC) and Mann-Whitney U (MW-U), which are transformations of one another and share the same S sampling distribution. The Tau-U score is not affected by the ceiling effect present in other non-overlap methods, and performs well in the presence of autocorrelation. NAP is the percent of non-overlapping data, whereas Tau-U is the percent of non-overlapping minus overlapping data. Can control for baseline trends.

Calculating Tau-U Simplest Tau (non-overlap only): Conduct the same pairwise comparisons (n_A × n_B = #Pairs) across phases as in NAP, resulting in a Pos, Neg, or Tie for each pair. The simple non-overlap form of Tau (not considering trend) is Tau = (Pos - Neg) / #Pairs. Tau-U can control for baseline trend.
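The simple form can be sketched in Python as below, together with one common way of expressing the baseline-trend correction. The correction shown, subtracting the Kendall S of Phase A from the between-phase S before dividing by #Pairs, is an assumption about how the adjustment is usually written and is not spelled out on the slide. The function names and data are hypothetical, and higher scores are assumed to mean improvement.

```python
def sign(value):
    """Return 1, -1, or 0 according to the sign of value."""
    return (value > 0) - (value < 0)


def s_between(phase_a, phase_b):
    """Pos minus Neg over all Phase A by Phase B pairs."""
    return sum(sign(b - a) for a in phase_a for b in phase_b)


def s_within(phase):
    """Kendall S for trend within one phase (later point minus earlier point)."""
    n = len(phase)
    return sum(sign(phase[j] - phase[i]) for i in range(n) for j in range(i + 1, n))


def tau(phase_a, phase_b):
    """Simple non-overlap Tau = (Pos - Neg) / #Pairs."""
    return s_between(phase_a, phase_b) / (len(phase_a) * len(phase_b))


def tau_u(phase_a, phase_b):
    """Tau with Phase A trend removed (assumed form of the correction)."""
    pairs = len(phase_a) * len(phase_b)
    return (s_between(phase_a, phase_b) - s_within(phase_a)) / pairs


# Illustrative data only: 34 Pos, 4 Neg, and 2 Tie pairs out of 40.
phase_a = [1, 2, 3, 4, 7]
phase_b = [5, 6, 7, 8, 9, 10, 11, 3]
print(tau(phase_a, phase_b))
print(tau_u(phase_a, phase_b))
```

With a steadily rising baseline, as in this invented example, the corrected value is lower than the simple Tau because part of the apparent improvement is attributed to the trend already present in Phase A.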

Practice: Calculating TauU Phase A: Phase B:

Phase A: N = 5, Phase B: N = 8 # of Pairs = 5 × 8 = 40; #Pos = 34, #Neg = 4, #Tie = 2 Tau-U = (#Pos - #Neg) / #Pairs = (34 - 4) / 40 = 0.75

Practice Tau-U = -.82 Tau-U =.79 Tau-U = pairs, 2+, 6=, pairs, 40+, 6=, pairs, 1+, 6=, 35-

Calculating Tau-U

Calculating Tau-U Schilling (2004), David In Seat A1 to B1: 1 (0.54, 1.46) B1 to A2: -1 (-1.59, -.41) A2 to B2: 1 (.36, 1.64) PAND Average: 98% (PND Average was 74%) Engaged A1 to B1:.83 (.37, 1.29) B1 to A2: -1 (-1.59, -.41) A2 to B2: 1 (.36, 1.64) PAND Average: 100% (PND Average was 100%)

Calculating Tau-U Vaughn (2002) Disruptive Behavior Arrival: -1 (-1.708, -0.23) Mealtime: -.67 (-1.44,.11) Departure: -1 (-1.78, -.23) PAND was 95% PND was 92% Engaged Arrival: 1 (.292, 1.71) Mealtime:.83 (.06, 1.61) Departure: 1 (.23, 1.76) PAND was 95% PND was 92%

Tau-U = 1.0 Social Initiations to Peers Tau-U = 1.0

Social Initiations to Peers Tau-U = 1.0

Tau-U Flaws Is it an Effect Size? Replication Magnitude Can it supplement visual analysis? 1.Trend 2.Level 3.Variability 4.Overlap 5.Immediacy 6.Consistency  Vertical analysis Tau-U is the recommended non- overlap index

Summary Complete non-overlap measures offer the most robust option (NAP, Tau-U) –Complete measures equally emphasize all scores –Incomplete measures emphasize particular scores (e.g., median)

Summary Determination of evidence-based practice does not need to involve a summary statistic. The dynamic, flexible nature of SCRD allows for ongoing decision making while maintaining experimental control. Replication logic cannot be ignored.

Summary Visual Analysis is complex and involves more than overlap –Graphing is important! Effect sizes, non-overlap measures should not take the place of visual analysis

Continue to: 1. Determine if the design supports the demonstration of a functional relation and meets design standards 2. Use systematic visual analysis to determine if data support a functional relation –Not just behavioral change or effect –Report protocol used and perhaps training of VAs –Include predicted data pattern in RQ 3. Consider magnitude and social validity 4. If using an effect size estimator, test and report assumptions

Develop: –Effect sizes that are consistent with single case research design logic –Synthesis of SC studies with similar IVs and DVs –Synthesis of rigorous, experimental studies using SCR and RCTs

References
Brossart, D. F., Vannest, K. J., Davis, J. L., & Patience, M. A. (2014). Incorporating nonoverlap indices with visual analysis for quantifying intervention effectiveness in single-case experimental designs. Neuropsychological Rehabilitation: An International Journal, 24.
Ma, H. H. (2006). An alternative method for quantitative synthesis of single-subject research: Percentage of datapoints exceeding the median. Behavior Modification, 30, 598–617.
Parker, R., & Vannest, K. J. (2009). An improved effect size for single case research: Non-overlap of all pairs (NAP). Behavior Therapy, 40.
Parker, R. I., Vannest, K. J., & Brown, L. (2009). The improvement rate difference for single case research. Exceptional Children, 75, 135–150.
Parker, R. I., Vannest, K. J., & Davis, J. L. (2011). Effect size in single-case research: A review of nine nonoverlap techniques. Behavior Modification, 35.
Wolery, M., Busick, M., Reichow, B., & Barton, E. (2010). Comparison of overlap methods for quantitatively synthesizing single-subject data. Journal of Special Education, 44.