1 Single-Case Intervention Research Training Institute
Effect size measures for single-case designs: within-case parametric indices
James E. Pustejovsky
Madison, WI, June 2019

2 Effect size
Broadly: a quantitative [index] of relations among variables (Hedges, 2008, p. 167).
In the context of SCDs: a quantitative index describing the direction and magnitude of a functional relationship (i.e., the effect of an intervention on an outcome) in a way that allows comparison across cases and studies (Pustejovsky & Ferron, 2017).
Direction and magnitude matter so that we can tell the difference between a strong positive effect (good), a null effect (inconsequential), and a strong negative effect (harmful!).

3 Why do we need effect sizes?
“Reporting and interpreting effect sizes in the context of previously reported effects is essential to good research. It enables readers to evaluate the stability of results across samples, designs, and analyses. Reporting effect sizes also informs power analyses and meta-analyses needed in future research.” (Wilkinson & APA Task Force on Statistical Inference, 1999)
Effect sizes characterize the main findings of a study in a common, widely understood way; put the main findings in the context of previous research; and serve as a basis for research synthesis.
The current APA publication manual says that reporting effect sizes is “almost always necessary.”

4 Characteristics of a good effect size (Lipsey & Wilson, 2001)
Readily interpretable. Comparable across studies (and cases, for SCDs) that use different operational procedures. Accompanied by a measure of sampling variability (i.e., a standard error or confidence interval). Calculable from available data.
Lipsey & Wilson discussed effect sizes in the context of their use in meta-analysis, but these are really general-purpose criteria. Interpretability matters because effect sizes are a tool for scientific communication; what counts is interpretability and meaningfulness in the context of a research field (more on this in a minute). Standard errors or other measures of uncertainty are needed, as in any statistical analysis, because effect sizes are only estimates; we need to know how precise those estimates are. Standard errors also play a big role in meta-analysis, but that is not our focus today. This criterion is a particular challenge for single-case effect sizes. Calculability from available data matters for meta-analysis of studies that use between-groups designs, because typically only summary information is available to the secondary analyst. With single-case research, the raw data are usually available in the form of a published graph, so this is less of a constraint.

5 Comparability across studies and cases (single-case designs)
Imagine several single-case design studies investigating the same intervention, with similar participants and a similar outcome construct. What procedural factors might differ across these studies? Ideally, effect size indices should not be strongly affected by such factors:
Study design (e.g., ABAB, multiple baseline, multiple probe)
Number of measurements per phase
For behavioral outcome measures like on-task behavior, problem behavior, or self-injury: the observation recording system (continuous recording, momentary time sampling, partial interval recording) and the session length

6 Parametric within-case effect size measures
Within-case = characterize magnitude of functional relationship separately for each case in a study. Parametric = based on a model for the process that generated the data. Advantage: clear separation between effect size parameter definition and how it is estimated. Challenge: developing/assessing modeling assumptions.

7 Simplest Possible Scenario
Stable baseline and treatment phases (no trends)
Immediate level shift due to treatment
Independence of outcome measurements (Sorry Joel!)

8 Notation
Mean level of the outcome in each phase: $\mu_A$, $\mu_B$.
Standard deviation of the outcome in each phase: $\sigma_A$, $\sigma_B$ (variability around the mean level of the outcome for one case).
Sample of m observations in the baseline phase, $y_1^A, \dots, y_m^A$, and n observations in the treatment phase, $y_1^B, \dots, y_n^B$.

9 Three ways to describe a change in level
Raw difference in levels: $\mu_B - \mu_A$
Standardized difference in levels (within-case standardized mean difference): $(\mu_B - \mu_A) / \sigma_A$
Proportional change in levels: $\mu_B / \mu_A$
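For concreteness, here is a minimal Python sketch computing all three indices from hypothetical phase means and a hypothetical baseline SD (the numbers are invented for illustration):

```python
# Three ways to describe a change in level, using hypothetical
# phase parameters (values invented for illustration only).
mu_A, mu_B = 40.0, 15.0   # mean level in baseline (A) and treatment (B) phases
sigma_A = 8.0             # SD of the outcome in the baseline phase

raw_difference = mu_B - mu_A          # raw difference in levels
smd = (mu_B - mu_A) / sigma_A         # within-case standardized mean difference
proportional_change = mu_B / mu_A     # proportional change in levels

print(raw_difference, smd, proportional_change)  # -25.0 -3.125 0.375
```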

10 Within-case standardized mean difference
The SMD is one of the most commonly used effect sizes in between-groups research. Gingerich (1984) and Busk and Serlin (1992) proposed a within-case SMD for single-case designs.
Parameter definition: $\delta = (\mu_B - \mu_A) / \sigma_A$, the difference in means “standardized” by the variability in the baseline phase.
Standardization makes δ scale-free. It is NOT equivalent to a between-case SMD, because $\sigma_A$ represents only within-case variation.
Gingerich and Busk and Serlin argued essentially by analogy to the SMD used in between-groups research.

11 Estimating the within-case SMD
As originally proposed, estimate δ by replacing the parameters with the corresponding sample statistics: $d = (\bar{y}_B - \bar{y}_A) / s_A$.
The d estimator has a small-sample bias. A bias-corrected estimator is $g = J(m - 1) \times d$, where $J(\nu) = 1 - 3 / (4\nu - 1)$.
Approximate standard error of g (assuming independence): $SE_g \approx J(m-1) \sqrt{\frac{1}{m} + \frac{s_B^2 / s_A^2}{n} + \frac{g^2}{2(m-1)}}$.
The bias correction and standard error are derived assuming normally distributed outcome measures, although they are fairly robust to this assumption.
Say that m = 6. Then $J(6 - 1) = 1 - 3/(20 - 1) = 16/19 \approx 0.84$.
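The sketch below implements these estimators in Python. The SE formula is the approximation written above, reconstructed from the transcript (it reproduces the Rodriguez & Anderson table on the following slides to within rounding), so treat it as an assumed form rather than a definitive one:

```python
import math

def J(nu: float) -> float:
    """Small-sample bias-correction factor: J(nu) = 1 - 3 / (4 * nu - 1)."""
    return 1.0 - 3.0 / (4.0 * nu - 1.0)

def within_case_smd(yA_mean, sA, m, yB_mean, sB, n):
    """Within-case SMD standardized by the baseline-phase SD.

    Returns (d, g, se_g), assuming independent measurements.
    The SE is an assumed large-sample approximation (see lead-in).
    """
    d = (yB_mean - yA_mean) / sA        # basic estimator
    g = J(m - 1) * d                    # bias-corrected estimator
    se_g = J(m - 1) * math.sqrt(
        1.0 / m + (sB**2 / sA**2) / n + g**2 / (2.0 * (m - 1))
    )
    return d, g, se_g

# Deborah's group from the Rodriguez & Anderson (2014) example:
print(within_case_smd(49.44, 8.75, 6, 16.78, 7.94, 26))
# -> roughly (-3.73, -3.14, 0.92); the table reports d = -3.74, g = -3.15,
#    SE = 0.92 (small discrepancies reflect rounding of the summary stats).
```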

12 Estimating the within-case SMD (continued)
If it is reasonable to assume that the SD of the outcome is constant across phases (i.e., homogeneity of variance), then the SMD can be estimated using the pooled sample variance: $s_p^2 = \frac{(m-1) s_A^2 + (n-1) s_B^2}{m + n - 2}$, with $d = (\bar{y}_B - \bar{y}_A) / s_p$.
A bias-corrected estimator: $g = J(m + n - 2) \times d$.
Approximate standard error of g (assuming independence): $SE_g \approx J(m+n-2) \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{g^2}{2(m+n-2)}}$.
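A corresponding Python sketch under the homogeneity-of-variance assumption. The SE here uses a common Hedges-type approximation, since the slide's exact formula is not recoverable from this transcript:

```python
import math

def pooled_smd(yA_mean, sA, m, yB_mean, sB, n):
    """Within-case SMD standardized by the SD pooled across both phases.

    Assumes homogeneity of variance and independent measurements.
    The SE is a common Hedges-type approximation (an assumption here).
    """
    df = m + n - 2
    s_pooled = math.sqrt(((m - 1) * sA**2 + (n - 1) * sB**2) / df)
    d = (yB_mean - yA_mean) / s_pooled
    jj = 1.0 - 3.0 / (4.0 * df - 1.0)   # bias-correction factor J(df)
    g = jj * d
    se_g = jj * math.sqrt(1.0 / m + 1.0 / n + g**2 / (2.0 * df))
    return d, g, se_g
```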

13 Rodriguez & Anderson (2014)
Rodriguez & Anderson (2014). Integrating a social behavior intervention during small group academic instruction using a total group criterion intervention.

14 Rodriguez & Anderson example
Case            | ȳA    | sA    | m  | ȳB    | sB   | n  | d*    | g* (SE)
Deborah's group | 49.44 | 8.75  | 6  | 16.78 | 7.94 | 26 | -3.74 | -3.15 (0.92)
Amy's group     | 33.43 | 9.48  | 10 | 8.31  | 6.18 | 21 | -2.65 | -2.42 (0.62)
Barbara's group | 54.84 | 12.72 | 12 | 13.76 | 7.32 | 18 | -3.23 | -3.01 (0.67)
Natasha's group | 26.10 | 10.53 | 17 | 8.47  | 4.08 | 11 | -1.67 | -1.59 (0.37)
Candice's group | 33.97 | 14.38 | 16 | 19.17 | 9.08 |    | -1.03 | -0.98 (0.34)
* Standardized mean difference estimates (d and g) are calculated using the standard deviation of the baseline phase only. Estimates and standard errors are based on the assumption that the outcome measurements are mutually independent.

15 Comments on within-case SMD
Appropriate for interval-scale outcome measures (e.g., academic performance measures). Ask: is the variability of the outcome measure (approximately) constant across different mean levels?
The within-case SMD involves scaling by a quantity that reflects test-retest (un)reliability, so measurement procedures of varying reliability will mean that the SMD is not comparable across studies.
Scales with restricted ranges (measures near ceiling or floor) will tend to make the SMD less interpretable.
ES magnitude and statistical significance are affected by auto-correlation. Serial dependence means that the effect size estimator itself (not just the standard error) is biased, because the sample variance is biased in the presence of serial correlation.
Maggin and colleagues (2011) proposed extensions to the within-case SMD that address bias due to serial dependence and adjust for time trends in the baseline and/or intervention phases.

16 Proportional change in levels
Percentage/proportional change from baseline to intervention is a common, easily interpretable “informal” effect size measure (Campbell & Herzinger, 2010). There are occasional applications of percentage change in meta-analyses of SCDs (e.g., Campbell, 2003; Kahng, Iwata, & Lewin, 2002; Marquis et al., 2000).
The log response ratio (LRR) is a formal effect size measure that quantifies functional relationships in terms of proportionate change. It is used as an effect size for between-groups research in some fields (e.g., ecology, economics). Pustejovsky (2015) argues for its use in single-case research with behavioral outcome measures.

17 Within-case log response ratio
LRR parameter: $\psi = \log(\mu_B / \mu_A)$
Appropriate for ratio-scale outcomes (frequency counts, percentage duration). The natural logarithm is used to make the range less restricted. If treatment has zero effect, then $\mu_B / \mu_A = 1$ and $\psi = 0$.
Particularly appropriate for behavioral outcomes measured by direct observation (Pustejovsky, 2015): the magnitude remains stable even when outcomes are measured using different procedures, and under certain assumptions it is also comparable across dimensional constructs.
Relationship to percentage change: $\%\,\text{change} = 100\% \times \left(e^{\psi} - 1\right)$
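A quick numeric illustration of the log transform and the back-transformation to percentage change (values invented for illustration):

```python
import math

# Suppose the treatment-phase mean is 40% of the baseline mean.
mu_A, mu_B = 50.0, 20.0
psi = math.log(mu_B / mu_A)                  # LRR parameter: log(mu_B / mu_A)
pct_change = 100.0 * (math.exp(psi) - 1.0)   # back-transform to % change
print(round(psi, 3), pct_change)             # -0.916 -60.0
```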

18 Estimating the within-case LRR (Pustejovsky, 2015)
Basic estimator: $R_1 = \ln(\bar{y}_B) - \ln(\bar{y}_A)$
This estimator will be biased if m or n is small. A bias-corrected estimator:
$R_2 = \ln(\bar{y}_B) + \frac{s_B^2}{2 n \bar{y}_B^2} - \ln(\bar{y}_A) - \frac{s_A^2}{2 m \bar{y}_A^2}$
Approximate standard error of $R_2$ (assuming independence):
$SE_{R_2} \approx \sqrt{\frac{s_A^2}{m \bar{y}_A^2} + \frac{s_B^2}{n \bar{y}_B^2}}$
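In Python, these estimators look as follows; the formulas are the ones written above (reconstructed from the transcript), and they reproduce the Rodriguez & Anderson table values:

```python
import math

def within_case_lrr(yA_mean, sA, m, yB_mean, sB, n):
    """Within-case log response ratio (LRR).

    Returns (R1, R2, se): the basic estimator, the bias-corrected
    estimator, and its approximate SE, assuming independence.
    """
    R1 = math.log(yB_mean / yA_mean)
    R2 = (math.log(yB_mean) + sB**2 / (2 * n * yB_mean**2)
          - math.log(yA_mean) - sA**2 / (2 * m * yA_mean**2))
    se = math.sqrt(sA**2 / (m * yA_mean**2) + sB**2 / (n * yB_mean**2))
    return R1, R2, se

# Amy's group: the table reports R1 = -1.39, R2 = -1.38, SE = 0.19.
print(within_case_lrr(33.43, 9.48, 10, 8.31, 6.18, 21))
```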

19 Estimating the within-case LRR (continued)
Approximate confidence interval for ψ (assuming independence): $R_2 \pm z_\alpha \times SE_{R_2}$, where $z_\alpha$ is the two-tailed critical value from the standard normal distribution.
Approximate confidence interval for % change (assuming independence): $100\% \times \left(e^{R_2 \pm z_\alpha \times SE_{R_2}} - 1\right)$
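A small Python sketch of both intervals, checked against Deborah's row in the table on the next slide:

```python
import math

def lrr_ci(R2, se, z=1.959964):
    """95% CIs for psi and for % change, assuming independence.
    z is the two-tailed 5% critical value of the standard normal."""
    lo, hi = R2 - z * se, R2 + z * se
    pct = tuple(100.0 * (math.exp(x) - 1.0) for x in (lo, hi))
    return (lo, hi), pct

# Deborah's group: R2 = -1.08, SE = 0.12.
print(lrr_ci(-1.08, 0.12))
# psi CI about [-1.32, -0.84]; % change CI about [-73%, -57%],
# matching the table within rounding of the inputs.
```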

20 Comments on within-case LRR
Serial dependence will affect the SE and CI but not the effect size estimator itself. If there is positive auto-correlation, the SE and CI will tend to be too small.
When applying the LRR to outcomes measured as proportions, be careful that the direction of improvement is coded consistently (Pustejovsky, 2018). E.g., “% time on-task” needs to be re-coded if other cases or studies use “% time off-task” (see the sketch below).
Partial interval recording data present further complications (Pustejovsky & Swan, 2015).
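A minimal illustration of that re-coding step, with invented session values; see Pustejovsky (2018) for a fuller treatment of LRR variants for proportion outcomes:

```python
# If one case reports "% time off-task" while others report "% time
# on-task", convert to a common direction before computing the LRR.
pct_off_task = [62.0, 55.0, 70.0]                 # hypothetical sessions
pct_on_task = [100.0 - x for x in pct_off_task]   # complement re-code
print(pct_on_task)                                # [38.0, 45.0, 30.0]
```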

21 Rodriguez & Anderson example
Case            | R1    | R2 (SE)*     | 95% CI*        | % change (95% CI)*
Deborah's group | -1.08 | -1.08 (0.12) | [-1.31, -0.85] | [-73, -57]
Amy's group     | -1.39 | -1.38 (0.19) | [-1.75, -1.02] | [-83, -64]
Barbara's group | -1.38 | -1.38 (0.14) | [-1.66, -1.10] | [-81, -67]
Natasha's group | -1.13 | -1.12 (0.18) | [-1.46, -0.78] | [-77, -54]
Candice's group | -0.57 | -0.57        | [-0.92, -0.22] | [-60, -20]
* Standard errors and confidence intervals are based on the assumption that the outcome measurements are mutually independent.

22 Dealing with time trends

23 Predictions at a focal follow-up time
$\mu_A^F$: predicted level of the outcome at time F if the intervention never happens.
$\mu_B^F$: predicted level of the outcome at time F if the intervention does happen.

24 Defining effect sizes for a focal follow-up time
Define effect sizes as comparisons between $\mu_A^F$ and $\mu_B^F$ (the difference in levels at the focal follow-up time).
Within-case SMD at time F: $\delta_F = \frac{\mu_B^F - \mu_A^F}{\sigma_e}$, where $\sigma_e$ is the residual standard deviation.
Within-case LRR at time F: $\psi_F = \log\left(\mu_B^F / \mu_A^F\right)$
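As a concrete sketch (with invented data), one way to obtain such predictions is to fit a separate linear trend in each phase and extrapolate the baseline trend to the focal time F. The modeling choice here is illustrative, not the slide's prescribed method:

```python
import numpy as np

# Hypothetical sessions: 8 baseline (t = 1..8), 8 treatment (t = 9..16).
t_A = np.arange(1, 9);   y_A = np.array([42, 44, 41, 45, 43, 46, 44, 47])
t_B = np.arange(9, 17);  y_B = np.array([30, 26, 24, 21, 19, 18, 17, 16])

F = 16  # focal follow-up time
mu_A_F = np.polyval(np.polyfit(t_A, y_A, 1), F)  # level at F, no intervention
mu_B_F = np.polyval(np.polyfit(t_B, y_B, 1), F)  # level at F, with intervention

psi_F = np.log(mu_B_F / mu_A_F)  # within-case LRR at the focal time
print(round(mu_A_F, 1), round(mu_B_F, 1), round(psi_F, 2))
```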

25 Non-linear models for gradual effects
$Y_i = \beta_0 + \beta_1 \left(1 - \omega^{U_i}\right)$, where $U_i$ is the cumulative number of treatment sessions as of session i.
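A minimal sketch of this mean curve with invented parameter values, showing the gradual approach to the asymptotic treatment level:

```python
# Gradual-effects mean curve: Y_i = beta0 + beta1 * (1 - omega**U_i),
# where U_i is the cumulative number of treatment sessions.
beta0 = 40.0   # baseline level
beta1 = -25.0  # eventual (asymptotic) change in level
omega = 0.6    # rate of approach to the asymptote (0 < omega < 1)

# Ten baseline sessions (U = 0), then ten treatment sessions (U = 1..10):
U = [0] * 10 + list(range(1, 11))
mean_curve = [beta0 + beta1 * (1 - omega**u) for u in U]
print([round(y, 1) for y in mean_curve])
# Stays at 40.0 during baseline, then moves gradually toward 40 - 25 = 15.
```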

26 Non-linear models for gradual effects
Can be extended to ABAB/treatment reversal designs. Works with the LRR and other parametric effect sizes. See Swan and Pustejovsky (2018) for further details. Web-app for effect size estimation:

27 Meta-analysis across cases
Within-case ES estimates describe the effects (functional relationships) for each case. What if you want to characterize the overall pattern of findings across the cases/participants in a study?
Simple fixed-effects meta-analysis: estimate the average effect size across the cases you have observed. SEs/CIs represent the uncertainty of the estimates for the observed cases: how different might the average effect estimate be if you replicated the study with the same participants, intervention, setting, and conditions?
Random-effects meta-analysis: estimate the average effect size in a population of cases (similar to the observed cases), as well as the heterogeneity of effects in that population. SEs/CIs represent the uncertainty of the estimates for the population: how different might the average effect estimate be if you replicated the study with new participants/cases similar to those in the present study?

28 Simple fixed-effect meta-analysis
M cases, with effect size estimates $T_1, \dots, T_M$ and standard errors $SE_1, \dots, SE_M$.
Average effect size estimate: $\bar{T} = \frac{1}{M} \sum_{i=1}^{M} T_i$
Standard error of the average effect size: $SE_{\bar{T}} = \sqrt{\frac{1}{M^2} \sum_{i=1}^{M} SE_i^2}$
Approximate confidence interval: $\bar{T} \pm z_\alpha \times SE_{\bar{T}}$
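Applying these formulas in Python to the five g estimates from the Rodriguez & Anderson example reproduces the simple-average row of the table on the next slide:

```python
import math

g  = [-3.15, -2.42, -3.01, -1.59, -0.98]   # within-case SMD estimates
se = [ 0.92,  0.62,  0.67,  0.37,  0.34]   # their standard errors
M = len(g)

T_bar = sum(g) / M                                   # average effect size
se_T_bar = math.sqrt(sum(s**2 for s in se) / M**2)   # its standard error
z = 1.959964                                         # 95% two-tailed z
ci = (T_bar - z * se_T_bar, T_bar + z * se_T_bar)

print(round(T_bar, 2), round(se_T_bar, 2), [round(x, 2) for x in ci])
# -2.23 0.28 [-2.77, -1.69]
```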

29 Rodriguez & Anderson example
Case             | g (SE)*      | 95% CI         | R2 (SE)*     | 95% CI
Deborah's group  | -3.15 (0.92) |                | -1.08 (0.12) |
Amy's group      | -2.42 (0.62) |                | -1.38 (0.19) |
Barbara's group  | -3.01 (0.67) |                | -1.38 (0.14) |
Natasha's group  | -1.59 (0.37) |                | -1.12 (0.18) |
Candice's group  | -0.98 (0.34) |                | -0.57        |
Simple average   | -2.23 (0.28) | [-2.77, -1.69] | -1.11 (0.07) | [-1.25, -0.96]
RE average       | -2.04 (0.42) | [-2.86, -1.21] |              | [-1.39, -0.83]
RE heterogeneity | 0.76         |                | 0.28         |
* Standard errors and confidence intervals are based on the assumption that the outcome measurements are mutually independent.

30 References (1/2)
Busk, P. L., & Serlin, R. C. (1992). Meta-analysis for single-case research. In T. R. Kratochwill & J. R. Levin (Eds.), Single-Case Research Design and Analysis: New Directions for Psychology and Education (pp. 187–212). Hillsdale, NJ: Lawrence Erlbaum Associates.
Campbell, J. M. (2003). Efficacy of behavioral interventions for reducing problem behavior in persons with autism: A quantitative synthesis of single-subject research. Research in Developmental Disabilities, 24(2), 120–138.
Campbell, J. M., & Herzinger, C. V. (2010). Statistics and single subject research methodology. In D. L. Gast (Ed.), Single Subject Research Methodology in Behavioral Sciences (pp. 417–450). New York, NY: Routledge.
Gingerich, W. J. (1984). Meta-analysis of applied time-series data. Journal of Applied Behavioral Science, 20(1), 71–79.
Hedges, L. V. (2008). What are effect sizes and why do we need them? Child Development Perspectives, 2(3), 167–171.
Kahng, S., Iwata, B. A., & Lewin, A. B. (2002). Behavioral treatment of self-injury, 1964 to 2000. American Journal on Mental Retardation, 107(3), 212–221.
Lipsey, M. W., & Wilson, D. B. (2001). Practical Meta-Analysis. Thousand Oaks, CA: Sage Publications.
Maggin, D. M., Swaminathan, H., Rogers, H. J., O'Keeffe, B. V., Sugai, G., & Horner, R. H. (2011). A generalized least squares regression approach for computing effect sizes in single-case research: Application examples. Journal of School Psychology, 49(3), 301–321.

31 References (2/2)
Marquis, J. G., Horner, R. H., Carr, E. G., Turnbull, A. P., Thompson, M., Behrens, G. A., … Doolabh, A. (2000). A meta-analysis of positive behavior support. In R. Gersten, E. P. Schiller, & S. Vaughan (Eds.), Contemporary Special Education Research: Syntheses of the Knowledge Base on Critical Instructional Issues (pp. 137–178). Mahwah, NJ: Lawrence Erlbaum Associates.
Pustejovsky, J. E. (2015). Measurement-comparable effect sizes for single-case studies of free-operant behavior. Psychological Methods, 20(3), 342–359.
Pustejovsky, J. E., & Ferron, J. M. (2017). Research synthesis and meta-analysis of single-case designs. In Handbook of Special Education (2nd ed.).
Pustejovsky, J. E., & Swan, D. M. (2015). Four methods for analyzing partial interval recording data, with application to single-case research. Multivariate Behavioral Research, 50(3), 365–380.
Rodriguez, B. J., & Anderson, C. M. (2014). Integrating a social behavior intervention during small group academic instruction using a total group criterion intervention. Journal of Positive Behavior Interventions, 16(4), 234–245.
Swan, D. M., & Pustejovsky, J. E. (2018). A gradual effects model for single-case designs. Multivariate Behavioral Research, forthcoming.
Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604.

