Meta-Analysis of Single-Case Designs


1 Meta-Analysis of Single-Case Designs
William R. Shadish University of California, Merced Borrowing liberally from coauthored work with Larry Hedges, David Rindskopf, James Pustejovsky, Kristynn Sullivan, Jonathan Boyajian and Eden Nagler The research reported here was supported by a grant from the University of California Office of the President to the University of California Educational Evaluation Consortium, and by the Institute of Education Sciences, U.S. Department of Education, Grant R305D to the University of California Merced. The opinions expressed are those of the author and do not represent views of the Institute or the U.S. Department of Education.

2 Overview of Talk
Why Meta-Analysis of SCDs
Past Effect Sizes and Their Limits
Computing a d-statistic in the same metric as that used for group experiments
How to compute power
How to do a meta-analysis

3 Why Meta-Analysis of SCDs
Evidence-Based Practice
  WWC Handbook says EBP Reviews should take “into consideration the number of studies, the sample sizes, and the magnitude and statistical significance of the estimates of effectiveness”
  And also that “to adequately assess the effects of an intervention, it is important to know the statistical significance of the estimates of the effects in addition to the mean difference, effect size, or improvement index”
To increase statistical power compared to analyses of single studies.

4 Past Effect Sizes
Many have been proposed
  Versions of d
  Varieties of overlap statistics (e.g., PAND).
These are very important, especially to measure the within-case effect.

5 Limitations of Past Effect Sizes
Usually standardized using within-case rather than between-case variability (unlike between-groups d)
  So not comparable to, and cannot be combined with, between-groups d.
Rarely take into account trend, within- versus between-case variability, autocorrelation
Not well developed statistically
  They lack distribution theory, standard errors unclear
  Without valid standard errors, much of modern meta-analysis cannot be applied
  Ditto for power analysis

6 A d-statistic for SCDs That Begins to Remedy These Limitations
Equivalent to between-groups d
Takes into account
  Autocorrelation
  Ratio of between/total (between + within) variance
  Number of data points in each phase
  Number of cases in each study
Corrects for small sample bias
However, it does not fix all limitations. Still assumes:
  continuous outcomes, absence of trends, fixed treatment effect across cases within studies
And requires three cases on the same outcome to compute.
Two versions (so far):

7 Two Versions
A version for ABk designs
  Takes into account the number (k) of repetitions of the AB pair
A version for multiple baseline designs
  Takes into account the different start times for treatment in each case.
Manual and SPSS macros at
  Including for power analysis

8 ABAB Example Number of Intervals of Disruptive Behavior Recorded during single-student responding (SSR—baseline) and response card treatment (RC—treatment) conditions. Lambert et al. 2006

9 Results
G = -2.513, V{G} = 0.041
Significance test
95% Confidence Interval:
Estimated autocorrelation = 0.225
Estimated ICC r = .03 (ratio of between to total variance)
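As a rough illustration of how an effect size and its variance lead to a significance test and an interval, here is a minimal sketch using a large-sample normal approximation. This is an assumption: the values the slide originally reported are not in the transcript, and the original analysis may rely on a different reference distribution (for example, a t distribution with estimated degrees of freedom). The function name `z_test_and_ci` is hypothetical.

```python
import math

def z_test_and_ci(g, var_g, z_crit=1.959964):
    """Normal-approximation test and 95% interval for an effect size.

    Assumption: g / sqrt(var_g) is treated as standard normal under the
    null hypothesis. The original analysis may use another distribution.
    """
    se = math.sqrt(var_g)                      # standard error of g
    z = g / se                                 # test statistic
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal p-value
    return z, p, (g - z_crit * se, g + z_crit * se)

# Values taken from the ABAB example: G = -2.513, V{G} = 0.041
z, p, (lower, upper) = z_test_and_ci(-2.513, 0.041)
print(f"z = {z:.2f}, p = {p:.3g}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under this approximation the interval excludes zero by a wide margin, which agrees with the large negative G on the slide.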

10 Multiple Baseline Example: Saddler et al
Treatment to Improve Writing
G = 1.963, V{G} = 0.335
f = 0.010, ρ =

11 Power
[Figure panels: ABk Designs, MB Designs; d = .8, d = .5]
Assume ICC and AC both = .5

12 Meta-Analysis Across SCD Studies: PRT
Pivotal Response Training (PRT) for Autism 7 studies (12 effect sizes) with at least 3 cases (66 cases total), with outcomes about verbalization.

13 Snapshot of the Data

14 Computer Programs
We use the R package metafor
  Syntax driven but easy (see Shadish et al. JSP)
  Very comprehensive and cutting edge
Possible to use Stata or SAS (or others).
Comprehensive Meta-Analysis is good
SPSS is, unfortunately, terrible.

15 Multivariate Data
Means some studies have more than one dv.
That violates the independence assumption
Best ways to deal with it are
  Multivariate meta-analysis (metafor)
  Robust variance estimation (Tipton robumeta)
  Both require knowledge of the correlation among dvs
Simplest way is to average d and VarG within study
  Produces results that are not optimal but usually very good unless the effect size is small or variance is large
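The "simplest way" above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout `(study_id, g, var_g)`, the function name `average_within_study`, and the example numbers are all hypothetical, and the sketch simply takes the arithmetic mean of both the effect sizes and their variances within each study, as the slide describes.

```python
from collections import defaultdict

def average_within_study(rows):
    """Collapse multiple effect sizes per study to one.

    rows: iterable of (study_id, g, var_g) tuples (hypothetical layout).
    Returns {study_id: (mean_g, mean_var_g)} using plain arithmetic means.
    """
    grouped = defaultdict(list)
    for study_id, g, var_g in rows:
        grouped[study_id].append((g, var_g))
    averaged = {}
    for study_id, pairs in grouped.items():
        k = len(pairs)
        mean_g = sum(g for g, _ in pairs) / k
        mean_var = sum(v for _, v in pairs) / k
        averaged[study_id] = (mean_g, mean_var)
    return averaged

# Made-up example: study "A" reports two outcomes, study "B" reports one.
rows = [("A", 1.0, 0.2), ("A", 2.0, 0.4), ("B", 0.5, 0.1)]
print(average_within_study(rows))
```

After this step each study contributes one effect size, so the usual independence assumption of a univariate meta-analysis is no longer violated by construction.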

16 Forest Plot: (Ordered by Precision)

17 Fixed vs Random Effects
Basic principle of both: Give more weight to studies that measure d with less error
Fixed effects: w = 1/VarG
  Generalization: Same studies but with different people
Random effects: w = 1/(VarG + τ²)
  Generalize to other studies.
Consensus is to use random effects
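To make the two weighting schemes concrete, here is a minimal Python sketch. The fixed-effect weights follow the slide (w = 1/VarG). For the random-effects weights, the between-study variance τ² has to be estimated somehow; the slide does not say which estimator was used, so the DerSimonian-Laird moment estimator below is an assumption, as are the function names and the example data.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted mean with weights w = 1/VarG."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * g for w, g in zip(weights, effects)) / total
    return mean, 1.0 / total  # pooled estimate and its variance

def dl_tau2(effects, variances):
    """DerSimonian-Laird estimate of tau^2 (assumed estimator)."""
    weights = [1.0 / v for v in variances]
    mean, _ = fixed_effect(effects, variances)
    # Q: weighted sum of squared deviations from the fixed-effect mean
    q = sum(w * (g - mean) ** 2 for w, g in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    df = len(effects) - 1
    return max(0.0, (q - df) / c)  # truncated at zero

def random_effects(effects, variances):
    """Weighted mean with weights w = 1/(VarG + tau^2)."""
    tau2 = dl_tau2(effects, variances)
    mean, var = fixed_effect(effects, [v + tau2 for v in variances])
    return mean, var, tau2

# Made-up effect sizes and variances, for illustration only.
g = [0.5, 1.2, 0.9, 1.5]
v = [0.10, 0.20, 0.05, 0.30]
print(fixed_effect(g, v))
print(random_effects(g, v))
```

When the estimated τ² is zero, the two sets of weights coincide and both models give the same pooled estimate.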

18 Overall Random Effects Results

19 Cumulative Forest Plot

20 Diagnostic Tests
Various Influence Statistics
Radial Plots
Normal Quantile-Quantile Plots
It is not important for now that you understand these statistics
  Read Shadish et al. JSP for understanding
  Just look at them to get the general idea: looking for studies that influence results more than others

21 Influence Statistics Notice Study 4 (Schreibman et al. 2009) consistently an outlier in these tests. This is a substantive issue not a statistical one. Why is it an outlier? Does it have special characteristics? For example, Schreibman et al. (2009) treated the youngest children with autism. Does PRT work less well with younger children?

22 Radial Plot Helps identify which studies contribute to heterogeneity—those that fall outside the gray area. Recall heterogeneity was not significant, so not surprisingly, none of the studies fall outside that area. If one did fall outside, the task again is to figure out why.

23 Quantile-Quantile Plot
A way to test for normality of the effect sizes. All dots (studies) should fall within the 95% confidence intervals (dotted lines) and they do. For studies that are outside the CI, again, explore why.

24 Publication Bias Studies with significant results more likely to be published (Rosenthal) Omitting the unpublished studies may overestimate the effect (the file drawer problem) But does that apply to SCDs since they don’t do significance tests?

25 Publication Bias in SCDs
Mahoney (1977) sent SCD reviewers randomly assigned manuscripts that varied in showing visually positive, negative, or mixed results.
  Manuscripts with positive results were more likely to be published.
Why would this bias exist in SCDs?
  Traditional need for visually large effects?
  Downplaying results that are not visually large?

26 Publication Bias Tests
Begg and Mazumdar’s rank correlation test
Egger’s regression test
Funnel plot
Trim-and-fill
Selection bias models
All require homogeneous data
  Except selection bias models
Fortunately the PRT data are homogeneous

27 Two Statistical Tests
Begg and Mazumdar’s rank correlation test
  Computes the correlation between effect size and study precision
  Should be zero if no publication bias exists
  r = .29 (p = .36), suggesting no bias
Egger’s regression test
  Predicts [G/SE] from study precision; can be more powerful than the rank correlation test
  Intercept should be zero in the absence of bias
  Intercept is 3.31 (df = 5, p = .021), suggesting the presence of bias
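Both tests can be sketched in plain Python to show what is being computed. This is a simplified illustration, not the procedure used for the results above: the Egger sketch is an unweighted least-squares regression of G/SE on precision (1/SE) and reports only the intercept and slope, and the rank correlation is a basic Kendall tau with no tie correction, applied to effect size and precision as the slide describes. Neither function returns a p-value. The function names and the data are hypothetical.

```python
import math

def egger_intercept(effects, variances):
    """Regress G/SE on precision (1/SE); return (intercept, slope).

    An intercept near zero is what one expects without small-study bias.
    """
    se = [math.sqrt(v) for v in variances]
    precision = [1.0 / s for s in se]
    z = [g / s for g, s in zip(effects, se)]
    n = len(effects)
    x_bar = sum(precision) / n
    z_bar = sum(z) / n
    sxx = sum((x - x_bar) ** 2 for x in precision)
    sxz = sum((x - x_bar) * (zi - z_bar) for x, zi in zip(precision, z))
    slope = sxz / sxx
    return z_bar - slope * x_bar, slope

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)

# Made-up data in which less precise studies show larger effects.
g = [0.6, 0.8, 1.1, 1.6]
v = [0.02, 0.05, 0.15, 0.40]
prec = [1.0 / math.sqrt(x) for x in v]
print(egger_intercept(g, v))
print(kendall_tau(g, prec))
```

In the made-up data the rank correlation between effect size and precision is negative, the pattern one would expect if small, imprecise studies with small effects were missing.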

28 Funnel Plot
Plots effect size against standard error.
Plot should be symmetric in the absence of publication bias (Why?)
HINT: Where are all the low precision studies showing small effects?
But can we quantify this judgment?

29 Trim-and-Fill Identifies “missing” studies and fills them in (the white dots). Recomputes the meta-analysis using them. Doing so, g drops from 1.01 to (se = 0.16, p < .001). Smaller but still significant.

30 Moderator Analyses
If data were heterogeneous, then the effect sizes differ by more than we expect by chance.
Can we predict that extra variation?
None of the predictors are significant here, individually or as a whole
  Not surprising because
    Effect sizes were homogeneous
    Small number of studies so lower power
Each year of age adds (p = .078) to g.
  Effect of PRT larger for older cases.
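A moderator analysis with one predictor, such as age, amounts to a weighted regression of effect size on the moderator. The sketch below is a fixed-effect meta-regression with weights 1/VarG, which is an assumption: the slide does not state which model was fitted. The function name `weighted_meta_regression` and the data are hypothetical, and the numbers are not the PRT results.

```python
def weighted_meta_regression(effects, variances, moderator):
    """Fit g = b0 + b1 * x by weighted least squares, weights 1/VarG.

    Returns (intercept, slope, se_slope). With known sampling variances,
    the variance of the slope is 1 / sum(w * (x - x_bar)^2).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, moderator)) / sw
    g_bar = sum(wi * g for wi, g in zip(w, effects)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, moderator))
    sxg = sum(wi * (x - x_bar) * (g - g_bar)
              for wi, x, g in zip(w, moderator, effects))
    slope = sxg / sxx
    intercept = g_bar - slope * x_bar
    return intercept, slope, (1.0 / sxx) ** 0.5

# Made-up data: mean age of cases (years) and effect size per study.
age = [3.0, 4.0, 6.0, 8.0]
g = [0.7, 0.9, 1.0, 1.4]
v = [0.10, 0.08, 0.12, 0.20]
b0, b1, se_b1 = weighted_meta_regression(g, v, age)
print(f"slope per year of age = {b1:.3f} (se = {se_b1:.3f})")
```

The slope divided by its standard error gives a z statistic for the moderator; with only a handful of studies that test has low power, which is the point the slide makes.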

31 Summary
Need to make further developments for
  Alternating treatment designs
  Changing criterion designs
  Studies that combine different designs within one case, e.g., multiple baseline combined with alternating treatments.
  Different outcome metrics than normality
  Better taking trend into account (we assume no trend)
We are working on these things.
Still, this work offers a large number of new statistical opportunities for SCD research

