
1 Statistical Randomization Techniques for Single-Case Intervention Data Joel R. Levin, University of Arizona

2 Purpose of This Presentation (Déjà Vu?) To broaden your minds by introducing you to new and exciting, scientifically credible single-case intervention design-and-analysis possibilities. To let you know that these procedures are becoming increasingly acceptable to single-case intervention researchers and are beginning to appear in the SCD research literature. Whether YOU ever choose to adopt them in your own SCD research (after this week!) is entirely up to you.

3 A Permutation-Test Primer (Based on Levin, 2007)
Rationale and assumptions
Samples and populations
Individuals and groups
Scores and ranks
–exact probabilities and sampling approximations for scores
Levin, J. R. (2007). Randomization tests: Statistical tools for assessing the effects of educational interventions when resources are scarce. In S. Sawilowsky (Ed.), Real data analysis (pp. 115-123). Greenwich, CT: Information Age.

4 Example 1: Very Small Samples From a total of 6 elementary school classrooms, 3 are randomly assigned to receive an instructional intervention that is designed to boost students’ academic performance (intervention classrooms), while students in the other 3 classrooms continue to receive their regular instruction (control classrooms). Following the instructional phase of the experiment, students in all classrooms are administered a 50-point achievement test and the average test performance within each classroom is calculated. Of interest in this study is the mean achievement-test difference between the 3 intervention classrooms and the 3 control classrooms.

5 Example 1: Data-Analysis Rationale The obtained mean difference is examined in the context of the distribution of all possible mean differences that can be generated by assigning the 6 obtained classroom means to two instructional conditions, assuming that 3 classroom means must be assigned to each condition. A statistical test is then conducted by addressing the question: How (un)likely or (im)probable is what actually occurred (i.e., the obtained intervention-control mean difference) in relation to everything that could have occurred (i.e., the distribution of all possible intervention-control mean differences, given the study's design structure and the set of means produced)? Should the result of the foregoing test be deemed statistically improbable (e.g., p < .05), then the researcher would conclude that the two instructional methods differ with respect to students' average achievement-test performance.

6 In how many different ways can 6 scores be assigned to 2 groups, if 3 scores must end up in each group? That is the same as asking how many different combinations of 3 objects there are when selected from a total of 6 objects. So as not to waste time attempting to express that quantity symbolically here: The answer, my friends, boils down to 6!/(3!3!) = 20. For example, consider the following 6 scores: 1 2 3 4 5 6. Let us systematically count the specific ways that 3 scores could be assigned to Group 1. (Note: The order in which the 3 scores are listed is not important.)
 1. 1 2 3    2. 1 2 4    3. 1 2 5    4. 1 2 6
 5. 1 3 4    6. 1 3 5    7. 1 3 6    8. 1 4 5
 9. 1 4 6   10. 1 5 6   11. 2 3 4   12. 2 3 5
13. 2 3 6   14. 2 4 5   15. 2 4 6   16. 2 5 6
17. 3 4 5   18. 3 4 6   19. 3 5 6   20. 4 5 6
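The enumeration above can be checked with a short Python sketch (illustrative code, not part of the original presentation):

```python
from itertools import combinations

scores = [1, 2, 3, 4, 5, 6]

# Every distinct way to choose 3 of the 6 scores for Group 1;
# the 3 unchosen scores automatically form Group 2.
group1_assignments = list(combinations(scores, 3))

print(len(group1_assignments))  # 20, matching 6!/(3!3!)
```

Because only the set membership matters (not the order within a group), `combinations` rather than `permutations` gives exactly the 20 assignments listed above.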

7 Example 1: Classroom Means and All Possible Assignments of Them to the Two Conditions (N1 = N2 = 3)

     Condition 1 (Sum, M)              Condition 2 (Sum, M)              M Difference (C1 - C2)
 1.  42.6, 40.1, 39.6 (122.3, 40.8)    37.6, 36.7, 36.3 (110.6, 36.9)     3.9*   p1 = 1/20 = .05
 2.  42.6, 40.1, 37.6 (120.3, 40.1)    39.6, 36.7, 36.3 (112.6, 37.5)     2.6    p2 = 2/20 = .10
 3.  42.6, 40.1, 36.7 (119.4, 39.8)    39.6, 37.6, 36.3 (113.5, 37.8)     2.0
 4.  42.6, 40.1, 36.3 (119.0, 39.7)    39.6, 37.6, 36.7 (113.9, 38.0)     1.7
 5.  42.6, 39.6, 37.6 (119.8, 39.9)    40.1, 36.7, 36.3 (113.1, 37.7)     2.2
 6.  42.6, 39.6, 36.7 (118.9, 39.6)    40.1, 37.6, 36.3 (114.0, 38.0)     1.6
 7.  42.6, 39.6, 36.3 (118.5, 39.5)    40.1, 37.6, 36.7 (114.4, 38.1)     1.4
 8.  42.6, 37.6, 36.7 (116.9, 39.0)    40.1, 39.6, 36.3 (116.0, 38.7)     0.3
 9.  42.6, 37.6, 36.3 (116.5, 38.83)   40.1, 39.6, 36.7 (116.4, 38.80)    0.03
10.  42.6, 36.7, 36.3 (115.6, 38.5)    40.1, 39.6, 37.6 (117.3, 39.1)    -0.6
11.  40.1, 39.6, 37.6 (117.3, 39.1)    42.6, 36.7, 36.3 (115.6, 38.5)     0.6
12.  40.1, 39.6, 36.7 (116.4, 38.80)   42.6, 37.6, 36.3 (116.5, 38.83)   -0.03
13.  40.1, 39.6, 36.3 (116.0, 38.7)    42.6, 37.6, 36.7 (116.9, 39.0)    -0.3
14.  40.1, 37.6, 36.7 (114.4, 38.1)    42.6, 39.6, 36.3 (118.5, 39.5)    -1.4
15.  40.1, 37.6, 36.3 (114.0, 38.0)    42.6, 39.6, 36.7 (118.9, 39.6)    -1.6
16.  40.1, 36.7, 36.3 (113.1, 37.7)    42.6, 39.6, 37.6 (119.8, 39.9)    -2.2
17.  39.6, 37.6, 36.7 (113.9, 38.0)    42.6, 40.1, 36.3 (119.0, 39.7)    -1.7
18.  39.6, 37.6, 36.3 (113.5, 37.8)    42.6, 40.1, 36.7 (119.4, 39.8)    -2.0
19.  39.6, 36.7, 36.3 (112.6, 37.5)    42.6, 40.1, 37.6 (120.3, 40.1)    -2.6
20.  37.6, 36.7, 36.3 (110.6, 36.9)    42.6, 40.1, 39.6 (122.3, 40.8)    -3.9
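The 20 mean differences, and the resulting one-tailed randomization p-value, can be generated with a small Python sketch (an illustration, not the presenter's code):

```python
from itertools import combinations

# The six obtained classroom means; the first three were the
# intervention classrooms in the actual study.
means = [42.6, 40.1, 39.6, 37.6, 36.7, 36.3]
observed = sum(means[:3]) / 3 - sum(means[3:]) / 3  # 3.9

# Condition 1 minus Condition 2 mean difference for all 20 assignments.
diffs = []
for c1 in combinations(means, 3):
    c2 = [m for m in means if m not in c1]  # values are all distinct here
    diffs.append(sum(c1) / 3 - sum(c2) / 3)

# One-tailed p-value: how often a difference at least this large arises.
p = sum(d >= observed - 1e-9 for d in diffs) / len(diffs)
print(len(diffs), round(p, 2))  # 20 0.05
```

Only the actually obtained assignment produces a difference as large as 3.9, so the one-tailed p-value is 1/20 = .05, matching the first row of the table.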

8 A Permutation-Test Primer (Based on Levin, 2007)
Rationale and assumptions
Samples and populations
Individuals and groups
Scores and ranks
–exact probabilities and sampling approximations for scores
–tables for ranks
Two-sample permutation test
–small hypothetical example, with all steps included
Levin, J. R. (2007). Randomization tests: Statistical tools for assessing the effects of educational interventions when resources are scarce. In S. Sawilowsky (Ed.), Real data analysis (pp. 115-123). Greenwich, CT: Information Age.

9 Example 2: Slightly Larger Sample Sizes 8 typically developing (TD) second graders and 6 reading-disabled (RD) sixth graders were administered a sight-reading task, and the number of errors made by each student was:

TD           RD
 3            5
 4            6
 5            6
 2            5
 4            3
 4            6
 3
 2
M = 3.38     M = 5.17
SD = 1.06    SD = 1.17

With 14 students, 8 in one group and 6 in the other, there is a total of 14!/(8!6!) = 3,003 different possible assignments of scores to groups. With α = .05 (two-tailed), the .05 x 3,003 = 150 most extreme mean differences on either side of the permutation distribution would be included in the rejection region.

10 Example 2: Slightly Larger Sample Sizes For these data, the observed mean difference turns out to be the 52nd most extreme, which leads to rejection of the statistical hypothesis and which also yields a significance probability (p-value) of 52/3,003 = .017. Doing the calculations by hand can of course be tedious (and in many cases "impossible"), so one should not be afraid to apply Monte Carlo sampling procedures instead. For the present example, the exact p-value reported above is .017. With 5 replications of 10,000 samples (each taking less than 3 seconds) from a nice little freeware sampling program created by David Howell, University of Vermont, the successive p-values were .017, .016, .019, .016, and .018. Thus, one can conclude that it is statistically unlikely that the error distributions of these two groups are identical.
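An exact version of this test is small enough to run directly. The sketch below uses score lists consistent with the slide's reported means and SDs (the exact lists are an assumption reconstructed from those summaries):

```python
from itertools import combinations

# Reconstructed scores matching the reported summaries
# (TD: n = 8, M = 3.38, SD = 1.06; RD: n = 6, M = 5.17, SD = 1.17).
td = [3, 4, 5, 2, 4, 4, 3, 2]
rd = [5, 6, 6, 5, 3, 6]
pooled = td + rd
total = sum(pooled)
observed = sum(rd) / 6 - sum(td) / 8  # about 1.79

# Mean difference for each of the 3,003 ways to form a 6-score group.
diffs = []
for idx in combinations(range(14), 6):
    s = sum(pooled[i] for i in idx)
    diffs.append(s / 6 - (total - s) / 8)

# Two-tailed p-value: proportion at least as extreme as observed.
extreme = sum(abs(d) >= abs(observed) - 1e-9 for d in diffs)
print(len(diffs), extreme, round(extreme / len(diffs), 3))  # 3003 52 0.017
```

With these reconstructed scores, exactly 52 of the 3,003 assignments are at least as extreme as the observed one, reproducing the p = 52/3,003 = .017 reported on the slide.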

11 A Permutation-Test Primer (Based on Levin, 2007)
Rationale and assumptions
Samples and populations
Individuals and groups
Scores and ranks
–exact probabilities and sampling approximations for scores
–tables for ranks
Two-sample permutation test
–small hypothetical example, with all steps included
–larger examples, with some steps missing
  hypothetical example
  actual application
Levin, J. R. (2007). Randomization tests: Statistical tools for assessing the effects of educational interventions when resources are scarce. In S. Sawilowsky (Ed.), Real data analysis (pp. 115-123). Greenwich, CT: Information Age.

12 Example 3: Very Unequal Sample Sizes
66 discussion sections of a first-year university calculus class
2 of the sections were specifically targeted at mathematically talented students from under-represented groups
the average course grade attained by students in each of the 66 sections was calculated, after statistically controlling for relevant high-school achievement variables
the total number of mean differences in the permutation distribution is equal to 66!/(2!64!) = 2,145
the means in the two special sections were the 1st and 14th highest in the set of 66 means, and the corresponding p-value associated with the test of group identity was equal to .023.
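The slide does not name the test statistic; one statistic consistent with the reported p-value is the sum of the two targeted sections' ranks among the 66 adjusted section means. A sketch under that assumption:

```python
from itertools import combinations

# The two targeted sections held ranks 1 and 14 among the 66 means.
observed_rank_sum = 1 + 14

# All 66!/(2!64!) = 2,145 pairs of ranks the two sections could have
# drawn; count those with a rank sum at least as favorable (as small)
# as the observed one.
pairs = list(combinations(range(1, 67), 2))
extreme = sum(r1 + r2 <= observed_rank_sum for r1, r2 in pairs)
print(len(pairs), round(extreme / len(pairs), 3))  # 2145 0.023
```

Forty-nine of the 2,145 possible rank pairs sum to 15 or less, giving 49/2,145 ≈ .023 under this assumed statistic.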

13 Different Patterns of Baseline-to-Intervention Phase Change Kratochwill, T. R., & Levin, J. R. (1978). What time-series designs may have to offer educational researchers. Contemporary Educational Psychology, 3, 273-329.

14 What Summary Measure(s) Can/Should Be Analyzed by the To-Be-Presented Techniques?
Means (Levels) of the phases
Medians
Truncated/Censored data based on a priori rules
Even randomly selected observations?
Slopes (Trends) of the phases
Variances of the phases
Any "Predicted Pattern" within or between phases
Special considerations when groups are the units of intervention administration


17 Randomized Adaptations of These Designs
Randomized Phase Designs
–the order of the A and B phases is randomly determined for each case or "unit" (e.g., participant, pair, group, classroom)
Randomized Start-Point Designs
–the transition point from the A to the B phase(s) is randomly determined for each unit
Random Assignment of Cases to Interventions and/or to Intervention Orders
–when multiple units are included in the study

18 Background
Eugene Edgington's randomization-test contributions
Randomized phase designs
–Basic design (AB) and return-to-baseline design (ABA), including when A and B consist of two different interventions
–Reversal (or withdrawal or "operant") design (ABAB…AB) and alternating/simultaneous treatment design
–Multiple-baseline design
Randomized intervention start-point designs
–Basic design (AB) and return-to-baseline design (ABA), including when A and B consist of two different interventions
–Reversal (or withdrawal or "operant") design (ABAB…AB) and alternating/simultaneous treatment design
–Multiple-baseline design
Combination of the two [e.g., Koehler & Levin's (1998) regulated randomization multiple-baseline design]
Replications (i.e., multiple units) and extensions of these

19 Systematic vs. Randomized ABAB…AB and Alternating-Treatment Designs A few selected results from a Monte Carlo simulation study by Levin, Ferron, and Kratochwill (2012)

20 Five 24-Observation Designs (Individual Observations)
ABABABABABABABABABABABAB  (24 phases)
AABBAABBAABBAABBAABBAABB  (12 phases)
AAAABBBBAAAABBBBAAAABBBB  (6 phases)
AAAAAABBBBBBAAAAAABBBBBB  (4 phases)
AAAAAAAAAAAABBBBBBBBBBBB  (2 phases)

21 Type I Error Probability as a Function of Autocorrelation in the 24-Observation Design (Individual Observations)

22 Five 24-Observation ABAB…AB Designs (Phase Means)
ABABABABABABABABABABABAB  Block = 1 (24 phases)
BAABABABBAABBAABABBABABA  Random Pair (13-24 phases)
AABBAABBAABBAABBAABBAABB  Block = 2 (12 phases)
AAABBBAAABBBAAABBBAAABBB  Block = 3 (8 phases)
AAAABBBBAAAABBBBAAAABBBB  Block = 4 (6 phases)

23 Type I Error Probability as a Function of Autocorrelation in the 24-Observation Design (Phase Means)

24 Effect-Size Alert for Single-Case Research Outcomes, or Don't "Dis" Large Effect Sizes Here! Marquis et al. (2000) noted in their meta-analysis of positive behavior support that "[t]he smallest [conventional effect-size measure] for outcomes was 1.5, which would be considered quite large in a group study context" (p. 165); and that their effect-size estimates "ranged from 1.5 standardized units to 3.1 units" (p. 167). Rogers and Graham (2008, p. 885) indicated that "[W]hen we have used [the conventional method of effect-size calculation in meta-analyses of] single subject design studies in writing, the effect sizes are typically 3.0 and higher." In a single-case enuresis-treatment study conducted by Miller (1973), the conventional effect sizes calculated for the two participants were 5.98 and 6.41 (Busk & Serlin, 1992, pp. 201-202).

25 Power for d = 1.00 as a Function of Autocorrelation in the 24-Observation Design (Phase Means)

26 Major Conclusions 1. Claims that systematic ABAB…AB designs produce inflated Type I error probabilities in series with positively autocorrelated observations are themselves grossly "inflated." 2. A systematic design consisting of alternating A and B individual observations yields very respectable power for detecting larger effect sizes. From a methodological perspective, an even more appealing option is the randomized pairs (or, in certain situations, the randomized paired doubles) design, perhaps with one or more mandatory initial A' observations in situations where a true baseline is desired (Kratochwill & Levin, 2010). The latter is not an issue if A and B represent two alternative interventions.

27 Major Conclusions 3. With respect to statistical power, for a constant number of total observations a design with a greater number of phases trumps one with a greater number of observations per phase (akin to previous classroom-based research power findings comparing the number of classrooms and the number of students per classroom when the intraclass correlation is nonzero). On the other hand, a six-phase ABABAB design with 12 observations per phase yields nearly equivalent power (.77) to a 24-phase ABAB…AB design based on individual observations per phase (.80). 4. An analogous multiple-observation randomized pair version of the six-phase ABABAB design might have particular applicability in group- or classroom-based intervention research contexts (e.g., one group of students learning a series of 6 different content units according to 2 different instructional approaches).

28 References
Busk, P. L., & Serlin, R. C. (1992). Meta-analysis for single-case research. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case research design and analysis (pp. 187-212). Hillsdale, NJ: Erlbaum.
Levin, J. R., Ferron, J. M., & Kratochwill, T. R. (2012). Nonparametric statistical tests for single-case systematic and randomized ABAB…AB and alternating treatment intervention designs: New developments, new directions. Journal of School Psychology, 50, 599-624.
Marquis, J. G., Horner, R. H., Carr, E. G., Turnbull, A. P., Thompson, M., Behrens, G. A., et al. (2000). A meta-analysis of positive behavior support. In R. Gersten, E. P. Schiller, & S. Vaughn (Eds.), Contemporary special education research: Syntheses of knowledge base on critical instructional issues (pp. 137-178). Mahwah, NJ: Erlbaum.
Miller, P. M. (1973). An experimental analysis of retention control training in the treatment of nocturnal enuresis in two institutional adolescents. Behavior Therapy, 4, 288-294.
Rogers, L. A., & Graham, S. (2008). A meta-analysis of single subject design writing intervention research. Journal of Educational Psychology, 100, 879-906.

29 Adapted from Levin, J. R., & Wampold, B. E. (1999). Generalized single-case randomization tests: Flexible analyses for a variety of situations. School Psychology Quarterly, 14, 59-93.


31 Replicated AB Design With Three Cases ("Units"), Two Within-Series Intervention Conditions, 20 Time Periods, and 13 Potential Intervention Points for Each Case Marascuilo, L. A., & Busk, P. L. (1988). Combining statistics for multiple-baseline AB and replicated ABAB designs across subjects. Behavioral Assessment, 10, 1-28.

32 A Two-Intervention (Between Cases) Example

33 Levin & Wampold’s (1999) Simultaneous Start-Point Model Time Period 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Pair 1X A A A A A A A A A A A │ B B B B B B B B B Pair 1Y A A A A A A A A A A A │ B B B B B B B B B Note: Potential intervention start points are between Time Periods 5 and 17 inclusive. *Randomly selected intervention start point for the pair of units Levin, J. R., & Wampold, B. E. (1999). Generalized single-case randomization tests: Flexible analyses for a variety of situations. School Psychology Quarterly, 14, 59–93.

34 From Levin, J. R., & Wampold, B. E. (1997, July) Single-case randomization tests for a variety of situations. Paper presented at the 10th European Meeting of the Psychometric Society, Santiago de Compostela, Spain.

35 Means and Mean Differences Associated With Each of the 9 Potential Intervention Start Points, By Reinforcer Type

Start        Social              Token            T-S
Point     A     B    B-A     A     B    B-A    Diff  Rank
  3     13.0  13.9   0.9   12.5  16.3   3.8    2.9    9
  4     13.3  13.9   0.6   12.7  16.7   4.0    3.4    8
  5     13.5  13.9   0.0   12.5  17.3   4.8    4.8    7
  6     13.8  13.7  -0.1   12.6  17.9   5.3    5.4    3
 *7     13.7  13.8   0.0   13.2  18.2   5.0    5.0    4*
  8     14.0  13.4  -0.6   14.0  18.0   4.0    4.6    5.5
  9     14.1  13.0  -1.1   14.5  18.0   3.5    4.6    5.5
 10     14.4  11.7  -2.7   14.7  18.7   4.0    6.7    2
 11     14.5  10.0  -4.5   15.1  18.5   3.4    7.9    1

36 Levin & Wampold’s (1999) Replicated Simultaneous Start-Point Model Time Period 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Pair 1X A A A A A A A A A A A* B B B B B B B B B Pair 1Y A A A A A A A A A A A* B B B B B B B B B Pair 2X A A A A A A A A A* B B B B B B B B B B B Pair 2Y A A A A A A A A A* B B B B B B B B B B B Note: Potential intervention start points are between Time Periods 5 and 17 inclusive. *Randomly selected intervention start point for each pair of units Levin, J. R., & Wampold, B. E. (1999). Generalized single-case randomization tests: Flexible analyses for a variety of situations. School Psychology Quarterly, 14, 59–93

37 Additional Considerations A randomized intervention start-point statistical test is useful for comparing three or more different intervention conditions in a replicated AB design, although it yields respectable power only for larger effect sizes, numbers of outcome observations, and potential intervention start points. For two intervention conditions, a short-series version of the original test could have many useful classroom-based intervention-research applications. –mathematics intervention example The number of potential intervention start points is a more critical “power” factor than the total number of outcome observations per se. Levin, J. R., Lall, V. F., & Kratochwill, T. R. (2011). Extensions of a versatile randomization test for assessing single-case intervention effects. Journal of School Psychology, 49, 55-79.

38 Proposed Single-Case Intervention Design/Analysis Schemes* 1.AB Randomized Phase-Order Design With Intervention Start-Point Randomization *Levin, J. R., Ferron, J. M., & Gafurov, B. S. (2014). Improved randomization tests for a class of single-case intervention designs. Unpublished manuscript. [Referred to in a chapter by J. M. Ferron & J. R. Levin (2014). Single-case permutation and randomization statistical tests: Present status, promising new developments. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case intervention research: Methodological and statistical advances (pp. 153-183). Washington, DC: American Psychological Association.]

39 Suppose that in a 16-observation design, A and B are either a baseline phase and an intervention phase or two different interventions that a single case is to receive. The case is assigned randomly to one of the two phase orders (AB or BA). (With a true baseline-intervention design this can be accomplished by including one or more mandatory baseline/adaptation observations (A') for both phase orders.) The random assignment of phase orders is required for the subsequent AB randomization test (modified Edgington test) to be valid. In addition, the case receives a randomly selected intervention start point, with an a priori specification of 10 potential start points, from Observations 4 through 13 inclusive.

40 AB Randomized Phase-Order Design (With Mandatory Initial A' Baseline Phase)
With the original Edgington (1975) model, the study can be diagrammed as:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
A' A' A' A A A B B B* B B B B B B B B B B
However, with the revised Edgington model, the opposite "pretend" ordering of As and Bs is also possible and therefore can be included in the randomization distribution:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
A' A' A' B B B A A A* A A A A A A A A A A

41 AB Randomized Phase-Order Design (With Mandatory Initial A' Baseline Phase) For Edgington's model, with 10 potential intervention start points, even if the observed outcome were the most extreme, the lowest attainable one-tailed significance probability would be p = 1/10 = .10. In contrast, for the revised model with both orders taken into account, even with only 10 potential intervention start points it would be possible to attain a one-tailed significance probability of p = 1/20 = .05. Moreover, our simulation results indicate that the revised procedure: (1) maintains satisfactory Type I error control; and (2) exhibits power that is generally about .20-.40 points higher (and in some situations, even more) than that of the original Edgington procedure.
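The two minimum p-values follow directly from the sizes of the respective randomization distributions; a trivial sketch:

```python
# With k potential intervention start points, the original Edgington
# model's randomization distribution has k outcomes; the revised model
# also permutes the phase order (AB vs. BA), doubling it.
k = 10
single = k      # original model: 10 equally likely outcomes
dual = 2 * k    # revised model: 20 equally likely outcomes

# Smallest attainable one-tailed significance probability in each.
print(1 / single, 1 / dual)  # 0.1 0.05
```

This is why the dual-randomization scheme can reach conventional significance (p = .05) with a start-point pool that the single scheme cannot.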

42 Comparison (α = .05, one-tailed) of randomization tests for a one-case (N = 1) AB randomized intervention start-point design (Single) and the randomized intervention start-point plus randomized intervention-order design (Dual), where the start point was randomly selected from the 6th through the 25th observations inclusive in a 30-observation study.

43 Comparison (α =.05, one-tailed) of randomization tests for a one- case (N = 1) AB randomized intervention start-point design (Single) and the randomized intervention start-point plus randomized intervention-order design (Dual). The effect size is 2.0 and the number of potential intervention start points (x) is equal to the series length minus 10 and encompasses the middle x observations.

44 Comparison (α = .05, one-tailed) of randomization tests for the Single and Dual basic AB randomized designs replicated across N cases. The rejection rate of the null hypothesis is shown as a function of effect size and N, for a 15-observation design with 5 potential intervention start points designated between the 6th and 10th observations inclusive and an autocorrelation of .3.

45 Comparison (α = .05, one-tailed) of randomization tests for the Single and Dual randomized ABAB designs replicated across N cases. The rejection rate of the null hypothesis is shown as a function of effect size and N, for an autocorrelation of .3 and a 23-observation design with a minimum of 5 observations in each of the four phases.

46 Selected Single- Versus Dual-Randomization Power Comparisons of Longer and Shorter Series Simulations (SL = Series Length, NSP = Number of Potential Intervention Start Points)

N    d     r     Size (SL/NSP)    Single   Dual   Difference
2    2.0   .30   Longer (15/5)     .44      .85      .41
                 Shorter (9/5)     .42      .80      .38
3    1.5   .30   Longer (15/5)     .49      .90      .41
                 Shorter (7/3)     .28      .73      .45
5    1.0   .30   Longer (15/5)     .45      .89      .44
                 Shorter (8/2)     .15      .71      .56

47 Proposed Single-Case Intervention Design/Analysis Schemes*
1. AB Randomized Phase-Order Design With Intervention Start-Point Randomization
2. AB Crossover Design With Intervention Start-Point Randomization
*Levin, J. R., Ferron, J. M., & Gafurov, B. S. (2014). Improved randomization tests for a class of single-case intervention designs. Unpublished manuscript. [Referred to in a chapter by J. M. Ferron & J. R. Levin (2014). Single-case permutation and randomization statistical tests: Present status, promising new developments. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case intervention research: Methodological and statistical advances (pp. 153-183). Washington, DC: American Psychological Association.]

48 Now suppose that A and B consist of two different interventions that all cases are to receive. The cases are assigned randomly to one of the two intervention orders (AB or BA) and again, each case receives a randomly selected intervention start point between O8 and O14 inclusive. So, with N = 2 cases and 20 observations, with an a priori specification of k = 7 potential intervention start points for each case, we have:
Case 1  B B B B B B B B A* A A A A A A A A A A A
Case 2  A A A A A A A A A A A A B* B B B B B B B

49 With the Marascuilo-Busk (1988) model, there are k^N (here, with k = 7 and N = 2, 7^2 = 49) possible randomization-distribution outcomes. In contrast, with the revised "both orders" procedure there are 2^N (2^2 = 4) times more possible outcomes (specifically, 4 x 49 = 196, which is four times as many as in the original procedure). With N = 3, there would be 8 times as many possible randomization-distribution outcomes (2,744 vs. 343). One can use this model to test the hypothesis that the two interventions are equally effective. And as will be illustrated shortly and tomorrow, with paired cases one can similarly adapt the Levin-Wampold (1999) model to test the same hypothesis with greater power.
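These counts generalize directly; a sketch (the function name is illustrative, not from the source):

```python
# Randomization-distribution size: k potential start points per case,
# N cases, optionally multiplied by 2 per case when each case's
# intervention order (AB vs. BA) is also randomized.
def n_outcomes(k, n_cases, both_orders=False):
    return (k ** n_cases) * (2 ** n_cases if both_orders else 1)

print(n_outcomes(7, 2), n_outcomes(7, 2, both_orders=True))  # 49 196
print(n_outcomes(7, 3), n_outcomes(7, 3, both_orders=True))  # 343 2744
```

A larger randomization distribution means smaller attainable p-values, which is the source of the revised procedure's power advantage.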

50 ABA Design (Two Phase Start-Point Choices, B and A2)
N = 10, k = 2; min n(A1) = 2, min n(B) = 2, min n(A2) = 2
Total number of start-point combinations = C(N - Σ min n(i) + k, k) = C(10 - 6 + 2, 2) = C(6, 2) = 15
Restrictions for the present specifications:
1. There must be at least 2 observations in each phase. Therefore:
a. The B phase must start between T3 and T7 inclusive.
b. The A2 phase must start between T5 and T9 inclusive.
2. The sum of the observations across the 3 phases must be 10.
This procedure extends directly to ABAB designs and beyond.
Based on Onghena, P. (1992). Randomization tests for extensions and variations of ABAB single-case experimental designs: A rejoinder. Behavioral Assessment, 14, 153-171.

51 Permissible n per Phase and Phase Start Points

       n per phase     Phase start points
       A1   B   A2         B     A2
 1.     2   2    6         3      5
 2.     2   3    5         3      6
 3.     2   4    4         3      7
 4.     2   5    3         3      8
 5.     2   6    2         3      9
 6.     3   2    5         4      6
 7.     3   3    4         4      7
 8.     3   4    3         4      8
 9.     3   5    2         4      9
10.     4   2    4         5      7
11.     4   3    3         5      8
12.     4   4    2         5      9
13.     5   2    3         6      8
14.     5   3    2         6      9
15.     6   2    2         7      9
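The 15 rows of this table can be enumerated directly from the phase-length restrictions; a Python sketch (variable names are illustrative):

```python
# 10-observation ABA design, at least 2 observations in every phase.
N, min_n = 10, 2

# Enumerate every permissible (B start, A2 start) pair.
combos = []
for b_start in range(min_n + 1, N + 1):              # A1 gets >= 2 obs
    for a2_start in range(b_start + min_n, N + 1):   # B gets >= 2 obs
        if N - a2_start + 1 >= min_n:                # A2 gets >= 2 obs
            combos.append((b_start, a2_start))

print(len(combos), combos[0], combos[-1])  # 15 (3, 5) (7, 9)
```

The enumeration reproduces the table: 15 combinations, running from B starting at T3 with A2 at T5, through B at T7 with A2 at T9, matching C(6, 2) = 15.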

52 Multiple-Baseline Design
Revusky's (1967) statistical procedure
–Revusky, S. H. (1967). Some statistical treatments compatible with individual organism methodology. Journal of the Experimental Analysis of Behavior, 10, 319-330. [Revised/improved procedure by Levin, J. R., Ferron, J. M., & Gafurov, B. S. (2014). Modification and extension of Revusky's rank test for single-case multiple-baseline designs. Research in progress.]
Wampold and Worsham's (1986) improved permutation procedure
–Wampold, B. E., & Worsham, N. L. (1986). Randomization tests for multiple-baseline designs. Behavioral Assessment, 8, 135-143.
Type I error and power assessments have been made by Ferron and Sentovich (2002) of the Wampold-Worsham (1986), Marascuilo-Busk (1988), and Koehler-Levin (1998) procedures
–Ferron, J., & Sentovich, C. (2002). Statistical power of randomization tests used with multiple-baseline designs. Journal of Experimental Education, 70, 165-178.

53 Wampold, B. E., & Worsham, N. L. (1986). Randomization tests for multiple-baseline designs. Behavioral Assessment, 8, 135-143.

54 Multiple-Baseline Design With Randomized Intervention Start Points
Koehler, M. J., & Levin, J. R. (1998). Regulated randomization: A potentially sharper analytical tool for the multiple-baseline design. Psychological Methods, 3, 206-217.

55 Minimum Number of Units (N), Outcome Assessments (T), and Potential Intervention Start Points for Each Unit (k) in Order to Detect an Intervention Effect Based on α = .05: Comparison of Three Nonparametric Multiple-Baseline Statistical Procedures

Procedure                 Primary Basis of Comparison    N    T      k
Revusky (1967)            Between Cases                  4    3-5a   1
Wampold-Worsham (1986)    Within Cases                   4    5      1
Koehler-Levin (1998)      Within Cases                   3    7      2

a With the Revusky procedure, if only raw post-intervention outcomes are analyzed (rather than standardized or regression-adjusted outcomes), no pre-intervention outcome assessment period is necessary. Moreover, if only raw, standardized, or adjusted post-intervention outcomes are analyzed (rather than within-case pre-intervention vs. post-intervention differences), for each successive case the intervention need not be continued beyond the first outcome assessment following its introduction. These are two practical advantages of the Revusky procedure that should be considered (see Levin, Evmenova, & Gafurov, 2014).

56 Application of the General Regulated Randomization Formula to Various "Multiple-Baseline" Nonparametric Procedures

Procedure                        Sample Specifications          No. of Possible Outcomes
Marascuilo-Busk (1988)           N = 3; k1 = k2 = k3 = 6        0! (6)(6)(6) = 216
Sampling-Without-Replacement     N = 3; k1 = 6, k2 = 5,         0! (6)(5)(4) = 120
  Analog to Marascuilo-Busk        k3 = 4
Wampold-Worsham (1986)           N = 3; k1 = k2 = k3 = 1        3! (1)(1)(1) = 6
Revusky (1967)                   N = 3; k1 = k2 = k3 = 1        3! (1)(1)(1) = 6

Note: N = the number of units (or randomized units in the regulated randomization approach) and ki = the number of specified start points associated with each partition.
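All of the table's counts follow from one rule: (number of permutations of the order-randomized units) x (product of the ki). A sketch, with a boolean flag standing in for the table's 0! vs. 3! distinction (the function name is illustrative):

```python
from math import factorial

# General regulated-randomization count: permute the unit order (when
# the procedure randomizes it) and multiply each unit's number of
# potential intervention start points.
def regulated_outcomes(ks, order_randomized):
    prod = 1
    for k in ks:
        prod *= k
    return (factorial(len(ks)) if order_randomized else 1) * prod

print(regulated_outcomes([6, 6, 6], False))  # Marascuilo-Busk: 216
print(regulated_outcomes([6, 5, 4], False))  # without-replacement analog: 120
print(regulated_outcomes([1, 1, 1], True))   # Wampold-Worsham / Revusky: 6
```

Since 0! = 1, procedures that do not randomize the unit order simply contribute a factor of 1, which is what the table's 0! notation encodes.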


58 Proposed Single-Case Intervention Design/Analysis Schemes*
1. AB Randomized Phase-Order Design With Intervention Start-Point Randomization
2. AB Crossover Design With Intervention Start-Point Randomization
3. "Uber Supernova" Multiple-Baseline Comparative-Treatment Design With Intervention Start-Point Randomization
*Levin, J. R., Ferron, J. M., & Gafurov, B. S. (2014). Improved randomization tests for a class of single-case intervention designs; and Levin, J. R., Ferron, J. M., & Gafurov, B. S. (2014). Modification and extension of Revusky's rank test for single-case multiple-baseline designs. [Referred to in a chapter by Levin, J. R., Evmenova, A. S., & Gafurov, B. S. (2014). The single-case data-analysis ExPRT (Excel® Package of Randomization Tests). In T. R. Kratochwill & J. R. Levin (Eds.), Single-case intervention research: Methodological and statistical advances (pp. 185-219). Washington, DC: American Psychological Association.]

59 Multiple-Baseline Comparative-Treatment Design (Modified Levin-Wampold & Koehler-Levin Designs)
Time Period  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Pair 3X  A A A A B* B B B B B B B B B B B
Pair 3Y  A A A A B* B B B B B B B B B B B
Pair 1Y  A A A A A A B* B B B B B B B B B
Pair 1X  A A A A A A B* B B B B B B B B B
Pair 4Y  A A A A A A A A A A B* B B B B B
Pair 4X  A A A A A A A A A A B* B B B B B
Pair 2X  A A A A A A A A A A A A A A B* B
Pair 2Y  A A A A A A A A A A A A A A B* B
Note: Pairs are randomly assigned to stagger positions. Interventions X and Y are randomly assigned within each pair. The bolded letters represent the potential start points for each pair, with the asterisked B indicating the actual start point randomly selected for each pair.

60 Some Randomization-Test Software
Randomized Phase Designs
Edgington & Onghena's (2007) SCRT (Single-Case Randomization Tests) program, in their book
Levin, Ferron, & Kratochwill's (2012) SAS software for various ABAB…AB and alternating treatment designs
Randomized Intervention Start-Point Designs
Edgington & Onghena's (2007) SCRT program, in their book (also Bulté & Onghena, 2008)
Koehler's (2012) program for the Koehler-Levin (1998) multiple-baseline procedure (http://mkoehler.educ.msu.edu/regrand/)
Gafurov & Levin's (2014) ExPRT (Excel Package of Randomization Tests); downloadable at http://code.google.com/p/exprt
Other
Borckardt et al.'s (2008) Simulation Modeling Analysis (SMA) program

61 Preliminary and Projected Features of Gafurov & Levin's ExPRT Statistical Software
Features of the programs include:
exact nonparametric statistical analyses based on some type of randomization
options for either random or fixed intervention start points and intervention orders
designs based on either one or multiple cases (replications)
an unlimited number of total observations for up to 15 cases
either individual or paired cases
analyses conducted with either raw or standardized data

62 Preliminary and Projected Features of the ExPRT Statistical Software
Features of the programs include:
user-defined α levels (one- or two-tailed tests)
statistical decisions (reject, do not reject) and significance probabilities (p-values)
statistical tests based on either mean (level) or slope (trend)
output distribution of all possible outcomes
graph of the outcomes for each case
case-by-case and across-case summary measures and effect-size estimates
a randomizing routine for planned studies

63 AB Design
Basic time-series design
Baseline/Control (A) vs. Intervention (B)
Intervention A vs. Intervention B
Intervention start-point randomization procedure (Edgington model; Marascuilo-Busk model; Levin-Wampold simultaneous start-point model for two different matched interventions: comparative-effectiveness and general-effectiveness tests)
Levin et al.'s (2014) randomized intervention-order option; single-case crossover design: intervention-effect and time-effect tests.


71 Additional ExPRT Randomization Tests
ABA Design
Intervention start-point randomization (Onghena model)
Overall test and separate two-phase tests
Reversal (ABAB) Design
Intervention start-point randomization (Onghena model)
Overall test and separate two-phase tests
Multiple-Baseline Design
Within-case comparisons (Wampold-Worsham model)
Within-case comparisons; intervention start-point randomization (Koehler-Levin model)
Between-case comparisons (Revusky model)

72 Additional References
Bulté, I., & Onghena, P. (2008). An R package for single-case randomization tests. Behavior Research Methods, 40, 467-478.
Edgington, E. S., & Onghena, P. (2007). Randomization tests (4th ed.). Boca Raton, FL: Chapman & Hall/CRC.
Gafurov, B. S., & Levin, J. R. (2014). ExPRT (Excel Package of Randomization Tests). Downloadable at http://code.google.com/p/exprt.
Koehler, M. J., & Levin, J. R. (1998). Regulated randomization: A potentially sharper analytical tool for the multiple-baseline design. Psychological Methods, 3, 206-217.
Levin, J. R., & Ferron, J. M. (in press). Review of Dugard, File, and Todman's Single-case and small-n designs: A practical guide to randomization tests (2nd ed.). American Statistician.
Levin, J. R., Ferron, J. M., & Kratochwill, T. R. (2012). Nonparametric statistical tests for single-case systematic and randomized ABAB…AB and alternating treatment intervention designs: New developments, new directions. Journal of School Psychology, 50, 599-624.
Levin, J. R., Lall, V. F., & Kratochwill, T. R. (2011). Extensions of a versatile randomization test for assessing single-case intervention effects. Journal of School Psychology, 49, 55-79.
Levin, J. R., Marascuilo, L. A., & Hubert, L. J. (1978). N = nonparametric randomization tests. In T. R. Kratochwill (Ed.), Single subject research: Strategies for evaluating change (pp. 167-196). New York, NY: Academic Press.
Levin, J. R., & Wampold, B. E. (1999). Generalized single-case randomization tests: Flexible analyses for a variety of situations. School Psychology Quarterly, 14, 59-93.


Download ppt "Statistical Randomization Techniques for Single-Case Intervention Data Statistical Randomization Techniques for Single-Case Intervention Data Joel R. Levin."

Similar presentations


Ads by Google