
1 Multiple Comparisons Q560: Experimental Methods in Cognitive Science Lecture 10

2 The problem with t-tests…
We could compare three groups with multiple t-tests: M1 vs. M2, M1 vs. M3, M2 vs. M3. But this causes our chance of a Type I error (alpha) to compound with each test we do: for n tests, the error rate grows to 1 - (1 - alpha)^n. Testwise error: the probability of a Type I error on any one statistical test. Experimentwise error: the probability of a Type I error over all statistical tests in an experiment. ANOVA keeps our experimentwise error = alpha.
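A minimal Python sketch of that compounding (the alpha level and the numbers of tests below are just illustrative):

    # Experimentwise (familywise) Type I error when running n independent tests,
    # each at the same testwise alpha: 1 - (1 - alpha)**n
    alpha = 0.05
    for n_tests in (1, 3, 10):
        experimentwise = 1 - (1 - alpha) ** n_tests
        print(f"{n_tests} tests: experimentwise error = {experimentwise:.3f}")

With three pairwise t-tests the familywise rate is already about .14, and with ten it passes .40, which is exactly the problem the omnibus ANOVA avoids.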

3 Pairwise Comparisons The overall significance test is called the omnibus ANOVA.
When rejecting H0 in an ANOVA test, we just know there is a difference somewhere; we need to do some detective work to find it. Testwise error is p(Type I error) on any one test. Experimentwise error is p(Type I error) over a series of separate hypothesis tests. We have to make sure we do not exceed a .05 chance of a Type I error while we "investigate".

4 ANOVA CSI Detective The omnibus ANOVA tells you a variance crime has been committed: at least one pair of means is "guilty" of causing the overall omnibus variance. Your job is to investigate further and figure out where the abnormal variance is coming from.

5 “Even a stopped clock is right twice a day”
Pairwise Comparisons Planned comparisons (a priori tests) are based on theory and are planned before the data are collected. Unplanned comparisons (post-hoc tests) are "fishing expeditions" after the data have been observed. Planned comparisons are preferred because they have a much smaller chance of making a Type I error.

6 Post-Hoc Tests Exploring data after H0 has been rejected
Specific tests are used to control experimentwise error. The simplest possible approach is to follow up with t-tests using a Bonferroni correction to alpha (the Dunn test): alpha is divided across the number of comparisons. Use paired t-tests (repeated-measures ANOVA) or independent t-tests (one-way ANOVA).
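A rough sketch of a Bonferroni (Dunn) follow-up in Python, assuming an independent-groups design; the group names and scores are made up for illustration:

    # Follow-up t-tests on every pair of groups, each tested at alpha divided by
    # the number of comparisons (Bonferroni/Dunn correction).
    from itertools import combinations
    from scipy import stats

    groups = {
        "Placebo": [0, 1, 1, 1, 2],   # hypothetical scores
        "Drug A":  [1, 2, 2, 2, 3],
        "Drug B":  [3, 4, 4, 4, 5],
    }
    alpha = 0.05
    pairs = list(combinations(groups, 2))
    alpha_per_test = alpha / len(pairs)       # divide alpha across comparisons

    for g1, g2 in pairs:
        t, p = stats.ttest_ind(groups[g1], groups[g2])
        sig = "significant" if p < alpha_per_test else "ns"
        print(f"{g1} vs {g2}: t = {t:.2f}, p = {p:.4f} ({sig} at {alpha_per_test:.4f})")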

7 Post-Hoc Tests Other post-hoc tests have specific methods for controlling experimentwise error. There are over a dozen unique tests; we'll just look at three. Tukey's HSD: HSD = q * sqrt(MSwithin / n), where q is the studentized range statistic. You look up the value for q in a table using k (the number of treatment groups) and dfwithin. HSD is the minimal mean difference needed for significance.

8 Tukey's HSD Data for three drugs designed to act as pain relievers, plus a placebo control. [Data table with columns Placebo, Drug A, Drug B, Drug C; n = 5 scores per group.]

9 Tukey’s HSD For our drug experiment:
k = 4, n = 5, dfW = 16, MSW = 2.00. We look up q from the table and get q = 4.05, so HSD = 4.05 * sqrt(2.00 / 5) = 2.56: mean differences greater than 2.56 are significant. Our means were 1, 2, 4, and 5 from the drug study. So 1 < 4, 1 < 5, and 2 < 5; all other pairs of means are statistically equal.
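The same arithmetic in Python (the mapping of the four means 1, 2, 4, 5 onto Placebo and Drugs A-C is my assumption; the slide only lists the means):

    # HSD = q * sqrt(MS_within / n), using the slide's values:
    # k = 4 groups, n = 5 per group, df_within = 16, MS_within = 2.00, q = 4.05
    from itertools import combinations
    from math import sqrt

    q, ms_within, n = 4.05, 2.00, 5
    hsd = q * sqrt(ms_within / n)
    print(f"HSD = {hsd:.2f}")                 # about 2.56

    means = {"Placebo": 1, "Drug A": 2, "Drug B": 4, "Drug C": 5}
    for g1, g2 in combinations(means, 2):
        diff = abs(means[g1] - means[g2])
        verdict = "significant" if diff > hsd else "ns"
        print(f"{g1} vs {g2}: |difference| = {diff} ({verdict})")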

10 Scheffe's Test Scheffe's test is considered one of the most conservative (cautious) tests. It computes an F-ratio whose numerator is the MS between the two treatments you are comparing, but it uses the MS error from the omnibus ANOVA as the denominator and k - 1 as the numerator df. So the critical value for a Scheffe F-ratio is the same as it was for the omnibus ANOVA.

11 Even though you're only comparing two means, they were selected from k overall means, so k is used to determine the df. From the drug experiment, let's compare Placebo to Drug C (M = 1 vs. M = 5): Fcrit(3, 16) = 3.24, and the obtained F exceeds it, so Placebo < Drug C.
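A sketch of that Scheffe calculation in Python; the obtained F value is not shown in the transcript, so the number below is my reconstruction from the slide's means and MS values:

    # Scheffe test for Placebo (M = 1) vs Drug C (M = 5): the numerator uses the
    # between-groups MS for just these two treatments but with k - 1 = 3 df; the
    # denominator is MS_within from the omnibus ANOVA.
    n, k = 5, 4
    m_placebo, m_drug_c = 1, 5
    ms_within, f_crit = 2.00, 3.24

    grand = (m_placebo + m_drug_c) / 2
    ss_between = n * ((m_placebo - grand) ** 2 + (m_drug_c - grand) ** 2)
    ms_between = ss_between / (k - 1)          # divide by k - 1, not by 1
    f_scheffe = ms_between / ms_within
    print(f"F = {f_scheffe:.2f} vs F_crit(3, 16) = {f_crit}")   # 6.67 > 3.24

Because the obtained F exceeds the omnibus critical value, the Placebo vs. Drug C difference survives even this conservative test.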

12 Planned vs. Unplanned Comparisons
Consider an experiment with 5 groups: we have 10 possible pairwise comparisons (1 vs. 2, 1 vs. 3, etc.). Assume that the null is true, but by chance two of the means are far enough apart to erroneously reject (i.e., the data contain a Type I error). If you plan a single comparison beforehand, you have a 1/10 chance of selecting the one comparison that happened to have a Type I error.

13 Planned vs. Unplanned Comparisons
If you look at the data first, you are certain to make a Type I error: you are doing all of the comparisons in your head, even though you only do the math for one. If you plan the comparisons beforehand (and they are a subset of all possible comparisons), p(Type I) is much lower than if you snoop at the data first.

14 [Figure: table of all pairwise comparisons, annotated "Some of these look redundant to me" and "I'm only interested in Control < 2 and 3".]

15 "The complete null" We don't always care about the complete null; here, I'm only interested in Control < 2 and Control < 3.

16 Since the two comparisons I need to do are a priori and independent, I don't inflate the familywise (FW) error rate. This really just involves picking contrast coefficients, e.g., for my active-passive category learning experiment.

17 Planned Comparisons Planned comparisons (a priori tests) are based on theory and are planned before the data are collected. If comparisons are planned in advance, the likelihood of making a Type I error is smaller than if the comparisons were made on a post-hoc basis (because we are guessing at only a subset of the hypotheses). If you are making all pairwise comparisons, it won't make a difference whether the comparisons were planned in advance or not.

18 Orthogonal Linear Contrasts
We can define a linear combination of weighted means for a particular hypothesis: L = c1*M1 + c2*M2 + ... + ck*Mk. We set the condition that the weights sum to zero (sum of cj = 0) for L to count as a linear contrast. We also want our contrasts to be orthogonal, that is, they don't contain overlapping amounts of information.

19 We also want our contrasts to be orthogonal, that is, they don't contain overlapping amounts of information. For example, knowing that M1 is greater than the mean of M2 and M3 does tell us that M1 is likely greater than M3, so those two comparisons carry overlapping information. When the members of a set of contrasts are independent of each other, they are called orthogonal. Conditions for orthogonality: the sum of (aj * bj) = 0, where a and b are the weights for two different contrasts, and the number of comparisons equals the number of df for treatments (k - 1). (A quick check of these conditions in code appears after the coefficient table below.)

20 Building orthogonal contrasts for five groups (1, 2, 3, 4, 5) by branching:

Contrast               Coefficients (groups 1-5)
(1, 2) vs. (3, 4, 5)    3   3  -2  -2  -2
(1) vs. (2)             1  -1   0   0   0
(3) vs. (4, 5)          0   0   2  -1  -1
(4) vs. (5)             0   0   0   1  -1

We start at the trunk; once we have formed two branches, we never compare groups on the same limb to those on a different limb.

21 (The same contrast tree and coefficient table as the previous slide.)
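A small Python check of these slides, using the coefficient values as reconstructed above: orthogonality means the sum of cross-products of any two contrasts' weights is zero, and the overlapping example from slide 19 fails that test.

    from itertools import combinations

    contrasts = {
        "(1,2) vs (3,4,5)": ( 3,  3, -2, -2, -2),
        "(1) vs (2)":       ( 1, -1,  0,  0,  0),
        "(3) vs (4,5)":     ( 0,  0,  2, -1, -1),
        "(4) vs (5)":       ( 0,  0,  0,  1, -1),
    }
    # Every pair from the branching tree gives a zero cross-product sum.
    for (name_a, a), (name_b, b) in combinations(contrasts.items(), 2):
        print(name_a, "x", name_b, "->", sum(x * y for x, y in zip(a, b)))

    # Counter-example (slide 19): "1 vs (2,3)" and "1 vs 3" overlap in information.
    a = (2, -1, -1, 0, 0)
    b = (1,  0, -1, 0, 0)
    print("non-orthogonal pair ->", sum(x * y for x, y in zip(a, b)))   # 3, not 0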

22 Trend Analysis Often, we are not interested in differences between groups per se, but rather in the overall trend across groups (especially in RM ANOVA): linear, quadratic, cubic, etc. [Figures: linear, quadratic, and cubic trend shapes.] The weights come from tables of orthogonal polynomial coefficients, or (fortunately) SPSS has them built in for us.
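A hedged sketch of a trend contrast for five ordered groups, using the standard orthogonal polynomial weights; the group means and per-group n below are invented just to show the mechanics:

    # Orthogonal polynomial weights for k = 5 groups (from standard tables).
    linear    = (-2, -1,  0,  1,  2)
    quadratic = ( 2, -1, -2, -1,  2)

    means = (2.0, 3.1, 3.9, 5.2, 6.1)   # hypothetical means, roughly linear increase
    n = 10                               # hypothetical per-group n (equal groups)

    def contrast_ss(coeffs, means, n):
        """Sum of squares for a contrast with equal n: n * psi**2 / sum(c**2)."""
        psi = sum(c * m for c, m in zip(coeffs, means))
        return n * psi ** 2 / sum(c ** 2 for c in coeffs)

    print("linear trend SS:   ", round(contrast_ss(linear, means, n), 2))
    print("quadratic trend SS:", round(contrast_ss(quadratic, means, n), 2))

Each trend contrast has 1 df and is tested against the error MS from the ANOVA.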

23 Multiple Comparisons w/ SPSS
Let's use the category learning experiment to test our post-hoc and a priori hypotheses in SPSS. Category learning: if the information sampled is the important factor in learning, then we expect the two "intelligent" sampling conditions to outperform the "random" sampling condition.
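The slides run this analysis in SPSS; as a rough Python analogue (the group names, score values, and sample sizes below are all invented), the omnibus ANOVA, Tukey post-hocs, and the planned contrast "random < the two intelligent-sampling conditions" might look like:

    import numpy as np
    from scipy import stats
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rng = np.random.default_rng(0)
    random_cond = rng.normal(0.60, 0.08, 20)   # hypothetical accuracy per subject
    generate    = rng.normal(0.72, 0.08, 20)
    yoked       = rng.normal(0.70, 0.08, 20)

    # Omnibus one-way ANOVA
    f, p = stats.f_oneway(random_cond, generate, yoked)
    print(f"omnibus F = {f:.2f}, p = {p:.4f}")

    # Post-hoc: Tukey's HSD on all pairwise comparisons
    scores = np.concatenate([random_cond, generate, yoked])
    labels = ["random"] * 20 + ["generate"] * 20 + ["yoked"] * 20
    print(pairwise_tukeyhsd(scores, labels).summary())

    # Planned (a priori) contrast: random vs. the two intelligent conditions
    weights = np.array([-2, 1, 1])
    group_means = np.array([g.mean() for g in (random_cond, generate, yoked)])
    print("contrast value (positive favors intelligent sampling):", weights @ group_means)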

24 Experiment 1

Condition   Exploration   Exemplar Sampling
Random      Passive       Uniform
Generate    Active        "Intelligent"
Yoked       Passive       "Intelligent"

480 training trials (with feedback), followed by 480 test trials (no feedback)

