Random Effects & Repeated Measures: Alternatives to Fixed Effects Analyses
Questions
- What is the difference between fixed and random effects in terms of treatments?
- How are F tests with random effects different from F tests with fixed effects?
- How is a repeated measures design different from a totally between-subjects design in the collection of the data?
Questions (2)
- How does the significance testing change from a totally between-subjects design to one in which one or more factors are repeated measures (just the general idea; you don't need to show actual F ratios or computations)?
- Describe one argument for using repeated measures designs and one argument against using such designs (or describe when you would and would not want to use repeated measures).
Fixed Effects Designs
- All treatment conditions of interest are included in the study.
- All participants in a cell get the identical stimulus (treatment, IV combination).
- Interest is in specific means.
- Expected mean squares are (relatively) simple; F tests are all based on a common error term.
Random Effects Designs
- Treatment conditions are sampled; some or many conditions of interest are excluded.
- Replications of the experiment would get different treatments because treatments are sampled.
- Interest is in the variance produced by an IV rather than in specific means.
- Expected mean squares are relatively complex; the denominator for F changes depending on the effect being tested.
Fixed vs. Random
Conditions | Random | Fixed
Treatment | Sampled | All of interest
Replication | Different | Same
Interest | Variance due to IV | Means due to IV
Examples | Random | Fixed
 | Persuasiveness of commercials | Sex of participant
 | Experimenter effect | Drug dosage
 | Impact of team members | Training program effectiveness
Single Factor Random The expected mean squares and F-test for the single random factor are the same as those for the single factor fixed-effects design.
Experimenter effects (Hays Table 13.4.1) 2 3 4 5 5.8 6.0 6.3 6.4 5.7 5.1 6.1 5.5 5.9 6.6 6.5 5.6 6.2 5.4 5.3 6.7 5.2 6.21 6.33 6.16
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F
Model | 4 | 3.48150000 | 0.87037500 | 10.72 | <.0001
Error | 35 | 2.84250000 | 0.08121429 | |
Corrected Total | 39 | 6.32400000 | | |
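With a random factor the interest is in variance rather than means, so the mean squares in this table can be turned into an estimated variance component and an intraclass correlation. The sketch below does this in Python rather than R. It assumes a balanced design (the 40 observations split evenly over 5 experimenters, 8 each), which the degrees of freedom suggest but the table does not state. The variable names are illustrative.

```python
# Mean squares and df copied from the ANOVA table for the experimenter data.
ms_model = 0.87037500
ms_error = 0.08121429
df_model = 4
df_error = 35

# ASSUMPTION: balanced design, so n per experimenter = N / number of levels.
n_levels = df_model + 1                      # 5 experimenters
n_total = df_model + df_error + 1            # 40 observations
n_per_level = n_total // n_levels            # 8 per experimenter

# Single random factor: same F ratio as the fixed-effects design.
f_ratio = ms_model / ms_error

# Expected mean squares: E(MS_model) = var_error + n * var_experimenter,
# so the variance component is estimated by solving for var_experimenter.
var_experimenter = (ms_model - ms_error) / n_per_level

# Intraclass correlation: share of total variance due to experimenters.
icc = var_experimenter / (var_experimenter + ms_error)

print(f"F = {f_ratio:.2f}")
print(f"experimenter variance component = {var_experimenter:.4f}")
print(f"ICC = {icc:.3f}")
```

The F ratio matches the 10.72 in the table. The variance component and ICC are estimates under the balanced-design assumption only.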
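Random Effects Significance Tests (A & B random/within)
A has J levels, B has K levels, n observations per cell.
Source | E(MS) | F | df
A | $\sigma^2_e + n\sigma^2_{AB} + nK\sigma^2_A$ | $MS_A / MS_{AB}$ | J-1, (J-1)(K-1)
B | $\sigma^2_e + n\sigma^2_{AB} + nJ\sigma^2_B$ | $MS_B / MS_{AB}$ | K-1, (J-1)(K-1)
AxB | $\sigma^2_e + n\sigma^2_{AB}$ | $MS_{AB} / MS_{Error}$ | (J-1)(K-1), JK(n-1)
Error | $\sigma^2_e$ | |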
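A fixed, B random
Source | E(MS) | F | df
A | $\sigma^2_e + n\sigma^2_{AB} + nK\sum_j \alpha_j^2/(J-1)$ | $MS_A / MS_{AB}$ | J-1, (J-1)(K-1)
B | $\sigma^2_e + nJ\sigma^2_B$ | $MS_B / MS_{Error}$ | K-1, JK(n-1)
AxB | $\sigma^2_e + n\sigma^2_{AB}$ | $MS_{AB} / MS_{Error}$ | (J-1)(K-1), JK(n-1)
Error | $\sigma^2_e$ | |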
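The rule behind these tables is that the denominator for a main effect depends on whether the other factor is random. A minimal sketch of that rule follows. It assumes the restricted mixed model common in psychology texts, where a main effect is tested against the interaction only when the other factor is random. The function name and the mean squares in the usage lines are hypothetical, not values from the slides.

```python
def two_way_f_ratios(ms_a, ms_b, ms_ab, ms_error, a_random, b_random):
    """Return F ratios for A, B, and AxB in a balanced two-way design.

    ASSUMPTION: restricted mixed model. A main effect is tested against
    MS_AB when the OTHER factor is random, because the sampled interaction
    effects then leak into that main effect's expected mean square.
    Otherwise it is tested against MS_error.
    """
    denom_a = ms_ab if b_random else ms_error
    denom_b = ms_ab if a_random else ms_error
    return {
        "A": ms_a / denom_a,
        "B": ms_b / denom_b,
        "AxB": ms_ab / ms_error,  # interaction is always tested against error
    }


# Hypothetical mean squares, for illustration only.
ms = dict(ms_a=40.0, ms_b=30.0, ms_ab=10.0, ms_error=5.0)

both_fixed = two_way_f_ratios(**ms, a_random=False, b_random=False)
both_random = two_way_f_ratios(**ms, a_random=True, b_random=True)
a_fixed_b_random = two_way_f_ratios(**ms, a_random=False, b_random=True)

print("both fixed:       ", both_fixed)
print("both random:      ", both_random)
print("A fixed, B random:", a_fixed_b_random)
```

With the same mean squares, the F for A drops from 8 to 4 once B is random, because the larger interaction mean square replaces the error mean square in the denominator.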
Why the Funky MS?
- Treatment effects for A, B, & AxB are the same for fixed & random in the population of treatments.
- In a fixed design we have the whole population of treatments; in a random design we have only a sample of them.
- Therefore, in a given (random) study, the interaction effects need not sum to zero.
- The AxB effects then appear in the main effects, which is why the main effects are tested against the AxB mean square.
Applications of Random Effects
- Reliability and Generalizability
  - How many judges do I need to get a reliability of .8?
  - How well does this score generalize to a particular universe of scores?
- Intraclass correlations (ICCs)
- Estimated variance components
- Meta-analysis
- Control (Randomized Blocks and Repeated Measures)
- Sample as many conditions as you can afford (5+ if possible)
Review
- What is the difference between fixed and random effects in terms of treatments?
- How are F tests with random effects different from F tests with fixed effects?
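The "how many judges" question is commonly answered with the Spearman-Brown prophecy formula. The slides do not name that formula, so treat its use here as an assumption. The sketch takes the reliability of a single judge (for example, a single-rater ICC) and returns the number of judges whose averaged ratings should reach a target reliability. The function names and the .5 and .3 starting reliabilities are hypothetical.

```python
import math


def pooled_reliability(single_judge_reliability, n_judges):
    """Spearman-Brown: reliability of the mean of n_judges parallel judges."""
    r = single_judge_reliability
    return n_judges * r / (1 + (n_judges - 1) * r)


def judges_needed(single_judge_reliability, target_reliability):
    """Smallest number of judges whose mean rating reaches the target."""
    r, target = single_judge_reliability, target_reliability
    k = target * (1 - r) / (r * (1 - target))
    # Round before taking the ceiling so floating-point noise
    # (e.g. 4.000000000000001) does not add a spurious extra judge.
    return math.ceil(round(k, 9))


# Hypothetical single-judge reliabilities, target of .8 as on the slide.
for r in (0.5, 0.3):
    k = judges_needed(r, 0.8)
    print(f"single-judge r = {r}: need {k} judges, "
          f"pooled reliability = {pooled_reliability(r, k):.3f}")
```

The formula assumes the judges are interchangeable (parallel) raters. If they are not, the estimate is only a rough guide.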
Repeated Measures Designs
- In a repeated measures design, participants appear in more than one cell. Each participant receives ALL the levels of the repeated factor, not just some of them.
- Painfree study: note the design. AVOID a single-group pre-post design.
- Sports instruction
- Commonly used in psychology
Pros & Cons of RM
Pro
- Individuals serve as their own control: improved power
- May be cheaper to run
- Useful when participants are scarce
Con
- Carry-over effects
- Participant sees the design: demand characteristics
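RM – Participant 'Factor'
Source | df | MS | E(MS) | F
Between Subjects | K-1 | $MS_S$ | $\sigma^2_e + J\sigma^2_S$ | No test
Within Subjects | K(J-1) | | |
  Treatments | J-1 | $MS_T$ | $\sigma^2_e + \sigma^2_{ST} + K\sum_j \tau_j^2/(J-1)$ | $MS_T / MS_{ST}$
  Subjects x Treatments | (J-1)(K-1) | $MS_{ST}$ | $\sigma^2_e + \sigma^2_{ST}$ |
Total | JK-1 | | |
Generic representation of a single-factor within design (K subjects, J treatments).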
Drugs on Reaction Time
- Order of drug randomized. All subjects (Ss) receive all drugs. Interest is in drug. Note that the Trial effect is ignored.
Person | Drug 1 | Drug 2 | Drug 3 | Drug 4 | Mean
1 | 30 | 28 | 16 | 34 | 27
2 | 14 | 18 | 10 | 22 | 16
3 | 24 | 20 | 18 | 30 | 23
4 | 38 | 34 | 20 | 44 | 34
5 | 26 | 28 | 14 | 30 | 24.5
Mean | 26.4 | 25.6 | 15.6 | 32 | 24.9
- Drug is fixed; person is random.
- '1 Factor' repeated measures design.
- Notice 1 person per cell.
- We can get 3 SS: row, column, and residual (interaction plus error).
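Total SS
Person | Drug 1 | Drug 2 | Drug 3 | Drug 4 | Mean
1 | 30 | 28 | 16 | 34 | 27
2 | 14 | 18 | 10 | 22 | 16
3 | 24 | 20 | 18 | 30 | 23
4 | 38 | 34 | 20 | 44 | 34
5 | 26 | 28 | 14 | 30 | 24.5
Mean | 26.4 | 25.6 | 15.6 | 32 | 24.9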
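Drug SS
M is the drug mean; D = M - 24.9 (the grand mean).
Drug | M | D | D*D
1 | 26.4 | 1.5 | 2.25
2 | 25.6 | 0.7 | 0.49
3 | 15.6 | -9.3 | 86.49
4 | 32 | 7.1 | 50.41
Each D*D counts once per person (5 people): 5 x 139.64 = 698.20.
Person SS
M is the person mean; D = M - 24.9 (the grand mean).
Person | M | D | D*D
1 | 27 | 2.1 | 4.41
2 | 16 | -8.9 | 79.21
3 | 23 | -1.9 | 3.61
4 | 34 | 9.1 | 82.81
5 | 24.5 | -0.4 | 0.16
Each D*D counts once per drug (4 drugs): 4 x 170.20 = 680.8.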
Summary
Total = 1491.8; Drugs = 698.2; People = 680.8.
Residual = Total - (Drugs + People) = 1491.8 - (698.2 + 680.8) = 112.8
Source | SS | df | MS | F
Between people | 680.8 | 4 | | Nuisance variance
Within people (by summing) | 811.0 | 15 | |
  Drugs | 698.2 | 3 | 232.73 | 24.76
  Residual | 112.8 | 12 | 9.40 |
Total | 1491.8 | 19 | |
Fcrit(.05; df = 3, 12) = 3.49
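The hand computation above can be checked with a few lines of code. This is a Python sketch, not the R run referred to in the slides. The cell values are the ones consistent with the row means, column means, and sums of squares reported for the drug data. The variable names are illustrative.

```python
# Reaction-time scores: one row per person, one column per drug.
# ASSUMPTION: cell values are those consistent with the reported means and SS.
scores = [
    [30, 28, 16, 34],
    [14, 18, 10, 22],
    [24, 20, 18, 30],
    [38, 34, 20, 44],
    [26, 28, 14, 30],
]
n_people = len(scores)
n_drugs = len(scores[0])

grand_mean = sum(sum(row) for row in scores) / (n_people * n_drugs)
person_means = [sum(row) / n_drugs for row in scores]
drug_means = [sum(row[j] for row in scores) / n_people for j in range(n_drugs)]

# Three sums of squares: total, rows (people), columns (drugs).
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_people = n_drugs * sum((m - grand_mean) ** 2 for m in person_means)
ss_drugs = n_people * sum((m - grand_mean) ** 2 for m in drug_means)

# Residual = person x drug interaction plus error (one score per cell).
ss_residual = ss_total - ss_people - ss_drugs

df_drugs = n_drugs - 1
df_residual = (n_drugs - 1) * (n_people - 1)
ms_drugs = ss_drugs / df_drugs
ms_residual = ss_residual / df_residual

# Drug (fixed) is tested against the residual, not against a within-cell error.
f_drugs = ms_drugs / ms_residual

print(f"SS total={ss_total:.1f} people={ss_people:.1f} "
      f"drugs={ss_drugs:.1f} residual={ss_residual:.1f}")
print(f"F({df_drugs}, {df_residual}) = {f_drugs:.2f}")
```

The people sum of squares is computed but never tested. It is removed from the error term, which is where the gain in power comes from.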
R code Run the same problem using R.
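2 Factor, 1 Repeated
Group | Subject | B1 | B2 | B3 | B4 | M
A1 | 1 | 0 | 0 | 5 | 3 | 2
A1 | 2 | 3 | 1 | 5 | 4 | 3.25
A1 | 3 | 4 | 3 | 6 | 2 | 3.75
A2 | 4 | 4 | 2 | 7 | 8 | 5.25
A2 | 5 | 5 | 4 | 6 | 6 | 5.25
A2 | 6 | 7 | 5 | 8 | 9 | 7.25
M | | 3.83 | 2.5 | 6.17 | 5.33 | 4.46
- DV = errors in setting control dials.
- IV (A) is dial calibration: between subjects.
- IV (B) is dial shape: within subjects.
- Order of observation is randomized over dial shape.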
Summary
Note that different factors are tested with different error terms.
Source | SS | df | MS | F
Between people | 68.21 | 5 | |
  A (calibration) | 51.04 | 1 | 51.04 | 11.9
  Subjects within groups | 17.17 | 4 | 4.29 |
Within people | 69.75 | 18 | |
  B (dial shape) | 47.46 | 3 | 15.82 | 12.76
  AB | 7.46 | 3 | 2.49 | 2.01
  B x Subjects within groups | 14.83 | 12 | 1.24 |
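The partition in this table can be reproduced directly, which makes the two error terms explicit. This is a Python sketch rather than the R run mentioned in the slides. The cell values are an assumption: they are the scores consistent with the sums of squares in the summary table (3 subjects per calibration group, 4 dial shapes each). The variable names are illustrative.

```python
# Errors in setting dials: group -> subjects -> scores for B1..B4.
# ASSUMPTION: cell values are those consistent with the reported SS.
groups = {
    "A1": [[0, 0, 5, 3], [3, 1, 5, 4], [4, 3, 6, 2]],
    "A2": [[4, 2, 7, 8], [5, 4, 6, 6], [7, 5, 8, 9]],
}
subjects = [row for rows in groups.values() for row in rows]
n_a = len(groups)                 # levels of the between factor
n_b = len(subjects[0])            # levels of the repeated factor
n_per_group = len(groups["A1"])   # subjects per group (balanced)
n_subjects = len(subjects)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(x for row in subjects for x in row)

# Between-people part: group effect (A) plus subjects within groups.
ss_between_people = n_b * sum((mean(row) - grand) ** 2 for row in subjects)
ss_a = n_per_group * n_b * sum(
    (mean(x for row in rows for x in row) - grand) ** 2
    for rows in groups.values()
)
ss_subj_within = ss_between_people - ss_a

# Within-people part: B, AxB, and B x subjects within groups.
ss_total = sum((x - grand) ** 2 for row in subjects for x in row)
ss_within_people = ss_total - ss_between_people
ss_b = n_subjects * sum(
    (mean(row[j] for row in subjects) - grand) ** 2 for j in range(n_b)
)
ss_cells = n_per_group * sum(
    (mean(row[j] for row in rows) - grand) ** 2
    for rows in groups.values() for j in range(n_b)
)
ss_ab = ss_cells - ss_a - ss_b
ss_b_subj = ss_within_people - ss_b - ss_ab

df_a, df_subj = n_a - 1, n_a * (n_per_group - 1)
df_b, df_ab = n_b - 1, (n_a - 1) * (n_b - 1)
df_b_subj = df_subj * (n_b - 1)

# A uses the between-people error; B and AxB use the within-people error.
f_a = (ss_a / df_a) / (ss_subj_within / df_subj)
f_b = (ss_b / df_b) / (ss_b_subj / df_b_subj)
f_ab = (ss_ab / df_ab) / (ss_b_subj / df_b_subj)

print(f"A:  SS={ss_a:.2f}  F({df_a},{df_subj})={f_a:.2f}")
print(f"B:  SS={ss_b:.2f}  F({df_b},{df_b_subj})={f_b:.2f}")
print(f"AB: SS={ss_ab:.2f}  F({df_ab},{df_b_subj})={f_ab:.2f}")
```

The F ratios can differ from the table in the second decimal place because the table divides rounded mean squares.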
Graph Run the problem in R.
Post Hoc Tests
- Post hoc tests with repeated measures are tricky.
- You have to use the proper error term for each test. The error term changes depending on what you are testing. Be sure to look up the right error term.
- In R, you will have to look for examples of the kind of thing you want for post hoc tests.
- Generally speaking, avoid ANOVA for repeated measures designs. Use MANOVA instead (see next slide).
Assumptions of RM
- Orthogonal ANOVA assumes homogeneity of error variance within cells and independent IVs.
- With repeated measures, we introduce covariance (correlation) across cells. For example, the correlation of scores across subjects 1-3 for the first two dial shapes (B1 and B2) is .89.
- Repeated measures designs make assumptions about the homogeneity of covariance matrices across conditions for the F test to work properly.
- If the assumptions are not met, you have problems and may need to make adjustments.
- You can avoid these assumptions by using multivariate techniques (MANOVA) to analyze your data. I suggest you do so. Howell likes corrections to df.
- If you use ANOVA, you need to look up your design to get the right F tests and check on the assumptions.
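One common correction to df is the Greenhouse-Geisser adjustment. The slides do not say which correction is meant, so that choice is an assumption. The sketch estimates epsilon from the covariance matrix of the repeated measures and multiplies both df of the F test by it. The data are the drug scores, with cell values assumed to be those consistent with the reported means and sums of squares. The variable names are illustrative.

```python
# Drug reaction-time scores: one row per person, one column per drug.
# ASSUMPTION: cell values are those consistent with the reported means and SS.
scores = [
    [30, 28, 16, 34],
    [14, 18, 10, 22],
    [24, 20, 18, 30],
    [38, 34, 20, 44],
    [26, 28, 14, 30],
]
n = len(scores)        # people
j = len(scores[0])     # repeated conditions

# Sample covariance matrix of the repeated measures (n - 1 denominator).
col_means = [sum(row[c] for row in scores) / n for c in range(j)]
cov = [
    [
        sum((row[a] - col_means[a]) * (row[b] - col_means[b]) for row in scores)
        / (n - 1)
        for b in range(j)
    ]
    for a in range(j)
]

# Double-center the covariance matrix (remove row, column, and grand means).
row_avg = [sum(row) / j for row in cov]
grand_avg = sum(row_avg) / j
centered = [
    [cov[a][b] - row_avg[a] - row_avg[b] + grand_avg for b in range(j)]
    for a in range(j)
]

# Greenhouse-Geisser epsilon: 1 when sphericity holds, 1/(j-1) at worst.
trace = sum(centered[a][a] for a in range(j))
sum_sq = sum(v * v for row in centered for v in row)
epsilon = trace ** 2 / ((j - 1) * sum_sq)

# Both df of the repeated measures F test are multiplied by epsilon.
df_effect_adj = epsilon * (j - 1)
df_error_adj = epsilon * (j - 1) * (n - 1)

print(f"epsilon = {epsilon:.3f}")
print(f"adjusted df = {df_effect_adj:.2f}, {df_error_adj:.2f}")
```

The F ratio itself does not change. Only the df used to look up the critical value shrink, which makes the test more conservative when the covariance assumption is violated.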
Review
- How is a repeated measures design different from a totally between-subjects design in the collection of the data?
- How does the significance testing change from a totally between-subjects design to one in which one or more factors are repeated measures (just the general idea; you don't need to show actual F ratios or computations)?
- Describe one argument for using repeated measures designs and one argument against using such designs (or describe when you would and would not want to use repeated measures).
If time
- R code: simple judge reliability
- Judges and targets are crossed
If time..
- Paired associate learning experiment: 8 randomly chosen participants were given 3 lists of 35 pairs of words to learn.
- Lists were given in random order to each participant.
- Score is the number correctly recalled on the first trial.
- Do these lists differ in difficulty?
Paired associate data Subject A B C 1 22 15 18 2 9 12 3 16 13 10 4 19 20 6 17 7 14 8