Published by Owen Knight. Modified over 9 years ago.
1
1/87 Group 5 AMS 572 Professor: Wei Zhu
2
Foram Sanghvi: Brief review of ANOVA
Shihui Xiang: Introduction to repeated measures design
Qianzhu Wu: One-way repeated measures ANOVA
Yue Tang: Using the REPEATED statement of PROC ANOVA
Yan Xu: Two-factor ANOVA with repeated measures on one factor
Weina Gao: Two-factor experiments with repeated measures on both factors
Yi Hu: Three-factor experiments with a repeated measure on the last factor
Xiaoke Fei: Three-factor experiments with repeated measures on two factors
Yuzhou Song: Mixed model
3
Foram Sanghvi 3 / 87
4
One-way ANOVA tests the equality of several population means. It is an extension of the pooled-variance t-test.
$H_0$ (null hypothesis): $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_a$
$H_a$ (alternative hypothesis): at least one of the means differs from the rest.
Assumptions:
Equal population variances
Normal populations
Independent samples
5
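Test statistic: $F_0 = \mathrm{MSA}/\mathrm{MSE} \sim F_{a-1,\,N-a}$ under $H_0$
Conclusion: reject $H_0$ if $F_0 > F_{a-1,\,N-a}$ (the upper critical value at the chosen significance level)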
6
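MSA = variance among the group means (between-group mean square)
MSE = mean of the within-group variances (within-group mean square)
Total sample size: $N = \sum_{j=1}^{a} n_j$, where $n_j$ is the number of units receiving effect $j$
Sample mean of group $j$: $\bar{Y}_{\cdot j} = \frac{1}{n_j} \sum_{i=1}^{n_j} Y_{ij}$
Grand mean: $\bar{Y}_{\cdot\cdot} = \frac{1}{N} \sum_{j=1}^{a} \sum_{i=1}^{n_j} Y_{ij}$
$Y_{ij}$ = observed response from experimental unit $i$ when receiving effect $j$, with $Y_{ij} \sim N(\mu_j, \sigma^2)$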
7
The most distinct disadvantage of the analysis of variance (ANOVA) method is that it requires assumptions to be made:
1. The population of each data group must be (roughly) normally distributed.
2. All variances from each data group must be (roughly) equal.
Obviously, we rarely have this luxury in real-world applications.
In addition, the normal subject-to-subject variation may strongly affect the error sum of squares.
9
A repeated measures design is one in which at least one of the factors consists of repeated measurements on the same subjects or experimental units, under different conditions.
10
A repeated measures design involves measuring subjects at different points in time (typically after different treatments).
It can be viewed as an extension of the paired-samples t-test (which involves only two related measures).
Thus the measures, unlike in "regular" ANOVA, are correlated; that is, the observations are not independent.
11
Data are collected in a sequence of evenly spaced points in time.
Treatments are assigned to experimental units.
12
By collecting data from the same participants under repeated conditions, individual differences can be eliminated or reduced as a source of between-group differences.
Also, the sample is not divided between conditions or groups, so inferential testing becomes more powerful.
This design is also economical when sample members are difficult to recruit, because each member is measured under all conditions.
14
14/87 As with any ANOVA, repeated measures ANOVA tests the equality of means. However, repeated measures ANOVA is used when all members of a random sample are measured under a number of different conditions. As the sample is exposed to each condition, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: the data violate the ANOVA assumption of independence.
15
The simplest example of a repeated measures design is a paired t-test. Each subject is measured twice (time 1 and time 2) on the same variable, or each member of a pair of matched participants is assigned to one of two treatment levels. If we observe participants at more than two time points, then we need to conduct a repeated measures ANOVA.
16
What we would like to do is decompose the variability into:
(1) A random effect
(2) A fixed effect
The effect of participants is always a random effect.
We will only consider situations where the factor is a fixed effect.
17
$Y_{ij} = \mu_j + S_i + \varepsilon_{ij}$
$\mu_j$ = the fixed effect.
$S_i$ = the random effect of subject $i$.
$\varepsilon_{ij}$ = the random error, independent of $S_i$.
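One way to see why repeated measurements on the same subject are correlated under this model is to work out the covariance it implies. The sketch below assumes, beyond what the slide states, that $S_i \sim N(0, \sigma_S^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$ with all terms mutually independent; the symbols $\sigma_S^2$ and $\sigma^2$ are introduced here only for illustration.

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \operatorname{Var}(S_i) + \operatorname{Var}(\varepsilon_{ij}) = \sigma_S^2 + \sigma^2, \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \operatorname{Var}(S_i) = \sigma_S^2 \quad (j \neq k), \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_S^2}{\sigma_S^2 + \sigma^2} \quad (j \neq k).
\end{aligned}
```

Under these assumptions every condition has the same variance and every pair of conditions has the same covariance, which is the compound symmetry pattern.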
19
Assumptions of a repeated measures design
For a repeated measures design, we start with the same assumptions as a paired t-test:
Participants are independent and randomly selected from the population.
Normality (actually symmetry).
Because there are more than two measurements on each participant, we have an additional assumption on the variances.
23
The assumptions we have to check for a repeated measures design are:
1. Participants are independent and randomly selected from the population
2. Normality (actually symmetry)
3. Compound symmetry
24
Consider the following experiment: we have four drugs (1, 2, 3 and 4) that relieve pain. Each subject is given each of the four drugs, and the subject's pain tolerance is then measured. Enough time is allowed to pass between successive drug administrations so that we can be sure there is no residual effect from the previous drug.
The null hypothesis is: Mean(1) = Mean(2) = Mean(3) = Mean(4)
25
In the one-way analysis of variance without a repeated measure, each subject would receive only one of the four drugs. In this design, each subject is measured under each of the drug conditions. This has several important advantages.
26
Each subject acts as his or her own control. That is, drug effects are calculated from the deviations between each drug score and that subject's average score across the drugs. The normal subject-to-subject variation can thus be removed from the error sum of squares.
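To make the effect of removing subject-to-subject variation concrete, here is a minimal Python sketch (not part of the original SAS material) that partitions the sums of squares by hand. It uses the four-subject pain data from the SAS example in this presentation; the function name `one_way_rm_anova` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Pain-tolerance scores from the presentation's SAS example:
# one row per subject, one column per drug (1 to 4).
pain = [
    [5, 9, 6, 11],
    [7, 12, 8, 9],
    [11, 12, 10, 14],
    [3, 8, 5, 8],
]


def one_way_rm_anova(rows):
    """Partition the total sum of squares for a one-way repeated measures layout."""
    n = len(rows)       # number of subjects
    k = len(rows[0])    # number of levels of the repeated factor
    grand = sum(sum(r) for r in rows) / (n * k)
    subj_means = [sum(r) / k for r in rows]
    drug_means = [sum(r[j] for r in rows) / n for j in range(k)]

    ss_total = sum((y - grand) ** 2 for r in rows for y in r)
    ss_drug = n * sum((m - grand) ** 2 for m in drug_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_drug - ss_subj  # residual once subjects are removed

    ms_drug = ss_drug / (k - 1)
    # Repeated measures: subject variation is taken out of the error term.
    f_repeated = ms_drug / (ss_error / ((n - 1) * (k - 1)))
    # Ignoring subjects: subject variation stays inside the error term.
    f_ignoring = ms_drug / ((ss_error + ss_subj) / (k * (n - 1)))
    return {
        "ss_drug": ss_drug,
        "ss_subj": ss_subj,
        "ss_error": ss_error,
        "f_repeated": f_repeated,
        "f_ignoring": f_ignoring,
    }


result = one_way_rm_anova(pain)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

On these data the subject sum of squares (70.25) is larger than the residual (13.25), so removing it raises the F ratio for DRUG from about 2.41 to about 11.38.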
29
SAS code without using the REPEATED statement
DATA PAIN;
   INPUT SUBJ DRUG PAIN;
DATALINES;
1 1 5
1 2 9
1 3 6
1 4 11
2 1 7
2 2 12
……
;
PROC ANOVA DATA=PAIN;
   TITLE 'without repeated statement';
   CLASS SUBJ DRUG;
   MODEL PAIN=SUBJ DRUG;
   MEANS DRUG/DUNCAN;
RUN;
The DATA step can be reconstructed to read one line of data per subject:
DATA PAIN;
   INPUT SUBJ @;
   DO DRUG = 1 TO 4;
      INPUT PAIN @;
      OUTPUT;
   END;
DATALINES;
1 5 9 6 11
2 7 12 8 9
3 11 12 10 14
4 3 8 5 8
;
30
SAS code without using the REPEATED statement
DATA PAIN;
   INPUT SUBJ @;
   DO DRUG = 1 TO 4;
      INPUT PAIN @;
      OUTPUT;
   END;
DATALINES;
1 5 9 6 11
2 7 12 8 9
3 11 12 10 14
4 3 8 5 8
;
DO ... END: an iterative loop.
Trailing @: keeps reading from the same line of data.
With one line of data per subject, data entry is a lot easier!
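For readers more familiar with general-purpose languages, the DO loop with a trailing @ is in effect a wide-to-long reshape: one input line per subject becomes one record per subject and drug. The following Python sketch mirrors that logic as an illustration only; `wide_to_long` is a hypothetical helper, not a SAS feature.

```python
# One line per subject: subject number followed by the four pain scores,
# exactly as in the DATALINES block of the DATA step.
wide_lines = [
    "1 5 9 6 11",
    "2 7 12 8 9",
    "3 11 12 10 14",
    "4 3 8 5 8",
]


def wide_to_long(lines, n_levels=4):
    """Emit one (subj, drug, pain) record per score, like the DO loop with OUTPUT."""
    records = []
    for line in lines:
        fields = line.split()
        subj = int(fields[0])                   # INPUT SUBJ @;
        for drug in range(1, n_levels + 1):     # DO DRUG = 1 TO 4;
            pain = int(fields[drug])            # INPUT PAIN @;
            records.append((subj, drug, pain))  # OUTPUT;
    return records


long_records = wide_to_long(wide_lines)
for record in long_records[:6]:
    print(record)
```

The first six records match the long-format DATALINES shown in the first version of the DATA step (1 1 5, 1 2 9, and so on).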
31
SAS code without using the REPEATED statement
Remark 1: about the DO statement
The general form:
DO variable = start TO end BY increment;
   (SAS statements)
END;
start: initial value
end: ending value
increment: default is 1
32
SAS code without using the REPEATED statement
Remark 1: about the DO statement
In our example:
DO DRUG = 1 TO 4;
   INPUT PAIN @;
   OUTPUT;
END;
Initial value: 1
Ending value: 4
Trailing @: keeps reading from the same line of data.
END: returns to "DO".
33
SAS code without using the REPEATED statement
Remark 2: about the ANOVA procedure
PROC ANOVA DATA=PAIN;
   TITLE 'without repeated statement';
   CLASS SUBJ DRUG;
   MODEL PAIN=SUBJ DRUG;
   MEANS DRUG/DUNCAN;
RUN;
No "|" between SUBJ and DRUG: each is a main effect, and there is no interaction term between them.
34
SAS code using the REPEATED statement
DATA REPEAT;
   INPUT PAIN1-PAIN4;
DATALINES;
5 9 6 11
7 12 8 9
11 12 10 14
3 8 5 8
;
PROC ANOVA DATA=REPEAT;
   TITLE 'using repeated statement';
   MODEL PAIN1-PAIN4 = / NOUNI;
   REPEATED DRUG 4 (1 2 3 4);
RUN;
35
SAS code using the REPEATED statement
Remark 1: about the data set
We need the data set in the form:
SUBJ PAIN1 PAIN2 PAIN3 PAIN4
Notice that it does not have a DRUG variable.
36
SAS code using the REPEATED statement
Remark 2: about the REPEATED statement
The general form, to compute pairwise comparisons:
REPEATED factor_name CONTRAST(n);
n is a number from 1 to k, where k is the number of levels of the repeated factor.
To get all pairwise contrasts, we need k-1 REPEATED statements.
37
SAS code using the REPEATED statement
Remark 2: about the REPEATED statement
In our example:
PROC ANOVA DATA=REPEAT;
   TITLE 'using repeated statement';
   MODEL PAIN1-PAIN4 = / NOUNI;
   REPEATED DRUG 4 CONTRAST(1) / SUMMARY;
   REPEATED DRUG 4 CONTRAST(2) / SUMMARY;
   REPEATED DRUG 4 CONTRAST(3) / SUMMARY;
RUN;
SUMMARY: requests ANOVA tables for each contrast.
38
SAS code using the REPEATED statement
Remark 3: more explanation of the ANOVA procedure
PROC ANOVA DATA=REPEAT;
   TITLE 'using repeated statement';
   MODEL PAIN1-PAIN4 = / NOUNI;
   REPEATED DRUG 4 (1 2 3 4);
RUN;
No CLASS statement: our data set does not have an independent variable.
NOUNI: do not conduct a separate analysis for each of the four PAIN variables.
4: the repeated factor "DRUG" has four levels; optional.
(1 2 3 4): the labels we want printed for each level of DRUG.
39
SAS code using the REPEATED statement
Remark 4: comparison of the DATA steps

DATA PAIN;
   INPUT SUBJ DRUG PAIN;
DATALINES;
1 1 5
1 2 9
1 3 6
1 4 11
2 1 7
2 2 12
……
;

DATA PAIN;
   INPUT SUBJ @;
   DO DRUG = 1 TO 4;
      INPUT PAIN @;
      OUTPUT;
   END;
DATALINES;
1 5 9 6 11
2 7 12 8 9
3 11 12 10 14
4 3 8 5 8
;

DATA REPEAT;
   INPUT PAIN1-PAIN4;
DATALINES;
5 9 6 11
7 12 8 9
11 12 10 14
3 8 5 8
;
41
Factor A: GROUP. Factor B: TIME (repeated).

Group       Subject   PRE   POST
Control     1         80    83
Control     2         85    86
Control     3         83    88
Treatment   4         82    94
Treatment   5         87    93
Treatment   6         84    98
44
Total variance, df = N-1
   Between subjects
      Treatment, df = a-1
      Error due to subjects within treatment, df = a(n-1)
   Within subjects
      Time, df = b-1
      Treatment × time, df = (a-1)(b-1)
      Error or residual, df = a(n-1)(b-1)
a: number of treatment groups
b: number of time points
n: number of subjects per treatment
N = a×b×n: total number of measurements
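As a quick arithmetic check on this partition, the short Python sketch below computes each degrees-of-freedom term and confirms that they add up to N-1. The function name `df_partition` is hypothetical; the values a=2, b=2, n=3 are those of the pre/post example in this presentation (two groups, two time points, three subjects per group).

```python
def df_partition(a, b, n):
    """Degrees of freedom for a two-factor design with repeated measures on one factor."""
    between = {
        "treatment": a - 1,
        "subjects within treatment": a * (n - 1),
    }
    within = {
        "time": b - 1,
        "treatment x time": (a - 1) * (b - 1),
        "error": a * (n - 1) * (b - 1),
    }
    total = a * b * n - 1
    # The between- and within-subject pieces must account for all N-1 df.
    assert sum(between.values()) + sum(within.values()) == total
    return between, within, total


between, within, total = df_partition(a=2, b=2, n=3)
print(between)
print(within)
print("total df:", total)
```

For the pre/post example this gives 1 and 4 df between subjects, and 1, 1 and 4 df within subjects, which matches the DF column of the SAS output for that example.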
45
Source                d.f.           SS     MS
Factor A              a-1            SSA    MSA = SSA/(a-1)
Factor B              b-1            SSB    MSB = SSB/(b-1)
AB interaction        (a-1)(b-1)     SSAB   MSAB = SSAB/[(a-1)(b-1)]
Subjects (within A)   a(n-1)         SSWA   MSWA = SSWA/[a(n-1)]
Error                 a(n-1)(b-1)    SSE    MSE = SSE/[a(n-1)(b-1)]
Total                 nab-1          SST
47
data prepost;
   input subj group $ pretest postest;
datalines;
1 c 80 83
2 c 85 86
3 c 83 88
4 t 82 94
5 t 87 93
6 t 84 98
;
run;
proc anova data=prepost;
   title 'Two-way ANOVA with a Repeated Measure on One Factor';
   class group;
   model pretest postest = group / nouni;
   repeated time 2 (0 1);
   means group;
run;
48
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
Statistic                Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.13216314   26.27     1        4        0.0069
Pillai's Trace           0.86783686   26.27     1        4        0.0069
Hotelling-Lawley Trace   6.56640625   26.27     1        4        0.0069
Roy's Greatest Root      6.56640625   26.27     1        4        0.0069

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*group Effect
Statistic                Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.32611465   8.27      1        4        0.0452
Pillai's Trace           0.67388535   8.27      1        4        0.0452
Hotelling-Lawley Trace   2.06640625   8.27      1        4        0.0452
Roy's Greatest Root      2.06640625   8.27      1        4        0.0452
49
Tests of Hypotheses for Between Subjects Effects
Source   DF   Anova SS      Mean Square   F Value   Pr > F
group    1    90.75000000   90.75000000   11.84     0.0263
Error    4    30.66666667   7.66666667

Univariate Tests of Hypotheses for Within Subject Effects
Source        DF   Anova SS      Mean Square   F Value   Pr > F
time          1    140.0833333   140.0833333   26.27     0.0069
time*group    1    44.0833333    44.0833333    8.27      0.0452
Error(time)   4    21.3333333    5.3333333

Level of         -----------pretest-----------   -----------postest-----------
group      N     Mean          Std Dev           Mean          Std Dev
c          3     82.6666667    2.51661148        85.6666667    2.51661148
t          3     84.3333333    2.51661148        95.0000000    2.64575131
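The F values in this output can be cross-checked by computing the sums of squares directly. The Python sketch below is an illustration, not the method SAS uses internally; it assumes a balanced design, and `split_plot_anova` and `mean` are hypothetical helper names. The data are the six PREPOST subjects from the SAS program in this presentation.

```python
# (group, pretest, postest) for each subject, as in the PREPOST data set.
data = [
    ("c", 80, 83), ("c", 85, 86), ("c", 83, 88),
    ("t", 82, 94), ("t", 87, 93), ("t", 84, 98),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def split_plot_anova(rows):
    """F ratios for one between-subjects factor and one repeated factor (balanced data)."""
    groups = sorted({r[0] for r in rows})
    a = len(groups)           # levels of the between-subjects factor
    b = len(rows[0]) - 1      # levels of the repeated factor
    n = len(rows) // a        # subjects per group (balance is assumed)
    grand = mean(y for r in rows for y in r[1:])

    def ss(means, weight):
        # Weighted sum of squared deviations from the grand mean.
        return weight * sum((m - grand) ** 2 for m in means)

    ss_total = ss((y for r in rows for y in r[1:]), 1)
    ss_subjects = ss((mean(r[1:]) for r in rows), b)
    ss_group = ss((mean(y for r in rows if r[0] == g for y in r[1:]) for g in groups), b * n)
    ss_time = ss((mean(r[1 + j] for r in rows) for j in range(b)), a * n)
    ss_cells = ss((mean(r[1 + j] for r in rows if r[0] == g)
                   for g in groups for j in range(b)), n)
    ss_inter = ss_cells - ss_group - ss_time
    ss_err_between = ss_subjects - ss_group                       # subjects within group
    ss_err_within = ss_total - ss_subjects - ss_time - ss_inter   # Error(time)

    ms_err_between = ss_err_between / (a * (n - 1))
    ms_err_within = ss_err_within / (a * (n - 1) * (b - 1))
    return {
        "group": (ss_group / (a - 1)) / ms_err_between,
        "time": (ss_time / (b - 1)) / ms_err_within,
        "time*group": (ss_inter / ((a - 1) * (b - 1))) / ms_err_within,
    }


f_values = split_plot_anova(data)
for effect, f in f_values.items():
    print(f"{effect}: F = {f:.2f}")
```

The group effect is tested against the subjects-within-group error, while time and time*group are tested against the within-subject error, which is the same split shown in the two SAS tables.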
51
Two-factor ANOVA
Measurements on each subject are taken under the levels of both factors.

Subject   Factor A   B_1      B_2      …   B_B
1         A_1        Y_111    Y_112    …   Y_11B
…         …          …        …        …   …
I         A_1        Y_I11    Y_I12    …   Y_I1B
…         …          …        …        …   …
1         A_a        Y_1a1    Y_1a2    …   Y_1aB
…         …          …        …        …   …
I         A_a        Y_Ia1    Y_Ia2    …   Y_IaB
…         …          …        …        …   …
1         A_A        Y_1A1    Y_1A2    …   Y_1AB
…         …          …        …        …   …
I         A_A        Y_IA1    Y_IA2    …   Y_IAB

A and B denote the two factors, and Y_iab denotes the measurement taken from the ith subject when factor A is at level a and factor B is at level b.
52
Two-factor model
All the groups have equal variances
Random effects due to subjects
Fixed effects of factors
53
The fixed effects are estimated as follows:
54
RM ANOVA table:
Source       DF                 SS    MS   F-Value
Factor A     A-1                Sa
Factor B     B-1                Sb
Subjects     I-1                Si
A*Subjects   (A-1)(I-1)         Sai
B*Subjects   (B-1)(I-1)         Sbi
A*B          (A-1)(B-1)         Sab
Error        (A-1)(B-1)(I-1)    Se
Total        ABI-1              St
55
Example: A group of subjects is treated in the morning and afternoon of two different days. On one of the days, the subjects receive a strong sleeping aid the night before the experiment is to be conducted; on the other, a placebo.

Reaction times by treatment (treat) and time of day:
Time   Subject   Control   Drug
A.M.   1         65        70
A.M.   2         72        78
A.M.   3         90        97
P.M.   1         55        60
P.M.   2         64        68
P.M.   3         80        85
56
SAS code
data repeat;
   input react1-react4;
datalines;
65 70 55 60
72 78 64 68
90 97 80 85
;
run;
proc anova data=repeat;
   model react1-react4 = / nouni;
   repeated time 2, treat 2 / nom;
run;
57
57/87 A portion of output from SAS
58
Interpretation
According to the observed p-values, we reject the null hypotheses of no time effect and no treat effect; only the interaction is not significant.
The drug increases reaction time.
Reaction time is longer in the morning than in the afternoon.
The interaction of treat and time is not significant.
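Because both repeated factors here have only two levels, each effect can be checked with a single within-subject contrast: compute one contrast score per subject and form F = n × mean² / variance, which has (1, n-1) degrees of freedom. The Python sketch below is a hand check under that assumption, not a reproduction of the SAS output; `contrast_f` and the contrast labels are hypothetical names, and the column order follows the react1-react4 layout of the SAS data step (A.M. control, A.M. drug, P.M. control, P.M. drug).

```python
# One row per subject: A.M. control, A.M. drug, P.M. control, P.M. drug.
react = [
    (65, 70, 55, 60),
    (72, 78, 64, 68),
    (90, 97, 80, 85),
]

contrasts = {
    "time": (1, 1, -1, -1),         # A.M. minus P.M.
    "treat": (-1, 1, -1, 1),        # drug minus control
    "time*treat": (-1, 1, 1, -1),   # drug effect in A.M. minus drug effect in P.M.
}


def contrast_f(rows, coefs):
    """Return (F, mean contrast score) for a one-degree-of-freedom within-subject contrast."""
    scores = [sum(c * y for c, y in zip(coefs, row)) for row in rows]
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return n * mean ** 2 / variance, mean


effects = {name: contrast_f(react, coefs) for name, coefs in contrasts.items()}
for name, (f, mean) in effects.items():
    print(f"{name}: F(1, 2) = {f:.1f}, mean contrast = {mean:.2f}")
```

The time and treat contrasts give F values of 300 and 256 with positive mean scores (longer reaction times in the morning and with the drug), while the interaction gives F = 4. For comparison, the 5% critical value of F with (1, 2) degrees of freedom is about 18.5.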
60
Consider a marketing experiment: male and female subjects are offered one of three different brands of coffee. Each brand is tasted twice: once after breakfast, the other time after dinner. The preference for each brand is measured on a scale from 1 to 10 (1 = lowest, 10 = highest).
61
61/87 The experimental design is shown below: Three-Factor Experiment with a Repeated Measure on the last factor Meal: Repeated Measure Factor
62
62/87 SAS Program:
63
63/87 OUTPUT(Part 1/4):
64
64/87 OUTPUT(Part 2/4):
65
65/87 OUTPUT(Part 3/4):
66
66/87 OUTPUT(Part 4/4):
69
69/87 A group of high- and low-SES children is selected for the experiment. Their reading comprehension is tested each spring and fall for three consecutive years. A Diagram of the design is shown here:
70
Notice that each subject is measured each spring and fall of each year, so the variables SEASON and YEAR are both repeated measures factors.
To analyze this experiment, we use the REPEATED statement of PROC ANOVA:
DATA READ;
   INPUT SUBJ SES $ READ1-READ6;
   LABEL READ1 = 'SPRING YR 1'
         READ2 = 'FALL YR 1'
         READ3 = 'SPRING YR 2'
         READ4 = 'FALL YR 2'
         READ5 = 'SPRING YR 3'
         READ6 = 'FALL YR 3';
71
DATALINES;
1 HIGH 61 50 60 55 59 62
2 HIGH 64 55 62 57 63 63
3 HIGH 59 49 58 52 60 58
4 HIGH 63 59 65 64 67 70
5 HIGH 62 51 61 56 60 63
6 LOW 57 42 56 46 54 50
7 LOW 61 47 58 48 59 55
8 LOW 55 40 55 46 57 52
9 LOW 59 44 61 50 63 60
10 LOW 58 44 56 49 55 49
;
PROC ANOVA DATA=READ;
   TITLE "READING COMPREHENSION ANALYSIS";
   CLASS SES;
   MODEL READ1-READ6 = SES / NOUNI;
   REPEATED YEAR 3, SEASON 2;
   MEANS SES;
RUN;
72
Since the REPEATED statement can be confusing when there is more than one repeated factor, it is important to know how to determine the order of the factor names. Look at the REPEATED statement in this example:
REPEATED YEAR 3, SEASON 2;
This statement instructs the ANOVA procedure to choose the first level of YEAR (1), then loop through the two levels of SEASON (SPRING, FALL), then return to the next level of YEAR (2), followed by the two levels of SEASON, and so on.
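The ordering rule can be illustrated outside SAS as a nested loop in which the last-named factor changes fastest. The Python sketch below is only an illustration of that rule; the variable names are hypothetical, and the season labels are taken from the LABEL statement of the reading example.

```python
from itertools import product

years = [1, 2, 3]
seasons = ["SPRING", "FALL"]

# REPEATED YEAR 3, SEASON 2;
# The first-named factor (YEAR) is the outer loop and the last-named
# factor (SEASON) is the inner loop, so SEASON changes fastest.
order = {
    f"READ{i}": (year, season)
    for i, (year, season) in enumerate(product(years, seasons), start=1)
}

for variable, (year, season) in order.items():
    print(f"{variable}: {season} YR {year}")
```

The resulting mapping (READ1 = SPRING YR 1, READ2 = FALL YR 1, and so on up to READ6 = FALL YR 3) agrees with the variable labels in the DATA step, which is why YEAR must be named before SEASON.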
73
READING COMPREHENSION ANALYSIS
The ANOVA Procedure

Class Level Information
Class   Levels   Values
SES     2        HIGH LOW

Number of Observations Read   10
Number of Observations Used   10

The ANOVA Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects
Source   DF   Anova SS      Mean Square   F Value   Pr > F
SES      1    680.0666667   680.0666667   13.54     0.0062
Error    8    401.6666667   50.2083333

The ANOVA Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
                                                               Adj Pr > F
Source   DF   Anova SS      Mean Square   F Value   Pr > F   G - G    H-F-L
YEAR     2    252.0333333   126.0166667   26.91     <.0001   0.0002   <.0001
74
                                                                   Adj Pr > F
Source        DF   Anova SS     Mean Square   F Value   Pr > F   G - G    H-F-L
YEAR*SES      2    1.0333333    0.5166667     0.11      0.8962   0.8186   0.8450
Error(YEAR)   16   74.9333333   4.6833333

Greenhouse-Geisser Epsilon     0.6757
Huynh-Feldt-Lecoutre Epsilon   0.7642

Source          DF   Anova SS      Mean Square   F Value   Pr > F
SEASON          1    680.0666667   680.0666667   224.82    <.0001
SEASON*SES      1    112.0666667   112.0666667   37.05     0.0003
Error(SEASON)   8    24.2000000    3.0250000
75
                                                                           Adj Pr > F
Source               DF   Anova SS      Mean Square   F Value   Pr > F   G - G    H-F-L
YEAR*SEASON          2    265.4333333   132.7166667   112.95    <.0001   <.0001   <.0001
YEAR*SEASON*SES      2    0.4333333     0.2166667     0.18      0.8333   0.7592   0.7905
Error(YEAR*SEASON)   16   18.8000000    1.1750000

Greenhouse-Geisser Epsilon     0.7073
Huynh-Feldt-Lecoutre Epsilon   0.8147
76
High-SES students have higher reading comprehension scores than low-SES students (F=13.54, p=0.0062).
Reading comprehension increases with each year (F=26.91, p<0.0001).
Students had higher reading comprehension scores in the spring compared to the following fall (F=224.82, p<0.0001).
The "slippage" was greater for the low-SES students (there was a significant SES*SEASON interaction [F=37.05, p=0.0003]).
"Slippage" decreases as the students get older (YEAR*SEASON is significant [F=112.95, p<0.0001]).
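Three of the F values quoted above can be cross-checked by hand from the reading data in the SAS program. The Python sketch below is an illustration under stated assumptions, not the computation SAS performs: the between-subjects test uses subject means, and because SEASON has only two levels, SEASON and SEASON*SES are tested with a single spring-minus-fall contrast per subject. The function names `between_subjects_f` and `within_contrast_f` are hypothetical.

```python
# READ1-READ6 per subject: spring/fall of year 1, year 2, year 3.
reading = {
    "HIGH": [
        [61, 50, 60, 55, 59, 62],
        [64, 55, 62, 57, 63, 63],
        [59, 49, 58, 52, 60, 58],
        [63, 59, 65, 64, 67, 70],
        [62, 51, 61, 56, 60, 63],
    ],
    "LOW": [
        [57, 42, 56, 46, 54, 50],
        [61, 47, 58, 48, 59, 55],
        [55, 40, 55, 46, 57, 52],
        [59, 44, 61, 50, 63, 60],
        [58, 44, 56, 49, 55, 49],
    ],
}


def between_subjects_f(groups):
    """F ratio for the between-subjects factor, computed from subject means."""
    m = len(next(iter(groups.values()))[0])   # measurements per subject
    means = {g: [sum(row) / m for row in rows] for g, rows in groups.items()}
    all_means = [x for xs in means.values() for x in xs]
    grand = sum(all_means) / len(all_means)
    ss_group = m * sum(len(xs) * (sum(xs) / len(xs) - grand) ** 2 for xs in means.values())
    ss_error = m * sum((x - sum(xs) / len(xs)) ** 2 for xs in means.values() for x in xs)
    df_error = len(all_means) - len(groups)
    return (ss_group / (len(groups) - 1)) / (ss_error / df_error)


def within_contrast_f(groups, coefs):
    """F ratios (main effect, interaction with group) for a two-level within factor."""
    scores = {g: [sum(c * y for c, y in zip(coefs, row)) for row in rows]
              for g, rows in groups.items()}
    all_scores = [d for ds in scores.values() for d in ds]
    grand = sum(all_scores) / len(all_scores)
    norm = sum(c * c for c in coefs)          # rescales the contrast sums of squares
    ss_effect = len(all_scores) * grand ** 2 / norm
    ss_inter = sum(len(ds) * (sum(ds) / len(ds) - grand) ** 2 for ds in scores.values()) / norm
    ss_error = sum((d - sum(ds) / len(ds)) ** 2 for ds in scores.values() for d in ds) / norm
    ms_error = ss_error / (len(all_scores) - len(groups))
    return ss_effect / ms_error, (ss_inter / (len(groups) - 1)) / ms_error


f_ses = between_subjects_f(reading)
f_season, f_season_ses = within_contrast_f(reading, (1, -1, 1, -1, 1, -1))
print(f"SES: F = {f_ses:.2f}")
print(f"SEASON: F = {f_season:.2f}")
print(f"SEASON*SES: F = {f_season_ses:.2f}")
```

The YEAR effects are not covered by this sketch, because a three-level factor needs more than one contrast.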
78
What is a mixed model?
When a design includes both random and fixed variables, we have what is often called a mixed model.
79
We do not have to assume sphericity in the model.
We do not have to assume compound symmetry in the model.
80
We can use PROC MIXED to fit the mixed model.
81
81/87 Example:
82
Treat this case as a standard repeated measures ANOVA. We get the following result:
83
83/87 Treat it as Mixed model SAS program:
84
84/87 The result of using mixed model:
85
Comparing the results of the two methods, the mixed model has the following advantages:
The degrees of freedom are larger.
The interaction is significant.