ONE WAY ANALYSIS OF VARIANCE (ANOVA)
o It is used to investigate the effect of one factor which occurs at h levels (h ≥ 3).
Example: Suppose that we wish to test the effect of temperature at levels (20, 30, 35, 40 °C) on the serum total proteins.
Biostatistics and Data analysis, 3rd Lecture
RANDOM MODEL HYPOTHESIS
The summary statistics for each row are shown in the table below

                    20 °C    25 °C    30 °C
Sample size (n)     7        9        8
Mean
S.D.
Variance (S²)

Temperature (°C)    Serum Total Proteins (g/dL)
20                  2, 3, 2, 2, 3, 2, ...
25                  ..., 3, 2, 3, 1, 2, 2, 3, ...
30                  ..., 6, 7, 4, 2, 6, 7, 8
Variation (Sum of Squares) = SS
o The sum of the squares of the deviations between each value and the mean of the values
– SS between groups: SS(B)
– SS within groups: SS(W)
Variances (Mean of Squares) = MS
o The average squared deviation from the mean, found by dividing the variation by the degrees of freedom
– MS = SS / df
– MS between groups: MS(B)
– MS within groups: MS(W)
Variation (Sum of Squares) = SS
Are all of the values identical?
– There are variations among the data, called the total variation, SS(T).
Are all of the sample means identical?
– There is variation between the groups, called the between group variation, SS(B), or the variation due to the factor.
Are all of the values within each group identical?
– There is variation within each group, called the within group variation, SS(W), or the error variation.
There are two sources of variation
– The variation between groups, SS(B), or the variation due to the factor
– The variation within groups, SS(W), or the error variation
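The split into two sources can be checked numerically: the total variation equals the between group variation plus the within group variation. The sketch below uses made-up measurements (not the lecture data, which is incomplete here), and the variable names are illustrative assumptions.

```python
# Hypothetical measurements for three groups (NOT the lecture data).
groups = {
    "A": [2.0, 3.0, 2.0, 3.0],
    "B": [3.0, 4.0, 3.0, 4.0, 3.5],
    "C": [6.0, 7.0, 5.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Total variation: every value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between group variation: each group mean against the grand mean,
# weighted by the group size.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# Within group variation: every value against its own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(f"SS(T) = {ss_total:.4f}")
print(f"SS(B) = {ss_between:.4f}")
print(f"SS(W) = {ss_within:.4f}")
print(f"SS(B) + SS(W) = {ss_between + ss_within:.4f}")
```

The last two printed lines should agree with SS(T) up to floating-point rounding.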
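Here is the basic one-way ANOVA table (h = number of levels, N = total number of values)

Source              SS       df       MS                    F                P
Between (Factor)    SS(B)    h - 1    MS(B) = SS(B)/(h-1)   MS(B) / MS(W)
Within (Error)      SS(W)    N - h    MS(W) = SS(W)/(N-h)
Total               SS(T)    N - 1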
Grand Mean
– The grand mean is the average of all the values
– It is a weighted average of the individual sample means
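Written out, the two descriptions of the grand mean agree. The symbols are assumed notation, not taken from the slides: x_ij is the j-th value in group i, n_i and \bar{x}_i are the size and mean of group i, h is the number of groups, and N is the total number of values.

```latex
\bar{x}_{GM}
  = \frac{1}{N}\sum_{i=1}^{h}\sum_{j=1}^{n_i} x_{ij}
  = \frac{\sum_{i=1}^{h} n_i\,\bar{x}_i}{\sum_{i=1}^{h} n_i},
\qquad N = \sum_{i=1}^{h} n_i
```

With the sample sizes given in the summary table (7, 9, 8), N = 24.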
Between Group Variation, SS(B)
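A standard way to write the between group variation is shown below. The notation is an assumption for illustration: n_i and \bar{x}_i are the size and mean of group i, \bar{x}_{GM} is the grand mean, and h is the number of groups.

```latex
SS(B) = \sum_{i=1}^{h} n_i\,\bigl(\bar{x}_i - \bar{x}_{GM}\bigr)^2,
\qquad df_B = h - 1
```

With h = 3 temperature levels, df_B = 2.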
Within Group Variation, SS(W)
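The within group variation can be written either from the raw values or from the group variances in the summary table. The notation is an assumption for illustration: x_ij is the j-th value in group i, and n_i, \bar{x}_i and s_i^2 are the size, mean and sample variance of group i.

```latex
SS(W) = \sum_{i=1}^{h}\sum_{j=1}^{n_i}\bigl(x_{ij} - \bar{x}_i\bigr)^2
      = \sum_{i=1}^{h} (n_i - 1)\,s_i^2,
\qquad df_W = N - h
```

With N = 24 values in h = 3 groups, df_W = 21.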
After filling in the sum of squares, we have …

Source     SS    df    MS    F    p
Between          2
Within           21
Total            23
Variances
– MS = SS / df
– MS(B) = SS(B) / 2
– MS(W) = SS(W) / 21
After filling in the mean squares, we have …

Source     SS    df    MS      F    p
Between          2     28.2
Within           21
Total            23
F test
– An F test statistic is the ratio of two sample variances
– MS(B) and MS(W) are two sample variances, and their ratio gives F
– F = MS(B) / MS(W) = 28.2 / MS(W)
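The whole chain from summary statistics to F can be sketched in a few lines. This is a minimal illustration, not the lecture's own calculation: the function name `anova_from_summary` is hypothetical, and while the sample sizes (7, 9, 8) come from the summary table, the means and variances passed in are made-up placeholders because the lecture's values are not available here.

```python
def anova_from_summary(ns, means, variances):
    """Build one-way ANOVA table entries from per-group n, mean and sample variance."""
    h = len(ns)                      # number of groups (levels of the factor)
    n_total = sum(ns)                # total number of values, N
    # Grand mean as the size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Sums of squares.
    ss_b = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_w = sum((n - 1) * v for n, v in zip(ns, variances))
    # Degrees of freedom and mean squares (MS = SS / df).
    df_b, df_w = h - 1, n_total - h
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    return {
        "SS(B)": ss_b, "SS(W)": ss_w, "SS(T)": ss_b + ss_w,
        "df(B)": df_b, "df(W)": df_w, "df(T)": n_total - 1,
        "MS(B)": ms_b, "MS(W)": ms_w,
        "F": ms_b / ms_w,
    }


# Sample sizes from the summary table; means and variances are placeholders.
table = anova_from_summary(
    ns=[7, 9, 8],
    means=[2.0, 3.0, 5.0],
    variances=[0.5, 0.5, 4.0],
)
for name, value in table.items():
    print(f"{name:6s} {value:.4f}")
```

Working from summary statistics mirrors the slides, which start from n, mean and variance per group rather than from the raw values.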
After filling in the F ratio, we have …

Source     SS    df    MS      F cal    P
Between          2     28.2
Within           21
Total            23

Tabulated F 2,21(5%) = 3.47, F 2,21(1%) = 5.78, F 2,21(0.1%) = 9.77
Thus calculated F at df 2,21 > tabulated F 2,21(0.1%) = 9.77
Thus reject the null hypothesis (P < 0.001)
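The comparison against tabulated values can be expressed as a small lookup. This is a sketch under stated assumptions: the helper name `smallest_significant_level` is hypothetical, the 5% and 1% critical values are those quoted in the lecture, and the 0.1% value (9.77) is taken from standard F tables for df 2, 21.

```python
# Critical values of F for df = (2, 21), keyed by significance level.
# 5% and 1% are the lecture's values; 0.1% is from standard F tables.
CRITICAL_F_2_21 = {0.05: 3.47, 0.01: 5.78, 0.001: 9.77}


def smallest_significant_level(f_calc, critical=CRITICAL_F_2_21):
    """Return the smallest tabulated level at which f_calc exceeds the
    critical F, or None if the null hypothesis is not rejected at any level."""
    passed = [alpha for alpha, f_crit in critical.items() if f_calc > f_crit]
    return min(passed) if passed else None


# Illustrative F values only, not the lecture's result.
for f_value in (2.0, 4.0, 6.0, 12.0):
    level = smallest_significant_level(f_value)
    if level is None:
        print(f"F = {f_value}: do not reject the null hypothesis")
    else:
        print(f"F = {f_value}: reject the null hypothesis (P < {level})")
```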