ANALYSIS OF VARIANCE
In this lecture we study whether changes in the independent variables cause changes in the mean response, and we analyze the data using a method known as analysis of variance (ANOVA). ANOVA partitions the total sum of squares into component sums of squares due to the different factors and the error. For instance, suppose there are Q factors. Then the total sum of squares (SST) is partitioned as SST = SSA + SSB + · · · + SSQ + SSError, where SSA, SSB, ..., SSQ are the sums of squares associated with the factors A, B, ..., Q, respectively.
If the ANOVA involves only one factor, it is called one-way analysis of variance. Similarly, if it involves two factors, it is called two-way analysis of variance. If it involves more than two factors, the corresponding ANOVA is called higher-order analysis of variance. We only treat the one-way analysis of variance.

20.1. One-Way Analysis of Variance with Equal Sample Sizes

The standard model of one-way ANOVA is given by
$y_{ij} = \mu_i + \epsilon_{ij}$ for $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$,
where $m \ge 2$ and $n \ge 2$. In this model, we assume that each random variable
$y_{ij} \sim N(\mu_i, \sigma^2)$ for $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$.
Theorem. Suppose the one-way ANOVA model is given by the previous equation, where the $\epsilon_{ij}$'s are independent and normally distributed random variables with mean zero and variance $\sigma^2$ for $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$. Then the MLEs of the parameters $\mu_i$ and $\sigma^2$ of the model are given by
$\hat{\mu}_i = \bar{y}_{i\cdot}$, $\hat{\sigma}^2 = \frac{1}{nm} SS_W$,
where
$\bar{y}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} y_{ij}$ and $SS_W = \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i\cdot})^2$
is the within-samples sum of squares.
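The following is a short sketch of why these estimators maximize the likelihood, written under the model assumptions stated in the theorem. The symbol $\ell$ for the log-likelihood is introduced here for illustration and does not appear in the original slides.

```latex
\begin{align*}
\ell(\mu_1,\dots,\mu_m,\sigma^2)
  &= -\frac{nm}{2}\ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\mu_i)^2, \\
\frac{\partial \ell}{\partial \mu_i}
  &= \frac{1}{\sigma^2}\sum_{j=1}^{n}(y_{ij}-\mu_i) = 0
  \;\Longrightarrow\;
  \hat{\mu}_i = \frac{1}{n}\sum_{j=1}^{n} y_{ij} = \bar{y}_{i\cdot}, \\
\frac{\partial \ell}{\partial \sigma^2}
  &= -\frac{nm}{2\sigma^2}
     + \frac{1}{2\sigma^4}\sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\hat{\mu}_i)^2 = 0
  \;\Longrightarrow\;
  \hat{\sigma}^2 = \frac{1}{nm}\,SS_W .
\end{align*}
```

Note that the divisor is $nm$, not $m(n-1)$, so $\hat{\sigma}^2$ is smaller than the mean square $SS_W / (m(n-1))$ used in the F-test.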
Lemma. The total sum of squares is equal to the sum of the within and between sums of squares, that is,
$SS_T = SS_W + SS_B$,
where
$SS_T = \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{\cdot\cdot})^2$ and $SS_B = \sum_{i=1}^{m} \sum_{j=1}^{n} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$.

We will be interested in testing the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_m = \mu$ against the alternative hypothesis $H_a$: not all the means are equal. $H_0$ is rejected whenever the test statistic F satisfies
$F = \dfrac{SS_B / (m-1)}{SS_W / (m(n-1))} > F_\alpha(m-1, m(n-1))$,
where $\alpha$ is the significance level of the hypothesis test and $F_\alpha(m-1, m(n-1))$ denotes the $100(1-\alpha)$ percentile of the F-distribution with $m-1$ numerator and $nm-m$ denominator degrees of freedom.
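$SS_T = SS_W + SS_B$:
$\sum_{i=1}^{m} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i\cdot})^2 + \sum_{i=1}^{m} \sum_{j=1}^{n} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$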
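The partition can be checked in a few lines. The sketch below assumes only the definitions of the group mean $\bar{y}_{i\cdot}$ and the grand mean $\bar{y}_{\cdot\cdot}$; the key step is that the cross term vanishes because deviations from a group mean sum to zero within that group.

```latex
\begin{align*}
\sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\bar{y}_{\cdot\cdot})^2
  &= \sum_{i=1}^{m}\sum_{j=1}^{n}
     \bigl[(y_{ij}-\bar{y}_{i\cdot}) + (\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})\bigr]^2 \\
  &= \underbrace{\sum_{i=1}^{m}\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i\cdot})^2}_{SS_W}
   + \underbrace{\sum_{i=1}^{m}\sum_{j=1}^{n}(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2}_{SS_B}
   + 2\sum_{i=1}^{m}(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})
      \underbrace{\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i\cdot})}_{=\,0} \\
  &= SS_W + SS_B .
\end{align*}
```

Since the summand of $SS_B$ does not depend on $j$, it can also be written as $SS_B = n \sum_{i=1}^{m} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$.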
Where:
i = a treatment level
j = a particular member of a treatment level
m = number of treatment levels
n = number of observations in a given treatment level
$y_{ij}$ = individual value
$\bar{y}_{\cdot\cdot}$ = grand mean
$\bar{y}_{i\cdot}$ = mean of a treatment group or level (column mean)
N = total number of observations
The various quantities used in carrying out the test described in Theorem 20.3 are presented in a tabular form known as the ANOVA table. If the value of the test statistic is $F = \gamma$, then the p-value is defined as p-value = $P(F(m-1, m(n-1)) \ge \gamma)$.

| Source of variation | Sum of squares | Degrees of freedom | Mean square | F-statistic |
|---|---|---|---|---|
| Between | $SS_B$ | $m-1$ | $MS_B = \frac{SS_B}{m-1}$ | $F = \frac{MS_B}{MS_W}$ |
| Within | $SS_W$ | $m(n-1)$ | $MS_W = \frac{SS_W}{m(n-1)}$ | |
| Total | $SS_T$ | $nm-1$ | | |
The ANOVA is based on the following three assumptions: (1) Independent Samples: The samples taken from the population under consideration should be independent of one another. (2) Normal Population: For each population, the variable under consideration should be normally distributed. (3) Equal Variance: The variances of the variables under consideration should be the same for all the populations.
Example. The data in the following table give the number of hours of relief provided by 5 different brands of headache tablets administered to 25 subjects experiencing fevers of 38 °C or more. Perform the analysis of variance and test, at the 0.05 level of significance, the hypothesis that the mean number of hours of relief provided by the tablets is the same for all 5 brands.
Tablets A B C D F 5 4 8 6 3 9 7 2 1
Answer: We compute the sums of squares SSW, SSB and SST as SSW = 57.60, SSB = 79.44, and SST = 137.04. The ANOVA table for this problem is shown below.

| Source of variation | Sum of squares | Degrees of freedom | Mean square | F-statistic |
|---|---|---|---|---|
| Between | 79.44 | 4 | 19.86 | 6.90 |
| Within | 57.60 | 20 | 2.88 | |
| Total | 137.04 | 24 | | |
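The arithmetic behind the table can be sketched in plain Python. The 25 observations below are an assumption: the data table is not fully legible in this transcript, so the values were chosen to be consistent with the sums of squares and mean squares in the answer. The function name `one_way_anova` and the dictionary keys are hypothetical names used only for illustration.

```python
# ASSUMPTION: these 25 values are not taken verbatim from the slide; they are
# chosen so that the results agree with the ANOVA table in the answer.
relief_hours = {
    "A": [5, 4, 8, 6, 3],
    "B": [9, 7, 8, 6, 9],
    "C": [3, 5, 2, 3, 7],
    "D": [2, 3, 4, 1, 4],
    "F": [7, 6, 9, 4, 7],  # fifth brand, labelled as in the table header
}


def one_way_anova(groups):
    """Compute the one-way ANOVA table entries for m groups of equal size n."""
    m = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("all groups must have the same number of observations")

    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(sum(g) for g in groups) / (m * n)

    # Within-samples, between-samples, and total sums of squares.
    ss_w = sum((y - mean) ** 2 for g, mean in zip(groups, group_means) for y in g)
    ss_b = n * sum((mean - grand_mean) ** 2 for mean in group_means)
    ss_t = sum((y - grand_mean) ** 2 for g in groups for y in g)

    # Mean squares use m - 1 and m(n - 1) degrees of freedom.
    ms_b = ss_b / (m - 1)
    ms_w = ss_w / (m * (n - 1))
    return {
        "ss_b": ss_b, "ss_w": ss_w, "ss_t": ss_t,
        "df_b": m - 1, "df_w": m * (n - 1), "df_t": m * n - 1,
        "ms_b": ms_b, "ms_w": ms_w, "f": ms_b / ms_w,
    }


table = one_way_anova(list(relief_hours.values()))
for key, value in table.items():
    print(f"{key}: {value:.2f}")
```

With these assumed observations the sketch gives SSW = 57.60, SSB = 79.44 and SST = 137.04, which agrees with the mean squares 19.86 and 2.88 and with F = 6.90 in the table.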
At the significance level $\alpha = 0.05$, we find from the F-table that $F_{0.05}(4, 20) = 2.87$. Since $6.90 = F > F_{0.05}(4, 20) = 2.87$, we reject the null hypothesis that the mean number of hours of relief provided by the tablets is the same for all 5 brands. Note that using a statistical package like MINITAB, SAS or SPSS we can compute the p-value to be p-value = $P(F(4, 20) \ge 6.90) \approx 0.001$. Hence we again reach the same conclusion, since the p-value is less than the given $\alpha$ for this problem. See example page 627.
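For readers without a statistical package, the p-value and critical value for this example can be approximated in plain Python. The sketch below relies on a special case, not a general method: when both degrees of freedom are even, the upper tail of the F-distribution reduces to a binomial tail sum. The function names `f_survival_even_df` and `f_critical_even_df` are hypothetical; for arbitrary degrees of freedom a library routine such as `scipy.stats.f.sf` would be the usual choice.

```python
from math import comb


def f_survival_even_df(f, d1, d2):
    """P(F(d1, d2) >= f), valid only when d1 and d2 are both even.

    Uses P(F >= f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f), and the
    identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a) for integer a, b.
    """
    if d1 % 2 or d2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = d2 // 2, d1 // 2
    x = d2 / (d2 + d1 * f)
    trials = a + b - 1
    return sum(
        comb(trials, j) * x**j * (1 - x) ** (trials - j)
        for j in range(a, trials + 1)
    )


def f_critical_even_df(alpha, d1, d2, upper=1000.0):
    """Find c with P(F(d1, d2) >= c) = alpha by bisection on [0, upper]."""
    lower = 0.0
    for _ in range(200):
        mid = (lower + upper) / 2
        if f_survival_even_df(mid, d1, d2) > alpha:
            lower = mid  # tail still too heavy, move right
        else:
            upper = mid
    return (lower + upper) / 2


p_value = f_survival_even_df(6.90, 4, 20)
critical = f_critical_even_df(0.05, 4, 20)
print(f"p-value = {p_value:.4f}, critical value = {critical:.2f}")
```

For F = 6.90 with 4 and 20 degrees of freedom this gives a p-value of roughly 0.0012 and a critical value of roughly 2.87, so both the critical-value comparison and the p-value lead to rejecting the null hypothesis at the 0.05 level.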