Comparing k Population Means – One-way Analysis of Variance (ANOVA)
Example
In this example we are looking at the weight gains (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork). Ten test animals for each diet.
Diets
High protein, Beef
High protein, Cereal
High protein, Pork
Low protein, Beef
Low protein, Cereal
Low protein, Pork
Table: Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork)

Level        High Protein              Low Protein
Source       Beef    Cereal  Pork      Beef    Cereal  Pork
Diet         1       2       3         4       5       6
             73      98      94        90      107     49
             102     74      79        76      95      82
             118     56      96        90      97      73
             104     111     98        64      80      86
             81      95      102       86      98      81
             107     88      102       51      74      97
             100     82      108       72      74      106
             87      77      91        90      67      70
             117     86      120       95      89      61
             111     92      105       78      58      82
Median       103.0   87.0    100.0     82.0    84.5    81.5
Mean         100.0   85.9    99.5      79.2    83.9    78.7
IQR          24.0    18.0    11.0      18.0    23.0    16.0
PSD          17.78   13.33   8.15      13.33   17.04   11.85
Variance     229.11  225.66  119.17    192.84  246.77  273.79
Std. Dev.    15.14   15.02   10.92     13.89   15.71   16.55
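The median, mean, variance and standard deviation rows of the table can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the names `gains` and `summary` are illustrative and do not come from the slides, and the ten gains per diet are the values read from the table.

```python
import statistics

# Weight gains (grams), ten rats per diet, as read from the table.
gains = {
    "High Beef":   [73, 102, 118, 104, 81, 107, 100, 87, 117, 111],
    "High Cereal": [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],
    "High Pork":   [94, 79, 96, 98, 102, 102, 108, 91, 120, 105],
    "Low Beef":    [90, 76, 90, 64, 86, 51, 72, 90, 95, 78],
    "Low Cereal":  [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],
    "Low Pork":    [49, 82, 73, 86, 81, 97, 106, 70, 61, 82],
}

summary = {}
for diet, x in gains.items():
    summary[diet] = {
        "median": statistics.median(x),
        "mean": statistics.mean(x),
        "variance": statistics.variance(x),  # sample variance, divisor n - 1
        "std_dev": statistics.stdev(x),      # sample standard deviation
    }

for diet, s in summary.items():
    print(f"{diet:12s} median={s['median']:6.1f} mean={s['mean']:6.1f} "
          f"variance={s['variance']:7.2f} sd={s['std_dev']:6.2f}")
```

The IQR and PSD rows are left out of the sketch because quartile conventions differ between packages.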
[Figure: weight gains by diet, grouped by level of protein (High Protein, Low Protein) and source of protein (Beef, Cereal, Pork)]
Exploratory Conclusions
Weight gain is higher for the high protein meat diets.
Increasing the level of protein increases weight gain, but only if the source of protein is a meat source.
The F test – for comparing k means
Situation
We have k normal populations.
Let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of population i, i = 1, 2, 3, …, k.
Note: we assume that the standard deviation for each population is the same:
$\sigma_1 = \sigma_2 = \dots = \sigma_k = \sigma$
We want to test
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
against
$H_A: \mu_i \neq \mu_j$ for at least one pair i, j
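The data
Assume we have collected data from each of the k populations.
Let $x_{i1}, x_{i2}, x_{i3}, \dots, x_{in_i}$ denote the $n_i$ observations from population i, i = 1, 2, 3, …, k.
Let
$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$ and $s_i = \sqrt{\frac{\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2}{n_i-1}}$
denote the mean and standard deviation of the i-th sample.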
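The pooled estimate of standard deviation and variance:
$s_{Pooled} = \sqrt{\frac{\sum_{i=1}^{k}(n_i-1)s_i^2}{N-k}}$ and $s_{Pooled}^2 = \frac{\sum_{i=1}^{k}(n_i-1)s_i^2}{N-k}$
where $s_i$ is the standard deviation of the i-th sample, $n_i$ is its size, and $N = \sum_{i=1}^{k} n_i$.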
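Consider the statistic comparing the sample means
$s_{\bar{x}}^2 = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2}{k-1}$
where
$\bar{x} = \frac{1}{N}\sum_{i=1}^{k} n_i\bar{x}_i$ is the overall mean of all $N = \sum_{i=1}^{k} n_i$ observations.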
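To test $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ against $H_A$: at least one pair of means differ, use the test statistic
$F = \frac{s_{\bar{x}}^2}{s_{Pooled}^2} = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2/(k-1)}{\sum_{i=1}^{k}(n_i-1)s_i^2/(N-k)}$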
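The definition of the test statistic can be sketched directly in code. The function below is an illustration only: the name `f_statistic` and the three small samples are hypothetical and not taken from the slides. It assumes the usual one-way ANOVA ratio of the variance between sample means (k - 1 degrees of freedom) to the pooled within-sample variance (N - k degrees of freedom).

```python
from statistics import mean, variance

def f_statistic(samples):
    """F = s_xbar^2 / s_pooled^2 for a list of samples (one list per population)."""
    k = len(samples)
    sizes = [len(x) for x in samples]
    n_total = sum(sizes)
    means = [mean(x) for x in samples]
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

    # Variability of the sample means about the overall mean, k - 1 d.f.
    s2_means = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)) / (k - 1)
    # Pooled within-sample variance, N - k d.f.
    s2_pooled = sum((n - 1) * variance(x) for n, x in zip(sizes, samples)) / (n_total - k)
    return s2_means / s2_pooled

# Hypothetical data: three small samples, not from the slides.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = f_statistic(samples)
print(round(f_value, 4))
```

For these samples the means are 2, 3 and 6, each sample variance is 1, and the ratio works out to F = 13.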
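Computing Formulae
$\sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})^2 = \sum_{i=1}^{k}\frac{T_i^2}{n_i} - \frac{G^2}{N}$
Also
$\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k}\frac{T_i^2}{n_i}$
where $T_i$, G and N are the totals defined in the steps that follow.
To Compute F:
Compute
1) $T_i = \sum_{j=1}^{n_i} x_{ij}$ = total for sample i
2) $G = \sum_{i=1}^{k} T_i$ = grand total
3) $N = \sum_{i=1}^{k} n_i$ = total sample size
4) $\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2$
5) $\sum_{i=1}^{k} T_i^2/n_i$
Then
1) $SS_{Between} = \sum_{i=1}^{k} T_i^2/n_i - G^2/N$
2) $SS_{Within} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} T_i^2/n_i$
3) $F = \frac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)}$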
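The computing steps can be sketched as a short function. This is an illustration under stated assumptions rather than the author's own code: it assumes the usual totals-based formulae for one-way ANOVA (sample totals, grand total, total sample size, sum of squared observations, and the sum of squared totals divided by sample sizes). The function name `f_from_totals` and the data are hypothetical.

```python
def f_from_totals(samples):
    """One-way ANOVA F ratio computed from totals (the 'computing formulae')."""
    k = len(samples)
    totals = [sum(x) for x in samples]                # 1) sample totals T_i
    grand_total = sum(totals)                         # 2) grand total G
    n_total = sum(len(x) for x in samples)            # 3) total sample size N
    sum_sq = sum(v * v for x in samples for v in x)   # 4) sum of all x_ij^2
    sum_t2_n = sum(t * t / len(x)                     # 5) sum of T_i^2 / n_i
                   for t, x in zip(totals, samples))

    ss_between = sum_t2_n - grand_total ** 2 / n_total
    ss_within = sum_sq - sum_t2_n
    f_ratio = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return ss_between, ss_within, f_ratio

# Hypothetical data, not from the slides.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_between, ss_within, f_ratio = f_from_totals(samples)
print(ss_between, ss_within, f_ratio)
```

Working from totals avoids computing each deviation from a mean, which is why this form was convenient for hand calculation.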
We reject $H_0$ if $F \geq F_\alpha$, where $F_\alpha$ is the critical point under the F distribution with $\nu_1 = k - 1$ degrees of freedom in the numerator and $\nu_2 = N - k$ degrees of freedom in the denominator.
Example
In the following example we are comparing weight gains resulting from the following six diets:
Diet 1 - High Protein, Beef
Diet 2 - High Protein, Cereal
Diet 3 - High Protein, Pork
Diet 4 - Low Protein, Beef
Diet 5 - Low Protein, Cereal
Diet 6 - Low Protein, Pork
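Hence, with k = 6 diets and $n_i = 10$ observations per diet, the diet totals are
$T_1 = 1000$, $T_2 = 859$, $T_3 = 995$, $T_4 = 792$, $T_5 = 839$, $T_6 = 787$
and
$G = \sum T_i = 5272$, $N = 60$, $\sum\sum x_{ij}^2 = 479432$, $\sum T_i^2/n_i = 467846$
Thus
$SS_{Between} = \sum T_i^2/n_i - G^2/N = 467846 - 5272^2/60 = 4612.93$
$SS_{Within} = \sum\sum x_{ij}^2 - \sum T_i^2/n_i = 479432 - 467846 = 11586$
$F = \frac{SS_{Between}/(k-1)}{SS_{Within}/(N-k)} = \frac{4612.93/5}{11586/54} = \frac{922.59}{214.56} = 4.30$
Thus, since F = 4.30 > $F_{0.05}$ = 2.386 (5 and 54 degrees of freedom), we reject $H_0$.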
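The F ratio for the diet example can also be checked from the summary rows of the table of gains alone. The sketch below assumes the six sample means and variances from that table (with the Diet 1 mean taken as 100.0) and ten rats per diet; the critical value 2.386 is the one quoted in the text. Variable names are illustrative.

```python
# Sample means and variances for the six diets, from the table of gains.
means = [100.0, 85.9, 99.5, 79.2, 83.9, 78.7]
variances = [229.11, 225.66, 119.17, 192.84, 246.77, 273.79]
n = 10                # rats per diet
k = len(means)        # number of diets
n_total = n * k

grand_mean = sum(means) / k   # equal sample sizes, so a plain average works
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_within = (n - 1) * sum(variances)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within

critical_value = 2.386   # critical point quoted in the text
print(f"F = {f_ratio:.2f}, reject H0: {f_ratio > critical_value}")
```

Because the variances in the table are rounded to two decimals, the within sum of squares from this route differs slightly from the value obtained from the raw data, but F still rounds to 4.30.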
The ANOVA Table
A convenient method for displaying the calculations for the F-test
ANOVA Table

Source     d.f.     Sum of Squares   Mean Square   F-ratio
Between    k - 1    SSBetween        MSBetween     MSB/MSW
Within     N - k    SSWithin         MSWithin
Total      N - 1    SSTotal
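Diet Example

Source     d.f.   Sum of Squares   Mean Square   F-ratio
Between    5      4612.93          922.59        4.30
Within     54     11586.00         214.56
Total      59     16198.93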
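Laying out an ANOVA table is mechanical once the two sums of squares are known. The following sketch is illustrative only: the helper names `anova_table` and `fmt` are hypothetical, and the sums of squares passed in are those computed for the diet example from the table of gains (k = 6 diets, N = 60 rats).

```python
def anova_table(ss_between, ss_within, k, n_total):
    """Return the rows of a one-way ANOVA table as tuples."""
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return [
        ("Between", df_between, ss_between, ms_between, ms_between / ms_within),
        ("Within", df_within, ss_within, ms_within, None),
        ("Total", n_total - 1, ss_between + ss_within, None, None),
    ]

def fmt(value):
    """Format a numeric cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.2f}"

# Sums of squares for the diet example, computed from the table of gains.
rows = anova_table(4612.93, 11586.00, k=6, n_total=60)

print(f"{'Source':<9}{'d.f.':>5}{'Sum of Squares':>16}{'Mean Square':>13}{'F-ratio':>9}")
for source, df, ss, ms, f in rows:
    print(f"{source:<9}{df:>5}{fmt(ss):>16}{fmt(ms):>13}{fmt(f):>9}")
```

The Total row needs no separate calculation, since the degrees of freedom and sums of squares of the Between and Within rows add up to it.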
Equivalence of the F-test and the t-test when k = 2
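the F-test
With k = 2 and $N = n_1 + n_2$, the overall mean is $\bar{x} = (n_1\bar{x}_1 + n_2\bar{x}_2)/(n_1+n_2)$, so that
$n_1(\bar{x}_1-\bar{x})^2 + n_2(\bar{x}_2-\bar{x})^2 = \frac{n_1 n_2}{n_1+n_2}(\bar{x}_1-\bar{x}_2)^2 = \frac{(\bar{x}_1-\bar{x}_2)^2}{\frac{1}{n_1}+\frac{1}{n_2}}$
Hence
$F = \frac{(\bar{x}_1-\bar{x}_2)^2}{s_{Pooled}^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)} = t^2$, where $t = \frac{\bar{x}_1-\bar{x}_2}{s_{Pooled}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$ is the two-sample t statistic.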
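The equivalence for k = 2 can be checked numerically: the square of the pooled two-sample t statistic should equal the one-way ANOVA F ratio. The two samples below are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical two-sample data, not from the slides.
x1 = [12.0, 15.0, 11.0, 14.0]
x2 = [9.0, 10.0, 13.0, 8.0, 10.0]
n1, n2 = len(x1), len(x2)

# Pooled variance with n1 + n2 - 2 degrees of freedom.
s2_pooled = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)

# Two-sample t statistic using the pooled standard deviation.
t = (mean(x1) - mean(x2)) / sqrt(s2_pooled * (1 / n1 + 1 / n2))

# One-way ANOVA F ratio with k = 2 (the numerator has k - 1 = 1 d.f.).
grand_mean = (sum(x1) + sum(x2)) / (n1 + n2)
f_ratio = (n1 * (mean(x1) - grand_mean) ** 2
           + n2 * (mean(x2) - grand_mean) ** 2) / s2_pooled

print(t ** 2, f_ratio)
```

The two printed values agree up to floating-point rounding, so the F-test and the two-sided pooled t-test lead to the same decision.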
Using SPSS
Note: The use of another statistical package such as Minitab is similar to using SPSS.
Assume the data is contained in an Excel file
Each variable is in a column:
Weight gain (wtgn)
diet
Source of protein (Source)
Level of Protein (Level)
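The column layout (one row per animal, one column per variable) can be sketched as follows. This is a stand-in only: it writes CSV text rather than an Excel file, uses just the first row of the table of gains, and assumes Source and Level are stored as text labels, which the slides do not state.

```python
import csv
import io

# First row of the table of gains: one rat from each of the six diets.
first_gains = [73, 98, 94, 90, 107, 49]
sources = ["Beef", "Cereal", "Pork", "Beef", "Cereal", "Pork"]
levels = ["High", "High", "High", "Low", "Low", "Low"]

# One row per animal, one column per variable; diets are numbered 1 to 6.
rows = [
    {"wtgn": gain, "diet": diet, "Source": src, "Level": lev}
    for diet, (gain, src, lev) in enumerate(zip(first_gains, sources, levels), start=1)
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["wtgn", "diet", "Source", "Level"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The full data set would have 60 such rows, ten for each diet.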
After starting the SPSS program the following dialog box appears:
If you select "Opening an existing file" and press OK, the following dialog box appears:
The following dialog box appears:
If the variable names are in the file, ask it to read the names. If you do not specify the Range, the program will identify the Range.
Once you click OK, two windows will appear:
One that will contain the output:
The other containing the data:
To perform ANOVA select Analyze -> General Linear Model -> Univariate
The following dialog box appears
Select the dependent variable and the fixed factors.
Press OK to perform the analysis.
The Output