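Example of chi-square test (1): a loaded die

We roll a die 600 times and count how often each face appears. Has someone tampered with this die?

| Face | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| Observed | 120 | 85 | 113 | 115 | 80 | 87 | 600 |
| Expected | 100 | 100 | 100 | 100 | 100 | 100 | 600 |

We presume that the die is fair (null hypothesis), so the expected count of each face is 600/6 = 100.

$$\chi^2 = \sum_{i=1}^{6} \frac{(f_i - e_i)^2}{e_i} = \frac{20^2 + 15^2 + 13^2 + 15^2 + 20^2 + 13^2}{100} = 15.88, \qquad df = 6 - 1 = 5$$

For df = 5 the upper 0.01 point of the chi-square distribution is 15.1. Since 15.88 > 15.1, the null hypothesis is rejected at p ≤ 0.01.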
Chi-square distribution: critical values by upper-tail probability

| df | 0.995 | 0.99 | 0.975 | 0.95 | 0.9 | 0.5 | 0.1 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.00004 | 0.00016 | 0.00098 | 0.0039 | 0.0158 | 0.455 | 2.710 | 3.84 | 5.02 | 6.63 | 7.88 |
| 2 | 0.01003 | 0.02010 | 0.0506 | 0.1026 | 0.211 | 1.386 | 4.61 | 5.99 | 7.38 | 9.21 | 10.6 |
| 3 | 0.07172 | 0.1148 | 0.2158 | 0.352 | 0.584 | 2.37 | 6.25 | 7.81 | 9.35 | 11.3 | 12.8 |
| 4 | 0.2070 | 0.2971 | 0.484 | 0.711 | 1.06 | 3.36 | 7.78 | 9.49 | 11.1 | 13.3 | 14.9 |
| 5 | 0.4117 | 0.554 | 0.831 | 1.15 | 1.61 | 4.35 | 9.24 | 11.07 | 12.8 | 15.1 | 16.8 |
| 6 | 0.676 | 0.872 | 1.24 | 1.64 | 2.20 | 5.35 | 10.64 | 12.6 | 14.5 | 16.8 | 18.6 |
| 7 | 0.989 | 1.24 | 1.69 | 2.17 | 2.83 | 6.35 | 12.02 | 14.1 | 16.0 | 18.5 | 20.3 |
| 8 | 1.34 | 1.65 | 2.18 | 2.73 | 3.49 | 7.34 | 13.4 | 15.5 | 17.5 | 20.1 | 22.0 |
| 9 | 1.73 | 2.09 | 2.70 | 3.33 | 4.17 | 8.34 | 14.7 | 16.9 | 19.0 | 21.7 | 23.6 |
| 10 | 2.16 | 2.56 | 3.25 | 3.94 | 4.87 | 9.34 | 16.0 | 18.3 | 20.5 | 23.2 | 25.2 |
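The die example can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `chi_square_statistic` is an illustrative choice, and the critical value 15.1 is copied from the df = 5, 0.01 entry of the chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (f_i - e_i)^2 / e_i over all categories."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


# Observed counts of faces 1..6 in 600 rolls (from the example).
observed = [120, 85, 113, 115, 80, 87]

# Null hypothesis: a fair die, so every face is expected 600 / 6 = 100 times.
n_rolls = sum(observed)
expected = [n_rolls / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1

# Critical value for df = 5 at upper-tail probability 0.01 (from the table).
critical_001 = 15.1

print(f"chi2 = {chi2:.2f}, df = {df}, reject at 0.01: {chi2 > critical_001}")
```

The statistic exceeds the tabulated 0.01 critical value, which is why the fair-die hypothesis is rejected at that level.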
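Example of chi-square test (2): difference in effectiveness of two medicines

There are two medicines. Medicine A was administered to 116 people and was effective for 66 of them. Medicine B was administered to 97 people and was effective for 56 of them. The result is summarized as follows.

Observed data and overall ratios

| | A | B | Total | Ratio |
|---|---|---|---|---|
| Effective | 66 | 56 | 122 | 122/213 |
| Ineffective | 50 | 41 | 91 | 91/213 |
| Total | 116 | 97 | 213 | |

Expectation values (null hypothesis: both medicines share the overall ratio)

| | A | B |
|---|---|---|
| Effective | 116 × 122/213 = 66.44131 | 97 × 122/213 = 55.55869 |
| Ineffective | 116 × 91/213 = 49.55869 | 97 × 91/213 = 41.44131 |

Difference (observed − expected)

| | A | B |
|---|---|---|
| Effective | 66 − 66.44131 = −0.44131 | 56 − 55.55869 = 0.44131 |
| Ineffective | 50 − 49.55869 = 0.44131 | 41 − 41.44131 = −0.44131 |

Square of difference divided by expectation, summed over the four cells:

$$\chi^2 = \sum_i \frac{(f_i - e_i)^2}{e_i} = 0.01506622$$

Degree of freedom: (2 − 1)(2 − 1) = 1.

For df = 1 the upper 0.05 point of the chi-square distribution is 3.84. The observed value 0.0151 is far below it, so we cannot reject the null hypothesis that the two medicines are equally effective.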
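The expectation values and the chi-square value of the medicine example can be reproduced as below. This is a sketch under the assumption that the table is arranged as rows (effective, ineffective) by columns (A, B); the function names are illustrative and not from the slides.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected table)."""
    expected = expected_counts(table)
    chi2 = sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for f, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Rows: effective, ineffective. Columns: medicine A, medicine B.
observed = [[66, 56], [50, 41]]
chi2, df, expected = chi_square_independence(observed)

print(f"expected = {expected}")
print(f"chi2 = {chi2:.8f}, df = {df}")
```

The same function works for larger tables, such as a city by social class table, because the degrees of freedom are computed as (rows − 1)(columns − 1).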
Example of chi-square test (3): social class

Social status is not the same as wealth. Even very poor people may think that they belong to high society because of their cultural refinement, so the answer is related to the culture of the area. The following data come from an interview survey that asked people which social level they think they belong to: high society, middle society, or low society. The survey was carried out in cities A, B, C, and D. We want to know whether there are differences among the cities.

[Tables: expectation values; differences between observed values and expectation values]

Calculation of the χ² value from the squares of the differences:

$$\chi^2 = \sum_i \frac{(f_i - e_i)^2}{e_i}$$

Observed χ² is 34.72407.
Degree of freedom of city: 4 − 1 = 3.
Degree of freedom of social class: 3 − 1 = 2.
So the degree of freedom of this dataset is (4 − 1)(3 − 1) = 6.

For df = 6 the upper 0.005 point of the chi-square distribution is 18.6. Since 34.72 > 18.6, the differences among cities are significant at p ≤ 0.005.
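Student's t test

Calculation of observed t:

$$t = \frac{M - \mu}{\sigma_{sample}/\sqrt{n}}, \qquad \text{when } \mu = 0:\quad t = \frac{M}{\sigma_{sample}/\sqrt{n}}$$

Paired t

Comparison between two varieties planted in the same pots (Pot 1, Pot 2, ..., Pot n). For each pot i, d_i is the difference between the two varieties in that pot.

$$\text{Sum} = \sum_{i=1}^{n} d_i, \qquad M = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad SS = \sum_{i=1}^{n} (d_i - M)^2, \qquad \sigma^2 = \frac{SS}{n-1}$$

$$S.E. = \frac{\sigma}{\sqrt{n}}, \qquad t = \frac{M}{S.E.} = \frac{M}{\sigma/\sqrt{n}}$$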
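Example (paired t)

[Table: pot i, A, B, and difference A − B for the 5 pots]

| Quantity | Value |
|---|---|
| n | 5 |
| df | 4 |
| Sum of differences | 8 |
| Average M | 1.6 |
| SS | 1.2 |
| σ² | 0.3 |

$$M = 1.6, \qquad \sigma = \sqrt{0.3}, \qquad t = \frac{M}{\sigma/\sqrt{n}} = \frac{1.6}{\sqrt{0.3}/\sqrt{5}} = 6.531973$$

The threshold of t at p ≤ 0.01 with df = 4 is 4.604, and the observed t = 6.53 ≥ 4.604. We can conclude that the difference is significant at p ≤ 0.01 and that A is larger than B.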
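The paired t calculation can be sketched as follows. The per-pot differences are not given here, so the list `hypothetical_d` is an assumption: it is simply one set of five differences that reproduces the stated summary values (sum 8, M = 1.6, SS = 1.2). The function name `paired_t` is also illustrative.

```python
import math


def paired_t(differences):
    """Return (t, df) for a paired t test of mean difference = 0."""
    n = len(differences)
    mean = sum(differences) / n
    ss = sum((d - mean) ** 2 for d in differences)
    variance = ss / (n - 1)               # sigma^2 = SS / (n - 1)
    std_error = math.sqrt(variance / n)   # S.E. = sigma / sqrt(n)
    return mean / std_error, n - 1


# Hypothetical differences chosen only to match sum = 8 and SS = 1.2.
hypothetical_d = [2, 2, 2, 1, 1]
t, df = paired_t(hypothetical_d)

threshold_001 = 4.604  # t threshold at p <= 0.01, df = 4 (from the example)
print(f"t = {t:.6f}, df = {df}, significant: {t >= threshold_001}")
```

Any other set of five differences with the same mean and SS gives the same t, because t depends on the data only through n, M, and SS.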
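Unpaired t

Two independent samples are compared, for example plants grown with Fertilizer A and plants grown with Fertilizer B. For each sample:

$$df_A = n_A - 1, \quad M_A = \frac{1}{n_A}\sum_{i=1}^{n_A} d_{Ai}, \quad SS_A = \sum_{i=1}^{n_A} (d_{Ai} - M_A)^2, \quad \sigma_A^2 = \frac{SS_A}{n_A - 1}$$

$$df_B = n_B - 1, \quad M_B = \frac{1}{n_B}\sum_{i=1}^{n_B} d_{Bi}, \quad SS_B = \sum_{i=1}^{n_B} (d_{Bi} - M_B)^2, \quad \sigma_B^2 = \frac{SS_B}{n_B - 1}$$

Degree of freedom:

$$df_{A-B} = df_A + df_B = n_A + n_B - 2$$

Second moment around μ (the center of the parent population). We presume homoscedasticity:

$$\sigma_A^2 = \sigma_B^2 = \sigma_{A-B}^2 = \sigma_{A+B}^2$$

$$E[M_A^2] = \frac{1}{n_A}\sigma_A^2 = \frac{1}{n_A}\sigma_{A-B}^2, \qquad E[M_B^2] = \frac{1}{n_B}\sigma_B^2 = \frac{1}{n_B}\sigma_{A-B}^2$$

With the means measured from μ, the null hypothesis makes the first moment of each observed average 0, that is E[M_A] = E[M_B] = 0, so the cross term vanishes:

$$E[M_{A-B}^2] = E[M_A^2] - 2E[M_A]E[M_B] + E[M_B^2] = E[M_A^2] + E[M_B^2]$$

The same result holds for A + B, where only the sign of the cross term changes. Therefore

$$s^2 = E[M_{A-B}^2] = E[M_{A+B}^2] = E[M_A^2] + E[M_B^2] = \left(\frac{1}{n_A} + \frac{1}{n_B}\right)\sigma_{A-B}^2, \qquad s = \sigma_{A-B}\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}$$

Definition of standard error: we define an effective sample size n_{A−B} so that

$$\frac{\sigma_{A-B}}{\sqrt{n_{A-B}}} = s = \sigma_{A-B}\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}, \qquad n_{A-B} = \frac{1}{\frac{1}{n_A} + \frac{1}{n_B}} = \frac{n_A n_B}{n_A + n_B}$$

$$t = \frac{M_A - M_B}{\sigma_{A-B}\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}$$

The variance of A − B is the same as that of A + B. It is obtained as the mean of the two sample variances weighted by their degrees of freedom:

$$\sigma_{A-B}^2 = \sigma_{A+B}^2 = \frac{(n_A - 1)\sigma_A^2}{(n_A - 1) + (n_B - 1)} + \frac{(n_B - 1)\sigma_B^2}{(n_A - 1) + (n_B - 1)} = \frac{(n_A - 1)\sigma_A^2 + (n_B - 1)\sigma_B^2}{n_A + n_B - 2} = \frac{SS_A + SS_B}{n_A + n_B - 2}$$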
Example (unpaired t)

| | A | B |
|---|---|---|
| Data | 1, 5, 6 | 1, 5, 6, 8 |
| n | 3 | 4 |
| df | 2 | 3 |
| Sum | 12 | 20 |
| Average | 4 | 5 |
| SS | 14 | 26 |
| σ² | 7 | 8.6667 |

A: n_A = 3, df_A = 2, M_A = 4, SS_A = 14
B: n_B = 4, df_B = 3, M_B = 5, SS_B = 26

$$n_{A-B} = \frac{n_A n_B}{n_A + n_B} = \frac{3 \times 4}{3 + 4} = 1.714286, \qquad df_{A-B} = df_A + df_B = 2 + 3 = 5$$

$$|M_A - M_B| = 5 - 4 = 1, \qquad SS_{A-B} = SS_A + SS_B = 14 + 26 = 40$$

$$\sigma_{A-B}^2 = \frac{SS_{A-B}}{df_{A-B}} = \frac{40}{5} = 8, \qquad \sigma_{A-B} = \sqrt{8}$$

$$t = \frac{|M_A - M_B|}{\sigma_{A-B}/\sqrt{n_{A-B}}} = \frac{1}{\sqrt{8}/\sqrt{12/7}} = 0.46291$$

The threshold of t at p ≤ 0.05 with df = 5 is 2.571, and the observed t < 2.571. We cannot reject the null hypothesis that M_A − M_B = 0 (M_A = M_B). We therefore cannot conclude that the averages of A and B differ, and we cannot say that samples A and B were taken from different parent populations.
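The pooled-variance t calculation can be sketched in Python as below. The function name `pooled_t` is illustrative, and the sample values are an assumption: they are chosen to reproduce the example's summary statistics (A: n = 3, sum 12, SS 14; B: n = 4, sum 20, SS 26).

```python
import math


def pooled_t(sample_a, sample_b):
    """Unpaired t with pooled variance (equal variances presumed). Returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df      # (SS_A + SS_B) / (n_A + n_B - 2)
    n_effective = n_a * n_b / (n_a + n_b)     # n_{A-B}
    t = (mean_a - mean_b) / math.sqrt(pooled_variance / n_effective)
    return t, df


# Values consistent with the example's n, sums, and SS.
sample_a = [1, 5, 6]
sample_b = [1, 5, 6, 8]
t, df = pooled_t(sample_a, sample_b)

threshold_005 = 2.571  # t threshold at p <= 0.05, df = 5 (from the example)
print(f"|t| = {abs(t):.5f}, df = {df}, significant: {abs(t) >= threshold_005}")
```

The sign of t only reflects which mean is larger; the two-sided comparison with the threshold uses the absolute value.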
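F test

The logical structure of the F test is simple: it is only a comparison between two variances. In the example of the unpaired t test, we need to confirm the equality of variances before the t test.

| | A | B |
|---|---|---|
| Data | 1, 5, 6 | 1, 5, 6, 8 |
| n | 3 | 4 |
| df | 2 | 3 |
| Sum | 12 | 20 |
| Average | 4 | 5 |
| SS | 14 | 26 |
| σ² | 7 | 8.6667 |

$$F = \frac{\sigma_B^2}{\sigma_A^2} = \frac{8.6667}{7} = 1.238095$$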
F distribution (p = 0.05, one-sided). Columns: f1, degree of freedom of the numerator. Rows: f2, degree of freedom of the denominator.

| f2 \ f1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 161.448 | 199.500 | 215.707 | 224.583 | 230.162 | 233.986 | 236.768 | 238.883 | 240.543 | 241.882 |
| 2 | 18.513 | 19.000 | 19.164 | 19.247 | 19.296 | 19.330 | 19.353 | 19.371 | 19.385 | 19.396 |
| 3 | 10.128 | 9.552 | 9.277 | 9.117 | 9.013 | 8.941 | 8.887 | 8.845 | 8.812 | 8.786 |
| 4 | 7.709 | 6.944 | 6.591 | 6.388 | 6.256 | 6.163 | 6.094 | 6.041 | 5.999 | 5.964 |
| 5 | 6.608 | 5.786 | 5.409 | 5.192 | 5.050 | 4.950 | 4.876 | 4.818 | 4.772 | 4.735 |
| 6 | 5.987 | 5.143 | 4.757 | 4.534 | 4.387 | 4.284 | 4.207 | 4.147 | 4.099 | 4.060 |
| 7 | 5.591 | 4.737 | 4.347 | 4.120 | 3.972 | 3.866 | 3.787 | 3.726 | 3.677 | 3.637 |
| 8 | 5.318 | 4.459 | 4.066 | 3.838 | 3.687 | 3.581 | 3.500 | 3.438 | 3.388 | 3.347 |
| 9 | 5.117 | 4.256 | 3.863 | 3.633 | 3.482 | 3.374 | 3.293 | 3.230 | 3.179 | 3.137 |
| 10 | 4.965 | 4.103 | 3.708 | 3.478 | 3.326 | 3.217 | 3.135 | 3.072 | 3.020 | 2.978 |

The theoretical threshold for a numerator df of 3 and a denominator df of 2 is 19.164. The observed F is 1.238095. This value is far smaller than 19.164, so there is no reason to reject the equality of the variances of the numerator and the denominator.
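The variance-ratio check can be written as a short Python sketch. The function name `variance_ratio` is illustrative; the SS and df values and the threshold 19.164 are taken from the example and the F table.

```python
def variance_ratio(ss_numerator, df_numerator, ss_denominator, df_denominator):
    """F = (SS_num / df_num) / (SS_den / df_den), the ratio of two variance estimates."""
    return (ss_numerator / df_numerator) / (ss_denominator / df_denominator)


# Numerator: sample B (SS = 26, df = 3). Denominator: sample A (SS = 14, df = 2).
f_observed = variance_ratio(26, 3, 14, 2)

# Table value at p = 0.05 for numerator df = 3, denominator df = 2.
f_threshold = 19.164

print(f"F = {f_observed:.6f}, variances differ: {f_observed > f_threshold}")
```

The larger variance is placed in the numerator so that the observed F is at least 1 and can be compared with the one-sided table.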
The technical difficulty of the F test is the separation of the variance of the data. The variance of data arises from several causes. When we consider the combined effect of several causes, the structure of the variance can become complicated, depending on the experimental or survey design. However, there are three categories of causes of variance:

1. Main effect: variance caused by the experimental conditions or the categories of data being compared, that is, the difference explained by the designed factor.
2. Residuals: variance caused by unknown factors. Randomly sampled data include the influence of unknown factors.
3. Interaction: unexpected variance caused by the combination of designed conditions.
Experimental design (one-way ANOVA)

Three groups of sizes n_A, n_B, n_C receive different fertilizers.

1. Main effect: differences of fertilizer ✔
2. Residuals: randomness ✔
3. Interaction: does not exist −
Data: g groups F_1, F_2, ..., F_g. Group j holds the observations d_{j1}, d_{j2}, ..., d_{j n_j}.

For each group j = 1, ..., g:

$$\text{Size: } n_j, \qquad \text{Sum: } \sum_{i=1}^{n_j} d_{ji}, \qquad \text{Average: } M_j = \frac{1}{n_j}\sum_{i=1}^{n_j} d_{ji}$$

$$SS_j = \sum_{i=1}^{n_j} (d_{ji} - M_j)^2, \qquad \sigma_j^2 = \frac{SS_j}{n_j - 1}$$

Total:

$$N = \sum_{j=1}^{g} n_j, \qquad \text{Sum}_{total} = \sum_{j=1}^{g}\sum_{i=1}^{n_j} d_{ji}, \qquad M_T = \frac{1}{N}\sum_{j=1}^{g}\sum_{i=1}^{n_j} d_{ji}$$

$$SS_T = \sum_{j=1}^{g}\sum_{i=1}^{n_j} (d_{ji} - M_T)^2, \qquad \sigma_T^2 = \frac{SS_T}{N - 1}$$

Separation of SS

$$d_{ji} - M_T = (d_{ji} - M_j) + (M_j - M_T) = e_{ji} + E_j$$

e_{ji}: deviation of a datum from the mean of its subsample.
E_j: deviation of the mean of the subsample from the total mean.

Within group j, the sum of e_{ji} over i is 0, so the cross term vanishes:

$$\sum_{i=1}^{n_j} (d_{ji} - M_T)^2 = \sum_{i=1}^{n_j} (e_{ji} + E_j)^2 = \sum_{i=1}^{n_j} e_{ji}^2 + 2E_j\sum_{i=1}^{n_j} e_{ji} + \sum_{i=1}^{n_j} E_j^2 = \sum_{i=1}^{n_j} e_{ji}^2 + n_j E_j^2 = SS_j + n_j E_j^2$$

$$SS_T = \sum_{j=1}^{g}\sum_{i=1}^{n_j} (d_{ji} - M_T)^2 = \sum_{j=1}^{g} \left(SS_j + n_j E_j^2\right) = \sum_{j=1}^{g} SS_j + \sum_{j=1}^{g} n_j E_j^2$$

$$\sum_{j=1}^{g} n_j E_j^2 = SS_T - \sum_{j=1}^{g} SS_j$$

SS_j is the residual that remains after removing the main effect, that is, randomness. SS_T is composed of the SS of the residual and the SS of the main effect, and the term Σ n_j E_j² is the SS of the main effect:

$$SS_T = SS_{factor} + SS_{residual}, \qquad SS_{factor} = \sum_{j=1}^{g} n_j E_j^2, \qquad SS_{residual} = \sum_{j=1}^{g} SS_j$$

$$df_{total} = N - 1 = \sum_{j=1}^{g} n_j - 1, \qquad df_{factor} = g - 1, \qquad df_{residual} = (N - 1) - (g - 1) = \sum_{j=1}^{g} (n_j - 1) = N - g$$

$$\sigma_{factor}^2 = \frac{SS_{factor}}{df_{factor}}, \qquad \sigma_{residual}^2 = \frac{SS_{residual}}{df_{residual}}, \qquad F = \frac{\sigma_{factor}^2}{\sigma_{residual}^2}$$
Facilitation of calculation of the sum of squares

With the mean x̄ = (Σ x_i)/n, S = Σ x_i² and T = Σ x_i:

$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = S - \frac{T^2}{n}$$

For subpopulation i (i = 1, ..., g) with observations x_{ij}:

$$S_i = \sum_{j=1}^{n_i} x_{ij}^2, \qquad T_i = \sum_{j=1}^{n_i} x_{ij}, \qquad SS_i = S_i - \frac{T_i^2}{n_i}$$

$$SS_{residual} = SS_1 + SS_2 + \cdots + SS_g = \left(S_1 - \frac{T_1^2}{n_1}\right) + \left(S_2 - \frac{T_2^2}{n_2}\right) + \cdots + \left(S_g - \frac{T_g^2}{n_g}\right) = S - \sum_{i=1}^{g} \frac{T_i^2}{n_i}$$

$$S = S_1 + S_2 + \cdots + S_g, \qquad T = T_1 + T_2 + \cdots + T_g, \qquad SS_{total} = S - \frac{T^2}{n_{total}}$$

Because SS_total = SS_factor + SS_residual:

$$SS_{factor} = SS_{total} - SS_{residual} = \sum_{i=1}^{g} \frac{T_i^2}{n_i} - \frac{T^2}{n_{total}}$$
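The shortcut SS = S − T²/n and the split SS_total = SS_factor + SS_residual can be checked numerically. This is only a sketch: the three groups below are hypothetical values invented for the check, and the function names are illustrative.

```python
def ss_definition(values):
    """Sum of squared deviations from the mean, computed directly."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def ss_shortcut(values):
    """Same quantity via S - T^2 / n, with S = sum of squares and T = sum."""
    s = sum(x * x for x in values)
    t = sum(values)
    return s - t * t / len(values)


# Hypothetical groups, not taken from the slides.
groups = [[3, 5, 4], [8, 9, 7, 8], [2, 4]]
all_values = [x for group in groups for x in group]

ss_total = ss_shortcut(all_values)
ss_residual = sum(ss_shortcut(group) for group in groups)

# SS_factor = sum of T_i^2 / n_i minus T^2 / n_total.
grand_total = sum(all_values)
ss_factor = (
    sum(sum(group) ** 2 / len(group) for group in groups)
    - grand_total ** 2 / len(all_values)
)

print(f"SS_total = {ss_total:.4f}")
print(f"SS_factor + SS_residual = {ss_factor + ss_residual:.4f}")
```

The shortcut needs only running totals of x and x², which is why it was convenient for hand calculation.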
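Example (one-way ANOVA)

[Table: raw observations of the four subpopulations]

| Subpopulation | 1 | 2 | 3 | 4 | Sum |
|---|---|---|---|---|---|
| n_i | 6 | 7 | 5 | 6 | 24 (N) |
| T_i | 31 | 67 | 27 | 61 | 186 (T) |
| Mean x̄_i | 5.166667 | 9.571429 | 5.4 | 10.16667 | |
| S_i | 199 | 791 | 163 | 699 | 1852 (S) |
| T_i²/n_i | 160.1667 | 641.2857 | 145.8 | 620.1667 | 1567.419 |

$$n_{total} = 24, \qquad T = 186, \qquad S = 1852, \qquad \sum_{i=1}^{4} \frac{T_i^2}{n_i} = 1567.419$$

$$SS_{total} = 1852 - \frac{186^2}{24} = 1852 - 1441.5 = 410.5$$

$$SS_{residual} = 1852 - 1567.419 = 284.581$$

$$SS_{factor} = 1567.419 - \frac{186^2}{24} = 1567.419 - 1441.5 = 125.919$$

$$df_{total} = 24 - 1 = 23, \qquad df_{factor} = 4 - 1 = 3, \qquad df_{residual} = 23 - 3 = 20$$

$$\sigma_{factor}^2 = \frac{125.919}{3} = 41.973, \qquad \sigma_{residual}^2 = \frac{284.581}{20} = 14.22905$$

$$F = \frac{\sigma_{factor}^2}{\sigma_{residual}^2} = \frac{41.973}{14.22905} \approx 2.9498$$

The threshold of F in the F distribution table (p = 0.05) at df_numerator = 3 and df_denominator = 20 is 3.0984. The observed F = 2.9498 < 3.0984, so we cannot conclude that there is a difference among the subpopulations.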
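The one-way ANOVA example can be recomputed from the per-group totals alone. This is a sketch with an illustrative function name; the group sizes 6, 7, 5, 6 are an assumption inferred from the listed T_i²/n_i values and the total N = 24, and the threshold 3.0984 is the F value quoted for df (3, 20).

```python
def one_way_anova_from_totals(sizes, totals, sums_of_squares):
    """One-way ANOVA from n_i, T_i (sums), and S_i (sums of squared values)."""
    n_total = sum(sizes)
    grand_total = sum(totals)
    s = sum(sums_of_squares)
    correction = grand_total ** 2 / n_total                        # T^2 / n_total
    group_term = sum(t * t / n for t, n in zip(totals, sizes))     # sum of T_i^2 / n_i
    ss_factor = group_term - correction
    ss_residual = s - group_term
    df_factor = len(sizes) - 1
    df_residual = n_total - len(sizes)
    ms_factor = ss_factor / df_factor
    ms_residual = ss_residual / df_residual
    return {
        "ss_total": s - correction,
        "ss_factor": ss_factor,
        "ss_residual": ss_residual,
        "df_factor": df_factor,
        "df_residual": df_residual,
        "ms_factor": ms_factor,
        "ms_residual": ms_residual,
        "f": ms_factor / ms_residual,
    }


result = one_way_anova_from_totals(
    sizes=[6, 7, 5, 6],
    totals=[31, 67, 27, 61],
    sums_of_squares=[199, 791, 163, 699],
)
f_threshold = 3.0984  # p = 0.05, df = (3, 20), as quoted in the example

for name, value in result.items():
    print(f"{name} = {value:.5f}")
print("significant:", result["f"] > f_threshold)
```

Working from n_i, T_i, and S_i mirrors the hand calculation and does not require the raw observations.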
An example of how to present the result in an analysis of variance table

| Source | Sum of squares (SS) | Degree of freedom (df) | Mean square (MS = σ²) | F observed | p | F threshold |
|---|---|---|---|---|---|---|
| Among populations | 125.919 | 3 | 41.973 | 2.9498 | 0.05 | 3.0984 |
| Residual | 284.581 | 20 | 14.22905 | | | |
Two-way ANOVA without replication

Combined impact of the phosphate level and the nitrogen level in a fertilizer on the growth of a plant. Each combination of nitrogen level (N = 1, ..., φ_n) and phosphate level (P = 1, ..., φ_p) is observed once.

Main effect: differences of fertilizer ✔
Interaction + residuals: randomness ✔
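Data layout: columns are the levels j = 1, ..., n_a of Factor A, rows are the levels i = 1, ..., n_b of Factor B, and d_{ij} is the single observation in row i and column j.

| Factor B \ Factor A | 1 | 2 | ⋯ | n_a | Size | Sum | Mean |
|---|---|---|---|---|---|---|---|
| 1 | d_{11} | d_{12} | ⋯ | d_{1 n_a} | n_a | T_{b1} | M_{b1} = T_{b1}/n_a |
| 2 | d_{21} | d_{22} | ⋯ | d_{2 n_a} | n_a | T_{b2} | M_{b2} = T_{b2}/n_a |
| ⋮ | ⋮ | ⋮ | ⋱ | ⋮ | ⋮ | ⋮ | ⋮ |
| n_b | d_{n_b 1} | d_{n_b 2} | ⋯ | d_{n_b n_a} | n_a | T_{b n_b} | M_{b n_b} = T_{b n_b}/n_a |
| Size | n_b | n_b | ⋯ | n_b | N = n_a n_b | | |
| Sum | T_{a1} | T_{a2} | ⋯ | T_{a n_a} | | T | |
| Average | M_{a1} | M_{a2} | ⋯ | M_{a n_a} | | | M = T/(n_a n_b) |

$$T_{bi} = \sum_{j=1}^{n_a} d_{ij}, \quad T_{aj} = \sum_{i=1}^{n_b} d_{ij}, \quad M_{aj} = \frac{T_{aj}}{n_b}, \quad T = \sum_{i=1}^{n_b}\sum_{j=1}^{n_a} d_{ij}$$

$$M = \frac{1}{n_a}\sum_{j=1}^{n_a} M_{aj} = \frac{1}{n_b}\sum_{i=1}^{n_b} M_{bi}$$

Separation of SS

$$SS_{total} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} (d_{ij} - M)^2$$

$$d_{ij} - M = \Big[d_{ij} - \big\{(M_{aj} - M) + (M_{bi} - M) + M\big\}\Big] + (M_{aj} - M) + (M_{bi} - M)$$

$$e_{ij} = d_{ij} - \big\{(M_{aj} - M) + (M_{bi} - M) + M\big\}, \qquad E_{aj} = M_{aj} - M, \qquad E_{bi} = M_{bi} - M$$

$$SS_{total} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} (e_{ij} + E_{aj} + E_{bi})^2 = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} \Big[e_{ij}^2 + E_{aj}^2 + E_{bi}^2 + 2\big(e_{ij}E_{aj} + E_{aj}E_{bi} + E_{bi}e_{ij}\big)\Big]$$

The cross terms vanish:

$$\sum_{j=1}^{n_a}\sum_{i=1}^{n_b} e_{ij}E_{aj} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} E_{aj}E_{bi} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} E_{bi}e_{ij} = 0$$

so, with SS_a = Σ_j E_{aj}² and SS_b = Σ_i E_{bi}²,

$$SS_{total} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} e_{ij}^2 + \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} E_{aj}^2 + \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} E_{bi}^2 = SS_{residual} + n_b SS_a + n_a SS_b$$

$$\sigma_a^2 = \frac{SS_a}{df_a}, \qquad \sigma_b^2 = \frac{SS_b}{df_b}, \qquad \sigma_{residual}^2 = \frac{SS_{residual}}{df_{residual}}, \qquad F_a = \frac{\sigma_a^2}{\sigma_{residual}^2}, \qquad F_b = \frac{\sigma_b^2}{\sigma_{residual}^2}$$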
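Facilitation of calculation

From the separation of SS_total (here SS_interaction denotes the residual term, because interaction and residual cannot be separated without replication):

$$n_b SS_a = SS_{total} - (SS_{interaction} + n_a SS_b), \qquad n_a SS_b = SS_{total} - (SS_{interaction} + n_b SS_a)$$

We calculate the SS within row i:

$$SS_{ri} = \sum_{j=1}^{n_a} \left(d_{ij} - \frac{1}{n_a}\sum_{j'=1}^{n_a} d_{ij'}\right)^2$$

Write d_{ij} = e_{ij} + D_{bi} + D_{aj} + M, where D_{aj} and D_{bi} are the deviations of the column mean and the row mean from M. Since Σ_j D_{aj} = 0:

$$\sum_{j=1}^{n_a} d_{ij} = \sum_{j=1}^{n_a} (e_{ij} + D_{bi} + D_{aj} + M) = \sum_{j=1}^{n_a} e_{ij} + n_a D_{bi} + n_a M$$

$$\frac{1}{n_a}\sum_{j=1}^{n_a} d_{ij} = \frac{1}{n_a}\sum_{j=1}^{n_a} e_{ij} + D_{bi} + M$$

$$d_{ij} - \frac{1}{n_a}\sum_{j'=1}^{n_a} d_{ij'} = e_{ij} - \frac{1}{n_a}\sum_{j'=1}^{n_a} e_{ij'} + D_{aj} = \varepsilon_{ij} + D_{aj}, \qquad \varepsilon_{ij} = e_{ij} - \frac{1}{n_a}\sum_{j'=1}^{n_a} e_{ij'}$$

$$SS_{ri} = \sum_{j=1}^{n_a} (\varepsilon_{ij} + D_{aj})^2 = \sum_{j=1}^{n_a} \varepsilon_{ij}^2 + 2\sum_{j=1}^{n_a} \varepsilon_{ij}D_{aj} + \sum_{j=1}^{n_a} D_{aj}^2$$

Summing over the rows, and using ΣΣ ε_{ij} D_{aj} = 0:

$$\sum_{i=1}^{n_b} SS_{ri} = \sum_{i=1}^{n_b}\sum_{j=1}^{n_a} \varepsilon_{ij}^2 + n_b\sum_{j=1}^{n_a} D_{aj}^2$$

$$SS_r = \sum_{i=1}^{n_b} SS_{ri}, \qquad SS_{residual} = \sum_{i=1}^{n_b}\sum_{j=1}^{n_a} \varepsilon_{ij}^2, \qquad SS_r = SS_{residual} + n_b SS_A$$

Similarly, for SS_c, the sum of the SS within each column:

$$SS_c = SS_{residual} + n_a SS_B$$

Therefore

$$n_b SS_A = SS_{total} - SS_c, \qquad n_a SS_B = SS_{total} - SS_r$$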
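Example of dataset

A: content of phosphate. B: content of nitrogen.

Calculation process, row by row (levels of B, n = 3 values per row):

| | A1 | A2 | A3 | n | T_i | S_i | T_i²/n | S_i − T_i²/n |
|---|---|---|---|---|---|---|---|---|
| B1 | 11 | 11 | 8 | 3 | 30 | 306 | 300 | 6 |
| B2 | 10 | 13 | 19 | 3 | 42 | 630 | 588 | 42 |
| B3 | 9 | 18 | 18 | 3 | 45 | 729 | 675 | 54 |
| B4 | 14 | 18 | 19 | 3 | 51 | 881 | 867 | 14 |
| Total | | | | 12 | 168 | 2546 | 2352 | 116 |

Column by column (levels of A, n = 4 values per column):

| | A1 | A2 | A3 | Total |
|---|---|---|---|---|
| n | 4 | 4 | 4 | 12 |
| T_j | 44 | 60 | 64 | 168 |
| S_j | 498 | 938 | 1110 | 2546 |
| T_j²/n | 484 | 900 | 1024 | |
| S_j − T_j²/n | 14 | 38 | 86 | 138 |

$$SS_r = SS_{residual} + n_b SS_A = 116, \qquad SS_c = SS_{residual} + n_a SS_B = 138, \qquad SS_{total} = 2546 - \frac{168^2}{12} = 2546 - 2352 = 194$$

$$4\,SS_A = 194 - 138 = 56, \qquad 3\,SS_B = 194 - 116 = 78, \qquad SS_{interaction} = 194 - 56 - 78 = 60$$

$$SS_A = \frac{56}{4} = 14, \qquad SS_B = \frac{78}{3} = 26$$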
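Analysis of variance table

| Source | SS | df | MS | Observed F | p | Threshold F |
|---|---|---|---|---|---|---|
| A | 14 | 2 | 7 | 0.7 | 0.05 | 5.1433 |
| B | 26 | 3 | 8.6667 | 0.8667 | 0.05 | 4.7571 |
| Residual | 60 | 6 | 10 | | | |
| Total | 194 | 11 | | | | |

The total is SS_total = 4 SS_A + 3 SS_B + SS_residual = 56 + 78 + 60 = 194. Both observed F values are below their thresholds, so neither effect is significant at p = 0.05.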
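The sums of squares of the example can be recomputed from the data table. This is a sketch with illustrative function names; the 4 × 3 data values are an assumption in the sense that they are the values consistent with every row and column total (T, S) shown in the calculation process.

```python
def sum_sq_dev(values):
    """Sum of squared deviations via the shortcut S - T^2 / n."""
    total = sum(values)
    return sum(v * v for v in values) - total * total / len(values)


def two_way_no_replication(table):
    """Rows = levels of B, columns = levels of A, one observation per cell."""
    all_values = [v for row in table for v in row]
    columns = [list(col) for col in zip(*table)]
    ss_total = sum_sq_dev(all_values)
    ss_r = sum(sum_sq_dev(row) for row in table)     # SS within rows
    ss_c = sum(sum_sq_dev(col) for col in columns)   # SS within columns
    scaled_a = ss_total - ss_c                       # n_b * SS_A
    scaled_b = ss_total - ss_r                       # n_a * SS_B
    return {
        "ss_total": ss_total,
        "ss_r": ss_r,
        "ss_c": ss_c,
        "scaled_a": scaled_a,
        "scaled_b": scaled_b,
        "ss_residual": ss_total - scaled_a - scaled_b,
    }


# Rows B1..B4, columns A1..A3 (values consistent with the listed totals).
data = [
    [11, 11, 8],
    [10, 13, 19],
    [9, 18, 18],
    [14, 18, 19],
]
result = two_way_no_replication(data)
for name, value in result.items():
    print(f"{name} = {value:g}")
```

Subtracting the within-column SS removes everything except the column (factor A) effect, which is why SS_total − SS_c isolates the A term.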
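Two-way ANOVA with replications

Every combination of nitrogen level (N = 1, ..., n_n) and phosphate level (P = 1, ..., n_p) is observed several times.

Data layout: d_{ijk} is replication k (k = 1, ..., n_g) in row i (level of Factor B, i = 1, ..., n_b) and column j (level of Factor A, j = 1, ..., n_a).

| Factor B \ Factor A | 1 | 2 | ⋯ | n_a | Sum | Mean |
|---|---|---|---|---|---|---|
| 1 | d_{111}, d_{112}, ..., d_{11 n_g} | d_{121}, d_{122}, ..., d_{12 n_g} | ⋯ | d_{1 n_a 1}, d_{1 n_a 2}, ..., d_{1 n_a n_g} | | |
| sum | T_{11} | T_{12} | ⋯ | T_{1 n_a} | T_{B1} | |
| mean | T_{11}/n_g | T_{12}/n_g | ⋯ | T_{1 n_a}/n_g | | T_{B1}/(n_g n_a) |
| ⋮ | ⋮ | ⋮ | ⋱ | ⋮ | ⋮ | ⋮ |
| n_b | d_{n_b 1 1}, ..., d_{n_b 1 n_g} | d_{n_b 2 1}, ..., d_{n_b 2 n_g} | ⋯ | d_{n_b n_a 1}, ..., d_{n_b n_a n_g} | | |
| sum | T_{n_b 1} | T_{n_b 2} | ⋯ | T_{n_b n_a} | T_{B n_b} | |
| mean | T_{n_b 1}/n_g | T_{n_b 2}/n_g | ⋯ | T_{n_b n_a}/n_g | | T_{B n_b}/(n_g n_a) |
| Sum | T_{A1} | T_{A2} | ⋯ | T_{A n_a} | T | |
| Mean | T_{A1}/(n_g n_b) | T_{A2}/(n_g n_b) | ⋯ | T_{A n_a}/(n_g n_b) | | T/(n_g n_a n_b) |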
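Separation of SS

$$SS_{total} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b}\sum_{k=1}^{n_g} (d_{ijk} - M)^2$$

$$d_{ijk} - M = \Big[d_{ijk} - \big\{(\varepsilon_{ij} - \varepsilon_i) + (M_{aj} - M) + (M_{bi} - M) + M\big\}\Big] + (\varepsilon_{ij} - \varepsilon_i) + (M_{aj} - M) + (M_{bi} - M)$$

$$\delta_{ijk} = d_{ijk} - \big\{(\varepsilon_{ij} - \varepsilon_i) + (M_{aj} - M) + (M_{bi} - M) + M\big\}, \qquad \epsilon_{ij} = \varepsilon_{ij} - \varepsilon_i, \qquad E_{aj} = M_{aj} - M, \qquad E_{bi} = M_{bi} - M$$

Here δ_{ijk} is the residual of a replication within its cell and ϵ_{ij} is the deviation specific to the combination (i, j), that is, the interaction. The cross terms vanish as before, so

$$SS_{total} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b}\sum_{k=1}^{n_g} (\delta_{ijk} + \epsilon_{ij} + E_{aj} + E_{bi})^2 = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b}\sum_{k=1}^{n_g} \delta_{ijk}^2 + n_g\sum_{j=1}^{n_a}\sum_{i=1}^{n_b} \epsilon_{ij}^2 + n_b n_g\sum_{j=1}^{n_a} E_{aj}^2 + n_a n_g\sum_{i=1}^{n_b} E_{bi}^2$$

$$SS_{total} = SS_{residual} + n_g SS_{interaction(a \times b)} + n_b n_g SS_a + n_a n_g SS_b$$

$$\sigma_a^2 = \frac{SS_a}{df_a}, \qquad \sigma_b^2 = \frac{SS_b}{df_b}, \qquad \sigma_{(a \times b)}^2 = \frac{SS_{(a \times b)}}{df_{(a \times b)}}, \qquad \sigma_r^2 = \sigma_{residual}^2 = \frac{SS_{residual}}{df_{residual}}$$

F ratios against the residual:

$$F_{a \times b/r} = \frac{\sigma_{(a \times b)}^2}{\sigma_r^2}, \qquad F_{a/r} = \frac{\sigma_a^2}{\sigma_r^2}, \qquad F_{b/r} = \frac{\sigma_b^2}{\sigma_r^2}$$

F ratios against the interaction:

$$F_{a/(a \times b)} = \frac{\sigma_a^2}{\sigma_{(a \times b)}^2}, \qquad F_{b/(a \times b)} = \frac{\sigma_b^2}{\sigma_{(a \times b)}^2}$$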
Total degree of freedom: df_total = n_a n_b n_g − 1
Degree of freedom among A: df_A = n_a − 1
Degree of freedom among B: df_B = n_b − 1
Degree of freedom of interaction: df_interaction = (n_a − 1)(n_b − 1)
Degree of freedom of residual: df_residual = n_a n_b (n_g − 1)

$$n_a n_b n_g - 1 = (n_a - 1) + (n_b - 1) + (n_a - 1)(n_b - 1) + n_a n_b (n_g - 1)$$
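The degree-of-freedom bookkeeping can be checked with a small Python sketch. The function name `two_way_df` is illustrative; the sizes n_a = 3, n_b = 4, n_g = 3 are those of the 36-observation worked example.

```python
def two_way_df(n_a, n_b, n_g):
    """Degrees of freedom for a two-way ANOVA with n_g replications per cell."""
    return {
        "A": n_a - 1,
        "B": n_b - 1,
        "interaction": (n_a - 1) * (n_b - 1),
        "residual": n_a * n_b * (n_g - 1),
        "total": n_a * n_b * n_g - 1,
    }


df = two_way_df(n_a=3, n_b=4, n_g=3)

# The four components must add up to the total degree of freedom.
components = df["A"] + df["B"] + df["interaction"] + df["residual"]
print(df, "components sum to total:", components == df["total"])
```

With n_g = 1 the residual degree of freedom is 0, which is why residual and interaction cannot be separated without replication.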
Facilitation of calculation

$$SS_{total} = S_{total} - \frac{T^2}{n_{total}}$$

$$SS_{total} = n_b n_g SS_A + n_a n_g SS_B + n_g SS_{interaction} + SS_{residual}$$

SS_residual is the sum, over all cells (combinations of conditions), of the sum of squares of the replications within the cell:

$$SS_{residual} = \sum_{j=1}^{n_a}\sum_{i=1}^{n_b} ss_{ij}$$

Sum of the SS within the rows, SS_{total−B}:

$$n_a n_g SS_B = SS_{total} - SS_{total-B}$$

Sum of the SS within the columns, SS_{total−A}:

$$n_b n_g SS_A = SS_{total} - SS_{total-A}$$

$$n_g SS_{interaction} = SS_{total} - (n_b n_g SS_A + n_a n_g SS_B) - SS_{residual}$$
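Example (n_a = 3, n_b = 4, n_g = 3; 36 observations)

$$SS_{total} = S_{total} - \frac{T^2}{n_{total}} = 7728 - \frac{504^2}{36} = 7728 - 7056 = 672$$

$$SS_{total} = 4 \times 3 \times SS_A + 3 \times 3 \times SS_B + 3 \times SS_{interaction} + SS_{residual}$$

SS_residual, the sum of the sums of squares of the replications in each cell (combination of conditions):

$$SS_{residual} = \sum_{i=1}^{4}\sum_{j=1}^{3} ss_{ij} = 24 + 14 + 18 + 34 = 90$$

Sum of the SS within the rows:

$$SS_{total-B} = 42 + 140 + 180 + 76 = 438, \qquad 3 \times 3 \times SS_B = SS_{total} - SS_{total-B} = 672 - 438 = 234$$

Sum of the SS within the columns:

$$SS_{total-A} = 58 + 174 + 272 = 504, \qquad 4 \times 3 \times SS_A = SS_{total} - SS_{total-A} = 672 - 504 = 168$$

$$3 \times SS_{interaction} = SS_{total} - 4 \times 3 \times SS_A - 3 \times 3 \times SS_B - SS_{residual} = 672 - 168 - 234 - 90 = 180$$

$$SS_B = \frac{234}{9} = 26, \qquad SS_A = \frac{168}{12} = 14, \qquad SS_{interaction} = \frac{180}{3} = 60$$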
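Analysis of variance table. F observed 1 is the mean square divided by the residual mean square; F observed 2 is the mean square divided by the interaction mean square. Each threshold is the p = 0.05 value for the corresponding degrees of freedom.

| Source | SS | df | MS | F observed 1 | F threshold 1 | F observed 2 | F threshold 2 |
|---|---|---|---|---|---|---|---|
| A | 14 | 2 | 7 | 1.8667 | 3.4 | 0.7 | 5.14 |
| B | 26 | 3 | 8.6667 | 2.3111 | 3.0 | 0.8667 | 4.76 |
| Interaction | 60 | 6 | 10 | 2.6667 | 2.5 | | |
| Residual | 90 | 24 | 3.75 | | | | |
| Sum | | 35 | | | | | |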
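The mean squares and F ratios of the table follow directly from the SS and df columns. The sketch below uses illustrative variable names and only the SS and df values of the example; for A against the residual it gives 7 / 3.75 = 1.8667.

```python
# (SS, df) per source, as listed in the analysis of variance table.
sources = {
    "A": (14, 2),
    "B": (26, 3),
    "interaction": (60, 6),
    "residual": (90, 24),
}

# Mean square = SS / df for every source.
mean_square = {name: ss / df for name, (ss, df) in sources.items()}

# F observed 1: each effect against the residual mean square.
f_vs_residual = {
    name: mean_square[name] / mean_square["residual"]
    for name in ("A", "B", "interaction")
}

# F observed 2: main effects against the interaction mean square.
f_vs_interaction = {
    name: mean_square[name] / mean_square["interaction"]
    for name in ("A", "B")
}

for name, value in f_vs_residual.items():
    print(f"{name} / residual: F = {value:.4f}")
for name, value in f_vs_interaction.items():
    print(f"{name} / interaction: F = {value:.4f}")
```

Keeping the SS and df pairs in one mapping makes it easy to switch the denominator between the residual and the interaction mean square.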
Importance of confirmation of equality of variance (Particularly ratio data)