Chapter 11 – Analysis of Variance
Introduction to Business Statistics, 6e
Kvanli, Pavur, Keeling
Slides prepared by Jeff Heyl, Lincoln University
©2003 South-Western/Thomson Learning™
Analysis of Variance
Analysis of variance (ANOVA) determines whether a factor has a significant effect on the variable being measured.
It does so by examining the variation within samples and the variation between samples.
Measuring Variation
SS(factor) measures the between-sample variation [SS(between)].
SS(error) measures the within-sample variation [SS(within)].
SS(total) measures the total variation in the sample [SS(factor) + SS(error)].
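Determining Sum of Squares
$\mathrm{SS(factor)} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} - \frac{T^2}{n}$
$\mathrm{SS(total)} = \sum x^2 - \frac{(\sum x)^2}{n} = \sum x^2 - \frac{T^2}{n}$
$\mathrm{SS(error)} = \sum x^2 - \left[\frac{T_1^2}{n_1} + \frac{T_2^2}{n_2}\right]$, or $\mathrm{SS(error)} = \mathrm{SS(total)} - \mathrm{SS(factor)}$
where $T_1$ and $T_2$ are the sample totals, $T = T_1 + T_2 = \sum x$ is the total of all observations, and $n = n_1 + n_2$.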
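ANOVA Test for Ho: µ1 = µ2 Versus Ha: µ1 ≠ µ2
$\mathrm{MS(factor)} = \frac{\mathrm{SS(factor)}}{\text{df for factor}} = \frac{\mathrm{SS(factor)}}{1}$
$\mathrm{MS(error)} = \frac{\mathrm{SS(error)}}{\text{df for error}} = \frac{\mathrm{SS(error)}}{n_1 + n_2 - 2}$
$F = \frac{\text{estimated population variance based on the variation among the sample means}}{\text{estimated population variance based on the variation within each of the samples}} = \frac{\mathrm{MS(factor)}}{\mathrm{MS(error)}}$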
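The two-sample sums of squares and the F statistic can be sketched in a few lines of Python. This is only an illustration: the two samples are hypothetical (they are not the textbook's data), and the function name `two_sample_anova` is made up for this example.

```python
# Hypothetical data: two small samples, not taken from the textbook.
sample1 = [4.0, 5.0, 6.0]
sample2 = [7.0, 8.0, 9.0]


def two_sample_anova(x1, x2):
    """Return SS(factor), SS(error), SS(total) and F for two samples."""
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    t1, t2 = sum(x1), sum(x2)              # sample totals T1 and T2
    t = t1 + t2                            # grand total T
    sum_sq = sum(v * v for v in x1 + x2)   # sum of the squared observations

    ss_factor = t1 ** 2 / n1 + t2 ** 2 / n2 - t ** 2 / n
    ss_total = sum_sq - t ** 2 / n
    ss_error = ss_total - ss_factor

    ms_factor = ss_factor / 1              # df for factor = 1
    ms_error = ss_error / (n1 + n2 - 2)    # df for error = n1 + n2 - 2
    return ss_factor, ss_error, ss_total, ms_factor / ms_error


ss_factor, ss_error, ss_total, f_stat = two_sample_anova(sample1, sample2)
print(ss_factor, ss_error, ss_total, f_stat)  # 13.5 4.0 17.5 13.5
```

For these samples the means are 5 and 8, so the within-sample variation is 2 + 2 = 4 and the between-sample variation is 13.5, which matches the totals-based formulas.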
Defining the Rejection Region Figure 11.1
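p-Values for Battery Lifetime Example
Figure 11.2: t curve with 8 df, computed t = 4.20, p-value = 2 × (shaded area) = .0030; F curve with 1 and 8 df, computed F = 17.64, p-value = shaded area.
Note that F = t² (4.20² = 17.64), so the two tests are equivalent when there are two samples.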
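The equivalence between the two-sample pooled t test and the two-sample ANOVA F test can be checked numerically. The sketch below uses hypothetical data (not the battery lifetime data) and hypothetical helper names `pooled_t` and `anova_f`; it computes the sums of squares from deviations about the means, which is algebraically the same as the totals-based formulas.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data, not the battery lifetime data from the textbook.
sample_a = [4.0, 5.0, 6.0]
sample_b = [7.0, 8.0, 9.0]


def pooled_t(x1, x2):
    """Two-sample t statistic using the pooled variance."""
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))


def anova_f(x1, x2):
    """F = MS(factor) / MS(error) for two samples."""
    n1, n2 = len(x1), len(x2)
    grand_mean = mean(x1 + x2)
    ss_factor = (n1 * (mean(x1) - grand_mean) ** 2
                 + n2 * (mean(x2) - grand_mean) ** 2)
    ss_error = (n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)
    return (ss_factor / 1) / (ss_error / (n1 + n2 - 2))


t_value = pooled_t(sample_a, sample_b)
f_value = anova_f(sample_a, sample_b)
print(t_value ** 2, f_value)  # the two values agree up to rounding
```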
Dot Array Diagram
Figure 11.3: dot array diagram for samples A and B (horizontal axis: number of cartons, 25 to 50).
Figure 11.4: dot array diagram for samples A and B (horizontal axis: number of cartons, 25 to 50).
Assumptions
The replicates are obtained independently and randomly from each of the populations.
The replicates from each population follow an (approximately) normal distribution.
The normal populations all have a common variance.
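Deriving the Sum of Squares
$\mathrm{SS(factor)} = \left[\frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k}\right] - \frac{T^2}{n}$
$\mathrm{SS(total)} = \sum x^2 - \frac{T^2}{n}$
$\mathrm{SS(error)} = \sum x^2 - \left[\frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k}\right] = \mathrm{SS(total)} - \mathrm{SS(factor)}$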
The ANOVA Table
Source   df      SS           MS           F
Factor   k - 1   SS(factor)   MS(factor)   MS(factor)/MS(error)
Error    n - k   SS(error)    MS(error)
Total    n - 1   SS(total)
$\mathrm{MS(factor)} = \frac{\mathrm{SS(factor)}}{k - 1}$
$\mathrm{MS(error)} = \frac{\mathrm{SS(error)}}{n - k}$
$F = \frac{\mathrm{MS(factor)}}{\mathrm{MS(error)}}$
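The one-factor ANOVA table can be filled in directly from the sample totals. The following is a minimal sketch under stated assumptions: the three samples are hypothetical, and `one_factor_anova` is an illustrative name rather than a function from any library.

```python
# Hypothetical data: k = 3 samples of 3 replicates each.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [6.0, 7.0, 8.0],
]


def one_factor_anova(samples):
    """Return the entries of the one-factor ANOVA table as a dict."""
    k = len(samples)                              # number of factor levels
    n = sum(len(s) for s in samples)              # total number of observations
    grand_total = sum(sum(s) for s in samples)    # T
    correction = grand_total ** 2 / n             # T^2 / n

    ss_factor = sum(sum(s) ** 2 / len(s) for s in samples) - correction
    ss_total = sum(x * x for s in samples for x in s) - correction
    ss_error = ss_total - ss_factor

    ms_factor = ss_factor / (k - 1)
    ms_error = ss_error / (n - k)
    return {
        "df": (k - 1, n - k, n - 1),
        "ss": (ss_factor, ss_error, ss_total),
        "ms": (ms_factor, ms_error),
        "f": ms_factor / ms_error,
    }


table = one_factor_anova(groups)
print(table["df"])  # (2, 6, 8)
print(table["ss"])  # (42.0, 6.0, 48.0)
print(table["f"])   # 21.0
```

The computed F would then be compared with the tabled F value for k - 1 and n - k degrees of freedom.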
Test for Equal Variances
Ho: $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$
Ha: at least 2 variances are unequal
$H = \frac{\text{maximum } s^2}{\text{minimum } s^2}$
reject Ho if H > the critical value of H from Table A.14
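The H statistic is simple to compute from the sample variances. The sketch below uses hypothetical data and a hypothetical function name, `h_statistic`; the critical value still has to be read from Table A.14, so no decision is made in the code.

```python
from statistics import variance

# Hypothetical data: three samples of equal size.
samples = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [5.0, 5.0, 8.0],
]


def h_statistic(groups):
    """Ratio of the largest sample variance to the smallest."""
    variances = [variance(g) for g in groups]  # s^2 for each sample
    return max(variances) / min(variances)


h_value = h_statistic(samples)
print(h_value)  # 4.0, to be compared with the tabled H value
```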
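Confidence Intervals in One-Factor ANOVA
The (1 - α) • 100% confidence interval for µi is
$\bar{X}_i - t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i}}$ to $\bar{X}_i + t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i}}$
where
k = number of populations (levels)
$n_i$ = number of replicates in the ith sample
n = total number of observations
$s_p = \sqrt{\mathrm{MS(error)}}$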
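Confidence Intervals in One-Factor ANOVA
The (1 - α) • 100% confidence interval for µi - µj is
$(\bar{X}_i - \bar{X}_j) - t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$ to $(\bar{X}_i - \bar{X}_j) + t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$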
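Both intervals can be sketched as small helper functions. Everything named here is an assumption made for illustration: the function names `mean_ci` and `diff_ci`, the summary statistics, and the choice of α = .05. The t value is passed in rather than computed; 2.447 is the usual table value of t for an upper-tail area of .025 with 6 df.

```python
from math import sqrt


def mean_ci(xbar_i, n_i, ms_error, t_crit):
    """Confidence interval for one population mean."""
    s_p = sqrt(ms_error)
    half_width = t_crit * s_p * sqrt(1 / n_i)
    return xbar_i - half_width, xbar_i + half_width


def diff_ci(xbar_i, xbar_j, n_i, n_j, ms_error, t_crit):
    """Confidence interval for the difference between two population means."""
    s_p = sqrt(ms_error)
    half_width = t_crit * s_p * sqrt(1 / n_i + 1 / n_j)
    diff = xbar_i - xbar_j
    return diff - half_width, diff + half_width


# Hypothetical summary: k = 3 samples of 3 replicates, so n - k = 6 df.
T_CRIT = 2.447     # t for upper-tail area .025 with 6 df, from a t table
MS_ERROR = 1.0     # hypothetical MS(error)

ci_mean = mean_ci(2.0, 3, MS_ERROR, T_CRIT)
ci_diff = diff_ci(7.0, 2.0, 3, 3, MS_ERROR, T_CRIT)
print(ci_mean)
print(ci_diff)
```

If the interval for the difference does not contain zero, the two population means are judged to be different at the chosen confidence level.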
Multiple Comparisons Procedure
The multiple comparisons procedure compares all possible pairs of means in such a way that the probability of making one or more Type I errors is α.
Tukey’s Test
$Q = \frac{\text{maximum}(\bar{X}_i) - \text{minimum}(\bar{X}_i)}{\sqrt{\mathrm{MS(error)}/n_r}}$
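Multiple Comparisons Procedure
1. Find $Q_{\alpha,k,v}$ using Table A.16, where v is the df for error
2. Determine $D = Q_{\alpha,k,v} \cdot \sqrt{\frac{\mathrm{MS(error)}}{n_r}}$, where $n_r$ is the number of replicates in each sample
3. Place the sample means in order, from smallest to largest
4. If two means differ by more than D, the conclusion is that the corresponding population means are unequal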
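The four steps above can be sketched as follows. The sample means, MS(error), and the function name `tukey_pairs` are hypothetical. The critical value is supplied by the caller; 4.34 is the value I believe a studentized range table gives for α = .05, k = 3, and v = 6, and it should be confirmed against Table A.16 before being relied on.

```python
from itertools import combinations
from math import sqrt


def tukey_pairs(means, ms_error, n_r, q_crit):
    """Return D and the pairs of samples whose means differ by more than D."""
    d = q_crit * sqrt(ms_error / n_r)                       # step 2
    ordered = sorted(means.items(), key=lambda item: item[1])  # step 3
    differing = [
        (name_lo, name_hi)
        for (name_lo, mean_lo), (name_hi, mean_hi) in combinations(ordered, 2)
        if mean_hi - mean_lo > d                            # step 4
    ]
    return d, differing


# Hypothetical sample means with 3 replicates per sample.
sample_means = {"A": 2.0, "B": 3.0, "C": 7.0}
Q_CRIT = 4.34  # assumed table value; verify against Table A.16

d_value, unequal_pairs = tukey_pairs(sample_means, ms_error=1.0, n_r=3, q_crit=Q_CRIT)
print(d_value)
print(unequal_pairs)  # [('A', 'C'), ('B', 'C')]
```

Here A and B differ by only 1, which is less than D, so only the pairs involving C are declared unequal.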
Nylon Breaking Strength
Figure 11.5: plot of group means (horizontal axis: group, 1 to 5; vertical axis: group means, 21 to 26).
Nylon Breaking Strength Figure 11.6
Nylon Breaking Strength Figure 11.7
One-Factor ANOVA Procedure
Requirements
The replicates are obtained independently and randomly from each of the populations.
The observations from each population follow (approximately) a normal distribution.
The populations all have a common variance.
One-Factor ANOVA Procedure
Hypotheses
Ho: µ1 = µ2 = … = µk
Ha: not all µ’s are equal
Source   df      SS           MS           F
Factor   k - 1   SS(factor)   MS(factor)   MS(factor)/MS(error)
Error    n - k   SS(error)    MS(error)
Total    n - 1   SS(total)
reject Ho if $F^* > F_{\alpha,\,k-1,\,n-k}$, where F* is the computed value of F
Completely Randomized Design
Replicates are obtained in a completely random manner from each population.
The null hypothesis is Ho: µ1 = µ2 = ... = µk
Randomized Block Design
The samples are not independent; the data are grouped (blocked) by another variable.
Unlike the completely randomized design, the randomized block design uses a blocking strategy rather than independent samples, which gives a more precise test for differences in the factor level means.
Randomized Block Design
$\mathrm{SS(factor)} = \frac{1}{b}\left[T_1^2 + T_2^2 + \dots + T_k^2\right] - \frac{T^2}{bk}$
where
k = number of factor levels in the design
b = number of blocks in the design
n = number of observations = bk
$T_1, T_2, \dots, T_k$ represent the totals for the k factor levels
$S_1, S_2, \dots, S_b$ are the totals for the b blocks
$T = T_1 + T_2 + \dots + T_k = S_1 + S_2 + \dots + S_b$ = total of all observations
Randomized Block Design
$\mathrm{SS(blocks)} = \frac{1}{k}\left[S_1^2 + S_2^2 + \dots + S_b^2\right] - \frac{T^2}{bk}$
$\mathrm{SS(total)} = \sum x^2 - \frac{T^2}{bk}$
$\mathrm{SS(error)} = \mathrm{SS(total)} - \mathrm{SS(factor)} - \mathrm{SS(blocks)}$
df for factor = k - 1
df for blocks = b - 1
df for error = (k - 1)(b - 1)
df for total = bk - 1
Randomized Block Design
Source   df               SS           MS           F
Factor   k - 1            SS(factor)   MS(factor)   F1
Blocks   b - 1            SS(blocks)   MS(blocks)   F2
Error    (k - 1)(b - 1)   SS(error)    MS(error)
Total    bk - 1           SS(total)
$F_1 = \frac{\mathrm{MS(factor)}}{\mathrm{MS(error)}}$
$F_2 = \frac{\mathrm{MS(blocks)}}{\mathrm{MS(error)}}$
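The randomized block sums of squares and both F statistics can be sketched from a b-by-k table of observations. The data below are hypothetical, and the function name `randomized_block_anova` and the row/column convention (rows are blocks, columns are factor levels) are assumptions made for this example.

```python
# Hypothetical data: rows are the b = 3 blocks, columns are the k = 3 factor levels.
observations = [
    [1.0, 3.0, 5.0],
    [2.0, 3.0, 7.0],
    [4.0, 6.0, 8.0],
]


def randomized_block_anova(data):
    """data[i][j] is the observation for block i and factor level j."""
    b = len(data)                                                    # blocks
    k = len(data[0])                                                 # factor levels
    level_totals = [sum(row[j] for row in data) for j in range(k)]   # T_1..T_k
    block_totals = [sum(row) for row in data]                        # S_1..S_b
    grand_total = sum(block_totals)                                  # T
    correction = grand_total ** 2 / (b * k)                          # T^2 / bk

    ss_factor = sum(t * t for t in level_totals) / b - correction
    ss_blocks = sum(s * s for s in block_totals) / k - correction
    ss_total = sum(x * x for row in data for x in row) - correction
    ss_error = ss_total - ss_factor - ss_blocks

    ms_error = ss_error / ((k - 1) * (b - 1))
    f1 = (ss_factor / (k - 1)) / ms_error    # tests the factor level means
    f2 = (ss_blocks / (b - 1)) / ms_error    # tests the block means
    return {
        "ss_factor": ss_factor,
        "ss_blocks": ss_blocks,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "f1": f1,
        "f2": f2,
    }


block_table = randomized_block_anova(observations)
print(block_table)
```

For these values SS(total) = 44 splits into SS(factor) = 86/3, SS(blocks) = 14, and SS(error) = 4/3, giving F1 = 43 and F2 = 21 up to rounding.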
Factor Hypothesis Test
Ho: µ1 = µ2 = … = µk
Ha: not all µ’s are equal
$F_1 = \frac{\mathrm{MS(factor)}}{\mathrm{MS(error)}}$
reject Ho if $F^* > F_{\alpha,\,k-1,\,(k-1)(b-1)}$, where F* is the computed value of F1
Block Hypothesis Test
Ho: µ1 = µ2 = … = µb
Ha: not all µ’s are equal
$F_2 = \frac{\mathrm{MS(blocks)}}{\mathrm{MS(error)}}$
reject Ho if $F^* > F_{\alpha,\,b-1,\,(k-1)(b-1)}$, where F* is the computed value of F2
Hardness Test Data Analysis Figure 11.10
Hardness Test Data Analysis Figure 11.11
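Confidence Interval for the Difference Between Two Means: Randomized Block
The (1 - α) • 100% confidence interval for µi - µj is
$(\bar{X}_i - \bar{X}_j) - t_{\alpha/2,\,df} \cdot s \cdot \sqrt{\frac{1}{b} + \frac{1}{b}}$ to $(\bar{X}_i - \bar{X}_j) + t_{\alpha/2,\,df} \cdot s \cdot \sqrt{\frac{1}{b} + \frac{1}{b}}$
where $s = \sqrt{\mathrm{MS(error)}}$ and df = (k - 1)(b - 1), the df for error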
Dental Claim Data Analysis Figure 11.12
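Multiple Comparisons Procedure: Randomized Block
Two population means are declared unequal if $|\bar{X}_i - \bar{X}_j| > D$, where
$D = Q_{\alpha,\,k,\,(k-1)(b-1)} \cdot \sqrt{\frac{\mathrm{MS(error)}}{b}}$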
Machine Choice Example Figure 11.13
Two-Way Factorial Design
Figure 11.14:
         Single   Married
Male     Low      High
Female   High     Low
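Two-Way Factorial Design
Figure 11.15: data layout with the levels of Factor A (1, 2, ..., a) as rows and the levels of Factor B (1, 2, ..., b) as columns; each cell holds the observations (x) for that combination of levels.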
Two-Way Factorial Design
Figure 11.16: the same layout with totals added. Each cell holds its observations (x, x) and its cell total (R11, R12, ..., R1b, R21, R22, ..., Ra1, Ra2, ..., Rab); the Factor A (row) totals are T1, T2, ..., Ta and the Factor B (column) totals are S1, S2, ..., Sb.
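Two-Way Factorial Design
Factor A: $\mathrm{SSA} = \frac{1}{br}\left[T_1^2 + T_2^2 + \dots + T_a^2\right] - \frac{T^2}{abr}$
Factor B: $\mathrm{SSB} = \frac{1}{ar}\left[S_1^2 + S_2^2 + \dots + S_b^2\right] - \frac{T^2}{abr}$
Interaction: $\mathrm{SSAB} = \frac{1}{r}\left[\sum R^2\right] - \mathrm{SSA} - \mathrm{SSB} - \frac{T^2}{abr}$
Total: $\mathrm{SS(total)} = \sum x^2 - \frac{T^2}{abr}$
where r is the number of replicates in each cell and T is the total of all observations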
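Two-Way Factorial Design
$\mathrm{SS(error)} = \mathrm{SS(total)} - \mathrm{SSA} - \mathrm{SSB} - \mathrm{SSAB}$
$\mathrm{MSA} = \frac{\mathrm{SSA}}{a - 1}$
$\mathrm{MSB} = \frac{\mathrm{SSB}}{b - 1}$
$\mathrm{MSAB} = \frac{\mathrm{SSAB}}{(a - 1)(b - 1)}$
$\mathrm{MS(error)} = \frac{\mathrm{SS(error)}}{ab(r - 1)}$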
Two-Way Factorial Design
Source        df               SS          MS          F
Factor A      a - 1            SSA         MSA         F1
Factor B      b - 1            SSB         MSB         F2
Interaction   (a - 1)(b - 1)   SSAB        MSAB        F3
Error         ab(r - 1)        SS(error)   MS(error)
Total         abr - 1          SS(total)
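The two-way factorial table can be sketched from the cell totals in the same way. The data are hypothetical (a = 2, b = 2, r = 2, chosen to show a strong interaction), and `two_way_anova` and its nested-list input format are assumptions made for this example.

```python
# Hypothetical data: cells[i][j] holds the r = 2 replicates for
# level i of Factor A and level j of Factor B.
cells = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[6.0, 8.0], [2.0, 4.0]],
]


def two_way_anova(data):
    """Return sums of squares and F statistics for a two-way factorial design."""
    a, b, r = len(data), len(data[0]), len(data[0][0])
    cell_totals = [[sum(cell) for cell in row] for row in data]            # R_ij
    a_totals = [sum(row) for row in cell_totals]                           # T_i
    b_totals = [sum(row[j] for row in cell_totals) for j in range(b)]      # S_j
    grand_total = sum(a_totals)                                            # T
    correction = grand_total ** 2 / (a * b * r)                            # T^2 / abr

    ss_a = sum(t * t for t in a_totals) / (b * r) - correction
    ss_b = sum(s * s for s in b_totals) / (a * r) - correction
    ss_ab = (sum(rt * rt for row in cell_totals for rt in row) / r
             - ss_a - ss_b - correction)
    ss_total = sum(x * x for row in data for cell in row for x in cell) - correction
    ss_error = ss_total - ss_a - ss_b - ss_ab

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "ss": (ss_a, ss_b, ss_ab, ss_error, ss_total),
        "f1": (ss_a / (a - 1)) / ms_error,
        "f2": (ss_b / (b - 1)) / ms_error,
        "f3": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }


factorial_table = two_way_anova(cells)
print(factorial_table["ss"])  # (2.0, 0.0, 32.0, 8.0, 42.0)
print(factorial_table["f1"], factorial_table["f2"], factorial_table["f3"])  # 1.0 0.0 16.0
```

In this made-up example neither main effect is large, but the interaction F is, which mirrors the Low/High pattern that reverses between rows in a table like Figure 11.14.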
Hypothesis Test - Factor A
Ho: Factor A is not significant (µM = µF)
Ha: Factor A is significant (µM ≠ µF)
$F_1 = \frac{\mathrm{MSA}}{\mathrm{MS(error)}}$
reject Ho,A if $F_1 > F_{\alpha,\,v_1,\,v_2}$, with $v_1 = a - 1$ and $v_2 = ab(r - 1)$
Hypothesis Test - Factor B
Ho: Factor B is not significant (µ1 = µ2 = µ3)
Ha: Factor B is significant (not all µ’s are equal)
$F_2 = \frac{\mathrm{MSB}}{\mathrm{MS(error)}}$
reject Ho,B if $F_2 > F_{\alpha,\,v_1,\,v_2}$, with $v_1 = b - 1$ and $v_2 = ab(r - 1)$
Hypothesis Test - Interaction
Ho: Interaction is not significant
Ha: Interaction is significant
$F_3 = \frac{\mathrm{MSAB}}{\mathrm{MS(error)}}$
reject Ho,AB if $F_3 > F_{\alpha,\,v_1,\,v_2}$, with $v_1 = (a - 1)(b - 1)$ and $v_2 = ab(r - 1)$
Multiple Comparisons Procedure: Two-Way Factorial Design
$D = Q_{\alpha,k,v} \cdot \sqrt{\frac{\mathrm{MS(error)}}{n_r}}$
v = df for error
$n_r$ = number of replicates in each sample
Interaction Effect
Figure 11.17, panel A: annual amount claimed on dental insurance (vertical axis, 100 to 300) plotted against employee classification (Category 1 to Category 4), with separate lines for Male and Female.
Interaction Effect
Figure 11.17, panel B: annual amount claimed on dental insurance (vertical axis, 100 to 300) plotted against employee classification (Category 1 to Category 4), with separate lines for Male and Female.
Gender Factor Analysis Figure 11.18
Gender Factor Analysis Figure 11.19