Multifactor Analysis of Variance


Presentation on theme: "Multifactor Analysis of Variance"— Presentation transcript:

1 Multifactor Analysis of Variance
11 Multifactor Analysis of Variance Copyright © Cengage Learning. All rights reserved.

2 11.3 Three-Factor ANOVA

3 Three-Factor ANOVA To indicate the nature of models and analyses when ANOVA experiments involve more than two factors, we will focus here on the case of three fixed factors: A, B, and C. The numbers of levels of these factors will be denoted by I, J, and K, respectively, and $L_{ijk}$ = the number of observations made with factor A at level i, factor B at level j, and factor C at level k. The analysis is quite complicated when the $L_{ijk}$'s are not all equal, so we further specialize to $L_{ijk} = L$.

4 Three-Factor ANOVA Then $X_{ijkl}$ and $x_{ijkl}$ denote the observed value, before and after the experiment is performed, of the lth replication (l = 1, 2, …, L) when the three factors are fixed at levels i, j, and k. To understand the parameters that will appear in the three-factor ANOVA model, first recall that in two-factor ANOVA with replications, $E(X_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$, where the restrictions $\sum_i \alpha_i = \sum_j \beta_j = 0$, $\sum_i \gamma_{ij} = 0$ for every j, and $\sum_j \gamma_{ij} = 0$ for every i were necessary to obtain a unique set of parameters.

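5 Three-Factor ANOVA If we use dot subscripts on the $\mu_{ij}$'s to denote averaging (rather than summation), then $\mu_{i\cdot} - \mu_{\cdot\cdot} = \frac{1}{J}\sum_j \mu_{ij} - \frac{1}{IJ}\sum_i\sum_j \mu_{ij} = \alpha_i$ is the effect of factor A at level i averaged over levels of factor B, whereas $\mu_{ij} - \mu_{\cdot j} = \mu_{ij} - \frac{1}{I}\sum_i \mu_{ij} = \alpha_i + \gamma_{ij}$ is the effect of factor A at level i specific to factor B at level j.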

6 Three-Factor ANOVA When the effect of A at level i depends on the level of B, there is interaction between the factors, and the $\gamma_{ij}$'s are not all zero. In particular, $\mu_{ij} - \mu_{\cdot j} - \mu_{i\cdot} + \mu_{\cdot\cdot} = \gamma_{ij}$ (11.11)
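As a small numerical illustration of the two-factor parameters just recalled, the sketch below recovers the main effects and interactions from a table of cell means by averaging and differencing. The 2 × 3 table of means is hypothetical and chosen only so the arithmetic is easy to follow; the variable names are likewise illustrative.

```python
# Hypothetical 2 x 3 table of cell means mu[i][j] (illustrative values only).
mu = [[10.0, 12.0, 17.0],
      [14.0, 16.0, 15.0]]
I, J = len(mu), len(mu[0])

grand = sum(sum(row) for row in mu) / (I * J)                        # mu_..
row_avg = [sum(mu[i]) / J for i in range(I)]                         # mu_i.
col_avg = [sum(mu[i][j] for i in range(I)) / I for j in range(J)]    # mu_.j

alpha = [row_avg[i] - grand for i in range(I)]   # effect of A averaged over B
beta = [col_avg[j] - grand for j in range(J)]    # effect of B averaged over A

# Interaction parameters: gamma_ij = mu_ij - mu_i. - mu_.j + mu_..
gamma = [[mu[i][j] - row_avg[i] - col_avg[j] + grand for j in range(J)]
         for i in range(I)]

# Effect of A at level i specific to B at level j: mu_ij - mu_.j
specific = [[mu[i][j] - col_avg[j] for j in range(J)] for i in range(I)]

print("alpha:", alpha)
print("beta: ", beta)
print("gamma:", gamma)
```

With these particular means the interactions are not all zero, so the effect of A differs across the levels of B; every row and column of `gamma` still sums to zero, as the restrictions require.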

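7 The Fixed Effects Model and Test Procedures
The fixed effects model for three-factor ANOVA with $L_{ijk} = L$ is $X_{ijkl} = \mu_{ijk} + \epsilon_{ijkl}$, where the $\epsilon_{ijkl}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$, and
$\mu_{ijk} = \mu + \alpha_i + \beta_j + \delta_k + \gamma^{AB}_{ij} + \gamma^{AC}_{ik} + \gamma^{BC}_{jk} + \gamma_{ijk}$ (11.13)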

8 The Fixed Effects Model and Test Procedures
The restrictions necessary to obtain uniquely defined parameters are that the sum over any subscript of any parameter on the right-hand side of (11.13) equal 0.

9 The Fixed Effects Model and Test Procedures
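The parameters $\gamma^{AB}_{ij}$, $\gamma^{AC}_{ik}$, and $\gamma^{BC}_{jk}$ are called two-factor interactions, and $\gamma_{ijk}$ is called a three-factor interaction; the $\alpha_i$'s, $\beta_j$'s, and $\delta_k$'s are the main effects parameters. For any fixed level k of the third factor, analogous to (11.11), $\mu_{ijk} - \mu_{i\cdot k} - \mu_{\cdot jk} + \mu_{\cdot\cdot k} = \gamma^{AB}_{ij} + \gamma_{ijk}$ is the interaction of the ith level of A with the jth level of B specific to the kth level of C, whereas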

10 The Fixed Effects Model and Test Procedures
$\mu_{ij\cdot} - \mu_{i\cdot\cdot} - \mu_{\cdot j\cdot} + \mu_{\cdot\cdot\cdot} = \gamma^{AB}_{ij}$ is the interaction between A at level i and B at level j averaged over levels of C. If the interaction of A at level i and B at level j does not depend on k, then all $\gamma_{ijk}$'s equal 0. Thus nonzero $\gamma_{ijk}$'s represent nonadditivity of the two-factor $\gamma^{AB}_{ij}$'s over the various levels of the third factor C. If the experiment included more than three factors, there would be corresponding higher-order interaction terms with analogous interpretations.
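The interpretation of the three-factor interaction can be checked numerically. The sketch below builds a hypothetical 2 × 2 × 2 table of means containing main effects and an AB interaction only, confirms that every three-factor interaction is then zero, and shows that disturbing a single cell makes the AB interaction depend on the level of C. The helper names (`avg`, `gamma_ab`, `gamma_ab_at_k`, `gamma_abc`) and all numbers are assumptions made for illustration.

```python
from itertools import product

def avg(mu, i=None, j=None, k=None):
    """Average mu[i][j][k] over every subscript left as None (a 'dot')."""
    I, J, K = len(mu), len(mu[0]), len(mu[0][0])
    cells = [mu[a][b][c]
             for a, b, c in product(range(I), range(J), range(K))
             if i in (None, a) and j in (None, b) and k in (None, c)]
    return sum(cells) / len(cells)

def gamma_ab(mu, i, j):
    # AB interaction averaged over the levels of C
    return avg(mu, i=i, j=j) - avg(mu, i=i) - avg(mu, j=j) + avg(mu)

def gamma_ab_at_k(mu, i, j, k):
    # AB interaction specific to level k of C
    return avg(mu, i, j, k) - avg(mu, i=i, k=k) - avg(mu, j=j, k=k) + avg(mu, k=k)

def gamma_abc(mu, i, j, k):
    # Three-factor interaction: the part of the k-specific AB interaction
    # that is not explained by the averaged AB interaction
    return gamma_ab_at_k(mu, i, j, k) - gamma_ab(mu, i, j)

# Hypothetical parameters: main effects plus an AB interaction, nothing else.
a, b, d = [-1.0, 1.0], [-2.0, 2.0], [-0.5, 0.5]
g_ab = [[1.0, -1.0], [-1.0, 1.0]]
mu = [[[10 + a[i] + b[j] + d[k] + g_ab[i][j] for k in range(2)]
       for j in range(2)] for i in range(2)]

idx = list(product(range(2), repeat=3))
print(max(abs(gamma_abc(mu, i, j, k)) for i, j, k in idx))   # essentially 0

# Disturb one cell so the AB interaction is no longer the same at each level of C.
mu2 = [[row[:] for row in plane] for plane in mu]
mu2[0][0][0] += 4.0
print(gamma_abc(mu2, 0, 0, 0))
```

The same check could be run with A or B held fixed instead of C; the three-factor terms come out the same.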

11 The Fixed Effects Model and Test Procedures
Note that in the previous argument, if we had considered fixing the level of either A or B (rather than C, as was done) and examining the $\gamma_{ijk}$'s, their interpretation would be the same; if any of the interactions of two factors depend on the level of the third factor, then there are nonzero $\gamma_{ijk}$'s. When L > 1, there is a sum of squares for each main effect, each two-factor interaction, and the three-factor interaction.

12 The Fixed Effects Model and Test Procedures
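To write these in a way that indicates how sums of squares are defined when there are more than three factors, note that any of the model parameters in (11.13) can be estimated unbiasedly by averaging $X_{ijkl}$ over appropriate subscripts and taking differences. Thus $\hat\mu = \bar X_{\cdot\cdot\cdot\cdot}$, $\hat\alpha_i = \bar X_{i\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot}$, $\hat\gamma^{AB}_{ij} = \bar X_{ij\cdot\cdot} - \bar X_{i\cdot\cdot\cdot} - \bar X_{\cdot j\cdot\cdot} + \bar X_{\cdot\cdot\cdot\cdot}$, and $\hat\gamma_{ijk} = \bar X_{ijk\cdot} - \bar X_{ij\cdot\cdot} - \bar X_{i\cdot k\cdot} - \bar X_{\cdot jk\cdot} + \bar X_{i\cdot\cdot\cdot} + \bar X_{\cdot j\cdot\cdot} + \bar X_{\cdot\cdot k\cdot} - \bar X_{\cdot\cdot\cdot\cdot}$, with other main effects and interaction estimators obtained by symmetry.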

13 The Fixed Effects Model and Test Procedures
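Definition
The relevant sums of squares and their degrees of freedom are
$SST = \sum_i\sum_j\sum_k\sum_l (X_{ijkl} - \bar X_{\cdot\cdot\cdot\cdot})^2$, df = IJKL − 1
$SSA = JKL \sum_i (\bar X_{i\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot})^2$, df = I − 1
$SSAB = KL \sum_i\sum_j (\bar X_{ij\cdot\cdot} - \bar X_{i\cdot\cdot\cdot} - \bar X_{\cdot j\cdot\cdot} + \bar X_{\cdot\cdot\cdot\cdot})^2$, df = (I − 1)(J − 1)
$SSABC = L \sum_i\sum_j\sum_k (\bar X_{ijk\cdot} - \bar X_{ij\cdot\cdot} - \bar X_{i\cdot k\cdot} - \bar X_{\cdot jk\cdot} + \bar X_{i\cdot\cdot\cdot} + \bar X_{\cdot j\cdot\cdot} + \bar X_{\cdot\cdot k\cdot} - \bar X_{\cdot\cdot\cdot\cdot})^2$, df = (I − 1)(J − 1)(K − 1)
$SSE = \sum_i\sum_j\sum_k\sum_l (X_{ijkl} - \bar X_{ijk\cdot})^2$, df = IJK(L − 1)
with the remaining main effect and two-factor interaction sums of squares obtained by symmetry. SST is the sum of the other eight sums of squares.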

14 The Fixed Effects Model and Test Procedures
Each sum of squares (excepting SST) when divided by its df gives a mean square. Expected mean squares are $E(MSE) = \sigma^2$ and $E(MSA) = \sigma^2 + \frac{JKL}{I-1}\sum_i \alpha_i^2$, with similar expressions for the other expected mean squares. Main effect and interaction hypotheses are tested by forming F ratios with MSE in each denominator.
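The following is a minimal sketch of how the sums of squares, degrees of freedom, and F ratios could be computed for a balanced three-factor layout, using the usual "average over subscripts and take differences" estimators. The function name `three_factor_anova`, the data layout (a dictionary keyed by `(i, j, k, l)`), and the simulated data are all assumptions made for illustration; P-values are omitted because they would require an F distribution routine from outside the standard library.

```python
import random
from itertools import product

def three_factor_anova(x, I, J, K, L):
    """Sums of squares, df, SST, and F ratios for balanced data x[(i, j, k, l)]."""
    def m(i=None, j=None, k=None):
        # Average over l and over every subscript left as None (a "dot").
        vals = [v for (a, b, c, _), v in x.items()
                if i in (None, a) and j in (None, b) and k in (None, c)]
        return sum(vals) / len(vals)

    g = m()
    ij = list(product(range(I), range(J)))
    ik = list(product(range(I), range(K)))
    jk = list(product(range(J), range(K)))
    ijk = list(product(range(I), range(J), range(K)))
    ss = {
        "A": J * K * L * sum((m(i=i) - g) ** 2 for i in range(I)),
        "B": I * K * L * sum((m(j=j) - g) ** 2 for j in range(J)),
        "C": I * J * L * sum((m(k=k) - g) ** 2 for k in range(K)),
        "AB": K * L * sum((m(i=i, j=j) - m(i=i) - m(j=j) + g) ** 2 for i, j in ij),
        "AC": J * L * sum((m(i=i, k=k) - m(i=i) - m(k=k) + g) ** 2 for i, k in ik),
        "BC": I * L * sum((m(j=j, k=k) - m(j=j) - m(k=k) + g) ** 2 for j, k in jk),
        "ABC": L * sum((m(i, j, k) - m(i=i, j=j) - m(i=i, k=k) - m(j=j, k=k)
                        + m(i=i) + m(j=j) + m(k=k) - g) ** 2 for i, j, k in ijk),
        "Error": sum((v - m(a, b, c)) ** 2 for (a, b, c, _), v in x.items()),
    }
    df = {"A": I - 1, "B": J - 1, "C": K - 1,
          "AB": (I - 1) * (J - 1), "AC": (I - 1) * (K - 1), "BC": (J - 1) * (K - 1),
          "ABC": (I - 1) * (J - 1) * (K - 1), "Error": I * J * K * (L - 1)}
    sst = sum((v - g) ** 2 for v in x.values())
    mse = ss["Error"] / df["Error"]
    # Every F ratio uses MSE in the denominator (fixed effects model).
    f = {name: (ss[name] / df[name]) / mse for name in ss if name != "Error"}
    return ss, df, sst, f

# Simulated (hypothetical) data with an A effect, a B effect, a C effect, and an AC interaction.
I, J, K, L = 2, 3, 2, 2
rng = random.Random(0)  # fixed seed so the illustration is repeatable
x = {(i, j, k, l): 50 + 2 * i - j + 3 * k + i * k + rng.gauss(0, 1)
     for i, j, k, l in product(range(I), range(J), range(K), range(L))}

ss, df, sst, f = three_factor_anova(x, I, J, K, L)
for name in f:
    print(f"{name:>3}: SS = {ss[name]:8.3f}, df = {df[name]}, F = {f[name]:7.2f}")
print(f"SST = {sst:.3f}, sum of components = {sum(ss.values()):.3f}")
```

The last line prints SST next to the sum of the eight component sums of squares, which agree for balanced data. Each F ratio would then be compared with the F distribution on its own numerator df and IJK(L − 1) denominator df.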

15 The Fixed Effects Model and Test Procedures
Usually the main effect hypotheses are tested only if all interactions are judged not significant. This analysis assumes that $L_{ijk} = L > 1$. If L = 1, then as in the two-factor case, the highest-order interactions must be assumed absent to obtain an MSE that estimates $\sigma^2$.

16 The Fixed Effects Model and Test Procedures
Setting L = 1 and disregarding the fourth subscript summation over l, the foregoing formulas for sums of squares are still valid, and error sum of squares is $SSE = \sum_i\sum_j\sum_k \hat\gamma_{ijk}^2$, with $\bar X_{ijk\cdot} = X_{ijk}$ in the expression for $\hat\gamma_{ijk}$.

17 Example 11.10 There has been increased interest in recent years in renewable fuels such as biodiesel, a form of diesel fuel derived from vegetable oils and animal fats. Advantages over petroleum diesel include nontoxicity, biodegradability, and lower greenhouse gas emissions. The article “Application of the Full Factorial Design to Optimization of Base-Catalyzed Sunflower Oil Ethanolysis” (Fuel, 2013: 433–442) reported on an investigation of three factors on the purity (%) of the biodiesel fuel fatty acid ethyl ester (FAEE).

18 Example 11.10 The factors and their levels were given in a table on this slide. The data appear in Table 11.8, where I = J = K = 3 and L = 2.

19 Example 11.10 The resulting ANOVA table is shown in Table 11.9. The P-value for testing $H_{0ABC}$ is .165, which is larger than any sensible significance level. This null hypothesis therefore cannot be rejected; it appears that the extent of interaction between any pair of factors is the same for each level of the remaining factor.

20 Example 11.10 Figure 11.8 shows two-factor interaction plots.

21 Figure 11.8 Cont.

22 Example 11.10 For example, the dots in the plot appearing in the C row and B column represent the $\bar x_{\cdot jk\cdot}$'s, that is, the observations averaged over the levels of the first factor for each combination of levels of the second and third factors. The bottom three dots connected by solid line segments represent the third level of factor C at each level of factor B. The fact that connected line segments are quite close to being parallel is evidence for the absence of BC interactions, and indeed the P-value in Table 11.9 for testing this null hypothesis is .360.

23 Example 11.10 However, the P-values for testing $H_{0AB}$ and $H_{0AC}$ are .020 and .000, respectively. So at significance level .05, we are forced to conclude that AB interactions and AC interactions are present. The line segments in the AC interaction plot are clearly not close to being parallel. It appears from the interaction plots that expected purity will be maximized when all factors are at their highest levels. As it happens, this is also the message from the main effects plots, but those cannot generally be trusted when interactions are present.

24 The Fixed Effects Model and Test Procedures
Diagnostic plots for checking the normality and constant variance assumptions can be constructed as described in previous sections. Tukey’s procedure can be used in three-factor (or more) ANOVA. The second subscript on Q is the number of sample means being compared, and the third is degrees of freedom for error.

25 The Fixed Effects Model and Test Procedures
Models with random and mixed effects are also sometimes appropriate. Sums of squares and degrees of freedom are identical to the fixed effects case, but expected mean squares are, of course, different for the random main effects or interactions. A good reference is the book by Douglas Montgomery listed in the chapter bibliography.

26 Latin Square Designs

27 Latin Square Designs When several factors are to be studied simultaneously, an experiment in which there is at least one observation for every possible combination of levels is referred to as a complete layout. If the factors are A, B, and C with I, J, and K levels, respectively, a complete layout requires at least IJK observations. Frequently an experiment of this size is either impracticable because of cost, time, or space constraints or literally impossible.

28 Latin Square Designs For example, if the response variable is sales of a certain product and the factors are different display configurations, different stores, and different time periods, then only one display configuration can realistically be used in a given store during a given time period. A three-factor experiment in which fewer than IJK observations are made is called an incomplete layout. There are some incomplete layouts in which the pattern of combinations of factors is such that the analysis is straightforward.

29 Latin Square Designs One such three-factor design is called a Latin square. It is appropriate when I = J = K (e.g., four display configurations, four stores, and four time periods) and all two- and three-factor interaction effects are assumed absent. If the levels of factor A are identified with the rows of a two-way table and the levels of B with the columns of the table, then the defining characteristic of a Latin square design is that every level of factor C appears exactly once in each row and exactly once in each column.

30 Latin Square Designs Figure 11.9 shows examples of 3 × 3, 4 × 4, and 5 × 5 Latin squares. There are 12 different 3 × 3 Latin squares, and the number of different Latin squares increases rapidly with the number of levels (e.g., every permutation of rows of a given Latin square yields a Latin square, and similarly for column permutations). Figure 11.9: Examples of Latin squares

31 Latin Square Designs It is recommended that the square used in an actual experiment be chosen at random from the set of all possible squares of the desired dimension; for further details, consult one of the chapter references. The letter N will denote the common value of I, J, and K. Then a complete layout with one observation per combination would require $N^3$ observations, whereas a Latin square requires only $N^2$ observations.
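For small N, both the count of Latin squares and the random-selection recommendation can be illustrated by brute force. The sketch below enumerates every N × N square whose rows and columns are all permutations of the levels 1, …, N, and then draws one uniformly at random. The function name `latin_squares` and the seed are arbitrary choices; full enumeration is practical only for very small N, so this is not a general-purpose generator.

```python
import random
from itertools import permutations, product

def latin_squares(n):
    """Yield every n x n Latin square as a tuple of rows (brute force; small n only)."""
    rows = list(permutations(range(1, n + 1)))
    for square in product(rows, repeat=n):
        # Rows are permutations by construction; require the same of every column.
        if all(len(set(col)) == n for col in zip(*square)):
            yield square

squares_3 = list(latin_squares(3))
print(len(squares_3))  # 12, matching the count quoted for 3 x 3 squares

rng = random.Random(2024)        # arbitrary seed, for a repeatable draw
chosen = rng.choice(squares_3)   # uniform over all 3 x 3 Latin squares
for row in chosen:
    print(row)                   # entry = level of C for that (row, column) cell

N = 3
print(N ** 3, "observations for a complete layout vs", N ** 2, "for the Latin square")
```

Running the same enumeration with `latin_squares(4)` yields 576 squares, which shows how quickly the number grows.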

32 Latin Square Designs Once a particular square has been chosen, the value of k (the level of factor C) is completely determined by the values of i and j. To emphasize this, we use $x_{ij(k)}$ to denote the observed value when the three factors are at levels i, j, and k, respectively, with k taking on only one value for each i, j pair.

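33 Latin Square Designs We employ the following notation for totals and averages: $X_{i\cdot\cdot} = \sum_j X_{ij(k)}$, $X_{\cdot j\cdot} = \sum_i X_{ij(k)}$, $X_{\cdot\cdot k} = \sum_{i,j} X_{ij(k)}$ (summing over the N cells in which C is at level k), and $X_{\cdot\cdot\cdot} = \sum_i\sum_j X_{ij(k)}$, with averages $\bar X_{i\cdot\cdot} = X_{i\cdot\cdot}/N$, $\bar X_{\cdot j\cdot} = X_{\cdot j\cdot}/N$, $\bar X_{\cdot\cdot k} = X_{\cdot\cdot k}/N$, and $\bar X_{\cdot\cdot\cdot} = X_{\cdot\cdot\cdot}/N^2$. Note that although $X_{i\cdot\cdot}$ previously suggested a double summation, now it corresponds to a single sum over all j (and the associated values of k).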

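34 Latin Square Designs Definition
The sums of squares and their degrees of freedom are
$SST = \sum_i\sum_j (X_{ij(k)} - \bar X_{\cdot\cdot\cdot})^2$, df = $N^2 - 1$
$SSA = N\sum_i (\bar X_{i\cdot\cdot} - \bar X_{\cdot\cdot\cdot})^2$, df = N − 1
$SSB = N\sum_j (\bar X_{\cdot j\cdot} - \bar X_{\cdot\cdot\cdot})^2$, df = N − 1
$SSC = N\sum_k (\bar X_{\cdot\cdot k} - \bar X_{\cdot\cdot\cdot})^2$, df = N − 1
$SSE = \sum_i\sum_j (X_{ij(k)} - \bar X_{i\cdot\cdot} - \bar X_{\cdot j\cdot} - \bar X_{\cdot\cdot k} + 2\bar X_{\cdot\cdot\cdot})^2$, df = (N − 1)(N − 2)
and SST = SSA + SSB + SSC + SSE.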

35 Latin Square Designs Each mean square is, of course, the ratio SS/df. For testing $H_{0C}: \delta_1 = \delta_2 = \cdots = \delta_N = 0$, the test statistic value is $f_C = MSC/MSE$, and the P-value is the area under the $F_{N-1,(N-1)(N-2)}$ curve to the right of $f_C$.

36 Latin Square Designs The other two main effect null hypotheses are tested analogously. If any of the null hypotheses is rejected, significant differences can be identified by using Tukey's procedure. After computing $w = Q_{\alpha,N,(N-1)(N-2)} \cdot \sqrt{MSE/N}$, pairs of sample means (the $\bar x_{i\cdot\cdot}$'s, $\bar x_{\cdot j\cdot}$'s, or $\bar x_{\cdot\cdot k}$'s) differing by more than w correspond to significant differences between associated factor effects (the $\alpha_i$'s, $\beta_j$'s, or $\delta_k$'s). The hypothesis $H_{0C}$ is frequently the one of central interest.
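A minimal sketch of the Latin square computations follows, assuming the standard decomposition SST = SSA + SSB + SSC + SSE with (N − 1)(N − 2) error degrees of freedom. The function name `latin_square_anova`, the cyclic 4 × 4 square, and the response values are all hypothetical; levels are coded 0, …, N − 1 for convenience.

```python
def latin_square_anova(x, c):
    """x[i][j]: response in row i, column j; c[i][j]: level of C (0..N-1) used there."""
    n = len(x)
    cells = [(i, j) for i in range(n) for j in range(n)]
    grand = sum(x[i][j] for i, j in cells) / n ** 2
    row = [sum(x[i]) / n for i in range(n)]                           # factor A means
    col = [sum(x[i][j] for i in range(n)) / n for j in range(n)]      # factor B means
    trt = [sum(x[i][j] for i, j in cells if c[i][j] == k) / n         # factor C means
           for k in range(n)]

    ss = {
        "A": n * sum((r - grand) ** 2 for r in row),
        "B": n * sum((v - grand) ** 2 for v in col),
        "C": n * sum((t - grand) ** 2 for t in trt),
        # Residual after removing row, column, and treatment effects.
        "Error": sum((x[i][j] - row[i] - col[j] - trt[c[i][j]] + 2 * grand) ** 2
                     for i, j in cells),
    }
    sst = sum((x[i][j] - grand) ** 2 for i, j in cells)
    df = {"A": n - 1, "B": n - 1, "C": n - 1, "Error": (n - 1) * (n - 2)}
    mse = ss["Error"] / df["Error"]
    f_c = (ss["C"] / df["C"]) / mse   # statistic for the hypothesis of no C effects
    return ss, df, sst, f_c

N = 4
c = [[(i + j) % N for j in range(N)] for i in range(N)]   # a cyclic Latin square
x = [[7.2, 6.1, 5.9, 6.8],                                 # hypothetical responses
     [6.5, 6.0, 7.4, 6.3],
     [5.8, 7.1, 6.6, 6.2],
     [7.0, 6.4, 6.1, 5.7]]

ss, df, sst, f_c = latin_square_anova(x, c)
print({name: round(value, 4) for name, value in ss.items()})
print("SST =", round(sst, 4), " f_C =", round(f_c, 3), "on", df["C"], "and", df["Error"], "df")
```

The statistic `f_c` would be referred to the F distribution with N − 1 and (N − 1)(N − 2) degrees of freedom; the tests for A and B follow by replacing SSC with SSA or SSB.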

37 Latin Square Designs A Latin square design is used to control for extraneous variation in the A and B factors, as was done by a randomized block design for the case of a single extraneous factor. Thus in the product sales example mentioned previously, variation due to both stores and time periods is controlled by a Latin square design, enabling an investigator to test for the presence of effects due to different product-display configurations.

38 Example 11.11 In an experiment to investigate the effect of relative humidity on abrasion resistance of leather cut from a rectangular pattern (“The Abrasion of Leather,” J. Inter. Soc. Leather Trades’ Chemists, 1946: 287), a 6 × 6 Latin square was used to control for possible variability due to row and column position in the pattern.

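39 Example 11.11 cont’d The six levels of relative humidity studied were 1 = 25%, 2 = 37%, 3 = 50%, 4 = 62%, 5 = 75%, and 6 = 87%. The level totals were $x_{\cdot\cdot 1}$ = 46.10, $x_{\cdot\cdot 2}$ = 40.59, $x_{\cdot\cdot 3}$ = 39.56, $x_{\cdot\cdot 4}$ = 35.86, $x_{\cdot\cdot 5}$ = 32.23, and $x_{\cdot\cdot 6}$ = 32.64, so the grand total, their sum, is $x_{\cdot\cdot\cdot}$ = 226.98.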

40 Example 11.11 cont’d
Further computations are summarized in Table 11.10 (ANOVA table for Example 11.11). Since $F_{.001,5,20} = 6.46 < 26.89$, P-value < .001. Thus $H_{0C}$ is rejected in favor of the hypothesis that relative humidity does on average affect abrasion resistance.

41 Example 11.11 cont’d
To apply Tukey’s procedure, compute w and then order the $\bar x_{\cdot\cdot k}$'s, underscoring those that differ by less than w. In particular, the lowest relative humidity appears to result in a true average abrasion resistance significantly higher than for any other relative humidity studied.
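The ordering step can be reproduced from the totals quoted in the example, assuming each $x_{\cdot\cdot k}$ is a total over N = 6 observations. The value of w is not available in this transcript, so the sketch leaves it as an argument of a hypothetical helper, `significant_pairs`, and instead reports the gap between the largest mean and the runner-up; the quoted conclusion about the lowest humidity is consistent with any w smaller than that gap.

```python
N = 6
humidity = {1: "25%", 2: "37%", 3: "50%", 4: "62%", 5: "75%", 6: "87%"}
totals = {1: 46.10, 2: 40.59, 3: 39.56, 4: 35.86, 5: 32.23, 6: 32.64}

# Sample means for each humidity level (assumes the quoted values are totals over N cells).
means = {k: t / N for k, t in totals.items()}

# Order the means from smallest to largest, as in the underscoring display.
ordered = sorted(means.items(), key=lambda kv: kv[1])
for k, mean in ordered:
    print(f"level {k} ({humidity[k]}): {mean:.2f}")

# Gap between the largest mean and the next largest.
gap = ordered[-1][1] - ordered[-2][1]
print(f"largest mean exceeds the runner-up by {gap:.3f}")

def significant_pairs(means, w):
    """Pairs of levels whose sample means differ by more than w (w supplied by the user)."""
    return [(a, b) for a in means for b in means
            if a < b and abs(means[a] - means[b]) > w]
```

Calling `significant_pairs(means, w)` with the value of w obtained from the Studentized range table and MSE would list every pair of humidity levels judged significantly different.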

