Multifactor Analysis of Variance

11 Multifactor Analysis of Variance
Copyright © Cengage Learning. All rights reserved.

11.2 Two-Factor ANOVA with $K_{ij} > 1$
In Section 11.1 we analyzed data from a two-factor experiment in which there was one observation for each of the $IJ$ combinations of factor levels. The $\mu_{ij}$'s were assumed to have an additive structure with $\mu_{ij} = \mu + \alpha_i + \beta_j$, $\sum_i \alpha_i = \sum_j \beta_j = 0$. Additivity means that the difference in true average responses for any two levels of one factor is the same at each level of the other factor.

For example, $\mu_{ij} - \mu_{i'j} = (\mu + \alpha_i + \beta_j) - (\mu + \alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$, independent of the level $j$ of the second factor. This is shown in Figure 11.1(a), in which the lines connecting true average responses are parallel.
[Figure 11.1(a): Mean responses for two types of model: (a) additive]

Figure 11.1(b) depicts a set of true average responses that does not have additive structure.
[Figure 11.1(b): Mean responses for two types of model: (b) nonadditive]

The lines connecting these $\mu_{ij}$'s are not parallel, which means that the difference in true average responses for different levels of one factor does depend on the level of the other factor. When additivity does not hold, we say that there is interaction between the different levels of the factors. The assumption of additivity in Section 11.1 allowed us to obtain an estimator of the random error variance $\sigma^2$ (MSE) that was unbiased whether or not either null hypothesis of interest was true.
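The parallel-lines criterion can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the two tables of true average responses are hypothetical values chosen for illustration, and the helper names `row_differences` and `is_additive` are assumptions.

```python
def row_differences(means, i, i_prime):
    """Differences mu[i][j] - mu[i_prime][j] for every level j of the second factor."""
    return [a - b for a, b in zip(means[i], means[i_prime])]


def is_additive(means, tol=1e-9):
    """True when every pair of rows differs by the same amount in every column."""
    for i in range(len(means)):
        for i_prime in range(i + 1, len(means)):
            diffs = row_differences(means, i, i_prime)
            if max(diffs) - min(diffs) > tol:
                return False
    return True


# Hypothetical true average responses (rows: levels of A, columns: levels of B).
additive_means = [[10.0, 12.0, 15.0],
                  [13.0, 15.0, 18.0]]
nonadditive_means = [[10.0, 12.0, 15.0],
                     [13.0, 12.0, 20.0]]

print(row_differences(additive_means, 1, 0))     # the same difference in every column
print(row_differences(nonadditive_means, 1, 0))  # the difference changes with j
print(is_additive(additive_means), is_additive(nonadditive_means))
```

In the first table the row difference does not depend on the column, which is the numerical counterpart of parallel lines; in the second table it does, which is interaction.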

When $K_{ij} > 1$ for at least one $(i, j)$ pair, a valid estimator of $\sigma^2$ can be obtained without assuming additivity. Our focus here will be on the case $K_{ij} = K > 1$, so the number of observations per “cell” (for each combination of levels) is constant.

Fixed Effects Parameters and Hypotheses

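Rather than use the $\mu_{ij}$'s themselves as model parameters, it is customary to use an equivalent set that reveals more clearly the role of interaction.

Notation

$\mu = \frac{1}{IJ} \sum_i \sum_j \mu_{ij}$,  $\bar{\mu}_{i\cdot} = \frac{1}{J} \sum_j \mu_{ij}$,  $\bar{\mu}_{\cdot j} = \frac{1}{I} \sum_i \mu_{ij}$

Thus $\mu$ is the expected response averaged over all levels of both factors (the true grand mean), $\bar{\mu}_{i\cdot}$ is the expected response averaged over levels of the second factor when the first factor A is held at level $i$, and similarly for $\bar{\mu}_{\cdot j}$.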

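Definition

$\alpha_i = \bar{\mu}_{i\cdot} - \mu$ = the effect of factor A at level $i$
$\beta_j = \bar{\mu}_{\cdot j} - \mu$ = the effect of factor B at level $j$
$\gamma_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j)$ = the interaction between factor A at level $i$ and factor B at level $j$

from which $\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$. (Here $\bar{\mu}_{i\cdot}$ and $\bar{\mu}_{\cdot j}$ are the averages of the $\mu_{ij}$'s over $j$ and over $i$, respectively.)

The model is additive if and only if all $\gamma_{ij}$'s $= 0$. The $\gamma_{ij}$'s are referred to as the interaction parameters. The $\alpha_i$'s are called the main effects for factor A, and the $\beta_j$'s are the main effects for factor B. Although there are $I$ $\alpha_i$'s, $J$ $\beta_j$'s, and $IJ$ $\gamma_{ij}$'s in addition to $\mu$, the conditions $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_j \gamma_{ij} = 0$ for any $i$, and $\sum_i \gamma_{ij} = 0$ for any $j$ [all by virtue of (11.7) and (11.8)] imply that only $IJ$ of these new parameters are independently determined: $\mu$, $I - 1$ of the $\alpha_i$'s, $J - 1$ of the $\beta_j$'s, and $(I - 1)(J - 1)$ of the $\gamma_{ij}$'s.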
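To make the reparameterization concrete, here is a minimal Python sketch that splits a table of cell means into a grand mean, main effects, and interaction parameters and then checks the sum-to-zero conditions. The table of means is hypothetical and the function name `decompose` is an assumption, not something taken from the text.

```python
def decompose(means):
    """Split cell means mu[i][j] into (mu, alpha, beta, gamma)."""
    I, J = len(means), len(means[0])
    mu = sum(sum(row) for row in means) / (I * J)              # grand mean
    row_avg = [sum(row) / J for row in means]                   # average over j
    col_avg = [sum(means[i][j] for i in range(I)) / I for j in range(J)]  # average over i
    alpha = [r - mu for r in row_avg]                           # main effects of A
    beta = [c - mu for c in col_avg]                            # main effects of B
    gamma = [[means[i][j] - (mu + alpha[i] + beta[j]) for j in range(J)]
             for i in range(I)]                                 # interaction parameters
    return mu, alpha, beta, gamma


# Hypothetical true average responses (rows: levels of A, columns: levels of B).
means = [[10.0, 12.0, 15.0],
         [13.0, 12.0, 20.0]]
mu, alpha, beta, gamma = decompose(means)

print("sum of alpha:", sum(alpha))
print("sum of beta:", sum(beta))
print("row sums of gamma:", [sum(row) for row in gamma])
print("column sums of gamma:", [sum(gamma[i][j] for i in range(2)) for j in range(3)])
```

Up to floating-point rounding, all four printed sums are zero, which is why only $IJ$ of the $1 + I + J + IJ$ parameters are free.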

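There are now three sets of hypotheses to be considered:

$H_{0AB}$: $\gamma_{ij} = 0$ for all $i, j$ versus $H_{aAB}$: at least one $\gamma_{ij} \neq 0$
$H_{0A}$: $\alpha_1 = \cdots = \alpha_I = 0$ versus $H_{aA}$: at least one $\alpha_i \neq 0$
$H_{0B}$: $\beta_1 = \cdots = \beta_J = 0$ versus $H_{aB}$: at least one $\beta_j \neq 0$

The no-interaction hypothesis $H_{0AB}$ is usually tested first. If $H_{0AB}$ is not rejected, then the other two hypotheses can be tested to see whether the main effects are significant.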

If $H_{0AB}$ is rejected and $H_{0A}$ is then tested and not rejected, the resulting model $\mu_{ij} = \mu + \beta_j + \gamma_{ij}$ does not lend itself to straightforward interpretation. In such a case, it is best to construct a picture similar to that of Figure 11.1(b) to try to visualize the way in which the factors interact.

The Model and Test Procedures

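We now use triple subscripts for both random variables and observed values, with $X_{ijk}$ and $x_{ijk}$ referring to the $k$th observation (replication) when factor A is at level $i$ and factor B is at level $j$. The fixed effects model is $X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$ for $i = 1, \ldots, I$, $j = 1, \ldots, J$, $k = 1, \ldots, K$, where the $\varepsilon_{ijk}$'s are independent and normally distributed, each with mean 0 and variance $\sigma^2$.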

Again, a dot in place of a subscript denotes summation over all values of that subscript, and a horizontal bar indicates averaging. Thus $X_{ij\cdot}$ is the total of all $K$ observations made for factor A at level $i$ and factor B at level $j$ [all observations in the $(i, j)$th cell], and $\bar{X}_{ij\cdot}$ is the average of these $K$ observations. Test procedures are based on the following sums of squares:

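Definition

$SST = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2$,  df $= IJK - 1$
$SSE = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{ij\cdot})^2$,  df $= IJ(K - 1)$
$SSA = \sum_i \sum_j \sum_k (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2$,  df $= I - 1$
$SSB = \sum_i \sum_j \sum_k (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2$,  df $= J - 1$
$SSAB = \sum_i \sum_j \sum_k (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2$,  df $= (I - 1)(J - 1)$

The fundamental identity is $SST = SSA + SSB + SSAB + SSE$.

Total variation is thus partitioned into four pieces: unexplained (SSE, which would be present whether or not any of the three null hypotheses was true) and three pieces that may be attributed to the truth or falsity of the three $H_0$'s. Each of the four mean squares is defined by MS = SS/df. The expected mean squares suggest that each set of hypotheses should be tested using the appropriate ratio of mean squares with MSE in the denominator:

$E(MSE) = \sigma^2$
$E(MSA) = \sigma^2 + \frac{JK}{I - 1} \sum_i \alpha_i^2$
$E(MSB) = \sigma^2 + \frac{IK}{J - 1} \sum_j \beta_j^2$
$E(MSAB) = \sigma^2 + \frac{K}{(I - 1)(J - 1)} \sum_i \sum_j \gamma_{ij}^2$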

Each of the three mean square ratios, $f_A = MSA/MSE$, $f_B = MSB/MSE$, and $f_{AB} = MSAB/MSE$, can be shown to have an F distribution when the associated $H_0$ is true. If $H_{0AB}$ is false, the expected value of the numerator mean square in $F_{AB}$ exceeds that of the denominator mean square.

The larger the value of this F ratio, the stronger is the evidence against the null hypothesis, again implying an upper-tailed test. Analogous comments apply to the tests for main effects.
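The computations behind these tests can be sketched in a few lines of Python for the balanced case $K_{ij} = K$. This is an illustrative sketch using only the standard library: the data are hypothetical, the function name `two_factor_anova` is an assumption, and P-values are left out because they would need an F-distribution routine from a statistics package.

```python
def two_factor_anova(x):
    """x[i][j] is the list of K observations in cell (i, j); K must be constant."""
    I, J, K = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[i][j]) / K for j in range(J)] for i in range(I)]   # cell means
    row = [sum(cell[i]) / J for i in range(I)]                        # factor A level means
    col = [sum(cell[i][j] for i in range(I)) / I for j in range(J)]   # factor B level means
    grand = sum(row) / I                                              # grand mean

    ssa = J * K * sum((r - grand) ** 2 for r in row)
    ssb = I * K * sum((c - grand) ** 2 for c in col)
    ssab = K * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                   for i in range(I) for j in range(J))
    sse = sum((obs - cell[i][j]) ** 2
              for i in range(I) for j in range(J) for obs in x[i][j])
    sst = sum((obs - grand) ** 2
              for i in range(I) for j in range(J) for obs in x[i][j])

    df = {"A": I - 1, "B": J - 1, "AB": (I - 1) * (J - 1), "Error": I * J * (K - 1)}
    ss = {"A": ssa, "B": ssb, "AB": ssab, "Error": sse}
    ms = {name: ss[name] / df[name] for name in ss}
    # Fixed effects model: every F ratio uses MSE in the denominator.
    f = {name: ms[name] / ms["Error"] for name in ("A", "B", "AB")}
    return {"ss": ss, "df": df, "ms": ms, "f": f, "sst": sst}


# Hypothetical balanced data with I = 2, J = 2, K = 2.
data = [[[10.0, 12.0], [14.0, 16.0]],
        [[20.0, 22.0], [18.0, 20.0]]]
result = two_factor_anova(data)
for name in ("A", "B", "AB", "Error"):
    print(name, result["df"][name], result["ss"][name], result["ms"][name])
print("F ratios:", result["f"])
print("SST:", result["sst"])
```

The four sums of squares add up to SST, as the fundamental identity requires. Each F ratio would then be compared with the upper-tail critical value for its numerator df and the error df.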

Example 11.7 Lightweight aggregate asphalt mix has been found to have lower thermal conductivity than a conventional mix, which is desirable. The article “Influence of Selected Mix Design Factors on the Thermal Behavior of Lightweight Aggregate Asphalt Mixes” (J. of Testing and Eval., 2008: 1–8) reported on an experiment in which various thermal properties of mixes were determined.

Three different binder grades were used in combination with three different coarse aggregate contents (%), with two observations made for each such combination, resulting in the conductivity data (W/m·°K) that appear in Table 11.6.
[Table 11.6: Conductivity data for Example 11.7]

Here $I = J = 3$ and $K = 2$ for a total of $IJK = 18$ observations. The results of the analysis are summarized in the ANOVA table which appears as Table 11.7 (a table with additional information appeared in the cited paper).
[Table 11.7: ANOVA table for Example 11.7]
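As a check on the bookkeeping, the degrees of freedom for this layout follow directly from the general formulas. The lines below are arithmetic on $I = J = 3$ and $K = 2$ only; they are not a reproduction of the entries of Table 11.7.

$$
\begin{aligned}
\text{df}_A &= I - 1 = 2, \\
\text{df}_B &= J - 1 = 2, \\
\text{df}_{AB} &= (I - 1)(J - 1) = 4, \\
\text{df}_{\text{Error}} &= IJ(K - 1) = 9, \\
\text{df}_{\text{Total}} &= 2 + 2 + 4 + 9 = 17 = IJK - 1.
\end{aligned}
$$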

The P-value for testing for the presence of interaction effects is .414, which is clearly larger than any reasonable significance level, so the interaction null hypothesis cannot be rejected. Thus it appears that there is no interaction between the two factors. However, both main effects are significant at the 5% significance level (.002 ≤ .05 and .000 ≤ .05). So it appears that true average conductivity depends on which grade is used and also on the level of coarse-aggregate content.

Figure 11.5(a) shows an interaction plot for the conductivity data.
[Figure 11.5(a): Interaction plot for the asphalt data of Example 11.7; response variable is conductivity]

Notice the nearly parallel sets of line segments for the three different asphalt grades, in agreement with the F test that shows no significant interaction effects. True average conductivity appears to decrease as aggregate content decreases.

Figure 11.5(b) shows an interaction plot for the response variable thermal diffusivity, values of which appear in the cited article.
[Figure 11.5(b): Interaction plot for the asphalt data of Example 11.7; response variable is diffusivity]

The bottom two sets of line segments are close to parallel, but differ markedly from those for PG64; in fact, the F ratio for interaction effects is highly significant here.

Plausibility of the normality and constant variance assumptions can be assessed by constructing plots similar to those used earlier. Define the predicted (i.e., fitted) values to be the cell means: $\hat{x}_{ijk} = \bar{x}_{ij\cdot}$. For example, the predicted value for grade PG58 and aggregate content 38 is the average of the two observations in that cell, which is .840 for $k = 1, 2$. The residuals are the differences between the observations and corresponding predicted values: $e_{ijk} = x_{ijk} - \bar{x}_{ij\cdot}$.
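A short Python sketch of the fitted values and residuals just defined follows. The observations are hypothetical (they are not the Table 11.6 data), and the function name `fitted_and_residuals` is an assumption made for illustration.

```python
def fitted_and_residuals(x):
    """Return (fitted, residuals), each with the same nested shape as x[i][j][k]."""
    # Every observation in a cell is predicted by that cell's mean.
    fitted = [[[sum(cell) / len(cell)] * len(cell) for cell in row] for row in x]
    # Residual = observation minus its cell mean.
    residuals = [[[obs - fit for obs, fit in zip(cell, fit_cell)]
                  for cell, fit_cell in zip(row, fit_row)]
                 for row, fit_row in zip(x, fitted)]
    return fitted, residuals


# Hypothetical data with I = 2, J = 2, K = 2.
x = [[[4.0, 6.0], [7.0, 7.5]],
     [[5.0, 5.5], [9.0, 8.0]]]
fitted, residuals = fitted_and_residuals(x)

print(fitted[0][0])      # cell mean repeated K times
print(residuals[0][0])   # deviations from the cell mean
```

The residuals in each cell sum to zero, and with $K = 2$ the two residuals in a cell are equal in magnitude and opposite in sign. The flattened residuals are what would go into a normal probability plot, and the (fitted, residual) pairs into a plot of residuals against predicted values.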

A normal probability plot of the residuals is shown in Figure 11.6(a). The pattern is sufficiently linear that there should be no concern about lack of normality.
[Figure 11.6(a): Plot for checking the normality assumption in Example 11.7]

The plot of residuals against predicted values in Figure 11.6(b) shows a bit less spread on the right than on the left, but not enough of a differential to be worrisome; constant variance seems to be a reasonable assumption.
[Figure 11.6(b): Plot for checking the constant variance assumption in Example 11.7]

Multiple Comparisons

When the no-interaction hypothesis $H_{0AB}$ is not rejected and at least one of the two main effect null hypotheses is rejected, Tukey's method can be used to identify significant differences in levels. For identifying differences among the $\alpha_i$'s when $H_{0A}$ is rejected,

1. Obtain $Q_{\alpha, I, IJ(K - 1)}$, where the second subscript $I$ identifies the number of levels being compared and the third subscript refers to the number of degrees of freedom for error.
2. Compute $w = Q \sqrt{MSE/(JK)}$, where $JK$ is the number of observations averaged to obtain each of the $\bar{x}_{i\cdot\cdot}$'s compared in Step 3.
3. Order the $\bar{x}_{i\cdot\cdot}$'s from smallest to largest and, as before, underscore all pairs that differ by less than $w$. Pairs not underscored correspond to significantly different levels of factor A.

To identify different levels of factor B when $H_{0B}$ is rejected, replace the second subscript in $Q$ by $J$, replace $JK$ by $IK$ in $w$, and replace $\bar{x}_{i\cdot\cdot}$ by $\bar{x}_{\cdot j\cdot}$.
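The three steps can be sketched as follows. This is a minimal Python illustration under stated assumptions: the critical value $Q$ is read from a studentized range table and passed in by hand, the MSE and the level means are hypothetical, and the helper names `tukey_yardstick` and `significant_pairs` are not from the text.

```python
import math
from itertools import combinations


def tukey_yardstick(q, mse, n_per_mean):
    """w = Q * sqrt(MSE / n), where n observations are averaged into each mean."""
    return q * math.sqrt(mse / n_per_mean)


def significant_pairs(means, w):
    """means maps level name -> sample mean; return pairs that differ by at least w."""
    ordered = sorted(means.items(), key=lambda item: item[1])  # smallest to largest
    return [(a, b) for (a, mean_a), (b, mean_b) in combinations(ordered, 2)
            if mean_b - mean_a >= w]


# Hypothetical inputs: Q from a table for 3 levels and 9 error df,
# a made-up MSE, and JK = 6 observations per factor A level mean.
q = 3.95
mse = 0.0006
w = tukey_yardstick(q, mse, n_per_mean=6)

level_means = {"low": 0.50, "mid": 0.52, "high": 0.58}
print("w =", w)
print("significantly different pairs:", significant_pairs(level_means, w))
```

Pairs that are not returned are the ones that would be underscored. For factor B the same functions apply with $IK$ observations per mean.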

Example 11.8 (Example 11.7 continued)
$I = J = 3$ for both factor A (grade) and factor B (aggregate content). With $\alpha = .05$ and error df $= IJ(K - 1) = 9$, $Q_{.05,3,9} = 3.95$. The yardstick for identifying significant differences is then $w = 3.95 \sqrt{MSE/6}$, since $JK = 6$. The grade sample means in increasing order are .8033 (PG70), .8180 (PG58), and the PG64 mean, which is the largest.

Only the difference between the two largest means is smaller than $w$. This gives the underscoring pattern

PG70    PG58    PG64
        ____________

Grades PG58 and PG64 do not appear to differ significantly from one another in effect on true average conductivity, but both differ from the PG70 grade.

The two smallest of the ordered means for factor B are .7883 and .8227. Because $IK = JK = 6$, the same yardstick $w$ applies, and all three pairs of means differ by more than $w$, so there are no underscoring lines. True average conductivity appears to be different for all three levels of aggregate content.

Models with Mixed and Random Effects

In some problems, the levels of either factor may have been chosen from a large population of possible levels, so that the effects contributed by the factor are random rather than fixed. As in Section 11.1, if both factors contribute random effects, the model is referred to as a random effects model, whereas if one factor is fixed and the other is random, a mixed effects model results.

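We will now consider the analysis for a mixed effects model in which factor A (rows) is the fixed factor and factor B (columns) is the random factor. The case in which both factors are random is dealt with in Exercise 26.

Definition

The mixed effects model when factor A is fixed and factor B is random is
$X_{ijk} = \mu + \alpha_i + B_j + G_{ij} + \varepsilon_{ijk}$,  $i = 1, \ldots, I$,  $j = 1, \ldots, J$,  $k = 1, \ldots, K$

Here $\mu$ and the $\alpha_i$'s are constants with $\sum_i \alpha_i = 0$, and the $B_j$'s, $G_{ij}$'s, and $\varepsilon_{ijk}$'s are independent, normally distributed random variables with expected value 0 and variances $\sigma_B^2$, $\sigma_G^2$, and $\sigma^2$, respectively. The relevant hypotheses here are somewhat different from those for the fixed effects model:

$H_{0A}$: $\alpha_1 = \cdots = \alpha_I = 0$ versus $H_{aA}$: at least one $\alpha_i \neq 0$
$H_{0B}$: $\sigma_B^2 = 0$ versus $H_{aB}$: $\sigma_B^2 > 0$
$H_{0G}$: $\sigma_G^2 = 0$ versus $H_{aG}$: $\sigma_G^2 > 0$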

It is customary to test $H_{0A}$ and $H_{0B}$ only if the no-interaction hypothesis $H_{0G}$ cannot be rejected. Sums of squares and mean squares needed for the test procedures are defined and computed exactly as in the fixed effects case. The expected mean squares for the mixed model are

$E(MSE) = \sigma^2$
$E(MSA) = \sigma^2 + K\sigma_G^2 + \frac{JK}{I - 1} \sum_i \alpha_i^2$
$E(MSB) = \sigma^2 + K\sigma_G^2 + IK\sigma_B^2$
$E(MSAB) = \sigma^2 + K\sigma_G^2$

The ratio $f_{AB} = MSAB/MSE$ is again appropriate for testing the no-interaction hypothesis, with $H_{0G}$ rejected if $f_{AB} \geq F_{\alpha, (I - 1)(J - 1), IJ(K - 1)}$. However, for testing $H_{0A}$ versus $H_{aA}$, the expected mean squares suggest that although the numerator of the F ratio should still be MSA, the denominator should be MSAB rather than MSE. MSAB is also the denominator of the F ratio for testing $H_{0B}$.
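The change of denominator is the only computational difference from the fixed effects analysis. A minimal Python sketch is given below; the mean squares passed in are hypothetical numbers, and the function names `mixed_model_f_ratios` and `mixed_model_df` are assumptions made for illustration.

```python
def mixed_model_f_ratios(msa, msb, msab, mse):
    """F ratios when factor A is fixed and factor B is random."""
    return {
        "A": msa / msab,   # main effect of A is tested against MSAB
        "B": msb / msab,   # main effect of B is also tested against MSAB
        "G": msab / mse,   # interaction is still tested against MSE
    }


def mixed_model_df(I, J, K):
    """(numerator df, denominator df) for each of the three F ratios."""
    df_ab = (I - 1) * (J - 1)
    return {
        "A": (I - 1, df_ab),
        "B": (J - 1, df_ab),
        "G": (df_ab, I * J * (K - 1)),
    }


# Hypothetical mean squares for a layout with I = 3, J = 5, K = 2.
ratios = mixed_model_f_ratios(msa=0.40, msb=9.00, msab=1.50, mse=0.30)
dfs = mixed_model_df(3, 5, 2)
for name in ("G", "A", "B"):
    print(name, "F =", ratios[name], "df =", dfs[name])
```

Note that the denominator df for the two main-effect tests is $(I - 1)(J - 1)$ rather than $IJ(K - 1)$, because MSAB has replaced MSE.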

Example 11.9 A process engineer has identified two potential causes of electric motor vibration, the material used for the motor casing (factor A) and the supply source of bearings used in the motor (factor B). The accompanying data on the amount of vibration (microns) resulted from an experiment in which motors with casings made of steel, aluminum, and plastic were constructed using bearings supplied by five randomly selected sources.

Only the three casing materials used in the experiment are under consideration for use in production, so factor A is fixed.

However, the five supply sources were randomly selected from a much larger population, so factor B is random. The relevant null hypotheses are

$H_{0A}$: $\alpha_1 = \alpha_2 = \alpha_3 = 0$
$H_{0B}$: $\sigma_B^2 = 0$
$H_{0G}$: $\sigma_G^2 = 0$

Minitab output appears in Figure 11.7.
[Figure 11.7: Output from Minitab's balanced ANOVA option for the data of Example 11.9]

The P-value column in the ANOVA table indicates that the latter two null hypotheses should be rejected at significance level .05. Different casing materials by themselves do not appear to affect vibration, but interaction between material and supplier is a significant source of variation in vibration.

When at least two of the Kij’s are unequal, the ANOVA computations are much more complex than for the case Kij = K. In addition, there is controversy as to which test procedures should be used. One of the chapter references can be consulted for more information.
