Experimental Statistics - week 8 Chapter 17: Models with Random Effects

Models with Random Effects

Fixed-Effects Models
-- the models we've studied to this point
-- factor levels have been specifically selected
   - the investigator is interested in testing the effects of these specific levels on the response variable
Examples:
-- CAR data - interested in the performance of these 5 gasolines
-- Pilot Plant data - interested in the specific temperatures (160° and 180°) and catalysts (C1 and C2)

Random-Effect Factor
-- the factor has a large number of possible levels
-- the levels used in the analysis are a random sample from the population of all possible levels
   - the investigator wants to draw conclusions about the population from which these levels were chosen (not about the specific levels themselves)

Fixed Effects vs Random Effects
This determination affects
 - the model
 - the hypotheses tested
 - the conclusions drawn
 - the F-tests involved (sometimes)
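
1-Factor Random Effects Model
  $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$,  $i = 1, \dots, t$;  $j = 1, \dots, n$
Assumptions:
 - the $\alpha_i$ are independent $N(0, \sigma_\alpha^2)$
 - the $\epsilon_{ij}$ are independent $N(0, \sigma_\epsilon^2)$
 - the $\alpha_i$ and the $\epsilon_{ij}$ are independent of each other

Hypotheses:
  $H_0: \sigma_\alpha^2 = 0$
  $H_a: \sigma_\alpha^2 > 0$
$H_0$ says (considering the variability of the $y_{ij}$'s):
 - the component of the variance due to "Factor" is zero
   -- i.e. no factor "level-to-level" variation
 - all of the variability observed is just unexplained subject-to-subject variation
   -- at least none is explained by variation due to the factor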

    DATA one;
      INPUT operator output;
      DATALINES;
    1 175.4
    1 171.7
    1 173.0
    1 170.5
    2 168.5
    2 162.7
    2 165.0
    2 164.1
    3 170.1
    3 173.4
    3 175.7
    3 170.7
    4 175.2
    4 175.7
    4 180.1
    4 183.7
    ;
    PROC GLM;
      CLASS operator;
      MODEL output=operator;
      RANDOM operator;
      TITLE 'Operator Data: One Factor Random Effects Model';
    RUN;

These are data from an experiment studying the effect of four operators (chosen randomly) on the output of a particular machine.
t = 4 (operators)   n = 4 (observations per operator)

One Factor Random Effects Model
The GLM Procedure
Dependent Variable: output

                               Sum of
Source             DF         Squares     Mean Square    F Value    Pr > F
Model               3     371.8718750     123.9572917      14.91    0.0002
Error              12      99.7925000       8.3160417
Corrected Total    15     471.6643750

R-Square    Coeff Var    Root MSE    output Mean
0.788425     1.674472    2.883755       172.2188

Source             DF       Type I SS     Mean Square    F Value    Pr > F
operator            3     371.8718750     123.9572917      14.91    0.0002

Source             Type III Expected Mean Square
operator           Var(Error) + 4 Var(operator)

Conclusion: We reject $H_0: \sigma_\alpha^2 = 0$ (p = .0002) and conclude that there is variability due to operator.
Note: Multiple comparisons are not used in random effects analyses
-- we are interested in whether there is variability due to operator
   - we are not interested in which operators performed better, etc. (they were randomly chosen)
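
As a cross-check on the PROC GLM listing, the one-factor sums of squares can be recomputed by hand. The following is a minimal Python sketch (Python is not part of the course material; the variable names are illustrative) that rebuilds the between-operator and within-operator mean squares and the F ratio from the 16 observations in the DATALINES block.

```python
# Operator data transcribed from the DATALINES block (4 operators, 4 runs each).
data = {
    1: [175.4, 171.7, 173.0, 170.5],
    2: [168.5, 162.7, 165.0, 164.1],
    3: [170.1, 173.4, 175.7, 170.7],
    4: [175.2, 175.7, 180.1, 183.7],
}

t = len(data)        # number of sampled operators
n = len(data[1])     # observations per operator (balanced design)

all_obs = [y for ys in data.values() for y in ys]
grand_mean = sum(all_obs) / len(all_obs)

# Between-operator SS: n times the squared deviations of the operator means.
level_means = {k: sum(ys) / n for k, ys in data.items()}
ss_operator = n * sum((m - grand_mean) ** 2 for m in level_means.values())

# Within-operator (error) SS: deviations of each run from its operator mean.
ss_error = sum((y - level_means[k]) ** 2 for k, ys in data.items() for y in ys)

ms_operator = ss_operator / (t - 1)       # MST, df = t - 1
ms_error = ss_error / (t * (n - 1))       # MSE, df = t(n - 1)
f_stat = ms_operator / ms_error

print(f"SS(operator) = {ss_operator:.4f}, MS(operator) = {ms_operator:.4f}")
print(f"SS(error)    = {ss_error:.4f}, MS(error)    = {ms_error:.4f}")
print(f"F = {f_stat:.2f}")
```

The computed quantities should agree with the Model and Error rows of the SAS output; the p-value is not recomputed here because it needs an F distribution routine outside the standard library.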
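
RECALL: 1-Factor (Fixed-Effects) ANOVA Table (page 389)
Rationale for F-test and critical region: $F = \mathrm{MST}/\mathrm{MSE}$
 - MSE estimates $\sigma_\epsilon^2$
 - MST estimates $\sigma_\epsilon^2$ + constant × (a measure of the factor effects)
 - if no factor effects, we expect F ≈ 1;
 - if factor effects, we expect F > 1

Expected Mean Squares for 1-Factor ANOVA's (p. 979)

Source        SS     df          MS     EMS (Fixed Effects)                                   EMS (Random Effects)
Treatments    SST    t - 1       MST    $\sigma_\epsilon^2 + n\sum_i \alpha_i^2/(t-1)$        $\sigma_\epsilon^2 + n\sigma_\alpha^2$
Error         SSE    t(n - 1)    MSE    $\sigma_\epsilon^2$                                   $\sigma_\epsilon^2$
Total         TSS    tn - 1

Rationale for Test Statistic and Critical Region is the Same: Fixed or Random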
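
Estimating Variance Components
Since $E(\mathrm{MST}) = \sigma_\epsilon^2 + n\sigma_\alpha^2$ and $E(\mathrm{MSE}) = \sigma_\epsilon^2$, solving for $\sigma_\alpha^2$ we get:
  $\sigma_\alpha^2 = [E(\mathrm{MST}) - E(\mathrm{MSE})]/n$
so, we estimate $\sigma_\alpha^2$ by
  $\hat\sigma_\alpha^2 = (\mathrm{MST} - \mathrm{MSE})/n$
Also,
  $\hat\sigma_\epsilon^2 = \mathrm{MSE}$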
For OPERATOR Data,
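
The operator estimates follow directly from the mean squares printed by PROC GLM (MST = 123.9572917, MSE = 8.3160417, n = 4). A minimal Python sketch of the arithmetic, with illustrative variable names; the last two quantities (total variance and the operator share of it) are an extra summary added here, not something taken from the slides.

```python
# Mean squares copied from the PROC GLM output for the operator data.
ms_operator = 123.9572917   # MST
ms_error = 8.3160417        # MSE
n = 4                       # observations per operator

# Method-of-moments estimates from the expected mean squares:
#   E(MSE) = var_error,  E(MST) = var_error + n * var_operator
var_error = ms_error
var_operator = (ms_operator - ms_error) / n

# Extra summary (assumption: not on the slide): estimated total variance of a
# single observation and the fraction attributable to operators.
var_total = var_operator + var_error
operator_share = var_operator / var_total

print(f"estimated var(error)    = {var_error:.4f}")
print(f"estimated var(operator) = {var_operator:.4f}")
print(f"operator share of total = {operator_share:.3f}")
```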
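
RECALL: 2-Factor Fixed-Effects Model
  $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$,  $i = 1, \dots, a$;  $j = 1, \dots, b$;  $k = 1, \dots, n$
where the $\epsilon_{ijk}$ are independent $N(0, \sigma_\epsilon^2)$

Expected Mean Squares for 2-Factor ANOVA with Fixed Effects:

Source    Expected MS                                                                   F-test
A         $\sigma_\epsilon^2 + bn\sum_i \alpha_i^2/(a-1)$                               MSA/MSE
B         $\sigma_\epsilon^2 + an\sum_j \beta_j^2/(b-1)$                                MSB/MSE
AB        $\sigma_\epsilon^2 + n\sum_i\sum_j (\alpha\beta)_{ij}^2/[(a-1)(b-1)]$         MSAB/MSE
Error     $\sigma_\epsilon^2$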
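
2-Factor Random Effects Model
  $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$
Assumptions:
 - the $\alpha_i$ are independent $N(0, \sigma_\alpha^2)$
 - the $\beta_j$ are independent $N(0, \sigma_\beta^2)$
 - the $(\alpha\beta)_{ij}$ are independent $N(0, \sigma_{\alpha\beta}^2)$
 - the $\epsilon_{ijk}$ are independent $N(0, \sigma_\epsilon^2)$
 - all of these random components are independent of one another
Sum-of-Squares obtained as in Fixed-Effects case

Expected Mean Squares for 2-Factor ANOVA with Random Effects:

Source    Expected MS
A         $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$
B         $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$
AB        $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2$
Error     $\sigma_\epsilon^2$

To Test:
  $H_0: \sigma_\alpha^2 = 0$          we use F = MSA/MSAB
  $H_0: \sigma_\beta^2 = 0$           we use F = MSB/MSAB
  $H_0: \sigma_{\alpha\beta}^2 = 0$   we use F = MSAB/MSE
Note: Test each of these 3 hypotheses (no matter whether $H_0: \sigma_{\alpha\beta}^2 = 0$ is rejected)

2-Factor Random Effects ANOVA Table

Source          SS      df                MS      F
Main Effects
  A             SSA     a - 1             MSA     MSA/MSAB
  B             SSB     b - 1             MSB     MSB/MSAB
Interaction
  AB            SSAB    (a - 1)(b - 1)    MSAB    MSAB/MSE
Error           SSE     ab(n - 1)         MSE
Total           TSS     abn - 1

Estimating Variance Components
2-Factor Random Effects Model (note error on page 986)
  $\hat\sigma_\epsilon^2 = \mathrm{MSE}$
  $\hat\sigma_{\alpha\beta}^2 = (\mathrm{MSAB} - \mathrm{MSE})/n$
  $\hat\sigma_\alpha^2 = (\mathrm{MSA} - \mathrm{MSAB})/(bn)$
  $\hat\sigma_\beta^2 = (\mathrm{MSB} - \mathrm{MSAB})/(an)$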

Filtration Process: Response - % material lost through filtration

    DATA one;
      INPUT operator filter loss;
      DATALINES;
    1 1 16.2
    1 1 16.8
    1 1 17.1
    1 2 16.6
    1 2 16.9
    1 2 16.8
    .
    4 1 14.9
    4 2 15.4
    4 2 14.6
    4 2 15.9
    4 3 16.1
    4 3 15.4
    4 3 15.6
    ;
    PROC GLM;
      CLASS operator filter;
      MODEL loss=operator filter operator*filter;
      TITLE '2-Factor Random Effects Model';
      RANDOM operator filter operator*filter/test;
    RUN;

A - Operator (randomly selected) (a = 4)
B - Filter (randomly selected) (b = 3)
n = 3

                    Operator
Filter       1       2       3       4
           16.2    15.9    15.6    14.9
  1        16.8    15.1    15.9    15.2
           17.1    14.5    16.1    14.9

           16.6    16.0    16.1    15.4
  2        16.9    16.3    16.0    14.6
           16.8    16.5    17.2    15.9

           16.7    16.5    16.4    16.1
  3        16.9    16.9    17.4    15.4
           17.1    16.8    16.9    15.6

SAS Random-Effects Output (Filtration Data)
2-Factor Random Effects Model
General Linear Models Procedure
Dependent Variable: LOSS

                               Sum of            Mean
Source             DF         Squares          Square    F Value    Pr > F
Model              11     16.60888889      1.50989899       8.16    0.0001
Error              24      4.44000000      0.18500000
Corrected Total    35     21.04888889

R-Square        C.V.     Root MSE     LOSS Mean
0.789062    2.664175    0.4301163     16.144444

Source             DF     Type III SS     Mean Square    F Value    Pr > F
OPERATOR            3     10.31777778      3.43925926      18.59    0.0001
FILTER              2      4.63388889      2.31694444      12.52    0.0002
OPERATOR*FILTER     6      1.65722222      0.27620370       1.49    0.2229

Source             Type III Expected Mean Square
OPERATOR           Var(Error) + 3 Var(OPERATOR*FILTER) + 9 Var(OPERATOR)
FILTER             Var(Error) + 3 Var(OPERATOR*FILTER) + 12 Var(FILTER)
OPERATOR*FILTER    Var(Error) + 3 Var(OPERATOR*FILTER)

SAS Random-Effects Output - continued ("/test" option)
Tests of Hypotheses for Random Model Analysis of Variance
Dependent Variable: LOSS

Source: OPERATOR
Error: MS(OPERATOR*FILTER)
                       Denominator     Denominator
DF      Type III MS             DF              MS    F Value    Pr > F
 3     3.4392592593              6    0.2762037037    12.4519    0.0055

Source: FILTER
Error: MS(OPERATOR*FILTER)
 2     2.3169444444              6    0.2762037037     8.3885    0.0183

Source: OPERATOR*FILTER
Error: MS(Error)
 6     0.2762037037             24           0.185     1.4930    0.2229
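
The F ratios from the "/test" option and the variance component estimates for the filtration data can be reproduced from the four mean squares in the listing. This is a minimal Python sketch under the two-factor random-effects expected mean squares printed by SAS (coefficients 3, 9 and 12, i.e. n, bn and an with a = 4, b = 3, n = 3); the dictionary keys and variable names are illustrative only.

```python
# Mean squares copied from the PROC GLM output for the filtration data.
ms = {
    "operator": 3.43925926,
    "filter": 2.31694444,
    "operator*filter": 0.27620370,
    "error": 0.18500000,
}
a, b, n = 4, 3, 3   # operators, filters, replicates per cell

# Random-effects F tests: main effects are tested against the interaction
# mean square, the interaction against the error mean square.
f_operator = ms["operator"] / ms["operator*filter"]
f_filter = ms["filter"] / ms["operator*filter"]
f_interaction = ms["operator*filter"] / ms["error"]

# Method-of-moments variance component estimates from the expected mean squares.
var_error = ms["error"]
var_interaction = (ms["operator*filter"] - ms["error"]) / n
var_operator = (ms["operator"] - ms["operator*filter"]) / (b * n)
var_filter = (ms["filter"] - ms["operator*filter"]) / (a * n)

print(f"F(operator) = {f_operator:.4f}, F(filter) = {f_filter:.4f}, "
      f"F(operator*filter) = {f_interaction:.4f}")
print(f"var(operator) = {var_operator:.4f}, var(filter) = {var_filter:.4f}, "
      f"var(operator*filter) = {var_interaction:.4f}, var(error) = {var_error:.4f}")
```

Note how the main-effect F values differ from those in the first GLM table (18.59 and 12.52), which divide by MS(Error) as in a fixed-effects analysis.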
Filtration Problem Results and Conclusions

Variable 1: Active Ingredient (in mg/mL) at End of Storage Period
Table 1. 2-Factor ANOVA - Ex 15.41, page 935 -- mg/mL Data
The GLM Procedure
Dependent Variable: mgml

                               Sum of
Source             DF         Squares     Mean Square    F Value    Pr > F
Model               7      0.46740000      0.06677143      27.30    <.0001
Error              16      0.03913333      0.00244583
Corrected Total    23      0.50653333

R-Square    Coeff Var    Root MSE    mgml Mean
0.922743     0.165090    0.049455     29.95667

Source             DF     Type III SS     Mean Square    F Value    Pr > F
time                3      0.29376667      0.09792222      40.04    <.0001
lab                 1      0.09126667      0.09126667      37.32    <.0001
time*lab            3      0.08236667      0.02745556      11.23    0.0003

Table 3. Calculations for LSD comparisons of mg/mL Cell Means

T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80

Comparison        Actual Difference (lsd = .086)
T3L1 vs T9L2      .37
T3L1 vs T6L2      .37
T3L1 vs T9L1      .36
T3L1 vs T3L2      .27
T3L1 vs T6L1      .16
T3L1 vs T1L2      .09
T3L1 vs T1L1      .08  X
T1L1 vs T9L2      .29
T1L1 vs T6L2      .29
T1L1 vs T9L1      .28
T1L1 vs T3L2      .19
T1L1 vs T6L1      .08  X
T1L2 vs T9L2      .28
T1L2 vs T6L2      .28
T1L2 vs T9L1      .27
T1L2 vs T3L2      .18
T6L1 vs T9L2      .21
T6L1 vs T6L2      .21
T6L1 vs T9L1      .20
T6L1 vs T3L2      .11
T3L2 vs T9L2      .10
T3L2 vs T6L2      .10
T3L2 vs T9L1      .09
T9L1 vs T9L2      .01  X
(X marks a difference smaller than the lsd, i.e. not significant)

Summary (means joined by a common underline are not significantly different):

T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80
-------------
        ---------------------
                                -----
                                        ---------------------
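
2-Factor Mixed Effects Model
  $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$
where $\alpha_i$ is fixed and $\beta_j$ is random (so the interaction $(\alpha\beta)_{ij}$ is random as well)
Assumptions:
 - the $\beta_j$ are independent $N(0, \sigma_\beta^2)$
 - the $(\alpha\beta)_{ij}$ are $N(0, \sigma_{\alpha\beta}^2)$
 - the $\epsilon_{ijk}$ are independent $N(0, \sigma_\epsilon^2)$
Sum-of-Squares obtained as before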
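
Expected Mean Squares for 2-Factor ANOVA with Mixed Effects:

Source        SAS Expected MS                                                                Book's Expected MS
A (fixed)     $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2 + bn\sum_i \alpha_i^2/(a-1)$      $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2 + bn\sum_i \alpha_i^2/(a-1)$
B (random)    $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$               $\sigma_\epsilon^2 + an\sigma_\beta^2$
AB            $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2$                                  $\sigma_\epsilon^2 + n\sigma_{\alpha\beta}^2$
Error         $\sigma_\epsilon^2$                                                            $\sigma_\epsilon^2$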
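
Mixed-Effects Model
To Test:
  $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0$    use F = MSA/MSAB
  $H_0: \sigma_\beta^2 = 0$                            SAS uses F = MSB/MSAB
  $H_0: \sigma_{\alpha\beta}^2 = 0$                    use F = MSAB/MSE
Again: Test each of these 3 hypotheses as in random-effects case.

2-Factor Mixed-Effects ANOVA Table (using SAS Expected MS)

Source          SS      df                MS      F
Main Effects
  A             SSA     a - 1             MSA     MSA/MSAB
  B             SSB     b - 1             MSB     MSB/MSAB
Interaction
  AB            SSAB    (a - 1)(b - 1)    MSAB    MSAB/MSE
Error           SSE     ab(n - 1)         MSE
Total           TSS     abn - 1

Estimating Variance Components
2-Factor Mixed-Effects Model (based on SAS Expected MS)
  $\hat\sigma_\epsilon^2 = \mathrm{MSE}$
  $\hat\sigma_{\alpha\beta}^2 = (\mathrm{MSAB} - \mathrm{MSE})/n$
  $\hat\sigma_\beta^2 = (\mathrm{MSB} - \mathrm{MSAB})/(an)$
Note: A is a fixed effect, so there is no variance component to estimate for A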

Response - fatigue of mechanical part
A - type of inspection (a = 3): (F)ull Military Inspect., (R)educed Military Inspect., (C)ommercial
B - inspector (randomly selected) (b = 3)
n = 5

                 Product Inspection
Inspector       F       R       C
              7.50    7.08    6.15
              7.42    6.17    5.52
    1         5.85    5.65    5.48
              5.89    5.30    5.48
              5.35    5.02    5.98

              7.58    7.68    6.17
              6.52    5.86    6.20
    2         6.54    5.28    5.44
              5.64    5.38    5.75
              5.12    4.87    5.68

              7.70    7.19    6.21
              6.82    6.19    5.66
    3         6.42    5.85    5.36
              5.39    5.35    5.90
              5.35    5.01    6.12

Mixed-Effects Data

    DATA one;
      INPUT insp $ level $ fatigue;
      DATALINES;
    .
    2 C 5.68
    3 C 6.21
    3 C 5.66
    3 C 5.36
    3 C 5.90
    3 C 6.12
    ;
    PROC GLM;
      CLASS insp level;
      MODEL fatigue= level insp level*insp;
      TITLE 'Mixed-Effects Model';
      RANDOM insp level*insp/test;
    RUN;
    PROC MEANS mean var;
      CLASS level;
      VAR fatigue;
    RUN;

SAS Mixed-Effects Output
Mixed-Effects Model
The GLM Procedure
Dependent Variable: fatigue

                               Sum of
Source             DF         Squares     Mean Square    F Value    Pr > F
Model               8      2.70711111      0.33838889       0.53    0.8282
Error              36     23.11448000      0.64206889
Corrected Total    44     25.82159111

R-Square    Coeff Var    Root MSE    fatigue Mean
0.104839     13.35141    0.801292        6.001556

Source             DF     Type III SS     Mean Square    F Value    Pr > F
level               2      2.58739111      1.29369556       2.01    0.1481
insp                2      0.02523111      0.01261556       0.02    0.9806
insp*level          4      0.09448889      0.02362222       0.04    0.9973

SAS Mixed-Effects Output - Continued
Mixed-Effects Model
The GLM Procedure

Source             Type III Expected Mean Square
level              Var(Error) + 5 Var(insp*level) + Q(level)
insp               Var(Error) + 5 Var(insp*level) + 15 Var(insp)
insp*level         Var(Error) + 5 Var(insp*level)

Tests of Hypotheses for Mixed Model Analysis of Variance
Dependent Variable: fatigue

Source                   DF     Type III SS     Mean Square    F Value    Pr > F
level                     2        2.587391        1.293696      54.77    0.0012
insp                      2        0.025231        0.012616       0.53    0.6229
Error                     4        0.094489        0.023622
Error: MS(insp*level)

Source                   DF     Type III SS     Mean Square    F Value    Pr > F
insp*level                4        0.094489        0.023622       0.04    0.9973
Error: MS(Error)         36       23.114480        0.642069
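
The mixed-model F ratios and the variance component estimates can be checked against this output in the same way. The following minimal Python sketch uses the SAS expected mean squares shown above (coefficients 5 and 15, i.e. n and an with a = 3, b = 3, n = 5). Replacing a negative method-of-moments estimate by zero is a common convention that is assumed here; the slides do not state how negative estimates are handled.

```python
# Type III mean squares copied from the PROC GLM output for the fatigue data.
ms_level = 1.29369556        # MSA  (inspection level, fixed)
ms_insp = 0.01261556         # MSB  (inspector, random)
ms_interaction = 0.02362222  # MSAB (insp*level)
ms_error = 0.64206889        # MSE
a, b, n = 3, 3, 5            # inspection levels, inspectors, replicates per cell

# Tests based on the SAS expected mean squares.
f_level = ms_level / ms_interaction        # fixed effect tested against MSAB
f_insp = ms_insp / ms_interaction          # SAS also tests the random effect against MSAB
f_interaction = ms_interaction / ms_error  # interaction tested against MSE

# Raw method-of-moments estimates; these can come out negative.
raw_interaction = (ms_interaction - ms_error) / n
raw_insp = (ms_insp - ms_interaction) / (a * n)

# Assumed convention: report a negative estimate as zero.
var_interaction = max(0.0, raw_interaction)
var_insp = max(0.0, raw_insp)
var_error = ms_error

print(f"F(level) = {f_level:.2f}, F(insp) = {f_insp:.2f}, "
      f"F(insp*level) = {f_interaction:.2f}")
print(f"raw var(insp*level) = {raw_interaction:.5f}, raw var(insp) = {raw_insp:.5f}")
print(f"reported: var(insp*level) = {var_interaction}, var(insp) = {var_insp}, "
      f"var(error) = {var_error:.4f}")
```

For these data both raw estimates are negative because MSAB and MSB are smaller than the mean squares they are compared with, which is consistent with the large p-values for inspector and interaction.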

Multiple Comparisons for Fixed Effect (Inspection Level)
-- Use MSAB in place of MSE
where
 ▪ N denotes the # of observations involved in the computation of a marginal mean
 ▪ v denotes the df associated with AB interaction

SAS Mixed-Effects Output - Output from PROC MEANS
The MEANS Procedure
Analysis Variable : fatigue

             N
level      Obs          Mean       Variance
--------------------------------------------
C           15     5.8066667      0.0981810
F           15     6.3393333      0.8208638
R           15     5.8586667      0.7405410
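
A worked comparison of the three inspection-level means, as a minimal Python sketch. It assumes the intended procedure is Fisher's LSD with MSAB substituted for MSE, i.e. LSD = t(α/2, v) · sqrt(2·MSAB/N) with α = .05; that choice of procedure is an assumption, since the comparison formula itself is not given here. The critical value 2.776 is the tabled upper .025 point of the t distribution with v = 4 df, and the means come from the PROC MEANS output.

```python
import math
from itertools import combinations

# Marginal means for inspection level, from the PROC MEANS output.
means = {"C": 5.8066667, "F": 6.3393333, "R": 5.8586667}

ms_ab = 0.02362222   # MS(insp*level), used in place of MSE
N = 15               # observations behind each marginal mean
v = 4                # df of the insp*level interaction
t_crit = 2.776       # tabled t(.025, 4); alpha = .05 is an assumption

# Assumed procedure: Fisher's LSD with MSAB as the error mean square.
lsd = t_crit * math.sqrt(2 * ms_ab / N)

# Compare every pair of levels; True means the difference exceeds the LSD.
significant = {}
for g1, g2 in combinations(sorted(means), 2):
    diff = abs(means[g1] - means[g2])
    significant[(g1, g2)] = diff > lsd
    print(f"{g1} vs {g2}: difference = {diff:.3f}, exceeds LSD: {diff > lsd}")

print(f"LSD = {lsd:.3f}")
```

Under these assumptions the Full level differs from both Reduced and Commercial, while Reduced and Commercial do not differ from each other.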
Mixed-Effects Example Results and Conclusions: