Topic 21: ANOVA and Linear Regression
Outline
– Review cell means and factor effects models
– Relationship between factor effects constraint and explanatory variables
Cell Means Model
– Y_ij = μ_i + ε_ij
– where μ_i is the theoretical mean or expected value of all observations at level i
– The ε_ij are iid N(0, σ^2)
– Y_ij ~ N(μ_i, σ^2), independent
Factor Effects Model
– A reparameterization of the cell means model
– A useful way of looking at more complicated models
– Null hypotheses are easier to state
– Y_ij = μ + τ_i + ε_ij
– The ε_ij are iid N(0, σ^2)
Parameters
– The cell means model has r+1 parameters: the r μ_i's and σ^2
– The factor effects model has r+2 parameters: μ, the r τ_i's, and σ^2
– Impose a restriction on the τ_i's in the factor effects model to remove one degree of freedom (e.g., Σ_i τ_i = 0 or τ_r = 0)
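A minimal sketch of the two restrictions, written in Python rather than SAS and using hypothetical cell means. The symbol τ_i for the factor effects and the function names are assumptions made for illustration. Each function turns r cell means into (μ, τ_1, …, τ_r) under one restriction; both satisfy μ + τ_i = μ_i.

```python
def sum_to_zero(cell_means):
    """Factor effects under the restriction sum(tau_i) = 0."""
    r = len(cell_means)
    mu = sum(cell_means) / r              # unweighted mean of the cell means
    return mu, [m - mu for m in cell_means]


def last_level_zero(cell_means):
    """Factor effects under the restriction tau_r = 0."""
    mu = cell_means[-1]                   # mu is the mean of the last level
    return mu, [m - mu for m in cell_means]


# Hypothetical cell means for r = 3 levels (not the KNNL data).
cell_means = [10.0, 12.0, 17.0]
mu_a, tau_a = sum_to_zero(cell_means)
mu_b, tau_b = last_level_zero(cell_means)
```

The two parameterizations give different values of μ and the τ_i's, but the same r cell means, which is why one restriction is needed to pin the parameters down.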
Regression Approach
– We can use multiple regression to reproduce the results based on the factor effects model Y_ij = μ + τ_i + ε_ij
– We will restrict Σ_i τ_i = 0
Coding for Explanatory Variables
– Σ_i τ_i = 0 implies τ_r = -τ_1 - τ_2 - … - τ_(r-1)
– Because of the restriction, only r-1 columns are needed, i = 1 to r-1
– X_ij = 1 if Y is an observation from level i
– X_ij = -1 if Y is an observation from level r
– X_ij = 0 if Y is from any other level
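The coding rule can be sketched as a small Python function. This is an illustration only: the names effect_code and expected_value are hypothetical, the factor effects are written τ_i by assumption, and the parameter values are made up.

```python
def effect_code(level, r):
    """Return the r-1 coded values [X_1, ..., X_(r-1)] for one observation.

    level is the factor level of the observation, numbered 1..r.
    """
    if not 1 <= level <= r:
        raise ValueError("level must be between 1 and r")
    if level == r:
        return [-1] * (r - 1)             # last level: -1 in every column
    return [1 if i == level else 0 for i in range(1, r)]


def expected_value(level, mu, taus):
    """E(Y) = mu + sum_i tau_i * X_i, with taus = [tau_1, ..., tau_(r-1)]."""
    r = len(taus) + 1
    x = effect_code(level, r)
    return mu + sum(t * xi for t, xi in zip(taus, x))


# Hypothetical parameters for r = 3: mu = 13, tau_1 = -3, tau_2 = -1,
# so the restriction gives tau_3 = -tau_1 - tau_2 = 4.
mu, taus = 13.0, [-3.0, -1.0]
level_means = [expected_value(level, mu, taus) for level in (1, 2, 3)]
```

For level r the coded row is all -1, so the regression mean is μ - τ_1 - … - τ_(r-1), which equals μ + τ_r under the sum-to-zero restriction.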
KNNL Example
– Recall KNNL p 687 from Topic 20
– It is a bit messy because n_i = 5, 5, 4, 5
– With unequal n_i, the grand mean, (Σ_i n_i μ_i)/n_T, is not necessarily the same as the mean of the group means, (Σ_i μ_i)/r
– We will calculate these two values
– You will have an easier example in the homework (n_i is constant) where they are the same value
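A quick check of the two quantities in Python, using small hypothetical data sets (not the KNNL data) chosen so the arithmetic is easy to follow. The function and variable names are assumptions for illustration.

```python
def grand_mean(groups):
    """Mean of all observations pooled together (weights each level by n_i)."""
    total = sum(sum(values) for values in groups.values())
    n_total = sum(len(values) for values in groups.values())
    return total / n_total


def mean_of_means(groups):
    """Unweighted average of the level sample means."""
    level_means = [sum(values) / len(values) for values in groups.values()]
    return sum(level_means) / len(level_means)


# Hypothetical unbalanced data: n_i = 3, 2, 4.
unbalanced = {1: [10, 12, 14], 2: [20, 22], 3: [30, 30, 33, 35]}
# Hypothetical balanced data: n_i = 2 for every level.
balanced = {1: [10, 14], 2: [20, 22], 3: [30, 34]}

gm_unbalanced = grand_mean(unbalanced)        # 206 / 9
mm_unbalanced = mean_of_means(unbalanced)     # (12 + 21 + 32) / 3
gm_balanced = grand_mean(balanced)
mm_balanced = mean_of_means(balanced)
```

With unequal n_i the two values differ because the grand mean gives more weight to the larger groups; with constant n_i they coincide.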
Means
proc means data=a1 noprint;
  class design;
  var cases;
  output out=a2 mean=mclass;
run;
proc print data=a2;
run;
Output
Obs des _TYPE_ _FREQ_ mclass
– The first row is the grand sample mean; it is not the average of the four trt sample means shown below it
The mean of the means
proc means data=a2 mean;
  where _TYPE_ eq 1;
  var mclass;
run;
Output
The MEANS Procedure
Analysis Variable : mclass
Mean
– Not a big difference from the grand sample mean in this example
Generate explanatory variables for REG
data a1;
  set a1;
  * design 4 (the last level) is coded -1 in every column;
  x1=(design eq 1)-(design eq 4);
  x2=(design eq 2)-(design eq 4);
  x3=(design eq 3)-(design eq 4);
run;
proc print data=a1;
run;
Output
Obs cases design x1 x2 x3
Output with parameters
des  x1  x2  x3  mean
1     1   0   0  μ + τ_1
2     0   1   0  μ + τ_2
3     0   0   1  μ + τ_3
4    -1  -1  -1  μ - τ_1 - τ_2 - τ_3
– μ is the result of including an intercept
Run the regression
proc reg data=a1;
  model cases=x1 x2 x3;
run;
Output
Anova
Source  DF  SS  MS  F  P
Model                  <.0001
Error
Total
– Same ANOVA table as GLM
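The Model, Error, and Total lines of a one-way ANOVA table depend only on the fitted level means, which is why the regression and GLM tables agree. A minimal Python sketch of those sums of squares follows, using hypothetical data (not the KNNL data) and exact fractions; the function name anova_table is an assumption for illustration.

```python
from fractions import Fraction


def anova_table(groups):
    """Return one-way ANOVA quantities computed from the level means."""
    all_values = [Fraction(v) for values in groups.values() for v in values]
    n_total, r = len(all_values), len(groups)
    grand = sum(all_values) / n_total
    ss_model = Fraction(0)
    ss_error = Fraction(0)
    for values in groups.values():
        level_mean = Fraction(sum(values), len(values))
        ss_model += len(values) * (level_mean - grand) ** 2
        ss_error += sum((Fraction(v) - level_mean) ** 2 for v in values)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    df_model, df_error = r - 1, n_total - r
    f_stat = (ss_model / df_model) / (ss_error / df_error)
    return {"df_model": df_model, "df_error": df_error,
            "ss_model": ss_model, "ss_error": ss_error,
            "ss_total": ss_total, "F": f_stat}


# Hypothetical unbalanced data with r = 3 levels.
groups = {1: [10, 12, 14], 2: [20, 22], 3: [30, 30, 33, 35]}
table = anova_table(groups)
```

Any full-rank coding of the r levels (cell means, sum-to-zero, or last level zero) produces the same fitted means, and therefore the same entries in this table.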
Regression coefficients
Var  Est
Int  mean of the means
x1   Y_1./n_1 - Int
x2   Y_2./n_2 - Int
x3   Y_3./n_3 - Int
– Trt means: Int + Est(x1), Int + Est(x2), Int + Est(x3), and Int - Est(x1) - Est(x2) - Est(x3) = 27.2
– Get same trt means
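The pattern in these coefficients can be checked numerically. The sketch below is in Python rather than SAS and uses hypothetical data (not the KNNL data); the helper names are assumptions for illustration. It fits least squares on the effect-coded columns by solving the normal equations exactly, then compares the estimates with the mean of the means and the level sample means.

```python
from fractions import Fraction


def effect_code(level, r):
    """Effect coding: 1 in column `level`, -1 everywhere for level r."""
    if level == r:
        return [-1] * (r - 1)
    return [1 if i == level else 0 for i in range(1, r)]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for i in range(n):
            if i != col:
                factor = M[i][col]
                M[i] = [vi - factor * vc for vi, vc in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]


def least_squares(X, y):
    """Return b solving the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(p)]
           for j in range(p)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(XtX, Xty)


# Hypothetical unbalanced data with r = 3 levels.
groups = {1: [10, 12, 14], 2: [20, 22], 3: [30, 30, 33, 35]}
r = len(groups)
X, y = [], []
for level, values in groups.items():
    for v in values:
        # Leading 1 is the intercept column.
        X.append([Fraction(1)] + [Fraction(c) for c in effect_code(level, r)])
        y.append(Fraction(v))

b = least_squares(X, y)                       # [Int, Est(x1), Est(x2)]
means = {lev: Fraction(sum(v), len(v)) for lev, v in groups.items()}
mean_of_means = sum(means.values()) / r
last_level_mean = b[0] - b[1] - b[2]          # Int - Est(x1) - Est(x2)
```

Here the intercept equals the unweighted mean of the level means even though the n_i differ, and the last level's mean is recovered by subtracting the other estimates from the intercept.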
Last slide
– Read KNNL Chapter 16
– We used program topic21.sas to generate the output for today