© Department of Statistics 2012 STATS 330 Lecture 18 Slide 1 Stats 330: Lecture 18.

Slide 2: Anova Models

These are linear (regression) models where all the explanatory variables are categorical. If there is just one categorical explanatory variable, we have the "one-way anova" model discussed in STATS 201/8. If there are two categorical explanatory variables, we have the "two-way anova" model, also discussed in STATS 201/8. However, we shall regard these as just another type of regression model.

Slide 3: Example: one-way model

In an experiment to study the effect of carcinogenic substances, six different substances were applied to cell cultures. The response variable (ratio) is the ratio of damaged to undamaged cells, and the explanatory variable (treatment) is the substance. The carcinogenic substances data are on the website.

Slide 4: Data

The data frame has two columns: ratio (numeric) and treatment. The treatment column cycles through control, chloral hydrate, diazapan, hydroquinone, econidazole, colchicine, and so on through the rest of the data.

Slide 5: Distributions skewed?

boxplot(ratio ~ treatment, data = carcin.df, ylab = "ratio", main = "Ratios for different substances")

Slide 6: The model

The response is modelled as yij = μi + εij, where the mean μi depends on the substance i. We make the usual assumptions about the errors εij (normal, equal variance, independent etc).

Slide 7: Offset form

yij = μ1 + offseti + εij, where μ1 is the baseline (control) mean and offseti = μi - μ1 is the difference between treatment i and the control (so offset1 = 0).

Slide 8: Dummy variable form

Define:
CH = 1 if treatment = chloral hydrate, 0 else
D  = 1 if treatment = diazapan, 0 else
H  = 1 if treatment = hydroquinone, 0 else
E  = 1 if treatment = econidazole, 0 else
C  = 1 if treatment = colchicine, 0 else

Then the mean is μ1 + offsetCH·CH + offsetD·D + offsetH·H + offsetE·E + offsetC·C.
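The dummy-variable form can be sketched in code (the slides use R; here is a plain Python illustration, with a hypothetical baseline and hypothetical offsets that are not the carcinogen estimates):

```python
# Sketch of the dummy-variable form of the one-way anova model.
# Baseline and offsets below are hypothetical, purely for illustration.
def dummies(treatment):
    """Indicator variables CH, D, H, E, C for the non-control levels."""
    levels = ["chloralhydrate", "diazapan", "hydroquinone",
              "econidazole", "colchicine"]
    return [1 if treatment == lev else 0 for lev in levels]

def cell_mean(treatment, baseline, offsets):
    """mean = baseline + sum of offset_k * dummy_k."""
    return baseline + sum(o * d for o, d in zip(offsets, dummies(treatment)))

baseline = 20                  # hypothetical control mean
offsets = [1, -2, 5, 0, 3]     # hypothetical offsets for the five substances

print(cell_mean("control", baseline, offsets))      # just the baseline: 20
print(cell_mean("colchicine", baseline, offsets))   # 20 + 3 = 23
```

For the control, every dummy is zero and the mean is the baseline; each other treatment switches on exactly one dummy and adds its offset, which is exactly what lm estimates on the next slides.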

Slide 9: Estimation

To estimate the offsets and the baseline (control) mean, we use lm as usual. We have to rearrange the levels to make the control the baseline:

carcin.df = read.table(file.choose(), header=T)
carcin.df$treatment = factor(carcin.df$treatment,
    levels = c("control", "chloralhydrate", "colchicine",
               "diazapan", "econidazole", "hydroquinone"))
summary(lm(ratio ~ treatment, data = carcin.df))

Slide 10: lm output

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)                                          < 2e-16 ***
treatmentchloralhydrate
treatmentcolchicine                                     e-12 ***
treatmentdiazapan
treatmenteconidazole
treatmenthydroquinone                                          **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 294 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic:  on 5 and 294 DF, p-value: 3.897e-12

Slide 11: Diagnostic plots

Non-normal? Variances about equal; the flagged points can be ignored.

Slide 12:

boxcoxplot(ratio ~ treatment, data = carcin.df)

Slide 13: Analyzing the ¼ power

WB test: previous p = 0.00, current p = 0.06. Normality is better after the transformation.

Slide 14: Analyzing the ¼ power: summary

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)                                          < 2e-16 ***
treatmentchloralhydrate
treatmentcolchicine                                     e-11 ***
treatmentdiazapan
treatmenteconidazole
treatmenthydroquinone                                           *

Residual standard error:  on 294 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic:  on 5 and 294 DF, p-value: 1.008e-10

Slide 15: Testing equality of means

The standard F-test for equality of means is computed using the anova function. Here we compare the equal-means model (the null model) with the different-means model; there is only one term in the model.

> quarter.lm <- lm(ratio^(1/4) ~ treatment, data = carcin.df)
> anova(quarter.lm)
Analysis of Variance Table

Response: ratio^(1/4)
           Df Sum Sq Mean Sq F value Pr(>F)
treatment   5                           e-10 ***
Residuals 294

Highly significant differences.
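The F statistic that anova computes here can be reproduced by hand from the between-groups and within-groups sums of squares. A minimal Python sketch, on a small hypothetical data set (three groups of three, not the carcinogen data):

```python
# Hand computation of the one-way anova F test, on hypothetical data.
groups = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 3.0, 4.0],
    "c": [6.0, 7.0, 8.0],
}
all_obs = [y for ys in groups.values() for y in ys]
grand = sum(all_obs) / len(all_obs)

# Between-groups (treatment) sum of squares and degrees of freedom
ss_treat = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2
               for ys in groups.values())
df_treat = len(groups) - 1

# Within-groups (residual) sum of squares and degrees of freedom
ss_res = sum((y - sum(ys) / len(ys)) ** 2
             for ys in groups.values() for y in ys)
df_res = len(all_obs) - len(groups)

F = (ss_treat / df_treat) / (ss_res / df_res)
print(ss_treat, ss_res, F)   # 42.0 6.0 21.0
```

A large F means the between-groups variation is big relative to the within-groups variation, so the equal-means (null) model is rejected.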

Slide 16: One-way plot (s20x)

> onewayPlot(quarter.lm)

Tukey intervals: all cover the true values with 95% probability.

Slide 17: Two factors: example

Experiment to study weight gain in rats:
- The response is weight gain over a fixed time period
- This is modelled as a function of diet (Beef, Cereal, Pork) and amount of feed (High, Low)

Slide 18: Data

> diets.df
   gain source level
1    73   Beef  High
2    98 Cereal  High
3    94   Pork  High
4    90   Beef   Low
5       Cereal   Low
6    49   Pork   Low
7         Beef  High
8    74 Cereal  High
9    79   Pork  High
10        Beef   Low
...
60 observations in all

Slide 19: Two factors: the model

If the (continuous) response depends on two categorical explanatory variables, then we assume the response is normally distributed with a mean depending on the combination of factor levels: if the factors are A and B, the mean at the ith level of A and the jth level of B is μij. The other standard assumptions (equal variance, normality, independence) apply.

Slide 20: Diagrammatically...

              Source = Beef   Source = Cereal   Source = Pork
Level = High      μ11              μ12               μ13
Level = Low       μ21              μ22               μ23

Slide 21: Decomposition of the means

We usually want to split each "cell mean" up into 4 terms:
- A term reflecting the overall baseline level of the response
- A term reflecting the effect of factor A (row effect)
- A term reflecting the effect of factor B (column effect)
- A term reflecting how A and B interact

Slide 22: Mathematically...

Overall baseline: μ11 (the mean when both factors are at their baseline levels)
Effect of the ith level of factor A (row effect): μi1 - μ11 (the ith level of A, at the baseline of B, expressed as a deviation from the overall baseline)
Effect of the jth level of factor B (column effect): μ1j - μ11 (the jth level of B, at the baseline of A, expressed as a deviation from the overall baseline)
Interaction: what's left over (see next slide)

Slide 23: Interactions

Each cell (except those in the first row or column) has an interaction:

interaction = cell mean - baseline - row effect - column effect

so that

cell mean = baseline + row effect + column effect + interaction

Slide 24: Notation

Overall baseline: μ = μ11
Main effects of A: αi = μi1 - μ11
Main effects of B: βj = μ1j - μ11
AB interactions: (αβ)ij = μij - μi1 - μ1j + μ11

Thus, μij = μ + αi + βj + (αβ)ij.
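The decomposition above can be checked numerically. A short Python sketch on a hypothetical 2 x 3 table of cell means (not the rat-diet data):

```python
# Decompose a 2 x 3 table of cell means into baseline, row effects,
# column effects and interactions, following the slide's notation.
# The cell means are hypothetical.
mu = [
    [10.0, 12.0, 11.0],   # row 1 (baseline level of A)
    [14.0, 20.0, 15.0],   # row 2
]

baseline = mu[0][0]                               # mu_11
row = [mu[i][0] - baseline for i in range(2)]     # alpha_i = mu_i1 - mu_11
col = [mu[0][j] - baseline for j in range(3)]     # beta_j  = mu_1j - mu_11
inter = [[mu[i][j] - mu[i][0] - mu[0][j] + baseline for j in range(3)]
         for i in range(2)]                       # (alpha beta)_ij

# Check: every cell mean is recovered exactly from the four terms.
for i in range(2):
    for j in range(3):
        assert mu[i][j] == baseline + row[i] + col[j] + inter[i][j]

print(row, col, inter)
```

Note that the first row and first column of the interaction array are automatically zero, matching the convention that any term with a subscript of 1 vanishes.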

Slide 25: Importance of interactions

If the interactions are all zero, then the effect of changing levels of A is the same for all levels of B. In mathematical terms, μij - μi'j doesn't depend on j. Equivalently, the effect of changing levels of B is the same for all levels of A. If the interactions are zero, the relationship between the factors and the response is simple.

Slide 26: Why are comparisons simple when interactions are zero?

With zero interactions, μij - μij' = (μ + αi + βj) - (μ + αi + βj') = βj - βj', which doesn't depend on i!

Slide 27: Splitting up the mean: rats

Factors are level (amount of food) and source (diet). Cell means (rows High/Low, columns Beef/Cereal/Pork; High-Beef is the baseline cell):

       Beef  Cereal  Pork
High   100
Low    79.2

Row effect for Low: 79.2 - 100 = -20.8
Col effect for Cereal:
Col effect for Pork: -0.5
Low-Cereal interaction: 18.8
Low-Pork interaction: 0

Slide 28: Exploratory plots

> plot.design(diets.df)

More gain on the high amount of feed and on the Beef diet.

Slide 29:

dotplot(source ~ gain | level, data = diets.df)

Slide 30: Fit model

> diets.lm <- lm(gain ~ source + level + source:level, data = diets.df)
> summary(diets.lm)

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           1.000e+02                     < 2e-16
sourceCereal
sourcePork
levelLow
sourceCereal:levelLow 1.880e+01
sourcePork:levelLow

Residual standard error:  on 54 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic: 4.3 on 5 and 54 DF, p-value:

Slide 31: Fitting as a regression model

Note that this is equivalent to fitting a regression with dummy variables R2, C2, C3:
R2 = 1 if the observation is in row 2, zero otherwise
C2 = 1 if the observation is in column 2, zero otherwise
C3 = 1 if the observation is in column 3, zero otherwise

The regression is Y ~ R2 + C2 + C3 + I(R2*C2) + I(R2*C3)

Slide 32: Regression summary

> R2 = ifelse(diets.df$level=="Low", 1, 0)
> C2 = ifelse(diets.df$source=="Cereal", 1, 0)
> C3 = ifelse(diets.df$source=="Pork", 1, 0)
> reg.lm = lm(gain ~ R2 + C2 + C3 + I(R2*C2) + I(R2*C3), data=diets.df)
> summary(reg.lm)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.000e+02                    < 2e-16 ***
C2                                               *
C3
R2                                               **
I(R2 * C2)  1.880e+01                            *
I(R2 * C3)

Slide 33: Testing for zero interactions

> anova(diets.lm)
Analysis of Variance Table

Response: gain
             Df Sum Sq Mean Sq F value Pr(>F)
source        2
level         1                              ***
source:level  2
Residuals    54

Some evidence of interaction.

Slide 34: Interaction plot

> interaction.plot(source, level, gain)

Non-parallel lines indicate interaction.

Slide 35: Do we need source in the model?

> model1 <- lm(gain ~ source*level)   # note shorthand
> model2 <- lm(gain ~ level)
> anova(model2, model1)
Analysis of Variance Table

Model 1: gain ~ level
Model 2: gain ~ source * level
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2

Not significant! No significant effect of source.

Slide 36: Notation: review

For two factors A and B:
Baseline: μ = μ11
A main effect: αi = μi1 - μ11
B main effect: βj = μ1j - μ11
AB interaction: (αβ)ij = μij - μi1 - μ1j + μ11

Then μij = μ + αi + βj + (αβ)ij.

Slide 37: Zero interaction model

If we have only one observation per factor-level combination, we can't estimate both the interactions and the error variance. We then have to assume that the interactions are zero and fit an "additive model": gain ~ level + source. We can test for zero interactions (in a reduced form) using the "Tukey one-degree-of-freedom test".

Slide 38: Possible models for two factors

For two factors A and B the possible models are:
Y~1 (fit a single mean only)
Y~A (cell means depend on A alone)
Y~B (cell means depend on B alone)
Y~A+B (cell means have no interaction)
Y~A*B (general model, cell means have no restrictions)

Slide 39: In terms of "effects"

The general model is Y~A+B+A:B (equivalently Y~A*B). Its mathematical form is E(Yij) = μ + αi + βj + (αβ)ij.
Y~1 implies αi = 0, βj = 0, (αβ)ij = 0
Y~A implies βj = 0, (αβ)ij = 0
Y~B implies αi = 0, (αβ)ij = 0
Y~A+B implies (αβ)ij = 0

Slide 40: Interpreting anova

All F-tests essentially compare a model to a sub-model, using an estimate of σ² in the denominator. The anova function can do this explicitly, as in anova(model1, model2), with the estimate of σ² coming from the bigger model. When we use just one argument, as in anova(model1), the models being compared are selected implicitly.

Slide 41: Interpreting anova (cont)

For example, consider a model with 2 factors A and B:

> anova(lm(y ~ A + B + A:B))
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
A                                        **
B
A:B
Residuals

The Residuals mean square is the full-model estimate of σ².

Slide 42: First line

The first line of the table compares the model y~A with the null model (all means the same), using the estimate σ² = 1.288 from the full model y~A+B+A:B.

> model1 <- lm(y ~ A)
> model0 <- lm(y ~ 1)
> anova(model0, model1)
Analysis of Variance Table

Model 1: y ~ 1
Model 2: y ~ A
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2                                 **

The difference in RSS is the numerator of the F-test.

Slide 43: Second line

The second line of the table compares the "no interaction" model y~A+B with the model y~A, using the estimate of σ² from the full model y~A+B+A:B.

> model2 <- lm(y ~ A + B)
> model1 <- lm(y ~ A)
> anova(model1, model2)
Analysis of Variance Table

Model 1: y ~ A
Model 2: y ~ A + B
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2

The difference in RSS is the numerator of the F-test in line 2.

Slide 44: Third line

The third line of the table compares the full model y~A+B+A:B with the "no interaction" model y~A+B, using the estimate of σ² from the full model.

> model2 <- lm(y ~ A + B)
> model3 <- lm(y ~ A + B + A:B)
> anova(model2, model3)
Analysis of Variance Table

Model 1: y ~ A + B
Model 2: y ~ A + B + A:B
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2

The difference in RSS is the numerator of the F-test in line 3.

Slide 45: To summarise

Terms are added line by line. Each F-test compares the current model with the previous model. At each stage, the estimate of σ² is obtained from the full model.
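The construction of this sequential table can be sketched directly: given the residual df and RSS of the nested fits, each line's F is (drop in RSS / drop in df) divided by the full-model σ² estimate. A Python illustration with hypothetical RSS values (not from the slides' data):

```python
# How the one-argument anova() table is built: terms added one at a time,
# each F comparing the current model with the previous one, with sigma^2
# estimated from the full model. All numbers below are hypothetical.
models = [                  # (label, residual df, RSS) for nested fits
    ("1",       19, 60.0),  # null model
    ("A",       17, 40.0),  # + A
    ("A+B",     15, 30.0),  # + B
    ("A+B+A:B", 12, 24.0),  # + A:B (full model)
]

full_df, full_rss = models[-1][1], models[-1][2]
sigma2 = full_rss / full_df             # full-model estimate of sigma^2

for (lab0, df0, rss0), (lab1, df1, rss1) in zip(models, models[1:]):
    extra_df = df0 - df1                # df for this term
    F = ((rss0 - rss1) / extra_df) / sigma2
    print(f"{lab1:8s} Df={extra_df}  Sum Sq={rss0 - rss1:.1f}  F={F:.3f}")
```

Each printed line corresponds to one row of the anova table: the extra sum of squares for the newly added term, divided by its df, scaled by the full-model error variance.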

Slide 46: More than two factors: example

An experiment was conducted to compare different diets for feeding chickens. The diets depended on 3 variables:
- Source of protein (variable protein): either "groundnut" or "soybean"
- Level of protein (variable protlevel): either 0, 1 or 2
- Level of fish solubles (variable fish): either high or low

The response variable was weight gain (variable chickweight).

Slide 47: Data

chickweight protein   protlevel fish
            groundnut 0         Low
            groundnut 0         High
            groundnut 1         Low
            groundnut 1         High
            groundnut 2         Low
            groundnut 2         High
            soybean   0         Low
            soybean   0         High
            soybean   1         Low
            soybean   1         High
            soybean   2         Low
            soybean   2         High
...
24 observations in all

Slide 48: Data characteristics

There are 3 factors:
- protein with 2 levels (groundnut, soybean)
- protlevel with 3 levels (0, 1, 2)
- fish with 2 levels (high, low)

There are 2 x 3 x 2 = 12 factor-level combinations, so 12 means. Each combination is observed twice, so 24 observations in all.

Slide 49: Interactions

Let μijk be the population mean of all observations taken at level i of protein, level j of protlevel and level k of fish. We can split this mean up into 8 terms:
- An overall baseline μ = μ111
- "Main effects", e.g. αi = μi11 - μ111
- "Two-way interactions", e.g. (αβ)ij = μij1 - μi11 - μ1j1 + μ111
- A "3-way interaction" (αβγ)ijk = μijk - μij1 - μi1k - μ1jk + μi11 + μ1j1 + μ11k - μ111

Slide 50: Interactions (cont)

Then μijk = μ + αi + βj + γk + (αβ)ij + (βγ)jk + (αγ)ik + (αβγ)ijk. As before, if any one of the subscripts i, j, k is 1, then the corresponding interaction is zero. Interpretation: e.g. if the protlevel x fish and the 3-way interactions are all zero, then the effect of changing levels of fish is the same for all levels of protlevel.
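The three-way decomposition can be checked numerically as well. This Python sketch builds a purely additive 2 x 3 x 2 array of hypothetical cell means and verifies that the 3-way interaction formula on the previous slide then vanishes everywhere:

```python
# Three-factor decomposition check: for an additive array of cell means
# (hypothetical numbers) the three-way interaction is identically zero.
# Indices: i (2 levels), j (3 levels), k (2 levels); level 1 is index 0.
a = [0.0, 4.0]            # hypothetical main effects of factor A
b = [0.0, 2.0, 1.0]       # of factor B
c = [0.0, 3.0]            # of factor C
base = 10.0               # hypothetical baseline

mu = [[[base + a[i] + b[j] + c[k] for k in range(2)]
       for j in range(3)] for i in range(2)]

def abc(i, j, k):
    """(alpha beta gamma)_ijk computed from the cell means."""
    return (mu[i][j][k] - mu[i][j][0] - mu[i][0][k] - mu[0][j][k]
            + mu[i][0][0] + mu[0][j][0] + mu[0][0][k] - mu[0][0][0])

print(all(abc(i, j, k) == 0.0
          for i in range(2) for j in range(3) for k in range(2)))
```

Adding any non-additive bump to a single cell would make abc non-zero for that cell, which is exactly what the 3-way term in the lm fit on the next slides is estimating.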

Slide 51: Why?

With those interactions zero, μijk - μijk' = γk - γk' + (αγ)ik - (αγ)ik', which doesn't depend on j!

Slide 52: Estimating terms

> model1 <- lm(chickweight ~ protein*protlevel*fish, data=chickwts.df)
> summary(model1)

                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                                                      e-13
proteinsoybean
protlevel1
protlevel2
fishLow
proteinsoybean:protlevel1
proteinsoybean:protlevel2
proteinsoybean:fishLow
protlevel1:fishLow
protlevel2:fishLow
proteinsoybean:protlevel1:fishLow
proteinsoybean:protlevel2:fishLow

Residual standard error:  on 12 degrees of freedom
Multiple R-Squared: , Adjusted R-squared:
F-statistic:  on 11 and 12 DF, p-value:

Slide 53: Anova for the chick weights

> anova(model1)
Analysis of Variance Table

Response: chickweight
                       Df Sum Sq Mean Sq F value Pr(>F)
protein                 1
protlevel               2
fish                    1                              **
protein:protlevel       2                              *
protein:fish            1
protlevel:fish          2
protein:protlevel:fish  2
Residuals              12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Suggests the model protein*protlevel + fish.

Slide 54: Check

> anova(lm(chickweight ~ protein*protlevel + fish),
+       lm(chickweight ~ protein*protlevel*fish))
Analysis of Variance Table

Model 1: chickweight ~ protein * protlevel + fish
Model 2: chickweight ~ protein * protlevel * fish
  Res.Df RSS Df Sum of Sq F Pr(>F)
1
2

Not significant, but interpret with caution. The effect of fish is the same for each protein/protlevel combination.

Slide 55: Interpretation of interactions

If a factor (say A) does not interact with the others, the effect of changing levels of A is the same for all levels of the other factors. If the 3-way interactions are zero, then the interaction between A and B is the same for all levels of C.

Slide 56: Summary

Anova models are interpreted just like regressions, except:
- No question of planarity (linear by definition)
- Need to interpret interactions
- Judge the effect of factors by anova
- Factors in anova are added one at a time
- Suitable for completely randomised experiments where it is reasonable to assume the observations are independent