Topic 27: Strategies of Analysis

Outline
Strategy for analysis of two-way studies
–Interaction is not significant
–Interaction is significant
What if the levels of one or both factors are quantitative?
What if the cell size is n=1?

An analytical strategy
Run the model with main effects and the two-way interaction
Plot the data and the means, and look at the normal quantile plot
Check the significance test for the interaction

AB interaction not sig
Possibly rerun the analysis without the interaction; only consider this if the error degrees of freedom (df_E) are small
For a main effect that is not significant
–There is no evidence that the levels of this factor are associated with different means of the response variable
–Possibly rerun without this factor, giving a one-way ANOVA
Watch out for Type II errors: dropping a factor that really does matter moves its variation into the error term and inflates MSE

AB interaction not sig
If both main effects are not significant
–The model could be run as model Y=; (intercept only)
–This basically assumes a one-population model
For a significant main effect with more than two levels, use the means or lsmeans statement with the tukey multiple comparison procedure
Contrasts and linear combinations can also be examined using the contrast and estimate statements

AB interaction sig but not important
Plots and a careful examination of the cell means may indicate that the interaction is not very important even though it is statistically significant
Use the marginal means for each significant main effect to describe the important results
You may need to qualify these results using the interaction

AB interaction sig but not important
Use the methods described for the situation where the interaction is not significant, but keep the interaction in the model
Carefully interpret the marginal means as averages over the levels of the other factor

AB interaction is sig and important
Options include
–Treat as a one-way ANOVA with ab levels; use tukey to compare means; the contrast and estimate statements can also be useful
–Report that the interaction is significant; plot the means and describe the pattern
–Analyze the levels of A for each level of B (use a BY statement) or vice versa

Specification of contrast and estimate statements
When using the contrast statement, always double-check your results with the estimate statement
The order of factors is determined by the order in the class statement, not the order in the model statement
Contrasts should come a priori from research questions

KNNL Example
KNNL p 833
Y is the number of cases of bread sold
A is the height of the shelf display, a=3 levels: bottom, middle, top
B is the width of the shelf display, b=2 levels: regular, wide
n=2 stores for each of the 3x2 treatment combinations

Contrast/Estimate Example
Suppose we want to compare the average of the two height = middle cells with the average of the other four cells
With this approach, the contrast should correspond to a research question formulated before examining the data

Contrast/Estimate
First, formulate the question as a contrast using the cell means model
–(μ_21 + μ_22)/2 = (μ_11 + μ_12 + μ_31 + μ_32)/4
–-.25μ_11 - .25μ_12 + .5μ_21 + .5μ_22 - .25μ_31 - .25μ_32 = 0
Then translate the contrast into the factor effects model using μ_ij = μ + α_i + β_j + (αβ)_ij

Contrast/Estimate
-.25μ_11 = -.25(μ + α_1 + β_1 + (αβ)_11)
-.25μ_12 = -.25(μ + α_1 + β_2 + (αβ)_12)
+.5μ_21 = +.5(μ + α_2 + β_1 + (αβ)_21)
+.5μ_22 = +.5(μ + α_2 + β_2 + (αβ)_22)
-.25μ_31 = -.25(μ + α_3 + β_1 + (αβ)_31)
-.25μ_32 = -.25(μ + α_3 + β_2 + (αβ)_32)
The μ terms and the β_j terms cancel because their coefficients sum to zero, leaving
C = (-.5α_1 + α_2 - .5α_3) + (-.25(αβ)_11 - .25(αβ)_12 + .5(αβ)_21 + .5(αβ)_22 - .25(αβ)_31 - .25(αβ)_32)
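The translation from cell-means coefficients to factor-effects coefficients can be checked mechanically. The following is a minimal Python sketch (not part of the original SAS program): it assumes the cells are ordered height = bottom, middle, top by width = regular, wide, and the variable names are illustrative only.

```python
# Cell-means contrast coefficients c[i][j] for the bread example.
# Assumed ordering: rows = height (bottom, middle, top), columns = width (regular, wide).
c = [
    [-0.25, -0.25],
    [0.50, 0.50],
    [-0.25, -0.25],
]

# Substitute mu_ij = mu + alpha_i + beta_j + (alphabeta)_ij and collect terms.
mu_coef = sum(sum(row) for row in c)                       # coefficient on mu
alpha_coef = [sum(row) for row in c]                       # coefficient on each alpha_i
beta_coef = [sum(row[j] for row in c) for j in range(2)]   # coefficient on each beta_j
ab_coef = [cij for row in c for cij in row]                # coefficients on (alphabeta)_ij, row by row

print("mu:", mu_coef)
print("alpha:", alpha_coef)
print("beta:", beta_coef)
print("alpha*beta:", ab_coef)
```

The alpha and alpha*beta lists are the coefficient strings that go after height and height*width in the contrast and estimate statements, provided the class statement lists height before width.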

Proc glm
proc glm data=a1;
  class height width;
  model sales=height width height*width;

Contrast and estimate
  contrast 'middle two vs all others' height -.5 1 -.5 height*width -.25 -.25 .5 .5 -.25 -.25;
  estimate 'middle two vs all others' height -.5 1 -.5 height*width -.25 -.25 .5 .5 -.25 -.25;
  means height*width;
run;

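Output
Contrast 'middle two vs all others': DF, SS, MS, F (values not preserved in this transcript), Pr > F <.0001
Parameter 'middle two vs all others': Estimate, StErr, t (values not preserved in this transcript), Pr > |t| <.0001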

Check with means
(65 + 69)/2 - (sum of the other four cell means)/4 = 67 - 43 = 24
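The hand check above can be reproduced from raw data. This is a minimal Python sketch, not the course's SAS program; the twelve sales values are an assumption (taken to be the KNNL bread data, and chosen so that they reproduce the middle-shelf cell means 65 and 69 and the other-cells average 43 shown on the slide).

```python
import math

# Assumed data: cases sold in the two stores of each (height, width) cell.
sales = {
    ("bottom", "regular"): [47, 43],
    ("bottom", "wide"): [46, 40],
    ("middle", "regular"): [62, 68],
    ("middle", "wide"): [67, 71],
    ("top", "regular"): [41, 39],
    ("top", "wide"): [42, 46],
}

# Contrast: average of the two middle cells minus average of the other four.
coef = {cell: (0.5 if cell[0] == "middle" else -0.25) for cell in sales}

cell_mean = {cell: sum(ys) / len(ys) for cell, ys in sales.items()}
estimate = sum(coef[cell] * cell_mean[cell] for cell in sales)

# Pooled within-cell variance (MSE) and the standard error of the contrast.
sse = sum((v - cell_mean[cell]) ** 2 for cell, ys in sales.items() for v in ys)
df_error = sum(len(ys) - 1 for ys in sales.values())
mse = sse / df_error
std_err = math.sqrt(mse * sum(c ** 2 / len(sales[cell]) for cell, c in coef.items()))
t_stat = estimate / std_err

print("estimate:", estimate)
print("std err:", std_err, "t:", t_stat, "error df:", df_error)
```

The estimate should agree with the Estimate column of the SAS estimate statement, and the square of the t statistic should agree with the F value from the contrast statement.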

One Quantitative factor and one categorical
Plot the means vs the quantitative factor for each level of the categorical factor
Consider linear and quadratic terms for the quantitative factor (regression)
Consider different relationships for the different levels of the categorical factor
Lack of fit analysis can be useful

Two Quantitative factors
Plot the means
–vs A for each level of B
–vs B for each level of A
Consider linear and quadratic terms
Consider products to allow for interaction
Lack of fit analysis can be useful

One observation per cell
For Y_ijk we use
–i to denote the level of factor A
–j to denote the level of factor B
–k to denote the k-th observation in cell (i,j)
i = 1, ..., a levels of factor A
j = 1, ..., b levels of factor B
n = 1 observation in cell (i,j)

Factor effects model
μ_ij = μ + α_i + β_j
μ is the overall mean
α_i is the main effect of A
β_j is the main effect of B
Because we have only one observation per cell, we do not have enough information to estimate the interaction in the usual way; it is confounded with error

Constraints
Σ_i α_i = 0
Σ_j β_j = 0
SAS GLM Constraints
–α_a = 0 (1 constraint)
–β_b = 0 (1 constraint)

ANOVA Table
Source  df          SS   MS   F
A       a-1         SSA  MSA  MSA/MSE
B       b-1         SSB  MSB  MSB/MSE
Error   (a-1)(b-1)  SSE  MSE
Total   ab-1        SST  MST
The error df, (a-1)(b-1), would normally be the df for the interaction

Expected Mean Squares
Based on the model with no interaction
E(MSE) = σ^2
E(MSA) = σ^2 + b(Σ_i α_i^2)/(a-1)
E(MSB) = σ^2 + a(Σ_j β_j^2)/(b-1)
Here, α_i and β_j are defined with the usual factor effects constraints

KNNL Example
KNNL p 882
Y is the premium for auto insurance
A is the size of the city, a=3 levels: small, medium and large
B is the region, b=2 levels: East, West
n=1: the response is the premium charged by a particular company

The data
data a2;
  infile 'c:\...\CH20TA02.txt';
  input premium size region;
  if size=1 then sizea='1_small ';
  if size=2 then sizea='2_medium';
  if size=3 then sizea='3_large ';
run;
proc print data=a2;
run;

Output
Obs  premium  size  region  sizea
(numeric values were not preserved in this transcript; the sizea column reads 1_small, 1_small, 2_medium, 2_medium, 3_large, 3_large)

Proc glm
proc glm data=a2;
  class sizea region;
  model premium=sizea region / solution;
  means sizea region / tukey;
  output out=a3 p=muhat;
run;

Output
Class Level Information
Class   Levels  Values
sizea   3       1_small 2_medium 3_large
region  2       1 2
Number of observations 6

Output
Source  DF  MS  F  P
Model
Error   2   50
Total   5

Output
Source  DF  MS  F  P
sizea
region
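The sums of squares behind an n=1 ANOVA table can be sketched directly. The Python below is an illustration rather than the SAS used in the lecture, and the six premiums are an assumption (taken to be the KNNL insurance data; they reproduce the error line DF = 2, MS = 50 shown in the output).

```python
# Assumed premiums y[i][j]: rows = city size (small, medium, large),
# columns = region (East, West); one observation per cell.
y = [
    [140, 100],
    [210, 180],
    [220, 200],
]
a, b = len(y), len(y[0])

grand = sum(map(sum, y)) / (a * b)
row_mean = [sum(row) / b for row in y]                              # factor A (size) means
col_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]   # factor B (region) means

ssa = b * sum((m - grand) ** 2 for m in row_mean)
ssb = a * sum((m - grand) ** 2 for m in col_mean)
sst = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
sse = sst - ssa - ssb   # with n=1 this is what would otherwise be the interaction SS

df_a, df_b, df_e = a - 1, b - 1, (a - 1) * (b - 1)
msa, msb, mse = ssa / df_a, ssb / df_b, sse / df_e
f_a, f_b = msa / mse, msb / mse

print("A:", df_a, ssa, msa, f_a)
print("B:", df_b, ssb, msb, f_b)
print("Error:", df_e, sse, mse)
```

Because the design is balanced, SSE can be obtained by subtraction; with unequal cell sizes that shortcut would not hold.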

Output solution
Parameter        Estimate
Intercept        195 B
sizea 1_small    -90 B
sizea 2_medium   -15 B
sizea 3_large      0 B
region 1          30 B
region 2           0 B

Check vs predicted values (muhat)
reg  sizea  muhat
1    1_s    135 = 195 - 90 + 30
2    1_s    105 = 195 - 90
1    2_m    210 = 195 - 15 + 30
2    2_m    180 = 195 - 15
1    3_l    225 = 195 + 30
2    3_l    195 = 195
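The same check can be scripted, and the SAS estimates (last level set to zero) can be converted to the sum-to-zero factor effects used elsewhere in these slides. This is a minimal Python sketch using only the estimates printed in the solution output; the dictionary and variable names are illustrative.

```python
# Estimates from the SAS solution output (last level of each factor is the reference, set to 0).
intercept = 195
size_eff = {"1_small": -90, "2_medium": -15, "3_large": 0}
region_eff = {1: 30, 2: 0}

# Predicted value for each cell: intercept + size effect + region effect.
muhat = {
    (region, size): intercept + size_eff[size] + region_eff[region]
    for size in size_eff
    for region in region_eff
}

# Re-express under the constraints sum(alpha_i) = 0 and sum(beta_j) = 0
# by centering each set of estimates and moving the averages into mu.
size_bar = sum(size_eff.values()) / len(size_eff)
region_bar = sum(region_eff.values()) / len(region_eff)
mu = intercept + size_bar + region_bar
alpha = {size: est - size_bar for size, est in size_eff.items()}
beta = {region: est - region_bar for region, est in region_eff.items()}

for cell, value in muhat.items():
    print(cell, value)
print("mu:", mu, "alpha:", alpha, "beta:", beta)
```

Both parameterizations give identical predicted values; only the interpretation of the individual parameters changes.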

Multiple comparisons Size
Tukey Grouping  Mean  N  sizea
A                        3_large
A
A                        2_medium
B                        1_small
(the Mean and N values were not preserved in this transcript)

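Multiple comparisons Region
Mean  N  region (the two regions fall in different groups, A and B; values not preserved in this transcript)
The ANOVA results already told us that these were different
With only two levels there is a single pairwise comparison, so no multiple comparison adjustment is needed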

Plot the data
symbol1 v='E' i=join c=blue;
symbol2 v='W' i=join c=green;
title1 'Plot of the data';
proc gplot data=a3;
  plot premium*sizea=region;
run;

The plot
Possible interaction?

Plot the estimated model
symbol1 v='E' i=join c=blue;
symbol2 v='W' i=join c=green;
title1 'Plot of the model estimates';
proc gplot data=a3;
  plot muhat*sizea=region;
run;

The plot

Tukey test for additivity
We can't look at the usual interaction, but we can add one additional term (θ) to the model to look at a specific type of interaction
μ_ij = μ + α_i + β_j + θα_iβ_j
We use one degree of freedom to estimate θ
There are other variations, e.g., θ_iβ_j
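For reference, the one-degree-of-freedom test can be written in closed form. The block below is a sketch of the standard textbook form for a balanced n=1 layout; the symbol SS_θ is a label introduced here for illustration, with the estimated main effects defined as deviations of the factor-level means from the grand mean and SSE taken from the additive (no-interaction) fit.

```latex
\[
\hat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}, \qquad
\hat{\beta}_j = \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}
\]
\[
\hat{\theta} = \frac{\sum_i \sum_j \hat{\alpha}_i \hat{\beta}_j Y_{ij}}
                    {\sum_i \hat{\alpha}_i^2 \, \sum_j \hat{\beta}_j^2},
\qquad
SS_{\theta} = \frac{\left( \sum_i \sum_j \hat{\alpha}_i \hat{\beta}_j Y_{ij} \right)^2}
                   {\sum_i \hat{\alpha}_i^2 \, \sum_j \hat{\beta}_j^2}
\]
\[
F^* = \frac{SS_{\theta} / 1}{(SSE - SS_{\theta}) / (ab - a - b)}
\quad \text{compared with } F(1,\; ab - a - b)
\]
```

With a=3 and b=2 the denominator has ab - a - b = 1 degree of freedom, which is why the test has so little power in the insurance example.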

Step 1: Get muhat (grand mean)
proc glm data=a2;
  model premium=;
  output out=aall p=muhat;
proc print data=aall;
run;

Output Obs premium size region muhat

Step 2: Get muhatA (factor A means)
proc glm data=a2;
  class size;
  model premium=size;
  output out=aA p=muhatA;
proc print data=aA;
run;

Output Obs premium size region muhatA

Step 3: Find muhatB (factor B means)
proc glm data=a2;
  class region;
  model premium=region;
  output out=aB p=muhatB;
proc print data=aB;
run;

Output Obs premium size region muhatB

Step 4: Combine and compute
data a4;
  merge aall aA aB;
  alpha=muhatA-muhat;
  beta=muhatB-muhat;
  atimesb=alpha*beta;
proc print data=a4;
  var size region atimesb;
run;

Output Obs size region atimesb

Step 5: Run glm
proc glm data=a4;
  class size region;
  model premium=size region atimesb / solution;
  output out=a5 p=tukmuhat;
run;

Output
Source  DF  F Value  Pr > F
size
region
atimesb
Not significant: there is not enough evidence of this type of interaction, but this is a really small study (1 df for error)

Output
Par      Est     SE  t  P
Int      195 B
size1    -90 B
size2    -15 B
size3      0 B   ...
region1   30 B
region2    0 B   ...
atimesb
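Steps 1 through 5 can be mirrored outside SAS to see where the one-degree-of-freedom test comes from. The Python below is a minimal sketch under stated assumptions: the six premiums are assumed to be the KNNL insurance data (they reproduce the MSE of 50 from the additive fit), and the closed-form sum of squares is used in place of refitting the regression on atimesb, which gives the same result in a balanced layout.

```python
# Assumed premiums y[i][j]: rows = city size (small, medium, large),
# columns = region (East, West); one observation per cell.
y = [
    [140, 100],
    [210, 180],
    [220, 200],
]
a, b = len(y), len(y[0])
cells = [(i, j) for i in range(a) for j in range(b)]

# Steps 1-3: grand mean, factor A means, factor B means.
grand = sum(map(sum, y)) / (a * b)
row_mean = [sum(row) / b for row in y]
col_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

# Step 4: estimated main effects; their product alpha[i]*beta[j] plays the role of atimesb.
alpha = [m - grand for m in row_mean]
beta = [m - grand for m in col_mean]

# Step 5 (closed form): estimate theta and its one-df sum of squares.
cross = sum(alpha[i] * beta[j] * y[i][j] for i, j in cells)
denom = sum(x ** 2 for x in alpha) * sum(x ** 2 for x in beta)
theta_hat = cross / denom
ss_theta = cross ** 2 / denom

# Error SS from the additive model, then the remainder after removing the theta term.
sse_additive = sum((y[i][j] - row_mean[i] - col_mean[j] + grand) ** 2 for i, j in cells)
ss_rem = sse_additive - ss_theta
df_rem = a * b - a - b
f_stat = ss_theta / (ss_rem / df_rem)

print("theta_hat:", theta_hat)
print("SS theta:", ss_theta, "SS remainder:", ss_rem, "df:", df_rem)
print("F:", f_stat)
```

The F statistic is referred to an F distribution with 1 and ab - a - b degrees of freedom; with only one denominator degree of freedom here, even a fairly large F value is not significant, which matches the conclusion on the slide.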

Does this look more like the means plot?

Last slide
Read KNNL Chapters
We used program topic27.sas to generate the output for today