One way Analysis of Variance (ANOVA)

Comparing k Populations

The F test – for comparing k means
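Situation
We have k normal populations. Let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of population i, i = 1, 2, 3, …, k.
Note: we assume that the standard deviation is the same for each population: $\sigma_1 = \sigma_2 = \dots = \sigma_k = \sigma$.

We want to test
$$H_0: \mu_1 = \mu_2 = \dots = \mu_k$$
against
$$H_A: \mu_i \ne \mu_j \text{ for at least one pair } i, j.$$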

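The data
Assume we have collected data from each of the k populations.
Let $x_{i1}, x_{i2}, x_{i3}, \dots, x_{in_i}$ denote the $n_i$ observations from population i, i = 1, 2, 3, …, k.

Computing Formulae: Compute
1) $T_i = \sum_{j=1}^{n_i} x_{ij}$ (the total for sample i)
2) $G = \sum_{i=1}^{k} T_i$ (the grand total)
3) $N = \sum_{i=1}^{k} n_i$ (the total sample size)
4) $\sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2$
5) $\sum_{i=1}^{k} T_i^2 / n_i$

Then
1) $SS_{Total} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2 - G^2/N$
2) $SS_{Between} = \sum_{i=1}^{k} T_i^2/n_i - G^2/N$
3) $SS_{Within} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} T_i^2/n_i$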

Anova Table

Source    d.f.    Sum of Squares   Mean Square   F-ratio
Between   k - 1   SSBetween        MSBetween     MSB / MSW
Within    N - k   SSWithin         MSWithin
Total     N - 1   SSTotal
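The entries of the Anova table can be computed directly from the sample totals. The following is a minimal Python sketch, assuming the usual computing formulae (SSBetween = sum of T_i^2/n_i minus G^2/N, SSWithin = sum of squared observations minus sum of T_i^2/n_i). The function name `one_way_anova` and the small data set are hypothetical and chosen only for illustration.

```python
def one_way_anova(groups):
    """Return the one-way Anova table entries for a list of samples."""
    k = len(groups)                                  # number of populations
    N = sum(len(g) for g in groups)                  # total sample size
    totals = [sum(g) for g in groups]                # T_i, total of sample i
    G = sum(totals)                                  # grand total
    sum_sq = sum(x * x for g in groups for x in g)   # sum of squared observations
    sum_t2_n = sum(t * t / len(g) for t, g in zip(totals, groups))

    ss_between = sum_t2_n - G * G / N
    ss_within = sum_sq - sum_t2_n
    ss_total = sum_sq - G * G / N
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    return {
        "df": (k - 1, N - k, N - 1),
        "SS": (ss_between, ss_within, ss_total),
        "MS": (ms_between, ms_within),
        "F": ms_between / ms_within,
    }


# Hypothetical data: three samples of three observations each.
example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(example)
print(table)
```

The F-ratio returned here is then compared with the critical value of the F distribution with k - 1 and N - k degrees of freedom.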

Example
In the following example we are comparing weight gains resulting from the following six diets:
Diet 1 - High Protein, Beef
Diet 2 - High Protein, Cereal
Diet 3 - High Protein, Pork
Diet 4 - Low Protein, Beef
Diet 5 - Low Protein, Cereal
Diet 6 - Low Protein, Pork

Thus, since F exceeds the critical value of the F distribution with 5 and 54 degrees of freedom, we reject $H_0$.

Anova Table

Source    d.f.   Sum of Squares   Mean Square   F-ratio
Between   5      4612.933         922.587       4.3** (p = )
Within    54
Total     59

* - Significant at 0.05 (not 0.01)
** - Significant at 0.01

Equivalence of the F-test and the t-test when k = 2

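With k = 2 the F-test is equivalent to the two-sample t-test with pooled variance: the F statistic equals the square of the t statistic, $F = t^2$, and $F_\alpha(1, N - 2) = t_{\alpha/2}(N - 2)^2$, so both tests reject $H_0$ for exactly the same data.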
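The equivalence for k = 2 can be checked numerically. The sketch below, in Python, computes the pooled-variance two-sample t statistic and the one-way F-ratio for the same two samples and compares F with t squared. The function names and the data are hypothetical.

```python
import math


def pooled_t(x, y):
    """Two-sample t statistic using the pooled variance estimate."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)        # pooled variance
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def f_ratio(groups):
    """One-way Anova F-ratio, MSBetween / MSWithin."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (N - k))


# Hypothetical samples from two populations.
x = [4.0, 5.0, 6.0, 7.0]
y = [6.0, 8.0, 9.0]
t = pooled_t(x, y)
F = f_ratio([x, y])
print(t * t, F)  # the two values agree up to rounding error
```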

Factorial Experiments
Analysis of Variance

Dependent variable Y
k categorical independent variables A, B, C, … (the Factors)
Let
a = the number of categories of A
b = the number of categories of B
c = the number of categories of C
etc.

The Completely Randomized Design
We form the set of all treatment combinations: the set of all combinations of the levels of the k factors.
Total number of treatment combinations: t = abc…
In the completely randomized design, n experimental units (test animals, test plots, etc.) are randomly assigned to each treatment combination.
Total number of experimental units: N = nt = nabc…
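As an illustration of the randomization step, the following Python sketch enumerates the treatment combinations and assigns n units at random to each of them. The function name `randomize`, the unit labels 0, 1, 2, … and the fixed seed are assumptions made only for this example; the factor levels are those of the protein example in this deck.

```python
import itertools
import random


def randomize(levels, n, seed=0):
    """Randomly assign n experimental units to every treatment combination.

    levels: dict mapping each factor name to its list of categories.
    Returns a dict mapping each treatment combination to its unit labels.
    """
    combos = list(itertools.product(*levels.values()))   # t = abc... combinations
    units = list(range(n * len(combos)))                  # N = nt unit labels
    rng = random.Random(seed)                             # seeded for reproducibility
    rng.shuffle(units)
    return {
        combo: sorted(units[i * n:(i + 1) * n])
        for i, combo in enumerate(combos)
    }


levels = {"Level": ["High", "Low"], "Source": ["Beef", "Cereal", "Pork"]}
design = randomize(levels, n=10)
for combo, assigned in design.items():
    print(combo, assigned)
```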

The treatment combinations can be thought of as arranged in a k-dimensional rectangular block.
(Figure: a two-factor block with rows A = 1, 2, …, a and columns B = 1, 2, …, b.)

The Completely Randomized Design is called balanced if the number of observations per treatment combination is the same (n) for every treatment combination.
If the number of observations per treatment combination is unequal, the design is called unbalanced (resulting in a mathematically more complex analysis and computations).
If for some of the treatment combinations there are no observations, the design is called incomplete. (In this case it may happen that some of the parameters, main effects and interactions, cannot be estimated.)
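The three cases can be told apart from the number of observations in each treatment combination. A minimal Python sketch follows; the function name `classify_design` and the example counts are hypothetical.

```python
def classify_design(counts):
    """Classify a design from the observation count of each treatment combination.

    counts: dict mapping treatment combination -> number of observations.
    """
    values = list(counts.values())
    if any(c == 0 for c in values):
        return "incomplete"      # some combination has no observations
    if len(set(values)) == 1:
        return "balanced"        # same n for every combination
    return "unbalanced"          # all observed, but unequal counts


balanced = {("High", "Beef"): 10, ("High", "Pork"): 10, ("Low", "Beef"): 10}
unbalanced = {("High", "Beef"): 10, ("High", "Pork"): 8, ("Low", "Beef"): 10}
incomplete = {("High", "Beef"): 10, ("High", "Pork"): 0, ("Low", "Beef"): 10}
print(classify_design(balanced), classify_design(unbalanced), classify_design(incomplete))
```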

Example
In this example we are examining the effect of the level of protein A (High or Low) and the source of protein B (Beef, Cereal, or Pork) on weight gains (grams) in rats. We have n = 10 test animals randomly assigned to each of the k = 6 diets.

The k = 6 diets are the 6 = 2×3 Level-Source combinations:
High - Beef
High - Cereal
High - Pork
Low - Beef
Low - Cereal
Low - Pork

Table: Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork)
Columns: High Protein (Beef, Cereal, Pork) and Low Protein (Beef, Cereal, Pork); summary rows: Diet Mean and Std. Dev.

Treatment combinations

                    Source of Protein
Level of Protein    Beef      Cereal    Pork
High                Diet 1    Diet 2    Diet 3
Low                 Diet 4    Diet 5    Diet 6

Summary Table of Means
Rows: Level of Protein (High, Low, Overall); columns: Source of Protein (Beef, Cereal, Pork, Overall).

Profiles of the response relative to a factor
A graphical representation of the effect of a factor on a response variable (dependent variable).

Profile Y for A
(Figure: Y plotted against the levels 1, 2, 3, …, a of A.)
This could be for an individual case or averaged over a group of cases.
This could be for a specific level of another factor or averaged over the levels of another factor.

Profiles of Weight Gain for Source and Level of Protein

Example – Four factor experiment
Four factors are studied for their effect on Y (luster of paint film). The four factors are:
1) Film thickness (1 or 2 mils)
2) Drying conditions (Regular or Special)
3) Length of wash (10, 30, 40 or 60 minutes), and
4) Temperature of wash (92 ˚C or 100 ˚C)
Two observations of film luster (Y) are taken for each treatment combination.

The data is tabulated below:
Columns: Minutes; Regular Dry (92 ˚C, 100 ˚C); Special Dry (92 ˚C, 100 ˚C). Row blocks: 1-mil Thickness, 2-mil Thickness.

Definition: A factor is said to not affect the response if the profile of the factor is horizontal for all combinations of levels of the other factors. That is, there is no change in the response when you change the levels of the factor, and this is true for all combinations of levels of the other factors. Otherwise the factor is said to affect the response.

Profile Y for A – A affects the response
(Figure: profiles of Y against the levels 1, 2, 3, …, a of A, one profile for each level of B.)

Profile Y for A – no effect on the response
(Figure: profiles of Y against the levels 1, 2, 3, …, a of A, one profile for each level of B.)

Definition: Two (or more) factors are said to interact if changes in the response when you change the level of one factor depend on the level(s) of the other factor(s). In that case the profiles of the factor for different levels of the other factor(s) are not parallel. Otherwise the factors are said to be additive, and the profiles of the factor for different levels of the other factor(s) are parallel.

Interacting factors A and B
(Figure: profiles of Y against the levels 1, 2, 3, …, a of A, one profile for each level of B.)

Additive factors A and B
(Figure: profiles of Y against the levels 1, 2, 3, …, a of A, one profile for each level of B.)
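Parallel profiles can also be checked numerically. For a two-way table of cell means, the profiles are parallel exactly when every quantity (cell mean - row mean - column mean + overall mean) is zero. The Python sketch below applies this check; the function names and the two tables of means are hypothetical.

```python
def interaction_terms(means):
    """Cell mean - row mean - column mean + overall mean, for each cell.

    means[i][j] is the mean response at level i of A and level j of B.
    """
    a, b = len(means), len(means[0])
    row = [sum(r) / b for r in means]
    col = [sum(means[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(row) / a
    return [[means[i][j] - row[i] - col[j] + grand for j in range(b)]
            for i in range(a)]


def is_additive(means, tol=1e-9):
    """True when all interaction terms vanish, i.e. the profiles are parallel."""
    return all(abs(v) <= tol for r in interaction_terms(means) for v in r)


# Hypothetical cell means: the second row is the first shifted by 3 (parallel).
additive = [[10.0, 12.0, 15.0], [13.0, 15.0, 18.0]]
# Hypothetical cell means whose profiles cross (not parallel).
interacting = [[10.0, 12.0, 15.0], [13.0, 12.0, 11.0]]
print(is_additive(additive), is_additive(interacting))
```

With sample means in place of the true cell means the terms are never exactly zero, which is why a formal test for interaction is needed.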

If two (or more) factors interact, each factor affects the response.
If two (or more) factors are additive, it still remains to be determined whether the factors affect the response.
In factorial experiments we are interested in determining which factors affect the response and which groups of factors interact.

The testing in factorial experiments
Test the higher order interactions first. If an interaction is present, there is no need to test lower order interactions or main effects involving those factors: all factors in the interaction affect the response, and they interact. The testing continues with lower order interactions and main effects for factors which have not yet been determined to affect the response.

Models for factorial Experiments

The Single Factor Experiment
Situation
We have t = a treatment combinations. Let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of observations from treatment i, i = 1, 2, 3, …, a.
Note: we assume that the standard deviation is the same for each population: $\sigma_1 = \sigma_2 = \dots = \sigma_a = \sigma$.

The data
Assume we have collected data for each of the a treatments.
Let $y_{i1}, y_{i2}, y_{i3}, \dots, y_{in}$ denote the n observations for treatment i, i = 1, 2, 3, …, a.

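The model
Note:
$$y_{ij} = \mu_i + (y_{ij} - \mu_i) = \mu_i + \varepsilon_{ij}$$
where $\varepsilon_{ij} = y_{ij} - \mu_i$ has $N(0, \sigma^2)$ distribution. Also
$$\mu_i = \mu + (\mu_i - \mu) = \mu + \alpha_i$$
where $\mu = \frac{1}{a} \sum_{i=1}^{a} \mu_i$ (overall mean effect) and $\alpha_i = \mu_i - \mu$ (Effect of Factor A).
Note: $\sum_{i=1}^{a} \alpha_i = 0$ by their definition.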

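Model 1: $y_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean $\mu_i$ and variance $\sigma^2$.
Model 2: $y_{ij} = \mu_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$.
Model 3: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ (i = 1, …, a; j = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$ and $\sum_{i=1}^{a} \alpha_i = 0$.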
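The step from treatment means to an overall mean plus factor effects can be illustrated with a few lines of Python. The sketch assumes the usual definitions, overall mean = average of the treatment means and effect = treatment mean minus overall mean; the function name and the three treatment means are hypothetical.

```python
def factor_effects(treatment_means):
    """Split treatment means into an overall mean and the effects of Factor A."""
    a = len(treatment_means)
    mu = sum(treatment_means) / a                  # overall mean effect
    alpha = [m - mu for m in treatment_means]      # effect of each level of A
    return mu, alpha


# Hypothetical treatment means for a = 3 treatments.
mu, alpha = factor_effects([10.0, 12.0, 17.0])
print(mu, alpha, sum(alpha))  # the effects sum to zero by construction
```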

The Two Factor Experiment
Situation
We have t = ab treatment combinations. Let $\mu_{ij}$ and $\sigma$ denote the mean and standard deviation of observations from the treatment combination when A = i and B = j, i = 1, 2, 3, …, a, j = 1, 2, 3, …, b.

The data
Assume we have collected data (n observations) for each of the t = ab treatment combinations.
Let $y_{ij1}, y_{ij2}, y_{ij3}, \dots, y_{ijn}$ denote the n observations for treatment combination A = i, B = j, i = 1, 2, 3, …, a, j = 1, 2, 3, …, b.

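The model
Note:
$$y_{ijk} = \mu_{ij} + (y_{ijk} - \mu_{ij}) = \mu_{ij} + \varepsilon_{ijk}$$
where $\varepsilon_{ijk} = y_{ijk} - \mu_{ij}$ has $N(0, \sigma^2)$ distribution, and
$$\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$$
with
$$\mu = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} \mu_{ij}, \quad \mu_{i\cdot} = \frac{1}{b} \sum_{j=1}^{b} \mu_{ij}, \quad \mu_{\cdot j} = \frac{1}{a} \sum_{i=1}^{a} \mu_{ij},$$
$$\alpha_i = \mu_{i\cdot} - \mu, \quad \beta_j = \mu_{\cdot j} - \mu, \quad (\alpha\beta)_{ij} = \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu.$$
Note:
$$\sum_{i=1}^{a} \alpha_i = 0, \quad \sum_{j=1}^{b} \beta_j = 0, \quad \sum_{i=1}^{a} (\alpha\beta)_{ij} = 0, \quad \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0$$
by their definition.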

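Model:
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
Here $\mu$ is the Mean, $\alpha_i$ and $\beta_j$ are the Main effects, $(\alpha\beta)_{ij}$ is the Interaction Effect, and $\varepsilon_{ijk}$ is the Error,
where $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$ and
$$\sum_{i=1}^{a} \alpha_i = 0, \quad \sum_{j=1}^{b} \beta_j = 0, \quad \sum_{i=1}^{a} (\alpha\beta)_{ij} = 0, \quad \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0.$$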

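Maximum Likelihood Estimates
For the model $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $\varepsilon_{ijk}$ (i = 1, …, a; j = 1, …, b; k = 1, …, n) are independent Normal with mean 0 and variance $\sigma^2$, the estimates are
$$\hat{\mu} = \bar{y}_{\cdot\cdot\cdot}, \quad \hat{\alpha}_i = \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}, \quad \hat{\beta}_j = \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot},$$
$$\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot},$$
and
$$\hat{\sigma}^2 = \frac{1}{abn} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2,$$
where a bar with a dot in place of a subscript denotes the mean over that subscript.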

The maximum likelihood estimator of $\sigma^2$ is not unbiased (this is usually the case when estimating a variance).
The unbiased estimator results when we divide by ab(n - 1) instead of abn.

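The unbiased estimator of $\sigma^2$ is
$$s^2 = \frac{SS_{Error}}{ab(n - 1)} = MS_{Error}$$
where
$$SS_{Error} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2.$$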
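To make the estimates concrete, here is a minimal Python sketch for a balanced two-factor experiment. It assumes the usual estimates built from cell, row, column and overall means, and it returns the error variance estimate with both divisors, abn and ab(n - 1). The function name, the dictionary keys and the small data set are hypothetical.

```python
def two_factor_estimates(y):
    """Estimates for a balanced two-factor model.

    y[i][j] is the list of n observations for A = i, B = j.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    row = [sum(cell[i]) / b for i in range(a)]                         # means for A = i
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]    # means for B = j
    mu = sum(row) / a                                                  # overall mean
    ss_error = sum((v - cell[i][j]) ** 2
                   for i in range(a) for j in range(b) for v in y[i][j])
    return {
        "mu": mu,
        "alpha": [r - mu for r in row],
        "beta": [c - mu for c in col],
        "alpha_beta": [[cell[i][j] - row[i] - col[j] + mu for j in range(b)]
                       for i in range(a)],
        "sigma2_mle": ss_error / (a * b * n),             # divides by abn (biased)
        "s2_unbiased": ss_error / (a * b * (n - 1)),      # divides by ab(n - 1)
    }


# Hypothetical data with a = 2, b = 2 and n = 2 observations per cell.
data = [[[1.0, 3.0], [2.0, 4.0]],
        [[5.0, 7.0], [9.0, 11.0]]]
est = two_factor_estimates(data)
print(est)
```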

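Testing for Interaction:
We want to test $H_0: (\alpha\beta)_{ij} = 0$ for all i and j, against $H_A: (\alpha\beta)_{ij} \ne 0$ for at least one i and j.
The test statistic is
$$F = \frac{MS_{AB}}{MS_{Error}}$$
where
$$MS_{AB} = \frac{SS_{AB}}{(a - 1)(b - 1)}, \quad SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2.$$

We reject $H_0: (\alpha\beta)_{ij} = 0$ for all i and j, if
$$F > F_\alpha((a - 1)(b - 1),\ ab(n - 1)).$$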

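Testing for the Main Effect of A:
We want to test $H_0: \alpha_i = 0$ for all i, against $H_A: \alpha_i \ne 0$ for at least one i.
The test statistic is
$$F = \frac{MS_A}{MS_{Error}}$$
where
$$MS_A = \frac{SS_A}{a - 1}, \quad SS_A = nb \sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2.$$

We reject $H_0: \alpha_i = 0$ for all i, if
$$F > F_\alpha(a - 1,\ ab(n - 1)).$$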

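Testing for the Main Effect of B:
We want to test $H_0: \beta_j = 0$ for all j, against $H_A: \beta_j \ne 0$ for at least one j.
The test statistic is
$$F = \frac{MS_B}{MS_{Error}}$$
where
$$MS_B = \frac{SS_B}{b - 1}, \quad SS_B = na \sum_{j=1}^{b} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2.$$

We reject $H_0: \beta_j = 0$ for all j, if
$$F > F_\alpha(b - 1,\ ab(n - 1)).$$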

The ANOVA Table

Source   S.S.      d.f.             MS = SS/df   F
A        SSA       a - 1            MSA          MSA / MSError
B        SSB       b - 1            MSB          MSB / MSError
AB       SSAB      (a - 1)(b - 1)   MSAB         MSAB / MSError
Error    SSError   ab(n - 1)        MSError
Total    SSTotal   abn - 1

Computing Formulae
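One way to carry out the computations for the two-factor ANOVA table is sketched below in Python. It assumes a balanced design and uses the definitional sums of squares (squared deviations of row, column and cell means) rather than any particular shortcut formulae. The function name, the dictionary layout and the small data set are hypothetical.

```python
def two_way_anova(y):
    """ANOVA table for a balanced two-factor experiment.

    y[i][j] is the list of n observations for A = i, B = j.
    Returns a dict: source -> (SS, d.f., MS, F); F is None where not defined.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    row = [sum(cell[i]) / b for i in range(a)]
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(row) / a

    ss_a = n * b * sum((r - grand) ** 2 for r in row)
    ss_b = n * a * sum((c - grand) ** 2 for c in col)
    ss_ab = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_error = sum((v - cell[i][j]) ** 2
                   for i in range(a) for j in range(b) for v in y[i][j])
    ss_total = sum((v - grand) ** 2
                   for i in range(a) for j in range(b) for v in y[i][j])

    df_error = a * b * (n - 1)
    ms_error = ss_error / df_error

    def line(ss, df):
        return (ss, df, ss / df, (ss / df) / ms_error)

    return {
        "A": line(ss_a, a - 1),
        "B": line(ss_b, b - 1),
        "AB": line(ss_ab, (a - 1) * (b - 1)),
        "Error": (ss_error, df_error, ms_error, None),
        "Total": (ss_total, a * b * n - 1, None, None),
    }


# Hypothetical data with a = 2, b = 2 and n = 2 observations per cell.
data = [[[1.0, 3.0], [2.0, 4.0]],
        [[5.0, 7.0], [9.0, 11.0]]]
anova = two_way_anova(data)
for source, entries in anova.items():
    print(source, entries)
```

Each F value is compared with the critical value of the F distribution with the degrees of freedom of its source and ab(n - 1) error degrees of freedom, starting with the interaction AB.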
