Factorial Experiments Analysis of Variance Experimental Design.


Dependent variable: Y
k categorical independent variables A, B, C, … (the factors)
Let
– a = the number of categories of A
– b = the number of categories of B
– c = the number of categories of C
– etc.

The Completely Randomized Design
We form the set of all treatment combinations, that is, the set of all combinations of levels of the k factors.
– Total number of treatment combinations: t = abc…
In the completely randomized design, n experimental units (test animals, test plots, etc.) are randomly assigned to each treatment combination.
– Total number of experimental units: N = nt = nabc…
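The construction above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the unit labels 0..N-1 and the fixed seed are all assumptions made for the example.

```python
import itertools
import random

def treatment_combinations(levels):
    """Return every combination of factor levels (one tuple per treatment)."""
    return list(itertools.product(*levels))

def completely_randomized_design(levels, n, seed=0):
    """Randomly assign n experimental units to each treatment combination."""
    combos = treatment_combinations(levels)
    units = list(range(n * len(combos)))  # N = n * t units, labelled 0..N-1 (assumed labels)
    rng = random.Random(seed)             # fixed seed only to make the sketch repeatable
    rng.shuffle(units)
    # Consecutive slices of the shuffled units go to successive treatments.
    return {combo: sorted(units[i * n:(i + 1) * n]) for i, combo in enumerate(combos)}

# Two factors with a = 2 and b = 3 levels, n = 10 units per combination.
levels = [["High", "Low"], ["Beef", "Cereal", "Pork"]]
design = completely_randomized_design(levels, n=10)
```

Here t = 2 × 3 = 6 treatment combinations and N = 10 × 6 = 60 units, each unit appearing in exactly one group.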

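The treatment combinations can be thought of as arranged in a k-dimensional rectangular block.
[Figure: two-way grid with levels 1, 2, …, a of factor A against levels 1, 2, …, b of factor B]

[Figure: three-dimensional block with axes A, B and C]

Another way of representing the treatment combinations in a factorial experiment
[Figure: layout for factors A, B, C, D]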

Example
In this example we are examining the effect of
– the level of protein A (High or Low) and
– the source of protein B (Beef, Cereal, or Pork)
on weight gains Y (grams) in rats. We have n = 10 test animals randomly assigned to each of t = 6 diets.

The t = 6 diets are the 6 = 3×2 Level-Source combinations:
1. High - Beef
2. High - Cereal
3. High - Pork
4. Low - Beef
5. Low - Cereal
6. Low - Pork

Table: Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork)
[Table: one column per diet (High protein: Beef, Cereal, Pork; Low protein: Beef, Cereal, Pork), with the mean and standard deviation of each diet in the final rows]

Example – Four factor experiment
Four factors are studied for their effect on Y (luster of paint film). The four factors are:
1) Film thickness (1 or 2 mils)
2) Drying conditions (Regular or Special)
3) Length of wash (10, 30, 40 or 60 minutes), and
4) Temperature of wash (92 °C or 100 °C)
Two observations of film luster (Y) are taken for each treatment combination.

The data is tabulated below:
[Table: film luster by length of wash (minutes) within 1-mil and 2-mil thickness (rows), against drying condition (Regular Dry, Special Dry) and temperature (92 °C, 100 °C) (columns)]

Notation
Let the single observations be denoted by a single letter and a number of subscripts: $y_{ijk\ldots l}$
The number of subscripts is equal to (the number of factors) + 1.
1st subscript = level of first factor
2nd subscript = level of second factor
…
Last subscript denotes different observations on the same treatment combination

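Notation for Means
When averaging over one or several subscripts we put a “bar” above the letter and replace those subscripts by a dot (•).
Example: $\bar{y}_{241\bullet\bullet}$, the mean of all observations whose first three subscripts are 2, 4 and 1.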

Profile of a Factor
Plot of observation means vs. levels of the factor. The levels of the other factors may be held constant, or we may average over the other levels.

Definition: A factor is said to not affect the response if the profile of the factor is horizontal for all combinations of levels of the other factors: there is no change in the response when you change the levels of the factor (true for all combinations of levels of the other factors). Otherwise the factor is said to affect the response.

Definition: Two (or more) factors are said to interact if changes in the response when you change the level of one factor depend on the level(s) of the other factor(s). Profiles of the factor for different levels of the other factor(s) are not parallel. Otherwise the factors are said to be additive. Profiles of the factor for different levels of the other factor(s) are parallel.
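The "parallel profiles" criterion can be checked numerically on a two-way table of true cell means. The sketch below is an illustration only: the function name and the two small tables are hypothetical, and with sample means random error would have to be accounted for by a formal test rather than an exact comparison.

```python
def is_additive(cell_means, tol=1e-9):
    """True if the profiles (rows) of a two-way table of means are parallel.

    Rows are levels of one factor, columns are levels of the other.
    Profiles are parallel when each row differs from the first row by a constant.
    """
    first = cell_means[0]
    for row in cell_means[1:]:
        diffs = [x - y for x, y in zip(row, first)]
        if max(diffs) - min(diffs) > tol:
            return False
    return True

# Hypothetical tables of means (not data from the slides).
additive = [[10.0, 12.0, 15.0], [13.0, 15.0, 18.0]]     # second row = first row + 3
interacting = [[10.0, 12.0, 15.0], [13.0, 12.0, 11.0]]  # profiles cross
```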

If two (or more) factors interact, each factor affects the response. If two (or more) factors are additive, it still remains to be determined whether the factors affect the response. In factorial experiments we are interested in determining
– which factors affect the response and
– which groups of factors interact.

[Figure: profiles when factor A has no effect]

[Figure: profiles for additive factors A and B]

[Figure: profiles for interacting factors A and B]

The testing in factorial experiments
1. Test the higher order interactions first.
2. If an interaction is present, there is no need to test lower order interactions or main effects involving those factors. All factors in the interaction affect the response and they interact.
3. The testing continues with lower order interactions and main effects for factors which have not yet been determined to affect the response.

Example: Diet Example
Summary Table of Cell Means
[Table: mean weight gain by level of protein (High, Low, Overall) and source of protein (Beef, Cereal, Pork, Overall)]

Profiles of Weight Gain for Source and Level of Protein

Models for Factorial Experiments
Single Factor: A – a levels
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, i = 1,2,...,a; j = 1,2,...,n
where
$\mu$ = overall mean
$\alpha_i$ = effect on y of factor A when A = i
$\varepsilon_{ij}$ = random error – Normal, mean 0, std-dev. $\sigma$

[Figure: for each level i = 1, 2, 3, …, a of A, the observations $y_{i1}, y_{i2}, y_{i3}, \ldots, y_{in}$ come from a normal distribution]
Definitions: $\mu_i$ = mean of the observations at level i of A, so that $\mu_i = \mu + \alpha_i$.

Two Factor: A (a levels), B (b levels)
$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, i = 1,2,...,a; j = 1,2,...,b; k = 1,2,...,n
where
$\mu$ = overall mean
$\alpha_i$ = main effect of A
$\beta_j$ = main effect of B
$(\alpha\beta)_{ij}$ = interaction effect of A and B
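As a rough illustration of the two-factor model, the following sketch splits a table of cell means into an overall mean, main effects and interaction effects. The function name and the 2 × 3 table of means are hypothetical (they are not the diet data), and equal cell sizes are assumed so that simple averages of cell means are appropriate.

```python
def two_factor_effects(cell_means):
    """Decompose an a x b table of cell means into mu, alpha_i, beta_j, (alpha beta)_ij."""
    a, b = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    alpha = [m - grand for m in row_means]   # main effects of A
    beta = [m - grand for m in col_means]    # main effects of B
    inter = [[cell_means[i][j] - row_means[i] - col_means[j] + grand
              for j in range(b)] for i in range(a)]  # interaction effects
    return grand, alpha, beta, inter

# Hypothetical cell means, rows = levels of A, columns = levels of B.
means = [[10.0, 12.0, 17.0], [6.0, 8.0, 7.0]]
mu, alpha, beta, inter = two_factor_effects(means)
```

Each set of effects sums to zero over any of its subscripts, and mu + alpha[i] + beta[j] + inter[i][j] reproduces the cell mean.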

Table of Means

Table of Effects – Overall mean, Main effects, Interaction Effects

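Three Factor: A (a levels), B (b levels), C (c levels)
$y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$
$\phantom{y_{ijkl}} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$
i = 1,2,...,a; j = 1,2,...,b; k = 1,2,...,c; l = 1,2,...,n
Main effects: $\alpha_i, \beta_j, \gamma_k$
Two factor interactions: $(\alpha\beta)_{ij}, (\alpha\gamma)_{ik}, (\beta\gamma)_{jk}$
Three factor interaction: $(\alpha\beta\gamma)_{ijk}$
Random error: $\varepsilon_{ijkl}$

$\mu_{ijk}$ = the mean of y when A = i, B = j, C = k
$\phantom{\mu_{ijk}} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk}$
i = 1,2,...,a; j = 1,2,...,b; k = 1,2,...,c
Overall mean: $\mu$
Main effects: $\alpha_i, \beta_j, \gamma_k$
Two factor interactions: $(\alpha\beta)_{ij}, (\alpha\gamma)_{ik}, (\beta\gamma)_{jk}$
Three factor interaction: $(\alpha\beta\gamma)_{ijk}$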

[Figure: profiles over levels of A, one line per level of B, one panel per level of C: no interaction]

[Figure: profiles over levels of A, one line per level of B, one panel per level of C: A, B interact, no interaction with C]

[Figure: profiles over levels of A, one line per level of B, one panel per level of C: A, B, C interact]

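Four Factor:
$y_{ijklm} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \delta_l + (\alpha\delta)_{il} + (\beta\delta)_{jl} + (\alpha\beta\delta)_{ijl} + (\gamma\delta)_{kl} + (\alpha\gamma\delta)_{ikl} + (\beta\gamma\delta)_{jkl} + (\alpha\beta\gamma\delta)_{ijkl} + \varepsilon_{ijklm}$
$\phantom{y_{ijklm}} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\delta)_{il} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} + (\alpha\beta\gamma)_{ijk} + (\alpha\beta\delta)_{ijl} + (\alpha\gamma\delta)_{ikl} + (\beta\gamma\delta)_{jkl} + (\alpha\beta\gamma\delta)_{ijkl} + \varepsilon_{ijklm}$
i = 1,2,...,a; j = 1,2,...,b; k = 1,2,...,c; l = 1,2,...,d; m = 1,2,...,n
where
$0 = \sum \alpha_i = \sum \beta_j = \sum (\alpha\beta)_{ij} = \sum \gamma_k = \sum (\alpha\gamma)_{ik} = \sum (\beta\gamma)_{jk} = \sum (\alpha\beta\gamma)_{ijk} = \sum \delta_l = \sum (\alpha\delta)_{il} = \sum (\beta\delta)_{jl} = \sum (\alpha\beta\delta)_{ijl} = \sum (\gamma\delta)_{kl} = \sum (\alpha\gamma\delta)_{ikl} = \sum (\beta\gamma\delta)_{jkl} = \sum (\alpha\beta\gamma\delta)_{ijkl}$
and $\sum$ denotes the summation over any of the subscripts.
Overall mean: $\mu$
Main effects: $\alpha_i, \beta_j, \gamma_k, \delta_l$
Two factor interactions: $(\alpha\beta)_{ij}, (\alpha\gamma)_{ik}, (\beta\gamma)_{jk}, (\alpha\delta)_{il}, (\beta\delta)_{jl}, (\gamma\delta)_{kl}$
Three factor interactions: $(\alpha\beta\gamma)_{ijk}, (\alpha\beta\delta)_{ijl}, (\alpha\gamma\delta)_{ikl}, (\beta\gamma\delta)_{jkl}$
Four factor interaction: $(\alpha\beta\gamma\delta)_{ijkl}$
Random error: $\varepsilon_{ijklm}$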

Estimation of Main Effects and Interactions
Estimator of the main effect of a factor
= Mean at level i of the factor - Overall mean
Estimator of a k-factor interaction effect at a combination of levels of the k factors
= Mean at the combination of levels of the k factors
- sum of all means at combinations of levels of k-1 of the factors
+ sum of all means at combinations of levels of k-2 of the factors
- etc.

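Example: The main effect of factor B at level j in a four factor (A, B, C and D) experiment is estimated by:
$\hat{\beta}_j = \bar{y}_{\bullet j \bullet\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet\bullet\bullet}$
The two-factor interaction effect between factors B and C when B is at level j and C is at level k is estimated by:
$\widehat{(\beta\gamma)}_{jk} = \bar{y}_{\bullet jk \bullet\bullet} - \bar{y}_{\bullet j \bullet\bullet\bullet} - \bar{y}_{\bullet\bullet k \bullet\bullet} + \bar{y}_{\bullet\bullet\bullet\bullet\bullet}$

The three-factor interaction effect between factors B, C and D when B is at level j, C is at level k and D is at level l is estimated by:
$\widehat{(\beta\gamma\delta)}_{jkl} = \bar{y}_{\bullet jkl \bullet} - \bar{y}_{\bullet jk \bullet\bullet} - \bar{y}_{\bullet j \bullet l \bullet} - \bar{y}_{\bullet\bullet kl \bullet} + \bar{y}_{\bullet j \bullet\bullet\bullet} + \bar{y}_{\bullet\bullet k \bullet\bullet} + \bar{y}_{\bullet\bullet\bullet l \bullet} - \bar{y}_{\bullet\bullet\bullet\bullet\bullet}$
Finally, the four-factor interaction effect between factors A, B, C and D when A is at level i, B is at level j, C is at level k and D is at level l is estimated by:
$\widehat{(\alpha\beta\gamma\delta)}_{ijkl} = \bar{y}_{ijkl\bullet}$
$\quad - \bar{y}_{ijk\bullet\bullet} - \bar{y}_{ij\bullet l\bullet} - \bar{y}_{i\bullet kl\bullet} - \bar{y}_{\bullet jkl\bullet}$
$\quad + \bar{y}_{ij\bullet\bullet\bullet} + \bar{y}_{i\bullet k\bullet\bullet} + \bar{y}_{i\bullet\bullet l\bullet} + \bar{y}_{\bullet jk\bullet\bullet} + \bar{y}_{\bullet j\bullet l\bullet} + \bar{y}_{\bullet\bullet kl\bullet}$
$\quad - \bar{y}_{i\bullet\bullet\bullet\bullet} - \bar{y}_{\bullet j\bullet\bullet\bullet} - \bar{y}_{\bullet\bullet k\bullet\bullet} - \bar{y}_{\bullet\bullet\bullet l\bullet} + \bar{y}_{\bullet\bullet\bullet\bullet\bullet}$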
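The alternating-sign rule behind these estimators can be written once for any number of factors. The sketch below is an illustration under stated assumptions: the function names and the dictionary layout (a tuple of factor levels mapped to a cell mean) are choices made for this example, the table of means is hypothetical, and a balanced design is assumed so that averaging cell means gives the marginal means.

```python
from itertools import combinations

def marginal_mean(cells, fixed):
    """Average the cell means over all factors not listed in `fixed`.

    cells: dict mapping a tuple of factor levels to the cell mean.
    fixed: dict mapping a factor position to the level it is held at.
    """
    vals = [m for lv, m in cells.items() if all(lv[f] == v for f, v in fixed.items())]
    return sum(vals) / len(vals)

def effect(cells, fixed):
    """Estimate the main effect (one factor) or interaction (several factors) at `fixed`."""
    factors = list(fixed)
    k = len(factors)
    total = 0.0
    for size in range(k + 1):
        sign = (-1) ** (k - size)  # + for the full combination, alternating downwards
        for subset in combinations(factors, size):
            total += sign * marginal_mean(cells, {f: fixed[f] for f in subset})
    return total

# Hypothetical 2 x 3 table of cell means; factor 0 has 2 levels, factor 1 has 3.
means = [[10.0, 12.0, 17.0], [6.0, 8.0, 7.0]]
cells = {(i, j): means[i][j] for i in range(2) for j in range(3)}
main_a0 = effect(cells, {0: 0})         # mean at level 0 of factor 0 minus overall mean
inter_02 = effect(cells, {0: 0, 1: 2})  # two-factor interaction at levels (0, 2)
```

With an empty subset the marginal mean is the overall mean, which is why the same loop covers main effects and interactions of every order.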

ANOVA Table Entries
Sum of squares for the interaction (or main) effects being tested = (product of the sample size n and the numbers of levels of the factors not included in the interaction) × (sum of squares of the estimated effects being tested)
Degrees of freedom = df = product of (number of levels - 1) over the factors included in the interaction.
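For two factors, the rule above gives SS_A = nb Σ α̂_i², SS_B = na Σ β̂_j² and SS_AB = n ΣΣ (αβ)̂_ij². A minimal sketch for a balanced design follows; the function name, the nested-list data layout and the small data set are hypothetical, and p-values are left out because the Python standard library has no F distribution.

```python
def two_way_anova(data):
    """ANOVA entries for a balanced two-factor design.

    data[i][j] is the list of n >= 2 observations at level i of A and level j of B.
    Returns {source: (SS, df, MS, F)} with F = None for the error line.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(obs) / n for obs in row] for row in data]
    grand = sum(sum(row) for row in cell) / (a * b)
    row = [sum(cell[i]) / b for i in range(a)]
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    ss = {
        "A": n * b * sum((m - grand) ** 2 for m in row),
        "B": n * a * sum((m - grand) ** 2 for m in col),
        "AB": n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                      for i in range(a) for j in range(b)),
        "Error": sum((y - cell[i][j]) ** 2
                     for i in range(a) for j in range(b) for y in data[i][j]),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    ms_error = ss["Error"] / df["Error"]
    table = {}
    for src in ss:
        ms = ss[src] / df[src]
        f_ratio = None if src == "Error" else ms / ms_error
        table[src] = (ss[src], df[src], ms, f_ratio)
    return table

# Hypothetical data: a = 2, b = 2, n = 2 observations per cell.
data = [[[8, 10], [12, 14]], [[9, 11], [19, 21]]]
table = two_way_anova(data)
```

For this data the sums of squares are 32 (A), 98 (B), 18 (AB) and 8 (Error), which add up to the total sum of squares 156.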

Analysis of Variance (ANOVA) Table Entries (Two factors – A and B)

The ANOVA Table

Analysis of Variance (ANOVA) Table Entries (Three factors – A, B and C)

The ANOVA Table

The Completely Randomized Design is called balanced if the number of observations is the same for every treatment combination. If the number of observations per treatment combination is unequal, the design is called unbalanced (resulting in a mathematically more complex analysis and computations). If for some of the treatment combinations there are no observations, the design is called incomplete (some of the parameters - main effects and interactions - cannot be estimated).
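These three cases depend only on the cell frequencies, so they are easy to check before an analysis. A small sketch, with a hypothetical function name and hypothetical counts:

```python
def classify_design(counts):
    """Classify a factorial layout from its cell frequencies.

    counts: dict mapping each treatment combination to its number of observations.
    """
    values = list(counts.values())
    if any(c == 0 for c in values):
        return "incomplete"   # at least one treatment combination was never observed
    if len(set(values)) == 1:
        return "balanced"     # the same n in every cell
    return "unbalanced"

# Hypothetical cell counts for a 2 x 2 layout.
balanced = {("High", "Beef"): 10, ("High", "Pork"): 10, ("Low", "Beef"): 10, ("Low", "Pork"): 10}
unbalanced = {("High", "Beef"): 10, ("High", "Pork"): 8, ("Low", "Beef"): 10, ("Low", "Pork"): 9}
incomplete = {("High", "Beef"): 10, ("High", "Pork"): 0, ("Low", "Beef"): 10, ("Low", "Pork"): 9}
```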

Example: Diet example Mean =

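Main Effects for Factor A (Source of Protein)
[Table: estimated main effects for Beef, Cereal, Pork]

Main Effects for Factor B (Level of Protein)
[Table: estimated main effects for High, Low]

AB Interaction Effects
[Table: estimated interaction effects by level of protein (High, Low) and source of protein (Beef, Cereal, Pork)]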

Example 2 Paint Luster Experiment

Table: Means and Cell Frequencies

Means and Frequencies for the AB Interaction (Temp - Drying)

Profiles showing Temp-Dry Interaction

Means and Frequencies for the AD Interaction (Temp- Thickness)

Profiles showing Temp-Thickness Interaction

The Main Effect of C (Length)

The Randomized Block Design

Suppose a researcher is interested in how several treatments affect a continuous response variable (Y). The treatments may be the levels of a single factor or they may be the combinations of levels of several factors. Suppose we have available to us a total of N = nt experimental units to which we are going to apply the different treatments.

The Completely Randomized (CR) design randomly divides the experimental units into t groups of size n and randomly assigns a treatment to each group.

The Randomized Block Design divides the group of experimental units into n homogeneous groups of size t. These homogeneous groups are called blocks. The treatments are then randomly assigned to the experimental units in each block - one treatment to a unit in each block.
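The difference from the completely randomized design is that the shuffling is done separately inside each block. A minimal sketch, assuming hypothetical treatment labels and a fixed seed for repeatability:

```python
import random

def randomized_block_design(treatments, n_blocks, seed=0):
    """Return one random ordering of the treatments for each block.

    layout[j][u] is the treatment given to unit u of block j, so every
    treatment appears exactly once in every block.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # an independent randomization within each block
        layout.append(order)
    return layout

# Hypothetical example: t = 4 treatments in n = 5 blocks.
layout = randomized_block_design(["T1", "T2", "T3", "T4"], n_blocks=5)
```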

Experimental Designs
In many experiments we are interested in comparing a number of treatments (the treatments may be combinations of levels of several factors). The objective of experimental design is to reduce the magnitude of random error, resulting in more powerful tests to detect experimental effects.

The Completely Randomized Design
[Figure: treatments 1, 2, 3, …, t, with experimental units randomly assigned to treatments]

Randomized Block Design
[Figure: blocks of experimental units; all treatments appear once in each block]

The Model for a Randomized Block Experiment
$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, i = 1,2,…, t; j = 1,2,…, b
$y_{ij}$ = the observation in the j th block receiving the i th treatment
$\mu$ = overall mean
$\tau_i$ = the effect of the i th treatment
$\beta_j$ = the effect of the j th block
$\varepsilon_{ij}$ = random error
Here b is the number of blocks (written n in the description of the design above).

The ANOVA Table for a Randomized Block Experiment
Source   S.S.   d.f.         M.S.   F           p-value
Treat    SS_T   t-1          MS_T   MS_T/MS_E
Block    SS_B   b-1          MS_B   MS_B/MS_E
Error    SS_E   (t-1)(b-1)   MS_E

A randomized block experiment is assumed to be a two-factor experiment. The factors are blocks and treatments. There is one observation per cell. It is assumed that there is no interaction between blocks and treatments. The degrees of freedom for the interaction are used to estimate error.
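Because there is one observation per cell, the residual from the additive fit plays the role of the interaction term and supplies the error sum of squares. The following sketch computes the table entries; the function name, the data layout (treatments in rows, blocks in columns) and the numbers are hypothetical, and p-values are omitted because the standard library has no F distribution.

```python
def rbd_anova(y):
    """ANOVA entries for a randomized block experiment.

    y[i][j] is the single observation for treatment i in block j.
    Returns {source: (SS, df, MS, F)} with F = None for the error line.
    """
    t, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (t * b)
    treat = [sum(row) / b for row in y]
    block = [sum(y[i][j] for i in range(t)) / t for j in range(b)]
    ss_t = b * sum((m - grand) ** 2 for m in treat)
    ss_b = t * sum((m - grand) ** 2 for m in block)
    # Residual from the additive model: what a block-by-treatment interaction would absorb.
    ss_e = sum((y[i][j] - treat[i] - block[j] + grand) ** 2
               for i in range(t) for j in range(b))
    df_t, df_b, df_e = t - 1, b - 1, (t - 1) * (b - 1)
    ms_e = ss_e / df_e
    return {
        "Treat": (ss_t, df_t, ss_t / df_t, (ss_t / df_t) / ms_e),
        "Block": (ss_b, df_b, ss_b / df_b, (ss_b / df_b) / ms_e),
        "Error": (ss_e, df_e, ms_e, None),
    }

# Hypothetical data: t = 2 treatments (rows) in b = 3 blocks (columns).
y = [[10.0, 12.0, 11.0], [14.0, 15.0, 18.0]]
table = rbd_anova(y)
```

The three sums of squares add up to the total sum of squares about the overall mean, which is a quick check on the arithmetic.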

The matched pair design: an experimental design for comparing two treatments.
[Figure: experimental units grouped into pairs]
The matched pair design is a randomized block design for t = 2 treatments; each pair is a block.

Incomplete Block Designs

Randomized Block Design We want to compare t treatments Group the N = bt experimental units into b homogeneous blocks of size t. In each block we randomly assign the t treatments to the t experimental units in each block. The ability to detect treatment to treatment differences is dependent on the within block variability.

Comments
The within block variability generally increases with block size. The larger the block size, the larger the within block variability. For a larger number of treatments, t, it may not be appropriate or feasible to require the block size, k, to be equal to the number of treatments. If the block size, k, is less than the number of treatments (k < t), then not all treatments can appear in each block. The design is called an Incomplete Block Design.

Comments regarding Incomplete Block Designs
When two treatments appear together in the same block, it is possible to estimate the difference in treatment effects. The treatment difference is estimable. If two treatments do not appear together in the same block, it may not be possible to estimate the difference in treatment effects. The treatment difference may not be estimable.

Example
Consider the block design with 6 treatments and 6 blocks of size two. The treatment differences (1 vs 2, 1 vs 3, 2 vs 3, 4 vs 5, 4 vs 6, 5 vs 6) are estimable. If one of the treatments is in the group {1,2,3} and the other treatment is in the group {4,5,6}, the treatment difference is not estimable.

Definitions
Two treatments i and i* are said to be connected if there is a sequence of treatments $i_0 = i, i_1, i_2, \ldots, i_M = i^*$ such that each successive pair of treatments ($i_j$ and $i_{j+1}$) appears in the same block. In this case the treatment difference is estimable.
An incomplete design is said to be connected if all treatment pairs i and i* are connected. In this case all treatment differences are estimable.

Example Consider the block design with 5 treatments and 5 blocks of size two. This incomplete block design is connected. All treatment differences are estimable. Some treatment differences are estimated with a higher precision than others
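Connectedness is a graph property: join two treatments whenever they share a block and ask whether every treatment can be reached from any other. The sketch below does this with a depth-first search. The two block layouts are assumptions, since the designs shown on the slides are not preserved here; they were chosen to be consistent with the two examples (six blocks of size two with the estimable differences listed earlier, and a five-block cyclic arrangement).

```python
def is_connected(blocks, treatments):
    """True if every pair of treatments is linked by a chain of shared blocks."""
    adjacency = {t: set() for t in treatments}
    for block in blocks:
        for u in block:
            adjacency[u].update(v for v in block if v != u)
    start = next(iter(treatments))
    seen, stack = {start}, [start]
    while stack:  # depth-first search from an arbitrary treatment
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adjacency)

# Assumed layouts (the original design tables are not available).
six_blocks = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)]  # splits into {1,2,3} and {4,5,6}
five_blocks = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]         # a single cycle
disconnected = is_connected(six_blocks, range(1, 7))
connected = is_connected(five_blocks, range(1, 6))
```

In the cyclic layout, treatments that share a block are compared directly while others are compared only through a chain, which is one way to see why some differences are estimated with higher precision than others.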

Definition
An incomplete design is said to be a Balanced Incomplete Block Design
1. if all treatments appear in exactly r blocks. This ensures that each treatment is estimated with the same precision.
2. if all treatment pairs i and i* appear together in exactly λ blocks. This ensures that each treatment difference is estimated with the same precision. The value of λ is the same for each treatment pair.

Some Identities
Let
b = the number of blocks
t = the number of treatments
k = the block size
r = the number of times a treatment appears in the experiment
λ = the number of times a pair of treatments appears together in the same block
1. bk = rt
Both sides of this equation are found by counting the total number of experimental units in the experiment.
2. r(k-1) = λ(t – 1)
Both sides of this equation are found by counting the total number of experimental units that appear with a specific treatment in the experiment.

BIB Design
A Balanced Incomplete Block Design (b = 15, k = 4, t = 6, r = 10, λ = 6)
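The definition and the two identities can be checked mechanically for any proposed layout. The sketch below uses an assumed design, namely all 4-element subsets of six treatments, which is one design with these parameters; the layout actually shown on the slide is not preserved, and the function name is hypothetical.

```python
from itertools import combinations

def bibd_parameters(blocks, treatments):
    """Return (b, t, k, r, lam) if the blocks form a BIBD, otherwise None."""
    sizes = {len(block) for block in blocks}
    if len(sizes) != 1:
        return None  # blocks must all have the same size k
    k = sizes.pop()
    # r: number of blocks containing each treatment (must be constant).
    reps = {sum(tr in block for block in blocks) for tr in treatments}
    # lambda: number of blocks containing each pair of treatments (must be constant).
    pairs = {sum(u in block and v in block for block in blocks)
             for u, v in combinations(treatments, 2)}
    if len(reps) != 1 or len(pairs) != 1:
        return None
    return len(blocks), len(treatments), k, reps.pop(), pairs.pop()

# Assumed design: every subset of 4 of the 6 treatments is used as one block.
treatments = "ABCDEF"
blocks = list(combinations(treatments, 4))
b, t, k, r, lam = bibd_parameters(blocks, treatments)
```

For this layout the function returns b = 15, t = 6, k = 4, r = 10 and λ = 6, so bk = rt = 60 and r(k-1) = λ(t-1) = 30.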

An Example
A food processing company is interested in comparing the taste of six new brands (A, B, C, D, E and F) of cereal. For this purpose, subjects will be asked to taste and compare these cereals, scoring them on a scale. For practical reasons it is decided that each subject should be asked to taste and compare at most four of the six cereals. For this reason it is decided to use b = 15 subjects and a balanced incomplete block design to assess the differences in taste of the six brands of cereal.

The design and the data is tabulated below:

Analysis of Block Experiments

The purpose of such experiments is to estimate the effects of treatments applied to some material (experimental units, subjects etc.) grouped into relatively homogeneous groups (blocks). The variability within the groups (blocks) will be considerably less than if the subjects were left ungrouped. This will lead to a more powerful analysis for comparing the treatments.

The basic model for block experiments Suppose we have t treatments and b blocks of size k. The Assumption of Additivity

Let if treat n is applied to m th unit in j th block otherwise

Not of full rank

Now define the incidence matrix

Note if treat n is applied to m th unit in j th block otherwise Now

Thus Also and

Thus and

Finally

Summary: The Least Squares Estimates The Residual Sum of Squares

Hence
