Completely Randomized Design


Completely Randomized Design
1. Experimental Units (Subjects) Are Assigned Randomly to Treatments
   - Subjects Are Assumed Homogeneous
2. One Factor or Independent Variable
   - 2 or More Treatment Levels or Classifications
3. Analyzed by One-Way ANOVA
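The random assignment in point 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_crd`, the subject numbers, and the method labels are hypothetical, and equal replication per treatment is assumed.

```python
import random

def assign_crd(units, treatments, seed=None):
    """Randomly assign units to treatments with equal replication (CRD)."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    rng.shuffle(units)                      # the randomization step
    r = len(units) // len(treatments)       # replications per treatment
    return {trt: units[i * r:(i + 1) * r] for i, trt in enumerate(treatments)}

# Hypothetical labels: 9 subjects, 3 training methods.
plan = assign_crd(range(1, 10), ["Method 1", "Method 2", "Method 3"], seed=1)
for method, subjects in plan.items():
    print(method, sorted(subjects))
```

Fixing `seed` only makes the plan reproducible; in a real experiment the seed (or the random draw) would be recorded with the protocol.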

Randomized Design Example
Are the mean training times the same for 3 different methods?
9 subjects
3 methods (factor levels)

The Linear Model
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, $i = 1, 2, \dots, t$, $j = 1, 2, \dots, r$
$y_{ij}$ = the observation in the $i$th treatment and the $j$th replication
$\mu$ = overall mean
$\tau_i$ = the effect of the $i$th treatment
$\varepsilon_{ij}$ = random error

One-Way ANOVA F-Test
1. Tests the Equality of 2 or More (p) Population Means
2. Variables
   - One Nominal Scaled Independent Variable: 2 or More (p) Treatment Levels or Classifications
   - One Interval or Ratio Scaled Dependent Variable
3. Used to Analyze Completely Randomized Experimental Designs
Note: There is one dependent variable in the ANOVA model. MANOVA has more than one dependent variable. Ask: what are nominal and interval scales?

One-Way ANOVA F-Test Assumptions
1. Randomness & Independence of Errors
   - Independent Random Samples Are Drawn for Each Condition
2. Normality
   - Populations (for Each Condition) Are Normally Distributed
3. Homogeneity of Variance
   - Populations (for Each Condition) Have Equal Variances

One-Way ANOVA F-Test Hypotheses
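$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_t$
   - All Population Means Are Equal
   - No Treatment Effect
$H_a$: Not All $\mu_i$ Are Equal
   - At Least 1 Population Mean Is Different
   - Treatment Effect
   - This is NOT the same as $\mu_1 \ne \mu_2 \ne \dots \ne \mu_t$ (not every mean has to differ)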

One-Way ANOVA F-Test Hypotheses
[Figure: two panels of population densities f(X). Under $H_0$: $\mu_1 = \mu_2 = \mu_3$, so the three curves coincide. Under $H_a$: $\mu_1 = \mu_2 \ne \mu_3$, so one curve is shifted.]

Why Variances? Observe one sample from each treatment group
Their means may be slightly different.
How different is enough to conclude that the population means are different?
- It depends on the variability within each population.
- Higher variance in the population ⇒ higher variance in the sample means.
Statistical tests are conducted by comparing the variability between means to the variability within each sample.
Notes: In all cases shown, the population means are DIFFERENT, so we should reject $H_0$.
Panel A: Same treatment variation, different random variation (the standard deviations differ). As the variances WITHIN groups get bigger, we are less likely to reject equality of means; in Populations 4, 5, & 6 it is possible to draw a sample and falsely conclude that the population means are equal.
Panel B: Different treatment variation, same random variation (the standard deviations are equal). As the variances AMONG groups get bigger, we are more likely to conclude that the population means are different.

Two Possible Experiment Outcomes
[Figure, Panel A: same treatment variation, different random variation. With small random variation: reject equality of means! With large random variation: can't reject equality of means!]

Two More Possible Experiment Outcomes
[Figure: Panel A: same treatment variation, different random variation. Panel B: different treatment variation, same random variation. Outcome labels on the panels: Reject, Reject, Can't reject equality of means!]

One-Way ANOVA Basic Idea
1. Compares 2 Types of Variation to Test Equality of Means
2. Comparison Basis Is Ratio of Variances
3. If Treatment Variation Is Significantly Greater Than Random Variation, Then Means Are Not Equal
4. Variation Measures Are Obtained by 'Partitioning' Total Variation

One-Way ANOVA Partitions Total Variation

Total Variation is partitioned into:
- Variation due to treatment: Sum of Squares Among, Sum of Squares Between, Sum of Squares Treatment (SST), Among Groups Variation
- Variation due to random sampling: Sum of Squares Within, Sum of Squares Error (SSE), Within Groups Variation
Note: Variation due to random sampling is due to individual differences within groups.

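Total Variation
[Figure: response X for Groups 1, 2, and 3, with deviations measured from the grand mean $\bar{X}$.]

Treatment Variation
[Figure: group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ for Groups 1, 2, and 3, with deviations measured from the grand mean $\bar{X}$.]

Random (Error) Variation
[Figure: response X for Groups 1, 2, and 3, with deviations measured from each group's own mean $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$.]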

Thus, SS(Total) = SST + SSE
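The partition can be checked numerically. The sketch below uses made-up observations (three groups of three) and a hypothetical helper `ss_partition`; it computes the three sums of squares directly from their definitions and confirms that the total equals treatment plus error.

```python
def ss_partition(groups):
    """Return (SS_total, SST, SSE) for a list of groups of observations."""
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / len(all_obs)
    # Total: every observation against the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    # Treatment: each group mean against the grand mean, weighted by group size.
    sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Error: every observation against its own group mean.
    sse = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return ss_total, sst, sse

# Made-up data, chosen only so the arithmetic is easy to follow.
groups = [[10, 12, 14], [15, 17, 19], [20, 22, 24]]
ss_total, sst, sse = ss_partition(groups)
print(ss_total, sst, sse)  # 174.0 150.0 24.0
```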

One-Way ANOVA F-Test Test Statistic
1. Test Statistic
   - $F = MST / MSE$
   - MST Is Mean Square for Treatment
   - MSE Is Mean Square for Error
2. Degrees of Freedom
   - $\nu_1 = t - 1$
   - $\nu_2 = tr - t$
   - $t$ = # Populations, Groups, or Levels
   - $tr$ = Total Sample Size

One-Way ANOVA Summary Table
Source of    Degrees of    Sum of                   Mean Square           F
Variation    Freedom       Squares                  (Variance)
Treatment    t - 1         SST                      MST = SST/(t - 1)     MST/MSE
Error        tr - t        SSE                      MSE = SSE/(tr - t)
Total        tr - 1        SS(Total) = SST + SSE
Notes: n = sum of sample sizes of all populations (here n = tr); c = number of factor levels (here c = t). All values are positive. Why? (Squared terms.) Degrees of freedom and sums of squares are additive; mean squares are NOT.

ANOVA Table for a Completely Randomized Design
Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares               F
Treatments    SST        t - 1         MST = SST/(t - 1)     MST/MSE
Error         SSE        tr - t        MSE = SSE/(tr - t)
Total         SSTot      tr - 1

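The F distribution
Two parameters, $\nu_1$ and $\nu_2$: increasing either one decreases $F_\alpha$ (except for $\nu_2 < 3$), i.e., the distribution gets squashed to the left.
[Figure: F density with upper-tail area $\alpha$ to the right of $F_\alpha(\nu_1, \nu_2)$.]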

One-Way ANOVA F-Test Critical Value
If means are equal, $F = MST / MSE \approx 1$. Only reject large F!
Always One-Tail!
[Figure: F density; Do Not Reject $H_0$ to the left of $F_\alpha(t - 1, tr - t)$, Reject $H_0$ (area $\alpha$) to the right.]

Example: Home Products, Inc.
Completely Randomized Design Home Products, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed. In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. The number of times each car went through the carwash is shown on the next slide. Home Products, Inc. must decide which wax to market. Are the three waxes equally effective?

                   Wax       Wax       Wax
Observation        Type 1    Type 2    Type 3
1-5                (values not shown)
Sample Mean        29.0      30.4      30.0
Sample Variance    2.5       3.3       2.5

Hypotheses
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of washes for Type 1 wax
$\mu_2$ = mean number of washes for Type 2 wax
$\mu_3$ = mean number of washes for Type 3 wax

Mean Square Between Treatments
Since the sample sizes are all equal:
$\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3 = (29 + 30.4 + 30)/3 = 29.8$
$SSTR = 5(29 - 29.8)^2 + 5(30.4 - 29.8)^2 + 5(30 - 29.8)^2 = 5.2$
$MSTR = 5.2/(3 - 1) = 2.6$
Mean Square Error
$SSE = 4(2.5) + 4(3.3) + 4(2.5) = 33.2$
$MSE = 33.2/(15 - 3) = 2.77$

Rejection Rule
Using test statistic: Reject $H_0$ if $F > 3.89$
Using p-value: Reject $H_0$ if p-value < .05
where $F_{.05} = 3.89$ is based on an F distribution with 2 numerator degrees of freedom and 12 denominator degrees of freedom
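The critical value is normally read from an F table or from software. As a cross-check that needs no table, the sketch below uses a closed form that holds only when the numerator degrees of freedom equal 2, as they do here: $P(F > x) = (1 + 2x/\nu_2)^{-\nu_2/2}$, which can be solved for $x$. The function name is hypothetical, and this is a substitute for the table lookup, not the method used in the slides.

```python
def f_critical_two_numerator_df(alpha, df_error):
    """Upper-alpha critical value of an F(2, df_error) distribution.

    Solves (1 + 2x/df_error) ** (-df_error / 2) = alpha for x.
    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)

print(round(f_critical_two_numerator_df(0.05, 12), 2))  # 3.89
```

For other numerator degrees of freedom there is no such shortcut, and a statistical library or table is needed.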

Test Statistic
$F = MSTR/MSE = 2.6/2.77 = .939$
Conclusion
Since $F = .939 < F_{.05} = 3.89$, we cannot reject $H_0$. There is insufficient evidence to conclude that the mean numbers of washes for the three wax types are not all the same.
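The wax calculation can be reproduced from the summary statistics alone (sample means 29, 30.4, 30; sample variances 2.5, 3.3, 2.5; 5 cars per wax). The helper `anova_from_summary` is a hypothetical name, and the shortcut for the grand mean assumes equal group sizes.

```python
def anova_from_summary(means, variances, n_per_group):
    """One-way ANOVA quantities from group means and variances (equal group sizes)."""
    t = len(means)
    grand_mean = sum(means) / t                     # valid because group sizes are equal
    sstr = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    sse = (n_per_group - 1) * sum(variances)        # (n - 1) * s^2 summed over groups
    mstr = sstr / (t - 1)
    mse = sse / (t * n_per_group - t)
    return {"SSTR": sstr, "SSE": sse, "MSTR": mstr, "MSE": mse, "F": mstr / mse}

result = anova_from_summary([29.0, 30.4, 30.0], [2.5, 3.3, 2.5], 5)
print({name: round(value, 3) for name, value in result.items()})
```

Without rounding MSE to 2.77 first, F comes out at about 0.94, which leads to the same conclusion.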

ANOVA Table
Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Squares    F
Treatments    5.2        2             2.60       .939
Error         33.2       12            2.77
Total         38.4       14

Using Excel’s ANOVA: Single Factor Tool
Value Worksheet (top portion)

Value Worksheet (bottom portion)

Conclusion Using the p-Value
The value worksheet shows a p-value of .418.
The rejection rule is “Reject $H_0$ if p-value < .05”.
Because .418 > .05, we cannot reject $H_0$. There is insufficient evidence to conclude that the mean numbers of washes for the three wax types are not all the same.
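The p-value reported by Excel can be cross-checked by hand in this particular case, because with 2 numerator degrees of freedom the upper tail of the F distribution has the closed form $P(F > x) = (1 + 2x/\nu_2)^{-\nu_2/2}$. The function name below is hypothetical; the shortcut does not apply to other numerator degrees of freedom.

```python
def f_pvalue_two_numerator_df(f_stat, df_error):
    """P(F > f_stat) for an F(2, df_error) distribution (numerator df must be 2)."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)

f_stat = 2.6 / (33.2 / 12)          # MSTR / MSE from the wax example
p_value = f_pvalue_two_numerator_df(f_stat, 12)
print(round(p_value, 3))            # 0.418
```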

RCBD (Randomized Complete Block Design)

Randomized Complete Block Design
An experimental design in which there is one independent variable and a second variable, known as a blocking variable, that is used to control for confounding or concomitant variables.
It is used when the experimental units or materials are heterogeneous.
The experimental units or materials are blocked so as to keep the variability within a block as small as possible and to maximize the differences among blocks.
Each block (group) should consist of units or materials that are as uniform as possible.

Randomized Complete Block Design
Confounding or concomitant variables are not controlled by the analyst but can have an effect on the outcome of the treatment being studied.
A blocking variable is a variable that the analyst wants to control but that is not the treatment variable of interest.
A repeated measures design is a randomized block design in which each block level is an individual item or person, and that item or person is measured across all treatments.

The Blocking Principle
- Blocking is a technique for dealing with nuisance factors.
- A nuisance factor is a factor that probably has some effect on the response but is of no interest to the experimenter; however, the variability it transmits to the response needs to be minimized.
- Typical nuisance factors include batches of raw material, operators, pieces of test equipment, time (shifts, days, etc.), and different experimental units.
- Many industrial experiments involve blocking (or should).
- Failure to block is a common flaw in designing an experiment (consequences?).

The Blocking Principle
- If the nuisance factor is known and controllable, we use blocking.
- If the nuisance factor is known and uncontrollable, sometimes we can use the analysis of covariance (see Chapter 14) to statistically remove its effect from the analysis.
- If the nuisance factor is unknown and uncontrollable (a “lurking” variable), we hope that randomization balances out its impact across the experiment.
- Sometimes several sources of variability are combined in a block, so the block becomes an aggregate variable.
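When blocking is used, randomization happens separately inside each block, so that every block receives every treatment exactly once. A minimal sketch, with hypothetical block and treatment labels and a hypothetical function name:

```python
import random

def assign_rcbd(blocks, treatments, seed=None):
    """Return {block: treatment order}, shuffling independently within each block."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)          # restricted randomization: only within the block
        plan[block] = order
    return plan

# Hypothetical example: 4 batches of raw material (blocks), 3 treatments.
plan = assign_rcbd(["Batch 1", "Batch 2", "Batch 3", "Batch 4"], ["A", "B", "C"], seed=7)
for block, order in plan.items():
    print(block, order)
```

Contrast this with a completely randomized design, where a single shuffle is applied to all units at once and a block could end up with an unbalanced set of treatments.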

Partitioning the Total Sum of Squares in the Randomized Block Design
SStotal (total sum of squares) = SST (treatment sum of squares) + SSE (error sum of squares)
SSE (error sum of squares) = SSB (sum of squares blocks) + SSE’ (sum of squares error)

A Randomized Block Design
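[Figure: layout of a randomized block design. Columns: levels of the single independent variable (treatments). Rows: levels of the blocking variable. Cells: individual observations.]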

The Linear Model
$y_{ij} = \mu + \tau_i + \rho_j + \varepsilon_{ij}$, $i = 1, 2, \dots, t$, $j = 1, 2, \dots, r$
$y_{ij}$ = the observation in the $i$th treatment in the $j$th block
$\mu$ = overall mean
$\tau_i$ = the effect of the $i$th treatment
$\rho_j$ = the effect of the $j$th block
$\varepsilon_{ij}$ = random error
No interaction between blocks and treatments

Extension of the ANOVA to the RCBD
ANOVA partitioning of total variability:
$SS_{Tot} = SST + SSB + SSE$

Extension of the ANOVA to the RCBD
The degrees of freedom for the sums of squares are as follows:
$tr - 1 = (t - 1) + (r - 1) + (t - 1)(r - 1)$
Ratios of sums of squares to their degrees of freedom result in mean squares, and the ratio of the mean square for treatments to the error mean square, $F = MST/MSE$, is an F statistic used to test the hypothesis of equal treatment means.

ANOVA Procedure
The ANOVA procedure for the randomized block design requires us to partition the total sum of squares (SSTot) into three groups: sum of squares due to treatments (SST), sum of squares due to blocks (SSB), and sum of squares due to error (SSE).
The formula for this partitioning is SSTot = SST + SSB + SSE
The total degrees of freedom, nT - 1, are partitioned such that k - 1 degrees of freedom go to treatments, b - 1 go to blocks, and (k - 1)(b - 1) go to the error term. Here k = t is the number of treatments, b = r is the number of blocks, and nT = tr is the total number of observations.
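The three-way partition can be verified on a small table. The data below are made up (3 blocks by 3 treatments, one observation per cell) and the function name is hypothetical; each sum of squares is computed from its own definition, and the last line checks that they add up to the total.

```python
def rcbd_partition(table):
    """table[j][i] = observation for treatment i in block j (one per cell).

    Returns (SS_total, SS_treatments, SS_blocks, SS_error).
    """
    r, t = len(table), len(table[0])                 # r blocks, t treatments
    grand = sum(sum(row) for row in table) / (r * t)
    trt_means = [sum(row[i] for row in table) / r for i in range(t)]
    blk_means = [sum(row) / t for row in table]
    ss_tot = sum((y - grand) ** 2 for row in table for y in row)
    sst = r * sum((m - grand) ** 2 for m in trt_means)
    ssb = t * sum((m - grand) ** 2 for m in blk_means)
    # Error: what is left in each cell after removing treatment and block effects.
    sse = sum((table[j][i] - trt_means[i] - blk_means[j] + grand) ** 2
              for j in range(r) for i in range(t))
    return ss_tot, sst, ssb, sse

# Made-up observations: rows are blocks, columns are treatments.
table = [[10, 12, 14],
         [11, 15, 16],
         [15, 15, 18]]
ss_tot, sst, ssb, sse = rcbd_partition(table)
print(ss_tot, sst, ssb, sse)  # 52.0 24.0 24.0 4.0
assert abs(ss_tot - (sst + ssb + sse)) < 1e-9
```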

ANOVA Table for a Randomized Block Design
Source of     Sum of     Degrees of        Mean
Variation     Squares    Freedom           Squares                       F
Treatments    SST        t - 1             MST = SST/(t - 1)             MST/MSE
Blocks        SSB        r - 1
Error         SSE        (t - 1)(r - 1)    MSE = SSE/[(t - 1)(r - 1)]
Total         SSTot      tr - 1

Example: Eastern Oil Co.
Randomized Block Design Eastern Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends. Five automobiles have been tested using each of the three gasoline blends and the miles per gallon ratings are shown on the next slide.

Automobile    Type of Gasoline (Treatment)        Block
(Block)       Blend X    Blend Y    Blend Z       Means
1-5           (values not shown)
Treatment Means

Mean Square Due to Treatments
The overall sample mean is 29. Thus,
SST = 5[( )^2 + ( )^2 + ( )^2] = 5.2
MST = 5.2/(3 - 1) = 2.6
Mean Square Due to Blocks
SSB = 3[( )^2 + ... + ( )^2] = 51.33
MSB = 51.33/(5 - 1) = 12.8
Mean Square Due to Error
SSE = 5.47
MSE = 5.47/[(3 - 1)(5 - 1)] = .68

Rejection Rule
Using test statistic: Reject $H_0$ if $F > 4.46$
Using p-value: Reject $H_0$ if p-value < .05
Assuming $\alpha = .05$, $F_{.05} = 4.46$ (2 d.f. numerator and 8 d.f. denominator)

Test Statistic
$F = MST/MSE = 2.6/.68 = 3.82$
Conclusion
Since 3.82 < 4.46, we cannot reject $H_0$. There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.
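The mean squares and the F statistic follow mechanically from the three sums of squares quoted for this example (SST = 5.2, SSB = 51.33, SSE = 5.47, with t = 3 blends and r = 5 automobiles). The function name below is hypothetical.

```python
def rcbd_f_test(sst, ssb, sse, t, r):
    """Mean squares and treatment F statistic for a randomized block design."""
    mst = sst / (t - 1)
    msb = ssb / (r - 1)
    mse = sse / ((t - 1) * (r - 1))
    return mst, msb, mse, mst / mse

mst, msb, mse, f_stat = rcbd_f_test(5.2, 51.33, 5.47, t=3, r=5)
print(round(mst, 2), round(msb, 2), round(mse, 2), round(f_stat, 2))
print("reject H0" if f_stat > 4.46 else "cannot reject H0")
```

Using the unrounded MSE gives F of about 3.80 rather than 3.82; either way the statistic stays below the critical value 4.46.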

Using Excel’s Anova: Two-Factor Without Replication Tool
Step 1 Select the Tools pull-down menu
Step 2 Choose the Data Analysis option
Step 3 Choose Anova: Two Factor Without Replication from the list of Analysis Tools
Step 4 When the Anova: Two Factor Without Replication dialog box appears:
   Enter A1:D6 in the Input Range box
   Select Labels
   Enter .05 in the Alpha box
   Select Output Range
   Enter A8 (your choice) in the Output Range box
   Click OK

Value Worksheet (top portion)

Value Worksheet (middle portion)

Value Worksheet (bottom portion)

Conclusion Using the p-Value
The value worksheet shows the p-value.
The rejection rule is “Reject $H_0$ if p-value < .05”.
Thus, we cannot reject $H_0$ because the p-value > $\alpha$ = .05.
There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.

Similarities and differences between CRD and RCBD: Procedures
- RCBD: every level of “treatment” is encountered by each experimental unit (block); CRD: each experimental unit encounters just one level
- Descriptive statistics and graphical display: the same as CRD
- Model adequacy checking procedure: the same, except that the RCBD specifically assumes NO Block x Treatment interaction
- ANOVA: inclusion of the Block effect; the error df changes from t(r - 1) to (t - 1)(r - 1)

Latin Square Design

Definition
A Latin square is a square array of objects (letters A, B, C, …) such that each object appears once and only once in each row and each column.
Example - 4 x 4 Latin Square.
A B C D
B C D A
C D A B
D A B C
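The 4 x 4 example is a cyclic square: each row is the previous one shifted left by one position. The sketch below builds such a square for any t and checks the defining property; both function names are hypothetical.

```python
def cyclic_latin_square(t):
    """t x t Latin square in which cell (i, j) holds letter number (i + j) mod t."""
    letters = [chr(ord("A") + k) for k in range(t)]
    return [[letters[(i + j) % t] for j in range(t)] for i in range(t)]

def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(t))
    return len(symbols) == t and rows_ok and cols_ok

square = cyclic_latin_square(4)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))  # True
```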

The Latin Square Design
- This design is used to simultaneously control (or eliminate) two sources of nuisance variability.
- It is called “Latin” because we usually specify the treatments by Latin letters, and “Square” because it always has the same number of levels (t) for the row and column nuisance factors.
- A significant assumption is that the three factors (treatments and two nuisance factors) do not interact.
- It is more restrictive than the RCBD.
- Each treatment appears once and only once in each row and column.
- If you can block on two (perpendicular) sources of variation (rows x columns), you can reduce experimental error when compared to the RCBD.

Advantages and Disadvantages
Advantages:
- Allows the experimenter to control two sources of variation
Disadvantages:
- Error degrees of freedom (df) are small if there are only a few treatments
- The experiment becomes very large if the number of treatments is large
- The statistical analysis is complicated by missing plots and mis-assigned treatments
Another variation: use a smaller Latin square (3 x 3 or 4 x 4), but repeat it. The analysis can then be combined across the two replicates.

Selected Latin Squares
3 x 3      4 x 4
A B C      A B C D    A B C D    A B C D    A B C D
B C A      B A D C    B C D A    B D A C    B A D C
C A B      C D B A    C D A B    C A D B    C D A B
           D C A B    D A B C    D C B A    D C B A

5 x 5          6 x 6
A B C D E      A B C D E F
B A E C D      B F D C A E
C D A E B      C D E F B A
D E B A C      D A F E C B
E C D B A      E C A B F D
               F E B A D C
(The row E C A B F D of the 6 x 6 square is missing from the transcript; it is the only row that completes the square, since each column lacks exactly one letter.)

In a Latin square you have three factors:
- Treatments (t) (letters A, B, C, …)
- Rows (t)
- Columns (t)
The number of treatments = the number of rows = the number of columns = t.
The row-column combinations are represented by cells in a t x t array.
The treatments are assigned to row-column combinations using a Latin-square arrangement.
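One common way to randomize the arrangement is to start from a standard square and then randomly permute its rows, its columns, and the assignment of treatments to letters. The sketch below does this with a cyclic starting square; the function name and treatment labels are hypothetical, and this procedure is an assumption for illustration (it does not sample uniformly from all possible Latin squares).

```python
import random

def randomized_latin_square(treatments, seed=None):
    """Randomize a cyclic Latin square by shuffling rows, columns, and labels."""
    rng = random.Random(seed)
    t = len(treatments)
    labels = list(treatments)
    rng.shuffle(labels)             # random treatment-to-letter assignment
    rows = list(range(t))
    rng.shuffle(rows)               # random row order
    cols = list(range(t))
    rng.shuffle(cols)               # random column order
    return [[labels[(i + j) % t] for j in cols] for i in rows]

# Hypothetical treatment labels; rows and columns are the two blocking factors.
plan = randomized_latin_square(["A", "B", "C", "D", "E"], seed=3)
for row_number, row in enumerate(plan, start=1):
    print(f"Row {row_number}:", " ".join(row))
```

Permuting rows or columns of a Latin square always yields another Latin square, which is why the once-per-row, once-per-column property survives the shuffling.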

Example A courier company is interested in deciding between five brands (D,P,F,C and R) of car for its next purchase of fleet cars. The brands are all comparable in purchase price. The company wants to carry out a study that will enable them to compare the brands with respect to operating costs. For this purpose they select five drivers (Rows). In addition the study will be carried out over a five week period (Columns = weeks).

Each week a driver is assigned to a car using randomization and a Latin Square Design.
The average cost per mile is recorded at the end of each week and is tabulated below:

The Linear Model
$y_{ij(k)} = \mu + \tau_k + \rho_i + \gamma_j + \varepsilon_{ij(k)}$, $i = 1, 2, \dots, t$, $j = 1, 2, \dots, t$, $k = 1, 2, \dots, t$
$y_{ij(k)}$ = the observation in the $i$th row and the $j$th column receiving the $k$th treatment
$\mu$ = overall mean
$\tau_k$ = the effect of the $k$th treatment
$\rho_i$ = the effect of the $i$th row
$\gamma_j$ = the effect of the $j$th column
$\varepsilon_{ij(k)}$ = random error
No interaction between rows, columns and treatments

A Latin Square experiment is assumed to be a three-factor experiment.
The factors are rows, columns and treatments. It is assumed that there is no interaction between rows, columns and treatments. The degrees of freedom for the interactions are used to estimate error.

The Anova Table for a Latin Square Experiment
Source    S.S.     d.f.              M.S.     F            p-value
Treat     SST      t - 1             MST      MST/MSE
Rows      SSRow    t - 1             MSRow    MSRow/MSE
Cols      SSCol    t - 1             MSCol    MSCol/MSE
Error     SSE      (t - 1)(t - 2)    MSE
Total              t^2 - 1
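The degrees of freedom in this table depend only on t: treatments, rows, and columns each take t - 1, and the error term takes what is left of the t^2 - 1 total. A small sketch (hypothetical function name) that tabulates them and checks the bookkeeping:

```python
def latin_square_df(t):
    """Degrees of freedom for each line of the Latin square ANOVA table."""
    df = {
        "Treat": t - 1,
        "Rows": t - 1,
        "Cols": t - 1,
        "Error": (t - 1) * (t - 2),
        "Total": t * t - 1,
    }
    # The four components must add up to the total.
    assert df["Treat"] + df["Rows"] + df["Cols"] + df["Error"] == df["Total"]
    return df

print(latin_square_df(5))
# {'Treat': 4, 'Rows': 4, 'Cols': 4, 'Error': 12, 'Total': 24}
```

With t = 3 the error term has only 2 degrees of freedom, which illustrates why small Latin squares are often replicated.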

The Anova Table for Example
Source    S.S.    d.f.    M.S.    F        p-value
Week              4               16.06    0.0001
Driver                            21.79    0.0000
Car                               22.24
Error             12
Total             24

Example
In this experiment we are again interested in how weight gain (Y) in rats is affected by Source of Protein (Beef, Cereal, and Pork) and by Level of Protein (High or Low). There are a total of t = 3 x 2 = 6 treatment combinations of the two factors:
- Beef - High Protein
- Cereal - High Protein
- Pork - High Protein
- Beef - Low Protein
- Cereal - Low Protein
- Pork - Low Protein

In this example we will consider using a Latin Square design
Six Initial Weight categories are identified for the test animals in addition to Six Appetite categories. A test animal is then selected from each of the 6 X 6 = 36 combinations of Initial Weight and Appetite categories. A Latin square is then used to assign the 6 diets to the 36 test animals in the study.

In the Latin square the letter
- A represents the high protein-cereal diet
- B represents the high protein-pork diet
- C represents the low protein-beef diet
- D represents the low protein-cereal diet
- E represents the low protein-pork diet
- F represents the high protein-beef diet

The weight gain after a fixed period is measured for each of the test animals and is tabulated below:

The Anova Table for Example
Source    S.S.    d.f.    M.S.     F         p-value
Inwt              5                111.1     0.0000
App                                138.03
Diet                               263.06
Error             20      3.181
Total             35

Diet SS partitioned into main effects for Source and Level of Protein
Source    d.f.    M.S.     F         p-value
Inwt      5                111.1     0.0000
App                        138.03
Source    2                99.22
Level     1                820.88
SL                         147.99
Error     20      3.181
Total     35

Graeco-Latin Square Designs
Mutually orthogonal Squares

Definition
A Graeco-Latin square consists of two Latin squares (one using the letters A, B, C, …, the other using the Greek letters a, b, c, …) such that when the two Latin squares are superimposed on each other, each letter of one square appears once and only once with each letter of the other square. The two Latin squares are called mutually orthogonal.
Example: a 7 x 7 Graeco-Latin Square
Aa Be Cb Df Ec Fg Gd
Bb Cf Dc Eg Fd Ga Ae
Cc Dg Ed Fa Ge Ab Bf
Dd Ea Fe Gb Af Bc Cg
Ee Fb Gf Ac Bg Cd Da
Ff Gc Ag Bd Ca De Eb
Gg Ad Ba Ce Db Ef Fc
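The 7 x 7 example follows a simple modular pattern: in row i and column j the Latin letter is number (i + j) mod 7 and the Greek letter is number (i + 4j) mod 7. The sketch below generalizes this under the assumption that t is prime and the multiplier is neither 0 nor 1 modulo t; the function names are hypothetical.

```python
def graeco_latin_square(t, multiplier):
    """Superimpose Latin letter (i + j) mod t and Greek letter (i + multiplier*j) mod t.

    Assumption: t is prime and multiplier is not 0 or 1 modulo t.
    Lowercase letters stand in for Greek letters, as in the example.
    """
    latin = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    greek = "abcdefghijklmnopqrstuvwxyz"
    return [[latin[(i + j) % t] + greek[(i + multiplier * j) % t] for j in range(t)]
            for i in range(t)]

def is_graeco_latin(square):
    """Check both component squares are Latin and every letter pair occurs once."""
    t = len(square)
    for pos in (0, 1):                      # 0 = Latin letter, 1 = Greek letter
        if len({cell[pos] for row in square for cell in row}) != t:
            return False
        for k in range(t):
            if len({square[k][j][pos] for j in range(t)}) != t:   # row k
                return False
            if len({square[i][k][pos] for i in range(t)}) != t:   # column k
                return False
    return len({cell for row in square for cell in row}) == t * t

square = graeco_latin_square(7, 4)
for row in square:
    print(" ".join(row))
print(is_graeco_latin(square))  # True
```

With multiplier 1 the Greek square would simply copy the Latin one, so only t distinct pairs would occur and the check fails.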

The Graeco-Latin Square Design
- This design is used to simultaneously control (or eliminate) three sources of nuisance variability.
- It is called “Graeco-Latin” because we usually specify the third nuisance factor, represented by the Greek letters, orthogonal to the Latin letters.
- A significant assumption is that the four factors (treatments, nuisance factors) do not interact.
- If this assumption is violated, as with the Latin square design, it will not produce valid results.
- Graeco-Latin squares exist for all t ≥ 3 except t = 6.

Note: There exist at most (t - 1) t x t Latin squares L1, L2, …, Lt-1 such that any pair is mutually orthogonal. For example, for t = 7 there exists a set of six 7 x 7 mutually orthogonal Latin squares L1, L2, L3, L4, L5, L6.

The Graeco-Latin Square Design - An Example
A researcher is interested in determining the effect that two factors, the percentage of Lysine in the diet and the percentage of Protein in the diet, have on Milk Production in cows. Previous similar experiments suggest that interaction between the two factors is negligible.

For this reason it is decided to use a Graeco-Latin square design to experimentally determine the effects of the two factors (Lysine and Protein). Seven levels of each factor are selected: 0.0(A), 0.1(B), 0.2(C), 0.3(D), 0.4(E), 0.5(F), and 0.6(G)% for Lysine and 2(a), 4(b), 6(c), 8(d), 10(e), 12(f) and 14(g)% for Protein. Seven animals (cows) are selected at random for the experiment, which is to be carried out over seven three-month periods.

A Graeco-Latin square is then used to assign the 7 x 7 combinations of levels of the two factors (Lysine and Protein) to a period and a cow. The data are tabulated below:

The Linear Model
$y_{ij(kl)} = \mu + \tau_k + \lambda_l + \rho_i + \gamma_j + \varepsilon_{ij(kl)}$, $i, j, k, l = 1, 2, \dots, t$
$y_{ij(kl)}$ = the observation in the $i$th row and the $j$th column receiving the $k$th Latin treatment and the $l$th Greek treatment
$\mu$ = overall mean
$\tau_k$ = the effect of the $k$th Latin treatment
$\lambda_l$ = the effect of the $l$th Greek treatment
$\rho_i$ = the effect of the $i$th row
$\gamma_j$ = the effect of the $j$th column
$\varepsilon_{ij(kl)}$ = random error
No interaction between rows, columns, Latin treatments and Greek treatments

A Graeco-Latin Square experiment is assumed to be a four-factor experiment.
The factors are rows, columns, Latin treatments and Greek treatments. It is assumed that there is no interaction between rows, columns, Latin treatments and Greek treatments. The degrees of freedom for the interactions are used to estimate error.
