Experimental Design
- An experimental design is a plan for assigning the treatments to the plots in an experiment.
- Designs differ primarily in how the plots are grouped before the treatments are applied, that is, in how much restriction is imposed on the random assignment of treatments to plots.
Why do I need a design?
- To provide an estimate of experimental error
- To increase precision (blocking)
- To provide the information needed to perform tests of significance and construct interval estimates
- To facilitate the application of treatments, particularly cultural operations
Factors to be Considered
- Physical and topographic features
- Soil variability
- Number and nature of treatments
- Experimental material (crop, animal, pathogen, etc.)
- Duration of the experiment
- Machinery to be used
- Size of the difference to be detected
- Significance level to be used
- Experimental resources
- Cost (money, time, personnel)
Cardinal Rule: Choose the simplest experimental design that will give the required precision within the limits of the available resources
Completely Randomized Design (CRD)
- Simplest and least restrictive design
- Every plot is equally likely to be assigned to any treatment
- Example layout (treatments A to D, three replications each): A B D A C D B C B A D C
Advantages of a CRD
- Flexibility
  - Any number of treatments and any number of replications
  - Treatments need not have the same number of replications (but the design is more efficient if they do)
- Simple statistical analysis
  - Even with unequal replication
  - Missing plots do not complicate the analysis
- Maximum error degrees of freedom
Disadvantage of CRD
- Low precision if the plots are not uniform
Uses for the CRD
- If the experimental site is relatively uniform
- If a large fraction of the plots may not respond or may be lost
- If the number of plots is limited
Design Construction
- No restriction on the assignment of treatments to the plots
- Each treatment is equally likely to be assigned to any plot
- Use some sort of mechanical procedure to prevent personal bias
- Random numbers may be assigned by:
  - lot (draw a number)
  - computer
  - a random number table
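Random Assignment by Lot
- We have an experiment to test three varieties, the top line from each of Oregon, Washington, and Idaho, to find which grows best in our area (t = 3, r = 4).
- Number the plots 1 to 12 and draw numbers by lot.
- Example draw: 1, 6, 12, and 5 come up first, so plots 1, 6, 12, and 5 receive variety A.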
Random Assignment by Computer (Excel)
1. Type 1 in cell A1 and 2 in cell A2. Select cells A1 and A2, then use the fill handle to drag down through A12 (or through the total number of plots in your experiment).
2. In cell B1, type =RAND(). Copy cell B1 and paste it into cells B2 through B12 (or Bn).
3. Select cells B1 through B12 (or Bn) and copy them. From the Edit menu choose Paste Special and select Values; otherwise the random numbers will keep changing.
Random numbers in Excel (cont'd.)
4. Sort columns A and B (A1..B12) by column B.
5. In column C, assign the first treatment to the first r (4) cells, the second treatment to the next r (4) cells, and so on.
6. Re-sort columns A, B, and C (A1..C12) by column A if desired.
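The spreadsheet steps above can also be sketched in Python. This is only an illustration, not part of the original procedure: the function name `randomize_crd` and the `seed` argument are hypothetical, and shuffling the plot numbers stands in for sorting by the RAND() column.

```python
import random

def randomize_crd(treatments, reps, seed=None):
    """Return a dict mapping plot number (1..N) to a treatment label."""
    rng = random.Random(seed)            # seed is optional; it only makes the layout repeatable
    n_plots = len(treatments) * reps
    plots = list(range(1, n_plots + 1))
    rng.shuffle(plots)                   # plays the role of sorting by the random column
    layout = {}
    for k, plot in enumerate(plots):
        # first r shuffled plots get the first treatment, the next r the second, etc.
        layout[plot] = treatments[k // reps]
    return dict(sorted(layout.items()))  # re-sort by plot number

# t = 3 varieties, r = 4 replications, 12 plots
layout = randomize_crd(["A", "B", "C"], reps=4, seed=1)
for plot, trt in layout.items():
    print(plot, trt)
```

Every plot has the same chance of receiving any treatment, and each treatment ends up on exactly r plots, which is what the CRD requires.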
The Statistical Analysis
- Partitions the total variation in the data into components associated with sources of variation
  - For a Completely Randomized Design (CRD): Treatments, Error
  - For a Randomized Complete Block Design (RBD): Treatments, Blocks, Error
- Provides an estimate of experimental error ($s^2$)
  - Used to construct interval estimates and significance tests
- Provides a way to test the significance of variance sources
  - t test: used to compare two treatments
  - F test: used when there are two or more treatments
Analysis of Variance (ANOVA) Assumptions
- The error terms are randomly, independently, and normally distributed, with a mean of zero and a common variance.
- The main effects are additive.
- Linear additive model for a Completely Randomized Design (CRD):
  $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
  where $Y_{ij}$ is the observation, $\mu$ is the mean, $\tau_i$ is the treatment effect, and $\varepsilon_{ij}$ is the random error.
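One way to get a feel for the linear additive model is to generate data from it. The sketch below is illustrative only: the function name `simulate_crd` and all parameter values are hypothetical, chosen just to show how each observation is built from a mean, a treatment effect, and a normally distributed error with common variance.

```python
import random

def simulate_crd(mu, effects, reps, sigma, seed=None):
    """Simulate Y_ij = mu + tau_i + e_ij with e_ij ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)
    data = {}
    for i, tau in enumerate(effects, start=1):
        # same mean and same error variance for every treatment; only tau_i differs
        data[i] = [mu + tau + rng.gauss(0.0, sigma) for _ in range(reps)]
    return data

# Hypothetical values: three treatments, four replications each
sim = simulate_crd(mu=55.0, effects=[-9.0, 1.0, 8.0], reps=4, sigma=5.0, seed=0)
for i, obs in sim.items():
    print(i, [round(y, 1) for y in obs])
```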
The CRD Analysis
We can:
- Estimate the treatment means
- Estimate the standard error of a treatment mean
- Test the significance of differences among the treatment means
What does $\sum_i \sum_j Y_{ij} = Y_{..}$ mean?
- $i$ represents the treatment number (varies from 1 to t = 3)
- $j$ represents the replication number (varies from 1 to r = 4)
- $\sum$ is the symbol for summation
Treatment (i)  Replication (j)  Observation (Yij)
      1               1               47.9
      1               2               50.6
      1               3               43.5
      1               4               42.6
      2               1               62.8
      2               2               50.9
      2               3               61.8
      2               4               49.1
      3               1               66.4
      3               2               60.6
      3               3               64.0
      3               4               64.0
The same observations arranged by treatment (C, P, K):
    C       P       K
  47.9    62.5    66.4
  50.6    50.9    60.6
  43.5    61.8    64.0
  42.6    49.1    64.0
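The CRD Analysis - How To:
Set up a table of observations and compute the treatment means and deviations:
- Grand mean: $\bar{Y}_{..} = \frac{1}{N} \sum_i \sum_j Y_{ij}$
- Mean of the i-th treatment: $\bar{Y}_{i.} = \frac{1}{r_i} \sum_j Y_{ij}$
- Deviation of the i-th treatment mean from the grand mean: $\bar{Y}_{i.} - \bar{Y}_{..}$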
The CRD Analysis, cont'd.
- Separate the sources of variation
  - Variation between treatments
  - Variation within treatments (error)
- Compute degrees of freedom (df): 1 less than the number of observations
  - total df = N - 1
  - treatment df = t - 1
  - error df = N - t, or t(r - 1) if each treatment has the same r
Skeleton ANOVA for CRD
Source                      df     SS    MS    F    P > F
Total                       N-1
Treatments                  t-1
Within treatments (Error)   N-t
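The CRD Analysis, cont'd.
- Compute Sums of Squares
  - Total: $SSTot = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2$
  - Treatment: $SST = \sum_i r_i (\bar{Y}_{i.} - \bar{Y}_{..})^2$
  - Error: $SSE = SSTot - SST$
- Compute Mean Squares
  - Treatment: $MST = SST / (t - 1)$
  - Error: $MSE = SSE / (N - t)$
- Calculate the F statistic for treatments
  - $F_T = MST / MSE$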
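The computations above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `crd_anova` and the dictionary keys are hypothetical, and the data are the twelve observations listed earlier by treatment and replication (t = 3, r = 4).

```python
def crd_anova(groups):
    """One-way ANOVA for a CRD; groups is a list of lists, one per treatment."""
    all_obs = [y for g in groups for y in g]
    n_total = len(all_obs)
    t = len(groups)
    grand_mean = sum(all_obs) / n_total
    # total SS: squared deviations of every observation from the grand mean
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    # treatment SS: r_i times the squared deviation of each treatment mean
    ss_trt = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_err = ss_total - ss_trt
    ms_trt = ss_trt / (t - 1)
    ms_err = ss_err / (n_total - t)
    return {
        "df_total": n_total - 1, "df_trt": t - 1, "df_err": n_total - t,
        "SSTot": ss_total, "SST": ss_trt, "SSE": ss_err,
        "MST": ms_trt, "MSE": ms_err, "F": ms_trt / ms_err,
    }

groups = [
    [47.9, 50.6, 43.5, 42.6],   # treatment 1
    [62.8, 50.9, 61.8, 49.1],   # treatment 2
    [66.4, 60.6, 64.0, 64.0],   # treatment 3
]
anova = crd_anova(groups)
for name, value in anova.items():
    print(f"{name:>8}: {value:.4f}")
```

Because the treatment sum of squares weights each mean by its own r_i, the same function works with unequal replication.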
Using the ANOVA
- Use $F_T$ to judge whether treatment means differ significantly
  - If $F_T$ is greater than the tabulated F for (t - 1, N - t) df, the differences are significant
- $MSE = s^2$, the sample estimate of the experimental error
  - Used to compute standard errors and interval estimates
  - Standard error of a treatment mean: $s_{\bar{Y}} = \sqrt{MSE / r}$
  - Standard error of the difference between two means: $s_{\bar{d}} = \sqrt{MSE \left( \frac{1}{r_1} + \frac{1}{r_2} \right)}$
Numerical Example
A set of on-farm demonstration plots was located throughout an agricultural district, with a single plot placed in a lentil field on each of 20 farms. Each plot was fertilized and treated to control weevils and weeds. A portion of each plot was harvested for yield, and the farms were classified by soil type. A CRD analysis was used to see whether there were yield differences due to soil type.
Table of Observations, Means, and Deviations
Soil type         1        2        3        4        5  Overall
               42.2     28.4     18.8     41.5     33.0
               34.9     28.0     19.5     36.3     26.0
               29.7     22.8     13.1     31.7     30.6
                        18.5     10.1     31.0
                        19.4              28.2
Mean         35.600   23.420   15.375   33.740   29.867   27.185
r_i               3        5        4        5        3       20
Dev           8.415   -3.765  -11.810    6.555    2.682
Dev^2        70.812   14.175  139.476   42.968    7.191
ANOVA Table
Source       df          SS          MS        F
Total        19  1,439.2055
Soil Type     4  1,077.6313    269.4078  11.18**
Error        15    361.5742     24.1049
F critical (α = 0.05; 4, 15 df) = 3.06
** Significant at the 1% level
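As a check, the ANOVA table can be recomputed from the twenty lentil yields. The sketch below uses the correction-factor form of the sums of squares (an equivalent computing shortcut, not something the slides prescribe); the variable names are illustrative, and the critical value 3.06 is simply the tabulated figure quoted above.

```python
# Lentil yields grouped by soil type, as listed in the table of observations
yields = {
    1: [42.2, 34.9, 29.7],
    2: [28.4, 28.0, 22.8, 18.5, 19.4],
    3: [18.8, 19.5, 13.1, 10.1],
    4: [41.5, 36.3, 31.7, 31.0, 28.2],
    5: [33.0, 26.0, 30.6],
}
F_CRITICAL = 3.06  # tabulated F(0.05; 4, 15 df) quoted with the ANOVA table

n_total = sum(len(v) for v in yields.values())
t = len(yields)
grand_total = sum(sum(v) for v in yields.values())
cf = grand_total ** 2 / n_total          # correction factor: (sum of all Y)^2 / N

ss_total = sum(y * y for v in yields.values() for y in v) - cf
ss_soil = sum(sum(v) ** 2 / len(v) for v in yields.values()) - cf
ss_error = ss_total - ss_soil
ms_soil = ss_soil / (t - 1)
ms_error = ss_error / (n_total - t)
f_stat = ms_soil / ms_error

print(f"{'Source':<12}{'df':>4}{'SS':>14}{'MS':>12}{'F':>8}")
print(f"{'Total':<12}{n_total - 1:>4}{ss_total:>14.4f}")
print(f"{'Soil type':<12}{t - 1:>4}{ss_soil:>14.4f}{ms_soil:>12.4f}{f_stat:>8.2f}")
print(f"{'Error':<12}{n_total - t:>4}{ss_error:>14.4f}{ms_error:>12.4f}")
print("Significant at the 5% level:", f_stat > F_CRITICAL)
```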
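Formulae and Computations
- Coefficient of Variation: $CV = (s / \bar{Y}_{..}) \times 100$, where $s = \sqrt{MSE}$
- Standard Error of a Mean: $s_{\bar{Y}_i} = \sqrt{MSE / r_i}$
- Confidence Interval Estimate of a Mean (soil type 4): $\bar{Y}_{4.} \pm t_{\alpha/2,\,N-t} \; s_{\bar{Y}_4}$
Formulae for Mean Comparisons
- Standard Error of the Difference between Two Means (for soils 1 and 2): $s_{\bar{d}} = \sqrt{MSE \left( \frac{1}{r_1} + \frac{1}{r_2} \right)}$
- Test statistic with N - t df: $t = (\bar{Y}_{1.} - \bar{Y}_{2.}) / s_{\bar{d}}$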
Mean Yields and Standard Errors
Soil Type            1      2      3      4      5
Mean Yield       35.60  23.42  15.38  33.74  29.87
Replications         3      5      4      5      3
Standard error    2.83   2.20   2.45   2.20   2.83
- CV = (s / mean) × 100 = 18.1%
- 95% confidence interval estimate for soil type 4 = 33.74 ± 4.69
- Standard error of the difference between soil types 1 and 2 = 3.58
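The summary values above follow from the error mean square and the replication numbers. The sketch below is a hedged illustration: the helper names are hypothetical, and the t value of 2.131 (two-sided 5% point for 15 df) is taken from a standard t table rather than from the slides. Working from unrounded standard errors, the last digit can differ slightly from the rounded figures reported above.

```python
import math

MSE = 24.1049        # error mean square from the ANOVA table
GRAND_MEAN = 27.185  # overall mean yield
T_CRIT = 2.131       # assumption: two-sided 5% t value for 15 df, from a t table

means = {1: 35.600, 2: 23.420, 3: 15.375, 4: 33.740, 5: 29.867}
reps = {1: 3, 2: 5, 3: 4, 4: 5, 5: 3}

def se_mean(r):
    """Standard error of a treatment mean based on r replications."""
    return math.sqrt(MSE / r)

def se_diff(r1, r2):
    """Standard error of the difference between two treatment means."""
    return math.sqrt(MSE * (1 / r1 + 1 / r2))

cv = 100 * math.sqrt(MSE) / GRAND_MEAN
half_width = T_CRIT * se_mean(reps[4])                        # CI half-width, soil type 4
t_stat = (means[1] - means[2]) / se_diff(reps[1], reps[2])    # soils 1 vs 2, 15 df

print(f"CV = {cv:.1f}%")
for soil, r in reps.items():
    print(f"soil {soil}: mean {means[soil]:.2f}, SE {se_mean(r):.2f}")
print(f"95% CI for soil 4: {means[4]:.2f} +/- {half_width:.2f}")
print(f"SE of difference (1 vs 2): {se_diff(reps[1], reps[2]):.3f}, t = {t_stat:.2f}")
```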
Report of Analysis
- Analysis of the yield data indicates highly significant differences in yield among the five soil types
- Soil type 1 produces the highest yield of lentil seed, though not significantly different from type 4
- Soil type 3 is clearly inferior to the others
- Soil types ranked by mean yield, highest to lowest: 1, 4, 5, 2, 3