Planning rice breeding programs for impact Multi-environment trials: design and analysis
IRRI: Planning breeding Programs for Impact
Introduction: the problem with individual trials
A single trial has low predictive power, so multi-environment trials (METs) are used to predict performance in farmers' fields.

Introduction: the problem with METs
METs are very expensive and require much coordination and time, so they must be planned carefully to ensure they are predictive and efficient.

Learning objectives
- To clarify the purpose of variety trials
- To introduce linear models for multi-environment trials (METs)
- To describe the structure of the analysis of variance for METs
- To model the variance of a cultivar mean estimated from a MET
- To examine the effect of replication within and across sites and years on measures of precision

Purpose of METs
To predict performance:
- off-station
- in the future

METs reduce the standard error of the mean (SEM) for cultivars
[Figure: two panels, "Single trial" and "Mean of 3 trials"; horizontal axis: yield (t/ha), 0 to 6]

The genotype x environment model
The simplest MET model treats trials as "environments":
$Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$   [7.1]
Where:
$M$ = mean of all plots
$E_i$ = effect of trial i
$R(E)_{j(i)}$ = effect of rep j in trial i
$G_k$ = effect of genotype k
$GE_{ik}$ = interaction of genotype k and trial i
$e_{ijkl}$ = plot residual

The genotype x environment model
Trials and reps are random factors:
- They sample the target population of environments (TPE)
- We do not select varieties for specific trials or reps
Genotypes are fixed factors:
- We are interested in the performance of the specific lines in the trial

The genotype x environment model
The GE interaction is a random factor:
- Interactions of fixed and random factors are always random
- Random interactions with genotypes are part of the error variance for genotype means

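To make the structure of the genotype x environment model concrete, the following minimal sketch simulates plot yields with random trial, rep, GE and residual effects and fixed genotype effects. The function name, the default variances and the overall mean are illustrative assumptions, not values from the course.

```python
import random


def simulate_ge_plots(genotype_effects, n_trials, n_reps,
                      mean=3.0, var_trial=1.0, var_rep=0.1,
                      var_ge=0.30, var_e=0.45, seed=1):
    """Simulate Y = M + E_i + R(E)_j(i) + G_k + GE_ik + e for every plot.

    genotype_effects: fixed effects G_k, one per genotype (assumed values).
    All variance defaults are hypothetical, in (t/ha)^2.
    Returns a list of (trial, rep, genotype, yield) tuples.
    """
    rng = random.Random(seed)
    plots = []
    for i in range(n_trials):
        trial_effect = rng.gauss(0.0, var_trial ** 0.5)        # random E_i
        # One random GE_ik per genotype within this trial.
        ge_effects = [rng.gauss(0.0, var_ge ** 0.5) for _ in genotype_effects]
        for j in range(n_reps):
            rep_effect = rng.gauss(0.0, var_rep ** 0.5)        # random R(E)_j(i)
            for k, g_k in enumerate(genotype_effects):
                residual = rng.gauss(0.0, var_e ** 0.5)        # plot residual
                y = mean + trial_effect + rep_effect + g_k + ge_effects[k] + residual
                plots.append((i, j, k, y))
    return plots


plots = simulate_ge_plots([-0.5, 0.0, 0.5], n_trials=3, n_reps=4)
print(len(plots), "plots simulated")
```

Note that the GE effect is drawn once per genotype per trial and shared by all reps in that trial, which is why extra reps within a trial cannot average it away.
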
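Relationship between the GE model and the single-trial model:
Single trial: $Y_{ijk} = \mu + R_j + G_i + e_{k(j)}$
GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$
In a single trial the genotype and GE effects cannot be separated, so the single-trial genotype effect contains both $G_k$ and $GE_{ik}$.
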
ANOVA for the GE model
Source                    Mean square   EMS
Environments (E)
Replicates within E
Genotypes (G)             $MS_G$        $\sigma^2_e + r\sigma^2_{GE} + re\sigma^2_G$
G x E                     $MS_{GE}$     $\sigma^2_e + r\sigma^2_{GE}$
Error (plot residuals)    $MS_e$        $\sigma^2_e$

Variance of a cultivar mean
$\sigma^2_Y = \frac{\sigma^2_{GE}}{e} + \frac{\sigma^2_e}{re}$   [7.2]
Where:
e = number of trials
r = number of reps per trial

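Equation 7.2 can be checked numerically. The sketch below compares the formula with a small Monte Carlo simulation in which each cultivar mean is the average of GE effects (one per trial) and plot residuals (one per plot). The variance values, the number of simulations and the function names are assumptions chosen for illustration.

```python
import random
import statistics


def cultivar_mean_variance(var_ge, var_e, n_trials, n_reps):
    """Equation 7.2: variance of a cultivar mean over n_trials x n_reps plots."""
    return var_ge / n_trials + var_e / (n_reps * n_trials)


def simulated_mean_variance(var_ge, var_e, n_trials, n_reps,
                            n_sims=20000, seed=7):
    """Monte Carlo estimate of the same variance (normal effects assumed)."""
    rng = random.Random(seed)
    sd_ge, sd_e = var_ge ** 0.5, var_e ** 0.5
    means = []
    for _ in range(n_sims):
        total = 0.0
        for _ in range(n_trials):
            ge = rng.gauss(0.0, sd_ge)           # one GE effect per trial
            for _ in range(n_reps):
                total += ge + rng.gauss(0.0, sd_e)   # plus one residual per plot
        means.append(total / (n_trials * n_reps))
    return statistics.pvariance(means)


theory = cultivar_mean_variance(0.30, 0.45, n_trials=3, n_reps=4)
simulated = simulated_mean_variance(0.30, 0.45, n_trials=3, n_reps=4)
print(f"formula: {theory:.4f}  simulation: {simulated:.4f}")
```

Trial and rep main effects are left out of the simulation because they are shared by all cultivars in a trial and so do not affect comparisons among cultivar means.
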
Estimating $\sigma^2_G$, $\sigma^2_{GE}$ and $\sigma^2_e$
$\sigma^2_e = MS_{error}$
$\sigma^2_{GE} = (MS_{GE} - MS_{error})/r$
$\sigma^2_G = (MS_G - MS_{GE})/(re)$

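The three estimators translate directly into a small function. This is a minimal sketch; the function name and the mean-square values in the usage example are hypothetical, and negative estimates are returned as they are rather than truncated at zero.

```python
def ge_variance_components(ms_g, ms_ge, ms_error, n_trials, n_reps):
    """Estimate variance components of the GE model from ANOVA mean squares.

    Uses the expected mean squares of a balanced MET:
      MS_error -> var_e
      MS_GE    -> var_e + r * var_ge
      MS_G     -> var_e + r * var_ge + r * e * var_g
    """
    var_e = ms_error
    var_ge = (ms_ge - ms_error) / n_reps
    var_g = (ms_g - ms_ge) / (n_reps * n_trials)
    return {"var_e": var_e, "var_ge": var_ge, "var_g": var_g}


# Hypothetical mean squares for a MET with 3 trials and 4 reps per trial.
components = ge_variance_components(ms_g=4.05, ms_ge=1.65, ms_error=0.45,
                                    n_trials=3, n_reps=4)
print(components)
```
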
Example: modeling the LSD for a MET program using the GE model
Hypothetical values:
$\sigma^2_e = 0.45$ (t/ha)$^2$
$\sigma^2_{GE} = 0.30$ (t/ha)$^2$
$\sigma^2_Y = \frac{\sigma^2_{GE}}{e} + \frac{\sigma^2_e}{re}$   [7.2]

Table 1. The effect of trial and replicate number on the standard deviation of a cultivar mean: genotype x environment model
Number of sites   Number of reps/site   SEM (t/ha)   LSD

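A table of this kind can be generated from equation 7.2 and the hypothetical variance components ($\sigma^2_{GE} = 0.30$, $\sigma^2_e = 0.45$). The sketch below makes two assumptions that are not stated in the source: the LSD is computed as $t \sqrt{2}\,SEM$ with $t \approx 2$, and the grid of site and rep numbers is chosen only for illustration.

```python
def sem_ge(var_ge, var_e, n_sites, n_reps):
    """Standard error of a cultivar mean under the GE model (equation 7.2)."""
    return (var_ge / n_sites + var_e / (n_reps * n_sites)) ** 0.5


def lsd_from_sem(sem, t_value=2.0):
    """LSD between two cultivar means; t_value = 2 is an assumed approximation."""
    return t_value * (2.0 ** 0.5) * sem


def sem_lsd_table(var_ge=0.30, var_e=0.45,
                  sites=(1, 2, 3, 5, 10), reps=(1, 2, 3, 4)):
    """Rows of (sites, reps, SEM, LSD) for an assumed grid of designs."""
    rows = []
    for n_sites in sites:
        for n_reps in reps:
            sem = sem_ge(var_ge, var_e, n_sites, n_reps)
            rows.append((n_sites, n_reps, sem, lsd_from_sem(sem)))
    return rows


print(f"{'sites':>5} {'reps':>4} {'SEM':>6} {'LSD':>6}")
for n_sites, n_reps, sem, lsd in sem_lsd_table():
    print(f"{n_sites:>5} {n_reps:>4} {sem:>6.2f} {lsd:>6.2f}")
```

Reading down the output shows that adding sites shrinks the SEM much faster than adding reps, because only the residual term is divided by the number of reps.
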
The "real" SEM for a single trial (e = 1, r = 4), with the GE component estimated separately, is:
$SEM = \left(\frac{\sigma^2_{GE}}{e} + \frac{\sigma^2_e}{re}\right)^{0.5} = \left(\frac{0.30}{1} + \frac{0.45}{4}\right)^{0.5} = 0.64$ t/ha
The "apparent" SEM for a single trial, with the GE and G components confounded, is:
$SEM = \left(\frac{\sigma^2_e}{r}\right)^{0.5} = \left(\frac{0.45}{4}\right)^{0.5} = 0.34$ t/ha

The genotype x site x year model
A more realistic MET model subdivides the "environment" factor into "years" and "sites":
GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$
GSY model: $Y_{ijklm} = M + Y_i + S_j + YS_{ij} + R(YS)_{k(ij)} + G_l + GY_{il} + GS_{jl} + GYS_{ijl} + e_{ijklm}$
Variance of a cultivar mean:
$\sigma^2_Y = \frac{\sigma^2_{GY}}{y} + \frac{\sigma^2_{GS}}{s} + \frac{\sigma^2_{GYS}}{ys} + \frac{\sigma^2_e}{rys}$
Where y = number of years, s = number of sites, and r = number of reps per trial.

ANOVA for the GSY model
Source                     Mean square   EMS
Years (Y)
Sites (S)
Y x S
Replicates within Y x S
Genotypes (G)              $MS_G$        $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY} + ry\sigma^2_{GS} + rys\sigma^2_G$
G x S                      $MS_{GS}$     $\sigma^2_e + r\sigma^2_{GYS} + ry\sigma^2_{GS}$
G x Y                      $MS_{GY}$     $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY}$
G x Y x S                  $MS_{GYS}$    $\sigma^2_e + r\sigma^2_{GYS}$
Plot residuals             $MS_e$        $\sigma^2_e$

Estimating $\sigma^2_G$, $\sigma^2_{GY}$, $\sigma^2_{GS}$, $\sigma^2_{GYS}$ and $\sigma^2_e$
$\sigma^2_e = MS_{error}$
$\sigma^2_{GYS} = (MS_{GYS} - MS_{error})/r$
$\sigma^2_{GY} = (MS_{GY} - MS_{GYS})/(rs)$
$\sigma^2_{GS} = (MS_{GS} - MS_{GYS})/(ry)$
$\sigma^2_G = (MS_G - MS_{GS} - MS_{GY} + MS_{GYS})/(rsy)$

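The estimators for the GSY model can be sketched in the same way as for the GE model. In this sketch the genotypic component is obtained by solving the expected mean squares of the GSY ANOVA, which gives $(MS_G - MS_{GS} - MS_{GY} + MS_{GYS})/(rsy)$. The function names and the variance components used in the round-trip check are hypothetical.

```python
def gsy_expected_mean_squares(var_g, var_gs, var_gy, var_gys, var_e, r, s, y):
    """Expected mean squares of the balanced GSY ANOVA for given components."""
    ms_error = var_e
    ms_gys = var_e + r * var_gys
    ms_gy = ms_gys + r * s * var_gy
    ms_gs = ms_gys + r * y * var_gs
    ms_g = var_e + r * var_gys + r * s * var_gy + r * y * var_gs + r * y * s * var_g
    return {"ms_g": ms_g, "ms_gs": ms_gs, "ms_gy": ms_gy,
            "ms_gys": ms_gys, "ms_error": ms_error}


def gsy_variance_components(ms_g, ms_gs, ms_gy, ms_gys, ms_error, r, s, y):
    """Solve the expected mean squares for the variance components."""
    return {
        "var_e": ms_error,
        "var_gys": (ms_gys - ms_error) / r,
        "var_gy": (ms_gy - ms_gys) / (r * s),
        "var_gs": (ms_gs - ms_gys) / (r * y),
        "var_g": (ms_g - ms_gs - ms_gy + ms_gys) / (r * s * y),
    }


# Round trip with made-up components: 3 reps, 4 sites, 2 years.
ms = gsy_expected_mean_squares(var_g=0.20, var_gs=0.02, var_gy=0.05,
                               var_gys=0.30, var_e=0.40, r=3, s=4, y=2)
print(gsy_variance_components(**ms, r=3, s=4, y=2))
```
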
Example: Modeling the LSD for a MET program using the GSY model
For the NE Thailand OYT, the variance components $\sigma^2_e$, $\sigma^2_{GS}$, $\sigma^2_{GY}$ and $\sigma^2_{GYS}$, each in (t/ha)$^2$, are taken from Cooper et al. (1999).

Example: Modeling the LSD for a MET program using the GSY model
Number of sites   Number of years   Number of replicates/site   LSD (t ha$^{-1}$)

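The LSD for any combination of sites, years and replicates follows from the GSY variance of a cultivar mean. The sketch below uses placeholder variance components that are invented for illustration only; they are not the NE Thailand estimates of Cooper et al. (1999). As a further assumption, the LSD is computed as $t\sqrt{2\sigma^2_Y}$ with $t \approx 2$.

```python
def gsy_mean_variance(var_gy, var_gs, var_gys, var_e, n_sites, n_years, n_reps):
    """Variance of a cultivar mean under the genotype x site x year model."""
    return (var_gy / n_years
            + var_gs / n_sites
            + var_gys / (n_years * n_sites)
            + var_e / (n_reps * n_years * n_sites))


def gsy_lsd(var_gy, var_gs, var_gys, var_e, n_sites, n_years, n_reps,
            t_value=2.0):
    """LSD between two cultivar means; t_value = 2 is an assumed approximation."""
    var_mean = gsy_mean_variance(var_gy, var_gs, var_gys, var_e,
                                 n_sites, n_years, n_reps)
    return t_value * (2.0 * var_mean) ** 0.5


# Placeholder components in (t/ha)^2: NOT the published Thai values.
PLACEHOLDER = dict(var_gy=0.05, var_gs=0.01, var_gys=0.30, var_e=0.40)

print(f"{'sites':>5} {'years':>5} {'reps':>4} {'LSD':>6}")
for n_sites in (1, 5, 10):
    for n_years in (1, 2):
        for n_reps in (1, 2, 4):
            lsd = gsy_lsd(**PLACEHOLDER, n_sites=n_sites,
                          n_years=n_years, n_reps=n_reps)
            print(f"{n_sites:>5} {n_years:>5} {n_reps:>4} {lsd:>6.2f}")
```

Substituting real variance component estimates for the placeholders gives the resource-allocation table for a specific testing program.
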
Conclusions from the error modeling exercise
- $\sigma^2_{GS}$ was very small in this case: there is little evidence of specific adaptation to sites
- $\sigma^2_{GYS}$ was very large in this case: there is much random variation in cultivar performance from site to site and year to year
- $\sigma^2_e$ was very large: methods to reduce plot error are needed
- $\sigma^2_{GYS}$ was very large compared to $\sigma^2_{GY}$ and $\sigma^2_{GS}$: sites and years are equivalent for testing

Deciding whether to divide a TPE
- If the TPE is large and diverse, it may be worthwhile to divide it into sets of more homogeneous sites
- If there is no pre-existing hypothesis about how to group environments, use cluster, AMMI, or pattern analysis
- If a hypothesis can be formed based on geography, soil type, management system, etc., group trials according to this fixed factor

The genotype x subregion model
Environments can be grouped into subregions:
GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$
Subregion model: $Y_{ijklm} = M + S_i + E(S)_{j(i)} + R(E(S))_{k(ij)} + G_l + GS_{il} + GE(S)_{lj(i)} + e_{ijklm}$
- Subregions are fixed
- Trials within subregions are random
- If the GS interaction term is not significant, subdivision is unnecessary and could be harmful

Expected mean squares for ANOVA of the genotype x subregion model for testing fixed groupings of sites
Source                                Mean square    EMS
Subregions (S)
Locations within subregions (L(S))
Replicates within L(S)
Genotypes (G)                         $MS_G$         $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS} + rls\sigma^2_G$
G x S                                 $MS_{GS}$      $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS}$
G x L(S)                              $MS_{GL(S)}$   $\sigma^2_e + r\sigma^2_{GL(S)}$
Plot residuals                        $MS_e$         $\sigma^2_e$

Example: Are central and southern Laos separate breeding targets?
- Should breeders and agronomists in Laos consider the central and southern regions as separate TPEs for rainfed lowland (RL) rice?
- 22 traditional varieties (TVs) were tested in 4-rep trials at 3 sites in the central region and 3 sites in the south in the 2004 wet season (WS 2004)

ANOVA testing the hypothesis that the central and southern regions of Laos are separate RL breeding targets (22 TVs tested in WS 2004)
Source                                df   MS   F
Subregions (S)
Locations within subregions (L(S))
Replicates within L(S)
Genotypes (G)                                   **
G x S
G x L(S)                                        **
Plot residuals

Are central and southern Laos separate breeding targets?
- The genotype x subregion interaction is not significant when tested against variation among locations within subregions
- Subdivision is therefore not needed
- Subdivision might even be harmful, because it would reduce replication within each subregion

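The structure of this test can be sketched for the Laos design (22 genotypes, 2 subregions, 3 locations per subregion, 4 reps). The degrees of freedom below assume a balanced design with randomized complete blocks at each location, and the mean squares in the usage example are hypothetical, since the source values are not reproduced here. The function names are illustrative.

```python
def subregion_anova_df(n_genotypes, n_subregions, n_locs_per_subregion, n_reps):
    """Degrees of freedom for the balanced genotype x subregion ANOVA."""
    g, s, l, r = n_genotypes, n_subregions, n_locs_per_subregion, n_reps
    return {
        "Subregions (S)": s - 1,
        "Locations within subregions (L(S))": s * (l - 1),
        "Replicates within L(S)": s * l * (r - 1),
        "Genotypes (G)": g - 1,
        "G x S": (g - 1) * (s - 1),
        "G x L(S)": (g - 1) * s * (l - 1),
        "Plot residuals": s * l * (r - 1) * (g - 1),
    }


def f_ratio_gxs(ms_gs, ms_gls):
    """F for G x S, tested against G x L(S) rather than the plot residual."""
    return ms_gs / ms_gls


df = subregion_anova_df(n_genotypes=22, n_subregions=2,
                        n_locs_per_subregion=3, n_reps=4)
for source, value in df.items():
    print(f"{source:<38} {value:>4}")

# Hypothetical mean squares, for illustration only.
f_value = f_ratio_gxs(ms_gs=1.10, ms_gls=0.95)
print(f"F = {f_value:.2f} on {df['G x S']} and {df['G x L(S)']} df")
```

Using G x L(S) as the denominator follows from the expected mean squares: the G x S mean square exceeds the G x L(S) mean square only through the $rl\sigma^2_{GS}$ term.
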
Can anyone briefly clarify the purpose of variety trials?
When should you divide a TPE?

Summary 1
- The purpose of a variety trial is to predict future performance in the TPE
- Random GE interaction (GEI) is large, and reduces the precision with which cultivar means can be estimated
- Variance component estimates for the GSY model can be used to study resource allocation in testing programs
- Within a homogeneous TPE, the genotype x site x year variance ($\sigma^2_{GYS}$) is usually the largest. If so, strategies that emphasize testing over several sites or over several years are likely to be equally successful

Summary 2
- There is little benefit from including more than 3 replicates (and often more than 2) in a MET
- Standard errors and LSDs estimated from single sites are unrealistically low because they do not take random GEI into account
- Fixed-subregion models allow a hypothesis about the existence of genotype x subregion interaction to be tested against the genotype x trial-within-subregion interaction