Planning rice breeding programs for impact
Multi-environment trials: design and analysis

Presentation transcript:


IRRI: Planning Breeding Programs for Impact

Introduction: the problem with individual trials
- The predictive power of an individual trial is low
- So multi-environment trials (METs) are used to predict performance in farmers' fields

Introduction: the problem with METs
- METs are very expensive and require much coordination and time
- So they must be planned carefully to ensure they are predictive and efficient

Learning objectives
- To clarify the purpose of variety trials
- To introduce linear models for multi-environment trials (METs)
- To describe the structure of the analysis of variance for METs
- To model the variance of a cultivar mean estimated from a MET
- To examine the effect of replication within and across sites and years on measures of precision

Purpose of METs
- To predict performance off-station
- To predict performance in the future
[Figure labels: WS 2002, WS]

METs reduce the SEM for cultivars
[Figure: two panels, "Single trial" and "Mean of 3 trials"; horizontal axis: yield (t/ha), 0 to 6]

The genotype x environment model

The simplest MET model considers trials to be "environments":

$$Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl} \qquad [7.1]$$

Where:
- $M$ = mean of all plots
- $E_i$ = effect of trial $i$
- $R(E)_{j(i)}$ = effect of rep $j$ in trial $i$
- $G_k$ = effect of genotype $k$
- $GE_{ik}$ = interaction of genotype $k$ and trial $i$
- $e_{ijkl}$ = plot residual

The genotype x environment model

Trials and reps are random factors:
- They sample the target population of environments (TPE)
- We do not select varieties for specific trials or reps

Genotypes are fixed factors:
- We are interested in the performance of the specific lines in the trial

The genotype x environment model

The GE interaction is a random factor:
- Interactions of fixed and random factors are always random
- Random interactions with genotypes are part of the error variance for genotype means

Relationship between the GE model and the single-trial model

Single trial: $Y_{ijk} = \mu + R_j + G_i + e_{k(j)}$

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$

ANOVA for the GE model

| Source | Mean square | EMS |
|---|---|---|
| Environments (E) | | |
| Replicates within E | | |
| Genotypes (G) | $MS_G$ | $\sigma^2_e + r\sigma^2_{GE} + re\sigma^2_G$ |
| G x E | $MS_{GE}$ | $\sigma^2_e + r\sigma^2_{GE}$ |
| Error (plot residuals) | $MS_e$ | $\sigma^2_e$ |
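
The mean squares in the ANOVA for the GE model can be computed directly from plot data when the trial series is balanced (every genotype in every rep of every trial). The following is a minimal sketch under that assumption; the function name `ge_mean_squares`, the nested-list layout `y[i][j][k]`, and the demo yields are illustrative choices and do not come from the slides.

```python
def ge_mean_squares(y):
    """Mean squares for a balanced MET.

    y[i][j][k] is the plot value for trial (environment) i,
    replicate j within that trial, and genotype k.
    """
    e, r, g = len(y), len(y[0]), len(y[0][0])
    n = e * r * g

    grand = sum(v for env in y for rep in env for v in rep) / n
    env_mean = [sum(v for rep in env for v in rep) / (r * g) for env in y]
    rep_mean = [[sum(rep) / g for rep in env] for env in y]
    # Genotype-by-trial cell means, averaged over replicates.
    cell_mean = [[sum(y[i][j][k] for j in range(r)) / r for k in range(g)]
                 for i in range(e)]
    gen_mean = [sum(cell_mean[i][k] for i in range(e)) / e for k in range(g)]

    ss_g = r * e * sum((m - grand) ** 2 for m in gen_mean)
    ss_ge = r * sum(
        (cell_mean[i][k] - env_mean[i] - gen_mean[k] + grand) ** 2
        for i in range(e) for k in range(g)
    )
    # Plot residual: what is left after removing genotype and rep
    # effects within each trial.
    ss_err = sum(
        (y[i][j][k] - cell_mean[i][k] - rep_mean[i][j] + env_mean[i]) ** 2
        for i in range(e) for j in range(r) for k in range(g)
    )
    return {
        "MS_G": ss_g / (g - 1),
        "MS_GE": ss_ge / ((g - 1) * (e - 1)),
        "MS_e": ss_err / (e * (r - 1) * (g - 1)),
    }


# Made-up yields (t/ha): 2 trials x 2 reps x 3 genotypes.
demo = [
    [[3.1, 3.9, 4.6], [2.8, 4.2, 4.4]],
    [[2.0, 2.2, 3.5], [2.3, 2.1, 3.1]],
]
print(ge_mean_squares(demo))
```

The degrees of freedom used are (g − 1) for genotypes, (g − 1)(e − 1) for G x E, and e(r − 1)(g − 1) for plot residuals, which is the usual partition for randomized complete blocks nested within trials.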

Variance of a cultivar mean

$$\sigma^2_Y = \sigma^2_{GE}/e + \sigma^2_e/(re) \qquad [7.2]$$

Where:
- $e$ = number of trials
- $r$ = number of reps per trial

Estimating $\sigma^2_G$, $\sigma^2_{GE}$ and $\sigma^2_e$

- $\sigma^2_e = MS_{error}$
- $\sigma^2_{GE} = (MS_{GE} - MS_{error})/r$
- $\sigma^2_G = (MS_G - MS_{GE})/(re)$
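
These three estimators translate directly into code. The sketch below is illustrative only: the function name `ge_variance_components` and the mean squares passed to it are made up (they were chosen so that the estimates come out at round numbers), and no real trial data are implied.

```python
def ge_variance_components(ms_g, ms_ge, ms_error, r, e):
    """Method-of-moments estimates for the GE model.

    r = number of reps per trial, e = number of trials.
    """
    var_e = ms_error
    var_ge = (ms_ge - ms_error) / r
    var_g = (ms_g - ms_ge) / (r * e)
    return {"var_e": var_e, "var_GE": var_ge, "var_G": var_g}


# Hypothetical mean squares from a series of 3 trials with 4 reps each.
estimates = ge_variance_components(ms_g=4.05, ms_ge=1.65, ms_error=0.45, r=4, e=3)
for name, value in estimates.items():
    print(f"{name}: {value:.2f} (t/ha)^2")
```

Because these are differences of mean squares, an estimate can come out negative in a real data set; how to handle that (for example, reporting it as zero) is a separate decision that this sketch does not make.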

Example: modeling the LSD for a MET program using the GE model

Hypothetical values:
- $\sigma^2_e = 0.45$ (t/ha)$^2$
- $\sigma^2_{GE} = 0.30$ (t/ha)$^2$

$$\sigma^2_Y = \sigma^2_{GE}/e + \sigma^2_e/(re) \qquad [7.2]$$

Table 1. The effect of trial and replicate number on the standard deviation of a cultivar mean: genotype x environment model
Columns: number of sites; number of reps/site; SEM (t/ha); LSD
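
A table of this kind can be generated from equation [7.2] and the hypothetical values $\sigma^2_e = 0.45$ and $\sigma^2_{GE} = 0.30$ (t/ha)$^2$. The sketch below makes two assumptions that are not stated in the slides: the LSD is taken as $t \times \sqrt{2} \times SEM$, and $t$ is approximated by 2.0 for a 5% test. The combinations of sites and reps are chosen for illustration and are not necessarily those of the original table.

```python
import math


def sem_ge(var_ge, var_e, e, r):
    """Standard error of a cultivar mean under the GE model (eq. 7.2)."""
    return math.sqrt(var_ge / e + var_e / (r * e))


def lsd(sem, t_value=2.0):
    """Least significant difference between two cultivar means.

    t_value=2.0 is an assumed approximation to the 5% two-sided t value.
    """
    return t_value * math.sqrt(2.0) * sem


VAR_GE = 0.30  # (t/ha)^2, hypothetical value from the slides
VAR_E = 0.45   # (t/ha)^2, hypothetical value from the slides

print(f"{'sites':>5} {'reps':>4} {'SEM':>6} {'LSD':>6}")
for e in (1, 3, 5, 10):
    for r in (1, 2, 4):
        s = sem_ge(VAR_GE, VAR_E, e, r)
        print(f"{e:>5} {r:>4} {s:6.2f} {lsd(s):6.2f}")
```

Running the loop makes the trade-off visible: adding sites divides both variance terms, while adding reps only divides the plot-error term.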

The "real" SEM for a single trial (with the GE component estimated separately) is:

$$SEM = (\sigma^2_{GE}/e + \sigma^2_e/(re))^{0.5} = (0.30/1 + 0.45/4)^{0.5} = 0.64 \text{ t/ha}$$

The "apparent" SEM for a single trial (with the GE and G components confounded) is:

$$SEM = (\sigma^2_e/r)^{0.5} = (0.45/4)^{0.5} = 0.34 \text{ t/ha}$$

The genotype x site x year model

A more realistic MET model subdivides the "environment" factor into "years" and "sites":

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$

GSY model: $Y_{ijklm} = M + Y_i + S_j + YS_{ij} + R(YS)_{k(ij)} + G_l + GY_{il} + GS_{jl} + GYS_{ijl} + e_{ijklm}$

Variance of a cultivar mean:

$$\sigma^2_Y = \sigma^2_{GY}/y + \sigma^2_{GS}/s + \sigma^2_{GYS}/(ys) + \sigma^2_e/(rys)$$

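ANOVA for the GSY model

| Source | Mean square | EMS |
|---|---|---|
| Years (Y) | | |
| Sites (S) | | |
| Y x S | | |
| Replicates within Y x S | | |
| Genotypes (G) | $MS_G$ | $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY} + ry\sigma^2_{GS} + rys\sigma^2_G$ |
| G x S | $MS_{GS}$ | $\sigma^2_e + r\sigma^2_{GYS} + ry\sigma^2_{GS}$ |
| G x Y | $MS_{GY}$ | $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY}$ |
| G x Y x S | $MS_{GYS}$ | $\sigma^2_e + r\sigma^2_{GYS}$ |
| Plot residuals | $MS_e$ | $\sigma^2_e$ |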

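Estimating $\sigma^2_G$, $\sigma^2_{GY}$, $\sigma^2_{GS}$, $\sigma^2_{GYS}$, and $\sigma^2_e$

- $\sigma^2_e = MS_{error}$
- $\sigma^2_{GYS} = (MS_{GYS} - MS_{error})/r$
- $\sigma^2_{GY} = (MS_{GY} - MS_{GYS})/(rs)$
- $\sigma^2_{GS} = (MS_{GS} - MS_{GYS})/(ry)$
- $\sigma^2_G = (MS_G - MS_{GS} - MS_{GY} + MS_{GYS})/(rsy)$

The last estimator follows from the expected mean squares of the GSY model: subtracting $MS_{GS}$ and $MS_{GY}$ from $MS_G$ removes $\sigma^2_e + r\sigma^2_{GYS}$ twice, so $MS_{GYS}$ is added back once, leaving $rys\sigma^2_G$.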
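
The GSY estimators can be sketched in code in the same way. Here $\sigma^2_G$ is computed as $(MS_G - MS_{GS} - MS_{GY} + MS_{GYS})/(rsy)$, the combination that isolates $rys\sigma^2_G$ in the expected mean squares of the GSY ANOVA. The function name `gsy_variance_components` and the example mean squares are hypothetical.

```python
def gsy_variance_components(ms_g, ms_gs, ms_gy, ms_gys, ms_error, r, s, y):
    """Method-of-moments estimates for the genotype x site x year model.

    r = reps per trial, s = number of sites, y = number of years.
    """
    var_e = ms_error
    var_gys = (ms_gys - ms_error) / r
    var_gy = (ms_gy - ms_gys) / (r * s)
    var_gs = (ms_gs - ms_gys) / (r * y)
    # MS_G - MS_GS - MS_GY + MS_GYS has expectation r*y*s*var_G.
    var_g = (ms_g - ms_gs - ms_gy + ms_gys) / (r * s * y)
    return {
        "var_e": var_e,
        "var_GYS": var_gys,
        "var_GY": var_gy,
        "var_GS": var_gs,
        "var_G": var_g,
    }


# Hypothetical mean squares: 2 reps, 3 sites, 2 years.
est = gsy_variance_components(
    ms_g=2.58, ms_gs=1.08, ms_gy=1.30, ms_gys=1.00, ms_error=0.40,
    r=2, s=3, y=2,
)
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```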

Example: modeling the LSD for a MET program using the GSY model

For the NE Thailand OYT (Cooper et al., 1999), the variance components used are $\sigma^2_e$, $\sigma^2_{GS}$, $\sigma^2_{GY}$, and $\sigma^2_{GYS}$, each in (t/ha)$^2$ [numeric values not preserved].

Example: modeling the LSD for a MET program using the GSY model
Columns: number of sites; number of years; number of replicates/site; LSD (t ha$^{-1}$)
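
The same resource-allocation exercise can be sketched for the GSY model using $\sigma^2_Y = \sigma^2_{GY}/y + \sigma^2_{GS}/s + \sigma^2_{GYS}/(ys) + \sigma^2_e/(rys)$. The variance components below are placeholders chosen only so that the code runs; they are not the Cooper et al. (1999) estimates. As before, the LSD is assumed to be $t \times \sqrt{2\sigma^2_Y}$ with $t$ approximated by 2.0.

```python
import math


def var_mean_gsy(var_gy, var_gs, var_gys, var_e, s, y, r):
    """Variance of a cultivar mean under the genotype x site x year model."""
    return var_gy / y + var_gs / s + var_gys / (y * s) + var_e / (r * y * s)


def lsd_gsy(var_gy, var_gs, var_gys, var_e, s, y, r, t_value=2.0):
    """LSD between two cultivar means; t_value=2.0 is an assumed approximation."""
    return t_value * math.sqrt(2.0 * var_mean_gsy(var_gy, var_gs, var_gys, var_e, s, y, r))


# Placeholder components in (t/ha)^2: NOT taken from any published study.
components = {"var_gy": 0.05, "var_gs": 0.01, "var_gys": 0.30, "var_e": 0.40}

print(f"{'sites':>5} {'years':>5} {'reps':>4} {'LSD':>6}")
for s in (1, 5, 10):
    for y in (1, 2):
        for r in (1, 2, 4):
            print(f"{s:>5} {y:>5} {r:>4} {lsd_gsy(s=s, y=y, r=r, **components):6.2f}")
```

With real estimates substituted for the placeholders, the printed grid shows which of sites, years, or reps buys the largest reduction in LSD.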

Conclusions from the error modeling exercise
- $\sigma^2_{GS}$ was very small in this case: there is little evidence of specific adaptation to sites
- $\sigma^2_{GYS}$ was very large in this case: there is much random variation in cultivar performance from site to site and year to year
- $\sigma^2_e$ was very large: methods to reduce plot error are needed
- $\sigma^2_{GYS}$ was very large compared to $\sigma^2_{GY}$ and $\sigma^2_{GS}$: sites and years are equivalent for testing

Deciding whether to divide a TPE
- If the TPE is large and diverse, it may be worthwhile to divide it into sets of more homogeneous sites
- If there is no pre-existing hypothesis about how to group environments, use cluster, AMMI, or pattern analysis
- If a hypothesis can be formed based on geography, soil type, management system, etc., group trials according to this fixed factor

The genotype x subregion model

Environments can be grouped into subregions:

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$

Subregion model: $Y_{ijklm} = M + S_i + E_j(S_i) + R(E(S))_{k(ij)} + G_l + GS_{il} + GE(S)_{lij} + e_{ijklm}$

- Subregions are fixed
- Trials within subregions are random
- If the GS interaction term is not significant, subdivision is unnecessary and could be harmful

Expected mean squares for ANOVA of the genotype x subregion model for testing fixed groupings of sites

| Source | Mean square | EMS |
|---|---|---|
| Subregions (S) | | |
| Locations within subregions (L(S)) | | |
| Replicates within L(S) | | |
| Genotypes (G) | $MS_G$ | $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS} + rls\sigma^2_G$ |
| G x S | $MS_{GS}$ | $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS}$ |
| G x L(S) | $MS_{GL(S)}$ | $\sigma^2_e + r\sigma^2_{GL(S)}$ |
| Plot residuals | $MS_e$ | $\sigma^2_e$ |

Example: Are central and southern Laos separate breeding targets?
- Should breeders and agronomists in Laos consider the central and southern regions as separate TPEs for RL rice?
- 22 traditional varieties were tested in 4-rep trials at 3 sites in the central region and 3 sites in the south in WS 2004

ANOVA testing the hypothesis that the central and southern regions of Laos are separate RL breeding targets (22 TVs tested in WS 2004)

| Source | df | MS | F |
|---|---|---|---|
| Subregions (S) | | | |
| Locations within subregions (L(S)) | | | |
| Replicates within L(S) | | | |
| Genotypes (G) | | | ** |
| G x S | | | |
| G x L(S) | | | ** |
| Plot residuals | | | |

Are central and southern Laos separate breeding targets?
- The genotype x subregion interaction is not significant when tested against variation among locations within subregions
- Subdivision is therefore not needed
- Subdivision might even be harmful, because it would reduce replication within each subregion
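
The test described here uses the G x L(S) mean square as the denominator. In the expected mean squares of the subregion model, $MS_{GS}$ and $MS_{GL(S)}$ differ only by the term $rl\sigma^2_{GS}$, so their ratio tests whether $\sigma^2_{GS}$ is zero. The sketch below computes that F ratio and its degrees of freedom for a balanced design. The function name `gxs_f_test` and the mean squares in the example are hypothetical (the Laos values are not available here); only the design dimensions (22 genotypes, 2 subregions, 3 sites each) come from the slides.

```python
def gxs_f_test(ms_gs, ms_gl_s, n_genotypes, n_subregions, n_locs_per_subregion):
    """F ratio for genotype x subregion, tested against G x L(S).

    Assumes a balanced design with the same number of locations
    in every subregion. Returns (F, numerator df, denominator df).
    """
    df_num = (n_genotypes - 1) * (n_subregions - 1)
    df_den = (n_genotypes - 1) * n_subregions * (n_locs_per_subregion - 1)
    return ms_gs / ms_gl_s, df_num, df_den


# Design dimensions from the Laos example; mean squares are made up.
f_ratio, df1, df2 = gxs_f_test(
    ms_gs=1.2, ms_gl_s=0.8,
    n_genotypes=22, n_subregions=2, n_locs_per_subregion=3,
)
print(f"F = {f_ratio:.2f} on {df1} and {df2} df")
```

The resulting F value would then be compared with the critical value of the F distribution for those degrees of freedom. Using the plot residual as the denominator instead would test G x S against the wrong error term and tend to overstate its significance.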

Questions
- Can anyone briefly clarify the purpose of variety trials?
- When should you divide a TPE?

Summary 1
- The purpose of a variety trial is to predict future performance in the TPE
- Random GEI is large, and it reduces the precision with which cultivar means can be estimated
- Variance component estimates for the GSY model can be used to study resource allocation in testing programs
- Within a homogeneous TPE, the GYS variance is usually the largest. If so, strategies that emphasize testing over several sites or over several years are likely to be equally successful

Summary 2
- There is little benefit from including more than 3 replicates (and often more than 2) in a MET
- Standard errors and LSDs estimated from single sites are unrealistically low because they do not take random GEI into account
- Fixed-subregion hypotheses allow the existence of genotype x subregion interaction to be tested against the genotype x trial-within-subregion interaction