Contact: Biplot Analysis of Multi-Environment Trial Data Weikai Yan May 2006.


1 Contact: wyan@ggebiplot.com Biplot Analysis of Multi-Environment Trial Data Weikai Yan May 2006

2 Weikai Yan 2006 Multi-Environment Trials (MET)
– MET are essential
– MET are expensive
– MET data are valuable
– MET data are not fully used

3 Weikai Yan 2006 Why biplot analysis?
Biplot analysis can help understand MET data graphically, effectively, and conveniently.

4 Weikai Yan 2006 Outline
– Multi-environment trial (MET) data
– Basics of biplot analysis
– Biplot analysis of G-by-E data
– Biplot analysis of G-by-T data
– Better understanding of MET data
– Conclusions

5 Contact: wyan@ggebiplot.com Multi-environment trial data

6 Weikai Yan 2006 MET data are a genotype-environment-trait (G-E-T) 3-way table
– Multiple genotypes
– Multiple environments
– Multiple traits

7 Weikai Yan 2006 A G-E-T 3-way table contains many 2-way tables
– G by E: for each trait
– G by T (trait): in each environment; across environments
– E by T: for each genotype; across genotypes
G-E-T data >> G-E data

8 Weikai Yan 2006 A G-E-T 3-way table is an extended 2-way table
– G by V: each E-T combination as a variable (V)
– P by T: each G-E combination as a phenotype (P)
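The slicing and unfolding described on slides 7 and 8 can be sketched with NumPy. This is only an illustration: the array `get`, its dimensions, and its values are hypothetical and not taken from the presentation.

```python
import numpy as np

# Hypothetical G-E-T array: 4 genotypes x 3 environments x 2 traits.
n_g, n_e, n_t = 4, 3, 2
get = np.arange(n_g * n_e * n_t, dtype=float).reshape(n_g, n_e, n_t)

# G by E table for one trait (here trait index 0).
g_by_e = get[:, :, 0]

# G by T table in one environment, and averaged across environments.
g_by_t_env0 = get[:, 0, :]
g_by_t_mean = get.mean(axis=1)

# G by V: each E-T combination becomes one column (variable).
g_by_v = get.reshape(n_g, n_e * n_t)

# P by T: each G-E combination becomes one row (phenotype).
p_by_t = get.reshape(n_g * n_e, n_t)
```

With this layout, column `e * n_t + t` of `g_by_v` holds trait `t` in environment `e`, and row `g * n_e + e` of `p_by_t` holds genotype `g` in environment `e`.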

9 Weikai Yan 2006 A G-E-T 3-way table implies informative 2-way tables
– Association by environment 2-way tables
– Associations: among traits; between traits and genetic markers

10 Weikai Yan 2006 Goals of MET data analysis
Short-term goals:
– Variety evaluation
  – Response to the environment (G x E)
  – Trait profiles (G x T)
Long-term goals:
– To understand
  – the target environment (G x E)
  – the test environments (G x E)
  – the crop (G x T)
  – the genotype x environment interaction (A x T)

11 Contact: wyan@ggebiplot.com Basics of biplot analysis
Most two-way tables can be visually studied using biplots.

12 Weikai Yan 2006 Origin of biplot
– Gabriel (1971)
– One of the most important advances in data analysis in recent decades
– Currently: > 50,000 web pages; numerous academic publications; included in most statistical analysis packages
– Still a very new technique to most scientists
(Photo: Prof. Ruben Gabriel, the founder of biplot. Courtesy of Prof. Purificación Galindo, University of Salamanca, Spain)

13 Weikai Yan 2006 What is a biplot?
Biplot = bi + plot
– plot: a scatter plot of two rows OR of two columns, or a scatter plot summarizing the rows OR the columns
– bi: BOTH rows AND columns
1 biplot >> 2 plots

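14 Weikai Yan 2006 Mathematical definition of a biplot
– Graphical display of matrix multiplication: A(4, 2) × B(2, 3) = P(4, 3)
– Inner-product property: $P_{ij} = |OA_i| \, |OB_j| \cos\alpha_{ij}$, where $\alpha_{ij}$ is the angle between the vectors $OA_i$ and $OB_j$
– The biplot therefore implies the product matrix
– Example: $|OA_1| = 5.0$, $|OB_1| = 4.472$, $\cos\alpha_{11} = 0.8944$, so $P_{11} = 5 \times 4.472 \times 0.8944 = 20$
(Figure: points A1 to A4 and B1 to B3 plotted in one plane)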
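The inner-product property can be checked numerically. In the sketch below the coordinates of the A and B points are assumptions chosen so that the first pair reproduces the slide's lengths (5.0 and 4.472) and cosine (0.8944); the actual coordinates used in the presentation are not given in the transcript.

```python
import numpy as np

# Hypothetical coordinates: 4 row points (A) and 3 column points (B) in 2-D.
A = np.array([[4.0, 3.0],
              [-2.0, 1.0],
              [1.0, -3.0],
              [0.5, 2.0]])          # shape (4, 2)
B = np.array([[2.0, -1.0, 3.0],
              [4.0, 2.0, 0.0]])     # shape (2, 3); column j is point B_j

P = A @ B                           # product matrix, shape (4, 3)


def inner_product(i, j):
    """Return |OA_i| * |OB_j| * cos(angle between OA_i and OB_j)."""
    a, b = A[i], B[:, j]
    len_a = np.hypot(a[0], a[1])
    len_b = np.hypot(b[0], b[1])
    # Angle between the two vectors, from their polar angles.
    angle = np.arctan2(a[1], a[0]) - np.arctan2(b[1], b[0])
    return len_a * len_b * np.cos(angle)


# Every element of P can be read off the plot as length x length x cosine.
reconstructed = np.array([[inner_product(i, j) for j in range(3)]
                          for i in range(4)])
```

Here `reconstructed` agrees with `P` up to rounding, which is what lets a reader recover the table from the picture.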

15 Weikai Yan 2006 Practical definition of a biplot
Any two-way table can be analyzed using a 2D biplot as long as it can be sufficiently approximated by a rank-2 matrix (Gabriel, 1971).
– Matrix decomposition of a G-by-E table: P(4, 3) ≈ G(4, 2) × E(2, 3)
– (Figure: genotypes G1 to G4 and environments E1 to E3 plotted in one biplot)
(3D biplots are now also possible.)

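16 Weikai Yan 2006 Singular Value Decomposition (SVD) & Singular Value Partitioning (SVP)
– SVD expresses Y as the product of a matrix characterising the rows, the singular values, and a matrix characterising the columns. SVD = PCA?
– The number of terms equals the rank of Y, i.e., the minimum number of PCs required to fully represent Y.
– SVP splits each singular value between the row scores and the column scores using a partitioning factor f (0 ≤ f ≤ 1).
– The row scores and the column scores are plotted together to form the biplot.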
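A minimal sketch of SVD followed by singular value partitioning, assuming the usual form in which row scores are the left singular vectors times the singular values raised to f, and column scores are the right singular vectors times the singular values raised to 1 − f. The function name `svp_scores` and the test matrix are hypothetical; this is not the GGEbiplot implementation.

```python
import numpy as np


def svp_scores(Y, f, n_pc=2):
    """Return (row_scores, col_scores) for the first n_pc components.

    f is the partitioning factor, 0 <= f <= 1:
    f = 1 gives all of each singular value to the rows,
    f = 0 gives all of it to the columns.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    row_scores = U[:, :n_pc] * s[:n_pc] ** f
    col_scores = Vt[:n_pc, :].T * s[:n_pc] ** (1.0 - f)
    return row_scores, col_scores


# Hypothetical rank-2 table: a 2-D biplot represents it exactly.
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2)) @ rng.normal(size=(2, 4))

rows, cols = svp_scores(Y, f=0.5)
Y_from_biplot = rows @ cols.T       # inner products of row and column scores
```

Whatever value of f is chosen, the product of row and column scores is the same; f only changes how the points are spread on the plot.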

17 Weikai Yan 2006 Biplot interpretations
– Inner-product property
– Interpretations based on biplots with f = 1: approximates $YY^T$, the distance matrix; shows similarity/dissimilarity among row (genotype) factors
– Interpretations based on biplots with f = 0: approximates $Y^TY$, the variance matrix; shows similarity/dissimilarity among column (environment) factors
– Combined use of f = 0 and f = 1
(Gabriel, 2002, Biometrika; Yan, 2002, Agron J; built into the GGEbiplot software)

18 Weikai Yan 2006 Biplot analysis is…
to use biplots to display
– a two-way data table per se (Y),
– its distance matrix ($YY^T$), and
– its variance matrix ($Y^TY$)
so that
– relationships among rows,
– relationships among columns, and
– interactions between rows and columns
can be graphically visualized.
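The link between the partitioning factor and the two matrices can be illustrated as follows. The data are random and hypothetical, and all components are retained so that the identities hold exactly; a 2-D biplot keeps only the first two components and therefore only approximates them.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(6, 3))         # hypothetical two-way table
Y = Y - Y.mean(axis=0)              # column-centered

U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# f = 1: singular values go entirely to the rows.
row_scores_f1 = U * s
# f = 0: singular values go entirely to the columns.
col_scores_f0 = Vt.T * s

# Inner products among row scores reproduce Y Y^T.
yyt = row_scores_f1 @ row_scores_f1.T
# Inner products among column scores reproduce Y^T Y.
yty = col_scores_f0 @ col_scores_f0.T
```

This is why relationships among rows are read from an f = 1 biplot and relationships among columns from an f = 0 biplot.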

19 Weikai Yan 2006 Data centering prior to biplot analysis
The general linear model for a G-by-E data set (P): $P = M + G + E + GE$
Possible two-way tables (Y):
– $Y = P = M + G + E + GE$: original data (QQE biplot)
– $Y = P - M = G + E + GE$: global-centered (PCA)
– $Y = P - M - E = G + GE$: column-centered (GGE biplot)
– $Y = P - M - G = E + GE$: row-centered
– $Y = P - M - G - E = GE$: double-centered (GE biplot)
All models are useful, depending on the research objectives (built into GGEbiplot).
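The five centering options can be sketched as one small function. It assumes a complete (balanced) genotype-by-environment table with genotypes as rows and environments as columns; the function name `center` and the model labels are hypothetical and do not correspond to GGEbiplot's own option codes.

```python
import numpy as np


def center(P, model):
    """Center a genotype-by-environment table P according to `model`."""
    M = P.mean()                                  # grand mean
    G = P.mean(axis=1, keepdims=True) - M         # genotype (row) main effects
    E = P.mean(axis=0, keepdims=True) - M         # environment (column) main effects
    if model == "none":                           # Y = M + G + E + GE
        return P.copy()
    if model == "global":                         # Y = G + E + GE
        return P - M
    if model == "column":                         # Y = G + GE (GGE biplot)
        return P - M - E
    if model == "row":                            # Y = E + GE
        return P - M - G
    if model == "double":                         # Y = GE (GE biplot)
        return P - M - G - E
    raise ValueError(f"unknown centering model: {model!r}")


# Hypothetical 3 x 3 table.
P = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0],
              [0.0, 1.0, 5.0]])
Y_gge = center(P, "column")
Y_ge = center(P, "double")
```

After column-centering every environment has mean zero, so what remains is G + GE; after double-centering both the row means and the column means are zero, leaving only GE.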

20 Weikai Yan 2006 Data scaling prior to biplot analysis
Different GGE biplots: $Y_{ij} = (G_i + GE_{ij}) / s_j$
– $s_j = 1$: no scaling
– $s_j = (\text{s.d.})_j$: all environments are equally important
– $s_j = (\text{s.e.})_j$: heterogeneity among environments is removed
(built into GGEbiplot)
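A short sketch of the first two scaling options applied to column-centered data. The function name `scale_gge` and the option labels are hypothetical. Scaling by the within-environment standard error needs replicated data, which a table of means does not carry, so that option is left out here.

```python
import numpy as np


def scale_gge(P, method="none"):
    """Column-center P (G + GE) and divide each environment by s_j."""
    Y = P - P.mean(axis=0, keepdims=True)
    if method == "none":
        s = np.ones(P.shape[1])                   # s_j = 1
    elif method == "sd":
        s = Y.std(axis=0, ddof=1)                 # s_j = s.d. among genotype means
    else:
        raise ValueError(f"unknown scaling method: {method!r}")
    return Y / s


# Hypothetical table in which the second environment has a much larger spread.
P = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [3.0, 20.0],
              [4.0, 60.0]])
Y_unscaled = scale_gge(P, "none")
Y_scaled = scale_gge(P, "sd")
```

Without scaling, the second environment dominates the decomposition; after s.d. scaling each environment has unit standard deviation and contributes equally.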

21 Weikai Yan 2006 Four questions must be asked before trying to interpret a biplot
1. What is the model? How were the data centered and scaled? What are we looking at?
2. What is the goodness of fit? How confident are we about what we see? What if the data are fitted poorly?
3. How are the singular values partitioned? What questions can be asked?
4. Are the axes drawn to scale? Are the patterns artifacts?
(All are addressed explicitly in GGEbiplot)
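For question 2, one common measure of goodness of fit is the share of the total sum of squares of Y captured by the plotted components. The sketch below assumes that definition; the function name `goodness_of_fit` is hypothetical.

```python
import numpy as np


def goodness_of_fit(Y, n_pc=2):
    """Proportion of the sum of squares of Y explained by the first n_pc PCs."""
    s = np.linalg.svd(Y, compute_uv=False)        # singular values, descending
    return float((s[:n_pc] ** 2).sum() / (s ** 2).sum())


# A 3 x 3 identity matrix has three equal singular values,
# so two components explain two thirds of it.
fit_identity = goodness_of_fit(np.eye(3))

# A rank-2 matrix is represented completely by two components.
rng = np.random.default_rng(3)
fit_rank2 = goodness_of_fit(rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5)))
```

The value should be computed on the centered and scaled table that is actually plotted, not on the raw data.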

22 Contact: wyan@ggebiplot.com Biplot Analysis of G-by-E data
– Mega-environment analysis
– Test environment evaluation
– Genotype evaluation

23 Weikai Yan 2006 Sample G-by-E data (Yield data of 18 genotypes in 9 environments, 1993, Ontario, Canada)

24 Weikai Yan 2006 Before trying to interpret a biplot…
1. Model selection? Centering = 2 (G + GE); scaling = 0
2. Goodness of fit? 78%
3. Singular value partitioning? SVP = 2 (environment-metric)
4. Draw to scale? Yes

25 Weikai Yan 2006 G by E data analysis
– Mega-environment analysis
– Test environment evaluation
– Genotype evaluation
A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years.

26 Weikai Yan 2006 Relationships among environments
– The environment-vector view
– Angle vs. correlation
– The angles among test environments
– Environment grouping
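The "angle vs. correlation" reading can be illustrated numerically. The sketch assumes column-centered data and environment-metric scores (f = 0); the yield table is random and hypothetical. With all components retained, the cosine of the angle between two environment vectors equals their correlation exactly, and the 2-D biplot shows an approximation of it.

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.normal(size=(18, 4))        # hypothetical: 18 genotypes x 4 environments
Y = P - P.mean(axis=0)              # column-centered (G + GE)

U, s, Vt = np.linalg.svd(Y, full_matrices=False)
env_vectors = Vt.T * s              # environment scores with f = 0, all PCs


def cosine_matrix(vectors):
    """Cosines of the angles between all pairs of row vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


cos_full = cosine_matrix(env_vectors)           # exact
cos_2d = cosine_matrix(env_vectors[:, :2])      # what a 2-D biplot displays
corr = np.corrcoef(P, rowvar=False)             # correlations among environments
```

An acute angle therefore suggests a positive correlation, a right angle no correlation, and an obtuse angle a negative correlation, to the extent that the biplot fits the data well.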

27 Weikai Yan 2006 Which-won-where
(Crossover GE is GE that causes genotype rank changes and different winners in different test environments)
(Genotypes labelled in the figure: G12, G7, G18, G8, G13)

28 Weikai Yan 2006 Are there meaningful crossover GE?
The which-won-where view
(Crossover GE is GE that causes genotype rank changes and different winners in different test environments)
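The which-won-where question can be approximated without drawing the polygon: for each environment, take the genotype with the largest value in the rank-2 fit of the column-centered table. This is a simplified stand-in for the graphical view, under the assumption that the sector winner is the genotype with the largest inner product with the environment vector. The function name, genotype names, and data are hypothetical.

```python
import numpy as np


def which_won_where(P, genotypes, environments, n_pc=2):
    """Map each environment to the genotype with the highest fitted value."""
    Y = P - P.mean(axis=0, keepdims=True)               # G + GE
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    fitted = (U[:, :n_pc] * s[:n_pc]) @ Vt[:n_pc, :]    # biplot approximation of Y
    return {env: genotypes[int(np.argmax(fitted[:, j]))]
            for j, env in enumerate(environments)}


# Hypothetical table with a crossover: G1 wins in E1 and E3, G2 wins in E2.
P = np.array([[5.0, 1.0, 5.0],
              [1.0, 5.0, 1.0],
              [3.0, 3.0, 3.0]])
winners = which_won_where(P, ["G1", "G2", "G3"], ["E1", "E2", "E3"])
```

Environments that share a winner fall in the same sector of the polygon view; whether such groups are mega-environments depends on the patterns repeating across years.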

29 Weikai Yan 2006 Are the crossover patterns* repeatable?
If YES:
– The target environment can be divided into multiple mega-environments
– GE can be exploited by selecting for each mega-environment
– GE → G
If NO:
– The target environment CANNOT be divided into multiple mega-environments
– GE CANNOT be exploited
– GE must be avoided by testing across locations and years
*Not the environment-grouping patterns
A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years. Multi-year data are needed.

30 Weikai Yan 2006 Classify your target environment into one of three categories
– (1) No crossover GE: single simple ME. A single test location in a single year suffices to select a single best variety.
– (2) Crossover GE, repeatable: multiple MEs. Select for specifically adapted genotypes for each ME.
– (3) Crossover GE, not repeatable: single complex ME. Select for generally adapted genotypes across the whole region and across multiple years.
ME: mega-environment

31 Weikai Yan 2006 G by E data analysis
– Mega-environment analysis
– Test environment evaluation
– Genotype evaluation

32 Weikai Yan 2006 Discriminating ability and representativeness
– Vector length: discriminating ability
– Angle to the AE: representativeness
(Figure labels: average-environment axis, average environment)
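Both quantities can be computed from the 2-D environment coordinates. The sketch assumes column-centered data, environment-metric scores (f = 0), and an average environment defined as the mean of the environment coordinates; the function name and the data are hypothetical.

```python
import numpy as np


def evaluate_environments(P):
    """Return (vector_length, angle_to_average_in_degrees) per environment."""
    Y = P - P.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    env = Vt[:2, :].T * s[:2]                     # 2-D environment vectors, f = 0
    avg_env = env.mean(axis=0)                    # the "average environment" point
    length = np.linalg.norm(env, axis=1)          # discriminating ability
    # Assumes no environment vector and no average vector has zero length.
    cos_angle = env @ avg_env / (length * np.linalg.norm(avg_env))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return length, angle


# Hypothetical table: the three environments rank the genotypes identically,
# but the third spreads them twice as widely as the first two.
P = np.array([[1.0, 1.0, 2.0],
              [2.0, 2.0, 4.0],
              [3.0, 3.0, 6.0],
              [4.0, 4.0, 8.0]])
length, angle = evaluate_environments(P)
```

In this example all three environments point in the same direction (equally representative), while the third has the longest vector (most discriminating).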

33 Weikai Yan 2006 Ideal test environments: discriminating and representative
(Figure label: ideal test environment)

34 Weikai Yan 2006 Classify each test environment into one of three categories
– (1) Not discriminative: useless
– (2) Discriminative and representative: good for selecting (more important)
– (3) Discriminative but not representative: useful for culling (less important)
For each good or useful test environment: is it essential?

35 Weikai Yan 2006 Vector length = discrimination = GE = GE1 + GE2
– Contribution to proportionate GE
– Contribution to non-proportionate GE

36 Weikai Yan 2006 G by E data analysis
– Mega-environment analysis
– Test environment evaluation
– Genotype evaluation

37 Weikai Yan 2006 Vector length = GGE = G + GE
– Contribution to GE (instability)
– Contribution to G (mean performance)

38 Weikai Yan 2006 Mean vs. Stability

39 Weikai Yan 2006 Genotype ranking on both MEAN and STABILITY The ideal genotype
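Ranking on mean and stability can be sketched by projecting the genotype points onto the average-environment axis (mean performance) and onto the direction perpendicular to it (instability). The sketch assumes column-centered data, genotype-metric scores (f = 1), and an average environment defined as the mean of the environment coordinates; these choices, the function name, and the data are assumptions, not details given in the transcript.

```python
import numpy as np


def mean_and_stability(P):
    """Return (mean_score, instability) for each genotype from a 2-D GGE biplot."""
    Y = P - P.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    geno = U[:, :2] * s[:2]                       # genotype coordinates, f = 1
    env = Vt[:2, :].T                             # environment coordinates, f = 1
    aea = env.mean(axis=0)                        # average-environment direction
    aea_unit = aea / np.linalg.norm(aea)          # assumes a non-zero average vector
    perp_unit = np.array([-aea_unit[1], aea_unit[0]])
    mean_score = geno @ aea_unit                  # position along the axis
    instability = np.abs(geno @ perp_unit)        # distance from the axis
    return mean_score, instability


# Hypothetical table with no GE: genotypes differ only in mean performance.
P = np.array([[5.0, 5.0, 5.0],
              [3.0, 3.0, 3.0],
              [1.0, 1.0, 1.0]])
mean_score, instability = mean_and_stability(P)
ranking = np.argsort(-mean_score)                 # best genotype first
```

A genotype far along the axis with a small distance from it is close to the "ideal genotype" of the slide: high mean performance and high stability.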

40 Weikai Yan 2006 Genotype classification (mean vs. stability)
– High mean performance, high stability: generally adapted (VERY GOOD)
– High mean performance, low stability: specifically adapted (GOOD)
– Low mean performance, high stability: bad everywhere (VERY BAD)
– Low mean performance, low stability: bad somewhere (BAD)
Are there stability genes?!

41 Weikai Yan 2006 G x E data analysis summary
1) Mega-environment analysis
2) Test environment evaluation
3) Genotype evaluation
Important comments:
– (2) and (3) are meaningful only for a single mega-environment
– Any stability analysis is meaningful only for a single mega-environment
– Any stability index can be used only as a modifier to the ranking based on mean performance

42 Contact: wyan@ggebiplot.com Other ways to view a GGE biplot

43 Weikai Yan 2006 Inner-product property

44 Weikai Yan 2006 Ranking on a single environment

45 Weikai Yan 2006 Ranking on two environments

46 Weikai Yan 2006 Relative adaptation of a genotype

47 Weikai Yan 2006 Compare any two genotypes

48 Contact: wyan@ggebiplot.com Biplot analysis of Genotype by trait data

49 Weikai Yan 2006 Objectives of G by T data analysis
– Genotype evaluation based on trait profiles
– Relationship among breeding objectives

50 Weikai Yan 2006 Data of 4 traits for 19 covered oat varieties (Ontario 2004) (Background info: High yield, high groat, high protein, and low oil are desirable for milling oats)

51 Weikai Yan 2006 Relationships among traits

52 Weikai Yan 2006 Trait profile of each genotype

53 Weikai Yan 2006 Trait profile of a genotype

54 Weikai Yan 2006 Trait profile comparison between two genotypes

55 Weikai Yan 2006 Genotype ranking based on a trait

56 Weikai Yan 2006 Parent selection based on trait profiles

57 Weikai Yan 2006 Independent culling

58 Contact: wyan@ggebiplot.com Fuller understanding of MET data
MET data are more informative than you thought

59 Weikai Yan 2006 A G-E-T 3-way dataset contains various 2-way tables
– G by E data
– G by T data
– E by T data: for each genotype; all genotypes
– G by V data: each E-T as a variable (V)
– P by T data: each G-E as a phenotype (P)
– Genetic association by environment data
– Trait association by environment data

60 Weikai Yan 2006 Genetic-covariate by environment biplot (QTL by environment biplot) Barley Genomics Data

61 Weikai Yan 2006 Trait-association by environment biplot Oat MET Data

62 Weikai Yan 2006 Four-way data analysis Year…

63 Contact: wyan@ggebiplot.com Conclusions

64 Weikai Yan 2006 Conclusion (1)
GGE biplot analysis is an effective tool for G by E data analysis to achieve understanding of:
1. the target environment,
2. the test environments, and
3. the genotypes
4. Stability analysis is useful only within a single mega-environment

65 Weikai Yan 2006 Conclusion (2)
GGE biplot analysis is an effective tool for G by T data analysis to achieve understanding of:
1. the interconnected plant system,
2. positively correlated traits,
3. negatively correlated traits, and
4. the strengths and weaknesses of the genotypes

66 Weikai Yan 2006 Conclusion (3)
Biplot analysis is an effective tool for other two-way table analysis
– Marker by environment
– QTL by environment
– Gene by treatment
– Diallel cross
– …

67 Weikai Yan 2006 Conclusion (4)
Biplot analysis can be VERY EASY…
– From reading data to displaying the biplot: 2 seconds
– Displaying any of the perspectives of a biplot and changing from one to another: 1 second
– Displaying the biplot for any subset: 1 second
– Learning how to use the software and interpret biplots: 30 minutes
– Everything can be just one mouse-click away

68 Contact: wyan@ggebiplot.com Thank you
Contact: Weikai Yan: wyan@ggebiplot.com
Web: www.ggebiplot.com

