Genotype x Environment Interactions: Analyses of Multiple Location Trials
Previous Class: Why do researchers conduct experiments over multiple locations and multiple times? What causes genotype x environment interactions? What is the difference between a ‘true’ interaction and a scalar interaction? Which environments can be considered controlled, partially controlled, or not controlled?
How many environments do I need? Where should they be?
Number of Environments: Availability of planting material. Diversity of environmental conditions. Magnitude of error variances and genetic variances in any one year or location. Availability of suitable cooperators. Cost of each trial (dollars and time).
Location of Environments: Variability of environment throughout the target region. Proximity to research base. Availability of good cooperators. Cost.
Analyses of Multiple Experiments
Points to Consider before Analyses: Normality. Homoscedasticity (homogeneity) of error variance. Additivity. Randomness.
Bartlett Test (same degrees of freedom): $M = df\,\{n \ln(S) - \sum \ln \sigma^2\}$, where $S = \sum \sigma^2 / n$; $\chi^2_{n-1} = M/C$; $C = 1 + (n+1)/(3\,n\,df)$; $n$ = number of variances, $df$ = degrees of freedom of each variance.
Bartlett Test (same degrees of freedom), worked example: $S = 101.0$; $\ln(S) = 4.614$; $M = (5)[(4)(4.614) - \sum \ln \sigma^2] = 1.880$; $C = 1 + (5)/[(3)(4)(5)] = 1.083$; $\chi^2_{3\,df} = 1.880/1.083 = 1.74$ ns.
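The equal-df calculation can be sketched in a few lines of Python. This is only an illustration: the function name `bartlett_equal_df` and the four error variances are hypothetical (the individual variances from the slide are not reproduced here), and the 5% critical value for 3 df is taken from standard chi-square tables.

```python
import math

def bartlett_equal_df(variances, df):
    """Bartlett's M, C and chi-square when every variance has the same df."""
    n = len(variances)                       # number of variances
    pooled = sum(variances) / n              # S, the mean of the variances
    m = df * (n * math.log(pooled) - sum(math.log(v) for v in variances))
    c = 1 + (n + 1) / (3 * n * df)           # correction factor C
    return m, c, m / c                       # chi-square has n - 1 df

# Hypothetical error variances from four sites, each with 5 df.
m, c, chi2 = bartlett_equal_df([80.0, 95.0, 110.0, 119.0], df=5)
critical_5pct_3df = 7.815                    # tabulated chi-square, 3 df, P = 0.05
print(f"M = {m:.3f}, C = {c:.3f}, chi-square = {chi2:.3f}")
print("heterogeneous" if chi2 > critical_5pct_3df else "homogeneous (ns)")
```

With four variances and 5 df each, C comes out at 1.083, the same correction factor as in the worked example.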
Bartlett Test (different degrees of freedom): $M = (\sum df)\ln(S) - \sum df \ln \sigma^2$, where $S = [\sum df \cdot \sigma^2]/(\sum df)$; $\chi^2_{n-1} = M/C$; $C = 1 + \{1/[3(n-1)]\}\,[\sum(1/df) - 1/(\sum df)]$; $n$ = number of variances.
Bartlett Test (different degrees of freedom), worked example: $S = [\sum df \cdot \sigma^2]/(\sum df) = 13.79/37 = 0.3727$; $(\sum df)\ln(S) = (37)(-0.9870) = -36.52$; $M = (\sum df)\ln(S) - \sum df \ln \sigma^2 = -36.52 - (-54.472) \approx 17.96$; $C = 1 + \{1/[3(n-1)]\}\,[\sum(1/df) - 1/(\sum df)] = 1.057$; $\chi^2_{3\,df} = 17.96/1.057 = 16.99$ **.
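A minimal Python sketch of the unequal-df version follows. The function name `bartlett_unequal_df`, the variances and the individual df values are hypothetical (only their total of 37 mirrors the slide); the last lines simply re-trace the slide's arithmetic for M from the quantities it does report.

```python
import math

def bartlett_unequal_df(variances, dfs):
    """Bartlett's M, C and chi-square when the variances have different df."""
    n = len(variances)
    total_df = sum(dfs)
    # S: the df-weighted mean of the variances
    pooled = sum(d * v for d, v in zip(dfs, variances)) / total_df
    m = total_df * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances))
    c = 1 + (sum(1 / d for d in dfs) - 1 / total_df) / (3 * (n - 1))
    return m, c, m / c                       # chi-square has n - 1 df

# Hypothetical variances and df for four sites (total df = 37).
m, c, chi2 = bartlett_unequal_df([0.15, 0.30, 0.45, 0.80], [12, 10, 9, 6])
print(f"M = {m:.3f}, C = {c:.3f}, chi-square = {chi2:.3f}")

# Re-tracing the slide: S = 13.79/37 and sum(df ln variance) = -54.472.
slide_m = 37 * math.log(13.79 / 37) - (-54.472)
print(f"slide M = {slide_m:.2f}")
```

When all df are equal, the correction factor reduces to the simpler form $1 + (n+1)/(3\,n\,df)$, so one function could serve both cases.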
Heterogeneity of Error Variance
Significant Bartlett Test: “What can I do where there is significant heterogeneity of error variances?” Transform the raw data: often $\sigma^2 \propto \mu$ (compare with the binomial distribution, where $\mu = np$ and $\sigma^2 = npq$); transform to square roots.
Significant Bartlett Test: “What else can I do where there is significant heterogeneity of error variances?” Transform the raw data: homogeneity of error variance can always be achieved by transforming each site’s data to the standardized normal distribution, $(x_i - \mu)/\sigma$.
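As a rough illustration of the standardization, the sketch below rescales each site's values by that site's own mean and standard deviation. The site names and yields are hypothetical, and the sample standard deviation is used as the estimate of $\sigma$ (an assumption; the slide does not say which estimate to use).

```python
from statistics import mean, stdev

def standardize(values):
    """Return (x - mean) / sd for every value from one site."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]

# Hypothetical plot yields (t/ha) at two sites with very different spread.
site_yields = {
    "Site A": [4.1, 4.4, 3.9, 4.6, 4.0],
    "Site B": [6.0, 9.5, 3.2, 8.1, 5.2],
}
standardized = {site: standardize(v) for site, v in site_yields.items()}
for site, z in standardized.items():
    print(site, [round(x, 2) for x in z], "sd =", round(stdev(z), 3))
```

After the transformation every site has mean 0 and standard deviation 1, so differences between site means are removed along with the differences in spread.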
Significant Bartlett Test: “What can I do where there is significant heterogeneity of error variances?” Transform the raw data. Use non-parametric statistics.
Analyses of Variance
Model ~ Multiple sites: $Y_{ijk} = \mu + g_i + e_j + ge_{ij} + E_{ijk}$, with $\sum_i g_i = \sum_j e_j = \sum_{ij} ge_{ij} = 0$. Environments and replicate blocks are usually considered to be random effects. Genotypes are usually considered to be fixed effects.
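To make the terms of the model concrete, the sketch below splits a genotype x environment table of cell means into the grand mean, genotype effects, environment effects and interaction effects. The function name `decompose` and the yields are hypothetical; the error term is ignored because only cell means are used.

```python
def decompose(cell_means):
    """cell_means[i][j] is the mean of genotype i in environment j."""
    n_g, n_e = len(cell_means), len(cell_means[0])
    mu = sum(map(sum, cell_means)) / (n_g * n_e)               # grand mean
    g = [sum(row) / n_e - mu for row in cell_means]            # genotype effects
    e = [sum(row[j] for row in cell_means) / n_g - mu          # environment effects
         for j in range(n_e)]
    ge = [[cell_means[i][j] - mu - g[i] - e[j] for j in range(n_e)]
          for i in range(n_g)]                                 # interaction effects
    return mu, g, e, ge

# Hypothetical mean yields: 3 genotypes (rows) x 4 environments (columns).
cell_means = [
    [5.2, 6.1, 4.0, 7.3],
    [4.8, 6.5, 3.1, 8.0],
    [5.5, 5.9, 4.6, 6.4],
]
mu, g, e, ge = decompose(cell_means)
print("grand mean:", round(mu, 3))
print("genotype effects:", [round(x, 3) for x in g])
print("environment effects:", [round(x, 3) for x in e])
```

The estimated effects satisfy the sum-to-zero constraints of the model: the genotype effects, the environment effects, and every row and column of the interaction table each sum to zero.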
Analysis of Variance over sites
Models ~ Years and sites: $Y_{ijkl} = \mu + g_i + s_j + y_k + gs_{ij} + gy_{ik} + sy_{jk} + gsy_{ijk} + E_{ijkl}$, with $\sum_i g_i = \sum_j s_j = \sum_k y_k = 0$; $\sum_{ij} gs_{ij} = \sum_{ik} gy_{ik} = \sum_{jk} sy_{jk} = 0$; $\sum_{ijk} gsy_{ijk} = 0$.
Analysis of Variance
Interpretation
Interpretation: Look at data (diagrams and graphs). Joint regression analysis. Variance comparison analysis. Probability analysis. Multivariate transformation of residuals: Additive Main Effects and Multiplicative Interactions (AMMI).
Multiple Experiment Interpretation, Visual Inspection: Inter-plant competition study. Four crop species: Pea, Lentil, Canola, Mustard. Record plant height (cm) every week after planting. Significant species x time interaction.
[Figure: Plant Biomass x Time after Planting; series: Pea, Lentil, Mustard, Canola; groups: Legume, Brassica]
Joint Regression
Regression Revision: Glasshouse study of the relationship between time and plant biomass. Two species: B. napus and S. alba. Destructively sampled each week up to 14 weeks. Dry weight recorded.
[Figures: Dry Weight Above Ground Biomass; Biomass Study, S. alba and B. napus; Biomass Study (Ln Transformation), S. alba and B. napus]
B. napus: Mean x = 7.5; SS(x) = 227.5; SS(y) = 61.66. Fitted line of the form Ln(Growth) = a + b x Weeks; regression significant (***).
S. alba: Mean x = 7.5; SS(x) = 227.5; SS(y) = 61.03. Fitted line of the form Ln(Growth) = a + b x Weeks; regression significant (***).
Comparison of Regression Slopes, t-Test: $t = (b_1 - b_2) / \{[se(b_1) + se(b_2)]/2\} = 0.22$ ns.
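The slope comparison can be sketched as below. The ln(dry weight) values are hypothetical stand-ins generated from a formula (the recorded data are not available here), the function name `fit_line` is illustrative, and the denominator of t follows the slide: the mean of the two standard errors.

```python
import math

def fit_line(x, y):
    """Least-squares line y = a + b*x with the standard error of b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sp_xy / ss_x
    a = mean_y - b * mean_x
    resid_ms = max(ss_y - sp_xy ** 2 / ss_x, 0.0) / (n - 2)   # residual mean square
    return {"a": a, "b": b, "se_b": math.sqrt(resid_ms / ss_x), "ss_x": ss_x}

weeks = list(range(1, 15))                  # 14 weekly harvests, mean x = 7.5
# Hypothetical ln(dry weight) values for the two species.
ln_napus = [-2.0 + 0.50 * w + 0.2 * math.sin(w) for w in weeks]
ln_alba = [-1.8 + 0.48 * w + 0.2 * math.cos(w) for w in weeks]

napus, alba = fit_line(weeks, ln_napus), fit_line(weeks, ln_alba)
t = (napus["b"] - alba["b"]) / ((napus["se_b"] + alba["se_b"]) / 2)
print(f"SS(x) = {napus['ss_x']}, b1 = {napus['b']:.3f}, b2 = {alba['b']:.3f}, t = {t:.2f}")
```

Fourteen weekly samples give SS(x) = 227.5, which agrees with the value quoted for both species.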
Joint Regression Analyses
$Y_{ijk} = \mu + g_i + e_j + ge_{ij} + E_{ijk}$; $ge_{ij} = \beta_i e_j + \delta_{ij}$; $Y_{ijk} = \mu + g_i + (1 + \beta_i) e_j + \delta_{ij} + E_{ijk}$
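A minimal sketch of the joint regression follows, assuming a table of genotype means at each site. Each genotype's yields are regressed onto the site effects $e_j$ (site mean minus grand mean), so the fitted slope estimates $1 + \beta_i$. The function name `joint_regression_slopes` and the yields are hypothetical.

```python
def joint_regression_slopes(yields):
    """yields[i][j]: mean yield of genotype i at site j.
    Returns the slope (1 + beta_i) of each genotype on the site effects."""
    n_g, n_e = len(yields), len(yields[0])
    site_means = [sum(row[j] for row in yields) / n_g for j in range(n_e)]
    grand_mean = sum(site_means) / n_e
    e = [s - grand_mean for s in site_means]          # site effects e_j
    ss_e = sum(v * v for v in e)
    slopes = []
    for row in yields:
        row_mean = sum(row) / n_e
        sp = sum((y - row_mean) * ej for y, ej in zip(row, e))
        slopes.append(sp / ss_e)
    return slopes

# Hypothetical yields (t/ha): 3 genotypes x 5 sites.
yields = [
    [2.1, 3.0, 4.2, 5.1, 6.3],
    [2.8, 3.2, 3.9, 4.4, 5.0],
    [1.5, 2.9, 4.6, 5.8, 7.4],
]
slopes = joint_regression_slopes(yields)
for i, b in enumerate(slopes, start=1):
    print(f"genotype {i}: slope = {b:.2f}, beta = {b - 1:+.2f}")
```

Because the site means are built from the same genotypes, the slopes average exactly 1 across genotypes; a slope above 1 indicates above-average sensitivity to the environment and a slope below 1 indicates below-average sensitivity.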
[Figure: Yield plotted against Environments; lines a, b, c, d]
Joint Regression Example: Class notes, Table 15, Page 229. 20 canola (Brassica napus) cultivars. Nine locations. Seed yield.
Joint Regression Example: Westar = 0.94 x Mean; regression significant (***).
Joint Regression Example: Bounty = 1.12 x Mean; regression significant (***).
Joint Regression ~ Example #2
Joint Regression
Problems with Joint Regression: Non-independence: genotype values are regressed onto site means, which are themselves derived from those genotype values. The x-axis values (site means) are subject to error, contrary to a basic regression assumption. Sensitivity ($\beta$-values) is correlated with genotype mean.
Problems with Joint Regression: Non-independence: genotype values are regressed onto site means, which are themselves derived from those genotype values. Remedies: do not include the genotype's own value in the site mean used for its regression, or regress onto values other than site means (i.e. control values).
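The first remedy can be sketched as follows: for each genotype, the x-axis value at a site is the mean of all the other genotypes at that site. The function name `independent_slopes` and the yields are hypothetical.

```python
def independent_slopes(yields):
    """Regress each genotype on site means computed WITHOUT that genotype.
    yields[i][j]: mean yield of genotype i at site j."""
    n_g, n_e = len(yields), len(yields[0])
    site_totals = [sum(row[j] for row in yields) for j in range(n_e)]
    slopes = []
    for row in yields:
        # Site index for this genotype: mean of the other n_g - 1 genotypes.
        index = [(site_totals[j] - row[j]) / (n_g - 1) for j in range(n_e)]
        mean_x, mean_y = sum(index) / n_e, sum(row) / n_e
        ss_x = sum((x - mean_x) ** 2 for x in index)
        sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(index, row))
        slopes.append(sp_xy / ss_x)
    return slopes

# Hypothetical yields (t/ha): 3 genotypes x 5 sites.
yields = [
    [2.1, 3.0, 4.2, 5.1, 6.3],
    [2.8, 3.2, 3.9, 4.4, 5.0],
    [1.5, 2.9, 4.6, 5.8, 7.4],
]
print([round(b, 2) for b in independent_slopes(yields)])
```

With this index the slopes no longer average exactly 1, since each genotype is regressed onto a different x-axis, but the dependent and independent variables no longer share data.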
Problems with Joint Regression: The x-axis values (site means) are subject to error, contrary to a basic regression assumption. Sensitivity ($\beta$-values) is correlated with genotype mean.
Addressing the Problems: Use genotype variance over sites, rather than regression coefficients, to indicate sensitivity.
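As a small illustration, the variance of each genotype's yields across sites can be computed directly. The genotype labels and yields below are hypothetical; they are chosen so that both genotypes have the same mean but very different stability.

```python
from statistics import mean, variance

# Hypothetical yields (t/ha) of two genotypes at the same four sites.
site_yields = {
    "Genotype A": [4.0, 5.0, 6.0, 5.0],
    "Genotype B": [2.0, 5.0, 8.0, 5.0],
}
summary = {name: (mean(v), variance(v)) for name, v in site_yields.items()}
for name, (m, var) in summary.items():
    print(f"{name}: mean = {m:.2f}, over-site variance = {var:.2f}")
```

Both genotypes average 5.0 t/ha, but the second has the larger over-site variance and so would be regarded as the more environmentally sensitive of the two.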
Genotype Yield over Sites ‘Ark Royal’
Genotype Yield over Sites ‘Golden Promise’
Over Site Variance
Univariate Probability Prediction: $P(A > T) = \int_T^{\infty} f(A)\,dA$, where $f(A)$ is the normal density of trait A (mean $\mu$, standard deviation $\sigma_A$) and T is the target value.
[Figure: Environmental Variation; curves 1 and 2 relative to target T]
Use of Normal Distribution Function Tables: $|T - m|/\sigma_g$ to predict values greater than the target (T); $|m - T|/\sigma_g$ to predict values less than the target (T).
Use of Normal Distribution Function Tables: The mean (m) and environmental variance ($\sigma_g^2$) of a genotype are 12.0 t/ha and 16.0, respectively (so $\sigma_g = 4$). What is the probability that the yield of that genotype will exceed 14 t/ha when grown at any site in the region chosen at random from all possible sites?
Use of Normal Distribution Function Tables: $(T - m)/\sigma_g = (14 - 12)/4 = 0.5$. Using normal distribution tables, the probability from $-\infty$ to T is 0.6915. The answer is therefore 1 - 0.6915 = 0.3085 (or 30.85% of all sites in the region).
Use of Normal Distribution Function Tables: The mean (m) and environmental variance ($\sigma_g^2$) of a genotype are 12.0 t/ha and 16.0, respectively (so $\sigma_g = 4$). What is the probability that the yield of that genotype will exceed 11 t/ha when grown at any site in the region chosen at random from all possible sites?
Use of Normal Distribution Function Tables: $(T - m)/\sigma_g = (11 - 12)/4 = -0.25$. Using normal distribution tables we have $\Phi(0.25) = 0.5987$, but because the standardized value is negative our answer is 1 - (1 - 0.5987) = 0.5987, or 60% of all sites in the region.
Use of Normal Distribution Function Tables: Look up the table value for $|T - m|/\sigma_g$. Exceed the target and $(T - m)/\sigma_g$ positive: probability = 1 - table value. Exceed the target and $(T - m)/\sigma_g$ negative: probability = table value. Less than the target and $(T - m)/\sigma_g$ positive: probability = table value. Less than the target and $(T - m)/\sigma_g$ negative: probability = 1 - table value.
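The table lookups can be checked with a short Python sketch that evaluates the standard normal distribution function through `math.erf`. The function names are illustrative; the inputs are the two worked questions (m = 12 t/ha, $\sigma_g$ = 4, targets of 14 and 11 t/ha).

```python
import math

def normal_cdf(z):
    """Standard normal distribution function (what the printed table gives)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_exceeds(target, mean, sd):
    """Probability that a normally distributed yield is above the target."""
    return 1.0 - normal_cdf((target - mean) / sd)

def prob_below(target, mean, sd):
    """Probability that a normally distributed yield is below the target."""
    return normal_cdf((target - mean) / sd)

p14 = prob_exceeds(14.0, 12.0, 4.0)
p11 = prob_exceeds(11.0, 12.0, 4.0)
print(f"P(yield > 14) = {p14:.4f}")   # prints P(yield > 14) = 0.3085
print(f"P(yield > 11) = {p11:.4f}")   # prints P(yield > 11) = 0.5987
```

Working with the signed value $(T - m)/\sigma_g$ removes the need for the four sign rules, because the distribution function handles negative arguments directly.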
Univariate Probability
Multivariate Probability Prediction: $\int_{T_1}^{\infty} \int_{T_2}^{\infty} \cdots \int_{T_n}^{\infty} f(x_1, x_2, \ldots, x_n)\, dx_1\, dx_2 \cdots dx_n$
Problems with Probability Technique: Setting suitable target values: control performance; industry (or other) standard; past experience; experimental averages.
Problems with Probability Technique: Complexity of analytical estimation when the number of variables is high: use of rank sums.
Additive Main Effects and Multiplicative Interactions (AMMI): AMMI analysis partitions the residual interaction effects using principal components. Inspect a scatter plot of the scores on the first two principal components (PC1 and PC2), or of PC1 against the mean.
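AMMI Analyses: $Y_{ijk} = \mu + g_i + e_j + ge_{ij} + E_{ijk}$, so $Y_{ijk} - \mu - g_i - e_j - E_{ijk} = ge_{ij}$. The interaction effects form the matrix
$$\begin{pmatrix} ge_{11} & ge_{12} & ge_{13} & \cdots & ge_{1n} \\ ge_{21} & ge_{22} & ge_{23} & \cdots & ge_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ ge_{i1} & ge_{i2} & ge_{i3} & \cdots & ge_{in} \\ \vdots & \vdots & \vdots & & \vdots \\ ge_{k1} & ge_{k2} & ge_{k3} & \cdots & ge_{kn} \end{pmatrix}$$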
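A rough pure-Python sketch of the first AMMI step is given below: the interaction matrix is formed from a table of cell means, and its leading principal component is extracted by power iteration (a simple stand-in chosen here to avoid external libraries; a full analysis would normally use a singular value decomposition routine). The function names and the yields are hypothetical.

```python
import math

def interaction_effects(cell_means):
    """ge[i][j] = cell mean - grand mean - genotype effect - environment effect."""
    n_g, n_e = len(cell_means), len(cell_means[0])
    mu = sum(map(sum, cell_means)) / (n_g * n_e)
    g = [sum(row) / n_e - mu for row in cell_means]
    e = [sum(row[j] for row in cell_means) / n_g - mu for j in range(n_e)]
    return [[cell_means[i][j] - mu - g[i] - e[j] for j in range(n_e)]
            for i in range(n_g)]

def first_component(ge, iterations=500):
    """Leading singular value with genotype (u) and site (v) vectors."""
    n_g, n_e = len(ge), len(ge[0])
    v = [1.0 / (j + 1) for j in range(n_e)]      # arbitrary non-constant start
    for _ in range(iterations):
        u = [sum(ge[i][j] * v[j] for j in range(n_e)) for i in range(n_g)]
        w = [sum(ge[i][j] * u[i] for i in range(n_g)) for j in range(n_e)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                          # no interaction to partition
            return 0.0, [0.0] * n_g, [0.0] * n_e
        v = [x / norm for x in w]
    u = [sum(ge[i][j] * v[j] for j in range(n_e)) for i in range(n_g)]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v

# Hypothetical mean yields: 3 genotypes (rows) x 4 sites (columns).
cell_means = [
    [5.2, 6.1, 4.0, 7.3],
    [4.8, 6.5, 3.1, 8.0],
    [5.5, 5.9, 4.6, 6.4],
]
ge = interaction_effects(cell_means)
sigma, u, v = first_component(ge)
share = sigma ** 2 / sum(x * x for row in ge for x in row)
print(f"PC1 accounts for {100 * share:.1f}% of the interaction sum of squares")
```

Genotype and site scores for plotting are commonly obtained by scaling `u` and `v` by the square root of `sigma`; that scaling is a convention and is not taken from the slides.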
[Figure: AMMI Analysis, Seed Yield; points labelled G1 to G4 and S1 to S7]
Time Square Chi-Square