What are BLUPs, and why are they useful?

Best Linear Unbiased Prediction (BLUP) is useful for two main reasons: (1) it allows the analysis of UNBALANCED data accumulated from performance tests, and (2) it exploits information from RELATIVES. Inbred recycling in different crop species naturally leads to pedigree relationships among inbreds.

Henderson began his pioneering work on BLUP in the 1940s
In general, we use "BLUP" to refer to the joint use of both BLUP and Best Linear Unbiased Estimation (BLUE). Fixed effects are estimated in BLUE. These effects are constants rather than random variables. Examples are the overall mean, the effect of soil type, the effects of sites or environments, the effect of a transgene, etc. Fixed effects do not have a variance-covariance structure. The data need to be corrected for possible environmental effects before the effects of genotypes are compared.

Random effects are predicted in BLUP
Random effects have a variance-covariance structure, whereas fixed effects do not. Examples of such structure:
Resemblance among relatives (full-sibs, half-sibs).
Soil trends due to spatial correlation. Plot-to-plot variability in the field is correlated according to distance: plots that are far apart are less correlated than plots that are close together.
Variation in time, which also has a covariance structure.

LINEAR MIXED MODELS
The presence of fixed and random effects leads to a MIXED MODEL. BLUP and BLUE refer to statistical properties of the predictions and estimates rather than to the procedure for obtaining them.
BEST = the sampling variance of what is estimated or predicted is MINIMIZED.
UNBIASED (in BLUE) = the expected value of the estimates is equal to their true value.
UNBIASED (in BLUP) = the predictions have zero expectation.

WHAT IS REQUIRED IN BLUP?
Knowledge of the true values of the variances and covariances of the random effects. These are unknown, so implementing BLUP with estimates of these variances is always an approximation. In practice, BLUP involves the simultaneous prediction of genetic effects and estimation of genetic and non-genetic variance components.

LINEAR MIXED MODEL
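In general matrix form, a linear mixed model is commonly written as shown below. The symbols β, u, Z, G and R are conventional notation assumed here, not taken from the slides. Note that X in this form is the incidence matrix of the fixed effects, whereas in the genomic selection section X denotes the marker matrix.

```latex
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
  \qquad
  \mathbf{u} \sim (\mathbf{0}, \mathbf{G}),
  \quad
  \mathbf{e} \sim (\mathbf{0}, \mathbf{R}),
  \quad
  \operatorname{Cov}(\mathbf{u}, \mathbf{e}) = \mathbf{0}
\]
```

Here β holds the fixed effects (estimated, BLUE) and u holds the random effects (predicted, BLUP); G and R carry the variance-covariance structure of the random effects and of the residuals.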

DATA = Two types of environments and four related genotypes (from Bernardo, 2010)
Environment   Sites   Genotype   Mean
Mega-Env 1    18      Morex      4.45
Mega-Env 1    18      Robust     4.61
Mega-Env 1    18      Stander    5.27
Mega-Env 2    9       Robust     5.00
Mega-Env 2    9       Excel      5.82
Mega-Env 2    9       Stander    5.79

MIXED MODEL EQUATIONS (MME)
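For a model y = Xβ + Zu + e with Var(u) = G and Var(e) = R, Henderson's mixed model equations take the standard form below. The symbols are conventional notation assumed here rather than taken from the slides.

```latex
\[
\begin{bmatrix}
  \mathbf{X}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{Z} \\
  \mathbf{Z}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1}
\end{bmatrix}
\begin{bmatrix}
  \hat{\boldsymbol{\beta}} \\ \hat{\mathbf{u}}
\end{bmatrix}
=
\begin{bmatrix}
  \mathbf{X}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{y}
\end{bmatrix}
\]
```

Solving this single linear system gives the BLUEs of the fixed effects and the BLUPs of the random effects at the same time.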

Solution BLUEs and BLUPs
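A minimal sketch of how BLUEs and BLUPs could be obtained for the data above by building and solving the mixed model equations with NumPy. Two inputs are NOT given in this transcript and are placeholders: the relationship matrix `A` (set to the identity, i.e. unrelated genotypes, although the genotypes are in fact related) and the variance ratio `lam`. The numbers printed are therefore illustrative only and will not match the published solution. The helper name `solve_mme` is hypothetical.

```python
import numpy as np

# Means and number of sites from the data table (Bernardo, 2010).
y = np.array([4.45, 4.61, 5.27, 5.00, 5.82, 5.79])
n_sites = np.array([18, 18, 18, 9, 9, 9])

genotypes = ["Morex", "Robust", "Excel", "Stander"]
env_of_record = [0, 0, 0, 1, 1, 1]    # Mega-Env 1 -> 0, Mega-Env 2 -> 1
geno_of_record = [0, 1, 3, 1, 2, 3]   # index into `genotypes`

# Incidence matrices: X for mega-environment means (fixed), Z for genotypes (random).
X = np.zeros((6, 2))
X[np.arange(6), env_of_record] = 1.0
Z = np.zeros((6, 4))
Z[np.arange(6), geno_of_record] = 1.0

# ASSUMPTIONS (placeholders, not from the source):
A = np.eye(4)   # relationship matrix among genotypes; identity = unrelated
lam = 1.0       # ratio of per-site residual variance to genotypic variance

# A mean over more sites is more precise, so records are weighted by n_sites.
# W equals R^{-1} multiplied by the per-site residual variance.
W = np.diag(n_sites.astype(float))

def solve_mme(X, Z, W, A, lam, y):
    """Solve Henderson's mixed model equations (scaled by the residual variance)."""
    A_inv = np.linalg.inv(A)
    lhs = np.block([[X.T @ W @ X, X.T @ W @ Z],
                    [Z.T @ W @ X, Z.T @ W @ Z + lam * A_inv]])
    rhs = np.concatenate([X.T @ W @ y, Z.T @ W @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]

blue, blup = solve_mme(X, Z, W, A, lam, y)
print("BLUE of mega-environment means:", blue)
print("BLUP of genotype effects:", dict(zip(genotypes, blup)))
```

Using one mean per mega-environment as the fixed effects keeps X at full column rank, so no extra restriction is needed to make the coefficient matrix non-singular in this sketch.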

Properties of the fixed-effect estimates (BLUE)
The estimates of the fixed effects in the mixed model equations are identical to the generalized least-squares solution for fixed effects. A re-parameterization is required to make the coefficient matrix in the mixed-model equations non-singular. With the restriction that ti = 0, the estimates are unique (estimable functions).

Properties of the random-effect predictions (BLUP)
The average among unrelated individuals in the base population is 0. The average among related individuals developed through inbred recycling is expected to be non-zero because of selection and genetic drift. For example, for Morex in Mega-Env 1 ... but for Excel in Mega-Env ... BLUP property of SHRINKAGE.

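Properties of the random-effect predictions (BLUP)
Suppose the overall mean is the only fixed effect, all inbreds are unrelated, and the data are balanced. In this case the predicted breeding value of the jth individual is the heritability times the deviation of its phenotypic value from the overall mean, h^2 (y_j - mean). When heritability = 0 the breeding value = 0, and when heritability = 1 the breeding value is equal to the phenotypic value (expressed as a deviation from the mean). This is the shrinkage of the BLUP towards the mean.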
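The shrinkage in this special case can be sketched in a few lines. The sketch assumes the standard result for balanced data with unrelated individuals and the overall mean as the only fixed effect: the prediction is the heritability times the deviation from the mean. The function name `blup_balanced` and the example values are hypothetical.

```python
import numpy as np

def blup_balanced(phenotypes, h2):
    """Shrink each phenotypic deviation from the overall mean by heritability h2.

    Assumes balanced data, unrelated individuals, and the overall mean
    as the only fixed effect.
    """
    y = np.asarray(phenotypes, dtype=float)
    return h2 * (y - y.mean())

# Hypothetical phenotypic values, shrunk under three heritabilities.
values = [4.0, 5.0, 6.0]
for h2 in (0.0, 0.5, 1.0):
    print(h2, blup_balanced(values, h2))
```

With h2 = 0 every prediction collapses to zero, and with h2 = 1 the predictions equal the observed deviations from the mean.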

GENOMIC SELECTION AND BLUP
Marker-based selection consists of (1) identifying the markers with significant effects for the trait of interest and (2) using these markers in QTL introgression, F2 enrichment, marker-assisted recurrent selection (MARS), etc. The use of significance tests in linkage mapping or association mapping of QTL implies that only a subset of the markers is used in subsequent marker-based selection.

Marker assisted selection
The use of significance tests to identify which markers to use in F2 enrichment or in MARS is somewhat arbitrary. Markers whose effects exceed the significance threshold are included. Markers whose effects do not exceed the threshold are assigned a value of 0, regardless of how close their estimated effects were to the threshold.
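The all-or-nothing nature of this rule can be illustrated with a short sketch. The marker effects, test statistics and threshold below are made-up values for illustration, and `thresholded_effects` is a hypothetical helper, not a function from any mapping package.

```python
import numpy as np

def thresholded_effects(effects, test_stats, threshold):
    """Keep a marker effect only if its |test statistic| exceeds the threshold."""
    effects = np.asarray(effects, dtype=float)
    keep = np.abs(np.asarray(test_stats, dtype=float)) > threshold
    return np.where(keep, effects, 0.0)

# Hypothetical estimates for four markers.
effects = [0.40, 0.19, -0.05, -0.31]
test_stats = [3.2, 1.95, -0.4, -2.6]

print(thresholded_effects(effects, test_stats, threshold=2.0))
```

The second marker falls just short of the threshold and is set to 0, exactly like the third marker whose estimated effect is far smaller.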

Genomic selection (GS)
Genomic selection uses ALL AVAILABLE markers and is useful for traits that are likely to be controlled by many QTL with small effects rather than by a few major QTL. GS predicts a continuum of effects across all markers: some markers have large effects and others have effects close to zero, but markers with effects close to zero are still used in selection. GS can be described as marker-based selection without QTL mapping. NEED CHEAP AND ABUNDANT MARKERS!

MARKER EFFECTS IN GS CAN BE CALCULATED BY BLUP
Suppose n = 150 F3 families from the cross of two inbreds are evaluated under similar environmental conditions for testcross performance and genotyped with p = 384 SNPs. The linear mixed-effects model for the performance of the testcrosses on an entry-mean basis is

y = Xg + e

where
y is the n x 1 vector of responses (i = 1, 2, ..., n)
X is the n x p marker incidence matrix
g is the p x 1 vector of random marker effects, one per SNP (j = 1, 2, ..., p), with g ~ N(0, I(p x p) V(marker))
e is the n x 1 vector of random residual effects, with e ~ N(0, I(n x n) V(e))

The element of X for the jth SNP marker depends on whether the ith F3 family is homozygous for the marker allele from the first parental inbred (xij = 1), heterozygous (xij = 0), or homozygous for the marker allele from the second parental inbred (xij = -1). The effect of each SNP marker is defined as the effect associated with the marker allele from the first parental inbred.
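The coding of X described above can be sketched as follows. The genotype labels (`"P1P1"`, `"P1P2"`, `"P2P2"`) and the helper `code_markers` are hypothetical names chosen for illustration; real genotype calls would need to be mapped to parental alleles first.

```python
import numpy as np

# Marker coding from the text: +1 = homozygous for the first parent's allele,
# 0 = heterozygous, -1 = homozygous for the second parent's allele.
CODES = {"P1P1": 1, "P1P2": 0, "P2P2": -1}

def code_markers(calls):
    """Turn an n x p table of genotype calls into the n x p incidence matrix X."""
    return np.array([[CODES[call] for call in family] for family in calls], dtype=float)

# Hypothetical calls for 2 families at 3 SNPs.
calls = [
    ["P1P1", "P1P2", "P2P2"],
    ["P2P2", "P2P2", "P1P1"],
]
X = code_markers(calls)
print(X)
```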

g ~ N(0, I(p x p) V(marker))
We need the BLUPs of the SNP effects, and to obtain them we need an estimate of the variance of the random effects (which BLUP assumes to be known). That is, we need to estimate V(marker). Take the genetic variance expressed among the progeny being evaluated, V(g), and divide it by the number of SNP markers (p) so that each marker has the same variance. Then V(marker) = V(g)/p and g ~ N(0, I(p x p) V(g)/p).

V(marker) = V(g)/p, g ~ N(0, I(p x p) V(g)/p)
Two assumptions:
(1) Each marker accounts for an equal amount of genetic variance, V(marker) = V(g)/p. All markers jointly account for 100% of the genetic variance, and each marker individually accounts for 1/p of the genetic variance.
(2) Epistasis is ignored in the prediction.

MME for solving the marker effects in g
where V(e) is the residual variance on an entry-mean basis. Genomic selection treats markers as a surrogate for the phenotype, so that the individuals with the best predicted performance can be selected.
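A minimal sketch of solving for the marker BLUPs, assuming the ridge-type form of the mixed model equations for y = Xg + e, namely (X'X + λI) g = X'y with λ = V(e) / (V(g)/p). The responses are centered first, which is an assumption made here because the model lists no intercept. The data are simulated, the variance values are treated as known, and `marker_blups` is a hypothetical helper.

```python
import numpy as np

def marker_blups(X, y, var_g, var_e):
    """BLUPs of marker effects, assuming every marker has variance var_g / p."""
    n, p = X.shape
    lam = var_e / (var_g / p)      # ratio of residual to per-marker variance
    y_centered = y - y.mean()      # assumption: responses taken as deviations from the mean
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y_centered)

# Simulated, hypothetical data with the dimensions used in the text.
rng = np.random.default_rng(0)
n, p = 150, 384
X = rng.choice([-1.0, 0.0, 1.0], size=(n, p), p=[0.25, 0.5, 0.25])
true_g = rng.normal(0.0, np.sqrt(1.0 / p), size=p)
y = X @ true_g + rng.normal(0.0, 1.0, size=n)

g_hat = marker_blups(X, y, var_g=1.0, var_e=1.0)
predicted = X @ g_hat              # predicted performance of each family
best = np.argsort(predicted)[::-1][:10]
print("Families with the best predicted performance:", best)
```

Because every marker receives an effect, none is dropped by a significance threshold; the penalty λ shrinks all effects towards zero instead.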

Accuracy of Genomic Selection Prediction
GS will be superior to MARS if it leads to more accurate predictions of genotypic values. Accuracy is defined as the correlation between the true genotypic value and the genotypic value predicted from marker information. Because the true genotypic value is unobservable, accuracy is estimated as the correlation between observed and predicted performance divided by the square root of the heritability (to correct for the influence of non-genetic effects on the observed performance).
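The accuracy estimate described above can be sketched as a small function. The name `prediction_accuracy` and the example numbers are hypothetical; the heritability is assumed to be known or estimated separately.

```python
import numpy as np

def prediction_accuracy(observed, predicted, h2):
    """Correlation of observed with predicted performance, divided by sqrt(h2)."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return r / np.sqrt(h2)

# Hypothetical observed and marker-predicted values for three entries.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 3.0, 2.0]
print(prediction_accuracy(observed, predicted, h2=0.25))
```

Dividing by the square root of heritability scales the correlation up, so with small samples or a poorly estimated heritability the estimate can exceed 1.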
