6. Kernel Regression
Framework
Phenotype = Genetic Value + Model Residual
- Linear model: Ridge Regression / LASSO; Bayes A, Bayes B, Bayesian LASSO, …
- Semi-parametric models: Reproducing Kernel Hilbert Spaces Regression; Neural Networks, …
RKHS Regressions (Background)
Uses:
- Scatter-plot smoothing (Smoothing Splines) [1]
- Spatial smoothing ('Kriging') [2]
- Classification problems (Support Vector Machines) [3]
- Animal model
- …
Regression setting: the response is an unknown function of an input (the input can be of any nature) plus a residual.
[1] Wahba (1990) Spline Models for Observational Data.
[2] Cressie, N. (1993) Statistics for Spatial Data.
[3] Vapnik, V. (1998) Statistical Learning Theory.
RKHS Regressions (Background)
Non-parametric representation of functions.
Reproducing Kernel:
- Must be positive (semi-)definite
- Defines a correlation function
- Defines a RKHS of real-valued functions [1]
[1] Aronszajn, N. (1950) Theory of reproducing kernels.
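As a minimal sketch of these properties, the R snippet below builds a kernel from simulated markers and checks that it is symmetric, has a unit diagonal, and is positive semi-definite. The Gaussian form, the bandwidth name `h`, and the simulated genotypes are assumptions made for illustration; none of them is taken from the slides.

```r
# Illustrative only: simulated genotypes and a Gaussian kernel are assumptions.
set.seed(1)
n <- 50    # individuals
p <- 200   # markers
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)

# Squared Euclidean distance between individuals, averaged over markers
D <- as.matrix(dist(X))^2 / p

# Gaussian kernel with bandwidth parameter h (hypothetical helper)
gaussian_kernel <- function(D, h) exp(-h * D)
K <- gaussian_kernel(D, h = 1)

# Symmetric with unit diagonal, so K can be read as a correlation function
is_symmetric <- isSymmetric(unname(K))
unit_diag <- all(abs(diag(K) - 1) < 1e-12)

# Positive semi-definite: smallest eigenvalue is non-negative up to tolerance
min_eig <- min(eigen(K, symmetric = TRUE, only.values = TRUE)$values)
is_psd <- min_eig > -1e-8
```

Checking the smallest eigenvalue is a simple numerical test for small matrices; any matrix that fails it cannot serve as a reproducing kernel.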
Functions as Gaussian processes
K = A => Animal Model [1]
[1] de los Campos, Gianola and Rosa (2008) Journal of Animal Sci.
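One way to write the Gaussian-process view is sketched below. The notation is assumed here for illustration (the symbols g, u, x_i and the variance parameters do not appear on the slide): the vector of function evaluations gets a multivariate normal prior with covariance proportional to the kernel, and setting K equal to the pedigree-based relationship matrix A gives the usual prior on breeding values in the animal model.

```latex
\mathbf{g} = \big(g(x_1), \ldots, g(x_n)\big)^{\top}
  \sim N\!\left(\mathbf{0},\ \mathbf{K}\,\sigma^2_g\right),
\qquad
\operatorname{Cov}\!\big(g(x_i),\, g(x_j)\big) = \sigma^2_g\, K(x_i, x_j)

\mathbf{K} = \mathbf{A}
\;\Longrightarrow\;
\mathbf{u} \sim N\!\left(\mathbf{0},\ \mathbf{A}\,\sigma^2_u\right)
```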
RKHS Regression in BGLR [1]
    ETA <- list(list(K = K, model = 'RKHS'))
    fm <- BGLR(y = y, ETA = ETA, nIter = ...)
[1] The algorithm is described in de los Campos et al., Genetics Research (2010).
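A fuller sketch of the call above, under stated assumptions: the data are simulated, the kernel is a Gaussian kernel on marker distances, and the values of `nIter` and `burnIn` are arbitrary choices for a quick run rather than recommended settings. The fit is skipped if the BGLR package is not installed.

```r
# Sketch only: simulated data; nIter and burnIn are arbitrary illustrative values.
set.seed(2)
n <- 100
p <- 300
X <- matrix(rbinom(n * p, size = 2, prob = 0.4), nrow = n, ncol = p)
y <- as.vector(X[, 1:10] %*% rnorm(10) + rnorm(n))

# Marker-based kernel (assumed form): Gaussian kernel on averaged squared distances
D <- as.matrix(dist(X))^2 / p
K <- exp(-1 * D)

if (requireNamespace("BGLR", quietly = TRUE)) {
  ETA <- list(list(K = K, model = "RKHS"))
  fm <- BGLR::BGLR(y = y, ETA = ETA, nIter = 1200, burnIn = 200,
                   saveAt = file.path(tempdir(), "rkhs_"), verbose = FALSE)
  yHat <- fm$yHat            # posterior mean of the fitted values
  varE <- fm$varE            # posterior mean of the residual variance
  varU <- fm$ETA[[1]]$varU   # posterior mean of the variance attached to K
}
```

Writing the sampler output to a temporary directory through `saveAt` keeps the working directory clean.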
Choosing the RK based on predictive ability
Strategies:
- Grid of values of θ + CV
- Fully Bayesian: assign a prior to θ (computationally demanding)
- Kernel Averaging [1]
[1] de los Campos et al. (2010) WCGALP & Genetics Research (In press)
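The first strategy (a grid of θ values plus cross-validation) can be sketched in base R. To keep the example fast and free of dependencies, it swaps the Bayesian RKHS fit for plain kernel ridge regression with a fixed penalty `lambda`. The simulated data, the Gaussian kernel, the grid, and the penalty are all assumptions for illustration.

```r
# Sketch: grid search over the kernel bandwidth theta with 5-fold CV.
# Kernel ridge regression is a stand-in here, not the BGLR sampler.
set.seed(3)
n <- 120
p <- 200
X <- matrix(rbinom(n * p, size = 2, prob = 0.4), nrow = n, ncol = p)
y <- as.vector(scale(X[, 1:5] %*% rep(1, 5)) + rnorm(n))
D <- as.matrix(dist(X))^2 / p

theta_grid <- c(0.1, 0.5, 1, 2, 5)   # candidate bandwidths (arbitrary)
lambda <- 1                          # fixed ridge penalty (assumption)
n_folds <- 5
folds <- sample(rep(seq_len(n_folds), length.out = n))

cv_cor <- sapply(theta_grid, function(theta) {
  K <- exp(-theta * D)
  pred <- numeric(n)
  for (f in seq_len(n_folds)) {
    tst <- folds == f
    trn <- !tst
    mu <- mean(y[trn])
    # Solve (K_trn + lambda * I) alpha = y_trn - mu on the training fold
    alpha <- solve(K[trn, trn] + lambda * diag(sum(trn)), y[trn] - mu)
    # Predict held-out individuals from their kernel values with the training set
    pred[tst] <- mu + as.vector(K[tst, trn] %*% alpha)
  }
  cor(pred, y)   # predictive correlation across all held-out predictions
})

best_theta <- theta_grid[which.max(cv_cor)]
```

The cost grows with the size of the grid, since every candidate θ requires a full round of cross-validation.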
Histograms of the off-diagonal entries of each of the three kernels (K1, K2, K3) used in the RKHS models for the wheat dataset.
How to Choose the Reproducing Kernel? [1]
- Pedigree models: K = A
- Genomic models: marker-based kinship; model-derived kernel
- Predictive approach: explore a wide variety of kernels => cross-validation => Bayesian methods
[1] Shawe-Taylor and Cristianini (2004)
Example 2
Example 3 Kernel Averaging
Kernel Averaging
Strategies:
- Grid of values of θ + CV
- Fully Bayesian: assign a prior to θ (computationally demanding)
- Kernel Averaging [1]
[1] de los Campos et al., Genetics Research (2010)
Kernel Averaging
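A hedged sketch of kernel averaging with BGLR: several kernels that differ only in bandwidth enter the model as separate RKHS terms, each with its own variance, so the data weight the kernels instead of a single θ being chosen in advance. The simulated data, the three bandwidths in `h`, and the short chain are assumptions for illustration. The fit is skipped if BGLR is not installed.

```r
# Sketch only: simulated data; bandwidths and chain length are arbitrary.
set.seed(4)
n <- 100
p <- 300
X <- matrix(rbinom(n * p, size = 2, prob = 0.4), nrow = n, ncol = p)
y <- as.vector(scale(X[, 1:10] %*% rnorm(10)) + rnorm(n))
D <- as.matrix(dist(X))^2 / p

# Three Gaussian kernels, from smooth (small h) to local (large h)
h <- c(0.2, 1, 5)
kernels <- lapply(h, function(hk) exp(-hk * D))

# One RKHS term per kernel
ETA <- lapply(kernels, function(K) list(K = K, model = "RKHS"))

if (requireNamespace("BGLR", quietly = TRUE)) {
  fm <- BGLR::BGLR(y = y, ETA = ETA, nIter = 1200, burnIn = 200,
                   saveAt = file.path(tempdir(), "ka_"), verbose = FALSE)
  # Posterior mean of the variance attached to each kernel
  varU <- sapply(fm$ETA, function(term) term$varU)
  # Relative contribution of each kernel
  kernel_share <- varU / sum(varU)
}
```

Comparing the estimated kernel variances shows which bandwidths the data favour.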
Example 4 (100th basis function)
Example 4 (100th basis function, h=)
Example 4 (KA: trace plot residual variance)
Example 4 (KA: trace plot kernel-variances)
Example 4 (KA: prediction accuracy)