Low-Cost, Low-Density Genotyping and its Potential Applications K.A. Weigel, O. González-Recio, G. de los Campos, H. Naya, N. Long, D. Gianola, and G.J.M. Rosa University of Wisconsin
Illumina BovineSNP50 Genotyping BeadChip < $250 per animal today
Low-Cost Genotyping Assays: At the current price, the BovineSNP50 BeadChip is limited to applications involving males and elite females. A low-cost assay with SNPs might deliver a substantial portion of the gain for a small fraction of the price. Applications may include preliminary screening of young bulls, selection of replacement heifers, genomic mating programs, and parentage discovery.
Which SNPs to Select? Pick the SNPs with largest estimated effects? How many do we need? (VanRaden, 2008)
Which SNPs to Select? Pick evenly spaced SNPs? How many do we need? (VanRaden, 2008)
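The two candidate strategies named on these slides can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the effect values are hypothetical, and "evenly spaced" is approximated by index position in an ordered marker list rather than by base-pair distance.

```python
def top_effect_snps(effects, k):
    """Return indices of the k SNPs with the largest absolute estimated effects."""
    ranked = sorted(range(len(effects)), key=lambda i: abs(effects[i]), reverse=True)
    return sorted(ranked[:k])


def evenly_spaced_snps(n_snps, k):
    """Return k indices spread evenly over n_snps markers ordered along the genome.

    Assumption: markers are already sorted by position, so equal index
    spacing stands in for equal physical spacing.
    """
    if k <= 0:
        return []
    if k >= n_snps:
        return list(range(n_snps))
    # Midpoint of each of k equal-width bins, using integer arithmetic.
    return [(2 * i + 1) * n_snps // (2 * k) for i in range(k)]


# Made-up effect estimates for eight markers (illustration only).
effects = [0.02, -0.40, 0.01, 0.15, -0.03, 0.00, 0.30, -0.05]
print(top_effect_snps(effects, 3))
print(evenly_spaced_snps(len(effects), 3))
```

The first strategy concentrates markers where estimated effects are large; the second ignores effects entirely and only controls coverage.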
Entropy: A measure of the impurity of an arbitrary collection of examples (S). $\text{Entropy}(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$, where $p_{+}$ = proportion of positive examples in S and $p_{-}$ = proportion of negative examples in S.
Information Gain: A measure of the effectiveness of an attribute in classifying the data. It is the reduction in entropy caused by partitioning the examples into subsets $(S_1, \ldots, S_n)$ based on values of a given attribute (A). $\text{Information Gain}(S, A) = \text{Entropy}(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|}\,\text{Entropy}(S_i)$
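The entropy and information gain definitions can be written out directly. In this sketch the examples are bulls labelled high (1) or low (0), and the attribute is a SNP genotype. Coding the genotype as 0, 1, or 2 copies of an allele is an assumption made for illustration, and the data are made up.

```python
import math
from collections import defaultdict


def entropy(labels):
    """Binary entropy (in bits) of a collection of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_pos = sum(1 for y in labels if y) / n
    result = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:  # the term p * log2(p) is taken as 0 when p == 0
            result -= p * math.log2(p)
    return result


def information_gain(labels, attribute_values):
    """Entropy(S) minus the size-weighted entropy of the subsets S_i."""
    subsets = defaultdict(list)
    for y, a in zip(labels, attribute_values):
        subsets[a].append(y)
    n = len(labels)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Made-up example: 4 high bulls, 4 low bulls, one SNP genotype each.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
genotypes = [2, 2, 2, 1, 1, 0, 0, 0]
print(information_gain(labels, genotypes))
```

A SNP whose genotype classes separate high from low bulls cleanly has a gain close to Entropy(S). A SNP unrelated to the grouping has a gain near zero.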
Top 10% of SNPs for Net Merit (Info Gain for 20% highest bulls vs. 20% lowest bulls): 3252 SNPs; 1181 SNPs (36.2%) in common (though many more in linkage disequilibrium). [Figure: Number of SNPs by Chromosome]
Effects of Top Net Merit SNPs (3252 SNPs) [Figure, two panels: Estimate by Chromosome]
Top 10% of SNPs for Specific Traits (Info Gain for 20% highest bulls vs. 20% lowest bulls; traditional coding): 3252 SNPs [Figure: Number of SNPs by Chromosome]
Top 2.5% of SNPs for Specific Traits (Info Gain for 20% highest bulls vs. 20% lowest bulls; traditional coding): 813 SNPs [Figure: Number of SNPs by Chromosome]
Top Info Gain SNPs in Common by Trait (20% highest vs. 20% lowest bulls; traditional coding) [Table: rows and columns are Milk, Fat, Prot, PL, SCS, DPR, NM$; top 10% of SNPs (3252) above the diagonal; top 2.5% of SNPs (813) below the diagonal]
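Counting SNPs in common between traits amounts to intersecting the selected sets pairwise. The sketch below assumes each trait's top SNPs are held as a set of marker identifiers. The identifiers and the function name are hypothetical.

```python
def overlap_counts(top_sets):
    """Return {(trait_a, trait_b): number of SNPs selected for both traits}."""
    traits = list(top_sets)
    return {(a, b): len(top_sets[a] & top_sets[b]) for a in traits for b in traits}


# Made-up marker IDs for three of the traits named on the slide.
top_sets = {
    "Milk": {"snp1", "snp2", "snp3", "snp4"},
    "Fat": {"snp2", "snp3", "snp5", "snp6"},
    "Prot": {"snp1", "snp3", "snp6", "snp7"},
}
counts = overlap_counts(top_sets)
print(counts[("Milk", "Fat")], counts[("Milk", "Prot")], counts[("Fat", "Prot")])
```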
Bayesian LASSO (Bayesian least absolute shrinkage and selection operator): One-step method for estimating effects of important SNPs while shrinking estimates for unimportant SNPs towards zero. Assumes SNP effects follow a double exponential distribution (a few with large effects, many with negligible effects).
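To see why a double exponential prior shrinks small effects, consider the simplest case of a single standardized SNP. The posterior mode then minimizes ½(z − b)² + λ|b|, where z is the least-squares estimate, and the solution is soft-thresholding. The sketch below shows only that mode. It is not the Bayesian LASSO sampler itself, which would typically report posterior means from MCMC. The penalty `lam` and the estimates are hypothetical values.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (z - b)**2 + lam * abs(b) with respect to b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Made-up least-squares estimates for four SNPs (in genetic SD units).
ls_estimates = [0.50, -0.20, 0.05, -0.02]
lam = 0.10  # hypothetical penalty; larger values shrink more
shrunk = [soft_threshold(z, lam) for z in ls_estimates]
print([round(b, 3) for b in shrunk])
```

Large estimates are pulled toward zero by a fixed amount, while estimates smaller than the penalty are set to zero. This matches the "few large, many negligible" pattern the prior encodes.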
Distribution of SNP Effects (analysis of Net Merit in training set with 32,518 SNPs) [Figure: Number of SNPs by Estimated SNP Effect (genetic SD)]
Distribution of SNP Effects (analysis of Net Merit in training set with 32,518 SNPs) [Figure: Estimated Effect (genetic SD)]
Distribution of SNP Effects [Table: columns are Mean, SD, Min., Max.; one row per SNP subset, from 300 SNPs to 32,518 SNPs]
Validation of Genomic PTAs: Compute parent averages and genomic PTAs using 2003 data from 3,305 Holstein bulls born in “Training Set”. Compare ability to predict daughter deviations in 2008 data for 1,398 bulls born from “Testing Set”.
Predictive Ability for Net Merit (Genomic PTA vs. Progeny Test PTA in Testing Set) [Figure, 32,518 SNPs: Predicted Genomic PTA from All SNPs (gen. SD) vs. PTA from Progeny Testing (SD); Corr. = ]
Predictive Ability for Net Merit (Genomic PTA from SNPs vs. Progeny Test PTA in Testing Set) [Figure, four panels (300, 750, 1250, and 2000 SNPs): Predicted Genomic PTA from Top ___ SNPs (gen. SD) vs. PTA from Progeny Testing; correlations across panels range from 0.43 to 0.57 (0.43, 0.52, 0.55, 0.57)]
Predictive Ability for Net Merit (Genomic PTA vs. Progeny Test PTA in Testing Set) [Figure: Predictive Ability in Testing Set by Number of SNPs used for Prediction]
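Predictive ability on these slides is reported as a correlation between genomic PTA and progeny-test PTA in the testing set. Assuming this is the ordinary Pearson correlation, it can be computed as below. The PTA values are made up.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up genomic PTAs and progeny-test PTAs for five testing-set bulls.
genomic_pta = [0.8, -0.3, 0.1, 1.2, -0.9]
progeny_pta = [0.5, 0.1, -0.2, 0.9, -0.6]
print(round(pearson_correlation(genomic_pta, progeny_pta), 2))
```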
No. Bulls Chosen Correctly (of 1399)
SNPs used      Top 50%       Top 25%       Top 10%       Top 5%       Top 2½%      Top 1%
               (700 bulls)   (350 bulls)   (140 bulls)   (70 bulls)   (35 bulls)   (14 bulls)
300 SNPs       460 (65.7%)   161 (46.0%)   31 (22.1%)    13 (18.6%)   3 (8.6%)     0 (0.0%)
500 SNPs       460 (65.7%)   180 (48.6%)   39 (27.9%)    12 (17.1%)   3 (8.6%)     1 (7.1%)
750 SNPs       479 (68.4%)   180 (48.6%)   39 (27.9%)    15 (21.4%)   4 (11.4%)    2 (14.3%)
1000 SNPs      484 (69.1%)   180 (48.6%)   40 (28.6%)    11 (15.7%)   3 (8.6%)     2 (14.3%)
1250 SNPs      482 (68.9%)   179 (51.1%)   43 (30.7%)    12 (17.1%)   4 (11.3%)    1 (7.1%)
1500 SNPs      479 (68.4%)   183 (52.2%)   46 (32.9%)    14 (20.0%)   5 (14.3%)    1 (7.1%)
2000 SNPs      489 (69.9%)   186 (53.1%)   42 (30.0%)    17 (24.3%)   6 (17.1%)    2 (14.3%)
32,518 SNPs    499 (71.3%)   191 (54.6%)   49 (35.0%)    16 (22.9%)   5 (14.3%)    2 (14.3%)
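One plausible reading of "bulls chosen correctly" is the number of bulls ranked in the top group by genomic PTA that also fall in the top group by progeny-test PTA. The sketch below counts that overlap under this assumption. The function name and the values are hypothetical.

```python
def n_chosen_correctly(predicted, observed, n_top):
    """Count animals ranked in the top n_top by both predicted and observed values."""
    def top_indices(values):
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return set(order[:n_top])

    return len(top_indices(predicted) & top_indices(observed))


# Made-up values for six bulls.
predicted = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
observed = [4.0, 5.0, 0.0, 3.0, 2.0, 1.0]
for n_top in (2, 3):
    hits = n_chosen_correctly(predicted, observed, n_top)
    print(n_top, hits, f"{100 * hits / n_top:.1f}%")
```

Dividing the count by the group size gives the percentage shown in parentheses in the table, for example 460 of 700 is 65.7%.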
Animal ID Applications (96+ SNPs in the parentage panel): Verify reported parents. Discover parents if unknown or incorrect. Trace animals or animal products.
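Parent verification with a SNP panel commonly relies on opposing homozygotes: a parent carrying two copies of one allele cannot have an offspring carrying two copies of the other. The sketch below assumes genotypes coded as 0, 1, or 2 copies of an allele, with `None` for a missing call. The 1% conflict tolerance is a hypothetical allowance for genotyping error, not a value from the talk.

```python
def opposing_homozygotes(parent, offspring):
    """Return (conflicts, loci compared) for one putative parent-offspring pair."""
    conflicts = 0
    compared = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:  # skip missing genotype calls
            continue
        compared += 1
        if {p, o} == {0, 2}:  # opposite homozygotes violate Mendelian inheritance
            conflicts += 1
    return conflicts, compared


def parent_is_plausible(parent, offspring, max_conflict_rate=0.01):
    """Accept the parent if the conflict rate stays within the assumed tolerance."""
    conflicts, compared = opposing_homozygotes(parent, offspring)
    return compared > 0 and conflicts / compared <= max_conflict_rate


# Made-up genotypes at six loci.
calf = [0, 1, 2, 2, None, 1]
reported_sire = [0, 1, 2, 1, 2, 0]
other_bull = [2, 1, 0, 0, 1, 2]
print(parent_is_plausible(reported_sire, calf), parent_is_plausible(other_bull, calf))
```

Parentage discovery follows the same logic, applied to every candidate parent in turn and keeping those with no, or very few, conflicts.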
Effects on Inbreeding: Traditional animal model evaluations favor co-selection of families or relatives. Genomic selection allows within-family selection, which leads to less inbreeding. Low-cost, low-density genotyping assays will allow widespread screening of families that might provide unique genetic contributions to the population. Identification and control of inherited defects will be greatly enhanced as well.
Potential for Mate Selection: Millions of cows are mated using computerized programs each year, based on faults in conformation or avoidance of inbreeding. SNP genotypes of AI sires and potential mates could be used to minimize inbreeding or to identify parents with “complementary” DNA profiles.
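As a rough illustration of genotype-based mate selection, the sketch below ranks candidate sires for one cow by the expected share of genotyped SNPs at which the calf would be homozygous. This criterion is a hypothetical stand-in for "minimizing inbreeding". It measures identity by state at the genotyped loci only, assumes genotypes coded as 0, 1, or 2 copies of an allele, and uses made-up data.

```python
def expected_offspring_homozygosity(sire, dam):
    """Mean probability, across loci, that the offspring is homozygous."""
    total = 0.0
    for s, d in zip(sire, dam):
        p_sire = s / 2  # probability the sire transmits the counted allele
        p_dam = d / 2   # probability the dam transmits the counted allele
        total += p_sire * p_dam + (1 - p_sire) * (1 - p_dam)
    return total / len(sire)


def best_sire(sires, dam):
    """Return the name of the sire giving the lowest expected homozygosity."""
    return min(sires, key=lambda name: expected_offspring_homozygosity(sires[name], dam))


# Made-up genotypes at four loci.
cow = [2, 0, 1, 2]
sires = {"A": [2, 0, 1, 2], "B": [0, 2, 1, 0]}
for name, genotype in sires.items():
    print(name, expected_offspring_homozygosity(genotype, cow))
print(best_sire(sires, cow))
```

A production mating program would weigh such a measure against genetic merit and conformation rather than minimizing it alone.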
Possibilities for Novel Traits: Opportunities to collect DNA and phenotypes for traits not routinely assessed in national recording schemes. Examples include feed intake, hormone level, immune function, hoof care, etc. Potential resource populations include experimental herds, calf ranches, heifer growers, commercial herds with specific milking/feeding/management equipment, and veterinary databases (without sire ID).
Novel Traits and Genomics: Recorded Population (10,000-25,000 animals per trait or trait group). Whole Genome Selection: full genotyping; update estimates of SNP effects; cost ~ $ mln per trait group. QTL Detection and MAS: selective genotyping; refine estimates of location or effect, add SNPs; cost ~ $ mln per trait. Other annotations: additive or non-additive inheritance; $200/genotype; $100/trait; 5 traits/group; select high/low 10%; no selection bias.
Synergy with Herd Management: “Personalized medicine” is the Holy Grail of biomedical research. Examples include genotype-guided warfarin dosing using two major genes. Cost-effective applications in livestock will involve a series of small returns from enhanced vaccination programs, ration formulation, mate selection, veterinary care, and animal grouping decisions. Integration with herd management software will be the key to success.
UW-Madison Dairy Science…Committed to Excellence in Research, Extension and Instruction Any Questions?