Wiggans, 2013 SRUC Imputation (1)
Imputation
Dr. George R. Wiggans
Animal Improvement Programs Laboratory, Agricultural Research Service, USDA
Beltsville, MD, USA
(2) Imputation
- Based on splitting the genotype into individual chromosomes (maternal and paternal contributions)
- Missing SNPs assigned by tracking inheritance from ancestors and descendants
- Imputed dams increase predictor population
- Genotypes from all chips merged by imputing SNPs not present
(3) Terms
- Genotype – Alleles on both chromosomes for all markers
- Allele representation – A,B; A,C,T,G
- Genotype representation – number of A’s: 0, 1, 2; 5 = missing
- Imputation – Determination of an allele from alleles of other markers and animals
- Phasing – Separating a genotype into individual chromosomes and possibly assigning maternal or paternal origin
(4) Genotype for Elevation
- Chromosome 1
(5) X chromosome
- Bull
- Cow
(6) Pedigree – parents, grandparents, etc.
(7) O-Style haplotypes – chromosome 15
(8) findhap
- Developed by Paul VanRaden
- Divides chromosomes into segments
- Allows for successively shorter segments, typically 3 runs
  - Long segments lock in identical by descent
  - Shorter segments fill in missing SNPs
- Separates genotype into maternal and paternal haplotypes (phasing)
- Builds haplotype library ordered by frequency
(9) findhap characteristics
- Population haplotyping
  - Divides chromosomes into segments
  - Lists haplotypes by genotype match
  - Similar to FastPhase, Impute, or long-range phasing
- Pedigree haplotyping
  - Detects crossover; fixes noninheritance
  - Imputes nongenotyped ancestors
(10) Recent program revisions
- Improved imputation and reliability
- Changes since January 2010
  - Use known haplotype if 2nd is unknown
  - Use current instead of base frequency
  - Combine parent haplotypes if crossover is detected
  - Begin search with parent or grandparent haplotypes
  - Store 2 most popular progeny haplotypes
- Decreased computing time by using previous haplotype library
(11) Population haplotyping
- Put 1st genotype into haplotype list
- Check next genotype against list
  - Do any homozygous loci conflict?
    - If haplotype conflicts, continue search
    - If match, fill any unknown SNP with homozygote
    - 2nd haplotype = genotype minus 1st haplotype
    - Search for 2nd haplotype in rest of list
  - If no match in list, add to end of list
- Sort list to put frequent haplotypes 1st
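The population haplotyping steps above can be sketched roughly as follows. This is a simplified illustration, not findhap itself: it works on a single segment, uses the deck's coding (genotypes 0 = BB, 1 = AB, 2 = AA; haplotype alleles 0 = B, 1 = not known, 2 = A), and treats any non-homozygous call as unknown. All function names are hypothetical, and the choice to count a newly added entry once is an assumption.

```python
UNKNOWN = 1  # haplotype allele not known (deck coding: 0 = B, 1 = not known, 2 = A)

def genotype_to_haplotype(genotype):
    """Keep homozygous calls (0 or 2); every other code becomes unknown."""
    return [g if g in (0, 2) else UNKNOWN for g in genotype]

def conflicts(haplotype, genotype):
    """True if a known allele contradicts a homozygous (0 or 2) entry."""
    return any(h != UNKNOWN and g in (0, 2) and h != g
               for h, g in zip(haplotype, genotype))

def fill_unknown(haplotype, source):
    """Fill unknown alleles in place from homozygous (0 or 2) entries of source."""
    for i, s in enumerate(source):
        if haplotype[i] == UNKNOWN and s in (0, 2):
            haplotype[i] = s

def second_haplotype(genotype, first):
    """Genotype 'minus' the first haplotype, locus by locus."""
    second = []
    for g, h in zip(genotype, first):
        if g in (0, 2):
            second.append(g)        # homozygote: both alleles are the same
        elif g == 1 and h != UNKNOWN:
            second.append(2 - h)    # heterozygote: the opposite allele
        else:
            second.append(UNKNOWN)  # missing or unresolved
    return second

def build_library(genotypes):
    """Return [(haplotype, count), ...] with frequent haplotypes first."""
    library = []  # each entry is [haplotype, count]
    for genotype in genotypes:
        # Search for the 1st haplotype with no conflict at homozygous loci.
        i = next((k for k, e in enumerate(library)
                  if not conflicts(e[0], genotype)), None)
        if i is None:  # no match in list: add to end of list
            library.append([genotype_to_haplotype(genotype), 1])
            continue
        first = library[i]
        fill_unknown(first[0], genotype)
        first[1] += 1
        second = second_haplotype(genotype, first[0])
        # Search the rest of the list, starting at the first match.
        match = next((e for e in library[i:]
                      if not conflicts(e[0], second)), None)
        if match is None:
            library.append([second, 1])
        else:
            fill_unknown(match[0], second)
            match[1] += 1
    library.sort(key=lambda e: e[1], reverse=True)
    return [(tuple(h), n) for h, n in library]
```

For example, `build_library([[2, 0, 2], [1, 1, 2], [0, 2, 2]])` splits the heterozygous second genotype into the already-listed haplotype and its complement, and the third genotype then matches that complement on both chromosomes.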
(12) Coding of alleles and segments
- Genotypes
  - 0 = BB, 1 = AB or BA, 2 = AA
  - 3 = B_, 4 = A_, 5 = __ (missing)
  - Allele frequency used for missing
- Haplotypes
  - 0 = B, 1 = not known, 2 = A
- Segment inheritance (example)
  - Son has haplotype numbers 5 and 8
  - Sire has haplotype numbers 8 and 21
  - Son got haplotype number 5 from dam
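A small sketch of the coding and of the segment inheritance example above. The lookup table copies the genotype codes from the slide; the function names are hypothetical, and returning `None` when the origin cannot be decided is an assumption of this sketch rather than documented findhap behavior.

```python
# Genotype codes as listed on the slide.
GENOTYPE_CODES = {"BB": 0, "AB": 1, "BA": 1, "AA": 2, "B_": 3, "A_": 4, "__": 5}

def encode_genotype(calls):
    """Convert allele-pair calls such as 'AB' into numeric genotype codes."""
    return [GENOTYPE_CODES[c] for c in calls]

def haplotype_from_dam(progeny, sire):
    """Return the progeny haplotype number that must have come from the dam.

    progeny and sire are pairs of haplotype numbers for one segment.
    Returns None when the origin is ambiguous or nothing matches the sire.
    """
    a, b = progeny
    if a == b:
        return a if a in sire else None
    if a in sire and b not in sire:
        return b
    if b in sire and a not in sire:
        return a
    return None

# Slide example: son has 5 and 8, sire has 8 and 21.
print(haplotype_from_dam((5, 8), (8, 21)))
```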
(13) Most frequent haplotypes
- 1st segment of chromosome 15
- For efficiency, store haplotypes just once
- Most frequent Holstein haplotype had 4,316 copies (41,822 animals × 2 chromosomes each), a frequency of 5.16%
[Figure: list of the most frequent haplotypes with their frequencies]
(14) Check new genotype against list
- 1st segment of chromosome 15
  - Search for 1st haplotype that matches genotype
  - Get 2nd haplotype by removing 1st from genotype
[Figure: haplotype list with frequencies]
(15) Recessive defect discovery
- Check for homozygous haplotypes
  - Most haplotype blocks ~5 Mbp long
  - 7–90 expected, but 0 observed
- 5 of top 11 haplotypes confirmed as lethal
- Investigation of 936–52,449 carrier sire × carrier MGS fertility records found 3.0–3.7% lower conception rates
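To see why "0 observed" is striking when 7 or more homozygotes are expected, here is a back-of-the-envelope sketch. It assumes random mating (expected homozygotes = animals × frequency²) and a Poisson model for the count; neither assumption is stated in the slides, and the function names and the example inputs are hypothetical.

```python
import math

def expected_homozygotes(n_animals, haplotype_frequency):
    """Expected number of animals carrying two copies, assuming random mating."""
    return n_animals * haplotype_frequency ** 2

def prob_zero_observed(expected):
    """Poisson probability of seeing no homozygotes given the expected count."""
    return math.exp(-expected)

# Hypothetical inputs: 10,000 genotyped animals, haplotype frequency 3%.
print(expected_homozygotes(10_000, 0.03))
# Even at the low end of the slide's range (7 expected), zero is very unlikely.
print(prob_zero_observed(7))
```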
(16) Traditional evaluations 3X/year
- Yield
  - Milk, fat, protein, component percentages
- Type
  - Stature, udder characteristics, feet and legs
- Calving
  - Calving ease, stillbirth rate
- Functional
  - Somatic cell score, productive life, fertility
(17) Reduce generation interval from 5 to 2 yr
- Genomic prediction of progeny test
[Timeline figure; labels: Select parents, transfer embryos to recipients; Calves born and DNA tested; Calves born from DNA-selected parents; Bull receives progeny test]
(18) Benefit of genomics
- Determine value of bull at birth
- Increase selection accuracy
- Reduce generation interval
- Increase selection intensity
- Increase rate of genetic gain
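The benefits listed above map onto the terms of the standard breeder's equation for annual genetic gain. The equation is textbook quantitative genetics rather than something shown in the slides, and the symbols are the usual ones: $i$ is selection intensity, $r$ is selection accuracy, $\sigma_A$ is the additive genetic standard deviation, and $L$ is the generation interval in years.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
```

Raising $i$ or $r$, or lowering $L$, increases the rate of gain. In practice these terms differ across selection paths and are not independent (young animals are evaluated with lower accuracy than progeny-tested bulls), so the gain from a shorter generation interval is smaller than the ratio of intervals alone would suggest.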
(19) Genomic evaluation program
- Identify animals to genotype
- Send sample to genotyping laboratory
- Genotype sample
- Send genotype to evaluation center
- Calculate genomic evaluation
- Release monthly evaluation
(20) Genomic data flow
[Flow diagram; nodes: DHI herd; DNA laboratory; AI organization, breed association; CDCB. Arrow labels: DNA samples; genotypes; genomic evaluations; nominations, pedigree data; genotype quality reports]
(21) Genotyped animals – April 2013
Chip     Traditional evaluation?  Animal sex  Holstein  Jersey  Brown Swiss  Ayrshire
50K      Yes                      Bulls       21,904    2,855   5,…          …
50K      Yes                      Cows        16,062    1,…     …            …
50K      No                       Bulls       45,537    3,884   1,…          …
50K      No                       Cows        32,…      …       …            …
<50K     Yes                      Bulls       …         …       …            …
<50K     Yes                      Cows        21,980    9,…     …            …
<50K     No                       Bulls       14,026    1,…     …            …
<50K     No                       Cows        158,622   18,…    …            …
Imputed  Yes                      Cows        2,…       …       …            …
Imputed  No                       Cows        1,…       …       …            …
All                                           314,938   37,942  8,080        1,213
(22) Steps to prepare genotypes
- Nominate animal for genotyping
- Collect blood, hair, semen, nasal swab, or ear punch
  - Blood may not be suitable for twins
- Extract DNA at laboratory
- Prepare DNA and apply to beadchip
- Do amplification and hybridization, 3-day process
- Read red/green intensities from chip and call genotypes from clusters
(23) What can go wrong
- Inadequate DNA quality or quantity from sample
- Genotype with many SNPs that cannot be determined (90% call rate required)
- Parent-progeny conflicts
  - Pedigree error
  - Sample ID error (switched samples)
  - Laboratory error
  - Parent-progeny relationship detected not in pedigree
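The 90% call rate requirement above can be checked with a few lines. This sketch assumes the deck's numeric coding in which 5 marks a missing genotype and counts every other code as called; the function names are hypothetical.

```python
MISSING = 5  # genotype code for a SNP that could not be determined

def call_rate(genotype, missing=MISSING):
    """Fraction of SNPs in one animal's genotype that were called."""
    called = sum(1 for g in genotype if g != missing)
    return called / len(genotype)

def passes_call_rate(genotype, minimum=0.90):
    """True if the genotype meets the required call rate (90% on the slide)."""
    return call_rate(genotype) >= minimum

sample = [0, 1, 2, 5, 2, 2, 1, 0, 0, 5]  # 2 of 10 SNPs missing
print(call_rate(sample), passes_call_rate(sample))
```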
(24) Parentage validation and discovery
- Parent-progeny conflicts detected
  - Animal checked against all other genotypes
  - Conflict reported to breeds and requesters
  - Correct sire usually detected
- MGS checked
  - 1 SNP at a time
  - Haplotype checking more accurate
- Breeds moving to accept SNPs in place of microsatellites
(25) Parent-progeny conflicts
[Figure: SNP genotypes (A/A, A/B, B/B) of an animal compared with those of its sire and its MGS; asterisks mark tested SNPs; plot of Conflict % by Relationship]
- Sire: Conflicts = 0, Tests = 10, Conflict % = 0%
- MGS: Conflicts = 3, Tests = 10, Conflict % = 30.0%
(26) Parent-progeny conflicts
- For animal
  - Pedigree wrong
  - Genotype unreliable (3K)
- For SNP
  - SNP unreliable
  - Clustering needs adjustment
[Figure; labels: Parent, Progeny]
(27) Detecting unreliable genotypes
[Figure: distribution of Conflicts (%); regions labeled Accept, Unreliable genotype (reject), and Reject; value 3.6 marked]
(28) MGS detection
- SNP conflict method (SNP)
  - Check if animal and MGS have opposite homozygotes (duo test)
  - If sire is genotyped, some heterozygous SNP can be checked (trio test)
- Common haplotype method (HAP)
  - After imputation of all loci, determine maternal contribution by removing paternal haplotype
  - Count maternal haplotypes in common with MGS
  - Remove haplotypes from MGS and check remaining against maternal great-grandsire (MGGS)
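The duo test above (opposite homozygotes) can be sketched as a simple count. This assumes genotypes coded 0 = BB, 1 = AB, 2 = AA and treats only SNPs that are homozygous in both animals as tests; that definition of a "test", and the function name, are assumptions of this sketch. A true parent should show close to 0% conflicts, while a grandsire can legitimately show some.

```python
def duo_conflicts(animal, relative):
    """Count opposite-homozygote conflicts between two genotypes.

    Returns (conflicts, tests, conflict_percent). Only SNPs where both
    genotypes are homozygous (0 or 2) are counted as tests.
    """
    tests = 0
    conflicts = 0
    for a, r in zip(animal, relative):
        if a in (0, 2) and r in (0, 2):
            tests += 1
            if a != r:  # BB versus AA
                conflicts += 1
    percent = 100.0 * conflicts / tests if tests else 0.0
    return conflicts, tests, percent

animal = [2, 0, 1, 2, 0]
candidate = [2, 1, 0, 0, 0]
print(duo_conflicts(animal, candidate))
```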
(29) Results by breed
Breed        SNP method: MGS % confirmed  HAP method: MGS % confirmed  HAP method: MGGS % confirmed
Holstein     95 (98)*                     97                           92
Jersey       91 (92)                      95
Brown Swiss  94 (95)                      97                           85
*50K genotyped animals only
(30) Lab QC
- Each SNP evaluated for
  - Call rate
  - Portion heterozygous
  - Parent-progeny conflicts
- Clustering investigated if SNP exceeds limits
- Number of failing SNPs indicates genotype quality
- Target <10 SNPs in each category
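Two of the per-SNP checks above, call rate and portion heterozygous, can be computed column by column across animals. This is a minimal sketch assuming genotypes coded 0, 1, 2 with 5 = missing and stored as one list per animal; the function name and the report layout are hypothetical, and no pass/fail limits are applied because the slides do not give them in a usable form.

```python
MISSING = 5  # genotype code for a SNP that could not be determined

def snp_qc(genotypes, missing=MISSING):
    """Per-SNP call rate and heterozygous portion.

    genotypes: list of per-animal genotype lists, all of the same length.
    Returns one dict per SNP with its index, call rate, and the portion
    of called genotypes that are heterozygous (code 1).
    """
    report = []
    for j in range(len(genotypes[0])):
        column = [animal[j] for animal in genotypes]
        called = [g for g in column if g != missing]
        call_rate = len(called) / len(column)
        het = sum(1 for g in called if g == 1) / len(called) if called else 0.0
        report.append({"snp": j, "call_rate": call_rate, "heterozygous": het})
    return report

animals = [[0, 1, 5], [2, 1, 5], [1, 1, 2], [0, 5, 2]]
for row in snp_qc(animals):
    print(row)
```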
(31) Before clustering adjustment
- 86% call rate
(32) After clustering adjustment
- 100% call rate
(33) Automated QC reporting
6160 Genotypes Processed from LAB
PASS/FAIL,Count,Description
PASS,1,Parent Progeny Conflict SNP >2%
PASS,5,Low Call Rate SNP >10%
PASS,0,HWE SNP
PASS,0,Chips w/ >20 Conflicts
PASS,0.3,No Nomination %
PASS,0,Genotype Submitted with No Sample Sheet Row
(34) Reliability of Holstein predictions
[Table; columns: Trait, Bias*, b, REL (%), REL gain (%). Rows: Milk (kg); Fat (kg); Protein (kg); Fat (%); Protein (%); Productive life (mo); Somatic cell score; Daughter pregnancy rate (%); Sire calving ease (%DBH); Daughter calving ease (%DBH); Sire stillbirth (%); Daughter stillbirth (%)]
*2011 deregressed value – 2007 genomic evaluation
(35) Marketed Holstein bulls
(36) Ways to increase accuracy
- Automatic addition of traditional evaluations of genotyped bulls when they are 5 yr old
- Possible genotyping of 10,000 bulls with semen in repository
- Collaboration with other countries
- Use of more SNPs from HD chips
- Full sequencing – identify causative mutations
(37) Application to more traits
- Animal’s genotype is good for all traits
- Traditional evaluations required for accurate estimates of SNP effects
- Traditional evaluations not currently available for heat tolerance or feed efficiency
- Research populations could provide data for traits that are expensive to measure
- Will resulting evaluations work in target population?
(38) Impact on producers
- Young-bull evaluations with accuracy of early 1st-crop evaluations
- AI organizations marketing genomically evaluated young bulls
- Genotype usually required to be a bull dam
- Rate of genetic improvement likely to increase by up to 50%
- AI organizations reducing progeny-test programs
(39) Why genomics works for dairy cattle
- Extensive historical data available
- Well-developed genetic evaluation program
- Widespread use of AI sires
- Progeny-test programs
- High-value animals worth the cost of genotyping
- Long generation interval that can be reduced substantially by genomics
(40) Council on Dairy Cattle Breeding – CDCB
- CDCB assuming responsibility for receiving data and computing and delivering U.S. evaluations
- USDA will continue research and development to improve evaluation system
- CDCB and USDA employees located at USDA’s Beltsville Agricultural Research Center in Beltsville, Maryland
(41) Questions?