Published by Arlene Parks. Modified over 9 years ago.
1
Wiggans, 2013 SRUC Imputation (1)
Imputation
Dr. George R. Wiggans
Animal Improvement Programs Laboratory
Agricultural Research Service, USDA
Beltsville, MD 20705-2350, USA
george.wiggans@ars.usda.gov
[Title graphic: rows of genotype codes (0, 1, 2)]
2
Imputation
- Based on splitting the genotype into individual chromosomes (maternal and paternal contributions)
- Missing SNPs assigned by tracking inheritance from ancestors and descendants
- Imputed dams increase predictor population
- Genotypes from all chips merged by imputing SNPs not present
3
Terms
- Genotype: alleles on both chromosomes for all markers
- Allele representation: A, B; A, C, T, G
- Genotype representation: number of A's; 0, 1, 2, 5 (missing)
- Imputation: determination of an allele from alleles of other markers and animals
- Phasing: separating a genotype into individual chromosomes and possibly assigning maternal or paternal origin
4
Genotype for Elevation
- Chromosome 1
100011122002001211101111211110111100112110002012
200222011112021012002111221100211120011110010110
110102200110022011011200201101020222121122102010
011100011220221222112021120120201002022020000211
000112020112211121110220111100002122020002210120
200022112201110121001112111021121100201021000220
002201000201100002202211022112101121110122220012
112122200200020020202012221100222222200221211112
100211112001101110112002022200011120110102111212
111020221002112012110011111021112110211122000101
101110202200221110102011121111011202102102121101
102212200121101121101202201100222002100211000111
002110211011100022200202212121100022201020022221
212211211120020110202001222222112212021211210110
012110110200220002001002000111101100121102121211
120101012120221010101111102110211221111112121112
101101200111110211110111112201210121211010222020
21211222120222002121210121210201100111222121101
5
X chromosome
- Bull: 202220200002022220002020222020202
- Cow: 1201201212222010111022210210212022
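The two strings on this slide differ in a way that is easy to check. A bull carries a single X chromosome, so outside the pseudoautosomal region his X genotypes should show no heterozygous (1) calls. The following is a minimal sketch of that check; the function name is a hypothetical helper, not part of any program named in the slides.

```python
# Genotype strings copied from the slide (0 = BB, 1 = AB, 2 = AA).
BULL_X = "202220200002022220002020222020202"
COW_X = "1201201212222010111022210210212022"


def heterozygous_calls(genotype: str) -> int:
    """Count SNPs coded 1, i.e. one A allele and one B allele."""
    return genotype.count("1")


if __name__ == "__main__":
    print("bull heterozygous calls:", heterozygous_calls(BULL_X))
    print("cow heterozygous calls:", heterozygous_calls(COW_X))
```

A nonzero count for a male on X-specific SNPs would suggest a sex or sample mix-up, under the assumption that all SNPs shown are X-specific.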
6
Pedigree: parents, grandparents, etc.
7
O-Style haplotypes: chromosome 15
8
findhap
- Developed by Paul VanRaden
- Divides chromosomes into segments
- Allows for successively shorter segments, typically 3 runs
  - Long segments lock in identical by descent
  - Shorter segments fill in missing SNPs
- Separates genotype into maternal and paternal contributions, i.e. haplotypes (phasing)
- Builds haplotype library sequenced by frequency
9
findhap characteristics
- Population haplotyping
  - Divides chromosomes into segments
  - Lists haplotypes by genotype match
  - Similar to FastPhase, Impute, or long range phasing
- Pedigree haplotyping
  - Detects crossover; fixes noninheritance
  - Imputes nongenotyped ancestors
10
Recent program revisions
- Improved imputation and reliability
- Changes since January 2010
  - Use known haplotype if 2nd is unknown
  - Use current instead of base frequency
  - Combine parent haplotypes if crossover is detected
  - Begin search with parent or grandparent haplotypes
  - Store 2 most popular progeny haplotypes
- Decreased computing time by using previous haplotype library
11
Population haplotyping
- Put 1st genotype into haplotype list
- Check next genotype against list
  - Do any homozygous loci conflict?
    - If haplotype conflicts, continue search
    - If match, fill any unknown SNP with homozygote
    - 2nd haplotype = genotype minus 1st haplotype
    - Search for 2nd haplotype in rest of list
  - If no match in list, add to end of list
- Sort list to put frequent haplotypes 1st
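The steps above can be sketched in a few lines of Python. This is an illustration only, not findhap's actual code: the function names are hypothetical, genotypes are strings of 0/1/2 (5 = missing), and haplotypes are strings of 0 = B, 2 = A, 1 = unknown. As a simplification, the second haplotype is searched for in the whole list rather than only the rest of it.

```python
def conflicts(hap: str, geno: str) -> bool:
    """True if hap and geno carry opposite known alleles at a homozygous locus."""
    return any(
        (g == "0" and h == "2") or (g == "2" and h == "0")
        for h, g in zip(hap, geno)
    )


def fill_unknown(hap: str, geno: str) -> str:
    """Fill unknown haplotype alleles ('1') wherever geno is homozygous."""
    return "".join(g if h == "1" and g in "02" else h for h, g in zip(hap, geno))


def complement(hap: str, geno: str) -> str:
    """2nd haplotype = genotype minus 1st haplotype, locus by locus."""
    out = []
    for h, g in zip(hap, geno):
        if g in "02":
            out.append(g)  # homozygote: both haplotypes carry this allele
        elif g == "1" and h in "02":
            out.append("0" if h == "2" else "2")  # heterozygote: opposite allele
        else:
            out.append("1")  # missing genotype or unknown allele
    return "".join(out)


def as_haplotype(geno: str) -> str:
    """Keep homozygous calls; everything else becomes unknown ('1')."""
    return "".join(g if g in "02" else "1" for g in geno)


def find_or_add(library: list, geno: str):
    """Return the first non-conflicting haplotype, or append a new entry."""
    for entry in library:
        if not conflicts(entry[0], geno):
            entry[0] = fill_unknown(entry[0], geno)
            entry[1] += 1
            return entry[0]
    library.append([as_haplotype(geno), 1])
    return None


def build_library(genotypes: list) -> list:
    """Build [haplotype, count] entries and sort the most frequent first."""
    library = []
    for geno in genotypes:
        first = find_or_add(library, geno)
        if first is not None:
            find_or_add(library, complement(first, geno))
    library.sort(key=lambda entry: entry[1], reverse=True)
    return library


if __name__ == "__main__":
    print(build_library(["0220", "0210", "0200"]))
```

With the three toy genotypes, the second animal matches the first library entry and contributes a new complementary haplotype, and the third animal is homozygous for that new haplotype, so it ends up first after sorting.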
12
Coding of alleles and segments
- Genotypes
  - 0 = BB, 1 = AB or BA, 2 = AA
  - 3 = B_, 4 = A_, 5 = __ (missing)
  - Allele frequency used for missing
- Haplotypes
  - 0 = B, 1 = not known, 2 = A
- Segment inheritance (example)
  - Son has haplotype numbers 5 and 8
  - Sire has haplotype numbers 8 and 21
  - Son got haplotype number 5 from dam
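A small sketch of how these codes might be used follows. Two things here are assumptions rather than statements from the slide: that "allele frequency used for missing" means a missing allele is replaced by the frequency of A when counting A alleles, and the function names, which are hypothetical.

```python
GENOTYPE_CODES = {0: "BB", 1: "AB", 2: "AA", 3: "B_", 4: "A_", 5: "__"}


def expected_a_count(code: int, freq_a: float) -> float:
    """Expected number of A alleles, using freq_a for each missing allele (assumed)."""
    if code in (0, 1, 2):
        return float(code)
    if code == 3:          # B_: one known B, one missing allele
        return freq_a
    if code == 4:          # A_: one known A, one missing allele
        return 1.0 + freq_a
    if code == 5:          # __: both alleles missing
        return 2.0 * freq_a
    raise ValueError(f"unknown genotype code: {code}")


def split_by_parent(son: tuple, sire: tuple):
    """Return (paternal, maternal) haplotype numbers for one segment.

    Returns None when both or neither of the son's haplotypes appear in the
    sire, since origin cannot be resolved from this comparison alone.
    """
    a, b = son
    if a in sire and b not in sire:
        return a, b
    if b in sire and a not in sire:
        return b, a
    return None


if __name__ == "__main__":
    # Slide example: son has 5 and 8, sire has 8 and 21.
    print(split_by_parent((5, 8), (8, 21)))  # (8, 5): number 5 came from the dam
```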
13
Most frequent haplotypes
- 1st segment of chromosome 15
- For efficiency, store haplotypes just once
- Most frequent Holstein haplotype had 4,316 copies (0.0516 × 41,822 animals × 2 chromosomes each)

Rank  Frequency  Haplotype
1     5.16%      022222222020020022002020200020000200202000022022222202220
2     4.37%      022020220202200020022022200002200200200000200222200002202
3     4.36%      022020022202200200022020220000220202200002200222200202220
4     3.67%      022020222020222002022022202020000202220000200002020002002
5     3.66%      022222222020222022020200220000020222202000002020220002022
6     3.65%      022020022202200200022020220000220202200002200222200202222
7     3.51%      022002222020222022022020220200222002200000002022220002220
8     3.42%      022002222002220022022020220020200202202000202020020002020
9     3.24%      022222222020200000022020220020200202202000202020020002020
10    3.22%      022002222002220022002020002220000202200000202022020202220
14
Check new genotype against list
- 1st segment of chromosome 15
  - Search for 1st haplotype that matches genotype
    022112222011221022021110220010110212202000102020120002021
  - Get 2nd haplotype by removing 1st from genotype
    022002222002220022022020220020200202202000202020020002020

Frequency  Haplotype
5.16%      022222222020020022002020200020000200202000022022222202220
4.37%      022020220202200020022022200002200200200000200222200002202
4.36%      022020022202200200022020220000220202200002200222200202220
3.67%      022020222020222002022022202020000202220000200002020002002
3.66%      022222222020222022020200220000020222202000002020220002022
3.65%      022020022202200200022020220000220202200002200222200202222
3.51%      022002222020222022022020220200222002200000002022220002220
3.42%      022002222002220022022020220020200202202000202020020002020
3.24%      022222222020200000022020220020200202202000202020020002020
3.22%      022002222002220022002020002220000202200000202022020202220
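The lookup on this slide can be reproduced from its own data. The sketch below assumes the haplotype coding 0 = B, 2 = A, so that a genotype (count of A alleles) equals half the sum of its two haplotypes and the second haplotype is 2 × genotype minus the first. Names are hypothetical; this is an illustration, not findhap itself.

```python
# Haplotype list and genotype copied from the slide, most frequent first.
LIBRARY = [
    "022222222020020022002020200020000200202000022022222202220",
    "022020220202200020022022200002200200200000200222200002202",
    "022020022202200200022020220000220202200002200222200202220",
    "022020222020222002022022202020000202220000200002020002002",
    "022222222020222022020200220000020222202000002020220002022",
    "022020022202200200022020220000220202200002200222200202222",
    "022002222020222022022020220200222002200000002022220002220",
    "022002222002220022022020220020200202202000202020020002020",
    "022222222020200000022020220020200202202000202020020002020",
    "022002222002220022002020002220000202200000202022020202220",
]
GENOTYPE = "022112222011221022021110220010110212202000102020120002021"


def matches(hap: str, geno: str) -> bool:
    """A haplotype matches if it never opposes a homozygous genotype call."""
    return all(not (g in "02" and h != g) for h, g in zip(hap, geno))


def remove_haplotype(hap: str, geno: str) -> str:
    """2nd haplotype = 2 * genotype - 1st haplotype at every locus."""
    return "".join(str(2 * int(g) - int(h)) for h, g in zip(hap, geno))


def phase(geno: str, library: list) -> tuple:
    """Return 1-based list positions of the 1st and 2nd haplotype."""
    first = next(i for i, hap in enumerate(library) if matches(hap, geno))
    second_hap = remove_haplotype(library[first], geno)
    return first + 1, library.index(second_hap) + 1


if __name__ == "__main__":
    print(phase(GENOTYPE, LIBRARY))
```

Run on the slide's data, the first four haplotypes conflict with the genotype at a homozygous SNP, the 3.66% haplotype is the first match, and the remainder is the 3.42% haplotype, which is the string shown on the slide as the 2nd haplotype.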
15
Recessive defect discovery
- Check for homozygous haplotypes
  - Most haplotype blocks ~5 Mbp long
  - 7–90 expected, but 0 observed
- 5 of top 11 haplotypes confirmed as lethal
- Investigation of 936–52,449 carrier sire × carrier MGS fertility records found 3.0–3.7% lower conception rates
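Why "0 observed" is striking when at least 7 homozygotes are expected can be illustrated with a rough calculation. The sketch below rests on two assumptions that are not on the slide: that the number of homozygous animals would follow a Poisson distribution if the haplotype were harmless, and simple Mendelian inheritance for a carrier sire × carrier MGS mating. Function names are hypothetical.

```python
import math


def homozygous_probability() -> float:
    """Chance a calf is homozygous when sire and MGS are both carriers.

    Sire transmits the haplotype with probability 1/2; the dam inherited it
    from the MGS with probability 1/2 and passes it on with probability 1/2.
    The maternal granddam's contribution is ignored in this sketch.
    """
    return 0.5 * (0.5 * 0.5)


def prob_none_observed(expected: float) -> float:
    """Poisson probability of seeing zero homozygotes when `expected` are expected."""
    return math.exp(-expected)


if __name__ == "__main__":
    print(homozygous_probability())
    for expected in (7, 90):
        print(expected, prob_none_observed(expected))
```

Under these assumptions, even the smallest expected count on the slide (7) leaves well under a 1% chance of observing no homozygotes by luck, which is consistent with the haplotypes being lethal when homozygous.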
16
Traditional evaluations 3X/year
- Yield
  - Milk, fat, protein, component percentages
- Type
  - Stature, udder characteristics, feet and legs
- Calving
  - Calving ease, stillbirth rate
- Functional
  - Somatic cell score, productive life, fertility
17
Reduce generation interval from 5 to 2 yr
[Timeline figure, years 0 to 5; labels in extracted order]
- Genomic prediction of progeny test
- Select parents, transfer embryos to recipients
- Calves born and DNA tested
- Calves born from DNA-selected parents
- Bull receives progeny test
18
Benefit of genomics
- Determine value of bull at birth
- Increase selection accuracy
- Reduce generation interval
- Increase selection intensity
- Increase rate of genetic gain
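How the middle three bullets feed the last one can be read off the standard breeder's equation from quantitative genetics. The equation is not on the slide; it is added here as a sketch, and the symbols are this note's own: i is selection intensity, r is selection accuracy, sigma_A is the additive genetic standard deviation, and L is the generation interval in years.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
```

Higher accuracy and intensity raise the numerator and a shorter generation interval lowers the denominator, so each of them increases annual genetic gain when the other terms are held constant.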
19
Genomic evaluation program
- Identify animals to genotype
- Send sample to genotyping laboratory
- Genotype sample
- Send genotype to evaluation center
- Calculate genomic evaluation
- Release monthly evaluation
20
Genomic data flow
[Flow diagram; labels as extracted]
- Participants: DHI herd; DNA laboratory; AI organization, breed association; CDCB
- Flows: DNA samples; genotypes; nominations, pedigree data; genotype quality reports; genomic evaluations
21
Genotyped animals, April 2013

Chip     Traditional evaluation?  Animal sex  Holstein   Jersey  Brown Swiss  Ayrshire
50K      Yes                      Bulls         21,904    2,855        5,381       639
50K      Yes                      Cows          16,062    1,054          110         3
50K      No                       Bulls         45,537    3,884        1,031       325
50K      No                       Cows          32,892      660          102       110
<50K     Yes                      Bulls             19       11           28         9
<50K     Yes                      Cows          21,980    9,132          465         0
<50K     No                       Bulls         14,026    1,355           90         2
<50K     No                       Cows         158,622   18,722          658       105
Imputed  Yes                      Cows           2,713      237          103        12
Imputed  No                       Cows           1,183       32          112         8
All                                            314,938   37,942        8,080     1,213
22
Steps to prepare genotypes
- Nominate animal for genotyping
- Collect blood, hair, semen, nasal swab, or ear punch
  - Blood may not be suitable for twins
- Extract DNA at laboratory
- Prepare DNA and apply to beadchip
- Do amplification and hybridization (3-day process)
- Read red/green intensities from chip and call genotypes from clusters
23
What can go wrong
- Inadequate DNA quality or quantity from sample
- Genotype with many SNPs that cannot be determined (90% call rate required)
- Parent-progeny conflicts
  - Pedigree error
  - Sample ID error (switched samples)
  - Laboratory error
  - Parent-progeny relationship detected not in pedigree
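The 90% call-rate requirement can be illustrated with a short check. This is a sketch under the assumption that only code 5 (both alleles missing) counts as an uncalled SNP; the function names and the toy genotype strings are hypothetical.

```python
def call_rate(genotype: str, missing: str = "5") -> float:
    """Fraction of SNPs with a called genotype."""
    called = len(genotype) - genotype.count(missing)
    return called / len(genotype)


def passes_call_rate(genotype: str, minimum: float = 0.90) -> bool:
    """Apply the 90% call-rate requirement stated on the slide."""
    return call_rate(genotype) >= minimum


if __name__ == "__main__":
    for geno in ("0125012012", "0555012012"):
        print(geno, call_rate(geno), passes_call_rate(geno))
```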
24
Parentage validation and discovery
- Parent-progeny conflicts detected
  - Animal checked against all other genotypes
  - Conflict reported to breeds and requesters
  - Correct sire usually detected
- MGS checked
  - 1 SNP at a time
  - Haplotype checking more accurate
- Breeds moving to accept SNPs in place of microsatellites
25
Parent-progeny conflicts
[Figure: SNP genotypes (A/A, A/B, B/B) of an animal shown beside those of its sire and its MGS; asterisks mark tested SNPs]

Relationship  Conflicts  Tests  Conflict %
Sire          0          10     0%
MGS           3          10     30.0%
26
Parent-progeny conflicts
- For animal
  - Pedigree wrong
  - Genotype unreliable (3K)
- For SNP
  - SNP unreliable
  - Clustering needs adjustment

Parent   10212002101201211001020100100
Progeny  10202010100200221001120120220
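A conflict between parent and progeny is a SNP where the two are opposite homozygotes (0 versus 2), which is impossible if the parent really transmitted one allele. The sketch below counts such SNPs in the two strings from this slide. Treating only SNPs where both animals are homozygous as "tests" is an assumption of this sketch, not a definition taken from the slides, and the function name is hypothetical.

```python
# Genotype strings copied from the slide (0 = BB, 1 = AB, 2 = AA).
PARENT = "10212002101201211001020100100"
PROGENY = "10202010100200221001120120220"


def conflict_stats(parent: str, progeny: str) -> tuple:
    """Return (conflicts, tests) over SNPs where both animals are homozygous."""
    tests = 0
    conflicts = 0
    for p, c in zip(parent, progeny):
        if p in "02" and c in "02":
            tests += 1
            if p != c:  # opposite homozygotes cannot be parent and progeny
                conflicts += 1
    return conflicts, tests


if __name__ == "__main__":
    conflicts, tests = conflict_stats(PARENT, PROGENY)
    print(f"conflicts={conflicts} tests={tests} rate={100 * conflicts / tests:.1f}%")
```

For this pair the count is 3 conflicts in 17 tests, far more than a genotyping error rate would explain, which points to a wrong pedigree or an unreliable genotype as listed above.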
27
Detecting unreliable genotypes
[Figure: distribution of conflicts (%), x-axis from 0 to 3.6; regions labeled "Accept", "Unreliable genotype (reject)", and "Reject"]
28
MGS detection
- SNP conflict method (SNP)
  - Check if animal and MGS have opposite homozygotes (duo test)
  - If sire is genotyped, some heterozygous SNPs can be checked (trio test)
- Common haplotype method (HAP)
  - After imputation of all loci, determine maternal contribution by removing paternal haplotype
  - Count maternal haplotypes in common with MGS
  - Remove haplotypes from MGS and check remaining against maternal great-grandsire (MGGS)
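The counting step of the common haplotype method can be sketched as follows. The data layout is an assumption made for illustration: one maternal haplotype number per chromosome segment for the animal, and a pair of haplotype numbers per segment for the candidate MGS. The function name and toy numbers are hypothetical.

```python
def share_in_common(maternal_haps: list, mgs_haps: list) -> float:
    """Fraction of segments where the animal's maternal haplotype is one of the MGS's two."""
    hits = sum(m in pair for m, pair in zip(maternal_haps, mgs_haps))
    return hits / len(maternal_haps)


if __name__ == "__main__":
    animal_maternal = [5, 12, 3, 7]                  # one haplotype number per segment
    candidate_mgs = [(5, 9), (1, 2), (3, 3), (7, 8)]  # two per segment
    print(share_in_common(animal_maternal, candidate_mgs))
```

Because a dam passes on either her sire's or her dam's haplotype in each segment, a true MGS would be expected to share only part of the maternal haplotypes, so any acceptance threshold would need to be set well below 100%; the slides do not state the threshold used.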
29
Results by breed

              SNP method        HAP method
Breed         MGS % confirmed   MGS % confirmed   MGGS % confirmed
Holstein      95 (98)*          97                92
Jersey        91 (92)           95
Brown Swiss   94 (95)           97                85

*50K genotyped animals only
30
Lab QC
- Each SNP evaluated for
  - Call rate
  - Portion heterozygous
  - Parent-progeny conflicts
- Clustering investigated if SNP exceeds limits
- Number of failing SNPs indicates genotype quality
- Target <10 SNPs in each category
31
Before clustering adjustment: 86% call rate
32
After clustering adjustment: 100% call rate
33
Automated QC reporting
6160 Genotypes Processed from LAB2013021811
PASS/FAIL,Count,Description
PASS,1,Parent Progeny Conflict SNP >2%
PASS,5,Low Call Rate SNP >10%
PASS,0,HWE SNP
PASS,0,Chips w/ >20 Conflicts
PASS,0.3,No Nomination %
PASS,0,Genotype Submitted with No Sample Sheet Row
34
Reliability of Holstein predictions

Trait                          Bias*    b      REL (%)  REL gain (%)
Milk (kg)                      −64.3    0.92   67.1     28.6
Fat (kg)                       −2.7     0.91   69.8     31.3
Protein (kg)                   0.7      0.85   61.5     23.0
Fat (%)                        0.0      1.00   86.5     48.0
Protein (%)                    0.0      0.90   79.0     40.4
Productive life (mo)           −1.8     0.98   53.0     21.8
Somatic cell score             0.0      0.88   61.2     27.0
Daughter pregnancy rate (%)    0.0      0.92   51.2     21.7
Sire calving ease (%DBH)       0.8      0.73   31.0     10.4
Daughter calving ease (%DBH)   −1.1     0.81   38.4     19.9
Sire stillbirth (%)            1.5      0.92   21.8     3.7
Daughter stillbirth (%)        −0.2     0.83   30.3     13.2

*2011 deregressed value minus 2007 genomic evaluation
35
Marketed Holstein bulls
36
Ways to increase accuracy
- Automatic addition of traditional evaluations of genotyped bulls when they are 5 yr old
- Possible genotyping of 10,000 bulls with semen in repository
- Collaboration with other countries
- Use of more SNPs from HD chips
- Full sequencing to identify causative mutations
37
Application to more traits
- Animal's genotype is good for all traits
- Traditional evaluations required for accurate estimates of SNP effects
- Traditional evaluations not currently available for heat tolerance or feed efficiency
- Research populations could provide data for traits that are expensive to measure
- Will resulting evaluations work in target population?
38
Impact on producers
- Young-bull evaluations with accuracy of early 1st-crop evaluations
- AI organizations marketing genomically evaluated young bulls
- Genotype usually required to be a bull dam
- Rate of genetic improvement likely to increase by up to 50%
- AI organizations reducing progeny-test programs
39
Why genomics works for dairy cattle
- Extensive historical data available
- Well-developed genetic evaluation program
- Widespread use of AI sires
- Progeny-test programs
- High-value animals worth the cost of genotyping
- Long generation interval that can be reduced substantially by genomics
40
Council on Dairy Cattle Breeding (CDCB)
- CDCB assuming responsibility for receiving data and computing and delivering U.S. evaluations
- USDA will continue research and development to improve evaluation system
- CDCB and USDA employees located at USDA's Beltsville Agricultural Research Center in Beltsville, Maryland
41
Questions?