1 Filling Missing Genotypes Using Haplotypes
Paul VanRaden (1), Jeff O’Connell (2), George Wiggans (1), Kent Weigel (3)
(1) Animal Improvement Programs Lab, USDA, Beltsville, MD, USA
(2) University of Maryland School of Medicine, Baltimore, MD, USA
(3) University of Wisconsin Dept. Dairy Science, Madison, WI, USA
Paul.VanRaden@ars.usda.gov
2010

2 Genotypes / Haplotypes
ADSA / ASAS annual meeting, Denver, July 2010, Paul VanRaden
• Genotypes indicate how many copies of each allele were inherited
• Haplotypes indicate which alleles are on which chromosome
• Observed genotypes are partitioned into the two unknown haplotypes
  - Pedigree haplotyping uses relatives
  - Population haplotyping finds matching allele patterns

3 Filling missing genotypes
• Predict unknown SNP from known
  - Measure 3,000, predict 43,000 SNP
  - Measure 50,000, predict 500,000
  - Measure each haplotype at highest density only a few times
• Predict dam from progeny SNP
• Increase reliabilities for less cost

4 Haplotyping Program findhap.f90
• Begin with population haplotyping
  - Divide chromosomes into segments, ~250 SNP / segment
  - List haplotypes by genotype match
  - Similar to fastPhase, IMPUTE
• End with pedigree haplotyping
  - Detect crossover, fix noninheritance
  - Impute nongenotyped ancestors

5 Computer Requirements
500,000 markers, 33,414 animals

Step                                   Gbytes   CPU hours
Simulate genotypes                         39         1.8
Pop’n haplotypes                            2         1.2
Pedigree haplotypes                         3         1.8
Store genotypes                            13           -
Store haplotypes                            3           -
Iterate allele effects (for 5 traits)       8          30

6 Recent Program Revisions
• Improved imputation and GEBV reliability since 9WCGALP paper
• Changes since January 2010
  - Use known haplotype if second is unknown
  - Use current instead of base frequency
  - Combine parent haplotypes if crossover is detected
  - Begin search with parent or grandparent haplotypes
  - Store 2 most popular progeny haplotypes

7 Example Bull: O-Style
USA137611441, Sire = O-Man
• Read genotypes and pedigrees
• Write haplotype segments found
  - List paternal / maternal inheritance
  - List crossover locations

8 O-Style Haplotypes, Chromosome 15 [figure]

9 Pedigree Haplotyping
AB allele coding

Genotypes:
  O-Man          BB,AA,AA,AB,AA,AB,AB,AA,AA,AB
  O-Style        BB,AA,AA,AB,AB,AA,AA,AA,AA,AB
Haplotypes:
  O-Style (pat)  B  A  A  _  A  A  A  A  A  _
  O-Style (mat)  B  A  A  _  B  A  A  A  A  _
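The paternal and maternal rows on this slide follow from a simple locus-by-locus rule, sketched here in Python. This is only an illustration under stated assumptions, not the logic of findhap.f90: the function name `phase_from_sire` is hypothetical, and only the sire's genotype is used, with no check for noninheritance and no use of other relatives or population haplotypes.

```python
def phase_from_sire(offspring, sire):
    """Split offspring genotypes into paternal and maternal alleles (hypothetical helper)."""
    paternal, maternal = [], []
    for child, father in zip(offspring, sire):
        if child[0] == child[1]:
            # Homozygous offspring: both haplotypes carry the same allele.
            pat = mat = child[0]
        elif father[0] == father[1]:
            # Heterozygous offspring, homozygous sire: the sire's allele is paternal.
            pat = father[0]
            mat = "B" if pat == "A" else "A"
        else:
            # Both heterozygous: phase cannot be resolved from these two genotypes.
            pat = mat = "_"
        paternal.append(pat)
        maternal.append(mat)
    return paternal, maternal


o_man = "BB,AA,AA,AB,AA,AB,AB,AA,AA,AB".split(",")
o_style = "BB,AA,AA,AB,AB,AA,AA,AA,AA,AB".split(",")
pat, mat = phase_from_sire(o_style, o_man)
print("O-Style (pat)", " ".join(pat))
print("O-Style (mat)", " ".join(mat))
```

Loci where sire and son are both heterozygous stay unresolved ("_"), which matches the two gaps shown on the slide.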

10 Allele and Segment Coding
• Genotypes
  - 0 = BB, 1 = AB or BA, 2 = AA
  - 5 = missing
• Haplotypes
  - 0 = B, 1 = not known, 2 = A
• Segment storage (example)
  - O-Style has haplotype numbers 5 and 8
  - O-Man has haplotype numbers 8 and 21
  - O-Style got haplotype number 5 from dam
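The coding and the segment-storage example can be sketched in a few lines of Python. The names `encode_genotypes` and `split_by_sire` are hypothetical and are not taken from findhap.f90. The sketch assumes that a haplotype number shared with the sire marks the paternal copy and that the other number came from the dam.

```python
GENOTYPE_CODES = {"BB": 0, "AB": 1, "BA": 1, "AA": 2}
MISSING = 5  # genotype code for a missing call


def encode_genotypes(calls):
    """Convert AB genotype calls to the numeric codes listed on the slide."""
    return [GENOTYPE_CODES.get(call, MISSING) for call in calls]


def split_by_sire(offspring_pair, sire_pair):
    """Return (paternal, maternal) haplotype numbers, or None if not resolvable."""
    first, second = offspring_pair
    if first == second:
        # Same haplotype twice: resolved only if the sire carries it.
        return (first, second) if first in sire_pair else None
    first_in, second_in = first in sire_pair, second in sire_pair
    if first_in and not second_in:
        return first, second
    if second_in and not first_in:
        return second, first
    return None  # both or neither shared with the sire


o_style = encode_genotypes("BB,AA,AA,AB,AB,AA,AA,AA,AA,AB".split(","))
print(o_style)
print(split_by_sire((5, 8), (8, 21)))  # O-Style vs. O-Man
```

For O-Style (5 and 8) and O-Man (8 and 21), haplotype 8 is shared with the sire, so haplotype 5 is the maternal one, as the slide states.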

11 Most Frequent Haplotypes
1st segment of chromosome 15

 1  5.16%  022222222020020022002020200020000200202000022022222202220
 2  4.37%  022020220202200020022022200002200200200000200222200002202
 3  4.36%  022020022202200200022020220000220202200002200222200202220
 4  3.67%  022020222020222002022022202020000202220000200002020002002
 5  3.66%  022222222020222022020200220000020222202000002020220002022
 6  3.65%  022020022202200200022020220000220202200002200222200202222
 7  3.51%  022002222020222022022020220200222002200000002022220002220
 8  3.42%  022002222002220022022020220020200202202000202020020002020
 9  3.24%  022222222020200000022020220020200202202000202020020002020
10  3.22%  022002222002220022002020002220000202200000202022020202220

Most frequent haplotype in Holsteins had 4,316 copies = 0.0516 * 41,822 animals * 2 chromosomes each

12 Population Haplotyping Steps
• Put first genotype into haplotype list
• Check next genotype against list
  - Do any homozygous loci conflict?
    - If haplotype conflicts, continue search
    - If match, fill any unknown SNP with homozygote
    - 2nd haplotype = genotype minus 1st haplotype
    - Search for 2nd haplotype in rest of list
  - If no match in list, add to end of list
• Sort list to put frequent haplotypes 1st
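The steps above can be sketched in Python as a greedy list-building loop. This is a minimal sketch under several assumptions, not the findhap.f90 implementation: all function names are hypothetical, genotypes and haplotypes are digit strings using the codes from the coding slide (so a heterozygous genotype code 1 doubles as an unknown haplotype allele), the search for the 2nd haplotype starts at the entry that matched the 1st, and the counts are a simple tally of matches.

```python
def conflicts(target, haplotype):
    """True if a locus is known (0 or 2) in both strings but the codes differ."""
    return any(t in "02" and h in "02" and t != h
               for t, h in zip(target, haplotype))


def fill_unknown(haplotype, target):
    """Replace unknown alleles (1) in the haplotype with known codes from the target."""
    return "".join(t if h == "1" and t in "02" else h
                   for h, t in zip(haplotype, target))


def subtract(genotype, haplotype):
    """Second haplotype = genotype minus first haplotype, locus by locus."""
    out = []
    for g, h in zip(genotype, haplotype):
        if g in "02":                 # homozygote: both haplotypes carry this allele
            out.append(g)
        elif g == "1" and h in "02":  # heterozygote: second allele is the opposite one
            out.append("0" if h == "2" else "2")
        else:                         # missing genotype or unknown first allele
            out.append("1")
    return "".join(out)


def record(target, haplotypes, counts, start=0):
    """Tally target against the first non-conflicting entry, or add it to the end."""
    for i in range(start, len(haplotypes)):
        if not conflicts(target, haplotypes[i]):
            haplotypes[i] = fill_unknown(haplotypes[i], target)
            counts[i] += 1
            return i, True
    haplotypes.append(target.replace("5", "1"))  # missing genotype -> unknown allele
    counts.append(1)
    return len(haplotypes) - 1, False


def population_haplotypes(genotypes):
    """Build a haplotype list from genotypes and sort it by frequency."""
    haplotypes, counts = [], []
    for genotype in genotypes:
        first, matched = record(genotype, haplotypes, counts)
        if matched:
            second = subtract(genotype, haplotypes[first])
            record(second, haplotypes, counts, start=first)
    order = sorted(range(len(haplotypes)), key=lambda i: -counts[i])
    return [(haplotypes[i], counts[i]) for i in order]


toy_genotypes = ["2110", "2200", "2110", "2020"]  # made-up 4-SNP segment
result = population_haplotypes(toy_genotypes)
print(result)
```

In the toy data the first genotype enters the list with two unknown alleles, the second genotype fills them in, and the third genotype yields a new haplotype by subtraction.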

13 Check New Genotype Against List
1st segment of chromosome 15

5.16%  022222222020020022002020200020000200202000022022222202220
4.37%  022020220202200020022022200002200200200000200222200002202
4.36%  022020022202200200022020220000220202200002200222200202220
3.67%  022020222020222002022022202020000202220000200002020002002
3.66%  022222222020222022020200220000020222202000002020220002022
3.65%  022020022202200200022020220000220202200002200222200202222
3.51%  022002222020222022022020220200222002200000002022220002220
3.42%  022002222002220022022020220020200202202000202020020002020
3.24%  022222222020200000022020220020200202202000202020020002020
3.22%  022002222002220022002020002220000202200000202022020202220

Subtract 1st haplotype from genotype to get 2nd:
       022002222002220022022020220020200202202000202020020002020
Check genotype:
       022112222011221022021110220010110212202000102020120002021
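The check on this slide can be replayed with the listed data. The Python sketch below is illustrative only: the helper names are hypothetical, and it assumes that a conflict means a locus where the genotype is homozygous (0 or 2) and the haplotype carries the other allele.

```python
HAPLOTYPES = [
    "022222222020020022002020200020000200202000022022222202220",
    "022020220202200020022022200002200200200000200222200002202",
    "022020022202200200022020220000220202200002200222200202220",
    "022020222020222002022022202020000202220000200002020002002",
    "022222222020222022020200220000020222202000002020220002022",
    "022020022202200200022020220000220202200002200222200202222",
    "022002222020222022022020220200222002200000002022220002220",
    "022002222002220022022020220020200202202000202020020002020",
    "022222222020200000022020220020200202202000202020020002020",
    "022002222002220022002020002220000202200000202022020202220",
]
GENOTYPE = "022112222011221022021110220010110212202000102020120002021"


def conflicts(target, haplotype):
    """True if a locus is known (0 or 2) in both strings but the codes differ."""
    return any(t in "02" and h in "02" and t != h
               for t, h in zip(target, haplotype))


def subtract(genotype, haplotype):
    """Second haplotype = genotype minus first haplotype, locus by locus."""
    opposite = {"0": "2", "2": "0"}
    return "".join(g if g in "02" else opposite.get(h, "1") if g == "1" else "1"
                   for g, h in zip(genotype, haplotype))


# First list entry with no conflict at homozygous loci (0-based index).
first = next(i for i, h in enumerate(HAPLOTYPES) if not conflicts(GENOTYPE, h))
second_haplotype = subtract(GENOTYPE, HAPLOTYPES[first])
# Search the rest of the list for the remainder.
second = next(i for i in range(first, len(HAPLOTYPES))
              if not conflicts(second_haplotype, HAPLOTYPES[i]))
print(first + 1, second + 1)
print(second_haplotype)
```

With the slide's data, the first four entries each conflict with the genotype at a homozygous locus. The first match is the fifth entry (3.66%), and the remainder equals the eighth entry (3.42%), which is the 2nd haplotype printed on the slide.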

14 Conclusions
• Missing genotypes can be filled easily
  - Population and pedigree haplotyping can both process long segments efficiently
  - Imputing 500,000 SNP for 33,414 Holsteins required 3 Gbyte memory, 3 CPU hours
• Program findhap.f90 implemented for April 2010 routine evaluation
  - Several recent improvements to accuracy
  - Ready to include lower or higher density genotypes in evaluations

