Published by Mitchell Boyd. Modified over 6 years ago.
1
A multi-strain, high-resolution mouse haplotype map reveals three distinctive genetic signatures
Laboratory of Population Genetics
2
Motivations
An accurate high-resolution haplotype map of the mouse genome enables prioritization of QTL candidate genes
Different haplotype block structures have been reported in different studies
  >10 Mb block size in GNF study (Wiltshire et al., PNAS 2003)
  Mb block size in WI study (Wade et al., Nature 2002)
  kb block size in an 8 Mb region of chr 19 (Park et al., Genome Research 2003)
Analysis of a 10 Mb region on chromosome 7 using the Celera mouse SNPs reveals a different genetic variation pattern
Celera mouse chromosome 16 SNP data are publicly available
3
Objectives
Develop an integrated, high-resolution, multi-strain mouse haplotype map
Compare the haplotype structure derived from high-density SNPs with those derived from low-density markers
Perform experimental validation in regions of conflict and in regions of interest across 20 inbred strains
Analyze biological factors that have contributed to the formation of mouse genetic variation patterns
4
Data Sources
Chromosome 16 reference sequence
  MGSCv3 (NCBI build 30, Feb. 2003)
SNP Data
5
Construction of Multi-Strain Haplotype Blocks with High Density SNP Markers
Method
  Greedy algorithm that starts with two haplotypes per block
  Seed: a minimum of two adjacent SNPs with no ambiguity in haplotype assignment
  A singleton SNP that breaks the two-haplotype configuration does not affect block extension
Results
  2,083 blocks
  65,068 (95%) Celera SNPs in 5 laboratory inbred strains
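The greedy procedure on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each SNP is encoded as a string over the strains with `0` for the B6 allele, `1` for the non-B6 allele and `N` for missing data, that a block keeps a single two-haplotype pattern, and that a "singleton" is one lone incompatible SNP flanked by compatible SNPs. The function names `compatible` and `build_blocks` are hypothetical.

```python
def compatible(snp, pattern):
    """A SNP fits the block if every called allele matches the block pattern."""
    return all(a == "N" or a == b for a, b in zip(snp, pattern))


def build_blocks(snps):
    """Greedy two-haplotype block construction (illustrative assumption-based sketch).

    snps: list of equal-length strings over {'0', '1', 'N'}, in map order.
    Returns a list of (pattern, member_indices) tuples.
    """
    blocks = []
    n = len(snps)
    i = 0
    while i < n - 1:
        a, b = snps[i], snps[i + 1]
        # Seed: two adjacent SNPs, fully called, with the same strain pattern.
        if "N" in a or "N" in b or a != b:
            i += 1
            continue
        pattern = a
        members = [i, i + 1]
        j = i + 2
        while j < n:
            if compatible(snps[j], pattern):
                members.append(j)
                j += 1
            elif j + 1 < n and compatible(snps[j + 1], pattern):
                # Lone incompatible SNP: skip it, the block keeps extending.
                j += 1
            else:
                break
        blocks.append((pattern, members))
        i = members[-1] + 1
    return blocks


# Toy data for five strains; index 3 is a singleton that does not break the block.
toy_snps = ["01100", "01100", "0N100", "11000", "01100", "00011", "00011", "00011"]
toy_blocks = build_blocks(toy_snps)
```

In the toy data the first block absorbs the SNP with a missing call and skips the lone incompatible SNP, while two consecutive incompatible SNPs end the block and seed the next one.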
6
Distribution of Haplotype Block Size
7
Blocks with Different Size Have Similar SNP Density Distribution
8
A 2.4-Mb Haplotype Block with Varying SNP Density
Figure: haplotype patterns for DBA/2J, A/J, 129X1/SvJ, 129S1/SvImJ and C57BL/6J, with a SNP density track (#SNP/10kb; classes >20, 11-20, 6-10, 2-5, 1) and experimentally validated SNPs. Legend: B6 allele, non-B6 allele.
374 SNPs over 2.4 Mb; average density = 0.156/kb
153 of these SNPs were in hotspots (red and orange)
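The density figures on this slide can be checked with a short Python sketch. The SNP counts and region length come from the slide; the function names and the extra `"0"` class for empty windows are assumptions added for illustration, and the class boundaries follow the #SNP/10kb legend.

```python
def average_density_per_kb(n_snps, region_bp):
    """Average SNP density in SNPs per kb."""
    return n_snps / (region_bp / 1000)


def density_class(snps_in_window):
    """Map a per-10kb SNP count onto the legend classes of the density track."""
    if snps_in_window > 20:
        return ">20"
    if snps_in_window >= 11:
        return "11-20"
    if snps_in_window >= 6:
        return "6-10"
    if snps_in_window >= 2:
        return "2-5"
    if snps_in_window == 1:
        return "1"
    return "0"  # assumed label for windows without SNPs


block_density = average_density_per_kb(374, 2_400_000)  # about 0.156 SNPs per kb
hotspot_fraction = 153 / 374                            # about 41% of the block's SNPs
```

Roughly two fifths of the SNPs in this block sit in hotspot windows, which is why a single large block can still show strongly varying SNP density.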
9
A 2.4-Mb Region with High SNP Density but Heterogeneous Variation Pattern (Erosion)
Ataxin 2 binding protein 1 (nucleic acid binding, RNA binding)
Figure: haplotype patterns for 129S1, DBA/2J, A/J, 129X1 and B6, with a SNP density track (#SNP/10kb; classes >20, 11-20, 6-10, 2-5, 1). Legend: B6 allele, non-B6 allele, missing data.
10
Details of Haplotype Erosion Across 160 kb
Location: 5,721,639-5,878,633 bp on chr16
Blocks
  179 SNPs in 14 blocks with the major pattern
  116 SNPs in 19 blocks with the other patterns
  49 singleton SNPs
Figure: haplotype patterns for 129S1/SvImJ, DBA/2J, A/J, 129X1/SvJ and C57BL/6J, with SNP density.
11
Other Heterogeneous Haplotype Patterns
2) Segmentation
Figure: haplotype patterns for 129S1, DBA/2J, A/J, 129X1 and C57BL/6J, with SNP density.
12
Other Heterogeneous Haplotype Patterns
3) Segmentation with Erosion
4) Random
13
Three Major Variation Patterns
SNP Deserts: >1 Mb with <0.5 SNP per 10 kb
Large Blocks: >300 kb “melded” haplotype blocks with consistent variation patterns
Block Breakers: regions with heterogeneous variation patterns
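The SNP desert definition above (more than 1 Mb with fewer than 0.5 SNPs per 10 kb) can be turned into a simple scan. The Python sketch below is one possible reading, not the authors' method: it assumes a 1 Mb window moved in 10 kb steps, flags windows holding fewer than 50 SNPs, and merges overlapping flagged windows. The function name `find_snp_deserts` and the synthetic positions are hypothetical.

```python
from bisect import bisect_left

WINDOW_BP = 1_000_000                 # minimum desert length from the slide
STEP_BP = 10_000                      # assumed scan step
MAX_SNPS = 0.5 * WINDOW_BP / 10_000   # fewer than 0.5 SNP per 10 kb -> fewer than 50 per Mb


def find_snp_deserts(positions, chrom_length):
    """Return merged (start, end) intervals whose 1 Mb windows are SNP-poor."""
    positions = sorted(positions)
    deserts = []
    for start in range(0, chrom_length - WINDOW_BP + 1, STEP_BP):
        end = start + WINDOW_BP
        # Count SNPs in the half-open window [start, end).
        count = bisect_left(positions, end) - bisect_left(positions, start)
        if count < MAX_SNPS:
            if deserts and start <= deserts[-1][1]:
                deserts[-1][1] = end          # extend the current desert
            else:
                deserts.append([start, end])  # open a new desert
    return [tuple(d) for d in deserts]


# Synthetic chromosome: 1 SNP per kb, except an empty stretch from 2 Mb to 4 Mb.
toy_positions = list(range(0, 2_000_000, 1000)) + list(range(4_000_000, 6_000_000, 1000))
toy_deserts = find_snp_deserts(toy_positions, 6_000_000)
```

Because a window may contain up to 49 SNPs and still be flagged, the merged interval reaches slightly into the dense flanks; a stricter scan would trim the ends back to the last SNP-free position.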
14
Predictive Power of Haplotype Structures
Test the ability to use the haplotype structure in one study to predict allelic variations in another study
Our haplotype blocks
  98% accuracy on WI B6/129S1 SNPs that do not overlap with Celera SNPs
  92% accuracy on GNF B6/129S1/AJ/DBA haplotypes
WI B6/129S1 haplotype blocks
  74% accuracy on Celera B6/129S1 genotypes
  85% accuracy on GNF B6/129S1 genotypes
80% of GNF markers are non-polymorphic across the inbred strains used in Celera and WI shotgun sequencing
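The slide does not say how prediction accuracy was scored. The Python sketch below shows one plausible scoring scheme under stated assumptions: two strains are predicted to differ at a test SNP exactly when they carry different haplotypes in the block containing that SNP, and SNPs outside every block are not scored. The function name, the data layout and the toy values are all hypothetical.

```python
def prediction_accuracy(blocks, test_snps, strain_a, strain_b):
    """Fraction of in-block test SNPs whose strain difference is predicted correctly.

    blocks:    list of (start, end, {strain: haplotype_id}), half-open intervals
    test_snps: list of (position, observed_differ) from an independent study
    """
    correct = 0
    scored = 0
    for pos, observed_differ in test_snps:
        for start, end, hap_of in blocks:
            if start <= pos < end:
                predicted_differ = hap_of[strain_a] != hap_of[strain_b]
                correct += predicted_differ == observed_differ
                scored += 1
                break
    return correct / scored if scored else None


toy_blocks = [
    (0, 1000, {"B6": 0, "129S1": 1}),     # strains on different haplotypes
    (2000, 3000, {"B6": 0, "129S1": 0}),  # strains share a haplotype
]
toy_test_snps = [(100, True), (500, True), (2500, True), (2600, False), (1500, True)]
toy_accuracy = prediction_accuracy(toy_blocks, toy_test_snps, "B6", "129S1")
```

In the toy data four SNPs fall inside blocks and three are predicted correctly; the SNP at position 1500 lies between blocks and is left out of the score.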
15
SNP Deserts in Chromosome 16
6 SNP deserts of >1 Mb in the five inbred strains used for Celera shotgun sequencing
All 6 SNP deserts overlap with WI SNP deserts conserved across all WI strains
  0.21% of WI B6/SvJ SNPs are in our SNP deserts
  0.97% of all WI SNPs are in our SNP deserts
5 out of the 6 deserts have at least one end as part of large haplotype blocks
SNP deserts are not genetically homogeneous
  There are STRP polymorphisms
  There are indel polymorphisms
16
Validation of a SNP Desert
Figure: genotype patterns by strain group (other laboratory inbred strains; B6, AKRJ; Skive; Czech).
5 STRPs and 1 SNP in a 15 kb SNP desert in all laboratory inbred strains
The STRPs and the 1 SNP have the same variation pattern as the neighboring regions with high SNP density among the laboratory inbred strains
An additional 120 SNPs were discovered between the laboratory inbred strains and feral inbred strains
17
A Gene-Coding Region with Varying SNP Density
Figure: gene structure with WI SNPs and Celera SNPs; labeled regions e3, e4, e12, down and UTR3; SNPs marked "Mis-sense??" and "silent".
mRNA sequence is MGC clone: from mammary tissues metastasized to lung
The 10 kb region is included in a 77 kb haplotype block with 44 SNPs
Variations in the mRNA sequence do not overlap with WI and Celera SNPs
>=2 haplotypes in the regions??
18
Results of Experimental Validation
>down
hap s1;129x1;AJ;BALB;C3HHe;DBA
hap AKRJ;C57BL
hap Czech;Skive
>UTR3 /num_SNP=25 /num_strain=10 /num_hap=4
hap s1;129x1;AJ;BALB;C3HHe;DBA2J;
hap N AKRJ;
hap Czech;Skive;
hap C57BL;
>NM_145481_e12 /num_SNP=7 /num_strain=10 /num_hap=4
hap s1;129x1;AJ;BALB;
hap AKRJ;
hap C3HHe;C57BL;DBA2J;
hap Czech;Skive;
>e4
hap1 0 Others
hap2 1 Czech;Skive
Mis-sense SNP does not validate
Silent SNP validates
19
Mouse cSNPs
Synonymous: 185
Non-synonymous: 100
20
Haplotype Diversity in 43 Target Regions Assayed by 94 Amplicons
21
Conclusions
We have compiled an accurate, multi-strain, high-resolution haplotype map for mouse chromosome 16
We have discovered three distinctive genetic variation patterns in laboratory inbred mice: SNP deserts, large blocks and block breakers
Large haplotype blocks may consist of regions with varying SNP density
Selection in inbreeding may have an effect on SNP distribution in protein-coding regions as well as on SNP rate in gene-coding regions
Our method is scalable for whole-genome analysis
22
Acknowledgement
Laboratory of Population Genetics
Ken Buetow, Kent Hunter, Michael Gandolph, Bill Rowe, Michael Edmonson, Jenny Kelly
University of Wisconsin
Rob Williams