A multi-strain, high-resolution mouse haplotype map reveals three distinctive genetic signatures
Laboratory of Population Genetics



Motivations
- An accurate, high-resolution haplotype map of the mouse genome enables prioritization of QTL candidate genes
- Different haplotype block structures have been reported in different studies:
  - >10 Mb block size in the GNF study (Wiltshire et al., PNAS 2003)
  - 1.0-2.0 Mb block size in the WI study (Wade et al., Nature 2002)
  - 100-150 kb block size in an 8 Mb region of chromosome 19 (Park et al., Genome Research 2003)
- Analysis of a 10 Mb region on chromosome 7 using the Celera mouse SNPs reveals a different genetic variation pattern
- Celera mouse chromosome 16 SNP data are publicly available

Objectives
- Develop an integrated, high-resolution, multi-strain mouse haplotype map
- Compare the haplotype structure derived from high-density SNPs with those derived from low-density markers
- Perform experimental validation in regions of conflict and in regions of interest across 20 inbred strains
- Analyze biological factors that have contributed to the formation of mouse genetic variation patterns

Data Sources
- Chromosome 16 reference sequence: MGSCv3 (NCBI build 30, Feb. 2003)
- SNP data

Construction of Multi-Strain Haplotype Blocks with High-Density SNP Markers
Method
- Greedy algorithm that starts with two haplotypes per block
- Seed: a minimum of two adjacent SNPs with no ambiguity in haplotype assignment
- A singleton SNP that breaks the two-haplotype configuration does not affect block extension
Results
- 2,083 blocks
- 65,068 (95%) Celera SNPs in 5 laboratory inbred strains
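The greedy procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each SNP is encoded as a string with one character per strain (`0` for the B6 allele, `1` for the non-B6 allele, `N` for missing), a "two-haplotype" block is taken to mean that every SNP in the block shares the same strain pattern, and a "singleton" is interpreted as a single discordant SNP flanked by SNPs that fit the block. The function names `compatible` and `greedy_blocks` are hypothetical.

```python
def compatible(snp, pattern):
    """True if the SNP matches the block pattern at every non-missing strain."""
    return all(a == b for a, b in zip(snp, pattern) if a != "N")


def greedy_blocks(snps):
    """Return (start, end) index pairs (end inclusive) of greedy haplotype blocks.

    snps: list of equal-length strings, one per SNP in chromosome order.
    Assumption: a block holds SNPs that all share one strain pattern.
    """
    blocks = []
    i, n = 0, len(snps)
    while i < n - 1:
        seed = snps[i]
        # Seed: two adjacent, unambiguous SNPs with the same strain pattern.
        if "N" in seed or "N" in snps[i + 1] or seed != snps[i + 1]:
            i += 1
            continue
        end = i + 1
        j = end + 1
        while j < n:
            if compatible(snps[j], seed):
                end = j
                j += 1
            elif j + 1 < n and compatible(snps[j + 1], seed):
                # One isolated discordant SNP (assumed "singleton"):
                # skip it and keep extending the block.
                end = j + 1
                j += 2
            else:
                break
        blocks.append((i, end))
        i = end + 1
    return blocks


# Toy input: 8 SNPs typed in 5 strains (illustrative data, not from the study).
example = ["01100", "01100", "01100", "00010", "01100", "11000", "11000", "0N000"]
blocks = greedy_blocks(example)
```

In the toy input, the fourth SNP is an isolated discordant marker, so the first block runs through it; the sixth SNP starts a second pattern and therefore a new block.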

Distribution of Haplotype Block Size

Blocks with Different Size Have Similar SNP Density Distribution

A 2.4-Mb Haplotype Block with Varying SNP Density
[Figure: allele patterns along the 2.4-Mb block for DBA/2J, A/J, 129X1/SvJ, 129S1/SvImJ, and C57BL/6J, with SNP density in #SNP/10kb (bins: 1, 2-5, 6-10, 11-20, >20); legend: B6 allele, non-B6 allele, SNP experimental validation]
- 374 SNPs over 2.4 Mb; average density = 0.156/kb
- 153 of these were in hotspots (red and orange)

A 2.4-Mb Region with High SNP Density but Heterogeneous Variation Pattern (Erosion)
- Ataxin 2 binding protein 1 (nucleic acid binding, RNA binding)
[Figure: allele patterns for 129S1, DBA/2J, A/J, 129X1, and B6, with SNP density in #SNP/10kb (bins: 1, 2-5, 6-10, 11-20, >20); legend: B6 allele, non-B6 allele, missing data]

Details of Haplotype Erosion Across 160 kb
- Location: 5,721,639-5,878,633 bp on chr16
- Blocks:
  - 179 SNPs in 14 blocks with the major pattern
  - 116 SNPs in 19 blocks with the other patterns
  - 49 singleton SNPs
[Figure: haplotype patterns and SNP density for 129S1/SvImJ, DBA/2J, A/J, 129X1/SvJ, and C57BL/6J]

Other Heterogeneous Haplotype Patterns
2) Segmentation
[Figure: haplotype patterns and SNP density for 129S1, DBA/2J, A/J, 129X1, and C57BL/6J]

Other Heterogeneous Haplotype Patterns
3) Segmentation with Erosion
4) Random

Three Major Variation Patterns
- SNP deserts: >1 Mb with <0.5 SNP per 10 kb
- Large blocks: >300 kb “melded” haplotype blocks with consistent variation patterns
- Block breakers: regions with heterogeneous variation patterns
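The SNP desert criterion (>1 Mb with <0.5 SNP per 10 kb) lends itself to a simple window scan. The sketch below is one possible way to apply it, not the procedure used in the study: it assumes 0-based SNP positions, tiles the chromosome in fixed 100 kb windows (an arbitrary choice made here), marks a window as sparse when its density is below the threshold, and reports runs of sparse windows longer than 1 Mb. The function name `find_snp_deserts` and all parameter names are hypothetical.

```python
def find_snp_deserts(positions, chrom_length, window=100_000,
                     max_density_per_10kb=0.5, min_length=1_000_000):
    """Return (start, end) intervals of runs of sparse windows longer than min_length.

    positions: 0-based SNP coordinates, each smaller than chrom_length.
    A window is sparse if it holds fewer SNPs than the density threshold allows.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for p in positions:
        counts[p // window] += 1

    # Threshold expressed as a SNP count per window (5 for a 100 kb window).
    max_count = max_density_per_10kb * window / 10_000

    deserts = []
    start = None
    # Iterate one step past the last window to close a run at the chromosome end.
    for w in range(n_windows + 1):
        sparse = w < n_windows and counts[w] < max_count
        if sparse and start is None:
            start = w
        elif not sparse and start is not None:
            begin = start * window
            end = min(w * window, chrom_length)
            if end - begin > min_length:
                deserts.append((begin, end))
            start = None
    return deserts


# Toy chromosome of 3 Mb: dense SNPs at both ends, one lone SNP in the middle.
toy_positions = (list(range(0, 1_000_000, 10_000))
                 + [1_500_000]
                 + list(range(2_500_000, 3_000_000, 10_000)))
toy_deserts = find_snp_deserts(toy_positions, 3_000_000)
```

Because the threshold is a density rather than a strict absence of SNPs, the lone marker in the middle of the toy chromosome does not split the desert. Window-aligned boundaries are a simplification; a finer window or a boundary-refinement step would be needed for exact coordinates.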

Predictive Power of Haplotype Structures
Test the ability to use the haplotype structure in one study to predict allelic variations in another study
- Our haplotype blocks:
  - 98% accuracy on WI B6/129S1 SNPs that do not overlap with Celera SNPs
  - 92% accuracy on GNF B6/129S1/AJ/DBA haplotypes
- WI B6/129S1 haplotype blocks:
  - 74% accuracy on Celera B6/129S1 genotypes
  - 85% accuracy on GNF B6/129S1 genotypes
- 80% of GNF markers are non-polymorphic across the inbred strains used in Celera and WI shotgun sequencing
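The slide does not spell out how accuracy was scored, so the following Python sketch shows only one plausible reading, stated as an assumption: a block predicts that two strains differ at a marker when they carry different haplotypes in the block containing that marker, and accuracy is the fraction of markers inside blocks for which this prediction matches the alleles observed in the other study. The data layout and the name `prediction_accuracy` are hypothetical.

```python
from bisect import bisect_right


def prediction_accuracy(blocks, observations, strain_a, strain_b):
    """Fraction of markers whose same/different status is predicted correctly.

    blocks: sorted, non-overlapping (start, end, {strain: haplotype_id}) tuples
            from one study (end inclusive).
    observations: (position, allele_a, allele_b) tuples from another study.
    Markers falling outside every block are not scored.
    """
    starts = [block[0] for block in blocks]
    correct = tested = 0
    for position, allele_a, allele_b in observations:
        k = bisect_right(starts, position) - 1  # last block starting at or before position
        if k < 0 or position > blocks[k][1]:
            continue
        haplotypes = blocks[k][2]
        predicted_differ = haplotypes[strain_a] != haplotypes[strain_b]
        observed_differ = allele_a != allele_b
        tested += 1
        correct += predicted_differ == observed_differ
    return correct / tested if tested else float("nan")


# Toy data (illustrative only): two blocks and five markers from a second study.
toy_blocks = [
    (100, 200, {"B6": 0, "129S1": 1}),
    (300, 400, {"B6": 0, "129S1": 0}),
]
toy_markers = [(150, "A", "G"), (160, "C", "C"), (350, "T", "T"),
               (390, "A", "C"), (250, "A", "G")]
toy_accuracy = prediction_accuracy(toy_blocks, toy_markers, "B6", "129S1")
```

Skipping markers that fall between blocks is a design choice made here; counting them as failures would give a stricter score.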

SNP Deserts in Chromosome 16
- 6 SNP deserts of >1 Mb in the five inbred strains used for Celera shotgun sequencing
- All 6 SNP deserts overlap with WI SNP deserts conserved across all WI strains
  - 0.21% of WI B6/SvJ SNPs are in our SNP deserts
  - 0.97% of all WI SNPs are in our SNP deserts
- 5 out of the 6 deserts have at least one end as part of large haplotype blocks
- SNP deserts are not genetically homogeneous
  - There are STRP polymorphisms
  - There are indel polymorphisms

Validation of a SNP Desert
Allele patterns, one row per strain group as shown on the slide:
0000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 001000 11111 01
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000 00000 00
1110100011111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111111111110111111111101 000111 N0101 11
111111111111111101111111NNNNNN111111111111111111111111101111111011111NN011111N1000111111111111111111111111111111111111110 110001 11
Row labels as listed on the slide: Other Lab inbred; B6, AKRJ; Skive; Czech
- 5 STRPs and 1 SNP in a 15kb SNP desert in all laboratory inbred strains
- The STRPs and the 1 SNP have the same variation pattern as the neighboring regions with high SNP density among the laboratory inbred strains
- An additional 120 SNPs were discovered between the laboratory inbred strains and feral inbred strains

A Gene-Coding Region with Varying SNP Density
[Figure: WI SNPs and Celera SNPs along the gene region; annotated segments: e3, e4, e12, UTR3, down; SNP labels: mis-sense (??), silent]
- mRNA sequence is MGC clone: from mammary tissues metastasized to lung
- The 10 kb region is included in a 77 kb haplotype block with 44 SNPs
- Variations in the mRNA sequence do not overlap with WI and Celera SNPs
- >=2 haplotypes in the regions??

Results of Experimental Validation
>down
hap1 011110 129s1;129x1;AJ;BALB;C3HHe;DBA
hap2 000000 AKRJ;C57BL
hap3 110001 Czech;Skive
>UTR3 /num_SNP=25 /num_strain=10 /num_hap=4
hap1 1100011000000100000000010 129s1;129x1;AJ;BALB;C3HHe;DBA2J;
hap4 110001010011100N110001000 AKRJ;
hap3 1111110011100111011110101 Czech;Skive;
hap2 0000000000000000000000000 C57BL;
>NM_145481_e12 /num_SNP=7 /num_strain=10 /num_hap=4
hap1 0000100 129s1;129x1;AJ;BALB;
hap2 0010001 AKRJ;
hap3 0000000 C3HHe;C57BL;DBA2J;
hap4 1101011 Czech;Skive;
>e4
hap1 0 Others
hap2 1 Czech;Skive
- Mis-sense SNP does not validate
- Silent SNP validates
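Records like the ones above can be produced by grouping strains that share an identical allele string. The Python sketch below illustrates that grouping on the "down" amplicon, with the per-strain strings expanded from the haplotype table on this slide. It is an illustration only: the function name `group_haplotypes` is hypothetical, haplotypes are numbered by descending strain count (an assumption that happens to reproduce the numbering for this amplicon), and missing calls (`N`) are treated as ordinary characters rather than resolved.

```python
from collections import defaultdict


def group_haplotypes(strain_alleles):
    """Group strains by identical allele string.

    Returns a list of (label, alleles, strains) tuples, largest group first.
    Ties keep the order in which the allele strings were first seen.
    """
    groups = defaultdict(list)
    for strain, alleles in strain_alleles.items():
        groups[alleles].append(strain)
    ranked = sorted(groups.items(), key=lambda item: -len(item[1]))
    return [(f"hap{i}", alleles, strains)
            for i, (alleles, strains) in enumerate(ranked, start=1)]


# Per-strain allele strings for the "down" amplicon, taken from the slide.
down = {
    "129s1": "011110", "129x1": "011110", "AJ": "011110",
    "BALB": "011110", "C3HHe": "011110", "DBA": "011110",
    "AKRJ": "000000", "C57BL": "000000",
    "Czech": "110001", "Skive": "110001",
}

for label, alleles, strains in group_haplotypes(down):
    print(label, alleles, ";".join(strains))
```

Exact string matching is the simplest possible rule; handling `N` calls, as in the UTR3 record, would require a compatibility test instead of equality.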

Mouse cSNPs
- Synonymous: 185
- Non-synonymous: 100

Haplotype Diversity in 43 Target Regions Assayed by 94 Amplicons

Conclusions
- We have compiled an accurate, multi-strain, high-resolution haplotype map for mouse chromosome 16
- We have discovered three distinctive genetic variation patterns for laboratory inbred mice: SNP deserts, large blocks, and block breakers
- Large haplotype blocks may consist of regions with varying SNP density
- Selection in inbreeding may have an effect on SNP distribution in protein-coding regions as well as SNP rate in gene-coding regions
- Our method is scalable for whole-genome analysis

Acknowledgement
Laboratory of Population Genetics
Ken Buetow
Kent Hunter
Michael Gandolph
Bill Rowe
Michael Edmonson
Jenny Kelly
University of Wisconsin
Rob Williams