Published by Blake Armstrong. Modified over 9 years ago.
1
Zoology 2005 Part 2
Richard Mott
2
Inbred Mouse Strain Haplotype Structure
When the genomes of a pair of inbred strains are compared,
–we find a mosaic of segments of identity and difference (Wade et al, Nature 2002).
–A QTL segregating between the strains must lie in a region of sequence difference.
What happens when we compare more than two strains simultaneously?
3
No Simple Haplotype Block Mosaic
(Yalcin et al, PNAS 2004)
4
…But a Tree Mosaic
5
In-silico Mapping
Simple idea:
–Collect phenotypes across a set of inbred strains
–Genotype the strains (ONCE)
–Look for phenotype-genotype correlation
–Works well for simple Mendelian traits (eg coat colour)
–Suggested as a panacea for QTL mapping
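The "phenotype-genotype correlation" step can be illustrated with a minimal sketch. It assumes one diallelic marker and one mean phenotype per strain, and uses a Welch-style t statistic as the measure of association; the slide does not name a specific statistic, so that choice, the function name `marker_association` and the strain labels are all hypothetical.

```python
from statistics import mean, variance


def marker_association(phenotypes, alleles):
    """Welch-style t statistic comparing strain phenotypes between two alleles.

    phenotypes: dict strain -> mean phenotype for that strain (illustrative values)
    alleles:    dict strain -> allele label at a single marker (two alleles assumed)
    """
    groups = {}
    for strain, value in phenotypes.items():
        groups.setdefault(alleles[strain], []).append(value)
    if len(groups) != 2:
        raise ValueError("expected exactly two alleles at the marker")
    a, b = groups.values()
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two strains per allele")
    # Standard error of the difference in means, allowing unequal variances.
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        raise ValueError("no within-allele variation; statistic undefined")
    return (mean(a) - mean(b)) / se


# Hypothetical strains S1..S4 and made-up phenotype values.
example_phenotypes = {"S1": 10.0, "S2": 12.0, "S3": 20.0, "S4": 22.0}
example_alleles = {"S1": "x", "S2": "x", "S3": "y", "S4": "y"}
t_stat = marker_association(example_phenotypes, example_alleles)
```

Scanning the genome would repeat this at every marker. With only a handful of strains the statistic is very noisy, which is one way to see why complex traits need many strains.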
6
In-silico Mapping Problems
Less well-suited for complex traits
Number of strains required grows quickly with the complexity of the trait.
At least 100 strains have been suggested as necessary, possibly more if epistasis is present
Requires high-density genotype/sequence data to ensure identity-by-state = identity-by-descent
May be very useful for the dissection of a QTL previously identified in an F2 cross (look for patterns of sequence difference)
7
Recombinant Inbred Lines
Panels of inbred lines descended from pairs of inbred strains
Genomes are inbred mosaics of the founders
Lines need only be genotyped once
Similar to in-silico mapping except
–identity-by-descent = identity-by-state
–Coarser recombination structure
–Lower resolution mapping?
8
BXD chromosome 4
9
Testing if a variant is functional without genotyping it (Yalcin et al, Genetics 2005)
Requirements:
–A Heterogeneous Stock, genotyped at a skeleton of markers
–The genome sequences of the progenitor strains
–A statistical test
10
Merge Analysis
Each polymorphism groups the founders according to their alleles
If the polymorphism is functional, then a model in which the phenotypic strain effects are estimated after merging strains that share an allele should fit as well as a model where each strain can have an independent effect.
Compare the fit of “merged” and “unmerged” genetic models to test if the variant is functional.
If the fit of the merged model is poor then that variant can be eliminated.
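A minimal sketch of the merged-versus-unmerged comparison, under a strong simplifying assumption: each animal's founder strain at the locus is known exactly. In a real Heterogeneous Stock, founder ancestry is only inferred probabilistically. The function names and the use of a plain F statistic on group means are illustrative, not the published procedure.

```python
from collections import defaultdict


def rss_by_group(values, labels):
    """Residual sum of squares after fitting one mean per group label."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    rss = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        rss += sum((v - group_mean) ** 2 for v in members)
    return rss, len(groups)


def merge_f_statistic(values, strains, allele_of_strain):
    """F statistic for the extra fit gained by NOT merging strains by allele.

    values:           phenotype per animal
    strains:          founder strain per animal at the locus (assumed known)
    allele_of_strain: dict founder strain -> allele at the tested variant
    A large F means the merged model fits poorly, so the variant is unlikely
    to be the functional one.
    """
    merged_labels = [allele_of_strain[s] for s in strains]
    rss_unmerged, k_unmerged = rss_by_group(values, strains)
    rss_merged, k_merged = rss_by_group(values, merged_labels)
    n = len(values)
    if k_unmerged <= k_merged or n <= k_unmerged or rss_unmerged == 0:
        raise ValueError("models are not comparable with these data")
    numerator = (rss_merged - rss_unmerged) / (k_unmerged - k_merged)
    denominator = rss_unmerged / (n - k_unmerged)
    return numerator / denominator
```

The statistic would be referred to an F distribution with (k_unmerged - k_merged, n - k_unmerged) degrees of freedom. A variant whose allele pattern matches the strain effects gives a small F and survives; a mismatched pattern gives a large F and is eliminated.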
11
Merge Analysis
13
How can we show a gene under a QTL peak affects the trait?
Genetic Mapping identifies Functional Variants, not Genes
The variant could be a control element affecting some other gene
14
Quantitative Complementation
[Figure, built up over slides 14 to 18. Labels: KO, wt, High, Low; axis ticks 0, 50, 100; marked value 30]
19
Using Functional Information to Confirm Genes
Further experiments
–further bioinformatics, eg networks, functional annotation (GO, KEGG)
–candidate gene sequencing
–gene expression analyses (eQTL) of founder strains, HS
20
Mouse/human sequence comparison
21
Enhancer reporter assays
[Figure: two constructs, each labelled enhancer, promoter, luciferase reporter]
22
Enhancer elements affect promoter expression
23
Large-Scale Genetic Mapping Using a Heterogeneous Stock
Multiple Phenotypes collected in parallel
24
Predictions (from simulation of an HS population)
In a population of 1,000 HS animals:
–Genome-wide power to detect a QTL explaining 5% of phenotypic variance ~ 0.92
–Resolution < 2 Mb
25
Study design
2,000 mice
15,000 diallelic markers
More than 100 phenotypes
–each mouse subject to a battery of tests spread over weeks 5-9 of the animal’s life
–more (post-mortem) phenotypes being added
26
Phenotypes
27
Covariates
For each phenotype, we recorded covariates, eg,
–experimenter
–time of day
–apparatus (eg, Shock Chamber 3)
28
Data collection
All animals microchipped
Automated data checking, processing and uploading
All data uploaded into the Integrated Genotyping System (IGS) database
29
Genotypes from Illumina
Genotyped and phenotyped 2,000 offspring
Genotyped 300 parents
Pedigree analysis shows genotyping was 99.99% accurate
11,558 markers polymorphic in HS
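The slide does not say how the pedigree analysis was carried out. One common way to use parent and offspring genotypes as an accuracy check is to count Mendelian inconsistencies, sketched below for a single marker. This is an assumption about the general approach, not a description of the project's pipeline; the function names and the genotype encoding (a pair of allele labels) are hypothetical.

```python
def mendel_consistent(child, mother, father):
    """True if the child's two alleles can be drawn one from each parent.

    Each genotype is a pair of allele labels, eg ("A", "B").
    """
    first, second = child
    return (first in mother and second in father) or (
        second in mother and first in father
    )


def inconsistency_rate(trios):
    """Fraction of (child, mother, father) trios that violate Mendelian inheritance."""
    if not trios:
        raise ValueError("no trios supplied")
    consistent = sum(mendel_consistent(c, m, f) for c, m, f in trios)
    return 1 - consistent / len(trios)
```

A low inconsistency rate across many markers and families is evidence of accurate genotyping, although some errors (for example a heterozygote miscalled in a family where either call is compatible with the parents) are invisible to this check.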
30
QTL mapping
Models
–HAPPY and single marker association
Fitting framework
–Linear regression of (transformed) phenotypes
–Survival analysis for latency data
–Logit-based models for categorical data
Significant covariates incorporated into the null model, eg
Null: Startle ~ TestChamber + BodyWeight + Year + Age + Hour + Gender
Additive = Null + additive genetic info for locus
Full = Null + full genetic info for locus
31
QTL mapping
Significance tests
–partial F-test (linear models), Chi-square / LRT (others)
Significance thresholds
–different for each phenotype
–have to take into account LD
–fit distribution to scores of permuted data
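A hedged sketch of a permutation-based threshold. Note the swap: the slide describes fitting a distribution to scores of permuted data, whereas this example takes a simpler empirical quantile of the maximum score per permuted scan. The callable `scan` is hypothetical; it stands for whatever genome scan is in use, with genotypes held fixed while phenotypes are shuffled.

```python
import random


def permutation_threshold(phenotypes, scan, n_perm=200, alpha=0.05, seed=0):
    """Empirical genome-wide threshold from maximum scores of permuted scans.

    phenotypes: list of phenotype values, one per animal
    scan:       hypothetical callable taking a phenotype list and returning
                one score per marker interval (genotypes fixed inside it)
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        # Shuffling breaks any phenotype-genotype link but keeps the LD
        # between markers, so correlation among scores is preserved.
        rng.shuffle(shuffled)
        maxima.append(max(scan(shuffled)))
    maxima.sort()
    index = min(len(maxima) - 1, int((1 - alpha) * len(maxima)))
    return maxima[index]
```

Because the threshold is built from each phenotype's own permuted scans, it differs between phenotypes and accounts for LD without counting markers. A plain shuffle treats animals as exchangeable, which ignores family structure.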
32
E-values
We set score thresholds using ideas from sequence databank search programs such as BLAST.
The E-value of a threshold is the number of false positives you would expect to exceed the threshold in a genome scan.
Applying the Bonferroni correction to the number of marker intervals is too severe because LD makes neighbouring scores correlated.
Permutation analyses indicate that the most significant random score amongst all ~12000 marker intervals behaves as if it were drawn from M~4000 independent tests.
Hence a nominal P-value of p corresponds to an E-value of pM.
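The relation E = pM is simple enough to show as a small worked example. M = 4000 is the approximate figure quoted on the slide; the function names are illustrative. The third function uses the standard formula for the chance of at least one false positive among M independent tests, which shows why a small E-value is close to a genome-wide corrected P-value.

```python
def e_value(nominal_p, effective_tests=4000):
    """Expected number of false positives at this nominal P-value (E = p * M)."""
    return nominal_p * effective_tests


def nominal_p_for(e_threshold, effective_tests=4000):
    """Nominal P-value a single test must reach for a given E-value threshold."""
    return e_threshold / effective_tests


def genome_wide_p(nominal_p, effective_tests=4000):
    """Probability of at least one false positive among M independent tests."""
    return 1 - (1 - nominal_p) ** effective_tests


# For E < 0.01 each interval must reach roughly p < 2.5e-6, whereas Bonferroni
# over ~12000 intervals would demand roughly p < 8.3e-7.
p_needed = nominal_p_for(0.01)
```

For small values, 1 - (1 - p)^M is approximately pM, so the E-value and the genome-wide corrected P-value nearly coincide.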
37
Problems
Our population includes both siblings and unrelated animals
We have ignored this distinction
And therefore we are:
1. Confounding environmental family effects with genetic family effects
2. Allowing ghost peaks due to linkage disequilibrium between markers within a sibship
Our solution so far:
(1) Investigating the effect of environmental factors and building covariates into the model
(2) Identifying peaks by a multiple conditional fit
38
Multiple Peak Fitting
Forward Selection
For each phenotype’s genome scan:
–Make list of all peaks > genome-wide threshold T
–Fit most significant peak, P1
–Go through list of peaks, refitting each one conditional upon the most significant peak.
–Add the most significant remaining peak, P2
–Continue refitting remaining peaks P3, P4 … and adding them into the model until the most significant remaining peak < T
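The forward selection steps above can be sketched as a short greedy loop. The callable `conditional_score` is hypothetical: it stands for refitting one peak with the already-selected peaks included in the model and returning its score on the same scale as the threshold T.

```python
def forward_select(candidates, conditional_score, threshold):
    """Greedy forward selection of QTL peaks.

    candidates:        peak identifiers that exceeded the genome-wide threshold
    conditional_score: hypothetical callable (peak, selected) -> score of `peak`
                       refitted conditional on the peaks already in `selected`
    threshold:         genome-wide threshold T
    """
    selected = []
    remaining = list(candidates)
    while remaining:
        # Refit every remaining peak conditional on the current model.
        scores = {peak: conditional_score(peak, selected) for peak in remaining}
        best = max(remaining, key=lambda peak: scores[peak])
        if scores[best] < threshold:
            break  # nothing left is significant once the model is accounted for
        selected.append(best)
        remaining.remove(best)
    return selected
```

A ghost peak caused by LD with a true QTL loses most of its score once the true peak is in the model, so it falls below T and is never added.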
39
Peaks found by multiple conditional fit
[Figure: multiple conditional fit (using additive model only); axis: number of phenotypes]
40
Database for scans
41
E-value thresholds
[Figure labels: Additive model; Full model; additive only]
E<0.01 is about the same as genome-wide corrected p<0.01.
42
Database for scans: zoom in
43
Covariates
44
QTL Mapping: Validation
Coat colour
Detection of known QTLs
45
Coat colour genes
46
A known QTL: HDL
[Figure panels: Wang et al, 2003; HS mapping]
47
High Resolution QTLs
Phenotype          Chrom  Interval (Mb)  Method              Ref              HS position (Mb)
Cue freezing       3      70-83          Genome tagged mice  Liu 2003         71-73
Obesity            2      142-168        Congenic            Demant 2004      150-153
10 week body mass  1      156-160        Progeny testing     Christians 2004  154.5-156
Emotionality       1      143-148        HS                  Mott 2000        143-144.5
Emotionality       10     123-127        HS                  Mott 2000        121.5-122.7
Emotionality       12     54-57          HS                  Mott 2000        55.5-56.5
Emotionality       15     64-77          HS                  Turri 1999       63.5-66
48
New QTLs: two examples
Freeze.During.Tone (from Cue Conditioning behavioural experiment): 1 peak
% of CD4 in CD3 cells (immunology assay): 10 peaks
49
Cue Conditioning
Freezing in response to a conditioned stimulus
[Figure labels: TONE, Freezing]
50
Cue Conditioning
Freeze.During.Tone: huge effect, small number of genes
cntn1: Contactin precursor (Neural cell surface protein)
chr15
51
% CD4 cells in CD3 cells
huge effect but lots of genes
52
% CD4 in CD3 (under peak)
53
All QTLs
608 peaks
Median interval is 938,936 bp, or about 9 genes per peak
54
Summary
The HS project so far has
–phenotyped 2,500 HS mice
–genotyped 2,300 mice
–mapped over 140 phenotypes
–identified more than 600 potential QTLs
55
Confirming gene candidates
Increased mapping resolution through
–including epistasis
–multivariate analysis
–G x E
–pleiotropy
–sex effects
Further experiments
–further bioinformatics, eg networks, functional annotation (GO, KEGG)
–candidate gene sequencing
–gene expression analyses (eQTL) of founder strains, HS
56
Confirming gene candidates: epistasis
Single marker association of pairwise epistasis
57
Work of many hands
Carmen Arboleda-Hitas, Amarjit Bhomra, Peter Burns, Richard Copley, Stuart Davidson, Simon Fiddy, Jonathan Flint, Polinka Hernandez, Sue Miller, Richard Mott, Chela Nunez, Gemma Peachey, Sagiv Shifman, Leah Solberg, Amy Taylor, Martin Taylor, Jordana Tzenova-Bell, William Valdar, Binnaz Yalcin
Dave Bannerman, Shoumo Bhattacharya, Bill Cookson, Rob Deacon, Dominique Gauguier, Doug Higgs, Tertius Hough, Paul Klenerman, Nick Rawlins
Project funded by The Wellcome Trust, UK