Zoology 2005 Part 2 Richard Mott

Inbred Mouse Strain Haplotype Structure
When the genomes of a pair of inbred strains are compared,
– we find a mosaic of segments of identity and difference (Wade et al, Nature 2002).
– a QTL segregating between the strains must lie in a region of sequence difference.
What happens when we compare more than two strains simultaneously?

No Simple Haplotype Block Mosaic (Yalcin et al, PNAS 2004)

…But a Tree Mosaic

In-silico Mapping
Simple idea:
– Collect phenotypes across a set of inbred strains
– Genotype the strains (ONCE)
– Look for phenotype-genotype correlation
– Works well for simple Mendelian traits (eg coat colour)
– Suggested as a panacea for QTL mapping
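
The phenotype-genotype correlation step can be sketched as below. This is only an illustration under stated assumptions: one mean phenotype per strain, diallelic markers coded 0/1, and invented toy values. The function names `pearson_r` and `in_silico_scan` are hypothetical and do not come from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant phenotype
    return sxy / sqrt(sxx * syy)

def in_silico_scan(phenotypes, genotypes):
    """Rank markers by |r| between strain phenotype and strain genotype."""
    scores = {m: pearson_r(g, phenotypes) for m, g in genotypes.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Invented toy data: 6 strains (one phenotype value each) and 3 markers.
phenotypes = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
genotypes = {
    "m1": [0, 0, 0, 1, 1, 1],  # splits the strains exactly as the phenotype does
    "m2": [0, 1, 0, 1, 0, 1],  # weakly correlated with the phenotype
    "m3": [1, 1, 1, 1, 1, 1],  # monomorphic, carries no information
}

ranked = in_silico_scan(phenotypes, genotypes)
best_marker, best_r = ranked[0]
print(best_marker, round(best_r, 3))
```

With so few strains, many markers can share the same strain distribution pattern by chance, which is one reason the approach needs many strains for complex traits.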

In-silico Mapping Problems
– Less well-suited for complex traits
– The number of strains required grows quickly with the complexity of the trait. It has been suggested that at least 100 strains are required, possibly more if epistasis is present
– Requires high-density genotype/sequence data to ensure identity-by-state = identity-by-descent
– May be very useful for the dissection of a QTL previously identified in an F2 cross (look for patterns of sequence difference)

Recombinant Inbred Lines
– Panels of inbred lines descended from pairs of inbred strains
– Genomes are inbred mosaics of the founders
– Lines need only be genotyped once
– Similar to in-silico mapping except
  – identity-by-descent = identity-by-state
  – coarser recombination structure
  – lower resolution mapping?

BXD chromosome 4

Testing if a variant is functional without genotyping it (Yalcin et al, Genetics 2005)
Requirements:
– A Heterogeneous Stock, genotyped at a skeleton of markers
– The genome sequences of the progenitor strains
– A statistical test

Merge Analysis
– Each polymorphism groups together the founders according to their alleles.
– If the polymorphism is functional, then a model in which the phenotypic strain effects are estimated after merging the strains together should be as good as a model where each strain can have an independent effect.
– Compare the fit of “merged” and “unmerged” genetic models to test if the variant is functional.
– If the fit of the merged model is poor then that variant can be eliminated.
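
The comparison of merged and unmerged models can be sketched as a nested-model F statistic. This is a simplified illustration, not the published implementation: it assumes the founder strain carried by each animal at the locus is known with certainty (in an HS this would be a probabilistic estimate), and the function names and toy data are invented.

```python
from collections import defaultdict

def within_group_rss(values, labels):
    """Residual sum of squares when each label gets its own mean."""
    groups = defaultdict(list)
    for v, lab in zip(values, labels):
        groups[lab].append(v)
    rss = 0.0
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        rss += sum((v - mean) ** 2 for v in vals)
    return rss, len(groups)

def merge_f_statistic(phenotype, founder, allele_of_founder):
    """F statistic comparing the merged (per-allele) model against the
    unmerged (per-founder) model. A large F means merging fits poorly.
    Assumes the variant merges at least two founders into one group."""
    merged_labels = [allele_of_founder[f] for f in founder]
    rss_unmerged, k_unmerged = within_group_rss(phenotype, founder)
    rss_merged, k_merged = within_group_rss(phenotype, merged_labels)
    n = len(phenotype)
    numerator = (rss_merged - rss_unmerged) / (k_unmerged - k_merged)
    denominator = rss_unmerged / (n - k_unmerged)
    return numerator / denominator

# Invented toy data: 4 founders, 2 animals each; A and B are low, C and D high.
founder = ["A", "A", "B", "B", "C", "C", "D", "D"]
phenotype = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

# Variant 1 groups {A, B} against {C, D}: consistent with the strain effects.
f_consistent = merge_f_statistic(phenotype, founder, {"A": 0, "B": 0, "C": 1, "D": 1})
# Variant 2 groups {A, C} against {B, D}: inconsistent, so it can be eliminated.
f_inconsistent = merge_f_statistic(phenotype, founder, {"A": 0, "C": 0, "B": 1, "D": 1})
print(f_consistent, f_inconsistent)
```

A small F leaves the variant as a candidate; it does not prove the variant is functional, since other variants with the same strain distribution pattern fit equally well.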

Merge Analysis

How can we show a gene under a QTL peak affects the trait?
– Genetic mapping identifies functional variants, not genes
– The variant could be a control element affecting some other gene

Quantitative Complementation
[figure sequence; labels: KO, wt, High, Low]

Using Functional Information to Confirm Genes
Further experiments
– further bioinformatics, eg networks, functional annotation (GO, KEGG)
– candidate gene sequencing
– gene expression analyses (eQTL) of founder strains, HS

Mouse/human sequence comparison

Enhancer reporter assays
[figure: two constructs, each labelled luciferase reporter, promoter, enhancer]

Enhancer elements affect promoter expression

Large-Scale Genetic Mapping Using a Heterogeneous Stock
Multiple phenotypes collected in parallel

Predictions (from simulation of an HS population)
In a population of 1,000 HS animals:
– Genome-wide power to detect 5% QTL ~ 0.92
– Resolution < 2 Mb

Study design
– 2,000 mice
– 15,000 diallelic markers
– More than 100 phenotypes
  – each mouse subject to a battery of tests spread over weeks 5-9 of the animal’s life
  – more (post-mortem) phenotypes being added

Phenotypes

Covariates
For each phenotype, we recorded covariates, eg,
– experimenter
– time of day
– apparatus (eg, Shock Chamber 3)

Data collection
– All animals microchipped
– Automated data checking, processing and uploading
– All data uploaded into the Integrated Genotyping System (IGS) database

Genotypes from Illumina
– Genotyped and phenotyped 2,000 offspring
– Genotyped 300 parents
– Pedigree analysis shows genotyping was 99.99% accurate
– 11,558 markers polymorphic in HS

QTL mapping
Models
– HAPPY and single marker association
Fitting framework
– Linear regression of (transformed) phenotypes
– Survival analysis for latency data
– Logit-based models for categorical data
Significant covariates incorporated into the null model, eg
Startle ~ TestChamber + BodyWeight + Year + Age + Hour + Gender
Additive model = Null + additive genetic info for locus
Full model = Null + full genetic info for locus
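
The comparison of a null (covariates only) model with an additive genetic model can be sketched as a partial F statistic from two least-squares fits. The sketch makes several assumptions: a single numeric covariate stands in for the full covariate set, the additive genetic term is a single 0/1/2 allele count as in single marker association (HAPPY itself works with founder haplotype probabilities), and all data are invented. The helpers `ols_rss` and `partial_f` are hypothetical names.

```python
def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns.
    Solves the normal equations by Gaussian elimination with pivoting;
    assumes the columns are not collinear."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)] for r in range(p)]
    c = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        c[k], c[piv] = c[piv], c[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for s in range(k, p):
                A[r][s] -= f * A[k][s]
            c[r] -= f * c[k]
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (c[k] - sum(A[k][s] * b[s] for s in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))

def partial_f(y, null_columns, extra_columns):
    """Partial F statistic for adding extra_columns to the null model."""
    n = len(y)
    rss_null = ols_rss(null_columns, y)
    rss_alt = ols_rss(null_columns + extra_columns, y)
    df_extra = len(extra_columns)
    df_resid = n - (1 + len(null_columns) + df_extra)
    return ((rss_null - rss_alt) / df_extra) / (rss_alt / df_resid)

# Invented toy data: phenotype depends on body weight and on allele count.
body_weight = [20, 22, 24, 26, 21, 23, 25, 27]
allele_count = [0, 0, 1, 1, 1, 2, 2, 2]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, 0.1, -0.1]
startle = [1 + 0.2 * w + 1.5 * g + e
           for w, g, e in zip(body_weight, allele_count, noise)]

f_additive = partial_f(startle, [body_weight], [allele_count])
print(f_additive)
```

Categorical covariates such as test chamber would need dummy coding before being passed in as columns, and a p-value would require the F distribution, which is outside the standard library.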

QTL mapping
Significance tests
– partial F-test (linear models), Chi-square / LRT (others)
Significance thresholds
– different for each phenotype
– have to take into account LD
– fit distribution to scores of permuted data
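
A permutation threshold that respects LD can be sketched as follows. Note the substitution: the slide describes fitting a distribution to the permuted scores, whereas this sketch simply takes an empirical quantile of the genome-wide maximum. The score (squared correlation), the function names, and the data are all assumptions made for illustration.

```python
import math
import random

def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def genome_scan_max(phenotype, markers):
    """Best score over all markers for one (possibly permuted) phenotype."""
    return max(r_squared(g, phenotype) for g in markers)

def permutation_threshold(phenotype, markers, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximum score under permutation.
    Genotypes stay fixed, so LD between markers is preserved in every scan."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_maxima.append(genome_scan_max(shuffled, markers))
    null_maxima.sort()
    index = max(0, math.ceil((1 - alpha) * n_perm) - 1)
    return null_maxima[index], null_maxima

# Invented toy data: 12 animals, 5 markers; neighbouring markers are correlated.
markers = [
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    [0, 0, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2],
    [2, 0, 1, 0, 2, 1, 0, 2, 1, 0, 1, 2],
    [2, 0, 1, 0, 2, 1, 0, 2, 1, 1, 1, 2],
]
phenotype = [float(g) for g in markers[0]]  # phenotype tracks the first marker

threshold, null_maxima = permutation_threshold(phenotype, markers)
observed = genome_scan_max(phenotype, markers)
print(observed, threshold)
```

Shuffling phenotypes freely ignores family structure, so in a population containing sibships this simple scheme is only a first approximation.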

E-values
– We set score thresholds using ideas from sequence databank search programs such as BLAST.
– The E-value of a threshold is the number of times you would expect to see a false positive exceed the threshold in a genome scan.
– Applying the Bonferroni correction to the number of marker intervals is too severe because LD makes neighbouring scores correlated.
– Permutation analyses indicate that the most significant score expected by chance amongst all ~12000 marker intervals behaves as if it were drawn from M ~ 4000 independent tests.
– Hence a nominal P-value of p corresponds to an E-value of pM.
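
The arithmetic behind E = pM can be worked through in a few lines, using M ~ 4000 and ~12000 marker intervals as quoted above. The function names are hypothetical, and the conversion from E to the probability of at least one false positive uses the Poisson approximation from BLAST-style statistics, which is an assumption added here rather than something stated on the slide.

```python
import math

M_EFFECTIVE = 4000    # effective number of independent tests (from permutations)
N_INTERVALS = 12000   # approximate number of marker intervals

def e_value(nominal_p, m=M_EFFECTIVE):
    """Expected number of false positives exceeding the threshold per scan."""
    return nominal_p * m

def nominal_p_threshold(target_e, m=M_EFFECTIVE):
    """Nominal p-value cut-off that gives the requested E-value."""
    return target_e / m

def prob_at_least_one(e):
    """Assumed Poisson approximation: P(at least one false positive)."""
    return 1.0 - math.exp(-e)

p_cut = nominal_p_threshold(0.01)        # cut-off for E = 0.01
bonferroni_cut = 0.01 / N_INTERVALS      # stricter cut-off that ignores LD
print(p_cut, bonferroni_cut, prob_at_least_one(0.01))
```

For small E the probability of at least one false positive is close to E itself, so an E-value threshold behaves much like a genome-wide corrected p-value threshold.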

Problems
Our population includes both siblings and unrelated animals. We have ignored this distinction, and are therefore:
1. Confounding environmental family effects with genetic family effects
2. Allowing ghost peaks due to linkage disequilibrium between markers within a sibship
Our solution so far:
(1) Investigate the effect of environmental factors and build covariates into the model
(2) Identify peaks by a multiple conditional fit

Multiple Peak Fitting: Forward Selection
For each phenotype’s genome scan:
– Make a list of all peaks > genome-wide threshold T
– Fit the most significant peak, P1
– Go through the list of peaks, refitting each one conditional upon the most significant peak
– Add the most significant remaining peak, P2
– Continue refitting the remaining peaks P3, P4 … and adding them into the model until the most significant remaining peak < T
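
The forward selection loop can be sketched as below. It is a minimal illustration under assumptions: each candidate peak is represented by a single 0/1/2 genotype column, the score is a one-degree-of-freedom partial F statistic from least squares, and the threshold and data are invented toy values. `ols_rss`, `conditional_f` and `forward_select` are hypothetical names.

```python
def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns
    (normal equations, Gaussian elimination; assumes no exact collinearity)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)] for r in range(p)]
    c = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        c[k], c[piv] = c[piv], c[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for s in range(k, p):
                A[r][s] -= f * A[k][s]
            c[r] -= f * c[k]
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (c[k] - sum(A[k][s] * b[s] for s in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))

def conditional_f(y, selected_cols, candidate_col):
    """F statistic for adding one candidate peak to the peaks already fitted."""
    rss0 = ols_rss(selected_cols, y)
    rss1 = ols_rss(selected_cols + [candidate_col], y)
    df_resid = len(y) - (len(selected_cols) + 2)
    return (rss0 - rss1) / (rss1 / df_resid)

def forward_select(y, peaks, threshold):
    """Add the best remaining peak until none exceeds the threshold."""
    selected, remaining = [], dict(peaks)
    while remaining:
        cols = [peaks[name] for name in selected]
        scores = {name: conditional_f(y, cols, g) for name, g in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        selected.append(best)
        del remaining[best]
    return selected

# Invented toy data: two real peaks plus a "ghost" in LD with peakA.
peaks = {
    "peakA": [0, 0, 0, 1, 1, 1, 2, 2, 2, 1],
    "ghost": [0, 0, 0, 1, 1, 1, 2, 2, 2, 2],
    "peakB": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],
}
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
y = [3 * a + 2 * b + e for a, b, e in zip(peaks["peakA"], peaks["peakB"], noise)]

selected = forward_select(y, peaks, threshold=3.0)  # threshold is arbitrary here
print(selected)
```

In this toy case the ghost passes the threshold on its own but drops well below it once peakA is in the model, which is the behaviour the conditional fit is meant to expose.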

Peaks found by multiple conditional fit
[figure: multiple conditional fit (using additive model only); axis label: number of phenotypes]

Database for scans

[figure labels: Additive model, Full model, E-value thresholds, additive only]
E<0.01 is about the same as genome-wide corrected p<0.01.

Database for scans zoom in

Covariates

QTL Mapping: Validation
– Coat colour
– Detection of known QTLs

Coat colour genes

A known QTL: HDL
[figure panels: Wang et al, 2003; HS mapping]

High Resolution QTLs
Columns: Phenotype, Chrom, Mb, Method, Ref, HS position
– Cue freezing: Chrom 3, 70-83 Mb, Genome tagged mice, Liu
– Obesity: Congenic, Demant
– week body mass: Progeny testing, Christians
– Emotionality: HS, Mott
– Emotionality: HS, Mott
– Emotionality: HS, Mott
– Emotionality: HS, Turri

New QTLs: two examples
– Freeze.During.Tone (from Cue Conditioning behavioural experiment): 1 peak
– % of CD4 in CD3 cells (immunology assay): 10 peaks

Cue Conditioning
Freezing in response to a conditioned stimulus
[figure labels: TONE, Freezing]

Cue Conditioning
Freeze.During.Tone: huge effect, small number of genes
cntn1: Contactin precursor (neural cell surface protein), chr15

% CD4 cells in CD3 cells: huge effect but lots of genes

% CD4 in CD3 (under peak)

All QTLs
– 608 peaks
– Median interval is 938,936 bp, or about 9 genes per peak

Summary
The HS project so far has
– phenotyped 2,500 HS mice
– genotyped 2,300 mice
– mapped over 140 phenotypes
– identified more than 600 potential QTLs

Confirming gene candidates
Increased mapping resolution through
– include epistasis
– multivariate
– G x E
– pleiotropy
– sex effects
Further experiments
– further bioinformatics, eg networks, functional annotation (GO, KEGG)
– candidate gene sequencing
– gene expression analyses (eQTL) of founder strains, HS

Confirming gene candidates: epistasis
Single marker association of pairwise epistasis

Work of many hands
Carmen Arboleda-Hitas, Amarjit Bhomra, Peter Burns, Richard Copley, Stuart Davidson, Simon Fiddy, Jonathan Flint, Polinka Hernandez, Sue Miller, Richard Mott, Chela Nunez, Gemma Peachey, Sagiv Shifman, Leah Solberg, Amy Taylor, Martin Taylor, Jordana Tzenova-Bell, William Valdar, Binnaz Yalcin, Dave Bannerman, Shoumo Bhattacharya, Bill Cookson, Rob Deacon, Dominique Gauguier, Doug Higgs, Tertius Hough, Paul Klenerman, Nick Rawlins
Project funded by The Wellcome Trust, UK