Inferring Genetic Architecture of Complex Biological Processes
Brian S. Yandell (1,2), Christina Kendziorski (1,3), Hong Lan (4), Elias Chaibub (1), Alan D. Attie (4)
1 Department of Statistics
2 Department of Horticulture
3 Department of Biostatistics & Medical Informatics
4 Department of Biochemistry
University of Wisconsin-Madison
http://www.stat.wisc.edu/~yandell/statgen
U Alabama Birmingham, 3 November 2004
UAB: Yandell © 2004

[Figure: glucose and insulin (courtesy AD Attie)]

studying diabetes in an F2 mouse model
segregating panel from inbred lines
  B6.ob x BTBR.ob → F1 → F2
  selected mice with ob/ob alleles at leptin gene (Chr 6)
  sacrificed at 14 weeks, tissues preserved
physiological study (Stoehr et al. 2000 Diabetes)
  mapped body weight, insulin, glucose at various ages
gene expression studies
  RT-PCR for a few mRNA on 108 F2 mice liver tissues (Lan et al. 2003 Diabetes; Lan et al. 2003 Genetics)
  Affymetrix microarrays on 60 F2 mice liver tissues
    U47 A & B chips, RMA normalization
    design: selective phenotyping (Jin et al. 2004 Genetics)

The intercross (from K Broman)

mRNA expression as phenotype: interval mapping for SCD1 is complicated

taking a multiple QTL approach
improve statistical power, precision
  increase number of QTL detected
  better estimates of loci: less bias, smaller intervals
improve inference of complex genetic architecture
  patterns and individual elements of epistasis
  appropriate estimates of means, variances, covariances
    asymptotically unbiased, efficient
  assess relative contributions of different QTL
improve estimates of genotypic values
  less bias (more accurate) and smaller variance (more precise)
  mean squared error: $\mathrm{MSE} = (\text{bias})^2 + \text{variance}$
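The identity MSE = (bias)^2 + variance can be checked numerically. The sketch below is an illustration only: the two estimators, the true effect of 1.0, and the noise level are made-up assumptions and do not come from the talk.

```python
import random
import statistics


def mse_decomposition(estimates, truth):
    """Return (bias, variance, mse) for replicate estimates of a known truth."""
    estimates = list(estimates)
    bias = statistics.fmean(estimates) - truth
    # Population variance: spread of the estimates around their own mean.
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return bias, variance, mse


# Hypothetical setup: true QTL effect 1.0, 2000 simulated replicate estimates.
rng = random.Random(1)
truth = 1.0
unbiased = [truth + rng.gauss(0.0, 0.5) for _ in range(2000)]
shrunken = [0.8 * e for e in unbiased]  # biased toward zero, but less variable

for name, est in (("unbiased", unbiased), ("shrunken", shrunken)):
    bias, variance, mse = mse_decomposition(est, truth)
    print(f"{name}: bias={bias:.3f} variance={variance:.3f} mse={mse:.3f}")
```

The shrunken estimator trades some bias for lower variance, which is why the slide judges estimates on both terms rather than on bias alone.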

Pareto diagram of QTL effects
[Figure: major QTL on linkage map, minor QTL (modifiers), polygenes; QTL numbered 1-5]

Bayesian model assessment: number of QTL for SCD1 with R/bim

Bayesian model assessment
genetic architecture: chromosome pattern

trans-acting QTL for SCD1
Bayesian model averaging with R/bim
additive QTL?
dominant QTL?

Bayesian LOD and $h^2$ for SCD1 (summaries from R/bim)

SCD mRNA expression phenotype: 2-D scan for QTL (R/qtl)
[Figure: epistasis LOD peaks; joint LOD peaks]

sub-peaks can be easily overlooked

heterogeneity: many genes affect each trait
[Figure: major QTL on linkage map, minor QTL (modifiers), polygenes; QTL numbered 1-5]

interval mapping basics
observed measurements
  Y = phenotypic trait
  X = markers & linkage map
  i = individual index 1,…,n
missing data
  missing marker data
  Q = QT genotypes
    alleles QQ, Qq, or qq at locus
unknown quantities
  M = genetic architecture
  $\lambda$ = QT locus (or loci)
  $\mu$ = phenotype model parameters
$\mathrm{pr}(Q = q \mid X, \lambda)$ genotype model
  grounded by linkage map, experimental cross
  recombination yields multinomial for Q given X
$f(Y \mid \mu_q)$ phenotype model
  distribution shape (assumed normal here)
  unknown parameters $\mu$ (could be non-parametric)
[Diagram: after Sen & Churchill (2001 Genetics)]

genetic architecture: heterogeneity
heterogeneity: many genes can affect phenotype
  different allelic combinations can yield similar phenotypes
  multiple genes can affect phenotype in subtle ways
  multiple genes can interact (epistasis)
genetic architecture: model for explained genetic variation
  loci (genomic regions) that affect trait
  genotypic effects of loci, including possible epistasis
  $M = \{\lambda_1, \lambda_2, \lambda_3, (\lambda_1, \lambda_2)\}$ = 3 loci with epistasis between two
  $\mu_q = \beta_0 + \beta_{q1} + \beta_{q2} + \beta_{q3} + \beta_{q(1,2)}$ = linear model for genotypic mean
  $\lambda = (\lambda_1, \lambda_2, \lambda_3)$ = loci in model M
  $q = (q_1, q_2, q_3)$ = possible genotype at loci $\lambda$
  $Q = (Q_1, Q_2, Q_3)$ = genotype for each individual at loci $\lambda$

multiple QTL interval mapping
genotypic mean depends on model M
  $\mu_q = \beta_0 + \sum_{\{q \text{ in } M\}} \beta_{qk}$
interval mapping between flanking markers
  $f(Y \mid X, M) = \sum_q f(Y \mid \mu_q)\, f(Q = q \mid X, \lambda)$
model selection
  choice of distribution: f is normal
  sample many possible architectures
  compare based on Bayes factors (BIC)
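The interval-mapping likelihood is a mixture: each individual's phenotype density is averaged over the unobserved QTL genotypes, weighted by their probabilities given the markers. A minimal sketch of that sum for a single locus follows, assuming a normal phenotype with a known standard deviation. The toy data, the fixed genotypic means, and the function names are hypothetical; a real analysis (for example with R/qtl or R/bim) would derive the genotype probabilities from the linkage map and estimate or sample the means.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(y, geno_probs, means, sd):
    """Sum over individuals of log sum_q f(y_i | mu_q) * pr(Q_i = q | X_i)."""
    total = 0.0
    for yi, probs in zip(y, geno_probs):
        total += math.log(sum(p * normal_pdf(yi, means[q], sd) for q, p in probs.items()))
    return total


def mostly(genotype):
    """Hypothetical genotype probabilities: markers favor one genotype at 0.9."""
    return {q: (0.9 if q == genotype else 0.05) for q in ("qq", "Qq", "QQ")}


# Toy data (assumed): six F2 individuals and one putative locus.
y = [1.0, 2.1, 2.9, 1.2, 3.1, 2.0]
geno_probs = [mostly(g) for g in ("qq", "Qq", "QQ", "qq", "QQ", "Qq")]

qtl_means = {"qq": 1.0, "Qq": 2.0, "QQ": 3.0}       # genotypic means under a QTL
grand_mean = sum(y) / len(y)
null_means = {q: grand_mean for q in qtl_means}     # no QTL: one common mean

# LOD score: log10 likelihood ratio of the QTL model against the no-QTL model.
lod = (mixture_loglik(y, geno_probs, qtl_means, 0.5)
       - mixture_loglik(y, geno_probs, null_means, 0.5)) / math.log(10)
print(f"LOD = {lod:.2f}")
```

When all genotypic means are equal, the mixture collapses to an ordinary normal likelihood, which is why the no-QTL model can reuse the same function.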

2M observations
30,000 traits
60 mice

modern high throughput biology
measuring the molecular dogma of biology
  DNA → RNA → protein → metabolites
  measured one at a time only a few years ago
massive array of measurements on whole systems (“omics”)
  thousands measured per individual (experimental unit)
  all (or most) components of system measured simultaneously
    whole genome of DNA: genes, promoters, etc.
    all expressed RNA in a tissue or cell
    all proteins
    all metabolites
systems biology: focus on network interconnections
  chains of behavior in ecological community
  underlying biochemical pathways
genetics as one experimental tool
  perturb system by creating new experimental cross
  each individual is a unique mosaic

finding heritable traits (from Christina Kendziorski)
reduce 30,000 traits to 300-3,000 heritable traits
probability a trait is heritable
  $\mathrm{pr}(H \mid Y, Q) = \mathrm{pr}(Y \mid Q, H)\, \mathrm{pr}(H \mid Q) / \mathrm{pr}(Y \mid Q)$ (Bayes rule)
  $\mathrm{pr}(Y \mid Q) = \mathrm{pr}(Y \mid Q, H)\, \mathrm{pr}(H \mid Q) + \mathrm{pr}(Y \mid Q, \text{not } H)\, \mathrm{pr}(\text{not } H \mid Q)$
phenotype averaged over genotypic mean $\mu$
  $\mathrm{pr}(Y \mid Q, \text{not } H) = f_0(Y) = \int f(Y \mid \mu)\, \mathrm{pr}(\mu)\, d\mu$ if not H
  $\mathrm{pr}(Y \mid Q, H) = f_1(Y \mid Q) = \prod_q f_0(Y_q)$ if heritable
  $Y_q = \{Y_i \mid Q_i = q\}$ = trait values with genotype Q = q
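The Bayes-rule calculation above can be sketched in a few lines once a concrete form is chosen for f and for the prior on the genotypic mean. The version below assumes a normal phenotype with known standard deviation and a normal prior on the mean, so that f0 has a closed form. That conjugate choice, the default parameter values, and the function names are assumptions made for illustration; they are not the EB arrays model itself.

```python
import math
from collections import defaultdict


def log_f0(values, prior_mean, prior_sd, sd):
    """log of f0: the normal likelihood of `values` integrated over a
    normal prior on their common mean (closed form, known sd)."""
    n = len(values)
    ybar = sum(values) / n
    ss = sum((v - ybar) ** 2 for v in values)
    s2, t2 = sd ** 2, prior_sd ** 2
    return (-0.5 * n * math.log(2.0 * math.pi * s2)
            - 0.5 * math.log(1.0 + n * t2 / s2)
            - ss / (2.0 * s2)
            - n * (ybar - prior_mean) ** 2 / (2.0 * (s2 + n * t2)))


def posterior_heritable(y, genotypes, prior_h=0.1, prior_mean=0.0, prior_sd=1.0, sd=0.5):
    """pr(H | Y, Q) by Bayes rule, with f1 = product over genotypes of f0(Y_q)."""
    groups = defaultdict(list)
    for yi, qi in zip(y, genotypes):
        groups[qi].append(yi)
    log_f1 = sum(log_f0(g, prior_mean, prior_sd, sd) for g in groups.values())
    log_null = log_f0(list(y), prior_mean, prior_sd, sd)
    log_odds = math.log(prior_h / (1.0 - prior_h)) + log_f1 - log_null
    # Numerically stable logistic transform of the posterior log-odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Toy trait (assumed values) whose level tracks genotype closely.
trait = [-1.0, -1.1, -0.9, 0.0, 0.1, -0.1, 1.0, 0.9, 1.1]
geno = ["qq"] * 3 + ["Qq"] * 3 + ["QQ"] * 3
print(posterior_heritable(trait, geno, sd=0.2))
```

If every individual carries the same genotype, f1 and f0 coincide and the posterior falls back to the prior probability, which is a quick sanity check on the formula.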

hierarchical model for expression phenotypes (EB arrays: Christina Kendziorski)
mRNA phenotype models given genotypic mean $\mu_q$
common prior on $\mu_q$ across all mRNA (use empirical Bayes to estimate prior)

expression meta-traits: pleiotropy
reduce 3,000 heritable traits to 3 meta-traits(!)
what are expression meta-traits?
  pleiotropy: a few genes can affect many traits
    transcription factors, regulators
  weighted averages: Z = YW
    principal components, discriminant analysis
infer genetic architecture of meta-traits
  model selection issues are subtle
    missing data, non-linear search
    what is the best criterion for model selection?
  time consuming process
    heavy computation load for many traits
    subjective judgement on what is best
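A meta-trait Z = YW is a weighted average of the trait columns of Y. For the principal-component case, the weights W for the first meta-trait are the leading eigenvector of the trait covariance matrix. The sketch below finds them by power iteration using only the standard library. The two-trait toy matrix and the function names are hypothetical, and power iteration is a stand-in for whatever PCA routine was used in the original analysis.

```python
import math
import statistics


def center_columns(Y):
    """Subtract each trait's mean (rows = individuals, columns = traits)."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in Y]


def first_pc_weights(Y, n_iter=500):
    """Leading eigenvector of the sample covariance matrix, by power iteration."""
    Yc = center_columns(Y)
    n, p = len(Yc), len(Yc[0])
    cov = [[sum(r[j] * r[k] for r in Yc) / (n - 1) for k in range(p)] for j in range(p)]
    # Asymmetric start vector; it must not be orthogonal to the leading eigenvector.
    w = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        v = [sum(cov[j][k] * w[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w


def meta_trait(Y, w):
    """Z = YW for centered traits: one meta-trait value per individual."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in center_columns(Y)]


# Toy data (assumed): two correlated mRNA traits measured on five mice.
Y = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
w = first_pc_weights(Y)
Z = meta_trait(Y, w)
print("weights:", w)
print("variance of meta-trait:", statistics.variance(Z))
```

Note that this choice of weights never looks at genotype; a discriminant-analysis meta-trait would instead choose W to separate the genotype groups.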

PC for two correlated mRNA

PC across microarray functional groups
Affy chips on 60 mice
  ~40,000 mRNA
2500+ mRNA show DE (via EB arrays with marker regression)
1500+ organized in 85 functional groups
  2-35 mRNA / group
which are interesting?
  examine PC1, PC2
circle size = # unique mRNA

factor loadings for PC1&2

how well does PC1 do?
lod peaks for 2 QTL at best pair of chr
data (red) vs. 500 permutations (boxplots)
blue bars at 1%, 5%; width proportional to group size
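The 1% and 5% bars come from a permutation null: shuffle the trait values relative to the genotypes, recompute the genome-wide maximum LOD, and take upper quantiles of those maxima. The sketch below shows the idea under simplifying assumptions. It uses single-marker regression on an additive 0/1/2 genotype code rather than the 2-QTL scan on the slide, and the simulated markers, effect size, and function names are all hypothetical.

```python
import math
import random


def marker_lod(y, g):
    """LOD for regressing trait y on one additive marker code g."""
    n = len(y)
    my, mg = sum(y) / n, sum(g) / n
    syy = sum((a - my) ** 2 for a in y)
    sgg = sum((b - mg) ** 2 for b in g)
    syg = sum((a - my) * (b - mg) for a, b in zip(y, g))
    if syy == 0 or sgg == 0:
        return 0.0
    r2 = min(syg * syg / (syy * sgg), 1.0 - 1e-12)
    return -0.5 * n * math.log10(1.0 - r2)


def max_lod(y, markers):
    """Genome-wide maximum LOD over all markers."""
    return max(marker_lod(y, g) for g in markers)


def permutation_threshold(y, markers, n_perm=500, level=0.05, seed=0):
    """Upper (1 - level) quantile of max LOD over shuffled trait values."""
    rng = random.Random(seed)
    y_perm = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any trait-genotype association
        null.append(max_lod(y_perm, markers))
    null.sort()
    idx = min(n_perm - 1, math.ceil((1.0 - level) * n_perm) - 1)
    return null[idx]


# Simulated example (assumed): 60 mice, 20 markers, one marker with a real effect.
rng = random.Random(7)
markers = [[rng.choice([0, 1, 1, 2]) for _ in range(60)] for _ in range(20)]
y = [0.8 * markers[3][i] + rng.gauss(0.0, 1.0) for i in range(60)]

observed = max_lod(y, markers)
thr5 = permutation_threshold(y, markers, n_perm=200, level=0.05)
thr1 = permutation_threshold(y, markers, n_perm=200, level=0.01)
print(f"observed max LOD {observed:.2f}; 5% threshold {thr5:.2f}; 1% threshold {thr1:.2f}")
```

Taking the maximum over markers inside each permutation is what makes the threshold genome-wide rather than per-marker.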

84 PC meta-traits by functional group
focus on 2 interesting groups

red lines: peak for PC meta-trait
black/blue: peaks for mRNA traits
arrows: cis-action?

(portion of) chr 4 region ?

DA meta-traits: separate pleiotropy from environmental correlation
[Figure panels: pleiotropy only; environmental correlation only; both]
Korol et al. (2001)

interaction plots for DA meta-traits
DA for all pairs of markers: separate 9 genotypes based on markers
(a) same locus pair found with PC meta-traits
(b) Chr 2 region interesting from biochemistry (Jessica Byers)
(c) Chr 5 & Chr 9 identified as important for insulin, SCD

comparison of PC and DA meta-traits on 1500+ mRNA traits
genotypes from Chr 4/Chr 15 locus pair (circle=centroid)
PC captures spread without genotype
DA creates best separation by genotype
PC ignores genotype
DA uses genotype
note better spread of circles
correlation of PC and DA meta-traits

relating meta-traits to mRNA traits
[Figure axes: DA meta-trait (standard units); SCD trait (log2 expression)]

DA: a cautionary tale (184 mRNA with |cor| > 0.5; mouse 13 drives heritability)

building graphical models
infer genetic architecture of meta-trait
  $E(Z \mid Q, M) = \mu_q = \beta_0 + \sum_{\{q \text{ in } M\}} \beta_{qk}$
find mRNA traits correlated with meta-trait
  $Z \approx YW$ for modest number of traits Y
extend meta-trait genetic architecture
  M = genetic architecture for Y
  expect subset of QTL to affect each mRNA
  may be additional QTL for some mRNA

posterior for graphical models
posterior for graph given multivariate trait & architecture
  $\mathrm{pr}(G \mid Y, Q, M) = \mathrm{pr}(Y \mid Q, G)\, \mathrm{pr}(G \mid M) / \mathrm{pr}(Y \mid Q)$
  $\mathrm{pr}(G \mid M)$ = prior on valid graphs given architecture
multivariate phenotype averaged over genotypic mean $\mu$
  $\mathrm{pr}(Y \mid Q, G) = f_1(Y \mid Q, G) = \prod_q f_0(Y_q \mid G)$
  $f_0(Y_q \mid G) = \int f(Y_q \mid \mu, G)\, \mathrm{pr}(\mu)\, d\mu$
graphical model G implies correlation structure on Y
genotype mean prior assumed independent across traits
  $\mathrm{pr}(\mu) = \prod_t \mathrm{pr}(\mu_t)$

from graphical models to pathways
build graphical models QTL → RNA1 → RNA2
  class of possible models
  best model = putative biochemical pathway
parallel biochemical investigation
  candidate genes in QTL regions
  laboratory experiments on pathway components

graphical models (with Elias Chaibub)
$f_1(Y \mid Q, G = g) = f_1(Y_1 \mid Q)\, f_1(Y_2 \mid Q, Y_1)$
[Diagram: DNA (D1, D2), RNA (R1, R2; observable), protein (P1, P2); QTL; unobservable meta-trait; cis-action? and trans-action]
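The factorization into one term per node is what lets competing graphs be scored against each other. The sketch below compares the chain QTL → RNA1 → RNA2 with the reversed chain QTL → RNA2 → RNA1 on simulated data. It swaps in maximized Gaussian regression likelihoods for the marginal likelihoods f1 in the talk, and under each chain graph the downstream trait depends on Q only through the upstream trait. The simulated effect sizes and the function names are hypothetical.

```python
import math
import random


def gaussian_loglik(y, x):
    """Maximized Gaussian log-likelihood of the simple regression y ~ x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss = syy - sxy * sxy / sxx  # residual sum of squares
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def chain_loglik(q, first, second):
    """log f(first | Q) + log f(second | first) for the graph QTL -> first -> second."""
    return gaussian_loglik(first, q) + gaussian_loglik(second, first)


# Simulated data (assumed): the true graph is QTL -> RNA1 -> RNA2.
rng = random.Random(2004)
q = [rng.choice([0, 1, 1, 2]) for _ in range(500)]   # additive F2 genotype code
rna1 = [qi + rng.gauss(0.0, 0.3) for qi in q]
rna2 = [r + rng.gauss(0.0, 1.0) for r in rna1]

forward = chain_loglik(q, rna1, rna2)
reverse = chain_loglik(q, rna2, rna1)
print(f"QTL->RNA1->RNA2: {forward:.1f}   QTL->RNA2->RNA1: {reverse:.1f}")
```

Both graphs have the same number of parameters, so the log-likelihoods can be compared directly; graphs of different size would need a penalty such as BIC or a proper marginal likelihood.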