4. modern high throughput biology

Slides:



Advertisements
Similar presentations
Selective Breeding & cDNA Microarrays
Advertisements

Linkage and Genetic Mapping
The genetic dissection of complex traits
Methods to read out regulatory functions
Genetic Analysis of Genome-wide Variation in Human Gene Expression Morley M. et al. Nature 2004,430: Yen-Yi Ho.
Inferring Causal Phenotype Networks Elias Chaibub Neto & Brian S. Yandell UW-Madison June 2010 QTL 2: NetworksSeattle SISG: Yandell ©
Computer Simulation in Plant Breeding Introduction Outline Application I: Breeding Method Application II: Gene Mapping Application III: Genetic Modeling.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
BIO341 Meiotic mapping of whole genomes (methods for simultaneously evaluating linkage relationships among large numbers of loci)
Characterizing the role of miRNAs within gene regulatory networks using integrative genomics techniques Min Wenwen
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
Regulation of gene expression in the mammalian eye and its relevance to eye disease Todd Scheetz et al. Presented by John MC Ma.
Experimental Design and Data Structure Supplement to Lecture 8 Fall
Quantitative Genetics. Continuous phenotypic variation within populations- not discrete characters Phenotypic variation due to both genetic and environmental.
Quantitative Genetics
BayesNCSU QTL II: Yandell © Bayesian Interval Mapping 1.what is Bayes? Bayes theorem? Bayesian QTL mapping Markov chain sampling18-25.
Yandell © June Bayesian analysis of microarray traits Arabidopsis Microarray Workshop Brian S. Yandell University of Wisconsin-Madison
1 Before considering selection, it’s important to characterize how gene expression varies within and between species. What evolutionary forces act on gene.
Yandell © 2003JSM Dimension Reduction for Mapping mRNA Abundance as Quantitative Traits Brian S. Yandell University of Wisconsin-Madison
1 Paper Outline Specific Aim Background & Significance Research Description Potential Pitfalls and Alternate Approaches Class Paper: 5-7 pages (with figures)
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Final Remarks Genetical.
Yandell © 2003Wisconsin Symposium III1 QTLs & Microarrays Overview Brian S. Yandell University of Wisconsin-Madison
What is a QTL? Quantitative trait locus (loci) Region of chromosome that contributes to variation in a quantitative trait Generally used to study “complex.
13 October 2004Statistics: Yandell © Inferring Genetic Architecture of Complex Biological Processes Brian S. Yandell 12, Christina Kendziorski 13,
Gene Mapping for Correlated Traits Brian S. Yandell University of Wisconsin-Madison Correlated TraitsUCLA Networks (c)
Bayesian Variable Selection in Semiparametric Regression Modeling with Applications to Genetic Mappping Fei Zou Department of Biostatistics University.
A Multi-stage Approach to Detect Gene-gene Interactions Associated with Multiple Correlated Phenotypes Zhou Xiangdong,Keith Chan, Danhong Zhu Department.
EQTLs.
Quantile-based Permutation Thresholds for QTL Hotspots
Quantile-based Permutation Thresholds for QTL Hotspots
Professor Brian S. Yandell
upstream vs. ORF binding and gene expression?
Model Selection for Multiple QTL
Gene-set analysis Danielle Posthuma & Christiaan de Leeuw
Multiple Traits & Microarrays
Linkage Studies (1a).
Graphical Diagnostics for Multiple QTL Investigation Brian S
Gene Hunting: Design and statistics
Peter John M.Phil, PhD Atta-ur-Rahman School of Applied Biosciences (ASAB) National University of Sciences & Technology (NUST)
Inferring Genetic Architecture of Complex Biological Processes BioPharmaceutical Technology Center Institute (BTCI) Brian S. Yandell University of Wisconsin-Madison.
Genetics and Genomics 5a. Integrative Genomics
Post-GWAS and Mechanistic Analyses
Inferring Genetic Architecture of Complex Biological Processes Brian S
1 Department of Engineering, 2 Department of Mathematics,
Inferring Causal Phenotype Networks
High Throughput Gene Mapping Brian S. Yandell
Genome-wide Associations
1 Department of Engineering, 2 Department of Mathematics,
Linking Genetic Variation to Important Phenotypes
CISC 841 Bioinformatics (Spring 2006) Inference of Biological Networks
1 Department of Engineering, 2 Department of Mathematics,
Inferring Causal Phenotype Networks Driven by Expression Gene Mapping
Seattle SISG: Yandell © 2008
examples in detail days to flower for Brassica napus (plant) (n = 108)
Schedule for the Afternoon
Inferring Genetic Architecture of Complex Biological Processes Brian S
Seattle SISG: Yandell © 2006
In these studies, expression levels are viewed as quantitative traits, and gene expression phenotypes are mapped to particular genomic loci by combining.
Linkage analysis and genetic mapping
Network Inference Chris Holmes Oxford Centre for Gene Function, &,
Genetics.
Understanding Tissue-Specific Gene Regulation
Volume 11, Issue 5, Pages (May 2015)
Principle of Epistasis Analysis
A systems view of genetics in chronic kidney disease
GWAS-eQTL signal colocalisation methods
INTRODUCTION TO MOLECULAR GENETICS
The first two principal components for the islet gene expression data for the 181 microarray probes that map to the chromosome 6 trans-eQTL hotspot with.
A: QTL analysis of genotyping data on chromosome 12 revealed a locus linked to glucose levels in F2 mice at 6 months of age at marker D12Mit231 with a.
Genetic and Epigenetic Regulation of Human lincRNA Gene Expression
Presentation transcript:

4. modern high throughput biology measuring the molecular dogma of biology DNA  RNA  protein  metabolites measured one at a time only a few years ago massive array of measurements on whole systems (“omics”) thousands measured per individual (experimental unit) all (or most) components of system measured simultaneously whole genome of DNA: genes, promoters, etc. all expressed RNA in a tissue or cell all proteins all metabolites systems biology: focus on network interconnections chains of behavior in ecological community underlying biochemical pathways genetics as one experimental tool perturb system by creating new experimental cross each individual is a unique mosaic fri 14 jul 2006 Yandell © 2006

coordinated expression in mouse genome (Schadt et al. 2003) expression pleiotropy in yeast genome (Brem et al. 2002) fri 14 jul 2006 Yandell © 2006

finding heritable traits (from Christina Kendziorski) reduce 30,000 traits to 300-3,000 heritable traits probability a trait is heritable pr(H|Y,Q) = pr(Y|Q,H) pr(H|Q) / pr(Y|Q) Bayes rule pr(Y|Q) = pr(Y|Q,H) pr(H|Q) + pr(Y|Q, not H) pr(not H|Q) phenotype averaged over genotypic mean  pr(Y|Q, not H) = f0(Y) =  f(Y|G ) pr(G) dG if not H pr(Y|Q, H) = f1(Y|Q) = q f0(Yq ) if heritable Yq = {Yi | Qi =q} = trait values with genotype Q=q fri 14 jul 2006 Yandell © 2006

hierarchical model for expression phenotypes (EB arrays: Christina Kendziorski) mRNA phenotype models given genotypic mean Gq common prior on Gq across all mRNA (use empirical Bayes to estimate prior) fri 14 jul 2006 Yandell © 2006

Improving the power for eQTL mapping (Chen et al. 2005) Consider 6000 transcripts that show heritability (as seeds) For each seed, construct list of correlated transcripts ( > 0.7) 1300 lists are enriched for at least one GO term G1 G2 G3 … G860 L1 x L2 L3 L1300 fri 14 jul 2006 Yandell © 2006

184 transcripts from 31 lists in Lipid Metabolism GO category Here is the GPCR story that we showed in the paper. GPCR stands for “G protein coupled receptor”. GPCRs is a protein family of transmembrane receptors. GPCRs are involved in many pathological pathways, and they are the targets of 40 to 50% of the modern medicinal drugs. Using our approach, we identified a set of 174 coordinately regulated GPCR transcripts. This can be seen from the monochromatic columns across the 60 F2 mice. Correlation study showed that they are not highly correlated with each other. But the fact that they are all enriched for the same functional category implies they are co-regulated. Here is a negative control plot, as you can see the coordinate patterns disappeared. Using the linkage information from those GPCR transcripts, we were able to discover a regulatory locus on chr2 at 30cM, between D2mit297 and D2mit9. Of the 174 traits, 50 had a major LOD peak at that locus, and 81 had a secondary peak in the region. If you look carefully, many of these LOD scores are pretty low, but the strong peak agreement suggests that most are influenced by the same regulator. fri 14 jul 2006 Yandell © 2006 Animals

Lipid Metabolism vs. Control Transcripts Here is the GPCR story that we showed in the paper. GPCR stands for “G protein coupled receptor”. GPCRs is a protein family of transmembrane receptors. GPCRs are involved in many pathological pathways, and they are the targets of 40 to 50% of the modern medicinal drugs. Using our approach, we identified a set of 174 coordinately regulated GPCR transcripts. This can be seen from the monochromatic columns across the 60 F2 mice. Correlation study showed that they are not highly correlated with each other. But the fact that they are all enriched for the same functional category implies they are co-regulated. Here is a negative control plot, as you can see the coordinate patterns disappeared. Using the linkage information from those GPCR transcripts, we were able to discover a regulatory locus on chr2 at 30cM, between D2mit297 and D2mit9. Of the 174 traits, 50 had a major LOD peak at that locus, and 81 had a secondary peak in the region. If you look carefully, many of these LOD scores are pretty low, but the strong peak agreement suggests that most are influenced by the same regulator. Transcripts fri 14 jul 2006 Yandell © 2006 Animals

Pairwise correlation of traits from different lipid metabolism lists         -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 0 100 200 300 400 500 600        Here is the GPCR story that we showed in the paper. GPCR stands for “G protein coupled receptor”. GPCRs is a protein family of transmembrane receptors. GPCRs are involved in many pathological pathways, and they are the targets of 40 to 50% of the modern medicinal drugs. Using our approach, we identified a set of 174 coordinately regulated GPCR transcripts. This can be seen from the monochromatic columns across the 60 F2 mice. Correlation study showed that they are not highly correlated with each other. But the fact that they are all enriched for the same functional category implies they are co-regulated. Here is a negative control plot, as you can see the coordinate patterns disappeared. Using the linkage information from those GPCR transcripts, we were able to discover a regulatory locus on chr2 at 30cM, between D2mit297 and D2mit9. Of the 174 traits, 50 had a major LOD peak at that locus, and 81 had a secondary peak in the region. If you look carefully, many of these LOD scores are pretty low, but the strong peak agreement suggests that most are influenced by the same regulator. fri 14 jul 2006 Yandell © 2006

LOD scores for lipid metabolism transcripts Here is the GPCR story that we showed in the paper. GPCR stands for “G protein coupled receptor”. GPCRs is a protein family of transmembrane receptors. GPCRs are involved in many pathological pathways, and they are the targets of 40 to 50% of the modern medicinal drugs. Using our approach, we identified a set of 174 coordinately regulated GPCR transcripts. This can be seen from the monochromatic columns across the 60 F2 mice. Correlation study showed that they are not highly correlated with each other. But the fact that they are all enriched for the same functional category implies they are co-regulated. Here is a negative control plot, as you can see the coordinate patterns disappeared. Using the linkage information from those GPCR transcripts, we were able to discover a regulatory locus on chr2 at 30cM, between D2mit297 and D2mit9. Of the 174 traits, 50 had a major LOD peak at that locus, and 81 had a secondary peak in the region. If you look carefully, many of these LOD scores are pretty low, but the strong peak agreement suggests that most are influenced by the same regulator. Chromosome 2 (cM) fri 14 jul 2006 Yandell © 2006

Validation of Riken32G18 fri 14 jul 2006 Yandell © 2006 Experiments showed that the Riken gene appears to be regulated reciprocally with the lipogenic genes. fri 14 jul 2006 Yandell © 2006

expression meta-traits: pleiotropy reduce 3,000 heritable traits to 3 meta-traits(!) what are expression meta-traits? pleiotropy: a few genes can affect many traits transcription factors, regulators weighted averages: Z = YW principle components, discriminant analysis infer genetic architecture of meta-traits model selection issues are subtle missing data, non-linear search what is the best criterion for model selection? time consuming process heavy computation load for many traits subjective judgement on what is best fri 14 jul 2006 Yandell © 2006

PC for two correlated mRNA fri 14 jul 2006 Yandell © 2006

PC across microarray functional groups Affy chips on 60 mice ~40,000 mRNA 2500+ mRNA show DE (via EB arrays with marker regression) 1500+ organized in 85 functional groups 2-35 mRNA / group which are interesting? examine PC1, PC2 circle size = # unique mRNA fri 14 jul 2006 Yandell © 2006

84 PC meta-traits by functional group focus on 2 interesting groups fri 14 jul 2006 Yandell © 2006

red lines: peak for PC meta-trait black/blue: peaks for mRNA traits arrows: cis-action? fri 14 jul 2006 Yandell © 2006

(portion of) chr 4 region ? fri 14 jul 2006 Yandell © 2006

interaction plots for DA meta-traits DA for all pairs of markers: separate 9 genotypes based on markers (a) same locus pair found with PC meta-traits (b) Chr 2 region interesting from biochemistry (Jessica Byers) (c) Chr 5 & Chr 9 identified as important for insulin, SCD fri 14 jul 2006 Yandell © 2006

comparison of PC and DA meta-traits on 1500+ mRNA traits genotypes from Chr 4/Chr 15 locus pair (circle=centroid) PC captures spread without genotype DA creates best separation by genotype PC ignores genotype DA uses genotype note better spread of circles correlation of PC and DA meta-traits fri 14 jul 2006 Yandell © 2006

relating meta-traits to mRNA traits DA meta-trait standard units log2 expression SCD trait fri 14 jul 2006 Yandell © 2006

DA: a cautionary tale (184 mRNA with |cor| > 0 DA: a cautionary tale (184 mRNA with |cor| > 0.5; mouse 13 drives heritability) fri 14 jul 2006 Yandell © 2006

building graphical models infer genetic architecture of meta-trait E(Z | Q, M) = q = 0 + {q in M} qk find mRNA traits correlated with meta-trait Z  YW for modest number of traits Y extend meta-trait genetic architecture M = genetic architecture for Y expect subset of QTL to affect each mRNA may be additional QTL for some mRNA fri 14 jul 2006 Yandell © 2006

posterior for graphical models posterior for graph given multivariate trait & architecture pr(G | Y, Q, M) = pr(Y | Q, G) pr(G | M) / pr(Y | Q) pr(G | M) = prior on valid graphs given architecture multivariate phenotype averaged over genotypic mean  pr(Y | Q, G) = f1(Y | Q, G) = q f0(Yq | G) f0(Yq | G) =  f(Yq | , G) pr() d graphical model G implies correlation structure on Y genotype mean prior assumed independent across traits pr() = t pr(t) fri 14 jul 2006 Yandell © 2006

from graphical models to pathways build graphical models QTL  RNA1  RNA2 class of possible models best model = putative biochemical pathway parallel biochemical investigation candidate genes in QTL regions laboratory experiments on pathway components fri 14 jul 2006 Yandell © 2006

graphical models (with Elias Chaibub) f1(Y | Q, G=g) = f1(Y1 | Q) f1(Y2 | Q, Y1) DNA RNA unobservable meta-trait QTL protein D1 R1 observable cis-action? QTL P1 D2 R2 observable trans-action P2 fri 14 jul 2006 Yandell © 2006

summary expression QTL are complicated need to consider multiple interacting QTL coherent approach for high-throughput traits identify heritable traits dimension reduction to meta-traits mapping genetic architecture extension via graphical models to networks many open questions model selection computation efficiency inference on graphical models fri 14 jul 2006 Yandell © 2006

references: eQTL Brem et al. Kruglyak (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296: 752-755. Lan et al. Yandell, Attie (2003) Dimension reduction for mapping mRNA abundance as quantitative traits. Genetics 164: 1607-1614. Lan, Chen, Flowers, Yandell et al. Kendziorski, Attie (2006) Combined expression trait correlations and expression quantitative trait locus mapping. PLoS Genet 2: e6. Schadt, Monks et al. Friend (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422: 297-302. Schadt et al. Lusis (2005) An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet  37: 710-717. fri 14 jul 2006 Yandell © 2006

references: multiple traits Doerge (2002) Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3: 43-52. Jiang, Zeng (1995) Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140: 1111-1127. Korol et al. Nevo (2001) Enhanced efficiency of quantitative trait loci mapping analysis based on multivariate complexes of quantitative traits. Genetics 157: 1789-1803. Vieira, Pasyukova, Zeng et al. Mackay (2000) Genotype-environment interaction for quantitative trait loci affecting life span in Drosophila melanogaster. Genetics 154: 213-227. fri 14 jul 2006 Yandell © 2006

thanks USDA CSREES NIH/NIDDK, 5803701 and 66369-01 ADA Innovation Award, 7-03-IG-01 HHMI 133-ES29 (CMK) National Council for Scientific and Technological Development (CNPq), Brazil (ENC) fri 14 jul 2006 Yandell © 2006