Download presentation
Presentation is loading. Please wait.
1
Biomedical Master Introduction to genome-wide association studies Metabolic diseases (B. Thorens) Biomedical Master: Metabolic diseases Lausanne, October 31, 2011 Sven Bergmann University of Lausanne & Swiss Institute of Bioinformatics http://serverdgm.unil.ch/bergmann
2
A Systems Biology approach Large (genomic) systems many uncharacterized elements relationships unknown computational analysis should: improve annotation reveal relations reduce complexity Small systems elements well-known many relationships established quantitative modeling of systems properties like: Dynamics Robustness Logics
6
Overview Population stratification Our whole genome associations New Methods and Approaches
7
ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG… ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG… ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG…ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG… Genetic variation in SNPs (Single Nucleotide Polymorphisms)
8
6’189 individuals Phenotypes 159 measurement 144 questions Genotypes 500.000 SNPs CoLaus = Cohort Lausanne Collaboration with: Vincent Mooser (GSK), Peter Vollenweider & Gerard Waeber (CHUV)
9
Analysis of Genotypes only Principle Component Analysis reveals SNP-vectors explaining largest variation in the data
10
Ethnic groups cluster according to geographic distances PC1 PC2
11
PCA of POPRES cohort
12
Predicting location according to SNP-profile...
13
… is pretty accurate!
14
The Swiss segregate according to language
15
PC-Analysis of genotypic profile Is surprisingly accurate! Is useful for forensic purposes or for individuals interested in their ancestry Is useful for population stratification in Genome-wide Association studies
16
Phenotypic variation:
17
What is association? chromosomeSNPstrait variant Genetic variation yields phenotypic variation Population with ‘ ’ allele Distributions of “trait”
18
Association using regression genotypeCoded genotype phenotype
19
Regression formalism (monotonic) transformation phenotype (response variable) of individual i effect size (regression coefficient) coded genotype (feature) of individual i p(β=0) error (residual) Goal: Find effect size that explains best all (potentially transformed) phenotypes as a linear function of the genotypes and estimate the probability (p-value) for the data being consistent with the null hypothesis (i.e. no effect)
20
Whole Genome Association
21
Current microarrays probe ~1M SNPs! Standard approach: Evaluate significance for association of each SNP independently: significance
22
Whole Genome Association significance Manhattan plot observed significance Expected significance Quantile-quantile plot Chromosome & position GWA screens include large number of statistical tests! Huge burden of correcting for multiple testing! Can detect only highly significant associations ( p < α / #(tests) ~ 10 -7 )
27
Genome-wide meta-analysis for serum calcium identifies significantly associated SNPs near the calcium-sensing receptor (CASR) gene Karen Kapur, Toby Johnson, Noam D. Beckmann, Joban Sehmi, Toshiko Tanaka, Zolt á n Kutalik, Unnur Styrkarsdottir, Weihua Zhang, Diana Marek, Daniel F. Gudbjartsson, Yuri Milaneschi, Hilma Holm, Angelo DiIorio, Dawn Waterworth, Andrew Singleton, Unnur Steina Bjornsdottir, Gunnar Sigurdsson, Dena Hernandez, Ranil DeSilva, Paul Elliott, Gudmundur Eyjolfsson, Jack M Guralnik, James Scott, Unnur Thorsteinsdotti, Stefania Bandinelli, John Chambers, Kari Stefansson, G é rard Waeber, Luigi Ferrucci, Jaspal S Kooner, Vincent Mooser, Peter Vollenweider, Jacques S. Beckmann, Murielle Bochud, Sven Bergmann
28
Current insights from GWAS: Well-powered (meta-)studies with (ten-)thousands of samples have identified a few (dozen) candidate loci with highly significant associations Many of these associations have been replicated in independent studies
29
Current insights from GWAS: Each locus explains but a tiny (<1%) fraction of the phenotypic variance All significant loci together explain only a small (<10%) of the variance
30
The “Missing variance” (Non-)Problem Why should a simplistic (additive) model using incomplete or approximate features possibly explain anything close to the genetic variance of a complex trait? … and it doesn ’ t have to as long as Genome-wide Association Studies are meant to as an undirected approach to elucidate new candidate loci that impact the trait!
31
1.Improve measurements: - measure more variants (e.g. by UHS) - measure other variants (e.g. CNVs) - measure “molecular phenotypes” 2.Improve models: - proper integration of uncertainties - include interactions - multi-layer models How could our models become more predictive?
32
Towards a layered Systems Model We need intermediate (molecular) phenotypes to better understand organismal phenotypes
33
Network Approaches for Integrative Association Analysis Using knowledge on physical gene-interactions or pathways to prioritize the search for functional interactions
34
Transcription Modules reduce Complexity SB, J Ihmels & N Barkai Physical Review E (2003) http://maya.unil.chhttp://maya.unil.ch: 7575/ExpressionView
35
Association of (average) module expression is often stronger than for any of its constituent genes
36
Analysis of genome-wide SNP data reveals that population structure mirrors geography Genome-wide association studies elucidate candidate loci for a multitude of traits, but have little predictive power so far Future improvement will require –better genotyping (CGH, UHS, …) –New analysis approaches (interactions, networks, data integration) Take-home Messages:
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.