1
High level GWAS analysis
Tzu L Phang, Associate Professor of Bioinformatics, Colorado Center for Personalized Medicine
2
Agenda
What is GWAS?
What does GWAS tell us?
What should we consider when designing or interpreting GWAS?
3
What is a GWAS?
A genome-wide association study is an approach that involves rapidly scanning markers across the genome (~0.5M to 1M) of many people (~2K) to find genetic variations associated with a particular disease.
A large number of subjects are needed because:
Associations between SNPs and causal variants are expected to show low odds ratios, typically below 1.5.
In order to obtain a reliable signal, given the very large number of tests that are required, associations must show a high level of significance to survive the multiple testing correction.
Such studies are particularly useful in finding genetic variations that contribute to common, complex diseases.
4
What is the goal of GWAS?
Goal: uncover the genetic basis of a given disease.
Basic idea: a loosely defined study design that involves genotyping cases and controls at a large number (10^4 to 10^6) of SNP markers spread (in some unspecified way) throughout the genome.
Look for associations between the genotypes at each locus and disease status.
5
Why are such studies possible now?
Computing resources to perform GWAS are now commonly available.
With the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005, researchers now have a set of research tools that make it possible to find the genetic contributions to common diseases.
6
General design and workflow
7
Case-control GWAS
Obtain DNA from people with the disease of interest (cases) and unaffected controls.
Run each DNA sample on a SNP chip to measure genotypes at 300,000-1,000,000 SNPs in cases and controls.
Identify SNPs where one allele is significantly more common in cases than controls.
Such a SNP is associated with disease.
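As a rough illustration of the "significantly more common in cases than controls" step, the sketch below runs a Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts for a single SNP. The function name `allelic_chi_square` and the counts are hypothetical, chosen only for illustration; the slides do not say which test statistic was used, and real analyses typically rely on dedicated GWAS software.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases/controls, columns are the two alleles (a, b).
    Returns (chi2, p_value).
    """
    observed = ((case_a, case_b), (ctrl_a, ctrl_b))
    row_totals = (case_a + case_b, ctrl_a + ctrl_b)
    col_totals = (case_a + ctrl_a, case_b + ctrl_b)
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were equal in both groups
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 df, the chi-square tail probability equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP (2,000 cases and 2,000 controls,
# two alleles per person); not taken from any real study.
chi2, p = allelic_chi_square(case_a=1200, case_b=2800, ctrl_a=1000, ctrl_b=3000)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

In a genome-wide scan this test would be repeated for every genotyped SNP, which is why the significance threshold has to be far stricter than 0.05.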
8
Association does not imply causation
Suppose that genotypes at a particular SNP are significantly associated with disease
9
Association does not imply causation
This may be because the SNP is associated with some other factor (a confounder), which is associated with disease but is not in the same causal pathway
10
Association ≠ Causation
Possible confounders of genetic associations:
Ethnic ancestry
Genotyping batch, genotyping centre
DNA quality
Environmental exposures in the same causal pathway:
Nicotine receptors --> smoking --> lung cancer. Hung et al, Nature 452: 633 (2008) + other articles in same issue
Alcohol dehydrogenase genes --> alcohol consumption --> throat cancer. Hashibe et al, Nature Genetics 40: 707 (2008)
11
Helpful confounding: linkage disequilibrium
Linkage disequilibrium (LD) is the non-independence of alleles at nearby markers in a population because of a lack of recombination between the markers.
LD is helpful, because not all SNPs have to be genotyped.
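To make "non-independence of alleles" concrete, the sketch below computes two standard LD measures for a pair of biallelic markers: D, the difference between the observed haplotype frequency and the frequency expected under independence, and r^2, its squared-correlation form. The slides do not name a specific LD measure, so the choice of D and r^2, the function name `ld_stats`, and the frequencies are assumptions made for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    # D is zero when the alleles are independent (p_ab == p_a * p_b)
    d = p_ab - p_a * p_b
    # r2 rescales D^2 to the range 0..1
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical frequencies, for illustration only
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

An r^2 close to 1 means that genotyping one marker is nearly as informative as genotyping the other, which is the sense in which not all SNPs have to be genotyped.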
12
Direct and indirect association
Direct association: the functional SNP is genotyped and an association is found.
Indirect association: the functional SNP (blue) is not genotyped, but a number of other SNPs (red) in LD with the functional SNP are genotyped, and an association is found for those SNPs.
13
Odds ratio – measure of effect size
14
Interpretation of odds ratio
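As a minimal sketch of how an odds ratio is obtained from a 2x2 table of allele counts, the code below computes the odds ratio and an approximate 95% confidence interval using the standard log (Woolf) method. The function name `odds_ratio` and the counts are hypothetical. An odds ratio of 1 means no association; a value above 1 means the allele is more common in cases, and a value below 1 means it is more common in controls.

```python
import math


def odds_ratio(case_a, case_b, ctrl_a, ctrl_b, z=1.96):
    """Odds ratio for allele a (vs. allele b) with an approximate 95% CI.

    Uses the log (Woolf) method; z=1.96 corresponds to a 95% interval.
    Assumes all four counts are positive.
    """
    or_value = (case_a * ctrl_b) / (case_b * ctrl_a)
    # Standard error of ln(OR)
    se_log_or = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


# Hypothetical allele counts, for illustration only
or_value, lower, upper = odds_ratio(case_a=1200, case_b=2800, ctrl_a=1000, ctrl_b=3000)
print(f"OR = {or_value:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the odds ratio falls below 1.5, the range that GWAS associations are typically expected to show.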
15
Multiple testing considerations
Suppose you test 500,000 SNPs for association with disease.
Expect around 500,000 x 0.05 = 25,000 to have a p-value less than 0.05 by chance alone.
More appropriate significance threshold: p = 0.05 / 500,000 = 10^-7 (genome-wide significance).
In our MS GWAS we considered SNPs for follow-up if they had p-values less than 0.001.
To detect a smaller p-value, a larger study is needed.
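The arithmetic on this slide can be reproduced in a few lines. The sketch below assumes a Bonferroni-style correction (dividing the per-test level by the number of tests), which matches the numbers given here; the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests with p < alpha if no SNP is truly associated."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


n_snps = 500_000
print(expected_false_positives(n_snps))  # about 25,000 SNPs below 0.05 by chance
print(bonferroni_threshold(n_snps))      # about 1e-07, the genome-wide threshold
```

A less stringent cut-off, such as the 0.001 used to pick SNPs for follow-up, trades more false positives for a lower chance of missing true associations.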