Published by Shanna Marsh. Modified over 9 years ago.
1
Genome-Wide Association Studies. Xiaole Shirley Liu, Stat 115/215
2
Multiple hypothesis testing? Family-based association studies (trios with an affected child). Population-based case-control studies. GWAS P-values.
3
Unusual P-value distributions: P-value QQ plot
4
Unusual P-value distributions: P-value QQ plot; population stratification (Balding, Nature Reviews Genetics 2010)
5
Population Stratification
Population stratification
– e.g. some SNP unique to an ethnic group
– Need to make sure sample groups match
– Hidden environmental structure
● Two populations have different disease frequency and different allele frequency.
● Association picks up the fact that they are different populations!
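The two bullet points above can be made concrete with a small numerical sketch. All sample sizes and allele frequencies below are invented for illustration, and the function names `allele_table` and `odds_ratio` are hypothetical. Within each population the allele has the same frequency in cases and controls (odds ratio 1), yet pooling the two populations produces a strong spurious association.

```python
def allele_table(n_cases, n_controls, allele_freq):
    """Allele counts (case_A, case_B, ctrl_A, ctrl_B) for one population in
    which allele A has the SAME frequency in cases and controls."""
    case_a = round(2 * n_cases * allele_freq)
    ctrl_a = round(2 * n_controls * allele_freq)
    return case_a, 2 * n_cases - case_a, ctrl_a, 2 * n_controls - ctrl_a


def odds_ratio(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic odds ratio from a 2x2 table of allele counts."""
    return (case_a * ctrl_b) / (case_b * ctrl_a)


# Hypothetical populations: population 1 has high disease frequency and high
# allele frequency; population 2 has low disease frequency and low allele frequency.
pop1 = allele_table(n_cases=800, n_controls=200, allele_freq=0.8)
pop2 = allele_table(n_cases=200, n_controls=800, allele_freq=0.2)
pooled = tuple(a + b for a, b in zip(pop1, pop2))

print("OR within population 1:", odds_ratio(*pop1))
print("OR within population 2:", odds_ratio(*pop2))
print("OR after pooling:      ", odds_ratio(*pooled))
```

The pooled odds ratio is far from 1 only because cases are drawn mostly from the population in which the allele is common, which is the situation the slide describes.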
6
Genotyping Principal Components (PCs) Can Model Population Stratification (Li et al., Science 2008)
7
European population structure: 1,387 samples, ~200K SNPs
8
UK WTCCC1 Study
[Figure labels: Africa; European; Chinese + Japanese; Afro-Caribbean samples; South Asian samples]
9
Genomic control
– Devlin and Roeder (1999) used theoretical arguments to propose that, with population structure, the genome-wide distribution of Cochran-Armitage trend test statistics is inflated by a constant multiplicative factor λ.
– The inflation factor can be estimated as $\lambda = \operatorname{median}(X_i^2) / 0.455$, where 0.455 is the median of a χ² distribution with 1 degree of freedom.
– An inflation factor λ > 1 indicates population structure and/or genotyping error.
– An adjusted test of association, which takes account of any mismatching of cases and controls at any SNP, uses the statistic $X_i^2 / \lambda$.
[Figure annotations: inflation factor λ = 1.11; population outliers and/or structure? True hits?]
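The genomic-control recipe above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the statistics are assumed to be 1-degree-of-freedom chi-square values (one per SNP), and the divisor is the median of that distribution, approximately 0.455.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom (approx. 0.455).
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate lambda as median(X_i^2) / median of chi-square(1 df)."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Return (lambda, adjusted statistics X_i^2 / lambda)."""
    lam = inflation_factor(chi2_stats)
    return lam, [x / lam for x in chi2_stats]


def chi2_1df_pvalue(x):
    """Upper-tail P-value of a chi-square(1 df) statistic: P(|Z| > sqrt(x))."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical trend-test statistics for five SNPs.
stats = [0.1, 0.3, 0.5, 0.7, 3.0]
lam, adjusted = genomic_control(stats)
print("lambda =", round(lam, 3))
print("adjusted P-values:", [round(chi2_1df_pvalue(x), 3) for x in adjusted])
```

In practice λ is estimated from all SNPs genome-wide; five values are used here only to keep the example readable.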
10
IBD: Identity By Descent Test
If two individuals share a common ancestor, they will share many SNPs / haplotype blocks across their genomes (identical by state: IBS)
11
IBD: Identity By Descent Test
– Pairwise IBD probability between samples
– Probability that two individuals share 0 (Z0), 1 (Z1), or 2 (Z2) haplotypes across the genome
– Remove IBD (related) samples
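One common way to turn Z0, Z1, and Z2 into a single relatedness score is the expected proportion of the genome shared IBD, Z1/2 + Z2 (the quantity PLINK reports as PI_HAT). The sketch below assumes that summary; the sample names and the 0.1875 cut-off are illustrative assumptions, not values from the slides.

```python
def pi_hat(z0, z1, z2):
    """Expected proportion of the genome shared IBD: 0.5 * Z1 + Z2."""
    if abs(z0 + z1 + z2 - 1.0) > 1e-6:
        raise ValueError("Z0, Z1 and Z2 must sum to 1")
    return 0.5 * z1 + z2


def related_pairs(pairs, threshold=0.1875):
    """Return the sample pairs whose IBD sharing exceeds the threshold.

    pairs: iterable of (sample_a, sample_b, (z0, z1, z2)).
    The default threshold is an assumed, commonly used cut-off.
    """
    return [(a, b) for a, b, z in pairs if pi_hat(*z) > threshold]


# Hypothetical pairwise estimates.
pairs = [
    ("s1", "s2", (1.0, 0.0, 0.0)),     # unrelated
    ("s1", "s3", (0.0, 1.0, 0.0)),     # parent-child pattern
    ("s2", "s4", (0.25, 0.5, 0.25)),   # full-sibling pattern
    ("s5", "s6", (0.0, 0.0, 1.0)),     # duplicate or identical twin pattern
]
print(related_pairs(pairs))
```

One sample from each flagged pair would then be dropped before the association test.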
12
Manolio et al., J Clin Invest 2008
13
Pitfalls of Association Studies
– Not very predictive
– Explain little heritability
– Poor reproducibility
– Poor penetrance (fraction of people with the marker who show the trait) and expressivity (severity of the effect)
– Focus on common variation
– Difficult when several genes affect a quantitative trait
– Many associated variants are not causal
– No available intervention for many disease risks
14
Pitfalls of Association Studies: not very predictive
15
Missing Heritability? (Visscher, AJHG 2011)
16
Reproducibility of Association Studies
Most reported associations have not been consistently reproduced
Hirschhorn et al., Genetics in Medicine, 2002, review of association studies:
– 603 associations of polymorphisms and disease
– 166 studied in at least three populations
– Only 6 seen in > 75% of studies
17
Causes for Inconsistency
What explains the lack of reproducibility?
False positives
– Multiple hypothesis testing
– Ethnic admixture / stratification
False negatives
– Lack of power for weak effects
Population differences
– Variable LD with causal SNP
– Population-specific modifiers
18
Causes for Inconsistency
A sizable fraction (but less than half) of reported associations are likely correct
Genetic effects are generally modest
– Beware the winner’s curse (auction theory)
– In association studies, the first positive report is equivalent to the winning bid
Large study sizes are needed to detect these effects reliably
19
Should We Believe Association Study Results?
– Initial skepticism is warranted
– Replication, especially with low P-values, is encouraging
– Large sample sizes are crucial
– E.g. PPARγ Pro12Ala and diabetes
20
Replication, Replication, Replication
– Meta-analysis of multiple studies to increase GWAS power
– Combine data from different platforms / studies
– Impute unmeasured or missing genotypes based on LD (e.g. HapMap haplotypes or 1000 Genomes)
– Analyze all studies together to increase GWAS power
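The slide does not say which meta-analysis method is meant. A standard choice for combining per-SNP effect estimates across studies is the fixed-effect, inverse-variance-weighted estimate, sketched below under that assumption; the function name and the effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance-weighted fixed-effect meta-analysis for one SNP.

    estimates: list of (beta, standard_error), one entry per study.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return beta, se, z, p


# Two hypothetical studies reporting the same modest effect.
single = fixed_effect_meta([(0.2, 0.1)])
combined = fixed_effect_meta([(0.2, 0.1), (0.2, 0.1)])
print("single study P = %.4f" % single[3])
print("two studies  P = %.4f" % combined[3])
```

With two equally sized studies the pooled standard error shrinks by a factor of √2, which is how combining studies increases power for the modest effects typical of GWAS.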
21
Detection Power of GWAS
22
Mapping (expression) Quantitative Trait Loci
23
Rat Recombinant Inbred (RI) Strains
– Parental strains: SHR and BN
– F1 offspring are identical
– F2 offspring are different (due to recombination)
– Brother-sister mating over >20 generations to achieve homozygosity at all genetic loci
– Strain distribution pattern (SDP) for Gene X across the RI strains, each strain carrying genotype B or genotype H: H B B H B H H
24
Mapping of QTLs
– Compare the strain distribution pattern (SDP) of every marker with certain traits
– SDP for Gene X: B H B B B H H
[Figure labels: RI strains; obesity; mRNA; linkage]
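Comparing every marker's SDP with a trait can be sketched as a correlation scan: encode each SDP as 0/1 and keep the marker whose pattern tracks the trait most closely. Pearson correlation is only one possible score, chosen here for simplicity. The marker names and trait values are invented; the first two SDP strings reuse the patterns shown on the slides.

```python
import math


def encode(sdp):
    """Encode an SDP string as numbers: H -> 1.0, B -> 0.0."""
    return [1.0 if g == "H" else 0.0 for g in sdp]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_marker(markers, trait):
    """Return (marker with the largest |r|, all correlation scores)."""
    scores = {name: pearson(encode(sdp), trait) for name, sdp in markers.items()}
    return max(scores, key=lambda name: abs(scores[name])), scores


# Hypothetical markers typed in seven RI strains, and a hypothetical trait
# (e.g. an mRNA level) measured in the same strains, in the same order.
markers = {"marker_1": "HBBHBHH", "marker_2": "BHBBBHH", "marker_3": "BBHHBBH"}
trait = [9.1, 4.0, 4.2, 8.8, 3.9, 9.3, 9.0]

best, scores = best_marker(markers, trait)
print(best, round(scores[best], 3))
```

A real scan would also convert each score into a significance measure and correct for the number of markers tested.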
25
(e)QTL Mapping
Many disease-associated genes have been mapped with QTL
eQTL mapping:
– Transcript abundance may act as an intermediate phenotype between genetic loci and the clinical phenotype
– Incorporate information on genotype, expression, and clinical traits together to construct regulatory networks and to improve understanding of disease etiologies
26
eQTL Analysis
27
cis- and trans-acting eQTLs
28
trans-eQTL hot-spots
29
eQTL on Human HapMap
– Gene expression
– Histone mark
– DNase-seq
Need to check AA, AB, BB genotypes against gene expression differences
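Checking AA, AB, and BB genotypes against expression is often done with an additive model: code the genotypes as 0, 1, or 2 copies of allele B and regress expression on that dosage. The sketch below assumes this coding; the function names and expression values are made up for illustration.

```python
DOSAGE = {"AA": 0, "AB": 1, "BB": 2}  # copies of allele B (assumed coding)


def group_means(genotypes, expression):
    """Mean expression within each genotype class."""
    groups = {}
    for g, e in zip(genotypes, expression):
        groups.setdefault(g, []).append(e)
    return {g: sum(v) / len(v) for g, v in groups.items()}


def eqtl_fit(genotypes, expression):
    """Least-squares fit of expression = intercept + slope * dosage."""
    x = [DOSAGE[g] for g in genotypes]
    n = len(x)
    mx, my = sum(x) / n, sum(expression) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, expression))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical genotypes at one SNP and expression of one gene in six samples.
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept = eqtl_fit(genotypes, expression)
print(group_means(genotypes, expression))
print("expression change per B allele:", round(slope, 3))
```

The same fit could be repeated with a histone-mark or DNase-seq signal in place of expression, with a significance test on the slope added for a real analysis.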
30
eQTL on TF Binding and Epigenetics (McDaniell et al., Science 2010)
31
Summary
– Population stratification, IBD
– Removing outliers or finding the scaling factor
– Predictability, heritability
– Reproducibility
– QTL and eQTL mapping
– Cis- vs trans-eQTL
32
Acknowledgement: Tim Niu; Kenneth Kidd, Judith Kidd and Glenys Thomson; Joel Hirschhorn; Greg Gibson & Spencer Muse; Jim Stankovich; Teri Manolio; David Evans; Guodong Wu; Enrico Petretto; Wei Wang; Bo Li