Genome-Wide Association Studies Xiaole Shirley Liu Stat 115/215.


1 Genome-Wide Association Studies Xiaole Shirley Liu Stat 115/215

2 Multiple hypothesis testing? Family-based association studies (trios with affected child). Population-based case-control studies. GWAS P-values.

3 Unusual P-value distributions: P-value QQ plot

4 Unusual P-value distributions: P-value QQ plot; population stratification (Balding, Nature Reviews Genetics 2010)

5 Population Stratification
Population stratification
– e.g. some SNPs are unique to an ethnic group
– Need to make sure sample groups match
– Hidden environmental structure
● Two populations have different disease frequencies and different allele frequencies.
● Association picks up the fact that they are different populations!

6 Genotyping Principal Components (PCs) Can Model Population Stratification (Li et al., Science 2008)
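The idea on this slide can be sketched in a few lines. The example below is a minimal illustration, not the method of Li et al.: it centers a small genotype matrix (allele counts 0/1/2), builds the sample-by-sample similarity matrix, and finds its leading eigenvector by power iteration. The function name `top_pc_scores`, the toy genotypes, and the choice to skip allele-frequency scaling are all assumptions made for brevity.

```python
import math

def top_pc_scores(genotypes, n_iter=200):
    """Return each sample's score on the first principal component.

    genotypes: list of samples, each a list of 0/1/2 allele counts.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP column (frequency-based scaling is omitted here).
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    X = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Sample-by-sample similarity matrix K = X X^T.
    K = [[sum(a * b for a, b in zip(X[i], X[k])) for k in range(n)]
         for i in range(n)]
    # Power iteration for the leading eigenvector of K.
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(n_iter):
        w = [sum(K[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical data: three samples from each of two populations, four SNPs.
genotypes = [
    [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],   # population 1
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],   # population 2
]
scores = top_pc_scores(genotypes)
for label, s in zip("111222", scores):
    print(label, round(s, 3))
```

In this toy data the first PC takes opposite signs in the two populations, which is the property that lets PCs be used as covariates to model stratification.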

7 European population structure 1,387 samples ~200K SNPs

8 UK WTCCC1 Study [Figure: sample clusters labeled Africa, European, Chinese + Japanese, Afro-Caribbean samples, South Asian samples]

9 Genomic control
Devlin and Roeder (1999) used theoretical arguments to propose that, with population structure, the distribution of Cochran-Armitage trend test statistics, genome-wide, is inflated by a constant multiplicative factor $\lambda$.
We can estimate the multiplicative inflation factor using the statistic $\lambda = \mathrm{median}(X_i^2) / 0.456$.
An inflation factor $\lambda > 1$ indicates population structure and/or genotyping error.
We can carry out an adjusted test of association that takes account of any mismatching of cases/controls at any SNP using the statistic $X_i^2 / \lambda$.
[Figure: QQ plot with inflation factor $\lambda = 1.11$; annotations: "Population outliers and/or structure?" and "True hits?"]
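As a rough illustration of the genomic control formula above, the sketch below estimates the inflation factor from a list of 1-df test statistics and rescales them. The function name `genomic_control`, the example statistics, and the convention of not rescaling when the estimate falls below 1 are assumptions for illustration, not part of the slide.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom,
# using the rounded value quoted on the slide.
CHI2_1DF_MEDIAN = 0.456

def genomic_control(chi2_stats):
    """Return (lambda, adjusted statistics) for 1-df test statistics."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: do not inflate statistics when lambda < 1.
    scale = max(lam, 1.0)
    return lam, [x / scale for x in chi2_stats]

# Hypothetical trend-test statistics, for illustration only.
chi2_stats = [0.1, 0.3, 0.5, 0.57, 0.9, 1.8, 25.0]
lam, adjusted = genomic_control(chi2_stats)
print("lambda =", round(lam, 2))
print("largest adjusted statistic =", round(adjusted[-1], 2))
```

Because the median is used, a handful of very large statistics (possible true hits) barely changes the estimate of the inflation factor.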

10 IBD: Identity By Descent Test. If two individuals share a common ancestor, they will share many SNPs / haplotype blocks across their genomes (identical by state: IBS).

11 IBD: Identity By Descent Test. Pairwise IBD probability between samples: the probability that two individuals share 0 (Z0), 1 (Z1), or 2 (Z2) haplotypes across the genome. Remove IBDs (closely related samples).
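One common way to turn Z0, Z1 and Z2 into a single relatedness score is the expected proportion of the genome shared IBD, Z1/2 + Z2. The sketch below flags pairs above a cutoff. The sample names, the Z values, the helper names, and the 0.1875 cutoff (a frequently used threshold between second- and third-degree relatives) are assumptions for illustration.

```python
def pi_hat(z0, z1, z2):
    """Expected proportion of the genome shared IBD: Z1/2 + Z2."""
    if abs(z0 + z1 + z2 - 1.0) > 1e-6:
        raise ValueError("Z0 + Z1 + Z2 should sum to 1")
    return 0.5 * z1 + z2

def related_pairs(ibd, threshold=0.1875):
    """Return sample pairs whose IBD proportion exceeds the threshold."""
    return sorted(pair for pair, z in ibd.items() if pi_hat(*z) > threshold)

# Hypothetical pairwise IBD estimates: (Z0, Z1, Z2).
ibd = {
    ("S1", "S2"): (0.00, 1.00, 0.00),   # parent-child pattern
    ("S1", "S3"): (1.00, 0.00, 0.00),   # unrelated pattern
    ("S2", "S4"): (0.25, 0.50, 0.25),   # full-sibling pattern
}
flagged = related_pairs(ibd)
print(flagged)
```

In practice one sample from each flagged pair would be dropped before the association test.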

12 Manolio et al., J Clin Invest 2008

13 Pitfalls of Association Studies
– Not very predictive
– Explain little heritability
– Poor reproducibility
– Poor penetrance (fraction of people with the marker who show the trait) and expressivity (severity of the effect)
– Focus on common variation
– Difficult when several genes affect a quantitative trait
– Many associated variants are not causal
– No available intervention for many disease risks

14 Pitfalls of Association Studies: Not very predictive

15 Missing Heritability? (Visscher, AJHG 2011)

16 Reproducibility of Association Studies
Most reported associations have not been consistently reproduced.
Hirschhorn et al., Genetics in Medicine, 2002: review of association studies
– 603 associations of polymorphisms and disease
– 166 studied in at least three populations
– Only 6 seen in > 75% of studies

17 Causes for Inconsistency
What explains the lack of reproducibility?
False positives
– Multiple hypothesis testing
– Ethnic admixture / stratification
False negatives
– Lack of power for weak effects
Population differences
– Variable LD with causal SNP
– Population-specific modifiers

18 Causes for Inconsistency
A sizable fraction (but less than half) of reported associations are likely correct.
Genetic effects are generally modest
– Beware the winner's curse (auction theory)
– In association studies, the first positive report is equivalent to the winning bid
Large study sizes are needed to detect these reliably.

19 Should we Believe Association Study Results?
Initial skepticism is warranted.
Replication, especially with low p-values, is encouraging.
Large sample sizes are crucial.
E.g. PPARγ Pro12Ala & Diabetes

20 Replication, Replication, Replication
Meta-analysis of multiple studies to increase GWAS power
– Combine data from different platforms / studies
– Impute unmeasured or missing genotypes based on LD (e.g. HapMap haplotypes or 1000 Genomes)
– Analyze all studies together to increase GWAS power
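The slide does not say which meta-analysis method is meant. As one common choice, the sketch below combines per-study effect estimates for a single SNP with fixed-effect inverse-variance weighting. The function name `fixed_effect_meta` and the per-study effect sizes and standard errors are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. log odds ratios).
    ses:   per-study standard errors.
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical results for the same SNP in three studies.
betas = [0.10, 0.12, 0.08]
ses = [0.05, 0.05, 0.05]
beta, se, z, p = fixed_effect_meta(betas, ses)
print(round(beta, 3), round(se, 4), round(z, 2), p)
```

The pooled standard error is smaller than that of any single study, which is where the gain in power comes from.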

21 Detection Power of GWAS

22 Mapping (expression) Quantitative Trait Loci

23 Rat Recombinant Inbred (RI) Strains
[Figure: SHR × BN cross through F1 and F2; genotype B vs. genotype H; strain distribution pattern for Gene X: HBBHBHH]
F1 offspring are identical.
F2 offspring are different (due to recombination).
Brother-sister mating over >20 generations to achieve homozygosity at all genetic loci.

24 Mapping of QTLs
[Figure: RI strains; SDP for Gene X: BHBBBHH; traits: obesity, mRNA; linkage]
Compare the strain distribution pattern of every marker with certain traits.
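Comparing strain distribution patterns with a trait can be sketched as a per-marker two-group test. The example below splits the RI strains by their genotype (B or H) at each marker and ranks markers by a Welch t statistic. The marker names, the trait values, and the choice of test are assumptions for illustration; only the BHBBBHH and HBBHBHH patterns come from the slides.

```python
import math
import statistics

def t_statistic(trait, sdp):
    """Welch t statistic comparing trait values of H strains vs. B strains."""
    b = [y for y, g in zip(trait, sdp) if g == "B"]
    h = [y for y, g in zip(trait, sdp) if g == "H"]
    se = math.sqrt(statistics.variance(b) / len(b)
                   + statistics.variance(h) / len(h))
    return (statistics.mean(h) - statistics.mean(b)) / se

# Hypothetical trait (e.g. an obesity measure) in seven RI strains.
trait = [1.0, 3.1, 1.2, 0.9, 1.1, 2.9, 3.0]

# Strain distribution patterns per marker (marker names are made up).
sdps = {
    "marker_X": "BHBBBHH",
    "marker_Y": "HBBHBHH",
}

t_by_marker = {name: t_statistic(trait, sdp) for name, sdp in sdps.items()}
best = max(t_by_marker, key=lambda name: abs(t_by_marker[name]))
for name, t in t_by_marker.items():
    print(name, round(t, 2))
print("strongest linkage:", best)
```

For an eQTL scan the same loop would be run with each gene's mRNA level as the trait.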

25 (e)QTL Mapping
Many disease-associated genes have been mapped with QTL.
eQTL mapping:
– Transcript abundance may act as an intermediate phenotype between genetic loci and the clinical phenotype
– Incorporate information on genotype, expression, and clinical traits together to construct regulatory networks and to improve understanding of disease etiologies

26 eQTL Analysis

27 cis- and trans-acting eQTLs

28 trans-eQTL Hot-spots

29 eQTL on Human HapMap
– Gene expression
– Histone mark
– DNase-seq
Need to check AA, AB, BB genotypes against gene expression differences.

30 eQTL on TF Binding and Epigenetics (McDaniell et al., Science 2010)
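Checking AA, AB, BB genotypes against expression can be sketched as an additive model: code the genotype as the number of B alleles and regress expression on that dosage. The function names, the dosage coding, and the expression values below are assumptions for illustration; real analyses typically add covariates and a significance test.

```python
import statistics

# Additive coding: number of B alleles carried.
DOSAGE = {"AA": 0, "AB": 1, "BB": 2}

def eqtl_slope(genotypes, expression):
    """Least-squares slope and intercept of expression on allele dosage."""
    x = [DOSAGE[g] for g in genotypes]
    mx, my = statistics.mean(x), statistics.mean(expression)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, expression))
    slope = sxy / sxx
    return slope, my - slope * mx

def group_means(genotypes, expression):
    """Mean expression within each genotype class."""
    return {g: statistics.mean(y for gt, y in zip(genotypes, expression)
                               if gt == g)
            for g in DOSAGE if g in genotypes}

# Hypothetical genotypes at one SNP and expression of one gene.
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

slope, intercept = eqtl_slope(genotypes, expression)
means = group_means(genotypes, expression)
print("slope per B allele:", round(slope, 3))
print({g: round(m, 2) for g, m in means.items()})
```

The same comparison applies when the measured trait is a histone mark or DNase-seq signal instead of expression.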

31 Summary
– Population stratification, IBD
– Removing outliers or finding the scaling factor
– Predictability, heritability
– Reproducibility
– QTL and eQTL mapping
– Cis- vs trans-eQTL

32 Acknowledgement: Tim Niu; Kenneth Kidd, Judith Kidd and Glenys Thomson; Joel Hirschhorn; Greg Gibson & Spencer Muse; Jim Stankovich; Teri Manolio; David Evans; Guodong Wu; Enrico Petretto; Wei Wang; Bo Li

