GWAS-eQTL signal colocalisation methods


Integrating GWAS and eQTL studies can elucidate the mechanisms by which non-coding variants affect disease.
This is challenging because of the uncertainty induced by (i) LD and (ii) allelic heterogeneity.
Key question: do the GWAS and eQTL signals share the same causal variant(s) or not?
Allen et al, 2017

What we want to see vs. what we'll often see
[Diagram: three scenarios relating a (non-coding) causal variant, transcription (eQTL) and disease (lung function GWAS), with genotypes AA, Aa and aa. Panels: Causality; Pleiotropy; Linkage (causal variant 1 and causal variant 2).]

Current UK Biobank lung function (LF) GWAS
If the top eSNP for a gene was in our 99% credible set, we inferred that the two signals were colocalised.
This is generally a strict approach: some credible sets have only 1-2 SNPs, e.g. the credible set rs35506 & rs35505 (near TBX3).
It puts too much trust in the eQTL results (relatively small sample sizes and potential cell-type heterogeneity).
Strict thresholds were applied because the methods are still a work in progress.
Shrine, Guyatt et al, 2018. BioRxiv
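The credible-set rule above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in the GWAS: it assumes per-SNP posterior probabilities are already available from fine-mapping, and the function names and SNP labels (`snp_a`, ...) are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Smallest set of SNPs whose (normalised) posteriors sum to >= coverage."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(posteriors.values())
    selected, cumulative = [], 0.0
    for snp, prob in ranked:
        selected.append(snp)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return selected


def top_esnp_colocalises(top_esnp, posteriors, coverage=0.99):
    """Apply the slide's rule: colocalised if the top eSNP is in the credible set."""
    return top_esnp in credible_set(posteriors, coverage)


# Hypothetical posteriors for one locus (illustrative values, not real results).
example = {"snp_a": 0.70, "snp_b": 0.295, "snp_c": 0.004, "snp_d": 0.001}
print(credible_set(example))
print(top_esnp_colocalises("snp_c", example))
```

With a credible set this small, a top eSNP that is merely in strong LD with the set is rejected, which is why the rule is strict.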

eCAVIAR: eQTL and GWAS CAusal Variants Identification in Associated Regions
Hormozdiari et al, 2017. AJHG
"State-of-the-art" and widely used since publication (>50 citations).
Probabilistic model for integrating GWAS and eQTL data to estimate the posterior probability that the same variant is causal in both the GWAS and eQTL studies, while accounting for allelic heterogeneity and LD.
It can (i) quantify the strength of association between a 'causal' variant and its associated signals in both studies, and (ii) colocalise variants that pass the significance threshold in GWAS.
For any given peak variant identified in GWAS, eCAVIAR considers a collection of variants around that peak variant as one single locus.

(Most likely) causal SNP(s), target gene(s), relevant tissue(s).
CLPP (colocalisation posterior probability): the probability that the same variant(s) is causal in both the GWAS and eQTL study.
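As a rough sketch of how a per-variant CLPP can be formed: to my understanding, eCAVIAR treats the two studies as independent, so the CLPP for a variant is the product of its posterior probability of being causal in the GWAS and in the eQTL study. The function name and example values below are hypothetical, and the posteriors are assumed to have been computed already.

```python
def clpp(gwas_post, eqtl_post):
    """Per-variant CLPP as the product of the two per-study posteriors.

    Only variants present in both studies receive a score.
    """
    shared = gwas_post.keys() & eqtl_post.keys()
    return {snp: gwas_post[snp] * eqtl_post[snp] for snp in shared}


# Hypothetical posteriors (illustrative values only).
gwas = {"snp_a": 0.5, "snp_b": 0.1}
eqtl = {"snp_a": 0.5, "snp_c": 0.9}
print(clpp(gwas, eqtl))
```

A variant needs a high posterior in both studies to reach a high CLPP; uncertainty in either study pulls the product down.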

CLPP: colocalisation posterior probability
[Figure: GWAS -log10(P) and eQTL -log10(P) panels, labelled "CLPP is high" and "CLPP is low".]
Panel note: CLPP is low (~0.25) if 1 causal variant is specified; CLPP≈1 if >1 causal variant.

Current analysis plan & results
MFAP2 region, FEV1/FVC meta-analysis.
GWAS results: +/-500kb around the sentinel SNP and P<10^-4. Output: 375 SNPs.
GTEx Lung (full results) and Lung eQTL (FDR<5%). Input: 366 and 5 SNPs, respectively.
Supp. Table 13: Z-score = 3.719
Sakornsakolpat et al (BioRxiv), supplement p10: "To determine whether these signals co-localized (rather than being related due to linkage disequilibrium), we performed colocalization analysis between our genomewide significant loci and mQTL using eCAVIAR [64]. We tested variants that were significant in both datasets, P<0.0027 in GWAS (equivalent to Z score>3, as recommended by the author [64]) and P<3.2x10^-6 in mQTL [61]. We estimated the posterior probability of a variant being shared in both GWAS and mQTL, using a cut-off of 0.1 as previous demonstrated [64]."
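The thresholds quoted above can be checked and applied with a short Python sketch. It shows that a two-sided P of about 0.0027 corresponds to |Z| = 3 under a standard normal, and filters variants that pass both thresholds. The function names and the dict-of-P-values input are assumptions for illustration, not part of eCAVIAR.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def significant_in_both(gwas_p, qtl_p, gwas_cut=0.0027, qtl_cut=3.2e-6):
    """Variants present in both datasets that pass both P-value thresholds."""
    shared = gwas_p.keys() & qtl_p.keys()
    return sorted(
        snp for snp in shared
        if gwas_p[snp] < gwas_cut and qtl_p[snp] < qtl_cut
    )


print(two_sided_p(3.0))      # close to the 0.0027 threshold
print(two_sided_p(3.719))    # the Z-score quoted from Supp. Table 13

# Hypothetical P-values (illustrative only).
gwas_p = {"snp_a": 1e-5, "snp_b": 0.01, "snp_c": 1e-8}
qtl_p = {"snp_a": 1e-7, "snp_b": 1e-9, "snp_c": 1e-3}
print(significant_in_both(gwas_p, qtl_p))
```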

The 99% credible set has 5 SNPs (incl. rs9435733).
Shrine, Guyatt et al, 2018. BioRxiv

eCAVIAR outputs
*_col: contains the colocalisation posterior probability (CLPP). The last column is the CLPP score.
*_post: contains the probability that each variant is causal in the eQTL or GWAS study. The last column is this quantity.
*_set: the credible set, used for fine-mapping purposes.
*_hist: produced when you set -f. If the maximum number of causal variants (-c) is set to X, the *_hist file has X+1 columns: the first column is the probability that the locus has 0 causal variants, the second column is the probability that it has 1 causal variant and, in general, the k-th column is the probability that it has (k-1) causal variants.
The files _1 and _2 refer to the GWAS and eQTL results, respectively.
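A minimal sketch of post-processing a *_col file, relying only on the fact stated above that the last column is the CLPP score. Everything else is an assumption: that the file is whitespace-delimited, that the first column is the SNP identifier, that any header line has a non-numeric last field, and that the 0.1 cut-off is applied as "greater than". The header names in the example are hypothetical.

```python
def read_clpp(lines, cutoff=0.1):
    """Return {snp_id: clpp} for rows whose last column exceeds the cut-off."""
    hits = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        try:
            score = float(fields[-1])
        except ValueError:
            continue  # assume a non-numeric last field means a header row
        if score > cutoff:
            hits[fields[0]] = score
    return hits


# Hypothetical file contents (illustrative values, not real eCAVIAR output).
rows = [
    "SNP_ID Prob_in_pCausalSet CLPP",
    "snp_a 0.20 0.35",
    "snp_b 0.01 0.002",
]
print(read_clpp(rows))
```

For a real run, the same function can be fed an open file handle, e.g. `read_clpp(open(path))`, where `path` points to the *_col output.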

eCAVIAR paper discussion
Strong evidence in support of the idea that most GWAS loci are not strong eQTL loci, and that the mechanism by which GWAS loci affect gene regulation is more complicated than expected.
Possible explanations:
(i) GWAS loci do in fact affect expression, but are secondary signals in comparison to the stronger associations found in current eQTL studies.
(ii) Heterogeneity of tissues could make it hard to detect eQTLs specific to a disease-relevant cell type that composes only a fraction of the tissue.
(iii) GWAS variants affect other aspects of gene regulation, such as splicing, or regulation at a level other than transcription. Several studies have shown that alternative splicing could explain the causal mechanism of complex disease associations.
(iv) GWAS loci are eQTL loci only in certain conditions, such as development, where expression levels are not typically measured.

Other colocalisation methods
RTC (regulatory trait concordance): requires individual-level data for the eQTL datasets. Conditions on the top GWAS signals and checks whether any eQTL signals are attenuated.
COLOC/MOLOC: uses an approximate Bayes factor to estimate the posterior probability that a variant is causal in both the GWAS and eQTL studies. Initially developed for checking colocalisation between a pair of GWASs using summary statistics, then extended to >2 studies.
Sherlock: Bayesian statistical framework that matches GWAS association signals with eQTL signals for a specific gene in order to detect whether the same variant is causal in both studies. Similar to RTC, Sherlock accounts for the uncertainty of LD. Easy-to-use online server (http://sherlock.ucsf.edu).
Enloc: similar method to eCAVIAR but not cited much.
Piccolo: https://github.com/Ksieber/piccolo

RTC: Nica et al, 2010. Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations. PLoS Genet
Enloc: Wen et al, 2017. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization. PLoS Genet
Sherlock: He et al, 2013. Sherlock: detecting gene-disease associations by matching patterns of expression QTL and GWAS. AJHG
COLOC: Giambartolomei et al, 2014. Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. PLoS Genet
MOLOC: Giambartolomei et al, 2018. A Bayesian Framework for Multiple Trait Colocalization from Summary Association Statistics. Bioinformatics
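To make the "approximate Bayes factor" idea concrete, here is a minimal Python sketch of a Wakefield-style ABF computed from summary statistics, followed by per-variant posteriors under a single-causal-variant assumption. It covers only this one building block, not the full COLOC procedure, which goes on to combine evidence across colocalisation hypotheses. The function names and the `prior_sd` default are illustrative assumptions.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor (association vs. null) from beta and its SE."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def posterior_single_causal(log_abfs):
    """Per-variant posteriors assuming exactly one causal variant and equal priors."""
    largest = max(log_abfs.values())  # subtract the max for numerical stability
    weights = {snp: math.exp(v - largest) for snp, v in log_abfs.items()}
    total = sum(weights.values())
    return {snp: w / total for snp, w in weights.items()}


# Hypothetical summary statistics: (beta, standard error) per SNP.
stats = {"snp_a": (0.12, 0.02), "snp_b": (0.10, 0.02), "snp_c": (0.01, 0.02)}
labfs = {snp: log_abf(b, s) for snp, (b, s) in stats.items()}
print(posterior_single_causal(labfs))
```

Posteriors of this kind are also what a 99% credible set is typically built from, which ties this step back to the credible-set approach used in the current GWAS.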

To do/discuss
All SNPs and genes in Table 1?
Automate the pipeline.
Request Lung eQTL results for all regions.
P-value (0.0027 & 3.2x10^-6) and cut-off (0.1) thresholds?
Other tissues? Blood eQTL? All GTEx tissues?