Regulatory variation and its functional consequences Chris Cotsapas


1 Regulatory variation and its functional consequences Chris Cotsapas cotsapas@broadinstitute.org

2 Motivating questions How do phenotypes vary across individuals? – Regulatory changes drive cellular and organismal traits – Likely also drive evolutionary differences How are genes (co)regulated? – Pathways, processes, contexts

3

4

5 Regulatory variation What do “interesting” variants do? Genetic changes to: – Coding sequence ** – Gene expression levels – Splice isomer levels – Methylation patterns – Chromatin accessibility – Transcription factor binding kinetics – Cell signaling – Protein-protein interactions ~88% of GWAS hits are regulatory

6 Genetic variation alters regulation Protein levels – Maize (Damerval 94) Expression levels – Yeast, maize, mouse, humans (Brem 02, Schadt 03, Stranger 05, Stranger 07) RNA splicing – Humans (Pickrell 12, Lappalainen 13) Methylation and DNase I peak strength – Humans (Degner 12; Gibbs 12)

7 Genetics of gene expression (eQTL) cis-eQTL – The position of the eQTL maps near the physical position of the gene. – Promoter polymorphism? – Insertion/Deletion? – Methylation, chromatin conformation? trans-eQTL – The position of the eQTL does not map near the physical position of the gene. – Regulator? – Direct or indirect? Modified from Cheung and Spielman 2009 Nat Gen

8 Cis-eQTL analysis: Test SNPs within a pre-defined distance of gene [Figure: SNPs in a 1Mb window around the gene probe]
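A minimal sketch of the cis-window selection on this slide. It assumes the 1Mb window extends on either side of the gene; the `Snp` record, the `cis_snps` helper and the toy coordinates are hypothetical and only illustrate the filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int  # base-pair position on the chromosome


def cis_snps(snps, gene_chrom, gene_start, gene_end, window=1_000_000):
    """Return SNPs on the gene's chromosome within `window` bp of the gene."""
    lo = max(0, gene_start - window)
    hi = gene_end + window
    return [s for s in snps if s.chrom == gene_chrom and lo <= s.pos <= hi]


# Toy data (hypothetical identifiers and positions).
snps = [
    Snp("snpA", "1", 500_000),
    Snp("snpB", "1", 2_600_000),
    Snp("snpC", "2", 1_200_000),
    Snp("snpD", "1", 1_900_000),
]
nearby = cis_snps(snps, gene_chrom="1", gene_start=1_000_000, gene_end=1_050_000)
print([s.snp_id for s in nearby])
```

Only the SNPs returned by the filter would be tested against that gene's expression, which keeps the number of tests per gene small compared with a genome-wide scan.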

9

10 QT association Analysis of the relationship between a dependent or outcome variable (phenotype) with one or more independent or predictor variables (SNP genotype)
Linear Regression Equation: $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ (slope: $\beta_1$)
Logistic Regression Equation: $\ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_i + \epsilon_i$
[Figure: continuous trait value plotted against number of A1 alleles (0, 1, 2)]
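The linear model on this slide can be sketched as an ordinary least-squares fit of trait value on allele count. This is an illustrative sketch only: the function name `fit_additive_model` and the toy data are assumptions, and a real analysis would also report a standard error and p-value for the slope.

```python
def fit_additive_model(genotypes, trait):
    """Least-squares fit of trait = intercept + slope * (number of A1 alleles)."""
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, trait))
    slope = sxy / sxx                    # change in trait per extra A1 allele
    intercept = mean_y - slope * mean_x  # expected trait at 0 copies of A1
    return intercept, slope


# Toy data (hypothetical): allele counts 0/1/2 and a continuous trait.
genotypes = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
intercept, slope = fit_additive_model(genotypes, trait)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

In an eQTL setting the trait is the expression level of one gene, so the same fit is repeated for every SNP-gene pair.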

11 eQTL analysis: a GWAS for every gene [Figure: one association scan per gene, gene 1 to gene N]

12 cis-eQTLs are rather common Nica et al PLoS Genet 2011

13 Cis-eQTLs cluster around TSS Stranger et al PLoS Genet 2012

14 trans hotspots (yeast) Brem et al Science 2002

15 Yvert et al Nat Genet 2003

16 DOES REGULATORY VARIATION ALTER PHENOTYPE? APPLICATION TO GWAS Candidate genes, perturbations underlying organismal phenotypes

17 Rationale How do disease/trait variants actually alter biology? If they change regulation, then: – Change in gene expression/isoform use – Phenotypic consequence*

18 Compare patterns of association [Figure panels: GWAS peak; eQTL for gene 1; eQTL for gene 2]

19 Pearson’s covariance for windows of 51 SNPs between –log(p) in 2 traits (CD GWAS p, eQTL p) – Detect a peak when effect is the same – No peak when there are independent hits near each other
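A rough sketch of the windowed comparison described on this slide. It assumes the statistic is the Pearson correlation coefficient of -log10(p) between the two traits, computed over sliding windows of consecutive SNPs; the slide gives the window size of 51 but not the exact normalization, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def windowed_correlation(gwas_p, eqtl_p, window=51):
    """Correlate -log10(p) of two traits over each run of `window` consecutive SNPs."""
    a = [-math.log10(p) for p in gwas_p]
    b = [-math.log10(p) for p in eqtl_p]
    return [
        pearson(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    ]


# Toy example with a small window: both traits share one association peak.
gwas_p = [0.5, 0.1, 1e-4, 1e-8, 1e-4, 0.1, 0.5]
shared = windowed_correlation(gwas_p, gwas_p, window=5)
# An independent eQTL peak a few SNPs away gives a weaker correlation.
shifted = windowed_correlation(gwas_p, [1e-8, 1e-4, 0.1, 0.5, 0.5, 0.5, 0.4], window=5)
print(shared, shifted)
```

Windows where both traits rise and fall together score near 1, which is the "peak" the slide refers to; nearby but independent signals do not line up SNP by SNP and score lower.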

20 Crohn’s/eQTL analysis – CD meta analysis (GWAS only) – CEU HapMap LCL eQTL data – Overlapping SNPs only (eQTL data has 610K SNPs, most in CD meta-analysis) – Test 133 associations (total 1054 tests) [Figure panels: GWAS peak; eQTL for gene 1; eQTL for gene 2]

21 Crohn’s/eQTL analysis
SNP         CHR  Gene
rs11742570  5    PTGER4
rs12994997  2    ATG16L1
rs11401     16   SPNS1
rs10781499  9    INPP5E
rs2266959   2    C22orf29
A peak implies that the same effect drives GWAS and eQTL

22

23 MS/eQTL analysis
SNP          CHR  Gene
rs6880778    5    PTGER4
rs7132277    12   CDK2AP
rs7665090    4    CISD2
rs2255214    3    GOLGB1 & EAF2
rs201202118  12   METTL1 & TSFM
rs12946510   17   ORMDL3, STARD3 & ZPBP2
rs2283792    22   PPM1F
rs7552544    1    SLC30A7
rs34536443   19   SLC44A2
A peak implies that the same effect drives GWAS and eQTL

24

25

26 DOES REGVAR REVEAL CO-REGULATION? A.K.A. WHERE ARE THE TRANS eQTLS? Open question

27 Whole-genome eQTL analysis is an independent GWAS for expression of each gene [Figure: one association scan per gene, gene 1 to gene N]

28 Issues with trans mapping Power – Genome-wide significance is 5e-8 – Multiple testing on ~20K genes – Sample sizes clearly inadequate Data structure – Bias corrections deflate variance – Non-normal distributions Sample sizes – Far too small
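To make the power issue concrete, here is the arithmetic under one simple assumption: a Bonferroni-style correction that divides the genome-wide threshold by the number of expression traits. The slide does not say which correction is intended, so treat this as an illustration of scale only.

```python
# Assumption: Bonferroni correction across genes; the slide does not name a method.
GENOME_WIDE_ALPHA = 5e-8  # per-trait genome-wide significance threshold (from the slide)
N_GENES = 20_000          # approximate number of expression traits (from the slide)

per_test_alpha = GENOME_WIDE_ALPHA / N_GENES
print(f"per-test threshold: {per_test_alpha:.1e}")
```

A threshold this stringent is far beyond what cohorts of around a hundred individuals can reach for modest trans effects, which is the motivation for pooling evidence across genes.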

29 But… Assume that trans eQTLs affect many genes… …and you can use cross-trait methods!

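30 Association data
$$\begin{pmatrix} Z_{1,1} & Z_{1,2} & \cdots & Z_{1,p} \\ Z_{2,1} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ Z_{s,1} & \cdots & \cdots & Z_{s,p} \end{pmatrix}$$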

31 Cross-phenotype meta-analysis $S_{CPMA} \sim \frac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$ Cotsapas et al, PLoS Genetics
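One way to read the likelihood ratio on this slide is sketched below. It assumes that, for a single SNP, -ln(p) across phenotypes is modelled as exponential with rate λ, so that λ = 1 corresponds to uniform p-values under the null; the alternative uses the maximum-likelihood estimate of λ. That modelling choice and the function name `cpma_statistic` are assumptions for illustration, not something the slide states.

```python
import math


def cpma_statistic(p_values):
    """Likelihood-ratio statistic for exponential rate lambda != 1 versus lambda = 1.

    Assumes x = -ln(p) ~ Exponential(lambda), with log-likelihood
    n * ln(lambda) - lambda * sum(x).
    """
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    lam_hat = n / total                                   # MLE of the rate
    loglik_null = -total                                  # lambda = 1
    loglik_alt = n * math.log(lam_hat) - lam_hat * total  # lambda = lam_hat
    return lam_hat, 2.0 * (loglik_alt - loglik_null)


# Toy p-values for one SNP across five traits (hypothetical numbers).
lam_hat, stat = cpma_statistic([0.5, 0.04, 0.003, 0.2, 0.01])
print(f"lambda_hat={lam_hat:.3f} statistic={stat:.3f}")
```

An estimated rate below 1 means the p-values are, on average, smaller than expected by chance, which is the pattern expected when one variant influences many traits.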

32 CPMA for correlated traits Empirical assessment to account for correlation – Simulate Z scores under covariance, recalculate CPMA – Construct distribution of CPMA for dataset, call significance. With Ben Voight, U Penn
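The empirical procedure on this slide might look roughly like the sketch below: draw null Z scores with the observed trait covariance, recompute the statistic for each draw, and compare the observed value with that simulated distribution. The exponential form of the statistic, the two-sided conversion from Z to p, and all function names are assumptions made for illustration.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L of a positive-definite matrix, with L L^T = cov."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def cpma(z_scores):
    """Assumed statistic: likelihood ratio for the exponential rate of -ln(p)."""
    # Two-sided p-value from a Z score; clamped to avoid log(0).
    x = [-math.log(max(math.erfc(abs(z) / math.sqrt(2)), 1e-300)) for z in z_scores]
    n, total = len(x), sum(x)
    return 2.0 * (n * math.log(n / total) - n + total)


def empirical_p(observed_z, cov, n_sim=2000, seed=0):
    """Fraction of simulated null statistics at least as large as the observed one."""
    rng = random.Random(seed)
    L = cholesky(cov)
    n = len(cov)
    observed = cpma(observed_z)
    exceed = 0
    for _ in range(n_sim):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated null Z scores: z = L e has covariance L L^T = cov.
        z = [sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        if cpma(z) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)


# Toy example: three positively correlated traits (hypothetical covariance).
cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
p_emp = empirical_p([4.0, 4.2, 3.9], cov, n_sim=500)
print(f"empirical p = {p_emp:.4f}")
```

Because the null draws share the correlation structure of the real traits, correlated expression alone does not inflate the evidence for a shared genetic effect.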

33 Experimental design – 610,180 SNPs: MAF > 0.15, CEU and YRI, LD pruned (r² < 0.2) – 8368 transcripts: detectable on Illumina arrays – 108 CEU individuals*, 109 YRI individuals* – plink: CEU p-values (Transcript ~ SNP, sex), YRI p-values (Transcript ~ SNP, sex) – CPMA: CEU CPMA scores, YRI CPMA scores – Threshold: >95%ile sim CPMA * Stranger et al Nat Genet 2007 (LCL data; publicly available)

34 Target sets of genes trans-acting variant: SNP with CPMA evidence Target genes: genes affected by trans-acting variant (i.e. regulon)

35 Prediction 1 Allelic effects should be conserved between two populations – Binomial test on paired observations for all genes P < 0.05 in at least one population True for 1124/1311 SNPs (binomial p < 0.05) [Figure: direction of effect (+/–) per gene in CEU and YRI, for genes with p < 0.05 in CEU or YRI]
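A sketch of the sign-concordance test for one SNP, assuming a one-sided binomial test against a 50% chance of agreement between populations. The slide does not specify sidedness or the null proportion, and `sign_concordance_p` and the toy effects are hypothetical.

```python
from math import comb


def sign_concordance_p(effects_a, effects_b):
    """One-sided binomial test that effect directions agree more often than chance.

    Each input holds one effect estimate per gene; genes with a zero effect
    in either population are skipped because they have no direction.
    """
    pairs = [(a, b) for a, b in zip(effects_a, effects_b) if a != 0 and b != 0]
    n = len(pairs)
    k = sum(1 for a, b in pairs if (a > 0) == (b > 0))
    # P(X >= k) for X ~ Binomial(n, 0.5)
    p = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p


# Toy effects for ten genes (hypothetical): nine agree in sign, one disagrees.
ceu = [0.4, 0.2, -0.3, -0.5, 0.1, 0.6, -0.2, 0.3, -0.4, 0.2]
yri = [0.3, 0.1, -0.2, -0.6, 0.2, 0.5, -0.1, 0.4, -0.3, -0.1]
k, n, p = sign_concordance_p(ceu, yri)
print(k, n, p)
```

Running this per SNP over the genes that pass the p < 0.05 filter in at least one population gives one binomial p-value per candidate trans-acting variant.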

36 Prediction 2 Target genes should overlap – Identify by mixture of Gaussians classification – Empirical p from distribution of overlaps between N CEU and N YRI genes across SNPs. True for 600/1311 SNPs (empirical p < 0.05) [Figure: overlap of genes with p < 0.05 in CEU and genes with p < 0.05 in YRI]

37 What about the target genes? Regulons: – Encode proteins more connected than expected by chance www.broadinstitute.org/mpg/dapple.php Rossin et al 2011 PLoS Genetics

38 What about the target genes? Regulons: – Encode proteins enriched for TF targets (ENCODE LCL data) – 24/67 filtered TFs significant – Binomial overlap test
TF      p-value
CEBPB   3.7 x 10^-142
HDAC8   7.8 x 10^-122
FOS     2.5 x 10^-96
JUND    3.7 x 10^-88
NFYB    3.3 x 10^-71
ETS1    3.8 x 10^-63
FAM48A  2.1 x 10^-61
FOXA1   1.4 x 10^-33
GATA1   4.6 x 10^-33
HEY1    7.8 x 10^-32
[Figure: overlap of trans target genes with ChIP-seq LCL target genes]

39 Summary Regulatory variation is common It affects gene expression levels Likely many other types: – DNA accessibility, chromatin states – Transcript splicing, processing, turnover Has phenotypic consequences – GWAS – Some cellular assays (not discussed here)

40 Open questions Discover regulatory elements (cis) – Promoters, enhancers etc Gene regulatory circuits (trans) Dynamics of regulation – Splicing variation, processing, degradation Phenotypic consequences – Cellular assays required Tie in to organismal phenotype

41 NEXT-GEN SEQUENCING DATA RNAseq, GTEx

42 GTEx – Genotype-Tissue EXpression An NIH common fund project Current: 35 tissues from 50 donors Scale up: 20K tissues from 900 donors. Novel methods groups: 5 current + RFA

43 How can we make RNAseq useful? Standard eQTLs – Montgomery et al, Pickrell et al Nature 2010 Isoform eQTLs – Depth of sequence! Long genes are preferentially sequenced Abundant genes/isoforms ditto Power!? Mapping biases due to SNPs

44 RNAseq combined with other techs Regulons: TF gene sets via ChIP-seq – Look for trans effects Open chromatin states (DNase I; methylation) – Find active genes – Changes in epigenetic marks correlated to RNA – Genetic effects RNA/DNA comparisons – Simultaneous SNP detection/genotyping – RNA editing ???

