Published by Briana Patterson. Modified over 9 years ago.
1
Regulatory variation and its functional consequences Chris Cotsapas cotsapas@broadinstitute.org
2
Motivating questions How do phenotypes vary across individuals? – Regulatory changes drive cellular and organismal traits – Likely also drive evolutionary differences How are genes (co)regulated? – Pathways, processes, contexts
5
Regulatory variation What do “interesting” variants do? Genetic changes to: – Coding sequence ** – Gene expression levels – Splice isomer levels – Methylation patterns – Chromatin accessibility – Transcription factor binding kinetics – Cell signaling – Protein-protein interactions ~88% of GWAS hits are regulatory
6
Genetic variation alters regulation Protein levels – Maize (Damerval 94) Expression levels – Yeast, maize, mouse, humans (Brem 02, Schadt 03, Stranger 05, Stranger 07) RNA splicing – Humans (Pickrell 12, Lappalainen 13) Methylation and DNase I peak strength – Humans (Degner 12; Gibbs 12)
7
Genetics of gene expression (eQTL)
cis-eQTL
– The position of the eQTL maps near the physical position of the gene.
– Promoter polymorphism?
– Insertion/deletion?
– Methylation, chromatin conformation?
trans-eQTL
– The position of the eQTL does not map near the physical position of the gene.
– Regulator?
– Direct or indirect?
Modified from Cheung and Spielman, Nat Genet 2009
8
Cis-eQTL analysis: test SNPs within a predefined distance of the gene
[Figure: SNPs within a 1 Mb window around the gene probe]
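The window selection on this slide can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: the `Snp` and `Gene` records, the toy coordinates, and the choice to measure distance from the transcription start site are all assumptions made for the example.

```python
from dataclasses import dataclass

CIS_WINDOW_BP = 1_000_000  # 1 Mb, the window size quoted on the slide


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int  # base-pair position


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int  # transcription start site; anchoring the window here is an assumption


def cis_snps(gene, snps, window=CIS_WINDOW_BP):
    """Return SNPs on the gene's chromosome within `window` bp of its TSS."""
    return [
        s for s in snps
        if s.chrom == gene.chrom and abs(s.pos - gene.tss) <= window
    ]


# Toy, made-up coordinates (hypothetical identifiers).
gene = Gene("GENE1", "5", 40_000_000)
snps = [
    Snp("snpA", "5", 39_200_000),  # 0.8 Mb away: inside the window
    Snp("snpB", "5", 41_500_000),  # 1.5 Mb away: outside the window
    Snp("snpC", "2", 40_000_100),  # different chromosome
]
cis_ids = [s.snp_id for s in cis_snps(gene, snps)]
print(cis_ids)
```

Only the SNPs returned by `cis_snps` would then be tested against that gene's expression, which keeps the number of tests per gene small compared with a genome-wide scan.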
10
QT association
Analysis of the relationship between a dependent or outcome variable (phenotype) and one or more independent or predictor variables (SNP genotype).
Linear regression equation: $Y_i = \alpha + \beta X_i + \epsilon_i$
Logistic regression equation: $\ln\left(\frac{p_i}{1 - p_i}\right) = \alpha + \beta X_i + \epsilon_i$
[Figure: continuous trait value plotted against number of A1 alleles (0, 1, 2); the slope of the fitted line is $\beta$]
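As a rough illustration of the linear case, the intercept and slope can be estimated by ordinary least squares on allele counts. The genotypes and trait values below are invented for the example, and the helper name `linear_fit` is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data: number of A1 alleles per individual and a continuous trait value.
genotypes = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

intercept, slope = linear_fit(genotypes, trait)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

Here the slope is the estimated change in trait value per additional copy of the A1 allele. In an eQTL setting the trait would be the expression level of one gene.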
11
eQTL analysis: a GWAS for every gene
[Figure: association scans for gene 1, gene 2, gene 3, gene 4, gene 5, ..., gene N]
12
cis-eQTLs are rather common Nica et al PLoS Genet 2011
13
Cis-eQTLs cluster around TSS Stranger et al PLoS Genet 2012
14
trans hotspots (yeast) Brem et al Science 2002
15
Yvert et al Nat Genet 2003
16
DOES REGULATORY VARIATION ALTER PHENOTYPE? APPLICATION TO GWAS Candidate genes, perturbations underlying organismal phenotypes
17
Rationale How do disease/trait variants actually alter biology? If they change regulation, then: – Change in gene expression/isoform use – Phenotypic consequence*
18
Compare patterns of association
[Figure: GWAS peak; eQTL for gene 1; eQTL for gene 2]
19
Pearson’s covariance for windows of 51 SNPs between –log(p) in 2 traits
– Detect a peak when the effect is the same
– No peak when there are independent hits near each other
[Figure: CD GWAS p and eQTL p]
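A sliding-window comparison of this kind might look like the sketch below. It assumes the statistic is Pearson's correlation coefficient between –log10(p) values in the two traits, computed over consecutive windows of SNPs. The slide does not give the exact normalisation, so treat that as an assumption. The function names and the toy p-values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def windowed_correlation(p_gwas, p_eqtl, window=51):
    """Correlate -log10(p) of two traits in each run of `window` consecutive SNPs."""
    log_gwas = [-math.log10(p) for p in p_gwas]
    log_eqtl = [-math.log10(p) for p in p_eqtl]
    return [
        pearson(log_gwas[i:i + window], log_eqtl[i:i + window])
        for i in range(len(log_gwas) - window + 1)
    ]


# Toy example with a small window: both traits share the same association peak.
p_values = [0.5, 0.1, 1e-3, 1e-6, 1e-3, 0.1, 0.5]
shared_peak = windowed_correlation(p_values, p_values, window=5)
print(shared_peak)
```

When the two traits share one signal, the windows around it score close to 1. Two independent hits that merely sit near each other produce dissimilar profiles and a much lower score.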
20
Crohn’s/eQTL analysis
– CD meta-analysis (GWAS only)
– CEU HapMap LCL eQTL data
– Overlapping SNPs only (eQTL data has 610K SNPs, most in CD meta-analysis)
– Test 133 associations (total 1054 tests)
[Figure: GWAS peak; eQTL for gene 1; eQTL for gene 2]
21
Crohn’s/eQTL analysis
SNP          CHR   Gene
rs11742570   5     PTGER4
rs12994997   2     ATG16L1
rs11401      16    SPNS1
rs10781499   9     INPP5E
rs2266959    22    C22orf29
A peak implies that the same effect drives GWAS and eQTL
23
MS/eQTL analysis
SNP           CHR   Gene
rs6880778     5     PTGER4
rs7132277     12    CDK2AP
rs7665090     4     CISD2
rs2255214     3     GOLGB1 & EAF2
rs201202118   12    METTL1 & TSFM
rs12946510    17    ORMDL3, STARD3 & ZPBP2
rs2283792     22    PPM1F
rs7552544     1     SLC30A7
rs34536443    19    SLC44A2
A peak implies that the same effect drives GWAS and eQTL
26
DOES REGVAR REVEAL CO-REGULATION? A.K.A. WHERE ARE THE TRANS eQTLS? Open question
27
Whole-genome eQTL analysis is an independent GWAS for expression of each gene
[Figure: association scans for gene 1, gene 2, gene 3, gene 4, gene 5, ..., gene N]
28
Issues with trans mapping
Power
– Genome-wide significance is 5e-8
– Multiple testing on ~20K genes
– Sample sizes clearly inadequate
Data structure
– Bias corrections deflate variance
– Non-normal distributions
Sample sizes
– Far too small
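To make the power problem concrete, the two numbers on this slide can be combined into a single per-test threshold. The sketch assumes a simple Bonferroni correction across genes, which the slide does not specify. The variable names are illustrative.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # per-scan significance threshold quoted on the slide
N_GENES = 20_000          # "~20K genes" on the slide

# Assumption: Bonferroni correction across one genome-wide scan per gene.
trans_alpha = GENOME_WIDE_ALPHA / N_GENES
print(f"per-test threshold: {trans_alpha:.1e}")
print(f"-log10 threshold: {-math.log10(trans_alpha):.1f}")
```

Under that assumption a trans association would need a p-value on the order of 1e-12, which samples of roughly a hundred individuals are unlikely to reach for modest effects.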
29
But… Assume that trans eQTLs affect many genes… …and you can use cross-trait methods!
30
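Association data
$$\begin{pmatrix} Z_{1,1} & Z_{1,2} & \cdots & Z_{1,p} \\ Z_{2,1} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ Z_{s,1} & \cdots & \cdots & Z_{s,p} \end{pmatrix}$$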
31
Cross-phenotype meta-analysis
$$S_{CPMA} \sim \frac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$$
Cotsapas et al, PLoS Genetics
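One way to read the likelihood ratio on this slide is as a test on the rate of an exponential distribution. The sketch below makes two assumptions: under the null, –ln(p) across phenotypes is exponentially distributed with rate λ = 1, and the alternative is evaluated at the maximum-likelihood estimate of λ. The function name is hypothetical, and this illustrates the idea rather than reproducing the published implementation.

```python
import math


def cpma_statistic(p_values):
    """Likelihood-ratio statistic for the exponential rate of -ln(p) differing from 1.

    Assumes all p-values lie in (0, 1] and at least one is below 1.
    """
    x = [-math.log(p) for p in p_values]
    n = len(x)
    total = sum(x)
    lam_hat = n / total  # maximum-likelihood estimate of the exponential rate
    loglik_alt = n * math.log(lam_hat) - lam_hat * total
    loglik_null = -total  # log-likelihood with the rate fixed at 1
    return 2.0 * (loglik_alt - loglik_null)


# Toy inputs: p-values whose -ln(p) averages 1 match the null exactly...
null_like = cpma_statistic([math.exp(-1.0)] * 4)
# ...while uniformly small p-values across phenotypes depart from it.
enriched = cpma_statistic([1e-4] * 4)
print(null_like, enriched)
```

The statistic grows when a SNP's p-values across many phenotypes are collectively more extreme than chance predicts, even if no single association is significant on its own.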
32
CPMA for correlated traits
– Empirical assessment to account for correlation
– Simulate Z scores under covariance, recalculate CPMA
– Construct distribution of CPMA for dataset, call significance
with Ben Voight, U Penn
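The empirical procedure above can be sketched as follows, under several stated assumptions. Null Z scores are drawn from a multivariate normal with the trait covariance via a Cholesky factor, converted to two-sided p-values, and scored with an exponential-rate likelihood ratio standing in for CPMA. All function names, the toy covariance matrix, and the add-one p-value estimator are choices made for the illustration.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def simulate_z(low, rng):
    """Draw one vector of correlated Z scores with covariance low * low^T."""
    e = [rng.gauss(0.0, 1.0) for _ in low]
    return [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(len(low))]


def z_to_p(z):
    """Two-sided p-value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def cpma(p_values):
    """Stand-in CPMA: likelihood ratio for exponential rate of -ln(p) != 1."""
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    lam_hat = n / total
    return 2.0 * ((n * math.log(lam_hat) - lam_hat * total) + total)


def empirical_p(observed, cov, n_sim=1000, seed=0):
    """Share of simulated null statistics at least as large as `observed`."""
    rng = random.Random(seed)
    low = cholesky(cov)
    null = [cpma([z_to_p(z) for z in simulate_z(low, rng)]) for _ in range(n_sim)]
    exceed = sum(s >= observed for s in null)
    return (exceed + 1) / (n_sim + 1)  # add-one estimator avoids p = 0


# Toy covariance: three traits with pairwise correlation 0.5 (made up).
cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(empirical_p(6.0, cov, n_sim=500))
```

Because the null distribution is rebuilt from Z scores that carry the observed correlation, correlated traits do not inflate significance the way they would against a fixed reference distribution.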
33
Experimental design
– 610,180 SNPs: MAF > 0.15, CEU and YRI; LD pruned (r² < 0.2)
– 8368 transcripts: detectable on Illumina arrays
– 108 CEU individuals*; 109 YRI individuals*
– plink: Transcript ~ SNP, sex, giving CEU p-values and YRI p-values
– CPMA: CEU CPMA scores and YRI CPMA scores
– >95%ile sim CPMA
* Stranger et al Nat Genet 2007 (LCL data; publicly available)
34
Target sets of genes trans-acting variant: SNP with CPMA evidence Target genes: genes affected by trans-acting variant (i.e. regulon)
35
Prediction 1
Allelic effects should be conserved between two populations
– Binomial test on paired observations for all genes P < 0.05 in at least one population
True for 1124/1311 SNPs (binomial p < 0.05)
[Figure: effect directions (+/–) per gene in CEU and YRI, for genes with p CEU < 0.05 and genes with p YRI < 0.05]
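A sign-concordance test of this kind could be sketched as below. For one SNP, it counts the genes whose effect has the same direction in both populations and asks whether that count exceeds the 50% expected by chance. The one-sided alternative, the function names, and the toy effect sizes are assumptions for the illustration.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def concordance_test(effects_a, effects_b):
    """Count same-sign effect pairs and return (concordant, total, one-sided p)."""
    pairs = [(a, b) for a, b in zip(effects_a, effects_b) if a != 0 and b != 0]
    concordant = sum((a > 0) == (b > 0) for a, b in pairs)
    return concordant, len(pairs), binom_upper_tail(concordant, len(pairs))


# Toy effect estimates for ten genes at one SNP (made-up numbers).
beta_ceu = [0.5, 0.3, -0.2, -0.4, 0.1, 0.6, -0.3, 0.2, -0.1, 0.4]
beta_yri = [0.4, 0.2, -0.1, -0.5, 0.2, 0.3, -0.2, 0.1, -0.3, -0.2]

concordant, total, p_value = concordance_test(beta_ceu, beta_yri)
print(concordant, total, p_value)
```

Running this per SNP and counting how many reach p < 0.05 would give a tally comparable in form to the 1124/1311 reported on the slide.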
36
Prediction 2
Target genes should overlap
– Identify by mixture of Gaussians classification
– Empirical p from distribution of overlaps between N CEU and N YRI genes across SNPs
True for 600/1311 SNPs (empirical p < 0.05)
[Figure: overlap of genes with p CEU < 0.05 and genes with p YRI < 0.05]
37
What about the target genes? Regulons: – Encode proteins more connected than expected by chance www.broadinstitute.org/mpg/dapple.php Rossin et al 2011 PLoS Genetics
38
What about the target genes?
Regulons:
– Encode proteins enriched for TF targets (ENCODE LCL data)
– 24/67 filtered TFs significant
– Binomial overlap test
TF       p-value
CEBPB    3.7 x 10^-142
HDAC8    7.8 x 10^-122
FOS      2.5 x 10^-96
JUND     3.7 x 10^-88
NFYB     3.3 x 10^-71
ETS1     3.8 x 10^-63
FAM48A   2.1 x 10^-61
FOXA1    1.4 x 10^-33
GATA1    4.6 x 10^-33
HEY1     7.8 x 10^-32
[Figure: overlap of trans target genes and ChIP-seq LCL target genes]
39
Summary Regulatory variation is common It affects gene expression levels Likely many other types: – DNA accessibility, chromatin states – Transcript splicing, processing, turnover Has phenotypic consequences – GWAS – Some cellular assays (not discussed here)
40
Open questions Discover regulatory elements (cis) – Promoters, enhancers etc Gene regulatory circuits (trans) Dynamics of regulation – Splicing variation, processing, degradation Phenotypic consequences – Cellular assays required Tie in to organismal phenotype
41
NEXT-GEN SEQUENCING DATA RNAseq, GTEx
42
GTEx – Genotype-Tissue EXpression An NIH common fund project Current: 35 tissues from 50 donors Scale up: 20K tissues from 900 donors. Novel methods groups: 5 current + RFA
43
How can we make RNAseq useful? Standard eQTLs – Montgomery et al, Pickrell et al Nature 2010 Isoform eQTLs – Depth of sequence! Long genes are preferentially sequenced Abundant genes/isoforms ditto Power!? Mapping biases due to SNPs
44
RNAseq combined with other techs Regulons: TF gene sets via ChIP-seq – Look for trans effects Open chromatin states (DNase I; methylation) – Find active genes – Changes in epigenetic marks correlated to RNA – Genetic effects RNA/DNA comparisons – Simultaneous SNP detection/genotyping – RNA editing ???