Chromatin state and DNA sequence in TF binding dynamics and disease

Presentation transcript:

Chromatin state and DNA sequence in TF binding dynamics and disease
Manolis Kellis
Broad Institute of MIT and Harvard
MIT Computer Science & Artificial Intelligence Laboratory

DNA vs. epigenome in dynamics & disease
- TF binding: sequence specificity (motifs) vs. chromatin state, and their interplay (ENCODE; Ernst, Bernstein)
- Disease: genotype (GWAS; CATGACTG vs. CATGCCTG) vs. epigenotype, connected through QTLs (Roadmap; Eaton, De Jager)

States combine histone marks, FAIRE, Pol2, DNase
- [Figure: chromatin-state transition matrix]
- ENCODE datasets: Bernstein, Stam, Lieb, Crawford
- Several classes of DNase-hypersensitive regions
- Do they have different TF-binding properties?

TFs show characteristic chromatin state preferences
- Confirms the TF-DNase relationship
- However: different TFs bind different chromatin states
- Dynamic binding across cell types?

Patterns hold across 300+ TF binding experiments
- What about dynamics?

Dynamic enhancers vs. constitutive CTCF/promoters

Dynamic TF binding and dynamic enhancer activity
- Dynamic enhancers, static promoters
- TF binding correlates with TF expression

TF co-occurrence patterns driven by chromatin state
- Raw enrichments
- Conditional enrichments (if state preference is known)
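
The slide does not say how the raw and conditional enrichments were computed. As a minimal sketch, assume a raw enrichment is observed co-binding divided by the co-binding expected if the two TFs bound independently across all regions, and a conditional enrichment computes that expectation separately within each chromatin state. The function names, the region records and the toy data below are all hypothetical.

```python
def cobinding_counts(regions, tf_a, tf_b):
    """Return (n, n_a, n_b, n_ab): region count, regions bound by A, by B, by both."""
    n_a = sum(tf_a in r["tfs"] for r in regions)
    n_b = sum(tf_b in r["tfs"] for r in regions)
    n_ab = sum(tf_a in r["tfs"] and tf_b in r["tfs"] for r in regions)
    return len(regions), n_a, n_b, n_ab


def raw_enrichment(regions, tf_a, tf_b):
    """Observed / expected co-binding, ignoring chromatin state (assumed definition)."""
    n, n_a, n_b, n_ab = cobinding_counts(regions, tf_a, tf_b)
    return n_ab / (n_a * n_b / n)


def conditional_enrichment(regions, tf_a, tf_b):
    """Observed / expected co-binding, with the expectation summed per state."""
    by_state = {}
    for r in regions:
        by_state.setdefault(r["state"], []).append(r)
    observed, expected = 0, 0.0
    for members in by_state.values():
        n, n_a, n_b, n_ab = cobinding_counts(members, tf_a, tf_b)
        observed += n_ab
        expected += n_a * n_b / n
    return observed / expected


# Toy data (made up): both TFs bind only enhancers, independently of each other.
regions = []
for i in range(100):
    tfs = set()
    if i < 50:
        tfs.add("A")
    if i % 2 == 0:
        tfs.add("B")
    regions.append({"state": "enhancer", "tfs": tfs})
regions += [{"state": "low_signal", "tfs": set()} for _ in range(900)]

print(raw_enrichment(regions, "A", "B"))          # 10.0
print(conditional_enrichment(regions, "A", "B"))  # 1.0
```

In this toy case the tenfold raw enrichment disappears once the shared enhancer preference is accounted for, which is the kind of effect the slide title describes.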

Chromatin state preferences are motif encoded
- States bound by TFs are enriched in the corresponding motifs
- Enrichment also found in states of specific repression

Bound regions in preferred states depleted in motifs
- Permissive binding in promoters/enhancers/insulators
- DNase/FAIRE regions lacking marks: not permissive

Summary
Chromatin states, TF dynamics, and motifs
- TFs bind DNase-hypersensitive regions, with distinct chromatin state preferences
- Chromatin state preferences are partly motif-encoded
- States predict most previously observed co-binding
- Motifs guide states; states enable permissive binding
Methylation vs. genotype in Alzheimer’s disease
- Variability between individuals is mostly genotype-driven
- Most variable: promoter-flanking regions, brain enhancers
- Predictive for AD: global inhibition of 7000 probes
- Enhancers, not promoters; NRSF, ELK1, CTCF targets
Conclusions
- Power of regulatory annotation for interpreting disease
- Interplay of DNA sequence & epigenome in TFs/disease

Interpreting disease-association signals
(1) Interpret variants using ENCODE
- Chromatin states: enhancers, promoters, motifs
- Enrichment in individual loci, across 1000s of SNPs in T1D
(2) Epigenome changes in disease
- Molecular phenotypic changes in patients vs. controls
- Small variation in brain methylomes, mostly genotype-driven
- 1000s of brain-specific enhancers increase methylation in Alzheimer’s
[Diagram: genotype (CATGACTG vs. CATGCCTG), epigenome and disease, connected by GWAS, mQTLs and MWAS]

Methylation in 750 Alzheimer’s patients/controls
- 486,000 methylation probes
- 750 individuals (~50% with AD)
- Cohorts: Memory and Aging Project, Religious Orders Study
- Brad Bernstein: REMC mapping; Philip De Jager: Epigenomics Roadmap
- Patients followed for 10+ years with cognitive evaluations
- Brain samples donated post-mortem: methylation/genotype
- Seek predictive features: SNPs, QTLs, mQTLs, regulation
[Diagram: Genome, Epigenome, Phenotype; meQTL; MWAS classification; steps 1 and 2]

Global variability in DLPFC and CD4+ methylation
- Samples: CD4+ T-cells and dorsolateral prefrontal cortex (DLPFC)
- [Figure: sample-similarity heatmap, scale from most similar to least similar]
- Colors along the top represent gender (M/F); colors along the left indicate “batch” (CD4+ batch vs. DLPFC batch).
- The four red bars in the black section are DLPFC samples run in the CD4+ batch, to make sure that the batch effect wasn’t stronger than the cell-type effect.

Little variability, focused on regulatory regions
- [Figure panels: probe intensity distribution; inter-individual variability]
- Hemi-methylated probes are also the most variable
- Tiny fraction (0.6%) of all probes
- Promoters: stable low (active)
- Gene bodies: stable high (active)
- Enhancers/poised: most variable

Most epigenomic variability is genotype-driven
- Overlaid Manhattan plots of 450,000 methylation probes
- Cutoff of 10^-14 (10^-2 after Benjamini-Hochberg correction)
- 150,000 mQTLs at P<0.01 after FDR correction
- [Figure axes: P-value (-log10 P) against chromosome and genomic position, and against distance from CpG (-1 to 1 Mb)]
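
For readers unfamiliar with the Benjamini-Hochberg correction named on this slide, here is a minimal sketch of the standard step-up procedure. It is the textbook algorithm, not the authors' pipeline, and the p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical SNP-probe tests.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
print(qvals)
print("significant at FDR 0.05:", sum(q <= 0.05 for q in qvals))
```

Each adjusted value is the smallest FDR level at which that test would be called significant, so thresholding the adjusted values at 0.01 corresponds to the slide's "P<0.01 after FDR correction".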

MultimodalSNP-associatedPromoter-depleted All probes 1 Active promoter SNP-associated 2 Promoter flanking Multimodal probes (~3Κ) SNP-associated probes (29% of all) 138,731 184 2,647 3 Active enhancer 4 Weak enhancer 5 Gene bodies 6 Active gene bodies 93.5% of multimodal probes are SNP-associated Importance of distinguishing contribution of genotype to disease associations 7 Repetitive Remember the multi-modal probes that didn’t seem to fall into a functional group? Almost all of them are strongly SNP-associated, implying that their multi-modality is driven by genotype. 8 Heterochromatin 9 Low signal % of CpG probes SNP-associated probes depleted in promoters (driven epigenetically>genetically, open chrom)

>80% variance explained for 50,000+ probes
- [Figure: two panels, significance (q-value) and variance explained (adjusted R^2), each plotted against distance to CpG]
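
The slide reports variance explained as adjusted R^2 but does not state the model. As a minimal sketch, assume a single-SNP additive model in which methylation at one probe is regressed on genotype dosage (0, 1 or 2 copies of an allele). The function name and the data are hypothetical.

```python
def adjusted_r2(genotypes, methylation, n_predictors=1):
    """Adjusted R^2 for a simple linear regression of methylation on dosage."""
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(methylation) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    syy = sum((y - mean_y) ** 2 for y in methylation)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genotypes, methylation))
    # With one predictor, R^2 is the squared Pearson correlation.
    r2 = sxy ** 2 / (sxx * syy)
    # Penalize for the number of predictors relative to the sample size.
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# Made-up dosages and methylation values for six individuals.
dosage = [0, 1, 2, 0, 1, 2]
beta = [0.0, 1.0, 2.0, 1.0, 1.0, 1.0]
print(adjusted_r2(dosage, beta))
```

The function assumes both inputs vary (non-zero variance) and that there are more individuals than predictors plus one.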

Global hyper-methylation in 1000s of AD-associated loci
- QQ plot (expected vs. observed -log P): many loci with weak effects?
- Rank all 480,000 probes by Alzheimer’s association (P-value)
- Observe functional changes down the ranked list
- Top 7000 probes show a shift in methylation
- Alzheimer’s-associated probes are hypermethylated (repressed)
- Global effect across 1000s of probes
- Complex disease: genome-wide effects
- [Figure legend: Alzheimer’s vs. normal]

Chromatin state breakdown reveals activity
- Red: more methylated in Alzheimer’s; blue: less methylated in Alzheimer’s
- Significant probes are in enhancers, not promoters
- * = Fisher’s exact test, p-value <= 0.001
- [Figure: % of probes per chromatin state: 1 Active promoter, 2 Promoter flanking, 3 Active enhancer, 4 Weak enhancer, 5 Gene bodies, 6 Active gene bodies, 7 Repetitive, 8 Heterochromatin, 9 Low signal]
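
The asterisks on this slide mark chromatin states that pass Fisher's exact test. As a minimal sketch of a one-sided version of that test, the 2x2 table below is assumed to be laid out as (significant probes in the state, significant probes elsewhere; other probes in the state, other probes elsewhere). The slide does not give the table layout or say whether the test was one- or two-sided, and the counts in the usage lines are made up.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of seeing at least this much enrichment by chance.
    """
    row1 = a + b          # e.g. significant probes
    col1 = a + c          # e.g. probes in the chromatin state
    n = a + b + c + d     # all probes
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Made-up counts: 3 of 4 significant probes fall in the state,
# against 1 of 4 non-significant probes.
print(fisher_exact_greater(3, 1, 1, 3))
```

A state would earn an asterisk under the slide's rule when this p-value is at most 0.001.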

Estimating number of functionally-associated probes
- Functional enrichments found for 10,000 probes
- [Figure legend: Active TSS flanking, Active enhancer, Poised promoter, Polycomb repressed, Weak enhancer, Promoter, Strong transcription, Weak transcription, Expected]

Predictive power of hyper-methylation signal
- Sum of methylation signal in 1,026 regulatory regions
- Speaker note: the idea is the same as in the previous plot, but restricted to probes that are both in the top 6000 and in either strong enhancers or TSS-flanking regions.
- Sum total methylation levels across the 1,026 probes
- Individuals in the top quintile show 2.5-fold higher risk
- By comparison, the APOE4 allele confers 1.5-fold
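
The scoring described on this slide can be sketched as follows: sum each individual's methylation over the selected probes, take the top 20% of individuals by that score, and compare their disease risk with a reference group. The slide does not say what the top quintile is compared against, so this sketch assumes the remaining 80% of individuals. Function names and data are hypothetical.

```python
def methylation_scores(beta_matrix):
    """Sum methylation across the selected probes, one score per individual."""
    return [sum(betas) for betas in beta_matrix]


def top_quintile_relative_risk(scores, has_disease):
    """Disease risk in the top 20% of scores divided by risk in the rest.

    Assumes at least five individuals and at least one case in the rest.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    k = n // 5
    top, rest = order[:k], order[k:]
    risk_top = sum(has_disease[i] for i in top) / len(top)
    risk_rest = sum(has_disease[i] for i in rest) / len(rest)
    return risk_top / risk_rest


# Made-up data: 10 individuals, 3 probes each, methylation rising with index.
betas = [[i / 10] * 3 for i in range(10)]
status = [0, 1, 0, 1, 0, 1, 0, 1, 1, 1]   # 1 = diagnosed, 0 = control

scores = methylation_scores(betas)
print(top_quintile_relative_risk(scores, status))
```

With these toy values both top-quintile individuals are cases and half of the rest are, so the ratio is 2; the 2.5-fold figure on the slide comes from the real cohort, not from this example.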

AD-associated probes enriched in ELK1/NRSF targets
- [Figure: all probes, ranked by AD association P-value, with motif tracks including CTCF]
- Regulatory motifs enriched in top-scoring probes
- Genomic basis for association, potential cis or trans effect
- Reveals biological pathways involved and potential targets

Collaborators and Acknowledgements
- Chromatin state dynamics, ENCODE: Brad Bernstein, John Stam, Jason Lieb, Crawford
- Methylation in Alzheimer’s disease: Philip De Jager & Gyan Srivastava, Brad Bernstein; Religious Orders Study, Memory and Aging Project
- Large-scale epigenomic datasets: Epigenomics Roadmap, ENCODE project, NHGRI
- Funding: NHGRI, NIH, NSF, Sloan Foundation

MIT Computational Biology group (compbio.mit.edu)
Mike Lin, Ben Holmes, Soheil Feizi, Angela Yen, Luke Ward, Bob Altshuler, Mukul Bansal, Chris Bristow, Stefan Washietl, Pouya Kheradpour, Matt Eaton, Manolis Kellis, Jason Ernst, Irwin Jungreis, Rachel Sealfon, Jessica Wu, Daniel Marbach, Louisa DiStefano, Dave Hendrix, Loyal Goff, Sushmita Roy