Ross Hardison Department of Biochemistry and Molecular Biology

Presentation transcript:

Integrative analysis of epigenomes illuminates differentiation and diseases of blood cells
Ross Hardison, Department of Biochemistry and Molecular Biology, Huck Institute for Genomics, Penn State University
9/29/16, Bioinformatics and Genomics, UNC Charlotte

Simplified scheme of hematopoiesis
[Figure: lineage tree with HSC, CMP, MEP, GMP, CLP, MEG, ERY, EOS, Mast, GRA, MONO, T, B, NK; annotation: 2 M sec-1]

Differentiation and diseases of blood cells
- Lineage specific binding of key transcription factors drives expression patterns that determine cell type
  - Maps of transcription factor occupancy inform models of regulation
  - Cell specific phenotypes arise from lineage-specific binding of transcription factors at distinct sites
- ValIdated Systematic IntegratiON: A VISION for epigenomics in hematopoietic gene regulation
  - Measure distances between cell types by quantitative comparisons of chromatin accessibility landscapes and transcriptomes
  - Integrative analysis of epigenomics can improve prediction of enhancers
  - Formal modeling to understand regulation of a locus and regulatory output of each cis-regulatory module
- Use this information to increase accuracy of search for genetic variants in regulatory regions to explain phenotypes

The guiding principle of developmental biology: Differential gene expression determines the distinctive properties of each cell type.
E.H. Davidson, 1976, Gene Activity in Early Development, 2nd ed.

Lineage specific binding of key transcription factors drives expression patterns that determine cell type

GATA1 is required for production of erythrocytes, megakaryocytes, mast cells, and eosinophils
[Figure labels: ES cells, Gata1-; blastocyst; chimeric mouse; lineage tree (HSC, CMP, CLP, GMP, MEP, MEG, ERY, GRA, MONO, T, B, NK, EOS, Mast) with four lineages marked X]
Did the Gata1- ES cells contribute to specific lineages?
Pevny et al. 1991. Nature 349:257; Pevny et al. 1995. Development 121:163 and subsequent papers, multiple alleles of Gata1
S.H. Orkin (1995) J. Biol. Chem. 270: 4955-4958.

Lineage-restricted TFs determine hematopoietic cell fate
[Figure: lineage tree (HSC, CMP, CLP, GMP, MEP, MEG, ERY, GRA, MONO, T, B, NK, EOS, Mast) labeled with the TFs TAL1, GATA2, IKAROS, PU.1, GATA1, PAX5, CEBPA, FLI1, LMO2, GATA3, KLF1]

Cell-restricted transcription factors regulate target genes positively and negatively

Erythroid differentiation in cultured and primary cells
[Figure labels: G1E-ER4, BFU-E]
Weiss, Yu, Orkin (1997) Mol. Cell. Biol. 17: 1642
Welch et al. (2004) Blood 104: 3146
Wu et al. (2011) Genome Res. 21: 1659-1671
Pilon, Subramanian, Kumar et al. (2011) Blood, Epub Sep 2011

Transcriptional response to GATA1-ER activation in G1E cells
[Figure: differentially expressed genes, induced and repressed, at 3, 7, 14, 24, and 30 hr; panels A and B]
Platform | Genes induced | Genes repressed | Reference
Affymetrix microarrays | 1048 | 1568 | Cheng et al. 2009. Genome Res 19:2172
RNA-seq, polyA+ RNA | 1416 | 1039 | Jain, Mishra et al. 2015. Genomics Data 4:1-7

TFs regulate lineage-specific genes
[Figure labels: GATA1 + Induced gene, WGATAR; GATA1 - Repressed gene, WGATAR]
Contexts must differ between induced and repressed genes: Sequence, motifs? Other TFs? Co-activators? Co-repressors? Chromatin? Nuclear location?
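
As an illustration of what a "motif instance" means for WGATAR, here is a minimal sketch that scans a DNA string on both strands. It assumes the standard IUPAC codes (W = A or T, R = A or G); the function name `find_wgatar` and the example sequence are hypothetical and do not come from the talk.

```python
import re

# IUPAC: W = A/T, R = A/G. The reverse complement of WGATAR is YTATCW (Y = C/T).
# Lookaheads are used so that overlapping instances would also be reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")

def find_wgatar(seq):
    """Return sorted (0-based start, strand, matched text) for each WGATAR instance."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(hits)

# Hypothetical sequence with one instance on each strand.
example = "CCAGATAAGGTTCTTATCTGG"
print(find_wgatar(example))
```

A scan like this finds the motif in both induced and repressed genes alike, which is the slide's point: the motif by itself cannot explain the direction of regulation.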

GATA1 occupancy genome-wide: clues about regulation
Yong Cheng, Ying Zhang, G. Celine Han

Locations of occupancy by GATA1
- ChIP-chip: ~3,558 sites. Cheng et al. 2009.
- ChIP-seq: ~14,000 sites. Erythroblasts, Wu et al. 2011; Pimkin et al. 2014.
- ChIP-exo: ~10,000 sites. Han et al. 2016. Mol. Cell Biol.
Jain, Mishra et al. 2015. Genomics Data 4:1-7.

Distinguishing features of GATA1-mediated gene induction
- GATA1 tends to bind close to the TSS: most often in the first intron, but frequently in the proximal flanking region
- Multiple GATA1 OSs: 58% of induced genes, 24% of repressed genes
- Evolutionary constraint on the GATA motif instances
- Region around the TSS depleted of H3K27me3
Cheng et al. (2009) Genome Res. 19:2172-2184

TAL1 + GATA1 = induction
Gerd Blobel, Weisheng Wu
Tripic et al. (2009) Blood 113: 2191
Cheng et al. (2009) Genome Res. 19: 2172
Wu et al. (2011) Genome Res. 21: 1659

~15,000 GATA1-bound sites
- More than the number of GATA1-responsive genes (~2,500): an average of 6 bound sites per responsive gene
- Far fewer than the number of GATA1 binding site motif instances (~8 million): about 1 bound site per 500 motif instances
- Considering DNA segments (500 bp) containing at least one motif instance, about 1 in 150 DNA segments is bound
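
The ratios on this slide follow from the rounded counts it quotes. The short check below redoes that arithmetic; the variable names are illustrative, and the "1 in 150 segments" figure is not recomputed because the slide does not give the number of motif-containing segments.

```python
# Approximate counts quoted on the slide.
bound_sites = 15_000
responsive_genes = 2_500
motif_instances = 8_000_000

# Average number of bound sites per GATA1-responsive gene.
sites_per_gene = bound_sites / responsive_genes

# Motif instances per bound site; the slide rounds this to "about 1 per 500".
motifs_per_bound_site = motif_instances / bound_sites

print(f"{sites_per_gene:.0f} bound sites per responsive gene")
print(f"about 1 bound site per {motifs_per_bound_site:.0f} motif instances")
```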

Determinants of GATA1 occupancy: Chromatin >> motifs
- Study DNA segments comparable in size to ChIP-chip peaks (500 bp) that also have a match to a GATA1 binding site motif
- What distinguishes GATA1-bound from unbound segments?
- Additional motifs increase discriminatory power only 2-fold.
- A mark of active chromatin (H3K4me1) increases discrimination 25-fold.
Zhang et al. 2009. Nucleic Acids Res 37:7024. Ying Zhang

Epigenetic features associated with transcriptional regulation, assayed genome-wide
[Figure labels: Repressed chromatin, Enhancer, Promoter, Repressed chromatin, H3K27ac]

Changes in TF occupancy drive differential regulation
Maxim Pimkin, Chris Morrissey, Tejas Mishra, Deepti Jain, Weisheng Wu, ...

Most GATA1 and TAL1 binding sites are distinctive to ERYs vs MEGs
The TFs GATA1 and TAL1 are required for production of both erythroblasts and megakaryocytes.
Pimkin et al. (2014) Genome Research 24: 1932

Major shifts in TAL1 occupancy during hematopoiesis
Wu et al. (2014) Genome Research 24: 1945

ValIdated Systematic IntegratiON: A VISION for epigenomics in hematopoietic gene regulation
Ross Hardison, Department of Biochemistry and Molecular Biology, Huck Institute for Genome Sciences, Penn State University

Rationale for the VISION project
Acquisition of genome-wide epigenetic data across hematopoiesis is no longer the major barrier to understanding mechanisms of gene regulation during normal and pathological tissue development.
The chief challenges are how to:
- integrate epigenetic data in terms that are accessible and understandable to a broad community of researchers
- build validated quantitative models explaining how the dynamics of gene expression relate to epigenetic features
- translate information effectively from mouse models to potential applications in human health

VISION: ValIdated Systematic IntegratiON of epigenomics in hematopoietic gene regulation
Acquire, Integrate, Validate, Translate

Initial VISION Resources
http://www.bx.psu.edu/~giardine/vision/
- BX Browser: Visualize functional genomics data
- 3D Genome Browser
- CODEX compendium of functional genomics
- Repository of hematopoietic transcriptomes (Jens Lichtenberg poster)
- IDEAS data integration
- Single cell transcriptomes, HSC (Gottgens lab)
- ENCODE Element Browser
- Translate between mouse and human

Generate, compile, and curate epigenomic data
[Figure labels: Work from individual labs; 736 datasets; 11,774 datasets; High quality, high information tracks; Hematopoietic cells]

Focus on myeloid-erythroid branches of hematopoiesis
[Figure: lineage tree with HSC, HPC7, CMP, MEP, GMP, CLP, G1E, ER4, CFU-Mk, CFU-E, MEG, ERY, EOS, Mast, GRA, MONO, T, B, NK; annotation: 2 M sec-1]

ScriptSeq RNA-seq at Zfpm1 and neighbors

Hierarchical clustering: Erythroid separates from others
Transcript levels of all genes (RNA-seq)
BG, July 22, 2016

ATAC-seq in Zfpm1 and neighbors

Hierarchical clustering: Erythroid separates from others
Nuclease accessibility (ATAC-seq)
BG, Aug 03, 2016
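
To make "distance between cell types" concrete, here is a minimal sketch of one common choice, correlation distance (1 minus the Pearson correlation) between signal profiles. The talk does not state which distance or linkage was used, so this is an assumption, and the three profiles are made-up numbers for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_distance(x, y):
    """0 for perfectly correlated profiles, up to 2 for perfectly anti-correlated."""
    return 1.0 - pearson(x, y)

def closest_pair(profiles):
    """Return (distance, name_a, name_b) for the most similar pair of cell types.

    This is the first merge an agglomerative clustering would make.
    """
    names = sorted(profiles)
    pairs = [
        (correlation_distance(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return min(pairs)

# Hypothetical log-scaled signal (expression or accessibility) at five features.
profiles = {
    "CMP": [5.0, 4.0, 1.0, 0.5, 3.0],
    "GMP": [5.2, 4.1, 0.8, 0.4, 3.3],
    "ERY": [0.5, 1.0, 6.0, 5.5, 2.0],
}
print(closest_pair(profiles))
```

With these toy values the two progenitor profiles pair first and the erythroid profile stays apart, which mirrors the qualitative pattern named in the slide title.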

General model for lineage choice
[Figure labels: HSC, CMP, MEP, GMP, ERY, MEG, CFU-Mk, CFU-E; Similar regulatory landscapes; Dynamic TF binding = change in regulatory landscape]
- Lineage choice occurs with, or even via, establishment of permissive and repressive chromatin states
- These chromatin states are relatively stable within a lineage, even when expression changes dramatically
- Induction and repression within a lineage are largely a result of changes in patterns of TF binding on the stage of the permissive chromatin

Integrative analysis of epigenomics can improve prediction of enhancers
Nergiz Dogan

Epigenetic signatures can predict enhancers with high accuracy: TAL1 occupancy
Dogan et al. (2015) Epigenetics & Chromatin 8: 16

TF occupancy: frequently active as enhancers
HMs without TFs: rarely active as enhancers
Dogan et al. (2015) Epigenetics & Chromatin 8: 16.

Integration of epigenetic signals in two dimensions simultaneously
- Integration of epigenetic signals along chromosomes and across cell types
- Yu Zhang (Statistics, PSU): Integrative and Discriminative Epigenome Annotation System (IDEAS)
- Joint characterization of epigenetic landscapes in many cell types and detection of differential regulatory regions
- Preserves the position-dependent and cell type-specific information at fine scales
Zhang, An, Yue, Hardison (2016) Nucleic Acids Research 44:6721-6731

Integrative analysis of histone modifications reveals little change during erythroid maturation
Ernst & Kellis (2012) Nature Methods
Wu et al. (2011) Genome Research 21: 1659.

Integrative and Discriminative Epigenome Annotation System (IDEAS)
Zhang, An, Yue, Hardison (2016) Nucleic Acids Research 44:6721-6731

IDEAS to integrate histone modifications and ATAC-seq across cell types
IDEAS: Integrative and Discriminative Epigenome Annotation System: 2D segmentation
ATAC: Hardison & Bodine, Amit lab
Histone Mod iChIP: Amit lab
[Figure labels: Promoter, Active chromatin, Quiescent]
Yu Zhang et al. 2016 NAR 44:6721-6731
9/17/16

Nascent VISION gives new insights
Previous studies: Autoregulation by GFI1B binding to promoter proximal CRM. Moroy et al. 2005. NAR 33:987.
[Figure labels: Multipotent progenitor cells, Maturing erythroid cells, Structural TFs]

Interpreting the maps as testable hypotheses

Try to integrate all the epigenomic and expression information to derive rules for regulation that apply globally
rules = equations

Modeling different aspects of regulation in VISION

Functional output from distal CRMs measured for Hbb locus
Blood, 2012

Locus models for Hbb and Hba
Locus model: States the functional output $X_{i,j}$ from each of the cis-regulatory modules (CRMs) contributing to the expression level of the target gene ($T$).
E.g., here is a formal statement of results from Bender et al. 2012:
$T_{Hbb} = X_{HS1} + X_{HS2} + X_{HS3} + X_{HS4} + X_{HS5,6} = 0.22 + 0.41 + 0.29 + 0.19 + 0.03$
For the Hba complex of enhancers (Hay et al. 2016. Nature Genetics 48: 898):
$T_{Hba} = X_{R1} + X_{R2} + X_{R3} + X_{Rm} + X_{R4} = 0.3 + 0.5 + 0.1 + 0.05 + 0.2$
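
The additive locus model can be written out directly. This sketch sums the per-CRM values quoted on the slide and reports each CRM's share of the total; the helper names `locus_output` and `fractional_contributions` are illustrative, and normalizing to shares is an added convenience rather than something stated in the talk.

```python
# Per-CRM functional outputs as quoted on the slide.
HBB_CRMS = {"HS1": 0.22, "HS2": 0.41, "HS3": 0.29, "HS4": 0.19, "HS5,6": 0.03}
HBA_CRMS = {"R1": 0.3, "R2": 0.5, "R3": 0.1, "Rm": 0.05, "R4": 0.2}

def locus_output(crm_outputs):
    """Additive locus model: T is the sum of the outputs X of its CRMs."""
    return sum(crm_outputs.values())

def fractional_contributions(crm_outputs):
    """Share of the summed output attributed to each CRM."""
    total = locus_output(crm_outputs)
    return {name: value / total for name, value in crm_outputs.items()}

for label, crms in (("Hbb", HBB_CRMS), ("Hba", HBA_CRMS)):
    shares = fractional_contributions(crms)
    strongest = max(shares, key=shares.get)
    print(f"T_{label} = {locus_output(crms):.2f}; largest contributor: {strongest}")
```

With the slide's numbers both sums come out slightly above 1, so the individual values are best read as separately measured contributions rather than fractions constrained to add to exactly 1.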

Models for cis-regulatory modules (CRMs)
CRM model: Quantitative estimates of the contribution of epigenomic features, sequence, conservation, etc. to the functional output $X_{i,j}$ from each of the CRMs
$X_{HS2,\,Hbb\text{-}b1} = 0.41$ = combination of f(chromatin state), f(TF occupancy), …
$X_{HS1,\,Hbb\text{-}b1} = 0.22$
[Figure labels: HS1, HS2]

Global application of models
- Once you have a CRM model, you can apply it globally
- It is an equation using variables for which you have measurements genome-wide: H3K27ac, GATA1 occupancy, TAL1 occupancy, motifs, etc.
- So you can predict $X_{i,j}$ for all candidate CRMs
- We learned it from a few CRMs in a few loci, and of course it should work there. But what about other loci?
- Test these predictions! Genome editing in additional, reference loci
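
A sketch of what "apply it globally" could look like, assuming the CRM model is a simple linear combination of feature signals. The linear form, the weights, and the candidate CRMs below are all hypothetical; the talk does not specify the functional form or any fitted values.

```python
# Hypothetical weights, standing in for values learned from a few reference CRMs.
WEIGHTS = {"H3K27ac": 0.10, "GATA1": 0.20, "TAL1": 0.15}
INTERCEPT = 0.0

def predict_output(features):
    """Predicted functional output X for one candidate CRM (linear model, an assumption)."""
    return INTERCEPT + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())

# Hypothetical genome-wide feature measurements for three candidate CRMs.
candidates = {
    "crm_A": {"H3K27ac": 2.0, "GATA1": 1.0, "TAL1": 1.0},
    "crm_B": {"H3K27ac": 0.5, "GATA1": 0.0, "TAL1": 0.0},
    "crm_C": {"H3K27ac": 1.0, "GATA1": 1.0, "TAL1": 0.0},
}

# Rank candidates by predicted output; the top of the list would be
# the first candidates to test experimentally.
ranked = sorted(candidates, key=lambda name: predict_output(candidates[name]), reverse=True)
for name in ranked:
    print(name, round(predict_output(candidates[name]), 2))
```

The design point is that once the weights are fixed, prediction needs only measurements that already exist genome-wide, so every candidate CRM gets a testable value.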

Epigenome maps provide a guide to noncoding variants associated with phenotype

Variants affecting gene regulation play a prominent role in complex traits
- The majority of genomic variants associated with complex traits are not in protein-coding exons. Hindorff et al. (2009) PNAS 106:9362.
- Phenotype-associated, noncoding variants are highly enriched in DNA with epigenetic signatures of regulatory regions. Maurano et al. (2012) Science 337: 1190; Schaub et al. (2012) Genome Research; ENCODE Consortium (2012) Integrated Encyclopedia … Nature

From GWAS results to allele-specific regulation
CRM = cis-regulatory module, e.g. enhancer
Hardison (2012) JBC 287:30932. Minireview on epigenetic data as a guide to interpret GWAS

A cluster of SNPs associated with inflammatory diseases is close to sites occupied by GATA factors
ENCODE Consortium (2012) Integrated Encyclopedia … Nature

Strategy for linking regulatory variation to phenotype
- Locus with phenotype-associated variants
- Identify candidate CRMs from epigenomic data
- Find common and rare variants in CRMs in cohorts of patients
- Predict those likely to affect regulation
- Test for allele-specific effects
[Figure labels: Candidate enhancers; Candidate loop bases; DNase FL ERY; DNase Multipot prog]
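
The "find variants in CRMs" step amounts to an interval overlap. Below is a minimal sketch assuming BED-style 0-based, half-open intervals; the function name, coordinates, and identifiers are hypothetical, and a real analysis would more likely use an indexed interval tool than this nested loop.

```python
def variants_in_crms(variants, crms):
    """Pair each variant with every candidate CRM whose interval contains it.

    variants: iterable of (chrom, pos, variant_id), with 0-based positions
    crms:     iterable of (chrom, start, end, name), half-open [start, end)
    """
    hits = []
    for chrom, pos, variant_id in variants:
        for crm_chrom, start, end, name in crms:
            if chrom == crm_chrom and start <= pos < end:
                hits.append((variant_id, name))
    return hits

# Hypothetical candidate CRMs and variants.
crms = [("chr1", 1000, 1500, "crm_1"), ("chr1", 4000, 4600, "crm_2")]
variants = [
    ("chr1", 1200, "var_a"),  # inside crm_1
    ("chr1", 3000, "var_b"),  # between CRMs
    ("chr2", 1200, "var_c"),  # same position, different chromosome
    ("chr1", 4599, "var_d"),  # last base of crm_2
]
overlaps = variants_in_crms(variants, crms)
print(overlaps)
```

Variants that land in a candidate CRM are the ones carried forward to the prediction and allele-specific testing steps.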

Differentiation and diseases of blood cells
- Lineage specific binding of key transcription factors drives expression patterns that determine cell type
  - Maps of transcription factor occupancy inform models of regulation
  - Cell specific phenotypes arise from lineage-specific binding of transcription factors at distinct sites
- ValIdated Systematic IntegratiON: A VISION for epigenomics in hematopoietic gene regulation
  - Measure distances between cell types by quantitative comparisons of chromatin accessibility landscapes and transcriptomes
  - Integrative analysis of epigenomics can improve prediction of enhancers
  - Formal modeling to understand regulation of a locus and regulatory output of each cis-regulatory module
- Use this information to increase accuracy of search for genetic variants in regulatory regions to explain phenotypes

Thanks to the VISION team
Cheryl Keller, Yu Zhang, Gerd Blobel, James Taylor, Berthold Gottgens, Amber Miller, Feng Yue, Mitch Weiss, David Bodine, Doug Higgs, Belinda Giardine, Jim Hughes
Hardison Lab
http://www.bx.psu.edu/~giardine/vision/

Deliverables from VISION
- Comprehensive catalogs of cis-regulatory modules utilized during hematopoiesis
  - Built by integration of multiple data types
  - Validated by extensive experimental tests
- Quantitative models for gene regulation
  - Built by machine learning
  - Extensively tested by genome editing approaches in ten reference loci
  - Predictions applied genome-wide
- A guide for investigators to translate insights from mouse models to human clinical studies