ValIdated Systematic IntegratiON: A VISION for epigenomics in hematopoietic gene regulation
Ross Hardison
Department of Biochemistry and Molecular Biology, Huck Institute for Genome Sciences, Penn State University
9/17/16
Mapping function-associated features genome-wide: Sparse
For 2000 cell types, could require ~800 million *-seq assays
Focused efforts of multiple labs on one system get closer to completeness
[Figure: Hematopoiesis and datasets. Cell types: HSC, CMP, GMP, MEP, CLP, MEG, ERY, EOS, Mast, GRA, MONO, T, B, NK]
Rationale for the VISION project
Acquisition of genome-wide epigenetic data across hematopoiesis is no longer the major barrier to understanding mechanisms of gene regulation during normal and pathological tissue development.
The chief challenges are how to:
  integrate epigenetic data in terms that are accessible and understandable to a broad community of researchers
  build validated quantitative models explaining how the dynamics of gene expression relate to epigenetic features
  translate information effectively from mouse models to potential applications in human health
VISION: ValIdated Systematic IntegratiON of epigenomics in hematopoietic gene regulation
Acquire, Integrate, Validate, Translate
Initial VISION Resources
http://www.bx.psu.edu/~giardine/vision/
  BX Browser: Visualize functional genomics data
  3D Genome Browser
  CODEX compendium of functional genomics
  Repository of hematopoietic transcriptomes (Jens Lichtenberg poster)
  IDEAS data integration
  Single cell transcriptomes, HSC (Gottgens lab)
  ENCODE Element Browser
  Translate between mouse and human
Generate, compile, and curate epigenomic data Work from individual labs 736 datasets 11,774 datasets High quality, high information tracks Hematopoietic cells
IDEAS to integrate histone modifications and ATAC-seq across cell types
ATAC: Hardison & Bodine, Amit lab
Histone Mod iChIP: Amit lab
IDEAS: Integrative and Discriminative Epigenome Annotation System: 2D segmentation
Yu Zhang et al. 2016 NAR 44:6721-6731
[Figure legend: Promoter, Active chromatin, Quiescent]
Chromatin interactions for target prediction
[Figure: Chr2 243.2Mb, Res=40kb]
Promoter Capture HiC. Mifsud et al. 2015 Nature Genetics
Capture C. Hughes et al. 2014 Nature Genetics
HiC. Lieberman-Aiden et al. 2009 Science
ChIA-PET. Fullwood et al. 2009 Nature
Working on: Target gene assignments for CRMs
Try to integrate all the epigenomic and expression information to derive rules for regulation that apply globally
rules = equations
Modeling different aspects of regulation in VISION
Functional output from distal CRMs measured for Hbb locus
Blood, 2012
Locus model for Hbb and Hba
Locus model: States the functional output $X_{i,j}$ from each of the cis-regulatory modules (CRMs) contributing to the expression level of the target gene ($T$).
E.g., here is a formal statement of results from Bender et al. 2012:
$T_{Hbb} = X_{HS1} + X_{HS2} + X_{HS3} + X_{HS4} + X_{HS5,6} = 0.22 + 0.41 + 0.29 + 0.19 + 0.03$
For the Hba complex of enhancers (Hay et al. 2016. Nature Genetics 48:898):
$T_{Hba} = X_{R1} + X_{R2} + X_{R3} + X_{Rm} + X_{R4} = 0.3 + 0.5 + 0.1 + 0.05 + 0.2$
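The additive locus model can be written out as a short calculation. This is a minimal sketch: the CRM output values are the ones stated on the slide, while the function names (`locus_total`, `fractional_contributions`) and the idea of reporting each CRM's share of the total are illustrative assumptions, not part of the source.

```python
# Functional outputs X for each CRM, copied from the locus-model slide.
hbb_outputs = {"HS1": 0.22, "HS2": 0.41, "HS3": 0.29, "HS4": 0.19, "HS5,6": 0.03}
hba_outputs = {"R1": 0.3, "R2": 0.5, "R3": 0.1, "Rm": 0.05, "R4": 0.2}


def locus_total(outputs):
    """Additive locus model: T is the sum of the CRM outputs X."""
    return sum(outputs.values())


def fractional_contributions(outputs):
    """Share of the summed output attributable to each CRM (illustrative helper)."""
    total = locus_total(outputs)
    return {crm: x / total for crm, x in outputs.items()}


t_hbb = locus_total(hbb_outputs)
t_hba = locus_total(hba_outputs)
hbb_shares = fractional_contributions(hbb_outputs)
hba_shares = fractional_contributions(hba_outputs)

# Name the CRM with the largest output in each locus.
top_hbb = max(hbb_outputs, key=hbb_outputs.get)
top_hba = max(hba_outputs, key=hba_outputs.get)

print(f"T_Hbb = {t_hbb:.2f}, largest contributor: {top_hbb}")
print(f"T_Hba = {t_hba:.2f}, largest contributor: {top_hba}")
```

With the listed values the sums come to about 1.14 for Hbb and 1.15 for Hba, with HS2 and R2 as the largest single contributors.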
Models for cis-regulatory modules (CRMs)
CRM model: Quantitative estimates of the contribution of epigenomic features, sequence, conservation, etc. to the functional output $X_{i,j}$ from each of the CRMs
$X_{HS2,Hbb-b1} = 0.41$ = combination of f(chromatin state), f(TF occupancy), …
$X_{HS1,Hbb-b1} = 0.22$
[Figure labels: HS1, HS2]
Global application of models
Once you have a CRM model, you can apply it globally.
  It is an equation using variables for which you have measurements genome-wide: H3K27ac, GATA1 occupancy, TAL1 occupancy, motifs, etc.
  So you can predict $X_{i,j}$ for all candidate CRMs.
We learned it from a few CRMs in a few loci, and of course it should work there. But what about other loci?
Test these predictions!
  Genome editing in additional reference loci
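Applying a CRM model globally amounts to evaluating one equation on every candidate CRM. The sketch below assumes, purely for illustration, that the model is a weighted sum of feature signals; the slides only say "combination of f(chromatin state), f(TF occupancy), …", so the linear form, the weights, the candidate names, and all signal values here are hypothetical placeholders rather than learned or measured quantities.

```python
# Hypothetical weights standing in for a learned CRM model (not from the source).
WEIGHTS = {"H3K27ac": 0.25, "GATA1": 0.15, "TAL1": 0.10}


def predict_output(features, weights=WEIGHTS):
    """Predicted output X for one candidate CRM: weighted sum of its feature signals.

    Features missing from a candidate are treated as zero signal.
    """
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


# Hypothetical candidate CRMs with made-up signal values (arbitrary units).
candidates = {
    "candidate_A": {"H3K27ac": 1.0, "GATA1": 1.0, "TAL1": 1.0},
    "candidate_B": {"H3K27ac": 0.5, "GATA1": 0.0, "TAL1": 0.0},
    "candidate_C": {"H3K27ac": 0.0, "GATA1": 0.0, "TAL1": 0.0},
}

# Evaluate the same equation on every candidate, then rank by predicted output.
predictions = {name: predict_output(f) for name, f in candidates.items()}
ranked = sorted(predictions, key=predictions.get, reverse=True)

for name in ranked:
    print(f"{name}: predicted X = {predictions[name]:.3f}")
```

A ranked list of this kind is one way to choose which predictions to test by genome editing in reference loci; a real model would be fit to measured CRM outputs rather than using fixed weights.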
Deliverables from VISION
Comprehensive catalogs of cis-regulatory modules utilized during hematopoiesis
  Built by integration of multiple data types
  Validated by extensive experimental tests
Quantitative models for gene regulation
  Built by machine learning
  Extensively tested by genome editing approaches in ten reference loci
  Predictions applied genome-wide
A guide for investigators to translate insights from mouse models to human clinical studies
Nascent VISION gives new insights
Previous studies: Autoregulation by GFI1B binding to promoter-proximal CRM (Moroy et al. 2005. NAR 33:987)
[Figure labels: Multipotent progenitor cells; Maturing erythroid cells; Structural TFs]
Interpreting the maps as testable hypotheses
Thanks to the VISION team
Cheryl Keller, Yu Zhang, Gerd Blobel, James Taylor, Berthold Gottgens, Amber Miller, Feng Yue, Mitch Weiss, David Bodine, Doug Higgs, Belinda Giardine, Jim Hughes
Hardison Lab
http://www.bx.psu.edu/~giardine/vision/
Supported by