Genomics of Erythroid Regulation: G1E and G1E-ER4 January 20, 2010.



Investigators on global predictions and tests
Penn State
– Hardison
– Francesca Chiaromonte
– Yu Zhang
– Webb Miller
– Stephan Schuster
– Frank Pugh, collaborator
– Kateryna Makova, collaborator
– Anton Nekrutenko, collaborator
Children's Hospital of Philadelphia
– Mitch Weiss
– Gerd Blobel
Emory Univ.
– James Taylor
Duke Univ.
– Greg Crawford, collaborator
Univ. Queensland
– Andrew Perkins, collaborator
NHGRI
– Laura Elnitski, collaborator

Aims of Global tests and predictions of erythroid regulation

Hematopoiesis

Major erythroid transcription factors
Factor | Class | Mode of discovery
GATA1 | Zn finger | binds β-globin locus
NF-E2 | bZIP | binds β-globin locus
KLF1/EKLF | Zn finger | subtractive hybridization
SCL/TAL1 | bHLH | rearranged in leukemias
GFI1b | Zn finger | oncoviral integration site
ZBTB7a/LRF | POZ-Kruppel | proto-oncogene lymphomas
Others

Transcription factor GATA-1
Diagram labels: globin genes, -GATA- motif, GATA-1
– Founding member of a small family of proteins: GATA-2 to GATA-6
– Binds functionally important cis WGATAR motifs in regulatory regions of many hematopoietic genes
– Essential for erythroid and megakaryocyte development
   - gene knockout studies in mice
   - analysis of human patients

Diagram labels: lineage-restricted factors; cofactors; histone-modifying enzymes; transcription

Gene activation by alterations in chromatin
“Regulatory signals entering the nucleus encounter chromatin, not DNA, and the rate-limiting biochemical response that leads to activation of gene expression in most cases involves alterations in chromatin structure. How are such alterations achieved?”
– Gary Felsenfeld & Mark Groudine (2003) Controlling the double helix. Nature 421:
“It is now generally argued that reorganization of these chromatin structures is a process that is mechanistically linked to many gene activation or repression events, and is initiated by the action of site-specific transcription factors, acting either through ATP-driven nucleosome remodeling machines, or via the action of enzymes that covalently modify various components of the chromatin structure.”
– Sam John, … John A. Stamatoyannopoulos, and Gordon L. Hager (2008) Interaction of the Glucocorticoid Receptor with the Chromatin Landscape. Molecular Cell 29, 611–624.
“These results imply that GATA-1 is sufficient to direct chromatin structure reorganization within the beta-globin LCR and an erythroid pattern of gene expression in the absence of other hematopoietic transcription factors.”
– Layon ME, Ackley CJ, West RJ, Lowrey CH. (2007) Expression of GATA-1 in a non-hematopoietic cell line induces beta-globin locus control region chromatin structure remodeling and an erythroid pattern of gene expression. J Mol Biol. 366:

Chromatin transitions before and after TF binding
“We propose four specific kinds of interaction. The classical mode for GR binding to chromatin involves receptor-dependent recruitment of the Swi/Snf complex (1), resulting in a hormone-dependent hypersensitive transition. It is now clear, however, that some hormone-dependent events must involve other remodeling species (2). Furthermore, many GR binding events are associated with pre-existing transitions, and these constitutive events fall again into two classes, Brg1 dependent (3) and Brg1 independent (4).”
– Sam John, … John A. Stamatoyannopoulos, and Gordon L. Hager (2008) Interaction of the Glucocorticoid Receptor with the Chromatin Landscape. Molecular Cell 29, 611–624.

Order of events in activation can vary
Figure 5. Models Depicting Different Orders of Action by Regulators and Chromatin-Remodeling Complexes
Regulators, HAT complexes, and ATP-dependent remodeling complexes can act in different orders (pathway A, B, or C) and still give the same end result: a template competent for transcription. Although not shown, it is also possible that binding by the general transcription factors precedes the action and recruitment of HAT complexes and ATP-dependent remodelers.
– Geeta J. Narlikar, Hua-Ying Fan and Robert E. Kingston (2002) Cooperation between Complexes that Regulate Chromatin Structure and Transcription. Cell 108:

What biochemical events precede and follow GATA1 binding?
Where should we look in the genome?
– Segments around the transcription start site (TSS)
   - all genes
   - expressed vs nonexpressed genes
   - all GATA1-responsive genes: induced vs repressed genes
– All TF-occupied segments (OSs)
– Distal TF OSs (i.e. outside the TSS zone)
– All mappable regions
   - much larger computation
Treat levels of biochemical marks as
– continuous variables
– discrete segments (bound or not, histones modified or not)
Categorize genes and OSs by the order of events
– Category 1. Co-occupancy by other TFs, histone modifications, and DHS formation occur after the TF1 of interest binds or is activated
– Category 2. Co-occupancy by other TFs, histone modifications, and DHS formation occur before the TF1 of interest binds or is activated

Cell-based models to study GATA1 function
Diagram labels: Gata1– ES cells; in vitro hematopoietic differentiation (erythropoietin, stem cell factor, thrombopoietin); immature hematopoietic cell lines G1E (erythroid) and G1ME (erythroid + megakaryocyte); add back GATA-1; G1E-ER4 + estradiol

Global analysis of GATA1-regulated erythroid gene expression in G1E cells
Diagram labels: stably expressing estrogen-activated GATA-1 (GATA-1-ER); + estradiol

Transcriptome analysis
Affymetrix gene chip
– U74 array: 12,500 probesets, 9,266 genes
– array: 45,000 probesets, 19,000 genes
Figure labels: hemoglobin; morphology; hrs in estradiol
Blood 2004; Genome Res 2009

Kinetics of GATA1-regulated Gene Expression (Affy)
GATA1-induced (>2-fold): 1048 genes
– known targets
– new gene discovery
GATA1-repressed (>2-fold): 1568 genes
– stem cell/progenitor markers
– proto-oncogenes (Kit/Myc/Myb)
– function unknown

Factor occupancy and GATA1 responses
60 megabase region of chromosome 7
– identify new GATA1-regulated genes
– define combinatorial TF interactions
– correlate histone marks with TF occupancy and gene expression

Direct and indirect effects in repression and activation

Datasets available and in progress
Feature | G1E | G1E-ER4 + E2
DNase hypersensitive sites | In progress | Done
GATA1 occupancy | No | Done
TAL1 occupancy | Done | Done, will repeat
GATA2 occupancy | Done
CTCF occupancy | Done
H3K4me1 | Done
H3K4me3 | Done
H3K27me3 | Done
RNA polymerase II | Repeating it
RNA seq | In progress

Western blots show specificity of antibodies and presence of proteins in cells
Blot labels: antibodies α-GATA1, α-GATA2, α-CTCF, α-TAL1; lanes G1E, G1E-ER4, MEL, CH12; bands GATA1-ER, GATA, GATA2, CTCF, TAL1
CH12 are B-lymphoid cells; others are erythroid.
Cheryl Keller Capone

Major observations under investigation
GATA1 binds to a majority of the DNA segments occupied by TAL1 in G1E-ER4 cells (+E2). However, over half of these segments are occupied by TAL1 prior to restoration of GATA1.
– Only a minority are at GATA2 occupied segments (OSs)
TAL1 seems to be redistributed around some target loci
– Change gradient in TAL1 from HS6>HS1 to HS1>HS6 in Hbb LCR
Large changes in histone modifications are not observed after restoring and activating GATA1
– But some “small” changes are observed
Level of GATA1 occupancy is similar in mouse (G1E-ER4+E2 cells) and human (K562 cells), but only a small minority of occupied segments are shared
– 15,000 GATA1 OSs in each species
– 1,000 GATA1 OSs are shared

Hbb locus and surrounding OR genes
– TAL1 redistributes when GATA1 is restored
– ChIP-seq fits with previous data
– Pol II and TAL1 are recruited to Hbb genes when GATA1 is restored

Zfpm1
– Induced immediately after GATA1-ER is activated
– TAL1 occupancy corresponds to GATA2 OS in G1E

c-Kit
– Repressed after GATA1-ER is activated
– TAL1 occupancy at GATA1 OSs, may correspond to GATA2 OS in G1E
– Loss of TAL1 occupancy correlates with repression

Changes in peaks of occupancy, co-occupancy
– Start with peak calls from MACS for all the TF OSs and from F-Seq for DNase hypersensitive sites
– Define overlapping segments as those sharing at least one nucleotide
– Use set operations tools in Galaxy to find overlapping segments
– Compare OSs for each TF +/- GATA1 and find overlaps between TFs
Chris Morrissey
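The overlap rule on this slide (segments overlap if they share at least one nucleotide) can be sketched in a few lines. This is only an illustration, not the Galaxy set-operations tool used in the talk; the function name `overlapping_segments`, the BED-style half-open coordinates, and the toy peak coordinates are all assumptions.

```python
from collections import defaultdict

def overlapping_segments(query, reference):
    """Return query segments sharing at least one nucleotide with any reference segment.

    Segments are (chrom, start, end) tuples in half-open, BED-style coordinates
    (an assumption; the slide does not state the coordinate convention).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = []
    for chrom, start, end in query:
        # Two half-open intervals share a base exactly when each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in by_chrom[chrom]):
            hits.append((chrom, start, end))
    return hits

# Toy peak calls (made-up coordinates, not data from the talk).
tal1_g1e = [("chr7", 100, 200), ("chr7", 300, 400), ("chr1", 100, 200)]
gata1 = [("chr7", 199, 250), ("chr7", 400, 500), ("chr2", 100, 200)]
shared = overlapping_segments(tal1_g1e, gata1)
print(shared)
```

The nested scan is quadratic per chromosome, which is fine for a sketch; sorted intervals with a sweep or an interval tree would be the usual choice for tens of thousands of peaks.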

TAL1 has the most overlap with GATA1
Set | Segments
TAL1_G1E | 6,930
TAL1_ER4 | 7,449
Overlap (TAL1_G1E ∩ TAL1_ER4) | 2,777
GATA1 | 15,361
GATA1 ∩ TAL1_G1E | 4,269
GATA1 ∩ TAL1_ER4 | 4,443
GATA1 ∩ Overlap | 2,544
Chris Morrissey

CTCF expands, but doesn't move
Set | Segments
CTCF_G1E | 15,757
CTCF_ER4 | 27,909
Overlap (CTCF_G1E ∩ CTCF_ER4) | 14,982
GATA1 | 15,361
GATA1 ∩ CTCF_G1E | 555
GATA1 ∩ CTCF_ER4 | 932
GATA1 ∩ Overlap | 528
Chris Morrissey

GATA2 moves a lot (?)
Set | Segments
GATA2_G1E | 2,077
GATA2_ER4 | 10,759
Overlap (GATA2_G1E ∩ GATA2_ER4) | 356
GATA1 | 15,361
GATA1 ∩ GATA2_G1E | 465
GATA1 ∩ GATA2_ER4 | 178
GATA1 ∩ Overlap | 32
Chris Morrissey
But is this just an artifact of noisy GATA2 data? Seems like this would be an ideal application for Yu Zhang and Kuan-Bei’s improvement in peak calling by using ChIP data on other proteins. - RH
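As a rough check on the comparison across the TAL1, CTCF, and GATA2 slides, the overlap counts in G1E-ER4 cells can be expressed as fractions of the 15,361 GATA1 OSs. The counts are taken from the slides; treating each intersection count as a number of GATA1 segments is an assumption, since the slides do not say which set the intersection is counted in.

```python
# Counts copied from the three overlap slides (G1E-ER4 + E2 columns).
gata1_total = 15361
overlap_with_gata1 = {"TAL1": 4443, "CTCF": 932, "GATA2": 178}

# Fraction of GATA1 occupied segments that overlap each factor's segments.
fractions = {factor: count / gata1_total for factor, count in overlap_with_gata1.items()}

for factor, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{factor}: {100 * frac:.1f}% of GATA1 OSs")

top_factor = max(fractions, key=fractions.get)
```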

Compute ratios of signals in G1E and ER4, adjust by M vs A plot
1. For each 10 bp bin, we have tag counts for both G1E and ER4 (tagcnts_g1e and tagcnts_er4) and the number of total mapped reads in G1E and ER4, in millions (reads_g1e and reads_er4).
2. Calculate rpm_g1e = (tagcnts_g1e + 1) / reads_g1e and rpm_er4 = (tagcnts_er4 + 1) / reads_er4. The +1 removes zeros.
3. Calculate M = log2(rpm_er4 / rpm_g1e) and A = 0.5 * log2(rpm_er4 * rpm_g1e).
4. Make the MA plot by plotting M versus A, and fit a lowess line through the points.
5. Based on the lowess regression, predict a value P for each A.
6. Calculate M' = M - P; M' stands for the difference between ER4 and G1E.
Weisheng Wu and F. Chiaromonte
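The six steps can be sketched as follows. The variable names follow the slide; everything else is an assumption. In particular, `local_linear_fit` is a simplified stand-in for lowess (tricube-weighted local linear regression with no robustness iterations), not the routine used in the talk, and the tag counts are made up.

```python
import math

def ma_values(tagcnts_g1e, tagcnts_er4, reads_g1e, reads_er4):
    """Steps 1-3: per-bin M and A from tag counts and total mapped reads (in millions)."""
    m_vals, a_vals = [], []
    for c_g1e, c_er4 in zip(tagcnts_g1e, tagcnts_er4):
        rpm_g1e = (c_g1e + 1) / reads_g1e  # +1 pseudocount removes zeros
        rpm_er4 = (c_er4 + 1) / reads_er4
        m_vals.append(math.log2(rpm_er4 / rpm_g1e))
        a_vals.append(0.5 * math.log2(rpm_er4 * rpm_g1e))
    return m_vals, a_vals

def local_linear_fit(x, y, frac=0.5):
    """Steps 4-5: simplified lowess; returns the fitted value P at each x."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbours that set the bandwidth
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth; guard against all-duplicate neighbours
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]  # tricube weights
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def ma_adjust(tagcnts_g1e, tagcnts_er4, reads_g1e, reads_er4, frac=0.5):
    """Step 6: M' = M - P for every bin."""
    m_vals, a_vals = ma_values(tagcnts_g1e, tagcnts_er4, reads_g1e, reads_er4)
    p_vals = local_linear_fit(a_vals, m_vals, frac)
    return [m - p for m, p in zip(m_vals, p_vals)]

# Toy tag counts for eight 10 bp bins (made up, not data from the talk).
g1e = [0, 2, 5, 10, 20, 40, 80, 160]
er4 = [1, 3, 9, 15, 45, 70, 150, 330]
m_adjusted = ma_adjust(g1e, er4, reads_g1e=10.0, reads_er4=12.0, frac=0.6)
print(m_adjusted)
```

Subtracting the fitted trend removes intensity-dependent bias, so a bin with M' near zero behaves like the typical bin at its signal level rather than like a bin with equal raw ratios.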

Effect of adjusting ratios by lowess of an M vs A plot Weisheng Wu

Correlations among chromatin features and expression in TSS segments
– Examine 4 kb DNA segments centered on transcription start sites (TSSs) for all genes
– Determine mean signal for TF occupancy and histone modifications in each
– Compute log2 of ratios, MA lowess adjustment
– Determine expression levels of genes and change between G1E and ER4
– Draw scatterplots and determine correlations for all pairwise comparisons
Weisheng Wu
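A minimal sketch of the first, second, and last steps: take a 4 kb window centered on each TSS, average a binned signal inside it, and compute a Pearson correlation between two features across genes. The helper names, the (bin_start, value) representation of the signal, and the toy numbers are assumptions for illustration only.

```python
import math

def tss_window(tss, width=4000):
    """Window of `width` bp centered on a TSS, as (start, end), clipped at 0."""
    half = width // 2
    return max(0, tss - half), tss + half

def mean_signal(bins, start, end):
    """Mean of binned signal values whose bin start lies in [start, end).

    `bins` is a list of (bin_start, value) pairs (a hypothetical representation).
    """
    values = [value for pos, value in bins if start <= pos < end]
    return sum(values) / len(values) if values else 0.0

def pearson(x, y):
    """Pearson correlation between two equal-length lists; undefined for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy per-gene mean signals in TSS windows (made up, not data from the talk).
gata1 = [1.0, 2.0, 3.0, 4.0]
tal1 = [1.5, 2.5, 2.9, 4.2]
h3k27me3 = [4.0, 3.0, 2.0, 1.0]
print(pearson(gata1, tal1), pearson(gata1, h3k27me3))
```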

Correlations in G1E-ER4 cells +E2
Same sets of graphs for G1E and ratio of signals have been done
Weisheng Wu

Notable pairwise correlations in TSSs
– Co-occupancy by GATA1 and TAL1
– Positive correlation with Trx marks (H3K4me)
– Negative correlation with Pc marks (H3K27me3)
Weisheng Wu

Limited explanatory power for CHANGE in expression

Changes (if any) in biochemical features at TSS show little or no difference between induced and repressed genes
Legend: black, TSSs of all genes; red, up; blue, down; green, non-responsive
Weisheng Wu

Major results from pairwise correlations in GATA1os

Distribution of the changes of the TFs/HMs at GATA1os

Changes in HMs differ little between induced and repressed genes at GATA1os

Principal components in genomic features at TSSs
PCA dataset: raw counts of all factors, in G1E and ER4 models, in 4 kb window around TSS
Table: columns C1 to C10; rows Std-dev, Proportion of Var, Cumulative Var
Swathi A. Kumar
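To make the "Proportion of Var" row concrete, here is a minimal sketch that finds the first principal component of a small feature matrix by power iteration on the covariance matrix and reports the share of total variance it explains. It is not the software used for the talk; the function names and the toy matrix are assumptions, and power iteration from a uniform start vector can fail if that vector happens to be orthogonal to the leading component.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of a covariance matrix via power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

def proportion_of_variance(rows):
    """Share of total variance captured by the first principal component."""
    cov = covariance_matrix(rows)
    _, eigenvalue = first_component(cov)
    total = sum(cov[i][i] for i in range(len(cov)))
    return eigenvalue / total

# Toy matrix: one row per TSS, one column per feature count (made up).
counts = [[10, 3, 1], [20, 5, 2], [31, 9, 2], [39, 11, 4], [52, 16, 5]]
print(proportion_of_variance(counts))
```

Because the input is raw counts, the feature with the largest numeric range dominates the first component; scaling each column first would be a different, equally defensible choice.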


PCA results Swathi A. Kumar

H3K4me3 is major contributor to variance Swathi A. Kumar

Linear Discriminant Analysis
Same dataset as used in pairwise correlations of features in TSS segments
Initial run
– Binary response of induced or repressed expression
   - Responsive vs Non-responsive
   - Induced vs Repressed
– Predictor variables are raw counts of GATA1 and other associated transcription/histone factors, before and after induction
– Leave-one-out cross validation
Swathi A. Kumar
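The combination of a two-class linear discriminant with leave-one-out cross validation can be sketched as below. This is a deliberately reduced illustration, assuming only two predictor variables (so the pooled covariance can be inverted in closed form) and equal class priors; the talk used many predictors, and the function names and toy points are hypothetical.

```python
def mean_vec(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def fit_lda(x, y):
    """Two-class Fisher LDA for 2-D points; returns (weights, threshold)."""
    g0 = [p for p, label in zip(x, y) if label == 0]
    g1 = [p for p, label in zip(x, y) if label == 1]
    m0, m1 = mean_vec(g0), mean_vec(g1)
    sxx = sxy = syy = 0.0
    for group, m in ((g0, m0), (g1, m1)):
        for p in group:
            dx, dy = p[0] - m[0], p[1] - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    dof = len(x) - 2  # pooled within-class covariance
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    # w = S^-1 (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    w = [(syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det]
    # With equal priors the boundary passes through the midpoint of the class means.
    threshold = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, threshold

def predict(model, point):
    w, threshold = model
    return 1 if w[0] * point[0] + w[1] * point[1] > threshold else 0

def loocv_error(x, y):
    """Leave-one-out misclassification rate: refit without each point, then predict it."""
    errors = 0
    for i in range(len(x)):
        model = fit_lda(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        if predict(model, x[i]) != y[i]:
            errors += 1
    return errors / len(x)

# Toy data: 0 = repressed, 1 = induced; two made-up feature values per gene.
features = [[1, 2], [2, 1], [2, 3], [3, 2], [7, 8], [8, 7], [8, 9], [9, 8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(loocv_error(features, labels))
```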

Major results from LDA
– Responsive vs Non-responsive: misclassification rate = 0.007%
– Induced vs Repressed: misclassification rate = 36.7%
Confusion tables (allocated response vs real response): Responsive / Non-responsive; Induced / Repressed
Swathi A. Kumar

Shared vs lineage-specific GATA1 OSs
– ChIP-seq reads in G1E-ER4 + E2 for mouse GATA1 OSs
– ChIP-seq reads in K562 for human GATA1 OSs
– Map each to their respective genomes (ELAND)
– MACS peak calls: 15,000 in each
– LiftOver each set of peaks to the other species: about 10,000 lift over in each
– Run intersection of the lifted-over peaks: 1,000 are shared in both species
Yong Cheng, Kuan-Bei Chen

Shared GATA1 OSs show higher occupancy level
Plot: ChIP-seq signal for GATA1 in each OS
Yong Cheng

Genes close to shared GATA1 OSs are enriched for well-known erythroid functions

Shared GATA1 OSs are enriched in induced genes
Table: columns Up, Down; rows Genes with shared GATA1 OSs within 10 kb, Genes without shared GATA1 OS within 10 kb
Yong Cheng

Preserved GATA1 motifs are enriched in the shared GATA1 OSs
Table: columns Shared, Not shared; rows Preserved WGATAR, w/o preserved WGATAR
Yong Cheng

Genome-wide turnover analysis in mouse GATA1 occupied intervals
Category | mm8.gata1
A: occupied intervals | 15,360
B: occupied intervals with binding motifs in reference sequence | 12,202
B/A | 79.44%
C: occupied intervals with rodent-specific binding motifs | 6,565
C/B | 53.80%
D: occupied intervals with primate-rodent compensatory pattern | 1,383
D/B | 11.33%
E: occupied intervals with shared motif between human and mouse | 3,018
E/B | 24.73%
Notes
– B is underestimated since we are using alignments instead of the mouse sequence itself
– D: the compensatory motifs have a minimum distance of 10 bp
Kuan-Bei Chen
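The percentage rows in the mouse table follow directly from the interval counts, which a short calculation confirms. The counts are copied from the slide; the dictionary layout and the helper name `pct` are just for illustration.

```python
# Interval counts from the mm8.gata1 column of the slide.
counts = {"A": 15360, "B": 12202, "C": 6565, "D": 1383, "E": 3018}

def pct(numerator, denominator):
    """Percentage rounded to two decimals, as reported on the slide."""
    return round(100 * numerator / denominator, 2)

ratios = {
    "B/A": pct(counts["B"], counts["A"]),  # motif-containing share of occupied intervals
    "C/B": pct(counts["C"], counts["B"]),  # rodent-specific motifs
    "D/B": pct(counts["D"], counts["B"]),  # primate-rodent compensatory pattern
    "E/B": pct(counts["E"], counts["B"]),  # motif shared between human and mouse
}
for name, value in ratios.items():
    print(f"{name}: {value:.2f}%")
```

Note that C, D, and E are each expressed relative to B (intervals with a motif in the reference sequence), not relative to all occupied intervals in A.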

Genome-wide turnover analysis in human
Category | hg18.gata1 | hg18.myc | hg18.ctcf | hg18.gabp
A: occupied intervals | 15,252 | 11,583 | 26,976 | 6,442
B: occupied intervals with binding motifs in reference sequence | 12,331 | 9,315 | 24,913 | 2,958
B/A | 80.85% | 80.42% | 92.35% | 45.92%
C: occupied intervals with primate-specific binding motifs | 6,348 | 4,452 | 11,297 | 1,290
C/B | 51.48% | 47.79% | 45.35% | 43.61%
D: occupied intervals with primate-rodent compensatory pattern | 656 | 1,200 | 2,… | …
D/B | 5.32% | 12.88% | 9.05% | 11.76%
E: occupied intervals with shared motif between human and mouse | 2,446 | 3,397 | 12,… | …
E/B | 19.84% | 36.47% | 50.77% | 28.03%
Kuan-Bei Chen

Patterns of GATA1 binding sites in shared human/mouse OSs
Out of 956 shared OSs:
– … OSs have rodent-specific GATA1 motifs which are not present in human, chimp, rhesus, dog and cow
– … OSs have GATA1 motifs shared by human and mouse
– … OSs have primate-rodent compensatory patterns
Kuan-Bei Chen

Investigators on “PSU” mouse ENCODE
Penn State
– Hardison
– Stephan Schuster
– Frank Pugh
– Robert Paulson
– Francesca Chiaromonte, OSC
– Yu Zhang, OSC
– Webb Miller, OSC
– Anton Nekrutenko, OSC
Children's Hospital of Philadelphia
– Mitch Weiss
– Gerd Blobel
Emory Univ.
– James Taylor
Univ. Massachusetts
– Job Dekker
Duke Univ.
– Greg Crawford, consultant
– Terry Furey, consultant
Cal Tech
– Barbara Wold

Aims of “PSU” mouse ENCODE

Conservation of function, conservation of sequence (or not)