Published by Lucy Wilcox. Modified over 9 years ago.
1
Genomics of Erythroid Regulation: G1E and G1E-ER4 January 20, 2010
2
Investigators on global predictions and tests
Penn State
–Hardison
–Francesca Chiaromonte
–Yu Zhang
–Webb Miller
–Stephan Schuster
–Frank Pugh, collaborator
–Kateryna Makova, collaborator
–Anton Nekrutenko, collaborator
Children’s Hospital of Philadelphia
–Mitch Weiss
–Gerd Blobel
Emory Univ.
–James Taylor
Duke Univ.
–Greg Crawford, collaborator
Univ. Queensland
–Andrew Perkins, collaborator
NHGRI
–Laura Elnitski, collaborator
3
Aims of Global tests and predictions of erythroid regulation
4
Hematopoiesis
5
Major erythroid transcription factors
Factor        Class         Mode of discovery
GATA1         Zn finger     binds globin locus
NF-E2         bZIP          binds globin locus
KLF1/EKLF     Zn finger     subtractive hybridization
SCL/TAL1      bHLH          rearranged in leukemias
GFI1b         Zn finger     oncoviral integration site
ZBTB7a/LRF    POZ-Kruppel   proto-oncogene lymphomas
Others
6
Transcription factor GATA-1
Figure labels: Globin genes, -GATA-, GATA-1
Founding member of a small family of proteins
- GATA-2 GATA-6
Binds functionally important cis WGATAR motifs in regulatory regions of many hematopoietic genes
Essential for erythroid and megakaryocyte development
- gene knockout studies in mice
- analysis of human patients
7
Figure labels: Lineage-restricted factors; cofactors; histone modifying enzymes; Transcription
8
Gene activation by alterations in chromatin
“Regulatory signals entering the nucleus encounter chromatin, not DNA, and the rate-limiting biochemical response that leads to activation of gene expression in most cases involves alterations in chromatin structure. How are such alterations achieved?”
–Gary Felsenfeld & Mark Groudine (2003) Controlling the double helix. Nature 421: 448-453
“It is now generally argued that reorganization of these chromatin structures is a process that is mechanistically linked to many gene activation or repression events, and is initiated by the action of site-specific transcription factors, acting either through ATP-driven nucleosome remodeling machines, or via the action of enzymes that covalently modify various components of the chromatin structure.”
–Sam John, … John A. Stamatoyannopoulos, and Gordon L. Hager (2008) Interaction of the Glucocorticoid Receptor with the Chromatin Landscape. Molecular Cell 29, 611–624.
“These results imply that GATA-1 is sufficient to direct chromatin structure reorganization within the beta-globin LCR and an erythroid pattern of gene expression in the absence of other hematopoietic transcription factors.”
–Layon ME, Ackley CJ, West RJ, Lowrey CH. (2007) Expression of GATA-1 in a non-hematopoietic cell line induces beta-globin locus control region chromatin structure remodeling and an erythroid pattern of gene expression. J Mol Biol. 366:737-744.
9
Chromatin transitions before and after TF binding
“We propose four specific kinds of interaction. The classical mode for GR binding to chromatin involves receptor-dependent recruitment of the Swi/Snf complex (1), resulting in a hormone-dependent hypersensitive transition. It is now clear, however, that some hormone-dependent events must involve other remodeling species (2). Furthermore, many GR binding events are associated with pre-existing transitions, and these constitutive events fall again into two classes, Brg1 dependent (3) and Brg1 independent (4).”
–Sam John, … John A. Stamatoyannopoulos, and Gordon L. Hager (2008) Interaction of the Glucocorticoid Receptor with the Chromatin Landscape. Molecular Cell 29, 611–624.
10
Order of events in activation can vary
Figure 5. Models Depicting Different Orders of Action by Regulators and Chromatin-Remodeling Complexes
Regulators, HAT complexes, and ATP-dependent remodeling complexes can act in different orders (pathway A, B, or C) and still give the same end result: a template competent for transcription. Although not shown, it is also possible that binding by the general transcription factors precedes the action and recruitment of HAT complexes and ATP-dependent remodelers.
–Geeta J. Narlikar, Hua-Ying Fan and Robert E. Kingston (2002) Cooperation between Complexes that Regulate Chromatin Structure and Transcription. Cell 108: 475-487
11
What biochemical events precede and follow GATA1 binding?
Where should we look in the genome?
–Segments around the transcription start site (TSS)
  all genes
  expressed vs nonexpressed genes
  all GATA1-responsive genes
  –induced vs repressed genes
–All TF-occupied segments (OSs)
–Distal TF OSs (i.e. outside the TSS zone)
–All mappable regions
  Much larger computation
Treat levels of biochemical marks as
–continuous variables
–discrete segments (bound or not, histones modified or not)
Categorize genes and OSs by the order of events
–Category 1. Co-occupancy by other TFs, histone modifications, and DHS formation occur after the TF1 of interest binds or is activated
–Category 2. Co-occupancy by other TFs, histone modifications, and DHS formation occur before the TF1 of interest binds or is activated
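One way to make the two categories concrete is to use the discrete treatment of marks (present or not) and compare a call made before GATA1 is restored with a call made after. The sketch below assumes that G1E stands for "before" and G1E-ER4 + E2 for "after"; the function name, the extra "lost" and "absent" labels, and the toy calls are hypothetical and are not taken from the slides.

```python
def order_category(present_before, present_after):
    """Classify one feature at one segment by when it appears relative to GATA1.

    present_before: feature called in cells without active GATA1 (assumed: G1E)
    present_after:  feature called after GATA1 is restored (assumed: G1E-ER4 + E2)
    """
    if present_before and present_after:
        return "Category 2"  # feature is already there before GATA1 binds
    if present_after:
        return "Category 1"  # feature appears only after GATA1 binds
    if present_before:
        return "lost"        # not one of the slide's two categories
    return "absent"          # not one of the slide's two categories


# Toy calls (made up for illustration): feature -> (before, after)
calls = {
    "TAL1": (True, True),
    "DHS": (False, True),
    "H3K27me3": (False, False),
}
categories = {name: order_category(b, a) for name, (b, a) in calls.items()}
print(categories)
```

The same function could be applied per gene or per OS once each feature has been reduced to a bound/not-bound call in both cell states.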
12
Cell-based models to study GATA1 function
Figure labels: Gata1 – ES cells; in vitro hematopoietic differentiation: erythropoietin, stem cell factor, thrombopoietin; immature hematopoietic cell lines; add back GATA-1; G1E; G1ME; erythroid + megakaryocyte; erythroid; G1E-ER4 + estradiol
13
Global analysis of GATA1-regulated erythroid gene expression in G1E cells (1999-2009)
Stably expressing estrogen-activated GATA-1 (GATA-1-ER)
+ estradiol
14
Transcriptome analysis
Affymetrix gene chip
–U74 array: 12,500 probesets, 9,266 genes
–430 2.0 array: 45,000 probesets, 19,000 genes
hrs in estradiol: 0, 3, 7, 14, 21, 30
Figure panels: hemoglobin, morphology
Blood 2004; Genome Res 2009
15
Kinetics of GATA1-regulated Gene Expression
Affy 430 2.0
GATA1-induced (>2-fold): 1048 genes
–known targets
–new gene discovery
GATA1-repressed (>2-fold): 1568 genes
–stem cell/progenitor markers
–proto-oncogenes (Kit/Myc/Myb)
–function unknown
16
Factor occupancy and GATA1 responses
60 megabase region of chromosome 7
–identify new GATA1-regulated genes
–define combinatorial TF interactions
–correlate histone marks w/ TF occupancy and gene expression
17
Direct and indirect effects in repression and activation
18
Datasets available and in progress
Feature: G1E | G1E-ER4 + E2
DNase hypersensitive sites: In progress | Done
GATA1 occupancy: No | Done
TAL1 occupancy: Done | Done, will repeat
GATA2 occupancy: Done
CTCF occupancy: Done
H3K4me1: Done
H3K4me3: Done
H3K27me3: Done
RNA polymerase II: Repeating it
RNA seq: In progress
19
Western blots show specificity of antibodies and presence of proteins in cells
Figure: Western blot panels for α-GATA1, α-GATA2, α-CTCF and α-TAL1; lanes G1E, G1E-ER4, MEL, CH12; marker positions 125, 101, 56.2, 35.8; bands labeled GATA1-ER, GATA1, GATA2, CTCF, TAL1
CH12 are B-lymphoid cells; others are erythroid.
Cheryl Keller Capone
20
Major observations under investigation
GATA1 binds to a majority of the DNA segments occupied by TAL1 in G1E-ER4 cells (+E2). However, over half of these segments are occupied by TAL1 prior to restoration of GATA1.
–Only a minority are at GATA2 occupied segments (OSs)
TAL1 seems to be redistributed around some target loci
–Change gradient in TAL1 from HS6>HS1 to HS1>HS6 in Hbb LCR
Large changes in histone modifications are not observed after restoring and activating GATA1
–But some “small” changes are observed
Level of GATA1 occupancy is similar in mouse (G1E-ER4+E2 cells) and human (K562 cells), but only a small minority of occupied segments are shared
–About 15,000 GATA1 OSs in each species
–About 1,000 GATA1 OSs are shared
21
Hbb locus and surrounding OR genes
TAL1 redistributes when GATA1 is restored
ChIPseq fits with previous data
PolII and TAL1 are recruited to Hbb genes when GATA1 is restored
22
Zfpm1
Induced immediately after GATA1-ER is activated
TAL1 occupancy corresponds to GATA2 OS in G1E
23
c-Kit
Repressed after GATA1-ER is activated
TAL1 occupancy at GATA1 OSs, may correspond to GATA2 OS in G1E
Loss of TAL1 occupancy correlates with repression
24
Changes in peaks of occupancy, co-occupancy
Start with peak calls from MACS for all the TF OS and from Fseq for DNase hypersensitive sites
Define overlapping segments as those sharing at least one nucleotide
Use set operations tools in Galaxy to find overlapping segments
Compare OS for each TF +/- GATA1 and find overlaps between TFs
Chris Morrissey
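The overlap rule on this slide (two segments overlap if they share at least one nucleotide) can be sketched in a few lines. This is not the Galaxy set-operations tool itself; it is a minimal stand-in that assumes BED-style half-open coordinates (start inclusive, end exclusive), and the function name and toy segments are hypothetical.

```python
from collections import defaultdict


def overlapping_segments(query, reference):
    """Return the query segments that share >= 1 nucleotide with any reference segment.

    Segments are (chrom, start, end) tuples with half-open coordinates
    (an assumption; the slides do not state the coordinate convention).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in query:
        # Half-open intervals share a base exactly when each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in by_chrom.get(chrom, [])):
            hits.append((chrom, start, end))
    return hits


# Toy segments, made up for illustration
gata1_os = [("chr7", 100, 200), ("chr7", 500, 600), ("chr1", 100, 200)]
tal1_os = [("chr7", 199, 300), ("chr7", 600, 700)]
shared = overlapping_segments(gata1_os, tal1_os)
print(shared)
```

The scan is quadratic per chromosome, which is fine for a sketch; for tens of thousands of peaks a sorted sweep or an interval tree would be the usual choice.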
25
TAL1 has the most overlap with GATA1
TAL1_G1E: 6,930
TAL1_ER4: 7,449
Overlap (TAL1_G1E ∩ TAL1_ER4): 2,777
GATA1: 15,361
GATA1 ∩ TAL1_G1E: 4,269
GATA1 ∩ TAL1_ER4: 4,443
GATA1 ∩ Overlap: 2,544
Chris Morrissey
26
CTCF expands, but doesn't move
CTCF_G1E: 15,757
CTCF_ER4: 27,909
Overlap (CTCF_G1E ∩ CTCF_ER4): 14,982
GATA1: 15,361
GATA1 ∩ CTCF_G1E: 555
GATA1 ∩ CTCF_ER4: 932
GATA1 ∩ Overlap: 528
Chris Morrissey
27
GATA2 moves a lot (?)
GATA2_G1E: 2,077
GATA2_ER4: 10,759
Overlap (GATA2_G1E ∩ GATA2_ER4): 356
GATA1: 15,361
GATA1 ∩ GATA2_G1E: 465
GATA1 ∩ GATA2_ER4: 178
GATA1 ∩ Overlap: 32
Chris Morrissey
But is this just an artifact of noisy GATA2 data? Seems like this would be an ideal application for Yu Zhang and Kuan-Bei’s improvement in peak calling by using ChIP data on other proteins. - RH
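The counts on the TAL1, CTCF and GATA2 overlap slides can be turned into fractions to check the slide titles. The sketch below uses only the numbers printed on those three slides; the dictionary keys and the three summary names (`retained`, `with_gata1`, `of_gata1`) are hypothetical labels chosen here.

```python
GATA1_TOTAL = 15361  # GATA1 OSs in G1E-ER4 + E2

# Counts copied from the three overlap slides
counts = {
    "TAL1": {"g1e": 6930, "er4": 7449, "both": 2777, "gata1_er4": 4443},
    "CTCF": {"g1e": 15757, "er4": 27909, "both": 14982, "gata1_er4": 932},
    "GATA2": {"g1e": 2077, "er4": 10759, "both": 356, "gata1_er4": 178},
}


def summarize(c):
    return {
        # fraction of the factor's G1E segments still occupied in ER4
        "retained": c["both"] / c["g1e"],
        # fraction of the factor's ER4 segments that are also GATA1 OSs
        "with_gata1": c["gata1_er4"] / c["er4"],
        # fraction of all GATA1 OSs co-occupied by the factor in ER4
        "of_gata1": c["gata1_er4"] / GATA1_TOTAL,
    }


summary = {tf: summarize(c) for tf, c in counts.items()}
for tf, s in summary.items():
    print(tf, {k: round(v, 3) for k, v in s.items()})
```

Read this way, most CTCF segments present in G1E persist in ER4 while few GATA2 segments do, and TAL1 co-occupies a larger share of GATA1 OSs than either of the other two factors.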
28
Compute ratios of signals in G1E and ER4, adjust by M vs A plot
1. For each 10bp bin, we have tag counts for both G1E and ER4 (tagcnts_g1e and tagcnts_er4) and the number of total mapped reads in G1E and ER4 (reads_g1e and reads_er4, both in millions).
2. Calculate rpm_g1e=(tagcnts_g1e+1)/reads_g1e; rpm_er4=(tagcnts_er4+1)/reads_er4. (The reason to do +1 is to remove zeros.)
3. Calculate M=log2(rpm_er4/rpm_g1e); A=0.5*log2(rpm_er4*rpm_g1e).
4. Do an MA-plot by plotting M versus A, and build a lowess line through the dots.
5. Based on the lowess regression, for each "A", predict a value "P".
6. Calculate M'=M-P; this M' stands for the difference between ER4 and G1E.
Weisheng Wu and F. Chiaromonte
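The six steps can be sketched in plain Python. Steps 2, 3 and 6 follow the formulas on the slide directly. For steps 4 and 5 the sketch substitutes a single-pass, tricube-weighted local linear fit for a full lowess (no robustness iterations), and the span `frac=0.5` is an assumption, since the slide does not give the lowess settings. Function names are hypothetical.

```python
import math


def local_linear_fit(a, m, frac=0.5):
    """Simplified lowess stand-in: tricube-weighted local line, evaluated at each A."""
    n = len(a)
    k = max(2, math.ceil(frac * n))  # number of neighbours that set the bandwidth
    fitted = []
    for x0 in a:
        dists = sorted(abs(x - x0) for x in a)
        h = dists[min(k, n) - 1] or 1e-12  # bandwidth = distance to k-th nearest point
        w = [max(0.0, 1 - (abs(x - x0) / h) ** 3) ** 3 for x in a]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, a)) / sw
        my = sum(wi * y for wi, y in zip(w, m)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, a))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def ma_lowess_adjust(tagcnts_g1e, tagcnts_er4, reads_g1e, reads_er4, frac=0.5):
    """Return M' = M - P for each bin; reads_* are total mapped reads in millions."""
    # Step 2: reads per million with a +1 pseudocount to remove zeros
    rpm_g1e = [(c + 1) / reads_g1e for c in tagcnts_g1e]
    rpm_er4 = [(c + 1) / reads_er4 for c in tagcnts_er4]
    # Step 3: M (log ratio) and A (mean log signal)
    m = [math.log2(e / g) for g, e in zip(rpm_g1e, rpm_er4)]
    a = [0.5 * math.log2(e * g) for g, e in zip(rpm_g1e, rpm_er4)]
    # Steps 4-5: predicted value P at each A from the local fit
    p = local_linear_fit(a, m, frac)
    # Step 6: adjusted log ratio
    return [mi - pi for mi, pi in zip(m, p)]


# Toy bins (made up): ER4 signal is uniformly twice the G1E signal,
# so the raw M is constant and the adjustment should remove it.
adjusted = ma_lowess_adjust([1, 3, 7, 15, 31], [3, 7, 15, 31, 63], 10.0, 10.0)
print(adjusted)
```

The fit as written is quadratic in the number of bins, so it only suits small inputs; genome-wide 10bp bins would call for a library lowess or a subsampled fit.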
29
Effect of adjusting ratios by lowess of an M vs A plot Weisheng Wu
30
Correlations among chromatin features and expression in TSS segments
Examine 4kb DNA segments centered on transcription start sites (TSSs) for all genes
Determine mean signal for TF occupancy, histone modifications in each
Compute Log2 of ratios, MA lowess adjustment
Determine expression levels of genes and change between G1E and ER4
Draw scatterplots and determine correlations for all pairwise comparisons
Weisheng Wu
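A minimal sketch of the per-TSS bookkeeping described here: a 4kb window centered on the TSS, a mean signal per window, and a pairwise correlation across genes. Pearson correlation is an assumption (the slide does not say which correlation was used), and the function names and toy signal values are hypothetical.

```python
import math


def tss_window(tss, width=4000):
    """Return (start, end) of a window of the given width centered on the TSS."""
    half = width // 2
    return max(0, tss - half), tss + half


def mean_signal(bin_values):
    """Mean of the per-bin signal values that fall inside one window."""
    return sum(bin_values) / len(bin_values)


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy per-gene mean signals (made up for illustration)
features = {
    "GATA1": [1.0, 2.0, 3.0, 4.0],
    "TAL1": [2.0, 4.0, 6.0, 8.0],
    "H3K27me3": [8.0, 6.0, 4.0, 2.0],
}
names = sorted(features)
pairwise = {
    (a, b): pearson(features[a], features[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
print(tss_window(10000), pairwise)
```

In practice each list would hold one value per gene after the log2 ratio and MA adjustment, and the same loop would cover every pair of features plus expression.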
31
Correlations in G1E-ER4 cells +E2
Same sets of graphs for G1E and ratio of signals have been done
Weisheng Wu
32
Notable pairwise correlations in TSSs
Co-occupancy by GATA1 and TAL1
Positive correlation with Trx marks (H3K4me)
Negative correlation with Pc marks (H3K27me3)
Weisheng Wu
33
Limited explanatory power for CHANGE in expression
34
Changes (if any) in biochemical features at TSS show little or no difference between induced and repressed genes
black: TSSs of all genes, red: up, blue: down, green: non-responsive
Weisheng Wu
35
Major results from pairwise correlations in GATA1os
36
Distribution of the changes of the TFs/HMs at GATA1os
37
The changes of HMs do not differ much between induced and repressed genes at GATA1os
38
Principal components in genomic features at TSSs
PCA Dataset: Raw counts of all factors, in G1E and ER4 models, in 4kb window around TSS
                   C1      C2       C3      C4       C5       C6      C7      C8      C9      C10
Std-dev            3.1743  0.8682   0.4928  0.2537   0.2253   0.1069  0.0982  0.0739  0.062   0.0507
Proportion of Var  0.8978  0.06716  0.0216  0.00573  0.00452  0.0010  0.0008  0.0004  0.0003  0.0002
Cumulative Var     0.8978  0.9649   0.9866  0.9923   0.9968   0.9979  0.9987  0.9992  0.9996  0.9998
Swathi A. Kumar
39
Principal components in genomic features at TSSs
PCA Dataset: Raw counts of all factors, in G1E and ER4 models, in 4kb window around TSS
                   C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
Std-dev            1.958  1.632  1.317  1.106  0.964  0.846  0.734  0.675  0.595  0.500
Proportion of Var  0.295  0.205  0.134  0.094  0.072  0.055  0.041  0.035  0.027  0.019
Cumulative Var     0.295  0.500  0.634  0.728  0.799  0.854  0.895  0.931  0.958  0.977
Swathi A. Kumar
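The three rows of a PCA summary are linked: each component's variance is its standard deviation squared, and its proportion is that variance divided by the total variance. Only ten components are listed, so the total is not on the slide; the sketch below infers it from the first component's listed proportion, which is an assumption, and then checks that the remaining listed proportions follow. The values are the ones on this slide (standard deviations starting at 1.958).

```python
std_dev = [1.958, 1.632, 1.317, 1.106, 0.964, 0.846, 0.734, 0.675, 0.595, 0.500]
listed_prop = [0.295, 0.205, 0.134, 0.094, 0.072, 0.055, 0.041, 0.035, 0.027, 0.019]

# Variance of each component is the square of its standard deviation
variances = [s ** 2 for s in std_dev]

# Assumption: total variance inferred from C1, since not every component is listed
implied_total = variances[0] / listed_prop[0]

# Proportion of variance and its running sum
recomputed = [v / implied_total for v in variances]
cumulative = []
running = 0.0
for p in recomputed:
    running += p
    cumulative.append(running)

for i, (p, c) in enumerate(zip(recomputed, cumulative), start=1):
    print(f"C{i}: proportion={p:.3f} cumulative={c:.3f}")
```

The recomputed proportions agree with the listed row to within rounding, which supports the column boundaries used to read the flattened table.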
40
PCA results Swathi A. Kumar
41
H3K4me3 is major contributor to variance Swathi A. Kumar
42
Linear Discriminant Analysis
Same dataset as used in pairwise correlations of features in TSS segments
Initial run
–Binary response of induced or repressed expression
  Responsive vs Non-responsive
  Induced vs Repressed
–Predictor variables are raw counts of GATA1 and other associated transcription/histone factors, before and after induction.
–Leave-one-out cross validation
Swathi A. Kumar
43
Major results from LDA
Responsive vs Non-responsive: misclassification rate = 0.007%
                   Real Response
Allocated          Responsive   Non-responsive
Responsive         3168         807
Non-responsive     877          5953
Induced vs repressed: misclassification rate = 36.7%
                   Real Response
Allocated          Induced      Repressed
Induced            3262         1958
Repressed          526          1014
Swathi A. Kumar
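A misclassification rate is the off-diagonal share of the allocation table. As a worked check, reading the induced vs repressed table as 3262 and 1958 in the "allocated Induced" row and 526 and 1014 in the "allocated Repressed" row reproduces the 36.7% stated on the slide. The variable names below are hypothetical.

```python
# (allocated class, real class) -> count, induced vs repressed run
confusion = {
    ("Induced", "Induced"): 3262,
    ("Induced", "Repressed"): 1958,
    ("Repressed", "Induced"): 526,
    ("Repressed", "Repressed"): 1014,
}

total = sum(confusion.values())
# Misclassified cases are those where the allocated class differs from the real class
wrong = sum(n for (allocated, real), n in confusion.items() if allocated != real)
rate = wrong / total

print(f"{wrong} of {total} misclassified ({rate:.1%})")
```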
44
Shared vs lineage-specific GATA1 OSs
ChIP-seq reads in G1E-ER4 + E2 for mouse GATA1 OSs
ChIP-seq reads in K562 for human GATA1 OSs
Map each to their respective genomes (ELAND)
MACS peak calls
–15,000 in each
LiftOver each set of peaks to other species
–About 10,000 liftOver in each
Run intersection of the liftedOver peaks
–1000 are shared in both species
Yong Cheng, Kuan-Bei Chen
45
Shared GATA1 OSs show higher occupancy level
ChIPseq signal for GATA1 in each OS
Yong Cheng
46
Genes close to shared GATA1 OSs are enriched for well-known erythroid functions
47
Shared GATA1 OSs are enriched in induced genes
                                             Up    Down
Genes with shared GATA1 OSs within 10kb      205   87
Genes without shared GATA1 OS within 10kb    637   704
Yong Cheng
48
Preserved GATA1 motifs are enriched in the shared GATA1 OSs
                         Shared   Not shared
Preserved WGATAR         704      2090
w/o preserved WGATAR     252      6391
Yong Cheng
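The enrichment claim can be quantified from the 2x2 table. The sketch reads the table as 704 and 2090 (preserved WGATAR; shared, not shared) and 252 and 6391 (without preserved WGATAR); that reading gives 704 + 252 = 956 shared OSs, matching the 956 shared OSs reported elsewhere in the deck. The odds ratio is one common summary and is an addition here, not a statistic from the slides.

```python
# Rows: motif status; columns: shared vs not shared GATA1 OSs
preserved = {"shared": 704, "not_shared": 2090}
not_preserved = {"shared": 252, "not_shared": 6391}

shared_total = preserved["shared"] + not_preserved["shared"]
not_shared_total = preserved["not_shared"] + not_preserved["not_shared"]

# Fraction of OSs in each column that carry a preserved WGATAR motif
frac_shared = preserved["shared"] / shared_total
frac_not_shared = preserved["not_shared"] / not_shared_total

# Odds ratio for "preserved motif" given "shared" vs "not shared"
odds_ratio = (preserved["shared"] * not_preserved["not_shared"]) / (
    preserved["not_shared"] * not_preserved["shared"]
)

print(f"shared: {frac_shared:.1%}, not shared: {frac_not_shared:.1%}, OR = {odds_ratio:.2f}")
```

An odds ratio well above 1 is what "enriched" means here; a significance test such as Fisher's exact test would be the natural next step but is not shown on the slide.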
49
Genome-wide turnover analysis in mouse GATA1 occupied intervals (mm8.gata1)
A: occupied intervals: 15,360
B: occupied intervals with binding motifs in reference sequence: 12,202
B/A: 79.44%
C: occupied intervals with rodent-specific binding motifs: 6,565
C/B: 53.80%
D: occupied intervals with primate-rodent compensatory pattern: 1,383
D/B: 11.33%
E: occupied intervals with shared motif between human and mouse: 3,018
E/B: 24.73%
B is under-estimated since we are using alignments instead of the mouse sequence itself
D: the compensatory motifs have a minimum distance of 10 bp
Kuan-Bei Chen
50
Genome-wide turnover analysis in human
        hg18.gata1   hg18.myc   hg18.ctcf   hg18.gabp
A       15,252       11,583     26,976      6,442
B       12,331       9,315      24,913      2,958
B/A     80.85%       80.42%     92.35%      45.92%
C       6,348        4,452      11,297      1,290
C/B     51.48%       47.79%     45.35%      43.61%
D       656          1,200      2,255       348
D/B     5.32%        12.88%     9.05%       11.76%
E       2,446        3,397      12,649      829
E/B     19.84%       36.47%     50.77%      28.03%
A: occupied intervals
B: occupied intervals with binding motifs in reference sequence
C: occupied intervals with primate-specific binding motifs
D: occupied intervals with primate-rodent compensatory pattern
E: occupied intervals with shared motif between human and mouse
Kuan-Bei Chen
51
Patterns of GATA1 binding sites in shared human/mouse OSs
Out of 956 shared OSs:
- 200 OSs have rodent-specific GATA1 motifs, which are not present in human, chimp, rhesus, dog and cow.
- 626 OSs have GATA1 motifs shared by human and mouse.
- 76 OSs have primate-rodent compensatory patterns.
Kuan-Bei Chen
52
Investigators on “PSU” mouse ENCODE
Penn State
–Hardison
–Stephan Schuster
–Frank Pugh
–Robert Paulson
–Francesca Chiaromonte, OSC
–Yu Zhang, OSC
–Webb Miller, OSC
–Anton Nekrutenko, OSC
Children’s Hospital of Philadelphia
–Mitch Weiss
–Gerd Blobel
Emory Univ.
–James Taylor
Univ. Massachusetts
–Job Dekker
Duke Univ.
–Greg Crawford, consultant
–Terry Furey, consultant
Cal Tech
–Barbara Wold
53
Aims of “PSU” mouse ENCODE
54
Conservation of function, conservation of sequence (or not)