What is the Best Way to Find the Binding Site for a Transcription Factor? Dennis Shasha, Courant Institute, New York University. With Philip Benfey and Ken Birnbaum, Biology Department, New York University.
Transcriptional Networks: induction, repression, specificity, modularity. [Figure: network diagram with a cis-element example, shown at Time 1, Time 2, and Time 3.]
Genomic and Expression Data: cis-regulatory regions of expressed genes. [Figure: expression level vs. time.]
Clusters of Co-Expressed Genes. [Figure: expression level vs. time for a gene cluster; over-represented motifs.]
Low Correlation. [Figure: expression level vs. time for a transcription factor and its downstream genes.]
Modularity. [Figure: later expression module and early specification module.] From: Arnone & Davidson, Development, 1997.
Transcription Factor X. [Figure: expression vs. time.] NAAAAAA.. ..TTTTTTN
Expression levels of genes with ACA?GTC in their promoters. [Figure, built up over three slides: expression vs. time for Transcription Factor X, alongside genes A, B, and C, each with ACA?GTC in its cis-regulatory region.]
Composite expression of ACAGTC at Time 5. Expression levels at Time 5 of genes with ACA?GTC in their cis-regulatory regions: gene A = 2, gene B = 4, gene C = 2; composite = 2 + 4 + 2 = 8. [Figure: expression vs. time for Transcription Factor X and for the composite expression of genes with ACA?GTC in their promoters.]
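The composite calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that `?` in the motif is a single-base wildcard, that only the given strand is searched, and that the composite is a plain sum. The promoter sequences and the first time point are invented; only the Time 5 values (2, 4, 2) come from the slide.

```python
import re

def motif_to_regex(motif):
    # Assumption: '?' stands for any single base.
    return re.compile("".join("[ACGT]" if ch == "?" else ch for ch in motif))

def composite_expression(motif, promoters, expression):
    """Sum, per time point, the expression of every gene whose promoter contains the motif."""
    pattern = motif_to_regex(motif)
    genes = [gene for gene, seq in promoters.items() if pattern.search(seq)]
    n_times = len(next(iter(expression.values())))
    return [sum(expression[gene][t] for gene in genes) for t in range(n_times)]

# Hypothetical promoter sequences; A, B, and C contain ACA?GTC, D does not.
promoters = {
    "A": "TTGACATGTCAA",
    "B": "ACAAGTCGG",
    "C": "GGACACGTCTT",
    "D": "TTTTGGGGCCCC",
}
# Two time points per gene; the second column holds the slide's Time 5 values.
expression = {"A": [1, 2], "B": [3, 4], "C": [0, 2], "D": [9, 9]}

composite = composite_expression("ACA?GTC", promoters, expression)
print(composite)
```

Gene D is excluded because its promoter lacks the motif, so its high expression does not contribute to the composite.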
Cooperative Binding Model: TFs AND Binding Sites. [Figure, built up over several slides: expression levels of TFs X and Z in cell types A, B, and C, and the resulting target gene expression level in cell types A, B, and C.]
Cooperative Binding Model TFs OR Binding Sites
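One simple way to make the AND and OR cases concrete is to combine TF expression across cell types with `min` and `max`. The slides do not state how the models are computed, so this encoding is an assumption chosen for illustration, and the expression values below are invented.

```python
def and_model(tf_levels):
    """Target is on only where every TF is expressed: take the per-cell-type minimum."""
    return [min(levels) for levels in zip(*tf_levels)]

def or_model(tf_levels):
    """Target is on where any TF is expressed: take the per-cell-type maximum."""
    return [max(levels) for levels in zip(*tf_levels)]

# Hypothetical expression of TFs X and Z in cell types A, B, C.
x = [1, 1, 0]
z = [0, 1, 1]

predicted_and = and_model([x, z])
predicted_or = or_model([x, z])
print(predicted_and, predicted_or)
```

Under the AND encoding the target is predicted only in cell type B, where both X and Z are present; under OR it is predicted in all three cell types.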
Independent Binding Model: well handled by Bussemaker et al.
Assumptions: (1) TF RNA expression = TF protein (caveat: protein movement); (2) TFs are active where they are expressed (caveat: co-factors); (3) binding sites are within 2 kb of the initiation site.
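The third assumption defines the search window. A minimal sketch of extracting that window follows; the function name `upstream_region`, the 0-based coordinate convention, and the strand handling are illustrative assumptions rather than details taken from the talk.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def upstream_region(chromosome, start, strand, window=2000):
    """Return up to `window` bases upstream of the initiation site.

    `start` is the 0-based index of the first transcribed base.
    For a '-' strand gene, upstream lies at higher coordinates, and the
    region is reverse-complemented so it reads 5' to 3' toward the gene.
    """
    if strand == "+":
        return chromosome[max(0, start - window):start]
    return reverse_complement(chromosome[start + 1:start + 1 + window])

# Toy chromosome with a small window so the result is easy to check by eye.
chromosome = "AAACCCGGGTTT"
plus_region = upstream_region(chromosome, 6, "+", window=3)
minus_region = upstream_region(chromosome, 5, "-", window=3)
print(plus_region, minus_region)
```

With the default `window=2000`, the returned sequence is the 2 kb region that would then be scanned for candidate binding sites.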
Testing the Method: Yeast. Fully sequenced genome; genome-wide mRNA expression profiles; 300 knockout lines from Rosetta (Hughes et al. 2000); 2 datasets on yeast progressing through the cell cycle (Spellman et al. 1998; Cho et al. 1998).
Results for STE12. *From SCPD (Zhang, Cold Spring Harbor)
Overall Results
Conclusions. Technique: correlate transcription factor expression with cis-element expression. Can capture information that would be missed by gene expression correlation. Can handle cooperative (AND) and independent (OR) cases. Does less well for complex circuits. Future effort: manipulate promoters to eliminate false positives (information theory + experiments).
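The core technique, correlating a TF's expression profile with each motif's composite expression profile, can be sketched as follows. Pearson correlation and ranking by absolute value (so that repression, which would appear as a negative correlation, is not discarded) are assumptions made for this illustration; the talk does not specify the statistic. The profiles and the second motif are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_motifs(tf_profile, composites):
    """Score each candidate motif by correlation with the TF; strongest first."""
    scored = [(motif, pearson(tf_profile, profile))
              for motif, profile in composites.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical TF profile and composite profiles for two candidate motifs.
tf_profile = [1, 2, 3, 4]
composites = {
    "ACA?GTC": [2, 4, 6, 8],
    "TTT?AAA": [5, 1, 4, 2],
}

ranked = rank_motifs(tf_profile, composites)
for motif, r in ranked:
    print(motif, round(r, 3))
```

The top-ranked motif is the candidate binding site; as the conclusions note, such candidates can still be false positives and would need experimental follow-up.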