Published by Daisy Williamson. Modified over 9 years ago.
1
Promoter and Module Analysis
Statistics for Systems Biology
2
Transcription Factors
DNA-binding proteins that facilitate or inhibit Pol II initiation or elongation
General transcription factors:
  – Used widely for many genes under many circumstances
Specific transcription factors:
  – Used to initiate specific genes under specific circumstances
The distinction may not be so sharp!
3
Transcription Factor Families
Several structures line up amino acids:
  – Helix-turn-helix (homeodomain)
  – Helix-loop-helix
  – Zinc finger
Mostly dimers
These families have proliferated because of their role in attracting the transcription apparatus
4
DNA-Binding Proteins
All proteins interact weakly with DNA
Proteins with projecting amino acids interact with the DNA major groove
Hydrogen bonds stabilize the position of proteins on DNA
Proteins that line up several amino acid contacts bind strongly to specific DNA sequences
5
Transcription Factor Recognition Sites
Typically 6-10 positions are very selective, and several others show bias
The selectivity profile is often summarized by a ‘motif’
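One common way to summarize such a selectivity profile is a position weight matrix (PWM). The sketch below is illustrative only: the aligned sites are made up, and the pseudocount and the uniform background frequency of 0.25 are assumptions, not values from the slides.

```python
import math

BASES = "ACGT"

def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per position of the aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm

def score(pwm, sequence):
    """Sum the per-position log-odds scores for a sequence of motif length."""
    return sum(column[base] for column, base in zip(pwm, sequence))

# Made-up aligned sites, for illustration only (not real binding data).
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = position_weight_matrix(toy_sites)
consensus_score = score(pwm, "TGACTCA")   # matches the most common base everywhere
unrelated_score = score(pwm, "ACGTACG")   # mismatches at most positions
```

Positions where every toy site agrees contribute the largest scores, while positions with mixed bases contribute less, which mirrors the "very selective" versus "show bias" distinction on the slide.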
6
Selectivity of Specific TFs
Most TFs recognize 6-10 bases of DNA
E. coli: longer (8-12 bp) TF recognition sites
  – All sequences are effective
Yeast: areas around promoters are selectively cleared of nucleosomes
  – ~30× accessibility for those areas
Animal: cooperative binding of several TFs
7
Cofactors
Frequently the effect of DNA-binding proteins depends on cofactors
E.g. ER (estrogen receptor) sits on the DNA but requires estrogen as a cofactor to function
Myc requires Max as a cofactor to stimulate transcription
If Max is coupled with Mad instead, the genes are repressed
8
Assembly of Transcription Apparatus
A change in the physical conformation of DNA increases the likelihood of spontaneous assembly of Pol II
Getting Pol II further into the gene seems to require further steps
9
The TF Family Circus
10
Inferring Regulatory Architecture
Aim: to find which regulators influence gene expression
Concerns:
  – Contributions of many factors to any one gene
Approaches:
  – Decision tree (computer science)
  – Regression (more statistical)
DNA sequence motifs can be a surrogate for TF binding
11
The Israeli ‘Module’ Approach
Idea: model TF binding as a ‘decision tree’
Steps:
  1. Cluster gene expression profiles
  2. Fit the best regulator tree to each cluster
  3. Re-assign genes to clusters
Iterate until convergence
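The iterate-until-convergence loop can be sketched as follows. This is a heavily simplified illustration, not the published method: each "regulator tree" is reduced to a single split on one regulator, fitted to the mean profile of the module, and the initial clustering (step 1) is assumed to be given. All function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values) if values else 0.0

def fit_stump(profile, regulators):
    """Pick the regulator and threshold whose one-split tree best fits profile."""
    best = None
    for name, levels in regulators.items():
        for threshold in sorted(set(levels))[:-1]:
            low = [p for p, r in zip(profile, levels) if r <= threshold]
            high = [p for p, r in zip(profile, levels) if r > threshold]
            low_mean, high_mean = mean(low), mean(high)
            error = (sum((p - low_mean) ** 2 for p in low)
                     + sum((p - high_mean) ** 2 for p in high))
            if best is None or error < best[0]:
                best = (error, name, threshold, low_mean, high_mean)
    return best[1:]  # assumes at least one regulator varies across conditions

def predict(stump, regulators):
    name, threshold, low_mean, high_mean = stump
    return [low_mean if r <= threshold else high_mean for r in regulators[name]]

def squared_error(xs, ys):
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

def fit_modules(expression, regulators, assignment, max_iter=20):
    """Alternate tree fitting (step 2) and gene re-assignment (step 3)."""
    for _ in range(max_iter):
        modules = sorted(set(assignment.values()))
        stumps = {}
        for module in modules:
            members = [expression[g] for g, m in assignment.items() if m == module]
            centroid = [mean(column) for column in zip(*members)]
            stumps[module] = fit_stump(centroid, regulators)
        predictions = {m: predict(s, regulators) for m, s in stumps.items()}
        new_assignment = {
            gene: min(modules, key=lambda m: squared_error(profile, predictions[m]))
            for gene, profile in expression.items()
        }
        if new_assignment == assignment:  # converged: no gene changed module
            break
        assignment = new_assignment
    return assignment, stumps

# Toy data over four conditions (made up): g1, g2 follow R1; g3, g4 follow R2.
regulators = {"R1": [0, 0, 1, 1], "R2": [0, 1, 0, 1]}
expression = {
    "g1": [0.1, 0.0, 1.0, 0.9],
    "g2": [0.0, 0.2, 1.1, 1.0],
    "g3": [0.0, 1.0, 0.1, 0.9],
    "g4": [0.1, 0.9, 0.0, 1.1],
}
initial = {"g1": 0, "g2": 1, "g3": 1, "g4": 1}  # g2 deliberately misplaced
assignment, stumps = fit_modules(expression, regulators, initial)
```

In this toy run the misplaced gene g2 moves to the module whose split on R1 predicts it best, after which the assignment stops changing.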
12
Strengths and Weaknesses of Module Approach
Explicitly models interaction among regulators
Expression arrays give poor estimates of the activity of TFs or other regulators
Some regulators could repress genes
Discrete predictor model is inefficient
13
Update: Estimating TF Activity
Since TF expression data are an unreliable guide to activity, could we do better by inferring TF activity?
Use DNA sequence motifs as a surrogate for TF binding
Fit double E-M (complicated!)
14
The Regression Approach
Direct data on TF occupancy from ChIP
Two stages:
  – Find candidate TFs by correlation between occupancy and sets of genes
  – Estimate TF activity in each condition by a regression model
15
Regression Steps
Preliminary screen: r > r_threshold
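A minimal sketch of the two-stage regression idea, under stated assumptions: the screen correlates each TF's occupancy across genes with expression across the same genes and keeps a TF if r exceeds the threshold in any condition, and TF activity in a condition is taken to be the least-squares coefficient of expression on occupancy. The data, the threshold of 0.5, and all function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def screen_tfs(occupancy, conditions, r_threshold):
    """Stage 1: keep TFs with r > r_threshold in at least one condition."""
    return [tf for tf, occ in occupancy.items()
            if any(pearson(occ, expr) > r_threshold for expr in conditions)]

def estimate_activity(occupancy, tfs, expr):
    """Stage 2: least-squares fit of expression on occupancy plus intercept."""
    columns = [[1.0] * len(expr)] + [occupancy[tf] for tf in tfs]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, expr)) for ci in columns]
    coefficients = solve(xtx, xty)
    return dict(zip(tfs, coefficients[1:]))  # drop the intercept

# Made-up occupancy (one value per gene) and expression (one list per condition).
occupancy = {
    "TF_A": [1, 0, 1, 0, 1, 0],
    "TF_B": [0, 0, 1, 1, 0, 0],
    "TF_C": [0, 1, 0, 0, 1, 1],
}
conditions = [
    [2.0, 0.0, 3.0, 1.0, 2.0, 0.0],  # constructed as 2*TF_A + 1*TF_B
    [0.5, 0.0, 3.5, 3.0, 0.5, 0.0],  # constructed as 0.5*TF_A + 3*TF_B
]
selected = screen_tfs(occupancy, conditions, r_threshold=0.5)
activity = [estimate_activity(occupancy, selected, expr) for expr in conditions]
```

Screening on r rather than |r| follows the slide literally; a screen meant to catch repressors as well would presumably compare the absolute value against the threshold.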