Promoter and Module Analysis Statistics for Systems Biology.

1 Promoter and Module Analysis Statistics for Systems Biology

2 Transcription Factors
DNA-binding proteins that facilitate or inhibit Pol II initiation or elongation
General transcription factors:
– Used widely for many genes under many circumstances
Specific transcription factors:
– Used to initiate specific genes under specific circumstances
Distinction may not be so sharp!

3 Transcription Factor Families
Several structures line up amino acids
– Helix-turn-helix (homeodomain)
– Helix-loop-helix
– Zinc finger
Mostly dimers
These families have proliferated because of their role in attracting the transcription apparatus

4 DNA-Binding Proteins
All proteins interact weakly with DNA
Proteins with projecting amino acids interact with the DNA major groove
Hydrogen bonds stabilize the position of proteins on DNA
Proteins that line up several amino acid contacts bind strongly to specific DNA sequences

5 Transcription Factor Recognition Sites
Typically 6-10 positions are very selective and several others show bias
Often the selectivity profile is summarized by a ‘motif’
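One common way to summarize a selectivity profile as a motif is a position weight matrix (PWM). The sketch below is illustrative only: the binding sites are invented toy sequences, and the pseudocount and uniform background frequency of 0.25 are assumptions, not values from the slides.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds scores for a sequence of motif width."""
    return sum(column[base] for column, base in zip(pwm, seq))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in a longer sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# Hypothetical aligned binding sites (toy data, not from the presentation).
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA", "TGACTCT"]
pwm = build_pwm(sites)
hit_score, hit_start = best_hit(pwm, "CCCCTGACTCACCCC")
```

Positions where all sites agree contribute large positive scores for the preferred base, while variable positions contribute only a mild bias, which mirrors the "very selective" versus "show bias" distinction.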

6 Selectivity of Specific T.F.’s
Most TF’s recognize 6-10 bases of DNA
E. coli: longer (8-12 bp) TF’s
– All sequences are effective
Yeast: areas around promoters selectively cleared of nucleosomes
– ~30x accessibility for those
Animal: cooperative binding of several T.F.’s

7 Cofactors
Frequently the effect of DNA-binding proteins depends on co-factors
E.g. ER sits on the DNA but requires estrogen as a co-factor to function
Myc requires Max as a co-factor to stimulate transcription
If Max is coupled with Mad instead, the genes are repressed

8 Assembly of Transcription App.
Change in physical conformation of DNA leads to increased likelihood of spontaneous assembly of Pol II
Getting Pol II further into the gene seems to require further steps

9 The TF Family Circus

10 Inferring Regulatory Architecture
Aim: to find which regulators influence gene expression
Concerns:
– Contributions of many factors to any one gene
Approaches:
– Decision tree (Computer Science)
– Regression (more statistical)
DNA sequence motifs can be a surrogate

11 The Israeli ‘Module’ Approach
Idea: model TF binding as a ‘decision-tree’
Steps
1. Cluster gene expression profiles
2. Fit best regulator tree to each cluster
3. Re-assign genes to clusters
Iterate until convergence
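The iteration can be sketched as follows. This is a heavily simplified illustration, not the published method: each "tree" is reduced to a single split (a stump) on one regulator's expression level, the initial clustering is supplied by the caller, and all names and data are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def fit_stump(profiles, regulators):
    """Pick the regulator/threshold split of conditions that best explains
    the cluster's mean profile. Returns (regulator, threshold, prediction)."""
    n_cond = len(profiles[0])
    centroid = [mean([p[c] for p in profiles]) for c in range(n_cond)]
    best = None
    for name, levels in regulators.items():
        # Dropping the largest level guarantees both sides of the split are non-empty.
        for threshold in sorted(set(levels))[:-1]:
            low = [centroid[c] for c in range(n_cond) if levels[c] <= threshold]
            high = [centroid[c] for c in range(n_cond) if levels[c] > threshold]
            low_mean, high_mean = mean(low), mean(high)
            pred = [low_mean if levels[c] <= threshold else high_mean
                    for c in range(n_cond)]
            err = sum((centroid[c] - pred[c]) ** 2 for c in range(n_cond))
            if best is None or err < best[0]:
                best = (err, name, threshold, pred)
    if best is None:  # every regulator is constant: fall back to the centroid
        return None, None, centroid
    return best[1], best[2], best[3]

def distance(profile, pred):
    return sum((a - b) ** 2 for a, b in zip(profile, pred))

def fit_modules(expression, regulators, assignment, max_iter=20):
    """Alternate between fitting one stump per cluster and re-assigning genes."""
    genes = list(expression)
    for _ in range(max_iter):
        modules = {}
        for k in set(assignment.values()):
            members = [expression[g] for g in genes if assignment[g] == k]
            modules[k] = fit_stump(members, regulators)
        new_assignment = {
            g: min(modules, key=lambda k: distance(expression[g], modules[k][2]))
            for g in genes
        }
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
    return assignment, modules

# Toy data: four conditions, two regulators, four genes.
regulators = {"R1": [0, 0, 1, 1], "R2": [0, 1, 0, 1]}
expression = {
    "g1": [0.1, 0.0, 1.0, 0.9],  # tracks R1
    "g2": [0.0, 0.2, 1.1, 1.0],  # tracks R1
    "g3": [0.0, 1.0, 0.1, 0.9],  # tracks R2
    "g4": [0.1, 0.9, 0.0, 1.1],  # tracks R2
}
initial = {"g1": 0, "g2": 0, "g3": 1, "g4": 0}  # g4 deliberately misassigned
final_assignment, modules = fit_modules(expression, regulators, initial)
```

In this toy run the misassigned gene moves to the cluster whose regulator it actually follows, after which the assignments stop changing and the loop exits.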

12 Strengths and Weaknesses of Module Approach
Explicitly models interaction among regulators
Expression arrays give poor estimates of activity of TF’s or other regulators
Some regulators could repress genes
Discrete predictor model is inefficient

13 Update: Estimating TF Activity
Since TF expression data are an unreliable proxy for activity, could we do better by inferring TF activity?
Use DNA sequence motifs as surrogate for TF binding
Fit double E-M – complicated!

14 The Regression Approach
Direct data on TF occupancy from ChIP
Two stages:
– Find candidate TF’s by correlation between occupancy and sets of genes
– Estimate TF activity in each condition by regression model
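The second stage can be illustrated with ordinary least squares. The slides do not specify the model, so this sketch assumes a simple linear form with no intercept: expression of gene g in condition c is approximated by the sum over TFs of occupancy[g][t] * activity[t][c]. The function names and numbers are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def tf_activity(occupancy, expression):
    """Least-squares TF activities via the normal equations.

    occupancy:  genes x TFs (e.g. ChIP signal)
    expression: genes x conditions
    returns:    TFs x conditions
    """
    n_genes, n_tfs = len(occupancy), len(occupancy[0])
    n_cond = len(expression[0])
    xtx = [[sum(occupancy[g][i] * occupancy[g][j] for g in range(n_genes))
            for j in range(n_tfs)] for i in range(n_tfs)]
    per_condition = []
    for c in range(n_cond):
        xty = [sum(occupancy[g][i] * expression[g][c] for g in range(n_genes))
               for i in range(n_tfs)]
        per_condition.append(solve(xtx, xty))
    # Transpose so rows are TFs and columns are conditions.
    return [[per_condition[c][t] for c in range(n_cond)] for t in range(n_tfs)]

# Toy data generated from known activities [[2, 0], [-1, 3]] (TFs x conditions).
occupancy = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
expression = [[2, 0], [2, 0], [-1, 3], [-1, 3], [1, 3]]
estimated = tf_activity(occupancy, expression)
```

Because the toy expression values were generated exactly from the assumed model, the fit recovers the generating activities; real ChIP and expression data would be noisy and would likely call for an intercept and some regularization.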

15 Regression Steps
Preliminary screen: r > r_threshold
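A preliminary screen of this kind might look like the sketch below. It assumes r is the Pearson correlation between a TF's occupancy across genes and a 0/1 indicator of membership in a gene set; the slide gives neither that definition nor a threshold value, so both, along with the data, are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def screen_tfs(occupancy_by_tf, membership, r_threshold=0.5):
    """Keep TFs whose occupancy correlates with gene-set membership above the threshold."""
    selected = {}
    for tf, occupancy in occupancy_by_tf.items():
        r = pearson(occupancy, membership)
        if r > r_threshold:
            selected[tf] = r
    return selected

# Toy data: six genes, the first three belong to the gene set of interest.
membership = [1, 1, 1, 0, 0, 0]
occupancy_by_tf = {
    "TF_A": [0.9, 0.8, 0.7, 0.1, 0.2, 0.0],  # enriched on set members
    "TF_B": [0.5, 0.1, 0.4, 0.6, 0.2, 0.3],  # unrelated to the set
}
candidates = screen_tfs(occupancy_by_tf, membership)
```

Only TFs that pass this screen would then enter the regression stage, which keeps the number of predictors manageable.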

