Published by Nigel Evans. Modified over 9 years ago.
1
Outline: Who regulates whom and when? Model. Learning algorithm. Evaluation. Wet lab experiments. Perspective: why does it work?
2
Gene Regulation: Simple Example. [Figure: a regulated gene with an activator and a repressor, shown in three states (State 1, State 2, State 3); regulator and gene expression are measured by DNA microarray.]
3
Regulation Tree. [Figure: a regulation program drawn as a tree that asks "Activator?" and "Repressor?" (true/false branches) and ends in the leaves State 1, State 2, and State 3; panels show activator expression, repressor expression, and the module genes.] Genes in the same module share the same regulation program.
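A regulation tree of this kind can be sketched as a tiny function. This is only an illustration: the mapping of branches to leaves below (no activator gives State 1, activator without repressor gives State 2, both give State 3) is an assumption, since the slide text does not spell it out, and `regulation_state` is a hypothetical name.

```python
def regulation_state(activator_on: bool, repressor_on: bool) -> str:
    """Walk a two-question regulation tree and return the leaf reached.

    The branch-to-leaf mapping is assumed for illustration only.
    """
    if not activator_on:       # "Activator?" answered false
        return "State 1"
    if not repressor_on:       # "Repressor?" answered false
        return "State 2"
    return "State 3"           # both regulators are expressed


# Every gene in the module is evaluated with the same tree.
for activator, repressor in [(False, False), (True, False), (True, True)]:
    print(activator, repressor, regulation_state(activator, repressor))
```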
4
Module Networks. Goal: discover regulatory modules and their regulators. Module genes: a set of genes that are similarly controlled. Regulation program: expression as a function of regulators. [Figure: modules, each with a regulation tree; the example tree splits on HAP4 and CMK1 with true/false branches.]
5
Module Network Probabilistic Model. The expression level in each module is a function of the expression of its regulators. [Figure: for each gene and experiment, the variables are Module (what module does gene "g" belong to?), Regulator 1, Regulator 2, Regulator 3 (expression level of each regulator in the experiment, e.g. BMH1, GIC2), and Level (the gene's expression).] The model defines P(Level | Module, Regulators), with one regulation tree per module (example tree over HAP4 and CMK1).
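One way to read P(Level | Module, Regulators) is as a lookup followed by a density evaluation: the regulators' expression selects a leaf of the module's tree, and the leaf supplies a distribution over Level. The sketch below assumes a single split and a Gaussian at each leaf (the assignment slides later in the deck describe Gaussian leaves); the module dictionary, threshold, and leaf parameters are invented for illustration.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


# Hypothetical module: one split on HAP4, one Gaussian (mean, std) per leaf.
module = {
    "regulator": "HAP4",
    "threshold": 0.5,
    "leaf_high": (1.8, 0.5),   # used when regulator expression > threshold
    "leaf_low": (-0.2, 0.5),   # used otherwise
}


def log_p_level(level, regulator_expr, module):
    """log P(Level | Module, Regulators) for one gene in one experiment."""
    value = regulator_expr[module["regulator"]]
    leaf = module["leaf_high"] if value > module["threshold"] else module["leaf_low"]
    return gaussian_logpdf(level, *leaf)


print(log_p_level(1.8, {"HAP4": 2.0}, module))  # level matches the "high" leaf
print(log_p_level(1.8, {"HAP4": 0.0}, module))  # same level, "low" leaf: much lower
```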
6
Outline: Who regulates whom and when? Model. Learning algorithm. Evaluation. Wet lab experiments. Perspective: why does it work?
7
Learning Problem. Goal: find the gene module assignments and tree structures that maximize P(M|D). Hard: 5000-10000 genes, ~500 regulators. [Figure: the model variables (Module, Regulator 1, Regulator 2, Regulator 3, Level) over experiments and genes, with an example regulation tree over HAP4 and CMK1.]
8
Learning Algorithm Overview. Clustering gives a gene module assignment. Then iterate: learn regulation programs for the modules (example tree over HAP4 and CMK1), which yields regulatory modules, and relearn the gene assignments to modules.
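The alternation on this slide can be sketched as a loop. To keep the example short, the "regulation program" here is replaced by a stand-in, the mean expression profile of the module's genes, and reassignment uses squared error; both are simplifying assumptions, not the method from the slides, and all function names are hypothetical.

```python
def learn_programs(data, assign, k):
    """Stand-in for learning regulation programs: mean profile per module."""
    n_exp = len(data[0])
    programs = []
    for m in range(k):
        members = [g for g, a in enumerate(assign) if a == m]
        if members:
            programs.append([sum(data[g][e] for g in members) / len(members)
                             for e in range(n_exp)])
        else:
            programs.append([0.0] * n_exp)  # empty module: neutral profile
    return programs


def reassign(data, programs):
    """Move each gene to the module whose program fits it best."""
    def sq_err(row, prog):
        return sum((x - p) ** 2 for x, p in zip(row, prog))
    return [min(range(len(programs)), key=lambda m: sq_err(row, programs[m]))
            for row in data]


def learn_module_network(data, init_assign, k, max_iters=20):
    """Alternate program learning and gene reassignment until nothing changes."""
    assign = list(init_assign)
    for _ in range(max_iters):
        programs = learn_programs(data, assign, k)
        new_assign = reassign(data, programs)
        if new_assign == assign:
            break
        assign = new_assign
    return assign, learn_programs(data, assign, k)


# Rows are genes, columns are experiments; the initial assignment is imperfect.
data = [[1.0, 1.0, 0.0], [1.1, 0.9, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 1.1]]
assign, programs = learn_module_network(data, [0, 0, 1, 0], k=2)
print(assign)
```

In the toy run the last gene starts in the wrong module and is moved during the first reassignment step.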
9
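Learning Regulation Programs. [Figure: module genes across experiments, first with the experiments in their original order, then with the experiments sorted by Hap4 expression.] Superscripts (1) and (2) denote the two sides of a split on the named regulator.
No split: $\log P(M \mid D) \propto \log P(D \mid \mu, \sigma) + \log P(\mu, \sigma)$
Split on HAP4: $\log P(M \mid D) \propto \log P(D^{(1)}_{\mathrm{HAP4}} \mid \mu^{(1)}_{\mathrm{HAP4}}, \sigma^{(1)}_{\mathrm{HAP4}}) + \log P(D^{(2)}_{\mathrm{HAP4}} \mid \mu^{(2)}_{\mathrm{HAP4}}, \sigma^{(2)}_{\mathrm{HAP4}}) + \log P(\mu^{(1)}_{\mathrm{HAP4}}, \sigma^{(1)}_{\mathrm{HAP4}}, \mu^{(2)}_{\mathrm{HAP4}}, \sigma^{(2)}_{\mathrm{HAP4}})$
Split on SIP4: $\log P(M \mid D) \propto \log P(D^{(1)}_{\mathrm{SIP4}} \mid \mu^{(1)}_{\mathrm{SIP4}}, \sigma^{(1)}_{\mathrm{SIP4}}) + \log P(D^{(2)}_{\mathrm{SIP4}} \mid \mu^{(2)}_{\mathrm{SIP4}}, \sigma^{(2)}_{\mathrm{SIP4}}) + \log P(\mu^{(1)}_{\mathrm{SIP4}}, \sigma^{(1)}_{\mathrm{SIP4}}, \mu^{(2)}_{\mathrm{SIP4}}, \sigma^{(2)}_{\mathrm{SIP4}})$
Split on HAP4, then on CMK1: $\log P(M \mid D) \propto \log P(D^{(1)}_{\mathrm{HAP4}} \mid \mu^{(1)}_{\mathrm{HAP4}}, \sigma^{(1)}_{\mathrm{HAP4}}) + \log P(D^{(1)}_{\mathrm{CMK1}} \mid \mu^{(1)}_{\mathrm{CMK1}}, \sigma^{(1)}_{\mathrm{CMK1}}) + \log P(D^{(2)}_{\mathrm{CMK1}} \mid \mu^{(2)}_{\mathrm{CMK1}}, \sigma^{(2)}_{\mathrm{CMK1}}) + \ldots$
[Figure: resulting regulation tree over HAP4 and CMK1, with the module genes and the Hap4 expression of the regulator.]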
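The comparison of candidate splits can be sketched as follows. This is a simplified stand-in: each side of a split is scored by the log-likelihood of a maximum-likelihood Gaussian fit, and the prior term of the Bayesian score is omitted. The function names and the toy data are hypothetical.

```python
import math


def gaussian_loglik(values, var_floor=1e-6):
    """Log-likelihood of values under a Gaussian fit by maximum likelihood."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    var = max(ss / n, var_floor)  # floor avoids log(0) on constant data
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def best_split(module_expr, regulators, min_side=1):
    """Return (score, regulator, threshold) of the best single split.

    module_expr[e] lists the module genes' levels in experiment e;
    regulators[name][e] is that regulator's expression in experiment e.
    """
    n_exp = len(module_expr)
    best = None
    for name, reg in regulators.items():
        for threshold in sorted(set(reg)):
            high = [e for e in range(n_exp) if reg[e] > threshold]
            low = [e for e in range(n_exp) if reg[e] <= threshold]
            if len(high) < min_side or len(low) < min_side:
                continue
            score = (gaussian_loglik([x for e in high for x in module_expr[e]])
                     + gaussian_loglik([x for e in low for x in module_expr[e]]))
            if best is None or score > best[0]:
                best = (score, name, threshold)
    return best


# Toy module: two genes, four experiments; levels track HAP4 but not SIP4.
module_expr = [[0.1, -0.1], [0.0, 0.2], [2.0, 2.1], [1.9, 2.2]]
regulators = {"HAP4": [-1.0, -0.5, 1.0, 1.5], "SIP4": [1.0, -1.0, 0.5, -0.5]}
score, name, threshold = best_split(module_expr, regulators)
print(name, threshold)
```

Sorting the experiments by the chosen regulator, as in the figure, makes the same point visually: a good split separates experiments where the module genes are high from those where they are low.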
10
Learning Algorithm Performance. [Plot: Bayesian score (avg. per gene, axis from -131 to -128) vs. algorithm iterations (0 to 20).] [Plot: gene module assignment changes (% of total, axis from 0 to 50) vs. algorithm iterations (0 to 20).] Significant improvements across learning iterations. Many genes (50%) change module assignment during learning.
11
Outline: Who regulates whom and when? Model. Learning algorithm. Evaluation. Wet lab experiments. Perspective: why does it work?
12
Yeast Stress Data. Genes: selected 2355 that showed activity. Experiments (173): diverse environmental stress conditions such as heat shock and nitrogen depletion.
13
Comparison to Bayesian Networks. Bayesian network (Friedman et al. '00; Hartemink et al. '01): the expression level of each gene is a function of the expression of regulators. Problems: robustness, interpretability. [Figure: fragment of a learned Bayesian network over Cmk1, Hap4, Mig1, Ste12, Yap1, and Gic1.] 2355 variables (genes), 173 instances (experiments).
14
Comparison to Bayesian Networks. Bayesian network (Friedman et al. '00; Hartemink et al. '01). Problems: robustness, interpretability. [Figure: network fragment over Cmk1, Hap4, Mig1, Ste12, Yap1, and Gic1.] Module network (SPRKF '03, UAI). Solutions: robustness through sharing parameters; interpretability through a module-level model. [Figure: module network with Module, Regulator 1, Regulator 2, Regulator 3, and Level.]
15
Comparison to Bayesian Networks. Problems: robustness, interpretability. Solutions: robustness through sharing parameters; interpretability through a module-level model. [Plot: test data log-likelihood (gain per instance, axis from -150 to 150) vs. number of modules (0 to 500), with Bayesian network performance as the reference.] Learn which parameters are shared (by learning which genes are in the same module).
16
From Model to Regulatory Modules. [Figure: the model (Module, Regulator 1, Regulator 2, Regulator 3, Level) next to a learned module with its regulation tree over HAP4 and CMK1.] Biologically relevant?
17
Respiration Module. [Figure: regulation program and module genes, with the predicted regulator marked.] Module genes functionally coherent? Energy production (oxid. phos. 26/55, P<10^-30). Module genes known targets of predicted regulators? Hap4+Msn4 known to regulate module genes.
18
Energy, Osmolarity, & cAMP Signaling. Tpk1: regulation by non-TFs (Tpk1 is a catalytic subunit of cAMP-dependent protein kinase). Module contains known Tpk1 targets (e.g. Tps1). Tpk1-mediated STRE motif (50/64 genes; p<3x10^-11).
19
EM: Biological Improvement
20
[Figure: global view of the inferred module network. Regulators: Hap4, Xbp1, Yer184c, Yap6, Gat1, Ime4, Lsg1, Msn4, Gac1, Gis1, Ypl230w, Not3, Sip2, Kin82, Cmk1, Tpk1, Ppt1, Tpk2, Pph3, Bmh1, Gcn20. Numbered modules are grouped into Amino acid metabolism; Energy and cAMP signaling; DNA and RNA processing; nuclear. Enriched cis-regulatory motifs: STRE, HAP234, REPCAR, CAT8, ADR1, HSF, HAC1, XBP1, MCM1, ABF_C, GATA, GCN4, CBF1_B, GCR1, MIG1, and the N-numbered motifs N11, N13, N14, N18, N26, N30, N36, N41. Legend: regulator (transcription factor); regulator (signaling molecule); experimentally tested regulator; module (number); enriched cis-regulatory motif; inferred regulation; regulation supported in literature.]
21
Biological Evaluation Summary. Are the module genes functionally coherent? 46/50. Are some module genes known targets of the predicted regulators? 30/50. Functionally coherent = module genes enriched for GO annotations with hypergeometric p-value < 0.01 (corrected for multiple hypotheses). Known targets = direct biological experiments reported in the literature.
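The enrichment test named here can be sketched with the standard library. The function below computes the upper tail of the hypergeometric distribution: the chance of seeing at least `overlap` annotated genes in a module of `module_size` genes drawn from `population` genes, of which `annotated` carry the annotation. The numbers in the example call are made up, and the multiple-hypothesis correction mentioned on the slide is not included.

```python
from math import comb


def hypergeom_pvalue(population, annotated, module_size, overlap):
    """P(X >= overlap) for a draw of module_size genes without replacement."""
    total = comb(population, module_size)
    upper = min(annotated, module_size)
    tail = sum(comb(annotated, k) * comb(population - annotated, module_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical counts: 2000 genes, 100 annotated, a 50-gene module with 20 hits.
p = hypergeom_pvalue(population=2000, annotated=100, module_size=50, overlap=20)
print(p < 0.01)
```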
22
Outline: Who regulates whom and when? Model. Learning algorithm. Evaluation. Wet lab experiments. Perspective: why does it work?
23
From Model to Detailed Predictions. Prediction: regulator 'X' regulates process 'Y'. Experiment: knock out 'X' and repeat the experiment. [Figure: regulation tree with HAP4 and Ypl230w.]
24
Does 'X' Regulate Predicted Genes? Experiment: knock out Ypl230w (stationary phase). 1334 regulated genes (312 expected by chance). [Figure: wild-type vs. mutant expression; regulated genes change by more than 4x.] Rank modules by regulated genes and compare with the modules predicted to be regulated by Ypl230w. Modules and significance: Protein folding (P<0.0001); Cell differentiation (P<0.02); Glycolysis and folding (P<0.04); Mitochondrial and protein fate (P<0.04). Ypl230w regulates computationally predicted genes.
25
Does 'X' Regulate Predicted Genes? Ppt1 knockout (hypo-osmotic stress): 1014 regulated genes. [Figure: wild-type vs. mutant expression.] Modules and significance: Ribosomal and phosphate metabolism (P<0.009); Amino acid and purine metabolism (P<0.01); mRNA, rRNA and tRNA processing (P<0.02); Protein folding (P<0.02); Cell cycle (P<0.02). Kin82 knockout (heat shock): 1034 regulated genes. [Figure: wild-type vs. mutant expression.] Modules and significance: Energy and osmotic stress (P<0.0001); Energy, osmolarity & cAMP signaling (P<0.006); mRNA, rRNA and tRNA processing (P<0.02).
26
Wet Lab Experiments Summary. 3/3 regulators regulate computationally predicted genes. New yeast biology suggested: Ypl230w activates protein-folding, cell wall and ATP-binding genes; Ppt1 represses phosphate metabolism and rRNA processing; Kin82 activates energy and osmotic stress genes.
27
Outline: Who regulates whom and when? Model. Learning algorithm. Evaluation. Wet lab experiments. Perspective: why does it work?
28
Why does it work? Underlying assumption: regulators are transcriptionally regulated. Regulators are part of regulatory structures in which they are themselves regulated*. Statistical methods can detect associations between regulators and their targets. * [Shen-Orr et al., '02] find many such structures.
29
Regulator Chain. Respiration module: Phd1 (TF), Hap4 (TF), and targets Cox4, Cox6, Atp17. [Figure: active protein level and mRNA expression level over time for Phd1, Hap4, and the targets.] Legend: black, regulators that cannot be detected; red, correctly predicted regulator; blue, targets.
30
Auto Regulation. Snf kinase regulated processes module: Yap6 (TF) and targets Vid24, Tor1, Gut2. Legend: black, regulators that cannot be detected; red, correctly predicted regulator; blue, targets.
31
Positive Signaling Loop. Sporulation and cAMP pathway module: Sip2 (SM), Msn4 (TF), and targets Vid24, Tor1, Gut2. Legend: black, regulators that cannot be detected; red, correctly predicted regulator; blue, targets.
32
Negative Signaling Loop. Energy and osmotic stress module: Tpk1 (SM), Msn4 (TF), and targets Nth1, Tps1, Glo1. Legend: black, regulators that cannot be detected; red, correctly predicted regulator; blue, targets.
33
Why Does it Work? Feed-forward and feedback loops. Some transcription factors and signal transduction molecules have a detectable expression signature. Module Networks infers their regulatory relationships.
34
Assignment.
Download the yeast stress expression dataset.
Download the list of transcription factor regulators.
Randomly partition the dataset in a 5-fold cross-validation scheme.
For k=50: create a hard-clustering model (use code from the earlier exercise). At each array, this model has a separate Gaussian distribution for each of the 50 values of the cluster variable.
Use the assignment of genes to clusters that you learned in the hard clustering, and for each cluster learn a decision tree with at most (1) one split, (2) two splits, (3) three splits.
Note 1: allow only splits with >=5 arrays on each side of the split.
Note 2: the split question is whether the expression level of the transcription factor is greater than some value.
35
Assignment Continued.
Note 3: at each leaf of the resulting model, there is a single Gaussian distribution that is used for all arrays that map to that leaf.
Compute the log-likelihood of the test data for each model (hard clustering, and each of the three regulation models).
Plot the avg. and std. of the test log-likelihood for each model.
For the model with two splits on each cluster, use the Gaussian distribution at each array to sample a new expression dataset with exactly the same number of genes and number of arrays. For each original gene and array, sample from the Gaussian distribution associated with that gene and that array.
Learn a model with two splits for each cluster.
Plot the number of regulation tree splits that are identical between the model that sampled the data and the new model that you learned.
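A few of the assignment's building blocks can be sketched with the standard library. This is a starting point under stated assumptions, not a reference solution: the slides do not say whether genes or arrays are partitioned, so `cross_validation_folds` works on generic item indices; `valid_thresholds` applies Notes 1 and 2; `sample_dataset` assumes the per-gene, per-array Gaussian means and standard deviations have already been read off the learned model. All three names are hypothetical.

```python
import random


def cross_validation_folds(n_items, n_folds=5, seed=0):
    """Randomly partition item indices into (train, test) pairs."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = []
    for i in range(n_folds):
        test = sorted(indices[i::n_folds])
        held_out = set(test)
        train = [j for j in range(n_items) if j not in held_out]
        folds.append((train, test))
    return folds


def valid_thresholds(regulator_values, min_side=5):
    """Thresholds t for the question "expression > t?" that leave at
    least min_side arrays on each side of the split (Notes 1 and 2)."""
    thresholds = []
    for t in sorted(set(regulator_values)):
        above = sum(1 for v in regulator_values if v > t)
        below = len(regulator_values) - above
        if above >= min_side and below >= min_side:
            thresholds.append(t)
    return thresholds


def sample_dataset(means, stds, seed=0):
    """Draw one value per gene and array from that entry's Gaussian.

    means[g][a] and stds[g][a] are the Gaussian parameters the model
    associates with gene g at array a.
    """
    rng = random.Random(seed)
    return [[rng.gauss(m, s) for m, s in zip(mean_row, std_row)]
            for mean_row, std_row in zip(means, stds)]


folds = cross_validation_folds(173)
print([len(test) for _, test in folds])
print(valid_thresholds(list(range(12))))
```

Fixing the seed keeps the partition identical across the four models, so their test log-likelihoods are computed on the same held-out items.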