Published by Simon Thomas. Modified over 9 years ago.
1
Learning Module Networks
Eran Segal, Stanford University
Joint work with: Dana Pe’er (Hebrew U.), Daphne Koller (Stanford), Aviv Regev (Harvard), Nir Friedman (Hebrew U.)
2
Learning Bayesian Networks
Density estimation: model the data distribution in the population. Probabilistic inference: prediction, classification.
Dependency structure: interactions between variables, causality, scientific discovery.
[Figure: data and a Bayesian network over INTL, MSFT, MOT, NVLS]
3
Stock Market
Learn the dependency of stock prices as a function of global influencing factors, sector influencing factors, and the price of other major stocks.
[Figure: prices of MSFT, DELL, INTL, NVLS, MOT from Jan. ’02 to Jan. ’03; vertical axis 10 to 70]
5
Stock Market
Learn the dependency of stock prices as a function of global influencing factors, sector influencing factors, and the price of other major stocks.
[Figure: prices of MSFT, DELL, INTL, NVLS, MOT from Jan. ’02 to Jan. ’03, with the dependencies drawn as a Bayesian network over INTL, MSFT, DELL, NVLS, MOT]
6
Stock Market
[Figure: fragment of the learned BN]
4411 stocks (variables), 273 trading days (instances), from Jan. ’02 to Mar. ’03.
Problems: statistical robustness, interpretability.
7
Key Observation
Many stocks depend on the same influencing factors in much the same way. Example: Intel, Novellus, Motorola, and Dell depend on the price of Microsoft.
Many other domains have similar characteristics: gene expression, collaborative filtering, computer network performance, …
[Figure: prices of MSFT, DELL, INTL, NVLS, MOT from Jan. ’02 to Jan. ’03; vertical axis 10 to 70]
8
The Module Network Idea
[Figure: a Bayesian network over MSFT, MOT, DELL, INTL, AMAT, HPQ with a separate CPD per variable (CPD 1 to CPD 6), next to a module network in which the same variables are grouped into Module I, Module II, and Module III with one CPD per module (CPD 1 to CPD 3)]
9
Problems and Solutions
Statistical robustness: share parameters and dependencies between variables with similar behavior.
Interpretability: explicit modeling of modular structure.
10
Outline
Module network: probabilistic model; learning the model; experimental results.
11
Module Network Components
Module assignment function: A(MSFT) = M_I; A(MOT) = A(DELL) = A(INTL) = M_II; A(AMAT) = A(HPQ) = M_III.
[Figure: variables MSFT, MOT, DELL, INTL, AMAT, HPQ grouped into Module I, Module II, Module III]
12
Module Network Components
Module assignment function.
Set of parents for each module: Pa(M_I) = ∅, Pa(M_II) = {MSFT}, Pa(M_III) = {DELL, INTL}.
[Figure: variables MSFT, MOT, DELL, INTL, AMAT, HPQ grouped into Module I, Module II, Module III]
13
Module Network Components
Module assignment function.
Set of parents for each module.
CPD template for each module.
[Figure: variables MSFT, MOT, DELL, INTL, AMAT, HPQ grouped into Module I, Module II, Module III]
14
Ground Bayesian Network
A module network induces a ground BN over X.
A module network defines a coherent probability distribution over X if the ground BN is acyclic.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ and the ground Bayesian network it induces]
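The induction of a ground BN can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries mirror the running stock example, the module names `M1` to `M3` and the function names are hypothetical, and Pa(M_I) is assumed to be empty. Each variable simply inherits the parent set of its module, and a depth-first search checks the resulting graph for directed cycles.

```python
def ground_parents(assignment, module_parents):
    """Each variable inherits the parent set of the module it is assigned to."""
    return {var: set(module_parents[mod]) for var, mod in assignment.items()}


def is_acyclic(parents):
    """Depth-first search along child -> parent edges; True if no directed cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in parents}

    def visit(v):
        color[v] = GRAY
        for p in parents[v]:
            if color[p] == GRAY:          # back edge: cycle found
                return False
            if color[p] == WHITE and not visit(p):
                return False
        color[v] = BLACK
        return True

    return all(color[v] != WHITE or visit(v) for v in parents)


# Running example (Pa(M1) assumed empty).
assignment = {"MSFT": "M1", "MOT": "M2", "DELL": "M2",
              "INTL": "M2", "AMAT": "M3", "HPQ": "M3"}
module_parents = {"M1": set(), "M2": {"MSFT"}, "M3": {"DELL", "INTL"}}

ground = ground_parents(assignment, module_parents)
coherent = is_acyclic(ground)
```

Here `ground["AMAT"]` is `{"DELL", "INTL"}` because AMAT belongs to the third module, and `coherent` reports whether the induced network is a valid BN.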
15
Module Graph
Nodes correspond to modules. There is an edge M_i → M_j if at least one variable in M_i is a parent of M_j.
Theorem: the ground BN is acyclic if the module graph is acyclic.
Acyclicity can therefore be checked efficiently using the module graph.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ and its module graph over M_I, M_II, M_III]
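A minimal sketch of the module-graph check, assuming the same dictionary representation as the running example (module names `M1` to `M3` and both function names are hypothetical). The graph has one node per module, so the cycle test touches only modules rather than every variable; the test below uses Kahn's topological sort, which is one standard choice and not necessarily the one used by the authors.

```python
def module_graph(assignment, module_parents):
    """Edge M_i -> M_j whenever some variable assigned to M_i is a parent of M_j."""
    edges = {m: set() for m in module_parents}
    for child_mod, parent_vars in module_parents.items():
        for var in parent_vars:
            edges[assignment[var]].add(child_mod)
    return edges


def has_cycle(edges):
    """Kahn's algorithm: a cycle exists iff not every node can be removed."""
    indegree = {m: 0 for m in edges}
    for outs in edges.values():
        for target in outs:
            indegree[target] += 1
    queue = [m for m, d in indegree.items() if d == 0]
    removed = 0
    while queue:
        m = queue.pop()
        removed += 1
        for target in edges[m]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    return removed != len(edges)


# Running example (Pa(M1) assumed empty).
assignment = {"MSFT": "M1", "MOT": "M2", "DELL": "M2",
              "INTL": "M2", "AMAT": "M3", "HPQ": "M3"}
module_parents = {"M1": set(), "M2": {"MSFT"}, "M3": {"DELL", "INTL"}}

graph = module_graph(assignment, module_parents)
cyclic = has_cycle(graph)
```

A parent drawn from the module's own variables shows up as a self-loop in `graph`, which `has_cycle` also rejects.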
16
Outline
Module network: probabilistic model; learning the model; experimental results.
17
Learning Overview
Given data D, find the assignment function A and structure S that maximize the Bayesian score.
The score combines the assignment / structure prior with the marginal data likelihood. The marginal likelihood integrates the data likelihood over the parameter prior.
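The equation on this slide did not survive extraction; only its term labels did. A standard form of the Bayesian score that matches those labels is sketched below. The symbol θ for the CPD parameters and the exact grouping of the prior into a single joint term are assumptions, not taken from the slide.

```latex
\begin{align*}
\operatorname{score}(S, A : D)
  &= \underbrace{\log P(A, S)}_{\text{assignment / structure prior}}
   + \underbrace{\log P(D \mid A, S)}_{\text{marginal data likelihood}}, \\
P(D \mid A, S)
  &= \int \underbrace{P(D \mid A, S, \theta)}_{\text{data likelihood}}\,
          \underbrace{P(\theta \mid A, S)}_{\text{parameter prior}}\, d\theta .
\end{align*}
```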
18
Likelihood Function
The likelihood function decomposes by modules: M_I; M_II | MSFT; M_III | DELL, INTL.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ with Instance 1, Instance 2, Instance 3; sufficient statistics of (X, Y) are collected per module]
19
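Bayesian Score Decomposition
The Bayesian score decomposes by modules: the term for module j depends on the module j variables and the module j parents.
Example: deleting the parent INTL from Module III.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ with Module I, Module II, Module III]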
20
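Bayesian Score Decomposition
The Bayesian score decomposes by modules.
Example: changing the assignment of MOT between A(MOT) = 2 and A(MOT) = 1.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ with Module I, Module II, Module III]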
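The decomposition can be written as a sum of per-module terms. The notation below is an assumption made for illustration: K is the number of modules, X^j denotes the variables assigned to module j, and the sum presumes that the priors also factor by module.

```latex
\[
\operatorname{score}(S, A : D)
  = \sum_{j=1}^{K} \operatorname{score}_{M_j}\!\bigl(\mathrm{Pa}(M_j),\, X^{j} : D\bigr),
\qquad
X^{j} = \{\, X : A(X) = M_j \,\}.
\]
```

Under this form, an operator that changes only Pa(M_j) alters a single term, and moving one variable from M_i to M_j alters exactly two terms.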
21
Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flow diagram: find an initial assignment A, then iterate between improving the structure (dependency structure S) and improving the assignments (assignment function A)]
22
Initial Assignment Function
Data matrix: variables (stocks) by instances (trading days) x[1], x[2], x[3], x[4], …
Find variables that are similar across instances and group them, e.g. A(MOT) = M_II, A(INTL) = M_II, A(DELL) = M_II.
[Figure: columns AMAT, MSFT, DELL, MOT, HPQ, INTL reordered into three groups 1, 2, 3]
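One way to find variables that are similar across instances is bottom-up clustering. The sketch below is a stand-in, not the authors' procedure: it uses average-linkage agglomerative clustering on Pearson correlation, and the function names, the toy price series, and the integer module labels are all hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def initial_assignment(data, n_modules):
    """data maps each variable to its values across instances.

    Repeatedly merges the two clusters with the highest average pairwise
    correlation until n_modules clusters remain, then labels them 0..n-1.
    """
    clusters = [[v] for v in data]

    def linkage(c1, c2):
        return mean(pearson(data[a], data[b]) for a in c1 for b in c2)

    while len(clusters) > n_modules:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = max(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]                      # j > i, so index i stays valid
    return {v: k for k, c in enumerate(clusters) for v in c}


# Toy series over four instances (made-up numbers, for illustration only).
toy = {
    "MOT":  [1.0, 2.0, 3.0, 4.0],
    "INTL": [2.0, 4.0, 6.0, 8.5],
    "DELL": [1.0, 2.1, 2.9, 4.0],
    "AMAT": [4.0, 3.0, 2.0, 1.0],
    "HPQ":  [8.0, 6.0, 4.1, 2.0],
    "MSFT": [1.0, -1.0, -1.0, 1.0],
}
A0 = initial_assignment(toy, 3)
```

The quadratic search over cluster pairs is fine for a toy example but would need a faster scheme for thousands of variables.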
23
Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flow diagram: find an initial assignment A, then iterate between improving the structure (dependency structure S) and improving the assignments (assignment function A)]
24
Learning Dependency Structure
Heuristic search with operators: add / delete a parent for a module. Edges cannot be reversed.
Handling acyclicity: can be checked efficiently on the module graph.
Efficient computation: after applying an operator for module M_j, only the scores of operators for module M_j need to be updated.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ and its module graph over M_I, M_II, M_III, with candidate parent edges into Module I, Module II, and Module III, some marked X]
25
Learning Dependency Structure
Structure search is done at the module level: parent selection and acyclicity checking. The search space is reduced relative to a BN.
Individual variables are used only for the computation of sufficient statistics.
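The search over add / delete operators can be sketched as greedy hill climbing. Everything named here is an assumption for illustration: `module_score(module, parent_set)` is a hypothetical callable returning the per-module score term, the toy score merely rewards matching a fixed target, and the sketch recomputes all candidate gains on every pass instead of caching them per module as the slide describes.

```python
def creates_cycle(assignment, module_parents):
    """Check the module graph (child module -> parent modules) for a directed cycle."""
    edges = {m: {assignment[p] for p in pars} for m, pars in module_parents.items()}
    state = {}  # 1 = on the current DFS path, 2 = fully explored

    def visit(m):
        if state.get(m) == 1:
            return True
        if state.get(m) == 2:
            return False
        state[m] = 1
        if any(visit(p) for p in edges[m]):
            return True
        state[m] = 2
        return False

    return any(visit(m) for m in edges)


def greedy_structure_search(assignment, module_parents, candidates, module_score):
    """Apply the best legal add/delete-parent operator until none improves the score."""
    parents = {m: set(p) for m, p in module_parents.items()}
    scores = {m: module_score(m, parents[m]) for m in parents}
    while True:
        best = None  # (gain, module, new parent set)
        for m in parents:
            for x in candidates:
                new = parents[m] ^ {x}          # add x if absent, delete it if present
                trial = dict(parents)
                trial[m] = new
                if creates_cycle(assignment, trial):
                    continue
                gain = module_score(m, new) - scores[m]
                if gain > 1e-12 and (best is None or gain > best[0]):
                    best = (gain, m, new)
        if best is None:
            return parents
        gain, m, new = best
        parents[m] = new
        scores[m] += gain                        # only module m's term changes


# Toy usage: the score rewards agreement with a fixed target (illustrative only).
assignment = {"MSFT": "M1", "MOT": "M2", "DELL": "M2",
              "INTL": "M2", "AMAT": "M3", "HPQ": "M3"}
target = {"M1": set(), "M2": {"MSFT"}, "M3": {"DELL", "INTL"}}
toy_score = lambda m, pars: -len(pars ^ target[m])
learned = greedy_structure_search(
    assignment, {m: set() for m in target}, list(assignment), toy_score)
```

Because operators act on module parent sets, the number of candidate moves scales with the number of modules times the number of candidate parents, rather than with the square of the number of variables.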
26
Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flow diagram: find an initial assignment A, then iterate between improving the structure (dependency structure S) and improving the assignments (assignment function A)]
30
Learning Assignment Function
Candidate assignments for DELL: A(DELL) = M_I, score 0.7; A(DELL) = M_II, score 0.9; A(DELL) = M_III, cyclic!
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ with Module I, Module II, Module III]
31
Ideal Algorithm Learn the module assignment of all variables simultaneously
32
Problem
Due to acyclicity, the assignment cannot be optimized for each variable separately.
[Figure: a module network with DELL in Module I, MSFT in Module II, AMAT in Module III, HPQ in Module IV, and its module graph over M_I, M_II, M_III, M_IV. Candidate reassignments: A(DELL) = Module IV, A(MSFT) = Module III.]
34
Learning Assignment Function
Sequential update algorithm: iterate over all variables; for each variable, find its optimal assignment given the current assignment of all other variables.
Efficient computation: when changing an assignment from M_i to M_j, only the scores of modules i and j need to be recomputed.
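The sequential update can be sketched as coordinate ascent over variables. The callables `total_score` and `is_legal` are hypothetical placeholders: the first returns the Bayesian score of a full assignment, the second rejects assignments whose module graph is cyclic. For clarity the sketch rescores the whole assignment on each trial, whereas the slide notes that only modules i and j need recomputing.

```python
def sequential_update(variables, assignment, modules, total_score, is_legal):
    """Reassign one variable at a time to its best legal module until nothing changes."""
    assignment = dict(assignment)          # work on a copy
    changed = True
    while changed:
        changed = False
        for var in variables:
            current = assignment[var]
            best_mod, best_score = current, total_score(assignment)
            for mod in modules:
                if mod == current:
                    continue
                trial = dict(assignment)
                trial[var] = mod
                if not is_legal(trial):     # e.g. the move would create a cycle
                    continue
                s = total_score(trial)
                if s > best_score + 1e-12:
                    best_mod, best_score = mod, s
            if best_mod != current:
                assignment[var] = best_mod
                changed = True
    return assignment
```

Each accepted move strictly increases the score and the number of assignments is finite, so the loop terminates, though only at a local optimum.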
35
Learning the Model
Initialize the module assignment A.
Optimize the structure S.
Optimize the module assignment A: for each variable, find its optimal assignment given the current assignment of all other variables.
[Figure: the module network over MSFT, MOT, DELL, INTL, AMAT, HPQ before and after an update, with MOT highlighted]
36
Related Work
[Table: Bayesian networks, parameter sharing, PRMs, OOBNs, and module networks compared on four properties: shared structure, shared parameters, learn parameter sharing, learn structure. Entries on the slide include "Langseth+al" and "N/A".]
37
Outline
Module network: probabilistic model; learning the model; experimental results (statistical validation; case study: gene regulation).
38
Learning Algorithm Performance
[Figure: Bayesian score (avg. per gene, -131 to -128) versus algorithm iterations (0 to 20), with structure change iterations marked]
39
Generalization to Test Data
Synthetic data: 10 modules, 500 variables.
Best performance is achieved for models with 10 modules.
[Figure: test data likelihood (per instance, -800 to -450) versus number of modules (0 to 200), for 25, 50, 100, 200, and 500 instances]
40
Generalization to Test Data
Synthetic data: 10 modules, 500 variables.
Gain beyond 100 instances is small.
[Figure: test data likelihood (per instance) versus number of modules, for 25, 50, 100, 200, and 500 instances]
41
Structure Recovery
Synthetic data: 10 modules, 500 variables.
74% of 2250 parent-child relationships recovered.
[Figure: recovered structure (% correct) versus number of modules, for 25, 50, 100, 200, and 500 instances]
42
Stock Market
4411 variables (stocks), 273 instances (trading days).
Comparison to Bayesian networks (cross validation).
[Figure: test data log-likelihood (gain per instance, 400 to 600) versus number of modules (0 to 300); Bayesian network performance is the baseline at 0]
43
Regulatory Networks
Learn the structure of regulatory networks: which genes are regulated by each regulator.
44
Gene Expression Data
Measures the mRNA level for all genes in one condition.
Learn the dependency of the expression of genes as a function of the expression of regulators.
[Figure: expression matrix of genes by experiments, colored by induced / repressed]
45
Gene Expression
2355 variables (genes), 173 instances (arrays).
Comparison to Bayesian networks.
[Figure: test data log-likelihood (gain per instance, -150 to 150) versus number of modules (0 to 500); Bayesian network performance is the baseline at 0]
46
Biological Evaluation
Find sets of co-regulated genes (regulatory modules).
Find the regulators of each module.
Figures shown on the slide: 46/50, 30/50.
Segal et al., Nature Genetics, 2003
47
Experimental Design
Hypothesis: regulator ‘X’ activates process ‘Y’.
Experiment: knock out ‘X’ and repeat the experiment.
[Figure: regulation tree with true / false splits on HAP4 and Ypl230W]
Segal et al., Nature Genetics, 2003
48
Differentially Expressed Genes
[Figure: expression time courses, wild type (wt) versus knockout strain. Ypl230w: time points 0 to 24 hrs., >16x, 341 differentially expressed genes. Ppt1: time points 0 to 60 min., >4x, 602 genes. Kin82: time points 0 to 60 min., >4x, 281 genes.]
Segal et al., Nature Genetics, 2003
49
Biological Experiments Validation
Were the differentially expressed genes predicted as targets? Rank modules by enrichment for differentially expressed genes.

Ppt1
#   Module                                   Significance
14  Ribosomal and phosphate metabolism       8/32, 9e-3
11  Amino acid and purine metabolism         11/53, 1e-2
15  mRNA, rRNA and tRNA processing           9/43, 2e-2
39  Protein folding                          6/23, 2e-2
30  Cell cycle                               7/30, 2e-2

Ypl230w
#   Module                                   Significance
39  Protein folding                          7/23, 1e-4
29  Cell differentiation                     6/41, 2e-2
5   Glycolysis and folding                   5/37, 4e-2
34  Mitochondrial and protein fate           5/37, 4e-2

Kin82
#   Module                                   Significance
3   Energy and osmotic stress I              8/31, 1e-4
2   Energy, osmolarity & cAMP signaling      9/64, 6e-3
15  mRNA, rRNA and tRNA processing           6/43, 2e-2

All regulators regulate predicted modules.
Segal et al., Nature Genetics, 2003
50
Summary
Probabilistic model for learning modules of variables and their structural dependencies.
Improved performance over Bayesian networks: statistical robustness, interpretability.
Application to gene regulation: reconstruction of many known regulatory modules; prediction of targets for unknown regulators.