1
Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks
Dirk Husmeier, Adriano V. Werhli
6
Learning Bayesian networks from data and prior knowledge
7
Bayesian networks
[Figure: example DAG over nodes A–F, with nodes and edges labelled]
Marriage between graph theory and probability theory.
Directed acyclic graph (DAG) representing conditional independence relations.
It is possible to score a network in light of the data.
We can infer how well a particular network explains the observed data.
8
Bayesian networks versus causal networks
[Figure: true causal graph over nodes A, B, C, and the graph obtained when node A is unknown]
9
Bayesian networks versus causal networks
Equivalence classes: networks with the same scores.
Equivalent networks cannot be distinguished in light of the data.
We can only learn the undirected graph.
Unless… we use interventions or prior knowledge.
[Figure: equivalent network structures over nodes A, B, C]
10
Learning Bayesian networks from data P(M|D) = P(D|M) P(M) / Z
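In LaTeX, and with the marginal likelihood written out (the integral is standard Bayesian-network scoring, added here for completeness rather than taken from the slide):

    P(M \mid D) = \frac{P(D \mid M)\, P(M)}{Z},
    \qquad
    P(D \mid M) = \int P(D \mid M, \theta)\, P(\theta \mid M)\, d\theta

where M is the network structure, D the data, \theta the parameters, and Z = \sum_M P(D \mid M)\, P(M) normalises over structures. For the scoring metrics used later in the talk (BGe for continuous data, BDe for discrete data) the integral has a closed form.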
15
Use TF binding motifs in promoter sequences
16
Biological prior knowledge matrix
Each entry indicates some knowledge about the relationship between genes i and j.
17
Biological prior knowledge matrix
Each entry indicates some knowledge about the relationship between genes i and j.
Define the energy of a graph G.
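The equation itself is not reproduced in this transcript. A plausible reconstruction, consistent with the published formulation by Werhli and Husmeier, measures the mismatch between a graph and the prior knowledge matrix B:

    E(G) = \sum_{i \neq j} \bigl| B_{ij} - G_{ij} \bigr|

where G_{ij} = 1 if the edge i \to j is in G and 0 otherwise, and B_{ij} \in [0, 1] is the prior belief in that edge (B_{ij} = 0.5 meaning no prior knowledge). Low energy means good agreement between the graph and the prior.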
18
Prior distribution over networks Energy of a network
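The equation image is again missing; the Gibbs form implied by the slide titles is:

    P(G \mid \beta) = \frac{\exp\{-\beta E(G)\}}{Z(\beta)},
    \qquad
    Z(\beta) = \sum_{G' \in \mathcal{G}} \exp\{-\beta E(G')\}

where \beta \ge 0 is the trade-off hyperparameter: \beta = 0 ignores the prior knowledge entirely, while a large \beta concentrates the prior on graphs that agree with it.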
19
Sample networks and hyperparameters from the posterior distribution
Capture intrinsic inference uncertainty.
Learn the trade-off parameters automatically.
P(M|D) = P(D|M) P(M) / Z
20
Prior distribution over networks Energy of a network
21
Rewriting the energy Energy of a network
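One way to read "rewriting the energy" (an assumption, but consistent with the factorised approximation on the next slide): because G_{in} depends only on whether node i is a parent of node n, the energy decomposes into independent contributions from each node n and its parent set \pi_n:

    E(G) = \sum_{n=1}^{N} \mathcal{E}(n, \pi_n),
    \qquad
    \mathcal{E}(n, \pi_n) = \sum_{i=1}^{N} \bigl| B_{in} - G_{in} \bigr|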
22
Approximation of the partition function
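A sketch of the approximation this decomposition suggests (assumed here rather than copied from the slide): drop the acyclicity constraint and restrict parent sets to a maximum fan-in, so that the partition function factorises into node-wise sums that can be computed exactly:

    Z(\beta) \;\approx\; \prod_{n=1}^{N} \; \sum_{\pi_n : |\pi_n| \le \text{fan-in}} \exp\{-\beta\, \mathcal{E}(n, \pi_n)\}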
23
Multiple sources of prior knowledge
24
Rewriting the energy Energy of a network
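For multiple sources of prior knowledge the natural extension (again a hedged reconstruction) gives each source k its own prior matrix B^{(k)}, energy and trade-off parameter:

    E_k(G) = \sum_{i \neq j} \bigl| B^{(k)}_{ij} - G_{ij} \bigr|, \quad k = 1, 2,
    \qquad
    P(G \mid \beta_1, \beta_2) = \frac{\exp\{-\beta_1 E_1(G) - \beta_2 E_2(G)\}}{Z(\beta_1, \beta_2)}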
25
Approximation of the partition function
26
MCMC sampling scheme
27
Sample networks and hyperparameters from the posterior distribution Metropolis-Hastings scheme Proposal probabilities
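The acceptance probability for a proposed move from (G, \beta) to (G^*, \beta^*) with proposal distribution Q has the standard Metropolis-Hastings form (written here assuming a flat hyperprior on \beta over its allowed range):

    A = \min\left\{ 1,\;
        \frac{P(D \mid G^{*})\, P(G^{*} \mid \beta^{*})\, Q(G, \beta \mid G^{*}, \beta^{*})}
             {P(D \mid G)\, P(G \mid \beta)\, Q(G^{*}, \beta^{*} \mid G, \beta)} \right\}

When only the graph changes, the partition function Z(\beta) cancels in the prior ratio; when only \beta changes, the marginal likelihood P(D \mid G) cancels. This is what makes the separated sampling steps on the following slides cheap.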
28
MCMC with one prior
Sample the graph and the trade-off parameter β.
Separate the move into two sampling steps to improve acceptance:
1. Sample the graph with β fixed.
2. Sample β with the graph fixed.
29
MCMC with one prior
Sample the graph and the trade-off parameter β; the marginal likelihood term is the BGe score (continuous data) or the BDe score (discretised data).
Separate the move into two sampling steps to improve acceptance:
1. Sample the graph with β fixed.
2. Sample β with the graph fixed.
33
Approximation of the partition function
34
MCMC with two priors
Sample the graph and the parameters β1 and β2.
Separate the move into three sampling steps to improve acceptance:
1. Sample the graph with β1 and β2 fixed.
2. Sample β1 with the graph and β2 fixed.
3. Sample β2 with the graph and β1 fixed.
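A minimal Python sketch of this three-step scheme. All helper functions (the BGe/BDe marginal likelihood, the energy, the factorised approximation of log Z, and the single-edge graph proposal) are assumed to exist and are named here only for illustration; they are not part of the talk.

    import math
    import random

    def mcmc_two_priors(data, B1, B2,
                        log_marginal_likelihood,   # BGe/BDe score of a graph given the data (assumed helper)
                        energy,                    # mismatch E_k(G) between a graph and a prior matrix (assumed helper)
                        log_Z_approx,              # factorised approximation of log Z(beta1, beta2) (assumed helper)
                        propose_graph,             # single-edge move returning (new graph, log Hastings correction)
                        n_iter=100000):
        """Alternately sample the graph G and the trade-off parameters beta1, beta2."""
        G = frozenset()                  # start from the empty graph; edges stored as (i, j) pairs
        beta1, beta2 = 0.5, 0.5          # arbitrary starting values for the trade-off parameters
        samples = []

        for _ in range(n_iter):
            # Step 1: sample the graph with beta1 and beta2 fixed.
            G_new, log_hastings = propose_graph(G)   # proposal must keep the graph acyclic
            log_ratio = (log_marginal_likelihood(data, G_new) - log_marginal_likelihood(data, G)
                         - beta1 * (energy(B1, G_new) - energy(B1, G))
                         - beta2 * (energy(B2, G_new) - energy(B2, G))
                         + log_hastings)             # Z(beta1, beta2) cancels in this ratio
            if random.random() < math.exp(min(0.0, log_ratio)):
                G = G_new

            # Step 2: sample beta1 with the graph and beta2 fixed (symmetric random-walk proposal).
            b1_new = beta1 + random.gauss(0.0, 0.1)
            if b1_new >= 0.0:                        # reject proposals outside the allowed range
                log_ratio = (-(b1_new - beta1) * energy(B1, G)
                             + log_Z_approx(beta1, beta2) - log_Z_approx(b1_new, beta2))
                if random.random() < math.exp(min(0.0, log_ratio)):
                    beta1 = b1_new

            # Step 3: sample beta2 with the graph and beta1 fixed.
            b2_new = beta2 + random.gauss(0.0, 0.1)
            if b2_new >= 0.0:
                log_ratio = (-(b2_new - beta2) * energy(B2, G)
                             + log_Z_approx(beta1, beta2) - log_Z_approx(beta1, b2_new))
                if random.random() < math.exp(min(0.0, log_ratio)):
                    beta2 = b2_new

            samples.append((G, beta1, beta2))
        return samples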
35
Bayesian networks with biological prior knowledge
Biological prior knowledge: information about the interactions between the nodes.
We use two distinct sources of biological prior knowledge.
Each source of biological prior knowledge is associated with its own trade-off parameter: β1 and β2.
The trade-off parameter indicates how much biological prior information is used.
The trade-off parameters are inferred. They are not set by the user!
36
Bayesian networks with two sources of prior
[Diagram: data plus prior Source 1 and prior Source 2 → BNs + MCMC → recovered networks and trade-off parameters β1, β2]
39
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
40
Application to the Raf regulatory network
41
Raf regulatory network (from Sachs et al., Science, 2005)
42
Evaluation: Raf signalling pathway
Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells.
Its deregulation is implicated in carcinogenesis.
Extensively studied in the literature, providing a gold-standard network.
43
Data Prior knowledge
44
Flow cytometry data and KEGG
Intracellular multicolour flow cytometry: measured protein concentrations.
11 proteins, 1200 concentration profiles.
We sample 5 separate subsets with 100 concentration profiles each.
45
Microarray example
[Figure: expression heat maps, genes × time]
Spellman et al. (1998): cell cycle, 73 samples.
Tu et al. (2005): metabolic cycle, 36 samples.
46
Data Prior knowledge
47
Flow cytometry data and KEGG
KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.
http://www.genome.jp/kegg/
48
Prior knowledge from KEGG
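A sketch of how a prior matrix of the kind defined earlier might be filled from a list of pathway edges. The particular values (0.9 for a reported edge, 0.5 for no information) and the example edges are illustrative assumptions, not the exact KEGG encoding used in the talk.

    import numpy as np

    def prior_matrix_from_edges(nodes, known_edges, present=0.9, absent=0.5):
        """Build a prior-knowledge matrix B over the given nodes.

        known_edges: (regulator, target) pairs extracted from a pathway database
        such as KEGG.  B[i, j] close to 1 favours the edge i -> j; 0.5 encodes
        'no prior knowledge'.  The numeric values here are illustrative only.
        """
        index = {name: k for k, name in enumerate(nodes)}
        B = np.full((len(nodes), len(nodes)), absent)
        for src, dst in known_edges:
            B[index[src], index[dst]] = present
        np.fill_diagonal(B, 0.0)      # no self-loops
        return B

    # Hypothetical usage with a few well-known Raf-pathway edges:
    B_kegg = prior_matrix_from_edges(["raf", "mek", "erk"], [("raf", "mek"), ("mek", "erk")])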
49
Prior distribution
50
The data and the priors: flow cytometry data + KEGG prior + random prior
51
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
52
Bayesian networks with two sources of prior
[Diagram: data plus prior Source 1 and prior Source 2 → BNs + MCMC → recovered networks and trade-off parameters β1, β2]
54
Sampled values of the hyperparameters
55
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
56
How to compare the recovered networks?
57
Performance evaluation
Thresholding: compare the predicted network with the true network.
58
Performance evaluation
Counting: compare the predicted network with the true network and count true positives and false positives.
59
Performance evaluation
Counting: compare the predicted network with the true network and count true positives and false positives.
DGE – consider edge directions.
UGE – discard edge directions.
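A small sketch of the DGE/UGE counting described above, assuming the true and predicted networks are available as binary adjacency matrices (the function name and representation are illustrative):

    import numpy as np

    def count_tp_fp(true_adj, pred_adj, directed=True):
        """Count true and false positive edges of a predicted adjacency matrix.

        directed=True  -> DGE: edge directions are taken into account.
        directed=False -> UGE: both graphs are symmetrised before counting.
        """
        t = np.array(true_adj, dtype=bool)
        p = np.array(pred_adj, dtype=bool)
        if not directed:
            t = t | t.T
            p = p | p.T
        np.fill_diagonal(t, False)     # ignore self-loops
        np.fill_diagonal(p, False)
        tp = int(np.sum(p & t))
        fp = int(np.sum(p & ~t))
        if not directed:               # each undirected edge appears twice in the matrix
            tp, fp = tp // 2, fp // 2
        return tp, fp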
60
Performance evaluation: ROC curves
61
Performance evaluation: ROC curves
We use the Area Under the Receiver Operating Characteristic curve (AUC).
[Figure: example ROC curves with AUC = 0.5, 0.75, and 1]
62
Evaluation 2: TP scores We set the threshold such that we obtain 5 spurious edges (5 FPs) and count the corresponding number of true edges (TP count).
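Both scores can be computed directly from the true adjacency matrix and a matrix of edge scores (for example, posterior edge probabilities summarised from the MCMC samples, which is an assumption about how the output is post-processed). A sketch:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def _off_diagonal(true_adj, edge_scores):
        """Flatten both matrices, dropping the diagonal (self-loops are not scored)."""
        mask = ~np.eye(true_adj.shape[0], dtype=bool)
        return np.asarray(true_adj)[mask].astype(int), np.asarray(edge_scores)[mask]

    def auc(true_adj, edge_scores):
        """Area under the ROC curve obtained by sweeping a threshold over the edge scores."""
        y_true, y_score = _off_diagonal(true_adj, edge_scores)
        return roc_auc_score(y_true, y_score)

    def tp_at_n_fp(true_adj, edge_scores, n_fp=5):
        """TP count at the threshold that admits exactly n_fp spurious edges."""
        y_true, y_score = _off_diagonal(true_adj, edge_scores)
        tp = fp = 0
        for idx in np.argsort(-y_score):      # strongest predicted edges first
            if y_true[idx]:
                tp += 1
            else:
                fp += 1
                if fp > n_fp:                 # the (n_fp + 1)-th false positive ends the count
                    break
        return tp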
64
Flow cytometry data and KEGG
65
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
66
Learning the trade-off hyperparameter
Repeat MCMC simulations for a large set of fixed hyperparameters β.
Obtain AUC scores for each value of β.
Compare with the proposed scheme in which β is automatically inferred.
[Figure: mean and standard deviation of the sampled trade-off parameter]
67
Learning the trade-off hyperparameters on simulated data
Repeat MCMC simulations for a large set of fixed hyperparameters β.
Obtain AUC scores for each value of β.
Compare with the proposed scheme in which β is automatically inferred.
[Figure: mean and standard deviation of the sampled trade-off parameter]
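A sketch of this comparison protocol, reusing the auc helper from the evaluation sketch above; the samplers run_mcmc_fixed_beta and run_mcmc_inferred_beta are hypothetical, and the β grid is arbitrary:

    import numpy as np

    def compare_fixed_vs_inferred(data, B, true_adj, run_mcmc_fixed_beta, run_mcmc_inferred_beta,
                                  betas=np.linspace(0.0, 30.0, 16)):
        """AUC for a grid of fixed trade-off parameters versus AUC with beta inferred by MCMC.

        run_mcmc_fixed_beta(data, B, beta) and run_mcmc_inferred_beta(data, B) are assumed
        to return matrices of posterior edge probabilities.
        """
        auc_fixed = {float(b): auc(true_adj, run_mcmc_fixed_beta(data, B, b)) for b in betas}
        auc_inferred = auc(true_adj, run_mcmc_inferred_beta(data, B))
        return auc_fixed, auc_inferred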
68
New evidence for the accepted network
Dougherty et al., "Regulation of Raf-1 by Direct Feedback Phosphorylation", Molecular Cell, Vol. 17, 2005.
69
Conclusion
Bayesian scheme for the systematic integration of different sources of biological prior knowledge.
The method can automatically evaluate how useful the different sources of prior knowledge are.
We get an improvement in the regulatory network reconstruction.
This improvement is close to optimal.
70
Thank you