Published by Lynne Phillips. Modified over 9 years ago.
3
Problem: Limited number of experimental replications. Postgenomic data are intrinsically noisy. Poor network reconstruction.
4
Problem: Limited number of experimental replications. Postgenomic data are intrinsically noisy. Can we improve the network reconstruction by systematically integrating different sources of biological prior knowledge?
9
Which sources of prior knowledge are reliable? How do we trade off the different sources of prior knowledge against each other and against the data?
10
Overview of the talk Revision: Bayesian networks Integration of prior knowledge Empirical evaluation
12
Bayesian networks. [Figure: example graph with nodes A to F, labelled NODES and EDGES.] Marriage between graph theory and probability theory. Directed acyclic graph (DAG) representing conditional independence relations. It is possible to score a network in light of the data: P(D|M), where D is the data and M is the network structure. We can infer how well a particular network explains the observed data.
14
Bayesian networks versus causal networks Bayesian networks represent conditional (in)dependence relations - not necessarily causal interactions.
15
Bayesian networks versus causal networks. [Figure: two graphs over nodes A, B and C, labelled "True causal graph" and "Node A unknown".]
16
Bayesian networks versus causal networks. Equivalence classes: networks with the same score P(D|M). Equivalent networks cannot be distinguished in light of the data. [Figure: four networks over nodes A, B and C.]
17
Symmetry breaking. [Figure: candidate networks over nodes A, B and C, with prior knowledge.] P(M|D) = P(D|M) P(M) / Z. D: data. M: network structure.
18
P(D|M)
19
Prior knowledge: B is a transcription factor with binding sites in the upstream regions of A and C. P(M)
20
P(M|D) ~ P(D|M) P(M)
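The symmetry-breaking argument can be made concrete with a small numerical sketch. It assumes three hypothetical candidate structures that belong to one equivalence class (identical log P(D|M)) and a made-up prior that favours B as the common regulator of A and C; the function name, the structure labels and all numbers are illustrative assumptions, not values from the talk.

```python
import math

def posterior_over_structures(log_likelihoods, log_priors):
    """Return P(M|D) = P(D|M) P(M) / Z for every candidate structure M."""
    log_joint = {m: log_likelihoods[m] + log_priors[m] for m in log_likelihoods}
    # Z is computed with log-sum-exp to avoid underflow for very negative scores.
    top = max(log_joint.values())
    log_z = top + math.log(sum(math.exp(v - top) for v in log_joint.values()))
    return {m: math.exp(v - log_z) for m, v in log_joint.items()}

# Hypothetical equivalence class: the data score all three structures equally.
log_lik = {"A<-B->C": -120.0, "A->B->C": -120.0, "A<-B<-C": -120.0}
# Hypothetical prior: B is believed to be a transcription factor acting on A and C.
log_prior = {
    "A<-B->C": math.log(0.6),
    "A->B->C": math.log(0.2),
    "A<-B<-C": math.log(0.2),
}

posterior = posterior_over_structures(log_lik, log_prior)
best = max(posterior, key=posterior.get)
```

Because the likelihoods are identical, the posterior ranking is decided entirely by P(M), which is the point of the symmetry-breaking slides.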
21
Learning Bayesian networks P(M|D) = P(D|M) P(M) / Z M: Network structure. D: Data
24
Overview of the talk Revision: Bayesian networks Integration of prior knowledge Empirical evaluation
26
Use TF binding motifs in promoter sequences
27
Biological prior knowledge matrix. Indicates some knowledge about the relationship between genes i and j.
28
Biological prior knowledge matrix. Indicates some knowledge about the relationship between genes i and j. Define the energy of a graph G.
29
Notation. Prior knowledge matrix: P → B (for “belief”). Network structure: G (for “graph”) or M (for “model”). P: probabilities.
30
Prior distribution over networks Energy of a network
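The formulas on this slide did not survive extraction. As a minimal sketch, assume the energy measures the total disagreement between the graph and the prior knowledge matrix, E(G) = sum over i, j of |B_ij - G_ij|, and that the prior is a Gibbs distribution P(G|β) proportional to exp(-β E(G)). Both forms are assumptions made for illustration, as are the function names and the matrix values.

```python
import numpy as np

def energy(G, B):
    """Assumed energy: total absolute disagreement between graph G and beliefs B."""
    return float(np.abs(B - G).sum())

def log_unnormalised_prior(G, B, beta):
    """log of exp(-beta * E(G)); the partition function Z(beta) is left out."""
    return -beta * energy(G, B)

# Hypothetical beliefs for three genes; entry [i, j] is the belief in edge i -> j.
B = np.array([[0.0, 0.9, 0.5],
              [0.1, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

G_forward = np.zeros((3, 3))
G_forward[0, 1] = 1.0   # edge 0 -> 1, supported by B[0, 1] = 0.9
G_reverse = np.zeros((3, 3))
G_reverse[1, 0] = 1.0   # edge 1 -> 0, contradicted by B[1, 0] = 0.1

e_forward = energy(G_forward, B)
e_reverse = energy(G_reverse, B)
```

Under these assumptions the graph that agrees with the prior knowledge has the lower energy and therefore the higher prior probability, and β controls how strongly that difference counts.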
31
Sample networks and hyperparameters from the posterior distribution. Capture intrinsic inference uncertainty. Learn the trade-off parameters automatically. P(M|D) = P(D|M) P(M) / Z
32
Prior distribution over networks Energy of a network
33
Rewriting the energy Energy of a network
34
Approximation of the partition function Partition function of a perfect gas
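The perfect-gas analogy suggests a partition function that factorises over non-interacting parts. A hedged sketch of one such approximation: assume the energy decomposes into one term per node and its parent set, and ignore the acyclicity constraint so that the sum over graphs becomes a product of per-node sums. The energy form, the fan-in limit and all names below are assumptions for illustration, not formulas recovered from the slide.

```python
import itertools
import math
import numpy as np

def local_energy(node, parents, B):
    """Assumed per-node energy: (1 - B[i, node]) for parents i, B[i, node] otherwise."""
    total = 0.0
    for i in range(B.shape[0]):
        if i == node:
            continue
        total += (1.0 - B[i, node]) if i in parents else B[i, node]
    return total

def log_partition_factorised(B, beta, max_fan_in):
    """log Z under the factorised approximation (acyclicity is NOT enforced)."""
    n = B.shape[0]
    log_z = 0.0
    for node in range(n):
        others = [i for i in range(n) if i != node]
        # One Boltzmann term per admissible parent set of this node.
        terms = [
            -beta * local_energy(node, set(parent_set), B)
            for size in range(max_fan_in + 1)
            for parent_set in itertools.combinations(others, size)
        ]
        top = max(terms)
        log_z += top + math.log(sum(math.exp(t - top) for t in terms))
    return log_z

# Hypothetical uninformative beliefs for three genes.
B_flat = np.full((3, 3), 0.5)
log_z_flat = log_partition_factorised(B_flat, beta=1.0, max_fan_in=2)
```

Because cyclic graphs are counted as well, this quantity is an upper bound on the partition function over DAGs; the sketch makes no claim about how tight the bound is.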
35
Multiple sources of prior knowledge
36
MCMC sampling scheme
37
Sample networks and hyperparameters from the posterior distribution Metropolis-Hastings scheme Proposal probabilities
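A Metropolis-Hastings move for a trade-off hyperparameter can be sketched as follows. The sketch assumes a Gibbs prior P(G|β) = exp(-β E(G)) / Z(β), a uniform hyperprior on [0, beta_max] and a symmetric uniform proposal; the values of beta_max and step, the helper log_z_independent_edges and the belief values are all illustrative assumptions. Only the hyperparameter move is shown; moves on the network structure would also involve the likelihood P(D|M).

```python
import math
import random

def mh_update_beta(beta, energy_g, log_z, beta_max=30.0, step=3.0, rng=random):
    """One Metropolis-Hastings update of beta for a fixed graph with energy energy_g."""
    proposal = beta + rng.uniform(-step, step)
    if not 0.0 <= proposal <= beta_max:
        return beta  # outside the support of the uniform hyperprior: reject
    # log [P(G|proposal) / P(G|beta)]; the symmetric proposal cancels.
    log_ratio = (-proposal * energy_g - log_z(proposal)) - (-beta * energy_g - log_z(beta))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return beta

def log_z_independent_edges(beta, b_values):
    """Hypothetical log Z(beta) when every potential edge contributes independently."""
    return sum(
        math.log(math.exp(-beta * (1.0 - b)) + math.exp(-beta * b)) for b in b_values
    )

b_values = [0.9, 0.1, 0.8, 0.2]      # hypothetical beliefs for four potential edges
energy_g = 0.1 + 0.1 + 0.2 + 0.2     # a graph that agrees with every belief
rng = random.Random(0)
beta, samples = 1.0, []
for _ in range(2000):
    beta = mh_update_beta(
        beta, energy_g, lambda b: log_z_independent_edges(b, b_values), rng=rng
    )
    samples.append(beta)
```

The acceptance ratio needs Z(β) at both the current and the proposed value, which is why a tractable approximation of the partition function matters for sampling the hyperparameters.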
38
Bayesian networks with biological prior knowledge. Biological prior knowledge: information about the interactions between the nodes. We use two distinct sources of biological prior knowledge. Each source is associated with its own trade-off parameter: β1 and β2. The trade-off parameter indicates how much biological prior information is used. The trade-off parameters are inferred; they are not set by the user!
39
Bayesian networks with two sources of prior knowledge. [Diagram labels: Data; Source 1 (β1); Source 2 (β2); BNs + MCMC; Recovered networks and trade-off parameters.]
40
Bayesian networks with two sources of prior knowledge. [Diagram labels: Data; Source 1 (β1); Source 2 (β2); BNs + MCMC; Recovered networks and trade-off parameters.]
42
Overview of the talk Revision: Bayesian networks Integration of prior knowledge Empirical evaluation
43
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
44
Raf regulatory network. From Sachs et al., Science 2005.
45
Raf regulatory network
46
Evaluation: Raf signalling pathway. Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells. Deregulation leads to carcinogenesis. Extensively studied in the literature, hence a gold-standard network.
47
Data Prior knowledge
48
Flow cytometry data. Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins. 5400 cells have been measured under 9 different cellular conditions (cues). Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments.
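The downsampling step can be sketched as below. The sketch assumes each subset is drawn uniformly at random without replacement within the subset; whether the five subsets were disjoint, or stratified by condition, is not stated on the slide. The array used here is a synthetic placeholder with the stated dimensions (5400 cells, 11 proteins), not the actual measurements, and draw_subsets is a hypothetical helper name.

```python
import numpy as np

def draw_subsets(data, n_subsets=5, subset_size=100, seed=0):
    """Draw n_subsets random row subsets, each without repeated rows."""
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(n_subsets):
        rows = rng.choice(data.shape[0], size=subset_size, replace=False)
        subsets.append(data[rows])
    return subsets

# Synthetic placeholder standing in for the flow cytometry measurements.
placeholder = np.random.default_rng(1).normal(size=(5400, 11))
subsets = draw_subsets(placeholder)
```

Running the reconstruction on each subset separately gives a spread of results, which is one way to report variability at a sample size comparable to microarray studies.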
49
Microarray example. Spellman et al. (1998): cell cycle, 73 samples. Tu et al. (2005): metabolic cycle, 36 samples. [Figure axes: genes, time.]
50
Data Prior knowledge
51
Flow cytometry data and KEGG. KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks. http://www.genome.jp/kegg/
52
Prior knowledge from KEGG
53
Prior distribution
54
The data and the priors + KEGG + Random
55
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
56
Bayesian networks with two sources of prior knowledge. [Diagram labels: Data; Source 1 (β1); Source 2 (β2); BNs + MCMC; Recovered networks and trade-off parameters.]
57
Bayesian networks with two sources of prior knowledge. [Diagram labels: Data; Source 1 (β1); Source 2 (β2); BNs + MCMC; Recovered networks and trade-off parameters.]
58
Sampled values of the hyperparameters
59
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
60
How can we evaluate the reconstruction accuracy ?
62
Flow cytometry data and KEGG
63
Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?
64
Learning the trade-off hyperparameter. Repeat MCMC simulations for a large set of fixed hyperparameters β. Obtain AUC scores for each value of β. Compare with the proposed scheme, in which β is automatically inferred. Mean and standard deviation of the sampled trade-off parameter.
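An AUC score for a reconstructed network can be computed as sketched below. The sketch assumes every ordered pair of distinct nodes is scored by its marginal posterior edge probability and compared against a gold-standard adjacency matrix; the AUC is then the probability that a true edge outranks a non-edge, with ties counted as one half. The function name and both matrices are hypothetical examples.

```python
import numpy as np

def edge_auc(edge_scores, true_edges):
    """Area under the ROC curve for directed edge predictions (self-loops excluded)."""
    n = edge_scores.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    scores = edge_scores[off_diagonal]
    labels = true_edges[off_diagonal].astype(bool)
    pos, neg = scores[labels], scores[~labels]
    # Count pairs (true edge, non-edge) where the true edge scores higher.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Hypothetical gold standard (0 -> 1, 1 -> 2) and hypothetical posterior edge scores.
gold = np.array([[0, 1, 0],
                 [0, 0, 1],
                 [0, 0, 0]])
scores = np.array([[0.0, 0.9, 0.2],
                   [0.1, 0.0, 0.8],
                   [0.3, 0.1, 0.0]])
auc = edge_auc(scores, gold)
```

Repeating this for each fixed value of β gives a curve of AUC against β, against which the AUC obtained with the inferred β can be compared.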
66
Conclusion Bayesian scheme for the systematic integration of different sources of biological prior knowledge. The method can automatically evaluate how useful the different sources of prior knowledge are. We get an improvement in the regulatory network reconstruction. This improvement is close to optimal.
67
Thank you