Published by Eustace Gibson. Modified over 9 years ago.
1
Reverse engineering of regulatory networks. Dirk Husmeier & Adriano Werhli
2
Systems biology: learning signalling pathways and regulatory networks from postgenomic data
3
Reverse Engineering of Regulatory Networks. Can we learn the network structure from the postgenomic data themselves? Statistical methods to distinguish between: –Direct correlations –Indirect correlations. Challenge: distinguish between: –Correlations –Causal interactions. Breaking symmetries with active interventions: –Gene knockouts (VIGS, RNAi)
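The difference between a direct and an indirect correlation can be illustrated with a first-order partial correlation. The sketch below is only an illustration, not the authors' method: the function name `partial_corr` and the three-variable chain x -> z -> y with its correlation values are assumptions made up for this example.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y after conditioning on z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical chain x -> z -> y: x and y are correlated only through z.
r_xz, r_yz = 0.8, 0.7
r_xy = r_xz * r_yz  # marginal correlation implied by the chain alone

print("marginal correlation:", r_xy)
print("partial correlation :", partial_corr(r_xy, r_xz, r_yz))
```

The marginal correlation between x and y is clearly non-zero, while the partial correlation given z vanishes, which is the signature of an indirect association.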
8
Shrinkage estimation and the lemma of Ledoit-Wolf
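The slide gives only the title, so the following is a sketch of the usual form of a shrinkage covariance estimator as used with graphical Gaussian models, under assumed notation: S is the empirical covariance matrix, T a low-dimensional shrinkage target, and λ the shrinkage intensity, which the Ledoit-Wolf lemma allows to be chosen analytically. The partial correlations ρ_ij then follow from the inverse of the shrunk estimate.

```latex
\[
  \hat{\Sigma}^{*} = \lambda\, T + (1-\lambda)\, S, \qquad 0 \le \lambda \le 1
\]
\[
  \rho_{ij} = -\frac{\omega_{ij}}{\sqrt{\omega_{ii}\,\omega_{jj}}},
  \qquad \Omega = (\omega_{ij}) = \bigl(\hat{\Sigma}^{*}\bigr)^{-1}
\]
```

Shrinkage keeps the estimate well conditioned and invertible even when the number of variables exceeds the number of samples, which is the typical situation for microarray-sized data sets.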
13
Bayesian networks versus graphical Gaussian models: directed versus undirected graphs; score-based versus constraint-based inference
14
Evaluation: on real experimental data, using the gold-standard network from the literature; on synthetic data simulated from the gold-standard network
16
Evaluation: Raf signalling pathway. Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells. Deregulation leads to carcinogenesis. Extensively studied in the literature, which provides a gold-standard network.
18
Data: laboratory data from cytometry experiments, down-sampled to 100 measurements. This sample size is indicative of microarray experiments.
20
Two types of experiments
22
Evaluation: on real experimental data, using the gold-standard network from the literature; on synthetic data simulated from the gold-standard network
23
Comparison with simulated data 1
24
Raf pathway
25
Comparison with simulated data 2
26
Steady-state approximation
30
Evaluation 1: AUC scores
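As an illustration of how an AUC score can be computed for a predicted network, the sketch below uses the rank-based definition of the area under the ROC curve: the probability that a randomly chosen true edge receives a higher score than a randomly chosen non-edge, with ties counted as one half. The function name `roc_auc` and the toy edge scores are assumptions for this example; the slides do not show how the authors computed their scores.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from edge scores and gold-standard labels (1 = true edge)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    # Count pairs (true edge, non-edge) ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: four candidate edges, two of which are in the gold standard.
edge_scores = [0.9, 0.8, 0.3, 0.1]
in_gold_standard = [1, 0, 1, 0]
print(roc_auc(edge_scores, in_gold_standard))
```

An AUC of 0.5 corresponds to random ranking of the edges and 1.0 to a perfect ranking.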
31
Evaluation 2: TP scores. We set the threshold so that we obtained 5 spurious edges (5 FPs) and counted the corresponding number of true edges (TP count).
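The TP count at a fixed number of false positives can be sketched as a walk down the ranked edge list. The function name `tp_at_fp` and the toy data are assumptions for this example. The code also makes one interpretive choice that the slide does not specify: it keeps counting true edges ranked after the last allowed false positive and stops only when one more false positive would be needed. Ties between scores are not treated specially.

```python
def tp_at_fp(scores, labels, max_fp=5):
    """Number of true edges recovered when the threshold admits max_fp false edges."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    for _, is_true_edge in ranked:
        if is_true_edge:
            tp += 1
        else:
            if fp == max_fp:
                break  # lowering the threshold further would add another FP
            fp += 1
    return tp

# Toy example with a budget of one false positive.
edge_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
in_gold_standard = [1, 0, 1, 0, 1]
print(tp_at_fp(edge_scores, in_gold_standard, max_fp=1))
```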
35
AUC scores
36
TP scores
37
Raf pathway
38
Conclusions 1: Bayesian networks (BNs) and graphical Gaussian models (GGMs) outperform relevance networks (RNs), most notably on Gaussian data. There is no significant difference between BNs and GGMs on observational data. On interventional data, BNs clearly outperform GGMs and RNs, especially when the edge direction (DGE score) rather than just the skeleton (UGE score) is taken into account.
39
Conclusions 2: Performance on synthetic data is better than on real data. Possible reasons: real data are more complex; real interventions are not ideal; errors in the gold-standard network.
41
Reconstructing gene regulatory networks with Bayesian networks by combining microarray data with biological prior knowledge
42
MOTIVATION
43
Use TF binding motifs in promoter sequences
44
Use prior knowledge from KEGG
45
Prior knowledge
47
Biological prior knowledge matrix: each entry indicates some knowledge about the relationship between genes i and j
48
Biological prior knowledge matrix: each entry indicates some knowledge about the relationship between genes i and j. Define the energy of a graph G.
49
Prior distribution over networks. Energy of a network.
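The equations on this slide did not survive extraction. A sketch of the form such a prior usually takes, under assumed notation, is a Gibbs distribution over graphs: B is the prior knowledge matrix with entries B_ij in [0, 1], G is the adjacency matrix of the graph with entries G_ij in {0, 1}, E(G) is the energy measuring the disagreement between the two, β is a hyperparameter weighting the prior, and Z(β) is the partition function.

```latex
\[
  E(G) = \sum_{i,j} \bigl| B_{ij} - G_{ij} \bigr|
\]
\[
  P(G \mid \beta) = \frac{e^{-\beta E(G)}}{Z(\beta)},
  \qquad
  Z(\beta) = \sum_{G'} e^{-\beta E(G')}
\]
```

With β = 0 the prior is flat over all graphs; larger values of β concentrate the prior on graphs that agree with the prior knowledge.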
50
Rewriting the energy. Energy of a network.
51
Approximation of the partition function
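The slide shows only the title. One common way to approximate such a partition function, sketched here under stated assumptions, is to ignore the acyclicity constraint so that the sum over graphs factorises into independent sums over the parent sets of each node, with a cap on the number of parents. The energy form |B_ij - G_ij|, the names `local_energy` and `approx_log_partition`, and the `max_fan_in` parameter are assumptions for this example and may differ from what the authors presented.

```python
import math
from itertools import combinations

def local_energy(B, node, parents):
    """Energy contribution of one node: sum over m of |B[m][node] - G[m][node]|."""
    return sum(
        (1.0 - B[m][node]) if m in parents else B[m][node]
        for m in range(len(B))
        if m != node  # self-loops are excluded
    )

def approx_log_partition(B, beta, max_fan_in=3):
    """log Z(beta), summing over parent sets per node and ignoring acyclicity."""
    n = len(B)
    log_z = 0.0
    for node in range(n):
        candidates = [m for m in range(n) if m != node]
        total = 0.0
        for k in range(min(max_fan_in, len(candidates)) + 1):
            for parents in combinations(candidates, k):
                total += math.exp(-beta * local_energy(B, node, set(parents)))
        log_z += math.log(total)
    return log_z

# Toy prior over two genes: strong prior belief in the edge 0 -> 1.
B = [[0.0, 1.0],
     [0.0, 0.0]]
print(approx_log_partition(B, beta=1.0, max_fan_in=1))
```

Because cyclic graphs are included in the sum, the result is an upper bound on the true log partition function over acyclic graphs.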
52
Multiple sources of prior knowledge
53
Rewriting the energy. Energy of a network.
54
Approximation of the partition function
55
MCMC sampling scheme
56
Sample networks and hyperparameters from the posterior distribution. Metropolis-Hastings scheme. Proposal probabilities.
57
Metropolis-Hastings scheme
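A generic Metropolis-Hastings step can be sketched as follows. This is a minimal illustration of the acceptance rule, not the authors' sampler over graphs and hyperparameters: the names `acceptance_probability`, `mh_step`, `log_target`, and `propose`, as well as the one-dimensional toy target, are assumptions for this example.

```python
import math
import random

def acceptance_probability(log_target_new, log_target_old,
                           log_q_reverse=0.0, log_q_forward=0.0):
    """min(1, target ratio * proposal ratio), computed in log space."""
    log_ratio = (log_target_new - log_target_old) + (log_q_reverse - log_q_forward)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

def mh_step(state, log_target, propose, rng):
    """One move: propose a candidate, then accept it or keep the current state."""
    candidate, log_q_forward, log_q_reverse = propose(state, rng)
    a = acceptance_probability(log_target(candidate), log_target(state),
                               log_q_reverse, log_q_forward)
    return candidate if rng.random() < a else state

# Toy target: unnormalised standard normal, symmetric random-walk proposal.
def log_target(x):
    return -0.5 * x * x

def propose(x, rng):
    return x + rng.uniform(-1.0, 1.0), 0.0, 0.0  # symmetric, so log q terms cancel

rng = random.Random(0)
x = 0.0
samples = []
for _ in range(1000):
    x = mh_step(x, log_target, propose, rng)
    samples.append(x)
```

For graph moves such as adding, deleting, or reversing an edge, the proposal is generally not symmetric, which is why the two proposal probabilities enter the acceptance ratio explicitly.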
59
MCMC with one prior. Sample the graph and the hyperparameter β. Split the move into two sampling steps to improve the acceptance rate: 1. Sample the graph with β fixed. 2. Sample β with the graph fixed.
60
MCMC with one prior. Sample the graph and the hyperparameter β. Marginal likelihood scores: BGe, BDe. Split the move into two sampling steps to improve the acceptance rate: 1. Sample the graph with β fixed. 2. Sample β with the graph fixed.
64
Approximation of the partition function
65
MCMC with two priors. Sample the graph and the hyperparameters β1 and β2. Split the move into three sampling steps to improve the acceptance rate: 1. Sample the graph with β1 and β2 fixed. 2. Sample β1 with the graph and β2 fixed. 3. Sample β2 with the graph and β1 fixed.
66
Application to real data
67
Flow cytometry data and KEGG
69
Data available: –Intracellular multicolour flow cytometry. –Measured protein concentrations. –1200 data points. We sample 5 data sets with 100 data points each.
70
Flow cytometry data and KEGG. KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.
71
Flow cytometry data and KEGG
72
Prior distribution
73
Sampled values of the hyperparameters
74
Idealized network population
76
Sampled values of the hyperparameters
77
Performance evaluation: AUC scores
78
Flow cytometry data and KEGG
79
Comparison: AUC for fixed and sampled hyperparameters on real data
80
Comparison: AUC for fixed and sampled hyperparameters on synthetic data
81
Future work
83
Thank you