
1 Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme Dirk Husmeier Marco Grzegorczyk 1) Biomathematics & Statistics Scotland 2) Centre for Systems Biology at Edinburgh

2 Systems Biology

3 Cell membrane and nucleus: protein activation cascade; TF phosphorylation → cell response

4 Raf signalling network. From Sachs et al., Science 2005

5

6 Unknown system → high-throughput experiments → postgenomic data → machine learning / statistical methods

7 Differential equation models Multiple parameter sets can offer equally plausible solutions. Multimodality in parameter space: point estimates become meaningless. Overfitting problem → not suitable for model selection. Bayesian approach: computing the marginal likelihood is computationally challenging.

8 Bayesian networks A marriage between graph theory and probability theory: a directed acyclic graph (DAG) of nodes and edges representing conditional independence relations. It is possible to score a network in the light of the data: P(D|M), where D is the data and M the network structure. We can infer how well a particular network explains the observed data.

9

10 Learning Bayesian networks P(M|D) = P(D|M) P(M) / Z M: Network structure. D: Data
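As a concrete (and entirely illustrative) instance of this scoring rule, the following sketch normalises hypothetical log marginal likelihoods for three candidate two-node structures under a uniform structure prior; all numbers are made up:

```python
import math

# Hypothetical log marginal likelihoods log P(D|M) for three candidate
# structures over nodes A and B (values are made up for illustration).
log_lik = {"A->B": -10.2, "B->A": -10.2, "independent": -12.7}
prior = {m: 1.0 / len(log_lik) for m in log_lik}  # uniform P(M)

# P(M|D) = P(D|M) P(M) / Z, computed in log space for numerical stability.
log_terms = {m: log_lik[m] + math.log(prior[m]) for m in log_lik}
m_max = max(log_terms.values())
log_z = m_max + math.log(sum(math.exp(v - m_max) for v in log_terms.values()))
posterior = {m: math.exp(log_terms[m] - log_z) for m in log_terms}

print({m: round(p, 3) for m, p in posterior.items()})
# The two equal-scoring structures receive equal posterior mass.
```

Note that the two Markov-equivalent structures A→B and B→A receive identical scores, which is one motivation for sampling rather than point estimation.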

11

12

13 MCMC in structure space Madigan & York (1995), Giudici & Castelo (2003)
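A minimal sketch of classical structure MCMC in this spirit: Metropolis moves that toggle single edges, with proposals rejected when they leave DAG space. The score function is a stand-in for a real marginal likelihood, and the three-node setting is illustrative:

```python
import math
import random

NODES = ["A", "B", "C"]

def is_acyclic(edges):
    """Kahn's algorithm: a directed graph is a DAG iff every node can be
    peeled off once it has no remaining incoming edges."""
    remaining, es = set(NODES), set(edges)
    while remaining:
        sources = [n for n in remaining if all(v != n for _, v in es)]
        if not sources:
            return False  # a cycle blocks further peeling
        remaining -= set(sources)
        es = {(u, v) for u, v in es if u not in sources}
    return True

def log_score(edges):
    # Stand-in for the log marginal likelihood log P(D|M): just penalise
    # dense graphs, so the sketch runs without data.
    return -0.5 * len(edges)

def mcmc_step(edges, rng):
    u, v = rng.sample(NODES, 2)
    # Single-edge move: toggle the edge u -> v (its own inverse, so the
    # proposal is symmetric and no Hastings correction is needed here).
    proposal = edges - {(u, v)} if (u, v) in edges else edges | {(u, v)}
    if not is_acyclic(proposal):
        return edges  # reject: proposal leaves the space of DAGs
    accept = math.exp(min(0.0, log_score(proposal) - log_score(edges)))
    return proposal if rng.random() < accept else edges

rng = random.Random(0)
graph = frozenset()
for _ in range(500):
    graph = mcmc_step(graph, rng)
print(sorted(graph))  # some DAG over A, B, C
```

The slow mixing discussed later comes precisely from these single-edge steps: moving between high-scoring structures can require passing through low-scoring intermediates.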

14 Alternative paradigm: order MCMC

15

16

17 MCMC in structure space Instead of

18 MCMC in order space

19

20

21 Problem: Distortion of the prior distribution

22-26 Two-node example: three structures (A and B unconnected, A→B, B→A) but only two node orders. A uniform prior over orders distorts the prior over structures: the unconnected graph is compatible with both orders and receives prior probability 0.5, while A→B and B→A receive 0.25 each.

27 Proposed new paradigm (current work with Marco Grzegorczyk): MCMC in structure space rather than order space; design new proposal moves that achieve faster mixing and convergence.

28 First idea Propose new parents for a node from the proposal distribution. Identify those new parents that are involved in the formation of directed cycles. Orphan them, and sample new parents for them subject to the acyclicity constraint.

29 1) Select a node. 2) Sample new parents. 3) Find directed cycles. 4) Orphan "loopy" parents. 5) Sample new parents for these parents.
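Steps 3 and 4 above can be sketched as follows; the node names and graph are illustrative, and the key observation is that a proposed parent edge p → child closes a cycle exactly when p is already reachable from the child:

```python
# Detect which proposed parent edges would close a directed cycle; those
# parents are the "loopy" ones to orphan and resample.
def reachable(edges, start, target):
    """Depth-first search: is `target` reachable from `start`?"""
    stack, seen = [start], set()
    while stack:
        n = stack.pop()
        if n == target:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(v for u, v in edges if u == n)
    return False

def loopy_parents(edges, child, new_parents):
    # Adding p -> child closes a cycle iff p is already reachable from child.
    return {p for p in new_parents if reachable(edges, child, p)}

edges = {("X", "Y"), ("Y", "Z")}
print(loopy_parents(edges, "X", {"Z", "W"}))
# {'Z'}: the edge Z -> X would close the cycle X -> Y -> Z -> X
```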

30 Problem: this move is not reversible, because the reverse path would lead via an illegal (cyclic) structure.

31 Devise a simpler move that is reversible: identify a pair of nodes connected by an edge X → Y; orphan both nodes; sample new parents from the "Boltzmann distribution" subject to the acyclicity constraint, such that the reversed edge Y → X is included.

32 1) Select an edge. 2) Orphan the nodes involved. 3) Constrained resampling of the parents.

33 This move is reversible!

34 1) Select an edge. 2) Orphan the nodes involved. 3) Constrained resampling of the parents.
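One half of this move can be sketched as follows (the full move would resample Y's parents under the same constraint); the node set, local score, and Boltzmann weighting are illustrative stand-ins, with candidate parent sets enumerated exhaustively for clarity:

```python
import itertools
import math
import random

NODES = ("X", "Y", "Z", "W")

def is_acyclic(edges):
    remaining, es = set(NODES), set(edges)
    while remaining:
        sources = [n for n in remaining if all(v != n for _, v in es)]
        if not sources:
            return False
        remaining -= set(sources)
        es = {(u, v) for u, v in es if u not in sources}
    return True

def local_log_score(node, parents):
    return -0.3 * len(parents)  # stand-in for the local marginal likelihood

def rev_move_half(edges, x, y, rng):
    # 1) Orphan both endpoints of the selected edge x -> y.
    core = {(u, v) for u, v in edges if v not in (x, y)}
    # 2) Candidate parent sets for x: must contain y and keep a DAG.
    cands, weights = [], []
    others = [n for n in NODES if n != x]
    for r in range(1, len(others) + 1):
        for ps in itertools.combinations(others, r):
            if y in ps and is_acyclic(core | {(p, x) for p in ps}):
                cands.append(ps)
                weights.append(math.exp(local_log_score(x, ps)))
    # 3) Sample one parent set from the Boltzmann-weighted distribution.
    pick = rng.choices(cands, weights=weights)[0]
    return frozenset(core | {(p, x) for p in pick})

rng = random.Random(2)
new_edges = rev_move_half({("X", "Y"), ("Z", "Y")}, "X", "Y", rng)
print(("Y", "X") in new_edges, is_acyclic(new_edges))  # True True
```

Real parent sets are usually restricted by a fan-in bound, which keeps this enumeration tractable.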

35 Simple idea, mathematical challenge: show that the condition of detailed balance is satisfied, and derive the Hastings factor, which is a function of various partition functions.

36 Acceptance probability

37

38 Ergodicity The new move is reversible but not irreducible. Theorem: a mixture with an ergodic transition kernel gives an ergodic Markov chain. REV-MCMC: at each step, randomly switch between a conventional structure MCMC step and the proposed new move.
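The switching logic can be sketched as below; the two component kernels are hypothetical identity stand-ins, and the mixture weight 0.15 is an arbitrary illustrative choice:

```python
import random

def rev_mcmc_step(graph, rng, structure_move, rev_move, p_rev=0.15):
    """One REV-MCMC step: with probability p_rev apply the new reversal
    move, otherwise a conventional structure MCMC move."""
    if rng.random() < p_rev:
        return rev_move(graph, rng), "rev"
    return structure_move(graph, rng), "struct"

rng = random.Random(1)
counts = {"rev": 0, "struct": 0}
g = frozenset()
for _ in range(10000):
    # Identity stand-ins for the two kernels, to show only the switching.
    g, kind = rev_mcmc_step(g, rng, lambda g, r: g, lambda g, r: g)
    counts[kind] += 1
print(counts)  # roughly 15% of the steps use the reversal move
```

Because the conventional kernel alone is ergodic, any nonzero mixture weight preserves ergodicity of the combined chain.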

39

40 Evaluation Does the new method avoid the bias intrinsic to order MCMC? How do convergence and mixing compare to structure and order MCMC? What is the effect on the network reconstruction accuracy?

41 Results Analytical comparison of the convergence properties Empirical comparison of the convergence properties Evaluation of the systematic bias Molecular regulatory network reconstruction with prior knowledge

42 Analytical comparison of the convergence properties Generate data from a noisy XOR. Enumerate all 3-node networks.

43 Analytical comparison of the convergence properties Generate data from a noisy XOR. Enumerate all 3-node networks. Compute the posterior distribution p°. Compute the Markov transition matrix A for the different MCMC methods. Iterate the Markov chain p(t+1) = A p(t). Compute the (symmetrized) KL divergence KL(t) between p(t) and p°.
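The procedure above can be sketched numerically; a toy 3-state column-stochastic matrix stands in for the actual transition matrices over all 3-node networks:

```python
import math

# Column-stochastic toy kernel: A[i][j] = P(next = i | current = j).
A = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
p_star = [1 / 3, 1 / 3, 1 / 3]  # stationary distribution of this kernel

def step(A, p):
    # One iteration of the Markov chain: p(t+1) = A p(t).
    return [sum(A[i][j] * p[j] for j in range(len(p))) for i in range(len(A))]

def sym_kl(p, q, eps=1e-12):
    # Symmetrised KL divergence KL(p||q) + KL(q||p), with a small epsilon
    # guarding against zero probabilities.
    kl = lambda a, b: sum(x * math.log((x + eps) / (y + eps))
                          for x, y in zip(a, b))
    return kl(p, q) + kl(q, p)

p = [1.0, 0.0, 0.0]  # chain initialised at a single structure
trace = []
for t in range(20):
    p = step(A, p)
    trace.append(sym_kl(p, p_star))
print(trace[0] > trace[-1], trace[-1] < 1e-4)  # divergence decays towards 0
```

The decay rate of KL(t) is governed by the second-largest eigenvalue of A, which is how the different MCMC kernels can be compared analytically.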

44 Solid line: REV-MCMC. Other lines: structure MCMC and different versions of inclusion-driven MCMC

45 Results Analytical comparison of the convergence properties Empirical comparison of the convergence properties Evaluation of the systematic bias Molecular regulatory network reconstruction with prior knowledge

46 Empirical comparison of the convergence and mixing properties Standard benchmark data: Alarm network (Beinlich et al. 1989) for monitoring patients in intensive care; 37 nodes, 46 directed edges. Generate data sets of different size. Compare the three MCMC algorithms under the same computational costs: structure MCMC (1.0E6 steps), order MCMC (1.0E5), REV-MCMC (1.0E5).

47

48

49

50

51 What are the implications for network reconstruction? ROC curves and the area under the ROC curve (AUROC): example curves with AUC = 1 (perfect), AUC = 0.75, and AUC = 0.5 (random).
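The AUROC can be sketched via its rank (Mann-Whitney) formulation: the area equals the probability that a randomly chosen true edge is scored above a randomly chosen non-edge, with ties counting one half. Scores and labels below are illustrative:

```python
def auroc(scores, labels):
    """AUROC as the Mann-Whitney statistic over positive/negative pairs."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0  (perfect ranking)
print(auroc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0]))  # 0.5  (random)
```

In the network setting, the scores are the posterior edge probabilities and the labels mark edges of the gold-standard network.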

52

53 Conclusion Structure MCMC has convergence and mixing difficulties. Order MCMC and REV-MCMC show a similar (and much better) performance.

54 Conclusion Structure MCMC has convergence and mixing difficulties. Order MCMC and REV-MCMC show a similar (and much better) performance. How about the bias?

55 Results Analytical comparison of the convergence properties Empirical comparison of the convergence properties Evaluation of the systematic bias Molecular regulatory network reconstruction with prior knowledge

56 Evaluation of the systematic bias using standard benchmark data Standard machine learning benchmark data: FLARE and VOTE. Restriction to 5 nodes → complete enumeration possible (~ 1.0E4 structures). The true posterior probabilities of edge features can be computed. Compute the difference between the true scores and those obtained with MCMC.
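The exhaustive procedure can be sketched at a smaller scale (3 nodes instead of 5, and a stand-in score in place of the real marginal likelihood):

```python
import itertools
import math

NODES = ("A", "B", "C")
PAIRS = [(u, v) for u in NODES for v in NODES if u != v]

def is_acyclic(edges):
    remaining, es = set(NODES), set(edges)
    while remaining:
        sources = [n for n in remaining if all(v != n for _, v in es)]
        if not sources:
            return False
        remaining -= set(sources)
        es = {(u, v) for u, v in es if u not in sources}
    return True

def log_score(edges):
    return -0.7 * len(edges)  # hypothetical stand-in for log P(D|M)

# Enumerate every DAG, weight by its score, and normalise.
dags = [s for r in range(len(PAIRS) + 1)
        for s in itertools.combinations(PAIRS, r) if is_acyclic(s)]
weights = [math.exp(log_score(d)) for d in dags]
z = sum(weights)

# Exact posterior probability of each directed-edge feature: the
# reference values an MCMC estimate can be compared against.
edge_post = {e: sum(w for d, w in zip(dags, weights) if e in d) / z
             for e in PAIRS}
print(len(dags))  # 25 labelled DAGs on 3 nodes
```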

57 Deviations between true and estimated directed edge feature posterior probabilities

58

59 Results Analytical comparison of the convergence properties Empirical comparison of the convergence properties Evaluation of the systematic bias Molecular regulatory network reconstruction with prior knowledge

60 Raf regulatory network. From Sachs et al., Science 2005

61 Raf signalling pathway Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells. Deregulation → carcinogenesis. Extensively studied in the literature → gold-standard network available.

62 Data Prior knowledge

63 Flow cytometry data Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins. 5400 cells were measured under 9 different cellular conditions (cues). Downsampling to 10 and 100 instances (5 separate subsets): indicative of microarray experiments.

64 Data Prior knowledge

65

66 Biological prior knowledge matrix B (for "belief"): each entry indicates some knowledge about the relationship between genes i and j. Define the energy of a graph G with respect to B.

67 Prior distribution over networks Energy of a network
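A small sketch of this construction, assuming the energy form E(G) = Σ_ij |B_ij − G_ij| used in related work from the same group (the exact formula is on the slide image, not in this transcript), with P(G) proportional to exp(−β E(G)); the matrices and β below are illustrative:

```python
import math

def energy(B, G):
    # E(G) = sum_ij |B_ij - G_ij|: low energy when the adjacency matrix G
    # agrees with the prior belief matrix B (assumed form; see lead-in).
    n = len(B)
    return sum(abs(B[i][j] - G[i][j]) for i in range(n) for j in range(n))

B = [[0.0, 0.9],
     [0.1, 0.0]]             # strong prior belief in the edge 0 -> 1
G_with = [[0, 1], [0, 0]]     # graph containing 0 -> 1
G_without = [[0, 0], [0, 0]]  # empty graph
beta = 5.0                    # hyperparameter weighting the prior knowledge

# P(G) proportional to exp(-beta * E(G)); compare the two graphs.
ratio = math.exp(-beta * (energy(B, G_with) - energy(B, G_without)))
print(round(ratio, 3))  # the prior strongly favours the consistent graph
```

As β → 0 the prior becomes flat and the data dominate; large β forces the sampled networks towards the prior knowledge.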

68 Prior knowledge (Sachs et al.), prior belief assigned to edges vs. non-edges of the gold standard: edge 0.9 / non-edge 0.1; edge 0.6 / non-edge 0.4; edge 0.55 / non-edge 0.45.

69 AUROC scores

70 Conclusion True prior knowledge that is strong → no significant difference. True prior knowledge that is weak → order MCMC leads to a slight yet significant deterioration (significant at the p = 0.01 level in a paired t-test).

71

72 Prior knowledge from KEGG

73 Flow cytometry data and KEGG

74 Conclusions The new method avoids the bias intrinsic to order MCMC. Its convergence and mixing are similar to order MCMC; both methods outperform structure MCMC. We can get an improvement over order MCMC when using explicit prior knowledge.

75 Thank you! Any questions?

