Dynamic Bayesian Networks


Sushmita Roy (sroy@biostat.wisc.edu)
Computational Network Biology
Biostatistics & Medical Informatics 826 / Computer Sciences 838
https://compnetbiocourse.discovery.wisc.edu
Oct 18th 2016

Goals for today
- Dynamic Bayesian networks (DBNs)
- How does a DBN differ from a static BN?
- Learning a dynamic Bayesian network
- DBNs with structure priors
- Application of DBNs to a phospho-proteomics time course
- Evaluation and insights

Dynamic Bayesian networks
- A dynamic Bayesian network (DBN) is a Bayesian network that can model temporal/sequential data: a Bayes net for dynamic processes.
- Like a static BN, a DBN has a graph structure and conditional probability distributions.
- The DBN specifies how observations at a future time point arise from observations at previous time points.

Notation
- Assume we have a time course with T time points measuring the activity of p different variables.
- Let X^t = {X_1^t, ..., X_p^t} denote the set of random variables at time t.
- A DBN over these variables defines the joint distribution P(X) = P(X^1, ..., X^T).
- A DBN, like a BN, has a directed acyclic graph G and parameters Θ.
- G typically specifies the dependencies between time points; in addition, we need to specify the dependencies (if any) at the first time point.

A DBN for p variables and T time points
[Figure: the variables X_1^t, ..., X_p^t replicated across time points t = 1, ..., T, with arrows running from each time point to the next; X^2 denotes the variables at time t = 2, and the first time point carries its own dependency structure.]

Stationarity assumption in a dynamic Bayesian network
- The stationarity assumption states that the dependency structure and parameters do not change with t.
- Due to this assumption, we only need to specify the dependencies between two consecutive sets of variables (time t and time t+1).
[Figure: the same transition structure from X_1^t, ..., X_p^t to X_1^{t+1}, ..., X_p^{t+1} repeats for every consecutive pair of time points t = 1, ..., T.]

Computing the joint probability distribution in a DBN
The joint probability distribution factors into a product of conditional distributions across time and variables:

  P(X^1, ..., X^T) = P(X^1) ∏_{t=2}^{T} ∏_{i=1}^{p} P(X_i^t | Pa(X_i^t), Θ)

where the parents Pa(X_i^t) of each X_i^t are defined by the graph G, which encodes the dependency structure between variables at consecutive time points, and the parameters Θ specify the form of the conditional distributions.
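
To make the factorization concrete, here is a minimal sketch in Python (all variable names and CPD values are hypothetical, not from the lecture): it scores a discrete time course under a first-order DBN by adding one log-CPD term per variable per transition. Note that the same CPD dictionary is reused at every transition, which is exactly the stationarity assumption from the previous slide.

```python
import math

# Hypothetical first-order DBN over two variables, A and B (values on/off).
initial = {"A": {"on": 0.5, "off": 0.5}, "B": {"on": 0.5, "off": 0.5}}
parents = {"A": ["B"], "B": ["A"]}  # edges run from time t-1 to time t
cpd = {
    "A": {("on",): {"on": 0.8, "off": 0.2}, ("off",): {"on": 0.3, "off": 0.7}},
    "B": {("on",): {"on": 0.6, "off": 0.4}, ("off",): {"on": 0.1, "off": 0.9}},
}

def log_joint(timecourse):
    """log P(X^1..X^T) = log P(X^1) + sum over t>1 and i of log P(X_i^t | Pa(X_i^t))."""
    logp = sum(math.log(initial[v][timecourse[0][v]]) for v in initial)
    for prev, cur in zip(timecourse, timecourse[1:]):
        for v, pa in parents.items():
            pa_vals = tuple(prev[u] for u in pa)  # parents live at time t-1
            logp += math.log(cpd[v][pa_vals][cur[v]])
    return logp

course = [{"A": "on", "B": "off"}, {"A": "on", "B": "on"}, {"A": "off", "B": "on"}]
print(log_joint(course))
```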

Learning problems in DBNs
- Parameter learning: given known temporal dependencies between the random variables, estimate the parameters from observed measurements.
- Structure learning: given data, learn both the graph structure and the parameters.
- The complexity of learning depends on the order of the model (how many past time points each variable may depend on).

An example DBN
- Consider a simple example with two regulators, B and C, and one target gene, A.
- Assume their expression takes on the values H, L, and NC (for high, low, and no change in expression).
- A's expression level depends on the expression levels of regulators B and C.
- B and C mutually regulate each other.
- Let X_A^t denote the random variable representing the expression level of gene A at time t.

DBN for a three-node network
[Figure: the DBN unrolled over time for genes A, B and C, shown alongside the collapsed network (edges B → A, C → A, and B ↔ C over a single set of nodes).]

Specifying the parameters of the DBN for a three-node network
Each conditional distribution specifies a distribution over {H, L, NC} given the state of the parent variable(s).

Specifying the parameters of the DBN
[Slide shows a conditional probability table over {H, L, NC}; its entries did not survive transcription.]

Specifying the parameters of the DBN
[Slide shows the conditional probability tables with their numeric entries filled in; the entries did not survive transcription.]
Parameter estimation: estimating these numbers from data.

Assume the following CPDs for the three variables
[Slide shows the CPTs over {H, L, NC} for X_A (given X_B and X_C), X_B (given X_C), and X_C (given X_B); the numeric entries did not survive transcription.]

Computing the probability distribution of an observation
- Suppose we are given a new observed time course over T = 0, 1, 2 (the slide lists the observed values NC, L, H).
- Assume P(NC) = 0.5 and P(H) = P(L) = 0.25 for all variables at T = 0.
- Using the DBN from the previous slides, what is the probability of this time course?
- First, we apply the chain rule at the time-point level: P(X^0, X^1, X^2) = P(X^0) P(X^1 | X^0) P(X^2 | X^1).
- Next, we use the graph structure of the DBN to further decompose these terms into per-variable CPDs.

Computing the probability distribution of an observation (continued)
[Slide works through the decomposition: each P(X^t | X^{t-1}) term is broken into per-variable CPDs using the graph structure, and the observed values (NC, L, H) are plugged in, with P(NC) = 0.5 and P(H) = P(L) = 0.25 at T = 0.]
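
The CPD tables on the preceding slides did not survive transcription, so the sketch below substitutes hypothetical values of the same shape (three variables over {H, L, NC}, with A depending on B and C at the previous time point, and B and C depending on each other) and carries out the chain-rule computation for a hypothetical observed time course.

```python
# Hypothetical CPDs and time course; the slide's actual numbers were lost.
S = ("H", "L", "NC")
p0 = {"H": 0.25, "L": 0.25, "NC": 0.5}  # initial distribution at T=0, as on the slide

cpd_B = {"H": {"H": 0.6, "L": 0.2, "NC": 0.2},   # P(X_B^{t+1} | X_C^t)
         "L": {"H": 0.1, "L": 0.7, "NC": 0.2},
         "NC": {"H": 0.3, "L": 0.3, "NC": 0.4}}
cpd_C = {"H": {"H": 0.5, "L": 0.3, "NC": 0.2},   # P(X_C^{t+1} | X_B^t)
         "L": {"H": 0.2, "L": 0.5, "NC": 0.3},
         "NC": {"H": 0.25, "L": 0.25, "NC": 0.5}}
# P(X_A^{t+1} | X_B^t, X_C^t): one row per (b, c) pair; filler values here.
cpd_A = {(b, c): {"H": 0.4, "L": 0.3, "NC": 0.3} for b in S for c in S}

course = [("NC", "NC", "NC"), ("L", "L", "NC"), ("H", "H", "L")]  # (A, B, C) per time point

def prob(course):
    a, b, c = course[0]
    p = p0[a] * p0[b] * p0[c]  # variables independent at T=0
    for (_, bp, cp), (a, b, c) in zip(course, course[1:]):
        p *= cpd_A[(bp, cp)][a] * cpd_B[cp][b] * cpd_C[bp][c]
    return p

print(prob(course))
```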

Parameter estimation in DBNs
- The parameter estimation approach differs depending on the form of the CPD.
- If the variables are discrete, we need to estimate the entries of each CPD table.

Parameter estimation example for the three-node DBN
- We need to estimate the table for P(X_B^{t+1} | X_C^t) over {H, L, NC}.
- Suppose we had a training time course over T = 0, ..., 4 (the slide lists the observed values, e.g. NC, L, H, at each time point).
- To compute these probabilities, we look at the joint assignments of {X_B^{t+1}, X_C^t} for all 0 ≤ t ≤ 3.
- What is P(X_B^{t+1} = H | X_C^t = L)? What is P(X_B^{t+1} = NC | X_C^t = L)? (See the sketch below.)
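
A minimal sketch of the counting involved, with a hypothetical training time course standing in for the slide's lost table: the maximum-likelihood estimate is just a ratio of transition-pair counts.

```python
from collections import Counter

# Hypothetical training values for B and C at T = 0..4 (the slide's table was lost).
B = ["NC", "L", "L", "H", "NC"]
C = ["L", "L", "NC", "L", "H"]

pair_counts = Counter(zip(C[:-1], B[1:]))  # joint assignments (X_C^t, X_B^{t+1}), 0 <= t <= 3
cond_counts = Counter(C[:-1])

def p_B_given_C(b_next, c_prev):
    """Maximum-likelihood estimate of P(X_B^{t+1} = b_next | X_C^t = c_prev)."""
    return pair_counts[(c_prev, b_next)] / cond_counts[c_prev]

print(p_B_given_C("H", "L"))   # the slide's first question
print(p_B_given_C("NC", "L"))  # the slide's second question
```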

Structure learning in DBNs
- We need to learn the dependency structure between two consecutive time points.
- We may also want to learn within-time-point connectivity.
- Structure-search algorithms used for BNs can be applied with a simple extension: the parents of a node can come from the previous or the current time step.

DBN learning with score-based search
- The score of a DBN combines the data likelihood with a graph prior: score(G) = log P(D | G) + log P(G).
- Data D: a collection of time courses.
- Graph prior P(G): this can be uniform, or can encode some form of model complexity. (See the sketch below.)
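
The lecture leaves the exact score unspecified; below is a hedged sketch using a BIC-style local score, a common choice rather than necessarily the one used here. Because all edges in a first-order DBN point from time t-1 to time t, the score decomposes per child variable over transition pairs, and acyclicity holds automatically.

```python
import math
from collections import Counter

def bic_score(child, parent_set, timecourses, n_states):
    """BIC-style local score for `parent_set` (variables at time t-1) of `child`
    (at time t). Each time course is a list of dicts: variable name -> state."""
    joint, cond = Counter(), Counter()
    for tc in timecourses:
        for prev, cur in zip(tc, tc[1:]):  # all consecutive time-point pairs
            pa = tuple(prev[u] for u in parent_set)
            joint[(pa, cur[child])] += 1
            cond[pa] += 1
    n = sum(cond.values())
    loglik = sum(c * math.log(c / cond[pa]) for (pa, _x), c in joint.items())
    n_params = (n_states - 1) * n_states ** len(parent_set)  # complexity penalty
    return loglik - 0.5 * n_params * math.log(max(n, 1))

# Toy usage: compare two candidate parent sets for variable "A".
tc = [{"A": "H", "B": "L"}, {"A": "L", "B": "L"}, {"A": "L", "B": "H"}]
print(bic_score("A", ["B"], [tc], n_states=3))
print(bic_score("A", [], [tc], n_states=3))
```

Search then amounts to keeping, for each child, the parent set with the best local score.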

Goals for today
- Dynamic Bayesian networks (DBNs)
- How does a DBN differ from a static BN?
- Learning a dynamic Bayesian network
- DBNs with structure priors
- Application of DBNs to a phospho-proteomics time course
- Evaluation and insights

Bayesian Inference of Signaling Network Topology in a Cancer Cell Line (Hill et al. 2012)
- Protein signaling networks play an important role in many diseases, and the networks can differ between normal and diseased cell types.
- Yet our knowledge of these networks' structure remains incomplete.
- The temporal activity of proteins of interest can be measured over time and used to infer network structure.
- Goal: build on prior knowledge of signaling networks to learn a better, predictive network.
- Static BNs are limiting here because they do not model time.

Applying DBNs to infer signaling network topology
Fig. 1: Data-driven characterization of signaling networks. Reverse-phase protein arrays interrogate signaling dynamics in samples of interest. Network structure is inferred using DBNs, with the primary phospho-proteomic data integrated with existing biology via informative priors that are objectively weighted by an empirical Bayes approach. Edge probabilities then allow the generation and prioritization of hypotheses for experimental validation. (Hill et al., Bioinformatics 2012)

Application of DBNs to signaling networks
Dataset description:
- Phospho-protein levels of 20 proteins
- Eight time points
- Four growth conditions
Approach:
- Use a known signaling network as a graph prior
- Estimate CPDs as conditional regularized Gaussians
- Assume a first-order Markov model: X^t depends on X^{t-1} (see the sketch below)
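
As a sketch of what a regularized conditional Gaussian CPD looks like in this setting (plain ridge regression on synthetic arrays; the paper's exact regularization scheme may differ), each protein's level at time t is modeled as a Gaussian whose mean is a linear function of its parents' levels at time t-1.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: levels of 3 hypothetical parent proteins at times t = 0..6 ...
X_prev = rng.normal(size=(7, 3))
# ... and the child protein at times t = 1..7, driven mostly by parent 0.
y_next = 0.9 * X_prev[:, 0] + rng.normal(scale=0.1, size=7)

# Ridge estimate of the conditional Gaussian X_child^t ~ N(w . X_pa^{t-1}, sigma^2)
alpha = 1.0  # regularization strength (hypothetical)
w = np.linalg.solve(X_prev.T @ X_prev + alpha * np.eye(3), X_prev.T @ y_next)
resid = y_next - X_prev @ w
sigma2 = resid @ resid / len(y_next)
print(w, sigma2)
```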

Integrating a prior signaling network into the DBN
- A Bayesian approach to graph learning combines the data likelihood with a graph prior: P(G | D) ∝ P(D | G) P(G).
- Following Mukherjee & Speed (2008), the graph prior is encoded as P(G) ∝ exp(λ f(G)), where λ is the prior strength and f(G) is a graph feature function.
- Here f(G) = -|E(G) \ E*|: the negated number of edges of the graph G, E(G), that are not in the prior edge set E*.
- This prior does not promote new edges; it penalizes edges that are absent from the prior network.
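
A minimal sketch of the prior term (the edge sets and λ below are hypothetical): up to a constant, the log prior is -λ times the number of inferred edges missing from the prior network.

```python
# Edges as (parent, child) pairs, with parents at time t-1.
prior_edges = {("MEK", "MAPK"), ("MAPK", "STAT3"), ("AKT", "p70S6K")}  # E*
graph_edges = {("MEK", "MAPK"), ("AKT", "p70S6K"), ("AKT", "MEK")}     # E(G)

lam = 2.0                              # prior strength (hyperparameter lambda)
f_G = -len(graph_edges - prior_edges)  # f(G) = -|E(G) \ E*|
log_prior = lam * f_G                  # log P(G) = lambda * f(G) + constant
print(f_G, log_prior)                  # one non-prior edge (AKT -> MEK) is penalized
```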

Calculating posterior probabilities of edges
- For each edge e, we need to calculate P(e ∈ G | D) = Σ_{G: e ∈ G} P(G | D).
- Although this is intractable in general, this work makes some simplifying assumptions:
  - Edges are only allowed forward in time, so the learning problem decomposes into smaller per-variable problems that can be solved by variable selection.
  - P(G) factorizes over individual edges.
  - A node can have at most dmax parents, so the posterior sum runs over all candidate parent sets of size at most dmax. (See the sketch below.)
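
A schematic sketch of the resulting computation (the local score function is left abstract, and the paper's exact-Bayesian details differ): enumerate all candidate parent sets of size at most dmax, weight each by exp(score), and sum the normalized weights of the sets containing each candidate parent.

```python
import math
from itertools import combinations

def edge_posteriors(candidates, score, dmax=3):
    """P(u -> child | D) = sum of P(S | D) over parent sets S containing u,
    with P(S | D) proportional to exp(score(S)); `score` returns the local
    log marginal likelihood plus log prior of parent set S."""
    parent_sets = [frozenset(s) for k in range(dmax + 1)
                   for s in combinations(candidates, k)]
    logs = {S: score(S) for S in parent_sets}
    m = max(logs.values())
    w = {S: math.exp(l - m) for S, l in logs.items()}  # stable normalization
    Z = sum(w.values())
    return {u: sum(wt for S, wt in w.items() if u in S) / Z for u in candidates}

# Toy usage: a made-up score that strongly favors the parent set {"B"}.
print(edge_posteriors(["A", "B", "C"],
                      score=lambda S: 3.0 if S == frozenset({"B"}) else 0.0))
```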

Results on simulated data
- 20 variables, 4 time courses, 8 time points each.
- The prior network had 54 extra edges and was missing 10 of the ground-truth edges.

Results are not sensitive to the prior settings
[Figure panels: sensitivity to the choice of the hyperparameter (prior strength), and sensitivity to a noisy prior graph.]

Inferred signaling network using a DBN
[Figure: the inferred signaling network shown alongside the prior network. The prior also contained self-loops, which are not shown.]

Using the DBN to make predictions
- Although many inferred edges were expected, several were unexpected.
- Novel edges were selected based on posterior probability and tested with inhibitors: if an edge runs from X to Y and X is a regulator of Y, then inhibiting X should affect the level of Y.
- Example edges tested:
  - MAPKp → STAT3p(S727), inferred with high probability (0.98): apply MEKi (a MEK inhibitor, which reduces MAPK phosphorylation) and measure MAPKp and STAT3p after inhibition.
  - AKTp → p70S6Kp, AKTp → MEKp, and AKTp → cJUNp.

Experimental validation of links
- Add the MEK inhibitor and measure MAPK and STAT3: MAPK is significantly inhibited (P-value 5×10^-4), and STAT3 is also inhibited (P-value 3.3×10^-4).
- Fig. 4: Validation of predictions by targeted inhibition in the breast cancer cell line MDA-MB-468. (a) MAPK-STAT3 crosstalk. Network inference (Fig. 3a) predicted an unexpected link between phospho-MAPK (MAPKp) and STAT3p(S727) in MDA-MB-468. The hypothesis of MAPK-STAT3 crosstalk was tested by MEK inhibition: this successfully reduced MAPK phosphorylation and resulted in a corresponding decrease in STAT3p(S727). (b) AKTp → p70S6Kp, AKT-MAPK crosstalk, and AKT-JNK/JUN crosstalk. AKTp is linked to p70S6Kp, MEKp and cJUNp. In line with these model predictions, an AKT inhibitor reduced both p70S6K and MEK phosphorylation and increased JNK phosphorylation. (RPPA data; MEK inhibitor GSK1120212 and AKT inhibitor GSK690693B at 0 µM, 0.625 µM, 2.5 µM and 10 µM; measurements taken 0, 5, 15, 30, 60, 90, 120 and 180 min after EGF stimulation; average values over 3 replicates shown; error bars indicate SEM.)
- Success is measured by the change in the levels of the targets as a function of the inhibitor concentration.

Take-away points
- Network dynamics can be defined in multiple ways; we have seen two ways to capture them.
- Skeleton-network-based approaches:
  - The universe of networks is fixed; nodes switch on or off.
  - No assumption or model of how the network changes over time.
- Dynamic Bayesian networks:
  - A type of probabilistic graphical model that describes how the system transitions from one state to another.
  - Assume the dependency between t-1 and t is the same for all time points (stationarity).
- Application to cancer signaling data:
  - DBNs are powerful for capturing dynamics, but the prior was important for learning an accurate network.

References
N. Friedman, K. Murphy, and S. Russell, "Learning the structure of dynamic probabilistic networks," in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI '98). San Francisco, CA, USA: Morgan Kaufmann, 1998, pp. 139-147. Available: http://portal.acm.org/citation.cfm?id=2074111
S. M. Hill, Y. Lu, J. Molina, L. M. Heiser, P. T. Spellman, T. P. Speed, J. W. Gray, G. B. Mills, and S. Mukherjee, "Bayesian inference of signaling network topology in a cancer cell line," Bioinformatics, vol. 28, no. 21, pp. 2804-2810, Nov. 2012. Available: http://dx.doi.org/10.1093/bioinformatics/bts514