Probabilistic models for interpreting perturbations in networks: Nested effect models Sushmita Roy sroy@biostat.wisc.edu Computational Network Biology.

Slides:



Advertisements
Similar presentations
Yinyin Yuan and Chang-Tsun Li Computer Science Department
Advertisements

CS188: Computational Models of Human Behavior
Bayesian network for gene regulatory network construction
Variational Methods for Graphical Models Micheal I. Jordan Zoubin Ghahramani Tommi S. Jaakkola Lawrence K. Saul Presented by: Afsaneh Shirazi.
Dynamic Bayesian Networks (DBNs)
CSE Fall. Summary Goal: infer models of transcriptional regulation with annotated molecular interaction graphs The attributes in the model.
An Introduction to Variational Methods for Graphical Models.
EE462 MLCV Lecture Introduction of Graphical Models Markov Random Fields Segmentation Tae-Kyun Kim 1.
Belief Propagation by Jakob Metzler. Outline Motivation Pearl’s BP Algorithm Turbo Codes Generalized Belief Propagation Free Energies.
The Factor Graph Approach to Model-Based Signal Processing Hans-Andrea Loeliger.
Chapter 8-3 Markov Random Fields 1. Topics 1. Introduction 1. Undirected Graphical Models 2. Terminology 2. Conditional Independence 3. Factorization.
A Graphical Model For Simultaneous Partitioning And Labeling Philip Cowans & Martin Szummer AISTATS, Jan 2005 Cambridge.
Regulatory Network (Part II) 11/05/07. Methods Linear –PCA (Raychaudhuri et al. 2000) –NIR (Gardner et al. 2003) Nonlinear –Bayesian network (Friedman.
27803::Systems Biology1CBS, Department of Systems Biology Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break.
Functional genomics and inferring regulatory pathways with gene expression data.
Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break 14:45 – 15:15Regulatory pathways lecture 15:15 – 15:45Exercise.
6. Gene Regulatory Networks
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Epistasis Analysis Using Microarrays Chris Workman.
Cristina Manfredotti D.I.S.Co. Università di Milano - Bicocca An Introduction to the Use of Bayesian Network to Analyze Gene Expression Data Cristina Manfredotti.
Genetic network inference: from co-expression clustering to reverse engineering Patrik D’haeseleer,Shoudan Liang and Roland Somogyi.
Learning Structure in Bayes Nets (Typically also learn CPTs here) Given the set of random variables (features), the space of all possible networks.
Probabilistic Models that uncover the hidden Information Flow in Signalling Networks.
Probabilistic Models that uncover the hidden Information Flow in Signalling Networks Achim Tresch.
Microarrays to Functional Genomics: Generation of Transcriptional Networks from Microarray experiments Joshua Stender December 3, 2002 Department of Biochemistry.
Module networks Sushmita Roy BMI/CS 576 Nov 18 th & 20th, 2014.
Markov Cluster (MCL) algorithm Stijn van Dongen.
Dependency networks Sushmita Roy BMI/CS 576 Nov 25 th, 2014.
Advanced Gene Selection Algorithms Designed for Microarray Datasets Limitation of current feature selection methods: –Ignores gene/gene interaction: single.
Today Graphical Models Representing conditional dependence graphically
1 Relational Factor Graphs Lin Liao Joint work with Dieter Fox.
Computational methods for inferring cellular networks II Stat 877 Apr 17 th, 2014 Sushmita Roy.
Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.
Dependency Networks for Inference, Collaborative filtering, and Data Visualization Heckerman et al. Microsoft Research J. of Machine Learning Research.
Identifying submodules of cellular regulatory networks Guido Sanguinetti Joint work with N.D. Lawrence and M. Rattray.
Integrative Genomics I BME 230. Probabilistic Networks Incorporate uncertainty explicitly Capture sparseness of wiring Incorporate multiple kinds of data.
Inferring Regulatory Networks from Gene Expression Data BMI/CS 776 Mark Craven April 2002.
Graph clustering to detect network modules
Journal club Jun , Zhen.
Hierarchical Agglomerative Clustering on graphs
Spectral methods for Global Network Alignment
1. SELECTION OF THE KEY GENE SET 2. BIOLOGICAL NETWORK SELECTION
Gonçalo Abecasis and Janis Wigginton University of Michigan, Ann Arbor
Incorporating graph priors in Bayesian networks
Multi-task learning approaches to modeling context-specific networks
Qian Liu CSE spring University of Pennsylvania
Learning gene regulatory networks in Arabidopsis thaliana
Multiple sequence alignment (msa)
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
How to understand the cell by breaking it
Dynamics and context-specificity in biological networks
Analysis of bio-molecular networks through RANKS (RAnking of Nodes
Network-based interpretation of perturbations to living systems
Dynamic Bayesian Networks
Recovering Temporally Rewiring Networks: A Model-based Approach
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Hidden Markov Models Part 2: Algorithms
Bayesian Models in Machine Learning
Probabilistic Models with Latent Variables
An Introduction to Variational Methods for Graphical Models
CSCI 5822 Probabilistic Models of Human and Machine Learning
Schedule for the Afternoon
Dynamics and context-specificity in biological networks
Evaluation of inferred networks
Spectral methods for Global Network Alignment
Input Output HMMs for modeling network dynamics
Markov Random Fields Presented by: Vladan Radosavljevic.
Network Inference Chris Holmes Oxford Centre for Gene Function, &,
Principle of Epistasis Analysis
Bioinformatics, Vol.17 Suppl.1 (ISMB 2001) Weekly Lab. Seminar
Presentation transcript:

Probabilistic models for interpreting perturbations in networks: Nested effect models Sushmita Roy sroy@biostat.wisc.edu Computational Network Biology Biostatistics & Medical Informatics 826 Computer Sciences 838 https://compnetbiocourse.discovery.wisc.edu Dec 6th 2016

Types of algorithms used to examine perturbations in networks Graph diffusion followed by subnetwork finding methods HOTNET NETBAG Information flow-based methods (also widely used for integrating different types of data) Prize collecting steiner tree Min cost max flow Probabilistic graphical model-based methods Factor graphs Nested Effect Models (NEMs)

Probabilistic graphical models for interpreting network perturbations C.-H. H. Yeang, T. Ideker, and T. Jaakkola, "Physical network models." Journal of computational biology : a journal of computational molecular cell biology, vol. 11, no. 2-3, pp. 243-262, Mar. 2004. F. Markowetz, D. Kostka, O. G. Troyanskaya, and R. Spang, "Nested effects models for high-dimensional phenotyping screens," Bioinformatics, vol. 23, no. 13, pp. i305-312, Jul. 2007. C. J. Vaske, C. House, T. Luu, B. Frank, C.-H. H. Yeang, N. H. Lee, and J. M. Stuart, "A factor graph nested effects model to identify networks from genetic perturbations." PLoS computational biology, vol. 5, no. 1, pp. e1 000 274+, Jan. 2009.

Motivation of nested effect models Perturbation of genes followed by high-throughput profiling of different phenotypes can be used to characterize functions of genes However, most genes do not function independently but interact in a network to drive a particular function Phenotypic measurements (e.g. mRNA levels) are indirect measurements of the underlying network structure Includes direct and indirect effects Given perturbation data from multiple genes, can we more systematically identify the functions of these genes and how they interact at a pathway level?

Problem overview Given global measurements of gene expression after single gene deletions of multiple genes Do Infer interactions between genes with deletions to enable further characterization of these genes Nested Effect Models are probabilistic model-based approaches to solve this problem

Nested Effect Models Markowetz et al, 2007

Nested Effect Models Key properties A generalization of similarity based clustering Orders the clusters according to subset relationships A gene A is upstream of another gene B if B’s effects are a subset of A’s effects Build a hierarchy of all perturbed genes by constructing from smaller sub-models of pairs and triplets of genes

Subset relationships to order genes A complete model. The left part of the figure shows a complete model M’xyz consisting of a transitively closed graph between genes and assignments of genes to specific effects (the dashed arrows). Given the complete model, we can formulate a prediction of what effects to expect: perturbing x should cause all effects, while perturbing y should only cause E3–E6, and perturbing z only E5 and E6 (middle plot). In reality, our observations will be noisy: there can be false positive (FP) and false negative (FN) effect observations (right plot).

Probabilistic graphical models for interpreting network perturbations C.-H. H. Yeang, T. Ideker, and T. Jaakkola, "Physical network models." Journal of computational biology : a journal of computational molecular cell biology, vol. 11, no. 2-3, pp. 243-262, Mar. 2004. F. Markowetz, D. Kostka, O. G. Troyanskaya, and R. Spang, "Nested effects models for high-dimensional phenotyping screens," Bioinformatics, vol. 23, no. 13, pp. i305-312, Jul. 2007. C. J. Vaske, C. House, T. Luu, B. Frank, C.-H. H. Yeang, N. H. Lee, and J. M. Stuart, "A factor graph nested effects model to identify networks from genetic perturbations." PLoS computational biology, vol. 5, no. 1, pp. e1 000 274+, Jan. 2009.

Key properties of Factor Graph-NEMs (FG-NEMs) NEMs assume the genes that are perturbed interact in a binary manner But many interactions have sign inhibitory or stimulating action FG-NEMs capture a broader set of interactions among the perturbed genes Formulation based on a Factor Graph Provide an efficient search over the space of NEMs

Notation S-genes: Set of genes that have been deleted individually E-genes: Set of effector genes that are measured Θ: The attachment of an effector gene to the S-gene network Φ: The interaction matrix of S-genes X: The phenotypic profile, each column gives the difference in expression in a knockout compared to wild type Rows: E-genes Columns: S-genes Y: Hidden effect matrix, each entry is {-1, 0, +1} which specifies whether an S-gene affects the E-gene

An example of 4 S-genes and 13 E-gens A->C is reflected in the scatter plot. When XA is up, X C is up. When XA is down, XC is down or no change S-genes E-genes Φ Θ X B-ID is also reflected in the scatter plot. XD is a subset of opposite changes from XB

S-gene interaction modes and their expression signatures Consistent with expectation Expected trend in E-genes for specific interaction modes Inconsistent with expectation

Factor graphs A type of graphical model A bi-partite graph with variable nodes and factor nodes Edges connect variables to potentials that the variables are arguments of Represents a global function as product of smaller local functions Perhaps the most general graphical model Bayesian networks and Markov networks have factor graph representations

Example factor graph Variable nodes Factor nodes From Kschischang, Frey, Loeliger 2001

Probabilistic model for NEMs Goal is to find a network, Φ and Θ that best fit the observed data (X) This is an inference problem Use a Maximum a posterior (MAP) approach X is a noisy measurement of Y. Y is the quantity we need to sum over

Probabilistic model continued Independence over all E-genes Re-arranging the terms How to compute Le?

Digging inside the Le term Note: Ye={YeA,YeB, YeC.. YeN}, where N is the total number of S-genes Define L’e, proportional to Le using a set of pairwise potentials \phi_{AB} is the relationship between the S-genes, \theta_{eAB} is capturing how e is attached to the S-gene graph based on A and B. The S- gene interaction Attachment of gene e with respect to A or B

Digging inside the Le term The S- gene interaction Attachment of gene e with respect to A or B Now the joint can be written in a more tractable way Each of these conditional distributions will correspond to a factor

Defining the factors Modeled as Gaussian distributions Four variable factor, over discrete variables YeA: binary variables Four values for each possible type of interaction: inhibitory, activating, equivalent, no interaction Interaction of e with A or B: inhibited or activated by A or B or no action This factor has value=1 if the E-gene e is attached to either A or B and e’s state is consistent with the interaction mode between A and B.

The prior over S-gene graph The prior P(Φ ) can incorporate prior knowledge of interactions among genes in pathways At its simplest, it should encode a transitivity relationship to force all pairwise interactions to be consistent among all triples Transitivity constraint for triples Physical network constraints Example transitivity: If A B, B C, Then, A->C

Factor graph representation of NEMs Prior

Inference on the factor graph Find most likely configurations for Use a message passing algorithm (standard for factor graphs) Called the Max-Product algorithm Message passing happens in two steps Messages are passed from observations XeA to the Messages are passed between the interaction and transitivity factors until convergence Messages are local functions of the variable associated with each edge.

Does FG-NEM capture activating and inhibitory relationships? FG-NEM: capture inhibitory and activating relationships uFG-NEM: capture only unsigned interactions FG-NEM AVT: FG-NEM run on absolute value data Solid lines: structure recovery Dashed lines: sign recovery

Does FG-NEM expand pathways better than the baseline approach Right shift in percentile rank difference of NEM methods

Pathway expansion Attach new E-genes to S-gene network An attached gene e to S-gene s asserts that e is directly downstream of s All E-genes attached to the S-gene network are called frontier genes An E-gene’s connectivity is examined based on the Log-likelihood Attachment Ratio One of the S genes

FG-NEM based pathway expansion in yeast Template matching: rank E genes based on similarity in expression to an “idealized template”

FG-NEM infers a more accurate network than the unsigned version in yeast FG-NEM and uFG-NEM networks inferred in the ion-homeostasis pathway FG-NEM inferred more genes associated with ion homeostatis compared to uFG-NEM

FG-NEM application to colon cancer Most interactions in S-gene network are activating Novel expanded genes that have significant effect on the invasive phenotype

Summary FG-NEMs: A general approach to infer an ordering of genes from knock-down phenotypes Strengths FG-NEMs could be used in an iterative computational-experimental framework Handles signed interactions between S-genes Weaknesses Computational complexity of the inference procedure might be high Required independence among E-genes Model pairs of S-genes at a time

Concluding remarks We have seen a suite of problems, algorithms and applications in a real setting These ranged from network inference, dynamic network inference, network modules, network alignment and network-based interpretation What we did not see (or saw less of) Integration of different types of networks Experimental design for better learning of networks More topological properties of networks Subnetwork network analysis If you remain interested in these topics or would like to learn more, send me an email