Network Motifs: Simple Building Blocks of Complex Networks


Network Motifs: Simple Building Blocks of Complex Networks Lecturer: Jian Li

Introduction Recently, it was found that biochemical and neuronal networks share a similar property: they contain recurring circuit elements that occur far more often than in randomized networks. We call such simple building blocks network motifs.

Introduction In the case of biological regulation networks, it has been suggested that network motifs play key information processing roles.

Introduction Some examples: Three major network motifs were found in the transcription networks of bacteria and yeast. One of these, the feed-forward loop, has been shown theoretically to perform information-processing tasks such as sign-sensitive filtering, response acceleration, and pulse generation.

Introduction Some examples:

Introduction Schematic illustration: Red dashed lines indicate edges that participate in the feed-forward loop motif, which occurs five times in the real network.

Introduction Applications in other networks: ecology (food webs), neurobiology (neuron connectivity), engineering (electronic circuits, WWW), and others.

Introduction Some remarks: The motifs we find depend closely on the randomized network model, so choosing a reasonable model is very important. Building blocks that are functionally important but infrequent will be missed no matter which model we select. Finding them requires domain-specific knowledge and information beyond the scope of a graph-theoretic approach.

Related Problems Theoretical perspective: efficiently counting cycles; counting spanning trees; counting nonisomorphic graphs; testing isomorphism; approximating perfect matchings; approximating frequent subgraphs based on the regularity lemma; and others.

Related Problems Data mining perspective: mining frequent subgraphs; mining a given subgraph; mining subgraphs in sparse networks; graph-based substructure pattern mining (gSpan); and others.

Related Problems Random networks: generating randomized networks with a prescribed degree sequence; estimating subgraph counts in random networks.

Related Problems Random networks: Erdős–Rényi model: the number of edges per node follows a Poisson distribution. Scale-free model: the number of edges per node follows a power-law distribution.

Randomized Network Generating randomized networks: Here we give only a simple algorithm. We employ a Markov-chain algorithm that starts with the real network and repeatedly swaps randomly chosen pairs of connections (X1->Y1, X2->Y2 is replaced by X1->Y2, X2->Y1) until the network is well randomized. A switch is prohibited if either of the connections X1->Y2 or X2->Y1 already exists.
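The switching procedure above can be sketched in Python as follows. This is a minimal illustration, not the lecturer's implementation: the function name `randomize_network`, the fixed `n_swaps` budget (standing in for "until well randomized"), and the extra rejection of self-loops are assumptions.

```python
import random

def randomize_network(edges, n_swaps, seed=None):
    """Degree-preserving randomization of a directed graph by edge switches.

    `edges` is a list of distinct (source, target) pairs.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        # Pick two distinct connections X1->Y1 and X2->Y2 at random.
        i, j = rng.sample(range(len(edge_list)), 2)
        (x1, y1), (x2, y2) = edge_list[i], edge_list[j]
        new1, new2 = (x1, y2), (x2, y1)
        # Rule from the slide: prohibit the switch if X1->Y2 or X2->Y1 exists.
        if new1 in edge_set or new2 in edge_set:
            continue
        # Assumption beyond the slide: also prohibit self-loops.
        if x1 == y2 or x2 == y1:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list
```

Every accepted switch leaves each node's in-degree and out-degree unchanged, which is why the randomized ensemble shares the degree sequence of the real network.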

Randomized Network Controlling for Appearances of (n – 1)-Node Motifs We generate a series of randomized network ensembles, each of which has the same (n – 1)-node subgraph count as the real network, as a null hypothesis for detecting n-node motifs. This is done to avoid assigning high significance to a structure only because of the fact that it includes a highly significant substructure.

Randomized Network Controlling for Appearances of (n – 1)-Node Motifs Metropolis Monte Carlo approach: Let $V_{\mathrm{real},k}$ be the number of appearances of the $k$th $(n-1)$-node subgraph in the real network and $V_{\mathrm{rand},k}$ the corresponding number in the randomized network. We define an energy $E = \sum_k |V_{\mathrm{real},k} - V_{\mathrm{rand},k}| / (V_{\mathrm{real},k} + V_{\mathrm{rand},k})$. The energy $E$ is zero only when all the $(n-1)$-node subgraph counts (the three-node counts when $n = 4$) of the real and randomized graphs are equal.
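As a small sketch of the energy defined above, assuming the subgraph counts are given as two equal-length sequences (the function name `energy` and the handling of a 0/0 term are assumptions, not part of the slides):

```python
def energy(v_real, v_rand):
    """Sum over k of |V_real,k - V_rand,k| / (V_real,k + V_rand,k)."""
    total = 0.0
    for real, rand in zip(v_real, v_rand):
        if real + rand == 0:
            # Assumption: a subgraph absent from both networks contributes 0.
            continue
        total += abs(real - rand) / (real + rand)
    return total
```

For example, `energy([3, 5], [3, 5])` is zero because every count matches, while a single mismatched pair of counts 3 and 1 contributes 2/4 = 0.5.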

Randomized Network Controlling for Appearances of (n – 1)-Node Motifs We start by fully randomizing the network according to the first algorithm. Then we generate a random switch (X1->Y1, X2->Y2 to X1->Y2, X2->Y1, and similarly for double edges, as described above). If this switch lowers E, it is accepted. Otherwise, it is accepted with probability $\exp(-\Delta E / T)$, where $\Delta E$ is the difference in energy before and after the switch and $T$ is an effective temperature.
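The acceptance rule can be sketched as below. The function name `accept_switch` and the `rng` parameter are assumptions made for illustration; any object with a `random()` method returning a float in [0, 1) will do.

```python
import math
import random

def accept_switch(e_before, e_after, temperature, rng=random):
    """Metropolis rule: accept if E does not rise, else with prob exp(-dE/T)."""
    delta = e_after - e_before
    if delta <= 0:
        # A switch that lowers E is always accepted; delta == 0 gives
        # exp(0) = 1, so it is accepted as well.
        return True
    return rng.random() < math.exp(-delta / temperature)
```

At high temperature almost every switch is accepted, and as T is lowered the chain increasingly refuses switches that raise E.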

Graph Theoretical Results Controlling for Appearances of (n – 1)-Node Motifs This process is repeated, with a simulated annealing regimen that lowers T slowly, until a solution with E = 0 is obtained. This can be readily generalized to form (n – 1)-node null-hypothesis networks.

Algorithm: Counting Goal: find all n-node network motifs. Method: do the following for both the real network and the randomized networks. Enumerate all possible n-node subgraphs and classify them into non-isomorphic classes. Count the number of subgraphs in each class. [See all types of non-isomorphic 3- and 4-node graphs.]

Algorithm: Counting Efficiently count all connected n-node subgraphs in a connectivity matrix M:
main {
    for all rows i:
        for each nonzero element (i, j):
            search(i, j);
}
search(i, j) {
    for each k such that M_ik = 1 and k != j {
        if an n-node subgraph is obtained then record it and return;
        else search(i, k);
    }
    do similar things for each M_ki = 1, M_kj = 1, M_jk = 1;
}
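For small graphs, the count-and-classify step can be illustrated with a brute-force Python sketch. Note that this is not the recursive search outlined above: it simply tests every set of n nodes, which is only practical for tiny networks. The helper names (`canonical_form`, `is_weakly_connected`, `count_subgraphs`) are assumptions chosen for this example.

```python
from collections import Counter
from itertools import combinations, permutations

def canonical_form(nodes, edge_set):
    """Smallest adjacency bit-tuple over all orderings of `nodes`.

    Two induced subgraphs are isomorphic exactly when these tuples match.
    """
    best = None
    for order in permutations(nodes):
        bits = tuple(
            int((a, b) in edge_set) for a in order for b in order if a != b
        )
        if best is None or bits < best:
            best = bits
    return best

def is_weakly_connected(nodes, edge_set):
    """True if `nodes` are connected when edge directions are ignored."""
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in nodes:
            if v not in seen and ((u, v) in edge_set or (v, u) in edge_set):
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)

def count_subgraphs(edges, n=3):
    """Count connected induced n-node subgraphs per isomorphism class."""
    edge_set = set(edges)
    vertices = sorted({v for e in edge_set for v in e})
    counts = Counter()
    for nodes in combinations(vertices, n):
        if is_weakly_connected(nodes, edge_set):
            counts[canonical_form(nodes, edge_set)] += 1
    return counts
```

Running `count_subgraphs` on the real network and on each randomized network yields the per-class tables that the significance test compares.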

Algorithm: Counting A table is formed that counts the number of appearances of each type of subgraph in the network. This process is repeated for each of the randomized networks. The number of appearances of each type of subgraph in the random ensemble is recorded to assess its statistical significance.

Algorithm: Counting Criteria for Network Motif Selection (i) The probability that it appears in a randomized network an equal or greater number of times than in the real network is smaller than P = 0.01. (ii) The number of times it appears in the real network with distinct sets of nodes is at least 4. (iii) The number of appearances in the real network is significantly larger than in the randomized networks: Nreal – Nrand > 0.1Nrand. This is done to avoid detecting as motifs some common subgraphs that have only a slight difference between Nrand and Nreal but have a narrow distribution in the randomized networks.
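The three criteria can be combined into one check, sketched below. This assumes the probability in (i) is estimated empirically as the fraction of randomized networks with at least as many appearances as the real one, and that Nrand in (iii) is the ensemble mean; the function and parameter names are hypothetical.

```python
def is_motif(n_real, n_rand_samples, n_distinct,
             p_max=0.01, min_distinct=4, min_excess=0.1):
    """Apply criteria (i)-(iii) to one subgraph type.

    n_real         -- appearances in the real network
    n_rand_samples -- appearances in each randomized network
    n_distinct     -- appearances in the real network on distinct node sets
    """
    mean_rand = sum(n_rand_samples) / len(n_rand_samples)
    # (i) empirical probability of seeing >= n_real in a randomized network
    p_value = sum(1 for x in n_rand_samples if x >= n_real) / len(n_rand_samples)
    return (
        p_value < p_max
        and n_distinct >= min_distinct                    # (ii)
        and n_real - mean_rand > min_excess * mean_rand   # (iii)
    )
```

With an ensemble of 1000 randomized networks, the smallest nonzero empirical probability is 0.001, so the ensemble must be large enough to resolve the P = 0.01 threshold.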

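Algorithm: Counting Result: Concentration $C_i = N_i / \sum_j N_j$. Z-score: $Z = (C_{\mathrm{real}} - \langle C_{\mathrm{rand}} \rangle) / \sigma_{\mathrm{rand}}$, where $\langle C_{\mathrm{rand}} \rangle$ and $\sigma_{\mathrm{rand}}$ are the mean and standard deviation in the randomized networks (note Chebyshev's inequality: $P[|X - E(X)| \ge Z\sigma] \le 1/Z^2$, with $\sigma$ the standard deviation of $X$). A high Z-score indicates that the event is quite unlikely.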
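A short sketch of the Z-score and the accompanying Chebyshev bound follows. It assumes the spread is measured by the population standard deviation of the randomized ensemble (`statistics.pstdev`); the slides do not say which estimator was used, and the function names are illustrative.

```python
import statistics

def z_score(c_real, c_rand_samples):
    """(C_real - mean of C_rand) / standard deviation of C_rand."""
    mean = statistics.mean(c_rand_samples)
    sd = statistics.pstdev(c_rand_samples)  # assumption: population SD
    # Raises ZeroDivisionError if all randomized values are identical.
    return (c_real - mean) / sd

def chebyshev_bound(z):
    """Distribution-free upper bound on P[|X - E(X)| >= z * sigma]."""
    return min(1.0, 1.0 / (z * z))
```

For instance, a subgraph with Z = 3 has a tail probability of at most 1/9 under Chebyshev's inequality, regardless of the shape of the random distribution.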

Algorithm: Sampling A clever trade-off between accuracy and efficiency. The counting algorithm enumerates subgraphs exactly, but to detect network motifs we only need to know which types of subgraph occur more frequently in the real network than in randomized networks.

Algorithm: Sampling Random sampling can give quite good estimates. Random sampling has many applications: approximating dense subsets; approximating #P-complete problems; machine learning; and others.

Algorithm: Sampling This algorithm does not enumerate subgraphs exhaustively but instead samples subgraphs in order to estimate their relative frequency. The runtime of the algorithm asymptotically does not depend on the network size. Surprisingly, few samples are needed to detect network motifs reliably. The sampling method is useful for analyzing very large networks or for detection of high-order motifs, which are beyond the reach of exhaustive enumeration algorithms.

Algorithm: Sampling Definition: Es is the set of picked edges; Vs is the set of all nodes touched by the edges in Es.
ALGORITHM Sampling: Initialize Vs = ∅ and Es = ∅.
1. Pick a random edge e1 = (vi, vj). Update Es = {e1}, Vs = {vi, vj}.
2. Make a list L of all neighboring edges of Es, omitting all edges between members of Vs. If L = ∅, return to 1.
3. Pick a random edge e = (vk, vl) from L. Update Es = Es ∪ {e}, Vs = Vs ∪ {vk, vl}.
4. Repeat steps 2-3 until an n-node subgraph S is completed.
5. Calculate the probability P of sampling S.
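Steps 1 to 4 of the sampling procedure can be sketched as follows. This is an illustrative reading, not the original implementation: "neighboring edges" is taken to mean edges with exactly one endpoint in Vs, the graph is assumed to have no self-loops, and the function name `sample_subgraph` is hypothetical. Step 5 is left out here.

```python
import random

def sample_subgraph(edges, n, rng=random):
    """Sample the node set of one connected n-node subgraph by edge growth.

    Returns (Vs, Es). Loops forever if the graph has no connected
    n-node subgraph, so callers should check that beforehand.
    """
    edges = list(edges)
    while True:
        e1 = rng.choice(edges)                      # step 1
        es, vs = [e1], set(e1)
        while len(vs) < n:
            # Step 2: edges touching Vs, omitting edges between nodes of Vs.
            candidates = [e for e in edges if (e[0] in vs) != (e[1] in vs)]
            if not candidates:
                break                               # L is empty: restart
            e = rng.choice(candidates)              # step 3
            es.append(e)
            vs.update(e)
        if len(vs) == n:                            # step 4 completed
            return vs, es
```

The sampled subgraph S is the subgraph induced by Vs, that is, all edges of the network between the returned nodes, not only the n-1 edges in Es.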

Algorithm: Sampling The probability of sampling the subgraph is the sum of the probabilities of all possible ordered sets of n-1 edges that yield it: $P = \sum_{\sigma \in S_m} \prod_{E_j \in \sigma} \Pr[E_j = e_j \mid (E_1, \ldots, E_{j-1}) = (e_1, \ldots, e_{j-1})]$, where $S_m$ is the set of all (n-1)-permutations of the edges of the specific subgraph that could lead to a sample of the subgraph, and $E_j$ is the j-th edge in a specific (n-1)-permutation σ.
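The sum over ordered edge sets can be computed directly for small n, as in the sketch below. It assumes each step picks uniformly: the first edge from all edges of the network, and each later edge from the current list of edges with exactly one endpoint in Vs. The function name `sampling_probability` is hypothetical, and the edge list is assumed to contain no duplicates or self-loops.

```python
from itertools import permutations

def sampling_probability(edges, target_nodes):
    """Probability that edge-growth sampling ends on `target_nodes`."""
    edges = list(edges)
    target = set(target_nodes)
    # Only edges inside the target subgraph can appear in a valid ordering.
    inner = [e for e in edges if e[0] in target and e[1] in target]
    total = 0.0
    for sigma in permutations(inner, len(target) - 1):
        prob = 1.0 / len(edges)          # uniform choice of the first edge
        vs = set(sigma[0])
        for e in sigma[1:]:
            candidates = [c for c in edges if (c[0] in vs) != (c[1] in vs)]
            if e not in candidates:
                prob = 0.0               # this ordering cannot occur
                break
            prob *= 1.0 / len(candidates)
            vs.update(e)
        total += prob
    return total
```

On the directed path 0->1->2->3 with n = 3, the node sets {0, 1, 2} and {1, 2, 3} each come out at probability 1/2, and the two values sum to 1 as expected.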

Algorithm: Sampling

Algorithm: Sampling Add the score $W = 1/P$ to the accumulated score $S_i$ of the relevant subgraph type $i$: $S_i = S_i + W$. After $S_T$ samples, assuming we sampled $L$ different subgraph types, we calculate the estimated subgraph concentrations $C_i = S_i / \sum_{k=1}^{L} S_k$.
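The weighting step can be sketched as below, assuming each sample is supplied as a pair of a subgraph-type label and its sampling probability P (the function name and input format are assumptions):

```python
from collections import defaultdict

def estimate_concentrations(samples):
    """Turn (subgraph_type, P) samples into estimated concentrations C_i."""
    scores = defaultdict(float)
    for subgraph_type, p in samples:
        scores[subgraph_type] += 1.0 / p   # S_i = S_i + W, with W = 1/P
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}
```

Weighting by 1/P corrects for the fact that some subgraphs are more likely to be sampled than others: two samples of type A with P = 0.5 and one of type B with P = 0.25 give both types a score of 4, hence a concentration of 0.5 each.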

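Algorithm: Sampling Z-scores are calculated as before: $Z = (C_{\mathrm{real}} - \langle C_{\mathrm{rand}} \rangle) / \sigma_{\mathrm{rand}}$, where $C_{\mathrm{real}}$ is the concentration in the real network, and $\langle C_{\mathrm{rand}} \rangle$ and $\sigma_{\mathrm{rand}}$ are the mean and standard deviation in the randomized networks.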

Algorithm: Sampling Sampling method versus exhaustive enumeration. (Highlighted subgraphs were found to be network motifs.)

Algorithm: Sampling Algorithm convergence The subgraph concentrations calculated by the sampling algorithm converged to the fully enumerated concentrations. Different numbers of samples were required to achieve good estimates for different subgraphs and in different networks. All of the simulations we performed, on a variety of networks, showed that the results converge toward the real values within $S_T = 10^5$ samples or fewer.

Algorithm: Sampling Algorithm convergence Even with a small number of samples, one can reliably estimate concentrations as low as $C = 10^{-5}$. Convergence studies can be used to decide the required number of samples (an adaptive sampling method that uses the instantaneous convergence rate to decide how many samples are enough).

Algorithm: Sampling The sampling method allows accurate counting of rare, high-order subgraphs and motifs

Discussion and Future Work We focus on comparing the real network with randomized networks that have a prescribed degree sequence. The question is whether some building blocks that are frequent in real networks are caused by the degree sequence itself. If so, our approach will miss them. Other randomized network models (besides those with a prescribed degree sequence) could be introduced to handle such cases.

Discussion and Future Work Embedding the graph in Euclidean space and considering subgraphs with not only topological but also geometric properties.

THANKS~~~~~