Markov Cluster Algorithm

Outline Introduction Important Concepts in MCL Algorithm MCL Algorithm The Features of MCL Algorithm Summary

Graph clustering Decompose a network into subnetworks based on some topological properties Usually we look for dense subnetworks

Graph clustering algorithms: Exact: have proven solution quality and time complexity. Approximate: heuristics are used to make them efficient. Example algorithms: Highly connected subgraphs (HCS), Restricted neighborhood search clustering (RNSC), Molecular Complex Detection (MCODE), Markov Cluster Algorithm (MCL).

Graph Clustering Intuition: Highly connected nodes could be in one cluster; weakly connected nodes could be in different clusters. Model: A random walk may start at any node. Starting at node r, if a random walk will reach node t with high probability, then r and t should be clustered together.

Definitions and Representation An undirected graph and its adjacency matrix representation. An undirected graph and its adjacency list representation.

Theorem. Let M be the adjacency matrix of graph G. Then the (i, j) entry of $M^r$ is the number of paths of length r from vertex i to vertex j. Note: this is the standard matrix power of M, not a Boolean product.
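The theorem can be checked on a small graph. The following is a minimal sketch in plain Python; the helper names `mat_mul` and `mat_pow` and the three-node path graph are illustrative assumptions, not part of the slides. Note that the "paths" counted here may revisit vertices (they are walks).

```python
def mat_mul(a, b):
    """Ordinary (non-Boolean) product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, r):
    """Return m raised to the r-th power (r >= 1) by repeated multiplication."""
    result = m
    for _ in range(r - 1):
        result = mat_mul(result, m)
    return result


# Hypothetical example graph: the path 0 - 1 - 2.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Entry (i, j) counts the length-2 walks from i to j.
walks_2 = mat_pow(adjacency, 2)
print(walks_2)
```

In this example the middle vertex has two length-2 walks back to itself (one through each neighbour), which is the value found at entry (1, 1).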

K-path Graph power: The kth power of a graph G is a graph with the same set of vertices as G and an edge between two vertices iff there is a path of length at most k between them. The number of paths of length k between any two nodes can be calculated by raising the adjacency matrix of G to the exponent k. G's kth power is then defined as the graph whose adjacency matrix is given by the sum of the first k powers of the adjacency matrix: $A_{G^k} = \sum_{i=1}^{k} A_G^{\,i}$, with nonzero entries read as edges.

K-Path Clustering [Figure: a graph G and its powers G^2 and G^3]

All-Pairs Shortest Paths Given a weighted graph G(V,E,w), the all-pairs shortest paths problem is to find the shortest paths between all pairs of vertices vi, vj ∈ V. A number of algorithms are known for solving this problem.

All-Pairs Shortest Paths: Matrix-Multiplication Based Algorithm. Consider the multiplication of the weighted adjacency matrix with itself, except that we replace the multiplication operation in matrix multiplication by addition, and the addition operation by minimization. Notice that the product of the weighted adjacency matrix with itself returns a matrix that contains the shortest paths of length at most 2 (in edges) between any pair of nodes. It follows from this argument that $A^n$ contains all shortest paths.
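The modified product described above is often called the min-plus product. Below is a minimal sketch, assuming a weight matrix with zeros on the diagonal and `float("inf")` for missing edges; the function names and the four-node example graph are hypothetical. Repeated squaring is used so that only about log2(n) products are needed.

```python
INF = float("inf")


def min_plus(a, b):
    """Matrix 'product' with (+, min) in place of (*, +)."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def all_pairs_shortest_paths(weights):
    """Square the weight matrix until paths of up to n-1 edges are covered."""
    n = len(weights)
    dist = weights
    edges_covered = 1
    while edges_covered < n - 1:
        dist = min_plus(dist, dist)
        edges_covered *= 2
    return dist


# Hypothetical directed, weighted graph on 4 nodes.
weights = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]

shortest = all_pairs_shortest_paths(weights)
for row in shortest:
    print(row)
```

The zero diagonal matters: it lets a path "wait" at a vertex, so each squaring keeps the shorter paths already found while also considering longer ones.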

Matrix-Multiplication Based Algorithm

Markov Clustering (MCL): Markov process. The probability that a random walk will take an edge at node u depends only on u and the given edge. It does not depend on the walk's previous route. This assumption simplifies the computation.

MCL: Flow through the network is used to approximate the partition. There is an initial amount of flow injected into each node. At each step, a percentage of the flow goes from a node to its neighbors via the outgoing edges.

MCL Edge Weight: Similarity between two nodes, considered as the bandwidth or connectivity. If an edge has a higher weight than another, more flow passes over that edge. The amount of flow is proportional to the edge weight. If there are no edge weights, we can assign the same weight to all edges.

Intuition of MCL: Two natural clusters. When the flow reaches the border points, it is more likely to return than to cross the border. [Figure: two natural clusters with border nodes A and B]

MCL: When the flow reaches A, it has four possible outcomes: three lead back into the cluster, one leaks out. So ¾ of the flow will return and only ¼ leaks. Flow will accumulate in the center of a cluster (island). The border nodes will starve.

Introduction: MCL in General. Simulation of random flow in a graph. Two operations: Expansion and Inflation. Intrinsic relationship between the MCL process result and cluster structure.

Introduction: Cluster. Observation 1: The number of higher-length paths in G is large for pairs of vertices lying in the same dense cluster and small for pairs of vertices belonging to different clusters.

Introduction: Cluster. Observation 2: A random walk in G that visits a dense cluster will likely not leave the cluster until many of its vertices have been visited. Take driving as an example. Observation 1 is like there being many different ways of driving from A to B if A and B are in the same district, and only a few if they are in different districts. Observation 2 is like driving around randomly, but in line with traffic regulations, which will keep you in the same district for a long time.

Definitions: n×n adjacency matrix A, where A(i,j) = weight on the edge from i to j. If the graph is undirected, A(i,j) = A(j,i), i.e. A is symmetric. n×n transition matrix P, which is row stochastic: P(i,j) = probability of stepping to node j from node i = $A(i,j) / \sum_k A(i,k)$.
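As a small illustration of this definition, the sketch below builds a row-stochastic transition matrix from a weighted adjacency matrix. The function name `transition_matrix`, the handling of isolated nodes (an all-zero row) and the example weights are assumptions made for this sketch.

```python
def transition_matrix(adjacency):
    """Divide each row of the adjacency matrix by its row sum."""
    p = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            # Assumption: an isolated node keeps an all-zero row.
            p.append([0.0] * len(row))
        else:
            p.append([weight / total for weight in row])
    return p


# Hypothetical undirected graph: node 0 is linked to nodes 1 and 2 with weight 2.
adjacency = [
    [0, 2, 2],
    [2, 0, 0],
    [2, 0, 0],
]

P = transition_matrix(adjacency)
for row in P:
    print(row)
```

Because the weights on node 0's two edges are equal, the walker leaves node 0 for either neighbour with probability 1/2, regardless of the absolute weight value.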

Flow Formulation: Flow: transition probability from one node to another. Flow matrix: matrix with the flows among all nodes; the ith column represents flows out of the ith node, so each column sums to 1. [Figure: a three-node example graph and its flow matrix]

Definitions [Figure: an example graph with its adjacency matrix A and transition matrix P]

What is a random walk [Figure: a random walk on the example graph, shown at t=0, t=1, t=2 and t=3]

Probability Distributions: $x_t(i)$ = probability that the surfer is at node i at time t. $x_{t+1}(i) = \sum_j (\text{probability of being at node } j) \cdot \Pr(j \to i) = \sum_j x_t(j)\,P(j,i)$. In matrix form, $x_{t+1} = x_t P = x_{t-1} P^2 = x_{t-2} P^3 = \dots = x_0 P^{t+1}$. What happens when the surfer keeps walking for a long time?
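The update rule can be iterated directly. The sketch below is only an illustration: the function names and the three-node "lazy" walk (each node stays put with probability 1/2) are assumptions chosen so that the long-run behaviour is easy to see. For this particular chain the distribution settles towards (0.25, 0.5, 0.25) no matter where the surfer starts.

```python
def step(x, p):
    """One step of the walk: x_{t+1}(i) = sum_j x_t(j) * P(j, i)."""
    n = len(x)
    return [sum(x[j] * p[j][i] for j in range(n)) for i in range(n)]


def distribution_after(x0, p, t):
    """Apply the update t times, i.e. compute x_0 P^t."""
    x = list(x0)
    for _ in range(t):
        x = step(x, p)
    return x


# Hypothetical row-stochastic matrix: a lazy walk on the path 0 - 1 - 2.
P = [
    [0.5,  0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0,  0.5, 0.5],
]

start = [1.0, 0.0, 0.0]          # the surfer starts at node 0
print(distribution_after(start, P, 1))
print(distribution_after(start, P, 50))
```

The long-run distribution here depends only on the chain, not on the starting node, which is why plain matrix powers alone eventually lose the cluster information.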

Motivation behind MCL: Measure or sample any of these (high-length paths, random walks) and deduce the cluster structure from the behavior of the sampled quantities. Cluster structure will show itself as a peaked distribution of the quantities. A lack of cluster structure will result in a flat distribution.

Important Concepts about MCL Markov Chain Random Walk on Graph Some Definitions in MCL

Markov Chain: A random process with the Markov property. Markov property: given the present state, future states are independent of the past states. At each step the system may change its state from the current state to another state, or remain in the same state, according to a certain probability distribution. The changes of state are called transitions, and the probabilities associated with various state changes are called transition probabilities.

Markov Chain Example: Take node 1 as an example. Node 1 has 5 options for its next step: nodes 2, 6, 7, 10 and itself. The probability distribution says that each of these is equally likely to be the next node. Markov chains are applied in many fields, including economics, internet applications (PageRank is defined via a Markov chain), gambling and physics.

Random Walk on Graph: A walker takes off on some arbitrary vertex. He successively visits new vertices by arbitrarily selecting one of the outgoing edges. There is not much difference between a random walk and a finite Markov chain.

Some Definitions in MCL: Simple Graph. A simple graph is an undirected graph in which every nonzero weight equals 1.

Some Definitions in MCL: Associated Matrix. The associated matrix of G, denoted $M_G$, is defined by setting the entry $(M_G)_{pq}$ equal to $w(v_p, v_q)$.

Some Definitions in MCL: Markov Matrix. The Markov matrix associated with a graph G is denoted by $T_G$ and is formally defined by letting its qth column be the qth column of $M_G$ normalized.

Example

Explanation of Previous Example: The associated matrix and Markov matrix are actually those of the matrix M+I, where I denotes the identity matrix (the diagonal matrix whose nonzero elements all equal 1). This adds a loop to every vertex of the graph, because a walker may stay in the same place at his next step.
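A minimal sketch of this construction follows: add a self-loop to every vertex, then normalize each column so it sums to 1. The function name `markov_matrix` and the three-node path graph are assumptions for illustration; the graph from the slides' example is not reproduced here.

```python
def markov_matrix(weights):
    """Column-normalize M + I, where M is the associated (weight) matrix."""
    n = len(weights)
    # Add a loop of weight 1 to every vertex.
    with_loops = [[weights[p][q] + (1 if p == q else 0) for q in range(n)]
                  for p in range(n)]
    col_sums = [sum(with_loops[p][q] for p in range(n)) for q in range(n)]
    # Column q holds the transition probabilities out of vertex q.
    return [[with_loops[p][q] / col_sums[q] for q in range(n)]
            for p in range(n)]


# Hypothetical simple graph: the path 0 - 1 - 2.
M = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

T = markov_matrix(M)
for row in T:
    print(row)
```

With the loops added, the middle vertex has three equally likely moves (left, right, stay), so its column is uniform at 1/3.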

Example

Markov Cluster Algorithm: Find Higher-Length Paths. Starting point: in the associated matrix, the quantity $(M^k)_{pq}$ has a straightforward interpretation as the number of paths of length k between $v_p$ and $v_q$.

Example: Associated Matrix $(M_G + I)^2$. We can interpret the matrix as follows: for vertex 1, the numbers of paths of length 2 to itself, vertex 6, vertex 7 and vertex 10 are 5, 3, 3 and 4 respectively. So we might presume that these four vertices belong to the same cluster. Is that too arbitrary? These are only paths of length 2, and it seems we would need some kind of threshold.

Example: Markov Matrix. The same trend seems to appear. Here we have taken one step of the random walk, and the matrix gives the probability distribution for where the walker goes next. For vertex 1, vertices 1, 6, 7 and 10 have a higher probability of being chosen. Does that mean we should simply make those four vertices a cluster? Let's continue to look:

Example: Markov Matrix. The final result is that in each column all nonzero values are homogeneously distributed. This can be interpreted as each node being equally attracted to all of its neighbours, or as a walker at each node moving to each of its neighbours with equal probability.

Conclusion: Flow is easier within dense regions than across sparse boundaries. However, in the long run, this effect disappears. Powers of the matrix can be used to find higher-length paths, but the effect diminishes as the flow goes on.

Inflation Operation: Idea: how can we change the distribution of transition probabilities so that preferred neighbours are further favoured and less popular neighbours are demoted? MCL solution: raise all the entries in a given column to a certain power greater than 1 (e.g. squaring) and rescale the column so that it sums to 1 again.

Example for Inflation Operation: From the examples, we can tell that this operation does not change entries that are homogeneously distributed, and that different column positions with nearly identical values will still be close to each other after rescaling. Entries whose values are not close are pushed further apart.
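The inflation step can be sketched in a few lines. This is an illustrative sketch only: the function name `inflate`, the default power r = 2 and the example matrix are assumptions, and the matrix is taken to be column stochastic.

```python
def inflate(m, r=2.0):
    """Raise every entry to the power r, then rescale each column to sum to 1."""
    n = len(m)
    powered = [[m[p][q] ** r for q in range(n)] for p in range(n)]
    col_sums = [sum(powered[p][q] for p in range(n)) for q in range(n)]
    return [[powered[p][q] / col_sums[q] if col_sums[q] else 0.0
             for q in range(n)]
            for p in range(n)]


# Hypothetical column-stochastic matrix.
# Column 0 is uneven (0.5, 0.25, 0.25); column 1 has equal nonzeros (0.5, 0.5, 0).
m = [
    [0.5,  0.5, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.0, 0.5],
]

inflated = inflate(m)
for row in inflated:
    print(row)
```

In column 0 the largest entry grows from 1/2 to 2/3 while the others shrink to 1/6 each; column 1, whose nonzero entries are equal, is left unchanged.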

Definition for Inflation Operation
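In van Dongen's formulation, inflation with parameter r is usually written as an operator $\Gamma_r$ acting column by column. The restatement below uses the notation of these slides and assumes a column-stochastic matrix M with k rows:

```latex
(\Gamma_r M)_{pq} \;=\; \frac{(M_{pq})^{r}}{\sum_{i=1}^{k} (M_{iq})^{r}}

% Worked example for one column with r = 2:
% (0.6,\ 0.3,\ 0.1) \;\mapsto\; \frac{(0.36,\ 0.09,\ 0.01)}{0.46}
%   = \left(\tfrac{18}{23},\ \tfrac{9}{46},\ \tfrac{1}{46}\right)
```

For r = 1 the operator leaves the matrix unchanged; larger r sharpens each column further.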

Apply Inflation Operation to the previous Markov Matrix

Inflation Effects

MCL Operations: Expansion operation: power of the matrix, expansion of dense regions. Inflation operation: as described above, elimination of unfavoured regions.

The MCL algorithm
Input: A, the adjacency matrix.
Initialize: M := M_G := (A+I) D^-1, the canonical transition matrix.
Expand: M := M*M. Enhances flow to well-connected nodes as well as to new nodes.
Inflate: M := M.^r (r usually 2), then renormalize columns. Increases inequality in each column: "Rich get richer, poor get poorer."
Prune: remove entries close to zero. Saves memory.
Converged? If no, go back to Expand. If yes, output clusters.
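The loop above can be sketched in plain Python. This is a minimal, unoptimized sketch under stated assumptions, not a reference implementation: D is taken to be the diagonal matrix of column sums of A+I, and the pruning threshold, tolerance, iteration cap and all function names are choices made here for illustration.

```python
def normalize_columns(m):
    """Rescale every column to sum to 1 (all-zero columns are left as zeros)."""
    n = len(m)
    sums = [sum(m[p][q] for p in range(n)) for q in range(n)]
    return [[m[p][q] / sums[q] if sums[q] else 0.0 for q in range(n)]
            for p in range(n)]


def expand(m):
    """Expansion: ordinary matrix product M * M."""
    n = len(m)
    return [[sum(m[p][k] * m[k][q] for k in range(n)) for q in range(n)]
            for p in range(n)]


def inflate(m, r):
    """Inflation: entrywise power r, then renormalize columns."""
    return normalize_columns([[v ** r for v in row] for row in m])


def prune(m, threshold):
    """Drop entries below the threshold, then renormalize columns."""
    return normalize_columns([[v if v >= threshold else 0.0 for v in row]
                              for row in m])


def mcl(adjacency, r=2.0, threshold=1e-6, max_iter=100, tol=1e-9):
    """Iterate expand / inflate / prune until the matrix stops changing."""
    n = len(adjacency)
    # M := (A + I) D^-1: add self-loops, then make columns sum to 1.
    m = normalize_columns([[adjacency[p][q] + (1.0 if p == q else 0.0)
                            for q in range(n)] for p in range(n)])
    for _ in range(max_iter):
        new = prune(inflate(expand(m), r), threshold)
        change = max(abs(new[p][q] - m[p][q])
                     for p in range(n) for q in range(n))
        m = new
        if change < tol:
            break
    return m


# Hypothetical input: two triangles (0,1,2) and (3,4,5) joined by the edge 2 - 3.
bridged = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
limit = mcl(bridged)
for row in limit:
    print([round(v, 3) for v in row])
```

The returned matrix still has to be interpreted as clusters: rows with a nonzero diagonal entry are the attractors, and the nonzero entries of such a row indicate which nodes flow into it. Dense lists of lists are used here only for readability; a practical implementation would rely on sparse matrices, which is what makes pruning worthwhile.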

MCL Result for the Graph

A Striking Example

Striking Animation http://www.micans.org/mcl/ani/mcl-animation.html

Mapping nonnegative idempotent matrices onto clusters: Find attractors: node a is an attractor if $M_{aa}$ is nonzero. Find attractor systems: if a is an attractor, then the set of its neighbours is called an attractor system. If a node has an arc to any node of an attractor system, that node belongs to the same cluster as the attractor system.
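One way to read clusters off a converged matrix is sketched below. It rests on assumptions beyond the slide text: the matrix is column stochastic (so `m[a][q] > 0` means node q sends flow to node a), attractors that exchange flow with each other are grouped into one attractor system, and the function name and hand-built example matrix are hypothetical.

```python
def clusters_from_limit(m, eps=1e-9):
    """Group attractors into systems, then attach every node that flows into one."""
    n = len(m)
    attractors = [a for a in range(n) if m[a][a] > eps]

    # Attractors that send flow to each other belong to the same system.
    systems = []
    for a in attractors:
        for system in systems:
            if any(m[a][b] > eps or m[b][a] > eps for b in system):
                system.add(a)
                break
        else:
            systems.append({a})

    clusters = []
    for system in systems:
        members = set(system)
        for q in range(n):
            if any(m[a][q] > eps for a in system):
                members.add(q)
        clusters.append(sorted(members))
    return clusters


# Hand-built idempotent, column-stochastic matrix (hypothetical):
# nodes 0 and 2 are attractors; node 1 flows to 0, node 3 flows to 2,
# and node 4 splits its flow evenly between both attractors.
limit = [
    [1.0, 1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

print(clusters_from_limit(limit))
```

Node 4 ends up in both clusters, which illustrates how overlapping clusters can arise when a node's flow is shared between attractor systems.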

Example Attractor Set={1,2,3,4,5,6,7,8,9,10} The Attractor System is {1,2,3},{4,5,6,7},{8,9},{10} The overlapping clusters are {1,2,3,11,12,15},{4,5,6,7,13},{8,9,12,13,14,15},{10,12,13}

MCL Features: How many steps are required before the algorithm converges to an idempotent matrix? The number is typically somewhere between 10 and 100. The effect of inflation on cluster granularity.

Summary: MCL simulates a random walk on a graph to find clusters. Expansion promotes dense regions while inflation demotes the less favoured regions. There is an intrinsic relationship between the MCL result and cluster structure.

Markov Clustering (MCL) [van Dongen '00]: The original algorithm for clustering graphs using stochastic flows. Advantages: Simple and elegant. Widely used in bioinformatics because of its noise tolerance and effectiveness. Disadvantages: Very slow (takes 1.2 hours to cluster a 76K node social network). Prone to output too many clusters (produces 1416 clusters on a 4741 node PPI network).