
Analysis of Large Graphs Community Detection By: KIM HYEONGCHEOL WALEED ABDULWAHAB YAHYA AL-GOBI MUHAMMAD BURHAN HAFEZ SHANG XINDI HE RUIDAN 1

Overview  Introduction & Motivation  Graph cut criterion  Min-cut  Normalized-cut  Non-overlapping community detection  Spectral clustering  Deep auto-encoder  Overlapping community detection  BigCLAM algorithm 2

Intro to Analysis of Large Graphs  Introduction  Objective 1 KIM HYEONG CHEOL

Introduction  What is a graph?  Definition An ordered pair G = (V, E) A set V of vertices A set E of edges An edge is a line connecting two vertices, i.e. a 2-element subset of V  Types Undirected graph, directed graph, mixed graph, multigraph, weighted graph and so on 4

Introduction  Undirected graph  Edges have no orientation  Edge (x,y) = Edge (y,x)  The maximum number of edges : n(n-1)/2, when all pairs of vertices are connected to each other Undirected graph G = (V, E) V : {1,2,3,4,5,6} E : {E(1,2), E(2,3), E(1,5), E(2,5), E(4,5), E(3,4), E(4,6)} 5
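The small example graph above is easy to write down and check; a minimal Python sketch (vertex and edge labels as on the slide):

```python
# The example graph from the slide, as Python sets.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset(e) for e in [(1, 2), (2, 3), (1, 5), (2, 5), (4, 5), (3, 4), (4, 6)]}

# Edges have no orientation: (x, y) and (y, x) denote the same edge,
# so each edge is stored as a 2-element frozenset.
assert frozenset((2, 1)) in E

n = len(V)
max_edges = n * (n - 1) // 2  # maximum edges in a simple undirected graph
print(len(E), max_edges)  # 7 15
```

Storing each edge as a frozenset makes the no-orientation property automatic.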


Introduction  The undirected large graph  E.g. social graphs: a graph of Harry Potter fanfiction; a sampled user-connectivity graph. Adapted from Q : What do these large graphs represent? 7

Motivation  Social graph : what can we tell from it? (figure: a sampled user-connectivity graph, two views compared) 8

Motivation  Graph of Harry Potter fanfiction : what can we tell from it? (figure: two views compared; adapted from ) 9

Motivation  If we can partition the graph, we can use the partitions to analyze it, as below 10


Motivation  Graph partition & community detection  Each partition corresponds to a community Q : How can we find the partitions? 14

Criterion : Graph partitioning  Minimum-cut  Normalized-cut 2 KIM HYEONG CHEOL

Criterion : Basic principles  Basic principles for graph partitioning  Minimize the number of between-group connections  Maximize the number of within-group connections Graph partitioning : A & B 16

Criterion : Min-cut VS N-cut  Basic principles for graph partitioning  Minimize the number of between-group connections  Maximize the number of within-group connections Minimum-cut vs. Normalized-cut: Min-cut follows only the first principle (minimize between-group connections); N-cut follows both 17

Mathematical expression : Cut (A,B)  For considering the between-group connections: cut(A,B) = sum of the weights w_ij over all edges (i,j) with i in A and j in B (for an unweighted graph, the number of edges crossing the partition) 18

Mathematical expression : Vol (A)  For considering the within-group connections: vol(A) = total degree of the nodes in A, i.e. the sum of d_i over i in A. In the example, vol (A) = 5 and vol (B) = 5 19

Criterion : Min-cut  Minimize the number of between-group connections  min A,B cut(A,B) Cut(A,B) = 1 -> Minimum value A B 20

Criterion : Min-cut Cut(A,B) = 1 A B A B But, it looks more balanced… How? 21

Criterion : N-cut  Minimize the number of between-group connections  Maximize the number of within-group connections If we define ncut(A,B) = cut(A,B)/vol(A) + cut(A,B)/vol(B), then minimizing ncut(A,B) produces more balanced partitions, because it considers both principles 22
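The three quantities can be sketched directly from a weighted adjacency matrix; the 4-node graph below is made up for illustration:

```python
import numpy as np

# Toy symmetric weighted adjacency matrix (made-up example graph).
W = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def cut(W, A, B):
    # Total weight of edges with one endpoint in A and the other in B.
    return W[np.ix_(A, B)].sum()

def vol(W, A):
    # Total degree of the nodes in A.
    return W[A, :].sum()

def ncut(W, A, B):
    return cut(W, A, B) / vol(W, A) + cut(W, A, B) / vol(W, B)

A, B = [0, 1, 2], [3]
print(cut(W, A, B), vol(W, A), vol(W, B))  # 1.0 7.0 1.0
print(ncut(W, A, B))  # 1/7 + 1/1, a heavily penalized unbalanced split
```

The lone node 3 gives a tiny vol(B), so ncut is large even though the cut itself is small, which is exactly why N-cut prefers balanced partitions.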

Methodology  (figure: the A/B partitions produced by the two criteria, compared) 23

Summary  What is the undirected large graph?  How can we get insight from the undirected large graph?  Graph partition & community detection  What are the methodologies for good graph partitioning?  Min-cut  Normalized-cut 24

 Spectral Clustering  Deep GraphEncoder 3 Non-overlapping community detection: Waleed Abdulwahab Yahya Al-Gobi

Finding Clusters  How to identify such structure?  How to split the graph into two pieces? Nodes Adjacency Matrix Network 26

Spectral Clustering Algorithm  Three basic stages:  1) Pre-processing Construct a matrix representation of the graph  2) Decomposition Compute the eigenvalues and eigenvectors of the matrix Focus on the second smallest eigenvalue λ2 and its corresponding eigenvector x2  3) Grouping Assign points to two or more clusters, based on the new representation 27

Matrix Representations  Adjacency matrix ( A ):  n × n binary matrix  A = [a_ij], a_ij = 1 if there is an edge between nodes i and j, and 0 otherwise

Matrix Representations  Degree matrix (D):  n × n diagonal matrix  D = [d_ii], d_ii = degree of node i

Matrix Representations  Laplacian matrix (L): L = D - A  How can we use L to find good partitions of our graph?  What are the eigenvalues and eigenvectors of L?  We know: L x = λ x 30

Spectrum of Laplacian Matrix (L)  The Laplacian Matrix (L) has:  Eigenvalues 0 = λ1 ≤ λ2 ≤ … ≤ λn  Eigenvectors x1, x2, …, xn 31

Best Eigenvector for partitioning  Second eigenvector x2  The eigenvector that best represents the quality of a graph partitioning.  Let's check the components of x2 through λ2  Fact: For the symmetric matrix L: λ2 = min over x with x^T x1 = 0 of (x^T L x) / (x^T x) 32

λ2 as optimization problem 33 Details! Remember : L = D - A, so x^T L x = x^T D x - x^T A x = Σ_i d_i x_i^2 - 2 Σ_(i,j)∈E x_i x_j = Σ_(i,j)∈E (x_i - x_j)^2
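The identity x^T L x = Σ over edges (x_i - x_j)^2 is easy to confirm numerically; the small graph and random vector below are made up for the check:

```python
import numpy as np

# Made-up undirected graph for the check.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
D = np.diag(A.sum(axis=1))
L = D - A  # Laplacian

x = np.random.default_rng(0).normal(size=n)
lhs = x @ L @ x                                  # quadratic form x^T L x
rhs = sum((x[i] - x[j]) ** 2 for i, j in edges)  # sum over edges
print(np.isclose(lhs, rhs))  # True
```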

λ2 as optimization problem λ2 = min over x with Σ_i x_i = 0 and Σ_i x_i^2 = 1 of Σ_(i,j)∈E (x_i - x_j)^2  The constraint Σ_i x_i = 0 balances the components x_i on either side of 0, while the objective pulls connected nodes i, j to the same side 34

Spectral Partitioning Algorithm: Example  1) Pre-processing:  Build the Laplacian matrix L of the graph  2) Decomposition:  Find the eigenvalues λ and eigenvectors x of the matrix L  Map vertices to the corresponding components of x2 How do we now find the clusters?

Spectral Partitioning Algorithm: Example  3) Grouping:  Sort the components of the reduced 1-dimensional vector x2  Identify clusters by splitting the sorted vector in two  How to choose a splitting point?  Naïve approaches: split at 0 or at the median value Split at 0: Cluster A: positive points Cluster B: negative points A B 36
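The three stages above can be sketched end to end in a few lines of NumPy; the toy graph below (two triangles joined by one edge) is made up so the two clusters are obvious:

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6

# 1) Pre-processing: build the Laplacian L = D - A.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

# 2) Decomposition: eigh returns eigenvalues in ascending order,
# so column 1 is the second eigenvector x2.
eigvals, eigvecs = np.linalg.eigh(L)
x2 = eigvecs[:, 1]

# 3) Grouping: split at 0 (the sign of x2 is arbitrary, so which
# triangle lands on which side may vary).
cluster_A = [i for i in range(n) if x2[i] >= 0]
cluster_B = [i for i in range(n) if x2[i] < 0]
print(sorted(cluster_A), sorted(cluster_B))  # the two triangles
```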

Example: Spectral Partitioning (plot: value of x2 against rank in x2) 37

Example: Spectral Partitioning (plot: value of x2 against rank in x2, with the components of x2 grouped) 38

k-Way Spectral Clustering  How do we partition a graph into k clusters?  Two basic approaches:  Recursive bi-partitioning [Hagen et al., ’92] Recursively apply bi-partitioning algorithm in a hierarchical divisive manner Disadvantages: Inefficient  Cluster multiple eigenvectors [Shi-Malik, ’00] Build a reduced space from multiple eigenvectors Commonly used in recent papers A preferable approach 39

 Spectral Clustering  Deep GraphEncoder [Tian et al., 2014] 4 Muhammad Burhan Hafez

41 Autoencoder  Reconstruction loss: || x - x̂ ||^2, where x̂ is the decoder's reconstruction of the input x  Architecture: encoder layers E1, E2 and decoder layers D1, D2

42 Autoencoder & Spectral Clustering  Simple theorem (Eckart-Young-Mirsky theorem) :  Let A be any matrix, with singular value decomposition (SVD) A = U Σ V^T  Let A_k = U_k Σ_k V_k^T be the decomposition where we keep only the k largest singular values  Then A_k is the best rank-k approximation to A: it minimizes ||A - B||_F over all matrices B of rank at most k Note: If A is symmetric  its singular values are the absolute values of its eigenvalues & U, V are its eigenvectors. Result (1): Spectral Clustering ⇔ matrix reconstruction
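The theorem can be illustrated numerically: truncating the SVD after k singular values beats random rank-k candidates, and its Frobenius error equals the norm of the discarded singular values (the matrix below is random, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # keep the k largest singular values
err_opt = np.linalg.norm(A - A_k)            # Frobenius error of A_k

# No random rank-k matrix should do better.
for _ in range(100):
    B = rng.normal(size=(6, k)) @ rng.normal(size=(k, 5))
    assert np.linalg.norm(A - B) >= err_opt

# The optimal error is exactly sqrt(sum of squared discarded singular values).
print(np.isclose(err_opt, np.sqrt((s[k:] ** 2).sum())))  # True
```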

43  Autoencoder case:  X̂ = X_K, i.e. the autoencoder reconstructs the best rank-K approximation of the input matrix X  based on the previous theorem, where X = U Σ V^T and K is the hidden layer size Autoencoder & Spectral Clustering (cont'd) Result (2): Autoencoder ⇔ matrix reconstruction

44 Deep GraphEncoder | Algorithm  Clustering with GraphEncoder: 1. Learn a nonlinear embedding of the original graph with a deep autoencoder (approximating the eigenvectors corresponding to the K smallest eigenvalues of the graph Laplacian matrix). 2. Run the k-means algorithm on the embedding to obtain the clustering result.

45 Deep GraphEncoder | Efficiency  Approx. guarantee: the cut found by Spectral Clustering and by Deep GraphEncoder is at most 2 times away from the optimal.  Computational complexity: Spectral Clustering: Θ(n^3), due to EVD GraphEncoder: Θ(ncd), where c is the average degree of the graph and d is the max number of hidden-layer nodes

46 Deep GraphEncoder | Flexibility  A sparsity constraint on the hidden activations can easily be added to the original objective function, improving efficiency (storage & data processing) and clustering accuracy.

Overlapping Community Detection  BigCLAM: Introduction 5 SHANG XINDI

48 Non-overlapping Communities (figure: a network, its nodes, and the corresponding adjacency matrix)

49 Non-overlapping vs Overlapping

Facebook Network 50  Nodes: Facebook users  Edges: friendships  Social communities: high school, summer internship, Stanford (Squash), Stanford (Basketball)

Overlapping Communities 51 Edge density in the overlaps is higher! Network Adjacency matrix

Assumption 52 (figure: communities linked to the nodes j that belong to them)

53 Detecting Communities with MLE  Given the model, find the community memberships F that maximize the likelihood of the observed graph G: the product of P(u,v) over edges (u,v) times the product of (1 - P(u,v)) over non-edges


55 BigCLAM Yang, Jaewon, and Jure Leskovec. "Overlapping community detection at scale: a nonnegative matrix factorization approach." Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 2013.

BigCLAM 56  Edge probability: P(u,v) = 1 - exp(- F_u · F_v): the more community memberships nodes u and v share, the larger F_u · F_v and the likelier the edge
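In the model of the paper cited above, the probability of an edge between u and v is P(u,v) = 1 - exp(-F_u · F_v); a toy sketch (the membership matrix F below is made up):

```python
import numpy as np

# Made-up membership strength matrix F: 4 nodes, 2 communities.
F = np.array([
    [1.2, 0.0],   # node 0: only in community 0
    [0.9, 0.0],   # node 1: only in community 0
    [0.7, 0.8],   # node 2: in both communities (overlap)
    [0.0, 1.1],   # node 3: only in community 1
])

def edge_prob(F, u, v):
    # BigCLAM edge probability: 1 - exp(-F_u . F_v)
    return 1.0 - np.exp(-F[u] @ F[v])

print(edge_prob(F, 0, 1))  # shared community: high probability
print(edge_prob(F, 0, 3))  # no shared community: probability 0.0
```

With nonnegative rows, nodes that share no community have F_u · F_v = 0 and hence edge probability exactly 0 in this sketch (the paper handles this with a small background probability, omitted here).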

Overlapping Community Detection  BigCLAM: How to optimize the parameter F ?  Additional reading: state-of-the-art methods 6 He Ruidan

 Model parameter: the community membership strength matrix F  Each row vector F_u of F holds the community membership strengths of node u in the graph 58 BigCLAM: How to find F

 Block coordinate gradient ascent: update F_u for each u with the other rows F_v fixed  Gradient of a single row: ∇l(F_u) = Σ over neighbors v of F_v · exp(-F_u·F_v) / (1 - exp(-F_u·F_v)) - Σ over non-neighbors v of F_v 59 BigCLAM v1.0: How to find F

 Coordinate gradient ascent:  Iterate over the rows of F, updating F_u ← max(F_u + η ∇l(F_u), 0) so that memberships stay nonnegative 60 BigCLAM v1.0: How to find F
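One such row update can be sketched as follows; the gradient is the one from the BigCLAM formulation, while the step size eta, the toy graph, and the random initialization are made up:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 2
F = rng.random((n, k))  # random nonnegative initialization
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}

def grad_row(F, u):
    # Gradient of the log-likelihood with respect to row F_u:
    # attraction toward neighbors, repulsion from non-neighbors.
    g = np.zeros(k)
    for v in neighbors[u]:
        dot = F[u] @ F[v]
        g += F[v] * np.exp(-dot) / (1.0 - np.exp(-dot))
    for v in range(n):
        if v != u and v not in neighbors[u]:
            g -= F[v]
    return g

eta = 0.05  # made-up step size
u = 0
F[u] = np.maximum(F[u] + eta * grad_row(F, u), 0.0)  # project back onto F_u >= 0
print(F[u])
```

The max(·, 0) projection keeps the membership strengths nonnegative after each ascent step.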

 This is slow! The non-neighbor sum Σ over v ∉ N(u) of F_v takes linear time O(n) to compute  As we solve this for each node u, and there are n nodes in total, the overall time complexity is O(n^2)  This cannot be applied to large graphs with millions of nodes 61 BigCLAM v1.0: How to find F

 However, we notice that: Σ over v ∉ N(u) of F_v = (Σ over all v of F_v) - F_u - Σ over v ∈ N(u) of F_v, and Σ over all v of F_v can be cached  Usually, the average degree of a node in the graph can be treated as constant, so this sum takes constant time to compute  Therefore, the time complexity to update the matrix F is reduced to O(n) 62 BigCLAM v2.0: How to find F
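The caching trick can be sketched directly; the toy graph and matrix below are made up, and the point is only that the fast and slow non-neighbor sums agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 3
F = rng.random((n, k))
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2], 4: []}

F_total = F.sum(axis=0)  # cached once; updated incrementally after each row update

def nonneighbor_sum_fast(u):
    # sum of F_v over v not in N(u), v != u, in O(|N(u)|) time via the cache.
    return F_total - F[u] - F[np.array(neighbors[u], dtype=int)].sum(axis=0)

def nonneighbor_sum_slow(u):
    # The naive O(n) version, for comparison.
    return sum(F[v] for v in range(n) if v != u and v not in neighbors[u])

print(all(np.allclose(nonneighbor_sum_fast(u), nonneighbor_sum_slow(u))
          for u in range(n)))  # True
```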


Overlapping Community Detection  BigCLAM: How to optimize the parameter F ?  Additional reading: state-of-the-art methods 6 He Ruidan

 Representation learning of graph nodes  Represent each node as a numerical vector. Given a graph, the vectors should be learned automatically.  Learning objective: the representation vectors of nodes that share similar connections are close to each other in the vector space  Once the representation of each node is learned, community detection can be modeled as a clustering / classification problem 69 Graph Representation

 Graph representation using neural networks / deep learning  B. Perozzi, R. Al-Rfou, and S. Skiena. DeepWalk: online learning of social representations. In SIGKDD, pages 701–710. ACM,  J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. LINE: large-scale information network embedding. In WWW. ACM,  F. Tian, B. Gao, Q. Cui, E. Chen, and T.-Y. Liu. Learning deep representations for graph clustering. In AAAI, Graph Representation

Summary  Introduction & Motivation  Graph cut criterion  Min-cut  Normalized-cut  Non-overlapping community detection  Spectral clustering  Deep auto-encoder  Overlapping community detection  BigCLAM algorithm 71

Appendix 72

Facts about the Laplacian L 73 Details!  L is symmetric and positive semi-definite: x^T L x = Σ over edges (i,j) of (x_i - x_j)^2 ≥ 0 for every x  Its smallest eigenvalue is λ1 = 0, with the constant eigenvector x1 = (1, 1, …, 1)
