Analysis of Large Graphs: Community Detection
By: KIM HYEONGCHEOL, WALEED ABDULWAHAB YAHYA AL-GOBI, MUHAMMAD BURHAN HAFEZ, SHANG XINDI, HE RUIDAN


2 Overview
• Introduction & Motivation
• Graph cut criterion
  - Min-cut
  - Normalized-cut
• Non-overlapping community detection
  - Spectral clustering
  - Deep auto-encoder
• Overlapping community detection
  - BigCLAM algorithm

3 Intro to Analysis of Large Graphs (part 1, presenter: KIM HYEONG CHEOL)
• Introduction
• Objective

4 Introduction
• What is a graph?
• Definition
  - An ordered pair G = (V, E)
  - A set V of vertices
  - A set E of edges: each edge is a line of connection between two vertices, i.e. a 2-element subset of V
• Types
  - Undirected graph, directed graph, mixed graph, multigraph, weighted graph, and so on

5 Introduction
• Undirected graph
  - Edges have no orientation
  - Edge (x,y) = Edge (y,x)
  - The maximum number of edges is n(n-1)/2, reached when all pairs of vertices are connected to each other
• Example: undirected graph G = (V, E)
  - V: {1, 2, 3, 4, 5, 6}
  - E: {E(1,2), E(2,3), E(1,5), E(2,5), E(4,5), E(3,4), E(4,6)}
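The example graph on this slide can be written down directly. The sketch below stores each edge as an unordered pair, so Edge (x,y) = Edge (y,x) holds by construction, and checks the n(n-1)/2 bound. The helper names `has_edge` and `max_edges` are illustrative and do not come from the slides.

```python
from itertools import combinations

# Vertex and edge sets copied from the slide's example graph.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset(e) for e in [(1, 2), (2, 3), (1, 5), (2, 5), (4, 5), (3, 4), (4, 6)]}

def has_edge(x, y):
    """Undirected lookup: the order of x and y does not matter."""
    return frozenset((x, y)) in E

def max_edges(n):
    """Maximum number of edges in an undirected graph on n vertices."""
    return n * (n - 1) // 2

# The complete graph on V contains every 2-element subset of V.
complete = {frozenset(pair) for pair in combinations(V, 2)}

# Degree of each vertex: the number of edges that touch it.
degree = {v: sum(1 for e in E if v in e) for v in V}
```

Because every edge contributes to the degree of two vertices, the degrees always sum to twice the number of edges.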

6 Introduction
• Undirected large graphs, e.g.:
  - Social graph: a sampled user email-connectivity graph (http://research.microsoft.com/en-us/projects/S-GPS/)
  - Graph of Harry Potter fanfiction (adapted from http://colah.github.io/posts/2014-07-FFN-Graphs-Vis/)

7 Introduction
• Q: What do these large graphs represent?

8 Motivation
• Social graph: what do you make of it?
  - [Figure: two versions of the sampled user email-connectivity graph shown side by side (http://research.microsoft.com/en-us/projects/S-GPS/)]

9 Motivation
• Graph of Harry Potter fanfiction: what do you make of it?
  - [Figure: two versions of the fanfiction graph shown side by side (adapted from http://colah.github.io/posts/2014-07-FFN-Graphs-Vis/)]

10 Motivation
• If we can partition a graph, we can use the partitions to analyze it, as shown below

11 Motivation
• Graph partition & community detection
  - [Figure: a partitioned graph; each partition corresponds to a community]

14 Motivation
• Q: How can we find the partitions?

15 Criterion: Graph partitioning (part 2, presenter: KIM HYEONG CHEOL)
• Minimum-cut
• Normalized-cut

16 Criterion: Basic principle
• A basic principle for partitioning a graph into two groups A and B
  - Minimize the number of between-group connections
  - Maximize the number of within-group connections

17 Criterion: Min-cut vs. N-cut
• Which part of the basic principle does each criterion address?

  Principle                                    Min-cut   N-cut
  Minimize between-group connections           yes       yes
  Maximize within-group connections            no (X)    yes

18 Mathematical expression: cut(A,B)
• Accounts for between-group connections
• cut(A,B): the number of edges with one endpoint in A and the other in B

19 Mathematical expression: vol(A)
• Accounts for within-group connections
• Example: vol(A) = 5, vol(B) = 5

20 Criterion: Min-cut
• Minimize the number of between-group connections
  - min over A, B of cut(A,B)
• Example: cut(A,B) = 1 is the minimum value

21 Criterion: Min-cut
• Min-cut partition: cut(A,B) = 1
• But an alternative partition of the same graph looks more balanced. How can we obtain it?

22 Criterion: N-cut
• Minimize the number of between-group connections
• Maximize the number of within-group connections
• If we define ncut(A,B) = cut(A,B)/vol(A) + cut(A,B)/vol(B), then minimizing ncut(A,B) produces more balanced partitions, because it considers both principles
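A minimal sketch of the two criteria on a small six-node graph. It assumes the standard normalized-cut definitions, ncut(A,B) = cut(A,B)/vol(A) + cut(A,B)/vol(B) with vol(A) taken as the sum of the degrees of the nodes in A. The slides' own vol example cannot be checked from the transcript, so that choice is an assumption. The edge list and the function names are illustrative.

```python
# Edge list of a small undirected example graph (illustrative).
edges = [(1, 2), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

def cut(A, B):
    """Number of edges with one endpoint in A and the other in B."""
    return sum(1 for u, v in edges if (u in A and v in B) or (u in B and v in A))

def vol(A):
    """Sum of the degrees of the nodes in A (assumed definition of vol)."""
    return sum((u in A) + (v in A) for u, v in edges)

def ncut(A, B):
    """Normalized cut: the cut size relative to the volume of each side."""
    c = cut(A, B)
    return c / vol(A) + c / vol(B)

balanced = ({1, 2, 3}, {4, 5, 6})
lopsided = ({2}, {1, 3, 4, 5, 6})
```

Both partitions cut two edges, so min-cut alone cannot tell them apart, while ncut is smaller for the balanced one.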

23 Methodology
• [Figure: two partitions of the same graph into A and B, shown side by side (VS)]

24 Summary
• What is an undirected large graph?
• How can we get insight from an undirected large graph?
  - Graph partition & community detection
• What were the methodologies for good graph partitioning?
  - Min-cut
  - Normalized-cut

25 Non-overlapping community detection (part 3, presenter: Waleed Abdulwahab Yahya Al-Gobi)
• Spectral Clustering
• Deep GraphEncoder

26 Finding Clusters
• How to identify such structure?
• How to split the graph into two pieces?
• [Figure: a network and its adjacency matrix, with rows and columns indexed by nodes]

27 Spectral Clustering Algorithm
• Three basic stages:
  1) Pre-processing: construct a matrix representation of the graph
  2) Decomposition: compute the eigenvalues and eigenvectors of the matrix; the focus is on λ2 and its corresponding eigenvector x2
  3) Grouping: assign points to two or more clusters, based on the new representation

28 Matrix Representations
• Adjacency matrix (A):
  - n × n binary matrix
  - A = [a_ij], a_ij = 1 if there is an edge between nodes i and j

        1  2  3  4  5  6
    1   0  1  1  0  1  0
    2   1  0  1  0  0  0
    3   1  1  0  1  0  0
    4   0  0  1  0  1  1
    5   1  0  0  1  0  1
    6   0  0  0  1  1  0

29 Matrix Representations
• Degree matrix (D):
  - n × n diagonal matrix
  - D = [d_ii], d_ii = degree of node i

        1  2  3  4  5  6
    1   3  0  0  0  0  0
    2   0  2  0  0  0  0
    3   0  0  3  0  0  0
    4   0  0  0  3  0  0
    5   0  0  0  0  3  0
    6   0  0  0  0  0  2
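The two matrices above, and the Laplacian L = D - A that the later slides use, can be built in a few lines. This is a sketch with NumPy. The edge list is read off the adjacency matrix on the slide and the variable names are illustrative.

```python
import numpy as np

# Edges read off the slide's adjacency matrix (nodes are numbered 1..6).
edges = [(1, 2), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
n = 6

# Adjacency matrix: symmetric, with a 1 wherever two nodes share an edge.
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1
    A[j - 1, i - 1] = 1

# Degree matrix: diagonal, with d_ii equal to the row sum of A.
D = np.diag(A.sum(axis=1))

# Laplacian matrix, as used in the spectral clustering slides.
L = D - A
```

Each row of L sums to zero, which is why the all-ones vector is always an eigenvector of L with eigenvalue 0.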

30 Matrix Representations
• How can we use the Laplacian matrix (L) to find good partitions of our graph?
• What are the eigenvalues and eigenvectors of L?
• We know: L x = λ x

31 Spectrum of Laplacian Matrix (L)
• The Laplacian matrix (L) has:
  - Eigenvalues λ1, ..., λn, where λ1 ≤ λ2 ≤ ... ≤ λn
  - Eigenvectors x1, ..., xn

32 Best Eigenvector for partitioning
• Second eigenvector x2
  - The eigenvector that best represents the quality of the graph partitioning
  - Let's check the components of x2 through λ2
• Fact: for a symmetric matrix L: λ2 = min over x ⊥ x1 of (xᵀ L x) / (xᵀ x)

33 λ2 as optimization problem
• Remember: L = D - A
• Details: see the Appendix

34 λ2 as optimization problem
• [Figure: nodes i and j placed on the x axis on either side of 0; balance the two sides to minimize]
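The slide's equations did not survive in the transcript. The sketch below checks the standard identity behind this optimization view, xᵀ L x = Σ over edges (i,j) of (x_i - x_j)², on the six-node example graph. The identity is a well-known property of L = D - A and is not text recovered from the slide. The test vector x is an unnormalized multiple of the x2 components in the example that follows, and the function names are illustrative.

```python
import numpy as np

edges = [(1, 2), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
n = 6

A = np.zeros((n, n))
for i, j in edges:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
L = np.diag(A.sum(axis=1)) - A

def quadratic_form(x):
    """x^T L x, computed with the matrix."""
    return float(x @ L @ x)

def edge_sum(x):
    """Sum of squared differences across the edges of the graph."""
    return float(sum((x[i - 1] - x[j - 1]) ** 2 for i, j in edges))

# Entries sum to zero, so x is orthogonal to the all-ones eigenvector x1.
x = np.array([1.0, 2.0, 1.0, -1.0, -1.0, -2.0])

# Rayleigh quotient: the quantity that lambda_2 minimizes over x orthogonal to x1.
rayleigh = quadratic_form(x) / float(x @ x)
```

Minimizing the edge sum pushes connected nodes toward similar values, while the orthogonality constraint forces some values to be positive and some negative; that tension is the "balance" the figure refers to.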

35 Spectral Partitioning Algorithm: Example
• 1) Pre-processing:
  - Build the Laplacian matrix L of the graph

         1   2   3   4   5   6
    1    3  -1  -1   0  -1   0
    2   -1   2  -1   0   0   0
    3   -1  -1   3  -1   0   0
    4    0   0  -1   3  -1  -1
    5   -1   0   0  -1   3  -1
    6    0   0   0  -1  -1   2

• 2) Decomposition:
  - Find the eigenvalues λ and eigenvectors x of the matrix L

    λ = (0.0, 1.0, 3.0, 3.0, 4.0, 5.0)

    X =   0.4   0.3  -0.5  -0.2  -0.4  -0.5
          0.4   0.6   0.4  -0.4   0.4   0.0
          0.4   0.3   0.1   0.6  -0.4   0.5
          0.4  -0.3   0.1   0.6   0.4  -0.5
          0.4  -0.3  -0.5  -0.2   0.4   0.5
          0.4  -0.6   0.4  -0.4  -0.4   0.0

  - Map vertices to the corresponding components of x2

    node:   1     2     3     4     5     6
    x2:     0.3   0.6   0.3  -0.3  -0.3  -0.6

• How do we now find the clusters?

36 Spectral Partitioning Algorithm: Example
• 3) Grouping:
  - Sort the components of the reduced 1-dimensional vector
  - Identify clusters by splitting the sorted vector in two
• How to choose a splitting point?
  - Naïve approaches: split at 0 or at the median value
• Split at 0:
  - Cluster A (positive points): node 1 (0.3), node 2 (0.6), node 3 (0.3)
  - Cluster B (negative points): node 4 (-0.3), node 5 (-0.3), node 6 (-0.6)
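The three stages of the example can be reproduced with NumPy. This is a sketch, not the presenters' code. It uses `numpy.linalg.eigh`, which returns eigenvalues in ascending order, and splits the second eigenvector at 0. The sign of an eigenvector is arbitrary, so which side is called A and which B may be swapped relative to the slide.

```python
import numpy as np

# 1) Pre-processing: build the Laplacian L = D - A of the example graph.
edges = [(1, 2), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
L = np.diag(A.sum(axis=1)) - A

# 2) Decomposition: eigh is meant for symmetric matrices and sorts eigenvalues ascending.
eigenvalues, eigenvectors = np.linalg.eigh(L)
x2 = eigenvectors[:, 1]  # eigenvector of the second-smallest eigenvalue

# 3) Grouping: split the components of x2 at 0.
cluster_a = {node for node in range(1, n + 1) if x2[node - 1] > 0}
cluster_b = set(range(1, n + 1)) - cluster_a
```

For this graph the split separates nodes {1, 2, 3} from nodes {4, 5, 6}, matching the slide.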

37 Example: Spectral Partitioning
• [Figure: value of x2 plotted against rank in x2]

38 Example: Spectral Partitioning
• [Figure: components of x2; value of x2 plotted against rank in x2]

39 k-Way Spectral Clustering
• How do we partition a graph into k clusters?
• Two basic approaches:
  - Recursive bi-partitioning [Hagen et al., ’92]
    · Recursively apply the bi-partitioning algorithm in a hierarchical, divisive manner
    · Disadvantage: inefficient
  - Cluster multiple eigenvectors [Shi-Malik, ’00]
    · Build a reduced space from multiple eigenvectors
    · Commonly used in recent papers
    · A preferable approach

40 Deep GraphEncoder [Tian et al., 2014] (part 4, presenter: Muhammad Burhan Hafez)
• Spectral Clustering
• Deep GraphEncoder

41 Autoencoder
• Reconstruction loss:
• Architecture: [Figure: encoders E1, E2 and decoders D1, D2]

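42 Autoencoder & Spectral Clustering
• Simple theorem (Eckart-Young-Mirsky theorem):
  - Let A be any matrix, with singular value decomposition (SVD) A = U Σ Vᵀ
  - Let A_k = U_k Σ_k V_kᵀ be the decomposition where we keep only the k largest singular values
  - Then A_k is the best rank-k approximation of A in the Frobenius norm: A_k = argmin over rank(B) ≤ k of ‖A - B‖_F
• Note: if A is symmetric positive semidefinite, the singular values are the eigenvalues and U = V = eigenvectors (for a general symmetric matrix, the singular values are the absolute values of the eigenvalues)
• Result (1): Spectral Clustering ⇔ matrix reconstruction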
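The theorem is easy to check numerically. The sketch below truncates the SVD of a random matrix and compares the reconstruction error with the discarded singular values. The matrix, the seed and the helper name `best_rank_k` are illustrative.

```python
import numpy as np

def best_rank_k(A, k):
    """Truncated SVD: keep only the k largest singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return A_k, s

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
k = 2

A_k, s = best_rank_k(A, k)

# Frobenius error of the truncation ...
error = np.linalg.norm(A - A_k, "fro")
# ... equals the root of the sum of squares of the discarded singular values.
expected_error = np.sqrt(np.sum(s[k:] ** 2))

# Any other rank-k matrix does at least as badly, for example a random one.
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))
other_error = np.linalg.norm(A - B, "fro")
```

`numpy.linalg.svd` returns the singular values in descending order, so slicing the first k entries keeps the largest ones.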

43 Autoencoder & Spectral Clustering (cont’d)
• Autoencoder case:
  - Based on the previous theorem, the optimal reconstruction of X keeps only the K largest singular values, where X = U Σ Vᵀ and K is the hidden layer size
• Result (2): Autoencoder ⇔ matrix reconstruction

44 Deep GraphEncoder | Algorithm
• Clustering with GraphEncoder:
  1. Learn a nonlinear embedding of the original graph with a deep autoencoder (this plays the role of the eigenvectors corresponding to the K smallest eigenvalues of the graph Laplacian matrix).
  2. Run the k-means algorithm on the embedding to obtain the clustering result.

45 Deep GraphEncoder | Efficiency
• Approximation guarantee: the cut found by Spectral Clustering and by Deep GraphEncoder is at most 2 times away from the optimal.
• Computational complexity:

  Method                Complexity
  Spectral Clustering   Θ(n³), due to EVD
  GraphEncoder          Θ(ncd), where c is the average degree of the graph and d is the maximum number of hidden layer nodes

46 Deep GraphEncoder | Flexibility
• A sparsity constraint can easily be added to the original objective function.
  - Improves efficiency (storage & data processing).
  - Improves clustering accuracy.
• [Equation: original objective function + sparsity constraint]
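The slide's equation is not recoverable from the transcript, so the sketch below assumes the usual sparse-autoencoder formulation: a squared reconstruction loss plus a weighted KL-divergence penalty that pulls the average hidden activation toward a small target ρ. The single hidden layer, the sigmoid activations, the parameter names (`rho`, `beta`) and the random input are all assumptions made for illustration, not the GraphEncoder implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_sparsity(rho, rho_hat):
    """KL divergence between a target activation rho and the observed averages rho_hat."""
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))

def objective(X, W_enc, b_enc, W_dec, b_dec, rho=0.05, beta=0.1):
    """Reconstruction loss plus sparsity penalty for a one-hidden-layer autoencoder."""
    H = sigmoid(X @ W_enc + b_enc)          # hidden representation (the embedding)
    Y = sigmoid(H @ W_dec + b_dec)          # reconstruction of the input
    reconstruction = float(np.sum((Y - X) ** 2))
    rho_hat = H.mean(axis=0)                # average activation of each hidden unit
    total = reconstruction + beta * kl_sparsity(rho, rho_hat)
    return total, reconstruction, rho_hat

rng = np.random.default_rng(0)
X = rng.random((6, 6))                      # stand-in for a normalized graph similarity matrix
hidden = 3
W_enc = 0.1 * rng.standard_normal((6, hidden))
W_dec = 0.1 * rng.standard_normal((hidden, 6))
total, reconstruction, rho_hat = objective(X, W_enc, np.zeros(hidden), W_dec, np.zeros(6))
```

The penalty is zero only when every hidden unit's average activation equals ρ, so training against this objective trades reconstruction quality for sparser embeddings.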

47 Overlapping Community Detection (part 5, presenter: SHANG XINDI)
• BigCLAM: Introduction

48 Non-overlapping Communities
• [Figure: a network and its adjacency matrix, with rows and columns indexed by nodes]

49 Non-overlapping vs. Overlapping

50 Facebook Network
• Nodes: Facebook users
• Edges: friendships
• Social communities: high school, summer internship, Stanford (Squash), Stanford (Basketball)

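51 Overlapping Communities
• Edge density in the overlaps is higher!
• [Figure: a network with overlapping communities and its adjacency matrix]

52 Assumption
• [Figure: nodes and the communities they belong to, with node j highlighted]

53 Detecting Communities with MLE
• [Figure: a matrix with entries such as 0, 0.1 and 0.9, next to the 4 × 4 adjacency matrix below]

    0 1 1 0
    1 0 1 0
    1 1 0 1
    0 0 1 0

54 Detecting Communities with MLE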

55 BigCLAM
• Yang, Jaewon, and Jure Leskovec. "Overlapping community detection at scale: a nonnegative matrix factorization approach." Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 2013.

56 BigCLAM

57 Overlapping Community Detection (part 5, presenter: He Ruidan)
• BigCLAM: How to optimize parameter F?
• Additional reading: state-of-the-art methods

58 BigCLAM: How to find F
• Model parameter: community membership strength matrix F
• Each row vector F_u in F is the community membership strength of node u in the graph

59 BigCLAM v1.0: How to find F
• Block coordinate gradient ascent: update F_u for each u with the other F_v fixed
• Compute the gradient of a single row

60 BigCLAM v1.0: How to find F
• Coordinate gradient ascent:
  - Iterate over the rows of F

61 BigCLAM v1.0: How to find F
• This is slow! The gradient of a single row has a term over the neighbors of u (constant time) and a term over all other nodes, which takes linear time O(n) to compute
• Since we solve this for each node u and there are n nodes in total, the overall time complexity is O(n^2)
• It cannot be applied to large graphs with millions of nodes

62 BigCLAM v2.0: How to find F
• However, we notice that the sum over the non-neighbors of u equals the sum over all nodes, minus F_u, minus the sum over the neighbors of u
• Usually, the average degree of a node in a graph can be treated as constant, so with the sum over all nodes cached the gradient takes constant time to compute
• Therefore, the time complexity to update the matrix F is reduced to O(n)
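The v1.0 and v2.0 gradients can be compared side by side. The sketch assumes the model of the cited Yang and Leskovec paper, where an edge (u, v) exists with probability 1 - exp(-F_u · F_v); the slides' own formulas did not survive in the transcript. The function names, the toy graph (taken from the 4 × 4 adjacency matrix on the MLE slide), the random F and the step size are illustrative.

```python
import numpy as np

# Neighbor sets of a 4-node toy graph (nodes numbered from 0).
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

def gradient_naive(F, neighbors, u):
    """v1.0: loops over all n nodes, so one row costs O(n)."""
    grad = np.zeros(F.shape[1])
    for v in range(F.shape[0]):
        if v == u:
            continue
        if v in neighbors[u]:
            p = np.exp(-F[u] @ F[v])
            grad += F[v] * p / (1.0 - p)   # term for an observed edge
        else:
            grad -= F[v]                   # term for a missing edge
    return grad

def gradient_fast(F, neighbors, u, column_sums):
    """v2.0: loops over neighbors only, using the cached sum of all rows of F."""
    grad = np.zeros(F.shape[1])
    neighbor_sum = np.zeros(F.shape[1])
    for v in neighbors[u]:
        p = np.exp(-F[u] @ F[v])
        grad += F[v] * p / (1.0 - p)
        neighbor_sum += F[v]
    # Sum over non-neighbors = sum over all rows - F_u - sum over neighbors.
    return grad - (column_sums - F[u] - neighbor_sum)

def update_row(F, neighbors, u, column_sums, step=0.01):
    """One gradient-ascent step on row u, projected back to non-negative values."""
    new_row = np.maximum(0.0, F[u] + step * gradient_fast(F, neighbors, u, column_sums))
    column_sums += new_row - F[u]          # keep the cached sum consistent (in place)
    F[u] = new_row

rng = np.random.default_rng(0)
F = rng.random((4, 2)) + 0.1               # strictly positive start avoids division by zero
column_sums = F.sum(axis=0)
```

One full pass calls `update_row` once per node, so with bounded degree the pass costs O(n) instead of O(n^2).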

68 Overlapping Community Detection (part 5, presenter: He Ruidan)
• BigCLAM: How to optimize parameter F?
• Additional reading: state-of-the-art methods

69 Graph Representation
• Representation learning of graph nodes
  - Try to represent each node as a numerical vector. Given a graph, the vectors should be learned automatically.
  - Learning objective: the representation vectors of nodes that share similar connections are close to each other in the vector space
• Once the representation of each node has been learnt, community detection can be modeled as a clustering / classification problem

70 Graph Representation
• Graph representation using neural networks / deep learning
  - B. Perozzi, R. Al-Rfou, and S. Skiena. DeepWalk: Online learning of social representations. In SIGKDD, pages 701–710. ACM, 2014.
  - J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. LINE: Large-scale information network embedding. In WWW. ACM, 2015.
  - F. Tian, B. Gao, Q. Cui, E. Chen, and T.-Y. Liu. Learning deep representations for graph clustering. In AAAI, 2014.

71 Summary
• Introduction & Motivation
• Graph cut criterion
  - Min-cut
  - Normalized-cut
• Non-overlapping community detection
  - Spectral clustering
  - Deep auto-encoder
• Overlapping community detection
  - BigCLAM algorithm

72 Appendix

73 Facts about the Laplacian L (details)

74 Proof (details)

