

James Hipp Senior, Clemson University

Graph Representation
- G = (V, E)
- V = set of vertices
- E = set of edges
- Adjacency matrix A: A_ij = 1 if vertices i and j share an edge, 0 otherwise
- No self-inclusion (i != j)
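A minimal sketch of the adjacency-matrix representation, assuming an undirected, unweighted graph with vertices numbered from 0. The function name `adjacency_matrix` is illustrative, not from the slides.

```python
def adjacency_matrix(num_vertices, edges):
    """Build a symmetric 0/1 adjacency matrix from a list of (i, j) pairs."""
    A = [[0] * num_vertices for _ in range(num_vertices)]
    for i, j in edges:
        if i == j:
            # The slides exclude self-inclusion (i != j).
            raise ValueError("self-loops are not allowed")
        A[i][j] = 1
        A[j][i] = 1  # undirected: mirror the entry
    return A


# A path on three vertices: 0 - 1 - 2
A = adjacency_matrix(3, [(0, 1), (1, 2)])
```

A dense matrix uses O(N^2) memory, so for the large sparse networks discussed later an adjacency list would be the more practical choice.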

Modularity
- Extent to which like is connected to like in a network, compared with what would be expected if edges were placed at random
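For reference, Newman's modularity can be written as Q = (1/2m) * sum over i, j of [A_ij - k_i k_j / 2m] * delta(c_i, c_j), where k_i is the degree of vertex i, m is the number of edges, and c_i is the community of vertex i. The sketch below evaluates this directly; the example graph (two triangles joined by one edge) is an assumption chosen for illustration.

```python
def modularity(A, community):
    """Newman modularity Q for a symmetric adjacency matrix A.

    community[i] is the community label of vertex i.
    """
    n = len(A)
    degree = [sum(row) for row in A]
    two_m = sum(degree)  # equals 2m for an undirected graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_triangles = modularity(A, [0, 0, 0, 1, 1, 1])  # 5/14, about 0.357
q_single = modularity(A, [0, 0, 0, 0, 0, 0])     # 0 for one big community
```

This double loop is O(N^2) and only meant to make the definition concrete.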

Modularity Example (Newman)

- Networks are vast and still growing, with millions if not billions of nodes and links
- They can be very sparse or very dense, which makes the information hard to comprehend
- A computationally efficient algorithm is needed to extract useful information

- Requires partitioning the network into segments ("communities") of densely connected nodes
- This can be computationally difficult
- Nodes belonging to different communities should be only sparsely connected

- It is difficult to obtain comprehensive information from the large networks that exist today
- Algorithms must be computationally efficient to achieve this

Minimum-Cut Method
- Outdated
- Useful for load balancing in parallel computation
- Not practical for community partitioning of most real networks
- Useful for Pleasant Parallelism

Hierarchical Clustering

- Markov Clustering
- Spectral Methods
- Exhaustive Modularity Maximization and Modularity Optimization (our focus)

Girvan-Newman
- Links between network segments are iteratively removed based on a measure of their betweenness
- Complexity of O(N^3) on sparse networks
- Referred to as GN

Girvan-Newman
1. The betweenness of all existing edges in the network is calculated first.
2. The edge with the highest betweenness is removed.
3. The betweenness of all affected edges is recalculated.
4. Steps 2 and 3 are repeated until no edges remain.
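The four steps can be sketched as follows, assuming an unweighted, undirected graph stored as a dict of neighbor sets with no self-loops. For simplicity this sketch recomputes the betweenness of every edge after each removal rather than only the affected ones, and the function names are illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style accumulation)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Push path counts back from the farthest vertices toward s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every path was counted once from each of its two endpoints.
    return {e: b / 2.0 for e, b in score.items()}


def girvan_newman_order(adj):
    """Return edges in the order GN removes them (steps 1 to 4)."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    removed = []
    while any(adj.values()):
        scores = edge_betweenness(adj)      # steps 1 and 3
        edge = max(scores, key=scores.get)  # step 2
        u, v = tuple(edge)
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append(edge)
    return removed


# Two triangles joined by the bridge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
removal_order = girvan_newman_order(graph)
```

On this example the bridge 2-3 carries all nine shortest paths between the two triangles, so it is removed first and the graph splits into its two communities.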

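Fast (Greedy) Modularity Optimization by Clauset, Newman, and Moore
- Agglomerative: starts from single-node communities and repeatedly merges the pair that most increases modularity (unlike GN, which removes edges)
- A fast, more efficient implementation of Newman's earlier greedy algorithm
- Complexity of O(N log^2 N) for sparse graphs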

Drawbacks of Greedy Modularity Optimization
- May produce values of modularity that are significantly lower than other methods
- Tendency to create large super-communities that contain large fractions of nodes with no significant community structure (slows down the algorithm)

- Much quicker than previous modularity maximization algorithms
- Two-phase iterative process
- Unfolds a complete hierarchical structure
- Useful for many social and real-world networks, as they possess natural levels of organization

Phase 1
- Assume a weighted network of N nodes
1. Assign a different community to each node of the network
2. For each node i, consider its neighbors j and evaluate the modularity gained by removing i from its community and placing it in j's community
3. Place i in the community where the gain in modularity is maximal

Phase 1
- Repeated over all nodes until no further improvement can be achieved
- Node i is moved only if the gain in Q is positive; otherwise i remains in its own community
- A tie-breaking rule resolves equal gains

Phase 1
- A node may be, and often is, considered several times
- The output of Phase 1 depends on the order in which nodes are considered
- Ordering does not seem to greatly affect modularity but does affect computation time
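A minimal sketch of Phase 1, assuming a symmetric weighted graph stored as a dict of dicts with no self-loops. Candidate communities are ranked by k_i,C - Sigma_tot(C) * k_i / 2m, which is proportional to the change in Q under the standard modularity definition. Visiting nodes in dict order and keeping the first-seen community on ties are assumptions made here; the slides do not specify either.

```python
def phase_one(graph):
    """Local moving: graph[i][j] is the weight of the link i-j."""
    degree = {i: sum(nbrs.values()) for i, nbrs in graph.items()}
    two_m = sum(degree.values())
    community = {i: i for i in graph}  # step 1: one community per node
    sigma_tot = dict(degree)           # total degree of each community
    improved = True
    while improved:
        improved = False
        for i in graph:
            old = community[i]
            # Weight of the links from i to each neighboring community.
            links = {}
            for j, w in graph[i].items():
                c = community[j]
                links[c] = links.get(c, 0.0) + w
            sigma_tot[old] -= degree[i]  # take i out of its community
            # Baseline: gain from putting i back where it was.
            best = old
            best_gain = links.get(old, 0.0) - sigma_tot[old] * degree[i] / two_m
            for c, k_i_in in links.items():
                gain = k_i_in - sigma_tot[c] * degree[i] / two_m
                if gain > best_gain:  # strict: ties keep the earlier choice
                    best, best_gain = c, gain
            sigma_tot[best] += degree[i]
            if best != old:
                community[i] = best
                improved = True
    return community


# Two unit-weight triangles joined by the link 2-3.
graph = {
    0: {1: 1, 2: 1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1, 3: 1},
    3: {2: 1, 4: 1, 5: 1}, 4: {3: 1, 5: 1}, 5: {3: 1, 4: 1},
}
labels = phase_one(graph)
```

On this example the loop settles after three sweeps with each triangle in its own community.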

Phase 1
- The efficiency of the algorithm lies in the fact that the gain in modularity from moving an isolated node i into a community C can be computed easily from local quantities.
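The formula itself is not present in this transcript. As printed in Blondel et al. (the first reference at the end), the gain is given below, where Sigma_in is the sum of the weights of the links inside C, Sigma_tot is the sum of the weights of the links incident to nodes in C, k_i is the sum of the weights of the links incident to node i, k_i,in is the sum of the weights of the links from i to nodes in C, and m is the sum of the weights of all links in the network.

```latex
\Delta Q = \left[ \frac{\Sigma_{in} + k_{i,in}}{2m}
         - \left( \frac{\Sigma_{tot} + k_i}{2m} \right)^{2} \right]
         - \left[ \frac{\Sigma_{in}}{2m}
         - \left( \frac{\Sigma_{tot}}{2m} \right)^{2}
         - \left( \frac{k_i}{2m} \right)^{2} \right]
```

Implementations differ on whether internal link weights are counted once or twice, so the constant factors should be checked against the modularity definition in use.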

Phase 2
- Construct a new network whose nodes are the communities found in Phase 1
1. The weight of the link between two new nodes is the sum of the weights of the links between nodes in the corresponding two communities
2. Links between nodes of the same community become self-loops on the new node
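A minimal sketch of the Phase 2 aggregation, assuming the same symmetric dict-of-dicts graph and a node-to-community mapping. One convention is chosen here that the slides do not specify: because each internal link is stored in both directions, the self-loop entry ends up holding twice the internal weight, which keeps the weighted degree of a new node equal to the total degree of its community.

```python
def aggregate(graph, community):
    """Collapse each community into a single node of a new weighted graph."""
    new_graph = {c: {} for c in set(community.values())}
    for i, nbrs in graph.items():
        ci = community[i]
        for j, w in nbrs.items():
            cj = community[j]
            # ci == cj accumulates into a self-loop; otherwise into the
            # link between the two community nodes.
            new_graph[ci][cj] = new_graph[ci].get(cj, 0.0) + w
    return new_graph


# Two unit-weight triangles joined by the link 2-3, labelled "a" and "b".
graph = {
    0: {1: 1, 2: 1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1, 3: 1},
    3: {2: 1, 4: 1, 5: 1}, 4: {3: 1, 5: 1}, 5: {3: 1, 4: 1},
}
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
coarse = aggregate(graph, community)
```

Here `coarse` has two nodes joined by a link of weight 1, each with a self-loop entry of 6 (three internal links, counted from both ends).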

Phase 2
- After Phase 2 completes, Phase 1 is reapplied to the resulting network, and the process iterates
- "Pass" = one combination of these two phases
- Most of the computing time is spent in the first pass
- The number of communities decreases with each pass

- Simplicity = steps are intuitive and easy to understand and implement, and computation is very fast
- Simulations suggest that the complexity of the algorithm is linear on typical, sparse data
- Results are separated into different levels of organization (hierarchical structure)

- Ring of 30 cliques, 5 nodes per clique, with neighboring cliques connected through single links
- 1st pass = partition into the individual cliques
- 2nd pass = global maximization of modularity, where cliques are merged in groups of 2
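The ring-of-cliques example can be checked numerically. The sketch below builds the graph and compares the modularity of the per-clique partition with that of the paired partition. The node numbering and the choice of which node in each clique carries the ring link are assumptions made for illustration.

```python
def ring_of_cliques(num_cliques=30, clique_size=5):
    """Edge list: complete cliques, each joined to the next by one link."""
    edges = []
    for c in range(num_cliques):
        base = c * clique_size
        for a in range(clique_size):
            for b in range(a + 1, clique_size):
                edges.append((base + a, base + b))
        # Single link from the last node of this clique to the first
        # node of the next clique, wrapping around to close the ring.
        nxt = ((c + 1) % num_cliques) * clique_size
        edges.append((base + clique_size - 1, nxt))
    return edges


def modularity(edges, community):
    """Q = sum over communities of l_c / m - (d_c / 2m)^2."""
    m = len(edges)
    internal, degree_sum = {}, {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


edges = ring_of_cliques()
per_clique = {node: node // 5 for node in range(150)}  # 30 communities
paired = {node: node // 10 for node in range(150)}     # 15 communities
q_cliques = modularity(edges, per_clique)
q_pairs = modularity(edges, paired)
```

Under these assumptions the per-clique partition gives Q of about 0.876 and the paired partition about 0.888, so merging cliques in pairs does increase modularity.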

- Belgian mobile phone network, about 2 million customers (nodes)
- Red vs. green represents the main language spoken in each community (French vs. Dutch)
- Two mega-communities of language clusters are obvious, with more of a mixture in the center

- Belgian mobile phone network
- Only communities of at least 100 customers were plotted
- All but one community of 10,000+ members had a dominant language spoken by at least 85% of members

Comparisons between:
- CNM = Clauset, Newman, Moore
- PL = Pons and Latapy
- WT = Wakita and Tsurumi
- Our Algorithm = Blondel

- Results = modularity/computation time
- Empty cells = time > 24 hours

- Notice the differences in Q between WT and Blondel for the phone network
- WT tends to create balanced communities, while Blondel's algorithm creates unbalanced communities (more accurate Q calculation)

- The limitation of Blondel's algorithm is main-memory storage rather than computation time
- The algorithm allows the complete hierarchical structure of a network to be viewed
- Quickest and most efficient of the modularity maximization algorithms compared

- Setting a threshold for the modularity gain in Phase 1 could speed up the algorithm
- The algorithm allows larger networks to be studied

- Fast unfolding of communities in large networks (Blondel)
- Community detection algorithms: a comparative analysis (Lancichinetti, Fortunato)
- CPSC 481 Lecture PowerPoints