Michael L. Nelson CS 495/595 Old Dominion University

Slides:



Advertisements
Similar presentations
Class 12: Communities Network Science: Communities Dr. Baruch Barzel.
Advertisements

Fast algorithm for detecting community structure in networks M. E. J. Newman Department of Physics and Center for the Study of Complex Systems, University.
Social network partition Presenter: Xiaofei Cao Partick Berg.
Clustering.
§3 Shortest Path Algorithms Given a digraph G = ( V, E ), and a cost function c( e ) for e  E( G ). The length of a path P from source to destination.
Information Retrieval Lecture 7 Introduction to Information Retrieval (Manning et al. 2007) Chapter 17 For the MSc Computer Science Programme Dell Zhang.
Weighted graphs Example Consider the following graph, where nodes represent cities, and edges show if there is a direct flight between each pair of cities.
O(N 1.5 ) divide-and-conquer technique for Minimum Spanning Tree problem Step 1: Divide the graph into  N sub-graph by clustering. Step 2: Solve each.
DIJKSTRA’s Algorithm. Definition fwd search Find the shortest paths from a given SOURCE node to ALL other nodes, by developing the paths in order of increasing.
Community Detection Laks V.S. Lakshmanan (based on Girvan & Newman. Finding and evaluating community structure in networks. Physical Review E 69,
Peer-to-Peer and Social Networks Centrality measures.
Relationship Mining Network Analysis Week 5 Video 5.
Graph Partitioning Dr. Frank McCown Intro to Web Science Harding University This work is licensed under Creative Commons Attribution-NonCommercial 3.0Attribution-NonCommercial.
1 Modularity and Community Structure in Networks* Final project *Based on a paper by M.E.J Newman in PNAS 2006.
V4 Matrix algorithms and graph partitioning
Lecture 6 Image Segmentation
Fast algorithm for detecting community structure in networks.
Modularity in Biological networks.  Hypothesis: Biological function are carried by discrete functional modules.  Hartwell, L.-H., Hopfield, J. J., Leibler,
Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.
CS 312: Algorithm Analysis Lecture #16: Strongly Connected Components This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported.
Clustering Unsupervised learning Generating “classes”
Community Structure in Social and Biological Network
Community Detection by Modularity Optimization Jooyoung Lee
Online Social Networks and Media
Communities. Questions 1.What is a community (intuitively)? Examples and fundamental hypothesis 2.What do we really mean by communities? Basic definitions.
University at BuffaloThe State University of New York Detecting Community Structure in Networks.
Community Discovery in Social Network Yunming Ye Department of Computer Science Shenzhen Graduate School Harbin Institute of Technology.
1 EE5900 Advanced Embedded System For Smart Infrastructure Static Scheduling.
Outline Standard 2-way minimum graph cut problem. Applications to problems in computer vision Classical algorithms from the theory literature A new algorithm.
Finding community structure in very large networks
Community structure in graphs Santo Fortunato. More links “inside” than “outside” Graphs are “sparse” “Communities”
Network Theory: Community Detection Dr. Henry Hexmoor Department of Computer Science Southern Illinois University Carbondale.
Example Apply hierarchical clustering with d min to below data where c=3. Nearest neighbor clustering d min d max will form elongated clusters!
Prof. Swarat Chaudhuri COMP 382: Reasoning about Algorithms Fall 2015.
Selected Topics in Data Networking Explore Social Networks:
CS 312: Algorithm Design & Analysis Lecture #29: Network Flow and Cuts This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported.
1 3/21/2016 MATH 224 – Discrete Mathematics First we determine if a graph is connected.
Social Networks Strong and Weak Ties
GUILLOU Frederic. Outline Introduction Motivations The basic recommendation system First phase : semantic similarities Second phase : communities Application.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Modular organization.
Graph clustering to detect network modules
Social Media Analytics
A vertex u is reachable from vertex v iff there is a path from v to u.
Graphs Representation, BFS, DFS
School of Computing Clemson University Fall, 2012
The minimum cost flow problem
Visualizing Social Networks
Visualizing Social Networks
Parallel Density-based Hybrid Clustering
Community detection in graphs
Unweighted Shortest Path Neil Tang 3/11/2010
Visualizing Social Networks
Peer-to-Peer and Social Networks
Finding modules on graphs
Social Network Analysis - Lecture 3 - *
Graphs Representation, BFS, DFS
Minimum Spanning Tree.
Modeling and Simulation NETW 707
Centralities (4) Ralucca Gera,
Spanning Trees.
Shortest Path Trees Construction
Algorithms Lecture # 29 Dr. Sohail Aslam.
EE5900 Advanced Embedded System For Smart Infrastructure
Social Network Analysis - Lecture 2 - *
CSE572: Data Mining by H. Liu
Maximum Bipartite Matching
EMIS 8374 Search Algorithms Updated 12 February 2008
More Graphs Lecture 19 CS2110 – Fall 2009.
Presentation transcript:

Michael L. Nelson CS 495/595 Old Dominion University Graph Analysis Michael L. Nelson CS 495/595 Old Dominion University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License This course is based on Dr. McCown's class

Slides use figures from Ch 3 Slides use figures from Ch 3.6 of Networks, Crowds and Markets by Easley & Kleinberg (2010) http://www.cs.cornell.edu/home/kleinber/networks-book/

Co-authorship network How can the tightly clustered groups be identified mathematically? Newman & Girvan, 2004

Karate Club splits after a dispute Karate Club splits after a dispute. Can new clubs be identified based on network structure? Zachary, 1977 see data set at: http://konect.uni-koblenz.de/networks/ucidata-zachary and http://vlado.fmf.uni-lj.si/pub/networks/data/ucinet/ucidata.htm#zachary

Graph Partitioning Methods to break a network into sets of connected components called regions Many general approaches Divisive methods: Repeatedly identify and remove edges connecting densely connected regions Agglomerative methods: Repeatedly identify and merge nodes that likely belong in the same region

Divisive Methods 1 10 2 3 9 11 7 8 4 6 12 13 5 14

Agglomerative Methods 1 10 2 3 9 11 7 8 4 6 12 13 5 14

How to split/cluster?

Girvan-Newman Algorithm Divisive method Proposed by Girvan and Newman in 2002 Uses edge betweenness to identify edges to remove Edge betweenness: Total amount of “flow” an edge carries between all pairs of nodes where a single unit of flow between two nodes divides itself evenly among all shortest paths between the nodes (1/k units flow along each of k shortest paths) betweeness is a measure of centrality, or importance, in a graph http://en.wikipedia.org/wiki/Centrality

1 10 2 3 9 11 7 8 4 6 12 13 5 14 Edge Betweenness Example Calculate total flow over edge 7-8 9 11 7 8 4 6 12 13 5 14

One unit flows over 7-8 to get from 1 to 8 10 2 3 One unit flows over 7-8 to get from 1 to 8 9 11 7 8 4 6 12 13 5 14

One unit flows over 7-8 to get from 1 to 9 10 2 3 One unit flows over 7-8 to get from 1 to 9 9 11 7 8 4 6 12 13 5 14

One unit flows over 7-8 to get from 1 to 10 2 3 One unit flows over 7-8 to get from 1 to 10 9 11 7 8 4 6 12 13 5 14

7 total units flow over 7-8 to get from 1 to nodes 8-14 10 2 3 9 11 7 8 4 6 12 13 5 14 7 total units flow over 7-8 to get from 1 to nodes 8-14

7 total units flow over 7-8 to get from 2 to nodes 8-14 10 2 3 9 11 7 8 4 6 12 13 5 14 7 total units flow over 7-8 to get from 2 to nodes 8-14

7 total units flow over 7-8 to get from 3 to nodes 8-14 10 2 3 9 11 7 8 4 6 12 13 5 14 7 total units flow over 7-8 to get from 3 to nodes 8-14

7 x 7 = 49 total units flow over 7-8 from nodes 1-7 to 8-14 10 2 3 9 11 7 8 4 6 12 13 5 14

1 Edge betweenness = 49 10 2 3 9 11 7 8 4 6 12 13 5 14

Calculate betweenness for edge 3-7 1 10 Calculate betweenness for edge 3-7 2 3 9 11 7 8 4 6 12 13 5 14

3 units flow from 1-3 to each 4-14 node, 10 3 units flow from 1-3 to each 4-14 node, so total = 3 x 11 = 33 2 3 9 11 7 8 4 6 12 13 5 14

for each symmetric edge 1 Betweenness = 33 for each symmetric edge 10 2 3 9 11 33 33 7 8 33 33 4 6 12 13 5 14

Calculate betweenness for edge 1-3 10 Calculate betweenness for edge 1-3 2 3 9 11 7 8 4 6 12 13 5 14

Carries all flow to node 1 except from node 2, 10 Carries all flow to node 1 except from node 2, so betweenness = 12 2 3 9 11 7 8 4 6 12 13 5 14

for each symmetric edge 1 10 betweenness = 12 for each symmetric edge 12 12 2 3 9 11 12 12 7 8 4 12 6 12 12 13 12 12 5 14

Calculate betweenness for edge 1-2 10 Calculate betweenness for edge 1-2 2 3 9 11 7 8 4 6 12 13 5 14

Only carries flow from 1 to 2, so betweenness = 1 10 Only carries flow from 1 to 2, so betweenness = 1 2 3 9 11 7 8 4 6 12 13 5 14

betweenness = 1 for each symmetric edge 10 betweenness = 1 for each symmetric edge 1 1 2 3 9 11 7 8 4 6 12 13 1 1 5 14

Edge with highest betweenness 1 10 Edge with highest betweenness 1 12 12 1 2 3 9 11 12 12 33 33 7 49 8 33 33 4 12 6 12 12 13 12 12 1 1 5 14

Node Betweenness Betweenness also defined for nodes Node betweenness: Total amount of “flow” a node carries when a unit of flow between each pair of nodes is divided up evenly over shortest paths Nodes and edges of high betweenness perform critical roles in the network structure

Girvan-Newman Algorithm Calculate betweenness of all edges Remove the edge(s) with highest betweenness Repeat steps 1 and 2 until graph is partitioned into as many regions as desired

Girvan-Newman Algorithm Calculate betweenness of all edges Remove the edge(s) with highest betweenness Repeat steps 1 and 2 until graph is partitioned into as many regions as desired How much computation does this require? Newman (2001) and Brandes (2001) independently developed similar algorithms that reduce the complexity from O(mn2) to O(mn) where m = # of edges, n = # of nodes

Computing Edge Betweenness Efficiently For each node N in the graph Perform breadth-first search of graph starting at node N Determine the number of shortest paths from N to every other node Based on these numbers, determine the amount of flow from N to all other nodes that use each edge Divide sum of flow of all edges by 2 Method developed by Brandes (2001) and Newman (2001)

Example Graph F B C I A D G K E H J

Computing Edge Betweenness Efficiently For each node N in the graph Perform breadth-first search of graph starting at node N Determine the number of shortest paths from N to every other node Based on these numbers, determine the amount of flow from N to all other nodes that use each edge Divide sum of flow of all edges by 2

Breadth-first search from node A G H I J K

Computing Edge Betweenness Efficiently For each node N in the graph Perform breadth-first search of graph starting at node N Determine the number of shortest paths from N to every other node Based on these numbers, determine the amount of flow from N to all other nodes that use each edge Divide sum of flow of all edges by 2

A B C D E 1 1 1 1 2 add 2 add F G H 1 3 add 3 add I J 6 add K

Computing Edge Betweenness Efficiently For each node N in the graph Perform breadth-first search of graph starting at node N Determine the number of shortest paths from N to every other node Based on these numbers, determine the amount of flow from N to all other nodes that use each edge Divide sum of flow of all edges by 2

A B C D E F G H I J K 1 1 1 1 2 1 2 3 3 Work from bottom-up starting with K K 6

K gets 1 unit; equal, so evenly divide 1 unit B C E 1 D 1 1 1 F G H 2 1 2 I J 3 3 ½ K gets 1 unit; equal, so evenly divide 1 unit ½ K 6

A B C E 1 D 1 1 1 F G H 2 1 2 1 ½ I gets 1 unit & gets ½ unit from K (total 3/2); gets 2 times as much from F I J 3 3 ½ ½ K 6

A B C E 1 D 1 1 1 F G H 2 1 2 1 ½ ½ 1 J gets 1 unit & gets ½ unit from K (total 3/2); gets 2 times as much from H I J 3 3 ½ ½ K 6

F gets 1 unit & 1 unit from I; equal, so divide evenly B C 1 D E 1 1 1 1 1 F G H 2 1 2 1 ½ ½ 1 F gets 1 unit & 1 unit from I; equal, so divide evenly I J 3 3 ½ ½ K 6

G gets 1 unit & 0.5 + 0.5 from I & J, passes along 2 units B C D E 1 1 1 1 1 1 2 F G H 2 1 2 1 ½ ½ 1 G gets 1 unit & 0.5 + 0.5 from I & J, passes along 2 units I J 3 3 ½ ½ K 6

H gets 1 unit & gets 1 unit from J; equal, so divide evenly B C 1 D E 1 1 1 1 1 2 1 1 F G H 2 1 2 1 ½ ½ 1 H gets 1 unit & gets 1 unit from J; equal, so divide evenly I J 3 3 ½ ½ K 6

B gets 1 and 1 from F & passes 2 C E 1 D 1 1 1 1 1 2 1 1 F G H 2 1 2 1 ½ ½ 1 I J 3 3 ½ ½ K 6

C gets 1 and 1 from F & passes 2 B C D E 1 1 1 1 1 1 2 1 1 F G H 2 1 2 1 ½ ½ 1 I J 3 3 ½ ½ K 6

D gets 1 & gets 2 (G) and 1 (H); passes 4 B C E 1 D 1 1 1 1 1 2 1 1 F G H 2 1 2 1 ½ ½ 1 I J 3 3 ½ ½ K 6

E gets 1 gets 1 from D & passes along 2 4 B C E 1 D 1 1 1 1 1 2 1 1 F G H 2 1 2 1 ½ ½ 1 I J 3 3 ½ ½ K 6

A B C E D F G H I J K No flow yet… 1 1 1 1 2 1 2 3 3 6 2 2 2 4 1 1 2 1 ½ ½ 1 I J 3 3 ½ ½ K 6

Computing Edge Betweenness Efficiently For each node N in the graph Perform breadth-first search of graph starting at node N Determine the number of shortest paths from N to every other node Based on these numbers, determine the amount of flow from N to all other nodes that use each edge Divide sum of flow of all edges by 2 Repeat for B, C, etc. Since sum includes flow from A  B and B  A, etc.