COMMUNITIES AND BALANCE IN SIGNED NETWORKS: SPECTRAL APPROACH
Pranay Anchuri*, Malik Magdon Ismail, Rensselaer Polytechnic Institute, NY.
OUTLINE
Introduction
Structural Balance
Heuristic
Spectral Methods
Results
Conclusion
SIGNED SOCIAL NETWORKS
STRUCTURAL BALANCE
[Figure: example triads labeled Stable and Unstable. Notation: positive edge, negative edge.]
A network is strongly balanced if all of its triads are stable.
WEAK STRUCTURAL BALANCE
[Figure: example triads labeled Stable and Unstable.]
A network is weakly balanced if all of its triads are stable.
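The triad figures are not preserved here, so the sketch below assumes the standard definitions: under strong balance a triad is stable when it has an even number of negative edges, and under weak balance only the triad with exactly one negative edge is unstable. The names `triad_is_stable` and `is_balanced` are hypothetical, chosen for illustration.

```python
from itertools import combinations

def triad_is_stable(signs, weak=False):
    """signs: the three edge signs (+1 or -1) of one triad."""
    negatives = sum(1 for s in signs if s < 0)
    if weak:
        # Assumed weak rule: only "two friends, one enemy" is unstable.
        return negatives != 1
    # Assumed strong rule: an even number of negative edges is stable.
    return negatives % 2 == 0

def is_balanced(n, sign, weak=False):
    """sign maps frozenset({u, v}) -> +1 or -1 for nodes 0..n-1.

    Only triples whose three edges are all present count as triads.
    """
    for trio in combinations(range(n), 3):
        pairs = [frozenset(p) for p in combinations(trio, 2)]
        if all(p in sign for p in pairs):
            if not triad_is_stable([sign[p] for p in pairs], weak):
                return False
    return True
```

An all-negative triangle fails the strong test but passes the weak one, which is the case that separates the two notions.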
COMMUNITIES IN BALANCED NETWORK
A balanced network can be divided into communities so that positive edges lie within communities and negative edges lie between communities.
Real-world networks are rarely structurally balanced.
Frustration: the number of edges that disturb the balance, i.e. positive edges between communities plus negative edges within communities.
[Figure: example division of a network into communities, Frustration = 1.]
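The definition of frustration translates directly into a count over the edge list. A minimal sketch, assuming edges are given as hypothetical `(u, v, sign)` triples and `community` maps each node to a community label:

```python
def frustration(edges, community):
    """Count the edges that disturb the balance for a given division.

    edges: iterable of (u, v, sign) with sign +1 or -1.
    community: dict or list mapping node -> community label.
    """
    count = 0
    for u, v, sign in edges:
        same = community[u] == community[v]
        # Frustrated: positive edge between communities,
        # or negative edge within a community.
        if (sign > 0 and not same) or (sign < 0 and same):
            count += 1
    return count
```

For a positive triangle 0-1-2 with one extra node 3 joined to node 2 by a positive edge, placing node 3 in its own community leaves exactly one frustrated edge.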
Community Detection
HEURISTIC
Ignore the negative edges and cluster the remaining nodes.
Isolated nodes are then added in the way that minimizes the frustration.
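A rough sketch of the heuristic, under two assumptions that are not from the slides: the clustering step is replaced by connected components of the positive-edge graph, and each isolated node (one with no positive edges) joins the existing community to which it has the fewest negative edges. The function names are hypothetical.

```python
def positive_components(nodes, edges):
    """Connected components of the graph that keeps only positive edges."""
    parent = {u: u for u in nodes}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v, sign in edges:
        if sign > 0:
            parent[find(u)] = find(v)
    groups = {}
    for u in nodes:
        groups.setdefault(find(u), []).append(u)
    return list(groups.values())

def heuristic_communities(nodes, edges):
    groups = positive_components(nodes, edges)
    communities = [g for g in groups if len(g) > 1]
    isolated = [g[0] for g in groups if len(g) == 1]
    label = {u: k for k, g in enumerate(communities) for u in g}
    for u in isolated:
        if not communities:
            communities.append([u])
            label[u] = 0
            continue
        # Each negative edge into a community would become frustrated
        # if u joined that community, so pick the cheapest one.
        cost = [0] * len(communities)
        for a, b, sign in edges:
            if sign < 0 and u in (a, b):
                other = b if a == u else a
                if other in label:
                    cost[label[other]] += 1
        best = cost.index(min(cost))
        communities[best].append(u)
        label[u] = best
    return communities
```

Swapping in a different clustering method only changes how `communities` is first built; the placement of isolated nodes stays the same.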
Spectral Methods
MINIMIZING FRUSTRATION
Community C is divided into C1 and C2.
Positive edges between C1 and C2 increase frustration.
Negative edges between C1 and C2 decrease frustration.
[Figure: two example divisions of C into C1 and C2, with Frustration = 2 and Frustration = 1.]
MODULARITY
Unsigned modularity: the number of edges within communities minus the expected number if the edges were randomly permuted.
It measures the "surprise" factor. Higher modularity is better.
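As an illustration of unsigned modularity, the sketch below assumes the usual Newman-Girvan normalization, in which the expected fraction of edges inside a community is the square of its share of the total degree. The slide does not state the exact normalization, so treat this as one plausible reading.

```python
from collections import defaultdict

def modularity(edges, community):
    """Unsigned modularity of a division (assumed Newman-Girvan form).

    edges: list of undirected (u, v) pairs, each edge listed once.
    community: dict or list mapping node -> community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    inside = defaultdict(int)   # edges with both ends in the community
    degree = defaultdict(int)   # total degree of the community's nodes
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            inside[community[u]] += 1
    # Observed fraction inside minus the fraction expected at random.
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

For two triangles joined by a single edge and split into the two triangles, this evaluates to 5/14.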
SIGNED MODULARITY
Signed modularity: the surprise factor due to positive edges within communities and negative edges between communities.
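The slide does not give a formula. One formulation used in the signed-network literature, shown here only as an assumed example, treats positive and negative edges with separate null models. Here $w_{ij}$ is the signed weight of edge $(i,j)$, $w_i^{+}$ and $w_i^{-}$ are the positive and negative strengths of node $i$, $w^{+}$ and $w^{-}$ are the total positive and negative weights, and $C_i$ is the community of node $i$.

```latex
% Assumed formulation, not taken from the slides.
Q = \frac{1}{2w^{+} + 2w^{-}} \sum_{i}\sum_{j}
    \left[ w_{ij} - \left( \frac{w_i^{+} w_j^{+}}{2w^{+}}
                         - \frac{w_i^{-} w_j^{-}}{2w^{-}} \right) \right]
    \delta(C_i, C_j)
```

With no negative edges, the expression reduces to the unsigned modularity.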
Minimizing frustration and maximizing modularity: both objectives reduce to maximizing $S^T M S$.
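For the frustration objective, the reduction can be sketched for a division into two communities. The assumptions here are not from the slides: $s_i \in \{-1,+1\}$ marks the side of node $i$, $A$ is the symmetric signed adjacency matrix with entries in $\{-1, 0, +1\}$ and zero diagonal, and $m$ is the number of edges. An edge $(i,j)$ is frustrated exactly when $A_{ij} s_i s_j = -1$.

```latex
\begin{aligned}
F(s) &= \sum_{i<j,\; A_{ij} \neq 0} \frac{1 - A_{ij}\, s_i s_j}{2} \\
     &= \frac{m}{2} - \frac{1}{2} \sum_{i<j} A_{ij}\, s_i s_j \\
     &= \frac{m}{2} - \frac{1}{4}\, s^T A s
\end{aligned}
```

Since $m$ is fixed, minimizing $F(s)$ is the same as maximizing $s^T A s$, which has the form $S^T M S$ with $M = A$ under these assumptions.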
COMPUTING THE MAXIMUM
Maximize $f(M, S) = S^T M S$.
Optimum S: the eigenvector corresponding to the maximum eigenvalue of M.
The eigenvector can be computed by power iteration, which needs only sparse matrix-vector multiplication and is therefore efficient.
However, this gives $S \in \mathbb{R}^n$, whereas we need $S \in \{-1,+1\}^n$.
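A minimal power-iteration sketch on a sparse symmetric matrix stored as a dict of dicts. One detail is an assumption added here rather than something the slides state: the matrix is shifted by its largest absolute row sum so that the iteration converges to the eigenvector of the largest eigenvalue rather than the one of largest magnitude. `signed_adjacency` and `power_iteration` are hypothetical names.

```python
import math
import random

def signed_adjacency(edges):
    """Build {i: {j: sign}} from (u, v, sign) triples (symmetric)."""
    M = {}
    for u, v, sign in edges:
        M.setdefault(u, {})[v] = sign
        M.setdefault(v, {})[u] = sign
    return M

def power_iteration(M, n, iters=500, seed=0):
    """Approximate the eigenvector of the largest eigenvalue of M.

    M: sparse symmetric matrix as {i: {j: value}}, indices in 0..n-1.
    """
    # Gershgorin bound: M + shift*I has no negative eigenvalues, so its
    # dominant eigenvector belongs to the largest eigenvalue of M.
    shift = max((sum(abs(v) for v in row.values()) for row in M.values()),
                default=0.0)
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(iters):
        # Sparse matrix-vector product y = (M + shift*I) x.
        y = [shift * xi for xi in x]
        for i, row in M.items():
            for j, v in row.items():
                y[i] += v * x[j]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return x
```

Each iteration touches every stored entry once, so the cost per step is proportional to the number of edges. A fixed iteration count is used for simplicity; a convergence check on successive vectors would be a natural refinement.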
BOOLEAN SOLUTION
Rounding: based on the sign of $s_i$: $+1$ if $s_i \ge 0$, and $-1$ otherwise.
Rounding with improvement: start with an initial Boolean solution and move the nodes one at a time. If there is a sequence of flips that brings the solution closer to the optimum, retain the changes. Complexity: $O(N^2)$.
Rounding with partial improvement: consider only the nodes whose eigenvector entries have magnitude close to zero.
[Figure: node value in the eigenvector.]
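Rounding with partial improvement can be sketched as follows. This is a simplified greedy variant, not the sequence-of-flips procedure described on the slide: it flips one candidate node at a time and keeps a flip only if $S^T M S$ increases. The `threshold` parameter and the function names are assumptions for illustration.

```python
def signed_adjacency(edges):
    """Build {i: {j: sign}} from (u, v, sign) triples (symmetric)."""
    M = {}
    for u, v, sign in edges:
        M.setdefault(u, {})[v] = sign
        M.setdefault(v, {})[u] = sign
    return M

def round_by_sign(x):
    """+1 for non-negative entries, -1 otherwise."""
    return [1 if v >= 0 else -1 for v in x]

def flip_gain(M, s, i):
    """Change in s^T M s if node i switches sides (M symmetric)."""
    return -4 * s[i] * sum(v * s[j] for j, v in M.get(i, {}).items() if j != i)

def partial_improvement(M, x, threshold=0.1):
    """Round x by sign, then greedily flip nodes with |x_i| < threshold."""
    s = round_by_sign(x)
    # Candidates: entries near zero, where the sign is least reliable.
    candidates = sorted((i for i, v in enumerate(x) if abs(v) < threshold),
                        key=lambda i: abs(x[i]))
    improved = True
    while improved:
        improved = False
        for i in candidates:
            if flip_gain(M, s, i) > 0:
                s[i] = -s[i]   # keep the flip: the objective went up
                improved = True
    return s
```

The loop terminates because every accepted flip strictly increases the objective. Restricting the candidates keeps the work small compared with trying every node.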
MULTIPLE COMMUNITIES
Communities can be divided further until the frustration cannot be reduced or the modularity cannot be increased.
The change in the objective can again be reduced to the form $S^T M S$.
This also requires only sparse matrix-vector multiplication.
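Repeated division can be sketched for the frustration objective. The assumptions are mine: each community is bisected using the sign of the leading eigenvector of its own signed adjacency submatrix (shifted power iteration), and the split is kept only if it removes more frustrated edges than it creates. The sketch evaluates the gain by counting crossing edges directly rather than through the $S^T M S$ form. Node ids are assumed to be comparable (for example integers), and all function names are hypothetical.

```python
import math
import random

def signed_adjacency(edges):
    """Build {u: {v: sign}} from (u, v, sign) triples (symmetric)."""
    adj = {}
    for u, v, sign in edges:
        adj.setdefault(u, {})[v] = sign
        adj.setdefault(v, {})[u] = sign
    return adj

def leading_vector(adj, nodes, iters=300, seed=0):
    """Shifted power iteration on the adjacency restricted to `nodes`."""
    inside = set(nodes)
    rows = {u: {v: s for v, s in adj.get(u, {}).items() if v in inside}
            for u in nodes}
    shift = max((sum(abs(s) for s in r.values()) for r in rows.values()),
                default=0)
    rng = random.Random(seed)
    x = {u: rng.random() for u in nodes}
    for _ in range(iters):
        y = {u: shift * x[u] + sum(s * x[v] for v, s in rows[u].items())
             for u in nodes}
        norm = math.sqrt(sum(val * val for val in y.values()))
        if norm == 0.0:
            break
        x = {u: val / norm for u, val in y.items()}
    return x, rows

def split_gain(rows, side):
    """Frustration removed by a split: negative minus positive crossing edges."""
    gain = 0
    for u, r in rows.items():
        for v, s in r.items():
            if u < v and side[u] != side[v]:
                gain -= s
    return gain

def divide(adj, nodes):
    """Recursively bisect `nodes` while each split lowers the frustration."""
    nodes = sorted(nodes)
    if len(nodes) < 2:
        return [nodes]
    x, rows = leading_vector(adj, nodes)
    side = {u: x[u] >= 0 for u in nodes}
    if split_gain(rows, side) <= 0:
        return [nodes]
    left = [u for u in nodes if side[u]]
    right = [u for u in nodes if not side[u]]
    return divide(adj, left) + divide(adj, right)
```

Calling `divide(signed_adjacency(edges), nodes)` returns a list of communities. A modularity-based version would follow the same recursion with a different gain test.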
Results
MODULARITY MAXIMIZATION
Table columns: Algorithm | # Communities | Largest | Frustration (% of negative edges)
Epinions.com: Clustering (15 means); Clustering (40 means); Modularity; Modularity w/ partial improvement
Slashdot.com: Clustering (15 means); Clustering (40 means); Modularity; Modularity w/ partial improvement
Datasets obtained from
FRUSTRATION MINIMIZATION
Table columns: Algorithm | # Communities | Largest | Frustration (% of negative edges)
Epinions.com: Two Division; Two Division w/ Partial Improvement; Multiple Division; Multiple Division w/ Partial Improvement
Slashdot.com: Two Division; Two Division w/ Partial Improvement; Multiple Division; Multiple Division w/ Partial Improvement
STRONG VS WEAK BALANCE
Minimum frustration:
= 1 when the maximum number of communities is 2
= 0 when the number of communities is 3 (each node in its own community)
Minimum frustration with multiple communities implies weak balance.
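The two numbers can be checked by brute force. The slide's figure is not preserved, so this sketch assumes the example is a triangle of three negative edges, which is consistent with the stated values. `min_frustration` is a hypothetical helper that tries every assignment and is only practical for tiny graphs.

```python
from itertools import product

def frustration(edges, labels):
    """Positive edges between communities plus negative edges within."""
    return sum(1 for u, v, sign in edges
               if (sign > 0) != (labels[u] == labels[v]))

def min_frustration(edges, n, max_communities):
    """Exhaustive minimum over all assignments of n nodes to communities."""
    return min(frustration(edges, labels)
               for labels in product(range(max_communities), repeat=n))

# Assumed example: a triangle whose three edges are all negative.
triangle = [(0, 1, -1), (1, 2, -1), (0, 2, -1)]
two = min_frustration(triangle, 3, 2)
three = min_frustration(triangle, 3, 3)
```

With at most two communities some negative edge must fall inside a community, so `two` is 1. With three communities every node can stand alone and `three` is 0.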
NEGATIVE INCIDENT RATIO
NIR = 3/2
CONCLUSION
A spectral algorithm to detect communities in signed networks.
Objective functions: minimizing frustration, maximizing modularity.
Careful assignment of nodes leads to better communities.
Structural balance (strong and weak) affects the communities detected.
Thank you. Questions?