Affiliation Network Models of Clusters in Networks

Presentation transcript:

Affiliation Network Models of Clusters in Networks. Jure Leskovec, Stanford University. Joint work with Jaewon Yang.

Networks: Behind each of these complex systems there is a network that defines the interactions between the components. Thus basically everything around us is a network of interconnected parts, and we have been dealing and living with networks all our lives.

Network Clusters: Networks are not uniform/homogeneous. They exhibit clusters! Why clusters? What do they correspond to? Figure: Blogosphere [Adamic&Glance]. 5/27/2019 Jure Leskovec: Affiliation Network Models of Clusters in Networks

From Clusters to Communities: Clusters form communities. Cluster: a set of appropriately connected nodes. Community: nodes with a shared latent property. Communities form for many reasons, in the World Wide Web, citation networks, social networks, and metabolic networks. Figure: Blogosphere [Adamic&Glance].

Basis for Community Formation: How and why do communities form? Granovetter’s strength of weak ties and the small-world models suggest: strong ties are well embedded in the network; weak ties span long ranges. Given a network, how to find communities?

Finding Communities: Q: Given a network, how to find communities? A: Find weak ties and identify communities. Girvan-Newman’s betweenness centrality, modularity, and graph partitioning methods are all based on this idea. Figure: SFI collaboration network [Newman].

Overlapping Communities: Non-overlapping vs. overlapping communities.

Overlaps of Social Circles [Palla et al., ‘05]: A node belongs to many social circles.

Clique Percolation Method (CPM) [Palla et al., ‘05]: Two nodes belong to the same community if they can be connected through adjacent k-cliques. k-clique: a fully connected graph on k nodes. Adjacent k-cliques: two k-cliques that overlap in k-1 nodes. k-clique community: the set of nodes that can be reached through a sequence of adjacent k-cliques. Figure: a k-clique (k=3) and adjacent cliques (k=3).
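
The CPM definition above can be sketched in a few lines of Python. This is a brute-force illustration that assumes a small undirected graph given as an edge list; the function name `k_clique_communities` and the toy graph are hypothetical and not taken from the talk or from Palla et al.'s implementation.

```python
from itertools import combinations

def k_clique_communities(edges, k=3):
    """Brute-force clique percolation; suitable for small graphs only."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate all k-cliques: sets of k nodes that are pairwise adjacent.
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge two cliques when they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of the nodes of one group of adjacent cliques.
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return sorted(groups.values(), key=sorted)

# Toy graph: triangles ABC and BCD are adjacent (they share B and C);
# triangle DEF shares only D with them, so it forms its own community.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
communities = k_clique_communities(edges, k=3)
```

In this toy graph node D ends up in both communities, which is the kind of overlap the method is designed to expose.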

Mixed-Membership Block Models: Mixed-Membership Stochastic Block Models (MMSB) [Airoldi et al.].

Step Back: Community Detection. (1) Take a dataset. (2) Represent it as a graph. (3) Identify communities (really, clusters). (4) Interpret clusters as “real” communities: nodes that have “something” in common, e.g. they work in the same area or publish in the same journals.

Ground-Truth: Networks with an explicit notion of ground truth. Collaborations: conferences and journals as proxies for scientific areas. Social networks: people join groups and create lists. Information networks: users create topic-based groups.

Examples of Ground-Truth: LiveJournal social network: users create and join groups centered on culture, entertainment, expression, fandom, life/style, life/support, gaming, sports, student life, and technology. TuDiabetes network: groups form around specific types of diabetes, different age groups, emotional and social support, arts and crafts, and different geographic regions. A node can be a member of 0 or more groups.

Networks with Ground-Truth: N = number of nodes, E = number of edges, C = number of ground-truth communities, S = average community size, A = memberships per node. For example, groups in the YouTube social network: fans of Real Madrid, subscribers to Lady Gaga videos, followers of the Volvo Ocean Race.

Ground-Truth: Consequences. Ground-truth groups ≈ inferred communities. How do groups map onto the network? → Insights for better models. How to evaluate and interpret? → “Accuracy” of methods.

Edge Probability: Nodes u and v share k groups. What is the edge probability P(edge | k) as a function of k? Figure: P(edge | k) plotted against k. Consequence: Overlaps are DENSER!
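
As a rough illustration of how P(edge | k) could be measured on a network with ground-truth groups, the sketch below counts, for each k, the fraction of node pairs sharing exactly k groups that are connected. The function name, the input layout (`memberships` as a dict from node to a set of group labels), and the toy data are assumptions made for this example; the all-pairs loop is quadratic and only practical for small graphs.

```python
from collections import defaultdict
from itertools import combinations

def edge_prob_by_shared_groups(nodes, edges, memberships):
    """Return {k: fraction of node pairs sharing k groups that are linked}."""
    edge_set = {frozenset(e) for e in edges}
    pairs = defaultdict(int)   # number of node pairs sharing exactly k groups
    linked = defaultdict(int)  # how many of those pairs are connected
    for u, v in combinations(nodes, 2):
        k = len(memberships.get(u, set()) & memberships.get(v, set()))
        pairs[k] += 1
        if frozenset((u, v)) in edge_set:
            linked[k] += 1
    return {k: linked[k] / pairs[k] for k in sorted(pairs)}

# Hypothetical toy data: nodes 1 and 2 share two groups, node 3 shares one
# group with each of them, node 4 belongs to no group.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3)]
memberships = {1: {"a", "b"}, 2: {"a", "b"}, 3: {"a"}, 4: set()}
curve = edge_prob_by_shared_groups(nodes, edges, memberships)
```

A curve that increases with k is what the slide summarizes as overlaps being denser.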

Communities in Networks: Does it matter at all? Granovetter and all non-overlapping methods; Palla et al., MMSB, and overlapping methods as well.

Detecting Dense Overlaps? Can present community detection methods detect dense overlaps? No! (Mixed-Membership) Stochastic block model: $E[P_{i,j}] = \sum_{c} P_{i,c} \, P_{j,c} \, P_e(c)$. Clique percolation.
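
A small numeric check can make the stochastic block model argument concrete. The sketch below evaluates the slide's expression, a sum over communities c of P(i,c) · P(j,c) · P_e(c), for a pair of nodes inside one community and for a pair in the overlap of two communities. The membership vectors and edge probabilities are made-up values, and the vectors are assumed to sum to one, as in a mixed-membership model.

```python
def mmsb_expected_edge_prob(pi_i, pi_j, p_e):
    """Slide expression: sum over communities c of P[i,c] * P[j,c] * P_e(c)."""
    return sum(pi_i[c] * pi_j[c] * p_e[c] for c in p_e)

p_e = {"a": 0.5, "b": 0.5}    # hypothetical within-community edge probabilities
pure = {"a": 1.0, "b": 0.0}   # node that belongs only to community a
mixed = {"a": 0.5, "b": 0.5}  # node in the overlap of a and b

inside = mmsb_expected_edge_prob(pure, pure, p_e)     # 1 * 1 * 0.5
overlap = mmsb_expected_edge_prob(mixed, mixed, p_e)  # 2 * (0.5 * 0.5 * 0.5)
```

Under these assumed values the overlap pair gets a lower expected edge probability than the pair inside a single community, the opposite of the denser overlaps seen in the ground-truth data.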

Dense Overlaps: Natural Model. Community-Affiliation Graph Model (AGM) B(V, C, M, {pc}): nodes V, communities C, memberships M. Provably generates power-law degree distributions and other patterns that real-world networks exhibit [Lattanzi, Sivakumar, STOC ‘09].
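
A minimal sketch of how a graph could be sampled from B(V, C, M, {pc}), following the description in the Yang and Leskovec paper listed under References: each community c independently links each pair of its members with probability pc, so two nodes that share several communities are connected with probability 1 − ∏(1 − pc) over the shared communities. The function names and data layout are hypothetical, and the small background edge probability the paper uses for pairs sharing no community is left out for brevity.

```python
import random
from itertools import combinations

def sample_agm(memberships, p, seed=None):
    """Sample an edge set: each community c links each member pair w.p. p[c]."""
    rng = random.Random(seed)
    members = {}
    for node, comms in memberships.items():
        for c in comms:
            members.setdefault(c, []).append(node)
    edges = set()
    for c, nodes in members.items():
        for u, v in combinations(sorted(nodes), 2):
            if rng.random() < p[c]:
                edges.add((u, v))
    return edges

def edge_probability(u, v, memberships, p):
    """P(u, v) = 1 - product over shared communities c of (1 - p[c])."""
    prob_no_edge = 1.0
    for c in memberships[u] & memberships[v]:
        prob_no_edge *= 1.0 - p[c]
    return 1.0 - prob_no_edge

# Hypothetical example: nodes 2 and 3 sit in the overlap of communities a and b.
memberships = {1: {"a"}, 2: {"a", "b"}, 3: {"a", "b"}, 4: {"b"}}
p = {"a": 0.5, "b": 0.5}
graph = sample_agm(memberships, p, seed=0)
```

With these values `edge_probability(2, 3, memberships, p)` is larger than `edge_probability(1, 2, memberships, p)`, so the model produces dense overlaps by construction.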

Community-Affiliation Graph Model: AGM is flexible and can express a variety of network structures: non-overlapping, nested, and overlapping.

Model-based Community Detection: Given a graph, find the model, that is, the affiliation graph B, the number of communities, and the parameters pc. Yes, we can!

AGM Model Fitting. Task: Given a network G(V, E), find B(V, C, M, {pc}). This is an optimization problem (MLE). How to solve it? Approach: coordinate ascent. (1) Stochastic search over B, while keeping {pc} fixed. (2) Optimize {pc}, while keeping B fixed (convex!). Works well in practice!
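
To make the MLE objective concrete, the sketch below scores a candidate affiliation structure by the log-likelihood of the observed graph: each node pair contributes log P(u, v) if the edge is present and log(1 − P(u, v)) otherwise, with P(u, v) = 1 − ∏(1 − pc) over shared communities as in the Yang and Leskovec paper listed under References. The function name, the `eps` clamp (added only to avoid log(0)), and the toy data are assumptions; the stochastic search over B and the convex update of {pc} are not implemented here.

```python
import math
from itertools import combinations

def agm_log_likelihood(nodes, edges, memberships, p, eps=1e-9):
    """Log-likelihood of the observed edges under affiliations and {p_c}."""
    edge_set = {frozenset(e) for e in edges}
    ll = 0.0
    for u, v in combinations(nodes, 2):
        no_edge = 1.0
        for c in memberships.get(u, set()) & memberships.get(v, set()):
            no_edge *= 1.0 - p[c]
        # Clamp so that log() stays finite (an assumption of this sketch).
        prob = min(max(1.0 - no_edge, eps), 1.0 - eps)
        if frozenset((u, v)) in edge_set:
            ll += math.log(prob)
        else:
            ll += math.log(1.0 - prob)
    return ll

# Hypothetical comparison of two candidate affiliation structures.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (3, 4)]
p = {"a": 0.9, "b": 0.9}
good = {1: {"a"}, 2: {"a"}, 3: {"b"}, 4: {"b"}}  # matches the edges
bad = {1: {"a"}, 3: {"a"}, 2: {"b"}, 4: {"b"}}   # groups unlinked nodes
better = agm_log_likelihood(nodes, edges, good, p) > agm_log_likelihood(nodes, edges, bad, p)
```

A stochastic search over B, as in step (1), would propose changes to the memberships and keep those that raise this score.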

Experimental Setup: Fit the model to the network to obtain communities, then evaluate. Evaluation: How well do inferred communities correspond to ground truth? F1 score, Ω-index, NMI, … Algorithms for comparison: MMSB [Airoldi et al., JMLR], Link Clustering [Ahn et al., Nature], Clique Percolation [Palla et al., Nature].
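
One plausible way to turn the F1 score into a comparison between two sets of communities is sketched below: each ground-truth community is matched to its best-scoring detected community, each detected community is matched to its best-scoring ground-truth community, and the two averages are combined. The slide does not spell out the exact matching scheme, so this symmetric best-match average is an assumption, and the function names are hypothetical.

```python
def f1(a, b):
    """F1 score between two node sets, treating `a` as the reference."""
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(b)
    recall = overlap / len(a)
    return 2 * precision * recall / (precision + recall)

def average_f1(truth, detected):
    """Symmetric average of best-match F1 scores between two community sets."""
    if not truth or not detected:
        return 0.0
    t2d = sum(max(f1(t, d) for d in detected) for t in truth) / len(truth)
    d2t = sum(max(f1(d, t) for t in truth) for d in detected) / len(detected)
    return 0.5 * (t2d + d2t)

# Hypothetical example: the second detected community misses overlap node 3.
truth = [{1, 2, 3}, {3, 4, 5}]
detected = [{1, 2, 3}, {4, 5}]
score = average_f1(truth, detected)  # about 0.9
```

A perfect recovery scores 1.0, and missing an overlap node lowers the score for the affected community only.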

Example: Facebook Ego-Net. Circles shown: High-school; Workplace; Stanford, Squash; Stanford, Basketball. 89% accuracy.

Experiments: Ground-truth. Overall (only overlaps in parentheses), AGM improves (F1 ≈ 0.6): 57% (21%) over Link Clustering, 48% (22%) over CPM, 10% (26%) over MMSB.

Experiments: Meta-data. Evaluation based on node metadata [Ahn et al. ‘10]. Similar level of improvement.

Conclusion & Reflections: Ground-truth communities → overlaps are denser. Present methods can’t detect such overlaps. Community-Affiliation Graph Model → model-based community detection. Outperforms state-of-the-art.

Connections: Nested Core-Periphery (jellyfish, octopus). Denser and denser core of the network. The core contains 60% of the nodes and 80% of the edges. Whiskers are responsible for good communities.

Communities & Core-Periphery

Explanation

THANKS! http://snap.stanford.edu

References
J. Yang, J. Leskovec. Structure and Overlaps of Communities in Networks. http://arxiv.org/abs/1205.6228
J. Yang, J. Leskovec. Defining and Evaluating Network Communities based on Ground-truth. http://arxiv.org/abs/1205.6233
J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics. http://arxiv.org/abs/0810.1355