Graph and Tensor Mining for fun and profit

Presentation transcript:

Graph and Tensor Mining for fun and profit
Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap
Introduction – Motivation
Part#1: Graphs
  P1.1: properties/patterns in graphs
  P1.2: node importance
  P1.3: community detection
  P1.4: fraud/anomaly detection
  P1.5: belief propagation
KDD 2018 Dong+

Roadmap
Introduction – Motivation
Part#1: Graphs
  P1.1: properties/patterns in graphs
  P1.2: node importance
  P1.3: community detection
    Algorithm
    Warning: ‘no good cuts’
  P1.4: fraud/anomaly detection

Problem: given a graph and a number k, break it into k (disjoint) communities.
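One common way to score a candidate answer to this problem is the edge cut: the number of edges whose endpoints land in different communities. The sketch below is only an illustration; the function name `edge_cut`, the toy graph (two triangles joined by one edge), and both assignments are assumptions, not part of the tutorial.

```python
def edge_cut(edges, community):
    """Count edges whose two endpoints are assigned to different communities."""
    return sum(1 for u, v in edges if community[u] != community[v])


# Hypothetical toy graph: triangle {0,1,2}, triangle {3,4,5}, one bridge (2,3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]

# k = 2 communities that follow the two triangles: only the bridge is cut.
good_split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
# k = 2 communities that ignore the structure: many edges are cut.
poor_split = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}

good_cut = edge_cut(toy_edges, good_split)
poor_cut = edge_cut(toy_edges, poor_split)
```

A small cut relative to the total number of edges suggests the communities are well separated; the later 'no good cuts' slides are about graphs where no such split exists.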

Short answer: METIS [Karypis, Kumar]

Solution #1: METIS
Arguably, the best algorithm.
Open source, at http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.1.0.tar.gz and *many* related papers, at the same URL.
Main idea: coarsen the graph; partition; un-coarsen.
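The coarsen / partition / un-coarsen idea can be sketched on a toy graph. This is not METIS and does not call its API: it is a one-level illustration with a greedy matching, a brute-force split of the coarse graph, and a direct projection back, with no refinement step during un-coarsening. All function names and the example graph (two 4-cliques joined by one edge) are assumptions made for illustration.

```python
from itertools import combinations


def coarsen(edges):
    """Merge each node with one unmatched neighbor; return node->supernode map and weighted coarse edges."""
    nodes = sorted({n for e in edges for n in e})
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {}
    for u in nodes:
        if u in parent:
            continue
        parent[u] = u
        for v in adj[u]:
            if v not in parent:
                parent[v] = u  # u and v collapse into supernode u
                break
    weights = {}
    for u, v in edges:
        a, b = parent[u], parent[v]
        if a != b:
            key = (min(a, b), max(a, b))
            weights[key] = weights.get(key, 0) + 1
    return parent, weights


def bisect_coarse(sizes, weights):
    """Brute-force the roughly balanced two-way split of supernodes with the smallest weighted cut."""
    supers = sorted(sizes)
    total = sum(sizes.values())
    slack = max(sizes.values())
    best = None
    for r in range(1, len(supers)):
        for group in combinations(supers, r):
            chosen = set(group)
            weight_a = sum(sizes[s] for s in chosen)
            if abs(2 * weight_a - total) > slack:
                continue  # too unbalanced
            cut = sum(w for (a, b), w in weights.items() if (a in chosen) != (b in chosen))
            if best is None or cut < best[0]:
                best = (cut, chosen)
    return best


def multilevel_bisect(edges):
    """Coarsen once, partition the coarse graph, then project the split back to the original nodes."""
    parent, weights = coarsen(edges)
    sizes = {}
    for node, sup in parent.items():
        sizes[sup] = sizes.get(sup, 0) + 1
    cut, chosen = bisect_coarse(sizes, weights)
    side = {node: 0 if sup in chosen else 1 for node, sup in parent.items()}
    return cut, side


# Hypothetical example: two 4-cliques joined by the single edge (3, 4).
clique_a = list(combinations(range(0, 4), 2))
clique_b = list(combinations(range(4, 8), 2))
example_edges = clique_a + [(3, 4)] + clique_b
example_cut, example_side = multilevel_bisect(example_edges)
```

On this example the coarse graph has four supernodes, and the cheapest balanced split cuts only the bridge. A greedy matching can also merge nodes across a bridge and then miss the best cut, which is one reason a production partitioner does more work at each level than this sketch.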

Solution #1: METIS
G. Karypis and V. Kumar. METIS 4.0: Unstructured graph partitioning and sparse matrix ordering system. TR, Dept. of CS, Univ. of Minnesota, 1998. <and many extensions>

Solutions #2,3…
Fiedler vector (eigenvector of the second-smallest eigenvalue of the Laplacian).
Modularity: Community structure in social and biological networks, M. Girvan and M. E. J. Newman, PNAS June 11, 2002, 99 (12) 7821-7826; https://doi.org/10.1073/pnas.122653799
Co-clustering: [Dhillon+, KDD’03]
Clustering on A^2 (the square of the adjacency matrix) [Zhou, Woodruff, PODS’04]
Minimum cut / maximum flow [Flake+, KDD’00]
….
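As an illustration of the Fiedler-vector approach, the sketch below approximates the eigenvector of the second-smallest Laplacian eigenvalue with plain power iteration and splits the nodes by sign. It assumes a connected, undirected graph with nodes numbered 0..n-1; the function name, the iteration count, and the toy graph are assumptions, and a real workload would use a sparse eigensolver instead.

```python
def fiedler_vector(n, edges, iterations=500):
    """Approximate the Fiedler vector of a connected graph by power iteration on (shift*I - L)."""
    degree = [0] * n
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
        degree[u] += 1
        degree[v] += 1
    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue,
    # so shift*I - L has non-negative eigenvalues, in reversed order.
    shift = 2 * max(degree)
    x = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # project out the constant eigenvector
        # (L x)_i = degree_i * x_i - sum of x over the neighbors of i
        y = [(shift - degree[i]) * x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        norm = sum(val * val for val in y) ** 0.5
        x = [val / norm for val in y]
    return x


# Hypothetical toy graph: triangle {0,1,2}, triangle {3,4,5}, one bridge (2,3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
vector = fiedler_vector(6, toy_edges)
positive_side = [value > 0 for value in vector]  # sign pattern = two-way split
```

Splitting by sign gives two communities; for k > 2 the usual options are to recurse on each side or to cluster on several eigenvectors.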

A word of caution
BUT: often, there are no good cuts.

A word of caution
Maybe there are no good cuts: “jellyfish” shape [Tauro+’01], [Siganos+’06]; strange behavior of cuts [Chakrabarti+’04], [Leskovec+’08].

R1: Jellyfish model [Tauro+]
A Simple Conceptual Model for the Internet Topology, L. Tauro, C. Palmer, G. Siganos, M. Faloutsos, Global Internet, November 25-29, 2001.
Jellyfish: A Conceptual Model for the AS Internet Topology, G. Siganos, Sudhir L Tauro, M. Faloutsos, J. of Communications and Networks, Vol. 8, No. 3, pp 339-350, Sept. 2006.

R2: ‘Familiar strangers’
Bipartite graph (‘heterophily’), e.g. ‘lawyers’ and ‘eng.’
[Figure: bipartite graph with the two groups labeled ‘lawyers’ and ‘eng.’]
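To see why a bipartite pattern defeats cut-based community detection, one can score the "natural" grouping with the standard modularity formula, Q = sum over communities of (edges inside / m) - (degree sum / 2m)^2. The sketch below assumes an idealized case that is not in the slides: three lawyers and three engineers, every lawyer linked to every engineer, and no links within a profession. All names are hypothetical.

```python
def modularity(edges, community):
    """Modularity of a node->community assignment on an undirected, unweighted graph."""
    m = len(edges)
    inside = {}      # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items())


# Hypothetical complete bipartite graph between the two professions.
lawyers = ["L1", "L2", "L3"]
engineers = ["E1", "E2", "E3"]
bipartite_edges = [(l, e) for l in lawyers for e in engineers]

by_profession = {p: "lawyers" for p in lawyers}
by_profession.update({p: "eng" for p in engineers})
one_group = {p: "all" for p in lawyers + engineers}

q_by_profession = modularity(bipartite_edges, by_profession)
q_one_group = modularity(bipartite_edges, one_group)
```

Under these assumptions, grouping by profession cuts every edge and scores below the trivial single-community answer, even though the two professions are the meaningful groups.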

R3: “Core-periphery”
Bipartite graph + clique
[Figure: groups labeled ‘main members’ and ‘satellites’]

Strange behavior of min cuts: ‘negative dimensionality’ (!)
[Plot: clickstream graph; axes log (# edges) and log (mincut-size / #edges); slope ~ -0.45; annotation: 1-1/d]
NetMine: New Mining Tools for Large Graphs, by D. Chakrabarti, Y. Zhan, D. Blandford, C. Faloutsos and G. Blelloch, in the SDM 2004 Workshop on Link Analysis, Counter-terrorism and Privacy.
Statistical Properties of Community Structure in Large Social and Information Networks, J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. WWW 2008.

Short answer: METIS [Karypis, Kumar] (but: maybe NO good cuts exist!)
