Graph and Tensor Mining for fun and profit

Presentation transcript:

Graph and Tensor Mining for fun and profit. Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap: Introduction – Motivation; Part#1: Graphs; [break]; Part#2: Tensors; Conclusions. KDD 2018 Dong+

Roadmap: Introduction – Motivation; Part#1: Graphs; …; P1.3: community detection; P1.4: fraud/anomaly detection (Outliers; Lock-step behavior); P1.5: belief propagation

Roadmap: Introduction – Motivation; Part#1: Graphs; …; P1.3: community detection; P1.4: fraud/anomaly detection; P1.5: belief propagation (annotations: un-supervised; semi-supervised)

Roadmap: Introduction – Motivation; Part#1: Graphs; …; P1.3: community detection; P1.4: fraud/anomaly detection (P1.4.1. Outliers; P1.4.2. Lock-step behavior); P1.5: belief propagation (annotation: un-supervised)

‘Recipe’ Structure: Problem definition; Short answer/solution; LONG answer – details; Conclusion/short-answer

Problem. Given: [figure: graph]. Find: Outliers; Lock-step.

Solution. Given: [figure: graph]. Find: Outliers (OddBall); Lock-step (SVD).

P1.4.1. Outliers: Which node(s) are strange? Q: How to start?

P1.4.1. Outliers: Which node(s) are strange? Q: How to start? A1: egonet; and extract node features.

Ego-net Patterns: Which is strange? (Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010)

Ego-net Patterns: Which is strange? Near-star: telemarketer, port scanner, people adding friends indiscriminately, etc. Near-clique: tightly connected people, terrorist groups?, discussion group, etc. (Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010)

P1.4.1. Outliers: Which node(s) are strange? Q: How to start? A: egonet; and extract node features. Q’: which features? A’: ART! Infinite! Pick a few, e.g.:

Ego-net Patterns (features of the egonet of node i):
$N_i$: number of neighbors (degree) of ego i
$E_i$: number of edges in egonet i
$W_i$: total weight of egonet i
$\lambda_{w,i}$: principal eigenvalue of the weighted adjacency matrix of egonet i
(Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010)
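The four features above can be extracted with a few lines of code. The following is a minimal sketch, not the OddBall implementation: it assumes an undirected, weighted graph without self-loops given as `(u, v, weight)` triples, and the helper names `build_adjacency`, `principal_eigenvalue` and `egonet_features` are hypothetical. The principal eigenvalue is approximated by plain power iteration.

```python
from collections import defaultdict
from math import sqrt


def build_adjacency(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def principal_eigenvalue(members, adj, iterations=200):
    """Approximate the largest eigenvalue of the weighted adjacency matrix
    restricted to `members`, using power iteration from an all-ones vector."""
    nodes = list(members)
    inside = set(nodes)
    x = {n: 1.0 for n in nodes}
    lam = 0.0
    for _ in range(iterations):
        y = {n: sum(w * x[m] for m, w in adj[n].items() if m in inside)
             for n in nodes}
        norm = sqrt(sum(val * val for val in y.values()))
        if norm == 0.0:
            return 0.0
        x = {n: val / norm for n, val in y.items()}
        lam = norm  # ||A x|| for the previous unit vector x
    return lam


def egonet_features(adj, ego):
    """Return N (degree), E (edges), W (total weight) and the principal
    eigenvalue for the egonet of `ego` (the ego plus its direct neighbors)."""
    members = set(adj[ego]) | {ego}
    edge_count = 0
    weight_sum = 0.0
    for u in members:
        for v, w in adj[u].items():
            if v in members:
                edge_count += 1      # each undirected edge is seen twice
                weight_sum += w
    return {
        "N": len(adj[ego]),
        "E": edge_count // 2,
        "W": weight_sum / 2.0,
        "lambda_w": principal_eigenvalue(members, adj),
    }


# Toy graph: triangle a-b-c, plus a-d, plus d-e (d-e lies outside a's egonet).
adj = build_adjacency([
    ("a", "b", 2.0), ("a", "c", 1.0), ("a", "d", 1.0),
    ("b", "c", 1.0), ("d", "e", 3.0),
])
features = egonet_features(adj, "a")
print(features)
```

Computing these features for every node gives one point per node in each feature pair (for example N versus E), which is the input the patterns on the following slides work with.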

Pattern: Ego-net Power Law Density: $E_i \propto N_i^{\alpha}$, $1 \le \alpha \le 2$. Plot annotation: Enron CEO. (Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010)
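The bounds on the exponent have a simple reading: an egonet that is a pure star has exactly N edges (exponent 1), while an egonet that is a full clique has N(N+1)/2 edges (exponent approaching 2). A minimal sketch of how the pattern could be used to flag nodes is given below. It fits the power law by least squares in log-log space and scores each node by how far its edge count is from the fitted value. The score used here (absolute log-ratio) is a simplified stand-in chosen for illustration, not the scoring function of the OddBall paper; the function names and the toy data are hypothetical.

```python
from math import exp, log


def fit_power_law(ns, es):
    """Fit E = C * N**alpha by least squares on (log N, log E)."""
    xs = [log(n) for n in ns]
    ys = [log(e) for e in es]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = exp(mean_y - alpha * mean_x)
    return c, alpha


def deviation_scores(ns, es, c, alpha):
    """Absolute log-ratio between observed and fitted edge counts."""
    return [abs(log(e) - log(c * n ** alpha)) for n, e in zip(ns, es)]


# Toy data: six egonets that follow E = N**1.5, plus one near-clique
# with N = 16 neighbors and E = 16 * 17 / 2 = 136 edges.
ns = [2, 4, 8, 16, 32, 64, 16]
es = [n ** 1.5 for n in ns[:6]] + [136.0]

c, alpha = fit_power_law(ns, es)
scores = deviation_scores(ns, es, c, alpha)
most_suspicious = scores.index(max(scores))
print(alpha, most_suspicious)
```

Nodes far above the fitted line look like near-cliques and nodes far below it look like near-stars, which matches the two kinds of strange egonets named earlier.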

Pattern: Ego-net Power Law Density. (Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010)

Roadmap: Introduction – Motivation; Part#1: Graphs; …; P1.3: community detection; P1.4: fraud/anomaly detection (Outliers; Lock-step behavior); P1.5: belief propagation

Problem. Given: [figure: graph]. Find: Outliers; Lock-step.

P1.4.2. How to find ‘suspicious’ groups? ‘blocks’ are normal, right? [Figure: adjacency matrix, axes: fans, idols]

Except that: ‘blocks’ are normal, right? ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14].

Except that: ‘blocks’ are usually suspicious; ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]. Q: Can we spot blocks, easily?

Except that: ‘blocks’ are usually suspicious; ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]. Q: Can we spot blocks, easily? A: Silver bullet: SVD!

From SALSA: Why HITS fixates on dense blocks (‘Tightly Knit Community’, TKC; often link farms). Figure label: ‘Should win, but doesn’t under HITS’.

Crash intro to SVD (from: HITS). Recall: (SVD) matrix factorization finds blocks. [Figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]

Crash intro to SVD. (SVD) matrix factorization finds blocks. A) Even if shuffled! [Figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]
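The claim that the factorization finds a block even after shuffling can be checked on a toy matrix. The sketch below is an illustration under stated assumptions, not code from the tutorial: it builds a small fans × idols matrix with two hypothetical blocks, shuffles rows and columns with a fixed seed, and recovers the larger block from the leading singular vectors. To stay dependency-free it uses power iteration instead of a library SVD routine; the 0.1 threshold and all function names are arbitrary choices for this example.

```python
import random
from math import sqrt


def top_singular_triplet(matrix, iterations=100):
    """Leading singular value and vectors via alternating power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / sqrt(n_cols)] * n_cols
    u = [0.0] * n_rows
    sigma = 0.0
    for _ in range(iterations):
        # u <- A v, normalised; its norm estimates the singular value.
        u = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        sigma = sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            break
        u = [x / sigma for x in u]
        # v <- A^T u, normalised.
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return sigma, u, v


def block_matrix():
    """8 fans x 6 idols: fans 0-4 follow idols 0-2, fans 5-7 follow idols 3-5."""
    big = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    small = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    return [list(big) if fan < 5 else list(small) for fan in range(8)]


def shuffle_matrix(matrix, seed=0):
    """Permute rows and columns; return the matrix and both permutations."""
    rng = random.Random(seed)
    row_perm = list(range(len(matrix)))
    col_perm = list(range(len(matrix[0])))
    rng.shuffle(row_perm)
    rng.shuffle(col_perm)
    shuffled = [[matrix[i][j] for j in col_perm] for i in row_perm]
    return shuffled, row_perm, col_perm


shuffled, row_perm, col_perm = shuffle_matrix(block_matrix())
sigma, u, v = top_singular_triplet(shuffled)

# Map the large entries of u and v back to the original fan and idol ids.
fans_found = {row_perm[r] for r, x in enumerate(u) if abs(x) > 0.1}
idols_found = {col_perm[c] for c, x in enumerate(v) if abs(x) > 0.1}
print(sorted(fans_found), sorted(idols_found))
```

The permutation only reorders the entries of the singular vectors, so the members of the dominant block still stand out as the nonzero coordinates. Further blocks would be recovered from the next singular vectors, which this sketch does not compute.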

Crash intro to SVD. (SVD) matrix factorization finds blocks. B) Even if ‘salt+pepper’ noise. [Figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]

Toy example – 5 blocks (from: HITS). EigenPlots. [Figure: ‘fans’ × ‘idols’ matrix with scatter plots of the singular vectors: u0 vs. u1 and v0 vs. v1]

Inferring Strange Behavior from Connectivity Pattern in Social Networks, PAKDD’14. Meng Jiang, Peng Cui, Shiqiang Yang (Tsinghua); Alex Beutel, Christos Faloutsos (CMU)

Real Data: Spikes on the out-degree distribution
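One way to look for such spikes is to build the out-degree histogram and flag degree values that are held by far more nodes than the neighboring degree values. The sketch below is a hypothetical heuristic for illustration only, not the method of the cited paper: the `factor` threshold, the function names and the synthetic edge list are all assumptions, and the edge list is assumed to contain no duplicate edges.

```python
from collections import Counter


def out_degree_histogram(edges):
    """Map each out-degree value to the number of nodes that have it."""
    out_degree = Counter(src for src, _dst in edges)
    return Counter(out_degree.values())


def find_spikes(histogram, factor=5.0):
    """Return degrees whose node count exceeds `factor` times the mean
    count of the nearest smaller and nearest larger observed degree."""
    degrees = sorted(histogram)
    spikes = []
    for pos in range(1, len(degrees) - 1):
        around = (histogram[degrees[pos - 1]]
                  + histogram[degrees[pos + 1]]) / 2.0
        if histogram[degrees[pos]] > factor * around:
            spikes.append(degrees[pos])
    return spikes


# Synthetic data: node counts fall off with degree, except for 30 nodes
# that all follow exactly 4 targets.
nodes_per_degree = {1: 16, 2: 8, 3: 4, 4: 30, 5: 1}
edges = []
for degree, count in nodes_per_degree.items():
    for k in range(count):
        source = f"n{degree}_{k}"
        edges.extend((source, f"t{j}") for j in range(degree))

histogram = out_degree_histogram(edges)
spikes = find_spikes(histogram)
print(spikes)
```

A flagged degree value is only a starting point; the nodes sharing it still have to be inspected, for example to see whether they follow the same targets.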

GraphRAD: A Graph-based Risky Account Detection System. Jun Ma, Danqing Zhang, Yun Wang, Yan Zhang, Alexey Pozdnoukhov. MLG’18 Workshop, 08/20/2018 (Tomorrow 4:00 pm), ICC Capital Suite Room 8.

Input: Gigantic account link graph → Community Detection → Semi-supervised Suspicious Detection → Output: small candidate graphs for manual check

Solution. Given: [figure: graph]. Find: Outliers (OddBall); Lock-step (SVD).

Roadmap: Introduction – Motivation; Part#1: Graphs; …; P1.3: community detection; P1.4: fraud/anomaly detection; P1.5: belief propagation (annotations: un-supervised; semi-supervised)