Graph and Tensor Mining for fun and profit

Slides:



Advertisements
Similar presentations
BiG-Align: Fast Bipartite Graph Alignment
Advertisements

CMU SCS Large Graph Mining - Patterns, Explanations and Cascade Analysis Christos Faloutsos CMU.
CMU SCS I2.2 Large Scale Information Network Processing INARC 1 Overview Goal: scalable algorithms to find patterns and anomalies on graphs 1. Mining Large.
School of Computer Science Carnegie Mellon University Duke University DeltaCon: A Principled Massive- Graph Similarity Function Danai Koutra Joshua T.
School of Computer Science Carnegie Mellon University National Taiwan University of Science & Technology Unifying Guilt-by-Association Approaches: Theorems.
CMU SCS : Multimedia Databases and Data Mining Lecture #21: Tensor decompositions C. Faloutsos.
CMU SCS Mining Billion-Node Graphs - Patterns and Algorithms Christos Faloutsos CMU.
Endend endend Carnegie Mellon University Korea Advanced Institute of Science and Technology VoG: Summarizing and Understanding Large Graphs Danai Koutra.
CMU SCS Large Graph Mining - Patterns, Tools and Cascade Analysis Christos Faloutsos CMU.
CMU SCS C. Faloutsos (CMU)#1 Large Graph Algorithms Christos Faloutsos CMU McGlohon, Mary Prakash, Aditya Tong, Hanghang Tsourakakis, Babis Akoglu, Leman.
NetMine: Mining Tools for Large Graphs Deepayan Chakrabarti Yiping Zhan Daniel Blandford Christos Faloutsos Guy Blelloch.
N EIGHBORHOOD F ORMATION AND A NOMALY D ETECTION IN B IPARTITE G RAPHS Jimeng Sun, Huiming Qu, Deepayan Chakrabarti & Christos Faloutsos Jimeng Sun, Huiming.
CMU SCS Mining Billion-node Graphs Christos Faloutsos CMU.
Social Networks and Graph Mining Christos Faloutsos CMU - MLD.
CMU SCS KDD 2006Leskovec & Faloutsos1 ??. CMU SCS KDD 2006Leskovec & Faloutsos2 Sampling from Large Graphs poster# 305 Jurij (Jure) Leskovec Christos.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
CMU SCS Graph and stream mining Christos Faloutsos CMU.
Fast Random Walk with Restart and Its Applications
SCS CMU Joint Work by Hanghang Tong, Yasushi Sakurai, Tina Eliassi-Rad, Christos Faloutsos Speaker: Hanghang Tong Oct , 2008, Napa, CA CIKM 2008.
CMU SCS Large Graph Mining – Patterns, Tools and Cascade analysis Christos Faloutsos CMU.
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.
CMU SCS Large Graph Mining - Patterns, Tools and Cascade Analysis Christos Faloutsos CMU.
CMU SCS Big (graph) data analytics Christos Faloutsos CMU.
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
School of Computer Science Carnegie Mellon University National Taiwan University of Science & Technology Unifying Guilt-by-Association Approaches: Theorems.
CMU SCS Mining Billion-node Graphs: Patterns, Generators and Tools Christos Faloutsos CMU.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
CMU SCS U Kang (CMU) 1KDD 2012 GigaTensor: Scaling Tensor Analysis Up By 100 Times – Algorithms and Discoveries U Kang Christos Faloutsos School of Computer.
CMU SCS Big (graph) data analytics Christos Faloutsos CMU.
CMU SCS Mining Billion Node Graphs Christos Faloutsos CMU.
EVENT DETECTION IN TIME SERIES OF MOBILE COMMUNICATION GRAPHS
CMU SCS Mining Large Graphs: Fraud Detection, and Algorithms Christos Faloutsos CMU.
CMU SCS KDD '09Faloutsos, Miller, Tsourakakis P5-1 Large Graph Mining: Power Tools and a Practitioner’s guide Task 5: Graphs over time & tensors Faloutsos,
Hanghang Tong, Brian Gallagher, Christos Faloutsos, Tina Eliassi-Rad
ParCube: Sparse Parallelizable Tensor Decompositions
Are All Brains Wired Equally Danai Koutra Yu GongJoshua VogelsteinChristos Faloutsos Motivation Connectomics -- creation of brain connectivity maps. Analysing.
Du, Faloutsos, Wang, Akoglu Large Human Communication Networks Patterns and a Utility-Driven Generator Nan Du 1,2, Christos Faloutsos 2, Bai Wang 1, Leman.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
Single-Pass Belief Propagation
CMU SCS Patterns, Anomalies, and Fraud Detection in Large Graphs Christos Faloutsos CMU.
CMU SCS Mining Large Social Networks: Patterns and Anomalies Christos Faloutsos CMU.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
CMU SCS Panel: Social Networks Christos Faloutsos CMU.
CMU SCS KDD '09Faloutsos, Miller, Tsourakakis P8-1 Large Graph Mining: Power Tools and a Practitioner’s guide Task 8: hadoop and Tera/Peta byte graphs.
Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena 1.
CMU SCS Anomaly Detection in Large Graphs Christos Faloutsos CMU.
Large Graph Mining: Power Tools and a Practitioner’s guide
15-826: Multimedia Databases and Data Mining
Anomaly detection in large graphs
Anomaly detection in large graphs
NetMine: Mining Tools for Large Graphs
Kijung Shin1 Mohammad Hammoud1
Graph and Tensor Mining for fun and profit
Hanghang Tong, Brian Gallagher, Christos Faloutsos, Tina Eliassi-Rad
Large Graph Mining: Power Tools and a Practitioner’s guide
Large Graph Mining: Power Tools and a Practitioner’s guide
Part 1: Graph Mining – patterns
R-MAT: A Recursive Model for Graph Mining
15-826: Multimedia Databases and Data Mining
Graph and Tensor Mining for fun and profit
Graph and Tensor Mining for fun and profit
Roadmap Introduction – Motivation Part#1: Graphs
Graph and Tensor Mining for fun and profit
Graph and Tensor Mining for fun and profit
Graph and Tensor Mining for fun and profit
15-826: Multimedia Databases and Data Mining
Large Graph Mining: Power Tools and a Practitioner’s guide
Learning to Rank Typed Graph Walks: Local and Global Approaches
Presentation transcript:

Graph and Tensor Mining for fun and profit Faloutsos Graph and Tensor Mining for fun and profit Luna Dong, Christos Faloutsos Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors and Knowledge Bases Conclusions – Future research KDD, 2018 Dong+

Over-arching conclusion MANY, time-tested, algorithms for graph & tensor mining (more, are needed) … Acted_in dates directed produced born_in KDD, 2018 Dong+

Over-arching conclusion … Acted_in dates directed produced born_in Problems = + (some) solutions KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs P1.1: properties/patterns in graphs P1.2: node importance P1.3: community detection P1.4: fraud/anomaly detection P1.5: belief propagation ? ? KDD, 2018 Dong+

Problem definition Are real graphs random? S*: what do static graphs look like? T*: how do graphs evolve over time? KDD, 2018 Dong+

Short answer(s) Are real graphs random? S*: what do static graphs look like? S.0: ‘six degrees’ S.1: skewed degree distribution S.2: skewed eigenvalues S.3: triangle power-laws S.4: GCC; and skewed distr. of conn. comp. T*: how do graphs evolve over time? T.1: diameters T.2: densification KDD, 2018 Dong+

Power laws: y ~ xa Short answer(s) Take logarithms NOT Gaussians ? Short answer(s) Are real graphs random? S*: what do static graphs look like? S.0: ‘six degrees’ S.1: skewed degree distribution S.2: skewed eigenvalues S.3: triangle power-laws S.4: GCC; and skewed distr. of conn. comp. T*: how do graphs evolve over time? T.1: diameters T.2: densification Power laws: y ~ xa NOT Gaussians Take logarithms y (log scale) a x (log scale) KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs P1.1: properties/patterns in graphs P1.2: node importance P1.3: community detection P1.4: fraud/anomaly detection P1.5: belief propagation ? KDD, 2018 Dong+

Node importance - Motivation: Given a graph (eg., web pages containing the desirable query word) Q: Which node is the most important? KDD, 2018 Dong+

Node importance - Motivation: Given a graph (eg., web pages containing the desirable query word) Q: Which node is the most important? A1: PageRank (PR) A2: HITS A3: SALSA KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs P1.1: properties/patterns in graphs P1.2: node importance P1.3: community detection P1.4: fraud/anomaly detection P1.5: belief propagation ? KDD, 2018 Dong+

Problem Definition Given a graph, and k Break it into k (disjoint) communities KDD, 2018 Dong+

Short answer METIS [Karypis, Kumar] (but: maybe NO good cuts exist!) KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs P1.1: properties/patterns in graphs P1.2: node importance P1.3: community detection P1.4: fraud/anomaly detection P1.5: belief propagation ? KDD, 2018 Dong+

Problem Given: Find: Outliers Lock-step KDD, 2018 Dong+

Solution Given: Find: Outliers Lock-step OddBall SVD KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs P1.1: properties/patterns in graphs P1.2: node importance P1.3: community detection P1.4: fraud/anomaly detection P1.5: belief propagation KDD, 2018 Dong+

Problem What color, for the rest? Given homophily (/heterophily etc)? KDD, 2018 Dong+

Short answer: What color, for the rest? A: Belief Propagation (‘zooBP’) www.cs.cmu.edu/~deswaran/code/zoobp.zip KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors Faloutsos Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors P2.1: Basics (dfn, PARAFAC, HAR) P2.2: Embeddings & mining P2.3: Inference Conclusions KDD, 2018 Dong+

… Problem dfn What is ‘normal’? suspicious? Groups? 3am, 4/1 3am, 4/1 C. Faloutsos Problem dfn What is ‘normal’? suspicious? Groups? … 3am, 4/1 3am, 4/1 10pm, 4/3 11pm, 4/3 KDD, 2018 Dong+

Recipe(s) What is ‘normal’? suspicious? Groups? C. Faloutsos Recipe(s) What is ‘normal’? suspicious? Groups? A: tensor factorization (~ high-mode SVD) PARAFAC (shown here) Tucker = + customer product timestamp shoes Apple fans jewelry … 3am, 4/1 10pm, 4/3 11pm, 4/3 KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors Faloutsos Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors P2.1: Basics (dfn, PARAFAC) P2.2: Embeddings & mining P2.3: Inference Conclusions KDD, 2018 Dong+

P2.2: Problem: embedding Given entities & predicates, find mappings … locations … Acted_in dates directed produced born_in artists relations+ directed Acted_in dates Artistic-pred. KDD, 2018 Dong+

State-of-the-art results Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption State-of-the-art results Multiplication model Open-world assumption KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors Faloutsos Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors P2.1: Basics (dfn, PARAFAC) P2.2: Embeddings & mining P2.3: Inference Conclusions KDD, 2018 Dong+

Problem Definition Given existing triples Q: Is a given triple correct? KDD, 2018 Dong+

Conclusion/Short answer Infer from other connecting paths Path 1 Path 2 Prec 1 Rec 0.01 F1 0.03 Weight 2.62 Prec 0.03 Rec 0.33 F1 0.04 Weight 2.19 KDD, 2018 Dong+

Roadmap Introduction – Motivation Part#1: Graphs Faloutsos Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors and Knowledge Bases Conclusions – Future directions F1: forecasting F2: unifying factual info and behavior info KDD, 2018 Dong+

All of us every single year, we're a different person. Future work – F1 Forecasting: Who will buy what, and when? All of us every single year, we're a different person. … 3am, 4/1 10pm, 4/3 11pm, 4/3 KDD, 2018 Dong+

Future work – F2 Customer info + behavior + Product Graph … … directed 3am, 4/1 10pm, 4/3 11pm, 4/3 … Acted_in dates directed produced born_in KDD, 2018 Dong+

Future work – F2 Customer info + behavior + Product Graph … … directed 3am, 4/1 10pm, 4/3 11pm, 4/3 … Acted_in dates directed produced born_in KDD, 2018 Dong+

Thanks to Danai Koutra U. Michigan Namyong Park CMU Dhivya Eswaran CMU Hyun Ah Song CMU Vagelis Papalexakis UCR Rakshit Trivedi George Tech KDD, 2018 Dong+

P1 – Graphs - More references Danai Koutra and Christos Faloutsos, Individual and Collective Graph Mining: Principles, Algorithms, and Applications October 2017, Morgan Claypool KDD, 2018 Dong+

P1 – Graphs - More references Deepayan Chakrabarti and Christos Faloutsos, Graph Mining: Laws, Tools, and Case Studies Oct. 2012, Morgan Claypool. KDD, 2018 Dong+

P1 – Graphs - More references Anomaly detection Leman Akoglu, Hanghang Tong, & Danai Koutra, Graph based anomaly detection and description: a survey Data Mining and Knowledge Discovery (2015) 29: 626. Arxiv version: https://arxiv.org/abs/1404.4679 KDD, 2018 Dong+

P2 – Tensors - More references Tensor survey Tamara G. Kolda and Brett W. Bader Tensor Decompositions and Applications SIAM Rev., 51(3), pp 455–500, 2009. KDD, 2018 Dong+

P2 – Tensors - More references Tensor survey #2 Nicholas D. Sidiropoulos, Lieven De Lathauwer,, Xiao Fu,, Kejun Huang, Evangelos E. Papalexakis, and Christos Faloutsos, Tensor Decomposition for Signal Processing and Machine Learning, IEEE TSP, 65(13), July 1, 2017 KDD, 2018 Dong+

THANK YOU! = + … Problems (some) solutions KDD, 2018 Dong+ directed Acted_in dates directed produced born_in Problems = + (some) solutions KDD, 2018 Dong+