CMU SCS Large Graph Mining - Patterns, Tools and Cascade Analysis Christos Faloutsos CMU.

Presentation transcript:

CMU SCS Large Graph Mining - Patterns, Tools and Cascade Analysis Christos Faloutsos CMU

CMU SCS - C. Faloutsos (CMU) - BT, June 2013
Roadmap
- Introduction - Motivation
  - Why study (big) graphs?
- Part #1: Patterns in graphs
- Part #2: Cascade analysis
- Conclusions
- [Extra: eBay fraud; tensors; spikes]

EXTRAS - Tools
- eBay fraud detection: network effects (‘Belief propagation’)
- NELL analysis (Never Ending Language Learner): Tensors - GigaTensor
- Spike analysis and forecasting: ‘SpikeM’

eBay fraud detection, with Polo Chau & Shashank Pandit, CMU [WWW’07]

eBay fraud detection [two figure-only slides; images not captured in this transcript]

eBay fraud detection - NetProbe

eBay fraud detection - NetProbe (details)
Compatibility matrix (heterophily) over the states F, A, H. Only two entries survive in this transcript: 99% (row F) and 49% (row H).
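The network-effects idea on this slide can be sketched with a small belief propagation routine. This is an illustrative sketch, not NetProbe itself: the reading of F, A, H as fraudster, accomplice, honest, the compatibility values in `PSI`, the toy three-node chain, and the function name `belief_propagation` are all assumptions chosen for the example.

```python
import numpy as np

STATES = ["F", "A", "H"]  # assumed reading: fraudster, accomplice, honest

# Hypothetical heterophily compatibility matrix: PSI[s, t] is how compatible
# a node in state s is with a neighbor in state t. Fraudsters mostly link to
# accomplices rather than to each other. Values are illustrative only.
PSI = np.array([
    [0.05, 0.900, 0.050],
    [0.50, 0.100, 0.400],
    [0.05, 0.475, 0.475],
])

def belief_propagation(edges, priors, psi, iters=10):
    """Synchronous loopy belief propagation on an undirected graph."""
    n, k = priors.shape
    nbrs = {i: [] for i in range(n)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # One message per directed edge, initialised to uniform.
    msgs = {(i, j): np.full(k, 1.0 / k) for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new_msgs = {}
        for (i, j) in msgs:
            # Combine i's prior with messages from all neighbors except j.
            prod = priors[i].copy()
            for m in nbrs[i]:
                if m != j:
                    prod *= msgs[(m, i)]
            out = prod @ psi  # sum over i's states
            new_msgs[(i, j)] = out / out.sum()
        msgs = new_msgs
    beliefs = np.empty_like(priors)
    for i in range(n):
        b = priors[i].copy()
        for m in nbrs[i]:
            b *= msgs[(m, i)]
        beliefs[i] = b / b.sum()
    return beliefs

# Toy chain 0 - 1 - 2: node 0 is a suspected fraudster, the others are unknown.
priors = np.array([
    [0.90, 0.05, 0.05],
    [1 / 3, 1 / 3, 1 / 3],
    [1 / 3, 1 / 3, 1 / 3],
])
beliefs = belief_propagation([(0, 1), (1, 2)], priors, PSI)
for node, b in enumerate(beliefs):
    print(node, STATES[int(np.argmax(b))], np.round(b, 3))
```

With these assumed values, the node adjacent to the suspected fraudster ends up most likely in state A, which is the heterophily effect the compatibility matrix encodes.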

Popular press
And less desirable attention: from ‘Belgium police’ (‘copy of your code?’)

EXTRAS - Tools
- eBay fraud detection: network effects (‘Belief propagation’)
- NELL analysis (Never Ending Language Learner): Tensors - GigaTensor
- Spike analysis and forecasting: ‘SpikeM’

GigaTensor: Scaling Tensor Analysis Up By 100 Times - Algorithms and Discoveries
U Kang, Christos Faloutsos, Evangelos Papalexakis, Abhay Harpale (KDD’12)

Background: Tensor
Tensors (= multi-dimensional arrays) are everywhere
- Hyperlinks & anchor text [Kolda+,05]
[Figure: 3-mode tensor with axes URL 1, URL 2, Anchor Text; example anchor terms: Java, C++, C#]

Background: Tensor
Tensors (= multi-dimensional arrays) are everywhere
- Sensor stream (time, location, type)
- Predicates (subject, verb, object) in a knowledge base, e.g. “Barack Obama is the president of U.S.”, “Eric Clapton plays guitar”
[Figure: NELL (Never Ending Language Learner) data as a 3-mode tensor; dimension labels (26M), (48M); nonzeros = 144M]

Background: Tensor
Tensors (= multi-dimensional arrays) are everywhere
- Sensor stream (time, location, type)
- Predicates (subject, verb, object) in a knowledge base
[Figure: anomaly detection in computer networks; tensor axes IP-source, IP-destination, Time-stamp]
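To make the (subject, verb, object) view concrete, here is a minimal sketch of storing such predicates as a sparse 3-mode tensor in coordinate form. The function name `build_tensor` and the repeated third triple are assumptions for illustration; the first two triples are the examples quoted on the slides.

```python
from collections import defaultdict

def build_tensor(triples):
    """Store (subject, verb, object) triples as a sparse 3-mode count tensor.

    Returns (entries, shape, index): entries maps an (i, j, k) coordinate to
    its count, and index[m] maps each term of mode m to its integer id.
    """
    index = [{}, {}, {}]          # one vocabulary per mode
    entries = defaultdict(int)    # only nonzero cells are stored
    for triple in triples:
        coord = tuple(
            index[mode].setdefault(term, len(index[mode]))
            for mode, term in enumerate(triple)
        )
        entries[coord] += 1
    shape = tuple(len(vocab) for vocab in index)
    return dict(entries), shape, index

triples = [
    ("Barack Obama", "is the president of", "U.S."),
    ("Eric Clapton", "plays", "guitar"),
    ("Eric Clapton", "plays", "guitar"),  # hypothetical repeated mention
]
entries, shape, index = build_tensor(triples)
print(shape, entries)
```

Storing only the nonzero coordinates is what makes a tensor with tens of millions of rows per mode representable at all; a dense array of that shape would not fit in memory.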

Problem Definition
How to decompose a billion-scale tensor?
- Corresponds to SVD in the 2D case

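As a small-scale illustration of what "decomposing a tensor" means, the sketch below fits a CP/PARAFAC model with alternating least squares on a dense in-memory array. It is not the GigaTensor algorithm (which targets sparse data that exceeds RAM); the choice of CP, the function names `cp_als` and `reconstruct`, and the synthetic data are assumptions for the example.

```python
import numpy as np

def cp_als(X, rank, iters=100, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(iters):
        # Each update solves a linear least-squares problem for one factor
        # while the other two are held fixed.
        A = np.einsum("ijk,jr,kr->ir", X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def reconstruct(A, B, C):
    """Rebuild the dense tensor from its three factor matrices."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def relative_error(X, A, B, C):
    return np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)

# Synthetic rank-2 tensor built from known (random) factors.
rng = np.random.default_rng(42)
true_factors = [rng.standard_normal((n, 2)) for n in (6, 5, 4)]
X = reconstruct(*true_factors)
A, B, C = cp_als(X, rank=2)
print("relative reconstruction error:", relative_error(X, A, B, C))
```

In the 2D case the same idea reduces to a low-rank matrix factorization, which is the sense in which the slide relates tensor decomposition to SVD.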

Problem Definition
- Q1: Dominant concepts/topics?
- Q2: Find synonyms to a given noun phrase?
- (and how to scale up: |data| > RAM)
[Figure: NELL (Never Ending Language Learner) data; dimension labels (26M), (48M); nonzeros = 144M]

Experiments
GigaTensor solves a 100x larger problem
[Figure: scalability on an (I) x (J) x (K) tensor with number of nonzeros = I / 50; Tensor Toolbox runs out of memory, GigaTensor handles a 100x larger tensor]

A1: Concept Discovery
Concept discovery in knowledge base

A1: Concept Discovery

A2: Synonym Discovery
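The transcript does not record how synonyms are scored. One plausible approach, shown here purely as an assumption, is to compare noun phrases by the cosine similarity of their rows in a factor matrix from the decomposition. The factor values, the placeholder phrase names, and the function name `most_similar` are hypothetical.

```python
import numpy as np

def most_similar(factor, names, query, top_k=2):
    """Rank other rows of `factor` by cosine similarity to the query row."""
    unit = factor / np.linalg.norm(factor, axis=1, keepdims=True)
    q = names.index(query)
    sims = unit @ unit[q]
    order = [i for i in np.argsort(-sims) if i != q]  # drop the query itself
    return [(names[i], float(sims[i])) for i in order[:top_k]]

# Hypothetical factor matrix: one row per noun phrase, one column per concept.
names = ["np_a", "np_b", "np_c", "np_d"]
factor = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.9, 0.1],
])
ranking = most_similar(factor, names, "np_a")
print(ranking)
```

Rows that load on the same concepts point in similar directions, so phrases used in similar contexts score close to 1 even if they never co-occur.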

EXTRAS - Tools
- eBay fraud detection: network effects (‘Belief propagation’)
- NELL analysis (Never Ending Language Learner): Tensors - GigaTensor
- Spike analysis and forecasting: ‘SpikeM’

Rise and fall patterns in social media
- Meme (# of mentions in blogs)
  - short phrases sourced from U.S. politics, e.g. “you can put lipstick on a pig”, “yes we can”

Rise and fall patterns in social media
Can we find a unifying model that includes these patterns?
- four classes on YouTube [Crane et al. ’08]
- six classes on Meme [Yang et al. ’11]

Rise and fall patterns in social media
Answer: YES! We can represent all patterns by a single model [Matsubara+ SIGKDD 2012]

Main idea - SpikeM
- 1. Un-informed bloggers (uninformed about rumor)
- 2. External shock at time n_b (e.g., breaking news)
- 3. Infection (word-of-mouth)
Infectiveness of a blog-post at age n:
- Strength of infection β (quality of news)
- Decay function
[Figure: timeline at Time n=0, Time n=n_b, Time n=n_b+1]

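The three ingredients above (un-informed bloggers, an external shock, word-of-mouth infection with decaying infectiveness) can be simulated in a few lines. This sketch assumes a power-law decay f(age) = β · age^(-1.5) and the update rule as I recall it from the cited KDD’12 paper; the transcript itself does not contain the equation, and all parameter values and the function name `spikem` are hypothetical.

```python
import numpy as np

def spikem(N, beta, n_b, S_b, eps, T):
    """Simulate newly informed bloggers dB[n] and un-informed bloggers U[n].

    N: total bloggers, beta: infection strength, n_b: time of the external
    shock, S_b: size of the shock, eps: background noise, T: number of ticks.
    """
    U = np.zeros(T)    # un-informed bloggers
    dB = np.zeros(T)   # bloggers who become informed at each tick
    S = np.zeros(T)    # external shock, nonzero only at n_b
    U[0] = N
    S[n_b] = S_b
    for n in range(T - 1):
        influence = 0.0
        if n >= n_b:
            t = np.arange(n_b, n + 1)
            age = (n + 1 - t).astype(float)
            # Older posts infect less: assumed power-law decay with exponent -1.5.
            influence = np.sum((dB[t] + S[t]) * beta * age ** -1.5)
        new = min(U[n] * influence + eps, U[n])  # cannot inform more than remain
        dB[n + 1] = new
        U[n + 1] = U[n] - new
    return dB, U

N, n_b = 1000, 5
dB, U = spikem(N=N, beta=8e-4, n_b=n_b, S_b=10.0, eps=0.0, T=200)
print("peak at tick", int(np.argmax(dB)), "with", round(float(dB.max()), 2), "new bloggers")
```

Because each tick only moves bloggers from the un-informed pool to the informed one, the total population is conserved, which is a quick sanity check on any implementation.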

[Figure: log-log plot of Prob(RT > x) vs. response time, with a fitted slope (value not captured)]
J. G. Oliveira & A.-L. Barabási. Human Dynamics: The Correspondence Patterns of Darwin and Einstein. Nature 437, 1251 (2005).

SpikeM - with periodicity
Full equation of SpikeM [equation not captured in this transcript]
Periodicity: bloggers change their activity over time (e.g., daily, weekly, yearly)
[Figure: activity vs. time n, with a peak at noon and a dip at 3am]
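A simple way to model the daily rhythm is a sinusoidal activity factor that scales how many bloggers are reachable at each tick. The form below, p(n) = 1 - ½ · P_a · (sin(2π/P_p · (n + P_s)) + 1), is an assumption based on my recollection of the cited paper; the parameter names P_a (amplitude), P_p (period), P_s (phase shift), their values, and the function name `activity` are illustrative.

```python
import numpy as np

def activity(n, P_a, P_p, P_s):
    """Periodic activity factor in [1 - P_a, 1] with period P_p and shift P_s."""
    return 1.0 - 0.5 * P_a * (np.sin(2 * np.pi / P_p * (n + P_s)) + 1.0)

# Hourly ticks over one day; P_s = 3 is chosen so the dip falls at 3am.
hours = np.arange(24)
p = activity(hours, P_a=0.6, P_p=24, P_s=3)
print("dip at hour", int(np.argmin(p)), "peak at hour", int(np.argmax(p)))
```

A single sinusoid places the peak exactly half a period after the dip, so matching both a noon peak and a 3am dip would need a richer periodic term than this sketch uses.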

Analysis - exponential rise and power-law fall (details)
Rise part (lin-log and log-log plots):
- SI -> exponential
- SpikeM -> exponential

Analysis - exponential rise and power-law fall (details)
Fall part (lin-log and log-log plots):
- SI -> exponential
- SpikeM -> power law
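The lin-log versus log-log distinction can be checked numerically: an exponential is a straight line on lin-log axes, while a power law is a straight line on log-log axes. The sketch below fits both slopes on synthetic curves; the decay rate 0.3, the exponent -1.5, and the helper names are assumptions for illustration.

```python
import numpy as np

def slope_lin_log(t, y):
    """Slope of log(y) against t; constant for an exponential."""
    return np.polyfit(t, np.log(y), 1)[0]

def slope_log_log(t, y):
    """Slope of log(y) against log(t); constant for a power law."""
    return np.polyfit(np.log(t), np.log(y), 1)[0]

t = np.arange(1, 101, dtype=float)
exp_fall = np.exp(-0.3 * t)   # exponential decay, as in the SI model's fall
pow_fall = t ** -1.5          # power-law decay, as in SpikeM's fall

print("exponential, lin-log slope:", slope_lin_log(t, exp_fall))
print("power law,   log-log slope:", slope_log_log(t, pow_fall))
```

On real data, the curve whose residuals are smaller under one of the two fits indicates which kind of decay better describes the tail.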

Tail-part forecasts
SpikeM can capture the tail part

“What-if” forecasting
e.g., given (1) first spike, (2) release date of two sequel movies, (3) access volume before the release date
[Figure: (1) First spike, (2) Release date, (3) Two weeks before release; the unknown upcoming spikes are marked “?”]

“What-if” forecasting
SpikeM can forecast upcoming spikes
[Figure: (1) First spike, (2) Release date, (3) Two weeks before release]

References
- Leman Akoglu, Christos Faloutsos: RTG: A Recursive Realistic Graph Generator Using Random Typing. ECML/PKDD (1) 2009.
- Deepayan Chakrabarti, Christos Faloutsos: Graph mining: Laws, generators, and algorithms. ACM Comput. Surv. 38(1) (2006).
- Deepayan Chakrabarti, Yang Wang, Chenxi Wang, Jure Leskovec, Christos Faloutsos: Epidemic thresholds in real networks. ACM Trans. Inf. Syst. Secur. 10(4) (2008).
- Deepayan Chakrabarti, Jure Leskovec, Christos Faloutsos, Samuel Madden, Carlos Guestrin, Michalis Faloutsos: Information Survival Threshold in Sensor and P2P Networks. INFOCOM 2007.
- Christos Faloutsos, Tamara G. Kolda, Jimeng Sun: Mining large graphs and streams using matrix and tensor tools. Tutorial, SIGMOD Conference 2007: 1174.
- T. G. Kolda, J. Sun: Scalable Tensor Decompositions for Multi-aspect Data Mining. ICDM 2008, December.
- Jure Leskovec, Jon Kleinberg, Christos Faloutsos: Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. KDD 2005 (Best Research paper award).
- Jure Leskovec, Deepayan Chakrabarti, Jon M. Kleinberg, Christos Faloutsos: Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication. PKDD 2005.
- Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos: Rise and Fall Patterns of Information Diffusion: Model and Implications. KDD’12, pp. 6-14, Beijing, China, August 2012.
- Jimeng Sun, Dacheng Tao, Christos Faloutsos: Beyond streams and graphs: dynamic tensor analysis. KDD 2006.
- Hanghang Tong, Christos Faloutsos, Brian Gallagher, Tina Eliassi-Rad: Fast best-effort pattern matching in large attributed graphs. KDD 2007.
- Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad, Michalis Faloutsos, Christos Faloutsos: Gelling, and Melting, Large Graphs by Edge Manipulation. CIKM’12 (Best paper award), Maui, Hawaii, USA, Oct.
- Hanghang Tong, Spiros Papadimitriou, Christos Faloutsos, Philip S. Yu, Tina Eliassi-Rad: Gateway finder in large graphs: problem definitions and fast solutions. Inf. Retr. 15(3-4) (2012).

THE END (Really, this time)