Large Graph Mining: Power Tools and a Practitioner’s guide

Presentation transcript:

Large Graph Mining: Power Tools and a Practitioner’s guide
Task 6: Virus/Influence Propagation
Faloutsos, Miller, Tsourakakis (CMU), KDD'09

Outline
- Introduction: Motivation
- Task 1: Node importance
- Task 2: Community detection
- Task 3: Recommendations
- Task 4: Connection sub-graphs
- Task 5: Mining graphs over time
- Task 6: Virus/influence propagation
- Task 7: Spectral graph theory
- Task 8: Tera/peta graph mining: hadoop
- Observations: patterns of real graphs
- Conclusions

Detailed outline
- Epidemic threshold
  - Problem definition
  - Analysis
  - Experiments
- Fraud detection in e-bay

Virus propagation
- How do viruses/rumors propagate?
- Blog influence?
- Will a flu-like virus linger, or will it become extinct soon?

The model: SIS
- ‘Flu’-like: Susceptible-Infected-Susceptible
- Virus ‘strength’ s = β/δ
[Figure: node N with neighbors N1, N2, N3, switching between Healthy and Infected. Each infected neighbor attacks with prob. β; an infected node heals with prob. δ.]

Epidemic threshold τ of a graph: the value of τ such that, if the strength s = β/δ < τ, an epidemic cannot happen.
Thus: given a graph, compute its epidemic threshold.

Epidemic threshold τ
What should τ depend on?
- avg. degree?
- and/or highest degree?
- and/or variance of degree?
- and/or third moment of degree?
- and/or diameter?

Epidemic threshold
[Theorem 1] We have no epidemic (*) if β/δ < τ = 1/λ1,A, where
- τ: epidemic threshold
- β: attack prob.
- δ: recovery prob.
- λ1,A: largest eigenvalue of adj. matrix A
Proof: [Wang+03]
(*) under mild, conditional-independence assumptions
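As a rough illustration of Theorem 1, the threshold can be estimated directly from a graph. The sketch below assumes an undirected, connected graph stored as adjacency lists; the function names (`largest_eigenvalue`, `epidemic_threshold`, `is_below_threshold`) are hypothetical and not from the tutorial. It uses plain power iteration on A + I, where the shift is added only so that bipartite graphs such as stars do not oscillate.

```python
import math

def largest_eigenvalue(neighbors, iters=1000, tol=1e-12):
    """Estimate lambda_1 of a symmetric adjacency matrix by shifted power iteration."""
    n = len(neighbors)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        # Multiply by (A + I); the shift keeps the iteration from oscillating
        # on bipartite graphs, whose spectrum is symmetric around zero.
        w = [v[i] + sum(v[j] for j in neighbors[i]) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        # Rayleigh quotient w^T A w of the unshifted matrix.
        new_lam = sum(w[i] * sum(w[j] for j in neighbors[i]) for i in range(n))
        converged = abs(new_lam - lam) < tol
        v, lam = w, new_lam
        if converged:
            break
    return lam

def epidemic_threshold(neighbors):
    """tau = 1 / lambda_1 (infinite for a graph without edges)."""
    lam = largest_eigenvalue(neighbors)
    return math.inf if lam == 0.0 else 1.0 / lam

def is_below_threshold(neighbors, beta, delta):
    """True if the virus strength beta/delta is under the threshold of Theorem 1."""
    return beta / delta < epidemic_threshold(neighbors)

# Star with 4 leaves: node 0 is the center.
star4 = [[1, 2, 3, 4], [0], [0], [0], [0]]
print(epidemic_threshold(star4), is_below_threshold(star4, beta=0.1, delta=0.5))
```

For graphs of the size discussed in this tutorial, a sparse eigensolver would be the more realistic choice; the pure-Python loop is only meant to make the definition concrete.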

Beginning of proof (details)
Healthy @ t+1:
- (healthy or healed) @ t
- and not attacked @ t
Let p(i, t) = Prob. node i is sick @ t. Then
$$1 - p(i, t+1) = \bigl(1 - p(i, t) + p(i, t)\,\delta\bigr) \prod_j \bigl(1 - \beta\, a_{ji}\, p(j, t)\bigr)$$
where a_{ji} is the (j, i) entry of the adjacency matrix A.
Below threshold if the above non-linear dynamical system is ‘stable’ (largest eigenvalue of its Jacobian at p = 0 is < 1).
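The recurrence from the proof can be iterated numerically to see the two regimes. This is a minimal sketch, assuming an undirected graph given as adjacency lists (so a_ji = 1 exactly when j is a neighbor of i); the helper names and the parameter values are illustrative choices, not taken from the tutorial.

```python
def step(neighbors, p, beta, delta):
    """One step of: 1 - p(i,t+1) = (1 - p(i,t) + p(i,t)*delta) * prod_j (1 - beta*p(j,t))."""
    nxt = []
    for i, nbrs in enumerate(neighbors):
        healthy_or_healed = 1.0 - p[i] + p[i] * delta
        not_attacked = 1.0
        for j in nbrs:
            not_attacked *= 1.0 - beta * p[j]
        nxt.append(1.0 - healthy_or_healed * not_attacked)
    return nxt

def iterate(neighbors, beta, delta, steps, p0=0.5):
    """Start every node at infection probability p0 and apply `step` repeatedly."""
    p = [p0] * len(neighbors)
    for _ in range(steps):
        p = step(neighbors, p, beta, delta)
    return p

def complete_graph(n):
    return [[j for j in range(n) if j != i] for i in range(n)]

# Complete graph on 5 nodes: lambda_1 = 4, so tau = 1/4.
k5 = complete_graph(5)
below = iterate(k5, beta=0.05, delta=0.5, steps=200)  # beta/delta = 0.1 < tau
above = iterate(k5, beta=0.30, delta=0.5, steps=200)  # beta/delta = 0.6 > tau
print(max(below), min(above))
```

Below the threshold the probabilities shrink toward zero; above it they settle at a positive value, which is the instability of the all-healthy state that the proof refers to.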

Epidemic threshold for various networks
Formula includes older results as special cases:
- Homogeneous networks [Kephart+White]: λ1,A = <k>; τ = 1/<k> (<k>: avg degree)
- Star networks (d = degree of center): λ1,A = sqrt(d); τ = 1/sqrt(d)
- Infinite power-law networks: λ1,A = ∞; τ = 0 [Barabasi]
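The first two special cases above can be checked by hand: exhibit a strictly positive vector v with A v = λ v. For a connected graph, the Perron-Frobenius theorem then guarantees that λ is the largest eigenvalue. The sketch below does this check in code; the helper names and the example graphs (a ring as the homogeneous case, a 9-leaf star) are assumptions made for illustration.

```python
import math

def matvec(neighbors, v):
    """Multiply the adjacency matrix (given as adjacency lists) by v."""
    return [sum(v[j] for j in nbrs) for nbrs in neighbors]

def is_eigenpair(neighbors, v, lam, tol=1e-9):
    """Check A v = lam * v entry by entry."""
    av = matvec(neighbors, v)
    return all(math.isclose(a, lam * x, abs_tol=tol) for a, x in zip(av, v))

def star(d):
    """Center is node 0, leaves are nodes 1..d."""
    return [list(range(1, d + 1))] + [[0] for _ in range(d)]

def ring(n):
    """Every node has degree 2, a simple homogeneous network."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]

d = 9
star_vec = [math.sqrt(d)] + [1.0] * d   # center weight sqrt(d), leaves weight 1
ring_vec = [1.0] * 12                   # all-ones vector for a regular graph

print(is_eigenpair(star(d), star_vec, math.sqrt(d)))  # lambda_1 = sqrt(d), tau = 1/sqrt(d)
print(is_eigenpair(ring(12), ring_vec, 2.0))          # lambda_1 = <k> = 2, tau = 1/2
```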

Epidemic threshold
[Theorem 2] Below the epidemic threshold, the epidemic dies out exponentially.
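One way to see where the exponential rate comes from is to bound the recurrence from the proof slide by a linear system. The following is only a sketch, assuming an undirected graph (symmetric A) and probabilities β, δ in [0, 1]; the matrix name S is introduced here for convenience and may differ from the notation in [Wang+03].

```latex
% Uses \prod_j (1 - x_j) \ge 1 - \sum_j x_j for x_j \in [0, 1].
\begin{align*}
p(i,t+1) &= 1 - \bigl(1 - (1-\delta)\,p(i,t)\bigr)\prod_j \bigl(1 - \beta\,a_{ji}\,p(j,t)\bigr) \\
         &\le (1-\delta)\,p(i,t) + \beta \sum_j a_{ji}\,p(j,t).
\end{align*}
% In vector form, with S = (1-\delta) I + \beta A (entrywise nonnegative):
\begin{align*}
\mathbf{p}(t) \le S^{t}\,\mathbf{p}(0) \quad\text{(componentwise)},
\qquad
\|\mathbf{p}(t)\|_2 \le \bigl(1 - \delta + \beta\,\lambda_{1,A}\bigr)^{t}\,\|\mathbf{p}(0)\|_2 .
\end{align*}
% The decay factor is below 1 exactly when the strength is below the threshold:
\[
1 - \delta + \beta\,\lambda_{1,A} < 1
\iff
\frac{\beta}{\delta} < \frac{1}{\lambda_{1,A}} = \tau .
\]
```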


Current prediction vs. previous
[Figure: number of infected nodes vs. β/δ, one panel each for the Oregon and Star graphs, comparing "Our" prediction with "PL-3".]
The formula’s predictions are more accurate.

Experiments (Oregon)
- β/δ > τ (above threshold)
- β/δ = τ (at the threshold)
- β/δ < τ (below threshold)

SIS simulation: # infected nodes vs. time (log-lin)
[Figure: # infected (log scale) vs. time (linear scale), with curves for above, at, and below the threshold. Annotation: exponential decay.]

SIS simulation: # infected nodes vs. time (log-log)
[Figure: # infected (log scale) vs. time (log scale), with curves for above, at, and below the threshold. Annotation: power-law decay (!).]
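A simulation of this kind can be sketched in a few lines. The version below is not the code behind the plots; it assumes synchronous time steps and the update rule from the proof slide (a node is healthy at t+1 if it was healthy or healed, and was not attacked at t), uses a small complete graph instead of the Oregon graph, and fixes the random seed so that runs are repeatable. All names and parameter values are illustrative.

```python
import random

def simulate_sis(neighbors, beta, delta, steps, seed=0):
    """Return the number of infected nodes at t = 0..steps, starting fully infected."""
    rng = random.Random(seed)
    n = len(neighbors)
    infected = [True] * n
    counts = [n]
    for _ in range(steps):
        nxt = [False] * n
        for i in range(n):
            # Each currently infected neighbor attacks node i with prob. beta.
            attacked = any(infected[j] and rng.random() < beta for j in neighbors[i])
            if attacked:
                nxt[i] = True
            elif infected[i]:
                # Not attacked: an infected node heals with prob. delta.
                nxt[i] = rng.random() >= delta
        infected = nxt
        counts.append(sum(infected))
    return counts

def complete_graph(n):
    return [[j for j in range(n) if j != i] for i in range(n)]

# Complete graph on 30 nodes: lambda_1 = 29, so tau is about 0.034.
g = complete_graph(30)
counts_below = simulate_sis(g, beta=0.005, delta=0.5, steps=100)  # beta/delta = 0.01
counts_above = simulate_sis(g, beta=0.2, delta=0.2, steps=100)    # beta/delta = 1.0
print(counts_below[:10], counts_above[-5:])
```

Plotting `counts_below` with a logarithmic y-axis against linear time should give a roughly straight line, which is the exponential decay of Theorem 2.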

Detailed outline (extra)
- Epidemic threshold
- Fraud detection in e-bay

E-bay fraud detection (extra)
w/ Polo Chau & Shashank Pandit, CMU
S. Pandit, D. H. Chau, S. Wang, and C. Faloutsos. NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. WWW'07, pp. 201-210.

E-bay fraud detection (extra)
[Figure: graph of users; lines are positive feedbacks.]
Would you buy from him/her? Or him/her?

E-bay fraud detection: NetProbe (extra)
Belief Propagation gives: [figure]

Conclusions
- λ1,A: the largest eigenvalue of the adjacency matrix determines the survival of a flu-like virus
  - It gives a measure of how well connected the graph is (~ # paths; see Task 7, later)
  - May guide immunization policies
- [Belief Propagation: a powerful algo]

References
- D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic Thresholds in Real Networks. ACM TISSEC, 10(4), 2008.
- A. Ganesh, L. Massoulie, and D. Towsley. The Effect of Network Topology on the Spread of Epidemics. INFOCOM, 2005.
- H. W. Hethcote. The Mathematics of Infectious Diseases. SIAM Review 42, 599–653, 2000.
- H. W. Hethcote and J. A. Yorke. Gonorrhea Transmission Dynamics and Control. Lecture Notes in Biomathematics, Vol. 56. Springer, 1984.
- Y. Wang, D. Chakrabarti, C. Wang, and C. Faloutsos. Epidemic Spreading in Real Networks: An Eigenvalue Viewpoint. SRDS 2003, pp. 25-34, Florence, Italy.