1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 Bernhard Schölkopf 1 S UBMODULAR I NFERENCE OF D IFFUSION NETWORKS FROM.

Slides:



Advertisements
Similar presentations
Competition in VM – Completing the Circle. Previous work in Competitive VM Mainly follower’s perspective: given state (say of seed selection) of previous.
Advertisements

Viral Marketing – Learning Influence Probabilities.
LEARNING INFLUENCE PROBABILITIES IN SOCIAL NETWORKS Amit Goyal Francesco Bonchi Laks V. S. Lakshmanan University of British Columbia Yahoo! Research University.
Minimizing Seed Set for Viral Marketing Cheng Long & Raymond Chi-Wing Wong Presented by: Cheng Long 20-August-2011.
Spread of Influence through a Social Network Adapted from :
Cost-effective Outbreak Detection in Networks Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance.
Randomized Sensing in Adversarial Environments Andreas Krause Joint work with Daniel Golovin and Alex Roper International Joint Conference on Artificial.
Computing Kemeny and Slater Rankings Vincent Conitzer (Joint work with Andrew Davenport and Jayant Kalagnanam at IBM Research.)
Modeling Malware Spreading Dynamics Michele Garetto (Politecnico di Torino – Italy) Weibo Gong (University of Massachusetts – Amherst – MA) Don Towsley.
Maximizing the Spread of Influence through a Social Network
Suqi Cheng Research Center of Web Data Sciences & Engineering
Guest lecture II: Amos Fiat’s Social Networks class Edith Cohen TAU, December 2014.
Multi-Task Compressive Sensing with Dirichlet Process Priors Yuting Qi 1, Dehong Liu 1, David Dunson 2, and Lawrence Carin 1 1 Department of Electrical.
Belief Propagation by Jakob Metzler. Outline Motivation Pearl’s BP Algorithm Turbo Codes Generalized Belief Propagation Free Energies.
Blogosphere  What is blogosphere?  Why do we need to study Blog-space or Blogosphere?
INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.
Online Data Gathering for Maximizing Network Lifetime in Sensor Networks IEEE transactions on Mobile Computing Weifa Liang, YuZhen Liu.
The community-search problem and how to plan a successful cocktail party Mauro SozioAris Gionis Max Planck Institute, Germany Yahoo! Research, Barcelona.
Copyright N. Friedman, M. Ninio. I. Pe’er, and T. Pupko. 2001RECOMB, April 2001 Structural EM for Phylogentic Inference Nir Friedman Computer Science &
Simpath: An Efficient Algorithm for Influence Maximization under Linear Threshold Model Amit Goyal Wei Lu Laks V. S. Lakshmanan University of British Columbia.
Maximizing Product Adoption in Social Networks
Models of Influence in Online Social Networks
Diffusion in Social and Information Networks Part II W ORLD W IDE W EB 2015, F LORENCE MPI for Software SystemsGeorgia Institute of Technology Le Song.
On Ranking and Influence in Social Networks Huy Nguyen Lab seminar November 2, 2012.
Viral Marketing for Dedicated Customers Presented by: Cheng Long 25 August, 2012.
Modeling Information Diffusion in Networks with Unobserved Links Quang Duong Michael P. Wellman Satinder Singh Computer Science and Engineering University.
1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 David Balduzzi 1 Bernhard Schölkopf 1 UNCOVERING THE TEMPORAL DYNAMICS.
Intelligent Database Systems Lab Presenter: MIN-CHIEH HSIU Authors: NHAT-QUANG DOAN ∗, HANANE AZZAG, MUSTAPHA LEBBAH 2013 NN Growing self-organizing trees.
A Polynomial Time Approximation Scheme For Timing Constrained Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, Charles J. Alpert** *Dept of Electrical.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
December 7-10, 2013, Dallas, Texas
Jing (Selena) He and Hisham M. Haddad Department of Computer Science, Kennesaw State University Shouling Ji, Xiaojing Liao, and Raheem Beyah School of.
Maximizing the Spread of Influence through a Social Network David Kempe, Jon Kleinberg, Eva Tardos Cornell University KDD 2003.
Maximizing the Spread of Influence through a Social Network Authors: David Kempe, Jon Kleinberg, É va Tardos KDD 2003.
A Passive Approach to Sensor Network Localization Rahul Biswas and Sebastian Thrun International Conference on Intelligent Robots and Systems 2004 Presented.
Manuel Gomez Rodriguez Structure and Dynamics of Information Pathways in On-line Media W ORKSHOP M ENORCA, MPI FOR I NTELLIGENT S YSTEMS.
Online Social Networks and Media
I NFORMATION C ASCADE Priyanka Garg. OUTLINE Information Propagation Virus Propagation Model How to model infection? Inferring Latent Social Networks.
Xutao Li1, Gao Cong1, Xiao-Li Li2
On Bharathi-Kempe-Salek Conjecture about Influence Maximization Ding-Zhu Du University of Texas at Dallas.
1/22 Optimization of Google Cloud Task Processing with Checkpoint-Restart Mechanism Speaker: Sheng Di Coauthors: Yves Robert, Frédéric Vivien, Derrick.
Manuel Gomez Rodriguez Bernhard Schölkopf I NFLUENCE M AXIMIZATION IN C ONTINUOUS T IME D IFFUSION N ETWORKS , ICML ‘12.
Comparison of Tarry’s Algorithm and Awerbuch’s Algorithm CS 6/73201 Advanced Operating System Presentation by: Sanjitkumar Patel.
A Latent Social Approach to YouTube Popularity Prediction Amandianeze Nwana Prof. Salman Avestimehr Prof. Tsuhan Chen.
Cost-effective Outbreak Detection in Networks Presented by Amlan Pradhan, Yining Zhou, Yingfei Xiang, Abhinav Rungta -Group 1.
Speaker : Yu-Hui Chen Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C. Au, and Amine Bermak From : 2013 IEEE Symposium on Computational Intelligence.
F EATURE -E NHANCED P ROBABILISTIC M ODELS FOR D IFFUSION N ETWORK I NFERENCE Stefano Ermon ECML-PKDD September 26, 2012 Joint work with Liaoruo Wang and.
A Cooperative Coevolutionary Genetic Algorithm for Learning Bayesian Network Structures Arthur Carvalho
Optimal Relay Placement for Indoor Sensor Networks Cuiyao Xue †, Yanmin Zhu †, Lei Ni †, Minglu Li †, Bo Li ‡ † Shanghai Jiao Tong University ‡ HK University.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
A Connectivity-Based Popularity Prediction Approach for Social Networks Huangmao Quan, Ana Milicic, Slobodan Vucetic, and Jie Wu Department of Computer.
Biao Wang 1, Ge Chen 1, Luoyi Fu 1, Li Song 1, Xinbing Wang 1, Xue Liu 2 1 Shanghai Jiao Tong University 2 McGill University
Yu Wang1, Gao Cong2, Guojie Song1, Kunqing Xie1
Inferring Networks of Diffusion and Influence
Nanyang Technological University
Finding Dense and Connected Subgraphs in Dual Networks
Independent Cascade Model and Linear Threshold Model
Greedy & Heuristic algorithms in Influence Maximization
DM-Group Meeting Liangzhe Chen, Nov
Link Prediction and Network Inference
Independent Cascade Model and Linear Threshold Model
Mixture of Mutually Exciting Processes for Viral Diffusion
Cost-effective Outbreak Detection in Networks
A Dynamic System Analysis of Simultaneous Recurrent Neural Network
Independent Cascade Model and Linear Threshold Model
Human-centered Machine Learning
Yingze Wang and Shi-Kuo Chang University of Pittsburgh
Presentation transcript:

1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 Bernhard Schölkopf 1 S UBMODULAR I NFERENCE OF D IFFUSION NETWORKS FROM M ULTIPLE T REES , ICML ‘12

2 Propagation Social Networks Recommendation Networks Epidemiology Human Travels Information Networks P ROPAGATION TAKES PLACE ON W E CAN EXTRACT PROPAGATION TRACES FROM

3 Propagation over unknown networks  Diffusion often takes place over implicit or hard-to- observe highly dynamic networks  We observe when a node copies information or becomes infected but … … connectivity and diffusion sources are unknown! Implicit networks of blogs and news sites that spread news without mentioning their sources Hard-to-observe/hidden networks of drug users that share needles among them

4 Examples of diffusion Virus propagation Viruses propagate in the network Diffusion ProcessAvailable dataHidden data Time when people get sick Who infected whom Viral marketing Time when people buy products Recommendations propagate in the network Who influenced whom Information propagation Information propagates in the network Time when blogs reproduce information Who copied whom Can we infer the hidden data from the available temporal data?

5 Directed Diffusion Network  Directed network over which cascades propagate: Cascade 1: (n 1, t 1 =1), (n 2, t 2 =4), (n 3, t 3 =6), (n 4, t 4 =10) Our aim is to infer the network before it changes from a few temporal traces Cascade 1Cascade 2 Cascade 2: (n 2, t 1 =3), (n 5, t 2 =10), (n 3, t 3 =12), (n 4, t 4 =23), … n1n1 n3n3 n2n2 n4n4 n5n5 n6n6 n7n7

6 Outline 1.Efficiently compute the likelihood of the observed cascades 2.Efficiently find the network that maximize the likelihood of the observed cascades 3.Validate our algorithm on synthetic and real diffusion data and compare with state-of-art methods (N ET R ATE, C ON NI E & N ET I NF )

7 Pairwise interactions  Likelihood of transmission time :  It depends on the transmission time (t i – t j ) and the rate α (j, i) i i j j  Prior probability of transmission of edge : i i j j  Suppose node gets infected, is the probability that a cascade propagates over an edge (j, i) j j E XP P OW R AY small α j,i big α j,i t j -t i S OCIAL AND I NFORMATION D IFFUSION M ODELS E PIDEMIOLOGY  As α j,i 0, E[tx time] ∞  For simplicity, three parametric transmission models:

8 Probability of a tree in a given network Given a cascade t5t5 t4t4 t3t3 t1t1 t2t2 There are several propagation trees Each node infected by a single parent  Assume the same prior probability of tx for each edge, the probability of any feasible tree given a network is: T1T1 T2T2 T3T3 T4T4 q: # edges in T r: # inactive edges

9 Likelihood of a cascade for a given tree  Consider a particular feasible tree T for a cascade, the likelihood of the cascade given the tree T is: Example: t1t1 t2t2 t4t4 t3t3 t5t5 f(t c | T) = f(t 2 | t 1 ; α) f(t 3 | t 1 ; α) f(t 4 | t 3 ; α) f(t 5 | t 3 ; α)

10 Likelihood of a cascade in a given network  Consider a cascade t c spreading over a a network G, the likelihood of the cascade in the network is: Set of trees induced by the cascade t c in the network G Likelihood of a cascade given a tree T Probability of a tree T in the network G There are O(n n ) possible propagation trees per cascade!

11 Likelihood of a cascade in a given network Even though O(n n ) possible propagation trees per cascade 1.Matrix Tree Theorem We compute f(t c | T) by solving det(A), where A is a matrix that we build in O(n 2 ) 1.Time ordering A is upper triangular, we compute det(A) in O(n) We are able to compute the likelihood f(t c | T) in O(n 2 )  Finally, the log-likelihood of a cascade set C over a network G is: How?!

12 Network Inference Problem  We have shown how to compute the likelihood of a cascade set in a network G. However, our aim is to find the optimal network G with k edges that maximizes the likelihood:  Unfortunately, Theorem. The network inference problem defined by Eq. (1) is NP-hard. (1)

13 Submodular maximization  However, our objective function satisfies a natural diminishing property: Theorem. The objective function is a submodular function in the set of edges in G. We can efficiently obtain a suboptimal solution with a 63% provable guarantee using a greedy algorithm

14 Experimental Setup  We validate our method on:  How does our algorithm perform in comparison with state-of-the-art?  How fast do we infer a network?  How does our algorithm perform with respect to the amount of observed cascades? Synthetic data 1. Generate network 2. Generate a few cascades over the network 3. Run our inference algorithm Real data 1. MemeTracker data (172m news articles 08/2009– ) 2. We extract hyperlink cascades 3. Run our inference algorithm

15 Precision vs Recall 1,024 node networks with 200 cascades Erdos-Renyi, E XP Hierarchical, P OW Core Periphery, R AY  Our algorithm discovers greater number of edges; it reaches a higher recall

16 ROC gain wrt N ET I NF vs # cascades 1,024 node networks Erdos-RenyiHierarchical Core Periphery  Our algorithm outperform N ET I NF by ≈10-20% for small cascade sets and performs similarly for large cascade sets  The ROC gain depends on the network structure and transmission model

17 Scalability  Considering all trees (in contrast with single tree inference in N ET I NF ) does not impact scalability significantly  Both our method and N ET I NF are typically more than one order of magnitude faster than N ET R ATE

18 Real Network: performance (1,000 node, 10,000 edges) hyperlink network, 1,000 hyperlinks cascades  Our method outperforms the state-of-the-art for most of the full range of their tunable parameters.

19 Conclusions  We infer hidden networks from small sets of cascades  In our algorithm:  We use Matrix Tree Theorem & time ordering to compute cascades likelihoods efficiently in O(N 2 )  We show submodularity of the objective function in the network inference problem (open problem from KDD’10), which allows us to solve it efficiently  We beat the state of the art when only a few cascades are recorded

20 Thanks! C ODE & M ORE :