Thang N. Dinh, Dung T. Nguyen, My T. Thai. Dept. of Computer & Information Science & Engineering, University of Florida, Gainesville, FL. Hypertext-2012, Milwaukee, WI, USA
1. Introduction: Spread of Influence. Word-of-mouth effect: we trust our friends more than strangers. Online Social Networks (OSNs): a platform for spreading influence (information, innovation, political influence…).
Viral Marketing as an Optimization Problem
Information Propagation: Observation 1. M. Cha et al., WWW’09, propagation in Flickr: not widely (within two yards); not quickly (it takes a long time). J. Leskovec et al., ACM TWEB: recommendations often stop after one hop. The average delay in information propagation across social links is about 140 days!
Information Propagation: Observation 2
Questions?
Our contributions: Theoretically justify the seeding size needed to influence the network in the presence of a time limit and a power-law topology. Study the difference in the hardness of the influence problem in general networks vs. power-law networks. Provide VirAds, a scalable algorithm for fast influence propagation.
Cost-effective, Fast, and Massive viral marketing problem (CFM). Given: a network G=(V, E) and a diffusion model. Objective: spread the influence into the whole network within d hops. Task: find the minimum set of nodes to target. Source: M. G. Rodriguez, J. Leskovec, A. Krause
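The CFM task can be written compactly as an optimization problem. This is only a sketch: the symbol A_d(S) is notation assumed here (it does not appear on the slides) for the set of nodes that are active after at most d hops when S is the seed set, under whichever diffusion model is given.

```latex
\min_{S \subseteq V} \; |S|
\quad \text{subject to} \quad A_d(S) = V
```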
Diffusion Model
Hardness of Approximating [Figure: construction with labels S1, S2, S3, ..., S|S|; e1, e2, ..., e|U|; x1, x2, ..., xt; x'1, x'2, ..., x't; D, D'; S, U; u, v, w1, W, c(ρ); uv1, uv2, ..., uvd-1]
Hardness of Approximating (d>1) [Figure: labels "A solution of size k", "An optimal solution"; nodes a, b, c, d, w1, w2, W, c(ρ); panels: Direct Failures (d=1), D-hop failures]
CFM in Power-law Networks
Power-law Networks vs. General Networks
Optimal Solutions via Mathematical Programming
Optimal Solutions via Mathematical Programming. Propagation in Erdos’s collaboration network:
Efficient Heuristics for Large Scale Networks
VirAds-Fast-Spreading Algorithm
1. Maintain a priority queue of nodes: priority = # affected vertices + # affected edges.
2. Pick the vertex with the highest priority.
3. Recalculate its priority; select the vertex if the new priority is still the highest, otherwise repeat.
4. Update the number of activated vertices with the selected node.
5. Lazy update: update the priority only for vertices that are “affected” by the selected vertex.
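The queue-based selection loop can be sketched in Python. This is an illustrative sketch, not the authors' implementation, and it rests on several assumptions the slides do not state: a threshold diffusion model in which a vertex activates once at least a fraction `rho` of its neighbors are active, a priority defined as newly activated vertices plus edges from the candidate to still-inactive neighbors, and the common pop-and-recheck form of lazy evaluation standing in for the "update only affected vertices" step. The names `spread`, `priority`, and `lazy_greedy_seeds` are hypothetical.

```python
import heapq
from math import ceil


def spread(adj, seeds, rho, d):
    """Vertices active after at most d rounds (assumed threshold model)."""
    active = set(seeds)
    for _ in range(d):
        newly = {
            v for v in adj
            if v not in active
            and sum(1 for u in adj[v] if u in active)
            >= max(1, ceil(rho * len(adj[v])))
        }
        if not newly:
            break
        active |= newly
    return active


def priority(adj, seeds, active, v, rho, d):
    """Assumed priority: vertices gained by seeding v + edges to inactive neighbors."""
    gained = len(spread(adj, seeds | {v}, rho, d)) - len(active)
    edges = sum(1 for u in adj[v] if u not in active)
    return gained + edges


def lazy_greedy_seeds(adj, rho=0.5, d=2):
    """Pick seeds until every vertex is active within d rounds."""
    seeds, active = set(), set()
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-priority(adj, seeds, active, v, rho, d), v) for v in adj]
    heapq.heapify(heap)
    while len(active) < len(adj) and heap:
        _, v = heapq.heappop(heap)
        if v in active:
            continue  # already influenced, no need to seed it
        fresh = priority(adj, seeds, active, v, rho, d)
        if heap and fresh < -heap[0][0]:
            # Stale entry: reinsert with the recomputed priority and retry.
            heapq.heappush(heap, (-fresh, v))
            continue
        seeds.add(v)
        active = spread(adj, seeds, rho, d)
    return seeds, active
```

For example, on a star graph `{0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}` with `rho=0.5` and `d=1`, the center is the only seed selected. Recomputing a priority only when a vertex reaches the top of the queue avoids rescoring every vertex after each selection, which is the point of the lazy step.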
Heuristics for Large Scale Networks. Datasets: Physics collaboration network: 37K vertices, 63K edges. Facebook New Orleans City: 90K vertices, ~4M edges. Orkut social network: 3M vertices, 220M edges. Competitors: Max degree selector. VirAds: one-hop greedy selector. Exhaustive Update: expensive multi-hop greedy; cannot run for large networks (e.g. Orkut).
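The max degree baseline is simple enough to sketch in a few lines of Python. This is a minimal illustration under assumptions: the function name `max_degree_seeds`, the adjacency-dict input, and the fixed seed budget `k` are all hypothetical, and the slides do not say how many seeds the baseline was allowed or how it decided when to stop.

```python
import heapq


def max_degree_seeds(adj, k):
    """Return the k vertices with the largest degree (hypothetical baseline)."""
    return heapq.nlargest(k, adj, key=lambda v: len(adj[v]))
```

On a star graph with center `0`, `max_degree_seeds(adj, 1)` returns the center.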
Experimental Results: Solution Quality [Figure panels: Physics, Facebook, Orkut]
Experimental Results: Running Time [Figure panels: Physics, Facebook, Orkut]
Summary: Finding seed nodes is a hard problem in general, but “not so hard” in power-law networks. The seeding cost is often NOT cheap. We propose VirAds, a scalable algorithm for target selection: better than the centrality heuristic and scalable to networks with millions of nodes.
Acknowledgement: We would like to thank NSF and DTRA for their generous support. We thank the anonymous reviewers who provided helpful comments to improve the paper.
Hypertext-2012, Milwaukee, WI, USA