DAVA: Distributing Vaccines over Networks under Prior Information

Presentation transcript:

DAVA: Distributing Vaccines over Networks under Prior Information Yao Zhang, B. Aditya Prakash Department of Computer Science Virginia Tech SDM, Philadelphia, April 24, 2014 Zhang and Prakash, SDM 2014

Motivation: Epidemiology
Virus spreads over contact networks.
SIR model [Anderson+ 1991]: Susceptible-Infectious-Recovered.
Weights pij: propagation probability from i to j.
Recovery probability δ for each node (models mumps-like infections).

Motivation: Social Media
Meme/rumor spreads over friendship networks, e.g. the Twitter following network.
Independent cascade model (IC) [Kempe+ KDD2003]: each node has only one chance to infect its neighbors.
IC is a special case of the SIR model.
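
The independent cascade process described on this slide can be sketched as a single simulation run. This is a minimal illustration rather than code from the paper: the function name `simulate_ic`, the adjacency-list layout, and the example graph are all assumptions made here.

```python
import random

def simulate_ic(graph, seeds, rng=None):
    """Run one independent cascade and return the set of infected nodes.

    graph: dict mapping node -> list of (neighbor, probability) pairs
           (hypothetical layout chosen for this sketch).
    seeds: iterable of initially infected nodes.
    """
    rng = rng or random.Random()
    infected = set(seeds)
    frontier = list(infected)
    while frontier:
        next_frontier = []
        for u in frontier:
            # Each newly infected node gets exactly one chance per neighbor.
            for v, p in graph.get(u, []):
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected

# With every probability equal to 1 the cascade reaches all reachable nodes.
chain = {"a": [("b", 1.0)], "b": [("c", 1.0)], "c": []}
print(sorted(simulate_ic(chain, ["a"])))  # ['a', 'b', 'c']
```

Averaging the size of the returned set over many runs gives a Monte Carlo estimate of the expected number of infected nodes.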

Immunization
Centers for Disease Control (CDC) cares about containing epidemic diseases.
  E.g.: ~400 million dollars used for vaccines for children in 2013.
Twitter tries to stop rumor spread.
  E.g.: rumors of victims after the Boston Marathon bombings in 2013.
How to choose the best nodes to vaccinate (remove)?

Immunization
Pre-emptive immunization (choose nodes before the epidemic starts):
  Acquaintance strategy [Cohen+ 2003]: pick a random person, immunize one of its neighbors at random.
  Netshield [Tong+ 2010]: minimize the epidemic threshold (the point at which the virus takes off).
Good for baseline strategies.

In reality?
Pre-emptive immunization (choose nodes before the epidemic starts): Acquaintance strategy [Cohen+ 2003], Netshield [Tong+ 2010].
Typically the epidemic has already started!
More realistic intervention: which nodes to vaccinate now?
We call it Data-Aware Immunization (this paper).

Outline: Motivation, Problem Definition, Complexity, Our Proposed Methods, Experiments, Conclusion

Data-Aware Vaccination Problem
Problem: Given a set of infected nodes and a contact graph, how to distribute k vaccines (node removal) to minimize the expected number of infected nodes at the end of the epidemic?
Example with 1 vaccine (pij = 1 for all edges):
  Remove A, save {A, D}
  Remove B, save {B}
  Remove C, save {C}
Best solution: remove A.

Outline: Motivation, Problem Definition, Complexity, Our Proposed Methods, Experiments, Conclusion

Complexity of DAV
NP-hard:
  Reduce from the Maximum K-Intersection Problem (MaxKI: maximizing the intersection of k subsets).
  MaxKI is NP-Complete [Vinterbo 2004].
Approximation algorithm?
  Not submodular.
  Actually, DAV is hard to approximate within an absolute error!
See paper for details.

Outline: Motivation, Problem Definition, Complexity, Our Proposed Methods (assume IC model and undirected graph), Experiments, Conclusion

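1: Simplify - Merging infected nodes
Idea: merge all the infected nodes into a single 'super infected' node I.
The merged graph is equivalent to the original graph: edges from infected nodes to healthy nodes A and C keep their probabilities pA and pC.
When a healthy node B is adjacent to several infected nodes, with probabilities pX and pY, the edges are combined by logical OR: pB = 1 - (1 - pX)(1 - pY).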
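
The merging step can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is undirected and stored as a symmetric dict of dicts, and the names `merge_infected` and `super_node` are hypothetical, not taken from the authors' code.

```python
def merge_infected(graph, infected, super_node="I"):
    """Collapse all infected nodes into one super node.

    graph: dict node -> dict neighbor -> probability (symmetric, assumed layout).
    Parallel edges into the same healthy node are combined by logical OR:
    p = 1 - prod(1 - p_x) over its infected neighbors x.
    """
    infected = set(infected)
    merged = {super_node: {}}
    escape = {}  # healthy node -> probability that no infected neighbor infects it
    for u, nbrs in graph.items():
        if u in infected:
            continue
        merged.setdefault(u, {})
        for v, p in nbrs.items():
            if v in infected:
                escape[u] = escape.get(u, 1.0) * (1.0 - p)
            else:
                merged[u][v] = p  # healthy-healthy edges are unchanged
    for u, q in escape.items():
        merged[super_node][u] = 1.0 - q
        merged[u][super_node] = 1.0 - q
    return merged

# B touches two infected nodes X and Y, each with probability 0.5.
g = {
    "X": {"B": 0.5},
    "Y": {"B": 0.5},
    "B": {"X": 0.5, "Y": 0.5, "C": 0.2},
    "C": {"B": 0.2},
}
print(merge_infected(g, ["X", "Y"])["I"]["B"])  # 0.75
```

Edges between two infected nodes are dropped, since they no longer affect the spread once both endpoints are merged.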

2: DAVA-Tree Algorithm: Idea
Select nodes with the largest "benefit".
Benefit of a set S: the expected number of saved nodes after removing S from graph G.
Benefit of adding an additional node j into S: the additional number of saved nodes after adding j into S.
Example (tree rooted at the merged infected node, pij = 1 for all edges): the figure labels three nodes with benefits 5, 4, and 2.

DAVA-Tree Alg.: Optimal on Trees
For any set S:
  Fact 1: the chosen nodes in the optimal set must be neighbors of the infected node I.
  Fact 2: the benefit of each such node is independent of the rest of the set S.
DAVA-tree algorithm: select the top k nodes from I's neighbors with the maximum benefit. Linear time.
Example (merged infected node as root, pij = 1 for all edges): benefits 5, 4, and 2.
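
The selection rule above can be illustrated with a short sketch. It assumes the benefit of a neighbor j of I is the expected number of infections in j's subtree under the IC model, computed as the sum over subtree nodes of the product of edge probabilities on the path from I. The function name `dava_tree` and the `children`/`prob` layout are hypothetical.

```python
def dava_tree(children, prob, root, k):
    """Pick the k neighbors of the root with the largest benefit.

    children: dict node -> list of child nodes (tree rooted at `root`).
    prob:     dict (parent, child) -> propagation probability.
    Returns (chosen nodes, benefit of every root neighbor).
    """
    benefits = {}
    for j in children.get(root, []):
        total = 0.0
        stack = [(j, prob[(root, j)])]  # (node, probability the infection reaches it)
        while stack:
            node, reach = stack.pop()
            total += reach
            for c in children.get(node, []):
                stack.append((c, reach * prob[(node, c)]))
        benefits[j] = total
    chosen = sorted(benefits, key=benefits.get, reverse=True)[:k]
    return chosen, benefits

# A tree whose three root neighbors have subtree sizes 5, 4 and 2 (all p = 1).
children = {
    "I": ["a", "b", "c"],
    "a": ["a1", "a2"], "a1": ["a3", "a4"],
    "b": ["b1", "b2", "b3"],
    "c": ["c1"],
}
prob = {(u, v): 1.0 for u, vs in children.items() for v in vs}
chosen, benefits = dava_tree(children, prob, "I", k=2)
print(chosen)    # ['a', 'b']
print(benefits)  # {'a': 5.0, 'b': 4.0, 'c': 2.0}
```

Each tree node is visited once, which is consistent with the linear running time stated on the slide.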

3: General Case – Arbitrary Graphs
Idea: we have the optimal algorithm for a tree, so extract a spanning tree, then run DAVA-tree.
What kind of tree? Minimum spanning tree?
Figure (pij = 1 for all edges): the solution that DAVA-tree finds as optimal on the MST, shown next to the optimal solution on the graph.

3: General Case – Arbitrary Graphs
Idea: we have the optimal algorithm for a tree, so build a spanning tree first.
What kind of tree? Minimum spanning tree?
We propose to use the dominator tree (a concept from software engineering).
u dominates v: every path from I to v contains u.
The dominator tree captures the 'closeness' of nodes to the infectious nodes, and the importance of saving nodes.
Example (pij = 1 for all edges): 4 dominates 8, 9, 10, 11.

Dominator Tree
u is the immediate dominator of v if u dominates v AND every other dominator of v dominates u.
Dominator tree: add an edge between every such u and v.
Can be built in linear time [Buchsbaum, Tarjan 1998].
Fact 1: for any arbitrary graph, the optimal solution should be among the children of root I in the dominator tree.
Fact 2 (special case k = 1, p = 1): running DAVA-tree on the dominator tree gives the optimal solution.
Figure (pij = 1 for all edges): merged graph and its dominator tree, marking the optimal solution and the solution from DAVA-tree.
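
To make the definition concrete, here is a deliberately naive sketch that computes immediate dominators straight from the definition: u dominates v if v becomes unreachable from I once u is blocked. It runs in quadratic time and is not the linear-time algorithm cited on the slide; the function names and the adjacency-list layout are assumptions.

```python
def reachable(adj, root, blocked=None):
    """Nodes reachable from root when the node `blocked` is removed."""
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v != blocked and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def immediate_dominators(adj, root):
    """Return dict v -> immediate dominator of v, for every v reachable from root."""
    nodes = reachable(adj, root)
    # strict[v] = all dominators of v other than v itself; the root dominates everything.
    strict = {v: ({root} if v != root else set()) for v in nodes}
    for u in nodes - {root}:
        cut_off = nodes - reachable(adj, root, blocked=u) - {u}
        for v in cut_off:
            strict[v].add(u)
    # Dominators of v form a chain, so the immediate one is the deepest in that chain.
    return {v: max(strict[v], key=lambda d: len(strict[d]))
            for v in nodes if v != root}

# c can be reached through a or b, so only I dominates it; d hangs off c.
adj = {"I": ["a", "b"], "a": ["I", "c"], "b": ["I", "c"],
       "c": ["a", "b", "d"], "d": ["c"]}
print(immediate_dominators(adj, "I")["d"])  # c
```

The edges of the dominator tree are then the pairs (immediate dominator of v, v).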

Weighting the dominator tree
Computing the exact weights is #P-complete.
Our solution: use the maximum propagation path probability between nodes I and v (computed with Dijkstra's algorithm).
Figure: edge probabilities p1, p3, p6 in the merged graph and the corresponding weights w1, w3, w6 in the dominator tree.
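
The maximum propagation path probability can be found with a Dijkstra-style search that maximizes the product of edge probabilities instead of minimizing a sum of lengths. The sketch below is an illustration under assumptions: the dict-of-dicts graph layout and the name `max_path_probability` are hypothetical, and how these values are turned into tree weights is left to the paper.

```python
import heapq

def max_path_probability(graph, source):
    """For every reachable node, the largest product of edge probabilities
    over all paths from `source`.

    graph: dict node -> dict neighbor -> probability in [0, 1].
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # negate so the min-heap pops the largest probability
    done = set()
    while heap:
        neg_p, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, p in graph.get(u, {}).items():
            cand = -neg_p * p
            if cand > best.get(v, 0.0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return best

# The two-hop path I -> b -> a (0.9 * 0.8) beats the direct edge I -> a (0.5).
g = {
    "I": {"a": 0.5, "b": 0.9},
    "a": {"I": 0.5, "b": 0.8},
    "b": {"I": 0.9, "a": 0.8},
}
best = max_path_probability(g, "I")
print(round(best["a"], 2))  # 0.72
```

The greedy pop order is valid because multiplying by a probability never increases the product, which plays the role of non-negative edge lengths in standard Dijkstra.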

DAVA algorithm
Step:
1. T = Build a dominator tree
2. v = Run DAVA-tree on T with budget=1
3. Remove v from G
4. Goto Step 1 until |S|=k
Running time: O(k(|E|+|V|log|V|)). Too slow for large networks!
Figure: example run with |S|=2 on the merged graph (pij = 1 for all edges); the selected node is removed and the dominator tree is rebuilt in each iteration (iterations 1 and 2).
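
The loop above can be illustrated for the special case pij = 1 on every edge. In that case the subtree of a node u in the dominator tree is exactly the set of nodes cut off from I when u is removed, so the benefit of u can be counted with a reachability search instead of building the tree explicitly. This is a simplified sketch under that assumption, not the authors' implementation; `dava_unit_prob` and the graph layout are hypothetical.

```python
def reachable(adj, root, removed):
    """Nodes reachable from root after deleting the nodes in `removed`."""
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def dava_unit_prob(adj, root, k):
    """Greedy DAVA-style selection when every edge has probability 1."""
    removed, chosen = set(), []
    for _ in range(k):
        at_risk = reachable(adj, root, removed) - {root}
        if not at_risk:
            break

        def saved(u):
            # Nodes no longer reachable once u is vaccinated, including u itself.
            still = reachable(adj, root, removed | {u}) - {root}
            return len(at_risk) - len(still)

        best = max(sorted(at_risk), key=saved)  # sorted for deterministic ties
        chosen.append(best)
        removed.add(best)  # Step 3: remove the selected node, then repeat
    return chosen

adj = {
    "I": ["A", "B", "C"],
    "A": ["I", "D", "E"], "D": ["A"], "E": ["A"],
    "B": ["I", "F"], "F": ["B"],
    "C": ["I"],
}
print(dava_unit_prob(adj, "I", k=2))  # ['A', 'B']
```

This brute-force version recomputes reachability for every candidate, so it is far slower than the bound on the slide; it only shows the select, remove, repeat structure.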

DAVA-fast: a faster algorithm
Step:
1. T = Build a dominator tree
2. S = Run DAVA-tree on T with budget=k
In practice, the performance of DAVA-fast is very close to DAVA.
Time complexity: subquadratic! DAVA-fast: O(|V|log|V|+|E|)
Figure: merged graph and dominator tree, |S|=2.

Extending to SIR model
See the paper.

Outline: Motivation, Problem Definition, Complexity, Our Proposed Methods, Experiments, Conclusion

Experiments
Virus propagation models: IC and SIR.
Settings (see more settings in the paper): initial infected nodes chosen uniformly at random.
Baseline algorithms:
  RANDOM: healthy nodes chosen uniformly at random
  DEGREE: nodes with the top weighted degrees
  PAGERANK: nodes with the top PageRank scores
  NETSHIELD: state-of-the-art pre-emptive immunization algorithm that minimizes the epidemic threshold of the graph [Tong+ ICDM 2010]; assumes no data is given before the epidemic starts

Experiments: datasets
Datasets are chosen from different domains.
Social media (IC model):
  OREGON: AS router graph
  STANFORD: hyperlink network
  GNUTELLA: peer-to-peer network
  BRIGHTKITE: friendship network
Epidemiology (SIR model):
  PORTLAND and MIAMI: large urban social-contact graphs used in national smallpox modeling studies [Eubank+, 2004]

Dataset      |V|           |E|
OREGON       633           2,172
STANFORD     8,929         53,829
GNUTELLA     10,876        39,994
BRIGHTKITE   58,228        214,078
PORTLAND     0.5 million   1.6 million
MIAMI        0.6 million   2.1 million

Experiments: Quality
Plots: GNUTELLA (IC model) and PORTLAND (SIR model); higher is better.
DAVA consistently outperforms the baseline algorithms. Further, DAVA-fast performs almost as well as DAVA. (See more results in the paper.)

Experiments: Scalability
Plot: running time (sec.); lower is better. Annotation in the plot: did not finish within 10 hours.

Outline: Motivation, Problem Definition, Complexity, Our Proposed Methods, Experiments, Conclusion

Conclusion
Data-Aware Vaccination problem:
  Given: graph and infected nodes
  Find: 'best' nodes for immunization
Complexity:
  NP-hard
  Hard to approximate within an absolute error
DAVA-tree: optimal solution on the tree.
DAVA and DAVA-fast:
  Merge infected nodes
  Build a dominator tree, and run DAVA-tree
  Running time: subquadratic
    DAVA: O(k(|E|+|V|log|V|))
    DAVA-fast: O(|E|+|V|log|V|)
Figure: graph with infected nodes, merged graph, dominator tree.

Any Questions?
Code at: http://people.cs.vt.edu/~yaozhang
Yao Zhang, B. Aditya Prakash
Thanks for the support of NSF (Grant No. IIS-1353346).