│ University of Texas at Dallas │

Lecture 7-1: Rumor Blocking
Ding-Zhu Du, University of Texas at Dallas

Social network analysis (SNA) is the mapping and measuring of relationships and flows between people, groups, organizations, computers, or other information/knowledge processing entities. The nodes in the network are the people and groups, while the links show relationships or flows between the nodes. The advantage of social network analysis is that, unlike many other methods, it focuses on interaction rather than on individual behavior. Network analysis allows us to examine how the configuration of networks influences how individuals, groups, organizations, or systems function.

Least Cost Rumor Blocking in Social Networks. Lidan Fan, Zaixin Lu, Weili Wu, Bhavani Thuraisingham, Huan Ma, Yuanjun Bi. Published in ICDCS 2013. 11/12/2018

Outline
Background: motivation; problem formulation
Related Works
Our Contribution: two influence diffusion models; least cost rumor blocking problem
Conclusions
Future Works

Social networks

Social Network
A social network is a social structure made up of individuals and the relations between these individuals. A social network provides a platform for influence diffusion.

Applications
Single cascade: viral marketing, recommender systems, feed ranking, ……
Multiple cascades: political election, multiple products promotion, rumor/misinformation controlling.

Social network properties
Small-world effect: the average distance between vertices in a network is short.
Power-law or exponential form: there are many nodes with low degree and a small number with high degree.
Clustering or network transitivity: two vertices that are both neighbors of the same third vertex have a high probability of also being neighbors of one another.
Community structure: the connections within the same community are dense and those between communities are sparse.

Social networks have community structure: the edges within the same community are dense and the edges among communities are sparse. Influence spreads fast within the same community and slowly across different communities.

When misinformation or a rumor spreads in a social network, what will happen?

A piece of misinformation claiming that the president of Syria was dead hit Twitter and circulated quickly among the population, leading to a sharp, quick increase in the price of oil. http://news.yahoo.com/blogs/technology-blog/twitter-rumor-leads-sharp-increase-price-oil-173027289.html

In August 2012, thousands of people in Ghazni province left their houses in the middle of the night in panic after a rumor said that a major earthquake would hit the area before 5 am [3]. http://www.pajhwok.com/en/2012/08/20/quake-rumour-sends-thousands-ghazni-streets
Believing the rumor, many people from Ghazni city and some other districts of the province left their houses and spent the whole night outside. The panic was so intense that the people, who numbered in the thousands, did not dare to return to their houses until morning. Mirwais, a resident of Ghazni city, told a news agency that the imams of the mosques later came to believe the rumor and also started announcing the earthquake.

Control the spread of rumors

Rumors generated in a community will influence the members of the network. Find protectors to reduce the influence of rumors.
Real-world limitation: the overhead spent on protectors and the number of protected members should be balanced.
Rumors spread very fast within their own community, so blocking them there costs too much. Rumors spread slowly across different communities, so blocking them there costs little.
Goal: find the least number of protectors that reduces the rumor influence on the members of other communities.

Our Tasks
Determine influence diffusion models.
Design efficient algorithms to find protectors.
Obtain real-world data to evaluate our algorithms.

Two influence diffusion models: Deterministic One Activate Many (DOAM) and Opportunistic One Activate One (OPOAO).

Common properties
Two cascades: rumor and protector.
Diffusion start time: the same for both cascades.
Tie-breaking rule: protector has priority over rumor.
Status of each node: inactive, infected, or protected.
Monotonicity assumption: once a node is infected or protected, its status never changes.

Deterministic One Activate Many (DOAM)

Additional properties of the DOAM model
When a node becomes active (infected or protected), it has a single chance to activate all of its currently inactive (neither infected nor protected) neighbors. The activation attempts succeed with probability 1.

Example (DOAM)
Two kinds of influence cascades: rumors and protectors. Each individual has one of three statuses: inactive, rumored, or protected. An active individual activates all of its neighbors successfully. When rumors and protectors influence an individual at the same time, the individual is protected. Each individual has only one chance to influence its neighbors. A node never changes its status once it has been activated.
[Figure: network with nodes 1 to 6.]
Node 1 is a rumor and node 6 is a protector. Step 1: 1--2,3; 6--2,4. Nodes 2 and 4 are protected; node 3 is infected.

Example (DOAM, continued)
Step 2: 4--5. Node 5 is protected.
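The DOAM example can be replayed with a short simulation. This is a minimal sketch, not the authors' implementation: the function name `simulate_doam` is hypothetical, and the edge list in `ADJ` is an assumption inferred from the activation steps listed in the DOAM and OPOAO examples (the original figure is not part of the transcript).

```python
from typing import Dict, Iterable, List


def simulate_doam(adj: Dict[int, List[int]],
                  rumors: Iterable[int],
                  protectors: Iterable[int]) -> Dict[int, str]:
    """Synchronous DOAM diffusion: every newly active node activates all of
    its inactive neighbors once, with probability 1. Ties go to the protector."""
    status = {v: "inactive" for v in adj}
    for v in rumors:
        status[v] = "infected"
    for v in protectors:
        status[v] = "protected"
    frontier = [v for v in adj if status[v] != "inactive"]
    while frontier:
        reached: Dict[int, str] = {}
        for u in frontier:
            for w in adj[u]:
                if status[w] != "inactive":
                    continue  # monotonicity: active nodes never change status
                if status[u] == "protected" or reached.get(w) == "protected":
                    reached[w] = "protected"  # protector has priority on a tie
                else:
                    reached[w] = "infected"
        status.update(reached)
        frontier = list(reached)  # only newly activated nodes act next step
    return status


# Assumed undirected edges: 1-2, 1-3, 2-3, 2-6, 3-4, 4-5, 4-6.
ADJ = {1: [2, 3], 2: [1, 3, 6], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4], 6: [2, 4]}
doam_result = simulate_doam(ADJ, rumors=[1], protectors=[6])
```

Under these assumptions the run ends with nodes 2, 4, and 5 protected and node 3 infected, which agrees with Steps 1 and 2 of the example.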

Opportunistic One Activate One (OPOAO)

Additional properties of the OPOAO model
At each step, each active (infected or protected) node u chooses exactly one of its neighbors as its target, and each neighbor is chosen with probability 1/deg(u). An active node may select the same node as its target an unlimited number of times.

Example (OPOAO)
Two kinds of influence cascades: rumors and protectors. Each individual has one of three statuses: inactive, rumored, or protected. An active individual activates only one of its neighbors at each step. When rumors and protectors influence an individual at the same time, the individual is protected. Each individual has unlimited opportunities to influence its neighbors. A node never changes its status once it has been activated.
[Figure: network with nodes 1 to 6.]
Node 1 is a rumor and node 6 is a protector.
Step 1: 1--2, 6--2. Node 2 is protected.
Step 2: 1--3, 6--2. Node 3 is infected.
Step 3: 1--2, 3--4, 6--4. Node 4 is protected.
Step 4: 1--3, 3--2, 6--4, 4--5. Node 5 is protected.
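The OPOAO rules can be sketched as a randomized simulation. This is an illustrative sketch under stated assumptions, not the authors' code: `simulate_opoao` is a hypothetical name, the adjacency list `OPOAO_ADJ` is inferred from the activation steps in the example, and the outcome depends on the random choices, so a single run will generally not reproduce the exact steps on the slides.

```python
import random
from typing import Dict, Iterable, List


def simulate_opoao(adj: Dict[int, List[int]],
                   rumors: Iterable[int],
                   protectors: Iterable[int],
                   steps: int,
                   rng: random.Random) -> Dict[int, str]:
    """At each step every active node picks one neighbor uniformly at random
    (probability 1/deg(u)) and activates it if it is still inactive."""
    status = {v: "inactive" for v in adj}
    for v in rumors:
        status[v] = "infected"
    for v in protectors:
        status[v] = "protected"
    for _ in range(steps):
        reached: Dict[int, str] = {}
        for u in adj:
            if status[u] == "inactive" or not adj[u]:
                continue
            target = rng.choice(adj[u])  # the same target may be picked again
            if status[target] != "inactive":
                continue  # already active nodes never change status
            if status[u] == "protected" or reached.get(target) == "protected":
                reached[target] = "protected"  # protector wins a tie
            else:
                reached[target] = "infected"
        status.update(reached)  # nodes activated now start acting next step
    return status


# Assumed undirected edges: 1-2, 1-3, 2-3, 2-6, 3-4, 4-5, 4-6.
OPOAO_ADJ = {1: [2, 3], 2: [1, 3, 6], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4], 6: [2, 4]}
opoao_result = simulate_opoao(OPOAO_ADJ, [1], [6], steps=4, rng=random.Random(0))
```

Because diffusion is random, the expected number of protected nodes would in practice be estimated by averaging many such runs.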

Least Cost Rumor Blocking Problem (LCRB)
Bridge ends form a vertex set; they belong to the neighbor communities of the rumor community; and each can be reached from the rumors before the other nodes in its own community.
[Figure: communities C0, C1, and C2. The red node is a rumor; yellow nodes are bridge ends.]

LCRB-D problem for the DOAM model
Given: the community structure, the rumors, and the rumor community.
Goal: find the least number of protectors to protect all of the bridge ends.

Set Cover Based Greedy (SCBG) Algorithm
Main idea: convert to a set cover problem using the Breadth First Search (BFS) method.
Three stages:
1. Construct Rumor Forward Search Trees (RFST) to find the bridge ends.
2. Construct Bridge End Backward Search Trees (BEBST) to find the protector candidates.
3. Construct the vertex sets used in the set cover problem.

Construct Rumor Forward Search Trees (RFST)
[Figure: example network with nodes 1 to 14. Yellow nodes are bridge ends.]

Rumor 4 Forward Search Tree
The minimal hops: 1 hop between 4 and 5; 2 hops between 4 and 12; 3 hops between 4 and 8. Nodes 5, 8, and 12 are the bridge ends.
[Figure: forward search tree for rumor 4, containing nodes 1, 2, 3, 5, 8, and 12.]
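The minimal hop counts in a forward search tree are what a breadth-first search from the rumor node produces. The sketch below illustrates that step only; `min_hops` is a hypothetical helper, and the edges in `HYPOTHETICAL_ADJ` are invented so that the distances match the ones stated above (the real example network is not recoverable from the transcript).

```python
from collections import deque
from typing import Dict, List


def min_hops(adj: Dict[int, List[int]], source: int) -> Dict[int, int]:
    """Breadth-first search: minimal number of hops from source to each node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, []):
            if w not in hops:  # first visit is along a shortest path
                hops[w] = hops[u] + 1
                queue.append(w)
    return hops


# Hypothetical edges chosen only to reproduce the stated hop counts.
HYPOTHETICAL_ADJ = {4: [5, 3], 3: [4, 12, 2], 2: [3, 8], 5: [4], 12: [3], 8: [2]}
hops_from_rumor = min_hops(HYPOTHETICAL_ADJ, source=4)
```

With these assumed edges, nodes 5, 12, and 8 are reached in 1, 2, and 3 hops respectively.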

Construct Bridge End Backward Search Trees (BEBST)
[Figure: example network with nodes 1 to 14. Blue nodes are protector candidates.]

Bridge End Backward Search Trees
Record the protector candidate sets for each bridge end: 5: {5,7}; 8: {2,3,8,9,10,11}; 12: {2,3,12}.
[Figure: backward search trees for bridge ends 5, 8, and 12.]

Construct vertex sets and apply the Greedy algorithm
Construct the vertex sets in the set cover problem by finding the bridge ends that each candidate can protect: 2: {8,12}; 3: {8,12}; 5: {5}; 7: {5}; 8: {8}; 9: {8}; 10: {8}; 11: {8}; 12: {12}.
Apply the Greedy algorithm: choose 2 or 3, and bridge ends 8 and 12 are protected; choose 5 or 7, and bridge end 5 is protected. The output is {2,5} or {2,7} or {3,5} or {3,7}.
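The last stage of SCBG on this example can be sketched as follows. The candidate sets are the ones recorded above; the function names and the tie-breaking rule (smallest node id first) are assumptions made for this illustration, so the sketch returns one of the four outputs listed rather than all of them.

```python
from typing import Dict, Hashable, Iterable, List, Set


def invert(candidates_by_end: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Turn 'bridge end -> protector candidates' into
    'candidate -> bridge ends it can protect'."""
    cover: Dict[int, Set[int]] = {}
    for end, candidates in candidates_by_end.items():
        for c in candidates:
            cover.setdefault(c, set()).add(end)
    return cover


def greedy_set_cover(cover: Dict[Hashable, Set[int]],
                     universe: Iterable[int]) -> List[Hashable]:
    """Repeatedly pick the candidate that covers the most uncovered elements."""
    uncovered = set(universe)
    chosen: List[Hashable] = []
    while uncovered:
        # max() keeps the first maximum, so ties go to the smallest id.
        best = max(sorted(cover), key=lambda c: len(cover[c] & uncovered))
        if not cover[best] & uncovered:
            raise ValueError("some bridge ends cannot be protected")
        chosen.append(best)
        uncovered -= cover[best]
    return chosen


candidates_by_end = {5: {5, 7}, 8: {2, 3, 8, 9, 10, 11}, 12: {2, 3, 12}}
cover = invert(candidates_by_end)
protectors = greedy_set_cover(cover, universe=candidates_by_end)  # [2, 5]
```

A different tie-breaking rule would yield {2,7}, {3,5}, or {3,7}, all of which have the same size.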

Theoretical Results
There is a polynomial-time O(ln n)-approximation algorithm for the LCRB-D problem, where n is the number of vertices in the set of bridge ends.
The LCRB-D problem has an approximation algorithm with ratio k(n) if and only if the set cover problem has an approximation algorithm with ratio k(n).

Set-Cover Given a collection C of subsets of a set E, find a minimum subcollection C’ of C such that every element of E appears in a subset in C’ .

Example of Submodular Function

Greedy Algorithm

Analysis

Weighted Set Cover Given a collection C of subsets of a set E and a weight function w on C, find a minimum total-weight subcollection C’ of C such that every element of E appears in a subset in C’ .
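For the weighted version, a common textbook greedy rule picks, at each step, the subset with the smallest weight per newly covered element. The sketch below shows that standard rule as an illustration; it is not taken from the slides, whose algorithm text is missing from the transcript, and the names and the toy instance are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def greedy_weighted_set_cover(sets: Dict[str, Set[int]],
                              weight: Dict[str, float],
                              universe: Iterable[int]) -> List[str]:
    """Pick subsets by minimum weight per newly covered element."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        usable = [s for s in sorted(sets) if sets[s] & uncovered]
        if not usable:
            raise ValueError("the universe cannot be covered")
        best = min(usable, key=lambda s: weight[s] / len(sets[s] & uncovered))
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


# Toy instance (hypothetical): one expensive big set versus two cheap small ones.
toy_sets = {"A": {1, 2, 3}, "B": {1, 2}, "C": {3}}
toy_weight = {"A": 10.0, "B": 1.0, "C": 1.0}
toy_choice = greedy_weighted_set_cover(toy_sets, toy_weight, universe={1, 2, 3})
```

Here the rule selects B and then C for a total weight of 2, instead of A with weight 10.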

A General Problem

Greedy Algorithm

A General Theorem Remark:

Proof

Proof can be found in

Experiments
Two datasets:
Collaboration Network (http://snap.stanford.edu/data/cit-HepPh.html): covers scientific collaborations between authors with papers submitted to High Energy Physics. Nodes: authors. Edge (i, j): author i co-authored a paper with author j.
Email Network (http://snap.stanford.edu/data/email-Enron.html): covers all the email communications within a dataset of around half a million emails. Nodes: email addresses. Edge (i, j): address i sent at least one email to address j.

Datasets
Dataset | # of nodes | # of edges | Average degree | # of selected communities
HEP-PH | 15233 | 58891 | 7.73 | 1
Enron-Email | 36692 | 367662 | 10 | 2

Description of the communities chosen:
HEP-PH: size 308, bridge end size 387.
Enron-Email: size 80, bridge end size 135; size 2631, bridge end size 2250.

Experimental Results
Row groups are labeled dataset/number of nodes/community size. Columns: SCBG | Proximity | MaxDegree.

Hep/15233/308
1% | 32.9 | 25.3 | 140.6
5% | 42.1 | 74.3 | 147.8
10% | 48.9 | 133.8 | 152.6

Email/36692/80
(percentage not given) | 6.2 | 43.7 | 72.7
(percentage not given) | 8.2 | 46.9 | 79.3
20% | 13.8 | 62.9 | 91.1

Email/36692/2631
(percentage not given) | 20.4 | 289.3 | 1208.8
(percentage not given) | 50.9 | 1067.6 | 1350.2
(percentage not given) | 68.4 | 1422.6 | 1683.8

Our algorithm performs the best. The third community, which is dense and has a large number of nodes, shows that our algorithm is robust and scalable.

Experiments

Experiments
Our algorithm performs best in all figures except Fig7(a). Reason: the network is sparse, and when the number of rumors is small it is possible that Proximity performs better than ours.
Proximity is better than MaxDegree in Fig7 and Fig8. Reason: the number of rumors is small and the network is sparse.
MaxDegree is better than Proximity in Fig9. Reason: the number of rumors is large and the network is dense.

Rumor Blocking problem under the OPOAO model
Given: the community structure, the rumor sources R, the rumor community, and the number of protectors k.
Goal: find k protectors such that the expected number of bridge ends protected is maximized.
Influence function σ(A) of node set A: the expected number of nodes that would be infected without protectors but are protected if A is selected as the initial protector seed set.

Property of Submodularity
PB(A): the set of nodes that can be protected by set A.
PB(A+v) - PB(A): the nodes that can be protected by A+v but cannot be protected by A.
[Figure: sets A and B and node v, with regions PB(A), PB(B), and PB(v).]
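For reference, the standard diminishing-returns form of submodularity, written for the cardinality function |PB(·)| used here, is given below. The symbol V for the full node set is an assumption introduced for this statement; the inequality itself is the usual textbook definition rather than a formula recovered from the slide.

```latex
% Submodularity (diminishing returns) of the cardinality function |PB(.)|:
% adding v to the smaller set A helps at least as much as adding it to B.
\[
  |PB(A \cup \{v\})| - |PB(A)| \;\ge\; |PB(B \cup \{v\})| - |PB(B)|
  \qquad \text{for all } A \subseteq B \subseteq V,\ v \in V \setminus B .
\]
```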

Main Results

Proof of Submodularity
Timestamp assignment of rumor diffusion. x.t: the influence spread of rumor x arrives at a node at step t.
[Figure: rumors x and y spreading over nodes u, v, w, and z, with edges labeled by timestamps x.1 to x.4 and y.1 to y.4.]

Proof of Submodularity
Prove the submodularity of the cardinality function |PB(A)|.
The nodes in PB(A) satisfy two conditions: they are infected if the set of protectors is empty, and they are not infected if the set of protectors is A.
Create a random rumor diffusion graph Gr and a random protector diffusion graph Gp.
Among the incoming edges of a bridge end u in Gr and Gp, find the oldest (earliest) timestamp in Gr and in Gp respectively and compare them: if the oldest one in Gp is older than the one in Gr, then u can be protected; otherwise u will be infected.

Example: determine whether u is protected or infected
[Figure: graph G, random rumor diffusion graph Gr, and random protector diffusion graph Gp over nodes r, p, w, and u, with edges labeled by timestamps; r: rumor, p: protector.]
Since p.1 is older than r.3, u is protected.
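The comparison in this example can be written as a small check. This is a sketch under assumptions: `protector_wins` is a hypothetical helper, the timestamp lists stand for the incoming edges of u in Gr and Gp, and equal timestamps are resolved in favor of the protector, following the tie-breaking rule in the common properties (the slide itself only states the strictly older case).

```python
from typing import Iterable


def protector_wins(rumor_timestamps: Iterable[int],
                   protector_timestamps: Iterable[int]) -> bool:
    """Return True if the earliest protector arrival at u is no later than
    the earliest rumor arrival at u."""
    protector_ts = list(protector_timestamps)
    rumor_ts = list(rumor_timestamps)
    if not protector_ts:
        return False  # the protector cascade never reaches u
    if not rumor_ts:
        return True   # the rumor never reaches u
    return min(protector_ts) <= min(rumor_ts)


# Timestamps chosen to mirror the example's conclusion: p.1 is older than r.3.
u_is_protected = protector_wins(rumor_timestamps=[3], protector_timestamps=[1, 3])
```

Note that a node belongs to PB(A) only if the rumor would have reached it without protectors, so the case with no rumor timestamps is outside PB(A) even though the check returns True.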

Submodularity of function σ(A)
Fact: a non-negative linear combination of monotone and submodular functions is still monotone and submodular.
Probabilities are non-negative and |PB(A)| is submodular, so σ(A) is submodular.
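Monotone submodular objectives are typically maximized under a size budget k with a greedy rule that adds the element of largest marginal gain. The sketch below shows that generic rule; `greedy_select` is a hypothetical name, `sigma` is any set function supplied by the caller, and the coverage function over the made-up sets in `PB` merely stands in for σ(A), which would in practice be estimated by simulation.

```python
from typing import Callable, Dict, Iterable, List, Set


def greedy_select(candidates: Iterable[str], k: int,
                  sigma: Callable[[List[str]], float]) -> List[str]:
    """Pick up to k elements, each time adding the largest marginal gain."""
    chosen: List[str] = []
    remaining = sorted(candidates)
    for _ in range(min(k, len(remaining))):
        base = sigma(chosen)
        best = max(remaining, key=lambda v: sigma(chosen + [v]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical stand-in for sigma: number of nodes covered by the chosen seeds.
PB: Dict[str, Set[int]] = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}


def coverage(seeds: List[str]) -> float:
    return float(len(set().union(*(PB[v] for v in seeds))))


seeds = greedy_select(PB, k=2, sigma=coverage)
```

On this toy instance the rule first takes c (gain 4) and then a (gain 3), covering all seven elements.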

A general result on greedy algorithm With non-integer potential function Consider a monotone increasing, submodular function Consider the following problem: where is a nonnegative cost function

Greedy Algorithm G

Theorem Suppose in Greedy Algorithm G, selected x always satisfies Then its p.r. where

Proof. Let be obtained by Greedy Algorithm G. Denote Let be an optimal solution. Denote

Note that There exists i such that

Let Let Note that So

Note Hence,

Experiments

Experiments
In Fig4, Fig5 and Fig6, our algorithm performs the best except in several early hops. Reason: the number of rumors is small.
Proximity is better than MaxDegree in Fig4, Fig5 and Fig6. Reason: the stochastic selection mechanism.
The difference between Proximity and MaxDegree in Fig4 is larger than that in Fig5 and Fig6. Reason: the network in Fig4 is sparse.

Conclusions
We introduce two influence diffusion models: Deterministic One Activate Many (DOAM) and Opportunistic One Activate One (OPOAO).
We study the least cost rumor blocking (LCRB) problem in the two models.
LCRB-D problem under the DOAM model: protect all the bridge ends. Algorithm: Set Cover Based Greedy (SCBG). Data: collaboration network and email network. Our algorithm is robust and scalable.
LCRB-P problem under the OPOAO model: protect an α fraction of the bridge ends. Influence function σ(A): submodularity. Method: timestamp assignment. Algorithm: Greedy. Data: collaboration network and email network. Our algorithm is robust and scalable.

Future Works: establish a continuous-time influence propagation model
In the real world, in most situations, influence diffuses in continuous time. Measure the diffusion time based on factors such as individual attributes, information properties, and strength of relations.

Future Works: study rumor blocking and influence diffusion under dynamic social structures
In most cases, the relations between individuals change over time; that is, social structures change over time. What results can we get for rumor blocking and influence diffusion in dynamic situations?

Future Works: detect rumor sources
Previous works on controlling rumor diffusion assume that the rumor sources are known. In reality, however, it is hard to know the rumor sources accurately. The aim is to estimate rumor sources accurately using existing information.
Knowing these sources is very important to network administrators in two respects: (1) it helps to understand the ultimate goals of the misinformation, whom the misleading information targets, and the potential size of the misinformation spread; and (2) it provides valuable insights into designing effective strategies for the containment campaign.

Thank you!