An Analysis of Social Network-Based Sybil Defenses Sybil Defender Sybil examples Wei Wei∗, Fengyuan Xu∗, Chiu C. Tan†, Qun Li∗ ∗The College of William and Mary
Table of Contents: Introduction to Sybil attack; Sybil defense mechanisms; Sybil Defender algorithm; Limiting the number of attack edges; Evaluation of the algorithm; Limitations of Sybil defense schemes; Comparison of algorithms; Performance comparison of generated node rankings; Performance comparison for detection of Sybils
Schema of Sybil Attack (figure: Sybil identities casting votes in a reputation system over Internet traffic)
Introduction to Sybil attack Malicious attackers can create multiple identities and influence the working of systems that rely upon open membership. Avoiding multiple identity, or Sybil, attacks is known to be a fundamental problem in the design of distributed systems.
Delivery systems -Examples
Recommendation system
Traditional defenses rely on trusted identities provided by a certification authority. Disadvantage: requiring users to present trusted identities runs counter to open membership.
Solutions: Rely on the structure of the social network rather than on verifying real users, so no central trusted identities are required. All Sybil defense schemes rank nodes similarly: nodes within the local community around the trusted node are ranked higher than nodes in the rest of the network.
Problem Analysis: We look at the core of each scheme: ranking nodes based on how well they are connected to a trusted node.
Assumptions: The attacker cannot establish an arbitrarily large number of social connections to non-Sybil nodes. The honest region is fast mixing. Sybil nodes have to cross a small cut between the regions. The network contains at least 1 honest node.
Synthetic Network two densely connected communities of 256 nodes each
Sybil Defender algorithm: Based on random walks on the graph (a random walk is the sequence of moves of a particle between nodes of G). The defender detects Sybil nodes and the Sybil community close to the theoretical bound. Each user is a node, and 2 users share a link if there is a relationship between them. A Sybil entity controls multiple nodes; an honest user controls 1 node. The Sybil community consists of all the Sybil nodes.
Sybil Defender algorithm: 3 components: (1) Sybil identification algorithm, (2) Sybil community detection algorithm, (3) limiting the number of attack edges.
Definitions: Frequency of a node: the number of times the node is traversed by a set of random walks. Random walk on a graph: the sequence of moves of a particle between nodes of G.
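The two definitions can be illustrated with a minimal sketch. The adjacency-list graph, the function names, and the fixed seed are assumptions made for illustration and do not come from the slides.

```python
import random
from collections import Counter

def random_walk(graph, start, length, rng):
    """Return the nodes visited by one walk of `length` steps from `start`."""
    node = start
    visited = [node]
    for _ in range(length):
        neighbors = graph[node]
        if not neighbors:  # dead end: stop the walk early
            break
        node = rng.choice(neighbors)
        visited.append(node)
    return visited

def node_frequencies(graph, start, num_walks, length, seed=0):
    """Count how many times each node is traversed by a set of walks."""
    rng = random.Random(seed)
    freq = Counter()
    for _ in range(num_walks):
        freq.update(random_walk(graph, start, length, rng))
    return freq

# Hypothetical toy graph as adjacency lists.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
freq = node_frequencies(graph, start=0, num_walks=100, length=5)
```

Here every visit counts toward the frequency, including the starting node of each walk.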
Algorithm 1: log n -> walk length tied to the fast-mixing assumption. Lmax -> large enough for R random walks to cover the region.
Algorithm 2: Mean -> the average number of nodes whose frequency reaches the threshold t
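A hedged sketch of how the mean could be used to judge a suspect node. It assumes that pre-processing yields, for several honest nodes, the number of nodes whose frequency reaches a threshold, and that a suspect is flagged when its own count falls more than alpha standard deviations below the mean. The function names, the use of the population standard deviation, and the sample numbers are illustrative assumptions.

```python
import statistics

def summarize(counts):
    """Mean and (population) standard deviation of the honest-node counts."""
    return statistics.mean(counts), statistics.pstdev(counts)

def looks_sybil(m, mean, std_dev, alpha):
    """Flag the suspect when its count m is far below the honest mean."""
    return mean - m > alpha * std_dev

# Hypothetical counts from pre-processing runs started at honest nodes.
honest_counts = [100, 110, 90, 105, 95]
mean, std_dev = summarize(honest_counts)

print(looks_sybil(40, mean, std_dev, alpha=5))   # count well below the mean
print(looks_sybil(98, mean, std_dev, alpha=5))   # count close to the mean
```

The intuition is that walks starting inside a Sybil region tend to stay behind the small cut, so they cover fewer distinct nodes at high frequency than walks starting in the honest region.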
Results of pre-processing
Limiting the Number of Attack Edges: The theoretical bound on the number of Sybil nodes that cannot be detected is O(log n). Users rate their relationships (friend or stranger), and the relationships rated as stranger are removed from the social graph before the Sybil defense schemes are applied. Build an activity network based on the interactions between users. Two nodes share an edge in an activity network if and only if they have interacted directly through the communication mechanisms or applications provided by the corresponding social network.
Examples of defenses to limit the attackers: CAPTCHA, email verification, IP address, Social Security number, copy of ID
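Both ideas can be sketched in a few lines. The input formats (a list of edges, a dictionary of ratings keyed by unordered pair, and a log of pairwise interactions) and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def drop_stranger_edges(edges, ratings):
    """Remove every edge whose relationship was rated as 'stranger'."""
    return [(u, v) for u, v in edges
            if ratings.get(frozenset((u, v))) != "stranger"]

def build_activity_network(interactions):
    """Two nodes share an edge iff they interacted directly at least once."""
    adj = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj

# Hypothetical data.
edges = [("alice", "bob"), ("alice", "mallory")]
ratings = {frozenset(("alice", "mallory")): "stranger"}
print(drop_stranger_edges(edges, ratings))

log = [("alice", "bob"), ("bob", "carol"), ("alice", "bob")]
print(dict(build_activity_network(log)))
```

Repeated interactions between the same pair collapse into a single edge because neighbors are stored in sets.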
Evaluation parameters
L0 = 1000, Lmin = 100, Lmax = 10000
T = 5, alpha = 20, Ls = 20, F = 100
R ∈ {1000, 1500, 2000}
Number per attack = 1000
F+ -> percentage of honest nodes detected as Sybil
F- -> percentage of Sybil nodes detected as honest
Sybil region = 10000 nodes
Each point is the average of 20 experiments
Phi = mean - alpha * stdDeviation = t
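The two error measures F+ and F- can be computed as in the following sketch. The dictionary-based inputs and the function name are assumptions for illustration.

```python
def error_rates(is_sybil, flagged):
    """Return (F+, F-) in percent.

    is_sybil: node -> True if the node really is a Sybil
    flagged:  node -> True if the defense detected the node as Sybil
    """
    honest = [n for n, s in is_sybil.items() if not s]
    sybils = [n for n, s in is_sybil.items() if s]
    false_pos = sum(1 for n in honest if flagged[n])      # honest detected as Sybil
    false_neg = sum(1 for n in sybils if not flagged[n])  # Sybil detected as honest
    return 100 * false_pos / len(honest), 100 * false_neg / len(sybils)

# Hypothetical ground truth and detection output.
truth = {"a": False, "b": False, "c": False, "d": False, "s1": True, "s2": True}
output = {"a": True, "b": False, "c": False, "d": False, "s1": True, "s2": False}
f_plus, f_minus = error_rates(truth, output)
print(f_plus, f_minus)  # 25.0 50.0
```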
EVALUATION: We evaluate the effectiveness of Sybil Defender using 2 data sets, the largest data sets used to evaluate Sybil defenses. 20% rate of confirming a fake friend.
                 Facebook       Orkut
Nodes            3,097,165      3,072,441
Edges            28,377,481     117,185,083
Average degree   18.32          76.28
In the experiments we use 2 models to construct the Sybil regions: the preferential attachment (PA) model and the Erdős-Rényi (ER) model.
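As a quick consistency check, the reported average degrees follow from the node and edge counts, since each undirected edge contributes to the degree of two nodes. The function name is an assumption for illustration.

```python
def average_degree(num_nodes, num_edges):
    """Average degree of an undirected graph: 2 * |E| / |V|."""
    return 2 * num_edges / num_nodes

facebook = average_degree(3_097_165, 28_377_481)
orkut = average_degree(3_072_441, 117_185_083)
print(round(facebook, 2), round(orkut, 2))  # 18.32 76.28
```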
Orkut
Compare originating PA Model Assumption: the existence of a small cut between the honest region and the Sybil region.
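For reference, the PA and ER models used to build Sybil regions can be sketched as follows. This is a generic textbook-style construction with assumed parameter names (n, p, m) and an assumed starting clique. It is not necessarily the exact generator used in the evaluation.

```python
import random

def erdos_renyi(n, p, seed=0):
    """ER model: each of the n*(n-1)/2 possible edges exists with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

def preferential_attachment(n, m, seed=0):
    """PA model: each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    pool = []  # each node appears once per incident edge (degree-weighted)
    # Assumed starting point: a clique on m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.append((u, v))
            pool += [u, v]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            edges.append((target, new))
            pool += [target, new]
    return edges

sybil_region_pa = preferential_attachment(n=50, m=3)
sybil_region_er = erdos_renyi(n=50, p=0.1)
```

The PA construction produces a few high-degree hubs, while the ER construction gives every node roughly the same expected degree.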
Compare per attack
Sybil limit VS Sybil defender
Sybil limit result
Running time
                   Sybil Limit (R=2000)   Sybil Defender (R=2000)
one Sybil node     11.56 seconds          0.87 seconds
one honest node    83.55 seconds          7.11 seconds
Sybil Limit invokes a large number (r = 10000 for our Facebook data set) of instances of the random route generation protocol. Sybil Defender relies only on performing a limited number of random walks.
Comparing algorithm defenses Each algorithm has been shown to work well under its own assumptions about the structure of the social network and the links connecting non-Sybil and Sybil nodes.
Comparing approach 1: View the schemes as complete, coherent proposals (treat them as "black boxes"). Pros: provides useful performance comparisons between a fixed configuration of schemes over a given set of social networks and Sybil attack strategies. Cons: does not yield conclusive information on how a particular scheme would perform if either the social network or the behavior of the attacker changed, and does not allow us to derive any fundamental insights into how these schemes work.
Comparing approach 2: Find a core insight common to all the schemes that explains their performance in any setting. Pros: provides guidance on improving future designs and also sheds light on the limits of social network-based Sybil defense. Cons: the schemes must be reduced to their core task before they can be analyzed.
How the schemes work: The schemes attempt to isolate Sybils embedded within a social network topology. Every scheme declares nodes in the network as either Sybils or non-Sybils from the perspective of a trusted node, effectively partitioning the nodes in the social network into two distinct regions (non-Sybils and Sybils).
Balanced graph partition: The problem is NP-hard (NP-complete if the graph is degree-balanced). The (k,v) balanced partition problem is to partition the graph into k components of size at most v·(n/k) while minimizing the capacity of the edges between separate components.
Graph partition methods: Local methods for finding a graph partition include the Kernighan-Lin and Fiduccia-Mattheyses algorithms. Usage of these methods
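The objective and the size constraint of the (k,v) partition problem can be made concrete with a small checker. The weighted edge-list format and the function names are assumptions for illustration. The sketch only evaluates a given partition and does not search for an optimal one.

```python
from collections import Counter

def cut_capacity(edges, part_of):
    """Total capacity of the edges whose endpoints lie in different components."""
    return sum(w for u, v, w in edges if part_of[u] != part_of[v])

def is_balanced(part_of, k, v):
    """True if there are at most k components, each of size at most v * (n / k)."""
    n = len(part_of)
    sizes = Counter(part_of.values())
    return len(sizes) <= k and all(s <= v * n / k for s in sizes.values())

# Hypothetical 4-node graph: a cycle plus one heavy chord between nodes 0 and 2.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 5)]
keep_chord = {0: 0, 2: 0, 1: 1, 3: 1}   # the heavy edge stays inside a component
split_chord = {0: 0, 1: 0, 2: 1, 3: 1}  # the heavy edge is cut

print(cut_capacity(edges, keep_chord), cut_capacity(edges, split_chord))  # 4 7
print(is_balanced(keep_chord, k=2, v=1.0))  # True
```

Both partitions satisfy the size constraint, so the one with the smaller cut capacity is the better solution.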
Sybil Community Detection Algorithm
Examples of defenses schemes
Data set evaluation: Area under the ROC curve: the probability that a Sybil defense scheme ranks a randomly selected Sybil node lower than a randomly selected non-Sybil node. Conductance: a metric for evaluating the quality of communities (lower numbers indicate stronger communities). Mutual information: measures the similarity of two partitions of a set (0 = no correlation, 1 = perfect match).
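The first two metrics can be sketched directly from their descriptions. The score convention (a higher score means a higher rank), the tie handling, and the use of the standard conductance formula, cut / min(vol(S), vol(rest)), are assumptions. The evaluation behind these slides may use a normalized variant.

```python
def ranking_auc(scores, is_sybil):
    """Probability that a random Sybil is ranked lower than a random
    non-Sybil (ties count as one half)."""
    sybils = [scores[n] for n in scores if is_sybil[n]]
    honest = [scores[n] for n in scores if not is_sybil[n]]
    wins = sum((s < h) + 0.5 * (s == h) for s in sybils for h in honest)
    return wins / (len(sybils) * len(honest))

def conductance(adj, community):
    """Edges leaving the community divided by the smaller of the two volumes."""
    community = set(community)
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    return cut / min(vol_in, vol_out)

# Hypothetical scores: Sybil s2 outranks honest node b.
scores = {"a": 0.9, "b": 0.8, "s1": 0.1, "s2": 0.85}
labels = {"a": False, "b": False, "s1": True, "s2": True}
print(ranking_auc(scores, labels))  # 0.75

# Two triangles joined by a single edge (2-3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(conductance(adj, {0, 1, 2}))  # 1/7
```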
Limitations of Sybil Defense - Impact of Social Network Structure Synthetic Network
Limitations of Sybil Defense - Impact of Social Network Structure
Limitations of Sybil Defense - Targeted Sybil Attacks: Sybil defense schemes assume that attackers (Sybils) establish links to randomly selected nodes in the network. To measure the performance of Sybil defense schemes under targeted attacks, the attackers are given more control over link placement: they link to the k nodes closest to the trusted node. As Sybil links get closer to the trusted node, Sybil nodes are ranked higher than non-Sybil nodes.
Community Detection (CD) Algorithms: A class of algorithms that has been widely explored and studied, so we can use it to detect the local community. We use the algorithm of Mislove, which starts from 1 or 2 given seed nodes and iteratively visits their neighboring nodes. We compare its node ranking with those of existing Sybil defense schemes to determine whether it can defend against Sybils with similar accuracy.
Comparison of Generated Rankings (Synthetic Network): The similarity of the generated partitions and the quality of the communities are highest at a partition size of 256.
Comparison of Generated Rankings (Real World Networks): Facebook Network, Astrophysics Network. Nodes that are tightly connected around a trusted node are more likely to be ranked higher. When multiple nodes are similarly well connected to the trusted node, they are often ranked differently by different algorithms.
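One way to pick the targets of such an attack is a breadth-first search from the trusted node, which discovers nodes in order of increasing hop distance. The adjacency-list format and the function name are assumptions for illustration.

```python
from collections import deque

def k_closest(adj, trusted, k):
    """Return up to k nodes (excluding the trusted node) in BFS order,
    i.e. in order of non-decreasing hop distance from the trusted node."""
    order = []
    seen = {trusted}
    queue = deque([trusted])
    while queue and len(order) < k:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
                if len(order) == k:
                    break
    return order

# Hypothetical graph: a path 0-1-2-3-4 plus an extra neighbor 5 of node 0.
adj = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [0]}
targets = k_closest(adj, trusted=0, k=3)
print(targets)  # [1, 5, 2]

# Each Sybil then places its attack edge on one of the targets.
attack_edges = list(zip(["s1", "s2", "s3"], targets))
```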
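A minimal sketch of a seed-based local community ranking. It assumes a greedy rule (always add the neighboring node that yields the lowest conductance) and uses the order of addition as the ranking. This is an illustration of the general idea, not a confirmed reproduction of Mislove's procedure.

```python
def conductance(adj, community):
    """Cut edges divided by the smaller volume (0.0 if one side is empty)."""
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0

def greedy_ranking(adj, seeds):
    """Grow a community from the seeds; earlier additions rank higher."""
    community = set(seeds)
    ranking = list(seeds)
    while len(community) < len(adj):
        frontier = {v for u in community for v in adj[u]} - community
        if not frontier:  # remaining nodes are unreachable from the seeds
            break
        # Lowest resulting conductance wins; node id breaks ties.
        best = min(frontier,
                   key=lambda v: (conductance(adj, community | {v}), v))
        community.add(best)
        ranking.append(best)
    return ranking

# Two triangles joined by a single edge (2-3); the seed is node 0.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(greedy_ranking(adj, seeds=[0]))  # [0, 1, 2, 3, 4, 5]
```

In the toy graph the seed's own triangle is ranked first, which mirrors the observation that nodes in the local community around the trusted node receive the highest ranks.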
Performance comparison for Sybil Detection Synthetic Network Facebook Network