DSybil: Optimal Sybil-Resistance for Recommendation Systems

Presentation transcript:

DSybil: Optimal Sybil-Resistance for Recommendation Systems
Haifeng Yu, National University of Singapore
Chenwei Shi, National University of Singapore
Michael Kaminsky, Intel Research Pittsburgh
Phillip B. Gibbons, Intel Research Pittsburgh
Feng Xiao, National University of Singapore

Attacks on Recommendation Systems
• Netflix, Amazon, Razor, Digg, YouTube, …
• Attacker may cast misleading votes
• To be more effective
  - Bribe other users
  - Compromise other users
• Ultimate form: Sybil attack
Haifeng Yu, National University of Singapore

Sybil Attack
[Figure: a malicious user launches a sybil attack against honest users]
• Automated sybil attack for $147:
  - “Post at random intervals to make it look like real people”
  - “Supports multiple random proxies to make posts look like they came from visitors across the world”
  - “Multithreaded comment blaster with account rotation”
  - ……

Background: Defending Against Sybil Attack
• Tie identities to human beings based on credentials (e.g., passport)
  - Privacy concerns, etc.
• Resource challenges
  - Vulnerable to attacks from botnets
• Social-network-based defense
  - SybilGuard [SIGCOMM’06], SybilLimit [Oakland’08], SybilInfer [NDSS’09], SumUp [NSDI’09]
  - Better guarantees
• Sybil defense widely considered challenging: >1000 papers acknowledging sybil attack, most without having a solution

Rec Systems Are More Vulnerable
# sybil identities we can tolerate (n identities total):
  Byzantine consensus: n/3
  DHT: n/4
  ……
  Recommendation systems: n/500
• On an avg Digg object, only 1 out of every 500 honest users votes
• n/500 sybil identities are sufficient to out-vote the honest voters
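To make the n/500 figure concrete, here is a small arithmetic sketch. The population of 500,000 is an assumption borrowed from the half-million-user Digg dataset mentioned later in the talk, and the function names are illustrative only.

```python
def honest_votes_per_object(n_identities: int, one_in: int = 500) -> int:
    """Expected honest votes on an average object if 1 in `one_in` users votes."""
    return n_identities // one_in


def sybils_needed_to_outvote(n_identities: int, one_in: int = 500) -> int:
    """Sybil votes needed to strictly exceed the honest votes on one object."""
    return honest_votes_per_object(n_identities, one_in) + 1


n = 500_000  # assumed population size, not a figure from this slide
print(honest_votes_per_object(n))    # honest votes on an average object
print(sybils_needed_to_outvote(n))   # sybil votes that out-vote them
print(n // 3, n // 4, n // 500)      # tolerances: consensus, DHT, rec systems
```

With these assumptions, roughly a thousand sybil votes decide an object, while Byzantine consensus over the same population would tolerate well over a hundred thousand sybil identities.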

Social-network-based Defenses Not Sufficiently Strong For Rec Systems
• Lower bound on all social-network-based approaches
  - Applicable to SybilGuard, SybilLimit, SybilInfer, SumUp, etc.
  - Compromising a degree-10 node creates 10 sybil identities
• To create n/500 sybil identities: compromising 1 node out of every 5000 honest nodes is sufficient
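The "1 out of every 5000" figure follows from dividing the n/500 sybil identities by the 10 identities gained per compromised degree-10 node. A minimal sketch of that arithmetic, with a hypothetical function name and an assumed population of 500,000:

```python
def nodes_to_compromise(n_honest: int, one_in: int = 500, degree: int = 10) -> int:
    """Compromised nodes needed to obtain n/one_in sybil identities,
    assuming (as on the slide) each compromised node of the given degree
    yields `degree` sybil identities."""
    sybils_needed = n_honest // one_in
    return -(-sybils_needed // degree)  # ceiling division


n = 500_000  # assumed population size
k = nodes_to_compromise(n)
print(k)       # compromised nodes needed
print(n // k)  # i.e. one node out of every this many honest nodes
```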

Alternative: Leverage History and Trust
• Ancient idea: Adjust “trust” to an identity based on its historical behavior
• Numerous heuristics proposed; they target a few fixed attack strategies
  - No guarantees beyond the few strategies targeted
  - Attacker is intelligent and will adapt → arms race

Our Results
• DSybil: A novel defense mechanism
  - Based on feedback and trust
• Loss (# of bad recommendations) is provably O(D log M) even under worst-case attack
  - D: Dimension of the objects (< 10 in Digg)
  - M: Max # of sybil identities voting on each obj
• We prove that DSybil’s loss is optimal
• Experimental results (from 1-year Digg trace):
  - High-quality recommendation under potential sybil attack (with optimal strategy) from million-node botnet

Outline
• Background and our contribution
• Trust-based approaches: the obvious, the subtle, and the challenge
• Main component of DSybil: DSybil’s recommendation algorithm
• Experimental results

Subtle Aspects of Using Trust
1. How to identify “correct” but “non-helpful” votes?
  - A vote for a good object that already has many votes: this additional vote is “non-helpful”
  - Sybil identities may gain trust for free
  - Determining the “contribution amount” by voting order does not work (see paper)
[Figure: a sybil identity votes “this obj is good!” on a good object and gains trust for free]

Subtle Aspects of Using Trust
2. How to assign initial trust to new identities?
  - Positive initial trust for all: Invites whitewashing
  - “Trial period of 5 votes” not effective: cast 5 “correct” votes and then cheat
3. How exactly to grow trust?
  - Multiplicatively? Additively?
4. How exactly to make recommendations?
  - Pick obj with most votes? Probabilistically? How about negative votes? …

The Central Challenge
• Numerous design choices, with a fundamental tension between
  - Giving trust to honest identities
  - Not giving trust to sybil identities casting “correct” votes (who may cause damage later)
• Impossible to explore all design alternatives
• Our approach: Directly design an optimal algorithm
  - Needs to strike the optimal balance

DSybil’s Key Insights
• Key #1: Leverage typical voting behavior of honest users
  - Heavy-tail distribution
  - There exist very active users who cast many votes
• Key #2: If user is already getting “enough help”, then do not give out more trust
  - Enables us to strike an optimal balance
[Figure: % of users casting x votes vs. # votes cast (on various objs), all log-scale]

System Model and Attack Model
• Objects to be recommended are either good or bad (e.g., Digg)
• DSybil is personalized
  - Each user may have different subjective opinions
  - Different users may get different recommendations
• From now on, always with respect to a user Alice
  - Run by either Alice or a central server

System Model and Attack Model
• Each round has a pool of objects
• DSybil recommends one object for Alice to consume
• Alice provides feedback after consumption
• DSybil adjusts trust based on feedback
• See paper for generalizations…
[Figure: a pool with 2 good objs and 2 bad objs; DSybil does not know which are good]

System Model and Attack Model
• Other identities have cast votes
• DSybil only uses positive votes
  - We prove that using negative votes will not help…
• Each identity casts at most one vote per object
• At most M sybil identities voting on each object
[Figure: votes from identities E, F, G, H on a pool of 2 good objs and 2 bad objs]

DSybil Rec Algorithm: Classifying Objects
• Reminder: Trust is always with respect to Alice (how much Alice “trusts” the given identity)
• Each identity starts with initial trust 0.2 (fix later…)
• An object is overwhelming if total trust ≥ C
  - C = 1.0
[Figure: pool of 2 good objs and 2 bad objs; voters such as E, F, H each have trust 0.2, and each object shows the total trust of its voters (e.g., 0.4, 0.2)]
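The classification step can be sketched in a few lines. This is only an illustration of the rule on the slide (total trust of an object's voters compared against C = 1.0); the vote layout, the default trust of 0.2 taken from the figure, and all names are assumptions rather than the authors' code.

```python
from typing import Dict, Iterable, List

C = 1.0              # overwhelming threshold, as on the slide
INITIAL_TRUST = 0.2  # assumed default, matching the values in the figure


def total_trust(voters: Iterable[str], trust: Dict[str, float]) -> float:
    """Sum Alice's trust in every identity that voted for one object."""
    return sum(trust.get(v, INITIAL_TRUST) for v in voters)


def overwhelming(votes: Dict[str, List[str]], trust: Dict[str, float],
                 threshold: float = C) -> List[str]:
    """Return the objects whose voters' total trust reaches the threshold."""
    return [obj for obj, voters in votes.items()
            if total_trust(voters, trust) >= threshold]


# Hypothetical pool: object name -> identities that voted for it.
votes = {"obj1": ["E", "H"], "obj2": ["F"], "obj3": ["G"]}
print(overwhelming(votes, {}))          # nobody has earned trust yet
print(overwhelming(votes, {"E": 1.0}))  # E's trust alone makes obj1 overwhelming
```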

Rounds without Overwhelming Objects
1. Recommend: Uniformly random obj
  - Recommending the obj with largest total trust would result in linear (instead of logarithmic) loss…
2. Adjust trust after feedback:
  - If obj bad, multiply trust of voters by α, where 0 ≤ α < 1
  - If obj good, multiply trust of voters by β, where β > 1
  - Additive increase would result in linear (instead of logarithmic) loss…
[Figure: pool of 2 good objs and 2 bad objs; the trust of voters E, F, G (initially 0.2 each) is updated after feedback]
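A minimal sketch of a round without overwhelming objects, assuming concrete values for the two multiplicative factors (named ALPHA and BETA here; 0.5 and 2.0 are illustrative choices, not values from the slides) and a default trust of 0.2 as in the figure:

```python
import random
from typing import Dict, List

ALPHA = 0.5          # assumed penalty factor; any value in [0, 1) fits the rule
BETA = 2.0           # assumed reward factor; any value > 1 fits the rule
INITIAL_TRUST = 0.2  # assumed default trust, as in the figure


def recommend_uniform(pool: List[str], rng: random.Random) -> str:
    """Pick a uniformly random object when none is overwhelming."""
    return rng.choice(pool)


def adjust_trust(trust: Dict[str, float], voters: List[str], is_good: bool) -> None:
    """Multiply the trust of the consumed object's voters after Alice's feedback."""
    factor = BETA if is_good else ALPHA
    for v in voters:
        trust[v] = trust.get(v, INITIAL_TRUST) * factor


trust: Dict[str, float] = {}
picked = recommend_uniform(["obj1", "obj2", "obj3"], random.Random(0))
adjust_trust(trust, ["E", "H"], is_good=True)   # E and H voted for a good object
adjust_trust(trust, ["H"], is_good=False)       # H later voted for a bad object
print(picked, trust)
```

The multiplicative update is the point of the sketch: a guide reaches the threshold after a logarithmic number of good votes, which is what the slide contrasts with additive growth.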

Defining Guides and Dimension
• Guides: Honest users with the same/similar “opinion” as Alice
  - Never/seldom vote for bad objects
• Dimension: # of guides needed to “cover” a large fraction (e.g., 60%) of the good objects; these guides are called critical guides
• DSybil does not know who the guides (critical guides) are or what the dimension is
[Figure: guides X, Y, Z, W voting on good objects. Dimension = 2; critical guides = {X, Y} or {X, W}]
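One way to get a feel for the dimension is a greedy set-cover pass over a vote trace. Note that this is a swapped-in heuristic, not the measurement procedure from the talk: greedy selection only gives an upper bound on the true dimension, and the vote layout below is a hypothetical one chosen to resemble the figure.

```python
from typing import Dict, List, Set


def greedy_guides(guide_votes: Dict[str, Set[str]], good_objects: Set[str],
                  coverage_percent: int = 60) -> List[str]:
    """Greedily pick guides until the requested share of good objects is covered.

    The length of the result is an upper bound on the dimension. If the
    target cannot be reached, the guides chosen so far are returned.
    """
    target = -(-coverage_percent * len(good_objects) // 100)  # ceiling division
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(guide_votes)
    while len(covered) < target and remaining:
        # Sorting makes tie-breaking deterministic (first name wins).
        best = max(sorted(remaining),
                   key=lambda g: len((remaining[g] & good_objects) - covered))
        gain = (remaining.pop(best) & good_objects) - covered
        if not gain:
            break  # no remaining guide covers anything new
        covered |= gain
        chosen.append(best)
    return chosen


# Hypothetical votes: X covers two good objects, Y and W the same third one.
guide_votes = {"X": {"o1", "o2"}, "Y": {"o3"}, "W": {"o3"}, "Z": {"o4"}}
good = {"o1", "o2", "o3", "o4", "o5"}
print(greedy_guides(guide_votes, good))  # two guides suffice for 60% coverage
```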

Key #1: Leverage Small Dimension
• Dimension is typically small in practice (results later…)
• Small dimension → will encounter critical guides frequently when picking random objects
  - Trust to critical guides quickly grows to C
  - This will result in overwhelming objects…

Rounds with Overwhelming Objects
1. Recommend: Arbitrary overwhelming obj
  - Will confiscate sufficient trust if object is bad…
2. Adjust trust after feedback:
  - If obj bad, multiply trust of the voters by α, where 0 ≤ α < 1
  - If obj good, no additional trust given out
[Figure: an overwhelming object with voters E: 1.0 and H: 0.2 (total 1.2) turns out to be good; trust to E stays 1.0 → 1.0 and trust to H stays 0.2 → 0.2]
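The asymmetric update for overwhelming rounds can be sketched as follows. The penalty factor of 0.5 and the default trust of 0.2 are assumptions for illustration; the E and H values mirror the figure.

```python
from typing import Dict, List


def adjust_trust_overwhelming(trust: Dict[str, float], voters: List[str],
                              is_good: bool, alpha: float = 0.5,
                              initial_trust: float = 0.2) -> None:
    """Update trust after Alice consumes an overwhelming object.

    Good object: Alice already has sufficient help, so nobody gains trust.
    Bad object: every voter's trust is multiplied by alpha (assumed 0.5 here).
    """
    if is_good:
        return
    for v in voters:
        trust[v] = trust.get(v, initial_trust) * alpha


trust = {"E": 1.0, "H": 0.2}
adjust_trust_overwhelming(trust, ["E", "H"], is_good=True)
print(trust)  # unchanged: no trust is handed out on a good overwhelming object

adjust_trust_overwhelming(trust, ["E", "H"], is_good=False)
print(trust)  # both voters are penalized after a bad recommendation
```

In this example the voters' total trust drops from 1.2 to 0.6 after the bad object, so the same set of voters can no longer make an object overwhelming on their own.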

Key #2: Identify Whether Help is Sufficient
• Alice consumes a good overwhelming object = Alice already has “sufficient help”
  - Thus do not give out additional trust
  - Prevents sybil identities from getting trust “for free”
  - May hurt honest identities (but remember this is optimal…)

Omitted Details
• No “free” initial trust given out when Alice is getting “sufficient help”
• Proof for O(D log M) loss even under worst-case attack
• Optimality
• Alternative designs/tweaks
  - Most will break optimality

Results on Dimension
• One-year Digg dataset with half-million users
• Pessimistically assuming guides are only 2% of the honest users (see paper for other settings…)
• To cover 60% of good objs, need only 3 guides
• Robustness:
  - Remove the previous 3 guides: 5 guides needed to cover 60%
  - Remove top 100 heaviest voters: 5 guides needed to cover 60%
  - See paper for more…
• Relates to heavy-tail distribution of votes cast by individual users (see paper)
  - There exist very active users who cast many votes
  - Similar heavy-tail distribution observed in 4 other datasets

Results on Loss (Based on Digg Dataset)
• Attack capacity: Max 10 billion sybil voters on any obj
  - In Digg, avg # honest voters on each obj is only ~1,000
  - 1-minute computational puzzle per week
  - 10 billion identities need a million-node botnet
• Fraction of bad recommendations (under worst-case attack): 12%
• Growing defense: 5% if user has used DSybil for a week before attack starts
  - If attack starts at a random point, applies to 51/52 = 98% of users
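The botnet and 51/52 figures can be checked with a few lines of arithmetic. The sketch assumes that each identity must solve one 1-minute puzzle per week, that a botnet node solves puzzles back to back at the reference speed, and that the attack start is uniform over the 52 weeks of a year; these readings of the slide are assumptions.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes


def identities_sustainable(botnet_nodes: int, puzzle_minutes: int = 1) -> int:
    """Identities a botnet can keep alive if each needs one puzzle per week."""
    return botnet_nodes * (MINUTES_PER_WEEK // puzzle_minutes)


def share_with_week_of_history(weeks_in_year: int = 52) -> float:
    """Share of attack start times that leave the user at least one week of use."""
    return (weeks_in_year - 1) / weeks_in_year


print(identities_sustainable(1_000_000))      # just above 10 billion
print(round(share_with_week_of_history(), 3))  # roughly 98%
```

Under these assumptions a million-node botnet sustains about 10.08 billion identities, which is consistent with the 10 billion attack capacity used in the experiments.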

Conclusion
• Defending against sybil attacks is challenging
  - It is even harder in the context of rec systems
• DSybil: Provable and optimal loss
  - Almost no previous approaches provide provable guarantees against worst-case attack
• DSybil key insights:
  - Leverage small dimension of the voting pattern
  - Carefully identify when help is already “sufficient”

Which object to pick?

Central Question Answered by This Work
• Can trust sufficiently diminish the influence of sybil identities in recommendation systems?
  - Aim for provable guarantees under all attack strategies (including worst-case attack from intelligent attacker)
• Short answer: YES!

Our Results
• DSybil: A novel defense mechanism
• Growing defense: If the user has used DSybil for some time before the attack starts, loss will be even smaller
• Experimental results (from one-year trace of Digg): High-quality recommendation even under potential sybil attack (with optimal strategy) from a million-node botnet