A Robust and Efficient Reputation System for Active Peer-to-Peer Systems Dominik Grolimund, Luzius Meisser, Stefan Schmid, Roger Wattenhofer Computer Engineering and Networks Laboratory (TIK), ETH Zurich NetEcon’06 June 10, Ann Arbor, Michigan, USA Havelaar Distributed Computing Group
2 / 26 Talk Outline: Environment, Existing Solutions, Principles of Havelaar, Evaluation, Conclusions
3 / 26 Do You Know YouTube? Very popular online video platform: > 30 million users, growing rapidly; >> 1 million videos watched every day; >> 10,000 uploaded every day. Very active!
4 / 26 Guess What: YouTube is Centralized. Hosted on servers. Simple, but: huge costs ($1 million / month for bandwidth and storage), low quality, limited (10-minute clips).
5 / 26 Imagine YouTube Being Decentralized. Files stored in a distributed storage system. Resources provided by the users. Uncontrollable environment: unreliable, ordinary desktop computers; private users turn their computers on and off at any time and can leave the system forever at any time; open, so it attracts malicious agents (attacks) and rational agents (free-riders).
6 / 26 Three Key Problems. 1. Availability: How can the data be made immediately accessible when requested, although users can turn off their computers at any time? 2. Reliability: How can the data be stored persistently, despite the inherent dynamics, node departures, and malicious nodes? 3. Incentives (focus of this talk): How can rational agents be encouraged to provide their resources without free-riding?
7 / 26 Kangoo – A Distributed Storage System. Research at ETH Zurich. Availability achieved with redundancy: a file is divided into ~100 blocks, which are then encrypted and encoded into ~500 redundant fragments using erasure codes. Any 100 fragments are sufficient to reconstruct the file. Lots of transactions necessary! YouTube-like usage would result in tens of thousands of transactions per peer per week. Not ready yet, but you can subscribe for the beta:
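The redundancy figures on this slide can be turned into a small worked calculation. The sketch below computes the storage overhead (500 / 100 = 5x) and the probability that at least 100 of 500 fragments are reachable. It assumes, purely for illustration, that each fragment is online independently with the same probability p; neither that independence assumption nor the function name `reconstruct_probability` comes from the talk.

```python
from math import comb

def reconstruct_probability(n, k, p):
    """Probability that at least k of n fragments are available.

    Assumption (not from the talk): every fragment is online
    independently with probability p, so the count is binomial.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Figures from the slide: ~100 blocks encoded into ~500 fragments.
BLOCKS, FRAGMENTS = 100, 500
overhead = FRAGMENTS / BLOCKS  # storage blow-up factor: 5.0

for p in (0.15, 0.20, 0.25, 0.30):
    prob = reconstruct_probability(FRAGMENTS, BLOCKS, p)
    print(f"p = {p:.2f}: file reconstructible with probability {prob:.6f}")
```

Under this model the availability rises steeply once p exceeds the ratio 100 / 500 = 0.2, which is one way to see why a 5x redundancy factor tolerates peers that are offline most of the time.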
8 / 26 This Talk: Havelaar. How to encourage peers to provide their upload bandwidth? (Storage and online time are handled by Kangoo itself.) Havelaar is independent of Kangoo and can be used for other systems as well. Robust to attacks. Efficient: scalable in the number of transactions.
10 / 26 Existing Solutions. Direct reciprocity (e.g. BitTorrent): tit-for-tat, iterated prisoner's dilemma. Works for content distribution, but not for a system where interactions are too infrequent. Monetary-based (e.g. Karma): economic theory. But: centralized or else inefficient, market regulations, ... Reputation systems (e.g. eBay): service differentiation. The higher your reputation, the better your service. Good, but how...
11 / 26 Reputation Systems. How to keep track of the contribution of each peer? Client (e.g. Kazaa): easy to subvert, as shown by Kazaa Lite. Centralized (e.g. eBay): far more transactions if used for fairness in a P2P system, so a server cluster would be needed. Decentralized: good, but how...
12 / 26 Decentralized Reputation Systems. Direct observations do not scale to large networks with infrequent interactions. We need to incorporate second-hand observations. Big new problem: false reports.
13 / 26 Coping with False Reports. How to defend against false reports? Max-flow. Maximum likelihood estimation. Bayesian approach. Transitivity of trust: weight each vote by the reputation of the sender. Most systems are designed for a decentralized "pure reputation system" (e.g. eBay), but are not meant for a fairness system where we need to track the contribution of each peer with lots of transactions.
14 / 26 Storing Contribution Values. Where to store the contribution value of each peer? Flood in the system (e.g. EigenRep). Request from peers before a transaction. Store in a DHT ("DHT-based approach"): store and update the contribution value of peer u at h(u) in a DHT. Scales linearly in the number of transactions.
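To make the cost of the DHT-based approach concrete, here is a minimal sketch that models the DHT as a plain dictionary keyed by h(u) and counts one update message per transaction. The class name, the choice of SHA-1 for h, and the message counter are illustrative assumptions; a real DHT would additionally route each update over several hops.

```python
import hashlib

class DhtContributionStore:
    """Toy stand-in for a DHT: contribution of peer u is stored at key h(u)."""

    def __init__(self):
        self.values = {}    # key h(u) -> contribution value
        self.messages = 0   # number of update messages sent so far

    @staticmethod
    def h(peer_id):
        # Hypothetical choice of hash function for illustration.
        return hashlib.sha1(peer_id.encode()).hexdigest()

    def report(self, provider, bandwidth, size):
        # Every single transaction triggers one update at h(provider).
        key = self.h(provider)
        self.values[key] = self.values.get(key, 0) + bandwidth * size
        self.messages += 1

    def lookup(self, peer_id):
        return self.values.get(self.h(peer_id), 0)

store = DhtContributionStore()
for _ in range(1000):
    store.report("v", bandwidth=5, size=3)

print(store.lookup("v"), store.messages)  # 15000 1000
```

The message count grows one-for-one with the number of transactions, which is the linear scaling the slide points out as the weakness of this approach for very active systems.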
16 / 26 Introducing Havelaar. Approximation is good enough! If peer u provides three times more than v, u should get about three times the download bandwidth of v. Track contribution value C: bandwidth b, size s. If the locally computed contribution value is close to the global / real one for all peers, that's fine.
17 / 26 Local Vector. Every peer has a local observation vector o. After u downloads from v (bandwidth of 5, size 3), u will increase the entry of v by 5 * 3 (C_v += 15). Only after a complete transaction.
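The update rule on this slide is small enough to sketch directly. The class and method names below are hypothetical; only the rule itself (increase the provider's entry by bandwidth times size, and only after a complete transaction) is taken from the slide.

```python
from collections import defaultdict

class LocalVector:
    """Local observation vector o of one peer: provider id -> contribution."""

    def __init__(self):
        self.entries = defaultdict(float)

    def record_download(self, provider, bandwidth, size, completed=True):
        # Incomplete transactions are ignored, as stated on the slide.
        if not completed:
            return
        self.entries[provider] += bandwidth * size

# Peer u downloads from v with bandwidth 5 and size 3.
o_u = LocalVector()
o_u.record_download("v", bandwidth=5, size=3)
o_u.record_download("v", bandwidth=8, size=2, completed=False)  # aborted, no credit

print(o_u.entries["v"])  # 15.0
```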
18 / 26 Send Local Vector To Successors. [Figure: peer w sends its observation vector o to its successors h1(w), h2(w), h3(w), h4(w).] Once a round (~ week). k successors: determined by hash functions on the sender ID w; same successors in every round. Defense against attacks: a peer can only send to its k successors (limited influence); it can only send once per round; the "self-observation" of the sender is dropped (it cannot praise itself).
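A possible way to derive the k successors from the sender ID is sketched below. The talk only says that hash functions h1(w), ..., hk(w) are applied to the sender ID; the salted SHA-256 construction, the mapping onto a sorted peer list, and the probing of further hash indices on collisions are all assumptions made for this illustration, as are the function names.

```python
import hashlib

def h(i, peer_id, space):
    """i-th hash function: maps a peer id to a slot in [0, space)."""
    digest = hashlib.sha256(f"{i}:{peer_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % space

def successors(w, peers, k):
    """Deterministically pick k distinct successors of w (never w itself).

    Assumption: on a collision the next hash index is tried; the loop is
    bounded so it also terminates when fewer than k other peers exist.
    """
    ordered = sorted(peers)
    chosen, i = [], 0
    while len(chosen) < k and i < 100 * k:
        candidate = ordered[h(i, w, len(ordered))]
        i += 1
        if candidate != w and candidate not in chosen:
            chosen.append(candidate)
    return chosen

def may_send(sender, receiver, peers, k):
    # A receiver can recompute the sender's successors and reject others.
    return receiver in successors(sender, peers, k)

def outgoing_vector(sender, vector):
    # The sender's "self-observation" is dropped: no self-praise.
    return {peer: value for peer, value in vector.items() if peer != sender}

peers = [f"peer{i}" for i in range(50)]
print(successors("peer7", peers, k=4))
print(outgoing_vector("peer7", {"peer7": 99, "peer3": 15}))  # {'peer3': 15}
```

Because the successor set depends only on the sender ID, it is the same in every round, and any receiver can check for itself whether a sender is entitled to report to it.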
19 / 26 Aggregation: Need More Observations. Need more observations for an accurate approximation. Aggregate exponentially more. [Figure: a peer's matrix O = [o0, o1 + o1 + o1, o2 + o2 + o2], built from its own observations o0 and the vectors received from other peers; a received [o0, o1, o2] counts as [o1, o2, o3], and o3 is dropped. All vectors are used for the contribution update c.] Defense against attacks: for each entry, outliers are detected and dropped; praise or accusation "within bounds" will be smoothed out (lots of observations are aggregated); the distribution of a vector can be analyzed: if it is spiked, it is most likely an attack, so drop it and maybe even decrease the trust value of that peer.
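The aggregation step can be sketched as follows, under one reading of the slide: a received matrix is shifted by one level before it is added, and the deepest received level is dropped. The talk does not say how outliers are detected, so the median-absolute-deviation rule, its `tolerance` and `min_scale` parameters, and the function names are assumptions chosen only to make the idea runnable.

```python
from statistics import median

def drop_outliers(values, tolerance=3.0, min_scale=1.0):
    """Keep values close to the median (hypothetical MAD-based rule)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    bound = tolerance * max(mad, min_scale)
    return [v for v in values if abs(v - med) <= bound]

def aggregate(own, received, depth):
    """Build the local matrix [o0, o1, ..., o_{depth-1}].

    own:      this peer's observation vector o0 (peer id -> value)
    received: matrices sent by other peers, each a list of `depth` vectors
    Level j of a received matrix feeds local level j + 1; the deepest
    received level would become level `depth` and is therefore dropped.
    """
    matrix = [dict(own)]
    for level in range(1, depth):
        peers = set()
        for m in received:
            peers.update(m[level - 1])
        row = {}
        for p in peers:
            reports = [m[level - 1].get(p, 0) for m in received]
            row[p] = sum(drop_outliers(reports))  # per-entry outlier removal
        matrix.append(row)
    return matrix

received = [
    [{"v": 10}, {"v": 30}, {"v": 90}],
    [{"v": 12}, {"v": 33}, {"v": 95}],
    [{"v": 11}, {"v": 31}, {"v": 92}],
    [{"v": 9}, {"v": 29}, {"v": 88}],
    [{"v": 500}, {"v": 30}, {"v": 90}],  # exaggerated praise for v
]
O = aggregate({"v": 15}, received, depth=3)
print(O)  # [{'v': 15}, {'v': 42}, {'v': 153}]
```

In this example the inflated report of 500 is discarded at level 1, while the remaining reports are summed. Since each level is built from the level below it of every reporting peer, deeper levels cover exponentially more first-hand observations.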
20 / 26 Rewarding. Always allocate full bandwidth. No artificial limits. Contention: two or more peers want to download from a third node at the same time; allocate according to the contribution values. Different resource allocation algorithms are possible. We chose an algorithm similar to: An Incentive Mechanism for P2P Networks, R. T. B. Ma et al., ICDCS 2004.
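As an illustration of allocating "according to the contribution values", the sketch below uses a plain proportional share. This is a deliberately simple stand-in, not the algorithm of Ma et al. that the talk refers to; the function name and the equal-split fallback for peers without any recorded contribution are assumptions.

```python
def allocate(total_bandwidth, contributions):
    """Split the full upload bandwidth among competing downloaders.

    contributions: downloader id -> contribution value C
    Proportional sharing is an assumed rule for illustration only.
    """
    if not contributions:
        return {}
    total = sum(contributions.values())
    if total == 0:
        # No one has contributed yet: fall back to an equal split.
        share = total_bandwidth / len(contributions)
        return {peer: share for peer in contributions}
    return {peer: total_bandwidth * c / total for peer, c in contributions.items()}

# u has contributed three times as much as v and competes with v.
shares = allocate(100, {"u": 30, "v": 10})
print(shares)  # {'u': 75.0, 'v': 25.0}
```

With this rule the full bandwidth is always handed out, and a peer that contributed three times as much receives three times the bandwidth under contention, matching the goal stated when Havelaar was introduced.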
22 / 26 Evaluation. [Plot label: bootstrapping] We have analyzed and simulated Havelaar. 5 successors and a matrix with four vectors are already enough for huge networks with more than 100,000 nodes and 5,000 transactions per peer and round.
23 / 26 Communication Costs Need to send a huge matrix, but: it does not depend on the number of transactions! The more transactions, the higher the accuracy!
25 / 26 Conclusions. Havelaar is for active, long-term peer-to-peer systems. Robust against attacks and false reports. Low communication costs: scalable in the number of transactions. Churn is not an issue because the local vector can be sent at any time in a round. Kangoo takes care of other attacks (Sybil attacks, whitewashing) and has strong identifiers.
26 / 26 Thank you for your attention!