Experience with an Object Reputation System for Peer-to-Peer File Sharing
NSDI ’06 (3rd USENIX Symposium on Networked Systems Design & Implementation)
Kevin Walsh, Emin Gun Sirer
Cornell University
Presenter: Elaine
2 Problem
A P2P filesharing application with search capability (e.g. Gnutella)
Filesharing apps use meta-data for searching
–Meta-data: file name, file size, file descriptors, content hash, etc.
Problem
–Users blindly believe the meta-data
–Object authenticity (or reputation): downloaded file == what it claims to be
Current peer-to-peer filesharing networks are rife with corrupt and mislabeled content
Much of this pollution can be attributed to deliberate attacks [14]
3 Recent approaches
Past experience
–Small portion of peers
# of shared files as an endorsement
–A large number of malicious peers can share the same files
–Angry users may share
Voting
–Trust in voters?
–No incentive to vote
Call for a trustworthiness metric
–Credence
4 Credence
How to decide the reputation of an object
–Use voting and deal with the trust problem
How?
–Compare the voting histories of two peers
–Trust peers with identical votes more
Correlation computation
–If two peers don't share enough history, build a trust relationship graph and trust multi-hop peers (transitive trust)
5 Computing correlation
Calculate the trust relationship between each pair of peers
–A standard formula for the coefficient of correlation between binary data sets: the phi coefficient
–θ takes values in the range [-1, 1]
–Positive values indicate agreement
–Negative values indicate disagreement
[Table: votes of peers A and B cross-tabulated as A-/A+ by B-/B+]
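A minimal sketch of the correlation step, assuming the textbook 2x2 phi coefficient computed over the objects both peers voted on. The function name, the dict-based vote layout, and the choice to return None when the coefficient is undefined are illustrative assumptions, not details taken from Credence itself.

```python
import math


def phi_coefficient(votes_a, votes_b):
    """Phi coefficient between two peers over the objects both voted on.

    votes_a, votes_b: dicts mapping object id -> 1 (positive) or 0 (negative).
    Returns a float in [-1, 1], or None when the coefficient is undefined
    (no shared objects, or a peer voted the same way on every shared object).
    """
    shared = votes_a.keys() & votes_b.keys()
    # Cells of the 2x2 contingency table (first index: A's vote, second: B's vote).
    n11 = sum(1 for o in shared if votes_a[o] == 1 and votes_b[o] == 1)
    n00 = sum(1 for o in shared if votes_a[o] == 0 and votes_b[o] == 0)
    n10 = sum(1 for o in shared if votes_a[o] == 1 and votes_b[o] == 0)
    n01 = sum(1 for o in shared if votes_a[o] == 0 and votes_b[o] == 1)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None  # zero variance on one side: correlation is undefined
    return (n11 * n00 - n10 * n01) / denom
```

Two peers with identical mixed votes get +1, peers that always disagree get -1, and pairs without enough varied overlap get no value at all.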
6 Transitive trust (ref. from K. Walsh, E. G. Sirer)
Voting history of peers A, B, C (1 == correct, 0 == incorrect), one table per peer:
Obj 3: 1, Obj 4: 1, Obj 5: 0, Obj 6: 0
Obj 5: 0, Obj 6: 0, Obj 7: 1, Obj 8: 1
Obj 1: 1, Obj 2: 0, Obj 3: 1, Obj 4: 1
Local trust (direct) and transitive trust (multi-hop): θac = θab * θbc
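The product rule θac = θab * θbc can be sketched as a multiplication of local correlations along a path of peers. The function name, the tuple-keyed table, and the None result for a missing hop are assumptions made for illustration; how a real client chooses among several paths is not covered here.

```python
def path_trust(local_trust, path):
    """Multiply local correlations along a path of peers.

    local_trust: dict mapping (src, dst) -> correlation in [-1, 1].
    path: list of peer ids, e.g. ["A", "B", "C"].
    Returns the product of the hop correlations, or None if any hop
    has no local correlation value.
    """
    trust = 1.0
    for src, dst in zip(path, path[1:]):
        hop = local_trust.get((src, dst))
        if hop is None:
            return None  # the chain is broken: no transitive value
        trust *= hop
    return trust


# Hypothetical values: A trusts B at 0.8, B trusts C at 0.5.
example = path_trust({("A", "B"): 0.8, ("B", "C"): 0.5}, ["A", "B", "C"])
```

Because every correlation lies in [-1, 1], the product can only shrink in magnitude with each extra hop, so distant peers count for less than direct ones.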
7 Voting on Object
A vote is a signed tuple (H, S, T) with certificate K
–H – File content hash
–S – Statement about the file
Thumb up (unconditionally thumb up)
Thumb down (unconditionally thumb down)
–T – Timestamp
–K – Certificate
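As a rough illustration of the vote tuple, the fields H, S, T and K can be held in a small record. Everything below is a hypothetical layout: the class and function names are invented, SHA-1 is assumed as the content hash, the certificate is an opaque placeholder, and the signature itself is omitted.

```python
import hashlib
import time
from dataclasses import dataclass
from enum import Enum


class Statement(Enum):
    THUMBS_UP = 1     # unconditional positive statement
    THUMBS_DOWN = -1  # unconditional negative statement


@dataclass(frozen=True)
class Vote:
    content_hash: str     # H: hash of the file content
    statement: Statement  # S: statement about the file
    timestamp: float      # T: time the vote was cast
    certificate: bytes    # K: voter's certificate (opaque placeholder here)


def make_vote(content, statement, certificate, now=None):
    """Build an unsigned vote record for the given file content (bytes)."""
    digest = hashlib.sha1(content).hexdigest()  # assumed hash function
    stamp = time.time() if now is None else now
    return Vote(digest, statement, stamp, certificate)
```

Keying the vote on the content hash rather than the file name means renamed copies of the same file share one reputation.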
8 Three basic operations in Credence
Voting
–A peer casts a vote on an object after each download and stores it locally in the vote database
Algorithm: Voting; Issuing a vote-gather query; Evaluating the object reputation
Issuing a vote-gather query
Sender (issuing a vote-gather query)
–Issues a vote-gather query, specifying the content hash of a given object, to the underlying Gnutella network, and stores the gathered votes in the vote database
Receiver (after receiving a vote-gather query)
–Sends back its own matching votes and any recently seen matching votes from the peers it weights as most reliable
–Advantages:
Bounds the overall cost of vote collection
Voters are not required to remain online
Evaluating the object reputation
Votes that apply positively are given an initial value of +1, and those that apply negatively a value of −1
Look up the trust relationship with each voter in the correlation table
Calculate the weighted average of the votes, using the correlation values as weights, to derive the object reputation score
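The evaluation step can be sketched as a correlation-weighted average of +1/-1 votes. The function name and data layout are illustrative, and two details are assumptions rather than facts from the slides: votes from peers without a correlation entry are skipped, and the sum is normalized by the absolute weights so the score stays in [-1, 1].

```python
def object_reputation(votes, correlation):
    """Correlation-weighted average of the votes on one object.

    votes: list of (peer_id, value) pairs, value is +1 or -1.
    correlation: dict mapping peer_id -> correlation in [-1, 1].
    Returns a score in [-1, 1], or None if no vote carries any weight.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for peer, value in votes:
        theta = correlation.get(peer)
        if theta is None:
            continue  # no trust relationship with this voter: ignore the vote
        weighted_sum += theta * value
        total_weight += abs(theta)
    if total_weight == 0:
        return None
    return weighted_sum / total_weight
```

A negative correlation flips a vote's effect: a thumbs-down from a peer who consistently disagrees with the local user counts as evidence in favor of the file.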
11 Evaluation Overview
10,000 downloads since March
Crawlers collected 200 daily snapshots of the network structure
Dataset
–Data compiled from about 1,200 Credence clients
–39,000 votes and 84,000 files
Correlation between Credence peers
Presents the correlation values between any pair of peers with overlapping vote histories
On average, each node is directly correlated with 27 other peers
Four groups of peers
Credence user classification
35% altruistic users, 50% non-participants, and 15% attackers
Attackers may not have malicious intent
The votes from attackers actually provide a tangible benefit to the system
File authenticity is a fairly universal concept among filesharing users
Local and Transitive Relationships
% of peers with valid correlation values
Not many high-quality correlations!
Consistency of usable vote set
Different correlation strengths and sizes of the usable vote set
Consistency
–The number of pairs of votes in agreement divided by the number of pairs in agreement or conflict
Size of usable set
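The consistency metric can be sketched as follows, assuming that a "pair of votes" means two votes cast on the same object, since only such pairs can agree or conflict. The function name and the (object, value) layout are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def consistency(votes):
    """Fraction of same-object vote pairs that agree.

    votes: list of (object_id, value) pairs, value is +1 or -1.
    Returns agreeing pairs / (agreeing + conflicting pairs), or None
    if no two votes refer to the same object.
    """
    by_object = defaultdict(list)
    for obj, value in votes:
        by_object[obj].append(value)
    agree = conflict = 0
    for values in by_object.values():
        for first, second in combinations(values, 2):
            if first == second:
                agree += 1
            else:
                conflict += 1
    if agree + conflict == 0:
        return None
    return agree / (agree + conflict)
```

A value near 1 means the usable votes almost always tell the same story about each file; a value near 0.5 means they are close to noise.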
Vote classification
Most peers discover their first peer correlation after casting fewer than 18 votes
Coordinated attackers cast a lot of votes!
# of votes cast
17 Files in Credence Data set –681 Credence clients. These users advertise a total of 84,838 files, of which 67,794 are unique
File distribution (decoys)
By number of times shared
By number of hosts
Two types of attacks
File Voting Popularity
The voting data set comprises 39,761 votes cast on 35,690 unique files
Positive votes are spread evenly
Negative votes follow a more skewed distribution
Voting and Sharing
Sharing and voting behavior are largely independent
Voting can contradict sharing
21 End-to-End Performance
A load generator repeatedly queried the Gnutella network for typical keywords over a 24-hour period and logged the search results returned (sorting the files by # of peers sharing them).
Resistance to Collusion
Pick peers from the main cluster
Large-scale attacks are more likely to be detected
Detects 75% of decoys
Ranking Performance
24 Credence Overhead
Inbound traffic: a highly active client receives 100 bytes per second of additional background traffic in Credence
Outbound traffic: depends on the popularity of the client's votes, the client's reputation, and Gnutella connectivity
Processing overhead: < 1% of a 1.7 GHz CPU
25 Conclusion
The first distributed P2P object reputation system to identify pollution
Provides incentives for users to participate honestly in voting
Not specific to the Gnutella network
26 My comment
Pros
–Incentive seems robust
Cons
–Performance verification is weak
–No comparison with other mechanisms
–Still needs a centralized certificate authority
–Storing votes wastes space (need to maintain the vote database, trust graph, and correlation table)
–People are lazy (the eMule way, but it cannot avoid large attacks)
27 The design of Credence is guided by several goals that are necessary requirements for a successful peer-to-peer reputation mechanism
–Relevance
–Distribution and Decentralization
–Robustness
–Isolation
–Motivation: to participate honestly in the reputation system