A Simulation Study of P2P File Pollution Prevention Mechanisms Chia-Li Huang, Polly Huang Network & Systems Laboratory Department of Electrical Engineering.


1 A Simulation Study of P2P File Pollution Prevention Mechanisms
Chia-Li Huang, Polly Huang
Network & Systems Laboratory, Department of Electrical Engineering, National Taiwan University

2 Outline
– Background
– Problem
– Methodology
– Simulation Environment & Results
– Conclusion

3 Overview of P2P file sharing system
– P2P file sharing system with search capability: a user issues a query with keywords to search for a file
– A file in the system (e.g., song A) consists of meta-data, used for keyword matching (song title, length, encoding scheme of song A), and content
– Different versions of song A exist (MP3, WMA, …); a hash function gives each version its own hash value: Version 1 (HashValue1), Version 2 (HashValue2), Version 3 (HashValue3)
[Figure: the three versions of song A held by Peers 1 to 5]
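A minimal sketch of how versions sharing one title can be told apart by hash value. The slide only says "hash function", so SHA-1 is an assumption here, and the byte strings and the "ogg" entry are hypothetical stand-ins for real encodings.

```python
import hashlib


def version_id(content: bytes) -> str:
    """Return a hex digest that identifies one version of a file by its content."""
    # SHA-1 is an assumption; the slides do not name the hash function.
    return hashlib.sha1(content).hexdigest()


# Same meta-data (title "song A"), three different encodings of the content.
# The payloads are placeholders, not real audio data.
versions = {
    "mp3": b"song A encoded as mp3",
    "wma": b"song A encoded as wma",
    "ogg": b"song A encoded as ogg",
}

hash_values = {name: version_id(data) for name, data in versions.items()}
for name, digest in hash_values.items():
    print(name, digest)
```

Because the digest depends only on the content, two peers holding byte-identical copies report the same version, while a re-encoded copy shows up as a new version under the same keywords.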

4 How a user searches for a file
– Peer 1 sends a query for song A into the P2P network
– Peer 1 receives responses for song A:
  Source | Version
  Peer 1 | Version 1
  Peer 2 | Version 2
  :      | :
  Peer n | Version m
– The user randomly chooses a source for download

5 Pollution in file sharing system
– Definition of a polluted file: its meta-data description doesn’t match its content (e.g., meta-data of A with content of B)
– Current P2P networks are full of polluted files [1]
  – Unintentional
  – Intentional
[1] J. Liang, R. Kumar, Y. Xi, and K. Ross, “Pollution in P2P file sharing systems,” in Proceedings of IEEE Infocom, 2005.

6 Problem
– Pollution in a P2P system causes the following problems
  – Reduced content availability
  – Increased redundant traffic
– Several anti-pollution mechanisms exist
  – Which one is better?

7 Methodology
Simulation study on anti-pollution mechanisms
– Extending a P2P simulator [2]
– Existing anti-pollution mechanisms
  – Peer reputation system: choose a reputable peer to download a file from, e.g., EigenTrust [3]
  – Object reputation system: choose a reputable version of a file to download, e.g., Credence [4]
– Different pollution attacks
– User behavior
[2] M. Schlosser and S. Kamvar, “Simulating a file-sharing p2p network,” in Proc. of SemPGRID, 2003.
[3] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, “The eigentrust algorithm for reputation management in p2p networks,” in Proceedings of the Twelfth International World Wide Web Conference.
[4] K. Walsh and E. G. Sirer, “Experience with an object reputation system for peer-to-peer filesharing,” in Proceedings of Networked System Design and Implementation (NSDI), May 2006.

8 Peer Reputation System: EigenTrust
– Rate a peer by its uploading history from the whole system
– Local reputation ($C_{ij}$) and global reputation ($T_i$)
[Figure: Peer i, Peer j, good file; the expression for $C_{ij}$ is not preserved in the transcript]

9 Peer Reputation System: EigenTrust
– Rate a peer by its uploading history from the whole system
– Choose a reputable peer to download from
– Local reputation ($C_{ij}$) and global reputation ($T_i$)
– A peer stores a list of local reputations (e.g., Peer 1 stores $C_{12}$ and $C_{14}$ for Peers 2 and 4)
– Global reputation of Peer 1, computed from the local reputations held by Peers 2 and 3: $T_1 = C_{21} T_2 + C_{31} T_3$
[Figure: Peer i, Peer j, bad file; Peers 1 to 4 with their local reputations]
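The relation T_1 = C_21 * T_2 + C_31 * T_3 generalizes to T_i = sum over j of C_ji * T_j. The sketch below iterates that update until the values settle. It is a simplified illustration, not the full EigenTrust algorithm: the published algorithm also handles pre-trusted peers, which is omitted here. The matrix values are hypothetical, and each row is assumed to be normalized to sum to 1.

```python
def global_reputation(C, iterations=100):
    """Iterate T_i = sum_j C[j][i] * T[j], starting from a uniform vector.

    C[j][i] is peer j's local reputation of peer i.
    Assumption: every row of C is normalized to sum to 1.
    """
    n = len(C)
    T = [1.0 / n] * n
    for _ in range(iterations):
        T = [sum(C[j][i] * T[j] for j in range(n)) for i in range(n)]
    return T


# Hypothetical local reputations; index 0 is Peer 1, index 1 is Peer 2, etc.
# Peers 2 and 3 both rate Peer 1 highly, so Peer 1 ends up most reputable.
C = [
    [0.0, 0.5, 0.5],
    [0.9, 0.0, 0.1],
    [0.9, 0.1, 0.0],
]

T = global_reputation(C)
print([round(t, 4) for t in T])
```

A downloader would then prefer the source with the largest T value among the peers that answered its query.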

10 Object Reputation System: Credence
– Calculate an object (file) reputation by weighted votes
  – After a download, the user votes the file as clean or polluted
– Peer 1 sends a query for song A, together with a vote-gather query for song A, to peers P2 to P5
– Vote database of Peer 1: Obj 3 Good, Obj 4 Good, Obj 5 Bad, Obj 6 Bad

11 Object Reputation System: Credence
– Calculate an object (file) reputation by weighted votes
  – After a download, the user votes the file as clean or polluted
– Choose a reputable version for download, then randomly choose a source
– Responses and vote-responses for song A received by P1:
  Version no. | Sources | Received votes | Version reputation
  Version 1 | P2, P3 | $Vote_{P2}$, $Vote_{P3}$ | $Corr_{P1,P2} \cdot Vote_{P2} + Corr_{P1,P3} \cdot Vote_{P3}$
  Version 2 | P4 | $Vote_{P4}$, $Vote_{P5}$ | $Corr_{P1,P4} \cdot Vote_{P4} + Corr_{P1,P5} \cdot Vote_{P5}$
– Vote database of Peer 1: Obj 3 Good, Obj 4 Good, Obj 5 Bad, Obj 6 Bad
– Vote database of Peer 2: Obj 1 Bad, Obj 2 Bad, Obj 3 Good, Obj 4 Good (agrees with Peer 1 on Obj 3 and Obj 4: positive correlation)
– Vote database of Peer 3: Obj 5 Good, Obj 6 Good, Obj 7 Bad, Obj 8 Bad (disagrees with Peer 1 on Obj 5 and Obj 6: negative correlation)
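The weighted-vote rule can be sketched with the three vote databases from this slide. Two things are assumptions: the correlation here is a simple agreement score over commonly voted objects, standing in for whatever correlation statistic Credence actually uses, and the votes cast on Version 1 and Version 2 (and the empty histories of P4 and P5) are hypothetical.

```python
GOOD, BAD = 1, -1


def correlation(mine, theirs):
    """Agreement score in [-1, 1] over objects both peers have voted on.

    Simplifying assumption: (agreements - disagreements) / shared objects,
    and 0.0 when the two peers share no voted object.
    """
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    score = sum(1 if mine[obj] == theirs[obj] else -1 for obj in shared)
    return score / len(shared)


def version_reputation(my_votes, vote_dbs, votes_on_version):
    """Sum of Corr(me, p) * Vote_p over the peers that voted on this version."""
    return sum(
        correlation(my_votes, vote_dbs[peer]) * vote
        for peer, vote in votes_on_version.items()
    )


# Vote databases as shown on the slide.
p1 = {3: GOOD, 4: GOOD, 5: BAD, 6: BAD}
vote_dbs = {
    "P2": {1: BAD, 2: BAD, 3: GOOD, 4: GOOD},
    "P3": {5: GOOD, 6: GOOD, 7: BAD, 8: BAD},
    "P4": {},  # hypothetical: no shared history with P1
    "P5": {},  # hypothetical: no shared history with P1
}

# Hypothetical votes received for the two versions of song A.
rep_v1 = version_reputation(p1, vote_dbs, {"P2": GOOD, "P3": BAD})
rep_v2 = version_reputation(p1, vote_dbs, {"P4": GOOD, "P5": GOOD})
print(rep_v1, rep_v2)
```

In this example P3's "Bad" vote counts in favour of Version 1, because P3 has disagreed with P1 on every object they both rated, while votes from peers with no shared history carry no weight.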

12 Pollution Attacks
Prevalent pollution attacks [5]
– Decoy Insertion
– Hash Corruption
[Figure: a clean file of song A has meta-data $M_A$, hash $H_1$ and clean content; under Hash Corruption the file has $M_A$ and $H_1$ but corrupted content; under Decoy Insertion the file has $M_A$ and a different hash $H_2$]
[5] F. Benevenuto, C. Costa, M. Vasconcelos, V. Almeida, J. Almeida, and M. Mowbray, “Impact of peer incentives on the dissemination of polluted content,” in SAC ’06.

13 User Behavior
– Slackness [6]
  – The period of time between download completion and quality check
  – Bimodal distribution
– Awareness [6]
  – The probability that a user correctly recognizes that a file is polluted
  – No clear characteristic is observed; the simulation uses high-awareness prob. = 0.8 and low-awareness prob. = 0.2
[6] U. Lee, M. Choi, J. Cho, M. Y. Sanadidi, and M. Gerla, “Understanding pollution dynamics in p2p file sharing,” in Proceedings of the 5th International Workshop on Peer-to-Peer Systems (IPTPS ’06), 2006.
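A minimal sketch of how these two user-behavior parameters could be drawn in a simulator. The awareness probabilities 0.8 and 0.2 come from the slide. The slackness model is an assumption: the slide only says the distribution is bimodal, so a mixture of two exponentials with hypothetical means is used to produce one prompt mode and one slow mode.

```python
import random

# Probabilities from the slide: chance that a user recognizes a polluted file.
AWARENESS = {"high": 0.8, "low": 0.2}


def recognizes_pollution(level, rng):
    """Return True if a user of the given awareness level spots the pollution."""
    return rng.random() < AWARENESS[level]


def sample_slackness(rng, p_prompt=0.5, prompt_mean=1.0, slack_mean=24.0):
    """Draw a delay between download completion and quality check.

    Assumption: a two-component exponential mixture; p_prompt, prompt_mean
    and slack_mean are hypothetical values, not taken from the slides.
    """
    mean = prompt_mean if rng.random() < p_prompt else slack_mean
    return rng.expovariate(1.0 / mean)


rng = random.Random(42)
checks = [recognizes_pollution("high", rng) for _ in range(10)]
delays = [sample_slackness(rng) for _ in range(5)]
print(checks)
print([round(d, 2) for d in delays])
```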

14 Outline
– Background
– Problem
– Methodology
– Simulation Environment & Results
– Conclusion

15 Simulator Description
P2P query-cycle-based simulator
– In a cycle, each peer issues one query and repeats downloading until satisfied
Extension
– Types of attacks: Decoy Insertion, Hash Corruption
– Anti-pollution mechanisms: EigenTrust, Credence
– User behavior: slackness, awareness

16 Simulation Scenario
Type of peer | Behavior
Malicious | Always share polluted files based on different attacks
Normal | Share what they’ve downloaded

17 Simulation Setup
Peers [9]
– # of normal peers, # of malicious peers, # of neighbors (values listed: 100, 6)
Content Distribution [8] [9]
– # of categories in the system: 20
– # of categories of each peer: at least 4
– Files in a category and versions of each file: Zipf distribution with α = 1
– File size distribution: Table 1
Simulation
– # of cycles: 300
– # of experiments: 10
Table 1. File size distribution of P2P traffic [10]
Size | 1KB | 10KB | 100KB | 1MB | 10MB | 100MB | 1GB
Percentage | 1.5% | 1.83% | 26.67% | 10.00% | 35.00% | 15.00% | 10.00%
[8] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina, “The eigentrust algorithm for reputation management in p2p networks,” in Proceedings of the Twelfth International World Wide Web Conference.
[9] K. Walsh and E. G. Sirer, “Experience with an object reputation system for peer-to-peer filesharing,” in Proceedings of Networked System Design and Implementation (NSDI), May 2006.
[10] N. Leibowitz, M. Ripeanu, and A. Wierzbicki, “Deconstructing the Kazaa network,” in Proceedings of the Third IEEE Workshop on Internet Applications (WIAPP 2003).
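As an illustration of how the content-distribution settings could be fed into a simulator, the sketch below draws file sizes according to the percentages in Table 1 and builds Zipf weights with α = 1. The function names are hypothetical, and the way the original simulator actually samples these distributions is not described in the slides.

```python
import random

# Table 1: file size distribution of P2P traffic (percent of files).
SIZES = ["1KB", "10KB", "100KB", "1MB", "10MB", "100MB", "1GB"]
PERCENTAGES = [1.5, 1.83, 26.67, 10.00, 35.00, 15.00, 10.00]


def sample_file_size(rng):
    """Draw one size label with probability proportional to Table 1."""
    return rng.choices(SIZES, weights=PERCENTAGES, k=1)[0]


def zipf_weights(n, alpha=1.0):
    """Normalized Zipf weights: the item of rank r gets weight 1 / r**alpha."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


rng = random.Random(7)
print([sample_file_size(rng) for _ in range(5)])
print([round(w, 3) for w in zipf_weights(4)])
```

With α = 1 the most popular item in a category is picked twice as often as the second and three times as often as the third.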

18 Critical Evaluation Parameters
Evaluate different anti-pollution mechanisms under the following scenarios
Attack | Fraction of high-aware peers in the network | Slackness
Decoy Insertion | 80%, 50%, 20% | Yes or No
Hash Corruption | 80%, 50%, 20% | Yes or No
Decoy Insertion & Hash Corruption | 80%, 50%, 20% | Yes or No

19 Evaluation Metrics
– Successful downloading rate (per cycle): total successful downloads / total trials of downloads
– Redundant traffic (per cycle)
– Reduced-traffic ratio (compared to random selection): the redundant traffic reduced by using $M_j$, relative to the redundant traffic generated by random selection
Symbol | Description
$M_j$ | Mechanism: Credence or EigenTrust
$n$ | # of high-aware peers
$t_i$ | Trials of downloads for peer i to get a clean file in a cycle
PT | Polluted traffic
CT | Control traffic
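The exact formulas on this slide were images and are not preserved, so the sketch below is only one plausible reading of the symbols. It assumes that each of the n high-aware peers ends a cycle with exactly one successful download after t_i trials, that redundant traffic is PT + CT, and that random selection generates polluted traffic but no control traffic.

```python
def successful_download_rate(trials):
    """Total successful downloads / total trials of downloads in one cycle.

    trials[i] is t_i. Assumption: every listed peer eventually obtains
    exactly one clean file, so successes == len(trials).
    """
    return len(trials) / sum(trials)


def reduced_traffic_ratio(pt_random, pt_mechanism, ct_mechanism):
    """Share of random selection's redundant traffic that mechanism M_j removes.

    Assumptions: redundant traffic = PT + CT, and random selection has CT = 0.
    """
    redundant_mechanism = pt_mechanism + ct_mechanism
    return (pt_random - redundant_mechanism) / pt_random


# Hypothetical numbers for illustration only.
rate = successful_download_rate([1, 2, 1])
ratio = reduced_traffic_ratio(pt_random=100.0, pt_mechanism=30.0, ct_mechanism=10.0)
print(rate, ratio)
```

Under this reading a mechanism only pays off when the polluted traffic it avoids exceeds the control traffic it adds, which is the trade-off the deck raises later for the hybrid mechanism.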

20 Simulation Result
Compare the performance of different anti-pollution mechanisms under different scenarios
– EigenTrust
– Credence
– Random

21 Successful Downloading Rate
– Credence is more sensitive to the type of attack
– Under Decoy-Insertion attack: Credence > EigenTrust (Credence identifies a clean version before download)
– Under Hash-Corruption attack: EigenTrust > Credence (EigenTrust rates peers, not the hash value)
– Results converge after 100 cycles

23 Observation 1: User awareness
– User awareness is critical to anti-pollution mechanisms
– Reasons:
  1. Fewer peers share clean files
  2. Fewer peers correctly operate the reputation system
[Figures: EigenTrust; Credence]

24 Observation 2: User slackness
– User slackness has a negative effect on anti-pollution mechanisms
– A polluted file held longer by a user has more chances to be downloaded

25 Discussion
– User behavior has a significant effect on anti-pollution mechanisms
– Credence performs better under Decoy Insertion, while EigenTrust performs better under Hash Corruption
  – The type of attack can’t be predicted
  – Suggest a hybrid anti-pollution mechanism

26 Hybrid Anti-pollution Mechanism
– A query for song A is sent into the P2P network; the response list maps versions to sources:
  Versions | Sources
  Version 1 | P1, P5, P7, ..., P124
  Version 2 | P14, P21, P35
  : | :
  Version N | P4, P2
– Step 1: Select a reputable version by the object reputation mechanism
– Step 2: Select a reputable peer by the peer reputation mechanism
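The two steps can be sketched as a pair of lookups over the response list. This is an illustration under stated assumptions: the reputation scores are hypothetical inputs (they would come from an object reputation system such as Credence and a peer reputation system such as EigenTrust), and picking the single highest score at each step is an assumption, since the slides do not say whether selection is deterministic or probabilistic.

```python
def hybrid_select(response_list, version_reputation, peer_reputation):
    """Pick a (version, source) pair in two steps.

    Step 1: choose the version with the highest object reputation.
    Step 2: among that version's sources, choose the peer with the
            highest peer reputation.
    Unknown versions or peers default to a reputation of 0.0.
    """
    best_version = max(
        response_list, key=lambda v: version_reputation.get(v, 0.0)
    )
    best_peer = max(
        response_list[best_version], key=lambda p: peer_reputation.get(p, 0.0)
    )
    return best_version, best_peer


# Response list shaped like the slide; all scores below are hypothetical.
response_list = {
    "Version 1": ["P1", "P5", "P7"],
    "Version 2": ["P14", "P21", "P35"],
}
version_reputation = {"Version 1": -0.4, "Version 2": 1.2}
peer_reputation = {"P1": 0.30, "P14": 0.05, "P21": 0.20, "P35": 0.10}

choice = hybrid_select(response_list, version_reputation, peer_reputation)
print(choice)
```

Note that P1 has the highest peer reputation overall but is never considered, because Step 1 has already ruled out the version it offers.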

28 Successful Downloading Rate
– Ensuring both a reputable version and a reputable source lets the mechanism confront different types of attacks
– The hybrid mechanism performs the best under both attacks
[Figures: Decoy Insertion; Hash Corruption]

29 Reduced-Traffic Ratio
– The hybrid mechanism generates more control traffic
  – Trade-off between pollution traffic and control traffic
– The trade-off is worthwhile
[Figures: Decoy Insertion; Hash Corruption]

30 Conclusion
– Both peer reputation and object reputation systems are necessary
– User behavior has a significant influence on anti-pollution mechanisms

31 Thank you!

