1 Understanding Pollution Dynamics in P2P File Sharing Uichin Lee, Min Choi *, Junghoo Cho M. Y. Sanadidi, Mario Gerla UCLA, KAIST * IPTPS’06 Elaine.


1 Understanding Pollution Dynamics in P2P File Sharing. Uichin Lee, Min Choi*, Junghoo Cho, M. Y. Sanadidi, Mario Gerla (UCLA, KAIST*). IPTPS ’06. Elaine

2 Outline: pollution in P2P file sharing systems; related work; user behavior study; analytic pollution model and its results; the impact of pollution on P2P traffic load.

3 Pollution in P2P file sharing systems. An object in a P2P file sharing system is composed of two parts: meta-data (song title, artist name, length, encoding scheme, etc.) and content.

4 Pollution in P2P file sharing systems. In a P2P file sharing system with search capability (e.g., eMule, Gnutella, KaZaA), a user sends a query for song A to the P2P network, receives responses for song A as a list of available peers, and selects one source for download.

5 Pollution in P2P file sharing systems. Definition of pollution: a file's actual content doesn't match its meta-data description. [Figure: a clean file of song A pairs meta-data M_A with content C_A; tampering with meta-data gives M_A with C_B; tampering with content gives M_A with C_A'.]

6 Pollution in P2P file sharing systems. File sharing systems affect sales in the music, video, and games industries and in television networks. How a pollution attack works: the polluter injects a massive number of decoys into the peer-to-peer network to reduce the availability of the targeted item. A case: Madonna's new album in 2003. Pollution companies have emerged: Overpeer, Loudeye, Retspan.

7 Related work
1. [Content Availability, Pollution and Poisoning in File Sharing Peer to Peer Networks], ACM Conference on Electronic Commerce: studies the impact of different pollution attacks on content availability in peer-to-peer file sharing networks.
2. [Denial of Service Resilience in Peer to Peer File Sharing Systems], SIGMETRICS: made the first attempt to model the dynamics of P2P file pollution attacks.
3. [Pollution in P2P File Sharing Systems], INFOCOM: KaZaA is severely polluted.
Given that polluters have limited capabilities (bandwidth, processing power), the current level of pollution is too high to be explained by prior modeling, which assumes that a user always detects a polluted file and also deletes it.

8 Related work. This paper: a more accurate pollution model. Polluted files do spread from user to user over the network. Pollution has a significant impact on P2P traffic load.

9 User behavior study. Goal: how does user behavior affect the spread of pollution? The study has two stages: a questionnaire (familiarity, usage patterns) and behavior observation (awareness, slackness). Participants: 30 graduate students (UCLA, KAIST).

10 User behavior study: P2P familiarity
1. Have you ever used P2P file sharing?
2. Do you frequently share files with P2P systems?
3. Do you know how to enable or disable sharing local files?
4. Do you know how popular P2P software works?
5. Do you know about multipart downloading or swarming?
Result: familiarity is high.

11 User behavior study: P2P usage pattern
1) Preparation stage. Download decision based on: quality 57%, availability 20%, file size.
2) Download stage. Checking frequency: frequent 41%, size-dependent 35%, check later 20%. Re-download? Yes 23%, file-size dependent 57%.
3) Post-download stage. Pollution experience: yes 70%. Failed to notice pollution: yes 30%. Sharing: yes 43.3%.

12 User behavior study: summary
1. Even sophisticated P2P users sometimes fail to recognize polluted files.
2. Users do not check the quality of a downloaded file immediately after the download completes.
3. Not all users are cooperative in sharing downloaded files.
4. Users make their download decisions primarily based on the quality of a file.

13 User behavior observation. The study mainly measures two parameters. Awareness probability: the fraction of users who recognize pollution in a downloaded file. Slackness distribution: the distribution of intervals between download completion time and quality checking time.
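The two parameters above can be estimated from observation logs. The following is a minimal sketch, assuming a hypothetical record layout `(was_polluted, user_flagged_it, hours_until_check)` and made-up sample data; neither the layout nor the numbers come from the study.

```python
from collections import Counter

# Hypothetical observation records, NOT data from the study:
# (file was polluted, user reported it as polluted, hours between
#  download completion and the first quality check)
observations = [
    (True, True, 0.5),
    (True, False, 30.0),
    (True, True, 26.0),
    (False, False, 1.0),
    (True, False, 0.2),
]

def awareness_probability(records):
    """Fraction of polluted downloads that the user recognized as polluted."""
    polluted = [r for r in records if r[0]]
    if not polluted:
        return 0.0
    return sum(1 for r in polluted if r[1]) / len(polluted)

def slackness_histogram(records, bin_hours=12):
    """Empirical slackness distribution, keyed by the start hour of each bin."""
    counts = Counter(int(r[2] // bin_hours) for r in records)
    total = len(records)
    return {b * bin_hours: c / total for b, c in sorted(counts.items())}

print(awareness_probability(observations))
print(slackness_histogram(observations))
```

A histogram with mass in two separated bins, as in this toy data, is what a bimodal slackness distribution would look like at this resolution.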

14 User behavior observation: experiment setup. The server is seeded with genuine and polluted MP3 files. Modified P2P software monitors user behavior. Users are asked to use it and to download files. After each download, users are asked about their familiarity with the item and whether it was polluted. Download speed is controlled (50 Kbps to 1 Mbps). Duration: one month.

15 User behavior observation: pollution techniques (on MP3 files). Meta-data modification: changed names. Quality degradation. Incomplete file: 30-60 seconds cut from the beginning or end. Noise insertion: every 20 seconds. Shuffled content: randomly shuffled content. 20 genuine songs; 5*4 = 20 polluted (five techniques).

16 User behavior observation: user awareness. [Figure: user awareness by pollution technique: shuffled content, noise insertion, incomplete file, quality degradation, meta-data modification.]

17 User behavior observation: slackness. Slackness is the elapsed time between download completion and pollution checking. Summary: (1) P2P users lack pollution awareness; (2) the slackness distribution is bimodal.

18 Pollution model. Discrete-time analysis that extends the previous model and incorporates the study results. There are M users in total and only one kind of file in the system. G_0 / B_0: initial number of genuine / bogus copies. Download process:
1. At step k, a user who has never downloaded before downloads a file with probability s_k (the interest level).
2. After the download, authenticity is checked after an interval t, where t <= L (the maximum slackness).
3. The user recognizes a bogus file with probability p_a (awareness) and deletes it immediately; if so, the user tries again with probability p_r (re-download probability) and goes back to step 2.
4. The user shares the file with probability p_c (cooperativeness).
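The download process can be illustrated per user with a small Monte Carlo sketch. This is not the authors' aggregate model: the function name `simulate_user`, the `bogus_fraction` parameter (the chance that a chosen copy is bogus), and the `max_attempts` cap are assumptions made for illustration, and the slackness delay before checking is left out.

```python
import random

def simulate_user(bogus_fraction, p_a, p_r, p_c, rng, max_attempts=100):
    """Follow one user through download, check, re-download, and sharing.

    Returns (kept, shares, attempts), where kept is "genuine", "bogus",
    or None if the user gave up after deleting a bogus copy.
    """
    for attempt in range(1, max_attempts + 1):
        is_bogus = rng.random() < bogus_fraction
        if is_bogus and rng.random() < p_a:
            # The user recognizes the bogus copy and deletes it.
            if rng.random() < p_r:
                continue  # retry the download
            return None, False, attempt
        # The copy is genuine, or bogus but unnoticed: the user keeps it
        # and shares it with probability p_c.
        kept = "bogus" if is_bogus else "genuine"
        shares = rng.random() < p_c
        return kept, shares, attempt
    return None, False, max_attempts

rng = random.Random(0)  # fixed seed so runs are repeatable
results = [simulate_user(0.5, p_a=0.6, p_r=1.0, p_c=0.25, rng=rng)
           for _ in range(10_000)]
bogus_kept = sum(1 for kept, _, _ in results if kept == "bogus")
print("fraction of users keeping a bogus copy:", bogus_kept / len(results))
```

With p_a below 1, some users keep (and may share) bogus copies, which is the mechanism by which pollution spreads from user to user.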

19 Pollution model. Number of downloads at time step k: N_k, made up of new trials and re-downloads. Users who have ever downloaded: D_k. g_k (b_k): number of users currently downloading genuine (bad) copies. In total, G_k genuine and B_k bogus files are shared in the system. [Equations not preserved in the transcript.]

20 Pollution model. The update equations (not preserved in the transcript) give the total number of genuine files G_{k+1}, the total number of bogus files B_{k+1} (including the probability of not sharing bad files), and the number of re-downloads at k+1.
Notation: L: max. slackness; g_k: number of incoming good files at k; b_k: number of incoming bad files at k; s_j: probability of checking after j steps; p_c: cooperation probability; p_a: awareness probability; p_r: re-download probability.
Example with max. slackness L = 3: total number of checks at k = g_{k-2} s_3 + g_{k-1} s_2 + g_k s_1.
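The L = 3 example generalizes to a sum over the last L steps, where files that arrived j-1 steps ago are checked now with probability s_j. A minimal sketch follows; the function name `checks_at_step` and the list-based layout are illustrative assumptions, not part of the slides.

```python
def checks_at_step(incoming, s, k):
    """Expected number of files checked at step k.

    incoming[i] is the number of files that arrived at step i (g_i or b_i).
    s[j-1] is s_j, the probability of checking j steps after arrival,
    so len(s) is the maximum slackness L.
    """
    total = 0.0
    for j in range(1, len(s) + 1):
        i = k - (j - 1)          # arrival step of the files checked with s_j
        if i >= 0:               # ignore steps before the start of the process
            total += incoming[i] * s[j - 1]
    return total

# L = 3: g_{k-2} s_3 + g_{k-1} s_2 + g_k s_1 with k = 2
g = [10, 20, 30]                 # g_0, g_1, g_2 (made-up values)
s = [0.5, 0.3, 0.2]              # s_1, s_2, s_3 (made-up values)
print(checks_at_step(g, s, k=2))
```

Applying the same function to the b_k sequence gives the number of bogus files checked at step k, which is the quantity that awareness p_a then acts on.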

21 Analytic results. Metric for the efficacy of pollution: pollution level (for a given time slot), B_k / G_k. Settings: M = 15,000 (total number of users); L = 48 (max. slackness); s_k = 1/24 (a user gets interested once every 24 hours); p_r = 1 (users always re-download); p_c = 0.25 (cooperativeness); initial pollution level = 20.

22 Analytic results: comparison with the previous model, in which people are perfect at recognizing pollution. 1. Polluted files do spread because of the lack of awareness. 2. The high level of pollution observed in KaZaA [7] can be explained by our model.

23 Analytic results: the effectiveness of increasing the initial pollution level by the polluter vs. retry probability. The more impatient users are, the more successful the polluter is in polluting files.

24 Analytic results: the effectiveness of increasing the initial pollution level by the polluter vs. user awareness. 1. As awareness increases, a higher k does not give the polluter much improvement. 2. Awareness is critical to an effective attack (the polluter can't control p_c or p_r).

25 Analytic results. As the level of pollution increases, awareness becomes much more important than user cooperativeness for the growth of genuine copies. [Figure: number of genuine files in steady state as a function of cooperativeness and awareness.]

26 Impact on Internet traffic load. Popular files are the polluters' targets. Users re-download with probability p_r. Total number of retrials at time step t_s: [equation not preserved in the transcript]. 1. In the worst case, the number of re-downloads is 3 times larger. 2. 60% of Internet traffic is P2P.
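As a rough illustration of why re-downloads inflate traffic (this is not the slide's retrial formula), assume each attempt independently hits a bogus copy with a fixed probability f. A user then retries with probability f * p_a * p_r per attempt, so the number of attempts is geometric. The function name and the fixed f are assumptions of this sketch.

```python
def expected_attempts(f, p_a, p_r):
    """Expected downloads per user when each attempt leads to a retry with
    probability q = f * p_a * p_r (bogus, noticed, and re-downloaded)."""
    q = f * p_a * p_r
    if q >= 1.0:
        raise ValueError("retry probability must be below 1")
    return 1.0 / (1.0 - q)

# Made-up values: half of the copies are bogus, users always notice and retry.
print(expected_attempts(f=0.5, p_a=1.0, p_r=1.0))
```

Under this simplification, traffic per user grows without bound as f * p_a * p_r approaches 1, which matches the slide's point that polluting popular files multiplies download traffic.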

27 Conclusion. The user behavior study shows that users are not error-free in recognizing pollution and that users' slackness follows a bimodal distribution. An analytical model was developed. The model shows that awareness is one of the key factors in pollution dynamics and that pollution has a great impact on P2P traffic load.

28 References
1. [Content Availability, Pollution and Poisoning in File Sharing Peer to Peer Networks], ACM Conference on Electronic Commerce
2. [Denial of Service Resilience in Peer to Peer File Sharing Systems], SIGMETRICS
3. [Pollution in P2P File Sharing Systems], INFOCOM
4. [Understanding Pollution Dynamics in P2P File Sharing], IPTPS ’06
Some slides are taken from the authors' PowerPoint (marked).

