Understanding Pollution Dynamics in P2P File Sharing. Uichin Lee, Min Choi*, Junghoo Cho, M. Y. Sanadidi, Mario Gerla (UCLA, KAIST*). IPTPS’06. Elaine.

Presentation transcript:

1 Understanding Pollution Dynamics in P2P File Sharing. Uichin Lee, Min Choi*, Junghoo Cho, M. Y. Sanadidi, Mario Gerla (UCLA, KAIST*). IPTPS’06. Elaine

2 Outline: Pollution in P2P file sharing systems; Related work; User behavior study; Analytic pollution model and its results; The impact of pollution on P2P traffic load

3 Pollution in P2P file sharing system. An object in a P2P file sharing system is composed of two parts: meta-data (song title, artist name, length, encoding scheme, etc.) and content.

4 Pollution in P2P file sharing system. P2P file sharing systems with search capability (e.g., eMule, Gnutella, KaZaA): a user sends a query for song A to the P2P network, receives responses for song A as a list of available peers, and selects one source for download.

5 Pollution in P2P file sharing system. Definition of pollution: a file’s actual content doesn’t match its meta-data description! Clean file of song A: (M_A, C_A). Tampering with meta-data: (M_A, C_B). Tampering with content: (M_A, C_A’).
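
The definition above can be sketched as a simple predicate. This is only an illustration: the `SharedObject` type, the `content_id` field, and the `genuine_content` lookup table are hypothetical names, and a real peer has no such ground-truth table (which is why users have to check files by listening to them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedObject:
    """A shared file: what the meta-data claims (M) and what the content is (C)."""
    meta_title: str   # song the meta-data claims, e.g. "song A"
    content_id: str   # hypothetical identifier of the actual content


def is_polluted(obj, genuine_content):
    """True if the content does not match the meta-data description.

    genuine_content maps a title to the identifier of its genuine content
    (an assumed ground truth, used only for illustration).
    """
    return genuine_content[obj.meta_title] != obj.content_id


genuine_content = {"song A": "C_A", "song B": "C_B"}
clean = SharedObject("song A", "C_A")               # (M_A, C_A)
meta_tampered = SharedObject("song A", "C_B")       # (M_A, C_B)
content_tampered = SharedObject("song A", "C_A'")   # (M_A, C_A')
```

Both tampered objects are flagged by `is_polluted`, while the clean one is not.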

6 Pollution in P2P file sharing system. File sharing systems impact sales in the music, video, and games industries and in television networks. How a pollution attack works: a massive number of decoys is injected into the peer-to-peer network to reduce the availability of the targeted item. A case: Madonna’s new album in 2003. Pollution companies have emerged: Overpeer, Loudeye, Retspan.

7 Related work. 1. [Content Availability, Pollution and Poisoning in File Sharing Peer to Peer Networks], ACM Conference on Electronic Commerce: studies the impact of different pollution attacks on content availability in peer-to-peer file sharing networks. 2. [Denial of Service Resilience in Peer to Peer File Sharing Systems], SIGMETRICS: made the first attempt to model the dynamics of P2P file pollution attacks. 3. [Pollution in P2P File Sharing System], INFOCOM: KaZaA is severely polluted. Given that polluters have limited capabilities (bandwidth/processing power), the current level of pollution is too high. Not only does a user always detect the polluted file, but he also deletes it.

8 Related work. This paper: a more accurate model of pollution; polluted files indeed spread from user to user over the network; pollution has a significant impact on P2P traffic load.

9 User behavior study. Goal: how does user behavior impact the spread of pollution? Two stages of the study: a questionnaire (familiarity / usage patterns) and behavior observation (awareness / slackness). Participants: 30 graduate students (UCLA, KAIST).

10 User behavior study: P2P familiarity. 1. Have you ever used P2P file sharing? 2. Do you frequently share files with P2P systems? 3. Do you know how to enable or disable sharing local files? 4. Do you know how popular P2P software works? 5. Do you know about multipart downloading or swarming? Result: familiarity is high.

11 User behavior study: P2P usage pattern
1) Preparation stage. Download decision based on: quality 57%, availability 20%, file size (share not given).
2) Download stage. Checking frequency: frequent 41%, size-dependent 35%, check later 20%. Re-download?: yes 23%, file-size dependent 57%.
3) Post-download stage. Pollution experience: yes 70%. Failed in noticing pollution: yes 30%. Sharing: yes 43.3%.

12 User behavior study: Summary. 1. Even sophisticated P2P users sometimes fail to recognize polluted files. 2. Users do not check the quality of a downloaded file immediately after the download completes. 3. Not all users are cooperative in sharing downloaded files. 4. Users make their download decisions primarily based on the quality of a file.

13 User Behavior Observation. Mainly interested in measuring the following two parameters. Awareness probability: the fraction of users who recognize pollution in a downloaded file. Slackness distribution: the distribution of intervals between download completion time and quality checking time.

14 User Behavior Observation: Experiment setup. The server is seeded with genuine/polluted MP3 files. P2P software is modified to monitor user behavior. Users are asked to use it and to download files. After each downloaded topic, users are asked about their familiarity with it and whether it was polluted. Download speed is controlled (50K - 1 Mbps). Duration: one month.

15 User Behavior Observation: Pollution techniques (on MP3 files). Meta-data modification: changed names. Quality degradation. Incomplete file: 30-60 seconds cut from the beginning/end. Noise insertion: every 20 seconds. Shuffled content: randomly shuffled content. 20 genuine songs; 5*4 = 20.

16 User Behavior Observation: User awareness. [Figure: user awareness for each pollution technique: meta-data modification, quality degradation, incomplete file, noise insertion, shuffled content]

17 User Behavior Observation: Slackness. The elapsed time between download completion and pollution checking. Summary: (1) P2P users are lacking in pollution awareness; (2) the slackness distribution is bimodal.

18 Pollution Model
Discrete-time analysis that extends the previous model and incorporates the study results.
Total of M users in the system; only one kind of file in the system.
G_0 / B_0: initial # of genuine/bogus copies.
Download process:
1. At step k, a user who has never downloaded before downloads a file with probability s_k (i.e., interest level).
2. After the download, authenticity is checked after an interval t, where t <= L (max. slackness).
3. The user realizes the file is bogus with probability p_a (i.e., awareness) and deletes it directly; if so, he tries again with probability p_r (i.e., re-download prob.) and goes back to step 2.
4. The user shares the file with probability p_c (i.e., cooperativeness).
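
The download process on this slide can be sketched for a single user as follows. This is a minimal illustration, not the authors' model: the function name `simulate_user`, the parameter `p_bogus` (the chance that a download attempt returns a bogus copy), and the `max_tries` safety cap are assumptions, and the slackness interval t is ignored because only the sequence of decisions is followed.

```python
import random


def simulate_user(p_bogus, p_a, p_r, p_c, rng, max_tries=1000):
    """Follow one user through steps 2-4 of the download process.

    p_bogus: assumed probability that one download attempt returns a bogus copy
    p_a: awareness, p_r: re-download probability, p_c: cooperativeness
    Returns (number_of_downloads, kept_file, shared), where kept_file is
    "genuine", "bogus" (pollution went unnoticed), or None (user gave up).
    """
    for attempt in range(1, max_tries + 1):
        got_bogus = rng.random() < p_bogus
        # Step 3: a bogus file is recognized only with probability p_a.
        noticed = got_bogus and rng.random() < p_a
        if not noticed:
            kept = "bogus" if got_bogus else "genuine"
            # Step 4: the kept file is shared with probability p_c.
            shared = rng.random() < p_c
            return attempt, kept, shared
        # The bogus file is deleted; retry with probability p_r.
        if rng.random() >= p_r:
            return attempt, None, False
    # max_tries is a safety cap added for this sketch only.
    return max_tries, None, False


rng = random.Random(0)
result = simulate_user(p_bogus=0.5, p_a=0.5, p_r=1.0, p_c=0.25, rng=rng)
```

Note that a user who fails to notice pollution can end up sharing a bogus copy, which is how polluted files spread from user to user in this sketch.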

19 Pollution Model. N_k: # of downloads at time step k (new trials + re-downloads). D_k: # of users who have ever downloaded (new trials). g_k (b_k): # of users currently downloading genuine (bad) copies. In total, G_k genuine and B_k bogus files are shared in the system.

20 Pollution Model
Quantities updated at each step: total # of genuine files (G_{k+1}); total # of bogus files (B_{k+1}), involving the prob. of not sharing bad files; # of re-downloads at k+1.
Notation: L: max. slackness; g_k: # of incoming good files at k; b_k: # of incoming bad files at k; s_j: prob. of checking after j; p_c: cooperation probability; p_a: awareness probability; p_r: re-download probability.
Example with max. slackness L = 3: files that arrived at steps k-2, k-1, and k are checked at k with probabilities s_3, s_2, and s_1, so the total # of checks at k is g_{k-2} s_3 + g_{k-1} s_2 + g_k s_1.
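
The checking count in the L = 3 example generalizes to a sum over the last L steps. The sketch below assumes that reading of the slide; the function name `checks_at_step` and the list-based layout (`incoming[i]` for the files that arrived at step i, `s[j-1]` for s_j) are choices made for illustration, and the input numbers are made up.

```python
def checks_at_step(k, incoming, s):
    """Total number of files checked at step k.

    incoming[i]: number of files that arrived at step i (g_i or b_i)
    s[j-1]: probability s_j that a file is checked at the j-th step
            counted from its arrival (j = 1 .. L, with L = len(s))
    Computes sum over j of incoming[k - j + 1] * s_j, skipping steps
    before the start of the simulation.
    """
    total = 0.0
    for j in range(1, len(s) + 1):
        i = k - j + 1
        if i >= 0:
            total += incoming[i] * s[j - 1]
    return total


# Illustrative numbers only: L = 3, files arriving at steps 0, 1, 2.
g = [8, 4, 2]
s = [0.5, 0.25, 0.25]
total_checked = checks_at_step(2, g, s)  # g_0*s_3 + g_1*s_2 + g_2*s_1
```

With these inputs the three terms are 8 * 0.25, 4 * 0.25, and 2 * 0.5, so `total_checked` is 4.0.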

21 Analytic Results. Metric for measuring the efficacy of pollution: pollution level (for a given time slot) = B_k / G_k. Settings: M = 15,000 (total number of users); L = 48 (max. slackness); s_k = 1/24 (gets interested every 24 hrs.); p_r = 1 (always re-downloads!); p_c = 0.25 (cooperativeness); initial pollution level = 20.
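
As a small worked calculation, the pollution level can be converted into the chance of picking a bogus copy. This assumes that a downloader chooses uniformly among all shared copies, which is an assumption of this sketch rather than something stated on the slide; the function names are hypothetical.

```python
def pollution_level(bogus, genuine):
    """Pollution level B_k / G_k for one time slot."""
    if genuine <= 0:
        raise ValueError("pollution level is undefined without genuine copies")
    return bogus / genuine


def bogus_fraction(level):
    """Share of bogus copies among all shared copies, B / (G + B).

    Under the uniform-choice assumption this is also the probability
    that a single download returns a bogus copy.
    """
    return level / (1.0 + level)


initial_level = pollution_level(bogus=20, genuine=1)  # initial pollution level = 20
p_bogus_initial = bogus_fraction(initial_level)       # 20 / 21
```

Under that assumption, an initial pollution level of 20 means 20 of every 21 shared copies are bogus at the start.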

22 Analytic Results. Comparison with the previous model, in which people are perfect at recognizing pollution. 1. Polluted files indeed spread due to the lack of awareness. 2. Such a high level of pollution in KaZaA [7] can be explained using our model.

23 Analytic Results. The effectiveness of increasing the initial pollution level by the polluter vs. retry probability. The more impatient users are, the more successful the polluter is in polluting files.

24 Analytic Results. The effectiveness of increasing the initial pollution level by the polluter vs. user awareness. 1. As awareness increases, a higher k does not provide the polluter much improvement. 2. Awareness is critical to make an effective attack (the polluter can’t control p_c, p_r).

25 Analytic Results. As the level of pollution increases, awareness becomes much more important than user cooperativeness for the growth of genuine copies. [Figure: number of genuine files in steady state as a function of cooperativeness and awareness]

26 Impact on Internet Traffic Load. Popular files are targets of the polluters! Users will re-download with probability p_r. At time step t_s, the total # of retrials (expression not preserved in the transcript). 1. In the worst case, the # of re-downloads is 3x larger. 2. 60% of Internet traffic is P2P.
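
To see why re-downloads inflate traffic, a static back-of-the-envelope calculation helps. It is not the time-dependent expression from the slide: it assumes every attempt returns a bogus copy with a fixed probability q (a hypothetical parameter), so another attempt follows with probability q * p_a * p_r and the number of attempts is geometric.

```python
def expected_downloads(q, p_a, p_r):
    """Expected number of download attempts per interested user.

    q: assumed fixed probability that an attempt returns a bogus copy
    p_a: awareness, p_r: re-download probability
    A retry happens only if the copy is bogus, noticed, and the user
    retries, so the expectation is 1 / (1 - q * p_a * p_r).
    """
    retry = q * p_a * p_r
    if retry >= 1.0:
        raise ValueError("user would retry forever under these parameters")
    return 1.0 / (1.0 - retry)


# Illustrative values only: half the copies bogus, half of them noticed.
per_user = expected_downloads(q=0.5, p_a=0.5, p_r=1.0)  # 1 / (1 - 0.25)
```

In this simplified picture, higher awareness and a higher retry probability both increase the download count per user, which matches the direction of the traffic-load argument above.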

27 Conclusion. The user behavior study shows: users are not error-free in recognizing pollution; users’ slackness follows a bimodal distribution. Developed an analytical model. The analytic model shows: awareness is one of the key factors in pollution dynamics; pollution has a great impact on P2P traffic loads.

28 Reference. 1. [Content Availability, Pollution and Poisoning in File Sharing Peer to Peer Networks], ACM Conference on Electronic Commerce. 2. [Denial of Service Resilience in Peer to Peer File Sharing Systems], SIGMETRICS. 3. [Pollution in P2P File Sharing System], INFOCOM. 4. [Understanding Pollution Dynamics in P2P File Sharing], IPTPS ’06. Part of the PowerPoint is from the authors (with mark).