Understanding Pollution Dynamics in P2P File Sharing Uichin Lee, Min Choi *, Junghoo Cho M. Y. Sanadidi, Mario Gerla UCLA, KAIST * IPTPS’06.

Slides:



Advertisements
Similar presentations
Effective Change Detection Using Sampling Junghoo John Cho Alexandros Ntoulas UCLA.
Advertisements

Data Currency in Replicated DHTs Reza Akbarinia, Esther Pacitti and Patrick Valduriez University of Nantes, France, INIRA ACM SIGMOD 2007 Presenter Jerry.
Effective Straggler Mitigation: Attack of the Clones [1]
CodeTorrent: Content Distribution using Network Coding in VANET Uichin Lee, JoonSang Park, Joseph Yeh, Giovanni Pau, Mario Gerla Computer Science Dept,
More on protocol implementation Version walks Timers and their problems.
Cloud Control with Distributed Rate Limiting Raghaven et all Presented by: Brian Card CS Fall Kinicki 1.
Mobile Code Security Aviel D. Rubin, Daniel E. Geer, Jr. MOBILE CODE SECURITY, IEEE Internet Computing, 1998 Minkyu Lee
Session 8b, 5 th July 2012 Future Network & MobileSummit 2012 Copyright 2012 Mobile Multimedia Laboratory Realistic Media Streaming over BitTorrent George.
Load Balancing of Elastic Traffic in Heterogeneous Wireless Networks Abdulfetah Khalid, Samuli Aalto and Pasi Lassila
Query Dependent Pseudo-Relevance Feedback based on Wikipedia SIGIR ‘09 Advisor: Dr. Koh Jia-Ling Speaker: Lin, Yi-Jhen Date: 2010/01/24 1.
FleaNet : A Virtual Market Place on Vehicular Networks Uichin Lee, Joon-Sang Park Eyal Amir, Mario Gerla Network Research Lab, Computer Science Dept.,
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
Chapter 14: Usability testing and field studies. 2 FJK User-Centered Design and Development Instructor: Franz J. Kurfess Computer Science Dept.
G Robert Grimm New York University Sprite LFS or Let’s Log Everything.
1 Denial-of-Service Resilience in P2P File Sharing Systems Dan Dumitriu (EPFL) Ed Knightly (Rice) Aleksandar Kuzmanovic (Northwestern) Ion Stoica (Berkeley)
How to Logon Oracle Collaboration Suite and change password? STEP 1 Launch
CS246 Search Engine Bias. Junghoo "John" Cho (UCLA Computer Science)2 Motivation “If you are not indexed by Google, you do not exist on the Web” --- news.com.
1 Internet and Data Management Junghoo “John” Cho UCLA Computer Science.
1 End-to-End Detection of Shared Bottlenecks Sridhar Machiraju and Weidong Cui Sahara Winter Retreat 2003.
Internet Cache Pollution Attacks and Countermeasures Yan Gao, Leiwen Deng, Aleksandar Kuzmanovic, and Yan Chen Electrical Engineering and Computer Science.
User studies. Why user studies? How do we know security and privacy solutions are really usable? Have to observe users! –you may be surprised by what.
1 Automatic Identification of User Goals in Web Search Uichin Lee, Zhenyu Liu, Junghoo Cho Computer Science Department, UCLA {uclee, vicliu,
Mining Long Sequential Patterns in a Noisy Environment Jiong Yang, Wei Wang, Philip S. Yu, and Jiawei Han SIGMOD 2002 Presented by: Eddie Date: 2002/12/23.
Knowledge is Power Marketing Information System (MIS) determines what information managers need and then gathers, sorts, analyzes, stores, and distributes.
thinking hats Six of Prepared by Eman A. Al Abdullah ©
Learning Objective Chapter 10 Questionnaire Design CHAPTER ten Questionnaire Design Copyright © 2000 by John Wiley & Sons, Inc.
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
COGNITIVE RADIO FOR NEXT-GENERATION WIRELESS NETWORKS: AN APPROACH TO OPPORTUNISTIC CHANNEL SELECTION IN IEEE BASED WIRELESS MESH Dusit Niyato,
Privacy in P2P based Data Sharing Muhammad Nazmus Sakib CSCE 824 April 17, 2013.
Developing Analytical Framework to Measure Robustness of Peer-to-Peer Networks Niloy Ganguly.
Speaker:Chiang Hong-Ren Botnet Detection by Monitoring Group Activities in DNS Traffic.
Security+ All-In-One Edition Chapter 14 – and Instant Messaging Brian E. Brzezicki.
Enhancing TCP Fairness in Ad Hoc Wireless Networks using Neighborhood RED Kaixin Xu, Mario Gerla UCLA Computer Science Department
Othman Othman M.M., Koji Okamura Kyushu University 1.
Source-End Defense System against DDoS attacks Fu-Yuan Lee, Shiuhpyng Shieh, Jui-Ting Shieh and Sheng Hsuan Wang Distributed System and Network Security.
An Efficient Approach for Content Delivery in Overlay Networks Mohammad Malli Chadi Barakat, Walid Dabbous Planete Project To appear in proceedings of.
Alexander Afanasyev Tutors: Seung-Hoon Lee, Uichin Lee Content Distribution in VANETs using Network Coding: Evaluation of the Generation Selection Algorithms.
Today  Table/List operations  Parallel Arrays  Efficiency and Big ‘O’  Searching.
Estimating Bandwidth of Mobile Users Sept 2003 Rohit Kapoor CSD, UCLA.
1 Understanding Pollution Dynamics in P2P File Sharing Uichin Lee, Min Choi *, Junghoo Cho M. Y. Sanadidi, Mario Gerla UCLA, KAIST * IPTPS’06 Elaine.
Autonomous Replication for High Availability in Unstructured P2P Systems Francisco Matias Cuenca-Acuna, Richard P. Martin, Thu D. Nguyen
Weather Forecasting. Project outline This is an extended piece of work which will include using higher level thinking skills to allow you to achieve level.
Othman Othman M.M., Koji Okamura Kyushu University 1.
Secure and Highly-Available Aggregation Queries via Set Sampling Haifeng Yu National University of Singapore.
Slide title In CAPITALS 50 pt Slide subtitle 32 pt Dynamic and Persistent Scheduling for Voice over IP Traffic in the Long-Term Evolution Uplink Master’s.
FastTrack Network & Applications (KaZaA & Morpheus)
FORS 8450 Advanced Forest Planning Lecture 5 Relatively Straightforward Stochastic Approach.
1 On the Performance of Internet Worm Scanning Strategies Authors: Cliff C. Zou, Don Towsley, Weibo Gong Publication: Journal of Performance Evaluation,
Copyright 2010, The World Bank Group. All Rights Reserved. Testing and Documentation Part II.
A Simulation Study of P2P File Pollution Prevention Mechanisms Chia-Li Huang, Polly Huang Network & Systems Laboratory Department of Electrical Engineering.
1 On the Performance of Internet Worm Scanning Strategies Cliff C. Zou, Don Towsley, Weibo Gong Univ. Massachusetts, Amherst.
1 Modeling, Early Detection, and Mitigation of Internet Worm Attacks Cliff C. Zou Assistant professor School of Computer Science University of Central.
Page Quality: In Search of an Unbiased Web Ranking Seminar on databases and the internet. Hebrew University of Jerusalem Winter 2008 Ofir Cooper
The Online World ONLINE DOCUMENTS. Online documents Online documents (such as text documents, spreadsheets, presentations, graphics and forms) are any.
Traffic Correlation in Tor Source and Destination Prediction PETER BYERLEY RINDAL SULTAN ALANAZI HAFED ALGHAMDI.
Doc.: IEEE /00144r0 Submission 3/01 Nada Golmie, NISTSlide 1 IEEE P Working Group for Wireless Personal Area Networks Dialog with FCC Nada.
Prophet/Critic Hybrid Branch Prediction B B B
Proposal Pollution prevention in the P2P file sharing system Presenter: Elaine.
Managing your implementation with Microsoft Dynamics ® Lifecycle Services 2015 | JUNCTION SOLUTIONS.
Learning Outcomes 1. Know software installation processes 2. Be able to prepare for software installation 3. Be able to install and configure software.
Principles of Demonstrative Instructional Video Peyton R. Glore Assistant Professor School of Information Technology Macon State College October 17, 2007.
Natalija Budinski Primary and secondary school “Petro Kuzmjak” Serbia
CHAPTER 10 Questionnaire Design

OblivP2P: An Oblivious Peer-to-Peer Content Sharing System
OblivP2P: An Oblivious Peer-to-Peer Content Sharing System
Coexistence Mechanism
IFIP – Performance 2007 A Modeling Framework to Understand the Tussle between ISPs and Peer-to-Peer File Sharing Users Michele Garetto - unito.
Presentation transcript:

Understanding Pollution Dynamics in P2P File Sharing Uichin Lee, Min Choi *, Junghoo Cho M. Y. Sanadidi, Mario Gerla UCLA, KAIST * IPTPS’06

Outline Pollution in P2P file sharing User behavior study Pollution model Impact on P2P traffic loads Conclusion

Pollution in P2P File Sharing Pollution is a defensive mechanism to discourage illegal downloads of copyrighted materials. Polluting a title (e.g., Maroon5 – This Love) Polluters aggressively Tamper content or meta-data of files to create “polluted versions.” Pour many polluted versions into the system. Users powerlessly Encounter the polluted files with the genuine ones Randomly select one without knowing pollution.

Too Many Polluted Files & Why? KaZaA is severely polluted!! Reported by Liang et al. (May, 2004) “My Band” 70% out of 1,816,663 copies Given that polluters have limited capabilities (bandwidth/processing power), current level of pollution is too high. Recent models were not able to clearly explain such pollution dynamics. To better understand pollution dynamics, P2P user behavior must be examined.

User Behavior Study Goal How does user behavior impact pollution spread? User behavior study setup Two stage test Questionnaire : familiarity / usage patterns Behavior observation : awareness / slackness 30 graduate students (UCLA, KAIST)

Questionnaire - Familiarity of Participants Asked five questions related to P2P Do you know how popular P2P software works? Do you know multi-part downloading or swarming? …

Questionnaire - Usage Patterns P2P usage pattern 1. Download preparation: send queries/start downloading 2. Download: check download status 3. Post-download: check files/decide to share? Asked few questions related to each stage

Questionnaire - Usage Patterns (Results) 3) Post-download Stage 1) Preparation Stage 2) Download Stage quality availability 57% 20% file size Download decision frequent size-dependent 41% 35% 20%check later Checking frequency Pollution experience yes 70% Failed in noticing pollution yes 30% Re-download? yes23% file size dep. 57%

User Behavior Observation Metrics Awareness probability : the fraction of users who recognize pollution in a downloaded file Slackness distribution : distribution of intervals between download completion time and quality checking time Setup Modified P2P software to monitor user behavior Users are asked to use it and to download files Controlled downloading speed (50K - 1Mbps)

User Behavior Observation Pollution techniques (on MP3 files) Meta-data modification : changed names Quality degradation Incomplete file : cut (30-60 seconds beg./end.) Noise insertion: every 15 seconds Shuffled content : randomly shuffled content Tested files Applied each pollution technique to four songs 40 popular songs (20 polluted + 20 clean) For each download, a user checks familiarity/pollution

User Behavior Observation Awareness Shuffled Content Noise Insertion Incomplete File Quality degradation Meta-data Modification Awareness

User Behavior Observation Slackness – bimodal distribution

Pollution Model Discrete time analysis by extending the previous model and incorporating study results Total M users in the system G 0 /B 0 : initial # genuine/bogus copies Download process 1. At step k, a user (never downloaded before) downloads a file with probability p S (i.e., interest level) 2. After download, the authenticity is checked after an interval k with probability s k where t <= L (max. slackness) 3. Realizes bogus with probability p a (i.e., awareness); if so, he will try again with probability p r (i.e., re-download prob.) 4. Share the file with probability p c (i.e., cooperativeness)

Pollution Model # downloads at k (N k ) Ever downloaded users (D k ) # incoming genuine/bogus files at k (g k, b k ) Total G k and B k files are shared in the system M: total number of users p s : user’s interest rate D k : ever downloaded a file G k : total # good files B k : total # bogus files r k : # of re-downloads at k New TrialsRe-downloads

Pollution Model Total # genuine files (G k+1 ) Total # bogus files (B k+1 ) Prob. of not sharing bad files # re-downloads at k+1 L: max. slackness g k : # incoming good files at k b k : # incoming bad files at k s j : prob. of checking after j p c : cooperation probability p a : awareness probability p r : re-download probability Max Slackness: L=3 kk-1k-2 g k-2 s 3 gks1gks1 g k-1 s 2 Total # checking at k: g k-2 s 3 + g k-1 s 2 + g k s 1

Pollution Model Analysis Iterative solution of the proposed model Metric Pollution level = # polluted copies / # genuine copies Setting M=15,000 (total number of users) L=48 (max. slackness) p s = 1/24 (gets interested in every 24 hrs.) p r = 1 (re-downloads always!) p c = 0.25 (cooperativeness) Initial pollution level = 20

Pollution Model Impact of user awareness P a = 1 Polluted Files P a = 0.9 Polluted Files P a = 0.8 Polluted Files P a = 1 Genuine P a = 0.9 Genuine P a = 0.8 Genuine

Pollution Model Pollution level (k) = #polluted (k)/#genuine (k) P a = 1 P a = 0.95 P a = 0.90 P a = 0.85 P a = 0.80 Time Step Pollution Level

Pollution Model Increasing initial pollution level Awareness is critical to make an effective attack Final Pollution Level Awareness (p a ) Initial pollution level

Impact on P2P Traffic Loads Popular files are targets of the polluters!! Users will re-download with probability p r From our model, we can estimate the total # of re-downloads In the worst case, # of re-downloads is x3 larger!! 60% of the Internet traffic is P2P

Conclusion User behavior study shows Users are not error-free in recognizing pollution Users’ slackness follows a bimodal distribution Developed an analytical model Analytic model shows Awareness is one of the key factors in pollution dynamics Pollution has a great impact on the P2P traffic loads