Measurements, Analysis, and Modeling of BitTorrent-like Systems


Measurements, Analysis, and Modeling of BitTorrent-like Systems
Lei Guo (1), Songqing Chen (2), Zhen Xiao (3), Enhua Tan (1), Xiaoning Ding (1), and Xiaodong Zhang (1)
(1) College of William and Mary, (2) George Mason University, (3) AT&T Labs - Research

Basic Model of P2P Systems
- Peers sharing different files self-organize into a P2P network
- Peers exchange the files they desire
- Limitations
  - Free riding
  - Large file downloading
- Examples: Gnutella, KaZaA, eDonkey/eMule/Overnet

BitTorrent: Fast Delivery with Incentive
- A large file is divided into chunks
- Peers interested in the same file self-organize into a torrent
- Peers exchange file chunks with each other
- Incentive is established by tit for tat
- Very simple and effective; scales fairly well during flash crowds
[Figure: a torrent of bits]

BitTorrent Traffic
- Online users: 6.8 million in August 2004, 9.6 million in August 2005 (BigChampagne)
- Traffic volume: 53% of all P2P traffic on the Internet in June 2004 (CacheLogic)
[Figure: share of Internet traffic. P2P traffic: 60-80%; other traffic: 20-30%. Source: CacheLogic, 2004]

Limited Understanding of BitTorrent
- Existing studies on BitTorrent systems (INFOCOM04, SIGCOMM04)
  - Unrealistic assumptions in the system model: no evolution considered
  - Single-torrent based, although more than 85% of BT users join multiple torrents
- What is not yet clear about BitTorrent systems
  - Service availability
  - Service stability
  - Service fairness
- Objectives of this work
  - Evolution of the single-torrent system, and limitations of BT
  - Multi-torrent model for inter-torrent relation and collaboration during the entire lifetime

Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
- Conclusion

How BitTorrent Works: Publishing
- The publisher
  1. Creates a meta file
  2. Publishes it on a Web site
  3. Starts the tracker site
  4. Starts a BT client as the initial seed
- Fields of the meta file foo.torrent
  - announce: tracker URL for bootstrap
  - creation date: epoch time of file creation
  - length: file size
  - name: file name
  - piece length: chunk size
  - pieces: SHA1 hash key of each chunk
[Figure: the initial seed, the Web site hosting foo.torrent, and the tracker site holding the peer list; the seed announces "I am here!" to the tracker]
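The meta file fields listed on this slide can be illustrated with a small sketch. It is only an approximation under stated assumptions: the tracker URL, the function names `bencode` and `build_metainfo`, and the 256 KiB default piece length are hypothetical choices, and the nesting of `length`, `name`, `piece length`, and `pieces` under an `info` key follows the common BitTorrent metainfo layout rather than anything shown on the slide.

```python
import hashlib
import time


def bencode(value):
    """Encode ints, str/bytes, lists, and dicts using bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_metainfo(data, name, tracker_url, piece_length=256 * 1024,
                   creation_date=None):
    """Build a single-file metainfo dictionary for the given content bytes."""
    # "pieces" is the concatenation of the 20-byte SHA-1 digest of each chunk.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,
        "creation date": int(time.time()) if creation_date is None else creation_date,
        "info": {
            "length": len(data),
            "name": name,
            "piece length": piece_length,
            "pieces": pieces,
        },
    }


if __name__ == "__main__":
    # Hypothetical tracker URL and content, for illustration only.
    meta = build_metainfo(b"x" * 1_000_000, "foo",
                          "http://tracker.example.org/announce")
    print(len(bencode(meta)), "bytes of bencoded metainfo")
```

A downloader can recompute the SHA-1 digest of each received chunk and compare it with the corresponding 20-byte slice of `pieces` before accepting the chunk.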

How BitTorrent Works: Downloading
- The downloader
  1. Downloads the meta file
  2. Starts a BT client and connects to the tracker site
  3. Gets the peer list from the tracker
  4. Gets the first chunk from other peers (seeds)
  5. Exchanges file chunks with other peers
  6. Download complete: becomes a new seed
- The initial seed leaves
- Future performance depends on the arrival and departure of new downloaders and seeds
[Figure: the downloader fetches foo.torrent from the Web site, announces "I am here!" to the tracker site, receives the peer list, and exchanges chunks with seeds and other peers]

Our Methodology of this Study
- Measurement
  - BitTorrent traffic pattern
  - Meta file downloading and tracker statistics
- Analysis
  - BitTorrent user behavior and performance limitations
  - Curve fitting, parameter estimation, and validation of mathematical models
- Modeling
  - Torrent evolution and inter-torrent relation
  - Fluid model, probability model, and graph model

Meta File Downloading
- The first HTTP packets of .torrent file downloading
  - Cable network: 3,000+ downloads, 1,000+ torrent meta files
  - Server farm: 50 tracker sites host hundreds of torrents
  - Gigascope: fast Internet traffic monitoring tool by AT&T
  - About 10 days
- What information does it contain?
  - Torrent birth time
  - Peer arrival time to the torrent (packet capture time of downloading)
[Figure: fields of foo.torrent (announce, creation date, length, name, piece length, pieces)]

Torrent Statistics on Trackers
- Professional/dedicated tracker sites
  - Each may host thousands of torrents at the same time
  - http://www.alluvion.org/ and http://www.crapness.com/, collected by University of Massachusetts, Amherst
  - Example: alluvion, 1,500 torrents, 550 of them fully traced
- What information does it contain?
  - Torrents: torrent birth time, file size, number of peers/seeds
  - Peers: request time, downloading/uploading bytes, downloading/uploading bandwidth
- Sampled every 0.5 hour for 48 days

Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
  - The evolution of torrent over time
  - Limitations of current BitTorrent systems
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
- Conclusion

Torrent Popularity
- Peer arrivals decrease exponentially with time
- Peer arrival rate: derivative of the CCDF
[Figure: CCDF of peer arrivals vs. time after torrent birth (day), raw data and linear fit, for the meta file workload and the tracker site workload]
[Figure: relative deviation (%) for individual torrents, with torrents ranked by population in non-ascending order; 6% on average]
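One way to read the linear fit on a log-scale CCDF is sketched below. It assumes, as a labeled assumption, that the arrival rate has the form λ(t) = λ0·exp(-t/τ), so that the CCDF of arrival times is exp(-t/τ) and the total population is λ0·τ. The function name `fit_exponential_arrivals` and the synthetic data are hypothetical and do not come from the traces.

```python
import math


def fit_exponential_arrivals(arrival_times):
    """Estimate (lambda0, tau) from peer arrival times measured since torrent birth.

    Assumes ln CCDF(t) = -t / tau, fitted by ordinary least squares.
    """
    times = sorted(arrival_times)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two arrivals")
    # Empirical CCDF at the k-th arrival: fraction of peers arriving at or after it.
    log_ccdf = [math.log((n - k) / n) for k in range(n)]
    mean_t = sum(times) / n
    mean_y = sum(log_ccdf) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_ccdf))
    slope = sxy / sxx          # expected to be negative: -1 / tau
    tau = -1.0 / slope
    lambda0 = n / tau          # total population = lambda0 * tau under the assumption
    return lambda0, tau


if __name__ == "__main__":
    # Synthetic arrivals placed exactly on an exponential CCDF with tau = 30 hours.
    n = 200
    synthetic = [-30.0 * math.log((n - k) / n) for k in range(n)]
    print(fit_exponential_arrivals(synthetic))
```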

Torrent Death
- Peer n arrives at time t_n
- Quantities defined: inter-arrival time, peer arrival rate, seed service time, seed leaving rate, downloading time, downloading rate
- When t_n → ∞, what will happen?
- Torrent death: inter-arrival time > seed service time
[Figure: timeline t with peer n arriving at t_n and peer n+1 arriving at t_{n+1}, with the gap marked "torrent dead"]
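The death condition can be turned into a rough lifespan estimate. The sketch below assumes an arrival rate λ(t) = λ0·exp(-t/τ) and a mean seed service time of 1/γ, where γ is the seed leaving rate; under those assumptions the inter-arrival time 1/λ(t) exceeds 1/γ once t > τ·ln(λ0/γ). The function names are hypothetical, and the second function simply counts peers that arrive after that time, which is one possible reading of the downloading failure ratio.

```python
import math


def torrent_lifespan(lambda0, tau, gamma):
    """Time at which 1/lambda(t) first exceeds 1/gamma, assuming
    lambda(t) = lambda0 * exp(-t / tau)."""
    if lambda0 <= gamma:
        return 0.0  # arrivals are already sparser than seed departures
    return tau * math.log(lambda0 / gamma)


def late_arrival_fraction(lambda0, gamma):
    """Fraction of all peers arriving after the lifespan: exp(-T_life / tau),
    which simplifies to gamma / lambda0 under the same assumption."""
    return min(1.0, gamma / lambda0)


if __name__ == "__main__":
    # Illustrative numbers only (rates per hour, tau in hours).
    print(torrent_lifespan(lambda0=10.0, tau=100.0, gamma=1.0))
    print(late_arrival_fraction(lambda0=10.0, gamma=1.0))
```

Under this reading, making seeds stay ten times longer (γ divided by 10) lowers the late-arrival fraction by a factor of ten but extends the lifespan only by τ·ln(10).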

Torrent Population and Lifespan
- Most torrents are small (avg 102)
- Most torrents are short-lived (avg 8 days)
[Figure: torrent population vs. rank of torrents, trace and model]
[Figure: torrent lifespan (hour) across torrents, trace and model]

Downloading Failure Ratio
- [Equation: definition of the downloading failure ratio R_fail]
- Average downloading failure ratio: about 10%
- Different evolution patterns
- Small population → large R_fail
- Reminder: most torrents have a small population!
- Altruistic peers make torrents live longer
[Figure: downloading failure ratio and torrent population, with torrents ranked in non-ascending order of downloading failure ratio]

Torrent Evolution: Fluid Model
- Existing model (SIGCOMM 04)
  - Constant arrival rate: λ = const
  - The torrent reaches equilibrium
- The correct model
  - Exponentially decreasing arrival rate
  - The torrent eventually dies
  - Verified by our measurements
- Two completely different pictures

Torrent Evolution: Modeling Results
- Flash crowd
  - Number of downloaders: increases exponentially
  - Number of seeds: increases exponentially
- Peak time
  - A very short duration
  - Constant arrival model: flat peak
- Attenuation: a long tail
  - Number of downloaders: decreases exponentially
  - Number of seeds: decreases exponentially
  - The constant arrival model is far from reality: no attenuation
- Torrent death
[Figure: # of downloaders and # of seeds vs. time (hour), comparing the trace, the model, and the constant arrival model]

Performance Stability
- Only stable when the torrent is large
- Fluctuates significantly after peak time
- Larger torrents have higher and more stable performance
[Figure: evolution over time. # of peers (downloaders and seeds) and avg download speed (byte/sec) vs. time (hour), model and trace]
[Figure: snapshot of torrents at time t. Downloaders, seeds, and avg download speed (byte/sec) across torrents]

Service Unfairness
- Contribution ratio: uploaded bytes / downloaded bytes
- Unfairness: the higher the download speed, the lower the uploading contribution
  - Seeds serve high-speed downloaders first
- Peers are not willing to serve after downloading
  - Not due to new file downloading: selfish
[Figure: peer contribution ratio and download speed (bytes/sec) for ranked peers]
[Figure: peer contribution ratio and # of torrents for ranked peers]
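The contribution ratio defined on this slide is a simple quotient, sketched here with hypothetical function names and made-up byte counts. Ranking peers by the ratio mirrors the "ranked peers" axis of the figures, though the ordering used on the slide is not stated.

```python
def contribution_ratio(uploaded_bytes, downloaded_bytes):
    """Uploaded bytes divided by downloaded bytes for one peer."""
    if downloaded_bytes <= 0:
        raise ValueError("downloaded_bytes must be positive")
    return uploaded_bytes / downloaded_bytes


def rank_by_contribution(peers):
    """Return (ratio, peer_name) pairs, highest contribution ratio first.

    `peers` maps a peer name to an (uploaded_bytes, downloaded_bytes) tuple.
    """
    return sorted(
        ((contribution_ratio(up, down), name) for name, (up, down) in peers.items()),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up byte counts for illustration only.
    sample = {"fast_peer": (100, 1000), "slow_peer": (1500, 1000)}
    for ratio, name in rank_by_contribution(sample):
        print(f"{name}: {ratio:.2f}")
```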

Single-torrent Model: Summary
- Torrent evolution over time
  - Exponentially decreasing arrival rate
  - Flash crowd, short peak, long-tailed attenuation
- BitTorrent limitations
  - Content availability: torrent death
  - Performance stability
  - Service fairness

Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
  - Traffic pattern and user behavior
  - Graph based model of inter-torrent relation
- Inter-torrent collaboration
- Conclusion

Multi-torrent Environment Dynamics
- Peers and torrents on the Internet are considered as an open system
- Torrent birth rate, torrent request rate, and peer birth rate are constant
- Implication: avg # of torrents a peer requests = torrent request rate / peer birth rate = constant
- The lifecycle of a BT peer: downloading, seeding, sleeping, …, dead
[Figure: CDF of torrents, CDF of requests, and CDF of peers vs. torrent birth time, request arrival time, and peer birth time (hour); raw data with linear fits for torrent birth and request arrival, and an asymptotic fit for peer birth]

Peer Request Pattern: Request Rate
- Peer request rate: requests by a peer to different torrents per unit time
- The peer request process seems Poisson-like
- A peer requests a new torrent with probability p (the participation probability)
- A peer is dead with probability 1-p
[Figure: request interval (day) and # of torrents for ranked peers; annotation: 77 years]

Peer Request Pattern: Participation Probability
- Probability model: peers request at least m torrents
- p = 0.8551
- Another estimation of p
- Probability model confirmed
[Figure: number of torrents (m) vs. peer rank (log i), raw data and linear fit]

Inter-torrent Relation Graph: How Can Torrents Help Each Other?
- Edge 1 between torrents i and j: some peers in torrent i have downloaded j
- Edge 2 between torrents i and j: some peers in torrent j have downloaded i
- Edge weight W_{i,j}: number of such peers
[Figure: weighted out-degree and weighted in-degree vs. torrent size (# of online peers), trace and model]
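A small sketch of how such a weighted graph could be built from per-peer records is given below. The input layout (`online` and `history` dictionaries), the function names, and the edge orientation (an edge from i to j counts peers currently in torrent i that have previously downloaded j) are assumptions made for illustration; the slide does not fix a direction.

```python
from collections import defaultdict


def build_relation_graph(online, history):
    """Return {(i, j): weight}, where weight counts the peers currently in
    torrent i that have already downloaded torrent j (assumed orientation).

    online:  peer -> set of torrents the peer is currently in
    history: peer -> set of torrents the peer has downloaded before
    """
    weights = defaultdict(int)
    for peer, current in online.items():
        for i in current:
            for j in history.get(peer, set()):
                if i != j:
                    weights[(i, j)] += 1
    return dict(weights)


def weighted_out_degree(weights, torrent):
    """Sum of weights on edges leaving `torrent`."""
    return sum(w for (i, _), w in weights.items() if i == torrent)


def weighted_in_degree(weights, torrent):
    """Sum of weights on edges entering `torrent`."""
    return sum(w for (_, j), w in weights.items() if j == torrent)


if __name__ == "__main__":
    # Hypothetical peers and torrents.
    online = {"p1": {"A"}, "p2": {"A"}, "p3": {"B"}}
    history = {"p1": {"B"}, "p2": {"B", "C"}, "p3": {"A"}}
    graph = build_relation_graph(online, history)
    print(graph, weighted_out_degree(graph, "A"), weighted_in_degree(graph, "A"))
```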

Single-torrent vs. Multi-torrent Model
- Single-torrent model
  - ↑ seed service time, ↓ download failure rate
  - Limited seed service time ↑, but inter-arrival time ↑ exponentially
  - Small improvement
- Multi-torrent model
  - Old peers come back multiple times
  - ↑ peer arrival rate, ↓ peer inter-arrival time
  - Significant improvement

Single-torrent vs. Multi-torrent Model
- Single-torrent model: seeds stay 10 times longer, γ* = γ/10
- Multi-torrent model: torrent death when λ'(T'_life) = γ
- Inter-torrent collaboration is much more effective than stimulating seeds to serve longer
[Figure: comparison values 0.1, 0.01, and 1×10^-6 ≈ 0]

Outline
- BitTorrent mechanism and our methodology
- Modeling and characterization of single-torrent system
- Modeling and characterization of multi-torrent system
- Inter-torrent collaboration
  - Tracker site overlay
  - Instant incentive for collaboration
- Conclusion

Tracker Site Overlay
- Self-organized P2P network (a logical structure)
  - Neighbor-in: torrents that can serve me
  - Neighbor-out: torrents that I can serve (peer list)
- An instance of the inter-torrent relation graph
- A built-in mechanism for content search; covers 99%+ of torrents
- Trackerless BitTorrent: uses DHT to store the meta file
[Figure: overlay among tracker sites A, B, C, and D]

Incentive for Inter-Torrent Collaboration
- Instant incentive, similar to the "tit-for-tat" principle
- Neighboring cycle detection
- Neighboring cycle construction
- Bandwidth trading: get one chunk, serve multiple peers
[Figure: peers Jack and Tom, file A and file D, torrents A, C, and D; caption "Thanks Jack!"]

Conclusion
- Extensive analysis and modeling to study the behaviors of BT-like systems
  - Tracker trace and .torrent downloading trace
  - Mathematical model
- BitTorrent systems have limitations due to the exponentially decreasing peer arrival rate
  - Service availability, performance stability, and fairness
- Graph based multi-torrent model
- System design for inter-torrent collaboration

Thank you!

Backup for Questions

[Figure: torrent lifespan (hour) across torrents, trace and model]
- Extract the fitting inputs from the trace
- Get λ0 and τ using linear regression
- Lifespan model verified by measurement

Torrent Population
- Total population
  - Model verified by measurement
- Observations
  - The population of most torrents is small (102 on average)
- Downloading failure ratio
  - Small population → large R_fail
[Figure: torrent population vs. rank of torrents (in non-ascending order of modeling results), trace and model]

Torrent Evolution: Fluid Model
- Basic equation set
- Parameters
  - x(t): number of downloaders
  - y(t): number of seeds
  - λ0: initial peer arrival rate
  - τ: attenuation parameter of λ
  - μ: uploading bandwidth
  - c: downloading bandwidth (c >> μ)
  - γ: seed leaving rate
  - η: file sharing efficiency
  - eigenvalues 1 and 2 of the equation set
  - a, b, c1, c2, d1, d2: constants
- Resolution
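The equation set itself did not survive in this transcript, so the following is only a sketch of one fluid model that is consistent with the parameter list. It assumes dx/dt = λ0·exp(-t/τ) - μ(ηx + y) and dy/dt = μ(ηx + y) - γy, with the downloading bandwidth c large enough never to be the bottleneck. The function name, the forward Euler integration, and the parameter values in the example are illustrative assumptions, not fitted values.

```python
import math


def simulate_torrent(lambda0, tau, mu, eta, gamma, horizon, dt=0.01):
    """Integrate the assumed fluid model with forward Euler.

    Returns a list of (t, downloaders, seeds). Passing tau=math.inf gives
    the constant arrival rate case for comparison.
    """
    x = 0.0  # number of downloaders
    y = 0.0  # number of seeds
    trajectory = [(0.0, x, y)]
    steps = int(round(horizon / dt))
    for k in range(steps):
        t = k * dt
        arrivals = lambda0 * math.exp(-t / tau) * dt
        # Downloads completed in this step, capped by the downloaders available.
        finished = min(mu * (eta * x + y) * dt, x + arrivals)
        departures = gamma * y * dt
        x = x + arrivals - finished
        y = y + finished - departures
        trajectory.append((t + dt, x, y))
    return trajectory


if __name__ == "__main__":
    decaying = simulate_torrent(10.0, 20.0, 1.0, 0.5, 2.0, horizon=200.0)
    constant = simulate_torrent(10.0, math.inf, 1.0, 0.5, 2.0, horizon=200.0)
    print("peak downloaders (decaying):", max(x for _, x, _ in decaying))
    print("final downloaders (decaying):", decaying[-1][1])
    print("final downloaders (constant):", constant[-1][1])
```

With these illustrative parameters the decaying-arrival run rises to a peak and then attenuates toward zero, while the constant-arrival run settles at a nonzero equilibrium, which is the qualitative contrast the slides draw between the two models.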

Peer Request Pattern: Summary
- Multi-torrent environment: an open model
  - Torrent birth rate: 0.9454 per hour (nearly a constant)
  - Peer birth rate: 19.37 per hour (nearly a constant)
  - Torrent request rate (for all peers over all torrents): 133.39 per hour (nearly a constant)
  - The rates actually increase slowly according to BigChampagne
- Peer request pattern
  - Lifecycle: downloading, seeding, sleeping, …, next request with probability p
  - Peer participation probability: 0.85
  - Request rate (for different torrents by a peer): Poisson-like
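The reported rates allow a quick cross-check of the participation probability. The sketch assumes that every peer requests at least one torrent and then continues with probability p after each download, so the number of torrents per peer is geometric with mean 1/(1-p), and the expected number of peers requesting at least m torrents is N·p^(m-1). The function names are hypothetical.

```python
def participation_probability(torrent_request_rate, peer_birth_rate):
    """Estimate p from mean torrents per peer = request rate / birth rate
    = 1 / (1 - p), under the geometric assumption."""
    mean_torrents = torrent_request_rate / peer_birth_rate
    return 1.0 - 1.0 / mean_torrents


def peers_requesting_at_least(m, total_peers, p):
    """Expected number of peers that request at least m torrents."""
    return total_peers * p ** (m - 1)


if __name__ == "__main__":
    # Rates per hour as reported in the summary above.
    p = participation_probability(133.39, 19.37)
    print(f"mean torrents per peer: {133.39 / 19.37:.3f}")
    print(f"estimated p: {p:.4f}")
    print(peers_requesting_at_least(10, total_peers=1000, p=p))
```

The estimate from the rates comes out close to the value of p obtained from the linear fit of peer rank against number of torrents.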

Tracker Site Overlay
- Table size
- Node degree distribution
- Similar to unstructured P2P networks
- Many content search and message routing algorithms
  - Flooding
  - Random walk
  - …
- Trackerless BitTorrent: uses DHT to store the meta file

Simulation Experiments
- Comparison without inter-collaboration and with inter-collaboration
  - Content availability (downloading failure ratio): R_fail vs. 0
  - Performance stability (downloading speed): more stable
  - Service fairness (contribution ratio): more balanced
- Inter-torrent collaboration can improve BitTorrent performance