Published by Adelia Warren. Modified over 9 years ago.
1
Amir Rasti, Reza Rejaie
Dept. of Computer Science, University of Oregon
2
Peer-to-peer systems have become increasingly popular
◦ Millions of simultaneous users
◦ Significant percentage of Internet traffic
BitTorrent is one of the most popular P2P applications
◦ Responsible for 35% of all Internet traffic [Parker05]
BitTorrent is important because of
◦ Its popularity
◦ Its impact on the network
3
Introduction
Scalable one-to-many peer-to-peer file distribution
Overlay: unstructured, random, high degree
Swarming
◦ File is divided into segments
◦ Segments are randomly distributed among peers
– Get the rarest segment first
Contribution
◦ Peers exchange segments and contribute their outgoing bandwidth
◦ Incentive: tit-for-tat
Tracker
◦ Torrent coordinator
◦ Periodic peer status updates
Performance: intuitively depends on
◦ Peer properties (bandwidth, contribution, etc.)
◦ Group properties (population, content availability, churn)
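The "rarest segment first" rule mentioned above can be illustrated with a minimal sketch. The function name `rarest_first`, the peer identifiers, and the tie-breaking rule (lowest segment index) are assumptions made for illustration; real clients typically add randomization and other policies not described in these slides.

```python
from collections import Counter
from typing import Dict, Optional, Set


def rarest_first(have: Set[int], neighbors: Dict[str, Set[int]]) -> Optional[int]:
    """Pick the missing segment held by the fewest neighbors.

    Ties are broken by the lowest segment index (an assumption of this
    sketch). Returns None if no neighbor holds a segment we still need.
    """
    # Count, for every segment we do not have yet, how many neighbors hold it.
    counts = Counter(
        seg
        for segments in neighbors.values()
        for seg in segments
        if seg not in have
    )
    if not counts:
        return None
    return min(counts, key=lambda seg: (counts[seg], seg))
```

With neighbors holding segments {0, 1, 2}, {1, 2} and {1, 3}, a peer that already has segment 0 would request segment 3, since only one neighbor holds it.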
4
Related work
1. Modeling and analytical studies
2. Simulation studies
3. Empirical studies
◦ Capture BitTorrent system properties in operation through measurement (instrumented clients) [Legout06]
◦ Group properties [Izal04]: population, average content availability, ...
◦ No explicit notion of performance
◦ No study of the underlying factors that affect peer performance
Characterization:
◦ Understanding group-level and peer-level properties in a torrent
Analysis:
◦ What are the main factors that affect the performance observed by individual peers?
5
Methodology/Approach
Common approach: instrumented clients
◦ Detailed and flexible
◦ Representative?
Our approach: tracker logs
◦ Coarse granularity (30 min)
◦ Global view
Data sets

Tracker log sets
Tracker source   #Torrents   Start date   End date   #Reports   #Sessions
RedHat           1           3/03         8/03       2M         170k
Debian           1599        2/05         3/05       32M        1268k
Games            2585        8/03         12/04      38M        4416k

Selected torrents
Torrent   File size   #Sessions, rank   Duration
RedHat    1.8GB       170k, 3rd         146d
Debian    677MB       139k, 6th         51d
Games     363MB       195k, 2nd         66d
6
Methodology
Session:
◦ Set of all updates from a particular peer, from its arrival until its departure
Peer-level properties:
◦ Represent the peer's status during a session
◦ Average download rate
◦ Average upload rate
[Figure: session timeline. Labels: session start, download complete, studied zone (leeching), slopes = upload rates, download rates, average download rate]
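The session-level averages above can be sketched as follows. This is a minimal illustration, assuming each tracker update carries a timestamp plus the cumulative `uploaded`, `downloaded` and `left` byte counters that BitTorrent clients report to a tracker; the `Update` record and the `session_rates` helper are hypothetical names, not taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Update:
    """One tracker report from a peer (hypothetical log record)."""
    time: float       # seconds since an arbitrary epoch
    uploaded: int     # cumulative bytes uploaded in this session
    downloaded: int   # cumulative bytes downloaded in this session
    left: int         # bytes still missing; 0 once the download is complete


def session_rates(updates: List[Update]) -> Optional[Tuple[float, float]]:
    """Return (avg download rate, avg upload rate) in bytes/s over the
    leeching part of a session, or None if it cannot be computed."""
    ordered = sorted(updates, key=lambda u: u.time)
    # Keep updates up to and including the first one that reports completion.
    leeching = []
    for u in ordered:
        leeching.append(u)
        if u.left == 0:
            break
    if len(leeching) < 2:
        return None
    first, last = leeching[0], leeching[-1]
    elapsed = last.time - first.time
    if elapsed <= 0:
        return None
    download_rate = (last.downloaded - first.downloaded) / elapsed
    upload_rate = (last.uploaded - first.uploaded) / elapsed
    return download_rate, upload_rate
```

Updates that arrive after the download completes are ignored here, which mirrors the "studied zone (leeching)" label in the figure.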
7
Measurement methodology
Group properties: population, average content availability, churn
Sampling approach:
◦ Once every τ minutes
◦ Use the last update before and the first update after each sample
◦ Interpolation
◦ Averaging across peers
τ determines the sampling resolution
τ > average update interval
Peer view:
◦ Average of the samples during the peer's download time
[Figure: timeline of peer updates with sampling interval τ]
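The sampling approach above can be sketched in a few lines. This is an illustration under stated assumptions: each session is a time-sorted list of (time in minutes, fraction of the file held) pairs, the fraction held stands in for content availability, and the names `interpolate` and `sample_group` are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# One peer's session: (time_in_minutes, fraction_of_file_held) pairs sorted by
# time. This record layout is an assumption made for illustration.
Session = List[Tuple[float, float]]


def interpolate(session: Session, t: float) -> Optional[float]:
    """Linearly interpolate a peer's value at time t from the last update at
    or before t and the first update after t. Returns None if the peer is not
    present at time t."""
    times = [time for time, _ in session]
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(times):  # t coincides with the final update
        return session[-1][1]
    (t0, v0), (t1, v1) = session[i - 1], session[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def sample_group(sessions: Dict[str, Session], tau: float,
                 start: float, end: float) -> List[Tuple[float, int, float]]:
    """Sample every tau minutes and return a list of
    (sample_time, population, average_value_across_present_peers)."""
    samples = []
    steps = int((end - start) // tau)
    for k in range(steps + 1):
        t = start + k * tau
        values = [v for v in (interpolate(s, t) for s in sessions.values())
                  if v is not None]
        population = len(values)
        average = sum(values) / population if population else 0.0
        samples.append((t, population, average))
    return samples
```

A peer-view value would then be the mean of the samples that fall inside that peer's own download time.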
8
Methodology
Is download rate a good performance metric?
◦ A reference is needed to evaluate a peer's download rate
◦ Ideally, peer performance is its utilization of incoming bandwidth
◦ Accurate measurement of utilization is difficult
We use the maximum observed download rate as a (lower bound) estimate of incoming bandwidth.
The standard deviation of the download rate captures its stability
◦ Rates close to the average -> higher performance
◦ Normalization -> comparability
Two performance metrics: utilization and stability
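The two metrics can be sketched as below. The input is assumed to be a list of per-interval download rates derived from consecutive tracker updates. Utilization follows the slide (average rate relative to the maximum observed rate). Dividing the standard deviation by the average rate is an assumed normalization, since the slide does not state which one was used, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def utilization(rates: List[float]) -> float:
    """Average download rate divided by the maximum observed rate, which
    serves as a lower-bound estimate of incoming bandwidth."""
    peak = max(rates)
    return mean(rates) / peak if peak > 0 else 0.0


def normalized_deviation(rates: List[float]) -> float:
    """Standard deviation of the download rate divided by its average
    (assumed normalization); lower values indicate a more stable rate."""
    average = mean(rates)
    return pstdev(rates) / average if average > 0 else 0.0
```

A peer with a constant rate gets a utilization of 1 and a normalized deviation of 0. Because the maximum observed rate can underestimate the true incoming bandwidth, this utilization is an upper-bound estimate.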
9
Characterization Results/Peer-Properties
Similar distribution across 3 different torrents
Utilization has an almost uniform distribution
◦ Nearly fixed probability density
90% show a closely uniform distribution
Diverse performance
No dominant modes
10
Characterization Results/Peer-view of group properties
Content availability
◦ 75% of peers in RedHat (RH) observe an average content availability of 50%
◦ No content shortage
Avg. population
◦ Very different across torrents
◦ Flash crowd in RH
[Figure annotation: initial flash crowd]
11
Underlying factors
Recall the second question
◦ What are the peer- or group-level properties that primarily determine the performance observed by individual peers in a torrent?
Performance metrics:
◦ Utilization and stability
Possible underlying factors:
◦ Group-level properties: population, churn, content availability
◦ Peer-level properties: upload rate, etc.
Approach to identify underlying factors
◦ Scatter plots
◦ Linear regression (using S-Plus)
◦ Spearman's rank correlation (using S-Plus)
12
Statistical Analysis/Scatter-plots
Utilization vs. average group content availability
◦ No obvious correlation
Utilization vs. average group population
◦ Vertical patterns
◦ No obvious correlation
13
Statistical Analysis/Linear Regression
Suggested techniques result in only marginal improvement (R-squared)
No single parameter has a dominant effect
Seed percentage was removed by step(), which suggests the number of seeds is sufficient
Several values to consider:
R-squared measures goodness of fit, in [0, 1]
P-value: the "probability of obtaining a result as impressive" just by chance
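To make the R-squared value concrete, here is a minimal sketch of ordinary least squares with a single predictor. It is only an illustration: the study used S-Plus with several predictors and step(), which this sketch does not reproduce, and the function name `fit_line` is hypothetical. The P-value is not computed here.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = a + b*x.

    Returns (a, b, r_squared), where r_squared = 1 - SS_res / SS_tot is the
    fraction of the variance in y explained by the fitted line.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot
```

An R-squared near 1 means the predictor explains most of the variation in the metric. Values near 0, as the "marginal improvement" above suggests, mean it explains little.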
14
Statistical Analysis/Spearman's Rank correlation
Highest correlation is with the deviation of upload rate for all torrents -> tit-for-tat effect
The two performance metrics are similarly affected, with opposite signs
Games (GA): little correlation with utilization -> unreliable metric
Debian (DE): slightly larger effect from content availability
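Spearman's rank correlation is the Pearson correlation computed on ranks, so it captures any monotonic relationship rather than only a linear one. The sketch below is a plain-Python illustration with average ranks for ties; the helper names `ranks` and `spearman` are hypothetical, and the study itself used S-Plus.

```python
from typing import List


def ranks(values: List[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Applied to a peer-level factor and one of the performance metrics, a value near +1 or -1 indicates a strong monotonic relationship and a value near 0 indicates little relationship.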
15
Conclusions
◦ No single factor determines the performance observed by peers
◦ Outgoing bandwidth seems to have the largest effect -> tit-for-tat is working
◦ There often appears to be a sufficient number of seeds available (not a factor in performance)
◦ Capturing comparable performance is hard
◦ Performance of the peers in a torrent is rather diverse -> instrumented clients cannot reflect a representative picture
Future work
◦ Active monitoring of BitTorrent
◦ BitTorrent overlay topology using the peer exchange feature
◦ Characterizing new features: DHT, super-seeding, peer exchange