Seed Scheduling for Peer-to-Peer Networks
Flavio Esposito, Ibrahim Matta, Pietro Michiardi, Nobuyuki Mitsutake, Damiano Carra
BitTorrent Protocol
The Algorithms
- Peer Selection: Seed = Round Robin; Leecher = Tit-for-Tat + Optimistic
- Piece Selection: Seed = FCFS; Leecher = Local Rarest First (LRF)
The Messages of the Protocol
[Diagram: message exchange between a seed and leechers that start with empty bitfields [-,-]: INTERESTED, UNCHOKE, REQUEST (piece chosen by LRF), PIECE 2, HAVE(2)]
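A minimal sketch of Local Rarest First piece selection on the leecher side, assuming each neighbor's bitfield is known as a set of piece indices. The function name and arguments are hypothetical, and ties are broken by lowest piece index only to keep the example deterministic (real clients typically break ties at random).

```python
from collections import Counter

def local_rarest_first(have, neighbor_bitfields, offered):
    """Pick the offered piece that is rarest among the known neighbors.

    have: set of pieces this leecher already holds
    neighbor_bitfields: list of sets, one per neighbor in the peerset
    offered: set of pieces the uploader can send
    """
    candidates = offered - have
    if not candidates:
        return None
    # Count how many neighbors hold each piece (0 if nobody holds it).
    counts = Counter(p for bitfield in neighbor_bitfields for p in bitfield)
    return min(candidates, key=lambda p: (counts[p], p))

# A seed offers pieces 0..3; piece 3 is held by no neighbor, so it is the rarest.
print(local_rarest_first({0}, [{0, 1}, {1}, {1, 2}], {0, 1, 2, 3}))  # 3
```

Because the counts come only from the leecher's own neighbors, two leechers with disjoint peersets can easily request the same piece from the seed.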
Outline
Problem and Motivation
Seed Scheduling
- Our file sharing client BUtorrent
Analytical results
Experimental Results
Conclusions
Problems and Motivation
Example of Optimal Seed Scheduling:
- First scheduling round: A = [1,-,-,-,-,-], B = [4,-,-,-,-,-], C = [3,-,-,-,-,-]. Pieces uploaded: [1,-,3,4,-,-]
- Second scheduling round: A = [1,2,-,-,-,-], B = [4,5,-,-,-,-], C = [3,6,-,-,-,-]. Pieces uploaded: [1,2,3,4,5,6]
Example of Sub-optimal Seed Scheduling:
- First scheduling round: A = [1,-,-,-,-,-], B = [4,-,-,-,-,-], C = [3,-,-,-,-,-]. Pieces uploaded: [1,-,3,4,-,-]
- Second scheduling round: A = [1,2,-,-,-,-], B = [4,2,-,-,-,-], C = [3,4,-,-,-,-]. Pieces uploaded: [1,2,3,4,-,-]
- Collision at this round: piece 2 is uploaded to both A and B
- Already uploaded in previous rounds: piece 4, now uploaded to C
Def (Initial Phase): time from the first scheduling round to the instant at which all pieces have been uploaded once.
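The two examples can be checked with a short sketch that counts every upload of a piece the seed has already sent, whether earlier in the same round or in a previous round. The function name and the list-of-rounds encoding are assumptions made for illustration.

```python
def count_collisions(rounds):
    """Count uploads of pieces the seed has already uploaded.

    rounds: list of scheduling rounds; each round lists the piece
    sent to each leecher (A, B, C) in that round.
    """
    uploaded = set()
    collisions = 0
    for sent in rounds:
        for piece in sent:
            if piece in uploaded:
                collisions += 1
            uploaded.add(piece)
    return collisions, sorted(uploaded)

optimal = [[1, 4, 3], [2, 5, 6]]
suboptimal = [[1, 4, 3], [2, 2, 4]]
print(count_collisions(optimal))     # (0, [1, 2, 3, 4, 5, 6])
print(count_collisions(suboptimal))  # (2, [1, 2, 3, 4])
```

In the sub-optimal schedule two of the six uploads are wasted, so the initial phase is still open after the second round.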
Do we really have “Collisions”?
Collisions in the initial phase:
Theorem (Bound on Collision): the expected number of collisions with L leechers in the first p rounds is:
Assumptions:
- Leechers unaware of each other
- LRF is not modeled
Proof in the Tech Report, based on:
- the Bonferroni inequality
- and on the following inequality:
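As a rough empirical companion to the theorem (not the bound itself), the sketch below simulates an initial phase under the same two assumptions: leechers are unaware of each other and LRF is not modeled, so each leecher requests a uniformly random missing piece every round. The function name and parameters are hypothetical.

```python
import random

def simulate_initial_phase(num_leechers, num_pieces, rng):
    """Return (collisions, rounds) until every piece was uploaded once."""
    uploaded = set()                                # pieces the seed has sent
    have = [set() for _ in range(num_leechers)]     # per-leecher bitfields
    collisions = 0
    rounds = 0
    while len(uploaded) < num_pieces:
        rounds += 1
        for pieces in have:
            missing = [p for p in range(num_pieces) if p not in pieces]
            piece = rng.choice(missing)             # uniform choice, no LRF
            if piece in uploaded:
                collisions += 1                     # wasted seed upload
            uploaded.add(piece)
            pieces.add(piece)
    return collisions, rounds

rng = random.Random(42)
runs = [simulate_initial_phase(3, 6, rng) for _ in range(1000)]
print(sum(c for c, _ in runs) / len(runs))  # average collisions per initial phase
```

With a single leecher the count is always zero; with more leechers every upload beyond the first copy of each piece is a collision, so collisions = L * rounds - number of pieces.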
Problem and Motivation
[Figure: two round-by-round examples of a seed uploading pieces 1 and 2 of a two-piece file to leechers, with leecher bitfields such as [-,-], [1,-], [-,2], and [1,2]]
Main problem: Unawareness
- leechers disconnected (peers)
- slow HAVE message propagation (pieces)
Problem and Motivation
Summarizing:
1. Wasted uploads during initial phases
2. Increase in downloading time
3. Leechers unaware of each other (topology and propagation) waste the seed's upload capacity
4. Burst arrivals produce the same effect as an initial phase
A = [1,2,3,4,5,6], B = [4,2,1,3,6,5], C = [3,4,2,5,1,6]
Seed Scheduling
Algorithm IDEA: based on Proportional Fair Scheduling (PFS)
Seeds need to help LRF by considering both the past and the current requests:
1. Unchoke every possible leecher
2. Collect all possible requests
3. Upload the pieces i* that are
   - requested the most in this round (max r_i(t))
   - uploaded the least in the past rounds (min t_i(t-1))
Details of the algorithm are in the paper.
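A minimal sketch of step 3, assuming the two criteria are combined as the ratio r_i(t) / t_i(t-1) in the usual PFS style. The slide defers the exact rule to the paper, so the ratio, the small constant eps that avoids division by zero, and the function name are assumptions.

```python
def pfs_select(requests, past_uploads, slots, eps=1e-9):
    """Rank requested pieces by r_i(t) / t_i(t-1) and keep the top `slots`.

    requests: dict piece -> number of requests in this round, r_i(t)
    past_uploads: dict piece -> (smoothed) past upload count, t_i(t-1)
    eps: assumed constant so never-uploaded pieces do not divide by zero
    """
    def score(piece):
        return requests[piece] / (past_uploads.get(piece, 0.0) + eps)
    # Highest score first; ties broken by piece index for determinism.
    ranked = sorted(requests, key=lambda p: (-score(p), p))
    return ranked[:slots]

requests = {1: 3, 2: 3, 5: 1}          # piece 1 and 2 are equally popular
past_uploads = {1: 2.0}                # but piece 1 was already uploaded
print(pfs_select(requests, past_uploads, slots=2))  # [2, 5]
```

Under this assumed scoring, a piece that was never uploaded outranks any piece with upload history, which is what shortens the initial phase.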
Seed Scheduling
History should be forgotten:
- If a seed gets the (N+1)-th request for piece i, it must be a new leecher asking
- After N rounds the seed needs to completely forget
- N = max number of the seed's TCP connections (80-200)
The seed forgets previous scheduling decisions with an Exponential Weighted Moving Average
where:
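A minimal sketch of forgetting with an exponentially weighted moving average, using the textbook form new = (1 - beta) * old + beta * sample. The exact expression and parameter names from the slide are not available here, so beta, the 0/1 sample, and the function name are assumptions.

```python
def ewma_update(history, uploaded_now, beta):
    """Age the per-piece upload history by one scheduling round.

    history: dict piece -> smoothed upload count from earlier rounds
    uploaded_now: set of pieces uploaded in the current round
    beta: assumed smoothing weight in (0, 1]; larger means faster forgetting
    """
    updated = {}
    for piece in set(history) | set(uploaded_now):
        sample = 1.0 if piece in uploaded_now else 0.0
        updated[piece] = (1.0 - beta) * history.get(piece, 0.0) + beta * sample
    return updated

history = ewma_update({}, {2}, beta=0.5)
print(history)                                           # {2: 0.5}
history = ewma_update(history, {5}, beta=0.5)
print(sorted(history.items()))                           # [(2, 0.25), (5, 0.5)]
```

Each round without an upload shrinks a piece's weight by the factor (1 - beta), so old scheduling decisions fade instead of being dropped abruptly.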
Analytical Results
- We consider the Qiu-Srikant model for the evolution of seeds and leechers
- Corrected effectiveness definition for the initial phase(s)
- We show that the distribution follows a Zipf law
- Here LRF and PFS are modeled
Analytical Results
Solving the Qiu-Srikant differential equations numerically with the new effectiveness.
Take-home messages:
1. Even SMALL improvements in effectiveness translate into faster downloading time
2. Higher effectiveness reduces downloading time
3. The improvement is larger for higher arrival rates (more dynamic peerset)
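A minimal sketch of such a numerical solution, using the standard Qiu-Srikant fluid model dx/dt = lam - theta*x - min(c*x, mu*(eta*x + y)), dy/dt = min(c*x, mu*(eta*x + y)) - gamma*y, with x leechers and y seeds. The effectiveness eta is treated as a free constant rather than the new definition from the talk, the parameter values are made up, and forward Euler is just one simple integrator choice.

```python
def average_download_time(eta, lam=1.0, mu=1.0, gamma=2.0, c=10.0,
                          theta=0.0, dt=0.01, steps=10000):
    """Integrate the fluid model to steady state and apply Little's law.

    eta: effectiveness of file sharing, lam: leecher arrival rate,
    mu: upload rate, c: download rate, gamma: seed departure rate,
    theta: leecher abort rate (0 here so that T = x / lam holds).
    """
    x, y = 0.0, 0.0                      # leechers, seeds
    for _ in range(steps):
        served = min(c * x, mu * (eta * x + y))
        dx = lam - theta * x - served
        dy = served - gamma * y
        x += dt * dx
        y += dt * dy
    return x / lam                       # Little's law: T = x_bar / lam

print(round(average_download_time(eta=0.5), 3))  # 1.0
print(round(average_download_time(eta=1.0), 3))  # 0.5
```

With these assumed parameters the steady-state download time is (1/eta) * (1/mu - 1/gamma), so doubling the effectiveness halves the download time.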
Outline
Problems and Motivation
Seed Scheduling
Analytical results
Experimental Results
- Simulations
- PlanetLab
Conclusions
Simulation Results
We compare the average downloading time of PFS with FCFS and GRF.
*Burstiness B := peak rate / average rate of arrival of leechers
Take home:
1. If burstiness B is too low, then with the forgetting parameter set to 0.02, forgetting too fast has the same effect as not remembering (as in BitTorrent)
2. If B is too high, peersets are too small for effective piece exchange
3. Under PFS, up to 25% improvement over BitTorrent
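The burstiness definition can be illustrated with a short sketch, assuming arrival rates are measured as leecher arrivals per equal-length time interval. The function name and the sample counts are made up for illustration.

```python
def burstiness(arrivals_per_interval):
    """B = peak arrival rate / average arrival rate."""
    average = sum(arrivals_per_interval) / len(arrivals_per_interval)
    return max(arrivals_per_interval) / average

print(burstiness([2, 2, 2, 2]))  # 1.0, perfectly smooth arrivals
print(burstiness([1, 1, 1, 5]))  # 2.5, one burst of five leechers
```

B is 1 for constant arrivals and grows as arrivals concentrate in fewer intervals.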
Simulation Results
Seed Utilization* vs File Size
*Seed Utilization := # seed uploads / avg # leecher uploads
Take home:
1. Even for a less dynamic peerset, where download time does not improve much, seed utilization improves under PFS
2. The improvement increases for bigger files
BUtorrent on PlanetLab
[Figure: average downloading time of BUtorrent vs BitTorrent]
Take home:
1. We implemented a new file-sharing client with our seed scheduling algorithm (BUtorrent)
2. Simulation results confirmed on PlanetLab
Conclusions
- Showed the need for smarter scheduling to reduce initial phases and hence downloading time
- Our idea: give seeds a more global view, “Supporting and not Substituting LRF”
- New seed scheduling algorithm implemented in a real client: Boston University BUtorrent
- BUtorrent improves BitTorrent's average downloading time by up to 25%
Thank you!
Download BUtorrent from