Epidemic Dissemination & Efficient Broadcasting in Peer-to-Peer Systems Laurent Massoulié Thomson, Paris Research Lab Based on joint work with: Bruce Hajek, Sujay Sanghavi, Andy Twigg, Christos Gkantsidis and Pablo Rodriguez
2 Context P2P systems for live streaming & Video-on-Demand – PPLive, Sopcast, TVUPlay, Joost, Kontiki… Internet hosts form overlay network – Data exchanges between overlay neighbours – Aim: real time playback at all receivers Soon the main channel for multimedia diffusion?
3 Diffusion of Code Red Virus
4 Logistic curve (Verhulst 1838, Lotka 1925,…) Exponential growth Optimal global infection time: logarithmic in population size
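As a reminder of why the global infection time is logarithmic, the logistic model can be written out in its simplest form. The contact rate \beta and the initial infected fraction x_0 = 1/N are notation assumed here for illustration; they do not appear on the slide.

```latex
\begin{align*}
\frac{dx}{dt} &= \beta\, x(t)\,\bigl(1 - x(t)\bigr), \qquad x(0) = x_0 = \tfrac{1}{N},\\
x(t) &= \frac{1}{1 + \bigl(\tfrac{1}{x_0} - 1\bigr)\, e^{-\beta t}} = \frac{1}{1 + (N-1)\, e^{-\beta t}},\\
x(t_{1/2}) &= \tfrac{1}{2} \iff t_{1/2} = \frac{\ln(N-1)}{\beta}.
\end{align*}
```

While x(t) is small the growth is approximately exponential, x(t) \approx x_0 e^{\beta t}, and the time to infect half of the population grows only like \ln N.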
5 Epidemics for live streaming diffusion [Figure: data packets 1 to 4 spreading between overlay nodes] Mechanism specification: selection rule for the target node and for the packet to transmit. Epidemics (one per packet) competing for resources
6 Problem statement Currently deployed systems rely on epidemic approach Appeal of simple & decentralised schemes – Large user populations (10^3 – 10^6) – High churn (nodes join and leave) “Cost of decentralisation”? i.e., can epidemics make efficient use of communication resources? Metrics: rate and delay
7 Outline Delay-optimal schemes [S. Sanghavi, B. Hajek, LM] Rate-optimal schemes [LM, C. Gkantsidis, P. Rodriguez and A. Twigg] Outlook
8 The access constraint scenario Scarce resource: access capacity Models DSL / Cable uplink bandwidth limitations Normalised: 1 packet / second Bounds on optimal performance: Throughput = N / (N-1) ≈ 1 (pkt / second), Delay = log_2(N), where N: number of nodes
9 Challenge Naïve approach: random target, first useful packet [Figure: sender’s packets vs. receiver’s packets, with the 1st useful packet marked; plot of fraction of nodes reached vs. time] Tension between timeliness of delivery and diversity
10 The “random target / latest packet” policy [Figure: sender’s packets vs. receiver’s packets, with the latest packet marked; plot of fraction of nodes reached vs. time]
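The two bounds follow from short counting arguments, sketched here under the assumption that all N nodes (source included) have unit upload capacity. The symbols \lambda (broadcast rate) and n(t) (number of nodes holding a given packet t seconds after its creation) are notation introduced for this sketch.

```latex
\begin{align*}
\text{Throughput: } & \lambda\,(N-1) \le N \cdot 1 \;\Longrightarrow\; \lambda \le \frac{N}{N-1} \approx 1,\\
\text{Delay: } & n(t+1) \le 2\,n(t),\quad n(0) = 1 \;\Longrightarrow\; n(t) \le 2^{t} \;\Longrightarrow\; t \ge \log_2 N .
\end{align*}
```

The first line says that every packet must be uploaded N-1 times out of a total upload budget of N packets per second; the second says that the set of holders can at most double each second.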
11 The “random target / latest packet” policy Main result: each node receives each packet w.p. 1 − 1/e ≈ 63%, with optimal delay (less than log_2(N)), independently for distinct packets. Diffusion at rate 63% of optimal and with optimal delay is therefore feasible (do source coding at the source over consecutive data windows)
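The 63% figure can be checked empirically with a small simulation. The sketch below is a simplified, slotted version of the policy, not the authors' model: time advances in synchronous rounds, node 0 acts as the source and creates one packet per round, and every node holding at least one packet pushes its latest (highest-label) packet to a uniformly random other node. The function names and parameters are hypothetical.

```python
import random


def simulate_latest_packet(n_nodes=500, n_rounds=80, seed=0):
    """Slotted 'random target / latest packet' push; node 0 is the source."""
    rng = random.Random(seed)
    received = [set() for _ in range(n_nodes)]  # packet labels held by each node
    latest = [-1] * n_nodes                     # highest label held, -1 if none
    for t in range(n_rounds):
        # The source creates a fresh packet labelled t at the start of round t.
        received[0].add(t)
        latest[0] = t
        # Every node holding something sends its latest packet to a random other node.
        transmissions = []
        for i in range(n_nodes):
            if latest[i] >= 0:
                target = rng.randrange(n_nodes - 1)
                if target >= i:
                    target += 1  # skip the sender itself
                transmissions.append((target, latest[i]))
        # Deliveries take effect only after all sends of the round are decided.
        for target, label in transmissions:
            received[target].add(label)
            if label > latest[target]:
                latest[target] = label
    return received


def delivery_fractions(received, n_rounds, warmup, cooldown):
    """Fraction of nodes holding each packet, ignoring start-up and tail packets."""
    n = len(received)
    return [
        sum(1 for pkts in received if label in pkts) / n
        for label in range(warmup, n_rounds - cooldown)
    ]


if __name__ == "__main__":
    rec = simulate_latest_packet()
    fracs = delivery_fractions(rec, n_rounds=80, warmup=20, cooldown=30)
    print(sum(fracs) / len(fracs))
```

Under these assumptions the printed average should come out close to 1 − 1/e; the warm-up and cool-down windows exclude packets whose diffusion is cut off by the start or the end of the run.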
12 Proof idea [Figure: fraction of nodes vs. time, with one curve for nodes that have a packet with label ≥ t and one for nodes that have a packet with label ≥ t+1, crossing times t and t+1] Number of transmission attempts for packet t: N × (area between curves) = N × 1. Number of nodes receiving t: [expression not recovered]. Same dynamics as single epidemic diffusion: translated logistic curve
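The step from "N transmission attempts" to the 1 − 1/e delivery probability can be sketched as follows, assuming (as an approximation) that the N attempts for packet t go to independent, uniformly chosen targets among N nodes.

```latex
\[
\Pr(\text{a given node misses packet } t) \approx \Bigl(1 - \frac{1}{N}\Bigr)^{N} \xrightarrow[N\to\infty]{} e^{-1},
\qquad
\Pr(\text{a given node receives packet } t) \approx 1 - \frac{1}{e} \approx 0.63 .
\]
```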
13 Outline Delay-optimal schemes [S. Sanghavi, B. Hajek, LM] Rate-optimal schemes [LM, C. Gkantsidis, P. Rodriguez and A. Twigg] Outlook
14 Access constraints scenario Network assumptions: – access capacities, c_i – Everyone can send to everyone (complete communication graph) Statistical assumptions: – source creates fresh packets at instants of Poisson process with rate λ – Packet transmission time from node i: Exponential r.v. with mean 1/c_i Optimal broadcast rate:
15 The “Most deprived neighbour / random useful packet” policy [Figure: sender’s packets compared with those of potential receiver 1 and potential receiver 2] Source policy: sends “fresh” packets if any (fresh = not sent yet to anyone)
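One way to read the policy is sketched below. It assumes that "most deprived" means the neighbour lacking the largest number of the sender's packets, and that a "useful" packet is one the sender holds and the chosen neighbour does not; both readings, along with the function name `choose_transfer` and its arguments, are assumptions made for illustration.

```python
import random


def choose_transfer(sender_packets, neighbour_packets, rng=random):
    """Pick (target, packet) for one transmission, or None if nothing is useful.

    sender_packets: set of packet labels held by the sender.
    neighbour_packets: dict mapping neighbour id -> set of labels it holds.
    """
    best, best_useful = None, set()
    for neighbour, held in neighbour_packets.items():
        useful = sender_packets - held  # packets the sender could usefully send
        if len(useful) > len(best_useful):
            best, best_useful = neighbour, useful
    if best is None:
        return None  # no neighbour is missing anything the sender holds
    # Uniformly random choice among the useful packets (sorted for reproducibility).
    return best, rng.choice(sorted(best_useful))


if __name__ == "__main__":
    sender = {1, 2, 3, 4, 5}
    neighbours = {"receiver1": {1, 2, 3}, "receiver2": {1}}
    print(choose_transfer(sender, neighbours, random.Random(0)))
```

In the usage example the second receiver lacks four of the sender's packets against two for the first, so it is selected, and the packet sent is drawn at random from the four it is missing.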
16 Main result Provided λ < λ*, Markov process describing system state is ergodic. Hence all packets are received at all nodes after time bounded in probability Proof: identifies “workload” as Lyapunov function for fluid dynamics of Markov process Open questions: Magnitude of delays (simulations suggest logarithmic) Extension to general, not complete graphs
17 Extension to limited neighborhoods Each node maintains shortlist of neighbours Sends to most deprived from neighbour set Periodically adds randomly chosen neighbour, and dumps least deprived Neighbourhood size stays fixed Ergodicity result still holds: fluid dynamics unchanged Q: impact of neighborhood size?
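The neighbourhood refresh step could look roughly like this. The sketch assumes that deprivation is measured as the number of the node's packets a neighbour lacks, and that the newly added neighbour is itself eligible for eviction; the slide does not specify either point, and `refresh_neighbourhood` is a hypothetical name.

```python
import random


def refresh_neighbourhood(node, neighbours, packets_of, rng=random):
    """Add one random outsider, then drop the least deprived neighbour.

    node: id of the node maintaining the shortlist.
    neighbours: iterable of current neighbour ids.
    packets_of: dict mapping every node id -> set of packet labels it holds.
    Returns the new neighbour set, of the same size as before.
    """
    current = set(neighbours)
    outsiders = sorted(v for v in packets_of if v != node and v not in current)
    if not outsiders:
        return current  # nobody left to add, keep the shortlist as is
    updated = current | {rng.choice(outsiders)}

    def deprivation(v):
        # Number of this node's packets that neighbour v is still missing.
        return len(packets_of[node] - packets_of[v])

    updated.remove(min(sorted(updated), key=deprivation))
    return updated


if __name__ == "__main__":
    packets = {"u": {1, 2, 3, 4}, "a": {1, 2, 3, 4}, "b": {1}, "c": set(), "d": {1, 2}}
    print(refresh_neighbourhood("u", {"a", "b"}, packets, random.Random(0)))
```

In the usage example neighbour "a" already holds everything "u" has, so it is the one evicted whichever outsider is drawn.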
18 Network constraints Graph connecting nodes Capacities assigned to edges Achievable broadcast rate [Edmonds, 73]: Equals maximal number of edge-disjoint spanning trees that can be packed in graph Coincides with minimal max-flow ( = min-cut) between source and arbitrary receiver
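The min-cut characterisation can be evaluated directly on small examples. The sketch below assumes a directed graph given as a dict of edge capacities and uses a plain breadth-first augmenting-path max-flow (Edmonds-Karp), chosen only for brevity; `max_flow` and `broadcast_rate` are illustrative names, and every node is assumed to appear in at least one edge.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Max-flow from source to sink; capacity maps (u, v) -> edge capacity."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)  # reverse edge for cancelling flow
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def broadcast_rate(capacity, source):
    """Minimum over receivers of the max-flow from the source."""
    nodes = {n for edge in capacity for n in edge}
    return min(max_flow(capacity, source, r) for r in nodes if r != source)


if __name__ == "__main__":
    caps = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1, ("b", "a"): 1}
    print(broadcast_rate(caps, "s"))
```

In the usage example each receiver can be reached by two units of flow, matching the two edge-disjoint spanning trees s→a→b and s→b→a.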
19 Random useful packet selection and Edmonds’ theorem Based on local information No explicit construction of spanning trees Main result: when injection rate λ is strictly feasible, Markov process is ergodic
20 Proof idea [Figure: original network with source s and nodes 1, 2, 3; induced network whose vertices are the node sets {s}, {s,1}, {s,2}, {s,1,2}, {s,1,3}, {s,2,3}, {s,1,2,3}, with injection at rate λ] Variables x_A: number of packets present exactly at nodes in set A. Fluid renormalisation: the x_A obey deterministic dynamics. Convergence to zero of fluid trajectories: shown by using Lyapunov function
21 Comments Provides “analytical” proof of Edmonds’ theorem Delays?
22 Conclusions Epidemic diffusion – Straightforward implementation – Efficient use of bandwidth resources Random & local decisions lead to global optimum
23 Outlook Open problems – Schemes both delay- and rate- optimal? – Concurrent stream diffusions? – Stability proofs without the Lyapunov function?