1 Coupon Replication Systems
Laurent Massoulié & Milan Vojnović
Microsoft Research, Cambridge, UK
2 System
- we look at file dissemination by file swarming
  - file sliced into chunks (we call them coupons)
  - user granted an initial coupon by the server
  - other coupons collected by replication between users
- scalability
  - slicing into K chunks reduces server load to a fraction 1/K
- ex. BitTorrent
  - partial view
  - greedy: top 4 uploaders
  - random search: optimistic unchoke
  - approx. direct reciprocity of exchanged bits
3 Scale
- large number of distinct coupons
  - ex. movie files: 2 GB file length / ¼ MB coupon length = 8000 coupons
  - same for some software binaries (ex. Red Hat Linux)
- large number of concurrent users
  - for popular files ~ 1000 [The Lord of the Rings, Pouwelse et al 2005]
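The coupon count above can be checked with a short calculation. This is a minimal sketch; the helper name `coupon_count` is hypothetical, and the slide's figure of 8000 assumes decimal units (1 GB = 1000 MB). With binary units the same file would give 8192 coupons.

```python
def coupon_count(file_bytes: int, coupon_bytes: int) -> int:
    """Number of coupons needed to cover a file (the last one may be partial)."""
    return -(-file_bytes // coupon_bytes)  # ceiling division on integers


MB = 10**6  # decimal megabyte (assumption matching the slide's arithmetic)

decimal_count = coupon_count(2000 * MB, MB // 4)  # 2 GB file, 1/4 MB coupons
binary_count = coupon_count(2**31, 2**18)         # 2 GiB file, 256 KiB coupons
print(decimal_count, binary_count)
```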
4 Related work
- a few theoretical studies
  - Yang & de Veciana (Infocom04): service capacity & scaling
  - Qiu & Srikant (Sigcomm04): macroscopic population dynamics
- some empirical work
  - Izal et al (PAM04)
  - Pouwelse et al (IPTPS05)
5 Our work
- model of probabilistic replication
  - we call it a coupon replication system
  - population dynamics model
- system performance captured by
  - mean file download time
  - closed system: leftover users with incomplete collections
- sheds light on how critical are:
  - replication strategy (who to peer with? which coupon to replicate?)
  - user altruism (users offer coupons after having collected all distinct coupons)
6 Outline
- open system
  - two peering strategies
  - How long does it take to download a file?
  - by-products: stability results
- closed system
  - no new user arrivals (flashcrowd end-phase)
  - How many users are left with an incomplete collection?
- conclusion
7 Open system: assumptions
- peering strategy
  - LAYER: peer with a random user having the same number of coupons
  - FLAT: peer with a random user
- RANDOM PULL
  - the instigating user copies a random coupon of interest from the encountered user
- throughout, users are assumed non-altruistic
  - after completing their coupon collection, they offer no coupons
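The two peering strategies and the RANDOM PULL rule can be sketched as follows. This is only an illustration under stated assumptions: users are represented as Python sets of coupon indices, the function names `pick_peer` and `random_pull` are hypothetical, and users with complete collections are assumed to have already been removed from the list (non-altruism).

```python
import random


def pick_peer(users, me, strategy, rng):
    """Return the index of the peer contacted by user `me`, or None.

    users: list of sets of coupon indices (assumed representation).
    LAYER: uniform choice among other users holding the same number of coupons.
    FLAT:  uniform choice among all other users.
    """
    if strategy == "LAYER":
        size = len(users[me])
        candidates = [j for j, c in enumerate(users) if j != me and len(c) == size]
    elif strategy == "FLAT":
        candidates = [j for j in range(len(users)) if j != me]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return rng.choice(candidates) if candidates else None


def random_pull(mine, theirs, rng):
    """Copy one uniformly random coupon that the peer holds and the instigator lacks."""
    useful = sorted(theirs - mine)  # coupons of interest to the instigator
    if not useful:
        return None                 # encounter brings nothing new
    coupon = rng.choice(useful)
    mine.add(coupon)                # only the instigator's collection grows
    return coupon
```

Note that only the instigator gains a coupon in an encounter, which matches the pull rule on the slide; the peer's collection is left unchanged.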
8 Model
- X_c(t) = number of users with coupon collection c at time t
- each user initiates encounters at the instants of a rate-1 Poisson process, Poi(1)
- X is a Markov process with transitions:
  - a user arrives with coupon collection c (fixed arrival rate of users with collection c)
  - a user with collection c enlarges its collection with coupon i
- Prob(c encounters s) is defined to capture either LAYER or FLAT
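The Poi(1) encounter instants of a single user can be sampled by summing independent exponential gaps. This is a minimal sketch; the function name `encounter_times` and the `horizon` parameter are assumptions for illustration and are not part of the model on the slide.

```python
import random


def encounter_times(horizon, rng, rate=1.0):
    """Sample the instants of a Poisson process of the given rate on [0, horizon]."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential gap between consecutive encounters
        if t > horizon:
            return times
        times.append(t)


rng = random.Random(1)
times = encounter_times(10.0, rng)
print(len(times))  # on average about 10 encounters over a horizon of 10
```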
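9 Large population limit
- scaled process X^N: [definition not recovered]
- Kurtz: X^N / N converges uniformly on finite intervals to x
- x is the object of our study
  - [limit equation not recovered; its terms were labeled "arrival rate" and "departure rate"]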
10 LAYER
- number of users in layer i = [expression not recovered]
- sojourn time in layer i = [expression not recovered]
- general convergence results (not on slides; see paper)
- Result: mean file download time = K + O(1)
  - asymptotically optimal as K tends to ∞
  - (Little's law)
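The reference to Little's law can be spelled out as a short worked relation. This is a sketch under assumptions not stated on the slide: \lambda denotes the total user arrival rate, x_i the steady-state number of users in layer i, and T_i the sojourn time in layer i, and every user is assumed to arrive with one coupon and gain one coupon per successful pull, so that each of the layers 1, …, K-1 sees the same throughput \lambda.

```latex
% Little's law applied layer by layer (assumed symbols: \lambda, x_i, T_i)
\[
  T_i = \frac{x_i}{\lambda},
  \qquad
  T = \sum_{i=1}^{K-1} T_i = \frac{1}{\lambda} \sum_{i=1}^{K-1} x_i .
\]
```

Under these assumptions the mean file download time T follows directly from the per-layer populations.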
11 FLAT
- analysis more difficult than for LAYER
- results under symmetric arrival rates and initial value
- Result:
  - (i) [expression not recovered]
  - (ii) mean file download time = [expression not recovered]
  - asymptotically optimal as K tends to ∞, same as for LAYER
- T_i = sojourn time in layer i
12 Mean file download time for LAYER & FLAT
[Figure: mean file download time / (K-1) versus K, curves for FLAT and LAYER, with the optimum marked]
13 Sojourn times per layer for LAYER & FLAT
[Figure: sojourn time in layer k versus k, curves for FLAT and LAYER]
14 Closed system
- Problem
  - a closed population of users
  - given initial coupon collections over users
  - How many users are left with an incomplete collection?
- models flashcrowd end-phase
  - leftover users impose workload on the server
- partial result: last-missing coupon
  - each user initially has all but 1 coupon
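The last-missing-coupon setting can be explored with a small simulation. This is a sketch under assumptions that go beyond the slide: users are grouped by the single coupon they miss, the instigator and the peer are drawn uniformly at random, a user completes and departs (non-altruism) as soon as it meets a peer of a different type, and the run stops once all remaining users miss the same coupon. The function name `leftover_users` is hypothetical.

```python
import random


def leftover_users(counts, rng):
    """Simulate the end-phase and return the number of users left incomplete.

    counts[i] = initial number of users missing only coupon i (assumed encoding).
    """
    counts = list(counts)
    total = sum(counts)
    types = range(len(counts))
    # Continue while at least two distinct missing-coupon types remain.
    while total > 0 and max(counts) < total:
        i = rng.choices(types, weights=counts)[0]   # instigator's type
        others = counts.copy()
        others[i] -= 1                              # the instigator cannot peer with itself
        j = rng.choices(types, weights=others)[0]   # peer's type
        if j != i:
            # The peer holds coupon i, so the instigator completes and departs.
            counts[i] -= 1
            total -= 1
    return total


rng = random.Random(0)
print(leftover_users([50, 50, 50, 50], rng))
```

Since all users hold K-1 coupons in this setting, LAYER and FLAT peering coincide, which is why the sketch does not distinguish them.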
15 Phase transition
- Result: for initial point (x_1, x_2, …, x_K), the limit point is [expression not recovered]
- Result: assume X_1 = N, X_2 + … + X_K = (K-1)N, K ~ a log(N); then, if [condition not recovered], leftover users = [expression not recovered]
16 Conclusion
- good news on file swarming
- open system
  - for both LAYER & FLAT, mean file download time is asymptotically optimal for large coupon collections
- last-missing coupon
  - for K / log(N) sufficiently large, number of leftover users ~ log(log(N))
17 Outlook
- results suggest replication strategy & user altruism are not critical
- open:
  - stability of FLAT
  - beyond last-missing coupon
  - heterogeneous download/upload capacities
  - topology effects: beyond random mixing