Friendships that last: Peer lifespan and its role in P2P protocols
Fabián E. Bustamante & Yi Qiao
Department of Computer Science, Northwestern University
{fabianb,yqiao}@cs.northwestern.edu
www.aqualab.cs.northwestern.edu

P2P and heterogeneity
- P2P computing: sharing of computer resources and services by direct exchange between participants
- Purest form: all peers are equal
- Problem: a clash between assumption and reality
  - Peer populations show high variation in storage, bandwidth, latency, degree of sharing, uptime, …
  - Peer populations are also transient

Transient peers and P2P systems
- Peers define an overlay network
  - Set of connections to other peers (their “friends”)
  - Maintenance protocol that repairs the overlay
- Degree of peer transiency
  - Median uptime ~70 minutes
- Implications
  - Maintenance-related messages
  - Plus degree of replication, effectiveness of caches, spread of queries, overall system scalability, …

Our approach
- Part of the problem is whom one befriends
- One solution: pick those that will live/stay long
- Without knowing the future, can we predict it?
  - Yes; peer lifespan follows a Pareto distribution!
- Given a good prediction, how should it be used in P2P protocols? Can it really help?

Determining lifespan distribution
- Measured in Gnutella, using a modified client, between March 1 and 8, 2003
- Some details:
  - Probe a peer by attempting a Gnutella connection setup
  - 20 monitoring peers for fine probe granularity
  - Peers found for the first time are recorded only with their Time-When-Found
  - A peer is considered dead when
    - a connection attempt fails for the 3rd time
    - an unexpected response is received

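The bookkeeping behind these rules can be sketched in a few lines. This is only an illustration, not the modified client used in the study: the class names, the probe outcome labels ('ok', 'failed', 'unexpected'), the choice to count consecutive failures, and the choice to stamp the death at the time of the last probe are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MAX_FAILURES = 3  # from the slide: dead when the connection attempt fails a 3rd time


@dataclass
class PeerRecord:
    time_when_found: float                 # first time a monitor saw the peer alive
    failures: int = 0                      # consecutive failed attempts (assumption)
    time_of_death: Optional[float] = None  # set once the peer is declared dead

    @property
    def lifespan(self) -> Optional[float]:
        """Observed lifespan in seconds, or None while the peer is still alive."""
        if self.time_of_death is None:
            return None
        return self.time_of_death - self.time_when_found


class LifespanMonitor:
    """Hypothetical tracker applying the 'dead' rules listed on the slide."""

    def __init__(self) -> None:
        self.peers: Dict[str, PeerRecord] = {}

    def record_probe(self, peer_id: str, now: float, outcome: str) -> None:
        """Record one probe; outcome is 'ok', 'failed' or 'unexpected'."""
        rec = self.peers.get(peer_id)
        if rec is None:
            # A peer is only recorded once it has been seen alive.
            if outcome == "ok":
                self.peers[peer_id] = PeerRecord(time_when_found=now)
            return
        if rec.time_of_death is not None:
            return  # already declared dead; later probes are ignored
        if outcome == "ok":
            rec.failures = 0
        elif outcome == "unexpected":
            rec.time_of_death = now
        elif outcome == "failed":
            rec.failures += 1
            if rec.failures >= MAX_FAILURES:
                rec.time_of_death = now
        else:
            raise ValueError(f"unknown probe outcome: {outcome!r}")
```

Stamping the death at the last probe slightly overestimates lifespans; using the time of the first failed probe would be an equally defensible choice.
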
Peer lifespan distribution
- 500,000 peers, ~1 million peer lifespans
- Create-based method to handle the limited scope of the sample
- Figures show the RCDF of peers with lifespan in [1,300 sec, 3.5 days]
- Pareto distribution of the form $\lambda T^{k}$ ($k < 0$)

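One simple way to check a sample of lifespans against this form is to compute the empirical RCDF and fit a straight line in log-log space, where the slope estimates k and the intercept estimates log λ. The slides do not say how the fit was obtained, so the least-squares approach and the function names below are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def empirical_rcdf(lifespans: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (t, fraction of samples with lifespan >= t) for each distinct t."""
    data = sorted(lifespans)
    n = len(data)
    points = []
    for i, t in enumerate(data):
        if i > 0 and t == data[i - 1]:
            continue  # keep only the first occurrence of each distinct value
        points.append((t, (n - i) / n))
    return points


def fit_power_law(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of log(p) = log(lam) + k * log(t); returns (lam, k).

    Needs at least two distinct positive t values and positive p values.
    """
    xs = [math.log(t) for t, _ in points]
    ys = [math.log(p) for _, p in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    lam = math.exp(mean_y - k * mean_x)
    return lam, k
```

Usage: `lam, k = fit_power_law(empirical_rcdf(lifespans))`, after restricting `lifespans` to the range of interest. A log-log regression is easy to read but gives every plotted point equal weight; a maximum-likelihood estimate is a common alternative.
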
Peer lifespan and P2P protocols
- Choosing among “acquaintances”:
  - When deciding whom to befriend
  - When responding to requests for references
- In most P2P protocols: random selection
- Peer lifespan fits a Pareto distribution
  - Pareto distributions ∈ UBNE class (Used Better than New in Expectation)
  - A peer’s expected remaining lifetime is directly proportional to its current age

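The proportionality claim can be checked with a short derivation. It assumes the standard Pareto parameterization with scale $t_m$ and shape $\alpha > 1$; these symbols are not on the slides. In the slides' notation, where the RCDF has the form $\lambda T^{k}$, $\alpha = -k$, so a finite expectation requires $k < -1$.

```latex
\begin{align*}
P(T > t) &= \left(\frac{t_m}{t}\right)^{\alpha},
  \qquad t \ge t_m,\ \alpha > 1 \\
P(T > s \mid T > t) &= \frac{P(T > s)}{P(T > t)} = \left(\frac{t}{s}\right)^{\alpha},
  \qquad s \ge t \ge t_m \\
E[\,T - t \mid T > t\,] &= \int_{t}^{\infty} \left(\frac{t}{s}\right)^{\alpha} ds
  = t^{\alpha} \cdot \frac{t^{1-\alpha}}{\alpha - 1}
  = \frac{t}{\alpha - 1}
\end{align*}
```

The expected remaining lifetime grows linearly with the current age $t$, which is why an older peer is a better bet than a newer one.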
Some of the questions …
- How could we incorporate lifespan-based ideas into P2P systems?
- Potential gains in reduced maintenance overhead
- Effects on application performance
- …

Lifespan-based protocols
- Increased dependency as commitment to the community becomes clear

Protocol    Connect?    Recommend?
LSPAN-1     Oldest      Random
LSPAN-2
LSPAN-3     Oldest & more available connections

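The selection rules named in the table can be sketched as small functions over a list of known peers. The `Candidate` type, its `free_slots` field, and the function names are hypothetical, and reading "oldest & more available connections" as "oldest among peers that can still accept a connection" is only one possible interpretation. The only pairing taken from the table is the LSPAN-1 row: oldest for connecting, random for recommending.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    peer_id: str
    age: float           # seconds the peer has been up so far
    free_slots: int = 0  # hypothetical: connections the peer can still accept


def pick_random(cands: Sequence[Candidate], rng: random.Random) -> Optional[Candidate]:
    """Random selection, the default in most P2P protocols."""
    return rng.choice(list(cands)) if cands else None


def pick_oldest(cands: Sequence[Candidate]) -> Optional[Candidate]:
    """Prefer the peer with the largest current age."""
    return max(cands, key=lambda c: c.age, default=None)


def pick_oldest_with_capacity(cands: Sequence[Candidate]) -> Optional[Candidate]:
    """Oldest peer that still has free slots; fall back to the oldest overall."""
    with_capacity = [c for c in cands if c.free_slots > 0]
    return pick_oldest(with_capacity or cands)
```

Under this sketch, an LSPAN-1 style peer would call `pick_oldest` when opening a connection and `pick_random` when answering a request for references.
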
Experimental setup
- Trace-driven simulation: the P2P simulator includes membership management and various query distribution, cache and replication strategies
- Runs of one of the 20 collected traces for a period of 510,000 sec., ~36,577 peers
- Cold start; warm-up of ~80,000 sec. excluded
- ~1,000 peers under stable conditions
- Newer results were obtained using 4 traces (instead of 1)

Alternative protocols compared
- Unstructured Decentralized Protocol (UDP) ~ early Gnutella
  - Separate pools for cached pongs (per connection)
  - Pong replies include a random set of entries from the cache
- Hierarchical Decentralized Protocol (HDP) ~ new Gnutella, KaZaa
  - Leaf- and ultra-peers: leaves can only connect to ultras; ultras to anybody
  - To decide a peer’s role: trace information

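The pong caching described for the unstructured protocol can be sketched as one bounded pool per connection plus a reply that samples at random across the pools. This is an illustrative sketch, not the simulator's code or the Gnutella specification: the `PongCache` class, the pool size, and the oldest-first eviction are assumptions.

```python
import random
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional


class PongCache:
    """Hypothetical per-connection pong cache with random replies."""

    def __init__(self, pool_size: int = 10, rng: Optional[random.Random] = None) -> None:
        self.rng = rng or random.Random()
        # One bounded pool per connection; the oldest entry is evicted when full.
        self.pools: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=pool_size)
        )

    def add_pong(self, connection_id: str, peer_addr: str) -> None:
        """Cache a pong that arrived over the given connection."""
        pool = self.pools[connection_id]
        if peer_addr not in pool:
            pool.append(peer_addr)

    def reply(self, count: int) -> List[str]:
        """Return up to `count` distinct cached addresses chosen at random."""
        entries = sorted({addr for pool in self.pools.values() for addr in pool})
        return self.rng.sample(entries, min(count, len(entries)))
```

A lifespan-based variant would replace the random sample in `reply` with a choice based on the age of the cached peers.
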
Comparing connection breakdowns
- Indicator of stability
- Lifespan-based protocols: more selective → fewer breakdowns
- Reductions
  - 42-43% for LSPAN-2
  - 26-30% for LSPAN-1 and LSPAN-3
- Saw-tooth shape → time-of-day patterns

Comparing connection rejections
- Does a preference for long-lived peers have to mean high rejection rates?
  - True for LSPAN-2, although it may be a reasonable “cost”
  - Still, for LSPAN-1 and LSPAN-3 the rate is low enough to be ignored
  - LSPAN-3: ~1 rejection per 17.58 hrs!

Comparing number of connections
- … not just rejections, what about number of connections?
- LSPAN-1 and LSPAN-3: higher ratio of connections per peer
- Little benefit from checking available connections

A preview: Effects on applications
- Gains in scalability
- With random walkers & NCU (Neighboring Caching)
- Lifespan-based topology: 5 walkers; random topology: 16 walkers

Conclusions and future work
- Peer lifespan fits a Pareto distribution: current age can be used to predict lifespan
- Illustrative lifespan-based protocols
- Advantages of considering peers’ age in P2P protocols
- Possible research paths
  - Effect on query distribution and cache strategies
  - Lifespan-based strategies
  - Determining a peer’s age in decentralized P2P systems
  - Lifespan and DHTs