1 Maximizing Remote Work in Flooding-based P2P Systems Qixiang Sun Neil Daswani Hector Garcia-Molina Stanford University
2 P2P File Sharing Gnutella, KaZaA, etc. Internet
3 Architecture Super-node network with flooding-based search Search Query
4 Problem Accept new queries from local clients. Handle remote queries from other super-nodes. Where is the balance?
5 Problem (2) Objective: remote work, i.e., process as many queries from other nodes as possible.
6 Problem (3) Plot: remote work done vs. number of new queries injected. Where is the optimum?
7 Simple Model Super-nodes operate in rounds Capacity C Accepts new queries from local clients Handles remote queries
8 When Overloaded Choose queries with the highest TTL first Ties can be broken randomly Has a steady state and is optimal in remote work ?
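The overload rule on this slide can be sketched in a few lines. This is only an illustration under assumptions: queries are represented as hypothetical `(query_id, ttl)` tuples, capacity is a whole number of queries per round, and the function name `select_queries` is invented here rather than taken from the talk.

```python
import random

def select_queries(pending, capacity, rng=None):
    """Pick at most `capacity` queries: highest TTL first, ties broken randomly.

    `pending` is a list of (query_id, ttl) tuples (a hypothetical representation).
    """
    rng = rng or random.Random()
    candidates = list(pending)
    rng.shuffle(candidates)  # random order first, so equal-TTL ties end up random
    # Python's sort is stable, so the shuffled order survives among equal TTLs.
    candidates.sort(key=lambda query: query[1], reverse=True)
    return candidates[:capacity]

pending = [("q1", 1), ("q2", 3), ("q3", 2), ("q4", 3), ("q5", 1)]
chosen = select_queries(pending, capacity=3, rng=random.Random(0))
print(chosen)  # both TTL-3 queries (in random order), then the TTL-2 query
```

Shuffling before a stable sort is one simple way to get random tie-breaking without a custom comparator.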
9 Example 3 super-nodes (A, B, C) with TTL = 1. Injection rate = ? Capacity split: 1/3 local, 1/3 neighbor 1, 1/3 neighbor 2
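A small sweep can make the balance in this example concrete. The sketch below assumes the three super-nodes are fully connected, that capacity is normalized to C = 1, and that a node first spends capacity on its own new queries and uses what is left for remote queries. The function name `remote_work_triangle` and the grid of candidate rates are illustrative choices, not part of the talk.

```python
from fractions import Fraction

C = Fraction(1)  # capacity per round, normalized (assumption)

def remote_work_triangle(rate):
    """Total remote work for 3 fully connected super-nodes with TTL = 1."""
    # Each node injects `rate` locally and receives `rate` from each of 2 neighbors.
    per_node = min(C - rate, 2 * rate)
    return 3 * per_node

# Try identical injection rates 0, 1/12, ..., 1 and keep the best one.
candidates = [Fraction(k, 12) for k in range(13)]
best_rate = max(candidates, key=remote_work_triangle)
print(best_rate, remote_work_triangle(best_rate))
```

Under these assumptions, lower rates leave capacity idle and higher rates crowd out remote queries, so the sweep picks the rate where spare capacity equals incoming remote load.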
10 Example (2) = ? super-nodes with TTL = 1
11 Solution 6 super-nodes with TTL = 1 24
12 Solution (2) 6 super-nodes with TTL = { 2, 2, 3, 3, 4, 4 } 2 47 = 1313
13 Another Example 5 super-nodes with TTL = = 7 > 5 { 3, 4, 4, 5, 5 } = 1414
14 Intuition Injection rate = 1/6 Unsaturated / Saturated
15 Intuition (2) Injection rate = 1/7 Unsaturated / Saturated Loss = Gain?
16 Different injection rates Each super-node could use a different injection rate. More work done in the network! Spare capacity
17 Example Star topology with TTL = 1. Identical injection rate = 0.5: remote work = 3.5 C. Different injection rates: remote work = 6 C
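The two totals on this slide can be reproduced with a simple steady-state calculation. The sketch assumes a star with one hub and six leaves (the leaf count is inferred from the 3.5 C and 6 C figures, not stated on the slide), C = 1, and that each node spends capacity on its own new queries first. The "different" assignment used here (hub injects at rate 1, leaves inject nothing) is one assignment that reaches 6 C; the slide does not say which assignment the authors used. The helper name `remote_work_ttl1` is hypothetical.

```python
def remote_work_ttl1(neighbors, rates, capacity=1.0):
    """Total remote work per round when every query has TTL = 1.

    neighbors: dict mapping node -> list of adjacent nodes
    rates:     dict mapping node -> new local queries injected per round
    """
    total = 0.0
    for node, adjacent in neighbors.items():
        spare = capacity - rates[node]              # left after local queries
        incoming = sum(rates[n] for n in adjacent)  # queries arriving from neighbors
        total += min(spare, incoming)
    return total

leaves = [f"leaf{i}" for i in range(6)]  # assumed: 6 leaves around one hub
star = {"hub": leaves, **{leaf: ["hub"] for leaf in leaves}}

identical = {node: 0.5 for node in star}
different = {"hub": 1.0, **{leaf: 0.0 for leaf in leaves}}

print(remote_work_ttl1(star, identical))  # 3.5
print(remote_work_ttl1(star, different))  # 6.0
```

With an identical rate the hub is swamped and can serve only 0.5 of the 3.0 it receives, while giving the hub the whole injection budget lets every leaf spend its full capacity on remote work.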
18 Penalty of using an identical injection rate: $D_1, \ldots, D_i, \ldots, D_n$. Maximum remote work is at most $nC$. Pick rate $= 1/D_1$ (as a fraction of $C$): all nodes saturated. Penalty is $1/D_1$; remote work $= nC\,(1 - 1/D_1)$
19 Penalty of using an identical injection rate (2): How big is $1/D_1$? $D_1 \ge \mathrm{TTL} + 1$. In practice: $D_1 > 50$, so the penalty is less than 2%
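The penalty arithmetic can be checked with exact fractions. This sketch rests on an interpretation that the slides do not spell out: D_i is taken to be the number of nodes whose queries reach node i, counting node i itself, and D_1 is the smallest such value. Capacity is normalized to C = 1, and the reach counts in the usage line are made-up numbers chosen only to sit above 50.

```python
from fractions import Fraction

def identical_rate_penalty(reach_counts):
    """Remote work and penalty when every node injects at rate 1/D_1 (C = 1).

    reach_counts: assumed D_i values, one per node (hypothetical input).
    """
    n = len(reach_counts)
    d1 = min(reach_counts)
    rate = Fraction(1, d1)           # identical injection rate, as a fraction of C
    remote_work = n * (1 - rate)     # every node is saturated: C - rate of remote work each
    penalty = 1 - remote_work / n    # shortfall relative to the upper bound n * C
    return rate, remote_work, penalty

rate, work, penalty = identical_rate_penalty([51, 60, 75, 120])
print(rate, work, float(penalty))
```

Because the penalty depends only on the smallest reach count, a network where every node hears from more than 50 nodes loses less than 2% by using one rate everywhere.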
20 Solving for different injection rates Similar to finding the dominating set for the graph. Node weights w1, w2, w3, w4: minimize the sum of all weights
21 Why? ... Unsaturated / Saturated Boost unsaturated nodes
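For readers unfamiliar with the dominating-set problem the slide refers to, here is a brute-force illustration on a four-node path. It is not the authors' algorithm: the slide only says the rate problem is similar, and its weights may well be fractional rather than 0/1. The graph, node names, and function names are all hypothetical, and the exhaustive search is only practical for tiny graphs.

```python
from itertools import combinations

def is_dominating(graph, chosen):
    """True if every node is in `chosen` or adjacent to a node in `chosen`."""
    covered = set(chosen)
    for node in chosen:
        covered.update(graph[node])
    return covered == set(graph)

def minimum_dominating_set(graph):
    """Smallest dominating set by exhaustive search (exponential time)."""
    nodes = sorted(graph)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            if is_dominating(graph, subset):
                return set(subset)
    return set(nodes)

# A path a - b - c - d, given as an adjacency list.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result = minimum_dominating_set(path)
print(result)
```

Giving each chosen node weight 1 and every other node weight 0 makes "smallest dominating set" the same as "minimize the sum of all weights", which is the connection the slide draws.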
22 Future Directions Nodes of different capacities. Incremental algorithm for computing the injection rate at each node. An incentive mechanism so that each node will forward neighbors’ queries
23 Conclusion Controlling rate of query injection leads to better efficiency Solutions for finding the optimal rate For other P2P related work, google for “Stanford Peers”