
1 Evolution of P2P Content Distribution Pei Cao

2 Outline
History of P2P Content Distribution Architectures
Techniques to Improve Gnutella
Brief Overview of DHTs
Techniques to Improve BitTorrent

3 History of P2P
Napster
Gnutella
KaZaa
Distributed Hash Tables
BitTorrent

4 Napster
Centralized directory
– A central site holds a directory of the contents of all peers
– Queries are performed at the central directory
– File transfers occur directly between peers
– Supports arbitrary queries
– Con: single point of failure

5 Gnutella
Decentralized, homogeneous peers
– No central directory
– Queries are performed in a distributed fashion across peers via "flooding"
– Supports arbitrary queries
– Very resilient against node failures
– Problem: doesn't scale

6 FastTrack/KaZaa
Distributed two-tier architecture
– Supernodes: keep the content directory for regular nodes
– Regular nodes: do not participate in query processing
– Queries are performed by supernodes only
– Supports arbitrary queries
– Con: supernode stability affects system performance

7 Distributed Hash Tables
Structured distributed system
– Structured: all nodes participate in a precise scheme to maintain certain invariants
– Provides a directory service: lookup and routing
– Extra work when nodes join and leave
– Supports key-based lookups only

8 BitTorrent
Designed for distribution of very large files
A tracker connects peers to each other
Peers exchange file blocks with each other
Uses tit-for-tat to discourage freeloading

9 Improving Gnutella

10 Gnutella-Style Systems
Advantages of Gnutella:
– Supports more flexible queries (typically, precise "name" search is only a small portion of all queries)
– Simplicity
– High resilience against node failures
Problem with Gnutella: scalability
– Flooding → # of messages ~ O(N*E)

11 Flooding-Based Searches
[Figure: flooding over an example topology with numbered nodes]
Duplication increases as the TTL increases in flooding
Worst case: a node A is interrupted by N * q * degree(A) messages (sketched below)
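To make the duplication overhead concrete, here is a minimal sketch of TTL-limited flooding over a graph given as an adjacency dict. It is an illustration only, not the Gnutella wire protocol; note that duplicate forwards still count as messages, which is exactly the overhead discussed above.

```python
from collections import deque

def flood(graph, source, obj, ttl, have):
    """TTL-limited flood of a query for `obj` from `source`.

    graph: adjacency dict {node: [neighbors]}; have: {node: set of objects stored}.
    Returns (hit nodes, total messages). Duplicate forwards are still counted as
    messages, even though the receiving node drops them.
    """
    hits, messages = set(), 0
    seen = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, t = frontier.popleft()
        if obj in have.get(node, set()):
            hits.add(node)
        if t == 0:
            continue
        for nbr in graph[node]:
            messages += 1                      # duplicates are sent and then dropped
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return hits, messages

# Tiny example: a 4-node ring where node 3 stores object "x"
g = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flood(g, source=0, obj="x", ttl=2, have={3: {"x"}}))
```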

12 Load on Individual Nodes
Why is a node interrupted:
– To process a query
– To route the query to other nodes
– To process duplicated queries sent to it

13 Communication Complexity
Communication complexity is determined by:
– Network topology
– Distribution of object popularity
– Distribution of object replication density

14 Network Topologies
Uniform random graph (Random)
– Average and median node degree is 4
Power-law random graph (PLRG)
– Max node degree: 1746, median: 1, average: 4.46
Gnutella network snapshot (Gnutella)
– Oct 2000 snapshot; max degree: 136, median: 2, average: 5.5
Two-dimensional grid (Grid)

15 Modeling Methods
Object popularity distribution p_i
– Uniform
– Zipf-like
Object replication density distribution r_i (see the sketch below)
– Uniform
– Proportional: r_i ∝ p_i
– Square-root: r_i ∝ √p_i
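For concreteness, a small sketch that generates these distributions; the object count, Zipf exponent, and total replica budget R below are illustrative values, not parameters from the study.

```python
def zipf_popularity(m, alpha=0.8):
    """Zipf-like popularity p_i ∝ 1/i^alpha over m objects (alpha=0.8 is an assumed exponent)."""
    w = [1.0 / i ** alpha for i in range(1, m + 1)]
    s = sum(w)
    return [x / s for x in w]

def replication_density(p, R, strategy):
    """Spread a replica budget R over objects: 'uniform', 'proportional' (r_i ∝ p_i),
    or 'square-root' (r_i ∝ √p_i)."""
    if strategy == "uniform":
        w = [1.0] * len(p)
    elif strategy == "proportional":
        w = list(p)
    elif strategy == "square-root":
        w = [x ** 0.5 for x in p]
    else:
        raise ValueError(strategy)
    s = sum(w)
    return [R * x / s for x in w]

p = zipf_popularity(100)
print([round(r, 1) for r in replication_density(p, R=1000, strategy="square-root")[:5]])
```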

16 Evaluation Metrics
Overhead: average # of messages per node per query
Probability of search success: Pr(success)
Delay: # of hops until success

17 Duplications in Various Network Topologies

18 Relationship between TTL and Search Successes

19 Problems with Simple TTL-Based Flooding
Hard to choose the TTL:
– For objects that are widely present in the network, small TTLs suffice
– For objects that are rare in the network, large TTLs are necessary
The number of query messages grows exponentially as the TTL grows

20 Idea #1: Adaptively Adjust TTL ("Expanding Ring")
– Multiple floods: start with TTL=1; increment the TTL by 2 each time until the search succeeds (sketched below)
The savings vary by network topology:
– For "Random", a 30- to 70-fold reduction in message traffic
– For power-law and Gnutella graphs, only a 3- to 9-fold reduction
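A sketch of the expanding-ring policy, reusing the flood() helper from the slide 11 sketch above; the TTL cap is an assumed safeguard, not a value from the slides.

```python
def expanding_ring(graph, source, obj, have, max_ttl=9):
    """Repeated floods with a growing TTL: start at TTL=1 and grow by 2 per round
    until the search succeeds or the (assumed) cap max_ttl is exceeded.
    Returns (hit nodes, total messages across all rounds)."""
    total, ttl = 0, 1
    while ttl <= max_ttl:
        hits, msgs = flood(graph, source, obj, ttl, have)   # flood() from the earlier sketch
        total += msgs
        if hits:
            return hits, total
        ttl += 2
    return set(), total
```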

21 Limitations of Expanding Ring

22 Idea #2: Random Walk
Simple random walk
– Takes too long to find anything!
Multiple-walker random walk (sketched below)
– N agents each walking T steps visit as many nodes as 1 agent walking N*T steps
– When to terminate the search: check back with the query originator once every C steps
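A sketch of the multiple-walker random walk with periodic check-back; the walker count, check interval, and step cap are illustrative parameters, not values from the study.

```python
import random

def k_walker_search(graph, source, obj, have, walkers=16, check_every=4, max_steps=10_000):
    """Multiple-walker random walk: each walker moves to a uniformly random neighbor;
    every `check_every` steps the walkers "check back" with the originator and stop
    once any walker has found the object. Returns (hit nodes, approx. message count)."""
    positions = [source] * walkers
    found = set()
    for step in range(1, max_steps + 1):
        for i, node in enumerate(positions):
            nxt = random.choice(graph[node])
            positions[i] = nxt
            if obj in have.get(nxt, set()):
                found.add(nxt)
        if step % check_every == 0 and found:   # periodic check-back with the originator
            return found, step * walkers        # messages ≈ total walker-steps
    return found, max_steps * walkers
```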

23 Search Traffic Comparison

24 Search Delay Comparison

25 Flexible Replication
In unstructured systems, search success is essentially about coverage: visiting enough nodes to probabilistically find the object => replication density matters
Limited node storage => what is the optimal replication density distribution?
– In Gnutella, only nodes that query an object store it => r_i ∝ p_i
– What if we use different replication strategies?

26 Optimal r_i Distribution
Goal: minimize Σ_i (p_i / r_i), subject to Σ_i r_i = R
Calculation (written out below):
– Introduce a Lagrange multiplier λ and find the r_i (and λ) that minimize: Σ_i (p_i / r_i) + λ (Σ_i r_i − R)
– => λ − p_i / r_i² = 0 for all i
– => r_i ∝ √p_i
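Written out as display math, the slide's derivation is:

```latex
\[
\min_{r_1,\dots,r_m}\ \sum_i \frac{p_i}{r_i}
\quad\text{subject to}\quad \sum_i r_i = R .
\]
Introduce a Lagrange multiplier $\lambda$ and minimize
\[
\mathcal{L} = \sum_i \frac{p_i}{r_i} + \lambda\Bigl(\sum_i r_i - R\Bigr),
\qquad
\frac{\partial \mathcal{L}}{\partial r_i} = -\frac{p_i}{r_i^2} + \lambda = 0
\ \Longrightarrow\ r_i = \sqrt{p_i/\lambda}.
\]
Choosing $\lambda$ to satisfy the constraint gives
\[
r_i = R\,\frac{\sqrt{p_i}}{\sum_j \sqrt{p_j}} \ \propto\ \sqrt{p_i}.
\]
```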

27 Square-Root Distribution
General principle: to minimize Σ_i (p_i / r_i) under the constraint Σ_i r_i = R, make r_i proportional to the square root of p_i
Other application examples:
– Bandwidth allocation to minimize expected download times
– Server load balancing to minimize expected request latency

28 Achieving Square-Root Distribution
Suggested heuristics:
– Store an object at a number of nodes proportional to the number of nodes visited in order to find the object
– Each node uses random replacement for eviction
Two implementations (see the sketch below):
– Path replication: store the object along the path of a successful "walk"
– Random replication: store the object randomly among the nodes visited by the walkers
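A sketch of the two replication strategies applied after a successful walk; the per-node capacity is an illustrative knob, and eviction uses random replacement as suggested above.

```python
import random

def replicate_after_walk(have, obj, visited, path, mode="path", capacity=10):
    """Store `obj` at a number of nodes proportional to the walk length.

    mode="path":   store along the path of the successful walk (path replication)
    mode="random": store at the same number of nodes, chosen uniformly from all visited nodes
    `have` maps node -> set of stored objects; `capacity` is an assumed storage limit.
    """
    if mode == "path":
        targets = list(path)
    else:
        targets = random.sample(list(visited), k=min(len(path), len(visited)))
    for node in targets:
        store = have.setdefault(node, set())
        if len(store) >= capacity:
            store.discard(random.choice(list(store)))   # random replacement eviction
        store.add(obj)
    return have
```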

29 Evaluation of Replication Methods
Metrics
– Overall message traffic
– Search delay
Dynamic simulation
– Assumes Zipf-like object query probabilities
– 5 queries/sec Poisson arrival
– Results are measured during 5000 sec - 9000 sec

30 Distribution of r_i

31 Total Search Message Comparison
Observation: path replication is slightly inferior to random replication

32 Search Delay Comparison

33 Summary
Multi-walker random walk scales much better than flooding
– It won't scale as well as a structured network, but the current unstructured network can be improved significantly
A square-root replication distribution is desirable and can be achieved via path replication

34 KaZaa
Uses supernodes
Regular nodes : supernodes = 100 : 1
A simple way to scale the system by a factor of 100

35 DHTs: A Brief Overview (Slides by Brad Karp)

36 What Is a DHT?
Single-node hash table (sketched below):
– key = Hash(name)
– put(key, value)
– get(key) -> value
How do I do this across millions of hosts on the Internet?
– Distributed Hash Table
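For reference, a minimal single-node sketch of the interface on this slide (the class and variable names are mine); the DHT question is then how to spread this table across millions of hosts.

```python
import hashlib

class SingleNodeHashTable:
    """The single-node version of the put/get interface on this slide (a sketch)."""
    def __init__(self):
        self.table = {}

    @staticmethod
    def key(name: str) -> int:
        # key = Hash(name); SHA-1 here, matching the consistent-hashing slides
        return int(hashlib.sha1(name.encode()).hexdigest(), 16)

    def put(self, key: int, value: bytes) -> None:
        self.table[key] = value

    def get(self, key: int) -> bytes:
        return self.table[key]

ht = SingleNodeHashTable()
k = ht.key("title")
ht.put(k, b"file data...")
print(ht.get(k))
```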

37 Distributed Hash Tables
Chord
CAN
Pastry
Tapestry
etc.

38 The Problem
[Figure: a publisher and a client connected to nodes N1-N6 across the Internet]
Publisher: put(key="title", value=file data…)
Client: get(key="title") ?
Two sub-problems: key placement, and routing to find the key

39 Key Placement
Traditional hashing
– Nodes numbered from 1 to N
– A key is placed at node (hash(key) % N)
Why does traditional hashing have problems? (See the sketch below.)
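A small illustration of the problem (the key and node counts are arbitrary): with hash(key) % N placement, adding a single node remaps almost every key to a different node.

```python
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    """Traditional placement: node = hash(key) % N."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % n_nodes

keys = [f"file-{i}" for i in range(10_000)]
before = {k: node_for(k, 100) for k in keys}
after  = {k: node_for(k, 101) for k in keys}          # a single node joins
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys change node")  # almost all of them
```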

40 Consistent Hashing: IDs
Key identifier = SHA-1(key)
Node identifier = SHA-1(IP address)
SHA-1 distributes both uniformly
How to map key IDs to node IDs?

41 Consistent Hashing: Placement
A key is stored at its successor: the node with the next-higher ID (sketched below)
[Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80]
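A minimal consistent-hashing sketch using the slide's 7-bit circular ID space (the Ring class and the node addresses are illustrative; real deployments use the full 160-bit SHA-1 space).

```python
import bisect
import hashlib

BITS = 7                       # the slide's circular 7-bit ID space

def h(x: str) -> int:
    return int(hashlib.sha1(x.encode()).hexdigest(), 16) % (2 ** BITS)

class Ring:
    """A key is stored at its successor: the first node whose ID is >= the key's ID (wrapping)."""
    def __init__(self, node_addrs):
        self.ids = sorted(h(a) for a in node_addrs)

    def successor(self, key_id: int) -> int:
        i = bisect.bisect_left(self.ids, key_id)
        return self.ids[i % len(self.ids)]      # wrap around past the largest node ID

ring = Ring(["10.0.0.%d" % i for i in range(1, 6)])
print(ring.ids, ring.successor(h("some file")))
```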

42 Basic Lookup
[Figure: "Where is key 80?" forwarded around the ring N10, N32, N60, N90, N105, N120 until "N90 has K80"]

43 "Finger Table" Allows log(N)-time Lookups
[Figure: node N80 with fingers covering 1/2, 1/4, 1/8, …, 1/128 of the ring]

44 Finger i Points to Successor of n+2^i
[Figure: from N80, the finger target 112 resolves to its successor N120]

45 Lookups Take O(log N) Hops
[Figure: Lookup(K19) routed across nodes N5, N10, N20, N32, N60, N80, N99, N110; sketched in code below]
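A compact, illustrative sketch of finger construction and greedy lookup over the node IDs in the figure. This is a simplification of Chord's find_successor, not the actual protocol code; the starting node in the example call is an arbitrary choice.

```python
import bisect

BITS, SPACE = 7, 2 ** 7
nodes = [5, 10, 20, 32, 60, 80, 99, 110]        # node IDs from the slide's figure (sorted)

def successor(ident):
    """First node ID >= ident on the circle, wrapping past the largest ID."""
    i = bisect.bisect_left(nodes, ident % SPACE)
    return nodes[i % len(nodes)]

def fingers(n):
    """Finger i of node n points to successor(n + 2^i)."""
    return [successor(n + 2 ** i) for i in range(BITS)]

def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    a, b, x = a % SPACE, b % SPACE, x % SPACE
    return (a < x <= b) if a < b else (x > a or x <= b)

def dist(a, b):
    """Clockwise distance from a to b on the circle."""
    return (b - a) % SPACE

def lookup(start, key_id):
    """Greedy Chord-style lookup: hop to the closest preceding finger until the key
    falls between the current node and its successor; O(log N) hops in expectation."""
    n, hops = start, 0
    while not between(key_id, n, successor(n + 1)):
        candidates = [f for f in fingers(n) if between(f, n, key_id - 1)]
        n = min(candidates, key=lambda f: dist(f, key_id)) if candidates else successor(n + 1)
        hops += 1
    return successor(key_id), hops

print(lookup(start=80, key_id=19))   # the Lookup(K19) example resolves to node 20
```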

46 Joining: Linked List Insert
[Figure: N36 joins between N25 and N40, which holds K30 and K38]
1. Lookup(36)

47 Join (2)
2. N36 sets its own successor pointer (to N40)

48 Join (3)
3. Copy keys 26..36 from N40 to N36 (K30 is now also stored at N36)

49 Join (4)
4. Set N25's successor pointer (to N36)
The predecessor pointer allows linking to the new host
Finger pointers are updated in the background
Correct successors produce correct lookups
(A code sketch of the four steps follows.)
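A sketch of the four join steps using the slides' N25/N36/N40 example; predecessor pointers and finger maintenance are omitted, since the slide notes they are handled in the background.

```python
SPACE = 128     # 7-bit ID space, as in the earlier sketches

class Node:
    def __init__(self, ident):
        self.id, self.successor, self.keys = ident, self, {}

def in_range(x, a, b):
    """x in the circular interval (a, b]."""
    a, b, x = a % SPACE, b % SPACE, x % SPACE
    return (a < x <= b) if a < b else (x > a or x <= b)

def join(new, some_node):
    """Linked-list insert following slides 46-49 (a sketch, not Chord's full join)."""
    pred = some_node                                    # 1. Lookup(new.id): walk to the predecessor
    while not in_range(new.id, pred.id, pred.successor.id):
        pred = pred.successor
    succ = pred.successor
    new.successor = succ                                # 2. new node sets its own successor pointer
    for k, v in succ.keys.items():                      # 3. copy keys in (pred.id, new.id] from the successor
        if in_range(k, pred.id, new.id):
            new.keys[k] = v
    pred.successor = new                                # 4. predecessor's successor pointer -> new node

# The slides' example: N25 -> N40 (holding K30, K38), then N36 joins and copies K30
n25, n40 = Node(25), Node(40)
n25.successor, n40.successor = n40, n25
n40.keys = {30: "K30", 38: "K38"}
n36 = Node(36)
join(n36, n25)
print(n36.successor.id, sorted(n36.keys))   # 40 [30]
```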

50 Chord Lookup Algorithm Properties
Interface: lookup(key) → IP address
Efficient: O(log N) messages per lookup (N is the total number of servers)
Scalable: O(log N) state per node
Robust: survives massive failures
Simple to analyze

51 Many, Many Variations on the Same Theme
Different ways to choose the fingers
Ways to make it more robust
Ways to make it more network-efficient
etc.

52 Improving BitTorrent

53 BitTorrent File Sharing Network
Goal: replicate K chunks of data among N nodes
– Form a neighbor connection graph
– Neighbors exchange data

54 BitTorrent: Neighbor Selection
[Figure: peer A contacts the tracker listed in file.torrent and is connected to peers 1-5; the seed holds the whole file]

55 BitTorrent: Piece Replication
[Figure: peer A exchanges pieces with its neighbors while the seed holds the whole file]

56 BitTorrent: Piece Replication Algorithms
"Tit-for-tat" (choking/unchoking):
– Each peer only uploads to 7 other peers at a time
– 6 of these are chosen based on the amount of data received from the neighbor in the last 20 seconds
– The last one is chosen randomly, with a 75% bias toward newcomers
(Local) rarest-first replication:
– When peer 3 unchokes peer A, A selects which piece to download
Both policies are sketched in code below.
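A rough sketch of the two policies described above; the data structures (per-peer byte counters, piece sets) are assumptions for illustration and not the BitTorrent client's actual internals.

```python
import random
from collections import Counter

def choose_unchoked(neighbors, recv_bytes_20s):
    """Tit-for-tat sketch: unchoke the 6 neighbors that sent us the most data in the
    last 20 seconds, plus 1 optimistic unchoke chosen randomly with a 75% bias
    toward newcomers (peers we have not received anything from yet)."""
    regular = sorted(neighbors, key=lambda p: recv_bytes_20s.get(p, 0), reverse=True)[:6]
    rest = [p for p in neighbors if p not in regular]
    if rest:
        newcomers = [p for p in rest if recv_bytes_20s.get(p, 0) == 0]
        pool = newcomers if newcomers and random.random() < 0.75 else rest
        regular.append(random.choice(pool))
    return regular

def rarest_first(my_pieces, offered, neighbor_pieces):
    """Local rarest-first sketch: of the offered pieces we still lack, pick the one
    held by the fewest of our neighbors."""
    counts = Counter(p for pieces in neighbor_pieces.values() for p in pieces)
    wanted = [p for p in offered if p not in my_pieces]
    return min(wanted, key=lambda p: counts[p], default=None)
```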

57 Performance of BitTorrent
Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks
– Demonstrated by simulation studies
– Confirmed by theoretical modeling studies
Intuition: in a random graph, Prob(Peer A's content is a subset of Peer B's) ≤ 50%

58 Lessons from BitTorrent
Often, randomized, simple algorithms perform better than elaborately designed deterministic algorithms

59 Problems with BitTorrent
ISPs are unhappy
– BitTorrent is notoriously difficult to "traffic engineer"
– ISPs: different links have different monetary costs
– BitTorrent: all peers are equal; choices are made based on measured performance, with no regard for the underlying ISP topology or preferences

60 BitTorrent and ISPs: Can They Play Together?
Current state of affairs: a clumsy coexistence
– ISPs "throttle" BitTorrent traffic along high-cost links
– Users suffer
Can they be partners?
– ISPs inform BitTorrent of their preferences
– BitTorrent schedules traffic in ways that benefit both users and ISPs

61 Random Neighbor Selection
Existing studies all assume random neighbor selection
– BitTorrent is no longer optimal if nodes in the same ISP only connect to each other
Random neighbor selection → high cross-ISP traffic
Q: Can we modify the neighbor selection scheme without affecting performance?

62 Biased Neighbor Selection
Idea: of N neighbors, choose N-k from peers within the same ISP, and choose k randomly from peers outside the ISP (sketched below)
[Figure: peers clustered within an ISP, with a few connections crossing the ISP boundary]
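A minimal sketch of the selection rule; the values of n and k and the candidates map are illustrative, and how the tracker learns ISP affiliations is discussed on the next slide.

```python
import random

def biased_neighbors(candidates, my_isp, n=35, k=1):
    """Biased neighbor selection sketch: of n neighbors, pick n-k from peers in the
    same ISP and k at random from outside it.

    candidates: dict mapping peer -> ISP (the tracker is assumed to know affiliations).
    n=35 and k=1 are illustrative; the point is simply that k is small.
    """
    inside  = [p for p, isp in candidates.items() if isp == my_isp]
    outside = [p for p, isp in candidates.items() if isp != my_isp]
    chosen  = random.sample(inside,  min(n - k, len(inside)))
    chosen += random.sample(outside, min(k, len(outside)))
    return chosen
```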

63 Implementing Biased Neighbor Selection
By the tracker
– Needs the ISP affiliations of peers: peer-to-AS maps, public IP address ranges from ISPs, or a special "X-" HTTP header
By traffic shaping devices
– Intercept peer-to-tracker messages and manipulate the responses
– No need to change the tracker or client

64 Evaluation Methodology
Event-driven simulator
– Uses actual client and tracker code as much as possible
– Calculates bandwidth contention, assuming perfect fair-share from TCP
Network settings
– 14 ISPs, each with 50 peers; 100 Kb/s upload, 1 Mb/s download per peer
– Seed node with 400 Kb/s upload
– Optional "university" nodes (1 Mb/s upload)
– Optional ISP bottleneck to other ISPs

65 Limitation of Throttling

66 Throttling: Cross-ISP Traffic
Redundancy: the average # of times a data chunk enters the ISP

67 Biased Neighbor Selection: Download Times

68 Biased Neighbor Selection: Cross-ISP Traffic

69 Importance of Rarest-First Replication
Random piece replication performs badly
– Increases download times by 84%-150%
– Increases traffic redundancy from 3 to 14
Biased neighbors + rarest-first → more uniform progress across peers

70 Biased Neighbor Selection: Single-ISP Deployment

71 Presence of External High-Bandwidth Peers
Biased neighbor selection alone:
– Average download time is the same as regular BitTorrent
– Cross-ISP traffic increases as the # of "university" peers increases (a result of tit-for-tat)
Biased neighbor selection + throttling:
– Download time only increases by 12%, since most neighbors do not cross the bottleneck
– Traffic redundancy (i.e. cross-ISP traffic) is the same as in the scenario without "university" peers

72 Comparison with Alternatives
Gateway peer: only one peer connects to peers outside the ISP
– The gateway peer must have high bandwidth; it is effectively the "seed" for this ISP
– Ends up benefiting peers in other ISPs
Caching:
– Can be combined with biased neighbor selection
– Biased neighbor selection reduces the bandwidth needed from the cache by an order of magnitude

73 Summary
By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP costs
– Biased neighbor selection: choose the initial set of neighbors well
– Can be combined with throttling and caching
=> P2P and ISPs can collaborate!

