
1 Evolution of P2P Content Distribution Pei Cao

2 Outline
History of P2P Content Distribution Architectures
Techniques to Improve Gnutella
Brief Overview of DHTs
Techniques to Improve BitTorrent

3 History of P2P
Napster
Gnutella
KaZaa
Distributed Hash Tables
BitTorrent

4 Napster
Centralized directory
– A central site holds a directory of the contents of all peers
– Queries are performed at the central directory
– File transfers occur directly between peers
– Supports arbitrary queries
– Con: single point of failure

5 Gnutella
Decentralized, homogeneous peers
– No central directory
– Queries are performed in a distributed fashion across peers via "flooding"
– Supports arbitrary queries
– Very resilient against node failures
– Problem: doesn't scale

6 FastTrack/KaZaa
Distributed two-tier architecture
– Supernodes: keep the content directory for regular nodes
– Regular nodes: do not participate in query processing
– Queries are performed by supernodes only
– Supports arbitrary queries
– Con: supernode stability affects system performance

7 Distributed Hash Tables
Structured distributed system
– Structured: all nodes participate in a precise scheme to maintain certain invariants
– Provides a directory service: lookup and routing
– Extra work when nodes join and leave
– Supports key-based lookups only

8 BitTorrent
Designed for distribution of very large files
A tracker connects peers to each other
Peers exchange file blocks with each other
Uses tit-for-tat to discourage freeloading

9 Improving Gnutella

10 Gnutella-Style Systems
Advantages of Gnutella:
– Supports more flexible queries (typically, precise "name" search is only a small portion of all queries)
– Simplicity
– High resilience against node failures
Problem with Gnutella: scalability
– Flooding → # of messages ~ O(N*E)

11 Flooding-Based Searches
[Figure: flooding over an example topology with numbered nodes]
Duplication increases as the TTL increases in flooding
Worst case: a node A is interrupted by N * q * degree(A) messages (sketched below)
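To make the duplication overhead concrete, here is a minimal sketch of TTL-limited flooding over a graph given as an adjacency dict. It is an illustration only, not the Gnutella wire protocol; note that duplicate forwards still count as messages, which is exactly the overhead discussed above.

```python
from collections import deque

def flood(graph, source, obj, ttl, have):
    """TTL-limited flood of a query for `obj` from `source`.

    graph: adjacency dict {node: [neighbors]}; have: {node: set of objects stored}.
    Returns (hit nodes, total messages). Duplicate forwards are still counted as
    messages, even though the receiving node drops them.
    """
    hits, messages = set(), 0
    seen = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, t = frontier.popleft()
        if obj in have.get(node, set()):
            hits.add(node)
        if t == 0:
            continue
        for nbr in graph[node]:
            messages += 1                      # duplicates are sent and then dropped
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return hits, messages

# Tiny example: a 4-node ring where node 3 stores object "x"
g = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flood(g, source=0, obj="x", ttl=2, have={3: {"x"}}))
```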

12 Load on Individual Nodes
Why is a node interrupted:
– To process a query
– To route the query to other nodes
– To process duplicated queries sent to it

13 Communication Complexity
Communication complexity is determined by:
– Network topology
– Distribution of object popularity
– Distribution of object replication density

14 Network Topologies
Uniform random graph (Random)
– Average and median node degree is 4
Power-law random graph (PLRG)
– Max node degree: 1746, median: 1, average: 4.46
Gnutella network snapshot (Gnutella)
– Oct 2000 snapshot; max degree: 136, median: 2, average: 5.5
Two-dimensional grid (Grid)

15 Modeling Methods
Object popularity distribution p_i
– Uniform
– Zipf-like
Object replication density distribution r_i (see the sketch below)
– Uniform
– Proportional: r_i ∝ p_i
– Square-root: r_i ∝ √p_i
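For concreteness, a small sketch that generates these distributions; the object count, Zipf exponent, and total replica budget R below are illustrative values, not parameters from the study.

```python
def zipf_popularity(m, alpha=0.8):
    """Zipf-like popularity p_i ∝ 1/i^alpha over m objects (alpha=0.8 is an assumed exponent)."""
    w = [1.0 / i ** alpha for i in range(1, m + 1)]
    s = sum(w)
    return [x / s for x in w]

def replication_density(p, R, strategy):
    """Spread a replica budget R over objects: 'uniform', 'proportional' (r_i ∝ p_i),
    or 'square-root' (r_i ∝ √p_i)."""
    if strategy == "uniform":
        w = [1.0] * len(p)
    elif strategy == "proportional":
        w = list(p)
    elif strategy == "square-root":
        w = [x ** 0.5 for x in p]
    else:
        raise ValueError(strategy)
    s = sum(w)
    return [R * x / s for x in w]

p = zipf_popularity(100)
print([round(r, 1) for r in replication_density(p, R=1000, strategy="square-root")[:5]])
```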

16 Evaluation Metrics
Overhead: average # of messages per node per query
Probability of search success: Pr(success)
Delay: # of hops until success

17 Duplications in Various Network Topologies

18 Relationship between TTL and Search Successes

19 Problems with Simple TTL-Based Flooding
Hard to choose the TTL:
– For objects that are widely present in the network, small TTLs suffice
– For objects that are rare in the network, large TTLs are necessary
The number of query messages grows exponentially as the TTL grows

20 Idea #1: Adaptively Adjust TTL ("Expanding Ring")
– Multiple floods: start with TTL=1; increment the TTL by 2 each time until the search succeeds (sketched below)
The savings vary by network topology:
– For "Random", a 30- to 70-fold reduction in message traffic
– For power-law and Gnutella graphs, only a 3- to 9-fold reduction
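A sketch of the expanding-ring policy, reusing the flood() helper from the slide 11 sketch above; the TTL cap is an assumed safeguard, not a value from the slides.

```python
def expanding_ring(graph, source, obj, have, max_ttl=9):
    """Repeated floods with a growing TTL: start at TTL=1 and grow by 2 per round
    until the search succeeds or the (assumed) cap max_ttl is exceeded.
    Returns (hit nodes, total messages across all rounds)."""
    total, ttl = 0, 1
    while ttl <= max_ttl:
        hits, msgs = flood(graph, source, obj, ttl, have)   # flood() from the earlier sketch
        total += msgs
        if hits:
            return hits, total
        ttl += 2
    return set(), total
```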

21 Limitations of Expanding Ring

22 Idea #2: Random Walk
Simple random walk
– Takes too long to find anything!
Multiple-walker random walk (sketched below)
– N agents each walking T steps visit as many nodes as 1 agent walking N*T steps
– When to terminate the search: check back with the query originator once every C steps
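A sketch of the multiple-walker random walk with periodic check-back; the walker count, check interval, and step cap are illustrative parameters, not values from the study.

```python
import random

def k_walker_search(graph, source, obj, have, walkers=16, check_every=4, max_steps=10_000):
    """Multiple-walker random walk: each walker moves to a uniformly random neighbor;
    every `check_every` steps the walkers "check back" with the originator and stop
    once any walker has found the object. Returns (hit nodes, approx. message count)."""
    positions = [source] * walkers
    found = set()
    for step in range(1, max_steps + 1):
        for i, node in enumerate(positions):
            nxt = random.choice(graph[node])
            positions[i] = nxt
            if obj in have.get(nxt, set()):
                found.add(nxt)
        if step % check_every == 0 and found:   # periodic check-back with the originator
            return found, step * walkers        # messages ≈ total walker-steps
    return found, max_steps * walkers
```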

23 Search Traffic Comparison

24 Search Delay Comparison

25 Flexible Replication
In unstructured systems, search success is essentially about coverage: visiting enough nodes to probabilistically find the object => replication density matters
Limited node storage => what is the optimal replication density distribution?
– In Gnutella, only nodes that query an object store it => r_i ∝ p_i
– What if we use different replication strategies?

26 Optimal r_i Distribution
Goal: minimize Σ_i (p_i / r_i), subject to Σ_i r_i = R
Calculation (written out below):
– Introduce a Lagrange multiplier λ and find the r_i (and λ) that minimize: Σ_i (p_i / r_i) + λ (Σ_i r_i − R)
– => λ − p_i / r_i² = 0 for all i
– => r_i ∝ √p_i
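Written out as display math, the slide's derivation is:

```latex
\[
\min_{r_1,\dots,r_m}\ \sum_i \frac{p_i}{r_i}
\quad\text{subject to}\quad \sum_i r_i = R .
\]
Introduce a Lagrange multiplier $\lambda$ and minimize
\[
\mathcal{L} = \sum_i \frac{p_i}{r_i} + \lambda\Bigl(\sum_i r_i - R\Bigr),
\qquad
\frac{\partial \mathcal{L}}{\partial r_i} = -\frac{p_i}{r_i^2} + \lambda = 0
\ \Longrightarrow\ r_i = \sqrt{p_i/\lambda}.
\]
Choosing $\lambda$ to satisfy the constraint gives
\[
r_i = R\,\frac{\sqrt{p_i}}{\sum_j \sqrt{p_j}} \ \propto\ \sqrt{p_i}.
\]
```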

27 Square-Root Distribution
General principle: to minimize Σ_i (p_i / r_i) under the constraint Σ_i r_i = R, make r_i proportional to the square root of p_i
Other application examples:
– Bandwidth allocation to minimize expected download times
– Server load balancing to minimize expected request latency

28 Achieving Square-Root Distribution
Suggested heuristics:
– Store an object at a number of nodes proportional to the number of nodes visited in order to find the object
– Each node uses random replacement for eviction
Two implementations (see the sketch below):
– Path replication: store the object along the path of a successful "walk"
– Random replication: store the object randomly among the nodes visited by the walkers
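A sketch of the two replication strategies applied after a successful walk; the per-node capacity is an illustrative knob, and eviction uses random replacement as suggested above.

```python
import random

def replicate_after_walk(have, obj, visited, path, mode="path", capacity=10):
    """Store `obj` at a number of nodes proportional to the walk length.

    mode="path":   store along the path of the successful walk (path replication)
    mode="random": store at the same number of nodes, chosen uniformly from all visited nodes
    `have` maps node -> set of stored objects; `capacity` is an assumed storage limit.
    """
    if mode == "path":
        targets = list(path)
    else:
        targets = random.sample(list(visited), k=min(len(path), len(visited)))
    for node in targets:
        store = have.setdefault(node, set())
        if len(store) >= capacity:
            store.discard(random.choice(list(store)))   # random replacement eviction
        store.add(obj)
    return have
```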

29 Evaluation of Replication Methods
Metrics
– Overall message traffic
– Search delay
Dynamic simulation
– Assumes Zipf-like object query probabilities
– 5 queries/sec Poisson arrival
– Results are measured during 5000 sec - 9000 sec

30 Distribution of r_i

31 Total Search Message Comparison
Observation: path replication is slightly inferior to random replication

32 Search Delay Comparison

33 Summary
Multi-walker random walk scales much better than flooding
– It won't scale as well as a structured network, but the current unstructured network can be improved significantly
A square-root replication distribution is desirable and can be achieved via path replication

34 KaZaa
Uses supernodes
Regular nodes : supernodes = 100 : 1
A simple way to scale the system by a factor of 100

35 DHTs: A Brief Overview (Slides by Brad Karp)

36 What Is a DHT?
Single-node hash table (sketched below):
– key = Hash(name)
– put(key, value)
– get(key) -> value
How do I do this across millions of hosts on the Internet?
– Distributed Hash Table
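For reference, a minimal single-node sketch of the interface on this slide (the class and variable names are mine); the DHT question is then how to spread this table across millions of hosts.

```python
import hashlib

class SingleNodeHashTable:
    """The single-node version of the put/get interface on this slide (a sketch)."""
    def __init__(self):
        self.table = {}

    @staticmethod
    def key(name: str) -> int:
        # key = Hash(name); SHA-1 here, matching the consistent-hashing slides
        return int(hashlib.sha1(name.encode()).hexdigest(), 16)

    def put(self, key: int, value: bytes) -> None:
        self.table[key] = value

    def get(self, key: int) -> bytes:
        return self.table[key]

ht = SingleNodeHashTable()
k = ht.key("title")
ht.put(k, b"file data...")
print(ht.get(k))
```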

37 Distributed Hash Tables
Chord
CAN
Pastry
Tapestry
etc.

38 The Problem
[Figure: a publisher and a client connected to nodes N1-N6 across the Internet]
Publisher: put(key="title", value=file data…)
Client: get(key="title") ?
Two sub-problems: key placement, and routing to find the key

39 Key Placement
Traditional hashing
– Nodes numbered from 1 to N
– A key is placed at node (hash(key) % N)
Why does traditional hashing have problems? (See the sketch below.)
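A small illustration of the problem (the key and node counts are arbitrary): with hash(key) % N placement, adding a single node remaps almost every key to a different node.

```python
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    """Traditional placement: node = hash(key) % N."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % n_nodes

keys = [f"file-{i}" for i in range(10_000)]
before = {k: node_for(k, 100) for k in keys}
after  = {k: node_for(k, 101) for k in keys}          # a single node joins
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys change node")  # almost all of them
```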

40 Consistent Hashing: IDs
Key identifier = SHA-1(key)
Node identifier = SHA-1(IP address)
SHA-1 distributes both uniformly
How to map key IDs to node IDs?

41 Consistent Hashing: Placement
A key is stored at its successor: the node with the next-higher ID (sketched below)
[Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80]
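A minimal consistent-hashing sketch using the slide's 7-bit circular ID space (the Ring class and the node addresses are illustrative; real deployments use the full 160-bit SHA-1 space).

```python
import bisect
import hashlib

BITS = 7                       # the slide's circular 7-bit ID space

def h(x: str) -> int:
    return int(hashlib.sha1(x.encode()).hexdigest(), 16) % (2 ** BITS)

class Ring:
    """A key is stored at its successor: the first node whose ID is >= the key's ID (wrapping)."""
    def __init__(self, node_addrs):
        self.ids = sorted(h(a) for a in node_addrs)

    def successor(self, key_id: int) -> int:
        i = bisect.bisect_left(self.ids, key_id)
        return self.ids[i % len(self.ids)]      # wrap around past the largest node ID

ring = Ring(["10.0.0.%d" % i for i in range(1, 6)])
print(ring.ids, ring.successor(h("some file")))
```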

42 Basic Lookup
[Figure: "Where is key 80?" forwarded around the ring N10, N32, N60, N90, N105, N120 until "N90 has K80"]

43 "Finger Table" Allows log(N)-time Lookups
[Figure: node N80 with fingers covering 1/2, 1/4, 1/8, …, 1/128 of the ring]

44 Finger i Points to Successor of n+2^i
[Figure: from N80, the finger target 112 resolves to its successor N120]

45 Lookups Take O(log N) Hops
[Figure: Lookup(K19) routed across nodes N5, N10, N20, N32, N60, N80, N99, N110; sketched in code below]
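A compact, illustrative sketch of finger construction and greedy lookup over the node IDs in the figure. This is a simplification of Chord's find_successor, not the actual protocol code; the starting node in the example call is an arbitrary choice.

```python
import bisect

BITS, SPACE = 7, 2 ** 7
nodes = [5, 10, 20, 32, 60, 80, 99, 110]        # node IDs from the slide's figure (sorted)

def successor(ident):
    """First node ID >= ident on the circle, wrapping past the largest ID."""
    i = bisect.bisect_left(nodes, ident % SPACE)
    return nodes[i % len(nodes)]

def fingers(n):
    """Finger i of node n points to successor(n + 2^i)."""
    return [successor(n + 2 ** i) for i in range(BITS)]

def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    a, b, x = a % SPACE, b % SPACE, x % SPACE
    return (a < x <= b) if a < b else (x > a or x <= b)

def dist(a, b):
    """Clockwise distance from a to b on the circle."""
    return (b - a) % SPACE

def lookup(start, key_id):
    """Greedy Chord-style lookup: hop to the closest preceding finger until the key
    falls between the current node and its successor; O(log N) hops in expectation."""
    n, hops = start, 0
    while not between(key_id, n, successor(n + 1)):
        candidates = [f for f in fingers(n) if between(f, n, key_id - 1)]
        n = min(candidates, key=lambda f: dist(f, key_id)) if candidates else successor(n + 1)
        hops += 1
    return successor(key_id), hops

print(lookup(start=80, key_id=19))   # the Lookup(K19) example resolves to node 20
```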

46 Joining: Linked List Insert
[Figure: N36 joins between N25 and N40, which holds K30 and K38]
1. Lookup(36)

47 Join (2)
2. N36 sets its own successor pointer (to N40)

48 Join (3)
3. Copy keys 26..36 from N40 to N36 (K30 is now also stored at N36)

49 Join (4)
4. Set N25's successor pointer (to N36)
The predecessor pointer allows linking to the new host
Finger pointers are updated in the background
Correct successors produce correct lookups
(A code sketch of the four steps follows.)
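A sketch of the four join steps using the slides' N25/N36/N40 example; predecessor pointers and finger maintenance are omitted, since the slide notes they are handled in the background.

```python
SPACE = 128     # 7-bit ID space, as in the earlier sketches

class Node:
    def __init__(self, ident):
        self.id, self.successor, self.keys = ident, self, {}

def in_range(x, a, b):
    """x in the circular interval (a, b]."""
    a, b, x = a % SPACE, b % SPACE, x % SPACE
    return (a < x <= b) if a < b else (x > a or x <= b)

def join(new, some_node):
    """Linked-list insert following slides 46-49 (a sketch, not Chord's full join)."""
    pred = some_node                                    # 1. Lookup(new.id): walk to the predecessor
    while not in_range(new.id, pred.id, pred.successor.id):
        pred = pred.successor
    succ = pred.successor
    new.successor = succ                                # 2. new node sets its own successor pointer
    for k, v in succ.keys.items():                      # 3. copy keys in (pred.id, new.id] from the successor
        if in_range(k, pred.id, new.id):
            new.keys[k] = v
    pred.successor = new                                # 4. predecessor's successor pointer -> new node

# The slides' example: N25 -> N40 (holding K30, K38), then N36 joins and copies K30
n25, n40 = Node(25), Node(40)
n25.successor, n40.successor = n40, n25
n40.keys = {30: "K30", 38: "K38"}
n36 = Node(36)
join(n36, n25)
print(n36.successor.id, sorted(n36.keys))   # 40 [30]
```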

50 Chord Lookup Algorithm Properties
Interface: lookup(key) → IP address
Efficient: O(log N) messages per lookup (N is the total number of servers)
Scalable: O(log N) state per node
Robust: survives massive failures
Simple to analyze

51 Many, Many Variations on the Same Theme
Different ways to choose the fingers
Ways to make it more robust
Ways to make it more network-efficient
etc.

52 Improving BitTorrent

53 BitTorrent File Sharing Network
Goal: replicate K chunks of data among N nodes
– Form a neighbor connection graph
– Neighbors exchange data

54 BitTorrent: Neighbor Selection
[Figure: peer A contacts the tracker listed in file.torrent and is connected to peers 1-5; the seed holds the whole file]

55 BitTorrent: Piece Replication
[Figure: peer A exchanges pieces with its neighbors while the seed holds the whole file]

56 BitTorrent: Piece Replication Algorithms
"Tit-for-tat" (choking/unchoking):
– Each peer only uploads to 7 other peers at a time
– 6 of these are chosen based on the amount of data received from the neighbor in the last 20 seconds
– The last one is chosen randomly, with a 75% bias toward newcomers
(Local) rarest-first replication:
– When peer 3 unchokes peer A, A selects which piece to download
Both policies are sketched in code below.
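A rough sketch of the two policies described above; the data structures (per-peer byte counters, piece sets) are assumptions for illustration and not the BitTorrent client's actual internals.

```python
import random
from collections import Counter

def choose_unchoked(neighbors, recv_bytes_20s):
    """Tit-for-tat sketch: unchoke the 6 neighbors that sent us the most data in the
    last 20 seconds, plus 1 optimistic unchoke chosen randomly with a 75% bias
    toward newcomers (peers we have not received anything from yet)."""
    regular = sorted(neighbors, key=lambda p: recv_bytes_20s.get(p, 0), reverse=True)[:6]
    rest = [p for p in neighbors if p not in regular]
    if rest:
        newcomers = [p for p in rest if recv_bytes_20s.get(p, 0) == 0]
        pool = newcomers if newcomers and random.random() < 0.75 else rest
        regular.append(random.choice(pool))
    return regular

def rarest_first(my_pieces, offered, neighbor_pieces):
    """Local rarest-first sketch: of the offered pieces we still lack, pick the one
    held by the fewest of our neighbors."""
    counts = Counter(p for pieces in neighbor_pieces.values() for p in pieces)
    wanted = [p for p in offered if p not in my_pieces]
    return min(wanted, key=lambda p: counts[p], default=None)
```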

57 Performance of BitTorrent
Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks
– Demonstrated by simulation studies
– Confirmed by theoretical modeling studies
Intuition: in a random graph, Prob(Peer A's content is a subset of Peer B's) ≤ 50%

58 Lessons from BitTorrent
Often, randomized, simple algorithms perform better than elaborately designed deterministic algorithms

59 Problems with BitTorrent
ISPs are unhappy
– BitTorrent is notoriously difficult to "traffic engineer"
– ISPs: different links have different monetary costs
– BitTorrent: all peers are equal; choices are made based on measured performance, with no regard for the underlying ISP topology or preferences

60 BitTorrent and ISPs: Can They Play Together?
Current state of affairs: a clumsy coexistence
– ISPs "throttle" BitTorrent traffic along high-cost links
– Users suffer
Can they be partners?
– ISPs inform BitTorrent of their preferences
– BitTorrent schedules traffic in ways that benefit both users and ISPs

61 Random Neighbor Selection
Existing studies all assume random neighbor selection
– BitTorrent is no longer optimal if nodes in the same ISP only connect to each other
Random neighbor selection → high cross-ISP traffic
Q: Can we modify the neighbor selection scheme without affecting performance?

62 Biased Neighbor Selection
Idea: of N neighbors, choose N-k from peers within the same ISP, and choose k randomly from peers outside the ISP (sketched below)
[Figure: peers clustered within an ISP, with a few connections crossing the ISP boundary]
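A minimal sketch of the selection rule; the values of n and k and the candidates map are illustrative, and how the tracker learns ISP affiliations is discussed on the next slide.

```python
import random

def biased_neighbors(candidates, my_isp, n=35, k=1):
    """Biased neighbor selection sketch: of n neighbors, pick n-k from peers in the
    same ISP and k at random from outside it.

    candidates: dict mapping peer -> ISP (the tracker is assumed to know affiliations).
    n=35 and k=1 are illustrative; the point is simply that k is small.
    """
    inside  = [p for p, isp in candidates.items() if isp == my_isp]
    outside = [p for p, isp in candidates.items() if isp != my_isp]
    chosen  = random.sample(inside,  min(n - k, len(inside)))
    chosen += random.sample(outside, min(k, len(outside)))
    return chosen
```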

63 Implementing Biased Neighbor Selection
By the tracker
– Needs the ISP affiliations of peers: peer-to-AS maps, public IP address ranges from ISPs, or a special "X-" HTTP header
By traffic shaping devices
– Intercept peer-to-tracker messages and manipulate the responses
– No need to change the tracker or client

64 Evaluation Methodology
Event-driven simulator
– Uses actual client and tracker code as much as possible
– Calculates bandwidth contention, assuming perfect fair-share from TCP
Network settings
– 14 ISPs, each with 50 peers; 100 Kb/s upload, 1 Mb/s download per peer
– Seed node with 400 Kb/s upload
– Optional "university" nodes (1 Mb/s upload)
– Optional ISP bottleneck to other ISPs

65 Limitation of Throttling

66 Throttling: Cross-ISP Traffic
Redundancy: the average # of times a data chunk enters the ISP

67 Biased Neighbor Selection: Download Times

68 Biased Neighbor Selection: Cross-ISP Traffic

69 Importance of Rarest-First Replication
Random piece replication performs badly
– Increases download times by 84%-150%
– Increases traffic redundancy from 3 to 14
Biased neighbors + rarest-first → more uniform progress across peers

70 Biased Neighbor Selection: Single-ISP Deployment

71 Presence of External High-Bandwidth Peers
Biased neighbor selection alone:
– Average download time is the same as regular BitTorrent
– Cross-ISP traffic increases as the # of "university" peers increases (a result of tit-for-tat)
Biased neighbor selection + throttling:
– Download time only increases by 12%, since most neighbors do not cross the bottleneck
– Traffic redundancy (i.e. cross-ISP traffic) is the same as in the scenario without "university" peers

72 Comparison with Alternatives
Gateway peer: only one peer connects to peers outside the ISP
– The gateway peer must have high bandwidth; it is effectively the "seed" for this ISP
– Ends up benefiting peers in other ISPs
Caching:
– Can be combined with biased neighbor selection
– Biased neighbor selection reduces the bandwidth needed from the cache by an order of magnitude

73 Summary
By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP costs
– Biased neighbor selection: choose the initial set of neighbors well
– Can be combined with throttling and caching
=> P2P and ISPs can collaborate!

