Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 PASTRY. 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft.

Similar presentations


Presentation on theme: "1 PASTRY. 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft."— Presentation transcript:

1 1 PASTRY

2 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft Research) and Peter Druschel (Rice University), IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November, 2001 Pastry Homepage http://research.microsoft.com/en-us/um/people/antr/Pastry/default.htm Sources

3 Related work Chord [Sigcomm’01] CAN [Sigcomm’01] Tapestry [TR UCB/CSD-01-1141] PNRP [unpub.] Viceroy [PODC ’02] Kademlia [IPTPS ’02] Small World [Kleinberg ‘99, ‘00] Plaxton Trees [Plaxton et al. ‘97] Generalized Hypercube [Bhuyan et al. ‘84]

4 4 Pastry Generic p2p location and routing substrate (DHT) Self-organizing overlay network (join, departures, locality repair) Consistent hashing Lookup/insert object in < log 2 b N routing steps (expected) O(log N) per-node state Network locality heuristics Scalable, fault resilient, self-organizing, locality aware, secure

5 5 Pastry: Object distribution objId/key Consistent hashing 128 bit circular id space nodeIds (uniform random) objIds/keys (uniform random) Invariant: node with numerically closest nodeId maintains object nodeIds O 2 128 - 1

6 6 Pastry: Object insertion/lookup X Route(X) Msg with key X is routed to live node with nodeId closest to X Problem: complete routing table not feasible O 2 128 - 1

7 CMPT 880: P2P Systems - SFU7 Pastry Node Represented by 128-bit randomly chosen nodeId (Hash of IP or public key) NodeId is in base 2 b (b is a configuration parameter; b typical value 2 or 4) Evenly distributed nodeIds along the circular namespace (0-2 128 – 1 space). Routes a message in O(log N) steps to destination N: size of network Node state contains: Leaf Set ( L ) Routing table ( R ) Neighborhood Set ( M )

8 8 Pastry node state Leaf set: L/2 Numerically closest nodes ( L is a configuration parameter = 16, 32 typically ) Routing Table (Prefix- based) Neighborhood Set: M physically closest nodes

9 9 Pastry node state (Leaf Set) Serves as a fall back for routing table and contains: L/2 numerically closest and larger nodeIds L/2 numerically closest and smaller nodIds Size of L is typically 2 b or 2 x 2 b Nodes in L are numerically close (could be geographically diverse)

10 10 Pastry node state: Neighborhood set (M) Contains the IP addresses and nodeIds of closest nodes according to proximity metric Size of |M| is typically 2 b or 2x2 b Not used in routing, but instead for maintaining locality properties

11 11 Node state: Routing Table Matrix of Log 2 b N rows and 2 b – 1 columns ( N is the number of nodes in the network ) Entries in row n match the first n digits of current nodeId AND Column number follows matched digits: Format: matched digits–column number–rest of ID Log 2 b N populated on average

12 12 Node10233102 (2), (b = 2, l = 8) 0123 022121022230120331203203 113012331223020313021022 100312031013210210323302 10200230102113021022302 102303221023100010232121 1023300110233232 10233120

13 13 Pastry: Routing Tradeoff O(log N) routing table size 2 b * log 2 b N + 2l O(log N) message forwarding steps

14 Prefix Routing Node IDs and keys from randomized namespace (SHA-1) incremental routing towards destination ID each node has small set of outgoing routes log (n) neighbors per node, log (n) hops between any node pair  To: ABCE  ID: ABCE  A930  AB5F  ABC0

15 15 Pastry: Routing table (# 10233102 ) L nodes in leaf set log 2 b N Rows (actually log 2 b 2 128 = 128/b) 2 b columns L neighbors

16 16 D: Message Key L i : i th closest NodeId in leaf set shl(A, B): Length of prefix shared by nodes A and B R i j : (j, i) th entry of routing table (1) Node is in the leaf set (2) Forward message to a closer node (Better match) (3) Forward towards numerically Closer node (not a better match) Pastry: Routing procedure

17 17 Pastry: Routing procedure If (destination is within range of our leaf set) forward to numerically closest member else let l = length of shared prefix let d = value of l-th digit in D’s address if ( R l d exists) forward to R l d else forward to a known node (from ) that (a) shares at least as long a prefix (b) is numerically closer than this node

18 CMPT 880: P2P Systems - SFU18 If message with key D is within range of leaf set, forward to numerically closest leaf Else forward to node that shares at least one more digit with D in its prefix than current nodeId If no such node exists, forward to node that shares at least as many digits with D as current nodeId but numerically nearer than current nodeId Pastry: Routing procedure

19 19 Pastry: Routing Properties log 2 b N steps O(log N) state d46a1c Look for (d46a1c) d462ba d4213f d13da3 65a1fc d467c4 d471f1

20 20 Pastry: Locality properties Assumption: scalar proximity metric e.g. ping/RTT delay, # IP hops traceroute, subnet masks a node can probe distance to any other node Proximity invariant: Each routing table entry refers to a node close to the local node (in the proximity space), among all nodes with the appropriate nodeId prefix.

21 21 Pastry: Geometric Routing in proximity space d46a1c Route(d46a1c) d462ba d4213f d13da3 65a1fc d467c4 d471f1 d467c4 65a1fc d13da3 d4213f d462ba Proximity space  The proximity distance traveled by message in each routing step is exponentially increasing (entry in row l is chosen from a set of nodes of size N/2 bl )  The distance traveled by message from its source increases monotonically at each step (message takes larger and larger strides) NodeId space

22 22 Pastry: Locality properties Each routing step is local, but there is no guarantee of globally shortest path Nevertheless, simulations show: Expected distance traveled by a message in the proximity space is within a small constant of the minimum Among k nodes with nodeIds closest to the key, message likely to reach the node closest to the source node first

23 23 Pastry: Self-organization Initializing and maintaining routing tables and leaf sets Node addition Node departure (failure) The goal is to maintain all routing table entries to refer to a near node, among all live nodes with appropriate prefix

24 24 New node X contacts nearby node A A routes “ join ” message to X, which arrives to Z, closest to X X obtains leaf set from Z, i ’ th row for routing table from i ’ th node from A to Z X informs any nodes that need to be aware of its arrival X also improves its table locality by requesting neighborhood sets from all nodes X knows In practice: optimistic approach Pastry: Node addition

25 25 Pastry: Node addition X=d46a1c Route(d46a1c) d462ba d4213f d13da3 A = 65a1fc Z=d467c4 d471f1 New node: X=d46a1c A is X’s neighbor

26 26 d467c4 65a1fc d13da3 d4213f d462ba Proximity space Pastry: Node addition New node: d46a1c d46a1c Route(d46a1c) d462ba d4213f d13da3 65a1fc d467c4 d471f1 NodeId space X is close to A, B is close to B1. Why X is close to B1? The expected distance from B to its row one entries (B1) is much larger than the expected distance from A to B (chosen from exponentially decreasing set size) X B1 is first row of B

27 27 Node departure (failure) Leaf set repair (eager – all the time): Leaf set members exchange keep-alive messages request set from furthest live node in set Routing table repair (lazy – upon failure): get table from peers in the same row, if not found – from higher rows Neighborhood set repair (eager)

28 28 Pastry: Summary Generic p2p overlay network Scalable, fault resilient, self-organizing, secure O(log N) routing steps (expected) O(log N) routing table size Network locality properties


Download ppt "1 PASTRY. 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft."

Similar presentations


Ads by Google