Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
(Antony Rowstron and Peter Druschel)
Shariq Rizvi, First Year Graduate Student, CS Division, UC Berkeley

Sep 8, 2003, CS 294-4: Peer-to-Peer Systems

Pastry
- An overlay network that provides a self-organizing routing and location service (like Chord)
- Seeks to minimize the distance messages travel, measured by a scalar proximity metric such as the number of routing hops
- Expected number of routing steps is O(log N), where N is the number of Pastry nodes in the network

Outline
- Guarantees
- Design of Pastry
- Node state
- Routing algorithm
- Self-organization
- Addressing locality
- Experimental results
- Discussion

Guarantees
- NodeIds are assigned randomly from {0, ..., 2^128 - 1}
- b and |L| are configuration parameters
- Under normal conditions:
  1. A Pastry node can route to the node numerically closest to a given key in fewer than ⌈log_{2^b} N⌉ steps
  2. Despite concurrent node failures, delivery is guaranteed unless ⌊|L|/2⌋ or more nodes with adjacent NodeIds fail simultaneously
  3. Each node join triggers O(log_{2^b} N) messages

Pastry Node State
- Leaf set L: the |L|/2 nodes with the numerically closest smaller NodeIds and the |L|/2 nodes with the numerically closest larger NodeIds
- Neighborhood set M: the |M| "physically" closest nodes
- Routing table R: prefix-based routing entries

Routing Table
- NodeIds are expressed in base 2^b
- Several rows, one for each prefix of the local NodeId (log_{2^b} N rows populated on average)
- 2^b - 1 columns, one for each possible digit in the NodeId representation other than the local node's own digit at that position
- b defines the tradeoff: (log_{2^b} N) x (2^b - 1) routing table entries vs. log_{2^b} N routing hops
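
The tradeoff controlled by b can be made concrete with a little arithmetic. The sketch below simply evaluates the two expressions from the slide for a network of one million nodes; the function name `pastry_table_size` and the choice of N are illustrative assumptions, not part of Pastry.

```python
import math

def pastry_table_size(n_nodes, b):
    """Evaluate the slide's formulas for N nodes and b bits per digit.

    Returns (rows, entries), where rows = ceil(log_{2^b} N) also approximates
    the expected number of routing hops, and entries = rows * (2^b - 1).
    """
    base = 2 ** b
    rows = math.ceil(math.log(n_nodes, base))
    entries = rows * (base - 1)
    return rows, entries

# Illustrative network size (an assumption for this example).
N = 10 ** 6
for b in (1, 2, 4):
    rows, entries = pastry_table_size(N, b)
    print(f"b={b}: about {rows} hops, about {entries} routing table entries")
```

Larger b buys fewer hops at the cost of a larger table: with b = 4 the estimate is 5 hops and 75 entries, while b = 1 gives 20 hops and 20 entries.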
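
Routing Messages
Notation:
- D: message key
- L_i: i-th closest NodeId in the leaf set
- shl(A, B): length of the prefix shared by nodes A and B
- R_j^i: routing table entry at row j, column i
Cases of the routing procedure:
(1) Single hop: if D falls within the range of the leaf set, forward directly to the leaf whose NodeId is numerically closest to D
(2) Towards a better prefix match: otherwise, forward to the routing table entry that shares a longer prefix with D than the local node does
(3) Towards a numerically closer NodeId: if that entry is empty, forward to a known node whose prefix match with D is at least as long and whose NodeId is numerically closer to D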
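
The three routing cases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `NodeState`, `next_hop`, `digit`, `B` and `DIGITS` are hypothetical, NodeIds are tiny (4 digits in base 4) so the example is readable, wrap-around of the id space is ignored, and case (3) picks the numerically closest qualifying node although any qualifying node would do.

```python
from dataclasses import dataclass, field

B = 2        # bits per digit (assumed small for illustration)
DIGITS = 4   # digits per NodeId, so ids lie in [0, 2**(B * DIGITS))

def digit(node_id, i):
    """Return the i-th digit (0 = most significant) of node_id in base 2**B."""
    shift = B * (DIGITS - 1 - i)
    return (node_id >> shift) & ((1 << B) - 1)

def shl(a, b):
    """Length, in digits, of the prefix shared by ids a and b."""
    n = 0
    while n < DIGITS and digit(a, n) == digit(b, n):
        n += 1
    return n

@dataclass
class NodeState:
    node_id: int
    leaf_set: list = field(default_factory=list)
    routing_table: dict = field(default_factory=dict)  # (row, column) -> node id
    neighborhood: list = field(default_factory=list)

def next_hop(state, key):
    """Return the id to forward to, or state.node_id if the message stops here."""
    a = state.node_id
    if key == a:
        return a
    # Case (1): key within the leaf set range, go straight to the closest leaf.
    if state.leaf_set and min(state.leaf_set) <= key <= max(state.leaf_set):
        return min(state.leaf_set + [a], key=lambda n: abs(n - key))
    # Case (2): use the routing table entry that extends the shared prefix.
    l = shl(key, a)
    entry = state.routing_table.get((l, digit(key, l)))
    if entry is not None:
        return entry
    # Case (3): fall back to any known node that is no worse on prefix
    # length and strictly closer numerically (here: the closest such node).
    known = set(state.leaf_set) | set(state.routing_table.values()) | set(state.neighborhood)
    better = [t for t in known if shl(t, key) >= l and abs(t - key) < abs(a - key)]
    return min(better, key=lambda t: abs(t - key)) if better else a

def nid(s):
    """Parse a base-4 string such as '1023' into an integer id."""
    return int(s, 4)

node = NodeState(
    node_id=nid("1023"),
    leaf_set=[nid("1021"), nid("1022"), nid("1030"), nid("1031")],
    routing_table={
        (0, 0): nid("0212"), (0, 2): nid("2301"), (0, 3): nid("3120"),
        (1, 1): nid("1132"), (1, 2): nid("1203"),
    },
)
```

With this toy state, a key inside the leaf set range exercises case (1), a key such as base-4 3201 is forwarded through row 0 of the routing table (case 2), and a key such as 1300, whose row 1 entry is missing, falls through to case (3).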

Routing Performance: Intuition
- Case (1): single hop; routing terminates
- Case (2): the number of nodes whose NodeIds match the key's prefix up to the current length shrinks by a factor of 2^b
- Case (3): occurs with low probability and adds one hop

The Pastry API
Operations exported by Pastry:
- nodeId = pastryInit(Credentials, Application)
- route(msg, key)
Operations exported by the application working above Pastry:
- deliver(msg, key)
- forward(msg, key, nextId)
- newLeafs(leafSet)
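
To show how an application might plug into the callbacks listed above, here is a hypothetical Python rendering. Only the operation names and arguments come from the slide; the class names `PastryApplication` and `RecordingApp`, and the convention that `forward` returns a possibly modified `(msg, nextId)` pair, are assumptions made for this sketch.

```python
from abc import ABC, abstractmethod

class PastryApplication(ABC):
    """Callbacks an application exports to Pastry (hypothetical rendering)."""

    @abstractmethod
    def deliver(self, msg, key):
        """Called when this node is the destination for key."""

    def forward(self, msg, key, nextId):
        """Called before msg is forwarded to nextId.

        Assumption for this sketch: the application returns the message and
        next hop it wants used, unchanged by default.
        """
        return msg, nextId

    def newLeafs(self, leafSet):
        """Called when the local leaf set changes; default is to ignore it."""

class RecordingApp(PastryApplication):
    """Toy application that records what Pastry hands to it."""

    def __init__(self):
        self.delivered = []
        self.leaf_set = []

    def deliver(self, msg, key):
        self.delivered.append((key, msg))

    def newLeafs(self, leafSet):
        self.leaf_set = list(leafSet)

app = RecordingApp()
app.deliver("hello", 42)
print(app.delivered)
```

An application that replicates data on nearby nodes would typically care about `newLeafs`, since the leaf set tells it which nodes are numerically adjacent.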

Self-organization: Node Arrival
- Arriving node X knows a "nearby" node A
- X asks A to route a "join" message with key = NodeId(X)
- The message targets Z, the node whose NodeId is numerically closest to NodeId(X)
- All nodes along the path A, B, ..., Z send their state tables to X
- X initializes its state using this information
- X sends its state to "concerned" nodes

State Initialization
- X borrows A's neighborhood set
- X's leaf set is derived from Z's leaf set
- Row 0 of X's routing table (X_0) is set to row 0 of A's (A_0)
- X_1 is set to B_1, X_2 is set to C_2, and so on along the join path
[Figure: new node X and the join path A, B, C, ..., Z]
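
The initialization rule can be sketched as a small function over the join path. This is a simplified illustration: the function name `initialize_state` and the dictionary layout are assumptions, the leaf set is copied from Z as a starting point rather than adjusted, and the i-th node on the path is assumed to supply row i.

```python
def initialize_state(path):
    """Build the initial state of a joining node X from its join path.

    path: per-node state dicts for A, B, ..., Z in routing order. Each dict
    has 'neighborhood' (list), 'leaf_set' (list) and 'rows' (row index -> list).
    """
    if not path:
        raise ValueError("join path must contain at least one node")
    first, last = path[0], path[-1]
    rows = {}
    for i, node in enumerate(path):
        # Row i of X comes from row i of the i-th node on the path.
        if i in node["rows"]:
            rows[i] = list(node["rows"][i])
    return {
        "neighborhood": list(first["neighborhood"]),  # A is close to X
        "leaf_set": list(last["leaf_set"]),           # Z is numerically closest
        "rows": rows,
    }

# Toy join path with placeholder entries.
A = {"neighborhood": ["n1", "n2"], "leaf_set": ["a-"], "rows": {0: ["A0"], 1: ["A1"]}}
B = {"neighborhood": ["n3"], "leaf_set": ["b-"], "rows": {0: ["B0"], 1: ["B1"]}}
Z = {"neighborhood": ["n4"], "leaf_set": ["z1", "z2"], "rows": {0: ["Z0"], 2: ["Z2"]}}
print(initialize_state([A, B, Z]))
```

The neighborhood set comes from A because A is assumed to be near X in the proximity space, while the leaf set comes from Z because Z is nearest to X in the NodeId space.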
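
Self-organization: Node Failure
- A failure is detected when a live node tries to contact a failed node
- Updating the leaf set: get the leaf set from L_{-|L|/2} or L_{|L|/2}, the outermost leaf on the side of the failed node
- Updating the routing table: to repair entry R_l^d, ask another entry in the same row, R_l^i (i ≠ d), for its R_l^d
- Updating the neighborhood set: ask the live members of the set for their neighbors
- Leaf set repair is the source of the |L|/2 bound on simultaneously failed nodes with adjacent NodeIds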
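
Leaf set repair can be sketched as follows. This is an illustration under assumptions, not the paper's code: `repair_leaf_set` and the `fetch_leaf_set` callback (which stands in for a network request to the donor node) are hypothetical names, wrap-around of the id space is ignored, `half` is assumed to be at least 1, and the liveness check on newly learned nodes is omitted.

```python
def repair_leaf_set(node_id, leaf_set, failed, fetch_leaf_set, half):
    """Return a repaired leaf set after the node `failed` is found dead.

    half is |L|/2, the number of leaves kept on each side of node_id.
    """
    live = [n for n in leaf_set if n != failed]
    # Live leaves on the same side of node_id as the failed node.
    side = [n for n in live if (n < node_id) == (failed < node_id)]
    if not side:
        return live
    # Donor: the outermost live leaf on that side.
    donor = min(side) if failed < node_id else max(side)
    candidates = set(live) | set(fetch_leaf_set(donor))
    candidates.discard(node_id)
    candidates.discard(failed)
    smaller = sorted(n for n in candidates if n < node_id)[-half:]
    larger = sorted(n for n in candidates if n > node_id)[:half]
    return smaller + larger

# Toy example: node 50 loses leaf 55 and asks leaf 60 for its leaf set.
remote_leaf_sets = {60: [55, 58, 65, 70]}
repaired = repair_leaf_set(50, [40, 45, 55, 60], 55, remote_leaf_sets.get, half=2)
print(repaired)
```

The donor's leaf set overlaps the local one, so as long as some leaf on the affected side is still alive there is a candidate to fill the gap.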

Locality
- The application provides the "distance" function
- Invariant: "All routing table entries refer to a node that is near the present node, according to the proximity metric, among all live nodes with an appropriate prefix"
- The invariant is maintained during self-organization

Handling Malicious Nodes
- Problem: routing is deterministic, so repeated queries take the same route and can fail repeatedly at a malicious node
- Remedy: randomize the choice among multiple suitable next-hop candidates, with a bias towards the best one
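
One way to picture a biased random choice is a weighted draw over the suitable candidates. The slide does not say how the bias is chosen, so the geometric weights below (each candidate half as likely as the one ranked before it) and the name `pick_next_hop` are purely assumptions for illustration.

```python
import random

def pick_next_hop(candidates, rng=random):
    """Pick a next hop from candidates ordered best-first.

    Assumed weighting: rank 0 has weight 1, rank 1 has weight 1/2, and so on,
    so the best candidate is the most likely but not the only possible choice.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    weights = [2.0 ** -rank for rank in range(len(candidates))]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Usage: three suitable next hops, best first (ids are made up).
rng = random.Random(0)
hops = [pick_next_hop([10, 20, 30], rng) for _ in range(1000)]
print({c: hops.count(c) for c in (10, 20, 30)})
```

Because successive attempts may take different routes, a retry has a chance of bypassing a node that silently drops messages.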

Routing Performance
[Figure: routing performance. b = 4; |L| = 16; |M| = 32; 200,000 lookups; random end points]

Messaging Distance
[Figure: messaging distance. b = 4; |L| = 16; |M| = 32; 200,000 lookups; random end points]

Quality of Routing Tables
[Figure: quality of routing tables. b = 4; |L| = 16; |M| = 32; 5,000 new nodes]

Take-away
"Locality concerns added to O(log N) routing and self-organization"

Discussion
- When is the neighborhood set used?
- There is no equivalent of Chord's key migration when new nodes join. Will non-replicating file-sharing applications suffer?
- What is the role of the newLeafs(leafSet) operation?