PASTRY

Sources
Pastry paper: "Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems" by Antony Rowstron (Microsoft Research) and Peter Druschel (Rice University), IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November 2001.
Pastry homepage: http://research.microsoft.com/en-us/um/people/antr/Pastry/default.htm

Related work Chord [Sigcomm’01] CAN [Sigcomm’01] Tapestry [TR UCB/CSD-01-1141] PNRP [unpub.] Viceroy [PODC ’02] Kademlia [IPTPS ’02] Small World [Kleinberg ‘99, ‘00] Plaxton Trees [Plaxton et al. ‘97] Generalized Hypercube [Bhuyan et al. ‘84]

Pastry
Generic p2p location and routing substrate (DHT).
Self-organizing overlay network (join, departure, locality repair).
Consistent hashing.
Lookup/insert of an object in < log_{2^b} N routing steps (expected).
O(log N) per-node state.
Network locality heuristics.
Scalable, fault resilient, self-organizing, locality aware, secure.
Why do we say Pastry is secure?

Pastry: Object distribution
Consistent hashing over a 128-bit circular id space (0 to 2^128 - 1).
nodeIds (uniform random) and objIds/keys (uniform random).
Invariant: the node with the numerically closest nodeId maintains the object with that objId/key.
(Figure: nodeIds and keys placed around the circular id space.)
Each node has a randomly assigned 128-bit nodeId in a circular namespace. Basic operation: a message with key X, sent by any Pastry node, is delivered to the live node with nodeId closest to X in at most log_16 N steps (barring node failures). Pastry uses a form of generalized hypercube routing, where the routing tables are initialized and updated dynamically.
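As an illustration of the id space and the numerically-closest invariant, here is a minimal Python sketch; it is an assumption for exposition, not the authors' implementation, and the helper names (make_id, circular_distance, responsible_node) are hypothetical.

```python
# Minimal sketch: 128-bit circular id space and the "numerically closest nodeId"
# invariant. Hypothetical helpers, not Pastry's actual implementation.
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS          # ids live in [0, 2^128 - 1] and wrap around

def make_id(data: bytes) -> int:
    """Hash arbitrary data (IP address, public key, object name) to a 128-bit id."""
    return int.from_bytes(hashlib.sha1(data).digest()[:16], "big")

def circular_distance(a: int, b: int) -> int:
    """Numerical distance on the circular namespace."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def responsible_node(key: int, node_ids: list[int]) -> int:
    """The live node whose nodeId is numerically closest to the key stores the object."""
    return min(node_ids, key=lambda n: circular_distance(n, key))

if __name__ == "__main__":
    nodes = [make_id(f"node-{i}".encode()) for i in range(8)]
    key = make_id(b"some-object-name")
    print(f"key {key:032x} -> node {responsible_node(key, nodes):032x}")
```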

Pastry: Object insertion/lookup
A message with key X is routed to the live node with nodeId closest to X.
Problem: a complete routing table is not feasible.
(Figure: Route(X) travels around the circular id space from the sender to the node whose nodeId is closest to X.)

Pastry Node
Represented by a 128-bit, randomly chosen nodeId (a hash of the node's IP address or public key).
The nodeId is interpreted in base 2^b (b is a configuration parameter, typically 2 or 4).
nodeIds are evenly distributed along the circular namespace (0 to 2^128 - 1).
Routes a message in O(log N) steps to its destination (N: size of the network).
Node state contains: Leaf Set (L), Routing Table (R), Neighborhood Set (M).
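To make the node state concrete, a hedged sketch of the three data structures follows; the field names, container choices, and the parameter values shown (b = 4, L = 16) are illustrative assumptions, not Pastry's actual code.

```python
# Illustrative sketch of per-node state (leaf set, routing table, neighborhood set).
# Names, sizes, and container choices are assumptions, not Pastry's actual code.
from dataclasses import dataclass, field

B = 4                        # configuration parameter b; digits are base 2^b = 16
ID_DIGITS = 128 // B         # a 128-bit nodeId has 128/b digits
L = 2 ** B                   # leaf set size (typically 2^b or 2 * 2^b)

@dataclass
class PastryNodeState:
    node_id: int
    # L/2 numerically smaller and L/2 numerically larger nodeIds.
    leaf_smaller: list[int] = field(default_factory=list)
    leaf_larger: list[int] = field(default_factory=list)
    # routing_table[row][col] -> nodeId or None; row = shared-prefix length,
    # col = next digit of the stored nodeId (the own-digit column stays unused).
    routing_table: list[list[int | None]] = field(
        default_factory=lambda: [[None] * (2 ** B) for _ in range(ID_DIGITS)]
    )
    # |M| physically closest nodes; used for locality maintenance, not for routing.
    neighborhood: list[int] = field(default_factory=list)
```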

Pastry node state
Leaf set (L): the numerically closest nodes, L/2 smaller and L/2 larger (L is a configuration parameter, typically 16 or 32).
Routing table (R): prefix-based.
Neighborhood set (M): the physically closest nodes.

Pastry node state: Leaf Set (L)
Serves as a fallback for the routing table ("fallback" here meaning a helper the node can rely on) and contains:
the L/2 numerically closest larger nodeIds, and
the L/2 numerically closest smaller nodeIds.
The size of L is typically 2^b or 2 x 2^b.
Nodes in the leaf set are numerically close, but may be geographically diverse.
Whenever the object key falls within the leaf set, routing can stop.

Pastry node state: Neighborhood Set (M)
Contains the IP addresses and nodeIds of the closest nodes according to the proximity metric.
|M| is typically 2^b or 2 x 2^b.
Not used in routing; used instead for maintaining locality properties.

Node state: Routing Table
A matrix of log_{2^b} N rows and 2^b - 1 columns (N is the number of nodes in the network).
Entries in row n share the first n digits with the current nodeId (prefix matched from left to right).
The column number is the digit that follows the matched prefix; each entry has the form: matched digits - column number - rest of the nodeId.
On average, log_{2^b} N rows are populated.
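A small sketch of the "row = length of shared prefix, column = next digit" rule; taking digits most-significant-first is an assumption consistent with prefix routing, and the helper names are hypothetical.

```python
# Sketch: locating the routing-table cell consulted for key D at node A.
# Digits are taken most-significant first; helper names are illustrative.
B = 4                       # configuration parameter b (base-16 digits when b = 4)
ID_DIGITS = 128 // B

def digit(node_id: int, i: int) -> int:
    """i-th base-2^b digit of a 128-bit id, counting from the most significant."""
    shift = (ID_DIGITS - 1 - i) * B
    return (node_id >> shift) & ((1 << B) - 1)

def shared_prefix_length(a: int, b: int) -> int:
    """Number of leading base-2^b digits that a and b share (shl in the paper)."""
    n = 0
    while n < ID_DIGITS and digit(a, n) == digit(b, n):
        n += 1
    return n

# The cell consulted for key D at node A is:
#   row = shared_prefix_length(A, D), column = digit(D, row)
```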

Example: routing table of node 10233102 (b = 2, L = 8). Row n holds nodeIds whose first n digits match 10233102; the column is the value of the next digit (the column for the node's own digit is left empty):
Row 0: 02212102, 22301203, 31203203
Row 1: 11301233, 12230203, 13021022
Row 2: 10031203, 10132102, 10323302
Row 3: 10200230, 10211302, 1022302
Row 4: 10230322, 10231000, 10232121
Row 5: 10233001, 10233232
Row 6: 10233120

Pastry: Routing Tradeoff
O(log N) routing table size: 2^b * log_{2^b} N + 2L entries.
O(log N) message forwarding steps.
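A quick, illustrative back-of-the-envelope check of this tradeoff (the network size and parameter values are assumptions):

```python
# Illustrative arithmetic for the table-size / hop-count tradeoff (assumed values).
import math

N = 10 ** 6                  # assumed network size
b = 4                        # base-16 digits
L = 2 ** b                   # assumed leaf set size

rows = math.ceil(math.log(N, 2 ** b))        # populated rows: about log_{2^b} N
table_entries = (2 ** b) * rows + 2 * L      # ~ 2^b * log_{2^b} N + 2L
print(rows, table_entries)                   # -> about 5 hops, 112 entries here
```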

Prefix Routing
Node IDs and keys come from a randomized namespace (SHA-1).
Routing proceeds incrementally towards the destination ID; each node has a small set of outgoing routes.
log(n) neighbors per node, log(n) hops between any node pair.
(Figure: a message addressed to ABCE is forwarded through nodes A930, AB5F, and ABC0, each hop matching a longer prefix of the destination.)

Pastry: Routing table (of node 10233102)
L nodes in the leaf set.
log_{2^b} N rows (at most log_{2^b} 2^128 = 128/b).
2^b columns.
L neighbors in the neighborhood set.
(Here b = 2, l = 2^b = 4.)

Pastry: Routing procedure
(1) The destination is in the leaf set: deliver directly.
(2) Forward the message to a closer node (a better prefix match).
(3) Forward towards a numerically closer node (not a better match).
Notation: A is the current node; D is the message key; L_i is the i-th closest nodeId in the leaf set; shl(A, B) is the length of the prefix shared by nodes A and B; R_{l,d} is the routing table entry for row l and column d.

Pastry: Routing procedure
if (D is within the range of our leaf set)
    forward to the numerically closest leaf set member
else
    let l = length of the prefix shared by D and the current nodeId
    let d = value of the l-th digit of D
    if (R_{l,d} exists)
        forward to R_{l,d}
    else
        forward to a known node (from L ∪ R ∪ M) that
        (a) shares at least as long a prefix with D, and
        (b) is numerically closer to D than the current node

Pastry: Routing procedure
If a message with key D is within the range of the leaf set, forward it to the numerically closest leaf.
Else, forward it to a node that shares at least one more digit with D in its prefix than the current nodeId does.
If no such node exists, forward it to a node that shares at least as many digits with D as the current nodeId but is numerically closer to D.
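The routing procedure above can be rendered as a single decision function. The sketch below is a hedged Python rendering under assumed parameters (b = 4); the leaf-set range test ignores wraparound, and all names are hypothetical rather than the authors' code.

```python
# Hedged sketch of the routing decision for a message with key D at node A.
# Parameters and helper names are assumptions; this mirrors the pseudocode above.
B = 4
ID_DIGITS = 128 // B
ID_SPACE = 1 << 128

def digit(x: int, i: int) -> int:
    """i-th base-2^b digit of x, most significant first."""
    return (x >> ((ID_DIGITS - 1 - i) * B)) & ((1 << B) - 1)

def shl(a: int, b: int) -> int:
    """Length (in digits) of the prefix shared by a and b."""
    n = 0
    while n < ID_DIGITS and digit(a, n) == digit(b, n):
        n += 1
    return n

def dist(a: int, b: int) -> int:
    """Numerical distance on the circular id space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def route(A: int, D: int, leaf_set: list[int],
          table: list[list[int | None]], neighborhood: list[int]) -> int | None:
    """Return the next hop for key D, or None if node A should deliver it itself."""
    if A == D:
        return None

    # (1) D falls within the leaf set's range (wraparound ignored in this sketch):
    #     deliver to the numerically closest member, which may be A itself.
    if leaf_set and min(leaf_set) <= D <= max(leaf_set):
        closest = min(leaf_set + [A], key=lambda n: dist(n, D))
        return None if closest == A else closest

    # (2) Routing table: forward to a node that shares one more digit with D than A.
    row = shl(A, D)
    col = digit(D, row)
    if table[row][col] is not None:
        return table[row][col]

    # (3) Rare case: forward to any known node (from the leaf set, routing table,
    #     or neighborhood set) that shares at least as long a prefix with D and is
    #     numerically closer to D than A.
    known = set(leaf_set) | set(neighborhood) | {e for r in table for e in r if e}
    for t in known:
        if shl(t, D) >= row and dist(t, D) < dist(A, D):
            return t
    return None  # A knows no better node; deliver locally
```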

Pastry: Routing
Properties: log_{2^b} N steps, O(log N) state.
(Figure: a message with key d46a1c sent from node 65a1fc is forwarded via d13da3, d4213f, and d462ba towards the node numerically closest to d46a1c; nodes d467c4 and d471f1 are also shown.)

Pastry: Locality properties
Assumption: a scalar proximity metric, e.g., ping/RTT delay, number of IP hops (traceroute), or subnet masks; a node can probe its distance to any other node.
Proximity invariant: each routing table entry refers to a node that is close to the local node (in the proximity space), among all nodes with the appropriate nodeId prefix.
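One plausible way to obtain such a scalar metric is to time a small probe; the sketch below is purely illustrative (a TCP connect-time estimate), not the measurement method used by the authors.

```python
# Illustrative proximity probe: estimate the distance to a peer via TCP connect time.
# Pastry only assumes *some* scalar metric (RTT, IP hops, ...); this is one option.
import socket
import time

def probe_rtt(host: str, port: int, timeout: float = 1.0) -> float:
    """Rough round-trip estimate in seconds (connection setup time)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start

def closest_candidate(candidates: list[tuple[str, int]]) -> tuple[str, int]:
    """Pick the measurably closest candidate, as the proximity invariant asks
    for each routing-table slot."""
    return min(candidates, key=lambda hp: probe_rtt(*hp))
```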

Pastry: Geometric routing in the proximity space
(Figure: the route towards key d46a1c shown both in the nodeId space (65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1) and in the proximity space.)
The proximity space is the geometric space defined by the proximity metric. The proximity distance traveled by the message in each routing step increases exponentially: the entry in row l is chosen from a set of roughly N / 2^(b*l) nodes, which shrinks as l grows, so the expected distance to the chosen entry grows. The distance of the message from its source increases monotonically at each step (the message takes larger and larger strides).

Pastry: Locality properties
Each routing step is local, but there is no guarantee of a globally shortest path.
Nevertheless, simulations show:
The expected distance traveled by a message in the proximity space is within a small constant factor of the minimum.
Among the k nodes with nodeIds closest to the key, the message is likely to reach the node closest to the source node first.

Pastry: Self-organization
Initializing and maintaining routing tables and leaf sets under:
node addition, and
node departure (failure).
The goal is to keep every routing table entry referring to a nearby node, among all live nodes with the appropriate prefix.

Pastry: Node addition
The new node X contacts a nearby node A.
A routes a "join" message with key X, which arrives at Z, the live node with nodeId numerically closest to X.
X obtains its leaf set from Z and the i-th row of its routing table from the i-th node encountered along the route from A to Z.
X informs any nodes that need to be aware of its arrival.
X also improves the locality of its table by requesting neighborhood sets from all nodes it knows.
In practice, an optimistic approach is used.
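A hedged sketch of how X could assemble its initial state from the join path, assuming the node-state layout sketched earlier; the message plumbing is omitted and all names are hypothetical.

```python
# Sketch: how the joining node X assembles its initial state from the join path.
# `path` is the list of nodes the join message visited: A = path[0], ..., Z = path[-1].
# Each element is assumed to expose the state sketched earlier; RPC details omitted.

def initialize_state(X, path):
    A, Z = path[0], path[-1]
    # Leaf set comes from Z, the node numerically closest to X.
    X.leaf_smaller = list(Z.leaf_smaller)
    X.leaf_larger = list(Z.leaf_larger)
    # Row i of X's routing table comes from the i-th node along the route,
    # which already shares an i-digit prefix with X.
    for i, hop in enumerate(path):
        if i < len(X.routing_table):
            X.routing_table[i] = list(hop.routing_table[i])
    # Neighborhood set comes from A, the nearby node X contacted first.
    X.neighborhood = list(A.neighborhood)
    # Finally, X notifies every node that should now appear in its state so they
    # can insert X into their own tables, and later refines locality by asking
    # the nodes it knows for their neighborhood sets (details omitted).
```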

Pastry: Node addition
(Figure: the new node X = d46a1c contacts its nearby neighbor A = 65a1fc; the join message Route(d46a1c) is forwarded via d13da3, d4213f, and d462ba towards Z = d467c4, the node numerically closest to X; d471f1 is also shown.)

Pastry: Node addition
(Figure: the join route for the new node X = d46a1c shown both in the nodeId space (65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1) and in the proximity space. B1 denotes the first row of node B's routing table.)
X is close to A, and B's row-one entries (B1) are close to B. Why is X also close to B1? Because the expected distance from B to its row-one entries is much larger than the expected distance from A to B (those entries are chosen from sets whose size decreases exponentially with the row number), so the small extra distance from X to B hardly matters.

Node departure (failure)
Leaf set repair (eager, performed all the time): leaf set members exchange keep-alive messages; when a member fails, request the leaf set from the farthest live node on the failed node's side.
Routing table repair (lazy, upon failure): request a replacement entry from peers in the same row; if none is found, from nodes in higher rows.
Neighborhood set repair (eager).
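A hedged sketch of the eager leaf-set repair described above, again assuming the node-state layout sketched earlier; fetch_leaf_set stands in for an RPC and wraparound of the id space is ignored.

```python
# Sketch: eager leaf-set repair after a member stops answering keep-alives.
# `fetch_leaf_set(node_id)` stands in for an RPC returning that node's leaf set;
# `half` is L/2; wraparound of the circular id space is ignored in this sketch.

def repair_leaf_set(node, failed_id: int, fetch_leaf_set, half: int = 8):
    side = node.leaf_larger if failed_id in node.leaf_larger else node.leaf_smaller
    if failed_id not in side:
        return
    side.remove(failed_id)
    if not side:
        return
    # Ask the farthest live node on the failed member's side for its leaf set ...
    farthest_live = max(side, key=lambda n: abs(n - node.node_id))
    for candidate in fetch_leaf_set(farthest_live):
        # (A real implementation would also filter candidates to the correct side.)
        if candidate != node.node_id and candidate not in side:
            side.append(candidate)
    # ... and keep only the `half` numerically closest entries on that side.
    side.sort(key=lambda n: abs(n - node.node_id))
    del side[half:]
```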

Pastry: Summary Generic p2p overlay network Scalable, fault resilient, self-organizing, secure O(log N) routing steps (expected) O(log N) routing table size Network locality properties