Download presentation
Presentation is loading. Please wait.
Published byJocelyn Strickland Modified over 8 years ago
1
1 Tuesday, February 03, 2009 “Work expands to fill the time available for its completion.” - Parkinson’s 1st Law
2
2 Introduction Coupling Loose | Tight SOA Purpose and Problem History P2P Design Characteristics Overlay routing vs. IP routing Dangers and Attacks Current Peer-Peer Concerns Recap
3
3 Introduction Napster and its legacy –self study Peer-to-Peer middleware –self study Routing overlays Overlay case studies: Pastry, Tapestry Application case studies: Squirrel, OceanStore, Ivy Summary Where are we ?
4
Routing Overlay In P2P we cannot maintain the database at all the client nodes, giving the location of all the resources Resource location knowledge must be partitioned and distributed Each node is made responsible for maintaining detailed knowledge of the locations of nodes and objects in a portion of the namespace As well as general knowledge of the topology of the entire name space High degree of replication of this knowledge is necessary to ensure dependability in the face of the volatile availability of hosts and intermittent network connectivity. A distributed algorithm known as routing overlays takes responsibility for locating nodes and objects
5
Distribution of information in a routing overlay Routing overlay takes the responsibility for locating nodes and objects A’s D’s B’sC’s
6
Routing Overlay Ensures that any node can access any object by routing each request through a sequence of nodes, exploiting knowledge at each of them to locate the destination object using pure names It also maintains the knowledge of location of all the replicas of the object and deliver request to nearest live node GUID is computed from all or part of the state using a hash function with very high probability, unique. Uniqueness is verified by searching for another object with the same GUID.
7
Tasks of Routing Overlay Main task of Routing Overlay Client wishing to invoke an operation on an object submits a request including the object’s GUID to the routing overlay, which routes the request to a node at which a replica of the object resides Other task of Routing Overlay Node wishing to make new object available to a P2P service computes a GUID for the object and announces it to the routing overlay, which then ensures that the object is reachable by all other clients. When clients request the removal of the object from the service the routing overlay must make them unavailable. Nodes may join or leave the service, when a node joins the service, the routing overlay arranges for it to assume some of the responsibilities of other nodes. When a node leaves its responsibilities are distributed amongst the other nodes.
8
Basic programming interface for a distributed hash table (DHT) as implemented by the PAST API over Pastry put(GUID, data) The data is stored in replicas at all nodes responsible for the object identified by GUID. remove(GUID) Deletes all references to GUID and the associated data. value = get(GUID) The data associated with GUID is retrieved from one of the nodes responsible for it.
9
Basic programming interface for distributed object location and routing (DOLR) as implemented by Tapestry publish(GUID) GUID can be computed from the object (or some part of it, e.g. its name). This function makes the node performing a publish operation by the host for the object corresponding to GUID. unpublish(GUID) Makes the object corresponding to GUID inaccessible. sendToObj(msg, GUID, [n]) Following the object-oriented paradigm, an invocation message is sent to an object in order to access it. This might be a request to open a TCP connection for data transfer or to return a message containing all or part of the object’s state. The final optional parameter [n], if present, requests the delivery of the same message to n replicas of the object.
10
Pastry Overlay Case Study: Pastry It is a message routing infrastructure deployed in several applications All the nodes & objects are assigned 128-bit GUIDs For nodes: computed by applying a secure hash function to public key of the node For objects: computed by applying a secure hash function to the objects name or some part of its stored state Resulting GUIDs are randomly distributed in the range 0 to 2 128 -1 Clashes between GUIDs for different nodes or objects are extremely unlikely, still pastry can detect & mange this unlikely event In a network with N participating nodes the Pastry algo will correctly route a message addressed to any GUID in O(log N) steps If GUID identifies a node which is active, message is delivered to that node otherwise delivered to a active node with closet numeric GUID Prefix routing
11
Pastry Routing steps involve the use of an underlying transport protocol (normally UDP) to transfer the message to a Pastry node that is closer to its destination Closeness in pastry refers to an entirely artificial space – the space of GUIDs Real transport of message across internet between two pastry nodes may require lots of IP hops. For better path option, pastry uses locality metric on network distance in the underlying network (hop count, two way latency) to select appropriate neighbors when setting up the routing tables used at each node Pastry id fully self organizing New nodes get info form neighbors to construct the table Nodes can detect the absence of the node and can update the table Classesless Interdomain Routing for IP Packets
12
Pastry: Routing Algo Explanation in two stages Stage 01: simplified form of the algo which routes messages correctly but inefficiently without a routing table Each active node stores a leaf set –a vector L (of size 2l) containing the GUIDs and IP addresses of the nodes whose GUIDs are numerically closest on either side of its own. Leaf sets are maintained by Pastry as nodes join and leave. Stage 02: full routing algo which routes request to any node in O(log N) messages NUPastery library provides an implementation of Pastry algorithm Overlay Weaver toolkit provides multiple routing algorithms, Chord, Kademlia, Koorde, Pastry and Tapestry
13
Figure 10.6: Circular routing alone is correct but inefficient Based on Rowstron and Druschel [2001] The dots depict live nodes. The space is considered as circular: node 0 is adjacent to node (2128-1). The diagram illustrates the routing of a message from node 65A1FC to D46A1C using leaf set information alone, assuming leaf sets of size 8 (l=4). This is a degenerate type of routing that would scale very poorly; it is not used in practice.
14
Pastry: Routing Algo Each pastry node maintains a tree structured routing table giving GUIDs and IP addresses for a set of nodes spread through out the entire range of 2 128 possible values, with increased density of coverage for GUID numerically close to its own
15
Routing tables structured GUIDs are viewed as hexadecimal values IP addresses for set of nodes spread hexadecimal values & tables classifies GUIDs based on their hexadecimal prefixes throughout the entire range of 2 128 possible GUID values, with increased density of coverage for GUID’s numerically close to its own. Tables has as many rows as there are hexadecimal digits in a GUID, so for our prototype there are 128/4 = 32 rows Each row contains 15 entries Each entry in the table points to one of the potentially many nodes whose GUIDs have the relevant prefix.
16
Figure 10.7: First four rows of a Pastry routing table
17
Figure 10.8: Pastry routing example Based on Rowstron and Druschel [2001]
18
Figure 10.9: Pastry’s routing algorithm
19
Reading Assignment 19 Reading Host integration Host Failure or Departure
20
Pastry’s routing algorithm Algo will succeed in delivering the message M to its destination cause lines 1,2 & 7 They perform action as described in stage 01 The remaining steps are designed to improve the algorithm's performance by reducing the numbers of hops required
21
Host integration Compute GUID Contact nearby (ref. to network distance)pastry node New node X sends join request to node A Node will forward through pastry routing algorithm this join request to node which is closest to A in terms of GUID address, let say node Z A, Z and all the nodes (B, C, …) through which the join message is routed on its way to Z add additional step to normal pastry routing algo. This result in transmission of the relevant parts of their respective leaf set to X X examine them and constructs its own Routing table
22
Host Failure or Departure Pastry node is considered failed when its immediate neighbors in GUID space can no longer communicate with it. So repair a leaf containing info of such node is necessary. For leaf set L repair, node will look for another closest live node and request for its leaf set L’. L’ will contain a sequence of GUIDs that partly overlap in L, including the appropriate value to replace the failed node. Other nodes then informed about the failure and they too also perform the same operation
23
Other issues in pastry Locality Highly redundant Proximity Neighbor selection algo Locality metric – number of IP hops or measured latency 30 to 50 percent longer then optimum path Fault Tolerance Heartbeat messages are sent by all nodes to nearest neighbours Failed node info cannot be sent sufficiently rapidly Nor does it account for malicious nodes Message delivery guarantee o some extent is achieved through mechanism of retransmission of message through slightly different less optimal route Dependability MSPastry – adopts acknowledgement & retransmission as well
24
Performance Evaluation of MSPastry With assumed IP message loss rate 0% MSPastry failed ti deliver 1.5 in 100,000 requests, all message arrived at correct node With assumed IP message loss rate 5% MSPastry failed ti deliver 3.3 in 100,000 requests, and 1.6 were delivered at wrong node Performance overhead of overlay MSPastry algo is 2.2 to 5 percent Extra message for leafset maintenance and initially at setting up the routing table
25
RT updates Node failure or departure Node is considered failed when its immediate neighbors are unable to contact Node which discovers the failure of the node, looks for the next nearest live node, and request for its leaf set This leaf set will contain the overlapping info of failed node leaf set. Discovering node will choose the best node from this leaf set to replace the failed node.
26
Self study Locality Fault tolerance Dependability – MS Pastry Evaluation of MS Pastry
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.