1 A Scalable Content-Addressable Network. Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker. Pirammanayagam Manickavasagam.


2 Overview
- Introduction
- Design
- Design Improvements
- Design Review
- Related Work
- Discussion

3 Introduction
Hash table functionality:
- Maps a key to a value.
Content-Addressable Network (CAN):
- A distributed infrastructure that provides hash-table-like functionality at Internet-like scale.
Characteristics:
- Scalable, fault-tolerant, and completely self-organizing.

4 Introduction (cont.)
Napster
- Locating a file is centralized.
Gnutella
- Floods the request for a file, so it does not scale.
CAN provides a solution:
- Scalable: each node maintains only a small amount of control state.
- Distributed: the hash table is stored across all peers.

5 Design
- Each node stores a chunk (zone) of the hash table, plus details of the adjacent zones.
- Requests are forwarded toward the CAN node whose zone contains the key.
- Indexing uses a virtual d-dimensional Cartesian coordinate space.
- The coordinates are purely logical.

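6 Coordinate Space
[Figure: 2-d coordinate space with corners (0,0), (1,0), and (0,1), partitioned into zones A, B, C, and D]
- Each node randomly picks a point in the coordinate space.
- The coordinate space is dynamically partitioned among the nodes.
- Each node owns its own zone.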

7 Design (cont.)
Inserting a pair (key K1, value V1):
- A hash function maps K1 to a point P1 in the coordinate space.
- The pair is then stored at the node that owns the zone containing P1.
Retrieving a value:
- The requester applies the same hash function to the key to find the point, and hence the node, that holds the value.
- Each node learns and maintains a table with details of its adjacent nodes.
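The insert and retrieve steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-zone layout `ZONES`, the use of SHA-256 as the hash function, and the helper names `key_to_point`, `owner`, `insert`, and `retrieve` are all hypothetical, and the zone owner is found by a direct scan rather than by CAN routing.

```python
import hashlib

def key_to_point(key, d=2):
    """Deterministically map a key to a point in the unit d-cube [0, 1)^d."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest per dimension (32 bytes total, so d <= 8).
    return tuple(
        int.from_bytes(digest[4 * i: 4 * i + 4], "big") / 2**32
        for i in range(d)
    )

def owner(point, zones):
    """Return the node whose zone [lo, hi) contains the point."""
    for node, (lo, hi) in zones.items():
        if all(l <= x < h for l, x, h in zip(lo, point, hi)):
            return node
    raise LookupError("no zone covers the point")

# Hypothetical 2-d space split evenly among four nodes.
ZONES = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 0.5), (1.0, 1.0)),
}
store = {name: {} for name in ZONES}  # per-node (key, value) tables

def insert(key, value):
    """Store the pair at the node that owns the zone containing hash(key)."""
    store[owner(key_to_point(key), ZONES)][key] = value

def retrieve(key):
    """Hash the key again to find the same node, then read the value."""
    return store[owner(key_to_point(key), ZONES)][key]
```

Because the hash is deterministic, `retrieve("K1")` looks in the same node's table that `insert("K1", "V1")` wrote to.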

8 Routing
Information needed for routing:
- Each CAN node holds a routing table containing the IP address and virtual coordinate zone of each of its neighbors.
- Two nodes are neighbors if their coordinate spans overlap along d-1 dimensions and abut along one dimension.
- In a d-dimensional coordinate space, each node maintains 2d neighbors.
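The neighbor rule above can be written as a small predicate. This is a sketch, assuming zones are axis-aligned boxes given as `(lo, hi)` corner tuples and ignoring wraparound at the edges of the space; `are_neighbors` is a hypothetical helper name.

```python
def are_neighbors(zone_a, zone_b):
    """True if the zones overlap along d-1 dimensions and abut along exactly one."""
    (lo_a, hi_a), (lo_b, hi_b) = zone_a, zone_b
    abut = 0
    overlap = 0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        if ha == lb or hb == la:
            abut += 1          # spans touch end to end in this dimension
        elif la < hb and lb < ha:
            overlap += 1       # spans share an interval in this dimension
    return abut == 1 and overlap == len(lo_a) - 1
```

Two zones that only touch at a corner abut in two dimensions, so the predicate rejects them.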

9 [Figure] In the figure, nodes 5 and 1 are neighbors: node 5 spans the same Y range as node 1, and its X range abuts node 1's.

10 Routing (cont.)
- Each CAN message carries its destination coordinates.
- Routing proceeds by simple greedy forwarding to the neighbor closest to the destination.
- Average path length = $(d/4)\,n^{1/d}$ hops, where n is the number of zones.
- Because many paths are available, the network keeps routing even if some nodes fail.
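Greedy forwarding and the path-length formula can be checked on a toy model. The sketch below assumes a perfectly even partitioning, where the n zones form a k x ... x k grid that wraps around at the edges (a torus), and it measures closeness with grid (Manhattan) distance rather than distance in the continuous coordinate space. The function names are hypothetical.

```python
import itertools

def torus_dist(a, b, k):
    """Manhattan distance between grid cells a and b on a k-wide torus."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))

def neighbors(cell, k):
    """The 2d neighbors of a cell: one step back or forward along each dimension."""
    for i in range(len(cell)):
        for step in (-1, 1):
            yield cell[:i] + ((cell[i] + step) % k,) + cell[i + 1:]

def greedy_route(src, dst, k):
    """Forward hop by hop to whichever neighbor is closest to the destination."""
    path = [src]
    while path[-1] != dst:
        path.append(min(neighbors(path[-1], k),
                        key=lambda c: torus_dist(c, dst, k)))
    return path

def average_hops(k, d):
    """Average greedy path length over all (source, destination) pairs."""
    cells = list(itertools.product(range(k), repeat=d))
    total = sum(len(greedy_route(s, t, k)) - 1 for s in cells for t in cells)
    return total / len(cells) ** 2
```

For d = 2 and k = 8 there are n = 64 zones, and `average_hops(8, 2)` evaluates to 4.0, which matches (d/4) n^(1/d) = (2/4) * 8.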

11 Construction
1. First, the new node must find a node already in the CAN.
2. Next, using the CAN routing mechanisms, it must find a node whose zone will be split.
3. Finally, the neighbors of the split zone must be notified so that routing can include the new node.

12 Bootstrap
- A CAN has an associated DNS domain name, which resolves to the IP address of one or more bootstrap nodes.
- A bootstrap node maintains a partial list of the CAN nodes it believes are currently in the system.
- To join a CAN, a new node looks up the CAN domain name in DNS to retrieve a bootstrap node's IP address.
- The bootstrap node then supplies the IP addresses of several randomly chosen nodes currently in the system.

13 Finding a zone
- The new node randomly chooses a point P in the space.
- It sends a JOIN request destined for P into the CAN via an existing CAN node.
- The current occupant of the zone containing P splits its zone in half and assigns one half to the new node.
- Splits follow a fixed ordering of the dimensions, e.g., in 2-d a zone is split first along X and then along Y.
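The split step can be sketched as follows. The assumptions here are that a zone is a box given by `(lo, hi)` corner tuples and that each zone remembers how many splits produced it (`depth`), so the split axis cycles X, Y, X, ... in 2-d; `split_zone` and `depth` are hypothetical names.

```python
def split_zone(lo, hi, depth):
    """Halve a zone along axis (depth mod d); return (kept_half, new_node_half)."""
    axis = depth % len(lo)
    mid = (lo[axis] + hi[axis]) / 2
    kept_hi = hi[:axis] + (mid,) + hi[axis + 1:]   # occupant keeps the lower half
    given_lo = lo[:axis] + (mid,) + lo[axis + 1:]  # new node gets the upper half
    return (lo, kept_hi), (given_lo, hi)

# First join splits the whole space along X, the next split of that zone is along Y.
kept, given = split_zone((0.0, 0.0), (1.0, 1.0), depth=0)
kept2, given2 = split_zone(*kept, depth=1)
```

Which half the occupant keeps is an arbitrary choice in this sketch.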

14 Maintenance
- Departure of a node
- Single node failure
- Multiple failures

15 Departure of a Node
- A departing node hands over its zone and the associated (key, value) entries to one of its neighbors.
- If the zone of one of the neighbors can be merged with the departing node's zone to produce a valid single zone, then this is done.
- If not, the zone is handed to the neighbor whose current zone is smallest, and that node temporarily handles both zones.
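The handover rule above can be sketched as a merge test followed by a fallback. This is a simplification, assuming zones are non-overlapping boxes given as `(lo, hi)` corner tuples and that "valid single zone" means only that the union of the two zones is itself a box; the helper names are hypothetical.

```python
from math import prod

def volume(zone):
    """Volume of a box-shaped zone."""
    lo, hi = zone
    return prod(h - l for l, h in zip(lo, hi))

def merged(zone_a, zone_b):
    """Return the merged box if the two zones together form one box, else None."""
    (lo_a, hi_a), (lo_b, hi_b) = zone_a, zone_b
    box = (tuple(map(min, lo_a, lo_b)), tuple(map(max, hi_a, hi_b)))
    # Non-overlapping zones fill their bounding box exactly when volumes add up.
    fills = abs(volume(box) - volume(zone_a) - volume(zone_b)) < 1e-12
    return box if fills else None

def hand_over(departing, neighbor_zones):
    """Pick the neighbor that takes the zone; return (name, merged_zone_or_None)."""
    for name, zone in neighbor_zones.items():
        box = merged(departing, zone)
        if box is not None:
            return name, box                 # neighbor absorbs the zone
    smallest = min(neighbor_zones, key=lambda n: volume(neighbor_zones[n]))
    return smallest, None                    # neighbor temporarily handles both zones
```

A result of `(name, None)` stands for the case where the chosen neighbor keeps the two zones separate for the time being.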

16 Departure of a Node
[Figure: 2-d coordinate space with corners (0,0), (1,0), and (0,1), partitioned into zones A, B, C, D, E, and F]
- When node F fails, its zone is merged with E's.

17 Failures
- A prolonged absence of update messages from a neighbor indicates that the node has failed.
- Each neighbor of the failed node then starts a takeover timer.
- When its timer expires, a node sends a TAKEOVER message conveying its own zone volume to all of the failed node's neighbors.
- A node that receives a TAKEOVER message cancels its own timer if the zone volume in the message is smaller than its own zone volume.
- Otherwise, it replies with its own TAKEOVER message.
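The effect of the TAKEOVER rule can be seen in a simplified, sequential sketch. It assumes message delivery is instantaneous and processes the neighbors one at a time; `resolve_takeover` and its arguments are hypothetical names, not part of any CAN implementation.

```python
def resolve_takeover(volumes, first_sender):
    """Return the neighbor that ends up taking over the failed node's zone.

    volumes: {node: zone volume} for the live neighbors of the failed node.
    first_sender: the neighbor whose timer happened to expire first.
    """
    claimant = first_sender
    for node, vol in volumes.items():
        if node == claimant:
            continue
        if vol < volumes[claimant]:
            claimant = node   # smaller zone: reply with its own TAKEOVER
        # otherwise the node cancels its timer and defers to the claimant
    return claimant

neighbor_volumes = {"A": 0.25, "B": 0.125, "C": 0.5}
```

Whichever neighbor sends first, the exchange settles on the neighbor with the smallest zone, here "B".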

18 Multiple Failures
- A node first performs an expanding ring search to find live nodes beyond the failed region.
- It then rebuilds its neighbor state so that the takeover can be done safely.

19 Design Improvements
Multi-dimensional coordinate spaces
- Increasing the number of dimensions of the CAN coordinate space reduces the routing path length, and hence the path latency.
- More dimensions => more neighbors => more candidate next hops => better routing fault tolerance.
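The trade-off between path length and neighbor state can be made concrete with the formulas from the routing slides: (d/4) n^(1/d) hops on average and 2d neighbors per node. The sketch below assumes an evenly partitioned space; the function names are hypothetical.

```python
def avg_path_length(n, d):
    """Average hops in an evenly partitioned d-dimensional CAN with n zones."""
    return (d / 4) * n ** (1 / d)

def neighbor_count(d):
    """Neighbors each node maintains in a d-dimensional CAN."""
    return 2 * d

n = 2 ** 18
for d in (2, 3, 6, 9):
    print(f"d={d}: about {avg_path_length(n, d):.0f} hops, "
          f"{neighbor_count(d)} neighbors")
```

For n = 2^18, the formula gives 256 hops at d = 2 and 12 hops at d = 6, while the per-node neighbor state grows only from 4 to 12.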


21 Design Improvements
Realities: multiple coordinate spaces
- Maintain multiple independent coordinate spaces, with each node in the system assigned a different zone in each space. Each such coordinate space is called a “reality”.
- A given point can be looked up in any of the realities.
- This reduces the average path length.
Multiple dimensions vs. multiple realities
- Multiple realities provide better fault tolerance and data availability than multiple dimensions.

22 Design Improvements
Overloading coordinate zones
- Allow multiple nodes to share the same zone; nodes that share a zone are termed peers.
- MAXPEERS is the maximum number of allowable peers per zone.
- Reduced path length (number of hops), and hence reduced path latency.
- Improved fault tolerance.
Multiple hash functions
- Similar in effect to multiple realities.
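One way to picture multiple hash functions is to map a single key to k different points, so that the pair can be stored at k nodes. The sketch below is an assumption-laden illustration: it derives the k functions by salting SHA-256 with an index, which is a convenience chosen here and not something the slides specify, and `replica_points` is a hypothetical name.

```python
import hashlib

def replica_points(key, k, d=2):
    """Map one key to k points in [0, 1)^d using k salted hash functions."""
    points = []
    for i in range(k):
        # Salting with the index i stands in for k distinct hash functions.
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        points.append(tuple(
            int.from_bytes(digest[4 * j: 4 * j + 4], "big") / 2**32
            for j in range(d)
        ))
    return points
```

A lookup could then be sent toward whichever of the k points is closest to the requester, and the value remains available as long as one of the k owners is alive.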

23 Design Improvements
Topologically sensitive construction of the CAN overlay network
- Each CAN node measures its round-trip time (RTT) to each of a set of landmarks and orders the landmarks by increasing RTT.
- With m landmarks, m! such orderings are possible.
- The coordinate space is divided into m! portions, and each portion is assigned one landmark ordering.
- A new node joins the CAN at a random point in the portion of the coordinate space associated with its landmark ordering.
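The landmark ordering and its portion of the space can be sketched as below. The numbering of portions by lexicographic rank of the permutation is an assumption made for this illustration, as are the function names and the sample RTT values.

```python
from itertools import permutations

def landmark_ordering(rtts):
    """Landmark indices sorted by increasing measured RTT."""
    return tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))

def portion_index(ordering):
    """Rank of this ordering among all m! orderings, in lexicographic order."""
    return list(permutations(range(len(ordering)))).index(ordering)

# Hypothetical RTTs (ms) from a joining node to landmarks 0, 1, and 2.
ordering = landmark_ordering([30.0, 5.0, 12.0])
portion = portion_index(ordering)
```

Here the node is closest to landmark 1, then 2, then 0, so its ordering is (1, 2, 0), which is portion 3 of the 3! = 6 portions (counting from 0). Nodes with similar RTT profiles land in the same portion and therefore become CAN neighbors.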

24 Design Improvements
More uniform partitioning
- Before splitting, the occupant node compares the volume of its zone with those of its immediate neighbors in the coordinate space.
- The zone with the largest volume is split.
- Without uniform partitioning, a little over 40% of the nodes are assigned to zones of volume V (the volume each node would get under a perfectly even partitioning), compared with almost 90% with this feature; the largest zone volume drops from 8V to 2V.
- Not surprisingly, the partitioning of the space improves further with increasing dimensions.
Caching and replication techniques


26 Design Review
The following metrics were used to evaluate system performance:
- Path length: the number of (application-level) hops required to route between two points in the coordinate space.
- Neighbor state: the number of CAN nodes for which an individual node must retain state.
- Latency: both the end-to-end latency of the total routing path between two points in the coordinate space and the per-hop latency, i.e., the latency of individual application-level hops, obtained by dividing the end-to-end latency by the path length.
- Volume: the volume of the zone to which a node is assigned, which is indicative of the request and storage load the node must handle.
- Routing fault tolerance: the availability of multiple paths between two points in the CAN.
- Hash table availability: adequate replication of a (key, value) entry to withstand the loss of one or more replicas.

27 Design Review
The key design parameters affecting system performance are:
- Dimensionality of the virtual coordinate space: d
- Number of realities: r
- Number of peer nodes per zone: p
- Number of hash functions (i.e., the number of points per reality at which a (key, value) pair is stored): k
- Use of the RTT-weighted routing metric
- Use of uniform partitioning
Test system specification:
- A system size of $n = 2^{18}$ nodes.
- Transit-Stub topology with delays of 100 ms on intra-transit links, 10 ms on stub-transit links, and 1 ms on intra-stub links (i.e., 100 ms on links that connect two transit nodes, 10 ms on links that connect a transit node to a stub node, and so forth).
- Transit-stub models explicitly group vertices into domains and reflect that grouping in the connectivity between vertices.

28 [Figure: 100-node transit-stub topology]

29 Bare bones: a CAN that does not use most of the additional design features.
Knobs-on-full: a CAN that makes full use of the added features (without the landmark ordering feature).

30 Related Work
Related algorithms
- Distance vector and link state algorithms: these require widespread topological information, whereas a CAN node stores only a small amount of state about its neighbors.
- Plaxton algorithm: each node has an n-bit label divided into l levels, each of width w = n/l. A node forwards a packet to a neighbor whose label matches the destination label in more digits.

31 Related Work
Algorithms with geographic routing
- ‘Space’ in these algorithms refers to physical space.
- There is no neighbor discovery problem.
- Correctly mimicking the space is therefore a trivial problem.
- They do not extend to multiple dimensions.

32 Related Systems
Domain Name System
- Stores (domain name, IP address) pairs.
OceanStore
- Provides continuous access to persistent information.
- Uses Plaxton's algorithm.
Peer-to-peer file sharing systems
- Freenet: each node stores keys (analogous to URLs), the addresses of other nodes, and the data corresponding to each key.

33 Discussion
- The work addresses two key problems in the design of Content-Addressable Networks: scalable routing and indexing.
- Simulation results validate the scalability of the overall design: for a CAN with over 260,000 nodes, routing latency is less than twice the IP path latency.
Future work
- Secure CAN
- Keyword searching

