Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker. Presented by Greg Nims.


1 Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker. Presented by Greg Nims.

2 Introduction: The objective is to create a scalable indexing mechanism for large-scale peer-to-peer systems. Content-Addressable Networks (CANs) are presented as a scalable, fault-tolerant, and completely self-organizing peer-to-peer overlay network. Indexing is accomplished with a distributed hash table that maps keys to values.

3 Design: The overlay is a multi-dimensional coordinate space with d dimensions (a d-torus). Each node owns a zone in the space. A zone is a section of the hash table, so each node stores a section of the table.

4 Distributed Hash Table: A uniform hash function maps a key K to a point P in the coordinate space. The table consists of key-value pairs (K, V). For any point P, the corresponding (K, V) pair is stored at the node N that owns the zone containing P. An entry is retrieved by applying the same hash function to map K to P and fetching the entry from the node that owns the zone containing P.
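The key-to-point mapping can be sketched as follows. This is only an illustration, not the scheme used by the authors: the function name `key_to_point`, the choice of SHA-256, and the use of the unit cube [0, 1)^d as the coordinate space are all assumptions.

```python
import hashlib
from typing import Tuple


def key_to_point(key: str, d: int = 2) -> Tuple[float, ...]:
    """Map a key to a point in the unit space [0, 1)^d (hypothetical helper)."""
    if not 1 <= d <= 8:
        raise ValueError("this sketch supports 1 to 8 dimensions")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    coords = []
    for i in range(d):
        # Use 4 bytes of the 32-byte digest for each coordinate.
        chunk = digest[4 * i : 4 * i + 4]
        coords.append(int.from_bytes(chunk, "big") / 2**32)
    return tuple(coords)


# The same key always maps to the same point, so a lookup can
# recompute P and ask the owner of the zone containing P.
point = key_to_point("song.mp3")
```

Because the mapping is deterministic, storing and retrieving a pair both resolve to the same zone without any central directory.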

5 Routing: Each node stores the IP address and coordinate zone of each adjoining (neighboring) node. This data makes up the node’s routing table. Routing is greedy: if P is within the zone of the current node, return (K, V); otherwise, forward the query to the neighbor whose coordinates are closest to P.
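The greedy rule can be sketched in a few lines. The representation below is an assumption made for illustration: a zone is a pair of corner tuples `(lo, hi)`, "closest" is measured from the center of a neighbor's zone, distances wrap around the unit torus, and `max_hops` is a hypothetical safety limit.

```python
def torus_distance(a, b):
    """Euclidean distance on the unit torus (coordinates wrap at 1.0)."""
    total = 0.0
    for x, y in zip(a, b):
        delta = abs(x - y)
        delta = min(delta, 1.0 - delta)  # take the shorter way around
        total += delta * delta
    return total ** 0.5


def contains(zone, p):
    """True if point p lies inside the half-open box zone = (lo, hi)."""
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def center(zone):
    lo, hi = zone
    return tuple((l + h) / 2 for l, h in zip(lo, hi))


def route(zones, neighbors, start, p, max_hops=64):
    """Return the list of nodes visited while greedily routing toward p."""
    path = [start]
    current = start
    while not contains(zones[current], p):
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded")
        # Forward to the neighbor whose zone center is closest to p.
        current = min(neighbors[current],
                      key=lambda n: torus_distance(center(zones[n]), p))
        path.append(current)
    return path


# Four equal zones in a 2-d space.
zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 0.5), (1.0, 1.0)),
}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(route(zones, neighbors, "A", (0.75, 0.8)))  # ['A', 'B', 'D']
```

Each node needs only its own neighbor list to make the forwarding decision, which is what keeps the routing table small.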

6 More Routing: Conceptually, draw a straight line from a point in the local zone to P and follow that line through neighboring zones. In a d-dimensional space, each node maintains 2d neighbors. Nodes are self-organizing and make routing decisions dynamically.

7 Node Joining the CAN: A new node N1 first locates a node N2 already in the CAN, typically by using the IP address of a bootstrap node. N1 generates a random point P in the space. A JOIN message is then routed through the CAN toward P until it reaches the node N3 that owns the zone containing P. N3 splits its zone in half and assigns one half to N1, sending N1 the (K, V) pairs that fall in that half along with neighbor information. N3 informs its neighbors of the space reallocation.
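The split performed by N3 can be sketched as below. The data layout is assumed for illustration: a zone is a pair of corner tuples `(lo, hi)`, the store maps each key to `(point, value)`, and the dimension to split along is passed in by the caller.

```python
def contains(zone, p):
    """True if point p lies inside the half-open box zone = (lo, hi)."""
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def split_zone(zone, dim):
    """Halve a zone along dimension dim; return (kept_half, given_half)."""
    lo, hi = zone
    mid = (lo[dim] + hi[dim]) / 2
    kept_hi = list(hi)
    kept_hi[dim] = mid
    given_lo = list(lo)
    given_lo[dim] = mid
    return (tuple(lo), tuple(kept_hi)), (tuple(given_lo), tuple(hi))


def partition_pairs(store, given_zone):
    """Split stored pairs into those N3 keeps and those sent to N1."""
    kept, moved = {}, {}
    for key, (point, value) in store.items():
        target = moved if contains(given_zone, point) else kept
        target[key] = (point, value)
    return kept, moved


# N3 owns the whole space and splits it along dimension 0.
kept_zone, given_zone = split_zone(((0.0, 0.0), (1.0, 1.0)), dim=0)
store = {"a": ((0.2, 0.3), 1), "b": ((0.7, 0.9), 2)}
kept, moved = partition_pairs(store, given_zone)
```

After the split, only the pairs whose points fall in the handed-over half leave N3; everything else stays where it was.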

8 Node Departure: On an explicit departure, the node hands its zone and (K, V) pairs to a neighbor. If merging with a neighbor's zone produces a single valid zone, the two zones are combined; otherwise, the smallest neighbor temporarily handles both zones.

9 Failures: Each node sends periodic update messages to each of its neighbors. Neighbors detect a crashed node by the absence of these updates. Each neighbor that detects the failure starts a takeover timer and, when the timer expires, sends a takeover message to all of the failed node’s neighbors. The neighbors agree on the node with the smallest zone volume, and that node takes over the crashed node’s zone.
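One way the neighbors could converge on the smallest node is to make each takeover timer proportional to the node's own zone volume, so the smallest zone fires first. The sketch below assumes that rule; `timer_scale` and the `(lo, hi)` zone layout are hypothetical.

```python
def volume(zone):
    """Volume of a box-shaped zone given as (lo, hi) corner tuples."""
    lo, hi = zone
    v = 1.0
    for l, h in zip(lo, hi):
        v *= (h - l)
    return v


def takeover_winner(neighbor_zones, timer_scale=10.0):
    """Return (winner, timers): the neighbor whose timer expires first.

    Assumption: each timer is timer_scale * zone volume, so the
    neighbor with the smallest zone sends its takeover message first.
    """
    timers = {name: timer_scale * volume(zone)
              for name, zone in neighbor_zones.items()}
    winner = min(timers, key=timers.get)
    return winner, timers


# Two neighbors of a failed node; B has the smaller zone.
neighbor_zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),   # volume 0.25
    "B": ((0.5, 0.0), (1.0, 0.25)),  # volume 0.125
}
winner, timers = takeover_winner(neighbor_zones)
```

With this rule no explicit vote is needed: a neighbor that receives a takeover message from a smaller node simply cancels its own timer.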

10 Design Improvements: multiple dimensions; multiple realities; multiple hash functions; overloaded coordinate zones; round-trip time (RTT) ratio; topologically sensitive construction (landmarking); uniform partitioning.

11 Multiple Dimensions: Increasing the number of dimensions reduces the average path length and therefore the path latency. The cost is a larger routing table, because each node has more neighbors.
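The trade-off can be made concrete with a rough estimate. The sketch below assumes the space is divided into n perfectly equal zones, in which case the average path length is commonly estimated as (d/4) * n^(1/d) hops and each node has 2d neighbors. This is an idealized figure and need not match simulation results; the function names are hypothetical.

```python
def estimated_path_length(n: int, d: int) -> float:
    """Idealized average hop count for n equal zones in d dimensions."""
    return (d / 4) * n ** (1 / d)


def neighbors_per_node(d: int) -> int:
    """Each node borders two zones along every dimension."""
    return 2 * d


# Compare a few dimension counts for a network of 2^18 nodes.
n = 2**18
for d in (2, 4, 10):
    hops = estimated_path_length(n, d)
    print(f"d={d}: about {hops:.1f} hops, {neighbors_per_node(d)} neighbors")
```

The hop estimate shrinks quickly as d grows, while the routing table grows only linearly in d.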

12 Multiple Realities: Multiple coordinate spaces exist at the same time; each space is called a reality. Each node is assigned a different zone in each reality. Increasing the number of realities gives shorter paths and higher fault tolerance. For example, with three realities, a (K, V) pair that maps to P at (x, y, z) may be stored at three different nodes.

13 Dimensions v. Realities: These are the two improvements with the greatest impact. Dimensions have the larger effect on reducing path length. Realities provide stronger fault tolerance and data availability.

14 Multiple Hash Functions: Using multiple hash functions increases data availability and reduces query latency. A single key is mapped to k points in the coordinate space by using k hash functions, so the (K, V) pair is unavailable only when all k nodes holding it have crashed. Querying the k nodes in parallel can reduce lookup latency.
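A simple way to obtain k hash functions is to salt one base hash with an index. The sketch below does that; the salting scheme, SHA-256, and the name `key_to_points` are assumptions for illustration, not the construction described in the slides.

```python
import hashlib
from typing import List, Tuple


def key_to_points(key: str, k: int = 3, d: int = 2) -> List[Tuple[float, ...]]:
    """Map one key to k points in [0, 1)^d using k salted hashes."""
    if not 1 <= d <= 8:
        raise ValueError("this sketch supports 1 to 8 dimensions")
    points = []
    for i in range(k):
        # Prefixing the index i yields an independent-looking hash per replica.
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        point = tuple(
            int.from_bytes(digest[4 * j : 4 * j + 4], "big") / 2**32
            for j in range(d)
        )
        points.append(point)
    return points


# The (K, V) pair is stored at the owner of each of these points;
# a lookup may query all of them in parallel and use the first reply.
replica_points = key_to_points("song.mp3", k=3)
```

The cost of this redundancy is that every insert or update must reach k zones instead of one.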

15 Overload Coordinate Zones: More than one node is assigned to share the same zone. This reduces the average path length and improves fault tolerance, without requiring additional neighbors.

16 RTT Ratio: The goal is to limit the round-trip time (RTT) along a path. Each node measures the RTT to each of its neighbors and favors lower-latency paths when forwarding.
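One plausible reading of "RTT ratio" is that a node forwards to the neighbor with the best ratio of progress toward P to measured RTT. The sketch below assumes that rule; the neighbor table layout `name -> (position, rtt_ms)` and the function names are hypothetical.

```python
def torus_distance(a, b):
    """Euclidean distance on the unit torus (coordinates wrap at 1.0)."""
    total = 0.0
    for x, y in zip(a, b):
        delta = abs(x - y)
        delta = min(delta, 1.0 - delta)
        total += delta * delta
    return total ** 0.5


def pick_next_hop(current_pos, target, neighbor_info):
    """Pick the neighbor with the highest progress-to-RTT ratio.

    Returns None if no neighbor is closer to the target than we are.
    """
    here = torus_distance(current_pos, target)
    best, best_ratio = None, 0.0
    for name, (pos, rtt_ms) in neighbor_info.items():
        progress = here - torus_distance(pos, target)
        ratio = progress / rtt_ms
        if ratio > best_ratio:
            best, best_ratio = name, ratio
    return best


# "slow" makes more progress per hop, but "fast" makes more per millisecond.
neighbor_info = {
    "fast": ((0.5, 0.25), 10.0),
    "slow": ((0.75, 0.25), 100.0),
}
choice = pick_next_hop((0.25, 0.25), (0.75, 0.25), neighbor_info)
```

Compared with plain greedy forwarding, this may take more hops but aims for lower end-to-end latency.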

17 Topologically Sensitive Construction: Landmarks in the physical network are used to guide construction of the overlay. Each node measures its RTT to each landmark.

18 Uniform Partitioning: This is a form of volume balancing. When a node receives a JOIN, it also checks its neighbors' zones before deciding whether to accept. The node with the largest zone accepts the JOIN and splits. This balances load among the nodes.
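The selection step can be sketched as follows, assuming box-shaped zones stored as `(lo, hi)` corner tuples and assuming the receiving node considers itself as well as its neighbors. The function names are hypothetical.

```python
def volume(zone):
    """Volume of a box-shaped zone given as (lo, hi) corner tuples."""
    lo, hi = zone
    v = 1.0
    for l, h in zip(lo, hi):
        v *= (h - l)
    return v


def choose_zone_to_split(receiver, zones, neighbors):
    """Return the node (receiver or a neighbor) with the largest zone."""
    candidates = [receiver] + list(neighbors[receiver])
    return max(candidates, key=lambda name: volume(zones[name]))


# A received the JOIN, but its neighbor C owns a larger zone.
zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),   # volume 0.25
    "B": ((0.5, 0.0), (1.0, 0.25)),  # volume 0.125
    "C": ((0.0, 0.5), (1.0, 1.0)),   # volume 0.5
}
neighbors = {"A": ["B", "C"]}
splitter = choose_zone_to_split("A", zones, neighbors)
```

Always splitting the largest nearby zone keeps zone volumes, and therefore the share of keys per node, closer to uniform.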

19 Design Review: Two simulations were run using 2^18 nodes: a “bare bones” CAN without improvements, and a “knobs-on-full” CAN using all features except landmarks and multiple hash functions. The biggest gain came from the number of dimensions (path length reduced from 198 to 5).

20 Questions?

