A Scalable Content-Addressable Network (SIGCOMM 2001) Authors: Sylvia Ratnasamy, Mark Handley, Paul Francis, Richard Karp, Scott Shenker Ehsan Foroughi
Outline Introduction Design Improvements Logical Environment Find and Insert Operations Departure and Recovery Improvements
Introduction: Large-scale addressing and storage management. Key features: distributable and scalable; fault tolerant; fast access vs. low overhead. Hash tables: how to distribute them?
Basic Design: Logical Hash Space. Keys hash to points in a d-dimensional Cartesian coordinate space on a d-torus. The space is partitioned into distinct zones. Zones are d-dimensional rectangles. Two zones are neighbors if their spans overlap along d-1 dimensions and abut along the remaining one.
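The hashing and neighbor rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SHA-256 based mapping, the representation of a zone as a list of (lo, hi) spans, and the names `key_to_point` and `are_neighbors` are hypothetical and do not come from the slides.

```python
import hashlib


def key_to_point(key, d=2):
    """Map a key to a point in the unit d-torus [0, 1)^d.

    Assumed scheme: 4 bytes of a SHA-256 digest per coordinate (so d <= 8).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    coords = []
    for i in range(d):
        chunk = digest[4 * i: 4 * i + 4]
        coords.append(int.from_bytes(chunk, "big") / 2**32)
    return tuple(coords)


def spans_overlap(a, b):
    """True if the half-open intervals a = (lo, hi) and b = (lo, hi) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def spans_abut(a, b):
    """True if the intervals touch end to end; 1.0 wraps around to 0.0."""
    return a[1] % 1.0 == b[0] % 1.0 or b[1] % 1.0 == a[0] % 1.0


def are_neighbors(zone_a, zone_b):
    """Zones are neighbors if they overlap in d-1 dimensions and abut in one."""
    abutting = 0
    for a, b in zip(zone_a, zone_b):
        if spans_overlap(a, b):
            continue
        if spans_abut(a, b):
            abutting += 1
        else:
            return False
    return abutting == 1
```

For example, in a 2-d space the zones `[(0.0, 0.5), (0.0, 1.0)]` and `[(0.5, 1.0), (0.0, 0.5)]` are neighbors, while two zones that only touch at a corner are not.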
d-dimensional Torus [Figure: 2-d coordinate space with X and Y axes, each running from 0 to 1.0 with a tick at 0.5]
Routing: Queries are hashed to a point in the virtual space, then routed to the node that owns the zone containing that point. Each node maintains information about its neighbors. Routing is greedy, i.e. the query is forwarded to the neighbor with the least distance to the destination.
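A minimal sketch of one greedy forwarding step, assuming each neighbor is represented by the center point of its zone and that distance is Euclidean with wrap-around on the torus. The names `torus_distance` and `next_hop` and the dictionary layout are illustrative assumptions, not part of the slides.

```python
def torus_distance(p, q):
    """Euclidean distance on the unit torus: every axis wraps around at 1.0."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b)
        delta = min(delta, 1.0 - delta)  # going the other way may be shorter
        total += delta * delta
    return total ** 0.5


def next_hop(neighbors, destination):
    """Pick the neighbor closest to the destination point.

    neighbors: dict mapping a node id to its zone center (assumed layout).
    """
    return min(neighbors,
               key=lambda node: torus_distance(neighbors[node], destination))
```

A node would repeat this step until the query reaches the node whose zone contains the destination point.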
Example of Routing
Insertion: Find an arbitrary node already in the system. Pick a random key and hash it to a point P. Find the owner of the zone containing P using the routing mechanism. Start the join with that owner.
Insertion cont’d: Split the destination zone in half. The new node takes over one half of the split zone. Update the neighbor lists on both resulting zones and on all neighbors of the old zone.
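The split step can be sketched as follows, assuming a zone is a list of (lo, hi) spans and that the caller decides which dimension to split along. The helpers `split_zone` and `owns` are hypothetical names used only for illustration.

```python
def split_zone(zone, dim):
    """Split a zone in half along dimension `dim`.

    Returns (kept_half, handed_over_half); the second goes to the new node.
    """
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    kept = list(zone)
    given = list(zone)
    kept[dim] = (lo, mid)
    given[dim] = (mid, hi)
    return kept, given


def owns(zone, point):
    """True if the point lies inside the zone (spans are half-open)."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))
```

After the split, any stored key whose point falls in the handed-over half would move to the new node, which `owns` can be used to decide.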
Operation Cost: Average path length is (d/4)(n^(1/d)) hops. Average neighbor list size is 2d. Letting d = log n, both the average path length and the neighbor list size are O(log n).
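To get a feel for these costs, the short sketch below evaluates them numerically. It uses the (d/4)·n^(1/d) average path length and 2d neighbor count stated in the CAN paper for n equal-sized zones; the function names are illustrative.

```python
def avg_path_length(n, d):
    """Average routing path length in hops for n equal zones in d dimensions."""
    return (d / 4) * n ** (1 / d)


def neighbor_count(d):
    """Number of neighbors each node maintains in d dimensions."""
    return 2 * d


if __name__ == "__main__":
    n = 2 ** 20
    for d in (2, 4, 10, 20):
        print(d, round(avg_path_length(n, d), 1), neighbor_count(d))
```

With d = log2 n the factor n^(1/d) equals 2, so the path length becomes d/2 = (log2 n)/2 hops while each node keeps 2·log2 n neighbors, which is where the O(log n) figure comes from.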
Departure & Recovery: Departure can happen with previous warning or as a sudden death. In case of a normal departure, the node explicitly hands over its zone to one of its neighbors and then departs. Under normal conditions, nodes transmit periodic update messages, so neighbors can tell when a node is dead using a timeout timer and ping.
Recovery: In case of an abnormal departure, the neighbors decide who should TAKEOVER the zone (the node with the smallest zone). In both cases, some nodes may temporarily handle more than one zone. Massive cluster failures can bring CAN to an inconsistent state (not very probable).
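The "node with the smallest zone takes over" rule can be illustrated with the sketch below. It assumes zones are lists of (lo, hi) spans and compares all candidate volumes in one place; in a running system the neighbors would have to reach this decision in a distributed way. The names `zone_volume` and `choose_takeover` and the tie-break by node id are assumptions made for the example.

```python
def zone_volume(zone):
    """Volume of a d-dimensional rectangular zone."""
    volume = 1.0
    for lo, hi in zone:
        volume *= hi - lo
    return volume


def choose_takeover(neighbor_zones):
    """Return the id of the neighbor with the smallest zone.

    neighbor_zones: dict mapping a node id to its zone (assumed layout).
    Ties are broken by node id, which is an arbitrary choice for this sketch.
    """
    return min(neighbor_zones,
               key=lambda node: (zone_volume(neighbor_zones[node]), node))
```

Favoring the smallest zone tends to even out the load, since the node that takes over ends up responsible for two zones for a while.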
Zone Reassignment: Node departures cause a fragmentation problem. A background algorithm reassigns zones and tries to merge zones to reduce the number of zones each node handles.
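One building block of such a background algorithm is a test for whether two zones can be merged into a single rectangle. The sketch below is a simplification: it only checks the geometry (identical spans in every dimension but one, and touching spans in that one) and ignores any further constraints a real CAN would impose, such as how the zones were originally split. The name `try_merge` is hypothetical.

```python
def try_merge(zone_a, zone_b):
    """Return the merged zone if the two zones form one rectangle, else None."""
    differing = [i for i, (a, b) in enumerate(zip(zone_a, zone_b)) if a != b]
    if len(differing) != 1:
        return None
    i = differing[0]
    (a_lo, a_hi), (b_lo, b_hi) = zone_a[i], zone_b[i]
    if a_hi == b_lo:
        merged_span = (a_lo, b_hi)
    elif b_hi == a_lo:
        merged_span = (b_lo, a_hi)
    else:
        return None  # same shape otherwise, but the spans do not touch
    merged = list(zone_a)
    merged[i] = merged_span
    return merged
```

A node holding two zones for which `try_merge` succeeds can treat them as one zone again, which reduces fragmentation.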
Improvements: Multi-Dimension; Metrics; Multi-Coordinate Spaces; Overloading the Zones; Multiple Hash Functions; Topologically Sensitive Construction; Uniform Partitioning; Caching
Conclusion: Drawbacks: not possible to increase the dimension; physical neighboring vs. logical neighboring. Applications: P2P file sharing, DNS, …