Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
MIT LCS, Proceedings of the 2001 ACM SIGCOMM Conference
Presented by Jiyong Park, 2002-12-18
Contents
- Chord Overview
- Chord System Model
- Chord Protocol: Lookup, Lookup with Scalability, Node Joining, Node Joining (Concurrent), Failure
- Simulation Results
- Conclusion
Overview
Chord: a lookup protocol for P2P systems, especially file-sharing applications
lookup(key) → the node that stores {key, value}
Characteristics (N: number of nodes, K: number of keys):
- each node stores at most (1+ε)K/N keys
- a lookup takes O(log N) messages
- when the Nth node joins, O(K/N) keys move to a different location
- when the Nth node joins, O(log² N) messages reorganize the routing state
Overview – Related Work
- Central index: Napster, … — single point of failure
- Flooded requests: Gnutella, … — many broadcasts, not scalable
- Document routing: Freenet, Chord, … — scalable, but the document ID must be known (e.g. File Id = h(data))
System Model
{key, value}: the key is an m-bit ID; the value is an array of bytes (file contents, IP address, …)
Operations: insert(key, value), update(key, value), lookup(key), join() / leave()
System parameter r: degree of redundancy
Scope: the lookup service only; security, authentication, etc. are not addressed
Protocol Evolution
- Static network, all nodes know each other: no join/leave; not scalable
- Consistent hashing: nodes don't have to know each other; no join/leave; scalable
- Chord (dynamic network): nodes don't know each other; join/leave, but not concurrent
- In practice: concurrent join/leave, failure handling, replication
Protocol – Consistent Hashing
m-bit ID space
node ID: n = hash(IP address); key ID: k = hash(key)
successor(k): the first node whose ID is equal to or follows k on the identifier circle; that node stores {key, value}
Example: m = 3, nodes = {0, 1, 3}, keys = {1, 2, 6}
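The successor rule can be sketched on the slide's example ring (m = 3, so IDs live in {0, …, 7}); real Chord derives IDs with SHA-1, which this toy version skips:

```python
def successor(key_id, node_ids):
    """First node ID equal to or following key_id on the identifier circle."""
    for n in sorted(node_ids):
        if n >= key_id:
            return n
    return min(node_ids)          # wrap around past the top of the circle

nodes = {0, 1, 3}
print(successor(1, nodes))        # 1
print(successor(2, nodes))        # 3  (no node with ID 2; next is 3)
print(successor(6, nodes))        # 0  (wraps: key 6 is stored at node 0)
```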
Protocol – Consistent Hashing: Properties (K keys, N nodes)
- each node is responsible for at most (1+ε)K/N keys
- when a node joins or leaves, O(K/N) keys are moved
Example: node 6 joins → successor(1) = 1, successor(2) = 3, successor(6) = 6 (key 6 moves to the new node)
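The O(K/N) movement property shows up directly in the example: comparing key placements before and after node 6 joins, only one of the three keys changes owner (a sketch, reusing the toy successor rule):

```python
# which keys move when node 6 joins the m = 3 ring {0, 1, 3}?
def successor(key_id, node_ids):
    for n in sorted(node_ids):
        if n >= key_id:
            return n
    return min(node_ids)          # wrap around the circle

keys = [1, 2, 6]
before = {k: successor(k, {0, 1, 3}) for k in keys}
after = {k: successor(k, {0, 1, 3, 6}) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(moved)                      # [6] — key 6 moves from node 0 to node 6
```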
Protocol – Basic Chord
Each node keeps an m-entry routing table (the finger table)
- it stores information about only a small number of nodes
- the amount of information falls off exponentially with distance in key space
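The exponential fall-off comes from how the entries are chosen: finger i of node n points at successor((n + 2^(i−1)) mod 2^m). A minimal sketch for the slide's example ring:

```python
# finger tables on the example ring (m = 3, nodes {0, 1, 3})
M = 3

def successor(k, node_ids):
    for n in sorted(node_ids):
        if n >= k:
            return n
    return min(node_ids)          # wrap around the circle

def finger_table(n, node_ids):
    # entry i (i = 1..m) covers the ID that is 2^(i-1) past n
    table = []
    for i in range(1, M + 1):
        start = (n + 2 ** (i - 1)) % 2 ** M
        table.append((start, successor(start, node_ids)))
    return table

print(finger_table(0, {0, 1, 3}))  # [(1, 1), (2, 3), (4, 0)]
```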
Protocol – Basic Chord: node 0.lookup(12)
Find the closest preceding finger of 12:
- is finger[i].node in the test interval (0, 12)?
- if yes, continue the search at finger[i].node
- if no, try i = i − 1
Node 0's finger table points to node 8, which has more information about ID 12, so invoke node 8.lookup(12)
Protocol – Basic Chord: node 8.lookup(12)
From node 8's finger table (test interval (8, 12)), invoke node 10.lookup(12)
At node 10: 12 lies between this node (= 10) and 10.successor (= 14), so key 12 is stored at node 14
O(log N) nodes are contacted during a lookup
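The two slides above can be sketched as an iterative lookup on a static ring. This is a toy version with small integer IDs (m = 4) and an assumed node set {0, 3, 8, 10, 14} that reproduces the 0 → 8 → 10 hop sequence; real Chord performs each hop as an RPC:

```python
M = 4
RING = 2 ** M

def in_open_interval(x, a, b):
    """True if x lies in the open interval (a, b) on the circle."""
    if a < b:
        return a < x < b
    return x > a or x < b          # interval wraps past 0

class Node:
    def __init__(self, ident, node_ids):
        self.id = ident
        ids = sorted(node_ids)
        self.successor = next((n for n in ids if n > ident), ids[0])
        # finger[i] = successor((id + 2^i) mod 2^m), i = 0..m-1
        self.finger = [next((n for n in ids if n >= (ident + 2 ** i) % RING),
                            ids[0]) for i in range(M)]

    def closest_preceding(self, key):
        for f in reversed(self.finger):   # farthest finger first
            if in_open_interval(f, self.id, key):
                return f
        return self.id

def lookup(nodes, start, key):
    n = nodes[start]
    while not in_open_interval(key, n.id, n.successor) and key != n.successor:
        nxt = n.closest_preceding(key)
        if nxt == n.id:
            break
        n = nodes[nxt]                    # hop to a node closer to the key
    return n.successor                    # node responsible for the key

ids = {0, 3, 8, 10, 14}
nodes = {i: Node(i, ids) for i in ids}
print(lookup(nodes, 0, 12))               # 14 — matches the slide's example
```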
Protocol – Join / Leave
Assumptions:
- the network is in a stable state; no concurrent joins/leaves
- the joining node knows at least one existing node n'
Join procedure:
1. initialize the finger table of node n (using n')
2. update the finger tables of other nodes
3. copy to n the keys for which n has become the successor
Protocol – Join / Leave
Initialize the finger table (from the local node's viewpoint):
- finger[i].node = successor(finger[i].start) = n'.find_successor(finger[i].start)
Update the finger tables of other nodes (walking counter-clockwise), for i = 1 to m:
- if this node is the ith finger of another node, update that entry
Move keys from the successor (finger[1].node)
Protocol – Join / Leave
Example: node 6 joins
O(log² N) messages are needed to re-establish routing state after a join/leave
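The effect of the join on routing state can be seen by recomputing finger tables before and after node 6 enters the example ring (a sketch; the protocol updates only the affected entries rather than rebuilding tables):

```python
# finger tables before and after node 6 joins the m = 3 ring {0, 1, 3}
M = 3

def successor(k, node_ids):
    s = sorted(node_ids)
    return next((n for n in s if n >= k), s[0])

def finger_table(n, node_ids):
    return [((n + 2 ** i) % 2 ** M, successor((n + 2 ** i) % 2 ** M, node_ids))
            for i in range(M)]

for ids in ({0, 1, 3}, {0, 1, 3, 6}):
    print(sorted(ids), {n: finger_table(n, ids) for n in sorted(ids)})
# entries that pointed "past" ID 6 now point at node 6: e.g. node 3's
# fingers for starts 4 and 5 change from node 0 to node 6
```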
Protocol – Concurrent Join
Requirement: every node must know its immediate predecessor and successor at all times
Drawback: a less strict time bound
Algorithm:
1. find n's predecessor and successor
2. notify them that they have a new immediate neighbour
3. fill n's finger table
4. initialize n's predecessor pointer
5. run stabilize() periodically
Protocol – Concurrent Join: Case 1 (distant nodes join concurrently)
Each new node finds its immediate predecessor and successor, then notifies them to change their successor (s) and predecessor (p) pointers
Protocol – Concurrent Join: Case 2 (adjacent nodes join concurrently)
Each new node finds its predecessor and successor and notifies them to update their s and p pointers; periodic stabilize() calls fix any links left pointing at the wrong node
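A minimal sketch of the stabilize()/notify() pair that repairs links after concurrent joins. The names follow the paper's pseudocode, but the join here is deliberately lazy (it only sets the successor pointer, where the real protocol would call find_successor), so stabilization rounds must settle the ring:

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self      # a one-node ring points at itself
        self.predecessor = None

    @staticmethod
    def between(x, a, b):
        """x in the open interval (a, b) on the circle; (a, a) is the whole circle."""
        if a < b:
            return a < x < b
        return x > a or x < b or a == b

    def stabilize(self):
        # ask the successor for its predecessor; adopt it if it sits between us
        x = self.successor.predecessor
        if x is not None and self.between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, n):
        # n thinks it might be our predecessor
        if self.predecessor is None or \
           self.between(n.id, self.predecessor.id, self.id):
            self.predecessor = n

    def join(self, existing):
        self.successor = existing  # lazy: stabilize() fixes the rest

a, b = Node(0), Node(8)
b.join(a)
for _ in range(3):                 # a few stabilization rounds settle the ring
    a.stabilize(); b.stabilize()
assert a.successor is b and b.successor is a
```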
Protocol – Failure & Replication
Node failure is detected by stabilize() on an immediate neighbour, which then finds a new neighbour
Protocol – Failure & Replication
When node n fails, n' (the successor of n) must hold the {key, value} pairs that were on n
- each node keeps pointers to its r nearest successors
- on insertion, the copy is propagated to the r successors
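The replication rule can be sketched on a static ring: store the primary copy at the responsible node and push copies to the next r successors, so a single failure cannot lose the key (node IDs and r = 2 are illustrative assumptions):

```python
R = 2                                  # degree of redundancy (the paper's r)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.store = {}                # local {key: value} storage
        self.successor = self

def ring(ids):
    nodes = [Node(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b                # link each node to the next on the circle
    return nodes

def insert(node, key, value):
    n = node
    for _ in range(R + 1):             # primary copy plus r replicas
        n.store[key] = value
        n = n.successor

nodes = ring({0, 3, 8, 10, 14})
insert(nodes[3], 9, "data")            # node 10 = successor(9) holds the primary
holders = [n.id for n in nodes if 9 in n.store]
print(holders)                         # [0, 10, 14] — if node 10 fails,
                                       # its successor 14 still serves key 9
```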
Simulation Results Load balancing (keys per node = O(K/N) )
Simulation Results Path length grows as O(log N)
Simulation Results Path length (N = 2^12; path lengths do not exceed 12)
Simulation Results Failure (after network has been stabilized)
Conclusion
Benefits:
- scalable and efficient: O(log N) lookup
- load balance (for uniform requests)
- bounded routing latency
Limitations:
- destroys locality
- discards useful application-specific information (e.g. hierarchy)
- load imbalance for skewed requests