Published by Adelia Kelley. Modified over 8 years ago.
1
Distributed Hash Tables
2
Overview
- Objective: a distributed lookup service
  - Data items are distributed among n parties.
  - Anyone in the network can find an item efficiently.
  - No central node or infrastructure.
- Applications:
  - P2P file sharing without a central node (items are files).
  - Large-scale distributed or grid computation (items are computational tasks).
3
Chord
- Uses consistent hashing to map keys to nodes.
- Supports three operations:
  - Lookup
  - A node joining the network
  - A node leaving the network
- Each node maintains routing information about the network.
4
Consistent hashing
- A consistent hash function assigns each node and each key an m-bit identifier.
- A node is a member of the network: a processing and communication unit.
- A key identifies a data item, for example a movie name.
- Identifiers for both nodes and keys are produced by hashing; a node's identifier can be defined by, e.g., hashing the node's IP address:
  - ID(node) = hash(IP address)
  - ID(key) = hash(key)
5
Consistent hashing (cont.)
- Properties of consistent hashing:
  - Load balancing: given N nodes, with high probability each node stores an O(1/N) fraction of the keys.
  - When the N-th node joins, with high probability only an O(1/N) fraction of the keys is moved.
- Practical problems with identifiers:
  - An IP address is not always a good identifier (NAT, DHCP).
  - A key is not always well defined (e.g. a file name).
6
Chord ring
- Chord orders all m-bit identifiers (both nodes and keys) on a ring of $2^m$ elements.
- Order is defined modulo $2^m$ ($2^m - 1 < 0$, i.e. identifier $2^m - 1$ is immediately followed by 0).
- Identifier space for keys: $\{0, \dots, 2^m - 1\}$. Not all keys need be in use concurrently.
- Identifier space for nodes: $\{0, \dots, 2^m - 1\}$. Not all nodes are necessarily active concurrently.
- Key $k$ is assigned to node $n$ if $\mathrm{id}(n) = \min\{\mathrm{id}(n') : \mathrm{id}(n') \ge \mathrm{id}(k)\}$, where $\ge$ is interpreted modulo $2^m$, i.e. $n$ is the first node clockwise from $k$.
- $n$ is called successor(k).
7
Successor Nodes
[Figure: identifier circle for m = 3 (identifiers 0 to 7), with nodes 0, 1 and 3 and keys 1, 2 and 6.]
- successor(1) = 1
- successor(2) = 3
- successor(6) = 0
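The successor rule can be checked against the m = 3 example with a few lines of Python. This is only a sketch: it assumes a global, sorted view of all node identifiers, which a real Chord node does not have, and the function name `successor` is chosen here for illustration.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Return the first node identifier at or clockwise after key_id."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)   # first node with id >= key_id
    return ring[i % len(ring)]      # wrap around to the smallest id

nodes = [0, 1, 3]                   # the nodes of the m = 3 example
print([successor(nodes, k) for k in (1, 2, 6)])  # → [1, 3, 0]
```

The modulo on the index is what implements "first node clockwise": key 6 lies past the largest node identifier, so it wraps around to node 0.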
8
More on hashing
- Desirable properties for the hash function:
  - Collision resistant on nodes (mandatory)
  - Collision resistant on keys (hopefully)
  - Load balancing
  - Good adversarial behavior
- Chord's suggestion: SHA-1
  - ID(node) = SHA-1(IP address)
  - ID(key) = SHA-1(key)
- Drawbacks:
  - Full SHA-1 implies m = 160.
  - Truncated SHA-1 may not be collision resistant.
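A minimal sketch of SHA-1 based identifiers follows, using Python's standard `hashlib`. The helper name `chord_id`, the choice to keep the top m bits when truncating, and the sample address and key are assumptions made for illustration; the slides do not specify how truncation is done.

```python
import hashlib

def chord_id(value: str, m: int = 160) -> int:
    """Map a string to an m-bit identifier from its SHA-1 digest.

    Assumption: for m < 160 we keep the m most significant bits.
    """
    if not 1 <= m <= 160:
        raise ValueError("m must be between 1 and 160 for SHA-1")
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - m)

node_id = chord_id("192.0.2.17", m=16)        # hypothetical node IP address
key_id = chord_id("some-movie-title", m=16)   # hypothetical key
print(0 <= node_id < 2 ** 16, 0 <= key_id < 2 ** 16)  # → True True
```

With a small m, two different inputs can easily land on the same identifier, which is the collision concern raised for truncated SHA-1.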
9
Key lookup
- Suppose each node stores successor(node).
- We could use the following pseudo-code:

    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);

- Example of execution
- What is the time/message complexity?
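The successor-only lookup can be simulated in-process to see how the hop count behaves. This is a sketch, not a networked implementation: Python objects stand in for remote nodes, and `in_half_open` and `build_ring` are helper names introduced here. Each forward advances exactly one node, so the number of messages grows linearly with the number of nodes.

```python
def in_half_open(x, a, b, m):
    """True if x lies in the ring interval (a, b] modulo 2**m."""
    size = 2 ** m
    x, b = (x - a) % size, (b - a) % size
    if b == 0:                 # a == b: the interval covers the whole ring
        return True
    return 0 < x <= b

class Node:
    def __init__(self, ident, m):
        self.id, self.m = ident, m
        self.successor = self

    def find_successor(self, ident, hops=0):
        """Return (responsible node, number of forwards used)."""
        if in_half_open(ident, self.id, self.successor.id, self.m):
            return self.successor, hops
        # forward the query around the circle
        return self.successor.find_successor(ident, hops + 1)

def build_ring(ids, m):
    """Helper for the demo: link nodes into a ring in identifier order."""
    nodes = [Node(i, m) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b
    return nodes

n0, n1, n3 = build_ring([0, 1, 3], m=3)
target, hops = n0.find_successor(6)
print(target.id, hops)  # → 0 2
```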
10
Key lookup in O(m)
- Each node n holds additional routing information: a finger table.
  - At most m entries
  - Each entry is called a “finger”.
- Data in each entry:
  - $\mathrm{finger}[i].\mathrm{node} = \mathrm{successor}((n + 2^{i-1}) \bmod 2^m)$, $i = 1, \dots, m$
  - finger[i] also includes TCP/IP access information such as IP address and port.
- Denote the i-th finger by node.finger(i).
11
Example – finger tables
[Figure: identifier circle for m = 3 with nodes 0, 1 and 3, and the finger table of each node.]

Node 0 (keys: 6)
  start   int.    succ.
  1       [1,2)   1
  2       [2,4)   3
  4       [4,0)   0

Node 1 (keys: 1)
  start   int.    succ.
  2       [2,3)   3
  3       [3,5)   3
  5       [5,1)   0

Node 3 (keys: 2)
  start   int.    succ.
  4       [4,5)   0
  5       [5,7)   0
  7       [7,3)   0
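The start and successor columns of these tables follow directly from the finger definition. The sketch below recomputes them for the three-node example; it assumes a global list of node identifiers purely for demonstration, and `finger_table` is a name introduced here.

```python
from bisect import bisect_left

def finger_table(n, node_ids, m):
    """Return [(start, successor)] for fingers i = 1..m of node n."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % 2 ** m
        succ = ring[bisect_left(ring, start) % len(ring)]
        table.append((start, succ))
    return table

for n in (0, 1, 3):
    print(n, finger_table(n, [0, 1, 3], m=3))
# → 0 [(1, 1), (2, 3), (4, 0)]
# → 1 [(2, 3), (3, 3), (5, 0)]
# → 3 [(4, 0), (5, 0), (7, 0)]
```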
12
Example – fast lookup
13
Fast lookup - pseudo code

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        n’ = closest_preceding_node(id);
        return n’.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if (finger[i] ∈ (n, id))
          return finger[i];
      return n;
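The pseudo-code above can be exercised with a small in-process simulation. This is a sketch under stated assumptions: Python objects stand in for remote nodes, the finger tables are filled in by a global helper (`build_network`, a name introduced here) rather than by the join protocol, and the ten-node ring with m = 6 is just sample data.

```python
from bisect import bisect_left

def between(x, a, b, size, inclusive_right=False):
    """Ring-interval test: x in (a, b), or (a, b] if inclusive_right."""
    x, b = (x - a) % size, (b - a) % size
    if b == 0:                      # a == b: interval spans the whole ring
        return inclusive_right or x != 0
    return 0 < x < b or (inclusive_right and x == b)

class Node:
    def __init__(self, ident, m):
        self.id, self.size = ident, 2 ** m
        self.finger = []            # finger[0] is the immediate successor

    @property
    def successor(self):
        return self.finger[0]

    def find_successor(self, ident, path=None):
        """Return (responsible node, list of node ids that were asked)."""
        path = (path or []) + [self.id]
        if between(ident, self.id, self.successor.id, self.size, True):
            return self.successor, path
        return self.closest_preceding_node(ident).find_successor(ident, path)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):      # i = m downto 1
            if between(f.id, self.id, ident, self.size):
                return f
        return self

def build_network(ids, m):
    """Demo helper: create nodes and fill every finger table correctly."""
    ring = sorted(ids)
    nodes = {i: Node(i, m) for i in ring}
    for n in nodes.values():
        for i in range(m):
            start = (n.id + 2 ** i) % n.size
            n.finger.append(nodes[ring[bisect_left(ring, start) % len(ring)]])
    return nodes

net = build_network([1, 8, 14, 21, 32, 38, 42, 48, 51, 56], m=6)
target, path = net[8].find_successor(54)
print(target.id, path)  # → 56 [8, 42, 51]
```

Each forward jumps to the finger closest to, but still before, the requested identifier, so the remaining distance shrinks quickly instead of one node at a time.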
14
Fast lookup - analysis
- Time complexity at each node: O(m).
- Theorem: the number of nodes in a search path is O(m) in the worst case.
- The number of messages equals the number of nodes in the search path.
- Theorem: if nodes are distributed uniformly among identifiers, then the number of nodes in a search path is O(log N) with high probability, where N is the number of nodes.
15
Fast lookup – analysis (cont.)
- $X_i$: indicator that the i-th node lies in the last interval of $2^m/N$ identifiers in the search path.
- $X = \sum_i X_i$
- $\mu = E[X] = 1$
- Chernoff bound: $\Pr[X > (1+\delta)\mu] < \dfrac{e^{\delta}}{(1+\delta)^{1+\delta}}$
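One way to finish the argument is sketched below. It assumes node identifiers are independent and uniform, so each of the $N$ nodes falls in a fixed interval of $2^m/N$ identifiers with probability $1/N$ (which gives $\mu = 1$). The choice $1+\delta = 2\log_2 N$ is made here for illustration and is not taken from the slides.

```latex
\[
\Pr[X > 2\log_2 N]
  < \frac{e^{\delta}}{(1+\delta)^{1+\delta}}
  < \left(\frac{e}{1+\delta}\right)^{1+\delta}
  = \left(\frac{e}{2\log_2 N}\right)^{2\log_2 N}
  \le 2^{-2\log_2 N}
  = N^{-2}
  \qquad (N \ge 7).
\]
```

The last inequality uses $e/(2\log_2 N) \le 1/2$, which holds once $\log_2 N \ge e$. Each finger hop at least halves the remaining distance to the target, so after about $\log_2 N$ hops the query is within $2^m/N$ identifiers of it; by the bound above, that last interval contains only $O(\log N)$ nodes with high probability, giving $O(\log N)$ nodes on the search path overall.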
16
Node Joins and Stabilizations
- Nodes join dynamically.
- The network maintains two invariants:
  - Each node's successor is up to date.
  - For every key k, successor(k) is responsible for k.
- When node n joins:
  - n.successor and n.finger[i] are updated.
  - The successor pointers and finger tables of other nodes are updated.
  - The correct keys are transferred to n.
- Each node runs a “stabilization” protocol periodically in the background to update its successor pointer and finger table.
17
Node Joins and Stabilizations
- The “stabilization” protocol consists of 6 functions:
  - create()
  - join()
  - stabilize()
  - notify()
  - fix_fingers()
  - check_predecessor()
- Each node has both successor and predecessor pointers.
18
Node Joins – join()
- When node n first starts, it calls n.join(n’), where n’ is any known Chord node.
- The join() function asks n’ to find the immediate successor of n.
- join() does not make the rest of the network aware of n.
19
Node Joins – join()

    // create a new Chord ring.
    n.create()
      predecessor = nil;
      successor = n;

    // join a Chord ring containing node n’.
    n.join(n’)
      predecessor = nil;
      successor = n’.find_successor(n);
20
Node Joins – stabilize()
- Each time node n runs stabilize(), it asks its successor for the successor’s predecessor p, and decides whether p should be n’s successor instead.
- stabilize() notifies node n’s successor of n’s existence, giving the successor the chance to change its predecessor to n.
- The successor does this only if it knows of no closer predecessor than n.
21
Node Joins – stabilize()

    // called periodically. verifies n’s immediate
    // successor, and tells the successor about n.
    n.stabilize()
      x = successor.predecessor;
      if (x ∈ (n, successor))
        successor = x;
      successor.notify(n);

    // n’ thinks it might be our predecessor.
    n.notify(n’)
      if (predecessor is nil or n’ ∈ (predecessor, n))
        predecessor = n’;
22
Node Joins – Join and Stabilization
[Figure: node n joins between n_p and n_s. Before: succ(n_p) = n_s, pred(n_s) = n_p, pred(n) = nil. After: succ(n_p) = n, pred(n_s) = n.]
- n joins:
  - predecessor = nil
  - n acquires n_s as its successor via some n’
- n runs stabilize:
  - n notifies n_s that it is the new predecessor
  - n_s acquires n as its predecessor
- n_p runs stabilize:
  - n_p asks n_s for its predecessor (now n)
  - n_p acquires n as its successor
  - n_p notifies n
  - n acquires n_p as its predecessor
- All predecessor and successor pointers are now correct.
- Fingers still need to be fixed, but old fingers will still work.
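The join and stabilize sequence can be replayed with a small in-process simulation. This is a sketch, not a networked implementation: Python objects stand in for remote nodes, the constructor plays the role of create(), find_successor() walks successor pointers only, and the identifiers 0, 1 and 3 (m = 3) are sample values chosen here.

```python
def between(x, a, b, size, inclusive_right=False):
    """Ring-interval test: x in (a, b), or (a, b] if inclusive_right."""
    x, b = (x - a) % size, (b - a) % size
    if b == 0:                      # a == b: interval spans the whole ring
        return inclusive_right or x != 0
    return 0 < x < b or (inclusive_right and x == b)

class Node:
    def __init__(self, ident, m):   # acts as create(): a one-node ring
        self.id, self.size = ident, 2 ** m
        self.predecessor = None
        self.successor = self

    def find_successor(self, ident):
        node = self
        while not between(ident, node.id, node.successor.id, node.size, True):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id, self.size):
            self.successor = x
        self.successor.notify(self)

    def notify(self, other):
        if self.predecessor is None or between(
                other.id, self.predecessor.id, self.id, self.size):
            self.predecessor = other

# Build the two-node ring n_p <-> n_s first.
n_p, n_s = Node(0, 3), Node(3, 3)
n_s.join(n_p)
for _ in range(2):
    n_p.stabilize()
    n_s.stabilize()

# Now n joins between them, and the two stabilize calls run in order.
n = Node(1, 3)
n.join(n_p)        # n acquires n_s as successor
n.stabilize()      # n_s acquires n as predecessor
n_p.stabilize()    # n_p acquires n as successor; n acquires n_p as predecessor
print(n_p.successor.id, n.successor.id, n.predecessor.id, n_s.predecessor.id)
# → 1 3 0 1
```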
23
Node Joins – fix_fingers()
- Each node periodically calls fix_fingers() to make sure its finger table entries are correct.
- New nodes initialize their finger tables using fix_fingers().
- Existing nodes incorporate new nodes into their finger tables using fix_fingers().
- Each node maintains a pointer next into its finger table.
24
Node Joins – fix_fingers()

    // called periodically. refreshes finger table entries.
    n.fix_fingers()
      next = next + 1;
      if (next > m)
        next = 1;
      finger[next].node = find_successor(n + 2^(next-1));

    // checks whether predecessor has failed.
    n.check_predecessor()
      if (predecessor has failed)
        predecessor = nil;

- What is the complexity?
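The round-robin refresh driven by the next pointer can be sketched as follows. The `lookup` callable is a stand-in for find_successor() and is an assumption of this example: here it is a global oracle over a fixed three-node ring, whereas a real node would issue a Chord lookup. Each call refreshes one entry at the cost of one lookup, so refreshing the whole table takes m calls.

```python
from bisect import bisect_left

class Node:
    def __init__(self, ident, m, lookup):
        self.id, self.m = ident, m
        self.lookup = lookup              # stand-in for find_successor()
        self.finger = [None] * (m + 1)    # 1-based; finger[0] is unused
        self.next = 0

    def fix_fingers(self):
        """Refresh a single finger entry, cycling through 1..m."""
        self.next += 1
        if self.next > self.m:
            self.next = 1
        start = (self.id + 2 ** (self.next - 1)) % 2 ** self.m
        self.finger[self.next] = self.lookup(start)

ring = [0, 1, 3]

def oracle(ident):
    """Demo-only global lookup over the fixed ring above."""
    return ring[bisect_left(ring, ident) % len(ring)]

node = Node(1, 3, oracle)
for _ in range(3):                        # m calls fill the whole table
    node.fix_fingers()
print(node.finger[1:])  # → [3, 3, 0]
```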
25
Node Failures
- The key step in failure recovery is maintaining correct successor pointers.
- To help achieve this, each node maintains a successor list of its r nearest successors on the ring.
- Successor lists are stabilized as follows: node n reconciles its list with its successor s by copying s’s successor list, removing its last entry, and prepending s to it.
- If node n notices that its successor has failed, it replaces it with the first live entry in its successor list and reconciles its successor list with its new successor.
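The two successor-list rules can be written as small pure functions. This is a sketch: node identifiers stand in for nodes, the function names and the `is_alive` callback are introduced here for illustration, and the five-node ring with r = 3 is sample data.

```python
def reconcile(s, s_list, r):
    """n's new list: s prepended to s's list with its last entry removed."""
    return ([s] + s_list[:-1])[:r]

def first_live(successor_list, is_alive):
    """Return the first entry that is still alive, or None if all failed."""
    for s in successor_list:
        if is_alive(s):
            return s
    return None

# Sample ring 1 -> 8 -> 14 -> 21 -> 32 -> 1 with r = 3.
lists = {1: [8, 14, 21], 8: [14, 21, 32], 14: [21, 32, 1]}
print(reconcile(8, lists[8], r=3))         # → [8, 14, 21]

# Node 8 fails: node 1 falls back to the first live entry, then reconciles.
failed = {8}
new_succ = first_live(lists[1], lambda s: s not in failed)
lists[1] = reconcile(new_succ, lists[new_succ], r=3)
print(new_succ, lists[1])                  # → 14 [14, 21, 32]
```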