
1 Distributed Hash tables

2 Overview
- Objective
  - A distributed lookup service
  - Data items are distributed among n parties
  - Anyone in the network can find an item efficiently
  - No central node or infrastructure
- Applications:
  - P2P file sharing without a central node (items are files)
  - Large-scale distributed or grid computation (items are computational tasks)

3 Chord
- Uses consistent hashing to map keys to nodes.
- Enables
  - Lookup
  - Node joining the network
  - Node leaving the network
- Each node maintains routing information on the network.

4 Consistent hashing
- A consistent hash function assigns each node and key an m-bit identifier.
- A node is a member of the network: a processing and communication unit.
- A key identifies a data item:
  - Example: movie name
- A node’s identifier can be defined by, e.g., hashing the node’s IP address.
- An identifier for a node or key can be produced by hashing:
  - ID(node) = hash(IP address)
  - ID(key) = hash(key)

5 Consistent hashing (cont.)
- Properties of consistent hashing:
  - Load balancing: given N nodes, with high probability each node stores an O(1/N) fraction of the keys.
  - When the N-th node joins, with high probability only an O(1/N) fraction of the keys is moved.
- Practical problems with identifiers
  - An IP address is not always a good identifier (NAT, DHCP).
  - A key is not always well defined (e.g. a file name).

6 Chord ring
- Chord orders all m-bit identifiers (both nodes and keys) on a ring of 2^m elements.
- Order is defined modulo 2^m (identifier 2^m − 1 is followed by 0).
- Identifier space for keys: {0, …, 2^m − 1}
  - Not all keys need be in use concurrently.
- Identifier space for nodes: {0, …, 2^m − 1}
  - Not all nodes are necessarily active concurrently.
- Key k is assigned to node n if
  - id(n) = min {id(n’) : id(n’) ≥ id(k)} (≥ is interpreted modulo 2^m, i.e. the first node clockwise).
- n is called successor(k).
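The assignment rule above can be sketched in a few lines of Python. This is only an illustration: the function name `successor` follows the slide, while the list-based ring and the `node_ids` and `m` parameters are assumptions made for the example.

```python
from bisect import bisect_left

def successor(key_id: int, node_ids: list[int], m: int) -> int:
    """Return the first node identifier at or clockwise after key_id."""
    ring = sorted(set(node_ids))        # assumes at least one node
    k = key_id % (2 ** m)               # identifiers live in {0, ..., 2^m - 1}
    i = bisect_left(ring, k)            # index of the first node with id >= k
    return ring[i % len(ring)]          # past the largest id, wrap to the smallest

nodes = [0, 1, 3]                       # example ring with m = 3
print([successor(k, nodes, m=3) for k in (1, 2, 6)])   # prints [1, 3, 0]
```

A sorted list with binary search stands in for global knowledge of the ring; no real Chord node has that knowledge, which is why the later slides introduce routing.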

7 Successor Nodes
[Figure: identifier circle with m = 3 (identifiers 0 to 7), nodes at 0, 1 and 3, and keys 1, 2 and 6.]
- successor(1) = 1
- successor(2) = 3
- successor(6) = 0

8 More on hashing
- Desirable properties for the hash function:
  - Collision resistant on nodes (mandatory)
  - Collision resistant on keys (hopefully)
  - Load balancing
  - Good adversarial behavior
- Chord suggestion: SHA-1
  - ID(node) = SHA-1(IP address)
  - ID(key) = SHA-1(key)
- Drawback
  - Full SHA-1 implies m = 160.
  - Truncated SHA-1 may not be collision-resistant.
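As a rough sketch of the SHA-1 suggestion, the following derives an m-bit identifier from a string. The name `chord_id`, the choice to keep the top m bits, and the example address and key are all assumptions for illustration; the slides do not say how truncation should be done.

```python
import hashlib

def chord_id(value: str, m: int = 160) -> int:
    """Hash a string to an m-bit identifier by keeping the top m bits of SHA-1."""
    if not 1 <= m <= 160:
        raise ValueError("m must be between 1 and 160")
    digest = hashlib.sha1(value.encode("utf-8")).digest()   # 20 bytes = 160 bits
    return int.from_bytes(digest, "big") >> (160 - m)       # drop the low bits

node_id = chord_id("192.0.2.10", m=16)          # hypothetical node address
key_id = chord_id("some movie name", m=16)      # hypothetical key
print(0 <= node_id < 2 ** 16, 0 <= key_id < 2 ** 16)
```

With a small m, collisions become likely as the number of nodes and keys grows, which is the drawback the slide points out.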

9 Key lookup
- Suppose each node stores successor(node).
- We could use the following pseudo-code:

    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);

- Example of execution
- What is the time/message complexity?
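A small in-process simulation makes the cost of this successor-only lookup visible. It is a sketch under assumptions: nodes are plain Python objects rather than networked processes, and `in_half_open`, `build_ring` and the `hops` counter are names introduced here.

```python
def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    if a < b:
        return a < x <= b
    return x > a or x <= b      # interval wraps past zero; a == b covers the whole ring

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self   # a lone node is its own successor

    def find_successor(self, ident: int, hops: int = 0):
        """Return (responsible node, number of forwarded messages)."""
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor, hops
        # forward the query around the circle
        return self.successor.find_successor(ident, hops + 1)

def build_ring(ids):
    """Create nodes for the given identifiers and link each to its successor."""
    nodes = [Node(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b
    return nodes

ring = build_ring(range(8))
owner, hops = ring[0].find_successor(0)
print(owner.id, hops)           # prints 0 7
```

In this sketch the query can be forwarded through every other node before it reaches the owner, so the worst case is N − 1 messages for N nodes.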

10 Key lookup in O(m)
- Each node n holds additional routing information: the finger table
  - m entries (at most)
  - Each entry is called a “finger”
- Data in each entry
  - Finger[i].node = successor((n + 2^(i−1)) mod 2^m), i = 1, …, m
  - Finger[i] includes TCP/IP access information such as IP address and port.
- Denote the i-th finger by node.finger(i).
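The finger definition can be checked with a short sketch that computes every finger from a global view of the ring. The function name `finger_table` and the `(start, node)` tuple layout are assumptions for illustration; a real node would fill its table through lookups instead.

```python
from bisect import bisect_left

def finger_table(n: int, node_ids, m: int):
    """Return [(start_i, successor(start_i))] for i = 1..m at node n."""
    ring = sorted(set(node_ids))
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)     # n + 2^(i-1) mod 2^m
        j = bisect_left(ring, start)              # first node with id >= start
        table.append((start, ring[j % len(ring)]))
    return table

for node in (0, 1, 3):
    print(node, finger_table(node, [0, 1, 3], m=3))
```

For the ring with nodes 0, 1 and 3 and m = 3, node 0 gets starts 1, 2, 4 with successors 1, 3, 0.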

11 Example – finger tables
[Figure: identifier circle with m = 3 and nodes 0, 1 and 3, each shown with its finger table and stored keys.]

Node 0 (keys: 6)
  start  int.    succ.
  1      [1,2)   1
  2      [2,4)   3
  4      [4,0)   0

Node 1 (keys: 1)
  start  int.    succ.
  2      [2,3)   3
  3      [3,5)   3
  5      [5,1)   0

Node 3 (keys: 2)
  start  int.    succ.
  4      [4,5)   0
  5      [5,7)   0
  7      [7,3)   0

12 Example – fast lookup

13 Fast lookup - pseudo code

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        n’ = closest_preceding_node(id);
        return n’.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if (finger[i] ∈ (n, id))
          return finger[i];
      return n;
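The pseudo-code translates fairly directly into an in-process Python sketch. Everything beyond the two method names is an assumption for illustration: nodes are local objects, finger tables are filled from a global view by the hypothetical helper `build_network`, and the interval helpers are introduced here.

```python
from bisect import bisect_left

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b           # wraps past zero; a == b means all ids except a

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

class Node:
    def __init__(self, ident, m):
        self.id, self.m = ident, m
        self.finger = []            # finger[i - 1] holds the slide's finger[i]

    @property
    def successor(self):
        return self.finger[0]       # finger[1] is the immediate successor

    def find_successor(self, ident):
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor
        return self.closest_preceding_node(ident).find_successor(ident)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):             # i = m downto 1
            if in_open(f.id, self.id, ident):
                return f
        return self

def build_network(ids, m):
    """Create nodes with correct finger tables, using global knowledge."""
    ring = sorted(set(ids))
    nodes = {i: Node(i, m) for i in ring}
    for n in nodes.values():
        for i in range(1, m + 1):
            start = (n.id + 2 ** (i - 1)) % (2 ** m)
            n.finger.append(nodes[ring[bisect_left(ring, start) % len(ring)]])
    return nodes

net = build_network([0, 1, 3], m=3)
print(net[3].find_successor(2).id)   # prints 3
```

In the example, node 3 forwards the query for id 2 to node 0, which forwards it to node 1, which answers with its successor, node 3.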

14 Fast lookup - analysis
- Time complexity at each node: O(m)
- Theorem: the number of nodes in a search path is O(m) in the worst case.
- The number of messages is equal to the number of nodes in the search path.
- Theorem: if nodes are distributed uniformly among identifiers, then the number of nodes in a search path is O(log N) with high probability, where N is the number of nodes.

15 Fast lookup – analysis (cont.)
- X_i: indicator that the i-th node is in the last interval of 2^m/N identifiers in the search path.
- X = Σ X_i
- μ = E[X] = 1
- Chernoff bound:
  - Pr[X > (1+δ)μ] < (e^δ / (1+δ)^(1+δ))^μ
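To get a feel for how quickly this bound shrinks, it can be evaluated numerically. The helper name `chernoff_upper_tail` is an assumption; the formula is the one on the slide, and μ = 1 is the value given there.

```python
import math

def chernoff_upper_tail(mu: float, delta: float) -> float:
    """Evaluate the bound (e^delta / (1 + delta)^(1 + delta))^mu on Pr[X > (1 + delta) * mu]."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

# With mu = 1, the bound on Pr[X > t] is e^(t - 1) / t^t for t = 1 + delta.
for t in (2, 4, 8, 16):
    print(t, chernoff_upper_tail(1.0, t - 1))
```

The bound falls off faster than exponentially in t, which is what lets the last 2^m/N identifiers contribute only a small number of extra hops with high probability.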

16 Node Joins and Stabilizations
- Nodes join dynamically.
- The network maintains two invariants:
  - Each node’s successor is up to date.
  - For every key k, successor(k) is responsible for k.
- When node n joins:
  - n.successor and n.finger[i] are updated.
  - Successor and finger tables of other nodes are updated.
  - The correct keys are transferred to n.
- Each node runs a “stabilization” protocol periodically in the background to update its successor pointer and finger table.

17 Node Joins and Stabilizations
- The “stabilization” protocol contains 6 functions:
  - create()
  - join()
  - stabilize()
  - notify()
  - fix_fingers()
  - check_predecessor()
- Each node has both successor and predecessor pointers.

18 Node Joins – join()
- When node n first starts, it calls n.join(n’), where n’ is any known Chord node.
- The join() function asks n’ to find the immediate successor of n.
- join() does not make the rest of the network aware of n.

19 Node Joins – join()

    // create a new Chord ring.
    n.create()
      predecessor = nil;
      successor = n;

    // join a Chord ring containing node n’.
    n.join(n’)
      predecessor = nil;
      successor = n’.find_successor(n);

20 Node Joins – stabilize()
- Each time node n runs stabilize(), it asks its successor for the successor’s predecessor p, and decides whether p should be n’s successor instead.
- stabilize() notifies node n’s successor of n’s existence, giving the successor the chance to change its predecessor to n.
- The successor does this only if it knows of no closer predecessor than n.

21 Node Joins – stabilize()

    // called periodically. verifies n’s immediate
    // successor, and tells the successor about n.
    n.stabilize()
      x = successor.predecessor;
      if (x ∈ (n, successor))
        successor = x;
      successor.notify(n);

    // n’ thinks it might be our predecessor.
    n.notify(n’)
      if (predecessor is nil or n’ ∈ (predecessor, n))
        predecessor = n’;

22 Node Joins – Join and Stabilization
[Figure: node n joins between n_p and n_s. Before: succ(n_p) = n_s, pred(n_s) = n_p. After: succ(n_p) = n, pred(n_s) = n.]
- n joins
  - predecessor = nil
  - n acquires n_s as successor via some n’
- n runs stabilize
  - n notifies n_s that it is the new predecessor
  - n_s acquires n as its predecessor
- n_p runs stabilize
  - n_p asks n_s for its predecessor (now n)
  - n_p acquires n as its successor
  - n_p notifies n
  - n acquires n_p as its predecessor
- All predecessor and successor pointers are now correct.
- Fingers still need to be fixed, but old fingers will still work.
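The walkthrough above can be replayed with a small in-process sketch of join(), stabilize() and notify(). Several things are assumptions made for the example: nodes are local Python objects, lookups follow successor pointers only, the identifiers 10, 20 and 30 are arbitrary, and stabilize() skips a nil predecessor, a check the pseudo-code leaves implicit.

```python
def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.predecessor = None     # create(): predecessor = nil
        self.successor = self       # create(): successor = n

    def find_successor(self, ident):
        node = self                 # successor-only lookup, enough for a tiny ring
        while not in_half_open(ident, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or in_open(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Start from a correct two-node ring n_p -> n_s -> n_p.
n_p, n_s, n = Node(10), Node(30), Node(20)
n_p.successor, n_p.predecessor = n_s, n_s
n_s.successor, n_s.predecessor = n_p, n_p

n.join(n_p)        # n learns n_s as its successor; nobody knows n yet
n.stabilize()      # n_s adopts n as its predecessor
n_p.stabilize()    # n_p adopts n as its successor and notifies n
print(n_p.successor.id, n.successor.id, n.predecessor.id, n_s.predecessor.id)   # prints 20 30 10 20
```

Only two stabilize() calls are needed here because the ring is tiny and the calls happen in a convenient order; in a running system they fire periodically and independently.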

23 Node Joins – fix_fingers()
- Each node periodically calls fix_fingers() to make sure its finger table entries are correct.
- New nodes initialize their finger tables using fix_fingers().
- Existing nodes incorporate new nodes into their finger tables using fix_fingers().
- Each node maintains a pointer next into its finger table.

24 Node Joins – fix_fingers()

    // called periodically. refreshes finger table entries.
    n.fix_fingers()
      next = next + 1;
      if (next > m)
        next = 1;
      finger[next].node = find_successor(n + 2^(next−1));

    // checks whether predecessor has failed.
    n.check_predecessor()
      if (predecessor has failed)
        predecessor = nil;

- What is the complexity?
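A minimal sketch of the two periodic routines follows, again with local objects instead of networked nodes. The successor-only `find_successor`, the explicit `mod 2^m` on the finger start (taken from the finger definition on slide 10), and the `is_alive` callback used for failure detection are assumptions for illustration.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

class Node:
    def __init__(self, ident, m):
        self.id, self.m = ident, m
        self.successor = self
        self.predecessor = None
        self.finger = [self] * m    # finger[i - 1] holds the slide's finger[i]
        self.next = 0               # index of the finger refreshed last

    def find_successor(self, ident):
        node = self                 # successor-only lookup keeps the sketch short
        while not in_half_open(ident, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def fix_fingers(self):
        """Refresh one finger per call, cycling through entries 1..m."""
        self.next += 1
        if self.next > self.m:
            self.next = 1
        start = (self.id + 2 ** (self.next - 1)) % (2 ** self.m)
        self.finger[self.next - 1] = self.find_successor(start)

    def check_predecessor(self, is_alive):
        """Clear the predecessor pointer if the (hypothetical) probe says it failed."""
        if self.predecessor is not None and not is_alive(self.predecessor):
            self.predecessor = None

nodes = {i: Node(i, m=3) for i in (0, 1, 3)}
nodes[0].successor, nodes[1].successor, nodes[3].successor = nodes[1], nodes[3], nodes[0]
for _ in range(3):
    nodes[0].fix_fingers()
print([f.id for f in nodes[0].finger])   # prints [1, 3, 0]
```

Each call costs one lookup, so refreshing the whole table takes m calls; with the finger-based lookup each of those is O(log N) messages with high probability.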

25 Node Failures
- The key step in failure recovery is maintaining correct successor pointers.
- To help achieve this, each node maintains a successor-list of its r nearest successors on the ring.
- If node n notices that its successor has failed, it replaces it with the first live entry in the list.
- Successor lists are stabilized as follows:
  - Node n reconciles its list with its successor s by copying s’s successor list, removing its last entry, and prepending s to it.
  - If node n notices that its successor has failed, it replaces it with the first live entry in its successor list and reconciles its successor list with its new successor.
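The successor-list rules can be sketched with plain lists of node identifiers. The function names `reconcile`, `first_live` and `repair`, the `is_alive` probe, and the `lists` dictionary standing in for a remote call to the new successor are all assumptions for illustration.

```python
def reconcile(s, s_list, r):
    """Build n's list from successor s: copy s's list, drop its last entry, prepend s."""
    # While s's own list is still shorter than r, keep all of it (an assumption).
    trimmed = s_list[:-1] if len(s_list) >= r else list(s_list)
    return ([s] + trimmed)[:r]

def first_live(successor_list, is_alive):
    """Return the first entry that answers the liveness probe, or None."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None

def repair(own_list, lists, is_alive, r):
    """Replace a failed successor with the first live entry, then reconcile with it."""
    new_succ = first_live(own_list, is_alive)
    if new_succ is None:
        return None, []             # every known successor has failed
    return new_succ, reconcile(new_succ, lists[new_succ], r)

# Node n has successors [8, 14, 21]; node 8 fails and node 14 reports [21, 32, 38].
succ, new_list = repair([8, 14, 21], {14: [21, 32, 38]}, lambda node: node != 8, r=3)
print(succ, new_list)               # prints 14 [14, 21, 32]
```

With r entries, the ring stays connected unless all r successors of some node fail at the same time.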
