CS694 - DHT: Distributed Hash Table Systems. Hui Zhang, University of Southern California.


1 CS694 - DHT: Distributed Hash Table Systems
Hui Zhang, University of Southern California

2 Outline
- History
- General definition of a DHT system
- Chord
- Kademlia
- Viceroy
- Koorde
- Open questions

3 History: client-server model
[Figure: a server holding the index table and the data; clients a, b, c, d; arrows labeled "query" and "data transferring".]

4 History: peer-to-peer model (Napster)
[Figure: a server holding the index table; clients a, b, c, d holding the data; arrows labeled "query" and "data transferring".]

5 History: peer-to-peer model (Gnutella)
[Figure: clients a, b, c, d, each holding its own index table and data; arrows labeled "query" and "data transferring".]

6 History: peer-to-peer model (DHT systems)
[Figure: clients a, b, c, d, each holding an index table and data; arrows labeled "query" and "data transferring".]

7 DHT systems - definition
- A new class of peer-to-peer routing infrastructures.
- They support hash table-like functionality at Internet-like scale: given a key, map it onto a node.

8 Basic DHT components
- An overlay space and its partition approach
- A routing protocol
  - local routing table
  - next-hop decision
- A base hash function
  - variants: proximity-aware, locality-preserving, etc.

9 DHT performance metrics
Scalability is the first design goal. Two performance metrics:
- expected hops per request
- expected routing table size

10 Consistent hashing [Karger et al. STOC97]
[Figure: data and servers mapped by hashing into a common overlay space; labels: "data", "server", "Overlay space", "Hashing".]

11 Chord [Stoica et al. Sigcomm2001]
Overlay space: a one-dimensional unidirectional key space, $0$ to $2^m - 1$.
- Given two m-bit identifiers x and y, $d(x, y) = (y - x + 2^m) \bmod 2^m$.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys after its predecessor's identifier, up to and including its own identifier.
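The distance function and the responsibility rule above can be sketched in a few lines of Python. This is only an illustration: the value m = 8, the node identifiers, and the function names `ring_distance` and `responsible_node` are assumptions chosen for the example, not part of the slides.

```python
M = 8                      # identifier bits (assumed for this example)
RING = 2 ** M              # size of the key space


def ring_distance(x, y):
    """Clockwise distance from x to y: d(x, y) = (y - x + 2^m) mod 2^m."""
    return (y - x + RING) % RING


def responsible_node(nodes, key):
    """Node that owns `key`: the first node at or after the key, clockwise."""
    return min(nodes, key=lambda n: ring_distance(key, n))


nodes = [0, 32, 64, 96, 128, 160, 192, 224]   # hypothetical node identifiers
print(ring_distance(250, 5))         # 11, the distance wraps around 0
print(responsible_node(nodes, 120))  # 128
print(responsible_node(nodes, 230))  # 0, ownership also wraps around
```

Because the distance is unidirectional, `ring_distance(x, y)` and `ring_distance(y, x)` generally differ.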

12 Chord – routing protocol
Routing table (finger table):
- (At most) m entries. The i-th entry of node n points to the first node that succeeds n by at least $2^{i-1}$ on the key space, $1 \le i \le m$.
Next-hop decision:
- For the given target key k, find the closest finger preceding k and forward the request to it.
- Ending condition: the request terminates when k lies between the current node's identifier and its successor's identifier.
- The routing path length is O(log n) for an n-node network with high probability (w.h.p.).
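A minimal simulation of the finger table and the next-hop rule, assuming a global view of the ring (a real node only knows its own fingers). The ring of eight nodes at multiples of 32 with m = 8, the starting node 0, and all function names are assumptions made for this sketch.

```python
M = 8
RING = 2 ** M


def dist(x, y):
    """Clockwise distance from x to y on the ring."""
    return (y - x) % RING


def successor(nodes, key):
    """First node at or after `key`, clockwise."""
    return min(nodes, key=lambda n: dist(key, n))


def finger_table(nodes, n):
    """Entry i (1..m) is the first node succeeding n by at least 2^(i-1)."""
    return [successor(nodes, (n + 2 ** (i - 1)) % RING) for i in range(1, M + 1)]


def lookup(nodes, start, key):
    """Return (owner of key, list of nodes visited). Assumes at least 2 nodes."""
    n, path = start, [start]
    while True:
        if key == n:                          # the current node owns its own ID
            return n, path
        fingers = finger_table(nodes, n)
        succ = fingers[0]
        if dist(n, key) <= dist(n, succ):     # key lies in (n, successor]
            return succ, path
        # closest finger that precedes the key
        n = next(f for f in reversed(fingers) if 0 < dist(n, f) < dist(n, key))
        path.append(n)


nodes = [0, 32, 64, 96, 128, 160, 192, 224]
print(finger_table(nodes, 0))   # [32, 32, 32, 32, 32, 32, 64, 128]
print(lookup(nodes, 0, 120))    # (128, [0, 64, 96])
```

Each hop moves strictly closer to the key, which is why the loop terminates.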

13 Chord – an example (m=8)
A Chord network with 8 nodes and an 8-bit key space.
[Figure: key space from 0 to 256 with marks at 0, 32, 64, 96, 128, 160, 192, 224, 256; legend: "Network node", "Data"; a data item at key 120.]

14 Chord – routing table setup
A Chord network with 8 nodes and an 8-bit key space.
[Figure: ring with labels 255, 0, 32, 64, 96, 128, 160, 192, 224; legend: "Network node", "Data", "Pointer".]
Range 1: [1, 2)
Range 2: [2, 4)
Range 3: [4, 8)
Range 4: [8, 16)
Range 5: [16, 32)
Range 6: [32, 64)
Range 7: [64, 128)
Range 8: [128, 256)

15 Chord – a lookup for key 120

16 Chord – node joining
1. Initializing fingers and predecessor.
2. Updating fingers of existing nodes.
3. Transferring keys between the new node and its successor node.

17 Chord – stabilization mechanism
- Keep the pointer to the immediate successor up to date: this guarantees correctness of lookups.
- Keep the pointers to the remaining (m-1) fingers up to date: this keeps lookups fast.

18 Chord – successors-list mechanism

19 Chord – simulation result [Stoica et al. Sigcomm2001]

20 Chord – "failure" experiment
[Figure: the fraction of lookups that fail as a function of the fraction of nodes that fail. [Stoica et al. Sigcomm2001]]

21 Kademlia [Maymounkov et al. IPTPS02]
Overlay space: an m-bit XOR-based metric space.
- Given two m-bit identifiers x and y, $d(x, y) = x \oplus y$.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys that it is closest to.
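As a small illustration of the XOR metric and the "closest node owns the key" rule, the following Python sketch uses 4-bit identifiers. The node identifiers and the function names are hypothetical.

```python
def xor_distance(x, y):
    """Kademlia distance between two identifiers: d(x, y) = x XOR y."""
    return x ^ y


def closest_node(nodes, key):
    """Node responsible for `key`: the one with the smallest XOR distance."""
    return min(nodes, key=lambda n: xor_distance(n, key))


nodes = [0b0010, 0b0111, 0b1100]      # hypothetical 4-bit node identifiers
key = 0b0101
print([xor_distance(n, key) for n in nodes])   # [7, 2, 9]
print(bin(closest_node(nodes, key)))           # 0b111
```

Unlike the Chord distance, the XOR distance is symmetric: `xor_distance(x, y) == xor_distance(y, x)`.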

22 Kademlia – routing protocol
Routing table (k-buckets):
- (At most) m buckets. The i-th bucket of node x contains pointers to up to k nodes whose distance from x is between $2^{i-1}$ and $2^i$, $1 \le i \le m$.
Next-hop decision:
- For the given target key t, find the bucket closest to t and forward the request to the nearest nodes in that bucket (proximity routing).
- Ending condition: either the initiator has obtained the requested information, or it has queried and received responses from the k closest nodes it has seen.
- The routing path length is O(log n) w.h.p.
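The bucket layout can be sketched as follows. The sketch assumes the distance range of bucket i is the half-open interval [2^(i-1), 2^i), so the bucket index is simply the bit length of the XOR distance. The fallback to all contacts when the target's bucket is empty, and all names, are assumptions for illustration rather than the protocol's exact behavior.

```python
def bucket_index(x, y):
    """Index i such that 2^(i-1) <= (x XOR y) < 2^i; 0 only when x == y."""
    return (x ^ y).bit_length()


def build_buckets(x, known_nodes, k):
    """Group the contacts of node x into buckets holding at most k nodes each."""
    buckets = {}
    for node in known_nodes:
        if node == x:
            continue
        bucket = buckets.setdefault(bucket_index(x, node), [])
        if len(bucket) < k:            # a full bucket ignores further contacts here
            bucket.append(node)
    return buckets


def next_hops(x, buckets, target):
    """Contacts from the bucket covering `target`, nearest to the target first."""
    candidates = buckets.get(bucket_index(x, target))
    if not candidates:                 # assumed fallback: consider every contact
        candidates = [n for b in buckets.values() for n in b]
    return sorted(candidates, key=lambda n: n ^ target)


buckets = build_buckets(0, [1, 2, 3, 4, 5, 6, 8, 12], k=2)
print(buckets)                     # {1: [1], 2: [2, 3], 3: [4, 5], 4: [8, 12]}
print(next_hops(0, buckets, 7))    # [5, 4]
```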

23 Kademlia – routing table update scheme
- Every request or reply message includes the sender's routing information.
- On receiving a routing message, a node updates its routing table with the sender's information.
  - Least Recently Seen (LRS) bucket update scheme
- Will appending the sender's ID bring equally fresh information to all buckets?
  - with recursive lookup?
  - with iterative lookup?
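One way to picture a least-recently-seen update is the sketch below. The slide only names the scheme; the eviction details here (keep the oldest contact if it still responds, otherwise replace it) follow the usual description of Kademlia and should be read as an assumption. The function name and the `is_alive` callback are hypothetical.

```python
def update_bucket(bucket, sender, k, is_alive):
    """Update one k-bucket (a list ordered oldest-first) after hearing from `sender`."""
    if sender in bucket:
        bucket.remove(sender)          # already known: move to most-recent end
        bucket.append(sender)
    elif len(bucket) < k:
        bucket.append(sender)          # room left: just add it
    else:
        oldest = bucket[0]
        if is_alive(oldest):           # oldest still responds: keep it, drop sender
            bucket.append(bucket.pop(0))
        else:                          # oldest is gone: replace it with sender
            bucket.pop(0)
            bucket.append(sender)
    return bucket


bucket = []
for node in (1, 2, 3):
    update_bucket(bucket, node, k=3, is_alive=lambda n: True)
print(bucket)                                                  # [1, 2, 3]
print(update_bucket(bucket, 1, k=3, is_alive=lambda n: True))  # [2, 3, 1]
print(update_bucket(bucket, 4, k=3, is_alive=lambda n: True))  # [3, 1, 2]
print(update_bucket(bucket, 4, k=3, is_alive=lambda n: False)) # [1, 2, 4]
```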

24 Viceroy [Malkhi et al. PODC02]
Overlay space: a one-dimensional unidirectional key space, 0 to 1.
- Given two identifiers x and y, $d(x, y) = (y - x + 1) \bmod 1$.
- Each node selects a random identifier from the key space, and a random level number from $[1, \log N]$, where N is the network size.
- Each node is responsible for all keys after its predecessor's identifier, up to and including its own identifier.

25 Viceroy – routing table [Malkhi et al. PODC02]
7 entries for a level-l node:
- 2 stepping links to its predecessor and successor on the global ring.
- 2 stepping links to its predecessor and successor on the level-l ring.
- 2 down links to the level-(l+1) ring: one long-distance contact and one local contact.
- 1 up link to the level-(l-1) ring: one local contact.
[Figure: small-world link structure.]

26 Viceroy – routing protocol
3 steps:
1. Climb up to some level-1 node using up links.
2. Jump using down links until no down link is available.
3. Step using stepping links.
The routing path length is O(log n) w.h.p.
[Figure labels: "requestor", "target key".]

27 Viceroy – level selection algorithm [Malkhi et al. PODC02]
Localized network size estimation scheme:
- $n_{\text{estimated}} = 1 / d(s, \text{successor}(s))$ for node s.
Node s selects a level l from $[1, \log n_{\text{estimated}}]$ uniformly at random.
When node s's successor changes, node s changes its level if
- $\log n_{\text{estimated}}$ changes, and
- the currently selected level no longer exists, or the newly selected level did not exist before.
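The estimation step can be sketched as below. Several details are assumptions made only for this example: the logarithm is taken base 2 and rounded down, the upper bound of the level range is taken to be the log of the estimate (the bound is not legible in the transcript), and the function names are hypothetical. The node and its successor are assumed to have distinct identifiers.

```python
import math
import random


def unit_ring_distance(x, y):
    """Clockwise distance on the [0, 1) ring: d(x, y) = (y - x + 1) mod 1."""
    return (y - x + 1.0) % 1.0


def estimate_network_size(s, succ):
    """Local estimate: n_estimated = 1 / d(s, successor(s))."""
    return 1.0 / unit_ring_distance(s, succ)


def max_level(s, succ):
    """Assumed upper bound on the level: floor(log2(n_estimated)), at least 1."""
    return max(1, math.floor(math.log2(estimate_network_size(s, succ))))


def select_level(s, succ, rng=random):
    """Pick a level uniformly at random from [1, max_level]."""
    return rng.randint(1, max_level(s, succ))


print(estimate_network_size(0.25, 0.375))   # 8.0
print(max_level(0.25, 0.375))               # 3
print(select_level(0.25, 0.375))            # 1, 2, or 3
```

The estimate uses only the gap to one neighbor, so it is cheap to compute but can vary widely from node to node.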

28 Koorde [Kaashoek et al. IPTPS03]
Overlay space: a one-dimensional unidirectional key space, $0$ to $2^m - 1$.
- Given two m-bit identifiers x and y, $d(x, y) = (y - x + 2^m) \bmod 2^m$.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys after its predecessor's identifier, up to and including its own identifier.

29 De Bruijn graph [Bruijn1946]
- $2^b$ nodes, each node with a unique b-bit identifier.
- 2 out-pointers for each node x:
  - 1 link to the node $2x \bmod 2^b$.
  - 1 link to the node $(2x + 1) \bmod 2^b$.
- O(b) path length between any pair of nodes.
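The O(b) bound follows from shifting the target's bits into the current identifier one at a time, most significant bit first. A short Python sketch (the function name is hypothetical):

```python
def de_bruijn_path(x, y, b):
    """Nodes visited when routing from x to y in a de Bruijn graph on 2^b nodes."""
    path = [x]
    for j in range(b - 1, -1, -1):
        bit = (y >> j) & 1                 # next bit of the target, MSB first
        x = ((x << 1) | bit) % (1 << b)    # follow the link to 2x or 2x+1 mod 2^b
        path.append(x)
    return path


print(de_bruijn_path(0b101, 0b011, 3))   # [5, 2, 5, 3]
```

After b hops every original bit of x has been shifted out, so the walk always ends at y.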

30 Koorde – routing protocol
Routing table: 2 entries for node x
- 1 link to its successor on the ring.
- 1 link to its pseudo de Bruijn node, the predecessor of key 2x.
Next-hop decision on node x:
- Initially, the imaginary de Bruijn node i is set to the ID that lies between the requester and its successor and whose low bits share the longest match with the high bits of the target key k.
- If the target k lies between x and its successor, the request terminates successfully.
- Otherwise:
  - if i lies between x and its successor, i is assigned a new value by the de Bruijn routing algorithm, and the request and the new i are forwarded to x's pseudo de Bruijn node;
  - else, the request and the same i are forwarded to x's successor.
The routing path length is O(log n) for an n-node network.
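The following is a simplified simulation of this next-hop rule, not the authors' implementation. It assumes a global sorted list of node IDs, at least two nodes, 6-bit identifiers, and it starts the imaginary node i at the requester's own ID instead of choosing the best-matching ID described above. The interval conventions and all names are assumptions for this sketch.

```python
from bisect import bisect_left, bisect_right

B = 6                  # identifier bits (assumed small ring)
SIZE = 1 << B


def successor_of(nodes, key):
    """First node ID >= key, wrapping around the ring."""
    return nodes[bisect_left(nodes, key % SIZE) % len(nodes)]


def predecessor_of(nodes, key):
    """Last node ID <= key, wrapping around the ring."""
    return nodes[bisect_right(nodes, key % SIZE) - 1]


def in_open_closed(v, lo, hi):
    """True if v lies in the ring interval (lo, hi]."""
    return 0 < (v - lo) % SIZE <= (hi - lo) % SIZE


def in_half_open(v, lo, hi):
    """True if v lies in the ring interval [lo, hi)."""
    return (v - lo) % SIZE < (hi - lo) % SIZE


def koorde_lookup(nodes, start, k):
    """Return (owner of key k, hop count), starting at node `start`."""
    x, i, kshift, hops = start, start, k, 0
    while True:
        succ = successor_of(nodes, x + 1)
        if k == x:
            return x, hops
        if in_open_closed(k, x, succ):
            return succ, hops
        if in_half_open(i, x, succ):
            top = (kshift >> (B - 1)) & 1       # next bit of the key, MSB first
            i = ((i << 1) | top) % SIZE         # de Bruijn step of the imaginary node
            kshift = (kshift << 1) % SIZE
            x = predecessor_of(nodes, 2 * x)    # hop to the pseudo de Bruijn node
        else:
            x = succ                            # walk the ring toward i
        hops += 1


nodes = [1, 9, 14, 23, 30, 41, 52, 60]          # hypothetical sorted node IDs
owner, hops = koorde_lookup(nodes, 1, 37)
print(owner)                                    # 41
```

Starting i at the requester's ID always costs b de Bruijn steps; the best-match initialization on the slide exists to shorten that.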

31 Bounds and tradeoffs [Kaashoek et al. IPTPS03]
- Degree and hop counts
- Fault tolerance and maintenance

32 Open questions for DHTs
Q.1 Can one redesign DHT routing algorithms to exploit heterogeneity? [Ratnasamy et al. IPTPS2002]
Q.2 Can one redesign DHT routing algorithms to be both degree optimal and load balanced? [Kaashoek et al. IPTPS03]

