Published by Silvester McDaniel. Modified over 8 years ago.
1
CS694 - DHT
Distributed Hash Table Systems
Hui Zhang, University of Southern California
2
Outline
- History
- General definition of a DHT system
- Chord
- Kademlia
- Viceroy
- Koorde
- Open questions
3
History: client-server model
[Figure: a server with the index table and the data; clients a, b, c, and d; arrows labeled query and data transfer.]
4
History: peer-to-peer model (Napster)
[Figure: a server with the index table; clients a, b, c, and d holding the data; arrows labeled query and data transfer.]
5
History: peer-to-peer model (Gnutella)
[Figure: clients a, b, c, and d, each with its own index table and data; arrows labeled query and data transfer.]
6
History: peer-to-peer model (DHT systems)
[Figure: clients a, b, c, and d, each with its own index table and data; arrows labeled query and data transfer.]
7
DHT systems - definition
- A new class of peer-to-peer routing infrastructures.
- They support hash-table-like functionality at Internet-like scale: given a key, map it onto a node.
8
Basic DHT components
- An overlay space and its partition approach
- A routing protocol
  - local routing table
  - next-hop decision
- A base hash function
  - variants: proximity-aware, locality-preserving, etc.
9
DHT performance metrics
- Scalability is the first design goal.
- Two performance metrics:
  - expected hops per request
  - expected routing table size
10
Consistent hashing [Karger et al. STOC97]
[Figure: data and servers are both hashed into the overlay space.]
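The idea of hashing both servers and data into one overlay space can be sketched as follows. This is a minimal illustration, not the construction from the cited paper: the use of SHA-1, the `Ring` class, and the rule "a key goes to the first server at or after its position" are assumptions made for the example.

```python
import bisect
import hashlib


def h(name: str) -> int:
    """Hash a string onto a 160-bit overlay space (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical helper: servers placed on a ring by their hash value."""

    def __init__(self, servers):
        # Sorted (position, server) pairs on the ring.
        self._points = sorted((h(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def lookup(self, key: str) -> str:
        """Return the first server at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self._positions, h(key)) % len(self._points)
        return self._points[i][1]


servers = ["s1", "s2", "s3", "s4"]
keys = [f"key-{i}" for i in range(100)]

before = {k: Ring(servers).lookup(k) for k in keys}
after = {k: Ring([s for s in servers if s != "s3"]).lookup(k) for k in keys}

# Only keys that were stored on the removed server change owner.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

The property worth noticing is that removing one server only remaps the keys that server held; every other key keeps its owner.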
11
Chord [Stoica et al. Sigcomm2001]
Overlay space
- One-dimensional unidirectional key space 0 to 2^m - 1. Given two m-bit identifiers x and y, d(x, y) = (y - x + 2^m) mod 2^m.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys at or before its own identifier, back to (but not including) its predecessor node's identifier in the key space.
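A small sketch of the distance function and the responsibility rule above. The value `M = 8` and the node identifiers are illustrative assumptions, and the helper names are hypothetical.

```python
import bisect

M = 8               # assumed identifier width in bits
SPACE = 2 ** M

# Assumed example ring: eight evenly spaced node identifiers, kept sorted.
NODES = [0, 32, 64, 96, 128, 160, 192, 224]


def distance(x: int, y: int) -> int:
    """Unidirectional (clockwise) distance from x to y on the key space."""
    return (y - x + SPACE) % SPACE


def responsible_node(nodes, key: int) -> int:
    """Return the first node at or clockwise after key, i.e. the key's owner."""
    i = bisect.bisect_left(nodes, key % SPACE)
    return nodes[i % len(nodes)]   # wrap around past the largest identifier


print(distance(224, 0))              # clockwise wrap-around: 32
print(distance(0, 224))              # the other direction is longer: 224
print(responsible_node(NODES, 120))  # keys in (96, 128] belong to node 128
```

Because the metric is unidirectional, `distance(x, y)` and `distance(y, x)` generally differ; they sum to 2^m whenever x and y are distinct.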
12
Chord – routing protocol
Routing table (finger table)
- (At most) m entries. The i-th entry of node n points to the first node that succeeds n by at least 2^(i-1) on the key space, 1 <= i <= m.
Next-hop decision
- For the given target key k, find the closest finger preceding k and forward the request to it.
- Ending condition: the request terminates when k lies in the ID range between the current node and its successor node.
- The routing path length is O(log n) for an n-node network with high probability (w.h.p.).
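The finger table and the next-hop rule can be sketched on a static ring as follows. This is a simplified simulation with global knowledge of all nodes, not the distributed protocol; the ring (8 nodes, m = 8) and all function names are assumptions chosen for illustration.

```python
import bisect

M = 8
SPACE = 2 ** M
NODES = [0, 32, 64, 96, 128, 160, 192, 224]  # assumed example ring


def successor(key: int) -> int:
    """First node at or clockwise after key."""
    i = bisect.bisect_left(NODES, key % SPACE)
    return NODES[i % len(NODES)]


def finger_table(n: int) -> list:
    """Entry i (1-based) is the first node succeeding n by at least 2^(i-1)."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_successor_range(k: int, n: int, succ: int) -> bool:
    """True if key k lies in the ring interval (n, succ]."""
    span = (succ - n) % SPACE or SPACE
    return 0 < (k - n) % SPACE <= span


def closest_preceding_finger(n: int, k: int) -> int:
    """Farthest finger of n that lies strictly between n and k."""
    target = (k - n) % SPACE or SPACE
    for f in reversed(finger_table(n)):
        if 0 < (f - n) % SPACE < target:
            return f
    return n


def lookup(start: int, k: int):
    """Return (owner of k, list of nodes visited)."""
    n, path = start, [start]
    while not in_successor_range(k, n, finger_table(n)[0]):
        nxt = closest_preceding_finger(n, k)
        if nxt == n:          # no closer finger; stop to avoid looping
            break
        n = nxt
        path.append(n)
    return finger_table(n)[0], path


print(finger_table(0))   # [32, 32, 32, 32, 32, 32, 64, 128]
print(lookup(0, 120))    # (128, [0, 64, 96])
```

In this run each hop roughly halves the remaining distance to the key (120, then 56, then 24), which is the intuition behind the O(log n) bound.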
13
Chord – an example (m=8)
A Chord network with 8 nodes and an 8-bit key space.
[Figure: key space from 0 to 256 with network nodes at 0, 32, 64, 96, 128, 160, 192, and 224, and a data item with key 120.]
14
Chord – routing table setup
A Chord network with 8 nodes and an 8-bit key space.
Finger ranges:
Range 1: [1, 2)
Range 2: [2, 4)
Range 3: [4, 8)
Range 4: [8, 16)
Range 5: [16, 32)
Range 6: [32, 64)
Range 7: [64, 128)
Range 8: [128, 256)
[Figure: ring labeled 0 to 255 with network nodes at 0, 32, 64, 96, 128, 160, 192, and 224; legend: network node, data, pointer.]
15
Chord – a lookup for key 120
16
Chord – node joining
1. Initialize the fingers and predecessor of the new node.
2. Update the fingers of existing nodes.
3. Transfer keys between the new node and its successor node.
17
Chord – stabilization mechanism
- Keep the pointer to the immediate successor up to date: this guarantees the correctness of lookups.
- Keep the pointers to the remaining (m-1) fingers up to date: this keeps lookups fast.
18
Chord – successors-list mechanism
19
Chord – simulation result [Stoica et al. Sigcomm2001]
20
Chord – “failure” experiment
The fraction of lookups that fail as a function of the fraction of nodes that fail. [Stoica et al. Sigcomm2001]
21
Kademlia [Maymounkov et al. IPTPS02]
Overlay space
- m-bit XOR-based metric space. Given two m-bit identifiers x and y, d(x, y) = x XOR y.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys that it is closest to.
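A short sketch of the XOR metric and the "closest node owns the key" rule. The 4-bit identifiers and the helper names are assumptions made for illustration.

```python
def xor_distance(x: int, y: int) -> int:
    """Kademlia distance: the bitwise XOR of two identifiers, read as an integer."""
    return x ^ y


def responsible_node(nodes, key: int) -> int:
    """Node whose identifier is closest to key under the XOR metric."""
    return min(nodes, key=lambda n: xor_distance(n, key))


nodes = [0b0010, 0b0111, 0b1100]   # assumed example identifiers
key = 0b0101

for n in nodes:
    print(f"d({n:04b}, {key:04b}) = {xor_distance(n, key)}")

print(f"owner: {responsible_node(nodes, key):04b}")   # owner: 0111
```

Unlike the unidirectional ring distance, XOR is symmetric, so a node that receives a message from a peer learns about a contact that is equally close in both directions.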
22
Kademlia – routing protocol
Routing table (k-buckets)
- (At most) m buckets. The i-th bucket of node x contains pointers to up to k nodes whose distance from x is at least 2^(i-1) and less than 2^i, 1 <= i <= m.
Next-hop decision
- For the given target key t, find the bucket closest to t and forward the request to the nearby nodes in that bucket (proximity routing).
- Ending condition: either the initiator has obtained the requested information, or it has queried and received responses from the k closest nodes it has seen.
- The routing path length is O(log n) w.h.p.
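With buckets numbered from 1 as on this slide, the bucket for a contact is simply the position of the highest set bit of the XOR distance. A minimal sketch, with a hypothetical function name:

```python
def bucket_index(x: int, y: int) -> int:
    """1-based bucket of node x that holds contact y.

    Bucket i covers XOR distances d with 2^(i-1) <= d < 2^i, which is
    exactly the set of distances whose binary length is i bits.
    """
    d = x ^ y
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length()


print(bucket_index(0b0000, 0b0001))  # distance 1 -> bucket 1, range [1, 2)
print(bucket_index(0b0000, 0b0101))  # distance 5 -> bucket 3, range [4, 8)
print(bucket_index(0b1000, 0b0000))  # distance 8 -> bucket 4, range [8, 16)
```

Higher-numbered buckets cover exponentially larger parts of the identifier space, so a node knows many nearby contacts and only a few distant ones.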
23
Kademlia – routing table update scheme
- Every request or reply message includes the sender's routing information.
- On receiving a routing message, a node updates its routing table with the sender's information.
- Least Recently Seen (LRS) bucket update scheme.
Questions:
- Will appending the sender's ID bring equally fresh information to all buckets?
- Recursive lookup or iterative lookup?
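A sketch of a least-recently-seen bucket update, following the policy described in the Kademlia paper: known senders move to the tail, new senders are appended while there is room, and a full bucket evicts its oldest entry only if that entry no longer responds. The bucket size `K = 3` and the `ping` callback are assumptions made for the example.

```python
K = 3  # assumed bucket capacity, kept small for illustration


def update_bucket(bucket: list, sender: int, ping) -> list:
    """Update one k-bucket in place; bucket[0] is the least recently seen node.

    `ping(node) -> bool` is a hypothetical liveness check supplied by the caller.
    """
    if sender in bucket:
        bucket.remove(sender)      # already known: move to the tail
        bucket.append(sender)
    elif len(bucket) < K:
        bucket.append(sender)      # room left: just add at the tail
    else:
        oldest = bucket.pop(0)
        if ping(oldest):
            bucket.append(oldest)  # oldest is alive: keep it, drop the sender
        else:
            bucket.append(sender)  # oldest is dead: replace it with the sender
    return bucket


bucket = []
for node in (1, 2, 3):
    update_bucket(bucket, node, ping=lambda n: True)
print(bucket)                                           # [1, 2, 3]
print(update_bucket(bucket, 1, ping=lambda n: True))    # [2, 3, 1]
print(update_bucket(bucket, 4, ping=lambda n: True))    # [3, 1, 2]
print(update_bucket(bucket, 5, ping=lambda n: False))   # [1, 2, 5]
```

Preferring long-lived contacts over new ones means a flood of new node identifiers cannot flush a node's routing state.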
24
Viceroy [Malkhi et al. PODC02]
Overlay space
- One-dimensional unidirectional key space from 0 to 1. Given two identifiers x and y, d(x, y) = (y - x + 1) mod 1.
- Each node selects a random identifier from the key space and a random level number from [1, log N], where N is the network size.
- Each node is responsible for all keys at or before its own identifier, back to (but not including) its predecessor node's identifier in the key space.
25
Viceroy – routing table [Malkhi et al. PODC02]
7 entries for a level-l node:
- 2 stepping links to its predecessor and successor on the global ring.
- 2 stepping links to its predecessor and successor on the level-l ring.
- 2 down links to the level-(l+1) ring: one long-range contact and one local contact.
- 1 up link to the level-(l-1) ring: one local contact.
Small-world link structure.
26
Viceroy – routing protocol
Three steps:
1. Climb up to some level-1 node using up links.
2. Jump using down links until no down link is available.
3. Step using stepping links.
The routing path length is O(log n) w.h.p.
[Figure: route from the requestor to the target key.]
27
Viceroy – level selection algorithm [Malkhi et al. PODC02]
- Localized network size estimation scheme: n_estimated = 1/d(s, successor(s)) for node s.
- Node s selects a level l from [1, log n_estimated] uniformly at random.
- When node s's successor changes, node s changes its level if log n_estimated changes and either the currently selected level no longer exists or the newly selected level did not exist before.
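The estimation and selection steps can be sketched as below. Two points are assumptions, not taken from the slide: the upper bound of the level range is taken as the floor of log base 2 of the estimate, and the function names are hypothetical.

```python
import math
import random


def ring_distance(x: float, y: float) -> float:
    """Clockwise distance from x to y on the unit key space [0, 1)."""
    return (y - x + 1.0) % 1.0


def estimate_network_size(s: float, succ: float) -> float:
    """Local estimate: the reciprocal of the gap between s and its successor."""
    gap = ring_distance(s, succ)
    if gap == 0.0:
        raise ValueError("node and successor share an identifier")
    return 1.0 / gap


def select_level(s: float, succ: float, rng=random) -> int:
    """Pick a level uniformly from [1, floor(log2(n_estimated))] (assumed bound)."""
    max_level = max(1, math.floor(math.log2(estimate_network_size(s, succ))))
    return rng.randint(1, max_level)


print(estimate_network_size(0.25, 0.375))   # gap 0.125 -> estimate 8.0
print(estimate_network_size(0.875, 0.0))    # wrap-around gap 0.125 -> 8.0
print(select_level(0.25, 0.375, random.Random(0)))  # some level in 1..3
```

The estimate uses only the node's own gap to its successor, so it is cheap to compute but noisy: a node with an unusually close successor overestimates the network size.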
28
Koorde [Kaashoek et al. IPTPS03]
Overlay space
- One-dimensional unidirectional key space 0 to 2^m - 1. Given two m-bit identifiers x and y, d(x, y) = (y - x + 2^m) mod 2^m.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys at or before its own identifier, back to (but not including) its predecessor node's identifier in the key space.
29
De Bruijn graph [Bruijn1946]
- 2^b nodes, each with a unique b-bit identifier.
- 2 out-pointers for each node x:
  - 1 link to the node 2x mod 2^b.
  - 1 link to the node (2x + 1) mod 2^b.
- O(b) path length between any pair of nodes.
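The O(b) bound follows from shifting the target's bits into the current identifier one at a time. A minimal sketch with hypothetical function names; it always takes exactly b hops and does not apply the shortcut of skipping bits that already overlap.

```python
def de_bruijn_neighbors(x: int, b: int):
    """The two out-pointers of node x in a de Bruijn graph on 2^b nodes."""
    size = 2 ** b
    return (2 * x) % size, (2 * x + 1) % size


def de_bruijn_route(src: int, dst: int, b: int) -> list:
    """Route from src to dst by shifting in the bits of dst, highest bit first."""
    size = 2 ** b
    path, x = [src], src
    for i in reversed(range(b)):
        bit = (dst >> i) & 1
        x = (2 * x + bit) % size   # follow the 2x or 2x+1 link
        path.append(x)
    return path


print(de_bruijn_neighbors(5, 3))                        # (2, 3)
print([f"{n:03b}" for n in de_bruijn_route(0, 5, 3)])   # ['000', '001', '010', '101']
```

After b shifts every original bit of the source has been pushed out, so the walk ends at the target regardless of where it started.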
30
Koorde – routing protocol
Routing table: 2 entries for node x
- 1 link to its successor on the ring.
- 1 link to its pseudo de Bruijn node, the predecessor of key 2x.
Next-hop decision on node x
- Initially, the imaginary de Bruijn node i is set to the ID that lies between the requester and its successor and whose low bits have the longest match with the high bits of the target key k.
- If the target k lies in the ID range between x and its successor, the request terminates successfully.
- Otherwise:
  - if i lies between x and its successor, i is assigned a new value according to the de Bruijn routing algorithm, and the request and the new i are forwarded to x's pseudo de Bruijn node;
  - else, the request and the same i are forwarded to x's successor.
The routing path length is O(log n) for an n-node network.
31
Bounds and tradeoffs [Kaashoek et al. IPTPS03]
- Degree and hop counts
- Fault tolerance and maintenance
32
Open questions for DHTs
Q.1 Can one redesign DHT routing algorithms to exploit heterogeneity? [Ratnasamy et al. IPTPS2002]
Q.2 Can one redesign DHT routing algorithms to be both degree optimal and load balanced? [Kaashoek et al. IPTPS03]