Consistent Hashing and Distributed Hash Table
Chen Qian, Department of Computer Science & Engineering
Motivating question
Suppose we have a large number of web pages to store across 100 servers. How can we know which of the 100 servers stores a given web page (identified by its URL)? Is there a method simpler than keeping a complete directory?
Simple Solution Using Hashing
Let us use a mapping from all web pages to all servers. A hash function maps elements of a universe U to buckets. A good hash function is easy to compute and uniformly random.
Say there are n servers, indexed by 0, 1, 2, …, n-1
Then we can simply store the web page with URL x at the cache server named h(x) mod n. Problem: suppose we add a new server, so n is now 101. For an object x, it is very unlikely that h(x) mod 100 and h(x) mod 101 are the same number. Thus, changing n forces almost all objects to relocate.
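A quick way to see the problem, as a minimal sketch (the URLs are made up, and SHA-1 stands in for a generic hash function h):

```python
import hashlib

def h(x: str) -> int:
    # A stable hash of the URL (Python's built-in hash() is salted per process).
    return int.from_bytes(hashlib.sha1(x.encode()).digest(), "big")

urls = [f"http://example.com/page{i}" for i in range(10_000)]
moved = sum(1 for u in urls if h(u) % 100 != h(u) % 101)
print(f"{moved / len(urls):.1%} of objects relocate when n grows from 100 to 101")
```

Running this typically reports that roughly 99% of the objects would have to move.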
Consistent Hashing
Each object is mapped to the next bucket (server) that appears in clockwise order on the unit circle.
[Figure: servers and objects placed on the hash ring]
Consistent Hashing
[Figure: a server fails; its objects are remapped to the next server clockwise on the ring]
Consistent Hashing
[Figure: the ring after the failure; only the failed server's objects have moved]
Consistent Hashing
Add a server, or a failed server comes back online.
[Figure: the new server takes over the objects between it and its counter-clockwise neighbor]
The object and server names need to be hashed to the same range, such as 32-bit values.
The n servers partition the circle into n segments, with each server responsible for all objects in one of these segments. In expectation, adding the nth server causes only a 1/n fraction of the objects to relocate.
In practice
Use a binary search tree to find the server responsible for an object. If you pick n random points on the circle, you are very unlikely to get a perfect partition of the circle into equal-sized segments. One way to decrease this variance is to make k "virtual copies" of each server s.
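A minimal sketch of this scheme, assuming the illustrative class and server names below; a sorted array with binary search stands in for the search tree:

```python
import bisect
import hashlib

def h(x: str) -> int:
    return int.from_bytes(hashlib.sha1(x.encode()).digest(), "big")

class Ring:
    def __init__(self, servers, k=100):
        # k "virtual copies" of each server reduce the variance of segment sizes.
        self.points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(k))
        self.hashes = [p for p, _ in self.points]

    def server_for(self, url: str) -> str:
        # First point clockwise from hash(url), wrapping past the largest hash.
        i = bisect.bisect_right(self.hashes, h(url)) % len(self.points)
        return self.points[i][1]

ring = Ring([f"server{i}" for i in range(100)])
print(ring.server_for("http://example.com/index.html"))
```

Adding a server now only inserts that server's k points into the sorted structure; objects mapped to every other point stay where they are.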
Pros? Cons?
P2P: searching for information
"Index" in a P2P system: maps information to peer location (location = IP address & port number). The "index" may be a server, or may be a distributed system; it is an abstraction on this slide.
Example: each file shared using eMule is hashed using the MD4 algorithm. The top-level MD4 hash, file size, filename, and several secondary search attributes such as bit rate and codec are stored on eD2k servers and the serverless Kad network. The Kad network is a peer-to-peer (P2P) network which implements the Kademlia P2P overlay protocol. The majority of users on the Kad network are also connected to servers on the eDonkey network, and Kad network clients typically query known nodes on the eDonkey network in order to find an initial node on the Kad network.
Distributed Hash Table (DHT)
DHT = distributed P2P database. The database holds (key, value) pairs; key: content name; value: IP address. Peers query the database with a key, and the database returns the values that match that key. Peers can also insert (key, value) pairs into the database. Finding "needles" requires that the P2P system be structured.
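A toy sketch of just this interface; a single in-process dict stands in for the distributed database, and the content name and addresses are made up:

```python
# key = content name, value = location (IP address and port) of a peer holding it.
index: dict[str, list[str]] = {}

def put(key: str, value: str) -> None:
    # A peer inserts (key, value); several peers may publish the same content name.
    index.setdefault(key, []).append(value)

def get(key: str) -> list[str]:
    # A query returns all values that match the key.
    return index.get(key, [])

put("beat it.mp3", "198.51.100.7:6881")
put("beat it.mp3", "203.0.113.9:6881")
print(get("beat it.mp3"))
```

The rest of the deck is about how to spread this table over many peers and still find the right one.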
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, Hari Balakrishnan (slides by Xiaozhou Li)
Distributed hash table
[Figure: a distributed application calls put(key, data) and get(key) on the DHT, which spans many nodes]
The DHT provides the information lookup service for P2P applications. Nodes are uniformly distributed across the key space and form an overlay network; each node maintains a list of neighbors in a routing table, decoupled from the physical network topology.
Routed queries (Chord)
[Figure: a publisher stores Key = "beat it", Value = MP3 data… at some node; a client's Lookup("beat it") is routed hop by hop through intermediate nodes N1–N9 to reach it]
Routing challenges
Define a useful key nearness metric. Keep the hop count small. Keep the routing tables small. Stay robust despite rapid change. Chord emphasizes efficiency and simplicity.
Chord properties
Efficient: O(log N) messages per lookup, where N is the total number of servers. Scalable: O(log N) state per node. Robust: survives massive failures. Proofs are in the paper / tech report, assuming no malicious participants.
Chord overview
Provides peer-to-peer hash lookup: Lookup(key) returns an IP address. Chord does not store the data itself. How does Chord route lookups? How does Chord maintain routing tables?
Chord IDs
Key identifier = SHA-1(key). Node identifier = SHA-1(IP address). Both are uniformly distributed, and both exist in the same ID space. How do we map key IDs to node IDs? The heart of the Chord protocol is consistent hashing.
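As a small illustration (the m = 7-bit ID space is chosen only to match the tiny examples on the following slides; SHA-1 itself provides up to 160 bits):

```python
import hashlib

M = 7  # identifier bits for the small examples below; real deployments use far more

def chord_id(name: str, m: int = M) -> int:
    # Key ID = SHA-1(key); node ID = SHA-1(IP address). Both land in the same ID space.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** m)

print(chord_id("beat it.mp3"))       # a key identifier in [0, 2^7)
print(chord_id("203.0.113.5:4000"))  # a node identifier in the same space
```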
Review: consistent hashing for data partitioning and replication
A key is stored at its successor: the node with the next higher ID.
[Figure: nodes A–F on the unit circle; hash(key1) and hash(key2) map onto the ring, and with replication factor N=3 each key is also copied to the next nodes clockwise]
Identifier to node mapping example
Node 8 maps [5, 8], node 15 maps [9, 15], node 20 maps [16, 20], …, node 4 maps [59, 4]. Each node maintains a pointer to its successor.
[Figure: ring with nodes 4, 8, 15, 20, 32, 35, 44, 58]
Lookup
Each node maintains its successor. Route a packet (ID, data) to the node responsible for ID by following successor pointers.
[Figure: a lookup forwarded around the ring 4, 8, 15, 20, 32, 35, 44, 58 until it reaches the responsible node, node 44 in this example]
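A minimal sketch of this successor-only routing, using the slide's example node IDs (the class and helper names are illustrative):

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = None  # filled in once the ring is wired up

def in_interval(x, a, b):
    # True if x lies in the half-open ring interval (a, b], wrapping modulo the ID space.
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key_id):
    n = start
    while not in_interval(key_id, n.id, n.successor.id):
        n = n.successor            # forward the query to the next node clockwise
    return n.successor             # the node responsible for key_id

nodes = [Node(i) for i in [4, 8, 15, 20, 32, 35, 44, 58]]
for a, b in zip(nodes, nodes[1:] + nodes[:1]):
    a.successor = b

print(lookup(nodes[0], 37).id)     # 44: the first node clockwise from ID 37
```

This is correct but slow: a lookup may take O(N) hops. Finger tables, introduced a few slides later, bring this down to O(log N).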
Join Operation
A node with id=50 joins the ring via node 15. Node 50 sends join(50) to node 15; node 44 answers the lookup and returns node 58; node 50 updates its successor to 58 (its predecessor is still nil).
[Figure: ring 4, 8, 15, 20, 32, 35, 44, 58 with the joining node 50]
Periodic Stabilize
Node 50 runs periodic stabilize: it sends a stabilize message to its successor 58 and learns succ.pred = 44, which is not between 50 and 58, so its successor is unchanged. Node 50 then sends a notify message to 58, and 58 updates its predecessor from 44 to 50.
[Figure: stabilize(node=50) and notify(node=50) messages on the ring]
Periodic Stabilize
Node 44 runs periodic stabilize: it asks 58 for its predecessor (now 50) and updates its own successor from 58 to 50.
[Figure: stabilize(node=44) message on the ring]
Periodic Stabilize
Node 44 has a new successor (50) and sends a notify message to node 50; node 50 sets its predecessor to 44.
[Figure: notify(node=44) message on the ring]
Periodic Stabilize Converges!
This completes the join operation: node 50 now has succ=58 and pred=44, node 44 has succ=50, and node 58 has pred=50.
[Figure: the stabilized ring 4, 8, 15, 20, 32, 35, 44, 50, 58]
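The whole join sequence fits in a short sketch, written in the spirit of the paper's pseudocode; the helper names and the tiny driver at the end are illustrative, not Chord's exact API:

```python
def between(x, a, b):
    # x strictly inside the ring interval (a, b), wrapping past zero.
    return (a < x < b) if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def find_successor(self, key_id):
        # Walk successor pointers until the node responsible for key_id is found.
        n = self
        while not between(key_id, n.id, n.successor.id) and key_id != n.successor.id:
            n = n.successor
        return n.successor

    def join(self, contact):
        # Example from the slides: node 50 joins via node 15 and learns succ = 58.
        self.predecessor = None
        self.successor = contact.find_successor(self.id)

    def stabilize(self):
        # Periodically ask the successor for its predecessor: did a node slip in between?
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x          # this is how node 44 adopts 50 as its successor
        self.successor.notify(self)

    def notify(self, candidate):
        # candidate believes it may be our predecessor.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Wire up the ring 4 -> 8 -> ... -> 58 -> 4, then replay the slides' scenario.
nodes = {i: Node(i) for i in [4, 8, 15, 20, 32, 35, 44, 58]}
ids = sorted(nodes)
for a, b in zip(ids, ids[1:] + ids[:1]):
    nodes[a].successor, nodes[b].predecessor = nodes[b], nodes[a]

n50 = Node(50)
n50.join(nodes[15])          # node 44 resolves the lookup; succ(50) = 58
n50.stabilize()              # node 58 learns pred = 50
nodes[44].stabilize()        # node 44 learns succ = 50 and notifies it (pred = 44)
print(nodes[44].successor.id, n50.predecessor.id, nodes[58].predecessor.id)
```

The printed result, 50 44 50, matches the final picture: node 44's successor is 50, node 50's predecessor is 44, and node 58's predecessor is 50.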
Achieving Efficiency: finger tables
Say m = 7. The ith finger-table entry at the peer with id n is the first peer with id >= (n + 2^i) mod 2^m.
Finger table at node 80:
i:     0   1   2   3   4   5   6
ft[i]: 96  96  96  96  96  112 20
For example, the entry for i = 6: (80 + 2^6) mod 2^7 = 16, and the first peer with id >= 16 is 20.
Each node stores only O(log N) entries, and each lookup takes at most O(log N) hops.
[Figure: ring with nodes 20, 32, 45, 80, 96, 112]
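A small sketch that reproduces this table (the node IDs and m = 7 are taken from the slide's example):

```python
import bisect

M = 7                                      # identifier bits, so IDs live in [0, 128)
ring = sorted([20, 32, 45, 80, 96, 112])   # node IDs from the figure

def successor(ident):
    # First node whose ID is >= ident, wrapping past 2^M back to the smallest ID.
    i = bisect.bisect_left(ring, ident % (2 ** M))
    return ring[i % len(ring)]

def finger_table(n):
    # ft[i] = first peer with id >= (n + 2^i) mod 2^M, for i = 0 .. M-1
    return [successor((n + 2 ** i) % (2 ** M)) for i in range(M)]

print(finger_table(80))   # [96, 96, 96, 96, 96, 112, 20], matching the table above
```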
Achieving Robustness
What if nodes fail? Ring robustness: each node maintains the k (> 1) immediate successors instead of only one successor. If the closest successor does not respond, substitute the second entry in the successor list. It is unlikely that all k successors fail simultaneously. This requires modifications to the stabilize protocol (see the paper!).
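A sketch of the failover idea only; the value of k, the probe order, and the is_alive check are illustrative assumptions, not the paper's exact interface:

```python
K = 3  # number of immediate successors each node remembers

def first_live_successor(successor_list, is_alive):
    for s in successor_list[:K]:
        if is_alive(s):                  # e.g. the node answered a recent ping
            return s
    raise RuntimeError("all k successors appear to have failed simultaneously")

# Node 35's successor list on the example ring; pretend node 44 has crashed.
print(first_live_successor([44, 50, 58], lambda s: s != 44))   # falls back to 50
```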
Cooperative File System (CFS)
[Figure: the CFS layers. A file-system layer providing block storage, availability/replication, authentication, caching, consistency, server selection, and keyword search sits on DHash, a distributed block store, which in turn sits on Chord for lookup.]
A powerful lookup layer simplifies the other mechanisms.
Cooperative File System (cont.)
Block storage: split each file into blocks and distribute those blocks over many servers, balancing the load of serving popular files.
Data replication: replicate each block on k servers to increase availability and reduce latency (fetch from the server with the least latency).
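A simplified sketch of the placement idea; the block size, the replication factor, and the list-modulo "ring" are illustrative stand-ins (DHash itself places a block at the Chord successor of the block's ID):

```python
import hashlib

BLOCK_SIZE = 8192   # bytes per block (illustrative)
K = 3               # replication factor (illustrative)

def place_blocks(data: bytes, servers: list[str]) -> dict[str, list[str]]:
    placement = {}
    for off in range(0, len(data), BLOCK_SIZE):
        block = data[off:off + BLOCK_SIZE]
        block_id = hashlib.sha1(block).hexdigest()        # content-derived block ID
        home = int(block_id, 16) % len(servers)           # "home" server on the toy ring
        placement[block_id] = [servers[(home + j) % len(servers)] for j in range(K)]
    return placement

servers = [f"server{i}" for i in range(8)]
print(place_blocks(b"example file contents " * 1000, servers))
```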
Cooperative File System (cont.)
Caching: cache blocks along the lookup path to avoid overloading servers that hold popular data.
Load balance: different servers may have different capacities; a real server may act as multiple virtual servers by being hashed to several different IDs.
Problem of DHT
There is no good solution for maintaining finger tables that are both scalable and consistent under churn. BitTorrent's solution: maintain the trackers (servers), which are more reliable, as a DHT; users query the trackers to get the locations of a file, so the file sharing itself is not structured.
Next class
Please read Chapters 3-4 of the textbook BEFORE class.
Thank You
Chen Qian
cqian12@ucsc.edu
https://users.soe.ucsc.edu/~qian/