Chord - A Distributed Hash Table
Yimei Liao
Technische Universität Chemnitz
Professur Rechnernetze und verteilte Systeme (Chair of Computer Networks and Distributed Systems)
Kurt Tutschku, Vertretung (acting professor)

Outline
– Lookup problem in Peer-to-Peer systems and solutions
– Chord Algorithm
    Consistent Hashing
    Scalable Key Location
    Node joins
    Stabilization
– Summary

Peer-to-Peer Systems
Peer-to-peer system: a self-organizing system of equal, autonomous entities (peers)
– Decentralized resource usage
– Decentralized self-organization
– Lookup problem: where to store a data item, and where to get it?
Solutions
– Centralized servers
– Flooding search
– Distributed hash tables

Solutions to the lookup problem
– Centralized servers
    Maintain the current location of data items in a central server
    The central server becomes a critical component (single point of failure and bottleneck)
    Best for simple and small applications
– Flooding search
    Broadcast a request for an item among the nodes
    No additional routing information is needed
    High bandwidth consumption

Solutions to the lookup problem
– Distributed hash tables (DHTs)
    A global view of data distributed among many nodes
    Nodes and data items are mapped into a common address space
    Each DHT node manages a small number of references to other nodes
    Queries are routed via a small number of nodes to the target node
    The load for retrieving items should be balanced equally among all nodes
    Robust against random failures and attacks
    Provides a definitive answer to a query

Chord Algorithm - Consistent Hashing
Chord supports just one operation: given the key of an item, it maps the key onto the node where the item is located.
Consistent Hashing
– Assign each node and each key an m-bit identifier using a base hash function such as SHA-1
– Identifiers are ordered on an identifier circle modulo 2^m (the Chord ring)
– Key k is assigned to the first node whose identifier is equal to or follows k on the circle:
    succ(k) = the node with the smallest id >= k (wrapping around past 2^m - 1)

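The assignment rule above can be sketched in a few lines. This is a minimal illustration, not the Chord implementation: the helper names `chord_id` and `succ` are hypothetical, m = 3 is chosen only to keep the ring small, and reducing the SHA-1 digest modulo 2^m is one assumed way of obtaining an m-bit identifier.

```python
import hashlib
from bisect import bisect_left

M = 3  # identifier bits; the ring has 2**M positions (small value assumed for illustration)

def chord_id(name: str, m: int = M) -> int:
    """Hash a node address or key name to an m-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def succ(k: int, node_ids: list[int]) -> int:
    """Return the first node whose identifier is equal to or follows k on the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, k)       # index of the first node id >= k
    return ring[i % len(ring)]     # wrap around to the smallest id if none is >= k
```

With nodes 0, 3 and 6, `succ(1, [0, 3, 6])` and `succ(2, [0, 3, 6])` give 3, `succ(5, [0, 3, 6])` gives 6, and `succ(7, [0, 3, 6])` wraps around to 0.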
Chord Algorithm - Consistent Hashing
[Figure: identifier circle for an identifier space with m = 3, showing nodes and keys]

Chord Algorithm - Simple Key Lookup
Simple Key Lookup
– Queries are passed around the circle via successor pointers
– In the worst case, this requires traversing all nodes to find the appropriate mapping
Example: node 0 sends a query for key 6
    successor(0) = 1
    successor(1) = 3
    successor(3) = 6
    successor(6) = 0

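The walk along successor pointers can be sketched as follows, using the successor pointers from the example above. The function name `simple_lookup` and the dictionary representation of the pointers are assumptions made for illustration.

```python
def simple_lookup(start: int, key: int, successor: dict[int, int]) -> tuple[int, list[int]]:
    """Follow successor pointers from `start` until the node responsible for `key` is found.

    Returns the responsible node and the list of nodes the query visited.
    """
    def in_ring_interval(x: int, a: int, b: int) -> bool:
        # True if x lies in the ring interval (a, b], which may wrap past zero.
        if a < b:
            return a < x <= b
        return x > a or x <= b

    node = start
    path = [node]
    # The key belongs to successor(node) once it falls in (node, successor(node)].
    while not in_ring_interval(key, node, successor[node]):
        node = successor[node]
        path.append(node)
    return successor[node], path
```

For the ring above, `simple_lookup(0, 6, {0: 1, 1: 3, 3: 6, 6: 0})` visits nodes 0, 1 and 3 before reporting node 6 as responsible, so the number of hops grows linearly with the number of nodes.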
Chord Algorithm - Scalable Key Location
Finger Table
– Each node n maintains a routing table with up to m entries
– The i-th entry in the table at node n contains the identifier of the first node s that succeeds n by at least 2^(i-1) on the identifier circle: s = succ((n + 2^(i-1)) mod 2^m)
– s is called the i-th finger of node n
[Table: definition of variables for node n]

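The finger definition can be checked with a short sketch. The function name `finger_table` and the global knowledge of all node identifiers are assumptions for illustration; a real node does not know the whole ring and learns its fingers through lookups.

```python
from bisect import bisect_left

def finger_table(n: int, node_ids: list[int], m: int) -> list[tuple[int, int]]:
    """Return (start, finger node) pairs for fingers i = 1..m of node n."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)   # n + 2^(i-1) on the identifier circle
        j = bisect_left(ring, start)            # first node id >= start
        table.append((start, ring[j % len(ring)]))  # wrap around if none is >= start
    return table
```

For a ring with nodes 0, 3 and 6 and m = 3, `finger_table(0, [0, 3, 6], 3)` yields starts 1, 2, 4 with fingers 3, 3, 6.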
Chord Algorithm - Scalable Key Location
Finger table example: m = 3, so each node n maintains at most 3 entries (ring with nodes 0, 3 and 6)

Node 0
    start   interval   succ.
    1       [1,2)      3
    2       [2,4)      3
    4       [4,0)      6

Node 3 (keys: 1, 2)
    start   interval   succ.
    4       [4,5)      6
    5       [5,7)      6
    7       [7,3)      0

Node 6 (keys: 5)
    start   interval   succ.
    7       [7,0)      0
    0       [0,2)      0
    2       [2,6)      3

Chord Algorithm - Scalable Key Location
Query
Upon receiving a query for key id, a node
– checks whether it stores the item locally
– if not, forwards the query to the node in its finger table that most closely precedes id on the circle (the largest finger that does not exceed id)

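The forwarding rule can be sketched as below. This is an illustration under stated assumptions: the ring with nodes 0, 3, 4 and 7 and m = 3 is taken as an example, fingers are computed from global knowledge of the ring instead of being stored per node, and the helper names (`between`, `fingers`, `closest_preceding_finger`, `find_successor`) are chosen for this sketch.

```python
from bisect import bisect_left

M = 3                  # identifier bits (assumed)
NODES = [0, 3, 4, 7]   # example ring, sorted (assumed)

def succ(k: int) -> int:
    """First node whose id is equal to or follows k on the ring."""
    j = bisect_left(NODES, k % (2 ** M))
    return NODES[j % len(NODES)]

def between(x: int, a: int, b: int, inclusive_right: bool = False) -> bool:
    """True if x lies in the ring interval (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # interval wraps past zero

def fingers(n: int) -> list[int]:
    """Finger nodes of n: succ(n + 2^(i-1)) for i = 1..M; the first one is n's successor."""
    return [succ(n + 2 ** (i - 1)) for i in range(1, M + 1)]

def closest_preceding_finger(n: int, key: int) -> int:
    """Largest finger of n that lies strictly between n and key."""
    for f in reversed(fingers(n)):
        if between(f, n, key):
            return f
    return n

def find_successor(n: int, key: int) -> tuple[int, list[int]]:
    """Route a query for key starting at node n; return (responsible node, visited nodes)."""
    path = [n]
    while not between(key, n, fingers(n)[0], inclusive_right=True):
        nxt = closest_preceding_finger(n, key)
        if nxt == n:        # defensive: no closer finger known
            break
        n = nxt
        path.append(n)
    return fingers(n)[0], path
```

Starting at node 0 (an assumed starting point), `find_successor(0, 5)` forwards the query to node 4 and returns node 7 as responsible. Each hop at least halves the remaining distance to the key, which is where the O(log N) hop count comes from.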
Chord Algorithm - Scalable Key Location
[Figure: lookup of id = 5 on a ring with nodes 0, 3, 4 and 7 (m = 3), showing each node's finger table (start, interval, succ.), keys, successor and predecessor. The query is routed via node 4 and yields succ(5) = 7. The lookup searches for pred(k) and takes O(log N) hops.]

Chord Algorithm - Node joins
Invariants to preserve
– Each node's successor is correctly maintained
– For every key k, node succ(k) is responsible for k
For fast lookups, it is also desirable for the finger tables to be correct.
Tasks to be performed by Chord
– Initialize the predecessor and fingers of node n
– Update the fingers and predecessors of existing nodes to reflect the addition of n
– Notify the higher-layer software so that it can transfer the state associated with keys that node n is now responsible for

Chord Algorithm - Node joins
[Figure: node 5 joins a ring with nodes 0, 3 and 7 (m = 3). Step 1, initializing fingers and predecessor: the new node's finger table has starts 6, 7, 1 with intervals [6,7), [7,1), [1,5); its first finger is obtained with find_succ(6).]

Chord Algorithm - Node joins
[Figure: node 5 joins a ring with nodes 0, 3 and 7 (m = 3). Step 2, updating fingers of existing nodes.]
For i = 1..m: P = find_pred(n - 2^(i-1))
For n = 5:
    i = 1, P = find_pred(4)
    i = 2, P = find_pred(3)
    i = 3, P = find_pred(1)
Cost: O(log^2 N)

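The identifiers passed to find_pred follow directly from the formula n - 2^(i-1), taken modulo 2^m. A small worked sketch, with the hypothetical helper name `update_targets`:

```python
def update_targets(n: int, m: int) -> list[int]:
    """Identifiers (n - 2^(i-1)) mod 2^m for i = 1..m.

    The predecessor of each identifier is a node whose i-th finger may need
    to point to the newly joined node n.
    """
    return [(n - 2 ** (i - 1)) % (2 ** m) for i in range(1, m + 1)]
```

For the joining node n = 5 with m = 3, `update_targets(5, 3)` gives the identifiers 4, 3 and 1, matching the three find_pred calls above. The modulo keeps the result on the identifier circle when n is smaller than 2^(i-1).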
Chord Algorithm - Node joins
[Figure: node 5 joins a ring with nodes 0, 3 and 7 (m = 3). Step 3, transferring keys to the new node.]

Chord Algorithm - Node joins
[Figure: a query issued before all finger tables have been updated. Although node 5 has joined, the finger entry for start 4 at node 3 still points to node 7, so succ(4) = 7 is reported.]

Chord Algorithm - Stabilization
Stabilization
– Maintains correctness and performance while nodes join
– Keeps each node's successor pointer up to date
– Uses successor pointers to verify and correct finger table entries
– Periodically, node n
    1) asks its successor n' for its predecessor n'';
    2) if n'' lies between n and n', sets n->successor = n'' and repeats step 1) until the predecessor of succ(n) is n

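One round of the periodic check can be sketched as follows. The `Node` class and function names are assumptions for illustration, and the `notify` step, in which a node tells its successor about itself, follows the pseudocode of the Chord paper; it is not spelled out on the slide.

```python
class Node:
    """Minimal in-memory stand-in for a Chord node (illustrative only)."""
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self
        self.predecessor = None

def between(x: int, a: int, b: int) -> bool:
    """True if x lies in the open ring interval (a, b), which may wrap past zero."""
    if a < b:
        return a < x < b
    return x > a or x < b

def notify(s: Node, n: Node) -> None:
    """n tells s that n might be its predecessor."""
    if s.predecessor is None or between(n.id, s.predecessor.id, s.id):
        s.predecessor = n

def stabilize(n: Node) -> None:
    """Ask the successor for its predecessor and adopt it if it lies between n and the successor."""
    x = n.successor.predecessor
    if x is not None and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)
```

As a usage example, suppose node 5 joins a ring of nodes 0, 3 and 7 knowing only its successor 7. Calling `stabilize` on node 5 makes node 7 record 5 as its predecessor; the next `stabilize` on node 3 then discovers node 5 and adopts it as its successor.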
Chord Algorithm - Node Failure
Node Failure
– Each node keeps a successor list
– If its successor fails, the node replaces it with the first live entry in the list
– Stabilization is run later to correct the finger table and the successor list

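The replacement step can be sketched as below. The function name `first_live_successor` and the `is_alive` predicate are hypothetical; in practice liveness would be determined by a timeout or ping, which is not described on the slide.

```python
from typing import Callable, Optional

def first_live_successor(successor_list: list[int],
                         is_alive: Callable[[int], bool]) -> Optional[int]:
    """Return the first entry of the successor list that is still reachable, or None."""
    for candidate in successor_list:
        if is_alive(candidate):
            return candidate
    return None  # every known successor has failed
```

For example, if a node's successor list is [3, 5, 7] and node 3 has failed, `first_live_successor([3, 5, 7], lambda s: s != 3)` returns 5, which then becomes the new successor until stabilization repairs the remaining state.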
Summary
Characteristics of Chord
– Load balance: distributed hash table
– Decentralization: fully distributed
– Scalability: the cost of a lookup grows logarithmically with the number of nodes
– Availability: automatically adjusts internal tables
– Flexible naming: no constraints on the structure of the keys
Costs
    Routing hops   O(log N)
    Arrival        O(log^2 N)
    Departure      O(log^2 N)

Thank You
Questions?