CHORD – A Peer-to-Peer Lookup Protocol
Shankar Karthik Vaithianathan & Aravind Sivaraman, University of Central Florida
Outline
- History
- General definition of a DHT system
- Chord
- Applications
- Implementation
- Demo / Open questions
History: client-server model [diagram: a central server holds both the index table and the data; clients a–d send queries to the server and transfer data from it]
History: peer-to-peer model (Napster) [diagram: the central server keeps only the index table; the data lives on clients a–d, which query the central index but transfer data directly between peers]
History: peer-to-peer model (Gnutella) [diagram: no central server; each client keeps its own index table and data, queries are passed from peer to peer, and data is transferred directly]
History: peer-to-peer model (DHT systems) [diagram: each client holds a piece of the index table along with its data; queries are routed through the overlay and data is transferred directly]
Distributed Hash Table
What is normal hashing?
- Putting items into, say, N buckets, often solved by a static function (e.g., H(key) mod N).
- Problem: what if N changes? With the simple solution we must move everything; an optimal solution would move only (#items)/(#buckets) items on average.
- And what if we do not even know the number of buckets?
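A minimal sketch of why static mod-N hashing breaks down; the class and numbers below are illustrative, not from our implementation. Growing a mod-N table from 8 to 9 buckets remaps roughly 8 out of every 9 keys:

    import java.util.stream.IntStream;

    public class ModNRehash {
        // Static placement: bucket = H(key) mod N (here H is the identity).
        static int bucket(int key, int n) {
            return Math.floorMod(key, n);
        }

        public static void main(String[] args) {
            int n = 8, keys = 10_000;
            long moved = IntStream.range(0, keys)
                    .filter(k -> bucket(k, n) != bucket(k, n + 1))
                    .count();
            // Prints a count close to 8/9 of all keys: almost everything moves.
            System.out.println(moved + " of " + keys + " keys moved");
        }
    }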
Distributed Hash Table – Consistent Hashing
- Consistency property: let A and B be two views of the hash function. If they send an object k to two different buckets, say y and z respectively, then either A cannot see bucket z or B cannot see bucket y.
- Each bucket is assigned a number in (0,1), and each item is hashed to a value in (0,1). An item is then placed in the bucket at least distance, or by always rounding down (or up), wrapping around at the edges.
Distributed Hash Table – Consistent Hashing (cont'd)
- If a new bucket is added, we therefore only need to move part of the contents of the bucket immediately before it.
- If we let each bucket have m points on the circle, with m = O(log n), then each bucket gets a fair share of the load, and on average we move only about one bucket's worth of contents when the set of buckets changes.
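A sketch of a consistent-hashing ring in Java, assuming a TreeMap as the circle and String bucket names; all names here are illustrative. Each bucket is inserted at several virtual points, and an item belongs to the first bucket at or after its hash, wrapping at the edge:

    import java.util.TreeMap;

    public class ConsistentHash {
        // The ring: hash position -> bucket name.
        private final TreeMap<Integer, String> ring = new TreeMap<>();

        // O(log n) virtual points per bucket evens out the load.
        void addBucket(String bucket, int virtualPoints) {
            for (int i = 0; i < virtualPoints; i++)
                ring.put((bucket + "#" + i).hashCode(), bucket);
        }

        // First bucket at or after the item's hash; wrap at the edge.
        // Assumes at least one bucket has been added.
        String bucketFor(String item) {
            Integer k = ring.ceilingKey(item.hashCode());
            return ring.get(k != null ? k : ring.firstKey());
        }
    }

Adding a bucket only inserts new points into the TreeMap, so only the items that now fall between a new point and its neighbor need to move.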
Distributed Hash Table
- A new class of peer-to-peer routing infrastructures that supports hash-table-like functionality at Internet scale.
- Consider the buckets to be computers on the Internet: given a key, the system should find the value (an IP address).
- The challenges with such a system: load balancing, scalability, highly dynamic membership, no single point of failure, and deterministic results.
Basic DHT components
- An overlay space and its partition approach
- A routing protocol: a local routing table and a next-hop decision
- A base hash function; variants: proximity-aware, locality-preserving, etc.
Consistent hashing [diagram: both data items and servers are hashed onto the same overlay space]
Chord [Stoica et al., SIGCOMM 2001]
- Overlay space: a one-dimensional, unidirectional key space 0 to 2^m - 1. Given two m-bit identifiers x and y, d(x, y) = (y - x + 2^m) mod 2^m.
- Each node is assigned a random m-bit identifier.
- Each node is responsible for all keys equal to or before its identifier, back to (but not including) its predecessor node's identifier in the key space.
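The identifier arithmetic above can be made concrete with a small sketch; the class and method names are our own, not from the paper:

    // Chord's identifier-space arithmetic for m = 8.
    public class ChordMath {
        static final int M = 8, SPACE = 1 << M;   // 2^m identifiers

        // Clockwise distance from x to y: (y - x + 2^m) mod 2^m.
        static int d(int x, int y) {
            return Math.floorMod(y - x, SPACE);
        }

        // A node owns key k iff k lies in (predecessor, node] clockwise.
        // Assumes at least two distinct nodes on the ring.
        static boolean owns(int node, int predecessor, int key) {
            int dk = d(predecessor, key);
            return dk > 0 && dk <= d(predecessor, node);
        }
    }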
Chord – routing protocol
- Routing table (finger table): (at most) m entries. The i-th entry of node n contains a pointer to the first node that succeeds n by at least 2^(i-1) on the key space, for 1 <= i <= m.
- Next-hop decision: for a given target key k, find the closest finger preceding k and forward the request to it.
- Ending condition: the request terminates when k lies between the ID of the current node and the ID of its successor.
- The routing path length is O(log n) for an n-node network with high probability (w.h.p.).
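A sketch of the finger-table rule and the next-hop choice, under the same illustrative naming as before:

    // Entry i of node n should point to successor(n + 2^(i-1)).
    public class FingerTable {
        static final int M = 8, SPACE = 1 << M;
        final int nodeId;
        final int[] finger = new int[M + 1];      // finger[1..M] hold node IDs

        FingerTable(int nodeId) { this.nodeId = nodeId; }

        // Start of finger i's range: n + 2^(i-1), wrapped to the key space.
        static int start(int n, int i) {
            return (n + (1 << (i - 1))) % SPACE;
        }

        static int dist(int x, int y) { return Math.floorMod(y - x, SPACE); }

        // Closest finger lying strictly between this node and key k;
        // scanning from the largest finger down finds the longest hop.
        int closestPrecedingFinger(int k) {
            for (int i = M; i >= 1; i--) {
                int f = finger[i];
                if (dist(nodeId, f) > 0 && dist(nodeId, f) < dist(nodeId, k))
                    return f;
            }
            return nodeId;                        // no closer finger known
        }
    }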
Chord – an example (m = 8) [diagram: a Chord network with 8 nodes in an 8-bit key space; the data item with key 120 is stored at its successor node on the ring]
Chord – routing table setup [diagram: a Chord network with 8 nodes and an 8-bit key space; each finger entry holds a pointer to the first node in its range]. Relative to a node's own identifier, the m = 8 finger ranges are:
- Range 1: [1, 2)
- Range 2: [2, 4)
- Range 3: [4, 8)
- Range 4: [8, 16)
- Range 5: [16, 32)
- Range 6: [32, 64)
- Range 7: [64, 128)
- Range 8: [128, 256)
Chord – a lookup for key 120 [diagram: the query is forwarded along fingers toward key 120, shrinking the remaining distance at each hop]
How to look up a key quickly? We need the finger table.
How to look up a key quickly? (cont.) [diagram: the finger table for node 1]
How to look up a key quickly? (cont.) Example: what is the value of key 1?
- Node 3: Am I predecessor(1)? No.
- Node 3: Try entry 3 of the finger table, and find node 0.
- Node 3: Send the lookup to node 0 (RPC).
- Node 0: Am I predecessor(1)? Yes: successor(1) is node 1.
- Node 0: Return successor(1) = node 1 to node 3.
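The walk-through above corresponds roughly to the following iterative lookup; the Node interface is a hypothetical in-memory stand-in for the RPC interface, not the project's actual API:

    interface Node {
        int id();
        Node successor();
        Node closestPrecedingFinger(int key);
    }

    class Lookup {
        static final int SPACE = 1 << 8;
        static int d(int x, int y) { return Math.floorMod(y - x, SPACE); }

        // find_predecessor(k): hop until k lies in (n, n.successor].
        static Node findPredecessor(Node start, int key) {
            Node n = start;
            while (!(d(n.id(), key) > 0
                     && d(n.id(), key) <= d(n.id(), n.successor().id()))) {
                Node next = n.closestPrecedingFinger(key);
                if (next == n) break;            // cannot get any closer
                n = next;                        // O(log n) hops w.h.p.
            }
            return n;
        }

        static Node findSuccessor(Node start, int key) {
            return findPredecessor(start, key).successor();
        }
    }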
Node joins
- Two challenges: each node's finger table must be correctly filled, and each key k must be stored at node successor(k).
- Three operations: (1) initialize the predecessor and fingers of the new node n; (2) update the fingers and predecessors of existing nodes; (3) copy to n all keys for which node n has become their successor.
Initialize the predecessor and fingers of node n. Idea: ask an existing node for the information needed. [diagram: a new node joins the ring]
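A sketch of this initialization, following the paper's init_finger_table pseudocode; ChordNode and its fields are illustrative, and findSuccessor is the lookup from the earlier sketch:

    abstract class ChordNode {
        static final int M = 8, SPACE = 1 << M;
        int id;
        ChordNode predecessor;
        ChordNode[] finger = new ChordNode[M + 1];    // finger[1..M]

        abstract ChordNode findSuccessor(int key);    // see lookup sketch

        static int start(int n, int i) { return (n + (1 << (i - 1))) % SPACE; }
        static int d(int x, int y) { return Math.floorMod(y - x, SPACE); }

        // Fill our own predecessor and fingers by asking 'existing'.
        void initFingerTable(ChordNode existing) {
            finger[1] = existing.findSuccessor(start(id, 1));
            predecessor = finger[1].predecessor;
            for (int i = 1; i < M; i++) {
                int s = start(id, i + 1);
                // If s already lies between us and finger[i], reuse that
                // finger instead of issuing another remote lookup.
                finger[i + 1] = d(id, s) < d(id, finger[i].id)
                        ? finger[i]
                        : existing.findSuccessor(s);
            }
        }
    }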
Update the fingers and predecessors of existing nodes.
- Observation: when node n joins the network, n will become the i-th finger of a node p when the following two conditions are met: p precedes n by at least 2^(i-1), and the i-th finger of node p succeeds n.
- Solution: find predecessor(n - 2^(i-1)) for all 1 <= i <= m, and check whether n is its i-th finger, and whether n is its predecessor's i-th finger.
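The same idea as methods on the ChordNode sketch above; names follow the paper's update_others / update_finger_table pseudocode, and findPredecessor is the lookup from earlier:

    // For each i, the last node whose i-th finger could need to point
    // at the new node is predecessor(n - 2^(i-1)); let it (and,
    // recursively, its predecessors) adopt the new node as finger i.
    void updateOthers() {
        for (int i = 1; i <= M; i++) {
            int target = Math.floorMod(id - (1 << (i - 1)), SPACE);
            ChordNode p = findPredecessor(target);
            p.updateFingerTable(this, i);
        }
    }

    void updateFingerTable(ChordNode s, int i) {
        // Adopt s as finger i only if it lies between this node and the
        // current finger; then the predecessor may need the update too.
        if (d(id, s.id) > 0 && d(id, s.id) < d(id, finger[i].id)) {
            finger[i] = s;
            predecessor.updateFingerTable(s, i);
        }
    }

    abstract ChordNode findPredecessor(int key);      // see lookup sketch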
Update the fingers and predecessors of existing nodes (cont.) [animation: node 6 joins; for each i the lookup walks to predecessor(6 - 2^(i-1)) and either updates that node's i-th finger to 6 (e.g., predecessor(1) = 0: update) or leaves it unchanged (e.g., predecessor(3) = 1: no update; predecessor(0) = 3: no update)]
Copy to n all keys for which node n has become their successor. Idea: node n can become the successor only for keys stored by the node immediately following n. [diagram: the new node pulls those keys from its successor]
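A sketch of that hand-off, continuing the ChordNode sketch above; the store map is an illustrative stand-in for real storage:

    // On a join, only the successor can lose keys to the new node n:
    // exactly those keys that now fall in (predecessor, n].
    java.util.Map<Integer, String> store = new java.util.HashMap<>();

    void pullKeysFromSuccessor() {
        ChordNode succ = finger[1];
        var it = succ.store.entrySet().iterator();
        while (it.hasNext()) {
            var e = it.next();
            int dk = d(predecessor.id, e.getKey());
            if (dk > 0 && dk <= d(predecessor.id, id)) {  // key in (pred, n]
                store.put(e.getKey(), e.getValue());      // copy to n...
                it.remove();                              // ...and drop at succ
            }
        }
    }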
Extended Chord protocol
- Concurrent joins
- Failures and replication
These are beyond our project scope. [Ref.: Stoica et al., SIGCOMM 2001]
Example applications
- Cooperative mirroring
- Time-shared storage
- Distributed indexes
- Large-scale combinatorial search
Implementation
Our current implementation is a simulation of the Chord protocol in Java, with lookups completing in O(log n) steps. [diagram: the simulation framework]
Methods Implemented
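As a hedged guess at the method surface such a simulator would expose, mirroring the pseudocode in the Chord paper; the names below are our assumption, and the project's actual signatures may differ:

    interface ChordProtocol {
        int findSuccessor(int key);        // lookup in O(log n) hops
        int findPredecessor(int key);
        int closestPrecedingFinger(int key);
        void join(int knownNodeId);        // init fingers via a known node
        void updateOthers();               // fix fingers of existing nodes
        void transferKeys();               // move keys to the new successor
    }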
Conclusion. Questions?
Reference: Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications." SIGCOMM 2001. MIT Laboratory for Computer Science.
THANK YOU