Koorde: A Simple Degree Optimal DHT

1 Koorde: A Simple Degree Optimal DHT
Adapted from Frans Kaashoek and David Karger, MIT. The blue lines were added by Guihai Chen.

2 DHT Routing
Distributed hash tables
Implement hash table interface
Map any ID to the machine responsible for that ID (in a consistent fashion)
Standard primitive for P2P
Machines not all aware of each other
Each tracks small set of “neighbors”
Route to responsible node via sequence of “hops” to neighbors

3 Performance Measures
Degree: how many neighbors nodes have
Hop count: how long to reach any destination node
Fault tolerance: how many nodes can fail
Maintenance overhead: e.g., making sure neighbors are up
Load balance: how evenly keys distribute among nodes

4 Tradeoffs
With larger degree, hope to achieve
  Smaller hop count
  Better fault tolerance
But higher degree implies
  More routing table state per node
  Higher maintenance overhead to keep routing tables up to date
Load balance is an “orthogonal issue”

5 Current Systems
Chord, Kademlia, Pastry, Tapestry
  O(log n) degree
  O(log n) hop count
  O(log n) ratio load balance
Chord: O(1) load balance with O(log n) “virtual nodes” per real node
  Multiplies degree to O(log^2 n)

6 Outliers
CAN
  Degree d
  O(d n^{1/d}) hops
Viceroy
  O(log n) hop count
  Constant average degree
  But some nodes have degree log n
Note: “outliers” means outsiders.

7 Lower Bounds to Shoot For
Theorem: if max degree is d, then hop count is at least log_d n
  Proof: at most d^h nodes at distance h
  Allows degree O(1) and O(log n) hops
  Or degree O(log n) and O(log n / log log n) hops
Theorem: to tolerate half of the nodes failing (e.g. net partition), need degree Ω(log n)
  Proof: if less, some node loses all neighbors
  Might as well take O(log n / log log n) hops!
Annotation: something is wrong in the first theorem. It should read: if max degree is d, then hop count is at least FLOOR(log_d n). Otherwise there are 2 counterexamples: the 3-node ring and the 5-node ring.

8 Koorde
New routing protocol
Shares almost all aspects with Chord
But meets (to within a constant factor) all lower bounds just mentioned:
  Degree 2 and O(log n) hops
  Or degree log n and O(log n / log log n) hops, and fault tolerant
Like Chord, O(log n) load balance, or constant with O(log n) times degree

9 Chord Review
Chord consists of
  Consistent hashing to assign IDs to nodes
    Good load balance
  Efficient routing protocol to find right node
  Fast join/leave protocol
    Few data items shifted
  Fault tolerance to half of nodes failing
  Efficient maintenance over time
■ Koorde routing protocol to find right node

10 Consistent Hashing
Assign ID to “successor” node on ring
Example: assign doc with hash 49 to node 51
[Figure: ring with nodes 6, 13, 18, 22, 31, 36, 42, 47, 51, 60 and document hash 49]
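The successor rule on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the node IDs are the ones drawn on the ring, and the function name `successor` is chosen here for convenience.

```python
from bisect import bisect_left

# Node IDs taken from the ring drawn on the slide.
NODES = sorted([6, 13, 18, 22, 31, 36, 42, 47, 51, 60])

def successor(key, nodes=NODES):
    """Return the first node ID >= key, wrapping past the largest ID."""
    idx = bisect_left(nodes, key)
    return nodes[idx % len(nodes)]

print(successor(49))  # 51, matching the slide's example
print(successor(61))  # 6, wraps around the ring
```

A sorted list with binary search is only one possible representation; a real DHT node never sees the full node list and finds the successor by routing.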

11 Chord Routing
Each node keeps successor pointer
Also keeps power-of-two “fingers”
  Neighbors providing shortcuts
  So log n fingers
[Figure: ring with nodes 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]
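As a rough sketch of what "power-of-two fingers" means, the snippet below computes the finger table of a node on the slide's ring. The 6-bit identifier space (IDs 0 to 63) is an assumption made here so that the slide's node IDs fit; the slide does not state the ring size, and the helper names are illustrative.

```python
from bisect import bisect_left

B = 6  # assumed ID length in bits (IDs 0..63); not stated on the slide
NODES = sorted([6, 13, 18, 22, 31, 36, 42, 47, 51, 60])

def successor(key):
    """First node at or after key on the ring of size 2^B."""
    idx = bisect_left(NODES, key % (1 << B))
    return NODES[idx % len(NODES)]

def chord_fingers(u):
    """Finger j points at the successor of u + 2^j (mod 2^B)."""
    return [successor(u + (1 << j)) for j in range(B)]

print(chord_fingers(6))  # [13, 13, 13, 18, 22, 42]
```

Each node stores B such entries, which is where the log n degree of Chord comes from.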

12 Chord Lookups
[Figure: lookup on the ring with nodes 6, 13, 18, 22, 31, 36, 42, 47, 51, 60]

13 Koorde Idea
Chord acts like a hypercube
  Fingers flip one bit
  Degree log n (log n different flips)
  Diameter log n
Koorde uses a de Bruijn network
  Fingers shift in one bit
  Degree 2 (2 possible bits to shift in)

14 De Bruijn Graph
Nodes are b-bit integers (b = log n)
Node u has 2 neighbors (bit shifts): 2u mod 2^b and (2u+1) mod 2^b
Or: node u = u_b u_{b-1} … u_1 has 2 left-shift neighbors, u°0 = u_{b-1} u_{b-2} … u_1 0 and u°1 = u_{b-1} u_{b-2} … u_1 1
[Figure: de Bruijn graph on the 3-bit nodes 000 through 111]
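The two neighbors of a node can be computed directly from the definition on this slide. The sketch below lists every edge of the 3-bit graph; the function name is illustrative.

```python
def de_bruijn_neighbors(u, b):
    """Return (u°0, u°1): drop the top bit of u and append 0 or 1."""
    mask = (1 << b) - 1
    return ((u << 1) & mask, ((u << 1) | 1) & mask)

b = 3
for u in range(1 << b):
    n0, n1 = de_bruijn_neighbors(u, b)
    print(f"{u:03b} -> {n0:03b}, {n1:03b}")
```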

15 De Bruijn Routing
Shift in destination bits one by one
b hops complete route
Route from 000 to 110: 000 → 001 → 011 → 110
[Figure: de Bruijn graph on the 3-bit nodes 000 through 111]

16 Routing Code
Procedure u.LOOKUP(k, toShift)
  /* u is machine, k is target key, toShift is target bits not yet shifted in */
  if k = u then
    return u /* as owner for k */
  else /* do de Bruijn hop */
    t = u ° topBit(toShift)
    return t.LOOKUP(k, toShift << 1)
Initially call self.LOOKUP(k, k)
Annotation: toShift << 1 is a left shift by 1 bit; topBit(toShift) is the current bit of the destination, which should be shifted in next.
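The LOOKUP procedure above can be traced with a small iterative sketch on a complete de Bruijn graph, where every b-bit identifier is a live node. Names such as `de_bruijn_route` are chosen here for illustration; the function returns the sequence of nodes visited instead of the owner.

```python
def top_bit(x, b):
    """Most significant of the b bits of x."""
    return (x >> (b - 1)) & 1

def de_bruijn_route(u, k, b):
    """Nodes visited when routing from u to k on a full b-bit de Bruijn graph."""
    mask = (1 << b) - 1
    to_shift = k              # target bits not yet shifted in
    path = [u]
    for _ in range(b):        # at most b de Bruijn hops are needed
        if u == k:            # reached the owner of k
            break
        u = ((u << 1) | top_bit(to_shift, b)) & mask
        to_shift = (to_shift << 1) & mask
        path.append(u)
    return path

print([f"{v:03b}" for v in de_bruijn_route(0b000, 0b110, 3)])
# ['000', '001', '011', '110']
```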

17 Summary
Each node has 2 outgoing neighbors
Also two incoming
Can show good routing load balance
Need b = log n bits for n distinct nodes
So log n hops to route

18 Problems to Solve
Want b-bit ring, b >> log n, to avoid colliding identifiers as nodes join
  Implies use b >> log n hops
  Worse, most nodes not present to route!
Solutions
  Imaginary routing: present nodes simulate routing actions of absent nodes
  Short cuts: use gaps to start route with most of destination bits already shifted in
Annotation: read the paper to understand b >> log n

19 Imaginary routing
Node u holds two pointers
  Successor on ring
  One finger: predecessor of 2u (mod 2^b)
    On sparse ring, is also predecessor of 2u+1
    So handles both de Bruijn edges
Node u “owns” all imaginary nodes between self and (real) successor
Simulates de Bruijn routing from those imaginary nodes to others by forwarding to the others’ real owners

20 Code
Procedure u.LOOKUP(k, toShift, i)
  if k ∈ (u, u.successor] then
    return u.successor /* as bucket for k */
  else if i ∈ [u, u.successor) then /* i belongs to u; do de Bruijn hop */
    return u.finger.LOOKUP(k, toShift << 1, i ° topBit(toShift))
  else /* i doesn’t belong to u; forward it */
    return u.successor.LOOKUP(k, toShift, i)
Initially call self.LOOKUP(k, k, self)
Annotation: argument i always represents the next station in the full de Bruijn graph. The ownership test for i uses [u, u.successor): including u, but not u.successor.
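The following is a minimal simulation of this lookup on a sparse ring, intended as a sketch under stated assumptions and not as the authors' implementation. Assumptions made here: a 6-bit identifier space, the node IDs from the earlier ring figure, ownership of imaginary node i tested as "including u, but not u.successor" (as the annotation says), and an extra `k == u` check so a lookup for a key equal to the current node's ID stops immediately.

```python
from bisect import bisect_left, bisect_right

B = 6                       # assumed identifier length in bits
M = 1 << B                  # size of the identifier ring
NODES = sorted([6, 13, 18, 22, 31, 36, 42, 47, 51, 60])

def succ_node(u):
    """Real successor of node u on the ring."""
    return NODES[(bisect_left(NODES, u) + 1) % len(NODES)]

def pred_of(x):
    """Last real node at or before identifier x (wraps to the largest node)."""
    return NODES[bisect_right(NODES, x % M) - 1]

def in_open_closed(x, a, b):
    """x in (a, b] on the ring."""
    return ((x - a) % M or M) <= ((b - a) % M or M)

def in_closed_open(x, a, b):
    """x in [a, b) on the ring."""
    return (x - a) % M < ((b - a) % M or M)

def koorde_lookup(start, k):
    """Return (owner of k, list of real nodes visited)."""
    u, i, to_shift, path = start, start, k, [start]
    while True:
        s = succ_node(u)
        if k == u:                        # added check (assumption): u owns k
            return u, path
        if in_open_closed(k, u, s):       # successor is the bucket for k
            return s, path
        if in_closed_open(i, u, s):       # i belongs to u: de Bruijn hop
            bit = (to_shift >> (B - 1)) & 1
            i = ((i << 1) | bit) % M
            to_shift = (to_shift << 1) % M
            u = pred_of(2 * u)            # finger: predecessor of 2u
        else:                             # i does not belong to u: forward
            u = s
        path.append(u)

owner, path = koorde_lookup(6, 49)
print(owner)  # 51
```

The loop replaces the slide's recursion so that the visited nodes can be collected; each iteration corresponds to one LOOKUP call.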

21 True route tracks imaginary
[Figure: ring showing start, finger (< double), imaginary (double), target, and successor]

22 Correctness
Once b de Bruijn steps happen, done
  At this point, i = k
  Will follow successors to bucket for k
Successor steps delay de Bruijn steps, but not forever
  After finite number of successor steps, reach predecessor of i
Conclude: all necessary de Bruijn steps happen in finite time. So correct.

23 How long?
Only b de Bruijn steps
Just bound (expected) number of successor steps per de Bruijn step
Nodes randomly distributed on ring
  So node expects to own size 1/n interval
  So distance to imaginary node on de Bruijn step is 1/n
  De Bruijn step doubles everything, makes distance 2/n
  Expect 2 nodes in interval of that size

24 Few Successor Steps
[Figure: ring segments labeled start, 1/n, target, and < 2/n]

25 Summary
Each de Bruijn hop followed by 2 successor hops (in expectation)
b de Bruijn hops
Conclude 2b successor hops, so 3b hops in total
Expectation argument extends to “with high probability” argument (same bounds)
Remaining problem: b >> log n, too big
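The count behind the 3b figure can be written out as follows. This restates the slides' expectation argument under their assumption of n nodes placed uniformly at random on a ring of unit length; it is not a proof of the high-probability bound.

```latex
\begin{align*}
\mathbb{E}[\text{nodes in an interval of length } 2/n] &= n \cdot \frac{2}{n} = 2, \\
\mathbb{E}[\text{total hops}] &= \underbrace{b}_{\text{de Bruijn hops}}
  + \underbrace{2b}_{\text{successor hops}} = 3b.
\end{align*}
```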

26 Exploit Address Blocks
Only n real nodes
Each owns ~1/n “block” of keyspace
Within that block, only top log n bits “significant”; low bits arbitrary
So set low bits to high bits of target
Then just have to shift out log n most significant bits
So log n de Bruijn hops
So O(log n) hops in total

27 Example
Start at u = 001011011…
Successor 001110101…
u “owns” imaginary 00101******
Target …
Set imaginary start …
Only need to shift out 00101
5 hops, independent of b
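One way to pick the imaginary starting node is sketched below. It is an illustration under assumptions, not the paper's procedure: `best_start` is a hypothetical helper, it treats u's block as the interval from u (inclusive) to its successor (exclusive), and the usage example uses a 6-bit ring with u = 6, successor 13, and key 49 instead of the elided values on the slide.

```python
def in_closed_open(x, a, b, m):
    """x in [a, b) on a ring of size m."""
    return (x - a) % m < ((b - a) % m or m)

def best_start(u, succ, k, b):
    """Pick an imaginary node i in [u, succ) whose low bits equal the top
    bits of k. Returns (i, hops): hops de Bruijn shifts are still needed."""
    m = 1 << b
    for j in range(b, -1, -1):            # try to preload j bits, most first
        low = k >> (b - j)                # top j bits of k
        base = ((u >> j) << j) | low      # u's high bits, then k's top bits
        for i in (base, (base + (1 << j)) % m):
            if in_closed_open(i, u, succ, m):
                return i, b - j
    return u, b                           # not reached: j = 0 always yields u

i, hops = best_start(6, 13, 49, 6)
print(f"{i:06b}", hops)  # 001100 2
```

Here key 49 is 110001 in binary, so starting from imaginary node 001100 only the last two bits (01) remain to be shifted in.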

28 Summary
Koorde uses 2 neighbors per node (one successor, one finger)
And requires O(log n) routing hops with high probability

29 Variant: Koorde-K
We used a binary de Bruijn network
Generalizes to other base K
[Figure: base-3 de Bruijn graph with 3-digit node labels such as 000, 001, 002, and 010]
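In base K, each node has K de Bruijn neighbors: shift left by one base-K digit and append any digit 0 to K-1. A minimal sketch, with helper names chosen here for illustration:

```python
def de_bruijn_neighbors_k(u, K, b):
    """Shift u left by one base-K digit and append each digit 0..K-1."""
    m = K ** b
    return [(u * K + d) % m for d in range(K)]

def to_base(u, K, b):
    """Render u as a b-digit base-K string."""
    digits = []
    for _ in range(b):
        digits.append(str(u % K))
        u //= K
    return "".join(reversed(digits))

u = int("021", 3)
print([to_base(v, 3, 3) for v in de_bruijn_neighbors_k(u, 3, 3)])
# ['210', '211', '212']
```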

30 Analysis
To represent n distinct node ids need log_K n base-K digits
Suggests log_K n hops to route
Same problem as Koorde: b >> log_K n
Same solution: imaginary routing
  Node u points at predecessor(Ku)
Same analysis: K de Bruijn hops interspersed with successor hops

31 Successor Hops
Now de Bruijn hop multiplies ids by K
So expect K nodes between finger and next imaginary node
Implies K successor hops per de Bruijn hop
Gives K log_K n hops: no good
To avoid successor hops, u fingers predecessor(Ku) and following K nodes
Allows K successor hops by one finger
Gives O(log_K n) hops as desired

32 Summary
Using K fingers per node, can achieve O(log_K n) = O(log n / log K) routing hops
As discussed earlier, degree log n is necessary (and sufficient) for fault tolerance (and is degree of most previous systems)
So, O(log n / log log n) hops

33 Summary: What do we Gain?
Lower degree for same number of hops
  Storage isn’t really an issue
  But lower degree should translate into lower maintenance traffic
Lower hop count for same degree
And tunable
  Other systems also have tunable hop count
  But at low hop counts (high degree) their extra log factor in degree does matter

34 What do we lose?
Chord is “self stabilizing”
  From successors, can build entire routing system quickly by “pointer jumping” to find fingers
Koorde is not
  Given only successor pointers, no clear fast way to find fingers
  Not a problem for joins, because joiner can use lookup to find its finger
  But could be a problem if massive changes

35 More Info

