Shades: Expediting Kademlia's Lookup Process
Gil Einziger, Roy Friedman, Yoav Kantor
Computer Science, Technion
Kademlia Overview Kademlia is nowadays implemented in many popular file sharing applications like BitTorrent, Gnutella, and eMule. Applications over Kademlia have hundreds of millions of users worldwide. Invented in 2002 by Petar Maymounkov and David Mazières.
Kademlia is good Kademlia has a number of desirable features not simultaneously offered by any previous DHT:
– It minimizes the number of configuration messages nodes must send to learn about each other (easy to maintain).
– Configuration information spreads automatically as a side-effect of key lookups.
– Nodes have enough knowledge and flexibility to route queries through low-latency paths (fast log(N) lookups).
– Kademlia uses parallel, asynchronous queries to avoid timeout delays from failed nodes (fault tolerant).
Many ways to reach the same value… There are k possible peers to make the first step. The first peer returns k other peers that are closer to the value. Each of these peers returns other, closer peers, and so on, until finally we reach the k closest nodes. These nodes store the actual value!
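A minimal sketch of this iterative process, assuming hypothetical `routing_table.closest` and `rpc_find_node` helpers (not the OpenKad API); in Kademlia the distance between two IDs is their bitwise XOR:

```python
# Sketch of Kademlia's iterative lookup; routing_table and rpc_find_node are
# hypothetical stand-ins, not the OpenKad API.
ALPHA = 3   # number of queries issued per round
K = 20      # bucket size / number of closest nodes we converge on

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def iterative_lookup(key: int, routing_table, rpc_find_node):
    # Start from the K known peers closest to the key.
    shortlist = sorted(routing_table.closest(key, K),
                       key=lambda n: xor_distance(n.node_id, key))
    queried = set()
    while True:
        to_query = [n for n in shortlist if n not in queried][:ALPHA]
        if not to_query:
            return shortlist[:K]   # the K closest nodes, which store the value
        for node in to_query:
            queried.add(node)
            closer = rpc_find_node(node, key)  # the peer answers with closer peers
            shortlist = sorted(set(shortlist) | set(closer),
                               key=lambda n: xor_distance(n.node_id, key))[:K]
```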
Many possible routing paths… All roads lead to Rome… but all of them lead to the same k closest peers. For popular content (many users that love Fry…) those k closest peers become the bottleneck: "Please wait… we're all laptops here."
The big picture [Chart: DHTs placed by latency vs. maintenance overhead – OneHop, Kelips, Chord, Kademlia, and Shades] Low-latency DHTs typically require gossip to maintain a large state. Other DHTs are easier to maintain, but incur longer routing. Our goal: reduce the latency and remain very easy to maintain.
Caching to the rescue! Local Cache – after searching for an item, cache it locally (Guangmin, 2009). KadCache – after searching for an item, send it to the last peer along the path. Kaleidoscope – break symmetry using colors; designed to reduce message cost, not latency. Motivation: if a value is popular, we should be able to hit a cached copy before reaching the k closest nodes.
Caching Internet Content [Chart: frequency vs. rank] The access distribution of most content is skewed, often modeled using Zipf-like functions, power laws, etc.: a small number of very popular items carries a large share of the requests (for example ~50% of the weight), followed by a long, heavy tail (the remaining ~50%).
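As a rough, illustrative sketch of such a skewed distribution (synthetic numbers, not the paper's data), a Zipf-like law gives the item of rank i a weight proportional to 1/i^s:

```python
# Illustrative only: how much of the total weight the most popular items carry
# under a Zipf-like distribution (synthetic numbers, not the paper's data).
def zipf_weights(n_items: int, s: float = 1.0):
    raw = [1.0 / (rank ** s) for rank in range(1, n_items + 1)]
    total = sum(raw)
    return [w / total for w in raw]

weights = zipf_weights(10_000, s=1.0)
print(f"top 1% of items: {sum(weights[:100]):.0%} of requests")
print(f"long tail (rest): {sum(weights[100:]):.0%} of requests")
```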
Caching Internet Content [Chart: frequency vs. rank] Unpopular items can suddenly become popular, and vice versa. ("Blackmail is such an ugly word. I prefer 'extortion'. The 'X' makes it sound cool.")
Shades overview Form a large distributed cache from many nodes. – Make sure these caches are accessible early during the lookup. Single-cache behavior: – Admission policy. – Eviction policy.
Palette The palette provides a mapping from colors to nodes of that color. We want to have at least a single node from every color. [Diagram: the node's k-buckets alongside its palette.]
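A minimal sketch of the palette idea, assuming a key's (or node ID's) color is simply its hash modulo the number of colors; the class and method names here are illustrative, not the paper's API:

```python
import hashlib

NUM_COLORS = 16  # illustrative value; the number of colors is a system parameter

def color_of(identifier: bytes) -> int:
    # Both keys and node IDs are mapped to a color by hashing.
    digest = hashlib.sha1(identifier).digest()
    return int.from_bytes(digest, "big") % NUM_COLORS

class Palette:
    """Maps every color to known nodes of that color."""
    def __init__(self):
        self.nodes_by_color = {c: [] for c in range(NUM_COLORS)}

    def observe(self, node_id: bytes, contact):
        # Called for nodes learned as a side-effect of normal Kademlia traffic,
        # so that we keep at least a single node from every color.
        self.nodes_by_color[color_of(node_id)].append(contact)

    def correctly_colored(self, key: bytes):
        # Nodes whose color matches the key's color are the likely cache holders.
        return self.nodes_by_color[color_of(key)]
```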
Shades in Brief Do the original Kademlia lookup and, at the same time, contact correctly colored nodes from the palette. The original routing advances us towards the value; correctly colored nodes are likely to contain a cached copy of the value.
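A sketch of that combined step, reusing the hypothetical `Palette` above; for brevity the cache probe is shown before the fallback rather than truly in parallel, and `rpc_get_cached` is an assumed name for the cache-probe RPC:

```python
def shades_lookup(key: bytes, kademlia_lookup, palette, rpc_get_cached):
    # Ask a correctly colored node for a cached copy of the value; in the real
    # protocol this happens concurrently with the ordinary Kademlia routing.
    for node in palette.correctly_colored(key)[:1]:
        value = rpc_get_cached(node, key)      # hypothetical cache-probe RPC
        if value is not None:
            return value                       # cache hit cuts the lookup short
    # Otherwise the original routing advances us towards the k closest nodes.
    return kademlia_lookup(key)
```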
Multiple cache lookups Problem: if the first routing step is not successful, how can we get additional correctly colored nodes? Solution: use the palettes of contacted nodes! [Diagram: looking for the key "bender"; each response also includes a node from the contacted peer's palette.]
Eviction and Admission Policies [Diagram: the eviction policy (Lazy LFU) picks a cache victim; the admission policy (TinyLFU) decides the winner between the victim and the new item.] One of you guys should leave… is the new item any better than the victim? What is the common answer?
TinyLFU: LFU Admission Policy [Histogram of item frequencies] Keep inserting new items into the histogram until #items = W. Once #items reaches W, divide all counters by 2.
TinyLFU Example [Diagram: the eviction policy (Lazy LFU) proposes a cache victim, and the admission policy (TinyLFU) compares it against the new item.] Victim score: 3, new item score: 2 – the victim wins!
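A sketch of the admission rule these scores illustrate, assuming a `tinylfu_estimate` function that returns frequency scores of this kind:

```python
def admit(new_item, cache_victim, tinylfu_estimate) -> bool:
    # The new item replaces the victim only if its estimated frequency is higher.
    return tinylfu_estimate(new_item) > tinylfu_estimate(cache_victim)

# With the slide's scores (victim: 3, new item: 2) the victim wins and the
# cache content is left unchanged.
```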
What are we doing? [Diagram: an approximation of the past is used to predict the future.] It is much cheaper to maintain an approximate view of the past.
TinyLFU operation (a Bloom filter in front of an MI-CBF):
Estimate(item):
  Return BF.contains(item) + MI-CBF.estimate(item)
Add(item):
  W++
  If (W == WindowSize) Reset()
  If (BF.contains(item)) Return MI-CBF.add(item)
  BF.add(item)
Reset():
  Divide W by 2, erase the Bloom filter, divide all counters by 2 (in the MI-CBF).
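A runnable rendering of this control flow; to keep it short, an exact set and dict stand in for the Bloom filter and the MI-CBF, so the behavior matches the pseudocode while ignoring the space savings of the approximate structures:

```python
class TinyLFU:
    """Control flow of TinyLFU; a set and a dict stand in for the BF and MI-CBF."""
    def __init__(self, window_size: int):
        self.window_size = window_size
        self.w = 0
        self.doorkeeper = set()   # stands in for the Bloom filter
        self.counters = {}        # stands in for the MI-CBF

    def estimate(self, item) -> int:
        # BF.contains(item) + MI-CBF.estimate(item)
        return int(item in self.doorkeeper) + self.counters.get(item, 0)

    def add(self, item) -> None:
        self.w += 1
        if self.w == self.window_size:
            self.reset()
        if item in self.doorkeeper:
            self.counters[item] = self.counters.get(item, 0) + 1
        else:
            self.doorkeeper.add(item)

    def reset(self) -> None:
        # Divide W by 2, erase the Bloom filter, divide all counters by 2.
        self.w //= 2
        self.doorkeeper.clear()
        self.counters = {k: v // 2 for k, v in self.counters.items() if v >= 2}
```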
Eviction Policy: Lazy LFU Motivation: an efficient approximation of the LFU eviction policy, for the case where admission is rare. "Search for the least frequently used item… in a lazy manner." [Diagram: repeated victim searches over cached items with frequencies A:7, B:6, C:8, D:5, E:2, F:17, G:31.]
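One plausible reading of this slide, sketched under the assumption that each eviction only scans a small window of cache entries, resuming where the previous search stopped, and evicts the least frequently used item it sees; this is inferred from the diagram, not the paper's exact procedure:

```python
# Hedged sketch of a lazy victim search: only a few entries are examined per
# eviction, which is cheap when TinyLFU rarely admits a new item anyway.
def lazy_lfu_victim(items, freq, start, scan_len=3):
    """items: list of cached keys; freq: key -> access count; start: resume index."""
    window = [items[(start + i) % len(items)] for i in range(scan_len)]
    victim = min(window, key=lambda k: freq[k])
    return victim, (start + scan_len) % len(items)   # victim and next resume point
```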
Bloom Filters An array BF of m bits and k hash functions {h1,…,hk} over the domain [0,…,m-1]. Adding an object obj to the Bloom filter is done by computing h1(obj),…,hk(obj) and setting the corresponding bits in BF. Checking set membership for an object cand is done by computing h1(cand),…,hk(cand) and verifying that all corresponding bits are set. Example: m=11, k=3; h1(o1)=0, h2(o1)=7, h3(o1)=5, so bits 0, 5, and 7 of BF are set and a query for o1 succeeds (✓); h1(o2)=0, h2(o2)=7, h3(o2)=4, and since bit 4 is not set the query for o2 fails (✗) – not all bits are 1.
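A minimal Bloom filter sketch matching this description; the slide assumes k independent hash functions h1,…,hk, while the sketch derives the k indices from a single digest (a common implementation shortcut, not something the slide specifies):

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _indices(self, obj: bytes):
        # Derive k indices in [0, m-1] from two hash values (double hashing).
        digest = hashlib.sha256(obj).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, obj: bytes) -> None:
        for idx in self._indices(obj):
            self.bits[idx] = 1

    def contains(self, obj: bytes) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[idx] for idx in self._indices(obj))
```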
Bloom Filters with Minimal Increment Sacrifices the ability to decrement in favor of accuracy/space efficiency – during an Increment operation, only the lowest counters are updated. Example: m=11, k=3; h1(o1)=0, h2(o1)=7, h3(o1)=5, and the three addressed counters of the MI-CBF hold 3, 6, and 8, so Increment(o1) only adds to the first (lowest) entry (3→4).
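And the minimal-increment variant, sketched on top of the same indexing scheme (reusing the `BloomFilter` sketch above; its bit array is reused as an array of small counters):

```python
class MinimalIncrementCBF(BloomFilter):
    """Counting Bloom filter where Increment only updates the lowest counters."""
    def estimate(self, obj: bytes) -> int:
        # The estimate is the minimum of the addressed counters.
        return min(self.bits[i] for i in self._indices(obj))

    def add(self, obj: bytes) -> None:
        # Minimal increment: only counters currently at the minimum are bumped,
        # e.g. for addressed counters (3, 6, 8) only the 3 becomes 4.
        idxs = self._indices(obj)
        low = min(self.bits[i] for i in idxs)
        for i in idxs:
            if self.bits[i] == low:
                self.bits[i] += 1
```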
Shades Tradeoff What happens as the number of colors increases? We form larger distributed caches, but it is more difficult to fill the palette.
Comparative results Emulation – we run the actual implementation, sending and receiving actual UDP packets (only the user is simulated). Scale – different network sizes, up to 5,000 Kademlia peers. Experimental settings: each peer does 500 warm-up requests and 500 requests in the measurement interval (up to 2.5 million find-value requests in warm-up and 2.5 million in measurement). Experiment generation: each peer receives a file with 1,000 requests from the appropriate workload; all users continuously play the requests.
Wikipedia trace (Baaren & Pierre 2009): "10% of all user requests issued to Wikipedia during the period from September 19th 2007 to October 31st." YouTube trace (Cheng et al., QoS 2008): weekly measurements of ~160k newly created videos during a period of 21 weeks; we directly created a synthetic distribution for each week.
Comparative results YouTube workload, 100-item cache. More queries are finished sooner. Other caching strategies offer only a marginal reduction in the number of contacted nodes. This is the ideal corner: we want to complete as many of the lookups as soon as possible!
Comparative results YouTube workload, unbounded cache. Shades is also better for an unbounded cache! Notice that Shades_100 is still better than the other caching strategies even when they use an unbounded cache.
Comparative results Wikipedia workload, 100-item cache.
Comparative results Load is more balanced because frequent items are found in a single step. Message overheads are similar to the other suggestions.
Conclusions Latency improvement – up to 22-34% reduction of median latency and 18-23% reduction of average latency. Better load distribution – the busiest nodes are 22-43% less congested, because cached values are not close to the stored values. Reproducibility – Shades is an open source project: https://code.google.com/p/shades/. Kaleidoscope, KadCache and Local are released as part of the open source project OpenKad: https://code.google.com/p/openkad/. Feel free to use them!
The end: any questions? Thanks for listening!