Improving Lookup Performance over a Widely-Deployed DHT
Daniel Stutzbach, Reza Rejaie
The ION P2P Project
http://mirage.cs.uoregon.edu/P2P
University of Oregon
INFOCOM, Barcelona, Spain, April 27th, 2006

Slide 2/14: Distributed Hash Tables (DHTs)
- Introduced in 2001
- Goal: Allow fast, scalable lookups
- Hyped as “second-generation” P2P
- Focus of many research papers
- For a long time, no significant deployment
- Deployment now (all Kademlia based):
  - Overnet: 500,000+
  - Kad: 1,000,000+
  - Azureus: 800,000+
- Performance of a widely-deployed DHT with real churn
  - How efficient are lookups in practice?
  - How do parallel lookups improve performance?
  - How much replication is needed to ensure consistency?
- But first some background…

Slide 3/14: Background: Kademlia
- Several features to address churn:
  - Routing tables contain redundant routes, called k-buckets.
  - Parallel routing quickly bypasses failed peers. Relies on iterative routing.
- Lookups use prefix-matching
  - Similar to Pastry

Slide 4/14: Background: Routing in prefix-matching DHTs
- Example IDs:
  - Target: 01101
  - Source’s ID: 11000
  - 1st Hop’s ID: 00110
  - 2nd Hop’s ID: 01100
- If the first x bits match:
  - Point to a peer with x+b matching bits, or
  - Within 1 hop of the closest peer.
- With high probability, need to match around log_2 n bits.
- Improve by b bits per step.
- (log_2 n) / b steps per lookup
- b = 2
- High b yields quicker lookups, but larger routing tables
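
The prefix-matching rule on this slide can be illustrated with a small sketch. It counts how many leading bits two IDs share and evaluates the (log_2 n) / b step estimate. The function names are hypothetical and chosen for illustration; the 5-bit strings are the IDs shown on the slide.

```python
import math


def matching_prefix_bits(a: str, b: str) -> int:
    """Count the leading bits that two equal-length binary ID strings share."""
    count = 0
    for bit_a, bit_b in zip(a, b):
        if bit_a != bit_b:
            break
        count += 1
    return count


def predicted_steps(n_peers: int, b: int) -> float:
    """Step estimate from the slide: about log2(n) bits to match, b bits per step."""
    return math.log2(n_peers) / b


target = "01101"
for peer_id in ("11000", "00110", "01100"):
    print(peer_id, "shares", matching_prefix_bits(peer_id, target), "bits with", target)

# With 1024 peers and b = 2, the estimate is 10 / 2 = 5 steps.
print(predicted_steps(1024, 2))
```

A larger b lowers the estimate proportionally, which is the trade-off against routing table size noted above.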

Slide 5/14: Outline
- Performance: Theory versus Practice
  - Theory predicts (log_2 n) / b steps per lookup.
  - Measure n to be approximately 1 million.
  - Theory from Kademlia paper predicts 6.3 hops.
  - Emulate lookups from nodes to addresses.
  - In practice, average lookups take 3.2 hops.
- Revising Theory: Analyzing the average-case
  - k-buckets improve performance.
  - Enrichment via k-buckets versus increasing b
- Parallel Lookup
- Ensuring consistency through replication

Slide 6/14: Theoretical Lookup Performance
- Measured n ≈ 1 million peers in Kad.
  - Developed a fast P2P crawler, called Cruiser (Global Internet 2005, IMC 2005).
  - Adapted Cruiser to crawl Kad zones.
  - Measured thousands of Kad zones.
- Kad improves 4 bits on the first step, at least 3 bits on each additional step.
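
One way to see where a figure like 6.3 hops can come from is to plug the numbers on this slide into a short calculation. This is a sketch under an assumption: that the prediction counts one step for the first 4 bits and then divides the remaining bits by 3. The slides do not spell out the derivation, and the function name is hypothetical.

```python
import math


def kad_worst_case_steps(n_peers: int,
                         first_step_bits: int = 4,
                         later_step_bits: int = 3) -> float:
    """Assumed derivation: 1 step resolves the first 4 bits, the rest go 3 at a time."""
    bits_needed = math.log2(n_peers)  # about 19.9 bits for one million peers
    return 1 + (bits_needed - first_step_bits) / later_step_bits


print(round(kad_worst_case_steps(1_000_000), 1))
```

Under these assumptions the result rounds to the 6.3 hops quoted in the talk.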

Slide 7/14: Empirical Lookup Performance
- Goal: Measure lookup cost between (node, address) pairs
- Emulate DHT lookups with kLookup
  - Leverage iterative routing
  - Probe node A to extract its routing table
  - Perform the lookup as node A
  - We can use a variety of different lookup strategies
- We found an average lookup takes 3.2 hops, much better than the predicted 6.3!
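
The idea of emulating a lookup "as node A" can be sketched as a loop that repeatedly fetches a routing table and moves to the closest known peer. This is not the kLookup tool itself: `emulate_lookup`, `fetch_table`, and the toy 5-bit network are hypothetical, and the sketch follows a single greedy path using Kademlia's XOR distance.

```python
def emulate_lookup(start, target, fetch_table):
    """Iteratively walk toward `target`, counting hops, starting from `start`.

    `fetch_table(peer)` is a hypothetical callback that returns the peer IDs
    in that peer's routing table (in a real tool, obtained by probing it).
    """
    current, hops = start, 0
    while True:
        neighbors = fetch_table(current)
        # Kademlia measures closeness as the XOR of two IDs.
        best = min(neighbors, key=lambda peer: peer ^ target, default=current)
        if (best ^ target) >= (current ^ target):
            return current, hops  # no neighbor is closer: lookup has converged
        current, hops = best, hops + 1


# Toy network with 5-bit IDs; the routing tables are made up for illustration.
toy_tables = {
    0b11000: [0b00110, 0b10001],
    0b00110: [0b01100, 0b00001],
    0b01100: [0b01110],
}
closest, hops = emulate_lookup(0b11000, 0b01101, lambda p: toy_tables.get(p, []))
print(format(closest, "05b"), hops)
```

Because the routing is iterative, the emulating host only needs each peer's table, not the peer's cooperation in forwarding, which is what makes this style of measurement possible.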

Slide 8/14: Investigating the performance gap
- 6.3 is a worst-case analysis (sort of).
  - Based on improving 3 bits per step.
  - Through chance, the next hop peer may have additional matching bits.
- We derive a formula for average performance.
  - There are k chances to find a peer with additional matching bits.
  - k-buckets dramatically improve average performance.
- For k = 20, suggested in the Kademlia paper:
  - Worst-case is 1 bit per step.
  - Average-case is 5.7 bits per step!
- Is it better to enrich a routing table with k-buckets or with larger symbols (b)?
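
The "k chances" argument can be made concrete with a simple probabilistic model. This is a sketch, not necessarily the formula derived in the paper: it assumes each of the k peers in a bucket matches the guaranteed bits and then, independently, matches each further bit with probability 1/2, and that the lookup takes the best of the k peers. The function name and parameters are hypothetical.

```python
def expected_bits_per_step(k: int, guaranteed_bits: int = 1, max_extra: int = 64) -> float:
    """Expected bits gained per step when the best of k random peers is chosen.

    A single peer has at least j extra matching bits with probability 2**-j,
    so the best of k peers has at least j extra bits with probability
    1 - (1 - 2**-j)**k. Summing that tail over j gives the expected extra bits.
    """
    extra = sum(1 - (1 - 2.0 ** -j) ** k for j in range(1, max_extra + 1))
    return guaranteed_bits + extra


print(round(expected_bits_per_step(20), 1))  # bucket size suggested in the Kademlia paper
print(round(expected_bits_per_step(1), 1))   # a single route per bucket
```

Under this model, k = 20 gives roughly the 5.7 bits per step quoted above, against a worst case of 1 bit.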

Slide 9/14: Analysis of Enriching Routing Tables
- Two ways to increase lookup efficiency:
  - Larger symbols (b)
  - Larger buckets (k)
- Both proposed in the Kademlia paper, but the benefits of buckets were not fully considered.
- Which yields the most improvement?
  - See paper for detailed analysis
  - Asymptotically similar
  - Larger symbols are better by a constant factor (around 23% more bits per step)

Slide 10/14: Empirical Lookup Performance, Revisited
- To compute the average case, we need to know the size of the k-buckets.
- Kad uses buckets with k = 10.
- However, due to churn:
  - The buckets are not always full.
  - Some entries may point to departed peers.
- We need to examine buckets in the wild to determine how full they are in practice.

Slide 11/14: Extracting Kad Routing Tables
- kFetch: a tool for extracting routing tables
  - Systematically generates queries for each k-bucket
  - TCP-like congestion controlled query rate
  - Probes each neighbor to determine if it departed
- Findings
  - On average, k-buckets have 1 free slot
  - On average, k-buckets point to 1 or 2 departed peers
  - Overall, the mean k-bucket has 7.5 useful peers.
- We now predict 2.9 steps per lookup.
  - Reasonably close to the measured 3.2 steps.

Slide 12/14: Improving Lookup Performance with Parallel Lookup
- Types of Parallel Lookup
  - Strict
    - Have exactly α outstanding lookups
    - If we find a better next-hop, wait until one lookup completes.
    - Pro: Limited overhead
  - Loose
    - Always have outstanding lookups to the α best-known next-hops
    - If we find a better next-hop, send a lookup immediately.
    - Pro: May be faster
- Key Questions:
  - Which one is better?
  - How much parallelism (α) is best?
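
The difference between the two policies comes down to when a newly discovered peer may be queried. The sketch below is one interpretation of the slide's definitions, not the authors' implementation: `next_queries` and its arguments are hypothetical, "best-known" is taken to mean closest to the target by XOR distance, and the loose policy is assumed to rank only peers that have not yet answered.

```python
def next_queries(mode, alpha, target, known, outstanding, done):
    """Return the peers to query right now under a strict or loose policy.

    known:       all peer IDs learned so far
    outstanding: peers queried but not yet answered
    done:        peers that have answered or timed out
    """
    def distance(peer):
        return peer ^ target

    if mode == "strict":
        # Never exceed alpha outstanding lookups; fill only the free slots.
        free_slots = max(alpha - len(outstanding), 0)
        fresh = sorted(known - outstanding - done, key=distance)
        return fresh[:free_slots]
    if mode == "loose":
        # Keep a lookup outstanding to each of the alpha closest unanswered
        # peers, even if that pushes the total number outstanding above alpha.
        closest = sorted(known - done, key=distance)[:alpha]
        return [peer for peer in closest if peer not in outstanding]
    raise ValueError(f"unknown mode: {mode}")


target = 0b01101
known = {0b11000, 0b00110, 0b01100, 0b01000}
outstanding = {0b11000, 0b00110}  # alpha = 2 lookups already in flight

print(next_queries("strict", 2, target, known, outstanding, set()))
print(next_queries("loose", 2, target, known, outstanding, set()))
```

In this example the strict policy sends nothing until a reply arrives, while the loose policy immediately queries the two newly learned closer peers, which shows why loose can be faster but costs more messages.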

Slide 13/14: Parallel Lookup
- Using parallelism reduces latency from 10 s to 2-3 s.
- Diminishing returns after α = 3.
- Loose parallelism is slightly faster.
- Strict parallelism is much more efficient.

Slide 14/14: Summary of Contributions
- Analysis of average-case performance for prefix-matching DHTs.
  - Average performance can be dramatically different from the worst case.
  - k-buckets improve lookup efficiency.
- Empirical study of improving performance with parallel lookup
  - Strict parallel routing performs better.
  - Sweet spot at α = 3 outstanding lookups.
- Empirical study of using replication to ensure consistency
  - 3 copies on nearby peers overcome lookup inconsistencies
  - See paper for details
- Tools and techniques:
  - Kad Cruiser: Capture the peers in a Kad zone, measure size
  - kLookup: Emulate a lookup from any peer to any address
  - kFetch: Extract a peer’s routing table
