A Case Study in Building Layered DHT Applications

1 A Case Study in Building Layered DHT Applications
Yatin Chawathe, Sriram Ramabhadran, Sylvia Ratnasamy, Anthony LaMarca, Scott Shenker, Joseph Hellerstein

2 Building distributed applications
Distributed systems are designed to be scalable, available, and robust
What about simplicity of implementation and deployment?
DHTs proposed as a simplifying building block
  Simple hash-table API: put, get, remove (illustrated below)
  Scalable content-based routing, fault tolerance, and replication
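
To make the hash-table API concrete, here is a minimal sketch with a single in-memory dict standing in for the distributed store; the class name and the set-of-values semantics are illustrative assumptions, not OpenDHT's actual client interface.

```python
# Illustrative stand-in for the simple put/get/remove API named above.
# A real DHT spreads keys across many nodes and replicates them; here a
# single dict plays that role so the calls can be tried locally.

class ToyDHT:
    def __init__(self):
        self._store = {}                          # key -> set of values

    def put(self, key: str, value: str) -> None:
        self._store.setdefault(key, set()).add(value)

    def get(self, key: str) -> set:
        return set(self._store.get(key, set()))

    def remove(self, key: str, value: str) -> None:
        self._store.get(key, set()).discard(value)
```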

3 Can DHTs help?
Can we layer complex functionality on top of unmodified DHTs?
Can we outsource the entire DHT operation to a third-party DHT service, e.g., OpenDHT?
Existing DHT applications fall into two classes
  Simple apps that use an unmodified DHT for rendezvous or storage, e.g., i3, CFS, FOOD
  Complex apps that modify the DHT for enhanced functionality, e.g., Mercury, CoralCDN

4 Outline
Motivation
A case study: Place Lab
Range queries with Prefix Hash Trees
Evaluation
Conclusion

5 A Case Study: Place Lab
Positioning service for location-enhanced apps
  Clients locate themselves by listening for known radio beacons (e.g., WiFi APs)
  Database of APs and their known locations
[Diagram: “war-drivers” submit neighborhood logs { lat, lon → list of APs }; the Place Lab service computes maps of AP MAC address ↔ lat, lon; clients download local WiFi maps { AP → lat, lon }]
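
As a small illustration of the two mappings above; the coordinates and MAC addresses are invented:

```python
# Hypothetical shapes of the two Place Lab data sets (values invented).
war_drive_log = {
    # (lat, lon) where the scan was taken -> MAC addresses of APs heard
    (47.6597, -122.3184): ["00:0d:93:12:34:56", "00:11:24:ab:cd:ef"],
}
wifi_map = {
    # AP MAC address -> estimated (lat, lon) computed by the service
    "00:0d:93:12:34:56": (47.6598, -122.3180),
}
```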

6 Why Place Lab?
Developed by a group of ubicomp researchers
  Not experts in system design and management
Centralized deployment since March 2004
  Software downloaded by over 6,000 sites
Concerns over organizational control → decentralize the service
But they want to avoid the implementation and deployment overhead of a distributed service

7 How DHTs can help Place Lab
[Diagram: the DHT provides storage and routing; “war-drivers” submit neighborhood logs, Place Lab servers compute AP locations, clients download local WiFi maps]
Automatic content-based routing
  Route logs by AP MAC address to the appropriate Place Lab server (sketched below)
Robustness and availability
  DHT managed entirely by a third party
  Provides automatic replication and failure recovery of database content
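
A hedged sketch of the content-based routing idea: every war-driver observation is stored under a key derived from the AP's MAC address, so the Place Lab server responsible for that AP can fetch all observations of it with one get. The key scheme, helper names, and the plain dict standing in for the DHT are assumptions for illustration only.

```python
import hashlib
import json

def log_key(ap_mac: str) -> str:
    # Hash the MAC address so log keys spread evenly over the DHT key space.
    return "placelab:log:" + hashlib.sha1(ap_mac.lower().encode()).hexdigest()

def submit_observation(store: dict, ap_mac: str, lat: float, lon: float) -> None:
    # "put": a war-driver appends one observation under the AP's key.
    entry = json.dumps({"mac": ap_mac, "lat": lat, "lon": lon})
    store.setdefault(log_key(ap_mac), []).append(entry)

def observations_for(store: dict, ap_mac: str) -> list:
    # "get": the responsible Place Lab server retrieves every observation
    # of this AP before estimating its location.
    return [json.loads(e) for e in store.get(log_key(ap_mac), [])]
```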

8 Downloading WiFi Maps
[Diagram: DHT storage and routing; “war-drivers” submit neighborhood logs, Place Lab servers compute AP locations, clients download local WiFi maps]
Clients perform geographic range queries
  Download segments of the database, e.g., all access points in Philadelphia
Can we perform this entirely on top of an unmodified third-party DHT?
  DHTs provide exact-match queries, not range queries

9 Supporting range queries
Prefix Hash Trees
Index built entirely with put, get, remove primitives
  No changes to DHT topology or routing
Binary tree structure
  Node label is a binary prefix of the values stored under it
  Nodes split when they get too big
Stored in the DHT with the node label as key (storage sketched below)
  Allows direct access to interior and leaf nodes
[Figure: example PHT with interior nodes R, R0, R1, R00, R01, R10, R11, … and leaves holding binary keys such as 0000, 0011, 1000, 0100, 0101, 0110, 1100, 1101, 1110, 1111]
Note: Prefix hash trees cannot by themselves protect against data loss, but failure of a tree node does not affect availability of data stored in other nodes, even descendants of the failed node. Moreover, PHTs can benefit from the DHT’s replication strategy.
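
A minimal sketch of the storage layout just described: each tree node lives under a DHT key derived from its binary prefix label, and a leaf splits when it holds too many entries. The block size, the node encoding, and the overwrite-style `store` dict are assumptions; a full PHT implementation would also re-split an overflowing child and cope with the real DHT's put semantics.

```python
# Assumptions: `store` is a dict standing in for the DHT (exact-match
# get/put), keys are fixed-width bit strings such as "0110", and B is an
# invented block size.
B = 4   # illustrative maximum number of entries per leaf

def node_key(label: str) -> str:
    # Each tree node is addressed in the DHT by its binary prefix label.
    return "pht:" + label

def pht_insert(store: dict, key_bits: str, value, label: str = "") -> None:
    node = store.get(node_key(label)) or {"leaf": True, "entries": {}}
    if not node["leaf"]:
        # Interior node: descend to the child matching the next bit of the key.
        pht_insert(store, key_bits, value, label + key_bits[len(label)])
        return
    node["entries"][key_bits] = value
    if len(node["entries"]) > B:
        # Leaf is too big: push its entries down into two children and
        # turn this node into an interior node.
        for bit in "01":
            child = {"leaf": True,
                     "entries": {k: v for k, v in node["entries"].items()
                                 if k[len(label)] == bit}}
            store[node_key(label + bit)] = child
        node = {"leaf": False, "entries": {}}
    store[node_key(label)] = node
```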

10 PHT operations
Lookup(K)
  Find the leaf node whose label is a prefix of K
  Binary search across K’s bits: O(log log D), where D = size of the key space (sketched below)
Insert(K, V)
  Lookup the leaf node for K
  If full, split the node into two
  Put value V into the leaf node
Query(K1, K2)
  Lookup the node for P, where P = longest common prefix of K1 and K2
  Traverse the subtree rooted at the node for P
[Figure: example lookup, insert, and query paths through the PHT from the previous slide]
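
A sketch of Lookup(K) as described above: binary search over prefix lengths, one DHT get per probe, so the number of gets grows with the logarithm of the key length, i.e. O(log log D). It reuses the `store` and node encoding assumed in the previous sketch.

```python
def pht_lookup(store: dict, key_bits: str):
    # Binary-search the prefix lengths of K to find the leaf covering K.
    lo, hi = 0, len(key_bits)
    while lo <= hi:
        mid = (lo + hi) // 2
        node = store.get(node_key(key_bits[:mid]))
        if node is None:
            hi = mid - 1                 # nothing this deep: leaf is shallower
        elif node["leaf"]:
            return key_bits[:mid], node  # leaf whose label is a prefix of K
        else:
            lo = mid + 1                 # interior node: the leaf lies deeper
    return None                          # empty tree
```

Query(K1, K2) would then start from the node labeled with the longest common prefix of K1 and K2 and walk that subtree with further gets.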

11 2-D geographic queries
Convert lat/lon into a 1-D key
  Use z-curve linearization: interleave lat/lon bits to create the z-curve key (sketched below)
Linearized query results may not be contiguous
  Start at the longest-prefix subtree
  Visit child nodes only if they can contribute to the query result
[Figure: 2-D lat/lon grid mapped onto a z-curve; the example point (0101, 0110) linearizes to 54, and a rectangular query over points such as (2,4), (2,5), (3,5), (3,6), (3,7) visits only the PHT subtrees, e.g., P0100, P0101, P0110, that can contribute results]
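
A tiny sketch of the bit interleaving, using the example recoverable from the slide's figure: latitude 0101 and longitude 0110 linearize to 00110110, i.e. 54. The fixed bit width and the lat-before-lon ordering are taken from that example.

```python
def z_curve_key(lat_bits: str, lon_bits: str) -> str:
    # Interleave latitude and longitude bits into one 1-D PHT key.
    # Example from the slide: z_curve_key("0101", "0110") == "00110110" (54).
    assert len(lat_bits) == len(lon_bits)
    return "".join(la + lo for la, lo in zip(lat_bits, lon_bits))
```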

12 PHT Visualization

13 Ease of implementation and deployment
2,100 lines of code to hook Place Lab into the underlying DHT service
  Compare with 14,000 lines for the DHT itself
Runs entirely on top of the deployed OpenDHT service
DHT handles fault tolerance and robustness, and masks failures of Place Lab servers

14 Flexibility of DHT APIs
Range queries use only the get operation
Updates use a combination of put, get, and remove
But…
  Concurrent updates can cause inefficiencies
  No support for concurrency in existing DHT APIs
A test-and-set extension can be beneficial to PHTs and a range of other applications (sketched below)
  put_conditional: perform the put only if the value has not changed since the previous get
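
A hedged sketch of how the proposed put_conditional extension could be used for PHT node updates: read the node, apply the change, and write it back only if nothing changed in between. The signature and the dict-based store are assumptions; only the operation's name and intent come from the slide.

```python
def put_conditional(store: dict, key: str, expected, new_value) -> bool:
    # Test-and-set: write only if the stored value is still what was last read.
    if store.get(key) != expected:
        return False                 # a concurrent update won; caller retries
    store[key] = new_value
    return True

def update_node(store: dict, key: str, mutate) -> None:
    # Read-modify-write loop for a PHT node under concurrent writers.
    while True:
        old = store.get(key)
        if put_conditional(store, key, old, mutate(old)):
            return
```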

15 PHT insert performance
Median insert latency is 1.45 sec
Without caching: 3.25 sec; with caching: 0.76 sec

16 PHT query performance
Data size    Latency (sec)
5k           2.13
10k          2.76
50k          3.18
100k         3.75
Queries on average take 2–4 seconds
Varies with block size
  Smaller (or very large) block sizes imply longer query times

17 Conclusion
Concrete example of building complex applications on top of a vanilla DHT service
DHT provides ease of implementation and deployment
Layering allows inheriting robustness, availability, and scalable routing from the DHT
  But sacrifices some performance in return

