A Case Study in Building Layered DHT Applications


A Case Study in Building Layered DHT Applications
Yatin Chawathe, Sriram Ramabhadran, Sylvia Ratnasamy, Anthony LaMarca, Scott Shenker, Joseph Hellerstein

Building distributed applications
- Distributed systems are designed to be scalable, available, and robust
- What about simplicity of implementation and deployment?
- DHTs have been proposed as a simplifying building block
  - Simple hash-table API: put, get, remove
  - Scalable content-based routing, fault tolerance, and replication
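As a rough illustration of how narrow that interface is, the toy Python stand-in below mimics the put/get/remove API in-process. The class name and method signatures are assumptions for this sketch, not OpenDHT's actual RPC interface; a real DHT would route each key to the node responsible for it and replicate the value.

    from typing import Dict, List

    class DHTClient:
        """Toy in-process stand-in for a remote DHT service (illustrative only)."""

        def __init__(self) -> None:
            self._table: Dict[bytes, List[bytes]] = {}

        def put(self, key: bytes, value: bytes) -> None:
            # A real DHT would route this to the node responsible for hash(key)
            # and replicate it; here we simply append the value locally.
            self._table.setdefault(key, []).append(value)

        def get(self, key: bytes) -> List[bytes]:
            # Return every value stored under the key (DHTs commonly allow
            # multiple values per key).
            return list(self._table.get(key, []))

        def remove(self, key: bytes, value: bytes) -> None:
            # Remove one matching value, if present.
            values = self._table.get(key, [])
            if value in values:
                values.remove(value)

Later sketches in this transcript reuse this DHTClient so they stay self-contained.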

Can DHTs help?
- Can we layer complex functionality on top of unmodified DHTs?
- Can we outsource the entire DHT operation to a third-party DHT service, e.g., OpenDHT?
- Existing DHT applications fall into two classes
  - Simple unmodified DHT for rendezvous or storage, e.g., i3, CFS, FOOD
  - Complex apps that modify the DHT for enhanced functionality, e.g., Mercury, CoralCDN

Outline
- Motivation
- A case study: Place Lab
- Range queries with Prefix Hash Trees
- Evaluation
- Conclusion

A Case Study: Place Lab
- Positioning service for location-enhanced apps
- Clients locate themselves by listening for known radio beacons (e.g., WiFi APs)
- Database of APs and their known locations
- “War-drivers” submit neighborhood logs { lat, lon → list of APs }
- The Place Lab service computes maps of AP MAC address ↔ lat, lon
- Clients download local WiFi maps { AP → lat, lon }

Why Place Lab
- Developed by a group of ubicomp researchers, not experts in system design and management
- Centralized deployment since March 2004; software downloaded by over 6000 sites
- Concerns over organizational control → decentralize the service
- But they want to avoid the implementation and deployment overhead of a distributed service

How DHTs can help
[Architecture figure: “war-drivers” submit neighborhood logs, Place Lab servers compute AP locations, and clients download local WiFi maps, all via DHT storage and routing]
- Automatic content-based routing: route logs by AP MAC address to the appropriate Place Lab server (see the sketch below)
- Robustness and availability: the DHT is managed entirely by a third party and provides automatic replication and failure recovery of database content
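To make the content-based routing concrete, here is a hypothetical sketch (not Place Lab's actual code) of how log submissions and location estimates could flow through the toy DHTClient above. The names ap_key, submit_log_entry, and estimate_ap_location are invented for this illustration, and real Place Lab servers compute AP locations with more than a plain average.

    import hashlib
    import json

    def ap_key(mac: str) -> bytes:
        # Content-based routing: every observation of the same AP hashes to the
        # same DHT key, so the same Place Lab server receives all of them.
        return hashlib.sha1(mac.lower().encode()).digest()

    def submit_log_entry(dht: "DHTClient", mac: str, lat: float, lon: float) -> None:
        # A war-driver reports having heard AP `mac` at position (lat, lon).
        record = json.dumps({"lat": lat, "lon": lon}).encode()
        dht.put(ap_key(mac), record)

    def estimate_ap_location(dht: "DHTClient", mac: str):
        # The responsible server aggregates all sightings; here, a plain average.
        sightings = [json.loads(v) for v in dht.get(ap_key(mac))]
        if not sightings:
            return None
        lat = sum(s["lat"] for s in sightings) / len(sightings)
        lon = sum(s["lon"] for s in sightings) / len(sightings)
        return lat, lon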

Downloading WiFi Maps
- Clients perform geographic range queries to download segments of the database, e.g., all access points in Philadelphia
- Can we perform this entirely on top of an unmodified third-party DHT?
- DHTs provide exact-match queries, not range queries

Supporting range queries: Prefix Hash Trees
- Index built entirely with the put, get, remove primitives; no changes to DHT topology or routing
- Binary tree structure: a node's label is a binary prefix of the values stored under it, and nodes split when they get too big
- Each node is stored in the DHT with its label as the key, allowing direct access to interior and leaf nodes (see the layout sketch below)
[Figure: example PHT over 4-bit keys with node labels R, R0, R1, R00, R01, R10, R11, R010, R011, R110, R111 and leaves holding keys such as 0 (0000), 3 (0011), 4 (0100), 5 (0101), 6 (0110), 8 (1000), 12 (1100), 13 (1101), 14 (1110), 15 (1111)]
Prefix hash trees cannot by themselves protect against data loss, but failure of a tree node does not affect the availability of data stored in other nodes, even descendants of the failed node. Moreover, PHTs can benefit from the DHT’s replication strategy.
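The following minimal sketch shows one way that layout could be kept inside an unmodified DHT, reusing the toy DHTClient from earlier. The node encoding, the BLOCK_SIZE of 4, the 8-bit key space, and all helper names are assumptions for illustration, not the paper's implementation.

    import json

    BLOCK_SIZE = 4   # assumed leaf capacity; a leaf splits once it exceeds this
    KEY_BITS = 8     # toy key space: keys are 8-bit binary strings

    def node_key(label: str) -> bytes:
        # Every PHT node lives in the DHT under its binary prefix label, so any
        # interior or leaf node can be fetched directly with a single get.
        return ("pht:" + label).encode()

    def pht_get_node(dht, label: str):
        values = dht.get(node_key(label))
        return json.loads(values[-1].decode()) if values else None

    def pht_put_node(dht, label: str, node) -> None:
        dht.put(node_key(label), json.dumps(node).encode())

    def pht_create(dht) -> None:
        # An empty PHT is a single leaf at the root (the empty prefix).
        pht_put_node(dht, "", {"leaf": True, "keys": []})

    def pht_split_leaf(dht, label: str, keys) -> None:
        # A full leaf becomes an interior node; its keys move one level down
        # according to the next bit after the shared prefix `label`.
        pht_put_node(dht, label, {"leaf": False})
        for bit in "01":
            child = [k for k in keys if k[len(label)] == bit]
            pht_put_node(dht, label + bit, {"leaf": True, "keys": child})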

PHT operations
- Lookup(K): find the leaf node whose label is a prefix of K, using a binary search across K’s bits; O(log log D) DHT lookups, where D = size of the key space
- Insert(K, V): look up the leaf node for K; if it is full, split the node into two; put value V into the leaf node
- Query(K1, K2): look up the node for P, where P = longest common prefix of K1 and K2, then traverse the subtree rooted at the node for P
(A sketch of these operations follows below.)
[Figure: lookup, insert, and query paths traced over the example PHT from the previous slide]
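A sketch of those three operations over the toy node layout above, under the same assumptions; a production PHT would also handle concurrent writers, recursive splits, and values attached to keys.

    def pht_lookup(dht, key_bits: str) -> str:
        # Binary-search the length of the leaf's label among the prefixes of
        # key_bits: O(log log D) gets instead of walking down from the root.
        lo, hi = 0, len(key_bits)
        while lo <= hi:
            mid = (lo + hi) // 2
            node = pht_get_node(dht, key_bits[:mid])
            if node is None:
                hi = mid - 1           # no node this deep: leaf is shallower
            elif node["leaf"]:
                return key_bits[:mid]  # found the leaf covering key_bits
            else:
                lo = mid + 1           # interior node: leaf lies deeper
        raise KeyError("PHT invariants violated")

    def pht_insert(dht, key_bits: str) -> None:
        label = pht_lookup(dht, key_bits)
        node = pht_get_node(dht, label)
        node["keys"].append(key_bits)
        if len(node["keys"]) > BLOCK_SIZE:
            # A full implementation would split recursively if a child is still
            # over capacity, and would use a conditional put (later slide) to
            # avoid racing with concurrent inserts.
            pht_split_leaf(dht, label, node["keys"])
        else:
            pht_put_node(dht, label, node)

    def pht_query(dht, lo_bits: str, hi_bits: str) -> list:
        # Start at the longest common prefix of the endpoints and traverse the
        # subtree below it; a full implementation would skip subtrees whose
        # prefix region cannot intersect the range.
        prefix = ""
        for a, b in zip(lo_bits, hi_bits):
            if a != b:
                break
            prefix += a
        results, stack = [], [prefix]
        while stack:
            label = stack.pop()
            node = pht_get_node(dht, label)
            if node is None:
                continue
            if node["leaf"]:
                results.extend(k for k in node["keys"] if lo_bits <= k <= hi_bits)
            else:
                stack.extend((label + "0", label + "1"))
        return results

For example, after pht_create(dht) and a series of pht_insert calls with 8-bit keys, pht_query(dht, "00000000", "00111111") returns every stored key in the first quarter of the key space.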

2-D geographic queries
- Convert lat/lon into a 1-D key using z-curve linearization: interleave the lat/lon bits to create the z-curve key, e.g., (5, 6) = (0101, 0110) → 00110110 (54)
- Linearized query results may not be contiguous
- Start at the longest-common-prefix subtree and visit child nodes only if they can contribute to the query result (see the sketch below)
[Figure: the z-curve over an 8×8 latitude/longitude grid and the PHT subtree nodes P, P0, P1, ..., visited for an example 2-D range query]
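A small sketch of the bit interleaving, matching the slide's (5, 6) example; the 4-bit coordinate width and the "latitude bit first" ordering are assumptions chosen so the output agrees with that example.

    def z_curve_key(lat: int, lon: int, bits: int = 4) -> str:
        # Morton / z-curve linearization: interleave the coordinate bits,
        # most significant first, into one binary string usable as a PHT key.
        out = ""
        for i in range(bits - 1, -1, -1):
            out += str((lat >> i) & 1)   # latitude bit
            out += str((lon >> i) & 1)   # longitude bit
        return out

    # The slide's example: (lat=5, lon=6) = (0101, 0110) interleaves to
    # "00110110", i.e. 54 on the z-curve.
    assert z_curve_key(5, 6) == "00110110"

A 2-D range query then becomes a 1-D PHT query over these linearized keys, with the traversal skipping subtrees whose prefix region lies entirely outside the query rectangle.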

PHT Visualization

Ease of implementation and deployment
- 2,100 lines of code to hook Place Lab into the underlying DHT service, compared with 14,000 lines for the DHT itself
- Runs entirely on top of the deployed OpenDHT service
- The DHT handles fault tolerance and robustness, and masks failures of Place Lab servers

Flexibility of DHT APIs
- Range queries use only the get operation; updates use a combination of put, get, and remove
- But… concurrent updates can cause inefficiencies, and existing DHT APIs have no support for concurrency
- A test-and-set extension can benefit PHTs and a range of other applications: put_conditional performs the put only if the value has not changed since the previous get (see the sketch below)
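A hedged sketch of what such a put_conditional could look like from the client's perspective. This is an assumed extension rather than an existing OpenDHT call, and a real version would have to perform the compare-and-put atomically at the node that stores the key, not at the client as shown here.

    from typing import Optional

    def put_conditional(dht: "DHTClient", key: bytes, new_value: bytes,
                        expected: Optional[bytes]) -> bool:
        # Test-and-set: write new_value only if the latest value under `key`
        # is still the one the caller read earlier (`expected`).
        current = dht.get(key)
        latest = current[-1] if current else None
        if latest != expected:
            return False   # someone else updated the entry; caller retries
        dht.put(key, new_value)
        return True

With such a primitive, a PHT insert could read a leaf, modify it, and write it back conditionally, retrying on failure, instead of risking lost updates when two writers hit the same leaf concurrently.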

PHT insert performance
- Median insert latency is 1.45 sec; without caching it is 3.25 sec, and with caching it drops to 0.76 sec

PHT query performance
- Queries on average take 2–4 seconds, varying with data size:
    Data size    Latency (sec)
    5k           2.13
    10k          2.76
    50k          3.18
    100k         3.75
- Query time also varies with block size: a smaller (or very large) block size implies a longer query time

Conclusion
- A concrete example of building complex applications on top of a vanilla DHT service
- The DHT provides ease of implementation and deployment
- Layering allows inheriting robustness, availability, and scalable routing from the DHT, sacrificing some performance in return