Towards a Scalable and Robust DHT
Baruch Awerbuch, Johns Hopkins University
Christian Scheideler, Technical University of Munich

Holy Grail of Distributed Systems
Scalability and Robustness
Adversarial behavior is an increasingly pressing issue!

Why is this difficult?
- Scalability: minimize the resources needed for operations
- Robustness: maximize the resources needed for an attack
- Scalable solutions seem to be easy to attack!

Distributed Hash Table
Problem: maintain a hash table of data items across multiple sites
Basic operations:
- Join(s): site s joins the system
- Leave(s): site s leaves the system
- Insert(d): insert data item d into the hash table
- Lookup(name): look up the data item with the given name
[Figure: data items and sites]
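
The four operations above can be sketched as a single-process toy. This is only an illustration under stated assumptions, not the construction from the talk: the names `point` and `ToyDHT` are hypothetical, SHA-256 stands in for the hash function, items are (name, value) pairs, and each item is assigned to the first site at or after its point in [0,1), wrapping around at 1.

```python
import bisect
import hashlib


def point(key: str) -> float:
    """Map a string to a point in [0, 1); SHA-256 is an assumed choice."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the result is always below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


class ToyDHT:
    """Single-process toy: an item lives at the first site at or after its point."""

    def __init__(self):
        self.points = []   # sorted site positions in [0, 1)
        self.sites = {}    # position -> site name
        self.store = {}    # site name -> {item name: value}

    def _owner(self, name):
        # Successor of the item's point; the modulo wraps around at 1.
        i = bisect.bisect_left(self.points, point(name)) % len(self.points)
        return self.sites[self.points[i]]

    def _reinsert(self, items):
        for name, value in items:
            self.insert(name, value)

    def join(self, s):
        p = point(s)
        bisect.insort(self.points, p)
        self.sites[p] = s
        self.store[s] = {}
        # Naive: reassign every item. A real DHT would move only the items
        # that now belong to the new site.
        items = [kv for bucket in self.store.values() for kv in bucket.items()]
        for bucket in self.store.values():
            bucket.clear()
        self._reinsert(items)

    def leave(self, s):
        p = point(s)
        self.points.remove(p)
        del self.sites[p]
        orphans = list(self.store.pop(s).items())
        if self.points:            # with no sites left, the items are lost
            self._reinsert(orphans)

    def insert(self, name, value):
        self.store[self._owner(name)][name] = value

    def lookup(self, name):
        return self.store[self._owner(name)].get(name)
```

The toy assumes at least one site has joined before any insert or lookup, and it ignores everything that makes the problem hard: there is no overlay network, no replication, and no adversary.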

Distributed Hash Table
Scalable DHT:
- Bounded-degree overlay network
- Few copies per data item
[Figure: data items and sites]

Distributed Hash Table
Robust DHT:
- Network can sustain a constant fraction of adversarial sites
- Data layer can handle an arbitrary collection of insert and lookup requests
[Figure: data items and sites]

DHT by Karger et al.
[Figure: data items and sites placed on the interval from 0 to 1]

Robust Overlay Network
Two central conditions (given n sites):
- Balancing condition: only O(log n) sites in intervals of size (c log n)/n
- Majority condition: adversarial sites are in the minority in all intervals of size (c log n)/n
Balancing condition: scalability
Majority condition: robustness via majority decision
[Figure: the interval from 0 to 1]
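
A rough way to test the two conditions on a concrete placement is sketched below. It is a simplification under assumptions: only aligned intervals are checked (the conditions speak of all intervals of that size), the logarithm is taken base 2, and `check_conditions`, `c=4.0` and `balance_factor` (the constant hidden in the O(log n) bound) are hypothetical names and values chosen for illustration.

```python
import math


def check_conditions(honest, adversarial, c=4.0, balance_factor=4.0):
    """Return (balanced, majority) for site positions given as floats in [0, 1)."""
    n = len(honest) + len(adversarial)
    if n < 2:
        raise ValueError("need at least two sites")
    length = min(1.0, c * math.log2(n) / n)   # target interval size (c log n)/n
    m = max(1, math.floor(1.0 / length))      # m aligned intervals, each of size 1/m >= length
    honest_count = [0] * m
    adversarial_count = [0] * m
    for p in honest:
        honest_count[min(int(p * m), m - 1)] += 1
    for p in adversarial:
        adversarial_count[min(int(p * m), m - 1)] += 1
    # Balancing: no interval holds more than balance_factor * c * log n sites.
    max_load = balance_factor * c * math.log2(n)
    balanced = all(h + a <= max_load
                   for h, a in zip(honest_count, adversarial_count))
    # Majority: adversarial sites are strictly outnumbered wherever they appear.
    majority = all(a == 0 or a < h
                   for h, a in zip(honest_count, adversarial_count))
    return balanced, majority
```

For example, 64 evenly spread honest sites with 16 evenly spread adversarial sites pass both checks, while 40 adversarial sites packed near 0 break the majority check in the first interval.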

How to satisfy conditions?
Chord: uses a cryptographic hash function to map sites to points in [0,1)
- randomly distributes honest sites
- does not randomly distribute adversarial sites

How to mix adversarial sites?
CAN: map sites to random points in [0,1)

How to mix adversarial sites?
Group spreading [AS04]:
- Map sites to random points in [0,1)
- Limit lifetime of points
Too expensive!

How to mix adversarial sites?
Our approach:
- n honest nodes (not under control of the adversary) and εn adversarial nodes
- Adversary can adaptively join and leave with its nodes
- Oblivious join operation that can maintain the conditions under any adversary
Naive solution: perturb everything after each join operation

How to mix adversarial sites?
Card shuffling [Diaconis & Shahshahani 81]:
- Basic step: random transposition
- Θ(n log n) transpositions: random permutation
- Θ(log n) transpositions per join operation?
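
The shuffling idea can be sketched as follows. The function names are hypothetical, and `steps_for_mixing` uses ceil(n ln n) merely as one illustrative choice inside the Θ(n log n) bound; spread over n join operations, that budget comes to about ln n transpositions per join.

```python
import math
import random


def random_transposition_shuffle(items, steps, rng):
    """Apply `steps` random transpositions: pick two positions uniformly and swap them."""
    out = list(items)
    n = len(out)
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)   # i == j is allowed (no-op swap)
        out[i], out[j] = out[j], out[i]
    return out


def steps_for_mixing(n):
    """Illustrative step budget of order n log n (the constant is an assumption)."""
    return math.ceil(n * math.log(n))


rng = random.Random(0)
sites = list(range(8))
shuffled = random_transposition_shuffle(sites, steps_for_mixing(len(sites)), rng)
```

Each swap moves two sites to arbitrary positions, so the result is always a permutation of the input, but nothing in the swap itself controls how many sites end up in a given interval.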

Random transpositions
Cannot preserve the balancing condition!

How to mix adversarial sites?
Rule that works: k-cuckoo rule
- evict k/n-region
- n honest, εn adversarial, ε < 1-1/k
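
One possible reading of the k-cuckoo rule is sketched below. The details are assumptions rather than the talk's exact definition: the joining site is placed at a random point x, the k/n-region is taken to be the aligned interval of length k/n containing x, and every site found there is evicted to a fresh random point. The function name and the explicit parameter `n` are hypothetical.

```python
import math
import random


def cuckoo_join(positions, new_site, k, n, rng):
    """Join `new_site`; `positions` maps site -> point in [0, 1) and is updated in place.

    Returns the list of sites that were evicted from the k/n-region.
    """
    region_len = min(1.0, k / n)
    x = rng.random()                                  # random point for the new site
    start = math.floor(x / region_len) * region_len   # aligned region containing x
    end = start + region_len
    evicted = [s for s, p in positions.items() if start <= p < end]
    for s in evicted:
        positions[s] = rng.random()                   # evicted sites move to random points
    positions[new_site] = x
    return evicted


rng = random.Random(1)
positions = {f"s{i}": i / 16 for i in range(16)}
evicted = cuckoo_join(positions, "new", k=2, n=16, rng=rng)
```

With 16 evenly spaced sites and k = 2, every aligned region of length 2/16 holds exactly two sites, so each join in this toy setup relocates two of them.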

Are we done?
Dilemma: we cannot randomly distribute the data
Reason: the data would be unsearchable!
So we need a hash function, but then we are open to adversarial attacks on insert and lookup operations

What is the problem?
Adversary can select requests to different data items in the same region
Any medicine against that?
[Figure: the interval from 0 to 1]

Yes!

Robust Data Management
Long, long ago…
Deterministic PRAM simulation:
- Pioneered by Mehlhorn and Vishkin 84
- Problem: simulate a PRAM of n processors and m memory cells on a complete network of n processors with memory
Central ideas:
- Choose 2c-1 fixed hash functions with expansion properties, c = Θ(log m)
- Majority trick: update and look up any c copies
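
The majority trick relies on the fact that any two sets of c copies out of 2c-1 share at least one copy, so a lookup of any c copies sees the most recent update. A minimal sketch follows, with several assumptions: timestamps are used to recognize the newest copy, salted SHA-256 stands in for the fixed hash functions (it does not provide the expansion properties mentioned above), and `MajorityStore` with its methods is a hypothetical name.

```python
import hashlib


class MajorityStore:
    """Each item has 2c-1 copy locations; writes and reads each touch c of them."""

    def __init__(self, c, num_cells):
        self.c = c
        self.num_copies = 2 * c - 1
        self.num_cells = num_cells
        self.cells = {}    # (hash index, cell) -> {name: (timestamp, value)}
        self.clock = 0

    def _cell(self, name, i):
        # Stand-in for the i-th fixed hash function.
        digest = hashlib.sha256(f"{i}:{name}".encode("utf-8")).digest()
        return (i, int.from_bytes(digest[:8], "big") % self.num_cells)

    def _pick(self, reachable):
        # Use c of the copy indices that the caller could reach.
        chosen = sorted(set(reachable))[: self.c]
        if len(chosen) < self.c or not all(0 <= i < self.num_copies for i in chosen):
            raise ValueError("need c distinct copy indices in range(2c-1)")
        return chosen

    def write(self, name, value, reachable):
        chosen = self._pick(reachable)
        self.clock += 1
        for i in chosen:
            self.cells.setdefault(self._cell(name, i), {})[name] = (self.clock, value)

    def read(self, name, reachable):
        found = [self.cells.get(self._cell(name, i), {}).get(name)
                 for i in self._pick(reachable)]
        found = [entry for entry in found if entry is not None]
        # The newest timestamp among the c copies read wins.
        return max(found, key=lambda entry: entry[0])[1] if found else None


store = MajorityStore(c=3, num_cells=8)
store.write("x", "old", reachable=[0, 1, 2])
store.write("x", "new", reachable=[2, 3, 4])
```

Here copies 0 and 1 still hold the old value, yet every read of three copies overlaps the set {2, 3, 4} and therefore returns the new value.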

Robust Data Management
Why deterministic strategies?
Randomness is expensive in dynamic systems with adversarial presence!
More complex for dynamic networks:
- Congestion during routing
- Contention at destination regions

Robust Data Management
Expansion properties (probabilistic proof): simple threshold rule sufficient
[Figure: sources]

Robust Data Management
Strategy: Run several attempts
In each attempt:
- For each remaining request, route 2c-1 packets, one for each copy
- Discard packets at congested nodes
- If c packets of a request are successful, then the request is successful
How many attempts?
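
The attempt loop can be simulated in a few lines. This sketch makes simplifying assumptions: only contention at the destination is modeled (congestion along routing paths is ignored), a node counts as congested when it receives more than `threshold` packets in an attempt, and all packets arriving at a congested node are discarded. The names `serve_requests` and `destination` are hypothetical; `destination(r, i)` plays the role of the i-th fixed hash function applied to request r.

```python
from collections import Counter


def serve_requests(requests, destination, c, threshold, max_attempts=64):
    """Return the number of attempts needed to serve all requests, or None."""
    remaining = list(requests)
    if not remaining:
        return 0
    for attempt in range(1, max_attempts + 1):
        # Every remaining request sends 2c-1 packets, one per copy.
        targets = {r: [destination(r, i) for i in range(2 * c - 1)]
                   for r in remaining}
        load = Counter(node for nodes in targets.values() for node in nodes)
        # A request stays pending if fewer than c of its packets survive.
        remaining = [r for r in remaining
                     if sum(load[node] <= threshold for node in targets[r]) < c]
        if not remaining:
            return attempt
    return None


# Hypothetical copy locations for three requests with c = 2 (three copies each).
table = {"A": [0, 1, 2], "B": [0, 3, 4], "C": [3, 5, 6]}
attempts = serve_requests(["A", "B", "C"], lambda r, i: table[r][i],
                          c=2, threshold=1)
```

In this example A and C succeed in the first attempt, which removes the contention at nodes 0 and 3, so B succeeds in the second. Because the copy locations are fixed, served requests stop contributing load, which is what lets later attempts make progress.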

Robust Data Management
Answer: O(log n) attempts with congestion threshold O(polylog n).
Theorem 1: For any set of n lookup requests, one request per node, the lookup protocol can serve all requests in O(polylog n) communication rounds.
Theorem 2: The same holds for insert requests.

Conclusion
We presented a high-level solution for a scalable and robust DHT.
Open problems:
- Low-level attacks (DoS!)
- Adversary controls join-leave behavior of adversarial and honest nodes
- Elementary random number generator

Questions?

Hash table
[Figure: data items and sites placed on the interval from 0 to 1]