OpenDHT: A Public DHT Service
Sean C. Rhea
UC Berkeley
June 2, 2005
Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu

Peer-to-Peer File Sharing
Very simple insight
– Most computers unused most of the time
Idea: harness this spare capacity to
– Quickly download music files [Napster, Gnutella]
– Search for aliens
– Make free long-distance phone calls [Skype]
Question: how to find desired resource(s)?
– Early approaches: scoped flooding
– Downsides: scalability, accuracy

A Better Search Facility: The Distributed Hash Table (DHT)
Same interface as a programmatic hash table,
– put(key, value) stores value under key
– get(key) returns the value(s) stored under key
But shared across many machines
Implemented via an overlay network
[Figure: put(k1, v1) stores (k1, v1) in the DHT; get(k1) returns v1]
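
The put/get interface above can be sketched in a few lines. This is a minimal in-process stand-in, not OpenDHT's implementation: the class name `ToyDHT`, the node names, and the rule "a key lives on the first node ID at or after hash(key)" are assumptions made for illustration, and nothing here goes over a network.

```python
import hashlib
from bisect import bisect_left


def key_id(key: str) -> int:
    """Map a key (or node name) to a 160-bit identifier with SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


class ToyDHT:
    """Several 'nodes' behind one shared put/get interface (hypothetical sketch)."""

    def __init__(self, node_names):
        # Nodes get identifiers on the same ring as the keys.
        self.ring = sorted((key_id(name), name) for name in node_names)
        self.store = {name: {} for name in node_names}

    def owner(self, key: str) -> str:
        """Node responsible for key: first node ID at or after hash(key)."""
        ids = [node_id for node_id, _ in self.ring]
        i = bisect_left(ids, key_id(key)) % len(self.ring)  # wrap around
        return self.ring[i][1]

    def put(self, key: str, value: str) -> None:
        """Store value under key; one key may hold several values."""
        self.store[self.owner(key)].setdefault(key, []).append(value)

    def get(self, key: str) -> list:
        """Return the value(s) stored under key."""
        return list(self.store[self.owner(key)].get(key, []))


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
dht.put("k1", "v1")
dht.put("k1", "v2")
print(dht.get("k1"))  # ['v1', 'v2']
```

The caller never names a machine: the key alone decides which node stores the values.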

DHTs and File Sharing: DHT Stores Pointers to Files
[Figure: a machine holding a file calls put(file, IP), storing a pointer to the file in the DHT; a downloader calls get(file), receives the IP, and transfers the file over HTTP]
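
A rough sketch of the pointer scheme in the figure, under stated assumptions: `LocalDHT` is a single-process stand-in for the shared DHT, the `peers` dictionary and its IP addresses are made up, and a dictionary read stands in for the HTTP transfer.

```python
from collections import defaultdict


class LocalDHT:
    """Single-process stand-in for the shared DHT (hypothetical)."""

    def __init__(self):
        self._table = defaultdict(list)

    def put(self, key, value):
        self._table[key].append(value)

    def get(self, key):
        return list(self._table.get(key, []))


# Hypothetical peers: IP address -> files that peer serves.
peers = {
    "10.0.0.5": {"song.mp3": b"audio bytes"},
    "10.0.0.9": {"song.mp3": b"audio bytes"},
}

dht = LocalDHT()

# Publish: each peer stores a pointer to itself, not the file contents.
for ip, files in peers.items():
    for name in files:
        dht.put(name, ip)


def download(name):
    """Get pointers from the DHT, then fetch from the first peer that has the file."""
    for ip in dht.get(name):
        data = peers.get(ip, {}).get(name)  # stands in for the HTTP transfer
        if data is not None:
            return ip, data
    return None


print(download("song.mp3"))
```

Only small pointers pass through the DHT; the bulk transfer happens directly between the two machines.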

DHTs and Spam Detection: Detecting Similar Messages
[Figure: each machine that receives “I love you!” calls put(hash(msg), IP); as more machines store the same hash, the verdict is: Something’s fishy!]
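
The counting idea in the figure might look like the sketch below. Assumptions: the module-level `table` stands in for the DHT, `THRESHOLD` is an invented cutoff, and SHA-1 of the exact text is used as hash(msg), so this toy version only catches identical messages rather than merely similar ones.

```python
import hashlib
from collections import defaultdict

table = defaultdict(set)  # stand-in for the DHT: key -> set of values


def put(key, value):
    table[key].add(value)


def get(key):
    return set(table.get(key, set()))


THRESHOLD = 3  # hypothetical: distinct receivers needed before a message is suspicious


def report(msg: str, receiver_ip: str) -> bool:
    """Record that receiver_ip saw msg; return True once it looks like bulk mail."""
    key = hashlib.sha1(msg.encode()).hexdigest()
    put(key, receiver_ip)
    return len(get(key)) >= THRESHOLD


print(report("I love you!", "10.0.0.1"))  # False
print(report("I love you!", "10.0.0.2"))  # False
print(report("I love you!", "10.0.0.3"))  # True: something's fishy
```

Storing receiver IPs in a set means one machine reporting the same message repeatedly does not push it over the threshold.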

More DHT Applications
Distributed Storage Systems
– CFS, HiveCache, PAST, Pastiche
– OceanStore / Pond
Content Distribution Networks / Web Caches
– Bslash, Coral, Squirrel
Indexing / Naming Systems
– Chord-DNS, CoDoNS, DOA, SFR
Internet Query Processors
– Catalogs, PIER
Communication Systems
– Bayeux, i3, MCAN, SplitStream

Some Areas of DHT Research
Better routing protocols
– One-hop, degree-optimal
Load balancing
– Non-uniform key distributions
Security
– Byzantine fault-tolerant routing
Data redundancy and fault tolerance
– Replication, erasure-coding
Stronger semantics
– Supporting read-modify-write

How Many DHTs Will There Be?
[Figure: separate DHTs for File Sharing and Spam Detection running over the same machines. Labels: Company Machine: Can’t Share Files; Owns Stock in Spam Company; Redundant Link; Unshared Links]

Benefits of Sharing a DHT
Amortizes costs across applications
– Maintenance bandwidth, connection state, etc.
Facilitates “bootstrapping” of new applications
– Working infrastructure already in place
Allows for statistical multiplexing of resources
– Takes advantage of spare storage and bandwidth
Facilitates upgrading existing applications
– “Share” DHT between application versions

Challenges in Sharing a DHT
Robustness
– Must be available 24/7
Shared Interface Design
– Should be general, yet easy to use
Resource Allocation
– Must protect against malicious/over-eager users
Economics
– What incentives are there to provide resources?

The DHT as a Service
[Figure: the DHT runs as a shared service, OpenDHT, used by separate Clients]
What is this interface?

The Traditional Interface: lookup
[Figure: lookup(k) is routed through the overlay toward key k]
On reaching the successor of k, message passed to an “upcall”
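
A minimal sketch of lookup-with-upcall, assuming a ring where the successor of k is the first node ID at or after k. The `Node` and `Overlay` classes and `spam_upcall` are hypothetical, and the multi-hop routing of a real overlay is collapsed into a single step.

```python
import hashlib
from bisect import bisect_left


def ident(text: str) -> int:
    """Hash text to a 160-bit identifier on the ring."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest(), "big")


class Node:
    def __init__(self, name, upcall):
        self.name = name
        self.id = ident(name)
        self.upcall = upcall  # application code installed on this node
        self.seen = set()     # per-node application state


class Overlay:
    def __init__(self, nodes):
        self.nodes = sorted(nodes, key=lambda n: n.id)
        self.ids = [n.id for n in self.nodes]

    def successor(self, k: int) -> Node:
        """First node whose ID is at or after k, wrapping around the ring."""
        return self.nodes[bisect_left(self.ids, k) % len(self.nodes)]

    def lookup(self, k: int, message: str) -> str:
        """Deliver message to the successor of k and run that node's upcall."""
        node = self.successor(k)
        return node.upcall(node, k, message)


def spam_upcall(node: Node, k: int, message: str) -> str:
    """Upcall for the spam example: has this node seen the key before?"""
    repeat = k in node.seen
    node.seen.add(k)
    return "I've seen this message before!" if repeat else "first sighting"


overlay = Overlay([Node(f"node-{i}", spam_upcall) for i in range(8)])
msg = "I love you!"
print(overlay.lookup(ident(msg), msg))  # first sighting
print(overlay.lookup(ident(msg), msg))  # I've seen this message before!
```

Note that the application logic runs on whichever node happens to be the successor of k, so every node in the overlay must carry the upcall code.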

DHTs and Spam Detection: Detecting Similar Messages
[Figure: put(hash(msg), IP) for “I love you!” arrives at the responsible node a second time. Upcall: I’ve seen this message before! Something’s fishy!]

Upcall Challenges
Distribution
– How do we get new upcall code to all nodes?
– Active networking experience is a warning…
Security
– How do we safely run untrusted clients’ upcalls?
[Figure: lookup(k) arrives at the successor of k. How did the upcall code get here?]

What about Put/Get?
Works great for some applications
– File sharing, for example
[Figure: DHTs and File Sharing: DHT Stores Pointers to Files. put(file, IP) stores a pointer to the file; get(file) returns the IP; xfer over HTTP]
What about applications with upcalls?
– Our spam detection application, for example
Idea: let application nodes run the upcalls
– Each node only runs upcalls for the applications that it’s participating in

Upcall Example
[Figure: File Sharing and Spam Detection nodes use OpenDHT through put/get. Each spam detection node announces: I can handle spam detection messages. A node that receives “I love you!” asks: Who’s handling hash(message)? It forwards “I love you!” to that node, which concludes: Something’s fishy!]
DHT keeps track of which nodes support which upcalls via Recursive Distributed Rendezvous (ReDiR)

ReDiR
Goal: Implement two functions using put/get:
– join(namespace, node)
– node = lookup(namespace, identifier)
[Figure: a tree of intervals at levels L0, L1, and L2 for H(namespace). Nodes A, B, C, D, and E join in turn and are recorded in the intervals that contain H(A), H(B), H(C), H(D), and H(E)]

ReDiR
Join cost:
– Worst case: O(log n) puts and gets
– Average case: O(1) puts and gets

ReDiR
[Figure: lookups for H(k1), H(k2), and H(k3) in the L0, L1, L2 tree. Each interval examined is marked “successor” or “no successor”; the lookup for H(k1) finds a successor in the first interval it examines, while the others examine more than one]

ReDiR
Lookup cost:
– Worst case: O(log n) gets
– Average case: O(1) gets
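
To make join and lookup over put/get concrete, here is a simplified rendezvous tree in the spirit of the L0/L1/L2 picture. It is a sketch under stated assumptions, not the ReDiR algorithm itself: the fixed depth `START_LEVEL`, binary branching, the tuple used as the DHT key, and above all the pruning rule (a joining node keeps walking toward the root only while it has the smallest identifier recorded in its interval) are choices made here so the code stays short and easy to check. The slides do not state the rule ReDiR uses.

```python
import hashlib
from collections import defaultdict

BITS = 160        # identifier length (SHA-1)
START_LEVEL = 2   # deepest level used in this sketch (assumption)

_table = defaultdict(list)  # stand-in for OpenDHT: key -> list of values


def put(key, value):
    if value not in _table[key]:
        _table[key].append(value)


def get(key):
    return list(_table.get(key, []))


def H(text: str) -> int:
    return int.from_bytes(hashlib.sha1(text.encode()).digest(), "big")


def tree_key(namespace: str, level: int, ident: int):
    """DHT key of the interval at `level` that contains ident (2**level intervals)."""
    return (namespace, level, ident >> (BITS - level))


def join(namespace: str, node: str) -> None:
    ident = H(node)
    for level in range(START_LEVEL, -1, -1):  # deepest level first, then toward L0
        key = tree_key(namespace, level, ident)
        stored = get(key)
        put(key, (ident, node))
        # Assumed pruning rule: stop once some recorded node in this
        # interval has a smaller identifier than the joining node.
        if any(other < ident for other, _ in stored):
            break


def lookup(namespace: str, identifier: int):
    """Return the joined node whose ID is the successor of identifier, or None."""
    for level in range(START_LEVEL, -1, -1):
        entries = get(tree_key(namespace, level, identifier))
        later = [e for e in entries if e[0] >= identifier]
        if later:               # "successor" found in this interval
            return min(later)[1]
        # "no successor" here: try the enclosing interval one level up
    root = get(tree_key(namespace, 0, 0))
    return min(root)[1] if root else None  # wrap around to the smallest ID


for name in ["A", "B", "C", "D", "E"]:
    join("spam-detection", name)

print(lookup("spam-detection", H("I love you!")))  # one of A..E
```

With this rule every interval records the smallest node of each of its halves, which is what lets a lookup that finds no successor at one level resolve it at the next level up. Most joins stop after the deepest level, and most lookups finish there as well.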

ReDiR Performance (On PlanetLab)

OpenDHT Design Summary
OpenDHT is a common infrastructure for
– Storage of values, pointers, etc.
– Organizing clients that handle application upcalls
Benefits:
– Amortizes maintenance costs across applications
– Facilitates “bootstrapping” of new applications
– Allows for statistical multiplexing of resources

Impact
Application              Uses OpenDHT for      Interface
Croquet Media Manager    replica location      put/get
DOA                      indexing              put/get
HIP                      name resolution       put/get
Tetherless Computing     host mobility         put/get
Place Lab                range queries         ReDiR
QStream                  mcast tree constr.    ReDiR
VPN Index                indexing              put/get
FreeDB                   storage               put/get
Instant Messaging        rendezvous            put/get
CFS                      storage               put/get
i3                       redirection           ReDiR

Future Work
OpenDHT makes a great common substrate for:
– Soft-state storage
– Naming and rendezvous
Many P2P applications also need to:
– Traverse NATs
– Redirect packets within the infrastructure (as in i3)
– Refresh puts while intermittently connected
All of these can be implemented with upcalls
– Who provides the machines that run the upcalls?

Future Work
We don’t want to add upcalls to the core DHT
– Keep the main service simple, fast, and robust
Can we build a separate upcall service?
– Some other set of machines organized with ReDiR
– Security: can only accept incoming connections, can’t write to local storage, etc.
This should be enough to implement
– NAT traversal, reput service
– Some (most?) packet redirection
What about more expressive security policies?

For more information, see