A Public DHT Service Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu UC Berkeley and.

Slides:



Advertisements
Similar presentations
Dynamic Replica Placement for Scalable Content Delivery Yan Chen, Randy H. Katz, John D. Kubiatowicz {yanchen, randy, EECS Department.
Advertisements

Internet Indirection Infrastructure (i3 ) Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Surana UC Berkeley SIGCOMM 2002 Presented by:
CS 603 Process Synchronization: The Colored Ticket Algorithm February 13, 2002.
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric Petar Mayamounkov David Mazières A few slides are taken from the authors’ original.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Robert Morris Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT.
Project Byzantium Networking for the Zombie Apocalypse.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
1 CS 268: Lecture 22 DHT Applications Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California,
Outline for today Structured overlay as infrastructures Survey of design solutions Analysis of designs.
An Engineering Approach to Computer Networking
Presented by Elisavet Kozyri. A distributed application architecture that partitions tasks or work loads between peers Main actions: Find the owner of.
DTNs Delay Tolerant Networks. Fall, Kevin. Intel Research, Berkeley. SIGCOMM 2003 Aug25, A Delay- Tolerant Network Architecture for Challenged Internets.
CS 268: Active Networks Ion Stoica May 6, 2002 (* Based on David Wheterall presentation from SOSP ’99)
Vault: A Secure Binding Service Guor-Huar Lu, Changho Choi, Zhi-Li Zhang University of Minnesota.
Internet Indirection Infrastructure Ion Stoica UC Berkeley.
X Non-Transitive Connectivity and DHTs Mike Freedman Karthik Lakshminarayanan Sean Rhea Ion Stoica WORLDS 2005.
Resource Allocation in OpenHash: a Public DHT Service Sean Rhea with Brad Karp, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu.
OpenDHT: A Public DHT Service Sean C. Rhea UC Berkeley June 2, 2005 Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott.
CSE331: Introduction to Networks and Security Lecture 14 Fall 2002.
CSE 326: Data Structures Lecture #11 B-Trees Alon Halevy Spring Quarter 2001.
Lecture 9 Naming services. EECE 411: Design of Distributed Software Applications Logistics / reminders Subscribe to the mailing list! Assignment1 marks.
Fall 2006CS 5611 Peer-to-Peer Networks Outline Overview Pastry OpenDHT Contributions from Peter Druschel & Sean Rhea.
A Note on Finding the Nearest Neighbor in Growth-Restricted Metrics Kirsten Hildrum John Kubiatowicz Sean Ma Satish Rao.
CS218 – Final Project A “Small-Scale” Application- Level Multicast Tree Protocol Jason Lee, Lih Chen & Prabash Nanayakkara Tutor: Li Lao.
Fixing the Embarrassing Slowness of OpenDHT on PlanetLab Sean Rhea, Byung-Gon Chun, John Kubiatowicz, and Scott Shenker UC Berkeley (and now MIT) December.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
Wide-area cooperative storage with CFS
Peer-to-peer approaches for SIP Henning Schulzrinne Dept. of Computer Science Columbia University.
Lecture 10 Naming services for flat namespaces. EECE 411: Design of Distributed Software Applications Logistics / reminders Project Send Samer and me.
On-Demand Media Streaming Over the Internet Mohamed M. Hefeeda, Bharat K. Bhargava Presented by Sam Distributed Computing Systems, FTDCS Proceedings.
Internet Indirection Infrastructure (i3) Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Surana UC Berkeley SIGCOMM 2002.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Beyond Theory: DHTs in Practice CS Networks Sean C. Rhea April 18, 2005 In collaboration with: Dennis Geels, Brighten Godfrey, Brad Karp, John Kubiatowicz,
Roger ZimmermannCOMPSAC 2004, September 30 Spatial Data Query Support in Peer-to-Peer Systems Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang Computer.
Day15 IP Space/Setup. IP Suite of protocols –TCP –UDP –ICMP –GRE… Gives us many benefits –Routing of packets over internet –Fragmentation/Reassembly of.
Network Layer (3). Node lookup in p2p networks Section in the textbook. In a p2p network, each node may provide some kind of service for other.
Networked File System CS Introduction to Operating Systems.
Defense by Amit Saha March 25 th, 2004, Rice University ANTS : A Toolkit for Building and Dynamically Deploying Network Protocols David Wetherall, John.
Information-Centric Networks07a-1 Week 7 / Paper 1 Internet Indirection Infrastructure –Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh.
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Information-Centric Networks06a-1 Week 6 / Paper 1 Untangling the Web from DNS –Michael Walfish, Hari Balakrishnan and Scott Shenker –Networked Systems.
Distributed Session Announcement Agents for Real-time Streaming Applications Keio University, Graduate School of Media and Governance Kazuhiro Mishima.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Hongil Kim E. Chan-Tin, P. Wang, J. Tyra, T. Malchow, D. Foo Kune, N. Hopper, Y. Kim, "Attacking the Kad Network - Real World Evaluation and High.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
ECO-DNS: Expected Consistency Optimization for DNS Chen Stephanos Matsumoto Adrian Perrig © 2013 Stephanos Matsumoto1.
CSCI-375 Operating Systems Lecture Note: Many slides and/or pictures in the following are adapted from: slides ©2005 Silberschatz, Galvin, and Gagne Some.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Effective Replica Maintenance for Distributed Storage Systems USENIX NSDI’ 06 Byung-Gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon,
Protocol Requirements draft-bryan-p2psip-requirements-00.txt D. Bryan/SIPeerior-editor S. Baset/Columbia University M. Matuszewski/Nokia H. Sinnreich/Adobe.
Minimizing Churn in Distributed Systems P. Brighten Godfrey, Scott Shenker, and Ion Stoica UC Berkeley SIGCOMM’06.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
1 Open DHT: A Public DHT Service for Developers of Distributed Applications University of California, Irvine Presented By : Ala Khalifeh
Beyond Theory: DHTs in Practice CS Distributed Systems Sean C. Rhea April 18, 2005 In collaboration with: Dennis Geels, Brighten Godfrey, Brad Karp,
Nick McKeown CS244 Lecture 17 Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications [Stoica et al 2001]
Amazon Web Services. Amazon Web Services (AWS) - robust, scalable and affordable infrastructure for cloud computing. This session is about:
Presenter: Yue Zhu, Linghan Zhang A Novel Approach to Improving the Efficiency of Storing and Accessing Small Files on Hadoop: a Case Study by PowerPoint.
A Case Study in Building Layered DHT Applications
Internet Indirection Infrastructure (i3)
P2P-SIP Using an External P2P network (DHT)
Building a Database on S3
AWS Cloud Computing Masaki.
Indexing, Access and Database System Architecture
Kademlia: A Peer-to-peer Information System Based on the XOR Metric
Peer-to-Peer Networks
Presentation transcript:

A Public DHT Service Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu UC Berkeley and Intel Research August 23, 2005

Sean C. RheaOpenDHT: A Public DHT Service Two Assumptions 1.Most of you have a pretty good idea how to build a DHT 2.Many of you would like to forget

My talk today: How to avoid building one

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service IP Chord DHT CFS (MIT) Pastry DHT PAST (MSR/ Rice) Tapestry DHT OStore (UCB) Bamboo DHT PIER (UCB) CAN DHT pSearch (HP) Kademlia DHT Coral (NYU) Chord DHT i 3 (UCB) DHT Deployment Today Kademlia DHT Overnet (open) connectivity Every application deploys its own DHT (DHT as a library)

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service IP Chord DHT CFS (MIT) Pastry DHT PAST (MSR/ Rice) Tapestry DHT OStore (UCB) Bamboo DHT PIER (UCB) CAN DHT pSearch (HP) Kademlia DHT Coral (NYU) Chord DHT i 3 (UCB) Kademlia DHT Overnet (open) DHT connectivity indirection OpenDHT: one DHT, shared across applications (DHT as a service) DHT Deployment Tomorrow?

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Two Ways To Use a DHT 1.The Library Model –DHT code is linked into application binary –Pros: flexibility, high performance 2.The Service Model –DHT accessed as a service over RPC –Pros: easier deployment, less maintenance

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service The OpenDHT Service Bamboo [USENIX’04] nodes on PlanetLab –All in one slice, all managed by us Clients can be arbitrary Internet hosts –Access DHT using RPC over TCP Interface is simple put/get: –put(key, value) — stores value under key –get(key) — returns all the values stored under key Running on PlanetLab since April 2004 –Building a community of users

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service OpenDHT Applications ApplicationUses OpenDHT for Croquet Media Managerreplica location DOAindexing HIPname resolution DTN Tetherless Computing Architecturehost mobility Place Labrange queries QStreammulticast tree construction VPN Indexindexing DHT-Augmented Gnutella Clientrare object search FreeDBstorage Instant Messagingrendezvous CFSstorage i3i3redirection

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service OpenDHT Benefits OpenDHT makes applications –Easy to build Quickly bootstrap onto existing system –Easy to maintain Don’t have to fix broken nodes, deploy patches, etc. Best illustrated through example

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service An Example Application: The CD Database Compute Disc Fingerprint Recognize Fingerprint? Album & Track Titles

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service An Example Application: The CD Database Type In Album and Track Titles Album & Track Titles No Such Fingerprint

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Picture of FreeDB

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service A DHT-Based FreeDB Cache FreeDB is a volunteer service –Has suffered outages as long as 48 hours –Service costs born largely by volunteer mirrors Idea: Build a cache of FreeDB with a DHT –Add to availability of main service –Goal: explore how easy this is to do

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Cache Illustration DHT New Albums Disc Fingerprint Disc Info Disc Fingerprint

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Building a FreeDB Cache Using the Library Approach 1.Download Bamboo/Chord/FreePastry 2.Configure it 3.Register a PlanetLab slice 4.Deploy code using Stork 5.Configure AppManager to keep it running 6.Register some gateway nodes under DNS 7.Dump database into DHT 8.Write a proxy for legacy FreeDB clients

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Building a FreeDB Cache Using the Service Approach 1.Dump database into DHT 2.Write a proxy for legacy FreeDB clients We built it – Called FreeDB on OpenDHT (FOOD)

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service food.pl

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Building a FreeDB Cache Using the Service Approach 1.Dump database into DHT 2.Write a proxy for legacy FreeDB clients We built it – Called FreeDB on OpenDHT (FOOD) – Cache has  latency,  availability than FreeDB

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Talk Outline Introduction and Motivation Challenges in building a shared DHT –Sharing between applications –Sharing between clients Current Work Conclusion

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Is Providing DHT Service Hard? Is it any different than just running Bamboo? –Yes, sharing makes the problem harder OpenDHT is shared in two senses –Across applications  need a flexible interface –Across clients  need resource allocation

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Sharing Between Applications Must balance generality and ease-of-use –Many apps (FOOD) want only simple put/get –Others want lookup, anycast, multicast, etc. OpenDHT allows only put/get –But use client-side library, ReDiR, to build others –Supports lookup, anycast, multicast, range search –Only constant latency increase on average –(Different approach used by DimChord [KR04])

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Sharing Between Clients Must authenticate puts/gets/removes –If two clients put with same key, who wins? –Who can remove an existing put? Must protect system’s resources –Or malicious clients can deny service to others –The remainder of this talk

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Protecting Storage Resources Resources include network, CPU, and disk –Existing work on network and CPU –Disk less well addressed As with network and CPU: –Hard to distinguish malice from eager usage –Don’t want to hurt eager users if utilization low Unlike network and CPU: –Disk usage persists long after requests are complete Standard solution: quotas –But our set of active users changes over time

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Fair Storage Allocation Our solution: give each client a fair share –Will define “fairness” in a few slides Limits strength of malicious clients –Only as powerful as they are numerous Protect storage on each DHT node separately –Global fairness is hard –Key choice imbalance is a burden on DHT –Reward clients that balance their key choices

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Two Main Challenges 1.Making sure disk is available for new puts –As load changes over time, need to adapt –Without some free disk, our hands are tied 2.Allocating free disk fairly across clients –Adapt techniques from fair queuing

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Making Sure Disk is Available Can’t store values indefinitely –Otherwise all storage will eventually fill Add time-to-live (TTL) to puts –put (key, value)  put (key, value, ttl) –(Different approach used by Palimpsest [RH03])

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Making Sure Disk is Available TTLs prevent long-term starvation –Eventually all puts will expire Can still get short term starvation: time Client A arrives fills entire of disk Client B arrives asks for space Client A’s values start expiring B Starves

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Making Sure Disk is Available Stronger condition: Be able to accept r min bytes/sec new data at all times Reserved for future puts. Slope = r min Candidate put TTL size Sum must be < max capacity time space max 0 now

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Making Sure Disk is Available Stronger condition: Be able to accept r min bytes/sec new data at all times TTL size time space max 0 now TTL size time space max 0 now

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Formalize graphical intuition: f(  ) = B(t now ) - D(t now, t now +  ) + r min   To accept put of size x and TTL l: f(  ) + x < C for all 0 ≤  < l This is non-trivial to arrange –Have to track f(  ) at all times between now and max TTL? Can track the value of f efficiently with a tree –Leaves represent inflection points of f –Add put, shift time are O(log n), n = # of puts Making Sure Disk is Available

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Fair Storage Allocation Per-client put queues Queue full: reject put Not full: enqueue put Select most under- represented Wait until can accept without violating r min Store and send accept message to client The Big Decision: Definition of “most under-represented”

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Defining “Most Under-Represented” Not just sharing disk, but disk over time –1-byte put for 100s same as 100-byte put for 1s –So units are bytes  seconds, call them commitments Equalize total commitments granted? –No: leads to starvation –A fills disk, B starts putting, A starves up to max TTL time Client A arrives fills entire of disk Client B arrives asks for space B catches up with A Now A Starves!

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Defining “Most Under-Represented” Instead, equalize rate of commitments granted –Service granted to one client depends only on others putting “at same time” time Client A arrives fills entire of disk Client B arrives asks for space B catches up with A A & B share available rate

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Defining “Most Under-Represented” Instead, equalize rate of commitments granted –Service granted to one client depends only on others putting “at same time” Mechanism inspired by Start-time Fair Queuing –Have virtual time, v(t) –Each put gets a start time S(p c i ) and finish time F(p c i ) F(p c i ) = S(p c i ) + size(p c i )  ttl(p c i ) S(p c i ) = max(v(A(p c i )) - , F(p c i-1 )) v(t) = maximum start time of all accepted puts

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Fairness with Different Arrival Times

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Fairness With Different Sizes and TTLs

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Talk Outline Introduction and Motivation Challenges in building a shared DHT –Sharing between applications –Sharing between clients Current Work Conclusion

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Current Work: Performance Only 28 of 7 million values lost in 3 months –Where “lost” means unavailable for a full hour On Feb. 7, 2005, lost 60/190 nodes in 15 minutes to PL kernel bug, only lost one value

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Current Work: Performance Median get latency ~250 ms –Median RTT between hosts ~ 140 ms But 95th percentile get latency is atrocious –And even median spikes up from time to time

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service The Problem: Slow Nodes Some PlanetLab nodes are just really slow –But set of slow nodes changes over time –Can’t “cherry pick” a set of fast nodes –Seems to be the case on RON as well –May even be true for managed clusters (MapReduce) Modified OpenDHT to be robust to such slowness –Combination of delay-aware routing and redundancy –Median now 66 ms, 99th percentile is 320 ms (using 2X redundancy)

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Conclusion Focusing on how to use a DHT –Library model: flexible, powerful, often overkill –Service model: easy to use, shares costs –Both have their place, we’re focusing on the latter Challenge: Providing for sharing –Across applications  flexible interface –Across clients  fair resource sharing Up and running today

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service To try it out: (code at $./find-gateway.py | head -1 planetlab5.csail.mit.edu $./put.py Hello World 3600 Success $./get.py Hello World

August 23, 2005Sean C. RheaOpenDHT: A Public DHT Service Identifying Clients For fair sharing purposes, a client is its IP addr –Spoofing prevented by TCP’s 3-way handshake Pros: –Works today, no registration necessary Cons: –All clients behind NAT get only one share –DHCP clients get more than one share Future work: authentication at gateways