Resource Allocation in OpenHash: a Public DHT Service Sean Rhea with Brad Karp, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu.

1 Resource Allocation in OpenHash: a Public DHT Service Sean Rhea with Brad Karp, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu

2 Introduction
In building OceanStore, worked on Tapestry
– Found Tapestry problem harder than expected
– Main problem: handling churn
Built Bamboo to be churn-resilient from start
– Working by 6/2003
– Rejected from NSDI in 9/2003
– Released code in 12/2003, about 10 groups using
– Accepted to USENIX, 6/2004

3 Introduction (con’t.)
Intended Bamboo to be general, reusable
– Supports Common API for DHTs
– Tens of DHT applications proposed in literature
– Still very few in common use. Why?
One possible barrier: deployment
– Need access to machines; not everyone is on PlanetLab
– Must monitor, restart individual processes
– Takes about an hour/day minimum right now

4 Simple DHT Applications
Many uses of DHTs very simple: just put/get
– Don’t use Common API [Dabek et al.]
– No routing, no upcalls, etc.
Examples:
– Dynamic DNS
– FreeDB
In general: use DHT as highly available cache or rendezvous service
Should be able to share a single DHT deployment

5 Sophisticated DHT Applications
Other functionality of DHTs is lookup
– Map identifiers to application nodes efficiently
– Used by most sophisticated applications: i3, OceanStore, SplitStream
Can implement lookup on put/get
– Algorithm called ReDiR, IPTPS paper this year
Sophisticated applications could also share a single DHT deployment

6 OpenHash: a Public DHT Service
Idea: public DHT to amortize deployment effort
– Very low barrier to entry for simple applications
– Amortize bandwidth cost for sophisticated apps
Challenges
– Economics
– Security
– Resource allocation

7 Overview
Introduction
OpenHash interface, assumptions
Resource allocation
– Goals/Problem formalization
– Rate-limiting puts
– Fair sharing
Discussion

8 OpenHash Interface/Assumptions
Want to keep things simple for clients
– Remember goal: low barrier to entry
Simple put/get
– put (key, value, time-to-live)
– get (key)
Service contract:
– Puts accepted/rejected immediately (not queued)
– Once accepted, put values available for whole TTL
– Predictable, zero-effort availability for clients
– After that, will be thrown out by DHT
– Easy garbage collection, also valuable for some apps
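The service contract above can be pictured with a toy in-memory mock; the class and method names here are illustrative only (a real OpenHash node is a network service, not a local object):

```python
import time


class OpenHashMock:
    """Toy in-memory sketch of the put/get contract with TTLs."""

    def __init__(self):
        self.store = {}  # key -> (value, expiry time)

    def put(self, key, value, ttl):
        # A real node accepts or rejects immediately; this mock always accepts.
        self.store[key] = (value, time.time() + ttl)
        return True

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if time.time() >= expiry:
            # TTL elapsed: the DHT garbage-collects the value.
            del self.store[key]
            return None
        return value
```

Note how the TTL makes garbage collection trivial: a value that has outlived its TTL is simply dropped on the next access.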

9 Resource Allocation Introduction
Problem: disk space is limited
– If service popular, may exhaust
– Malicious clients might exhaust on purpose
Rough goal: every client gets fair share of store
– Ideally, algorithm should be work-conserving
Example:
– Three clients: A, B, and C; 10 GB of total space
– A and B want 1 GB each, C wants 20 GB
– A and B should get 1 GB each; C should get 8 GB

10 Problem Simplification
For now, shares calculated per-DHT node
– Global fair sharing saved for future work
Clients that balance puts won’t notice a problem
– Most DHT applications already balance puts
– Apps that can choose their keys can do even better
Side benefit: encourages balancing puts
– Mitigates need for load balancing in DHT
– Let the users handle load balancing
– Easier for us to implement!

11 Problem Formalization (First Try)
C – total available storage
s_i – storage desired by client i, S = Σ s_i
s_fair – fair share such that C = Σ min(s_i, s_fair)
g_i – storage granted to client i, G = Σ g_i
Goals
– Fairness: ∀i, g_i = min(s_i, s_fair)
– Utilization: G = min(C, S)
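The grants g_i = min(s_i, s_fair) can be computed with a simple water-filling pass over the demands; this sketch (function name assumed) reproduces the earlier example of 10 GB shared by demands of 1, 1, and 20 GB:

```python
def fair_shares(capacity, demands):
    """Water-filling: grant each client min(s_i, s_fair) so that
    the total granted equals min(C, S)."""
    grants = [0.0] * len(demands)
    # Visit clients from smallest demand to largest.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    remaining = capacity
    for rank, i in enumerate(order):
        # Equal split of the remaining capacity among remaining clients;
        # a client wanting less than this split just takes its demand.
        split = remaining / (len(order) - rank)
        grants[i] = min(demands[i], split)
        remaining -= grants[i]
    return grants
```

With capacity 10 and demands [1, 1, 20] this yields grants of [1, 1, 8], matching the example, and it is work-conserving: whatever the small clients leave unused flows to the large one.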

12 Problem Formalization (Second Try)
Previous version didn’t account for time
– Can only remove stored values as TTLs expire
– As such, can only adapt so quickly
– Before accepting one put, another must expire
Add goal: always accept puts at rate ≥ R
– Prefer puts from underrepresented clients
– Intuition: R bounds time it takes to correct unfairness
New questions:
– How to guarantee space frees up at rate ≥ R?
– How to divide R among clients?

13 Overview
Introduction
OpenHash interface, assumptions
Resource allocation
– Goals/Problem formalization
– Rate-limiting puts
– Fair sharing
Discussion

14 Accepting At Rate ≥ R
S(t) – total data stored at time t
A(t1, t2) – data added to system in [t1, t2)
D(t1, t2) – data freed in [t1, t2)
For adaptivity, need: A(t, t+Δt) ≥ R · Δt
Capacity limit: S(t) + A(t, t+Δt) - D(t, t+Δt) ≤ C
– Rearrange: C + D(t, t+Δt) - S(t) ≥ A(t, t+Δt)
Combined with top eqn: C + D(t, t+Δt) - S(t) ≥ R · Δt
– Rearrange: D(t, t+Δt) ≥ R · Δt - C + S(t)
Result: can accept any put that won’t make us violate this equation at any point in the future
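A direct, if inefficient, way to apply the result is to re-check the inequality at each expiry point of the data already on disk. The helper below is only a sketch under that simplification (names invented; it considers the current expiry schedule and ignores puts accepted after `now`):

```python
def can_accept(stored, size, ttl, now, C, R):
    """Would accepting a put of `size` bytes with `ttl` keep
    D(now, now+dt) >= R*dt - C + S(now) up to the last expiry?
    stored: list of (size, expiry_time) pairs already on disk."""
    puts = sorted(stored + [(size, now + ttl)], key=lambda p: p[1])
    S = sum(sz for sz, _ in puts)
    if S > C:
        return False  # would exceed capacity outright
    freed = 0.0
    for sz, expiry in puts:
        # Just before `expiry`, only data expiring earlier has been freed;
        # that is the tightest point, since D is flat between expiries
        # while the right-hand side grows linearly.
        if freed < R * (expiry - now) - C + S:
            return False
        freed += sz
    return True
```

For example, with C = 100, R = 1, and 90 bytes already stored that expire only at t = 100, a 10-byte put must be rejected: accepting it would fill the disk with nothing freeing up soon enough to sustain rate R.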

15 Implementing Rate Limiting
Before accepting put, must check D(t, t+Δt)
– Can we check this efficiently?
Easy, assuming all puts have same TTL
– Can implement using a virtual “pipe”
– Pipe is TTL long, total capacity C
– New puts go into pipe, expire on exit
– Can easily show pipe is optimal for this case
With varying TTLs, problem harder
– Puts with short TTLs expire in middle of pipe
– Bin-packing problem on new puts: find latest spot in pipe that satisfies desired size and TTL
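For the uniform-TTL case, the virtual pipe can be sketched in a few lines (illustrative class; a real node would also persist the queue). Because every put exits exactly TTL seconds after it enters, a FIFO of (expiry, size) pairs plus a running byte count suffices:

```python
from collections import deque


class UniformTTLPipe:
    """Sketch of the uniform-TTL 'pipe': a put occupies pipe capacity
    from acceptance until its TTL expires, in strict FIFO order."""

    def __init__(self, capacity, ttl):
        self.capacity = capacity
        self.ttl = ttl
        self.in_pipe = deque()  # (expiry_time, size), ordered by expiry
        self.used = 0.0

    def _drain(self, now):
        # Puts exit the pipe as their TTLs expire.
        while self.in_pipe and self.in_pipe[0][0] <= now:
            _, size = self.in_pipe.popleft()
            self.used -= size

    def offer(self, size, now):
        """Accept the put iff it fits in the pipe right now."""
        self._drain(now)
        if self.used + size > self.capacity:
            return False
        self.in_pipe.append((now + self.ttl, size))
        self.used += size
        return True
```

With varying TTLs this FIFO invariant breaks, which is what turns the check into the bin-packing problem the slide mentions.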

16 Overview
Introduction
OpenHash interface, assumptions
Resource allocation
– Goals/Problem formalization
– Rate-limiting puts
– Fair sharing
Discussion

17 Choosing Puts for Fair Sharing
Assume can accept new puts at rate ≥ R
– How do we divide it up between clients?
Unlike fair queuing, two competing goals:
1. Want to make decisions (accept/reject) quickly
– In FQ, may queue for a long time before forwarding
2. Suffer consequences of decisions for full TTL
– In FQ, only interested in fairness over short window
But one big advantage: long history
– Remember all puts whose TTLs haven’t expired

18 The Rate-Based Approach
Accept based on recent put rates
– Already storing all puts, so also store rates
– (Could estimate these as in Approximate Fair Dropping.)
– Basically, fair share the input rate R
Pros:
– Easy to implement
– If all clients put at uniform rates, gives fair stores
Cons:
– To get fair share, must put at uniform rate
– What about bursty clients (avg. rate << max. rate)?
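A minimal sketch of the rate-based idea, assuming an exponentially weighted moving average as the rate estimator (the smoothing constant and all names here are invented for illustration, not taken from the talk):

```python
class RateBasedAllocator:
    """Sketch: track each client's recent put rate with an EWMA and
    prefer the client currently putting most slowly."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha   # EWMA smoothing weight (assumed value)
        self.rate = {}       # client -> smoothed bytes/sec
        self.last = {}       # client -> time of previous put

    def observe(self, client, size, now):
        # Fold the instantaneous rate of this put into the average.
        dt = now - self.last.get(client, now - 1.0)
        inst = size / dt if dt > 0 else float("inf")
        old = self.rate.get(client, 0.0)
        self.rate[client] = (1 - self.alpha) * old + self.alpha * inst
        self.last[client] = now

    def choose(self, clients):
        """Among clients with pending puts, pick the lowest-rate one."""
        return min(clients, key=lambda c: self.rate.get(c, 0.0))
```

This illustrates the con on the slide as well: a bursty client's smoothed rate spikes during its burst, so it loses out even if its long-term average is modest.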

19 The Storage-Based Approach
Accept puts based on amount of storage used
– Keep counters of storage used by each client
– Prefer new puts from clients with less data on disk
Pros:
– Also easy to implement
– Gives fair stores regardless of uniformity of client put rates
Cons:
– Over-represented clients block on under-represented ones
– Could be very disruptive as new clients enter system
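The storage-based scheme reduces to a per-client byte counter; a sketch (names illustrative):

```python
class StorageBasedAllocator:
    """Sketch: prefer puts from the client with the least data on disk."""

    def __init__(self):
        self.usage = {}  # client -> bytes currently stored

    def choose(self, candidates):
        """candidates: list of (client, size) pending puts.
        Pick the put from the least-represented client."""
        return min(candidates, key=lambda c: self.usage.get(c[0], 0))

    def accept(self, client, size):
        self.usage[client] = self.usage.get(client, 0) + size

    def expire(self, client, size):
        # Called when a stored value's TTL runs out.
        self.usage[client] -= size
```

The blocking con is visible here too: a client far above its share is never chosen until enough of its data expires, and a brand-new client (counter at zero) wins every decision for a while.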

20 The Commitment-Based Approach
Base fairness around “commitments”
– How many bytes stored for how much more time
– New bytes entail more future commitment than old
Pros:
– Better at bursts than rate-based approach
– Better at not blocking over-represented clients than storage-based approach
Cons:
– Hard to think about in detail, hard to implement?
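One plausible reading of “commitment” is bytes multiplied by remaining TTL, summed over a client's live puts; the slide leaves the metric open, so this is only a guess at it, not the talk's definition:

```python
def commitment(puts, now):
    """Guessed commitment metric: byte-seconds still owed to a client.
    puts: list of (size, expiry_time) for that client's stored values."""
    return sum(size * max(0.0, expiry - now) for size, expiry in puts)
```

Under this metric a value near the end of its TTL counts for almost nothing, so an over-represented client whose data is about to expire is not blocked the way a raw byte counter would block it.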

21 Related Work
Various fair queuing techniques
– Standard FQ
– Approximate Fair Dropping
– CSFQ
Other DHT work
– Palimpsest
Other networking work
– Internet backplane

22 Discussion
What is the optimal rate limiting algorithm?
– How close do our various schemes come to it?
What’s the right model for sharing?
– Rate-based approach?
– Storage-based approach?
– Commitment-based approach?
– Some hybrid?
– Lottery Scheduling?
What other models make sense?
– Palimpsest?

