Presentation on theme: "Performance Isolation and Fairness for Multi-Tenant Cloud Storage" by David Shue*, Michael Freedman*, and Anees Shaikh✦ (*Princeton University, ✦IBM Research). Presentation transcript:

1 Performance Isolation and Fairness for Multi-Tenant Cloud Storage. David Shue*, Michael Freedman*, and Anees Shaikh✦ (*Princeton University, ✦IBM Research)

2 Setting: Shared Storage in the Cloud. [Figure: multiple tenants (Z, Y, T, F) send requests to shared cloud storage services (S3, EBS, SQS) backed by a shared key-value storage tier.]

3 Predictable Performance is Hard. Multiple co-located tenants ⇒ resource contention. [Figure: interleaved requests from tenants Z, Y, T, and F arriving at the shared key-value storage nodes.]

4 Predictable Performance is Hard. Multiple co-located tenants ⇒ resource contention; on a single "big iron" server, fair queuing can resolve it. [Figure: contending tenant requests passing through a fair queue at one large shared server.]

5 Predictable Performance is Hard. Multiple co-located tenants ⇒ resource contention. Distributed system ⇒ distributed resource allocation. [Figure: tenant requests spread across multiple storage servers.]

6 Predictable Performance is Hard. Multiple co-located tenants ⇒ resource contention. Distributed system ⇒ distributed resource allocation. [Figure: each tenant's keyspace is split into data partitions across servers, with a skewed popularity curve over the partitions of tenants Z, Y, T, and F.]

7 Predictable Performance is Hard. Multiple co-located tenants ⇒ resource contention. Distributed system ⇒ distributed resource allocation. Skewed object popularity ⇒ variable per-node demand. Disparate workloads ⇒ different bottleneck resources, e.g. 1 kB GETs (large reads), 10 B GETs (small reads), 1 kB SETs (large writes), 10 B SETs (small writes).
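
To see why skewed popularity yields uneven per-node demand, here is a small sketch (not from the talk; the object, partition, and node counts are arbitrary) that spreads a Zipfian workload over hashed partitions and sums the load that lands on each node:

```python
import hashlib
from collections import defaultdict

def zipf_weights(n_objects, s=1.0):
    """Unnormalized Zipfian popularity for object ranks 1..n_objects."""
    return [1.0 / (rank ** s) for rank in range(1, n_objects + 1)]

def per_node_demand(n_objects=10_000, n_partitions=64, n_nodes=8, total_rate=100_000):
    """Estimate per-node request rate when popular objects hash to fixed partitions."""
    weights = zipf_weights(n_objects)
    total = sum(weights)
    node_rate = defaultdict(float)
    for rank, w in enumerate(weights):
        key = f"obj-{rank}"
        # Hash key -> partition -> node (partitions striped across nodes).
        partition = int(hashlib.md5(key.encode()).hexdigest(), 16) % n_partitions
        node = partition % n_nodes
        node_rate[node] += total_rate * (w / total)
    return dict(sorted(node_rate.items()))

if __name__ == "__main__":
    rates = per_node_demand()
    print({node: round(r) for node, r in rates.items()})
    print("max/min node demand ratio:", round(max(rates.values()) / min(rates.values()), 2))
```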

8 Tenants Want System-wide Resource Guarantees. Multiple co-located tenants ⇒ resource contention. Distributed system ⇒ distributed resource allocation. Skewed object popularity ⇒ variable per-node demand. Disparate workloads ⇒ different bottleneck resources. Hard per-tenant rate limits (the example provisions 80, 120, 160, and 40 kreq/s across tenants Z, Y, F, and T) ⇒ lower utilization: demand_z = 120 kreq/s but rate_z = 80 kreq/s, while demand_f = 120 kreq/s and rate_f = 120 kreq/s.
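
A toy illustration of why hard caps waste capacity (Z and F follow the slide's example; the Y and T demands are invented): each tenant is held to min(demand, provisioned rate), so Z is throttled even while other tenants leave their provisioned capacity idle.

```python
def hard_limit_allocation(demand, provisioned):
    """Non-work-conserving: each tenant gets min(demand, provisioned cap)."""
    return {t: min(demand[t], provisioned[t]) for t in demand}

# Provisioned rates and demands in kreq/s; Z and F from the slide, Y and T made up.
provisioned = {"Z": 80, "Y": 120, "F": 160, "T": 40}
demand      = {"Z": 120, "Y": 60, "F": 120, "T": 20}

rates = hard_limit_allocation(demand, provisioned)
print(rates)                         # Z is throttled to 80 despite demanding 120
print("served:", sum(rates.values()), "of", sum(provisioned.values()), "provisioned")
```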

9 Pisces Provides Weighted Fair Shares. Per-tenant weights: w_z = 20%, w_y = 30%, w_f = 40%, w_t = 10%. In the example, demand_z = 30% of system capacity and rate_z = 30%; demand_f = 30% and rate_f = 30%. Max-min fair share ⇒ lower bound on system performance, despite resource contention, distributed allocation, skewed object popularity, and disparate bottleneck resources.
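
The weighted max-min share itself can be computed by water-filling. The sketch below uses the slide's weights, with the demands partly assumed, and shows tenant Z receiving its full 30% demand while tenant T, which demands more than its share, is capped at the redistributed remainder.

```python
def weighted_max_min(demand, weight, capacity):
    """Water-filling: tenants whose demand fits within their weighted share are
    satisfied; leftover capacity is re-split by weight among the rest."""
    alloc = {}
    active = set(demand)
    remaining = capacity
    while active:
        total_w = sum(weight[t] for t in active)
        share = {t: remaining * weight[t] / total_w for t in active}
        satisfied = {t for t in active if demand[t] <= share[t]}
        if not satisfied:
            alloc.update(share)        # everyone is bottlenecked: split by weight
            break
        for t in satisfied:
            alloc[t] = demand[t]
            remaining -= demand[t]
        active -= satisfied
    return alloc

# Weights from the slide; demands (as % of system capacity) partly assumed.
weight = {"Z": 0.2, "Y": 0.3, "F": 0.4, "T": 0.1}
demand = {"Z": 30, "Y": 25, "F": 30, "T": 40}    # T demands more than its share
print(weighted_max_min(demand, weight, capacity=100))
# -> {'Y': 25, 'F': 30, 'Z': 30, 'T': 15}: demands met where possible, T capped
```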

10 Pisces: Predictable Shared Cloud Storage.
Pisces: per-tenant max-min fair shares of system-wide resources (~ min guarantees with high utilization); arbitrary object popularity; different resource bottlenecks.
Amazon DynamoDB: per-tenant provisioned rates (~ rate limited, non-work-conserving); uniform object popularity; single resource (1 kB requests).

11 Predictable Multi-Tenant Key-Value Storage. [Figure: tenant A and tenant B VMs issue GET requests through request routers (RR) to storage nodes; a central controller coordinates partition placement (PP), weight allocation (WA), replica selection (RS), and fair queuing (FQ).]

12 Predictable Multi-Tenant Key-Value Storage. [Figure: the same architecture annotated with per-tenant weights: global Weight A and Weight B are split into per-node local weights W_A1, W_B1 on one storage node and W_A2, W_B2 on another.]
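
The W_A1/W_A2 notation is just a per-node split of each tenant's global weight. A minimal data-structure sketch (the class and method names are mine, not Pisces code):

```python
from dataclasses import dataclass, field

@dataclass
class TenantWeights:
    """Global share for one tenant plus its split across storage nodes."""
    global_weight: float                       # e.g. Weight A = 0.5 of the system
    local: dict = field(default_factory=dict)  # node id -> local weight (W_A1, W_A2, ...)

    def split_evenly(self, nodes):
        """Strawman: give every node an equal slice of the global weight."""
        self.local = {n: self.global_weight / len(nodes) for n in nodes}

tenant_a = TenantWeights(global_weight=0.5)
tenant_a.split_evenly(["node1", "node2"])
print(tenant_a.local)   # {'node1': 0.25, 'node2': 0.25}  -> W_A1, W_A2
```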

13 Strawman: Place Partitions Randomly. [Figure: the controller assigns tenant A and tenant B partitions to storage nodes at random, ignoring demand.]

14 Strawman: Place Partitions Randomly. [Figure: random placement lands too much popular data on one storage node, which becomes overloaded.]

15 Pisces: Place Partitions By Fairness Constraints. The controller collects per-partition tenant demand and bin-packs partitions onto storage nodes. [Figure: tenant A and tenant B partitions being redistributed by the controller.]

16 Pisces: Place Partitions By Fairness Constraints. The result is a feasible partition placement: each node has enough capacity to serve the demand of the partitions assigned to it.
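
As a rough sketch of what "bin-pack partitions by demand" could look like, here is a first-fit-decreasing heuristic with made-up demands and capacities; the real controller solves a constrained placement, so treat this as illustration only.

```python
def place_partitions(partition_demand, node_capacity):
    """Greedy bin packing: heaviest partitions first, each placed on the
    least-loaded node that still has enough headroom (feasibility check)."""
    load = {node: 0.0 for node in node_capacity}
    placement = {}
    for part, dem in sorted(partition_demand.items(), key=lambda kv: -kv[1]):
        candidates = [n for n in load if load[n] + dem <= node_capacity[n]]
        if not candidates:
            raise RuntimeError(f"no feasible node for partition {part}")
        node = min(candidates, key=lambda n: load[n])
        placement[part] = node
        load[node] += dem
    return placement, load

# Illustrative per-partition demands (kreq/s) and per-node capacities.
partition_demand = {"A1": 40, "A2": 25, "B1": 35, "B2": 20, "B3": 10}
node_capacity = {"node1": 70, "node2": 70}
placement, load = place_partitions(partition_demand, node_capacity)
print(placement)
print(load)     # no node exceeds its capacity -> feasible placement
```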

17 Strawman: Allocate Local Weights Evenly. Setting W_A1 = W_B1 and W_A2 = W_B2 on every node ignores where each tenant's demand actually falls; a node facing excessive demand from one tenant becomes overloaded.

18 Pisces: Allocate Local Weights By Tenant Demand. The controller computes each tenant's per-node over/under-provisioning mismatch (using an M/M/1 queuing-delay estimate), finds the maximum mismatch, and performs a reciprocal weight swap: starting from W_A1 = W_B1 and W_A2 = W_B2, one node shifts weight from B to A (A←B) while the other shifts weight from A to B (A→B), yielding W_A1 > W_B1 and W_A2 < W_B2.
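
A minimal sketch of the reciprocal weight swap idea, assuming M/M/1 delay estimates and a fixed swap step; the tenant names, demand numbers, and step size are invented for illustration.

```python
def mm1_delay(arrival_rate, service_rate):
    """M/M/1 mean delay; effectively infinite once a tenant's share is overloaded."""
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

def reciprocal_swap(weights, demand, node_capacity, tenants=("A", "B"), step=0.05):
    """Find where each tenant is worst off (largest latency mismatch) and swap a
    small amount of local weight reciprocally; each node's weights still sum to 1."""
    a, b = tenants
    mismatch = {}
    for node, cap in node_capacity.items():
        delay_a = mm1_delay(demand[a][node], cap * weights[node][a])
        delay_b = mm1_delay(demand[b][node], cap * weights[node][b])
        mismatch[node] = delay_a - delay_b          # > 0: A suffers more than B here
    give_to_a = max(mismatch, key=mismatch.get)     # node where A needs more weight
    give_to_b = min(mismatch, key=mismatch.get)     # node where B needs more weight
    if give_to_a != give_to_b:
        weights[give_to_a][a] += step
        weights[give_to_a][b] -= step
        weights[give_to_b][b] += step
        weights[give_to_b][a] -= step
    return weights

weights = {"node1": {"A": 0.5, "B": 0.5}, "node2": {"A": 0.5, "B": 0.5}}
demand  = {"A": {"node1": 70, "node2": 30}, "B": {"node1": 30, "node2": 70}}
print(reciprocal_swap(weights, demand, node_capacity={"node1": 100, "node2": 100}))
# -> A gains weight on node1 (where its demand is high), B gains weight on node2
```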

19 Strawman: Select Replicas Evenly. [Figure: the request router splits tenant A's GETs 50/50 across replicas even though the local weights differ (W_A1 > W_B1 on one node, W_A2 < W_B2 on the other).]

20 Pisces: Select Replicas By Local Weight. The request router detects local weight mismatch via request latency and skews replica selection accordingly, e.g. sending 60% of tenant A's GETs to the node where W_A1 > W_B1 and 40% to the node where W_A2 < W_B2, instead of an even 50/50 split.
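
A sketch of weight-sensitive replica selection at the request router: pick replicas in proportion to a local selection weight and nudge the weights using observed latency. Pisces' actual policy is FAST-TCP inspired and more careful; the EWMA, step size, and adjustment rule here are assumptions.

```python
import random

class ReplicaSelector:
    """Pick a replica in proportion to its selection weight; shift weight away
    from replicas whose recent latency suggests a smaller local fair share."""

    def __init__(self, replicas):
        self.weights = {r: 1.0 / len(replicas) for r in replicas}
        self.latency = {r: None for r in replicas}   # EWMA of observed latency

    def pick(self):
        replicas = list(self.weights)
        return random.choices(replicas, weights=[self.weights[r] for r in replicas])[0]

    def observe(self, replica, latency_s, alpha=0.2, step=0.02):
        prev = self.latency[replica]
        self.latency[replica] = latency_s if prev is None else (1 - alpha) * prev + alpha * latency_s
        known = {r: l for r, l in self.latency.items() if l is not None}
        if len(known) < 2:
            return
        # Move a little selection weight from the slowest replica to the fastest one.
        slow = max(known, key=known.get)
        fast = min(known, key=known.get)
        if slow != fast and self.weights[slow] > step:
            self.weights[slow] -= step
            self.weights[fast] += step

sel = ReplicaSelector(["node1", "node2"])
sel.observe("node1", 0.004)   # node1 is slower for this tenant
sel.observe("node2", 0.001)
print(sel.weights)            # selection skews toward node2 after one adjustment
```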

21 Strawman: Queue Tenants By a Single Resource. [Figure: tenant A is bandwidth-limited (large GETs) and tenant B is request-limited (small GETs); fair-sharing only one bottleneck resource (outbound bytes) misallocates the other (request processing).]

22 Pisces: Queue Tenants By Dominant Resource. Track a per-tenant resource vector (request processing, outbound bytes) and give each tenant a fair share of its dominant resource: the bandwidth-limited tenant is bottlenecked by outbound bytes, the request-limited tenant by request rate.
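
To illustrate the dominant-resource idea (not the actual token-based DRR scheduler the talk mentions), the sketch below tracks per-tenant usage vectors and always serves the backlogged tenant with the smallest weighted dominant share; the capacities and per-request costs are made up.

```python
def dominant_share(usage, capacity, weight):
    """Largest weighted share of any single resource this tenant has consumed."""
    return max(usage[r] / (capacity[r] * weight) for r in capacity)

def drfq_schedule(queues, costs, capacity, weights, n_requests=12):
    """Serve, at each step, the backlogged tenant with the smallest dominant share."""
    usage = {t: {r: 0.0 for r in capacity} for t in queues}
    order = []
    for _ in range(n_requests):
        backlogged = [t for t in queues if queues[t] > 0]
        if not backlogged:
            break
        tenant = min(backlogged,
                     key=lambda x: dominant_share(usage[x], capacity, weights[x]))
        for r, c in costs[tenant].items():
            usage[tenant][r] += c
        queues[tenant] -= 1
        order.append(tenant)
    return order, usage

capacity = {"req": 10_000.0, "out_bytes": 1_000_000.0}   # per-second capacities (assumed)
costs = {"A": {"req": 1, "out_bytes": 1000},             # 1 kB GETs: bandwidth-heavy
         "B": {"req": 1, "out_bytes": 10}}               # 10 B GETs: request-heavy
order, usage = drfq_schedule({"A": 10, "B": 10}, costs, capacity, {"A": 0.5, "B": 0.5})
print(order)   # the request-limited tenant is served far more often per unit of bandwidth
print(usage)
```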

23 Pisces Mechanisms Solve for Global Fairness.
- Partition Placement (PP): feasible partition placement under fairness and capacity constraints; global visibility; minutes timescale.
- Weight Allocation (WA): demand-driven weight allocation via maximum bottleneck flow weight exchange; global visibility; seconds timescale.
- Replica Selection (RS): weight-sensitive selection policy, FAST-TCP based; local visibility; per-request timescale.
- Fair Queuing (FQ): dominant resource fair shares via a DRR token-based DRFQ scheduler; local visibility; microseconds timescale.
Partition placement constrains the weight allocations, which in turn shape the replica selection policies.
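
The slide gives only the mechanisms, their visibility, and their timescales. As a rough illustration (structure and intervals assumed, not from the talk), the two global control loops might be driven like this, while replica selection and fair queuing act locally on every request:

```python
import time

class PiscesController:
    """Toy skeleton of the two global control loops (placement and weight
    allocation). Replica selection and fair queuing run locally per request
    and are not modeled here; the intervals below are illustrative only."""

    def __init__(self, place_partitions, allocate_weights,
                 placement_interval_s=300, weight_interval_s=10):
        self.place_partitions = place_partitions
        self.allocate_weights = allocate_weights
        self.placement_interval_s = placement_interval_s
        self.weight_interval_s = weight_interval_s
        self._last_placement = 0.0
        self._last_weights = 0.0

    def tick(self, now=None):
        now = time.monotonic() if now is None else now
        if now - self._last_placement >= self.placement_interval_s:
            self.place_partitions()      # minutes: global, demand-aware bin packing
            self._last_placement = now
        if now - self._last_weights >= self.weight_interval_s:
            self.allocate_weights()      # seconds: reciprocal weight exchange
            self._last_weights = now

ctrl = PiscesController(lambda: print("re-place partitions"),
                        lambda: print("re-allocate weights"))
ctrl.tick(now=1000.0)    # both loops fire on the first tick
ctrl.tick(now=1005.0)    # nothing due yet
ctrl.tick(now=1011.0)    # weight allocation fires again
```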

24 Evaluation
- Does Pisces achieve (even) system-wide fairness?
- Is each Pisces mechanism necessary for fairness?
- What is the overhead of using Pisces?
- Does Pisces handle mixed workloads?
- Does Pisces provide weighted system-wide fairness?
- Does Pisces provide local dominant resource fairness?
- Does Pisces handle dynamic demand?
- Does Pisces adapt to changes in object popularity?

26 Pisces Achieves System-wide Per-tenant Fairness. Setup: 8 tenants, 8 clients, 8 storage nodes; Zipfian object popularity distribution; ideal fair share 110 kreq/s (1 kB requests). The fairness metric is the Min-Max Ratio (MMR) = min tenant rate / max tenant rate, which lies in (0, 1]. Unmodified Membase achieves 0.57 MMR; Pisces achieves 0.98 MMR.
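
The MMR metric is simple enough to state as code; the helper and its example inputs below are illustrative only, not the paper's measurements.

```python
def min_max_ratio(rates):
    """Min-Max Ratio (MMR): min tenant rate / max tenant rate, in (0, 1]."""
    return min(rates) / max(rates)

# Made-up numbers: near-equal per-tenant rates give MMR close to 1,
# while a single starved tenant drags MMR down.
print(round(min_max_ratio([110, 108, 112, 109]), 2))   # 0.96
print(round(min_max_ratio([60, 140, 110, 95]), 2))     # 0.43
```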

27 Each Pisces Mechanism Contributes to System-wide Fairness and Isolation. [Figure: MMR under 1x and 2x demand as mechanisms are added. Unmodified Membase sits around 0.36-0.59 MMR; adding FQ, then PP, then WA, then RS raises MMR step by step, with the full stack reaching roughly 0.90-0.98. Annotations mark whether each configuration isolates tenants "usually", "sometimes", or "always".]

28 Pisces Imposes Low Overhead. [Figure: throughput comparison; Pisces adds less than 5% overhead, with a point of more than 19% marked for a contrasting case.]

29 Pisces Achieves System-wide Weighted Fairness. [Figure: a workload with 4 heavy-hitter tenants, 20 moderate-demand tenants, and 40 low-demand tenants; MMR values of 0.98, 0.89, 0.91, and 0.56 are marked across the compared configurations.]

30 Pisces Achieves Dominant Resource Fairness. [Figure: bandwidth (Mb/s) and GET request rate (kreq/s) over time. The 1 kB workload is bandwidth-limited and receives 76% of the bandwidth but only 24% of the request rate, while the 10 B workload is request-limited and receives 76% of the request rate.]

31 Pisces Adapts to Dynamic Demand. [Figure: per-tenant throughput over time for constant, bursty, and diurnal demand patterns; a tenant with 2x weight receives roughly twice the even share as demands shift.]

32 Conclusion. Pisces contributions:
- Per-tenant weighted max-min fair shares of system-wide resources with high utilization.
- Arbitrary object distributions.
- Different resource bottlenecks.
- A novel decomposition into 4 complementary mechanisms: Partition Placement (PP), Weight Allocation (WA), Replica Selection (RS), and Fair Queuing (FQ).

