1
Power-aware Resource Allocation for CPU- and Memory-Intense Internet Services
Vlasia Anagnostopoulou (vlasia@cs.ucsb.edu), Susmit Biswas, Heba Saadeldeen, Ricardo Bianchini, Tao Yang, Diana Franklin, Frederic T. Chong
University of California, Santa Barbara
First E2DC Workshop, 08/05/2012
2
CPU- and Memory-Intense Internet Services
MapReduce, Hadoop, …
Latency-bound
Intense computation (=> high CPU utilization)
Petascale data
3
Datacenter clusters
4
Datacenter cluster operation
5
Challenges
Standard middleware algorithms are inefficient for CPU- and memory-intense internet services
Resource allocation operates at a fine granularity
– But is oblivious of the SLA
Power management is SLA-aware
– But is only driven by the CPU
– Coarse-grained
Request distribution does not operate at a resource granularity
6
Overview of solution
SLA-aware and fine-grained
Two steps:
– Configure states of servers (basic power-aware resource allocation)
– Allocate resources to servers (CPU and memory)
[Diagram] Standard Middleware: Resource Allocation, Power Management, Request distribution
[Diagram] Optimized Middleware: Power-aware Resource Allocation for CPU and memory, Adjusted Request distribution
7
Contents
Introduction
Power-aware Resource Allocation
– Basic
– With Support for Multiple Applications
– Adjusted Request Distribution
Methodology
Experiments
Conclusion
8
Basic Power-aware Resource Allocation
Configure server states:
– Active, off, low-power state
Problem of memory being inaccessible
– Internet services have high memory demand (for caching)
Solution: use a memory-active, low-power state (barely-alive)
– Memory is on
– Server is not operational, but memory can be remotely accessed
– Memory contributes to global cache
9
Details of Barely-alive state
10
Basic Power-aware Resource Allocation
Calculations:
Active servers to service load
– N_cpu_act = Load_demand / Cpu_capacity
Memory-active servers to satisfy memory demand
– Active or barely-alive
– N_mem_act = Memory_demand / Mem_capacity
Configure to maximize energy savings, or to maximize memory allocation
11
Example
N = 5 servers
Cpu-capacity = 1,000 conn.
Mem-capacity = 1GB
Load = 3,000 conn.
Target mem-alloc = 4GB
Maximize energy-savings:
Maximize memory alloc.: Mem. usage: 0.8GB/server
How to control the memory allocation?
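The two calculations and the example above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `configure_servers`, the rounding up with `ceil`, and the rule for splitting the remaining servers between barely-alive and off are all hypothetical readings of the slides (the configuration figures themselves did not survive in this transcript).

```python
import math

def configure_servers(n_servers, load, cpu_capacity, mem_demand, mem_capacity, objective):
    """Pick how many servers are active, barely-alive, and off.

    Assumption: fractional server counts are rounded up.
    Assumption: "energy" keeps only as many barely-alive servers as the
    memory demand requires; "memory" keeps every non-active server barely-alive.
    """
    n_cpu_act = math.ceil(load / cpu_capacity)        # N_cpu_act from the slide
    n_mem_act = math.ceil(mem_demand / mem_capacity)  # N_mem_act from the slide
    if n_cpu_act > n_servers or n_mem_act > n_servers:
        raise ValueError("demand exceeds cluster capacity")

    active = n_cpu_act
    if objective == "energy":
        barely_alive = max(0, n_mem_act - active)
    elif objective == "memory":
        barely_alive = n_servers - active
    else:
        raise ValueError("objective must be 'energy' or 'memory'")

    off = n_servers - active - barely_alive
    # Uniform memory usage across all servers whose memory is powered on.
    mem_per_server = mem_demand / (active + barely_alive)
    return {"active": active, "barely_alive": barely_alive,
            "off": off, "mem_per_server": mem_per_server}

# Numbers from the example slide: 5 servers, 1,000 conn. and 1GB per server,
# load of 3,000 conn., target memory allocation of 4GB.
energy = configure_servers(5, 3000, 1000, 4, 1, "energy")
memory = configure_servers(5, 3000, 1000, 4, 1, "memory")
```

Under these assumptions the "memory" objective spreads the 4GB target over all five servers, which gives the 0.8GB/server figure quoted on the slide.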
12
Memory Allocation for SLA
Two objectives:
1) Allocate memory for SLA
2) Share memory among services with SLA guarantees
– Must be fair; accept priority
– Guarantee minimum performance
Characteristics:
Uniform allocation per server (to avoid imbalance)
Memory performance monitoring capability which is SLA-aware
13
Memory allocation for SLA
Utilize stack algorithm [Mattson]
– Measures contribution of memory size to the hit-rate
– Hit-rate is used as proxy of performance
Server-level: calculate alloc for target hit-rate
– Attach SLA mapping
Cluster-level: calculate avg size for target hit-rate
How to allocate memory when constrained?
Size  Hits  Hit-ratio
1     6/9   66.7%
2     1/9   77.8%
3     0/9   77.8%
SLA mapping: #3, #2
Server  Size
1       2
2       2
…       …
5       2
Avg:    2
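As a rough illustration of the stack algorithm, the sketch below records the LRU stack depth of every access and turns the resulting histogram into a hit-ratio per cache size. The function names and the access trace are hypothetical; the trace was chosen only so that the output reproduces the 6/9, 1/9, 0/9 numbers in the table above, and sizes are counted in cached items rather than bytes.

```python
def stack_distance_histogram(trace):
    """Count hits per LRU stack depth (depth 1 = most recently used)."""
    stack = []       # index 0 holds the most recently used item
    histogram = {}   # depth -> number of accesses that hit at that depth
    for key in trace:
        if key in stack:
            depth = stack.index(key) + 1
            histogram[depth] = histogram.get(depth, 0) + 1
            stack.remove(key)
        stack.insert(0, key)  # a cold miss simply pushes the new key
    return histogram

def hit_ratio_curve(trace, max_size):
    """Hit-ratio an LRU cache of each size 1..max_size would achieve."""
    histogram = stack_distance_histogram(trace)
    curve, hits = {}, 0
    for size in range(1, max_size + 1):
        hits += histogram.get(size, 0)  # a cache of this size also hits at this depth
        curve[size] = hits / len(trace)
    return curve

def min_size_for_hit_ratio(curve, target):
    """Smallest size whose hit-ratio reaches the target, or None."""
    for size in sorted(curve):
        if curve[size] >= target:
            return size
    return None

# Hypothetical 9-access trace: six depth-1 hits, one depth-2 hit, two cold misses.
trace = ["a"] * 7 + ["b", "a"]
curve = hit_ratio_curve(trace, 3)
# Cluster-level step: average the per-server sizes (all equal to 2 in the slide).
server_sizes = [min_size_for_hit_ratio(curve, 0.75)] * 5
avg_size = sum(server_sizes) / len(server_sizes)
```

A single pass over the trace yields the whole curve, so one measurement serves every candidate memory size.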
14
SLA/Memory Sharing
Aggregate metric of performance
– Sum of allocations which yield performance closest to SLA
Linear optimization problem to maximize aggregate performance: at each step, allocate memory so as to minimize the aggregate distance to the SLA (dist_to_SLA_alloc), subject to:
– Memory capacity constraint
– Guaranteed min SLA for each app
Size  Hit-ratio  SLA
1     66.7%      #3
2     77.8%      #2
3     77.8%      #2
{app1, app2} => Target SLA {#2, #2}
dist_to_SLA_alloc = ∞
dist_to_SLA_alloc = 1
dist_to_SLA_alloc = 0
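One possible reading of this slide is sketched below. It is not the authors' formulation: the slide describes a linear optimization problem, whereas this sketch simply enumerates all allocations, and the distance definition (number of SLA levels each application falls short of its target), the minimum SLA of #3, and the capacities of 1, 3, and 4 memory units are assumptions picked for illustration. `best_allocation` is a hypothetical name.

```python
import itertools
import math

def best_allocation(tables, targets, min_levels, capacity):
    """Return (distance, allocation) minimizing the aggregate distance to SLA.

    tables[i] maps a memory size to the SLA level app i reaches with it
    (a lower level number is a better SLA). Assumed distance: the sum over
    apps of how many levels the app is below its target. Allocations that
    exceed capacity or violate an app's minimum SLA are infeasible.
    """
    best_dist, best_alloc = math.inf, None
    for alloc in itertools.product(*(sorted(t) for t in tables)):
        if sum(alloc) > capacity:
            continue  # memory capacity constraint
        levels = [t[size] for t, size in zip(tables, alloc)]
        if any(lvl > worst for lvl, worst in zip(levels, min_levels)):
            continue  # minimum SLA guarantee violated
        dist = sum(max(0, lvl - tgt) for lvl, tgt in zip(levels, targets))
        # Prefer a smaller distance; break ties by using less memory.
        if dist < best_dist or (dist == best_dist and sum(alloc) < sum(best_alloc)):
            best_dist, best_alloc = dist, alloc
    return best_dist, best_alloc

# Size -> SLA level table from the slide, shared by both applications.
table = {1: 3, 2: 2, 3: 2}
tables, targets, min_levels = [table, table], [2, 2], [3, 3]

unconstrained = best_allocation(tables, targets, min_levels, capacity=4)
constrained = best_allocation(tables, targets, min_levels, capacity=3)
infeasible = best_allocation(tables, targets, min_levels, capacity=1)
```

With these assumed capacities the three calls give distances of 0, 1, and ∞, matching the three dist_to_SLA_alloc values listed on the slide, though the slide does not say which scenarios those values correspond to.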
15
Request Distribution Processing…
16
Adjusted Request Distribution Processing…
17
Contents
Introduction
Power-aware Resource Allocation
– Basic
– With Support for Multiple Applications
– Adjusted Request Distribution
Methodology
– Simulator
– Traces
Experiments
Conclusion
18
Methodology
Datacenter-cluster simulator:
– 1 rack
– Trace-based functional simulator
Simulate all standard and proposed middleware algorithms
Traces:
– Internet-search “snippet” generator
19
Contents
Introduction
Power-aware Resource Allocation
– Basic
– With Support for Multiple Applications
– Adjusted Request Distribution
Methodology
– Simulator
– Traces
Experiments
– Basic Algorithm
– Shared Cluster
Conclusion
20
Experiments – Basic Algorithm
Evaluate various configuration objectives:
– Barely-alive: maximize memory allocation
– Mixed: maximize energy savings
Fix SLA, evaluate energy savings only. Also, evaluate residual memory.
SLA #1, #2, #3: response time degradation 1-2%, 2-3%, 3-4%
Aggressiveness of consolidation: 50, 70, 85%
System    Active  Off  Barely-alive
Baseline  Y       N    N
On/Off    Y       Y    N
BA        Y       N    Y
Mixed     Y       Y    Y
21
Results – basic algorithm
Mixed system has highest energy savings: up to 42% (24% over On/Off)
BA: up to 34% (20% over On/Off)
22
Results – basic algorithm
Mixed system is most stable
In the barely-alive system, savings depend on the SLA level; can push the parameter for savings aggressiveness
On/Off system savings are influenced by both parameters and degrade significantly at high SLA levels
23
Results – basic algorithm
BA: up to 7.5GB of extra memory, which can be allocated to another application, transitioned to low-power, etc.
24
Results – Cluster Sharing
25
Results – Cluster sharing
26
Contents
Introduction
Power-aware Resource Allocation
– Basic
– With Support for Multiple Applications
– Adjusted Request Distribution
Methodology
– Simulator
– Traces
Experiments
– Basic Algorithm
– Shared Cluster
Conclusion
27
Combine power management and resource allocation => power-aware resource allocation
SLA-driven, fine-grained management of datacenter clusters
– Performance guarantees + energy savings
Flexibility to apply different optimizations for different datacenter scenarios
Achieve deep energy savings, or the potential for more memory utility out of the cluster
Holistic design of middleware software
28
Thank you for your attention!!!
Questions?
Contact: vlasia@cs.ucsb.edu
URL: www.cs.ucsb.edu/~vlasia