Multi-mode Energy Management for Multi-tier Server Clusters Tibor Horvath Kevin Skadron University of Virginia PACT 2008.


1 Multi-mode Energy Management for Multi-tier Server Clusters Tibor Horvath Kevin Skadron University of Virginia PACT 2008

2 2 Introduction ● Well-known problem: data center energy use ● Energy is one of the largest expenses for site operators ● 1.2% of U.S. energy use attributed to servers (estimate for 2005)

3 3 Opportunities ● Power usage effectiveness (PUE) ● Overhead power for non-IT equipment ● EPA est. average 1.96; Google average 1.21 ● Energy proportionality of hardware ● Conversion losses, idle power, leakage ● Ideally, power should be proportional to load ● Compute capacity overprovisioning ● Static provisioning for peak expected load ● Ideally, capacity should track actual load

4 4 Opportunities ● Power usage effectiveness (PUE) ● Overhead power for non-IT equipment ● EPA est. average 1.96; Google average 1.21 ● Energy proportionality of hardware ● Conversion losses, idle power, leakage ● Ideally, power should be proportional to load ● Compute capacity overprovisioning ←Our focus ● Static provisioning for peak expected load ● Ideally, capacity should track actual load

5 5 Cluster Energy Management ● A run-time solution that dynamically: 1. estimates demand 2. reconfigures the cluster to adjust capacity – activates a sufficient number of nodes – may use dynamic voltage scaling (DVS) on activated nodes 3. shuts down inactive (idle) nodes – may leave some spare nodes turned on to absorb load bursts ● Reduces waste from overprovisioning
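
The three steps on this slide can be sketched as a single planning function. This is only an illustration, not the authors' implementation: the function name, the parameters, and the assumption that demand and per-node capacity share one unit are all hypothetical, and the demand estimate from step 1 is taken as an input.

```python
import math


def plan_cluster(demand, node_capacity, spare_nodes, cluster_size):
    """Return (active, spare, off) node counts for one control period.

    All parameters are hypothetical; demand and node_capacity must use the
    same unit (for example requests per second).
    """
    # Step 2: activate enough nodes to cover the estimated demand.
    active = min(cluster_size, math.ceil(demand / node_capacity))
    # Keep a few idle nodes powered on to absorb load bursts.
    spare = min(spare_nodes, cluster_size - active)
    # Step 3: every remaining node can be shut down.
    off = cluster_size - active - spare
    return active, spare, off
```

For a 12-node cluster, `plan_cluster(750, 200, 1, 12)` would keep four nodes active, one spare powered on, and shut down the other seven.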

6 6 Multi-mode Energy Management ● Manages energy use of both the active and the inactive portion of the cluster ● Exploits multiple dynamic power management (DPM) techniques: ● DVS for the active nodes ● System sleep states for the inactive nodes – Various power levels and wakeup latencies ● Significant additional energy reduction possible ● Need to optimize capacity in each sleep state to maintain service responsiveness

7 7 System Model ● Cluster running a multi-tier service ● Each tier a different application ● Service level agreement: ● Average end-to-end latency ● Assumptions: ● Homogeneous tiers – Uniform CPU frequencies ● Perfect load balancing within tiers – Uniform utilization [Figure: a request passes through Tier 1, Tier 2, ..., Tier s and the response returns; latency is measured end to end]

8 8 Motivating Example ● An SLA violation is detected [Figure: cluster tiers and spare nodes; latency target exceeded]

9 9 Motivating Example ● An SLA violation is detected ● Optimal reaction? ● Activate spares? ● Reallocate among tiers? ● Increase CPU frequencies? [Figure: cluster tiers and spare nodes; latency target exceeded]

10 10 Motivating Example ● An SLA violation is detected ● Optimal reaction? ● Activate spares? ● Reallocate among tiers? ● Increase CPU frequencies? ● Predict system behavior for different feasible cluster configurations ● Machine power model ● Tier latency model [Figure: cluster tiers and spare nodes; latency target exceeded]

11 11 Power Model ● Mainly determined by CPU frequency ($f$) and utilization ($U$) ● Equation obtained from linear approximation: $P(f,U) = a_3 f U + a_2 f + a_1 U + a_0$ ● Parameters $a_i$ calculated using curve fitting ● Average estimation error: 1% ($R^2 = 0.99$)
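
A minimal sketch of how such a model could be fitted by ordinary least squares. The slide does not say which curve-fitting method was used, so the normal-equation solver here is an assumption, and the coefficient values are invented for illustration, not measured. Only the four frequencies come from the testbed description.

```python
def power(f, u, a):
    """Evaluate P(f, U) = a3*f*U + a2*f + a1*U + a0, with a = (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    return a3 * f * u + a2 * f + a1 * u + a0


def fit_power_model(samples):
    """Least-squares fit of (a0, a1, a2, a3) from (f, U, watts) samples."""
    n = 4
    # Build the normal equations (X^T X) a = X^T y for features [1, U, f, f*U].
    xtx = [[0.0] * n for _ in range(n)]
    xty = [0.0] * n
    for f, u, watts in samples:
        row = [1.0, u, f, f * u]
        for i in range(n):
            xty[i] += row[i] * watts
            for j in range(n):
                xtx[i][j] += row[i] * row[j]
    # Solve with Gauss-Jordan elimination and partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return tuple(aug[i][n] / aug[i][i] for i in range(n))


# Synthetic samples generated from made-up coefficients (watts), for illustration only.
assumed = (60.0, 10.0, 5.0, 20.0)
samples = [(f, u, power(f, u, assumed))
           for f in (1.0, 1.8, 2.0, 2.2)   # GHz, the testbed's frequency steps
           for u in (0.0, 0.5, 1.0)]       # utilization
fitted = fit_power_model(samples)
```

With real measurements the samples would be (frequency, utilization, measured watts) triples, and the fit needs at least four samples that vary both frequency and utilization.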

12 12 Multi-mode Energy Components ● Total cluster energy use: $E_{total} = E_{active} + E_{sleep} + E_{transition}$ ● Active portion is dictated by load ● Sleeping portion can be managed separately ● Transition energy is insignificant for typical Web workloads ● Major fluctuations are long-term (daily, weekly, etc.) ● Short bursts can be filtered out

13 13 Active Energy Optimization ● Constrained optimization problem ● Minimize total machine power by setting – Tier sizes ($m_i$) – Tier CPU frequencies ($f_i$) ● Subject to – Sum of tier latencies ≤ Target latency from SLA – Sum of tier sizes ≤ Cluster size ● Solution: necessary condition: for all $i, j$: $G(m_i, f_i) = G(m_j, f_j)$ ● Can be used to limit search space
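
Written out, the problem on this slide might look as follows. The notation is an assumption beyond $m_i$, $f_i$ and the power model $P(f, U)$: $s$ is the number of tiers, $U_i$ the utilization of tier $i$, $L_i$ the predicted latency of tier $i$, $L_{SLA}$ the target latency, and $M$ the cluster size. The slides do not define $G$, so it is not expanded here.

```latex
\begin{aligned}
\min_{m_1,\dots,m_s,\; f_1,\dots,f_s} \quad & \sum_{i=1}^{s} m_i \, P(f_i, U_i) \\
\text{subject to} \quad & \sum_{i=1}^{s} L_i(m_i, f_i) \le L_{SLA}, \\
& \sum_{i=1}^{s} m_i \le M .
\end{aligned}
```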

14 14 Sleep Energy Optimization ● Sleep states range from On to Soft-Off ($0..n$) ● Power draw: $p_i \le p_{i-1}$ ● Wakeup latency: $\omega_i \ge \omega_{i-1}$ ● Control tradeoff between responsiveness and energy ● Maximum accommodated load increase rate (MALIR, $\sigma$) ● Another constrained optimization problem ● Minimize spare machine power by selecting sleep states ● Subject to MALIR constraint – A feasible wakeup schedule must always exist to meet load increase

15 15 Sleep Energy Optimization ● Minimum feasible power occurs when the feasible wakeup schedule is tight ● All sleeping nodes have to be woken up at $t_0$ ● Optimal number of spare servers for sleep state $i$: $S_i = \sigma(\omega_{i+1} - \omega_i) / f_{max}$ [Figure: demand $d(t) = d(t_0) + \sigma(t - t_0)$ plotted against stepwise capacity $c(t_0 + \omega_i) = c(t_0 + \omega_{i-1}) + S_i f_{max}$; time axis marked at $t_0$, $t_0 + \omega_1$, $t_0 + \omega_2$, $t_0 + \omega_3$; capacity steps $S_0 f_{max}$, $S_1 f_{max}$, $S_2 f_{max}$]
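
The spare-server formula can be evaluated directly, as in the sketch below. The function names, the example wakeup latencies, and the value of the load increase rate are hypothetical. The code also assumes that demand and capacity are expressed in the same frequency-based unit as $f_{max}$, which is what the formula implies, and it leaves the deepest state without a limit because the formula needs a deeper successor.

```python
def spare_servers(sigma, wakeup_latencies, f_max):
    """S_i = sigma * (w[i+1] - w[i]) / f_max for each state i with a deeper successor.

    wakeup_latencies is ordered from state 0 (On, latency 0) to the deepest state.
    """
    return [
        sigma * (wakeup_latencies[i + 1] - wakeup_latencies[i]) / f_max
        for i in range(len(wakeup_latencies) - 1)
    ]


def capacity_steps(spares, f_max):
    """Cumulative capacity added once states 0..i have all woken up."""
    total, steps = 0.0, []
    for s in spares:
        total += s * f_max
        steps.append(total)
    return steps


# Hypothetical numbers: latencies in seconds for On and three sleep states,
# sigma in GHz of extra demand per second, f_max in GHz.
latencies = [0.0, 10.0, 60.0, 120.0]
spares = spare_servers(0.05, latencies, 2.2)
steps = capacity_steps(spares, 2.2)
```

With fractional spares, each cumulative capacity step equals the demand increase reached at the next wakeup time, which is the tight schedule the slide describes. A real deployment would have to round each count to whole machines; rounding up is the conservative choice.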

16 16 Policy Design ● Active capacity policy ● Assumptions – Static machine power components dominate ● Heuristic search (each period): – Start with minimal tier sizes – Round 1: While predicted latency is too high, add one machine (at full frequency) to a selected tier – Round 2: While predicted latency is too low, lower the frequency of a selected tier – In both rounds, tier selection is guided by greedy minimization of the StdDev of $G_i$

17 17 Policy Design $G_1 = 0.47$, $G_2 = 0.23$, $G_3 = 0.82$

18 18 Policy Design $G_1 = 0.47 \to 0.53$ (StdDev = 0.2409), $G_2 = 0.23$, $G_3 = 0.82$

19 19 Policy Design $G_1 = 0.47$, $G_2 = 0.23 \to 0.61$ (StdDev = 0.1438), $G_3 = 0.82$

20 20 Policy Design $G_1 = 0.47$, $G_2 = 0.23$, $G_3 = 0.82 \to 0.83$ (StdDev = 0.2466)

21 21 Policy Design $G_1 = 0.47 \to 0.53$ (StdDev = 0.2409), $G_2 = 0.23 \to 0.61$ (StdDev = 0.1438), $G_3 = 0.82 \to 0.83$ (StdDev = 0.2466)
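
The greedy selection in this example can be reproduced with a few lines. This is a sketch: the function name is hypothetical, and the new G value for each candidate tier is taken from the slides as given, since the slides do not define how G is computed. The population standard deviation is used because it matches the StdDev figures shown.

```python
from statistics import pstdev


def pick_tier(g_now, g_after):
    """Return (tier_index, stddev) for the single-tier change that minimizes StdDev of G.

    g_now[i] is the current G of tier i; g_after[i] is the G that tier i would
    have if the change (one more machine, or a lower frequency) were applied to it.
    """
    best = None
    for i, new_value in enumerate(g_after):
        trial = list(g_now)
        trial[i] = new_value          # apply the change to tier i only
        spread = pstdev(trial)        # population standard deviation
        if best is None or spread < best[1]:
            best = (i, spread)
    return best


g_now = [0.47, 0.23, 0.82]
g_after = [0.53, 0.61, 0.83]
tier, spread = pick_tier(g_now, g_after)
```

Here `tier` is the zero-based index 1, that is, Tier 2, whose change yields the smallest spread of the three candidates.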

22 22 Policy Design ● Spare server policies ● “Optimal” Policy – Pre-computes S i and assigns spare machines to each state up to this limit, working from shallow to deep states. – This way, it attempts to ensure machines can be woken up by the time they are needed. ● Demotion Policy – Machines are put into progressively deeper states, as guided by carefully calculated fixed timeouts. – No attempt to ensure it is possible to wake up in time to meet demand. – The most widely used predictive policy
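
The shallow-to-deep assignment of the "Optimal" policy could look roughly like this. The function name and the handling of leftover machines are assumptions: the slide only says that states are filled up to their limits $S_i$, so sending the remainder to the deepest state is a guess at the natural completion.

```python
def assign_spares(idle_machines, limits):
    """Fill sleep states from shallow to deep and return the machine count per state.

    limits[i] is the precomputed S_i (whole machines) for state i. The deepest
    state has no limit here (an assumption) and receives whatever is left.
    """
    assignment = []
    remaining = idle_machines
    for limit in limits:
        take = min(remaining, limit)
        assignment.append(take)
        remaining -= take
    assignment.append(remaining)  # leftover machines go to the deepest state
    return assignment
```

With limits `[1, 2, 2]` and eight idle machines this yields `[1, 2, 2, 3]`; with only two idle machines the shallow states are served first, giving `[1, 1, 0, 0]`.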

23 23 Feedback Control Design ● Load estimation is critical ● Utilization is not sufficient ● Adjust estimates based on performance monitoring

24 24 Experimental Setup ● Testbed HW: 12-node cluster ● 3 sleep states (ACPI S3, S4, S5) ● 4 CPU frequencies (1.0, 1.8, 2.0, 2.2 GHz) ● Total cluster AC power measurements ● SW: standard Web server infrastructure ● LVS (load balancer) + Apache + JBoss + MySQL ● Our system implemented in middleware ● Workload: trace-driven TPC-W ● Highly dynamic workload that emulates client behavior ● Load fluctuations simulated from real-life Web traces

25 25 Policies Evaluated ● Baseline: statically provisioned for peak load (not overprovisioned) ● DVS: uses our active capacity algorithm but without reconfiguration, allowing only DVS ● Optimal(states): uses the “Optimal” spare server policy for reconfiguration ● Demotion(states): uses the Demotion spare server policy for reconfiguration ● where states is one of: – Off: allow the Soft-Off (S5) state only (and DVS) – Multiple: allow all 3 sleep states (and DVS)

26 26 Performance and Energy Efficiency ● Average latency ● Baseline: 139 ms ● Target: 250 ms ● Active policy: 227 ms ● Overall energy (cost) savings ● DVS: 11% ● Optimal(Off): 34% ● Energy efficiency (EDP) ● Baseline: 1.475 Js ● DVS: 1.342 Js ● Optimal(Off): 0.996 Js

27 27 Shape of Load Fluctuations ● Trace acceleration factor: 20x ● Exploiting multiple states yields 6–14% extra energy savings ● Optimal is less sensitive to trace differences ● Optimal outperforms Demotion for the majority of traces by up to 7% in energy savings

28 28 Peak Load Intensity ● Acceleration: 60x ● Multiple states yield additional energy savings of 7–23% ● Demotion(Off) is highly inefficient ● Worse energy than Optimal(Off) by up to 9% ● Fails to meet the target latency – Cause: does not leave spare capacity to absorb load while machines are being woken up

29 29 Time Scale of Fluctuations ● Acceleration factors: 5–60x ● System invariants cannot be accelerated – wakeup latencies, service startup times ● Major influence on energy savings and efficiency ● More dynamic fluctuations → more spare capacity is necessary → the choice of policy matters more ● High fluctuation rates + Demotion-like Off-only policy → performance degradation

30 30 Conclusions ● Separate concerns: active and spare capacity ● Minimal sleep energy subject to responsiveness constraints – A combination of spare capacities in multiple sleep states ● Exploiting multiple sleep states ● Extra energy savings up to 23% over Off-only version ● Negligible performance impact ● Benefits of the optimization-based spare policy ● Outperforms Demotion for the majority of traces ● Guarantees specified level of responsiveness ● No reliance on workload prediction

