Cooling-Aware and Thermal-Aware Workload Placement for Green HPC Data Centers Sandeep K. S. Gupta (co-authors: Ayan Banerjee, Tridib Mukherjee, George Varsamopoulos) School of Computing, Informatics and Decision Systems Engg. Arizona State University
Sandeep Gupta, IEEE Senior Member. Heads the IMPACT Lab @ School of Computing & Informatics. Use-inspired, Human-centric research in distributed cyber-physical systems: Pervasive Health Monitoring, Criticality-Aware Systems, Thermal Management for Data Centers, Intelligent Container ID Assurance, Mobile Ad-hoc Networks. BEST PAPER AWARD: Security Solutions for Pervasive HealthCare – ICISIP 2006. BOOK: Fundamentals of Mobile and Pervasive Computing, Publisher: McGraw-Hill, Dec. 2004. Area Editor, TPC Chair, TPC Co-Chair: GreenCom’07. Also for IEEE TPDS, WINET.
IMPACT: Current Research Thrusts. Challenges – Traffic congestion, Energy Scarcity, Climate Change, Medical Cost … Smart Infrastructure – distributed CPS (Cyber-Physical Embedded Systems (of systems)). Criticality (Context)-awareness to enhance dependability (security, safety, reliability) of CPS. Unifying Framework to enhance our understanding in developing (energy) efficient, sustainable, assured CPS. Model-based Design and Development to harness complexity (simultaneously ensure safety, security, efficiency etc.) as well as cost. Enhanced Usability and Interoperability to reduce manageability overhead and enhance
IMPACT Lab Members and Collaborators Faculty: Sandeep K. S. Gupta (Professor) Postdoc: Georgios Varsamopoulos Tridib Mukherjee Students: Zahra Abbasi (CSE Phd) Ayan Banerjee (CSE Phd) Michael Jonas (CSE Phd) Sailesh Kandula (CSE MS) Su Kim (CSE Phd) Collaborator From: Microsoft Embedded Innovation Center, Aachen FDA University of Florence Intel Corp. Texas Instruments U. Penn
Introduction-Motivation. The magnitude of data center energy consumption: Internet users’ growth in the world from 2000-2009: 400%. Data center energy consumption grew 20-30% annually in 2006 and 2007 [Uptime Institute research]. Addressing energy saving for internet/HPC data centers: thermal and cooling awareness to reduce energy consumption. Figure: Projected electricity use of data centers, 2007 to 2011 (historical energy use; future energy use projection under the current efficiency trend) [Source: EPA]. Figure: Typical data center energy end use [Source: Department of Energy].
The BlueTool project
Overview of problem and results. Can we save energy by coordinating job scheduling and cooling? How much? Results and Contributions: SP-EIR: an energy inefficiency metric of spatial scheduling; higher SP-EIR → worse energy performance of a schedule; lower heat recirculation → lower SP-EIR; higher thermostat setting → lower SP-EIR (because of CoP). HTS: a spatial scheduling algorithm that heuristically maximizes the thermostat setting. Evaluation of HTS combined with FCFS or EDF: EDF-HTS saves 15% over EDF-LRH.
Outline of talk. Background: thermal awareness and cooling awareness. System model: physical assumptions, job model, cooling model; heat recirculation and thermostat; dependency between job and cooling. Problem definition: how to schedule jobs so as to minimize the need for low cooling temperature. SP-EIR and HTS. Simulation-based comparison of various combinations of FCFS and EDF with LRH and HTS. On-going work: energy-proportional computing and its savings.
Job scheduling and energy awareness. Most energy-aware approaches are power-aware (e.g. DVFS schemes). Thermal awareness: knowing the heat recirculation. Cooling awareness: knowing the cooling performance. Why cooling awareness? Cooling, along with PDUs, is responsible for PUE > 1. Optimizing for cooling can save additional energy: about 15% for the simulated data center. Taxonomy of job scheduling schemes: energy-oblivious (performance-oriented) vs. energy-aware or power-aware; energy-aware schemes are either power-aware (thermally oblivious) or thermal-aware (aware of heat effects); thermal-aware schemes are either cooling-oblivious or cooling-aware (aware of the cooling model).
System model (1). Cold-aisle, hot-aisle configuration. Tred: red-line temperature. Each job comes with a deadline. Performance heterogeneity: fast and slow machines. CRAC (cooling equipment): two cooling power modes, low (preferred for energy efficiency) and high; two (programmable) set points (thresholds), low->high and high->low; mode-switching delay tsw; coefficient of performance depends on the current mode; supply air temperature $T_{sup} = T_{sen} - T_{diff}^{mode}$, where Tsen is the input air temperature. Epoch: the interval between two consecutive triggers. Computing equipment: linear power consumption model P = a U + b, where P is the system power, U is the CPU utilization, Pidle = b, and Ppeak is the power at full utilization. Challenge is to set the low->high set point as high as possible.
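A minimal sketch of the two models on this slide, assuming utilization U is normalized to [0, 1] so that Ppeak = a + b. The coefficient values, set points, and Tdiff values are hypothetical, and the mode-switching delay tsw is ignored for simplicity.

```python
def server_power(utilization, a, b):
    """Linear power model P = a*U + b, with U in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return a * utilization + b


def next_mode(mode, t_sen, low_to_high, high_to_low):
    """Two-mode CRAC thermostat with two set points (switching delay ignored)."""
    if mode == "low" and t_sen >= low_to_high:
        return "high"
    if mode == "high" and t_sen <= high_to_low:
        return "low"
    return mode


def supply_temperature(t_sen, mode, t_diff):
    """Tsup = Tsen - Tdiff[mode]; t_diff maps each mode to its temperature drop."""
    return t_sen - t_diff[mode]


# Hypothetical values, for illustration only.
A_WATTS, B_WATTS = 150.0, 100.0          # slope and idle power
T_DIFF = {"low": 5.0, "high": 10.0}      # degrees C removed in each mode

p_idle = server_power(0.0, A_WATTS, B_WATTS)   # equals b
p_peak = server_power(1.0, A_WATTS, B_WATTS)   # equals a + b
mode = next_mode("low", t_sen=26.0, low_to_high=25.0, high_to_low=20.0)
t_sup = supply_temperature(25.0, mode, T_DIFF)
```

Raising the low->high set point keeps the CRAC in its low mode for longer, which is why the slide frames the challenge as setting it as high as possible.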
System model (2). Models assumed: Cooling distribution matrix $F$: diagonal matrix; $f_{ii}$: portion of cool air going to equipment $i$. Heat recirculation matrix $D$; $d_{ij}$: portion of heat going from equipment $i$ to equipment $j$. Figure: a CRAC supplying equipment 1, 2, and 3 with portions $f_1, f_2, f_3$, with recirculation $d_{12}, d_{13}, d_{21}, d_{23}, d_{31}, d_{32}$ among them, subject to $T_{in} \le T_{red}$. Inlet temperature constraint: $T_{in}(t) = F\,T_{sup}(t) + D\,P(t) \le T_{red}$. With $P(t) = a\,U(t) + \omega$, this gives $T_{sup}(t) \le F^{-1}\,[\,T_{red} - D\,(a\,U(t) + \omega)\,]$. $T_{sup}(t)$ has to be dynamically adjusted in accordance with $U(t)$ to meet the $T_{red}$ constraint. Highest thermostat setting, $\max T_{sen}$, can be derived as: $\max T_{sen} = T_{sup} + T_{diff}^{m} - [\text{temperature increase due to mode-switching delay}]$. Selecting/scheduling a different set of servers (i.e. changing $a$ and $\omega$) can change the requirements on $T_{sup}$.
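The bound on Tsup can be illustrated with a small sketch. It assumes a single CRAC with one scalar supply temperature, a diagonal F given as a list of its diagonal entries, and D stored so that row i of D times P gives the temperature rise at the inlet of equipment i (matching the matrix form of the constraint). All numeric values are hypothetical.

```python
def power_vector(a, omega, util):
    """Per-equipment power P = a*U + omega."""
    return [a[j] * util[j] + omega[j] for j in range(len(util))]


def recirculation_rise(d, power):
    """Temperature rise at each inlet: (D P)_i = sum_j d[i][j] * P_j."""
    n = len(power)
    return [sum(d[i][j] * power[j] for j in range(n)) for i in range(n)]


def max_supply_temperature(f_diag, d, a, omega, util, t_red):
    """Largest scalar Tsup with f_ii * Tsup + (D P)_i <= Tred for every i."""
    rise = recirculation_rise(d, power_vector(a, omega, util))
    return min((t_red - rise[i]) / f_diag[i] for i in range(len(rise)))


def inlet_temperatures(f_diag, d, t_sup, power):
    """Tin = F * Tsup + D * P for a scalar Tsup."""
    rise = recirculation_rise(d, power)
    return [f_diag[i] * t_sup + rise[i] for i in range(len(power))]


# Hypothetical two-server example.
F_DIAG = [1.0, 1.0]
D = [[0.0, 0.01],
     [0.02, 0.0]]
A = [100.0, 100.0]
OMEGA = [100.0, 100.0]
T_RED = 30.0

util = [1.0, 0.5]
t_sup_max = max_supply_temperature(F_DIAG, D, A, OMEGA, util, T_RED)
t_in = inlet_temperatures(F_DIAG, D, t_sup_max, power_vector(A, OMEGA, util))
```

In this example the second server receives more recirculated heat, so it is the one that limits the supply temperature; moving load away from the servers that heat it would relax the bound.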
Problem definition and HTS. Given a data center and its running jobs, for a given new job, find: a spatial schedule (i.e. server assignment) for that job, and thermostat settings for the CRAC, that minimize the energy consumption while meeting the deadlines. Algorithm HTS (Highest Thermostat Setting): a spatial-scheduling-only algorithm (i.e. server assignment). Find a spatial schedule that maximizes the (low->high) thermostat setting: assign a ranking grade to each server, Rank(server j) = Tred – [temperature rise at j caused by all servers at full blast]; assign the job to the available servers with the highest rank values.
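The HTS ranking step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the recirculation matrix is stored so that d[j][i] is the temperature rise at server j per watt drawn by server i, "full blast" means every server at its peak power, and ties are broken by server index. The function names and values are hypothetical.

```python
def hts_ranks(d, p_peak, t_red):
    """Rank(j) = Tred - temperature rise at j with all servers at peak power."""
    n = len(p_peak)
    return [t_red - sum(d[j][i] * p_peak[i] for i in range(n)) for j in range(n)]


def hts_assign(ranks, available, servers_needed):
    """Pick the available servers with the highest rank values."""
    candidates = sorted(available, key=lambda j: (-ranks[j], j))
    if len(candidates) < servers_needed:
        raise ValueError("not enough available servers for this job")
    return candidates[:servers_needed]


# Hypothetical three-server example.
D = [[0.0, 0.01, 0.01],
     [0.02, 0.0, 0.02],
     [0.005, 0.005, 0.0]]
P_PEAK = [200.0, 200.0, 200.0]
T_RED = 30.0

ranks = hts_ranks(D, P_PEAK, T_RED)
chosen = hts_assign(ranks, available=[0, 1, 2], servers_needed=2)
```

Server 2 receives the least recirculated heat, so it ranks highest and is filled first; server 1, the hottest recipient, is used only when nothing else is available.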
SP-EIR: an energy inefficiency metric. SP-EIR(alg, job set J) = Challenging to compute max SP-EIR over all possible job sets. Akin to competitive ratio in the performance domain. One (naïve) upper bound to SP-EIR: Ealg(100% utilization)/Eopt(idle). Note the naïve upper bound is independent of the algorithm (for 100% utilization, there is only one possible schedule). It is solely dependent on the data center's thermal and cooling behavior. For the simulated data center, the upper bound is 1.69. Here we “measure” SP-EIR using simulation – leave theoretical analysis as a challenge for theoreticians.
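A minimal sketch of how such a ratio could be computed, assuming SP-EIR is the energy consumed under an algorithm's spatial schedule divided by the energy under the reference (optimal) schedule, as the comparison with competitive ratio and the naïve upper bound suggest. It also assumes total power is computing power plus cooling power, with cooling power equal to computing power divided by the CoP. The CoP values and loads are hypothetical.

```python
def total_energy(computing_power_w, cop, duration_s):
    """Computing plus cooling energy in joules; cooling power = computing power / CoP."""
    if cop <= 0:
        raise ValueError("CoP must be positive")
    return computing_power_w * (1.0 + 1.0 / cop) * duration_s


def sp_eir(energy_alg, energy_ref):
    """Assumed form of the metric: energy of the algorithm over energy of the reference."""
    return energy_alg / energy_ref


# Hypothetical case: same computing load for one hour, but the reference
# schedule permits a higher thermostat setting and hence a higher CoP.
e_alg = total_energy(10_000.0, cop=2.5, duration_s=3600.0)
e_ref = total_energy(10_000.0, cop=4.0, duration_s=3600.0)
ratio = sp_eir(e_alg, e_ref)
```

Under these assumptions a schedule that allows a higher thermostat setting runs the CRAC at a better CoP, which lowers its energy and therefore its ratio against the reference.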
Simulation setup. F and D derived from a CFD model of the ASU HPCI data center (dimensions shown in the figure: 9.6 m, 8.4 m, 3.6 m). 30 Dell 1955 chassis, 20 Dell 1855 chassis. P derived from power measurements of the computing equipment. Some variations of the spatial scheduling algorithms: cooling oblivious (e.g. LRH): statically use the thermostat setting for 100% data center utilization; m (e.g. LRHm): statically use the maximum thermostat setting for the given job trace; d (e.g. LRHd): dynamically adjust the thermostat setting to match Tred. Figure panels: 5%, 40%, and 80% overall data center utilization.
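The three thermostat variants can be contrasted with a short sketch. It assumes the highest safe setting has already been computed for each epoch of a trace and for 100% utilization, and it reads the "m" variant as the highest single setting that is safe for every epoch of the trace. The function name and values are hypothetical.

```python
def thermostat_settings(per_epoch_max, full_load_max):
    """Per-epoch thermostat settings under the three variants.

    per_epoch_max: highest safe setting in each epoch of the job trace.
    full_load_max: highest safe setting at 100% data center utilization.
    """
    n = len(per_epoch_max)
    return {
        "oblivious": [full_load_max] * n,       # fixed, sized for full load
        "m": [min(per_epoch_max)] * n,          # fixed, sized for this trace
        "d": list(per_epoch_max),               # adjusted every epoch
    }


# Hypothetical trace of three epochs (degrees C).
settings = thermostat_settings([24.0, 21.0, 23.0], full_load_max=18.0)
```

In every epoch the dynamic setting is at least as high as the static trace-specific one, which in turn is at least as high as the cooling-oblivious one.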
Measuring thermal efficiency: LRH. Thermal efficiency: least contribution to the heat recirculation. LRH: a metric of thermal efficiency of a server [Tang et al. T-PDS ’08]. Based on a two-layer rank calculation: rank the servers as recipients of heat recirculation; rank the servers as contributors of heat recirculation. LRH weight of S = Σ_recipients (recipient value × amount of heat from S to recipient). Example: LRH ranking of servers A and B (the figure shows the direction and amount of heat recirculation): the LRH rank of server B is worse than that of A.
SP-EIR as measured. Reference algorithm used as optimal: minimize the product DP. Observations: HTS always has the lowest SP-EIR in the simulations. Enhancing any algorithm with cooling awareness reduces the SP-EIR. MTDP has lower SP-EIR than LRH although it is thermally oblivious: power-aware workload consolidation (MTDP) has a higher saving effect than thermal-aware job scheduling (LRH). Enhancing LRH with cooling awareness can bring the SP-EIR lower than that of MTDP.
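The two-layer calculation can be sketched as below. This is an illustration, not the published LRH definition: it assumes the recipient value of a server is the total recirculated heat it receives, and that the weight of a contributor S sums recipient value times the heat S sends to each recipient. The heat amounts are hypothetical.

```python
def recipient_values(heat_from_to):
    """Layer 1 (assumed): value of each recipient = total heat it receives."""
    n = len(heat_from_to)
    return [sum(heat_from_to[s][r] for s in range(n)) for r in range(n)]


def lrh_weights(heat_from_to):
    """Layer 2: weight of S = sum over recipients of value * heat from S."""
    values = recipient_values(heat_from_to)
    n = len(heat_from_to)
    return [sum(values[r] * heat_from_to[s][r] for r in range(n)) for s in range(n)]


def lrh_order(heat_from_to):
    """Servers from best (least weighted contribution) to worst."""
    weights = lrh_weights(heat_from_to)
    return sorted(range(len(weights)), key=lambda s: (weights[s], s))


# Hypothetical example: heat_from_to[s][r] is the heat sent from s to r.
# Index 0 is server A, index 1 is server B, index 2 is a third server.
HEAT = [[0.0, 1.0, 0.0],
        [2.0, 0.0, 3.0],
        [0.0, 1.0, 0.0]]

weights = lrh_weights(HEAT)
order = lrh_order(HEAT)
```

Server B sends more heat, and sends it to servers that are already strong recipients, so its weight is the largest and it is ranked last.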
EDF-HTS: results on energy savings with respect to other schemes Idle on LRH HTS Data center utilization cooling oblivious m d (inherently dynamic) FCFS-backfill 5% 12.41 10.65 40% 5.70 3.27 80% 3.30 0.85 EDF 3.70 3.32 0.87 0.00 (ref point) 1.85 1.49 0.83 1.40 1.31 0.73 Idle off LRH MTDP HTS Data center utilization cooling oblivious m d (inherently dynamic) FCFS-backfill 5% 23.78 21.30 40% 21.50 17.22 80% 15.80 10.81 EDF 12.53 12.30 5.17 8.17 0.00 (ref point) 16.00 15.56 10.84 8.73 5.73 9.03 8.86 0.66 3.47 0.47
Conclusions. Cooling awareness. Advantages: additional energy savings over other thermal-aware (but cooling-oblivious) schemes; savings up to 23% in the simulations. Disadvantages: requires good knowledge of the heat recirculation pattern and the performance of the cooling units. Holistic management approaches that can configure the cooling unit over the network can be cooling-aware. SP-EIR: depends on the given algorithm, job and data center. Upper bound for any algorithm depends on the thermal and power characteristics of the data center.
Implications of thermal awareness. First direction: introduce thermal awareness beyond just scheduling, in data center management: thermal-aware power management, thermal-aware cooling management. Cooling awareness enables the above. “Model-driven Co-ordinated Management of Data Centers,” ComNet, Special issue on Managing Emerging Computing Environments, under review. Second direction: investigate the effect of technological trends on the savings of management, e.g. “Trends and Effects of energy proportionality on server provisioning in internet data centers,” HiPC 2010.
Energy proportionality metrics. Energy-proportional computing: consume power in proportion to utilization (purple line). Metrics: IPR: idle-to-peak power ratio, $\mathrm{IPR} = P_{idle} / P_{peak}$. LDR: linear deviation ratio, $\mathrm{LDR} = \max_u \frac{P(u) - \mathrm{Linear}(u)}{\mathrm{Linear}(u)}$ (the ratio of the maximum offset from the straight green line over the value of the straight line at the maximum point).
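Both metrics can be computed from a sampled power curve, as in this sketch. It assumes Linear(u) is the straight line from Pidle at u = 0 to Ppeak at u = 1, that the samples start at u = 0 and end at u = 1, and that the "maximum offset" is the largest offset in absolute value with its sign kept, so that curves below the line give a negative LDR. The sample curves are hypothetical.

```python
def linear_power(u, p_idle, p_peak):
    """Straight line between idle power (u = 0) and peak power (u = 1)."""
    return p_idle + (p_peak - p_idle) * u


def ipr(p_idle, p_peak):
    """Idle-to-peak power ratio."""
    return p_idle / p_peak


def ldr(utils, powers):
    """Signed deviation from the line, taken where the absolute offset is largest."""
    p_idle, p_peak = powers[0], powers[-1]
    best_offset, best_ratio = 0.0, 0.0
    for u, p in zip(utils, powers):
        line = linear_power(u, p_idle, p_peak)
        offset = p - line
        if abs(offset) > abs(best_offset):
            best_offset, best_ratio = offset, offset / line
    return best_ratio


# Hypothetical measurements (watts) at 0%, 50%, and 100% utilization.
UTILS = [0.0, 0.5, 1.0]
ABOVE_LINE = [100.0, 180.0, 200.0]
BELOW_LINE = [100.0, 130.0, 200.0]

ipr_value = ipr(ABOVE_LINE[0], ABOVE_LINE[-1])
ldr_above = ldr(UTILS, ABOVE_LINE)
ldr_below = ldr(UTILS, BELOW_LINE)
```

A perfectly linear curve gives an LDR of zero regardless of its IPR, which is why the two metrics are reported separately.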
Historical trends of energy proportionality Source data from SPECpower_ssj2008 published results
Discussion on diverging LDR. Negative LDR: ideal for stand-alone systems that are under-utilized. Positive LDR: ideal for use in consolidation. Near-zero LDR: energy efficiency is independent of the utilization level. Figure annotations (plots of power P against utilization U): optimal performance-to-power ratio; minimal energy increase for considerable performance increase; performance-to-power ratio almost independent of workload.
Conclusions. Energy proportionality will have different effects on the energy savings, depending upon the shape of the power curve. IPR → 0: energy savings of power management (server provisioning) are expected to be minimal. LDR >> 0: maximum energy efficiency may not be at 100% utilization; systems can be optimally efficient at lower utilizations.