Sensor-Based Fast Thermal Evaluation Model For Energy Efficient High-Performance Datacenters Q. Tang, T. Mukherjee, Sandeep K. S. Gupta Department of Computer.


Sensor-Based Fast Thermal Evaluation Model For Energy Efficient High-Performance Datacenters Q. Tang, T. Mukherjee, Sandeep K. S. Gupta Department of Computer Sc. & Engg. Arizona State University & Phil Cayton, Intel Corp.

Heating problem in data centers: Power densities are increasing exponentially, in step with Moore's Law. Current cooling solutions operate at several levels: chip/component level, server/board level, rack level, and data center level.

Two steps of reducing heating effects: (1) Design and deployment stage (civil and mechanical engineering approach): increasing air conditioner capacity, and designing an optimized layout to facilitate air circulation. (2) Operation stage (computer science approach): for example, dynamically assigning tasks to avoid overheated servers and achieve thermal balance, or assigning tasks to servers that consume less energy.

Thermal Management of Datacenter: Motivation and significance: compute-intensive applications (online gaming, computer movie animation, data mining) require increased utilization of the data center; maximizing computing capacity is a demanding requirement; new blade servers can be packed more densely; energy cost is rising dramatically. Goals: improving thermal performance, lowering the hardware failure rate, and reducing energy cost.

Typical layout of a datacenter (figure): labels mark the rack outlet temperature T_out, the rack inlet temperature T_in, and the air conditioner supply temperature T_s.

Schematic View of Thermal Management

Thermal-Aware Scheduling versus Datacenter Energy Cost

Thermal Scheduling: Problem Statement: We present results of thermal-aware scheduling to improve the energy efficiency of a (blade-server-based) datacenter. Given a total task C, how should it be divided among N server nodes so that the computing task finishes with minimal total energy cost?

Energy Conservation: The inlet airflow of a server is a mixture of supplied cold air and recirculated hot air. The server's power consumption P_i, which depends on the amount of computing task assigned, heats the air between inlet and outlet.
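
The energy balance on this slide can be illustrated with a minimal sketch. It assumes steady state and that all of a node's power P_i ends up as heat in the air passing through it. The air density and specific heat are typical room-condition values, not figures from the slides, and the function names are hypothetical.

```python
AIR_DENSITY = 1.19          # kg/m^3, assumed typical value for room air
AIR_SPECIFIC_HEAT = 1005.0  # J/(kg*K), assumed typical value for air


def inlet_temperature(t_supply_c, t_recirculated_c, recirculated_fraction):
    """Inlet air temperature as a mix of supplied cold air and recirculated hot air."""
    if not 0.0 <= recirculated_fraction <= 1.0:
        raise ValueError("recirculated_fraction must be between 0 and 1")
    return ((1.0 - recirculated_fraction) * t_supply_c
            + recirculated_fraction * t_recirculated_c)


def outlet_temperature(t_in_c, power_w, airflow_m3_per_s):
    """Outlet air temperature from a steady-state energy balance on one node."""
    if airflow_m3_per_s <= 0:
        raise ValueError("airflow must be positive")
    # Heat capacity rate of the air stream in W/K.
    heat_capacity_rate = AIR_DENSITY * airflow_m3_per_s * AIR_SPECIFIC_HEAT
    return t_in_c + power_w / heat_capacity_rate


if __name__ == "__main__":
    t_in = inlet_temperature(15.0, 35.0, 0.25)
    print(t_in, outlet_temperature(t_in, 2000.0, 0.2))
```

More power at the same airflow gives a larger temperature rise, which is why the task assignment shapes the temperature distribution.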

Thermal Management: Different task assignments result in different power consumption distributions. Different power consumption distributions result in different temperature distributions. Different temperature distributions result in different total energy costs.

Example (figure): inlet temperature distribution without cooling and with cooling; cooling lowers the inlet temperatures below the 25 °C redline threshold. Different schedulings (Scheduling 1, Scheduling 2) result in different inlet temperature distributions and therefore in different demand for cooling load/energy.

Total Energy Cost of Datacenter: Total energy cost = computing energy cost + cooling energy cost. Cooling must keep the maximal inlet temperature below the redline temperature of the devices (25 °C). Coefficient of Performance (COP): COP = (amount of heat removed) / (energy consumed by the cooling device).
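
As a rough illustration of how the COP links computing power to total power, the sketch below assumes that, in steady state, the heat to be removed equals the computing power dissipated. The COP is passed in as a plain number; how it varies with supply temperature is not stated on the slide and is not modeled here. Function names and the example values are hypothetical.

```python
def cooling_power(computing_power_w, cop):
    """Power drawn by the cooling device to remove the given heat load.

    Follows from COP = heat removed / energy consumed by the cooling device,
    assuming heat removed == computing power (steady state).
    """
    if cop <= 0:
        raise ValueError("COP must be positive")
    return computing_power_w / cop


def total_power(computing_power_w, cop):
    """Computing power plus the cooling power needed to remove its heat."""
    return computing_power_w + cooling_power(computing_power_w, cop)


if __name__ == "__main__":
    # Same computing load, two hypothetical COP values.
    for cop in (2.0, 4.0):
        print(cop, total_power(10_000.0, cop))
```

A higher COP means the same heat load costs less cooling energy, so the total falls even though the computing power is unchanged.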

Observation: Even with the same computing power dissipation, different temperature distributions may demand different cooling loads, resulting in different total energy costs. We can manipulate task scheduling to achieve the best temperature distribution and consequently minimize total energy cost.

Naive Scheduling Algorithm

Uniform Outlet Profile: Why naive: based on observation and intuition, with no mathematical formalization. Uniform Outlet Profile (UOP): assign tasks so as to achieve a uniform outlet temperature distribution T_c, giving more tasks to nodes with low inlet temperature (a water-filling process). (Figure: each node's outlet temperature T_c is its inlet temperature plus the temperature rise due to power consumption.)
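
The water-filling idea behind UOP can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: it assumes every node has the same temperature rise per watt (`rise_per_watt`, a hypothetical parameter), treats inlet temperatures as fixed, and ignores per-node capacity limits.

```python
def uniform_outlet_profile(inlet_temps_c, total_power_w, rise_per_watt):
    """Split total_power_w so that all loaded nodes reach the same outlet temperature.

    Returns (power_per_node, common_outlet_temperature).
    Nodes whose inlet is already warmer than the common level get no power.
    """
    n = len(inlet_temps_c)
    if n == 0 or total_power_w < 0 or rise_per_watt <= 0:
        raise ValueError("need at least one node, non-negative power, positive rise")

    # Coolest inlets first: they are "filled" first, like water filling.
    order = sorted(range(n), key=lambda i: inlet_temps_c[i])

    # Find the largest set of coolest nodes that all receive non-negative power.
    for k in range(n, 0, -1):
        active = order[:k]
        level = (sum(inlet_temps_c[i] for i in active)
                 + rise_per_watt * total_power_w) / k
        if level >= inlet_temps_c[active[-1]]:
            break  # always reached by k == 1

    powers = [0.0] * n
    for i in active:
        powers[i] = (level - inlet_temps_c[i]) / rise_per_watt
    return powers, level


if __name__ == "__main__":
    powers, level = uniform_outlet_profile([18.0, 20.0, 26.0], 300.0, 0.02)
    print(powers, level)
```

In the example, the node with the 26 °C inlet receives no load because the common outlet level stays below its inlet temperature.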

Uniform Task (UT): Assign all chassis the same amount of tasks (power consumption), so that all nodes experience the same power consumption and temperature rise.

Minimum Computing Energy (cooling inlet): Assign tasks so as to keep the number of active (powered-on) chassis as small as possible.
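
A minimal sketch of this packing policy is shown below. The slide does not say which chassis are switched on first; filling the chassis with the lowest inlet temperature first is an assumption made here for illustration, as are the function name and the uniform `capacity_per_chassis` parameter.

```python
def pack_tasks(total_tasks, capacity_per_chassis, inlet_temps_c):
    """Fill chassis one at a time so that as few as possible are powered on.

    Returns a list with the number of tasks assigned to each chassis.
    Ordering by inlet temperature is an assumption, not taken from the slides.
    """
    n = len(inlet_temps_c)
    if total_tasks < 0 or capacity_per_chassis <= 0:
        raise ValueError("tasks must be non-negative and capacity positive")
    if total_tasks > n * capacity_per_chassis:
        raise ValueError("not enough capacity for the requested tasks")

    assignment = [0] * n
    remaining = total_tasks
    for i in sorted(range(n), key=lambda idx: inlet_temps_c[idx]):
        if remaining == 0:
            break
        assignment[i] = min(capacity_per_chassis, remaining)
        remaining -= assignment[i]
    return assignment


if __name__ == "__main__":
    print(pack_tasks(25, 10, [22.0, 18.0, 20.0, 19.0]))
```

Chassis left at zero tasks can stay powered off, which is where the computing energy saving comes from.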

Abstract Heat Flow Model & Cross Interference Coefficients

Abstract Heat Flow Model: Observation: airflow patterns are stable (confirmed through CFD simulation). Hypothesis: the amount of recirculated heat is stable and can be characterized. Define a_ij as the percentage of recirculated heat from node i to node j.

Cross Interference among Server Nodes: Cross interference coefficients (CIC): define a_ij as the percentage of recirculated heat from node i to node j. Together the coefficients form the cross interference matrix, which captures the correlations among power consumption (utilization rate), temperature, and cross interference.
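
One way to see how cross interference coefficients turn a power distribution into a temperature distribution is the sketch below. It is an illustration under stated assumptions, not the authors' formulation: every node is assumed to have the same airflow (so a single `heat_rate_w_per_k` in W/K applies), `a[i][j]` is read as the fraction of node i's outlet air that reaches node j's inlet, the rest of each inlet is supplied cold air, and the steady state is found by simple fixed-point iteration.

```python
def predict_temperatures(a, powers_w, t_supply_c, heat_rate_w_per_k, iterations=200):
    """Estimate steady-state inlet and outlet temperatures of all nodes.

    a[i][j]  : fraction of node i's outlet heat recirculated into node j's inlet
    powers_w : power consumption of each node in watts
    Returns (inlet_temps, outlet_temps) in degrees Celsius.
    """
    n = len(powers_w)
    # Total recirculated share arriving at each inlet j (column sums of a).
    recirc_in = [sum(a[i][j] for i in range(n)) for j in range(n)]
    if any(r >= 1.0 for r in recirc_in):
        raise ValueError("recirculated share at an inlet must be below 1")

    rise = [p / heat_rate_w_per_k for p in powers_w]  # outlet minus inlet, per node
    t_in = [t_supply_c] * n
    for _ in range(iterations):
        t_out = [t_in[i] + rise[i] for i in range(n)]
        # Each inlet mixes supplied cold air with recirculated outlet air.
        t_in = [(1.0 - recirc_in[j]) * t_supply_c
                + sum(a[i][j] * t_out[i] for i in range(n))
                for j in range(n)]
    t_out = [t_in[i] + rise[i] for i in range(n)]
    return t_in, t_out


if __name__ == "__main__":
    a = [[0.0, 0.2],
         [0.0, 0.0]]  # hypothetical: 20% of node 0's exhaust reaches node 1
    print(predict_temperatures(a, [1000.0, 0.0], 15.0, 100.0))
```

Because each update only needs a few multiplications per node pair, an evaluation of this kind is far cheaper than a full airflow simulation. The iteration converges more slowly when the recirculated shares approach 1.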

Fast Thermal Evaluation: A profiling process is used to calculate the cross interference coefficients. (Figure: given a configuration of the distributed system, temperature prediction for thermal performance evaluation can be done either by numerical simulation, which takes hours, or by fast thermal evaluation, in real time.)

Recirculation Minimized Scheduling: XInt

Formalizing the optimization problem: To minimize cooling energy cost, we only need to minimize the maximal inlet temperature. The optimization problem is formalized on the basis of the abstract heat flow model and can be converted into LP, ILP, linear, or nonlinear problems according to different models and policies.
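
The objective of minimizing the maximal inlet temperature can be illustrated with a tiny exhaustive search. This is a stand-in for the LP/ILP formulations mentioned on the slide, usable only for very small instances. It assumes a linear model in which inlet temperature equals supply temperature plus a weighted sum of node powers; the weight matrix `d`, the power-per-task model, and all names are hypothetical.

```python
from itertools import product


def best_assignment(total_tasks, capacity, d, t_supply_c, idle_w, per_task_w):
    """Find the task split that minimizes the peak inlet temperature.

    d[j][i] : assumed inlet temperature rise at node j per watt drawn by node i
    Returns (peak_inlet_temperature, tasks_per_node), or None if infeasible.
    """
    n = len(d)
    best = None
    # Enumerate every way to give 0..capacity tasks to each node.
    for tasks in product(range(capacity + 1), repeat=n):
        if sum(tasks) != total_tasks:
            continue
        powers = [idle_w + per_task_w * c for c in tasks]
        peak = max(t_supply_c + sum(d[j][i] * powers[i] for i in range(n))
                   for j in range(n))
        if best is None or peak < best[0]:
            best = (peak, tasks)
    return best


if __name__ == "__main__":
    # Hypothetical 2-node case: node 1's heat strongly affects node 0's inlet.
    d = [[0.002, 0.010],
         [0.002, 0.004]]
    print(best_assignment(4, 4, d, 15.0, 0.0, 100.0))
```

In this example the search places all tasks on node 0, because heat produced by node 1 recirculates more strongly into the inlets.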

Simulation Results

Simulation Environment: Two-row datacenter with ten standard 42U racks. Each rack has five Dell 1855 blade servers. CFD simulation (Flovent from Flomerics) is used to evaluate the temperature distribution.

Datacenter model (figure): node labels shown include Node 1, Node 2, Node 5, Node 25, Node 30, and Node 50.

Cross Interference Coefficients: Confirmed with datacenter reality. Strong interference to neighboring nodes.

Fast Thermal Evaluation Results: Provides fast and accurate temperature prediction. Practical for online, real-time thermal management.

Simulation Results: Cooling Cost

Simulation Results: Analysis & Summary: XInt consistently outperforms all other scheduling algorithms. Compared with MinHR, XInt is more practical: it is task-oriented rather than power-oriented scheduling, and it works online in real time. XInt is mathematically formalized.

Future Work: Integrating with cluster management software platforms (Moab, Torque, etc.). Considering task priorities and time constraints.

Questions?

Related Work: ConSil vs. Fast Thermal Evaluation: deduction vs. prediction; current vs. future: which is more important for proactive and preventive thermal management? MinHR vs. XInt: both characterize recirculation at similar granularities; aggregated effects vs. point-to-point; offline vs. online; power-oriented vs. task-oriented.

Supply Heat Index (SHI): Roughly characterizes recirculation. Cannot differentiate between cases with the same SHI but different temperature distributions.
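
The limitation noted on this slide can be shown with a small numeric sketch. It uses the commonly cited form of SHI, the ratio of the heat picked up by the air before it reaches the rack inlets to the total heat carried at the rack outlets, and assumes equal airflow through every rack so that the ratio reduces to temperature sums. The temperatures below are made up for illustration.

```python
def supply_heat_index(inlet_temps_c, outlet_temps_c, t_supply_c):
    """SHI under an equal-airflow assumption.

    Numerator  : sum of (inlet - supply), heat gained before reaching the inlets
    Denominator: sum of (outlet - supply), total heat carried at the outlets
    """
    gained_before_inlet = sum(t - t_supply_c for t in inlet_temps_c)
    total_at_outlet = sum(t - t_supply_c for t in outlet_temps_c)
    if total_at_outlet <= 0:
        raise ValueError("outlets must be warmer than the supply air on average")
    return gained_before_inlet / total_at_outlet


if __name__ == "__main__":
    supply = 15.0
    # Two hypothetical layouts with the same SHI but different hot spots.
    even = supply_heat_index([20.0, 20.0], [30.0, 30.0], supply)
    skewed = supply_heat_index([16.0, 24.0], [26.0, 34.0], supply)
    print(even, skewed, max([20.0, 20.0]), max([16.0, 24.0]))
```

Both layouts yield the same index, yet the skewed one has a hotter maximum inlet temperature, which is the quantity the redline constraint cares about.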