Software Architecture for Dynamic Thermal Management in Datacenters Tridib Mukherjee Graduate Research Assistant IMPACT Lab (www.impact.asu.edu) Department.

Presentation transcript:

Software Architecture for Dynamic Thermal Management in Datacenters
Tridib Mukherjee, Graduate Research Assistant
IMPACT Lab (www.impact.asu.edu)
Department of Comp. Sc. & Engg., Arizona State University

2 Outline
- Motivation
- Dynamic Thermal Management in Datacenters
- Thermal-aware task scheduling
- Software Architecture
- Conclusions and Future work

3 Motivation
- Computing clusters are increasingly deployed in current datacenters, which are limited by power and thermal capacity
  - High server density achieves higher computation capability but leads to high heat density
  - Reliability and longevity of overheated servers are affected; system downtime may increase
- Rising cost for datacenters
  - Large-scale datacenters can run into millions of dollars; cooling cost comprises almost half of this
  - The current trend of overcooling based on worst-case thermal characteristics leads to high utility costs
- A dynamic thermal-aware control platform is necessary for online thermal evaluation that can achieve a tradeoff between these extremes.

4 Thermal Management of Datacenter
- Motivation and significance
  - Compute-intensive applications (online gaming, computer movie animation, data mining) require increased utilization of the data center
    - Maximizing computing capacity is a demanding requirement
  - New blade servers can be packed more densely
  - Energy cost is rising dramatically
- Goal
  - Improving thermal performance
  - Lowering hardware failure rate
  - Reducing energy cost

5 Typical layout of a datacenter
- Rack outlet temperature T_out
- Rack inlet temperature T_in
- Air conditioner supply temperature T_s

6 Schematic View of Thermal Management

7 Research Issues of Thermal Management in Datacenter
[Diagram labels: Abstract Heat Flow Model; Power & Load Characterization; Modeling Thermal Performance; Multiscale & Multimodal Info Analysis; Thermal Performance Evaluation; Cost Optimization; Scheduler; Other Impact Factors; Understanding; Control]

8 Task Scheduling and Thermal Distribution Correlation
Scheduling requirements
- Real-time measurement
- Online lightweight temperature prediction
- Thermal awareness in the scheduling decisions
Reaction chain
- Task assignment → power consumption distribution → temperature distribution → energy cost
- Task assignment → power consumption distribution → inlet temperature distribution without cooling (25 °C marked in the figure) → cooling lowered → inlet temperature lowered below redline threshold → demand for cooling load/energy
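The last link of the reaction chain (inlet temperatures above the redline create demand for cooling) can be illustrated with a minimal Python sketch. It assumes, for illustration only, that lowering the supply temperature by some amount lowers every inlet temperature by the same amount. The function name, the sample temperatures, and the use of 25 °C as the redline value are hypothetical and not taken from the presentation's implementation.

```python
from typing import Sequence


def required_supply_reduction(inlet_temps_c: Sequence[float], redline_c: float) -> float:
    """Degrees by which the supply temperature must drop so that the
    hottest inlet is at or below the redline (0.0 if already safe).

    Assumption: a drop of d degrees in supply temperature lowers every
    inlet temperature by the same d degrees.
    """
    if not inlet_temps_c:
        raise ValueError("need at least one inlet temperature")
    hottest = max(inlet_temps_c)
    return max(0.0, hottest - redline_c)


# Hypothetical inlet temperatures (degrees C) produced by two task assignments.
balanced = [24.0, 26.0, 25.5, 24.5]
skewed = [22.0, 31.0, 23.0, 24.0]
REDLINE_C = 25.0  # example value, treated as an assumption

balanced_drop = required_supply_reduction(balanced, REDLINE_C)
skewed_drop = required_supply_reduction(skewed, REDLINE_C)
```

Under this assumption the skewed assignment needs a larger supply-temperature drop than the balanced one, which is why the task assignment ends up driving the cooling energy.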

9 Thermal-aware Scheduling Techniques
- Uniform Task distribution (UT): assign all chassis the same amount of tasks (power consumption)
- Uniform Outlet Profile (UOP): assign tasks so as to balance outlet temperatures (uniform distribution)
- Minimum Computing Energy (coolest inlet) (MCE): assign tasks so as to keep the number of active (powered-on) chassis as small as possible
- Recirculation Minimized Scheduling (XInt): use a profiling process to calculate cross-interference coefficients
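Two of these policies are simple enough to sketch in Python. The sketch below is an illustration under stated assumptions, not the presenters' implementation: every chassis is assumed to have the same task capacity, tasks are assumed identical, and the function names and sample data are hypothetical. UOP and XInt are omitted because they need thermal profiling data that the slides do not give.

```python
from typing import Dict


def uniform_task(inlet_temps: Dict[str, float], n_tasks: int, capacity: int) -> Dict[str, int]:
    """UT: spread tasks as evenly as possible over all chassis."""
    if n_tasks > capacity * len(inlet_temps):
        raise ValueError("not enough capacity")
    base, extra = divmod(n_tasks, len(inlet_temps))
    # The first `extra` chassis take one additional task.
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(inlet_temps)}


def coolest_inlet_first(inlet_temps: Dict[str, float], n_tasks: int, capacity: int) -> Dict[str, int]:
    """MCE-style: fill the coolest-inlet chassis first so that as few
    chassis as possible are active; the rest stay idle."""
    if n_tasks > capacity * len(inlet_temps):
        raise ValueError("not enough capacity")
    plan = {name: 0 for name in inlet_temps}
    remaining = n_tasks
    for name in sorted(inlet_temps, key=inlet_temps.get):
        if remaining == 0:
            break
        take = min(capacity, remaining)
        plan[name] = take
        remaining -= take
    return plan


# Hypothetical chassis inlet temperatures in degrees C.
temps = {"A": 22.0, "B": 20.0, "C": 24.0}
ut_plan = uniform_task(temps, n_tasks=14, capacity=10)
mce_plan = coolest_inlet_first(temps, n_tasks=14, capacity=10)
```

With these sample values UT keeps all three chassis active, while the coolest-inlet policy fills chassis B, places the remainder on A, and leaves C idle.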

10 Total Energy Cost Comparisons

11 System Model & Cluster Set-up
- The Saguaro cluster is the main cluster maintained by the High Performance Computing Initiative at ASU.
  - 4 racks, 5 chassis per rack, 10 dual-processors per chassis

12 Cluster Management S/W Infrastructure
- We used the Moab scheduler for job allocation in this cluster.
  - Easy to use
  - Provides a good graphical interface in the form of Moab Cluster Manager (MCM)
  - Job re-allocation is allowed based on priority
  - Uses the underlying resource management software (such as Torque) and enforces the scheduling policies (such as fair-share) selected from the GUI
- Thermal awareness is integrated into the Moab scheduler.
  - Priority is set as a function of temperature, utilization, etc.
- PHP-based datacenter visualization.
[Diagram labels: Moab Cluster Management GUI; Moab Server; Resource Management (Torque); Data Center]
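The slide does not give the actual priority function, so the following Python sketch only shows one plausible shape for "priority as a function of temperature and utilization". The linear form, the weights, and the node data are all assumptions made for illustration; it does not use or represent any Moab API.

```python
from typing import Dict, List, Tuple


def node_priority(inlet_temp_c: float, utilization: float,
                  w_temp: float = 1.0, w_util: float = 10.0) -> float:
    """Higher value means the node is preferred.

    Assumed form: cooler and less utilized nodes score higher.
    The weights are hypothetical tuning knobs.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return -(w_temp * inlet_temp_c + w_util * utilization)


def rank_nodes(nodes: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return node names ordered from most to least preferred."""
    return sorted(nodes, key=lambda name: node_priority(*nodes[name]), reverse=True)


# Hypothetical nodes: name -> (inlet temperature in degrees C, utilization).
nodes = {"n1": (30.0, 0.2), "n2": (22.0, 0.5), "n3": (22.0, 0.9)}
ranking = rank_nodes(nodes)
```

In a deployment like the one described, a score of this kind would be handed to the scheduler as the node priority; the weights decide how much a hot inlet outweighs spare capacity.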

13 Chassis Level Sensor Data Collection
- An SNMP-based script periodically queries the sensors and updates the server database
- A PHP script periodically accesses the database to present the thermal history on the webpage
Sensor placement at each chassis*
- 11 outlet temperature sensors at the back of the chassis
- 3 housing temperature sensors in the middle of the chassis
* There is only one inlet sensor at the front of the chassis
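The collection loop described here (poll sensors, store readings, read back the history) can be sketched in Python. This is a minimal illustration, not the scripts used at ASU: the SNMP query is replaced by a hypothetical `read_sensors` callable, SQLite stands in for the server database, and the table layout and sensor names are assumptions.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Optional, Tuple

# Maps (chassis, sensor) to a temperature in degrees C.
# Stand-in for the real SNMP query.
SensorReader = Callable[[], Dict[Tuple[str, str], float]]


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(ts REAL, chassis TEXT, sensor TEXT, temp_c REAL)"
    )


def poll_once(conn: sqlite3.Connection, read_sensors: SensorReader,
              now: Optional[float] = None) -> int:
    """Query all sensors once and append the readings; returns the row count."""
    ts = time.time() if now is None else now
    rows = [(ts, chassis, sensor, temp)
            for (chassis, sensor), temp in read_sensors().items()]
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def run_poller(conn: sqlite3.Connection, read_sensors: SensorReader,
               interval_s: float, iterations: int) -> None:
    """Periodic collection: poll, then sleep for interval_s seconds."""
    for _ in range(iterations):
        poll_once(conn, read_sensors)
        time.sleep(interval_s)


def thermal_history(conn: sqlite3.Connection, chassis: str,
                    limit: int = 10) -> List[Tuple[float, str, float]]:
    """What a visualization page would read: newest readings first."""
    cur = conn.execute(
        "SELECT ts, sensor, temp_c FROM readings WHERE chassis = ? "
        "ORDER BY ts DESC, sensor LIMIT ?",
        (chassis, limit),
    )
    return cur.fetchall()


def fake_reader() -> Dict[Tuple[str, str], float]:
    """Hypothetical readings used in place of a live SNMP query."""
    return {("rack1-ch1", "inlet"): 21.5, ("rack1-ch1", "outlet_01"): 33.0}
```

Keeping the sensor reader behind a callable means the storage and history code can be exercised without hardware, and a real SNMP client can be substituted later.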

14 Visualization and Scheduler Integration
- Temperature data is included as a Generic Metric (GMETRIC) in Moab.
- Node priority is set based on Moab GMETRIC data.

15 Putting it all together: Software Architecture
[Diagram labels: Presentation; Scheduling; Control; Datacenter Servers; access data from the chassis-level sensors]

16 Modularized Implementation of Thermal Awareness in Task Scheduling

17 Conclusions
- Proposed architecture
  - enables dynamic online thermal management during datacenter operation
  - provides visualization of thermal distribution
- Implemented in a fully operational ASU datacenter.
- Prototype development and demonstration at the Intel day.

Questions ??