Adaptable Approach to Estimating Thermal Effects in a Data Center Environment Corby Ziesman IMPACT Lab Arizona State University.


Adaptable Approach to Estimating Thermal Effects in a Data Center Environment Corby Ziesman IMPACT Lab Arizona State University

Outline
– Introduction to the thermal model
– Overview of architecture and neural net
– Determining metrics
– Learning phase effects
– Software architecture
– Summary
– Future work

Introduction

Thermal model part 1: $Q_{in}^{i} = Q_{sup} + Q_{cross}^{i}$
– $Q_{in}^{i}$: air intake heat for node $i$
– $Q_{sup}$: supplied cool air
– $Q_{cross}^{i}$: air heat from other nodes’ exhaust that reaches node $i$ (cross interference)

The formal abstract heat model uses coefficients to model the cross-interference levels, but these are hard to determine without disrupting data center operation, and they vary from one data center to another. A practical way to estimate these values is needed in order to determine accurately the thermal characteristics of the data center and the effects of thermal-aware scheduling decisions.
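The intake relation on this slide (supplied cool air plus exhaust heat recirculated from other nodes) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `intake_heat`, the matrix `cross` (where `cross[j][i]` is taken to be the fraction of node j's exhaust heat that reaches node i), and all numbers are hypothetical and do not come from the presentation.

```python
def intake_heat(i, supplied, exhaust, cross):
    """Intake heat of node i: supplied cool air plus recirculated exhaust.

    cross[j][i] is a hypothetical cross-interference coefficient: the
    fraction of node j's exhaust heat that reaches the intake of node i.
    """
    recirculated = sum(cross[j][i] * exhaust[j] for j in range(len(exhaust)))
    return supplied + recirculated


# Illustrative values only (arbitrary heat units), not measured data.
exhaust = [100.0, 200.0, 400.0]
cross = [
    [0.00, 0.10, 0.00],
    [0.05, 0.00, 0.20],
    [0.00, 0.25, 0.00],
]
q_in_1 = intake_heat(1, 50.0, exhaust, cross)
```

In a real data center the entries of `cross` are exactly the values that are hard to obtain, which is what motivates the estimation approach in the rest of the talk.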

Introduction
Thermal model part 2: $Q_{out}^{i} = Q_{in}^{i} + P^{i}$
– $Q_{out}^{i}$: air exhaust heat from node $i$
– $Q_{in}^{i}$: air intake heat (supplied cool air plus cross interference)
– $P^{i}$: heat from consumed electrical power

We can use a learning algorithm to estimate the cross interference from other nodes, and we can use sensors that monitor measurable values, such as air exhaust and intake temperatures (and possibly power consumption), to fine-tune the estimate. In simulation, where the previous model and known cross-interference coefficients are available, we can compare how close this approach comes to the actual values (future work).
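One way to picture "fine-tuning the estimate from sensors" is a normalized delta-rule update: predict the intake heat from the current coefficient estimates, compare with the sensor reading, and nudge the coefficients to shrink the error. This is a stand-in for the neural net, not the method used in the presentation; the function names, the learning rate, and the synthetic readings are all assumptions made for illustration.

```python
def predict_intake(supplied, exhaust, coeffs):
    """Predicted intake heat from the current coefficient estimates."""
    return supplied + sum(a * q for a, q in zip(coeffs, exhaust))


def refine(coeffs, supplied, exhaust, measured_intake, rate=0.5):
    """One normalized delta-rule step driven by the sensor error."""
    error = measured_intake - predict_intake(supplied, exhaust, coeffs)
    norm = sum(q * q for q in exhaust) or 1.0  # avoid division by zero
    return [a + rate * error * q / norm for a, q in zip(coeffs, exhaust)]


# Synthetic check: "true" coefficients are known here only because this is
# a simulation, as the slide suggests; the values are made up.
true_coeffs = [0.1, 0.3]
supplied = 50.0
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # exhaust of two other nodes

estimate = [0.0, 0.0]
for _ in range(50):
    for exhaust in samples:
        measured = predict_intake(supplied, exhaust, true_coeffs)
        estimate = refine(estimate, supplied, exhaust, measured)
```

With noiseless synthetic readings the estimate approaches `true_coeffs`; with real sensors, noise and changing airflow would limit how closely it can converge.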

Overview
– Architecture
– Neural net
– Process cycle

Finding a Proper Metric
To check the effects of scheduling decisions and readjust the weights of the neural nets, proper criteria must be determined.
Metrics:
– Number and severity of any “hot spots” in the data center environment
– Average data center ambient temperature

Finding a Proper Metric
Hot spot?
– How do we define what constitutes a hot spot, and how large an area counts as a spot?
  5 degrees above surrounding areas within a 10-foot radius?
  10 degrees above surrounding areas within a 5-foot radius?
– These values will need to be determined through experimentation (future work).
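The two candidate metrics can be made concrete with a small sketch. It assumes each sensor reports a floor position in feet and a temperature in degrees, and it treats a sensor as a hot spot when it is at least `delta` degrees above the mean of its neighbours within `radius` feet. The defaults of 5 degrees and 10 feet simply echo one of the open questions on the slide; they are placeholders, not determined values, and the function names are hypothetical.

```python
import math


def find_hot_spots(sensors, delta=5.0, radius=10.0):
    """Return (x, y, excess) for each sensor that is at least `delta`
    degrees hotter than the mean of its neighbours within `radius` feet."""
    spots = []
    for x, y, temp in sensors:
        nearby = [
            t2 for x2, y2, t2 in sensors
            if (x2, y2) != (x, y) and math.hypot(x2 - x, y2 - y) <= radius
        ]
        if nearby:  # isolated sensors have nothing to compare against
            excess = temp - sum(nearby) / len(nearby)
            if excess >= delta:
                spots.append((x, y, excess))
    return spots


def average_temperature(sensors):
    """Average ambient temperature over all sensors."""
    return sum(temp for _, _, temp in sensors) / len(sensors)


# Illustrative readings: (x feet, y feet, degrees). Not measured data.
sensors = [(0, 0, 70.0), (5, 0, 71.0), (0, 5, 78.0), (30, 30, 70.0)]
spots = find_hot_spots(sensors)
mean_temp = average_temperature(sensors)
```

The `excess` value gives a simple severity measure, so the number and severity of hot spots can both be read from the returned list.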

Can Only Improve?
It would be a great advantage if, during the learning phase of the neural net, performance were no worse than that of a scheduler that ignores thermal effects (i.e., learning would not harm overall performance; it could only improve it).

Can Only Improve?
Case 1: Neural net weights result in an action that worsens the thermal environment
– This effect is detected, the weights are adjusted, and the neural net will not reproduce this decision next time
– A thermal-agnostic approach, which does not take thermal effects into account in the first place, would also produce negative effects, but it would not avoid making the same mistake in the future
Case 2: Neural net weights result in an action that improves the thermal environment
– This effect is also detected, and the weights are adjusted so that this action is even more likely to be reproduced in the future
Overall, the learning phase should have a negligible negative impact, comparable to or smaller than the random negative outcomes that result from a thermal-agnostic approach.
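The two cases can be illustrated with a deliberately simplified stand-in for the neural net: one preference weight per node, raised when the chosen action improved the thermal metric and lowered when it worsened it. The names `choose_node` and `adjust`, the learning rate, and the metric values are hypothetical; a real neural net would update many weights through its own training rule.

```python
def choose_node(weights):
    """Pick the node with the highest current preference weight."""
    return max(range(len(weights)), key=lambda i: weights[i])


def adjust(weights, chosen, metric_before, metric_after, rate=0.5):
    """Reward the chosen node if the thermal metric fell, penalize it if
    the metric rose (a lower metric means a cooler environment)."""
    change = metric_before - metric_after  # positive means improvement
    updated = list(weights)
    updated[chosen] += rate * change
    return updated


weights = [1.0, 0.8, 0.5]

# Case 1: the action worsens the environment (72 -> 74 degrees average).
first = choose_node(weights)
weights = adjust(weights, first, 72.0, 74.0)
second = choose_node(weights)  # the penalized node is no longer preferred

# Case 2: the action improves the environment (74 -> 73 degrees average).
weights = adjust(weights, second, 74.0, 73.0)
third = choose_node(weights)  # the rewarded node stays preferred
```

Whether a single bad outcome is enough to prevent a repeat depends on the learning rate and on how far apart the weights are; in this toy example it is.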

Software Architecture
To achieve high performance, create a Node data structure for each node in the data center. Each Node contains an update() function that runs continuously, monitoring the node’s eligibility to receive new jobs. The update() function relies on other functions, such as a check() function that checks for thermal effects some time after a job has been scheduled on this node. Each update() function runs in a separate thread, so that every node’s information is autonomously and continuously updated and corrected, and the scheduler can query it as needed without a high processing-time overhead.
– It is easier to perform small calculations continually than to do them all at once: this reduces latency
– Every Node is reasonably up to date on whether it is in a good position to take on a new job
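A minimal sketch of this design follows, using Python's standard `threading` module. The `update()` and `check()` names come from the slide; everything else is an assumption made for illustration: the sensor callback `read_temperature`, the fixed temperature `limit`, the polling `interval`, and the helper methods. In particular, `check()` here only compares a reading against a limit, whereas the slide describes checking thermal effects some time after a job was scheduled.

```python
import threading


class Node:
    def __init__(self, name, read_temperature, limit=80.0, interval=0.01):
        self.name = name
        self._read_temperature = read_temperature  # hypothetical sensor callback
        self._limit = limit
        self._interval = interval
        self._lock = threading.Lock()
        self._eligible = False
        self._stop = threading.Event()
        self._first_update = threading.Event()
        self._thread = threading.Thread(target=self.update, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def check(self):
        """Simplified thermal check: is the node below its limit?"""
        return self._read_temperature() < self._limit

    def update(self):
        """Runs in its own thread, keeping the eligibility flag fresh."""
        while not self._stop.is_set():
            ok = self.check()
            with self._lock:
                self._eligible = ok
            self._first_update.set()
            self._stop.wait(self._interval)  # sleep, but wake early on stop

    def wait_first_update(self, timeout=None):
        return self._first_update.wait(timeout)

    def eligible(self):
        """Cheap read for the scheduler: no sensor access on this path."""
        with self._lock:
            return self._eligible


# Usage with made-up constant readings in place of real sensors.
nodes = [Node("a", lambda: 70.0), Node("b", lambda: 90.0)]
for node in nodes:
    node.start()
for node in nodes:
    node.wait_first_update(timeout=2.0)
ready = [node.name for node in nodes if node.eligible()]
for node in nodes:
    node.stop()
```

The scheduler only reads a cached flag, which is the latency benefit the slide describes; the cost is that the flag can be up to one polling interval out of date.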

Summary
This approach aims to approximate the formal heat model developed by Qinghui Tang. By using neural nets and multithreaded programming, we hope to create an adaptable, self-configuring, and time-efficient system that avoids excess heat and thereby reduces cooling costs. In theory, this approach can only be beneficial (if the evaluation criteria are accurate).

Future Work
– Write new code and compare results in simulation with the formal model.
– Evaluate performance (response time) and scalability.
– Determine the proper metrics that will accurately detect and avoid hot spots.
– Test on real hardware in a real environment.

Questions