Faraz Ahmad and T. N. Vijaykumar, Purdue University

Joint Optimization of Idle and Cooling Power in Data Centers While Maintaining Response Time
Faraz Ahmad and T. N. Vijaykumar, Purdue University
Presented by: Michael Moeng

Key Idea
- Techniques that reduce idle power rely on shutting servers down: load is concentrated on a few servers, which causes hot spots and increases cooling costs.
- Techniques that reduce cooling power rely on spreading load across many servers: all servers stay on at low utilization, which increases idle power.
- Joint optimization is therefore crucial for reducing overall power.

Outline
- Existing techniques
- PowerTrade
  - Static schemes
  - Dynamic scheme
- SurgeGuard
- Experimental results

Idle Power: Spatial Subsetting
- Concentrate load on as few servers as possible.
- Active servers get hotter, and cooling cost is largely dominated by the maximum temperature.
- Response time degrades.

Temperature: Inverse-Temperature Assignment
- Load servers inversely to the temperature of their location in the room.
- Example: the middle aisle is cooler, so give those servers a higher load.
- Keeps all servers on, which increases idle power.
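The assignment can be sketched as follows. This is a minimal illustration that weights each server by its headroom below an assumed redline temperature; the exact weighting used in the paper may differ.

```python
def inverse_temperature_shares(temps, redline=30.0):
    """Assign each server a load share proportional to how far its
    inlet temperature sits below a redline: cooler servers get more
    load.  The redline value and the linear weighting are assumptions
    for illustration, not taken from the talk."""
    headroom = [max(redline - t, 0.0) for t in temps]
    total = sum(headroom)
    if total == 0:
        # All servers at or above the redline: fall back to equal shares.
        return [1.0 / len(temps)] * len(temps)
    return [h / total for h in headroom]

# Middle-aisle servers are cooler, so they receive a higher share.
shares = inverse_temperature_shares([25.0, 20.0, 25.0])
```

Note that every server receives a nonzero share, which is exactly why this policy keeps all servers on and pays the idle-power cost.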

Data Center Model

DVFS: Temporal Subsetting
- Degrades response time, especially if voltage/frequency transitions are slow.
- Reduces only the CPU component of power (about 1/3 of system power).
- Future technologies may not even use DVFS to reduce power.

PowerTrade
- Define NetPower as: power_compute + power_idle + power_cooling.
- Divide the room into zones with similar temperature properties.
- Distribute load across zones, then within each zone.
- Assume hot, warm, and cool zones; propose three policies to distribute load.
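The NetPower objective is easy to state in code. The cooling term below is modeled through a CRAC coefficient of performance (CoP) curve; this quadratic CoP model is borrowed from Moore et al.'s temperature-aware scheduling work and is an assumption for illustration, not necessarily the model used in this talk.

```python
def cop(t_supply_c):
    # CRAC coefficient of performance as a function of the cold-air
    # supply temperature (deg C).  Quadratic model assumed here
    # (Moore et al.); higher supply temperature => more efficient cooling.
    return 0.0068 * t_supply_c ** 2 + 0.0008 * t_supply_c + 0.458

def net_power(p_compute, p_idle, t_supply_c):
    # Cooling power = heat removed / CoP, where the heat removed is
    # the total server power (compute + idle).
    p_cooling = (p_compute + p_idle) / cop(t_supply_c)
    return p_compute + p_idle + p_cooling
```

The tension in the paper falls out of this expression: concentrating load (less p_idle) creates hot spots that force a lower supply temperature, shrinking CoP and inflating p_cooling; spreading load does the reverse.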

PowerTrade-s: Load Across Zones
- Static scheme; assumes knowledge of the zone partitions and the expected load.
- Uses a calibration run.

PowerTrade-s: Load Across Zones (cont.)
- Calibration yields an iso-temperature load ratio across zones.
- The hot zone is kept on standby if possible.

PowerTrade-s: Load Within a Zone
- Cool zone: less prone to hot spots, so use spatial subsetting to reduce idle power.
- Warm zone: equal load distribution.
- Hot zone: kept on standby if possible; equal load distribution when it must be used.

PowerTrade-s: Limitations
- Assumes a linear relationship between load and temperature.
- Intra-zone policies do not consider both cooling and idle power.
- Ignores heat exchange between zones: calibration runs each zone in isolation.

PowerTrade-s++
- PowerTrade-s assumed a linear relationship between load and temperature.
- PowerTrade-s++ instead takes multiple calibration samples across load levels.

PowerTrade-d
- Uses server groups of ~10 servers.
- Reduces power by adjusting one group at a time until net power reaches a minimum.
- At each step, compares the potential savings from reducing cooling power against those from reducing idle power; the larger saving is selected to reduce net power.

PowerTrade-d: Analysis
- Do we reach a global minimum?
- Cooling and idle power are mostly monotonic, so there is a single global minimum.

PowerTrade-d: Analysis (cont.)
- How fast is convergence? About 3 steps per hour: load changes slowly, and the groups are relatively large.
- How are groups handled in data centers that run multiple services? Servers in a group must all handle a single service, and group size << a service's server requirement.

PowerTrade-d: Loading Increase
- Try to handle the new load with the existing servers; activate more only if necessary.
- Otherwise, the potential cooling (temperature) savings increase while the potential idle savings decrease:
  - Start a new group and redistribute load according to inverse-temperature assignment.
  - Measure the cooling improvement and the idle power degradation.
  - If there is a net benefit, try activating another group and repeat; else keep the current group count.

PowerTrade-d: Loading Decrease
- Potential cooling savings decrease, but potential idle savings increase.
- Try deactivating a group and redistribute the load.
- Measure the cooling degradation and the idle power improvement.
- If there is a net benefit, try deactivating another group and repeat.
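Both loops are instances of one greedy pattern: move the group count by one in whichever direction lowers net power, and stop at the minimum. A toy sketch under an assumed convex cost model (cooling falls and idle rises with group count; the constants are made up, and real deployments would measure these quantities rather than compute them):

```python
def net_power(n_groups):
    # Toy cost model (assumed): concentrating load on few groups makes
    # hot spots, so cooling power falls as groups are added, while each
    # active group contributes a fixed idle power.
    cooling = 4000.0 / n_groups
    idle = 100.0 * n_groups
    return cooling + idle

def powertrade_d(n_start, n_max):
    """Greedy one-group-at-a-time search for the net-power minimum,
    mirroring the activate/deactivate loops on the slides."""
    n = n_start
    while True:
        best = n
        # Compare the saving from deactivating vs. activating one group.
        for cand in (n - 1, n + 1):
            if 1 <= cand <= n_max and net_power(cand) < net_power(best):
                best = cand
        if best == n:
            return n  # local minimum = global minimum when net power is convex
        n = best
```

Because cooling and idle power move monotonically in opposite directions, the greedy walk reaches the same minimum whether the load change forces it to start high or low.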

PowerTrade-d: Group Selection
- Which server group do we choose to (de)activate?
- When deactivating, select the hottest server group.
- When activating, select the group with the lowest idle power rating (if ratings differ); otherwise select from the cool, then warm, then hot zones.

PowerTrade-d: Issues
- Server wear from repeated on/off cycles: select among eligible groups in round-robin fashion.
- Power overhead from activating servers: loading changes are infrequent, so this contributes little to total power.
- Some data centers have distinct frontend/backend servers: add state-aware groups?

SurgeGuard: Maintaining Response Time
- Increases in load require servers to transition states, which takes time.
- Two-tier scheme:
  - Overprovision to handle the maximum increase in load during the next coarse-grained time interval.
  - Replenish reserves at finer granularity to handle spikes.

Spatial Subsetting: Active Servers
- Model the data center as a GI/G/m queue.
- Use the Allen-Cunneen approximation for the mean waiting time.
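The Allen-Cunneen approximation scales the M/M/m waiting time by the squared coefficients of variation of the interarrival and service times, W_q ≈ W_q(M/M/m) · (C_a² + C_s²)/2. A sketch of sizing the active-server count with it, using the standard textbook form (the slide's exact formulation did not survive, so this is an assumed reconstruction):

```python
import math

def erlang_c(m, rho):
    # Probability that an arrival must wait in an M/M/m queue (Erlang C).
    a = m * rho  # offered load
    s = sum(a ** k / math.factorial(k) for k in range(m))
    last = a ** m / (math.factorial(m) * (1 - rho))
    return last / (s + last)

def allen_cunneen_wait(lam, mu, m, ca2, cs2):
    """Approximate mean queueing delay of a GI/G/m queue: the M/M/m
    delay scaled by the mean of the squared coefficients of variation
    of interarrival (ca2) and service (cs2) times."""
    rho = lam / (m * mu)
    if rho >= 1:
        return math.inf  # unstable: not enough servers
    wq_mm_m = erlang_c(m, rho) / (m * mu * (1 - rho))
    return wq_mm_m * (ca2 + cs2) / 2

def servers_needed(lam, mu, ca2, cs2, target_wait):
    # Smallest m whose approximate queueing delay meets the target.
    m = max(1, math.ceil(lam / mu))  # start at the stability boundary
    while allen_cunneen_wait(lam, mu, m, ca2, cs2) > target_wait:
        m += 1
    return m
```

With ca2 = cs2 = 1 the formula reduces exactly to M/M/m, which is a useful sanity check; burstier arrivals (ca2 > 1) push the required server count up.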

SurgeGuard: Coarse-Grained Overprovisioning
- Given a desired response time and load, the number of active servers can be derived.
- Overprovision by reserving extra servers, sized by:
  - the worst-case increase in load observed so far, or
  - the load change tolerable for the power-performance goal.
- The authors reserve only about 10% of the data center's servers.

SurgeGuard: Fine-Grained Overprovisioning
- What if load decreases in one interval but increases in the next?
- Run the algorithm after each mini-interval (5 minutes).
- The reserve is exceeded in only about 0.15% of all 5-minute intervals.
- Servers are only activated (never deactivated) at fine granularity, preventing server wear.
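The two-tier scheme can be sketched as a coarse reserve sized by the worst load jump observed so far, plus a fine-grained step that only tops the reserve back up. `coarse_reserve` and `surge_guard_step` are hypothetical names for illustration, not APIs from the paper:

```python
def coarse_reserve(history_deltas):
    # Size the reserve by the worst-case load increase (in servers)
    # observed so far over one coarse-grained interval.
    return max(history_deltas, default=0)

def surge_guard_step(active, needed, reserve_target):
    """One fine-grained (e.g. 5-minute) step: activate servers to
    restore the reserve if it has been eaten into, but never
    deactivate at fine granularity, to avoid on/off server wear."""
    deficit = needed + reserve_target - active
    return active + max(deficit, 0)
```

Deactivation happens only at the coarse boundary, which is how the scheme keeps response time bounded without cycling servers on every load wiggle.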

Experimental Results: Net Power Simulate a data center with 1120 server blades

Experimental Results: Net Power
- PowerTrade shows improvement over the better of spatial subsetting and inverse-temperature assignment alone.

Experimental Results: Net Power Breakdown

Experimental Results: Temperature Variation

Experimental Results: Response Time

Experimental Results: Response Time

Questions/Comments?