Ramya (UCSB), Parthasarathy et al. (HP Labs)

Presentation transcript:

Overview Power delivery, consumption and cooling problems in a data center are currently tackled by several systems that each address a separate aspect of these problems, either locally or globally, in hardware or in software. When these systems are deployed simultaneously, the policies of one tend to interfere with those of the others.

Overview… The lack of coordination amongst such systems leads to undesirable consequences. This paper proposes a Global Power Management Solution that coordinates these individual solutions.

Classifying the existing power management solutions.. Approach used: localized or distributed resource management, VMs. Power control: voltage scaling, power states, turning off machines. Implementation scope: server, cluster, or data center level. Optimization requirements and constraints: is performance loss acceptable? Are power budget violations allowed?

In a nutshell.. Tracking problem: optimize power consumption while delivering the required performance. Capping problem: optimize power provisioning and cooling so as not to violate the power budget. Optimization problem: maximize power savings while minimizing performance loss (ACPI states, VMs, etc.).

Representative Power Management Solutions Efficiency Controller (EC, tracking): optimizes per-server average power consumption. It adjusts ACPI P-states based on past resource usage to meet estimated future demand. Server Manager (SM, capping): moves a server to a lower-power P-state when its power budget is violated.

Representative solutions.. Enclosure Manager (EM): thermal power capping at the blade enclosure level. Group Manager (GM): the same at the rack or data center level. Both monitor power usage across a set of machines and re-provision power to maintain the group power budget (set manually or mandated by higher-level power managers).

Representative solutions.. Virtual Machine Controller (VMC): reduces average power usage across a set of machines by consolidating workloads, turning off idle machines, etc.

Power Struggles.. What happens if these solutions are deployed simultaneously?

Power Struggles - examples The EC and the SM both operate on the same knob/actuator (the P-state) but for different metrics. If uncoordinated, the EC can override the SM's setting, leading to power budget violations and eventual thermal failover: a correctness issue.

Examples.. If the VMC and group cappers are uncoordinated, the VMC can consolidate more capacity onto a collection of servers than allowed by the group power budget. In addition to excessive performance violations (inefficiency), the VMC can potentially react to the lower utilization (because of power capping) and pack even more workloads onto the server, leading to a vicious cycle and system instability

Design Challenges of a Coordination System Interaction between different controllers (EC, SM, EM, etc) must maintain correctness, stability and efficiency. Global Awareness of the presence of other controllers while having minimal/zero knowledge of their properties. Adaptability and Scalability – new controllers with same/different properties, new applications, etc.

Design Challenges - Sensitivity Issues. Overlapping functionalities and policies of controllers: can they be mitigated? Is the Coordinated Management System agnostic to the deployed systems and applications (workloads)?

The Design

The Design.. Use of feedback control loops. Measure the required metric, compare with the reference value and manipulate the actuator based on the error so that the output follows the reference.
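The loop described on this slide can be sketched as follows. This is only an illustration, assuming a simple integral controller; the function names, the `gain` parameter and the toy plant are hypothetical and do not come from the slides.

```python
def feedback_step(actuator, measured, reference, gain):
    """One iteration of an integral feedback loop.

    The actuator is nudged in proportion to the error so that the
    measured output moves toward the reference.
    """
    error = reference - measured
    return actuator + gain * error


def run_loop(plant, actuator, reference, gain, steps):
    """Repeatedly measure the plant output and correct the actuator."""
    for _ in range(steps):
        measured = plant(actuator)
        actuator = feedback_step(actuator, measured, reference, gain)
    return actuator


# Toy plant (assumption): the output is simply twice the actuator setting.
final = run_loop(lambda a: 2.0 * a, actuator=0.0, reference=10.0, gain=0.25, steps=50)
```

With this toy plant the error halves on every iteration, so the actuator settles at the value whose output equals the reference.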

Details.. Diagram Efficiency Controller EC: Reference utilization r_ref, actual utilization r_i. If r_i < r_ref, adjust actuator A (the P-state), i.e. move to a lower-power state, say from P0 to P4, resulting in higher utilization and lower power usage.

Details.. Diagram Server Manager SM: Power capping by measuring per-server power consumption. If current consumption exceeds the power budget, the SM increases r_ref, thereby allowing the EC to move the machine to a lower-power P-state. In effect, the EC and SM use r_ref as a communication channel.
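A minimal sketch of the SM side of this channel, assuming a proportional update of r_ref. The update rule, the gain `beta_loc` and the lower bound `r_ref_floor` are illustrative assumptions, not values or formulas given on the slide.

```python
def sm_update_r_ref(r_ref, power, cap_loc, beta_loc, r_ref_floor):
    """Server Manager step (illustrative).

    When measured power exceeds the local budget cap_loc, the term
    (cap_loc - power) is negative, so r_ref goes up and the EC is
    allowed to pick a lower-power P-state. When power is under the
    budget, r_ref relaxes back toward its floor.
    """
    new_r_ref = r_ref - beta_loc * (cap_loc - power)
    return max(r_ref_floor, new_r_ref)


# Budget violated by 20 W: r_ref rises from 0.75 to about 0.77.
raised = sm_update_r_ref(0.75, power=220.0, cap_loc=200.0, beta_loc=0.001, r_ref_floor=0.75)
# Back under budget by 20 W: r_ref relaxes, but never below the floor.
relaxed = sm_update_r_ref(raised, power=180.0, cap_loc=200.0, beta_loc=0.001, r_ref_floor=0.75)
```

The SM never touches the P-state itself in this sketch; it only moves the reference that the EC tracks, which is what keeps the two controllers from fighting over the same actuator.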

Design.. EM & GM: Same principle as the SM. They compare current power usage against the reference power budget and assign new budgets to the lower-level controllers (EM -> SM, GM -> EM) based on some policy (FIFO, random, etc.). Each lower-level controller uses the minimum of the upper-level recommendation and its own local power budget.
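The "pick the minimum" rule can be sketched as below. The function names and the dictionary layout are hypothetical, chosen only to illustrate the rule.

```python
def effective_cap(local_cap, recommended_cap):
    """A lower-level controller honors whichever budget is tighter."""
    return min(local_cap, recommended_cap)


def push_down(recommendations, local_caps):
    """Apply upper-level recommendations (e.g. EM -> SM) to each child.

    Both arguments map a child name to a power budget in watts.
    """
    return {
        name: effective_cap(local_caps[name], recommended)
        for name, recommended in recommendations.items()
    }


# The enclosure recommends 180 W and 250 W; both servers have a 200 W local cap.
caps = push_down({"s1": 180, "s2": 250}, {"s1": 200, "s2": 200})
```

Taking the minimum means a generous recommendation from above can never loosen a server's own limit, while a tight one always takes effect.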

Design.. VMCs: Use actual utilization instead of apparent utilization (100% at P0 is not the same as 100% at P3). They are supplied with the approximate power budgets at the various levels, and with data about current power budget violations at those levels (through CIM). Together, these three changes enable the VMCs to consolidate the right workloads while making sure that the consolidated servers neither violate the power budgets nor fall into the vicious cycle mentioned earlier.
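One way to read "actual versus apparent utilization" is to rescale the utilization reported at the current P-state by that state's frequency relative to P0. The sketch below assumes this interpretation; the frequency table is hypothetical and not taken from the slides.

```python
# Hypothetical P-state frequencies in GHz (assumption, for illustration only).
P_STATE_FREQ_GHZ = {"P0": 2.0, "P1": 1.8, "P2": 1.6, "P3": 1.0}


def actual_utilization(apparent, p_state, freqs=P_STATE_FREQ_GHZ):
    """Express utilization as a fraction of the server's full (P0) capacity.

    apparent is the utilization observed at the current P-state (0.0-1.0).
    """
    return apparent * freqs[p_state] / freqs["P0"]


at_p0 = actual_utilization(1.0, "P0")  # fully busy at full speed
at_p3 = actual_utilization(1.0, "P3")  # fully busy at half speed
```

A server that looks 100% busy at P3 is, under these assumed frequencies, using only half of its P0 capacity, which is why a consolidation decision based on apparent utilization alone would be misleading.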

Summary of changes to the controllers

Modeling the Controllers Power-performance model: run actual workloads on hardware at different utilization levels and measure the power and performance. Through curve-fitting of the measured data, obtain linear models that represent the controller behavior.

Modeling.. EC: the operating point is scaled up or down by λ (changes are proportional to the error in utilization). r_ref is increased by the SM when the local power budget cap_loc is violated, resulting in the EC lowering the power states of the machines.
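One update rule consistent with this description is sketched below: the frequency changes in proportion to the utilization error, scaled by λ. The exact formula, the quantization step and the frequency list are assumptions made for illustration, not equations reproduced from the slides.

```python
def ec_update_frequency(f, r, r_ref, lam):
    """Efficiency Controller step (illustrative).

    f(k+1) = f(k) - lam * (f(k) / r_ref) * (r_ref - r(k))
    If utilization r is below the reference, the frequency is lowered.
    """
    return f - lam * (f / r_ref) * (r_ref - r)


def quantize_to_p_state(f, p_state_freqs):
    """Pick the slowest available frequency that still covers f."""
    candidates = [pf for pf in p_state_freqs if pf >= f]
    return min(candidates) if candidates else max(p_state_freqs)


# Server at 2.0 GHz is only 40% utilized; the reference is 80%.
target = ec_update_frequency(2.0, r=0.4, r_ref=0.8, lam=1.0)
chosen = quantize_to_p_state(target, [2.0, 1.6, 1.2, 0.8])
```

With λ = 1 the continuous target lands at the frequency where the same demand would produce exactly the reference utilization; smaller values of λ approach that point more gradually.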

Modeling.. SM: adjusts the r_ref of the EC when power consumption exceeds the local budget cap_loc, subject to a cap determined by the β_loc factor. EM & GM: operate on a fair-share policy, where the power allocated to a component is proportional to the power it consumed in the last interval.
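The fair-share policy for the EM and GM can be sketched as a proportional split. The function name and the equal-split fallback for an all-idle group are assumptions added for this illustration.

```python
def proportional_share(group_budget, last_power):
    """Split a group budget in proportion to last-interval consumption.

    last_power maps each component to the power (watts) it drew in the
    previous interval.
    """
    total = sum(last_power.values())
    if total == 0:
        # Assumption: with no consumption history, split the budget equally.
        equal = group_budget / len(last_power)
        return {name: equal for name in last_power}
    return {name: group_budget * p / total for name, p in last_power.items()}


shares = proportional_share(300, {"a": 100, "b": 200})
idle = proportional_share(300, {"a": 0, "b": 0})
```

Each child can then combine its share with its own local budget using the minimum rule described for the EM and GM design.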

Modeling.. VMCs: a constrained optimization problem that maps n VMs to m servers (decision variable matrix X). The objective includes total power consumption and migration overhead (α_M). Server capacity constraints are taken into account.

Modeling VMCs.. Local, enclosure and group level power budget constraints are also considered. The level of consolidation is tuned by adjusting the power budget buffers based on the violations observed at the different levels.

Modeling VMCs.. Equations 1 to 6 describe a 0-1 integer optimization problem. The authors use a greedy bin-packing algorithm that yields a near-optimal solution for the placement of VMs.
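As an illustration of greedy bin packing, the sketch below uses first-fit decreasing on a single resource. It is not the authors' algorithm: it ignores power budgets and migration overhead, and all names and numbers are hypothetical.

```python
def greedy_place(vm_demand, server_capacity):
    """First-fit decreasing placement of VMs onto servers.

    vm_demand maps VM name -> demand; server_capacity maps server
    name -> capacity, in the same units. Returns VM name -> server name.
    """
    free = dict(server_capacity)
    placement = {}
    # Place the largest VMs first; each goes to the first server with room.
    for vm, demand in sorted(vm_demand.items(), key=lambda kv: kv[1], reverse=True):
        for server in free:
            if free[server] >= demand:
                placement[vm] = server
                free[server] -= demand
                break
        else:
            raise ValueError(f"no server has room for {vm}")
    return placement


vms = {"a": 50, "b": 30, "c": 40, "d": 20}
servers = {"s1": 100, "s2": 100, "s3": 100}
placement = greedy_place(vms, servers)
used = set(placement.values())
```

In this example the four VMs fit on two of the three servers, leaving the third idle so that it could be turned off.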

Evaluation How? A real-time deployment in a data center or a full-system simulation? Impractical: either would limit the set of use-case scenarios that can be studied, because of the actual system being tested. Instead, use trace-driven simulation: real-world traces of enterprise deployments enable detailed workload modeling and evaluation of tradeoffs at the policy and system levels.

Metrics used Aggregate Power Saving, performance loss and power budget violation at SM, EM and GM levels. No peak power saving is measured. No workload queuing i.e. if workload exceeds capacity, there is performance loss due to power capping. No demand carry over.
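A sketch of how two of these metrics might be computed from per-interval traces under the "no queuing, no carry-over" assumption. The precise metric definitions are not given on the slide, so the formulas and names below are assumptions.

```python
def performance_loss(demand, capacity):
    """Fraction of total demand that could not be served.

    Demand above capacity in an interval is simply lost; it is not
    queued or carried over to the next interval.
    """
    unmet = sum(max(0.0, d - c) for d, c in zip(demand, capacity))
    total = sum(demand)
    return unmet / total if total else 0.0


def budget_violation_rate(power, budget):
    """Fraction of intervals in which power exceeded the budget."""
    return sum(1 for p in power if p > budget) / len(power)


loss = performance_loss([50, 120, 80], [100, 100, 100])
violations = budget_violation_rate([190, 210, 205, 180], 200)
```

Here 20 of 250 demand units are lost in the second interval, and the budget is exceeded in two of the four intervals.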

Experimentation 180 workload traces (databases, web servers, remote desktops, e-commerce, etc.). Different types of mixes (real and synthetic) are created from this set to exercise different utilization scenarios. Systems under test (SUT): a low-power blade server A and an entry-level 2U server B. Experiments use different power budgets and also study the sensitivity of this architecture by varying the time constants.

Power – Performance models for Blade A and Server B

Results Baseline: No power management

Results.. Base results: Coordinated: 64% reduction in power consumption, 3% performance degradation and 5% power budget violation. Uncoordinated: 12% performance loss and 7% budget violation. Sensitivity to different systems: Blade A has 5 P-states over a higher power range; Server B has 6 P-states over a lower power range. Blade A's absolute power savings exceed Server B's, which implies that the range of power control matters more than its granularity.

Results.. Variation across workloads: At low utilization, the VMC is the major contributor to savings (assuming idle machines are turned off). As utilization increases, the benefits of the VMC decrease and the combination of EC and VMC does better (i.e. a coordinated solution is better than a single controller). If idle machines are not switched off, savings drop significantly!