CloudCom 2012: Self-Adaptive Management of the Sleep Depths of Idle Nodes in Large Scale Systems to Balance Between Energy Consumption and Response Times


CloudCom 2012
Self-Adaptive Management of the Sleep Depths of Idle Nodes in Large Scale Systems to Balance Between Energy Consumption and Response Times
Yongpeng Liu (1), Hong Zhu (2), Kai Lu (1), Xiaoping Wang (1)
(1) School of Computer Science, National University of Defense Technology, Changsha, P. R. China
(2) Department of Computing and Communication Technologies, Oxford Brookes University, Oxford, U.K.

MOTIVATION
Large scale high performance computing systems consume a tremendous amount of energy:
- The average power consumption of the Top10: 4.34 MW
- The peak power consumption of the K computer: [value missing] MW
- Comparable to the power usage of a middle scale city
Power management is essential for cloud computing:
- In 2006, US data centers: ≈ 61 billion kWh (≈ 4.5 billion U.S. $, ≈ 15 typical power plants)
- In 2007, global cloud computing: 623 billion kWh (> the electricity demand of India, the 5th largest demand country in the world)
- The power consumption of an idle node: about 50% of its peak power

ENERGY EFFICIENCY OF TOP10 (JUNE 2012)

AVAILABILITY OF HARDWARE SUPPORT
Dynamic sleep mechanism:
- S_0: Active
- S_1: Sleep 1
- S_2: Sleep 2
- ...
- S_(n-1): Sleep n-1
- S_n: Shut down

Data of a typical node:

Sleep state   Power consumption (Watts)   Time delay (seconds)
S0            207                         0
S1            171                         2
S3            32                          10
S4            [missing]                   [missing]
S5            0                           [missing]

THE RESEARCH PROBLEM
Key features of the dynamic sleep mechanism:
- The deeper the node sleeps, the less power it consumes (always less than idling in the active state).
- The deeper the node sleeps, the longer it takes to wake up.
Question: how to balance between performance and energy consumption?
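The trade-off stated above can be made concrete with a minimal Python sketch. It assumes the flattened figures in the node data group as S0 = 207 W / 0 s, S1 = 171 W / 2 s and S3 = 32 W / 10 s; that grouping, the names `SLEEP_STATES`, `idle_cost` and `saving_vs_active`, and the decision to ignore any energy spent during state transitions are all assumptions made for illustration, not part of the slides.

```python
# (power in watts, wake-up delay in seconds) per sleep state.
# ASSUMPTION: digit grouping read from the flattened node data table.
SLEEP_STATES = {
    "S0": (207, 0),   # active, idling
    "S1": (171, 2),
    "S3": (32, 10),
}


def idle_cost(state, idle_seconds):
    """Return (energy in joules, wake-up delay in seconds) for one idle node.

    Energy is simply power * time; transition energy is ignored (assumption).
    """
    power_watts, wake_delay = SLEEP_STATES[state]
    return power_watts * idle_seconds, wake_delay


def saving_vs_active(state, idle_seconds):
    """Fraction of idle energy saved relative to staying idle in S0."""
    active_energy, _ = idle_cost("S0", idle_seconds)
    state_energy, _ = idle_cost(state, idle_seconds)
    return 1 - state_energy / active_energy
```

Under these assumed figures, a deeper state always lowers the energy term and always raises the delay term, which is exactly the tension the research question asks how to balance.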

RELATED WORKS
Single sleep state (multiple sleep states are not used):
- Server consolidation
  - Finding an active portion of the cluster dynamically
  - The idle remainders are simply turned off
- (Xue, et al., 2007)
  - Active resource pools whose capacity is determined by the workload demand
  - Spare nodes are simply turned off
Multiple sleep states:
- (Gandhi, Harchol-Balter and Kozuch, 2011)
  - Does not dynamically manage the sleep depth of idle servers
- (Horvath and Skadron, 2008)
  - Predict the incoming workload based on history
  - Select a number of spare servers for each power state according to heuristic rules
  - Extra spare servers are put in the deepest possible sleep states

THE PROPOSED MODEL
ASDMIN: Adaptive Sleep Depth Management of Idle Nodes
The Structure of ASDMIN

THE MANAGEMENT ALGORITHMS
Notation:
- B_i: level i reserve pool
- R_i: reserve capacity threshold
- t_i: continuous time period without piercing
- T_i: state continuance threshold

Resource allocation and reclaim:
- Allocation: allocate nodes from the top level(s) of the resource pool(s).
- Reclaim: place nodes into the top level resource pool.

Changing the states of idle nodes:
- Upgrading (called after allocation):
    For i from the top level to the bottom level do
        if N_i < R_i, move (R_i - N_i) nodes from B_(i-1) into B_i
- Downgrading:
    For i from the top level to the bottom level do
        if ((t_i > T_i) && (N_i > R_i)), move (N_i - R_i) nodes of B_i to B_(i-1)
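The upgrading and downgrading loops can be sketched in Python as follows. This is an illustrative reading of the slide, not the authors' implementation: the `Pool` class and its field names are hypothetical, N_i is assumed to be the number of idle nodes currently in B_i, pools are stored bottom level first (index 0 is the deepest sleep level), and capping a move at the number of nodes the lower pool actually holds is an added assumption, since the slide does not say what happens when B_(i-1) is too small.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Level-i reserve pool B_i (hypothetical structure for illustration)."""
    nodes: list = field(default_factory=list)  # idle nodes in B_i; N_i = len(nodes)
    reserve: int = 0          # R_i: reserve capacity threshold
    quiet_time: float = 0.0   # t_i: continuous time period without piercing
    hold_time: float = 0.0    # T_i: state continuance threshold


def upgrade(pools):
    """Top level to bottom level: refill B_i from B_(i-1) while N_i < R_i."""
    for i in range(len(pools) - 1, 0, -1):
        shortfall = pools[i].reserve - len(pools[i].nodes)
        if shortfall > 0:
            # ASSUMPTION: move at most what the lower pool can supply.
            for _ in range(min(shortfall, len(pools[i - 1].nodes))):
                pools[i].nodes.append(pools[i - 1].nodes.pop())


def downgrade(pools):
    """Top level to bottom level: push surplus nodes of B_i down to B_(i-1)
    once the pool has gone unpierced for longer than T_i."""
    for i in range(len(pools) - 1, 0, -1):
        pool = pools[i]
        surplus = len(pool.nodes) - pool.reserve
        if pool.quiet_time > pool.hold_time and surplus > 0:
            for _ in range(surplus):
                pools[i - 1].nodes.append(pool.nodes.pop())
```

Because the loop runs from the top level downward, a single `upgrade` pass cannot pull nodes through an empty intermediate pool; in this sketch a second pass after the next allocation completes the refill.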

ADJUSTMENT OF RESERVE CAPACITY THRESHOLD
Piercing a reserve pool:
- A reserve pool is pierced at a given moment if all the nodes in the pool are allocated but the resources are still insufficient to meet the need. In this case, at least one node in the lower level reserve pool is used.
Algorithm (invoked after each resource allocation):
- When a reserve pool is pierced, its reserve capacity threshold R_i is increased.
- When residual nodes remain in a reserve pool after it has provided enough nodes, its reserve capacity threshold R_i is decreased.
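A minimal Python sketch of this adjustment rule follows. It assumes that piercing grows R_i and that leftover nodes shrink it, which is the reading under which the threshold adapts in both directions. The slide does not say by how much R_i changes, so the `step` and `floor` parameters, like the function name `adjust_reserve`, are hypothetical.

```python
def adjust_reserve(reserve, requested, available, step=1, floor=0):
    """Return the new reserve capacity threshold R_i for one pool.

    reserve   -- current R_i
    requested -- nodes this pool was asked to provide in the allocation
    available -- nodes the pool held just before the allocation
    step      -- ASSUMPTION: fixed adjustment size (not given in the slides)
    floor     -- ASSUMPTION: lower bound so R_i never goes negative
    """
    if requested > available:
        # Pierced: every node was allocated and the need was still not met.
        return reserve + step
    if requested < available:
        # Residual nodes remain after the pool provided enough nodes.
        return max(floor, reserve - step)
    return reserve  # exactly enough: leave R_i unchanged
```

For example, `adjust_reserve(4, 6, 4)` models a pierced pool and `adjust_reserve(4, 2, 4)` models a pool left with two residual nodes.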

IMPLEMENTATION AND EVALUATION
Parallel Workload Archive [14]:
- Dozens of workload logs on real parallel systems.
- Each log contains the following job information: submit time, wait time, run time, and number of allocated processors.
- From this information and the system scale, one can work out the number of allocated (and therefore idle) nodes in the system at each second.
The ANL Intrepid log:
- 40,960 quad-core nodes. This is the largest system scale among all published logs.
Simulations:
- Simulations start at time 0 of the log.
- The data of the first 24 hours are neglected, to avoid the fill-up effect at the start of the log.
- The workload data of the following 48 hours are used.
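The derivation of per-second node usage from the job records can be sketched in Python as below. The record layout `(submit, wait, run, procs)`, the function names, and the conversion of processors to nodes by rounding up with four processors per node (taken from "quad-core nodes") are assumptions for illustration; the slides do not describe how the authors performed this step.

```python
from itertools import accumulate


def busy_nodes_per_second(jobs, horizon, procs_per_node=4):
    """Return a list where entry s is the number of nodes in use during second s.

    jobs    -- iterable of (submit, wait, run, procs), times in whole seconds
    horizon -- number of seconds to cover, starting at time 0 of the log
    """
    delta = [0] * (horizon + 1)
    for submit, wait, run, procs in jobs:
        start = submit + wait               # job starts after its wait time
        end = min(horizon, start + run)     # clip to the simulated window
        if start >= end:
            continue                        # outside the window or zero length
        nodes = -(-procs // procs_per_node)  # ceiling division (assumption)
        delta[start] += nodes
        delta[end] -= nodes
    # Prefix sum of start/end events gives the usage at each second.
    return list(accumulate(delta[:horizon]))


def idle_nodes_per_second(jobs, horizon, total_nodes, procs_per_node=4):
    """Idle nodes per second for a system of total_nodes nodes."""
    busy = busy_nodes_per_second(jobs, horizon, procs_per_node)
    return [total_nodes - b for b in busy]
```

With `total_nodes=40960` this yields the idle-node series for a system of the Intrepid scale; skipping the first 24 hours then amounts to slicing the result from second 86,400 onward.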

WORKLOAD OF THE ANL INTREPID LOG
There is a large number of idle nodes for about 94.79% of the time.

SIMULATION ENVIRONMENT
Compute node: the Tianhe-1A
- Two 6-core Xeon CPUs and 8 GB DIMMs
Simulation scenarios:
- Flat reserve pool structures (S0, S1, S3, S4)
- Hierarchical reserve pool structure (ASDMIN)
The measurement and metrics:
- Performance:
- Power efficiency:

MAIN RESULTS 1: COMPARISON ON POWER EFFICIENCY

MAIN RESULTS 2: COMPARISON ON PERFORMANCE

THE SELF-ADAPTIVE BEHAVIOUR

MAIN RESULTS 3: OVERALL EFFECTS
Figure labels: 84.12%, 87.44%, 8.85%

CONCLUSION AND FUTURE WORK
Conclusion:
- The simulation experiments demonstrated that our solution can reduce the power consumption of idle nodes by 84.12% at the cost of a slowdown rate of only 8.85%.
Future work:
- Conducting more experiments with the system in order to gain a full understanding of the relationships between the various parameters.
- Exploring combinations of various policies for selecting idle nodes when downgrading and upgrading sleep states.

THANK YOU