Power-aware Resource Allocation for CPU- and Memory-Intense Internet Services. Vlasia Anagnostopoulou, Susmit Biswas, Heba Saadeldeen, Ricardo Bianchini, Tao Yang, Diana Franklin, Frederic T. Chong

Presentation transcript:

Power-aware Resource Allocation for CPU- and Memory-Intense Internet Services
Vlasia Anagnostopoulou, Susmit Biswas, Heba Saadeldeen, Ricardo Bianchini, Tao Yang, Diana Franklin, Frederic T. Chong
University of California, Santa Barbara
First E2DC Workshop, 08/05/2012

CPU- and Memory-Intense Internet Services
MapReduce, Hadoop, …
Latency-bound
Intense computation (=> high CPU utilization)
Petascale data

Datacenter clusters

Datacenter cluster operation

Challenges
Standard middleware algorithms are inefficient for CPU- and memory-intense internet services
Resource allocation operates at a fine granularity
  – But is oblivious of the SLA
Power management is SLA-aware
  – But is only driven by the CPU
  – Coarse-grained
Request distribution does not operate at a resource granularity

Overview of solution
SLA-aware and fine-grained
Two steps:
  – Configure states of servers (basic power-aware resource allocation)
  – Allocate resources to servers (CPU and memory)
Standard Middleware: Resource Allocation, Power Management, Request Distribution
Optimized Middleware: Power-aware Resource Allocation for CPU and memory, Adjusted Request Distribution

Contents
Introduction
Power-aware Resource Allocation
  – Basic
  – With Support for Multiple Applications
  – Adjusted Request Distribution
Methodology
Experiments
Conclusion

Basic Power-aware Resource Allocation
Configure server states:
  – Active, off, low-power state
Problem of memory being inaccessible
  – Internet services have high memory demand (for caching)
Solution: use a memory-active, low-power state (barely-alive)
  – Memory is on
  – Server is not operational, but memory can be remotely accessed
  – Memory contributes to global cache

Details of Barely-alive state

Basic Power-aware Resource Allocation
Calculations:
Active servers to service load
  – N_cpu_act = Load_demand / Cpu_capacity
Memory-active servers to satisfy memory demand
  – Active or barely-alive
  – N_mem_act = Memory_demand / Mem_capacity
Configure to maximize energy savings, or to maximize memory allocation

Example
N = 5 servers
CPU capacity = 1,000 conn.
Mem. capacity = 1 GB
Load = 3,000 conn.
Target mem. alloc. = 4 GB
Maximize energy savings:
Maximize memory alloc.:
Mem. usage: 0.8 GB/server
How to control the memory allocation?
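The two calculations and the two configuration objectives can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `configure` and `ClusterConfig` are hypothetical, rounding up with `ceil` is assumed, and so is the rule for each objective (for energy savings, only as many barely-alive servers as the memory demand requires; for memory allocation, every non-active server is kept barely-alive).

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterConfig:
    """Number of servers in each state (hypothetical container)."""
    active: int
    barely_alive: int
    off: int


def configure(n_servers, load, cpu_capacity, mem_demand, mem_capacity, objective):
    """Pick server states from the two capacity calculations on the slide.

    objective is "energy" (maximize energy savings) or "memory"
    (maximize memory allocation). Both rules are assumptions.
    """
    # N_cpu_act = Load_demand / Cpu_capacity, rounded up (assumed)
    n_cpu_act = math.ceil(load / cpu_capacity)
    # N_mem_act = Memory_demand / Mem_capacity, rounded up (assumed)
    n_mem_act = math.ceil(mem_demand / mem_capacity)
    if max(n_cpu_act, n_mem_act) > n_servers:
        raise ValueError("demand exceeds cluster capacity")

    active = n_cpu_act
    if objective == "energy":
        # Keep memory on only where the memory demand requires it.
        barely_alive = max(0, n_mem_act - n_cpu_act)
    elif objective == "memory":
        # Keep memory on everywhere; no server is switched off.
        barely_alive = n_servers - active
    else:
        raise ValueError("objective must be 'energy' or 'memory'")
    return ClusterConfig(active, barely_alive, n_servers - active - barely_alive)


# Numbers from the example slide: 5 servers, 1,000 conn. and 1 GB per server,
# 3,000 conn. of load, 4 GB target memory allocation.
energy = configure(5, 3000, 1000, 4, 1, "energy")
memory = configure(5, 3000, 1000, 4, 1, "memory")
print(energy)
print(memory)
```

Under these assumptions the energy objective leaves one server off, while the memory objective keeps memory on in all five servers. Spreading the 4 GB target uniformly over five servers gives 0.8 GB per server, which appears to be what the slide's "Mem. usage" figure refers to.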

Memory Allocation for SLA
Two objectives:
1) Allocate memory for SLA
2) Share memory among services with SLA guarantees
  – Must be fair; accept priority
  – Guarantee minimum performance
Characteristics:
Uniform allocation per server (to avoid imbalance)
Memory performance monitoring capability which is SLA-aware

Memory Allocation for SLA
Utilize stack algorithm [Mattson]
  – Measures contribution of memory size to the hit-rate
  – Hit-rate is used as proxy of performance
Server-level: calculate allocation for target hit-rate
  – Attach SLA mapping
Cluster-level: calculate average size for target hit-rate

Size  Hits  Hit-ratio
1     6/9   66.7%
2     1/9   77.8%
3     0/9   77.8%

Server  Size
…       …
5       2
Avg:    2

SLA: #3, #2

How to allocate memory when constrained?
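A minimal Python sketch of the stack (LRU stack-distance) idea may help here. It is an illustration, not the authors' code: the function names are hypothetical, and the nine-access trace is invented purely so that the resulting hit counts match the small table on the slide.

```python
from collections import Counter


def stack_distance_histogram(trace):
    """Count hits per LRU stack depth (1 = most recently used)."""
    stack = []          # index 0 holds the most recently used key
    hist = Counter()
    for key in trace:
        if key in stack:
            depth = stack.index(key) + 1   # a cache of size >= depth would hit
            hist[depth] += 1
            stack.remove(key)
        stack.insert(0, key)               # key becomes most recently used
    return hist


def hit_ratio_curve(trace, max_size):
    """Cumulative hit ratio for every cache size from 1 to max_size."""
    hist = stack_distance_histogram(trace)
    curve, hits = {}, 0
    for size in range(1, max_size + 1):
        hits += hist[size]
        curve[size] = hits / len(trace)
    return curve


def min_size_for_target(curve, target_hit_ratio):
    """Smallest size whose hit ratio reaches the target, or None."""
    for size in sorted(curve):
        if curve[size] >= target_hit_ratio:
            return size
    return None


def cluster_average_size(per_server_sizes):
    """Cluster-level step: average of the per-server sizes."""
    return sum(per_server_sizes) / len(per_server_sizes)


# Hypothetical trace: 6 hits at depth 1, 1 hit at depth 2, 2 cold misses.
trace = list("aaaabbbba")
curve = hit_ratio_curve(trace, 3)
for size, ratio in curve.items():
    print(size, f"{ratio:.1%}")
```

One pass over the trace yields the hit ratio for every cache size at once, which is why the stack algorithm suits per-server monitoring: the server can look up the size needed for a target hit ratio without re-running the workload.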

SLA/Memory Sharing
Aggregate metric of performance: sum of allocations which yield performance closest to SLA
Linear optimization problem to maximize aggregate performance: at each step, allocate memory so as to minimize the aggregate metric, subject to
  – memory capacity constraint
  – guaranteed minimum SLA for each app

Size  Hit-ratio  SLA
1     66.7%      #…
…     …          #…
…     …          #2

{app1, app2} => Target SLA {#2, #2}
dist_to_SLA_alloc = ∞
dist_to_SLA_alloc = 1
dist_to_SLA_alloc = 0
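The stepwise allocation can be illustrated with a small greedy sketch in Python. Several points here are assumptions rather than facts from the slides: `dist_to_sla_alloc` is taken to mean the number of memory units still missing to reach the target hit ratio (infinite if the target is unreachable), the greedy rule serves the application closest to its target first, and the hit-ratio curve of `app2` is invented for the example.

```python
import math


def dist_to_sla_alloc(curve, alloc, target_hit_ratio):
    """Memory units still missing to reach the target (assumed definition)."""
    for size in sorted(curve):
        if curve[size] >= target_hit_ratio:
            return max(0, size - alloc)
    return math.inf  # no size on the curve reaches the target


def share_memory(curves, min_targets, targets, capacity):
    """Greedy sketch: guarantee minimum SLAs, then close the gap to the targets."""
    # Step 1: give every app the allocation needed for its minimum SLA.
    alloc = {}
    for app, curve in curves.items():
        need = dist_to_sla_alloc(curve, 0, min_targets[app])
        if math.isinf(need):
            raise ValueError(f"minimum SLA of {app} is unreachable")
        alloc[app] = need
    if sum(alloc.values()) > capacity:
        raise ValueError("capacity cannot cover the minimum SLAs")

    # Step 2: hand out the remaining units one at a time.
    free = capacity - sum(alloc.values())
    while free > 0:
        pending = {app: dist_to_sla_alloc(curves[app], alloc[app], targets[app])
                   for app in curves}
        candidates = [app for app, d in pending.items() if 0 < d < math.inf]
        if not candidates:
            break  # every reachable target is already met
        closest = min(candidates, key=lambda app: pending[app])
        alloc[closest] += 1
        free -= 1
    return alloc


curves = {
    "app1": {1: 0.667, 2: 0.778, 3: 0.778},  # hit ratios from the slide's table
    "app2": {1: 0.5, 2: 0.6, 3: 0.9},        # hypothetical second application
}
result = share_memory(curves,
                      min_targets={"app1": 0.6, "app2": 0.5},
                      targets={"app1": 0.75, "app2": 0.9},
                      capacity=4)
print(result)
```

Serving the closest application first is one possible tie-breaking choice: each unit reduces the summed distance by the same amount, but this order maximizes the number of applications that fully reach their target when memory runs out. A real deployment would more likely hand the constraints to a linear-programming solver, as the slide suggests.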

Request Distribution Processing…

Adjusted Request Distribution Processing…

Contents
Introduction
Power-aware Resource Allocation
  – Basic
  – With Support for Multiple Applications
  – Adjusted Request Distribution
Methodology
  – Simulator
  – Traces
Experiments
Conclusion

Methodology
Datacenter-cluster simulator:
  – 1 rack
  – Trace-based functional simulator
Simulate all standard and proposed middleware algorithms
Traces:
  – Internet-search “snippet” generator

Contents
Introduction
Power-aware Resource Allocation
  – Basic
  – With Support for Multiple Applications
  – Adjusted Request Distribution
Methodology
  – Simulator
  – Traces
Experiments
  – Basic Algorithm
  – Shared Cluster
Conclusion

Experiments – Basic Algorithm
Evaluate various configuration objectives:
  – Barely-alive (BA): maximize memory allocation
  – Mixed: maximize energy savings
Fix SLA, evaluate energy savings only. Also, evaluate residual memory.
SLA #1, #2, #3: response time degradation 1-2%, 2-3%, 3-4%
Aggressiveness of consolidation: 50, 70, 85%

System    Active  Off  Barely-alive
Baseline  Y       N    N
On/Off    Y       Y    N
BA        Y       N    Y
Mixed     Y       Y    Y

Results – Basic Algorithm
Mixed system has highest energy savings: up to 42% (24% over On/Off)
BA: up to 34% (20% over On/Off)

Results – Basic Algorithm
Mixed system is most stable
In the barely-alive system, savings depend on the SLA level; the aggressiveness parameter can be pushed for more savings
On/Off system savings are influenced by both parameters, and degrade significantly at high SLA levels

Results – Basic Algorithm
BA: up to 7.5 GB of extra memory, which can be allocated to another application, transitioned to a low-power state, etc.

Results – Cluster Sharing

Results – Cluster sharing

Contents
Introduction
Power-aware Resource Allocation
  – Basic
  – With Support for Multiple Applications
  – Adjusted Request Distribution
Methodology
  – Simulator
  – Traces
Experiments
  – Basic Algorithm
  – Shared Cluster
Conclusion

Conclusion
Combine power management and resource allocation => power-aware resource allocation
SLA-driven, fine-grained management of datacenter clusters
  – Performance guarantees + energy savings
Flexibility to apply different optimizations for different datacenter scenarios
Achieve deep energy savings, or the potential for more memory utility out of the cluster
Holistic design of middleware software

Thank you for your attention!!! Questions? Contact: URL: