André Bauer, Johannes Grohmann, Nikolas Herbst, and Samuel Kounev


Motivation
- Cloud computing is a paradigm that addresses the increasing scale and complexity of modern web services
- To guarantee a reliable service, most applications run with a fixed amount of resources
  - Unnecessary cost if the system is not fully utilized
  - Bad performance if unexpected load peaks appear
- Accurate auto-scalers are therefore required, which reconfigure the system according to its load

Outline: Motivation, Approach, Evaluation, Conclusion

State of the Art
- In academia, auto-scalers can be classified into 5 groups [Lorido-Botran14]:
  - Threshold-based rules [Maurer11]
  - Queueing theory [Urgaonkar08]
  - Control theory
  - Reinforcement learning [Rao09]
  - Time series analysis [Adhikari12, Iqbal11]
- In practice, cloud providers offer reactive threshold-based auto-scalers
  - AWS EC2 scales based on metrics such as CPU utilization, disk, or network I/O

Problem: Auto-Scaling Input
- Existing auto-scalers assume as input either
  - Measured resource utilization metrics (e.g., CPU load, number of disk read operations, ...)
  - An estimate of the processing speed of the resources (e.g., number of jobs that can be served per unit of time, ...)
- Problems when choosing the observed resource metrics:
  - The bottleneck resource may not be known at configuration time
  - Some metrics require deep system knowledge, which is infeasible in dynamic cloud deployments
  - Measured utilizations are bounded at 100%
- Based on our experiments, we recommend using service demands as input for auto-scaling

Service Demand
- A service demand is the time a unit of work (e.g., a request) spends obtaining service from a resource (e.g., CPU or hard disk) in a system, excluding waiting time [Spinner15]
- Direct measurement of service demands (e.g., TimerMeter [Kuperberg09])
  - Requires specialized instrumentation to monitor low-level statistics
  - Causes overhead through monitoring
  - Misses unaccounted work in system or background threads
- Statistical estimation of service demands
  - Requires only coarse instrumentation
  - Different methods: e.g., Kalman filtering [Wang12], regression, ...

Service Demand Estimation
- We maintain a library for service demand estimation [Spinner14]
  - Ready-to-use implementations of 8 existing approaches
  - Allows offline and online estimation
- For our experiments, we use an approach based on the Service Demand Law [Menascé04], which requires only the average CPU utilization and the throughput of each workload class:

  D_{i,c} = U_{i,c} / X_{0,c}

  where i is the resource, c the workload class, D_{i,c} the service demand, U_{i,c} the utilization, and X_{0,c} the system throughput
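As a minimal sketch (illustrative names, not the LibReDE implementation), the Service Demand Law can be applied directly to two monitored averages:

```python
# Sketch of service demand estimation via the Service Demand Law,
# D_{i,c} = U_{i,c} / X_{0,c}. Names and values are illustrative.

def estimate_service_demand(utilization: float, throughput: float) -> float:
    """Estimate the service demand of a workload class at a resource.

    utilization: average utilization of the resource attributed to the
                 workload class (0.0 - 1.0)
    throughput:  average system throughput of the class (requests/second)
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilization / throughput

# Example: a CPU is 40% utilized by a class served at 4 requests/second,
# so each request demands 0.1 s of CPU time.
print(estimate_service_demand(0.40, 4.0))  # 0.1
```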

Evaluation Design
- We compare 3 cases:
  1. State-of-the-art reactive auto-scaler based on CPU utilization
  2. Auto-scaler based on system utilization derived from queueing theory
  3. No auto-scaling, i.e., 75% of the available VMs fixed
- Auto-scaling scenarios:
  1. Hardware contention scenario: matrix decomposition (CPU limitation)
  2. Software contention scenario: limited content pool
  3. Software and hardware contention scenario (1 + 2)

Auto-Scaler
- The utilization ρ is either
  - The measured average CPU utilization, or
  - The calculated average system utilization based on the Service Demand Law

Evaluation Setup
- Each scenario is deployed in a CloudStack-based private cloud
- Each scenario application is deployed on an Apache Tomcat
- The middleware for each scenario is WildFly, with interfaces for gathering application knowledge

  Criteria          Server          Worker VM
  Model             HP DL160 Gen9   m4.large
  Operating system  XenServer       CentOS 6.5
  CPU               8 cores         2 vCores
  Memory            32 GB           4 GB
  Amount            8               1-20

Experiment Controller
- Evaluation with the BUNGEE experiment controller [Herbst14]
- Workload: Retailrocket trace
  - Anonymized online shop data from www.kaggle.com
  - Between 40 and 200 requests per second
  - 2 days compressed to 6.5 hours

Evaluation Metrics
- Elasticity metrics based on the demanded (d_t) and supplied (s_t) resources at time t
- User metrics
  - Average response time
  - Number of SLO violations (requests taking longer than 2 seconds)
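A sketch of elasticity metrics in the spirit of the BUNGEE benchmark, computed from the demand and supply curves; the exact normalization used in the evaluation may differ, so treat the formulas below as one common formulation (under-/over-provisioning accuracy normalized by demand, plus the share of time spent in each state):

```python
# Hedged sketch of elasticity metrics over equally spaced samples of
# demanded (d) and supplied (s) resources; not the exact BUNGEE formulas.

def elasticity_metrics(d, s):
    n = len(d)
    under = [max(dt - st, 0) for dt, st in zip(d, s)]  # missing resources
    over = [max(st - dt, 0) for dt, st in zip(d, s)]   # surplus resources
    return {
        # average relative under-/over-provisioning, normalized by demand
        "accuracy_U": sum(u / dt for u, dt in zip(under, d)) / n,
        "accuracy_O": sum(o / dt for o, dt in zip(over, d)) / n,
        # share of time the system is under-/over-provisioned
        "timeshare_U": sum(1 for u in under if u > 0) / n,
        "timeshare_O": sum(1 for o in over if o > 0) / n,
    }

# Example: under-provisioned at t=1 (1 of 6 VMs missing),
# over-provisioned at t=2 (1 VM surplus on a demand of 8).
demand = [4, 6, 8, 6]
supply = [4, 5, 9, 6]
m = elasticity_metrics(demand, supply)
```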

Hardware Contention Scenario

  Configuration   Service Demand   CPU
  Up threshold    90%              90%
  Down threshold  70%              70%

  Metric               Service Demand   CPU
  accuracy_U (θ_U)     7.54%            7.46%
  accuracy_O (θ_O)     14.60%           23.57%
  timeshare_U (τ_U)    19.10%           22.20%
  timeshare_O (τ_O)    62.56%           62.65%
  #SLO violations      8.40%            12.67%
  Avg. response time   0.70 s           0.90 s

Software Contention Scenario

  Configuration   Service Demand   CPU
  Up threshold    90%              30%
  Down threshold  70%              5%

  Metric               Service Demand   CPU
  accuracy_U (θ_U)     7.64%            60.45%
  accuracy_O (θ_O)     8.36%            24.67%
  timeshare_U (τ_U)    27.02%           86.83%
  timeshare_O (τ_O)    43.37%           9.29%
  #SLO violations      6.27%            97.25%
  Avg. response time   1.04 s           1.97 s

Mixed Scenario

  Configuration   Service Demand   CPU
  Up threshold    90%              55%
  Down threshold  70%              40%

  Metric               Service Demand   CPU
  accuracy_U (θ_U)     7.13%            14.61%
  accuracy_O (θ_O)     14.12%           7.05%
  timeshare_U (τ_U)    19.28%           42.86%
  timeshare_O (τ_O)    59.90%           35.03%
  #SLO violations      4.77%            11.96%
  Avg. response time   1.94 s           1.95 s

In a Nutshell
- Service demand based approach
  - Requires no knowledge of the bottleneck resource, i.e., it is independent of the scenario
  - Threshold configuration is independent of the application
- CPU based approach
  - Only suitable where the CPU is the bottleneck; threshold configuration depends on the scenario
  - Easy to gather and set up
  - Better than no auto-scaling in the mixed and hardware scenarios
- The service demand based approach outperforms the CPU based approach
- We want to encourage further research in service demand estimation

Literature
[Adhikari12] R. Adhikari and R. Agrawal, "An Introductory Study on Time Series Modeling and Forecasting," arXiv preprint arXiv:1302.6613, 2013.
[Iqbal11] W. Iqbal, M. N. Dailey, D. Carrera, and P. Janecek, "Adaptive Resource Provisioning for Read Intensive Multi-tier Applications in the Cloud," Future Generation Computer Systems, vol. 27, no. 6, pp. 871-879, 2011.
[Lorido-Botran14] T. Lorido-Botran, J. Miguel-Alonso, and J. A. Lozano, "A Review of Auto-scaling Techniques for Elastic Applications in Cloud Environments," Journal of Grid Computing, vol. 12, no. 4, pp. 559-592, 2014.
[Maurer11] M. Maurer, I. Brandic, and R. Sakellariou, "Enacting SLAs in Clouds Using Rules," in Euro-Par 2011 Parallel Processing. Springer, 2011, pp. 455-466.
[Rao09] J. Rao, X. Bu, C.-Z. Xu, L. Wang, and G. Yin, "VCONF: A Reinforcement Learning Approach to Virtual Machines Auto-configuration," in Proceedings of the 6th International Conference on Autonomic Computing. ACM, 2009, pp. 137-146.
[Urgaonkar08] B. Urgaonkar, P. Shenoy, A. Chandra, P. Goyal, and T. Wood, "Agile Dynamic Provisioning of Multi-tier Internet Applications," ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 3, no. 1, p. 1, 2008.
[Pierre12] G. Pierre and C. Stratan, "ConPaaS: A Platform for Hosting Elastic Cloud Applications," IEEE Internet Computing, vol. 16, no. 5, pp. 88-92, 2012.

[Menascé04] Menascé, D. A., Dowdy, L. W., and Almeida, V. A. F., Performance by Design: Computer Capacity Planning by Example. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2004.
[Kuperberg09] M. Kuperberg, M. Krogmann, and R. Reussner, "TimerMeter: Quantifying Accuracy of Software Timers for System Analysis," in Proceedings of the 6th International Conference on Quantitative Evaluation of SysTems (QEST 2009), 2009.
[Wang12] W. Wang, X. Huang, X. Qin, W. Zhang, J. Wei, and H. Zhong, "Application-Level CPU Consumption Estimation: Towards Performance Isolation of Multi-tenancy Web Applications," in Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, 2012, pp. 439-446.
[Spinner14] S. Spinner, G. Casale, X. Zhu, and S. Kounev, "LibReDE: A Library for Resource Demand Estimation (Demonstration Paper)," in Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering (ICPE 2014), Dublin, Ireland, March 22-26, 2014, pp. 227-228.
[Herbst14] A. Weber, N. R. Herbst, H. Groenda, and S. Kounev, "Towards a Resource Elasticity Benchmark for Cloud Environments," in Proceedings of the 2nd International Workshop on Hot Topics in Cloud Service Scalability (HotTopiCS 2014), co-located with ICPE 2014, Dublin, Ireland, March 22, 2014, pp. 5:1-5:8. ACM, New York, NY, USA.

Thank you. Slides are available at https://descartes.tools/

Comparison

  Scenario              Metric            Service Demand   CPU      No Auto-Scaling
  Hardware contention   Avg. #VMs         8.58             7.93     15.00
                        #SLO violations   8.40%            12.67%   45.72%
                        Avg. resp. time   0.70 s           0.94 s   2.62 s
  Software contention   Avg. #VMs         8.21             6.69
                        #SLO violations   6.27%            97.25%   8.64%
                        Avg. resp. time   1.04 s           1.97 s   1.07 s
  Mixed                 Avg. #VMs         8.96             9.51
                        #SLO violations   4.77%            11.96%   6.40%
                        Avg. resp. time   1.94 s           1.95 s

Accuracy

  Scenario              Designed Service Demand   Estimated Service Demand
  Hardware contention   0.10 s                    0.099 ± 0.016 s
  Software contention   0.15 s                    0.153 ± 0.025 s
  Mixed                 0.25 s                    0.238 ± 0.043 s