1 André Bauer, Johannes Grohmann, Nikolas Herbst, and Samuel Kounev

2 Motivation
Cloud computing is a paradigm that addresses the increasing scale and complexity of modern web services.
To guarantee reliable service, most applications run with a fixed amount of resources:
- Unnecessary cost if the system is not fully utilized
- Bad performance if unexpected peaks appear
Accurate auto-scalers are required, which reconfigure the system according to its load.
(Outline: Motivation, Approach, Evaluation, Conclusion)

3 State of the Art
In academia, auto-scalers can be classified into 5 groups [Lorido-Botran14]: threshold-based rules, queueing theory, control theory, reinforcement learning, and time series analysis (e.g., [Maurer11], [Urgaonkar08], [Adhikari12], [Iqbal11], [Rao09]).
In practice, cloud providers offer reactive threshold-based auto-scalers:
- AWS EC2 scales based on metrics such as CPU utilization, disk or network I/O

4 Problem: Auto-Scaling Input
Existing auto-scalers assume as input either:
- Measured resource utilization metrics (e.g., CPU load, #disk read operations, …)
- An estimate of the processing speed of the resources (e.g., #jobs that can be served per unit of time, …)
Problems when choosing the observed resource metrics:
- The bottleneck resource may not be known at configuration time
- Some metrics require deep system knowledge, which is infeasible in dynamic cloud deployments
- Measured utilizations are bounded at 100%
Based on our experiments, we recommend using service demands as input for auto-scaling.

5 Service Demand
A service demand is the time a unit of work (e.g., a request) spends obtaining service from a resource (e.g., CPU or hard disk) in a system, excluding waiting time [Spinner15].
Direct measurement of service demands (e.g., TimerMeter [Kuperberg09]):
- Requires specialized instrumentation to monitor low-level statistics
- Causes overhead through monitoring
- Leaves work in system or background threads unaccounted for
Statistical estimation of service demands:
- Requires only coarse instrumentation
- Different methods exist, e.g., Kalman filtering [Wang12], regression, …

6 Service Demand Estimation
We maintain a library for service demand estimation [Spinner14]:
- Ready-to-use implementations of 8 existing approaches
- Allows offline and online estimation
For our experiments, we use an approach based on the Service Demand Law [Menascé04], which requires only the average CPU utilization and the throughput of each workload class:

    D_{i,c} = U_{i,c} / X_{0,c}

where i denotes the resource, c the workload class, D_{i,c} the service demand, U_{i,c} the utilization of resource i due to class c, and X_{0,c} the system throughput of class c.
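The Service Demand Law estimate can be sketched in a few lines of Python. This is an illustration only, not part of the LibReDE library; the function name and interface are hypothetical:

```python
# Sketch of the Service Demand Law: D_{i,c} = U_{i,c} / X_{0,c}.
# Hypothetical helper for illustration; not the actual LibReDE API.

def service_demand(utilization: float, throughput: float) -> float:
    """Estimate the service demand (seconds of service per request).

    utilization -- average utilization U_{i,c} of resource i by class c (0..1)
    throughput  -- system throughput X_{0,c} of class c (requests/second)
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilization / throughput

# Example: 50% CPU utilization at 5 requests/s yields 0.1 s per request.
print(service_demand(0.5, 5.0))  # 0.1
```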

7 Evaluation Design
We compare 3 cases:
- State-of-the-art reactive auto-scaler with measured CPU utilization
- The same auto-scaler with the system utilization based on queueing theory (Service Demand Law)
- No auto-scaling, i.e., 75% of the available VMs fixed
Auto-scaling scenarios:
- Hardware contention scenario: matrix decomposition (CPU limitation)
- Software contention scenario: limited content pool
- Mixed software and hardware contention scenario

8 Auto-Scaler
The utilization ρ fed to the auto-scaler is either:
- The measured average CPU utilization, or
- The calculated average system utilization based on the Service Demand Law
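A minimal sketch of the reactive threshold rule in Python. The 90%/70% defaults are the service demand thresholds used later in the evaluation; the function itself is a simplified illustration under assumed semantics (scale by one VM per decision), not the actual implementation:

```python
def scaling_decision(rho: float, vms: int,
                     up: float = 0.90, down: float = 0.70,
                     min_vms: int = 1, max_vms: int = 20) -> int:
    """One step of a reactive threshold-based auto-scaler.

    rho is either the measured average CPU utilization or the average
    system utilization calculated via the Service Demand Law (0..1).
    Returns the new number of VMs.
    """
    if rho > up and vms < max_vms:
        return vms + 1      # under-provisioned: scale out
    if rho < down and vms > min_vms:
        return vms - 1      # over-provisioned: scale in
    return vms              # utilization within the target band

print(scaling_decision(0.95, 5))  # 6
print(scaling_decision(0.50, 5))  # 4
print(scaling_decision(0.80, 5))  # 5
```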

9 Evaluation Setup
Each scenario is deployed in a CloudStack-based private cloud. Each scenario application is deployed on an Apache Tomcat; the middleware for each scenario is WildFly, with interfaces for gathering application knowledge.

Criteria          Server          Worker VM
Model             HP DL160 Gen9   m4.large
Operating system  XenServer       CentOS 6.5
CPU               8 cores         2 vCores
Memory            32 GB           4 GB
Amount            8               1–20

10 Experiment Controller
Evaluation with the BUNGEE experiment controller [Herbst14].
Workload: Retailrocket trace, an anonymized online shop.
- Between 40 and … requests/second
- 2 days compressed to 6.5 hours

11 Evaluation Metrics
Elasticity metrics, with d_t the demanded and s_t the supplied resources at time t:
- accuracyU (θU) and accuracyO (θO): under-/over-provisioning accuracy
- timeshareU (τU) and timeshareO (τO): share of time spent under-/over-provisioned
User metrics:
- Average response time
- #SLO violations (requests taking > 2 seconds)
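The timeshare metrics can be computed directly from the demanded (d_t) and supplied (s_t) resource curves. The following is a simplified Python sketch of the BUNGEE-style idea, i.e., the fraction of time spent under- resp. over-provisioned, not the exact metric implementation:

```python
def timeshares(demanded, supplied):
    """Fraction of sampling intervals spent under-/over-provisioned.

    demanded -- sequence of demanded resources d_t
    supplied -- sequence of supplied resources s_t (same length)
    Returns (timeshare_under, timeshare_over).
    """
    if len(demanded) != len(supplied) or not demanded:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(demanded)
    under = sum(1 for d, s in zip(demanded, supplied) if s < d) / n
    over = sum(1 for d, s in zip(demanded, supplied) if s > d) / n
    return under, over

# 1 of 4 intervals under-provisioned, 1 of 4 over-provisioned:
print(timeshares([2, 2, 2, 2], [1, 2, 3, 2]))  # (0.25, 0.25)
```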

12 Hardware Contention Scenario
Configuration Service Demand CPU Up threshold 90% Down threshold 70% Metric Service Demand CPU accuracyU (𝜃𝑈) 7.54% 7.46% accuracyO (𝜃𝑂) 14.60% 23.57% timeshareU (𝜏 𝑈 ) 19.10% 22.20% timeshareO (𝜏 𝑂 ) 62.56% 62.65% #SLO violations 8.40% 12.67% Avg. resp. time 0.70 s 0.90 s Motivation Approach Evaluation Conclusion

13 Software Contention Scenario

Configuration      Service Demand   CPU
Up threshold       90%              30%
Down threshold     70%              5%

Metric             Service Demand   CPU
accuracyU (θU)     7.64%            60.45%
accuracyO (θO)     8.36%            24.67%
timeshareU (τU)    27.02%           86.83%
timeshareO (τO)    43.37%           9.29%
#SLO violations    6.27%            97.25%
Avg. resp. time    1.04 s           1.97 s

14 Mixed Scenario

Configuration      Service Demand   CPU
Up threshold       90%              55%
Down threshold     70%              40%

Metric             Service Demand   CPU
accuracyU (θU)     7.13%            14.61%
accuracyO (θO)     14.12%           7.05%
timeshareU (τU)    19.28%           42.86%
timeshareO (τO)    59.90%           35.03%
#SLO violations    4.77%            11.96%
Avg. resp. time    1.94 s           1.95 s

15 In a Nutshell
Service demand based approach:
- No knowledge of the bottleneck resource required, i.e., independent of the scenario
- Threshold configuration is independent of the application
CPU based approach:
- Only suitable where the CPU is the bottleneck; threshold configuration depends on the scenario
- Easy to gather and set up
- Better than no auto-scaling in the mixed and hardware scenarios
The service demand based approach outperforms the CPU based one.
We want to encourage further research in service demand estimation.

16 Literature
[Adhikari12] R. Adhikari and R. Agrawal, "An Introductory Study on Time Series Modeling and Forecasting," arXiv preprint, 2013.
[Iqbal11] W. Iqbal, M. N. Dailey, D. Carrera, and P. Janecek, "Adaptive Resource Provisioning for Read Intensive Multi-tier Applications in the Cloud," Future Generation Computer Systems, vol. 27, no. 6, 2011.
[Lorido-Botran14] T. Lorido-Botran, J. Miguel-Alonso, and J. A. Lozano, "A Review of Auto-scaling Techniques for Elastic Applications in Cloud Environments," Journal of Grid Computing, vol. 12, no. 4, 2014.
[Maurer11] M. Maurer, I. Brandic, and R. Sakellariou, "Enacting SLAs in Clouds Using Rules," in Euro-Par 2011 Parallel Processing. Springer, 2011.
[Rao09] J. Rao, X. Bu, C.-Z. Xu, L. Wang, and G. Yin, "VCONF: A Reinforcement Learning Approach to Virtual Machines Auto-configuration," in Proceedings of the 6th International Conference on Autonomic Computing. ACM, 2009.
[Urgaonkar08] B. Urgaonkar, P. Shenoy, A. Chandra, P. Goyal, and T. Wood, "Agile Dynamic Provisioning of Multi-tier Internet Applications," ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 3, no. 1, 2008.
[Pierre12] G. Pierre and C. Stratan, "ConPaaS: A Platform for Hosting Elastic Cloud Applications," IEEE Internet Computing, vol. 16, no. 5, 2012.

17 Literature (continued)
[Menascé04] D. A. Menascé, L. W. Dowdy, and V. A. F. Almeida, Performance by Design: Computer Capacity Planning by Example. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2004.
[Kuperberg09] M. Kuperberg, M. Krogmann, and R. Reussner, "TimerMeter: Quantifying Accuracy of Software Times for System Analysis," in Proceedings of the 6th International Conference on Quantitative Evaluation of SysTems (QEST 2009), 2009.
[Wang12] W. Wang, X. Huang, X. Qin, W. Zhang, J. Wei, and H. Zhong, "Application-Level CPU Consumption Estimation: Towards Performance Isolation of Multi-tenancy Web Applications," in Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, 2012, pp. 439–446.
[Spinner14] S. Spinner, G. Casale, X. Zhu, and S. Kounev, "LibReDE: A Library for Resource Demand Estimation (Demonstration Paper)," in Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering (ICPE 2014), Dublin, Ireland, March 2014.
[Herbst14] A. Weber, N. R. Herbst, H. Groenda, and S. Kounev, "Towards a Resource Elasticity Benchmark for Cloud Environments," in Proceedings of the 2nd International Workshop on Hot Topics in Cloud Service Scalability (HotTopiCS 2014), co-located with ICPE 2014, Dublin, Ireland, March 2014, pages 5:1–5:8. ACM, New York, NY, USA.

18 Thank You
Slides are available at https://descartes.tools/

19 Comparison

Scenario              Metric            Service Demand   CPU       No Auto-Scaling
Hardware Contention   Avg. #VMs         8.58             7.93      15.00
                      #SLO violations   8.40%            12.67%    45.72%
                      Avg. resp. time   0.70 s           0.94 s    2.62 s
Software Contention   Avg. #VMs         8.21             6.69
                      #SLO violations   6.27%            97.25%    8.64%
                      Avg. resp. time   1.04 s           1.97 s    1.07 s
Mixed                 Avg. #VMs         8.96             9.51
                      #SLO violations   4.77%            11.96%    6.40%
                      Avg. resp. time   1.94 s           1.95 s

20 Accuracy

Scenario              Designed Service Demand   Estimated Service Demand
Hardware Contention   0.10 s                    0.099 ± 0.016 s
Software Contention   0.15 s                    0.153 ± 0.025 s
Mixed                 0.25 s                    0.238 ± 0.043 s

