ICSOC 2018 Adel Nadjaran Toosi Faculty of Information Technology


A Fuzzy-based Auto-scaler for Web Applications in Cloud Computing Environments
ICSOC 2018
Adel Nadjaran Toosi
Faculty of Information Technology, Monash University
Email: adel.n.toosi [at] monash.edu
Homepage: http://adelnadjarantoosi.info

Outline
Auto-scaling
Why is it hard?
Motivation and objectives
Our proposed fuzzy auto-scaler
Performance evaluation
System overview
Experimental setup
Results
Conclusions and future directions

What is auto-scaling?
Dynamically adding and removing cloud resources to match the load
Monitor, Analyse, Plan, Execute loop
Maintain the SLA (Service Level Agreement)
https://www.safaribooksonline.com/library/view/cloud-architecture-patterns/9781449357979/ch04.html
An auto-scaler automatically adds or removes virtual machines depending on the incoming traffic of a web application. Its goal is to minimize cost and maximize SLA satisfaction, where the SLA (service level agreement) is the target performance of the service. An auto-scaler generally goes through four steps. First, it monitors application-related metrics. It then analyses the observed metrics and, based on that analysis, plans how many virtual machines to add or remove. The final step is executing this scaling plan through the cloud provider's API.
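The four steps above can be sketched as a small control loop. This is only an illustration, not the authors' implementation: the `Metrics` type, the `monitor` and `execute` callbacks, the 50 ms lower threshold, and the one-VM step are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Metrics:
    response_time_ms: float  # observed response time (hypothetical metric source)
    cluster_size: int        # number of VMs currently running


def analyse(metrics: Metrics, upper_ms: float, lower_ms: float) -> str:
    """Analyse step: classify the observed state."""
    if metrics.response_time_ms > upper_ms:
        return "overloaded"
    if metrics.response_time_ms < lower_ms:
        return "underloaded"
    return "ok"


def plan(state: str, cluster_size: int, min_vms: int, max_vms: int) -> int:
    """Plan step: pick a target cluster size, clamped to [min_vms, max_vms]."""
    if state == "overloaded":
        return min(cluster_size + 1, max_vms)
    if state == "underloaded":
        return max(cluster_size - 1, min_vms)
    return cluster_size


def run_loop(
    monitor: Callable[[], Metrics],
    execute: Callable[[int], None],
    iterations: int,
    upper_ms: float = 100.0,  # assumed threshold values
    lower_ms: float = 50.0,
    min_vms: int = 1,
    max_vms: int = 10,
) -> List[int]:
    """Run monitor -> analyse -> plan -> execute and return the planned sizes."""
    targets = []
    for _ in range(iterations):
        metrics = monitor()                                    # Monitor
        state = analyse(metrics, upper_ms, lower_ms)           # Analyse
        target = plan(state, metrics.cluster_size, min_vms, max_vms)  # Plan
        if target != metrics.cluster_size:
            execute(target)                                    # Execute
        targets.append(target)
    return targets
```

In a real deployment `execute` would call the cloud provider's scaling API; here it is left as a callback so the loop can be exercised with fake data.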

Cloud Computing
Main features encouraging web application providers to host their applications in clouds:
Pay-as-you-go, elasticity, and the on-demand nature of cloud resources
A huge pool of computational resources with seemingly infinite resource provisioning
A cloud is a pool of computers that people can use and are charged for based on usage. The on-demand feature can be very attractive for small and medium companies that might not have enough money to pay upfront for servers. Cloud resources also give an illusion of infinite computational capacity, since the pool is much larger than any single user needs.

Why is auto-scaling hard?
Incoming load varies -> hard to know when to scale
Auto-scaling methods:
Reactive: scale when certain metrics (e.g., CPU utilization) exceed a threshold; uses only recently observed metrics
Proactive: predicts the load using historical data; complex and difficult to tune; does not perform well for highly dynamic and unforeseen workloads
Different applications have different indicator metrics (system/service stats)
It takes time (on the order of minutes) to start and configure new virtual machines
For a web application, especially one that is becoming popular, the incoming load pattern changes often, so it is hard to tell when to scale. Different web applications consume different system resources, so it is hard to find metrics that accurately reflect the capacity of the virtual machines hosting the application. New virtual machines take a long time to boot and configure, so every scaling step must be chosen wisely; otherwise the SLA will be violated continually.

Motivation and Objectives
There is still no ideal auto-scaler
Reactive approaches are the most common (often threshold-based)
What are the optimal threshold values for auto-scaling? -> dynamic thresholds
Objective: an improved reactive auto-scaler that can be used by general web applications
Minimizes cost while meeting SLA requirements
Dynamic upper threshold
Dynamic cloud resource provisioning
Simplicity and adequate performance
What motivates us to explore auto-scaling methods? Cloud computing is growing fast, and there is still no ideal auto-scaler that helps us use cloud resources effectively. The traditional reactive auto-scaler uses a static threshold to trigger scaling and a static number of virtual machines to provision. We aim to improve it by using fuzzy logic to make both adaptive.

The proposed auto-scaler
Uses fuzzy logic to implement scaling rules
Dynamically adds and removes VMs
Dynamically adjusts the scaling threshold based on response time and cluster size
Metrics: CPU load, response time, cluster size

What is fuzzy logic?
Not just about true or false
Degree of truth -> a continuous value between 0.0 and 1.0
Linguistic terms, e.g. 'if CPU is HIGH then CLOSE SOME PROCESS'
Human-like reasoning (vagueness)
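As a rough illustration of a degree of truth, the sketch below maps a CPU percentage to how strongly it belongs to the linguistic terms MEDIUM and HIGH. The membership shapes and their breakpoints (20, 50, 80, 90) are assumptions chosen for the example; the slides do not specify them.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def ramp_up(x: float, start: float, end: float) -> float:
    """Membership that rises linearly from 0 at start to 1 at end."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def cpu_is_medium(cpu_percent: float) -> float:
    # Assumed breakpoints: MEDIUM peaks at 50% CPU.
    return triangular(cpu_percent, 20.0, 50.0, 80.0)


def cpu_is_high(cpu_percent: float) -> float:
    # Assumed breakpoints: HIGH starts at 50% and is fully true from 90%.
    return ramp_up(cpu_percent, 50.0, 90.0)
```

With these assumed shapes, a CPU reading of 70% is HIGH to degree 0.5, so a rule such as 'if CPU is HIGH then ...' would fire at half strength rather than simply being true or false.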

Two fuzzy engines
The first produces the dynamic upper threshold -> triggers adding VMs
Input: response time, cluster size
Output: upper threshold as a proportion of the SLA
Rules: gradually increase the upper threshold as the cluster size gets larger
The second produces the cluster size -> how many VMs to add
Input: CPU load, cluster size
Output: cluster size as a proportion of the maximum number of virtual machines
Rules: conservative when the cluster size is small or has nearly reached the maximum
The first engine's rules state that a larger cluster should have a higher upper threshold, so that it tolerates more fluctuation in response time and avoids unnecessary resource provisioning. The second engine produces the dynamic number of new virtual machines to provision from the CPU load and the cluster size. The idea is that when the cluster is small, the observed metrics usually reflect heavy overload or are inaccurate, so scaling decisions based on them should be conservative. When the cluster size is near the maximum, fewer virtual machines should be added to avoid exhausting all of the resources.
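The cluster-size rule of the first engine can be sketched as a tiny weighted-average fuzzy system. The slides do not give the membership functions or the rule base, so everything concrete here is assumed: the SMALL/MEDIUM/LARGE shapes, the rule outputs of 0.5, 0.7 and 0.9 of the SLA, and the omission of the response-time input, which is left out to keep the example short.

```python
from typing import Dict

# Assumed rule outputs: upper threshold as a proportion of the SLA.
RULE_OUTPUTS: Dict[str, float] = {"SMALL": 0.5, "MEDIUM": 0.7, "LARGE": 0.9}


def cluster_memberships(cluster_size: int, max_size: int) -> Dict[str, float]:
    """Degrees to which the cluster is SMALL, MEDIUM and LARGE (assumed shapes)."""
    c = min(max(cluster_size / max_size, 0.0), 1.0)  # fraction of the maximum
    small = max(0.0, 1.0 - 2.0 * c)   # 1 at an empty cluster, 0 from half full
    large = max(0.0, 2.0 * c - 1.0)   # 0 up to half full, 1 at the maximum
    medium = 1.0 - small - large      # peaks at half of the maximum
    return {"SMALL": small, "MEDIUM": medium, "LARGE": large}


def upper_threshold_ms(cluster_size: int, max_size: int, sla_ms: float) -> float:
    """Weighted average of the rule outputs, scaled by the SLA target."""
    degrees = cluster_memberships(cluster_size, max_size)
    total = sum(degrees.values())
    fraction = sum(degrees[term] * RULE_OUTPUTS[term] for term in degrees) / total
    return fraction * sla_ms
```

Under these assumptions the threshold rises smoothly with the cluster size, which mirrors the stated rule of gradually increasing the upper threshold as the cluster grows.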

What is CPU Load?
Reported by the uptime command
Not CPU utilization
On Linux-based systems it is the average demand on system resources
Originally it measured only CPU demand, but it now includes I/O
Ideally CPU load = 1 when the resources are fully utilized
CPU load shows the demand on the CPU; it is not CPU utilization, which only shows how busy the CPU is. In the road analogy, a road fully occupied by cars corresponds to 100% CPU load, and a half-used road to 50%. If the road is fully used and more cars are waiting to enter, CPU load shows a value above 100%. CPU load has also evolved to include I/O demand, so we consider it a useful metric for estimating the cloud resources needed when auto-scaling.
http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
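The load averages that uptime prints can also be read from Python with `os.getloadavg()`, which is available on Unix-like systems. The sketch below divides the 1-minute value by the number of CPUs so that 1.0 means fully utilized, as in the road analogy. The helper names and the wording of the descriptions are made up for this example.

```python
import os


def normalised_load(load_average: float, cpu_count: int) -> float:
    """Load per CPU: 1.0 means every CPU is fully used and nothing is waiting."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def describe(load_per_cpu: float) -> str:
    """Map the per-CPU load onto the road analogy."""
    if load_per_cpu < 1.0:
        return "spare capacity"
    if load_per_cpu == 1.0:
        return "fully utilised"
    return "overloaded: work is queueing"


def current_load_per_cpu() -> float:
    """Read the 1-minute load average of this machine (Unix-like systems only)."""
    one_minute, _five_minutes, _fifteen_minutes = os.getloadavg()
    return normalised_load(one_minute, os.cpu_count() or 1)
```

For instance, a load average of 2.0 on a four-CPU machine is a per-CPU load of 0.5, the half-used road.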

System Overview
MediaWiki is our web application.
We use only t2.micro instances from AWS, since they are free for the first year.
Our aim is to automatically add or remove MediaWiki web servers.

Experimental Setup
SLA = 200 ms for 90% of requests
30-minute Wikipedia trace replayed by Wikijector
Maximum of 10 virtual machines
The SLA, or goal, of our website is to keep request response times under 200 ms. Each experiment runs for 30 minutes because t2.micro instances are restricted from using full CPU power after 30 minutes.
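One way to check such an SLA against measured response times is sketched below. It assumes that the SLA means at least 90% of requests must finish in under 200 ms, and that a request taking exactly 200 ms does not count as satisfied; both readings are assumptions, and the function names are illustrative.

```python
from typing import Sequence


def sla_satisfaction(response_times_ms: Sequence[float], target_ms: float = 200.0) -> float:
    """Percentage of requests whose response time is under the target."""
    if not response_times_ms:
        raise ValueError("need at least one response time")
    # Strictly under the target; treating the boundary as a miss is an assumption.
    within = sum(1 for t in response_times_ms if t < target_ms)
    return 100.0 * within / len(response_times_ms)


def meets_sla(
    response_times_ms: Sequence[float],
    target_ms: float = 200.0,
    required_percent: float = 90.0,
) -> bool:
    """True when the share of fast-enough requests reaches the required level."""
    return sla_satisfaction(response_times_ms, target_ms) >= required_percent
```

For example, if three of four sampled requests finish under 200 ms, the satisfaction is 75%, which would not meet a 90% requirement.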

Experiment - Auto-scalers
Three auto-scalers:
Amazon native auto-scaler
Amazon emulated auto-scaler
100 ms response time as upper threshold
Adds or removes a static number of VMs
Fuzzy-based auto-scaler
Dynamic response time upper threshold
Dynamic number of provisioned VMs
We compare three auto-scalers. The first two use AWS reactive scaling methods; the second does so locally without calling the scaling API, hence the name Amazon emulated auto-scaler. Both have an upper threshold of 100 ms response time and can only add or remove a static number of virtual machines. The third is our own fuzzy-logic-based auto-scaler.

Experimental Results
SLA: % of requests with response time under 200 ms (improvement over AWS Native)
Cost: area under the graph (increase over AWS Native)
Auto-scaler                    SLA               Cost
Fuzzy auto-scaler              87.97 (16.28%)    12781.00 (17.17%)
AWS emulated (adding 1 VM)     80.75 (6.74%)     11768.00 (7.88%)
AWS emulated (adding 2 VMs)    90.43 (19.53%)    14009.50 (28.43%)
AWS Native                     75.65 (0%)        10908.00 (0%)
First, we can see that increasing response-time satisfaction raises the cost, since more virtual machines are needed to handle the increasing number of web requests. The aim is to increase SLA satisfaction while using cloud resources effectively. We use the AWS native auto-scaler, which calls the AWS scaling API directly, as our baseline since it performs the worst. The emulated auto-scaler improves SLA satisfaction by only 6.74% when adding one virtual machine at a time. When adding two at a time it improves by 19.53%, which outperforms our fuzzy auto-scaler's 16.28%. However, the AWS emulated auto-scaler adding 2 VMs does not use cloud resources effectively: its cost rises by 28.43%, compared with 17.17% for the fuzzy auto-scaler.
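The percentages in the table can be reproduced from the raw values as relative changes against the AWS Native baseline. The sketch below also computes the SLA gain per unit of cost increase; that ratio is not a metric from the talk, only an illustrative way to compare how effectively each auto-scaler spends its extra resources. Recomputed values agree with the slide to within 0.01 percentage points; the reported SLA figures appear to be truncated rather than rounded.

```python
from typing import Dict, Tuple

# (SLA satisfaction in %, area under the graph / cost), taken from the table.
BASELINE: Tuple[float, float] = (75.65, 10908.00)  # AWS Native
RESULTS: Dict[str, Tuple[float, float]] = {
    "Fuzzy auto-scaler": (87.97, 12781.00),
    "AWS emulated (adding 1 VM)": (80.75, 11768.00),
    "AWS emulated (adding 2 VMs)": (90.43, 14009.50),
}


def percent_change(value: float, baseline: float) -> float:
    """Relative change against the baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def summarise() -> Dict[str, Tuple[float, float, float]]:
    """Return (SLA gain %, cost increase %, SLA gain per cost increase) per scaler."""
    rows = {}
    for name, (sla, cost) in RESULTS.items():
        sla_gain = percent_change(sla, BASELINE[0])
        cost_gain = percent_change(cost, BASELINE[1])
        rows[name] = (sla_gain, cost_gain, sla_gain / cost_gain)
    return rows


if __name__ == "__main__":
    for name, (sla_gain, cost_gain, ratio) in summarise().items():
        print(f"{name}: SLA +{sla_gain:.2f}%, cost +{cost_gain:.2f}%, ratio {ratio:.2f}")
```

By this illustrative ratio the fuzzy auto-scaler gains the most SLA satisfaction per unit of added cost, and adding 2 VMs at a time gains the least.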

Experiment Result - Dynamic Provisioning
Figure: number of virtual machines versus time in seconds
The emulated Amazon auto-scaler reached the maximum of 10 virtual machines twice, whereas the fuzzy auto-scaler reached only 9 virtual machines without breaking the 200 ms target. Adding a static number of virtual machines can be overkill: with the Amazon emulated auto-scaler there is no choice but to add 2 virtual machines every time.

Experiment Result - Dynamic Upper Threshold
Because the previous trace rises and falls too quickly, the effect of the dynamic upper threshold is not clear there. Over a long run of a web application, however, the response time goes through many small fluctuations, so this Wikipedia trace mimics a slow increase and slow decrease of the incoming load. The two auto-scalers are identical except that the bottom one uses a dynamic upper threshold. The adaptive upper threshold keeps the cluster size at 7 longer than the top one does. If MediaWiki ran like this for weeks, the dynamic upper threshold would save a lot of cost.

Conclusion and future work
Fuzzy logic is an effective method for auto-scaling.
Adaptiveness of the scaling method is important.
Scaling thresholds should be adjusted dynamically.
We will explore techniques to determine the best candidate VMs for scaling in.
Auto-scaling of session-based web applications, where users have sticky sessions, needs more attention since it requires session migration.
The correlation between metrics needs more in-depth investigation.
The experimental results show that fuzzy logic is a decent method for improving the traditional reactive auto-scaler, which uses a static upper threshold and provisions a static number of virtual machines. Adaptivity is important for a good auto-scaler: virtual machines should be added or removed with the current state of the cluster in mind. A single static scaling method does not work well most of the time. In the future, we hope to support all cloud platforms so that we can compare the price of virtual machines during scaling.

Thank You! Questions?