AGILE, DYNAMIC PROVISIONING OF MULTITIER INTERNET APPLICATIONS Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, and Pawan Goyal ACM Transactions on.

AGILE, DYNAMIC PROVISIONING OF MULTITIER INTERNET APPLICATIONS
Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, and Pawan Goyal
ACM Transactions on Autonomous and Adaptive Systems, 3(1)

Agenda
- Introduction
- System Overview
- Provisioning Algorithm
  - How much
  - When
- Server Switching
- Evaluation
- Conclusion
- Comments

Introduction (1/4)
- Internet applications employ a multi-tier architecture, with each tier providing a certain functionality
- Such applications tend to see dynamically varying workloads that contain
  - long-term variations such as time-of-day effects
  - short-term fluctuations due to flash crowds
- Predicting the peak workload of an Internet application, and provisioning capacity based on these worst-case estimates, is notoriously difficult

Introduction (2/4)
- Many single-tier provisioning mechanisms have already been proposed, so a straightforward extension is to employ such an approach at each tier of the application
- But:
  - Using single-tier provisioning mechanisms at each tier leads to bottleneck shifting
  - Modeling all tiers as a black box and allocating servers whenever the observed response time exceeds a threshold makes it hard to determine how many servers to allocate and where to allocate them

Introduction (3/4)

Introduction (4/4)
- Research contributions
  - Predictive and reactive provisioning
  - Analytical modeling and incorporating tails of workload distributions
  - Virtual machine based provisioning
  - Handling session-based workloads

System Overview (1/6) -- Multi-tier Internet Application
- A tier may be clustered or not
  - the front-end tier can be a clustered Apache server that runs on multiple machines
  - the back-end tier may employ a database with a shared-nothing architecture, which cannot be replicated on demand
- Each clustered tier is also assumed to employ a load-balancing element
  - responsible for distributing requests to servers
- If a session is stateful, successive requests need to be serviced by the same server at each tier
  - the load-balancing element needs to account for this server state when redirecting requests

System Overview (2/6) -- Multi-tier Internet Application
- Every application also runs a special component called a sentry
  - polices incoming sessions to an application's server pool
  - unlike systems that use per-tier admission control, the sentry
    - makes a one-time admission decision when a session arrives
    - avoids resource wastage resulting from partially serviced requests that may be dropped at later tiers
- Once a session has been admitted, none of its requests can be dropped at any intermediate tier
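
The one-time, session-level admission decision can be illustrated with a minimal sketch. The slides do not say how the sentry decides whether to admit a session, so the `capacity` limit on concurrent sessions and the class and method names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sentry:
    """Hypothetical session-level admission control, not the authors' code."""
    capacity: int                              # assumed limit on concurrent sessions
    active: set = field(default_factory=set)   # sessions admitted so far

    def admit(self, session_id: str) -> bool:
        # Requests of an already admitted session always pass: the decision
        # is made once, when the session arrives, and is never revisited.
        if session_id in self.active:
            return True
        # A new session is turned away up front, before it uses any tier.
        if len(self.active) >= self.capacity:
            return False
        self.active.add(session_id)
        return True

    def finish(self, session_id: str) -> None:
        # Release the slot when the session ends.
        self.active.discard(session_id)
```

Because rejection happens only at session arrival, no work is spent on requests that a later tier would drop.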

System Overview (3/6) -- Multi-tier Internet Application

System Overview (4/6) -- Hosting Platform Architecture
- The hosting platform is a data center that consists of a cluster of commodity servers interconnected by gigabit Ethernet
- Servers hosting application components
  - each application runs on a subset of the servers, and a server is allocated to at most one application at any given time
  - the component of an application that runs on a server is referred to as a capsule
    - if the capsule is replicable, the server is called an Elf
    - if the capsule is non-replicable, the server is called an Ent

System Overview (5/6) -- Hosting Platform Architecture
- Nucleus
  - a software component that performs online measurements of the capsule workload, performance, and resource usage
  - these statistics are periodically conveyed to the control plane
- Control Plane
  - responsible for dynamic provisioning of servers to individual applications

System Overview (6/6) -- Hosting Platform Architecture

Provisioning Algorithm -- How much (1/3)
- Model each server as a G/G/1 queue
- Request arrival rate to tier i: λ_i ≥ [ s_i + (σ_a² + σ_b²) / (2 (d_i − s_i)) ]^(−1)
  - λ_i: the request arrival rate to tier i
  - d_i: the mean response time for tier i
  - s_i: the average service time for a request at tier i
  - σ_a²: the variance of inter-arrival time
  - σ_b²: the variance of service time

 =>  W q : the waiting time in queue  X : the (random) service time 14

Provisioning Algorithm -- How much (2/3)
- Observe that
  - d_i is known
  - the per-tier service time s_i and the variances of inter-arrival and service times, σ_a² and σ_b², can be monitored online in the system
- By substituting these values, a lower bound on the request rate λ_i that can be serviced by a single server can be obtained
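
As a minimal sketch of that substitution, the function below evaluates the G/G/1 bound λ_i ≥ [s_i + (σ_a² + σ_b²) / (2 (d_i − s_i))]^(−1) from measured values. The function and parameter names are hypothetical, and all times are assumed to be in the same unit (for example, seconds).

```python
def per_server_capacity(d_i: float, s_i: float,
                        var_interarrival: float, var_service: float) -> float:
    """Lower bound on the request rate one server can sustain at tier i.

    d_i: target mean response time for the tier
    s_i: mean service time per request
    var_interarrival, var_service: measured variances (sigma_a^2, sigma_b^2)
    """
    if d_i <= s_i:
        # No arrival rate can meet a target at or below the bare service time.
        raise ValueError("response-time target must exceed the service time")
    queueing_term = (var_interarrival + var_service) / (2.0 * (d_i - s_i))
    return 1.0 / (s_i + queueing_term)
```

With zero variance the bound reduces to 1/s_i, the rate of a server that never queues. Larger variances lower the rate a server can be trusted to handle.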

Provisioning Algorithm -- How much (3/3)
- η_i: the number of servers needed at tier i (output)
- Z: the average session think-time
- 1/Z: the rate at which a session issues requests
- λ: the session arrival rate
- τ: the average session duration
- β_i: the number of requests triggered at tier i by a single incoming request
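
The slide's formula for η_i was an image and is missing from the transcript. One way to combine the listed quantities uses Little's law: about λ·τ sessions are active at once, each issuing requests at rate 1/Z, so tier i sees roughly β_i·λ·τ/Z requests per unit time, which is then divided by the per-server rate λ_i. The sketch below assumes exactly that relationship, and its names are hypothetical.

```python
import math


def servers_needed(beta_i: float, session_rate: float, session_duration: float,
                   think_time: float, per_server_rate: float) -> int:
    """Estimate eta_i = ceil(beta_i * lambda * tau / (lambda_i * Z)).

    session_rate (lambda): sessions arriving per second
    session_duration (tau): average session length in seconds
    think_time (Z): average think time between requests in seconds
    per_server_rate (lambda_i): requests per second one server can sustain
    """
    concurrent_sessions = session_rate * session_duration   # Little's law
    tier_request_rate = beta_i * concurrent_sessions / think_time
    return math.ceil(tier_request_rate / per_server_rate)
```

For example, 2 sessions/s lasting 300 s with a 10 s think time produce about 60 requests/s, which needs 3 servers at 25 requests/s each.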

Provisioning Algorithm -- When -- Predictive Provisioning for Long Term (1/3)
- Predictive provisioning is motivated by long-term variations, such as time-of-day or seasonal effects, exhibited by Internet workloads
  - the workload seen by an Internet application typically peaks around noon every day and is at its minimum in the middle of the night
- The predictor uses past observations of the workload to predict the peak demand that will be seen over a period of T hours
- For simplicity of exposition, assume that T = 1 hour

Provisioning Algorithm -- When -- Predictive Provisioning for Long Term (2/3)

Provisioning Algorithm -- When -- Predictive Provisioning for Long Term (3/3)
- λ_pred(t): the predicted arrival rate during a particular hour, denoted by t
- λ_obs(t): the actual arrival rate seen during this hour
- λ_obs(t) − λ_pred(t): the prediction error
- (1/h) · Σ_{i=t−h..t−1} (λ_obs(i) − λ_pred(i)): the mean prediction error over the past h hours
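
The slide defines the prediction error but the transcript does not show how it is used. A plausible reading, assumed here, is that a prediction is raised by the mean error of the past h hours whenever recent predictions fell short, and left unchanged otherwise. The function name and signature are hypothetical.

```python
from typing import Sequence


def corrected_prediction(base_prediction: float,
                         past_observed: Sequence[float],
                         past_predicted: Sequence[float],
                         h: int) -> float:
    """Adjust lambda_pred(t) using the mean error of the last h hours."""
    if h <= 0:
        raise ValueError("h must be a positive number of hours")
    # Error per past hour: lambda_obs(i) - lambda_pred(i).
    errors = [obs - pred
              for obs, pred in zip(past_observed[-h:], past_predicted[-h:])]
    if not errors:
        return base_prediction          # no history yet, nothing to correct
    mean_error = sum(errors) / len(errors)
    # Assumption: only under-prediction is corrected, so capacity is never
    # reduced on the basis of recent over-prediction.
    return base_prediction + max(0.0, mean_error)
```

For instance, if the last two hours were under-predicted by 20 and 30 requests/s, a base prediction of 100 becomes 125.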

Provisioning Algorithm -- When -- Reactive Provisioning for Short Term (1/3)
- Sudden load spikes or flash crowds are inherently unpredictable phenomena
- Reactive provisioning is used to react swiftly to such unforeseen events
  - operates on short time scales, on the order of minutes, checking for workload anomalies

Provisioning Algorithm -- When -- Reactive Provisioning for Short Term (2/3)
- Reactive provisioning is invoked once every few minutes
- It can also be invoked on demand by the application sentry
- Two approaches
  - Recompute a new allocation of servers for the various tiers
  - Increase the allocation of all tiers that are at or near saturation by a constant amount
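
The second approach can be sketched as follows. The saturation threshold, the increment, the use of utilization as the saturation signal, and the function name are all assumptions for illustration. The slides say only "at or near saturation" and "a constant amount".

```python
def bump_saturated_tiers(allocation: dict, utilization: dict, replicable: set,
                         threshold: float = 0.9, increment: int = 1) -> dict:
    """Return a new per-tier server count after one reactive step.

    allocation: tier name -> servers currently allocated
    utilization: tier name -> measured utilization in [0, 1]
    replicable: tiers that can actually receive more servers
    """
    new_allocation = dict(allocation)   # leave the caller's mapping untouched
    for tier, servers in allocation.items():
        saturated = utilization.get(tier, 0.0) >= threshold
        if saturated and tier in replicable:
            new_allocation[tier] = servers + increment
    return new_allocation
```

A non-replicable tier, such as the database in the evaluation setup, is left alone even when saturated. Adding servers there is not possible, so overload has to be handled some other way, for example by the sentry.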

Provisioning Algorithm -- When -- Reactive Provisioning for Short Term (3/3)
- If the free pool is empty or has insufficient servers
  - servers need to be borrowed from other, underloaded applications running on the hosting platform
- An application is said to be underloaded if its observed workload is significantly lower than its provisioned capacity
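
A small sketch of this fallback: take what the free pool can supply, then list underloaded applications as candidate donors for the shortfall. The slides do not quantify "significantly lower", so the 50% `slack` factor and the function name are assumptions.

```python
def plan_acquisition(needed: int, free_servers: int, apps: dict,
                     slack: float = 0.5):
    """Decide where `needed` servers could come from.

    apps: application name -> (observed_workload, provisioned_capacity)
    Returns (servers taken from the free pool, remaining shortfall,
             names of underloaded applications that could donate).
    """
    from_free_pool = min(needed, free_servers)
    shortfall = needed - from_free_pool
    donors = []
    if shortfall > 0:
        # Assumed test for "underloaded": workload below slack * capacity.
        donors = [name for name, (observed, provisioned) in apps.items()
                  if provisioned > 0 and observed < slack * provisioned]
    return from_free_pool, shortfall, donors
```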

Server Switching (1/2)
- Assume that each Elf server runs multiple virtual machines, with capsules of different applications within them
- Only one capsule and its virtual machine are active at any time
- Other virtual machines are dormant: they are allocated minimal server resources
- If the server belongs to the free pool, all of its resident VMs are dormant

Server Switching (2/2)
- Switching an Elf server from one application to another implies deactivating a VM by reducing its resource allocation to ε
  - ε is a small value such that the VM consumes negligible resources
- But the server may retain state of existing sessions, so the VM is ramped down in one of two ways
  - Fixed-rate ramp down: some long-lived residual sessions will be forced to terminate
  - Measurement-based ramp down: the server switching time is long
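
A fixed-rate ramp down can be pictured as a schedule of shrinking resource shares that ends at ε. The step size, the value of ε, and the function name below are illustrative assumptions. The slides do not give the actual rate.

```python
def fixed_rate_ramp_down(share: float, step: float,
                         epsilon: float = 0.01) -> list:
    """Return the successive resource shares of a VM being deactivated.

    share: current fraction of the server given to the VM (e.g. 1.0)
    step: fixed amount removed per interval (assumed parameter)
    epsilon: residual share of a dormant VM
    """
    if step <= 0:
        raise ValueError("step must be positive")
    schedule = []
    while share > epsilon:
        # Shrink by a fixed amount, but never below the dormant share.
        share = max(epsilon, share - step)
        schedule.append(share)
    return schedule
```

The schedule has a fixed length regardless of how many sessions remain, which is why long-lived residual sessions can be cut off. A measurement-based variant would instead wait for the observed session count to drop, at the cost of a longer switching time.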

Evaluation -- Environment (1/3)
- A prototype data center
  - a cluster of 40 Pentium servers
    - Application capsules (2.8GHz, 512MB RAM)
    - Load balancer
    - Control plane (dual-processor 450MHz, 1GB RAM)
    - Sentry (dual-processor 1GHz, 1GB RAM)
    - Workload generator
  - connected via a 1Gbps Ethernet switch
  - running Linux
- Three tiers
  - Apache Web server (2.0.48)
  - Tomcat servlet container (4.1.29)
  - Non-replicable MySQL database server (4.0.18)

Evaluation -- Environment (2/3)
- Virtual Machine Monitor
  - Xen 1.2 …..
- Nucleus
  - online measurements of resource usage and request performance
  - real-time processing of logs provided by the application software components
  - offline measurements to determine various quantities needed by the control plane
- Sentry and load balancer
  - Use Kernel TCP Virtual Server (ktcpvs) version for the sentry and the Apache layer
  - mod_jk: an Apache module that implements a variant of round-robin request distribution for the Tomcat layer
- Control Plane
  - a daemon running on a dedicated machine
  - implements the predictive and reactive provisioning

Evaluation -- Environment (3/3)
- Two open-source multi-tier applications
  - Rubis
    - an eBay-like auction site
    - three types of user sessions: selling, browsing, bidding
    - 9 tables in the database
    - 26 interactions that can be accessed from the clients' Web browsers
  - Rubbos
    - a bulletin-board application
    - two different levels of access: regular user and moderator
    - provides 24 Web interactions
- SLA: the 95th percentile of the response time is no greater than 2 seconds
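
To make the SLA concrete, the sketch below checks a batch of measured response times against a percentile target. It uses the nearest-rank percentile definition, which is an assumption (the slides do not say how the percentile is computed), and the function names are hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% at or below it."""
    if not values:
        raise ValueError("need at least one measurement")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(response_times: Sequence[float], p: int = 95,
              limit_seconds: float = 2.0) -> bool:
    """True if the p-th percentile response time is within the limit."""
    return percentile(response_times, p) <= limit_seconds
```

Under this definition, up to 5% of requests may exceed 2 seconds without violating the SLA. A sixth slow request out of 100 tips it over.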

Evaluation -- Independent per-tier provisioning (1/3)
- Use the Rubbos application
- Workload increases every 10 minutes

Evaluation -- Independent per-tier provisioning (2/3)
- Employ dynamic provisioning only at the most compute-intensive tier of the application, since it is the most common bottleneck
  - the Tomcat tier
- The capacity of a Tomcat server was determined to be 40 simultaneous sessions, while Apache was configured with a connection limit of 256 sessions

Evaluation -- Independent per-tier provisioning (3/3)
- Use the multi-tier provisioning technique

Evaluation -- The black box approach (1/2)
- Use Rubis
- Assume that two Tomcat servers and one Apache server are added to the application every time a capacity increase is signaled
- But the database is not replicable

Evaluation -- The black box approach (2/2)
- Use the multi-tier provisioning technique

Evaluation -- Predictive and Reactive Provisioning (1/4)
- Use Rubis
- Workload
  - 1998 Soccer World Cup site, 8-day period
  - Compressing the original 24-hour trace to 1 hour
    - picking every 24th minute and discarding the rest
  - Day 6 (typical day)
  - Day 7 (moderate overload)
  - Day 8 (extreme overload)
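
The trace compression step is simple enough to sketch directly. A 24-hour trace has 1440 per-minute samples, and keeping every 24th one leaves 60 samples, which are replayed as one hour. The per-minute list representation and the function name are assumptions made for illustration.

```python
from typing import List, Sequence


def compress_trace(per_minute: Sequence[float], factor: int = 24) -> List[float]:
    """Keep every `factor`-th per-minute sample and discard the rest."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return list(per_minute[::factor])
```

Sampling keeps the overall shape of the day while shortening the experiment, but any spike that falls entirely between sampled minutes is lost.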

Evaluation -- Predictive and Reactive Provisioning (2/4)
- Day 6
- Only predictive provisioning

Evaluation -- Predictive and Reactive Provisioning (3/4)
- Day 7
- Predicted with and without the recent trend
- Prediction failed during interval 2
- Reactive provisioning is triggered only after the SLA is violated

Evaluation -- Predictive and Reactive Provisioning (4/4)
- Day 8
- Prediction fails
- The unpredictable workload consumes all the servers
- Policing is used to drop sessions

Evaluation -- Switching of server resources
- Scenario 1: a new server is taken from the free pool; the application must be started
- Scenario 2: as 1, but the application is already running
- Scenario 3: the server is taken from another application, waiting for all residual sessions to finish
- Scenario 4: as 3, but the two VMs share the CPU equally until the sessions finish
- Scenario 5: as 3, but using "fixed rate ramp down"

Conclusion
- A flexible queuing model to determine how many resources to allocate to each tier of the application
- A combination of predictive and reactive methods that determine when to provision these resources, at both large and small time scales

Comments (1/2)
- A different way of thinking about resource provisioning
  - Which service should be allocated resources?
    - the SLA must be violated first
  - How many resources should be allocated to services, and when?
    - the accuracy of prediction is the key point
  - Can the two ways be combined?
- The evaluation results in the paper do not seem very good
  - The prediction interval and the reactive interval are too long (15 min and a few minutes)
  - But checking more frequently would add load

Comments (2/2)
- Is an unpredictable workload really unpredictable?
  - Cooperate with news
  - But it's not automatic
- Queuing theory…………

Thanks
The End