Dynamic Provisioning for Multi-tier Internet Applications

Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, Pawan Goyal
University of Massachusetts, University of Minnesota, Veritas Software India Pvt. Ltd.

Thanks for the nice introduction. It’s a pleasure to be here. I am Bhuvan Urgaonkar. I will talk about the research I have done for my thesis on dynamic resource management in Internet data centers.

Internet Applications
Proliferation of Internet applications: auction site, online game, online store
Growing significance in personal, business affairs
Focus: Internet server applications

A wide variety of Internet applications have become popular during the last decade or so. Examples include online auction sites, gaming sites, online retail stores and so on. We have come to rely increasingly on such applications for conducting both our personal and business affairs. These applications typically provide a web-based interface to their clients. The focus of my research is on Internet server applications.

Multi-tiered Internet Applications
[Figure labels: requests, load balancer, HTTP, J2EE, database]
Internet applications: multiple tiers
  Example: 3 tiers: HTTP, J2EE app server, database
Replicable components
  Individual tiers: partially or fully replicable
  Example: clustered HTTP, J2EE server, shared-nothing db
Each tier uses a dispatcher: load balancing

Internet Workloads Are Dynamic
Multi-time-scale variations
  Time-of-day, hour-of-day
  Flash crowds
Key issue: How to provide desired response time under varying workloads?
[Figure: arrivals per min vs. time (days), over 5 days]
[Figure: request rate (req/min) vs. time (hours), annotated at 140K]

The workloads seen by these Internet applications show variations at multiple time scales. Perhaps the best-known example is the time-of-day behavior exhibited by the workloads of many web sites. The first figure shows the workload, in requests per minute, arriving at a production web server at a large corporation over a period of 5 days. Each day, the workload starts off at a low value in the morning, increases steadily, peaks sometime in the afternoon, and then recedes again. The workload thus shows cyclic behavior at the granularity of a day, although it also varies from day to day. Within each day there are also variations from hour to hour.

Internet applications are also prone to unanticipated overloads known as flash crowds. An example is seen in the second figure, which shows the request arrival rate at the web site hosting the World Cup soccer event. Once the game started, the access rate to the site went up by a factor of 7 due to an increased number of clients checking live scores.

Despite such variations in workload, it is still important for applications to provide good performance to their clients. Several user studies have shown that web clients tend to get frustrated if a web page doesn’t download within 8-10 sec. So a key problem for Internet applications is to provide good response time to their clients even as their workloads vary with time.

Internet Data Center
Internet applications run on data centers
Server farms
  Provide computational and storage resources
Applications share data center resources
Problem: How should the platform allocate resources to absorb workload variations?

Our Provisioning Approach
Flexible queuing theoretic model
  Captures all tiers in the application
Predictive provisioning
  Long-term workload variations
Reactive provisioning
  Short-term variations, flash crowds

Talk Outline
Introduction
Internet data center model
Existing provisioning approaches
Dynamic capacity provisioning
Implementation and evaluation
Summary

Here is the outline of the rest of this talk. I will first present capacity planning and application placement mechanisms that are concerned with deriving the requirements of an application and deciding where in the data center it should run. Then I will describe provisioning mechanisms to deal with dynamic workloads. Finally I will summarize the talk and discuss the work that remains in this thesis.

Data Center Model
[Figure labels: retail web site, streaming]
Dedicated hosting: each application runs on a subset of servers in the data center
  Subsets are mutually exclusive: no server sharing
Data center hosts multiple applications
Free server pool: unused servers

Single-tier Provisioning
Single tier provisioning well studied [Muse]
Non-trivial to extend to multiple tiers
Strawman #1: use single-tier provisioning independently at each tier
Problem: independent tier provisioning may not increase goodput
[Figure: 14 req/s arrive at three tiers with capacities C=15, C=10, C=10.1; the first tier passes 14 req/s, the second passes 10 req/s and drops 4 req/s]

Single-tier Provisioning
Strawman #1 after capacity is added to the second tier only: the bottleneck moves to the third tier and goodput rises from 10 to just 10.1 req/s
[Figure: 14 req/s arrive at three tiers with capacities C=15, C=20, C=10.1; the first two tiers pass 14 req/s, the third passes 10.1 req/s and drops 3.9 req/s]
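The arithmetic behind the two figures can be reproduced with a minimal sketch. It assumes a simplified model that is not stated on the slide: tiers form a chain, and each tier forwards at most its capacity and drops the excess. The function name `pipeline_goodput` is hypothetical.

```python
def pipeline_goodput(arrival_rate, tier_capacities):
    """Return (goodput, drops_per_tier) for a chain of tiers.

    Assumption: each tier forwards min(incoming rate, capacity)
    and drops whatever exceeds its capacity.
    """
    rate = arrival_rate
    drops = []
    for capacity in tier_capacities:
        served = min(rate, capacity)
        drops.append(round(rate - served, 6))  # round away float noise
        rate = served
    return rate, drops


# Capacities and arrival rate taken from the two slides.
before = pipeline_goodput(14, [15, 10, 10.1])
after = pipeline_goodput(14, [15, 20, 10.1])
print(before)  # (10, [0, 4, 0])
print(after)   # (10.1, [0, 0, 3.9])
```

Doubling the capacity of the second tier moves the drops to the third tier, so goodput improves by only 0.1 req/s.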

Model-based Provisioning
Black box approach
  Treat application as a black box
  Measure response time from outside
  Increase allocation if response time > SLA
  Use a model to determine how much to allocate
Strawman #2: use black box for multi-tier apps
Problems:
  Unclear which tier needs more capacity
  May not increase goodput if bottleneck tier is not replicable
[Figure: 14 req/s arrive at three tiers with capacities C=15, C=20, C=10.1; 14 req/s reach the third tier, which passes 10.1 req/s]

Provisioning Multi-tier Apps
Approach: holistic view of multi-tier application
  Determine tier-specific capacity independently
  Allocate capacity by looking at all tiers (and other apps)
Predictive provisioning
  Long-term provisioning: time scale of hours
  Maintain long-term workload statistics
  Predict and provision for the next few hours
Reactive provisioning
  Short-term provisioning: time scale of several minutes
  React to “current” workload trends
  Correct errors of long-term provisioning
  Handle flash crowds (inherently unpredictable)

Predictive Provisioning
Workload predictor
  Predicts workload based on past observations
Application model
  Infers capacity needed to handle given workload
[Figure: past workload feeds the Predictor, which outputs the predicted workload; the predicted workload and the response time target feed the Model, which outputs the required capacity]

Workload Prediction
Long-term workload monitoring and prediction
  Monitor workload for multiple days
  Maintain a histogram for each hour of the day
  Capture time-of-day effects
Forecast based on
  Observed workload for that hour in the past
  Observed workload for the past few hours of the current day
Predict a high percentile of expected workload
[Figure labels: Mon, Tue, Wed, Today]
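A minimal sketch of such a predictor is shown below. It keeps one list of observed rates per hour of the day and predicts a high percentile of that list. The correction term (the mean of recent underestimates from the current day) is an assumption made for illustration; the slide only says that the past few hours are taken into account. The names `HourlyPredictor` and `percentile` are hypothetical.

```python
import math
from collections import defaultdict


def percentile(samples, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


class HourlyPredictor:
    def __init__(self, q=95):
        self.q = q
        # hour of day (0-23) -> rates observed for that hour on past days
        self.history = defaultdict(list)

    def observe(self, hour, rate):
        self.history[hour].append(rate)

    def predict(self, hour, recent_errors=()):
        """Predict the rate for `hour`.

        recent_errors: (actual - predicted) for the past few hours of
        today. Assumption: only underestimates raise the forecast.
        """
        base = percentile(self.history[hour], self.q)
        if recent_errors:
            mean_error = sum(recent_errors) / len(recent_errors)
            base += max(0.0, mean_error)
        return base


predictor = HourlyPredictor(q=95)
for rate in [100, 120, 110, 130]:  # made-up rates for 14:00 on four days
    predictor.observe(14, rate)

plain = predictor.predict(14)
corrected = predictor.predict(14, recent_errors=[10, 20])
print(plain, corrected)
```

Predicting a high percentile rather than the mean biases the forecast toward over-provisioning, which is the safer error when the goal is to meet a response time target.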

Model-based Capacity Inference
[Figure: predicted workload λ_pred arriving at a tier of G/G/1 servers]
Queuing theoretic application model
  Each individual server is a G/G/1 queue
  Derive per-tier E(r) from end-to-end SLA
  Monitor other parameters and determine λ (per-server capacity)
  Use predicted workload λ_pred to determine # servers per tier
  Assumes perfect load balancing in each tier
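The slide does not give the formula, so the following is a sketch under a stated assumption: it uses Kingman's upper bound on the mean waiting time of a G/G/1 queue, W <= λ(σ_a² + σ_b²) / (2(1 − λs)), and solves for the largest λ that keeps the mean response time s + W within the per-tier target d. All function and parameter names are hypothetical, and the variances are treated as measured inputs.

```python
import math


def per_server_capacity(resp_target, service_time, var_interarrival, var_service):
    """Largest arrival rate (req/s) one server can take while the
    Kingman bound on mean response time stays within resp_target.

    Solving d = s + lam * (va + vb) / (2 * (1 - lam * s)) for lam gives
    lam = 1 / (s + (va + vb) / (2 * (d - s))).
    """
    if resp_target <= service_time:
        raise ValueError("response time target must exceed mean service time")
    slack = resp_target - service_time
    return 1.0 / (service_time + (var_interarrival + var_service) / (2.0 * slack))


def servers_needed(predicted_rate, capacity):
    """Servers for a tier, assuming perfect load balancing within the tier."""
    return max(1, math.ceil(predicted_rate / capacity))


# Made-up measurements: 0.1 s mean service time, 0.5 s per-tier target.
cap = per_server_capacity(resp_target=0.5, service_time=0.1,
                          var_interarrival=0.04, var_service=0.04)
n = servers_needed(predicted_rate=42, capacity=cap)
print(round(cap, 3), n)  # 5.0 9
```

Dividing the predicted tier workload by the per-server capacity relies on the perfect load balancing assumption from the slide; an imbalanced dispatcher would need more servers than this estimate.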

Reactive Provisioning
[Figure: time series of λ_pred and λ_actual; the prediction error λ_error is compared with a threshold τ, and if λ_error > τ the reactor is invoked to allocate servers]
Idea: react to current conditions
  Useful for capturing significant short-term fluctuations
  Can correct errors in predictions
Track error between long-term predictions and actual
  Allocate additional servers if error exceeds a threshold
  Account for prediction errors
Can be invoked if request drop rate exceeds a threshold
  Handles sudden flash crowds
Operates over time scale of a few minutes
Pure reactive provisioning: lags workload
  Reactive + predictive more effective!
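One reactor invocation can be sketched as follows. This is an illustration only: the use of relative error, the threshold values, and the rule for sizing the new allocation are assumptions, and the name `reactive_step` is hypothetical.

```python
import math


def reactive_step(predicted_rate, actual_rate, per_server_capacity, allocated,
                  drop_rate=0.0, error_threshold=0.1, drop_threshold=0.01):
    """Return the server count for a tier after one reactor invocation.

    Assumptions: the error is measured relative to the prediction, and
    when the reactor fires it sizes the tier for the observed rate.
    Servers are only added here, never released.
    """
    error = (actual_rate - predicted_rate) / max(predicted_rate, 1e-9)
    if error > error_threshold or drop_rate > drop_threshold:
        needed = math.ceil(actual_rate / per_server_capacity)
        return max(allocated, needed)
    return allocated


# Made-up numbers: prediction of 40 req/s, 5 req/s per server, 8 servers.
flash = reactive_step(predicted_rate=40, actual_rate=60,
                      per_server_capacity=5, allocated=8)
calm = reactive_step(predicted_rate=40, actual_rate=42,
                     per_server_capacity=5, allocated=8)
print(flash, calm)  # 12 8
```

Because the reactor sees the overload only after it has started, it lags the workload by up to one invocation interval, which is why the slide pairs it with the predictor.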

Talk Outline
Introduction
Internet data center model
Existing provisioning approaches
Dynamic capacity provisioning
Implementation and evaluation
Summary

Prototype Data Center
40+ Linux servers, Gigabit switches
[Figure: a control plane (dynamic provisioning) and server nodes, each running a nucleus (resource monitoring, parameter estimation), applications, and the OS]
Multi-tier applications
  Auction (RUBiS)
  Bulletin-board (RUBBoS)
  Apache, Tomcat (replicable)
  MySQL database

Only Predictive Provisioning
Auction application RUBiS
Factor of 4 increase in 30 min
[Figure panels: Workload, Response time]
Predictor fails during [15, 30] min, resulting in under-provisioning
Response time violations occur

Only Reactive Provisioning
Auction application RUBiS
Factor of 4 increase in 30 min
[Figure panels: Workload; Response time, as resp time (msec) vs. time (min)]
Response time shows oscillatory behavior
Several response time violations occur

Predictive + Reactive Provisioning
Auction application RUBiS
Factor of 4 increase in 30 min
[Figure panels: Workload, as arrivals per min vs. time (min); Server allocations; Response time, as resp time (msec) vs. time (min)]
Server allocations increased to match increased workload
Response time kept below 2 seconds

Finally, we show how our system performs when the prediction mechanism is enhanced by the threshold-based reactor and policer. The reactor was invoked at intervals of 5 min. We see that at 20 and 25 min, on observing deviations between actual arrivals and the predicted values, the reactor pulled additional servers into the Java application tier. The policer was also active, keeping the response times under the desired limit. This illustrates the effectiveness of integrating these mechanisms. The predictor …

Questions:
Why isn’t it enough to have just the reactor?
Why is the response time behavior after t=30 min different from that in the last experiment?

Summary
Dynamic provisioning for multi-tier applications
Flexible queuing theoretic model
  Captures all tiers in the application
Predictive provisioning
Reactive provisioning
Implementation and evaluation on a Linux cluster

Thank you!
More information at: http://www.cs.umass.edu/~bhuvan

Thank you for your attention. More information about my research is available at this URL, and I would be happy to answer any questions you may have.
