Dynamic Provisioning for Multi-tier Internet Applications

Presentation transcript:

Dynamic Provisioning for Multi-tier Internet Applications
  Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, Pawan Goyal
  University of Massachusetts, University of Minnesota, Veritas Software India Pvt. Ltd.
Notes: Thanks for the nice introduction. It’s a pleasure to be here. I am Bhuvan Urgaonkar. I will talk about the research I have done for my thesis on dynamic resource management in Internet data centers.

Internet Applications
  Proliferation of Internet applications (auction site, online game, online store)
  Growing significance in personal, business affairs
  Focus: Internet server applications
Notes: A wide variety of Internet applications have become popular during the last decade or so. Examples of such applications include online auction sites, gaming sites, online retail stores and so on. We have come to increasingly rely on such applications for conducting both our personal and business affairs. These applications typically provide a web-based interface to their clients. The focus of my research is on Internet server applications.

Multi-tiered Internet Applications
  [Figure labels: requests, load balancer, HTTP, J2EE, database]
  Internet applications: multiple tiers
    Example: 3 tiers: HTTP, J2EE app server, database
  Replicable components
    Individual tiers: partially or fully replicable
    Example: clustered HTTP, J2EE server, shared-nothing db
  Each tier uses a dispatcher: load balancing

Internet Workloads Are Dynamic
  Multi-time-scale variations
    Time-of-day, hour-of-day
    Flash crowds
  Key issue: How to provide desired response time under varying workloads?
  [Figure: arrivals per min vs. time (days), over 5 days]
  [Figure: request rate (req/min) vs. time (hours), peaking near 140K]
Notes: The workloads seen by these Internet applications show variations at multiple time-scales. Perhaps the most well known example of such variation is the time-of-day behavior exhibited by the workloads of many web sites. The first figure shows the workload in requests per minute arriving at a production web server at a large corporation over a period of 5 days. Each day, the workload starts off at a low value in the morning, increases steadily and peaks sometime in the afternoon and then recedes again. Thus the workload seems to show a cyclic behavior at the granularity of a day although it also exhibits variations from day to day. Within each day also there are variations from hour to hour. Internet applications are prone to unanticipated overloads known as flash crowds. An example is seen in the second figure, which shows the request arrival rate to the web site hosting the world cup soccer event in 1998. Here we see that once the game started, the access rate to the site went up by a factor of 7 due to an increased number of clients checking live scores. Despite such variations in workloads, it is still important for applications to provide good performance to their clients. Several user studies have shown that web clients tend to get frustrated if a web page doesn’t get downloaded within 8-10 sec. So a key problem in the context of Internet applications then is to be able to provide good response time to their clients even as their workloads vary with time.

Internet Data Center
  Internet applications run on data centers
    Server farms
    Provide computational and storage resources
  Applications share data center resources
  Problem: How should the platform allocate resources to absorb workload variations?

Our Provisioning Approach
  Flexible queuing theoretic model
    Captures all tiers in the application
  Predictive provisioning
    Long-term workload variations
  Reactive provisioning
    Short-term variations, flash crowds

Talk Outline
  Introduction
  Internet data center model
  Existing provisioning approaches
  Dynamic capacity provisioning
  Implementation and evaluation
  Summary
Notes: Here is the outline of the rest of this talk. I will first present capacity planning and application placement mechanisms that are concerned with deriving the requirements of an application and deciding where on the data center it should run. Then I will describe provisioning mechanisms to deal with dynamic workloads. Finally I will summarize the talk and discuss the work that remains in this thesis.

Data Center Model
  [Figure labels: retail Web site, streaming]
  Dedicated hosting: each application runs on a subset of servers in the data center
    Subsets are mutually exclusive: no server sharing
  Data center hosts multiple applications
  Free server pool: unused servers
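The dedicated hosting model above can be sketched as a small bookkeeping structure. This is only an illustration: the class name `DataCenter` and its methods are hypothetical and are not taken from the prototype described later in the talk.

```python
class DataCenter:
    """Dedicated hosting: each server is either free or owned by one application.

    Hypothetical illustration; names are not from the talk's prototype.
    """

    def __init__(self, servers):
        self.free_pool = set(servers)   # unused servers
        self.allocations = {}           # application name -> set of its servers

    def allocate(self, app, count):
        """Move `count` servers from the free pool to `app` and return them."""
        if count > len(self.free_pool):
            raise ValueError("free pool has too few servers")
        granted = {self.free_pool.pop() for _ in range(count)}
        self.allocations.setdefault(app, set()).update(granted)
        return granted

    def release(self, app, count):
        """Return `count` of the servers owned by `app` to the free pool."""
        owned = self.allocations.get(app, set())
        if count > len(owned):
            raise ValueError("application owns too few servers")
        returned = {owned.pop() for _ in range(count)}
        self.free_pool.update(returned)
        return returned


if __name__ == "__main__":
    dc = DataCenter(range(10))
    dc.allocate("retail", 3)
    dc.allocate("streaming", 4)
    print(len(dc.free_pool), "servers left in the free pool")
```

Because a server is always popped from one set before it is added to another, the per-application subsets stay mutually exclusive by construction.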

Single-tier Provisioning
  Single tier provisioning well studied [Muse]
  Non-trivial to extend to multiple-tiers
  Strawman #1: use single-tier provisioning independently at each tier
  Problem: independent tier provisioning may not increase goodput
  [Figure: 14 req/s arrive at three tiers with capacities C=15, C=10, C=10.1; 14 req/s pass the first tier, the second tier drops 4 req/s and passes 10]

Single-tier Provisioning (continued)
  [Figure: the same three tiers after the second tier is provisioned to C=20 (capacities C=15, C=20, C=10.1); 14 req/s now reach the last tier, which drops 3.9 req/s and passes 10.1]
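The numbers on these two slides can be reproduced with a few lines of code. The sketch below assumes a simple fluid view in which each tier forwards at most its capacity and drops the excess; the function name `tier_flows` is made up for this illustration.

```python
def tier_flows(arrival_rate, capacities):
    """Push a request rate through tiers in order.

    Each tier forwards at most its capacity (req/s) and drops the rest.
    Returns (goodput, drops), where drops[i] is the rate dropped at tier i.
    """
    rate = arrival_rate
    drops = []
    for capacity in capacities:
        served = min(rate, capacity)
        drops.append(rate - served)
        rate = served
    return rate, drops


if __name__ == "__main__":
    # Before: the middle tier (C=10) is the bottleneck.
    print(tier_flows(14, [15, 10, 10.1]))
    # After provisioning only the middle tier to C=20.
    print(tier_flows(14, [15, 20, 10.1]))
```

Doubling the middle tier moves the bottleneck to the last tier, so goodput rises only from 10 to 10.1 req/s and roughly 3.9 req/s are still dropped. End-to-end goodput is bounded by the smallest capacity along the path, which is why provisioning has to look at all tiers together.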

Model-based Provisioning
  Black box approach
    Treat application as a black box
    Measure response time from outside
    Increase allocation if response time > SLA
    Use a model to determine how much to allocate
  Strawman #2: use black box for multi-tier apps
  Problems:
    Unclear which tier needs more capacity
    May not increase goodput if bottleneck tier is not replicable
  [Figure labels: 14 req/s; C=15, C=20, C=10.1; 14, 10.1]

Provisioning Multi-tier Apps
  Approach: holistic view of multi-tier application
    Determine tier-specific capacity independently
    Allocate capacity by looking at all tiers (and other apps)
  Predictive provisioning
    Long-term provisioning: time scale of hours
    Maintain long-term workload statistics
    Predict and provision for the next few hours
  Reactive provisioning
    Short term provisioning: time scale of several minutes
    React to “current” workload trends
    Correct errors of long-term provisioning
    Handle flash crowds (inherently unpredictable)

Predictive Provisioning
  Workload predictor
    Predicts workload based on past observations
  Application model
    Infers capacity needed to handle given workload
  [Figure: past workload -> Predictor -> predicted workload -> Model (with response time target) -> required capacity]

Workload Prediction
  Long term workload monitoring and prediction
    Monitor workload for multiple days
    Maintain a histogram for each hour of the day
    Capture time of day effects
  Forecast based on
    Observed workload for that hour in the past
    Observed workload for the past few hours of the current day
  Predict a high percentile of expected workload
  [Figure labels: Mon, Tue, Wed, Today]
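A predictor along these lines might look like the sketch below. It is a minimal illustration under stated assumptions: the class name `HourlyPredictor`, the nearest-rank percentile, and the rule for folding in today's recent errors (add their mean when it is positive) are choices made for this example and are not given on the slide.

```python
from collections import defaultdict


class HourlyPredictor:
    """Keeps one list of observed request rates per hour of the day.

    Hypothetical sketch: the percentile method and the correction rule
    are assumptions, not the talk's exact algorithm.
    """

    def __init__(self, percentile=95):
        self.percentile = percentile        # integer in 1..100
        self.history = defaultdict(list)    # hour (0-23) -> past observed rates

    def observe(self, hour, rate):
        """Record the workload seen during `hour` on some past day."""
        self.history[hour].append(rate)

    def _high_percentile(self, samples):
        ordered = sorted(samples)
        # Nearest-rank percentile using integer ceiling division.
        rank = (self.percentile * len(ordered) + 99) // 100
        return ordered[max(rank, 1) - 1]

    def predict(self, hour, recent_errors=()):
        """Predict the rate for `hour`.

        `recent_errors` holds (observed - predicted) for the past few
        hours of the current day; only under-prediction is corrected.
        """
        if not self.history[hour]:
            raise ValueError("no history for this hour")
        base = self._high_percentile(self.history[hour])
        if recent_errors:
            correction = max(0.0, sum(recent_errors) / len(recent_errors))
        else:
            correction = 0.0
        return base + correction


if __name__ == "__main__":
    predictor = HourlyPredictor(percentile=90)
    for rate in [100, 120, 110, 130, 90, 105, 115, 125, 95, 140]:
        predictor.observe(14, rate)
    print(predictor.predict(14))
    print(predictor.predict(14, recent_errors=[10, 30]))
```

Predicting a high percentile rather than the mean biases the system toward over-provisioning, which is usually the cheaper mistake when response time targets are at stake.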

Model-based Capacity Inference
  [Figure labels: G/G/1, λpred]
  Queuing theoretic application model
    Each individual server is a G/G/1 queue
    Derive per-tier E(r) from end-to-end SLA
    Monitor other parameters and determine λ (per-server capacity)
    Use predicted workload λpred to determine # servers per tier
  Assumes perfect load balancing in each tier
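The slide does not spell out how λ is obtained, so the following is a sketch under an assumption: it uses Kingman's upper bound on the mean waiting time of a G/G/1 queue, solved for the largest arrival rate that still keeps the bound within the per-tier response time target. With target d, mean service time s, and variances σa² (inter-arrival time) and σb² (service time), that rate is λ = 1 / (s + (σa² + σb²) / (2(d − s))). The function names are hypothetical.

```python
import math


def per_server_capacity(resp_target, service_time, var_interarrival, var_service):
    """Largest request rate one server can take while Kingman's G/G/1
    bound keeps mean response time within `resp_target`.

    All times are in seconds and variances in seconds squared.
    Assumed model, not stated on the slide.
    """
    if resp_target <= service_time:
        raise ValueError("target must exceed the mean service time")
    slack = resp_target - service_time
    return 1.0 / (service_time + (var_interarrival + var_service) / (2.0 * slack))


def servers_needed(predicted_rate, capacity_per_server):
    """Servers for one tier, assuming perfect load balancing within the tier."""
    return max(1, math.ceil(predicted_rate / capacity_per_server))


if __name__ == "__main__":
    capacity = per_server_capacity(
        resp_target=0.1, service_time=0.02,
        var_interarrival=0.001, var_service=0.0006,
    )
    print(round(capacity, 2), "req/s per server")
    print(servers_needed(120, capacity), "servers for 120 req/s")
```

With these example inputs one server handles about 33.3 req/s, so a predicted 120 req/s needs 4 servers in that tier. The perfect load balancing assumption is what allows the predicted rate to be divided evenly among servers.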

Reactive Provisioning
  [Figure: time series of λpred and λactual; when the prediction error λerror > τ, the reactor is invoked to allocate servers]
  Idea: react to current conditions
    Useful for capturing significant short-term fluctuations
    Can correct errors in predictions
  Track error between long-term predictions and actual
    Allocate additional servers if error exceeds a threshold
    Account for prediction errors
  Can be invoked if request drop rate exceeds a threshold
    Handles sudden flash crowds
  Operates over time scale of a few minutes
  Pure reactive provisioning: lags workload
    Reactive + predictive more effective!
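The reactor's trigger logic can be sketched as follows. This is an illustration only: the function name `reactor`, the use of relative error, and the default threshold values are assumptions, since the slide only says that the error and the drop rate are each compared against a threshold.

```python
import math


def reactor(actual_rate, predicted_rate, current_servers, capacity_per_server,
            error_threshold=0.2, drop_rate=0.0, drop_threshold=0.05):
    """Return how many extra servers to allocate to a tier (0 means no action).

    Hypothetical sketch: thresholds and the relative-error form are assumed.
    Intended to run every few minutes.
    """
    # Relative under-prediction of the long-term predictor.
    error = (actual_rate - predicted_rate) / predicted_rate
    if error <= error_threshold and drop_rate <= drop_threshold:
        return 0
    # Re-size the tier for the workload actually being observed.
    needed = math.ceil(actual_rate / capacity_per_server)
    return max(0, needed - current_servers)


if __name__ == "__main__":
    # Small deviation: the predictive allocation stands.
    print(reactor(100, 95, current_servers=3, capacity_per_server=40))
    # Flash crowd: actual workload is well above the prediction.
    print(reactor(160, 100, current_servers=3, capacity_per_server=40))
```

The reactor only adds servers here; leaving the release of servers to the slower predictive loop is one simple way to avoid the oscillation that a purely reactive scheme can show.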

Prototype Data Center
  [Figure labels: Control Plane; Server Node with Nucleus, Apps, OS]
  40+ Linux servers, Gigabit switches
  Applications, resource monitoring, parameter estimation
  Control Plane
    Dynamic provisioning
  Multi-tier applications
    Auction (RUBiS)
    Bulletin-board (RUBBoS)
    Apache, Tomcat (replicable)
    MySQL database

Only Predictive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figure panels: Workload; Response time]
  Predictor fails during [15, 30] resulting in under-provisioning
  Response time violations occur

Only Reactive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figure panels: Workload; Response time, plotted as resp time (msec) vs. time (min)]
  Response time shows oscillatory behavior
  Several response time violations occur

Predictive + Reactive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figure panels: Workload, arrivals per min vs. time (min); Server allocations; Response time, resp time (msec) vs. time (min)]
  Server allocations increased to match increased workload
  Response time kept below 2 seconds
Notes: Finally, we show how our system performs when the prediction mechanism is enhanced by the threshold-based reactor and policer. The reactor was invoked at intervals of 5 min. We see that at 20 and 25 min, on observing deviations between actual arrivals and the predicted values, the reactor pulled in additional servers to the Java application tier. The policer was also active, keeping the response times under the desired limit. This illustrates the effectiveness of the integration of these mechanisms. The predictor …
  Questions:
    Why isn’t it enough to have just the reactor?
    Why is the response time behavior after t=30min different from that in the last experiment?

Summary
  Dynamic provisioning for multi-tier applications
    Flexible queuing theoretic model
      Captures all tiers in the application
    Predictive provisioning
    Reactive provisioning
  Implementation and evaluation on a Linux cluster

Thank you!
  More information at: http://www.cs.umass.edu/~bhuvan
Notes: Thank you for your attention. More information about my research is available at this URL and I would be happy to answer any questions you may have.