A Regression-Based Analytic Model for Dynamic Resource Provisioning of Multi-Tier Applications Qi Zhang College of William and Mary Williamsburg, VA 23187,

Presentation transcript:

A Regression-Based Analytic Model for Dynamic Resource Provisioning of Multi-Tier Applications
Qi Zhang, College of William and Mary, Williamsburg, VA 23187, USA
Ludmila Cherkasova, Hewlett-Packard Labs, Palo Alto, CA 94304, USA
Evgenia Smirni, College of William and Mary, Williamsburg, VA 23187, USA
Fourth International Conference on Autonomic Computing (ICAC 2007)

OUTLINE
  – INTRODUCTION
  – EXPERIMENTAL ENVIRONMENT
  – TRANSACTION AS A UNIT OF CLIENT/SERVER INTERACTION
  – SESSION-BASED VS. TRANSACTION-BASED SYSTEMS
  – CPU COST OF TRANSACTIONS
  – ANALYTIC MODEL
  – APPROACH LIMITATIONS
  – CONCLUSIONS

INTRODUCTION
Effective models of complex enterprise systems are central to capacity planning and resource provisioning.
Self-adaptive resource provisioning in such systems requires swift responses to workload changes.
The need for fast response necessitates the use of analytic models, which
  – can quickly supply performance numbers, and then
  – can drive system provisioning.
Server virtualization
  – Accurate performance models become instrumental for enabling applications to automatically request necessary resources and for supporting the design of utility services.

Effective analytic models
Effective analytic models can enable powerful and simple solutions for dynamic resource provisioning.
  – Analytic models can provide a contained abstraction of the system.
  – If the system workload is captured well, analytic models can be effective in predicting performance.
A further challenge is the sensitivity of analytic models to their parameterization.
  – Real systems cannot provide accurate workload demands.
  – The workload is session-based rather than transaction-based.

Practical Solution
Each user session consists of an assortment of transactions.
  – Transactions consist of processing many smaller objects and database queries.
  – Although detailed measurements are necessary to increase model accuracy, they become totally impractical.
We provide a practical solution to the above problems by laying out a theoretical framework that
  – illustrates how to use information at the transaction level to effectively model session-based workloads.

Practical Solution
The framework is based on a regression-based methodology that approximates the CPU demands of transactions.
  – It can absorb some level of uncertainty or noise present in real-world data.
  – It effectively compacts the information on workload demands into only a few model parameters.
A detailed set of experiments uses the TPC-W e-commerce suite.
  – TPC-W is a transactional web e-commerce benchmark.
  – TPC stands for Transaction Processing Performance Council.

EXPERIMENTAL ENVIRONMENT
We use a testbed of a multi-tier e-commerce site that simulates the operation of an online bookstore, according to the classic TPC-W benchmark.

Basic Transactions
A client accesses a web service in the form of a session consisting of a sequence of consecutive individual transactions.

TPC-W Specification
According to the TPC-W specification:
  – the number of concurrent sessions is kept constant
  – the specification statistically defines the user session length, the user think time, and the queries that are generated
  – there is a time-out period (uniformly distributed between 5 and 15 minutes)
  – the database size is determined by the number of items and the number of customers (10,000 items and 1,440,000 customers)

Customer Behavior
One way to capture the navigation pattern within a session is through the Customer Behavior Model Graph (CBMG) [9].
TPC-W defines the set of probabilities that drive user behavior from one state to another at the user session level:
  – the browsing mix: 95% browsing and 5% ordering
  – the shopping mix: 80% browsing and 20% ordering
  – the ordering mix: 50% browsing and 50% ordering
[9] D. Menasce and V. Almeida. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning. Prentice Hall, 2000.

System Performance
The system becomes overloaded with 300 EBs, 400 EBs, and 500 EBs under the browsing mix, shopping mix, and ordering mix, respectively.
The front server is apparently the bottleneck when the system processes the shopping and ordering transaction mixes.
For the browsing mix under high loads, it is not obvious which tier and resource is responsible for the bottleneck.

Bottleneck Switch
The figure shows a continuous bottleneck switch between the front and database servers over time.
  – One can observe increased client response times while server utilizations on individual components remain modest.

TRANSACTION AS A UNIT OF CLIENT/SERVER INTERACTION
What is required at the server side to generate a reply to a web page request issued by a client?
A client communicates with a web service via a web interface.
  – The unit of activity at the client side corresponds to a download of a web page (composed of an HTML file and several embedded objects).
At the server side, a web page retrieval corresponds to processing multiple smaller objects that can be retrieved either in sequence or via multiple concurrent connections.

Transactions
The HTTP protocol does not provide any means to delimit the beginning or the end of a web page, so
  – it is very difficult to accurately measure the aggregate resources consumed by web page processing at the server side
  – there is no practical way to effectively measure the service times for all page objects.
We define a transaction as the combination of all the processing activities at the server side needed to deliver an entire web page:
  – generating the main HTML file, retrieving the embedded objects, and performing the related database queries.

SESSION-BASED VS. TRANSACTION-BASED
A session is defined as a sequence of interdependent individual transactions.
A session-based system is not stateless, since the next client transaction explicitly depends on the previous ones.
  – Such transaction dependency in the client behavior limits the opportunity for an efficient analytical model design.
We can model the resource requirements of a session-based system well by evaluating the resource requirements of its simplified transaction-based equivalent.

Simulation Model
Assume that a total of N transaction types are processed by the server.
We use the following notation:
  – Vector π represents the steady-state probabilities of all transaction types, i.e., πi gives the overall percentage of transactions of type i in the workload.

Simulation Model
Session-based model:
  – The transaction type is determined when a client sends the request to the system, according to the pre-defined transition probability matrix P.
Transaction-based model:
  – The transaction type is selected according to the stationary probabilities π.
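The link between the two models can be sketched in a few lines: given a row-stochastic transition matrix P, the stationary vector π satisfies π = πP and can be approximated by power iteration. The 3-state matrix below is a made-up illustration, not a TPC-W mix, and the function name is an assumption of this sketch.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi * P for a row-stochastic matrix P.

    P is a list of rows; P[i][j] is the probability that a transaction
    of type i is followed by a transaction of type j.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical transition matrix over three transaction types.
P_example = [
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],
]
pi_example = stationary_distribution(P_example)
print(pi_example)
```

Power iteration is used here only because it needs no external library; it assumes the chain is irreducible and aperiodic, which holds for the all-positive example matrix.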

Simulation Model
The overall transaction distribution is the same as in the system with session-based behavior.
  – Such a transaction distribution can be easily monitored for an existing production system.
If we find a way to approximate the service time of each transaction type in the workload, we can
  – evaluate the average service time for the entire system under changing workload conditions (i.e., under varying transaction mix and load conditions over time)
  – design compact and efficient analytical models that answer capacity planning and resource requirement questions.

CPU COST OF TRANSACTIONS
We propose a statistical regression-based approach for an efficient approximation of the CPU demands of different transaction types.
A prerequisite to applying regression is that the service provider collects the following:
  – the application server access log that reflects all processed client transactions (i.e., client web page accesses)
  – the CPU utilization of every tier of the evaluated system.

Regression Methodology
Assuming that N transaction types in total are processed by the server, we use the following notation:
From the utilization law, one can easily obtain Eq. (2) for each monitoring window.
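The slide's notation and equation did not survive in this transcript. As a hedged sketch, with every symbol below being an assumption of this note rather than the authors' notation: let T be the length of the monitoring window, N_i the number of type-i transactions completed in the window, D_{i,n} the average CPU service time of a type-i transaction at tier n, D_{0,n} a per-window CPU overhead, and U_{CPU,n} the average CPU utilization of tier n over the window. The utilization law (busy time equals completions times service time) then suggests a relation of the form:

```latex
% Busy CPU time at tier n in one monitoring window of length T
% (symbols are assumed; D_{0,n} is a hypothetical overhead term)
D_{0,n} + \sum_{i=1}^{N} N_i \, D_{i,n} \;=\; U_{CPU,n} \cdot T
```

One such equation per monitoring window yields the over-determined linear system that the regression is applied to.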

Regression Methodology
Because it is practically infeasible to get accurate service times, we work with an approximated CPU cost for each transaction type i, for 0 ≤ i ≤ N.
An approximated utilization can then be calculated from these costs.
We use the Non-negative Least Squares Regression provided by MATLAB to obtain the approximated costs.
  – This regression solver produces a solution for 200 equations with 14 variables in only 7 milliseconds.
  – The common least squares algorithms have polynomial time complexity (when solving v equations with u variables).
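The regression step can be illustrated with a small, dependency-free sketch. It is not MATLAB's solver: it swaps in plain cyclic coordinate descent with clipping at zero, which solves the same non-negative least squares problem on small inputs. The window data, the column of ones used for a per-window overhead term, and all names are hypothetical.

```python
def nnls_coordinate_descent(A, b, sweeps=5000):
    """Minimize ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent."""
    rows, cols = len(A), len(A[0])
    x = [0.0] * cols
    residual = list(b)  # equals b - A x, and x starts at zero
    col_norm = [sum(A[k][j] ** 2 for k in range(rows)) for j in range(cols)]
    for _ in range(sweeps):
        for j in range(cols):
            if col_norm[j] == 0.0:
                continue
            # Exact minimizer along coordinate j, then clip to keep x_j >= 0.
            step = sum(A[k][j] * residual[k] for k in range(rows)) / col_norm[j]
            new_xj = max(0.0, x[j] + step)
            delta = new_xj - x[j]
            if delta != 0.0:
                for k in range(rows):
                    residual[k] -= A[k][j] * delta
                x[j] = new_xj
    return x


# Hypothetical data: per-window transaction counts for two transaction types.
counts = [(100, 20), (150, 10), (80, 60), (200, 40), (50, 90)]
true_costs = [0.5, 0.02, 0.05]  # assumed overhead and per-transaction CPU seconds

# Each row is [1, N_1, N_2]; the right-hand side is busy CPU time (utilization * T).
A_windows = [[1.0, float(n1), float(n2)] for n1, n2 in counts]
busy_time = [sum(c * a for c, a in zip(true_costs, row)) for row in A_windows]

estimated_costs = nnls_coordinate_descent(A_windows, busy_time)
print(estimated_costs)
```

With real, noisy measurements the recovered costs would only approximate the true demands; the non-negativity constraint keeps any cost from going negative when the data is noisy.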

Monitoring Window Size
We use the traces collected from the TPC-W experiments under the three workload mixes (i.e., browsing, shopping, and ordering mixes as described in Section II) to validate the accuracy of the proposed regression-based method.
We then examine the sensitivity of the regression results to the length T of the monitoring window, i.e., T equal to 1 minute, 5 minutes, 10 minutes, and 15 minutes.
For every monitoring window, we compute the relative error of the approximated utilization.
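The error definition itself is missing from this transcript. A minimal sketch, assuming the usual form |approximated − measured| / measured, together with the kind of summary the results report (the share of windows below an error threshold); the utilization values are invented for illustration.

```python
def relative_error(approx_util, measured_util):
    """Relative error of an approximated utilization (assumed standard form)."""
    return abs(approx_util - measured_util) / measured_util


def fraction_below(errors, threshold):
    """Share of monitoring windows whose relative error is below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)


# Hypothetical per-window utilizations (fractions of one CPU).
measured = [0.40, 0.50, 0.60, 0.80]
approximated = [0.42, 0.45, 0.75, 0.80]

window_errors = [relative_error(a, m) for a, m in zip(approximated, measured)]
share_under_15 = fraction_below(window_errors, 0.15)
print(window_errors, share_under_15)
```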

The approximation of the CPU transaction cost at the front server is more accurate than that at the database server.
A larger T achieves higher accuracy:
  – at the front server, 87% - 98% of monitoring windows have relative errors less than 15%
  – at the database server, 79% - 89% of monitoring windows show relative errors less than 20%.

Workload Rate
Measurements from experiments with
  – less than or equal to 200 EBs are used to get the CPU costs under light load
  – more than 200 EBs are used to get the costs under steady load.
The approximation of the CPU transaction cost is much more accurate when the regression is done separately for different workload rates.
The approximation of the CPU transaction cost is less accurate under the “light” workload rates.

Analytic Model
Because of the upper limit on the number of simultaneous connections at a web server, the system can be modeled as a closed system with a network of queues.
The number of clients in the system is fixed, and the clients circulate in the network.
When a client receives the response from the server, it issues another request after a certain think time (i.e., after spending some time at Q0).

Analytic Model
This model can be efficiently solved using Mean-Value Analysis (MVA) [8], a classic algorithm for solving closed product-form networks.
The model takes as inputs the think time in Q0 and the service demands of Q1 and Q2, and provides the average system throughput, the average transaction response time, and the average queue length in each queue.
The average service demand at tier n is computed from the approximated transaction costs.
This value is used by the MVA model to evaluate the maximum achievable system throughput for the three TPC-W transaction mixes: browsing, shopping, and ordering.
[8] D. Menasce, V. Almeida, L. Dowdy. Capacity Planning and Performance Modeling: From Mainframes to Client-Server Systems. Prentice Hall.
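For readers unfamiliar with MVA, here is a minimal sketch of the exact single-class recursion for a closed network with one think-time (delay) station and queueing stations such as Q1 and Q2. The helper that averages per-transaction costs by the mix π is an assumption about how the per-tier demand is formed, since the slide's formula is not in the transcript; all numeric values are illustrative.

```python
def mix_weighted_demand(pi, costs):
    """Assumed form of the tier demand: mix-weighted average of transaction costs."""
    return sum(p * c for p, c in zip(pi, costs))


def mva(service_demands, think_time, num_clients):
    """Exact single-class MVA for a closed network with a think-time station.

    Returns (throughput, response_time, queue_lengths) at num_clients clients.
    """
    queue_len = [0.0] * len(service_demands)
    throughput = 0.0
    response = 0.0
    for n in range(1, num_clients + 1):
        # An arriving client sees the queue lengths of the network with n-1 clients.
        residence = [d * (1.0 + q) for d, q in zip(service_demands, queue_len)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue_len = [throughput * r for r in residence]  # Little's law per queue
    return throughput, response, queue_len


# Hypothetical two-type mix and per-tier costs in seconds.
pi_mix = [0.5, 0.5]
front_demand = mix_weighted_demand(pi_mix, [0.01, 0.03])
db_demand = mix_weighted_demand(pi_mix, [0.005, 0.015])

x, r, q = mva([front_demand, db_demand], think_time=1.0, num_clients=200)
print(x, r, q)
```

As the client count grows, the throughput from this recursion saturates at the reciprocal of the largest service demand, which is how such a model exposes the maximum achievable throughput.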

Simulation Model
We also evaluate the accuracy and performance of our transaction-based simulation model (introduced in Section IV).
After a certain think time (exponentially distributed), the client sends a transaction to the front server.
  – The transaction type i is randomly selected according to the stationary probabilities π of the browsing, shopping, or ordering mix.
The front server processes this transaction with an exponentially distributed service time whose mean is equal to the approximated CPU cost of transaction type i at the front server, as given by regression.
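One step of such a transaction-based simulation might look like the sketch below: draw a transaction type from π, then draw an exponential service time whose mean is that type's regression cost. This is only the sampling step, not a full queueing simulator, and the mix, costs, and names are hypothetical.

```python
import random


def next_transaction(rng, pi, front_cost):
    """Sample (type, service_time) for one transaction at the front server."""
    i = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    # expovariate takes the rate, i.e. the reciprocal of the desired mean.
    service_time = rng.expovariate(1.0 / front_cost[i])
    return i, service_time


# Hypothetical two-type mix and front-server CPU costs in seconds.
pi_sim = [0.7, 0.3]
cost_sim = [0.01, 0.04]

rng = random.Random(12345)  # fixed seed so runs are repeatable
samples = [next_transaction(rng, pi_sim, cost_sim) for _ in range(20000)]
mean_service = sum(s for _, s in samples) / len(samples)
print(mean_service)
```

Over many samples the mean service time should approach the mix-weighted cost, here 0.7 × 0.01 + 0.3 × 0.04 = 0.019 seconds.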

Modeling Results
For the browsing mix, both the analytic and simulation models predict higher system throughput than the measured one.
  – The two models do less well here because of the bottleneck switching behavior of the browsing mix under higher loads.

APPROACH LIMITATIONS
Once we have approximated the CPU cost of different client transactions at different tiers, we can use these cost functions to evaluate the resource requirements of a scaled or modified transaction workload mix, in order to accurately size a future system.
Ideally, one would like to use the CPU cost function obtained with the regression method under WorkloadMix 1 to predict the system behavior under a different WorkloadMix 2.
We try to assess the accuracy of performance predictions under drastic changes in the workload using the analytic model.

The cost function “all”, obtained from the aggregate profile of all the workload mixes, gives excellent results for a diverse set of workloads.
The transaction cost function should not be applied to a workload mix that is very different from the mix it was derived from.
  – For example, the relative error of the average throughput reaches 80% when the cost function from the browsing mix profile is used to simulate the ordering mix.
  – The browsing mix: 95% browsing and 5% ordering.
  – The ordering mix: 50% browsing and 50% ordering.

CONCLUSION
We develop a practical solution to the above problem by providing a theoretical framework that
  – enables the resource evaluation of complex session-based systems through the performance modeling of their transaction-based equivalent.
Dealing with “stateless” transaction-based workloads, we design an analytic model
  – for evaluating multi-tier system performance that is based on a network of queues representing the different tiers.
The accuracy of regression significantly improves when the CPU transaction demands are
  – derived from the extensive, aggregate workload profile that incorporates these possible different behaviors.