© 1998 Menascé & Almeida. All Rights Reserved. Part V: Workload Characterization for the Web (Book, chap. 6)

Presentation transcript:

Slide 1: Part V, Workload Characterization for the Web (Book, chap. 6)

Slide 2: [Figure: methodology flow diagram. Labels: Understanding the Environment; Workload Characterization; Workload Model; Validation and Calibration; Workload Forecasting; Performance Model; Performance Prediction; Valid Model; Developing a Cost Model; Cost Model; Cost Prediction; Cost/Performance Analysis; Configuration Plan; Investment Plan; Personnel Plan]

Slide 3: Learning Objectives (1)
– Introduce the workload characterization problem.
– Discuss a simple example of characterizing the workload for an intranet.
– Present a workload characterization methodology.

Slide 4: Learning Objectives (2)
Discuss the following steps:
– analysis standpoint
– identification of the basic component
– choice of the characterizing parameters
– data collection
– partitioning the workload
Characteristics of Web workloads:
– burstiness

Slide 5: What is Workload Characterization?

Slide 6: Workload
The workload of a system can be defined as the set of all inputs that the system receives from its environment during any given period of time.
[Figure: HTTP requests arriving at a Web server]

Slide 7: Workload Characterization
Depends on the purpose of the study:
– cost versus benefit of a proxy caching server
– impact of a faster CPU on the response time
Common steps:
– specification of a point of view from which the workload will be analyzed
– choice of a set of relevant parameters
– monitoring the system to obtain raw performance data
– analysis and reduction of performance data
– construction of a workload model

Slide 8: A Simple Example
A construction and engineering company is planning to roll out new applications and to increase the number of employees who have access to the corporate intranet.
The main applications are health human resources, insurance payments, on-demand interactive training, etc.
Main problem: response time of the human resource system.

Slide 9: A Simple Example (2)
[Figure: clients and servers connected by a network; labels A, B, C, D, E]

Slide 10: A Simple Example: basic questions
What is the purpose of the study?
What workload do we want to characterize?
– client's point of view: user commands, server responses
– server's point of view: HTTP requests
– network's point of view: traffic (packet size distribution), inter-packet arrival time
What is the level of the workload description?
– high-level characterization in terms of Web applications
– low-level characterization in terms of resource usage
How could this workload be precisely described?

Slide 11: Workload Characterization: concepts and ideas
Basic component of a workload: a generic unit of work that arrives at the system from external sources, for example:
– DB transaction
– interactive command
– process
– HTTP request
The choice depends on the nature of the service provided.

Slide 12: Workload Characterization: concepts and ideas
The workload characterization process:
– analyzes the workload and identifies the basic components and features that affect the system's performance
– yields parameters that retain the characteristics needed to drive performance models
– produces a workload model, a representation that mimics the workload under study
Workload models can be used for:
– selection of systems
– performance tuning
– capacity planning

Slide 13: Workload Description
[Figure: levels of workload description. Labels: Hardware; Software; User; Resource-oriented Description; Functional Description; Business Description]

Slide 14: Workload Description
– Business characterization: a user-oriented description of the load in terms such as number of employees, invoices per customer, etc.
– Functional characterization: describes the programs, commands and requests that make up the workload.
– Resource-oriented characterization: describes the consumption of system resources by the workload, such as processor time, disk operations, memory, etc.

Slide 15: A Web Server Example
The pair (CPU time, I/O time) characterizes the execution of a request at the server.
Our basic workload: 10 HTTP requests.
First case: only one document size (15 KB); the 10 executions yield (0.013 sec, 0.09 sec).
More realistic workload: documents have different sizes.

Slide 16: Execution of HTTP Requests (sec)

Slide 17: Representativeness of a Workload Model
[Figure: the real workload applied to the system yields performance measures P_real; the workload model applied to the system yields performance measures P_model]

Slide 18: A Refinement in the Workload Model
The average response time of 0.55 sec does not reflect the behavior of the actual server.
Because its components are heterogeneous, it is difficult to view the workload as a single collection of requests.
Three classes (times in sec):
– small documents (0 < CPU <= 0.01, 0 < I/O <= 0.05)
– medium documents (0.01 < CPU <= 0.03, 0.05 < I/O <= 0.14)
– large documents (CPU > 0.03, I/O > 0.14)
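The three-class split above can be sketched in a few lines of Python. This is only an illustration: it assumes the large class starts where the medium class ends (CPU > 0.03 sec), the sample measurements are hypothetical, and the rule for requests whose CPU and I/O times fall in different classes (take the larger class) is an assumption not stated on the slide.

```python
from collections import Counter

# Upper bounds (sec) of the small and medium classes, taken from the slide.
CPU_BOUNDS = (0.01, 0.03)
IO_BOUNDS = (0.05, 0.14)
CLASS_NAMES = ("small", "medium", "large")


def level(value, bounds):
    """Return 0, 1 or 2 according to the interval the value falls in."""
    for idx, upper in enumerate(bounds):
        if value <= upper:
            return idx
    return len(bounds)


def classify(cpu_sec, io_sec):
    """Assign a request to a class.

    Assumption: when CPU and I/O disagree, the larger class wins.
    """
    return CLASS_NAMES[max(level(cpu_sec, CPU_BOUNDS), level(io_sec, IO_BOUNDS))]


# Hypothetical (CPU, I/O) measurements in seconds, for illustration only.
requests = [(0.008, 0.04), (0.02, 0.11), (0.05, 0.40), (0.009, 0.03)]
counts = Counter(classify(cpu, io) for cpu, io in requests)
print(counts)
```

Counting the members of each class in this way gives the "number of components of each class" that the class characterization needs.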

Slide 19: Execution of HTTP Requests (sec)

Slide 20: Three-Class Characterization

Slide 21: Class Characterization
– A class comprises components that are similar to each other with respect to resource usage.
– Clustering the workload into classes increases the predictive power of a model.

Slide 22: Completing the characterization
Two major points:
– Class characterization: a statistical description of each class (e.g., CPU, I/O) and the number of components in each class.
– Request arrival rate: depends on the number of users generating requests and on the think time (how often a user interacts with the server).
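One common way to make this dependence explicit is the Interactive Response Time Law from operational analysis. It is offered here as a sketch, not as part of the original slide. The symbols are assumptions introduced for illustration: N is the number of users, Z the average think time, R the average response time, and X the throughput, which equals the request arrival rate in steady state.

```latex
X = \frac{N}{R + Z}
```

For instance, 50 users with a think time of 9 sec and a response time of 1 sec would generate about 50 / (1 + 9) = 5 requests per second.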

Slide 23: Workload Models
A model should be representative and compact.
Natural models are constructed either from basic components of the real workload or from traces of the execution of the real workload.
Artificial models do not use any basic component of the real workload:
– Executable models (e.g., synthetic programs, artificial benchmarks): not adequate for performance models.
– Non-executable models: described by a set of parameter values that reproduce the resource usage of the real workload.

Slide 24: Workload Models
The basic inputs to analytical models are parameters that describe the service centers (i.e., hardware and software resources) and the customers (e.g., requests and transactions).
Typical parameters:
– component (e.g., transaction) inter-arrival times
– service demands
– component sizes
– execution mix (e.g., levels of multiprogramming)

Slide 25: A Workload Characterization Methodology
– Choice of an analysis standpoint
– Identification of the basic component
– Choice of the characterizing parameters
– Data collection
– Partitioning the workload
– Calculating the class parameters

Slide 26: Selection of characterizing parameters
Each workload component is characterized by two groups of information:
Workload intensity:
– arrival rate
– number of clients and think time
– number of processes or threads in execution simultaneously
Service demands: $(D_{i1}, D_{i2}, \ldots, D_{iK})$, where $D_{ij}$ is the service demand of component $i$ at resource $j$.

Slide 27: Data Collection
This step assigns values to each component of the model.
– Identify the time windows that define the measurement sessions.
– Monitor and measure the system activities during the defined time windows.
– From the collected data, assign values to each characterizing parameter of every component of the workload.

Slide 28: Partitioning the workload
Motivation: real workloads can be viewed as a collection of heterogeneous components.
Partitioning techniques divide the workload into a series of classes whose populations are composed of fairly homogeneous components.
What attributes can be used for partitioning a workload into classes of similar components?

Slide 29: Partitioning the Workload
– Resource usage
– Applications
– Objects
– Geographical orientation
– Functional
– Organizational units
– Mode: interactive, transaction, batch

Slide 30: Workload Partitioning: Resource Usage

Slide 31: Workload Partitioning: Internet Applications

Slide 32: Workload Partitioning: Document Types

Slide 33: Workload Partitioning: Geographical Orientation

Slide 34: Modes
Transaction:
– workload characterized by arrival rate
– when a client has completed the service, it leaves the system
– the number of clients in the system (population) varies extensively with time
Interactive:
– workload characterized by population (i.e., number of active client workstations or terminals) and by think time

Slide 35: Modes
Batch:
– workload characterized by population, which is fixed
– when a client has completed the service, it leaves the system but is replaced by a new client (the output is short-circuited with the input)

Slide 36: Calculating the class parameters
How should one calculate the parameter values that represent a class of components?
– Averaging: when a class consists of components that are homogeneous with respect to service demands, an average of the parameter values of all components may be used.
– Clustering: a process in which a large number of components are grouped into clusters of similar components.
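The averaging option can be sketched as follows. The class labels and the (CPU, I/O) measurements are hypothetical, and `class_averages` is an illustrative helper name, not something defined in the slides.

```python
from statistics import mean

# Hypothetical measurements: (class label, CPU sec, I/O sec).
components = [
    ("small", 0.008, 0.04),
    ("small", 0.010, 0.05),
    ("large", 0.050, 0.40),
]


def class_averages(rows):
    """Group components by class and average each parameter per class."""
    grouped = {}
    for label, cpu, io in rows:
        grouped.setdefault(label, []).append((cpu, io))
    return {
        label: {
            "count": len(values),
            "cpu": mean(v[0] for v in values),
            "io": mean(v[1] for v in values),
        }
        for label, values in grouped.items()
    }


averages = class_averages(components)
for label, stats in averages.items():
    print(label, stats)
```

Averaging is only meaningful when the class is already homogeneous; a class that mixes very different components would be summarized by a value that matches none of them.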

Slide 37: Clustering Analysis

Slide 38: Clustering Analysis
Software packages: SAS, SPSS, etc.
A clustering algorithm attempts to find natural groups of components based on similar resource requirements.
Output of the clustering algorithm: for each cluster, a statistical description of its centroid plus the number of components. This is the characterization of the associated class.

Slide 39: Clustering Analysis: required steps
Data analysis:
– sampling from real data
– choice of logarithmic or other scale
– discarding outliers (with care!)
– trimming data to the 95th or 98th percentile: $D_i^t = (D_i - \min\{D_i\}) / (\max\{D_i\} - \min\{D_i\})$, where $D_i$ is the measured value
Distance measures:
– Euclidean or other metric
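A minimal sketch of the trimming step and of the transformation given in the formula, which rescales values to the range [0, 1]. The function names, the nearest-rank percentile rule and the sample data are assumptions made for illustration.

```python
def trim_to_percentile(values, pct):
    """Keep only values at or below the pct-th percentile.

    Assumption: nearest-rank percentile, with pct given as an integer.
    """
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    cutoff = ordered[rank - 1]
    return [v for v in values if v <= cutoff]


def range_scale(values):
    """Apply (D - min) / (max - min) to every value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical service demands with one extreme outlier.
demands = list(range(1, 20)) + [500]
trimmed = trim_to_percentile(demands, 95)
scaled = range_scale(trimmed)
print(len(trimmed), scaled[0], scaled[-1])
```

Trimming before scaling matters: if the outlier stayed in, it would define the maximum and squeeze every other value into a narrow band near zero.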

Slide 40: Clustering Analysis: required steps
Scaling:
– apply the z-score transform so that only dimensionless values are used: z score = (measured value - mean value) / standard deviation
Clustering algorithms:
– minimal spanning tree (hierarchical)
– k-means algorithm (non-hierarchical)
N.B.: the number of clusters must be kept small.
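The scaling and k-means steps can be sketched in plain Python as below. This is a simplified illustration, not the procedure of any particular package: the data are hypothetical, the population standard deviation is used for the z score, and the centroids are initialized with the first k points, all of which are assumptions.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def z_scores(column):
    """Return (value - mean) / standard deviation for each value."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0] * len(column)
    return [(v - mu) / sigma for v in column]


def kmeans(points, k, iterations=20):
    """Basic k-means with Euclidean distance.

    Assumption: the first k points serve as initial centroids.
    """
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return centroids, assignment


# Hypothetical (CPU sec, I/O sec) measurements for six requests.
cpu = [0.008, 0.009, 0.010, 0.050, 0.055, 0.060]
io = [0.04, 0.05, 0.045, 0.40, 0.42, 0.38]
points = list(zip(z_scores(cpu), z_scores(io)))

centroids, assignment = kmeans(points, k=2)
sizes = Counter(assignment)
for c, centroid in enumerate(centroids):
    print("cluster", c, "size", sizes[c], "centroid", centroid)
```

The printed centroid and size of each cluster correspond to the per-class description that a clustering run is expected to deliver.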

Slide 41: New Phenomena in the Internet and WWW
– Self-similarity: a self-similar process looks bursty across several time scales.
– Heavy-tailed distributions in workload characteristics, which means very large variability in the values of the workload parameters.

Slide 42: WWW Traffic Burst
[Figure: bytes versus chronological time (slots of 1000 sec)]

Slide 43: Incorporating New Phenomena in the Workload Characterization
Burstiness modeling: burstiness in a given period can be represented by a pair of parameters (a, b).
– a is the ratio between the maximum observed request rate and the average request rate during the period.
– b is the fraction of time during which the instantaneous arrival rate exceeds the average arrival rate.
Example: with (a = 6, b = 5%), Web server throughput degraded by 12 to 20%.
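A minimal sketch of how the pair (a, b) might be estimated from a log. It assumes the period has been divided into equal-length slots and that the request count of a slot stands in for the instantaneous arrival rate; the function name and the sample counts are hypothetical.

```python
def burstiness(requests_per_slot):
    """Return (a, b) for a sequence of per-slot request counts.

    a: maximum slot rate divided by the average slot rate.
    b: fraction of slots whose rate exceeds the average.
    """
    n = len(requests_per_slot)
    average = sum(requests_per_slot) / n
    if average == 0:
        raise ValueError("no requests observed in the period")
    a = max(requests_per_slot) / average
    b = sum(1 for count in requests_per_slot if count > average) / n
    return a, b


# Hypothetical counts for ten equal-length slots, with one burst.
counts_per_slot = [10, 10, 10, 10, 60, 10, 10, 10, 10, 10]
a, b = burstiness(counts_per_slot)
print(a, b)
```

With these sample counts the average is 15 requests per slot, so a = 60 / 15 = 4 and b = 1 / 10 = 10%. The result depends on the slot length chosen, so the same slot length should be used when comparing periods.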

Slide 44: Part V: Summary
Workload Characterization:
– what is it?
– basic concepts
– workload description and modeling
– representativeness of a workload model
Methodology (1):
– Choice of an analysis standpoint
– Identification of the basic component
– Choice of the characterizing parameters
– Data collection

Slide 45: Part V: Summary
Methodology (2):
– Partitioning the workload
– Calculating the class parameters
– Averaging
– Clustering techniques and algorithms
– Taking burstiness into account