Performance Analysis of Computer Systems and Networks Prof. Varsha Apte.



© 2004 by Varsha Apte, IIT Bombay. Example: On-line Service. A client interacting with a server. What questions about performance can we ask? Why should we ask them? How can we answer them?

…What
Software: response time, blocking, queue length, throughput, utilization.
Network: packet delay, message delay, loss rate, queue length, "goodput", utilization, delay jitter.

...Why
- Sizing (hardware, network)
- Setting configuration parameters
- Choosing architectural alternatives
- Comparing algorithms
- Determining bottlenecks
- Guaranteeing QoS will be met

HOW?

Example: estimating end-to-end delay between a client and a server.
- Measure it! At the client, at the server.
- Simulate it: write (or use) a computer program that simulates the behaviour of the system and collects statistics.
- Analyse it "with pen and paper".
Let's try the analysis for delay, assuming a Web service.

Dissecting the response time. Delay components:
- Client processing (prepare request)
- Connection set-up
- Sending the request
- Server processing the request
- Sending the response
- Client processing (display response)

...Dissecting delays.
- Connection set-up (assume TCP): SYN / SYN-ACK takes one round-trip time (RTT) before the request can be sent.
- Sending the request: 1/2 RTT for the request to reach the server.
- At the server: queuing delay for a server thread, then processing delay (once the request gets a server thread); the thread will also wait in the CPU job "queue" or disk queue.
- Sending the response back.

...Dissecting delays. Delay components of "RTT":
- Queuing delay (at each link)
- Packet processing delay (at each node)
- Packet transmission delay (at each link)
- Link propagation delay

Delay: observations. Many delays are fixed: propagation delay, packet processing delay, and (for a given packet size) transmission delay. Some are variable: notably, queuing delay.

Key Concept. Contention for a resource leads to users of the resource spending time queuing, or in some way waiting for the resource to be given to them. Calculating this time requires sophisticated models, because it changes with random changes in the system (e.g. traffic volumes, failures) and because it depends on various system mechanisms. Focus of this workshop: queuing delay.

Queuing Systems: An Introduction to Elementary Queuing Theory

What/Why is a Queue? The systems whose performance we study are those that have some contention for resources; if there is no contention, performance is in most cases not an issue. When multiple "users/jobs/customers/tasks" require the same resource, use of the resource has to be regulated by some discipline.

…What/Why is a Queue? When a customer finds a resource busy, the customer may wait in a "queue" (if there is a waiting room), or go away (if there is no waiting room, or if the waiting room is full). Hence the term "queue" or "queuing system". A queue can represent any resource in front of which a queue can form; in some cases an actual queue may not form, but it is called a "queue" anyway.

Examples of Queuing Systems:
- CPU (customers: processes/threads)
- Disk (customers: processes/threads)
- Network link (customers: packets)
- IP router (customers: packets)
- ATM switch (customers: ATM cells)
- Web server threads (customers: HTTP requests)
- Telephone lines (customers: telephone calls)

Elements of a Queue (diagram): customers arrive with some inter-arrival time, wait in a waiting room (buffer/queue) under a queueing discipline, and receive service from a server for some service time.

Elements of a Queue:
- Number of servers
- Size of waiting room/buffer
- Service time distribution
- Nature of the arrival "process": inter-arrival time distribution; correlated arrivals, etc.
- Number of "users" issuing jobs (population)
- Queuing discipline: FCFS, priority, LCFS, processor sharing (round-robin)

Elements of a Queue:
- Number of servers: 1, 2, 3, …
- Size of buffer: 0, 1, 2, 3, …
- Service time distribution and inter-arrival time distribution: deterministic (constant), exponential, general (any)
- Population: 1, 2, 3, …

Queue Performance Measures:
- Queue length: number of jobs in the system (or in the queue)
- Waiting time (average, distribution): time spent in the queue before service
- Response time: waiting time + service time
- Utilization: fraction of time the server is busy (equivalently, the probability that the server is busy)
- Throughput: job completion rate

Queue Performance Measures. Let the observation time be T, and let:
A = number of arrivals during time T
C = number of completions during time T
B = total time the system was busy during time T
Then:
Arrival rate λ = A/T
Throughput X = C/T
Utilization ρ = B/T
Average service time S = B/C
Service rate μ = 1/S

Basic Relationships. In "steady state", for a stable system without loss (i.e. an infinite-buffer system), completion rate = arrival rate, since "in-flow = out-flow". If the arrival rate exceeds the service rate, the queue grows without bound.
Utilization: ρ = B/T = (B/C) × (C/T) = average service time × completion rate, so ρ = S × λ for a loss-less system. For lossy systems, if p = fraction of requests lost, throughput = λ(1 − p).
Equivalently, throughput = utilization × service rate: X = C/T = (B/T) × (C/B) = ρ × μ.
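These operational relationships can be checked with a short sketch; the observation counts below are made up for illustration, and the variable names are mine:

```python
# Operational laws from measured counts over an observation window.
# All numbers here are hypothetical.
T = 100.0   # observation time (s)
A = 500     # arrivals observed
C = 500     # completions observed (loss-less, steady state: C == A)
B = 40.0    # total time the server was busy (s)

arrival_rate = A / T               # lambda = 5.0 req/s
throughput   = C / T               # X = 5.0 req/s
utilization  = B / T               # rho = 0.4
service_time = B / C               # S = 0.08 s per request
service_rate = 1.0 / service_time  # mu = 12.5 req/s

# Utilization law: rho = S * X, i.e. (B/T) = (B/C) * (C/T)
assert abs(utilization - service_time * throughput) < 1e-12
```

Note that everything here is derived purely from counts over the window; no distributional assumptions are needed yet.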

Little's Law: N = λR. Average number of customers in a queuing system = throughput × average response time (in steady state, throughput = arrival rate λ). Applicable to any "closed boundary" that contains queuing systems, under some other mild assumptions. Also, if L is the number in the queue (not in service) and W is the waiting time: L = λW.

Simple Example (Server). Assume just one server (single thread). Requests arrive at 3 requests/second; request processing time = 250 ms.
Utilization of the server? 75%.
Throughput of the server? 3 requests/second.
What if requests arrive at 5 requests/second? Utilization = 100%; throughput = 4 requests/second (the service rate, 1/0.25).
Waiting time (at 3 requests/second)? By Little's law, W = L/3, where L is the queue length. But what is L? We need a queuing model.
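The arithmetic of this example, as a sketch. One point worth making explicit: a server with a 250 ms service time can complete at most 1/S = 4 requests/second, which bounds the saturated throughput.

```python
S   = 0.250                 # service time: 250 ms
mu  = 1.0 / S               # service rate: 4 req/s

lam = 3.0                   # offered load: 3 req/s
rho = lam * S               # utilization = 0.75
X   = min(lam, mu)          # throughput = 3 req/s (all offered work done)

lam_over = 5.0              # offered load beyond capacity
rho_over = min(1.0, lam_over * S)  # utilization saturates at 100%
X_over   = min(lam_over, mu)       # throughput capped at mu = 4 req/s
```

Utilization and throughput fall out of operational laws alone; the queue length L (and hence the waiting time) is what needs a stochastic model.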

Queuing Systems Notation: X/Y/Z/A/B/C
X: inter-arrival time distribution
Y: service time distribution
Z: number of servers
A: buffer size
B: population size
C: discipline
Distributions are denoted by D (deterministic), M (exponential) or G (general). E.g.: M/G/4/50/2000/LCFS.

Classic Single Server Queue: M/M/1.
- Exponential inter-arrival time: this is the "Poisson" arrival process.
- Exponential service time.
- Single server.
- Infinite buffer (waiting room).
- FCFS discipline.
Can be solved very easily using the theory of Markov chains.

Exponential Distribution. A memory-less distribution: the distribution of the remaining time does not depend on the elapsed time. Mathematically convenient, and realistic in many situations (e.g. inter-arrival times of calls). If X is EXP(λ), then P[X ≤ t] = 1 − e^(−λt), and the average value of X is 1/λ.

Exponential ⇒ Poisson. When the distribution of inter-arrival times is exponential, the "arrival process" is a "Poisson" process. Properties of a Poisson process with parameter λ:
- If N_t = number of arrivals in (0, t], then P[N_t = k] = (λt)^k e^(−λt) / k!
- A superposition of Poisson processes is a Poisson process.
- Splitting a Poisson process results in Poisson processes.

Important Result! M/M/1 queue results. Let λ be the arrival rate, S the mean service time, and μ = 1/S the service rate.
Utilization: ρ = λ/μ
Mean number of jobs in the system: N = ρ/(1 − ρ)
Throughput: λ
Average response time (Little's law): R = N/λ = (1/μ)/(1 − ρ)
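A minimal sketch of these M/M/1 formulas (the function name and test numbers are mine):

```python
def mm1(lam, mu):
    """Mean-value metrics of an M/M/1 queue; requires lam < mu for stability."""
    rho = lam / mu                 # utilization
    assert rho < 1.0, "unstable: arrival rate must be below service rate"
    N = rho / (1.0 - rho)          # mean number of jobs in the system
    R = N / lam                    # response time via Little's law = 1/(mu-lam)
    W = R - 1.0 / mu               # waiting time = response - service
    return rho, N, R, W

# 3 req/s offered to a 4 req/s server:
rho, N, R, W = mm1(lam=3.0, mu=4.0)
# rho = 0.75, N = 3 jobs, R = 1 s, W = 0.75 s
```

This answers the earlier "what is L?" question for the single-server example: at ρ = 0.75 the mean number in system is 3, so the response time is a full second even though service takes only 250 ms.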

Response Time Graph. (The graph illustrates the typical behaviour of the response-time curve: nearly flat at low utilization, then rising steeply as utilization approaches 1.)

M/M/1 queue results. For the M/M/1 queue, a formula for the distribution of response time can also be derived: the M/M/1 response time is exponentially distributed with parameter μ(1 − ρ), i.e. P[Response time ≤ t] = 1 − e^(−μ(1−ρ)t).

M/G/1 single server queue: general service time distribution. Mean number in system:
N = ρ + ρ²(1 + C²) / (2(1 − ρ))
where C = (standard deviation)/(mean) of the service time, its coefficient of variation. This is called the Pollaczek-Khinchin (P-K) mean value formula. The mean response time follows by Little's law.

M/G/1 delay. Mean response time by Little's law:
R = N/λ = S + ρS(1 + C²) / (2(1 − ρ))
For constant service time (the M/D/1 queue), C = 0, so:
M/D/1 mean response time = S + ρS / (2(1 − ρ))
M/D/1 mean waiting time = ρS / (2(1 − ρ))
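The P-K mean response time as a sketch, illustrating how service-time variability drives the delay (function name and numbers are mine):

```python
def mg1_response(lam, S, cv):
    """P-K mean response time: R = S + rho*S*(1 + cv**2) / (2*(1 - rho)),
    where cv is the coefficient of variation of the service time."""
    rho = lam * S
    assert rho < 1.0, "unstable queue"
    return S + rho * S * (1.0 + cv ** 2) / (2.0 * (1.0 - rho))

S, lam = 0.25, 3.0
R_md1 = mg1_response(lam, S, cv=0.0)   # M/D/1: deterministic service
R_mm1 = mg1_response(lam, S, cv=1.0)   # M/M/1: exponential service
# At the same load, deterministic service halves the waiting time:
assert abs((R_mm1 - S) - 2.0 * (R_md1 - S)) < 1e-12
```

With cv = 1 the formula reduces exactly to the M/M/1 result of the earlier slide, which is a useful consistency check.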

Queue length by the P-K formula. Squared coefficient of variation (C²) of the service time for various distributions: Deterministic: 0; Uniform(10, 50): ≈ 0.15; Erlang-2: 0.5; Exponential: 1; General (example): 3.

Multiple server queue: M/M/c. One queue, c servers.
Utilization: ρ = λ/(cμ).
a = λ/μ = average number of busy servers. This important quantity is termed the traffic intensity or offered load.
A queue length equation exists (not shown here). For c = 2, the mean number in system is N = 2ρ/(1 − ρ²).
Average response time? Little's Law! For c = 2, R = N/λ = (1/μ)/(1 − ρ²).
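A sketch of the standard M/M/2 mean-value result (function name and numbers are mine):

```python
def mm2(lam, mu):
    """M/M/2 mean values: N = 2*rho/(1 - rho**2), with rho = lam/(2*mu).
    Returns (utilization, mean number in system, mean response time)."""
    rho = lam / (2.0 * mu)
    assert rho < 1.0, "unstable queue"
    N = 2.0 * rho / (1.0 - rho ** 2)
    return rho, N, N / lam         # response time by Little's law

# 3 req/s shared by two servers of rate 2 req/s each:
rho, N, R = mm2(lam=3.0, mu=2.0)   # rho = 0.75, N = 24/7, R = 8/7 s
```

Note that at the same per-server utilization (0.75), the two-server system holds slightly more jobs than an M/M/1 but shares them across servers; the example slides later compare these configurations directly.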

Finite Buffer Models: M/M/c/K. c servers, buffer size K (the total in the system can be c + K). If a request arrives when the system is full, the request is dropped. For this system there is a notion of loss, or blocking, in addition to response time etc. The blocking probability p is the probability that an arriving request finds the system full. Response time is relevant only for requests that are "accepted".

...Finite Buffer Queues. Arrival rate: λ. Service rate: μ.
Throughput? λ(1 − p).
Utilization? λ(1 − p)/(cμ).
Blocking probability p? Queue length N? Formulas exist (from the queuing model).
Waiting time? Little's law: N/(λ(1 − p)).

Finite Buffer Queue: Asymptotic Behaviour. As the offered load increases (λ/μ → ∞):
Utilization ρ → 1
Throughput → cμ
Blocking probability p → 1
Queue length N → c + K
Waiting time → K/(cμ) + 1/μ
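The M/M/c/K measures on the last two slides can be computed numerically from the birth-death balance equations. A sketch, using the slides' convention that up to c + K jobs fit in the system (function name is mine):

```python
def mmck(lam, mu, c, K):
    """M/M/c with K waiting slots (at most c + K jobs in the system).
    Returns (blocking probability, mean jobs in system, throughput)."""
    p = [1.0]                          # unnormalized P[n jobs], n = 0..c+K
    for n in range(1, c + K + 1):
        p.append(p[-1] * lam / (mu * min(n, c)))   # birth-death balance
    total = sum(p)
    p = [x / total for x in p]
    block = p[-1]                      # an arrival finds the system full
    N = sum(n * pn for n, pn in enumerate(p))
    X = lam * (1.0 - block)            # throughput of accepted requests
    return block, N, X

# Heavy overload drives blocking toward 1, N toward c+K, X toward c*mu:
block, N, X = mmck(lam=1000.0, mu=1.0, c=2, K=3)
```

Running the last line exhibits exactly the asymptotic behaviour the slide lists: blocking near 1, about c + K = 5 jobs in the system, and throughput pinned near cμ = 2.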

Finite Buffer (Loss) Models. M/M/c/0: Poisson arrivals, exponential service time, c servers, no waiting room. Which kind of systems does this represent? Circuit-switched telephony! (Servers are lines or trunks; service time is termed the "holding time".) The interesting measure for this queue: the probability that an arriving call finds all lines busy.

Erlang-B formula. Blocking probability (the probability that an arriving call finds all s servers busy), with offered load a = λ/μ:
B(a, s) = (a^s / s!) / Σ_{k=0}^{s} (a^k / k!)
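The closed form above is usually evaluated with an equivalent, numerically stable recursion rather than with factorials directly; a sketch:

```python
def erlang_b(a, s):
    """Erlang-B blocking for offered load a = lam/mu and s servers,
    via the recursion B(a, k) = a*B(a, k-1) / (k + a*B(a, k-1))."""
    b = 1.0                            # B(a, 0) = 1
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b

# 2 Erlangs offered to 2 lines: (2**2/2!) / (1 + 2 + 2) = 0.4
assert abs(erlang_b(2.0, 2) - 0.4) < 1e-12
```

The recursion avoids the overflow that a^s and s! cause for realistic trunk counts, and matches the closed form term for term.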

Examples

Example 1. You are developing an application server with these performance requirements: average response time < 3 seconds; forecasted arrival rate = 0.5 requests/second. What should be the budget for the service time of requests in the server? Answer (modelling the server as M/M/1, where R = S/(1 − λS)): S < 1.2 seconds.
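Assuming the server behaves as an M/M/1 queue, the budget follows by inverting the response-time formula; a sketch:

```python
lam, R_max = 0.5, 3.0
# M/M/1 response time: R = S / (1 - lam*S).  Solving R <= R_max for S:
#   S <= R_max / (1 + lam * R_max)
S_budget = R_max / (1.0 + lam * R_max)
# 3 / (1 + 0.5*3) = 3 / 2.5 = 1.2 seconds
```

So the mean service time must stay under 1.2 s, well below the naive 3 s one might budget if queuing were ignored.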

Example 2. For minimizing response times, if you have two servers, is it better to:
1. split the incoming requests into two queues, one per server,
2. put them in one queue, with the first in the queue sent to whichever server is idle, or
3. replace the two servers by one server, twice as fast?

Example 2 contd. Verify intuition with a model. Let λ be the arrival rate and S the service time. Calculate the response times R1, R2, R3 of the three cases and order them. Answer: R3 < R2 < R1.
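The three cases worked through with illustrative numbers (λ = 3 req/s total, per-server rate μ = 2 req/s), using the M/M/1 and M/M/2 formulas from the earlier slides:

```python
lam, mu = 3.0, 2.0     # total arrival rate; per-server service rate

# Case 1: split into two independent M/M/1 queues, each fed lam/2
rho = (lam / 2.0) / mu
R1 = (1.0 / mu) / (1.0 - rho)             # = 2.0 s

# Case 2: one shared queue, two servers (M/M/2), same total load
R2 = (2.0 * rho / (1.0 - rho ** 2)) / lam  # N/lam, about 1.14 s

# Case 3: one M/M/1 server twice as fast
R3 = 1.0 / (2.0 * mu - lam)               # = 1.0 s

assert R3 < R2 < R1
```

Intuitively: splitting wastes capacity (one server can idle while the other has a backlog), sharing the queue fixes that, and a single double-speed server also halves the service time itself.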

Example 3: ATM Link Model. Assume an ATM link with Poisson arrivals and an infinite buffer. Link bandwidth: 10 Mbps. Packet size: 53 bytes. Packet transmission time = 42.4 μs. Packet inter-arrival time = 50 μs.

Example 3 contd. Delay through the link = node processing delay (negligible) + queuing delay (waiting time) + transmission delay + propagation delay. Queuing delay? M/D/1 waiting time = (42.4)(42.4/50) / (2 × (1 − 42.4/50)) ≈ 118 μs ≈ 0.12 ms.
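The same computation as a sketch, applying the M/D/1 waiting-time formula ρS/(2(1 − ρ)) from the M/G/1 slide:

```python
S   = 42.4e-6          # packet transmission time (s)
lam = 1.0 / 50e-6      # arrival rate: one packet per 50 us
rho = lam * S          # link utilization = 0.848

W = rho * S / (2.0 * (1.0 - rho))   # M/D/1 mean waiting time
# W comes out around 118 microseconds, i.e. roughly 0.12 ms
```

Even at 84.8% utilization, the queuing delay is only a few packet times; the steep blow-up happens as ρ gets much closer to 1.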

Example 4: Multi-threaded Server. Assume a multi-threaded server: arriving requests are put into a buffer, and when a thread is free, it picks up the next request from the buffer. Execution time: mean = 200 ms. Traffic = 30 requests/sec. How many threads should we configure (assume enough hardware)? Traffic intensity = 30 × 0.2 = 6 = average number of busy servers, so at least 6 threads. Response time = ? (Which formula? The M/M/c queue.)
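One way to size the thread pool is the M/M/c model, whose waiting probability is the classical Erlang-C formula; a sketch (function names and thread counts are mine):

```python
def erlang_c(a, c):
    """Erlang-C: probability an arrival must wait in M/M/c, offered load a."""
    b = 1.0                            # Erlang-B via the usual recursion
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b / (1.0 - (a / c) * (1.0 - b))

def mmc_response(lam, mu, c):
    """M/M/c mean response time: R = 1/mu + P(wait) / (c*mu - lam)."""
    a = lam / mu
    assert a < c, "need more threads than the offered load"
    return 1.0 / mu + erlang_c(a, c) / (c * mu - lam)

lam, mu = 30.0, 5.0    # 30 req/s, 200 ms mean service => offered load a = 6
# Strictly more threads than the offered load of 6 are needed; response
# time then falls toward the bare service time (0.2 s) as threads are added:
assert mmc_response(lam, mu, 7) > mmc_response(lam, mu, 9) > 0.2
```

Evaluating a few values of c this way shows the classic diminishing return: the first couple of threads beyond 6 buy most of the improvement.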

...Example 4. Related question: estimate the average memory requirement of the server. It is likely to have a constant component plus a dynamic component, where the dynamic component is related to the number of active threads. If the memory requirement of one active thread is M: average memory requirement ≈ constant + M × 6.

Example 5: Hardware Sizing. Consider the following scenario: an application server runs on a 24-CPU machine; the server seems to peak at 320 transactions per second; we need to scale to 400. The hardware vendor recommends going to a 32-CPU machine. Should you?

Example 5: Hardware Sizing. First do a bottleneck analysis! Suppose the logs show that at 320 transactions per second, CPU utilization is only 67%. What is the problem? What is the solution?

Example 5: Hardware Sizing. Most likely explanation: the number of threads in the server is less than the number of CPUs. Possible diagnosis: the server has 16 threads configured, and each thread has a capacity of 20 transactions per second, for a total capacity of 320 requests/second. At this load, the 16 threads are 100% busy, so average CPU utilization is 16/24 = 67%. Solution: increase the number of threads; there is no need for more CPUs.

Part II Applications to Software Performance Testing and Measurement, Network Models and Service Performance

Software Performance Testing

Typical Scenario. M clients issue requests, wait for the response, then "think", then issue a request again. Let 1/λ be the mean think time and 1/μ be the mean service time. This scenario is actually a "closed" queuing network: the clients and the server form a loop.

Observations. Throughput? Arrival rate? At steady state, the "flow" through both queues is equal.
Server throughput = server utilization × service rate = U × μ.
A request is generated by each user once every [think time + response time] = (1/λ + R).
Overall request arrival rate = M/(1/λ + R) = U × μ.

Observations. M/(1/λ + R) = U × μ = throughput, so response time = (number of clients)/throughput − think time. As the number of clients increases, the response time can be shown to tend to M/μ − 1/λ, a linear function of M: increasing M by 1 increases the response time by 1/μ.
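For this closed two-station scenario (one think station, one FCFS server), exact mean value analysis (MVA) computes R and the throughput for any M; a sketch with illustrative numbers:

```python
def closed_mva(M, S, Z):
    """Exact MVA for M clients cycling between a think station (mean
    think time Z) and a single FCFS server (mean service time S)."""
    N = 0.0                    # mean number of jobs at the server
    R = S
    X = 0.0
    for m in range(1, M + 1):
        R = S * (1.0 + N)      # arrival theorem: a new job sees N jobs
        X = m / (R + Z)        # throughput from the cycle time
        N = X * R              # Little's law at the server
    return R, X

S, Z = 0.25, 2.0               # 250 ms service, 2 s think time
R, X = closed_mva(30, S, Z)
# Past saturation (around 1 + Z/S = 9 clients), R hugs the asymptote
# M*S - Z = 5.5 s and X approaches the capacity 1/S = 4 req/s.
```

The loop reproduces both regimes the slide describes: for small M the response time stays near S, and for large M it grows by S per added client.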

Metrics from the model. Saturation number (the number of clients beyond which the system is "saturated"):
M* = 1 + μ/λ, i.e. (think time + service time)/(service time).

Case Study: Software Performance Measurement

Queuing Networks. Jobs arrive at certain queues (open queuing networks). After receiving service at one queue (i), they proceed to another server (j) with some probability p_ij, or exit.

Open Queuing Network measures: maximum throughput; bottleneck server; throughput.

Open Queuing Network measures: total time spent in the system, from start to finish, before completion (the overall response time, from the point of view of the user).

Open Queuing Network: Example 1. W: Web server. A1, A2: application servers 1 and 2. D1, D2: database servers 1 and 2. (Diagram: requests enter at W and are routed to A1 or A2 with probabilities p_wA1, p_wA2, or leave with probability p_w0; A1 and A2 call D1 and D2 with probabilities p_A1D1, p_A2D2, and return to W with probabilities p_A1w, p_A2w.)

...Open Queuing Network, Example 1. Each server has a different service time, but what is the request rate arriving at each server? We need to calculate this using flow equations.

...Open Queuing Network, Example 1. Equations for the average number of visits each request makes to a server before leaving the network:
v_A1 = p_wA1 · v_W + v_D1,  v_D1 = p_A1D1 · v_A1
v_A2 = p_wA2 · v_W + v_D2,  v_D2 = p_A2D2 · v_A2
v_W = 1 + p_A1w · v_A1 + p_A2w · v_A2
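These visit-count equations can be solved by simple substitution. A sketch with hypothetical routing probabilities (the p values below are mine, chosen so the probabilities out of each server sum to 1):

```python
# Hypothetical routing probabilities for the W / A1 / A2 / D1 / D2 network.
p_wA1, p_wA2, p_w0 = 0.5, 0.3, 0.2   # from W: to A1, to A2, or exit
p_A1D1, p_A1w = 0.4, 0.6             # from A1: to D1, or reply to W
p_A2D2, p_A2w = 0.6, 0.4             # from A2: to D2, or reply to W

# Substituting v_D1 = p_A1D1*v_W1-side relation into v_A1's equation gives
#   v_A1 = p_wA1*v_w / (1 - p_A1D1), and symmetrically for A2.
k1 = p_wA1 / (1.0 - p_A1D1)
k2 = p_wA2 / (1.0 - p_A2D2)
# Then v_w = 1 + p_A1w*v_A1 + p_A2w*v_A2 solves to:
v_w = 1.0 / (1.0 - p_A1w * k1 - p_A2w * k2)
v_A1, v_A2 = k1 * v_w, k2 * v_w
v_D1, v_D2 = p_A1D1 * v_A1, p_A2D2 * v_A2

# Sanity: the original flow equations hold.
assert abs(v_A1 - (p_wA1 * v_w + v_D1)) < 1e-9
assert abs(v_w - (1.0 + p_A1w * v_A1 + p_A2w * v_A2)) < 1e-9
```

With these numbers v_w works out to 5, matching the intuition that a request visiting W repeatedly leaves with probability p_w0 = 0.2, so it visits W on average 1/0.2 = 5 times. The arrival rate at each server is then λ times its visit count, which is what the service-demand and bottleneck calculations need.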

...Open Queuing Network, Example 1. A spreadsheet calculation shows how the bottleneck server changes, how the throughput changes, and how the response time changes. The results are accurate for only some types of networks; for others, they are approximate.