1
System Performance & Scalability i206 Fall 2010 John Chuang
2
http://bits.blogs.nytimes.com/2007/11/26/yahoos-cybermonday-meltdown/index.html
3
Computing Trends
Multi-core CPUs
Data centers
Cloud computing
What are the drivers?
  - scalability, availability, cost-effectiveness
4
Lecture Outline
Performance Metrics
Availability
Queuing theory
  - M/M/1 queue
Scalability
  - M/M/m queue
5
What is Performance?
Users want fast response time and high availability
Managers want happy users, and many of them, while minimizing cost
What are standard measures of system performance?
6
Performance Metrics
Response time (seconds)
Throughput (MIPS, Mbps, TPS, ...)
Resource utilization (%)
Availability (%)
7
Availability
Availability   Down-time per year   One hour of down-time per
90%            36 days              9 hours
99%            3.7 days             4.1 days
99.9%          9 hours              41.6 days
99.99%         53 minutes           1.14 years
99.999%        5 minutes            11.41 years
Availability = MTTF / (MTTF + MTTR)
  - Mean-time-to-failure (MTTF)
  - Mean-time-to-recover (MTTR)
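The availability figures can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and a 365-day year is assumed. The last column of the table is read here as the up-time that accompanies one hour of down-time, i.e. the MTTF that goes with an MTTR of one hour.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_per_year_hours(avail):
    """Expected hours of down-time in one year at the given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


def uptime_per_hour_of_downtime(avail):
    """Hours of up-time per hour of down-time (MTTF when MTTR is 1 hour)."""
    return avail / (1.0 - avail)


if __name__ == "__main__":
    for a in (0.90, 0.99, 0.999, 0.9999, 0.99999):
        print(f"{a:.3%}: {downtime_per_year_hours(a):9.2f} h down per year, "
              f"1 h down per {uptime_per_hour_of_downtime(a):10.1f} h up")
```

Under these assumptions, each added "nine" cuts the yearly down-time by a factor of ten, which is the pattern the table shows.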
8
Response Time
[Figure: timeline of one request between Client and Server across the Network. Components of response time: formulate request, message latency, queuing time, processing time, interpret response.]
Adapted from: David Messerschmitt
9
Queuing Theory
1. Arrival Process
2. Service Time Distribution
3. Number of Servers
4. System Capacity
5. Customer Population
6. Service Discipline
Source: Raj Jain
10
Kendall's Notation (1953)
A/B/c/k/N/D
  - A: arrival process
  - B: service time distribution
  - c: number of servers
  - k: system capacity
  - N: population size
  - D: service discipline
Values for A and B:
  - M: Markov (exponential, memoryless, random, Poisson)
  - D: deterministic
  - E: Erlang
  - H: hyper-exponential
  - G: general
Values for the service discipline:
  - FCFS: first come first served
  - LCFS: last come first served
  - RR: round-robin
  - etc.
11
Example Systems
M/M/1/∞/∞/FCFS (simplified as M/M/1)
  - Markovian (Poisson, memoryless) arrival
  - Markovian service time
  - 1 server
  - Infinite system capacity
  - Infinite customer population
  - First-come-first-served discipline
Other examples:
  - M/M/1/k (finite capacity)
  - M/M/m (m servers)
  - G/D/1 (arbitrary arrival, deterministic service time)
12
M/M/1 Queue
Poisson arrivals, with average arrival rate of $\lambda$ jobs/sec
Exponentially distributed service times, with average service rate of $\mu$ jobs/sec
Single server with infinite queue
System utilization (hopefully < 1): $\rho = \lambda/\mu$
Average number of jobs in system: $N = \sum_n n \cdot p_n = \rho/(1 - \rho)$, where $p_n$ is the probability of $n$ jobs in the system
System throughput (if $\rho < 1$): $X = \lambda$
Average response time (from Little's Law): $R = N/X = 1/(\mu - \lambda)$
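The closed forms for the M/M/1 queue can be derived in two lines. The sketch below assumes the standard steady-state distribution $p_n = (1-\rho)\rho^n$, which the slide does not state. Here $\lambda$ is the arrival rate, $\mu$ the service rate, and $\rho = \lambda/\mu < 1$.

```latex
\begin{align*}
N &= \sum_{n=0}^{\infty} n\,p_n
   = (1-\rho)\sum_{n=0}^{\infty} n\,\rho^{n}
   = (1-\rho)\,\frac{\rho}{(1-\rho)^{2}}
   = \frac{\rho}{1-\rho} \\
R &= \frac{N}{X}
   = \frac{\rho}{(1-\rho)\,\lambda}
   = \frac{1}{\mu\,(1-\rho)}
   = \frac{1}{\mu-\lambda}
\end{align*}
```

The second line uses Little's Law with $X = \lambda$ and then substitutes $\rho = \lambda/\mu$.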
13
Example: Web Server
Web server receives 40 requests/second
Web server can process 100 requests/second
What is server utilization?
At any given time, how many requests are at server (waiting plus being processed)?
What is the mean total delay at server (waiting plus processing)?
What happens when traffic rate doubles?
14
Example: Web Server
$\lambda$ = 40 requests/second
$\mu$ = 100 requests/second
Utilization = $\rho = \lambda/\mu$ = 40/100 = 40%
# of requests = $N = \rho/(1 - \rho)$ = 0.67
Average time spent at server = $R = N/X$ = 0.67/40 = 17 ms
15
Example: Traffic Doubled
$\lambda$ = 80 requests/second
$\mu$ = 100 requests/second
Utilization = $\rho = \lambda/\mu$ = 80/100 = 80%
# of requests = $N = \rho/(1 - \rho)$ = 4
Average time spent at server = $R = N/X$ = 4/80 = 50 ms (more than doubled!)
16
Approaching Congestion
$\lambda$ = 99 requests/second
$\mu$ = 100 requests/second
Utilization = $\rho = \lambda/\mu$ = 99/100 = 99%
# of requests = $N = \rho/(1 - \rho)$ = 99
Average time spent at server = $R = N/X$ = 99/99 = 1 second!
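The three web-server scenarios (40, 80 and 99 requests/second against a service rate of 100 requests/second) can be recomputed with a small helper. This is an illustrative sketch; `mm1_metrics` is a hypothetical name, not something from the lecture, and it simply applies the M/M/1 formulas: utilization = arrival rate / service rate, N = utilization / (1 - utilization), and R = N / X with X equal to the arrival rate.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Return (utilization, jobs in system, response time in seconds) for M/M/1.

    The formulas are only valid for a stable queue, i.e. arrival_rate < service_rate.
    """
    if arrival_rate <= 0 or arrival_rate >= service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable M/M/1 queue")
    rho = arrival_rate / service_rate   # utilization
    n = rho / (1.0 - rho)               # average number of jobs in system
    r = n / arrival_rate                # Little's Law: R = N / X, with X = arrival rate
    return rho, n, r


if __name__ == "__main__":
    service_rate = 100.0  # requests/second, as in the example
    for arrival_rate in (40.0, 80.0, 99.0):
        rho, n, r = mm1_metrics(arrival_rate, service_rate)
        print(f"arrival={arrival_rate:5.1f}/s  utilization={rho:.0%}  "
              f"N={n:6.2f}  R={r * 1000:7.1f} ms")
```

Sweeping the arrival rate toward the service rate in this helper shows how quickly the response time grows as utilization approaches 100%.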
17
Utilization Affects Performance
18
M/M/1/k Queue (Finite Capacity)
$\rho = \lambda/\mu$
$N = \rho/(1-\rho) - (k+1)\rho^{k+1}/(1-\rho^{k+1})$
$R = N/X = N/\lambda_{eff}$
  - where $\lambda_{eff} = \lambda(1-P_k)$ = effective arrival rate
  - and $P_k = \rho^k(1-\rho)/(1-\rho^{k+1})$ = probability of a full queue
Loss rate = $\lambda - \lambda_{eff}$
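A short Python sketch of the finite-capacity formulas follows. The function name and the returned dictionary keys are hypothetical. The branch for utilization exactly equal to 1 is an addition not on the slide: it uses the limiting values (N = k/2, full-queue probability 1/(k+1)), since the closed forms divide by zero there.

```python
def mm1k_metrics(arrival_rate, service_rate, k):
    """M/M/1/k queue: at most k jobs in the system; arrivals to a full system are lost."""
    rho = arrival_rate / service_rate
    if rho == 1.0:
        # Limiting case (assumption, not on the slide): all k+1 states equally likely.
        p_full = 1.0 / (k + 1)
        n = k / 2.0
    else:
        p_full = rho**k * (1.0 - rho) / (1.0 - rho**(k + 1))
        n = rho / (1.0 - rho) - (k + 1) * rho**(k + 1) / (1.0 - rho**(k + 1))
    effective_rate = arrival_rate * (1.0 - p_full)   # arrivals actually admitted
    return {
        "utilization": rho,
        "p_full": p_full,
        "n": n,
        "throughput": effective_rate,
        "response_time": n / effective_rate,          # Little's Law on admitted jobs
        "loss_rate": arrival_rate - effective_rate,
    }


if __name__ == "__main__":
    # Unlike M/M/1, the formulas stay finite when arrivals exceed the service rate.
    for arrival_rate in (80.0, 100.0, 200.0):
        m = mm1k_metrics(arrival_rate, 100.0, k=10)
        print(arrival_rate, {key: round(value, 4) for key, value in m.items()})
```

Because excess arrivals are dropped rather than queued, the response time stays bounded under overload, at the cost of a non-zero loss rate.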
19
M/M/1/k Response Time
20
M/M/1/k Throughput
21
Lecture Outline
Performance Metrics
Availability
Queuing theory
  - M/M/1 queue
Scalability
  - M/M/m queue
22
Scalability
The capability of a system to increase total throughput under an increased load when resources (typically hardware) are added
  - Cost of additional resource
  - Performance degradation under increased load
23
Scalability Example
Original web server: can process $\mu$ requests/sec; accepts requests at $\lambda$/sec
Now request rate increases to $10\lambda$/sec and web server is swamped ($\rho = 10\lambda/\mu$)!
Need to add new hardware!
24
Which is better?
Option 1: One big web server that can process $10\mu$ requests/sec
Option 2: Ten web servers, each can process $\mu$ requests/sec; each accepts 10% of requests ($\lambda$/sec per server)
Option 3: Ten web servers, each can process $\mu$ requests/sec; share single queue (load balancer) that accepts requests at $10\lambda$/sec
25
Option 1: M/M/1 queue with big server
Option 2: ten M/M/1 queues
Option 3: M/M/10 queue
26
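M/M/m Queue (m Servers)
$\rho = \lambda/(m\mu)$, where $\lambda$ is the total arrival rate into the shared queue
$N = m\rho + \rho\varrho/(1-\rho)$
where $\varrho = \frac{(m\rho)^m}{m!\,(1-\rho)}\,p_0$ is the probability that an arriving job has to queue
and $p_0 = \left[1 + \frac{(m\rho)^m}{m!\,(1-\rho)} + \sum_{n=1}^{m-1} \frac{(m\rho)^n}{n!}\right]^{-1}$ is the probability of an empty system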
27
Which is Better?
                         Option 1 (M/M/1 big)   Option 2 (ten M/M/1)   Option 3 (M/M/10)
Utilization ($\rho$)     0.5                    0.5                    0.5
Number of requests (N)   1                      1*10                   5.036
Response Time (R)        2 ms                   20 ms                  10.07 ms
m = 10; $\mu$ = 100; $\lambda$ = 50 (total arrival rate $10\lambda$ = 500 requests/sec)
Remember: Scalability is not just about performance!
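The comparison of the three options can be recomputed with the sketch below. It assumes the standard M/M/m result that the probability of queueing is given by the Erlang C formula; the function names are hypothetical. An M/M/1 queue is treated as the special case m = 1. The parameters are those of the example: ten servers, 100 requests/sec per server, 500 requests/sec in total.

```python
from math import factorial


def erlang_c(m, rho):
    """Probability that an arriving job must wait in an M/M/m queue (Erlang C)."""
    a = m * rho                                      # offered load
    tail = a**m / (factorial(m) * (1.0 - rho))
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(m)) + tail)
    return tail * p0


def mmm_metrics(total_arrival_rate, service_rate_per_server, m):
    """Return (utilization, jobs in system, response time in seconds) for M/M/m."""
    rho = total_arrival_rate / (m * service_rate_per_server)
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < utilization < 1 for a stable queue")
    p_queue = erlang_c(m, rho)
    n = m * rho + rho * p_queue / (1.0 - rho)
    r = n / total_arrival_rate                       # Little's Law
    return rho, n, r


if __name__ == "__main__":
    mu, lam, m = 100.0, 50.0, 10
    # Option 1: one big server, ten times as fast, taking all the traffic.
    rho1, n1, r1 = mmm_metrics(m * lam, m * mu, 1)
    # Option 2: ten independent M/M/1 queues, each taking 10% of the traffic.
    rho2, n2_each, r2 = mmm_metrics(lam, mu, 1)
    # Option 3: ten servers behind one shared queue.
    rho3, n3, r3 = mmm_metrics(m * lam, mu, m)
    print(f"Option 1: rho={rho1:.2f} N={n1:.3f} R={r1 * 1000:.2f} ms")
    print(f"Option 2: rho={rho2:.2f} N={n2_each * m:.3f} R={r2 * 1000:.2f} ms")
    print(f"Option 3: rho={rho3:.2f} N={n3:.3f} R={r3 * 1000:.2f} ms")
```

All three options run at the same utilization, so the differences in response time come only from how the capacity is organized.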