Computer Systems Design

Computer Systems Design
What Queueing Theory Teaches Us About Computer Systems Design Mor Harchol-Balter Computer Science Dept, CMU

Outline Basic Vocabulary Single-server queues III.Multi-server queues
Avg arrival rate, l Avg service rate, m Avg load, r Avg throughput, X Open vs. closed systems Response time, T Waiting time, TQ Exponential vs. Pareto/Heavy-tailed Squared coefficient of variation, C2 Poisson Process D/D/1, M/M/1, M/G/1 Inspection Paradox Effect of job size variaibility Effect of load Provisioning bathrooms/scaling Scheduling: FCFS, PS, SJF, LAS, SRPT Web server scheduling implementation Open vs. closed systems: wait Open vs. closed systems: scheduling Static load balancing Throwing away servers M/M/k + Comparing architectures Many slow servers vs. 1 fast Capacity provisioning & scaling Square root staffing Dynamic power management Dynamic load balancing/FCFS servers Replication Dynamic load balancing/PS servers

Vocabulary l m sec. Example: Avg. service rate Avg. arrival rate
throughout Avg. service rate jobs sec m Avg. arrival rate jobs sec l FCFS Avg service rate Avg size of job on this server: sec. Example: On average, job needs 3x106 cycles Machine executes 9X106 cycles/sec jobs sec

Vocabulary l m Example: arrive Each job requires sec on avg FCFS jobs

More Vocabulary l m 2m Defn: Throughput X denotes the average
rate at which jobs complete (jobs/sec) QUESTION: Which has higher throughput, C? jobs sec m l 2m throughout Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

More Vocabulary C: l m avg rate at which jobs complete
sec m l C: (assuming no jobs dropped) Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Open versus Closed Systems
Interactive Closed Batch Open MPL N: fixed #users Z: “think time” MPL N: fixed #jobs l explain where the jobs are in each picture.

More Vocabulary l m response time queueing time (waiting time)
jobs sec m jobs sec l response time queueing time (waiting time) Ask how T and TQ are related. T = TQ + S Q: Given that l < m, what causes wait? A: Variability in the arrival process & service requirements

Variability l m Variability Variability in arrival in job size, S
sec m jobs sec l Variability in arrival process Ask how T and TQ are related. T = TQ + S Variability in job size, S

Job Size Distributions
“Most jobs are small; few jobs are large” 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 9 heavy tail Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Job Size Distributions
QUESTION: Which best represents UNIX process lifetimes? QUESTION: For which do top 1% of jobs comprise 50% of load? QUESTION: Which distribution fits the saying, “the longer a job has run so far, the longer it is expected to continue to run.” 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 9 Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. heavy tail DFR

Pareto Job Size Distribution
Pareto job sizes are ubiquitous in CS: CPU lifetimes of UNIX jobs [Harchol-Balter, Downey 96] Supercomputing job sizes [Schroeder, Harchol-Balter 00] Web file sizes [Crovella, Bestavros 98], [Barford, Crovella 98] IP flow durations [Shaikh, Rexford, Shin 99] Wireless call durations [Blinn, Henderson, Kotz 05] Also ubiquitous in nature: Forest fire damage Earthquake damage Human wealth [Vilfredo Pareto ‘65] 1 2 3 4 5 6 7 8 9 heavy tail

S is time until coin w/prob md
Exponential Job Size Distribution md md md md md md md md d 2d 3d 4d 5d 6d 7d 8d 9d time S is time until coin w/prob md comes up heads S is memoryless!

Variability in Job Sizes
Squared Coefficient of Variation = QUESTION: Match these distributions to their C2 values: Deterministic Exponential Uniform(0,b) Unix process lifetimes Human IQs Pareto distribution Explain how C^2 is indpt of scale, so variability doesn’t go up when compare 1,2,3 with 10, 20, 30.

Variability in Job Sizes
Squared Coefficient of Variation Deterministic Human IQs Uniform(0,b) – for any b Exponential distribution Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Unix process lifetimes Pareto distribution

Variability l m Variability Variability in arrival in job size, S
sec m jobs sec l Variability in arrival process Ask how T and TQ are related. T = TQ + S Variability in job size, S

Poisson Process with rate l
QUESTION: What’s a Poisson process with rate l? Hint: It’s related to Exp(l).

Poisson Process with rate l
Poisson process models sequence of arrival times (typically representing aggregation of many users) d 2d 3d 4d 5d 6d 7d 8d 9d time

Summary Part I Basic Vocabulary Prize-winning messages 
Avg arrival rate, l Avg service rate, m Avg load, r Avg throughput, X Open vs. closed systems Response time, T Waiting time, TQ Exponential vs. Pareto/Heavy-tailed Squared coefficient of variation, C2 Poisson Process Prize-winning messages  Throughput is very different for open vs. closed systems An Exponential distribution is the time to get a single “head.” A Poisson process is a sequence of “heads.” Heavy-tailed, Pareto distributions: * represent real workloads * very high variability & DFR * top 1% comprise half the load Variance in job sizes is key. C2: measure of variance. 19

Avg arrival rate, l Avg service rate, m Avg load, r Avg throughput, X Open vs. closed systems Response time, T Waiting time, TQ Exponential vs. Pareto/Heavy-tailed Squared coefficient of variation, C2 Poisson Process D/D/1, M/M/1, M/G/1 Inspection Paradox Effect of job size variaibility Effect of load Provisioning bathrooms/scaling Scheduling: FCFS, PS, SJF, LAS, SRPT Web server scheduling implementation Open vs. closed systems: wait Open vs. closed systems: scheduling Static load balancing Throwing away servers M/M/k + Comparing architectures Many slow servers vs. 1 fast Capacity provisioning & scaling Square root staffing Dynamic power management Dynamic load balancing/FCFS servers Replication Dynamic load balancing/PS servers

Single-Server Queue l m D/D/1 M/M/1 M/G/1 M=“memoryless”=“Markovian”
jobs sec m l D/D/1 Deterministic service times M/M/1 Exponential inter-arrival times service 1 server M/G/1 General service times M=“memoryless”=“Markovian”

Single-Server Queue jobs sec m l D/D/1 M/M/1 M/G/1 Q: Does low  low ?

Single-Server Queue l m M/M/1 M/G/1 D/D/1 C2: variability related to
jobs sec m l M/M/1 M/G/1 D/D/1 related to C2: variability job size Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Single-Server Queue l m low load does NOT imply low wait 20 15 10 5
jobs sec m l M/G/1 C2 = 100 M/G/1 C2 = 10 20 15 10 5 Tell consulting story low load does NOT imply low wait M/M/1 D/D/1 0.2 0.4 0.6 0.8

M/G/1 Where is this coming from?
Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Waiting for the bus Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Waiting for the bus S: time between buses time QUESTION:
Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: On average, how long do I have to wait for a bus? < 5 min 5 min 10 min >10 min

Waiting for the bus S: time between buses Wait “Inspection Paradox”
It’s a paradox because it doesn’t make sense that the wait should be longer than E[S]. Reason you have this wait is that you’re much more likely to arrive in a big S than a small one. “Inspection Paradox”

Back to Single-Server Queue
jobs sec m l Same paradox comes up in waiting time for M/G/1. This red term represents remaining time on job in service. Normally you would think that the job in service has a remaining time of E[S]/2. But the inspec paradox says you’re much more likely to arrive when a really big job is in service, so it’s remaining time is bigger than E[S]/2. Transition: So we’ve seen how waiting time is related to job size variability. But it’s also related to load (rho), and in a non-linear way. Let’s examine that now.

Waiting for the Loo Check out the line for the men’s room …
Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Check out the line for the men’s room …

Waiting for the Loo On avg, Women spend 88 sec in loo.
On avg, Men spend 40 sec in loo. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Waiting for the Loo l m l 2m M/M/1 model r doubling QUESTION:
ppl sec m l M/M/1 model doubling r ppl sec 2m l Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: Women take 2X as long. What’s the difference in their wait? factor < 2 factor 2 factor 4 factor > 4

Doubling r can increase E[TQ]
Waiting for the Loo M/M/1 M/G/1 Doubling r can increase E[TQ] by factor of 4 to ∞ For M/G/1, it’s factor of 1 to infty (1-rho/2)/(1 – rho). For M/M/1, think: rho/(1-rho) * E[S] = women rho/2/(1-rho/2) * E[S/2] = men

Equalizing the wait for men & women
ppl sec m 2 Women’s rooms for each Men’s room. ppl sec 2m Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: Is this (a) insufficient (b) overkill (c) just right

Equalizing the wait for men & women
ppl sec m Insufficient! Waiting time for women is still factor of 2 higher. ppl sec m ppl sec 2m Would be sufficient if everyone arrived at once to the bathroom. Also, more than sufficient (overkill) if put women in M/M/2 – single queue. Also true under M/G/1 model. For what models is this not true?

M/G/1 To drop load, we can increase server speed.
High load leads to high wait High job size variability leads to high wait Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. To drop load, we can increase server speed. Q: What can we do to combat job size variability? A: Smarter scheduling!

Scheduling in M/G/1 l m QUESTION:
jobs sec m l DFR QUESTION: Which scheduling policy is best for minimizing E[T]? FCFS (First-Come-First-Served, non-preemptive) PS (Processor-Sharing, preemptive) SJF (Shortest-Job-First, non-preemptive) SRPT (Shortest-Remaining-Processing-Time, preemptive) LAS (Least-Attained-Service First, preemptive) Might know LAS from ATLAS – HPCA with Yoongu Kim and Onur Mutlu – Memory controllers prioritize threads to favor those that have obtained least service so far. [Harchol-Balter EORMS 2011]

Scheduling in M/G/1 l m E[T] r E[T] r C2=100 C2=10 1 3 5 7 9 0.2 0.4
jobs sec m l FCFS SJF PS LAS SRPT 1 3 5 7 9 0.2 0.4 0.6 0.8 1.0 E[T] r FCFS SJF PS LAS SRPT 1 3 5 7 9 0.2 0.4 0.6 0.8 1.0 E[T] r C2=100 Describe all policies & why they do what they do. At end ask about PLCFS (if time). C2=10

Scheduling in M/G/1 l m E[T] We saw: r But isn’t SRPT unfair
jobs sec m l PS SRPT E[T] r We saw: Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. But isn’t SRPT unfair to large jobs, when compared to PS?

Unfairness Question Let S ~ Bounded Pareto Let r = 0.9 1010 PS ? SRPT
Let S ~ Bounded Pareto with max = 1010 Let r = 0.9 PS M/G/1 Mr.Max 1010 QUESTION: Which queue does Mr. Max prefer? ? Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. SRPT M/G/1

Unfairness Question Let S ~ Bounded Pareto (a = 1.1) Let r = 0.9 1010
Let S ~ Bounded Pareto (a = 1.1) with max = 1010 Let r = 0.9 PS Mr.Max SRPT 1010 I SRPT Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Unfairness Question All-can-win-theorem:
Defies Kleinrock’s Conservation Law All-can-win-theorem: [Bansal, Harchol-Balter, Sigmetrics 2001] Under M/G/1, for all job size distributions, if r < 0.5, E[T(x)]SRPT < E[T(x)]PS for all job size x. For heavy-tailed distributions, holds for r < 0.95. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Scheduling in the Real World
Traditional web servers use PS (Fair) scheduling. Let’s do SRPT scheduling instead! [Harchol-Balter et al. TOCS 2003] WEB SERVER client 1 client 2 “Get File” client 1000 Internet Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Q: What is being scheduled? Q: How is size used? 5

SRPT Scheduling for Web Servers
Q: What is being scheduled? A: Bottleneck device is limited ISP bandwidth. Q: How is size being used? A: S = Size of requests = Size of file ~ Pareto(a = 1) Site buys limited fraction of ISP’s bandwidth (say 100Mbps) client 1 client 2 “Get File” client 1000 rest of Internet Apache Linux ISP Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. bottleneck Schedule the sharing of this 100Mbps among 1000 clients. 5

Linux Implementation S 1st M 2nd 3rd L socket 1 x-mit queue netwk
Sockets take turns draining: PS netwk card bottleneck x-mit queue socket 1 socket 1000 socket 2 Apache Linux socket 1 S netwk card 1st Apache Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. M socket 2 2nd bottleneck 3rd L Linux Socket of file w/smallest remaining data feeds first: SRPT socket 1000 priority queues.

Mean response time results
PS SRPT 0.4 0.6 0.8 1.0 r

Response time as fcn of Size
E[T(x)] 1.0s 10-1s 10-2s 10-3s PS SRPT 20% 40% 60% 80% 100% Percentile of Request Size percentile of job size x

Caution: Open versus Closed
MPL N: fixed #users Z: think time Response Time: T l Open Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: When run with same load r, which has higher E[T]? (a) Open (b) Closed (c) Same

Caution: Open versus Closed
MPL N: fixed #users Z: think time Response Time: T l Open Performance of Auction Site [Schroeder, Wierman, Harchol-Balter NSDI 2006] 0.2 0.4 0.6 0.8 1.0 r E[T] (ms) 10-1 1 10 102 Open MPL 1000 MPL 100 MPL 10 Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. E[T] much lower for closed system w/ same r

[Schroeder, Wierman, Harchol-Balter, NSDI 06]
Caution: Open versus Closed Closed Open l r E[T] PS SRPT r E[T] PS SRPT Closed & open systems run w/ same job size distribution and same load. [Schroeder, Wierman, Harchol-Balter, NSDI 06]

Summary Part II II. Single-server queues Prize-winning messages 
D/D/1, M/M/1, M/G/1 Inspection Paradox Effect of job size variaibility Effect of load Provisioning bathrooms/scaling Scheduling: FCFS, PS, SJF, LAS, SRPT Web server scheduling implementation Open vs. closed systems: wait Open vs. closed systems: scheduling Prize-winning messages  Policies that seem unfair may not be. M/G/1: Low load does NOT always imply low waiting time. Waiting time has non-linear relationship to load. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Closed systems behave very differently from open. “Inspection paradox” Waiting time is affected by variability in job size. Smart scheduling can combat job size variability. 51

Avg arrival rate, l Avg service rate, m Avg load, r Avg throughput, X Open vs. closed systems Response time, T Waiting time, TQ Exponential vs. Pareto/Heavy-tailed Squared coefficient of variation, C2 Poisson Process D/D/1, M/M/1, M/G/1 Inspection Paradox Effect of job size variaibility Effect of load Provisioning bathrooms/scaling Scheduling: FCFS, PS, SJF, LAS, SRPT Web server scheduling implementation Open vs. closed systems: wait Open vs. closed systems: scheduling Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Static load balancing Throwing away servers M/M/k + Comparing architectures Many slow servers vs. 1 fast Capacity provisioning & scaling Square root staffing Dynamic power management Dynamic load balancing/FCFS servers Replication Dynamic load balancing/PS servers

Load Balancing l m=2 p 1-p m=1 Poisson process with rate
jobs sec l Poisson process with rate m=1 p 1-p QUESTION: What is the optimal p to minimize E[T]? (a) (b) (c) Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Load Balancing l m=2 p 1-p m=1 l Opt p Poisson process with rate p=1
jobs sec l Poisson process with rate m=1 p 1-p Opt p 1.0 0.9 0.8 0.7 2.0 3.0 p=1 throw away slower server Might mention that for closed systems want to balance load. l

M/M/k l m k servers Poisson process with rate
jobs sec l Poisson process with rate k servers Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Central queue. Server takes job when free. Job size

3 Architectures m m km Q: Which is best for minimizing E[T]? M/M/k
Splitting M/M/1fast m jobs sec l m jobs sec l km jobs sec l Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Q: Which is best for minimizing E[T]?

? 3 Architectures m m km M/M/k l Splitting M/M/1fast l l jobs jobs
sec l Splitting M/M/1fast m jobs sec l km jobs sec l ? Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Many slow or 1 fast? vs. m km M/M/k M/M/1fast l l
jobs sec l km jobs sec l vs. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: Which is best for minimizing E[T]?

Many slow or 1 fast? E[T] l M/M/10 M/M/1fast 2.0 1.0 2 4 6 8 10 0.5
0.6 0.2 2.0 3.0 M/M/1fast M/M/10 E[T] l 2.0 1.0 2 4 6 8 10 0.5 1.5 Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Many slow or 1 fast: Revisited
M/G/k M/G/1fast m jobs sec l jobs sec l vs. km Why? M/G/1 suffers from high variability in job sizes, while M/G/k gets around that. QUESTION: Which is best for minimizing E[T]?

Many slow or 1 fast: Revisited
M/G/10 M/G/1fast 2 4 6 5 10 E[T] l Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Probability an arrival
Capacity Provisioning & Scaling Consider the following example: 12 servers m=1 jobs sec l=9 Poisson process with rate Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Probability an arrival has to queue

Capacity Provisioning & Scaling
QUESTION: If arrival rate becomes 106 times higher, how many servers do we need to keep PQ the same? (a) 9.1 × (d) 12 × 106 (b) 10 × (e) 13 × 106 (c) 11 × (f) none m=1 jobs sec Poisson Proc. ? servers Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk….

Proportional Scaling is Overkill
M/M/4 m 4l m 2l M/M/2 M/M/8 m 8l PQ 1.0 0.6 0.2 l 0.4 0.8 M/M/2 M/M/4 M/M/8 Mu = 1 throughout here, so lambda can be at most 1. E[TQ] l M/M/2 M/M/4 M/M/8 0.6 0.4 0.2 0.8 1.0

Proportional Scaling is Overkill
M/M/4 m 4l m 2l M/M/2 M/M/8 m 8l Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. More servers at same system load  lower PQ  lower E[TQ] high r  high E[TQ], given enough servers

Back to Capacity Provisioning
m=1 9 12 servers jobs sec 9x106 PQ Q: How many servers? A: 9.003x106 servers “Square root staffing” [Halfin, Whitt OR 1981] Let R be the minimum #servers for stability. Then servers yields PQ = 20%. CLT: Stochastic variation decreases with more samples. Transition: These scaling ideas are important in saving power. Don’t need as many servers as you think you need. Can save further power by turning servers off when not in use. Let’s see what scaling says about that … Lesson: SAVE MONEY: Don’t scale proportionately!

Dynamic Power Management
Annual U.S. data center energy consumption: 100B kWh Unfortunately most is wasted… Servers are only busy 5-30% time on average, but they’re left ON, wasting power. [Gartner Report] [NYTimes] Setup time 260s 200W BUSY server: Watts IDLE server: Watts OFF server: Watts Intel Xeon E5520 2 quad-core 2.27 GHz 16 GB memory Q: Given setup time, does dynamic power mgmt work?

M/M/1/setup model l m Thm: [Welch ’64] for larger (M/M/k) systems?
Response Time: T jobs sec m l When server is idle, immediately shuts off. Requires setup time to get it back on. Thm: [Welch ’64] This adds 260s to response time! Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: Does setup have same effect for larger (M/M/k) systems?

Effect of setup in larger systems
We will scale up system size, while keep load fixed. m jobs sec lk M/M/k/setup k servers #servers, k 100 80 60 40 20 Setup matters less as k increases. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. This is why dynamic power mgmt works! [Gandhi, Doroudi, Harchol-Balter, Scheller-Wolf Sigmetrics 2013]

Dynamic Power Mgmt Implementation
[Gandhi, Harchol-Balter, Raghunathan, Kozuch TOCS 2012] 28 Application servers Load Balancer Power Aware Load Balancer Unknown 500 GB DB Just finished talking about dynamic power mgmt. Let’s talk about dynamic load balancing. So far static – don’t look at queues. 7 Memcached key-value workload mix of CPU & I/O 1 job = 1 to 3000 KV pairs 120ms total on avg SLA: T95 < 500 ms Setup time: 260 s

AlwaysOn AutoScale Facebook has adopted AutoScale Within 30% of OPT
Time (min)  Num. servers  AlwaysOn T95=291ms, Pavg=2,323W _ load o kbusy+idle x kbusy+idle+setup AutoScale T95=491ms, Pavg=1,297W _ load o kbusy+idle x kbusy+idle+setup Num. servers  Time (min)  Here are the results on ON/OFF shown again, and here is the comparison with AutoScale. The first thing that you should notice about AutoScale is that it fluctuates a lot less with changes in load. Notice these nice straight lines. It doesn’t try to rush to turn servers off and then turn them on again. This is due to organically waiting for server to go idle and then waiting further. Everything is much smoother. The second thing that you notice about AutoScale is how it turns servers on. Using number of jobs, rather than the arrival rate, to determine how many servers we need is a much better predictor. You see that the number of servers we have on (the blue circles) ends up much higher, so that we can better respond to the load. As a result the 95%-tile is a lot better, and we meet our desired response time SLA. Note that we still save a huge amount of power compared to AlwaysOn, in fact more than 40% of power is saved. Facebook: Qiang Wu, Goranka Bjedov, Eitan Frachtenberg, Jason Taylor, Sanjeev Kumar Within 30% of OPT power on all our traces! Facebook has adopted AutoScale

Dynamic Load Balancing
FCFS Dispatcher Response time, T F5 Big-IP Microsoft SharePoint Cisco Local Director Coyote Point Equalizer IBM Network Dispatcher etc. QUESTION: What is a good dispatching policy for minimizing E[T]? Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. All hosts identical. Jobs i.i.d. with highly variable size distrib.

Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. Central-Queue (M/G/k) Host grabs next job when free. 5. Size-Interval Splitting Jobs are split up by size among hosts. Response time, T FCFS Dispatcher Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. All hosts identical. Jobs i.i.d. with highly variable size distrib. [Harchol-Balter, Crovella, Murta JPDC 99], [Harchol-Balter JACM 02], [Harchol-Balter,Scheller-Wolf,Young SIGMETRICS 09]

High E[T] Low generally Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. Central-Queue (M/G/k) Host grabs next job when free. 5. Size-Interval Splitting Jobs are split up by size among hosts. Response time, T FCFS Dispatcher Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. All hosts identical. Jobs i.i.d. with highly variable size distrib. [Harchol-Balter, Crovella, Murta JPDC 99], [Harchol-Balter JACM 02], [Harchol-Balter,Scheller-Wolf,Young SIGMETRICS 09]

High E[T] Low generally Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. Central-Queue (M/G/k) Host grabs next job when free. 5. Size-Interval Splitting Jobs are split up by size among hosts. Central-Queue: + Good utilization of servers. + Some isolation for smalls Size-Interval Task Assignment - Worse utilization of servers. + Great isolation for smalls! Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. [Harchol-Balter, Crovella, Murta JPDC 99], [Harchol-Balter JACM 02], [Harchol-Balter,Scheller-Wolf,Young SIGMETRICS 09]

Newest work: Don’t Decide. Send to all!
And we each get in line. Whenever one of us is at the head of the queue, we join up again.

Newest work: Don’t Decide. Send to all!
Replicate! And we each get in line. Whenever one of us is at the head of the queue, we join up again. Microsoft/Berkeley Dolly System 2012 [Ananthanarayanan, Ghodsi, Shenker, Stoica] Google “Tail at Scale” [Dean, Barroso] Berkeley Sparrow paper 2013 [Ousterhout et al.] DNS and Database query systems [Vulimiri et al.] CMU first exact analysis of replication SIGMETRICS [Gardner et al.]

Dynamic Load Balancing 2
Response time, T HTTP Web requests:  immediately dispatched to server Commodity servers used:  do Processor-Sharing PS Dispatcher QUESTION: What is a good dispatching policy for minimizing E[T]? Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. All hosts identical. Jobs i.i.d. with highly variable size distrib.

High E[T]FCFS Low Response time, T Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. 4. Size-Interval Splitting Jobs are split up by size among hosts. PS Dispatcher Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. All hosts identical. Jobs i.i.d. with highly variable size distrib.

High E[T]FCFS Low Response time, T Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. 4. Size-Interval Splitting Jobs are split up by size among hosts. PS Dispatcher Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: What is the best of these for PS server farms?

High E[T]FCFS Low Response time, T Round-Robin 2. Join-Shortest-Queue Go to host w/ fewest # jobs. Least-Work-Left Go to host with least total work. 4. Size-Interval Splitting Jobs are split up by size among hosts. PS Dispatcher Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. QUESTION: What is the best of these for PS server farms? [Gupta,Harchol-Balter,Sigman,Whitt Performance 07]

Not covering: Networks of Queues
+ Closed-form analysis exists - Requires Poisson arrivals & indpt Exponential service times + Routes can depend on packet’s “class.” FCFS + Closed-form analysis exists - Requires Poisson arrivals. + General service times! + Routes and service rates can depend on packet’s class. PS PS Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. PS

Summary Part III III. Multi-server queues Prize-winning messages  vs.
Static load balancing Throwing away servers M/M/k + Comparing architectures Many slow servers vs. 1 fast Capacity provisioning & scaling Square root staffing Dynamic power management Dynamic load balancing/FCFS servers Replication Dynamic load balancing/PS servers Prize-winning messages  vs. Best choice depends on job size variability. . Dynamic power mgmt works because setup time (and high load) hurt less in large systems. UNbalancing load is best! Throw away slow servers. Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. Proportional scaling is overkill! Square-root staffing. Best dispatching policies aim to mitigate effect of job size variability.

THANK YOU! www.cs.cmu.edu/~harchol/ 84
Hi, My name is Mor Harchol-Balter. I’m a professor in the Computer Science Department here at CMU. I’ll be talking about what analytical performance modeling teaches us about designing computer systems. Don’t get nervous about these friends of mine … you’ll meet them all later during the talk…. 84

Computer Systems Design

Similar presentations

Presentation on theme: "Computer Systems Design"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Computer Systems Design

Similar presentations

Presentation on theme: "Computer Systems Design"— Presentation transcript:

Similar presentations

About project

Feedback