Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms


1 Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms
VARUN GUPTA Joint with: Mor Harchol-Balter Carnegie Mellon Univ. Karl Sigman Columbia Univ. Ward Whitt

2 Application: Web server farms
Commodity web servers timeshare service among current requests. Local router (immediate dispatch). JSQ: most popular policy (Cisco Local Director, IBM Network Dispatcher).

4 Model: PS server farm with JSQ
Local router (immediate dispatch) feeding K homogeneous processor-sharing (PS) servers

5 Model: PS server farm with JSQ
Poisson arrivals at rate λ. JSQ routing with immediate dispatch. K homogeneous processor-sharing (PS) servers. Job sizes i.i.d. ~ G. Model ≡ M/G/K/JSQ/PS
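The model on this slide can be sketched as a small event-driven simulation. This is a minimal illustration, not the authors' simulator: the function name `simulate_jsq_ps`, the `sample_size` callback, unit-speed servers and the random tie-breaking rule are all assumptions made for the example.

```python
import random

def simulate_jsq_ps(lam, K, sample_size, num_jobs, seed=0):
    """Estimate E[T] for M/G/K/JSQ/PS (unit-speed servers, hypothetical helper).

    lam         -- Poisson arrival rate into the router
    K           -- number of processor-sharing servers
    sample_size -- callable taking an rng and returning one job size
    num_jobs    -- number of completed jobs to average over
    """
    rng = random.Random(seed)
    servers = [[] for _ in range(K)]  # each job is [remaining_work, arrival_time]
    now = 0.0
    next_arrival = rng.expovariate(lam)
    completed = 0
    total_response = 0.0
    while completed < num_jobs:
        # Under PS, each of n jobs at a server is served at rate 1/n, so the
        # smallest remaining job at that server finishes after min_remaining * n.
        best_t, best_i = float("inf"), -1
        for i, jobs in enumerate(servers):
            if jobs:
                t = now + min(j[0] for j in jobs) * len(jobs)
                if t < best_t:
                    best_t, best_i = t, i
        t_next = min(next_arrival, best_t)
        elapsed = t_next - now
        # Advance every busy server by the elapsed time, shared equally.
        for jobs in servers:
            if jobs:
                share = elapsed / len(jobs)
                for j in jobs:
                    j[0] -= share
        now = t_next
        if next_arrival <= best_t:
            # JSQ with immediate dispatch: join a server with the fewest jobs.
            shortest = min(len(jobs) for jobs in servers)
            ties = [i for i in range(K) if len(servers[i]) == shortest]
            servers[rng.choice(ties)].append([sample_size(rng), now])
            next_arrival = now + rng.expovariate(lam)
        else:
            # Departure of the job with the least remaining work at best_i.
            jobs = servers[best_i]
            k = min(range(len(jobs)), key=lambda idx: jobs[idx][0])
            total_response += now - jobs[k][1]
            jobs.pop(k)
            completed += 1
    return total_response / completed
```

With K = 1 the sketch reduces to M/G/1/PS, where E[T] = E[S]/(1 - ρ), which gives a quick sanity check: deterministic jobs of size 2 at λ = 0.25 should give an estimate close to 4.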

6 Why join the shortest queue?
Dynamic load balancing. Simple. Greedy for a PS server farm: the arriving job shares a server with the minimum # of jobs.
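The greedy rule fits in a few lines. A minimal sketch, assuming the router can see each server's current number of jobs; the name `jsq_choose` and the random tie-breaking are illustrative choices, not taken from the slides.

```python
import random

def jsq_choose(queue_lengths, rng=random):
    """Return the index of a server with the fewest jobs (ties broken at random)."""
    shortest = min(queue_lengths)
    ties = [i for i, n in enumerate(queue_lengths) if n == shortest]
    return rng.choice(ties)
```

For example, `jsq_choose([3, 1, 2])` returns 1, the only server with a single job. Note that the rule needs no knowledge of job sizes or of the job size distribution.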

7 Prior Analysis of JSQ routing
Limited to FCFS servers and mostly exponential job size distributions.
2-server: [Kingman 61], [Flatto, McKean 77], [Cohen, Boxma 83], [Wessels, Adan, Zijm 91], [Foschini, Salz 78], [Knessl, Matkowsky, Schuss, Tier 87], [Conolly 84], [Rao, Posner 87], [Blanc 87], [Grassmann 80]
>2-server approximations: [Nelson, Philips, Sigmetrics 89], [Lin, Raghavendra, TPDS 96], [Lui, Muntz, Towsley 95]
OUR GOAL: Analyze JSQ with PS servers and general job size distributions; interested in mean response time, E[T]

8 GOAL: Analysis of JSQ with PS servers
Observe: with exponential job sizes, M/M/K/JSQ/PS and M/M/K/JSQ/FCFS have the same joint queue length distribution, and approximations exist for M/M/K/JSQ/FCFS. How about general job sizes? GOAL: Effect of job size variability on JSQ/PS

9 Goal: Effect of job size variability on JSQ/PS
Idea: Look at the H2*(μ,p) distribution. 2 degrees of freedom: can fix the mean and control the variance. THEOREM: E[T] insensitive under H2* jobs

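10 THEOREM: E[T] insensitive under H2* job size distribution
PROOF: Q: What happens to 0-sized jobs? A: They disappear on arrival. So M/H2*/K/JSQ/PS (arrival rate λ, job sizes H2*(μ,p)) has the same stationary queue length distribution as M/M/K/JSQ/PS with arrival rate λ(1-p) and Exp(μ) job sizes. That system in turn has the same stationary queue length distribution as M/M/K/JSQ/PS with arrival rate λ and Exp(μ/(1-p)) job sizes, which is the exponential system with equal mean size.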
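The argument can be written out as a short derivation. This is a sketch under an assumed parametrization that the transcript does not spell out: a job has size 0 with probability p and size Exp(μ) with probability 1-p, and λ denotes the Poisson arrival rate.

```latex
% Assumed parametrization of H2*(mu, p): S = 0 w.p. p, S ~ Exp(mu) w.p. 1-p.
\begin{align*}
  E[S] &= \frac{1-p}{\mu}, \qquad
  E[S^2] = \frac{2(1-p)}{\mu^2}, \qquad
  C^2 = \frac{E[S^2]}{E[S]^2} - 1 = \frac{1+p}{1-p}. \\
\intertext{Zero-sized jobs leave instantly, so only the thinned Poisson stream of
rate $\lambda(1-p)$ with $\mathrm{Exp}(\mu)$ sizes affects the queue lengths. Rescaling
time by the factor $1-p$ maps that system to arrival rate $\lambda$ and sizes
$\mathrm{Exp}(\mu/(1-p))$ without changing the stationary queue length distribution, so}
  E[N]_{M/H_2^*/K} &= E[N]_{M/M/K}, \\
  E[T]_{M/H_2^*/K} &= \frac{E[N]_{M/H_2^*/K}}{\lambda}
                    = \frac{E[N]_{M/M/K}}{\lambda}
                    = E[T]_{M/M/K} \quad \text{(Little's law, same } \lambda\text{)}.
\end{align*}
```

Under this parametrization the mean can be held fixed while p moves the squared coefficient of variation anywhere in [1, ∞), which is what "fix mean and control variance" refers to.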

11 Insensitivity for general distributions?
Simulate M/G/K/JSQ/PS under the following 7 distributions (all with mean 2):
1. Deterministic, var=0
2. Erlang2, var=2
3. Exponential, var=4
4. Bimodal(1,11), var=9
5. Weibull-1, var=20
6. Weibull-2, var=76
7. Bimodal(1,101), var=99
(The Weibull distributions are heavy-tailed.)
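The listed variances can be checked from the mean alone for the Erlang, exponential and bimodal cases. The helper names below are illustrative. The Weibull shape parameters are not given in the transcript, so `weibull_variance` takes the shape as an input rather than guessing it.

```python
import math

def erlang_variance(stages, mean):
    """Variance of an Erlang distribution with the given number of stages and mean."""
    return mean**2 / stages

def bimodal_variance(small, large, mean):
    """Variance of a two-point distribution on {small, large} with the given mean."""
    p_large = (mean - small) / (large - small)  # probability of the large value
    second_moment = (1 - p_large) * small**2 + p_large * large**2
    return second_moment - mean**2

def weibull_variance(shape, mean):
    """Variance of a Weibull distribution with the given shape, scaled to the mean."""
    scale = mean / math.gamma(1 + 1 / shape)
    return scale**2 * (math.gamma(1 + 2 / shape) - math.gamma(1 + 1 / shape) ** 2)

if __name__ == "__main__":
    print(erlang_variance(2, 2))         # Erlang2
    print(erlang_variance(1, 2))         # Exponential is Erlang with 1 stage
    print(bimodal_variance(1, 11, 2))    # Bimodal(1,11)
    print(bimodal_variance(1, 101, 2))   # Bimodal(1,101)
```

With mean 2, these evaluate to 2, 4, approximately 9 and approximately 99, matching the slide.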

12 Simulation results
[Figures: E[T] under each distribution, in order of increasing variability, with 95% conf intervals, for number of servers = 2 and number of servers = 8. In both cases E[T] shows < 2% deviation from Exp.]

13 Goal: Effect of variability on JSQ/PS
Conclusion: E[T] is “nearly insensitive” to variability of G

14 Why is JSQ/PS “near-insensitive”?
Maybe just because M/G/1/PS is insensitive. Maybe all routing policies are near-insensitive. Which of the following do you think are insensitive (with PS servers)?
RANDOM – randomly select one of K servers
Round Robin – cyclic assignment
Least Work Left – join the server with the smallest total remaining work
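The three alternative policies can be sketched as follows. The function names are illustrative. Note that Least Work Left needs each server's total remaining work as input, which requires knowing job sizes.

```python
import itertools
import random

def random_choose(K, rng=random):
    """RANDOM: pick one of the K servers uniformly at random."""
    return rng.randrange(K)

def make_round_robin(K):
    """Round Robin: return a function that cycles through servers 0..K-1."""
    counter = itertools.cycle(range(K))
    return lambda: next(counter)

def lwl_choose(remaining_work):
    """Least Work Left: pick the server with the smallest total remaining work."""
    return min(range(len(remaining_work)), key=lambda i: remaining_work[i])
```

For example, a dispatcher built with `make_round_robin(3)` yields 0, 1, 2, 0, 1 on successive calls, and `lwl_choose([4.0, 0.5, 2.0])` returns 1.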

15 [Figure: E[T], number of servers = 2, PS servers. Policy: JSQ. Distributions: Det, Exp, Bim-1, Weib-1, Weib-2]

16 [Figure: E[T], number of servers = 2, PS servers. Policies: RANDOM, JSQ. Distributions: Det, Exp, Bim-1, Weib-1]

17 [Figure: E[T], number of servers = 2, PS servers. Policies: RANDOM, R-R, JSQ. Distributions: Det, Exp, Bim-1, Weib-1]

18 “Near-insensitivity” of JSQ is non-trivial (but cool)!
[Figure: E[T], number of servers = 2, PS servers. Policies: RANDOM, R-R, LWL, JSQ. Distributions: Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2]

19 Recap
E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS: JSQ/PS is “nearly insensitive” to variability (THEOREM: equality for H2*).
E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS, for which approximations exist.

20 Outline
PART I: JSQ/PS “nearly insensitive” to variability. E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS (THEOREM: equality for H2*); E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS (approximations exist)
PART II: Investigate new approaches for M/M/K/JSQ
PART III: Is JSQ the best routing policy for PS servers?

21 Single Queue Approximation (SQA)
M/M/K/JSQ/PS → Mn/M/1/PS. Model queue 1 as an independent PS queue with state (queue length) dependent arrival rates:
λ(n) = (# arrivals into queue 1 finding n jobs) / (total time there are n jobs in queue 1)
This captures the effect of the other queues in the JSQ system.

22 Single Queue Approximation (SQA)
M/M/K/JSQ/PS → Mn/M/1/PS, with λ(n) = (# arrivals into queue 1 finding n jobs) / (total time there are n jobs in queue 1)
Intuition test
Q1: Which is true? a. λ(0) = λ/K b. λ(0) < λ/K c. λ(0) > λ/K
Q2: Which is true? a. λ(0) = λ(1) b. λ(0) < λ(1) c. λ(0) > λ(1)
Q3: λ(n) as n→∞: a. 0 b. λ/K c. (λ/K)^K d. None of the above
THEOREM: lim n→∞ λ(n) = (λ/2)^2 when K=2.
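The definition of the state-dependent arrival rate, λ(n) = (# arrivals into queue 1 finding n jobs) / (total time queue 1 holds n jobs), can be estimated empirically. The sketch below simulates the M/M/K/JSQ queue-length Markov chain; it is not the authors' analysis. The service rate `mu`, the function name `estimate_lambda_n` and the random tie-breaking are assumptions of the example.

```python
import random

def estimate_lambda_n(lam, mu, K, num_events, seed=0, max_n=6):
    """Estimate lambda(n) for n = 0..max_n by simulating the M/M/K/JSQ chain.

    With exponential job sizes, each busy PS server completes jobs at rate mu,
    so the vector of queue lengths is a continuous-time Markov chain.
    """
    rng = random.Random(seed)
    q = [0] * K                        # q[0] plays the role of "queue 1"
    time_at = [0.0] * (max_n + 1)      # total time queue 1 holds n jobs
    arrivals_at = [0] * (max_n + 1)    # arrivals into queue 1 that find n jobs
    for _ in range(num_events):
        busy = [i for i in range(K) if q[i] > 0]
        total_rate = lam + mu * len(busy)
        dt = rng.expovariate(total_rate)   # time until the next event
        n = q[0]
        if n <= max_n:
            time_at[n] += dt
        if rng.random() < lam / total_rate:
            # Arrival: JSQ sends it to a shortest queue, ties broken at random.
            shortest = min(q)
            ties = [i for i in range(K) if q[i] == shortest]
            target = rng.choice(ties)
            if target == 0 and n <= max_n:
                arrivals_at[n] += 1
            q[target] += 1
        else:
            # Departure from a uniformly chosen busy server (equal rates mu).
            q[rng.choice(busy)] -= 1
    return [a / t if t > 0 else float("nan")
            for a, t in zip(arrivals_at, time_at)]
```

Running it with, say, `estimate_lambda_n(1.6, 1.0, 2, 200000)` and comparing the estimates for n = 0, 1, 2, ... against λ/K = 0.8 is one way to try the intuition test numerically.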

23 Single Queue Approximation (SQA)
M/M/K/JSQ/PS → Mn/M/1/PS, with λ(n) = (# arrivals into queue 1 finding n jobs) / (total time there are n jobs in queue 1)
π_n = Pr{n jobs in queue 1} in the JSQ system, x_n = Pr{n jobs} in the Mn/M/1/PS queue.
THEOREM: π_n = x_n
Where is the approximation? We don’t know the exact λ(n)’s!
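For reference, the queue length of a single-server queue with state-dependent Poisson arrivals and exponential job sizes is a birth-death process, so its stationary distribution has the standard product form. The symbols below are assumptions of this sketch: μ denotes the service rate, and x_n the stationary probability of n jobs.

```latex
% Birth-death chain: birth rate lambda(n) in state n, death rate mu in states n >= 1.
\begin{align*}
  x_n &= x_0 \prod_{i=0}^{n-1} \frac{\lambda(i)}{\mu}, \qquad n \ge 1, \\
  x_0 &= \left( 1 + \sum_{n=1}^{\infty} \prod_{i=0}^{n-1} \frac{\lambda(i)}{\mu} \right)^{-1}.
\end{align*}
```

Everything therefore hinges on the rates λ(n): once they are known or approximated, the queue length distribution follows directly.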

24 Single Queue Approximation (SQA)
M/M/K/JSQ/PS → Mn/M/1/PS, with λ(n) = (# arrivals into queue 1 finding n jobs) / (total time there are n jobs in queue 1)
Recall: λ(n) → (λ/K)^K as n→∞
Approximations for λ(0), λ(1), …, λ(n):
For n≥3, λ(n) ≈ (λ/K)^K
Obtain closed form functional approx for λ(0), λ(1), λ(2)
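Given any set of approximate rates, E[T] follows from the birth-death stationary distribution and Little's law. The sketch below is illustrative: the closed-form approximations for the first few rates are not reproduced in this transcript, so `head_rates` is a caller-supplied placeholder, and `mu`, `n_max` and the function name are assumptions of the example.

```python
def sqa_mean_response_time(head_rates, tail_rate, mu, lam, K, n_max=2000):
    """E[T] of a single queue with state-dependent arrival rates.

    head_rates -- approximate lambda(0), lambda(1), ... for the first few states
    tail_rate  -- rate used for every state beyond head_rates (must be < mu)
    mu         -- service rate of one server
    lam, K     -- total arrival rate and number of servers; by symmetry each
                  queue receives arrivals at long-run rate lam / K
    """
    # Unnormalized stationary weights: w_n = prod_{i<n} lambda(i) / mu.
    weights = [1.0]
    for n in range(n_max):
        rate = head_rates[n] if n < len(head_rates) else tail_rate
        weights.append(weights[-1] * rate / mu)
    total = sum(weights)
    mean_jobs = sum(n * w for n, w in enumerate(weights)) / total
    # Little's law applied to one queue: E[T] = E[N] / (lam / K).
    return mean_jobs / (lam / K)
```

As a sanity check, with K = 1 and a constant rate the function reduces to M/M/1: `sqa_mean_response_time([], 0.5, 1.0, 0.5, 1)` is 1/(μ - λ) = 2 up to truncation error.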

25 Results (SQA)
[Figure: E[T] from simulation vs. number of servers (K), per server load = 0.9]

26 Results (SQA)
[Figure: E[T] from simulation and from SQA vs. number of servers (K), per server load = 0.9]
< 2% error for E[T] for up to 64 servers

27 Outline
PART I: JSQ/PS “nearly insensitive” to variability. E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS (THEOREM: equality for H2*); E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS (approximations exist)
PART II: Accurate approximation for M/M/K/JSQ
PART III: Is JSQ the best routing policy for PS servers?

28 To JSQ or not to JSQ, that is the question..
[Figure: E[T] with PS servers. Policies: RANDOM, R-R, LWL, JSQ. Distributions: Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2]
OPT-0 – minimize average response time given no more arrivals

29 To JSQ or not to JSQ, that is the question..
[Figure: E[T] with PS servers. Policies: RANDOM, R-R, LWL, JSQ, OPT-0. Distributions: Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2]
OPT-0 – minimize average response time given no more arrivals
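One possible reading of the OPT-0 rule can be sketched as follows, assuming the router knows every job's remaining size (which JSQ does not need). With no further arrivals, a PS server holding jobs with sorted remaining sizes r_1 ≤ … ≤ r_m accumulates a total remaining response time of Σ_j (2(m-j)+1)·r_j, and the rule sends the new job where this total grows least. The function names are illustrative, and the slides define OPT-0 only by the one-line description above.

```python
def total_response_time(remaining):
    """Sum of remaining response times at one PS server with no more arrivals."""
    sizes = sorted(remaining)
    m = len(sizes)
    # With 0-based index j, the j-th smallest job has coefficient 2*(m-j) - 1.
    return sum((2 * (m - j) - 1) * r for j, r in enumerate(sizes))

def opt0_choose(servers, new_size):
    """Pick the server where adding the new job increases the total the least.

    servers  -- list of lists of remaining job sizes, one list per server
    new_size -- size of the arriving job
    """
    def added_cost(jobs):
        return total_response_time(jobs + [new_size]) - total_response_time(jobs)
    return min(range(len(servers)), key=lambda i: added_cost(servers[i]))
```

For example, with `servers = [[0.1, 0.1], [100.0]]` and a new job of size 100, this rule picks server 0 (two nearly finished jobs), whereas JSQ would pick server 1 because it holds fewer jobs.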

30 To JSQ or not to JSQ, that is the question..
[Figure: E[T] with PS servers. Policies: RANDOM, R-R, LWL, JSQ, OPT-0. Distributions: Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2. One point is annotated “CONJEC: Minimum E[T] over all distributions, routing policies” and “Compare here for optimality”.]

31 To JSQ or not to JSQ, that is the question..
Conclusion: JSQ is near optimal, without knowing job sizes or distribution

32 Conclusions
JSQ/PS exhibits near-insensitivity to job size variability: M/G/K/JSQ/PS ≈ M/M/K/JSQ/PS (THM: H2* equivalence)
SQA method to analyze M/M/K/JSQ/PS: M/M/K/JSQ/PS = Mn/M/1/PS (THM: Single queue equivalence; THM: λ(n) convergence)
JSQ is near-optimal for all job size distributions

