Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms


Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms
Varun Gupta. Joint with: Mor Harchol-Balter (Carnegie Mellon Univ.), Karl Sigman (Columbia Univ.), Ward Whitt

Application: Web server farms
A local router dispatches each incoming request immediately to one of several commodity web servers.
Each server timeshares service among its current requests.
JSQ is the most popular dispatch policy, e.g. Cisco Local Director, IBM Network Dispatcher, …


Model: PS server farm with JSQ
Poisson arrivals with rate λ.
Local router: JSQ with immediate dispatch.
K homogeneous processor-sharing (PS) servers.
Job sizes i.i.d. ~ G.
Notation: M/G/K/JSQ/PS
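The model can be tried out numerically in its simplest case. The sketch below is an illustration, not the authors' simulator: it assumes exponential job sizes with rate mu, so that the vector of queue lengths is a Markov chain, and it estimates E[T] from the time-average number of jobs via Little's law. The function name, parameter names, and random tie-breaking rule are assumptions made for this example.

```python
import random


def simulate_mmk_jsq(lam, mu, k, num_events=200_000, seed=0):
    """Estimate E[T] for JSQ routing with exponential job sizes.

    Arrivals occur at rate lam and join a shortest queue (ties broken
    uniformly at random). Every non-empty queue completes jobs at rate mu,
    which holds for a PS server when sizes are exponential.
    """
    rng = random.Random(seed)
    queues = [0] * k
    elapsed = 0.0
    area = 0.0  # time integral of the total number of jobs in the system
    for _ in range(num_events):
        busy = [i for i, n in enumerate(queues) if n > 0]
        total_rate = lam + mu * len(busy)
        dt = rng.expovariate(total_rate)
        area += sum(queues) * dt
        elapsed += dt
        if rng.random() < lam / total_rate:
            # Arrival: route to a shortest queue.
            shortest = min(queues)
            target = rng.choice([i for i, n in enumerate(queues) if n == shortest])
            queues[target] += 1
        else:
            # Departure from one of the busy servers, chosen uniformly.
            queues[rng.choice(busy)] -= 1
    mean_jobs = area / elapsed
    return mean_jobs / lam  # Little's law: E[T] = E[N] / lam


if __name__ == "__main__":
    print(simulate_mmk_jsq(lam=1.0, mu=1.0, k=2))
```

With k=1 the estimate should be close to the M/M/1 value 1/(mu - lam), which gives a quick sanity check on the loop.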

Why join the shortest queue?
Dynamic load balancing.
Simple.
Greedy for a PS server farm: the arriving job shares the server with the minimum # of jobs.

Prior analysis of JSQ routing
Limited to FCFS servers and mostly exponential job size distributions.
2-server: [Kingman 61], [Flatto, McKean 77], [Cohen, Boxma 83], [Wessels, Adan, Zijm 91], [Foschini, Salz 78], [Knessl, Matkowsky, Schuss, Tier 87], [Conolly 84], [Rao, Posner 87], [Blanc 87], [Grassmann 80]
>2-server approximations: [Nelson, Philips, Sigmetrics 89], [Lin, Raghavendra, TPDS 96], [Lui, Muntz, Towsley 95]
OUR GOAL: Analyze JSQ with PS servers and general job size distributions; interested in mean response time, E[T]

GOAL: Analysis of JSQ with PS servers
Observe: with exponential job sizes, M/M/K/JSQ/PS and M/M/K/JSQ/FCFS have the same joint queue length distribution, and approximations exist for M/M/K/JSQ/FCFS.
How about general job sizes?
GOAL: Effect of job size variability on JSQ/PS

Goal: Effect of job size variability on JSQ/PS
Idea: Look at the H2*(μ, p) distribution.
2 degrees of freedom ⇒ can fix the mean and control the variance.
THEOREM: E[T] is insensitive under H2* job sizes.

THEOREM: E[T] is insensitive under the H2* job size distribution.
PROOF: The following three systems have the same stationary queue length distribution.
M/H2*/K/JSQ/PS: arrival rate λ, job sizes H2*(μ, p), i.e. size 0 with probability p and Exp(μ) with probability 1-p.
M/M/K/JSQ/PS: arrival rate λ, job sizes Exp(μ/(1-p)) (equal mean size).
M/M/K/JSQ/PS: arrival rate λ(1-p), job sizes Exp(μ).
Q: What happens to 0-sized jobs? A: They disappear on arrival.
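The "fix the mean, control the variance" idea can be made concrete with a few lines of arithmetic. The sketch below assumes the reading of H2*(μ, p) suggested by the slide's diagram labels (a job has size 0 with probability p and is Exp(μ) otherwise); the function names are hypothetical. Under that assumption the mean is (1-p)/μ and the squared coefficient of variation is (1+p)/(1-p), so any variance of at least mean² can be matched.

```python
def h2star_moments(mu, p):
    """Mean and variance of a job that is 0 w.p. p and Exp(mu) w.p. 1 - p."""
    mean = (1 - p) / mu
    second_moment = (1 - p) * 2 / mu ** 2
    return mean, second_moment - mean ** 2


def h2star_fit(mean, variance):
    """Choose (mu, p) so that the distribution has the given mean and variance."""
    scv = variance / mean ** 2  # squared coefficient of variation
    if scv < 1:
        raise ValueError("this family only reaches variance >= mean**2")
    p = (scv - 1) / (scv + 1)   # solves (1 + p) / (1 - p) = scv
    mu = (1 - p) / mean         # solves (1 - p) / mu = mean
    return mu, p


if __name__ == "__main__":
    mu, p = h2star_fit(mean=2.0, variance=20.0)
    print(mu, p, h2star_moments(mu, p))
```

Note that the offered load per unit arrival rate, (1-p)/μ, is the same whether the zero-sized jobs are kept or thinned out of the arrival stream, which is the step the proof relies on.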

Insensitivity for general distributions?
Simulate M/G/K/JSQ/PS under the following 7 distributions (all with mean 2):
1. Deterministic, var=0
2. Erlang2, var=2
3. Exponential, var=4
4. Bimodal(1,11), var=9
5. Weibull-1, var=20 (heavy-tailed)
6. Weibull-2, var=76 (heavy-tailed)
7. Bimodal(1,101), var=99
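The listed variances can be checked by hand for the simpler distributions. The sketch below assumes that Bimodal(a, b) denotes a two-point distribution on the values a and b, with the probabilities fixed by the mean of 2; that interpretation and the helper names are assumptions, but they reproduce the variances quoted on the slide.

```python
def bimodal_stats(low, high, mean):
    """Probability of the high value, and the variance, for a two-point distribution."""
    p_high = (mean - low) / (high - low)
    variance = (1 - p_high) * (low - mean) ** 2 + p_high * (high - mean) ** 2
    return p_high, variance


def erlang_variance(stages, mean):
    """Variance of an Erlang distribution with the given number of stages and mean."""
    return mean ** 2 / stages


if __name__ == "__main__":
    print(bimodal_stats(1, 11, 2))    # Bimodal(1,11)
    print(bimodal_stats(1, 101, 2))   # Bimodal(1,101)
    print(erlang_variance(2, 2))      # Erlang2
    print(erlang_variance(1, 2))      # Exponential is Erlang with one stage
```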

Simulation results
[Plots: E[T] for each job size distribution, ordered by increasing variability, with 95% confidence intervals. Panels: number of servers = 2 and number of servers = 8.]
In both panels E[T] shows < 2% deviation from Exp.

Goal: Effect of variability on JSQ/PS Conclusion: E[T] is “nearly insensitive” to variability of G

Why is JSQ/PS “near-insensitive”?
Maybe just because M/G/1/PS is insensitive. Maybe all routing policies are near-insensitive.
Which of the following do you think are insensitive?
RANDOM: randomly select one of K servers
Round Robin: cyclic assignment
Least Work Left: join the server with the smallest total remaining work
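The routing policies being compared differ only in what state they look at. The sketch below writes each one as a function that returns the index of the chosen server; the shared signature (num_jobs, work_left, rng) and the lowest-index tie-breaking are assumptions made for this illustration, not something the slides specify.

```python
import random


def route_random(num_jobs, work_left, rng):
    """RANDOM: pick one of the K servers uniformly at random."""
    return rng.randrange(len(num_jobs))


def make_round_robin(k):
    """Round Robin: cyclic assignment 0, 1, ..., k-1, 0, ..."""
    state = {"next": 0}

    def route(num_jobs, work_left, rng):
        chosen = state["next"]
        state["next"] = (chosen + 1) % k
        return chosen

    return route


def route_lwl(num_jobs, work_left, rng):
    """Least Work Left: server with the smallest total remaining work."""
    return min(range(len(work_left)), key=work_left.__getitem__)


def route_jsq(num_jobs, work_left, rng):
    """JSQ: server with the fewest jobs (needs no job size information)."""
    return min(range(len(num_jobs)), key=num_jobs.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    jobs, work = [3, 1, 2], [0.5, 9.0, 4.0]
    print(route_jsq(jobs, work, rng), route_lwl(jobs, work, rng))
```

In the example state, JSQ and Least Work Left disagree: the server with the fewest jobs holds the most remaining work.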

[Plot, number of servers = 2: E[T] under RANDOM, Round Robin (R-R), LWL, and JSQ routing to PS servers, for job size distributions Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2.]
“Near-insensitivity” of JSQ is non-trivial (but cool)!

Recap
JSQ/PS is “nearly insensitive” to variability: E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS (THEOREM: equality for H2*).
E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS, for which approximations exist.

Outline
PART I: JSQ/PS is “nearly insensitive” to variability. E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS (THEOREM: equality for H2*); E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS, for which approximations exist.
PART II: Investigate new approaches for M/M/K/JSQ
PART III: Is JSQ the best routing policy for PS servers?

Single Queue Approximation (SQA)
M/M/K/JSQ/PS ≈ Mn/M/1/PS
Model queue 1 as an independent PS queue with state (queue length) dependent arrival rates λ(n).
λ(n) = (# arrivals into queue 1 finding n jobs) / (total time there are n jobs in queue 1)
λ(n) captures the effect of the other queues in the JSQ system.
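The ratio that defines the state-dependent arrival rate can be measured directly in a simulation. The sketch below is an illustration under stated assumptions: exponential job sizes with rate mu, random tie-breaking, and hypothetical function and parameter names. It counts arrivals into queue 1 by the number of jobs they find there and divides by the time queue 1 spent in that state.

```python
import random


def estimate_lambda_n(lam, mu, k, max_n=4, num_events=300_000, seed=1):
    """Estimate lambda(0), ..., lambda(max_n) for queue 1 under JSQ."""
    rng = random.Random(seed)
    queues = [0] * k
    arrivals = [0] * (max_n + 1)   # arrivals into queue 1 that found n jobs there
    time_in = [0.0] * (max_n + 1)  # total time queue 1 held n jobs
    for _ in range(num_events):
        busy = [i for i, n in enumerate(queues) if n > 0]
        total_rate = lam + mu * len(busy)
        dt = rng.expovariate(total_rate)
        n1 = queues[0]
        if n1 <= max_n:
            time_in[n1] += dt
        if rng.random() < lam / total_rate:
            # Arrival joins a shortest queue, ties broken at random.
            shortest = min(queues)
            target = rng.choice([i for i, n in enumerate(queues) if n == shortest])
            if target == 0 and n1 <= max_n:
                arrivals[n1] += 1
            queues[target] += 1
        else:
            # Departure from a busy server chosen uniformly.
            queues[rng.choice(busy)] -= 1
    return [a / t if t > 0 else float("nan") for a, t in zip(arrivals, time_in)]


if __name__ == "__main__":
    for n, rate in enumerate(estimate_lambda_n(lam=1.0, mu=1.0, k=2)):
        print(n, rate)
```

Estimates for larger n rest on fewer visits to those states, so they are noisier than the estimates for n = 0 and n = 1.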

Single Queue Approximation (SQA): intuition test
Q1: Which is true? a. λ(0) = λ/K  b. λ(0) < λ/K  c. λ(0) > λ/K
Q2: Which is true? a. λ(0) = λ(1)  b. λ(0) < λ(1)  c. λ(0) > λ(1)
Q3: λ(n) as n→∞: a. 0  b. λ/K  c. (λ/K)^K  d. None of the above
THEOREM: lim λ(n) = (λ/2)^2 as n→∞, when K=2.

Single Queue Approximation (SQA)
M/M/K/JSQ/PS ≡ Mn/M/1/PS
π_n = Pr{n jobs in queue 1} in the JSQ system; x_n = Pr{n jobs} in the Mn/M/1/PS queue.
THEOREM: π_n = x_n
Where is the approximation? We don’t know the exact λ(n)’s!

Single Queue Approximation (SQA)
Recall: λ(n) → (λ/K)^K as n→∞.
Approximations for λ(0), λ(1), …, λ(n):
For n≥3, λ(n) ≈ (λ/K)^K.
Obtain closed-form functional approximations for λ(0), λ(1), λ(2).
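Once the arrival rates are fixed, the single-queue model is a birth-death chain and can be solved in closed form. The sketch below is a hedged illustration: the closed-form approximations for λ(0), λ(1), λ(2) are not given in this transcript, so they are passed in by the caller as head_rates, and every name here is hypothetical. The tail rate is a single constant for all larger n; with μ normalised to 1 (an assumption) it could be set to (lam / k) ** k.

```python
def sqa_mean_response_time(head_rates, tail_rate, mu, lam, k):
    """E[T] from a single queue with state-dependent arrival rates.

    head_rates: arrival rates lambda(0), lambda(1), ... for the first few states
    tail_rate:  arrival rate used in every later state (must be below mu)
    lam, k:     total arrival rate and number of servers, used for Little's law
    """
    if tail_rate >= mu:
        raise ValueError("tail_rate must be below mu for stability")
    # Unnormalised stationary weights: w[0] = 1, w[n+1] = w[n] * lambda(n) / mu.
    weights = [1.0]
    for rate in head_rates:
        weights.append(weights[-1] * rate / mu)
    m = len(head_rates)
    r = tail_rate / mu
    # States m, m+1, ... form a geometric tail with ratio r.
    tail_mass = weights[m] / (1 - r)
    tail_first_moment = weights[m] * (m / (1 - r) + r / (1 - r) ** 2)
    total = sum(weights[:m]) + tail_mass
    head_first_moment = sum(n * w for n, w in enumerate(weights[:m]))
    mean_jobs_per_queue = (head_first_moment + tail_first_moment) / total
    # Little's law on the whole farm: k identical queues, total arrival rate lam.
    return k * mean_jobs_per_queue / lam


if __name__ == "__main__":
    # Constant rates reduce the model to an ordinary M/M/1 queue.
    print(sqa_mean_response_time([0.5, 0.5, 0.5], 0.5, mu=1.0, lam=0.5, k=1))
```

Setting all rates equal recovers the M/M/1 result, which is a convenient check before plugging in state-dependent rates.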

Results (SQA)
[Plot: E[T] versus number of servers (K) at per-server load = 0.9, simulation versus SQA.]
< 2% error in E[T] for up to 64 servers

Outline
PART I: JSQ/PS is “nearly insensitive” to variability. E[T] of M/G/K/JSQ/PS ≈ E[T] of M/M/K/JSQ/PS (THEOREM: equality for H2*); E[T] of M/M/K/JSQ/PS = E[T] of M/M/K/JSQ/FCFS, for which approximations exist.
PART II: Accurate approximation for M/M/K/JSQ
PART III: Is JSQ the best routing policy for PS servers?

To JSQ or not to JSQ, that is the question.
[Plot: E[T] under RANDOM, R-R, LWL, JSQ, and OPT-0 routing to PS servers, for job size distributions Det, Exp, Bim-1, Weib-1, Weib-2, Bim-2.]
OPT-0: minimize average response time given no more arrivals.
Plot annotations: "CONJECTURE: minimum E[T] over all distributions, routing policies" and "Compare here for optimality".

To JSQ or not to JSQ, that is the question.. Conclusion: JSQ is near optimal, without knowing job sizes or distribution

Conclusions
JSQ/PS exhibits near-insensitivity to job size variability: M/G/K/JSQ/PS ≈ M/M/K/JSQ/PS (THM: H2* equivalence).
SQA method to analyze M/M/K/JSQ/PS: M/M/K/JSQ/PS = Mn/M/1/PS (THM: single queue equivalence; THM: λ(n) convergence).
JSQ is near-optimal for all job size distributions.