Queueing Theory
The study of queues – why they form, how they can be evaluated, and how they can be optimized. Building blocks – arrival process and a service process.
Some characteristics of arrival and service processes
Arrival process – individually/in groups, independent/correlated, single source/multiple sources, infinite/finite population, limited/unlimited capacity.
Service process – single/multiple servers, single/multiple stages, individually/in groups, independent/correlated, service discipline (FCFS/priority).
A common notation: G^X/G^Y/k/N
G (first position): distribution of inter-arrival times
X: distribution of arrival batch (group) size
G (second position): distribution of service times
Y: distribution of service batch size
k: number of servers
N: maximum number of customers allowed
Common examples: M/M/1, M/G/1, M/M/k, M/M/1/N, M^X/M/1, GI/M/1, M/M/k/k
Fundamental quantities
L: expected number of customers in the system, L = E(n).
L_Q: expected number of customers waiting in queue.
W: expected time a customer spends in the system.
W_Q: expected time a customer spends waiting in queue.
E(S): expected time a customer spends in service.
λ: customer arrival rate, λ = lim_{t→∞} N(t)/t, where N(t) is the number of arrivals up to time t.
Fundamental relationships (N_s denotes the expected number of customers in service)
L = L_Q + N_s
W = W_Q + E(S)
L = λW
L_Q = λW_Q
N_s = λE(S)
The relationship L = λW is often referred to as Little’s law.
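A heuristic proof
[Figure: number of customers in the system plotted against time over the interval from 0 to T, with the count changing at times t_1, t_2, ..., t_7.]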
L = [1(t_2 − t_1) + 2(t_3 − t_2) + 1(t_4 − t_3) + 2(t_5 − t_4) + 3(t_6 − t_5) + 2(t_7 − t_6) + 1(T − t_7)]/T
  = (area under curve)/T
  = (T + t_7 + t_6 − t_5 − t_4 + t_3 − t_2 − t_1)/T
W = [(t_3 − t_1) + (t_6 − t_2) + (t_7 − t_4) + (T − t_5)]/4
  = (T + t_7 + t_6 − t_5 − t_4 + t_3 − t_2 − t_1)/4
  = (area under curve)/N(T)
L = (area under curve)/T and W = (area under curve)/N(T), so
LT = WN(T), that is, L = WN(T)/T.
Since N(T)/T → λ as T → ∞, we obtain L = λW as T → ∞.
A similar heuristic proof can be used to show L_Q = λW_Q and N_s = λE(S).
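The area argument can be checked numerically on a small sample path. The sketch below uses made-up event times (t_1, ..., t_7 = 1, 2, 4, 5, 7, 9, 10 and T = 12 are illustrative assumptions, not values from the slides) with the same pattern of four arrivals served first come, first served.

```python
def area_under_curve(arrivals, departures, horizon):
    """Integrate the number-in-system curve from time 0 to the horizon."""
    # +1 at each arrival, -1 at each departure, processed in time order.
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, last_time = 0.0, 0, 0.0
    for time, change in events:
        area += in_system * (time - last_time)
        in_system += change
        last_time = time
    area += in_system * (horizon - last_time)
    return area

# Hypothetical sample path: arrivals at t_1, t_2, t_4, t_5;
# departures at t_3, t_6, t_7, and the last customer is counted until T.
arrivals = [1.0, 2.0, 5.0, 7.0]
departures = [4.0, 9.0, 10.0, 12.0]
T = 12.0

area = area_under_curve(arrivals, departures, T)
L = area / T                                   # time-average number in system
W = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)
print(L * T, W * len(arrivals))                # both equal the area under the curve
```

Both printed quantities are the same area, which is exactly the identity LT = WN(T) used in the proof.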
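For a single server queue: N_s = λE(S) = λ/μ = ρ, the utilization of the server (valid when λ < μ, where μ = 1/E(S) is the service rate).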
Why do queues form?
Case 1
Customers arrive at regular and constant intervals.
Service times are constant.
Arrival rate < service rate (λ < μ).
W_Q = 0
W = W_Q + E(S) = E(S)
L = λW = λE(S)
L_Q = λW_Q = 0
ρ = λE(S)
TH (output/throughput rate) = λ
Case 2
Customers arrive at regular and constant intervals.
Service times are constant.
Arrival rate > service rate (λ > μ).
W_Q = ∞
W = W_Q + E(S) = ∞
L = λW = ∞
L_Q = λW_Q = ∞
ρ = λE(S) > 1 (but utilization is actually 1)
TH (output/throughput rate) = μ
Case 3
Customers arrive at regular and constant intervals.
Service times are not constant.
Arrival rate < service rate (λ < μ).
Example: inter-arrival time = 8 min, average service time = 6 min.
ρ = 6/8 = 0.75
TH = 1/8 parts/min = 7.5 parts/hour
W_Q = ? L_Q = ?
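The open question in Case 3 can be explored with a short simulation. The sketch below uses the standard recursion for a FCFS single-server queue (next wait = max(0, current wait + service time − inter-arrival time)). The slides do not say how the service times vary, so the exponential distribution with mean 6 minutes is purely an assumption for illustration.

```python
import random

def average_wait(inter_arrival, service_times):
    """Average time in queue when customers arrive at a constant interval."""
    wait, total = 0.0, 0.0
    for service in service_times:
        total += wait
        # The next customer waits for whatever work is left when it arrives.
        wait = max(0.0, wait + service - inter_arrival)
    return total / len(service_times)

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 100_000
constant_service = [6.0] * n                                   # Case 1 style
variable_service = [rng.expovariate(1 / 6.0) for _ in range(n)]  # assumed distribution

print("W_Q, constant service:", average_wait(8.0, constant_service))
print("W_Q, variable service:", average_wait(8.0, variable_service))
```

With constant service times nobody waits, while the same average service time with variability produces a positive average wait even though utilization is only 0.75.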
Conclusion 1: If customers arrive at a faster rate than the service rate, the system becomes unstable and infinitely large queues will form.
Conclusion 2: In the presence of variability, customers will generally wait for processing, and a queue will build up in front of the processing unit.
In managing queueing systems, we must always strive to reduce variability while allowing for enough capacity.
Measuring Variability
The M/M/1 queue
Example: Customers arrive according to a Poisson process at a rate of one every 12 minutes, and the service time is exponential at a rate of one service every 8 minutes. What are L and W? What happens if the arrival rate increases by 20%? If there is a waiting cost of $2 per minute a customer spends in the system, what is the total cost per minute incurred in both cases?
Example (continued): A 20% increase in the arrival rate leads to a 100% increase in the expected number of customers in the system!
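The numbers behind this example can be worked out with the standard M/M/1 results L = ρ/(1 − ρ) and W = 1/(μ − λ), where ρ = λ/μ. The sketch below is one way to do the arithmetic; the function name is illustrative, and the cost per minute is taken to be $2 times the expected number of customers in the system.

```python
def mm1_metrics(lam, mu):
    """Return (L, W) for a stable M/M/1 queue with rates lam < mu."""
    if lam >= mu:
        raise ValueError("the queue is unstable unless lam < mu")
    rho = lam / mu
    return rho / (1 - rho), 1 / (mu - lam)

mu = 1 / 8                       # one service per 8 minutes
base_lam = 1 / 12                # one arrival per 12 minutes
cost_per_customer_minute = 2.0   # dollars per customer per minute in the system

for label, lam in (("base", base_lam), ("+20%", 1.2 * base_lam)):
    L, W = mm1_metrics(lam, mu)
    print(f"{label}: L = {L:.2f}, W = {W:.1f} min, "
          f"cost = ${cost_per_customer_minute * L:.2f}/min")
```

The base case gives L = 2 and W = 24 minutes ($4 per minute); after the 20% increase, L = 4 and W = 40 minutes ($8 per minute).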
The M/M/1/N queue
Example: A service facility with a finite queue size of N has service rate μ and arrival rate λ. Each customer that is served generates $A. The service rate μ can be increased. However, there is a cost of $c per unit time for operating a facility with rate μ. What is the optimal choice of μ?
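One way to explore this example numerically is sketched below. It relies on two assumptions that the slide does not spell out: N is read as the maximum number of customers in the system, and the operating cost is taken to grow linearly with the rate (c times μ per unit time). Customers who find N in the system are lost, so the rate of served customers is λ(1 − P_N), where P_N = (1 − ρ)ρ^N/(1 − ρ^(N+1)) for ρ = λ/μ ≠ 1 and P_N = 1/(N + 1) for ρ = 1. All function names and parameter values are hypothetical.

```python
def blocking_probability(lam, mu, n_max):
    """Probability that an arrival finds n_max customers in an M/M/1/N system."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (n_max + 1)
    return (1 - rho) * rho ** n_max / (1 - rho ** (n_max + 1))

def profit_rate(lam, mu, n_max, revenue, unit_cost):
    """Revenue from served customers minus the ASSUMED linear cost unit_cost * mu."""
    served = lam * (1 - blocking_probability(lam, mu, n_max))
    return revenue * served - unit_cost * mu

def best_service_rate(lam, n_max, revenue, unit_cost, candidates):
    """Pick the candidate rate with the highest profit rate (simple grid search)."""
    return max(candidates,
               key=lambda mu: profit_rate(lam, mu, n_max, revenue, unit_cost))

# Illustrative values only.
grid = [0.1 * k for k in range(1, 101)]
best = best_service_rate(lam=1.0, n_max=5, revenue=10.0, unit_cost=1.0,
                         candidates=grid)
print("best service rate on the grid:", best)
```

A grid search is used here only because it is easy to check by hand; a different cost structure would change the objective but not the overall approach.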
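The G/G/1 queue
If (1) λ < μ, (2) the distributions of customer service times and inter-arrival times are stationary, and (3) customers are served on a first come, first served (FCFS) basis, then the average waiting time in the queue can be approximated as follows:
W_Q ≈ ((C_A^2 + C_S^2)/2) × (ρ/(1 − ρ)) × E(S)
where ρ = λ/μ, and C_A and C_S are the coefficients of variation (standard deviation divided by mean) of the inter-arrival times and service times. The factor (ρ/(1 − ρ)) E(S) is the waiting time in an M/M/1 queue.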
The G/G/m queue When there are m parallel servers, then average waiting time can be approximated as follows:
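One widely used closed-form approximation of this kind is given below. It is stated here as an assumption, since it may not be the exact expression intended on the slide. Here ρ is the utilization per server, and C_A and C_S are the coefficients of variation of the inter-arrival and service times.

```latex
W_Q \approx \left(\frac{C_A^2 + C_S^2}{2}\right)
            \left(\frac{\rho^{\sqrt{2(m+1)}-1}}{m(1-\rho)}\right) E(S),
\qquad \rho = \frac{\lambda E(S)}{m}
```

For m = 1 the exponent equals 1 and the expression reduces to the single-server form ((C_A^2 + C_S^2)/2)(ρ/(1 − ρ))E(S).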
Examples: m = 1, C_A = C_S = 1, μ = 1
Case 1: ρ = 0.50, W = 2, L = 1
Case 2: ρ = 0.66, W ≈ 3, L ≈ 1.98
Case 3: ρ = 0.75, W = 4, L = 3
Case 4: ρ = 0.80, W = 5, L = 4
Case 5: ρ = 0.90, W = 10, L = 9
Case 6: ρ = 0.95, W = 20, L = 19
Case 7: ρ = 0.99, W = 100, L = 99
Examples: m = 1, C_A^2 = 1, μ = 1, ρ = 0.8
Case 1: C_S^2 = 0, W = 3, L = 2.4
Case 2: C_S^2 = 0.5, W = 4, L = 3.2
Case 3: C_S^2 = 1, W = 5, L = 4
Case 4: C_S^2 = 1.5, W = 6, L = 4.8
Case 5: C_S^2 = 2, W = 7, L = 5.6
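The two tables of examples can be reproduced with a few lines of code. The sketch below assumes the single-server approximation W_Q ≈ ((C_A^2 + C_S^2)/2)(ρ/(1 − ρ))E(S) together with W = W_Q + E(S) and L = λW. The tabulated values match when the service variability figures (0, 0.5, 1, 1.5, 2) are read as squared coefficients of variation. The function name is illustrative.

```python
def gg1_metrics(rho, mu, ca2, cs2):
    """Return (W, L) for a single-server queue under the approximation above.

    ca2 and cs2 are SQUARED coefficients of variation of the inter-arrival
    and service times; rho is the utilization and mu the service rate.
    """
    mean_service = 1.0 / mu
    lam = rho * mu                                   # implied arrival rate
    wq = (ca2 + cs2) / 2 * rho / (1 - rho) * mean_service
    w = wq + mean_service
    return w, lam * w                                # L from Little's law

# Effect of utilization (C_A^2 = C_S^2 = 1, mu = 1).
for rho in (0.50, 0.66, 0.75, 0.80, 0.90, 0.95, 0.99):
    w, l = gg1_metrics(rho, 1.0, 1.0, 1.0)
    print(f"rho = {rho:.2f}: W = {w:.2f}, L = {l:.2f}")

# Effect of service variability (rho = 0.8, C_A^2 = 1, mu = 1).
for cs2 in (0.0, 0.5, 1.0, 1.5, 2.0):
    w, l = gg1_metrics(0.8, 1.0, 1.0, cs2)
    print(f"C_S^2 = {cs2:.1f}: W = {w:.2f}, L = {l:.2f}")
```

The first loop shows waiting time growing sharply as utilization approaches 1; the second shows it growing linearly in the squared variability at fixed utilization.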
Network of Queues
Two servers in series
Customers arrive at server 1 according to a Poisson process with rate λ.
Service times are exponentially distributed at servers 1 and 2 with rates μ_1 and μ_2, respectively.
There is always enough waiting room between the two servers.
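Server 1 alone is simply an M/M/1 queue. Then P(n_1 customers at server 1) = (1 − ρ_1) ρ_1^{n_1}, where ρ_1 = λ/μ_1 < 1.
If server 2 alone is also an M/M/1 queue (this is actually true), then P(n_2 customers at server 2) = (1 − ρ_2) ρ_2^{n_2}, where ρ_2 = λ/μ_2 < 1.
If the numbers of customers at servers 1 and 2 are independent (this is also true), then P(n_1, n_2) = (1 − ρ_1) ρ_1^{n_1} (1 − ρ_2) ρ_2^{n_2}.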
The results generalize to k servers in series. Each server behaves like an M/M/1 queue, and the number of customers at each server is independent of the number of customers at the other servers.
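Because each server in the series behaves like an independent M/M/1 queue with the same arrival rate λ, system-wide measures follow by adding up the per-server results. The sketch below assumes λ < μ_i at every server; the function names and the sample rates are illustrative.

```python
def tandem_metrics(lam, service_rates):
    """Return (L, W) for exponential servers in series fed by Poisson arrivals."""
    utilizations = [lam / mu for mu in service_rates]
    if any(rho >= 1 for rho in utilizations):
        raise ValueError("every server needs lam < mu for stability")
    # Each server contributes the M/M/1 value rho / (1 - rho).
    total_in_system = sum(rho / (1 - rho) for rho in utilizations)
    return total_in_system, total_in_system / lam   # W from Little's law

def joint_probability(lam, service_rates, counts):
    """P(n_1, ..., n_k) as the product of the per-server M/M/1 probabilities."""
    probability = 1.0
    for mu, n in zip(service_rates, counts):
        rho = lam / mu
        probability *= (1 - rho) * rho ** n
    return probability

L, W = tandem_metrics(1.0, [2.0, 4.0])
print("L =", L, "W =", W)
print("P(empty system) =", joint_probability(1.0, [2.0, 4.0], [0, 0]))
```

The slowest server (the one with the largest λ/μ_i) dominates both L and W, which makes it the natural place to add capacity.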