1 Scheduling Reserved Traffic in Input-Queued Switches: New Delay Bounds via Probabilistic Techniques Milan Vojnović EPFL Joint work with Matthew Andrews Bell Laboratories, Murray Hill, NJ LCA Seminars Talk, EPFL, March 27, 2003
2 Introduction: Input-Queued Switch [Figure: I input ports and I output ports connected by a crossbar] At any point in time, connectivity is restricted to permutation matrices
3 Some Existing Approaches for Crossbar Scheduling maximum-weight matching (McKeown ‘96, many others) decomposition-based scheduling (Chang et al, 2000) fluid-tracking (Tabatabaee et al, ToN ’01)
4 Decomposition-Based Scheduling Given: M = [m_ij], an I x I rate demand matrix, where m_ij is the intensity of the service offered to the ij-th input/output port pair. Assume M is doubly sub-stochastic. Constraint: crossbar. Find: a decomposition of M into permutation matrices, and a schedule such that the intensity of the service offered to the ij-th input/output port pair is at least m_ij
5 Decomposition-Based Sched. (cont’d) Observation: A solution to the problem ensures that the service rate is at least M in the long run. Desired Property: broadly speaking, we also want the schedule to be “smooth” (“non bursty”), that is, transmission slots should be offered evenly to every input-output port pair. Observation: the latter is a short-run property
6 A Decomposition: Birkhoff/von Neumann Birkhoff/von Neumann (e.g. Chvátal ‘84, p. 330): Any doubly stochastic matrix M is a convex combination of permutation matrices, that is, $M = \sum_k \phi_k M_k$ with $\phi_k \ge 0$ and $\sum_k \phi_k = 1$, where $M_k$ is a permutation matrix and $\phi_k$ is the intensity of the k-th permutation matrix. Other decompositions can be used for doubly sub-stochastic M; Birkhoff/von Neumann maximizes throughput. Birkhoff/von Neumann was applied to the switch problem by Chang et al (2000)
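The decomposition stated on this slide can be illustrated with a small greedy sketch: repeatedly find a perfect matching on the positive entries, subtract the smallest matched entry, and record the permutation with that weight. This is a textbook construction, not necessarily the procedure used by Chang et al; the function names `perfect_matching` and `birkhoff_von_neumann` are hypothetical, and exact `Fraction` entries are assumed so that the loop terminates cleanly.

```python
from fractions import Fraction


def perfect_matching(M):
    """Return perm with perm[i] = column matched to row i, using only
    positive entries of M, or None if no perfect matching exists."""
    n = len(M)
    col_owner = [-1] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        # Augmenting-path search (Kuhn's algorithm) for row i.
        for j in range(n):
            if M[i][j] > 0 and not seen[j]:
                seen[j] = True
                if col_owner[j] == -1 or try_row(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    perm = [0] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def birkhoff_von_neumann(M):
    """Greedy decomposition of a doubly stochastic matrix into a list of
    (weight, permutation) terms. Assumes exact (Fraction) entries."""
    R = [[Fraction(x) for x in row] for row in M]  # residual matrix
    n = len(R)
    terms = []
    while any(x > 0 for row in R for x in row):
        perm = perfect_matching(R)
        if perm is None:
            raise ValueError("input is not doubly stochastic")
        weight = min(R[i][perm[i]] for i in range(n))
        for i in range(n):
            R[i][perm[i]] -= weight  # at least one entry becomes zero
        terms.append((weight, perm))
    return terms
```

Each pass zeroes at least one entry, so the loop ends after at most I^2 passes; the weights play the role of the intensities of the permutation matrices.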
7 The Problem that We Study Given: M_1, M_2, …, M_K, a sequence of permutation matrices. Find: schedules with a guarantee on their smoothness; “smooth” is quantified through the concept of latency, defined shortly
8 Why is the Problem Important Rate provision, but also delay-jitter guarantees, for DiffServ classes such as EF (Expedited Forwarding); guarantees for MPLS; provision of a good Connection-Reservation-Table to offer guaranteed service to control traffic inside a switch
9 Related Work When the load is at most 1/4 (Giles and Hajek ‘97): a schedule exists such that each pair ij is scheduled at least once in 1/ ij When the load is 1 (Chang et al ‘00): with a Birkhoff/von Neumann decomposition and PGPS scheduling of the resulting permutation matrices, a bound exists (shown shortly)
10 Related Work (cont’d) Mean-delay results: Leonardi et al (Infocom’01): a maximum-weight matching switch uniformly loaded with load <1 has the mean delay Shah and Kopikare (Infocom’02): a switch with Bernoulli arrivals of load <1, and scheduling that at each slot picks a permutation matrix uniformly at random over the entire set of I! permutation matrices, has the mean delay
11 Content Method to Construct Schedules; Latency definition used; Latencies of 4 schedulers: Random-Permutation, Random-Phase, Random-Distortion, Poisson-Competition; Numerical Examples; Tasting some of the Methods Used to Obtain Results; Conclusion
12 Method to Construct a Schedule: Superposition of Marked Point Processes Schedule: [Figure: point processes N_1, N_2, …, N_K and their superposition N]
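The superposition idea can be sketched as follows: every permutation matrix k owns a set of points (tokens) on the time axis, the points of all types are merged in time order, and the n-th merged point decides which permutation matrix is used in the n-th slot. The sketch works on one normalized period [0,1). The helper `evenly_spaced`, the function `superpose`, and the tie-breaking rule (smaller type index first) are assumptions for illustration; how the points are placed is what distinguishes the schedulers discussed later.

```python
from fractions import Fraction


def evenly_spaced(count, phase):
    """Hypothetical helper: `count` points on [0, 1), spaced 1/count apart
    and shifted by `phase` (wrapped back into [0, 1))."""
    return [(Fraction(phase) + Fraction(n, count)) % 1 for n in range(count)]


def superpose(points_by_type):
    """Merge the marked point processes and return the sequence of marks.

    points_by_type maps a permutation-matrix index k to its point times.
    The n-th entry of the result is the type served in the n-th slot.
    Ties are broken by the smaller type index (an assumption).
    """
    marked = [(t, k) for k, pts in points_by_type.items() for t in pts]
    marked.sort()
    return [k for _, k in marked]


# One period of 4 slots: type 1 has intensity 2/4, types 2 and 3 have 1/4.
schedule = superpose({
    1: evenly_spaced(2, 0),
    2: evenly_spaced(1, Fraction(1, 4)),
    3: evenly_spaced(1, Fraction(3, 4)),
})
```

With these inputs the merged order is type 1, 2, 1, 3, so type 1 is served in every other slot.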
13 Latency of a Schedule Latency 1: For any n, m, there exists Latency 2: For any n, there exists Latency 3: There exists
14 Latency of a Schedule [Figure: number of slots offered to the ij-th port pair in [0,m), plotted against m]
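To make the notion of latency concrete, here is a hedged sketch that measures it for one port pair of a periodic schedule. The exact inequalities of the three latency definitions are not reproduced on these slides, so the code assumes a strict rate-latency reading: every window of m consecutive slots offers at least rate * (m - E) slots, and the latency is the smallest such E. The function name `empirical_latency` and the 0/1 list encoding are illustrative assumptions.

```python
from fractions import Fraction


def empirical_latency(slots):
    """Smallest E such that every window of m consecutive slots contains
    at least rate * (m - E) offered slots (assumed definition).

    slots: one period of the schedule for a single port pair, as a list of
    0/1 flags (1 = the slot is offered to the pair). The schedule is
    assumed to repeat with this period.
    """
    period = len(slots)
    offered_per_period = sum(slots)
    if offered_per_period == 0:
        raise ValueError("the pair is never served")
    rate = Fraction(offered_per_period, period)

    # Prefix sums over two periods cover every window that starts in the
    # first period and is at most one period long; longer windows cannot
    # be worse because each full period adds exactly rate * period slots.
    prefix = [0]
    for flag in slots + slots:
        prefix.append(prefix[-1] + flag)

    worst = Fraction(0)
    for start in range(period):
        for length in range(1, period + 1):
            offered = prefix[start + length] - prefix[start]
            worst = max(worst, length - offered / rate)
    return worst
```

For example, a pair served in every fourth slot (`[1, 0, 0, 0]`) gets latency 3, whereas the bursty pattern `[1, 1, 0, 0, 0, 0, 0, 0]` with the same rate gets latency 6.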
15 It is Valuable to have an Input-Output Port Pair Characterized with Rate-Latency The latency is a bound on the lateness of the slots offered to the ij-th port pair. It is a strict (rate-latency) service curve. Having an input-output port pair characterized with a service curve lets us use known results from Network Calculus to bound backlog and delay for appropriately characterized arrival traffic
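As an illustration of the Network Calculus results referred to here, the sketch below evaluates the standard bounds for a leaky-bucket constrained arrival process served by a rate-latency service curve. The arrival model (burst `sigma`, rate `rho`) is an assumption chosen for the example, as is the function name; the slides only say that the arrival traffic must be appropriately characterized.

```python
def rate_latency_bounds(sigma, rho, rate, latency):
    """Backlog and delay bounds from Network Calculus.

    Arrival curve (assumed):  alpha(t) = sigma + rho * t
    Service curve:            beta(t)  = rate * max(t - latency, 0)
    Requires rho <= rate; otherwise no finite bound exists.
    """
    if rho > rate:
        raise ValueError("arrival rate exceeds service rate")
    backlog_bound = sigma + rho * latency  # vertical deviation
    delay_bound = latency + sigma / rate   # horizontal deviation
    return backlog_bound, delay_bound


# Burst of 2 cells, arrival rate 0.25, service rate 0.5, latency 4 slots.
bounds = rate_latency_bounds(2, 0.25, 0.5, 4)
```

A smaller latency tightens both bounds directly, which is why the latency of the schedule is the quantity of interest.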
16 Scheduler by Chang et al [Figure: PGPS server fed by token arrivals; served tokens are placed back as new arrivals] Initialization: tokens of type k arrive at
17 Scheduler by Chang et al (cont’d) [Figure: schedule built from token streams 1, 2, …, K]
18 Scheduler by Chang et al (cont’d) The bound of Chang et al is almost tight One can construct an example that almost attains the bound, see the paper
19 Smooth per-permutation matrix may not mean smooth per input-output port An input-output port pair may be scheduled by more than one permutation matrix. The aggregate of a subset of permutation matrices may not be smoothly scheduled, even though the schedule of the individual permutation matrices is smooth. If each input-output port pair had a 1 in exactly one permutation matrix, the problem would reduce to classical polling
20 Random Permutation Scheduler [Figure: schedule built from token streams 1, 2, …, K; later periods are marked "copy from [0,1)"]
21 Latency of Random Permutation Scheduler Result 1: Fix some 0< <1. With probability 1- where (for, the same estimate holds with A=1/2ln
22 Flavor of a Way to Obtain the Result [Derivation annotated with: the range of a Brownian bridge; definition of latency 3; period-L; known result]
23 Variance of the offered slots with Random Permutation
24 Random-Phase Scheduler [Figure: schedule built from token streams 1, 2, …, K]
25 Random-Phase Scheduler (cont’d) Result 2: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1,
26 Random-Distortion Scheduler [Figure: schedule built from token streams 1, 2, …, K]
27 Random-Distortion Scheduler (cont’d) Result 3: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1,
28 Poisson-Competition Scheduler Amounts to: at a slot, the permutation matrix is of type k ~ For latency 2: Waiting time of Geo/D/1 queue (known) Brownian approximation
29 Numerical Evaluations Goal: Evaluate latencies over a large set of service rate matrices (matrix M defined earlier)
Algorithm to generate stochastic matrices:
Begin (k=0): set the I x I matrix M such that m_ij = 1/L for all ij
Step (k), k = 1, …, k_0:
draw i1, j1, i2, j2 uniformly at random on {1, 2, …, I}
draw d uniformly at random on [0, min(m_{i1 j1}, m_{i2 j2})]
m_{i1 j1} <- m_{i1 j1} - d, m_{i2 j2} <- m_{i2 j2} - d, m_{i1 j2} <- m_{i1 j2} + d, m_{i2 j1} <- m_{i2 j1} + d
Each step preserves all row and column sums. The evolution of M is a Markov chain. One might instead prefer to generate M uniformly at random over the space of doubly stochastic matrices
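The generation procedure on this slide can be sketched in a few lines. Two assumptions are made and should be read as such: the initial entries are set to 1/I so that every row and column sums to 1 (the slide writes 1/L), and draws with i1 = i2 or j1 = j2 are skipped because the four updates cancel in that case. The function name `random_doubly_stochastic` and the `seed` parameter are illustrative.

```python
import random


def random_doubly_stochastic(size, steps, seed=None):
    """Random walk over doubly stochastic matrices, following the slide.

    size:  number of ports I
    steps: number of update steps k_0
    Assumption: start from the uniform matrix with entries 1/size.
    """
    rng = random.Random(seed)
    m = [[1.0 / size] * size for _ in range(size)]
    for _ in range(steps):
        i1, j1, i2, j2 = (rng.randrange(size) for _ in range(4))
        if i1 == i2 or j1 == j2:
            continue  # the four updates would cancel out
        # Move mass d from the "diagonal" pair to the "anti-diagonal" pair;
        # every row and column loses d once and gains d once.
        d = rng.uniform(0.0, min(m[i1][j1], m[i2][j2]))
        m[i1][j1] -= d
        m[i2][j2] -= d
        m[i1][j2] += d
        m[i2][j1] += d
    return m


matrix = random_doubly_stochastic(size=8, steps=1000, seed=1)
```

Because d never exceeds the two entries it is subtracted from, all entries stay nonnegative and the matrix remains doubly stochastic after every step.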
30 Numerical Evaluations: varying switch size Observation: except for small switch sizes, the random-phase bound is tighter than the PGPS bound; the random-distortion bound is the tightest
31 Numerical Evaluations: per port-pair latencies for a 64x64 matrix L=4096, K=2423 Observation: for large enough x, the fraction is larger for random-phase than for PGPS; the fraction is largest for random-distortion
32 Numerical Evaluation for Random Permutation Scheduler [Figure: results plotted against L]
33 Excerpts from the Analysis
34 Preliminaries “Good” Event: Assume: Result 1:
35 Preliminaries (cont’d) Result 2: Putting the Pieces Together: G_{n,m} is implied by events that are easier to handle
36 Random-phase Scheduler Scheduler def: Assume Then Remains only to handle two events
37 Random-phase Scheduler (cont’d) Note Hoeffding Similarly Finally, sum to L, periodicity > 0
38 Random-phase Scheduler: DERANDOMIZATION Method of conditional probabilities Assume
39 Random-phase Scheduler: DERANDOMIZATION (cont’d) Result there exist In addition, if Then
40 Random-phase Scheduler: DERANDOMIZATION (cont’d) Application to our problem We showed By the method of cond. prob., it follows that the latency holds w.p.1 < 1
41 Conclusion We showed that one can obtain less pessimistic bounds on latency that hold in probability. One can derandomize and obtain latencies that hold with probability 1. In many cases the obtained latencies are better than the best-known latency. The point-process approach may be used to construct other schedulers. It is worth trying to obtain sharper results. The question remains: what is the best possible latency for load larger than 1/4?