1
Towards Simple, High-performance Input-Queued Switch Schedulers
Devavrat Shah, Stanford University
Berkeley, Dec 5
Joint work with Paolo Giaccone and Balaji Prabhakar
2
Outline
Description of input-queued switches
Scheduling
–the problem
–some history
Simple, high-performance schedulers
–Laura
–Serena
–Apsara
Conclusions
3
The Input-Queued (IQ) Switch Architecture
N inputs, N outputs (in the figure, N = 3)
Time is slotted
–at most one packet can arrive per time-slot at each input
Equal-sized cells/packets
Buffers only at inputs
A crossbar is used for switching packets
4
Scheduling
The crossbar imposes these constraints in each time-slot:
–only one packet can be transferred to each output
–only one packet can be transferred from each input
The scheduling problem: subject to the above constraints, find a matching of inputs and outputs
–i.e. determine which output will receive a packet from which input in each time-slot
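The crossbar constraints can be checked mechanically. The sketch below is illustrative only: it assumes a schedule is stored as a list in which entry i is the output served by input i (or None if input i is idle), and the function name is hypothetical. With this representation the "one packet per input" constraint holds by construction, so only the outputs need checking.

```python
from typing import List, Optional


def is_valid_schedule(schedule: List[Optional[int]], n: int) -> bool:
    """Return True if `schedule` is a matching for an n x n crossbar.

    schedule[i] is the output served by input i, or None if input i is idle.
    """
    if len(schedule) != n:
        return False
    used = [out for out in schedule if out is not None]
    # Every referenced output must exist.
    if any(not (0 <= out < n) for out in used):
        return False
    # No output may receive packets from two inputs in the same slot.
    return len(used) == len(set(used))
```

A full matching is then simply a permutation of 0..N-1, which is the representation assumed by the other sketches in this transcript.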
5
Background to switch scheduling
1. [Karol et al. 1987] Throughput is limited by head-of-line blocking (to 58% for Bernoulli i.i.d. uniform traffic)
2. [Tamir 1989] Observed that “Virtual Output Queues” (VOQs) eliminate head-of-line blocking
6
Basic Switch Model
[Figure: N x N switch with arrival processes A_11(t), ..., A_NN(t), VOQ occupancies L_11(t), ..., L_NN(t), schedule S(t), and departure processes D_1(t), ..., D_N(t)]
7
Some definitions
3. Queue occupancies: L_11(t), ..., L_NN(t)
8
More background on theory
[Anderson et al. 1993] Finding a schedule is equivalent to finding a matching in the bipartite graph induced by the input and output nodes
9
Background
[McKeown et al. 1995]
(a) Maximum size matching does not give 100% throughput.
(b) But maximum weight matching (MWM) can, where the weight can be the queue length or the age of a cell
[Figure: example bipartite graph with edge weights and its MWM]
10
Maximum Weight Matching
Maximum weight matching (MWM)
–100% throughput
–provable delay bounds for i.i.d. Bernoulli admissible traffic
–but finding the MWM is like solving a network-flow problem, which is too complex for high-speed networks
We seek to approximate maximum weight matching
Our goal:
–obtain a simply implementable approximation to MWM that performs competitively with MWM
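As a point of reference, the MWM can be defined by exhaustive search over all N! matchings. The sketch below is a definition-by-enumeration for tiny switches, not a practical scheduler and not the method used in the talk; it assumes q[i][j] holds the weight (e.g. queue length) of the VOQ from input i to output j, and the function names are hypothetical.

```python
from itertools import permutations


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def mwm_brute_force(q):
    """Return a maximum weight matching by trying all N! permutations."""
    n = len(q)
    return max(permutations(range(n)), key=lambda m: matching_weight(m, q))
```

The factorial cost of this search is exactly what the approximations in the following slides try to avoid.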
11
Approximating MWM
Two performance measures
–throughput
–delay
We first consider simple approximations to MWM that deliver 100% throughput (i.e. stability), and then deal with delay
12
Methods of Approximation
Randomization
–well-known method for simplifying implementation
Using information in packet arrivals
–since queue sizes grow due to arrivals, and arrival times are a source of randomness
Hardware parallelism
–yields an efficient search procedure
13
Randomization
The main idea of randomized algorithms is
–to simplify the decision-making process by basing decisions upon a small, randomly chosen sample from the state rather than upon the complete state
14
An Illustrative Example
Find the oldest person in a population of 1 billion
Deterministic algorithm: linear search
–has a complexity of 1 billion
A randomized version: find the oldest of 30 randomly chosen people
–has a complexity of 30 (ignoring the complexity of random sampling)
Performance
–linear search will find the absolute oldest person (rank = 1)
–if R is the person found by the randomized algorithm, we can make statements like P(R has rank < 100 million) > 0.95
–thus, we can say that the performance of the randomized algorithm is very good with high probability
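The 0.95 figure can be checked with a short calculation. In a population of 1 billion, "rank below 100 million" means the top 10%. Assuming the 30 samples are independent and uniform (a reasonable approximation when the sample is tiny compared with the population), the best of the sample misses the top 10% only if all 30 samples do. The function name below is hypothetical.

```python
def prob_best_sample_in_top(fraction: float, k: int) -> float:
    """Probability that at least one of k independent uniform samples
    falls in the top `fraction` of the population."""
    return 1.0 - (1.0 - fraction) ** k


# Top 10% of the population, 30 samples: 1 - 0.9**30
p = prob_best_sample_in_top(0.10, 30)
```

The bound improves quickly with the sample size, which is why a small sample is already useful.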
15
Randomizing Iterative Schemes
Often, we want to perform some operation iteratively
Example: find the oldest person each year
Say in 2001 you choose 30 people at random
–and store the identity of the oldest person in memory
–in 2002 you choose 29 new people at random
–let R be the oldest person from these 29 + 1 = 30 people
As before, we can make statements about P(R has rank < 100 million) or P(R has rank < 50 million)
16
Back to Switch Scheduling: Randomizing MWM
Choose d matchings at random and use the heaviest one as the schedule
Ideally we would like d to be small. However:
Theorem: Even with d = N this algorithm doesn’t yield 100% throughput!
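A minimal sketch of this memoryless randomized scheme, assuming matchings are permutations drawn uniformly at random and q[i][j] is the weight of the VOQ from input i to output j. The function names and the explicit random-generator argument are illustrative choices, not taken from the talk.

```python
import random


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def heaviest_of_random(q, d, rng):
    """Draw d uniformly random matchings and return the heaviest one."""
    n = len(q)
    best = None
    for _ in range(d):
        candidate = list(range(n))
        rng.shuffle(candidate)  # uniform random permutation
        if best is None or matching_weight(candidate, q) > matching_weight(best, q):
            best = candidate
    return best
```

Each time slot starts from scratch here, which is the weakness the theorem above points at.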
17
Proof
18
Simulation Scenario
Switch size: 32 x 32
Input traffic (shown for a 4 x 4 switch)
–Bernoulli i.i.d. inputs
–diagonal load matrix: normalized load = x + y < 1, x = 2y
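The diagonal load matrix can be sketched as follows. The slide's figure is not available, so the layout is an assumption: rate x on the main diagonal (input i to output i) and rate y on the next cyclic diagonal (input i to output i+1 mod N), with x = 2y and x + y equal to the normalized load. The function name is hypothetical.

```python
def diagonal_load_matrix(n, load):
    """Arrival-rate matrix with x on the diagonal and y on the next
    cyclic diagonal, where x = 2y and x + y = load (assumed layout)."""
    if n < 2:
        raise ValueError("need at least a 2 x 2 switch")
    if not 0.0 <= load < 1.0:
        raise ValueError("normalized load must satisfy 0 <= load < 1")
    y = load / 3.0
    x = 2.0 * y
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rates[i][i] = x
        rates[i][(i + 1) % n] = y
    return rates
```

Every row and every column sums to the normalized load, so the traffic is admissible whenever the load is below 1.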
20
Crucial Observation
The state of the switch changes due to arrivals and departures
Between consecutive time slots, a queue’s length can change by at most 1
–hence a heavy matching tends to stay heavy
Therefore
–“remembering” a heavy matching should help improve performance
21
Tassiulas’ Algorithm
[Tassiulas 1998] proposed the following algorithm based on this observation:
–let S(t-1) be the matching used at time t-1
–let R(t) be a matching chosen uniformly at random
–and let S(t) be the heavier of R(t) and S(t-1)
This gives 100% throughput!
–note that the boost in throughput is due to the use of memory
But delays are very large
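One time slot of this algorithm can be sketched as below. It assumes matchings are permutations, that q holds the queue lengths at time t (so S(t-1) is re-weighed with the current queues), and that ties are resolved in favour of S(t-1); the tie rule and the function names are illustrative assumptions.

```python
import random


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def tassiulas_step(prev, q, rng):
    """Return S(t): the heavier of a uniform random matching R(t) and S(t-1)."""
    r = list(range(len(q)))
    rng.shuffle(r)  # R(t), chosen uniformly at random
    if matching_weight(r, q) > matching_weight(prev, q):
        return r
    return prev
```

Compared with drawing d fresh matchings every slot, the only extra state is the previous matching.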
23
Derandomization
Let G be a fully-connected graph where each node is one of the N! possible schedules
Construct a Hamiltonian walk, H(t), on G
–H(t) cycles through the nodes of G
At any time t
–let R(t) = H(t mod N!)
–and let S(t) be the heavier of R(t) and S(t-1)
This also has 100% throughput, but delays are large (derandomization will be useful later)
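Because G is fully connected, any fixed ordering of the N! permutations is a Hamiltonian walk. The sketch below uses lexicographic order, computed directly from the index t mod N!; this particular ordering is an assumption chosen for simplicity, not necessarily the one used by the authors, and the function names are hypothetical.

```python
from math import factorial


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def hamiltonian_schedule(t, n):
    """H(t mod n!): the permutation of 0..n-1 with lexicographic index t mod n!."""
    k = t % factorial(n)
    remaining = list(range(n))
    perm = []
    for pos in range(n, 0, -1):
        idx, k = divmod(k, factorial(pos - 1))
        perm.append(remaining.pop(idx))
    return perm


def derandomized_step(prev, t, q):
    """Return S(t): the heavier of H(t mod n!) and S(t-1)."""
    r = hamiltonian_schedule(t, len(q))
    return r if matching_weight(r, q) > matching_weight(prev, q) else prev
```

If the queue lengths stayed fixed, N! consecutive steps would visit every matching and therefore end on a maximum weight matching.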
24
Stability
Lemma: Consider an IQ switch with Bernoulli i.i.d. inputs. Let B be a matching algorithm which ensures W_B(t) >= W*(t) - c for every t, for some constant c. Then B is stable.
Theorem: W_DER(t) >= W*(t) - 2N · N!. Therefore, it is stable.
25
Delay
These simple approximations of MWM yield 100% throughput, but delays are large
To obtain good delays we’ll present three different algorithms which use the following features:
–selective remembrance -- Laura
–information in the arrivals -- Serena
–hardware parallelism -- Apsara
26
Laura
[Figure: at each time slot, COMP combines S(t-1) and R(t) to produce S(t), which is fed back as S(t-1) the next time]
Tassiulas
–COMP = Maximum
–R(t): uniform sample
Laura
–COMP = Merge, which picks the best edges of the two matchings
–R(t): non-uniform sample
27
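Merging Procedure
[Figure: the edges of S(t-1) and R together form two cycles]
–first cycle, S(t-1) edges minus R edges: 10 - 40 + 10 - 30 + 10 - 50 = -90, so R’s edges are kept
–second cycle: 70 - 10 + 60 - 20 = 100, so S(t-1)’s edges are kept
W(S(t-1)) = 160, W(R) = 150
After merging: W(S(t)) = 250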
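The merging procedure can be sketched as follows. The union of two full matchings splits into disjoint cycles that alternate between edges of S(t-1) and edges of R; within each cycle the merge keeps whichever matching's edges are heavier. The code assumes matchings are permutations (entry i is the output of input i), breaks ties in favour of S(t-1), and uses hypothetical names. The example data places the weights from the slide on a 5 x 5 switch; the positions of the edges are an assumption, only the weights come from the slide.

```python
def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def merge(s, r, q):
    """Merge matchings s and r, keeping the heavier edge set on each cycle."""
    n = len(s)
    r_inv = [0] * n  # r_inv[j] = input matched to output j in r
    for i, j in enumerate(r):
        r_inv[j] = i
    merged = [None] * n
    for start in range(n):
        if merged[start] is not None:
            continue
        # Walk one cycle: input --s--> output --r--> input ...
        cycle = []
        i = start
        while True:
            cycle.append(i)
            i = r_inv[s[i]]
            if i == start:
                break
        w_s = sum(q[i][s[i]] for i in cycle)
        w_r = sum(q[i][r[i]] for i in cycle)
        chosen = s if w_s >= w_r else r
        for i in cycle:
            merged[i] = chosen[i]
    return merged


# Weights from the slide, arranged on an assumed 5 x 5 layout.
q = [[0] * 5 for _ in range(5)]
s_prev = [0, 1, 2, 3, 4]          # S(t-1): weights 10, 10, 10, 70, 60
r_new = [1, 2, 0, 4, 3]           # R: weights 40, 30, 50, 10, 20
for i, w in enumerate([10, 10, 10, 70, 60]):
    q[i][s_prev[i]] = w
for i, w in enumerate([40, 30, 50, 10, 20]):
    q[i][r_new[i]] = w
s_next = merge(s_prev, r_new, q)
```

Since both matchings cover the same inputs and outputs on every cycle, the result is always a valid matching, and its weight is at least the larger of the two inputs.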
28
Throughput
Theorem:
–LAURA is stable under any admissible Bernoulli i.i.d. input traffic.
29
Average Backlog via Simulation
Switch size: N = 32
Length of VOQ: Q_MAX = 10000
Comparison with
–iSLIP, iLQF, MUCS, RPA and MWM
30
Simulation
Traffic Matrices
–uniform
–diagonal
–sparse
–logdiagonal
31
Laura: Diagonal traffic
32
Laura: Sparse traffic
33
Serena
Since an increase in queue sizes is due to arrivals
And arrivals are a source of randomness
Use arrivals to generate a random matching
34
Serena
[Figure: at each time slot, Merge combines S(t-1) and R(t) to produce S(t), which is fed back as S(t-1) the next time]
R(t) = matching generated using arrivals
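The slide does not spell out how a matching is built from the arrivals, so the sketch below is one plausible reading and should be treated as an assumption: when several arriving packets target the same output, the edge with the largest queue is kept, and the inputs and outputs left unmatched are then paired in index order to complete the matching. The function name and the data layout (arrivals[i] is the destination of the packet that arrived at input i, or None) are hypothetical.

```python
def arrival_matching(arrivals, q):
    """Build a full matching R(t) from this slot's arrivals (assumed rule).

    arrivals[i] is the output targeted by the packet arriving at input i,
    or None if input i had no arrival. q[i][j] is the VOQ length.
    """
    n = len(q)
    owner = {}  # output -> input that keeps the arrival edge
    for i, j in enumerate(arrivals):
        if j is None:
            continue
        k = owner.get(j)
        # On a conflict, keep the arrival edge with the larger queue.
        if k is None or q[i][j] > q[k][j]:
            owner[j] = i
    match = [None] * n
    for j, i in owner.items():
        match[i] = j
    # Complete the matching: pair leftover inputs and outputs in index order.
    free_outputs = [j for j in range(n) if j not in owner]
    free_inputs = [i for i in range(n) if match[i] is None]
    for i, j in zip(free_inputs, free_outputs):
        match[i] = j
    return match
```

The resulting R(t) is what gets merged with S(t-1); since each input sees at most one arrival per slot, building it takes a single pass over the inputs.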
35
Merging Procedure
[Figure: arrival graph (Arr-R) with edge weights 23, 7, 89, 3, 2, 5; the matching R derived from it; and S(t-1)]
W(R) = 89 + 3 + 5 + 23 + 1 = 121
W(S(t-1)) = 209
After merging: W(S(t)) = 89 + 3 + 23 + 31 + 97 = 243
36
Throughput
Theorem:
–SERENA achieves 100% throughput under any admissible i.i.d. Bernoulli traffic pattern
37
Serena: Diagonal traffic
38
Apsara
One way to obtain the MWM is to search the space of all N! matchings
A natural approximation: if S(t-1) is the current matching, then S(t) is the heaviest matching in a “neighborhood” of S(t-1)
It turns out that there is a convenient way of defining neighbors (both for theory and for practice)
39
Neighbors
Neighbors differ from S(t) in ONLY TWO edges (for all values of N)
Example: 3 x 3 switch
[Figure: S(t) and its neighbors]
40
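Apsara
[Figure: the neighbors N1, N2, ..., Nk of S(t-1) are generated in parallel; MAX selects the heaviest among S(t-1), its neighbors and the Hamiltonian-walk matching H(t) as S(t), which is fed back as S(t-1) the next time]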
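A sequential sketch of one Apsara step, under the assumption that a neighbor is obtained by swapping the outputs of two inputs (which changes exactly two edges). In hardware the neighbors would be evaluated in parallel; here they are simply enumerated. The Hamiltonian-walk matching H(t) is passed in as an argument, and all names are hypothetical.

```python
from itertools import combinations


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def neighbors(s):
    """All matchings that differ from s in exactly two edges."""
    result = []
    for a, b in combinations(range(len(s)), 2):
        nb = list(s)
        nb[a], nb[b] = nb[b], nb[a]  # swap the outputs of inputs a and b
        result.append(nb)
    return result


def apsara_step(prev, h_t, q):
    """Return S(t): the heaviest of S(t-1), its neighbors, and H(t)."""
    candidates = [list(prev)] + neighbors(prev) + [list(h_t)]
    return max(candidates, key=lambda m: matching_weight(m, q))
```

A matching has N(N-1)/2 such neighbors, so a 3 x 3 switch has three, and the step never returns a matching lighter than S(t-1).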
41
Apsara: Throughput
Theorem: Apsara is stable under any admissible i.i.d. Bernoulli traffic (stability is due to the Hamiltonian matching)
Also, note that W(S(t)) >= W(S(t-1), t)
Theorem: If W(S(t)) = W(S(t-1), t) then W(S(t)) >= 0.5 W*(t) (this is not enough to ensure stability)
42
Apsara: Diagonal traffic
43
Limited Parallelism
The Apsara algorithm searches over neighbors in parallel
If space is limited to K modules, then search over a randomly chosen subset of size K from all neighbors
There are also other (good) deterministic ways of searching a smaller neighborhood of matchings
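The randomized variant can be sketched by sampling K of the N(N-1)/2 input pairs and swapping only those. As an assumption, a neighbor is taken to be a matching with the outputs of two inputs swapped, and the Hamiltonian-walk candidate is left out to keep the example short. Function names and the explicit random-generator argument are illustrative.

```python
import random
from itertools import combinations


def matching_weight(match, q):
    """Weight of a matching, where match[i] is the output served by input i."""
    return sum(q[i][j] for i, j in enumerate(match))


def sample_neighbors(s, k, rng):
    """Return up to k distinct neighbors of s, chosen uniformly at random."""
    pairs = list(combinations(range(len(s)), 2))
    chosen = rng.sample(pairs, min(k, len(pairs)))
    result = []
    for a, b in chosen:
        nb = list(s)
        nb[a], nb[b] = nb[b], nb[a]  # swap the outputs of inputs a and b
        result.append(nb)
    return result


def limited_step(prev, q, k, rng):
    """Return the heaviest of S(t-1) and k randomly sampled neighbors."""
    candidates = [list(prev)] + sample_neighbors(prev, k, rng)
    return max(candidates, key=lambda m: matching_weight(m, q))
```

With K equal to the full neighborhood size this reduces to searching all neighbors.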
44
Apsara: Limited parallelism
45
Diagonal traffic
46
Conclusions
We have presented novel scheduling algorithms for input-queued switches
–Laura
–Serena
–Apsara
They are simple to implement and perform competitively with the Maximum Weight Matching algorithm
47
References
1. L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input-queued switches,” Proc. INFOCOM 1998.
2. D. Shah, P. Giaccone and B. Prabhakar, “An efficient randomized algorithm for input-queued switch scheduling,” Proc. of Hot Interconnects, 2001.
3. P. Giaccone, D. Shah and B. Prabhakar, “An Implementable Parallel Scheduler for Input-Queued Switches,” Proc. of Hot Interconnects, 2001.
4. P. Giaccone, B. Prabhakar and D. Shah, “Towards simple and efficient scheduler for high-aggregate IQ switches,” submitted to INFOCOM’02.
5. R. Motwani and P. Raghavan, Randomized Algorithms, Cambridge University Press, 1995.
48
Uniform traffic
49
LogDiagonal traffic