1 Stress Resistant Scheduling Algorithms for CIOQ Switches
Prashanth Pappu, Applied Research Laboratory, Washington University in St. Louis
“Stress Resistant Scheduling Algorithms for CIOQ Switches”, Prashanth Pappu, Jon Turner. To appear in ICNP 2003.

2 Anatomy of a Router
[Figure: a switch fabric connecting six line cards, each with an input port processor (IPP) and an output port processor (OPP), plus a control processor]
- Port processors queue packets and make routing decisions.
- Line cards encode data for transmission on the target physical layer.
- Control processor: runs routing protocols and monitoring functions.

3 Output Queuing
- Queuing is done only at the output ports.
- Maximizes throughput.
- Packets contend only at the output ports.
- Requires speedup = N: impractical, but the ideal reference model.
[Figure: input ports, switching fabric, output ports]

4 Combined Input Output Queuing (CIOQ)
- Uses virtual output queues (VOQs) at the inputs.
- The crossbar is configured by a centralized scheduler.
- Scheduling is a bipartite graph matching problem.
[Figure: input ports with VOQs, switching fabric, output ports, centralized scheduler]

5 Stability results
- Maximum size matching (MSM): stable for i.i.d., uniform, admissible traffic.
- Maximum weight matching (MWM): stable for independent, admissible traffic.
- Both are too complex: O(N^(5/2)) and O(N^3 log N), respectively.
- A switch with 10 Gb/s links has < 40 ns to make each scheduling decision.
- Practical alternatives are maximal size matching algorithms: Parallel Iterative Matching (PIM) and iterative SLIP (iSLIP).

6 Parallel Iterative Matching (PIM)
- Iterative matching algorithm:
  - Each unmatched input sends a request to every output for which it has a queued cell.
  - Unmatched outputs randomly pick a request and send a grant.
  - If an input receives multiple grants, it picks one randomly.
- O(log N) convergence.
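The request/grant/accept steps above can be sketched in Python. This is only an illustrative model, not the hardware scheduler from the talk: the function name `pim_match`, the `voq` matrix layout (`voq[i][j]` = cells queued at input i for output j), and the `iterations` and `rng` parameters are assumptions introduced here. Requests are sent only to unmatched outputs, which gives the same result because a matched output would never grant.

```python
import random

def pim_match(voq, iterations=4, rng=None):
    """Sketch of PIM. voq[i][j] is the number of cells at input i for output j.

    Returns a dict mapping each matched input to its output.
    """
    rng = rng or random.Random()
    n = len(voq)
    match_in = {}   # input -> output
    match_out = {}  # output -> input
    for _ in range(iterations):
        # Request: unmatched inputs ask every unmatched output they have cells for.
        requests = {j: [] for j in range(n) if j not in match_out}
        for i in range(n):
            if i in match_in:
                continue
            for j in requests:
                if voq[i][j] > 0:
                    requests[j].append(i)
        # Grant: each unmatched output picks one requesting input at random.
        grants = {}
        for j, reqs in requests.items():
            if reqs:
                grants.setdefault(rng.choice(reqs), []).append(j)
        if not grants:
            break  # the matching is already maximal
        # Accept: each input that received grants picks one at random.
        for i, outs in grants.items():
            j = rng.choice(outs)
            match_in[i] = j
            match_out[j] = i
    return match_in
```

Each iteration that is not yet maximal adds at least one pair, so N iterations always suffice in this model; the O(log N) figure on the slide refers to the expected number of iterations.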

7 Iterative SLIP (iSLIP)
- Iterative matching algorithm:
  - Unmatched inputs send requests to the unmatched outputs for which they have cells.
  - Each unmatched output picks the request that appears next in a fixed round-robin order starting from its input pointer. (The input pointer is updated only in the first iteration.)
  - If an input gets multiple grants, it picks the one that appears next in a fixed round-robin order starting from its output pointer, and then updates the output pointer.
- Desynchronization effect.
- Such algorithms are simple to implement but do not perform well under extreme traffic conditions.
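A minimal Python sketch of the round-robin grant and accept steps follows. The names `islip_match`, `grant_ptr` (the per-output "input pointer") and `accept_ptr` (the per-input "output pointer") are hypothetical, and the rule of advancing both pointers only when a grant is accepted in the first iteration follows the usual description of iSLIP, which the slide only partly spells out.

```python
def islip_match(voq, grant_ptr, accept_ptr, iterations=4):
    """Sketch of iSLIP. voq[i][j] is the number of cells at input i for output j.

    grant_ptr[j] is output j's round-robin pointer over inputs;
    accept_ptr[i] is input i's round-robin pointer over outputs.
    Both lists are updated in place. Returns a dict input -> output.
    """
    n = len(voq)
    match_in, match_out = {}, {}
    for it in range(iterations):
        # Grant: each unmatched output picks the requesting input that is
        # next at or after its pointer in round-robin order.
        grants = {}
        for j in range(n):
            if j in match_out:
                continue
            reqs = [i for i in range(n) if i not in match_in and voq[i][j] > 0]
            if reqs:
                i = min(reqs, key=lambda x: (x - grant_ptr[j]) % n)
                grants.setdefault(i, []).append(j)
        if not grants:
            break
        # Accept: each input picks the granting output that is next at or
        # after its pointer in round-robin order.
        for i, outs in grants.items():
            j = min(outs, key=lambda x: (x - accept_ptr[i]) % n)
            match_in[i] = j
            match_out[j] = i
            if it == 0:
                # Pointers move one past the matched port, first iteration only.
                grant_ptr[j] = (i + 1) % n
                accept_ptr[i] = (j + 1) % n
    return match_in
```

Calling the function repeatedly with the same pointer lists on a fully loaded 2x2 switch shows the desynchronization effect: after the first cell time the two outputs point at different inputs and both are matched in a single iteration.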

8 Worst case results
- Critical Cells First (CCF) can emulate output queuing with speedup = 2.
- The Lowest Occupancy Output First Algorithm (LOOFA) is work conserving with speedup = 2.
- LOOFA can be augmented with timestamps to emulate output queuing (speedup = 3).

9 LOOFA
- Iterative matching algorithm:
  - Each unmatched input sends a request to the output with the lowest occupancy among those for which it has queued cells.
  - Outputs pick a request randomly and send a grant to that input.
- Needs O(N) iterations to perform correctly.
- Work conserving with a speedup of 2.
- A significant result, but not practical.
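The LOOFA steps above might be modelled as follows. This is a sketch under assumptions: `loofa_match` and the `occupancy` list (cells currently held in each output queue) are names introduced here, ties between equally occupied outputs are broken by lowest index, and the loop simply runs until no input can request, which takes at most N rounds.

```python
import random

def loofa_match(voq, occupancy, rng=None):
    """Sketch of LOOFA. voq[i][j] is the number of cells at input i for output j;
    occupancy[j] is the number of cells in output j's queue.

    Returns a dict input -> output.
    """
    rng = rng or random.Random()
    n = len(voq)
    match_in, match_out = {}, {}
    while True:
        # Request: each unmatched input asks only its lowest-occupancy
        # unmatched output (ties go to the lowest index, an assumption).
        requests = {}
        for i in range(n):
            if i in match_in:
                continue
            cands = [j for j in range(n) if j not in match_out and voq[i][j] > 0]
            if cands:
                j = min(cands, key=lambda x: occupancy[x])
                requests.setdefault(j, []).append(i)
        if not requests:
            break
        # Grant: each requested output picks one input at random. An input
        # sends a single request, so a grant is always accepted.
        for j, reqs in requests.items():
            i = rng.choice(reqs)
            match_in[i] = j
            match_out[j] = i
    return match_in
```

Because every round with pending requests matches at least one pair, the loop ends after at most N rounds, which is the O(N) iteration count that makes the algorithm hard to run at line rate.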

10 Traffic in IP networks
- The unregulated nature of IP networks can cause sustained overloads:
  - use of slow congestion control mechanisms
  - limited route diversity, which makes congested links common
  - use of route selection mechanisms not guided by session bandwidth needs
  - sudden route changes causing rapid traffic shifts
  - malicious users
- How do practical scheduling algorithms perform in these conditions?

11 Solution
- We use targeted stress tests to:
  - study the performance of practical scheduling algorithms under extreme conditions
  - study the performance of work conserving scheduling algorithms under speedups < 2
  - design stress resistant scheduling algorithms that maintain throughput under both uniform traffic and stress tests and can still be implemented at high speeds.

12 Miss fraction
- Previous work uses average queuing delay as a metric.
- That metric is not useful under inadmissible traffic conditions.
- Miss fraction: miss fraction = 1 - N_A / N_I
- Determines the relative loss in throughput.
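The formula is simple enough to state as a one-line function. The slide does not define the two counts, so this sketch assumes that N_A is the number of cells forwarded by the algorithm under test and N_I is the number forwarded by an ideal output-queued switch over the same interval; the function and parameter names are hypothetical.

```python
def miss_fraction(n_a, n_i):
    """Relative throughput loss, 1 - N_A / N_I.

    Assumption: n_a = cells forwarded by the algorithm under test,
    n_i = cells forwarded by an ideal output-queued switch.
    """
    if n_i <= 0:
        raise ValueError("n_i must be positive")
    return 1.0 - n_a / n_i
```

Under that reading, a miss fraction of 0 means the algorithm kept up with the ideal switch, and larger values mean more lost throughput.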

13 Stress Test
[Figure: stress test traffic pattern over phases 1 to 4]
- An adversarial approach that overloads (stresses) various outputs.
- Outputs with empty queues have cells queued at various inputs.
- Inputs with cells for an empty output also have cells queued for other outputs.
- The test can be varied by changing the number of participating inputs or phases.

14 Stress Test (Example)
- PIM (speedup = 1.5). Stress test with 3 participating inputs and 4 phases.

15 PIM (under uniform traffic)
[Figures: average queuing delay; miss fraction]

16 iSLIP (under uniform traffic)
[Figures: average queuing delay; miss fraction]

17 Stress Tests
[Figures: Test A (worst case for PIM(4), speedup = 2); Test B (worst case for LOOFA, speedup = 2)]

18 Stress resistant algorithms
- The better performance of LOOFA suggests that ordering the outputs is the key.
- Complete ordering can make algorithms too complex to implement.
- But traffic conditions are persistent and change slowly, so approximate ordering schemes can be used.
- Lowest Layer Selection (LLS): a heuristic that achieves a coarser ordering of outputs.
- Odd-even sorting: achieves an approximate ordering that converges to the ideal ordering under persistent traffic conditions.

19 Lowest Layer Selection
- Achieves a coarser ordering of outputs:
  - bigger layers for larger queue lengths
  - beyond a queue limit, all outputs are treated as equal
  - the number of layers is independent of N.
- Algorithms give priority to outputs in the lowest layer in the accept phase.
- A priority encoder or an N-way minimum finding circuit can be used on the grant vector.
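One way to obtain layers with these properties is to let layer widths double, as in the sketch below. The exact layer boundaries are not given on the slide, so the power-of-two mapping, the function name `layer_of`, and the default of 16 layers are all assumptions made for illustration.

```python
def layer_of(queue_length, num_layers=16):
    """Map a queue length to a coarse layer index (assumed mapping).

    Layer 0 holds length 0, layer 1 holds lengths 1-2, layer 2 holds 3-6,
    and so on, so layers grow with queue length. Every length beyond the
    last boundary falls into the top layer, so those outputs are treated
    as equal. The number of layers does not depend on the switch size N.
    """
    if queue_length < 0:
        raise ValueError("queue_length must be non-negative")
    # floor(log2(queue_length + 1)) using integer arithmetic only
    layer = (queue_length + 1).bit_length() - 1
    return min(layer, num_layers - 1)
```

With this mapping, only a few bits per output are needed to compare layers, which is what keeps the comparison cheap relative to a full ordering by queue length.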

20 Lowest Layer Selection - Random (LLS-R)
- Iterative matching algorithm:
  - Each unmatched input sends a request to every output for which it has a queued cell.
  - Unmatched outputs randomly pick a request and send a grant.
  - If an input receives multiple grants, it picks one randomly from the lowest layer.
- O(log N) convergence still holds.
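The only change from PIM is the accept step, which might look like this in Python. The names `lls_r_accept`, `granting_outputs` and `layer` (a precomputed list giving each output's layer index) are hypothetical.

```python
import random

def lls_r_accept(granting_outputs, layer, rng=None):
    """Sketch of the LLS-R accept step for one input.

    granting_outputs: outputs that sent this input a grant (non-empty).
    layer[j]: layer index of output j, lower meaning a shorter queue.
    Returns the accepted output.
    """
    rng = rng or random.Random()
    lowest = min(layer[j] for j in granting_outputs)
    # Keep only the grants from the lowest occupied layer, then pick randomly.
    candidates = [j for j in granting_outputs if layer[j] == lowest]
    return rng.choice(candidates)
```

When all granting outputs sit in the same layer, this reduces to the plain random accept of PIM.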

21 Lowest Layer Selection - SLIP (LLS-S)
- Iterative matching algorithm:
  - Unmatched inputs send requests to the unmatched outputs for which they have cells.
  - Each unmatched output picks the request that appears next in a fixed round-robin order starting from its input pointer. (The input pointer is updated only in the first iteration.)
  - If an input gets multiple grants, it picks, from the lowest layer, the one that appears next in a fixed round-robin order starting from its output pointer, and then updates the output pointer.
- Both LLS-R and LLS-S have the same performance as PIM and iSLIP under uniform traffic.
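The corresponding accept step for LLS-S can be sketched as a round-robin choice restricted to the lowest layer. As before, the names `lls_s_accept`, `layer` and `accept_ptr` are assumptions introduced for this example, and pointer updates are left to the caller.

```python
def lls_s_accept(granting_outputs, layer, accept_ptr, n):
    """Sketch of the LLS-S accept step for one input.

    granting_outputs: outputs that sent this input a grant (non-empty).
    layer[j]: layer index of output j, lower meaning a shorter queue.
    accept_ptr: this input's round-robin pointer over the n outputs.
    Returns the accepted output.
    """
    lowest = min(layer[j] for j in granting_outputs)
    candidates = [j for j in granting_outputs if layer[j] == lowest]
    # Among lowest-layer grants, take the first one at or after the pointer.
    return min(candidates, key=lambda j: (j - accept_ptr) % n)
```

If every output is in the same layer, the choice is exactly the iSLIP accept rule, which is consistent with the slide's remark about identical behavior under uniform traffic.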

22 Stress Test
Miss fractions for LLS-R, LLS-S (using 16 layers) and LOOFA.
[Figures: Test A; Test B]

23 Stress Test
Miss fractions for LLS-S and LLS-R (single iteration) with varying numbers of layers.
[Figures: LLS-S (Test A); LLS-R (Test A)]

24 Approximate LOOFA (A-LOOFA)
- LOOFA is complex, but it can serve as the basis for a practical algorithm with similar performance.

25 Approximate LOOFA (A-LOOFA)
- Matching in A-LOOFA is accomplished using a simple combinational circuit.
- Matching time is O(N), but the constant factor is determined by gate delays (2N times the delay in each block).
- In a 0.13 µm ASIC process, gate delays are 25-50 ps, so a match can be completed in 3.2-6.4 ns (that is, about 128 gate delays).

26 A-LOOFA
- Columns are ordered using odd-even sort.
  - For all even j < N, columns j and j+1 are swapped if q_j > q_{j+1}.
  - Similarly for all odd j < N-1.
- Rows are ordered using a permutation based on the perfect shuffle (to ensure fairness).
  - For all even i < N, generate a pseudo-random bit x_i.
  - If x_i = 0, the values in row i are moved to row i/2 and those in row i+1 are moved to row (N+i)/2.
  - Otherwise, the values in row i are moved to row (N+i)/2 and those in row i+1 are moved to row i/2.
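The column and row orderings can be illustrated with two small Python helpers. This is a software model only, with several assumptions: `q` is taken to be the per-output occupancy used for ordering, columns are sorted in ascending order of `q`, N is even, and the names `odd_even_step` and `shuffle_rows` are hypothetical.

```python
def odd_even_step(order, q, parity):
    """One pass of odd-even transposition sort over the column order.

    order: current list of output indices, one per column.
    q[j]: occupancy of output j (assumed to be the sort key).
    parity: 0 compares column pairs (0,1), (2,3), ...; 1 compares (1,2), (3,4), ...
    Returns the new column order.
    """
    order = list(order)
    for j in range(parity, len(order) - 1, 2):
        if q[order[j]] > q[order[j + 1]]:
            order[j], order[j + 1] = order[j + 1], order[j]
    return order

def shuffle_rows(rows, bits):
    """Shuffle-based row permutation; bits[i // 2] plays the role of x_i.

    For each even i, rows i and i+1 go to positions i/2 and (N+i)/2,
    in an order chosen by the pseudo-random bit.
    """
    n = len(rows)
    out = [None] * n
    for i in range(0, n, 2):
        if bits[i // 2] == 0:
            out[i // 2], out[(n + i) // 2] = rows[i], rows[i + 1]
        else:
            out[(n + i) // 2], out[i // 2] = rows[i], rows[i + 1]
    return out
```

Alternating even and odd passes N times fully sorts the columns, so if occupancies stay roughly stable, applying one pass per scheduling cycle lets the approximate order converge to the exact one.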

27 A-LOOFA performance
[Figures: Test A; Test B]
