Published by Walter Ball. Modified over 9 years ago.
1
Winter 2006, EE384x
EE384x: Packet Switch Architectures I
a) Delay Guarantees with Parallel Shared Memory
b) Summary of Deterministic Analysis
Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
nickm@stanford.edu
http://www.stanford.edu/~nickm
2
Delay Guarantees
Problem: How can we design a parallel output-queued router from slower parallel memories and still provide delay guarantees?
This is difficult because the counting technique depends on being able to predict a cell's departure time and schedule it (before, we assumed that the output queue is FCFS). In policies such as strict priority and weighted fair queueing, we don't know a cell's departure time when it arrives.
3
Delay Guarantees
[Figure: constrained traffic feeding one output with many logical FIFO queues (1 to m), where weighted fair queueing sorts packets by finishing time; and constrained traffic feeding one output with a single PIFO queue, where cells are pushed in.]
Push In First Out (PIFO) models:
  Weighted Fair Queueing
  Weighted Round Robin
  Strict priority
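A minimal sketch of the PIFO idea, not taken from the slides: a cell may be pushed in anywhere in the departure order, but only the head of line departs. The class name `PIFOQueue` and the integer `rank` are illustrative assumptions; for strict priority the rank would be the priority class, for weighted fair queueing the finishing time.

```python
import bisect
import itertools


class PIFOQueue:
    """Push-In First-Out: insert anywhere by rank, depart only from the head."""

    def __init__(self):
        self._entries = []             # kept sorted by (rank, arrival sequence)
        self._seq = itertools.count()  # tie-breaker: equal ranks depart in arrival order

    def push(self, rank, cell):
        # Push-in: the new cell is placed by rank; cells already queued
        # keep their order relative to each other.
        bisect.insort(self._entries, (rank, next(self._seq), cell))

    def pop(self):
        # First-out: only the head-of-line cell may depart.
        _rank, _seq, cell = self._entries.pop(0)
        return cell

    def __len__(self):
        return len(self._entries)


# Strict priority as a PIFO: rank = priority class (lower departs first, an assumption).
q = PIFOQueue()
q.push(2, "b1")
q.push(1, "a1")
q.push(2, "b2")
q.push(1, "a2")   # pushed in ahead of b1 and b2, which arrived earlier
order = [q.pop() for _ in range(len(q))]
```

Note that `a2` overtakes cells that were already queued, which is why a cell's departure time cannot be fixed when it arrives.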
4
Theorem
A parallel output-queued router can give delay guarantees (within a bounded error) with 4N - 2 memories, each of which can perform at most one memory operation per time slot.
5
Intuition for Theorem 2
[Figure: departure-order timelines for N = 3, with departure times DT = 1, 2, 3. In the FIFO case a cell joins at the tail, so only the N-1 packets before the cell at time of insertion matter. In the PIFO case cells (labelled 2.5 and 1.5) are pushed into the middle of the departure order, so there are N-1 packets before the cell and N-1 packets after the cell at time of insertion.]
FIFO: Window of memories of size N-1 that can't be used.
PIFO: 2 windows of memories of size N-1 that can't be used.
6
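Proof
A packet cannot use the memories:
1. Used to write the N-1 other cells arriving at t.
2. Used to read the N departing cells at t.
3. Used to read the N-1 cells that depart before it.
4. Used to read the N-1 cells that depart after it.
That rules out at most (N-1) + N + (N-1) + (N-1) = 4N - 3 memories, so with 4N - 2 memories at least one is always free.
[Figure: timeline from Time = t, showing cell C with departure time between DT = t and DT = t+T, and the windows "Before C" and "After C".]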
7
With a PIFO per output
[Figure: departure order of cells a1-a4, b1-b4, c1-c4 over departure times DT = 1 to DT = 4, shown before and after a new cell a2' is pushed in behind a2.]
The relative order of (a3, b3) is reversed after the cells have been placed in memory. Therefore, departure is not in PIFO order. By how much can the order differ?
8
Permute departure order
[Figure: departure order of cells a1..ak, b1..bk, c1..ck, ..., N1..Nk over departure times DT = 1 to DT = k, shown before and after cell a2' is pushed in.]
Cells are correctly resequenced by each output. Therefore, the maximum delay is k - 1 time slots.
9
Summary - Routers with delay guarantees

                     Fabric    # Mem.  Mem. BW     Total Memory BW  Switch BW  Switch Algorithm
Output-Queued        Bus       N       (N+1)R      N(N+1)R          NR         None
Shared Mem.          Bus       1       2NR         2NR              2NR        None
Input Queued         Crossbar  N       2R          2NR              NR         -
CIOQ (Cisco)         Crossbar  2N      3R          6NR              2NR        Marriage
CIOQ (Cisco)         Crossbar  2N      3R          6NR              3NR        Time Reserve
PPS - OQ             Clos      Nk      3R(N+1)/k   3N(N+1)R         6NR        C. Sets
PPS - Shared Memory  Clos      Nk      6NR/k       6NR              6NR        C. Sets
PSM                  Bus       k       4NR/k       4NR              4NR        C. Sets
DSM (Juniper)        Xbar      N       6R          6NR              6NR        C. Sets
DSM (Juniper)        Xbar      N       4R          4NR              8NR        C. Sets
DSM (Juniper)        Xbar      N       4R          4NR              5NR        Edge Color
                               Nk      2NR/k       2NR              2NR        -
10
Summary of OQ Switches
Output queued switches are ideal
  Work-conserving.
  Maximize throughput.
  Minimize expected delay (for fixed length packets).
  Permit delay guarantees for constrained traffic.
Output queued switches don't scale well
  Requires N memory writes per time slot.
  Memory bandwidth (dictated by the random-access time of a memory) is a bottleneck.
  Parallelism is not straightforward.
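To make the scaling problem concrete, here is a rough arithmetic sketch. It uses the (N+1)R per-memory bandwidth listed for the output-queued router in the summary table (N writes plus one read per time slot). The port count, line rate and cell size below are illustrative assumptions, not figures from the lecture.

```python
def oq_memory_bandwidth(n_ports, line_rate_bps):
    """Per-output memory bandwidth of an OQ switch: N writes + 1 read, i.e. (N+1)R."""
    return (n_ports + 1) * line_rate_bps


def max_access_time_ns(cell_bits, bandwidth_bps):
    """Longest random-access time (in ns) that sustains the bandwidth,
    assuming each memory access moves exactly one cell."""
    return cell_bits / bandwidth_bps * 1e9


# Assumed example values: 32 ports, 10 Gb/s per port, 64-byte (512-bit) cells.
bandwidth = oq_memory_bandwidth(32, 10e9)
access_ns = max_access_time_ns(512, bandwidth)
print(f"memory bandwidth: {bandwidth / 1e9:.0f} Gb/s")
print(f"required access time: {access_ns:.2f} ns per cell")
```

Doubling the port count roughly halves the access time the memory must achieve, which is why the random-access time becomes the bottleneck.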
11
Summary of OQ Switches (2)
Parallelizing packet switches has problems
  Resource conflicts.
  Packet mis-sequencing.
Methods to analyze parallel OQ switches
  Constraint Sets (based on pigeon-hole principle)
    Parallel packet switches
    Parallel shared memory
    Distributed shared memory
    Extension to PIFO
  Parallel packet buffers
    Hybrid SRAM-DRAM FIFO queues.
    With and without lookahead buffer.