A Scalable Switch for Service Guarantees
Bill Lin (University of California, San Diego)
Isaac Keslassy (Technion, Israel)
IEEE Hot Interconnects XIII, August 17-19

Motivation
Scalability: traffic demands are growing, driven in part by increasing broadband adoption
  10x increase in broadband subscriptions in just the last 3 years, already over 100 million subscribers
  Gbps fiber to the home is emerging (GPON, GEPON, EPON, BPON, …)
Service guarantees: operators need bandwidth partitioning capabilities
  Provide guaranteed rates in service-level agreements
  Enable logical partitioning of converged networks
  Traffic engineering in general

Router Wish List
Scalable in line rates and number of linecards
  e.g. R = 160 Gbps (new packet every 2 ns), thousands of linecards, petabit capacity
No centralized scheduler
No per-packet dynamic switch reconfigurations
Low-complexity linecards
Provide performance guarantees
  100% throughput guarantee
  Service guarantees
  No packet reordering
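The figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below assumes minimum-size 40-byte packets, which the slide does not state; the function names are hypothetical and chosen only for illustration.

```python
def packet_time_ns(line_rate_gbps: float, packet_bytes: int) -> float:
    """Time to receive one packet, in nanoseconds (1 Gbps == 1 bit per ns)."""
    return packet_bytes * 8 / line_rate_gbps


def linecards_for_capacity(total_gbps: float, line_rate_gbps: float) -> float:
    """Number of linecards at the given line rate needed to reach total_gbps."""
    return total_gbps / line_rate_gbps


# Assumption: 40-byte minimum-size packets (not given on the slide).
print(packet_time_ns(160, 40))             # time per packet at R = 160 Gbps
print(linecards_for_capacity(1e6, 160))    # linecards for 1 Pb/s (1e6 Gbps)
```

Under that packet-size assumption, a 160 Gbps linecard sees a new packet every 2 ns, and a petabit of capacity needs several thousand such linecards, in line with the wish list.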

Existing Architectures
Output-Queueing (OQ) switch
  Well-known rate guarantees are possible with Weighted Fair Queueing or Deficit Round-Robin scheduling
  But OQ switches require a speedup of N
Crossbar switches, using Input-Queueing (IQ) or Combined Input-Output Queueing (CIOQ)
  OQ emulation is possible
  But it needs expensive centralized scheduling and per-packet dynamic switch reconfigurations
Birkhoff-von Neumann decomposition
  If the traffic matrix is known, can provide rate guarantees with distributed scheduling, but still requires per-packet dynamic switch reconfigurations

Existing Architectures (cont’d)
Load-balanced switches
  Chang et al., “Load balanced Birkhoff-von Neumann switches, Part I: one-stage buffering”, Computer Communications, 2002
  Keslassy et al., “Scaling Internet routers using optics”, ACM SIGCOMM 2003
A key idea: fixed-configuration uniform meshes in optics, no dynamic switch reconfigurations
Showed a 100 Tb/s load-balanced router with R = 160 Gbps and N = 640 linecards
Showed 100% throughput for “best effort” traffic, but no service guarantees

This Talk
Presents the Interleaved Matching Switch (IMS)
  Like a load-balanced switch, uses fixed-configuration uniform meshes, implemented with an optical fabric
  No arbitrary per-packet switch reconfiguration
Can emulate any IQ or CIOQ switch
Can emulate a Birkhoff-von Neumann switch
  If the traffic matrix is known, can ensure 100% throughput, service guarantees, and packet ordering
  Shows that O(1) distributed online scheduling can be used

Generic Load-Balanced Switch Using Fixed Configuration Uniform Meshes
[Figure: input (In) and output (Out) linecards at line rate R, connected through fixed uniform meshes with channels of rate R/N]
Many fabric options (any spreading device)
  Space: full uniform mesh
  Wavelength: static WDM
  Time: round-robin switches
Just need fixed uniform-rate channels at R/N
No dynamic switch reconfigurations
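As a rough illustration of the time-domain option (round-robin switches), the sketch below uses one possible fixed connection pattern: in slot t, sender s is connected to receiver (s + t) mod N. This specific pattern and the function names are assumptions for illustration; the slide only requires fixed uniform-rate channels at R/N.

```python
def receiver_for(sender: int, slot: int, n: int) -> int:
    """Receiver that `sender` is connected to in `slot` (assumed pattern)."""
    return (sender + slot) % n


def is_uniform_mesh(n: int) -> bool:
    """Check that the fixed pattern behaves like a uniform mesh of R/N channels."""
    all_ports = list(range(n))
    # In every slot, the connections form a permutation (no receiver is shared).
    each_slot_is_permutation = all(
        sorted(receiver_for(s, t, n) for s in range(n)) == all_ports
        for t in range(n)
    )
    # Over N consecutive slots, every sender reaches every receiver exactly once,
    # so each sender-receiver pair gets 1/N of the line rate.
    each_pair_once_per_frame = all(
        sorted(receiver_for(s, t, n) for t in range(n)) == all_ports
        for s in range(n)
    )
    return each_slot_is_permutation and each_pair_once_per_frame


print(is_uniform_mesh(4))
```

The pattern depends only on the slot index, never on the traffic, which is why no dynamic reconfiguration is needed.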

From Load-Balanced Switch
[Figure: load-balanced switch with linecards at line rate R and mesh channels at rate R/N]

To Interleaved Matching Switch
[Figure: the same linecards and meshes, rearranged as an Interleaved Matching Switch]
Move main packet buffers to INPUT
Add coordination slots in MIDDLE
Retain fixed-configuration meshes

How It Works
IMS works by emulating an IQ or CIOQ crossbar switch, but without per-packet dynamic switch reconfigurations (how centralized scheduling can be avoided is shown later)
[Figure: animation comparing an Interleaved Matching Switch (linecards at rate R, meshes at rate R/N) with a crossbar switch (XBAR) on the same packets A1, A2, B1, B2, C1, C2]
Differences with crossbar switch
  No dynamic switch reconfigurations
  Departure times delayed by 2N time slots (N time slots per mesh), otherwise same sequence
  Packet transfers initiated at each time slot to the next MIDDLE linecard in round-robin order
Crossbar MATCHINGS are INTERLEAVED across MIDDLE linecards (analogous to memory interleaving)
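The interleaving idea can be sketched as a small simulation. It assumes a simplified timing model that is consistent with the slide but not spelled out there: the matching computed for slot t is handed to MIDDLE linecard t mod N, and each transfer occupies one R/N mesh channel for N slots. The function names and packet labels are hypothetical.

```python
from collections import defaultdict


def crossbar_departures(matchings):
    """matchings[t] maps input -> (output, packet); packets depart in slot t."""
    out = defaultdict(list)
    for t, matching in enumerate(matchings):
        for _inp, (outp, pkt) in matching.items():
            out[outp].append((t, pkt))
    return dict(out)


def ims_departures(matchings, n):
    """Replay the same matchings on an IMS and check mesh channels never collide."""
    busy_until = {}  # (mesh, src, dst) -> first slot at which the channel is free
    out = defaultdict(list)
    for t, matching in enumerate(matchings):
        mid = t % n  # next MIDDLE linecard in round-robin order
        outputs = [outp for outp, _ in matching.values()]
        assert len(set(outputs)) == len(outputs), "not a matching"
        for inp, (outp, pkt) in matching.items():
            # First mesh: input -> middle, then second mesh: middle -> output.
            for mesh, src, dst, start in ((1, inp, mid, t), (2, mid, outp, t + n)):
                key = (mesh, src, dst)
                assert busy_until.get(key, 0) <= start, "channel conflict"
                busy_until[key] = start + n  # an R/N channel is busy for N slots
            out[outp].append((t + 2 * n, pkt))
    return dict(out)


# Hypothetical matchings for N = 3 (partial matchings are allowed).
matchings = [
    {0: (0, "p0"), 1: (1, "p1"), 2: (2, "p2")},
    {0: (1, "p3"), 1: (2, "p4")},
    {0: (0, "p5"), 2: (1, "p6")},
    {1: (0, "p7"), 0: (2, "p8")},
]
print(crossbar_departures(matchings))
print(ims_departures(matchings, 3))
```

In this model every output sees the same packet sequence as in the crossbar, shifted by 2N slots, and the assertions confirm that no fixed channel is asked to carry two transfers at once.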

IQ and CIOQ Switch Emulation
An IMS can emulate any IQ or CIOQ switch.
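
When Traffic Matrix is Known
When the traffic matrix is known, can perform Birkhoff-von Neumann decomposition offline
Given any admissible traffic matrix $\Lambda = [\lambda_{ij}]$ (every row sum and column sum at most 1)
Can decompose into a series of permutation matrices $P_k$ ($k = 1, \dots, K$) such that $\Lambda \le \sum_{k=1}^{K} \phi_k P_k$, where $\phi_k > 0$ and $\sum_{k=1}^{K} \phi_k = 1$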
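A minimal sketch of the offline decomposition follows. It assumes the traffic matrix has already been padded to a doubly stochastic matrix (all row and column sums exactly 1), uses exact fractions, and finds each permutation with a simple augmenting-path matching rather than any particular algorithm from the talk. The function names and the example matrix are hypothetical.

```python
from fractions import Fraction


def perfect_matching(support, n):
    """Augmenting-path bipartite matching; support[i] lists usable columns of row i."""
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_col[j] == -1 or try_row(match_col[j], seen):
                match_col[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j  # row i is sent to column j
    return perm


def bvn_decompose(matrix):
    """Return [(weight, perm)] with sum of weight * P(perm) equal to matrix."""
    n = len(matrix)
    rem = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in rem for x in row):
        support = [[j for j in range(n) if rem[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support, n)
        if perm is None:
            raise ValueError("matrix is not doubly stochastic")
        weight = min(rem[i][perm[i]] for i in range(n))
        for i in range(n):
            rem[i][perm[i]] -= weight  # zeroes at least one entry per round
        terms.append((weight, perm))
    return terms


F = Fraction
example = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 4), F(1, 2)],
    [F(1, 4), F(1, 4), F(1, 2)],
]
for weight, perm in bvn_decompose(example):
    print(weight, perm)
```

Each round removes at least one nonzero entry, so the loop ends after at most N^2 rounds; the weights it returns sum to 1 and can be fed directly to an online fair scheduler.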

Example
Consider the following example:
[Figure: example traffic matrix and its decomposition into weighted permutation matrices]
Use weighted fair queueing to schedule each permutation matrix proportionally to its corresponding weight
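One way to picture the weighted schedule is a virtual-finish-time rule in the spirit of weighted fair queueing: every permutation is treated as always backlogged, and the one with the smallest finish time is served next. This is a simplified sketch, not the scheduler from the talk; the weights below are hypothetical.

```python
import heapq
from fractions import Fraction


def schedule_permutations(weights, num_slots):
    """Return the index of the permutation to use in each of num_slots slots."""
    # Each permutation k starts with virtual finish time 1 / weight_k.
    heap = [(1 / Fraction(w), k) for k, w in enumerate(weights)]
    heapq.heapify(heap)
    order = []
    for _ in range(num_slots):
        finish, k = heapq.heappop(heap)  # smallest finish time goes next
        order.append(k)
        heapq.heappush(heap, (finish + 1 / Fraction(weights[k]), k))
    return order


# Hypothetical weights for three permutation matrices.
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
order = schedule_permutations(weights, 8)
print(order)
print([order.count(k) for k in range(len(weights))])
```

Over a window of 8 slots the three permutations are used 4, 2, and 2 times, matching their weights of 1/2, 1/4, and 1/4.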

Distributed Storage and Scheduling
Distributed storage: each input linecard stores only its corresponding “rows”
Distributed scheduling: each input linecard is responsible only for scheduling its own VOQs
O(1) time/hardware complexity: use deficit round-robin scheduling (many efficient variants)
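For readers unfamiliar with deficit round-robin, here is a minimal sketch of the classic algorithm applied to one input linecard's VOQs. The class name, the byte-sized quanta, and the example traffic are assumptions for illustration; the talk does not specify which DRR variant is used.

```python
from collections import deque


class DeficitRoundRobin:
    """Classic DRR over the VOQs of a single input linecard (sketch)."""

    def __init__(self, quanta):
        self.quanta = dict(quanta)  # VOQ id -> bytes credited per round
        self.queues = {voq: deque() for voq in self.quanta}
        self.deficit = {voq: 0 for voq in self.quanta}
        self.active = deque()  # backlogged VOQs in service order

    def enqueue(self, voq, size):
        if not self.queues[voq]:
            self.active.append(voq)  # VOQ becomes backlogged
        self.queues[voq].append(size)

    def serve_round(self):
        """Visit each backlogged VOQ once; return the (voq, size) packets sent."""
        sent = []
        for _ in range(len(self.active)):
            voq = self.active.popleft()
            self.deficit[voq] += self.quanta[voq]
            queue = self.queues[voq]
            while queue and queue[0] <= self.deficit[voq]:
                size = queue.popleft()
                self.deficit[voq] -= size
                sent.append((voq, size))
            if queue:
                self.active.append(voq)  # still backlogged, keep its deficit
            else:
                self.deficit[voq] = 0    # idle VOQs do not accumulate credit
        return sent


# Hypothetical example: VOQ 0 is reserved twice the rate of VOQ 1.
drr = DeficitRoundRobin({0: 200, 1: 100})
for _ in range(4):
    drr.enqueue(0, 100)
    drr.enqueue(1, 100)
first_round = drr.serve_round()
print(first_round)
```

When each quantum is at least the maximum packet size, every visit sends at least one packet, which is what gives DRR its constant work per packet.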

Birkhoff-von Neumann Emulation
If the traffic matrix is known, an IMS can guarantee 100% throughput and per-flow rates when combined with Birkhoff-von Neumann decomposition and online fair scheduling

Frame-Based Decomposition
If the traffic matrix can be converted to an integer matrix by multiplying by an integer F, then it can be decomposed into F permutations
Known decomposition algorithms (if F is an integer multiple of N)
  Birkhoff-von Neumann: O(N^3.5)
  Slepian-Duguid: O(N^3)
  New efficient formulation using edge-coloring: O(N^2 log N)
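A small worked instance may help. The matrix below is a hypothetical example with N = 3 and F = 3 (not taken from the talk), and the symbol \Lambda for the traffic matrix is an assumed notation. Every row and column of F\Lambda sums to F, and it splits into exactly F permutation matrices, one per time slot of the frame:

```latex
\[
F\Lambda =
\begin{pmatrix} 2 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
+
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
+
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad F = 3 .
\]
```

In edge-coloring terms, F\Lambda is the adjacency matrix of an F-regular bipartite multigraph between inputs and outputs, and each permutation matrix is one color class.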

Conclusions
Scalability
  IMS leverages the scalability of fixed optical meshes
  If the traffic matrix is known, distributed online scheduling can achieve O(1) time and hardware complexity
Emulation
  IMS can emulate any IQ or CIOQ switch under the same speedup and matching
Guarantees
  If the traffic matrix is known, can ensure 100% throughput, service guarantees, and packet ordering via Birkhoff-von Neumann switch emulation
  For integer matrices, new edge-coloring formulation

Thank You