Slide 1
A Load-Balanced Switch with an Arbitrary Number of Linecards
Isaac Keslassy, Shang-Tse (Da) Chuang, Nick McKeown
Stanford University
Slide 2
Stanford 100Tb/s Router
“Optics in Routers” project: http://yuba.stanford.edu/or/
Some challenging numbers:
- 100 Tb/s
- R = 160 Gb/s linecard rate
- N = 640 linecards
- Performance guarantees
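As a sanity check on how these numbers fit together: N × R = 640 × 160 Gb/s = 102.4 Tb/s, i.e., roughly the 100 Tb/s of aggregate capacity in the project's name.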
Slide 3
Router Wish List
- Scale to High Linecard Speeds
  - No Centralized Scheduler
  - Optical Switch Fabric
  - Low Packet-Processing Complexity
- Scale to High Number of Linecards
  - High Number of Linecards
  - Arbitrary Arrangement of Linecards
- Provide Performance Guarantees
  - 100% Throughput Guarantee
  - Delay Guarantee
  - No Packet Reordering
Slide 4
Load-Balanced Switch
[Figure: two-stage switch made of a load-balancing mesh followed by a forwarding mesh; external links run at rate R, each internal mesh link at rate R/N.]
Slide 5
Load-Balanced Switch
[Figure: the same two-mesh diagram as the previous slide, with numbered packets shown crossing the load-balancing mesh and then the forwarding mesh.]
Slide 6
Combining the Two Meshes
[Figure: the two meshes are folded together so that each physical linecard ("one linecard") holds both an "In" and an "Out" stage; external links at rate R, mesh links at rate R/N.]
Slide 7
A Single Combined Mesh
[Figure: all linecards connected by a single mesh; external links run at rate R and each mesh link runs at rate 2R/N, carrying both load-balancing and forwarding traffic.]
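To make the two logical stages concrete, here is a minimal Python sketch of the combined operation (an illustration under assumptions of my own, not the authors' implementation: the value N = 4, the round-robin spreading pointer, the per-output FIFOs at the intermediate stage, and the skip-empty-queue service are all simplifications):

from collections import deque

N = 4  # number of linecards; each one plays input, intermediate, and output roles

spread_ptr = [0] * N                                    # stage 1: next intermediate for each input
voq = [[deque() for _ in range(N)] for _ in range(N)]   # voq[intermediate][output] buffers
serve_ptr = [0] * N                                     # stage 2: next output served by each intermediate

def arrive(inp, out, pkt):
    # Stage 1 (load-balancing mesh): the input spreads packets over intermediates
    # round-robin, regardless of the packet's final destination.
    mid = spread_ptr[inp]
    spread_ptr[inp] = (mid + 1) % N
    voq[mid][out].append(pkt)

def forward_one_slot():
    # Stage 2 (forwarding mesh): each intermediate sends at most one packet per slot
    # toward the outputs, serving them round-robin.
    delivered = []
    for mid in range(N):
        for k in range(N):
            out = (serve_ptr[mid] + k) % N
            if voq[mid][out]:
                delivered.append((out, voq[mid][out].popleft()))
                serve_ptr[mid] = (out + 1) % N
                break
    return delivered

# Even if every input sends to the same output, the load spreads evenly over intermediates.
for i in range(N):
    arrive(i, 0, "pkt-from-%d" % i)
print(forward_one_slot())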
Slide 8
References on Early Work
Initial work:
- C.-S. Chang, D.-S. Lee and Y.-S. Jou, “Load Balanced Birkhoff-von Neumann Switches, Part I: One-Stage Buffering,” Computer Communications, vol. 25, pp. 611-622, 2002.
Sigcomm’03:
- I. Keslassy, S.-T. Chuang, K. Yu, D. Miller, M. Horowitz, O. Solgaard and N. McKeown, “Scaling Internet Routers Using Optics,” ACM SIGCOMM ’03, Karlsruhe, Germany, August 2003.
Slide 9
Summary of Early Work
- Scheduler (both works): no centralized scheduler
- Architecture:
  - Initial work (C.-S. Chang et al.): crossbar-based architecture
  - Sigcomm’03: mesh-based architecture => no reconfiguration; single mesh
- Performance guarantees:
  - Initial work: 100% throughput guarantee for weakly-mixing traffic
  - Sigcomm’03: 100% throughput guarantee for any adversarial traffic; average delay within a constant of an output-queued router; no packet reordering
Slide 10
Router Wish List
- Scale to High Linecard Speeds
  - No Centralized Scheduler
  - Optical Switch Fabric
  - Low Packet-Processing Complexity
- Scale to High Number of Linecards
  - High Number of Linecards
  - Arbitrary Arrangement of Linecards
- Provide Performance Guarantees
  - 100% Throughput Guarantee
  - Delay Guarantee
  - No Packet Reordering
Slide 11
Example: N = 8
[Figure: 8 linecards connected by a single mesh; each mesh link runs at 2R/8.]
Slide 12
When N is Too Large
Decompose into groups (or racks).
[Figure: the 8 linecards are split into two groups of four; the 2R per-linecard links are aggregated into 4R inter-group links.]
Slide 13
When N is Too Large
Decompose into groups (or racks).
[Figure: G groups (racks), each holding linecards 1…L with 2R per linecard; each group sources 2RL of mesh traffic in total, and each group-to-group link carries 2RL/G.]
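As a sanity check on these rates (using the project's R = 160 Gb/s and, as an assumed split, N = 640 linecards divided into G = 40 groups of L = 16): each group sources 2RL = 2 × 160 Gb/s × 16 = 5.12 Tb/s of mesh traffic, and spreading it uniformly over the G groups leaves 2RL/G = 5.12 Tb/s / 40 = 128 Gb/s on each group-to-group link.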
Slide 14
Router Wish List
- Scale to High Linecard Speeds
  - No Centralized Scheduler
  - Optical Switch Fabric
  - Low Packet-Processing Complexity
- Scale to High Number of Linecards
  - High Number of Linecards
  - Arbitrary Arrangement of Linecards
- Provide Performance Guarantees
  - 100% Throughput Guarantee
  - Delay Guarantee
  - No Packet Reordering
Slide 15
When Linecards are Missing
Failures, incremental additions, and removals…
[Figure: the same G-group architecture, with some linecards missing from their racks.]
Solution: replace the mesh with a sum of permutations. The uniform inter-group mesh, with 2RL/G between every pair of groups, is written as the sum of G permutations, each of capacity 2RL/G (so the total per group is G × 2RL/G = 2RL).
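A toy illustration of that decomposition, assuming a uniform mesh and circular-shift permutations (the G = 4 and the unit per-entry capacity are placeholders of mine; the paper's construction additionally copes with groups of unequal size):

import numpy as np

G = 4            # number of groups (illustrative)
cap = 1.0        # capacity of one mesh entry, i.e., 2RL/G in the slide's notation

mesh = np.full((G, G), cap)   # uniform mesh: 2RL/G between every pair of groups

# G fixed permutations; permutation k connects group i to group (i + k) mod G,
# so each one can be realized by a single static switch configuration.
perms = [np.roll(np.eye(G), k, axis=1) for k in range(G)]

# The mesh is exactly the sum of the G permutations, each weighted by 2RL/G.
assert np.allclose(mesh, cap * sum(perms))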
Slide 16
Hybrid Electro-Optical Architecture Using MEMS Switches
[Figure: within each group/rack, linecards 1…L connect electronically at 2R each; the racks are interconnected optically through MEMS switches.]
Slide 17
When Linecards are Missing
[Figure: the same MEMS-based architecture, shown with some linecards missing from the racks.]
Slide 18
Router Wish List
- Scale to High Linecard Speeds
  - No Centralized Scheduler
  - Optical Switch Fabric
  - Low Packet-Processing Complexity
- Scale to High Number of Linecards
  - High Number of Linecards
  - Arbitrary Arrangement of Linecards
- Provide Performance Guarantees
  - 100% Throughput Guarantee
  - Delay Guarantee
  - No Packet Reordering
Slide 19
Questions
- Number of MEMS switches?
- TDM schedule?
Slide 20
All Link Capacities Are Equal
[Figure: in each group/rack, the linecards feed a laser/modulator and a MUX onto a fiber toward the MEMS switches; every such link has the same capacity and carries at most 2R.]
Link capacity ≈ 64 λ’s × 5 Gb/s per λ = 320 Gb/s = 2R.
Slide 21
Example: 2 Groups of 2 Linecards
[Figure: two groups/racks, each with linecards 1-2 attached at 2R apiece; the groups are interconnected by aggregated links (4R total per group, 2R per group pair).]
Slide 22
Intuition on Worst-Case
[Figure: a worst-case placement: one full group/rack with L linecards (2RL of traffic) alongside G-1 groups that hold a single linecard each (2R apiece), all interconnected through the MEMS switches.]
Slide 23
Number of MEMS Switches
Theorem: M ≤ L + G - 1 (with L linecards per group and G groups).
Examples:
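Illustrative substitutions into the bound (these particular values are not from the slide): with G = 40 groups of up to L = 16 linecards, M ≤ 16 + 40 - 1 = 55 MEMS switches; for the 2-groups-of-2-linecards example of slide 21, M ≤ 2 + 2 - 1 = 3.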
Slide 24
Questions
- Number of MEMS switches?
- TDM schedule?
Slide 25
TDM Schedule
[Figure: the 2-groups-of-2-linecards example, now labeled Group A and Group B, with 2R per linecard and 4R per group.]
Slide 26
TDM Schedule
[Figure: the same Group A / Group B example.]
- Uniform-spreading constraint on linecards
- Constraints on linecards at each time-slot
- Constraints on groups at each time-slot
Slide 27
TDM Schedule
            T+1   T+2   T+3   T+4
Tx LC A1     ?     ?     ?     ?
Tx LC A2     ?     ?     ?     ?
Tx LC B1     ?     ?     ?     ?
Tx LC B2     ?     ?     ?     ?
Tx Group A
Tx Group B
(Each cell is the destination linecard for that transmitter in that time slot.)
Slide 28
TDM Schedule
            T+1   T+2   T+3   T+4
Tx LC A1    A1    A2    B1    B2
Tx LC A2    B2    A1    A2    B1
Tx LC B1    B1    B2    A1    A2
Tx LC B2    A2    B1    B2    A1
Tx Group A
Tx Group B
Slide 29
Bad TDM Schedule
            T+1   T+2   T+3   T+4
Tx LC A1    A1    A2    B1    B2
Tx LC A2    B2    A1    A2    B1
Tx LC B1    B1    B2    A1    A2
Tx LC B2    A2    B1    B2    A1
Tx Group A
Tx Group B
Slide 30
TDM Schedule Algorithm Intuition
1. Create a TDM schedule between groups: “Group A sends to group B”.
2. Assign group connections to specific linecards: “Linecard A1 sends to linecard B3”.
Theorem: There exists a polynomial-time algorithm to find a correct TDM schedule.
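As one concrete illustration of the two steps, here is a closed-form schedule for the special case of G equal-size groups of L linecards each (a sketch of my own under that equal-size assumption; it is not the theorem's polynomial-time algorithm, which handles arbitrary linecard placements):

from itertools import product

def tdm_slot(g, j, t, G, L):
    # Destination (group, linecard index) for linecard j of group g at slot t,
    # with G equal-size groups of L linecards and a frame of G*L slots.
    # Step 1 (group level): the destination group rotates with t (offset by j), so
    # a group's connections in one slot land on different groups (when L <= G).
    dest_group = (g + j + t) % G
    # Step 2 (linecard level): within the destination group, the index advances
    # once every G slots, so each source visits every destination once per frame.
    dest_index = (j + t // G) % L
    return dest_group, dest_index

# Sanity check on a small case: G = 3 groups of L = 2 linecards, frame of 6 slots.
G, L = 3, 2
frame = range(G * L)
for t in frame:
    dests = {tdm_slot(g, j, t, G, L) for g, j in product(range(G), range(L))}
    assert len(dests) == G * L        # every slot is a full permutation (per-slot constraints)
for g, j in product(range(G), range(L)):
    visited = {tdm_slot(g, j, t, G, L) for t in frame}
    assert len(visited) == G * L      # uniform spreading: each destination visited exactly once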
Slide 31
Algorithm Running Time
[Plot: running time in milliseconds versus number of linecards, with worst-case, average-case, and best-case curves. Verilog simulation; linecard placement generated uniformly at random among 40 groups; 4 ns clock cycle; 1000 runs per case. Source: Srikanth Arekapudi.]
Slide 32
Open Questions
- Greedy TDM algorithm with more capacity?
- A better switch fabric architecture?
Slide 33
Thank you.