Published by Brianne Harvey. Modified over 9 years ago.
1
Proteus: A Topology Malleable Data Center Network
Ankit Singla (University of Illinois Urbana-Champaign)
Atul Singh, Kishore Ramachandran, Lei Xu, Yueping Zhang (NEC Labs, Princeton)
2
Data Centers
Data centers: foundation of Internet services and enterprise operation
–Need good bandwidth connectivity between servers
3
“Good” Bandwidth Connectivity
Connect all servers at full bandwidth? Fat-trees [SIGCOMM 2008], VL2 [SIGCOMM 2009]
Callouts: Cabling complexity. Upgrade to 40/100-GigE? Power consumption?
4
Oversubscribed Networks
Is all-to-all full bandwidth connectivity always necessary?
–Small number of ‘hot’ ToR-ToR connections: Flyways [HotNets 2009]
–>90% of bytes flow in ‘elephant flows’: VL2 [SIGCOMM 2009]
–~60% of ToRs see <20% change in traffic for between 1.6 and 2.2 sec: The Case for Fine-grained TE in Data Centers [WREN 2010]
Flyways [HotNets 2009], c-Through and Helios [SIGCOMM 2010]: supplement the electrical network with wireless/optics
–Wireless/optical connections are set up between hot ToRs
–Some flexibility to adjust to changes in the traffic matrix
5
Proteus
Proteus is a novel interconnect above the ToR layer
–Topology adjusts to traffic demands
–Low cabling complexity
–Easier migration to 40/100-GigE
–Low power consumption
A new design point: all-optics
[Figure: servers attach to ToRs, and the ToRs attach to an optical interconnect]
Proteus is an oversubscribed network with topology malleability
6
Malleability
[Figure: eight ToRs, A through H. On a traffic change, Proteus can change topology, change capacity, and pick routes. Demand labels shown: AG10, BH, CE, DF, BD and AG, BH, CE, GF20, BD10.]
7
Optics: Perfect Fit
1 Gigabit × 64,000
64 Terabits* × 1
* Achieved by NEC Labs and AT&T
Low complexity, reconfigurability, low power consumption
MEMS = Micro-Electro Mechanical Switch
WSS = Wavelength Selective Switch
Callouts: Circuit setup time. Limited wavelengths. Topology management.
[Figure: a MEMS switch and a WSS, each connecting ports A through D]
8
Problem Setting: Container-sized DCN
Proteus-2560: Connect 80 ToRs, each with 32 servers
Typical container size in containerized data center architectures
Image adapted from: www.sun.com/blackbox
9
ToR Perspective
[Figure: a non-blocking ToR with 32 ports for servers and 32 ports towards the optical interconnect]
10
ToR Perspective
[Figure: a non-blocking ToR with O-E-O conversion, carrying intra-rack traffic, cross-rack traffic, and transit traffic (hop-by-hop)]
Transceivers with unique wavelengths
(O-E-O conversions add sub-nanosecond latency at each hop)
Limited by ToR port capacity
11
[Figure: ToR 1 and its optical components. Labeled ToRs: 13, 21, 45, 73 and 67, 11, 29, 55. Legend: incoming, high-capacity link, low-capacity link. Callouts: change topology, change capacity.]
12
Topology (MEMS), bi-directionality (circulators), capacity (WSS)
[Figure: optical components for ToR 26 and ToR 59. Components shown: MUX, WSS, circulators (C), MEMS (320 ports), coupler, DEMUX. Labels: 32, 4, S, R. Links lead to ToR 2 and ToR 31.]
13
Proteus-2560 Properties
Build any 4-regular ToR topology
Each link’s capacity varies in each direction
–Capacity ∈ {10, 20, 30, …, 320} Gbps
–Provided sum of capacities of 4 links <= 320 Gbps
–(Also avoid wavelength contention)
Use hop-by-hop connections to other ToRs
–Transit traffic doesn’t interfere with intra-ToR traffic
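The constraints on this slide can be checked mechanically. The following is a minimal sketch, not the authors' tooling. It assumes one wavelength carries 10 Gbps (inferred from 32 interconnect-facing ports and the 320 Gbps cap) and that "wavelength contention" means the same wavelength arriving at a ToR over two links. The names `check_config` and `capacity_gbps`, and the `links` data layout, are hypothetical.

```python
from collections import defaultdict

WAVELENGTHS_PER_TOR = 32  # 32 ports towards the interconnect (from the slides)
GBPS_PER_WAVELENGTH = 10  # assumption: 320 Gbps / 32 wavelengths
DEGREE = 4                # 4-regular ToR topology (from the slides)


def check_config(tors, links):
    """Return a list of constraint violations (empty if the config is valid).

    links maps a directed pair (src, dst) to the set of wavelength indices
    that src sends to dst (hypothetical layout).
    """
    problems = []
    neighbours = defaultdict(set)
    sent = defaultdict(list)      # wavelengths leaving each ToR
    received = defaultdict(list)  # wavelengths arriving at each ToR
    for (src, dst), lams in links.items():
        neighbours[src].add(dst)
        neighbours[dst].add(src)
        sent[src].extend(lams)
        received[dst].extend(lams)
        if not lams:
            problems.append(f"{src}->{dst}: link carries no wavelength")
        if any(not 0 <= lam < WAVELENGTHS_PER_TOR for lam in lams):
            problems.append(f"{src}->{dst}: wavelength index out of range")
    for tor in tors:
        if len(neighbours[tor]) != DEGREE:
            problems.append(f"{tor}: degree {len(neighbours[tor])}, expected {DEGREE}")
        # Distinct in-range wavelengths on the send side imply that the
        # four outgoing links sum to at most 32 * 10 = 320 Gbps.
        if len(sent[tor]) != len(set(sent[tor])):
            problems.append(f"{tor}: a wavelength is sent on two links")
        # Assumed meaning of contention: one wavelength arriving twice.
        if len(received[tor]) != len(set(received[tor])):
            problems.append(f"{tor}: wavelength contention on the receive side")
    return problems


def capacity_gbps(links, src, dst):
    """Capacity of the directed link src->dst, in steps of 10 Gbps."""
    return GBPS_PER_WAVELENGTH * len(links.get((src, dst), ()))
```

Because capacity is counted per direction, a link from A to B may carry more wavelengths than the link from B to A, which matches the slide's statement that capacity varies in each direction.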
14
Topology Management
Complex problem: all configurations are interdependent (MEMS, WSS, hop-by-hop routing)
We formulate the problem as a mixed-integer linear program
Describe a heuristic approach backed by graph-theoretic insights
–Likely to take under a couple of hundred milliseconds
15
Heuristic Approach – Key Ideas
Topology: Weighted 4-matching over hot ToR-ToR connections
–Check and correct for connectivity
Routing: Can use shortest paths
–Ideally, need low-congestion routing schemes
Capacities: Graph edge-coloring over wavelengths
–Ensure each link carries at least one wavelength
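As an illustration of the topology step, here is a rough sketch that swaps in a simple greedy rule for the weighted 4-matching: keep the heaviest ToR pairs while no ToR exceeds four links, then check connectivity. The slides do not specify the matching algorithm or how connectivity is corrected, so the greedy choice, the function names, and the `demand` layout are all assumptions, and the correction step is left out.

```python
from collections import defaultdict, deque


def greedy_b_matching(demand, b=4):
    """Pick ToR pairs in decreasing order of demand, at most b links per ToR.

    demand maps an unordered ToR pair (u, v) to a traffic volume.
    This greedy rule is a heuristic stand-in, not an optimal weighted
    b-matching.
    """
    degree = defaultdict(int)
    chosen = []
    for (u, v), _ in sorted(demand.items(), key=lambda kv: -kv[1]):
        if u != v and degree[u] < b and degree[v] < b:
            chosen.append((u, v))
            degree[u] += 1
            degree[v] += 1
    return chosen


def components(nodes, edges):
    """Connected components of the chosen topology, found by BFS."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        comps.append(comp)
    return comps
```

If `components` returns more than one component, the topology is disconnected and some chosen links would have to be swapped for links that bridge the components, which is the "correct for connectivity" step named on the slide.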
16
Preliminary Analysis
Cabling: #Fibers ≈ 1/5th #cables in a fat-tree
Ease of upgrade: When ToRs move to 40/100-GigE, nothing else changes!
Cost: similar to a fat-tree
–Optics is yet to benefit from commoditization
–To some extent, dispels the “optics is expensive” myth
Power: 50% of fat-tree power consumption
Fat-tree is also fault tolerant though
17
Conclusion, Ongoing Work
A novel data center architecture
–Unprecedented topology flexibility
–Reduced cabling complexity
–Easier migration to 40/100-GigE
–Reduced power consumption
–Explores a new design point: all-optics
Experimental evaluation
Incremental update heuristics
Mega-data-center scale
Fault tolerance
Callouts: Transient behavior? Routing? Synchronization?
18
Thank You! Questions?
19
Extras / Backup
20
Hop-by-hop Through ToRs
MEMS: limited end-to-end circuits
Need hop-by-hop routes over these circuits
Feasibility assessment: works fine!
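Hop-by-hop routing over the circuits can be sketched as a shortest-path search on the current ToR topology. The example below uses plain breadth-first search, in line with the earlier remark that shortest paths can be used. The function name and the `links` layout are hypothetical, and ToR names are assumed to be sortable (for example, strings).

```python
from collections import deque


def shortest_hop_path(links, src, dst):
    """Return the fewest-hop list of ToRs from src to dst, or None.

    links is an iterable of undirected ToR pairs, one per optical circuit.
    """
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to src, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):  # sorted for determinism
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None
```

Every ToR strictly between the first and last entries of the returned path carries the flow as transit traffic, so a path of length `n` implies `n - 2` intermediate O-E-O conversions.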
21
Helios [SIGCOMM ’10]
Pods are still fat-trees
Requires design-time decision on stable vs. unstable traffic
Does not exploit multi-hop optical routes
Does not leverage WSS technology for variable capacity
Image from “Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers”, Farrington et al.