Subways: A Case for Redundant, Inexpensive Data Center Edge Links. Vincent Liu, Danyang Zhuo, Simon Peter, Arvind Krishnamurthy, Thomas Anderson. University of Washington.

Presentation transcript:

Subways: A Case for Redundant, Inexpensive Data Center Edge Links Vincent Liu, Danyang Zhuo, Simon Peter, Arvind Krishnamurthy, Thomas Anderson University of Washington

Data Centers Are Growing Quickly
Data center networks need to be scalable
Upgrades need to be incrementally deployable
What’s worse: workloads are often bursty

Today’s Data Center Networks
Oversubscribed: servers can send more than the network can handle
Locality within a rack and/or cluster
Capacity upgrades are often “rip-and-replace”
[Figure labels: Top-of-Rack (ToR) Switches, Cluster Switches, Racks of Servers, Cluster Fabric Switches]

Could we upgrade by augmenting servers with multiple links?

Strawman: Trunking
Add a parallel connection
Requires rewiring of existing links


Subways
Instead of having all links go to the same ToR, use an overlapping pattern
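To make the contrast with trunking concrete, here is a minimal sketch of the two wiring patterns. The loop layout used here (port k of a server in rack r goes to ToR (r + k) mod num_racks) is an assumption chosen for illustration, and the function names are hypothetical; the slides do not specify the exact pattern.

```python
def trunk_wiring(num_racks, ports):
    """Trunking: every port of a server in rack r goes to that rack's own ToR."""
    return {r: [r] * ports for r in range(num_racks)}


def subways_wiring(num_racks, ports):
    """Overlapping pattern (assumed loop layout): port k of a server in
    rack r goes to ToR (r + k) mod num_racks."""
    return {r: [(r + k) % num_racks for k in range(ports)]
            for r in range(num_racks)}


if __name__ == "__main__":
    print(trunk_wiring(4, 2))    # {0: [0, 0], 1: [1, 1], 2: [2, 2], 3: [3, 3]}
    print(subways_wiring(4, 2))  # {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 0]}
```

In the overlapping case each rack shares a ToR with its neighbors, which is the property the rest of the talk builds on.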

Advantages of Subways
Incremental upgrades
Short paths to more nodes
Less traffic in the network backbone
Better statistical multiplexing
A more even split of remaining traffic
Incremental upgrades and better-than-proportional performance gain

Roadmap
How do we wire servers to ToRs? Our wiring method uses incrementally deployable, short wires
How can we use multiple ToRs? Our routing protocols increase the number of short paths and better balance the remaining load
What about the rest of the network?


Subways Physical Topology

Roadmap
How do we wire servers to ToRs? Our wiring method uses incrementally deployable, short wires
How can we use multiple ToRs? Our routing protocols increase the number of short paths and better balance the remaining load
What about the rest of the network?

Local Traffic
Always prefer shorter paths
Subways creates short paths to more nodes ⇒ Less traffic in the oversubscribed network
[Figure labels: Single link or trunk, Subways]
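The "short paths to more nodes" claim can be checked on a toy topology. The sketch below counts which racks a server can reach through a single shared ToR, without entering the oversubscribed part of the network. The loop wiring and the helper names are assumptions for illustration, not the authors' code.

```python
def subways_wiring(num_racks, ports):
    """Assumed loop layout: port k of rack r goes to ToR (r + k) mod num_racks."""
    return {r: [(r + k) % num_racks for k in range(ports)]
            for r in range(num_racks)}


def racks_one_tor_away(wiring, rack):
    """Racks (including `rack` itself) that share at least one ToR with `rack`."""
    mine = set(wiring[rack])
    return sorted(r for r, tors in wiring.items() if mine & set(tors))


if __name__ == "__main__":
    trunk = {r: [r, r] for r in range(4)}  # both ports go to the rack's own ToR
    subways = subways_wiring(4, 2)
    print(racks_one_tor_away(trunk, 0))    # [0]
    print(racks_one_tor_away(subways, 0))  # [0, 1, 3]
```

With two ports per server, rack 0 reaches three racks over a single ToR instead of one, so more traffic can stay local.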

Uniform Random
Simple
Doesn’t use capacity optimally if there are 2+ hot racks
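A small worked example shows why a uniform split can be suboptimal with two hot racks. The sketch assumes a loop wiring in which rack r attaches to ToRs r and r + 1, and it models uniform random selection by its expected value (each attached ToR receives an equal share of the rack's demand). All names and numbers are illustrative.

```python
from collections import defaultdict


def uniform_tor_load(wiring, demand):
    """Expected per-ToR load when each rack splits its demand evenly
    across the ToRs it is wired to."""
    load = defaultdict(float)
    for rack, d in demand.items():
        tors = wiring[rack]
        for t in tors:
            load[t] += d / len(tors)
    return dict(load)


if __name__ == "__main__":
    # Assumed wiring: rack r attaches to ToRs r and (r + 1) mod 4.
    wiring = {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 0]}
    # Racks 0 and 1 are both hot, each with one unit of demand.
    print(uniform_tor_load(wiring, {0: 1.0, 1: 1.0}))
    # {0: 0.5, 1: 1.0, 2: 0.5}
```

The shared ToR 1 carries twice the load of its neighbors, even though ToRs 0 and 2 have spare capacity.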


Adaptive Load Balancing
Using either MPTCP or Weighted-ECMP
Spreads load more effectively
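The effect of a load-aware split can be illustrated with a toy greedy allocator. This is not MPTCP or Weighted-ECMP; it is a simplified stand-in, assumed here for illustration, that sends each unit of demand to the currently least-loaded attached ToR. The wiring and demand values are hypothetical.

```python
def adaptive_tor_load(wiring, demand_units):
    """Toy load-aware split: racks take turns placing one unit of demand
    on whichever of their attached ToRs is currently least loaded."""
    load = {}
    remaining = dict(demand_units)
    while any(remaining.values()):
        for rack in sorted(remaining):
            if remaining[rack] == 0:
                continue
            # Ties go to the first ToR in the rack's wiring list.
            tor = min(wiring[rack], key=lambda t: load.get(t, 0))
            load[tor] = load.get(tor, 0) + 1
            remaining[rack] -= 1
    return load


if __name__ == "__main__":
    # Assumed wiring: rack r attaches to ToRs r and (r + 1) mod 4.
    wiring = {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 0]}
    # Two hot racks with 300 units of demand each.
    print(adaptive_tor_load(wiring, {0: 300, 1: 300}))
    # {0: 200, 1: 200, 2: 200}
```

For the same two hot racks, the 600 units end up spread evenly over three ToRs (200 each), whereas an even per-rack split would put 300 on the shared ToR.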

Detours
Offload traffic to nearby ToRs
Detours can overcome oversubscription

Roadmap
How do we wire servers to ToRs? Our wiring method uses incrementally deployable, short wires
How can we use multiple ToRs? Our routing protocols take advantage of short paths and better balance the remaining load
What about the rest of the network?

Wiring ToRs into the Backbone: Type 1
Wire all ToRs into the same cluster
Routing is unchanged
Cluster may need to be rewired

Wiring ToRs into the Backbone: Type 2
Just like server-ToR wiring, cross-wire adjacent ToRs to different clusters
Incremental cluster deployment, short paths, and statistical multiplexing
Routing is more complex

Evaluation

Evaluation Methodology
Packet-level simulator
2 ports per server, 15 servers per rack
3 levels of 10 GbE switches
Validated using a small CloudLab testbed

How Does Subways Compare to Other Upgrade Paths?
90-node MapReduce shuffle-like workload
For this workload, Subways achieves a superlinear speedup

Other Questions We Address
How sensitive is Subways to job size?
How sensitive is it to loop size?
Is it better than multihoming/MC-LAG?
How do performance effects scale with port count?
Does the degree of oversubscription have an effect on the benefits of Subways?
How much CPU overhead does detouring add?

Subways
Wire multiple links to overlapping ToRs
Enables incremental upgrades
Short paths to more nodes
Better statistical multiplexing
Superlinear speedup depending on workload