Datacenter Network Topologies

Datacenter Network Topologies
Costin Raiciu
Advanced Topics in Distributed Systems

Datacenter apps have dense traffic patterns
- Map-reduce jobs, shuffle phase: once the mappers finish, reducers must contact every mapper and download data. All-to-all communication!
- One-to-many: scatter-gather workloads (web search, etc.)
- One-to-one: filesystem reads/writes

Flexibility is Important in Data Centers
- Apps are distributed across thousands of machines.
- Flexibility: we want any machine to be able to play any role.
- But traditional data center topologies are tree-based, and they don't cope well with non-local traffic patterns.

Traditional Data Center Topology
[Figure: core switch, 10Gbps links, aggregation switches, 10Gbps links, top-of-rack switches, 1Gbps links, racks of servers]

Problems in Traditional Solutions
- They lack robustness: aggregation switch failures wipe out entire racks.
- They lack performance: oversubscription = max_throughput / worst_case_throughput. Typical oversubscription ratios are 4:1 and 8:1.
- They are expensive: 7K for a 48-port Gigabit switch, 700K for a 128-port 10 Gigabit switch.
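To make the oversubscription definition concrete, here is a minimal sketch for a single rack. The rack size and link speeds are illustrative assumptions, not figures from the slides, and `oversubscription` is a hypothetical helper name.

```python
def oversubscription(host_count, host_gbps, uplink_gbps):
    """Ratio of what the hosts could send to what the uplinks can carry."""
    # Best case: every host sends at line rate (e.g. to a rack-local peer).
    max_throughput = host_count * host_gbps
    # Worst case: all traffic leaves the rack and shares the uplink capacity.
    worst_case_throughput = uplink_gbps
    return max_throughput / worst_case_throughput


# Assumed example: 40 servers at 1 Gbps behind a single 10 Gbps uplink.
print(oversubscription(40, 1, 10))  # 4.0, i.e. a 4:1 ratio
```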

Want a datacenter network that:
- Offers full bisection bandwidth: an oversubscription ratio of 1:1. Worst case: every host can talk to every other host at line rate!
- Is fault tolerant
- Is cheap

The Fat Tree [Al-Fares et al., SIGCOMM 2008]
- Inspired by the telephone networks of the 1950s: Clos networks
- Uses cheap, commodity switches; all switches are the same
- Lots of redundancy
- A single parameter describes the topology: K, the number of ports in a switch

Fat Tree Topology [Al-Fares et al., 2008; Clos, 1953]
[Figure: fat tree with K=4: K pods with K switches each, aggregation switches, racks of servers, 4 x 1Gbps links]
Speaker notes: show the multiple paths between servers; say that the network is a rearrangeably non-blocking Clos network.

Fat Tree Properties
- Number of hosts = (K/2 hosts per lower-pod switch) x (K/2 lower-pod switches per pod) x (K pods) = K^3/4
- Full bisection bandwidth
- Topology is rearrangeably non-blocking
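The host count follows directly from the three factors on this slide. The sketch below multiplies them out. The core-switch count of (K/2)^2 is not on the slide; it is taken from the Al-Fares et al. paper. `fat_tree_size` is a hypothetical helper name.

```python
def fat_tree_size(k):
    """Element counts for a fat tree built from k-port switches."""
    if k % 2 != 0:
        raise ValueError("k must be even")
    hosts_per_edge_switch = k // 2
    edge_switches_per_pod = k // 2
    aggregation_switches_per_pod = k // 2
    pods = k
    hosts = hosts_per_edge_switch * edge_switches_per_pod * pods  # k^3 / 4
    core_switches = (k // 2) ** 2  # from the paper, not from the slide
    pod_switches = pods * (edge_switches_per_pod + aggregation_switches_per_pod)
    return {
        "hosts": hosts,
        "core_switches": core_switches,
        "switches": pod_switches + core_switches,
    }


for ports in (4, 24, 32, 48):
    print(ports, fat_tree_size(ports))
```

For K=4 this gives 16 hosts and 20 switches; the 24, 32 and 48 port cases give 3456, 8192 and 27648 hosts.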

The Fat Tree Topology has k*k/4 paths between any two endpoints in different pods
[Figure: K pods with K switches each, aggregation switches, racks of servers, 1Gbps links]
Speaker notes: show the multiple paths between servers; say that the network is a rearrangeably non-blocking Clos network.

Routing
- How do hosts access different paths?
- Basic solution at Layer 2: Spanning Tree Protocol. Anything wrong with this?
- Say we come up with a proper L2 solution that offers multiple paths. What about L2 broadcasts (e.g. ARP)?
- Layer 2 still might be desirable, though: some apps expect servers to be in the same LAN.

Multipath Routing at Layer 3
- Run a link-state routing protocol (e.g. OSPF) on the switches (routers) and compute the shortest path to any destination.
- Drawback: must use smarter, more expensive switches!
- Equal Cost Multipath Routing (ECMP): when there are multiple shortest paths, pick one "randomly" by hashing the packet header. All packets of the same flow go on the same path.
- Why not use per-packet ECMP?
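A minimal sketch of per-flow ECMP as described above: hash the header fields that identify a flow and use the result to index into the list of equal-cost next hops. The SHA-256 hash, the 5-tuple layout and the name `ecmp_next_hop` are assumptions made for illustration; real switches use their own hardware hash functions.

```python
import hashlib


def ecmp_next_hop(five_tuple, next_hops):
    """Pick one of several equal-cost next hops for a flow.

    five_tuple: (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops:  list of equal-cost next hops (any hashable labels)
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


flow = ("10.0.1.2", "10.2.0.3", 6, 40000, 80)
uplinks = ["agg-0", "agg-1"]
# Every packet of the same flow hashes to the same uplink, so packets of one
# TCP connection are not reordered across paths.
print(ecmp_next_hop(flow, uplinks) == ecmp_next_hop(flow, uplinks))  # True
```

Because the choice depends only on the header, two large flows can hash onto the same link while another link stays idle.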

Novel Layer 2 solutions
- TRILL: IETF standard in the making
- Switches act as "Routing Bridges"
- Run IS-IS between them to compute multiple paths
- ECMP to place flows on different paths!
- Cons: switch support still missing today

VL2 Topology [Greenberg et al., SIGCOMM 2009]
[Figure: VL2 topology; labels: 10Gbps links, 20 hosts]

Performance (ECMP routing)
- All-to-all traffic matrix: every host sends to every other host. Every host link is fully utilized and the network runs at 100% (both VL2 and FatTree).
- Many-to-one traffic: limited by the host NIC.
- Permutation traffic matrix: every host sends to, and receives from, a single other host over a long-running TCP connection. Average network utilization: FatTree 40%, VL2 80%.

Single-path TCP collisions reduce throughput

Comparison between FatTree and VL2
                 FatTree                VL2
Full bisection   Yes                    Yes
Switches         Commodity              Top-end (20 GigE ports, 2 10GigE ports)
Routing          ECMP (with problems)   ECMP seems enough
Cabling          Tons of cables         Much simpler

Jellyfish [Singla et al., NSDI 2012]

Incremental expansion
- Facebook adding capacity "daily"
- Easy to add servers, but what about the network?
- Structured topologies constrain expansion: K^3/4 servers for a K-port Fat Tree
  - 24 ports: 3456 servers
  - 32 ports: 8192 servers
  - 48 ports: 27648 servers
- Workarounds: leave ports free for later, or oversubscribe the network

Jellyfish Key Idea: forget about structure

Jellyfish example

Jellyfish overview
- Each 4L-port switch connects to L hosts and to 3L other random switches
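A minimal sketch of how such a random switch-to-switch graph could be wired: repeatedly pick a random pair of switches that both have free ports and are not yet connected, and link them. The function name, parameters and seed are hypothetical. The Jellyfish paper also describes a rewiring step that uses up ports left over at the end; that step is omitted here, so a few ports may stay free.

```python
import random


def build_jellyfish(num_switches, ports_per_switch, hosts_per_switch, seed=0):
    """Randomly interconnect switches; returns (links, free_ports)."""
    rng = random.Random(seed)
    # Ports not used by hosts are available for switch-to-switch links.
    network_ports = ports_per_switch - hosts_per_switch
    free = {s: network_ports for s in range(num_switches)}
    links = set()  # each link stored once as (a, b) with a < b
    while True:
        candidates = [
            (a, b)
            for a in free
            for b in free
            if a < b and free[a] > 0 and free[b] > 0 and (a, b) not in links
        ]
        if not candidates:
            break  # no unconnected pair with free ports remains
        a, b = rng.choice(candidates)
        links.add((a, b))
        free[a] -= 1
        free[b] -= 1
    return links, free


# With L = 1: 4-port switches, 1 host and 3 network ports each.
links, free = build_jellyfish(num_switches=20, ports_per_switch=4, hosts_per_switch=1)
print(len(links), "links;", sum(free.values()), "ports left free")
```

Rebuilding the candidate list on every iteration is quadratic in the number of switches; that keeps the sketch short but would not suit a large network.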

Building Jellyfish

Jellyfish Performance

Why is Jellyfish better than FatTree?
- Intuition: say we fully utilize all available links in the network
- N: the number of flows getting 1Gbps throughput

Jellyfish has smaller mean path length
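One way to read the intuition, sketched under stated assumptions: if every link is fully used and each flow consumes its rate on every link of its path, then the number of flows that can run at 1Gbps is bounded by the total link capacity divided by the mean path length. The formula is an interpretation of the slides, not a quote from them, and the numbers below are made up for illustration.

```python
def max_flows_at_line_rate(num_links, link_gbps, mean_path_length, flow_gbps=1.0):
    """Upper bound on concurrent flows when all link capacity is used.

    num_links counts unidirectional links; a flow of flow_gbps occupies
    flow_gbps on each of the mean_path_length links it crosses, on average.
    """
    total_capacity = num_links * link_gbps
    capacity_per_flow = flow_gbps * mean_path_length
    return total_capacity / capacity_per_flow


# Same hardware, shorter average paths, more flows at full rate.
print(max_flows_at_line_rate(100, 1.0, 4.0))  # 25.0
print(max_flows_at_line_rate(100, 1.0, 2.5))  # 40.0
```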

Routing in Jellyfish
- Does ECMP still work?
- Use K-shortest paths instead
- Much more difficult to implement! Options: OpenFlow (next week), SPAIN, MPLS-TE

Thinking differently: The BCube datacenter network

BCube
- Key idea: have servers forward packets on behalf of other servers, so we can use very cheap, dumb switches
- BCube(n,k) uses n-port switches and k+1 levels
- Each server has k+1 ports

BCube Topology [Guo et al, Sigcomm 2009]

BCube Properties
- Number of servers: N^(K+1)
- Maximum path length: K+1
- K+1 parallel paths between any two servers
- Is BCube better than FatTree? It depends on the traffic pattern:
  - K+1 times better for many-to-one and one-to-one traffic patterns
  - Same as FatTree for all-to-all and permutation traffic
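A minimal sketch of the BCube(n,k) arithmetic, reading the server count as n to the power k+1. The per-level switch count of n^k and the rule that a level-i switch joins the n servers whose addresses differ only in digit i are taken from the BCube paper, not from the slides. The function names and the tuple address encoding are hypothetical.

```python
def bcube_size(n, k):
    """Element counts for BCube(n, k): n-port switches, k+1 levels."""
    levels = k + 1
    return {
        "servers": n ** (k + 1),
        "levels": levels,
        "switches": levels * n ** k,  # n^k switches per level (from the paper)
        "ports_per_server": levels,
    }


def bcube_neighbors(address, n, level):
    """Servers that share the given server's switch at `level`.

    address is a tuple of k+1 digits in range(n); index i holds the digit
    for level i. Neighbors differ from `address` only in that digit.
    """
    return [
        address[:level] + (digit,) + address[level + 1:]
        for digit in range(n)
        if digit != address[level]
    ]


print(bcube_size(4, 1))
print(bcube_neighbors((0, 0), n=4, level=0))  # [(1, 0), (2, 0), (3, 0)]
```

Each server has k+1 such neighbor sets, one per port, which is where the K+1 parallel paths come from.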

BCube Routing

Issues with BCube
- How do we implement routing? BCube source routing
- How do we pick a path for each flow? Probe all paths briefly, then select the best path

Which topologies are used in practice? [Raiciu et al., HotCloud'12]
- We did a brief study of the Amazon EC2 network topology (us-east-1d)
- Rented many VMs
- Between all pairs we ran traceroute and record route (ping -R)
- Used aliasing techniques to group IPs on the same device

EC2 Measurement results
[Figure: VMs A, B, C and D on Dom0 hosts, a top-of-rack switch (L2), and an edge router (IP)]

EC2 Measurement results
[Figure: top-of-rack switches (L2) and edge routers (IP)]

EC2 Measurement results
[Figure: top-of-rack switches and edge routers]

EC2 Measurement results
[Figure: Internet, core routers, edge routers, and top-of-rack switches]