1
A Scalable, Commodity Data Center Network Architecture
Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat
Presented by Gregory Peaker and Tyler Maclean
2
Overview
Structure and properties of a data center
Desired properties in a DC architecture
Fat-tree based solution
Evaluation of the fat-tree approach
3
Common data center topology
4
Problems with the common DC topology
Single point of failure
Oversubscription of links higher up in the topology
  – Typical oversubscription is between 2.5:1 and 8:1
  – Trade-off between cost and provisioning
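To make the oversubscription ratios concrete, here is a minimal sketch of the arithmetic. It assumes a 1 Gb/s host link and reads an N:1 ratio as each host getting 1/N of its line rate in the worst case; the function name and the 1 Gb/s figure are illustrative, not from the slides.

```python
def per_host_bandwidth_mbps(line_rate_mbps: float, oversubscription: float) -> float:
    """Worst-case bandwidth per host when all hosts send across the oversubscribed layer."""
    if oversubscription < 1:
        raise ValueError("oversubscription ratio must be at least 1 (1:1 is full bandwidth)")
    return line_rate_mbps / oversubscription


# Assumed 1 Gb/s (1000 Mb/s) host links; the ratios are the ones quoted on the slide.
for ratio in (1.0, 2.5, 8.0):
    share = per_host_bandwidth_mbps(1000, ratio)
    print(f"{ratio}:1 -> {share:.0f} Mb/s per host")
```

Under these assumptions, the typical range of 2.5:1 to 8:1 leaves each host between 400 Mb/s and 125 Mb/s of its 1 Gb/s link.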
5
Properties of solutions
Compatible with Ethernet and TCP/IP
Cost effective
  – Low power consumption and heat emission
  – Cheap infrastructure
  – Commodity hardware
Allows host communication at line speed
  – Oversubscription of 1:1
6
Cost of maintaining switches
7
Review of Layer 2 & Layer 3
Layer 2: Data Link Layer (Ethernet, MAC addresses)
  – One spanning tree for the entire network
  – Prevents looping
  – Ignores alternate paths
Layer 3: Network Layer (IP)
  – Shortest-path routing between source and destination
  – Best-effort delivery
8
Fat-Tree Based Solution
Connect end hosts together using a fat-tree topology
  – Infrastructure consists of cheap devices; every port is the same speed
  – All devices can transmit at line speed if packets are distributed along existing paths
  – A fat tree built from k-port switches can support k^3/4 hosts
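The k^3/4 host count can be checked with a short sketch. The per-layer breakdown (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches) follows the construction in the Al-Fares et al. paper rather than this slide; the function and key names are illustrative.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a fat tree built entirely from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,          # k/2 per pod
        "aggregation_switches": k * half,   # k/2 per pod
        "core_switches": half * half,       # (k/2)^2
        "hosts": k * half * half,           # k^3/4: k/2 hosts per edge switch
    }


for k in (4, 48):
    print(k, fat_tree_sizes(k))
```

With k = 48 this gives 27,648 hosts, which matches the "27k node" cluster size quoted in the conclusion.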
9
Fat-Tree Topology
10
Problems with a Vanilla Fat Tree
Layer 3 will only use one of the existing equal-cost paths
Packet reordering occurs if Layer 3 blindly takes advantage of path diversity
  – Creates overhead at the host, as TCP must reorder the packets
11
Modified Fat Tree
Enforce a special addressing scheme in the DC
  – Allows hosts attached to the same switch to route only through that switch
  – Allows intra-pod traffic to stay within the pod
  – Address format: unused.PodNumber.SwitchNumber.EndHost
Use two-level lookups to distribute traffic and maintain packet ordering
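A minimal sketch of the addressing scheme follows. It assumes the paper's conventions, which the slide does not spell out: the unused first octet is 10, pods and edge switches are numbered from 0, and host IDs start at 2. All function names are hypothetical.

```python
def host_addresses(k: int, first_octet: int = 10) -> list:
    """Enumerate host addresses of the form first_octet.pod.switch.host_id."""
    half = k // 2
    return [
        f"{first_octet}.{pod}.{switch}.{host_id}"
        for pod in range(k)                  # k pods
        for switch in range(half)            # k/2 edge switches per pod
        for host_id in range(2, half + 2)    # k/2 hosts per edge switch (assumed to start at 2)
    ]


def locality(a: str, b: str) -> str:
    """Classify how far traffic between two hosts must travel, using only the addresses."""
    _, pod_a, switch_a, _ = a.split(".")
    _, pod_b, switch_b, _ = b.split(".")
    if pod_a == pod_b and switch_a == switch_b:
        return "same edge switch"
    if pod_a == pod_b:
        return "same pod"
    return "different pods"


addrs = host_addresses(4)
print(len(addrs), addrs[0], addrs[-1])
print(locality("10.0.0.2", "10.0.0.3"))
print(locality("10.0.0.2", "10.0.1.2"))
print(locality("10.0.0.2", "10.2.0.2"))
```

Because the pod and switch numbers are encoded in the address, a switch can decide from the destination alone whether traffic stays on the switch, stays in the pod, or must go up to the core.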
12
Two-Level Lookups
First level is a prefix lookup
  – Used to route down the topology to the end host
Second level is a suffix lookup
  – Used to route up towards the core
  – Diffuses and spreads out traffic
  – Maintains packet ordering by using the same port for the same end host
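The two-level lookup can be sketched as below. This is an illustrative software model, not the authors' implementation: the class and method names are hypothetical, the suffix is assumed to be the last address octet (the host ID), and the example entries assume an aggregation switch in pod 2 of a k = 4 fat tree with 10.pod.switch.host addresses.

```python
import ipaddress


class TwoLevelTable:
    """Prefix lookup first; a prefix with no port falls through to a suffix lookup."""

    def __init__(self):
        self.prefixes = []  # (network, port); port None means "consult the suffix table"
        self.suffixes = []  # (last address octet, port)

    def add_prefix(self, cidr, port=None):
        self.prefixes.append((ipaddress.ip_network(cidr), port))
        # Keep longest prefixes first so the first hit is the longest match.
        self.prefixes.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def add_suffix(self, host_octet, port):
        self.suffixes.append((host_octet, port))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        for network, port in self.prefixes:
            if addr in network:
                if port is not None:
                    return port          # terminating prefix: route down
                break                    # non-terminating prefix: go to suffixes
        else:
            raise LookupError(f"no prefix matches {dst}")
        last_octet = int(addr) & 0xFF
        for host_octet, port in self.suffixes:
            if host_octet == last_octet:
                return port              # route up, chosen by host ID
        raise LookupError(f"no suffix matches {dst}")


table = TwoLevelTable()
table.add_prefix("10.2.0.0/24", port=0)   # down to edge switch 0 in this pod
table.add_prefix("10.2.1.0/24", port=1)   # down to edge switch 1 in this pod
table.add_prefix("0.0.0.0/0")             # everything else: use the suffix table
table.add_suffix(2, port=2)               # host ID 2 goes up via port 2
table.add_suffix(3, port=3)               # host ID 3 goes up via port 3

print(table.lookup("10.2.1.3"))  # intra-pod destination
print(table.lookup("10.3.0.2"))  # inter-pod destination, host ID 2
print(table.lookup("10.0.1.3"))  # inter-pod destination, host ID 3
```

Since the upward port depends only on the destination host ID, all packets to one end host leave through the same port, which preserves ordering while different hosts are spread across the uplinks.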
13
Diffusion Optimizations
Flow classification
  – Eliminates local congestion
  – Assigns traffic to ports on a per-flow basis instead of a per-host basis
Flow scheduling
  – Eliminates global congestion
  – Prevents long-lived flows from sharing the same links
  – Assigns long-lived flows to different links
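Per-flow port assignment might look like the sketch below. It assumes a flow is identified by its (source, destination) pair and that a new flow goes to the least-loaded upward port; both are simplifying assumptions, the class name is hypothetical, and any periodic rebalancing of existing flows is omitted.

```python
class FlowClassifier:
    """Pin each flow to one upward port, choosing the least-loaded port for new flows."""

    def __init__(self, uplink_ports):
        self.load = {port: 0 for port in uplink_ports}  # bytes sent per port
        self.flow_port = {}                             # (src, dst) -> port

    def port_for(self, src, dst, size):
        flow = (src, dst)
        port = self.flow_port.get(flow)
        if port is None:
            # New flow: pick the port that has carried the fewest bytes so far.
            port = min(self.load, key=self.load.get)
            self.flow_port[flow] = port
        self.load[port] += size
        return port


clf = FlowClassifier([2, 3])
first = clf.port_for("10.0.0.2", "10.2.0.2", 1500)
second = clf.port_for("10.0.0.3", "10.2.0.2", 1500)
again = clf.port_for("10.0.0.2", "10.2.0.2", 1500)
print(first, second, again)
```

Two flows to the same destination host can now use different uplinks, while every packet of a single flow stays on one port, so packets within a flow are not reordered.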
14
Results: Heat & Power Consumption
15
Implementation
NetFPGA: 4 Gigabit ports, 36 Mb SRAM, 64 MB DDR2, 3GB SATA port
Implemented elements in the Click router software
  – Two-Level Table: initialized with preconfigured information
  – Flow Classifier: distributes output evenly across local ports
  – Flow Report + Flow Schedule: communicate with the central scheduler
16
Evaluation
Purpose: measure bisection bandwidth
Fat-Tree: 10 machines connected to a 48-port switch
Hierarchical: 8 machines connected to a 48-port switch
17
Results
18
Related Work
Myrinet: popular for cluster-based supercomputers
  – Benefit: low latency
  – Cost: proprietary, host responsible for load balancing
InfiniBand: used in high-performance computing environments
  – Benefit: proven to scale, high bandwidth
  – Cost: imposes its own layer 1-4 protocol
  – Uses a fat tree
Many massively parallel computers, such as those from Thinking Machines and SGI, use fat trees
19
Conclusion
The Good: cost per gigabit and energy per gigabit are going down
The Bad: data centers are growing faster than commodity Ethernet devices
Our fat-tree solution
  – Is better: a 27k-node cluster using 10 GigE is technically infeasible with a hierarchical design; we do it for $690M
  – Is faster: equal or higher bandwidth in tests
  – Increases fault tolerance
  – Is cheaper: 20k hosts cost $37M for hierarchical and $8.67M for fat-tree (1 GigE)
  – KOs the competing data centers