Data Center Architectures CIS 700/005 – Lecture 2 Includes material from lectures by Hakim Weatherspoon and Jennifer Rexford
Traditional Data Centers
[figure: tree topology — Internet at top, Layer-3 core routers, Layer-2/3 aggregation switches, Layer-2 access switches, servers at the leaves]
Limitation (1): Oversubscription
Ratio of the worst-case achievable aggregate bandwidth among the end hosts to the total bisection bandwidth of a particular communication topology
Lowers the total cost of the design
Typical designs: factor of 2.5:1 (400 Mbps per host) to 8:1 (125 Mbps per host)
Limitation (2): Fault tolerance
Oversubscription + bigger routers → fewer routers at the top of the tree → a core router failure has a high blast radius
A Scalable, Commodity Data Center Network Architecture Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat Scalable interconnection bandwidth 1:1 oversubscription Economies of scale Backwards compatibility
History Lesson: Clos Networks (1953) Emulate a single huge switch with many smaller switches Add more layers to scale out
Fat-tree Architecture
K-ary fat tree: three-layer topology (edge, aggregation, core)
- each pod consists of (k/2)² servers & 2 layers of k/2 k-port switches
- each edge switch connects to k/2 servers & k/2 aggr. switches
- each aggr. switch connects to k/2 edge & k/2 core switches
- (k/2)² core switches: each connects to k pods
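The counts above can be sketched directly from the slide's formulas; a minimal helper (names are my own, not from the paper) that derives the sizes for any even k:

```python
# Fat-tree sizing from the slide's formulas: k pods, (k/2)^2 servers per pod,
# 2 layers of k/2 switches per pod, and (k/2)^2 core switches.

def fat_tree_sizes(k: int) -> dict:
    assert k % 2 == 0, "k must be even"
    half = k // 2
    return {
        "pods": k,
        "servers_per_pod": half * half,
        "total_servers": k * half * half,   # k^3 / 4
        "edge_switches": k * half,          # k/2 per pod, k pods
        "aggregation_switches": k * half,   # k/2 per pod, k pods
        "core_switches": half * half,
    }

# e.g. commodity 48-port switches support k^3/4 = 27,648 servers
print(fat_tree_sizes(48)["total_servers"])  # 27648
```

This is why the paper's economics argument works: every switch in all three layers is the same cheap k-port commodity part.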
Obligatory Network Questions
How do I address destinations?
- Hierarchical IP addresses for scalability: [PodNumber].[SwitchNumber].[Endhost]
How does a switch route packets?
- Assumption: every routing table entry has 1 output
- Route downward using prefix (for scalability)
- Route upward using suffix (for load balancing)
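The prefix-then-suffix lookup can be sketched as a two-stage table: try the (longest-first) prefix entries for downward routes, and on a miss fall back to matching the host octet to pick an uplink. This is a simplified illustration, not the paper's exact table format; the table layouts are assumptions.

```python
import ipaddress

# Two-level fat-tree lookup sketch:
#  prefix_table: list of (network, port), assumed sorted longest-prefix first
#                (downward routes toward pods/subnets)
#  suffix_table: dict mapping host octet -> uplink port
#                (spreads upward traffic by destination host ID)
def two_level_lookup(dst: str, prefix_table, suffix_table) -> int:
    addr = ipaddress.ip_address(dst)
    for net, port in prefix_table:
        if addr in ipaddress.ip_network(net):
            return port                    # prefix hit: route downward
    host = int(dst.rsplit(".", 1)[1])
    return suffix_table[host]              # prefix miss: route upward by suffix
```

Because the suffix match keys on the host octet, traffic to different hosts in the same remote subnet fans out over different uplinks, which is exactly the load balancing the slide describes.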
Routing Optimizations
Flow classification
- Classify flows (e.g., by src, dst, port #s)
- Move a small set of flows around as needed
Flow scheduling
- Keep track of large, long-lived flows at the edge switches
- Assign them to different links
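Flow classification is typically implemented by hashing the flow's identifying fields so every packet of a flow takes the same path (no reordering) while distinct flows spread across links. A minimal sketch of that idea (the function name and use of MD5 are my choices for illustration):

```python
import hashlib

# Hash a flow's 5-tuple to pick one of n uplinks. All packets of the
# same flow hash to the same link; different flows spread out.
def pick_uplink(src_ip, dst_ip, src_port, dst_port, proto, n_links: int) -> int:
    key = f"{src_ip}:{dst_ip}:{src_port}:{dst_port}:{proto}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % n_links
```

Static hashing alone can collide large flows onto one link, which is why the slide adds dynamic re-assignment (moving a small set of flows) and scheduling of large, long-lived flows.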
VL2: a scalable and flexible data center network
A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta
Let’s take the “single big switch” model to the limit:
- Uniform high capacity
- Performance isolation
- Layer-2 semantics
Virtual Layer 2 Switch (VL2)
The Illusion of a Huge L2 Switch
1. L2 semantics
2. Uniform high capacity
3. Performance isolation
[figure: all servers attached to a single giant virtual L2 switch]
VL2 Goals and Solutions
1. Layer-2 semantics
   Approach: employ flat addressing
   Solution: name-location separation & resolution service
2. Uniform high capacity between servers
   Approach: guarantee bandwidth for hose-model traffic
   Solution: flow-based random traffic indirection (Valiant LB)
3. Performance isolation
   Approach: enforce hose model using existing mechanisms only
   Solution: TCP
“Hose” model: each node has ingress/egress bandwidth constraints
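Two of the mechanisms above can be sketched in a few lines: a directory service that separates a server's flat application address (AA) from its current locator address (LA), and Valiant load balancing, which bounces each flow off a randomly chosen intermediate switch. Class and function names here are illustrative, not VL2's actual APIs:

```python
import random

# Name-location separation: the directory maps flat AAs to topology-
# dependent LAs, so a server can move without changing its AA.
class Directory:
    def __init__(self):
        self.aa_to_la = {}

    def register(self, aa: str, la: str) -> None:
        self.aa_to_la[aa] = la      # re-registered when the server moves

    def resolve(self, aa: str) -> str:
        return self.aa_to_la[aa]

# Valiant load balancing: indirect each flow through a random
# intermediate switch, randomizing load regardless of traffic matrix.
def vlb_path(src_la: str, dst_la: str, intermediates, rng=random):
    mid = rng.choice(intermediates)  # per-flow random indirection
    return [src_la, mid, dst_la]
```

The random intermediate hop is what lets VL2 guarantee uniform capacity for any hose-compliant traffic matrix: no single link can be overloaded by an adversarial communication pattern.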