Data Center Traffic Engineering

Cloud Computing
- Elastic resources
  - Expand and contract resources
  - Pay-per-use
  - Infrastructure on demand
- Multi-tenancy
  - Multiple independent users
  - Security and resource isolation
  - Amortize the cost of the (shared) infrastructure
- Flexible service management
  - Resiliency: isolate failures of servers and storage
  - Workload movement: move work to other locations

Cloud Service Models
- Software as a Service (SaaS)
  - Provider licenses applications to users as a service
  - E.g., customer relationship management, e-mail, ...
  - Avoid costs of installation, maintenance, patches, ...
- Platform as a Service (PaaS)
  - Provider offers a software platform for building applications
  - E.g., Google's App Engine
  - Avoid worrying about scalability of the platform
- Infrastructure as a Service (IaaS)
  - Provider offers raw computing, storage, and network
  - E.g., Amazon's Elastic Compute Cloud (EC2)
  - Avoid buying servers and estimating resource needs

Multi-Tier Applications
- Applications consist of tasks
  - Many separate components
  - Running on different machines
- Commodity computers
  - Many general-purpose computers
  - Not one big mainframe
  - Easier scaling
- [Figure: a front-end server fans requests out to a tree of aggregators, which in turn fan out to many workers]
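The front end / aggregator / worker structure in the figure is a scatter-gather (partition/aggregate) pattern. A toy sketch under illustrative assumptions (worker, aggregator, and front_end are made-up stand-ins, not from the slides):

```python
# Toy partition/aggregate sketch: the front end fans a request out to workers
# via aggregators and combines the partial results into a single response.
def worker(shard: int, query: str) -> int:
    return len(query) + shard                  # stand-in for real per-shard work

def aggregator(shards: list[int], query: str) -> int:
    return sum(worker(s, query) for s in shards)

def front_end(query: str, n_aggregators: int = 4, shards_per_agg: int = 8) -> int:
    partials = [
        aggregator(list(range(a * shards_per_agg, (a + 1) * shards_per_agg)), query)
        for a in range(n_aggregators)
    ]
    return sum(partials)

print(front_end("search terms"))
```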

Enabling Technology: Virtualization
- Multiple virtual machines on one physical machine
- Applications run unmodified, as on a real machine
- A VM can migrate from one computer to another

Data Center Network

Status Quo: Virtual Switch in Server

Top-of-Rack Architecture
- Rack of servers
  - Commodity servers
  - And a top-of-rack (ToR) switch
- Modular design
  - Preconfigured racks
  - Power, network, and storage cabling
- Aggregate to the next level

Modularity, Modularity, Modularity
- Containers
- Many containers

Data Center Network Topology
- [Figure: conventional tree topology from the Internet down through core routers, access routers, and Ethernet switches to racks of application servers]
- Key: CR = Core Router, AR = Access Router, S = Ethernet Switch, A = Rack of app. servers
- ~1,000 servers/pod

Capacity Mismatch
- [Figure: the same tree topology, annotated with oversubscription ratios at each level]
- Oversubscription grows toward the top of the tree: roughly 5:1 at the ToR uplinks, ~40:1 at the access routers, and ~200:1 at the core
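To make the ratios concrete, a rough back-of-the-envelope sketch (the 1 Gbps server NIC speed is an assumption, not from the slide): with cumulative oversubscription R at a given level, a server can count on only 1/R of its NIC rate for traffic that must cross that level.

```python
# Worst-case per-server bandwidth share for traffic crossing an oversubscribed level.
def worst_case_share_gbps(nic_gbps: float, oversubscription: float) -> float:
    return nic_gbps / oversubscription

for level, ratio in [("ToR uplink", 5), ("access router", 40), ("core", 200)]:
    share = worst_case_share_gbps(1.0, ratio)   # assume 1 Gbps server NICs
    print(f"{level:>13}: ~{share * 1000:.0f} Mbps per server")
# ToR uplink: ~200 Mbps, access router: ~25 Mbps, core: ~5 Mbps
```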

Data-Center Routing
- [Figure: the same tree topology, split into a layer-3 region (core and access routers) and layer-2 regions (Ethernet switches within each pod)]
- Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app. servers
- ~1,000 servers/pod == one IP subnet

Reminder: Layer 2 vs. Layer 3
- Ethernet switching (layer 2)
  - Cheaper switch equipment
  - Fixed addresses and auto-configuration
  - Seamless mobility, migration, and failover
- IP routing (layer 3)
  - Scalability through hierarchical addressing
  - Efficiency through shortest-path routing
  - Multipath routing through equal-cost multipath (ECMP)
- So, as in enterprises, data centers often connect layer-2 islands with IP routers

Load Balancers
- Spread load over server replicas
  - Present a single public address (VIP) for a service
  - Direct each request to a server replica
- [Figure: a virtual IP (VIP) 192.121.10.1 fronting server replicas at 10.10.10.1, 10.10.10.2, and 10.10.10.3]
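As an illustration of the "direct each request to a replica" step, a minimal sketch of one common policy: hash the flow's 5-tuple so all packets of a connection reach the same replica (pick_replica and the choice of hash are illustrative, not any particular load balancer's API; addresses reuse the figure's values).

```python
import hashlib

REPLICAS = ["10.10.10.1", "10.10.10.2", "10.10.10.3"]   # servers behind the VIP

def pick_replica(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    # Hash the 5-tuple so every packet of a given connection maps to one replica.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return REPLICAS[digest % len(REPLICAS)]

print(pick_replica("203.0.113.7", 51812, "192.121.10.1", 80))
```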

Data Center Costs (Monthly)
- Servers: 45% (CPU, memory, disk)
- Infrastructure: 25% (UPS, cooling, power distribution)
- Power draw: 15% (electrical utility costs)
- Network: 15% (switches, links, transit)
Source: http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx

Wide-Area Network
- [Figure: clients reach geographically distributed data centers over the Internet; a DNS server performs DNS-based site selection to steer each client to a data center]
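A toy sketch of DNS-based site selection (site names, addresses, and latency estimates below are hypothetical): the authoritative DNS server answers each query with the address of the data center estimated to be best for that client.

```python
# Hypothetical site table: VIP of each data center plus estimated RTT to known clients.
SITES = {
    "us-east": {"vip": "198.51.100.10", "rtt_ms": {"client-eu": 90, "client-us": 20}},
    "eu-west": {"vip": "198.51.100.20", "rtt_ms": {"client-eu": 15, "client-us": 95}},
}

def resolve(client_id: str) -> str:
    """Return the VIP of the site with the lowest estimated RTT to this client."""
    best = min(SITES.values(), key=lambda s: s["rtt_ms"].get(client_id, float("inf")))
    return best["vip"]

print(resolve("client-eu"))   # -> 198.51.100.20
```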

Wide-Area Network: Ingress Proxies
- [Figure: clients connect to nearby ingress proxies, which relay traffic across the wide area to servers in the data centers]

Data Center Traffic Engineering Challenges and Opportunities

Traffic Engineering Challenges
- Scale
  - Many switches, hosts, and virtual machines
- Churn
  - Large number of component failures
  - Virtual machine (VM) migration
- Traffic characteristics
  - High traffic volume and dense traffic matrix
  - Volatile, unpredictable traffic patterns
- Performance requirements
  - Delay-sensitive applications
  - Resource isolation between tenants

Traffic Engineering Opportunities
- Efficient network
  - Low propagation delay and high capacity
- Specialized topology
  - Fat tree, Clos network, etc.
  - Opportunities for hierarchical addressing
- Control over both network and hosts
  - Joint optimization of routing and server placement
  - Can move network functionality into the end host
- Flexible movement of workload
  - Services replicated at multiple servers and data centers
  - Virtual machine (VM) migration

The Illusion of a Huge L2 Switch
- [Figure: the physical CR/AR/S tree abstracted as one big virtual layer-2 switch connecting all racks of servers]
- 1. L2 semantics
- 2. Uniform high capacity
- 3. Performance isolation

VL2 Goals and Solutions

Objective | Approach | Solution
1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Flow-based random traffic indirection (Valiant LB)
3. Performance isolation | Enforce hose model using existing mechanisms only | TCP

"Hose": each node has ingress/egress bandwidth constraints
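The hose constraint above bounds each node's total ingress and egress traffic rather than every pairwise demand. A small sketch of checking whether a traffic matrix fits per-node hose constraints (hose_feasible and the example rates are illustrative, not from VL2):

```python
# T maps (src, dst) -> demand in Gbps; caps give per-node ingress/egress limits.
def hose_feasible(T, egress_cap, ingress_cap):
    nodes = egress_cap.keys()
    ok_out = all(sum(T.get((s, d), 0) for d in nodes) <= egress_cap[s] for s in nodes)
    ok_in = all(sum(T.get((s, d), 0) for s in nodes) <= ingress_cap[d] for d in nodes)
    return ok_out and ok_in

caps = {"x": 1.0, "y": 1.0, "z": 1.0}                       # assume 1 Gbps NICs
demand = {("x", "y"): 0.6, ("x", "z"): 0.3, ("y", "z"): 0.4}
print(hose_feasible(demand, caps, caps))                     # True
```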

Name/Location Separation
- Cope with host churn with very little overhead
- Servers use flat names; the VL2 Directory Service maps each name to its current ToR (e.g., x → ToR2, y → ToR3, z → ToR3 after z moves from ToR4)
- Switches run link-state routing and maintain only switch-level topology
  - Allows use of low-cost switches
  - Protects network and hosts from host-state churn
  - Obviates host and switch reconfiguration
- Lazy update of stale cache entries: when a packet erroneously reaches the old ToR, that ToR forwards it to the directory server, which corrects the sender's cache with a unicast update
- [Figure: the sender looks up y and z in the directory, then tunnels packets to the destination's ToR, which delivers them to the server]
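A minimal sketch of the name/location split (the directory contents and the send / on_stale_delivery helpers are made up for illustration): senders resolve a flat application address to the destination's current ToR locator, encapsulate packets toward it, and repair stale cache entries lazily.

```python
# Directory: flat application address (AA) -> locator of its current ToR (LA).
directory = {"x": "ToR2", "y": "ToR3", "z": "ToR3"}

def send(src, dst_aa, payload, cache):
    tor = cache.get(dst_aa) or directory[dst_aa]   # resolve via cache or directory
    cache[dst_aa] = tor
    return {"outer_dst": tor, "inner_dst": dst_aa, "payload": payload}

def on_stale_delivery(dst_aa, cache):
    """Lazy repair: a mis-delivered packet triggers a unicast correction to the sender."""
    cache[dst_aa] = directory[dst_aa]

cache = {}
pkt = send("x", "z", b"hello", cache)              # encapsulated toward ToR3
print(pkt["outer_dst"])
```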

Clos Network Topology
- Offers huge aggregate capacity and multiple paths at modest cost
- [Figure: VL2 Clos topology with intermediate (Int) switches, aggregation (Aggr) switches, and ToRs, each ToR connecting 20 servers]
- K aggregation switches with D ports each: D/2 ports face up and D/2 face down, supporting 20*(DK/4) servers in total

D (# of 10G ports) | Max DC size (# of servers)
48  | 11,520
96  | 46,080
144 | 103,680
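A quick check of the table: the numbers match 20*(DK/4) servers when the number of aggregation switches K equals the port count D, which appears to be the assumption behind the table (max_servers is an illustrative helper, not from the paper).

```python
# 20 servers per ToR, D*K/4 ToRs when each D-port aggregation switch splits
# its ports half up, half down.
def max_servers(d_ports, k_aggr=None):
    k = d_ports if k_aggr is None else k_aggr   # assume K = D unless given
    return 20 * (d_ports * k) // 4

for d in (48, 96, 144):
    print(d, max_servers(d))   # 48 -> 11520, 96 -> 46080, 144 -> 103680
```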

Valiant Load Balancing: Indirection
- Cope with arbitrary traffic matrices with very little overhead
  - Harness huge bisection bandwidth
  - Obviate esoteric traffic engineering or optimization
  - Ensure robustness to failures
  - Work with switch mechanisms available today
- [Figure: each flow is bounced off a random intermediate switch, reached via the anycast address IANY, before descending to the destination ToR; ECMP + IP anycast spread flows across the up and down paths]
- Requirements: (1) must spread traffic, (2) must ensure destination independence
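A minimal sketch of the indirection step (switch names and vlb_path are illustrative; in VL2 the random choice is realized by ECMP hashing toward an anycast address rather than an explicit choice at the host):

```python
import random

INTERMEDIATES = ["Int1", "Int2", "Int3"]   # hypothetical intermediate switches

def vlb_path(src_tor, dst_tor):
    # Bounce each flow off a randomly chosen intermediate switch, then go to the
    # destination ToR; the load spread is independent of the traffic matrix.
    bounce = random.choice(INTERMEDIATES)
    return [src_tor, bounce, dst_tor]

print(vlb_path("ToR1", "ToR4"))            # e.g. ['ToR1', 'Int2', 'ToR4']
```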

VL2 vs. SEATTLE
- Similar "virtual layer 2" abstraction
  - Flat end-point addresses
  - Indirection through an intermediate node
- Enterprise networks (SEATTLE)
  - Hard to change hosts → directory on the switches
  - Sparse traffic patterns → effectiveness of caching
  - Predictable traffic patterns → no emphasis on TE
- Data center networks (VL2)
  - Easy to change hosts → move functionality to hosts
  - Dense traffic matrix → reduce dependency on caching
  - Unpredictable traffic patterns → ECMP and VLB for TE

Research Questions
- What topology to use in data centers?
  - Reducing wiring complexity
  - Achieving high bisection bandwidth
  - Exploiting capabilities of optics and wireless
- Routing architecture?
  - Flat layer-2 network vs. hybrid switch/router
  - Flat vs. hierarchical addressing
- How to perform traffic engineering?
  - Over-engineering vs. adapting to load
  - Server selection, VM placement, or optimizing routing
- Virtualization of NICs, servers, switches, ...

Research Questions
- Rethinking TCP congestion control?
  - Low propagation delay and high bandwidth
  - "Incast" problem leading to bursty packet loss
- Division of labor for TE, access control, ...
  - VM, hypervisor, ToR, and core switches/routers
- Reducing energy consumption
  - Better load balancing vs. selective shutting down
- Wide-area traffic engineering
  - Selecting the least-loaded or closest data center
- Security
  - Preventing information leakage and attacks