Data Center Network Topologies: VL2 (Virtual Layer 2) Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems.


Data Center Network Topologies: VL2 (Virtual Layer 2)
Hakim Weatherspoon, Assistant Professor, Dept. of Computer Science
CS 5413: High Performance Systems and Networking
September 26, 2014
Slides used and adapted judiciously from COS-561, Advanced Computer Networks, at Princeton University

Where are we in the semester?
Overview and Basics
Data Center Networks
– Basic switching technologies
– Data Center Network Topologies (today and Monday)
– Software Routers (e.g., Click, RouteBricks, NetMap, NetSlice)
– Alternative Switching Technologies
– Data Center Transport
Data Center Software Networking
– Software Defined Networking (overview, control plane, data plane, NetFPGA)
– Data Center Traffic and Measurements
– Virtualizing Networks
– Middleboxes
Advanced Topics

Goals for Today
VL2: A Scalable and Flexible Data Center Network
– A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. ACM Computer Communication Review (CCR), August 2009.

Architecture of Data Center Networks (DCN)

Conventional DCN Problems
– Static network assignment
– Fragmentation of resources ("I want more" / "I have spare ones, but…")
– Poor server-to-server connectivity
– Traffic of one service affects others
– Poor reliability and utilization
[Figure: conventional tree topology of core routers (CR), aggregation routers (AR), switches (S), and servers (A); oversubscription grows toward the root, with ratios of 1:5, 1:80, and 1:240 at successive layers]

Objectives
Uniform high capacity:
– The maximum rate of a server-to-server traffic flow should be limited only by the capacity of the servers' network cards
– Assigning servers to a service should be independent of network topology
Performance isolation:
– Traffic of one service should not be affected by traffic of other services
Layer-2 semantics:
– Easily assign any server to any service
– Configure a server with whatever IP address the service expects
– A VM keeps the same IP address even after migration

Measurements and Implications of DCN
Data-center traffic analysis:
– The ratio of traffic between servers to traffic entering/leaving the data center is 4:1
– Demand for bandwidth between servers is growing faster than demand for external bandwidth
– The network, not computation, is the bottleneck
Flow distribution analysis:
– The majority of flows are small; the biggest flows are around 100MB
– The distribution of internal flows is simpler and more uniform
– 50% of the time a server has about 10 concurrent flows; less than 5% of the time it has more than 80

Measurements and Implications of DCN (continued)
Traffic matrix analysis:
– Traffic patterns resist compact summarization
– Traffic patterns are unstable over time
Failure characteristics:
– Pattern of networking equipment failures: 95% … 10 days
– No obvious way to eliminate all failures from the top of the hierarchy

Virtual Layer 2 Switch (VL2)
Design principles:
– Randomizing to cope with volatility: use Valiant Load Balancing (VLB) to spread traffic across multiple intermediate nodes, independent of destination
– Building on proven networking technology: use IP routing and forwarding technologies already available in commodity switches
– Separating names from locators: use a directory system to maintain the mapping between names and locations
– Embracing end systems: a VL2 agent runs at each server

Virtual Layer 2 Switch (VL2)
The goal: give each service the illusion of one big layer-2 switch connecting all of its servers, providing
1. L2 semantics
2. Uniform high capacity
3. Performance isolation
[Figure: the conventional CR/AR/S hierarchy replaced by a single virtual switch connecting all servers (A)]

VL2 Goals and Solutions
Objective → Approach → Solution:
1. Layer-2 semantics → employ flat addressing → name-location separation and resolution service
2. Uniform high capacity between servers → guarantee bandwidth for hose-model traffic → flow-based random traffic indirection (Valiant LB)
3. Performance isolation → enforce the hose model using existing mechanisms only → TCP
("Hose" model: each node has only ingress/egress bandwidth constraints)
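The hose model above can be checked mechanically: a traffic matrix conforms when every node's total sending rate fits its egress cap and its total receiving rate fits its ingress cap. A minimal sketch in Python (the function name and list-of-lists representation are illustrative, not from the paper):

```python
def conforms_to_hose(tm, egress, ingress):
    """Return True if traffic matrix tm (tm[i][j] = rate from node i
    to node j) satisfies the hose model: each node's total outgoing
    rate is within egress[i] and total incoming rate within ingress[i]."""
    n = len(tm)
    for i in range(n):
        if sum(tm[i]) > egress[i]:                   # total node i sends
            return False
        if sum(row[i] for row in tm) > ingress[i]:   # total node i receives
            return False
    return True
```

Under the hose model VL2 only has to guarantee these per-node aggregates, not any particular matrix, which is what makes destination-independent random spreading sufficient.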

Name/Location Separation
– Servers use flat names; switches run link-state routing and maintain only the switch-level topology
– A directory service maintains the name-to-location mapping (e.g., x → ToR2, y → ToR3, z → ToR4); a sender looks up the destination's ToR and encapsulates the payload to it
– Copes with host churn with very little overhead: when a host moves between ToRs, only its directory entry changes
Benefits:
– Allows use of low-cost switches
– Protects the network and hosts from host-state churn
– Obviates host and switch reconfiguration
[Figure: lookup-and-response flow between servers, ToR switches, and the directory service]
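The separation described above can be pictured as a lookup table keyed by flat application addresses that returns the locator of the destination's ToR switch; migration is then just an entry update. A toy sketch (class and method names are hypothetical, not VL2's actual API):

```python
class DirectoryService:
    """Toy name-location directory: flat server names map to the
    locator address of the ToR switch they currently sit behind."""

    def __init__(self):
        self._name_to_tor = {}

    def register(self, name, tor):
        # Called when a server boots or a VM migrates; the name never changes.
        self._name_to_tor[name] = tor

    def lookup(self, name):
        # The VL2 agent on the sender queries (and caches) this mapping,
        # then encapsulates packets to the returned ToR locator.
        return self._name_to_tor.get(name)

ds = DirectoryService()
ds.register("z", "ToR3")
ds.register("z", "ToR4")   # z migrates: only the directory entry changes
```

Because switches never see host addresses, host churn updates only this table, not switch state.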

Clos Network Topology
– ToR switches, each connecting 20 servers, attach to K aggregation switches with D ports each, which connect to a layer of intermediate switches
– Supports 20*(DK/4) servers
– Offers huge aggregate capacity and many paths at modest cost
D (# of 10G ports) → max DC size (# of servers):
– 48 → 11,520
– 96 → 46,080
– 144 → 103,680
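The sizing rule above is easy to sanity-check: with 20 servers per ToR and the slide's 20*(DK/4) formula, taking K = D (as the slide's table implies) gives the listed capacities. A quick calculation:

```python
def max_dc_size(d):
    """Max number of servers in a VL2 Clos built from D-port switches,
    per the slide's formula 20*(D*K/4), assuming K = D aggregation
    switches and 20 servers per ToR."""
    return 20 * d * d // 4

# For D = 48, 96, 144 this gives 11,520, 46,080, and 103,680 servers.
sizes = {d: max_dc_size(d) for d in (48, 96, 144)}
```

Doubling the port count D roughly quadruples the supported server count, since capacity grows with D squared.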

Valiant Load Balancing: Indirection
Requirements:
1. Must spread traffic
2. Must ensure destination independence
Approach [ECMP (Equal-Cost Multi-Path forwarding) + IP Anycast]: a sender forwards each flow up to a randomly chosen intermediate switch, which forwards it down to the destination's ToR
– Harnesses huge bisection bandwidth
– Obviates esoteric traffic engineering or optimization
– Ensures robustness to failures
– Works with switch mechanisms available today
[Figure: up-path and down-path links between ToR switches (T1–T6) and intermediate switches; a packet from x to y is encapsulated to an anycast intermediate address]
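Flow-based spreading as described above can be approximated by hashing a flow's 5-tuple to pick the intermediate switch: every packet of a flow takes the same path (avoiding TCP reordering), while distinct flows scatter across all intermediates regardless of destination. A sketch (the hash choice and names are illustrative; real switches do this in ECMP hardware):

```python
import hashlib

def pick_intermediate(flow_5tuple, intermediates):
    """Deterministically map a flow (src, dst, proto, sport, dport)
    to one intermediate switch: same flow -> same path; different
    flows -> spread roughly uniformly, independent of destination."""
    digest = hashlib.sha256(repr(flow_5tuple).encode()).digest()
    return intermediates[int.from_bytes(digest[:4], "big") % len(intermediates)]
```

In VL2 itself the sender encapsulates the packet to an anycast address shared by all intermediate switches and lets the switches' ECMP hashing make this choice.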

VL2 Directory System
Two server types: directory servers (DS), which answer agents' lookups, and RSM (replicated state machine) servers, which durably store the mappings.
Lookup: 1. agent sends a lookup to a directory server → 2. reply
Update: 1. agent sends an update to a directory server → 2. DS sets the new mapping at the RSM → 3. RSM servers replicate → 4. RSM acks the DS → 5. DS acks the agent → (6. mapping disseminated to the other directory servers)

Evaluation
Uniform high capacity:
– All-to-all data shuffle stress test: 75 servers, each delivering 500MB
– Maximal achievable goodput is 62.3 Gbps; VL2 achieves 58.8 Gbps, i.e., 58.8/62.3 = 94% network efficiency

Evaluation
Fairness:
– 75 nodes, real data center workload
– Plot Jain's fairness index for traffic to intermediate switches
[Figure: Jain's fairness index vs. time (s) for three aggregation switches, Aggr1–Aggr3]
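Jain's fairness index used in this plot is (Σx)² / (n·Σx²): it equals 1.0 when all n flows receive equal rates and falls toward 1/n as one flow dominates. For reference:

```python
def jain_index(rates):
    """Jain's fairness index of a list of flow rates:
    (sum x)^2 / (n * sum x^2).
    1.0 = perfectly fair; 1/n = one flow takes everything."""
    n = len(rates)
    total = sum(rates)
    return total * total / (n * sum(r * r for r in rates))
```

For example, four equal rates give an index of 1.0, while one active flow out of four gives 0.25.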

Evaluation
Performance isolation, with two types of services:
– Service one: 18 servers doing a single TCP transfer continuously
– Service two (experiment 1): 19 servers, each starting an 8GB TCP transfer every 2 seconds
– Service two (experiment 2): 19 servers bursting short TCP connections

Evaluation
Convergence after link failures:
– 75 servers, all-to-all data shuffle
– Disconnect links between intermediate and aggregation switches

Perspective
– Studied the traffic patterns in a production data center
– Designed, built, and deployed every component of VL2 in an 80-server testbed
– Applied VLB to randomly spread traffic over multiple paths
– Used flat addressing to separate server names from their locations (IP addresses)

Critique
– Extra servers are needed to support the VL2 directory system, which adds device cost and is hard to implement for data centers with tens of thousands of servers
– All links and switches are active at all times, which is not power-efficient
– No evaluation of real-time performance

VL2 vs. SEATTLE
Similar "virtual layer 2" abstraction:
– Flat end-point addresses
– Indirection through an intermediate node
Enterprise networks (SEATTLE):
– Hard to change hosts → directory on the switches
– Sparse traffic patterns → effectiveness of caching
– Predictable traffic patterns → no emphasis on TE
Data center networks (VL2):
– Easy to change hosts → move functionality to hosts
– Dense traffic matrix → reduce dependency on caching
– Unpredictable traffic patterns → ECMP and VLB for TE

Other Data Center Architectures
VL2: A Scalable and Flexible Data Center Network
– Consolidates layer 2/layer 3 into a "virtual layer 2"
– Separates "naming" from "addressing"; also addresses dynamic load balancing
A Scalable, Commodity Data Center Network Architecture
– A new fat-tree interconnection structure (topology) that increases bisection bandwidth
– Needs "new" addressing and forwarding/routing
Other approaches:
– PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
– BCube: A High-Performance, Server-centric Network Architecture for Modular Data Centers

Ongoing Research

Research Questions
What topology to use in data centers?
– Reducing wiring complexity
– Achieving high bisection bandwidth
– Exploiting capabilities of optics and wireless
Routing architecture?
– Flat layer-2 network vs. hybrid switch/router
– Flat vs. hierarchical addressing
How to perform traffic engineering?
– Over-engineering vs. adapting to load
– Server selection, VM placement, or optimizing routing
Virtualization of NICs, servers, switches, …

Research Questions (continued)
Rethinking TCP congestion control?
– Low propagation delay and high bandwidth
– "Incast" problem leading to bursty packet loss
Division of labor for TE, access control, …
– VM, hypervisor, ToR, and core switches/routers
Reducing energy consumption
– Better load balancing vs. selectively shutting down
Wide-area traffic engineering
– Selecting the least-loaded or closest data center
Security
– Preventing information leakage and attacks

Before Next Time
Project progress:
– Need to set up your environment as soon as possible
– And meet with your group, TA, and professor
Lab0b (Getting Started with Fractus):
– Use Fractus instead of Red Cloud; Red Cloud instances will be terminated and their state lost
– Due Monday, Sept 29
Required review and reading for Friday, September 26:
– "The Click Modular Router", E. Kohler, R. Morris, B. Chen, and M. F. Kaashoek. ACM Symposium on Operating Systems Principles (SOSP), December 1999.
Check Piazza and the course website for the updated schedule.