NetLord: A Scalable Multi-Tenant Network Architecture for Virtualized Datacenters (SIGCOMM ’11). Authors: Jayaram Mudigonda, Praveen Yalagandula, Jeff Mogul.


NetLord: A Scalable Multi-Tenant Network Architecture for Virtualized Datacenters. SIGCOMM ’11.
Authors: Jayaram Mudigonda, Praveen Yalagandula, Jeff Mogul (HP Labs, Palo Alto, CA); Bryan Stiekes, Yanick Pouffary (HP).
Speaker: Ming-Chao Hsu, National Cheng Kung University.

Outline
- Introduction
- Problem and Background
- NetLord’s Design
- Benefits and Design Rationale
- Experimental Analysis
- Discussion and Future Work
- Conclusions

INTRODUCTION
- Cloud datacenters such as Amazon EC2 and Microsoft Azure are becoming increasingly popular: they offer computing resources at very low cost under a pay-as-you-go model and eliminate the expense of buying and maintaining infrastructure.
- Cloud datacenter operators can provide cost-effective Infrastructure as a Service (IaaS) by time-multiplexing the physical infrastructure, using CPU virtualization techniques (e.g., VMware, Xen).
- Resource sharing lets these datacenters scale to huge sizes: the larger the number of tenants, the larger the number of VMs.
- IaaS networks must therefore support virtualization and multi-tenancy at scales of tens of thousands of tenants and servers, and hundreds of thousands of VMs.

INTRODUCTION
Most existing datacenter network architectures suffer from the following drawbacks:
- Expensive to scale: they require huge core switches with thousands of ports; complex new protocols, hardware, or specific features (such as IP-in-IP decapsulation, MAC-in-MAC encapsulation, or switch modifications); and excessive switch resources (large forwarding tables).
- Limited support for multi-tenancy: a tenant should be able to define its own layer-2 and layer-3 addresses, receive performance guarantees and performance isolation, and share the IP address space.
- Complex configuration: careful manual configuration is needed to set up IP subnets and configure OSPF.

INTRODUCTION
NetLord:
- Encapsulates a tenant’s L2 packets to provide full address-space virtualization.
- Runs a light-weight agent in the end-host hypervisors to transparently encapsulate and decapsulate packets from and to the local VMs.
- Leverages SPAIN’s approach to multi-pathing, using VLAN tags to identify paths through the network.
- The encapsulating Ethernet header directs the packet to the destination server’s edge switch, then to the correct server (hypervisor), and finally to the correct tenant.
- Does not expose any tenant MAC addresses to the switches, and also hides most of the physical server addresses. The switches can therefore use very small forwarding tables, reducing capital, configuration, and operational costs.

PROBLEM AND BACKGROUND
- Scale at low cost: the network must scale, at low cost, to large numbers of tenants, hosts, and VMs. Commodity network switches often have only limited resources and features, e.g., data-plane MAC forwarding information base (FIB) tables of 64K entries in the HP ProCurve and 55K in the Cisco Catalyst 4500 series.
- Small data-plane FIB tables create a MAC-address learning problem: when the working set of active MAC addresses exceeds the switch’s data-plane table, some entries are lost, and a subsequent packet to those destinations causes flooding.
- Networks should be easy to operate and should handle link failures automatically.
- Flexible network abstraction: different tenants have different network needs, and VMs should be able to migrate without needing to change their network addresses. NetLord fully virtualizes the L2 and L3 address spaces for each tenant, without any restrictions on the tenant’s choice of L2 or L3 addresses.

PROBLEM AND BACKGROUND
Multi-tenant datacenters:
- VLANs can isolate different tenants on a single L2 network: the hypervisor encapsulates a VM’s packets with a VLAN tag. But the VLAN tag is a 12-bit field in the VLAN header, limiting this approach to at most 4K tenants. The IEEE 802.1ad “QinQ” protocol allows stacking of VLAN tags, which would relieve the 4K limit, but QinQ is not yet widely supported. Also, the Spanning Tree Protocol limits the bisection bandwidth available to any given tenant, and this approach exposes all VM addresses to the switches, creating scalability problems.
- Amazon’s Virtual Private Cloud (VPC) provides an L3 abstraction and full IP address-space virtualization. However, VPC does not support multicast or broadcast, which implies that tenants do not get an L2 abstraction.
- Greenberg et al. propose VL2, which provides each tenant with a single IP subnet (L3) abstraction and relies on IP-in-IP encapsulation/decapsulation. This approach assumes several features not common in commodity switches. VL2 handles a service’s L2 broadcast packets by transmitting them on an IP multicast address assigned to that service.
- Diverter overwrites the MAC addresses of inter-subnet packets and accommodates multiple tenants’ logical IP subnets on a single physical topology. But it assigns unique IP addresses to the VMs, so it does not provide L3 address virtualization.

PROBLEM AND BACKGROUND
Scalable network fabrics:
- Many research projects and industry standards address the limitations of Ethernet’s Spanning Tree Protocol.
- TRILL (an IETF standard), Shortest Path Bridging (an IEEE standard), and SEATTLE support multipathing using a single-shortest-path approach. All three need new control-plane protocol implementations and new data-plane silicon; hence, inexpensive commodity switches will not support them.
- Scott et al. proposed MOOSE, which addresses Ethernet scaling issues by using hierarchical MAC addressing. It also uses shortest-path routing, and needs switches that forward packets based on MAC prefixes.
- Mudigonda et al. proposed SPAIN, which uses the VLAN support in existing commodity Ethernet switches to provide multipathing over arbitrary topologies. SPAIN uses VLAN tags to identify k edge-disjoint paths between pairs of endpoint hosts (see the sketch below). However, the original SPAIN design may expose each endpoint MAC address k times (once per VLAN), stressing data-plane tables, so it cannot scale to large numbers of VMs.
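To make SPAIN-style multipathing concrete, here is a minimal Python sketch, assuming a precomputed map from each destination edge switch to the k VLAN IDs whose spanning subgraphs reach it. All names (VLANS_FOR_SWITCH, spain_vlan_for) are hypothetical illustrations, not the SPAIN authors’ code.

```python
# Minimal sketch of SPAIN-style VLAN (path) selection, under the assumption
# that a controller has precomputed k edge-disjoint paths per edge switch
# and assigned each path a VLAN ID.
import zlib

# Destination edge-switch MAC -> list of usable VLAN IDs (one per path).
VLANS_FOR_SWITCH = {
    "aa:bb:cc:00:00:01": [10, 11, 12, 13],
    "aa:bb:cc:00:00:02": [10, 14],
}

def spain_vlan_for(dst_edge_switch: str, flow_key: tuple) -> int:
    """Pin a flow to one of the k VLANs toward the destination edge switch
    by hashing the flow key, so packets of one flow stay on one path."""
    vlans = VLANS_FOR_SWITCH[dst_edge_switch]
    h = zlib.crc32(repr(flow_key).encode())
    return vlans[h % len(vlans)]

# Example: a TCP flow identified by the usual 5-tuple.
flow = ("10.0.0.5", "10.0.0.9", 6, 34567, 80)
print(spain_vlan_for("aa:bb:cc:00:00:01", flow))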

NETLORD’S DESIGN
- NetLord leverages our previous work on SPAIN to provide a multi-path fabric using commodity switches.
- The NetLord Agent (NLA), a light-weight agent in the hypervisors, encapsulates packets with L2 and L3 headers that carry them to the correct edge switch, allow the egress edge switch to deliver the packet to the correct server, and allow the destination NLA to deliver the packet to the correct tenant.
- A consequence of this encapsulation method is that tenant VM addresses are never exposed to the actual hardware switches: a single edge-switch MAC address is shared across a large number of physical and virtual machines. This resolves the problem of FIB-table pressure; instead of needing to store millions of VM addresses in their FIBs, switches store only edge-switch addresses. Edge-switch FIBs must also store the addresses of their directly connected server NICs.

NETLORD’S DESIGN
- Several network abstractions are available to NetLord tenants. [Figure: the inset shows a pure L2 abstraction; the main figure shows a tenant with three IPv4 subnets connected by a virtual router.]

NETLORD’S DESIGN
NetLord’s components:
- Fabric switches: traditional, unmodified commodity switches. They need only support VLANs (for multi-pathing) and basic IP forwarding; no routing protocol (RIP, OSPF) is required.
- NetLord Agent (NLA): resides in the hypervisor of each physical server and encapsulates and decapsulates all packets from and to the local VMs. The NLA builds and maintains a table mapping a VM’s Virtual Interface (VIF) ID to the port number and MAC address of its edge switch.
- Configuration repository: resides at an address known to the NLAs and the VM-manager system, and maintains several databases: SPAIN-style multi-pathing information and per-tenant configuration information.

NETLORD’S DESIGN
[Figure: NetLord’s high-level component architecture (top) and packet encapsulation/decapsulation flows (bottom).]

NETLORD’S DESIGN
Encapsulation details:
- NetLord identifies a tenant VM’s Virtual Interface (VIF) by the 3-tuple (Tenant_ID, MACASID, MAC-Address), where MACASID is a tenant-assigned MAC address-space ID.
- When a tenant VIF SRC sends a packet to a VIF DST, the NetLord Agent must encapsulate the packet unless the two VMs are on the same server. The MACASID is carried in the IP.src field; the outer VLAN tag is stripped by the egress edge switch.
- The NLA needs to know the destination VM’s remote edge-switch MAC address and remote edge-switch port number. It uses SPAIN to choose a VLAN toward the egress edge switch; the egress edge switch looks up the packet’s IP destination address to generate the final forwarding hop.
- The encapsulating IP destination address uses the egress port number p as the prefix and the Tenant_ID tid as the suffix: p.tid[16:23].tid[8:15].tid[0:7], which supports 24 bits of Tenant_ID (see the sketch below).
- The full encapsulation is:
  VLAN.tag = SPAIN_VLAN_for(edgeswitch(DST), flow)
  MAC.src = edgeswitch(SRC).MAC_address
  MAC.dst = edgeswitch(DST).MAC_address
  IP.src = MACASID
  IP.dst = encode(edgeswitch_port(DST), Tenant_ID)
  IP.id = same as VLAN.tag
  IP.flags = Don’t Fragment
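A minimal sketch of the IP.dst encoding just described, assuming the egress port fits in one octet and the Tenant_ID in 24 bits; the function names are ours, not the authors’.

```python
# Sketch (illustrative) of NetLord's IP.dst encoding:
# first octet = egress edge-switch port p,
# remaining three octets = 24-bit Tenant_ID: p.tid[16:23].tid[8:15].tid[0:7].

def encode_ip_dst(egress_port: int, tenant_id: int) -> str:
    assert 0 <= egress_port <= 255 and 0 <= tenant_id < 2**24
    return "{}.{}.{}.{}".format(
        egress_port,
        (tenant_id >> 16) & 0xFF,
        (tenant_id >> 8) & 0xFF,
        tenant_id & 0xFF,
    )

def decode_ip_dst(ip_dst: str) -> tuple[int, int]:
    p, a, b, c = (int(x) for x in ip_dst.split("."))
    return p, (a << 16) | (b << 8) | c

# Example: tenant 0x01020F behind port 7 of its edge switch.
assert encode_ip_dst(7, 0x01020F) == "7.1.2.15"
assert decode_ip_dst("7.1.2.15") == (7, 0x01020F)
```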

NETLORD’S DESIGN
Switch and NetLord Agent configuration:
- Switches require only static configuration to join a NetLord network. These configurations are set at boot time and need never change unless the network topology changes.
- When a switch boots, it simply creates IP forwarding-table entries (prefix, port, next_hop) = (p.*.*.*/8, p, p.0.0.1) for each switch port p (see the sketch below). These entries can be set up by a switch-local script, or by a management-station script via SNMP or the switch’s remote console protocol.
- Every edge switch has exactly the same IP forwarding table; this simplifies switch management and greatly reduces the chances of misconfiguration.
- SPAIN VLAN configurations: this information comes from the configuration repository and is pushed to the switches via SNMP or the remote console protocol. When there is a significant topology change, the SPAIN controller might need to update these VLAN configurations.
- NetLord Agent configuration: VM configuration information (including tenant IDs) is distributed to the hypervisors, making VIF parameters visible to each hypervisor. The NLA learns its own IP address and edge-switch MAC address by listening to the Link Layer Discovery Protocol (LLDP, IEEE 802.1AB).
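A sketch, under the slide’s (p.*.*.*/8, p, p.0.0.1) rule, of the identical boot-time forwarding table every edge switch would install; the generator itself is hypothetical, not a real switch script.

```python
# Sketch of the static, per-port IP forwarding entries a NetLord edge switch
# installs at boot: (prefix, port, next_hop) = (p.*.*.*/8, p, p.0.0.1).

def boot_time_forwarding_table(num_ports: int) -> list[tuple[str, int, str]]:
    table = []
    for p in range(1, num_ports + 1):
        prefix = f"{p}.0.0.0/8"   # matches all IP.dst values encoding port p
        next_hop = f"{p}.0.0.1"   # the NLA on the server behind port p
        table.append((prefix, p, next_hop))
    return table

# Every edge switch gets exactly the same table, e.g. for a 4-port switch:
for entry in boot_time_forwarding_table(4):
    print(entry)
# ('1.0.0.0/8', 1, '1.0.0.1') ... ('4.0.0.0/8', 4, '4.0.0.1')
```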

NETLORD’S DESIGN
NL-ARP:
- NL-ARP is a simple protocol that maintains an NL-ARP table in each hypervisor, mapping a VIF to its location: the egress edge-switch MAC address and port number. The NLA also uses this table to proxy ARP requests, decreasing broadcast load.
- NL-ARP defaults to a push-based model, rather than the pull-based model of traditional ARP: a binding of a VIF to its location is pushed to all NLAs whenever the binding changes (a sketch follows).
- The NL-ARP protocol uses three message types:
  - NLA-HERE, to report a location (unicast); an NLA responds to an NLA-WHERE broadcast with a unicast NLA-HERE.
  - NLA-NOTHERE, to correct misinformation (unicast).
  - NLA-WHERE, to request a location (broadcast).
- When a new VM is started, or when it migrates to a new server, the NLA on its current server broadcasts its location using an NLA-HERE message. NL-ARP table entries are never expired.
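A minimal sketch of the push-based NL-ARP table an NLA might keep; the handler names mirror the slide’s message types, but the data structure itself is our assumption.

```python
# Sketch of an NLA's NL-ARP table: VIF -> (edge-switch MAC, edge-switch port).
# Bindings are pushed via NLA-HERE and never expire.

VIF = tuple  # (tenant_id, macasid, vm_mac)

class NLArpTable:
    def __init__(self):
        self.loc: dict[VIF, tuple[str, int]] = {}

    def on_nla_here(self, vif: VIF, switch_mac: str, port: int):
        # Push model: update the binding whenever a VM starts or migrates.
        self.loc[vif] = (switch_mac, port)

    def on_nla_nothere(self, vif: VIF, switch_mac: str, port: int):
        # Drop a stale binding reported as wrong by another NLA.
        if self.loc.get(vif) == (switch_mac, port):
            del self.loc[vif]

    def lookup(self, vif: VIF):
        # On a miss the real NLA would broadcast NLA-WHERE and await a
        # unicast NLA-HERE; here we just return None to signal the miss.
        return self.loc.get(vif)

table = NLArpTable()
table.on_nla_here((42, 0, "02:00:00:00:00:05"), "aa:bb:cc:00:00:01", 7)
print(table.lookup((42, 0, "02:00:00:00:00:05")))  # ('aa:bb:cc:00:00:01', 7)
```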

NETLORD’S DESIGN
Packet-flow operation:
- VM operation: a VM’s outbound packet is transparently passed to the local NLA; the VM’s ARP messages are handled by the NL-ARP subsystem.
- Sending NLA operation: the NLA looks up the destination VIF (Tenant_ID, MACASID, MAC-Address) in the NL-ARP table, which maps the destination VIF ID to the destination edge-switch MAC address and server port number, and encapsulates the packet accordingly.
- Network operation: all the switches learn the reverse path to the ingress edge switch from the source MAC. The egress edge switch strips the Ethernet header, looks up the destination IP address in its IP forwarding table, and obtains the next-hop information for the destination NLA: the switch-port number and the local IP address of the destination server.
- Receiving NLA operation: the NLA decapsulates the MAC and IP headers, recovering the Tenant_ID from the IP destination address, the MACASID from the IP source address, and the VLAN tag from the IP_ID field. It then looks up the correct tenant’s VIF and delivers the packet to that VIF (a decapsulation sketch follows).
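A matching decapsulation sketch, illustrative only, reversing the field placement described above (Tenant_ID from IP.dst, MACASID from IP.src, VLAN from IP.id).

```python
# Sketch of the receiving NLA's decapsulation step.

def decapsulate(ip_src: str, ip_dst: str, ip_id: int, inner_frame: bytes):
    octets = [int(x) for x in ip_dst.split(".")]
    port = octets[0]                                   # egress switch port
    tenant_id = (octets[1] << 16) | (octets[2] << 8) | octets[3]
    macasid = ip_src                                   # MAC address-space ID
    vlan = ip_id                                       # VLAN tag SPAIN chose
    # The real NLA would use (tenant_id, macasid, inner dst MAC) to find the
    # local tenant VIF and hand it the inner L2 frame.
    return port, tenant_id, macasid, vlan, inner_frame

print(decapsulate("0.0.0.3", "7.1.2.15", 10, b"...")[:4])
# (7, 66063, '0.0.0.3', 10)
```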

BENEFITS AND DESIGN RATIONALE
- Scale: number of tenants. By encoding a 24-bit Tenant_ID in the p.tid[16:23].tid[8:15].tid[0:7] address, NetLord can support as many as 2^24 simultaneous tenants.
- Scale: number of VMs. The switches are insulated from the much larger number of VM addresses. To derive a worst-case limit, assume a worst-case traffic matrix of simultaneous all-to-all flows between all VMs, and require that no packet is flooded due to a capacity miss in any FIB. Under these assumptions, NetLord can support N = V · R · √(F/2) unique MAC addresses, where V is the number of VMs per physical server, R is the switch radix, and F is the FIB size (in entries).
- This bounds the maximum number of VM MAC addresses NetLord could support for various switch port counts and FIB sizes. Following a rule of thumb used in industrial designs, we assume V = 50. Switches with 72 ports and 64K FIB entries could support 650K VMs; 144-port ASICs with the same FIB size could support 1.3M VMs (a worked check follows).
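To make the bound concrete, a small computation reproducing the 650K and 1.3M figures from the slide, with V = 50 as stated above.

```python
# Worked check of the worst-case VM-scaling bound N = V * R * sqrt(F / 2),
# where V = VMs per server, R = switch radix, F = FIB entries.
from math import sqrt

def max_vms(V: int, R: int, F: int) -> int:
    return int(V * R * sqrt(F / 2))

V = 50  # rule-of-thumb VMs per physical server, from the slides
print(max_vms(V, 72, 64 * 1024))   # ~651,000  -> the quoted "650K VMs"
print(max_vms(V, 144, 64 * 1024))  # ~1,303,000 -> the quoted "1.3M VMs"
```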

BENEFITS AND DESIGN RATIONALE
- Why encapsulate? VM addresses can be hidden from switch FIBs either via encapsulation or via header rewriting. Encapsulation has drawbacks: (1) increased per-packet overhead, both extra header bytes and the CPU to process them; (2) heightened chances of fragmentation, and thus dropped packets; (3) increased complexity for in-network QoS processing.
- Why not use MAC-in-MAC? It would require the egress edge-switch FIB to map each VM destination MAC address, plus some mechanism to update the FIB when a VM starts, stops, or migrates. Also, MAC-in-MAC is not yet widely supported in inexpensive datacenter switches.
- NL-ARP overheads: NL-ARP imposes two kinds of overhead: bandwidth for its messages on the network, and space on the servers for its tables. Servers have at least one 10 Gbps NIC; capping NLA-HERE traffic at just 1% of that bandwidth can sustain about 164,000 NLA-HERE messages per second. Each table entry is 32 bytes, so a million entries can be stored in a 64 MB hash table, which consumes only about 0.8% of a server’s 8 GB of RAM (the arithmetic is checked below).
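A back-of-the-envelope check of these overhead numbers. The ~76-byte NLA-HERE wire size is our assumption, chosen because it makes 1% of 10 Gbps yield the quoted ~164K messages per second; the slides state only the resulting rates.

```python
# Check of the NL-ARP overhead arithmetic quoted above.
nic_bps = 10e9
budget_bps = 0.01 * nic_bps          # 1% of one 10 Gbps NIC
msg_bytes = 76                       # ASSUMED NLA-HERE frame size on the wire
print(budget_bps / (msg_bytes * 8))  # ~164,000 messages per second

entry_bytes = 32
entries = 1_000_000
table_bytes = 64 * 2**20             # 64 MB hash table (~50% load factor)
assert entries * entry_bytes <= table_bytes
print(table_bytes / (8 * 2**30))     # ~0.0078 -> about 0.8% of 8 GB RAM
```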

EXPERIMENTAL ANALYSIS
- We implemented the NetLord agent as a Linux kernel module extending our SPAIN module, in about 950 commented lines of code, using statically configured NL-ARP lookup tables (Ubuntu Linux, kernel 2.6.28.19).
- NetLord Agent micro-benchmarks used “ping” for latency and Netperf for throughput. We compare NetLord against an unmodified Ubuntu stack (“PLAIN”) and our original SPAIN implementation, on two Linux hosts (quad-core 3 GHz Xeon CPU, 8 GB RAM, 1 Gbps NIC) connected via a pair of edge switches.
- Ping (latency) experiments: 100 64-byte packets. NetLord adds at most a few microseconds to the end-to-end latency.
- One-way and bi-directional TCP throughput (Netperf): for each case we ran 50 10-second trials, with 1500-byte packets in the two-way experiments. Despite its 34-byte encapsulation headers, NetLord causes only a 0.3% decrease in one-way throughput and a 1.2% decrease in two-way throughput.

EXPERIMENTAL ANALYSIS
Emulation methodology and testbed:
- We tested on 74 servers, adding a light-weight VM shim layer to the NLA kernel module so that each TCP flow endpoint becomes an emulated VM. This lets us emulate up to V = 3000 VMs per server, with 74 VMs per tenant across our 74 servers.
- Metrics: we compute the goodput as the total number of bytes transferred by all tasks.
- Testbed: a larger shared testbed of 74 servers, each with a quad-core 3 GHz Xeon CPU and 8 GB RAM. The 74 servers are distributed across 6 edge switches. All switch and NIC ports run at 1 Gbps, and the switches can hold 64K FIB table entries.
- Topologies: (i) FatTree: the entire testbed has 16 edge switches, with another 8 switches emulating 16 core switches; we constructed a two-level FatTree with full bisection bandwidth (no oversubscription). (ii) Clique: all 16 edge switches are connected to each other, oversubscribed by at most 2:1.

EXPERIMENTAL ANALYSIS
Emulation results:
[Figures: goodput with a varying number of tenants (number of VMs in parentheses), and the number of flooded packets with a varying number of tenants.]