Download presentation
Presentation is loading. Please wait.
1
L3 + VXLAN Made Practical
Speaker: Nolan
2
Who We Are Nolan Leake Chet Burgess Cofounder, CTO Cumulus Networks
Vice President, Engineering Metacloud Speaker: Nolan & Chet
3
Today, most non-SDN controller based OpenStack deployments use L2 networks.
Speaker: Nolan
4
Traditional Enterprise Network Design
Access Aggregation Core VRRP STP ECMP Speaker: Nolan
5
What’s wrong with L2? Aggregation tier must be highly available/redundant Aggregate/Core scalability MAC/ARP table limits, VLAN exhaustion, East-West choke points Wasted capacity (STP blocking ports) Proprietary protocols/extensions MLAG, vPC, etc Speaker: Nolan
6
How do we make it better? Speaker: Nolan
7
L3: A better design IP Fabrics Are Ubiquitous Simple Feature Set
Proven at scale (The Internet, massive datacenter clusters) Simple Feature Set no alphabet soup of L2 protocols Scalable L2/L3 Boundary ECMP – Equal Cost Multi-Path Each link is active at all times Maximize link utilization Predictable latency Better failure handling Speaker: Nolan
8
L3: A better design LEAF SPINE Speaker: Nolan
9
Pure L3 is great for maximizing connectivity, but what about segregation of projects?
Speaker: Chet VLANs provide segregation of projects, if we use pure L3 we would not be able to trunk our VLANs to every point in the fabric
10
VXLAN: Virtual eXtensible LAN
IETF Draft Standard A type of network overlay technology that encapsulates L2 frames as UDP packets Speaker: Chet
11
VXLAN: Virtual eXtensible LAN
Speaker: Chet Reading right to left: A full ethernet frame, VXLAN header, full outer UDP/IP packet
12
VXLAN: Virtual eXtensible LAN
VNI – VXLAN Network Identifier 24 bit number (16M+ unique identifiers) Part of the VXLAN Header Similar to VLAN ID Limits broadcast domain VTEP – VXLAN Tunnel End Point Originator and/or terminator of VXLAN tunnel for a specific VNI Outer DIP/Outer SIP Speaker: Chet
13
VXLAN: Virtual eXtensible LAN
Sending a packet ARP table is checked for IP/MAC/Interface mapping L2 FDB is checked to determine IP of destination VTEP for destination MAC on source VTEP Speaker: Chet
14
VXLAN: Virtual eXtensible LAN
Sending a packet Packet is encapsulated for destination VTEP with configured VNI and sent to destination Destination VTEP un-encapsulates the packet and the inner packet is then processed by the receiver Speaker: Chet
15
How do VTEPs handle BUM (Broadcast, Unknown Unicast, Multicast)?
Speaker: Chet
16
BUM All BUM type packets (ex. ARP, DHCP, multicast) are flooded to all VTEPs associated with the same VNI. Flooding can be handled 2 ways Packets are sent to a multicast address that all VTEPs are subscribers of Packets are sent to a central service node that then floods the packets to all VTEPs found in its local DB for the matching VNI Speaker: Chet
17
VXLAN: Virtual eXtensible LAN
Well supported in most modern Linux Distros Linux Kernel 3.10+ Linux uses UDP port 8472 instead of IANA issued 4789 iproute2 3.7+ Configured using ip link command Speaker: Chet
18
How do we use this with OpenStack?
Speaker: Chet
19
nova-network Clients needed L3+VXLAN for their existing nova-network based big data deployments (hadoop). Neutron already supports VXLAN and should work with L3 as well (we didn’t have time to test it). Full VXLAN support in nova-network Unicast VXLAN service node for BUM flooding Speaker: Chet
20
VXLAN Service Node Unicast service for BUM flooding
Eliminates the need for multicast Python based 2 Components VXSND – VXLAN Service Node Daemon VXRD – VXLAN Registration Daemon Will be open sourced in the near future. Speaker: Chet
21
VXSND Listens for VXLAN BUM packets from VTEPs
Learns VTEP and VNI endpoints from BUM packets Relays BUM packets to all known VTEPs for given VNI Supports registration/replication from other VXSND daemons or VXRD Speaker: Chet
22
VXRD Monitors local interfaces on hypervisors
Sends VTEP+VNI registration packet to VXSND node for all local VTEPs. Speaker: Chet
23
Software Gateway We’re still getting in/out of the VXLAN network using a software gateway Lower performance Extra servers All nova-net (or neutron’s l3agent) is doing is configuring VXLANs, bridges and iptables NAT. What if we had a hardware switch that could accelerate these standard Linux network features with an ASIC? Speaker: Chet
24
Cumulus Linux Cumulus Linux Standard Linux Tools
Linux Distribution for HW switches (Debian based) Hardware accelerated Linux kernel forwarding using ASICs Just like a Linux server with 32 40G NICs, but ~100x faster Standard Linux Tools Ifconfig, ip route, iptables, brctl, dnsmasq, etc Speaker: Chet
25
Demo
26
Next Steps (nova-network VXLAN)
Blueprint to add VXLAN support to nova-network Juno coming soon. VXSND/VXRD Update VXRD to monitor netlink for VTEP add/delete Improve concurrency and scalability of VXSND Support for tiered replication (TOR, spine, etc) Goal is to open source the product before Paris summit. Speaker: Chet
27
Next Steps (nova-network on Switches)
Hack: ASIC can’t route in/out of VXLAN tunnel Next gin ASICs can Worked around by looping a cable between two ports Packets take a second trip through the switch Hack: Cumulus Linux doesn’t support NAT I hacked in just enough NAT support for floating IPs =) Limitation: ASIC can only NAT 512 IPs. /23 Next gen ASICs will likely have larger tables Speaker: Chet
28
Q&A
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.