Open vSwitch HW Offload over DPDK
India DPDK Summit, Bangalore, March 2018
Virtual Switch Is Overloaded as We Go Cloud Native
- Bare metal: few apps per server; low mobility; low resource utilization
- Hypervisor virtualization, apps running in VMs: 10s-100s of VMs per server; good mobility; high resource utilization
- OS resource virtualization, apps running in containers: 1000s of containers per server; great portability; high resource utilization; high performance
[Diagram: bare-metal, VM (hypervisor + OVS), and container (Docker Engine + OVS) stacks side by side]
Telco and Cloud Applications Expose OVS Weaknesses
- Low packet performance: vanilla OVS delivers ~0.5 Mpps on a 10G link, roughly 1/80 of bare-metal performance for voice applications
- Low efficiency: many CPU cores must be dedicated to packet processing to reach even a fraction of bare-metal performance
- Latency and packet drops: latency is high and unpredictable; queues build up and packets get dropped, directly affecting user experience for real-time applications
How Do We Solve the Problem? ASAP2: Accelerated Switching and Packet Processing
What does it do?
- Offloads the OVS data plane to the eSwitch in the NIC
- Maintains the SDN control plane
- Leverages standard, open APIs; everything is upstreamed
Key advantages:
- Higher throughput
- Lower, more deterministic latency
- Lower CPU overhead, higher efficiency
Common Operations in Networking
- Most network functions share the same data-path operations:
  - Packet classification (into flows)
  - An action based on the classification result
- Mellanox NICs can offload both the classification and the actions to hardware (see the sketch below)
[Diagram: packets in -> NIC (classification A/B/...N -> action A/B/...N) -> processed packets out]
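As a reference point for what gets offloaded, here is a minimal software sketch of the classify-then-act pattern described above; the types and helpers (flow_key, flow_entry, classify, act) are illustrative, not taken from OVS or this deck. ASAP2 moves both steps into the NIC's eSwitch.

    /* Illustrative sketch of the classify-then-act pattern that ASAP2
     * offloads to the NIC; types and names are hypothetical. */
    #include <stdint.h>
    #include <string.h>

    struct flow_key {                 /* classification fields (e.g. a 5-tuple) */
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
        uint8_t  proto;
    };

    enum flow_action { ACT_FORWARD, ACT_DROP, ACT_ENCAP };

    struct flow_entry {
        struct flow_key  key;         /* assumed zero-initialized, so padding compares equal */
        enum flow_action action;
        uint16_t         out_port;
    };

    /* Classification: find the flow entry matching the packet's key.
     * (A real datapath uses a hash table; a linear scan keeps the sketch short.) */
    static const struct flow_entry *
    classify(const struct flow_entry *table, int n, const struct flow_key *key)
    {
        for (int i = 0; i < n; i++)
            if (memcmp(&table[i].key, key, sizeof(*key)) == 0)
                return &table[i];
        return NULL;                  /* miss: handled by the slow path */
    }

    /* Action: act on the classification result. */
    static void
    act(const struct flow_entry *e /*, packet */)
    {
        switch (e->action) {
        case ACT_FORWARD: /* send to e->out_port */               break;
        case ACT_DROP:    /* free the packet */                   break;
        case ACT_ENCAP:   /* add a tunnel header, then forward */ break;
        }
    }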
Accelerated Virtual Switch (ASAP2-Flex)
Flex HW Acceleration for vSwitch/vRouter
- Offloads some elements of the data path to the NIC, but not the entire data path
- Data still flows through the vSwitch
- Works with para-virtualized VMs (not SR-IOV)
Offload examples:
- Classification offload
  - The application provides a flow spec and a flow ID
  - Classification is done in HW, which attaches the flow ID on a match
  - The vSwitch then classifies based on the flow ID rather than the full flow spec
  - rte_flow is used to configure the classification
- VXLAN encap/decap (see the hedged sketch below)
- VLAN add/remove
- QoS
[Diagram: vSwitch acceleration - para-virt VM over OVS in the hypervisor; TC / DPDK offload data path to the ConnectX-4 eSwitch through the PF]
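To make the VXLAN decap item above concrete, here is a hedged rte_flow sketch that asks the NIC to strip the VXLAN encapsulation and mark the inner packet for the vSwitch. The VXLAN_DECAP action landed in DPDK releases newer than the one this deck targets, so its availability (and whether a fate action such as QUEUE is required alongside it) depends on your DPDK version and PMD.

    /* Hedged sketch: offload VXLAN decap plus flow marking with rte_flow.
     * RTE_FLOW_ACTION_TYPE_VXLAN_DECAP exists only in newer DPDK releases;
     * check your release notes and PMD capabilities. */
    #include <rte_flow.h>

    static struct rte_flow *
    offload_vxlan_decap(uint16_t port_id, uint32_t flow_id,
                        struct rte_flow_error *err)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        /* Match any VXLAN-over-IPv4/UDP packet (NULL spec = wildcard). */
        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },
            { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
            { .type = RTE_FLOW_ITEM_TYPE_UDP },
            { .type = RTE_FLOW_ITEM_TYPE_VXLAN },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        /* Decapsulate in HW, report a flow ID so the vSwitch can skip
         * re-classification, and deliver to a host queue (in Flex mode the
         * packet still traverses the vSwitch). */
        struct rte_flow_action_mark  mark  = { .id = flow_id };
        struct rte_flow_action_queue queue = { .index = 0 };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_VXLAN_DECAP },
            { .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, err);
    }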
HW Classification Offload Concept
- For every OVS flow, the dpif (DP_IF) layer should use DPDK rte_flow to classify, with either a tag (report flow ID) or a drop action
- When a packet is received, the tag ID is used instead of classifying the packet again
Example: OVS sets action Y for flow X
- Add an rte_flow rule that tags flow X with ID 0x1234
- Configure the datapath to apply action Y when mbuf->fdir.id == 0x1234 (see the sketch below)
Example: OVS action drop for flow Z
- Use the rte_flow DROP and COUNT actions to drop and count flow Z
- Use the rte_flow counter to read flow statistics
[Diagram: OVS-vswitchd and the DPDK dpif configure rte_flow rules in the NIC; HW marks flow X packets with ID 0x1234; the PMD delivers mbuf->fdir.id = 0x1234 and the datapath applies OVS action Y]
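A minimal sketch of the "tag flow X with 0x1234" step using the DPDK flow API. The match fields (IPv4 destination plus UDP destination port) and the QUEUE fate action are placeholders; OVS builds the real pattern from the datapath flow key, and implementations typically spread marked traffic with RSS instead.

    /* Sketch: program the NIC to mark packets of "flow X" with ID 0x1234 so
     * the datapath can apply OVS action Y without re-classifying them. */
    #include <rte_flow.h>
    #include <rte_byteorder.h>

    static struct rte_flow *
    offload_flow_x(uint16_t port_id, struct rte_flow_error *err)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        /* Illustrative match: IPv4 dst 10.0.0.1, UDP dst port 53. */
        struct rte_flow_item_ipv4 ip_spec  = { .hdr.dst_addr = RTE_BE32(0x0a000001) };
        struct rte_flow_item_ipv4 ip_mask  = { .hdr.dst_addr = RTE_BE32(0xffffffff) };
        struct rte_flow_item_udp  udp_spec = { .hdr.dst_port = RTE_BE16(53) };
        struct rte_flow_item_udp  udp_mask = { .hdr.dst_port = RTE_BE16(0xffff) };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },  /* any L2 */
            { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ip_spec,  .mask = &ip_mask },
            { .type = RTE_FLOW_ITEM_TYPE_UDP,  .spec = &udp_spec, .mask = &udp_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        /* MARK: the PMD delivers the ID with the mbuf, where the datapath
         * maps 0x1234 -> "apply OVS action Y". */
        struct rte_flow_action_mark  mark  = { .id = 0x1234 };
        struct rte_flow_action_queue queue = { .index = 0 };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, err);
    }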
Flow Tables Overview
- Multiple tables
- Programmable table size
- Programmable table cascading
- Dedicated, isolated tables for the hypervisor and/or VMs
- Practically unlimited table size: millions of rules/flows are supported
Flow Tables – Classification
Key fields (example):
- Ethernet (Layer 2): destination MAC, 2 outer VLANs / priority, Ethertype
- IP (v4/v6): source address, destination address, protocol / next header
- TCP/UDP: source port, destination port
- Flexible field extraction via "Flexparse"
All fields mandated by OpenFlow are covered. (A hedged rte_flow mapping of these fields is sketched below.)
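A hedged sketch of how these key fields map onto rte_flow pattern items (destination MAC, outer VLAN ID, IPv4 source and TCP destination port here; the remaining fields follow the same spec/mask scheme). The exact struct layouts of the ETH and VLAN items have shifted slightly across DPDK releases, so treat the field names as version-dependent.

    /* Hedged sketch: eSwitch classification key fields expressed as
     * rte_flow pattern items. Struct layouts vary slightly by DPDK release. */
    #include <string.h>
    #include <rte_flow.h>
    #include <rte_byteorder.h>

    static void
    build_l2_l4_pattern(struct rte_flow_item pattern[5])
    {
        /* Statics keep the spec/mask storage valid after the function returns. */
        static struct rte_flow_item_eth  eth_spec, eth_mask;
        static struct rte_flow_item_vlan vlan_spec = { .tci = RTE_BE16(100) };    /* VLAN ID 100 */
        static struct rte_flow_item_vlan vlan_mask = { .tci = RTE_BE16(0x0fff) }; /* match VID only */
        static struct rte_flow_item_ipv4 ip_spec   = { .hdr.src_addr = RTE_BE32(0xc0a80101) }; /* 192.168.1.1 */
        static struct rte_flow_item_ipv4 ip_mask   = { .hdr.src_addr = RTE_BE32(0xffffffff) };
        static struct rte_flow_item_tcp  tcp_spec  = { .hdr.dst_port = RTE_BE16(80) };
        static struct rte_flow_item_tcp  tcp_mask  = { .hdr.dst_port = RTE_BE16(0xffff) };

        /* Destination MAC 00:11:22:33:44:55, fully masked. */
        const uint8_t dmac[6] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
        memcpy(eth_spec.dst.addr_bytes, dmac, sizeof(dmac));
        memset(eth_mask.dst.addr_bytes, 0xff, sizeof(dmac));

        pattern[0] = (struct rte_flow_item){ .type = RTE_FLOW_ITEM_TYPE_ETH,  .spec = &eth_spec,  .mask = &eth_mask  };
        pattern[1] = (struct rte_flow_item){ .type = RTE_FLOW_ITEM_TYPE_VLAN, .spec = &vlan_spec, .mask = &vlan_mask };
        pattern[2] = (struct rte_flow_item){ .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ip_spec,   .mask = &ip_mask   };
        pattern[3] = (struct rte_flow_item){ .type = RTE_FLOW_ITEM_TYPE_TCP,  .spec = &tcp_spec,  .mask = &tcp_mask  };
        pattern[4] = (struct rte_flow_item){ .type = RTE_FLOW_ITEM_TYPE_END };
    }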
Flow Tables – Actions
Actions*:
- Steering and forwarding
- Drop / allow
- Counter set
- Send to monitor QP
- Encapsulation / decapsulation
- Report flow ID
Additional actions in newer NICs:
- Header rewrite
- MPLS and NSH encap/decap
- Flexible encap/decap
- Hairpin mode
* Not all combinations are supported
(A hedged drop-and-count sketch follows below.)
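To illustrate the counter and drop actions (and the "drop and count flow Z" case from two slides back), here is a hedged sketch; note that rte_flow_query()'s prototype has shifted slightly between DPDK releases (older ones take an action-type enum rather than an action struct).

    /* Hedged sketch: drop and count a flow in HW, then read the counter
     * for statistics/aging. The match is left wide open for brevity. */
    #include <rte_flow.h>

    static struct rte_flow *
    offload_drop_and_count(uint16_t port_id, struct rte_flow_error *err)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },   /* a real rule matches "flow Z" */
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_COUNT },   /* HW counter */
            { .type = RTE_FLOW_ACTION_TYPE_DROP },    /* drop in the eSwitch */
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, err);
    }

    /* Read hit/byte counters for the rule. */
    static int
    read_flow_stats(uint16_t port_id, struct rte_flow *flow,
                    struct rte_flow_query_count *out, struct rte_flow_error *err)
    {
        const struct rte_flow_action count = { .type = RTE_FLOW_ACTION_TYPE_COUNT };

        return rte_flow_query(port_id, flow, &count, out, err);
    }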
OVS-DPDK Using Flex HW Classification Offload
- For every datapath rule, add an rte_flow rule carrying a flow ID
- The flow-ID cache can hold flow rules well in excess of 1M
- When a received packet carries a flow ID that hits the cache, there is no need to re-classify the packet to find its rule (see the RX-side sketch below)
[Diagram: flow-ID cache in the OVS-DPDK datapath]
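An RX-side sketch of that fast path, assuming hypothetical datapath helpers (flow_id_cache_lookup(), dp_execute_actions(), full_classify_and_execute()) and the classic mbuf mark fields: PKT_RX_FDIR_ID and hash.fdir.hi, which is where the earlier slide's mbuf->fdir.id lives in practice (the flag was renamed in later DPDK releases).

    /* Sketch of the datapath RX loop with the flow-ID cache. The dp_* and
     * flow_id_cache_* helpers are hypothetical stand-ins for OVS-DPDK internals. */
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST 32

    struct dp_flow;   /* hypothetical cached datapath rule */
    const struct dp_flow *flow_id_cache_lookup(uint32_t flow_id);
    void dp_execute_actions(const struct dp_flow *f, struct rte_mbuf *m);
    void full_classify_and_execute(struct rte_mbuf *m);

    static void
    rx_burst_with_mark(uint16_t port_id, uint16_t queue_id)
    {
        struct rte_mbuf *pkts[BURST];
        uint16_t n = rte_eth_rx_burst(port_id, queue_id, pkts, BURST);

        for (uint16_t i = 0; i < n; i++) {
            struct rte_mbuf *m = pkts[i];

            if (m->ol_flags & PKT_RX_FDIR_ID) {
                /* HW already classified the packet: the rte_flow MARK value
                 * is delivered in hash.fdir.hi. */
                const struct dp_flow *f = flow_id_cache_lookup(m->hash.fdir.hi);
                if (f != NULL) {
                    dp_execute_actions(f, m);     /* e.g. OVS "action Y" */
                    continue;
                }
            }
            /* No mark, or a cache miss: fall back to full SW classification. */
            full_classify_and_execute(m);
        }
    }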
Performance
Setup: a single core per PMD, single queue.

Case             #flows   Base (Mpps)   Offload (Mpps)   Improvement
Wire to virtio   1        5.8           8.7              50%
Wire to wire     1        6.9           11.7             70%
Wire to wire     512      4.2           11.2             267%

Code submitted by Yuanhan Liu; planned to be integrated into OVS 2.10.
Full OVS Offload (ASAP2-Direct)
Full HW Offload for vSwitch/vRouter Acceleration
- Offload the whole packet processing pipeline to the embedded switch (eSwitch)
- Split the control plane and the forwarding plane
- Forwarding plane: the embedded switch, removing the cost of the data plane in SW
- SR-IOV based
Representors
- We use VF representors: representor ports are a netdev modeling of the eSwitch ports
- A VF representor supports the following operations:
  - Sending packets from the host CPU to the VF (OVS re-injection)
  - Receiving eSwitch "miss" packets
  - Flow configuration (add/remove)
  - Reading flow statistics (for aging and stats reporting)
- Representor devices are switchdev instances
(A minimal DPDK-side sketch follows below.)
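Since a representor shows up as a regular ethdev port on the DPDK side of this model, the two traffic operations above reduce to ordinary bursts on the representor's queues. A minimal sketch, with port and queue numbering assumed:

    /* Sketch: a VF representor used like any other DPDK ethdev port.
     * TX on the representor re-injects packets toward the VF; RX on the
     * representor delivers eSwitch "miss" packets for SW processing. */
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST 32

    /* Re-inject packets from the host datapath to the VF. */
    static uint16_t
    reinject_to_vf(uint16_t repr_port_id, struct rte_mbuf **pkts, uint16_t n)
    {
        return rte_eth_tx_burst(repr_port_id, 0 /* queue */, pkts, n);
    }

    /* Poll eSwitch miss packets; the slow path handles them and can then
     * install an rte_flow rule so subsequent packets stay in HW. */
    static uint16_t
    poll_miss_packets(uint16_t repr_port_id, struct rte_mbuf **pkts)
    {
        return rte_eth_rx_burst(repr_port_id, 0 /* queue */, pkts, BURST);
    }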
OVS-DPDK with Full HW Offloads
- OVS-DPDK with a direct data path to the VM; switchdev SR-IOV offloads are already implemented in kernel OVS
- Use the DPDK 'slow' path for exception flows or features the HW does not support
- Allow DPDK to use the control and data path of the embedded switch
- Representor ports are exposed over the PF
  - Dedicated RX & TX data-path queues per representor
  - Packets are sent to / received from a VF through its representor
- Offloads: ACLs, steering, routing, encap/decap, flow counters, IPsec
- Co-exists with para-virtualized solutions
- The rte_flow API will be extended to support full HW offload in DPDK 18.05 (a hedged sketch follows below)
[Diagram: para-virt guests (virtio) and a VF guest (VF driver); OVS-DPDK netdev with uplink and VF representor ports (rte_flow / switchdev) over the PF; NIC embedded switch connecting the uplink and VFs]
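A hedged sketch of what a fully offloaded rule can look like once the 18.05 flow API extensions are available: a "transfer" rule that steers matching uplink traffic inside the embedded switch straight to the VF's port, so the packets never touch the host CPU. The transfer attribute and PORT_ID action are the upstream names, but exact availability and semantics depend on the DPDK release and PMD.

    /* Hedged sketch (DPDK 18.05+ flow API): offload a whole flow to the
     * embedded switch. attr.transfer places the rule in the eSwitch domain;
     * PORT_ID forwards matches to the DPDK port backing the VF representor. */
    #include <rte_flow.h>
    #include <rte_byteorder.h>

    static struct rte_flow *
    offload_full_flow(uint16_t uplink_port_id, uint16_t vf_repr_port_id,
                      struct rte_flow_error *err)
    {
        struct rte_flow_attr attr = {
            .ingress  = 1,
            .transfer = 1,                /* rule lives in the embedded switch */
        };

        /* Illustrative match: IPv4 traffic to 10.0.0.2 arriving on the uplink. */
        struct rte_flow_item_ipv4 ip_spec = { .hdr.dst_addr = RTE_BE32(0x0a000002) };
        struct rte_flow_item_ipv4 ip_mask = { .hdr.dst_addr = RTE_BE32(0xffffffff) };
        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },
            { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ip_spec, .mask = &ip_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        /* Forward in HW to the VF (identified by its representor's port ID). */
        struct rte_flow_action_port_id to_vf = { .id = vf_repr_port_id };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &to_vf },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(uplink_port_id, &attr, pattern, actions, err);
    }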
OVS-DPDK With and Without Full HW Offload

Test               Full HW offload   Without offload   Benefit
1 flow, VXLAN      66M PPS           7.6M PPS (VLAN)   8.6x
60K flows, VXLAN   19.8M PPS         1.9M PPS          10.4x