1
Open vSwitch HW offload over DPDK
India DPDK Summit - Bangalore March-2018
2
Virtual Switch is Overloaded as We Go Cloud Native
Bare Metal
- Few apps per server
- Low mobility
- Low resource utilization
Virtualization, App Running in VM
- 10s-100s VM per server
- Good mobility
- High resource utilization
OS Resource Virtualization, App Running in Container
- 1000s containers per server
- Great portability
- High resource utilization
- High performance
Diagram: three stacks side by side. Bare metal: application on OS on infrastructure. VM: apps on guest OSes on a hypervisor with OVS, on infrastructure. Container: apps on Docker Engine with OVS, on the host OS, on infrastructure.
3
Telco and Cloud Applications Expose OVS Weaknesses
Low packet performance
- Vanilla OVS delivering ~0.5 Mpps on 10G link
- ~1/80 of bare metal performance for voice applications
Low efficiency
- Many CPU cores need to be dedicated to packet processing to achieve a fraction of bare metal performance
Latency & packet drop
- High and unpredictable latency
- Queues build up, and packets can be dropped
- Directly affecting user experience for real time applications
4
How do We Solve the Problem? - ASAP2
Accelerated Switching and Packet Processing
What does it do?
- Offload OVS data plane to eSwitch in the NIC
- Maintain SDN control plane
- Leverage standard open API
- Everything up-streamed
Key advantages
- Higher throughput
- Lower, more deterministic latency
- Lower CPU overhead, higher efficiency
5
Common Operations in Networking
Most network functions share some data-path operations
- Packet classification (into flows)
- Action based on the classification result
Mellanox NIC has the capability to offload both the classification and the actions in hardware
Diagram: Packets In -> NIC (Classification A -> Action A, Classification B -> Action B, ... Classification N -> Action N) -> Processed Packets Out
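The classify-then-act pattern on this slide can be sketched in a few lines of Python. This is only an illustrative software model, not the NIC's implementation: the `Rule` type, the dictionary packet representation, and the `set_vlan` / `drop` helpers are all hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

Packet = dict  # hypothetical packet model: header field name -> value


@dataclass
class Rule:
    match: dict                                     # fields that must all equal the packet's
    action: Callable[[Packet], Optional[Packet]]    # returns None to drop the packet


def classify(rules: list[Rule], pkt: Packet) -> Optional[Rule]:
    """Return the first rule whose match fields all agree with the packet."""
    for rule in rules:
        if all(pkt.get(k) == v for k, v in rule.match.items()):
            return rule
    return None


def process(rules: list[Rule], pkt: Packet) -> Optional[Packet]:
    """Classify the packet, then apply the action of the matching rule."""
    rule = classify(rules, pkt)
    if rule is None:
        return pkt  # no rule matched: pass the packet through unchanged
    return rule.action(pkt)


def set_vlan(vid: int) -> Callable[[Packet], Packet]:
    """Illustrative action: return a copy of the packet with a VLAN ID set."""
    return lambda pkt: {**pkt, "vlan": vid}


def drop(pkt: Packet) -> None:
    """Illustrative action: drop the packet."""
    return None


rules = [
    Rule({"dst_port": 80}, set_vlan(100)),   # "Classification A -> Action A"
    Rule({"ip_proto": "udp"}, drop),         # "Classification B -> Action B"
]

out_a = process(rules, {"dst_port": 80, "ip_proto": "tcp"})
out_b = process(rules, {"dst_port": 53, "ip_proto": "udp"})
out_c = process(rules, {"dst_port": 22, "ip_proto": "tcp"})
```

The point of the offload is that both the `classify` loop and the `action` call move into NIC hardware, so the host CPU only sees the processed packets.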
6
Accelerated Virtual Switch (ASAP2-Flex)
7
Flex HW Acceleration for vSwitch/vRouter
Offload some elements of the data-path to the NIC, but not the entire data-path
- Data will still flow via the vSwitch
- Para-virtualized VM (not SR-IOV)
Offloads (examples)
- Classification offload
  - Application provides flow spec and flow ID
  - Classification is done in HW, which attaches the flow ID in case of a match
  - vSwitch classifies based on the flow ID rather than the full flow spec
  - rte_flow is used to configure the classification
- VxLAN encap/decap
- VLAN add/remove
- QoS
Diagram (vSwitch acceleration): para-virtualized (PV) VM; OVS in the hypervisor with TC / DPDK offload; data path through the PF and the ConnectX-4 eSwitch.
8
HW classification offload concept
For every OVS flow, DP_IF should use the DPDK rte_flow to classify with action tag (report ID) or drop.
When a packet is received, use the tag ID instead of classifying the packet again.
For example:
- OVS sets action Y for flow X
  - Add an rte_flow to tag flow X with ID 0x1234
  - Configure the datapath to do action Y for mbuf->fdir.id = 0x1234
- OVS action drop for flow Z
  - Use the rte_flow DROP and COUNT actions to drop and count flow Z
  - Use the rte_flow counter to get flow statistics
Diagram: OVS-vswitchd configures flows through DP_IF - DPDK and rte_flow; the NIC hardware marks flow X with ID 0x1234; the PMD delivers packets carrying mbuf->fdir.id = 0x1234 to the user-space OVS datapath, which does OVS action Y.
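The flow X / flow Z example can be modelled in plain Python to show the division of labour. This is a toy simulation, not DPDK code: `hw_rules`, `nic_receive`, `datapath_receive` and the packet fields are hypothetical, and the match values are made up for illustration. In DPDK releases of that period, the value set by the rte_flow MARK action is reported to software in `mbuf->hash.fdir.hi` when the `PKT_RX_FDIR_ID` flag is set, which is what the slide abbreviates as `mbuf->fdir.id`.

```python
from collections import Counter

# Hypothetical stand-in for rules programmed through rte_flow:
# each entry is (match fields, hardware action).
hw_rules = [
    ({"dst_ip": "10.0.0.1", "dst_port": 80}, ("MARK", 0x1234)),   # flow X
    ({"dst_ip": "10.0.0.2"}, ("DROP_COUNT", "flow_z")),           # flow Z
]
hw_counters = Counter()   # models the rte_flow COUNT action

# Software datapath table: mark ID -> OVS action
actions_by_mark = {0x1234: "action_Y"}


def nic_receive(pkt):
    """Model of the NIC: return (pkt, mark), or None if dropped in hardware."""
    for match, (kind, arg) in hw_rules:
        if all(pkt.get(k) == v for k, v in match.items()):
            if kind == "DROP_COUNT":
                hw_counters[arg] += 1
                return None
            return pkt, arg          # MARK: packet arrives tagged with the ID
    return pkt, None                 # no hardware rule matched: untagged


def datapath_receive(pkt):
    """Model of the software datapath: use the mark if the NIC supplied one."""
    result = nic_receive(pkt)
    if result is None:
        return "dropped_in_hw"
    pkt, mark = result
    if mark is not None:
        return actions_by_mark[mark]     # no re-classification needed
    return "software_classify"           # fall back to the normal lookup


res_x = datapath_receive({"dst_ip": "10.0.0.1", "dst_port": 80})
res_z = datapath_receive({"dst_ip": "10.0.0.2", "dst_port": 443})
res_other = datapath_receive({"dst_ip": "10.0.0.9", "dst_port": 22})
```

Dropped packets never reach the host in this model, so the only way software learns about flow Z traffic is by reading the hardware counter, as the slide describes.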
9
Flow Tables Overview
- Multiple tables
- Programmable table size
- Programmable table cascading
- Dedicated, isolated tables for hypervisor and/or VMs
- Practically unlimited table size
- Can support millions of rules/flows
10
Flow Tables – Classification
Key fields example
- Ethernet Layer 2
  - Destination MAC
  - 2 outer VLANs / priority
  - Ethertype
- IP (v4 / v6)
  - Source address
  - Destination address
  - Protocol / Next header
- TCP / UDP
  - Source port
  - Destination port
- Flexible fields extraction by "Flexparse"
- All fields mandatory by OpenFlow
11
Flow Tables – Actions
Actions*
- Steering and Forwarding
- Drop / Allow
- Counter set
- Send to Monitor QP
- Encapsulation
- Decapsulation
- Report Flow ID
Additional actions in newer NICs
- Header rewrite
- MPLS and NSH encap/decap
- Flexible encap/decap
- Hairpin mode
* Not all combinations are supported
12
OVS-DPDK using Flex HW classification offload
- For every datapath rule we add an rte_flow with a flow ID
- The flow ID cache can contain more than 1M flow rules
- When a received packet matches a flow ID in the cache, there is no need to re-classify the packet to get the rule
Diagram: flow ID cache
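A minimal sketch of such a flow ID cache follows, assuming flow IDs are allocated as array indices and that packets without a hardware-supplied ID fall back to a full lookup. The `FlowIdCache` class and its methods are hypothetical, and the exact-match dictionary used on the slow path is a simplification of the real OVS classifier, which supports wildcarded rules.

```python
class FlowIdCache:
    """Toy model: the flow ID reported by hardware indexes straight into the rule table."""

    def __init__(self):
        self.rules = []     # flow ID -> datapath rule (the list index is the ID)
        self.by_key = {}    # full flow key -> flow ID, used only on the slow path
        self.hits = 0
        self.misses = 0

    def install(self, key, rule):
        """Register a datapath rule and return the flow ID to program in hardware."""
        flow_id = len(self.rules)
        self.rules.append(rule)
        self.by_key[key] = flow_id
        return flow_id

    def lookup(self, pkt_key, hw_flow_id=None):
        """Return the rule for a packet, preferring the hardware-supplied flow ID."""
        if hw_flow_id is not None and hw_flow_id < len(self.rules):
            self.hits += 1
            return self.rules[hw_flow_id]        # O(1), no re-classification
        self.misses += 1
        flow_id = self.by_key.get(pkt_key)       # full classification fallback
        return None if flow_id is None else self.rules[flow_id]


cache = FlowIdCache()
key = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)      # made-up 5-tuple
fid = cache.install(key, "output:2")

r_marked = cache.lookup(key, hw_flow_id=fid)     # packet arrived with a flow ID
r_unmarked = cache.lookup(key)                   # packet arrived without one
r_unknown = cache.lookup(("1.1.1.1", "2.2.2.2", 17, 1, 2))
```

The saving comes from replacing a multi-field match with a single array index whenever the NIC has already identified the flow.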
13
Performance

Case            #flows  Base (Mpps)  Offload (Mpps)  Improvement
Wire to virtio  1       5.8          8.7             50%
Wire to wire            6.9          11.7            70%
                512     4.2          11.2            267%

Code submitted by Yuanhan Liu. Planned for integration into OVS 2.10.
Single core for each PMD, single queue,
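The improvement column can be checked with a short calculation. The sketch below assumes "improvement" means relative gain over the baseline, (offload - base) / base; the slide does not state its formula, and the row labels are just positional. Under that assumption the first two rows reproduce the slide's 50% and 70%, while the 512-flow row works out to roughly 167%; the slide's 267% matches the offload/base ratio (about 2.67x) expressed as a percentage.

```python
# (base Mpps, offload Mpps) pairs, in the order they appear in the table
rows = [
    ("row 1", 5.8, 8.7),
    ("row 2", 6.9, 11.7),
    ("row 3 (512 flows)", 4.2, 11.2),
]


def relative_gain_pct(base, offload):
    """Gain over the baseline, as a percentage of the baseline."""
    return (offload - base) / base * 100.0


def ratio_pct(base, offload):
    """Offloaded rate as a percentage of the baseline rate."""
    return offload / base * 100.0


gains = [round(relative_gain_pct(b, o)) for _, b, o in rows]
ratios = [round(ratio_pct(b, o)) for _, b, o in rows]

for (name, base, off), gain in zip(rows, gains):
    print(f"{name}: {base} -> {off} Mpps, gain {gain}%, ratio {off / base:.2f}x")
```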
14
Full OVS Offload (ASAP2-Direct)
15
Full HW Offload for vSwitch/vRouter acceleration
- Offload the whole packet processing onto the embedded switch
- Split control plane and forwarding plane
- Forwarding plane: use the embedded switch
- Remove the cost of the dataplane in SW
- SR-IOV based
16
Representors
- We use VF representors
- Representor ports are a netdev modeling of eSwitch ports
- The VF representor supports the following operations
  - Send packet from the host CPU to VF (OVS re-injection)
  - Receive eSwitch "miss" packets
  - Flow configuration (add/remove)
  - Flow statistics read for the purposes of aging and statistics
- The representor devices are switchdev instances
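The four representor operations fit together roughly as in the toy model below. It is a sketch under simplifying assumptions, not the driver's behaviour: `ESwitch`, `SoftwareSwitch`, the string flow keys and the aging rule (expire a flow whose hardware packet counter has not advanced between two polls) are all hypothetical.

```python
class ESwitch:
    """Toy model of the NIC embedded switch; all names are illustrative."""

    def __init__(self):
        self.flows = {}   # flow key -> {"out": port, "packets": count}

    def add_flow(self, key, out_port):
        self.flows[key] = {"out": out_port, "packets": 0}

    def remove_flow(self, key):
        self.flows.pop(key, None)

    def read_stats(self, key):
        return self.flows[key]["packets"]

    def receive(self, key):
        entry = self.flows.get(key)
        if entry is None:
            return ("miss", None)        # packet is handed to the representor
        entry["packets"] += 1
        return ("forwarded", entry["out"])


class SoftwareSwitch:
    """Stands in for OVS sitting on the representor ports."""

    def __init__(self, eswitch, policy):
        self.eswitch = eswitch
        self.policy = policy             # flow key -> output port
        self.slow_path_packets = 0

    def handle(self, key):
        verdict, port = self.eswitch.receive(key)
        if verdict == "forwarded":
            return port                  # handled entirely in hardware
        # Miss: decide in software, re-inject, and offload the flow.
        self.slow_path_packets += 1
        port = self.policy[key]
        self.eswitch.add_flow(key, port)
        return port

    def age_out(self, last_seen):
        """Remove flows whose hardware counter did not advance since the last poll."""
        expired = []
        for key in list(self.eswitch.flows):
            count = self.eswitch.read_stats(key)
            if last_seen.get(key) == count:
                self.eswitch.remove_flow(key)
                expired.append(key)
            else:
                last_seen[key] = count
        return expired


esw = ESwitch()
sw = SoftwareSwitch(esw, {"vm1->vm2": "vf2"})
ports = [sw.handle("vm1->vm2") for _ in range(3)]
hw_packets = esw.read_stats("vm1->vm2")

seen = {}
first_poll = sw.age_out(seen)    # records the counter, nothing expires
second_poll = sw.age_out(seen)   # counter unchanged, flow expires
```

Only the first packet of the flow takes the software path here; the following packets are counted and forwarded by the model eSwitch.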
17
OVS-DPDK with full HW offloads
- OVS-DPDK with a direct data path to the VMs
- switchdev SR-IOV offloads already implemented in kernel OVS
- Use DPDK 'slow' path for exception flows or unsupported HW features
- Allow DPDK to use the control and data path of the embedded switch
  - Representor ports are exposed over the PF
  - Data path RX & TX queues per representor
  - Sending/receiving packets to/from a VF is done through its representor
- ACL, steering, routing, encap/decap, flow counters, IPSec
- Co-exists with para-virt solutions
- rte_flow API will be extended to support full HW offload in DPDK 18.05
Diagram: two PV guests (virtio) and one VF guest (VF driver); OVS-DPDK with netdevs for the uplink and the VF representor; rte_flow / switchdev over the PF; NIC embedded switch connecting PF, VF and uplink.
18
OVS DPDK with/without full HW offload
Test             Full HW offload  Without offload  Benefit
1 Flow VXLAN     66M PPS          7.6M PPS (VLAN)  8.6X
60K flows VXLAN  19.8M PPS        1.9M PPS         10.4X