Accelerating Network Intensive Workloads Using the DPDK netdev

Presentation transcript:

Accelerating Network Intensive Workloads Using the DPDK netdev November 2014 OVS Fall Conference 2014 Intel

Legal Disclaimers INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. 
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families: Go to: Learn About Intel® Processor Numbers Intel, the Intel logo, Intel Atom, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. Copyright © 2013 Intel Corporation. All rights reserved

Agenda
Motivation
Architecture
Results
Futures

Motivation
Low packet rate insufficient for packet-processing intensive workloads (e.g. NFV)
Latency & jitter sensitive workloads impacted

Performance Data
[Figure: Ixia 10G Tester and Server. Socket 1 (10 cores): OVS with DPDK, Phy – Phy (Tx Received Pkts). Socket 2 (10 cores): (unused).]
Platform:
  Intel(R) Xeon(R) CPU E5-2680 v2 processor, 2.80GHz, 25M Cache
  Intel(R) C602 Chipset
  DDR3 1600MHz, 8 x dual rank registered ECC 8GB (total 64GB), 4 memory channels per socket Configuration, 1 DIMM per channel
  Operating System: Fedora 20
  Kernel version: 3.15.6-200.fc20.x86_64
  Open vSwitch: 2.3.0-1.fc20.x86_64
  Accelerated Open vSwitch with DPDK-netdev commit id: 0d2cb7087c8d058466bb1f6af2426a27fdd388c3
  Intel(R) DPDK 1.7.0
  IxExplorer 6.60.1000.11 GA
BIOS Settings (Setting, value):
  Enhanced Intel SpeedStep® DISABLED
  Processor C3
  Processor C6
  Intel® Hyper-Threading Technology (HTT)
  Intel® Virtualization Technology ENABLED
  Intel® Virtualization Technology for Directed I/O (VT-d)
  MLC Streamer
  MLC Spatial Prefetcher
  DCU Data Prefetcher
  DCU Instruction Prefetcher
  Direct Cache Access (DCA)
  CPU Power and Performance Policy: Performance
  Memory Power Optimization: Performance Optimized
  Intel® Turbo boost OFF
  Memory RAS and Performance Configuration -> NUMA Optimized
Results will vary depending on software, workloads and system configuration

Available at openvswitch.org (https://github.com/openvswitch/ovs)
Version of Open vSwitch integrating DPDK available as of 3/19/14
To be released in ver 2.4 of openvswitch
Minimal architectural changes through use of additional “netdev” interface
Used in conjunction with User Space Open vSwitch module (Kernel switch not used)
User space switch reworked by VMware to optimize for performance
Permissive license in User Space Switch
Currently Supports in Mainline Git or Patches:
  Full match / action set
  DPDK Physical Ports
  VM – VM, VM – Physical Port
  DPDK ivShmem Ports
  DPDK vHost Ports
  L3 tunneling (VXLAN) support
  Metering
[Figure: OVS Daemon in Linux User Space (ofproto, ofproto-dpif, netdev, netdev provider, dpif provider) above the User Space Switch and the DPDK Framework (Rx/Tx) with Physical I/O; Data Plane Switch in Linux Kernel Space.]
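The point about needing only an additional “netdev” interface can be pictured with a small sketch. This is an illustration of the provider idea only, not the Open vSwitch code (which is written in C): the class names `NetdevProvider` and `QueueNetdev`, the method names `rx_burst` and `tx_burst`, and the function `forward_once` are all hypothetical. The forwarding logic depends only on the interface, so a new backend such as a DPDK poll-mode port can be added without touching it.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, List


class NetdevProvider(ABC):
    """Hypothetical port interface: the switch only ever calls these two methods."""

    @abstractmethod
    def rx_burst(self, max_packets: int) -> List[bytes]:
        """Return up to max_packets received frames (possibly none)."""

    @abstractmethod
    def tx_burst(self, packets: List[bytes]) -> int:
        """Send frames and return how many were accepted."""


class QueueNetdev(NetdevProvider):
    """Stand-in backend using in-memory queues instead of a real NIC or socket."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rx_queue: Deque[bytes] = deque()
        self.tx_queue: Deque[bytes] = deque()

    def rx_burst(self, max_packets: int) -> List[bytes]:
        burst: List[bytes] = []
        while self.rx_queue and len(burst) < max_packets:
            burst.append(self.rx_queue.popleft())
        return burst

    def tx_burst(self, packets: List[bytes]) -> int:
        self.tx_queue.extend(packets)
        return len(packets)


def forward_once(src: NetdevProvider, dst: NetdevProvider, burst_size: int = 32) -> int:
    """Move one burst from src to dst; knows nothing about the backend type."""
    return dst.tx_burst(src.rx_burst(burst_size))
```

Under these assumptions, swapping `QueueNetdev` for another provider changes where frames come from, while `forward_once` stays the same.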

Open vSwitch® with DPDK Architectural Approach
[Figure: SDN Controller connects over ovsdb and OF to the ovsdb server and ovs-switchd. In User Space, ovs-switchd with the DPDK Libraries performs User Space Forwarding through the DPDK netdev (PMD), IVSHMEM (shmem), vHost, Tunnels, and netdev (TAP, socket), serving qemu VMs (virtio, DPDK) and an External VM. In Kernel Space: ovs kernel module, kernel packet processing, NIC.]

Results
Near 10G line rate
50x lower latency for small packets
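To put "10G line rate" in numbers, the arithmetic below may help. It uses the standard 20 bytes of per-frame wire overhead on Ethernet (7-byte preamble, 1-byte start delimiter, 12-byte inter-frame gap) and the 2.80 GHz clock from the test platform. The frame sizes and the idea of a single core handling every packet are assumptions for illustration, not figures from the talk.

```python
# Per-frame overhead on the wire: preamble (7) + start delimiter (1)
# + inter-frame gap (12) = 20 bytes, in addition to the frame itself.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second for a frame size that includes the FCS."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(cpu_hz: float, pps: float) -> float:
    """CPU cycle budget per packet if one core had to keep up alone (assumption)."""
    return cpu_hz / pps


if __name__ == "__main__":
    for size in (64, 128, 256, 512, 1024, 1518):
        pps = line_rate_pps(10e9, size)
        budget = cycles_per_packet(2.8e9, pps)
        print(f"{size:5d} B: {pps / 1e6:6.2f} Mpps, {budget:8.1f} cycles/packet")
```

For 64-byte frames this works out to roughly 14.88 million packets per second, which leaves a budget of under 200 cycles per packet on one 2.8 GHz core.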

Futures
Desire to see userspace OVS become a first class data plane
DPDK netdev bypasses the kernel, meaning some loss of functionality
Would both userspace and kernel space paths be useful?
  The Bifurcated Driver for DPDK uses hardware classification to put frames either through the kernel path, through the DPDK netdev, or into a virtual function
What are the major gaps in the userspace pipeline?
  Userspace Packet Filtering would allow functions such as security, ACLs, NAT or deep packet inspection to happen after a frame is pulled into userspace over the DPDK netdev
  A Userspace Connection Tracker would enable applications needing stateful flow tracking
These enhancements don’t necessarily need to be part of OVS, just accessible and efficient in userspace for OVS to use
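As a rough picture of what stateful flow tracking in userspace involves, here is a minimal sketch. It is not a design from the talk or from OVS: the `ConnTracker` class, the two states "new" and "established", the 5-tuple key layout, and the idle timeout are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port, proto)
FlowKey = Tuple[str, int, str, int, str]


@dataclass
class ConnEntry:
    state: str              # "new" until a reply is seen, then "established"
    packets: int = 0
    last_seen: float = 0.0


class ConnTracker:
    """Toy connection table keyed by the originator's 5-tuple."""

    def __init__(self, idle_timeout: float = 30.0) -> None:
        self.idle_timeout = idle_timeout
        self.table: Dict[FlowKey, ConnEntry] = {}

    def track(self, key: FlowKey, now: float) -> str:
        """Record one packet and return the connection state after it."""
        reverse: FlowKey = (key[2], key[3], key[0], key[1], key[4])
        entry = self.table.get(key)
        if entry is None and reverse in self.table:
            # Packet travels in the reply direction of a known connection.
            entry = self.table[reverse]
            entry.state = "established"
        if entry is None:
            entry = ConnEntry(state="new")
            self.table[key] = entry
        entry.packets += 1
        entry.last_seen = now
        return entry.state

    def expire(self, now: float) -> int:
        """Drop connections idle longer than the timeout; return how many."""
        stale = [k for k, e in self.table.items()
                 if now - e.last_seen > self.idle_timeout]
        for k in stale:
            del self.table[k]
        return len(stale)
```

A packet filter placed after the DPDK netdev could consult the returned state, for example to admit reply traffic only for connections that were opened from the inside.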

Bifurcated Driver
Flow-based classification to be accelerated by hardware.
Finer Granularity control versus SR-IOV
[Figure: management interface and data path. User: Application, ethtool, ip, nft, OVS, DPDK netdev. Kernel: OVS-kernel, net-filter, ip (route), driver. Kernel-bypass/zero-copy path into the DPDK netdev. Hardware Classification below the driver.]
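The steering decision made by the bifurcated driver can be modelled in software as a first-match rule table. The sketch below only simulates the idea; in the design described here the classification is done by NIC hardware. The `FlowRule` fields, the `Path` names, and the choice of the kernel path as the default for unmatched frames are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Path(Enum):
    KERNEL = "kernel"
    DPDK_NETDEV = "dpdk-netdev"
    VIRTUAL_FUNCTION = "vf"


@dataclass(frozen=True)
class Frame:
    dst_ip: str
    proto: str
    dst_port: int


@dataclass(frozen=True)
class FlowRule:
    """Hypothetical rule: a field left as None acts as a wildcard."""
    target: Path
    dst_ip: Optional[str] = None
    proto: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, frame: Frame) -> bool:
        pairs = (
            (self.dst_ip, frame.dst_ip),
            (self.proto, frame.proto),
            (self.dst_port, frame.dst_port),
        )
        return all(want is None or want == got for want, got in pairs)


def classify(frame: Frame, rules: List[FlowRule],
             default: Path = Path.KERNEL) -> Path:
    """Return the path of the first matching rule, else the default (assumed kernel)."""
    for rule in rules:
        if rule.matches(frame):
            return rule.target
    return default


if __name__ == "__main__":
    rules = [
        FlowRule(Path.DPDK_NETDEV, proto="udp", dst_port=4789),  # e.g. VXLAN traffic
        FlowRule(Path.VIRTUAL_FUNCTION, dst_ip="192.0.2.10"),
    ]
    print(classify(Frame("198.51.100.7", "udp", 4789), rules))
    print(classify(Frame("192.0.2.10", "tcp", 443), rules))
    print(classify(Frame("198.51.100.7", "tcp", 22), rules))
```

Per-flow rules of this kind are what give finer control than SR-IOV, where traffic is split per virtual function rather than per flow.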

Desirable Augmentation to Data Plane
[Figure: the Open vSwitch with DPDK architecture (SDN Controller, ovsdb server, ovs-switchd with DPDK Libraries, User Space Forwarding, DPDK netdev PMD, IVSHMEM, vHost, Tunnels, netdev TAP/socket, qemu VMs, External VM, ovs kernel module, kernel packet processing, NIC) with a Packet Filter and Conn Tracker added in User Space. Annotation: Must handle high packet rates.]

Summary
The DPDK netdev greatly increases packet receive
Bypasses kernel, meaning some loss in functionality
Time to consider putting high performance packet processing in userspace
Can use the bifurcated driver to have a fast lane and an ‘every kernel filter applied’ lane
Long term approach is to move more functionality into userspace
Feedback on architecture, code and additional benchmark tests is appreciated
