Dataplane Performance, Capacity, and Benchmarking in OPNFV

Trevor Cooper, Intel Corp.
Sridhar Rao, Spirent Communications
Al Morton, AT&T Labs
… with acknowledgement to VSPERF committers

Agenda
- Dataplane Performance Measurement with VSPERF
- VSPERF Example Results and Analysis
- Moving Ahead with VSPERF

E2E Dataplane Performance Measurement & Analysis

- User-Application Quality-of-Service: coverage, speed, accuracy, reliability, scalability
- Service phases: activation, operation, de-activation
- Service Performance Indicators
- Network SLA: capacity / BW, loss, delay, delay variation
- Network Performance Metrics & Statistics

The communications networks deploying NFV are wide-area networks, composed today of NFV Infrastructure systems with VNFs alongside PNFs (routers and switches). Each user application has a set of performance requirements (SLA) that must be met to deliver high-quality service, in terms of loss, delay, delay variation, and capacity or bandwidth. Every component in the E2E path contributes to the performance delivered. The measurement data (metrics and statistics) has to be interpreted correctly, or it could be deceiving, and this requires an understanding of the measurement methods as well as the application (or use case). This is the telco context to keep in mind when looking at the dataplane and service aspects: benchmarking of a vSwitch has to be informed by the same kinds of metrics as the network delivering the E2E service.
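The remark that every component in the E2E path contributes to the delivered performance can be illustrated with a minimal sketch. It assumes (this is not stated in the slides) that mean one-way delays add along the path and that loss at each hop is independent, so per-hop delivery ratios multiply. The hop names and values are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Hop:
    """One component in the end-to-end path (hypothetical values)."""
    name: str
    mean_delay_us: float   # mean one-way delay added by this hop
    loss_ratio: float      # fraction of frames lost at this hop (0..1)


def e2e_delay_us(path):
    # Assumption: mean delays are additive along the path.
    return sum(h.mean_delay_us for h in path)


def e2e_loss_ratio(path):
    # Assumption: losses are independent, so delivery ratios multiply.
    return 1.0 - prod(1.0 - h.loss_ratio for h in path)


path = [
    Hop("physical router (PNF)", 20.0, 0.0),
    Hop("vSwitch", 30.0, 0.0001),
    Hop("VNF", 50.0, 0.0001),
]
print(f"E2E mean delay: {e2e_delay_us(path):.1f} us")
print(f"E2E loss ratio: {e2e_loss_ratio(path):.6f}")
```

Under these assumptions a vSwitch that looks acceptable in isolation still consumes part of the E2E delay and loss budget, which is why the same metrics are used at both levels.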

VSPERF DUT is an important part of the E2E Data Path

DUT elements:
- Virtual switching technology and NIC offloads
- Physical and virtual ports
- Virtualized workload

VSPERF test automation:
- Source/build SW components
- Set up vSwitch
- Set up workload
- Set up traffic generator
- Execute test cases
- Collect test results
- Log and store data
- Generate test statistics & result dashboards / reports

The DUT is part of the E2E data path, which is one reason we have very demanding performance expectations for the vSwitch technologies we measure in the lab. We need to carefully characterize the dataplane performance in all the dimensions that will be important when considering the E2E performance in the context of designing and operating the new generation of networks. We’ll see many examples of performance benchmarking, using …
- Focus on different deployment scenarios (-> topologies) that VSPERF helps measure.
- Context: the data path includes physical and virtual interfaces, the network topology, and the virtualized workload.
- Physical ports provide a clear separation from the traffic generator, which is representative of real deployments.

VSPERF and the OPNFV Testing Community

Dataplane Performance Testing Options

Workload (DUT)
- Sample VNFs: SampleVNF (vACL, vFW, vCGNAT, …), open source VNF catalogue
- Test VMs: vloop-vnf (dpdk-testpmd, Linux bridge, L2fwd module), Spirent stress-VM, Virtual Traffic classifier
- Virtual switching: OVS, OVS-dpdk, VPP
- Physical / virtual interfaces: NIC (10GE, 40GE, …), Vhost-user, Pass-through, SR-IOV
- HW offload: TSO, encrypt/decrypt, SmartNIC

Traffic Generator (HW or SW)
- Hardware, commercial: Ixia, Spirent, Xena
- Virtual, commercial
- Software, open source: Pktgen, Moongen, TREX, PROX

Automation code (Test framework)
- Compliance: Dovetail
- VIM and MANO: NFVbench
- VIM, no MANO: Yardstick, Qtip, Bottlenecks
- No VIM or MANO: VSPERF, Storperf

*Used in test examples presented

Daily tests on master and stable branch in OPNFV Lab: https://build.opnfv.org/ci/view/vswitchperf/

Solution Stack
- Specifications: IETF BMWG RFCs for Dataplane Performance, ETSI NFV Test Specifications
- Topologies (vSwitch, SR-IOV, etc.): Phy2Phy, PVP, PVVP (multi-VM)

VSPERF Example Results and Analysis

Results and analysis from recent tests, using VSPERF to analyse:
- OVS and VPP
- Traffic generators
- Impact of noisy neighbor
- Back2Back frame testing with CI

Virtual Switches in VSPERF: OVS and VPP
Setup: RFC2544, Phy2Phy; OVS 2.6.90, VPP 17.01, DPDK 16.07.0

The typical vSwitch baseline test configuration is “phy to phy” with 64B IPv4 packets and single or multiple bi-directional flows.

Latency:
- Average latency for OVS and VPP varies from 10 to 90 us, with minimal (1-9%) difference between them.
- Average latency jumps significantly above 128B.
- For multi-stream, the latency ranges are: min 2-30 us, avg 5-110 us.
- Results for 256B are inconsistent between OVS and VPP.
- A jump in latency at larger packet sizes is seen in almost all cases.

Throughput:
- For both OVS and VPP (64B, 1 flow, bi-directional), throughput is ~80% of line rate. The NIC has known processing limits that could be the bottleneck.
- For uni-directional traffic, line rate is achieved at 64B.
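To put "~80% of line rate" into frames per second, the theoretical maximum frame rate of an Ethernet link can be computed from the frame size plus the fixed 20 bytes of per-frame overhead (preamble, start-of-frame delimiter, inter-frame gap). The sketch below assumes 10GE ports, which the slide does not state; the frame sizes are the ones commonly used in RFC 2544 testing.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 7B preamble + 1B SFD + 12B inter-frame gap


def max_frame_rate(link_bps, frame_bytes):
    """Theoretical maximum frames per second on an Ethernet link."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


LINK_BPS = 10e9  # assumption: 10GE ports; the slide does not state the NIC speed

for size in (64, 128, 256, 512, 1024, 1280, 1518):
    fps = max_frame_rate(LINK_BPS, size)
    print(f"{size:>5} B: {fps / 1e6:6.2f} Mfps line rate, "
          f"{0.8 * fps / 1e6:6.2f} Mfps at 80%")
```

For 64B frames on 10GE this gives about 14.88 million frames per second per direction, so 80% of line rate corresponds to roughly 11.9 Mfps.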

Virtual Switches in VSPERF: OVS and VPP
Setup: RFC2544, Phy2Phy; OVS 2.6.90, VPP 17.01, DPDK 16.07.0

For multi-stream tests at 64B and 128B, VPP throughput can be up to 70% higher than OVS. But …

Inconsistencies:
- OVS: throughput with 4K flows is lower than with 1M flows
- Traffic generator results differ

Possible reasons:
- Packet-handling architectures
- Packet construction variation
- Test traffic is fixed size

Analysis of cache misses [SNAP monitoring tool]: the cache-miss count for VPP is 6% lower than for OVS. Requires further analysis!

Lessons Learned – OVS and VPP

Simple performance test cases (#flows + pps) may not provide meaningful comparisons.

Example: EANTC Validates Cisco's Network Functions Virtualization (NFV) Infrastructure (Oct 2015)
- Test case topology is VM to VM …
- 0.001% packet loss accepted …
- Pass-through connects physical interfaces to VNF …
- VPP and OVS use a “single core” …
- Software versions: OVS-dpdk 2.4.0, DPDK 2.0, QEMU 2.2.1 …
- Processor E5-2698 v3 (Haswell, 16 physical cores), network adaptor X520-DA2

Results are use-case dependent:
- Topology and encapsulation impact workloads under the hood
- Realistic and more complex tests (beyond L2) may impact results significantly
- Measurement methods (searching for max) may impact results
- The DUT always has multiple configuration dimensions
- Hardware and/or software components can limit performance (but this may not be obvious)

Metrics and statistics can be deceiving without proper consideration of the points above!
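The point that the measurement method (searching for the maximum) and the accepted loss can change the reported result can be shown with a small sketch of an RFC 2544 style binary search. This is an illustration under stated assumptions, not the VSPERF or traffic-generator implementation: `measure_loss` stands in for one trial at a given offered rate, and the toy DUT model, the 10GE link speed, and the 1% tolerance (chosen so the effect is visible; the EANTC example accepted 0.001%) are all hypothetical.

```python
def rfc2544_style_search(measure_loss, line_rate_fps,
                         loss_tolerance=0.0, resolution=0.001):
    """Binary search for the highest offered rate whose loss ratio
    stays within loss_tolerance.

    measure_loss(rate_fps) -> loss ratio observed in one trial (0..1).
    resolution is the stopping width as a fraction of line rate.
    """
    low, high = 0.0, line_rate_fps
    best = 0.0
    while (high - low) > resolution * line_rate_fps:
        rate = (low + high) / 2
        if measure_loss(rate) <= loss_tolerance:
            best, low = rate, rate      # trial passed: search higher
        else:
            high = rate                 # trial failed: search lower
    return best


def make_toy_dut(capacity_fps):
    """Hypothetical DUT: forwards up to a fixed capacity, drops the rest."""
    def measure_loss(rate_fps):
        if rate_fps <= capacity_fps:
            return 0.0
        return (rate_fps - capacity_fps) / rate_fps
    return measure_loss


LINE_RATE = 14_880_952            # 64B frames on 10GE (assumed link speed)
dut = make_toy_dut(0.8 * LINE_RATE)

strict = rfc2544_style_search(dut, LINE_RATE, loss_tolerance=0.0)
relaxed = rfc2544_style_search(dut, LINE_RATE, loss_tolerance=0.01)
print(f"zero-loss throughput: {strict / LINE_RATE:.1%} of line rate")
print(f"1%-loss throughput:   {relaxed / LINE_RATE:.1%} of line rate")
```

The same DUT yields a different "throughput" depending on the loss tolerance and search resolution, so both need to be reported alongside the result.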

Baremetal Traffic Generators
Setup: RFC2544, Phy2Phy; OVS 2.6.90, VPP 17.01, DPDK 16.07.0

- Software traffic generators on bare metal are comparable to the HW reference for larger packet sizes.
- Small packet sizes show inconsistent results:
  - across different generators
  - between VPP and OVS
  - for both single and multi-stream scenarios
- For now, in VSPERF, the existing bare-metal software traffic generators are unable to provide latency values.*
- Software traffic generators, when run on bare metal, can produce the same throughput results as the hardware traffic generator.

*When running vsperf in “trafficgen-off” mode, it is possible to obtain latency values for some SW traffic generators.

Traffic Generator as a VM
Setup: RFC2544, Phy2Phy; OVS 2.6.90, VPP 17.01, DPDK 16.07.0

- With TGen-as-a-VM, throughput is lower (by up to 40%) than with a bare-metal traffic generator. The difference is mostly restricted to smaller packet sizes.
- Reasons:
  - Inherent bare-metal vs VM differences
  - Resource allocations
  - Processes per packet
- In VSPERF, TGen-as-a-VM can provide latency values.*
- The latency values (min and avg) can be 10 times the values provided by the hardware traffic generator.

* [Configuration of NTP servers]

Software Traffic Generators – Lessons Learned

TG characteristics can impact measurements:
- Inconsistent results are seen for small packet sizes across TGs
- Packet stream characteristics may impact results … bursty traffic is more realistic!
- Back2Back tests confirm the sensitivity of the DUT at small frame sizes
- Switching technologies (DUT) are not equally sensitive to packet stream characteristics

Configuration of the ‘environment’ for software traffic generators is critical:
- CPUs: count and affinity definition
- Memory: RAM, hugepages and NUMA configuration
- DPDK interfaces: Tx/Rx queues
- PCI passthrough or SR-IOV configurations
- Software version

References:
https://wiki.opnfv.org/display/kvm/Nfv-kvm-tuning
http://dpdk.org/doc/guides/linux_gsg/nic_perf_intel_platform.html
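As an illustration of why the CPU count and affinity definition matter, the sketch below checks a core plan for overlaps between the traffic generator, the vSwitch, and the workload. It assumes core sets are written in the Linux-style CPU list notation (for example "2-5,8"); the role names and the plan itself are hypothetical and not taken from VSPERF.

```python
def parse_cpu_list(spec):
    """Parse a Linux-style CPU list such as "2-5,8" into a set of core IDs."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if last < first:
                raise ValueError(f"descending range: {part!r}")
            cores.update(range(first, last + 1))
        else:
            cores.add(int(part))
    return cores


def check_isolation(assignments):
    """Return a list of (owner_a, owner_b, shared_cores) conflicts."""
    conflicts = []
    owners = list(assignments)
    for i, a in enumerate(owners):
        for b in owners[i + 1:]:
            shared = parse_cpu_list(assignments[a]) & parse_cpu_list(assignments[b])
            if shared:
                conflicts.append((a, b, sorted(shared)))
    return conflicts


# Hypothetical core plan for a host running a software traffic generator.
plan = {
    "trafficgen": "2-5",
    "vswitch_pmd": "6,7",
    "vnf_vm": "7-9",   # deliberately overlaps with the vSwitch on core 7
}
for a, b, shared in check_isolation(plan):
    print(f"{a} and {b} share cores {shared}")
```

A check like this only covers core overlap; NUMA placement, hugepages, and queue configuration from the list above would need their own checks.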

Noisy Neighbor Test

- DUT: VSPERF with OVS and L2FWD VNF
- Traffic generator: hardware
- Noisy neighbor: stressor VM
- Test: RFC2544 throughput

Level   Last-level cache consumption by the noisy neighbor VM
0       Minimal L3 cache consumption (<10%)
1       Average L3 cache consumption (50%)
2       High L3 cache consumption (100%)

- CPU affinity and NUMA configuration can protect against the majority of the noise.
- Consumption of last-level cache (L3) is key to creating noise.*
- If the noisy neighbor can thrash the L3 cache, it can lower the forwarding performance (throughput) by up to 80%.

*It may be worth studying the use of tools such as Cache Allocation Technology (libpqos) to manage noisy neighbors, as shown here: https://www.openstack.org/assets/presentation-media/Collectd-and-Vitrage-integration-an-eventful-presentation2.final.pdf

Back2Back Frame Testing Analysis

Diagram: Test Device (Send & Rcv), physical port, vSw (Phy2Phy)

- Goal: seek the maximum burst length (sent with minimum spacing, i.e. back-to-back) that can be transmitted through the DUT without loss (an estimate of buffer size)
- Setup: HW Tgen, Phy2Phy, OVS, CI tests on Intel Pod 12, Feb-May 2017
- Model: Tgen -> Buff -> HeaderProc -> Rcv
- Only 64-byte frames are buffered!
- Average burst length = 26,700 frames
- Source of error: many frames are processed before the buffer overflows
- Corr_Buff = 5713 frames, or 0.384 ms
- Similar results for Intel Pod 3 (26,700 frames), with consistent TPUT as well
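The slide does not spell out how Corr_Buff is derived. One reading that is consistent with its numbers is sketched below: while a burst arrives at the maximum frame rate, the DUT keeps forwarding at its throughput rate, so only the difference accumulates in the buffer. The formula, the 10GE link speed, and all function names are assumptions made for illustration; only the two frame counts come from the slide.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble + SFD + inter-frame gap


def max_frame_rate(link_bps, frame_bytes):
    """Theoretical maximum frames per second on an Ethernet link."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


def corrected_buffer_frames(burst_frames, throughput_fps, max_fps):
    """Frames that must have been buffered, assuming the DUT keeps
    forwarding at throughput_fps while the burst arrives at max_fps."""
    return burst_frames * (1.0 - throughput_fps / max_fps)


def buffer_time_ms(frames, max_fps):
    """Time the buffer represents at the maximum frame rate."""
    return 1000.0 * frames / max_fps


MAX_FPS = max_frame_rate(10e9, 64)   # assumption: 10GE link, 64B frames
AVG_BURST = 26_700                   # from the slide
CORR_BUFF = 5_713                    # from the slide

# Throughput fraction implied by the two slide values under this model.
implied_fraction = 1.0 - CORR_BUFF / AVG_BURST
recomputed = corrected_buffer_frames(AVG_BURST, implied_fraction * MAX_FPS, MAX_FPS)

print(f"implied throughput: {implied_fraction:.1%} of line rate")
print(f"buffer time: {buffer_time_ms(CORR_BUFF, MAX_FPS):.3f} ms")
print(f"recomputed corrected buffer: {recomputed:.0f} frames")
```

Under these assumptions, 5713 frames at the 10GE 64B frame rate correspond to about 0.384 ms, matching the slide, and the implied throughput of roughly 79% of line rate is in line with the ~80% reported for the Phy2Phy 64B tests.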

Backup: Back2Back Frame Test
[Charts: Pod 3 and Pod 12]

Moving Ahead with VSPERF

STUDIES: Comparing virtual switching technologies and NFVI setups
- More realistic traffic profiles
- More complex topologies (e.g. full mesh)
- Additional real-world use cases (e.g. overlays)
- Custom VNFs (dpdk workloads)
- Stress tests (e.g. noisy neighbor)
- Additional test cases (e.g. TCP)

FEATURES: Visualization and interpretation of test results
- New NFVI test specs & metrics (IETF, ETSI NFV)
- Display of latency measurements
- Test environment and DUT configurations
- Traffic generator capabilities
- Dashboards and analytics
- Correlation of statistics
- Simplification of results

INTEGRATION: Tool support and integration with other OPNFV frameworks
- Metrics agents & monitoring systems
- Additional traffic generators (e.g. 40GE)
- CI unit tests for developers
- OPNFV scenario support
- Installer integration
- Yardstick integration