OPNFV testing strategy
Testing working group, 23/06/2017
Reporters for the testing working group: Gabriel (Bottlenecks), Morgan (Functest)
Stakes
- So far, we (the test projects) mainly validate a release
- Tooling for stress/resiliency/robustness testing is available in the different frameworks but underused
- Stress tests were introduced in Danube (but not in CI)
- Stress, robustness, resiliency and long-duration tests are key
Towards a stress test strategy
- Community discussions (etherpad, mailing list): https://wiki.opnfv.org/display/bottlenecks/Sress+Testing+over+OPNFV+Platform
Towards a stress test strategy
1. Test cases discussed in Danube
   Data-plane traffic for a virtual or bare-metal POD:
   - TC1 - Determine baseline for throughput
   - TC2 - Determine baseline for CPU limit
   Life-cycle events for VM pairs/stacks:
   - TC3 - Perform life-cycle events for ping
   - TC4 - Perform life-cycle events for throughput
   - TC5 - Perform life-cycle events for CPU limit
2. Test cases planned for Euphrates (under discussion)
   Scaling (Yardstick):
   - Scale-out test
   - Scale-up test (compute & memory)
   VSPerf & StorPerf:
   - Set up VMs until maximum throughput is reached
   - Test different Nova schedulers for different compute nodes
   - Run VSPERF and record numbers (throughput, latency, etc.)
   - Run StorPerf and record numbers (throughput, latency, etc.)
   - Run both at the same time and compare the numbers (sketched below)
Cooperation with the Bottlenecks project as load manager for the planned test cases is also under discussion.
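To make the "run both at the same time" item concrete, here is a minimal Python sketch of a load manager launching both tools concurrently and collecting their output. The `vsperf` and `storperf` command lines are hypothetical placeholders, not the projects' real CLIs.

"""Load-manager sketch, assuming hypothetical `vsperf`/`storperf` commands."""
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_tool(cmd):
    # Launch one load generator and capture its output for later comparison.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return cmd[0], result.returncode, result.stdout


if __name__ == "__main__":
    # Hypothetical command lines; the real flags differ per project.
    jobs = [["vsperf", "--test", "throughput"],
            ["storperf", "--workload", "sequential-read"]]
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        for name, rc, out in pool.map(run_tool, jobs):
            # Record numbers (throughput, latency, ...) and compare them
            # against the single-tool baselines measured earlier.
            print(name, "exit code", rc)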
Towards a stress test strategy
Two test cases implemented in Danube:
- TC1 - Determine baseline for throughput
- TC3 - Perform life-cycle events for ping
Towards a stress test strategy
- TC3 measures the reliability/stability of the system under a large number of concurrent requests / a large amount of concurrent traffic (sketched below)
- Problems were detected on both OPNFV and commercial solutions
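A minimal, illustrative-only sketch of the TC3 idea: fire many concurrent life-cycle requests and record the failure rate. `create_vm_pair_and_ping` is a hypothetical stand-in for the real OpenStack calls, and the simulated failure rate is made up.

"""TC3 sketch: concurrent life-cycle events with a simulated backend."""
import random
import time
from concurrent.futures import ThreadPoolExecutor


def create_vm_pair_and_ping(i):
    # Placeholder for: boot a VM pair, ping between them, tear them down.
    time.sleep(random.uniform(0.01, 0.05))
    return random.random() > 0.05  # pretend ~5% of requests fail under load


if __name__ == "__main__":
    n = 200  # number of concurrent life-cycle requests
    with ThreadPoolExecutor(max_workers=50) as pool:
        results = list(pool.map(create_vm_pair_and_ping, range(n)))
    success = sum(results)
    print(f"{success}/{n} requests succeeded ({100.0 * success / n:.1f}%)")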
Cross-project stress
Some rough numbers for a general view:
- Combined tenant/storage network: VSPERF throughput 500 Mbit/s, StorPerf bandwidth 500 Mbit/s
- Separate tenant and storage networks: VSPERF throughput 1000 Mbit/s, StorPerf bandwidth 1000 Mbit/s
Bottlenecks could act as the load manager and monitor system behavior.
Theory versus reality
Going beyond release verification
- Stress/robustness tests cannot be included in the CI chain today: the target version is delivered late and is unstable due to its proximity to upstream
- The 6-month cadence is a constraint, assuming we want to run tests over several weeks
Needs
Two CI chains (the first as it is today):
- CI chain "master" => release validation (as today, i.e. deploy/Functest/Yardstick looping over the different scenarios)
- CI chain "stable":
  - Focus on "generic" scenarios first (starting with os-nosdn-nofeature-ha)
  - Reinstallation on demand (CI is still needed for clean reinstallation, but reinstallation is not automated, to allow long-duration tests and troubleshooting)
  - Schedule to be created by the testing group (allocate N weeks to project X, N' weeks to project X', ...); see the sketch after this list
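A minimal sketch of the schedule the testing group would maintain for the stable chain: each project gets a fixed window on the stable resources. The project names and week counts below are assumptions for illustration, not an agreed plan.

"""Stable-chain schedule sketch; allocations are illustrative only."""
from datetime import date, timedelta

# (project, weeks) pairs; values are placeholders.
ALLOCATIONS = [("Bottlenecks", 2), ("Yardstick", 2), ("StorPerf", 1)]


def build_schedule(start: date):
    """Turn week allocations into (project, window_start, window_end) rows."""
    cursor = start
    for project, weeks in ALLOCATIONS:
        end = cursor + timedelta(weeks=weeks)
        yield project, cursor, end
        cursor = end


if __name__ == "__main__":
    for project, begin, end in build_schedule(date(2017, 7, 3)):
        print(f"{begin} -> {end}: {project} on the stable POD")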
Questions for the TSC
1) Any feedback/comments?
2) The testing working group is elaborating its stress strategy; shall it be validated by the TSC (as part of the release priorities)?
Questions for the TSC
3) How to manage this from a release perspective? (sync point with David)
Proposal:
- Master branch: release verification, no change
- Stable branch: stable resources for the testing group
  - Short term: between Danube 3.0 and Euphrates 1.0
  - Mid term: between N-1.0 and N.1.0
Questions for the TSC
- In the future, load/resiliency testing could run from release N-1.0 until the release N branch point (test window ~3 months)
- If a bug is found in N-1, the fix is expected in N-1 and then ported to N (the usual flow is to fix in N and cherry-pick into N-1, but that may not be so easy here; see the sketch below)
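For reference, a sketch of the usual flow mentioned above (fix lands on master/N, then is cherry-picked into the N-1 stable branch), driven from Python for illustration. The branch name and commit SHA are placeholders.

"""Usual fix flow sketch: fix in N, cherry-pick into N-1."""
import subprocess


def git(*args):
    # Run a git command and fail loudly if it does not succeed.
    subprocess.run(["git", *args], check=True)


if __name__ == "__main__":
    fix_sha = "abc1234"                # placeholder: SHA of the fix on master (N)
    git("checkout", "stable/danube")   # the N-1 branch (placeholder name)
    git("cherry-pick", "-x", fix_sha)  # port the fix back; conflicts here are
                                       # exactly the difficulty noted above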
Questions for the TSC
4) What about commitment from the installers? Is this compatible with Infra resource management (Infra group)?
Installer-based:
- Needs the 4 stable PODs, with 4 parallel test campaigns
- Additional work on the installers (bug fixes on N-1)
- No scenario promotion mechanism
XCI-based:
- Needs only 1 stable POD
- Upstream reporting
- Promotion mechanism already in place