S3P Scalability Testing: Status update


1 S3P Scalability Testing: Status update
Matt Welch, Network Software Engineer, Datacenter Network Solutions Group

2 Cluster Design

3 OpenStack installation
Cluster Requirements
- Compute Nodes: need versatile compute nodes; OpenStack* nodes function similarly to bare metal
- OpenStack installation: devstack for flexibility
- SDN controller: maximize resources

4 OpenStack Node Composition
Node Components
- 1* control node: typical OpenStack infrastructure services
- 1* network node: Neutron network services with the OpenDaylight ML2 plugin
- N* compute nodes: minimal OpenStack compute services
OpenStack services
- Ubuntu 16.04
- Docker
- OVS 2.5.2
- OpenStack stable/newton
- OpenDaylight Carbon 0.6.1

5 S3P OpenStack Node Deployment
[Figure: topology of the emulated cloud network. Each physical host runs Docker containers (compute-0 … compute-N, plus OpenStack Control (infra) and OpenDaylight) connected through docker0/iptables, br_mgmt, and br_data bridges to the LAN-access, LAN-mgmt, and LAN-data networks.]

In this figure, we show a topology diagram of the emulated cloud network and how it attaches to the physical network. Each large black rectangle represents a physical server in our lab. Each solid blue rectangle represents a Docker container running the services described previously. The remainder of the diagram shows the container networking via virtual Ethernet adapters bound to three Linux bridges; the containers are each attached to access, management, and data networks. The docker0 bridges (with iptables) connect the containers to the wider network, while the br_mgmt and br_data bridges create the management and data networks for the OpenStack nodes. OpenDaylight appears in a container here, but it can just as easily run bare metal on a standard server.
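The bridge-and-veth plumbing behind this figure can be reproduced with standard iproute2 and Docker commands. The sketch below is illustrative only: the bridge names match the figure, but the container image (s3p/compute-node), interface names, and addresses are assumptions rather than the project's actual deployment scripts.

  # Management and data bridges on the physical host
  sudo ip link add br_mgmt type bridge
  sudo ip link add br_data type bridge
  sudo ip link set br_mgmt up
  sudo ip link set br_data up

  # Launch a compute-node container; docker0/iptables provides LAN access
  docker run -d --name compute-0 --privileged s3p/compute-node:latest

  # Attach the container to br_mgmt with a veth pair for deterministic addressing
  pid=$(docker inspect -f '{{.State.Pid}}' compute-0)
  sudo ip link add veth-c0-mgmt type veth peer name c0-mgmt
  sudo ip link set veth-c0-mgmt master br_mgmt up
  sudo ip link set c0-mgmt netns "$pid"
  sudo nsenter -t "$pid" -n ip link set c0-mgmt name eth1
  sudo nsenter -t "$pid" -n ip addr add 10.10.0.10/24 dev eth1
  sudo nsenter -t "$pid" -n ip link set eth1 up

The same steps, repeated against br_data, give each container its data-plane interface.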

6 Setup – service initialization
OpenStack* Installation
- devstack for flexibility
- nodes in Docker containers
Network Topology
- network with minimal abstractions
- test network shares the physical adapter
- NetVirt
Node deployment
- bash scripts for simple deployment
- Ansible roles for server setup, deployment, and joining nodes to the OpenStack cloud

There are numerous methods for OpenStack installation: install from packages or pip, PackStack, devstack; openstack-kolla is a popular production deployment tool. We chose devstack for flexibility; its configuration can be complex, but it is powerfully flexible (see the local.conf sketch below).

Networking: we create our own network connections to minimize abstractions (e.g., double encapsulation of VXLAN packets) and to keep container networking under our control, for deterministic L2/L3 addresses. Containers use virtual Ethernet (veth) adapters bound to Linux bridges; this could potentially be improved with DPDK acceleration of the management and data bridges, and the service nodes could likely benefit from this too. We are specifically looking at NetVirt performance: an application that uses OpenFlow and OVSDB as its southbound protocols.

Node deployment: bash scripts can spawn just a handful of nodes, or Ansible deploys containers to the physical hosts and "stacks" them to join them to an OpenStack cloud. We are working on an Ansible role for upstream.
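For reference, a compute-node devstack configuration along these lines might look like the local.conf sketch below. This is a hedged illustration, not the exact configuration used in this work: the addresses and passwords are placeholders, and the service list and networking-odl variables (ODL_MODE, ODL_MGR_IP) should be checked against the networking-odl devstack documentation for the stable/newton branch.

  [[local|localrc]]
  HOST_IP=10.10.0.10                # this compute container (placeholder)
  SERVICE_HOST=10.10.0.2            # control node (placeholder)
  MYSQL_HOST=$SERVICE_HOST
  RABBIT_HOST=$SERVICE_HOST
  DATABASE_PASSWORD=secret          # placeholder credentials
  RABBIT_PASSWORD=secret
  SERVICE_PASSWORD=secret
  ADMIN_PASSWORD=secret

  # Minimal compute services only (assumed service list)
  ENABLED_SERVICES=n-cpu,q-agt

  # OpenDaylight ML2 integration via the networking-odl devstack plugin
  enable_plugin networking-odl https://git.openstack.org/openstack/networking-odl stable/newton
  ODL_MODE=externalodl              # point at an already-running controller
  ODL_MGR_IP=10.10.0.3              # OpenDaylight controller (placeholder)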

7 Testing

8 Test Plan
Plan based on a high-level description of NetVirt scale testing shared in the community [3]. Test plan posted to Google Docs [4].
Testing parameters affecting scaled network performance:
- Number of servers (VMs) running in the cloud (num_servers)
- Number of ports assigned per server (ports_per_server)
- Number of active ports per network (ports_per_network)
- Number of active networks in the cloud (concurrent_networks)
- Number of network interfaces per router (networks_per_router)
- Number of active routers (concurrent_routers)
- Fraction of ports assigned to floating IPs (floating_ip_per_num_ports)
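A test profile along these lines can be expressed as a small set of shell variables consumed by the deployment and test scripts. The file name and values below are illustrative assumptions, not parameters taken from the test plan itself.

  # profile-small.sh -- example test profile (values are illustrative only)
  num_servers=50                # server VMs running in the cloud
  ports_per_server=1            # ports assigned per server
  ports_per_network=10          # active ports per network
  concurrent_networks=5         # active networks in the cloud
  networks_per_router=5         # network interfaces per router
  concurrent_routers=1          # active routers
  floating_ip_per_num_ports=0   # fraction of ports assigned floating IPs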

9 Learnings and Conclusions

10 Cluster Validation: Shaking out the emulated cluster
- Create OpenStack* servers carefully
- Scaled-out Resource Density isn’t Limiting
- Use Smaller Software Defined Networks
- Emulated Cluster Delivering Results
- Early Results: Maximum Scale
- OpenStack Agents Scale Independently

11 Create OpenStack* servers carefully
- Test profiles define server resources
- Predictable naming
- Smaller OpenStack “flavors”
- Security group rules
- Race conditions on resource creation
- Ping to verify server status

Server network interfaces, additional networks, and router interfaces should be assigned as the various test profiles require. Predictable naming simplifies resource tracking and failure triage. Smaller “flavors” enable higher node density and faster server spawn times; we moved from m1.tiny to cirros256. We use a minimal (single) security group with ICMP and SSH allowed to the OpenStack server; security groups are another item of interest at network scale, since they directly affect the rules installed in the virtual switches. The rate of requests to attach compute nodes, create Neutron networks, and attach Nova server VMs must be limited. A smoke test to validate network resource allocation is critical due to the failure modes we observed.

When spawning multiple compute nodes on specific hypervisors, the spawn rate must be limited so that race conditions are not created that lead to instances failing to obtain IP addresses. On hypervisor nodes where instances spawn but fail to obtain IP addresses (from their perspective), the rule sets installed in OVS (ovs-ofctl -O OpenFlow13 dump-flows br-int) are typically smaller than the rule sets on nodes with successful instances. There are also often fewer VXLAN tunnels created on these nodes, leading to the hypothesis that the installation of rules on the node was interrupted or incomplete. These instances are almost always spawned by their hypervisor, but the working hypothesis is that race conditions lead to incomplete rule sets being installed via OVSDB. For the most reliable instance spawning, we wait for each instance to obtain an IP and respond to ping before continuing to the next one; an instance that fails to respond to ping is counted as a failure (a sketch of this loop follows).
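A minimal sketch of this spawn-and-verify loop using the standard openstack CLI is shown below, reusing the variables from the profile sketch earlier. The security group name, image name, network naming scheme, and server naming scheme are assumptions for illustration; cirros256 is the flavor mentioned above.

  # One minimal security group allowing ICMP and SSH (hypothetical name: s3p-minimal)
  openstack security group create s3p-minimal
  openstack security group rule create --protocol icmp s3p-minimal
  openstack security group rule create --protocol tcp --dst-port 22 s3p-minimal

  # Spawn servers one at a time; wait for ping before moving on (rate limiting)
  for i in $(seq 0 $((num_servers - 1))); do
      name="s3p-vm-${i}"                                  # predictable naming
      openstack server create --flavor cirros256 --image cirros \
          --network "net-$((i % concurrent_networks))" \
          --security-group s3p-minimal --wait "$name"
      ip=$(openstack server show -f value -c addresses "$name" \
          | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -1)
      if ! ping -c 3 -W 2 "$ip" >/dev/null; then
          echo "FAIL: ${name} (${ip}) did not respond to ping" >&2
      fi
  done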

12 Scaled-out Resource Density isn’t Limiting
Validation experiments were run to ensure the emulation itself won’t cause problems:
- OpenStack server density per compute node: 50 server VM instances per compute-node container
- Compute-node density on physical hosts: 30 compute nodes with functioning virtual switches
- Peak OpenStack server density per physical host: 30 servers in each of 30 compute-node containers → 900 servers

OpenStack server density per compute node: in one experiment, we attempted to spawn as many instances as possible within a single compute node running alone on a physical server, and reached 50 server instances without any interruption in network connectivity. Conclusion: at least 50 instance VMs can be created within a single compute host.

Compute-node density on physical hosts: we spawned compute nodes in series and spawned a server instance VM inside each compute node after it finished stacking, validating each node by its functioning server VM. Nodes joined the cloud without error, but DHCP errors were encountered when spawning instance #31. On a dual-socket system with 18 cores per CPU and 64 GB of RAM, 30 compute nodes appear to be a practical limit to the number of functional nodes that can be spawned on a single server.

Maximum instance density per server: 30 compute nodes in a single physical server, with 30 server instances in each compute node, gives 900 server instances on 30 virtual switch instances. Conclusion: server instance density should not be a limiting factor in emulating a high-density cloud. Ultimately, this tells us that the density of compute hosts and instances we are packing into our physical servers should not interfere with correct functionality.
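The "functioning virtual switch" check amounts to comparing OpenFlow rule counts and tunnel-port counts against a known-good compute node. A hedged sketch, run inside a compute-node container: the br-int bridge name matches the notes above, while the "tun" tunnel-port prefix is an assumption about how NetVirt names its VXLAN ports.

  # Count OpenFlow rules and tunnel ports on this node's integration bridge
  flows=$(sudo ovs-ofctl -O OpenFlow13 dump-flows br-int | grep -c 'table=')
  tunnels=$(sudo ovs-vsctl list-ports br-int | grep -c '^tun')
  echo "br-int: ${flows} flows, ${tunnels} tunnel ports"
  # A node with far fewer flows/tunnels than its healthy peers likely hit the
  # race condition described above (incomplete rule installation).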

13 Use Smaller Software Defined Networks
- OpenDaylight + Neutron creates full-mesh networks
- VXLAN tunnels grow at O(n²) with the number of connected vSwitches in the network
- Networks should be kept reasonably small during scale-out
- Minimal OpenFlow* rules are installed prior to OpenStack server creation
- Network scale testing needs better mock-ups or real tenant instances

Neutron with the OpenDaylight ML2 driver creates a full-mesh network in which each node (OVS instance) is connected to every other node via VXLAN tunnels, an O(n²) relationship between the number of vSwitches and the density of the mesh that connects them. As a result, networks should be kept reasonably small during scale-out to prevent an explosion in the number of VXLAN tunnels and the resulting increase in resources needed on the network node. When an instance is attached to a particular network, OpenFlow rules are installed on the vSwitch serving that instance; this shows that, for full-scale testing and installation of complete vSwitch rule sets, we need to launch an instance VM attached to each vSwitch (the arithmetic is sketched below).
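To make the O(n²) growth concrete: a full mesh of n vSwitches requires n(n-1)/2 VXLAN tunnels in total, or n-1 tunnel ports per switch. A quick bash illustration using switch counts of the scale seen in this testing:

  # Full-mesh tunnel count for n connected vSwitches
  for n in 10 30 134; do
      echo "n=${n}: $(( n * (n - 1) / 2 )) tunnels total, $(( n - 1 )) per vSwitch"
  done
  # n=10: 45, n=30: 435, n=134: 8911 -- keeping networks small keeps this tractable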

14 Emulated Cluster Delivering Results
- Typical developer use cases do not expose error conditions
- Testing at scale has demonstrated shortcomings of simple operations
- Race conditions discovered in resource creation and destruction
- Detection of these error conditions will help to drive improvements

15 Early Results: Maximum Scale
“Vertical networks” whose members share physical host resources
OpenDaylight supported:
- 21 compute-node host servers
- 147 compute nodes
- 134 switches supporting OpenStack server VMs
OpenStack resources:
- 21 networks, one subnet each
- one router, fully connected to the subnets

We create networks whose members all reside within the same physical host, deploying waves of containers to the host systems and then creating instances in each of the new compute hosts. With this topology, OpenDaylight was able to support 134 switches spread across 147 compute nodes running on 21 physical servers, with 21 networks, 21 subnets, and one router. The “last” instances failed DHCP as seen previously, which supports the hypothesis that the cloud framework needs additional OpenStack resources deployed to support further scale-out. Services that could improve with replication (HA) include the message queue, database, and DHCP agents; OpenDaylight (network control) may also see additional scale-out performance with more resources.
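The per-host “vertical network” topology itself is built from ordinary Neutron resources. A hedged sketch with the openstack CLI, assuming illustrative network names, CIDRs, and router name (only the counts come from the results above):

  # One network/subnet per physical host, all attached to a single router
  openstack router create s3p-router
  for h in $(seq 1 21); do
      openstack network create "net-host-${h}"
      openstack subnet create --network "net-host-${h}" \
          --subnet-range "10.${h}.0.0/24" "subnet-host-${h}"
      openstack router add subnet s3p-router "subnet-host-${h}"
  done
  # Server VMs on physical host h attach only to net-host-h, so each VXLAN
  # mesh stays within a single physical host.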

16 OpenStack Agents Scale Independently
Resource consumption is significant at scale
- Java (OpenDaylight)
- Message queue
- Network services: OpenFlow/NetVirt rules + VXLAN tunnels
- Network services: DHCP failures

Java is, by far, the greatest consumer of memory and CPU cycles on the network node. At scale, it can consume over 14 cores × 2.3 GHz (more than 30 GHz of aggregate CPU), more than 60 GB of memory, and 20 GB of swap/cache.

Message queue: much of the resource management in OpenStack is performed with asynchronous messages delivered via the message service. Aside from Java, the message queue service consumed the greatest amount of memory and CPU cycles. Further scale-out may be limited by race conditions caused by a busy message queue service.

DHCP: instance failures are almost always associated with a DHCP failure, concomitant with an OVSDB rule set that appears incomplete. Distributing additional DHCP services to each compute-node physical host may reduce the burden on the singleton DHCP server and dramatically improve network performance.
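Per-service CPU and memory figures like these can be gathered with a simple ps aggregation on the network node. A hedged sketch; the process-name patterns (java for OpenDaylight, beam.smp for RabbitMQ, mysqld, dnsmasq for DHCP) are assumptions about a typical devstack deployment.

  # Aggregate CPU% and resident memory for the heaviest services
  for pat in java beam.smp mysqld dnsmasq; do
      pids=$(pgrep -d, -f "$pat") || continue
      ps -o %cpu=,rss= -p "$pids" | awk -v p="$pat" \
          '{cpu += $1; rss += $2} END {printf "%-10s %6.1f %%CPU %8.1f MB RSS\n", p, cpu, rss/1024}'
  done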

17 What needs to be done
CSIT integration of compute containers
- Compute container integration with existing OpenStack nodes
- Robot testing: launch containers & run scale test
Topology/configuration improvements
- OpenDaylight in a clustered/HA configuration
- OpenStack constructs: DVR, host aggregates, availability zones, regions
- Distributed DHCP, distributed MQ/database, OpenStack services?
Software version bump
- Ocata and Pike versions of devstack in the compute container
- Test against Carbon 0.6.2, Nitrogen 0.7.0
Container development
- CentOS version of the compute container
- Validate CentOS as container host

18 Summary
Compute container framework provides a convenient scale-test environment
- early results show promise
- usable with a variety of OpenStack installations
Analysis of scale testing is complicated by component interaction
- everything scales together
- each subsystem will have its own failure point
Simulated activity must be accurate
- node activity must model normal node behavior
- scale frameworks must validate that infrastructure components are not limiting

19 References
[1] CPerf Project,
[2] “Testing OpenDaylight Beryllium for Scale and Performance”, Marcus Williams, BrightTalk presentation,
[3] “NetVirt Scale Testing”, Andre Fredette,
[4] “OpenDaylight Scale Testing Plan”, Matt Welch,
[5] “OpenStack Neutron Performance and Scalability: Testing summary”, Elena Ezhova, Oleg Bondarev, and Alexander Ignatov, Mirantis, December 22, 2016, performance-and-scalability-testing-summary/
[6] “OpenDaylight Performance: A Practical, Empirical Guide”, Linux Foundation,
[7] “Conversation on Scale, Security, Stability & Performance”, Marcus Williams, OpenDaylight Developer Design Forum, September 3, 2015,
[8] “MidoNet Scalability Report”, materials/Midonet_Scalability_Report_final.pdf
[9] “Comparing OpenStack Neutron ML2+OVS and OVN – Control Plane”, Russell Bryant,

20 Notices and Disclaimers
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
Intel and the Intel logo are trademarks of Intel Corporation and its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © Intel Corporation.


