OpenStack High Availability Jakub Pavlik
About me
  Jakub Pavlík
  Cloud Platform Engineer
  3 years in Cloud
  2 years in OpenStack
High Availability vs. Disaster Recovery
  High Availability = fault detection and correction procedures that maximize the availability of critical services and applications, often in an automated fashion.
  Disaster Recovery = the process of preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.
  High Availability ≠ Disaster Recovery!
Four types of HA in an OpenStack Cloud
  Physical infrastructure: physical nodes, physical network, physical storage, hypervisor, host OS, …
  OpenStack Control services: compute controller, network controller, database, message queue, storage, …
  VMs (OpenStack Compute): virtual machine, virtual network, virtual storage, VM mobility, …
  Applications: service resiliency, QoS, cost, transparency, data integrity, …
Physical Infrastructure
tcp cloud VPC Hardware
  [Diagram labels: Switch 1, Switch 2; SAN 1, SAN 2; Passthru 1, Passthru 2; Controller 1, Controller 2]
  168 cores at 3.46 GHz, 336 threads; aggregation 1/4: 1344 vCPU; 2688 GB RAM; 28 x 10GE ports
  168 cores at 2.67 GHz, 336 threads; aggregation 1/4: 1344 vCPU; 1792 GB RAM; 28 x 10GE ports
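The vCPU figures on the hardware slide can be checked with a few lines of Python. This is a minimal sketch that assumes the "aggregation 1/4" label means four vCPUs are scheduled per hardware thread; the function name vcpu_capacity is hypothetical and only illustrates the arithmetic.

```python
def vcpu_capacity(threads: int, vcpus_per_thread: int = 4) -> int:
    """Return the number of vCPUs offered for a given hardware thread count.

    Assumption: "aggregation 1/4" is read as 4 vCPUs per hardware thread.
    """
    return threads * vcpus_per_thread


# Both hardware groups on the slide expose 336 threads (168 cores x 2).
print(vcpu_capacity(336))          # 1344
print(2688 / vcpu_capacity(336))   # 2.0 GB RAM per vCPU in the first group
```

Under the same assumption, the second group (1792 GB RAM) offers less memory per vCPU than the first, which matters when choosing where to place memory-heavy instances.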
OpenStack Control services
OpenStack modules – TCP VPC
OpenStack High Availability Concepts
  Stateless services
    There is no dependency between requests
    For example APIs: Nova, Keystone, Glance, Cinder, etc.
  Stateful services
    An action typically comprises multiple requests
    For example: MySQL, RabbitMQ, etc.
  Active/Passive
    Redundant instances of stateless services are load balanced
    For stateful services, a replacement resource can be brought online
  Active/Active
    Stateful services are managed in such a way that services are redundant and all instances have an identical state
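The difference between balancing stateless instances and failing over a stateful one can be sketched in Python. This is an illustrative toy model, not how HAProxy or Pacemaker are implemented; the class names StatelessPool and ActivePassivePair and the instance names are hypothetical.

```python
from itertools import cycle


class StatelessPool:
    """Round-robin over redundant instances of a stateless service."""

    def __init__(self, instances):
        self.healthy = {name: True for name in instances}
        self._ring = cycle(instances)

    def mark_down(self, name):
        self.healthy[name] = False

    def pick(self):
        # Any healthy instance can serve any request, because no state
        # is carried from one request to the next.
        for _ in range(len(self.healthy)):
            name = next(self._ring)
            if self.healthy[name]:
                return name
        raise RuntimeError("no healthy instance available")


class ActivePassivePair:
    """A stateful service with one active node and one standby."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def fail_over(self):
        # The replacement resource is brought online only after a failure.
        if self.standby is None:
            raise RuntimeError("no standby left to promote")
        self.active, self.standby = self.standby, None
        return self.active
```

With three API instances, marking one as down simply removes it from the rotation, while the stateful pair has to promote its standby before it can serve requests again.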
Corosync, Pacemaker and HAProxy
  Corosync
    Totem single-ring ordering and membership protocol
    Provides UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker
  Pacemaker
    High availability and load balancing stack for the Linux platform
    Interacts with applications through Resource Agents (RA)
  HAProxy
    Load balancing and proxying for HTTP and TCP applications
    Works over multiple connections
    Used to load balance API services
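The quorum that Corosync reports to Pacemaker is, in the common case, a simple majority of votes. The following sketch shows that arithmetic under the assumption of one vote per node and no special two-node or tie-breaker options; the function names are hypothetical and do not correspond to any Corosync API.

```python
def quorum_size(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(online_votes: int, total_votes: int) -> bool:
    """True if the surviving partition may keep running resources."""
    return online_votes >= quorum_size(total_votes)


def tolerated_failures(total_votes: int) -> int:
    """How many nodes can fail before the cluster loses quorum."""
    return total_votes - quorum_size(total_votes)


for nodes in (2, 3, 5):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

Under these assumptions a two-node cluster tolerates no failures, which is one reason three controllers are a common minimum for this kind of stack.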
MySQL Galera
  Synchronous multi-master cluster technology for MySQL/InnoDB
  MySQL patched for wsrep (Write Set REPlication)
  Active/active multi-master topology
  Read and write to any cluster node
  True parallel replication, at row level
  No slave lag or integrity issues
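The idea behind write-set replication can be illustrated with a toy model: a transaction's changed rows are checked against write sets committed since the transaction started, and the change is applied on every node only if there is no overlap. This is a heavily simplified sketch, not Galera's actual implementation; the class ToyGaleraCluster, its method, and the row keys are hypothetical.

```python
class ToyGaleraCluster:
    """Simplified model of certification-based multi-master replication."""

    def __init__(self, nodes):
        self.data = {node: {} for node in nodes}  # one row store per node
        self.history = []                         # (seqno, set of row keys)
        self.seqno = 0                            # global commit order

    def commit(self, origin, seen_seqno, changes):
        """Try to commit `changes` (row key -> value) submitted on `origin`.

        `seen_seqno` is the last global seqno the transaction had observed
        when it started. Returns True on commit, False on conflict.
        """
        if origin not in self.data:
            raise KeyError(f"unknown node: {origin}")
        rows = set(changes)
        # Certification: reject if a write set committed after the
        # transaction started touched any of the same rows.
        for seqno, committed_rows in self.history:
            if seqno > seen_seqno and rows & committed_rows:
                return False
        self.seqno += 1
        self.history.append((self.seqno, rows))
        # Apply the write set on every node so all copies stay identical.
        for store in self.data.values():
            store.update(changes)
        return True
```

In this model two writers that start from the same state and touch the same row cannot both commit: the first one wins and the second has to retry against the new state, while writes to different rows on different nodes proceed independently.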
RabbitMQ – RPC messaging
  Rabbit cluster
Sample OpenStack HA architecture
  Stateful
    Cinder Volume
    Neutron L3, DHCP agents
    Ceilometer central agent
    RabbitMQ
  Stateless
    Neutron Server
    OpenStack APIs
    Apache web server
    Nova Scheduler
    Cinder Scheduler
  Neutron agents (Active)
  Neutron agents (Hot Standby)
VMs – Compute nodes
VMs HA – two layers
  Storage
    Shared storage filesystem – file disks (qcow2, vmdk, vhd)
    Block storage
  Network
    Vanilla Neutron L3 agent (Open vSwitch, Linux Bridge)
    Vendor plugins - SDN controller
No vSphere Style HA with KVM
Non-Shared/Shared Storage filesystem
  Shared Storage
    Live migration – RAM only
    Hypervisor evacuation – the instance is booted from the same disk and its data is preserved
    Ceph, Gluster, NFS, Samba, GFS
  Non-Shared Storage
    Block live migration – disk and RAM
    Hypervisor evacuation – the instance is booted from a new disk but keeps its configuration, e.g. id, name, uuid
    Standard filesystem: ext4, etc.
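The options on this slide can be summarized as a small decision function. This is a sketch only: the function recovery_plan and its return format are hypothetical, and it assumes that live migration is used while the source hypervisor is still running and evacuation once it has failed.

```python
def recovery_plan(shared_storage: bool, source_host_alive: bool) -> dict:
    """Pick a way to move an instance off a hypervisor.

    Assumption: migration needs a running source host; evacuation is the
    fallback when the source host is down.
    """
    if source_host_alive:
        if shared_storage:
            # The disk already lives on shared storage, so only RAM moves.
            return {"method": "live migration",
                    "copied": ["RAM"],
                    "disk_preserved": True}
        # Without shared storage the disk has to be copied as well.
        return {"method": "block live migration",
                "copied": ["disk", "RAM"],
                "disk_preserved": True}
    # Host is down: nothing can be copied from it. The instance keeps its
    # id, name and uuid, but its disk survives only on shared storage.
    return {"method": "evacuation",
            "copied": [],
            "disk_preserved": shared_storage}


print(recovery_plan(shared_storage=False, source_host_alive=False))
```

The last case is the one to plan for: with local disks only, a failed hypervisor means the instance comes back with its identity but without its data.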
Block Storage - Cinder
  Instance boots from volume
  iSCSI/FC direct mapping to instance
  Enables live migration
  Cinder backends
    LVM driver: default Linux iSCSI server
    Vendor software plugins: Gluster, Ceph, VMware VMDK driver
    Vendor storage plugins: EMC VNX, IBM Storwize, SolidFire, etc.
Networking - Vanilla Neutron L3 agent
  Problems
    Routing on a Linux server (max. bandwidth approximately 3-4 Gbit/s)
    Limited distribution across multiple network nodes
    East-West and North-South communication goes through the network node
  High Availability
    Pacemaker & Corosync
    Keepalived VRRP
    DVR + VRRP – should be in the Juno release
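The Keepalived VRRP option relies on electing one master router among the network nodes. The sketch below models only the election rule (highest priority wins, a higher IP address breaks ties) and ignores advertisement timers and preemption settings; the Router class, node names, and addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Router:
    name: str
    priority: int          # higher value is preferred
    address: IPv4Address   # primary address, used only as a tie-breaker
    alive: bool = True


def elect_master(routers):
    """Return the router that should own the virtual IP, or None."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    # Compare by priority first, then by address.
    return max(candidates, key=lambda r: (r.priority, r.address))


nodes = [
    Router("net-node-1", 150, IPv4Address("192.0.2.11")),
    Router("net-node-2", 100, IPv4Address("192.0.2.12")),
]
print(elect_master(nodes).name)   # net-node-1
nodes[0].alive = False            # the master stops advertising
print(elect_master(nodes).name)   # net-node-2
```

When the master fails, the standby takes over the virtual IP, so routed traffic continues through the remaining network node.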
Networking – Vendor SDN Controller plugins
  Examples
    Juniper OpenContrail, VMware NSX, SDN PLUMgrid
  Advantages over the Neutron L3 agent
    North-South communication on network devices (iBGP, MPLSoverGRE)
    East-West communication directly between compute nodes
    Higher bandwidth (9.7 Gbit/s per 10 Gbit/s port)
  High Availability
    iBGP peering into two routers
    Native HA implemented inside network devices
OpenStack HA TCP VPC
  [Diagram] OpenStack Controller: VIP, HAProxy, Pacemaker, Corosync, MySQL (Galera), RabbitMQ
  [Diagram] Contrail: Contrail Control, Contrail Config with Analytics & WebUI, Contrail Database (Zookeeper, Cassandra), HAProxy, VIP, bond interface, Pacemaker, Corosync
TCP Virtual Private Cloud
HA methods - vendors
  Vendor      Cluster/Replication Technique                    Characteristics
  RackSpace   Keepalived, HAProxy, VRRP, DRBD                  Automatic - Chef
  Red Hat     Pacemaker, Corosync, Galera                      Manual installation/Foreman
  Cisco       Keepalived, HAProxy, Galera                      Manual installation, at least 3 controllers
  tcp cloud   Pacemaker, Corosync, HAProxy, Galera, Contrail   Automatic Salt-Stack deployment
  Mirantis    Pacemaker, Corosync, HAProxy, Galera             Automatic - Puppet
Thank you for your attention!