Arne Wiebalck -- VM Performance: I/O
LHCb Computing Workshop, CERN, May 22, 2015

In this talk: I/O only
- Most issues were I/O related
  - Symptom: high IOwait
- You can optimize this
  - Understand the service offering
  - Tune your VM
- CPU is being looked at
  - Host-side, too early
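
The IOwait symptom can be checked from inside a VM. The sketch below is one possible way to do it, not something shown in the talk: it compares two snapshots of the aggregate `cpu` line in `/proc/stat` and reports the share of CPU time spent in IOwait between them. The function names `cpu_times` and `iowait_percent` are illustrative.

```python
def cpu_times(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in IOwait between two /proc/stat snapshots."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted in user/nice, so only the first 8 are summed.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0
```

On a Linux VM, the two snapshots would typically be the contents of `/proc/stat` read a few seconds apart.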

Hypervisor limits
- Two-disk RAID-1
  - Effectively one disk
- Disk is a shared resource
  - IOPS / #VMs
- Your VM is co-hosted with 7 other VMs
- User expectation: ephemeral disk ≈ local disk
[Diagram: a virtual machine (4 cores, 8 GB RAM) on a hypervisor (RAM, CPUs, disk); pro rata share: 25 IOPS]
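
To make the "IOPS / #VMs" rule concrete, here is a small worked example. The inputs are assumptions borrowed from the bcache slides of this talk (a disk of about 100 IOPS and a 4 VM hypervisor); the 10,000-operation workload and the function names are made up for illustration.

```python
def pro_rata_iops(disk_iops, n_vms):
    """Fair share of the hypervisor disk's IOPS for each co-hosted VM."""
    if n_vms < 1:
        raise ValueError("need at least one VM")
    return disk_iops / n_vms


def seconds_for_ops(n_ops, iops):
    """Time needed for n_ops small random I/O operations at a given IOPS rate."""
    return n_ops / iops


share = pro_rata_iops(disk_iops=100, n_vms=4)   # 25.0 IOPS per VM
duration = seconds_for_ops(10_000, share)       # 400.0 seconds
```

Under these assumptions a burst of 10,000 small random writes keeps the VM's share of the disk busy for several minutes, which is why an ephemeral disk does not behave like a dedicated local disk.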

Ask not only what the VM can do for you …

Tip 1: Minimize disk usage
- Use tmpfs
- Reduce logging
- Configure AFS memory caches
  - Supported in Puppet
- Use volumes …

Tip 2: Check I/O scheduling
- The “lxplus problem”
- VMs use the ‘deadline’ elevator
  - Set by the ‘virtual-guest’ tuned profile, Red Hat’s default for VMs
- Not always ideal (interactive machines)
  - ‘deadline’ prefers reads and can delay writes (default: 5 secs)
  - Made to allow reads under heavy load (webserver)
  - lxplus: sssd makes DB updates during login
- I/O scheduler on the VM changed to CFQ (Completely Fair Queuing)
- Benchmark: login loop
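
Checking which elevator a VM currently uses can be sketched as follows. Linux exposes the schedulers of a block device in `/sys/block/<device>/queue/scheduler`, with the active one in square brackets (for example `noop [deadline] cfq`). The device name `vda` and the helper names are assumptions for illustration, not taken from the talk.

```python
from pathlib import Path


def active_scheduler(sysfs_text):
    """Return the bracketed (active) scheduler from a queue/scheduler line."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed entry found


def read_scheduler(device="vda"):
    """Read the active scheduler of a block device (hypothetical default 'vda')."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return active_scheduler(path.read_text())
```

Writing a scheduler name into the same file as root switches the elevator at runtime. Which names are available depends on the kernel of the VM.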

‘deadline’ vs. ‘CFQ’
[Plot annotations: I/O benchmark starts; switch from ‘deadline’ to ‘CFQ’; I/O benchmark continues]

Tip 3: Use volumes
- Volumes are networked virtual disks
  - Show up as block devices
  - Attached to one VM at a time
  - Arbitrary size (within your quota)
- Provided by Ceph (and NetApp)
- QoS for IOPS and bandwidth
  - Allows offering different types

Volume types
Name      Bandwidth  IOPS  Comment
standard  80 MB/s    100   std disk performance
io1       120 MB/s   500   quota on request
cp1                        critical power
cp2                        critical power, Windows only (in preparation)

Ceph volumes at work
[Plot: ATLAS TDAQ monitoring application. Y-axis: CPU % spent in IOwait. Blue: CVI VM (h/w RAID-10 with cache); yellow: OpenStack VM. Annotation: IOPS QoS changed from 100 to 500 IOPS]
[Plot: EGI Message Broker monitoring. Y-axis: scaled CPU load (5 mins of load / #cores). Annotation: IOPS QoS changed from 100 to 500 IOPS]

Tip 4: Use SSD block level caching
- SSDs as disks in hypervisors would solve all IOPS and latency issues
  - But still (too expensive and) too small
- Compromise: SSD block level caching
  - flashcache (from Facebook, used at CERN for AFS before)
  - dm-cache (in-kernel since 3.9, recommended by Red Hat, in CentOS 7)
  - bcache (in-kernel since 3.10)

bcache
- Change cache mode at runtime
  - Think SSD replacements
- Strong error handling
  - Flush and bypass on error
- Easy setup
  - Transparent for VM
  - Need special kernel for HV
[Diagram: hypervisor with RAM, CPUs, disk (100 IOPS) and SSD (20k IOPS)]
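
A rough feel for what a 20k IOPS SSD in front of a 100 IOPS disk can deliver comes from a simple mean-service-time model. This is only a back-of-the-envelope sketch under strong assumptions (requests served one at a time, a fixed cache hit ratio, no writeback effects); it is not how the talk derived its numbers, and `effective_iops` is an illustrative name.

```python
def effective_iops(hit_ratio, cache_iops=20_000, disk_iops=100):
    """Estimated IOPS of a cached disk for a given cache hit ratio (0..1).

    Model assumption: each request takes 1/cache_iops seconds on a hit and
    1/disk_iops seconds on a miss, and requests are served sequentially.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    mean_service_time = hit_ratio / cache_iops + (1.0 - hit_ratio) / disk_iops
    return 1.0 / mean_service_time
```

In this model the misses dominate: even at a 90% hit ratio the estimate stays below 1000 IOPS, far from the SSD's 20k, so the achieved hit ratio matters more than the raw SSD speed.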

bcache in action (2)
- On a 4 VM hypervisor: from ~25 IOPS/VM to ~1000 IOPS/VM
- Benchmarking a caching system is non-trivial:
  - SSD performance can vary over time
  - SSD performance can vary between runs
  - Data distribution important (cf. Zipf)
[Plot annotations: switch cache mode from ‘none’ to ‘writeback’; benchmark ended, #threads decreased]
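
Why the data distribution matters when benchmarking a cache can be illustrated with a toy simulation. The sketch below compares the hit ratio of a small cache under uniform and Zipf-like block accesses. Everything in it is an assumption made for illustration: the LRU replacement policy (not claimed to be what bcache uses), the block counts, the cache size and the fixed random seed.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(accesses, cache_blocks):
    """Fraction of accesses served from an LRU cache holding cache_blocks blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


def make_accesses(n_blocks, n_accesses, zipf_exponent, seed=0):
    """Random block accesses; weight of the block at rank r is 1 / r**exponent."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** zipf_exponent) for rank in range(1, n_blocks + 1)]
    return rng.choices(range(n_blocks), weights=weights, k=n_accesses)


# Exponent 0 gives a uniform distribution, exponent 1 a classic Zipf shape.
uniform = lru_hit_ratio(make_accesses(10_000, 50_000, 0.0), cache_blocks=1_000)
zipf = lru_hit_ratio(make_accesses(10_000, 50_000, 1.0), cache_blocks=1_000)
print(f"uniform: {uniform:.2f}, zipf: {zipf:.2f}")
```

With the same cache size, the skewed workload gets a much higher hit ratio than the uniform one, so a benchmark with uniformly random accesses can badly underestimate what a cache delivers for real workloads.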

bcache and latency
- Use a VM on a bcache hypervisor
- SSD block level caching sufficient for IOPS and latency demands
- Caveat: SSD failures are fatal!
- Clients: lxplus, ATLAS build service, CMS Frontier, root, … (16 tenants)
[Plot legend: blue: CVI VM (h/w RAID-10 w/ cache); yellow: OpenStack VM; red: OpenStack on bcache HV]

“Tip” 5: KVM caching
- I/O from the VM goes directly to the disk
  - Required for live migration
  - Not optimal for performance
- I/O can be cached on the hypervisor
  - Operationally difficult
  - No live migration
  - Done for batch
[Diagram: a virtual machine (4 cores, 8 GB RAM) on a hypervisor (RAM, CPUs, disk)]

KVM caching in action
[Plot: ATLAS SAM VM (‘none’ to ‘write-back’)]
[Plot: batch nodes with and without KVM caching]

Take home messages
- The Cloud service offers various options to improve the I/O performance of your VMs
- You need to analyze and pick the right one for your use case
  - Reduce I/O
  - Check I/O scheduler
  - Use volumes
  - Use SSD hypervisors
  - (Use KVM caching)
- Get in touch with the Cloud team in case you need assistance!