A Fast Rejuvenation Technique for Server Consolidation with Virtual Machines
Kenichi Kourai, Shigeru Chiba
Tokyo Institute of Technology
Server consolidation with VMs
- Server consolidation is widely carried out
  - Multiple server machines are integrated on one physical machine
- Recently, virtual machines (VMs) are used for this
  - VMs are run on a virtual machine monitor (VMM)
  - The VMM multiplexes resources
[Figure: VMs running on a VMM on top of the hardware]
Software aging of VMMs
- Software aging of a VMM is critical
- Software aging is the phenomenon that software state degrades with time
  - E.g. exhaustion of system resources
- Software aging of a VMM affects all VMs on it
  - E.g. performance degradation
[Figure: multiple VMs on one VMM]
Software rejuvenation of VMMs
- Preventive maintenance
  - Performed before software aging of a VMM affects its VMs
  - Occasionally stops a VMM, cleans its internal state, and restarts it
- Typical example: rebooting a VMM
  - Cleans the internal state automatically and completely
  - The easiest way
Drawbacks (1/2): Increasing service downtime
- The VMM reboot needs:
  - Rebooting all OSes running on the VMs
    - The time tends to be long with a larger number of VMs and a longer startup time of services
  - A hardware reset
    - The BIOS power-on self test is time-consuming
[Figure: VMM reboot timeline with OS shutdown, VMM shutdown, hardware reset, VMM boot, and OS boot]
Drawbacks (2/2): Performance degradation
- The file cache is lost by the OS reboot
  - OSes cannot restore performance until the file cache is re-filled
  - They strongly rely on the file cache to speed up file accesses
- The time tends to be long
  - The file cache size is increasing
    - Large amount of memory for a VM
    - Free memory is used as the file cache
[Figure: a process accessing the disk through the OS file cache]
Warm-VM reboot
- Fast rejuvenation technique
  - Efficiently reboots only a VMM
  - The VMM reboot causes no OS reboot
- Basic idea
  - Suspend all VMs before the VMM reboot
  - Resume them after the reboot
- Challenge
  - How does a VMM efficiently deal with the large memory images of VMs?
On-memory suspend of VMs
- Freezes the memory images of VMs on the main memory
  - That memory area is just reserved
  - The time does not depend on the memory size
  - Saving them into a slow disk is inefficient
- ACPI S3 state for VMs
  - Suspend To RAM
  - Traditional suspend corresponds to the ACPI S4 state (suspend to disk)
[Figure: a VM's memory image is frozen in main memory instead of being saved to disk]
On-memory resume of VMs
- Unfreezes the memory images preserved on the main memory
  - They are reused directly as the memory of VMs
  - No need to read them from a slow disk
- The file cache of OSes is also restored
  - No performance degradation
[Figure: a VM's memory image is unfrozen in main memory instead of being read from disk]
Quick reload of VMMs
- Directly boots a new VMM without a hardware reset
- The memory images of VMs are preserved through the VMM reboot
  - Software can keep track of them
  - A hardware reset does not guarantee this
- A VMM is rebooted quickly
  - No overhead due to a hardware reset
[Figure: old VMM, preloaded new VMM, and VM memory images in main memory]
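The three mechanisms above (on-memory suspend, quick reload, on-memory resume) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' implementation: `Machine`, `ToyVMM`, `quick_reload`, and `demo` are hypothetical names, physical memory is a plain Python list, and the "reserved" table stands in for whatever bookkeeping a real VMM would use to track frozen page frames.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Toy physical machine: its memory outlives any single VMM instance."""
    frames: list                                   # contents of each page frame
    reserved: dict = field(default_factory=dict)   # VM name -> frozen frame numbers


class ToyVMM:
    """Hypothetical VMM that maps each VM's pages to machine page frames."""

    def __init__(self, machine):
        self.machine = machine
        self.vm_to_frames = {}                     # VM name -> list of frame numbers

    def create_vm(self, name, frame_numbers):
        self.vm_to_frames[name] = list(frame_numbers)

    def suspend_on_memory(self, name):
        # Freeze: only the mapping is recorded; no page contents are copied,
        # so the cost does not grow with the memory size of the VM.
        self.machine.reserved[name] = self.vm_to_frames.pop(name)

    def resume_on_memory(self, name):
        # Unfreeze: reuse the preserved frames directly as the VM's memory.
        self.vm_to_frames[name] = self.machine.reserved.pop(name)

    def read(self, name, page):
        return self.machine.frames[self.vm_to_frames[name][page]]


def quick_reload(old_vmm):
    # Start a fresh VMM on the same machine without resetting its memory.
    return ToyVMM(old_vmm.machine)


def demo():
    machine = Machine(frames=["empty"] * 8)
    vmm = ToyVMM(machine)
    vmm.create_vm("vm1", [2, 5])
    machine.frames[5] = "file cache page"          # state held inside the VM
    vmm.suspend_on_memory("vm1")
    new_vmm = quick_reload(vmm)                    # old VMM state is discarded
    new_vmm.resume_on_memory("vm1")
    return new_vmm.read("vm1", 1)
```

In this model the new VMM starts with an empty mapping table (its internal state is clean), yet the VM still sees its old page contents after the resume, which is the property the warm-VM reboot relies on.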
Comparison with other methods
- Cold-VM reboot
  - Needs the OS reboot
- Saved-VM reboot
  - A naive implementation of the warm-VM reboot
  - VMs are saved into a disk

Reboot method                  Cold-VM   Saved-VM   Warm-VM
Depends on # of VMs            Yes       Yes        No
Depends on services            Yes       No         No
Depends on mem size of VMs     No        Yes        No
Performance degradation        Yes       No         No
Model for availability
- Must consider the software rejuvenation of both a VMM and OSes
- Warm-VM reboot
  - The OS rejuvenation is independent
- Cold-VM reboot
  - The OS rejuvenation is affected by the VMM rejuvenation
  - The number of OS rejuvenations increases
[Figure: timelines of OS rejuvenation and VMM rejuvenation for the two reboot methods]
RootHammer
- We have implemented the warm-VM reboot into Xen
- On-memory suspend/resume
  - Based on Xen's suspend/resume
  - Manages the mapping from the VM memory to the physical memory
- Quick reload
  - Based on the kexec mechanism in Linux
  - Kexec for a VMM is included in the latest Xen
    - It is not for reusing the memory images
[Figure: mapping from VM memory to physical memory]
Experiments
- Examine whether the warm-VM reboot reduces downtime and performance degradation
- Comparison
  - Cold-VM reboot with the OS reboot
  - Saved-VM reboot using Xen's suspend/resume
[Figure: experimental setup: server running the VMM and Linux (2 dual-core Opteron, GB SDRAM, 15,000 rpm SCSI disk), Linux client, gigabit Ethernet]
Performance of on-memory suspend/resume
- Suspend/resume of one VM with 11 GB of memory
  - Ours: 1 sec
  - Xen's: 280 sec
    - Depends on the memory size
- Suspend/resume of 11 VMs
  - Ours: 4 sec
  - OS reboot: 58 sec
    - Depends on # of VMs
Effect of quick reload
- The time of rebooting a VMM with no VMs
- Warm-VM reboot: 11 sec
  - The time of quick reload is negligible
- Cold-VM reboot: 59 sec
  - The time due to a hardware reset is 48 sec
Downtime of services
- Warm-VM reboot
  - Always the same: 42 sec
- Saved-VM reboot
  - Depends on # of VMs
  - 429 sec (11 VMs)
- Cold-VM reboot
  - Affected by the service type
  - 157 sec (sshd)
  - 241 sec (JBoss)
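To make the comparison concrete, the relative downtime reduction of the warm-VM reboot can be computed from the figures on this slide. The snippet below is a small worked calculation; the dictionary keys and the helper `reduction_percent` are names chosen here for illustration.

```python
# Downtime figures (seconds) as reported on the slide.
DOWNTIME_SEC = {
    "warm": 42,
    "saved_11_vms": 429,
    "cold_sshd": 157,
    "cold_jboss": 241,
}


def reduction_percent(baseline_sec, improved_sec):
    """Relative downtime reduction of improved_sec against baseline_sec, in percent."""
    return 100.0 * (baseline_sec - improved_sec) / baseline_sec


# Reduction achieved by the warm-VM reboot against each alternative.
REDUCTIONS = {
    name: round(reduction_percent(sec, DOWNTIME_SEC["warm"]), 1)
    for name, sec in DOWNTIME_SEC.items()
    if name != "warm"
}
```

Against the cold-VM reboot this gives roughly 73% for sshd and roughly 83% for JBoss; against the saved-VM reboot with 11 VMs it gives roughly 90%.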
Availability of JBoss
- The warm-VM reboot achieves four 9s
- Assumptions
  - OS rejuvenation every week: 34 sec
  - VMM rejuvenation every 4 weeks
    - In 0.5 week after the last OS rejuvenation

Reboot method      Availability
Warm-VM reboot     99.993%
Cold-VM reboot     99.985%
Saved-VM reboot    99.977%

[Figure: timeline of OS rejuvenation (every 1 week) and VMM rejuvenation (0.5 week after the last OS rejuvenation)]
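The availability figures can be approximated with a simple downtime-over-interval calculation. The slides do not give the formula, so the sketch below is a reconstruction under assumptions: the VMM downtimes are taken from the "Downtime of services" slide (42, 241, and 429 sec), the warm-VM and saved-VM reboots are assumed to leave the weekly OS schedule untouched (4 OS rejuvenations per 4-week interval), and the cold-VM reboot is assumed to replace part of that schedule, leaving 3.5 separate OS rejuvenations on average. The function and parameter names are hypothetical.

```python
WEEK_SEC = 7 * 24 * 3600


def availability(vmm_downtime_sec, os_rejuvenations,
                 os_downtime_sec=34, vmm_interval_weeks=4):
    """Fraction of one VMM rejuvenation interval during which the service is up."""
    total_downtime = vmm_downtime_sec + os_rejuvenations * os_downtime_sec
    return 1.0 - total_downtime / (vmm_interval_weeks * WEEK_SEC)


# Assumed inputs: (VMM reboot downtime in sec, OS rejuvenations per interval).
SCENARIOS = {
    "warm": (42, 4),
    "cold": (241, 3.5),   # assumption: the cold-VM reboot also reboots the OSes
    "saved": (429, 4),
}

AVAILABILITY_PERCENT = {
    name: f"{100.0 * availability(down, n):.3f}"
    for name, (down, n) in SCENARIOS.items()
}
```

With these assumed inputs the results round to 99.993%, 99.985%, and 99.977%, matching the values on the slide; a different reading of the 0.5-week offset would change the cold-VM count and therefore its result.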
Performance degradation
- The throughput of the Apache web server before and after the VMM reboot
- Warm-VM reboot: no degradation
- Cold-VM reboot: degraded by 69%
Software rejuvenation in a cluster environment
- Clustering achieves zero downtime
  - Multiple hosts can provide the same service
- Let us consider the total throughput of all hosts in a cluster
  - Warm-VM reboot: (m-1)p during the reboot
  - Cold-VM reboot: (m-1)p during the reboot, then (m-0.69)p for a while after the reboot
  - m: # of hosts, p: throughput of one host
[Figure: total throughput over time, dropping from mp to (m-1)p for 42 sec (warm-VM reboot) or 241 sec (cold-VM reboot)]
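The throughput expressions above follow from counting how many hosts are fully available, rebooting, or running with a cold file cache. The sketch below assumes hosts are rejuvenated one at a time and that a degraded host delivers (1 - 0.69)p; the function name and the sample values m = 4 and p = 100 are illustrative only.

```python
def total_throughput(m, p, rebooting_hosts=0, degraded_hosts=0, degradation=0.69):
    """Total cluster throughput for m hosts, each delivering p when healthy."""
    healthy_hosts = m - rebooting_hosts - degraded_hosts
    # A rebooting host contributes nothing; a degraded host contributes a fraction of p.
    return healthy_hosts * p + degraded_hosts * (1.0 - degradation) * p


M, P = 4, 100.0                                         # illustrative values
normal = total_throughput(M, P)                         # m * p
during_reboot = total_throughput(M, P, rebooting_hosts=1)    # (m - 1) * p
after_cold_reboot = total_throughput(M, P, degraded_hosts=1)  # (m - 0.69) * p
```

Under the warm-VM reboot only the `during_reboot` phase occurs; under the cold-VM reboot the `after_cold_reboot` phase follows it until the file cache is re-filled.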
Comparison with VM migration in a cluster environment
- VM migration achieves nearly zero downtime
  - VMs are moved to another host
  - Xen's live migration, VMware's VMotion
- Total throughput
  - Normal run: (m-1)p
    - One host is reserved for migration
  - Live migration: (m-1.12)p
[Figure: total throughput over time between mp and (m-1)p, comparing the warm-VM reboot (42 sec) with live migration (17 min)]
Related work
- Microreboot [Candea et al. '04]
  - Reboots only a part of subcomponents
  - The warm-VM reboot enables rebooting only a parent component (VMM for VMs)
- Checkpointing/restart [Randell '75]
  - Saves/restores OS processes
  - Similar to suspend/resume of VMs
- Optimizations of suspend/resume
  - Incremental suspend, compression of memory images
Conclusion
- We proposed the warm-VM reboot
  - On-memory suspend/resume
    - Freezes/unfreezes the memory images of VMs
  - Quick reload
    - Preserves the memory images through the VMM reboot
- It achieved fast rejuvenation
  - Downtime reduced by 83% at maximum
  - No performance degradation