1
S-memV: Split Migration of Large-memory Virtual Machines in IaaS Clouds
I'm Kenichi Kourai from Kyushu Institute of Technology. I'm going to talk about S-memV: Split Migration of Large-memory Virtual Machines in IaaS Clouds. This is joint work with my students. Masato Suetake, Takahiro Kashiwagi, Hazuki Kizu, and Kenichi Kourai, Kyushu Institute of Technology, Japan
2
Large-memory VMs Recent IaaS clouds provide VMs with a large amount of memory Amazon EC2 provides VMs with up to 4 TB of memory The ratio of memory size to CPU size is increasing [Nitu+, EuroSys'18] Required for big data analysis in AI and IoT As a recent trend, IaaS clouds provide virtual machines with a large amount of memory. For example, Amazon EC2 provides VMs with up to 4 TB of memory and is planning VMs with up to 16 TB. In the VMs provided by Amazon EC2, the ratio of memory size to CPU size is increasing, as shown in the figure. Over the last decade, the average memory size of a VM has doubled. Such large-memory VMs are required for big data analysis in artificial intelligence and the Internet of Things. Big data can be analyzed more efficiently by keeping as much data in memory as possible. (Figure: memory-to-CPU-size ratio by year)
3
Migration of Large-memory VMs
VM migration enables a running VM to be moved to another host Transfer the VM's memory to the destination host One issue is the availability of the destination host It is not cost-efficient to reserve large hosts as the destination The migration of large-memory VMs is still important. VM migration enables a running VM to be moved to another host without stopping the target VM. Using VM migration, administrators can maintain a host without service disruption after they migrate all the VMs running on that host. VM migration first creates a new empty VM at the destination host. Then, it transfers the memory contents of the VM to the destination and stores them in the memory of the created VM. Finally, it switches to the new VM. One issue is the availability of the destination host. Since the large-memory VM runs at the destination host after VM migration, the destination host needs to have at least as much memory as the source host. It is not cost-efficient to always reserve such hosts as migration destinations, even if that is possible in clouds. Equipping physical servers with a large amount of memory requires larger-capacity memory modules, which are much more expensive. (Figure: a 4-TB VM migrated from the source host to a destination host with 4 TB of free memory)
4
Traditional Approach
Use virtual memory to store part of the memory data in storage The system can run VMs with a larger amount of memory than physical memory Perform paging when necessary Page-out: physical memory to storage Page-in: storage to physical memory The traditional approach to this issue is to use virtual memory. Virtual memory is an operating system mechanism that transparently stores part of the memory data in storage. Using virtual memory, the system can run VMs with a larger amount of memory than the physical memory. Virtual memory performs paging when necessary. If the physical memory becomes full, some data in memory is moved to storage. This is called page-out. If the data in storage is required, it is moved back to memory. This is called page-in. Traditionally, slow HDDs were used as storage for virtual memory, but recent SSDs are much faster. For example, some products achieve 3.4 GB/s for sequential reads and 2.5 GB/s for writes. (Figure: 2 TB of physical memory plus a 2-TB SSD ≒ 4 TB of virtual memory)
5
Performance Degradation
Migration performance is degraded Paging frequently occurs during VM migration Even fast SSDs are slower than memory Performance after migration is degraded Memory being used by a VM is paged out Regardless of the memory access pattern However, virtual memory is incompatible with the migration of large-memory VMs. First, migration performance is degraded. The VM's memory is transferred to the destination host, and after the physical memory becomes full, memory transfers always cause page-outs to storage. So, paging occurs frequently during VM migration. Even fast SSDs are still slower than memory: the transfer rate of the latest DDR4 memory is 34 GB/s, which is 10 times faster than SSDs. Second, application performance after VM migration is degraded. One reason is, of course, the slow storage used by paging. The other, migration-specific reason is that memory being used by the VM is paged out. The VM's memory that was transferred earlier is unconditionally paged out, regardless of the memory access pattern inside the VM. (Figure: page-outs from the 2-TB destination host to the SSD during migration of a 4-TB VM)
6
Migration Performance
The migration time was 2.2x longer with SSD and 11.7x longer with HDD The downtime was 4 seconds with SSD and 30 seconds with HDD This is the comparison of migration performance. The left figure shows the migration time of a VM with 12 GB of memory using 10 Gigabit Ethernet. Compared with ideal VM migration with sufficient memory, the migration time was 2.2 times longer when using SSD. Using the slower HDD, the time was 11.7 times longer. The other indicator of VM migration is the downtime, which is the time during which a VM stops in the final phase of VM migration. The right figure shows the downtime. In ideal VM migration, the downtime was only 0.4 seconds. When using SSD, it increased to 4 seconds, and using HDD, it reached 30 seconds. These increases are not acceptable. (Figures: migration time and downtime)
7
Performance after Migration
The throughput of an in-memory database was largely degraded By 95% with SSD just after migration It took 21 minutes to restore the performance This is the comparison of application performance after VM migration. The application used was an in-memory database, which stores data in memory as a cache. The throughput of the in-memory database was largely degraded. The performance degradation was 95% with SSD just after VM migration. The throughput gradually increased after that, but it took 21 minutes to restore the same performance as before VM migration. When using HDD, the throughput was not restored at all.
8
Our Approach: Split Migration
Migrate a VM to multiple hosts Divide its memory into smaller pieces Transfer them to the main host or sub-hosts No paging occurs during VM migration Memory that is not accommodated in the main host is directly transferred to sub-hosts So, we propose a new VM migration method called split migration. Split migration migrates a large-memory VM running on one host to multiple destination hosts, instead of using one destination host and its storage. The destination hosts consist of one main host and one or more sub-hosts. Split migration divides the VM's memory into smaller pieces and transfers them to the main host or the sub-hosts. Memory that is not accommodated in the main host is directly transferred to sub-hosts, so no paging occurs during VM migration. (Figure: a 4-TB VM split between a 2-TB main host and 2-TB sub-hosts)
9
Remote Paging in S-memV
Run a migrated VM across multiple hosts VM core runs in the main host E.g., virtual CPUs and devices Perform remote paging between the main host and sub-hosts When the VM requires memory in sub-hosts Our system for enabling split migration is called S-memV. After split migration, S-memV runs the migrated VM across multiple hosts. In that VM, the VM core, such as virtual CPUs and virtual devices, runs in the main host, while the VM's memory is distributed across the main host and sub-hosts. When the VM requires memory located in a sub-host, S-memV performs remote paging between the main host and the sub-host. Specifically, the required memory in the sub-host is paged in to the main host, and unused memory in the main host is paged out to the sub-host. (Figure: remote paging between the 2-TB main host and 2-TB sub-hosts)
10
Paging-aware Memory Transfers
Reducing remote paging is the key to efficient execution Frequent remote paging degrades performance Split migration transfers memory likely to be accessed to the main host Suppress remote paging after split migration Since S-memV performs remote paging over the network, reducing the frequency of remote paging is the key to the efficient execution of VMs. Networks are getting faster, but they are still slower than memory, so frequent remote paging significantly degrades a VM's performance. Therefore, split migration performs paging-aware memory transfers. It transfers memory likely to be accessed to the main host. This suppresses remote paging after VM migration because memory in the main host can be accessed by the VM core without remote paging. The rest of the memory is transferred to one of the sub-hosts. (Figure: memory likely to be accessed goes to the main host; memory unlikely to be accessed goes to sub-hosts)
11
Memory Access Prediction
The prediction of the VM's memory access is necessary Based on Least Recently Used (LRU) Not recently used memory will not be used Can also be used for page-outs Memory unlikely to be accessed goes to sub-hosts To achieve paging-aware memory transfers, the prediction of the VM's memory access is needed. Our memory access prediction is based on the well-known Least Recently Used (LRU) algorithm. As in the figure, some memory regions are recently used and some are not. Using the LRU algorithm, S-memV predicts that not recently used memory will not be used in the near future; in other words, recently used memory is likely to be used. This memory access prediction can also be used for page-outs. When S-memV performs remote paging, it pages out memory unlikely to be accessed to sub-hosts. This prevents the memory from being transferred back and forth between the hosts. (Figure: the VM's memory access history; recently used regions are likely to be used, not recently used regions are unlikely to be used)
12
Memory Access History Keep track of page access inside a VM
Traverse the EPT assigned to a VM Record and reset access bits Use the aging algorithm for LRU Accumulate access information in an 8-bit history Shift the history right by one bit To obtain the necessary memory access history, S-memV keeps track of memory page accesses inside a VM. It periodically traverses the extended page tables (EPT) assigned to the VM. If the CPU supports EPT A/D bits, the access bit is set when the corresponding page is accessed. S-memV records the values of the access bits and resets them so that the CPU can record new memory accesses. As an LRU approximation, S-memV uses the aging algorithm with an 8-bit history per page. It accumulates access information in the most significant bit of the history for a while, and after a certain period, it shifts the history right by one bit. As a result, the more recently a page has been accessed, the larger its history value. (Figure: access bits accumulated into the 8-bit history)
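As a rough illustration of this aging scheme, here is a minimal C sketch. The page_history array and the ept_test_and_clear_accessed() helper are assumptions made for the sketch, not S-memV's actual data structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the EPT access bit of a guest page and
 * clears it so that the CPU can record new accesses. */
extern bool ept_test_and_clear_accessed(size_t page_index);

/* One 8-bit aging history per guest page; a larger value means the page
 * was accessed more recently. */
extern uint8_t *page_history;

/* Sample the access bits: accumulate each page's current access bit into
 * the most significant bit of its history. */
void record_access_bits(size_t num_pages)
{
    for (size_t i = 0; i < num_pages; i++) {
        if (ept_test_and_clear_accessed(i))
            page_history[i] |= 0x80;
    }
}

/* Once per aging period: shift every history right by one bit so that
 * older accesses gradually lose weight (LRU approximation). */
void age_histories(size_t num_pages)
{
    for (size_t i = 0; i < num_pages; i++)
        page_history[i] >>= 1;
}
```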
13
LRU-based Memory Splitting (1/2)
Split the VM's memory at chunk granularity A chunk consists of contiguous memory pages Calculate the chunk history from the page histories Assign chunks with larger history values to the main host in order Time-consuming sorting of chunks should be avoided On the basis of the memory access history, S-memV splits the VM's memory at chunk granularity. A chunk consists of contiguous memory pages and is the unit of remote paging; the purpose of using chunks is to make remote paging efficient. S-memV first calculates the history value of a chunk from the histories of the pages contained in the chunk: the chunk history is the bitwise OR of the page histories. Then, S-memV assigns chunks with larger history values to the main host in order. However, sorting a large number of chunks is time-consuming for a large-memory VM, so it should be avoided. (Figure: a chunk history, e.g., 191, computed as the bitwise OR of its page histories)
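A chunk's history can then be derived from its pages with a simple bitwise OR, as in the following sketch; the chunk size PAGES_PER_CHUNK and the page_history layout are assumptions, since the slide does not give the actual chunk size.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGES_PER_CHUNK 64   /* assumed chunk size; S-memV's value may differ */

/* Compute a chunk's history as the bitwise OR of the 8-bit histories of
 * the contiguous pages it contains. */
uint8_t chunk_history(const uint8_t *page_history, size_t chunk_index)
{
    const uint8_t *pages = &page_history[chunk_index * PAGES_PER_CHUNK];
    uint8_t history = 0;

    for (size_t i = 0; i < PAGES_PER_CHUNK; i++)
        history |= pages[i];
    return history;
}
```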
14
LRU-based Memory Splitting (2/2)
Create a histogram of chunk histories and find the threshold value Chunks with larger values go to the main host Chunks with smaller values go to sub-hosts Chunks with the threshold value go to the main host as much as possible S-memV creates a histogram of the 8-bit chunk history values, from 0 to 255. It then sums up the number of chunks in the histogram in descending order of history value until the sum reaches the maximum number of chunks that can be accommodated in the main host, in this case, 2 TB. This gives the threshold history value. Using the threshold, S-memV assigns chunks with larger values to the main host and chunks with smaller values to one of the sub-hosts. For chunks with exactly the threshold value, it assigns as many as possible to the main host and the rest to sub-hosts. (Figure: histogram of chunk history values from 255 down to 0, split at the threshold between the 2-TB main host and sub-host 1)
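The threshold can be found in one pass over a 256-bin histogram instead of sorting all chunks. The following sketch assumes a precomputed chunk_hist array and a main-host capacity expressed in chunks; it illustrates the idea rather than reproducing S-memV's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Find the history threshold for the main host without sorting:
 * build a histogram of chunk history values and walk it from 255 down
 * until the main host's capacity (in chunks) is filled.
 * *at_threshold receives how many chunks with exactly the threshold
 * value still fit in the main host. */
int find_threshold(const uint8_t *chunk_hist, size_t num_chunks,
                   size_t main_capacity, size_t *at_threshold)
{
    size_t histogram[256] = { 0 };
    size_t assigned = 0;

    for (size_t i = 0; i < num_chunks; i++)
        histogram[chunk_hist[i]]++;

    for (int value = 255; value >= 0; value--) {
        if (assigned + histogram[value] >= main_capacity) {
            *at_threshold = main_capacity - assigned;
            return value;                   /* partially assigned bin */
        }
        assigned += histogram[value];
    }

    *at_threshold = histogram[0];           /* everything fits */
    return 0;
}
```

Chunks whose history is above the returned threshold go to the main host, chunks below it go to sub-hosts, and only *at_threshold of the chunks with exactly the threshold value stay in the main host.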
15
Demo of Split Migration
red: memory in a sub-host This is a demo of memory transfers during split migration. The red region shows the VM's memory held by a sub-host. Before split migration, the sub-host holds none of the VM's memory. As the sub-host receives the VM's memory, the region turns red. Finally, the sub-host holds half of the VM's memory.
16
Memory Management VM's memory is managed by multiple hosts
Network page table in the main host Manages in which host each page is located EPT in the main host Page sub-tables in sub-hosts Manage the VM's memory data After split migration, the VM's memory is managed by multiple hosts, so S-memV uses three types of page tables. A network page table is maintained in the main host; it manages in which host each page is located. For actual memory assignment to the VM, the extended page tables (EPT) are used in the main host; the EPT manages only the pages resident in the main host. Page sub-tables are maintained in the sub-hosts; they manage the pages resident in the sub-hosts. (Figure: network page table and EPT in the main host, page sub-table in the sub-host)
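A minimal sketch of what a network page table entry might look like is shown below; the layout (one entry per guest frame, recording a host identifier) is an assumption for illustration, not the actual S-memV structure.

```c
#include <stdint.h>

/* Where a guest page currently resides. */
enum page_location {
    PAGE_IN_MAIN_HOST = 0,   /* resident in the main host, mapped via the EPT   */
    PAGE_IN_SUB_HOST         /* resident in the sub-host named by sub_host_id   */
};

/* One network page table entry per guest page. */
struct npt_entry {
    uint8_t location;        /* enum page_location */
    uint8_t sub_host_id;     /* valid only when location == PAGE_IN_SUB_HOST */
};

extern struct npt_entry *net_page_table;   /* indexed by guest frame number */

/* Return the sub-host that must serve a page-in for this guest frame,
 * or -1 if the page is already resident in the main host. */
int lookup_sub_host(uint64_t gfn)
{
    const struct npt_entry *e = &net_page_table[gfn];

    return e->location == PAGE_IN_MAIN_HOST ? -1 : e->sub_host_id;
}
```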
17
Chunk-level Page-in Use the Linux userfaultfd mechanism
Raise an event on accessing a non-existent page Send requests to the sub-host found in the network page table Request the faulting page first for a quick restart Next, request the other pages in the chunk To implement remote paging, S-memV uses the Linux userfaultfd mechanism, a relatively new mechanism introduced in Linux 4.3. S-memV registers all the memory pages of the VM with userfaultfd in the final phase of split migration. If a virtual CPU accesses a non-existent page in the VM, a page fault occurs and userfaultfd raises an event to S-memV. S-memV then searches the network page table and sends page-in requests to the sub-host it finds. It first requests the faulting page so that the suspended virtual CPU can restart as quickly as possible, and next requests the other pages in the same chunk. (Figure: a page fault event in the main host triggers page-in requests to the sub-host)
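Below is a hedged C sketch of how userfaultfd can be used for this kind of page-in handling, using only standard userfaultfd calls (UFFDIO_REGISTER, UFFDIO_COPY). The helpers fetch_page_from_sub_host() and request_rest_of_chunk() are hypothetical stand-ins for S-memV's page-in protocol.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#define PAGE_SIZE 4096UL

/* Hypothetical helpers standing in for S-memV's page-in protocol. */
extern void *fetch_page_from_sub_host(uint64_t addr);  /* returns a local copy */
extern void request_rest_of_chunk(uint64_t addr);      /* prefetch the chunk   */

/* Register the VM's memory with userfaultfd so that accesses to
 * non-resident (missing) pages deliver events to user space. */
int setup_userfaultfd(void *vm_mem, size_t vm_size)
{
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);

    struct uffdio_api api = { .api = UFFD_API };
    ioctl(uffd, UFFDIO_API, &api);

    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)vm_mem, .len = vm_size },
        .mode  = UFFDIO_REGISTER_MODE_MISSING,
    };
    ioctl(uffd, UFFDIO_REGISTER, &reg);
    return uffd;
}

/* Handle one fault: resolve the faulting page first so the suspended
 * virtual CPU can resume quickly, then prefetch the rest of its chunk. */
void handle_fault(int uffd)
{
    struct uffd_msg msg;

    if (read(uffd, &msg, sizeof(msg)) != sizeof(msg) ||
        msg.event != UFFD_EVENT_PAGEFAULT)
        return;

    uint64_t addr = msg.arg.pagefault.address & ~(uint64_t)(PAGE_SIZE - 1);
    void *page = fetch_page_from_sub_host(addr);

    struct uffdio_copy copy = {
        .dst = addr,
        .src = (uintptr_t)page,
        .len = PAGE_SIZE,
    };
    ioctl(uffd, UFFDIO_COPY, &copy);  /* installs the page and wakes the vCPU */

    request_rest_of_chunk(addr);
}
```

The remaining pages of the chunk would be installed with further UFFDIO_COPY calls as they arrive from the sub-host; this sketch copies only the faulting page, matching the "faulting page first" order described on the slide.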
18
Chunk-level Page-out Find a chunk with the smallest history
Least recently used chunk Obtain the memory contents of the chunk Remove its memory mapping at the same time By extending userfaultfd Send page-out requests to the sub-host After page-ins, S-memV performs page-outs to balance the amount of memory in the main host. It first finds the chunk with the smallest history value, which is the least recently used chunk. Next, S-memV obtains the memory contents of the pages in that chunk and, at the same time, removes the memory mapping of these pages from the VM. To execute these two operations atomically, we have extended the userfaultfd mechanism. Then, S-memV sends page-out requests to the sub-host from which a chunk has been paged in. (Figure: the main host removes the mapping and sends page-out requests with the data to the sub-host)
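The victim selection for a page-out can be sketched as a scan for the resident chunk with the smallest history value. Note that the mapping removal shown here uses plain madvise(MADV_DONTNEED) only as an approximation; the atomic read-and-unmap described on the slide relies on S-memV's userfaultfd extension, which is not part of stock Linux. The chunk_hist and resident arrays are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGES_PER_CHUNK 64             /* assumed, as in the earlier sketches */
#define PAGE_SIZE 4096UL

/* Pick the page-out victim: the chunk resident in the main host with the
 * smallest (least recently used) history value. Returns (size_t)-1 if no
 * chunk is resident. */
size_t pick_victim_chunk(const uint8_t *chunk_hist, const uint8_t *resident,
                         size_t num_chunks)
{
    size_t victim = (size_t)-1;
    uint8_t best = 0;

    for (size_t i = 0; i < num_chunks; i++) {
        if (!resident[i])
            continue;
        if (victim == (size_t)-1 || chunk_hist[i] < best) {
            best = chunk_hist[i];
            victim = i;
        }
    }
    return victim;
}

/* Release the victim chunk's mapping in the main host after its contents
 * have been sent to a sub-host (approximation of the atomic extension). */
void drop_chunk_mapping(void *vm_mem, size_t chunk_index)
{
    void *start = (char *)vm_mem + chunk_index * PAGES_PER_CHUNK * PAGE_SIZE;

    madvise(start, PAGES_PER_CHUNK * PAGE_SIZE, MADV_DONTNEED);
}
```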
19
Demo of Remote Paging After split migration red: memory in a sub-host
black: memory in the main host This demo shows remote paging after split migration. The red region shows memory chunks in a sub-host, and the black region shows memory chunks in the main host. During the execution of the sort command, memory chunks in various regions were paged in and out.
20
Experiments We conducted experiments to examine the performance of S-memV Ideal: VM migration with sufficient memory Comparison: VM migration with SSD Setup: source host / destination hosts (main host and sub-host); CPU: Xeon E v3 / Xeon E v2; Memory: 16 GB / 12 GB; Storage: MX300 SSD; Network: 10 GbE; OS: Linux 4.3; Virtualization: QEMU-KVM 2.4.1 We have implemented S-memV in KVM. To examine the performance of S-memV, we conducted several experiments. As the baseline, we used traditional ideal VM migration, which migrated a VM to one destination host with sufficient memory. For comparison, we performed VM migration with SSD as swap space. In our experiments, we migrated a 12-GB VM using 10 Gigabit Ethernet. Except for ideal VM migration, we adjusted the free memory size of the destination main host to half of the VM's memory.
21
Performance of Split Migration
The migration time was 5% longer 2.2x shorter than when using SSD The downtime was 7 ms longer 3.4 seconds shorter than when using SSD Here is the performance of split migration. The left figure shows the migration time. Compared with ideal VM migration, the migration time was only 5% longer. Since split migration did not perform remote paging during migration, this time was 2.2 times shorter than when using SSD. The right figure shows the downtime. The downtime was only 7 ms longer than in ideal VM migration. Thanks to almost no remote paging in the final phase, it was 3.4 seconds shorter than when using SSD. (Figures: migration time and downtime)
22
Impact of Memory Re-transfers
We compared a memory-intensive VM with an idle one The migration time was only 23 s longer The downtime did not increase Next, we examined the impact of memory re-transfers during VM migration. If the VM's memory is modified during memory transfers, it is re-transferred, so the migration time increases as more memory is modified. We compared such a memory-intensive VM with an idle one. As shown in the left figure, the migration time was only 23 seconds longer in split migration. This increase was slightly smaller than in ideal VM migration; the reason is under investigation. In contrast, when using SSD, the increase was 55 seconds. The downtime did not increase even when using SSD. (Figure: migration time increases of +23 s for split migration and +55 s with SSD)
23
Impact of Network We compared 1 GbE and 10 GbE
The migration time was 7x shorter Only 2.9x shorter when using SSD The downtime was almost the same Next, we compared migration performance between 1 Gigabit and 10 Gigabit Ethernet to examine the impact of the network. Using 10 Gigabit Ethernet, the migration time was 7 times shorter in split migration, although the network was 10 times faster; this is due to the constant, network-unrelated overhead of VM migration. In contrast, the migration time was only 2.9 times shorter when using SSD, because the faster 10 Gigabit Ethernet could not hide the performance degradation due to paging. The downtime was almost the same in split migration, but when using SSD, the downtime increased significantly with 10 Gigabit Ethernet due to paging. (Figures: 7x and 2.9x reductions in migration time)
24
Performance after Split Migration
The stable throughput of the in-memory database was degraded by 0.6% Largely degraded only just after migration Restored in 5 seconds Here is the application performance after split migration. Compared with ideal VM migration, the stable throughput of the in-memory database was degraded by only 0.6%. The throughput was largely degraded just after migration due to many page-ins, but the performance was restored in only 5 seconds, by which point the page-in rate had become almost zero.
25
Effectiveness of Prediction
The recovery time was 90% shorter than with random prediction The throughput was 4.4x higher just after migration Even the stable throughput was 7% higher Here is the comparison of our prediction algorithm with a random one. With the random algorithm, it took 60 seconds to restore the performance, so our more accurate prediction made the recovery time 90% shorter. Just after split migration, the throughput of the in-memory database was 4.4 times higher than with the random algorithm. Even the stable throughput after recovery was 7% higher.
26
Overhead of Prediction
We simulated the overhead for a 2-TB VM Collecting the memory access history took 0.6 s Memory splitting took 0.8 s The page-out decision took 13 ms Finally, we examined the overhead of memory access prediction. We created only the data structures and simulated a 2-TB VM. As shown in the left figure, the collection time of the memory access history was proportional to the memory size; it took 0.6 seconds for a 2-TB VM. This overhead is not small, but parallel collection could reduce the time. As shown in the middle figure, memory splitting took 0.8 seconds for a 2-TB VM, which is negligible compared with the long migration time. The right figure shows the page-out decision time. It was 13 ms, which is not critical for remote paging. (Figures: history collection time, memory splitting time, and page-out decision time)
27
Related Work Scatter-Gather Migration [Deshpande et al.'14]
The first half is similar to split migration Memory access prediction is not necessary MemX [Deshpande et al.'10] Run a VM using the memory of multiple hosts Support VM migration in a limited form Halite [Zhang et al.'13] Group VM's memory into locality blocks Use locality in the guest OS's virtual address spaces Scatter-Gather migration uses multiple intermediate hosts between the source and destination hosts. It pushes the VM's memory out of the source host as quickly as possible. Its first half is similar to split migration in that the source host transfers the VM's memory to multiple hosts; however, memory access prediction is not necessary there because the memory is eventually gathered at a single destination host. MemX can run a VM using the memory of multiple hosts. When a VM accesses non-existent memory pages, MemX obtains the corresponding pages from other hosts, but it supports VM migration only in a very limited form. Halite optimizes the restoration of checkpointed VMs. It groups the VM's memory pages that are likely to be accessed together into locality blocks. To predict the access locality of memory pages, it can use the locality in the virtual address spaces of the guest operating system in the VM. This technique could also be used in S-memV.
28
Conclusion We proposed split migration with S-memV Future work
Split migration migrates a large-memory VM to multiple hosts S-memV performs remote paging between hosts Predicts VM’s memory access to reduce the overhead Future work Evaluate split migration for larger-memory VMs Support the migration of VMs across hosts Provide fault tolerance to migrated VMs In conclusion, we proposed split migration with S-memV. Split migration migrates a large-memory VM to multiple hosts. This enables using the memory of smaller hosts even if one large host is not available. After split migration, S-memV performs remote paging between hosts. To reduce the overhead and improve the performance, S-memV predicts VM’s memory access. Our future work is to evaluate split migration for larger-memory VMs. We are interested in the impact of such VMs on split migration. Currently, we are working on the migration of VMs running across multiple hosts after split migration. In addition, we are planning to provide fault tolerance to migrated VMs because one VM relies on multiple hosts.