1
Resource Cages: A New Abstraction of the Hypervisor for Performance Isolation Considering IDS Offloading Kenichi Kourai*, Sungho Arai**, Kousuke Nakamura*, Seigo Okazaki*, and Shigeru Chiba*** * Kyushu Institute of Technology, Japan ** Tokyo Institute of Technology, Japan *** The University of Tokyo, Japan I'm Kenichi Kourai from Kyushu Institute of Technology. I'm going to talk about resource cages, a new abstraction of the hypervisor for performance isolation considering IDS offloading. This is joint work with my students and a colleague.
2
Intrusion Detection in Clouds
IaaS clouds Users can run their services in virtual machines (VMs) They need intrusion detection systems (IDSes) to protect their systems IDSes also suffer from external attacks Intruders can disable IDSes easily In Infrastructure-as-a-Service clouds, users can run their services in virtual machines. They can set up their systems in provided VMs and use the VMs as necessary. To protect the systems inside VMs from external attackers, they need intrusion detection systems. IDSes are used to monitor the system states, filesystems, and network packets. But IDSes also suffer from external attacks. After attackers intrude into a VM and take sufficient privileges, they can disable IDSes easily.
3
IDS Offloading Run IDSes outside target VMs securely
E.g., in the management VM Using VM introspection (VMI) [Garfinkel+'03] Directly obtain information inside VMs E.g., memory, storage, and networks Intruders cannot disable offloaded IDSes To prevent IDSes from being compromised by intruders, IDS offloading has been proposed. It enables securely running IDSes outside target VMs, for example, in a privileged VM called the management VM. Offloaded IDSes can directly obtain information inside VMs using a technique called VM introspection. For example, they can read kernel data in VM's memory, check the integrity of the storage, and capture all the packets from and to VMs. Even if attackers intrude into a VM, they cannot disable offloaded IDSes.
4
Performance Isolation
Each VM is strongly isolated by the hypervisor Cannot use more than a certain amount of resources Upper limit to a VM CPU utilization and memory size CPU shares between VMs However, IDS offloading makes performance isolation between VMs difficult. In a virtualized system, each VM is strongly isolated by the hypervisor. The hypervisor runs underneath all the VMs. For performance isolation, the hypervisor can guarantee that each VM doesn't use more than a certain amount of resources such as CPUs and memory. For example, the hypervisor can set the upper limit of CPU utilization to each VM. It can also set the maximum memory size to each VM. In addition, the hypervisor can set CPU shares between VMs. According to the shares, it proportionally allocates the CPU time to VMs.
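As a concrete illustration of such per-VM controls, the sketch below configures CPU limits, CPU shares, and a memory size with Xen's xl toolstack (used later in this talk) from the management VM. It is a minimal sketch: the domain names VM1 and VM2 and the values are assumptions, not taken from the presentation.

```python
import subprocess

def xl(*args):
    """Run an xl command in the management VM (dom0)."""
    subprocess.run(["xl", *args], check=True)

# Upper limit and shares: cap each VM at 50% of one pCPU and give both
# equal weights so spare CPU time is split 1:1 between them.
xl("sched-credit", "-d", "VM1", "-c", "50", "-w", "256")
xl("sched-credit", "-d", "VM2", "-c", "50", "-w", "256")

# Memory size: ask VM1's balloon driver to target 4 GiB.
xl("mem-set", "VM1", "4096m")
```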
5
Issue in IDS Offloading
Performance isolation is violated between VMs IDSes consume resources in the management VM A VM and its offloaded IDSes can exceed CPU limits IDSes are part of the monitored VM originally The total CPU utilization should be limited for fairness In IDS offloading, offloaded IDSes are executed in the management VM. In other words, they consume resources in the management VM. This violates performance isolation between VMs. For example, assume that the hypervisor limits the CPU utilization of VM1 to 50%. If IDS1 offloaded from the VM consumes 10% of the CPU time in the management VM, VM1 and IDS1 can use 60% of the CPU time in total. Since the offloaded IDS1 is part of VM1 originally, the total CPU utilization of VM1 and IDS1 should be limited to 50% for fairness.
6
Existing Resource Controls
Difficult to achieve efficient performance isolation considering IDS offloading Goal: Limit the total CPU utilization of a VM and its offloaded IDS to 50% Configuration: 40% for VM1 and 10% for IDS1 Idle IDS1 leads to low CPU utilization of VM1 It is difficult to achieve efficient performance isolation considering IDS offloading by simply combining the existing resource controls of the hypervisor and the operating system. Let's consider limiting the total CPU utilization of a VM and its offloaded IDS to 50% as a goal. For example, we can configure the CPU limit of VM1 to 40% and that of IDS1 to 10%. When both VM1 and IDS1 are busy, the goal of 50% can be achieved. But when IDS1 is idle, this configuration can achieve only low CPU utilization. The total CPU utilization of VM1 and IDS1 becomes only 40%. At this time, VM1 should be able to receive up to 50% of the CPU time. No static configuration can achieve such efficient performance isolation.
7
Resource Cages A new abstraction of the hypervisor for resource management considering IDS offloading Manage a VM and IDS processes as a group The traditional hypervisor manages only VMs Achieve both performance isolation and high resource utilization For resource management considering IDS offloading, we propose a new abstraction of the hypervisor, called resource cages. Traditionally, the hypervisor manages only VMs and not the processes inside each VM, because processes are managed by the operating systems. A resource cage manages a VM and the IDS processes offloaded from it as a group. Then, the hypervisor assigns CPUs and memory to resource cages, not VMs. A resource cage achieves both performance isolation and high resource utilization for a group of a VM and its offloaded IDSes.
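Conceptually, a resource cage is a named group with resource bounds that can contain a VM, offloaded IDS processes, and child cages. The sketch below is illustrative only; the class and field names (and the example PID) are assumptions for exposition, not the paper's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ResourceCage:
    """A group the hypervisor accounts and schedules as one unit."""
    name: str
    vm: Optional[str] = None                            # e.g., a Xen domain name
    ids_pids: List[int] = field(default_factory=list)   # IDS processes in the management VM
    cpu_limit: Optional[float] = None                   # upper limit on total CPU utilization (0.5 = 50%)
    cpu_shares: Optional[int] = None                    # proportional-share weight
    mem_limit: Optional[int] = None                     # upper limit on total memory in bytes
    children: List["ResourceCage"] = field(default_factory=list)

# RC1 groups VM1 with its offloaded IDS (hypothetical PID 4321); CPU and
# memory are then bounded for the pair rather than for VM1 alone.
rc1 = ResourceCage("RC1", cpu_limit=0.5, cpu_shares=256, mem_limit=4 << 30,
                   children=[ResourceCage("RC_vm1", vm="VM1"),
                             ResourceCage("RC_ids1", ids_pids=[4321])])
```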
8
Example: CPU Limit Guarantee the upper limit of the CPU utilization of each resource cage For idle IDS1, VM1 can receive up to 50% High resource utilization Even for idle RC1, RC2 cannot receive more than 50% Performance isolation Let's consider two resource cages: RC1 for VM1 and IDS1 and RC2 for VM2 and IDS2. The upper limit of the CPU utilization of each resource cage is guaranteed. For example, we can limit the CPU utilization of each of these resource cages to 50%. When IDS1 is idle, VM1 in the same resource cage can receive up to 50% of the CPU time. Similarly, when VM1 is idle, IDS1 can receive up to 50%. In any case, RC1 can receive up to 50% in total and high CPU utilization is achieved. Even when both VM1 and IDS1 are idle, RC2 cannot receive more than 50%. The performance is isolated between RC1 and RC2.
9
Example: CPU Shares Enable scheduling based on CPU shares assigned to resource cages For idle IDS1, VM1 can receive all the CPU time allocated to RC1 For idle RC1, the surplus CPU time is allocated to RC2 Work-conserving Scheduling based on CPU shares assigned to resource cages is also enabled. For example, we can assign CPU shares to RC1 and RC2 in a 1:1 ratio. When IDS1 is idle, VM1 can receive all the CPU time allocated to RC1. So the CPU allocation to RC1 and RC2 is kept to 1:1. Unlike the example of CPU limit, when both VM1 and IDS1 are idle, the surplus CPU time is allocated to RC2. This is because of the work-conserving nature of proportional share scheduling.
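The work-conserving behavior can be illustrated with a small weighted allocation routine: each cage gets a slice proportional to its shares, and any slice an idle cage cannot use is redistributed to the others. This is a generic sketch of proportional-share scheduling, not the scheduler's actual code.

```python
def allocate_cpu(shares, demand, capacity=1.0):
    """Work-conserving proportional-share allocation.
    shares: weight per cage; demand: CPU each cage can actually use (0..capacity)."""
    alloc = {rc: 0.0 for rc in shares}
    active = {rc for rc in shares if demand[rc] > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total = sum(shares[rc] for rc in active)
        slices = {rc: remaining * shares[rc] / total for rc in active}
        remaining = 0.0
        for rc in list(active):
            take = min(slices[rc], demand[rc] - alloc[rc])
            alloc[rc] += take
            remaining += slices[rc] - take      # surplus from cages that hit their demand
            if alloc[rc] >= demand[rc] - 1e-9:
                active.discard(rc)              # satisfied cages drop out; surplus flows on
    return alloc

# RC1 (VM1 and IDS1 idle) frees its half of the CPU; RC2 can use all of it.
print(allocate_cpu({"RC1": 256, "RC2": 256}, {"RC1": 0.0, "RC2": 1.0}))
```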
10
Example: Memory Limit Guarantee the upper limit of the memory size consumed by each resource cage For IDS1 with more memory, the memory allocation to VM1 is decreased further Even when both need less memory, RC2 cannot use more memory In addition, the upper limit of the memory size consumed by each resource cage is guaranteed. For example, we can limit the memory of RC1 to 4 GB. When IDS1 uses only a small amount of memory, VM1 can use most of the memory assigned to RC1. When IDS1 needs a larger amount of memory, the memory allocation to VM1 is decreased further. Even when both VM1 and IDS1 need only a small amount of memory, RC2 cannot use memory that exceeds its upper limit.
11
Hierarchy of Resource Cages
Resource cages are created hierarchically A VM is automatically assigned to a resource cage An IDS process is manually assigned to a resource cage These resource cages are assigned to a collective resource cage Resource cages are created hierarchically. The hypervisor automatically assigns a VM to a resource cage RC_vm when the VM is created. In contrast, system administrators manually assign an IDS process to a resource cage RC_ids. This is because the hypervisor doesn't know which process in the management VM is an IDS. Then, administrators create a collective resource cage RC and assign RC_vm and RC_ids to it. For RC_vm, RC_ids, and RC, administrators can set CPU limits, CPU shares, and memory limits.
12
Implementation We have implemented resource cages in Xen and in KVM
A VM scheduler, an IDS scheduler, and a memory scheduler Leverage existing mechanisms as much as possible We have implemented resource cages mainly in Xen. The resource management using resource cages is achieved by a VM scheduler, an IDS scheduler, and a memory scheduler. In our implementation, the hypervisor doesn't fully manage IDS processes included in resource cages. It leverages the existing mechanisms for resource management of the operating system as much as possible. The hypervisor just monitors processes, while the operating system controls them. Also, we have implemented resource cages in KVM, but the implementation is simpler than in Xen.
13
VM Scheduler with Credits
Based on the credit scheduler in Xen A proportional share CPU scheduler CPU shares and limit Calculate credits every 30 ms and distribute them to virtual CPUs (vCPUs) Schedule vCPUs to physical CPUs (pCPUs) Our VM scheduler is based on the credit scheduler in Xen. The credit scheduler is a proportional share CPU scheduler. Each VM is assigned CPU shares and a CPU limit. According to CPU shares, the credit scheduler calculates credits every 30 ms and distributes them to active virtual CPUs assigned to a VM. At that time, the distributed credits are restricted by a CPU limit. On the basis of credits, the scheduler schedules virtual CPUs to physical CPUs.
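As a rough illustration of the per-period accounting (simplified; the real credit scheduler also distributes credits to individual vCPUs and tracks per-vCPU states), credits are proportional to a VM's shares and clipped by its cap, where a cap of 100 corresponds to one full pCPU:

```python
PERIOD_MS = 30  # credit accounting period in Xen's credit scheduler

def credits_per_period(weight, total_weight, cap_pct, n_pcpus):
    """Weight-proportional share of the period's CPU time, clipped by the cap."""
    available = PERIOD_MS * n_pcpus            # CPU time to distribute this period (ms)
    fair = available * weight / total_weight   # proportional-share slice
    cap = PERIOD_MS * cap_pct / 100.0          # upper bound from the per-VM limit
    return min(fair, cap)

# Two VMs with equal weight on one pCPU, one capped at 40%:
# it earns min(15 ms, 12 ms) = 12 ms of credit every 30 ms.
print(credits_per_period(256, 512, 40, 1))
```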
14
Extended Credits Calculation
Calculate credits considering offloaded IDSes Temporarily decrease the CPU limit and CPU shares of a resource cage By the CPU time consumed by offloaded IDSes Distribute the calculated credits to the VM The remaining credits to the management VM Our VM scheduler calculates credits considering offloaded IDSes. The CPU limit of a resource cage is temporarily decreased by the CPU time consumed by offloaded IDSes. The CPU shares of the resource cage are also temporarily decreased. Using the decreased CPU limit and shares, the VM scheduler calculates credits. Then, it distributes the calculated credits to the virtual CPUs of the VM in the resource cage. The credits deducted for offloaded IDSes are distributed to the management VM so that it can run the IDSes.
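The following sketch shows the idea for one cage on one pCPU. The exact formulas used by the scheduler, including how the CPU shares are reduced, are not spelled out here, so this is a simplified assumption-laden illustration with made-up numbers.

```python
PERIOD_MS = 30

def extended_credits(cage_cap_pct, ids_cpu_pct):
    """Reduce the cage's limit by the CPU time its offloaded IDSes consumed,
    give the VM credits from the reduced limit, and hand the deducted credits
    to the management VM so it can keep running the IDSes."""
    vm_cap_pct = max(cage_cap_pct - ids_cpu_pct, 0.0)
    vm_credits = PERIOD_MS * vm_cap_pct / 100.0      # credits for the VM's vCPUs
    mgmt_credits = PERIOD_MS * ids_cpu_pct / 100.0   # credits passed to the management VM
    return vm_credits, mgmt_credits

# A cage limited to 50% whose IDS consumed 20% of the CPU: the VM gets 30%
# worth of credits (9 ms per period) and the management VM gets 20% (6 ms).
print(extended_credits(50, 20))
```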
15
IDS Scheduler: Monitoring
Monitor the CPU utilization of IDS processes from the hypervisor Identify each process by a virtual address space [Jones+'06] Record consumed CPU time by monitoring the switches between virtual address spaces Our IDS scheduler monitors the CPU utilization of IDS processes running in the management VM from the hypervisor. The hypervisor can identify each process by its virtual address space. When the operating system switches processes, it also switches virtual address spaces. The IDS scheduler records consumed CPU time by monitoring the switches between virtual address spaces. In the current implementation, the management VM explicitly notifies the hypervisor of information on the virtual address spaces of IDS processes.
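The accounting idea can be sketched as follows: time is charged to whichever registered IDS address space was running between two address-space switches. The hook name and the use of wall-clock time are assumptions for illustration; the real monitor runs inside the hypervisor.

```python
import time

class IDSMonitor:
    """Charge CPU time to IDS processes identified by their address spaces."""
    def __init__(self, ids_address_spaces):
        # Page-table bases of IDS processes, reported by the management VM.
        self.ids_spaces = set(ids_address_spaces)
        self.cpu_time = {asid: 0.0 for asid in ids_address_spaces}
        self.current = None
        self.since = time.monotonic()

    def on_address_space_switch(self, new_space):
        """Called (hypothetically) whenever the guest OS switches address spaces."""
        now = time.monotonic()
        if self.current in self.ids_spaces:
            self.cpu_time[self.current] += now - self.since  # charge the outgoing IDS
        self.current, self.since = new_space, now
```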
16
IDS Scheduler: Scheduling
Schedule IDS processes so as not to consume more CPU time than configured Calculate the runnable time from the CPU limit and the average CPU utilization Control the execution of IDS processes in the management VM According to the monitored CPU utilization of IDS processes, the IDS scheduler schedules the processes so that they don't consume more CPU time than configured. Every 100 ms, it calculates the runnable time of IDS processes from the CPU limit and the average CPU utilization. If the average CPU utilization of IDS processes exceeds the CPU limit, the runnable time is decreased. The IDS scheduler controls the execution of IDS processes in the management VM. Currently, we use a simple tool called cpulimit; alternatively, the CPU bandwidth controller in Linux cgroups could be used.
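A minimal sketch of this control step is shown below. The proportional adjustment of the runnable fraction is an assumption for illustration; the cpulimit call uses the tool's actual -p/-l options, which throttle a process to a given CPU percentage.

```python
import subprocess

PERIOD_S = 0.1   # the IDS scheduler re-evaluates every 100 ms

def next_runnable_fraction(cpu_limit, avg_utilization, prev_fraction):
    """If the IDSes' recent CPU utilization exceeds the cage's limit, shrink the
    time they may run next period; if it falls below, let them run longer."""
    return min(max(prev_fraction + (cpu_limit - avg_utilization), 0.0), 1.0)

def throttle_with_cpulimit(pid, fraction):
    """Throttle one IDS process in the management VM with the cpulimit tool."""
    percent = max(int(fraction * 100), 1)
    return subprocess.Popen(["cpulimit", "-p", str(pid), "-l", str(percent)])
```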
17
Multicore Support pCPU assignment needs care for high resource utilization The surplus CPU time of pCPUs may not be used pCPUs need to be shared between VMs and the management VM The CPU time of pCPUs can be used up Resource cages can be applied not only to a single core but also to multicore. However, for high resource utilization, we need to take care with the assignment of physical CPUs to VMs. Let's consider that physical CPU1 is assigned to the management VM and physical CPU2 is assigned to both VM1 and VM2. Assume that the CPU limit of the resource cage containing VM1 and IDS1 is 100%. When IDS1 is idle, the surplus CPU time of physical CPU1 cannot be used by VM1. As a result, the resource cage can receive only 50%. To improve resource utilization, physical CPUs need to be shared between VMs and the management VM. In this assignment, even when IDS1 is idle, the CPU time of physical CPU1 can be used up by VM1.
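With Xen's xl toolstack, the difference between the two assignments comes down to vCPU pinning. The sketch below is illustrative, assuming two pCPUs and domains named Domain-0 (the management VM), VM1, and VM2.

```python
import subprocess

def pin(domain, pcpus):
    """Pin all vCPUs of a domain to a set of pCPUs."""
    subprocess.run(["xl", "vcpu-pin", domain, "all", pcpus], check=True)

# Partitioned assignment (problematic): the management VM owns pCPU0 and the
# guests share pCPU1, so CPU time left idle by the IDSes cannot go to VM1.
# pin("Domain-0", "0"); pin("VM1", "1"); pin("VM2", "1")

# Shared assignment: every domain may run on both pCPUs, so surplus CPU time
# flows to whichever member of a resource cage needs it.
for dom in ("Domain-0", "VM1", "VM2"):
    pin(dom, "0-1")
```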
18
Memory Scheduler: Monitoring
Monitor both the process memory and page cache consumed by IDSes The page cache is created by file access Create memory cgroups in the management VM Obtain the memory usage of IDS processes Including the page cache Our memory scheduler monitors both the process memory and the page cache consumed by IDSes. Process memory is the physical memory consumed by a process. The page cache is created in the kernel when a process accesses files. The page cache is not memory belonging to a process, but it should also be considered. To monitor memory usage, the memory scheduler creates Linux memory cgroups in the management VM. A memory cgroup consists of offloaded IDS processes, and the memory usage including the page cache is accounted for in the cgroup. The memory scheduler can obtain the total memory size from the cgroup.
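With the cgroup v1 memory controller (assumed mounted at /sys/fs/cgroup/memory; the cgroup name here is hypothetical), this monitoring step can be sketched as follows. memory.usage_in_bytes already includes the page cache charged to the group.

```python
import os

CG = "/sys/fs/cgroup/memory/rc1_ids"   # hypothetical cgroup for the offloaded IDSes

def setup_ids_memcg(ids_pids):
    """Put the offloaded IDS processes into one memory cgroup so that their
    process memory and the page cache they create are accounted together."""
    os.makedirs(CG, exist_ok=True)
    for pid in ids_pids:
        with open(os.path.join(CG, "cgroup.procs"), "w") as f:
            f.write(str(pid))

def ids_memory_usage():
    """Total memory charged to the IDS cgroup, page cache included."""
    with open(os.path.join(CG, "memory.usage_in_bytes")) as f:
        return int(f.read())
```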
19
Memory Scheduler: Scheduling
Calculate a new memory size of a VM from the memory consumed by IDSes Change memory allocation to the VM Using memory ballooning [Waldspurger'02] Limit the memory usage of IDS processes using memory cgroups The memory scheduler periodically calculates a new memory size of a VM from the memory consumed by offloaded IDSes. According to the calculated size, it changes the memory allocation to the VM. First, the hypervisor sends a request to the memory balloon driver in a VM. To decrease the memory size, the driver allocates a requested amount of memory and returns the allocated memory to the hypervisor. The memory scheduler can limit the memory usage of IDS processes using memory cgroups. The maximum size of the process memory and page cache allocated for IDSes is limited.
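One scheduling step might look like the sketch below, reusing the IDS cgroup from the monitoring step: the VM is ballooned so that VM memory plus IDS memory stays within the cage's limit, and the IDS cgroup itself is capped. The cage limit, paths, and names are illustrative assumptions.

```python
import os
import subprocess

CAGE_LIMIT_MB = 4096                       # e.g., a 4 GB memory limit for the cage
IDS_CG = "/sys/fs/cgroup/memory/rc1_ids"   # IDS cgroup from the monitoring step

def rebalance_memory(vm_name, ids_usage_bytes):
    """Shrink or grow the VM by ballooning so the cage stays within its limit,
    and bound the IDS processes' memory (process memory + page cache) as well."""
    ids_mb = ids_usage_bytes // (1024 * 1024)
    vm_mb = max(CAGE_LIMIT_MB - ids_mb, 0)
    # Ask the balloon driver in the VM to return memory to the hypervisor.
    subprocess.run(["xl", "mem-set", vm_name, f"{vm_mb}m"], check=True)
    # Cap the IDS cgroup so the IDSes alone cannot exceed the cage's limit.
    with open(os.path.join(IDS_CG, "memory.limit_in_bytes"), "w") as f:
        f.write(str(CAGE_LIMIT_MB * 1024 * 1024))
```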
20
Resource Cages in KVM Implementation is straightforward in KVM
A VM is managed as a process Create three cgroups for a VM, IDS processes, and these two cgroups Naturally schedule them with CPU limit, CPU shares, and memory limit The implementation of resource cages is straightforward in KVM. KVM has an architecture different from Xen. Xen runs VMs on top of the hypervisor, while KVM runs the hypervisor as a Linux kernel module and runs VMs on top of the operating system. Since a VM is managed as a process, we could easily implement resource cages using Linux cgroups. System administrators create three cgroups for the process of a VM, IDS processes, and a group of these two cgroups. Using the three cgroups, the operating system can naturally schedule resource cages with the CPU limit, CPU shares, and memory limit.
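A minimal sketch of such a layout with cgroup v1 (cpu and memory controllers mounted under /sys/fs/cgroup, hierarchical memory accounting assumed; names and values are illustrative): the parent cgroup bounds the cage as a whole, and the VM and IDS children can additionally be weighted against each other.

```python
import os

def write(path, value):
    with open(path, "w") as f:
        f.write(str(value))

def make_kvm_cage(vm_pid, ids_pids, cpu_limit_pct=50, mem_limit_bytes=256 << 20):
    """Build a resource cage for KVM out of nested Linux cgroups."""
    for ctrl in ("cpu", "memory"):
        for cg in ("rc1", "rc1/vm", "rc1/ids"):
            os.makedirs(f"/sys/fs/cgroup/{ctrl}/{cg}", exist_ok=True)

    # CPU limit for the whole cage, then a 1:1 split between VM and IDSes.
    write("/sys/fs/cgroup/cpu/rc1/cpu.cfs_period_us", 100000)
    write("/sys/fs/cgroup/cpu/rc1/cpu.cfs_quota_us", 100000 * cpu_limit_pct // 100)
    write("/sys/fs/cgroup/cpu/rc1/vm/cpu.shares", 1024)
    write("/sys/fs/cgroup/cpu/rc1/ids/cpu.shares", 1024)

    # Memory limit for the cage as a whole (process memory + page cache).
    write("/sys/fs/cgroup/memory/rc1/memory.limit_in_bytes", mem_limit_bytes)

    # Place the QEMU/KVM process and the IDS processes into the leaf cgroups.
    write("/sys/fs/cgroup/cpu/rc1/vm/cgroup.procs", vm_pid)
    write("/sys/fs/cgroup/memory/rc1/vm/cgroup.procs", vm_pid)
    for pid in ids_pids:
        write("/sys/fs/cgroup/cpu/rc1/ids/cgroup.procs", pid)
        write("/sys/fs/cgroup/memory/rc1/ids/cgroup.procs", pid)
```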
21
Experiments We conducted experiments to confirm performance isolation in IDS offloading Three resource cages RCVM for a VM RCIDS for its offloaded IDS RC for grouping RCVM and RCIDS We conducted experiments to confirm performance isolation in IDS offloading. In the experiments, we created three resource cages: RC_vm for a VM, RC_ids for its offloaded IDS, and RC for grouping them. We used a different experimental setup for each experiment. For details, see the paper.
22
CPU Limit: ClamAV We measured the CPU utilization of a VM and offloaded ClamAV ClamAV detects viruses in VM's disk Goal: keep the total CPU utilization at 60% First, we offloaded ClamAV, which is a host-based IDS for detecting viruses in VM's disk. We ran a CPU-intensive task in the VM. Our goal was to keep the total CPU utilization at 60%. Initially, we set the CPU limit of the VM to 60% without resource cages. When ClamAV started running, it consumed 30% of the CPU time. As a result, the total CPU utilization exceeded 60%. Next, we set the CPU limit of the resource cage RC to 60%. When ClamAV started running, the CPU time assigned to RC_vm was decreased and the CPU utilization of RC was always kept at 60% successfully.
23
CPU Limit: Snort We measured the CPU utilization of a VM and offloaded Snort The CPU utilization of Snort depends on VM's workload Goal: keep the total CPU utilization at 50% Second, we offloaded Snort, which is a network-based IDS for checking network packets. Unlike ClamAV, the CPU utilization of Snort depends on VM's workload. We ran the Apache web server in the VM and sent requests to it. Our goal was to keep the total CPU utilization at 50%. When we set the CPU limit of RC to 50% and that of RC_ids to 20%, the CPU utilization of RC didn't exceed the CPU limit. At this time, we measured the throughput of the web server in the VM. Using resource cages, the throughput was almost the same as when Snort wasn't offloaded, as expected.
24
Memory Limit: MemBench
We measured the memory usage of a VM and offloaded MemBench MemBench allocates/deallocates memory Goal: keep the total memory usage at 512 MB For memory limit, we offloaded MemBench, which repeatedly allocated and deallocated memory. Our goal was to keep the total memory usage at 512 MB. Without resource cages, when MemBench allocated memory, the total memory size exceeded 512 MB. When we limited the memory size of RC, the memory size of the VM was adjusted and the total memory size was kept at 512 MB.
25
Memory Limit: Tripwire
We measured the memory usage of a VM and offloaded Tripwire Tripwire uses a large amount of page cache Goal: keep the total memory usage at 512 MB Next, we offloaded Tripwire, which is a host-based IDS for checking the integrity of VM's disk. Our goal was the same as in the previous experiment. Without resource cages, Tripwire used a large amount of page cache. The page cache consumed by Tripwire became more than 3.5 GB. In total, the memory size of the VM and Tripwire largely exceeded 512 MB. When we limited the memory size appropriately, the total memory size was kept at 512 MB.
26
Resource Control in KVM
We measured the resource usage of a VM and offloaded Tripwire Goal 1: keep RCVM : RCIDS at 1 : 1 (CPU shares) Goal 2: keep the total memory usage at 256 MB We confirmed that resource cages also worked well in KVM. First, we offloaded Tripwire and ran a CPU-intensive task in the VM. Unlike the previous experiments, our goal was to keep the CPU utilization of RC_vm and RC_ids at a 1:1 ratio. While Tripwire didn't run, RC_vm could receive 100% of the CPU time. When Tripwire started running, RC_vm and RC_ids each received 50%. Next, we ran MemBench in the VM and offloaded Tripwire. Our goal was to keep the total memory size at 256 MB. Unlike in Xen, the real memory allocation to the VM was very small at first in KVM. Even after Tripwire started running, the memory size of RC remained below 256 MB.
27
Related Work SEDF-DC [Gupta+'06]
Enforce performance isolation between VMs considering I/O processing No new abstraction and specific to network I/O Resource pools [VMware] Enable performance isolation of a group of VMs VMCoupler [Kourai+'13] Enable offloaded IDSes to run in a dedicated VM Resource pools would be useful Increase resource consumption by more VMs SEDF-DC is a VM scheduler for enforcing performance isolation between VMs considering I/O processing. This is similar to resource cages, but SEDF-DC provides no new abstraction. In addition, it is specific to network I/O processing. Resource pools in VMware vSphere enable performance isolation of a group of VMs. However, a VM is the minimum unit. In combination with VMCoupler, resource pools would be useful for performance isolation considering IDS offloading. VMCoupler enables offloaded IDSes to run in a dedicated VM called a guard VM. A resource pool can group a monitored VM and a guard VM and control their resource usage. However, VMCoupler increases resource consumption because it requires more VMs.
28
Conclusion We proposed resource cages for resource management considering IDS offloading A new abstraction of the hypervisor Manage a VM and offloaded IDS processes as a group Achieve performance isolation and high resource utilization Future work Implement CPU shares between a VM and IDS processes Control the usage of storage/network in resource cages In conclusion, we proposed resource cages for resource management considering IDS offloading. A resource cage is a new abstraction of the hypervisor. It manages a VM and offloaded IDS processes as a group and achieves both performance isolation and high resource utilization. We have implemented resource cages in Xen and KVM and showed their effectiveness. Our future work is to implement CPU shares between a VM and IDS processes inside the same resource cage in Xen. When VMs and processes are managed independently by the hypervisor and the operating system, it is difficult to control their relative resource usage. In addition, we are planning to control the usage of storage and network in resource cages.