Efficient VM Introspection in KVM and Performance Comparison with Xen


1 Efficient VM Introspection in KVM and Performance Comparison with Xen
I'm Kenichi Kourai from Kyushu Institute of Technology. In this talk, I'm going to talk about efficient virtual machine introspection in KVM, one of the most widely used pieces of virtualization software, and a performance comparison with Xen, another widely used one. This is joint work with my student.

Kenichi Kourai and Kousuke Nakamura, Kyushu Institute of Technology

2 Intrusion Detection System (IDS)
- IDSes detect attacks against servers
- Monitor the systems and networks of servers
- Alert administrators
- Recently, attackers attempt to disable IDSes before they are detected
- This is easy because IDSes are running in the servers themselves

Attacks against servers are increasing. As one method for detecting such attacks, intrusion detection systems, or IDSes, are used. IDSes monitor the systems and networks of servers and alert administrators if they detect symptoms of attacks. Recently, attackers first attempt to disable or tamper with IDSes after they intrude into servers and before they are detected by the IDSes. This is easy because IDSes often run in the same servers that the attackers compromise.

3 IDS Offloading
- Offloading IDSes using virtual machines (VMs)
- Run a server in a VM and execute IDSes outside the VM
- Prevent IDSes from being compromised
- Can be provided as a cloud service: cloud providers can protect users' VMs

To counteract such attacks against IDSes, a technique called IDS offloading has been proposed. IDS offloading runs a server in a VM and executes IDSes outside the VM. Unlike traditional in-VM monitoring, where IDSes run inside a VM, this technique can prevent IDSes from being compromised even if attackers intrude into the VM. As such, the security of IDSes is increased. IDS offloading is also promising in cloud computing environments because it can be provided as a cloud service. Even if cloud users don't install IDSes in their VMs, cloud providers can protect such VMs from outside attacks by using IDS offloading.

4 VM Introspection (VMI)
- A technique for monitoring VMs from the outside
- Memory introspection: obtain raw memory contents and extract OS data
- Disk introspection: obtain raw disk data and interpret the filesystem
- Network introspection: obtain packets only from/to VMs

The enabling technology of IDS offloading is VM introspection, or VMI. VMI is a technique for monitoring VMs from the outside, which is not so easy. To introspect the system state in a VM, VMI first obtains raw memory contents from the VM and then extracts OS data from them. For disk introspection, VMI obtains raw disk data from a VM, interprets the filesystem in use, and extracts files and directories. To introspect the network used by a VM, VMI obtains packets only from and to the VM.

5 Performance of VMI
- Performance has not been reported in detail; no performance comparison
- E.g., VMwatcher [Jiang+ CCS'07]: implemented in Xen, QEMU, VMware, and UML, but reported only for UML
- E.g., EXTERIOR [Fu+ VEE'13]: implemented in KVM and QEMU; no difference due to using a memory dump
- Performance data is important for users' selection of virtualization software

VMI has been well studied for various kinds of virtualization software, for example, Xen and VMware, but its performance has not been reported in detail. In particular, there is no performance comparison across virtualization software. For example, VMwatcher is a system using VMI and is implemented in four kinds of virtualization software: Xen, QEMU, VMware, and User-Mode Linux, but its performance is reported only for User-Mode Linux. EXTERIOR is implemented in two kinds of virtualization software, KVM and QEMU, but these are very similar, and there is no substantial difference in EXTERIOR because the system first dumps the memory of a VM and then introspects the dump. Performance data is important when users select appropriate virtualization software for their systems.

6 The Purpose of This Work
- Performance comparison among virtualization software in terms of VMI
- Target: Xen and KVM, widely used open source virtualization software
- Their system architecture is different: in Xen, VMs run on top of the hypervisor; in KVM, they run on top of the host OS

So the purpose of this work is a performance comparison among virtualization software in terms of VMI. Our targets are Xen and KVM because they are widely used open source virtualization software. In addition, it is interesting that the system architecture differs between them: in Xen, VMs run on top of the hypervisor, while in KVM, they run on top of the operating system.

7 Implementation for KVM
- No efficient implementation of VMI for KVM
- Several studies have been done for KVM, but the implementation details are unclear
- LibVMI [Payne+ '11] supports VMI for both Xen and KVM, but the performance of memory introspection is too low in KVM; it is optimized for Xen

When we started this work, we noticed there was no efficient implementation of VMI for KVM, particularly for memory introspection. Several studies on VMI have been done for KVM, but the implementation details are unclear in the literature. Among existing tools, LibVMI was promising because it is a well-known open source implementation of VMI for both Xen and KVM. Unfortunately, the performance of its memory introspection is too low in KVM due to implementation issues: LibVMI is optimized for Xen. To compare the performance between Xen and KVM, we couldn't use LibVMI because the comparison would not have been fair to KVM.

8 KVMonitor
- We have developed an efficient VMI tool for KVM
- Execute an IDS as a process of the host OS
- Provide functions for introspecting memory, disks, and NICs in QEMU

So we have first developed an efficient VMI tool for KVM, called KVMonitor. Before explaining KVMonitor, let me explain KVM. KVM consists of a kernel module and user-level QEMU processes customized for KVM. QEMU is a system emulator and provides virtual resources such as CPUs, memory, disks, and network interface cards to a VM. The kernel module assists virtualization using hardware support. KVMonitor executes an IDS as a process of the host OS and provides functions for introspecting the memory, disks, and network interface cards in QEMU.

9 Memory Introspection (1/2)
- Difficult to efficiently introspect QEMU's memory; LibVMI obtains memory contents from QEMU
- KVMonitor shares the VM's physical memory with QEMU via a memory file
- Both access it as a memory-mapped file, enabling direct memory introspection

In the original implementation of QEMU, the VM's physical memory was allocated internally, so it was difficult to efficiently introspect the memory from the outside. LibVMI obtains memory contents from QEMU, and this is the cause of its inefficiency. To enable efficient memory introspection, KVMonitor shares the VM's physical memory with QEMU via a file called a memory file. Of course, KVMonitor and QEMU don't access the memory file using file read/write functions. They map the memory file onto their address spaces using the memory-mapping function provided by the host OS and access it as a memory-mapped file. As such, KVMonitor can perform direct memory introspection, as in the sketch below.
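A minimal sketch of this direct introspection in C, assuming the memory file lives at a hypothetical path /tmp/vm-memory shared with QEMU (the talk does not specify the actual path or file layout):

```c
/* Sketch of KVMonitor-style direct memory introspection: map the
 * memory file once, then read guest physical memory as plain loads. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/vm-memory", O_RDONLY);   /* hypothetical path */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);            /* file size = guest physical memory size */

    /* Map the whole memory file once; afterwards every guest physical
     * address is just a pointer offset, with no QMP round trip. */
    uint8_t *guest = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (guest == MAP_FAILED) { perror("mmap"); return 1; }

    uint64_t gpa = 0x100000;   /* example guest physical address */
    printf("byte at GPA 0x%lx = 0x%02x\n", (unsigned long)gpa, guest[gpa]);

    munmap(guest, st.st_size);
    close(fd);
    return 0;
}
```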

10 Memory Introspection (2/2)
- IDSes usually access OS data using virtual addresses
- KVMonitor translates virtual addresses into physical addresses
- Look up the page table for address translation
- Introspect the CR3 register using QMP

A memory file contains the contents of the VM's physical memory, but IDSes usually access OS data in a VM using virtual addresses, not physical addresses. So KVMonitor provides a function for translating virtual addresses into physical addresses. For this translation, KVMonitor walks the page table located in the VM's physical memory. To find the page table, KVMonitor needs its physical address, which is stored in the CR3 register of the virtual CPU used by the VM. So KVMonitor introspects the CR3 register using QMP, a protocol for communicating with QEMU.
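A sketch of such a translation, assuming x86-64 4-level paging with 4 KB pages and the mmap'ed `guest` pointer from the previous sketch (large pages and other paging modes are ignored for brevity; the talk does not give KVMonitor's actual implementation):

```c
/* Walk the guest page table to translate a virtual address. */
#include <stdint.h>

#define PTE_PRESENT 0x1ULL
#define ADDR_MASK   0x000ffffffffff000ULL  /* bits 51:12 of an entry */

static uint64_t read_qword(const uint8_t *guest, uint64_t gpa)
{
    return *(const uint64_t *)(guest + gpa);
}

/* cr3 comes from the VM's CR3 register, obtained via QMP.
 * Returns 0 if the virtual address is not mapped. */
uint64_t virt_to_phys(const uint8_t *guest, uint64_t cr3, uint64_t va)
{
    uint64_t table = cr3 & ADDR_MASK;
    /* PML4 -> PDPT -> PD -> PT; each level consumes 9 bits of VA. */
    for (int shift = 39; shift >= 12; shift -= 9) {
        uint64_t index = (va >> shift) & 0x1ff;
        uint64_t entry = read_qword(guest, table + index * 8);
        if (!(entry & PTE_PRESENT))
            return 0;
        table = entry & ADDR_MASK;
    }
    return table | (va & 0xfff);   /* page frame + page offset */
}
```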

11 Disk/Network Introspection
- KVMonitor introspects the VM's disks via the network block device (NBD)
- Interpret the qcow2 format in the NBD server; interpret the filesystem in the host OS
- KVMonitor captures packets from a tap device

For disk and network introspection, KVMonitor uses well-known techniques. In KVM, a VM's disk is backed by a disk image file on the host OS. KVMonitor introspects the image file via the network block device, or NBD. When KVMonitor accesses that device, the NBD server reads a disk block from the disk image file and returns it. Since KVM uses a copy-on-write disk format named qcow2, KVMonitor interprets it using the functionality of the NBD server. Then KVMonitor interprets the filesystem on the VM's disk using the functionality of the host OS and accesses files and directories. For network introspection, KVMonitor captures packets from a tap network device, which is created by QEMU for the network communication of a VM.
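A sketch of the network-introspection side using libpcap; the device name "tap0" is an assumption that depends on the host configuration (build with `gcc capture.c -lpcap`):

```c
/* Capture packets from the tap device that QEMU creates for the VM. */
#include <pcap.h>
#include <stdio.h>

static void on_packet(u_char *user, const struct pcap_pkthdr *hdr,
                      const u_char *bytes)
{
    (void)user; (void)bytes;
    printf("captured packet: %u bytes\n", hdr->len);
    /* An offloaded IDS such as Snort would inspect `bytes` here. */
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p = pcap_open_live("tap0", 65535, 1 /* promiscuous */,
                               1000 /* read timeout, ms */, errbuf);
    if (!p) { fprintf(stderr, "pcap_open_live: %s\n", errbuf); return 1; }

    pcap_loop(p, -1 /* forever */, on_packet, NULL);
    pcap_close(p);
    return 0;
}
```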

12 Transcall with KVMonitor
- We have ported Transcall [Iida+ '11] for Xen to KVM
- Enable offloading legacy IDSes without any modifications
- Consist of a system call emulator and a shadow filesystem, including the proc filesystem
- Analyze OS data by memory introspection

Using KVMonitor, we have ported Transcall, which we developed for Xen, to KVM. Transcall provides an execution environment for IDSes to transparently introspect a VM. In other words, it enables offloading legacy IDSes without any modifications. Transcall consists of a system call emulator and a shadow filesystem. The system call emulator traps system calls from IDSes and returns system information about the VM where necessary. The shadow filesystem provides exactly the same view as the filesystem in the VM, including the proc filesystem, which contains system information such as running processes and network state. For this purpose, Transcall analyzes OS data using memory introspection.
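The talk does not say how Transcall's system call emulator traps calls; one common way to do this on Linux is ptrace, sketched below (Linux/x86-64 assumed; a real emulator would rewrite the results of calls that must return the guest VM's state instead of just printing):

```c
/* Trap every system call of a child process with ptrace. */
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execlp("ls", "ls", NULL);       /* stand-in for a legacy IDS */
        return 1;
    }

    int status;
    waitpid(child, &status, 0);         /* stop at exec */
    while (!WIFEXITED(status)) {
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);  /* run to next stop */
        waitpid(child, &status, 0);
        if (WIFSTOPPED(status)) {
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            /* Reported on both entry and exit of each call. */
            printf("syscall %llu\n", (unsigned long long)regs.orig_rax);
        }
    }
    return 0;
}
```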

13 Experiments
- We examined whether KVMonitor achieved efficient memory introspection
- No impact on the memory performance of a VM
- Effective IDS offloading

To show the effectiveness of KVMonitor, we first examined whether KVMonitor achieved more efficient memory introspection than LibVMI. Next, we examined whether the modification to QEMU affected the memory performance of a VM. Finally, we examined whether KVMonitor enabled effective IDS offloading.

Experimental setup:
- PC: Intel Xeon E5630 CPU (12 MB L3 cache), 6 GB DDR3 PC3-8500 memory, 250 GB SATA HDD, gigabit Ethernet NIC, KVM 1.1.2, host OS Linux 3.2.0
- VM: 1 vCPU, 512 MB memory, 20 GB disk (ext3), guest OS Linux

14 KVMonitor vs. LibVMI
- We measured the performance of memory introspection
- Copy the VM's physical memory 4 KB at a time
- KVMonitor was 32x faster than LibVMI

First, we measured the performance of memory introspection using our KVMonitor and the existing LibVMI. We copied the VM's physical memory 4 KB at a time and measured the read throughput. The throughput of KVMonitor was 9.6 GB/s, whereas that of LibVMI was only 0.3 GB/s. For memory introspection, KVMonitor was 32 times faster than LibVMI.
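A sketch of this benchmark on the KVMonitor side, reusing the mmap'ed `guest` pointer from the earlier sketch; MEM_SIZE matches the 512 MB VM in the experiments, and the timing method is our assumption:

```c
/* Copy the VM's physical memory 4 KB at a time and report throughput. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MEM_SIZE (512ULL << 20)   /* 512 MB guest memory */
#define CHUNK    4096

void benchmark(const uint8_t *guest)
{
    static uint8_t buf[CHUNK];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (uint64_t off = 0; off < MEM_SIZE; off += CHUNK)
        memcpy(buf, guest + off, CHUNK);   /* 4 KB per read */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("throughput: %.2f GB/s\n", MEM_SIZE / sec / 1e9);
}
```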

15 Why is LibVMI so slow?
- LibVMI has to issue a QMP command for each memory access
- Memory contents are transferred from QEMU to LibVMI

Why is LibVMI so slow? This is because LibVMI has to issue a QMP command to QEMU for each memory access. When LibVMI sends a QMP request to QEMU, the memory contents are transferred back to LibVMI. In contrast, KVMonitor can read the VM's physical memory directly through the memory-mapped file.

16 In-VM Memory Performance
- Doesn't using a memory file affect the memory performance of a VM?
- Using a memory file was as efficient as malloc

Here, there is one question: doesn't using a memory file as the VM's physical memory affect the memory performance of the VM? Unlike the traditional QEMU, our QEMU uses a memory-mapped file, rather than memory internally allocated by malloc, as the VM's physical memory, so memory accesses may cause disk accesses. We measured the read and write throughput inside a VM. As a result, using a memory file was as efficient as using malloc. In fact, the performance was slightly better, though the reason is unclear.

17 KVMonitor vs. In-VM Access
- KVMonitor was faster than in-VM memory access
- Due to virtualization overhead

Surprisingly, KVMonitor was faster than in-VM memory access. The read throughput of VMI was 9.6 GB/s, while that of in-VM access was 8.6 GB/s. This is due to the virtualization overhead of a VM: the benchmark program inside the VM suffered from this overhead, whereas the one outside the VM didn't.

18 Offloading Legacy IDSes (1/3)
- Tripwire: check filesystem integrity in disks
- We added, deleted, and modified files
- Offloaded Tripwire detected the changed files

We offloaded three legacy IDSes using the ported Transcall and KVMonitor. Tripwire is an IDS that checks filesystem integrity in disks. It scans the entire disk of a VM and stores file information in a database outside the VM in advance. Then it periodically scans the disk and compares the result with the database. If files have changed, Tripwire reports them as violations. In this experiment, we added, deleted, and modified three files inside a VM. As a result, offloaded Tripwire detected all the changed files, as in this excerpt of its report:

Rule Name: Monitor Filesystems (Added / Removed / Modified)
Total objects scanned: 67082
Total violations found: 3

19 Offloading Legacy IDSes (2/3)
- Snort: inspect network packets
- We performed portscans from another host
- Offloaded Snort detected the portscans

Snort is an IDS that inspects network packets. Snort captures all the packets to and from a VM and detects attacks on the basis of rule sets. In this experiment, we performed portscans from another host. As a result, offloaded Snort detected the attack; this is the alert log at that time:

[**] [1:1421:11] SNMP AgentX/tcp request [**]
[Classification: Attempted Information Leak] ...
01/28-10:47: : > :705

20 Offloading Legacy IDSes (3/3)
- Chkrootkit: detect rootkits using ps, netstat, and file inspection
- We tampered with ps and netstat in a VM
- Offloaded chkrootkit detected the tampered commands

ROOTDIR is `/'
Checking `ps'... INFECTED
Checking `netstat'... INFECTED
:

Chkrootkit is an IDS that detects rootkits, which are malicious software installed in compromised systems. Chkrootkit is a shell script and uses ps and netstat as external commands. Offloaded chkrootkit executes these commands securely outside the VM, and the commands obtain system information about the VM by memory introspection. In addition, chkrootkit inspects files to find infection. In this experiment, we tampered with ps and netstat inside a VM. As a result, offloaded chkrootkit detected both tampered commands, as shown above.

21 Cross-view Diff (1/2)
- A technique for detecting hidden malware
- Compare the results of VMI and in-VM monitoring
- A difference means the existence of hidden malware

We conducted a cross-view diff using KVMonitor. Cross-view diff is a technique for detecting hidden malware. Malware often hides its existence from IDSes to avoid detection; for example, it may remove malicious processes from the process list. Cross-view diff compares the results of VMI with those of monitoring inside the VM. If there is any difference, it indicates the existence of hidden malware. Suppose an IDS inside a VM reports "A B D" while an offloaded IDS reports "A B C D". The cross-view diff engine compares these results and concludes that C is hidden. A minimal sketch of such an engine follows.
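This sketch reduces the two process lists to arrays of PIDs; any entry visible to the trusted VMI view but missing from the in-VM view is reported as hidden:

```c
/* Minimal cross-view diff engine over two process lists. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static bool contains(const int *list, size_t n, int pid)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == pid)
            return true;
    return false;
}

void cross_view_diff(const int *vmi_view, size_t vmi_n,
                     const int *invm_view, size_t invm_n)
{
    for (size_t i = 0; i < vmi_n; i++)
        if (!contains(invm_view, invm_n, vmi_view[i]))
            printf("PID %d is hidden inside the VM\n", vmi_view[i]);
}

int main(void)
{
    int vmi[]  = {1, 2, 3, 4};  /* offloaded IDS sees A B C D */
    int invm[] = {1, 2, 4};     /* in-VM IDS sees A B D: 3 is hidden */
    cross_view_diff(vmi, 4, invm, 3);
    return 0;
}
```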

22 Cross-view Diff (2/2)
- We tampered with ps in a VM: a hidden process was detected as malicious
- We tampered with netstat in a VM: a hidden port was detected as a backdoor

In this experiment, we tampered with the ps command in a VM so that it hid the init process. The offloaded, untampered ps reported:

PID TTY      TIME     CMD
1   ?        00:00:00 init
2   ?        00:00:00 kthreadd
:

The tampered in-VM ps reported:

PID TTY      TIME     CMD
2   ?        00:00:00 kthreadd
:

By comparing these, we could detect that the init process was hidden. Similarly, we tampered with the netstat command in the VM so that it hid the port used by VNC. The offloaded netstat reported:

Proto ... Local Address ...
tcp       :5900
tcp       :22
:

The tampered in-VM netstat reported:

Proto ... Local Address ...
tcp       :22
:

From these results, we could detect that the VNC port was open as a backdoor.

23 KVMonitor vs. Xen
- We compared the performance of VMI between KVM and Xen using a VMI tool for Xen
- Memory: standard library (libxenctrl)
- Disk: loopback mount
- Network: tap device

Next, we compared the performance of VMI between KVM and Xen. In Xen, offloaded IDSes run in the privileged VM called Domain 0. For memory introspection, we used the standard library for Xen, named libxenctrl. For disk introspection, we simply used a loopback mount because Xen's disk used a directly mountable raw format. For network introspection, we captured packets from a tap device, as with KVMonitor.

Xen setup: hypervisor Xen 4.1.3, Dom0 OS Linux 3.2.0, fully virtualized VM.

24 Memory Introspection
- We measured the read throughput: copy the VM's physical memory 4 KB at a time
- KVMonitor was 48x faster than Xen

To examine the performance of memory introspection, we copied the VM's physical memory 4 KB at a time and measured the read throughput. KVMonitor was 48 times faster than Xen: KVMonitor achieved 9.6 GB/s, whereas Xen achieved only 0.2 GB/s.

25 Why is Xen so slow?
- Xen has to map each memory page; it cannot map all the pages in advance
- Mapping takes time proportional to the number of pages
- KVMonitor can read a pre-mapped file

Why is Xen so slow? Xen has to map each memory page when it accesses the memory of a VM; it cannot map all the pages in advance. According to our experiments, multiple pages can be mapped at once to some degree, but the mapping still takes time proportional to the number of pages. In contrast, KVMonitor maps the memory file once, with negligible cost, and then reads the pre-mapped file without any mapping overhead. The sketch below illustrates the per-page mapping pattern.
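A sketch of the Xen-side access pattern using libxenctrl's xc_map_foreign_range, assuming the xc_interface API of the Xen 4.1 era used in the experiments; error handling is minimal:

```c
/* Per-page map/read/unmap of guest memory: the overhead that the
 * KVMonitor memory file avoids. */
#include <string.h>
#include <sys/mman.h>
#include <xenctrl.h>

#define PAGE_SIZE 4096

/* Copy `npages` pages of guest memory starting at page frame `pfn0`. */
void read_guest_pages(xc_interface *xch, uint32_t domid,
                      unsigned long pfn0, unsigned long npages,
                      unsigned char *out)
{
    for (unsigned long i = 0; i < npages; i++) {
        /* One hypercall-backed mapping per page. */
        void *page = xc_map_foreign_range(xch, domid, PAGE_SIZE,
                                          PROT_READ, pfn0 + i);
        if (!page)
            continue;   /* page not mapped in the guest */
        memcpy(out + i * PAGE_SIZE, page, PAGE_SIZE);
        munmap(page, PAGE_SIZE);
    }
}
```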

26 Kernel Integrity Checking
- We measured the execution time of a kernel integrity checker
- Read the code area; translate virtual to physical addresses
- KVMonitor was 118x faster than Xen

Next, we measured the execution time of a kernel integrity checker. The checker reads the kernel code area to examine whether the OS kernel has been tampered with. Unlike the previous memory benchmark, this checker translates the virtual addresses of the kernel into physical addresses. In Xen, the check took 224 ms; in KVM, it took only 1.9 ms. So KVMonitor was 118 times faster than Xen.
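A sketch of the shape of such a checker, reusing virt_to_phys() from the earlier translation sketch; the toy rolling checksum, the start address, and the size of the code area are our illustrative assumptions (real values would come from the guest's System.map, and a real checker would use a cryptographic hash):

```c
/* Read the guest kernel's code area page by page, translating each
 * virtual address, and fold the bytes into a checksum to compare
 * against a value recorded in a known-clean state. */
#include <stdint.h>

#define PAGE_SIZE 4096

/* From the earlier address-translation sketch. */
uint64_t virt_to_phys(const uint8_t *guest, uint64_t cr3, uint64_t va);

uint64_t checksum_kernel_text(const uint8_t *guest, uint64_t cr3,
                              uint64_t text_start, uint64_t text_size)
{
    uint64_t sum = 0;
    for (uint64_t va = text_start; va < text_start + text_size;
         va += PAGE_SIZE) {
        uint64_t pa = virt_to_phys(guest, cr3, va);
        if (!pa)
            continue;                        /* unmapped page */
        for (unsigned i = 0; i < PAGE_SIZE; i++)
            sum = sum * 31 + guest[pa + i];  /* toy rolling checksum */
    }
    return sum;
}
```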

27 Why is the speedup so much larger?
- The speedup for the real IDS was much larger: 48x (simple benchmark) vs. 118x (kernel checker)
- Due to address translation: in Xen, the access cost of the page table is high
- Only 8 bytes are read after each memory mapping

Surprisingly, the speedup for the real IDS was much larger than for the previous simple benchmark: 48 times for the simple benchmark versus 118 times for the kernel checker. Why? This is due to address translation. In Xen, the access cost of the page table is high: for each memory page used as a page table, only 8 bytes of a page table entry are read after mapping it. This means the relative cost of memory mapping is much higher than in the simple benchmark, which reads 4 KB per mapped page.

28 Disk Introspection
- We measured the execution time of Tripwire for two disk formats: raw and qcow2
- KVMonitor was comparable to Xen
- The difference between formats was larger: raw was faster than qcow2

To compare the performance of disk introspection, we measured the execution time of Tripwire for two disk formats: raw and qcow2. The raw format is the default in Xen, and the qcow2 format is the default in KVM, but both formats can be used with both virtualization software. From the result, KVMonitor was comparable to Xen. In fact, the difference between the disk formats was larger than that between the virtualization software: the raw format was faster than qcow2 because qcow2 is more complex and needs NBD for introspection.

29 Network Introspection
- We measured the packet loss rate in Snort while sending packets as fast as possible
- KVMonitor was more lightweight than Xen
- Dom0 suffered from virtualization overhead

For network introspection, we measured the packet loss rate in Snort while sending packets to a VM as fast as possible. According to the result, KVMonitor was more lightweight than Xen. This is probably because, in Xen, Snort was offloaded to Domain 0, which itself suffered from virtualization overhead.

30 Chkrootkit
- We measured the execution time of chkrootkit
- KVMonitor was 1.6x faster than Xen: efficient memory introspection and no virtualization overhead
- KVMonitor was 2x slower than in-VM execution, due to system call traps

Finally, we measured the execution time of chkrootkit, which needs memory introspection as well as disk introspection. We used the qcow2 format in KVM and the raw format in Xen because these are the defaults, even though the previous experiment showed that qcow2 performs worse than raw. Nevertheless, the execution time was 55 seconds in Xen but only 35 seconds in KVM; KVMonitor was 1.6 times faster. One reason is efficient memory introspection; another is that IDS offloading in KVM suffers from no virtualization overhead. Compared with execution inside a VM, however, even KVMonitor was still 2 times slower. This is due to Transcall trapping system calls in order to emulate them.

31 Related Work
- VMI tools: Livewire [Garfinkel+ NDSS'03] for VMware; XenAccess [Payne+ ACSAC'07] for Xen
- Shm-snapshot for LibVMI [Xu+ PDL'13]: take a snapshot of a VM's memory in shared memory; it takes 1.4 seconds for 3 GB
- Volatility [Walters '07]: a memory forensics framework; VMI for KVM is enabled by PyVMI, a Python adapter from LibVMI

There are many VMI tools for various virtualization software. In Livewire, VMI was first implemented using VMware. For Xen, XenAccess was a widely used open source implementation, and LibVMI is its successor. The latest LibVMI release candidate adds shm-snapshot support, a mechanism that takes a snapshot of a VM's physical memory in shared memory. IDSes can then directly access the VM's memory via the shared memory. But taking a snapshot itself takes time, for example, 1.4 seconds for 3 GB of memory. In addition, because a snapshot soon becomes stale, snapshots need to be taken frequently. Volatility is a memory forensics framework. It is not a VMI tool itself, but VMI for KVM is enabled by PyVMI, a Python adapter from LibVMI, so it inherits the issues of LibVMI.

32 Conclusion
- KVMonitor achieves efficient VM introspection (VMI) in KVM: 32x faster than the existing LibVMI
- Performance comparison with Xen: 118x faster at maximum; chkrootkit was 1.6x faster
- Future work: comparison with other virtualization software; integration with LibVMI

In conclusion, we have developed KVMonitor, which achieves efficient VM introspection in KVM. KVMonitor was 32 times faster than the existing LibVMI for memory introspection. Using KVMonitor, we conducted a performance comparison with Xen: KVMonitor was up to 118 times faster than Xen, and even for chkrootkit it was 1.6 times faster. Our future work is to conduct performance comparisons with other virtualization software and to integrate KVMonitor with LibVMI.

