vTurbo: Accelerating Virtual Machine I/O Processing Using Designated Turbo-Sliced Core
Cong Xu, Sahan Gamage, Hui Lu, Ramana Kompella, Dongyan Xu
2013 USENIX Annual Technical Conference
Presented by Sewoog Kim, Embedded Lab.
Motivation
Pay-as-you-go: server consolidation saves application running costs and operational expenditure
Multiple VMs share the same core, so each VM waits for CPU access
Result: low I/O throughput
[Figure: VM1-VM4 sharing cores on the hypervisor (VMM), leading to low I/O throughput]
I/O Processing
Two basic stages:
  Device interrupts are processed synchronously in the kernel
  The application asynchronously copies the data from the kernel buffer
[Figure: CPU time shared by VM1-VM3; IRQ processing fills the kernel buffer, but the application's copy is deferred, causing IRQ processing delay]
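A minimal C sketch of these two stages (a simulation with assumed timings, not guest-kernel code): an "IRQ" thread synchronously appends packets to a kernel-style buffer while the application thread drains it asynchronously; the sleep in the application thread stands in for the vCPU being descheduled, which is when the backlog grows.

/* Sketch (assumed timings): stage 1, an "IRQ" thread, synchronously
 * appends packets to a kernel-style buffer; stage 2, the application
 * thread, copies them out asynchronously. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int backlog, max_backlog;           /* packets waiting in buffer */
static volatile int done;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *irq_stage(void *arg)          /* stage 1: fill the buffer */
{
    for (int pkt = 0; pkt < 100; pkt++) {
        pthread_mutex_lock(&lock);
        if (++backlog > max_backlog)
            max_backlog = backlog;
        pthread_mutex_unlock(&lock);
        usleep(1000);                      /* one packet every 1 ms */
    }
    done = 1;
    return NULL;
}

static void *app_stage(void *arg)          /* stage 2: drain the buffer */
{
    while (!done) {
        usleep(30000);                     /* ~30 ms until the vCPU runs */
        pthread_mutex_lock(&lock);
        backlog = 0;                       /* copy out all buffered data */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, irq_stage, NULL);
    pthread_create(&t2, NULL, app_stage, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("max kernel-buffer backlog: %d packets\n", max_backlog);
    return 0;
}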
Effect of CPU Sharing on TCP Receive
[Figure: a TCP client sends DATA to VM1 via the hypervisor's shared buffer; VM1 cannot process the IRQ and return an ACK until VM2 and VM3 finish their slices, so every round trip is inflated by the IRQ processing delay]
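Because TCP is ACK-clocked, the sender can keep at most one window in flight per round trip, so the inflated RTT directly caps throughput. A back-of-the-envelope C sketch; the window, base RTT, and slice length are illustrative assumptions, not the paper's measurements:

/* Sketch: why IRQ processing delay hurts TCP receive throughput.
 * The VM's scheduling delay is added to every RTT because ACKs
 * wait for the vCPU to run. */
#include <stdio.h>

int main(void)
{
    double window_bytes = 65536.0;     /* assumed receive window */
    double base_rtt_s   = 0.0005;      /* 0.5 ms LAN round trip */
    double slice_s      = 0.030;       /* 30 ms scheduler time slice */
    int    nvms         = 3;           /* VMs sharing the core */

    /* Worst case: an ACK waits for the other VMs' full slices. */
    double irq_delay_s = (nvms - 1) * slice_s;

    double tput_ideal  = window_bytes / base_rtt_s;
    double tput_shared = window_bytes / (base_rtt_s + irq_delay_s);

    printf("ideal : %.1f MB/s\n", tput_ideal  / 1e6);
    printf("shared: %.2f MB/s\n", tput_shared / 1e6);
    return 0;
}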
Effect of CPU Sharing on UDP Receive
[Figure: a UDP client keeps sending DATA to VM1; while VM1 waits for its turn, the hypervisor's shared buffer fills up and subsequent datagrams are dropped before ever reaching the application buffer]
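The drop condition is simple: datagrams are lost whenever the scheduling gap lasts longer than the time the buffer can absorb incoming traffic. A C sketch; the rate, buffer size, and slice length are assumptions for illustration:

/* Sketch: UDP receive drops under CPU sharing. While the vCPU is
 * descheduled, arriving datagrams accumulate in the shared buffer;
 * once it fills, the rest are dropped. */
#include <stdio.h>

int main(void)
{
    double rate_Bps  = 125e6;          /* 1 Gbps arrival rate */
    double buf_bytes = 2e6;            /* ~2 MB shared/kernel buffer */
    int    nvms      = 3;
    double slice_s   = 0.030;          /* 30 ms time slice */

    double gap_s  = (nvms - 1) * slice_s;      /* time descheduled */
    double fill_s = buf_bytes / rate_Bps;      /* time to fill buffer */

    if (gap_s > fill_s) {
        double lost = (gap_s - fill_s) * rate_Bps;
        printf("buffer fills after %.1f ms; ~%.1f MB dropped per gap\n",
               fill_s * 1e3, lost / 1e6);
    } else {
        printf("buffer absorbs the scheduling gap; no drops\n");
    }
    return 0;
}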
Effect of CPU Sharing on Disk Write
[Figure: the application writes DATA into kernel memory (the page cache); flushing it to the disk drive stalls on IRQ processing delay because the completion interrupt waits until VM3 is scheduled again]
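The same reasoning applies to writes: each flushed batch must have its completion IRQ processed before the next batch is issued, so the scheduling delay is added to every batch. A C sketch with assumed numbers (batch size, disk speed, and delay are illustrative):

/* Sketch: disk write throughput under delayed completion IRQs. */
#include <stdio.h>

int main(void)
{
    double batch_bytes = 4e6;          /* 4 MB issued per batch */
    double disk_Bps    = 100e6;        /* 100 MB/s raw disk speed */
    double irq_delay_s = 0.060;        /* 60 ms worst-case IRQ delay */

    double device_s = batch_bytes / disk_Bps;      /* time on device */
    double eff_Bps  = batch_bytes / (device_s + irq_delay_s);

    printf("raw disk : %.0f MB/s\n", disk_Bps / 1e6);
    printf("effective: %.1f MB/s\n", eff_Bps  / 1e6);
    return 0;
}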
Intuitive Solution
Reduce the time slice of each VM
But this causes significant context-switch overhead (see the sketch below)
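A quick C estimate of that overhead: with slice length S and per-switch cost C, the fraction of CPU lost to switching is roughly C / (S + C). The ~30 us switch cost (including cache and TLB refill effects) is an assumption for illustration:

/* Sketch: why shrinking every VM's time slice backfires. */
#include <stdio.h>

int main(void)
{
    double switch_cost_s = 0.00003;            /* ~30 us per switch */
    double slices_s[] = { 0.030, 0.001, 0.0001 };

    for (int i = 0; i < 3; i++) {
        double s = slices_s[i];
        printf("slice %7.1f us -> overhead %5.2f%%\n",
               s * 1e6, 100.0 * switch_cost_s / (s + switch_cost_s));
    }
    return 0;
}

Micro-slicing every core would burn a large fraction of the CPU this way, which is why vTurbo confines micro-slicing to one designated turbo core.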
Our Solution: vTurbo
Our Solution: vTurbo
IRQ processing is offloaded to a dedicated turbo core
  Turbo core: any physical core scheduled with micro-slicing (e.g., 0.1 ms)
Expose the turbo core as a special vCPU to the VM
  The turbo vCPU runs on the turbo core; regular vCPUs run on regular cores
Pin the IRQ context of the guest OS to the turbo vCPU (see the sketch below)
Benefits
  Improved I/O throughput (TCP/UDP, disk)
  Self-adaptive system
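For a flavor of the pinning step: Linux exposes per-IRQ CPU affinity through /proc/irq/<n>/smp_affinity, which accepts a hex CPU mask. The C sketch below routes a hypothetical IRQ 42 to vCPU 3, standing in for the turbo vCPU; the IRQ number and mask are illustrative assumptions, not vTurbo's actual code. Run as root inside the guest.

/* Sketch: pin one IRQ's handling to a chosen vCPU via procfs. */
#include <stdio.h>

int main(void)
{
    const char *path = "/proc/irq/42/smp_affinity"; /* hypothetical IRQ */
    FILE *f = fopen(path, "w");
    if (!f) {
        perror(path);
        return 1;
    }
    fprintf(f, "8\n");      /* mask 0x8: route IRQ 42 to CPU/vCPU 3 */
    fclose(f);
    return 0;
}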
vTurbo Design
[Figure: vTurbo timeline. VM1-VM3 take long slices on the regular core but rotate in micro-slices on the turbo core, so IRQs promptly move arriving data into the buffer for the application]
vTurbo's Impact on Disk Write
[Figure: the application writes DATA into kernel memory; because the turbo core runs VM1-VM3 in micro-slices, completion IRQs are handled promptly and flushes to the disk drive no longer wait for a regular-core slice]
vTurbo's Impact on UDP Receive
[Figure: compared with regular cores alone, vTurbo's micro-sliced IRQ processing drains the hypervisor's shared buffer into the kernel buffer before it fills, so DATA reaches the application buffer without drops]
vTurbo's Impact on TCP Receive
[Figure: with vTurbo, the turbo vCPU processes the kernel buffer (backlog queue and receive queue) and returns ACKs promptly, even while the application buffer stays locked until the VM's regular vCPU is scheduled]
VM Scheduling Policy for Fairness
Turbo cores are not free: CPU fair share must be maintained among VMs
  Calculate credits over both regular and turbo cores
  Guarantee the CPU allocation on turbo cores
  Deduct I/O-intensive VMs' credits on regular cores
  Allocate the deduction to non-I/O-intensive VMs (see the sketch below)
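A C sketch of the accounting idea (assumed numbers and data shapes, not Xen's actual scheduler code): whatever a VM consumes on the turbo core is deducted from its regular-core credit, and the deducted amount is handed to the VMs that are not I/O-intensive.

/* Sketch: turbo-core time is charged against regular-core credit. */
#include <stdio.h>

#define NVMS 3

int main(void)
{
    double base = 100.0;                          /* regular credit per VM */
    double turbo_used[NVMS] = { 60.0, 5.0, 0.0 }; /* turbo-core consumption */
    double regular[NVMS];

    /* 1. Deduct each VM's turbo-core usage from its regular credit. */
    double reclaimed = 0.0, avg = 0.0;
    for (int i = 0; i < NVMS; i++) {
        regular[i] = base - turbo_used[i];
        reclaimed += turbo_used[i];
        avg += turbo_used[i] / NVMS;
    }

    /* 2. Give the reclaimed credit to non-I/O-intensive VMs (here:
     *    those using less than the average amount of turbo time). */
    int quiet = 0;
    for (int i = 0; i < NVMS; i++)
        if (turbo_used[i] < avg) quiet++;
    if (quiet > 0)
        for (int i = 0; i < NVMS; i++)
            if (turbo_used[i] < avg) regular[i] += reclaimed / quiet;

    for (int i = 0; i < NVMS; i++)
        printf("VM%d: regular %6.1f + turbo %5.1f = %6.1f credits\n",
               i + 1, regular[i], turbo_used[i], regular[i] + turbo_used[i]);
    return 0;
}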
Evaluation Setup
VM hosts: 3.2 GHz quad-core Intel Xeon CPU, 16 GB RAM
One core assigned exclusively to the driver domain (dom0)
Xen with Linux 3.2
One core chosen as the turbo core
Gigabit Ethernet switch (10 Gbps for two of the experiments)
File Read/Write Throughput: Micro-Benchmark
[Figure: throughput comparison between a regular core and the turbo core]
TCP/UDP Throughput: Micro-Benchmark
NFS/SCP Throughput: Application Benchmark
Apache Olio: Application Benchmark
Three components:
  A web server to process user requests
  A MySQL database server to store user profiles and event information
  An NFS server to store images and documents specific to events
Conclusions
Problem: CPU sharing among VMs degrades I/O throughput
Solution: vTurbo offloads IRQ processing to a dedicated, turbo-sliced core
Results
  UDP throughput improved by up to 4x
  TCP throughput improved by up to 3x
  Disk write throughput improved by up to 2x
  NFS throughput improved by up to 3x
  Olio throughput improved by up to 38.7%
References
Cheng, L., and Wang, C.-L. "vBalance: Using Interrupt Load Balance to Improve I/O Performance for SMP Virtual Machines." In ACM SoCC (2012).
Dong, Y., Yu, Z., and Rose, G. "SR-IOV Networking in Xen: Architecture, Design and Implementation." In WIOV (2008).
Gordon, A., Amit, N., Har'El, N., Ben-Yehuda, M., Landau, A., Schuster, A., and Tsafrir, D. "ELI: Bare-Metal Performance for I/O Virtualization." In ACM ASPLOS (2012).
Thank You!