Predicting The Performance Of Virtual Machine Migration Presented by : Eli Nazarov Sherif Akoush, Ripduman Sohan, Andrew W.Moore, Andy Hopper University.

Presentation transcript:

Predicting the Performance of Virtual Machine Migration
Sherif Akoush, Ripduman Sohan, Andrew W. Moore, Andy Hopper (University of Cambridge)
Presented by: Eli Nazarov

Agenda
• Introduction.
• How to migrate?
• Defining migration performance.
• Performance prediction.
• The AVG & HIST models.
• Evaluation.
• Conclusions.

Why does performance prediction matter?
• Provision and control computing capacity.
• Guarantee performance levels.
• Efficient management:
  ◦ Better VM placement.
  ◦ Better resource utilization (e.g. load balancing).

How to migrate?
• Stop-and-Copy
  ◦ Minimizes total migration time.
  ◦ Highest downtime.
• On-Demand
  ◦ Short downtime.
  ◦ Very high total migration time.

Pre-Copy migration
Pre-Copy migration involves 6 steps:
1. Initialization: pre-select a target for migration.
2. Reservation: reserve resources on the destination host.
3. Iterative Pre-Copy: the first iteration sends all RAM; each later iteration sends the pages modified in the meantime.
4. Stop-and-Copy: stop the VM for the final transfer.
5. Commitment: the destination host acknowledges that the copy finished correctly.
6. Activation: resources are re-attached to the VM on the destination host.
(Diagram labels: pre-copy phase, copy phase.)

Xen Stop Conditions
• Fewer than 50 pages were dirtied during the last pre-copy iteration.
  ◦ Guarantees short downtime.
• 29 pre-copy iterations have been carried out.
• More than 3*|VM| has already been copied (at iteration N-1 we had copied 3*|VM| - 1 page).
  ◦ Forces the Stop-and-Copy stage.
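The three stop conditions above can be written as a single predicate. This is a minimal sketch, not Xen's code: the function name, the parameter names and the choice to count in pages are assumptions made for illustration; only the thresholds (50 pages, 29 iterations, 3*|VM|) come from the slide.

```python
def should_stop_precopy(dirtied_last_iter, iterations_done, pages_sent, vm_pages,
                        min_dirty=50, max_iters=29, max_factor=3):
    """Return True when iterative pre-copy should hand over to stop-and-copy.

    All arguments are hypothetical names; quantities are counted in pages.
    """
    # Condition 1: the last iteration dirtied so few pages that downtime will be short.
    if dirtied_last_iter < min_dirty:
        return True
    # Condition 2: the iteration budget is used up.
    if iterations_done >= max_iters:
        return True
    # Condition 3: more than max_factor times the VM size has already been sent.
    if pages_sent > max_factor * vm_pages:
        return True
    return False
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive a prediction is to each of them.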

Migration & Down times

How To Predict?  Calculate Bounds.

Bounds are not enough!
• Bounds don't give an accurate prediction.
• Reason: the gap between the lower and upper bounds is large, and it varies with link speed and VM size.
• Example for VM size = 1,024 MB:

Speed      MT LB    MT UB     DT LB     DT UB
100 Mbps   96.3 s   459.1 s   0.314 s   [...] s
1 Gbps     13.3 s   49.9 s    0.314 s   [...] s
10 Gbps    5.3 s    10.1 s    0.314 s   [...] s

MT = Total Migration Time, DT = Total Downtime, LB = Lower Bound, UB = Upper Bound
• For larger VM memory sizes the differences are even greater.
• We need something more accurate.
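To see why the bounds spread so far apart, here is a rough transfer-only estimate. It is a sketch under stated assumptions, not the formula used in the talk: the lower bound sends the VM once, and the upper bound assumes just under 3*|VM| is sent before the last pre-copy iteration, plus up to one |VM| in that iteration and one in stop-and-copy, hence the hypothetical `worst_case_copies=5`. Overheads are ignored, so the results will not match the table above.

```python
MIB = 2**20  # bytes per MB, assuming binary megabytes


def transfer_time_bounds(vm_mb, link_mbps, worst_case_copies=5):
    """Return (lower, upper) transfer time in seconds, ignoring all overheads.

    worst_case_copies is an assumption derived from the Xen stop conditions,
    not a number stated on the slides.
    """
    one_pass_s = vm_mb * MIB * 8 / (link_mbps * 1e6)
    return one_pass_s, worst_case_copies * one_pass_s


for mbps in (100, 1000, 10000):
    lb, ub = transfer_time_bounds(1024, mbps)
    print(f"{mbps:>6} Mbps: {lb:6.1f} s .. {ub:6.1f} s (transfer only)")
```

Even in this simplified form the upper bound is several times the lower one at every link speed, which is the motivation for a model that takes the dirty rate into account.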

Parameters affecting migration
• Migration link bandwidth.
  ◦ Higher-speed links allow faster transfers.
• Pre- and post-migration overheads.
  ◦ Operations that aren't part of the actual transfer.
  ◦ Examples: initializing a container on the destination host, reattaching device drivers to the new VM, etc.
  ◦ Example: 10 Gbps, VM size = 512 MB: pre-overhead = 77%.

Parameters affecting migration (cont.)
• Page dirty rate.
  ◦ The rate at which memory pages in the VM are modified.
  ◦ Affects the number of pages transferred in each pre-copy iteration.
  ◦ The relation between page dirty rate and migration performance is not linear. Reason: link speed.

Page dirty rate and link speed
• Downtime at a low page dirty rate is almost constant and close to the lower bound.
• Downtime increases to the upper bound when the page dirty rate is high (reaches link capacity).
(Figure: total downtime at 10 Gbps.)

Page dirty rate and link speed (cont.)
• Total migration time increases with page dirty rate.
• Total migration time goes back to the lower bound for an extremely high page dirty rate: back to pure Stop-and-Copy.
(Figures: total migration time at 10 Gbps; total migration time at 100 Mbps.)

What's next?
• Prediction using all parameters affecting migration:
  ◦ Link speed.
  ◦ Page dirty rate.
  ◦ VM memory size.
  ◦ Overheads.
• AVG: Average Page Dirty Rate
• HIST: History Based Page Dirty

The AVG model
• Based on the migration logic.
• Assumes a constant or average page dirty rate.
• Useful when the page dirty rate is stable.
• Follows the core functionality of migration in Xen.

The AVG model (cont.)
• Input parameters:
  ◦ Link speed.
  ◦ Page dirty rate. Analytically determinable.
  ◦ Pre/Post overheads. Time spent during actual transfer – Time to migrate idle VM.
  ◦ VM size.
• Xen functionality:
  ◦ sim_clean(): returns the set of dirty pages and sets the state to "all clean".
  ◦ sim_peek(): returns the bitmap of dirty pages (no state change).

Algorithm - the AVG model
• Each Pre-Copy phase:
  ◦ Get the dirty bitmap: sim_peek().
  ◦ Skip the pages re-dirtied in this iteration.
  ◦ Collect at most 1024 pages per batch.
  ◦ migration_time += …
  ◦ if (last_iteration) downtime += …
  ◦ Clean the pages' status: sim_clean().
• Calculate the total times:
  ◦ total_migration_time = migration_time + pre_overheads + post_overheads.
  ◦ total_downtime = downtime + post_overheads.
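The loop above can be approximated with a count-based simulation. This is a simplified sketch, not the authors' implementation: it tracks only the number of dirty pages rather than a bitmap, ignores the 1024-page batching, and does not model the fallback to pure Stop-and-Copy at extreme dirty rates. The function name, parameter names and per-iteration time formula (pages sent divided by link speed in pages per second) are assumptions.

```python
def avg_model(vm_pages, dirty_rate, link_pages_per_s,
              pre_overhead=0.0, post_overhead=0.0,
              min_dirty=50, max_iters=29, max_factor=3):
    """Predict (total_migration_time, total_downtime) in seconds.

    dirty_rate is a constant rate in pages/s (the AVG assumption).
    """
    to_send = vm_pages        # the first iteration sends all RAM
    sent = 0
    iterations = 0
    migration_time = 0.0
    while True:
        iterations += 1
        t = to_send / link_pages_per_s       # time spent in this iteration
        migration_time += t
        sent += to_send
        # Pages dirtied while this iteration was on the wire (capped at VM size).
        dirtied = min(vm_pages, int(dirty_rate * t))
        # Stop conditions listed for Xen on the earlier slide.
        if (dirtied < min_dirty or iterations >= max_iters
                or sent > max_factor * vm_pages):
            break
        to_send = dirtied
    # Stop-and-copy: the VM is paused while the remaining dirty pages are sent.
    downtime = dirtied / link_pages_per_s
    migration_time += downtime
    total_migration_time = migration_time + pre_overhead + post_overhead
    total_downtime = downtime + post_overhead
    return total_migration_time, total_downtime


# Hypothetical numbers: 100,000 pages, 1,000 pages/s dirtied, 10,000 pages/s link.
print(avg_model(100_000, 1_000, 10_000.0))
```

With a dirty rate of zero the sketch collapses to a single pass over memory, and with a dirty rate far above link speed it ends up sending five times the VM size.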

The HIST model
• Used in cases where the page dirty rate is a function of time.
• Depends on the history log of the page dirty rate.

The HIST model (cont.)
• Given the start time of migration, t:
  ◦ Predict migration times based on t+1, t+2, …, t+N.
  ◦ sim_clean() and sim_peek() are changed to return the number of dirty pages at those points in time, taken from the log.
  ◦ Use the AVG algorithm with these functions.
• Observation: for deterministic processes, the set of dirtied pages at any point in time will be approximately the same as for previous runs of the same workload running in a similar environment.
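A history-driven variant can be sketched by replacing the constant dirty rate with a lookup into a recorded log. This is an illustration only: the per-second log format, the wrap-around for workloads longer than the log, and every name below are assumptions, and the bitmap and batching details of the real simulation are omitted.

```python
import math


def dirtied_between(dirty_log, start, end):
    """Pages dirtied in [start, end), given a per-second log of dirty-page counts."""
    total = 0.0
    second = math.floor(start)
    while second < end:
        overlap = min(end, second + 1) - max(start, second)
        # Assumption: the workload repeats, so the log wraps around.
        total += dirty_log[second % len(dirty_log)] * overlap
        second += 1
    return total


def hist_model(vm_pages, dirty_log, start_time, link_pages_per_s,
               pre_overhead=0.0, post_overhead=0.0,
               min_dirty=50, max_iters=29, max_factor=3):
    """Predict (total_migration_time, total_downtime) for a migration starting at start_time."""
    now = start_time
    to_send, sent, iterations = vm_pages, 0, 0
    migration_time = 0.0
    while True:
        iterations += 1
        t = to_send / link_pages_per_s
        # Look up how many pages the logged workload dirtied during this iteration.
        dirtied = min(vm_pages, int(dirtied_between(dirty_log, now, now + t)))
        now += t
        migration_time += t
        sent += to_send
        if (dirtied < min_dirty or iterations >= max_iters
                or sent > max_factor * vm_pages):
            break
        to_send = dirtied
    downtime = dirtied / link_pages_per_s   # final stop-and-copy transfer
    return (migration_time + downtime + pre_overhead + post_overhead,
            downtime + post_overhead)


# Hypothetical log: 10 quiet seconds followed by 10 busy seconds.
log = [0] * 10 + [5000] * 10
print(hist_model(10_000, log, 0, 10_000.0))    # migration started in the quiet phase
print(hist_model(10_000, log, 10, 10_000.0))   # migration started in the busy phase
```

The two calls differ only in start time, which is the point of the model: the same VM can be cheap or expensive to migrate depending on where in its workload cycle the migration begins.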

Evaluation
• Test-bed:
  ◦ XenServer (Xen 3.3.1) on 3 servers: 1 pool master, 2 hosts for migration.
  ◦ Each server: 2 Intel® Xeon™ 2.13 GHz, 6 GB DDR3.
  ◦ SAN: IBM eserver xSeries GB DIMM. Ultra320 SCSI. Ubuntu kernel.
• Compared to:
  ◦ Actual migration using 2 Solarflare 10 Gbps NICs.

Evaluation (cont.)
• Page Modification Micro-Benchmark
  ◦ Can be used for both AVG & HIST.
  ◦ Deterministic application.
  ◦ Writes to memory pages at fixed rates.
  ◦ High resolution of page modification: up to … pages/sec.
• Over 25,000 live migrations.

Evaluation (cont.) - Results
(Figures: AVG vs. real migration; HIST vs. real migration.)

Results (cont.)
For |VM| = 1024 MB, link speed = 10 Gbps:
• HIST mean deviation from the measurements:
  ◦ 3.3% - total migration time.
  ◦ 6.2% - total downtime.
• AVG mean deviation from the measurements:
  ◦ 2.6% - total migration time.
  ◦ 3.3% - total downtime.
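The slides do not define "mean deviation". One plausible reading, assumed here, is the mean absolute relative error between predicted and measured times. The function name and the sample numbers are hypothetical and are not results from the talk.

```python
def mean_relative_deviation(predicted, measured):
    """Mean of |predicted - measured| / measured, expressed in percent."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical migration times in seconds, for illustration only.
print(mean_relative_deviation([11.0, 18.0], [10.0, 20.0]))
```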

Evaluation (cont.) - Industry workloads
• Comparing against a set of industry-standard workloads:
  ◦ SPEC CPU: CPU-bound workloads.
  ◦ SPECweb: web server workloads.
  ◦ SPECsfs: I/O, MapReduce & non-interactive workloads.

Industry workloads - Results

Workload   MT (A)   MT (P)    Error   DT (A)    DT (P)    Error
CPU        5.8 s    5.7 s     2.4%    0.317 s   0.314 s   2.4%
WEB        7.5 s    7.4 s     2.0%    0.449 s   0.42 s    6.4%
SFS        14.8 s   14.9 s    1.5%    [...] s   [...] s   0.1%
MR         14.9 s   15.13 s   1.4%    0.348 s   [...]     [...]%

MT = Total Migration Time, DT = Total Downtime, A = Actual Measurements, P = HIST Prediction

Comments
• Presented an accurate model for prediction.
• Performed a large-scale evaluation.

• Very specific to the Xen implementation.
• Didn't perform an evaluation comparing to other prediction methods.
• Didn't state how to predict with bounds.

Questions?