Virtual Machine in HPC
PAK MARKTHUB (13M54040)


Review paper
“Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics” [2012 IEEE Conference on Open Systems (ICOS)]
Amril Nazir, Yaszrina Mohamad Yassin, Chong Poh Kit, Ettikan Kandasamy Karuppiah
MIMOS Bhd, Technology Park Malaysia

Review paper
“Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks” [2013 IEEE International Congress on Big Data (BigData Congress)]
Jack Li, Qingyang Wang, Deepal Jayasinghe, Junhee Park, Tao Zhu, Calton Pu
School of Computer Science at the College of Computing, Georgia Institute of Technology

Review paper
“Cooperative VM migration for a virtualized HPC cluster with VMM-bypass I/O devices” [2012 IEEE 8th International Conference on E-Science (e-Science)]
Ryousei Takano, Hidemoto Nakada, Takahiro Hirofuchi, Yoshio Tanaka, and Tomohiro Kudoh
National Institute of Advanced Industrial Science and Technology (AIST)

Outline
◦A Brief Introduction to Virtual Machine Technologies
◦Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics
◦Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks

A Brief Introduction to Virtual Machine Technologies

Cloud Computing
The phrase is also more commonly used to refer to network-based services which appear to be provided by real server hardware, which in fact are served up by virtual hardware, simulated by software running on one or more real machines. Such virtual servers do not physically exist and can therefore be moved around and scaled up (or down) on the fly without affecting the end user, arguably rather like a cloud.

Advantages of Cloud Computing
◦Economies of scale
◦Scalability
◦Hardware independent
◦Fault tolerant
◦Etc.

Cloud Computing & Big Data
Source: Gartner

Virtual Machine (VM)
A virtual machine (VM) is a software implementation of a machine (i.e. a computer) that executes programs like a physical machine.
◦2 classifications:
  ◦System virtual machine
  ◦Process virtual machine

VM Architecture
Source: /doc.22/e15444/ovmserver.htm

Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics
2012 IEEE Conference on Open Systems (ICOS)
Amril Nazir, Yaszrina Mohamad Yassin, Chong Poh Kit, Ettikan Kandasamy Karuppiah

Abstract - Summarized
◦Objective: Evaluate how well a large-scale distributed data analysis application can run in a virtualized environment.
◦Why?:
  ◦Data analytics in cloud computing is very attractive for small and medium organizations.
  ◦Non-expert users can provision resources as VMs on the cloud very easily.
◦Results:
  ◦One should minimize the number of VMs deployed for each application.
  ◦One should equip each VM with sufficient memory and a reasonable number of CPU cores.

Financial Application Architecture
Data extraction
◦Process of identifying sources and collecting data
◦Sources of data: Bursa, NASDAQ, LSE, etc.
Data pre-processing
◦Process of transforming data into an appropriate format for data analysis
◦Ex: text matching, identification and elimination of mismatched or duplicated records.
Data analysis
◦Sometimes the data are too large to process on one machine
◦The cloud can be useful for distributing tasks on demand.

Distributed Financial Application Architecture

Computational Needs for Financial Application
Data intensive
◦Extraction of financial data from source.
◦Need high-memory machines for extraction.
Compute intensive
◦Data pre-processing
◦Data analysis

Experiment – Environment
◦4 physical machines (with hypervisor)
◦1 Gb Ethernet connection
◦VM (RHEL5 with kernel )
◦RapidMiner is used as the financial application

Experiment I
◦Increase the problem size.
◦Observe how much time each configuration needs to finish the task.
◦Configurations (from the diagram): task on 1 CPU core; task on 4 CPU cores; task in a VM with 1 core; task in a VM with 4 cores; task across 4 VMs with 1 core each.

Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated
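The measurement procedure of Experiment I (grow the problem size, record the time to finish) can be sketched as a small timing harness. This is only an illustration under stated assumptions: the paper ran RapidMiner, whereas toy_task below is a hypothetical stand-in workload, and time_task and sweep are names invented for this sketch.

```python
import time
from typing import Callable, Dict, Iterable


def time_task(task: Callable[[int], object], size: int, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of task(size)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task(size)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(task: Callable[[int], object], sizes: Iterable[int]) -> Dict[int, float]:
    """Time the task once per problem size and return {size: seconds}."""
    return {size: time_task(task, size) for size in sizes}


def toy_task(size: int) -> int:
    """Hypothetical stand-in workload (assumption); not the paper's application."""
    return sum(i * i for i in range(size * 10_000))


if __name__ == "__main__":
    for size, seconds in sweep(toy_task, [1, 2, 4, 8]).items():
        print(f"size={size:2d}  time={seconds:.4f}s")
```

Running the same sweep once per configuration (bare CPU, single VM, several VMs) would give one curve per configuration, which is the kind of comparison the result slide reports.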

Experiment I - Result

Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated

Experiment II
◦Fix the problem size (the size of securities)
◦Vary resources assigned to a VM
  ◦The number of CPU cores
  ◦Amount of memory
◦Observe time used for executing the task

Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen

Experiment II - Result

Size of Securities | Memory (GB) | 1 core per VM | 4 cores per VM | 8 cores per VM
32 | 2 | 73 | (16,17) | (119,1430)
 |  |  | (12,13) | (14,18)

Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen

Experiment III
◦Fix the problem size (32 securities)
◦Fix VM configuration (4 cores, 2 GB)
◦Vary physical machines (phy1 vs. phy4)
◦Observe the task execution time

Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 3.2GB | Not stated

Experiment III - Result

Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 3.2GB | Not stated

Machine | 1 core per VM | 4 cores per VM
phy1 | 73 | (16,17)
phy4 | 104 | (25,34)

Experiment IV
◦Compare the execution time of
  ◦1 VM with 4 cores
  ◦2 VMs with 2 cores each

Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated

Experiment IV - Result

Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated

Experiment V – VM Provisioning
◦From the previous experiments, a sufficient amount of memory for each CPU core is around 1024 MB.
◦Provision VMs using a memory-per-core constraint, and compare against OpenNebula VM provisioning, which uses round robin.

Experiment V - Result
av = the available machine memory
m = the optimal memory per core set by the user (1024 MB in this experiment)
n = the number of cores assigned to the VM

Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 12GB | KVM
phy3 | 8 3.2GHz-i7 | 3GB | KVM
phy | GHz-Xeon | 3.2GB | Not stated
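The slide defines av, m and n but the rule that combines them is not preserved in this transcript. The sketch below therefore assumes the simplest reading: a host is eligible for a VM with n cores only if av >= m * n. The tie-break (pick the eligible host with the most available memory), the Host class, and the host names are all hypothetical choices made for illustration; the round-robin baseline is a generic one and does not call OpenNebula.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List, Optional


@dataclass
class Host:
    name: str
    av_mb: int  # av: available machine memory, in MB


def fits(host: Host, n_cores: int, m_mb: int = 1024) -> bool:
    """Assumed constraint: the host must offer at least m MB for each of the n cores."""
    return host.av_mb >= m_mb * n_cores


def place_memory_aware(hosts: List[Host], n_cores: int, m_mb: int = 1024) -> Optional[Host]:
    """Pick the eligible host with the most available memory and reserve m * n MB on it."""
    eligible = [h for h in hosts if fits(h, n_cores, m_mb)]
    if not eligible:
        return None  # no host satisfies the memory-per-core constraint
    chosen = max(eligible, key=lambda h: h.av_mb)
    chosen.av_mb -= m_mb * n_cores
    return chosen


def place_round_robin(ring: Iterator[Host]) -> Host:
    """Baseline: take the next host in turn, ignoring its available memory."""
    return next(ring)


if __name__ == "__main__":
    hosts = [Host("hostA", 10_000), Host("hostB", 3_000)]  # hypothetical hosts
    ring = cycle(hosts)
    for i in range(3):
        rr = place_round_robin(ring)
        ma = place_memory_aware(hosts, n_cores=4)
        print(f"VM{i}: round robin -> {rr.name}, memory-aware -> {ma.name if ma else 'no fit'}")
```

The point of the comparison is that round robin will happily place a 4-core VM on a host that cannot give it 1024 MB per core, while the constrained policy refuses or picks another host.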

Conclusion
◦One should minimize the number of VMs deployed for each application.
◦One should equip a VM with sufficient memory.
◦There is a need for more effective discovery, scheduling, and load balancing than the typical round robin employed by current Cloud middleware.

My opinions
STRONG POINTS
◦Experiments cover various VM configurations.
◦The results can be used directly to improve current VM provisioning software.
WEAK POINTS
◦Experimental datasets are too small.
◦Only data- and compute-intensive workloads are considered.
◦Experiment V does not consider the effect of different hypervisors.

Questions and Answers

Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks
2013 IEEE International Congress on Big Data (BigData Congress)
Jack Li, Qingyang Wang, Deepal Jayasinghe, Junhee Park, Tao Zhu, Calton Pu

Abstract - Summarized
◦Objective: Benchmark the performance of 3 popular hypervisors using various types of Hadoop workloads.
◦Why?: Hypervisors are widely used in Cloud environments.
◦Results:
  ◦Workload type, workload size, and VM placement yielded performance differences among different hypervisors.
  ◦CPU-bound workloads show negligible performance differences among hypervisors, whereas I/O-bound workloads show significant differences.

Testing Environment
◦Benchmark 3 hypervisors: KVM, Xen, and a commercial hypervisor (CVM; the paper does not state which one, due to licensing).
◦Run a 4-node Hadoop MapReduce virtual cluster on the same physical machine.
◦Each VM node has 1 virtual core pinned to a different physical core.

CPU-Bound Benchmark

TestDFSIO Write & Filebench Write

TestDFSIO Read & Filebench Read

TestDFSIO Read CPU Utilization

TeraSort Benchmark

TeraSort – Why is Xen good?

Effects of Adding VMs on the same Host

Effects of Adding VMs on the same Host: Read/Write Throughput

Effect of Changing VM Placement

Conclusion
◦CPU-bound benchmark: no significant performance difference.
◦KVM is the best for disk reading.
◦CVM is the best for disk writing.
◦Xen is the best for CPU-and-I/O-intensive tasks.
◦Increasing the number of VMs per host is not good for I/O-intensive workloads, but may be good for CPU-and-I/O-intensive ones.

My opinions
STRONG POINTS
◦Benchmark datasets are quite large.
◦Many aspects of the different hypervisors are compared.
WEAK POINTS
◦No benchmark results on a physical host are given for comparison.
◦Putting 4 VMs on the same host does not reflect real-world provisioning patterns.

Questions and Answers

The End

Supporting Slides

KVM Overview

KVM – Host Interaction

KVM – VIRTIO Device

Hadoop Map-Reduce

Hadoop Word Count
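The word-count slide survives only as a title, so the following is a minimal in-process sketch of the map, shuffle and reduce steps rather than the slide's own content. It deliberately does not use the Hadoop API; map_phase, shuffle, reduce_phase and word_count are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (word, 1) for every whitespace-separated word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["virtual machine in hpc", "virtual cluster"]))
```

In a real Hadoop job the map and reduce functions run on different nodes and the framework performs the shuffle over the network, which is why the I/O behaviour of the hypervisor matters for the benchmarks above.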