Virtual Machine in HPC PAK MARKTHUB (13M54040) 1 VIRTUAL MACHINE IN HPC
Review paper
“Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics”
[2012 IEEE Conference on Open Systems (ICOS)]
Amril Nazir, Yaszrina Mohamad Yassin, Chong Poh Kit, Ettikan Kandasamy Karuppiah
MIMOS Bhd, Technology Park Malaysia

Review paper
“Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks”
[2013 IEEE International Congress on Big Data (BigData Congress)]
Jack Li, Qingyang Wang, Deepal Jayasinghe, Junhee Park, Tao Zhu, Calton Pu
School of Computer Science at the College of Computing, Georgia Institute of Technology

Review paper
“Cooperative VM migration for a virtualized HPC cluster with VMM-bypass I/O devices”
[2012 IEEE 8th International Conference on E-Science (e-Science)]
Ryousei Takano, Hidemoto Nakada, Takahiro Hirofuchi, Yoshio Tanaka, and Tomohiro Kudoh
National Institute of Advanced Industrial Science and Technology (AIST)
Outline
◦ A Brief Introduction to Virtual Machine Technologies
◦ Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics
◦ Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks

A Brief Introduction to Virtual Machine Technologies
Cloud Computing
The phrase is also commonly used to refer to network-based services that appear to be provided by real server hardware but are in fact served by virtual hardware, simulated by software running on one or more real machines. Such virtual servers do not physically exist and can therefore be moved around and scaled up (or down) on the fly without affecting the end user, arguably rather like a cloud.

Advantages of Cloud Computing
◦ Economies of scale
◦ Scalability
◦ Hardware independence
◦ Fault tolerance
◦ Etc.
Cloud Computing & Big Data
Source: Gartner

Virtual Machine (VM)
A virtual machine (VM) is a software implementation of a machine (i.e., a computer) that executes programs like a physical machine.
◦ Two classifications:
  ◦ System virtual machine
  ◦ Process virtual machine

VM Architecture
Source: /doc.22/e15444/ovmserver.htm
Evaluation of Virtual Machine Scalability on Distributed Multi/Many-core Processors for Big Data Analytics
2012 IEEE Conference on Open Systems (ICOS)
Amril Nazir, Yaszrina Mohamad Yassin, Chong Poh Kit, Ettikan Kandasamy Karuppiah

Abstract - Summarized
◦ Objective: Evaluate how well a large-scale distributed data analysis application runs in a virtualized environment.
◦ Why?
  ◦ Data analytics in the Cloud is very attractive for small and medium-sized organizations.
  ◦ Non-expert users can provision resources as VMs on the Cloud very easily.
◦ Results:
  ◦ One should minimize the number of VMs deployed for each application.
  ◦ One should equip each VM with sufficient memory and a reasonable number of CPU cores.

Financial Application Architecture
Data extraction
◦ Process of identifying sources and collecting data
◦ Sources of data: Bursa, NASDAQ, LSE, etc.
Data pre-processing
◦ Process of transforming data into an appropriate format for data analysis
◦ Ex: text matching, identification and elimination of mismatched or duplicated records.
Data analysis
◦ Sometimes the data are too large to process on one machine.
◦ The Cloud can be useful for distributing tasks on demand.
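As an illustration of the pre-processing step, here is a minimal sketch of duplicate-record elimination. The record layout (`symbol`, `date`, `close`), the sample values, and the function name `drop_duplicates` are hypothetical and are not taken from the paper, which uses RapidMiner for this work.

```python
def drop_duplicates(records):
    """Keep the first record for each (symbol, date) key, preserving order."""
    seen = set()
    unique = []
    for rec in records:
        # Normalise the symbol so that "abc " and "ABC" count as the same key.
        key = (rec["symbol"].strip().upper(), rec["date"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical sample data: the first two rows describe the same quote.
records = [
    {"symbol": "abc ", "date": "2012-01-03", "close": 10.0},
    {"symbol": "ABC", "date": "2012-01-03", "close": 10.0},
    {"symbol": "XYZ", "date": "2012-01-03", "close": 5.5},
]
cleaned = drop_duplicates(records)
```

Keying on a normalised symbol plus date is only one possible matching rule; real feeds may need fuzzier text matching.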
Distributed Financial Application Architecture

Computational Needs for Financial Application
Data intensive
◦ Extraction of financial data from sources.
◦ Needs high-memory machines for extraction.
Compute intensive
◦ Data pre-processing
◦ Data analysis

Experiment – Environment
◦ Uses 4 physical machines (with hypervisors)
◦ 1 Gb Ethernet connection
◦ VM (RHEL5 with kernel )
◦ Uses RapidMiner as the financial application
Experiment I
◦ Increase the problem size.
◦ Observe how much time each configuration takes to finish the task.
◦ Configurations:
  ◦ 1 core (no VM)
  ◦ 4 cores (no VM)
  ◦ 1 core per VM
  ◦ 4 cores per VM
  ◦ 4 VMs, 1 core each
Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated
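The measurement procedure of Experiment I can be sketched as a small timing harness. The workload `analyse` is a hypothetical stand-in, not the RapidMiner process used in the paper, and the problem sizes are illustrative. The same script would be run once per configuration (physical cores, single VM, multiple VMs) and the timings compared.

```python
import time


def analyse(securities):
    """Hypothetical stand-in workload: sum of pairwise products."""
    total = 0.0
    for a in securities:
        for b in securities:
            total += a * b
    return total


def time_by_problem_size(sizes):
    """Return a list of (problem size, elapsed seconds) pairs."""
    timings = []
    for size in sizes:
        data = [float(i) for i in range(size)]
        start = time.perf_counter()
        analyse(data)
        timings.append((size, time.perf_counter() - start))
    return timings


timings = time_by_problem_size([8, 16, 32])
```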
Experiment I - Result
Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated

Experiment II
◦ Fix the problem size (the size of securities)
◦ Vary the resources assigned to a VM
  ◦ The number of CPU cores
  ◦ The amount of memory
◦ Observe the time used to execute the task
Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
Experiment II - Result
Size of Securities | Memory (GB) | 1 core per VM | 4 cores per VM | 8 cores per VM
32 | 2 | 73 | (16,17) | (119,1430)
(12,13) (14,18)
Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
Experiment III
◦ Fix the problem size (32 securities)
◦ Fix the VM configuration (4 cores, 2 GB)
◦ Vary the physical machine (phy1 vs. phy4)
◦ Observe the task execution time
Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 3.2GB | Not stated

Experiment III - Result
Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 3.2GB | Not stated
Machine | 1 core per VM | 4 cores per VM
phy1 | 73 | (16,17)
phy4 | 104 | (25,34)
Experiment IV
◦ Compare the execution time of
  ◦ 4 cores per VM
  ◦ 2 VMs with 2 cores each
Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated

Experiment IV - Result
Name | CPU | Memory | Hypervisor
phy | GHz-Xeon | 3.2GB | Not stated
Experiment V – VM Provisioning
◦ From the previous experiments, a sufficient amount of memory per CPU core is around 1024 MB.
◦ Compare VM provisioning under a memory-per-core constraint against OpenNebula VM provisioning, which uses round robin.

Experiment V - Result
av = the available machine memory
m = the optimal memory per core set by the user (1024 MB in this experiment)
n = the number of cores assigned to the VM
Name | CPU | Memory | Hypervisor
phy1 | 8 2.4GHz-Xeon | 10.5GB | Xen
phy | GHz-Xeon | 12GB | KVM
phy3 | 8 3.2GHz-i7 | 3GB | KVM
phy | GHz-Xeon | 3.2GB | Not stated
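A minimal sketch of the two provisioning policies, assuming a host is eligible for a VM when av >= m * n. That eligibility rule is an assumption built from the three symbols defined above, not a formula confirmed by the paper. The host names, memory sizes, and core requests are hypothetical, and `round_robin` is a simplified model rather than OpenNebula's actual scheduler code.

```python
from itertools import cycle


def round_robin(hosts, requests):
    """Assign each VM request to the next host in turn, ignoring memory."""
    order = cycle(hosts)
    return [next(order)["name"] for _ in requests]


def memory_per_core(hosts, requests, m=1024):
    """Place each VM on the first host with av >= m * n (assumed rule).

    hosts:    list of {"name": str, "av": available memory in MB}
    requests: list of n, the number of cores requested per VM
    Returns the chosen host name per request, or None if no host fits.
    """
    free = {h["name"]: h["av"] for h in hosts}
    placement = []
    for n in requests:
        need = m * n
        chosen = next((name for name, av in free.items() if av >= need), None)
        if chosen is not None:
            free[chosen] -= need  # reserve the memory on the chosen host
        placement.append(chosen)
    return placement


# Hypothetical hosts and requests (cores per VM).
hosts = [{"name": "hostA", "av": 8192}, {"name": "hostB", "av": 2048}]
requests = [4, 4, 2]

rr_plan = round_robin(hosts, requests)
mem_plan = memory_per_core(hosts, requests)
```

In this example round robin sends the second 4-core VM to hostB, which has only 2048 MB available, while the memory-per-core rule keeps it on hostA and uses hostB for the 2-core VM.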
Conclusion
◦ One should minimize the number of VMs deployed for each application.
◦ One should equip a VM with sufficient memory.
◦ There is a need for more effective discovery, scheduling, and load balancing than the typical round robin employed by current Cloud middleware.

My opinions
STRONG POINTS
◦ The experiments cover various VM configurations.
◦ The results can be used directly to improve current VM provisioning software.
WEAK POINTS
◦ The experimental datasets are too small.
◦ Only data- and compute-intensive workloads are considered.
◦ Experiment V does not consider the effect of different hypervisors.
Questions and Answers

Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks
2013 IEEE International Congress on Big Data (BigData Congress)
Jack Li, Qingyang Wang, Deepal Jayasinghe, Junhee Park, Tao Zhu, Calton Pu
Abstract - Summarized
◦ Objective: Benchmark the performance of 3 popular hypervisors using various types of Hadoop workloads.
◦ Why? Hypervisors are widely used in Cloud environments.
◦ Results:
  ◦ Workload type, workload size, and VM placement yielded performance differences among the hypervisors.
  ◦ CPU-bound workloads show negligible performance differences among the hypervisors, whereas I/O-bound workloads show significant differences.

Testing Environment
◦ Benchmark 3 hypervisors: KVM, Xen, and a commercial hypervisor (CVM; the product is not named for licensing reasons).
◦ Run a 4-node Hadoop MapReduce virtual cluster on a single physical machine.
◦ Each VM node has 1 virtual core pinned to a different physical core.
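The pinning layout described above can be sketched as a one-to-one plan from VM nodes to physical cores. The node names and core numbers are hypothetical, and the function only builds the plan; applying it is left to the hypervisor's own tooling, which the slides do not describe.

```python
def pin_one_vcpu_per_core(vm_names, physical_cores):
    """Map each single-vCPU VM to its own physical core.

    Raises ValueError when there are fewer cores than VMs, because two
    VMs would then have to share a core.
    """
    if len(vm_names) > len(physical_cores):
        raise ValueError("not enough physical cores for one core per VM")
    return dict(zip(vm_names, physical_cores))


# Hypothetical 4-node virtual cluster on a host with cores 0 to 3.
plan = pin_one_vcpu_per_core(["node1", "node2", "node3", "node4"], [0, 1, 2, 3])
```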
CPU-Bound Benchmark
TestDFSIO Write & Filebench Write
TestDFSIO Read & Filebench Read
TestDFSIO Read CPU Utilization
TeraSort Benchmark
TeraSort – Why does Xen perform well?
Effects of Adding VMs on the Same Host
Effects of Adding VMs on the Same Host: Read/Write Throughput
Effect of Changing VM Placement
Conclusion
◦ CPU-bound benchmark: no significant performance difference.
◦ KVM is the best for disk reads.
◦ CVM is the best for disk writes.
◦ Xen is the best for tasks that are both CPU- and I/O-intensive.
◦ Increasing the number of VMs per host is not good for I/O-intensive workloads, but may be good for workloads that are both CPU- and I/O-intensive.

My opinions
STRONG POINTS
◦ The benchmark datasets are quite large.
◦ Many aspects of the different hypervisors are compared.
WEAK POINTS
◦ No benchmark results on a physical host are provided.
◦ Putting 4 VMs on the same host does not reflect real-world provisioning patterns.
Questions and Answers
The End
Supporting Slides
KVM Overview
KVM – Host Interaction
KVM – VIRTIO Device
Hadoop Map-Reduce
Hadoop Word Count
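The word count example can be sketched in plain Python to show the map, shuffle, and reduce phases. This is an in-memory illustration of the programming model, not Hadoop's Java API; the function names and input lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


counts = word_count(["the quick fox", "the lazy dog", "The end"])
```

In Hadoop the map and reduce calls run on different nodes and the shuffle moves data across the network, which is why the I/O behaviour of the hypervisor matters for such jobs.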