Performance Analysis of Virtualization for High Performance Computing
A Practical Evaluation of Hypervisor Overheads
Matthew Cawood
University of Cape Town
Overview
    1. Background
    2. Research Objectives
    3. HPC
    4. Virtualization
    5. Performance Tuning
    6. The Cloud Cluster
    7. Benchmarks
    8. Results
    9. Conclusions
1. Background
    BSc (Eng) final year research project
    CHPC Advanced Computer Engineering (ACE) Lab
    Cloud Cluster is currently being commissioned
    Research focused on evaluating the hardware and software configurations
2. Research Objectives
    1. Present an in-depth report on the current technologies being developed in the field of High Performance Computing.
    2. Provide a quantitative performance analysis of the costs associated with virtualization, specifically in the field of HPC.
3. High Performance Computing
    HPC data centres are rapidly growing in size and complexity
    Emphasis is currently placed on improving efficiency and utilization
    Wide selection of applications/requirements
        Bioinformatics
        Astrophysics
        Simulation Modelling
4. Virtualization
5. Performance Tuning
    Host memory reservation of Linux huge pages
    KVM vCPU pinning to improve NUMA cell awareness
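The two tuning steps above can be sketched in a few lines of Python. This is an illustration only, not the configuration used in the project: the NUMA layout (two cells of eight host CPUs, numbered 0-7 and 8-15), the 2 MiB huge page size, and the helper names `huge_pages_needed`, `pin_vcpus` and `cputune_xml` are all assumptions. The output uses the `<cputune>`/`<vcpupin>` elements of the libvirt domain XML format.

```python
import math

HUGE_PAGE_KIB = 2048  # assumed 2 MiB huge pages (the usual x86-64 default)


def huge_pages_needed(vm_memory_gib):
    """Number of huge pages the host must reserve to back a guest's memory."""
    return math.ceil(vm_memory_gib * 1024 * 1024 / HUGE_PAGE_KIB)


def pin_vcpus(numa_cells, n_vcpus):
    """Map each vCPU to one host CPU, filling one NUMA cell before the next.

    numa_cells is a list of lists of host CPU ids (hypothetical topology).
    """
    host_cpus = [cpu for cell in numa_cells for cpu in cell]
    if n_vcpus > len(host_cpus):
        raise ValueError("more vCPUs requested than host CPUs available")
    return {vcpu: host_cpus[vcpu] for vcpu in range(n_vcpus)}


def cputune_xml(pinning):
    """Render a pinning map as a libvirt <cputune> fragment."""
    lines = ["<cputune>"]
    for vcpu, cpu in sorted(pinning.items()):
        lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{cpu}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


# Assumed topology: two NUMA cells with eight host CPUs each.
cells = [list(range(0, 8)), list(range(8, 16))]
pinning = pin_vcpus(cells, 8)  # an 8-vCPU guest stays inside cell 0
print(cputune_xml(pinning))
print("huge pages for a 64 GiB guest:", huge_pages_needed(64))
```

Keeping a guest's vCPUs inside one cell means its memory accesses stay local to that cell's memory controller, which is the point of making the pinning NUMA aware. Real host CPU numbering varies between systems and should be read from the host rather than assumed.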
6. The Cloud Cluster
    Compute Nodes:
        2x Intel Xeon E5-2690, 20MB L3 cache, 2.90 GHz
        256GB, DDR3-1600, CL11
        Mellanox ConnectX-3 VPI FDR 56Gbps HCA
        Gigabit Ethernet NIC
    Switch Infrastructure:
        Mellanox SX6036 FDR 36 port InfiniBand Switch
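For context when reading HPLinpack figures, a node's theoretical peak can be estimated from the hardware list above. The sketch below is a rough estimate under stated assumptions: the slides give the socket count and clock speed, but the eight cores per socket and eight double-precision FLOPs per cycle (AVX) are taken from the processor's published specifications, not from this deck, and turbo frequencies are ignored. The function name `peak_gflops` is illustrative.

```python
def peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle):
    """Theoretical peak (Rpeak) of one node in GFLOPS."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


# 2 sockets and 2.90 GHz come from the slide; 8 cores per socket and
# 8 DP FLOPs per cycle are assumptions about the E5-2690.
node_rpeak = peak_gflops(sockets=2, cores_per_socket=8,
                         clock_ghz=2.90, flops_per_cycle=8)
print(f"Estimated node Rpeak: {node_rpeak:.1f} GFLOPS")
```

Measured HPLinpack throughput divided by this figure gives the fraction of peak achieved, which is a separate quantity from the virtual-versus-native efficiency reported in the results.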
6. The Cloud Cluster
    CentOS 6.4
    OFED 2.0 (with SR-IOV)
    OpenNebula 4.2
7. Performance Benchmarks
    HPC Challenge
        HPLinpack
        MPI Random Access
        STREAM
        Effective bandwidth & latency
    OpenFOAM
        7 million cell, 5 millisecond transient simulation
        snappyHexMesh
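To make the STREAM entry concrete, the sketch below shows the triad kernel and the way STREAM converts a timing into a bandwidth figure: each iteration moves three 8-byte values (two reads and one write). It is pure Python for readability, so the number it prints reflects interpreter overhead and not memory bandwidth; the real benchmark is compiled C or Fortran. The array length and the function names are arbitrary choices for illustration.

```python
import time
from array import array


def triad(a, b, c, q):
    """STREAM triad kernel: a[i] = b[i] + q * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bandwidth_gbs(n_elements, seconds, bytes_per_element=8):
    """Bandwidth as STREAM counts it: 3 arrays touched per iteration."""
    return 3 * n_elements * bytes_per_element / seconds / 1e9


n = 100_000  # illustrative size; STREAM uses arrays far larger than cache
a = array("d", [0.0]) * n
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n

start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start

print(f"triad: {triad_bandwidth_gbs(n, elapsed):.3f} GB/s (interpreter-bound)")
```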
8. Results
8.1 Software Comparison
    [Figure] HPLinpack throughput comparison of compiler selection
8.2 Single Node Evaluation
    [Figure] HPLinpack throughput efficiency of virtual machines
    [Figure] MPI Random Access performance
    [Figure] STREAM memory bandwidth
8.3 Cluster Evaluation
    [Figure] HPLinpack throughput efficiency of virtual machines
8.3 Cluster Evaluation
    [Figure] OpenFOAM runtime efficiency of virtual machines
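The slides do not spell out how "efficiency" is computed. A common convention, assumed here, is to express the virtual machine result as a percentage of the bare-metal result: for throughput metrics (HPLinpack) higher is better, and for runtime metrics (OpenFOAM) lower is better, so the ratio is inverted. The numbers in the usage lines are placeholders for illustration, not results from this study.

```python
def throughput_efficiency(vm_gflops, native_gflops):
    """VM throughput as a percentage of native throughput."""
    return 100.0 * vm_gflops / native_gflops


def runtime_efficiency(native_seconds, vm_seconds):
    """Native runtime as a percentage of VM runtime (a slower VM scores lower)."""
    return 100.0 * native_seconds / vm_seconds


# Placeholder values only, chosen to show the arithmetic.
print(throughput_efficiency(vm_gflops=90.0, native_gflops=100.0))   # 90.0
print(runtime_efficiency(native_seconds=100.0, vm_seconds=125.0))   # 80.0
```

With both definitions, 100% means the virtual machine matches bare metal, which lets throughput and runtime benchmarks be compared on the same scale.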
8.4 Interconnect Evaluation
    [Figure] Typical Verbs latency of virtual machines
    [Figure] Typical IPoIB latency of virtual machines
    [Figure] Native Verbs vs. IP over InfiniBand
8.5 Supplementary Tests
    Intel® Hyper-Threading
    [Figure] HPLinpack throughput
8.5 Supplementary Tests
    Virtual machine scaling
    [Figure] OpenFOAM runtime
9. Conclusions
    KVM typically provides good performance for HPC
    Tuning is necessary to further improve performance
    Efficiency is highly application dependent
    SR-IOV for InfiniBand effectively reduced I/O virtualization overheads
    Synthetic and real-world results often contradict each other
Questions?