Performance Analysis of Virtualization for High Performance Computing: A Practical Evaluation of Hypervisor Overheads
Matthew Cawood, University of Cape Town

Overview
1. Background
2. Research Objectives
3. HPC
4. Virtualization
5. Performance Tuning
6. The Cloud Cluster
7. Benchmarks
8. Results
9. Conclusions

1. Background
- BSc (Eng) final year research project
- CHPC Advanced Computer Engineering (ACE) Lab
- The Cloud Cluster is currently being commissioned
- Research focused on evaluating its hardware and software configurations

2. Research Objectives
1. Present an in-depth report on the current technologies being developed in the field of High Performance Computing.
2. Provide a quantitative performance analysis of the costs associated with virtualization, specifically in the field of HPC.

3. High Performance Computing
- HPC data centres are rapidly growing in size and complexity
- Emphasis is currently placed on improving efficiency and utilization
- Wide selection of applications/requirements:
  - Bioinformatics
  - Astrophysics
  - Simulation
  - Modelling

4. Virtualization

5. Performance Tuning
- Host memory reservation of Linux huge pages
- KVM vCPU pinning to improve NUMA cell awareness
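Both tunings are applied on the host before a guest boots. Below is a minimal Python sketch of the idea; the domain name "hpc-vm0" and the vCPU-to-core map are illustrative assumptions, not the project's actual settings:

#!/usr/bin/env python
# Hypothetical host-side tuning sketch; requires root and a libvirt-managed guest.
import subprocess

def reserve_hugepages(count):
    # Reserve 2MB huge pages on the host, equivalent to
    # "sysctl vm.nr_hugepages=<count>".
    with open("/proc/sys/vm/nr_hugepages", "w") as f:
        f.write(str(count))

def pin_vcpus(domain, cpu_map):
    # Pin each guest vCPU to a fixed host core so the guest's memory
    # accesses stay within one NUMA cell.
    for vcpu, host_cpu in sorted(cpu_map.items()):
        subprocess.check_call(["virsh", "vcpupin", domain, str(vcpu), str(host_cpu)])

if __name__ == "__main__":
    reserve_hugepages(4096)                          # 4096 x 2MB = 8GB reserved
    pin_vcpus("hpc-vm0", {0: 0, 1: 1, 2: 2, 3: 3})   # keep vCPUs on socket 0

In practice the same pinning is usually expressed persistently in the libvirt domain XML (<cputune>/<vcpupin> elements) so it survives guest restarts.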

6. The Cloud Cluster
Compute nodes:
- 2x Intel Xeon E5-2690 (20MB L3 cache, 2.90 GHz)
- 256GB DDR3-1600, CL11
- Mellanox ConnectX-3 VPI FDR 56Gbps HCA
- Gigabit Ethernet NIC
Switch infrastructure:
- Mellanox SX6036 36-port FDR InfiniBand switch

6. The Cloud Cluster
Software stack:
- CentOS 6.4
- OFED 2.0 (with SR-IOV)
- OpenNebula 4.2
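With the mlx4 driver of this OFED generation, the number of virtual functions is configured through the mlx4_core num_vfs module option. The sketch below is a hypothetical sanity check, not part of the original deck: it lists PCI physical functions that currently expose SR-IOV virtual functions by looking for the virtfn* links sysfs creates for each one.

#!/usr/bin/env python
# Hypothetical SR-IOV check: list PCI devices exposing virtual functions.
# Which device appears (e.g. the ConnectX-3 HCA) depends on driver config.
import glob
import os

for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
    vfs = glob.glob(os.path.join(dev, "virtfn*"))
    if vfs:
        print("%s exposes %d virtual function(s)" % (os.path.basename(dev), len(vfs)))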

7. Performance Benchmarks
- HPC Challenge:
  - HPLinpack
  - MPI Random Access
  - STREAM
  - Effective bandwidth & latency
- OpenFOAM:
  - 7 million cell, 5 millisecond transient simulation
  - snappyHexMesh
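As a reference point for one of these benchmarks: STREAM reports sustainable memory bandwidth using simple vector kernels such as the triad a = b + s*c. The NumPy sketch below is an illustrative approximation only, not the STREAM code used in the project:

#!/usr/bin/env python
# STREAM-style triad for illustration; the real STREAM benchmark runs
# several kernels over many repetitions in C/Fortran.
import time
import numpy as np

N = 20 * 1000 * 1000     # ~160MB per array, large enough to defeat caches
b = np.random.rand(N)
c = np.random.rand(N)
s = 3.0

t0 = time.perf_counter()
a = b + s * c            # two reads + one write per element (plus allocation)
dt = time.perf_counter() - t0

moved = 3 * N * 8        # three arrays of 8-byte doubles touched
print("Triad bandwidth: %.1f GB/s" % (moved / dt / 1e9))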

8. Results

8.1 Software Comparison
Figure: HPLinpack throughput comparison of compiler selection

8.2 Single Node Evaluation
Figures: HPLinpack throughput efficiency of virtual machines; MPI Random Access performance; STREAM memory bandwidth

8.3 Cluster Evaluation
Figure: HPLinpack throughput efficiency of virtual machines

8.3 Cluster Evaluation
Figure: OpenFOAM runtime efficiency of virtual machines

8.4 Interconnect Evaluation
Figures: typical Verbs latency of virtual machines; typical IPoIB latency of virtual machines; native Verbs vs. IP over InfiniBand
(Verbs is InfiniBand's native RDMA interface and bypasses the kernel network stack; IPoIB encapsulates IP traffic over the InfiniBand fabric and traverses the full TCP/IP stack.)

8.5 Supplementary Tests
Figure: HPLinpack throughput with Intel Hyper-Threading

8.5 Supplementary Tests
Figure: OpenFOAM runtime under virtual machine scaling

9. Conclusions
- KVM typically provides good performance for HPC
- Tuning is necessary to improve performance further
- Efficiency is highly application dependent
- SR-IOV for InfiniBand effectively reduced I/O virtualization overheads
- Synthetic and real-world results often contradict each other

Questions?