AIST Super Green Cloud: Lessons Learned from the Operation and the Performance Evaluation of HPC Cloud
Ryousei Takano, Yusuke Tanimura, Akihiko Oota, Hiroki Oohashi, Keiichi Yusa, Yoshio Tanaka
National Institute of Advanced Industrial Science and Technology (AIST), Japan
ISGC 2015, Taipei, 20 March 2015
This talk is about the AIST Super Green Cloud (ASGC).
Introduction
The HPC cloud is a promising HPC platform, and virtualization is its key technology.
– Pros: a customized software environment, elasticity, etc.
– Cons: a large overhead that spoils I/O performance.
VMM-bypass I/O technologies, e.g., PCI passthrough and SR-IOV, can significantly mitigate this overhead.
"99% of HPC jobs running on US NSF computing centers fit into one rack." -- M. Norman, UCSD
Current virtualization technologies are feasible enough to support such a scale.
LINPACK on ASGC (IEEE CloudCom 2014)
Performance degradation: 5.4 - 6.6%
Efficiency* on 128 nodes:
– Physical cluster: 90%
– Virtual cluster: 84%
*) Efficiency = Rmax / Rpeak
Introduction (cont'd)
HPC clouds are heading toward hybrid-cloud and multi-cloud systems, where users can execute their applications anytime and anywhere they want.
Vision of the AIST HPC Cloud: "Build once, run everywhere"
AIST Super Green Cloud (ASGC): a fully virtualized HPC system
Outline
– AIST Super Green Cloud (ASGC) and the HPC Cloud service
– Lessons learned from the first six months of operation
– Experiments
– Conclusion
Vision of AIST HPC Cloud: "Build Once, Run Everywhere"
Virtual cluster templates can be deployed as virtual clusters on academic, private, and commercial clouds.
– Feature 1: Create a customized virtual cluster easily
– Feature 2: Build a virtual cluster once, and run it everywhere on clouds
Usage Model of AIST Cloud
Users customize their virtual clusters and launch virtual machines when necessary:
1. Select a virtual machine template (Web apps, Big Data, HPC, HPC + ease of use).
2. Deploy it, log in, and install the required software packages.
3. Take a snapshot and save the user-customized template in the VM template repository for later deployment.
Elastic Virtual Cluster
A virtual cluster on ASGC consists of a frontend node (NFS server and job scheduler) and compute nodes connected by InfiniBand/Ethernet. It is created from a virtual cluster template with sgc-tools through the cloud controller, and it scales in and out as jobs are submitted from the login node. Virtual cluster templates can be imported from and exported to the image repository of a public cloud, so the same cluster can also run there over Ethernet. The ASGC side is in operation; the public-cloud side is under development.
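The slides do not show the sgc-tools interface itself, so the sketch below only illustrates the kind of signed CloudStack API call (deployVirtualMachine) that such a tool could issue against the cloud controller to start one compute node. The endpoint URL, keys, and offering/template/zone IDs are placeholders, not ASGC values.

```python
# Minimal sketch of a signed CloudStack API call, as sgc-tools-like tooling
# might issue it. URL, keys, and resource IDs below are placeholders.
import base64
import hashlib
import hmac
import urllib.parse
import urllib.request

API_URL = "http://cloud.example.org:8080/client/api"   # placeholder endpoint
API_KEY = "YOUR_API_KEY"                                # placeholder credentials
SECRET_KEY = "YOUR_SECRET_KEY"

def call(command, **params):
    params.update({"command": command, "apiKey": API_KEY, "response": "json"})
    # CloudStack signing: URL-encode values, sort by key, lowercase the string,
    # then HMAC-SHA1 it with the secret key and append as "signature".
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}" for k, v in sorted(params.items())
    )
    digest = hmac.new(SECRET_KEY.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    with urllib.request.urlopen(f"{API_URL}?{query}&signature={signature}") as resp:
        return resp.read()

# Deploy one compute node of a virtual cluster (IDs are hypothetical).
print(call("deployVirtualMachine",
           serviceofferingid="so-hpc-20core",
           templateid="tmpl-hpc-centos65",
           zoneid="zone-asgc"))
```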
ASGC Hardware Spec.
Compute node:
– CPU: Intel Xeon E5-2680 v2 / 2.8 GHz (10 cores) x 2 CPUs
– Memory: 128 GB DDR3-1866
– InfiniBand: Mellanox ConnectX-3 (FDR)
– Ethernet: Intel X520-DA2 (10 GbE)
– Disk: Intel SSD DC S3500 600 GB
Network switches:
– InfiniBand: Mellanox SX6025
– Ethernet: Extreme BlackDiamond X8
The 155-node cluster consists of Cray H2312 blade servers; the theoretical peak performance is 69.44 TFLOPS.
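The quoted peak follows directly from the node count and per-core throughput. A quick check, assuming 8 double-precision FLOPs per cycle per core for the Ivy Bridge-based E5-2680 v2 (an assumption not stated on the slide):

```python
# Back-of-the-envelope check of the theoretical peak.
# Assumption: 8 DP FLOPs/cycle/core (AVX add + multiply, no FMA on Ivy Bridge).
nodes = 155
sockets_per_node = 2
cores_per_socket = 10
clock_ghz = 2.8
flops_per_cycle = 8

peak_tflops = nodes * sockets_per_node * cores_per_socket * clock_ghz * flops_per_cycle / 1000
print(f"Theoretical peak: {peak_tflops:.2f} TFLOPS")  # -> 69.44 TFLOPS
```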
ASGC Software Stack
Management stack:
– CentOS 6.5 (QEMU/KVM 0.12.1.2)
– Apache CloudStack 4.3 + our extensions: PCI passthrough/SR-IOV support (KVM only); sgc-tools, a virtual cluster construction utility
– RADOS cluster storage
HPC stack (virtual cluster):
– Intel Cluster Studio SP1 1.2.144
– Mellanox OFED 2.1
– TORQUE job scheduler 4.2.8
Storage Architecture
There is no shared storage or file system. VMDI (Virtual Machine Disk Image) storage consists of a RADOS storage cluster, a RADOS gateway, and an NFS secondary staging server; each compute node's local SSD serves as primary storage and the RADOS cluster as secondary storage. User data resides on user-attached NFS storage. The 155 compute nodes are connected by an InfiniBand FDR compute network, a 10 GbE data network, and a 10/1 GbE management network.
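Because the VMDI secondary storage sits behind an S3-compatible RADOS gateway, disk images can be moved with any S3 client. A minimal sketch with boto3; the endpoint URL, bucket name, object keys, and credentials are placeholders, not the tooling ASGC actually uses.

```python
# Sketch: push a VM disk image to the S3-compatible RADOS gateway that backs
# the VMDI secondary storage. Endpoint, bucket, and credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.asgc.example.org:7480",  # RADOS gateway (placeholder)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket = "vmdi-templates"  # hypothetical bucket for virtual machine disk images
s3.upload_file("centos65-hpc.qcow2", bucket, "templates/centos65-hpc.qcow2")

# List what is currently stored in the template bucket.
for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"], obj["Size"])
```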
Zabbix Monitoring System (graphs of CPU usage and power usage)
Outline
– AIST Super Green Cloud (ASGC) and the HPC Cloud service
– Lessons learned from the first six months of operation: CloudStack on a supercomputer; cloud service for HPC users; utilization
– Experiments
– Conclusion
Overview of ASGC Operation
– Operation started in July 2014.
– Accounts: 30+; the main users are materials scientists and genome scientists.
– Utilization: < 70%; 95% of the total usage time is spent running HPC VM instances.
– Hardware failures: 19 (memory, motherboard, power supply)
CloudStack on a Supercomputer
A supercomputer is not designed for cloud computing; in particular, the cluster management software is troublesome. Nevertheless, we could launch a highly productive system in a short development time by leveraging open source system software.
Software maturity of CloudStack:
– Our storage architecture is slightly uncommon: we use local SSDs as primary storage and an S3-compatible object store as secondary storage.
– We discovered and resolved several serious bugs.
Software Maturity
CloudStack issue | Our action | Status
cloudstack-agent jsvc gets too large a virtual memory space | Patch | Fixed
listUsageRecords generates NullPointerExceptions for expunging instances | Patch | Fixed
Duplicate usage records when listing a large number of records / small page sizes return duplicate results | Backporting | Fixed
Public key content is overridden by the template's metadata when creating an instance | Bug report | Fixed
Migration of a VM with volumes in local storage to another host in the same cluster is failing | Backporting | Fixed
Negative ref_cnt of template(snapshot/volume)_store_ref results in an out-of-range error in MySQL | Patch (not merged) | Fixed
[S3] Parallel deployment makes the reference count of a cache in the NFS secondary staging store negative (-1) | Patch (not merged) | Unresolved
Can't create a proper template from a VM in an S3 secondary storage environment | Patch | Fixed
Fails to attach a volume (made from a snapshot) to a VM using local storage as primary storage | Bug report | Unresolved
Cloud Service for HPC Users
SaaS is best if the target application is clear. IaaS is quite flexible, but it is difficult for application users to manage an HPC environment from scratch. To bridge this gap, sgc-tools is introduced on top of the IaaS service. We believe it works well, although some minor problems remain. To improve the maintainability of VM templates, the idea of "Infrastructure as Code" can help.
Utilization
Efficient use of the limited resources is required. A virtual cluster holds dedicated resources whether or not the user fully utilizes them, and sgc-tools does not support system-wide queuing, so users need to check resource availability themselves. Introducing a global scheduler, e.g., the HTCondor VM universe, can be a solution to this problem (see the sketch below).
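The slide only names the Condor VM universe as a possible direction; ASGC does not run it. As a rough illustration under that assumption, a VM-universe submission through the htcondor Python bindings might look like the following, with a hypothetical disk image name and resource values (and recent bindings, HTCondor 9.x or later, assumed for Schedd.submit).

```python
# Rough sketch: submit a KVM virtual machine as an HTCondor "vm universe" job
# via the htcondor Python bindings. Image path and resource values are hypothetical.
import htcondor

submit = htcondor.Submit({
    "universe": "vm",
    "executable": "asgc_vcluster_node",     # serves as a label in the vm universe
    "vm_type": "kvm",
    "vm_memory": "4096",                    # MB
    "vm_networking": "true",
    "vm_disk": "centos65-hpc.qcow2:vda:w",  # file:device:permission
    "request_cpus": "20",
})

schedd = htcondor.Schedd()
print(schedd.submit(submit))  # queue one VM job; Condor decides where it runs
```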
Outline
– AIST Super Green Cloud (ASGC) and the HPC Cloud service
– Lessons learned from the first six months of operation
– Experiments: deployment time; performance evaluation of SR-IOV
– Conclusion
Virtual Cluster Deployment
An image is transferred from RADOS to the NFS secondary staging (SS) server and from the SS server to the local disk of the compute node before the VM starts.
Breakdown (seconds):
– Device attach (before OS boot): 90
– OS boot: 90
– File system creation (mkfs): 90
Benchmark Programs
Micro benchmark:
– Intel Micro Benchmark (IMB) version 3.2.4: point-to-point and collectives (Allgather, Allreduce, Alltoall, Bcast, Reduce, Barrier)
Application-level benchmark:
– LAMMPS Molecular Dynamics Simulator, version 28 June 2014: EAM benchmark, 100x100x100 atoms
MPI Point-to-Point Communication (IMB)
Measured bandwidths: 6.00 GB/s, 5.72 GB/s, 5.73 GB/s.
The overhead is less than 5% with large messages, though it is up to 30% with small messages.
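A quick check relating the bandwidths to the "less than 5%" claim, assuming 6.00 GB/s is the bare-metal baseline and the other two figures are the virtualized cases (the slide text does not label them):

```python
# Relative large-message bandwidth loss versus the 6.00 GB/s figure.
# Assumption: 6.00 GB/s is bare metal; 5.72 and 5.73 GB/s are the virtualized cases.
baseline = 6.00
for label, bw in (("case A", 5.72), ("case B", 5.73)):
    print(f"{label}: {100 * (1 - bw / baseline):.1f}% below baseline")
# Both come out around 4.5-4.7%, consistent with "less than 5% for large messages".
```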
MPI Collectives
Measured collectives: Reduce, Bcast, Allreduce, Allgather, Alltoall, and Barrier.
Time [microseconds]: BM 6.87 (1.00), PCI passthrough 8.07 (1.17), SR-IOV 9.36 (1.36).
The performance of SR-IOV is comparable to that of PCI passthrough, although unexpected performance degradation is often observed.
LAMMPS: MD Simulator
EAM benchmark:
– Fixed problem size (1M atoms)
– Number of processes: 20 - 160
VCPU pinning reduces performance fluctuation (see the sketch below). The performance overhead of PCI passthrough and SR-IOV is about 13%.
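VCPU pinning here means fixing each guest vCPU onto a dedicated host core. The slides do not show how ASGC applies the pinning, so the following is only a minimal sketch with the libvirt Python bindings, using an illustrative domain name and a simple 1:1 vCPU-to-core map.

```python
# Minimal sketch of vCPU pinning with the libvirt Python bindings:
# pin guest vCPU i to host core i. Domain name and mapping are illustrative.
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("vcluster-node-001")   # hypothetical VM name

host_cpus = conn.getInfo()[2]                  # number of host logical CPUs
guest_vcpus = dom.maxVcpus()

for vcpu in range(guest_vcpus):
    # cpumap is a boolean tuple over host CPUs; allow exactly one core per vCPU.
    cpumap = tuple(i == vcpu for i in range(host_cpus))
    dom.pinVcpu(vcpu, cpumap)

conn.close()
```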
Findings
– The performance of SR-IOV is comparable to that of PCI passthrough, although unexpected performance degradation is often observed.
– VCPU pinning improves the performance of HPC applications.
Outline
– AIST Super Green Cloud (ASGC) and the HPC Cloud service
– Lessons learned from the first six months of operation
– Experiments
– Conclusion
Conclusion and Future Work
ASGC is a fully virtualized HPC system. We could launch a highly productive system in a short development time by leveraging state-of-the-art open source system software.
– Extensions: PCI passthrough/SR-IOV support, sgc-tools
– Bug fixes
Future research direction: data movement is key.
– Efficient data management and transfer methods
– Federated identity management
Questions?
Thank you for your attention!
Acknowledgments: This work was partly supported by JSPS KAKENHI Grant Number 24700040.
Motivating Observation
Performance evaluation of the HPC cloud (the overhead of I/O virtualization on the NAS Parallel Benchmarks 3.3.1, class C, 64 processes):
– (Para-)virtualized I/O incurs a large overhead.
– PCI passthrough significantly mitigates the overhead.
Compared configurations: BMM (bare metal machine), KVM with virtio over a 10 GbE NIC (guest driver going through the VMM's physical driver), and KVM with an InfiniBand QDR HCA driven directly by the guest's physical driver, bypassing the VMM via PCI passthrough.