How to use HybriLIT
Matveev M. A., Zuev M. I.
Heterogeneous Computations team HybriLIT
Laboratory of Information Technologies (LIT), Joint Institute for Nuclear Research
Hardware
Computation nodes
Table columns: Processor | Number | Num. of cores | TDP (W) | Base Frequency (GHz) | Memory Size (GB)
Processors: Intel Xeon CPU E v; NVIDIA Tesla K; NVIDIA Tesla K20X; Intel Xeon Phi 5110P; Intel Xeon Phi 7120P
Totals: single precision: 76.8 TFLOPS; double precision: 29.1 TFLOPS; TDP: 6.4 kW
Software
Lang      GNU        Intel   OpenMPI 1.6.5   Intel MPI   CUDA 5.5, 6.0, 7.0   PGI 15.3
C         gcc        icc     mpicc           mpiicc      nvcc                 pgcc
C++       g++        icpc    mpicxx          mpiicpc     nvcc                 pgc++
Fortran   gfortran   ifort   mpif77/mpif90   mpiifort                         pgf77/pgf90
OpenMP: GNU 4.4.7, Intel
OpenCL: CUDA 6.0, 7.0, Intel OpenCL 4.6.0
System software
Scientific Linux 6.7.
SLURM: to run applications in batch mode; MODULES: to set environment variables.
Parallel technologies for C/C++, Fortran: MPI, OpenMP, CUDA, OpenCL.
Run task: 3 steps
1. Module add
2. Compile application
3. Run application
Run task
Step 1: module add
$ module avail
$ module add openmpi/1.6.5
$ module add cuda-6.5-x86_64
$ module list
Step 2: compile application
$ mpicc source_mpi.c -o exec_mpi
$ nvcc source_cuda.cu -o exec_cuda
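Steps 1 and 2 can be collected into one file, as in the minimal sketch below. It assumes the module and file names shown on the slide; the helper build is hypothetical and only illustrates guarding each compile step. Because module is usually a shell function defined at login, the sketch is meant to be pasted into, or sourced from, a login shell rather than run as a standalone script.

```shell
# Sketch: load modules (step 1) and compile (step 2).
# Module and file names are the ones used on the slide.

# Skip module loading when the `module` command is absent
# (for example, when trying this off the cluster).
if command -v module >/dev/null 2>&1; then
    module add openmpi/1.6.5
    module add cuda-6.5-x86_64
    module list
fi

# build COMPILER SOURCE OUTPUT  (hypothetical helper)
# Compiles SOURCE into OUTPUT, or prints why the step was skipped.
build() {
    compiler=$1; src=$2; out=$3
    if ! command -v "$compiler" >/dev/null 2>&1; then
        echo "skip $out: $compiler not found"
        return 0
    fi
    if [ ! -f "$src" ]; then
        echo "skip $out: $src not found"
        return 0
    fi
    "$compiler" "$src" -o "$out"
}

build mpicc source_mpi.c exec_mpi
build nvcc source_cuda.cu exec_cuda
```

A skipped step prints a message instead of failing, so a missing module shows up immediately.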
Run task
Step 3: run CPU application by SRUN
$ srun -p test mpiexec -n 1 exec_mpi
Hello world from process 0 of 1 by host blade04.hydra.local
SLURM note
-p, --partition=name. Partitions available: cpu: 7 nodes; gpu: 4 nodes; phi: 2 nodes; test: 2 nodes.
-n, --ntasks=num.
mpiexec: runs the MPI application.
Run task
Step 3: run GPU application by SRUN
$ srun -p test --gres=gpu:3 exec_cuda
Hello World! Detected 3 CUDA capable device(s)
SLURM note
-p, --partition=name.
--gres=gpu:3 sets the number of GPU or Phi devices allocated to the task.
Run task
Step 3: run application by SBATCH
$ cat script.sh
#!/bin/sh
#SBATCH -p test
mpiexec -n 1 ./exec_mpi
$ sbatch script.sh
Submitted batch job 586
$ cat slurm-586.out
Hello world from process 0 of 1 by host blade04.hydra.local
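As a minimal sketch, the batch script shown on this slide can be created with a here-document and then submitted. The guard around sbatch is an addition for illustration, so that the snippet also runs on a machine without SLURM.

```shell
# Write the batch script from the slide with a here-document.
# Quoting 'EOF' keeps the shell from expanding anything inside it.
cat > script.sh <<'EOF'
#!/bin/sh
#SBATCH -p test
mpiexec -n 1 ./exec_mpi
EOF

# Submit only where SLURM is installed (guard added for illustration).
if command -v sbatch >/dev/null 2>&1; then
    sbatch script.sh
else
    echo "sbatch not found; script.sh written but not submitted"
fi
```

The #SBATCH lines are comments to the shell; SLURM reads them as options, so plain ASCII hyphens are required.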
SRUN or SBATCH?
$ srun -p test -w blade06 -n 1 --gres=gpu:3 exec_cuda
Hello World! Detected 3 CUDA capable device(s)
OR
SBATCH script (script.sh):
#!/bin/sh
#SBATCH -p test
#SBATCH -w blade06
#SBATCH -n 1
#SBATCH --gres=gpu:3
./exec_cuda
$ sbatch script.sh
Submitted batch job 587
$ cat slurm-587.out
Hello World! Detected 3 CUDA capable device(s)
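The two forms carry the same options: each flag given to srun becomes one #SBATCH line. The sketch below makes that mapping explicit. The function make_batch and the file name script_gpu.sh are hypothetical names chosen for illustration; the option values are the ones from this slide.

```shell
# make_batch PARTITION NODE NTASKS GRES COMMAND...  (hypothetical helper)
# Prints a batch script whose #SBATCH lines carry the same options
# that would otherwise be passed to srun on the command line.
make_batch() {
    partition=$1; node=$2; ntasks=$3; gres=$4
    shift 4
    printf '#!/bin/sh\n'
    printf '#SBATCH -p %s\n' "$partition"
    printf '#SBATCH -w %s\n' "$node"
    printf '#SBATCH -n %s\n' "$ntasks"
    printf '#SBATCH --gres=%s\n' "$gres"
    printf '%s\n' "$*"
}

# Same options as: srun -p test -w blade06 -n 1 --gres=gpu:3 exec_cuda
make_batch test blade06 1 gpu:3 ./exec_cuda > script_gpu.sh
cat script_gpu.sh
```

The generated file can then be submitted with sbatch in the same way as script.sh.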
Basic SLURM commands
srun: run an application in interactive mode;
sbatch: run an application in queue (batch) mode;
scancel: cancel a running task;
sinfo: show the status of partitions:
PART   AVAIL   TIMELIM    NODES   STATE
cpu    up      infinite   7       idle
gpu    up      infinite   4       idle
phi    up      infinite   2       idle
test   up      infinite   1       idle
squeue: show the status of tasks:
JOBID   NAME        USER   STATE   TIME   NODE
312     script.sh   user   R       0:31   blade05
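The job ID printed by sbatch is what ties these commands together: it names the output file and identifies the task for squeue and scancel. The sketch below extracts it from the confirmation line. The function job_id_from is a hypothetical helper, and the sample line is the one shown earlier in these slides.

```shell
# job_id_from LINE  (hypothetical helper)
# Prints the last word of sbatch's confirmation line, which is the job ID.
job_id_from() {
    printf '%s\n' "${1##* }"
}

line="Submitted batch job 586"   # sample line copied from the slides
id=$(job_id_from "$line")
echo "job id: $id"
echo "output file: slurm-$id.out"

# On the cluster the same ID can be passed to the SLURM commands:
#   squeue -j "$id"    # status of this job only
#   scancel "$id"      # cancel it
```

On the cluster, the sample line would be replaced by the captured output of sbatch, for example line=$(sbatch script.sh).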
Thank you for your attention!