Parallel computation with R & Python on TACC HPC server Dr. Cuixian Chen, 10/12/2017 Disclaimer: This PPT is adapted from Dr. Yishi Wang's HPC tutorial on 9/21/2017.
Overview on HPC with R and Python Online resources:
For first-time users: follow the instructions for system setup
Log in to your TACC account as a first-time user
Please follow the instructions to obtain a valid token and get ready for the next step.
Logging in and transferring files to XSEDE
Log into TACC – From Windows
From Computer Lab desktops: search for PuTTY.
From personal Windows laptops: install and open PuTTY.
Hostname:
Saved Sessions: give the session a name for future use, such as Stampede2.
Log into TACC – From Windows
Username: use the one of the form tg******, e.g., Chen's is tg842131.
Password: use the same password as your XSEDE account.
Token: enter your valid token at the login prompt.
Log into TACC – From Windows If you can log in successfully, you will see the following message:
Log into TACC – From Mac
Start Terminal on the Mac. On Windows, you can also use Xshell.
To transfer files into XSEDE systems
On Mac, use Cyberduck.
On Windows, you can consider: Xftp 5, WinSCP (Chen's choice), or FileZilla.
Use WinSCP to transfer files into XSEDE
1) Hostname:
2) Username: use the one of the form tg******, e.g., Chen's is tg842131.
3) Password: use the same password as your XSEDE account.
4) Token: enter your valid token at the login prompt.
Use WinSCP to transfer files into XSEDE
Once you log into the system, you are able to transfer files between your computer and the XSEDE system.
Use Cyberduck for Mac to transfer files into XSEDE
dos2unix
sbatch -A TG-TRA150002
sbatch -A TG-DMS170019
Run R in HPC Job submission R tutorials:
Load R on either Windows or Mac
Type "module load Rstats", then "R". You can then do a lot of things with R. Type "library(parallel)", then "detectCores()":
    module load Rstats
    R
    library(parallel)
    detectCores()
Right-click to paste the contents.
Log into TACC – From Windows
After typing in the following commands, you will see the following message:
    module load Rstats
    R
    library(parallel)
    detectCores()
Useful commands
lscpu: show information about the CPU
ls: list all files
Are you familiar with UNIX commands and vi?
showq -u tg831870: see whether any of your jobs are running
logout: log out
Use Cyberduck for Mac to transfer files into XSEDE
dos2unix
Job submission:
sbatch -A TG-TRA150002
sbatch -A TG-DMS170019
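Files edited on Windows usually end each line with CRLF rather than the LF that Unix tools expect, which is presumably why dos2unix is listed before the sbatch commands. As a rough illustration of what that conversion does, here is a minimal Python sketch. The function name crlf_to_lf and the sample script text are hypothetical; on the cluster you would normally just run dos2unix itself.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows line endings (CRLF) with Unix line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical job script text as it might be saved on Windows.
windows_text = b"#!/bin/bash\r\n#SBATCH -n 16\r\necho done\r\n"
unix_text = crlf_to_lf(windows_text)

# Count the carriage returns left after conversion.
print(unix_text.count(b"\r"))
```

Working on bytes rather than text avoids any guess about the file's encoding.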
How to submit a job in Windows
    echo $HOME
    echo $WORK
    echo $DATA
    echo $SCRATCH
To submit a job: sbatch
How to submit a job in Windows After submitting a job: sbatch
How to submit a job in Windows
When a job is done, you will receive an email. Then go into WinSCP to look at the result file "ccx326512.txt".
Message Passing Interface (MPI)
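How to do parallel computing in R
Example code for parallel computing in R:
    library(parallel)

    N <- 1000  # sample size per loop (assumed value; not defined in the original)
    n <- 100   # number of loops (assumed value; not defined in the original)

    # Each call draws N values from Normal(5, 1) and returns their mean.
    workerFunc <- function(i) {
      message(paste('we are working on the', i, 'th loop'))
      x <- rnorm(N, 5, 1)
      return(mean(x))
    }

    numWorkers <- detectCores()
    set.seed(12345)
    st <- Sys.time()
    res <- mclapply(1:n, workerFunc, mc.cores = numWorkers)
    Sys.time() - st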
Run Python in HPC Python Tutorials:
How to run Python on TACC
    module load python
    module load python3*
* First, type "module spider python3" to get instructions on other required modules.
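Once a Python module is loaded, a quick sanity check is to print the interpreter version and the number of cores Python can see, much like detectCores() in R. The following is a minimal sketch using only the standard library; the variable name n_cores is just for illustration.

```python
import os
import sys

# Version of the interpreter supplied by the loaded module.
print(sys.version.split()[0])

# Logical CPU count visible on this node; os.cpu_count() may return None
# if the count cannot be determined.
n_cores = os.cpu_count()
print(n_cores)
```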
Multiprocess example
    from multiprocessing import Pool

    def f(x):
        return x*x

    p = Pool(4)  # starts 4 worker processes
    print(p.map(f, range(10)))  # prints [0, 1, 4, ..., 81]

    # idev -r  ## This is only for reservation nodes
Run in an interactive session:
    module load python
    python
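The same Pool pattern can mirror the R mclapply example, where each task averages a sample of normal draws and the whole run is timed. This is a sketch under assumptions, not the tutorial's own code: SAMPLE_SIZE, N_TASKS, worker, and run are hypothetical names and values, and the worker count of 4 is arbitrary.

```python
import random
import statistics
import time
from multiprocessing import Pool

SAMPLE_SIZE = 1000  # draws per task (assumed value)
N_TASKS = 8         # number of tasks (assumed value)


def worker(seed):
    """Return the mean of SAMPLE_SIZE draws from Normal(5, 1)."""
    rng = random.Random(seed)  # per-task generator keeps results reproducible
    return statistics.fmean(rng.gauss(5, 1) for _ in range(SAMPLE_SIZE))


def run(n_workers=4):
    """Map worker over the task seeds using a pool of processes."""
    with Pool(n_workers) as pool:
        return pool.map(worker, range(N_TASKS))


if __name__ == "__main__":
    start = time.perf_counter()
    means = run()
    print(means)
    print(time.perf_counter() - start)  # elapsed seconds
```

Seeding a separate generator per task, rather than relying on one global seed, keeps the output independent of how tasks are scheduled across workers.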
Use a standard batch file:
    #!/bin/bash
    #SBATCH -p development
    #SBATCH -A your_account_name_goes_here
    #SBATCH -J mpi4py-demo
    #SBATCH -o mpi4py-demo.o%j
    #SBATCH -n 16
    #SBATCH -t 00:05:00
    # Prohibit writing core files on error.
    ulimit -c 0
    set -x
    python &
    ibrun python --loghost `hostname` --level debug
    pkill python
    pip install --user line_profiler
    time python
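line_profiler is typically driven through its kernprof command, which makes a profile decorator available when a script is run as "kernprof -l -v script.py". The sketch below shows one common way to write such a script so that it also runs under plain python. The function sum_of_squares is a hypothetical example, and the timing with time.perf_counter is only a coarse in-script complement to the shell's time command.

```python
import time

try:
    profile  # defined only when the script is run under `kernprof -l`
except NameError:
    def profile(func):
        """No-op stand-in so the script also runs under plain `python`."""
        return func


@profile
def sum_of_squares(n):
    """Sum i*i for i in 0..n-1 with an explicit loop, so there are lines to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


start = time.perf_counter()
result = sum_of_squares(100_000)
elapsed = time.perf_counter() - start
print(result, elapsed)
```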