VIPBG LINUX CLUSTER By Helen Wang March 29th, 2013
Basic Beowulf Cluster Structure
A brief look at our cluster
VIPBG Beowulf Cluster
  Server name: IP:
  2nd server as failover: IP: (invisible on mission)
  To learn how to access and use our server, check the wiki page.
Access Cluster
What do you need to do to access the server?
  Get a username and password.
  Install VCU webvpn on your PC so you can access the cluster from anywhere.
  Change your password to one that meets the password requirements:
    $ passwd
  Set up the variables needed to customize your personal console. Templates: ~/.cshrc ~/.login
    echo $PATH shows your search path; add search paths in your .cshrc file.
  Make tmp and bin directories under your home directory:
    $ mkdir tmp
    $ mkdir bin
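The setup steps above can be sketched as one re-runnable helper. This is only an illustration, not a site-provided script: the function name setup_home is hypothetical, and the search-path line assumes a csh/tcsh login shell, which is what the ~/.cshrc template suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of the cluster tools): prepare a home
# directory for cluster use. Usage: setup_home [HOME_DIR], default $HOME.
setup_home() {
    home_dir="${1:-$HOME}"

    # Create the tmp and bin directories; -p makes this safe to re-run.
    mkdir -p "$home_dir/tmp" "$home_dir/bin" || return 1

    rc_file="$home_dir/.cshrc"
    # csh/tcsh syntax: append ~/bin to the command search path.
    path_line='set path = ($path ~/bin)'

    # Append the line only if it is not already in the file.
    if ! grep -qxF -- "$path_line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$path_line" >> "$rc_file"
    fi
}
```

Running setup_home with no argument acts on your own home directory. Log in again (or run "source ~/.cshrc" in csh) and check the result with echo $PATH.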
Access Cluster
Server and nodes
  Master node:
    Running CentOS (Red Hat kernel) version 5.6, x86-64.
    Open source or software downloads: choose 64-bit builds for CentOS or RHEL 5 if possible.
    Purposes and policy: front-end user interface. Do not run jobs directly on the master; they will be terminated without contacting the user. Accessible from outside by permission and webvpn.
  Slave nodes (nodes):
    – (8-core Xeon processors with 32 GB RAM)
    Node 2-19: fast nodes (12-core Xeon processors with 98 GB RAM on each node)
    Purposes and policy: computation; not intended for user-interface access; accessible via the master and managed by the Portable Batch System (PBS); fast; internal network; X, not accessible directly from outside.
Access Cluster
Nodes and queue configuration
  $ qstat -q    ## lists all the queues and their current running status

  server: light

  Queue     Memory  CPU Time  Walltime  Node  Run  Que  Lm  State
  workq                                                     E R
  serial                                                    E R
  mxq                                                       E R
  express                                                   E R
  openmx                                                    E R
  slowq                                                     E R

  $ pbsnodes -a | more    ## shows detailed queue and node information, page by page
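The full pbsnodes listing is long, so a small filter can reduce it to one line per node. The sketch below is an assumption-laden illustration: node_states is a hypothetical name, and it assumes the usual Torque-style layout in which each node name starts at the left margin and its attributes follow on indented "key = value" lines.

```shell
#!/bin/sh
# Hypothetical filter: read `pbsnodes -a` style text on stdin and
# print "NODE STATE" for every node found.
node_states() {
    awk '
        # An unindented, non-empty line is taken to be a node name.
        /^[^ \t]/ { node = $1 }
        # An indented "state = VALUE" line carries that node state.
        /^[ \t]+state = / { print node, $3 }
    '
}
```

Typical use would be "pbsnodes -a | node_states", optionally followed by "| grep free" to see which nodes are idle.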
Access Cluster
Nodes and queues
  serial (default); nodes assigned:
    Node2-3: 8 cores, 24 GB RAM
    Node19-15: 23 cores, 64 GB RAM
  workq (dedicated to the converge project)
    Node14-12: 12 cores, 64 GB RAM
  openmx (dedicated to R OpenMx and parallel jobs)
    Node11-9: 23 cores, 64 GB RAM
  mxq (dedicated to traditional Mx jobs or other open-source jobs, such as PLINK)
    Node6-5
  Floating nodes: node4, node7, node8 (currently assigned to workq)
Accessing Cluster
Software available on master and nodes
  R with CRAN libraries and Bioconductor libraries
  C++/G++ compiler, Fortran compiler (f77/f90)
  Python/Biopython
  Open-source software needed by users, installed upon user request
  SAS 9.3 is on all nodes
  PLINK
  OpenMx
  IMPUTE2, SAMtools, GTOOL, and other open-source tools as requested
Commands to be used on cluster
  Submitting R jobs on the normal queue
    $ qR MYSCRIPT    (if the script name is MYSCRIPT.R, submit it without the .R extension)
    Each user is allowed to run 50 jobs simultaneously.
  Submitting jobs on the large memory queue
    The large memory queue is on node1, for memory-intensive jobs (limited to 8 in total).
    $ qRL MYSCRIPT
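When several R scripts need to be submitted, the extension has to be stripped from each name. A minimal sketch of that step follows; qr_commands is a hypothetical helper that only prints the qR commands, and it assumes the scripts sit in the directory you submit from.

```shell
#!/bin/sh
# Hypothetical helper: print one qR command per R script, with the
# .R extension (and any leading directory) removed from the name.
qr_commands() {
    for script in "$@"; do
        name=$(basename "$script" .R)
        printf 'qR %s\n' "$name"
    done
}
```

Run "qr_commands *.R" to preview the commands, and pipe the output to sh only once the list looks right and stays within the per-user job limits.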
Template used on cluster
Modify the template to create your own PBS script for running programs:

  #!/bin/bash
  # Queue name: serial, sasq, or workq
  #PBS -q QUEUENAME
  #PBS -N MYSCRIPT
  # Export your environment to the job. PBS directives must come before the first command.
  #PBS -V
  #
  echo "******STARTING****************************"
  #
  # cd to your working directory. Otherwise the job will execute in your home directory.
  #
  WORKDIR=~/YOURWORKDIR
  cd "$WORKDIR" || exit 1
  #echo "PBS batch job id is $PBS_JOBID"
  echo "Working directory of this job is: $WORKDIR"
  #
  echo "Beginning to run job"
  # Put the command line that executes your job here, e.g.:
  # /home/huan/bin/calculate - PARAMETERS

Save it in a file named MYSCRIPT, then submit it:
  $ qsub MYSCRIPT
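For repeated submissions, the template can be filled in by a small generator instead of by hand. The following is a sketch under stated assumptions, not a cluster-provided tool: make_pbs_script is a hypothetical name, and the generated script keeps only the core lines of the template.

```shell
#!/bin/sh
# Hypothetical generator: print a PBS job script to stdout.
# Usage: make_pbs_script JOB_NAME QUEUE WORK_DIR COMMAND [ARGS...]
make_pbs_script() {
    if [ "$#" -lt 4 ]; then
        echo "usage: make_pbs_script JOB_NAME QUEUE WORK_DIR COMMAND [ARGS...]" >&2
        return 1
    fi
    job_name=$1
    queue=$2
    work_dir=$3
    shift 3

    # \$PBS_JOBID is escaped so it is expanded when the job runs,
    # not while the script is being generated.
    cat <<EOF
#!/bin/bash
#PBS -q $queue
#PBS -N $job_name
#PBS -V
cd "$work_dir" || exit 1
echo "PBS batch job id is \$PBS_JOBID"
$*
EOF
}
```

For example, "make_pbs_script myjob serial ~/YOURWORKDIR ./calculate input.txt > myjob" writes a script that can then be submitted with "qsub myjob". The program name and its argument here are placeholders.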
Commands used on cluster
  Submitting an interactive job, for a new application that has no script command for submitting jobs
    $ qsub -I    (to get on a node)
    NODE7$ plink --script PLKSCRIPT
  Checking job status: "R" Running; "E" Exiting; "H" Holding; "Q" Queued
    $ qstat
    $ qstat -n    (shows which node your job is on)
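The one-letter status codes are easy to forget, so a tiny lookup can spell them out. This sketch covers only the four codes listed on this slide; job_state_name is a hypothetical helper, and any other letter is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper: translate a qstat status letter into the
# description used on this slide.
job_state_name() {
    case "$1" in
        R) echo "Running" ;;
        E) echo "Exiting" ;;
        H) echo "Holding" ;;
        Q) echo "Queued" ;;
        *) echo "Unknown" ;;
    esac
}
```

For example, "job_state_name Q" prints Queued.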
Use cluster wisely
  Quit or cancel a job submission
    $ qstat    (to get the job ID)
    $ qdel YOURJOBID
  To kill all of your jobs if you have too many
    $ qstat -u YOURNAME | tail --lines=+6 | awk '{print "qdel ", $1}' | /bin/sh
  Limitations on the name of the SCRIPT
    No more than 10 characters
    No spaces
    No special characters
    Use a temporary name if necessary and change it back when the job is done.
  Maximum jobs for each user: 30
  No more than 50 jobs for each submission
  No direct ssh connections to nodes
  Send a request to the admin if you need to run large jobs
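The script-name limits can be checked before submitting. The sketch below is one possible reading of those rules, not an official check: valid_script_name is a hypothetical helper, "special characters" is assumed to mean anything other than letters, digits, and underscore, and the 10-character limit is assumed to apply to the name as passed in.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit status 0) only if the name is
# 1 to 10 characters long and contains only letters, digits, or "_".
valid_script_name() {
    name=$1

    # Reject empty names and names longer than 10 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 10 ] || return 1

    # Reject any character outside the assumed safe set; this also
    # rules out spaces.
    case "$name" in
        *[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}
```

A typical use is "valid_script_name MYSCRIPT && qsub MYSCRIPT", so that a name breaking the rules is never submitted.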
New policies
  User quotas will be enabled on the cluster; each user will have 1 TB, and a special request is needed for more space.
  Six months after you leave VIPBG, your account will be deactivated.
  Always check ~/tmp and remove the temp files your program generated.
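Checking ~/tmp is easier with a listing of files that have not been touched for a while. This is a minimal sketch: list_old_tmp is a hypothetical helper, and the 7-day default is an assumption, not a cluster policy. It only lists files and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a tmp directory that
# were last modified more than DAYS days ago (find counts age in
# whole days, rounded down).
# Usage: list_old_tmp [TMP_DIR] [DAYS], defaults: $HOME/tmp and 7.
list_old_tmp() {
    tmp_dir="${1:-$HOME/tmp}"
    days="${2:-7}"
    find "$tmp_dir" -type f -mtime +"$days" -print
}
```

Review the list first, then remove the files you no longer need with rm. "du -sh ~/tmp" shows how much space the directory uses, which helps when watching the quota.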