CommLab PC Cluster (Ubuntu OS version)


1 CommLab PC Cluster (Ubuntu OS version)
PC Cluster Manager: Sandy Ardianto

2 Outline
Architecture
Specification
How to use
Registration
Connect using Putty
Upload & Download files
Sending jobs to slaves
Torque PBS
Torque Features
PBS file example
PBS command
Python Example
Matlab Example
References

3 Architecture
Master Public IP: Local IP: Slave01 IP: Slave02 IP: Slave16 IP: NAS*
*NAS (Network Attached Storage)

4 Specification
Master Slave01-14 Slave15-16 CPUs 8 cores Xeon 16 cores Memory 16GB 4GB
The /home folder of the master and all slaves is synchronized using the NAS.
All operating systems have been changed from CentOS 5.6 to Ubuntu 14.04.

5 How to use Commlab PC cluster

6 Registration
Contact the cluster manager (sandyardianto@gmail.com) and provide:
<name>
<username> (e.g. sardianto [Sandy Ardianto])
<password>
<advisor>
< >

7 Connect using SSH Putty:

8 Upload & Download files
FileZilla:
Default location: /home/<username>

9 Sending Jobs to Slaves

10 Torque PBS (Portable Batch System)
TORQUE (Terascale Open-source Resource and QUEue Manager) is a distributed resource manager that provides control over batch jobs and distributed compute nodes.

11 Torque Features (1/2)
Fault Tolerance
  Additional failure conditions checked/handled
  Node health check script support
Scheduling Interface
  Extended query interface providing the scheduler with additional and more accurate information
  Extended control interface allowing the scheduler increased control over job behavior and attributes
  Allows the collection of statistics for completed jobs

12 Torque Features (2/2)
Scalability
  Significantly improved server to MOM communication model
  Ability to handle larger clusters (over 15 TF/2,500 processors)
  Ability to handle larger jobs (over 2000 processors)
  Ability to support larger server messages
Usability
  Extensive logging additions
  More human-readable logging (i.e. no more 'error on command 42')

13 PBS File Example
Annotations on the example script:
  Job name
  Error output file
  Output (strings printed on the terminal)
  Queue name (batch, batch1-batch16)
  ppn: processors per node
  Compute unit
  Assign a specific slave: nodes=slaveXX (XX = 01-16)
Some useful variables:
  $PBS_JOBID: the job identifier
  $PBS_JOBNAME: the job name
  $PBS_O_WORKDIR: the absolute path of the directory from which the qsub command was sent
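
The example script itself is not reproduced in this transcript. As a rough sketch only, the Python helper below renders a job script that uses the standard Torque directives matching the annotations on this slide (-N job name, -e error file, -o output file, -q queue, -l nodes=...:ppn=...). The helper name render_pbs_script, the .err/.out file names, and the script body are assumptions made for illustration, not the contents of the original pbs.sh.

```python
def render_pbs_script(job_name, command, queue="batch", node="slave01", ppn=1):
    """Return the text of a PBS job script (illustrative layout only)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",              # job name
        f"#PBS -e {job_name}.err",          # error output file (assumed name)
        f"#PBS -o {job_name}.out",          # standard output file (assumed name)
        f"#PBS -q {queue}",                 # queue name, e.g. batch
        f"#PBS -l nodes={node}:ppn={ppn}",  # specific slave and processors per node
        "",
        'cd "$PBS_O_WORKDIR"',              # directory from which qsub was sent
        'echo "Job $PBS_JOBID ($PBS_JOBNAME) started"',
        command,
    ]
    return "\n".join(lines) + "\n"


script = render_pbs_script("job_name", "python hello.py", node="slave01", ppn=2)
print(script)
```

Writing the returned text to a file such as pbs.sh would give something that can be passed to qsub; the real queue and node names should be checked against the cluster before use.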

14 Check which computers are available to use
Open in browser

15 PBS Command
Send a job: qsub <filename.sh>
Show job status: qstat
Run a job: qrun <job ID>
Stop a job: qdel <job ID>
Status: Q - Queued, R - Running, E - Exiting, C - Completed
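
For anyone scripting these commands, a minimal Python sketch follows. It only builds the argument lists for qsub, qrun and qdel and runs them when the program is actually installed; the helper names pbs_command and run_if_available are hypothetical. The state descriptions follow Torque's qstat documentation, which describes E as a job that is exiting after having run.

```python
import shutil
import subprocess

# Job state letters as reported in the qstat status column.
JOB_STATES = {"Q": "queued", "R": "running", "E": "exiting", "C": "completed"}


def pbs_command(action, target):
    """Build the argument list for one of the PBS commands on this slide."""
    programs = {"submit": "qsub", "run": "qrun", "delete": "qdel"}
    return [programs[action], str(target)]


def run_if_available(argv):
    """Run argv only if its program is on PATH; return stdout, else None."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


print(pbs_command("submit", "pbs.sh"))
print(JOB_STATES["R"])
```

On a machine without Torque installed, run_if_available(pbs_command("submit", "pbs.sh")) simply returns None instead of failing.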

16 Python - Hello World Example
Files available at pbs.sh hello.py
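
The contents of hello.py are not shown in this transcript. A minimal stand-in, assuming the script only needs to print a greeting, might look like the following; reporting the host name is an addition made here so that the log shows which slave ran the job.

```python
import socket


def greeting():
    """Return a greeting that names the host the job ran on."""
    return f"Hello World from {socket.gethostname()}"


if __name__ == "__main__":
    print(greeting())
```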

17 Running Hello World
qsub pbs.sh (send the job to the master)
qrun <job ID> (run the job)
qstat (check the job status)
cat 3.master-job_name.log

18 Matlab Example Files available at http://140.113.211.20
pbs_matlab.sh mtest.m

19 Running Matlab Example (1/2)
qsub pbs_matlab.sh (send the job to the master)
qrun <job ID> (run the job)
qstat (check the job status)

20 Running Matlab Example (2/2)
head master-job_name.log
head shows the first lines of the log (10 by default; use head -n 20 for the first 20).
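
The same kind of preview can be done from Python, which may be convenient when a job's log is inspected by a script. The sketch below is an assumption-laden illustration: the function name head and the temporary job_name.log file with made-up contents exist only for this demonstration.

```python
import os
import tempfile
from itertools import islice


def head(path, n=10):
    """Return the first n lines of a text file, like the head command."""
    with open(path) as f:
        return list(islice(f, n))


# Demonstration on a throwaway 30-line file standing in for a job log.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "job_name.log")
    with open(log_path, "w") as f:
        f.writelines(f"line {i}\n" for i in range(1, 31))
    preview = head(log_path, n=20)

print(len(preview))
```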

21 Any problems or questions? Contact me!

