Introduction to HPC Workshop
October 9, 2014
Introduction
Rob Lane
HPC Support, Research Computing Services, CUIT
Introduction: HPC Basics
Introduction: First HPC Workshop
Yeti
– 2 head nodes
– 101 execute nodes
– 200 TB storage
Yeti: 101 execute nodes
– 38 x 64 GB
– 8 x 128 GB
– 35 x 256 GB
– 16 x 64 GB + Infiniband
– 4 x 64 GB + nVidia K20 GPU
Yeti CPU
– Intel E5-2650L
– 1.8 GHz
– 8 cores
– 2 per execute node
Yeti Expansion Round
– 66 new systems
– Faster CPU
– More Infiniband
– More GPU (nVidia K40)
– ETA January 2015
Yeti
HP S6500 Chassis
HP SL230 Server
Job Scheduler
– Manages the cluster
– Decides when a job will run
– Decides where a job will run
– We use Torque/Moab
Job Queues
– Jobs are submitted to a queue
– Jobs sorted in priority order
– Not a FIFO
Access: Mac Instructions
1. Run Terminal
Access: Windows Instructions
1. Search for "putty" on the Columbia home page
2. Select the first result
3. Follow the link to the PuTTY download page
4. Download putty.exe
5. Run putty.exe
Access
Mac (Terminal):
$ ssh UNI@yetisubmit.cc.columbia.edu
Windows (PuTTY):
Host Name: yetisubmit.cc.columbia.edu
Work Directory
$ cd /vega/free/users/yourUNI
Replace "yourUNI" with your own UNI, for example:
$ cd /vega/free/users/hpc2108
Copy Workshop Files
Files are in /tmp/workshop
$ cp /tmp/workshop/* .
(Note the space before the final "." — it copies everything into the current directory.)
Editing
No single obvious choice for editor:
– vi: simple but difficult at first
– emacs: powerful but complex
– nano: simple but not really standard
nano
$ nano hellosubmit
"^" means "hold down Control":
^a : go to beginning of line
^e : go to end of line
^k : delete line
^o : save file
^x : exit
hellosubmit
#!/bin/sh

# Directives
#PBS -N HelloWorld
#PBS -W group_list=yetifree
#PBS -l nodes=1:ppn=1,walltime=00:01:00,mem=20mb
#PBS -M UNI@columbia.edu
#PBS -m abe
#PBS -V

# Set output and error directories
#PBS -o localhost:/vega/free/users/UNI
#PBS -e localhost:/vega/free/users/UNI

# Print "Hello World"
echo "Hello World"

# Sleep for 10 seconds
sleep 10

# Print date and time
date
hellosubmit
$ qsub hellosubmit
298151.elk.cc.columbia.edu
qstat
$ qstat 298151
Job ID       Name         User      Time Use  S  Queue
----------   ----------   -------   --------  -  ------
298151.elk   HelloWorld   hpc2108          0  Q  batch1
hellosubmit
After the job completes, it is no longer known to qstat:
$ qstat 298151
qstat: Unknown Job Id Error 298151.elk.cc.columbia.edu
hellosubmit
$ ls -l
total 4
-rw------- 1 hpc2108 yetifree 398 Oct  8 22:13 hellosubmit
-rw------- 1 hpc2108 yetifree   0 Oct  8 22:44 HelloWorld.e298151
-rw------- 1 hpc2108 yetifree  41 Oct  8 22:44 HelloWorld.o298151
hellosubmit
$ cat HelloWorld.o298151
Hello World
Thu Oct  9 12:44:05 EDT 2014
Any questions?
Interactive
– Most jobs run as "batch"
– Can also run interactive jobs
– Get a shell on an execute node
– Useful for development, testing, troubleshooting
Interactive
$ cat interactive
qsub -I -W group_list=yetifree -l walltime=5:00,mem=100mb
Interactive
$ qsub -I -W group_list=yetifree -l walltime=5:00,mem=100mb
qsub: waiting for job 298158.elk.cc.columbia.edu to start
Interactive
qsub: job 298158.elk.cc.columbia.edu ready.
[ASCII-art yeti banner]
+--------------------------------+
|                                |
| You are in an interactive job. |
|                                |
| Your walltime is 00:05:00      |
|                                |
+--------------------------------+
Interactive
$ hostname
charleston.cc.columbia.edu
Interactive
$ exit
logout
qsub: job 298158.elk.cc.columbia.edu completed
$
GUI
– Can run GUIs in interactive jobs
– Need an X server on your local system
– See user documentation for more information
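A common way to satisfy the X server requirement (not shown in the deck; check the Yeti user documentation for the cluster's recommended method) is to enable X11 forwarding when you connect, assuming an X server such as XQuartz (Mac) or Xming (Windows) is running locally:

```shell
# Connect with X11 forwarding enabled (-X) so GUI programs started
# inside an interactive job can display on your local screen.
# Replace UNI with your own UNI.
ssh -X UNI@yetisubmit.cc.columbia.edu
```
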
User Documentation
hpc.cc.columbia.edu
– Go to "HPC Support"
– Click on "Yeti user documentation"
Job Queues
– Scheduler puts all jobs into a queue
– Queue selected automatically
– Queues have different settings
Job Queues
Queue        Time Limit  Memory Limit  Max. User Run
Batch 1      12 hours    4 GB          512
Batch 2      12 hours    16 GB         128
Batch 3      5 days      16 GB         64
Batch 4      3 days      None          8
Interactive  4 hours     None          4
qstat -q
$ qstat -q

server: elk.cc.columbia.edu

Queue            Memory CPU Time Walltime Node Run Que Lm State
---------------- ------ -------- -------- ---- --- --- -- -----
batch1           4gb    --       12:00:00 --    42  15 -- E R
batch2           16gb   --       12:00:00 --   129  73 -- E R
batch3           16gb   --       120:00:0 --   148 261 -- E R
batch4           --     --       72:00:00 --    11  12 -- E R
interactive      --     --       04:00:00 --     0   1 -- E R
interlong        --     --       48:00:00 --     0   0 -- E R
route            --     --       --       --     0   0 -- E R
                                              ----- -----
                                                330   362
yetifree
– Maximum processors limited (currently 4)
– Storage quota: 16 GB
– No email support
yetifree
$ quota -s
Disk quotas for user hpc2108 (uid 242275):
Filesystem                                 blocks   quota   limit  grace  files  quota  limit  grace
hpc-cuit-storage-2.cc.columbia.edu:/free/    122M  16384M  16384M             8  4295m  4295m
email
from: root
to: hpc2108@columbia.edu
date: Wed, Oct 8, 2014 at 11:41 PM
subject: PBS JOB 298161.elk.cc.columbia.edu

PBS Job Id: 298161.elk.cc.columbia.edu
Job Name: HelloWorld
Exec host: dublin.cc.columbia.edu/4
Execution terminated
Exit_status=0
resources_used.cput=00:00:02
resources_used.mem=8288kb
resources_used.vmem=304780kb
resources_used.walltime=00:02:02
Error_Path: localhost:/vega/free/users/hpc2108/HelloWorld.e298161
Output_Path: localhost:/vega/free/users/hpc2108/HelloWorld.o298161
Intern
– Research Computing Services (RCS) is looking for an intern
– Paid position, ~10 hours a week
– Will be on LionShare next week
MPI
– Message Passing Interface
– Allows applications to run across multiple computers
MPI
– Edit MPI submit file
– Load MPI environment module
– Compile sample program
MPI: mpisubmit
#!/bin/sh

# Directives
#PBS -N MpiHello
#PBS -W group_list=yetifree
#PBS -l nodes=3:ppn=1,walltime=00:01:00,mem=20mb
#PBS -M UNI@columbia.edu
#PBS -m abe
#PBS -V

# Set output and error directories
#PBS -o localhost:/vega/free/users/UNI
#PBS -e localhost:/vega/free/users/UNI

# Load mpi module.
module load openmpi

# Run mpi program.
mpirun mpihello
MPI
$ module load openmpi
$ which mpicc
/usr/local/openmpi/bin/mpicc
$ mpicc -o mpihello mpihello.c
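The compile step above builds mpihello.c, whose source is not shown in the slides. A minimal MPI "Hello World" typically looks like the sketch below; the actual workshop file (in /tmp/workshop) may differ. It requires an MPI implementation such as OpenMPI, so it must be compiled with mpicc and launched with mpirun as shown on the surrounding slides:

```c
/* mpihello.c -- a minimal MPI "Hello World" (illustrative sketch). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime  */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id      */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total process count    */

    /* Each process prints its own line, so with nodes=3:ppn=1 you
       should see three lines, one per process. */
    printf("Hello World from process %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut down MPI cleanly  */
    return 0;
}
```
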
MPI
$ qsub mpisubmit
298501.elk.cc.columbia.edu
Questions?
Any questions?