Published by Cuthbert Kennedy. Modified over 9 years ago.
1
How to get started on cees, SEP style
Mandy
2
Resources
cees-clusters
– SEP-reserved disk: 20 TB
– SEP-reserved nodes: 35 (currently 25)
– Default max nodes: 149 (8 cores per node)
– Compute node hardware: 2.26 GHz dual-processor quad-core Nehalem
cees-rcf
– SEP-reserved disk: 30 TB
– SEP-reserved nodes: 21 (16 cores per node)
– Default max nodes: 137 (16 cores per node)
– Compute node hardware: Sandy Bridge
3
Home and working directories
/home/username
– 10 GB quota
– Backed up daily
– Mounted read-only on compute nodes
/data/sep/username
– Everyone has write access to 20 TB in /data/cees
– Not backed up
– SEP partition in /data/sep (20 TB for cees-clusters and 30 TB for cees-rcf)
Options
1) Run your code in /home, but use absolute paths to write output to /data
2) Run your code in /data, but back up your code in /home
Tips
– It is a lot faster to write to /tmp on each node first and then copy the results back to /data
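The /tmp tip can be sketched as a small staging script. This is only an illustration under assumptions: the file name output.txt, the stand-in "computation", and the DEST default are hypothetical; on the cluster DEST would point somewhere under /data (for example /data/sep/username/run1).

```shell
#!/bin/sh
# Sketch: write results to node-local scratch first, then copy them to shared storage.
# DEST is hypothetical; it defaults to a scratch location so the sketch runs anywhere.
DEST="${DEST:-${TMPDIR:-/tmp}/stage_demo_dest}"

# Private scratch directory on the local node, removed when the script exits.
SCRATCH=$(mktemp -d "${TMPDIR:-/tmp}/stage_demo.XXXXXX")
trap 'rm -rf "$SCRATCH"' EXIT

# Stand-in for the real computation: it writes its output into scratch.
printf 'result\n' > "$SCRATCH/output.txt"

# Copy the finished file back to shared storage in one step.
mkdir -p "$DEST"
cp "$SCRATCH/output.txt" "$DEST/"
echo "copied to $DEST/output.txt"
```

The point of the design is that the many small writes hit the local disk, and the shared file system sees a single copy at the end.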
4
Where is SEPlib?
# my own environment variables
setenv SEP /usr/local/SEP
setenv SEPINC /usr/local/SEP/include
setenv SEPBIN /usr/local/SEP/bin
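The setenv syntax is for csh/tcsh. For users whose login shell is sh or bash, an equivalent might look like the sketch below. Adding SEPBIN to PATH is an assumption that is not on the slide.

```shell
#!/bin/sh
# sh/bash equivalent of the csh setenv lines (same paths as on the slide).
export SEP=/usr/local/SEP
export SEPINC="$SEP/include"
export SEPBIN="$SEP/bin"

# Assumption (not on the slide): put the SEPlib programs on the search path.
PATH="$SEPBIN:$PATH"
export PATH

echo "$SEPBIN"
```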
5
How to submit a job
Number of nodes and cores you need
6
How to submit a job
The maximum run time of your job before it is killed
Note: must be < 2 hours for the default queue
7
How to submit a job
Stdout and stderr logs
8
How to submit a job
Queue, either default or sep
9
How to submit a job
Job name
10
How to submit a job
The command for your job
11
How to submit a job
Submit your job using qsub
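The pieces walked through on the submission slides (nodes and cores, run time, logs, queue, job name, command) can be collected into one job script. The original slides showed this as screenshots, so the sketch below is a reconstruction that assumes Torque/PBS-style directives. The file name myjob.pbs, the program ./my_program, and the specific values are hypothetical.

```shell
#!/bin/sh
# Sketch: write a hypothetical PBS job script covering each setting from the slides.
cat > myjob.pbs <<'EOF'
#!/bin/sh
# Job name
#PBS -N myjob
# Queue: default or sep
#PBS -q default
# Number of nodes and cores per node
#PBS -l nodes=1:ppn=8
# Maximum run time; kept under 2 hours for the default queue
#PBS -l walltime=01:59:00
# Stdout and stderr logs
#PBS -o myjob.out
#PBS -e myjob.err

# The command for the job, run from the directory qsub was called in
cd "$PBS_O_WORKDIR"
./my_program
EOF

echo "wrote myjob.pbs; submit it with: qsub myjob.pbs"
```

The sketch only writes the file and prints the submit command, so it is safe to run off the cluster; on the head node the last step would be `qsub myjob.pbs`.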
12
Do not run big jobs on the head node
– Talk to Dennis when moving large datasets
– You can use cees-rcf-tools to test jobs as well
13
Check jobs
14
Cancel jobs
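The check and cancel slides were screenshots in the original deck. Assuming a Torque/PBS scheduler, the usual commands are qstat and qdel. The sketch below wraps them in a dry-run helper (hypothetical names run, check_jobs, cancel_job) so it prints the commands by default instead of touching the queue; the job ID 12345 is made up.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Current user name, with a fallback if USER is not set.
ME="${USER:-$(id -un)}"

# List this user's jobs.
check_jobs() { run qstat -u "$ME"; }

# Cancel one job by its ID.
cancel_job() { run qdel "$1"; }

check_jobs
cancel_job 12345   # hypothetical job ID
```

Setting DRY_RUN=0 on the cluster would run the real commands.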
15
Typical computation structure: 1 job or many jobs?
[Diagram: alternating stages that need 40 nodes, 1 node, 40 nodes, 1 node, 40 nodes]
Ex. Stacking, step sizes, updating
Ex. Pre-stack forward or adjoint operation
16
– Reserved queue: jobs can run forever
– Default queue: jobs must finish in 2 hours
Waiting…
18
[Diagram: alternating stages that need 40 nodes, 1 node, 40 nodes, 1 node, 40 nodes]
Ex. Stacking, step sizes, updating
Ex. Pre-stack forward or adjoint operation
"I am taking over every single node. Muahahaha"
19
Bob’s advice
– Break your jobs into 2-hour blocks and use the default queue
– Only store intermediate results on the clusters
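As a rough worked example of splitting work into blocks that fit the default queue, the arithmetic below rounds up to the number of jobs needed. Both numbers are hypothetical: a 25-hour total estimate and a 110-minute budget per job, chosen to leave a margin under the 2-hour limit.

```shell
#!/bin/sh
TOTAL_MIN=1500   # hypothetical total run time estimate (25 hours)
BLOCK_MIN=110    # hypothetical per-job budget, under the 2-hour limit

# Ceiling division: the smallest number of jobs whose budgets cover the total.
NJOBS=$(( (TOTAL_MIN + BLOCK_MIN - 1) / BLOCK_MIN ))

echo "$NJOBS jobs of at most $BLOCK_MIN minutes each"
# prints: 14 jobs of at most 110 minutes each
```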
20
Scripting is useful for job management
On cees-clusters: /data/sep/mandyman/Tutorial
1. Embarrassingly parallel job submission
2. Timer to check jobs
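The tutorial scripts themselves are not reproduced in the slides. As a stand-in, here is a minimal sketch of embarrassingly parallel submission that generates one job script per independent task, assuming Torque/PBS-style directives. The task count, the jobs directory, the task_N naming, and ./my_program are all hypothetical.

```shell
#!/bin/sh
# Sketch: generate one PBS job script per independent task.
NJOBS=4        # hypothetical number of independent tasks
OUTDIR=jobs    # hypothetical directory for the generated scripts
mkdir -p "$OUTDIR"

i=1
while [ "$i" -le "$NJOBS" ]; do
  # Unquoted heredoc: $i is filled in now, \$PBS_O_WORKDIR is left for run time.
  cat > "$OUTDIR/task_$i.pbs" <<EOF
#!/bin/sh
#PBS -N task_$i
#PBS -q default
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:59:00
cd "\$PBS_O_WORKDIR"
./my_program $i
EOF
  i=$((i + 1))
done

echo "generated $NJOBS job scripts in $OUTDIR"
```

On the cluster, the generated scripts could then be submitted with a loop such as `for f in jobs/*.pbs; do qsub "$f"; done`. A simple timer could poll with `qstat -u "$USER"` followed by `sleep 60` until no task jobs remain; the exact check depends on the local qstat output format, so treat that part as an assumption.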
21
Sharing resources We are here now
22
Sharing resources