Using the division's Compute-servers PubH /17/14
List of compute servers
● For students:
  – potassium.biostat.umn.edu
  – cesium.biostat.umn.edu
  – carbon.biostat.umn.edu
  – chromium.biostat.umn.edu
● For faculty:
  – rubidium.biostat.umn.edu
  – silicon.biostat.umn.edu
● All servers have 24 cores (processors)
How to access servers
● From anywhere within the umn.edu domain (U of M network), ssh directly to the desired server
● From outside umn.edu, either:
  – ssh to merganser.biostat.umn.edu, then ssh to the desired server
  – Log into the University VPN, then ssh directly to the desired host
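The two access paths above can be sketched as a small helper that prints the ssh command to use. This is only an illustration: the function name `ssh_command` and the user name `jdoe` are hypothetical, and the `-J` (ProxyJump) option is a one-command substitute for the two-step "ssh to merganser, then ssh again" route, assuming an OpenSSH client recent enough to support it (7.3 or later).

```shell
#!/bin/sh
# ssh_command USER HOST LOCATION   (hypothetical helper, not a division tool)
# Prints, but does not run, the ssh command for reaching HOST.
# LOCATION is "inside" (on the umn.edu network or the VPN) or "outside".
ssh_command() {
    user=$1
    host=$2
    location=$3
    gateway=merganser.biostat.umn.edu
    if [ "$location" = "outside" ]; then
        # Assumption: OpenSSH 7.3+; -J hops through the gateway in one step.
        echo "ssh -J $user@$gateway $user@$host"
    else
        echo "ssh $user@$host"
    fi
}

# "jdoe" is a placeholder user name.
ssh_command jdoe cesium.biostat.umn.edu inside
ssh_command jdoe cesium.biostat.umn.edu outside
```

If `-J` is not available, the two-step route from the slide works the same way: ssh to merganser first, then ssh to the compute server from there.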
Server utilization - CPU days
Cesium usage, June 2012
Potassium usage, June 2012
Computing etiquette
● Don't run open-ended jobs
  – Estimate the job's duration
  – Run a small version of the job to get an estimate
● “nice” your jobs
  – 'r' within “top”
  – “renice 19 <pid>” on the command line
● Don't run too many concurrent jobs (or too many threads) on one server
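One rough way to turn a small trial run into a duration estimate is linear extrapolation, sketched below. The helper `estimate_seconds` and the numbers are made up for illustration, and the sketch assumes run time grows in proportion to problem size, which will not hold for every job. The last line shows starting a command at the lowest priority with `nice -n 19`, so it never needs to be reniced afterwards; the `sh -c` command is a stand-in for a real job.

```shell
#!/bin/sh
# estimate_seconds TRIAL_SECONDS TRIAL_SIZE FULL_SIZE   (hypothetical helper)
# Linear extrapolation: assumes run time is proportional to problem size.
estimate_seconds() {
    echo $(( $1 * $3 / $2 ))
}

# Made-up example: a 1000-iteration trial took 90 seconds,
# and the full job needs 200000 iterations.
est=$(estimate_seconds 90 1000 200000)
echo "estimated: $est seconds (about $(( est / 3600 )) hours)"

# Start the job at the lowest priority from the beginning.
# The quoted command is a placeholder for the real job.
nice -n 19 sh -c 'echo "full job would start here"'
```

With these numbers the estimate is 18000 seconds, or about 5 hours. If the estimate runs to many days, consider splitting the job or shrinking the problem before launching it.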
Monitor your jobs
● Use “top”
● Know how to kill your jobs
  – In “top” use “k”
  – From the command line:
     kill <pid>
     kill -1 <pid>
     kill -9 <pid> (sure kill, use as last resort)
● Write to a file as the job progresses
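The points above can be tried out on a harmless stand-in job. In this sketch a background loop plays the role of a long computation and appends a line to a log file on each iteration; the script then checks progress, sends a plain `kill` (SIGTERM), and escalates to `kill -9` only if the process still exists. The loop, the log file, and the variable names are illustrative assumptions, not part of the division's setup.

```shell
#!/bin/sh
# Stand-in job: a loop that records its progress in a log file.
log=$(mktemp)
(
    i=0
    while :; do
        i=$((i + 1))
        echo "iteration $i" >> "$log"
        sleep 1
    done
) &
pid=$!              # remember the PID so the job can be killed later

sleep 2
last=$(tail -n 1 "$log")    # check how far the job has progressed
echo "latest progress: $last"

kill "$pid"         # polite request first (SIGTERM)
sleep 1
# Escalate to the "sure kill" only if the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid"
fi
wait "$pid" 2>/dev/null || true
rm -f "$log"
```

For a real job, `tail -f` on the progress file gives a live view without attaching to the process.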
Automatic monitoring
● System programs monitor each user's compute-bound jobs
● If a user runs too many compute-bound processes/threads, the jobs are automatically “niced” to a lower priority
● Perhaps we will add annoying notifications if jobs run for too many days
Threaded jobs
● A threaded job breaks into “threads” where each thread runs independently and can run on different processors
● A threaded job could saturate all 24 cores
● “top” can tell you if a job is threading
● Run only 1 or 2 threaded jobs per server
● Serialize threaded jobs, rather than running many such jobs concurrently
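Serializing threaded jobs can be as simple as a loop that starts each job only after the previous one has finished. The sketch below also caps the thread count with `OMP_NUM_THREADS`, which only has an effect if the job is built on OpenMP; that is an assumption, and other threading libraries use different settings. The function `run_job` and the dataset names are placeholders for a real command.

```shell
#!/bin/sh
# Assumption: the job uses OpenMP, which honors OMP_NUM_THREADS.
OMP_NUM_THREADS=4
export OMP_NUM_THREADS

# Hypothetical stand-in for a threaded analysis; replace with the real command.
run_job() {
    echo "running $1 with $OMP_NUM_THREADS threads"
}

# Serialize: no trailing "&", so each job waits for the previous one.
out=$(
    for dataset in sim1 sim2 sim3; do
        run_job "$dataset"
    done
)
echo "$out"
```

Running the three jobs back to back keeps the load on the server at 4 threads at a time, instead of 12 if all three were started concurrently.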
Using “top”
● 'q' to quit
● '1' toggle “show all cores/show summary”
  – Need a window with at least 36 lines
● 'H' toggle “show all threads/show summary”
● 'k' kill a job
● 'h' help