Running Jobs on Jacquard
An overview of interactive and batch computing, with comparisons to Seaborg
David Turner
NUG Meeting, 3 Oct 2005

2 Topics
Interactive
  – Serial
  – Parallel
  – Limits
Batch
  – Serial
  – Parallel
  – Queues and Policies
Charging
Comparison with Seaborg

3 Execution Environment
Four login nodes
  – Serial jobs only
  – CPU limit: 60 minutes
  – Memory limit: 64 MB
320 compute nodes
  – “Interactive” parallel jobs
  – Batch serial and parallel jobs
  – Scheduled by PBSPro
Queue limits and policies established to meet system objectives
  – User input is critical!

4 Interactive Jobs
Serial jobs run on login nodes
  – cd, ls, pathf90, etc.
  – ./a.out
Parallel jobs run on compute nodes
  – Controlled by PBSPro
  – mpirun -np 16 ./a.out

  qsub -I -q interactive -l nodes=8:ppn=2
  % cd $PBS_O_WORKDIR
  % mpirun -np 16 ./a.out

  qsub -I -q batch -l nodes=32:ppn=2,walltime=18:00:00
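A minimal end-to-end interactive session might look like the following sketch; the node count, walltime, and the executable name a.out are illustrative:

  % qsub -I -q interactive -l nodes=2:ppn=2,walltime=00:30:00
  % cd $PBS_O_WORKDIR      # return to the directory the job was submitted from
  % mpirun -np 4 ./a.out   # 2 nodes x 2 processors per node = 4 MPI tasks
  % exit                   # release the compute nodes when finished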

5 PBSPro
Marketed by Altair Engineering
  – Based on the open source Portable Batch System developed for NASA
  – Also installed on DaVinci
Batch scripts contain directives:
  #PBS -o myjob.out
Directives may also appear as command-line options:
  qsub -o myjob.out …
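The same setting can therefore be supplied either way; in this sketch myjob.out and myscript are illustrative names, and a command-line option typically takes precedence over the corresponding in-script directive:

  #PBS -o myjob.out              # inside the batch script
  % qsub -o myjob.out myscript   # equivalent, given on the qsub command line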

6 Simple Batch Script
  #PBS -l nodes=8:ppn=2,walltime=00:30:00
  #PBS -N myjob
  #PBS -o myjob.out
  #PBS -e myjob.err
  #PBS -A mp999
  #PBS -q debug
  #PBS -V
  cd $PBS_O_WORKDIR
  mpirun -np 16 ./a.out

7 Useful PBS Options (1)
-A repo
  Charge this job to repository repo
  Default: Your default repository
-N jobname
  Provide name for job; up to 15 printable, non-whitespace characters
  Default: Name of batch script
-q qname
  Submit job to batch queue qname
  Default: batch

8 Useful PBS Options (2)
-S shell
  Specify shell as the scripting language
  Default: Your login shell
-V
  Export current environment variables into the batch job environment
  Default: Do not export

9 Useful PBS Options (3)
-o outfile
  Write STDOUT to outfile
  Default: <jobname>.o<jobid>
-e errfile
  Write STDERR to errfile
  Default: <jobname>.e<jobid>
-j [ oe | eo ]
  Join STDOUT and STDERR on STDOUT (oe) or STDERR (eo)
  Default: Do not join

10 Useful PBS Options (4)
-m [ a | b | e | n ]
  E-mail notification
  a = send mail when job aborted by system
  b = send mail when job begins
  e = send mail when job ends
  n = do not send mail
  Options a, b, and e may be combined
  Default: a
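Putting several of the preceding options together, a script header might look like the sketch below; the repository mp999, job name myjob, and ksh shell are illustrative choices:

  #PBS -A mp999          # charge to repository mp999
  #PBS -N myjob          # job name (up to 15 characters)
  #PBS -q batch          # submit to the batch queue
  #PBS -S /usr/bin/ksh   # run the script under ksh
  #PBS -j oe             # join STDERR into STDOUT
  #PBS -m be             # mail when the job begins and ends
  #PBS -V                # export the current environment variables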

11 Batch Queues

  Submit        Execute    Nodes       Walltime
  interactive              1 – 16      30 mins
  debug                    1 – 32      30 mins
  batch         batch16    1 – 16      48 hours
                batch32    17 – 32     24 hours
                batch64    33 – 64     12 hours
                batch128   65 – 128    6 hours
                batch256   129 – 256   6 hours
  low                      1 – 64      6 hours
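A job submitted to the batch queue is routed to one of the execute queues by node count. For example, this request (the resource values are illustrative) would run in batch32 and therefore fall under its 24-hour walltime limit:

  #PBS -q batch
  #PBS -l nodes=20:ppn=2,walltime=12:00:00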

12 Batch Queue Policies
Each user may have:
  – One running interactive job
  – One running debug job
  – Four jobs running over entire system
Only one batch128 job is allowed to run at a time.
The batch256 queue usually has a run limit of zero; NERSC staff will arrange to run jobs of this size.

13 Submitting Batch Jobs
  % qsub myjob
  jacin03
  %
Record the jobid for tracking!
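Because qsub prints the jobid on STDOUT, it can be captured for later tracking. A sketch, assuming a bash/ksh login shell and the script name myjob from above:

  % JOBID=$(qsub myjob)   # save the jobid that qsub prints
  % echo $JOBID
  % qstat $JOBID          # check on that specific job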

14 Deleting Batch Jobs
  % qdel jacin03
  %

15 Monitoring Batch Jobs (1)
PBS command qstat
  % qstat
  Job id      Name             User       Time Use  S  Queue
  jacin03-ib  job5             einstein   00:00:00  R  batch
  jacin03     EV80fl02_3       legendre   0         H  batch
  jacin03     test.script      laplace    00:00:23  R  batch
  jacin03     runlu8x8         rasputin   0         Q  batch
  jacin03-m   mtp_mg_3wat_o2a  fibonacci  00:00:11  R  batch16
  ...
Use -u option for single-user output
  % qstat -u einstein
  Job id      Name             User       Time Use  S  Queue
  jacin03-ib  job5             einstein   00:00:00  R  batch16
  %

16 Monitoring Batch Jobs (2)
NERSC command qs
  % qs
  JOBID  ST  USER      NAME        NDS  REQ       USED      SUBMIT
         R   gauss     STDIN       1    00:30:00  00:10:43  Oct 2 16:47:
         R   einstein  runlu4x          :00:00    00:38:48  Oct 2 15:23:
         R   inewton   r4_              :00:00    00:10:37  Oct 2 15:36:
         Q   inewton   r4_              :00:00    -         Oct 2 08:42:
         Q   rasputin  nodemove    64   00:05:00  -         Oct 2 12:00:
         Q   einstein  runlu8x          :00:00    -         Oct 2 15:24:
         H   legendre  EV80fl02_2  4    03:00:00  -         Oct 2 15:24:
         H   legendre  EV80fl02_3  4    03:00:00  -         Oct 2 15:24:
         H   legendre  EV80fl98_5  4    03:00:00  -         Oct 2 15:26:06
  ...
Also provides -u option

17 Monitoring Batch Jobs (3)
NERSC website has current queue look:
Also has completed jobs list:
Numerous filtering options available
  – Owner
  – Account
  – Queue
  – Jobid

18 Charging
Machine charge factor (cf) = 4
  – Based on benchmarks and user applications
  – Currently under review
Serial interactive
  – Charge = cf × cputime
  – Always charged to default repository
All parallel
  – Charge = cf × 2 × nodes × walltime
  – Charged to default repo unless -A specified
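As a worked example with illustrative numbers: a parallel job that uses 16 nodes for 3 wall-clock hours incurs a charge of cf × 2 × nodes × walltime = 4 × 2 × 16 × 3 = 384 (in the machine's accounting units), while a serial interactive run that consumes 30 minutes of CPU time is charged 4 × 0.5 = 2 against the default repository.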

19 Things To Look Out For (1)
Do not set group write permission for your home directory; it will prevent PBS from running your jobs.
Library modules must be loaded at run time as well as link time.
Propagation of environment variables to remote processes is incomplete; contact NERSC consulting for help.
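For instance, if a code was linked against a library provided through a module, the batch script should load that same module before launching the executable. A sketch, assuming the modules package is initialized by your shell startup files and using the illustrative module name mylib:

  #PBS -l nodes=4:ppn=2,walltime=01:00:00
  module load mylib     # illustrative: the same module used at link time
  cd $PBS_O_WORKDIR
  mpirun -np 8 ./a.out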

20 Things To Look Out For (2)
Do not run more than one MPI program in a single batch script.
If your login shell is bash, you may see:
  accept: Resource temporarily unavailable
  done.
In this case, specify a different shell using the -S directive, such as:
  #PBS -S /usr/bin/ksh

21 Things To Look Out For (3)
Batch jobs always start in $HOME. To get to the directory where the job was submitted:
  cd $PBS_O_WORKDIR
For jobs that work with large files:
  cd $SCRATCH/some_subdirectory
PBS buffers output and error files until the job completes. To view the files (in your home directory) while the job is running, use:
  #PBS -k oe
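A sketch combining these points: keep STDOUT/STDERR visible in $HOME while the job runs, and do the heavy file I/O under $SCRATCH (the subdirectory name is illustrative):

  #PBS -k oe                          # keep output/error files in $HOME during the run
  cd $SCRATCH/some_subdirectory       # work with large files under $SCRATCH
  mpirun -np 16 $PBS_O_WORKDIR/a.out  # executable taken from the submission directory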

22 Things To Look Out For (4)
The following is just a warning and can be ignored:
  Warning: no access to tty (Bad file descriptor).
  Thus no job control in this shell.

23 LoadLeveler vs. PBS

  LoadLeveler        PBS
  node               #PBS -l nodes
  tasks_per_node     #PBS -l ppn
  wall_clock_limit   #PBS -l walltime
  class              #PBS -q
  job_name           #PBS -N
  account_no         #PBS -A
  notification       #PBS -m
  shell              #PBS -S
  output             #PBS -o
  error              #PBS -e
  environment        #PBS -V
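As a sketch of the mapping, a Seaborg LoadLeveler header and an approximate Jacquard PBS equivalent; all values are illustrative:

  LoadLeveler (Seaborg):
    # @ job_name = myjob
    # @ node = 8
    # @ tasks_per_node = 2
    # @ wall_clock_limit = 00:30:00
    # @ class = debug

  PBS (Jacquard):
    #PBS -N myjob
    #PBS -l nodes=8:ppn=2,walltime=00:30:00
    #PBS -q debug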

24 Resources
NERSC Website
NERSC Consulting
  – NERSC, menu option 3, 8 am - 5 pm, Pacific time
  – (510) , menu option 3, 8 am - 5 pm, Pacific time