
Using the BYU SP-2

Our System
Interactive nodes (2) – used for login, compilation & testing – marylou10.et.byu.edu
I/O and scheduling nodes (7) – used for the batch scheduling system and the parallel file system
Compute nodes (26) – 22 with 4 processors, 4 with 16 processors

Compilers
xlc – C
xlC – C++
xlf – Fortran
Parallel compilers – mpcc, mpCC, mpxlf
Optimization – -O5 -qarch=pwr3 -qtune=pwr3 -qhot
Libraries – -lblas, -lfftw, -llapack, -lessl
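A typical compile line combining these pieces might look like this (a sketch only: the source and output file names are hypothetical, and the command requires the IBM XL compilers and libraries installed on the SP-2):

```shell
# Compile an MPI C program with the recommended POWER3 optimization flags,
# linking against LAPACK and ESSL (file names are illustrative)
mpcc -O5 -qarch=pwr3 -qtune=pwr3 -qhot mysolver.c -o mysolver -llapack -lessl
```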

Other Stuff
Documentation – – 
Launching parallel jobs
– done through the batch scheduler
– your job is a shell script that you hand to the batch scheduler for execution
– can look at xloadl for help creating the script

Batch job scheduler
Batch schedulers
– PBS (Portable Batch System), open source
– LoadLeveler, a descendant of Condor
The process
– user submits jobs to a queue
– machines register with the scheduler, offering to run jobs of a certain class
– the scheduler allocates jobs to machines and tracks them
– once started, jobs are scheduled by the kernel
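The submit-and-monitor cycle above can be sketched as follows (a sketch assuming a hypothetical script name job.cmd and an illustrative step ID; these LoadLeveler commands exist only on machines running LoadLeveler):

```shell
llsubmit job.cmd       # hand the job script to the scheduler
llq -u $USER           # check where your jobs sit in the queue
llcancel m1015i.123.0  # remove a job step if something went wrong
```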

Scheduling parallel jobs
Jobs can ask for
– a number of nodes (1 CPU each)
– a number of tasks per node (multiple CPUs)
– non-shared nodes (multiple CPUs)
Mixing jobs can be bad
– two I/O-intensive processes on a 2-CPU node can ruin performance for both
– the same goes for two RAM-intensive processes

Scheduling parallel jobs (2)
All allocated nodes, processors, and resources are held for the duration of the entire job
No dynamic adjustments, except by creating jobs with multiple steps
– each step can have different requirements
– each step can express a dependency on other steps
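A minimal sketch of such a multi-step job, assuming hypothetical step names and a hypothetical run script; the second step declares a dependency so it runs only if the first exits with status 0:

```shell
#!/bin/ksh
# @ job_type  = parallel
# @ step_name = prep
# @ total_tasks = 1
# @ queue
# @ step_name  = solve
# @ dependency = (prep == 0)
# @ total_tasks = 4
# @ queue
# LOADL_STEP_NAME tells the script which step is currently running
./run_step $LOADL_STEP_NAME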

Scheduling parallel jobs (3)
Management must
– allow some jobs to use the entire machine
– allow short jobs to get started quickly; they should not have to wait weeks in the queue
Some very long jobs may be needed, but they are to be avoided

Backfill scheduling
[Diagram: timeline of a 10-node system with Jobs A–D; small Job D is backfilled into an idle gap ahead of the larger reserved jobs]

Backfill scheduling
Requires a real time limit to be set
A more accurate (shorter) estimate gives the job a better chance of running earlier
Short jobs can move through the system more quickly
Uses the system better by avoiding wasted cycles during waits
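In LoadLeveler that run-time estimate is supplied with the wall_clock_limit keyword; a sketch, with an illustrative time value:

```shell
# @ wall_clock_limit = 2:00:00   # the step is killed if it runs past two hours
```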

Using LoadLeveler
Graphical user interface: xloadl
Make a shell script with LoadLeveler keywords as shell comments:
# @ output = thing.log
# @ error = thing.err
# @ class = short
# @ executable = thingx
# @ node = 6,10
# @ tasks_per_node = 4
# @ requirements = (Adapter == hps_us)

Sample LoadLeveler Script
#!/bin/ksh
# @ job_type = parallel
# @ input = /dev/null
# @ output = $(Executable).$(Cluster).$(Process).out
# @ error = $(Executable).$(Cluster).$(Process).err
# @ initialdir = /gstudent/student_rt_y/directory
# @ notify_user =
# @ class = short
# @ notification = complete
# @ checkpoint = no
# @ restart = no
# @ requirements = (Arch == "power3")
# @ blocking = unlimited
# @ total_tasks = 4
# @ network.MPI = switch,shared,US
# @ queue
./your_exe_and_any_args

Sample serial job
#!/bin/ksh
# @ job_type = serial
# @ input = /dev/null
# @ output = $(Executable).$(Cluster).$(Process).out
# @ error = $(Executable).$(Cluster).$(Process).err
# @ initialdir = /gstudent/student_rt_y
# @ notify_user =
# @ class = medium
# @ notification = complete
# @ checkpoint = no
# @ restart = no
# @ queue
paupnew Hlav3ashort.paup

LoadLeveler commands
llq – shows all jobs (can also use showq)
llq -s JobID – show why a job is not running
llclass – shows classes
llstatus – shows machines
llcancel JobID – cancel a job
llhold JobID – put a job in the hold state
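For example, to find out why a queued step has not started, and to hold and later release it (the step ID is illustrative; these commands require LoadLeveler):

```shell
llq -s m1015i.123.0     # lists the requirements keeping the step idle
llhold m1015i.123.0     # place the step in the hold state
llhold -r m1015i.123.0  # release it again
```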

Sample llq output
bash-2.05a$ llq
Id      Owner    Submitted   ST PRI Class   Running On
m1015i  mdt36    8/7  12:41  R  50  long    m1009i
m1015i  mdt36    8/7  12:41  R  50  long    m1019i
m1015i  jl447    8/12 16:25  R  50  long    m1012i
m1015i  to5      8/13 08:44  R  50  long    m1045i
m1015i  to5      8/13 08:44  R  50  long    m1045i
…
m1015i  taskman  8/14 08:13  R  50  short   m1017i
m1015i  taskman  8/14 08:13  R  50  short   m1014i
m1015i  taskman  8/14 08:13  R  50  short   m1017i
m1015i  taskman  8/14 08:13  R  50  short   m1014i
m1015i  taskman  8/14 08:13  R  50  short   m1011i
m1015i  mendez   8/14 13:07  I  50  long
m1015i  cr66     8/14 12:40  I  50  medium
m1015i  jl447    8/13 07:08  I  50  long
m1015i  dvd      8/13 10:45  I  50  medium
m1015i  dvd      8/13 11:22  I  50  medium
m1015i  dvd      8/13 11:25  I  50  medium
m1015i  mdt36    8/13 08:51  I  50  long
m1015i  mdt36    8/13 08:50  I  50  long
…
m1015i  taskman  8/14 08:27  I  50  short
m1015i  taskman  8/14 08:57  I  50  short
m1015i  taskman  8/14 08:57  I  50  short
58 job step(s) in queue, 23 waiting, 0 pending, 35 running, 0 held, 0 preempted

Sample showq output
bash-2.05a$ showq
ACTIVE JOBS
JOBNAME  USERNAME  STATE    PROC  REMAINING   STARTTIME
m1015i   taskman   Running  1     18:39:00    Wed Aug 14 08:06:24
m1015i   taskman   Running  1     18:39:00    Wed Aug 14 08:06:24
m1015i   taskman   Running  1     18:39:00    Wed Aug 14 08:06:24
…
m1015i   taskman   Running  1     21:33:42    Wed Aug 14 11:01:06
m1015i   taskman   Running  1     23:43:05    Wed Aug 14 13:10:29
m1015i   dvd       Running  4     2:15:10:38  Wed Aug 14 04:38:02
m1015i   mdt36     Running  8     2:23:14:21  Wed Aug 7  12:41:45
…
m1015i   jar65     Running  4     9:04:07:44  Tue Aug 13 17:35:08
m1015i   jar65     Running  4     9:08:28:16  Tue Aug 13 21:55:40
m1015i   to5       Running  8     9:21:11:49  Wed Aug 14 10:39:13
m1015i   to5       Running  8     9:21:11:49  Wed Aug 14 10:39:13
35 Active Jobs   150 of 184 Processors Active (81.52%)
                 26 of 34 Nodes Active (76.47%)
IDLE JOBS
JOBNAME  USERNAME  STATE  PROC  WCLIMIT     QUEUETIME
m1015i   jl447     Idle   2     5:00:00:00  Tue Aug 13 07:08:09
m1015i   dvd       Idle   8     3:00:00:00  Tue Aug 13 10:45:18
…
23 Idle Jobs
NON-QUEUED JOBS
JOBNAME  USERNAME  STATE  PROC  WCLIMIT  QUEUETIME
Total Jobs: 58   Active Jobs: 35   Idle Jobs: 23   Non-Queued Jobs: 0

LoadLeveler environment
Normally the same as your login environment
Limits are set; use llclass -l to see the values
– ulimit -S -a
– ulimit -H -a
Big heap requirements
– -bmaxdata:0x allows up to 2 GB of data (heap)
– -q64 -bmaxdata:0x…. allows up to 8 EB