OAR : a batch scheduler Grenoble University LIG (Mescal Team)

OAR : a batch scheduler A batch scheduler is software for sharing computing resources between users. It places jobs on the cluster's resources depending on their submission time, the load of the nodes, and various constraints.

OAR : a batch scheduler OAR is open source, distributed under the GPL license. It is written in Perl. It uses SSH for communication, sudo as the security model, and an SQL database as the core data model.

OAR main features
- Batch and interactive jobs
- Admission rules
- Walltime
- Matching of resources (job/node properties)
- Hold and resume jobs
- Multi-scheduler support (simple FIFO and FIFO with matching)
- Multiple queues with priorities
- Best-effort queues (for exploiting idle resources)
- Checking of compute nodes before launching jobs
- Epilogue/prologue scripts
- No daemon on compute nodes
- Dynamic insertion/deletion of compute nodes
- Logging
- Backfilling
- First-fit scheduler with resource matching (very soon)
- Fair sharing
- Advance reservation
- Cpuset feature used to optimize, clean and control access to nodes

OAR usage A command to watch resources: 'oarnodes'

OAR usage A command to watch jobs: 'oarstat'

OAR usage A command to submit jobs: 'oarsub'

OAR usage ● oarsub examples:
– # I want 2 nodes: oarsub -l /nodes=2 “sleep 300”
– # I want 4 cpus, with an interactive shell, for 4 hours: oarsub -l /cpu=4,walltime=4 -I
– # I want 4 cpus, but 1 per node: oarsub -l /nodes=4/cpu=1 -I
– # I want 2 cores on a node that has a Myrinet controller: oarsub -l /core=2 -p “myrinet='YES'” -I
– # I want 5 nodes that are on the same switch: oarsub -l /switch=1/nodes=5
– # I want to connect to a node of job number 1021: oarsub -C 1021
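To sketch batch (as opposed to interactive) submission, the script below is a hypothetical job script; the name job.sh and its contents are assumptions, not part of the OAR documentation, and the default for $OAR_JOB_ID exists only so the sketch runs outside a real OAR job.

```shell
#!/bin/sh
# Hypothetical job script "job.sh"; it could be submitted in batch mode
# with, for example: oarsub -l /nodes=2 ./job.sh
# Inside a job, OAR exports variables such as $OAR_JOB_ID; the default
# below only lets the sketch run standalone for demonstration.
MSG="Job ${OAR_JOB_ID:-unknown} on host $(hostname)"
echo "$MSG"
```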

OAR usage In non-interactive (batch) mode, the given program is launched on the first allocated node. The hostnames of the allocated resources are listed in a file whose name is stored in the $OAR_NODEFILE environment variable.
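A minimal sketch of consuming $OAR_NODEFILE inside a job script; outside OAR the variable is unset, so the snippet falls back to a mock nodefile (the hostnames are made up) purely to illustrate the one-hostname-per-allocated-core format.

```shell
#!/bin/sh
# $OAR_NODEFILE lists one hostname per allocated core. Outside an OAR
# job it is unset, so fall back to a mock file (hypothetical hostnames).
if [ -z "$OAR_NODEFILE" ]; then
    OAR_NODEFILE=$(mktemp)
    printf 'node1\nnode1\nnode2\nnode2\n' > "$OAR_NODEFILE"
fi

NCORES=$(grep -c . "$OAR_NODEFILE")                    # allocated cores (lines)
NNODES=$(sort -u "$OAR_NODEFILE" | wc -l | tr -d ' ')  # distinct nodes
echo "cores=$NCORES nodes=$NNODES"
```

MPI launchers typically accept such a file directly (for example via a machinefile option), which is the usual way to spread an MPI run over the allocated nodes.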

The cpuset feature With the cpuset feature, you must use “oarsh” as the connector between nodes to ensure proper cleanup after a job finishes. Usage of oarsh can be enforced through a simple sshd configuration. The cpuset feature can only be used on Linux kernels with the “cpuset” module loaded.
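The call pattern can be sketched as below: run a command on every allocated node through oarsh. Since neither oarsh nor a real nodefile exists outside an OAR job, the snippet substitutes a clearly labelled stub and mock hostnames so it can be executed at all.

```shell
#!/bin/sh
# Run a command on each allocated node via oarsh (required instead of
# ssh when the cpuset feature is enabled). Outside an OAR job, stub
# oarsh and the nodefile so the sketch is runnable for demonstration.
if ! command -v oarsh >/dev/null 2>&1; then
    oarsh() { echo "[stub] oarsh $*"; }    # stand-in, demonstration only
    OAR_NODEFILE=$(mktemp)
    printf 'node1\nnode2\n' > "$OAR_NODEFILE"
fi

RESULT=""
for host in $(sort -u "$OAR_NODEFILE"); do
    RESULT="$RESULT$(oarsh "$host" hostname)
"
done
printf '%s' "$RESULT"
```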

Visualisation tools Monika and DrawGantt

OAR coupled with PBS for CIGRI use