An Introduction to Gauss
Paul D. Baines, University of California, Davis
November 20th, 2012
What is Gauss?
12-node compute cluster (2 x 16 cores per node)
1 TB storage per node; ~11 TB storage on the head node
64 GB RAM per node
Total 416 cores (including the head node)

What is Gauss good for?
Running large numbers of independent jobs
Running long-running jobs
Running jobs involving parallel computing
Running large-memory jobs

What Gauss is not designed for…
Running simple, fast jobs (just use your laptop)
Running interactive R sessions
Running GPU-based calculations

Gauss Overview
Create your public/private key pair (see the Wiki for details)
Provide CSE with your public key and campus username
Log in to Gauss via ssh (e.g., ssh -X)
When you ssh into Gauss, you log in to the head node
If you just type R directly at the command line, you will be running R on the head node (please do not do this!)
To use the compute nodes, you submit jobs via SLURM
SLURM manages which jobs run on which nodes

Gauss Structure
Head Node → SLURM → Compute Node 1, Compute Node 2, Compute Node 3, …, Compute Node 12
(Jobs are submitted on the head node; SLURM dispatches them to the twelve compute nodes.)

SLURM Basics
Important commands to know:
sbatch   (submit a job to Gauss)
sarray   (submit an array job to Gauss)
squeue   (check the status of running jobs)
scancel  (cancel a job)
Examples (more detailed examples later):
squeue                 # view all running jobs
squeue -u pdbaines     # check all of pdbaines' jobs
scancel -u pdbaines    # cancel all of pdbaines' jobs
scancel <job_id>       # cancel the job with the given ID

Resource Allocation on Gauss
The compute resources (CPUs, memory) are shared across all Gauss users. When users submit jobs, SLURM allocates resources.
You must be sure to request sufficient resources (e.g., cores, memory) for your jobs to run.
Resource requests are made when submitting your job (via your sbatch or sarray scripts).
Resources are allocated as per user requests, but strict limits are not enforced.
If you use more memory than you requested, it can ~massively~ slow down your jobs (and those of other users)!
To check the memory usage of your jobs, you can use the myjobs command (see examples later).

Gauss Etiquette
Gauss is a shared resource – your bad code can (potentially) ruin someone else's simulation!
Test your code thoroughly before running large jobs.
Make sure you request the correct amount of resources for your jobs.
Regularly check memory usage for long-running jobs.
Be considerate of others!

Aside: Linux Basics
To use Gauss you need to know some basic Linux commands (these work in a Mac terminal too).
You should already be, or quickly become, familiar with the following commands: ls, cd, cp, mv, rm, pwd, cat, tar, grep
It helps if you learn how to use a command-line editor such as vim or nano. (Hint: use vim!)
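As a quick warm-up, here is a hypothetical session exercising a few of these commands (the file and directory names are made up for illustration):

```shell
# Make a scratch directory and move into it
mkdir -p linux_demo
cd linux_demo

# Create a small file, copy it, then rename the copy
echo "hello from gauss" > notes.txt
cp notes.txt notes_backup.txt
mv notes_backup.txt old_notes.txt

# List the files and count the lines matching a pattern
ls
matches=$(grep -c "gauss" notes.txt)
echo "lines mentioning gauss: ${matches}"

# Show the current working directory
pwd
```

Each command prints its result to the terminal, so you can experiment freely in a scratch directory before touching real data.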

Ways to use Gauss: Example 1
Bob has been given a large dataset by a collaborator and told to analyze it in R. The dataset is large and the job will take about 3 days to complete, so he doesn't want to use his laptop! Bob can submit the job on Gauss and keep working on other stuff in the meantime.

Example 1 cont…
Code files: bob_example_1.R, bob_example_1.sh
To submit: sbatch bob_example_1.sh

Example 1 Code: SLURM script
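The script itself appeared as an image on the original slide; the sketch below is a hypothetical reconstruction of what bob_example_1.sh might look like (the resource figures, time limit, and log file name are assumptions, not the original values):

```shell
#!/bin/bash
#SBATCH --job-name=bob_example_1    # name shown by squeue
#SBATCH --nodes=1                   # run on a single compute node
#SBATCH --ntasks=1                  # one task (one R process)
#SBATCH --mem-per-cpu=2000          # memory request in MB (size it after trial runs)
#SBATCH --time=3-00:00:00           # wall-clock limit: 3 days
#SBATCH --output=bob_example_1.out  # where stdout/stderr goes

# Run the R script non-interactively on the compute node
R CMD BATCH --no-save bob_example_1.R
```

The #SBATCH lines are how the resource requests from the earlier slide are actually expressed; SLURM reads them when the script is handed to sbatch.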

Allocating Resources
How do you know how much memory to request? Run small trial jobs!
Use the myjobs command, e.g.:
myjobs
Tue Nov 20 10:27:45 PST 2012
pdbaines has jobs running on: c0-11
jobs for pdbaines on c0-11
USER     PID  %CPU %MEM  VSZ  RSS  TTY STAT START TIME COMMAND
pdbaines …    …    …     …    …    ?   R    10:25 3:12 R
pdbaines …    …    …     …    …    ?   R    10:25 3:12 R
pdbaines …    …    …     …    …    ?   R    10:25 3:12 R
pdbaines …    …    …     …    …    ?   R    10:25 3:12 R
VSZ and RSS give a rough indication of how much memory your job is using (in KB), e.g., the above R jobs are using ~ MB each.

Ways to use Gauss: Example 2
Bob has been given 3 more datasets to analyze by his collaborator (or three new analyses to perform on the same dataset). He just needs to set up the same thing as Example 1 multiple times.

Example 2 cont…
Code files: bob_example_2A.R, bob_example_2B.R, bob_example_2C.R; bob_example_2A.sh, bob_example_2B.sh, bob_example_2C.sh
To submit:
sbatch bob_example_2A.sh
sbatch bob_example_2B.sh
sbatch bob_example_2C.sh

Example 2 Code: SLURM script
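Since the three submit scripts differ only by a letter, the submissions can themselves be scripted. A small sketch (building the command strings rather than calling sbatch, since the scheduler is only available on Gauss):

```shell
# Build the list of submit commands for datasets A, B, and C
cmds=()
for tag in A B C; do
  cmds+=("sbatch bob_example_2${tag}.sh")
done

# On Gauss you would execute each command; here we just print them
printf '%s\n' "${cmds[@]}"
```

On Gauss, replacing the printf with an eval of each entry (or calling sbatch directly inside the loop) submits all three jobs in one go.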

Ways to use Gauss: Example 3
Bob has developed a new methodology for analyzing super-complicated data. He wants to run a simulation to prove to the world how awesome his method is compared to his competitors' methods. He decides to simulate 100 datasets and analyze each of them with his method and his competitors' methods. This is done using an array job.

Example 3 cont…
Bob writes an R script to randomly generate and analyze one dataset at a time.
He would like to run the script 100 times on Gauss.
To do this, he writes a shell script to submit to SLURM.
Each run must use a different random seed, otherwise he will analyze the same dataset 100 times!
He will also need to write an R script to combine the results from all 100 jobs.
He will also need a shell script to submit the post-processing portion of the analysis.
(Note: I have described this process in detail on the Gauss page of the CSE Wiki.)

Example 3 cont…
Code files: bob_example_3.R, bob_post_process.R
To submit:
sarray bob_example_3.sh
sbatch bob_post_process.sh

Example 3: SLURM script
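The array-job script was shown only as an image; below is a hypothetical sketch of bob_example_3.sh written in standard sbatch --array form (Gauss's sarray wrapper may expect a slightly different layout, and the range and resource figures are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=bob_example_3
#SBATCH --array=1-100                  # 100 tasks, one per simulated dataset
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1000             # MB; size this after small trial runs
#SBATCH --output=bob_example_3_%a.out  # %a expands to the array task ID

# Pass the task ID to R so each run gets its own dataset and seed
R CMD BATCH --no-save "--args ${SLURM_ARRAY_TASK_ID}" bob_example_3.R
```

SLURM sets SLURM_ARRAY_TASK_ID to a different value (1 through 100 here) for each task, which is what lets one script fan out into 100 distinct runs.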

Example 3: Modified R Code
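The modified R code was likewise shown as an image. The key idea — deriving a unique, reproducible seed from the array task ID — can be sketched in shell (the variable names and the offset of 1000 are illustrative; inside R the same values would be read via commandArgs()):

```shell
# SLURM sets SLURM_ARRAY_TASK_ID for each task; default to 1 for local testing
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Derive a distinct seed per task so no two runs analyze the same simulated data
seed=$((1000 + task_id))

echo "task ${task_id} will use seed ${seed}"
```

Because the seed is a deterministic function of the task ID, any individual run can be reproduced later by rerunning just that task.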

Retrieving your results
To copy results back from Gauss to your laptop:
Archive them, e.g., tar -cvzf all_results.tar.gz my_results/
Copy them either by using a file transfer (sftp) program, or just use the command line (Linux/Mac users), e.g., scp
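Putting the two steps together — a runnable sketch of the archive step, with the copy step left as a comment since the host name on the original slide did not survive the transcript (the directory contents here are stand-ins):

```shell
# Gather results into a compressed archive on Gauss
mkdir -p my_results
echo "example output" > my_results/run1.txt   # stand-in for real result files
tar -czvf all_results.tar.gz my_results/

# Then, from your laptop, pull the archive down, e.g.:
#   scp <username>@<gauss-hostname>:all_results.tar.gz .
ls -l all_results.tar.gz
```

Archiving first means a single transfer instead of one per file, and gzip compression typically shrinks text-heavy results considerably.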

More Advanced Usage
Gauss can be set up to run parallel computing jobs using MPI, OpenMP, etc.
SLURM submit files need to be modified to specify the number of tasks, CPUs, memory per CPU, etc.
New (free) software can be installed on Gauss at your request by email.
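For instance, an OpenMP-style multi-core request might look like the following sketch (the directive values and program name are illustrative, not Gauss-specific settings):

```shell
#!/bin/bash
#SBATCH --job-name=parallel_example
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8    # request 8 cores on one node for this task
#SBATCH --mem-per-cpu=1000   # memory request in MB, per core

# Tell the OpenMP runtime how many threads it may use
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
./my_openmp_program          # hypothetical compiled OpenMP program
```

The point is that the parallelism must be declared twice: once to SLURM (so the cores are actually reserved) and once to the program's runtime (so it uses exactly what was reserved).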

References
Pre-requisite Linux skills
Gauss/SLURM links (including instructions for generating ssh keys; see the CSE Wiki)