SCore MPI: Taking full advantage of GigE
Cliff Addison, University of Liverpool
NW-GRID Training Event, 26th January 2007

Overview
● What is SCore and why use it?
● Compiling SCore MPI jobs.
● Submitting parallel jobs via mpisub.
● Writing your own qsub script.
● Themes and variations.

What is SCore?
● Advanced cluster O/S and run-time environment:
– Cluster installation & management tools.
– Integrated communication layer.
– Includes an advanced implementation of the MPI standard, presently based on mpich.
● Commercially developed/supported by Streamline.
● Original developer works for Streamline in Japan.

Why SCore?
● Performance:
– Employs a TCP/IP bypass on Ethernet.
● Inter-node latencies typically around 15 µsec (vs. regular mpich).
● Inter-node bandwidth around 110 MB/s (vs. ~100 MB/s with mpich).
– Network failover mode allows multiple NICs to be employed:
● Latency circa 20 µsec, bandwidth ~200 MB/s.
– Intra-node performance is excellent.

Why SCore?
● Expanded feature set (active development in Japan via the PC Cluster Consortium):
– Good checkpoint/restart mechanisms (soon to extend to multi-threaded and dynamically linked applications).
– Robust job-failure handling (relatively few problems with rogue processes).
– Close integration with Sun Grid Engine.

Compiling SCore MPI jobs
● Suitable only for 64-bit binaries!
● The default environment should put /opt/score/bin near the top of your PATH list.
● Use of the compiler wrappers is essential:
– MPI library links / includes are handled automatically.
– mpif90 for PGI pgf95.
– mpicc, mpif77 and mpic++ use the GNU compilers.
– mpicc -compiler pgi to access the PGI C compiler.
– mpicc -help shows the command to get more information.
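
For example, minimal compiles through the wrappers might look like this (the source file names are illustrative):

  mpif90 -o hello hello.f90               # PGI pgf95 behind the wrapper
  mpicc -o hello hello.c                  # GNU gcc behind the wrapper
  mpicc -compiler pgi -o hello hello.c    # PGI C compiler instead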

More on compiling
● Compiler options and library references are passed through to the associated compiler / loader:
  mpicc -compiler pgi -fastsse -tp k8-64
● Go-faster options for PGI cc and f90: -fastsse -tp k8-64
– Note the flag -tp k8-64 must be included on the link line too.
● 64-bit binaries can be tricky to get right:
– With VASP (f90 code) we needed to add -Mlfs.
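
As a sketch of a separate compile and link (hypothetical file names), with the target flag repeated at the link step:

  mpif90 -fastsse -tp k8-64 -c solver.f90
  mpif90 -fastsse -tp k8-64 -o solver solver.o   # -tp k8-64 on the link line too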

Using mpisub
● The mpisub script generator is the easy way to submit SCore MPI jobs to the system.
– Underlies the MPI job type via the sge jobmanager with globus-job-submit / globus-job-run.
– Basic syntax is easy:
  mpisub NxS exec arg1 arg2
● N – number of nodes; S – 1, 2 or 4 cores / node:
  mpisub 5x4 pi2 "< infile"
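
Reading the NxS geometry: N nodes with S processes on each, so the example above starts 20 MPI processes. A second, hypothetical invocation for comparison:

  mpisub 5x4 pi2 "< infile"    # 5 nodes x 4 cores = 20 processes, stdin from infile
  mpisub 8x2 myprog arg1       # 8 nodes x 2 cores = 16 processes (myprog is hypothetical)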

Using mpisub-2
● mpisub:
– Generates a job submission script.
– Submits the job to an appropriate parallel environment.
– Is non-blocking!
– Performs a qstat command to give you an idea of where your job lies in the queue.
– Options are further controlled using the values of:
● $QSUB_OPTIONS
● $DEFAULTQ
● $SGE_RESOURCES
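
Assuming these behave as ordinary environment variables consulted by mpisub (the particular values below are hypothetical; check the local documentation for what is accepted), they would be set before the call:

  export QSUB_OPTIONS="-j y"    # hypothetical extra qsub flags
  export DEFAULTQ=par.q         # hypothetical default queue name
  mpisub 5x4 pi2 "< infile"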

A minimal qsub script
  #!/bin/sh
  #$ -cwd -V -j y                # Grid Engine options
  #$ -pe score 3                 # parallel environment is score
  EXEC="pi_redirect arg1"        # executable & argument
  PROCS=$((NSLOTS-1))x4          # NSLOTS predefined by SGE (= 3 here), so PROCS = 2x4: 8 processes on 2 nodes
  # scout launches the job; JOB_ID is the SGE job id
  scout -wait -F $HOME/.score/ndfile.$JOB_ID \
    -e /tmp/scrun.$JOB_ID \
    -nodes=$PROCS, $EXEC

qsub scripts – General points
● Need to specify number of slots = nodes + 1 in the parallel environment line.
● A script can be fully parameterised if you move the -pe line out as a qsub argument (see the sketch below).
● The EXEC and PROCS variables make the script more generic, but are not essential.
– The EXEC string is usually just the executable name.
● $NSLOTS is predefined by SGE to the number of slots.
● $JOB_ID is predefined to the SGE job id.
● The scout line can be copied once and left alone if you use $EXEC and $PROCS.
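
Putting those points together, one way to parameterise the minimal script (an assumption about how you might do it, not the deck's own example):

  #!/bin/sh
  #$ -cwd -V -j y
  # no -pe line here: supply it as a qsub argument instead
  EXEC="$@"                      # program name and arguments from the command line
  PROCS=$((NSLOTS-1))x4          # slots = nodes + 1, as noted above
  scout -wait -F $HOME/.score/ndfile.$JOB_ID \
    -e /tmp/scrun.$JOB_ID \
    -nodes=$PROCS, $EXEC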

qsub scripts – Further details
● Sample qsub command for a script without the -pe line:
  qsub -pe score 7 qsub_NeedsPE_pi.sh
● A typical run generates 4 files (SGE output & error plus job output & error).
– #$ -j y – combines error and output files.
– #$ -o outfile – redirects standard output.
● Similar for standard error.
● Combine the above two directives – all output in outfile.
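
Combined, a script header that funnels everything into one file might look like this small sketch:

  #$ -j y          # merge standard error into standard output...
  #$ -o outfile    # ...and send it all to outfile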

qsub scripts – More details
● Sometimes special environment variables need to be passed to the compute nodes:
– #$ -V → passes the environment to the starting process.
● scout then propagates this environment to the compute nodes.
● Default location for output etc. – home directory:
– #$ -cwd → changes this to the current working directory.

qsub scripts – Final points
● Example programs, Makefile and README can be found in /usr/local/examples (at least on lv1).
● mpirun can be used instead of scout to launch parallel jobs:
– Standard arguments.
● Machine file: $HOME/.score/ndfile.$JOB_ID
– May not be as robust as scout on failed jobs!
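
A sketch of the mpirun alternative, assuming the usual mpich-style arguments (the executable reuses the earlier example):

  mpirun -np 8 -machinefile $HOME/.score/ndfile.$JOB_ID pi_redirect arg1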

Grid implications
● qsub commands are non-blocking operations.
– Need to capture something like the Job ID and then check to see when that job is no longer on the system, e.g. build on lines like:
  Job_ID=`qsub myscript | cut -f 3 -d " "`
  qstat -u $USER | grep $Job_ID
– Can use gsiscp to stage / retrieve files.
– Can use gsissh to run commands / scripts.
– Make certain your remote submission needs are not already satisfied by mpisub (much easier to work with!).
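
Building on those lines, a minimal wait-for-completion loop might look like this sketch (the script name and poll interval are illustrative):

  Job_ID=`qsub myscript | cut -f 3 -d " "`
  while qstat -u $USER | grep -q "$Job_ID"; do
    sleep 30                     # poll until the job leaves the queue
  done
  # job has finished (or failed); stage results back with gsiscp here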