Using Parallel Computing Resources at Marquette

HPC Resources
- Local resources:
  - HPCL Cluster: hpcl.mscs.mu.edu
  - PARIO Cluster: pario.eng.mu.edu
  - PERE Cluster: pere.marquette.edu
  - MU Grid
- Regional resources: Milwaukee Institute, SeWhip
- National resources:
  - NCSA: http://www.ncsa.illinois.edu/
  - ANL: http://www.anl.gov/
  - TeraGrid: http://www.teragrid.org/
- Commercial resources:
  - Amazon EC2: http://aws.amazon.com/ec2/

Pere Cluster
- 128 HP ProLiant BL280c G6 server blades
- 1024 Intel Xeon 5550 (Nehalem) cores
- 50 TB raw storage
- 3 TB main memory
- Head node connected to MARQNET (134.48.0.0/16)
- Compute nodes #1 through #128 connected by Gigabit Ethernet (10.1.0.0/16) and InfiniBand (172.16.0.0/16) interconnects

Steps to Run a Parallel Code
1. Get the source code: write it on your local computer and transfer it to hpcl.mscs.mu.edu, or use vi to edit it directly on hpcl.mscs.mu.edu.
2. Compile your source code using mpicc, mpicxx, or mpif77.
3. Write a submission script for your job: vi myscript.sh
4. Use qsub to submit the script: qsub myscript.sh

Getting Parallel Code (hello.c)
- You can write the code on your development machine using an IDE and then transfer it to the cluster. (Recommended)
- For small programs, you can also edit the code directly on the cluster.
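The hello.c source itself is not included in the transcript; a minimal MPI hello-world, consistent with how the code is compiled and run later in these slides, might look like the sketch below (this is an illustration, not the original file).

   #include <stdio.h>
   #include <mpi.h>

   /* Minimal MPI example (illustrative sketch, not the original hello.c). */
   int main(int argc, char *argv[])
   {
       int rank, size;
       MPI_Init(&argc, &argv);                 /* start the MPI runtime */
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
       MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
       printf("Hello from rank %d of %d\n", rank, size);
       MPI_Finalize();                         /* shut down MPI */
       return 0;
   }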

Transfer Files to the Cluster
- Method 1: sftp (text or GUI)
  sftp mscs6060@pere.marquette.edu
  put simple.c
  bye
- Method 2: scp
  scp simple.c mscs6060@pere.marquette.edu:example/
- Method 3: rsync
  rsync --rsh=ssh -av example mscs6060@pere.marquette.edu:

Compile MPI Programs
Method 1: Use the MPI compiler wrappers
- mpicc: for C code
- mpicxx/mpic++/mpiCC: for C++ code
- mpif77, mpif90: for Fortran code
Examples:
  mpicc -o hello hello.c
  mpif90 -o hello hello.f

Compile MPI Programs (cont.)
Method 2: Use a standard compiler and link against the MPI library
Note: MPI is just a library, so you can link it into your code to get the executable.
Example:
  gcc -o ping ping.c \
      -I/usr/mpi/gcc/openmpi-1.2.8/include \
      -L/usr/mpi/gcc/openmpi-1.2.8/lib64 -lmpi

Compiling Parallel Code – Using Makefile
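The Makefile shown on this slide did not survive the transcript. A minimal sketch of a Makefile for an MPI program is given below; the target name hello and the use of the mpicc wrapper are assumptions for illustration, and recipe lines must begin with a tab character.

   # Minimal Makefile for an MPI program (illustrative sketch, not the original slide's Makefile)
   CC     = mpicc
   CFLAGS = -O2 -Wall

   hello: hello.c
   	$(CC) $(CFLAGS) -o hello hello.c

   clean:
   	rm -f hello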

Job Scheduler
A kind of software that provides:
- Job submission and automatic execution
- Job monitoring and control
- Resource management
- Priority management
- Checkpointing
- ...
Usually implemented as a master/slave architecture.
Commonly used job schedulers:
- PBS: PBS Pro / TORQUE
- SGE (Sun Grid Engine, Oracle)
- LSF (Platform Computing)
- Condor (UW-Madison)

Access the Pere Cluster
- ssh <your-marquette-id>@pere.marquette.edu
- Account management is based on Active Directory: you log in to Pere with the same username and password you use for your Marquette email.
- Ask your professor to help you sign up.
- Transfer files from/to Pere.

Modules
The Modules package is used to customize your environment settings and to control which versions of a software package are used when you compile or run a program.
Using modules:
- module avail: check which modules are available
- module load <module>: set up shell variables to use a module
- module unload <module>: remove a module
- module list: show all loaded modules
- module help: get help on using modules
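For example, a typical session might look like the following; the module name openmpi is an assumption, so check module avail on Pere for the actual names.

   module avail            # list available modules
   module load openmpi     # load an (assumed) OpenMPI module
   module list             # confirm what is loaded
   module unload openmpi   # remove it again when finished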

Using MPI on Pere
Multiple MPI compilers are available; each may need different syntax.
- OpenMPI compiler (/usr/mpi/gcc/openmpi-1.2.8):
  mpicc -o prog prog.c
  mpif90 -o prog prog.f
- MVAPICH compiler (/usr/mpi/gcc/mvapich-1.1.0)
- PGI compiler (/cluster/pgi/linux86-64/10.2):
  pgcc -Mmpi -o prog prog.c
  pgf90 -Mmpi -o prog prog.f
- Intel compiler:
  icc -o prog prog.c -lmpi
  ifort -o prog prog.f -lmpi

Pere Batch Queues
Pere currently runs PBS/TORQUE.
TORQUE usage:
- qsub myjob.qsub: submit a job script
- qstat: view job status
- qdel <job-id>: delete a job
- pbsnodes: show node status
- pbstop: show queue status

Sample Job Script on Pere

   #!/bin/sh
   #PBS -N hpl
   #PBS -l nodes=64:ppn=8,walltime=01:00:00
   #PBS -q batch
   #PBS -j oe
   #PBS -o hpl-$PBS_JOBID.log
   cd $PBS_O_WORKDIR
   cat $PBS_NODEFILE
   mpirun -np 512 --hostfile `echo $PBS_NODEFILE` xhpl

Line by line:
- #PBS -N hpl: assign a name to the job
- #PBS -l nodes=64:ppn=8,walltime=01:00:00: request resources: 64 nodes, each with 8 processors, for 1 hour
- #PBS -q batch: submit to the batch queue
- #PBS -j oe: merge stdout and stderr output
- #PBS -o hpl-$PBS_JOBID.log: redirect output to a file
- cd $PBS_O_WORKDIR: change the working directory to the submission directory
- cat $PBS_NODEFILE: print the allocated nodes (not required)
- mpirun ... xhpl: run the MPI program

Extra Help for Accessing Pere
- Contact me.
- User's guide for Pere.

Using Condor
Resources: http://www.cs.wisc.edu/condor/tutorials/

Using Condor
1. Write a submit script, simple.job:

   Universe   = vanilla
   Executable = simple
   Arguments  = 4 10
   Log        = simple.log
   Output     = simple.out
   Error      = simple.error
   Queue

2. Submit the script to the Condor pool:
   condor_submit simple.job
3. Watch the job run:
   condor_q
   condor_q -sub <your-username>

Doing a Parameter Sweep
You can put a collection of jobs in the same submit script to do a parameter sweep. $(Process) tells Condor to use a different output file for each job, and each Queue statement queues one individual job; the jobs can be run independently.

   Universe   = vanilla
   Executable = simple
   Arguments  = 4 10
   Log        = simple.log
   Output     = simple.$(Process).out
   Error      = simple.$(Process).error
   Queue
   Arguments  = 4 11
   Queue
   Arguments  = 4 12
   Queue

Condor DAGMan
DAGMan lets you submit complex sequences of jobs, as long as they can be expressed as a directed acyclic graph (DAG). Each submit file used in the DAG may queue only one job.
Commands:
  condor_submit_dag simple.dag
  ./watch_condor_q
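The contents of simple.dag are not shown in the transcript; a minimal DAG file for a diamond-shaped dependency might look like the sketch below (the job names and submit-file names are illustrative, not from the slides).

   # diamond.dag -- illustrative sketch, not the original simple.dag
   Job A a.job
   Job B b.job
   Job C c.job
   Job D d.job
   PARENT A CHILD B C
   PARENT B C CHILD D

Submitting it with condor_submit_dag diamond.dag runs A first, then B and C, then D once both have finished.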

Submit MPI Jobs to Condor
Differences from serial jobs (see the sketch below):
- Use the MPI universe.
- Set machine_count > 1.
- When there is no shared file system, transfer the executable and output from/to the local system by specifying should_transfer_files and when_to_transfer_output.
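A sketch of such a submit file, assuming the standard Condor submit-file syntax; the executable name mpi_prog and the machine count are placeholders.

   Universe                = MPI
   Executable              = mpi_prog
   Machine_count           = 4
   should_transfer_files   = YES
   when_to_transfer_output = ON_EXIT
   Queue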

Questions
- How to implement a parameter sweep using SGE/PBS?
- How to implement a DAG on SGE/PBS?
- Are there better ways to run a large number of jobs on the cluster?
- Which resource should I use, and where can I find help?

HPCL Cluster
- Head node connected to MARQNET (134.48.0.0/16)
- Compute nodes #1 through #4 connected to the head node over a Gigabit Ethernet interconnect (10.1.0.0/16)

How to Access the HPCL Cluster
- On Windows: use SSH Secure Shell or PuTTY.
- On Linux: use the ssh command.

Developing & Running Parallel Code
1. Identify the problem & analyze requirements
2. Design a parallel algorithm (analyze performance bottlenecks)
3. Write the parallel code (coding)
4. Build the binary code (compiling)
5. Test the code (running)
6. Solve realistic problems (run the production release)

Steps to Run a Parallel Code
1. Get the source code: write it on your local computer and transfer it to hpcl.mscs.mu.edu, or use vi to edit it directly on hpcl.mscs.mu.edu.
2. Compile your source code using mpicc, mpicxx, or mpif77. They are located under /opt/openmpi/bin; use the which command to find their location. If they are not in your PATH, add the following line to your shell initialization file (e.g., ~/.bash_profile):
   export PATH=/opt/openmpi/bin:$PATH
3. Write a submission script for your job: vi myscript.sh
4. Use qsub to submit the script: qsub myscript.sh

Getting Parallel Code (hello.c)
- You can write the code on your development machine using an IDE and then transfer it to the cluster. (Recommended)
- For small programs, you can also edit the code directly on the cluster.

Transfer Files to the Cluster
- Method 1: sftp (text or GUI)
  sftp mscs6060@hpcl.mscs.mu.edu
  put simple.c
  bye
- Method 2: scp
  scp simple.c mscs6060@hpcl.mscs.mu.edu:example/
- Method 3: rsync
  rsync --rsh=ssh -av example mscs6060@hpcl.mscs.mu.edu:
- Method 4: svn or cvs
  svn co svn+ssh://hpcl.mscs.mu.edu/mscs6060/example

Compile MPI Programs
Method 1: Use the MPI compiler wrappers
- mpicc: for C code
- mpicxx/mpic++/mpiCC: for C++ code
- mpif77, mpif90: for Fortran code
Examples:
  mpicc -o hello hello.c
  mpif90 -o hello hello.f
Check the cluster documentation or consult the system administrators for the available compilers and their locations.

Compile MPI Programs (cont.)
Method 2: Use a standard compiler and link against the MPI library
Note: MPI is just a library, so you can link it into your code to get the executable.
Example:
  gcc -o ping ping.c \
      -I/usr/mpi/gcc/openmpi-1.2.8/include \
      -L/usr/mpi/gcc/openmpi-1.2.8/lib64 -lmpi

Compiling Parallel Code – Using Makefile

Job Scheduler
A kind of software that provides:
- Job submission and automatic execution
- Job monitoring and control
- Resource management
- Priority management
- Checkpointing
- ...
Usually implemented as a master/slave architecture.
Commonly used job schedulers:
- PBS: PBS Pro / TORQUE
- SGE (Sun Grid Engine, Oracle)
- LSF (Platform Computing)
- Condor (UW-Madison)

Using SGE to Manage Jobs
The HPCL cluster uses SGE as its job scheduler.
Basic commands:
- qsub: submit a job to the batch scheduler
- qstat: examine the job queue
- qdel: delete a job from the queue
Other commands:
- qconf: SGE queue configuration
- qmon: graphical user interface for SGE
- qhost: show the status of SGE hosts, queues, and jobs

Submit a Serial Job simple.sh
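The simple.sh script on this slide did not survive the transcript. A minimal sketch of an SGE serial submission script is shown below; the executable name ./simple and the chosen options are assumptions for illustration.

   #!/bin/bash
   #$ -N simple        # job name (assumed)
   #$ -cwd             # run the job in the current working directory
   #$ -j y             # merge stdout and stderr
   #$ -o simple.log    # write output to simple.log (assumed name)
   ./simple            # the serial program to run (placeholder name)

Submit it with: qsub simple.sh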

Submit Parallel Jobs to the HPCL Cluster
The submission script for a parallel job (see the sketch below) does the following:
- Forces bash as the shell interpreter
- Requests the parallel environment orte with 64 slots (processors)
- Runs the job in the specified directory
- Merges the two output files (stdout, stderr)
- Redirects output to a log file
- Runs the MPI program
For your own program you may need to change the processor count, the program name on the last line, and the job name.
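The script itself is not in the transcript; reconstructed from the annotations above, an SGE script along these lines might be used (the job name, log-file name, and program name are placeholders).

   #!/bin/bash
   #$ -S /bin/bash       # force bash as the shell interpreter
   #$ -N myjob           # job name (placeholder)
   #$ -pe orte 64        # request the orte parallel environment with 64 slots
   #$ -cwd               # run the job in the submission directory
   #$ -j y               # merge stdout and stderr
   #$ -o myjob.log       # redirect output to a log file (placeholder name)
   mpirun -np 64 ./prog  # run the MPI program (placeholder name)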

References
- Sun Grid Engine User's Guide: http://docs.sun.com/app/docs/doc/817-6117
- Commonly used commands:
  - Submit a job: qsub
  - Check status: qstat
  - Delete a job: qdel
  - Check configuration: qconf
- Check the manual of a command: man qsub