HPC for Statistics Grad Students

A Cluster
Not just a bunch of computers: linked CPUs managed by queuing software.
– Cluster: the full collection of linked machines
– Node: a single machine within the cluster
– CPU: a single processor (core) on a node

Clusters: Stat Cluster
– Need a FAS account
– If you have a FAS account but can’t access the cluster, contact
– Read Matt Pratola’s webpage first (see references)!
– 160 CPUs in 20 nodes (1 node = 2 x Intel quad-core Xeon 2.66 GHz with 2 GB RAM per CPU)
– Access: ssh warrior.stat.sfu.ca or ssh stat-cl.stat.sfu.ca

Clusters: IRMACS Cluster
– Need an IRMACS account
– Contact ???? for access if you don’t have one
– 80 CPUs in 10 nodes (1 node = 2 x Intel quad-core Xeon 2.66 GHz with 2 GB RAM per CPU)
– Access: ssh head.irmacs.sfu.ca

WestGrid
– Need to apply for an account with permission from Charmaine
– > 5000 CPUs with various chips and >> 10 TB of storage

Cluster Login & File Access
– Log in from the terminal: ssh head.irmacs.sfu.ca
– File transfer: scp (secure copy) or Fugu
– Example: scp Script-pbs.txt

Submitting Jobs to the Queuing Software (stat-cl)

PBS script directives:

#!/bin/bash
#PBS -N job_name
#PBS -q queue_name
#PBS -M user_email_address
#PBS -m bae                 # send mail when the job (b)egins, (a)borts, or (e)nds
#PBS -l nodes=2             # number of nodes/CPUs requested
#PBS -o path_to_job_log
#PBS -e path_to_error_log

Example: a single R job

#!/bin/bash
#PBS -N Chi-square
#PBS -q batch
#PBS -M
#PBS -m bae
/usr/local/bin/R CMD BATCH Chi-square.R

Submit the script to the queue with qsub, e.g. qsub Script-pbs.txt

Threading & Parallel Processing
Many statistical jobs can take advantage of parallel processing, but threading is much less common.
– Threading: sending calls to subroutines out to separate CPUs for simultaneous processing
– Parallel processing: separate CPUs performing similar but independent jobs, for example
  – Simulation
  – Bootstrapping (see the sketch below)
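To see why bootstrapping parallelizes so easily, here is a small, purely illustrative R bootstrap; the data and statistic are invented for the example. Each replicate is computed independently of the others, so the B iterations can be handed to separate CPUs with no communication between them.

# Illustration: each bootstrap replicate is independent, so the loop
# below could be split across CPUs without any communication between them.
x <- rnorm(50)                 # stand-in for your data
B <- 1000
boot_means <- numeric(B)
for (b in 1:B) {
  boot_means[b] <- mean(sample(x, replace = TRUE))
}
quantile(boot_means, c(0.025, 0.975))   # percentile bootstrap interval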

Parallel Processing in Clusters
MPI – Message Passing Interface
– Software that manages how the parallel (or threaded) processes are sent their arguments and return their results
Use the Rmpi package to construct a parallel job in R, then use mpirun to send that job to CPUs on the cluster
– Master: the controlling R process that hands out the work
– Slaves: the worker R processes that carry it out

Example PBS Script for an Rmpi Job (Stat Cluster)

#!/bin/bash
#PBS -N Cox_MPI
#PBS -q default
#PBS -M
#PBS -j oe
#PBS -o cox_mpi.out
#PBS -e cox_mpi.err
#PBS -d /home/math2/dthompso/RMPIex
#PBS -m ea
#PBS -l nodes=4:ppn=8

# The mpirun command line is rather complicated, so we define it here. DO NOT CHANGE!
MPIRUN="mpirun -np 1 -hostfile $PBS_NODEFILE --mca btl ^openib,udapl --mca pls_rsh_agent /usr/bin/ssh"

# Here is where you specify the executable you want the cluster to run.
$MPIRUN /math/local2-linux/stat/bin/R --vanilla --no-save --no-restore -f RMPI_cox_test.R

R Code for an Rmpi Job
Components of the Rmpi R code (a sketch follows below):
– Initialization of slaves
– Creation of the function to pass to the slaves
– Submitting data & functions to the slaves
– Output of results
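Below is a minimal sketch of those four components. The simulation function one_rep, the sample size, and the number of replicates are invented purely for illustration, and the exact Rmpi calls and options you need may differ for your job and cluster.

# Minimal Rmpi sketch (illustrative only; one_rep and its settings are made up)
library(Rmpi)

# 1. Initialization of slaves: one slave per CPU that mpirun makes available
mpi.spawn.Rslaves(nslaves = mpi.universe.size() - 1)

# 2. Creation of the function to pass to the slaves
one_rep <- function(i, n) {
  x <- rnorm(n)                   # simulate one data set
  c(mean = mean(x), sd = sd(x))   # return the statistics of interest
}

# 3. Submitting data & functions to the slaves
n <- 100
mpi.bcast.Robj2slave(one_rep)
mpi.bcast.Robj2slave(n)

# Run the replicates in parallel across the slaves
results <- mpi.parSapply(1:1000, one_rep, n = n)

# 4. Output results
write.csv(t(results), file = "sim_results.csv")

# Shut the slaves down cleanly and exit
mpi.close.Rslaves()
mpi.quit(save = "no")

A file like this is what the mpirun line in the PBS script runs (RMPI_cox_test.R in the example above).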

R Packages on the Cluster
Some packages are already installed – check!
1. Run an interactive session: qsub -I
2. Start R:
   stat cluster:    R
   IRMACS cluster:  /usr/local/bin/R
3. Attempt to load the package: library(PACKAGENAME)
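As an alternative to trying library() by hand, a quick check like the following also works; "Rmpi" here is only an example package name.

# Check whether a package is already installed ("Rmpi" is only an example)
if ("Rmpi" %in% rownames(installed.packages())) {
  cat("Rmpi is installed\n")
} else {
  cat("Rmpi is missing -- install it into your home directory (next slide)\n")
}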

R Packages on the Cluster (cont’d)
If you need a new package:
1. Download the package (.tar.gz file) and copy it to the cluster (Fugu or scp)
2. Make a personal package installation folder:
   mkdir $HOME/R
   mkdir $HOME/R/x86_64-unknown-linux-gnu-library
   mkdir $HOME/R/x86_64-unknown-linux-gnu-library/2.7
3. Install:
   cd $HOME
   R CMD INSTALL PACKAGENAME.gz
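R usually checks this personal library automatically (it is the default R_LIBS_USER location), but if it does not, you can point R at it yourself at the top of your script. A small sketch, where the path matches the folders created in step 2 and "Rmpi" is only an example package:

# Add the personal library created above to R's search path, then load the package
.libPaths(c(path.expand("~/R/x86_64-unknown-linux-gnu-library/2.7"), .libPaths()))
library(Rmpi)   # "Rmpi" is only an example; use the package you installed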

Helpful online resources
– Rmpi
– Stat Cluster
– IRMACS Cluster
– WestGrid