Roadrunner Supercluster University of New Mexico -- National Computational Science Alliance Paul Alsing
23 September 1999 Cactus Workshop2 Alliance/UNM Roadrunner SuperCluster
23 September 1999 Cactus Workshop3 Alliance/UNM Roadrunner SuperCluster Strategic Collaborations with Alta Technologies Intel Corp. Node configuration Dual 450MHz Intel Pentium II processors 512 KB cache, 512 MB ECC SDRAM 6.4 GB IDE hard drive Fast Ethernet and Myrinet NICs
23 September 1999 Cactus Workshop4 Alliance / UNM Roadrunner Interconnection Networks Control: 72-port Fast Ethernet Foundry switch with 2 Gigabit Ethernet uplinks Data: Four Myrinet Octal 8-port switches Diagnostic: Chained serial ports
23 September 1999 Cactus Workshop5 A Peek Inside Roadrunner
23 September 1999 Cactus Workshop6 Roadrunner System Software Redhat Linux 5.2 (6.0) SMP Linux kernel MPI (Argonne’s MPICH 1.1.2) Portland Group Compiler Suite Myricom GM Drivers (1.086) and MPICH-GM ( ) Portable Batch Scheduler (PBS)
HPF Parallel Fortran for clusters F90 Parallel SMP Fortran 90 F77 Parallel SMP Fortran 77 CC Parallel SMP C/C++ DBG symbolic debugger PROF performance profiler
23 September 1999 Cactus Workshop8 Roadrunner System Libraries BLAS LAPACK ScaLAPACK Petsc FFTw Cactus Globus Grid Infrastructure
23 September 1999 Cactus Workshop9 Parallel Job Scheduling Node-based resource allocation Job monitoring and auditing Resource reservations
23 September 1999 Cactus Workshop10 Computational Grid National Technology Grid Globus Infrastructure Authentication Security Heterogenous environments Distributed applications Resource monitoring
23 September 1999 Cactus Workshop11 For more information: Contact Information To Apply for an Account
23 September 1999 Cactus Workshop12 Easy to Use rr% ssh -l username rr.alliance.unm.edu rr% mpicc -o prog helloWorld.c rr% qsub -I -l nodes=64 r021 % mpirun prog
23 September 1999 Cactus Workshop13 Job Monitoring with PBS
23 September 1999 Cactus Workshop14 Roadrunner Performance
23 September 1999 Cactus Workshop15 Roadrunner Ping-Pong Time
23 September 1999 Cactus Workshop16 Roadrunner Bandwidth
23 September 1999 Cactus Workshop17 Applications on RR MILC QCD (Bob Sugar, Steve Gottlieb) A body of high performance research software for doing SU(3) and SU(2) lattice gauge theory on several different (MIMD) parallel computers in current use ARPI3D (Dan Weber) 3-D numerical weather prediction model to simulate the rise of a moist warm bubble in a standard atmosphere AS-PCG (Danesh Tafti) 2-D Navier Stokes solver BEAVIS (Marc Ingber, Andrea Mammoli) 1994 Gordon Bell Prize-winning dynamic simulation code for particle-laden, viscous suspensions
23 September 1999 Cactus Workshop18 Applications: CACTUS 3D Numerical Relativity Toolkit for Computational Astrophysics (Courtesy of Gabrielle Allen and Ed Seidel) Roadrunner performance under the Cactus application benchmark shows near-perfect scalability.
23 September 1999 Cactus Workshop19 CACTUS Performance (Graphs - courtesy of O. Wehrens)
23 September 1999 Cactus Workshop20 CACTUS Scaling (Graphs - courtesy of O. Wehrens)
23 September 1999 Cactus Workshop21 CACTUS: The evolution of a pure gravitational wave A subcritical Brill wave (Amplitude=4.5), showing the Newman-Penrose Quantity as volume rendered 'glowing clouds'. The lapse function is shown as a height field in the bottom part of the picture. (Courtesy of Werner Benger)
23 September 1999 Cactus Workshop22 TeraScale computing “A SuperCluster in every lab” Efficient use of SMP nodes Scalable interconnection networks High-performance I/O Advanced programming models for hybrid (SMP and Grid-based) clusters
23 September 1999 Cactus Workshop23 Exercises Login to Roadrunner % ssh roadrunner.alliance.unm.edu -l cactusXX Request interactive session % qsub -I -l nodes=n Create Myrinet Node-Configuration File % gmpiconf $PBS_NODEFILE (to use 1 CPU per node) % gmpiconf2 $PBS_NODEFILE (to use 2 CPUs per node) Run Job % mpirun cactus_wave wavetoyf90.par (on 1 CPU per node) % mpirun -np 2*n cactus wavetoyf90.par (on 2 CPUs per node)
23 September 1999 Cactus Workshop24 Compiling Cactus: WaveToy Login to Roadrunner % ssh roadrunner.alliance.unm.edu -l cactusXX.cshrc #MPI (season to taste) #setenv MPIHOME /usr/parallel/mpich-eth.pgi #ethernet/Portland Grp #setenv MPIHOME /usr/parallel/mpich-eth.gnu #ethernet/GNU setenv MPIHOME /usr/parallel/mpich-gm.pgi #myrinet/Portland Grp #setenv MPIHOME /usr/parallel/mpich-gm.gnu #myrinet/GNU if you modify.cshrc make sure to source.cshrc; rehash echo $MPIHOME #should read /usr/parallel/mpich-gm.pgi
23 September 1999 Cactus Workshop25 Compiling Cactus: WaveToy Create WaveToy configuration % gmake wave F90=pgf90 MPI=MPICH MPICH_DIR=$MPIHOME Compile WaveToy % gmake wave % cd ~/Cactus/exe Copy all.par files into this directory (not necessary) % foreach file (`find ~/Cactus -name “*.par” -print`) foreach> cp $file. foreach> end
23 September 1999 Cactus Workshop26 Running WaveToy on RoadRunner Run wave interactively on RoadRunner PBS job scheduler: request interactive nodes % qsub -I -l nodes=4 (note: -I = interactive) Note: prompt changes from a front-end node name like exe] to an compute-node name for e.g. exe] Note: you should compile on the front-end and run on the compute nodes (open 2 windows) PBS job scheduler: setup a node-configuration file % gmpiconf $PBS_NODEFILE Note: cat ~/.gmpi/conf-xxxx.rr will contain specific node names Run the job from ~/Cactus/exe % mpirun cactus_wave wavetoyf90.par % mpirun -np 2 cactus_wave wavetoyf90.par
23 September 1999 Cactus Workshop27 Running WaveToy on RoadRunner Run wave batch on RoadRunner PBS script: (call it, for e.g.) wave.pbs #PBS -l nodes=4 # pbs script for wavetoy: 1 processor per node gmpiconf $PBS_NODEFILE mpirun ~/Cactus/exe/cactus_wave wavetoyf90.par #(use full path) % Submit batch PBS job % qsub wave.pbs % (PBS responds with your job_id #) % qstat -a (check status of your job) % qstat -n (check status, and see the nodes you are on) % qdel (remove job from queue) % dsh killall cactus_wave (if things hang, mess up, etc…)