An Introduction to Princeton’s New Computing Resources: IBM Blue Gene, SGI Altix, and Dell Beowulf Cluster
PICASso Mini-Course
October 18, 2006
Curt Hillegas
Introduction
– SGI Altix – Hecate
– IBM Blue Gene/L – Orangena
– Dell Beowulf Cluster – Della
– Storage
– Other resources
TIGRESS High Performance Computing Center
– Terascale Infrastructure for Groundbreaking Research in Engineering and Science
Partnerships
– Princeton Institute for Computational Science and Engineering (PICSciE)
– Office of Information Technology (OIT)
– School of Engineering and Applied Science (SEAS)
– Lewis-Sigler Institute for Integrative Genomics
– Astrophysical Sciences
– Princeton Plasma Physics Laboratory (PPPL)
SGI Altix – Hecate
– GHz Itanium2 processors
– 256 GB RAM (4 GB per processor)
– NUMAlink interconnect
– 5 TB local disk
– 360 GFlops
SGI Altix – Itanium2
– 4 MB L3 cache
– 256 KB L2 cache
– 32 KB L1 cache
SGI Altix – NUMAlink
– NUMAlink – GB/s per direction
– Physical latency – 28 ns
– MPI latency – 1 µs
– Up to 256 processors
SGI Altix – Software
– SLES 9 with SGI ProPack – sn2 kernel
– Intel Fortran compilers v8.1
– Intel C/C++ compilers v8.1
– Intel Math Kernel Libraries v7
– Intel VTune
– Torque/Maui
– OpenMP (see the sketch below)
– MPT (SGI MPI libraries)
– fftw-2.1.5, fftw
– hdf4, hdf5
– ncarg
– petsc
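Since Hecate is a single shared-memory system and OpenMP is part of the software stack listed above, a minimal OpenMP sketch in C follows. The build command in the comment assumes the Intel C compiler v8.1 listed on this slide; the flag and file names are illustrative, not Hecate-specific instructions.

```c
/* Minimal OpenMP "hello" for a shared-memory system such as the Altix.
 * Illustrative build command with the Intel compiler listed above:
 *   icc -openmp omp_hello.c -o omp_hello
 * The thread count is controlled with the OMP_NUM_THREADS environment variable. */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    #pragma omp parallel
    {
        /* Each thread prints its ID and the total thread count. */
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
```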
IBM Blue Gene/L – Orangena
– 700 MHz PowerPC 440 processors
– 1024 nodes
– 512 MB RAM per node (256 MB per processor)
– 5 interconnects, including a 3D torus
– 8 TB local disk
– 5.7 TFlops
IBM Blue Gene/L – Full system architecture
– 1024 nodes
  – 2 PowerPC 440 cpus
  – 512 MB RAM
  – 1 rack
  – 35 kVA
  – 100 kBTU/hr
– 2 racks of supporting servers and disks
  – Service node
  – Front end node
  – 8 storage nodes
  – 8 TB GPFS storage
  – 1 Cisco switch
IBM Blue Gene/L
IBM Blue Gene/L – Networks
– 3D torus network
– Collective (tree) network
– Barrier network
– Functional network
– Service network
IBM Blue Gene/L – Software
– LoadLeveler (coming soon)
– mpich (see the MPI sketch below)
– XL Fortran Advanced Edition V9.1
  – mpxlf, mpf90, mpf95
– XL C/C++ Advanced Edition V7.0
  – mpcc, mpxlc, mpCC
– fftw and fftw
– hdf
– netcdf
– BLAS, LAPACK, ScaLAPACK
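Because Blue Gene/L compute nodes run distributed-memory MPI jobs, a minimal MPI example in C is sketched below. The mpcc wrapper in the comment is the cross-compiler listed above; the job-launch step is site-specific and only hinted at, not Orangena's actual submission syntax.

```c
/* Minimal MPI "hello" for a distributed-memory machine such as Blue Gene/L.
 * Illustrative build command with the cross-compiler wrapper listed above:
 *   mpcc mpi_hello.c -o mpi_hello
 * On Orangena the executable would be launched through the scheduler;
 * the exact submission syntax is not shown here. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```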
IBM Blue Gene/L – More…
Dell Beowulf Cluster – Della
– GHz Xeon processors
– 256 nodes
– 2 TB RAM (4 GB per processor)
– Gigabit Ethernet
– 64 nodes connected to InfiniBand
– 3 TB local disk
– TFlops
Dell Beowulf Cluster – Interconnects
– All nodes connected with Gigabit Ethernet
  – 1 Gb/s
  – MPI latency ~30 µs
– 64 nodes connected with InfiniBand
  – 10 Gb/s
  – MPI latency ~5 µs
(A ping-pong measurement sketch follows below.)
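The small-message latencies quoted above are the kind of figure a ping-pong microbenchmark reports. The sketch below assumes a generic MPI installation (for example the OpenMPI build on Della) and a two-process launch; it times round trips between ranks 0 and 1 and halves the result to estimate one-way latency.

```c
/* Simple MPI ping-pong between ranks 0 and 1.
 * Run with exactly 2 processes, e.g. "mpirun -np 2 ./pingpong"
 * (launch syntax varies by site and scheduler). */
#include <mpi.h>
#include <stdio.h>

#define REPS 1000

int main(int argc, char **argv)
{
    int rank;
    char byte = 0;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        /* Half the average round-trip time approximates one-way latency. */
        printf("one-way latency: %.2f us\n", (t1 - t0) / (2.0 * REPS) * 1e6);

    MPI_Finalize();
    return 0;
}
```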
Dell Beowulf Cluster – Software
– Elders RHEL 4 based image – ELsmp kernel
– Intel compilers
– Torque/Maui
– OpenMPI-1.1
– fftw-2.1.5, fftw (see the FFTW sketch below)
– R
– Matlab R2006a
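Both Della and Hecate list fftw-2.1.5, so a minimal sketch of the FFTW 2.x interface follows. The calls shown (fftw_create_plan, fftw_one) are the standard FFTW 2 API; the include path and link flag (typically -lfftw) are assumptions about the local installation.

```c
/* Minimal 1-D forward transform with the FFTW 2.x API (fftw-2.1.5 is listed above).
 * Illustrative build command: cc fft_demo.c -lfftw -o fft_demo */
#include <fftw.h>
#include <stdio.h>

#define N 8

int main(void)
{
    fftw_complex in[N], out[N];

    /* Fill the input with a simple real-valued ramp. */
    for (int i = 0; i < N; i++) {
        in[i].re = (fftw_real)i;
        in[i].im = 0.0;
    }

    /* Plan and execute a forward transform of length N. */
    fftw_plan plan = fftw_create_plan(N, FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_one(plan, in, out);

    for (int i = 0; i < N; i++)
        printf("out[%d] = %f + %fi\n", i, out[i].re, out[i].im);

    fftw_destroy_plan(plan);
    return 0;
}
```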
Dell Beowulf Cluster – More…
Storage
– 38 TB delivered GPFS filesystem
– At least 200 MB/s
– Installation at the end of this month
– Fees to recover half the cost
Getting Access
– 1–3 page proposal
– Scientific background and merit
– Resource requirements
  – # concurrent cpus
  – Total cpu hours
  – Memory per process / total memory
  – Disk space
– A few references
Other resources
– adrOIT
– Condor
– Programming help
Questions