1
Programming on IBM Cell Triblade
Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward
University of Minnesota
April 1, 2009
2
Rayleigh–Taylor instability
An instability of the interface between two fluids of different densities, which occurs when the lighter fluid pushes the heavier fluid.
The R-T instability simulation is implemented with the multi-fluid Piecewise-Parabolic Method (PPM).
The program is written in Fortran.
3
TriBlade
▫Two QS22 blades, each with 2 PowerXCell 8i CPUs
▫LS21 blade with two dual-core AMD Opterons
▫16GB memory for LS21 and 8GB memory for QS22
5
LCSE Cell Cluster
6 Triblades
4 QS22 Cell blades
2 QS20 Cell blades
4 AMD Quadcore Systems
7
Login instructions
Account credentials should be in your email.
Guest account: lcse / lcse$ncsa!
Login steps:
▫SSH to frodo.lcse.umn.edu
▫Once logged in to frodo, SSH to an assigned Cell Processor host
AMD – rra001a ~ rra006a
Cell – rra001b / rra001c ~ rra006b / rra006c
8
Software available
Cell SDK 3.1
OpenMPI 1.3
DaCS Fortran bindings
Compilers
▫AMD: gfortran, gcc 4.1.2
▫PPU: ppuxlf, ppu-gcc
▫SPU: spuxlf, spu-gcc
Example code is available on /mnt/scratch/NCSA_Example
9
Compilation and Execution
Compile on AMD node:
▫make ppm4f-x86
Compile on Cell node:
▫make ppm4f-ppu
Execute on AMD node:
▫./ppm4f-x86
10
Triblade programming paradigm
Three levels of parallelism: within-Cell, within-node, node-to-node
Compute-communication overlap at each level: DMA (within-Cell), DaCS (within-node), MPI (node-to-node)
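Compute-communication overlap is commonly achieved with double buffering: the next block of data is fetched while the current one is being processed. The following is a minimal sketch in portable C, not the code from this presentation. The names fetch_chunk, sum_double_buffered and CHUNK are hypothetical, and memcpy stands in for an asynchronous DMA get, so no real overlap happens here; only the buffer rotation is illustrated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* elements per transfer; illustrative size only */

/* Stand-in for an asynchronous DMA get. A real SPU code would start the
   transfer here and wait for it later; memcpy completes immediately. */
static void fetch_chunk(double *local, const double *remote, size_t i)
{
    memcpy(local, remote + i * CHUNK, CHUNK * sizeof(double));
}

/* Sum n elements of `remote`, requesting chunk i+1 before computing on
   chunk i. n must be a multiple of CHUNK. */
double sum_double_buffered(const double *remote, size_t n)
{
    double buf[2][CHUNK];
    size_t nchunks = n / CHUNK;
    double total = 0.0;

    assert(n % CHUNK == 0);
    if (nchunks == 0)
        return 0.0;

    fetch_chunk(buf[0], remote, 0);                /* prime the first buffer */
    for (size_t i = 0; i < nchunks; i++) {
        size_t cur = i % 2;
        size_t nxt = 1 - cur;
        if (i + 1 < nchunks)
            fetch_chunk(buf[nxt], remote, i + 1);  /* would run during compute */
        for (size_t k = 0; k < CHUNK; k++)         /* compute on current buffer */
            total += buf[cur][k];
    }
    return total;
}
```

The same rotation applies at the other two levels, with DaCS or MPI transfers in place of the DMA.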
11
Programming for IBM Cell Tri-blade
Single code for Roadrunner and non-RR systems
◦ Using lots of #ifdef, #if, #endif…
◦ Using the preprocessor to generate three codes
Minimize the manual translation for SPU code
◦ Using a Fortran to Cell C translator, the tedious portions of the SPU code can be translated automatically.
Fortran codes for PPU and AMD
◦ Fortran binding programs for C intrinsic libraries
Keep memory footprint small
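As an illustration of how one source can yield three codes through the preprocessor, here is a minimal sketch. It is written in C for brevity, although the presentation's source is Fortran; the conditional directives work the same way. The macro names TARGET_AMD, TARGET_PPU and TARGET_SPU and the function target_role are hypothetical, not taken from the presentation.

```c
#include <stdio.h>

/* Hypothetical target macros: exactly one is expected on the compiler
   command line, e.g. -DTARGET_SPU. The names are illustrative only. */
#if !defined(TARGET_AMD) && !defined(TARGET_PPU) && !defined(TARGET_SPU)
#define TARGET_AMD 1   /* default, so the sketch builds without any flag */
#endif

/* Return a description of the code generated for the selected target. */
const char *target_role(void)
{
#if defined(TARGET_SPU)
    return "SPU code";
#elif defined(TARGET_PPU)
    return "PPU code";
#else
    return "AMD code";
#endif
}

/* Print which of the three codes this translation unit became. */
void print_target(void)
{
    printf("built as: %s\n", target_role());
}
```

Compiling the same file three times with a different -D flag each time produces three different codes from a single source.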
12
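Build flow (diagram)
Single Source Code → Preprocessor → PPU Fortran code, SPU Fortran code, AMD Fortran code
SPU Fortran code → Translation → SPU C code → SPU C Compiler → SPU Executable
PPU Fortran code → PPU Fortran Compiler → PPU Executable
AMD Fortran code → GNU Fortran Compiler → AMD Executable
Other diagram labels: Fortran Binding Programs; Embedded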
13
Division of labor
▫Define jobs for AMD, PPU and SPU clearly
AMD: I/O, MPI, relay data to Cell…
PPU: Transfer data, manage SPUs
SPU: Just compute
14
Items to take care of
▫Three codes for three different ISAs
▫Different endian-ness between PPU and AMD
Need to do byte-swapping
▫64bit/32bit conversion
SPU supports 32bit addresses only, but DaCS requires 64bit address mode
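The byte-swapping and address-width items can be sketched in portable C as follows. This is an illustrative sketch, not code from the presentation; the function names swap32, swap64, swap_double and widen_address are hypothetical. It assumes 64-bit IEEE doubles and that a 32-bit address is converted to a 64-bit one by zero-extension.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Reverse the byte order of a 64-bit word by swapping each half
   and exchanging the halves. */
uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)v) << 32) |
           (uint64_t)swap32((uint32_t)(v >> 32));
}

/* Byte-swap a double by treating its bits as a 64-bit integer.
   memcpy avoids aliasing problems. */
double swap_double(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = swap64(bits);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Zero-extend a 32-bit address into a 64-bit address field
   (assumption: the upper 32 bits are simply zero). */
uint64_t widen_address(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

Data relayed between the little-endian AMD side and the big-endian PPU side would pass through a swap like this once, on one side only.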
15
Translator
Fortran to C with Cell extensions
Needs directives
Built with ANTLR
Handles:
▫Vector and scalar loops
▫DMAs (Including List DMAs)
▫Variable declarations
▫Conditional vector moves
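A conditional vector move replaces a per-element branch with a compare that produces a mask, followed by a bitwise select. The sketch below emulates this in portable C on four 32-bit lanes. It is not the translator's output, and the names VLEN, cmpgt, select_lanes and vector_max are hypothetical; on the SPU, the compare and the select would each map to a single vector operation.

```c
#include <stdint.h>

#define VLEN 4  /* four 32-bit lanes, as in a 128-bit vector register */

/* Lane-wise compare: all-ones mask where a > b, zero elsewhere. */
static void cmpgt(const int32_t a[VLEN], const int32_t b[VLEN],
                  uint32_t m[VLEN])
{
    for (int i = 0; i < VLEN; i++)
        m[i] = (a[i] > b[i]) ? 0xFFFFFFFFu : 0u;
}

/* Lane-wise select: take bits of y where the mask is set, x elsewhere. */
static void select_lanes(const int32_t x[VLEN], const int32_t y[VLEN],
                         const uint32_t m[VLEN], int32_t out[VLEN])
{
    for (int i = 0; i < VLEN; i++)
        out[i] = (int32_t)(((uint32_t)x[i] & ~m[i]) |
                           ((uint32_t)y[i] & m[i]));
}

/* Branch-free form of: out[i] = (a[i] > b[i]) ? a[i] : b[i] */
void vector_max(const int32_t a[VLEN], const int32_t b[VLEN],
                int32_t out[VLEN])
{
    uint32_t m[VLEN];
    cmpgt(a, b, m);
    select_lanes(b, a, m, out);
}
```

Because every lane executes the same instructions, the element-wise condition no longer prevents the loop from being vectorized.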
16
References
Woodward, P. R., J. Jayaraj, P.-H. Lin, and P.-C. Yew, “Moving Scientific Codes to Multicore Microprocessor CPUs,” Computing in Science & Engineering, special issue on novel architectures, Nov. 2008, p. 16-25. Also available at www.lcse.umn.edu/CiSE.
Woodward, P. R., J. Jayaraj, P.-H. Lin, and D. Porter, “Programming Techniques for Moving Scientific Simulation Codes to Roadrunner,” tutorial given 3/12/08 at Los Alamos, link available at www.lanl.gov/roadrunner/rrtechnicalseminars2008.
Woodward, P. R., J. Jayaraj, P.-H. Lin, and W. Dai, “First Experience of Compressible Gas Dynamics Simulation on the Los Alamos Roadrunner Machine,” submitted to Concurrency and Computation Practice and Experience, preprint available at www.lcse.umn.edu/RR-docs.
http://www.lcse.umn.edu/NCSA_Workshop/