IBM Cell Processor
Ryan Carlson, Yannick Lanner-Cusin, & Cyrus Stoller
CS87: Parallel and Distributed Computing
Outline
  Architectural overview
  Definition of Cell Processor
  Shared vs Private Memory
  System Design
  Scalability
Architectural Features
  Power Processor Element (PPE)
    – 64-bit
  Synergistic Processing Elements (SPE)
    – 32-bit
    – 256 KB of on-chip RAM
  Element Interconnect Bus (EIB)
  I/O Interface
  RAM
Synergistic Processing Elements
  Vector Instructions
  PPE delegates any parallelizable work to the SPEs
  Local Store
Stream Processing
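
As a rough illustration of the stream-processing model named on this slide, the sketch below applies one small kernel to every block of an input stream. It is plain Python, not Cell SDK code; the names `stream_blocks`, `scale_kernel`, and `run_stream`, and the block size of 4, are assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def stream_blocks(data: List[float], block_size: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size blocks of the input (the 'stream')."""
    for start in range(0, len(data), block_size):
        yield data[start:start + block_size]


def scale_kernel(block: List[float]) -> List[float]:
    """Hypothetical kernel: the same operation applied to every element."""
    return [2.0 * x for x in block]


def run_stream(blocks: Iterable[List[float]],
               kernel: Callable[[List[float]], List[float]]) -> List[float]:
    """Apply the kernel to each block in turn and collect the output."""
    out: List[float] = []
    for block in blocks:
        out.extend(kernel(block))
    return out


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(run_stream(stream_blocks(samples, block_size=4), scale_kernel))
```

The kernel never sees the whole data set, only the block currently in flight, which is what lets a processing element with a small private memory work through a large input.
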
Vector Instructions
Advantages of Vector Instructions
  Fewer instructions
  Fewer branch instructions, so fewer mispredictions
  Access memory a block at a time
  Fewer memory accesses = faster processing
  Example: convert an image to grayscale
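
The grayscale example can be sketched as follows. The scalar version handles one pixel per loop iteration; the "vector" version handles four pixels per iteration, imitating a 4-lane SIMD operation. Plain Python only models the idea: the lane width of 4, the integer weights (77, 150, 29 out of 256), and the function names are assumptions, not SPE intrinsics.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255

LANES = 4  # assumed vector width: four pixels per "instruction"


def gray_scalar(pixels: List[Pixel]) -> List[int]:
    """One pixel per loop iteration."""
    return [(77 * r + 150 * g + 29 * b) >> 8 for r, g, b in pixels]


def gray_vector(pixels: List[Pixel]) -> List[int]:
    """LANES pixels per loop iteration, so about 1/LANES as many iterations."""
    out: List[int] = []
    for start in range(0, len(pixels), LANES):
        block = pixels[start:start + LANES]
        # Split the block into separate channel "registers".
        reds = [p[0] for p in block]
        greens = [p[1] for p in block]
        blues = [p[2] for p in block]
        # The list operations below stand in for vector instructions
        # that act on all lanes at once.
        weighted = [77 * r + 150 * g + 29 * b
                    for r, g, b in zip(reds, greens, blues)]
        out.extend(w >> 8 for w in weighted)
    return out
```

Both functions return the same values; the difference is that the second needs one loop test per block of four pixels instead of one per pixel, and reads memory a block at a time.
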
Disadvantages of Vector Processors
  More expensive to produce
  Increased code complexity
  May be difficult to port between systems
  Increased power consumption
  Wasted resources when only scalar instructions are used
Definition of Cell Processor
  Microprocessor designed to optimize cooperation between ordinary desktop processors and more specialized high-performance processors (like a GPU)
  Performance and hardware simplicity prioritized over programming convenience
Shared vs Private Memory
  Shared Memory
    – RAM
  Private Memory
    – Local Stores on SPEs
    – Cache on PPE
  Pretty simple
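
One way to picture this memory model: the PPE owns a large array in shared RAM, copies pieces small enough for a 256 KB local store to an SPE, and collects the results. The sketch below simulates that in plain Python. The 4-byte element size, the choice to use only half of the local store for input data, and the `spe_work` and `ppe_run` names are illustrative assumptions; real code would use DMA transfers and run SPEs concurrently, neither of which is modeled here.

```python
from typing import List

LOCAL_STORE_BYTES = 256 * 1024  # per-SPE local store size
ELEMENT_BYTES = 4               # assumption: 4-byte elements
# Assumption: only half the local store holds input data; the rest is
# left for code and results.
CHUNK_ELEMENTS = LOCAL_STORE_BYTES // ELEMENT_BYTES // 2


def spe_work(local_store: List[int]) -> List[int]:
    """Hypothetical SPE task: touches only its private copy of the data."""
    return [x + 1 for x in local_store]


def ppe_run(shared_ram: List[int],
            chunk_elements: int = CHUNK_ELEMENTS) -> List[int]:
    """PPE side: split shared RAM into chunks, hand each to an SPE, gather."""
    result: List[int] = []
    for start in range(0, len(shared_ram), chunk_elements):
        # Copy from shared RAM into a private buffer (stands in for DMA in).
        local_copy = shared_ram[start:start + chunk_elements]
        # Run the task and copy results back (stands in for DMA out).
        result.extend(spe_work(local_copy))
    return result
```

Because `spe_work` only ever receives a copy, nothing it does can change shared RAM directly; results come back only through the explicit copy-out step.
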
System Design
  Vector instruction optimization takes planning (i.e., like a shopping list)
  Gaming (PS3)
  Cryptography, graphics transform and lighting, physics, fast Fourier transforms (FFT), matrix operations
Scalability
  Harder to optimize programs
  Easier to optimize hardware
Why use a Cell processor?
  General-purpose processor
  Designed not to have any slow components
  Even though you cannot vectorize every instruction, the SPEs are still useful
  Worst case: just as fast as an ordinary desktop processor