Cell Broadband Engine Architecture. Bardia Mahjour, ENCM 515, March 2007.

Agenda
- Introduction
- History
- Applications
- Architecture
- Features
- Some Statistics
- Programming Model
- CBEA as DSP
- Comparison with TigerSHARC
- Conclusion

Introduction
- Single-chip multiprocessor
- 9 processors built into a single die
Needs that arose in areas such as:
- Cryptography
- Graphics transformations and lighting
- Physics
- Fast Fourier transforms (FFT)
- Matrix operations
- Compute-intensive scientific tasks
Goals:
- Power-efficient
- Cost-effective
- High-performance processing
- Wide range of applications, including game consoles
Compilers: IBM XL family of compilers (XL C/C++)
[Figure: Cell die photo, courtesy of Thomas Way, IBM Burlington]

History
- A joint venture by Sony, Toshiba, and IBM (STI)
- Official design phase started in March 2001
- The three companies spent 4 years and US$400M designing and developing Cell
- First commercial use: Sony's PlayStation 3, November 2006

Applications
- Console video games
  - PlayStation 3
- Home cinema
  - Toshiba's HDTV
- Embedded applications
  - Medical imaging, aerospace, telecommunications, defense, etc.
  - Mercury Computer Systems, Inc.
- Supercomputing
  - Roadrunner
- Blade servers

Architecture
- PowerPC Processor Element (PPE): 64-bit PowerPC RISC core (can run an OS)
- Synergistic Processor Elements (SPEs): each SPE works as a DSP-style processor, and the Cell has 8 of them
- Element Interconnect Bus (EIB)
- Memory Interface Controller (MIC)
- Cell Broadband Engine Interface (BEI)

Features
The PPE has:
- a pipeline 10 levels deep
- a VMX vector arithmetic unit
Each SPE has:
- a 128 x 128-bit register file (128 registers, each 128 bits wide)
- a floating-point unit
- two fixed-point units
- a Local Store
- a DMA controller

Some Statistics
- Observed clock speed: > 4 GHz
- Peak performance (single precision): > 256 GFLOPS
- Peak performance (double precision): > 26 GFLOPS
- Local storage size per SPU: 256 KB
- Area: 221 mm²
- Technology: 90 nm SOI
- Total number of transistors: 234M

Programming Model
- Function Offload Model
- Device Extension Model
- Computational Acceleration Model
- Streaming Models
- Shared-Memory Multiprocessor Model
- Asymmetric Thread Runtime Model
- User-Mode Thread Model
- SPE Overlay

Function Offload Model
- Remote Procedure Call (RPC)

Function Offload Model

/* file hello.idl */
interface greeting {
    [sync] idl_id_t hello([in] int nbytes, [in, size_is(nbytes)] char message[]);
}

/* file hello.c (PPE side) */
#include <string.h>   /* strlen */
/* hello() is declared by the stub generated from hello.idl (header name not shown here) */
int main(void)
{
    char *str = "Hi, from the Cell!";
    hello(strlen(str), str);   /* looks like a local call, runs on an SPE */
    return 0;
}

/* file spu_hello.c (SPE side) */
#include <stdio.h>    /* printf */
idl_id_t hello(int nbytes, char msg[])
{
    printf("SPE: %s\n", msg);
    return 0;
}

Thread Runtime Model

speid_t spe_create_thread(spe_gid_t gid, spe_program_handle_t *spe_program_handle, void *argp, void *envp, unsigned long mask, int flags);

Example PPE code:

#include <libspe.h>
#define NUM_SPES 8
extern spe_program_handle_t spe_code;   /* SPE program embedded in the executable */
int main(void)
{
    speid_t spe_ids[NUM_SPES];
    spe_gid_t gid = 0;                  /* 0: use a default thread group */
    int i;
    /* Start one SPE thread per SPE; a mask of -1 allows any SPE. */
    for (i = 0; i < NUM_SPES; i++)
        spe_ids[i] = spe_create_thread(gid, &spe_code, NULL, NULL, -1, 0);
    return 0;
}

CBEA as DSP
- Strictly speaking, Cell is a microprocessor
- Designed to bridge the gap between conventional and special-purpose processors
- Handles heavy digital signal processing workloads (3D graphics, 48 MPEG-2 channels, etc.)
- Meets most of the requirements of an ideal DSP processor

Comparison with TigerSHARC
- Size requirement
- Power consumption and heat generation
- Supports floating-point ops in hardware
- Bandwidth and data width
- Avoids resource dependencies
- Scalability
- Ease of programming

Conclusion
The Cell Broadband Engine Architecture is an extremely powerful, scalable, and fast processor design. It is not purely a digital signal processor; however, the wide range of applications it is suited for includes DSP. Furthermore, many of the requirements of DSP applications were the rationale behind CBEA's design and architectural decisions.
