1 The CELL Processor
Prepared and presented by: S.H.R. Ahmadi
Class Presentation of Custom DSP Implementation Course
ECE Department, University of Tehran, May 2005
This is a class presentation. All data are copyrights of their respective authors as listed in the references and have been used here for educational purposes only.

3 Notice
Photos and diagrams are proprietary to IBM.
The Cell processor, Power and PowerPC are trademarks of IBM.
PlayStation™ 3 is a trademark of Sony Computer Entertainment Inc. (SCEI).
FlexIO™ and XDR™ are Rambus Inc. trademarks.
All data are gathered from public sources, which are listed in the “References”.

4 Outline
Development History
Specifications & Architecture
Applications
Software Aspects
Marketing & NEWS
References

5 Development History
Completely secret and under cover
March 12, 2001: “Cell” announced
–“Supercomputer-on-a-chip” from Sony, Toshiba, IBM
–Capable of teraflops computation speed
–$400M investment over 5 years
March 2002: Okamoto speech
–2005 target date
–First glimpse of the Cell idea: 1000x figure
August 2002: Cell design finished
–Near “tape out”
–“4-16 general-purpose processor cores per chip”

6 Development History
November 2002: Rambus licenses “Yellowstone” technology to Toshiba
–Yellowstone: 3.2 GHz memory
January 2003: Rambus licenses Yellowstone/Redwood technology to Sony
–Redwood: parallel interface between chips
January 2003
–Cell at 4 GHz, 1024-bit bus, 64 MB memory, PowerPC
–At least 4 patents in 2002 & 2003 on:
  hardware & software architecture
  processing modules
  memory protection
  data synchronization

7 Development History
2004
–Marketing NEWS
–Some general technical data
May 2004
–CELL-based workstation will be made
–Application: digital content creation
February 2005
–Formal introduction at ISSCC’05
–Extensive media coverage
May 2005
–Sony’s PlayStation 3 formal announcement

9 Specifications & Architecture
Broadband Processor Architecture
–Optimized for broadband media and 3D graphics
90-nm PD-SOI process, 8 metal layers (copper)
234 million transistors in ~235 mm²
4.6 GHz operation at 1.3 V
85 °C operating temperature with heat sink
Thermal protection schemes
2965 core connections / ~1300 pins
256 GFlops single-precision FP, 26 GFlops double-precision FP
Very high off-chip communication bandwidth
4 x 128-bit internal bus (ring), 96 bytes/cycle

10 Specifications & Architecture
BPA (Cell) design features:
Multi-core architecture
Based on the Power Architecture
–Code compatibility
Coherent and cooperative off-load processing
Enhanced SIMD architecture
Improved power efficiency
“Absolute timers” allow “hard” real-time data processing
–Good estimation of execution time is possible
Big-endian memory
–Supports Apple, but not Intel
Isolation mechanism for secure code execution

11 Specifications & Architecture
BPA (Cell) design justification:
Multi-core non-homogeneous architecture
–Better power
3-level model of memory
–Main memory, local store, registers
–Better memory
Large register file & software-controlled branching
–Allows deeper pipelines
–Better frequency

12 FlexIO

15 Specifications & Architecture
CPU (Power Processor Element):
64-bit Power Architecture™ with VMX (SIMD)
In-order, 2-way hardware multi-threading
–Simple design → improvements possible
–Predictable execution times
Coherent load/store cache (32 KB L1, 512 KB L2)
Redesigned for use in the Cell processor
Serves as:
–Multi-OS general-purpose processor (GPP)
–Control unit for the SPEs

16 Specifications & Architecture
SPE (Synergistic Processing Element):
Dual-issue, 128-bit 4-way SIMD
–Vector processing
4 integer units + 4 FP units
8-, 16-, 32-bit integer + 32-, 64-bit FP
128 x 128-bit registers
256 KB local-store memory (specially designed)
–Caches are not used
–Data & instructions reside in the local store (LS)
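The 4-way SIMD organisation on this slide can be illustrated with a small sketch. It models one 128-bit register as four 32-bit lanes and applies a multiply-add across all lanes at once. The function name `simd_madd` and the list-based "register" are illustrative assumptions; this is not the SPE instruction set.

```python
from typing import List

LANES = 4  # 128-bit register / 32-bit single-precision elements
REGISTER_FILE_BYTES = 128 * 128 // 8  # 128 registers of 128 bits each


def simd_madd(a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Lane-wise a*b + c over one 4-lane register (illustrative model only)."""
    if not (len(a) == len(b) == len(c) == LANES):
        raise ValueError("each operand must hold exactly 4 lanes")
    return [x * y + z for x, y, z in zip(a, b, c)]


result = simd_madd([1.0, 2.0, 3.0, 4.0],
                   [2.0, 2.0, 2.0, 2.0],
                   [0.5, 0.5, 0.5, 0.5])

# One such operation performs 4 multiplies and 4 adds.
flops_per_operation = 2 * LANES
```

Under this model a single operation does eight floating-point operations, and the whole register file occupies 2 KB.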

17 Specifications & Architecture
SPE:
Coherent & cooperative off-load engines for the CPU
–Work independently
–Not directly tied to the CPU as a co-processor
Dedicated DMA engine
–Moves data: CPU ↔ SPE or SPE ↔ SPE
–In parallel or in series with other SPEs
Dynamically configurable to protect resources
Can perform security algorithms
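Because an SPE has no cache and works only on data held in its local store, a large data set has to be streamed through in chunks by DMA. The sketch below models that pattern in plain Python. The function `process_via_local_store`, the 4-byte element size, and the list slices standing in for DMA transfers are all assumptions for illustration, not a real Cell API.

```python
LOCAL_STORE_BYTES = 256 * 1024  # local-store size from the SPE slide
ELEMENT_BYTES = 4               # assumption: 32-bit elements


def process_via_local_store(main_memory, kernel, chunk_elems):
    """Stream main_memory through a bounded buffer, one chunk at a time."""
    if chunk_elems <= 0:
        raise ValueError("chunk_elems must be positive")
    if chunk_elems * ELEMENT_BYTES > LOCAL_STORE_BYTES:
        raise ValueError("chunk does not fit in the local store")
    out = [None] * len(main_memory)
    for start in range(0, len(main_memory), chunk_elems):
        local = main_memory[start:start + chunk_elems]  # models a DMA "get"
        local = [kernel(x) for x in local]              # compute on local data only
        out[start:start + len(local)] = local           # models a DMA "put"
    return out


data = list(range(10))
doubled = process_via_local_store(data, lambda x: 2 * x, chunk_elems=4)
```

In practice the usable buffer is smaller than 256 KB, since the local store also holds the program's instructions.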

18 Specifications & Architecture
8 SPE blocks, each with 32 GFlops or 32 Gops
→ Monstrous processing power
→ Needs to be fed accordingly
→ Solution:
–EIB
–High-speed memory (Dual XDR™)
–High-speed I/O (FlexIO™)
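The per-SPE and whole-chip figures can be cross-checked with simple arithmetic. The sketch below assumes one 4-lane multiply-add per cycle (counted as two floating-point operations per lane) at a 4 GHz clock; those assumptions are chosen because they reproduce the slide's numbers, and `peak_gflops` is a hypothetical helper.

```python
def peak_gflops(lanes: int, flops_per_lane: int, clock_ghz: float, units: int = 1) -> float:
    """Peak throughput in GFlops for `units` identical SIMD engines."""
    return lanes * flops_per_lane * clock_ghz * units


# Assumptions: 4 single-precision lanes, multiply-add = 2 flops, 4 GHz clock.
per_spe = peak_gflops(lanes=4, flops_per_lane=2, clock_ghz=4.0)
whole_chip = peak_gflops(lanes=4, flops_per_lane=2, clock_ghz=4.0, units=8)
```

With these inputs `per_spe` is 32 GFlops and `whole_chip` is 256 GFlops, matching the single-precision figure quoted in the specifications.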

19 Specifications & Architecture
EIB (Element Interconnect Bus):
Data ring for internal communication
Four 16-byte data rings, low latency
Multiple simultaneous transfers
96 B/cycle peak bandwidth (@ ½ CPU speed)

20 Specifications & Architecture
External memory bus: licensed from Rambus
–Dual XDR™ interface (25.6 GB/s @ 3.2 GHz)
External I/O: licensed from Rambus
–FlexIO™ interface (each 2-wire bit lane @ 6.4 Gbit/s, i.e. 800 MB/s)
–Total 76.8 GB/s (7 Tx bytes + 5 Rx bytes)
Extensive shielding is necessary
–Many VDD/GND wires
–90% of all pins
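The quoted I/O totals can be reproduced with integer arithmetic. This sketch assumes each FlexIO bit lane carries 800 MB/s (6.4 Gbit/s), which is the per-lane rate that yields the 76.8 GB/s total over 12 byte lanes, and assumes a 64-bit-wide XDR data path (two 32-bit channels) at 3.2 Gbit/s per pin. The XDR width is not stated on the slide, and the helper names are hypothetical.

```python
BITS_PER_BYTE_LANE = 8
FLEXIO_LANE_MB_S = 800    # assumption: 6.4 Gbit/s per differential bit lane
XDR_PIN_MBIT_S = 3200     # 3.2 Gbit/s per data pin
XDR_DATA_PINS = 64        # assumption: two 32-bit channels


def flexio_mb_s(byte_lanes: int) -> int:
    """Aggregate FlexIO bandwidth in MB/s for a number of byte-wide lanes."""
    return byte_lanes * BITS_PER_BYTE_LANE * FLEXIO_LANE_MB_S


tx_mb_s = flexio_mb_s(7)           # outbound: 7 byte lanes
rx_mb_s = flexio_mb_s(5)           # inbound: 5 byte lanes
flexio_total_mb_s = tx_mb_s + rx_mb_s
xdr_mb_s = XDR_DATA_PINS * XDR_PIN_MBIT_S // 8
```

Under these assumptions the FlexIO total is 76,800 MB/s (76.8 GB/s) and the XDR figure is 25,600 MB/s (25.6 GB/s).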

22 Applications
According to IBM:
The CELL design was based on the analysis of a broad range of workloads in areas such as cryptography, graphics transform and lighting, physics, fast Fourier transforms (FFT), matrix operations, and scientific workloads.
The Cell processor is designed for graphics- and network-intensive jobs ranging from video games to complex imaging for the medical, defense, automotive and aerospace industries.

23 Applications
Games, 3D graphics, video, audio
–Image manipulation; video processing, encoding, decoding
DSP (Digital Signal Processing)
–FFT (e.g. SETI); distributed DSP
Digital Rights Management
–Cryptography; secure data processing
Scientific calculations
–Linear system solvers; linear algebra; PDE
Supercomputing
Servers (commercial databases)
Stream processing applications
–Serial use of SPE blocks (e.g. digital TV)
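The "serial use of SPE blocks" idea for stream processing can be sketched as a chain of stages, each consuming the output of the previous one. The example below uses Python generators, with each stage standing in for one SPE. The stage kernels (offset removal, gain, clamping) and the helper names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def stage(kernel: Callable[[int], int], upstream: Iterable[int]) -> Iterator[int]:
    """One pipeline stage: applies its kernel to every item it receives."""
    for item in upstream:
        yield kernel(item)


def build_pipeline(kernels: List[Callable[[int], int]], source: Iterable[int]) -> Iterator[int]:
    """Chain the kernels in order, feeding each stage from the previous one."""
    stream: Iterator[int] = iter(source)
    for kernel in kernels:
        stream = stage(kernel, stream)
    return stream


kernels = [
    lambda s: s - 128,                 # stage 1: remove DC offset
    lambda s: s * 2,                   # stage 2: apply gain
    lambda s: max(-100, min(100, s)),  # stage 3: clamp to output range
]
samples = [0, 128, 200, 255]
processed = list(build_pipeline(kernels, samples))
```

Each sample flows through all three stages in order, so different stages can in principle work on different samples at the same time.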

24 Applications

26 Software Aspects
According to experts:
Programming the Cell processor requires new tools and a new programming paradigm
–Because SPE programs should be self-contained, with data and instruction bundles
For a game console, programmers will craft custom optimized code.
The next challenge for STI (Sony, Toshiba, IBM) is to find a way to make this architecture accessible to programmers beyond game developers.
Cell is “OS neutral” and supports multiple operating systems simultaneously.

27 Software Aspects
Tool chain for Cell is built on PowerPC Linux
–Early availability of SIMD-optimized compilers
–Development of high-performance graphics and media libraries for the Broadband Architecture entirely in C
–CELL team developed the first SPU compiler
–Development of an advanced parallelizing compiler with auto-SIMDization features based on IBM XL compiler technology
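Auto-SIMDization means the compiler rewrites a scalar loop so that several iterations execute as one vector operation, with a scalar epilogue for leftover elements. The sketch below shows the shape of that transformation by hand for a simple `a*x + y` loop. It is a conceptual illustration with hypothetical function names, not the output of the IBM XL compiler.

```python
from typing import List

VECTOR_WIDTH = 4  # four 32-bit lanes per 128-bit register


def scalar_scale_add(a: float, x: List[float], y: List[float]) -> List[float]:
    """Original loop: one element per iteration."""
    return [a * x[i] + y[i] for i in range(len(x))]


def simdized_scale_add(a: float, x: List[float], y: List[float]) -> List[float]:
    """Same loop restructured into 4-wide steps plus a scalar epilogue."""
    n = len(x)
    out = [0.0] * n
    main = n - n % VECTOR_WIDTH
    for i in range(0, main, VECTOR_WIDTH):
        # Stands in for one vector multiply-add over four lanes.
        xv, yv = x[i:i + VECTOR_WIDTH], y[i:i + VECTOR_WIDTH]
        out[i:i + VECTOR_WIDTH] = [a * p + q for p, q in zip(xv, yv)]
    for i in range(main, n):
        # Epilogue: leftover elements that do not fill a whole vector.
        out[i] = a * x[i] + y[i]
    return out


xs = [float(i) for i in range(10)]
ys = [1.0] * 10
vectorized = simdized_scale_add(2.0, xs, ys)
```

Both versions compute the same result; the restructured loop simply needs about a quarter as many main-loop iterations.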

29 Marketing & NEWS
“Cell is basically a vector supercomputer on a chip”
“... we present the 2004 Microprocessor Report Analysts’ Choice Award for Best Technology to the Cell Processor”
IBM is working with companies to integrate the Cell microprocessor into third-party products.
The companies are working with open-source compiler developers to create software development tools for programmers.

30 Marketing & NEWS
Sony PlayStation™ 3
Cell processor running at 3.2 GHz
–7 special-purpose 3.2 GHz processors
–218 gigaflops of performance
256 MB XDR main RAM at 3.2 GHz
256 MB of GDDR VRAM at 700 MHz
Support for seven Bluetooth controllers
Supports the Blu-ray Disc format
System floating-point performance of 2 teraflops
Communication: Ethernet, Wi-Fi (IEEE 802.11), Bluetooth
Output in HDTV resolution up to 1080p as standard

31 Marketing & NEWS
Cell Processor Based Workstation (CPBW)
From Sony Group and IBM
First prototype “powered on”
16 teraflops in a rack (est.)
Optimized for digital content creation
–Computer entertainment
–Movies
–Real-time rendering
–Physics simulation
Affordable by small businesses (and individuals)

32 Marketing & NEWS
CELL Industries
Our objective: distributing Cell power
Facilitate small-scale supercomputer applications for Cell
Cell-based systems
–Affordable for individuals and small to medium-sized businesses
Our Cell PCI-X plug-in card, xpac-zero
–Fastest and most economical way for people to get their hands on some real computing power
Uses Cell as a general-purpose numerical accelerator
–The xpac-zero card acts much like a video card

34 References
IBM, Sony, Toshiba papers in ISSCC’05
–“A Streaming Processing Unit for a CELL Processor”, B. Flachs et al.
–“The Design and Implementation of a First-Generation CELL Processor”, D. Pham et al.
“Microprocessor Report”, Reed Electronics Group, Jan. 31 & Feb. 14, 2005
“IBM’s Cell Processor: The next generation of computing?”, D.K. Every, Shareware Press, Feb. 2005

35 References
“Power Efficient Processor Architecture and The Cell Processor”, H.P. Hofstee, HPCA-11, 2005
“Power Efficient Processor Design and the Cell Processor”, IBM, 2005
“Introducing the IBM/Sony/Toshiba Cell Processor”, J. H. Stokes, http://arstechnica.com/
“Cell Architecture Explained”, N. Blachford, http://www.blachford.info/

36 Thank you

