
The CELL processor
Prepared and presented by: S.H.R. Ahmadi
Class presentation for the Custom DSP Implementation course
ECE Department – University of Tehran, May 2005
This is a class presentation. All data are the copyright of their respective authors as listed in the references and have been used here for educational purposes only.

Notice:
Photos and diagrams are proprietary to IBM
The Cell processor, Power & PowerPC are trademarks of IBM
PlayStation™ 3 is a trademark of Sony Computer Entertainment Inc. (SCEI)
FlexIO™ & XDR™ are Rambus Inc. trademarks
All data are gathered from public sources, which are listed in the “References”

Outline
Development History
Specifications & Architecture
Applications
Software Aspects
Marketing & NEWS
References

Development History
Completely secret and under cover
March 12, 2001 – “Cell” announced
  –“supercomputer-on-a-chip” from Sony, Toshiba, IBM
  –Capable of TeraFlops computation speed
  –$400m investment in 5 years
March, 2002 – Okamoto speech
  –2005 target date
  –First glimpse of Cell idea: 1000x figure
August, 2002 – Cell design finished
  –near “tape out”
  –“4-16 general-purpose processor cores per chip”

Development History
November, 2002 – Rambus licenses “Yellowstone” technology to Toshiba
  –Yellowstone: 3.2 GHz memory
January, 2003 – Rambus licenses Yellowstone/Redwood technology to Sony
  –Redwood: parallel interface between chips
January, 2003
  –Cell at 4 GHz, 1024 bit bus, 64 MB memory, PowerPC
  –At least 4 patents in 2002 & 2003 on:
     Hardware & software architecture
     Processing modules
     Memory protection
     Data synchronization

Development History
2004
  –Marketing NEWS
  –Some general technical data
May, 2004
  –CELL-based workstation will be made
  –Application: digital content creation
February, 2005
  –Formal introduction at ISSCC’05
  –Extensive media coverage
May, 2005
  –Sony’s PlayStation 3 formal announcement


Specifications & Architecture
Broadband Processor Architecture
  –Optimized for broadband media and 3D graphics
90-nm PD-SOI process, 8 metal layers (copper)
234 million transistors in ~235 mm²
GHz operation at 1.3 V
85° Celsius operating temp. with heat sink
Thermal protection schemes
2965 core connections / ~1300 pins
256 GFlops SP-FP, 26 GFlops DP-FP
HUGE communication speed to outside
4 x 128 bit internal bus (ring), 96 Bytes/cycle

Specifications & Architecture
BPA (Cell) design features:
Multi-Core Architecture
Based on the Power Architecture
  –Code compatibility
Coherent and cooperative off-load processing
Enhanced SIMD architecture
Power efficiency improved
“Absolute timers” allow “hard” realtime data processing
  –Good estimation of execution time is possible
Big-endian memory
  –Support Apple, but not Intel
Isolation mechanism for secure code execution

Specifications & Architecture
BPA (Cell) design justification:
Multi-Core Non-Homogeneous Architecture
  –Better Power
3-level Model of Memory
  –Main Memory, Local Store, Registers
  –Better Memory
Large Register File & SW Controlled Branching
  –Allows deeper pipelines
  –Better Frequency

FlexIO

Specifications & Architecture
CPU: (Power Processor Element)
64-bit Power Architecture™ with VMX (SIMD)
In-order, 2-way hardware multi-threading
  –Simple design → improvements possible
  –Predictable execution times
Coherent Load/Store
Cache (32KB L KB L2)
Redesigned for use in the Cell processor
Serves as a:
  –Multi-OS GPP
  –Control unit for SPEs

Specifications & Architecture
SPE: (Synergistic Processing Element)
Dual issue, 128-bit 4-way SIMD
  –Vector processing
4 Integer Units + 4 FP Units
8-, 16-, 32-bit Integer + 32-, 64-bit FP
128 x 128-bit Registers
256KB Local-Store Memory (specially designed)
  –Caches are not used
  –Data & instructions in LS
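
The "128-bit 4-way SIMD" figure above can be read as one 128-bit register holding four 32-bit elements that are processed together. The following is a minimal Python sketch of that idea, not SPE code: the function names `lanes` and `simd_add` are hypothetical, and the wraparound behaviour on integer lanes is an assumption made for illustration.

```python
REGISTER_BITS = 128  # SPE register width, as given on the slide


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one 128-bit register."""
    if REGISTER_BITS % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return REGISTER_BITS // element_bits


def simd_add(a, b, element_bits=32):
    """Lane-wise unsigned add; each lane wraps around independently (assumed)."""
    n = lanes(element_bits)
    if len(a) != n or len(b) != n:
        raise ValueError(f"expected {n} lanes per operand")
    mask = (1 << element_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


if __name__ == "__main__":
    print(lanes(32), lanes(16), lanes(8))
    print(simd_add([1, 2, 3, 0xFFFFFFFF], [10, 20, 30, 1]))
```

With 32-bit elements the register holds four lanes, which is the 4-way case; the 16-bit and 8-bit integer widths listed on the slide would give eight and sixteen lanes under the same reading.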

Specifications & Architecture
SPE:
Coherent & cooperative off-load engines for CPU
  –Works independently
  –Not directly tied to CPU as co-processor
Dedicated DMA engine
  –Move data: between CPU and SPE, or between SPEs
  –Parallel or serial with other SPEs
Dynamically configurable to protect resources
Can perform security algorithms
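
Because an SPE works out of its 256KB local store rather than a cache, data has to be brought in by DMA in pieces that fit. A common way to hide the transfer time is double buffering: fetch the next piece while computing on the current one. The sketch below only simulates that control flow in plain Python; `dma_get`, `process_stream`, and the 16KB buffer size are hypothetical stand-ins, and on real hardware the fetch would overlap with the computation instead of running before it.

```python
LOCAL_STORE_BYTES = 256 * 1024  # local store size, from the slides
BUFFER_BYTES = 16 * 1024        # hypothetical size of one working buffer


def dma_get(main_memory: bytes, offset: int, size: int) -> bytes:
    """Stand-in for a DMA transfer from main memory into the local store."""
    return main_memory[offset:offset + size]


def process_stream(main_memory: bytes, kernel) -> list:
    """Apply kernel to main_memory one buffer at a time, prefetching the next."""
    if 2 * BUFFER_BYTES > LOCAL_STORE_BYTES:
        raise ValueError("two buffers must fit in the local store")
    results = []
    offset = 0
    current = dma_get(main_memory, offset, BUFFER_BYTES)
    while current:
        offset += BUFFER_BYTES
        # Request the next buffer before working on the current one.
        upcoming = dma_get(main_memory, offset, BUFFER_BYTES)
        results.append(kernel(current))
        current = upcoming
    return results


if __name__ == "__main__":
    data = bytes(range(256)) * 200  # 51,200 bytes of sample data
    partial_sums = process_stream(data, sum)
    print(len(partial_sums), sum(partial_sums) == sum(data))
```

Only two buffers are live at any time, so the working set stays far below the local store size no matter how large the input in main memory is.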

Specifications & Architecture
8 SPE blocks, each with 32 GFlops or 32 Gops
  –Monstrous processing power
  –Need to be fed accordingly
Solution:
  –EIB
  –High-Speed MEM (Dual XDR™)
  –High-Speed IO (FlexIO™)
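
One plausible way to arrive at the per-SPE and whole-chip figures is lanes × operations per lane per cycle × clock. The sketch below assumes a 4 GHz clock (the January 2003 figure quoted earlier in this deck) and counts a multiply-add as two floating-point operations per lane per cycle; both are assumptions, and `peak_gflops` is a hypothetical helper name.

```python
def peak_gflops(lanes, flops_per_lane_per_cycle, clock_ghz, units=1):
    """Peak rate in GFlops for `units` identical SIMD units."""
    return lanes * flops_per_lane_per_cycle * clock_ghz * units


if __name__ == "__main__":
    # Assumptions: 4 single-precision lanes, 2 flops per lane per cycle, 4 GHz.
    per_spe = peak_gflops(4, 2, 4.0)
    chip = peak_gflops(4, 2, 4.0, units=8)
    print(per_spe, chip)
```

Under these assumptions one SPE reaches 32 GFlops and eight SPEs reach 256 GFlops, which agrees with the single-precision numbers quoted in the slides.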

Specifications & Architecture
EIB: (Element Interconnect Bus)
Data ring for internal communication
Four 16 byte data rings – low latency
Multiple simultaneous transfers
96B/cycle peak bandwidth (at ½ CPU speed)
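
As a rough check, the per-cycle figure can be turned into bytes per second. This sketch assumes that the 96 bytes are counted per bus cycle, that the bus runs at half the CPU clock, and that the CPU clock is 4 GHz (the January 2003 figure quoted earlier in this deck); the helper name and its parameters are hypothetical.

```python
def eib_peak_gbytes_per_s(cpu_clock_ghz, bytes_per_bus_cycle=96, bus_clock_ratio=0.5):
    """Peak EIB bandwidth in GB/s, given the CPU clock and the bus-to-CPU clock ratio."""
    return bytes_per_bus_cycle * cpu_clock_ghz * bus_clock_ratio


if __name__ == "__main__":
    # Assumed 4 GHz CPU clock, so the bus runs at 2 GHz.
    print(eib_peak_gbytes_per_s(4.0))
```

With those assumptions the peak comes to 192 GB/s; a different CPU clock scales the result linearly.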

Specifications & Architecture
External Memory Bus: Licensed from Rambus
  –Dual XDR™ interface (at 3.2 GHz)
External IO: Licensed from Rambus
  –FlexIO™ interface (each 2-wire 800Mbps)
  –Total 76.8 GB/s (7 Tx Bytes + 5 Rx Bytes)
Excessive shielding is necessary
  –Many VDD/GND wires
  –90% of all pins
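
The FlexIO total can be reproduced from the lane counts if the per-pair rate is read as 800 megabytes per second, with eight 2-wire pairs per byte-wide lane. That reading is an assumption chosen because it matches the 76.8 GB/s total; the constant and function names below are hypothetical.

```python
PAIR_MBYTES_PER_S = 800   # assumed rate of one 2-wire pair, in MB/s
PAIRS_PER_BYTE_LANE = 8   # assumed: one pair per bit of a byte-wide lane


def link_mbytes_per_s(byte_lanes: int) -> int:
    """Aggregate rate in MB/s for the given number of byte-wide lanes."""
    return byte_lanes * PAIRS_PER_BYTE_LANE * PAIR_MBYTES_PER_S


if __name__ == "__main__":
    tx = link_mbytes_per_s(7)  # 7 transmit byte lanes, from the slide
    rx = link_mbytes_per_s(5)  # 5 receive byte lanes, from the slide
    print(tx, rx, tx + rx)
```

Under this reading the transmit side carries 44,800 MB/s and the receive side 32,000 MB/s, for 76,800 MB/s (76.8 GB/s) in total.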


Applications
According to IBM:
CELL design was based on the analysis of a broad range of workloads in areas such as cryptography, graphics transform and lighting, physics, fast-Fourier transforms (FFT), matrix operations, and scientific workloads.
The Cell processor is designed for graphics- and network-intensive jobs ranging from video games to complex imaging for the medical, defense, automotive and aerospace industries.

Applications
Games, 3D Graphics, Video, Audio
  –Image manipulation; Video processing, encoding, decoding
DSP (Digital Signal Processing)
  –FFT (e.g. SETI); Distributed DSP
Digital Rights Management
  –Cryptography; Secure data processing
Scientific Calculations
  –Linear system solvers; Linear algebra; PDE
Super Computing
Servers (Commercial databases)
Stream Processing Applications
  –Serial use of SPE blocks (e.g. Digital TV)
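
"Serial use of SPE blocks" suggests a pipeline in which each block performs one stage and hands its output to the next. The sketch below models that chaining with Python generators; it is only an illustration of the dataflow, the names `stage` and `run_pipeline` are hypothetical, and the two sample stages stand in for real steps such as decoding or scaling.

```python
def stage(func, upstream):
    """One pipeline stage: apply func to every item arriving from upstream."""
    for item in upstream:
        yield func(item)


def run_pipeline(source, stages):
    """Chain the stages in order, as if each ran on its own processing block."""
    stream = iter(source)
    for func in stages:
        stream = stage(func, stream)
    return list(stream)


if __name__ == "__main__":
    # Two placeholder stages: double each sample, then add an offset.
    print(run_pipeline([1, 2, 3], [lambda x: x * 2, lambda x: x + 1]))
```

Each item flows through every stage in order, so adding a processing step means adding one more function to the list.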

Applications


Software Aspects
According to Experts:
Programming the Cell processor requires new tools & a new programming paradigm
  –Because SPE programs should be self-contained with data and instruction bundles
For a game console, programmers will craft custom optimized code.
The next challenge for STI is to find a way to make this architecture accessible to programmers beyond game developers.
Cell is “OS neutral” and supports multiple OSes simultaneously

Software Aspects
Tool chain for Cell is built on PowerPC Linux
  –Early availability of SIMD-optimized compilers
  –Development of high-performance graphics and media libraries for the Broadband Architecture entirely in C
  –CELL team developed the first SPU compiler
  –Development of an advanced parallelizing compiler with auto-SIMDization features based on IBM XL compiler technology


Marketing & NEWS
“Cell is basically a vector supercomputer on a chip”, we present the 2004 Microprocessor Report Analysts’ Choice Award for Best Technology to the Cell Processor
IBM is working with companies to integrate the Cell microprocessor into third-party products
The companies are working with open-source compiler developers to create software development tools for programmers

Marketing & NEWS
Sony PlayStation™ 3
Cell Processor running at 3.2 GHz
  –7 special purpose 3.2 GHz processors
  –218 gigaflops of performance
256 MB XDR main RAM at 3.2 GHz
256 MB of GDDR VRAM at 700 MHz
Support for seven Bluetooth controllers
Supports Blu-ray Disc format
System floating point performance of 2 teraflops
Communication: Ethernet, Wi-Fi IEEE , Bluetooth
Output in HDTV resolution up to 1080p as standard

Marketing & NEWS
Cell Processor Based Workstation (CPBW)
From Sony Group and IBM
First prototype “Powered On”
16 TeraFlops in a rack (est.)
Optimized for Digital Content Creation
  –Computer entertainment
  –Movies
  –Real-time rendering
  –Physics simulation
Affordable by small businesses (and individuals)

Marketing & NEWS
CELL Industries
Our Objective: Distributing Cell Power
Facilitate small-scale supercomputer applications for Cell
Cell-based systems
  –affordable for individuals and small to medium-sized businesses
Our Cell PCI-x plug-in card, xpac-zero
  –fastest and most economical way for people to get their hands on some real computing power
Uses Cell as a general-purpose numerical accelerator
  –The xpac-zero card acts much like a video card


References
IBM, Sony, Toshiba papers in ISSCC’05
  –“A Streaming Processing Unit for a CELL Processor”, B. Flachs et al.
  –“The Design and Implementation of a First-Generation CELL Processor”, D. Pham et al.
“Microprocessor Report”, Reed Electronics Group, 2005, Jan. 31 & Feb. 14
“IBM’s Cell Processor: The next generation of computing?”, D.K. Every, Shareware Press, Feb. 2005

References
“Power Efficient Processor Architecture and The Cell Processor”, H.P. Hofstee, HPCA
“Power Efficient Processor Design and the Cell Processor”, IBM, 2005
“Introducing the IBM/Sony/Toshiba Cell Processor”, J. H. Stokes
“Cell Architecture Explained”, N. Blachford

Thank you