III. Multicore Processors (5) Dezső Sima Spring 2007 (Ver. 2.0) © Dezső Sima, 2007.



10.3 IBM’s MC processors
POWER line
Cell BE

10.3.1 POWER line
POWER4: 180 nm, 10/2001
POWER4+: 11/2002
POWER5: 130 nm, 5/2004
POWER5+: 90 nm, 10/2005
POWER6: 65 nm, 2007

Evolution of IBM’s major RISC lines
Figure: The evolution of IBM’s major RISC lines

POWER4 (1)
Figure: POWER4 chip logical view (labels: Built-In Self-Test, Service Processor, Power-On Reset, Core Interface Unit (crossbar), Non-Cacheable Unit, Multi-Chip Module)
Source: Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., "Power4 System Microarchitecture," IBM J. Res. & Dev., Vol. 46, No. 1, Jan. 2002, pp. 5-25

POWER4 (2)
Figure: Logical view of the L3 controller
Source: Power4 System Microarchitecture, Technical White Paper, 2001, IBM Corp.

POWER4 (3)
Figure: The memory controller of the POWER4
Source: Power4 System Microarchitecture, Technical White Paper, 2001, IBM Corp.

POWER4 (4)
Figure: I/O controller of the POWER4 (label: Fabric Controller)
Source: Power4 System Microarchitecture, Technical White Paper, 2001, IBM Corp.

POWER4 (5)
Figure: POWER4 chip
Source: Kalla R., Sinharoy B., Tendler J., "Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor"

POWER4 (6)
Table: Main features of IBM’s dual-core POWER line
DC POWER4: introduced 10/2001; die size 412 mm²; TDP 115/125 W; packaging SCM (1) / MCM (2); L2 shared, on-chip; L3 32 MB, tags on-chip, data off-chip; memory controller off-chip
Footnotes:
1 SCM: Single Chip Module
2 MCM: Multi Chip Module
3 DCM: Dual Chip Module
4 DCM: Dual Core Module
5 QCM: Quad Core Module
6 DPM: Dynamic Power Management

POWER4+ (1)
Figure: New features of the POWER5+
Source: Grassl C., "New IBM Components for HPCx," Dec. 2003

POWER4+ (2)
Table: Main features of IBM’s dual-core POWER line
DC POWER4: introduced 10/2001; die size 412 mm²; TDP 115/125 W; packaging SCM (1) / MCM (2); L2 1.44 MB/shared, on-chip; L3 32 MB, tags on-chip, data off-chip; memory controller off-chip
DC POWER4+: introduced 11/2002; die size 380 mm²; packaging SCM (1) / MCM (2); L2 1.5 MB/shared, on-chip; L3 32 MB
Footnotes:
1 SCM: Single Chip Module
2 MCM: Multi Chip Module
3 DCM: Dual Chip Module
4 DCM: Dual Core Module
5 QCM: Quad Core Module
6 DPM: Dynamic Power Management

POWER5 (1)
Figure 5.14: Contrasting POWER4 and POWER5 system structures
Source: Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006

POWER5 (2)
Figure: Block diagram of the POWER5 (1)
Source: Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006

POWER5 (3)
Figure: Block diagram of the POWER5 (2)

POWER5 (4)
Figure: Floorplan of the POWER5
Source: Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., "POWER5 system microarchitecture," IBM J. R&D, Vol. 49, No. 4/5, 2005, pp

POWER5 (6)
Figure: Contrasting the floor plans of the POWER4 and POWER5 dies (POWER4: 180 nm, 412 mm²; POWER5: 389 mm², enlarged)
Sources: Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., "POWER5 system microarchitecture," IBM J. R&D, Vol. 49, No. 4/5, 2005, pp; Kalla R., Sinharoy B., Tendler J., "Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor," 2003

POWER5 (7)
Figure: Packaging alternatives of the POWER4/5 processors (label: POWER5+ Dual-Core Module)
Source: Partridge R. and Ghatpande S., "IBM Introduces POWER5+ and Quad-Core Modules in System p5," Tech Trends Monthly, Nov./Dec. 2005

POWER5 (8)
Figure: Quad-Chip POWER4 module (MCM) and a 32-way POWER4 system (panels: POWER4 MCM photo; 32-way system showing 4 MCMs and L3 cache)
Source: Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006

POWER5 (9)
Figure: Interpretation of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCMs) of the POWER5
Source: Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006

POWER5 (10)
Figure: Photos of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCMs) of the POWER5
Source: Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006

POWER5 (11)
Figure: The Multi-chip module of the POWER5
Source: Kalla R., "IBM’s POWER5 Microprocessor Design and Methodology," 2003, www-csl.csres.utexas.edu/users/billmark/teach/cs spring/lectures/Lecture22-RonKallaIBM.pdf

POWER5 (12)
Table: Main features of IBM’s dual-core POWER line
DC POWER4: introduced 10/2001; die size 412 mm²; TDP 115/125 W; packaging SCM (1) / MCM (2); L2 1.44 MB/shared, on-chip; L3 32 MB, tags on-chip, data off-chip; memory controller off-chip
DC POWER4+: introduced 11/2002; die size 380 mm²; packaging SCM (1) / MCM (2); L2 1.5 MB/shared, on-chip; L3 32 MB
DC POWER5: introduced 5/2004; die size 389 mm²; fc [GHz] 1.65/; TDP 80 W (est.); packaging DCM (3) / MCM (2); power management DPM (6); L2 1.9 MB/shared, on-chip; L3 36 MB, tags on-chip; memory controller on-chip
Footnotes:
1 SCM: Single Chip Module
2 MCM: Multi Chip Module
3 DCM: Dual Chip Module
4 DCM: Dual Core Module
5 QCM: Quad Core Module
6 DPM: Dynamic Power Management

POWER5+ (1)
Figure: Block diagram of the POWER5+
Source: Vetter S. et al., "IBM System p5 Quad-Core Module Based on POWER5+ Technology," Redbooks paper, IBM Corp., 2006

POWER5+ (2)
Figure: Dual-Core Modules (DCMs) and Quad-Core Modules (QCMs) of the POWER5+
Source: Vetter S. et al., "IBM System p5 Quad-Core Module Based on POWER5+ Technology," Redbooks paper, IBM Corp., 2006

POWER5+ (3)
Table: Main features of IBM’s dual-core POWER line
Footnotes:
1 SCM: Single Chip Module
2 MCM: Multi Chip Module
3 DCM: Dual Chip Module
4 DCM: Dual Core Module
5 QCM: Quad Core Module
6 DPM: Dynamic Power Management

POWER6 (1)
Figure: Contrasting the block diagrams of the POWER5 and POWER6 processors (panels: POWER6, POWER5+; annotation: Hardware support of decimal arithmetic)
Source: Kanter D., "IBM Previews the Power6," Oct. 2006

POWER6 (2)
Table: Main features of IBM’s dual-core POWER line
Footnotes:
1 SCM: Single Chip Module
2 MCM: Multi Chip Module
3 DCM: Dual Chip Module
4 DCM: Dual Core Module
5 QCM: Quad Core Module
6 DPM: Dynamic Power Management

10.3 IBM’s MC processors
Cell BE
Cell BE: 90 nm, 2/

Cell BE (1)
Figure: The history and development cost of the Cell BE
Sources: Brochard L., "A Cell History," Cell Workshop, April; Hofstee H. P., "Cell today and tomorrow," 2005

Cell BE (2)
Figure: Block diagram of the Cell BE
AUC: Atomic Update Cache
BIC: Bus Interface Contr.
EIB: Element Interface Bus
LS: Local Store of 256 KB
MFC: Memory Flow Controller
MIC: Memory Interface Contr.
PPE: Power Processing Element
PXU: POWER Execution Unit
SMF: Synergistic Memory Flow Unit
SPU: Synergistic Processor Unit
SXU: Synergistic Execution Unit
XDR: Rambus DRAM
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (3)
Design parameters of the Cell BE:
PPE: dual-threaded
> 200 GFLOPS (SP)
> 20 GFLOPS (DP)
> 25 GB/s memory BW
> 75 GB/s I/O BW
> 300 GB/s EIB BW
fc > 4 GHz (lab)
Figure: Main design parameters of the Cell BE
Source: IBM, "Cell Broadband Engine Overview," Course Code L1T1H1-02, May 2006, publib.boulder.ibm.com/.../stgv1r0/topic/com.ibm.iea.cbe/cbe/1.0/Overview/L1T1H1_02_CellOverview.pdf
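The "> 200 GFLOPS (SP)" design parameter above can be sanity-checked with a short back-of-the-envelope calculation. The sketch below is only an illustration: the SPE count (8), the clock rate (3.2 GHz) and the 128-bit SIMD width are taken from the feature table of these slides, while the per-cycle throughput figures are assumptions that the slides do not state (one SIMD arithmetic instruction per SPE per cycle, and a fused multiply-add counted as two floating-point operations). The function name `peak_sp_gflops` is hypothetical.

```python
# Rough estimate of the Cell BE's peak single-precision (SP) throughput,
# counting the SPEs only (the PPE's contribution is ignored).

SPE_COUNT = 8          # from the slides: 1 x PPE, 8 x SPE
CLOCK_GHZ = 3.2        # from the slides: fc = 3.0/3.2 GHz
SIMD_WIDTH_BITS = 128  # from the slides: 128-bit SIMD capability
SP_BITS = 32           # width of one single-precision value

# Assumptions, not stated in the slides:
FLOPS_PER_LANE_PER_OP = 2  # a fused multiply-add counted as 2 FLOPs
SIMD_OPS_PER_CYCLE = 1     # one SIMD arithmetic instruction per SPE per cycle


def peak_sp_gflops(spes: int = SPE_COUNT, clock_ghz: float = CLOCK_GHZ) -> float:
    """Return the estimated peak SP throughput of the SPEs in GFLOPS."""
    lanes = SIMD_WIDTH_BITS // SP_BITS  # 4 SP values per 128-bit register
    flops_per_cycle = spes * lanes * FLOPS_PER_LANE_PER_OP * SIMD_OPS_PER_CYCLE
    return flops_per_cycle * clock_ghz


if __name__ == "__main__":
    print(f"Estimated peak at 3.2 GHz: {peak_sp_gflops():.1f} GFLOPS (SP)")
    print(f"Estimated peak at 3.0 GHz: {peak_sp_gflops(clock_ghz=3.0):.1f} GFLOPS (SP)")
```

Under these assumptions the estimate at 3.2 GHz lands just above 200 GFLOPS, which is consistent with the quoted design parameter; at 3.0 GHz it falls slightly below it.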

Cell BE (4)
Figure 5.16: Cell SPE architecture
Source: Blachford N., "Cell Architecture Explained Version 2"

Cell BE (5)
Figure: Block diagram of the SPE
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (6)
Figure: Pipeline stages of the Cell BE
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (7)
Figure: Floor plan of a single SPE
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (8)
Principle of operation of the Element Interface Bus (EIB)
Source: Keable C., "And we also have hardware..." 17th Machine Evaluation Workshop, Dec. 2006

Cell BE (9)
Figure: The Element Interface Bus (EIB)
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (10)
Figure: The Synergistic Memory Flow unit (SMF)
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (11)
Figure: Floor plan of the Cell BE processor (235 mm²)
Source: Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006

Cell BE (12)
Table: Main features of IBM’s Cell BE
Series: Cell BE
Implementation: Heterogeneous, 1 x PPE, 8 x SPE
Architecture: PowerPC 2.02
Cores: PPE: 64-bit RISC; SPE: dual-issue 32-bit SIMD with 128-bit capability
Introduction: 9/2006 (in the QS20 BladeCenter)
Technology: 90 nm
Die size: 221 mm²
Nr. of transistors: 234 mtrs
fc [GHz]: 3.0/3.2
L2: PPE: 512 KB; SPE: 256 KB Local Store (128*128 bit)
L3: -
Memory bandwidth: 25 GB/s
TDP [W]: 95 @ 3 GHz
Multithreading: PPE: 2-way; SPE: -
I/O bandwidth: Up to 75 GB/s
Interconnection network: Ring based
Memory controller: On-chip

Cell BE (13)
Figure: Cell BE Blade Roadmap
Source: Brochard L., "A Cell History," Cell Workshop, April

Cell BE (14)
Figure: Roadmap of the Cell BE
Source: Hofstee H. P., "Real-time Supercomputing and Technology for Games and Entertainment," 2006

10.3 Literature (1)

POWER4, POWER4+
Grassl C., "New IBM Components for HPCx," Dec. 2003
Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006
DeMone P., "Sizing Up the Super Heavyweights," Real World Technologies, Sept. 2004
Krewell K., "IBM’s POWER4 Unveiling Continues," Microprocessor Report, Nov., pp. 1-4
Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., "Power4 System Microarchitecture," IBM Server, Technical White Paper, October

POWER5, POWER5+
Grassl C., "New IBM Components for HPCx," Dec. 2003
Barney B., "IBM POWER Systems Overview," Livermore Computing, 2006
DeMone P., "Sizing Up the Super Heavyweights," Real World Technologies, Sept. 2004
Kalla R., "IBM’s POWER5 Microprocessor Design and Methodology," 2003, www-csl.csres.utexas.edu/users/billmark/teach/cs spring/lectures/Lecture22-RonKallaIBM.pdf
Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., "Power4 System Microarchitecture," IBM J. Res. & Dev., Vol. 46, No. 1, Jan. 2002, pp. 5-25

10.3 Literature (2)

POWER5, POWER5+ (cont.)
Kalla R., Sinharoy B., Tendler J., "Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor"
Krewell K., "POWER5 Tops on Bandwidth," Microprocessor Report, Dec
Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., "POWER5 system microarchitecture," IBM J. R&D, Vol. 49, No. 4/5, 2005, pp
Vetter S. et al., "IBM System p5 Quad-Core Module Based on POWER5+ Technology," Redbooks paper, IBM Corp., 2006

POWER6
Kanter D., "IBM Previews the Power6," Oct. 2006

Cell BE
Brochard L., "A Cell History," Cell Workshop, April
Gschwind M., "Chip Multiprocessing and the Cell BE," ACM Computing Frontiers, 2006
Blachford N., "Cell Architecture Explained Version 2"
Day M. and Hofstee P., "Hardware and Software Architectures for the Cell Broadband Engine processor," CODES, Sept. 2006

10.3 Literature (3)

Cell BE (cont.)
Keable C., "And we also have hardware..." 17th Machine Evaluation Workshop, Dec. 2006
Hofstee H. P., "Real-time Supercomputing and Technology for Games and Entertainment," 2006
Solie D., "Technology Trends Presentation," Power Symposium, Aug. 2006, file14+-+darryl+solie+-+ibm+power+symposium+presentation/$file/ 14+-+darryl+solie-ibm-power+symposium+presentation+v2.pdf
"Cell Broadband Engine processor – based systems," White Paper, IBM Corp., 2006
Krewell K., "Cell Moves Into The Limelight," Microprocessor Report, Febr, pp. 1-9
Gschwind M., Hofstee H. P., Flachs B. K., Hopkins M., Watanabe Y., Yamazaki T., "Synergistic Processing in Cell's Multicore Architecture," IEEE Micro, Vol. 26, No. 2, 2006, pp
Krolak D., "Unleashing the Cell Broadband Engine Processor," MPR Fall Proc. Forum, Nov. 2005