III. Multicore Processors (5) Dezső Sima Spring 2007 (Ver. 2.0) © Dezső Sima, 2007

1 III. Multicore Processors (5) Dezső Sima Spring 2007 (Ver. 2.0) © Dezső Sima, 2007

2 10.3 IBM’s MC processors
10.3.1 POWER line
10.3.2 Cell BE

3 10.3.1 POWER line
POWER4: 180 nm, 10/2001
POWER4+: 130 nm, 11/2002
POWER5: 130 nm, 5/2004
POWER5+: 90 nm, 10/2005
POWER6: 65 nm, 2007

4 10.3.1 Evolution of IBM’s major RISC lines
Figure: The evolution of IBM’s major RISC lines

5 10.3.1 POWER4 (1)
Figure: POWER4 chip logical view
Figure labels: Built-In Self-Test; Service Processor; Power-On Reset; Core Interface Unit (crossbar); Non-Cacheable Unit; Multi-Chip Module
Source: Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., “Power4 System Microarchitecture,” IBM J. Res. & Dev., Vol. 46, No. 1, Jan. 2002, pp. 5-25, http://www.research.ibm.com/journal/rd/461/tendler.pdf

6 10.3.1 POWER4 (2)
Figure: Logical view of the L3 controller
Source: “Power4 System Microarchitecture,” Technical White Paper, IBM Corp., 2001, http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf

7 10.3.1 POWER4 (3)
Figure: The memory controller of the POWER4
Source: “Power4 System Microarchitecture,” Technical White Paper, IBM Corp., 2001, http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf

8 10.3.1 POWER4 (4)
Figure: I/O controller of the POWER4
Figure label: Fabric Controller
Source: “Power4 System Microarchitecture,” Technical White Paper, IBM Corp., 2001, http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf

9 10.3.1 POWER4 (5)
Figure: POWER4 chip
Source: Kalla R., Sinharoy B., Tendler J., “Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor,” 2003, http://www.hotchips.org/archives/hc15/3_Tue/11.ibm.pdf

10 10.3.1 POWER4 (6)
Table: Main features of IBM’s dual-core POWER line
Dual/Quad-Core POWER line   DC POWER4
Introduced                  10/2001
Technology                  180 nm
Die size                    412 mm²
Nr. of transistors          174 mtrs
fc [GHz]                    1.3
L2 size/allocation          1.44 MB/shared
L2 implementation           On-chip
L3 size                     32 MB
L3 implementation           Tags on-chip, data off-chip
Mem. contr.                 Off-chip
TDP [W]                     115/125
Packaging                   SCM¹/MCM²
Dual threaded               No
Power management            -
¹ SCM: Single Chip Module
² MCM: Multi Chip Module
³ DCM: Dual Chip Module
⁴ DCM: Dual Core Module
⁵ QCM: Quad Core Module
⁶ DPM: Dynamic Power Management

11 10.3.1 POWER4+ (1)
Figure: New features of the POWER4+
Source: Grassl C., “New IBM Components for HPCx,” Dec. 2003, http://www.hpcx.ac.uk/about/events/annual2003/Grassl.pdf

12 10.3.1 POWER4+ (2)
Table: Main features of IBM’s dual-core POWER line
Dual/Quad-Core POWER line   DC POWER4                     DC POWER4+
Introduced                  10/2001                       11/2002
Technology                  180 nm                        130 nm
Die size                    412 mm²                       380 mm²
Nr. of transistors          174 mtrs                      184 mtrs
fc [GHz]                    1.3                           1.7
L2 size/allocation          1.44 MB/shared                1.5 MB/shared
L2 implementation           On-chip                       On-chip
L3 size                     32 MB                         32 MB
L3 implementation           Tags on-chip, data off-chip   Tags on-chip, data off-chip
Mem. contr.                 Off-chip                      Off-chip
TDP [W]                     115/125                       70
Packaging                   SCM¹/MCM²                     SCM¹/MCM²
Dual threaded               No                            No
Power management            -                             -
¹ SCM: Single Chip Module
² MCM: Multi Chip Module
³ DCM: Dual Chip Module
⁴ DCM: Dual Core Module
⁵ QCM: Quad Core Module
⁶ DPM: Dynamic Power Management

13 10.3.1 POWER5 (1)
Figure 5.14: Contrasting POWER4 and POWER5 system structures
Source: Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/

14 10.3.1 POWER5 (2)
Figure: Block diagram of the POWER5 (1)
Source: Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/

15 10.3.1 POWER5 (3)
Figure: Block diagram of the POWER5 (2)
Source: http://studies.ac.upc.edu/ETSETB/SEGPAR/microprocessors/power5%20(2)%20(mpr).pdf

16 10.3.1 POWER5 (4)
Figure: Floorplan of the POWER5
Source: Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., “POWER5 system microarchitecture,” IBM J. R&D, Vol. 49, No. 4/5, 2005, pp. 505-521

17 10.3.1 POWER5 (6)
Figure: Contrasting the floor plans of the POWER4 and POWER5 dies
POWER4: 180 nm, 412 mm²; POWER5: 130 nm, 389 mm² (enlarged)
Sources: Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., “POWER5 system microarchitecture,” IBM J. R&D, Vol. 49, No. 4/5, 2005, pp. 505-521
Kalla R., Sinharoy B., Tendler J., “Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor,” 2003, http://www.hotchips.org/archives/hc15/3_Tue/11.ibm.pdf

18 10.3.1 POWER5 (7)
Figure: Packaging alternatives of the POWER4/5 processors
Figure label: POWER5+ Dual-Core Module
Source: Partridge R. and Ghatpande S., “IBM Introduces POWER5+ and Quad-Core Modules in System p5,” Tech Trends Monthly, Nov./Dec. 2005

19 10.3.1 POWER5 (8)
Figure: Quad-Chip POWER4 module (MCM) and a 32-way POWER4 system
Panels: POWER4 MCM photo; 32-way system showing 4 MCMs and L3 cache
Source: Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/

20 10.3.1 POWER5 (9)
Figure: Interpretation of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCMs) of the POWER5
Source: Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/

21 10.3.1 POWER5 (10)
Figure: Photos of Dual-Chip Modules (DCMs) and Multi-Chip Modules (MCMs) of the POWER5
Source: Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/

22 10.3.1 POWER5 (11)
Figure: The Multi-Chip Module of the POWER5
Source: Kalla R., “IBM’s POWER5 Microprocessor Design and Methodology,” 2003, www-csl.csres.utexas.edu/users/billmark/teach/cs352-05-spring/lectures/Lecture22-RonKallaIBM.pdf

23 10.3.1 POWER5 (12)
Table: Main features of IBM’s dual-core POWER line
Dual/Quad-Core POWER line   DC POWER4                     DC POWER4+                    DC POWER5
Introduced                  10/2001                       11/2002                       5/2004
Technology                  180 nm                        130 nm                        130 nm
Die size                    412 mm²                       380 mm²                       389 mm²
Nr. of transistors          174 mtrs                      184 mtrs                      276 mtrs
fc [GHz]                    1.3                           1.7                           1.65/1.9
L2 size/allocation          1.44 MB/shared                1.5 MB/shared                 1.9 MB/shared
L2 implementation           On-chip                       On-chip                       On-chip
L3 size                     32 MB                         32 MB                         36 MB
L3 implementation           Tags on-chip, data off-chip   Tags on-chip, data off-chip   Tags on-chip, data off-chip
Mem. contr.                 Off-chip                      Off-chip                      On-chip
TDP [W]                     115/125                       70                            80 (est)
Packaging                   SCM¹/MCM²                     SCM¹/MCM²                     DCM³/MCM²
Dual threaded               No                            No                            Yes
Power management            -                             -                             DPM⁶
¹ SCM: Single Chip Module
² MCM: Multi Chip Module
³ DCM: Dual Chip Module
⁴ DCM: Dual Core Module
⁵ QCM: Quad Core Module
⁶ DPM: Dynamic Power Management
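The die-size and transistor-count rows of the table above allow a quick comparison of transistor density across the three generations. The following is a minimal sketch that uses only the values listed on the slide; the dictionary layout and the function name are illustrative, not part of the original material.

```python
# (transistors in millions, die size in mm^2), copied from the table above
POWER_LINE = {
    "POWER4": (174, 412),
    "POWER4+": (184, 380),
    "POWER5": (276, 389),
}


def density_mtrs_per_mm2(transistors_m: float, die_mm2: float) -> float:
    """Return transistor density in millions of transistors per mm^2."""
    return transistors_m / die_mm2


if __name__ == "__main__":
    for name, (mtrs, area) in POWER_LINE.items():
        print(f"{name}: {density_mtrs_per_mm2(mtrs, area):.2f} mtrs/mm^2")
```

The jump between POWER4+ and POWER5 occurs at the same 130 nm node, so this ratio reflects what was put on the die, not a process shrink.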

24 10.3.1 POWER5+ (1)
Figure: Block diagram of the POWER5+
Source: Vetter S. et al., “IBM System p5 Quad-Core Module Based on POWER5+ Technology,” Redbooks paper, IBM Corp., 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf

25 10.3.1 POWER5+ (2)
Figure: Dual-Core Modules (DCMs) and Quad-Core Modules (QCMs) of the POWER5+
Source: Vetter S. et al., “IBM System p5 Quad-Core Module Based on POWER5+ Technology,” Redbooks paper, IBM Corp., 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf

26 10.3.1 POWER5+ (3)
Table: Main features of IBM’s dual-core POWER line
¹ SCM: Single Chip Module
² MCM: Multi Chip Module
³ DCM: Dual Chip Module
⁴ DCM: Dual Core Module
⁵ QCM: Quad Core Module
⁶ DPM: Dynamic Power Management

27 10.3.1 POWER6 (1)
Figure: Contrasting the block diagrams of the POWER5 and POWER6 processors
Panels: POWER5+; POWER6
Figure label: Hardware support of decimal arithmetic
Source: Kanter D., “IBM Previews the Power6,” Oct. 2006, dkanter@realwordtech.com

28 10.3.1 POWER6 (2)
Table: Main features of IBM’s dual-core POWER line
¹ SCM: Single Chip Module
² MCM: Multi Chip Module
³ DCM: Dual Chip Module
⁴ DCM: Dual Core Module
⁵ QCM: Quad Core Module
⁶ DPM: Dynamic Power Management

29 10.3 IBM’s MC processors
10.3.2 Cell BE
Cell BE: 90 nm, 2/2006

30 10.3.2 Cell BE (1)
Figure: The history and development cost of the Cell BE
Sources: Hofstee H. P., “Cell today and tomorrow,” 2005, http://www.stanford.edu/class/ee380/Abstracts/Cell_060222.pdf
Brochard L., “A Cell History,” Cell Workshop, April 2006, http://www.irisa.fr/orap/Constructeurs/Cell/Cell%20Short%20Intro%20Luigi.pdf

31 10.3.2 Cell BE (2)
Figure: Block diagram of the Cell BE
AUC: Atomic Update Cache
BIC: Bus Interface Controller
EIB: Element Interconnect Bus
LS: Local Store of 256 KB
MFC: Memory Flow Controller
MIC: Memory Interface Controller
PPE: Power Processing Element
PXU: POWER Execution Unit
SMF: Synergistic Memory Flow Unit
SPU: Synergistic Processor Unit
SXU: Synergistic Execution Unit
XDR: Rambus DRAM
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

32 10.3.2 Cell BE (3)
Figure: Main design parameters of the Cell BE
Design parameters of the Cell BE: PPE: dual-threaded; > 200 GFLOPS (SP); > 20 GFLOPS (DP); > 25 GB/s memory BW; > 75 GB/s I/O BW; > 300 GB/s EIB BW; fc > 4 GHz (lab)
Source: IBM, “Cell Broadband Engine Overview,” Course Code L1T1H1-02, May 2006, publib.boulder.ibm.com/.../stgv1r0/topic/com.ibm.iea.cbe/cbe/1.0/Overview/L1T1H1_02_CellOverview.pdf
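The "> 200 GFLOPS (SP)" design parameter on the slide above can be cross-checked with simple arithmetic. The sketch below assumes that each of the 8 SPEs completes one 4-wide single-precision fused multiply-add per cycle (4 lanes x 2 FLOPs) and uses the 3.2 GHz clock from the Cell BE feature table. The per-cycle figures are assumptions not stated on the slide, the PPE contribution is ignored, and the function name is illustrative.

```python
def peak_sp_gflops(n_spe: int = 8,
                   simd_lanes: int = 4,
                   flops_per_lane: int = 2,
                   clock_ghz: float = 3.2) -> float:
    """Estimate peak single-precision GFLOPS of the SPEs only.

    flops_per_lane = 2 models a fused multiply-add (one multiply
    plus one add per lane per cycle); this is an assumption.
    """
    return n_spe * simd_lanes * flops_per_lane * clock_ghz


if __name__ == "__main__":
    # Peak at the production clock, then at the 4 GHz lab clock.
    print(peak_sp_gflops())
    print(peak_sp_gflops(clock_ghz=4.0))
```

Under these assumptions the SPEs alone reach about 205 GFLOPS at 3.2 GHz, consistent with the "> 200 GFLOPS" claim.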

33 10.3.2 Cell BE (4)
Figure 5.16: Cell SPE architecture
Source: Blachford N., “Cell Architecture Explained Version 2,” http://www.blachford.info/computer/Cell/Cell1_v2.html

34 10.3.2 Cell BE (5)
Figure: Block diagram of the SPE
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

35 10.3.2 Cell BE (6)
Figure: Pipeline stages of the Cell BE
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

36 10.3.2 Cell BE (7)
Figure: Floor plan of a single SPE
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

37 10.3.2 Cell BE (8)
Principle of operation of the Element Interconnect Bus (EIB)
Source: Keable C., “And we also have hardware...,” 17th Machine Evaluation Workshop, Dec. 2006, http://www.cse.clrc.ac.uk/disco/mew17/talks/Keable_IBM_MEW17.pdf

38 10.3.2 Cell BE (9)
Figure: The Element Interconnect Bus (EIB)
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

39 10.3.2 Cell BE (10)
Figure: The Synergistic Memory Flow unit (SMF)
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

40 10.3.2 Cell BE (11)
Figure: Floor plan of the Cell BE processor (235 mm², 241 mtrs)
Source: Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf

41 10.3.2 Cell BE (12)
Table: Main features of IBM’s Cell BE
Series                    Cell BE
Implementation            Heterogeneous, 1 x PPE, 8 x SPE
Architecture              PowerPC 2.02
Cores                     PPE: 64-bit RISC; SPE: dual-issue 32-bit SIMD with 128-bit capability
Introduction              9/2006 (in the QS20 BladeCenter)
Technology                90 nm
Die size                  221 mm²
Nr. of transistors        234 mtrs
fc [GHz]                  3.0/3.2
L2                        PPE: 512 KB; SPE: 256 KB Local Store (128*128 bit)
Memory bandwidth          25 GB/s
TDP [W]                   95 W @ 3 GHz
Multithreading            PPE: 2-way; SPE: -
I/O bandwidth             Up to 75 GB/s
Interconnection network   Ring based
Memory controller         On-chip
L3                        -

42 10.3.2 Cell BE (13)
Figure: Cell BE Blade Roadmap
Source: Brochard L., “A Cell History,” Cell Workshop, April 2006, http://www.irisa.fr/orap/Constructeurs/Cell/Cell%20Short%20Intro%20Luigi.pdf

43 10.3.2 Cell BE (14)
Figure: Roadmap of the Cell BE
Source: Hofstee H. P., “Real-time Supercomputing and Technology for Games and Entertainment,” 2006, http://www.cercs.gatech.edu/docs/SC06_Cell_111606.pdf

44 10.3 Literature (1)
POWER4, POWER4+
Grassl C., “New IBM Components for HPCx,” Dec. 2003, http://www.hpcx.ac.uk/about/events/annual2003/Grassl.pdf
Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/
DeMone P., “Sizing Up the Super Heavyweights,” Real World Technologies, Sept. 2004, http://h21007.www2.hp.com/dspp/files/unprotected/Itanium/sizingsuperheavys.pdf
Krewell K., “IBM’s POWER4 Unveiling Continues,” Microprocessor Report, Nov. 20, 2000, pp. 1-4
Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., “Power4 System Microarchitecture,” IBM Server, Technical White Paper, Oct. 2001, http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf
POWER5, POWER5+
Grassl C., “New IBM Components for HPCx,” Dec. 2003, http://www.hpcx.ac.uk/about/events/annual2003/Grassl.pdf
Barney B., “IBM POWER Systems Overview,” Livermore Computing, 2006, http://www.llnl.gov/computing/tutorials/ibm_sp/
DeMone P., “Sizing Up the Super Heavyweights,” Real World Technologies, Sept. 2004, http://h21007.www2.hp.com/dspp/files/unprotected/Itanium/sizingsuperheavys.pdf
Kalla R., “IBM’s POWER5 Microprocessor Design and Methodology,” 2003, www-csl.csres.utexas.edu/users/billmark/teach/cs352-05-spring/lectures/Lecture22-RonKallaIBM.pdf
Tendler J.M., Dodson S., Fields S., Le H., Sinharoy B., “Power4 System Microarchitecture,” IBM J. Res. & Dev., Vol. 46, No. 1, Jan. 2002, pp. 5-25, http://www.research.ibm.com/journal/rd/461/tendler.pdf

45 10.3 Literature (2)
POWER5, POWER5+ (cont.)
Kalla R., Sinharoy B., Tendler J., “Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor,” 2003, http://www.hotchips.org/archives/hc15/3_Tue/11.ibm.pdf
Krewell K., “POWER5 Tops on Bandwidth,” Microprocessor Report, Dec. 2003, http://studies.ac.upc.edu/ETSETB/SEGPAR/microprocessors/power5%20(2)%20(mpr).pdf
Sinharoy B., Kalla R.N., Tendler J.M., Eickenmeyer R.J., Joyner J.B., “POWER5 system microarchitecture,” IBM J. R&D, Vol. 49, No. 4/5, 2005, pp. 505-521
Vetter S. et al., “IBM System p5 Quad-Core Module Based on POWER5+ Technology,” Redbooks paper, IBM Corp., 2006, http://www.redbooks.ibm.com/redpapers/pdfs/redp4150.pdf
POWER6
Kanter D., “IBM Previews the Power6,” Oct. 2006, dkanter@realwordtech.com
Cell BE
Brochard L., “A Cell History,” Cell Workshop, April 2006, http://www.irisa.fr/orap/Constructeurs/Cell/Cell%20Short%20Intro%20Luigi.pdf
Gschwind M., “Chip Multiprocessing and the Cell BE,” ACM Computing Frontiers, 2006, http://beatys1.mscd.edu/compfront//2006/cf06-gschwind.pdf
Blachford N., “Cell Architecture Explained Version 2,” http://www.blachford.info/computer/Cell/Cell1_v2.html
Day M. and Hofstee P., “Hardware and Software Architectures for the Cell Broadband Engine processor,” CODES, Sept. 2006, http://www.casesconference.org/cases2005/pdf/Cell-tutorial.pdf

46 10.3 Literature (3)
Cell BE (cont.)
Keable C., “And we also have hardware...,” 17th Machine Evaluation Workshop, Dec. 2006, http://www.cse.clrc.ac.uk/disco/mew17/talks/Keable_IBM_MEW17.pdf
Hofstee H. P., “Real-time Supercomputing and Technology for Games and Entertainment,” 2006, http://www.cercs.gatech.edu/docs/SC06_Cell_111606.pdf
Solie D., “Technology Trends Presentation,” Power Symposium, Aug. 2006, http://www-03.ibm.com/procurement/proweb.nsf/objectdocswebview/file14+-+darryl+solie+-+ibm+power+symposium+presentation/$file/14+-+darryl+solie-ibm-power+symposium+presentation+v2.pdf
-, “Cell Broadband Engine processor-based systems,” White Paper, IBM Corp., 2006
Krewell K., “Cell Moves Into The Limelight,” Microprocessor Report, Feb. 14, 2005, pp. 1-9
Gschwind M., Hofstee H. P., Flachs B. K., Hopkins M., Watanabe Y., Yamazaki T., “Synergistic Processing in Cell's Multicore Architecture,” IEEE Micro, Vol. 26, No. 2, 2006, pp. 10-24
Krolak D., “Unleashing the Cell Broadband Engine Processor,” MPR Fall Proc. Forum, Nov. 2005, http://www-128.ibm.com/developerworks/power/library/pa-fpfeib/?ca=dgr-lnxwCellConnects

