11. Multicore Processors. Dezső Sima, Fall 2006. © D. Sima, 2006

Overview
1 Overview of MCPs
2 Attaching L2 caches
3 Attaching L3 caches
4 Connecting memory and I/O
5 Case examples

1. Overview of MCPs (1)
Figure 1.1: Processor power density trends
Source: D. Yen: Chip Multithreading Processors Enable Reliable High Throughput Computing, http://www.irps.org/05-43rd/IRPS_Keynote_Yen.pdf

1. Overview of MCPs (2)
Figure 1.2: Single-stream performance vs. cost
Source: Marr D. T. et al.: „Hyper-Threading Technology Architecture and Microarchitecture”, Intel Technology Journal, Vol. 06, Issue 01, Feb. 14, 2002, pp. 4-16

1. Overview of MCPs (2)
Figure 1.2: Dual/multi-core processors (1)

1. Overview of MCPs (3)
Figure 1.3: Dual/multi-core processors (2)

1. Overview of MCPs (4)
Macro architecture of dual/multi-core processors (MCPs):
Layout of the cores
Attaching of L2 caches
Attaching of L3 caches (if available)
Layout of the I/O and memory architecture

2.1 Main aspects of attaching L2 caches to MCPs (1): allocation to the cores; use by instructions/data; integration of the L2 caches into the processor chip; inclusion policy; banking policy

Allocation of the L2 caches to the cores
Private L2 cache for each core: UltraSPARC IV (2004), Smithfield (2005), Athlon 64 X2 (2005), Montecito (2006?)
Shared L2 cache for all cores: UltraSPARC T1 (2005), Yonah (2006), Core Duo (2006), POWER4 (2001), POWER5 (2005)
Expected trend: shared L2

2.1 Main aspects of attaching L2 caches to MCPs (2): allocation to the cores; use by instructions/data; integration of the L2 caches into the processor chip; inclusion policy; banking policy

Inclusion policy of the L2 caches: exclusive L2 vs. inclusive L2
In an exclusive L2 hierarchy, lines replaced (victimized) in the L1 are written into the L2, and a reference that hits in the L2 reloads that cache line into the L1. The L2 usually operates as a write-back cache: only modified data replaced in the L2 is written back to memory, while unmodified data replaced in the L2 is simply discarded.
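
To make the difference concrete, here is a minimal sketch of how an access is handled under the exclusive policy described above, contrasted with an inclusive policy. The capacities, full associativity and LRU replacement are illustrative assumptions, not taken from the slide.

```python
# Minimal sketch of exclusive vs. inclusive L1/L2 handling (illustrative only).
from collections import OrderedDict

class Level:
    """A tiny fully associative cache level with LRU replacement."""
    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()            # line address -> dirty flag

    def lookup(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)      # refresh LRU position
            return True
        return False

    def insert(self, addr, dirty=False):
        """Insert a line; return the evicted (addr, dirty) pair or None."""
        victim = None
        if len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)   # evict the LRU line
        self.lines[addr] = dirty
        return victim

def access_exclusive(l1, l2, addr):
    """Exclusive hierarchy: a line lives either in the L1 or in the L2."""
    if l1.lookup(addr):
        return "L1 hit"
    if l2.lookup(addr):
        del l2.lines[addr]        # an L2 hit reloads the line into the L1 ...
    victim = l1.insert(addr)      # ... and removes it from the L2
    if victim is not None:
        # the L1 victim is written into the L2 (a modified L2 victim would be
        # written back to memory, an unmodified one dropped; not modelled here)
        l2.insert(*victim)
    return "filled into L1"

def access_inclusive(l1, l2, addr):
    """Inclusive hierarchy: every L1 line is also kept in the L2
    (back-invalidation of the L1 on an L2 eviction is omitted here)."""
    if not l2.lookup(addr):
        l2.insert(addr)           # allocate in the L2 first
    if not l1.lookup(addr):
        l1.insert(addr)           # then fill the L1; the copy stays in the L2
    return "present in both levels"
```

With the exclusive organization the L1 and L2 capacities add up, at the price of victim traffic between the levels; with the inclusive organization every L1 line is duplicated in the L2.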

Figure 1.1: Implementation of exclusive L2 caches
Source: Zheng, Y., Davis, B.T., Jordan, M.: “Performance evaluation of exclusive cache hierarchies”, 2004 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2004, pp. 89-96.

Inclusion policy of the L2 caches
Exclusive L2: Athlon 64 X2 (2005)
Inclusive L2: most implementations
Expected trend

2.1 Main aspects of attaching L2 caches to MCPs (3): allocation to the cores; use by instructions/data; integration of the L2 caches into the processor chip; inclusion policy; banking policy

Use of the L2 by instructions/data
Unified instr./data cache(s): UltraSPARC IV (2004), UltraSPARC T1 (2005), POWER4 (2001), POWER5 (2005), Smithfield (2005), Yonah (2006), Core Duo (2006), Athlon 64 X2 (2005)
Split instr./data caches: Montecito (2006?)
Expected trend

2.1 Main aspects of attaching L2 caches to MCPs (4): allocation to the cores; use by instructions/data; integration of the L2 caches into the processor chip; inclusion policy; banking policy

Banking policy: single-banked implementation vs. multi-banked implementation

2.1 Main aspects of attaching L2 caches to MCPs (5): allocation to the cores; use by instructions/data; integration of the L2 caches into the processor chip; inclusion policy; banking policy

Integration of the L2 into the processor chip
On-chip L2 tags/contr., off-chip L2 data: UltraSPARC IV (2004), UltraSPARC V (2005)
Entire L2 on chip: POWER4 (2001), POWER5 (2005), Smithfield (2005), Presler (2005), Athlon 64 X2 (2005)
Expected trend: entire L2 on chip

2.2 Examples of attaching L2 caches to MCPs (1): private L2 caches for each core
Unified instruction/data caches, on-chip L2 tags/contr. with off-chip L2 data: UltraSPARC IV (2004)
Unified instruction/data caches, entire L2 on chip: Smithfield (2005), Presler (2005), Athlon 64 X2 (2005) (exclusive L2)
Split instruction/data caches, entire L2 on chip: Montecito (2006?)
[Block diagrams omitted: cores with their private L2s, plus the UltraSPARC IV interconnection network and Fire Plane bus interface, the Montecito L3 and FSB interface, the Smithfield/Presler FSB interface, and the Athlon 64 X2 system request queue, crossbar, on-chip memory controller and HyperTransport interface.]

2.2 Examples of attaching L2 caches to MCPs (2): shared L2 caches for all cores
Dual core / single-banked L2: Yonah Duo (2006), Core (2006)
Dual core / multi-banked L2: POWER4 (2001), POWER5 (2005)
Multi-core / multi-banked L2: UltraSPARC T1 (Niagara, 2005), with 8 cores and 4 L2 banks
Mapping of addresses to the banks: in the POWER4/POWER5 the 128-byte L2 cache lines are hashed across the three L2 modules, the hash being computed by modulo-3 arithmetic applied to a large number of real address bits; in the UltraSPARC T1 the four L2 banks are interleaved at 64-byte granularity.
[Block diagrams omitted: cores, crossbar resp. fabric bus controller, L2 banks, L3 tags/controller (POWER4/5), memory controllers and system/GX interfaces.]
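
As an illustration of these two bank-mapping schemes, the sketch below computes a POWER4/POWER5-style modulo-3 module index over the 128-byte line address and an UltraSPARC T1-style bank index from the bits just above the 64-byte block offset. The exact real-address bit ranges are assumptions; the slide only names the algorithms.

```python
# Sketch of the two bank-selection schemes named above (bit ranges assumed).
LINE_SIZE_P4  = 128    # POWER4/POWER5: 128-byte L2 lines, 3 L2 modules
BLOCK_SIZE_T1 = 64     # UltraSPARC T1: 4 L2 banks interleaved at 64 bytes

def power4_l2_module(real_addr, n_bits=28):
    """Hash a 128-byte line across 3 modules by modulo-3 arithmetic
    over many real-address bits (here: n_bits bits above the line offset)."""
    line_addr = real_addr // LINE_SIZE_P4
    return (line_addr & ((1 << n_bits) - 1)) % 3        # module 0..2

def t1_l2_bank(real_addr):
    """Select one of 4 banks from the two address bits just above
    the 64-byte block offset."""
    return (real_addr // BLOCK_SIZE_T1) & 0b11          # bank 0..3

if __name__ == "__main__":
    for addr in (0x0000, 0x0040, 0x0080, 0x00C0, 0x2000):
        print(hex(addr), "-> POWER4 module", power4_l2_module(addr),
              "| T1 bank", t1_l2_bank(addr))
```

The simple interleave spreads consecutive 64-byte blocks round-robin over the banks, while the modulo-3 hash folds in many address bits so that strided access patterns are less likely to concentrate on a single module.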

3. Attaching L3 caches
Macro architecture of dual/multi-core processors (MCPs):
Layout of the cores
Attaching of L2 caches
Attaching of L3 caches (if available)
Layout of the I/O and memory architecture

3.1 Main aspects of attaching L3 caches to MCPs (1): allocation to the L2 cache(s); use by instructions/data; integration of the L3 caches into the processor chip; inclusion policy; banking policy

Allocation of the L3 caches to the L2 caches
Shared L3 cache for all L2s: POWER4 (2001), POWER5 (2005), UltraSPARC IV+ (2005)
Private L3 cache for each L2: Montecito (2006?)

3.1 Main aspects of attaching L3 caches to MCPs (2): allocation to the L2 cache(s); use by instructions/data; integration of the L3 caches into the processor chip; inclusion policy; banking policy

Inclusion policy of the L3 caches: exclusive L3 vs. inclusive L3
In an exclusive L3 hierarchy, lines replaced (victimized) in the L2 are written into the L3, and a reference that hits in the L3 reloads that cache line into the L2. The L3 usually operates as a write-back cache: only modified data replaced in the L3 is written back to memory, while unmodified data replaced in the L3 is simply discarded.

Inclusion policy of the L3 caches
Exclusive L3: POWER5 (2005), UltraSPARC IV+ (2005)
Inclusive L3: POWER4 (2001), Montecito (2006?)
Expected trend

3.1 Main aspects of attaching L3 caches to MCPs (3): allocation to the L2 cache(s); use by instructions/data; integration of the L3 caches into the processor chip; inclusion policy; banking policy

Use of the L3 by instructions/data: unified instr./data cache(s) vs. split instr./data caches. All multicore processors unveiled so far use a unified L3, holding both instructions and data.

3.1 Main aspects of attaching L3 caches to MCPs (4): allocation to the L2 cache(s); use by instructions/data; integration of the L3 caches into the processor chip; inclusion policy; banking policy

Banking policy: single-banked implementation vs. multi-banked implementation

3.1 Main aspects of attaching L3 caches to MCPs (5): allocation to the L2 cache(s); use by instructions/data; integration of the L3 caches into the processor chip; inclusion policy; banking policy

Integration of the L3 into the processor chip
On-chip L3 tags/contr., off-chip L3 data: UltraSPARC IV+ (2005), POWER4 (2001), POWER5 (2005)
Entire L3 on chip: Montecito (2006?)
Expected trend: entire L3 on chip

3.2 Examples of attaching L3 caches to MCPs (1): inclusive L3 caches
Shared L3 cache for all L2 cache banks, with on-chip L3 tags/contr. and off-chip L3 data: POWER4 (2001)
Private L3 cache for each L2, with the entire L3 on chip: Montecito (2006?)
[Block diagrams omitted: the POWER4 shared L2, fabric bus controller, L3 tags/controller with off-chip L3 data and memory controller; the Montecito per-core L2 I/D caches, private L3s, arbiter, system interface and FSB.]

3.2 Examples of attaching L3 caches to MCPs (2): exclusive L3 caches
Shared L3 cache for all L2 cache banks, with on-chip L3 tags/contr. and off-chip L3 data: POWER5 (2005), UltraSPARC IV+ (2005)
[Block diagrams omitted: cores, L2, on-chip L3 tags/controller with off-chip L3 data, fabric bus controller resp. interconnection network, memory controller, and the POWER5 system interface resp. the UltraSPARC IV+ Fire Plane bus interface.]

4. Connecting memory and I/O
Macro architecture of dual/multi-core processors (MCPs):
Layout of the cores
Attaching of L2 caches
Attaching of L3 caches (if available)
Layout of the I/O and memory architecture

4.1 Overview
Layout of the I/O and memory architecture in dual/multi-core processors:
Connection policy of I/O and memory
Integration of the memory controller into the processor chip

4.2 Connection policy (1)
Connection policy of I/O and memory:
Connecting both I/O and memory via the system bus: PA-8800 (2004), PA-8900 (2005), Smithfield (2005), Presler (2005), Yonah Duo (2006), Core (2006), Montecito (2006?)
Dedicated connection of I/O and memory, asymmetric: POWER4 (2001), UltraSPARC T1 (2005)
Dedicated connection of I/O and memory, symmetric: POWER5 (2005), UltraSPARC IV (2004), UltraSPARC IV+ (2005), Athlon 64 X2 (2005)

4.2 Connection policy (2): connecting both I/O and memory via the system bus
Examples: Smithfield/Presler (2005), Yonah Duo/Core (2006), Montecito (2006), PA-8800 (2004), PA-8900 (2005)
[Block diagrams omitted: in each design the cores and their L2 (and L3, where present) connect to a system bus interface that carries both memory and I/O traffic over the FSB.]

4.2 Connection policy (3)
Dedicated connection of I/O and memory, asymmetric variant: I/O is connected via the internal interconnection network, while memory is connected via the L2/L3 cache. Examples: POWER4 (2001), UltraSPARC T1 (2005).

4.2 Connection policy (4): asymmetric connection of I/O and memory
Examples: UltraSPARC T1 (2005), POWER4 (2001)
[Block diagrams omitted: in the UltraSPARC T1 the eight cores reach four L2 banks through a crossbar, each bank has its own memory controller, and I/O is attached via the JBus interface; in the POWER4 memory is reached through the L3 directory/controller and memory controller below the shared L2, while I/O is attached to the fabric bus controller via the GX bus controller.]

4.2 Connection policy (5)
Dedicated connection of I/O and memory, symmetric variant: both I/O and memory are connected via the internal interconnection network. Examples: POWER5 (2005), UltraSPARC IV (2004), UltraSPARC IV+ (2005), Athlon 64 X2 (2005).

4.2 Connection policy (6): symmetric connection of I/O and memory (1)
Examples: POWER5 (2005), UltraSPARC IV (2004)
[Block diagrams omitted: in the POWER5 both the on-chip memory controller and the GX (I/O) bus controller attach to the fabric bus controller; in the UltraSPARC IV both the memory controller and the Fire Plane system interface attach to the chip's interconnection network.]

4.2 Connection policy (7): symmetric connection of I/O and memory (2)
Examples: Athlon 64 X2 (2005), UltraSPARC IV+ (2005)
[Block diagrams omitted: in the Athlon 64 X2 the cores' private L2s feed the system request queue and crossbar, to which both the on-chip memory controller and the HyperTransport bus controller attach; in the UltraSPARC IV+ the shared L2 and the L3 tags/controller sit on the interconnection network together with the memory controller and the Fire Plane system interface.]

4.3 Integration of the memory controller into the processor chip
Off-chip memory controller: POWER4 (2001), PA-8800 (2004), PA-8900 (2005), Smithfield (2005), Presler (2005), Yonah Duo (2006), Core (2006), Montecito (2006?)
On-chip memory controller: POWER5 (2005), UltraSPARC IV (2004), UltraSPARC IV+ (2005), UltraSPARC T1 (2005), Athlon 64 X2 (2005)
Expected trend: on-chip memory controller

5. Case examples
5.1 Intel MCPs (1)
Figure 5.1: The move to Intel multi-core
Source: A. Loktu: Itanium 2 for Enterprise Computing, http://h40132.www4.hp.com/upload/se/sv/Itanium2forenterprisecomputing.pps

5.1 Intel MCPs (2)
Figure 5.2: Processor specifications of Intel’s Pentium D family (90 nm)
Source: http://www.intel.com/products/processor/index.htm

5.1 Intel MCPs (3)
ED: Execute Disable Bit. Malicious buffer overflow attacks pose a significant security threat: in a typical attack a malicious worm injects a flood of code into a buffer, allowing the worm to propagate itself over the network to other computers. Combined with a supporting operating system, the Execute Disable Bit can help prevent certain classes of such attacks. It allows the processor to classify areas of memory according to whether application code may execute there; when a worm attempts to insert code into a data buffer, the processor refuses to execute it, preventing damage and further propagation.
VT: Virtualization Technology. A set of hardware enhancements to Intel’s server and client platforms that can improve the performance and robustness of traditional software-based virtualization solutions. Virtualization allows a platform to run multiple operating systems and applications in independent partitions, so that one computer system can function as multiple "virtual" systems.
EIST: Enhanced Intel SpeedStep Technology. First delivered in Intel’s mobile and server platforms, it allows the system to dynamically adjust processor voltage and core frequency, which can decrease average power consumption and average heat production.
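
As a quick way to see whether a given part exposes these three features, the sketch below checks the CPU flags reported by the kernel. It is Linux-only (it parses /proc/cpuinfo), and the mapping of the marketing names to the kernel's flag names "nx", "vmx" and "est" is the conventional one, assumed here.

```python
# Linux-only sketch: report whether the running CPU advertises ED/NX, VT and EIST.
def cpu_flags(path="/proc/cpuinfo"):
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

if __name__ == "__main__":
    flags = cpu_flags()
    for name, flag in (("Execute Disable Bit (ED/NX)", "nx"),
                       ("Virtualization Technology (VT)", "vmx"),
                       ("Enhanced SpeedStep (EIST)", "est")):
        print(f"{name}: {'reported' if flag in flags else 'not reported'}")
```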

5.1 Intel MCPs (4)
Figure 5.3: Processor specifications of Intel’s Pentium D family (65 nm)
Source: http://www.intel.com/products/processor/index.htm

5.1 Intel MCPs (5)
Figure 5.4: Specifications of Intel’s Pentium Processor Extreme Edition models 840/955/965
Source: http://www.intel.com/products/processor/index.htm

5.1 Intel MCPs (6)
Figure 5.5: Processor specifications of Intel’s Yonah Duo (Core Duo) family
Source: http://www.intel.com/products/processor/index.htm

5.1 Intel MCPs (7)
Figure 5.6: Specifications of Intel’s Core processors
Source: http://www.intel.com/products/processor_number/chart/core2duo.htm

5.1 Intel MCPs (8)
Figure 5.7: Future 65 nm processors (overview); for each code name: cores, cache, market introduction
Desktop:
Kentsfield: dual core, multi-die, 4 MB; mid 2007
Conroe: dual core, single die, 4 MB shared; end 2006
Allendale: 2 MB shared
Cedar Mill (NetBurst/P4): single core, 512 kB / 1 MB / 2 MB; early 2006
Presler (NetBurst/P4): dual core, dual die
Desktop/Mobile:
Millville: 1 MB; early 2007
Mobile:
Yonah2: dual core, single die, 2 MB
Yonah1: 1/2 MB; mid 2006
Stealey: 512 kB
Merom: 2/4 MB shared
Enterprise:
Sossaman
Woodcrest
Clovertown: quad core, multi-die
Dempsey (NetBurst/Xeon)
Tulsa: 4/8/16 MB
Whitefield: quad core, single die, 8 MB / 16 MB shared; early 2008
Source: P. Schmid: Top Secret Intel Processor Plans Uncovered, www.tomshardware.com/2005/12/04/top_secret_intel_processor_plans_uncovered

5.1 Intel MCPs (9)
Figure 5.8: Future 45 nm processors (overview); for each code name: cores, cache, market introduction
Desktop:
Wolfdale: dual core, single die, 3 MB shared; 2008
Ridgefield: dual core, single die, 6 MB shared
Yorkfield: 8 cores, multi-die, 12 MB shared; 2008+
Bloomfield: quad core, single die
Desktop/Mobile:
Perryville: single core, 2 MB
Mobile:
Penryn: 3 MB / 6 MB shared
Silverthorne
Enterprise:
Harpertown
Source: P. Schmid: Top Secret Intel Processor Plans Uncovered, www.tomshardware.com/2005/12/04/top_secret_intel_processor_plans_uncovered

5.2 AMD’s Athlon 64 X2
Figure 5.9: AMD Athlon 64 X2 dual-core processor architecture
Source: AMD Athlon 64 X2 Dual-Core Processor for Desktop – Key Architecture Features, http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_9485_13041.00.html

5.3 Sun’s UltraSPARC IV/IV+ (1) ARB: Arbiter Figure 5.10: UltraSPARC IV (Jaguar) Source: C. Boussard: Architecture des processeurs http://laser.igh.cnrs.fr/IMG/pdf/SUN-CNRS-archi-cpu-3.pdf

5.3 Sun’s UltraSPARC IV/IV+ (2) Figure 5.11: UltraSPARC IV+ (Panther) Source: C. Boussard: Architecture des processeurs http://laser.igh.cnrs.fr/IMG/pdf/SUN-CNRS-archi-cpu-3.pdf

5.4 POWER4/POWER5 (1)
Figure 5.12: POWER4 chip logical view
Legend (figure abbreviations): Core Interface Unit (crossbar), Service Processor, Power-On Reset, Built-In Self-Test, Non-Cacheable Unit, Multi-Chip Module
Source: J.M. Tendler, S. Dodson, S. Fields, H. Le, B. Sinharoy: POWER4 System Microarchitecture, IBM Server Technical White Paper, October 2001, http://www-03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf

5.4 POWER4/POWER5 (2) Figure 5.13: POWER4 chip Source: R. Kalla, B. Sinharoy, J. Tendler: Simultaneous Multi-threading Implementation in Power5 – IBM’s Next Generation POWER Microprocessor, 2003 http://www.hotchips.org/archives/hc15/3_Tue/11.ibm.pdf

5.4 POWER4/POWER5 (3)
Figure 5.14: POWER4 and POWER5 system structures
Source: R. Kalla, B. Sinharoy, J.M. Tendler: IBM POWER5 chip: a dual-core multithreaded processor, IEEE Micro, Vol. 24, No. 2, March-April 2004, pp. 40-47.

5.5 Cell (1)
Legend: SPE: Synergistic Processing Element; EIB: Element Interconnect Bus; MFC: Memory Flow Controller; PPE: Power Processing Element; AUC: Atomic Update Cache
Figure 5.15: Cell (BE) microarchitecture
Source: IBM: „Cell Broadband Engine™ processor-based systems”, IBM Corp., 2006

Figure 5.16: Cell SPE architecture Source: Blachford N.: „Cell Architecture Explained Version 2”, http://www.blachford.info/computer/Cell/Cell1_v2.html

Figure 5.17: Cell floorplan Source: Blachford N.: „Cell Architecture Explained Version 2”, http://www.blachford.info/computer/Cell/Cell1_v2.html