III. Multicore Processors (6)
Dezső Sima
Spring 2007 (Ver. 2.1)
Sun’s MC processors: Gemini, UltraSPARC IV line, UltraSPARC T line
10.4 Evolution of Sun’s processor lines
Figure: Overview of Sun’s major processor families [4.1]. Label in figure: Multi-core processors.
10.4 Sun’s MC processors: Gemini
Gemini (cancelled), 130 nm, 4/
Figure: Block diagram and die shot of the Gemini [4.1] Gemini (1)
Figure: Main features of the Gemini processor [4.1] Gemini (2)
Gemini (3)
Table: Main features of Sun’s Gemini
Model: Gemini
Dual/Quad-Core: DC
Cores: 2*UltraSPARC II
Introduction: Cancelled 4/2004
Technology: 130 nm
Die size: 206 mm²
Nr. of transistors:
f_c [GHz]: 1.0/
L2 size/allocation: 2*512 KB/private
L2 implementation: On-die
L3 size:
Mem. contr.: On-die
TDP [W]: 32
Socket: 959 pin PGA
Multithreading:
10.4 Sun’s MC processors: UltraSPARC IV line
UltraSPARC IV (Jaguar): 3/, 130 nm
UltraSPARC IV+ (Panther): 9/, 90 nm
Figure: UltraSPARC IV (Jaguar) [4.2] (ARB: Arbiter) UltraSPARC IV (1)
Figure: Floor plan of the UltraSPARC IV [4.3] UltraSPARC IV (2)
UltraSPARC IV (3)
Table: Main features of Sun’s UltraSPARC IV processor
Model                 Gemini              UltraSPARC IV (Jaguar)
Dual/Quad-Core        DC                  DC
Cores                 2*UltraSPARC II     2*UltraSPARC III
Introduction          Cancelled 4/2004    7/2004
Technology            130 nm              130 nm
Die size              206 mm²             352 mm²
Nr. of transistors                        66 mtrs.
f_c [GHz]             1.0/                1.050/1.200/
L2 size/allocation    2*512 KB/private    2*8 MB/private
L2 implementation     On-die              Off-die, L2 tags on-die
L3 size                                   16 MB/shared
Mem. contr.           On-die              On-die
TDP [W]               32                  108
Socket                959 pin PGA         1368 pin LGA
Multithreading
Figure: UltraSPARC IV+ (Panther) [4.2] UltraSPARC IV+ (1)
Figure: Die shot and floor plan of the UltraSPARC IV+ (19.7 x 17.0 mm) [4.7] UltraSPARC IV+ (2)
Figure: Contrasting the floor plans of the UltraSPARC IV and UltraSPARC IV+ dies [4.3], [4.7]. UltraSPARC IV: 130 nm / 356 mm² / 66 mtrs; UltraSPARC IV+: 90 nm / 335 mm² / 295 mtrs. UltraSPARC IV+ (3)
Figure: Shmoo plot of the UltraSPARC IV+ [4.7] UltraSPARC IV+ (4)
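The die figures quoted on these two slides can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers given on the slides; the names `DIES`, `die_area_mm2` and `density` are illustrative and do not come from the sources.

```python
# Figures as quoted on the slides [4.3], [4.7]; all names are illustrative.
DIES = {
    "UltraSPARC IV": {"area_mm2": 356, "mtrs": 66},
    "UltraSPARC IV+": {"area_mm2": 335, "mtrs": 295},
}


def die_area_mm2(width_mm, height_mm):
    """Area of a rectangular die outline in mm^2."""
    return width_mm * height_mm


def density(die):
    """Transistor density in million transistors per mm^2."""
    return die["mtrs"] / die["area_mm2"]


# The 19.7 x 17.0 mm outline of the IV+ should reproduce its quoted die size.
print(f"IV+ outline: {die_area_mm2(19.7, 17.0):.1f} mm^2")
for name, die in DIES.items():
    print(f"{name}: {density(die):.2f} mtrs/mm^2")
```

The outline works out to roughly 335 mm², in line with the quoted die size. The density figures are only a coarse comparison, since they mix logic and cache transistors.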
UltraSPARC IV+ (5)
Table: Main features of Sun’s UltraSPARC IV+ processor
Model                 Gemini              UltraSPARC IV (Jaguar)     UltraSPARC IV+ (Panther)
Dual/Quad-Core        DC                  DC                         DC
Cores                 2*UltraSPARC II     2*UltraSPARC III           2*UltraSPARC III
Introduction          Cancelled 4/2004    7/2004                     9/2005
Technology            130 nm              130 nm                     90 nm
Die size              206 mm²             352 mm²                    335 mm²
Nr. of transistors                        66 mtrs.                   295 mtrs.
f_c [GHz]             1.0/                1.050/1.200/               1.5/1.8
L2 size/allocation    2*512 KB/private    2*8 MB/private             2 MB/shared
L2 implementation     On-die              Off-die, L2 tags on-die    On-die
L3 size                                   16 MB/shared               32 MB/shared
L3 implementation                                                    L3 tags on-die, L3 exclusive of L2
Mem. contr.           On-die              On-die                     On-die
TDP [W]               32                  108                        90
Socket                959 pin PGA         1368 pin LGA               1368 pin LGA
Multithreading
10.4 Sun’s MC processors: UltraSPARC T line
UltraSPARC T1 (Niagara): 11/, 90 nm
UltraSPARC T2 (Niagara 2): 65 nm
Figure: Block diagram of the UltraSPARC T1 (Niagara) [4.10] UltraSPARC T1 (1)
Figure: Pipeline stages of the Niagara cores (scalar FX cores) [4.10] UltraSPARC T1 (2)
Figure: Die shot of Niagara [4.10] UltraSPARC T1 (3)
Figure: Floor plan and main features of Niagara [4.10] UltraSPARC T1 (4)
UltraSPARC T1 (5)
Table: Main features of Sun’s UltraSPARC T1 processor
Series: UltraSPARC T1
Models: UltraSPARC T1
Architecture: SPARC V9
Impl. of the cores: Monolithic
Nr. of cores: 8 cores
Cores: Scalar integer FX cores
Introduction: 11/2005
Technology: 90 nm
Die size: 379 mm²
Nr. of transistors:
f_c [GHz]:
L2 size/allocation: 3 MB/shared
L2 implementation: On-die
L3:
Multithreading: 4-way/core
TDP [W]: 63
Memory bandwidth: 25.6 GB/s
Memory controller: 4-channels, on-die, 400 MT/s
Interconnection NW: Bandwidth: >200 GB/s
I/O-bus: JBus (3.2 GB/s)
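Two of the T1 figures in this table follow from the others by multiplication. The sketch below is a hedged illustration: the 16-byte transfer width per memory channel is an assumption, chosen because it reproduces the quoted 25.6 GB/s; the table itself gives only the channel count and the transfer rate. Function and constant names are illustrative.

```python
CORES = 8                  # from the table
THREADS_PER_CORE = 4       # "4-way/core"
CHANNELS = 4               # "4-channels"
TRANSFER_RATE_MT_S = 400   # "400 MT/s"
BYTES_PER_TRANSFER = 16    # ASSUMPTION: 128-bit data path per channel


def hardware_threads(cores, threads_per_core):
    """Total number of hardware threads on the chip."""
    return cores * threads_per_core


def peak_bandwidth_gb_s(channels, rate_mt_s, bytes_per_transfer):
    """Peak bandwidth in GB/s (MT/s times bytes gives MB/s)."""
    return channels * rate_mt_s * bytes_per_transfer / 1000


print(hardware_threads(CORES, THREADS_PER_CORE), "threads")
print(peak_bandwidth_gb_s(CHANNELS, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER), "GB/s")
```

The thread count of 32 agrees with the titles of [4.9] and [4.10].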
Figure: Block diagram of the UltraSPARC T2 (Niagara-2) [4.12] UltraSPARC T2 (1)
Figure: Block diagram of the cores in Niagara 2 [4.12] UltraSPARC T2 (2)
Figure: The full crossbar switch of Niagara 2 [4.12] UltraSPARC T2 (3)
Figure: Main features and floor plan of the Niagara-2 [4.12] UltraSPARC T2 (4)
Figure: Floor plan of the Niagara-2 [4.13] UltraSPARC T2 (5)
Figure: Comparison of the block diagrams of Niagara-1 and -2 [4.14] UltraSPARC T2 (6)
UltraSPARC T2 (7)
Table: Main features of Sun’s UltraSPARC T2 processor
Series                UltraSPARC T1/T2
Architecture          SPARC V9
Impl. of the cores    Monolithic
Models                UltraSPARC T1                   UltraSPARC T2
Nr. of cores          8 cores                         8 cores
Cores                 Scalar integer FX cores         Dual-issue FX/FP cores
Introduction          11/2005                         2007
Technology            90 nm                           65 nm
Die size              379 mm²                         342 mm²
Nr. of transistors                                    n.a.
f_c [GHz]                                             1.4
L2 size/allocation    3 MB/shared                     4 MB/shared
L2 implementation     On-die                          On-die
L3
Multithreading        4-way/core                      8-way/core
TDP [W]               63                              72 (est.)
Memory bandwidth      25.6 GB/s                       42.7 GB/s
Memory controller     4-channels, on-die, 400 MT/s
Interconnection NW    Bandwidth: >200 GB/s            Full 8*9 crossbar switch
I/O-bus               JBus (3.2 GB/s)
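To see what the step from T1 to T2 means per hardware thread, the table values can be normalized by the thread count. This is a rough sketch using only numbers from the table; the `Chip` class and its property names are hypothetical helpers, and the per-thread figures assume resources are shared evenly, which the sources do not claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    cores: int
    threads_per_core: int
    l2_mb: float
    mem_bw_gb_s: float

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l2_kb_per_thread(self) -> float:
        # Even split of the shared L2 across threads (illustrative only).
        return self.l2_mb * 1024 / self.threads

    @property
    def mem_bw_per_thread_gb_s(self) -> float:
        return self.mem_bw_gb_s / self.threads


T1 = Chip("UltraSPARC T1", cores=8, threads_per_core=4, l2_mb=3, mem_bw_gb_s=25.6)
T2 = Chip("UltraSPARC T2", cores=8, threads_per_core=8, l2_mb=4, mem_bw_gb_s=42.7)

for chip in (T1, T2):
    print(f"{chip.name}: {chip.threads} threads, "
          f"{chip.l2_kb_per_thread:.0f} KB L2/thread, "
          f"{chip.mem_bw_per_thread_gb_s:.2f} GB/s per thread")
```

Under these assumptions the T2 doubles the thread count while the shared L2 and the memory bandwidth grow by less than a factor of two, so both per-thread figures shrink.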
10.4 Literature (1)
Gemini
[4.1] Kapil S., "Gemini," 2003
UltraSPARC IV
[4.2] Boussard C., "Architecture des processeurs"
[4.3] Krewell K., "UltraSPARC IV Mirrors Predecessor," Microprocessor Report, Nov. 10, 2003
[4.4] - "UltraSPARC IV Processor User’s Manual Supplement," Ver. 1.0, Sun Microsystems, Apr. 2004
[4.5] - "UltraSPARC IV Processor Architecture Overview," Technical Whitepaper, Feb. 2004
UltraSPARC IV+
[4.6] Boussard C., "Architecture des processeurs"
[4.7] Dixit A. et al., "Implementation and Productization of a 4th Generation 1.8 GHz Dual-Core SPARC V9 Microprocessor," Feb. 2006
[4.8] - "UltraSPARC IV+ Processor User’s Manual Supplement," Ver. 1.0, Sun Microsystems, Oct. 2005
10.4 Literature (2)
UltraSPARC T1
[4.9] Kongetira P., Aingaran K., Olukotun K., "Niagara: A 32-way Multithreaded SPARC Processor," IEEE Micro, March-April 2005, pp.
[4.10] Laudon J., "UltraSPARC T1: A 32-threaded CMP for Servers," 2006
[4.11] - "UltraSPARC T1 Supplement to the UltraSPARC Architecture 2005," Draft D2.0, March 2006
UltraSPARC T2
[4.12] Golla R., "Niagara2: A Highly Threaded Server-on-a-Chip," Oct. 2006
[4.13] Grohoski G., "Niagara-2," Aug. 2006
[4.14] Kanter D., "Niagara II, The Hydra Returns"
[4.15] McGhan H., "Niagara 2 Opens The Floodgates," Microprocessor Report, Nov. 6, 2006, pp. 1-9
Fujitsu’s MC processors: SPARC64 VI, SPARC64 VII
10.5 Fujitsu’s MC processors: Dual-core SPARC64 line
SPARC64 VI (Olympus): 90 nm, 2007
SPARC64 VII (Jupiter): 65 nm, 2008
10.5 SPARC64 VI
Figure: Block diagram of the SPARC64 VI [5.1]
Reservation Stations (E: FX, F: FP, A: Address, BR: Branch, FP/SP: L/S)
Execution Units (EX: FX, FL: FP, AGEN: Adr. Gen.)
10.5 SPARC64 VII (1) Figure: Block diagram of the SPARC64 VII [5.1]
10.5 SPARC64 VI/VII (2)
Table: Main features of Fujitsu’s multi-core processors (superscalar RISCs)
Series                SPARC64
Models                SPARC64 VI (Olympus)       SPARC64 VII (Jupiter)
Cores                 2*SPARC64 V (enhanced)     4*SPARC64 VI (enhanced)
Introduction          2007                       2008
Technology            90 nm                      65 nm
Die size              421 mm²                    464 mm²
Nr. of transistors                               n.a.
f_c [GHz]                                        ~2.7
L2 size/allocation    6 MB/shared
L2 implementation     On-die                     On-die
L3
Multithreading        2-way                      2-way
TDP [W]               120                        ~120
FSB [MT/s]            Jupiter Bus
10.5 Literature
SPARC64 line
[5.1] Inoue A., "Fujitsu SPARC64 VI," Fall Microprocessor Forum, Oct. 2006, Fujitsu Ltd.
[5.2] Krewell K., "Fujitsu Makes SPARC See Double," Microprocessor Report, Nov. 24, 2003, pp. 1-3
[5.3] Krewell K., "SPARC’s Still Going Strong," Microprocessor Report, Nov. 14, 2005, pp. 1-3
[5.4] Maruyama T., "SPARC64 VI/VI+ Next Generation Processor," MPF, Oct. 2005
HP’s MC processors: PA-8800, PA-8900
10.6 HP’s dual-core processors: Dual-core PA-8xxx processors (PA-8700-based)
PA-8800 (Mako): 130 nm, 2/2004
PA-8900 (Shortfin): 130 nm, 5/2005
Figure: The underlying PA-8700 core (source: E&M Computing) 10.6 PA-8800 (1)
Figure: Block diagram of the PA-8800 [6.2] (further source: Lostcircuits, Oct. 2001) 10.6 PA-8800 (2)
Figure: Floorplan of the Mako [6.2] 10.6 PA-8800 (3)
Figure: Contrasting the floor plans of the PA-8700 and PA-8800 processors [6.2] (further source: E&M Computing) PA-8800 (4)
10.6 PA-8900 (1)
Table: Main features of HP’s PA-8800 and PA-8900 processors
Models                PA-8800 (Mako)                  PA-8900 (Shortfin)
Dual/Quad-Core        DC                              DC
Cores                 2*PA-8700                       2*PA-8700
Introduction          2/2004                          5/2005
Technology            130 nm                          130 nm
Die size              366 mm²                         366 mm²
Nr. of transistors
f_c [GHz]             0.8/
L2 size/allocation    32 MB/shared                    64 MB/shared
L2 implementation     Tags on-chip, data off-chip     Tags on-chip, data off-chip
TDP [W]               55                              n.a.
Mem. contr.           Off-die
Impl. of the cores    Monolithic
FSB                   400 MT/s (16 B)
Architecture          PA-RISC 2.0
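The FSB entry "400 MT/s (16 B)" implies a peak bus bandwidth that the table does not spell out. A minimal sketch of the arithmetic follows; the even split between the two cores is an illustrative assumption, not a statement from the sources, and the names are hypothetical.

```python
def bus_bandwidth_gb_s(rate_mt_s, bytes_per_transfer):
    """Peak bus bandwidth in GB/s (MT/s times bytes gives MB/s)."""
    return rate_mt_s * bytes_per_transfer / 1000


FSB_GB_S = bus_bandwidth_gb_s(400, 16)   # values from the table
PER_CORE_GB_S = FSB_GB_S / 2             # ASSUMPTION: two cores share the bus evenly

print(f"FSB peak: {FSB_GB_S} GB/s, per core: {PER_CORE_GB_S} GB/s")
```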
10.6 Literature
PA 8800/8900
[6.1] MS, "HP PA-8800 RISC Processor," Lostcircuits, Oct. 2001
[6.2] Johnson D., "HP’s Mako processor," Oct. 2001, ftp.parisc-linux.org/docs/whitepapers/mako_mpf_2001.pdf
[6.3] Weissmann P., "The OpenPA Project," First Edition, Berlin, 2007
RMI’s MC processors: XLR line (embedded)
10.7 RMI’s MC processors: XLR line (embedded)
XLR: 90 nm, 5/2005
XLR line (1)
Aim: embedded systems handling tasks such as packet data transfers, cryptography functions, authentication operations, TCP/IP CRC calculations and network interface data management
Cores:
64-bit MIPS64 with XLR enhancements
4-way multithreaded
up to 1.5 GHz
32 KB L1 I$, 32 KB L1 D$
branch prediction
Figure: XLR cores [7.3]
Figure: Architecture of the XLR family [7.4] 10.7 XLR line (2)
Figure: Block diagram of RMI’s XLR family [7.1] 10.7 XLR line (3)
Figure: The Fast Messaging Network (FMN) [7.3] 10.7 XLR line (4)
Figure: The Memory Distributed Interconnect (MDI) providing 484 Gbits/s bandwidth [7.1] 10.7 XLR line (5)
Figure: Floor plan of the XLR die [7.1] 10.7 XLR line (6)
10.7 XLR line (7)
Table: Main features of RMI’s XLR lines
Series: XLR 300/XLR 500/XLR 700
Architecture: MIPS64
Nr. of cores: 2/4/8
Cores: Scalar FX cores
Introduction: 5/2005
Technology: 90 nm
Die size: ~220 mm²
Nr. of transistors:
f_c [GHz]:
L2 size/allocation: 2 MB/shared
L2 implementation: On-chip
L3:
Multithreading: 4-way/core
TDP [W]:
Memory controller: Two on-die memory controllers, each supporting two 32-bit or one 64-bit memory channel
Interconnection networks: Three on-die rings:
  Memory Distributed Interconnect (48 GB/s)
  Fast Messaging Network (24 GB/s)
  I/O Distributed Interconnect (61 GB/s)
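The thread counts and ring bandwidths of the XLR family can be derived from this table. The sketch below is illustrative only: pairing XLR 300/500/700 with 2/4/8 cores in that order is an assumption read off the order of the entries, and all names are hypothetical.

```python
THREADS_PER_CORE = 4  # "4-way/core"

# ASSUMPTION: series and core counts pair up in the order they are listed.
CORES_BY_SERIES = {"XLR 300": 2, "XLR 500": 4, "XLR 700": 8}

RINGS_GB_S = {
    "Memory Distributed Interconnect": 48,
    "Fast Messaging Network": 24,
    "I/O Distributed Interconnect": 61,
}


def threads(cores):
    """Hardware threads for a given core count."""
    return cores * THREADS_PER_CORE


def gb_to_gbit(gb_s):
    """Convert GB/s to Gbit/s (8 bits per byte)."""
    return gb_s * 8


for series, cores in CORES_BY_SERIES.items():
    print(f"{series}: {cores} cores, {threads(cores)} threads")
for ring, gb_s in RINGS_GB_S.items():
    print(f"{ring}: {gb_s} GB/s = {gb_to_gbit(gb_s)} Gbit/s")
```

Under this conversion the 48 GB/s listed for the Memory Distributed Interconnect corresponds to 384 Gbit/s, which does not match the 484 Gbits/s quoted in the figure caption on slide XLR line (5); the sources do not resolve which figure is intended.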
10.7 Literature
XLR series
[7.1] Krewell K., "A New MIPS Powerhouse Arrives," Microprocessor Report, 5/17/
[7.2] - "Multicore, multithreaded chips ship with Linux," LinuxDevices, May 2005
[7.3] - "XLR Processor Product Overview," Preliminary, May 2005
[7.4] - "RMI XLR Processor Family Product Brief," Document # 2001PB, RMI Inc., 2007