Dezső Sima Evolution of Intel’s Basic Microarchitectures - 2 April 2013 Vers. 3.3.



Contents
1. Introduction
2. Core 2
3. Penryn
4. Nehalem
5. Nehalem-EX
6. Westmere
7. Westmere-EX

Contents (continued)
8. Sandy Bridge
9. Sandy Bridge Extreme Edition
10. Ivy Bridge
11. Haswell
12. Overview of the evolution

8. Sandy Bridge
8.1 Introduction
8.2 Advanced Vector Extension (AVX)
8.3 On-die ring interconnect bus
8.4 On-die integrated graphics unit
8.5 Enhanced turbo boost technology

8.1 Introduction (1)
Sandy Bridge is Intel’s new microarchitecture using 32 nm line width. First delivered in 1/2011.

8.1 Introduction (2)
Figure: Main functional units of Sandy Bridge [143] Part 4
Labels recoverable from the figure:
Per core: 32K L1D (3 clk), 256 KB L2 (9 clk), AVX 256 bit, 4 operands, Hyperthreading, AES instructions, VMX unrestricted, 20 mm² / core
Shared: 8 MB L3, ring architecture (25 clk), 256 b/cycle, ... GHz (to L3 connected), PCIe, DDR ... GB/s
Die: 32 nm process / ~225 mm² die size / 85W TDP

8.1 Introduction (3)
Overview of the Sandy Bridge based processor lines (based on [62] and [63])

Desktops (Sandy Bridge):
Core i3-21xx, 2C, no HT, no vPro, 2/2011
Core i5-23xx, 4C+G, no HT, no vPro, 1/2011
Core i5-24xx/25xx, 4C+G, no HT, vPro, 1/2011
Core i7-26xx, 4C+G, HT, vPro, 1/2011
Core i7-2700K, 4C+G, HT, no vPro, 10/2011

Desktops (Sandy Bridge-E, see Section 9):
Core i7-3960X, 6C, HT, vPro??, 11/2011
Core i7-3930K, 6C, HT, vPro??, 11/2011

Mobiles:
Core i3-23xxM, 2C, 2/2011
Core i5-24xxM/25xxM, 2C, 2/2011
Core i7-26xxQM/27xxQM/28xxQM, 4C, 1/2011
Core i7 Extreme-29xxXM, 4C, Q

Servers:
UP-Servers: E3 12xx, Sandy Bridge-H2, 4C, 3/2011
DP-Servers: E5 2xxx, Sandy Bridge-EP, up to 8C, Q4/2011
MP-Servers: E5 4xxx, Sandy Bridge-EX, up to 8C, Q1/2012

8.1 Introduction (4)
Key features and benefits of the Sandy Bridge line vs the 1. generation Nehalem line [61]

8.2 Advanced Vector Extension (AVX) (1)
Figure: Evolution of the SIMD processing width [18] (from BMA)
Sandy Bridge: introduction of AVX

8.2 Advanced Vector Extension (AVX) (2)
Figure: Intel’s x86 ISA extensions - the SIMD register space (based on [18], BMA)
Register sets shown, in order of introduction:
8 MM registers (64-bit), aliased on the FP Stack registers
8 XMM registers (128-bit)
16 XMM registers (128-bit)
16 YMM registers (256-bit)
Processor labels in the figure: Northwood (Pentium 4), Ivy Bridge
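To make the register widths in the figure concrete, here is a minimal sketch that counts how many packed elements fit in one register of each set. The names REGISTER_BITS and lanes are illustrative and not part of the slides; only the bit widths come from the figure.

```python
# Register widths in bits, as listed in the SIMD register space figure.
REGISTER_BITS = {"MM": 64, "XMM": 128, "YMM": 256}

def lanes(register: str, element_bits: int) -> int:
    """Number of packed elements of the given size that fit in one register."""
    return REGISTER_BITS[register] // element_bits
```

For example, lanes("XMM", 32) gives 4 and lanes("YMM", 32) gives 8, which is the doubling of the processing width that AVX brings for 32-bit elements.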

8.3 On-die ring interconnect bus (1)
The on-die ring interconnect bus of Sandy Bridge [66]
Six bus agents. The four cores and the L3 slices share interfaces.

8.4 On-die integrated graphics unit (1)
Sandy Bridge’s integrated graphics unit [102] Part 4
12 EUs

8.4 On-die integrated graphics unit (2)
Specification data of the HD 2000 and HD 3000 graphics [125] Part

8.4 On-die integrated graphics unit (3)
Figure: Performance comparison: gaming, in frames per sec [126] part
Series labels: i5/i7 2xxx/3xxx: Sandy Bridge; i5 6xx: Arrandale; HD ALUs

8.5 Enhanced turbo boost technology (1)
Innovative concept of the 2.0 generation Turbo Boost technology [64]
The concept utilizes the real temperature response of processors to power changes in order to increase the extent of overclocking [64].
Figure labels: Cooler, Thermal capacitance

8.5 Enhanced turbo boost technology (2)
Concept: Use the thermal energy budget accumulated during idle periods to push the core beyond the TDP for short periods of time (e.g. for 20 sec). Multiple algorithms manage current, power and die temperature in parallel. [64]
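The budget idea can be sketched with a toy model: running below the TDP saves energy into a capped budget, and running above the TDP spends it. This is only an illustration of the concept, not Intel's algorithm (which, per the slide, manages current, power and die temperature with several algorithms). The function name simulate and all numeric defaults (95 W TDP, 120 W turbo ceiling, 500 J cap, 1 s steps) are assumptions chosen so that a full budget lasts 20 steps.

```python
def simulate(demand_w, tdp_w=95.0, turbo_w=120.0, budget_cap_j=500.0, dt_s=1.0):
    """Return the power granted in each time step of a toy turbo model.

    demand_w: requested package power per step, in watts.
    All numeric defaults are illustrative assumptions, not Intel figures.
    """
    budget_j = 0.0
    granted = []
    for want in demand_w:
        ceiling = min(want, turbo_w)
        if ceiling > tdp_w:
            # Spending: the excess over TDP is paid from the accumulated budget.
            excess = min(ceiling - tdp_w, budget_j / dt_s)
            power = tdp_w + excess
            budget_j -= excess * dt_s
        else:
            # Saving: running below TDP accumulates budget, up to the cap.
            power = ceiling
            budget_j = min(budget_cap_j, budget_j + (tdp_w - power) * dt_s)
        granted.append(power)
    return granted
```

With ten idle steps at 20 W followed by a sustained 120 W request, the model grants 120 W for 20 steps and then falls back to the 95 W TDP once the budget is used up.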

8.5 Enhanced turbo boost technology (3)
Intelligent power sharing between the cores and the integrated graphics [64]

8.5 Enhanced turbo boost technology (4)
Figure [61]. Series labels: WSM/M, WSM/D, NHM/M, NHM/D

8.5 Enhanced turbo boost technology (6)
Remark: Individual cores may run at different frequencies, but all cores share the same power plane. Idle cores may be shut down individually by power gates.

9. The Sandy Bridge-E line

9. The Sandy Bridge-E line (1)
The Sandy Bridge-E line of processors (2. gen. Core i7 processors)
Introduced in 11/2011 as a “precursor” of the upcoming DP/MP server lines.
Key features vs the original Sandy Bridge line (1)
a) 6 cores (2 cores of the original design are disabled) but no integrated graphics [76].

9. The Sandy Bridge-E line (2)
Figure: Die comparison [61] [76]
Sandy Bridge E: 32 nm, 435 mm², 2.27 B trs, 15 MB L3
Sandy Bridge (2x): 32 nm, 216 mm², 995 mtrs, 8 MB L3

9. The Sandy Bridge-E line (3)
Comparison of die parameters of recent DT processors [77]

CPU | Manufacturing Process | Cores | Transistor Count | Die Size
AMD Bulldozer 8C | 32nm | 8 | ~2B | 315mm²
AMD Thuban 6C | 45nm | 6 | 904M | 346mm²
AMD Deneb 4C | 45nm | 4 | 758M | 258mm²
Intel Gulftown 6C | 32nm | 6 | 1.17B | 240mm²
Intel Sandy Bridge E (6C) | 32nm | 6 | 2.27B | 435mm²
Intel Nehalem/Bloomfield 4C | 45nm | 4 | 731M | 263mm²
Intel Sandy Bridge 4C | 32nm | 4 | 995M | 216mm²
Intel Lynnfield 4C | 45nm | 4 | 774M | 296mm²
Intel Clarkdale 2C | 32nm | 2 | 384M | 81mm²
Intel Sandy Bridge 2C (GT1) | 32nm | 2 | 504M | 131mm²
Intel Sandy Bridge 2C (GT2) | 32nm | 2 | 624M | 149mm²
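One way to read the die parameters is as transistor density. The sketch below divides transistor count by die area for a few Intel rows of the table; the dictionary DIES and the function density_mtr_per_mm2 are illustrative names, and the numbers are copied from the table [77].

```python
# (transistor count, die area in mm^2) copied from the table [77].
DIES = {
    "Intel Sandy Bridge E (6C)": (2.27e9, 435),
    "Intel Sandy Bridge 4C": (995e6, 216),
    "Intel Gulftown 6C": (1.17e9, 240),
    "Intel Lynnfield 4C": (774e6, 296),
}

def density_mtr_per_mm2(name: str) -> float:
    """Average density in million transistors per mm^2 for one die."""
    transistors, area_mm2 = DIES[name]
    return transistors / area_mm2 / 1e6
```

By this measure the 32 nm dies come out at roughly 4.6 to 5.2 million transistors per mm², against about 2.6 for the 45 nm Lynnfield. The figure is an average over the whole die, so cache-heavy designs tend to look denser than logic-heavy ones.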

9. The Sandy Bridge-E line (4)
Table: Cache/memory latencies of recent DT processors [77]
Columns: L1, L2, L3, Main Memory
Rows:
AMD FX-8150 (3.6GHz) (Bulldozer)
AMD Phenom II X4 975 BE (3.6GHz)
AMD Phenom II X6 1100T (3.3GHz)
Intel Core i5 2500K (3.3GHz) (Sandy Bridge)
Intel Core i7 3960X (3.3GHz) (Sandy Bridge-E)

9. The Sandy Bridge-E line (5)
b) 4 parallel memory channels (inherited from the server side) instead of the 2 of the previous lines. Support of DDR3 of up to 1600 MT/s. A single DDR DIMM per channel or 2 DDR DIMMs per channel [78].
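The effect of the two extra channels on peak memory bandwidth can be worked out directly. The sketch below assumes the standard 64-bit (8-byte) DDR3 channel width, which the slide does not state; the function name peak_bandwidth_gb_s is illustrative.

```python
def peak_bandwidth_gb_s(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x transfers/s x bytes/transfer.

    bus_bytes=8 assumes a 64-bit DDR3 channel (an assumption, not from the slide).
    """
    return channels * mt_per_s * 1e6 * bus_bytes / 1e9
```

With DDR3 at 1600 MT/s, four channels give 51.2 GB/s, twice the 25.6 GB/s of a two-channel configuration at the same transfer rate. These are theoretical peaks; sustained bandwidth is lower.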

9. The Sandy Bridge-E line (6)
c) 40 PCIe 2. gen. lanes to connect graphics cards directly to the processor instead of the 16 to 32 lanes of the previous generation Sandy Bridge [78].

Figure: Main options of providing PCIe lanes on the processor for graphics cards in DT systems
PCIe lanes provided on the processor: 1x x16 or 2x x8 lanes; or 40 configurable lanes (e.g. 2x x16 + 1x x8 or 4x x8)
Type of available PCIe lanes: PCIe 1.0 lanes, PCIe 2.0 lanes, PCIe 3.0 lanes
Platforms shown:
Intel 2. gen. Nehalem (Lynnfield) (4C), 2 MCh with P55 (2009)
Intel Sandy Bridge (4C), 2 MCh with P67 (2011)
Intel Ivy Bridge (4C), 2 MCh with Z77 PCH (2012)
Intel Sandy Bridge EE (6C), 4 MCh with X79 (2011)
Block labels: Mem., P, Periph. Contr. (P55/P67, Z77, X79), PCIe 2.0 x16 / 2x x8, PCIe 3.0 x16 / 2x x8, PCIe configurable lanes
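As a rough aid for comparing the options above, the sketch below checks whether a slot configuration fits into a lane budget and estimates the per-lane bandwidth of each PCIe generation. The signalling rates and line codes (2.5 and 5 GT/s with 8b/10b, 8 GT/s with 128b/130b) are the standard PCIe values rather than figures from the slides; the names PCIE, lane_mb_s and fits are illustrative.

```python
# Per-lane signalling rate (GT/s) and line-code efficiency per PCIe generation.
PCIE = {
    "1.0": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}

def lane_mb_s(gen: str) -> float:
    """Usable bandwidth of one lane in MB/s, per direction."""
    rate_gt_s, efficiency = PCIE[gen]
    return rate_gt_s * 1e9 * efficiency / 8 / 1e6

def fits(slot_widths, total_lanes: int = 40) -> bool:
    """True if the requested slot widths fit into the available lanes."""
    return sum(slot_widths) <= total_lanes
```

Both example configurations from the figure fit the 40-lane budget: fits([16, 16, 8]) uses all 40 lanes and fits([8, 8, 8, 8]) uses 32, whereas three x16 slots would need 48 lanes.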

Figure: Lane configuration options - Sandy Bridge Extreme Edition []
Intel Sandy Bridge EE (6C), 4 MCh with X79 (2011)
Block labels: Mem., P, Periph. Contr. (X79), PCIe 3.0 x16, 40 configurable lanes

Figure: Evolution of the topology and type of available PCIe lanes for graphics cards
Topology of PCIe lanes provided for graphics cards (trend): PCIe lanes on both the NB and the SB; PCIe lanes on the NB; PCIe lanes on the processor; PCIe lanes on the PCH
Type of available PCIe lanes: PCIe 1.0 lanes, PCIe 2.0 lanes, PCIe 3.0 lanes
Processors shown: 2. G. Nehalem (Lynnfield) (2009); Sandy Bridge (2011); Sandy Bridge EE (6C), 4 MCh with X79 (2011); Ivy Bridge (2012)

9. The Sandy Bridge-E line (7)
d) LGA-2011 socket instead of the LGA-1155 used in the previous generation Sandy Bridge, due to the increased number of memory channels connected to the processor.
Intel’s LGA sockets (Land Grid Array) [87]:
LGA 775: Pentium 4 Prescott until Nehalem
LGA ...: ... gen. Nehalem (Bloomfield)
LGA ...: ... gen. Nehalem (Lynnfield)
LGA 1155: Sandy Bridge/Ivy Bridge
LGA 2011: Sandy Bridge EE

9. The Sandy Bridge-E line (8)
Main features of the Sandy Bridge-E line vs the Sandy Bridge line [77]

Processor | Core Clock | Cores / Threads | L3 Cache | Max Turbo | Max Overclock Multiplier | TDP | Price
Intel Core i7 3960X | 3.3GHz | 6 / 12 | 15MB | 3.9GHz | 57x | 130W | $990
Intel Core i7 3930K | 3.2GHz | 6 / 12 | 12MB | 3.8GHz | 57x | 130W | $555
Intel Core i... | ...GHz | 4 / 8 | 10MB | 3.9GHz | 43x | 130W | TBD
Intel Core i7 2700K | 3.5GHz | 4 / 8 | 8MB | 3.9GHz | 57x | 95W | $332
Intel Core i7 2600K | 3.4GHz | 4 / 8 | 8MB | 3.8GHz | 57x | 95W | $317
Intel Core i... | ...GHz | 4 / 8 | 8MB | 3.8GHz | 42x | 95W | $294
Intel Core i5 2500K | 3.3GHz | 4 / 4 | 6MB | 3.7GHz | 57x | 95W | $216
Intel Core i... | ...GHz | 4 / 4 | 6MB | 3.7GHz | 41x | 95W | $205

10. The Ivy Bridge line

10.1 Introduction (1)
Introduced: 4/2012
Ivy Bridge is also termed the 3. gen. Intel Core processor line.
Figure 10.1: Intel’s Tick-Tock development model [Based on 1]
TOCK: Merom, new microarchitecture, 65 nm
TICK: Penryn, new process, 45 nm
TOCK: Nehalem, new microarchitecture, 45 nm
TICK: Westmere, new process, 32 nm
TOCK: Sandy Bridge, new microarchitecture, 32 nm
TICK: Ivy Bridge, new process, 22 nm
TOCK: Haswell, new microarchitecture, 22 nm

10.1 Introduction (2)
Figure 10.2: Contrasting the Sandy Bridge and Ivy Bridge dies [81]
Sandy Bridge: 32 nm, 216 mm², 995 mtrs
Ivy Bridge: 22 nm, 160 mm², 1480 mtrs (resized to 32 nm feature size)
Figure label: 8 MB

10.1 Introduction (3) [84]

10.1 Introduction (4) Major innovations of Ivy Bridge [80]

10.2 The new 22 nm tri-gate process technology (1) [82]

10.2 The new 22 nm tri-gate process technology (2) [82]

10.2 The new 22 nm tri-gate process technology (3) [82]

10.2 The new 22 nm tri-gate process technology (4) [82]

10.2 The new 22 nm tri-gate process technology (5) [82]

10.2 The new 22 nm tri-gate process technology (6) [82]

10.2 The new 22 nm tri-gate process technology (7) [82]

10.2 The new 22 nm tri-gate process technology (8) [82]

10.2 The new 22 nm tri-gate process technology (9) Figure: Ivy Bridge chips on a 300 mm wafer

10.2 The new 22 nm tri-gate process technology (10)
Table: Main implementation parameters of recent processors [81]

Processor | Feature size | No. of cores | L2 + L3 size | No. of transistors | Die size
Ivy Bridge | 22 nm Tri-Gate | 4 (+ IGP) | 9 MB | 1.48 billion | 160 mm²
Sandy Bridge | 32 nm HKMG | 4 (+ IGP) | 9 MB | 995 million | 216 mm²
Sandy Bridge-E | 32 nm HKMG | 6 | 16.5 MB | 2.27 billion | 435 mm²
Gulftown | 32 nm HKMG | 6 | 13.5 MB | 1.17 billion | 240 mm²
Lynnfield | 45 nm HKMG | 4 | 9 MB | 774 million | 296 mm²
Bloomfield | 45 nm HKMG | 4 | 9 MB | 731 million | 263 mm²
Orochi (Bulldozer) | 32 nm HKMG SOI | 8 (4 modules) | 16 MB | ~1.2 billion | 315 mm²
Llano | 32 nm HKMG SOI | 4 (+ IGP) | 4 MB | 1.45 billion | 228 mm²
Thuban | 45 nm SOI | 6 | 9 MB | 904 million | 346 mm²
Deneb | 45 nm SOI | 4 | 8 MB | 758 million | 258 mm²
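The Sandy Bridge and Ivy Bridge rows allow a quick check of how the 32 nm to 22 nm shrink compares with ideal scaling. The sketch assumes that area scales with the square of the feature size, which is a textbook idealisation and not a claim from the slides; the names ideal_area_scale, density, SANDY_BRIDGE and IVY_BRIDGE are illustrative, and the numbers are taken from the table [81].

```python
def ideal_area_scale(old_nm: float, new_nm: float) -> float:
    """Area ratio under ideal scaling (area proportional to feature size squared)."""
    return (new_nm / old_nm) ** 2

# Values copied from the table [81].
SANDY_BRIDGE = {"nm": 32, "transistors": 995e6, "area_mm2": 216}
IVY_BRIDGE = {"nm": 22, "transistors": 1.48e9, "area_mm2": 160}

def density(chip: dict) -> float:
    """Transistors per mm^2, averaged over the whole die."""
    return chip["transistors"] / chip["area_mm2"]
```

Under ideal scaling the area factor is about 0.47, which corresponds to a density gain of about 2.12x. The table values give an actual gain of about 2.0x, so the shrink comes close to the ideal while the die also carries roughly 49% more transistors.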

10.3 Supervisory Mode Execution Protection (SMEP) [83]

10.4 System architecture (1) [81]

10.4 System architecture (2)/1 [81]

10.4 System architecture (2)/2
Figure: Overview of video interfaces of computing devices to external displays
Interfaces shown: MDA, CGA, EGA, VGA, DVI, HDMI, DP
Classification labels in the figure: analog vs digital video interfaces to external displays; no audio transmission vs audio/video transmission (analog audio / digital video i.f., dig. audio / dig. video i.f.s); earliest, legacy and recently preferred video interfaces; to TVs vs to displays

10.5 Performance (1) [81]
Figure labels: Sandy Bridge, Bulldozer, Ivy Bridge, Sandy Bridge EE

10.5 Performance (2) [81]

11. The Haswell line

11. The Haswell line of processors (1)
Expected date of introduction: 4/2013
Figure 1.1: Intel’s Tick-Tock development model [Based on 1]
TOCK: Merom, new microarchitecture, 65 nm
TICK: Penryn, new process, 45 nm
TOCK: Nehalem, new microarchitecture, 45 nm
TICK: Westmere, new process, 32 nm
TOCK: Sandy Bridge, new microarchitecture, 32 nm
TICK: Ivy Bridge, new process, 22 nm
TOCK: Haswell, new microarchitecture, 22 nm

11. The Haswell line of processors (2) The Haswell die [85]

11. The Haswell line of processors (3) Haswell’s system architecture [86]

11. The Haswell line of processors (4) [80]

11. The Haswell line of processors (5) [80]

11. The Haswell line of processors (6)/1 [80]
FMA: Fused Multiply-Add (a × b + c)
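What distinguishes a fused multiply-add from a separate multiply and add is that a × b + c is rounded only once. The sketch below emulates that with exact rational arithmetic; it illustrates the numerical effect only and does not use the hardware instruction. The function names are illustrative.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate FMA: evaluate a*b+c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def multiply_then_add(a: float, b: float, c: float) -> float:
    """Separate operations: the product is rounded before the addition."""
    return a * b + c

# Inputs chosen so that the exact product 1 - 2**-60 rounds to 1.0 as a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
```

With these inputs multiply_then_add(a, b, c) returns 0.0, because the product has already been rounded to 1.0, while fused_multiply_add(a, b, c) returns -2**-60, the exact result.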

11. The Haswell line of processors (6)/2
Figure: Evolution of the SIMD processing width [18] (from BMA)
Labels: Sandy Bridge: introduction of AVX; Haswell

11. The Haswell line of processors (7) [80]

To 12 – Additional references

[80]: Chappell R., Toll B., Singhal R.: Intel Next Generation Microarchitecture Codename Haswell: New Processor Innovations, IDF 2012
[81]: Olivera: A régóta várt Intel Ivy Bridge tesztje, Prohardware
[82]: Bohr M., Mistry K.: Intel’s Revolutionary 22 nm transistor technology, May 2011
[83]: George V., Piazza T., Jiang H.: Technology Insight: Intel Next Generation Microarchitecture Codename Ivy Bridge, IDF 2011
[84]: 3rd Generation Intel Core Processor Family Quad Core Launch Product Information, April 23, _Intel_Core_Product_Information.pdf
[85]: Ivy Bridge and Haswell die configurations (estimates included), Anandtech
[86]: Piazza T., Jiang H., Hammerlund P., Singhal R.: Technology Insight: Intel Next Generation Microarchitecture Codename Haswell, IDF 2012 SPCS001
[87]: Haynes D.: 2012 Socket Guide, Aug