CMPE 511 Computer Architecture Caner AKSOY CmpE Boğaziçi University December 2006 Intel ® Core 2 Duo Desktop Processor Architecture.


What's next?
• History
• Intel Core 2 Duo
• Intel Core 2 Microarchitecture
• Intel Core 2 Models
• Architectural Features of Core 2
• What is an instruction set?
• SSSE3 (x86)
• Execute Disable Bit
• Intel ® Wide Dynamic Execution
• 14 Stage pipeline
• MacroFusion
• Micro-op Fusion
• What is L1 and L2?
• Intel ® Advanced Smart Cache
• Intel ® Smart Memory Access
• Intel ® Advanced Digital Media Boost

History (List of Intel microprocessors)
• The 4-bit processors: 4004, 4040
• The 8-bit processors: 8008, 8080, 8085
• The 16-bit processors (origin of x86): 8086, 8088, 80186, 80188, 80286
• The 32-bit processors (non-x86): iAPX 432, 80960, 80860, XScale
• The 32-bit processors (the 80386 range): 80386DX, 80386SX, 80376, 80386SL, 80386EX
• The 32-bit processors (the 80486 range): 80486DX, 80486SX, 80486DX2, 80486SL, 80486DX4
• The 32-bit processors (the Pentium "I"): Pentium, Pentium MMX
• The 32-bit processors (P6/Pentium M): Pentium Pro, Pentium II, Celeron, Pentium III, Pentium II and III Xeon, Celeron (PIII), Pentium M, Celeron M, Intel Core, Dual Core Xeon LV
• The 32-bit processors (NetBurst microarchitecture): Pentium 4, Xeon, Pentium 4 EE
• The 64-bit processors (IA-64): Itanium, Itanium 2
• The 64-bit processors (EM64T, NetBurst): Pentium D, Pentium Extreme Edition, Xeon
• The 64-bit processors (EM64T, Core microarchitecture): Xeon, Intel Core 2

Intel Core 2 Duo

Intel Core 2 Microarchitecture
• 65 nm variants: Merom (mobile optimized), Conroe (desktop optimized), Woodcrest (server optimized)
• Shared features:
  - Intel ® Wide Dynamic Execution
  - Intel ® Intelligent Power Capability
  - Intel ® Advanced Smart Cache
  - Intel ® Smart Memory Access
  - Intel ® Advanced Digital Media Boost

Intel Core 2 models
• Allendale, Conroe: 65 nm process technology
  - Desktop CPU
  - Introduced on July 27, 2006
  - 291 million transistors on 4 MB models
  - 167 million transistors on 2 MB models
• Variants
  - Core 2 Duo E…, … GHz (4 MB L2, 1066 MHz FSB)
  - Core 2 Duo E…, … GHz (4 MB L2, 1066 MHz FSB)
  - Core 2 Duo E…, … GHz (2 MB L2, 1066 MHz FSB)
  - Core 2 Duo E…, … GHz (2 MB L2, 1066 MHz FSB)
  - Core 2 Duo E…, … GHz (2 MB L2, 800 MHz FSB)

Intel Core 2 models
• Woodcrest: 65 nm process technology
  - Server optimized CPU
  - Introduced on June 26, 2006
  - Same features as Conroe
• Variants
  - Xeon …, … GHz (4 MB L2, 1333 MHz FSB, 80 W)
  - Xeon …, … GHz (4 MB L2, 1333 MHz FSB, 65 W)
  - Xeon …, … GHz (4 MB L2, 1333 MHz FSB, 65 W)
  - Xeon …, … GHz (4 MB L2, 1333 MHz FSB, 65 W)
  - Xeon …, … GHz (4 MB L2, 1066 MHz FSB, 65 W)
  - Xeon …, … GHz (4 MB L2, 1066 MHz FSB, 65 W)
  - Xeon 5148LV, … GHz (4 MB L2, 1333 MHz FSB, 40 W)

Intel Core 2 models
• Merom: 65 nm process technology
  - Mobile CPU
  - Introduced on July 27, 2006
  - Same features as Conroe
• Variants
  - Core 2 Duo T…, … GHz (4 MB L2, 667 MHz FSB)
  - Core 2 Duo T…, … GHz (4 MB L2, 667 MHz FSB)
  - Core 2 Duo T…, … GHz (4 MB L2, 667 MHz FSB)
  - Core 2 Duo T…, … GHz (2 MB L2, 667 MHz FSB)
  - Core 2 Duo T…, … GHz (2 MB L2, 667 MHz FSB)
  - Core 2 Duo T…, … GHz (2 MB L2, 533 MHz FSB)

Architectural Features of Core 2
• SSSE3 SIMD instructions
• Intel Virtualization Technology, multiple OS support
• LaGrande Technology, enhanced security hardware extensions
• Execute Disable Bit
• EIST (Enhanced Intel SpeedStep Technology)
• Intel Wide Dynamic Execution
• Intel Intelligent Power Capability
• Intel Advanced Smart Cache
• Intel Smart Memory Access
• Intel Advanced Digital Media Boost

What is an instruction set?
• All instructions, and all their variations, that a processor can execute
• Types:
  - Arithmetic, such as add and subtract
  - Logic instructions, such as and, or, and not
  - Data instructions, such as move, input, output, load, and store
• Part of the computer architecture
• Distinguished from the microarchitecture
• Different microarchitectures can share a common instruction set while their internal designs differ
• Figure: Fetch, Decode, Operand Fetch, Execute, Retire

SSSE3 (x86): Supplemental Streaming SIMD Extension 3
• Intel's name for the SSE instruction set's fourth iteration
• Single Instruction Multiple Data instruction set
• A revision of SSE3
• CPUs with SSSE3:
  - Xeon 5100 series
  - Intel Core 2
• Development:
  - Faster permutation of bytes
  - Multiplying 16-bit fixed-point numbers with correct rounding
  - Better word accumulation

SSSE3 (x86): Supplemental Streaming SIMD Extension 3
• 16 new instructions:
  - PSIGNB, PSIGNW, PSIGND: Packed Sign
  - PABSB, PABSW, PABSD: Packed Absolute Value
  - PALIGNR: Packed Align Right
  - PSHUFB: Packed Shuffle Bytes
  - PMULHRSW: Packed Multiply High with Round and Scale
  - PMADDUBSW: Multiply and Add Packed Signed and Unsigned Bytes
  - PHSUBW, PHSUBD: Packed Horizontal Subtract (Words or Doublewords)
  - PHSUBSW: Packed Horizontal Subtract and Saturate Words
  - PHADDW, PHADDD: Packed Horizontal Add (Words or Doublewords)
  - PHADDSW: Packed Horizontal Add and Saturate Words
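To make a few of these instructions concrete, the sketch below models three of them in plain Python. It is an illustration only, not how the hardware is implemented: a 128-bit register is represented as a Python list of lane values, and the function names simply reuse the mnemonics.

```python
# Illustrative model of three SSSE3 instructions. A 128-bit register is
# modelled as a list of lane values (a modelling assumption, not an API).

def pshufb(src, mask):
    """PSHUFB: each mask byte picks a source byte; a set high bit yields 0."""
    assert len(src) == 16 and len(mask) == 16
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]

def pabsb(src):
    """PABSB: absolute value of signed bytes, stored as unsigned bytes."""
    return [abs(b) & 0xFF for b in src]

def pmulhrsw(a, b):
    """PMULHRSW: signed 16-bit multiply, keeping the high bits with rounding."""
    def wrap16(x):
        # Reinterpret the low 16 bits as a signed 16-bit value.
        x &= 0xFFFF
        return x - 0x10000 if x & 0x8000 else x
    return [wrap16((((x * y) >> 14) + 1) >> 1) for x, y in zip(a, b)]

# Reverse the 16 bytes of a register with a single shuffle.
reversed_bytes = pshufb(list(range(16)), list(range(15, -1, -1)))

# Q15 fixed point: 0.5 * 0.5 = 0.25, i.e. 16384 * 16384 gives 8192.
quarter = pmulhrsw([16384], [16384])
```

The rounding step in `pmulhrsw` (add 1 before the final shift) is what the slide means by multiplying 16-bit fixed-point numbers with correct rounding.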

Execute Disable Bit
• Problem: buffer overflow attacks by malicious software
• Must be combined with a supporting operating system
• Marks areas in memory as executable or non-executable
• Blocks code execution from a non-executable area when an attack tries to run injected code
• Decreases the need for software patches and antivirus software
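As a rough illustration of the idea, the sketch below treats a page-table entry as a 64-bit integer and checks the Execute Disable flag before allowing an instruction fetch. In PAE and 64-bit paging the flag is bit 63 of the entry; the helper names and the sample entry value are hypothetical, and a real system also needs the operating system to enable the feature.

```python
# Minimal sketch: a page-table entry (PTE) modelled as a 64-bit integer.
PRESENT_BIT = 1 << 0   # page is mapped
XD_BIT = 1 << 63       # Execute Disable flag (PAE / 64-bit paging)

def mark_as_data(pte):
    """Hypothetical helper: the OS marks a stack or heap page non-executable."""
    return pte | XD_BIT

def can_execute(pte):
    """An instruction fetch is allowed only from present, executable pages."""
    return bool(pte & PRESENT_BIT) and not (pte & XD_BIT)

code_page = 0x0000000000401000 | PRESENT_BIT    # made-up entry value
stack_page = mark_as_data(0x00000000007FF000 | PRESENT_BIT)

fetch_from_code_ok = can_execute(code_page)
fetch_from_stack_ok = can_execute(stack_page)   # injected shellcode would fault here
```

A buffer overflow can still overwrite data on the stack page, but jumping into that data raises a fault instead of running it.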

Intel ® Wide Dynamic Execution
• Advantage:
  - Wider execution
  - Comprehensive advancements
  - Enabled in each core
• Each core fetches, dispatches, executes and retires up to four full instructions simultaneously
• Performance increases while energy consumption decreases
• Figure: each core's execution units (Branch, Add, Mul, Load, Store) in front of the L2 cache

14 Stage pipeline
• Pentium D has a 31-stage pipeline
• AMD Athlon 64 has a 12-stage pipeline
• A question for the class: why did Intel not lengthen the pipeline further after its experience with the 31-stage Pentium D?
• Figure: instructions I1, I2, I3, …, I99, I100 in the pipeline; a jump leaves a bubble of non-work behind it
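One way to approach the question is a back-of-the-envelope estimate of how the bubble grows with pipeline depth. The sketch below uses a textbook approximation, not measured data: it assumes a mispredicted jump costs about one full pipeline refill, and the branch fraction (20%) and misprediction rate (5%) are made-up illustrative values.

```python
def cycles_per_instruction(depth, branch_fraction=0.20, mispredict_rate=0.05):
    """Rough CPI estimate for an ideal one-instruction-per-cycle pipeline.

    Assumption: every mispredicted branch wastes about `depth` cycles,
    because the pipeline behind the jump must be emptied and refilled.
    """
    penalty = depth
    return 1.0 + branch_fraction * mispredict_rate * penalty

for name, depth in [("Pentium D", 31), ("Core 2", 14), ("Athlon 64", 12)]:
    print(f"{name}: {depth} stages, about {cycles_per_instruction(depth):.2f} CPI")
```

Under these assumptions the 31-stage design loses roughly twice as many cycles per mispredicted jump as the 14-stage one, so it needs a correspondingly higher clock rate just to break even.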

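MacroFusion
• Source code: if (myVariable == myConstant) doThis(); else doThat();
• This compiles to a compare instruction followed by jump instructions
• Compare + Jump = one micro-op: the pair is decoded as a single micro-op instead of two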
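The pairing rule can be sketched as a small pass over a list of decoded instructions. This is a toy model under simplifying assumptions: instructions are plain strings, any `cmp` or `test` directly followed by a conditional jump is fused, and the extra conditions real hardware imposes are ignored. The labels and function names are hypothetical.

```python
def is_conditional_jump(mnemonic):
    """Treat every j* mnemonic except the unconditional jmp as conditional."""
    return mnemonic.startswith("j") and mnemonic != "jmp"

def macro_fuse(instructions):
    """Fuse each compare that is immediately followed by a conditional jump."""
    fused = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (current.split()[0] in ("cmp", "test")
                and following is not None
                and is_conditional_jump(following.split()[0])):
            fused.append(current + " + " + following)  # one micro-op
            i += 2
        else:
            fused.append(current)
            i += 1
    return fused

# The if/else from the slide, written as hypothetical x86-style assembly.
program = [
    "cmp eax, 5",        # myVariable == myConstant ?
    "jne else_branch",
    "call doThis",
    "jmp done",
    "call doThat",
]
micro_ops = macro_fuse(program)
```

Five instructions enter the decoder and four micro-ops leave it, which frees a decode slot and an entry in the out-of-order machinery.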

Micro-op Fusion
• Example: a read-modify-write instruction such as ADD [mem], EAX decodes into three steps:
  - Load the contents of [mem] into a register (MOV EBX, [mem])
  - An ALU operation adds the two registers together (ADD EBX, EAX)
  - Store the result back to memory (MOV [mem], EBX)
• Micro-ops derived from the same macro-op are fused to reduce the number of micro-ops that need to be executed
• Benefits:
  - Fewer micro-ops to execute
  - Lower power consumption
  - Better scheduling
  - Fewer micro-ops handled by the out-of-order logic
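The effect on the out-of-order logic can be sketched by counting tracked entries with and without fusion. This is a deliberately simplified model: the decode table and the rule "pair up adjacent micro-ops from the same macro-op" are assumptions made for illustration, and the slides do not say which pairs actually fuse on real hardware.

```python
# Hypothetical decode table: macro-op -> micro-ops, following the slide's example.
DECODE_TABLE = {
    "add [mem], eax": ["load", "add", "store"],
    "add ebx, eax": ["add"],
    "mov ebx, [mem]": ["load"],
}

def decode(macro_ops):
    """Expand each macro-op into (macro_index, micro_op) pairs."""
    return [(i, uop) for i, m in enumerate(macro_ops) for uop in DECODE_TABLE[m]]

def fuse(micro_ops):
    """Toy rule: pair adjacent micro-ops that come from the same macro-op."""
    fused = []
    for index, uop in micro_ops:
        if fused and fused[-1][0] == index and len(fused[-1][1]) < 2:
            fused[-1][1].append(uop)
        else:
            fused.append((index, [uop]))
    return fused

program = ["add [mem], eax", "add ebx, eax"]
unfused = decode(program)   # 4 entries for the out-of-order logic to track
fused = fuse(unfused)       # 3 entries after fusion
```

The fused pair still performs both pieces of work when it executes; what shrinks is the number of entries that have to be tracked and scheduled.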

What is L1 and L2?
• Level-1 and Level-2 caches
• The cache memories in a computer
• Much faster than RAM
• L1 is built on the microprocessor chip itself
• L2 was historically a separate chip; on Core 2 it is on the processor die and shared by both cores
• L2 cache is much larger than L1 cache

Intel ® Advanced Smart Cache
• L2 cache is shared by both cores
• Data is stored in one place
• Optimizes cache resources
• One core can use up to 100% of the L2 cache
• Advantage:
  - Higher cache hit rate
  - Reduced bus traffic
  - Lower latency to data
• Figure labels: Decreased traffic, Increased traffic
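The benefit of sharing can be shown with a small LRU cache simulation. The numbers are illustrative assumptions, not Core 2 parameters: the total capacity is 8 lines, one core loops over a 6-line working set, and the other core is idle. With two private 4-line caches the busy core thrashes; with one shared 8-line cache its whole working set fits.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny fully associative cache with least-recently-used replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)      # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used

def hit_rate(trace, cache_of_core):
    """Replay (core, address) accesses; each core uses its assigned cache."""
    for core, addr in trace:
        cache_of_core[core].access(addr)
    caches = {id(c): c for c in cache_of_core.values()}.values()
    hits = sum(c.hits for c in caches)
    total = sum(c.hits + c.misses for c in caches)
    return hits / total

# Core 0 loops ten times over 6 cache lines; core 1 is idle.
trace = [(0, addr) for _ in range(10) for addr in range(6)]

split_rate = hit_rate(trace, {0: LRUCache(4), 1: LRUCache(4)})
shared = LRUCache(8)
shared_rate = hit_rate(trace, {0: shared, 1: shared})
```

In the split case the idle core's half of the capacity goes unused, which is the situation the "up to 100% utilization" bullet addresses.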

Intel ® Smart Memory Access
• Why? Waiting for earlier stores means lost opportunities for out-of-order execution
• What is the idea?
  - Ignore the store-load dependencies and let loads execute ahead of earlier stores
  - If there is a dependency, flush the load instruction and re-execute it
• How is it checked?
  - Verify by checking all dispatched store addresses in the memory order buffer
  - There is a watchdog
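The check-and-flush idea can be sketched in two phases. This is a toy model under strong assumptions: every load runs speculatively before any store has written, verification then walks the program in order, and the set of store addresses stands in for the memory order buffer. The function name and program encoding are hypothetical.

```python
def run_with_speculative_loads(program, initial_memory):
    """program: list of ("store", addr, value) or ("load", addr), in program order.

    Returns (load_results, flushes), where load_results maps the index of
    each load to the value it finally observes.
    """
    memory = dict(initial_memory)

    # Phase 1: loads execute early, ignoring possible store dependencies.
    results = {i: memory.get(op[1], 0)
               for i, op in enumerate(program) if op[0] == "load"}

    # Phase 2: verify each load against the addresses of all earlier stores.
    store_addresses = set()   # stands in for the memory order buffer
    flushes = 0
    for i, op in enumerate(program):
        if op[0] == "store":
            _, addr, value = op
            memory[addr] = value
            store_addresses.add(addr)
        elif op[1] in store_addresses:
            flushes += 1                 # dependency found: flush the load
            results[i] = memory[op[1]]   # and re-execute it
    return results, flushes

program = [
    ("store", 0x10, 7),
    ("load", 0x20),    # independent of the store: speculation pays off
    ("load", 0x10),    # depends on the store: must be flushed and redone
]
results, flushes = run_with_speculative_loads(program, {0x10: 1, 0x20: 2})
```

The independent load keeps its early result, while the dependent one is flushed and ends up with the stored value, so the final results match in-order execution.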

Intel ® Advanced Digital Media Boost
• Previously: a 128-bit instruction processes the lower 64 bits in one cycle and the upper 64 bits in the next
• Core 2: a 128-bit instruction completes in one cycle
• Improves performance when executing SSE instructions:
  - 128-bit SIMD integer arithmetic
  - 128-bit SIMD double precision floating point
• Accelerates a broad range of applications:
  - Video, speech, image processing
  - Encryption
  - Financial
  - Engineering and scientific
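The difference between the two datapaths can be sketched with a packed add of four 32-bit lanes. This is a counting model only: the cycle numbers come from the slides (two cycles when the register is split into halves, one cycle on a full-width datapath), and the function names are hypothetical.

```python
def packed_add_32(a, b):
    """Lane-wise 32-bit addition with wrap-around, like a packed integer add."""
    return [(x + y) & 0xFFFFFFFF for x, y in zip(a, b)]

def execute_on_64bit_datapath(a, b):
    """Split the 128-bit operands into two 64-bit halves, one per cycle."""
    low = packed_add_32(a[:2], b[:2])     # cycle 1: lower 64 bits
    high = packed_add_32(a[2:], b[2:])    # cycle 2: upper 64 bits
    return low + high, 2

def execute_on_128bit_datapath(a, b):
    """Process all four lanes at once in a single cycle."""
    return packed_add_32(a, b), 1

a = [1, 2, 3, 0xFFFFFFFF]
b = [10, 20, 30, 1]
narrow_result, narrow_cycles = execute_on_64bit_datapath(a, b)
wide_result, wide_cycles = execute_on_128bit_datapath(a, b)
```

Both paths produce the same values; the wide datapath simply needs half as many cycles per 128-bit instruction.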

Any questions?