Microprocessors
AMD Hammer: AMD’s High Stakes RISC Entry
May 2nd, 2002



AMD Hammer
- For all the usual reasons, AMD feels that it must address 64-bit computing.
- AMD has decided NOT to follow Intel
  - Instead it will generate its own 64-bit version of the ia32 architecture.
- The general name is Hammer
  - Sledge-hammer, the first chip, soon!
- A reference for full information: DownloadableAssets/MPF_Hammer_Presentation.PDF

Public Specification
- All aspects of this chip developed in public
  - Announced at Linux World
  - Uses GNU/Linux as native 64-bit OS
  - Public specification at
- X86-64 is the official designation
  - Hammer is like Pentium (a chip family with different models)
  - X86-64 is like ia32 or ia64 (an architecture)

Hammer Basics
- In the same way that the 386 extended the 286 architecture from 16 to 32 bits, Hammer extends ia32 from 32 to 64 bits.
- This is NOT a new architecture
- Hammer is 100% upwards compatible with the ia32, and can run any ia32 program unchanged.
- And the ia32 program will run fast, getting many of the benefits of the Hammer.

The Move to 64-bit
- Enhancements
  - Add 8 new integer registers
  - Add PC relative addressing
  - Add full support for SSE/SSE2 floating-point
    - Including 16 registers
- Additional registers added with prefixes
  - Prefixes specify addressing modes
  - Prefixes specify additional registers

64-bit Addressing
- 48-bit virtual addresses
  - As opposed to 32-bit on ia32
  - Allows 256 terabytes of virtual memory
  - (but not a full 64 bits, though this could be added relatively easily later, since addresses are always handled in 64-bit registers)
- 40-bit physical addresses
  - As opposed to 32-bit on ia32
  - Allows for one terabyte (1024 gigabytes) of physical memory
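
The sizes quoted on this slide follow directly from the address widths: an n-bit address can name 2^n distinct bytes. The short Python sketch below redoes that arithmetic. The helper name addressable_bytes and the constants are illustrative only and do not come from the slides.

```python
# Address widths quoted on the slide.
VIRTUAL_ADDRESS_BITS = 48
PHYSICAL_ADDRESS_BITS = 40
IA32_ADDRESS_BITS = 32

GIB = 2**30  # one gigabyte in the binary sense (1024^3 bytes)
TIB = 2**40  # one terabyte in the binary sense (1024^4 bytes)


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2**address_bits


virtual_tib = addressable_bytes(VIRTUAL_ADDRESS_BITS) // TIB
physical_gib = addressable_bytes(PHYSICAL_ADDRESS_BITS) // GIB
ia32_gib = addressable_bytes(IA32_ADDRESS_BITS) // GIB

print(virtual_tib, "terabytes of virtual address space")
print(physical_gib, "gigabytes of physical address space")
print(ia32_gib, "gigabytes with 32-bit addresses")
```

The 40-bit physical figure comes out as 1024 gigabytes, which is the "one terabyte" on the slide when the binary units are used throughout.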

Register Structure
- 16 SSE floating-point registers (128 bits each)
- 16 integer registers
  - E.g. RAX
    - Low 32 bits is EAX
    - Low 16 bits is AX (and also AH, AL)
  - Extra registers are R8-R15
- 8 x87 registers for compatibility (80 bits)
- One 64-bit program counter
  - Low order 32 bits is EIP
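
The way the older register names overlay the new 64-bit register can be modelled with plain bit masks. The Python sketch below treats a register value as an integer and extracts the EAX, AX, AH and AL views of it. The function name sub_registers and the sample value are made up for illustration; this models only how the names map onto bit ranges, not how the hardware behaves on writes.

```python
def sub_registers(rax: int) -> dict:
    """Return the legacy views of a 64-bit RAX value as a dict of integers."""
    rax &= 0xFFFF_FFFF_FFFF_FFFF  # keep only 64 bits
    return {
        "RAX": rax,                 # full 64-bit register
        "EAX": rax & 0xFFFF_FFFF,   # low 32 bits
        "AX": rax & 0xFFFF,         # low 16 bits
        "AH": (rax >> 8) & 0xFF,    # bits 8..15, the high byte of AX
        "AL": rax & 0xFF,           # bits 0..7, the low byte of AX
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name} = {value:#x}")
```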

Advantages of CISC and RISC
- Code density of CISC
- Register usage and ABI models of RISC
- Easy application of standard optimization algorithms.

SpecInt 2000 Code Generation
- Code size grows less than 10%
  - Due mostly to instruction prefixes
- Static instruction count shrinks by 10%
- Dynamic instruction count shrinks by 5%
- Dynamic load/store count shrinks by 20%
- All without specific code optimizations

Summary (AMD advertising)
- Processor is fully x86 capable
  - Full native performance with 32-bit apps
  - Full compatibility (BIOS, OS, Drivers)
- Flexible deployment
  - Best in class 32-bit x86 performance
  - Excellent 64-bit instruction execution when needed
- Server/Workstation/Desktop/Mobile
  - Share common architecture, OS, etc

Architecture
- Nine pipelines (3 floating-point, 3 integer, 3 address)
  - Integer pipeline has 12 stages (very deep)
- Accurate branch prediction
  - A lot of effort put in here!
- Large TLB (virtual memory lookup table)
  - 512 entries for data
  - 512 entries for instructions
- Integrated memory controller
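
One way to read the TLB figures is as "reach": the amount of memory whose translations fit in the table at once, which is the number of entries times the page size. The Python sketch below works that out. The slides do not state a page size, so the 4 KB value here is an assumption (it is the standard small page on x86), and the name tlb_reach_bytes is illustrative.

```python
PAGE_SIZE_BYTES = 4 * 1024  # ASSUMPTION: 4 KB pages; not stated on the slide
TLB_ENTRIES = 512           # per the slide, for data and for instructions alike


def tlb_reach_bytes(entries: int, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Memory covered by a TLB holding one translation per page."""
    return entries * page_size


reach_mib = tlb_reach_bytes(TLB_ENTRIES) // 2**20
print(f"{TLB_ENTRIES} entries cover {reach_mib} MB with 4 KB pages")
```

Under that assumption each 512-entry table covers 2 MB; larger pages would raise the figure proportionally.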

Memory
- All memory is ECC protected
  - L1 Data cache
  - L2 cache
  - DRAM
- ECC stands for error correcting code
  - Detect all 2 bit errors
  - Auto-correct any single bit error
- Useful for server/critical applications
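
The property described here, correcting any single-bit error while detecting any double-bit error, can be demonstrated with a very small code. The Python sketch below uses an extended Hamming code that protects 4 data bits with 4 check bits. The slides do not say which code Hammer uses, so this is a teaching-sized stand-in rather than the chip's actual scheme, and the names encode and decode are illustrative.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into 8 bits. Index 0 is the overall parity bit;
    indices 1..7 are Hamming positions (1, 2, 4 are check bits)."""
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total parity even
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, nibble); nibble is None when two bits were flipped."""
    word = list(code)
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position           # XOR of the set-bit positions
    overall = sum(word) % 2                # 1 means an odd number of flips
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        word[syndrome] ^= 1                # syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return "double-error", None        # detected but not correctable
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return status, nibble


stored = encode(0b1011)
one_flip = list(stored)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1
print(decode(stored), decode(one_flip), decode(two_flips))
```

Memory systems commonly apply the same idea to much wider words, which keeps the check-bit overhead far lower than the 50% of this toy example.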

Input-Output and Multi-Processing
- Very high bandwidth I/O
  - Planned for server applications
- Multi-processing built in
  - Can have 2-8 processors
  - Memory appears flat and fully coherent
  - 25 gigabytes/second between processors
  - 8 gigabytes/second to/from memory

Conclusion
- AMD and Intel go head to head
  - But with totally different technologies
  - Fascinating
- Many other references on the net
  - Do a Google search for AMD Hammer
  - A good non-AMD reference is