Published by Gervais Johnston. Modified over 9 years ago.
1
Microprocessors
AMD Hammer
AMD's High Stakes RISC Entry
May 2nd, 2002
2
AMD Hammer
- For all the usual reasons, AMD feels that it must address 64-bit computing.
- AMD has decided NOT to follow Intel.
- Instead it will generate its own 64-bit version of the ia32 architecture.
- The general name is Hammer.
  - Sledge-hammer, the first chip, soon!
- A reference for full information:
  http://www.amd.com/us-en/assets/content_type/DownloadableAssets/MPF_Hammer_Presentation.PDF
3
Public Specification
- All aspects of this chip developed in public
- Announced at Linux World
- Uses GNU/Linux as native 64-bit OS
- Public specification at www.x86-64.org
- X86-64 is the official designation
  - Hammer is like Pentium (a product name, with different models)
  - X86-64 is like ia32 or ia64 (an architecture)
4
Hammer Basics
- In the same way that the 386 extended the 286 architecture from 16 to 32 bits, Hammer extends ia32 from 32 to 64 bits.
- This is NOT a new architecture.
- Hammer is 100% upwards compatible with the ia32, and can run any ia32 program unchanged.
- And the ia32 program will run fast, getting many of the benefits of the Hammer.
5
The Move to 64-bit
- Enhancements
  - Add 8 new integer registers
  - Add PC-relative addressing
  - Add full support for SSE/SSE2 floating-point
    - Including 16 registers
- Additional registers added with prefixes
  - Prefixes specify addressing modes
  - Prefixes specify additional registers
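
The prefix mechanism above can be sketched in a few lines. This assumes the prefix layout from the published x86-64 specification, where it is called REX and has the bit pattern 0100WRXB; the slide itself does not name it. The helper names rex_prefix and encode_mov_reg_reg are hypothetical, chosen only for this illustration.

```python
# Sketch of the x86-64 REX prefix byte, bit layout 0100WRXB.
# rex_prefix and encode_mov_reg_reg are illustrative names, not a real API.

def rex_prefix(w=False, r=False, x=False, b=False):
    """Build a REX prefix byte.

    w: select 64-bit operand size
    r: extra high bit for the ModRM reg field (reaches R8-R15)
    x: extra high bit for the SIB index field
    b: extra high bit for the ModRM r/m field
    """
    return 0x40 | (int(w) << 3) | (int(r) << 2) | (int(x) << 1) | int(b)


def encode_mov_reg_reg(dst, src):
    """Encode a 64-bit register-to-register MOV dst, src (opcode 0x89).

    Registers are numbered 0-15; 0 is RAX, 8 is R8.
    """
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")
    # The low 3 bits of each register go in ModRM; the 4th bit goes in REX.
    rex = rex_prefix(w=True, r=src >= 8, b=dst >= 8)
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)  # 0xC0 = register-direct mode
    return bytes([rex, 0x89, modrm])


print(encode_mov_reg_reg(0, 8).hex())  # mov rax, r8
print(encode_mov_reg_reg(8, 0).hex())  # mov r8, rax
```

The existing ia32 encodings have only 3-bit register fields, so the one extra prefix byte is what carries the fourth bit needed to name 16 registers. That prefix byte is also the cost in code size.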
6
64-bit Addressing
- 48-bit virtual addresses
  - As opposed to 32-bit on ia32
  - Allows 256 terabytes of virtual memory
  - (but not a full 64 bits, though this could be added relatively easily later, since addresses are always handled in 64-bit registers)
- 40-bit physical addresses
  - As opposed to 32-bit on ia32 (36-bit with PAE)
  - Allows for one terabyte (1024 gigabytes) of physical memory
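
The sizes quoted above follow directly from the address widths. The short sketch below checks the arithmetic. It also shows one way a 48-bit address can live in a 64-bit register: the "canonical" rule from the published x86-64 specification, under which the upper bits must all copy bit 47. That rule is not stated on the slide and is included here as an assumption. The function names are hypothetical.

```python
# Address-space arithmetic for the widths quoted on the slide.
TIB = 1 << 40  # one terabyte, binary definition
GIB = 1 << 30  # one gigabyte, binary definition


def address_space_bytes(bits):
    """Number of bytes addressable with the given address width."""
    return 1 << bits


def is_canonical(addr, virtual_bits=48):
    """Check that bits 63..(virtual_bits - 1) of a 64-bit address are all equal.

    Assumption: this is the sign-extension rule from the x86-64 specification,
    not something the slide describes.
    """
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (virtual_bits - 1)
    all_ones = (1 << (64 - virtual_bits + 1)) - 1
    return top == 0 or top == all_ones


print(address_space_bytes(48) // TIB)  # terabytes of virtual address space
print(address_space_bytes(40) // GIB)  # gigabytes of physical address space
print(is_canonical(0x00007FFFFFFFFFFF), is_canonical(0x0000800000000000))
```

Requiring the unused upper bits to be a sign extension, rather than ignoring them, is what would let a later chip widen the virtual address without breaking existing programs.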
7
Register Structure
- 16 SSE floating-point registers (128 bits each)
- 16 integer registers (64 bits each)
  - E.g. RAX
    - Low 32 bits is EAX
    - Low 16 bits is AX (itself split into AH and AL)
  - Extra registers are R8-R15
- 8 x87 registers for compatibility (80 bits each)
- One 64-bit program counter
  - Low-order 32 bits is EIP
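
The way the older register names overlay RAX can be shown with plain bit masks. This is a minimal sketch of the read-side view only; it does not model what happens to the upper bits when a narrower register is written. The function name sub_registers is hypothetical.

```python
# How the legacy register names map onto the low bits of 64-bit RAX.

def sub_registers(rax):
    """Return the EAX, AX, AH and AL views of a 64-bit RAX value."""
    rax &= 0xFFFFFFFFFFFFFFFF  # keep the value within 64 bits
    return {
        "eax": rax & 0xFFFFFFFF,   # low 32 bits
        "ax": rax & 0xFFFF,        # low 16 bits
        "ah": (rax >> 8) & 0xFF,   # bits 8..15, the high byte of AX
        "al": rax & 0xFF,          # bits 0..7, the low byte of AX
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(name, hex(value))
```

Because EAX is simply the low half of RAX, an unchanged ia32 program sees exactly the registers it expects.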
8
Advantages of CISC and RISC
- Code density of CISC
- Register usage and ABI models of RISC
- Easy application of standard optimization algorithms
9
SPECint 2000 Code Generation
- Code size grows less than 10%
  - Due mostly to instruction prefixes
- Static instruction count shrinks by 10%
- Dynamic instruction count shrinks by 5%
- Dynamic load/store count shrinks by 20%
- All without specific code optimizations
10
Summary (AMD advertising)
- Processor is fully x86 capable
  - Full native performance with 32-bit apps
  - Full compatibility (BIOS, OS, drivers)
- Flexible deployment
  - Best-in-class 32-bit x86 performance
  - Excellent 64-bit instruction execution when needed
- Server/Workstation/Desktop/Mobile
  - Share common architecture, OS, etc.
11
Architecture
- Nine pipelines (3 floating-point, 3 integer, 3 address)
- Integer pipeline has 12 stages (very deep)
- Accurate branch prediction
  - A lot of effort put in here!
- Large TLB (virtual memory lookup table)
  - 512 entries for data
  - 512 entries for instructions
- Integrated memory controller
12
Memory
- All memory is ECC protected
  - L1 data cache
  - L2 cache
  - DRAM
- ECC stands for error-correcting code
  - Detect all 2-bit errors
  - Auto-correct any single-bit error
- Useful for server/critical applications
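
The "correct one bit, detect two" behaviour can be illustrated with a toy extended Hamming code over 4 data bits. This is only a sketch of the general technique: the slides do not say which code Hammer uses, and real memory systems protect much wider words. The names encode and decode are hypothetical.

```python
# Toy single-error-correct, double-error-detect code: 4 data bits, 4 check bits.
# Index 0 holds an overall parity bit; indices 1..7 form a Hamming(7,4) code
# with check bits at positions 1, 2, 4 and data bits at positions 3, 5, 6, 7.

DATA_POSITIONS = (3, 5, 6, 7)


def encode(nibble):
    """Encode a 4-bit value into an 8-bit codeword (list of 0/1)."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in (1, 2, 4):
        # Check bit p makes the parity of all positions containing bit p even.
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity over the Hamming bits
    return code


def decode(received):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double_error'."""
    code = list(received)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    overall_odd = sum(code) % 2 == 1
    if overall_odd:
        # Odd overall parity means one flipped bit; the syndrome is its index
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity with a nonzero syndrome: two bits flipped.
        return None, "double_error"
    else:
        status = "ok"
    nibble = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return nibble, status


word = encode(0b1011)
word[6] ^= 1  # inject a single-bit error
print(decode(word))
```

The check bits locate a single flipped bit, and the overall parity bit distinguishes one flip from two. That distinction is what lets the hardware correct silently in the first case and report a fault in the second.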
13
Input-Output and Multi-Processing
- Very high bandwidth I/O
  - Planned for server applications
- Multi-processing built in
  - Can have 2-8 processors
  - Memory appears flat and fully coherent
  - 25 gigabytes/second between processors
  - 8 gigabytes/second to/from memory
14
Conclusion
- AMD and Intel go head to head
  - But with totally different technologies
  - Fascinating
- Many other references on the net
  - Do a Google search for AMD Hammer
- A good non-AMD reference is
  http://www.anandtech.com/showdoc.html?i=1546