The AMD K8 Processor Architecture, December 14th, 2006

K7 vs K8

K7:
- 3 x86 decoding units
- 3 integer units (ALU)
- 3 floating-point units (FPU)
- 128KB L1 cache

K8:
- 3 decoders (16 bytes of instructions per clock cycle)
- x86 instructions are decoded into fixed-length micro-operations (µOPs).
- Complex instructions are decoded into two or more µOPs.
- FastPath: certain µOPs are packed together.
- µOPs are then dispatched to the execution units.
- 3 Address Generation Units (AGU) for loads and stores
- 3 integer units (ALU): most µOPs execute in one cycle; multiplication has a 3-cycle latency in 32 bits and a 5-cycle latency in 64 bits.
- 3 floating-point units (FPU) that handle x87, MMX, 3DNow!, SSE and SSE2 instructions
- Load/store stage: the L1 is dual-ported, meaning it can handle two 64-bit reads or writes each clock cycle.
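
The dual-ported L1 figure above implies a peak transfer rate that is easy to work out by hand. The following is a minimal sketch of that arithmetic; the function names and the `clock_ghz` parameter are assumptions made for illustration and do not come from the slides, which give no clock speed.

```python
# Figures from the slide: the L1 has two ports, each handling one 64-bit
# read or write per clock cycle.
L1_PORTS = 2
PORT_WIDTH_BITS = 64


def l1_peak_bytes_per_cycle(ports=L1_PORTS, width_bits=PORT_WIDTH_BITS):
    """Peak bytes moved between the L1 and the core in one clock cycle."""
    return ports * width_bits // 8


def l1_peak_gb_per_s(clock_ghz):
    """Peak L1 bandwidth in GB/s (10^9 bytes/s) for a hypothetical clock.

    clock_ghz is an illustrative input, not a value from the slides.
    """
    return l1_peak_bytes_per_cycle() * clock_ghz
```

With the slide's numbers, `l1_peak_bytes_per_cycle()` gives 16 bytes per cycle. This is an upper bound only: it assumes both ports are busy on every cycle, which real code rarely achieves.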

K8 Hammer Microarchitecture

K7 vs K8 Pipelines

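K8 L1 and L2 Cache

The L1 cache

| CPU | K8 | Athlon XP | Pentium 4 Northwood | Pentium 4 Prescott |
|---|---|---|---|---|
| Size | code: 64KB, data: 64KB | code: 64KB, data: 64KB | TC: 12Kµops, data: 8KB | TC: 12Kµops, data: 16KB |
| Associativity | code: 2-way, data: 2-way | code: 2-way, data: 2-way | TC: 8-way, data: 4-way | TC: 8-way, data: 8-way |
| Cache line size | code: 64 bytes, data: 64 bytes | code: 64 bytes, data: 64 bytes | TC: n.a., data: 64 bytes | TC: n.a., data: 64 bytes |
| Write policy | Write back | Write back | Write through | Write through |
| Latency | 3 cycles | 3 cycles | 2 cycles | 4 cycles |

(TC = trace cache)

The L2 cache

| CPU | K8 | Athlon XP | Pentium 4 Northwood | Pentium 4 Prescott |
|---|---|---|---|---|
| Size | 512KB (Newcastle), 1024KB (Hammer) | 256 and 512KB | 512KB | 1024KB |
| Associativity | 16-way | 16-way | 8-way | 8-way |
| Cache line size | 64 bytes | 64 bytes | 64 bytes | 64 bytes |
| Latency (given by manufacturer) | ? | 8 cycles | 7 cycles | 11 cycles |
| Bus width | 128 bits | 64 bits | 256 bits | 256 bits |
| L1 relationship | exclusive | exclusive | inclusive | inclusive |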
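
The size, associativity and line-size figures above determine how many sets each cache has and how an address splits into offset and index bits. The sketch below shows the standard set-associative calculation; the `cache_geometry` helper is an illustrative name, and it assumes power-of-two sizes and a conventional set-associative layout, which the slides do not spell out.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Derive set count and address-bit split for a set-associative cache.

    Assumes size, ways and line size are powers of two.
    """
    lines = size_bytes // line_bytes          # total cache lines
    sets = lines // ways                      # lines are grouped into sets
    offset_bits = line_bytes.bit_length() - 1 # log2(line size)
    index_bits = sets.bit_length() - 1        # log2(number of sets)
    return {"sets": sets, "offset_bits": offset_bits, "index_bits": index_bits}


KB = 1024

# Values taken from the cache figures above.
k8_l1_data = cache_geometry(64 * KB, ways=2, line_bytes=64)
k8_l2_hammer = cache_geometry(1024 * KB, ways=16, line_bytes=64)
prescott_l1_data = cache_geometry(16 * KB, ways=8, line_bytes=64)
```

For the K8 L1 data cache this gives 512 sets (6 offset bits, 9 index bits), while the Prescott L1 data cache has only 32 sets despite its higher associativity.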

Exclusive vs Inclusive Cache

Exclusive L1-L2: a cache line (instructions or data) held in the L1 is not also kept in the L2.
- Positive: no constraint on the L2 size (it can be small); total cache size is the sum of the sub-level sizes.
- Negative: L2 performance is impaired (latency); a victim buffer is needed.

Inclusive L1-L2: the content of the L1 cache is duplicated in the L2 cache.
- Positive: L2 performance is improved.
- Negative: constraint on the L1/L2 size ratio (relatively large L2); total cache size may be smaller.
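
The "total cache size" point can be made concrete with a small calculation. This is an idealized sketch, assuming strict exclusion (no line is ever in both levels) and strict inclusion (every L1 line is also in the L2); the `unique_capacity_kb` helper is a hypothetical name introduced here, not something from the slides.

```python
def unique_capacity_kb(l1_kb, l2_kb, policy):
    """Amount of distinct data the L1+L2 pair can hold, in KB (idealized)."""
    if policy == "exclusive":
        # No duplication: the two levels add up.
        return l1_kb + l2_kb
    if policy == "inclusive":
        # L1 content is duplicated in L2, so L2 bounds the total.
        return l2_kb
    raise ValueError(f"unknown policy: {policy!r}")


# K8 (Newcastle): 64KB code + 64KB data L1, 512KB L2, exclusive.
k8_total = unique_capacity_kb(128, 512, "exclusive")

# Pentium 4 Prescott: 16KB L1 data, 1024KB L2, inclusive
# (the trace cache is left out because its size is given in µops).
prescott_total = unique_capacity_kb(16, 1024, "inclusive")
```

Under these assumptions the K8 pair holds 640KB of distinct data, and the Prescott pair holds 1024KB, the L2 size alone.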

K8 Athlon 64

Athlon 64 Operating Modes

Opteron VS. Xeon
