Recitation 6 – 3/11/02
Outline: Cache Organization, Accessing Cache, Replacement Policy
Mengzhi Wang
Office Hours: Thursday 1:30 – 3:00, Wean Hall 3108
Cache organization (review)
[Figure: sets 0 through S-1; each set holds E lines, and each line holds a valid bit, a tag, and bytes 0 … B-1 of a block]
B = 2^b bytes per cache block
E lines per set
S = 2^s sets
t tag bits per line
1 valid bit per line
A cache is an array of sets. Each set contains one or more lines. Each line holds a block of data.
Addressing the cache (review)
Address A is in the cache if its tag matches the tag of a valid line in the set selected by A's set index.
[Figure: address A (bits m-1 … 0) split into t tag bits, s set-index bits, and b block-offset bits; the tag is compared against the valid lines of set s]
Parameters of cache organization
B = 2^b = line size
E = associativity
S = 2^s = number of sets
Cache size C = B × E × S
Other parameters:
– s = set index bits
– b = byte offset bits
– t = tag bits
– m = address size
– t + s + b = m
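The field split above can be sketched in a few lines of code; the function name and the example address are illustrative, not from the slides.

```python
def split_address(addr, s_bits, b_bits):
    """Split an address into (tag, set index, block offset).

    The low b bits are the byte offset, the next s bits select the
    set, and the remaining high bits are the tag (t = m - s - b).
    """
    offset = addr & ((1 << b_bits) - 1)
    set_index = (addr >> b_bits) & ((1 << s_bits) - 1)
    tag = addr >> (b_bits + s_bits)
    return tag, set_index, offset

# 6-bit address 0b110110 with s=2, b=2: tag=0b11, set=0b01, offset=0b10
print(split_address(0b110110, 2, 2))
```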
Determining cache parameters
Suppose we are told we have an 8 KB direct-mapped cache with 64-byte lines, and the word size is 32 bits.
– A direct-mapped cache has an associativity of 1.
What are the values of t, s, and b?
B = 2^b = 64, so b = 6
B × E × S = C = 8192 (8 KB), and we know E = 1
S = 2^s = C / B = 128, so s = 7
t = m – s – b = 32 – 7 – 6 = 19
Answer: t = 19, s = 7, b = 6
One more example
Suppose our cache is 16 KB, 4-way set associative, with 32-byte lines. These are the parameters of the L1 cache of the P3 Xeon processors used by the fish machines.
B = 2^b = 32, so b = 5
B × E × S = C = 16384 (16 KB), and E = 4
S = 2^s = C / (E × B) = 128, so s = 7
t = m – s – b = 32 – 7 – 5 = 20
Answer: t = 20, s = 7, b = 5
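Both worked examples can be checked mechanically; a small helper along these lines (the function name is mine) computes t, s, and b from the cache geometry.

```python
import math

def cache_params(size_bytes, assoc, line_bytes, addr_bits=32):
    """Return (t, s, b) for a cache of the given geometry."""
    b = int(math.log2(line_bytes))                 # block offset bits
    num_sets = size_bytes // (assoc * line_bytes)  # S = C / (E * B)
    s = int(math.log2(num_sets))                   # set index bits
    t = addr_bits - s - b                          # the rest is the tag
    return t, s, b

print(cache_params(8192, 1, 64))    # 8 KB direct-mapped, 64-byte lines: (19, 7, 6)
print(cache_params(16384, 4, 32))   # 16 KB 4-way, 32-byte lines: (20, 7, 5)
```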
Accessing Cache: Direct-Mapped Cache Reference String
Assume a direct-mapped cache with 4 four-byte lines and 6-bit addresses (t=2, s=2, b=2).
[Table: columns Line, V, Tag, Byte 0, Byte 1, Byte 2, Byte 3; contents not recoverable]
Direct-Mapped Cache Reference String
Final state of the direct-mapped cache (4 four-byte lines):
[Table: columns Line, V, Tag, Byte 0, Byte 1, Byte 2, Byte 3; contents not recoverable]
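Since the reference-string table itself did not survive extraction, here is a sketch of the lookup logic for the t=2, s=2, b=2 cache above; the example addresses are made up for illustration.

```python
def simulate_direct_mapped(addresses, s_bits=2, b_bits=2):
    """Return a hit/miss flag per access for a direct-mapped cache."""
    stored_tag = {}                      # set index -> tag currently cached
    results = []
    for addr in addresses:
        set_index = (addr >> b_bits) & ((1 << s_bits) - 1)
        tag = addr >> (b_bits + s_bits)
        hit = stored_tag.get(set_index) == tag
        if not hit:
            stored_tag[set_index] = tag  # fetch the block, overwrite the line
        results.append(hit)
    return results

# Addresses 0 and 1 share one block (hit); 16 maps to set 0 with a
# different tag, evicting that block, so the final access to 0 misses.
print(simulate_direct_mapped([0b000000, 0b000001, 0b010000, 0b000000]))
```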
Accessing Cache: Set-Associative Cache Reference String
Four-way set associative, 4 sets, one-byte blocks (t=4, s=2, b=0).
[Table: per set, V and Tag columns for each of the four lines; contents not recoverable]
Set-Associative Cache Reference String
Four-way set associative, 4 sets, one-byte blocks; final state:
[Table: per set, V and Tag columns for each of the four lines; contents not recoverable]
Replacement Policy
The replacement policy:
– Determines which cache line to evict
– Matters for set-associative caches
– Does not exist for direct-mapped caches (each block can go in only one line)
Example
Assuming a 2-way set-associative cache, determine the number of misses for the following trace:
A B C A B C B A B D
A, B, C, and D all map to the same set.
Ideal Case: OPTIMAL
Policy 0: OPTIMAL
– Replace the cache line that is accessed furthest in the future
Properties:
– Requires knowledge of the future
– Gives the best-case miss count
Ideal Case: OPTIMAL
Ref:     A    B    C    A    B    C    B    A    B    D
OPTIMAL: A,-+ A,B+ A,C+ A,C  B,C+ B,C  B,C  B,A+ B,A  D,A+
(+ marks a miss)
OPTIMAL # of misses: 6
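The miss count above can be reproduced with a small simulator for a single 2-line set; the function name is mine, and this is a sketch of Belady's OPTIMAL policy rather than anything from the slides.

```python
def optimal_misses(trace, num_lines=2):
    """Count misses under the OPTIMAL policy for a single set."""
    cache, misses = [], 0
    for i, ref in enumerate(trace):
        if ref in cache:
            continue                          # hit
        misses += 1
        if len(cache) < num_lines:
            cache.append(ref)                 # fill an empty line
            continue
        def next_use(line):
            # Position of the line's next access, or "never".
            rest = trace[i + 1:]
            return trace.index(line, i + 1) if line in rest else len(trace)
        victim = max(cache, key=next_use)     # evict furthest-future line
        cache[cache.index(victim)] = ref
    return misses

print(optimal_misses("ABCABCBABD"))  # 6, matching the slide
```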
Policy 1: FIFO
– Replace the oldest cache line
Policy 1: FIFO
Ref:     A    B    C    A    B    C    B    A    B    D
OPTIMAL: A,-+ A,B+ A,C+ A,C  B,C+ B,C  B,C  B,A+ B,A  D,A+   6 misses
FIFO:    A,-+ A,B+ C,B+ C,A+ B,A+ B,C+ B,C  A,C+ A,B+ D,B+   9 misses
(+ marks a miss)
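The FIFO miss count can be checked the same way; again a sketch for a single set, with a helper name of my choosing.

```python
from collections import deque

def fifo_misses(trace, num_lines=2):
    """Count misses when the oldest resident line is always evicted."""
    cache, misses = deque(), 0
    for ref in trace:
        if ref in cache:
            continue                # hit: FIFO order is NOT updated
        misses += 1
        if len(cache) == num_lines:
            cache.popleft()         # evict the oldest line
        cache.append(ref)
    return misses

print(fifo_misses("ABCABCBABD"))  # 9, matching the slide
```

Note the key difference from LRU: a hit does not refresh a line's position in the queue, which is why FIFO does worse on this trace.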
Policy 2: LRU
Policy 2: Least-Recently Used
– Replace the least-recently used cache line
Properties:
– Approximates the OPTIMAL policy by using past behavior to predict future behavior
– The least-recently used cache line is unlikely to be accessed again in the near future
Policy 2: LRU
Ref:     A    B    C    A    B    C    B    A    B    D
OPTIMAL: A,-+ A,B+ A,C+ A,C  B,C+ B,C  B,C  B,A+ B,A  D,A+   6 misses
FIFO:    A,-+ A,B+ C,B+ C,A+ B,A+ B,C+ B,C  A,C+ A,B+ D,B+   9 misses
LRU:     A,-+ A,B+ C,B+ C,A+ B,A+ B,C+ B,C  B,A+ B,A  B,D+   8 misses
(+ marks a miss)
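The LRU row can be reproduced with one more single-set simulator (a sketch; the list simply keeps lines ordered from least to most recently used).

```python
def lru_misses(trace, num_lines=2):
    """Count misses when the least-recently used line is evicted."""
    cache, misses = [], 0           # front = least recently used
    for ref in trace:
        if ref in cache:
            cache.remove(ref)       # hit: recency refreshed below
        else:
            misses += 1
            if len(cache) == num_lines:
                cache.pop(0)        # evict the LRU line
        cache.append(ref)           # ref is now most recently used
    return misses

print(lru_misses("ABCABCBABD"))  # 8, matching the slide
```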
Reality: Pseudo-LRU
Reality:
– True LRU is hard to implement in hardware
– Pseudo-LRU is implemented as an approximation of LRU
Pseudo-LRU:
– Each cache line is equipped with a reference bit
– The bit is cleared periodically
– The bit is set when the cache line is accessed
– Evict a cache line whose bit is unset
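The single-bit scheme described above can be sketched as follows. The class and method names are mine, and real processors use more refined variants (e.g. tree-based pseudo-LRU); this only illustrates the reference-bit idea from the slide.

```python
class PseudoLRUSet:
    """One cache set with a single reference bit per line (a sketch)."""

    def __init__(self, num_lines):
        self.tags = [None] * num_lines
        self.ref_bit = [False] * num_lines

    def clear_bits(self):
        # In the slide's scheme, done periodically by the controller.
        self.ref_bit = [False] * len(self.tags)

    def access(self, tag):
        """Return True on a hit, False on a miss (with eviction)."""
        if tag in self.tags:
            self.ref_bit[self.tags.index(tag)] = True
            return True
        # Miss: prefer an invalid line, else any line with its bit unset;
        # if every bit happens to be set, fall back to line 0.
        victim = 0
        for i, t in enumerate(self.tags):
            if t is None or not self.ref_bit[i]:
                victim = i
                break
        self.tags[victim] = tag
        self.ref_bit[victim] = True
        return False

s = PseudoLRUSet(2)
print(s.access("A"), s.access("B"), s.access("A"))  # False False True
```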
Accessing Cache: Fully Associative Cache Reference String
Fully associative, 4 four-byte blocks (t=4, s=0, b=2).
[Table: V and Tag columns plus Byte 0, Byte 1, Byte 2, Byte 3 for each block; contents not recoverable]
Fully Associative Cache Reference String
Fully associative, 4 four-byte blocks (t=4, s=0, b=2); final state:
[Table: V and Tag columns plus Byte 0, Byte 1, Byte 2, Byte 3 for each block; contents not recoverable]
Note: used LRU eviction policy.
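Since this table also did not survive extraction, here is a sketch of a fully associative cache with LRU eviction for the geometry above (4 four-byte blocks, s=0, b=2); the function name and example addresses are mine.

```python
from collections import OrderedDict

def fully_assoc_hits(addresses, num_blocks=4, b_bits=2):
    """Hit count for a fully associative LRU cache of four-byte blocks."""
    cache = OrderedDict()            # block tag -> None, front = LRU
    hits = 0
    for addr in addresses:
        tag = addr >> b_bits         # s = 0: everything above the offset is tag
        if tag in cache:
            hits += 1
            cache.move_to_end(tag)   # refresh recency
        else:
            if len(cache) == num_blocks:
                cache.popitem(last=False)  # evict the LRU block
            cache[tag] = None
    return hits

# Bytes 0-3 share one block, so the three accesses after the first all hit.
print(fully_assoc_hits([0, 1, 2, 3]))  # 3
```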