Published by Jasmine Black. Modified over 9 years ago.
CENG 450 Computer Systems and Architecture
Cache Review
Amirali Baniasadi
amirali@ece.uvic.ca
Problem 1: Final 2004

You are building a system around a processor running at 2.0 GHz. The processor has a CPI of 0.5 excluding memory stalls. 30% of the instructions are loads and stores. The memory system is composed of a unified L1 cache that imposes a 1 ns penalty on hits. The L1 cache has a miss rate of 2% for both instructions and data and has 32-byte blocks. The unified L2 cache has 64-byte blocks and an access time of 25 ns. It is connected to the L1 cache by a 256-bit data bus that runs at 400 MHz and can transfer one 256-bit bus word per bus cycle. Of all memory references sent to the L2 cache, 95% are satisfied without going to the main memory. The main memory has an access latency of 100 ns, after which any number of bus words may be transferred at the rate of one per cycle on the 128-bit-wide 133 MHz main memory bus.

a) What is the A.M.A.T. (average memory access time)? (6 points)
b) What is the overall CPI? (6 points)
c) You are considering replacing the 2.0 GHz processor with one that runs at 3.0 GHz but is otherwise identical. How much faster does the system run with the faster processor? Assume that the speed of the memory system remains the same in absolute terms. (6 points)
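One way to set up the arithmetic for Problem 1 is sketched below in Python. This is not the official solution; it rests on assumptions the problem does not state outright: each instruction makes 1.3 memory references (one fetch plus 0.3 loads/stores), the 1 ns hit penalty is paid on every reference and counts as stall time, an L1 miss costs the L2 access time plus the bus transfer of one 32-byte L1 block, and an L2 miss additionally costs the memory latency plus the transfer of one 64-byte L2 block. The function names are made up for illustration.

```python
BASE_CPI = 0.5          # CPI excluding memory stalls (from the problem)
REFS_PER_INSTR = 1.3    # assumption: 1 instruction fetch + 0.3 loads/stores


def bus_transfer_ns(block_bytes, bus_bits, bus_mhz):
    """Time in ns to move one block over a bus carrying bus_bits per cycle."""
    bytes_per_cycle = bus_bits // 8
    transfers = -(-block_bytes // bytes_per_cycle)  # ceiling division
    return transfers * 1000.0 / bus_mhz


def amat_ns():
    """Average memory access time in ns under the assumptions above."""
    l1_hit_ns = 1.0
    l1_miss_rate = 0.02
    l2_access_ns = 25.0
    l2_local_miss_rate = 0.05
    mem_latency_ns = 100.0

    l1_fill_ns = bus_transfer_ns(32, 256, 400)   # one 32-byte L1 block
    l2_fill_ns = bus_transfer_ns(64, 128, 133)   # one 64-byte L2 block
    l2_miss_penalty = mem_latency_ns + l2_fill_ns
    l1_miss_penalty = (l2_access_ns + l1_fill_ns
                       + l2_local_miss_rate * l2_miss_penalty)
    return l1_hit_ns + l1_miss_rate * l1_miss_penalty


def cpi(clock_ghz):
    """Overall CPI: base CPI plus memory time converted to processor cycles."""
    return BASE_CPI + REFS_PER_INSTR * amat_ns() * clock_ghz


def time_per_instr_ns(clock_ghz):
    """Average time per instruction in ns."""
    return cpi(clock_ghz) / clock_ghz


if __name__ == "__main__":
    print("AMAT (ns):", amat_ns())
    print("CPI at 2.0 GHz:", cpi(2.0))
    print("Speedup 2.0 -> 3.0 GHz:",
          time_per_instr_ns(2.0) / time_per_instr_ns(3.0))
```

Under these assumptions the memory time per instruction is fixed in nanoseconds, so raising the clock only shrinks the base-CPI part of the execution time; the CPI measured in cycles goes up even though each instruction finishes slightly sooner.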
Problem 2: Quiz 2004

You are building a system around a processor running at 2.5 GHz. The processor has a CPI of 0.4 excluding memory stalls. 30% of the instructions are loads and stores. The memory system is composed of a split L1 cache that imposes no penalties on hits. The I-cache has a miss rate of 2% and has 32-byte blocks. The D-cache has a 5% miss rate and 16-byte blocks. The 512 KB, unified L2 cache has 64-byte blocks and an access time of 15 ns. It is connected to the L1 cache by a 128-bit data bus that runs at 400 MHz and can transfer one 128-bit bus word per bus cycle. Of all memory references sent to the L2 cache, 90% are satisfied without going to the main memory. The main memory has an access latency of 75 ns, after which any number of bus words may be transferred at the rate of one per cycle on the 128-bit-wide 133 MHz main memory bus.

a) What is the A.M.A.T. (average memory access time)?
b) What is the overall CPI?
c) You are considering replacing the 2.5 GHz processor with one that runs at 3.6 GHz but is otherwise identical. How much faster does the system run with the faster processor? Assume that the speed of the memory system remains the same in absolute terms.
d) You have the following two options to improve system performance:
   1- Use a bigger L1 cache, which cuts the miss rates in half but results in a 1-cycle L1 cache hit time.
   2- Use a smaller L2 cache, which reduces the access time to 10 ns but increases the L2 cache miss rate to 15%.
   How does each option impact performance, and which one is the better choice (if any)?
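The split-cache arithmetic in Problem 2 can be sketched the same way. Again this is an illustration under stated assumptions rather than the official solution: 1 instruction fetch plus 0.3 data references per instruction, an L1 miss costs the L2 access time plus the bus transfer of one L1 block (32 bytes for the I-cache, 16 bytes for the D-cache), an L2 miss adds the memory latency plus the transfer of one 64-byte L2 block, and the 1-cycle hit time in option 1 is charged on every reference. All function and parameter names are hypothetical.

```python
BASE_CPI = 0.4       # CPI excluding memory stalls (from the problem)
DATA_REFS = 0.3      # loads/stores per instruction (from the problem)


def bus_transfer_ns(block_bytes, bus_bits, bus_mhz):
    """Time in ns to move one block over a bus carrying bus_bits per cycle."""
    bytes_per_cycle = bus_bits // 8
    transfers = -(-block_bytes // bytes_per_cycle)  # ceiling division
    return transfers * 1000.0 / bus_mhz


def stall_ns_per_instr(i_miss=0.02, d_miss=0.05,
                       l2_access_ns=15.0, l2_miss=0.10):
    """Memory stall time per instruction in ns (L1 hits assumed free)."""
    mem_penalty = 75.0 + bus_transfer_ns(64, 128, 133)
    l2_extra = l2_miss * mem_penalty
    i_penalty = l2_access_ns + bus_transfer_ns(32, 128, 400) + l2_extra
    d_penalty = l2_access_ns + bus_transfer_ns(16, 128, 400) + l2_extra
    return 1.0 * i_miss * i_penalty + DATA_REFS * d_miss * d_penalty


def amat_ns(**cache_params):
    """AMAT per reference in ns, averaging over 1.3 references per instruction."""
    return stall_ns_per_instr(**cache_params) / (1.0 + DATA_REFS)


def cpi(clock_ghz, l1_hit_cycles=0, **cache_params):
    """Overall CPI; l1_hit_cycles is charged per reference (assumption)."""
    hit_cycles = (1.0 + DATA_REFS) * l1_hit_cycles
    return BASE_CPI + hit_cycles + stall_ns_per_instr(**cache_params) * clock_ghz


if __name__ == "__main__":
    print("AMAT (ns):", amat_ns())
    print("Baseline CPI at 2.5 GHz:", cpi(2.5))
    print("Speedup 2.5 -> 3.6 GHz:", (cpi(2.5) / 2.5) / (cpi(3.6) / 3.6))
    print("Option 1 CPI:", cpi(2.5, l1_hit_cycles=1, i_miss=0.01, d_miss=0.025))
    print("Option 2 CPI:", cpi(2.5, l2_access_ns=10.0, l2_miss=0.15))
```

The comparison between the two options is sensitive to how the 1-cycle hit time is charged; if the extra cycle were hidden by the pipeline instead of added per reference, option 1 would look considerably better than it does in this sketch.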
Problem 3: Final 2004

A program consists of two nested loops: a small inner loop and a much larger outer loop. The general structure of the program is given in the figure below. The decimal memory addresses shown represent the locations of the two loops and the beginning and end of the total program. All memory locations in the various sections contain instructions to be executed in straight-line sequencing. The program is to be run on a computer that uses a 2-way set-associative cache. The main memory size is 64 KB, the cache size is 1 KB, and the block size is 128 bytes.

[Figure: program layout. Address labels: START, 139, 189, 265, 339, 1000, 1200, END. Annotations: inner loop executed 20 times; outer loop executed 10 times.]

The cycle time of the main memory is 100 ns and the cycle time of the cache is 10 ns.

a) Specify the number of bits in the TAG, BLOCK, and WORD fields in main memory addresses. (5 points)
b) Compute the total time needed for instruction fetching during the execution of the program in the above figure. (20 points)
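For part (a) of Problem 3, the field widths follow directly from the sizes given. The Python sketch below assumes a byte-addressable memory and power-of-two sizes, and treats the field the problem calls BLOCK as the set index of the 2-way set-associative cache. The function names are made up for illustration.

```python
def address_fields(memory_bytes, cache_bytes, block_bytes, ways):
    """Return (tag_bits, set_bits, word_bits) for a set-associative cache.

    Assumes byte addressing and that all sizes are powers of two.
    """
    address_bits = memory_bytes.bit_length() - 1   # log2 of memory size
    word_bits = block_bytes.bit_length() - 1       # offset within a block
    num_sets = cache_bytes // block_bytes // ways
    set_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits


def split_address(addr, set_bits=2, word_bits=7):
    """Split a byte address into (tag, set index, word offset)."""
    word = addr & ((1 << word_bits) - 1)
    set_index = (addr >> word_bits) & ((1 << set_bits) - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_index, word


if __name__ == "__main__":
    # 64 KB memory, 1 KB cache, 128-byte blocks, 2-way set-associative
    print(address_fields(64 * 1024, 1024, 128, 2))
    # Two of the addresses labeled in the figure
    print(split_address(139), split_address(1200))
```

With 8 blocks in the cache and 2 blocks per set there are 4 sets, so a 16-bit address splits into a 7-bit tag, a 2-bit set index, and a 7-bit word offset. Part (b) additionally depends on the loop boundaries in the figure and on the replacement policy, neither of which this sketch models.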