B. Ramamurthy
12-stage pipeline. At peak speed, the processor can request both an instruction and a data word on every clock cycle. We cannot afford pipeline stalls; the solution is to add a cache. The cache is 16 KB with 16-word blocks.
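The slide does not spell out how a 32-bit address is divided for this cache, but assuming a direct-mapped organization (as in fig. 7.9) and 4-byte words, the field widths follow from the sizes above. A minimal sketch of that arithmetic:

```python
# Hypothetical field breakdown for a 16 KB direct-mapped cache with
# 16-word (64-byte) blocks and 32-bit byte addresses. The direct-mapped
# organization and 32-bit addresses are assumptions, not stated on the slide.
CACHE_BYTES = 16 * 1024
WORDS_PER_BLOCK = 16
BYTES_PER_WORD = 4

BLOCK_BYTES = WORDS_PER_BLOCK * BYTES_PER_WORD   # 64 bytes per block
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES          # 256 blocks in the cache

byte_offset_bits = 2   # log2(4 bytes per word)
word_offset_bits = 4   # log2(16 words per block) -- selects the word in a block
index_bits = 8         # log2(256 blocks) -- selects the cache block
tag_bits = 32 - index_bits - word_offset_bits - byte_offset_bits  # 18 bits

print(NUM_BLOCKS, byte_offset_bits, word_offset_bits, index_bits, tag_bits)
# prints: 256 2 4 8 18
```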
Send the address to the appropriate cache. The address comes either from the PC (instruction fetch) or from the ALU (data access). If the cache signals a hit, the requested word is available on the data lines. Since there are 16 words in the desired block, we need to select the right one: the word-offset field of the address selects that word from the 16 words in the indexed block. If the cache signals a miss, we send the address to main memory, read the block from main memory, and fill the cache; the data is then read again. Let's look at the schematic of the organization: fig. 7.9.
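The read steps above can be sketched in a few lines. This is a behavioral sketch only, assuming a direct-mapped cache of 256 lines with 16-word blocks and word addresses (the byte offset already dropped); none of the names below come from the slides.

```python
# Sketch of the fig. 7.9 read flow: index into the cache, compare tags,
# select a word on a hit, refill the whole block from memory on a miss.
WORDS_PER_BLOCK = 16
NUM_BLOCKS = 256  # assumed direct-mapped, 16 KB / 64-byte blocks

# Each line: (valid, tag, list of 16 words). Memory is modeled as a dict.
cache = [(False, 0, [0] * WORDS_PER_BLOCK) for _ in range(NUM_BLOCKS)]

def read_word(word_addr, memory):
    """Return (word, hit) for a word address."""
    offset = word_addr % WORDS_PER_BLOCK                      # word within block
    index = (word_addr // WORDS_PER_BLOCK) % NUM_BLOCKS       # which cache line
    tag = word_addr // (WORDS_PER_BLOCK * NUM_BLOCKS)         # rest of the address
    valid, line_tag, words = cache[index]
    if valid and line_tag == tag:
        return words[offset], True          # hit: word is on the data lines
    base = (word_addr // WORDS_PER_BLOCK) * WORDS_PER_BLOCK
    block = [memory.get(base + i, 0) for i in range(WORDS_PER_BLOCK)]
    cache[index] = (True, tag, block)       # miss: fill the block...
    return block[offset], False             # ...then the data is read again
```

For example, the first read of a word misses and fills the block, and a second read of the same word hits.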
CPU <-> Cache <-> Main Memory. What is the bus width? How should the main memory be organized?
Assume that on a cache miss (with a 4-word block and a 32-bit = 4-byte bus) we need:
1 memory cycle to send the address to main memory,
15 memory cycles per DRAM access to read one memory word,
1 memory cycle to send each word of data back.
Total for a block access: 1 + 4 × 15 + 4 × 1 = 1 + 60 + 4 = 65 cycles.
Bytes received = 1 cache block = 4 × 4 = 16 bytes.
Bytes/cycle = 16/65 ≈ 0.25 (too low for our fast processor!)
What is your solution? We need better bandwidth. Increase the bus width? Interleave the memory? Use a wide memory organization? See fig. 7.11.
Increase the memory width (double it, to an 8-byte bus): 1 + 2 × 15 + 2 × 1 = 1 + 30 + 2 = 33 cycles; 16/33 ≈ 0.5 bytes/cycle.
Memory interleaving (four banks, accesses overlapped): 1 + 15 + 4 × 1 = 20 cycles; 16/20 = 4/5 = 0.8 bytes/cycle.
Summary of miss penalties: one-word-wide memory, 65 cycles; doubled width, 33 cycles; interleaved, 20 cycles (not bad at all).
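The three organizations can be compared under the same timing assumptions (1 cycle for the address, 15 cycles per DRAM access, 1 cycle per bus transfer, 4-word block). A minimal sketch; the helper name is illustrative:

```python
# Miss penalty = address cycle + (DRAM accesses x 15) + (bus transfers x 1).
def miss_penalty(accesses, transfers):
    return 1 + accesses * 15 + transfers * 1

narrow      = miss_penalty(accesses=4, transfers=4)  # 4-byte bus, one word at a time
wide        = miss_penalty(accesses=2, transfers=2)  # 8-byte bus, two words per access
interleaved = miss_penalty(accesses=1, transfers=4)  # 4 banks accessed in parallel

for name, cycles in [("narrow", narrow), ("wide", wide), ("interleaved", interleaved)]:
    print(f"{name}: {cycles} cycles, {16 / cycles:.2f} bytes/cycle")
# prints:
# narrow: 65 cycles, 0.25 bytes/cycle
# wide: 33 cycles, 0.48 bytes/cycle
# interleaved: 20 cycles, 0.80 bytes/cycle
```

Note that 16/33 is 0.48, which the slide rounds to 0.5.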