1
Memory – Caching: Writes
CS/COE 1541 (term 2174) Jarrett Billingsley
2
Class Announcements
I'm improving my time management and work environment!
Project 1 writeup tomorrow, GUARANTEED.
  Due Sunday, March 5th...? Or during spring break? (Your choice!)
Class chat: contact me if you need a link.
Are you getting my emails? Sometimes I just don't know.
  I would like a more permanent/centralized way to notify and communicate with you all... not sure what.
2/13/2017 CS/COE 1541 term 2174
3
Some clarifications from HW and exam...
You can't fetch an instruction on every cycle when stalling. Fetches must stall too.
Single-issue pipelines cannot have multiple instructions in the same phase at once, and later instructions cannot finish before earlier instructions.
  Out-of-order execution is inherently superscalar and pipelined (overlapping!).
Sum of instruction latencies / number of instructions is not CPI, but it is a useful metric: average instruction latency.
sw reads its first register, confusingly: sw t0, 0(s0) reads t0.
la and li do not touch memory! They just put constants in registers.
4
Better terminology
Some of the confusion about CPI is on the book, some on me...
From now on I will try to use the following terms:
CPI (and IPC) will refer to throughput. This is an average across a program.
  The two quantities involved are "how many cycles it takes to complete the entire program" and "how many instructions it executes." CPI is the first divided by the second, and IPC is the reciprocal.
Instruction latency will refer to "how many cycles it takes to complete a given kind of instruction."
  I called this "intrinsic CPI" before.
Amortized latency will be "instruction latency × percentage of the program that consists of that instruction."
  I called this "average CPI" before.
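To make the three terms concrete, here is a small sketch. The cycle counts, instruction mix, and latencies are made-up illustration values, and the function names are hypothetical, not anything from the course materials.

```python
def cpi(total_cycles, instruction_count):
    """Throughput: cycles for the whole program divided by instructions executed."""
    return total_cycles / instruction_count


def amortized_latency(latency, fraction):
    """Instruction latency weighted by that instruction's share of the program."""
    return latency * fraction


# Assumed numbers: a 1000-instruction program that finishes in 1500 cycles.
program_cpi = cpi(1500, 1000)
program_ipc = 1 / program_cpi

# Assumed mix: kind -> (instruction latency in cycles, fraction of program).
mix = {"alu": (1, 0.5), "load": (4, 0.25), "branch": (2, 0.25)}
amortized = {kind: amortized_latency(lat, frac) for kind, (lat, frac) in mix.items()}

# Summing the amortized latencies gives the average instruction latency,
# which is a different quantity from CPI on a pipelined machine.
average_instruction_latency = sum(amortized.values())

print(program_cpi, program_ipc, amortized, average_instruction_latency)
```

With these assumed numbers the CPI is lower than the average instruction latency, which is what you would expect when a pipeline overlaps instructions.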
5
Handling Writes
6
Let's start simple: cache write HITS.
Very common pattern: x++
  lw   t0, &x
  addi t0, t0, 1
  sw   t0, &x
Assuming &x is …, how will the lw change the cache?
Now how will the sw change the cache...?
Uh oh, now the cache is inconsistent: the contents of the cache differ from memory.
How can we solve this?
Cache (columns V, Tag, Data; rows 000 through 111): one valid entry with tag 111, whose data changes from 24 to 25.
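The inconsistency can be reproduced with a toy model. This is a minimal sketch, assuming a direct-mapped cache of one-word blocks with 6-bit addresses split into a 3-bit tag and a 3-bit index; the address 0b111010 is an assumption taken from the memory table on the write-through slide, and the helper names are hypothetical.

```python
INDEX_BITS = 3


def split(addr):
    """Split a 6-bit address into (tag, index)."""
    return addr >> INDEX_BITS, addr & ((1 << INDEX_BITS) - 1)


memory = {0b111010: 24}
cache = {}  # index -> (tag, data); a missing index means the row is invalid


def lw(addr):
    """Load a word, filling the cache row on a miss."""
    tag, index = split(addr)
    if index not in cache or cache[index][0] != tag:
        cache[index] = (tag, memory[addr])
    return cache[index][1]


def sw_cache_only(addr, value):
    """Naive store: updates the cache row and never touches memory."""
    tag, index = split(addr)
    cache[index] = (tag, value)


X = 0b111010          # assumed address of x
t0 = lw(X)            # lw   t0, &x
t0 = t0 + 1           # addi t0, t0, 1
sw_cache_only(X, t0)  # sw   t0, &x

print(cache[0b010], memory[X])  # the cached copy and memory now disagree
```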
7
Technique 1: Write-through
When you write to the cache, write the same data to memory at the same time.
What if we wrote to address …? What happens to the data already in the cache?
  Eh, whatever. The cache is always consistent with memory, so just overwrite the data and change the tag.
Consistency is solved! But what's the problem with write-through?
  Memory is slow, and we have to stall on every write.
How could we fix this...
  We could CACHE THE CACHE AAA
  Or use a tiny buffer!
Cache (V, Tag, Data): row 010 holds V=1, tag 111, data 24 → 25, and is then overwritten with tag 000, data 94.
Memory (Address, Data): 000010 holds 17 → 94; 111010 holds 24 → 25.
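A minimal write-through sketch, under the same assumptions as the slide's tables (6-bit addresses, 3-bit tag, 3-bit index, one-word blocks). The function name is hypothetical, and the second store's address (0b000010) is inferred from the memory table rather than stated in the slide text.

```python
INDEX_BITS = 3

memory = {0b111010: 24, 0b000010: 17}
cache = {0b010: (0b111, 24)}  # index -> (tag, data), state after the lw of x


def sw_write_through(addr, value):
    """Store to the cache and to memory together, so they never disagree."""
    tag = addr >> INDEX_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    # Overwriting a different block is safe: its data is already in memory.
    cache[index] = (tag, value)
    memory[addr] = value


sw_write_through(0b111010, 25)  # x++ : write hit
sw_write_through(0b000010, 94)  # different tag, same row: overwrite and retag

print(cache, memory)
```

The model hides the cost: every call to `sw_write_through` stands for a full memory access, which is the stall the slide complains about.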
8
Write buffers
Instead of immediately writing the data to memory, buffer it.
This lets the CPU "fire and forget": it doesn't have to stall for memory; the buffer will write the data while the CPU keeps executing.
Eventually the data ends up in memory.
What if another write arrives while the buffer is full?
  Stall.
  The speed at which the buffer empties into memory (words/cycle) must exceed the speed at which writes happen, or it's not worth it.
  Multi-entry buffers are common.
  Wide buses to memory are useful!
Cache (V, Tag, Data): one valid entry with tag 111, data 24 → 25.
Buffer (Address, Data): 111010, 25.
Memory (Address, Data): 111010 holds 24 → 25.
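The drain-rate condition can be checked with a rough cycle-level sketch. The model is an assumption made for illustration: one instruction issues per cycle unless stalled, memory retires one buffered write every `drain_cycles` cycles, and the program is a string where 'w' is a store and '.' is any other instruction. `count_stalls` is a hypothetical name.

```python
from collections import deque


def count_stalls(program, capacity, drain_cycles):
    """Count cycles the CPU stalls because the write buffer is full."""
    buffer = deque()
    remaining = 0  # cycles until the oldest buffered write reaches memory
    stalls = 0
    pc = 0
    while pc < len(program):
        # Memory side: make progress on the oldest buffered write.
        if buffer:
            remaining -= 1
            if remaining == 0:
                buffer.popleft()
                remaining = drain_cycles if buffer else 0
        # CPU side: a store needs a free buffer entry, otherwise it stalls.
        if program[pc] == "w":
            if len(buffer) == capacity:
                stalls += 1
                continue  # retry the same store next cycle
            buffer.append(pc)
            if len(buffer) == 1:
                remaining = drain_cycles  # memory starts on this write
        pc += 1
    return stalls


# Back-to-back stores outrun a 2-entry buffer that drains one write per 4 cycles.
print(count_stalls("wwww", capacity=2, drain_cycles=4))
# Stores spaced 4 cycles apart never find the buffer full.
print(count_stalls("w...w...", capacity=2, drain_cycles=4))
```

In this model the buffer only hides memory latency when stores arrive no faster than memory can absorb them; a deeper buffer absorbs bursts but does not change the sustained rate.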
9
Uh oh
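There are complications when you add a write buffer:
1. Write to block A.
2. Read block B. It collides with A in the cache.
3. Read block A…? Uh-oh: memory still holds A_old, because A_new is still sitting in the buffer.
Diagram: Cache (showing A_old, A_new, and B over time), Memory (A_old), Buffer (A_new).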
10
Check that buffer
To ensure consistency, we have to check the buffer too.
Write buffers are essentially fully-associative FIFO caches.
  Since they're fully-associative, a large write buffer would require a LOT of comparators.
  This is one reason why they're usually 4-8 blocks in practice.
Adding a write buffer can help amortize the cost of writes, which reduces miss penalty, but…
  It adds another step to checking for a hit, increasing hit time.
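A minimal sketch of the buffer check on a read miss, reusing the A/B scenario. The block names, the list-based buffer, and `read_miss` are illustrative assumptions; real hardware compares all buffer entries in parallel, which is where the comparators come from.

```python
memory = {"A": "A_old", "B": "B"}
write_buffer = [("A", "A_new")]  # pending writes, oldest first


def read_miss(addr, check_buffer):
    """Service a cache miss, optionally searching the write buffer first."""
    if check_buffer:
        # Scan newest to oldest so the most recent pending write wins.
        # Each comparison here stands for one hardware comparator.
        for buffered_addr, value in reversed(write_buffer):
            if buffered_addr == addr:
                return value
    return memory[addr]


print(read_miss("A", check_buffer=False))  # stale data straight from memory
print(read_miss("A", check_buffer=True))   # the pending write is found
```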
11
Writing into the void
sw t0, 000000₂
What if we write to an address that is not cached (an empty or wrong block in the cache)?
Should we put an entry in the cache?
  This is called write-allocate.
  Temporal locality says we might need to read it again soon!
Or we could write to memory (or the write buffer...) and leave the cache unchanged.
  This is called write-no-allocate.
  A common term for write-through paired with no-allocate is write-around, since you're writing "around" the cache, skipping it.
Cache (V, Tag, Data): row 000 holds V=1, tag 000, data 94.
Memory (Address, Data): 000000 holds 24 → 94.
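The two miss policies differ in only one decision, which a short sketch makes visible. It assumes write-through to memory, one-word blocks (so allocating does not require fetching the rest of a block), and the 3-bit tag / 3-bit index split used in the tables. The `store` function and its `allocate` flag are hypothetical.

```python
INDEX_BITS = 3


def store(cache, memory, addr, value, allocate):
    """Write-through store; `allocate` chooses the write-miss policy."""
    tag = addr >> INDEX_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    hit = index in cache and cache[index][0] == tag
    if hit or allocate:
        cache[index] = (tag, value)  # write-allocate: install the block on a miss
    memory[addr] = value             # memory is updated under both policies


# Write-allocate: the store to 000000 leaves an entry in the cache.
cache_a, memory_a = {}, {0b000000: 24}
store(cache_a, memory_a, 0b000000, 94, allocate=True)

# Write-no-allocate (write-around): the cache is left untouched.
cache_n, memory_n = {}, {0b000000: 24}
store(cache_n, memory_n, 0b000000, 94, allocate=False)

print(cache_a, memory_a)
print(cache_n, memory_n)
```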
12
Something more intuitive...
When you get your notebook out... you write things in it. And when you're done... you put it away.
This is the intuition behind write-back: we only write changed cache data back to memory when we're "done with it."
  In other words, when it gets kicked out of the cache!
But this scheme is more complex...
We need to keep a dirty bit on each word, which says whether the word has changed since it was brought in.
  Write hits turn it on.
We need to check the dirty bit when we want to overwrite a cache word.
  And if it's dirty, we need to write the old word back to memory first.
  This incurs an extra stall!
  Unless we PUT ANOTHER BUFFER ON THE CACHE, SO WE BUFFER WRITES TO THE CACHE AS WELL AAAAAAAHHHHH
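A minimal write-back sketch with a dirty bit per one-word block. The class and method names are hypothetical, write-allocate on store misses is an assumption (the slide only says write hits set the dirty bit), and the addresses reuse the 3-bit tag / 3-bit index layout from the earlier tables.

```python
INDEX_BITS = 3


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}      # index -> [tag, data, dirty]
        self.writebacks = 0  # each one is an extra trip to memory

    def _split(self, addr):
        return addr >> INDEX_BITS, addr & ((1 << INDEX_BITS) - 1)

    def _evict_if_dirty(self, tag, index):
        """Before replacing a different block, write it back if it changed."""
        line = self.lines.get(index)
        if line is not None and line[0] != tag and line[2]:
            old_addr = (line[0] << INDEX_BITS) | index
            self.memory[old_addr] = line[1]
            self.writebacks += 1

    def load(self, addr):
        tag, index = self._split(addr)
        line = self.lines.get(index)
        if line is None or line[0] != tag:  # miss
            self._evict_if_dirty(tag, index)
            self.lines[index] = [tag, self.memory[addr], False]
        return self.lines[index][1]

    def store(self, addr, value):
        tag, index = self._split(addr)
        self._evict_if_dirty(tag, index)
        # Memory is not touched; the dirty bit records the pending update.
        self.lines[index] = [tag, value, True]


memory = {0b111010: 24, 0b000010: 17}
cache = WriteBackCache(memory)

cache.store(0b111010, cache.load(0b111010) + 1)  # x++ stays in the cache
stale = memory[0b111010]                         # memory has not seen the write yet

evicting_load = cache.load(0b000010)             # collides with x and kicks it out
print(stale, memory[0b111010], cache.writebacks)
```

The write to memory happens only at eviction, so repeated stores to the same word cost one memory write instead of one per store.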