Cache Coherence for Shared Memory Multiprocessors

Presentation on theme: "Cache Coherence for Shared Memory Multiprocessors"— Presentation transcript:

1 Cache Coherence for Shared Memory Multiprocessors

2 Cache Coherence Problem
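Example: processors see different values for u after event 3.
[Figure: processors P1, P2 and P3, each with a private cache ($), share a bus with memory and I/O devices. Memory initially holds u = 5. Events 1 and 2 are reads that load u = 5 into two of the caches; event 3 writes u = 7; events 4 and 5 are later reads whose result (u = ?) depends on whether a stale copy is used.]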
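The problem on this slide can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions: the caches are write-through with no coherence mechanism at all, and the assignment of events to P1, P2 and P3 is a guess, since the figure's layout is not recoverable from the transcript. The names `read`, `write`, `caches` and `memory` are hypothetical.

```python
# Toy model: private caches with NO coherence mechanism.
# Assumption: write-through caches; which processor performs which event is a guess.
memory = {"u": 5}
caches = {"P1": {}, "P2": {}, "P3": {}}

def read(proc, var):
    """Return the cached copy if present, otherwise fetch it from memory."""
    cache = caches[proc]
    if var not in cache:
        cache[var] = memory[var]
    return cache[var]

def write(proc, var, value):
    """Update the writer's cache and memory, but nobody else's cache."""
    caches[proc][var] = value
    memory[var] = value

first = read("P1", "u")    # event 1: one processor caches u = 5
second = read("P3", "u")   # event 2: another processor caches u = 5
write("P3", "u", 7)        # event 3: u = 7 is written
stale = read("P1", "u")    # event 4: hits the old cached copy
fresh = read("P2", "u")    # event 5: misses and fetches from memory
print(stale, fresh)        # the two readers disagree about u
```

The reader that already held a copy keeps seeing the old value, while the reader that misses sees the new one; this disagreement is what a coherence protocol has to prevent.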

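3 Bus Snooping
A coherence technique for bus-based shared-memory multiprocessors. A snoopy cache controller (SCC) is added to each cache to do the bus snooping. Bus transactions are visible to all SCCs.
[Figure: processors P1 ... Pn, each with a cache ($) and an SCC, attached to a shared bus along with memory and I/O devices.]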

4 Snooping for Write-Through Caches
When an SCC detects a relevant write transaction, it can either invalidate the block containing the relevant variable (write-invalidate approach) or update the value in its cache (write-update approach).
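The two choices can be sketched as one snoop handler with a policy switch. This is a minimal illustration, not a controller design: a cache is modeled as a plain dictionary, and `snoop_write` and its `policy` argument are hypothetical names.

```python
def snoop_write(cache, addr, value, policy):
    """React to a write to `addr` that was observed on the bus."""
    if addr not in cache:
        return                     # not a relevant write for this cache
    if policy == "invalidate":
        del cache[addr]            # write-invalidate: drop the local copy
    elif policy == "update":
        cache[addr] = value        # write-update: refresh the local copy
    else:
        raise ValueError(f"unknown policy: {policy}")

inv_cache = {"u": 5}
upd_cache = {"u": 5}
snoop_write(inv_cache, "u", 7, "invalidate")
snoop_write(upd_cache, "u", 7, "update")
print(inv_cache, upd_cache)
```

With invalidation the next local read misses and fetches the new value; with update the next local read hits on the refreshed copy.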

5 Write-Invalidate Protocol
Two states per block in each cache, as in a uniprocessor: valid (V) and invalid (I). Hardware state bits are associated with the blocks that are in the cache. The invalid state is also used in place of a “not present” state.
Notation A / B: if A is observed, transaction B is generated.
State V: PrRd / -- (stay in V); PrWr / BusWr (stay in V); BusWr / -- (go to I).
State I: PrRd / BusRd (go to V); PrWr / BusWr (stay in I).
This is just one particular design, in which the processor writes to main memory on a write miss. Other designs may read the block first to validate it.
[Figure: processors P1 ... Pn with caches ($) holding State, Tag and Data for each block, on a shared bus with memory and I/O devices.]

6 Example
Three processors; consider the states of the blocks containing X. Cache entries are shown as X / State.
Operation      P1 $      P2 $      P3 $      Main memory
Initially      ? / I     ? / I     ? / I     10
P2 Rd X        ? / I     10 / V    ? / I     10
P3 Rd X        ? / I     10 / V    10 / V    10
P2 Wr X = 15   ? / I     15 / V    10 / I    15
P1 Rd X        15 / V    15 / V    10 / I    15
P1 Wr X = 3    3 / V     15 / I    10 / I    3
P3 Wr X = 6    3 / I     15 / I    (still I) 6
After P3 Wr X = 6, P3's block remains invalid: updating the value of X isn’t enough to validate the whole block.
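The example can be replayed with a small simulation of the two-state write-through protocol. This is a sketch under assumptions: one word per block, a write miss goes to memory without allocating the block (the particular design described on the previous slide), and the class and method names (`Bus`, `Cache`, `pr_rd`, `pr_wr`, `snoop_bus_wr`) are hypothetical.

```python
class Bus:
    """Shared bus plus main memory; every cache snoops every BusWr."""
    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def bus_rd(self, addr):
        return self.memory[addr]

    def bus_wr(self, writer, addr, value):
        self.memory[addr] = value              # write-through to memory
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_bus_wr(addr)

class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.state = {}                        # addr -> "V" or "I"
        self.data = {}
        bus.caches.append(self)

    def pr_rd(self, addr):
        if self.state.get(addr, "I") == "I":   # PrRd / BusRd, go to V
            self.data[addr] = self.bus.bus_rd(addr)
            self.state[addr] = "V"
        return self.data[addr]                 # PrRd / -- when already V

    def pr_wr(self, addr, value):
        if self.state.get(addr, "I") == "V":
            self.data[addr] = value            # keep the valid copy current
        self.bus.bus_wr(self, addr, value)     # PrWr / BusWr in both states

    def snoop_bus_wr(self, addr):
        if self.state.get(addr, "I") == "V":
            self.state[addr] = "I"             # BusWr / --, go to I

bus = Bus({"X": 10})
p1, p2, p3 = (Cache(bus) for _ in range(3))
p2.pr_rd("X")
p3.pr_rd("X")
p2.pr_wr("X", 15)
seen_by_p1 = p1.pr_rd("X")
p1.pr_wr("X", 3)
p3.pr_wr("X", 6)
print(seen_by_p1, bus.memory["X"], p1.state["X"], p2.state["X"], p3.state["X"])
```

At the end every cached copy is invalid and memory holds the last value written, which matches the final row of the example.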

7 Snoopy Cache Controller
Bus snooping advantages: no need to change the processor design; no explicit coherence statements added to the program.
The snoopy cache controller observes events from the local processor and from the bus.
Write operations: write-invalidate vs. write-update.
Write-through caches: see last lecture.
Write-back caches: now writes take place locally, so SCCs don’t observe them. How can we handle this? Extra work has to be done.

8 Write-Back Caches
Usually have a “dirty bit”: one state bit per block. True: block has been modified. False: block unchanged.
Use for uniprocessors: a modified block has to be written back to memory upon replacement.
Use for multiprocessors: same as uniprocessor, plus it means the processor “owns” the block.

9 The Extra Work …
... before a processor writes into its cache, it performs an “ownership” transaction.
Case 1: no other modified copy of the block exists in the system. The processor can go ahead and write.
Case 2: a modified copy exists somewhere in the system. The old owner writes the block to memory and invalidates its local copy. The new owner reads the block as it is being written back to memory, then performs its write.
What the new owner did is called a “read to own” (read to modify) transaction. There is only one owner at a time.
Still don’t get it? Wait until you see the MSI protocol!

10 Ownership Overhead
Ownership transactions are overhead. If one happened every time a write is needed, a block would be written back to memory every time, and write-back caches would be as good/bad as write-through.
Let’s cross our fingers and count on the concept of locality. Spatial and temporal locality can do it for us: a processor owns the block and performs several writes consecutively.
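The effect of locality on the overhead can be made concrete by counting bus transactions for a sequence of writes to one block. This is a rough cost model, not a measurement: it assumes one bus transaction per write-through write and one ownership transaction each time the writer changes under write-back. The function name `bus_transactions` is hypothetical.

```python
def bus_transactions(writers, policy):
    """Count bus transactions for successive writes to a single block.

    `writers` lists, in order, which processor performs each write.
    """
    count = 0
    owner = None
    for proc in writers:
        if policy == "write-through":
            count += 1                 # every write goes on the bus
        elif policy == "write-back":
            if proc != owner:
                count += 1             # ownership transaction
                owner = proc           # later writes by the owner are local
        else:
            raise ValueError(f"unknown policy: {policy}")
    return count

with_locality = ["P1"] * 4 + ["P2"] * 4    # each owner writes several times
ping_pong = ["P1", "P2"] * 4               # ownership changes on every write

print(bus_transactions(with_locality, "write-through"),
      bus_transactions(with_locality, "write-back"))
print(bus_transactions(ping_pong, "write-through"),
      bus_transactions(ping_pong, "write-back"))
```

When writes by the same processor cluster together, write-back needs far fewer transactions; when ownership changes on every write, the two policies cost the same in this model.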

11 MSI Protocol: States
We need to differentiate between reads and writes, so the Valid state is split into two states.
I: Invalid.
S: Shared (one or more caches hold the block, read only).
M: Modified or Dirty (only one cache holds the block and can write it).
This means it’s another write-invalidate protocol.

12 MSI Protocol: Events/Actions
Local processor events: PrRd (read); PrWr (write).
Bus transactions: BusRd (read with no intent to modify); BusRdX (read with intent to modify, i.e. read to own); BusWB (update memory).
Possible actions: _ (nothing); BusRd (send a read request over the bus); BusRdX (ownership, i.e. read-to-own, transaction); Flush (copy the modified block to memory).

13 MSI Protocol: State Transitions
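Transitions are written as event / action.
State M: PrRd, PrWr / _ (stay in M); BusRd / Flush (demote to S); BusRdX / Flush (go to I).
State S: PrRd, BusRd / _ (stay in S); PrWr / BusRdX (promote to M); BusRdX / _ (go to I).
State I: PrRd / BusRd (go to S); PrWr / BusRdX (go to M).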
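The MSI state diagram can be written down as a lookup table keyed by (state, event). This is a sketch of one possible encoding; the names `MSI` and `step` are hypothetical, and the assumption that events not shown in the diagram leave the state unchanged with no action is ours.

```python
# (current state, observed event) -> (next state, action)
MSI = {
    ("M", "PrRd"):   ("M", "_"),
    ("M", "PrWr"):   ("M", "_"),
    ("M", "BusRd"):  ("S", "Flush"),    # demote
    ("M", "BusRdX"): ("I", "Flush"),
    ("S", "PrRd"):   ("S", "_"),
    ("S", "BusRd"):  ("S", "_"),
    ("S", "PrWr"):   ("M", "BusRdX"),   # promote
    ("S", "BusRdX"): ("I", "_"),
    ("I", "PrRd"):   ("S", "BusRd"),
    ("I", "PrWr"):   ("M", "BusRdX"),
}

def step(state, event):
    """Return (next_state, action); unlisted pairs are treated as no-ops."""
    return MSI.get((state, event), (state, "_"))

# A block that is read, written, then read by another processor:
state = "I"
trace = []
for event in ["PrRd", "PrWr", "BusRd"]:
    state, action = step(state, event)
    trace.append((event, state, action))
print(trace)
```

Keeping the protocol as data makes it easy to check each arrow of the diagram against one table entry.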

14 MSI Protocol: Example
Three processors; consider the states of the blocks containing X. Cache entries are shown as X / State.
Operation      P1 $      P2 $      P3 $      Main memory
Initially      ? / I     ? / I     ? / I     10
P2 Rd X        ? / I     10 / S    ? / I     10
P3 Rd X        ? / I     10 / S    10 / S    10
P2 Wr X = 15   ? / I     15 / M    10 / I    10
P1 Rd X        15 / S    15 / S    10 / I    15
P1 Wr X = 3    3 / M     15 / I    10 / I    15
P1 Wr X = 6    6 / M     15 / I    10 / I    15
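The MSI example can be replayed with a small simulation of a single block shared by three caches. This is a sketch under assumptions: the block holds one value, Flush updates memory, and the class name `MSISystem` and its methods are hypothetical. Processors are indexed from 0, so P1 is index 0.

```python
class MSISystem:
    """One memory block shared by n caches under the MSI protocol."""

    def __init__(self, n, value):
        self.memory = value
        self.state = ["I"] * n
        self.data = [None] * n
        self.log = []                        # bus transactions, in order

    def read(self, p):
        if self.state[p] == "I":             # PrRd / BusRd
            self.log.append("BusRd")
            for i, s in enumerate(self.state):
                if i != p and s == "M":
                    self.memory = self.data[i]   # BusRd / Flush
                    self.state[i] = "S"          # demote the old owner
            self.data[p] = self.memory
            self.state[p] = "S"
        return self.data[p]

    def write(self, p, value):
        if self.state[p] != "M":             # PrWr / BusRdX from I or S
            self.log.append("BusRdX")
            for i, s in enumerate(self.state):
                if i == p or s == "I":
                    continue
                if s == "M":
                    self.memory = self.data[i]   # BusRdX / Flush
                self.state[i] = "I"              # invalidate other copies
            self.state[p] = "M"
        self.data[p] = value                 # writes in M stay local

msi = MSISystem(3, 10)
msi.read(1)          # P2 Rd X
msi.read(2)          # P3 Rd X
msi.write(1, 15)     # P2 Wr X = 15
msi.read(0)          # P1 Rd X
msi.write(0, 3)      # P1 Wr X = 3
msi.write(0, 6)      # P1 Wr X = 6
print(msi.state, msi.data, msi.memory, msi.log)
```

The last write generates no bus transaction because P1 already owns the block, which is the locality benefit discussed earlier.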

15 MESI Protocol: What’s wrong with MSI?
MESI is another write-invalidate protocol. Consider this MSI scenario: the block containing X isn’t in any cache.
P1 reads X: BusRd, state: S.
P1 modifies X: BusRdX, state: M. The BusRdX is there to let everybody else know X is being modified.
This scenario takes 2 bus transactions. There is no need for 2 transactions, since P1 is the only processor that holds X!

16 MESI Protocol: States
Same as MSI except S is split in 2.
E: Exclusive clean (only one processor).
S: Shared clean (more than one processor).
Let’s consider the same scenario: the block containing X isn’t in any cache.
P1 reads X: BusRd, state: E.
P1 modifies X: nothing, state: M.
In other words, P1 doesn’t need to let anybody know about the modification.

17 MESI Protocol: Hardware Support
An additional bus signal is needed: the S signal (S for shared). It lets the processor know whether to load a block in the E or the S state. A cache controller asserts the S signal if the relevant block is in its cache. The S bus signal is a wired-OR line.
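The wired-OR behavior of the S line amounts to a logical OR over what the other caches report. The sketch below models only that decision; `shared_line` and `load_state` are hypothetical names, and each cache's state for the block is given as a one-letter string.

```python
def shared_line(states, requester):
    """Wired-OR: the S line is asserted if any other cache holds the block."""
    return any(s != "I" for i, s in enumerate(states) if i != requester)

def load_state(states, requester):
    """State in which the requester loads the block after a read miss."""
    return "S" if shared_line(states, requester) else "E"

print(load_state(["I", "I", "I"], 0))   # nobody else has the block
print(load_state(["I", "S", "I"], 0))   # another cache asserts S
```

Because the line is a wired OR, the requester does not need to know which cache asserted it, only that at least one did.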

18 MESI Protocol: State Transitions
The diagram only shows labels for what’s different from MSI.
State E: PrRd / _ (stay in E); PrWr / _ (promote to M); BusRd / Flush (demote to S); BusRdX / Flush (go to I).
State S: BusRdX / Flush’ (go to I).
State I: a read miss goes to S if the S signal is asserted, and to E if it is not (Not(S)).
Flushing a “clean” block is a fast way for the new reader to read the block. While flushing a shared block, Flush’ means only 1 processor is responsible. Other protocol variations may not flush a clean block.
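The saving over MSI shows up when the read-then-write scenario is replayed in a small MESI simulation. This is a sketch under assumptions: a single one-value block, Flush modeled as a memory update, clean-block flushes omitted (one of the protocol variations mentioned above), and hypothetical names (`MESISystem`, `read`, `write`). Processors are indexed from 0.

```python
class MESISystem:
    """One memory block shared by n caches under a simplified MESI protocol."""

    def __init__(self, n, value):
        self.memory = value
        self.state = ["I"] * n
        self.data = [None] * n
        self.log = []                          # bus transactions, in order

    def read(self, p):
        if self.state[p] == "I":               # read miss: PrRd / BusRd
            self.log.append("BusRd")
            shared = False
            for i, s in enumerate(self.state):
                if i == p or s == "I":
                    continue
                shared = True                  # this cache asserts the S signal
                if s == "M":
                    self.memory = self.data[i]     # BusRd / Flush
                self.state[i] = "S"            # M and E demote to S
            self.data[p] = self.memory
            self.state[p] = "S" if shared else "E"
        return self.data[p]

    def write(self, p, value):
        if self.state[p] in ("I", "S"):        # PrWr / BusRdX
            self.log.append("BusRdX")
            for i, s in enumerate(self.state):
                if i == p or s == "I":
                    continue
                if s == "M":
                    self.memory = self.data[i]     # BusRdX / Flush
                self.state[i] = "I"
        # From E the write is silent (PrWr / _): no bus transaction.
        self.state[p] = "M"
        self.data[p] = value

mesi = MESISystem(3, 10)
mesi.read(0)                       # P1 reads X: BusRd, loads in E
state_after_read = mesi.state[0]
mesi.write(0, 15)                  # P1 modifies X: no bus transaction
log_after_write = list(mesi.log)
value_seen_by_p2 = mesi.read(1)    # P2 reads X: P1 flushes, both end in S
print(state_after_read, log_after_write, value_seen_by_p2, mesi.state)
```

The read followed by a write costs a single BusRd here, where the MSI version needs a BusRd and a BusRdX.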

19 Dragon Protocol
A write-back update protocol. States:
Exclusive (E): 1 cache has a clean copy.
Shared-clean (Sc): 2 or more caches have a copy; memory is up to date unless another cache holds the block in Sm.
Shared-modified (Sm): 1 cache just modified the block and some other caches hold copies; memory is outdated.
Modified (M): 1 cache has a modified copy.
Added processor events: PrRdMiss, PrWrMiss (remember, we don’t have an I state).
Added bus transaction: BusUpd, which broadcasts the word or byte written by the processor so other processors can update their copies.

20 Dragon Protocol: State Transitions
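Transitions are written as event / action. BusRd(S) and BusUpd(S) mean the S signal was asserted during the transaction; (not S) means it was not.
State E: PrRd / -- (stay in E); PrWr / -- (go to M); BusRd / -- (go to Sc).
State Sc: PrRd / -- (stay in Sc); BusUpd / Update (stay in Sc); PrWr / BusUpd(S) (go to Sm); PrWr / BusUpd(not S) (go to M).
State Sm: PrRd / -- (stay in Sm); PrWr / BusUpd(S) (stay in Sm); BusRd / Flush (stay in Sm); BusUpd / Update (go to Sc); PrWr / BusUpd(not S) (go to M).
State M: PrRd / -- (stay in M); PrWr / -- (stay in M); BusRd / Flush (go to Sm).
Misses: PrRdMiss / BusRd(not S) (go to E); PrRdMiss / BusRd(S) (go to Sc); PrWrMiss / (BusRd(S); BusUpd) (go to Sm); PrWrMiss / BusRd(not S) (go to M).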
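The update behavior can be seen by running a few operations through a simplified Dragon simulation. This is a sketch under assumptions: one single-value block, no replacements, Flush modeled as the owner supplying the value without updating memory, and hypothetical names (`DragonSystem`, `read`, `write`). A state of `None` stands for "not present", since the protocol has no I state. Processors are indexed from 0.

```python
class DragonSystem:
    """One memory block shared by n caches under a simplified Dragon protocol."""

    def __init__(self, n, value):
        self.memory = value
        self.state = [None] * n            # None = block not present
        self.data = [None] * n
        self.log = []

    def _others(self, p):
        return [i for i, s in enumerate(self.state) if i != p and s is not None]

    def _bus_rd(self, p):
        """Return (value, shared). Owners flush; clean holders drop to Sc."""
        self.log.append("BusRd")
        sharers = self._others(p)
        value = self.memory
        for i in sharers:
            if self.state[i] in ("M", "Sm"):
                value = self.data[i]       # BusRd / Flush: owner supplies block
                self.state[i] = "Sm"
            else:
                self.state[i] = "Sc"       # E goes to Sc, Sc stays
        return value, bool(sharers)

    def _bus_upd(self, p, value):
        """Broadcast the written value; return whether anyone else has a copy."""
        self.log.append("BusUpd")
        sharers = self._others(p)
        for i in sharers:
            self.data[i] = value           # BusUpd / Update
            self.state[i] = "Sc"           # a previous Sm gives up ownership
        return bool(sharers)

    def read(self, p):
        if self.state[p] is None:          # PrRdMiss
            self.data[p], shared = self._bus_rd(p)
            self.state[p] = "Sc" if shared else "E"
        return self.data[p]

    def write(self, p, value):
        s = self.state[p]
        if s is None:                      # PrWrMiss
            _, shared = self._bus_rd(p)
            if shared:
                self._bus_upd(p, value)
            self.state[p] = "Sm" if shared else "M"
        elif s in ("Sc", "Sm"):            # PrWr / BusUpd
            shared = self._bus_upd(p, value)
            self.state[p] = "Sm" if shared else "M"
        else:                              # E or M: PrWr / --
            self.state[p] = "M"
        self.data[p] = value

dragon = DragonSystem(3, 10)
dragon.read(1)         # P2 reads: loads in E
dragon.read(2)         # P3 reads: both in Sc
dragon.write(1, 15)    # P2 writes: BusUpd, P2 in Sm, P3 updated
dragon.read(0)         # P1 reads: P2 supplies 15
dragon.write(0, 3)     # P1 writes: BusUpd, P1 becomes the Sm owner
print(dragon.state, dragon.data, dragon.memory, dragon.log)
```

No copy is ever invalidated: every cache ends up holding the latest value, while memory is left stale until the owner eventually writes the block back.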

21 Snoopy Protocol Taxonomy
Cache protocol     Write-through    Write-back
Write-invalidate   IV               MSI, MESI
Write-update       Homework         Dragon

