Cache Coherence Protocols:
What is Cache Coherence? When one core writes to its own cache, the other cores see that write when they read the same location from their own caches. Coherence gives the programmer an underlying guarantee that the data read is valid. Even one large L1 cache per core cannot update itself fast enough to keep up with processor requests, which lowers throughput.
Cache Coherence: Do we need it?
Coherence Property - I: A read R from address X on core C0 returns the value written by the most recent write W to X on C0, if no other core has written to X between W and R.
Coherence Property - II: If C0 writes to X and C1 reads X after a sufficient time, and there are no other writes in between, then C1’s read returns the value from C0’s write.
Coherence Property - III: Writes to the same location are serialized: any two or more writes to X must be seen to occur in the same order on all cores.
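Property III can be checked mechanically on a trace. The sketch below is only an illustration, not part of any protocol: the function name writes_serialized and the trace format (each core mapped to the sequence of distinct write labels it observed for X) are assumptions made for this example. It reports whether all observed orders are consistent with one global order of the writes.

```python
from graphlib import TopologicalSorter, CycleError


def writes_serialized(observed):
    """Return True if every core's observed write order for X fits one global order.

    observed: dict mapping a core name to the list of write labels it saw, in order.
    (Hypothetical trace format, chosen for this sketch.)
    """
    # For each write, collect the writes that some core saw immediately before it.
    predecessors = {}
    for sequence in observed.values():
        for earlier, later in zip(sequence, sequence[1:]):
            if earlier == later:
                continue  # re-reading the same value imposes no ordering
            predecessors.setdefault(earlier, set())
            predecessors.setdefault(later, set()).add(earlier)
    try:
        # A topological order exists only if the cores never disagree.
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


# Both cores see w1 before w2: serialized.
ok = writes_serialized({"C0": ["w1", "w2"], "C1": ["w1", "w2"]})
# C1 sees the two writes in the opposite order: Property III is violated.
bad = writes_serialized({"C0": ["w1", "w2"], "C1": ["w2", "w1"]})
```

A core may skip a write (it never read X between two writes), which is why the check looks for contradictions in ordering rather than requiring identical sequences.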
How to get Cache Coherence? Option 1: No caches (bad performance). Option 2: All cores share the same L1 cache (bad performance). Option 3: Force a read in one cache to see a write made in another, by broadcasting writes to update the other caches (Write Update Coherence).
Write Update Snooping Coherence (The Initial Issue):
Write Update Snooping (Issue Resolved) - II: Snooping: Cache 0 monitors the bus and sees the write of 1 to A. Update: When a write is seen on the bus, the value is updated in every core's cache that holds memory block A.
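The snoop-and-update behaviour can be sketched as a tiny simulation. This is a minimal model under stated assumptions, not a hardware description: the Bus and Cache classes, their method names, single-word blocks, and a default memory value of 0 are all made up for illustration. Caches are write-through, so every write goes onto the bus and into memory.

```python
class Bus:
    """Shared bus with main memory attached (hypothetical model)."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.memory_writes = 0

    def broadcast_write(self, sender, addr, value):
        # Write-through: memory is updated on every write.
        self.memory[addr] = value
        self.memory_writes += 1
        # Every other cache snoops the write.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_write(addr, value)


class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr, value):
        # Update only if this cache actually holds the block.
        if addr in self.lines:
            self.lines[addr] = value


bus = Bus()
cache0, cache1 = Cache(bus), Cache(bus)
before = cache0.read("A")   # 0, Cache 0 now holds block A
cache1.write("A", 1)        # broadcast; Cache 0 snoops and updates its copy
after = cache0.read("A")    # 1, a hit in Cache 0's own cache
```

Without snoop_write, the second read would hit the stale copy and return 0, which is the initial issue this protocol resolves.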
Multiple Writes Stay Synchronized:
Write Update Enhanced Version (Avoid Memory Writes): In the previous write update protocol, every write must be broadcast on the bus and also written to memory (write-through caches). Add a dirty bit to each cache block. This lets us delay the memory write until the block is replaced from the cache.
Core 0 Block Refreshed and Read with A (value?): A set dirty bit means memory still needs to be updated (written back to RAM); only the cache holding the dirty block has the updated value.
Multiple Writes and Dirty Block Replacement: Memory is not updated until the dirty block is replaced.
Writing from a different Cache:
Dirty Bit Benefits: Write to memory only when a dirty block is replaced. Read from memory only if no cache holds the block in a dirty state; otherwise all reads are served by the cache holding the dirty block. This significantly reduces read and write transactions to memory.
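The dirty-bit behaviour described above can be sketched as follows. The class and method names (Bus, Cache, read_miss, evict) and the counters are assumptions for illustration. The sketch also assumes that the most recent writer holds the dirty bit and that any other cache snooping the write clears its own dirty bit, which the notes imply but do not state outright.

```python
class Bus:
    """Bus plus memory, counting memory transactions (hypothetical model)."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.memory_reads = 0
        self.memory_writes = 0

    def broadcast_write(self, sender, addr, value):
        # Other caches are updated, but memory is NOT written here.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_write(addr, value)

    def read_miss(self, requester, addr):
        # A cache holding the block dirty supplies the data instead of memory.
        for cache in self.caches:
            if cache is not requester and cache.dirty.get(addr):
                return cache.lines[addr]
        self.memory_reads += 1
        return self.memory.get(addr, 0)

    def write_back(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1


class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.dirty = {}
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.bus.read_miss(self, addr)
            self.dirty[addr] = False
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty[addr] = True  # this cache now owes memory a write-back
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr, value):
        if addr in self.lines:
            self.lines[addr] = value
            self.dirty[addr] = False  # assumption: the latest writer owns the block

    def evict(self, addr):
        # Memory is written only when a dirty block is replaced.
        if self.dirty.pop(addr, False):
            self.bus.write_back(addr, self.lines[addr])
        self.lines.pop(addr, None)


bus = Bus()
cache0, cache1 = Cache(bus), Cache(bus)
cache0.write("A", 1)
cache0.write("A", 2)          # two writes, still no memory write
value = cache1.read("A")      # served by Cache 0's dirty block, not memory
cache1.write("A", 3)          # Cache 1 becomes the dirty owner
cache0.evict("A")             # clean copy: nothing written back
cache1.evict("A")             # dirty copy: one write-back to memory
```

In this run three writes and one read miss cost a single memory write and no memory reads, compared with three memory writes under the write-through version.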
Write Update Bus Enhancement:
Write to same Memory Location (S = 1)
Broadcast only when shared among cores:
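One way to picture the shared-bit enhancement is the sketch below. It is a simplified model under assumptions not spelled out in the notes: each block carries a shared bit S that is set when another cache is seen to hold the same block during a fill, and a write is broadcast only when S = 1. The names Bus, Cache, fill, and write_broadcasts are hypothetical, and write-back to memory is left out to keep the focus on bus traffic.

```python
class Bus:
    """Bus that counts write broadcasts (hypothetical model)."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.write_broadcasts = 0

    def fill(self, requester, addr):
        """Return (value, shared) for a miss; holders learn the block is shared."""
        value, shared = self.memory.get(addr, 0), False
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                cache.shared[addr] = True          # existing holder sets S = 1
                value, shared = cache.lines[addr], True
        return value, shared

    def broadcast_write(self, sender, addr, value):
        self.write_broadcasts += 1
        for cache in self.caches:
            if cache is not sender and addr in cache.lines:
                cache.lines[addr] = value


class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.shared = {}
        self.bus = bus
        bus.caches.append(self)

    def _ensure_present(self, addr):
        if addr not in self.lines:
            self.lines[addr], self.shared[addr] = self.bus.fill(self, addr)

    def read(self, addr):
        self._ensure_present(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self._ensure_present(addr)
        self.lines[addr] = value
        if self.shared[addr]:   # S = 1: other caches hold the block
            self.bus.broadcast_write(self, addr, value)
        # S = 0: the write stays local, no bus traffic


bus = Bus()
cache0, cache1 = Cache(bus), Cache(bus)
cache0.write("A", 1)
cache0.write("A", 2)                 # S = 0, so no broadcasts yet
private_broadcasts = bus.write_broadcasts
seen = cache1.read("A")              # both copies of A now have S = 1
cache0.write("A", 3)                 # S = 1, so this write is broadcast
shared_broadcasts = bus.write_broadcasts
```

Writes to blocks that only one core uses never reach the bus in this model, so broadcast traffic is limited to blocks that are actually shared.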