Snoopy Coherence Protocols Small-scale multiprocessors.


1 Snoopy Coherence Protocols Small-scale multiprocessors

2 Assumptions
broadcast-style interconnect
  – e.g. shared bus, free-space optical, …
  – allows passive listeners
assume write-back caches
  – invalidation after a write rather than update
write-through (update protocol) is also possible

3 Invalidate vs. update
Write-invalidate protocol:
  – write to shared data: an invalidate is sent to all caches which snoop and invalidate copies.
  – read miss: snoop caches to find most recent copy
Write-update protocol:
  – write to shared data: broadcast on bus, processors snoop and update any copies.
  – read miss: memory is always up to date.
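The difference between the two write policies can be sketched in a few lines of Python. This is only an illustration: the dictionary-based caches and the function names `write_invalidate` and `write_update` are assumptions made for the example, not part of the slides.

```python
def write_invalidate(caches, writer, addr, value):
    """Write-invalidate: every other cached copy of addr is dropped.

    caches maps a processor name to a dict {addr: value} (hypothetical model).
    Memory is not touched here, matching the write-back assumption.
    """
    for cpu, cache in caches.items():
        if cpu != writer:
            cache.pop(addr, None)      # snooping cache invalidates its copy
    caches[writer][addr] = value       # writer now holds the only copy


def write_update(caches, memory, writer, addr, value):
    """Write-update: the new value is broadcast; existing copies are refreshed."""
    for cpu, cache in caches.items():
        if cpu == writer or addr in cache:
            cache[addr] = value        # snooping cache updates its copy
    memory[addr] = value               # memory stays up to date


if __name__ == "__main__":
    caches = {"P1": {"A1": 10}, "P2": {"A1": 10}}
    write_invalidate(caches, "P1", "A1", 20)
    print(caches)
```

With invalidation, a later read by P2 misses and must find the most recent copy by snooping; with update, the same read hits locally, at the cost of a broadcast on every write.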

4 Three-state MSI protocol
Each block of memory is in one state:
  – Clean in all caches and up-to-date in memory (shared)
  – Dirty in exactly one cache (modified)
  – Not in any cache
Each cache line is in one state:
  – Modified: cache has only copy, it is writable and dirty
  – Shared: line can be read
  – Invalid: line contains no valid data
Read misses cause the cache to snoop the bus
Write to a shared block is treated as a miss - needs a (snoopy) bus transaction
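As a rough sketch, the processor-side behaviour of the three cache-line states can be written as a lookup table. The names `State`, `PROCESSOR` and `processor_step` are made up for this example, and the choice to model a write miss as a bus read followed by a bus write is an assumption taken from the worked example in this deck.

```python
from enum import Enum


class State(Enum):
    I = "Invalid"
    S = "Shared"
    M = "Modified"


# (current state, processor operation) -> (next state, bus transactions issued)
PROCESSOR = {
    (State.I, "read"):  (State.S, ("Bus read",)),              # read miss
    (State.I, "write"): (State.M, ("Bus read", "Bus write")),  # write miss
    (State.S, "read"):  (State.S, ()),                         # read hit
    (State.S, "write"): (State.M, ("Bus write",)),             # treated as a miss
    (State.M, "read"):  (State.M, ()),                         # hit, no bus action
    (State.M, "write"): (State.M, ()),                         # hit, no bus action
}


def processor_step(state, op):
    """Return (next_state, bus_transactions) for a processor read or write."""
    return PROCESSOR[(state, op)]


if __name__ == "__main__":
    for (state, op), (nxt, bus) in PROCESSOR.items():
        print(f"{state.name} --{op}--> {nxt.name}  bus: {list(bus)}")
```

Only the Modified state lets a write proceed without touching the bus, which is why a write to a Shared line still costs a transaction.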

5 [State diagram: states I, S, M] a) Processor actions. Edge labels: Read (miss); Write (hit); Write (miss); Read (hit); Read or write (hit)

6 [State diagram: states I, S, M] b) Bus snooping. Edge labels: Bus write (twice, one annotated "send data to requestor"); Bus read - send data to requestor; Bus read; Bus read or write
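The snooping side can be sketched the same way. The table below follows the usual MSI convention and is an assumption wherever the flattened diagram is ambiguous; `SNOOP` and `snoop` are hypothetical names, and "Bus write" here means a write request from another processor.

```python
# (state of the snooping cache's line, observed bus transaction)
#   -> (next state, whether this cache sends the data to the requestor)
SNOOP = {
    ("M", "Bus read"):  ("S", True),   # supply the dirty data, keep a shared copy
    ("M", "Bus write"): ("I", True),   # supply the dirty data, give up the line
    ("S", "Bus read"):  ("S", False),  # another reader: nothing to do
    ("S", "Bus write"): ("I", False),  # another writer: invalidate our copy
    ("I", "Bus read"):  ("I", False),  # no valid copy: ignore
    ("I", "Bus write"): ("I", False),
}


def snoop(state, bus_op):
    """Return (next_state, sends_data) for a transaction seen on the bus."""
    return SNOOP[(state, bus_op)]


if __name__ == "__main__":
    print(snoop("M", "Bus read"))
```

Only a cache holding the line in M ever answers with data, because it is the only place where the up-to-date value lives.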

7 Example
assume cache line is initially invalid
consider two addresses, A1 and A2
assume A1 and A2 map to the same cache line, but A1 != A2
  – that is, A1 and A2 refer to completely different places in memory, not adjacent (or nearby) addresses that fit within the same block
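To make the "same cache line, different block" condition concrete, here is a small sketch for a direct-mapped cache. The block size, the number of lines and the two addresses are all hypothetical values chosen for illustration; the slides do not specify them.

```python
BLOCK_SIZE = 64    # bytes per block (assumed)
NUM_LINES = 256    # lines in a direct-mapped cache (assumed)


def line_index(addr):
    """Cache line that the address maps to."""
    return (addr // BLOCK_SIZE) % NUM_LINES


def tag(addr):
    """Tag that distinguishes blocks sharing the same line."""
    return addr // (BLOCK_SIZE * NUM_LINES)


def same_block(a, b):
    """True if both addresses fall inside the same memory block."""
    return a // BLOCK_SIZE == b // BLOCK_SIZE


A1 = 0x10040   # hypothetical address
A2 = 0x20040   # hypothetical address, one cache-size multiple further on

if __name__ == "__main__":
    print(line_index(A1), line_index(A2))   # same index
    print(tag(A1), tag(A2))                 # different tags
    print(same_block(A1, A2))               # different blocks
```

Because the index matches but the tag differs, bringing A2 into the cache has to evict whatever block A1 left in that line.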

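8
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
Step 1a: Write miss, invalid line - is A1 cached anywhere?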

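9
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
Step 1b: No other cache responds - assert ownership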

10 Wait a minute...
if we only have one type of read transaction (“Bus read”) how do we tell the difference between memory or another cache responding?
the bus cycle allows for an “intervention”
  – more properly, a cache-to-cache intervention
  – a cache pre-empts the bus and answers instead of memory

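11
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
Step 2: Read hit - no bus action needed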

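12
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
Step 3a: Read miss - does anyone have A1 cached?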

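13
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
| | S | | 10 | S | A1 | 10 | Bus write | P1 | A1 | 10 | A1 | 10 |
Step 3b: Cached elsewhere - P1 replies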

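14
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
| | S | | 10 | S | A1 | 10 | Bus write | P1 | A1 | 10 | A1 | 10 |
| P2: write 20 to A1 | I | | | M | A1 | 20 | Bus write | P2 | A1 | | | |
Step 4: Write hit, shared line - now P2 owns it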

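15
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
| | S | | 10 | S | A1 | 10 | Bus write | P1 | A1 | 10 | A1 | 10 |
| P2: write 20 to A1 | I | | | M | A1 | 20 | Bus write | P2 | A1 | | | |
| P2: write 40 to A2 | | | | | | | Bus write | P2 | A1 | 20 | A1 | 20 |
Step 5a: Write miss, A2 maps to the same line as A1 - first, write back the victim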

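16
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
| | S | | 10 | S | A1 | 10 | Bus write | P1 | A1 | 10 | A1 | 10 |
| P2: write 20 to A1 | I | | | M | A1 | 20 | Bus write | P2 | A1 | | | |
| P2: write 40 to A2 | | | | | | | Bus write | P2 | A1 | 20 | A1 | 20 |
| | | | | | | | Bus read | P2 | A2 | | | |
Step 5b: Service the miss - does anyone have A2 cached?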

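17
| Step | P1 State | P1 Addr | P1 Value | P2 State | P2 Addr | P2 Value | Bus Action | Bus Processor | Bus Addr | Bus Value | Memory Addr | Memory Value |
| P1: write 10 to A1 | I | | | | | | Bus read | P1 | A1 | | | |
| | M | | 10 | | | | Bus write | P1 | A1 | | | |
| P1: read A1 | M | A1 | 10 | | | | | | | | | |
| P2: read A1 | | | | I | | | Bus read | P2 | A1 | | | |
| | S | | 10 | S | A1 | 10 | Bus write | P1 | A1 | 10 | A1 | 10 |
| P2: write 20 to A1 | I | | | M | A1 | 20 | Bus write | P2 | A1 | | | |
| P2: write 40 to A2 | | | | | | | Bus write | P2 | A1 | 20 | A1 | 20 |
| | | | | | | | Bus read | P2 | A2 | | | |
| | | | | M | | 40 | Bus write | P2 | A2 | | | |
Step 5c: Not cached elsewhere - like the second half of step 1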
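The whole walkthrough can be replayed with a small simulator. This is a minimal sketch under stated assumptions: each processor has a single cache line (A1 and A2 collide), a write miss issues a bus read followed by a bus write as in the trace, and a cache-to-cache intervention is logged as a "Bus write" by the owner that also updates memory. The names `Line`, `SnoopyBus` and `run_example` are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    state: str = "I"              # "M", "S" or "I"
    addr: Optional[str] = None
    value: Optional[int] = None


class SnoopyBus:
    def __init__(self, names):
        self.caches = {n: Line() for n in names}
        self.memory = {}
        self.log = []             # (action, processor, addr, value)

    def _holds(self, name, addr):
        line = self.caches[name]
        return line.state != "I" and line.addr == addr

    def _write_back(self, name):
        line = self.caches[name]
        self.memory[line.addr] = line.value
        self.log.append(("Bus write", name, line.addr, line.value))

    def _bus_read(self, name, addr):
        """Evict a dirty victim, then fetch addr from its owner or memory."""
        if self.caches[name].state == "M":
            self._write_back(name)                 # step 5a: victim write-back
        self.log.append(("Bus read", name, addr, None))
        for other, line in self.caches.items():
            if other != name and self._holds(other, addr) and line.state == "M":
                self._write_back(other)            # intervention by the owner
                line.state = "S"
        return self.memory.get(addr)

    def read(self, name, addr):
        line = self.caches[name]
        if not self._holds(name, addr):            # read miss
            value = self._bus_read(name, addr)
            line.state, line.addr, line.value = "S", addr, value
        return line.value

    def write(self, name, addr, value):
        line = self.caches[name]
        if self._holds(name, addr) and line.state == "M":
            line.value = value                     # write hit, no bus action
            return
        if not self._holds(name, addr):            # write miss
            self._bus_read(name, addr)
        self.log.append(("Bus write", name, addr, None))   # assert ownership
        for other, o in self.caches.items():
            if other != name and o.addr == addr:
                o.state = "I"                      # snoopers invalidate
        line.state, line.addr, line.value = "M", addr, value


def run_example():
    bus = SnoopyBus(["P1", "P2"])
    bus.write("P1", "A1", 10)    # step 1
    bus.read("P1", "A1")         # step 2
    bus.read("P2", "A1")         # step 3
    bus.write("P2", "A1", 20)    # step 4
    bus.write("P2", "A2", 40)    # step 5
    return bus


if __name__ == "__main__":
    for entry in run_example().log:
        print(entry)
```

Printing the log gives the bus column of the table in order, which is a convenient way to check a hand-worked trace against the protocol rules.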

18 Four state protocol
add “exclusive” state
indicates this is the only cached copy
no need to broadcast an invalidation on a write hit to an E line
goal is to reduce bus traffic
works well for local variables
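The traffic saving can be illustrated by counting bus transactions for one processor touching one line. This is a simplified sketch: the function name `count_bus_transactions` and its parameters are hypothetical, no other processor acts during the sequence, and a write miss is charged two transactions (bus read plus bus write) as in the worked example.

```python
def count_bus_transactions(ops, protocol="MSI", other_copies=False):
    """Count bus transactions for a sequence of "read"/"write" operations.

    other_copies says whether another cache already holds the line when
    it is first read (only relevant for the four-state protocol).
    """
    state, count = "I", 0
    for op in ops:
        if op == "read":
            if state == "I":
                count += 1                     # bus read on a read miss
                exclusive = protocol == "MESI" and not other_copies
                state = "E" if exclusive else "S"
        elif op == "write":
            if state == "I":
                count += 2                     # bus read + bus write
            elif state == "S":
                count += 1                     # broadcast an invalidation
            # state E or M: silent upgrade, no bus traffic
            state = "M"
    return count


if __name__ == "__main__":
    ops = ["read", "write", "read", "write"]   # typical local-variable use
    print(count_bus_transactions(ops, "MSI"))
    print(count_bus_transactions(ops, "MESI"))
```

For a variable that no other processor touches, the read-then-write pattern costs one transaction less with the exclusive state, which is the case the slide describes as working well for local variables.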

19 [State diagram: states I, E, S, M] a) Processor actions. Edge labels: Write (hit); Write (miss); Read (hit); Read or write (hit); Write (hit); Read (miss) 2; Read (miss) 1; Read (hit). Notes: 1: data comes from memory; 2: data from another cache

20 [State diagram: states I, E, S, M] b) Bus snooping. Edge labels: Bus read; Bus write; Bus read or write; Bus read; Bus write; Bus read

21 Coherence misses
a new type of miss has been added
we still have the usual cold, capacity and conflict misses
now we also have coherence misses
these occur when a cached line has been invalidated by another processor's write; the resulting miss is typically serviced from another cache
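One way to picture the classification is a small helper that decides why a line is absent. This sketch assumes the common definition in which a coherence miss is a miss on a line that another processor's write invalidated; the function name `classify_miss` and its two bookkeeping sets are hypothetical.

```python
def classify_miss(addr, seen_before, invalidated_by_others):
    """Classify a miss on addr for one cache.

    seen_before: addresses this cache has loaded at some point.
    invalidated_by_others: addresses whose copy was removed by a snooped write.
    """
    if addr in invalidated_by_others:
        return "coherence"
    if addr not in seen_before:
        return "cold"
    return "capacity or conflict"   # evicted to make room for another block


if __name__ == "__main__":
    print(classify_miss("A1", {"A1"}, {"A1"}))
    print(classify_miss("A2", set(), set()))
    print(classify_miss("A3", {"A3"}, set()))
```

Separating capacity from conflict misses would need a model of the cache organisation, so the sketch leaves them merged.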

