Extra Cache Coherence Examples

The following examples include a couple of questions. You can answer these for practice by emailing Colin.


MSI Protocol

There are three processors. Each reads and writes the same value in memory. In the access stream, r1 means a read by processor 1 and w3 means a write by processor 3. For simplicity, the memory location will be referred to as “value.” The memory access stream is: r1, r2, w3, r2, w1, w2, r3, r2, r1
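The access-stream notation can be turned into data with a few lines of Python. This is only a sketch for following along; the function name `parse_stream` is made up for illustration and is not part of the slides.

```python
def parse_stream(text):
    """Turn a stream such as 'r1, w3' into [('r', 1), ('w', 3)]."""
    accesses = []
    for token in text.split(","):
        token = token.strip()
        op, proc = token[0], int(token[1:])
        if op not in ("r", "w"):
            raise ValueError(f"unknown access type: {token!r}")
        accesses.append((op, proc))
    return accesses


STREAM = "r1, r2, w3, r2, w1, w2, r3, r2, r1"
accesses = parse_stream(STREAM)
print(accesses)
```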

P1 wants to read the value. Its cache does not have it and generates a BusRd for the data. The main memory controller provides the data, which goes into P1’s cache in the shared state.
[Diagram for r1: PrRd at P1, BusRd on the bus, data from main memory. States after: P1 S.]

P2 wants to read the value. Its cache does not have the data, so it places a BusRd on the bus to notify the other processors and request the data. The memory controller provides the data.
[Diagram for r2: PrRd at P2, BusRd on the bus, data from main memory. States after: P1 S, P2 S.]

P3 wants to write the value. It places a BusRdX to get exclusive access and the most recent copy of the data. The caches of P1 and P2 see the BusRdX and invalidate their copies. Because the value is still up-to-date in memory, memory provides the data.
[Diagram for w3: PrWr at P3, BusRdX on the bus, data from main memory. States after: P1 I, P2 I, P3 M.]

P2 wants to read the value. P3’s cache holds the most up-to-date copy, so it will supply the data. P2’s cache puts a BusRd on the bus. P3’s cache snoops it, cancels the memory access, and flushes the data to the bus.
[Diagram for r2: PrRd at P2, BusRd on the bus, Flush from P3. States after: P1 I, P2 S, P3 S.]

P1 wants to write the value. Its cache places a BusRdX on the bus to gain exclusive access and the most up-to-date value. Main memory is not stale, so it provides the data. The snoopers for P2 and P3 see the BusRdX and invalidate their cached copies.
[Diagram for w1: PrWr at P1, BusRdX on the bus, data from main memory. States after: P1 M, P2 I, P3 I.]

P2 wants to write the value. Its cache places a BusRdX to get exclusive access and the most recent copy of the data. P1’s snooper sees the BusRdX and flushes the data to the bus. It also invalidates the copy in its cache and cancels the memory access.
[Diagram for w2: PrWr at P2, BusRdX on the bus, Flush from P1. States after: P1 I, P2 M, P3 I.]

P3 wants to read the value. Its cache does not have a valid copy, so it places a BusRd on the bus. P2 has a modified copy, so it flushes the data onto the bus and changes the state of its cached copy to shared. The flush cancels the memory access and also updates the data in memory.
[Diagram for r3: PrRd at P3, BusRd on the bus, Flush from P2. States after: P1 I, P2 S, P3 S.]

P2 wants to read the value. Its cache has an up-to-date copy. No bus transaction takes place because there is no cache miss.
[Diagram for r2: PrRd at P2, no bus transaction. States after: P1 I, P2 S, P3 S.]

P1 wants to read the value. Its cache does not have a valid copy, so it places a BusRd on the bus for the data. The memory controller provides the data because it has an up-to-date copy. The data goes into P1’s cache in the shared state.
[Diagram for r1: PrRd at P1, BusRd on the bus, data from main memory. States after: P1 S, P2 S, P3 S.]
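The MSI walkthrough above can be replayed with a small simulation. This is a sketch under simplifying assumptions, not a full protocol model: it tracks one memory location, treats each bus transaction as atomic, and uses BusRdX for any write that does not hit a modified line, as the slides do. The function name `msi_step` is made up for illustration.

```python
def msi_step(states, op, proc):
    """Apply one access to a single memory location.

    states maps processor id -> 'M', 'S', or 'I' and is updated in place.
    Returns (bus transaction or None, data supplier or None).
    """
    mine = states[proc]
    others = [p for p in states if p != proc]
    if op == "r":
        if mine in ("M", "S"):
            return None, None        # read hit, no bus traffic
        bus = "BusRd"
    else:
        if mine == "M":
            return None, None        # write hit on a modified line
        bus = "BusRdX"               # S or I: exclusive access is needed

    # A cache holding the line in M flushes it; otherwise memory responds.
    owner = next((p for p in others if states[p] == "M"), None)
    supplier = f"P{owner} (Flush)" if owner is not None else "memory"

    for p in others:
        if bus == "BusRdX":
            states[p] = "I"          # every other copy is invalidated
        elif states[p] == "M":
            states[p] = "S"          # BusRd demotes the owner to shared
    states[proc] = "S" if bus == "BusRd" else "M"
    return bus, supplier


stream = [("r", 1), ("r", 2), ("w", 3), ("r", 2), ("w", 1),
          ("w", 2), ("r", 3), ("r", 2), ("r", 1)]
states = {1: "I", 2: "I", 3: "I"}
trace = []
for op, proc in stream:
    bus, supplier = msi_step(states, op, proc)
    trace.append((f"{op}{proc}", bus, supplier, dict(states)))
    print(f"{op}{proc}: bus={bus}, data from={supplier}, states={states}")
```

Each printed line corresponds to one slide of the walkthrough, so the output can be checked against the states shown in the diagrams.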

MESI Protocol

There are three processors. Each loads or stores different words from memory, given as w0, w1, and w2. These all map to the same location in the cache.

The memory accesses are as follows:
P1: ld w0, P3: ld w2
P1: st w0, P2: st w2
P2: st w2, P3: ld w0
P3: st w0
P1: ld w2
P2: ld w1
P3: ld w1
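To see how three different words can map to the same cache location, here is a sketch that assumes a direct-mapped cache. The slides do not give the cache organization or the addresses, so the block size, line count, and addresses below are all hypothetical.

```python
BLOCK_SIZE = 16   # bytes per cache line (hypothetical)
NUM_LINES = 64    # number of lines in the cache (hypothetical)


def cache_index(addr):
    """Line index in a direct-mapped cache: block number modulo line count."""
    return (addr // BLOCK_SIZE) % NUM_LINES


# Hypothetical addresses spaced one cache size (16 * 64 = 0x400 bytes) apart.
addresses = {"w0": 0x0000, "w1": 0x0400, "w2": 0x0800}
indices = {name: cache_index(addr) for name, addr in addresses.items()}
print(indices)  # {'w0': 0, 'w1': 0, 'w2': 0}
```

Because all three words land on the same index, loading one of them into a cache replaces whichever of the others that cache currently holds.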

Both loads miss, so each cache puts a BusRd on the bus for the data. Main memory is the owner and provides the up-to-date data. P1’s cache loads w0 in the E state, and P3’s cache loads w2 in the E state as well.
[Diagram for P1 ld w0, P3 ld w2: each load is a PrRd followed by BusRd(¬S), with data from main memory. States after: P1 w0 E, P3 w2 E.]

P1 has w0 in the exclusive state, so its store hits in the cache and needs no bus transaction. w2 is not in P2’s cache, so the cache places a BusRdX to gain exclusive access. Main memory provides the data because its copy is not stale, even though P3’s cache also has the data. P2’s cache loads w2 in the M state, and P3’s cache invalidates its copy of w2.
[Diagram for P1 st w0, P2 st w2: labels PrWr (P1), PrWr and BusRdX (P2), Flush. States after: P1 w0 M, P2 w2 M, P3 w2 I.]

P2 executes another store to w2. It already has exclusive access to w2, so the store hits in the cache and P2’s cache issues no bus transaction. P3 wants to load w0. This results in a cache miss, and P3’s cache issues a BusRd transaction. P1’s cache holds a dirty copy of w0, so it asserts the S signal and provides the up-to-date data through a flush. P1 changes its state to S, and P3’s cache loads w0 in the S state.
[Diagram for P2 st w2, P3 ld w0: PrWr at P2 with no bus transaction; PrRd at P3, BusRd(S), Flush from P1. States after: P1 w0 S, P2 w2 M, P3 w0 S.]

P3 executes a store to w0. Both P1 and P3 have an up-to-date, unmodified copy of w0. What bus transactions are needed?
[Diagram for P3 st w0: PrWr at P3. States after: P1 w0 I, P2 w2 M, P3 w0 M.]

P1 wants to load w2. P1’s cache does not have w2, so it issues a BusRd transaction. P2’s cache asserts the S signal, so P1’s cache knows to load w2 in the S state. P2’s cache provides w2 to P1 through a Flush, which cancels the access to main memory.
[Diagram for P1 ld w2: PrRd at P1, BusRd(S), Flush from P2. States after: P1 w2 S, P2 w2 S, P3 w0 M.]

P2 wants to load w1. This generates a cache miss, and P2’s cache issues a BusRd transaction. The S signal is not asserted, so P2’s cache knows that it has exclusive access to w1. Main memory provides the data for w1. Should the state of w2 be changed in P1 because it is the only cache that has a copy of w2?
[Diagram for P2 ld w1: PrRd at P2, BusRd(¬S), data from main memory. States after: P1 w2 S, P2 w1 E, P3 w0 M.]

P3 wants to load w1. This generates a cache miss, and P3’s cache issues a BusRd transaction. The S signal is asserted by P2’s cache, so P3’s cache knows that it will load w1 in the S state. Main memory provides the data for w1 because its copy is not stale. P3 flushes w0 before loading w1.
[Diagram for P3 ld w1: PrRd at P3, Flush of w0 from P3, BusRd(S), data from main memory. States after: P1 w2 S, P2 w1 S, P3 w1 S.]
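The MESI sequence above can be replayed with a short simulation. This is a sketch under stated assumptions: each cache holds a single line (all words map to the same location), bus transactions are atomic, and only a cache in the M state flushes data, following the prose rather than every diagram label. The store to a shared line is modeled only by its resulting states, so the bus transaction asked about in that question is deliberately left unnamed. The function name `mesi_access` is made up for illustration.

```python
def mesi_access(caches, proc, op, word):
    """Apply one load ('ld') or store ('st') and return the bus events.

    caches maps processor id -> (word, state) for its single cache line;
    state is one of 'M', 'E', 'S', 'I'.
    """
    events = []
    held, state = caches[proc]
    hit = held == word and state != "I"

    if hit and (op == "ld" or state in ("M", "E")):
        if op == "st":
            caches[proc] = (word, "M")   # E -> M silently, M stays M
        return events                    # hit with no bus transaction

    if not hit and state == "M":
        # The line being replaced is dirty, so it is written back first.
        events.append(f"P{proc} Flush {held} (write-back)")

    # Other caches that currently hold a valid copy of this word.
    holders = [p for p, (w, s) in caches.items()
               if p != proc and w == word and s != "I"]

    if op == "ld":
        events.append("BusRd(S)" if holders else "BusRd(not S)")
        new_state, snooped_state = ("S" if holders else "E"), "S"
    else:
        events.append("BusRdX" if not hit else "invalidate other copies")
        new_state, snooped_state = "M", "I"

    for p in holders:
        if caches[p][1] == "M":
            events.append(f"P{p} Flush {word}")
        caches[p] = (word, snooped_state)
    caches[proc] = (word, new_state)
    return events


steps = [
    [(1, "ld", "w0"), (3, "ld", "w2")],
    [(1, "st", "w0"), (2, "st", "w2")],
    [(2, "st", "w2"), (3, "ld", "w0")],
    [(3, "st", "w0")],
    [(1, "ld", "w2")],
    [(2, "ld", "w1")],
    [(3, "ld", "w1")],
]
caches = {p: (None, "I") for p in (1, 2, 3)}
log = []
for step in steps:
    for proc, op, word in step:
        events = mesi_access(caches, proc, op, word)
        log.append((proc, op, word, events))
        print(f"P{proc} {op} {word}: {events} -> {caches}")
```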

Dragon Protocol

In this system there are three processors. Each loads or stores from memory locations w0, w1, w2, and w3. w0 and w1 are on the same cache line and are loaded together, as are w2 and w3. The two cache lines map to the same location in the cache.

P1 wants to load w2. This generates a cache miss, and P1’s cache issues a BusRd bus transaction. The S signal is not asserted, so the cache knows to load w2 and w3 in the E state.
[Diagram for P1 ld w2: PrRd at P1, BusRd(¬S), data from main memory. States after: P1 w2,w3 E.]

P2 wants to load w0. This generates a cache miss, and P2’s cache issues a BusRd bus transaction. The S signal is not asserted, so the cache knows to load w0 and w1 in the E state.
[Diagram for P2 ld w0: PrRd at P2, BusRd(¬S), data from main memory. States after: P1 w2,w3 E; P2 w0,w1 E.]

P3 wants to store w1. This generates a cache miss. Memory will provide the data, as no other cache has this line in a modified state. After storing the new value of w1, P3’s cache issues a BusUpd. P2 snoops this and updates its cache with the new value of w1.
[Diagram for P3 st w1: PrWr at P3, BusRd(S), then BusUpd; P2 updates its copy. States after: P1 w2,w3 E; P2 w0,w1 Sc; P3 w0,w1 Sm.]
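The three Dragon steps so far are all cache misses, and they can be sketched as follows. This is a partial model under stated assumptions: it covers only load and store misses into an empty cache line, so evictions and write hits are not modeled, and the line labels "A" and "B" as well as the function name `dragon_miss` are made up for illustration.

```python
# w0/w1 share one cache line and w2/w3 share another (labels are hypothetical).
LINE_OF = {"w0": "A", "w1": "A", "w2": "B", "w3": "B"}


def dragon_miss(caches, proc, op, word):
    """Handle a load ('ld') or store ('st') miss and return the bus events.

    caches maps processor id -> (line, state); state is 'E', 'Sc', 'Sm',
    'M', or None when the cache holds nothing yet.
    """
    line = LINE_OF[word]
    events = []
    sharers = [p for p, (l, _) in caches.items() if p != proc and l == line]
    events.append("BusRd(S)" if sharers else "BusRd(not S)")

    for p in sharers:
        state = caches[p][1]
        if state in ("M", "Sm"):
            events.append(f"P{p} Flush")   # the owner supplies the data
        # Snooping a BusRd: E drops to Sc, M drops to Sm, Sc and Sm stay.
        caches[p] = (line, {"E": "Sc", "M": "Sm"}.get(state, state))

    if op == "ld":
        caches[proc] = (line, "Sc" if sharers else "E")
    elif sharers:
        events.append("BusUpd")            # broadcast the newly written word
        for p in sharers:
            caches[p] = (line, "Sc")       # the writer becomes the owner
        caches[proc] = (line, "Sm")
    else:
        caches[proc] = (line, "M")
    return events


caches = {p: (None, None) for p in (1, 2, 3)}
log = []
for proc, op, word in [(1, "ld", "w2"), (2, "ld", "w0"), (3, "st", "w1")]:
    events = dragon_miss(caches, proc, op, word)
    log.append(events)
    print(f"P{proc} {op} {word}: {events} -> {caches}")
```

The final states printed for the third access match the diagram: P1 holds w2,w3 in E, P2 holds w0,w1 in Sc, and P3 holds w0,w1 in Sm.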

P1 issues a store to w3. It has exclusive access to this cache line. What bus transactions does P1’s cache issue?
[Diagram for P1 st w3: P1’s line w2,w3 goes from E to M. Other states: P2 w0,w1 Sc; P3 w0,w1 Sm.]

P2 wants to load w3. This generates a cache miss, and P2’s cache issues a BusRd transaction. P1 asserts the S signal, so P2 will load the cache line in the Sc state. P1’s cache has a modified version of the cache line, so it will provide the data for P2 with a flush transaction. P1’s cache will update the line’s state to Sm. Should P3 change w0/w1’s state to M?
[Diagram for P2 ld w3: PrRd at P2, BusRd(S), Flush of w2,w3 from P1. States after: P1 w2,w3 Sm; P2 w2,w3 Sc; P3 w0,w1 marked “???”.]

P2 wants to load w2. P3 wants to store w0. What are the necessary bus transactions and cache updates that need to take place?
[Diagram for P2 ld w2, P3 st w0: PrRd at P2, PrWr at P3. States before: P1 w2,w3 Sm; P2 w2,w3 Sc; P3 w0,w1 marked “?”.]