CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.

Slides:



Advertisements
Similar presentations
Cache Coherence. Memory Consistency in SMPs Suppose CPU-1 updates A to 200. write-back: memory and cache-2 have stale values write-through: cache-2 has.
Advertisements

L.N. Bhuyan Adapted from Patterson’s slides
Multi-core systems System Architecture COMP25212 Daniel Goodman Advanced Processor Technologies Group.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache III Steve Ko Computer Sciences and Engineering University at Buffalo.
4/16/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 12 CS252 Graduate Computer Architecture Spring 2014 Lecture 12: Synchronization and Memory Models Krste.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Complex Pipelining II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency II Steve Ko Computer Sciences and Engineering University at.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Address Translation and Protection Steve Ko Computer Sciences and Engineering University at.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Pipelining III Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache IV Steve Ko Computer Sciences and Engineering University at Buffalo.
Technical University of Lodz Department of Microelectronics and Computer Science Elements of high performance microprocessor architecture Shared-memory.
CSE 490/590 Computer Architecture Virtual Memory II
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Putting it all together: Intel Nehalem Steve Ko Computer Sciences and Engineering University.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Multithreading II Steve Ko Computer Sciences and Engineering University at Buffalo.
February 9, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and.
CIS629 Coherence 1 Cache Coherence: Snooping Protocol, Directory Protocol Some of these slides courtesty of David Patterson and David Culler.
CS 152 Computer Architecture and Engineering Lecture 15 - Advanced Superscalars Krste Asanovic Electrical Engineering and Computer Sciences University.
CS252/Patterson Lec /23/01 CS213 Parallel Processing Architecture Lecture 7: Multiprocessor Cache Coherency Problem.
1 Lecture 1: Introduction Course organization:  4 lectures on cache coherence and consistency  2 lectures on transactional memory  2 lectures on interconnection.
CS 152 Computer Architecture and Engineering Lecture 21: Directory-Based Cache Protocols Scott Beamer (substituting for Krste Asanovic) Electrical Engineering.
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon, Nov 14, 2005 Topic: Cache Coherence.
February 9, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and.
CS 252 Graduate Computer Architecture Lecture 11: Multiprocessors-II Krste Asanovic Electrical Engineering and Computer Sciences University of California,
April 13, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Krste Asanovic Electrical Engineering and Computer.
CPE 731 Advanced Computer Architecture Snooping Cache Multiprocessors Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of.
CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches Krste Asanovic Electrical Engineering and Computer Sciences University of California,
Snooping Cache and Shared-Memory Multiprocessors
April 18, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
April 15, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches Krste Asanovic Electrical Engineering and Computer.
February 11, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 8 - Memory Hierarchy-III Krste Asanovic Electrical Engineering.
Cache Organization of Pentium
1 Shared-memory Architectures Adapted from a lecture by Ian Watson, University of Machester.
Multiprocessor Cache Coherency
1 Cache coherence CEG 4131 Computer Architecture III Slides developed by Dr. Hesham El-Rewini Copyright Hesham El-Rewini.
Shared Address Space Computing: Hardware Issues Alistair Rendell See Chapter 2 of Lin and Synder, Chapter 2 of Grama, Gupta, Karypis and Kumar, and also.
Spring EE 437 Lillevik 437s06-l21 University of Portland School of Engineering Advanced Computer Architecture Lecture 21 MSP shared cached MSI protocol.
EECS 252 Graduate Computer Architecture Lec 13 – Snooping Cache and Directory Based Multiprocessors David Patterson Electrical Engineering and Computer.
Cache Control and Cache Coherence Protocols How to Manage State of Cache How to Keep Processors Reading the Correct Information.
A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories Also known as “Snoopy cache” Paper by: Mark S. Papamarcos and Janak H.
Lecture 13: Multiprocessors Kai Bu
Ch4. Multiprocessors & Thread-Level Parallelism 2. SMP (Symmetric shared-memory Multiprocessors) ECE468/562 Advanced Computer Architecture Prof. Honggang.
Multiprocessor cache coherence. Caching: terms and definitions cache line, line size, cache size degree of associativity –direct-mapped, set and fully.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 April 5, 2005 Session 22.
Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 March 20, 2008 Session 9.
August 13, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 11: Multiprocessors: Uniform Memory Access * Jeremy R. Johnson Monday,
Additional Material CEG 4131 Computer Architecture III
1 Lecture: Coherence Protocols Topics: snooping-based protocols.
4/13/2016 CS152, Spring 2016 CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Dr. George Michelogiannakis EECS, University of California.
The University of Adelaide, School of Computer Science
Outline Introduction (Sec. 5.1)
COSC6385 Advanced Computer Architecture
COMP 740: Computer Architecture and Implementation
CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches
תרגול מס' 5: MESI Protocol
12.4 Memory Organization in Multiprocessor Systems
Dr. George Michelogiannakis EECS, University of California at Berkeley
Krste Asanovic Electrical Engineering and Computer Sciences
CMSC 611: Advanced Computer Architecture
Krste Asanovic Electrical Engineering and Computer Sciences
CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 18 Cache Coherence Krste Asanovic Electrical Engineering and.
Coherent caches Adapted from a lecture by Ian Watson, University of Machester.
CSE 486/586 Distributed Systems Cache Coherence
Presentation transcript:

CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 490/590, Spring Last time… Cache coherence protocol –How to propagate writes by one processor to others Cache coherence protocol vs. memory consistency –Cache coherence protocol makes sure that writes are eventually propagated. –Memory consistency protocol guarantees when writes become visible.

CSE 490/590, Spring 2011 More on Coherence A memory system is coherent if: A read by a processor P to a location X that follows a write by P to X, with no writes of X by another processor occurring between the write and the read by P, always returns the value written by P. A read by a processor to location X that follows a write by another processor to X returns the written value if the read and write are sufficiently separated in time and no other writes to X occur between the two accesses. Writes to the same location are serialized; that is, two writes to the same location by any two processors are seen in the same order by all processors. 3

CSE 490/590, Spring Memory Coherence in SMPs Suppose CPU-1 updates A to 200. write-back: memory and cache-2 have stale values write-through: cache-2 has a stale value Do these stale values matter? What is the view of shared memory for programming? cache-1 A100 CPU-Memory bus CPU-1 CPU-2 cache-2 A100 memory A100

CSE 490/590, Spring Problems with Parallel I/O Memory Disk: Physical memory may be stale if cache copy is dirty Disk Memory: Cache may hold stale data and not see memory writes DISK DMA Physical Memory Proc. Cache Memory Bus Cached portions of page DMA transfers

CSE 490/590, Spring Snoopy Cache Goodman 1983 Idea: Have cache watch (or snoop upon) DMA transfers, and then “do the right thing” Snoopy cache tags are dual-ported Proc. Cache Snoopy read port attached to Memory Bus Data (lines) Tags and State A D R/W Used to drive Memory Bus when Cache is Bus Master A R/W

CSE 490/590, Spring Snoopy Cache Actions for DMA Observed Bus Cycle Cache State Cache Action Address not cached DMA Read Cached, unmodified Memory Disk Cached, modified Address not cached DMA Write Cached, unmodified Disk Memory Cached, modified No action Cache intervenes Cache purges its copy ???

CSE 490/590, Spring Shared Memory Multiprocessor Use snoopy mechanism to keep all processors’ view of memory coherent M1M1 M2M2 M3M3 Snoopy Cache DMA Physical Memory Bus Snoopy Cache Snoopy Cache DISKS

CSE 490/590, Spring Snoopy Cache Coherence Protocols write miss: the address is invalidated in all other caches before the write is performed read miss: if a dirty copy is found in some cache, a write- back is performed before the memory is read

CSE 490/590, Spring Cache State Transition Diagram The MSI protocol M SI M: Modified S: Shared I: Invalid Each cache line has state bits Address tag state bits Write miss (P1 gets line from memory) Other processor intent to write (P 1 writes back) Read miss (P1 gets line from memory) P 1 intent to write Other processor intent to write Read by any processor P 1 reads or writes Cache state in processor P 1 Other processor reads (P 1 writes back)

CSE 490/590, Spring Two Processor Example (Reading and writing the same cache line) M SI Write miss Read miss P 1 intent to write P 2 intent to write P 2 reads, P 1 writes back P 1 reads or writes P 2 intent to write P1P1 M SI Write miss Read miss P 2 intent to write P 1 intent to write P 1 reads, P 2 writes back P 2 reads or writes P 1 intent to write P2P2 P 1 reads P 1 writes P 2 reads P 2 writes P 1 writes P 2 writes P 1 reads P 1 writes

CSE 490/590, Spring Observation If a line is in the M state then no other cache can have a copy of the line! – Memory stays coherent, multiple differing copies cannot exist M SI Write miss Other processor intent to write Read miss P 1 intent to write Other processor intent to write Read by any processor P 1 reads or writes Other processor reads P 1 writes back

CSE 490/590, Spring MESI: An Enhanced MSI protocol increased performance for private data ME SI M: Modified Exclusive E: Exclusive but unmodified S: Shared I: Invalid Each cache line has a tag Address tag state bits Write miss Other processor intent to write Read miss, shared Other processor intent to write P 1 write Read by any processor Other processor reads P 1 writes back P 1 read P 1 write or read Cache state in processor P 1 P 1 intent to write Read miss, not shared Other processor reads Other processor intent to write, P1 writes back

CSE 490/590, Spring CSE 490/590 Administrivia Keyboards available for pickup at my office Project 2: 2 weeks left (Deadline 5/2) –Will have demo sessions –Keyboard helper code will be available Final exam: Thursday 5/5, 11:45pm – 2:45pm

CSE 490/590, Spring Acknowledgements These slides heavily contain material developed and copyright by –Krste Asanovic (MIT/UCB) –David Patterson (UCB) And also by: –Arvind (MIT) –Joel Emer (Intel/MIT) –James Hoe (CMU) –John Kubiatowicz (UCB) MIT material derived from course UCB material derived from course CS252