CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.

Slides:



Advertisements
Similar presentations
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache III Steve Ko Computer Sciences and Engineering University at Buffalo.
Advertisements

4/16/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
1 Lecture 4: Directory Protocols Topics: directory-based cache coherence implementations.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Complex Pipelining II Steve Ko Computer Sciences and Engineering University at Buffalo.
Cache Optimization Summary
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency II Steve Ko Computer Sciences and Engineering University at.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Pipelining III Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache IV Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Putting it all together: Intel Nehalem Steve Ko Computer Sciences and Engineering University.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency I Steve Ko Computer Sciences and Engineering University at Buffalo.
CIS629 Coherence 1 Cache Coherence: Snooping Protocol, Directory Protocol Some of these slides courtesty of David Patterson and David Culler.
EECC756 - Shaaban #1 lec # 13 Spring Scalable Cache Coherent Systems Scalable distributed shared memory machines Assumptions: –Processor-Cache-Memory.
1 Lecture 1: Introduction Course organization:  4 lectures on cache coherence and consistency  2 lectures on transactional memory  2 lectures on interconnection.
CS 152 Computer Architecture and Engineering Lecture 21: Directory-Based Cache Protocols Scott Beamer (substituting for Krste Asanovic) Electrical Engineering.
CS252/Patterson Lec /28/01 CS 213 Lecture 9: Multiprocessor: Directory Protocol.
ECE669 L18: Scalable Parallel Caches April 6, 2004 ECE 669 Parallel Computer Architecture Lecture 18 Scalable Parallel Caches.
NUMA coherence CSE 471 Aut 011 Cache Coherence in NUMA Machines Snooping is not possible on media other than bus/ring Broadcast / multicast is not that.
April 13, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Krste Asanovic Electrical Engineering and Computer.
EECC756 - Shaaban #1 lec # 12 Spring Scalable Cache Coherent Systems Scalable distributed shared memory machines Assumptions: –Processor-Cache-Memory.
EECC756 - Shaaban #1 lec # 11 Spring Scalable Cache Coherent Systems Scalable distributed shared memory machines Assumptions: –Processor-Cache-Memory.
1 Lecture 3: Directory-Based Coherence Basic operations, memory-based and cache-based directories.
CPE 731 Advanced Computer Architecture Snooping Cache Multiprocessors Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of.
CS252/Patterson Lec /28/01 CS 213 Lecture 10: Multiprocessor 3: Directory Organization.
Snooping Cache and Shared-Memory Multiprocessors
April 18, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
1 Shared-memory Architectures Adapted from a lecture by Ian Watson, University of Machester.
Multiprocessor Cache Coherency
Spring 2003CSE P5481 Cache Coherency Cache coherent processors reading processor must get the most current value most current value is the last write Cache.
1 Cache coherence CEG 4131 Computer Architecture III Slides developed by Dr. Hesham El-Rewini Copyright Hesham El-Rewini.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed Shared Memory Steve Ko Computer Sciences and Engineering University at Buffalo.
CS492B Analysis of Concurrent Programs Coherence Jaehyuk Huh Computer Science, KAIST Part of slides are based on CS:App from CMU.
EECS 252 Graduate Computer Architecture Lec 13 – Snooping Cache and Directory Based Multiprocessors David Patterson Electrical Engineering and Computer.
Constructive Computer Architecture Cache Coherence Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November.
Ch4. Multiprocessors & Thread-Level Parallelism 2. SMP (Symmetric shared-memory Multiprocessors) ECE468/562 Advanced Computer Architecture Prof. Honggang.
Cache Coherence Protocols A. Jantsch / Z. Lu / I. Sander.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 April 5, 2005 Session 22.
Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 March 20, 2008 Session 9.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 April 7, 2005 Session 23.
The University of Adelaide, School of Computer Science
CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches
CS427 Multicore Architecture and Parallel Computing
Scalable Cache Coherent Systems
Multiprocessor Cache Coherency
Dr. George Michelogiannakis EECS, University of California at Berkeley
Scalable Cache Coherent Systems
CMSC 611: Advanced Computer Architecture
Krste Asanovic Electrical Engineering and Computer Sciences
Example Cache Coherence Problem
CS427 Multicore Architecture and Parallel Computing
CS5102 High Performance Computer Systems Distributed Shared Memory
Scalable Cache Coherent Systems
11 – Snooping Cache and Directory Based Multiprocessors
Scalable Cache Coherent Systems
Slides developed by Dr. Hesham El-Rewini Copyright Hesham El-Rewini
Lecture 25: Multiprocessors
Cache coherence CEG 4131 Computer Architecture III
Scalable Cache Coherent Systems
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 18 Cache Coherence Krste Asanovic Electrical Engineering and.
Lecture 24: Multiprocessors
Scalable Cache Coherent Systems
Coherent caches Adapted from a lecture by Ian Watson, University of Machester.
CSE 486/586 Distributed Systems Cache Coherence
Scalable Cache Coherent Systems
Scalable Cache Coherent Systems
Scalable Cache Coherent Systems
Presentation transcript:

CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches II Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 490/590, Spring Last time… True sharing vs. false sharing Miss vs. hit in multiprocessors

CSE 490/590, Spring A Cache Coherent System Must: Provide set of states, state transition diagram, and actions Manage coherence protocol –(0) Determine when to invoke coherence protocol –(a) Find info about state of address in other caches to determine action »whether need to communicate with other cached copies –(b) Locate the other copies –(c) Communicate with those copies (invalidate/update) (0) is done the same way on all systems –state of the line is maintained in the cache –protocol is invoked if an “access fault” occurs on the line Different approaches distinguished by (a) to (c)

CSE 490/590, Spring Bus-based Coherence All of (a), (b), (c) done through broadcast on bus –faulting processor sends out a “search” –others respond to the search probe and take necessary action Could do it in scalable network too –broadcast to all processors, and let them respond Conceptually simple, but broadcast doesn’t scale with number of processors, P –on bus, bus bandwidth doesn’t scale –on scalable network, every fault leads to at least P network transactions Scalable coherence: –can have same cache states and state transition diagram –different mechanisms to manage protocol

CSE 490/590, Spring Scalable Approach: Directories Every memory block has associated directory information –keeps track of copies of cached blocks and their states –on a miss, find directory entry, look it up, and communicate only with the nodes that have copies if necessary –in scalable networks, communication with directory and copies is through network transactions Many alternatives for organizing directory information

CSE 490/590, Spring Basic Operation of Directory k processors. With each cache-block in memory: k presence-bits, 1 dirty-bit With each cache-block in cache: 1 valid bit, and 1 dirty (owner) bit Read from main memory by processor i: If dirty-bit OFF then { read from main memory; turn p[i] ON; } if dirty-bit ON then { recall line from dirty proc (downgrade cache state to shared); update memory; turn dirty-bit OFF; turn p[i] ON; supply recalled data to i;} Write to main memory by processor i: If dirty-bit OFF then {send invalidations to all caches that have the block; turn dirty-bit ON; supply data to i; turn p[i] ON;... }

CSE 490/590, Spring Directory Cache Protocol Assumptions: Reliable network, FIFO message delivery between any given source-destination pair CPU Cache Interconnection Network Directory Controller DRAM Bank Directory Controller DRAM Bank CPU Cache CPU Cache CPU Cache CPU Cache CPU Cache Directory Controller DRAM Bank Directory Controller DRAM Bank

CSE 490/590, Spring Cache States For each cache line, there are 4 possible states: –C-invalid (= Nothing): The accessed data is not resident in the cache. –C-shared (= Sh): The accessed data is resident in the cache, and possibly also cached at other sites. The data in memory is valid. –C-modified (= Ex): The accessed data is exclusively resident in this cache, and has been modified. Memory does not have the most up-to-date data. –C-transient (= Pending): The accessed data is in a transient state (for example, the site has just issued a protocol request, but has not received the corresponding protocol reply).

CSE 490/590, Spring Home directory states For each memory block, there are 4 possible states: –R(dir): The memory block is shared by the sites specified in dir (dir is a set of sites). The data in memory is valid in this state. If dir is empty (i.e., dir = ε), the memory block is not cached by any site. –W(id): The memory block is exclusively cached at site id, and has been modified at that site. Memory does not have the most up-to-date data. –TR(dir): The memory block is in a transient state waiting for the acknowledgements to the invalidation requests that the home site has issued. –TW(id): The memory block is in a transient state waiting for a block exclusively cached at site id (i.e., in C-modified state) to make the memory block at the home site up-to- date.

CSE 490/590, Spring CSE 490/590 Administrivia Keyboards available for pickup at my office Project 2: less than 2 weeks left (Deadline 5/2) –Will have demo sessions No class on 5/2 (finish the project!) Final exam: Thursday 5/5, 11:45pm – 2:45pm Project 2 + Final = 55%

CSE 490/590, Spring CategoryMessages Cache to Memory Requests ShReq, ExReq Memory to Cache Requests WbReq, InvReq, FlushReq Cache to Memory Responses WbRep(v), InvRep, FlushRep(v) Memory to Cache Responses ShRep(v), ExRep(v) Protocol Messages There are 10 different protocol messages:

CSE 490/590, Spring Cache State Transitions (from invalid state)

CSE 490/590, Spring Cache State Transitions (from shared state)

CSE 490/590, Spring Cache State Transitions (from exclusive state)

CSE 490/590, Spring Cache Transitions (from pending)

CSE 490/590, Spring Home Directory State Transitions Messages sent from site id

CSE 490/590, Spring Home Directory State Transitions Messages sent from site id

CSE 490/590, Spring Home Directory State Transitions Messages sent from site id

CSE 490/590, Spring Home Directory State Transitions Messages sent from site id

CSE 490/590, Spring Acknowledgements These slides heavily contain material developed and copyright by –Krste Asanovic (MIT/UCB) –David Patterson (UCB) And also by: –Arvind (MIT) –Joel Emer (Intel/MIT) –James Hoe (CMU) –John Kubiatowicz (UCB) MIT material derived from course UCB material derived from course CS252