An Extensible Simulator for Bus- and Directory-Based Coherence


An Extensible Simulator for Bus- and Directory-Based Coherence
Allen Chen, Deepak Souda Bhat, Edward F. Gehringer
North Carolina State University

Cache coherence
One of the main issues in parallel architecture.
Two main protocol types: invalidate and update.
Examples of ethical analyses
Extensible cache-coherence simulator efg@ncsu.edu

Two main architecture types
SMPs use snoopy protocols.
DSMs use directory-based protocols.

Simulator is trace driven
Reads a set of memory references in this format:
1 r a1663dc4
1 w a1663dc4
2 r a165d30c
2 r a1663dc4
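A minimal sketch of how one line of such a trace might be parsed. It assumes the three fields are a processor number, an operation (r for read, w for write), and a hexadecimal address; the slide does not spell this out. The names TraceRef, parse_trace_line, and demo_parse are hypothetical and are not taken from the simulator.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record for one trace line: "<processor> <r|w> <hex address>".
struct TraceRef {
    int processor;
    char op;             // 'r' for read, 'w' for write
    unsigned long addr;  // address, written in hex without a 0x prefix
};

// Parse one trace line into `out`. Returns false if the line is malformed.
bool parse_trace_line(const std::string& line, TraceRef& out) {
    std::istringstream in(line);
    TraceRef ref;
    if (!(in >> ref.processor >> ref.op >> std::hex >> ref.addr)) {
        return false;
    }
    if (ref.op != 'r' && ref.op != 'w') {
        return false;
    }
    out = ref;
    return true;
}

// Usage: parse the first reference of the sample trace.
inline void demo_parse() {
    TraceRef ref;
    bool ok = parse_trace_line("1 r a1663dc4", ref);
    assert(ok && ref.processor == 1 && ref.op == 'r');
    (void)ok;
}
```

A driver loop would read the trace file line by line and hand each parsed reference to the cache of the named processor.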

CPU action → bus action
A CPU action, e.g., writing a word (PrWr), triggers a bus action, e.g., invalidating other copies of the block (BusRdX).
How this is implemented:
do CPU action (method of cache class)
for each other cache (method of main class)
    do bus action (method of cache class)
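The control flow on this slide can be sketched as follows. This is an illustration under simplifying assumptions, not the simulator's code: ToyCache, ToySystem, snoop, and access are hypothetical names, and the toy cache keeps no state, so every access produces a bus action.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class BusAction { None, BusRd, BusRdX };

// Hypothetical stand-in for a cache: it only records what it has snooped.
struct ToyCache {
    int snooped = 0;                   // bus actions seen from other caches
    BusAction last = BusAction::None;  // most recent bus action seen

    // CPU-side actions. In this toy, every read asks for a shared copy
    // and every write asks for an exclusive copy.
    BusAction PrRd(unsigned long /*addr*/) { return BusAction::BusRd; }
    BusAction PrWr(unsigned long /*addr*/) { return BusAction::BusRdX; }

    // Bus-side action, run in every cache other than the requester.
    void snoop(BusAction action, unsigned long /*addr*/) {
        ++snooped;
        last = action;
    }
};

// Hypothetical "main class": owns the caches and plays the role of the bus.
struct ToySystem {
    std::vector<ToyCache> caches;
    explicit ToySystem(std::size_t n) : caches(n) {}

    void access(std::size_t proc, char op, unsigned long addr) {
        assert(proc < caches.size());
        // Step 1: do the CPU action in the requesting cache.
        BusAction action = (op == 'w') ? caches[proc].PrWr(addr)
                                       : caches[proc].PrRd(addr);
        if (action == BusAction::None) return;
        // Step 2: for each other cache, do the matching bus action.
        for (std::size_t i = 0; i < caches.size(); ++i) {
            if (i != proc) caches[i].snoop(action, addr);
        }
    }
};
```

Keeping the iteration in the main class means a cache class needs to know only its own state transitions, which is what makes it practical to add a protocol as a new class.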

Protocols supported: MSI, MESI, MOESI, Firefly, Dragon

Example method: PrRd for MSI
void MSI::PrRd(ulong addr, int processor_number) {
    // Per-cache global counter to maintain LRU order among
    // cache ways, updated on every cache access
    current_cycle++;
    reads++;
    cache_line *line = find_line(addr);
    if (line == NULL) {
        // This is a miss
        read_misses++;
        cache_line *newline = allocate_line(addr);
        memory_transactions++;
        // State I --> S
        newline->set_state(S);
        // Read miss --> BusRd
        bus_reads++;
        sendBusRd(addr, processor_number);
    }

PrRd for MSI (cont.)
    else {
        // The block is cached
        cache_state state;
        state = line->get_state();
        if (state == I) {
            // The block is cached, but in invalid state.
            // Hence read miss
            memory_transactions++;
            read_misses++;
            line->set_state(S);
            bus_reads++;
            sendBusRd(addr, processor_number);
        }
        else {
            // Read hit: only the LRU order changes
            update_LRU(line);
        }
    }
}
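PrRd covers the processor side of a read. The other caches also have to react to the resulting bus action. The sketch below shows the textbook MSI snoop-side transitions for a single line. It is not taken from the simulator: Line, snoop_bus_rd, snoop_bus_rdx, and the flushes counter are hypothetical names used only for illustration.

```cpp
#include <cassert>

enum class State { I, S, M };

// Hypothetical single cache line, reduced to its coherence state.
struct Line {
    State state = State::I;
    int flushes = 0;  // times this line had to supply its dirty data
};

// Another cache issued BusRd: a Modified copy is flushed and downgraded
// to Shared; Shared and Invalid copies are unaffected.
void snoop_bus_rd(Line& line) {
    if (line.state == State::M) {
        ++line.flushes;
        line.state = State::S;
    }
}

// Another cache issued BusRdX: a Modified copy is flushed first, and any
// valid copy is then invalidated.
void snoop_bus_rdx(Line& line) {
    if (line.state == State::M) {
        ++line.flushes;
    }
    line.state = State::I;
}

// Usage: walk one line through M -> S -> I.
inline void demo_msi_snoop() {
    Line line;
    line.state = State::M;
    snoop_bus_rd(line);
    assert(line.state == State::S && line.flushes == 1);
    snoop_bus_rdx(line);
    assert(line.state == State::I && line.flushes == 1);
}
```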

How directory-based protocols differ
Along with the cache hierarchy (Cache: MSI, MESI, Dragon, etc.), there is a directory hierarchy (Directory: full bit vector, SCI, SSCI, etc.).
Instead of bus actions, there are signal actions: no BusRd, but SignalRd.
There is no iteration over all other caches.
Directories receive Invalidation and Intervention messages.
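To illustrate why no iteration over all caches is needed, here is a minimal sketch of a full-bit-vector directory entry. It is an assumption-laden illustration, not the simulator's design: DirEntry, other_sharers, signal_rd, and signal_rdx are hypothetical names (the slides mention only SignalRd; SignalRdX is named here by analogy with BusRdX), and the sketch supports at most 64 caches.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical directory entry for one memory block.
// Bit i of `sharers` is set when cache i may hold a copy.
struct DirEntry {
    std::uint64_t sharers = 0;
    bool dirty = false;  // true when one cache owns a modified copy
};

// List every cache recorded in the bit vector except `requester`.
static std::vector<int> other_sharers(const DirEntry& e, int requester) {
    std::vector<int> out;
    for (int i = 0; i < 64; ++i) {
        if (((e.sharers >> i) & 1u) != 0 && i != requester) out.push_back(i);
    }
    return out;
}

// Read request: if the block is dirty, the owner gets an intervention.
// Returns the caches that must be contacted (at most the owner).
std::vector<int> signal_rd(DirEntry& e, int requester) {
    assert(requester >= 0 && requester < 64);
    std::vector<int> targets;
    if (e.dirty) {
        targets = other_sharers(e, requester);
        e.dirty = false;
    }
    e.sharers |= (std::uint64_t(1) << requester);
    return targets;
}

// Write request: every other recorded sharer gets an invalidation,
// and the requester becomes the sole owner.
std::vector<int> signal_rdx(DirEntry& e, int requester) {
    assert(requester >= 0 && requester < 64);
    std::vector<int> targets = other_sharers(e, requester);
    e.sharers = (std::uint64_t(1) << requester);
    e.dirty = true;
    return targets;
}
```

Only the caches named in the bit vector are contacted, in contrast to the bus-based scheme, where every other cache snoops every bus action.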

Protocols supported: FBV (full bit vector)
[Figures: state transitions for a cache; state transitions for main memory]

Sample assignments
Given MESI and Dragon, implement MSI and Firefly.
Given write-through, implement MSI with and without BusUpgr.
Given write-through, implement Firefly.

Sample assignments, cont.
Given invalidation protocols, implement update protocols.
Given a bus-based MESI, implement directory-based MESI.
Reimplement closely related protocols as a superclass and subclass.
Hybridize two of the protocols, say, invalidation and update.

Assignments can study the effect of varying:
protocol
cache size
block size
associativity
number of processors (dependent on trace)
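Cache size, block size, and associativity together determine how an address maps to a set, so changing any of them changes which references conflict. The sketch below shows the standard calculation. Geometry, make_geometry, set_index, and tag_of are hypothetical names, and the sketch assumes all three parameters are powers of two; the simulator's own bookkeeping may differ.

```cpp
#include <cassert>

// Hypothetical description of a set-associative cache layout.
struct Geometry {
    unsigned long num_sets;
    unsigned offset_bits;  // log2(block size in bytes)
    unsigned index_bits;   // log2(number of sets)
};

// log2 of a power of two; asserts if x is not a power of two.
static unsigned log2_exact(unsigned long x) {
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned bits = 0;
    while (x > 1) {
        x >>= 1;
        ++bits;
    }
    return bits;
}

// number of sets = cache size / (block size * associativity)
Geometry make_geometry(unsigned long cache_bytes, unsigned long block_bytes,
                       unsigned long assoc) {
    Geometry g;
    g.num_sets = cache_bytes / (block_bytes * assoc);
    g.offset_bits = log2_exact(block_bytes);
    g.index_bits = log2_exact(g.num_sets);
    return g;
}

// Set index: the address bits just above the block offset.
unsigned long set_index(const Geometry& g, unsigned long addr) {
    return (addr >> g.offset_bits) & (g.num_sets - 1);
}

// Tag: everything above the index bits.
unsigned long tag_of(const Geometry& g, unsigned long addr) {
    return addr >> (g.offset_bits + g.index_bits);
}
```

For example, a 32 KB, 4-way cache with 64-byte blocks has 128 sets, so address a1663dc4 from the sample trace falls in set 119.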

Summary
Through coding cache actions, students learn how cache coherence really works.
There are many different assignments you can give.
You can use the simulator term after term, each time trying something new.
It provides a good introduction to how architectural innovations are simulated (but in much less detail, so results are quick).