An Extensible Simulator for Bus- and Directory-Based Coherence
Allen Chen, Deepak Souda Bhat, Edward F. Gehringer
North Carolina State University

Cache coherence
  One of the main issues in parallel architecture
  Two main protocol types …
    Invalidate
    Update
Extensible cache-coherence simulator, efg@ncsu.edu

Two main architecture types
  SMPs … snoopy protocols
  DSMs … directory-based protocols

Simulator is trace driven
  Reads a set of memory references in this format:
    1 r a1663dc4
    1 w a1663dc4
    2 r a165d30c
    2 r a1663dc4

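The trace format can be read with a few lines of C++. The sketch below is illustrative only: `MemRef` and `parse_trace_line` are hypothetical names, not the simulator's own, and it assumes the three fields are a processor number, an `r`/`w` operation, and a hexadecimal address.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record type; the simulator's own names may differ.
struct MemRef {
    int processor;       // assumed: first field is the processor number
    char op;             // 'r' for a read, 'w' for a write
    std::uint32_t addr;  // address parsed from the hexadecimal field
};

// Parse one trace line such as "1 w a1663dc4" into `out`.
// Returns false if a field is missing or the operation is not r/w.
bool parse_trace_line(const std::string& text, MemRef& out) {
    std::istringstream in(text);
    MemRef ref = MemRef();
    if (!(in >> ref.processor >> ref.op >> std::hex >> ref.addr)) {
        return false;
    }
    if (ref.op != 'r' && ref.op != 'w') {
        return false;
    }
    out = ref;
    return true;
}
```

A driver would call this once per line and route `r` records to the read handler and `w` records to the write handler of the cache belonging to that processor.
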
CPU action → bus action
  CPU action, e.g., write a word (PrWr)
  Triggers a bus action, e.g., invalidate other blocks (BusRdX)
  How this is implemented:
    do CPU action            (method of cache class)
    for each other cache     (method of main class)
      do bus action          (method of cache class)

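The three-step structure above can be sketched as follows. This is a minimal toy, not the simulator's code: `ToyCache`, `snoop`, and `handle_write` are hypothetical names, and the toy cache assumes every write requests exclusive ownership.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned long ulong;

// Hypothetical bus transactions; only the ones this sketch needs.
enum BusAction { BUS_NONE, BUS_RD, BUS_RDX };

// Toy stand-in for the cache class: it only counts what it snoops.
struct ToyCache {
    int snooped_rdx = 0;

    // CPU action: in this toy, every write asks for exclusive ownership.
    BusAction PrWr(ulong /*addr*/) { return BUS_RDX; }

    // Bus action: react to a transaction issued by another cache.
    void snoop(BusAction action, ulong /*addr*/) {
        if (action == BUS_RDX) snooped_rdx++;
    }
};

// Stand-in for the main-class method: do the CPU action on the requester,
// then apply the resulting bus action to every other cache.
void handle_write(std::vector<ToyCache>& caches, std::size_t requester,
                  ulong addr) {
    BusAction action = caches[requester].PrWr(addr);
    if (action == BUS_NONE) return;
    for (std::size_t i = 0; i < caches.size(); ++i) {
        if (i != requester) caches[i].snoop(action, addr);
    }
}
```

Keeping the loop over the other caches in the main class means each cache method only ever touches its own state, which is what lets a new protocol be added by writing cache-class methods alone.
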
Protocols supported
  MSI
  MESI
  MOESI
  Firefly
  Dragon

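As a reference point for the simplest of these protocols, the textbook MSI transitions (without BusUpgr) can be written as two small functions. This is a sketch under that assumption; `Outcome`, `on_processor`, and `on_snoop` are hypothetical names and are not taken from the simulator.

```cpp
#include <cassert>

enum State { I, S, M };
enum BusOp { NONE, BUS_RD, BUS_RDX };

// Next state plus the bus transaction the requesting cache must issue.
struct Outcome {
    State next;
    BusOp bus;
};

// Processor-side transitions for a read (is_write == false) or a write.
Outcome on_processor(State s, bool is_write) {
    if (is_write) {
        // A write always ends in M; only M can write without using the bus.
        Outcome o = { M, s == M ? NONE : BUS_RDX };
        return o;
    }
    // A read hits in S or M; from I it fetches the block with BusRd.
    Outcome o = { s == I ? S : s, s == I ? BUS_RD : NONE };
    return o;
}

// Snoop-side transitions for a transaction issued by another cache.
State on_snoop(State s, BusOp op) {
    if (op == BUS_RDX) return I;            // another cache wants to write
    if (op == BUS_RD && s == M) return S;   // owner supplies the block, keeps a shared copy
    return s;
}
```
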
Example method: PrRd for MSI

    void MSI::PrRd(ulong addr, int processor_number) {
        // Per-cache global counter to maintain LRU order among
        // cache ways, updated on every cache access
        current_cycle++;
        reads++;
        cache_line *line = find_line(addr);
        if (line == NULL) {
            // This is a miss
            read_misses++;
            cache_line *newline = allocate_line(addr);
            memory_transactions++;
            // State I --> S
            newline->set_state(S);
            // Read miss --> BusRd
            bus_reads++;
            sendBusRd(addr, processor_number);
        } else {
            // The block is cached
            cache_state state = line->get_state();
            if (state == I) {
                // The block is cached, but in the invalid state,
                // hence a read miss
                memory_transactions++;
                read_misses++;
                line->set_state(S);
                bus_reads++;
                sendBusRd(addr, processor_number);
            } else {
                // Read hit: only the LRU order changes
                update_LRU(line);
            }
        }
    }

How directory-based protocols differ
  Along with the cache hierarchy, there is a directory hierarchy
    Cache: MSI, MESI, Dragon, etc.
    Directory: full bit vector, SCI, SSCI, etc.
  Instead of bus actions, signal actions
    No BusRd, but SignalRd.
    No iteration over all other caches
  Directories receive Invalidation, Intervention messages

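To illustrate why no iteration over all other caches is needed, here is a minimal sketch of a full-bit-vector directory entry. It is not the simulator's implementation: `DirEntry`, `signal_rd`, `signal_rdx`, and the bound `kMaxNodes` are assumptions made for the example (only the name SignalRd appears on the slide).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kMaxNodes = 16;  // assumed upper bound on the number of caches

// Hypothetical directory entry for one memory block.
struct DirEntry {
    std::bitset<kMaxNodes> sharers;  // bit i set => cache i may hold the block
    bool exclusive = false;          // true when one cache holds it modified
};

// Read request: returns the owner that needs an intervention, or -1 if none.
int signal_rd(DirEntry& e, std::size_t requester) {
    int owner = -1;
    if (e.exclusive) {
        for (std::size_t i = 0; i < kMaxNodes; ++i) {
            if (e.sharers.test(i) && i != requester) owner = static_cast<int>(i);
        }
        e.exclusive = false;  // the previous owner keeps a shared copy
    }
    e.sharers.set(requester);
    return owner;
}

// Write request: returns the caches that must receive an invalidation.
std::vector<std::size_t> signal_rdx(DirEntry& e, std::size_t requester) {
    std::vector<std::size_t> targets;
    for (std::size_t i = 0; i < kMaxNodes; ++i) {
        if (e.sharers.test(i) && i != requester) targets.push_back(i);
    }
    e.sharers.reset();
    e.sharers.set(requester);
    e.exclusive = true;
    return targets;
}
```

Only the caches named in the bit vector receive messages, whereas a bus-based protocol makes every other cache snoop each transaction.
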
Protocols supported: FBV (full bit vector)
  State transition for a cache
  State transition for main memory

Sample assignments
  Given MESI and Dragon, implement Firefly
  Given write-through, implement MSI and Firefly
  Given write-through, implement MSI with and without BusUpgr

Sample assignments, cont.
  Given invalidation protocols, implement update protocols
  Given a bus-based MESI, implement directory-based MESI
  Reimplement closely related protocols as a superclass & subclass
  Hybridize two of the protocols, say, invalidation and update

Assignments can study …
  Vary protocol
  Vary cache size
  Vary block size
  Vary associativity
  Vary number of processors (dependent on trace)

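Cache size, block size, and associativity interact through the usual set-associative address split, which a parameter study has to keep consistent. The sketch below shows that arithmetic; `CacheGeometry` and the helper names are hypothetical, and it assumes all three parameters are powers of two.

```cpp
#include <cassert>

typedef unsigned long ulong;

// Hypothetical parameter bundle for one cache configuration.
struct CacheGeometry {
    ulong size_bytes;   // total capacity
    ulong block_bytes;  // block (line) size
    ulong assoc;        // number of ways per set
};

// Number of sets = capacity / (block size * associativity).
ulong num_sets(const CacheGeometry& g) {
    return g.size_bytes / (g.block_bytes * g.assoc);
}

// Set index: the block number modulo the number of sets.
ulong set_index(const CacheGeometry& g, ulong addr) {
    return (addr / g.block_bytes) % num_sets(g);
}

// Tag: the block-number bits above the set index.
ulong tag(const CacheGeometry& g, ulong addr) {
    return (addr / g.block_bytes) / num_sets(g);
}
```

For example, an 8 KB, 4-way cache with 64-byte blocks has 32 sets, so doubling the associativity at fixed size halves the number of sets and moves one bit from the index into the tag.
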
Summary
  Through coding cache actions, students learn how cache coherence really works.
  There are many different assignments you can give.
  You can use the simulator term after term, each time trying something new.
  Provides a good introduction to how architectural innovations are simulated.
  (But in much less detail, so results are quick.)