Spring 2006 1 EE 437 Lillevik 437s06-l21 University of Portland School of Engineering Advanced Computer Architecture Lecture 21 MSP shared cached MSI protocol.

Presentation transcript:

Advanced Computer Architecture, Lecture 21
Topics: MSP shared cached, MSI protocol, MESI protocol

Symmetric multiprocessor (SMP)
[Figure: block diagram with CPUs, memory controller, memory banks, I/O hub/bridge, keyboard, mouse, monitor, BIOS, Ethernet, power supply, cooling fan]
One address space, uniform access time

A simpler model
Each processor has a local cache, one main memory
[Figure: processors P1 through Pn on a shared bus; labels: P1 transactions, Pn transactions, bus transactions]

Coherency through snooping
Controllers monitor bus to manage local cache
[Figure labels: single shared memory, local controller]

Coherency requirements
1. Memory operations occur in the order they were issued
2. All reads return the most current value

Cache coherency solution
Monitor bus to see when things change
Must maintain the “state” of each cache line
  – Modified (as in write-back)
  – Others

Two solutions
Both write-back invalidation: snooped write (another writer) invalidates local copy
MSI protocol
  – Three states
  – Simpler, uses bus a bit more
MESI protocol: most popular
  – Four states
  – Slightly more complex, uses bus less

WB invalidation protocol: MSI
Cache states
  – Modified: dirty, memory inconsistent, local cache has only valid copy, only one CPU in this state
  – Shared: clean, one or more copies, memory consistent
  – Invalid: local data is not current, stale
State transitions
  – Determined by local controller
  – States may vary across caches
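The state definitions above imply an invariant for any one line across all caches: at most one cache holds it Modified, and a Modified copy rules out Shared copies elsewhere. A minimal Python sketch of that check follows. The names `MSI` and `is_coherent` are illustrative and do not come from the lecture.

```python
from enum import Enum


class MSI(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


def is_coherent(states):
    """Check the MSI invariant for one line, given its state in every cache."""
    modified = sum(1 for s in states if s is MSI.MODIFIED)
    shared = sum(1 for s in states if s is MSI.SHARED)
    # Either nobody is Modified, or exactly one cache is and no other copy is valid.
    return modified == 0 or (modified == 1 and shared == 0)


print(is_coherent([MSI.SHARED, MSI.INVALID, MSI.SHARED]))
print(is_coherent([MSI.MODIFIED, MSI.SHARED, MSI.INVALID]))
```

Any number of Shared copies passes the check, which matches "one or more copies" in the Shared definition.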

Processor transactions
Read (PrRd): read instruction
Write (PrWr): write instruction
Misses: if modified, must write back data to memory

Bus transactions
Read (BusRd): memory read
Write (BusWr): memory write
Read Exclusive (BusRdX)
  – Used to request an exclusive copy of data
  – Generated by a PrWr if data Invalid or Shared
  – Data returned may be ignored
Flush: Modified cache performs a WB, resolves inconsistency

Processor hits and misses
Action       Processor   Bus
Read hit     PrRd        none
Write hit    PrWr        none
Read miss    PrRd        BusRd
Write miss   PrWr        BusRdX (invalidates other local copies)
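The mapping from processor requests to bus transactions can be sketched as a small function. This is one reading of the slides, not the lecture's own code: it treats the "write hit, none" row as a hit on a Modified line, because the bus transactions slide says a PrWr to a Shared line generates BusRdX. The function name `bus_request` is hypothetical.

```python
def bus_request(op, state):
    """Return the bus transaction an MSI processor request triggers, or None.

    op is "PrRd" or "PrWr"; state is the line's local state: "M", "S", or "I".
    """
    if state not in ("M", "S", "I"):
        raise ValueError(f"unknown state: {state}")
    if op == "PrRd":
        # Any valid copy (M or S) satisfies a read locally.
        return "BusRd" if state == "I" else None
    if op == "PrWr":
        # Only a Modified line can be written without telling other caches.
        return None if state == "M" else "BusRdX"
    raise ValueError(f"unknown processor operation: {op}")


for op in ("PrRd", "PrWr"):
    for state in ("M", "S", "I"):
        print(op, state, bus_request(op, state))
```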

Bus snooping
Bus      Action
BusRd    Another processor wants to read line
BusRdX   Another processor wants to write line
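How a local controller might react to the two snooped transactions under MSI can be sketched as follows. The function name `snoop` and the `(next_state, must_flush)` return shape are assumptions made for illustration. The transitions follow the state definitions given earlier: a Modified line holds the only valid copy, so it must be flushed before another cache can use the data.

```python
def snoop(state, bus_op):
    """Return (next_state, must_flush) for a snooped MSI bus transaction."""
    if bus_op not in ("BusRd", "BusRdX"):
        raise ValueError(f"unknown bus transaction: {bus_op}")
    if state == "M":
        # Flush the dirty data, then keep a clean copy for a reader
        # or give the line up entirely for a writer.
        return ("S" if bus_op == "BusRd" else "I", True)
    if state == "S" and bus_op == "BusRdX":
        # Another writer invalidates the local copy.
        return ("I", False)
    # Shared seeing a reader, or Invalid seeing anything: nothing changes.
    return (state, False)


print(snoop("M", "BusRd"))
print(snoop("S", "BusRdX"))
```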

MSI protocol
[Figure: MSI state transition diagram showing what the controller observes and the action taken. Legend: bus/snoop generated, processor generated. Edge labels: write miss, read miss, write hit, read hit, another reader, another writer]

Find the coherency?
Action        P1 State   P2 State   P3 State   Bus            Data source
P1 read u     S          I          I          BusRd          mem
P3 read u     S          I          S          BusRd          C1 or mem
P3 writes u   I          I          M          BusRdX/Flush   P3
P1 read u     S          I          S          BusRd          C3
P2 read u     S          S          S          BusRd          C3

MSI performance
Consider single read-write
  – Read results in Shared state, BusRd
  – Write results in Modified state, BusRdX
Unfortunate result
  – Two bus actions
  – Second, BusRdX, not necessary if line is not shared with other processors
  – Suggest a new state, exclusive-clean
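The cost difference can be made concrete by counting bus transactions for a read followed by a write to a line no other cache holds. The sketch below is a simplified model, not the lecture's code: the function name `bus_transactions` is hypothetical, and it assumes the exclusive-clean state lets the write proceed without a bus cycle.

```python
def bus_transactions(protocol):
    """Count bus transactions for a read, then a write, to an unshared line."""
    if protocol not in ("MSI", "MESI"):
        raise ValueError(f"unknown protocol: {protocol}")
    count = 0

    # Read miss: one BusRd either way. With no other sharer,
    # MESI loads the line Exclusive; MSI can only load it Shared.
    count += 1
    state = "E" if protocol == "MESI" else "S"

    # Write: a Shared line needs BusRdX; an Exclusive line upgrades silently.
    if state == "S":
        count += 1
    state = "M"
    return count


print("MSI :", bus_transactions("MSI"))   # 2
print("MESI:", bus_transactions("MESI"))  # 1
```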

WB invalidation protocol: MESI
Improved performance vs. MSI
States
  – Modified: dirty, memory inconsistent, local cache has only valid copy
  – Exclusive (clean): local cache owns, but not written, one CPU in this state
  – Shared: clean, one or more copies, memory consistent
  – Invalid: data is not current, stale

Processor transactions
Read (PrRd): read instruction
Write (PrWr): write instruction
Misses: if modified, must write back data to memory

Bus transactions
Read (BusRd): memory read
Write (BusWr): memory write
Read Exclusive (BusRdX)
Flush: only one need provide WB data
Shared (S): new signal
  – Determines if data already shared
  – Used with BusRd(S) or BusRd(S#)
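The role of the shared signal on a read miss can be sketched as below. This assumes BusRd(S) means the signal was asserted by some other cache and BusRd(S#) means it was not, which is how the notation is used in the exercise tables of this lecture. The function name `read_miss` is hypothetical.

```python
def read_miss(other_states):
    """Return (bus_transaction, new_local_state) for a MESI read miss.

    other_states holds the line's state ("M", "E", "S", or "I")
    in every other cache.
    """
    # Any cache holding a valid copy asserts the shared signal.
    shared = any(s != "I" for s in other_states)
    if shared:
        return ("BusRd(S)", "S")
    # Nobody else has the line, so it can be loaded exclusive-clean.
    return ("BusRd(S#)", "E")


print(read_miss(["I", "I"]))
print(read_miss(["E", "I"]))
```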

MESI protocol
[Figure: MESI state transition diagram showing what the controller observes and the action taken. Legend: bus/snoop generated, processor generated. Edge labels: write hit, read hit, read miss, write miss, another writer, another reader, one cache does WB, no bus cycle]
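The snoop side of MESI can be sketched in the same style as the state list above. This is a simplified model under stated assumptions: the name `mesi_snoop` is hypothetical, and the second return value only reports whether dirty data must be written back. Clean cache-to-cache supply from Exclusive or Shared holders is not modeled.

```python
def mesi_snoop(state, bus_op):
    """Return (next_state, must_write_back) for a snooped MESI transaction."""
    if state not in ("M", "E", "S", "I"):
        raise ValueError(f"unknown state: {state}")
    if bus_op not in ("BusRd", "BusRdX"):
        raise ValueError(f"unknown bus transaction: {bus_op}")
    if state == "I":
        return ("I", False)
    dirty = state == "M"  # only a Modified line differs from memory
    if bus_op == "BusRdX":
        # Another writer: every other valid copy is invalidated.
        return ("I", dirty)
    # Another reader: the line is now shared, so M and E both drop to S.
    return ("S", dirty)


for state in ("M", "E", "S", "I"):
    print(state, mesi_snoop(state, "BusRd"), mesi_snoop(state, "BusRdX"))
```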

Find the coherency?
Action        P1 State   P2 State   P3 State   Bus            Data source
P1 read u     E          I          I          BusRd(S#)      Mem
P3 read u     S          I          S          BusRd(S)       C1 or mem
P3 writes u   I          I          M          BusRdX/Flush   C3 or mem
P1 read u     S          I          S          BusRd(S)
P2 read u     S          S          S


Find the coherency?
Action        P1 State   P2 State   P3 State   Bus           Data source
P1 read u     S          I          I          BusRd         Memory
P3 read u     S          I          S          BusRd         Memory
P3 writes u   I          I          M          BusRdX        P3
P1 read u     S          I          S          BusRd/Flush   P3 cache
P2 read u     S          S          S          BusRd         Memory
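The state and bus columns of this MSI table can be reproduced with a short simulation. The sketch below is an illustration, not the lecture's code: the names `msi_step`, `run_trace`, and `TRACE` are hypothetical, CPU indices are zero-based (P1 is index 0), and the data source column is not modeled.

```python
def msi_step(states, cpu, op):
    """Apply one request to `states` in place; return the bus transaction or None."""
    others = [i for i in range(len(states)) if i != cpu]
    if op == "read":
        if states[cpu] != "I":
            return None  # read hit, no bus transaction
        bus = "BusRd"
        for i in others:
            if states[i] == "M":
                states[i] = "S"  # owner flushes and keeps a clean copy
                bus = "BusRd/Flush"
        states[cpu] = "S"
        return bus
    if op == "write":
        if states[cpu] == "M":
            return None  # write hit on the only valid copy
        bus = "BusRdX"
        for i in others:
            if states[i] == "M":
                bus = "BusRdX/Flush"
            states[i] = "I"  # a snooped writer invalidates the local copy
        states[cpu] = "M"
        return bus
    raise ValueError(f"unknown operation: {op}")


def run_trace(trace, n_caches=3):
    """Run (cpu, op) pairs from all-Invalid; return (states, bus) per step."""
    states = ["I"] * n_caches
    rows = []
    for cpu, op in trace:
        bus = msi_step(states, cpu, op)
        rows.append(("".join(states), bus))
    return rows


# P1 read, P3 read, P3 write, P1 read, P2 read
TRACE = [(0, "read"), (2, "read"), (2, "write"), (0, "read"), (1, "read")]
for row in run_trace(TRACE):
    print(row)
```

Each printed tuple gives the P1, P2, P3 states after the step, followed by the bus transaction issued.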

Find the coherency?
Action        P1 State   P2 State   P3 State   Bus               Data source
P1 read u     E          I          I          BusRd(S#)         Memory
P3 read u     S          I          S          BusRd(S)/Flush    P1 cache
P3 writes u   I          I          M          BusRdX/Flush’     P3
P1 read u     S          I          S          BusRd(S)/Flush    P3 cache
P2 read u     S          S          S          BusRd(S)/Flush’   P1 or P3 cache
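The state columns of this MESI table can likewise be reproduced with a small simulation. This is a sketch under stated assumptions: the names `mesi_step`, `run_mesi_trace`, and `MESI_TRACE` are hypothetical, CPU indices are zero-based, and the Flush and Flush’ annotations and the data source column are omitted, so only the base bus transaction is reported.

```python
def mesi_step(states, cpu, op):
    """Apply one request to `states` in place; return the bus transaction or None."""
    others = [i for i in range(len(states)) if i != cpu]
    if op == "read":
        if states[cpu] != "I":
            return None  # read hit
        # The shared signal is asserted if any other cache holds a valid copy.
        shared = any(states[i] != "I" for i in others)
        for i in others:
            if states[i] in ("M", "E"):
                states[i] = "S"  # the previous sole holder now shares the line
        states[cpu] = "S" if shared else "E"
        return "BusRd(S)" if shared else "BusRd(S#)"
    if op == "write":
        if states[cpu] in ("M", "E"):
            states[cpu] = "M"  # exclusive-clean upgrades with no bus cycle
            return None
        for i in others:
            states[i] = "I"  # a snooped writer invalidates the local copy
        states[cpu] = "M"
        return "BusRdX"
    raise ValueError(f"unknown operation: {op}")


def run_mesi_trace(trace, n_caches=3):
    """Run (cpu, op) pairs from all-Invalid; return (states, bus) per step."""
    states = ["I"] * n_caches
    rows = []
    for cpu, op in trace:
        bus = mesi_step(states, cpu, op)
        rows.append(("".join(states), bus))
    return rows


# P1 read, P3 read, P3 write, P1 read, P2 read
MESI_TRACE = [(0, "read"), (2, "read"), (2, "write"), (0, "read"), (1, "read")]
for row in run_mesi_trace(MESI_TRACE):
    print(row)
```

Running a read followed by a write on a single CPU shows the benefit of the Exclusive state: the write returns `None` because no bus cycle is needed.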