Chapter 5 Part I: Shared Memory Multiprocessors

Presentation transcript:

Chapter 5 Part I: Shared Memory Multiprocessors
- Small multiprocessor
  - Typically uses SMP (symmetric multiprocessor) architecture
  - Shared address space directly supported by the hardware
- Common memory hierarchy configurations: Figure 5.2
  - Shared cache
  - Bus-based SMP: the most common SMP architecture
  - Dancehall
    - Typically uses a MIN (multistage interconnection network)
  - Distributed memory (asymmetric)
    - Shared memory supported through "directory" methods
EECE 550

Cache Coherence
- When a memory location is read, memory should provide the latest value written to that location
- Uniprocessor systems use a memory hierarchy
  - There is no cache coherence problem
- Multiprocessor systems typically have multiple caches
  - Copies of the same data may reside in different caches
  - Potential cache coherence problem

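Example of Cache Coherence Problem
[Figure: processors with private caches share a memory holding U: 5. Events 1 and 2 bring copies of U: 5 into two caches, event 3 writes U = 7 in one cache, and events 4 and 5 are later reads of U by the other processors whose result (U = ?) is in question.]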
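The scenario in the figure can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the textbook's code: the `Cache` class, the processor names `p1`, `p2`, `p3`, and the choice of which processor performs each event are all hypothetical, and the caches are write-back with no coherence protocol at all.

```python
# Toy model (hypothetical): private write-back caches with NO coherence protocol.
memory = {"U": 5}


class Cache:
    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> value held in this private cache

    def read(self, addr):
        # A hit returns the cached copy, even if another cache has changed it.
        if addr not in self.lines:
            self.lines[addr] = memory[addr]  # miss: fetch from main memory
        return self.lines[addr]

    def write(self, addr, value):
        # Write-back: only the local copy changes; memory is not updated yet.
        self.lines[addr] = value


p1, p2, p3 = Cache("P1"), Cache("P2"), Cache("P3")

observed = []
observed.append(p1.read("U"))  # event 1: P1 reads U from memory
observed.append(p3.read("U"))  # event 2: P3 reads U from memory
p3.write("U", 7)               # event 3: P3 writes U = 7 in its own cache
observed.append(p1.read("U"))  # event 4: P1 hits its stale cached copy
observed.append(p2.read("U"))  # event 5: P2 misses and reads stale memory

print(observed)      # every read outside P3 still sees the old value
print(p3.read("U"))  # only P3 sees its own write
```

In this sketch both later reads return the old value: one from a stale cached copy and one from a memory that has not yet received the write.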

Cache Coherency
- Formal definition (bottom of p. 276)
- Informal definition
  - The memory system should "behave" as if all processors obtain all of their data from a single memory store.
- Properties required for cache coherence
  - Write propagation
    - Writes must become visible to all other processes
  - Write serialization
    - All writes to a location (by one or more processes) are seen in the same order by all processes

Bus Snooping
- Concept shown in Figure 5.4
- Requires continuous monitoring of the bus by each cache's controller
- A snooping protocol requires
  - A set of states associated with memory blocks in local caches
  - A state transition diagram, showing the required state changes for a matching block
  - Actions associated with each state transition
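The three ingredients listed above (states, transitions, actions) map naturally onto a lookup table. The sketch below assumes a simple two-state valid/invalid protocol for a write-through, write-no-allocate cache; the event labels `PrRd`, `PrWr`, `BusRd`, `BusWr` and the helper `step` are names chosen for illustration, not taken from the slides.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    VALID = "V"


# (current state, event) -> (next state, bus action to issue, or None)
# "PrRd"/"PrWr" come from the local processor; "BusWr" is snooped from the bus.
TRANSITIONS = {
    (State.INVALID, "PrRd"): (State.VALID, "BusRd"),    # read miss: fetch block
    (State.INVALID, "PrWr"): (State.INVALID, "BusWr"),  # write-no-allocate
    (State.VALID, "PrRd"): (State.VALID, None),         # read hit: no bus traffic
    (State.VALID, "PrWr"): (State.VALID, "BusWr"),      # write-through
    (State.VALID, "BusWr"): (State.INVALID, None),      # another cache wrote: invalidate
}


def step(state, event):
    """Return (next_state, action); events not in the table leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, None))


state, action = step(State.INVALID, "PrRd")  # block becomes VALID, BusRd issued
state, _ = step(state, "BusWr")              # snooped write invalidates the block
```

Keeping the diagram as data makes it easy to compare against a drawn state machine, one table row per arc.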

Uniprocessor Cache Concepts
- Write-through
  - Information is written to both the cache and main memory
- Write-back
  - Information is written to the cache only
  - The modified cache block is tagged as "dirty" and later written to main memory
  - The dirty block is written back when it must be flushed due to block replacement
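The difference between the two policies shows up in how often main memory is written. The following is a minimal sketch: `TinyCache`, `run`, and the address `"X"` are hypothetical, and the cache is assumed to allocate a block on every write.

```python
class TinyCache:
    """Hypothetical one-level cache that counts writes reaching main memory."""

    def __init__(self, memory, policy):
        assert policy in ("write-through", "write-back")
        self.memory = memory
        self.policy = policy
        self.data = {}      # address -> cached value
        self.dirty = set()  # addresses modified but not yet written to memory
        self.memory_writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        if self.policy == "write-through":
            self._write_memory(addr, value)  # memory updated on every write
        else:
            self.dirty.add(addr)             # write-back: just tag as dirty

    def evict(self, addr):
        # Block replacement: a dirty block must be flushed before it is dropped.
        if addr in self.dirty:
            self._write_memory(addr, self.data[addr])
            self.dirty.discard(addr)
        self.data.pop(addr, None)

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1


def run(policy):
    memory = {}
    cache = TinyCache(memory, policy)
    for value in (1, 2, 3):
        cache.write("X", value)
    cache.evict("X")
    return memory["X"], cache.memory_writes


write_through_result = run("write-through")  # final value, memory writes
write_back_result = run("write-back")
```

Both runs leave the same final value in memory, but the write-back run reaches memory once (at eviction) while the write-through run reaches it on every store.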

Possible write miss policies
- Write-allocate
  - Transfer the block to the cache, and then update the value
- Write-no-allocate
  - The block is modified in main memory only
Cache block placement strategies
- Direct-mapped
  - Only one possible location for each memory address
- Fully-associative
  - Data for a given memory address can be stored anywhere in the cache
- Set-associative
  - Data for a given memory address can be stored in a limited set of locations in the cache
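The three placement strategies can be viewed as one scheme with a different number of ways per set. The sketch below assumes the conventional modulo mapping from block address to set index; the function name `candidate_slots` and the 8-block cache are illustrative choices, not from the slides.

```python
def candidate_slots(block_addr, num_blocks, ways):
    """Return the cache slots where a memory block may be placed.

    ways == 1           -> direct-mapped (exactly one slot)
    ways == num_blocks  -> fully-associative (any slot)
    otherwise           -> set-associative (one set of `ways` slots)
    Assumes num_blocks is a multiple of ways and set index = block_addr mod num_sets.
    """
    num_sets = num_blocks // ways
    set_index = block_addr % num_sets
    first_slot = set_index * ways
    return list(range(first_slot, first_slot + ways))


direct = candidate_slots(13, num_blocks=8, ways=1)   # [5]
two_way = candidate_slots(13, num_blocks=8, ways=2)  # [2, 3]
fully = candidate_slots(13, num_blocks=8, ways=8)    # [0, 1, ..., 7]
```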

Bus Snooping
- Write-through cache
  - Snooping is simpler since all writes can be seen on the bus
  - Problems with scaling
    - All writes generate bus traffic
- Figure 5.5: Bus snooping with write-through, write-no-allocate policy
- Suppose that a write-through, write-allocate policy is used
  - How should Figure 5.5 be modified?
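A small simulation can make both points concrete: snooped writes invalidate stale copies, and every write costs a bus transaction. This is a sketch under assumptions, with hypothetical `Bus` and `SnoopyCache` classes, an invalidation-based protocol, one word per block, and a write-through, write-no-allocate policy.

```python
class Bus:
    """Hypothetical shared bus in front of main memory; counts transactions."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []
        self.transactions = 0

    def read(self, addr):
        self.transactions += 1
        return self.memory[addr]

    def write(self, requester, addr, value):
        self.transactions += 1
        self.memory[addr] = value  # write-through: memory is always updated
        for cache in self.caches:
            if cache is not requester:
                cache.snoop_write(addr)  # every other controller sees the write


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.valid = {}  # addr -> value for blocks currently in the Valid state
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.valid:  # Invalid: read miss goes to the bus
            self.valid[addr] = self.bus.read(addr)
        return self.valid[addr]

    def write(self, addr, value):
        if addr in self.valid:  # update the local copy on a hit only
            self.valid[addr] = value
        self.bus.write(self, addr, value)  # no-allocate: a miss stays a miss

    def snoop_write(self, addr):
        self.valid.pop(addr, None)  # invalidate a matching block, if any


memory = {"U": 5}
bus = Bus(memory)
c1, c2, c3 = SnoopyCache(bus), SnoopyCache(bus), SnoopyCache(bus)

c1.read("U")
c3.read("U")
c3.write("U", 7)   # c1's copy is invalidated by the snooped write
r1 = c1.read("U")  # miss, refetches the new value
r2 = c2.read("U")
traffic_before = bus.transactions

for value in (8, 9, 10):
    c3.write("U", value)  # cache hits, yet each one still uses the bus
traffic_after = bus.transactions
```

The last loop illustrates the scaling problem: the writes all hit in `c3`, but bus traffic still grows by one transaction per write.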

Partial Order for Cache Coherence
- Total ordering can be based on partial orders
  - Refer to the middle of p. 282
- Example: Figure 5.6
  - Partial order with write-through invalidation protocol
- Example 5.3

Memory Consistency
- "A memory consistency model … specifies constraints on the order in which memory operations must appear to be performed … with respect to one another." [Culler et al. 1999, p. 285]
- Event synchronization through flags
  - Figure 5.7
- Explicit synchronization using barriers
  - Figure 5.8
- Order among accesses without synchronization
  - Figure 5.9
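Event synchronization through a flag can be sketched with two threads: one writes a value and then sets a flag, the other spins on the flag and then reads the value. The variable names `A` and `flag` and the functions `producer` and `consumer` are assumptions for illustration. The sketch relies on standard CPython executing each thread's stores in program order; on hardware or languages with a relaxed memory model the same pattern needs explicit synchronization.

```python
import threading

shared = {"A": 0, "flag": 0}
seen = []


def producer():
    shared["A"] = 1     # write the data first ...
    shared["flag"] = 1  # ... then signal that it is ready


def consumer():
    while shared["flag"] == 0:  # busy-wait until the flag is set
        pass
    seen.append(shared["A"])    # expects to observe the producer's write to A


consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)
consumer_thread.start()
producer_thread.start()
producer_thread.join()
consumer_thread.join()
```

The consumer's expectation holds only if the write to `A` is visible before the write to `flag`, which is exactly the kind of ordering a memory consistency model has to specify.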

Sequential Consistency
- Values become visible to a process according to some sequential interleaving of the memory accesses of all processes
  - Formal definition p. 286 (referenced from [Lamport 1979])
- Figure 5.10: Programmer's view of sequential consistency
- Note: inter-process synchronization is still required
- Write atomicity: Example 5.4
  - All writes (to any location) should appear to all processors to have occurred in the same order

Sufficient conditions for preserving sequential consistency (p. 289)
- Every process issues memory operations in program order
- After a write is issued, the issuing process waits for the write to complete before issuing its next operation
- After a read operation is issued
  - If the write whose value is being returned has performed with respect to this processor, then the processor should wait until the write has performed with respect to all processors.
- Example 5.5: Re-ordering of memory operations (Figure 5.7)
  - Creates problems for parallel or multithreaded programs
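Why program order matters can be checked by brute force on a two-process flag example. The sketch below enumerates every interleaving that preserves each process's own order against a single shared memory, then repeats the experiment with the writer's two stores swapped. The operation encoding, the helpers `interleavings` and `outcomes`, and the variables `A` and `flag` are assumptions made for illustration.

```python
from itertools import combinations


def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each list's internal order."""
    total = len(p1) + len(p2)
    for positions in combinations(range(total), len(p1)):
        chosen = set(positions)
        merged, i, j = [], 0, 0
        for k in range(total):
            if k in chosen:
                merged.append(p1[i])
                i += 1
            else:
                merged.append(p2[j])
                j += 1
        yield merged


def outcomes(writer, reader):
    """Run each interleaving on one shared memory; collect the values read."""
    results = set()
    for schedule in interleavings(writer, reader):
        mem = {"A": 0, "flag": 0}
        reads = []
        for op, var, value in schedule:
            if op == "write":
                mem[var] = value
            else:
                reads.append(mem[var])
        results.add(tuple(reads))
    return results


writer = [("write", "A", 1), ("write", "flag", 1)]
reader = [("read", "flag", None), ("read", "A", None)]  # yields (flag, A)

in_order = outcomes(writer, reader)
reordered = outcomes(list(reversed(writer)), reader)  # stores issued out of order

print(sorted(in_order))
print(sorted(reordered))
```

With program order preserved, no interleaving lets the reader see `flag == 1` together with `A == 0`; once the two stores are swapped, that outcome appears, which is the failure a flag-based program cannot tolerate.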