Published by Isaiah Wilson. Modified over 11 years ago.
1
Chapter 5 Part I: Shared Memory Multiprocessors
Small multiprocessor
  Typically uses SMP (symmetric multiprocessor) architecture
  Shared address space directly supported by the hardware
Common memory hierarchy configurations: Figure 5.2
  Shared cache
  Bus-based SMP
    Most common SMP architecture
  Dancehall
    Typically uses MIN (multistage interconnection network)
  Distributed memory (asymmetric)
    Shared memory supported through “directory” methods
EECE 550
2
Cache Coherence
When a memory location is read, memory should provide the latest value written to that location
Uniprocessor systems use a memory hierarchy
  There is no cache coherence problem
Multiprocessor systems typically have multiple caches
  Copies of the same data may reside in different caches
  Potential cache coherence problem
3
Example of Cache Coherence Problem
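[Figure: three processor caches sharing one memory that holds U: 5. Numbered events 1 to 5: two reads place U: 5 in caches (events 1, 2), one cache writes U = 7 (event 3), and two later reads ask U = ? (events 4, 5).]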
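The coherence problem in this example can be replayed in a few lines. This is a minimal sketch, assuming private write-back caches with no coherence mechanism at all; the `Cache` class, the names `p1`, `p2`, `p3`, and the assignment of events to processors are hypothetical and not taken from the slides.

```python
# Minimal sketch: private write-back caches with NO coherence mechanism.
memory = {"U": 5}

class Cache:
    def __init__(self):
        self.lines = {}                   # address -> locally cached value

    def read(self, addr):
        if addr not in self.lines:        # miss: fetch from main memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]           # hit: return the local (possibly stale) copy

    def write(self, addr, value):
        self.lines[addr] = value          # write-back: memory is not updated yet

p1, p2, p3 = Cache(), Cache(), Cache()    # assumed processor numbering

p1.read("U")                 # event 1: P1 caches U = 5
p3.read("U")                 # event 2: P3 caches U = 5
p3.write("U", 7)             # event 3: P3 writes U = 7 in its own cache only
seen_by_p1 = p1.read("U")    # event 4: hit on the stale copy
seen_by_p2 = p2.read("U")    # event 5: miss, memory still holds the old value

print(seen_by_p1, seen_by_p2, p3.read("U"))   # 5 5 7
```

Both later readers see the old value 5 even though 7 was written, which is exactly the behavior a coherence protocol has to rule out.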
4
Cache Coherency
Formal definition (bottom of p. 276)
Informal definition
  The memory system should “behave” as if all processors obtain all of their data from a single memory store.
Properties required for cache coherence
  Write propagation
    Writes must become visible to all other processes
  Write serialization
    All writes to a location (by one or more processes) are seen in the same order by ALL processes
5
Bus Snooping
Concept shown in Figure 5.4
Requires continuous monitoring of the bus by each cache’s cache controller
Snooping protocol requires
  A set of states associated with memory blocks in local caches
  A state transition diagram, showing the required state changes for a matching block
  Actions associated with each state transition
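The three ingredients listed above (states, transitions, actions) can be written down as a lookup table. The sketch below assumes a two-state (valid/invalid) write-through, write-no-allocate invalidation protocol; the event names `PrRd`, `PrWr`, `BusRd`, `BusWr` and the helpers `SnoopingCache` and `processor_op` are illustrative choices, not definitions from the slides.

```python
# (state, event) -> (next_state, bus_action); None means no bus transaction.
# Two states per block: "V" (valid) and "I" (invalid or not present).
TRANSITIONS = {
    ("I", "PrRd"):  ("V", "BusRd"),   # read miss: fetch the block
    ("I", "PrWr"):  ("I", "BusWr"),   # write-no-allocate: write goes to memory only
    ("V", "PrRd"):  ("V", None),      # read hit
    ("V", "PrWr"):  ("V", "BusWr"),   # write-through: update cache and memory
    ("V", "BusWr"): ("I", None),      # snooped write by another cache: invalidate
    ("I", "BusWr"): ("I", None),
    ("V", "BusRd"): ("V", None),      # another cache's read needs no action
    ("I", "BusRd"): ("I", None),
}

class SnoopingCache:
    def __init__(self, name):
        self.name = name
        self.state = {}                       # block address -> "V" or "I"

    def step(self, addr, event):
        cur = self.state.get(addr, "I")
        nxt, action = TRANSITIONS[(cur, event)]
        self.state[addr] = nxt
        return action

def processor_op(caches, who, addr, event):
    """Apply a processor event, then let every other cache snoop the bus."""
    action = caches[who].step(addr, event)
    if action is not None:
        for i, c in enumerate(caches):
            if i != who:
                c.step(addr, action)
    return action

caches = [SnoopingCache("P1"), SnoopingCache("P2")]
processor_op(caches, 0, "U", "PrRd")    # P1 reads U: I -> V
processor_op(caches, 1, "U", "PrRd")    # P2 reads U: I -> V
processor_op(caches, 1, "U", "PrWr")    # P2 writes U: P1's copy is invalidated
print(caches[0].state["U"], caches[1].state["U"])   # I V
```

Keeping the protocol in a table makes it easy to compare against a state transition diagram edge by edge.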
6
Uniprocessor Cache Concepts
Write-through
  Information is written to BOTH the cache AND main memory
Write-back
  Information is written to the cache only
  Modified cache block is tagged as “dirty” and later written to main memory
  Dirty block is written back when it needs to be flushed due to block replacement
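The difference between the two write policies shows up in how often main memory is written. The following is a rough sketch that treats each block as a single word; the class names and the `memory_writes` counter are hypothetical, introduced only for illustration.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory, self.lines, self.memory_writes = memory, {}, 0

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value         # every write also goes to main memory
        self.memory_writes += 1

class WriteBackCache:
    def __init__(self, memory):
        self.memory, self.lines = memory, {}
        self.dirty, self.memory_writes = set(), 0

    def write(self, addr, value):
        self.lines[addr] = value          # update the cache only
        self.dirty.add(addr)              # tag the block as dirty

    def evict(self, addr):
        if addr in self.dirty:            # dirty block: write it back on replacement
            self.memory[addr] = self.lines[addr]
            self.memory_writes += 1
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

wt = WriteThroughCache({})
wb = WriteBackCache({})
for v in range(10):                       # ten writes to the same location
    wt.write(0x40, v)
    wb.write(0x40, v)
wb.evict(0x40)                            # replacement forces the write-back
print(wt.memory_writes, wb.memory_writes)   # 10 1
```

Ten writes to one location cost ten memory writes with write-through but only one with write-back, at the price of memory being stale until the block is replaced.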
7
Possible write miss policies
  Write-allocate
    Transfer block to cache, and then update value
  Write-no-allocate
    Block is modified in main memory only
Cache block placement strategies
  Direct-mapped
    Only one possible location for each memory address
  Fully-associative
    Data for a given memory address can be stored anywhere in the cache
  Set-associative
    Data for a given memory address can be stored in a limited set of locations in the cache
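The three placement strategies can be seen as one rule with a different associativity. The sketch below assumes the common modulo mapping from block address to set and a contiguous numbering of cache slots; the function name `candidate_slots` and its parameters are hypothetical.

```python
def candidate_slots(block_addr, num_blocks, ways):
    """Return the cache slots where a memory block may be placed.

    ways == 1           -> direct-mapped
    ways == num_blocks  -> fully associative
    otherwise           -> set-associative
    """
    num_sets = num_blocks // ways
    set_index = block_addr % num_sets     # assumed modulo set mapping
    first = set_index * ways
    return list(range(first, first + ways))

print(candidate_slots(13, 8, 1))   # [5]
print(candidate_slots(13, 8, 2))   # [2, 3]
print(candidate_slots(13, 8, 8))   # [0, 1, 2, 3, 4, 5, 6, 7]
```

For an 8-block cache, block 13 has one possible slot when direct-mapped, two when 2-way set-associative, and all eight when fully associative.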
8
Bus Snooping
Write-through cache
  Snooping is simpler since all writes can be seen on the bus
  Problems with scaling
    All writes generate bus traffic
Figure 5.5: Bus snooping with write-through, write-no-allocate policy
  Suppose that a write-through, write-allocate policy is used
  How should Figure 5.5 be modified?
9
Partial Order for Cache Coherence
Total ordering can be based on partial orders
  Refer to middle of p. 282
Example: Figure 5.6
  Partial order with write-through invalidation protocol
Example 5.3
10
Memory Consistency
“A memory consistency model … specifies constraints on the order in which memory operations must appear to be performed … with respect to one another.” [Culler et al. 1999, p. 285]
Event synchronization through flags
  Figure 5.7
Explicit synchronization using barriers
  Figure 5.8
Order among accesses without synchronization
  Figure 5.9
11
Sequential Consistency
Values become visible to a process according to some sequential interleaving of the memory accesses for all processes
  Formal definition p. 286 (referenced from [Lamport 1979])
Figure 5.10: Programmer’s view of sequential consistency
  Note: inter-process synchronization still required
Write atomicity
  Example 5.4
  All writes (to any location) should appear to all processors to have occurred in the same order
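One way to make "some sequential interleaving" concrete is to enumerate every interleaving of two small programs and collect the results. The sketch below uses a standard two-processor test (each processor writes one variable, then reads the other); the programs `P1` and `P2`, the register names, and the helper functions are illustrative assumptions, not an example from the slides.

```python
from itertools import combinations

def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each program's own order."""
    n = len(p1) + len(p2)
    for slots in combinations(range(n), len(p1)):
        a, b = iter(p1), iter(p2)
        yield [next(a) if i in slots else next(b) for i in range(n)]

def run(schedule):
    mem = {"A": 0, "B": 0}
    regs = {}
    for op, target, arg in schedule:
        if op == "write":
            mem[target] = arg          # target is a memory location
        else:
            regs[target] = mem[arg]    # target is a register, arg a location
    return regs["r1"], regs["r2"]

P1 = [("write", "A", 1), ("read", "r1", "B")]
P2 = [("write", "B", 1), ("read", "r2", "A")]

outcomes = {run(s) for s in interleavings(P1, P2)}
print(sorted(outcomes))   # [(0, 1), (1, 0), (1, 1)]
```

The pair (0, 0) never appears: under sequential consistency at least one of the writes must precede both reads. A machine that does produce (0, 0) for this program is therefore not sequentially consistent.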
12
Sufficient conditions for preserving sequential consistency (p. 289)
Every process issues memory operations in program order
After a write is issued, the issuing process waits for the write to complete before issuing its next operation
After a read operation is issued
  If the write whose value is being returned has performed with respect to this processor, then the processor should wait until the write has performed with respect to all processors
Example 5.5: Re-ordering of memory operations (Figure 5.7)
  Creates problems for parallel or multithreaded programs
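The effect of re-ordering on flag-based synchronization can be checked by brute force. This sketch assumes a producer that writes `A` and then sets `flag`, and a consumer that reads `flag` and then `A`; the consumer's spin loop is modeled by keeping only the runs in which it read `flag == 1`. The variable names and helper functions are hypothetical, chosen for illustration.

```python
from itertools import combinations

def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each list's own order."""
    n = len(p1) + len(p2)
    for slots in combinations(range(n), len(p1)):
        a, b = iter(p1), iter(p2)
        yield [next(a) if i in slots else next(b) for i in range(n)]

def values_of_A_after_flag(producer, consumer):
    """Values the consumer can read for A in runs where it saw flag == 1."""
    seen = set()
    for schedule in interleavings(producer, consumer):
        mem = {"A": 0, "flag": 0}
        regs = {}
        for op, loc, val in schedule:
            if op == "write":
                mem[loc] = val
            else:
                regs[loc] = mem[loc]
        if regs["flag"] == 1:              # the spin loop has exited
            seen.add(regs["A"])
    return seen

consumer  = [("read", "flag", None), ("read", "A", None)]
in_order  = [("write", "A", 1), ("write", "flag", 1)]   # program order
reordered = [("write", "flag", 1), ("write", "A", 1)]   # writes swapped

print(sorted(values_of_A_after_flag(in_order, consumer)))    # [1]
print(sorted(values_of_A_after_flag(reordered, consumer)))   # [0, 1]
```

With the writes kept in program order the consumer always reads 1 after seeing the flag. Once the two writes are swapped, it can see the flag set and still read the old value 0, which is the kind of failure the sufficient conditions are meant to prevent.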