Multiprocessor Highlights

Multiprocessor Highlights: MESI Cache Coherence Protocol, Memory Consistency, ILP and MC
Zhao Zhang, 2003

MESI Protocol
From the local processor's viewpoint, each cache block is in one of four states:
  Modified: Only I have a copy, and the copy has been modified; I must respond to any read/write request for it.
  Exclusive (clean): Only I have a copy, and the copy is clean; no need to inform others about my changes.
  Shared: Someone else may have a copy; I have to inform others about my changes.
  Invalid: The block has been invalidated (possibly at the request of someone else).
Action highlights:
  Read miss on a block: send a read request onto the bus.
  Write miss on a block: send a write request onto the bus.
  Receive a bus read request: transition the block to the Shared state.
  Receive a bus write request: transition the block to the Invalid state.
  The data must be written back when the block leaves the Modified state.
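The state and action lists on the MESI slide can be sketched as a small transition table. This is a minimal illustration and not a full protocol: the names `State`, `on_processor`, `on_bus`, `BusRead` and `BusWrite` are hypothetical, the `others_have_copy` flag stands in for the shared signal a real bus would supply, and a write to a Shared block is assumed to reuse the ordinary write request rather than a separate upgrade message.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_processor(state, op, others_have_copy=False):
    """Handle a local read or write. Returns (next_state, bus_request or None)."""
    if op == "read":
        if state is State.I:
            # Read miss: send a read request onto the bus.
            return (State.S if others_have_copy else State.E, "BusRead")
        return (state, None)  # read hit in M, E or S: nothing to announce
    if op == "write":
        if state in (State.M, State.E):
            # Sole owner: no need to inform others about the change.
            return (State.M, None)
        # Shared or Invalid: others must be told, so a write request goes out.
        return (State.M, "BusWrite")
    raise ValueError(f"unknown operation: {op}")


def on_bus(state, request):
    """Handle a snooped request. Returns (next_state, must_write_back)."""
    if state is State.I:
        return (State.I, False)
    # Data must be written back whenever the block leaves Modified.
    write_back = state is State.M
    if request == "BusRead":
        return (State.S, write_back)
    if request == "BusWrite":
        return (State.I, write_back)
    raise ValueError(f"unknown bus request: {request}")
```

For example, `on_bus(State.M, "BusRead")` yields `(State.S, True)`: the owner supplies the dirty data and keeps a shared copy, while `on_processor(State.E, "write")` moves to Modified without any bus traffic.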

Memory Consistency Model
Defines memory correctness for parallel execution:
  The execution appears to be that of some correct execution of a theoretical parallel computer with n sequential processors.
  In particular, remote writes must appear at a local processor in some correct sequence.
Typical memory consistency models:
  Sequential consistency (SC)
    Memory reads and writes are globally serialized: assume that in every cycle only one processor can proceed by one step, and the result of a write appears at the other processors immediately.
    Processors do not reorder local reads and writes.
    Note: the number of possible sequences is an exponential function of the number of instructions.
  Total store order (TSO)
    Only writes are globally serialized: assume that in every cycle at most one write can proceed, and the result of the write appears immediately.
    Processors may reorder local reads and writes that have no RAW dependence (typically, a read bypasses an earlier write to a different address).
  Processor consistency (PC)
    Writes from one processor appear in the same order on all other processors.
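The "one processor proceeds one step per cycle" picture of sequential consistency can be tried out by brute force. The sketch below is an illustration under assumptions, not part of the original slides: the two-thread program (each thread writes one variable and then reads the other), the tuple encoding of operations, and the names `T0`, `T1`, `interleavings` and `run` are all made up for the example.

```python
from itertools import combinations
from math import comb

# Hypothetical two-thread test program.
# ("w", var, value) writes memory; ("r", var, reg) reads memory into a register.
T0 = [("w", "x", 1), ("r", "y", "r0")]
T1 = [("w", "y", 1), ("r", "x", "r1")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        chosen = set(slots)          # positions in the global order taken by thread a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]


def run(sequence):
    """Execute one global order on a single memory; writes are visible at once."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, var, arg in sequence:
        if kind == "w":
            mem[var] = arg
        else:
            regs[arg] = mem[var]
    return (regs["r0"], regs["r1"])


# All (r0, r1) results that some sequentially consistent execution can produce.
sc_outcomes = {run(s) for s in interleavings(T0, T1)}

# Two threads of k instructions each have comb(2k, k) interleavings.
growth = [comb(2 * k, k) for k in (2, 5, 10)]
```

With this program, `sc_outcomes` contains `(0, 1)`, `(1, 0)` and `(1, 1)` but never `(0, 0)`, because both reads cannot precede both writes in any single global order. The `growth` list shows how quickly the number of candidate sequences rises with instruction count, which is the exponential blow-up the slide notes.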

Memory Consistency and ILP
SC, TSO and PC are strong consistency models (although TSO and PC are already relaxed relative to SC).
Why use weak consistency models (e.g. release consistency)?
  Otherwise, without recovery from speculative execution, every write to shared data may take a full memory access latency. Can a 2 GHz, 4-way issue processor afford 100 ns for every such write?
  Under SC, a read cannot bypass any previous write (even without a RAW dependence).
Strong consistency may work efficiently with speculative execution in ILP processors (PC and TSO in practice; SC can be supported with a speculative cache).
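Two points on this slide lend themselves to a quick check: the cost of stalling on every shared write, and what changes once a read is allowed to bypass an earlier write. The sketch below is a toy model under stated assumptions: the per-processor FIFO store buffer is one common way to picture TSO-style behaviour and is not taken from the slides, and the names `lost_issue_slots` and `run_with_store_buffers` are hypothetical.

```python
def lost_issue_slots(latency_ns, clock_ghz, issue_width):
    """Issue slots that go unused if the processor stalls for the whole latency."""
    cycles = latency_ns * clock_ghz      # ns * (cycles per ns)
    return cycles * issue_width


def run_with_store_buffers(drain_before_reads):
    """Each processor writes one flag, then reads the other processor's flag.

    Writes first enter a private FIFO store buffer. If the buffers are drained
    before the reads, the reads wait for earlier writes (SC-like behaviour);
    otherwise the reads bypass the still-buffered writes.
    """
    mem = {"x": 0, "y": 0}
    buf0, buf1 = [], []

    def drain(buf):
        while buf:
            var, val = buf.pop(0)        # oldest write reaches memory first
            mem[var] = val

    def load(buf, var):
        # A processor sees its own buffered writes (newest first), else memory.
        for v, val in reversed(buf):
            if v == var:
                return val
        return mem[var]

    buf0.append(("x", 1))                # P0: x = 1
    buf1.append(("y", 1))                # P1: y = 1
    if drain_before_reads:
        drain(buf0)
        drain(buf1)
    r0 = load(buf0, "y")                 # P0: r0 = y
    r1 = load(buf1, "x")                 # P1: r1 = x
    drain(buf0)
    drain(buf1)
    return (r0, r1)
```

Here `lost_issue_slots(100, 2, 4)` gives 800: a 100 ns stall is 200 cycles at 2 GHz, or 800 issue slots on a 4-way machine. `run_with_store_buffers(False)` returns `(0, 0)`, a result that no sequentially consistent interleaving of this program can produce, while `run_with_store_buffers(True)` returns `(1, 1)` at the price of waiting for each write to complete.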