Multiprocessor Highlights

Presentation transcript:

Multiprocessor Highlights: MESI Cache Coherence Protocol, Memory Consistency, ILP and MC
Zhao Zhang, 2003

MESI Protocol

From the local processor's viewpoint, each cache block is in one of four states:
- Modified: only I have a copy, and the copy has been modified; I must respond to any read/write request.
- Exclusive-clean: only I have a copy, and the copy is clean; no need to inform others about my changes.
- Shared: someone else may have a copy; I have to inform others about my changes.
- Invalid: the block has been invalidated (possibly at the request of someone else).

Action highlights:
- On a read miss on a block: send a read request onto the bus.
- On a write miss on a block: send a write request onto the bus.
- On receiving a bus read request: transition the block to the shared state.
- On receiving a bus write request: transition the block to the invalid state.
- Data must be written back when transitioning out of the modified state.
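The states and actions above can be sketched as a small transition function. This is a minimal illustration under stated assumptions, not the slide author's implementation: the names State, on_processor_access and on_bus_request, and the request labels "BusRead" and "BusWrite", are hypothetical, and a write to a Shared block is modeled as a plain bus write request (real protocols often use a separate upgrade or invalidate message).

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def on_processor_access(state, is_write, others_have_copy):
    """Return (next_state, bus_request) for a local read or write.

    bus_request is None when the access needs no bus traffic.
    """
    if state is State.INVALID:
        if is_write:
            # Write miss: send a write request onto the bus.
            return State.MODIFIED, "BusWrite"
        # Read miss: send a read request onto the bus.
        next_state = State.SHARED if others_have_copy else State.EXCLUSIVE
        return next_state, "BusRead"
    if is_write:
        if state is State.SHARED:
            # Others may hold a copy, so they must be informed (assumed BusWrite).
            return State.MODIFIED, "BusWrite"
        # Exclusive or Modified: the only copy, so the write is silent.
        return State.MODIFIED, None
    # Read hit in M, E or S: no state change and no bus traffic.
    return state, None


def on_bus_request(state, bus_request):
    """Return (next_state, must_write_back) when snooping another cache's request."""
    if state is State.INVALID:
        return State.INVALID, False
    # Data must be written back whenever the block leaves the Modified state.
    must_write_back = state is State.MODIFIED
    if bus_request == "BusRead":
        return State.SHARED, must_write_back
    if bus_request == "BusWrite":
        return State.INVALID, must_write_back
    raise ValueError(f"unknown bus request: {bus_request!r}")
```

For example, a block held Modified that snoops a bus read moves to Shared and must be written back, while a write to an Exclusive block moves to Modified without any bus request.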

Memory Consistency Model

Defines memory correctness for parallel execution:
- Execution appears to be that of some correct execution of a theoretical parallel computer with n sequential processors.
- In particular, remote writes must appear at a local processor in some correct sequence.

Typical memory consistency models:
- Sequential consistency (SC)
  - Memory reads/writes are globally serialized; assume that in every cycle only one processor can proceed by one step, and the write result appears on other processors immediately.
  - Processors do not reorder local reads and writes.
  - Note: the number of possible sequences is an exponential function of the number of instructions.
- Total store ordering (TSO)
  - Only writes are globally serialized; assume that in every cycle at most one write can proceed, and the write result appears immediately.
  - Processors may reorder local reads/writes that have no RAW dependence.
- Processor consistency (PC)
  - Writes from one processor appear in the same order on all other processors.
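To make the sequential consistency definition concrete, the sketch below enumerates every global sequence of a two-processor program that preserves each processor's local order, then collects the register values each sequence produces. The litmus test (P0, P1, registers r0 and r1) and all function names are assumptions chosen for illustration; they do not come from the slides.

```python
from itertools import combinations
from math import comb

# Hypothetical litmus test: shared x = y = 0 initially.
P0 = [("write", "x", 1), ("read", "y", "r0")]
P1 = [("write", "y", 1), ("read", "x", "r1")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(sequence):
    """Execute one global sequence on a single memory; return (r0, r1)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, addr, arg in sequence:
        if op == "write":
            memory[addr] = arg        # the write is visible to everyone at once
        else:
            regs[arg] = memory[addr]  # the read sees the latest global write
    return regs["r0"], regs["r1"]


def count_interleavings(k):
    """Number of SC sequences for two processors with k instructions each."""
    return comb(2 * k, k)


sc_orders = list(interleavings(P0, P1))
sc_outcomes = {run(seq) for seq in sc_orders}
```

With two instructions per processor there are only 6 sequences, and none of them yields r0 = r1 = 0. That outcome becomes possible once a read is allowed to bypass an earlier local write to a different address, as in TSO. count_interleavings shows how quickly the number of sequences grows: 10 instructions per processor already give 184756 sequences.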

Memory Consistency and ILP

- Sequential consistency, TSO and PC are strong consistency models (though TSO and PC are also classed as relaxed consistency models).
- Why use weak consistency models (e.g. release consistency)?
  - Otherwise, without speculative execution recovery, every write to shared data may take a full memory access latency (can a 2 GHz, 4-way issue processor afford 100 ns for every such write?).
  - For SC, reads cannot bypass any previous write (even without a RAW dependence).
- Strong consistency may work efficiently with speculative execution in ILP processors (PC and TSO in practice; SC can be supported with a speculative cache).
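The cost behind the 100 ns question can be worked out directly. The sketch below is a rough upper bound that assumes the processor stalls completely for the whole write latency and would otherwise sustain its peak issue rate; the function name lost_issue_slots is hypothetical.

```python
def lost_issue_slots(latency_ns, clock_ghz, issue_width):
    """Return (stall_cycles, lost_issue_slots) for one fully exposed write.

    1 GHz is one cycle per nanosecond, so cycles = latency_ns * clock_ghz.
    """
    cycles = latency_ns * clock_ghz
    return cycles, cycles * issue_width


# Figures from the slide: 100 ns latency, 2 GHz clock, 4-way issue.
cycles, slots = lost_issue_slots(100, 2, 4)
```

Under these assumptions each such write costs 200 cycles, or up to 800 instruction issue slots, which is why the latency has to be hidden either by a weaker consistency model or by speculation.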