Shared Memory Consistency Models. SMP systems support shared memory abstraction: all processors see the whole memory and can perform memory operations.


Shared Memory Consistency Models

SMP systems support a shared memory abstraction: all processors see the whole memory and can perform memory operations on all memory locations.
Two key issues in such an architecture:
– Cache coherence: how data values should be propagated among caches and memory. It serializes accesses to one memory location.
– Memory consistency model: a formal specification of memory semantics. It defines the semantics of accesses to ALL memory locations: the timing (the early and late bounds) within which a value in memory (cache + memory) can be propagated to any processor.
The model affects the applicability of many hardware and software optimization techniques.

A Coherent Memory in an SMP System: Intuition
Initially flag1 = flag2 = 0;
P1:                        P2:
flag1 = 1;                 flag2 = 1;
if (flag2 == 0)            if (flag1 == 0)
    critical section           critical section
Can we guarantee that at most one process is in the critical section? This requires ordering memory accesses across different memory locations, which is what the memory consistency model does!!
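
The question on this slide can be checked by brute force. The following is a minimal sketch, assuming a sequentially consistent memory in which every operation takes effect atomically and each process keeps its program order. The operation encoding and the helper names `interleavings` and `run` are illustrative and not part of the slides.

```python
from itertools import combinations

# Each process is a list of operations in program order.
# ("w", var, value) writes a shared variable; ("r", var) reads it.
P1 = [("w", "flag1", 1), ("r", "flag2")]
P2 = [("w", "flag2", 1), ("r", "flag1")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        merged, ia, ib = [], 0, 0
        for i in range(n):
            if i in slots:
                merged.append((1, a[ia]))
                ia += 1
            else:
                merged.append((2, b[ib]))
                ib += 1
        yield merged


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"flag1": 0, "flag2": 0}
    enters = {1: False, 2: False}
    for pid, op in schedule:
        if op[0] == "w":
            memory[op[1]] = op[2]
        else:
            # A process enters the critical section if it reads 0.
            enters[pid] = memory[op[1]] == 0
    return enters[1], enters[2]


# Set of (P1 enters, P2 enters) over all sequentially consistent executions.
outcomes = {run(s) for s in interleavings(P1, P2)}
print(sorted(outcomes))
# prints [(False, False), (False, True), (True, False)]
```

Under this assumption the pair (True, True) never appears, so the two processes are never in the critical section together, although both may fail to enter.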

Coherent Memory in an SMP System: Intuition
Reading a location should return the latest value written by any process.
– “Last” is not well defined: the last write issued to the memory system? The last in the program? The last write in time?
The memory consistency model is concerned with program behavior, so “last” should be in terms of program order.
– In a sequential program: the order of operations in the machine language presented to the processor.
– In multi-threaded programs (those for SMP machines), program order is only defined within a process. We need to make sense of orders across processes.

Formal definition of coherent memory (sequential consistency)
Lamport’s definition: A multiprocessor system is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.

Another formal definition of sequential consistency
Results of a program: the values returned by its read operations.
A memory system is coherent if the results of any execution of a program are such that, for each location, it is possible to construct a hypothetical serial order of all operations to the location that is consistent with the results of the execution and in which:
– Operations issued by any particular process occur in the order issued by that process, and
– The value returned by a read is the value written by the last write to that location in the serial order.
– All processes must see the same hypothetical serial order.

Formal definition of coherent memory
Two necessary features:
– Write propagation: a value written must become visible to all others (instantaneously).
– Write serialization: writes to a location are seen in the same order by all. If one processor sees W1 after W2, no one should see W2 after W1.
There is no need for an analogous read serialization, since a read is not visible to others.
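
The write serialization condition can be phrased as a small check. This is a sketch only: it assumes we are given, for each processor, the order in which it observed the writes to one location, and the function name `writes_serialized` and the labels W1, W2 are illustrative.

```python
from itertools import combinations


def writes_serialized(observed):
    """Return True if no two processors saw two writes in opposite orders.

    observed maps a processor name to the list of writes (to one location)
    in the order that processor saw them. Each write appears at most once.
    """
    for order_a, order_b in combinations(observed.values(), 2):
        pos_b = {w: i for i, w in enumerate(order_b)}
        # Writes seen by both processors, listed in order_a's order.
        common = [w for w in order_a if w in pos_b]
        for w1, w2 in combinations(common, 2):
            # order_a saw w1 before w2; order_b must agree.
            if pos_b[w1] > pos_b[w2]:
                return False
    return True


agree = writes_serialized({"P1": ["W1", "W2"], "P2": ["W1", "W2"]})
disagree = writes_serialized({"P1": ["W1", "W2"], "P2": ["W2", "W1"]})
print(agree, disagree)  # prints True False
```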

Sequential consistency example
P1:                P2:
A = 1              A = 2
B = 2              B = 1
Read A, B          Read A, B
Is it possible for P1 to have A=1, B=2 and P2 to have A=2, B=1?
Is it possible for P1 to have A=1, B=1 and P2 to have A=2, B=2?
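
Both questions can be answered by enumerating every sequentially consistent execution. The sketch below assumes that "Read A, B" means a read of A followed by a read of B, and that A and B start at 0; the encoding and helper names are illustrative.

```python
from itertools import combinations

# ("w", var, value) is a write, ("r", var) is a read, in program order.
P1 = [("w", "A", 1), ("w", "B", 2), ("r", "A"), ("r", "B")]
P2 = [("w", "A", 2), ("w", "B", 1), ("r", "A"), ("r", "B")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        merged, ia, ib = [], 0, 0
        for i in range(n):
            if i in slots:
                merged.append((1, a[ia]))
                ia += 1
            else:
                merged.append((2, b[ib]))
                ib += 1
        yield merged


def run(schedule):
    """Return ((A, B) read by P1, (A, B) read by P2) for one interleaving."""
    memory = {"A": 0, "B": 0}
    seen = {1: {}, 2: {}}
    for pid, op in schedule:
        if op[0] == "w":
            memory[op[1]] = op[2]
        else:
            seen[pid][op[1]] = memory[op[1]]
    return (seen[1]["A"], seen[1]["B"]), (seen[2]["A"], seen[2]["B"])


outcomes = {run(s) for s in interleavings(P1, P2)}
first = ((1, 2), (2, 1)) in outcomes   # P1: A=1, B=2 and P2: A=2, B=1
second = ((1, 1), (2, 2)) in outcomes  # P1: A=1, B=1 and P2: A=2, B=2
print(first, second)  # prints True False
```

Under these assumptions the first outcome is reachable (for example, P2 runs to completion before P1 starts), while the second is not: P1 reading B=1 puts P2's write to B after P1's, and P2 reading B=2 requires the opposite order.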

Sequential consistency examples

Complications in hardware and software support for sequential consistency

Sequential consistency in architectures with caches
– More chances to reorder operations in ways that can violate sequential consistency. E.g. a write-through cache behaves similarly to a write buffer.
– Even if a read hits in the cache, the processor cannot read the cached value until its previous operations in program order are complete!!
– Issues: Detecting when a write is complete needs more transactions. It is hard to make propagation to multiple copies atomic, which makes it more challenging to preserve the program order.

Sequential consistency requirement
Sequential consistency requires:
– Program order requirement: a processor must ensure that its previous memory operation is complete before proceeding with the next memory operation in program order. A write is complete only after all invalidates (or updates) are acked.
– Write atomicity requirement: the value of a write must not be returned by a read until all invalidates are acked.

Sequential consistency requirement
Can we change the order of any of the following sequences?
A = 1        A = 1        = A
B = 2        = B          = B

The program order requirement and the write atomicity requirement of the sequential consistency model make many hardware and compiler optimizations invalid.
– Memory reference order must be strictly enforced.
– Affected optimizations include instruction scheduling, register allocation, etc.

Relaxing program order
The sequential consistency model is too strict.
– It comes from the hardware point of view, trying to deal with the worst-case scenario: program order and write atomicity.
From the software point of view:
– What do we call a threaded program whose threads can potentially read and write the same memory location? Mostly a wrong or non-deterministic program with race conditions.
– Most correct threaded programs do not have race conditions. There is no need to enforce sequential consistency all the time.

Relaxing all program orders
Relaxing all program orders may not be a big deal.
– Between synchronization points, multiple writes, or one write and multiple reads, to the same location → race condition.
– If there is no race condition, sequential consistency can be achieved by completing all memory operations at synchronization points.
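
The race condition rule on this slide can be sketched as a check over the accesses issued between two synchronization points. This is an illustration under assumptions: the input format, the function name `find_races`, and the choice to count only conflicts between different processes are not from the slides.

```python
def find_races(accesses):
    """Return the locations that are racy within one synchronization interval.

    accesses is a list of (process, kind, location) tuples, where kind is
    "r" or "w". A location is racy if two different processes access it and
    at least one of the two accesses is a write.
    """
    by_location = {}
    for process, kind, location in accesses:
        by_location.setdefault(location, []).append((process, kind))

    racy = set()
    for location, ops in by_location.items():
        for i in range(len(ops)):
            for j in range(i + 1, len(ops)):
                (p, kind_p), (q, kind_q) = ops[i], ops[j]
                if p != q and "w" in (kind_p, kind_q):
                    racy.add(location)
    return racy


interval = [
    ("P1", "w", "x"), ("P2", "r", "x"),  # write/read conflict on x
    ("P1", "r", "y"), ("P2", "r", "y"),  # reads only: no conflict
    ("P1", "w", "z"), ("P1", "r", "z"),  # same process: no conflict
]
print(find_races(interval))  # prints {'x'}
```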

Weak ordering
Two types of memory operations: data and synchronization.
– A synchronization operation can only be carried out when all memory operations before it are completed. Hardware support: use a counter to keep track of outstanding memory operations.
– Weak ordering = sequential consistency for programs without race conditions.
– Are the semantics defined for programs with race conditions?
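
The counter-based hardware support mentioned on this slide can be sketched as follows. This is a toy software model, not a description of any real processor; the class `WeakOrderingProcessor` and its method names are hypothetical.

```python
class WeakOrderingProcessor:
    """Toy model: data operations may overlap, synchronization must wait."""

    def __init__(self):
        self.outstanding = 0  # data operations issued but not yet complete
        self.log = []

    def issue_data_op(self, name):
        # Data operations are issued without waiting for earlier ones.
        self.outstanding += 1
        self.log.append(f"issue {name}")

    def complete_data_op(self, name):
        # Completions may arrive in any order.
        self.outstanding -= 1
        self.log.append(f"complete {name}")

    def try_sync(self, name):
        # A synchronization operation proceeds only when the counter is zero.
        if self.outstanding > 0:
            self.log.append(f"stall {name}")
            return False
        self.log.append(f"sync {name}")
        return True


p = WeakOrderingProcessor()
p.issue_data_op("write A")
p.issue_data_op("write B")
stalled = not p.try_sync("unlock")  # two writes still outstanding
p.complete_data_op("write B")       # completes out of order
p.complete_data_op("write A")
released = p.try_sync("unlock")     # counter is zero, sync proceeds
print(stalled, released)            # prints True True
```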

Relaxed memory models (in between)
Relax the program order requirement.
– E.g. a write and a read to different locations.
Relax the write atomicity requirement.
The differences are subtle: each model enables some hardware/software optimizations and prohibits other types of optimizations.

Relax program order
Read/write order for the same address must always be enforced.
Read/write order for different addresses is less important.
– Sometimes it can still be important.
Relax:
– A write to a following read (of a different address).
– A write to a following write.
– A read to a following read or write.
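
To see why the order between different addresses "can still be important", the sketch below relaxes only the write-to-following-read order on a two-flag mutual exclusion test (each process sets its own flag, then enters the critical section if the other flag is 0). It assumes a simplified model in which a relaxed write/read pair may simply take effect in swapped order; `local_orders`, `both_can_enter`, and the other names are illustrative.

```python
from itertools import combinations

P1 = [("w", "flag1", 1), ("r", "flag2")]
P2 = [("w", "flag2", 1), ("r", "flag1")]


def local_orders(prog, relax_write_to_read):
    """Orders in which one process's two operations may take effect."""
    first, second = prog  # this sketch handles two-operation programs only
    orders = [[first, second]]
    if (relax_write_to_read and first[0] == "w" and second[0] == "r"
            and first[1] != second[1]):
        orders.append([second, first])  # the read bypasses the earlier write
    return orders


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        merged, ia, ib = [], 0, 0
        for i in range(n):
            if i in slots:
                merged.append((1, a[ia]))
                ia += 1
            else:
                merged.append((2, b[ib]))
                ib += 1
        yield merged


def run(schedule):
    """Return (P1 enters, P2 enters); a process enters if it reads 0."""
    memory = {"flag1": 0, "flag2": 0}
    enters = {1: False, 2: False}
    for pid, op in schedule:
        if op[0] == "w":
            memory[op[1]] = op[2]
        else:
            enters[pid] = memory[op[1]] == 0
    return enters[1], enters[2]


def both_can_enter(relax_write_to_read):
    """True if some allowed execution puts both processes in the section."""
    for a in local_orders(P1, relax_write_to_read):
        for b in local_orders(P2, relax_write_to_read):
            if any(run(s) == (True, True) for s in interleavings(a, b)):
                return True
    return False


print(both_can_enter(False), both_can_enter(True))  # prints False True
```

With program order enforced, mutual exclusion holds; once a read may bypass an earlier write to a different address, both processes can read 0 and enter together.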

Relax write atomicity
Allow a read to return the value of another processor’s write before the write is complete (visible to all processors).
Allow a read to return the value of the processor’s own write before the write is complete.
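
A small sketch of what a non-atomic write can look like to readers. It assumes a toy model in which every processor has its own view of memory and a write is delivered to each view separately; the four-processor scenario and the class `NonAtomicMemory` are illustrative assumptions, not taken from the slides.

```python
class NonAtomicMemory:
    """Toy model: each processor has its own view of every variable."""

    def __init__(self, processors, variables):
        self.view = {p: {v: 0 for v in variables} for p in processors}

    def deliver(self, var, value, to):
        # Make one write visible to a single processor.
        self.view[to][var] = value

    def read(self, proc, var):
        return self.view[proc][var]


mem = NonAtomicMemory(["P1", "P2", "P3", "P4"], ["A", "B"])

# P1 writes A = 1; so far the write has reached only P1 and P3.
mem.deliver("A", 1, to="P1")
mem.deliver("A", 1, to="P3")
# P2 writes B = 1; so far the write has reached only P2 and P4.
mem.deliver("B", 1, to="P2")
mem.deliver("B", 1, to="P4")

# Both readers return a value from a write that is not yet complete.
p3 = (mem.read("P3", "A"), mem.read("P3", "B"))  # reads A, then B
p4 = (mem.read("P4", "B"), mem.read("P4", "A"))  # reads B, then A
print(p3, p4)  # prints (1, 0) (1, 0)

# The remaining deliveries complete both writes.
for proc in ["P2", "P4"]:
    mem.deliver("A", 1, to=proc)
for proc in ["P1", "P3"]:
    mem.deliver("B", 1, to=proc)
```

P3 sees the write to A but not yet the write to B, while P4 sees the opposite, so the two readers disagree on the order of the writes. No single sequential order explains both results, which is why sequential consistency requires write atomicity.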

Some relaxed models