An Operational Approach to Relaxed Memory Models

Xinyu Feng
University of Science and Technology of China
Joint work with Yang, USTC

Why Memory Models
[Figure: C1 || C2, Compiler, Memory, Result]
Which reads see which writes?

Two different philosophies for RMM
Define behaviors of all programs
  Such as x86-TSO, JMM
  DRF guarantee
  Behaviors of racy programs
    Weak, to incorporate mainstream optimizations
    But not too weak
      Type safety, security, etc. need to prohibit thin-air reads
Define behaviors of only DRF programs
  C/C++11
  But DRF with low-level atomics is difficult to understand

Operational Happens-Before Memory Model (OHMM)
Follows the first philosophy
Motivated by solving some of the problems of JMM
Use an abstract machine to simulate relaxed behaviors
  Memory model defined as operational semantics
(Almost) Avoids thin-air reads
Avoids many surprising behaviors and bugs of JMM
Weak enough to allow common compiler optimizations

Basic settings
Two types of memory cells: normal & volatile
  Volatile read/write roughly corresponds to C++ load-acquire/store-release
  Volatile memory cells cannot be used as normal ones
    Unlike C++
This talk: non-volatile memory only

Design of the abstract machine
Starting from an SC machine
Adding 3 features to relax the program behaviors
  Event buffers
  History-based memory
  Replay of events
Similar to [Yang et al. 2002]
New mechanism for compiler optimizations

The Abstract Machine - SC
[Figure: threads T1 ... Tn, processors, memory]

The Abstract Machine – Event Buffer
[Figure: threads T1 ... Tn, timer (00:00), processors, event buffer holding <<T1, t>, i> and <<Tn, t'>, i'>, memory]

Events and Event Buffer
Instructions are converted to events following the interleaving semantics:
t1: x = 1; r1 = y;
t2: y = 1; r2 = x;
Events: <<t1, 0>, x = 1> <<t2, 1>, y = 1> <<t2, 2>, r2 = x> <<t1, 3>, r1 = y>

Events and Event Buffer
Events from the same thread can be reordered:
t1: x = 1; r1 = y;
t2: y = 1; r2 = x;
Events: <<t1, 0>, x = 1> <<t2, 1>, y = 1> <<t2, 2>, r2 = x> <<t1, 3>, r1 = y>
Execution order: 2, 3, 0, 1
Result: r1 = r2 = 0
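The reordering on this slide can be replayed mechanically. What follows is a minimal Python sketch, not the formal semantics from the talk: the tuple layout, the conflicts rule (two events of one thread on the same location keep their order), and the initial value 0 for x and y are assumptions made for illustration.

```python
from itertools import permutations

# Events in interleaving order: (thread, timestamp, kind, location, value-or-register)
EVENTS = [
    ("t1", 0, "write", "x", 1),
    ("t2", 1, "write", "y", 1),
    ("t2", 2, "read", "x", "r2"),
    ("t1", 3, "read", "y", "r1"),
]

def conflicts(a, b):
    """Assumed rule: same thread and same location means the order must be kept."""
    return a[0] == b[0] and a[3] == b[3]

def allowed(order):
    """An execution order is allowed if no conflicting pair has been swapped."""
    pos = {event[1]: i for i, event in enumerate(order)}
    return all(
        pos[a[1]] < pos[b[1]]
        for a in EVENTS for b in EVENTS
        if a[1] < b[1] and conflicts(a, b)
    )

def run(order):
    """Execute events against a plain memory; x and y are assumed to start at 0."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for _thread, _ts, kind, loc, arg in order:
        if kind == "write":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return regs["r1"], regs["r2"]

by_ts = {event[1]: event for event in EVENTS}
slide_order = [by_ts[ts] for ts in (2, 3, 0, 1)]   # execution order from the slide
slide_result = run(slide_order)
outcomes = {run(p) for p in permutations(EVENTS) if allowed(p)}
```

No thread in this program touches the same location twice, so every permutation is allowed and all four combinations of r1 and r2 over {0, 1} are reachable, including the r1 = r2 = 0 result of the slide.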

Limitation of Event Reordering
Reordering of events is not weak enough for the following program:
t1: x = 1; r1 = x;
t2: x = 2; r2 = x;
r1 = 2, r2 = 1 ?
Reordering is not allowed here due to data dependency!
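To see why reordering alone is not enough here, one can enumerate every execution order that keeps each thread's two accesses to x in program order. This is an illustrative Python sketch over a single-value memory; the event encoding and the initial value 0 are assumptions, not part of the talk.

```python
from itertools import permutations

# (thread, kind, value-or-register); all four events access x
W1, R1 = ("t1", "write", 1), ("t1", "read", "r1")
W2, R2 = ("t2", "write", 2), ("t2", "read", "r2")
EVENTS = [W1, R1, W2, R2]

def keeps_dependencies(order):
    """Each thread's write to x must stay before its own read of x."""
    return order.index(W1) < order.index(R1) and order.index(W2) < order.index(R2)

def run(order):
    x, regs = 0, {}          # x is assumed to start at 0
    for _thread, kind, arg in order:
        if kind == "write":
            x = arg
        else:
            regs[arg] = x
    return regs["r1"], regs["r2"]

outcomes = {run(p) for p in permutations(EVENTS) if keeps_dependencies(p)}
```

Under these assumptions the pair (r1, r2) = (2, 1) never appears: with only one current value per cell, one of the two reads always observes its own thread's write or a value the other read also agrees with.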

The Abstract Machine – History-Based Memory
[Figure: threads T1 ... Tn, timer (00:00), processors, event buffer holding <<T1, t>, i> and <<Tn, t'>, i'>, memory; each memory cell holds an update history]

History-Based Memory
We keep all the write operations in the corresponding memory cell.
Update history of x: <<t1,0>, n1> <<t2,1>, n2> . . . <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>
A read such as <<t1, 8>, r = x> sees either
(1) the most recent write that happens-before it, or
(2) a write from another thread that does not happen-before it.

History-Based Memory
t1: x = 1; r1 = x;
t2: x = 2; r2 = x;
Update history of x: <<t1,1>, 1> <<t2,2>, 2>
Reads: <<t1, 3>, r1 = x> <<t2, 4>, r2 = x>
r1 = 2, r2 = 1 ?
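The read rule for history-based memory can be sketched in a few lines of Python. This is an illustration only: the function name visible_writes is hypothetical, and since the talk covers non-volatile cells only, happens-before is approximated here as "same thread and earlier timestamp", which is an assumption. The elided entries (". . .") of the first history are left out.

```python
from itertools import product

def visible_writes(history, reader, read_ts):
    """Writes a read may see: the latest own earlier write, plus other threads' writes."""
    own = [w for w in history if w[0] == reader and w[1] < read_ts]
    latest_own = [max(own, key=lambda w: w[1])] if own else []
    others = [w for w in history if w[0] != reader]
    return latest_own + others

# History of x from the first slide, entries are (thread, timestamp, value)
FIGURE_HISTORY = [
    ("t1", 0, "n1"), ("t2", 1, "n2"), ("t2", 3, "n3"), ("t1", 4, "n4"),
    ("t2", 5, "n5"), ("t1", 7, "n7"), ("t1", 9, "n9"),
]
figure_values = [v for _, _, v in visible_writes(FIGURE_HISTORY, "t1", 8)]

# History of x from the two-thread example
EXAMPLE_HISTORY = [("t1", 1, 1), ("t2", 2, 2)]
r1_choices = [v for _, _, v in visible_writes(EXAMPLE_HISTORY, "t1", 3)]
r2_choices = [v for _, _, v in visible_writes(EXAMPLE_HISTORY, "t2", 4)]
outcomes = set(product(r1_choices, r2_choices))
```

Under these assumptions the read <<t1, 8>, r = x> may return n7 (its own most recent write) or any of t2's writes, but not n9, and the two-thread example admits r1 = 2, r2 = 1 because each read may pick the other thread's write from the history.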

Support of Compiler Analysis
Still cannot allow the following behavior:
Initially: x = 0, y = 0
t1: r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2?

Compiler Optimization Can Be Smart
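Initially: x = 0, y = 0
t1 (original): r1 = x; r2 = x; if (r1 == r2) y = 2;
t1 (optimized): y = 2; r1 = x; r2 = r1; if (true)
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2?
Redundant read elimination replaces r2 = x with r2 = r1, so the condition is always true and y = 2 no longer depends on the reads.
Must be allowed!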
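The effect of the optimization can be checked by brute force: enumerate every sequentially consistent interleaving of the original and of the optimized program and compare the reachable results. The Python sketch below is an illustration; the statement encoding as small functions and the helper names are assumptions, and the optimized thread is taken to be y = 2; r1 = x; r2 = r1.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps the order inside each list."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Each statement is a function that updates the shared state dictionary.
ORIGINAL_T1 = [
    lambda s: s.update(r1=s["x"]),
    lambda s: s.update(r2=s["x"]),
    lambda s: s.update(y=2) if s["r1"] == s["r2"] else None,
]
OPTIMIZED_T1 = [
    lambda s: s.update(y=2),         # branch is always taken, so the write moves first
    lambda s: s.update(r1=s["x"]),
    lambda s: s.update(r2=s["r1"]),  # redundant read of x eliminated
]
T2 = [
    lambda s: s.update(r3=s["y"]),
    lambda s: s.update(x=s["r3"]),
]

def sc_outcomes(t1, t2):
    """All (r1, r2, r3) results reachable under sequential consistency."""
    results = set()
    for schedule in interleavings(t1, t2):
        state = {"x": 0, "y": 0}
        for stmt in schedule:
            stmt(state)
        results.add((state["r1"], state["r2"], state["r3"]))
    return results

original = sc_outcomes(ORIGINAL_T1, T2)
optimized = sc_outcomes(OPTIMIZED_T1, T2)
```

In this sketch the optimized program reaches r1 = r2 = r3 = 2 even on a sequentially consistent machine, while the original program does not, which is why a memory model that wants to permit the optimization has to allow that result for the original program.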

Support of Compiler Analysis
Still cannot allow the following behavior:
t1: r1 = x; r2 = x; r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2?
Our idea: use dynamic execution to simulate static analysis (or symbolic execution).
Duplicate the first two lines.

The Abstract Machine - Replay
[Figure: threads T1 ... Tn, timer (00:00), processors, replay buffers, event buffer, memory]

Replay Buffer
Instead of rewriting the code, we put an event into the replay buffer when it is executed, so that it can be executed a second time later.
t1: r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
Replay buffer: r1 = x; r2 = x; (replaying these duplicates the first two lines)
r1 = r2 = r3 = 2?
Need to be careful to preserve sequential semantics.
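One way to picture replay is a scripted run of a toy machine that combines a history-based memory with a per-machine replay buffer. This Python sketch is a loose illustration under several assumptions: the class Machine and its methods are hypothetical, happens-before is approximated by "same thread and earlier timestamp", initial values are modeled as writes by a pseudo-thread named init, and the nondeterministic choice of which visible write a read returns is fixed by passing min or max.

```python
class Machine:
    def __init__(self):
        # Initial values are modeled as writes that precede every event.
        self.history = {"x": [("init", -1, 0)], "y": [("init", -1, 0)]}
        self.regs = {}
        self.clock = 0
        self.replay_buffer = []

    def tick(self):
        ts = self.clock
        self.clock += 1
        return ts

    def visible(self, thread, loc, read_ts):
        """Latest earlier write of the reader (or init), plus other threads' writes."""
        own = [w for w in self.history[loc]
               if w[0] in (thread, "init") and w[1] < read_ts]
        others = [w for w in self.history[loc] if w[0] not in (thread, "init")]
        return [max(own, key=lambda w: w[1])] + others

    def do_read(self, thread, loc, reg, ts, choose):
        self.regs[reg] = choose([w[2] for w in self.visible(thread, loc, ts)])

    def read(self, thread, loc, reg, choose):
        ts = self.tick()
        self.do_read(thread, loc, reg, ts, choose)
        self.replay_buffer.append((thread, loc, reg, ts))  # remember for replay

    def replay_reads(self, thread, choose):
        for t, loc, reg, ts in list(self.replay_buffer):
            if t == thread:
                self.do_read(t, loc, reg, ts, choose)      # timestamp unchanged

    def write(self, thread, loc, value):
        self.history[loc].append((thread, self.tick(), value))

m = Machine()
m.read("t1", "x", "r1", min)              # first execution reads 0
m.read("t1", "x", "r2", min)              # first execution reads 0
if m.regs["r1"] == m.regs["r2"]:
    m.write("t1", "y", 2)
m.read("t2", "y", "r3", max)              # t2 observes y = 2
m.write("t2", "x", m.regs["r3"])          # x = 2 enters the history of x
m.replay_reads("t1", max)                 # replayed reads now observe x = 2
branch_still_taken = m.regs["r1"] == m.regs["r2"]
```

In this run the replayed reads end with r1 = r2 = r3 = 2, and the condition r1 == r2 that justified y = 2 still holds afterwards, which is the kind of check the remark about preserving sequential semantics points at.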

Some constraints for replay
When a read is replayed, its timestamp doesn't change
Example: r = x; x = r+1; with r = x replayed: the replayed read cannot see the write x = r+1

When a write is replayed, the old write in the history is overwritten
Recall the update history stored in memory cells: <<t1,0>, n1> <<t2,1>, n2> . . . <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>
Replayed write: <<t1,4>, N>
You won't end up with two events that have the same timestamp but different update values

If a write has been seen by other threads, it cannot be replayed
Use Boolean flags (false, true, . . .) to remember whether each write has been seen (by others)
Update history: <<t1,0>, n1> <<t2,1>, n2> . . . <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>
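The two constraints on replayed writes can be illustrated with a small Python sketch of one memory cell. The names Write, Cell, read_by and replay_write are hypothetical, and the way a reader selects a write (by timestamp) is a simplification chosen to keep the example short.

```python
from dataclasses import dataclass

@dataclass
class Write:
    thread: str
    ts: int
    value: object
    seen: bool = False   # set once a different thread has read this write

class Cell:
    def __init__(self):
        self.history = []

    def write(self, thread, ts, value):
        self.history.append(Write(thread, ts, value))

    def read_by(self, reader, ts):
        """Read the write with timestamp ts; flag it if the reader is another thread."""
        w = next(w for w in self.history if w.ts == ts)
        if w.thread != reader:
            w.seen = True
        return w.value

    def replay_write(self, thread, ts, new_value):
        """Replay a write: overwrite it in place unless another thread has seen it."""
        w = next(w for w in self.history if w.thread == thread and w.ts == ts)
        if w.seen:
            return False          # already observed by others, replay is refused
        w.value = new_value       # same timestamp, old value replaced
        return True

cell = Cell()
cell.write("t1", 4, "n4")
cell.write("t2", 5, "n5")
replaced = cell.replay_write("t1", 4, "N")   # nobody has seen <<t1,4>, n4> yet
cell.read_by("t1", 5)                        # t1 observes t2's write
blocked = cell.replay_write("t2", 5, "M")    # refused: the write has been seen
```

Overwriting in place keeps one entry per timestamp, and the seen flag prevents a thread from changing a value that another thread has already acted on.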

Necessary to prevent the following behavior:

Summary
[Figure: threads T1 ... Tn, timer (00:00), processors, replay buffers, event buffer, memory]

Properties of OHMM
DRF-guarantee
  DRF defined under SC semantics
(Almost) Passes JMM Causality Test Cases
  Except two controversial ones (Test cases 5 and 10)

Test case 5
JMM decides to prohibit it, but the decision was controversial on the mailing list (at least the value 1 does show up in the program).

Properties of OHMM (continued)
Avoids some counter-intuitive/buggy behaviors of JMM [Aspinall & Sevcik, 2007]
Soundness with respect to program transformations

Soundness w.r.t. Prog. Trans.
Results for SC, JMM and JMM-Alt are taken from [Sevcik and Aspinall, ECOOP 2008].
JMM-Alt refers to [Aspinall and Sevcik, TPHOLs 2007], a fixed version of JMM.
A grain of salt: the transformations used for OHMM are defined syntactically, so they are less general than their semantically defined counterparts in [Sevcik and Aspinall, ECOOP 2008].

Summary
OHMM
  Event buffer, history-based memory, and replay
Properties
  Supports DRF guarantee (proved in Coq)
  Weak enough to support compiler optimizations
    Thanks to the replay mechanism
  No out-of-thin-air reads ?
  Avoids surprising behaviors in JMM
  Lockless programs have more relaxed semantics than JMM
  Semantics of locks is stronger
  Adding more synchronization reduces (instead of increases) behaviors

Thank you!
