An Operational Approach to Relaxed Memory Models


Xinyu Feng, University of Science and Technology of China
Joint work with Yang Zhang @ USTC

Why Memory Models
Which reads see which writes?
[Figure: C1 || C2, Compiler, Memory, Result]

Two different philosophies for RMM
Define behaviors of all programs
  Such as x86-TSO, JMM
  DRF guarantee
  Behaviors of racy programs
    Weak, to incorporate mainstream optimizations
    But not too weak: type safety, security, etc. need to prohibit thin-air reads
Define behaviors of only DRF programs
  C/C++11
  But DRF with low-level atomics is difficult to understand

Operational Happens-Before Memory Model (OHMM)
Follows the first philosophy
Motivated by solving some of the problems of JMM
Uses an abstract machine to simulate relaxed behaviors
  Memory model defined as operational semantics
(Almost) avoids thin-air reads
Avoids many surprising behaviors and bugs of JMM
Weak enough to allow common compiler optimizations

Basic settings
Two types of memory cells: normal & volatile
  Volatile read/write roughly corresponds to C++ load-acquire/store-release
  Volatile memory cells cannot be used as normal ones, unlike C++
This talk: normal (non-volatile) memory cells only

Design of the abstract machine
Starting from an SC machine
Adding 3 features to relax the program behaviors
  Event buffers
  History-based memory
  Replay of events
Similar to [Yang et al. 2002]
New mechanism for compiler optimizations

The Abstract Machine - SC
[Figure: one processor per thread (up to Tn), connected to memory]

The Abstract Machine – Event Buffer
[Figure: processors (T1 ... Tn) and a timer; events <<T1, t>, i> and <<Tn, t'>, i'> enter an event buffer placed in front of memory]

Events and Event Buffer
Instructions are converted to events following the interleaving semantics:
t1: x = 1; r1 = y;
t2: y = 1; r2 = x;
Event buffer:
<<t1, 0>, x = 1>
<<t2, 1>, y = 1>
<<t2, 2>, r2 = x>
<<t1, 3>, r1 = y>

Events and Event Buffer
Events from the same thread can be reordered:
t1: x = 1; r1 = y;
t2: y = 1; r2 = x;
Event buffer:
<<t1, 0>, x = 1>
<<t2, 1>, y = 1>
<<t2, 2>, r2 = x>
<<t1, 3>, r1 = y>
Execution order: 2, 3, 0, 1
Result: r1 = r2 = 0
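The event-buffer example can be replayed with a small sketch. This is only an illustration under simplifying assumptions: memory holds a single value per variable, every variable starts at 0, and the names `Event` and `run` are hypothetical helpers, not part of OHMM.

```python
from typing import NamedTuple

class Event(NamedTuple):
    thread: str
    ts: int      # timestamp assigned when the event enters the buffer
    op: str      # "write" or "read"
    var: str
    arg: object  # value for a write, register name for a read

def run(events, order, init=0):
    """Execute buffered events in the given timestamp order.

    Assumption: plain single-valued memory, all variables start at `init`.
    """
    by_ts = {e.ts: e for e in events}
    memory, regs = {}, {}
    for ts in order:
        e = by_ts[ts]
        if e.op == "write":
            memory[e.var] = e.arg
        else:
            regs[e.arg] = memory.get(e.var, init)
    return regs

# The four events from the slide.
EVENTS = [
    Event("t1", 0, "write", "x", 1),
    Event("t2", 1, "write", "y", 1),
    Event("t2", 2, "read", "x", "r2"),
    Event("t1", 3, "read", "y", "r1"),
]

in_order = run(EVENTS, [0, 1, 2, 3])   # timestamp order
reordered = run(EVENTS, [2, 3, 0, 1])  # execution order from the slide
print(in_order, reordered)
```

Executing in timestamp order gives r1 = r2 = 1, while the reordered execution 2, 3, 0, 1 gives r1 = r2 = 0.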

Limitation of Event Reordering
Reordering of events is not weak enough for the following program:
t1: x = 1; r1 = x;
t2: x = 2; r2 = x;
r1 = 2, r2 = 1 ?
Reordering is not allowed due to the data dependency!
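A brute-force check makes the limitation concrete. The sketch below assumes a single-valued memory with x initially 0 and keeps each thread's write before its dependent read; `PROGRAM` and `outcomes` are hypothetical names used only for this illustration.

```python
from itertools import permutations

# Each thread writes x and then reads it back (a data dependency).
PROGRAM = {
    "t1": [("write", "x", 1), ("read", "x", "r1")],
    "t2": [("write", "x", 2), ("read", "x", "r2")],
}

def outcomes(program):
    """Collect (r1, r2) over all execution orders that respect program order.

    Assumption: single-valued memory, x starts at 0.
    """
    events = [(t, i) for t, ops in program.items() for i in range(len(ops))]
    results = set()
    for order in permutations(events):
        pos = {e: k for k, e in enumerate(order)}
        # Skip orders that move a read above the write it depends on.
        if any(pos[(t, i)] > pos[(t, i + 1)]
               for t, ops in program.items()
               for i in range(len(ops) - 1)):
            continue
        memory, regs = {"x": 0}, {}
        for t, i in order:
            op, var, arg = program[t][i]
            if op == "write":
                memory[var] = arg
            else:
                regs[arg] = memory[var]
        results.add((regs["r1"], regs["r2"]))
    return results

RESULTS = outcomes(PROGRAM)
print(sorted(RESULTS))
```

The outcome (2, 1) never appears: with one value per memory cell, whichever write lands last is the only one a later read can see.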

The Abstract Machine – History-Based Memory
[Figure: processors (T1 ... Tn), timer, and event buffer as before; each memory cell now holds an update history such as <<t1,0>, n1> <<t2,1>, n2> ... <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>]

History-Based Memory
We keep all the write operations in the corresponding memory cell.
A read sees (1) the most recent write that happens-before it, or (2) a write from another thread that does not happen-before it.
Update history of x: <<t1,0>, n1> <<t2,1>, n2> ... <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>
Read event: <<t1, 8>, r = x>

History-Based Memory
t1: x = 1; r1 = x;
t2: x = 2; r2 = x;
Update history of x: <<t1,1>, 1> <<t2,2>, 2>
Read events: <<t1, 3>, r1 = x> <<t2, 4>, r2 = x>
r1 = 2, r2 = 1 ?
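The read rule for history-based memory can be sketched as a small function. This is an illustration under a strong simplification: since the talk covers normal cells only, happens-before is approximated here as "same thread and earlier timestamp". `Write` and `visible_writes` are hypothetical names.

```python
from typing import NamedTuple

class Write(NamedTuple):
    thread: str
    ts: int
    value: object

def visible_writes(history, reader, read_ts):
    """Return the writes a read by `reader` at time `read_ts` may see.

    Assumption: a write happens-before the read only if it comes from the
    same thread with an earlier timestamp (no synchronization modeled).
    """
    own_earlier = [w for w in history if w.thread == reader and w.ts < read_ts]
    # (1) the most recent write that happens-before the read
    latest_own = [max(own_earlier, key=lambda w: w.ts)] if own_earlier else []
    # (2) writes from other threads, none of which happen-before the read here
    others = [w for w in history if w.thread != reader]
    return latest_own + others

# Two-thread example: x = 1; r1 = x  ||  x = 2; r2 = x
HISTORY_X = [Write("t1", 1, 1), Write("t2", 2, 2)]
r1_choices = {w.value for w in visible_writes(HISTORY_X, "t1", 3)}
r2_choices = {w.value for w in visible_writes(HISTORY_X, "t2", 4)}

# Longer history from the slide (entries elided there are omitted here).
LONG_HISTORY = [
    Write("t1", 0, "n1"), Write("t2", 1, "n2"), Write("t2", 3, "n3"),
    Write("t1", 4, "n4"), Write("t2", 5, "n5"), Write("t1", 7, "n7"),
    Write("t1", 9, "n9"),
]
seen_at_8 = [w.value for w in visible_writes(LONG_HISTORY, "t1", 8)]
print(r1_choices, r2_choices, seen_at_8)
```

Both reads in the two-thread example may pick either value, so r1 = 2, r2 = 1 becomes possible. For the read <<t1, 8>, r = x>, the sketch yields n7 from t1 itself plus t2's writes n2, n3, n5.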

Support of Compiler Analysis
Still cannot allow the following behavior:
Initially: x = 0, y = 0
t1: r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2?

Compiler Optimization Can Be Smart
Initially: x = 0, y = 0
t1 (original): r1 = x; r2 = x; if (r1 == r2) y = 2;
t1 (after redundant read elimination): y = 2; r1 = x; r2 = r1; if (true)
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2? Must be allowed!
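To see why the optimized program exposes the new behavior, the sketch below enumerates all sequentially consistent interleavings of both versions. It assumes the two-thread split used in the JMM causality tests (the first thread reads x twice and conditionally writes y, the second copies y into x) and that the optimizer hoists `y = 2` above the reads once the branch is known to be true. All function names are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Each step mutates one shared state dict holding variables and registers.
def r1_x(s): s["r1"] = s["x"]
def r2_x(s): s["r2"] = s["x"]
def cond_y(s):
    if s["r1"] == s["r2"]:
        s["y"] = 2
def y_2(s): s["y"] = 2
def r2_r1(s): s["r2"] = s["r1"]
def r3_y(s): s["r3"] = s["y"]
def x_r3(s): s["x"] = s["r3"]

ORIGINAL_T1 = [r1_x, r2_x, cond_y]
OPTIMIZED_T1 = [y_2, r1_x, r2_r1]   # assumed result of the optimization
T2 = [r3_y, x_r3]

def sc_outcomes(t1, t2):
    """Collect (r1, r2, r3) over all SC interleavings, starting from zeros."""
    results = set()
    for schedule in interleavings(t1, t2):
        s = {"x": 0, "y": 0, "r1": 0, "r2": 0, "r3": 0}
        for step in schedule:
            step(s)
        results.add((s["r1"], s["r2"], s["r3"]))
    return results

ORIGINAL = sc_outcomes(ORIGINAL_T1, T2)
OPTIMIZED = sc_outcomes(OPTIMIZED_T1, T2)
print(sorted(ORIGINAL), sorted(OPTIMIZED))
```

Under these assumptions the original program never produces (2, 2, 2) under SC, while the optimized one does. A memory model that admits the optimization therefore has to allow that outcome for the original program.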

Support of Compiler Analysis
Still cannot allow the following behavior:
t1: r1 = x; r2 = x; r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
r1 = r2 = r3 = 2?
Our idea: Use dynamic execution to simulate static analysis (or symbolic execution).
Duplicate the first two lines.

The Abstract Machine - Replay
[Figure: processors (up to Tn) with replay buffers, timer, event buffer, memory]

Replay Buffer
t1: r1 = x; r2 = x; if (r1 == r2) y = 2;
t2: r3 = y; x = r3;
Instead of rewriting code, we put events into the replay buffer when they are executed, so that they can be executed a second time later.
Replay buffer: r1 = x; r2 = x; (duplicates the first two lines)
r1 = r2 = r3 = 2?
Need to be careful to preserve sequential semantics.

Some constraints for replay
When a read gets replayed, its timestamp doesn't change
r = x; r = x; x = r+1; (the replayed read cannot see the write)

Some constraints for replay
When a read gets replayed, its timestamp doesn't change
When a write gets replayed, the old write in the history is overwritten
Recall the update history stored in memory cells: <<t1,0>, n1> <<t2,1>, n2> ... <<t2,3>, n3> <<t1,4>, n4> <<t2,5>, n5> <<t1,7>, n7> <<t1,9>, n9>
Replayed write: <<t1,4>, N>
You won't end up with two events that have the same timestamp but different update values

Some constraints for replay
When a read gets replayed, its timestamp doesn't change
When a write gets replayed, the old write in the history is overwritten
If a write has been seen by other threads, it cannot be replayed
Use Boolean flags to remember whether it has been seen (by others)
[Figure: the update history <<t1,0>, n1> ... <<t1,9>, n9>, each entry annotated with a Boolean flag (false, true, ...)]
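The two write-replay constraints can be sketched on top of an update history. This is a minimal illustration, not the machine's actual definition: `Cell`, `mark_seen`, and `replay_write` are hypothetical names, and returning `False` for a rejected replay is an arbitrary choice made here.

```python
from dataclasses import dataclass

@dataclass
class Write:
    thread: str
    ts: int
    value: object
    seen: bool = False  # Boolean flag: has another thread read this write?

class Cell:
    """One memory cell holding its update history."""

    def __init__(self):
        self.history = []

    def write(self, thread, ts, value):
        self.history.append(Write(thread, ts, value))

    def mark_seen(self, thread, ts, reader):
        # Only reads by *other* threads set the flag.
        for w in self.history:
            if (w.thread, w.ts) == (thread, ts) and reader != thread:
                w.seen = True

    def replay_write(self, thread, ts, value):
        """Overwrite the old entry in place; refuse if others have seen it."""
        for w in self.history:
            if (w.thread, w.ts) == (thread, ts):
                if w.seen:
                    return False   # already observed, cannot be replayed
                w.value = value    # same timestamp, so no duplicate entry
                return True
        return False               # nothing to replay (assumed behavior)

cell = Cell()
cell.write("t1", 4, "n4")
first = cell.replay_write("t1", 4, "N")    # allowed: nobody has seen it yet
cell.mark_seen("t1", 4, reader="t2")
second = cell.replay_write("t1", 4, "N2")  # rejected: t2 has seen the write
print(first, second, cell.history)
```

Because the replay overwrites the entry with timestamp <t1, 4> rather than appending a new one, the history never holds two writes with the same timestamp and different values.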

Necessary to prevent the following behavior:

Summary
[Figure: the complete abstract machine: processors (T1 ... Tn), timer, replay buffers, event buffer, memory]

Properties of OHMM
DRF guarantee
  DRF defined under SC semantics
(Almost) passes the JMM Causality Test Cases
  Except two controversial ones (test cases 5 and 10)

Test case 5
JMM decides to prohibit it, but this was controversial on the mailing list (at least the value 1 does appear in the program).

Properties of OHMM
DRF guarantee
  DRF defined under SC semantics
(Almost) passes the JMM Causality Test Cases
  Except two controversial ones (test cases 5 and 10)
Avoids some counter-intuitive/buggy behaviors of JMM [Aspinall & Sevcik, 2007]

vs.

Properties of OHMM
DRF guarantee
  DRF defined under SC semantics
(Almost) passes the JMM Causality Test Cases
  Except two controversial ones (test cases 5 and 10)
Avoids some counter-intuitive/buggy behaviors of JMM [Aspinall & Sevcik, 2007]
Soundness with respect to program transformations

Soundness w.r.t. Prog. Trans.
Results for SC, JMM and JMM-Alt are taken from [Sevcik and Aspinall, ECOOP 2008]
JMM-Alt refers to [Aspinall and Sevcik, TPHOLs 2007], a fixed version of JMM
A grain of salt: the transformations used for OHMM are defined syntactically, so they are less general than their semantically defined counterparts in [Sevcik and Aspinall, ECOOP 2008]

Summary
OHMM
  Event buffer, history-based memory, and replay
Properties
  Supports DRF guarantee (proved in Coq)
  Weak enough to support compiler optimization
    Thanks to the replay mechanism
  No out-of-thin-air reads?
  Avoids surprising behaviors in JMM
    Lockless programs have more relaxed semantics than JMM
    Semantics of locks is stronger
    Adding more synchronization reduces (instead of increases) behaviors

Thank you!