Java Race Finder Checking Java Programs for Sequential Consistency Tuba Yavuz-Kahveci Fall 2013.

Outline
- The Problem: Getting Multithreaded Java Programs Right
- Java Memory Model
- Our Solution: Java Race Finder
- What is model checking anyway?
- Representing Happens-before
- Heuristic-based Search
- Code Modification Suggestions

What is Sequential Consistency?
- Program statements are executed according to program order
  - Each thread's statements are executed according to the program order in that thread's code
- Write atomicity
  - Each read operation on a variable sees the most recent write operation on that variable

What is a Memory Model?
- Constrains the behavior of memory operations
  - What value can a read operation see?
- Example memory models
  - Sequential Consistency
    - Easy to understand
  - Relaxed Consistency Models
    - Relaxation of program order
    - Relaxation of write atomicity

Who Should Care?
- Programmers
  - Understanding how to achieve sequential consistency, if possible
  - Reasoning about correctness
- Compiler writers
  - Optimizing code within the restrictions of the memory model

Problem: Getting Multi-threaded Java Programs Right
- Important questions any Java programmer should ask
  - Is my multithreaded program correctly synchronized?
    - Beware! Sequential consistency is not guaranteed for incorrectly synchronized Java programs.
  - If my multithreaded program is not correctly synchronized, how can I fix it?
  - If my multithreaded program is not correctly synchronized for a good reason, should I still be worried?
- Automated tool support is needed to answer these nontrivial questions

An Example: Peterson's Mutual Exclusion Algorithm - Version 1
Initialization: flag[0] = flag[1] = turn = shared = 0   /* all non-volatile */
Thread 1:
    s1: flag[0] = 1;
    s2: turn = 1;
    s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
    s4: shared++;   /* critical section */
    s5: flag[0] = 0;
Thread 2:
    s6: flag[1] = 1;
    s7: turn = 0;
    s8: while (flag[0] == 1 && turn == 0) { /*spin*/ }
    s9: shared++;   /* critical section */
    s10: flag[1] = 0;

What is the Java Memory Model (JMM)?
- A relaxed memory model
- Sequential consistency is guaranteed only for correctly synchronized programs
  - That is, for programs without data races
- Incorrectly synchronized programs can show extra behavior that is not sequentially consistent
  - Still subject to some safety rules

Synchronization Rules in Java
- Some synchronization actions and their relationship in Java:
  - Unlocking a monitor lock synchronizes with locking that monitor lock.
  - Writing a volatile variable synchronizes with reading of that variable.
  - Starting a thread synchronizes with the first action of that thread.
  - The final action in a thread synchronizes with any action of a thread that detects termination of that thread.
  - Initialization of a field synchronizes with the first access to the field in every thread.
- In general, a release action synchronizes with a matching acquire action.
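The thread start and termination rules above can be illustrated with a small sketch. The class and field names are hypothetical and not taken from the slides; the fields are deliberately plain (non-volatile), so the only ordering comes from start() and join().

```java
final class StartJoinVisibility {
    // Plain (non-volatile) fields: visibility below relies only on start() and join().
    static int input;
    static int output;

    static int run() {
        input = 20;                         // ordered before the worker's first action by start()
        Thread worker = new Thread(() -> output = input + 1);
        worker.start();
        try {
            worker.join();                  // the worker's final action is ordered before join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;                      // race-free read of the worker's write
    }
}
```

Under these assumptions neither field needs to be volatile, because every access is ordered through the start and join edges.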

Happens-Before Relation
- An action a1 happens-before action a2, a1 ≤hb a2, due to one of the following:
  - a1 comes before a2 according to program order: a1 ≤po a2.
  - a1 synchronizes with a2: a1 ≤sw a2.
  - a1 happens-before a′ that happens-before a2: ∃ a′. a1 ≤hb a′ and a′ ≤hb a2 (transitivity).
- Happens-before, ≤hb = (≤po ∪ ≤sw)+, is a partial order on all actions in an execution.

Happens-before Consistency
- A read operation r can see the result of a write operation w provided that:
  - r does not happen-before w: not (r ≤hb w).
  - There is no intervening write operation: not (∃ w′. w ≤hb w′ ≤hb r).
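The two definitions above can be made concrete with a minimal sketch, assuming actions are numbered 0..n-1 and that the program-order and synchronizes-with edges are supplied by the caller. The class HappensBefore and its methods are hypothetical names for illustration, not part of JRF. The closure is computed with Warshall's algorithm, and the relation is kept strict (an action is not related to itself).

```java
final class HappensBefore {
    private final int n;
    private final boolean[][] hb;   // hb[a][b] == true means a happens-before b

    HappensBefore(int actionCount) {
        n = actionCount;
        hb = new boolean[n][n];
    }

    // Record one program-order or synchronizes-with edge.
    void addEdge(int a, int b) { hb[a][b] = true; }

    // Transitive closure of the recorded edges (Warshall's algorithm).
    void close() {
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (hb[i][k] && hb[k][j]) hb[i][j] = true;
    }

    boolean before(int a, int b) { return hb[a][b]; }

    // Happens-before consistency: read r may see write w if r does not happen-before w
    // and no other write w2 to the same location satisfies w hb w2 hb r.
    boolean canSee(int r, int w, int[] otherWritesToSameLocation) {
        if (hb[r][w]) return false;
        for (int w2 : otherWritesToSameLocation) {
            if (hb[w][w2] && hb[w2][r]) return false;
        }
        return true;
    }
}
```

As a usage example, number the actions 0 = initial write of result, 1 = result = compute(), 2 = volatile write of done, 3 = volatile read of done, 4 = read of result. With the synchronizes-with edge 2 → 3 in place, canSee(4, 0, {1}) is false, so the read can no longer observe the stale initial value.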

Anatomy of a Data Race
- Definition: If two actions a1 and a2 from different threads access the same memory location loc, at least one of the actions is a write, and the actions are not ordered by happens-before, then there is a data race on loc.
- Example:
    Initialization: boolean done = false;   /* non-volatile */
    Thread 1:
        result = compute();
        done = true;          /* result = compute() ≤hb done = true, by program order */
    Thread 2:
        if (done)
            // use result
    Race on done!

A Simple Fix
- A write to a volatile variable synchronizes with a read of that variable.
- Example:
    Initialization: volatile boolean done = false;
    Thread 1:
        result = compute();
        done = true;          /* result = compute() ≤hb done = true, by program order */
    Thread 2:
        if (done)             /* done = true ≤hb if (done), by synchronizes-with */
            // use result
    Not in a race
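A runnable version of this fix might look like the sketch below. It departs from the slide in two labeled ways: compute() is a hypothetical stand-in that returns a constant, and the reader spins on done instead of testing it once, so that the run always reaches the read of result.

```java
final class VolatileFlag {
    static int result;                  // plain field, published through the volatile flag
    static volatile boolean done;       // volatile write/read pair creates the happens-before edge

    // Hypothetical stand-in for the slide's compute().
    static int compute() { return 42; }

    static int runOnce() {
        result = 0;
        done = false;
        int[] seen = new int[1];
        Thread writer = new Thread(() -> {
            result = compute();         // ordered before the volatile write by program order
            done = true;                // volatile write (release)
        });
        Thread reader = new Thread(() -> {
            while (!done) { Thread.yield(); }   // volatile read (acquire); spins until the flag is set
            seen[0] = result;           // ordered after the write to result, so not a race
        });
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If done were not volatile, the spin loop would race with the write and might never terminate; with volatile, the reader is guaranteed to see the computed value once it leaves the loop.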

Our Solutions/Contributions
- Is my multi-threaded program correctly synchronized?
  - Kim K., Yavuz-Kahveci T., Sanders B. Precise Data Race detection in Relaxed Memory Model using Heuristic based Model Checking [ASE Conf. 2009]
- If my multi-threaded program is not correctly synchronized, how can I fix it?
  - Kim K., Yavuz-Kahveci T., Sanders B. JRF-E: Using Model Checking to give Advice on Eliminating Memory Model-related Bugs [ASE Conf. 2010, ASE Journal 2012]
- If my program is not correctly synchronized for a good reason, should I still be worried?
  - Jin H., Yavuz-Kahveci T., Sanders B. Java Path Relaxer: Extending JPF for JMM-aware model checking [JPF Workshop]
  - Jin H., Yavuz-Kahveci T., Sanders B. Java Memory Model-Aware Model Checking [TACAS 2012]

State/Snapshot of a Running Java Program
[Figure: the Java Virtual Machine runs the bytecode for the Java program; a state consists of the values of static fields, the heap (objects), and the thread states.]

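Model Checking Java Programs
[Figure: from one state (values of static fields, heap objects, thread states) the model checker explores different thread schedules, for example Main Thread, Thread1, Thread2, Thread3, …; Main Thread, Thread2, Thread1, Thread3, …; Main Thread, Thread3, Thread2, Thread1, ….]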

Model Checking for Sequential Consistency
[Figure: a multi-threaded Java application is given to Java Race Finder (JRF), built on Java Path Finder (JPF), which answers "Data Race?" with yes or no.]
- Java Path Finder (JPF)
  - a model checker for Java programs
  - checks for general correctness properties
  - assumes sequential consistency
  - explores all possible thread interleavings
- Java Race Finder (JRF)
  - extends JPF's state representation to detect data races
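To give a feel for what "explores all possible thread interleavings" means under sequential consistency, here is a toy sketch that is unrelated to JPF's actual implementation. It enumerates every interleaving of a hypothetical two-thread program (Thread 1: x = 1; r1 = y. Thread 2: y = 1; r2 = x.) and collects the possible final register values.

```java
import java.util.Set;
import java.util.TreeSet;

final class InterleavingExplorer {
    // State layout: s[0] = x, s[1] = y, s[2] = r1, s[3] = r2 (all initially 0).
    private static void step(int thread, int pc, int[] s) {
        if (thread == 0) {
            if (pc == 0) s[0] = 1; else s[2] = s[1];   // Thread 1: x = 1; r1 = y;
        } else {
            if (pc == 0) s[1] = 1; else s[3] = s[0];   // Thread 2: y = 1; r2 = x;
        }
    }

    // Depth-first exploration of every schedule; each leaf is one complete execution.
    private static void explore(int pc0, int pc1, int[] s, Set<String> outcomes) {
        if (pc0 == 2 && pc1 == 2) {
            outcomes.add("r1=" + s[2] + ",r2=" + s[3]);
            return;
        }
        if (pc0 < 2) {
            int[] next = s.clone();
            step(0, pc0, next);
            explore(pc0 + 1, pc1, next, outcomes);
        }
        if (pc1 < 2) {
            int[] next = s.clone();
            step(1, pc1, next);
            explore(pc0, pc1 + 1, next, outcomes);
        }
    }

    static Set<String> outcomes() {
        Set<String> out = new TreeSet<>();
        explore(0, 0, new int[4], out);
        return out;
    }
}
```

Every sequentially consistent schedule ends with at least one register equal to 1. The outcome r1=0,r2=0 never appears in this enumeration, although a relaxed memory model may allow it for a racy program, which is why a checker that assumes sequential consistency needs a separate data race check.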

Our Approach for Detecting Data Races
Algorithm:
    for each execution path EPj = <a1, ..., an> of program P do
        initialize happens-before relation
        for each action ai, i = 1 to n, do
            let loc be the memory location ai accesses
            if (it is not safe (without a data race) for ai to access loc)
                generate DATA RACE error
            execute ai
            update happens-before relation

Representing Happens-Before
- We define an h-function that captures the happens-before relation in an implicit way.
  - h: SyncAddr ∪ Thread → 2^Addr
  - SyncAddr: volatile variables and locks
  - Addr: non-volatile variables
- Is it safe for aj of thread ti to access loc?
  - Does h(ti) contain loc?
- Which variables can be safely accessed if acquire on s (with a matching release on s) is executed?
  - h(s)

The h-function
- Initialization:
  - At the beginning there is only the main thread:
  - h0 = λz. if z = main then static(P) else ∅
- Update:
  - Executing an action updates the h-function:
  - action(t, x) h = h′
    - h: h-function before executing the action
    - t: the thread the action belongs to
    - x: synchronization variable (volatile or a lock)
    - h′: the updated h-function

Updating the h-function
    action an by thread t                           hn+1
    write a volatile field v                        release(t, v) hn
    read a volatile field v                         acquire(t, v) hn
    lock the lock variable lck                      acquire(t, lck) hn
    unlock the lock variable lck                    release(t, lck) hn
    start thread t′                                 release(t, t′) hn
    join thread t′                                  acquire(t, t′) hn
    t′.isAlive() returns false                      acquire(t, t′) hn
    write a non-volatile field x                    invalidate(t, x) hn
    read a non-volatile field x                     hn
    instantiate an object containing non-volatile   new(t, fields, volatiles) hn
      fields "fields" and volatile fields "volatiles"

Action Semantics
- release(t, x) h = h[x → h(t) ∪ h(x)]
  - Variables that can be safely accessed from thread t are copied to the set for synchronization variable x.
- acquire(t, x) h = h[t → h(t) ∪ h(x)]
  - Variables in the set of synchronization variable x can now be safely accessed by thread t.
- invalidate(t, x) h = λz. if (t = z) then h(z) else h(z) \ {x}
  - Only thread t, which changed x, can safely access it.
- new(t, fields, volatiles) h = λz. if (t = z) then h(t) ∪ fields else if (z ∈ volatiles) then {} else h(z)
  - The non-volatile fields of the newly created object can be safely accessed by the thread that created it. The volatile fields are initialized to refer to empty sets.
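The four actions above translate almost directly into set operations. The following is a minimal sketch, assuming threads, synchronization variables, and memory locations are identified by plain strings. It is an illustration of the definitions, not JRF's actual implementation, and the class name HFunction is hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class HFunction {
    // Maps each thread or synchronization variable to the locations it may safely access.
    private final Map<String, Set<String>> h = new HashMap<>();

    // h0: only the main thread exists and it may access all static fields.
    HFunction(Set<String> staticFields) {
        h.put("main", new HashSet<>(staticFields));
    }

    private Set<String> get(String key) {
        return h.computeIfAbsent(key, k -> new HashSet<>());
    }

    // release(t, x): h[x -> h(t) ∪ h(x)]
    void release(String t, String x) { get(x).addAll(get(t)); }

    // acquire(t, x): h[t -> h(t) ∪ h(x)]
    void acquire(String t, String x) { get(t).addAll(get(x)); }

    // invalidate(t, x): remove x from every set except the one of the writing thread t.
    void invalidate(String t, String x) {
        for (Map.Entry<String, Set<String>> e : h.entrySet()) {
            if (!e.getKey().equals(t)) e.getValue().remove(x);
        }
    }

    // new(t, fields, volatiles): creator may access the new fields; volatile fields start empty.
    void newObject(String t, Set<String> fields, Set<String> volatiles) {
        get(t).addAll(fields);
        for (String v : volatiles) h.put(v, new HashSet<>());
    }

    // Safety check performed before each access to a non-volatile location.
    boolean isSafe(String t, String loc) { return get(t).contains(loc); }
}
```

Replaying a short trace shows the idea: after main starts t1 and t2 (two release actions) and t1 writes x (invalidate), isSafe("t2", "x") is false, which is exactly a detected race. If t1 then writes a volatile done (release) and t2 reads it (acquire), x is back in h(t2) and the access is safe.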

Implementation of the h-function

How JRF extends JPF

Test Programs
Columns: Sources, # of examples, # of examples found to have data races
- Textbook by Herlihy and Shavit
- Amino Concurrent Building Blocks Library 109
- Google Concurrent Data Structures Workshop
- Java Grande Forum Benchmark Suite 106
- Webserver Simulator – Student Projects 287

Time Overhead of JRF

Space Overhead of JRF

Finding the data race quickly
[Figure: state space of a program drawn as a tree rooted at the initial state, with two states marked "race". Each path from the initial state to a leaf state represents a separate execution.]

Finding the data race using DFS
[Figure: the same state space, with the DFS counter-example path from the initial state to a race state highlighted.]

Finding the data race using BFS
[Figure: the same state space, with the BFS counter-example path from the initial state to a race state highlighted.]

Heuristic-Based Data Race Search
- Our goal is to reach a state that has a data race as quickly as possible.
- Assign a traversal priority to each program state based on how close it may be to a racy state.
  - Writes-First (WF): Prefer write statements to read statements.
  - Watch-Written (WW): Prefer access to memory locations recently written by another thread.
  - Avoid Release/Acquire (ARA): Avoid scheduling threads that perform proper synchronization.
  - Acquire-First (AF): Prefer acquire operations that do not have a matching release operation.
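One way to picture the priority assignment is to rank the enabled scheduling choices with a priority queue. The sketch below only models Writes-First and Avoid Release/Acquire. The Choice type and the numeric scores are hypothetical, chosen for illustration, and do not reflect the weights JRF uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class RaceHeuristics {
    enum Kind { WRITE, READ, ACQUIRE, RELEASE }

    // One enabled scheduling choice: which thread would run and what its next action is.
    static final class Choice {
        final String thread;
        final Kind kind;
        Choice(String thread, Kind kind) { this.thread = thread; this.kind = kind; }
    }

    // Lower score is explored first. Scores are illustrative assumptions.
    static int score(Choice c) {
        switch (c.kind) {
            case WRITE: return 0;   // Writes-First: writes are tried before reads
            case READ:  return 1;
            default:    return 2;   // Avoid Release/Acquire: synchronizing threads go last
        }
    }

    // Returns thread names in the order a best-first search would try them.
    static List<String> order(List<Choice> enabled) {
        Comparator<Choice> byScore = Comparator.comparingInt(RaceHeuristics::score);
        PriorityQueue<Choice> queue = new PriorityQueue<>(byScore);
        queue.addAll(enabled);
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) result.add(queue.poll().thread);
        return result;
    }
}
```

A full search would apply the same ordering to whole program states rather than to single choices. Ties between equal scores are broken arbitrarily by the queue.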

DFS vs Heuristic Search
DFS Search Path:
    Thread 1:
        s1: flag[0] = 1;
        s2: turn = 1;
        s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
        s4: shared++;   /* critical section */
        s5: flag[0] = 0;
    Thread 2:
        s6: flag[1] = 1;
        s7: turn = 0;
Heuristic Search Path:
    Thread 1:
        s1: flag[0] = 1;
        s2: turn = 1;
    Thread 2:
        s6: flag[1] = 1;
        s7: turn = 0;
Race! turn not in h(thread2)!

Experimental Results: Heuristic Search
Columns: Code (lines of code), Search, State, Length, Time (sec), Memory (MB)
Each program was run with DFS, Heuristic, and BFS search:
- DisBarrier (232)
- Moldyn (1252): BFS >574*
- DEQueue (334)
- BinaryStaticTreeBarrier (1910): BFS >18*
*: JPF ran out of memory

What went wrong?
Thread 1:
    s1: flag[0] = 1;
    s2: turn = 1;      <- source statement: removes turn from h(thread2)
Thread 2:
    s6: flag[1] = 1;
    s7: turn = 0;      <- manifest statement: accesses turn when turn is not in h(thread2)

How to fix it?
- Data races are due to the absence of a happens-before relationship
- Suggest code modifications that will create a happens-before relationship between the source and manifest statements
  - Change the variable to volatile
  - Change the array to an atomic array
  - Move the source statement to make use of existing happens-before relationships due to transitivity
  - Perform the same synchronization
  - Change another variable to volatile to create happens-before relationships due to transitivity

Change to atomic array
Peterson's ME Alg.: turn and flag are volatile (declaring the array flag volatile does not make its elements volatile)
Thread 2:
    s6: flag[1] = 1;   <- source statement: removes flag[1] from h(thread1)
Thread 1:
    s1: flag[0] = 1;
    s2: turn = 1;
    s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }   <- manifest statement: accesses flag[1] when flag[1] is not in h(thread1)
Suggestion: change flag to atomic array
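A sketch of what the suggested change could look like, using java.util.concurrent.atomic.AtomicIntegerArray for flag and a volatile turn. The wrapper class, the lock/unlock helpers, and the iteration count are assumptions added for illustration. The plain field shared stays non-volatile and is protected only by the algorithm.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

final class PetersonAtomic {
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // element access has volatile semantics
    private volatile int turn;
    private int shared;                                                // plain field guarded by the algorithm

    private void lock(int me) {
        int other = 1 - me;
        flag.set(me, 1);
        turn = other;
        while (flag.get(other) == 1 && turn == other) { Thread.yield(); } // spin
    }

    private void unlock(int me) { flag.set(me, 0); }

    // Runs two threads that each increment the shared counter the given number of times.
    static int run(int iterations) {
        PetersonAtomic p = new PetersonAtomic();
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int me = i;
            threads[i] = new Thread(() -> {
                for (int k = 0; k < iterations; k++) {
                    p.lock(me);
                    p.shared++;          // critical section
                    p.unlock(me);
                }
            });
        }
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return p.shared;
    }
}
```

With an int[] declared volatile, only the array reference would be volatile, so reads and writes of flag[i] would still race. AtomicIntegerArray gives each element volatile read and write semantics, which supplies the happens-before edges the algorithm relies on.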

An Example for move source
Initialization: goFlag = false; volatile Data publish;
Thread 1:
    s1: r = new Data();
    s2: publish = r;
    s3: r.setDesc(e);
    s4: goFlag = true;
Thread 2:
    t1: if (publish != null) {
    t2:     while (!goFlag);
    t3:     String s = publish.getDesc();
    t4:     assert(s.equals("e"));
        }
- Thread 1 updates the published object after making the reference visible
- The compiler may reorder s3 and s4
- Thread 2 may use the published object when it is in an inconsistent state

Move source statement
publish is volatile; goFlag is not volatile
Thread 1:
    s1: r = new Data();
    s2: publish = r;
    s3: r.setDesc(e);
    s4: goFlag = true;          <- source statement: removes goFlag from h(thread2)
Thread 2:
    t1: if (publish != null) {
    t2:     while (!goFlag);    <- manifest statement: accesses goFlag when goFlag is not in h(thread2)
Suggestion: move s4 (goFlag = true;) before s2

An Example for perform the same synchronization operation
Initialization: int data; final Object lock = new Object();
Thread 1:
    s1: print (data);
Thread 2:
    t1: synchronized (lock) { /*lock*/
    t2:     data = 1;
    t3: } /*unlock*/
- For every non-volatile variable v, acquireHistory(v) stores the set of safe accesses by thread t via a synchronization operation on s.
- Thread 2's safe access on data is noted as an example behavior.

Perform that synchronized block
data is not volatile
Thread 2:
    t1: synchronized (lock) { /*lock*/
    t2:     data = 1;           <- source statement: removes data from h(thread1)
    t3: } /*unlock*/
Thread 1:
    s0: synchronized (lock) { /*lock*/
    s1:     print (data);       <- manifest statement: accesses data when data is not in h(thread1)
    s2: } /*unlock*/
Suggestion: perform synchronized (lock) to access data (adds s0 and s2)

An Example for change another to volatile
Initialization: int x; boolean done = false;   /* both non-volatile */
Thread 1:
    s1: x = 1;
    s2: done = true;
Thread 2:
    t1: while (!done);
    t2: assert(x == 1);
- Potential data races on both x and done.
- Should we really change both x and done to volatile?
- Can we get away with changing only one?

Change other to volatile
x and done are not volatile
Thread 1:
    s1: x = 1;              <- source statement: removes x from h(thread2)
    s2: done = true;
Thread 2:
    t1: while (!done);
    t2: assert(x == 1);     <- manifest statement: accesses x when x is not in h(thread2)
Suggestion: change done to volatile

JRF-E: Eliminating Data Races
- JRF is configured to produce a threshold number of counter-example paths and write them to a file
- JRF-E works on the output of JRF and analyzes the counter-example paths to generate code modification suggestions
- For each race
  - reports the intersection of suggestions on all the relevant counter-example paths
- For each specific code modification suggestion
  - reports the frequency
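The two reports described above amount to a set intersection per race and a count per suggestion. A minimal sketch follows, assuming each counter-example path has already been reduced to a set of suggestion strings. The class AdviceAggregator and its methods are hypothetical names and do not reflect JRF-E's real data structures.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class AdviceAggregator {
    // Per race: keep only the suggestions that appear on every counter-example path.
    static Set<String> commonAdvice(List<? extends Set<String>> suggestionsPerPath) {
        Iterator<? extends Set<String>> it = suggestionsPerPath.iterator();
        if (!it.hasNext()) return new TreeSet<>();
        Set<String> common = new TreeSet<>(it.next());
        while (it.hasNext()) common.retainAll(it.next());
        return common;
    }

    // Across races: count how many races each suggestion was reported for.
    static Map<String, Integer> frequency(List<? extends Set<String>> advicePerRace) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> advice : advicePerRace) {
            for (String suggestion : advice) counts.merge(suggestion, 1, Integer::sum);
        }
        return counts;
    }
}
```

For example, if one race yields the suggestions "x to volatile" and "done to volatile" and a second race yields only "done to volatile", the frequency report lists the first suggestion once and the second twice.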

JRF-E RESULT
======================================================
data race #1
jrf.hbset.util.HBDataRaceException...
______________________________________________________
analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:64 : "x = 1;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:74: "assert (x==1);"
Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
advice from acquiring history
NONE
======================================================
data race #2
jrf.hbset.util.HBDataRaceException...
______________________________________________________
analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:65 : "done = true;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:73: "while(!done) { /*spin*/ }"
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
advice from acquiring history
NONE
______________________________________________________
frequency of advice
[1times] Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
[2times] Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________
statistic
JRF takes 0:0:1 to find 2 equivalent races with 9 counterexample traces.
JRF-E takes 0:0:0 in 9 races analysis.
Callouts on the slide:
- How did it happen?
- How to fix it?
- feedback on a single race
- feedback on another race
- feedback on all races
- How many times has a suggestion been made, considering all the races?

JRF-E - Analyzing threshold # of races
In all except MCSLock, the right suggestion was made when Threshold <= 10.

Suggestions that worked
Columns: Program, Length, Threshold, # of Racy Fields, Change (other) to volatile, Change to atomic array, Use synchronized block
- DisBarrier 40121
- LockFreeHashSet
- OptimisticList 42131
- MCSLock
- LinearSenseBarrier
- Iterator_EBDeque
- Lufact 19111
- Sor 44121
- Webserver Sim. 68121

Conclusion
- Even experts can benefit from tool support for detecting data races.
- JRF can also analyze synchronization idioms that do not use locking.
- JRF has become an official extension of Java Path Finder.
- JRF-E makes working suggestions for most of the data races in our experiments.
- JRF-E can teach programmers the intricacies of the Java Memory Model.

Thank You Questions?