Dynamic Race Prediction in Linear Time

Presentation transcript:

Dynamic Race Prediction in Linear Time
Dileep Kini, Umang Mathur, Mahesh Viswanathan
University of Illinois at Urbana-Champaign

Debugging concurrent programs is a nightmare
Fundamental challenge: reasoning about all possible interleavings!
Data races, deadlocks, ... (highlighted on the slide: data races)

Data Races and How to Find Them
A data race occurs in an execution when:
  two concurrent threads consecutively access a shared memory location, and
  at least one of the events is a write.
Example trace (columns: Thread t1, Thread t2): 1 acq(l), 2 r(x), 3 w(x), 4 rel(l)

Data Races and How to Find Them
Static race detection: undecidable in general, false positives
Dynamic race detection:
  Lockset based (Eraser): false alarms
  Sound predictive analysis:
    Happens Before (Lamport)
    Causally Precedes (Smaragdakis et al.)
    Maximal Causal Models (Rosu et al.)
Our technique detects more races and scales too!

Predictive Analysis
A given execution may not have a race, but it can provide insight into other possible executions of the same program that do exhibit a race.
Observed trace (columns: Thread t1, Thread t2): 1 w(x), 2 acq(l), 3 r(x), 4 rel(l). No data race, but a predictable data race.
PREDICT
Reordered trace (columns: Thread t1, Thread t2): 1 acq(l), 2 w(x), 3 r(x), 4 rel(l). Data race uncovered.

Predictive Analysis
Any program that generates σ can also generate all of its correct reorderings.
Trace σ’ is a correct reordering of a trace σ if:
  σ’|t is a prefix of σ|t for every thread t
  Critical sections do not overlap
  All reads in σ’ see the same values as in σ: the last w(x) before any r(x) is the same in both σ and σ’
Example traces (columns: Thread t1, Thread t2), marked ❌ on the slide:
  1 acq(l), 2 r(x), 3 rel(l), 4, 5 w(x), 6
  1 w(x), 2 acq(l), 3 r(x), 4 rel(l), 5, 6
Predictive analysis techniques check whether some correct reordering of a given trace σ exhibits a concurrency error, typically using partial orders over the events in σ.
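The three conditions for a correct reordering can be checked mechanically. The following is a minimal sketch, not code from the talk: it assumes events are (thread, op, target) tuples with op in "acq", "rel", "r", "w", that locks are not reentrant, and that every name (is_correct_reordering, last_writes, well_locked) is hypothetical. The example traces assign the w(x) to t1 and the critical section to t2, which is also an assumption.

```python
def last_writes(trace):
    """Map each read, identified by (thread, position within thread),
    to the identity of the write it observes (None if there is none)."""
    pos, last, seen = {}, {}, {}
    for t, op, x in trace:
        k = pos.get(t, 0)
        pos[t] = k + 1
        if op == "w":
            last[x] = (t, k)
        elif op == "r":
            seen[(t, k)] = last.get(x)
    return seen


def well_locked(trace):
    """True if no lock is acquired while held and releases match the holder."""
    holder = {}
    for t, op, x in trace:
        if op == "acq":
            if x in holder:
                return False  # critical sections on x would overlap
            holder[x] = t
        elif op == "rel":
            if holder.pop(x, None) != t:
                return False
    return True


def is_correct_reordering(sigma, sigma_p):
    # Condition 1: each thread's projection of sigma_p is a prefix of sigma's.
    for t in {e[0] for e in sigma_p}:
        proj = [e for e in sigma if e[0] == t]
        proj_p = [e for e in sigma_p if e[0] == t]
        if proj[:len(proj_p)] != proj_p:
            return False
    # Condition 2: critical sections do not overlap.
    if not well_locked(sigma_p):
        return False
    # Condition 3: every read in sigma_p observes the same write as in sigma.
    seen, seen_p = last_writes(sigma), last_writes(sigma_p)
    return all(seen[r] == w for r, w in seen_p.items())


sigma = [("t1", "w", "x"), ("t2", "acq", "l"), ("t2", "r", "x"), ("t2", "rel", "l")]
reordered = [("t2", "acq", "l"), ("t1", "w", "x"), ("t2", "r", "x")]
print(is_correct_reordering(sigma, reordered))
```

Identifying an event by its thread and its position within that thread is what lets the same read be matched across σ and σ’ even though its global position changes.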

Predictive Analysis - HB
Conflicting accesses: a pair of read/write events on the same memory location, performed by different threads, of which at least one is a write.
Order the events in a trace (≤HB):
  1. Events inside a thread are ordered as seen in the trace.
  2. Critical sections on the same lock are ordered as they appear in the trace.
Declare a race when two conflicting accesses are unordered.
Admits an online linear-time algorithm.
Example trace (columns: Thread t1, Thread t2): 1 acq(l), 2 rel(l), 3, 4 r(x), 5, 6 w(x), 7. Data race.
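The two ordering rules and the race check can be sketched with vector clocks. This is a minimal illustration in the style of standard HB detectors, not the implementation evaluated in the talk; the event encoding (thread, op, target) and all names (hb_races, leq, join) are assumptions made for the example.

```python
from collections import defaultdict


def leq(a, b):
    """Pointwise comparison of two vector clocks stored as dicts."""
    return all(v <= b.get(t, 0) for t, v in a.items())


def join(a, b):
    """Pointwise maximum of two vector clocks."""
    return {t: max(a.get(t, 0), b.get(t, 0)) for t in a.keys() | b.keys()}


def hb_races(trace):
    """Return indices of events that are unordered by HB with an earlier
    conflicting access. One pass over the trace."""
    clock = {}                       # thread -> its current vector clock
    lock_vc = defaultdict(dict)      # lock -> clock at its last release
    last_write = defaultdict(dict)   # variable -> clock of writes so far
    last_reads = defaultdict(dict)   # variable -> clock of reads so far
    races = []
    for i, (t, op, x) in enumerate(trace):
        c = clock.setdefault(t, {t: 1})
        if op == "acq":
            # Rule 2: the earlier critical section on x precedes this one.
            clock[t] = join(c, lock_vc[x])
        elif op == "rel":
            lock_vc[x] = dict(c)
            c[t] += 1                # later events of t are not seen by acquirers
        elif op == "r":
            if not leq(last_write[x], c):
                races.append(i)
            last_reads[x][t] = c[t]
        elif op == "w":
            if not (leq(last_write[x], c) and leq(last_reads[x], c)):
                races.append(i)
            last_write[x][t] = c[t]
    return races


unprotected = [("t1", "w", "x"), ("t2", "acq", "l"), ("t2", "r", "x"), ("t2", "rel", "l")]
print(hb_races(unprotected))
```

Rule 1 (thread order) is implicit: each thread's clock only grows, so a thread's own earlier accesses always compare as ordered.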

Predictive Analysis - HB
Trace 1 (columns: Thread t1, Thread t2; rows 1 to 8): 1 acq(l), 2 r(x), 3 w(x), 4 rel(l), 5, 6, 7, 8. No HB race, no predictable race.
Trace 2 (columns: Thread t1, Thread t2; rows 1 to 8): 1 w(y), 2 acq(l), 3 r(x), 4 rel(l), 5, 6, 7, 8 r(y). No HB race, but a predictable race.
HB is too conservative!

Predictive Analysis - CP †
<CP is the smallest transitive relation that orders e1 and e2 if:
  1. e1 = rel(l), e2 = acq(l), and their critical sections have conflicting events
  2. e1 = rel(l), e2 = acq(l), and their critical sections have events ordered by <CP
  3. ∃e3 such that e1 ≤HB e3 <CP e2 or e1 <CP e3 ≤HB e2
Partial order ≤CP = <CP ∪ Thread-order
Declare a race when two conflicting accesses are unordered.
[Diagrams illustrating rules 1, 2, and 3]
† Sound Predictive Race Detection in Polynomial Time, Y. Smaragdakis et al., POPL 2012

Predictive Analysis - CP
Trace 1 (columns: Thread t1, Thread t2; rows 1 to 8): 1 w(y), 2 acq(l), 3 w(x), 4 rel(l), 5, 6 r(x), 7 r(y), 8. No CP race, no predictable race.
Trace 2 (columns: Thread t1, Thread t2; rows 1 to 8): 1 w(y), 2 acq(l), 3 w(x), 4 rel(l), 5, 6 r(y), 7 r(x), 8. No CP race, but a predictable race (w(y), r(y)) that CP misses.

Predictive Analysis - CP
CP misses races
CP does not have a linear time algorithm: it fails to scale to traces with millions of events
  Resort to windowing, which can miss even HB races!
Remedy? WCP to the rescue
  Detects races missed by CP
  Linear running time

Weak Causal Precedence
<WCP is the smallest transitive relation that orders e1 and e2 if:
  1. e1 = rel(l), e2 = r(x)/w(x) inside a critical section of lock l, and e2 conflicts with an event in the critical section of e1
  2. e1 = rel(l), e2 = rel(l), and their critical sections have events ordered by <WCP
  3. ∃e3 such that e1 ≤HB e3 <WCP e2 or e1 <WCP e3 ≤HB e2
Partial order ≤WCP = <WCP ∪ Thread-order
Declare a race when two conflicting accesses are unordered.
[Diagrams illustrating rules 1, 2, and 3]
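To make the three rules concrete, the relation can be computed on a tiny trace by a naive fixpoint. This is only an illustrative sketch with cubic cost per iteration; it is not the linear-time vector clock algorithm from the talk. The event encoding (thread, op, target), the function names, and the thread assignment in the example trace (t1 runs the first four events, t2 the critical section with r(y) and r(x)) are assumptions.

```python
from itertools import product


def conflicts(a, b):
    (ta, opa, xa), (tb, opb, xb) = a, b
    return ta != tb and xa == xb and {opa, opb} <= {"r", "w"} and "w" in (opa, opb)


def happens_before(trace):
    """hb[i][j] is True iff event i ≤HB event j (reflexive)."""
    n = len(trace)
    hb = [[i == j for j in range(n)] for i in range(n)]
    for j in range(n):
        tj, opj, xj = trace[j]
        for i in range(j):
            ti, opi, xi = trace[i]
            if ti == tj or (opi == "rel" and opj == "acq" and xi == xj):
                for k in range(i + 1):
                    if hb[k][i]:
                        hb[k][j] = True
    return hb


def weak_causal_precedence(trace):
    n, hb = len(trace), happens_before(trace)
    held_now, held, sections = {}, [], []
    for i, (t, op, x) in enumerate(trace):
        locks = held_now.setdefault(t, {})       # lock -> index of its acquire
        if op == "acq":
            locks[x] = i
        held.append(set(locks))                  # locks held when event i runs
        if op == "rel":
            a = locks.pop(x)
            body = [j for j in range(a, i + 1) if trace[j][0] == t]
            sections.append((x, body, i))
    wcp = [[False] * n for _ in range(n)]
    # Rule 1: release before a later conflicting access under the same lock.
    for lock, body, rel in sections:
        for j in range(rel + 1, n):
            if lock in held[j] and any(conflicts(trace[b], trace[j]) for b in body):
                wcp[rel][j] = True
    changed = True
    while changed:
        changed = False
        # Rule 2: releases of critical sections that contain WCP-ordered events.
        for (l1, b1, r1), (l2, b2, r2) in product(sections, repeat=2):
            if l1 == l2 and r1 < r2 and not wcp[r1][r2]:
                if any(wcp[a][b] for a in b1 for b in b2):
                    wcp[r1][r2] = changed = True
        # Rule 3: composition with HB on either side.
        for e1, e3, e2 in product(range(n), repeat=3):
            if not wcp[e1][e2] and ((hb[e1][e3] and wcp[e3][e2]) or
                                    (wcp[e1][e3] and hb[e3][e2])):
                wcp[e1][e2] = changed = True
    return wcp


def wcp_races(trace):
    """Pairs (i, j), i < j, of conflicting events unordered by ≤WCP."""
    wcp = weak_causal_precedence(trace)
    n = len(trace)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if conflicts(trace[i], trace[j]) and not wcp[i][j]]


trace = [("t1", "w", "y"), ("t1", "acq", "l"), ("t1", "w", "x"), ("t1", "rel", "l"),
         ("t2", "acq", "l"), ("t2", "r", "y"), ("t2", "r", "x"), ("t2", "rel", "l")]
print(wcp_races(trace))
```

Because conflicting events always belong to different threads, the thread-order part of ≤WCP never decides the race check, so testing <WCP alone is enough in wcp_races.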

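WCP v/s CP
WCP adds fewer orderings than CP, thus allowing for more correct reorderings.
Rule 1 (conflicting events r(x), w(x) in two critical sections on l): CP orders rel(l) before the later acq(l); WCP orders rel(l) only before the later conflicting access.
Rule 2 (ordered events e, e’ in two critical sections on l): CP orders rel(l) before the later acq(l); WCP orders rel(l) before the later rel(l).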

WCP v/s CP
Example trace (columns: Thread t1, Thread t2; rows 1 to 8): w(y), acq(l), w(x), rel(l), r(y), r(x)
Predictable race (w(y), r(y)): missed by CP, caught by WCP.

WCP Soundness
Assumption: nested locking paradigm
WCP is weakly sound: given any trace σ, if σ exhibits a WCP-race (conflicting events unordered by ≤WCP), then σ exhibits a predictable race or a predictable deadlock.
  Any program generating σ can generate an execution σ’ (a correct reordering) exhibiting a race or deadlock.
The proof of soundness is nontrivial. The soundness proof for CP was incorrect.

Vector Clock Algorithm
Assigns a timestamp Ce to each event e, similar to the HB vector clock algorithm
Timestamps are vector times (clocks): thread-indexed vectors supporting operations such as comparison (⊑), join (⊔), and update
Ce ⊑ Ce’ iff e ≤WCP e’, so conflicting events with unordered timestamps imply a WCP race
One-pass online algorithm: detects races as they occur
Processes events as they are generated and updates its internal state (comprising vector clocks and FIFO queues)

Vector Clock Algorithm
Linear running time: O(n), where n is the number of events
Worst-case space requirement: O(n)
  Empirically, the space overhead was observed to be small.
Optimal space usage: any single-pass algorithm for WCP takes Ω(n) space
Optimal time/space tradeoff: for any algorithm computing WCP in time T(n) and space S(n), it must be the case that T(n)·S(n) ∈ Ω(n²)
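As a short worked step, the tradeoff bound can be combined with the linear running time to see why linear worst-case space is unavoidable for a linear-time algorithm. The constants c and d below are illustrative names, not symbols from the slides, and the inequalities are meant for sufficiently large n.

```latex
T(n)\cdot S(n) \ge c\,n^{2}
\quad\text{and}\quad
T(n) \le d\,n
\;\Longrightarrow\;
S(n) \;\ge\; \frac{c\,n^{2}}{T(n)} \;\ge\; \frac{c}{d}\,n ,
\qquad\text{i.e. } S(n) \in \Omega(n).
```

In other words, under the stated bound, reducing the worst-case space below linear would require giving up the linear running time.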

Weak Causal Precedence
WCP detects all races (and deadlocks) detected by CP or HB, and more
WCP admits a linear time algorithm
HB < CP < WCP
CP < HB ≈ WCP

Experimental Evaluation
Table columns (numbered 1 to 15 on the slide): Program, LOC, #Events, #Thrds, #Locks, #Races (WCP, HB, RV, Max), WCP Queue Length (%), Time (WCP, HB, RV). RV is run with windows w=1K, s=60s and w=10K, s=240s.
Rows (cells in slide order; blank cells omitted):
  account 87 130 0.2s 0.3s 1s
  airline 83 128 0.8s 2s
  array 36 47 4.3 1.1s
  boundedbuffer 334 333
  bubblesort 274 4K 2.4 0.7s 0.5s 3.6s 7m3s
  bufwriter 199 11.7M 47s 22.4s 4.1s 4.5s
  critical 63 55 1.7s 0.9s
  mergesort 298 3K 1.3 0.4s 1.4s
  pingpong 124 146 1.2s 1.3s
  moldyn 2.9K 164K 44 7.1s 2.4s 17.4s
  montecarlo 7.2M 23.4s 16.2s 5.7s
  raytracer 16K 14.7s
  derby 302K 1.3M 1112 23 - 0.6 7s 31.2s TO
  eclipse 560K 87M 8263 66 64 0.4 6m51s 4m18s 26.2s 15m10s
  ftpserver 32K 49K 304 2.2 2.1s 3.8s 3m
  jigsaw 101K 3M 280 18s 11.8s 2.8s
  lusearch 410K 216M 118 160 10m13s 6m48s 57.3s 46.7s
  xalan 180K 122M 2494 18 0.1 7m22s 4m46s 43.1s 7m11s
Highlighted columns:
  Column 6, #Races (WCP): 4 2 8 3 7 44 5 23 66 36 14 160 18. Detects more races than other techniques.
  Column 12, Time (WCP): 0.2s 0.3s 0.7s 47s 0.4s 0.5s 7.1s 23.4s 2.4s 16.2s 6m51s 5.7s 18s 10m13s 7m22s. Fast.
  Column 3, #Events: 130 128 47 333 4K 11.7M 55 3K 146 164K 7.2M 16K 1.3M 87M 49K 3M 216M 122M. Scales to large traces.
  Column 11, WCP Queue Length (%): 4.3 2.4 10 1.3 0.6 0.4 2.2 0.1. Low memory overhead.

Conclusions
WCP generalizes the CP relation
  Linear time algorithm
  Detects more races in practice
  Scales to large traces
Future Work
  Further weakening
  Control flow information
  Epoch based optimizations

Thank You !

WCP Deadlock
Trace (columns: Thread t1, Thread t2, Thread t3; rows 1 to 18): 1 acq(l), 2 acq(m), 3 rel(m), 4 r(z), 5 rel(l), 6, 7 acq(n), 8 sync(x), 9 rel(n), 10, 11, 12, 13, 14 w(z), 15, 16 sync(y), 17, 18
sync(x) is shown expanded as: acq(xlock) r(xvar) w(xvar) rel(xlock)
WCP race, but no predictable race.
Predictable deadlock (lock acquisitions highlighted on the slide: acq(l), acq(m), acq(n)).