Dynamic Data Race Detection

Sources
–Eraser: A Dynamic Data Race Detector for Multithreaded Programs. Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, Thomas Anderson. ACM Transactions on Computer Systems, Vol. 15, No. 4, November 1997.
–RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking. Yuan Yu, Tom Rodeheffer, Wei Chen. Proceedings of SOSP ’05, copyright 2005 ACM.

The Shared Problem
Problem: data race detection in multithreaded programs (which implies shared memory).
Solution: a tool that automates the detection of potential data races.
–Each paper describes a different method or technique
Basic idea: look for “unprotected” accesses to shared variables.
Why it matters: synchronization errors caused by data races are
–Timing dependent
–Hard to find

Data Race “A data race occurs when two concurrent threads access a shared variable and … –at least one access is a write and –the threads use no explicit mechanism to prevent the accesses from being simultaneous” In other words, a data race can lead to a potential violation of mutual exclusion.

Data Race Example: threads with unsynchronized access to a shared array
Thread 1
    int i;
    …
    for (i = 1; i < MAX; i++) {
        cin >> x;
        A[i] = 2*x;
    }
    …
Thread 2
    int i;
    …
    for (i = 1; i < MAX; i++) {
        if (A[i] < B[i])
            B[i] = A[i];
    }
    …

Static Data Race Detection
Static data race detection can be done at compile time.
–Type-based methods: a language-level approach
–Path analysis of code: a compile-time approach
Drawbacks:
–Hard to apply to dynamically allocated data
–Doesn’t scale well to large programs
–Many false positives, because it’s hard to reason about execution behavior

Dynamic Data Race Detection Dynamic detection is done by code that monitors the software during execution. –The program may be “instrumented” with additional instructions –The additions don’t change program functionality but are used to monitor conditions of interest - in this case, access to shared variables and synchronization operations.

Dynamic Detection
Post-mortem or on-the-fly analysis of code traces
Problems:
–Can only check paths that are actually executed
–Adds overhead at runtime
Techniques:
–Happens-before (earliest dynamic technique)
–Lockset analysis (Eraser)
–Various hybrids (RaceTrack)

Dynamic Race Detection Using happens-before Definition of happens-before for data race detection uses accesses to synchronization objects (locks) to synchronize separate threads. –Compare to use of messages to synchronize separate processes in previous applications. In a single thread, happens-before reflects the temporal order of event occurrence (as always).

Happens-before Relation Between threads, events can be causally connected when a lock is accessed in a thread (A) and the next access to that lock is in a different thread B (lock access replaces message exchange) Accesses must obey the semantics of locks: –only the owner of a lock can unlock it, –two threads can’t hold the same lock simultaneously.

Happens-before Relation
Let event a be in thread A and event b be in thread B.
–If a = unlock(mu) and b = lock(mu), then a → b (a happens-before b)
Data races between threads are possible if accesses to shared variables are not ordered by happens-before.

EXAMPLE: Fig. 1
Thread 1
    lock(mu);
    v = v + 1;
    unlock(mu);
Thread 2
    lock(mu);
    v = v + 1;
    unlock(mu);
The arrows in the figure represent happens-before. The events represent an actual execution of the two threads. Instead of a logical clock, each thread might maintain a “most recent event” variable. In T1, the most recent event is unlock(mu); when T2 executes lock(mu), the system can establish the happens-before relation.

EXAMPLE: Fig. 2
Thread 1
    y = y + 1;
    lock(mu);
    v = v + 1;
    unlock(mu);
Thread 2
    lock(mu);
    v = v + 1;
    unlock(mu);
    y = y + 1;
In the execution shown, where Thread 1 runs first, accesses to both y and v are ordered by happens-before, so no data race occurred. But a different execution ordering could give different results. Happens-before only detects a data race if the unordered accesses show up in the execution trace.

EXAMPLE: Fig. 2
Thread 1
    y = y + 1;
    lock(mu);
    v = v + 1;
    unlock(mu);
Thread 2
    lock(mu);
    v = v + 1;
    unlock(mu);
    y = y + 1;
If Thread 2 executes before Thread 1, happens-before no longer holds between the two accesses to y, so a data race is possible and should be reported to the programmer. Calling those two accesses a and b, they are “concurrent”, since neither a → b nor b → a.
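
The ordering check in the Fig. 2 example can be sketched in Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular tool: it uses vector clocks rather than the “most recent event” variable mentioned earlier, it treats every access as a write, and the function name `unordered_accesses` and the `(thread, op, target)` trace format are hypothetical.

```python
def unordered_accesses(trace):
    """Return the variables whose accesses are not ordered by happens-before.

    trace is one observed execution: a list of (thread, op, target) tuples,
    where op is "lock", "unlock", or "access". Every access is treated as a
    write, which matches the read-modify-write statements in Fig. 2.
    """
    clocks = {}       # thread -> vector clock {thread: logical time}
    released = {}     # lock -> vector clock of the thread at its last unlock
    last_access = {}  # variable -> {thread: local time of its latest access}
    racy = set()
    for thread, op, target in trace:
        vc = clocks.setdefault(thread, {thread: 1})
        if op == "lock":
            # unlock(mu) -> lock(mu): inherit everything the releaser had seen.
            for other, time in released.get(target, {}).items():
                vc[other] = max(vc.get(other, 0), time)
        elif op == "unlock":
            released[target] = dict(vc)
            vc[thread] += 1  # later events here are not covered by this unlock
        else:
            seen = last_access.setdefault(target, {})
            for other, time in seen.items():
                if other != thread and time > vc.get(other, 0):
                    racy.add(target)  # the earlier access is not ordered before this one
            seen[thread] = vc[thread]
    return racy


T1 = [("T1", "access", "y"), ("T1", "lock", "mu"),
      ("T1", "access", "v"), ("T1", "unlock", "mu")]
T2 = [("T2", "lock", "mu"), ("T2", "access", "v"),
      ("T2", "unlock", "mu"), ("T2", "access", "y")]

print(unordered_accesses(T1 + T2))  # Thread 1 runs first
print(unordered_accesses(T2 + T1))  # Thread 2 runs first
```

With Thread 1 first the result is the empty set; with Thread 2 first the result contains only y. The same program is reported as clean or racy depending on which schedule was observed.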

Problems with happens-before
Eraser would find the error for any test case that included both code paths, regardless of the order; happens-before analysis only works if the dangerous schedule is executed. Since there are many possible interleavings, you can’t be sure to test them all, so you might miss a potential error. Eraser might miss some data races, but it will catch more than tools based only on happens-before.

Lockset Analysis Background Lock: a synchronization object that is either available, or owned (by a thread). –Operations: lock(mu) and unlock(mu). –No explicit initialize operation. Compare to binary semaphore –Lock( ) ~ P( ); Unlock ~ V( ) –The Lock( ) operation is blocking if the lock is owned by another thread.

Background
Simple mutex locks are not the only kind. Some systems provide others:
–Read/write locks permit multiple readers, but only one writer.
Some shared-memory accesses don’t need locks at all:
–Read-only data: initialized and then never written again.

Basic Premise of Eraser Observe all instances where a shared variable is accessed by a thread. If there is a chance that a data race can occur, be sure the shared variable is protected by a lock. –Simple algorithm – basic locks –Advanced algorithm – reader/writer locks If variable isn’t protected, issue a warning.

How Eraser Works Requires each shared variable to be protected by a lock. (the same lock for all threads) Eraser will monitor all reads and writes (loads and stores) of a variable as the program runs. Eraser must deduce which locks protect each shared variable. Eraser assumes that it knows the full set of locks in advance (they must be declared in the code). Protects at the word level; i.e., a word is considered to be a variable.

How It Works (see Section 2) For each variable v build a set of locks C(v) that holds candidate locks (locks that may be protecting v). –l is in C(v) if every thread that has accessed v so far was holding l at the time of access. Lockset refinement: C(v) is adjusted every time v is accessed. If C(v) becomes empty, the variable is assumed to be unprotected.

The First Lockset Algorithm
Let locks_held(t) be the set of locks held by thread t (a per-thread structure).
For each v, initialize C(v) to the set of all locks (a per-variable structure).
Locksets change over time. Each time a thread t accesses variable v:
–Set C(v) = C(v) ∩ locks_held(t)
–If C(v) = { } (the empty set), issue a warning

Example (Fig. 3)
If a program has two locks, mu1 and mu2, then C(v) is initially {mu1, mu2}. If the first access to v is in a thread holding mu1, then C(v) ∩ locks_held(t) = {mu1}. If the second access to v is in a thread holding only mu2, then C(v) ∩ locks_held(t) = { }, and a warning is issued.
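
The refinement rule and the Fig. 3 example can be written out as a short Python sketch. It follows the first lockset algorithm as stated on the slide; the function name `lockset_warnings`, the `(thread, op, target)` trace format, and the thread names T1 and T2 are assumptions made for illustration.

```python
def lockset_warnings(trace, all_locks):
    """Run the first lockset algorithm over one observed execution.

    trace is a list of (thread, op, target) tuples with op in
    {"lock", "unlock", "access"}. Returns the variables whose candidate
    set C(v) became empty, in the order the warnings were raised.
    """
    locks_held = {}   # thread -> set of locks it currently holds
    candidates = {}   # variable -> C(v)
    warnings = []
    for thread, op, target in trace:
        held = locks_held.setdefault(thread, set())
        if op == "lock":
            held.add(target)
        elif op == "unlock":
            held.discard(target)
        else:
            # C(v) starts as the set of all locks and only ever shrinks.
            c_v = candidates.get(target, set(all_locks)) & held
            candidates[target] = c_v
            if not c_v and target not in warnings:
                warnings.append(target)
    return warnings


fig3 = [
    ("T1", "lock", "mu1"), ("T1", "access", "v"), ("T1", "unlock", "mu1"),
    ("T2", "lock", "mu2"), ("T2", "access", "v"), ("T2", "unlock", "mu2"),
]
print(lockset_warnings(fig3, {"mu1", "mu2"}))
```

The call reports v, because no single lock was held at both accesses. Swapping the order of the two threads in the trace gives the same warning, which is the sense in which lockset analysis is insensitive to the observed schedule.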

Refining the Lockset Algorithm
The previous algorithm is correct, but it flags some situations as potential race conditions when in fact they aren’t. Sources of false alarms:
–Variable initialization (restricted to one thread)
–Shared variables that are read-only
Sections 2.2 and 2.3 discuss refinements to the algorithm for avoiding some false alarms and handling read-write locks as well as simple locks.

Refinements Until a variable is accessed by a second thread, there’s no danger of a data race so no need to monitor

Figure 4: State transitions for a memory location, based on whether it has been accessed at all, accessed by more than one thread, accessed in read mode only, etc.
States: virgin, exclusive, shared, shared-modified.
Transition labels: write; read, new thread; write, new thread.
Race conditions are reported only for locations in the shared-modified state.
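
The state refinement can be sketched as a small per-location state machine in Python. This is an illustration under assumptions rather than Eraser's implementation: the class name `ShadowWord` is hypothetical, a read of a virgin location is assumed to leave it virgin, and the candidate set is assumed to be refined only once a second thread has touched the location.

```python
from enum import Enum


class State(Enum):
    VIRGIN = "virgin"
    EXCLUSIVE = "exclusive"
    SHARED = "shared"
    SHARED_MODIFIED = "shared-modified"


class ShadowWord:
    """Per-location bookkeeping: state, first owner, and candidate set C(v)."""

    def __init__(self, all_locks):
        self.state = State.VIRGIN
        self.owner = None
        self.candidates = set(all_locks)

    def access(self, thread, is_write, locks_held):
        """Record one access; return True if a warning should be issued."""
        if self.state is State.VIRGIN:
            if is_write:  # first write: location belongs to this thread
                self.state, self.owner = State.EXCLUSIVE, thread
            return False
        if self.state is State.EXCLUSIVE:
            if thread == self.owner:
                return False  # still single-threaded, nothing to monitor
            self.state = State.SHARED_MODIFIED if is_write else State.SHARED
        elif self.state is State.SHARED and is_write:
            self.state = State.SHARED_MODIFIED
        self.candidates &= set(locks_held)
        # Read-shared data with an empty C(v) is tolerated; written data is not.
        return self.state is State.SHARED_MODIFIED and not self.candidates


word = ShadowWord({"mu"})
print(word.access("T1", True, set()))    # initialization without a lock
print(word.access("T2", False, set()))   # unlocked read by a second thread
print(word.access("T2", True, {"mu"}))   # write after C(v) already emptied
```

Only the last call returns True: the unlocked initialization and the unlocked read are tolerated, but once the location is written while shared, the empty candidate set is reported.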

Implementing Eraser Eraser instruments the program binary by inserting calls to the Eraser runtime functions. Each load and store is instrumented if it accesses global or heap data. Stack data is assumed not to be shared. The storage allocator is also instrumented to initialize C(v) for dynamic data.

Implementing Eraser Each call to the lock operation is instrumented to keep locks_held(t) updated. When a race is suspected (reference to a shared variable that isn’t protected by a lock) Eraser indicates the file and line # plus other information that can help the programmer locate the problem.

Conclusions A number of systems (AltaVista, the Petal distributed file system) were used as testbeds. Undergraduate programs were also tested. Eraser found a number of potential race conditions and had a few false alarms. Experienced programmers did better than undergraduates!

Summary/Review
Data race detection can be done statically or dynamically.
–Static: compile-time analysis; examine all paths, or modify the language type system to include synchronization relationships
–Dynamic: run-time analysis; can only catch errors if they are observed, and can’t examine all paths
Eraser does a better job than happens-before methods: it flags potential races on the monitored code that is actually executed, regardless of the interleaving observed.

EXAMPLE: Fig. 2
Thread 1
    y = y + 1;
    lock(mu);
    v = v + 1;
    unlock(mu);
Thread 2
    lock(mu);
    v = v + 1;
    unlock(mu);
    y = y + 1;
Eraser would notice that y is not protected by any lock and thus detect a data race, even though happens-before would not.

Summary/Review
Two earlier techniques:
–Lockset analysis (Eraser): enforces the requirement that every shared variable is protected by a lock. Possible false positives, slow, not “sound”, but relatively unaffected by execution order. May miss some races if a dangerous path is not tested.
–Happens-before analysis: based on Lamport’s relation; establishes a partial ordering of statements based on synchronization events. No false positives, but may have false negatives.

RaceTrack Claim: Improves on lockset analysis by only looking for data races when shared data is being accessed concurrently. –Eraser does this too, but in a limited fashion Able to handle locks as well as fork-join parallelism Monitors library code also Is sensitive to execution traces

Fork-Join
A way to achieve parallelism:
–Parent thread creates (forks) several sub-threads
–Parent thread pauses
–Forked threads report results; parent thread resumes execution (the join) and combines child results
Similar to the UNIX approach with processes, but at a finer granularity.
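
The fork-join steps above can be illustrated with Python's standard `threading` module. This is a minimal sketch; the function name `fork_join_sum` and the choice of summing a list are assumptions made for the example. Each child writes only its own slot of `partial`, and the parent reads the slots only after the joins, so the accesses are ordered without any lock.

```python
import threading


def fork_join_sum(values, workers=4):
    """Sum values by forking worker threads and joining them."""
    chunks = [values[i::workers] for i in range(workers)]
    partial = [0] * workers  # one result slot per child thread

    def work(index):
        partial[index] = sum(chunks[index])

    children = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for child in children:
        child.start()  # fork
    for child in children:
        child.join()   # parent pauses until this child has finished
    return sum(partial)  # parent combines the child results


print(fork_join_sum(list(range(101))))
```

A detector that only looks at locks would see `partial` written and read by different threads with no lock held, even though start and join already order those accesses.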

RaceTrack Does not claim to detect all concurrent accesses (i.e., there may be false negatives) Why: to detect all instances of concurrency the tool would have to keep a complete access history for each shared variable –RaceTrack uses estimation techniques to prune the threadset (set of accesses) and the lockset.

Tool Environment
Large multithreaded OO programs running on the .NET platform
–All code is translated into an intermediate language (IL), which is later compiled into platform-specific code by the JIT compiler in the Common Language Runtime (CLR) (Fig. 1)
The CLR manages all runtime activities: object allocation, thread creation, garbage collection, exception handling.

Tool Environment
RaceTrack instruments at the virtual machine level (CLR ~ JVM).
–The JIT compiler in the CLR inserts calls to RaceTrack tools as it generates native code
RaceTrack is language independent, as applications run directly on the modified runtime environment.

RaceTrack versus Lockset
Lockset-based detection does not consider fork/join operations (if only one thread exists, no data race is possible) or asynchronous (non-blocking) calls.
–Result: false alarms
Observation: a data race can occur only if several threads are concurrently accessing the variable.

RaceTrack Approach
RaceTrack maintains a lockset C_x for each shared variable x, but it also maintains a current threadset S_x.
Threadset = a set of concurrent accesses, where concurrent is defined in terms of vector clocks.
–A thread’s virtual clock ticks at certain synchronization operations
–Synchronization ops transfer information about clock values to other threads, which use it to update their own vector clocks (just as messages are used in earlier examples)

Threadsets
Whenever a thread T_j accesses a shared variable, it adds an entry (label) to that variable’s threadset.
–Label = (thread id, timestamp of the access)
T_j then uses happens-before analysis, based on the vector clocks, to “prune” the threadset.
–Any label L_i in the threadset that “happens-before” the current access made by T_j is removed
–Any remaining accesses are considered “concurrent”
Races are not considered to be a threat if the threadset is a singleton.

Basic Algorithm - threads
Each thread t has a lockset L_t (locks-held) and a vector clock B_t.
–Lockset: contains currently held locks
–Vector clock: most recent information about the logical clocks of t and all other threads
Lock and unlock operations update L_t. Fork/join also update vector clocks.
At thread creation, the local clock is set to 1 and the lockset is set to empty.

Basic Algorithm - variables
Each variable x has a lockset C_x and a threadset S_x, where C_x is the set of locks that are (potentially) currently protecting x and S_x is the current set of concurrent accesses to x.
–Initially, S_x is the empty set { } and C_x is initialized to the set of all possible locks
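
The per-thread and per-variable structures can be combined into a short Python sketch of the threadset idea. It is a simplified illustration, not RaceTrack's implementation: the class name `ThreadsetSketch` and its method names are hypothetical, only fork and join update vector clocks, and resetting the lockset to the currently held locks whenever the threadset is a singleton is an assumption of this sketch that the slides do not state.

```python
class ThreadsetSketch:
    def __init__(self, all_locks):
        self.all_locks = frozenset(all_locks)
        self.held = {}       # L_t: thread -> set of held locks
        self.clock = {}      # B_t: thread -> vector clock {thread: time}
        self.lockset = {}    # C_x: variable -> candidate locks
        self.threadset = {}  # S_x: variable -> {thread: timestamp}
        self.warnings = []

    def start(self, t):
        """Create the initial thread: local clock 1, empty lockset."""
        self.held[t] = set()
        self.clock[t] = {t: 1}

    def fork(self, parent, child):
        """Child inherits the parent's knowledge; parent's clock ticks."""
        self.held[child] = set()
        self.clock[child] = {**self.clock[parent], child: 1}
        self.clock[parent][parent] += 1

    def join(self, parent, child):
        """Parent learns everything the finished child knew."""
        mine = self.clock[parent]
        for u, time in self.clock[child].items():
            mine[u] = max(mine.get(u, 0), time)

    def lock(self, t, mu):
        self.held[t].add(mu)

    def unlock(self, t, mu):
        self.held[t].discard(mu)

    def access(self, t, x):
        now = self.clock[t]
        # Prune labels that happen-before this access, then add our own.
        labels = {u: ts for u, ts in self.threadset.get(x, {}).items()
                  if ts > now.get(u, 0)}
        labels[t] = now[t]
        self.threadset[x] = labels
        if len(labels) > 1:   # concurrent accesses remain: refine C_x
            c_x = self.lockset.get(x, self.all_locks) & self.held[t]
        else:                 # singleton threadset: assumed reset of C_x
            c_x = frozenset(self.held[t])
        self.lockset[x] = c_x
        if len(labels) > 1 and not c_x:
            self.warnings.append(x)


rt = ThreadsetSketch({"mu"})
rt.start("main")
rt.access("main", "x")     # parent initializes x
rt.fork("main", "child")
rt.access("child", "x")    # ordered after the parent's access by the fork
rt.join("main", "child")
rt.access("main", "x")     # ordered after the child's access by the join
print(rt.warnings)
```

No lock is ever held in this run, yet the warning list stays empty because each access prunes the previous label and the threadset never grows beyond one entry. Two sibling threads that touch x without a common lock would leave two labels in the threadset and trigger a warning.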

RaceTrack Approach Adjusts monitoring granularity from object level to field level based on program conditions. Issues warnings on-the-fly and then performs a more careful analysis during a post-mortem

RaceTrack Benefits
–Coverage: the JIT compiler enables any code to be instrumented and monitored.
–Accuracy: the ability to monitor at a fine granularity (field, individual array element) improves detection accuracy. Happens-before analysis filters out some false positives that would be flagged by lockset analysis alone.
–Performance: monitoring is adaptive; the monitoring level is reduced when races are unlikely.
–Scalability: good, due to low overhead and ease of instrumentation.

Future Work (RaceTrack) Add deadlock detection mechanisms to flag lock acquisitions that are ordered incorrectly.

Example of Potential Deadlock
Global variables x, y; semaphores sx = sy = 1
Thread 1
    P(sx);
    P(sy);
    x = f1(x,y);
    y = f2(x,y);
    V(sy);
    V(sx);
Thread 2
    P(sy);
    P(sx);
    x = p1(x,y);
    y = p2(x,y);
    V(sx);
    V(sy);
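
One way to flag acquisitions that are ordered inconsistently, as in the example above, is to record which semaphore was already held each time another is acquired and then look for a cycle in that ordering. The Python sketch below is only an illustration of that idea, not a mechanism described by the RaceTrack authors; the function names `order_edges` and `find_cycle` and the input format are hypothetical.

```python
def order_edges(ops_by_thread):
    """Collect (held, acquired) pairs from each thread's P/V sequence."""
    edges = set()
    for ops in ops_by_thread.values():
        held = []
        for op, sem in ops:
            if op == "P":
                edges.update((h, sem) for h in held)
                held.append(sem)
            else:  # "V" releases the semaphore
                held.remove(sem)
    return edges


def find_cycle(edges):
    """Return one cycle in the acquisition order as a list, or None."""
    graph = {}
    for first, second in edges:
        graph.setdefault(first, set()).add(second)

    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in sorted(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


threads = {
    "Thread 1": [("P", "sx"), ("P", "sy"), ("V", "sy"), ("V", "sx")],
    "Thread 2": [("P", "sy"), ("P", "sx"), ("V", "sx"), ("V", "sy")],
}
print(find_cycle(order_edges(threads)))
```

For the two threads above, the edges sx → sy and sy → sx form a cycle, so the pair is flagged even if the run that was observed happened not to deadlock.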
