Download presentation
Presentation is loading. Please wait.
Published bySharyl Whitehead Modified over 9 years ago
1
Dynamic Data Race Detection
2
Sources Eraser: A Dynamic Data Race Detector for Multithreaded Programs –Stefan Savage, Michael Burrows, Greg Nelson, Patric Sobalvarro, Thomas Anderson, ACM Transactions on Computer Systems, Vol. 15, No. 4, November 1997 RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking; –Yuan Yu, Tom Rodeheffer, Wei Chen, Proceedings SOSP ’05, copyright 2005 ACM
3
The Shared Problem Problem: Data race detection in multithreaded programs. (Implies shared memory) Solution: a tool that automates the problem of detecting potential data races Basic idea: look for “unprotected” accesses to shared variables. Eraser and Racetrack use somewhat different approaches. Why important: synchronization errors based on data races are –Timing dependent –Hard to find
4
Data Race “A data race occurs when two concurrent threads access a shared variable and … –at least one access is a write and –the threads use no explicit mechanism to prevent the accesses from being simultaneous” In other words, a data race can lead to a potential violation of mutual exclusion.
5
Data Race Example: Threads with unsynchronized access to a shared array Thread 1 int i; … for (i = 1; i < MAX; i++) { cin >> x; A[i] = 2*x; } … Thread 2 int i; … for (i = 1; i < MAX; i++) { if (A[i] < B[i]) B[i] = A[i]; } …
6
Approaches to Data Race Detection Static: performed at compile time or earlier Dynamic: runtime analysis Most current research focuses on dynamic detection The environment is a single multithreaded program rather than separate programs.
7
Static Data Race Detection Type-based analysis –Language type system augmented to “express common synchronization relationships”: correct typing→no data races –Difficult to do & restricts the type of synchronization primitives Language features –e.g., use of monitors –Only works for static data – not dynamic data Path analysis; –Doesn’t scale well –Too many false positives
8
Dynamic Data Race Detection Dynamic detection: monitor programs during execution and look for problems –The program may be “instrumented” with additional instructions –The additions don’t change program functionality but are used to monitor conditions of interest - in this case, access to shared variables and synchronization operations.
9
Dynamic Detection Post mortem or on-the-fly analysis of traces Problems: –Can only check paths that are actually executed –Adds overhead at runtime Techniques –Happens-before (earliest dynamic technique) –Lockset analysis (Eraser) –Hybrids of happen-before and lockset (RaceTrack)
10
Lock Definition Lock: a synchronization object that is either available, or owned (by a thread). –Operations: lock(mu) and unlock(mu). –No explicit initialize operation. Compare to binary semaphore –Lock( ) ~ P( ); Unlock ~ V( ) –A lock can only be unlocked by its current owner –The Lock( ) operation is blocking if the lock is owned by another thread.
11
Dynamic Race Detection Using happens-before happens-before defines a partial order for events in a set of concurrent threads –In a single thread, happens-before reflects the temporal order of event occurrence –Between threads, A happens before B if A is a lock access in one thread, and the next access to that lock (event B) is in a different thread and if the accesses obey the semantics of the lock (can’t have two successive locks, or two successive unlocks, or a lock in one thread and an unlock in a different thread)
12
Happens-before Relation Formalized Let event a be in thread 1 and event b be in thread 2. –If a = unlock(mu) and b = lock(mu) then a → b(a happens-before b) Data races between threads are possible if accesses to shared variables are not ordered by happens-before.
13
EXAMPLE: Fig. 1 Thread 1 lock(mu); v = v + 1; unlock(mu); Thread 2 ∙. lock(mu); v = v + 1; unlock(mu); The arrows represent happens-before. The events represent an actual execution of the two threads. Instead of a logical clock, each thread might maintain a “most recent event” variable. In T1, the most recent event is unlock(mu); when T2 executes lock(mu) the system can establish the happens-before relation.
14
EXAMPLE: Fig. 2 Thread 1 y = y + 1; lock(mu); v = v + 1; unlock(mu); Thread 2 lock(mu); v = v + 1; unlock(mu); y = y + 1; Accesses to both y and v are ordered by happens-before, so no data race occurred. But … a different execution ordering could get different results. Happens-before only detects data races if the incorrect order shows up in an execution trace.
15
EXAMPLE: Fig. 2 Thread 1 y = y + 1; lock(mu); v = v + 1; unlock(mu); Thread 2 lock(mu); v = v + 1; unlock(mu); y = y + 1;... If Thread 2 executes before Thread 1, happens-before no longer holds between the two accesses to y, so the possibility of a data race occurs and should be notified to the programmer. Accesses to y are “concurrent” since neither a b nor b a
16
Problems with happens-before Eraser would find the error for any test case that included both code paths, regardless of the order; happens-before analysis only works if the dangerous schedule is executed. Since there are many possible interleavings, you can’t be sure to test them all so you might miss a potential error. Eraser might miss some data races, but it will catch more than tools based only on happens- before. (It also might raise false positives)
17
Basic Premise of Eraser Observe all instances where a shared variable is accessed by a thread. If there is a chance that a data race can occur, be sure the shared variable is protected by a lock. –Simple algorithm – basic locks –Advanced algorithm – reader/writer locks If variable isn’t protected, issue a warning.
18
How Eraser Works Requires each shared variable to be protected by a lock. (all threads must use the same lock for the same variable) Eraser monitors all reads and writes (loads and stores) of a variable as the program runs. Eraser must deduce which locks protect each shared variable. Eraser assumes that it knows the full set of locks in advance (they must be declared in the code).
19
Locksets There are two kinds of locksets Candidate locksets – C(v) –One per shared variable –Contains all locks that may be protecting the variable Locks_held(t) –Contains the locks currently held by a thread
20
The First Lockset Algorithm (Section 2) Let locks_held(t) be the set of locks held by thread t. (a per-thread structure) For each v, initialize C(v) to the set of all locks. (a per-variable structure) Lockset refinement: C(v) is adjusted every time v is accessed. Each time a thread t i accesses variable v –Set C(v) = C(v) ∩ locks_held(t) –If C(v) = { Φ } issue a warning
21
Example (Fig. 3) If a program has two locks, mu1 and mu2, then C(v) is initially {mu1, mu2}. If the first access to v is in a thread holding mu1 then C(v) ∩ locks_held(t) = mu1. If the second access to v is in a thread holding mu2 then C(v) ∩ locks_held(t) = {Φ}.
22
Refining the Lockset Algorithm The previous algorithm is correct, but flags some situations as potential race conditions when in fact they aren’t: False alarms –Variable initialization (restricted to one thread) –Shared variables that are read-only –Variables protected by read/write locks Sections 2.2 and 2.3 discuss refinements to the algorithm for avoiding some false alarms and handling read-write locks as well as simple locks.
23
Refinements Until a variable is accessed by a second thread, there’s no danger of a data race so no need to monitor
24
virgin exclusive shared Shared-modified write Write, new thread Write Read, new thread State transitions for a memory location, based on whether it has been accessed at all, accessed by more than one thread, accessed in read mode only, etc. Figure 4 Race conditions are reported only for locations in the shared- modified state.
25
Implementing Eraser Eraser instruments the program binary by inserting calls to Eraser’s runtime functions. Each load and store is instrumented if it accesses global or heap data. Stack data is assumed not to be shared. Protects at the word level; i.e., a word is considered to be a variable. The storage allocator is also instrumented to initialize C(v) for dynamic data.
26
Implementing Eraser Each call to the lock operation is instrumented to keep locks_held(t) updated. When a race is suspected (reference to a shared variable that isn’t protected by a lock) Eraser indicates the file and line # plus other information that can help the programmer locate the problem.
27
Conclusions A number of systems (AltaVista, the Petal distributed file system) were used as testbeds. Undergraduate programs were also tested. Eraser found a number of potential race conditions and had a few false alarms. Experienced programmers did better than undergraduates!
28
Compare Early Dynamic Methods Lockset analysis (Eraser): enforces the requirement that every shared variable is protected by a lock –Possible false positives, slow, but relatively unaffected by execution order. –Will miss races if a dangerous path is not tested. Happens-before analysis: based on Lamport’s relation, establish partial ordering of statements based on synchronization events –No false positives, but may have false negatives: looks at code with problems, doesn’t detect
29
Modern Race Detection Tools (e.g. Racetrack) Tools such as Eraser are too low-level – instrumentation of object code; monitoring at word level, etc. RaceTrack and other OO language race detectors work at a higher level – source code, bytecode, virtual machine level. –Less overhead –Can adjust level of monitoring from object level to field level, based on circumstances
30
RaceTrack Does not claim to detect all concurrent accesses in monitored code; i.e., there may be false negatives. Why: to detect all instances of concurrency the tool would have to keep a complete access history for each shared variable –RaceTrack uses estimation techniques to prune the threadset (set of concurrent accesses) and the lockset.
31
RaceTrack Claim: Improves on lockset analysis by only looking for data races when shared data is being accessed concurrently. –Uses a thread set to detect concurrency Able to handle locks as well as fork-join parallelism; also considers asynchronous calls Designed to monitor library code also Is sensitive to execution traces as are other dynamic race detection tools.
32
Fork-Join Parallelism A way to achieve parallelism –Parent thread creates (forks) child threads –Parent thread pauses while child threads work –Children report results, parent resumes execution (the join), combines child results Significance: monitor between fork & join but not otherwise (no concurrency)
33
Asynchronous Calls A program creates a thread and returns immediately. The creator runs concurrently with the new thread. In this case, monitoring of parent and child is necessary.
34
RaceTrack Environment Large multithreaded OO programs running on the.NET platform –All code is translated into an intermediate language (IL) which is later compiled into platform specific code by the JIT compiler in the Common Language Runtime (CLR) Fig. l The CLR manages all runtime activities: object allocation, thread creation, garbage collection, exception handling.
35
Tool Environment RaceTrack instruments at the virtual machine level (CLR ~ JVM) –The JIT compiler in the CRL inserts calls to RaceTrack tools as it generates native code RaceTrack is language independent, as applications run directly on the modified runtime environment.
36
Low-level or High-level Instrumentation Eraser inserts instrumentation code into program binaries, modifies the memory allocator; very low level, monitors every access, leads to higher execution overhead. “Modern” methods such as RaceTrack modify source code, compiler, etc. Allows tools to choose to monitor at different levels (e.g. object level or field level) and thus be more efficient.
37
Race Track versus Lockset Lockset-based detection doesn’t consider fork/join operations, asynchronous calls. –Result: false alarms Observation: data race can occur only if multiple threads are accessing a variable concurrently. Racetrack combines two approaches: locksets and threadsets
38
RaceTrack Approach Locksets detect unprotected shared memory accesses and threadsets decide if shared memory accesses are concurrent. –Threadset = a set of timestamps, defined in terms of vector clocks. Each thread t has a lockset L t and a vector clock B t. –Lockset: contains currently held locks –Vector clock: most recent information about the logical clocks of t and all other threads –Lock and unlock operations update L t –Fork/join operations update B t.
39
Threadsets When a thread T j accesses a shared variable, it adds an entry (label) to that variable’s threadset. –Label = (thread id, timestamp of the access) T j then uses happens-before analysis, based on the vector clocks, to “prune” the threadset. –Any label in the threadset which “happens-before” the current access made by T j is removed –Any remaining accesses are considered “concurrent” Races are not considered to be a threat if the threadset is a singleton
40
Variables Each variable x has a lockset C x and a threadset S x, where C x is the set of locks that are (potentially) currently protecting x and S x is the current set of concurrent accesses to x. –Initially, S x is the empty set { } and C x is intialized to the set of all possible locks
41
Basic Approach Lock set monitoring proceeds as in Eraser Threadsets are modified each time a thread accesses a variable If the threadset has only one entry, then there are no concurrent accesses. Warnings are issued only if the candidate lockset for a variable is empty AND there are multiple entries in the threadset.
42
RaceTrack Performance Adjusts monitoring granularity from object level to field level based on program conditions. –Monitoring at the object level may lead to false positives. Issues warnings on-the-fly and then performs a more careful analysis during a post-mortem
43
RaceTrack Benefits Coverage: JIT compiler enables any code to be instrumented and monitored. Accuracy: Ability to monitor at a low granularity (field, individual array element) improves detection accuracy. Happens-before analysis filters out some false positives that would be flagged by lockset analysis alone. Performance: Monitoring is adaptive – reduce level when races are unlikely Scalability: good, due to low overhead and ease of instrumentation.
44
Future Work (RaceTrack) Add deadlock detection mechanisms to flag lock acquisitions that are ordered incorrectly.
45
Example of Potential Deadlock: global variables x, y; semaphores sx = sy = 1 Thread 1 P(sx); P(sy); x = f1(x,y); y = f2(x,y); V(sy); V(sx); Thread 2 P(sy); P(sx); x = p1(x,y); y = p2(x,y); V(sx); V(sy);
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.