1
Verification for Concurrency Part 2: Incomplete techniques and bug finding
2
Contents
  Race detection
  Context bounding and Sequentialization
  Odds and ends
3
Race detection
4
Data Races
"Two threads simultaneously access the same memory location, with at least one access being a write."
Most concurrent software is written to avoid data races: it is extremely tricky to write racy code that is correct, and for racy code, correctness on one architecture and compiler does not imply correctness on all of them.
Race detection is a very efficient, lightweight bug-detection technique. Dynamic techniques work on actual executions.
5
The Lockset algorithm
Lockset assumption: whenever two different threads access a location (with at least one of the accesses being a write), they both hold a common lock.
It is possible to write non-racy code that violates this assumption; such code produces false positives.
6
Vanilla Lockset algorithm
Based on a simple locking discipline: each variable is protected by some lock.

for each tid: LocksHeld[tid] = ∅
for each v:   Cands[v] = set of all locks
for each instruction in trace:
    // Maintain the lockset of the executing thread
    if (instruction = lock(l))   LocksHeld[tid] = LocksHeld[tid] ∪ {l}
    if (instruction = unlock(l)) LocksHeld[tid] = LocksHeld[tid] \ {l}
    // Update candidate locks
    if (access to v by thread tid)
        Cands[v] := Cands[v] ∩ LocksHeld[tid]
        if (Cands[v] = ∅) Output "Potential race on v"
7
Vanilla Lockset algorithm: Example

Thread 1:          Thread 2:
lock(l1)
x := x + 4
unlock(l1)
                   lock(l1)
                   lock(l2)
                   x := x - y
                   y := y + 4
                   unlock(l2)
                   unlock(l1)
lock(l2)
y := y + 5
unlock(l2)
x := x - 3
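To make the algorithm concrete, here is a minimal executable sketch in Python; the trace format, a list of (tid, op, arg) tuples, is an assumption for illustration, and the trace encodes the example above (reads of the right-hand sides omitted for brevity).

# Minimal sketch of the vanilla Lockset algorithm.
ALL_LOCKS = {"l1", "l2"}

def vanilla_lockset(trace):
    locks_held = {}   # tid -> set of locks currently held
    cands = {}        # variable -> candidate protecting locks
    races = set()
    for tid, op, arg in trace:
        held = locks_held.setdefault(tid, set())
        if op == "lock":
            held.add(arg)
        elif op == "unlock":
            held.discard(arg)
        else:                          # read or write of variable arg
            cands.setdefault(arg, set(ALL_LOCKS))
            cands[arg] &= held
            if not cands[arg]:
                races.add(arg)
    return races

trace = [
    (1, "lock", "l1"), (1, "write", "x"), (1, "unlock", "l1"),
    (2, "lock", "l1"), (2, "lock", "l2"), (2, "write", "x"), (2, "write", "y"),
    (2, "unlock", "l2"), (2, "unlock", "l1"),
    (1, "lock", "l2"), (1, "write", "y"), (1, "unlock", "l2"),
    (1, "write", "x"),                 # unprotected access to x
]
print(vanilla_lockset(trace))          # {'x'}: potential race on x; y stays protected by l2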
8
Vanilla LockSet: False positive

Thread 1:              Thread 2:              Thread 3:
// Initialize
g := 1729
// Share
lock(l)
published := true
unlock(l)
                       lock(l)                lock(l)
                       assume(published)      assume(published)
                       unlock(l)              unlock(l)
                       y := g                 z := g
                       y := f(l, y)           z := g(l, z)
                       output(y)              output(z)
9
More advanced locking disciplines
The simple locking discipline leads to false positives in many common cases: lazy initialization, and initialization followed by read-only access.
Per-variable state machine (figure): Virgin moves to Exclusive on the first thread's write; reads and writes by that first thread keep it Exclusive; a read by a new thread moves it to Shared; a write by a new thread moves it to Shared-Modified; a write to a Shared variable also moves it to Shared-Modified.
10
More advanced locking discipline
Works well for initialize-and-publish, lazy initialization, etc.; doesn't work with ownership transfer, etc.

for each access to v by thread tid:
    update State[v]
    if State[v] = Exclusive
        // Do nothing
    else if State[v] = Shared
        // Update lockset, don't report any races
        Cands[v] := Cands[v] ∩ LocksHeld[tid]
    else if State[v] = Shared-Modified
        // Update lockset, report races
        Cands[v] := Cands[v] ∩ LocksHeld[tid]
        if (Cands[v] = ∅) Output "Potential race on v"
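A sketch of this refined algorithm in Python, combining the state machine from the previous slide with per-variable locksets (same hypothetical (tid, op, arg) trace format as before; state names follow the slide):

# Refined Lockset with the Virgin / Exclusive / Shared / Shared-Modified states.
ALL_LOCKS = {"l"}

def refined_lockset(trace):
    locks_held, state, owner, cands, races = {}, {}, {}, {}, set()
    for tid, op, arg in trace:
        held = locks_held.setdefault(tid, set())
        if op == "lock":
            held.add(arg)
        elif op == "unlock":
            held.discard(arg)
        else:
            v, is_write = arg, (op == "write")
            st = state.get(v, "Virgin")
            # State-machine transitions
            if st == "Virgin":
                state[v], owner[v] = "Exclusive", tid
            elif st == "Exclusive" and tid != owner[v]:
                state[v] = "Shared-Modified" if is_write else "Shared"
                cands[v] = set(ALL_LOCKS)
            elif st == "Shared" and is_write:
                state[v] = "Shared-Modified"
            # Lockset refinement; races are reported only in Shared-Modified
            if state[v] in ("Shared", "Shared-Modified"):
                cands[v] &= held
                if state[v] == "Shared-Modified" and not cands[v]:
                    races.add(v)
    return races

On the initialize-and-publish example, g is written by one thread and afterwards only read by the others, so it never leaves the Shared state and no race is reported; a later unsynchronized write to g would move it to Shared-Modified and be flagged.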
11
Advanced LockSet: Examples

Thread 1:              Thread 2:              Thread 3:
// Initialize
g := 1729
// Share
lock(l)
published := true
unlock(l)
                       lock(l)                lock(l)
                       assume(published)      assume(published)
                       unlock(l)              unlock(l)
                       y := g                 z := g
                       y := f(l, y)           z := g(l, z)
                       output(y)              output(z)
12
Advanced LockSet: False positive

Thread 1:                    Thread 2:
…                            …
await (Obj.owner == 1)       await (Obj.owner == 2)
Obj.foo()                    Obj.bar()
Obj.bar := baz()             Obj.foo := baz()
…                            …
Obj.owner := 2
…
13
Happens-Before relation
Proposed by Lamport in 1978 for distributed systems; many race-detection algorithms try to approximate the happens-before relation.
Basic idea: two events are related if and only if communication allows information flow between them.
We write e_i → e_j for "event e_i happens before event e_j". Informally, if e_i → e_j, then e_i happens before e_j in all variations of the trace.
There is a race if two events e_i and e_j access the same location and neither e_i → e_j nor e_j → e_i holds.
14
Happens-Before Relation: Definition
If two events are from the same thread, the earlier one happens-before the later one:
    thread(e_i) = thread(e_j) ∧ i < j  ⇒  e_i → e_j
Happens-before is transitive:
    (e_i → e_j) ∧ (e_j → e_k)  ⇒  e_i → e_k
Every synchronization gives some happens-before edges:
    LOCK: if e_i is an unlock and e_j is a later lock of the same lock, then e_i → e_j
    WAIT/NOTIFY: if e_i is a notify and e_j is the corresponding wait, then e_i → e_j
    …
15
Happens-Before: Examples

Thread 1:                        Thread 2:
obj := new Foo()                 data = readFile()
Notify(obj)                      Wait(obj)
                                 obj.data = data
                                 Notify(obj)
Wait(obj)
lock(l)
obj.data = obj.data + 4
unlock(l)
                                 lock(l)
                                 obj.data = obj.data - 4
                                 unlock(l)
…
16
Computing the Happens-Before: Vector Clocks
The full happens-before relation is usually very expensive to compute, and few dynamic techniques actually compute it. The classical method, proposed by Lamport himself, is vector clocks: with each event, associate a "vector clock" storing the last event from each other thread that affects it.
e_i → e_j if and only if VC[e_j][thread(e_i)] ≥ e_i, i.e., the clock of e_j for e_i's thread is at least e_i's position in that thread.
17
Vector clocks
VC[e] = [e_1, e_2, e_3, …, e_n], where entry t is the last relevant event from thread t.

For every event:  VC[e][thread(e)] := e
// Regular events
    VC[e][tid] := VC[prev(e)][tid]
// Acquire locks (LVC is the lock's vector clock)
    VC[e][tid] := max(VC[prev(e)][tid], LVC[tid])
// Release locks
    VC[e][tid] := VC[prev(e)][tid]
    LVC[tid]   := max(VC[e][tid], LVC[tid])

Can be extended to other synchronization primitives!
e_i → e_j if and only if VC[e_j][thread(e_i)] ≥ e_i
18
Vector clocks: Examples

Thread 1:                          Thread 2:
T1_0: obj := new Foo()             T2_0: data = readFile()
T1_1: Notify(obj)                  T2_1: Wait(obj)
                                   T2_2: obj.data = data
                                   T2_3: Notify(obj)
T1_2: Wait(obj)
T1_3: lock(l)
T1_4: obj.data = obj.data + 4
T1_5: unlock(l)
                                   T2_4: lock(l)
                                   T2_5: obj.data = obj.data - 4
                                   T2_6: unlock(l)
…
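A small Python sketch that computes vector clocks for the trace above and checks happens-before edges; treating Wait/Notify as a plain acquire/release of the object's clock is a simplification made here for illustration.

from collections import defaultdict

def vector_clocks(trace, nthreads):
    vc = {t: [0] * nthreads for t in range(nthreads)}    # current clock per thread
    sync = defaultdict(lambda: [0] * nthreads)           # clock per lock / notify object
    event_vc = {}
    for label, tid, op, obj in trace:
        if op in ("lock", "wait"):       # acquire: join with the object's clock
            vc[tid] = [max(a, b) for a, b in zip(vc[tid], sync[obj])]
        vc[tid][tid] += 1                # this event is the latest in its thread
        if op in ("unlock", "notify"):   # release: publish this thread's clock
            sync[obj] = [max(a, b) for a, b in zip(vc[tid], sync[obj])]
        event_vc[label] = list(vc[tid])
    return event_vc

def happens_before(e1, tid1, e2, evc):   # e1 -> e2 iff VC[e2][thread(e1)] >= VC[e1][thread(e1)]
    return evc[e2][tid1] >= evc[e1][tid1]

trace = [
    ("T1_0", 0, "local",  None), ("T2_0", 1, "local",  None),
    ("T1_1", 0, "notify", "obj"), ("T2_1", 1, "wait",   "obj"),
    ("T2_2", 1, "write",  "obj.data"), ("T2_3", 1, "notify", "obj"),
    ("T1_2", 0, "wait",   "obj"),
    ("T1_3", 0, "lock",   "l"), ("T1_4", 0, "write",  "obj.data"), ("T1_5", 0, "unlock", "l"),
    ("T2_4", 1, "lock",   "l"), ("T2_5", 1, "write",  "obj.data"), ("T2_6", 1, "unlock", "l"),
]
evc = vector_clocks(trace, 2)
print(happens_before("T2_2", 1, "T1_4", evc))   # True: notify/wait orders the two accesses
print(happens_before("T1_4", 0, "T2_5", evc))   # True: the lock orders the two accesses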
19
What does HB miss?
Every race (or false race) reported by a happens-before-based method is also reported by a LockSet-based method: fewer false positives, but potential false negatives. Why? Not every happens-before edge in a trace corresponds to real synchronization.

Thread 1:              Thread 2:
y = y + 1
lock(l)
x = x + 1
unlock(l)
                       lock(l)
                       x = x + 1
                       unlock(l)
                       y = y + 1
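A compact Python illustration of this trace: a happens-before detector (sketched with vector clocks) sees the two y accesses ordered through the unrelated release/acquire of l in this particular schedule, so it misses the race that another schedule would expose.

def hb_race_on_y(trace):                 # trace: (tid, op), tid in {1, 2}
    clock = {1: [0, 0], 2: [0, 0]}       # vector clock per thread
    lock_clock = [0, 0]                  # vector clock of lock l
    y_events = []
    for tid, op in trace:
        if op == "lock":                 # acquire: join with the lock's clock
            clock[tid] = [max(a, b) for a, b in zip(clock[tid], lock_clock)]
        clock[tid][tid - 1] += 1
        if op == "unlock":               # release: publish this thread's clock
            lock_clock = [max(a, b) for a, b in zip(clock[tid], lock_clock)]
        if op == "y":
            y_events.append((tid, list(clock[tid])))
    (t1, c1), (t2, c2) = y_events
    ordered = c2[t1 - 1] >= c1[t1 - 1] or c1[t2 - 1] >= c2[t2 - 1]
    return not ordered                   # race iff the two y accesses are unordered

trace = [(1, "y"), (1, "lock"), (1, "x"), (1, "unlock"),
         (2, "lock"), (2, "x"), (2, "unlock"), (2, "y")]
print(hb_race_on_y(trace))               # False: the race on y goes unreported on this trace

Lockset, by contrast, refines Cands[y] to the empty set and flags y regardless of the schedule.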
20
Race detection: summary
Race detection is the first line of defence for most concurrent programs; many bugs just show up as race conditions.
Lockset is fast but produces lots of false positives. Happens-before is slow, reports only true data races, but has potential false negatives. There are hybrid techniques that compute approximations of both LockSet and HB.
21
Context bounding and Sequentialization
22
Context bounding
Folk knowledge: most concurrency bugs are shallow in terms of the number of context switches required, most bugs require only small fixes, and most concurrency bugs are atomicity violations or order violations. For an empirical study, see Shan Lu et al., 2006-2008.
Why not check concurrent programs only up to a few context switches? It is much more efficient.
23
CHESS: Systematic exploration
A culmination of techniques proposed by Qadeer et al. in 2004. Correctness is primarily given by assertions in the code; monitors can also be used, and data races, deadlocks, etc. can be detected.
Main idea: use a scheduler that explores traces of the program deterministically, prioritizing traces with few context switches.
24
CHESS: Controlling the scheduler
Sources of non-determinism: input, scheduling, timing and libraries.
Input non-determinism is controlled by specifying fixed inputs; scheduling non-determinism by writing a deterministic scheduler; library non-determinism by modeling library code.
25
State-space explosion
Exploring k steps in each of the n threads: the number of executions is O(n^(nk)).
Exploring k steps in each thread but only c context switches: the number of executions is O((n^2 k)^c · n!), which is not exponential in k.
(Figure: n threads Thread1 … Threadn, each executing k steps x = 1, …, y = k.)
Additionally, the scheduler can use a polynomial amount of space: remember the c spots for the context switches and the permutation of the n + c atomic blocks.
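Plugging small numbers into the two bounds makes the difference visible (a rough, illustrative comparison only):

# Rough comparison of the two bounds for n threads, k steps each, c context switches.
from math import factorial

n, k = 4, 20
print("unbounded:", n ** (n * k))                          # O(n^(nk))
for c in (1, 2, 3):
    print("c =", c, ":", (n * n * k) ** c * factorial(n))  # O((n^2 k)^c * n!)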
26
Scheduling: Picking pre-emption points

void Deposit100() {
    ChessSchedule(); EnterCriticalSection(&cs);
    balance += 100;
    ChessSchedule(); LeaveCriticalSection(&cs);
}

void Withdraw100() {
    int t;
    ChessSchedule(); EnterCriticalSection(&cs);
    t = balance;
    ChessSchedule(); LeaveCriticalSection(&cs);
    ChessSchedule(); EnterCriticalSection(&cs);
    balance = t - 100;
    ChessSchedule(); LeaveCriticalSection(&cs);
}

Heuristics: more pre-emption points in critical code, etc.
Coverage guarantee: when all traces with 2 context switches have been explored, every remaining bug requires at least 3 context switches.
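The exploration itself can be sketched in a few lines. The Python toy below enumerates schedules of threads given as lists of atomic steps (the code between two ChessSchedule points), bounding the number of pre-emptions; it is a made-up model of the Deposit100/Withdraw100 example (locks elided, balance check invented for illustration), not the real CHESS scheduler.

import copy

def explore(threads, init_state, bound, check):
    """Enumerate all schedules with at most `bound` pre-emptions; return bad final states."""
    bugs = []
    def dfs(pcs, st, current, budget):
        enabled = [i for i, t in enumerate(threads) if pcs[i] < len(t)]
        if not enabled:
            if not check(st):
                bugs.append(st)
            return
        for i in enabled:
            # Switching away from a still-enabled thread costs one pre-emption.
            cost = 1 if (current in enabled and i != current) else 0
            if cost > budget:
                continue
            new_pcs, new_st = list(pcs), copy.deepcopy(st)
            threads[i][new_pcs[i]](new_st)        # run one atomic step of thread i
            new_pcs[i] += 1
            dfs(new_pcs, new_st, i, budget - cost)
    dfs([0] * len(threads), init_state, None, bound)
    return bugs

def deposit(st):        st["balance"] += 100
def withdraw_read(st):  st["t"] = st["balance"]
def withdraw_write(st): st["balance"] = st["t"] - 100

threads = [[deposit], [withdraw_read, withdraw_write]]
for bound in (0, 1):
    bugs = explore(threads, {"balance": 0, "t": 0}, bound,
                   lambda st: st["balance"] == 0)
    print("pre-emption bound", bound, "->", len(bugs), "buggy schedule(s)")

With a bound of 0 the lost-update bug is missed; allowing a single pre-emption, between the read and the write of the balance, exposes it.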
27
CHESS: Summary
Build a deterministic scheduler. Complications: fairness, livelocks, and weak memory models.
Advantages: runs real code on real systems; only the scheduler is replaced.
Disadvantages: it is mostly program-agnostic, and the exploration is exhaustive.
28
Sequentialization
CHESS approach: concurrent program + bound on context switches, then explore all interleavings within the bound.
General sequentialization approach: concurrent program + bound on context switches is translated into a sequential program; then verify the sequential program using your favourite verification technique.
Many flavours of context-bounded analysis: PDS-based (Qadeer et al.); transformation-based sequentialization, eager and lazy (Lal et al.); BMC-based (Parlato et al.).
29
Sequentialization: Basic idea
What is hard about sequentialization? You have to remember local variables across phases (even though they don't change): if exploring T1 T2 T1, you have to remember the locals of T1 across the phase of T2.
Lal-Reps 2008: instead, do a source-to-source transformation. Copy each statement and global variable c times (once per context-switch round); now we can explore T1 T1 T2 instead of T1 T2 T1, and only one thread's local variables are relevant at each stage.
30
Sequentialization: Basic idea
Replace each global variable X by X[tid][0..K], where X[tid][i] represents the value of the global variable X the i-th time thread tid is scheduled.

X := X + 1   becomes

if (phase = 0)      X[tid][0] := X[tid][0] + 1
else if (phase = 1) X[tid][1] := X[tid][1] + 1
…
else if (phase = K) X[tid][K] := X[tid][K] + 1

Context switches are simulated by inserting, at non-deterministic points:
if (phase < K && *) phase++
if (phase == K + 1) { phase = 1; Thread[tid+1]() }
31
Sequentialization: Basic idea
A program T1 || T2 is rewritten into Seq(T1); Seq(T2); check().
Roughly: execute each thread sequentially, but at non-deterministic points guess new values for the global variables; in the end, check that the guessed values are consistent:

for phase = 0 to K
    if (phase > 0)
        assume(X[0][phase] == X[N][phase - 1])
    for tid = 1 to N
        assume(X[tid][phase] == X[tid-1][phase])
32
Sequentialization

Thread 0:                     Thread 1:
…                             …
X[0][0] := X[0][0] + 1        X[1][0] := X[1][0] + 1
…                             …
X[0][1] := X[0][1] + 1        X[1][1] := X[1][1] + 1
…                             …
X[0][2] := X[0][2] + 1        X[1][2] := X[1][2] + 1
…                             …

(In the figure, each green arrow linking these copies is one part of the check.)
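The whole transformation can be made concrete with a small executable sketch. The Python below applies the idea to two toy threads and one global x: the guessed values at the phase boundaries are enumerated over a small finite domain (standing in for the symbolic guesses), each thread is run sequentially through all its phases, and the consistency checks from the previous slide filter out inconsistent guesses. The thread programs, the domain and the bound are all made up for illustration.

from itertools import product

THREADS = [
    [lambda x: x + 1, lambda x: x * 2],   # thread 0: x := x + 1; x := x * 2
    [lambda x: x + 3],                    # thread 1: x := x + 3
]
K = 1                     # each thread is scheduled in phases 0..K
DOMAIN = range(10)        # finite stand-in for the guessed boundary values

def run_thread(stmts, init, switches):
    """Run one thread through all its phases; switches[p] is the index of the
    first statement executed in phase p+1. Returns the final value per phase."""
    bounds = list(switches) + [len(stmts)]
    final, start = [], 0
    for p in range(K + 1):
        x = init[p]
        for s in range(start, bounds[p]):
            x = stmts[s](x)
        final.append(x)
        start = bounds[p]
    return final

def reachable(x0):
    n, results = len(THREADS), set()
    switch_choices = [
        [c for c in product(range(len(t) + 1), repeat=K) if list(c) == sorted(c)]
        for t in THREADS]
    for guess in product(DOMAIN, repeat=n * (K + 1)):
        init = [list(guess[t * (K + 1):(t + 1) * (K + 1)]) for t in range(n)]
        if init[0][0] != x0:              # phase 0 of thread 0 starts at the real initial value
            continue
        for sw in product(*switch_choices):
            fin = [run_thread(t, init[i], sw[i]) for i, t in enumerate(THREADS)]
            # Consistency: each phase starts where the previous thread's copy of the
            # same phase ended, and phase p of thread 0 starts where the last thread
            # left phase p-1.
            if all(init[t][p] == fin[t - 1][p] for p in range(K + 1) for t in range(1, n)) and \
               all(init[0][p] == fin[n - 1][p - 1] for p in range(1, K + 1)):
                results.add(fin[n - 1][K])
    return results

print(sorted(reachable(0)))   # [5, 8]: the values of x reachable within this round bound

In a real implementation the guesses are symbolic variables and the checks become assume statements discharged by the back-end verifier; the enumeration here only makes the idea executable.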
33
Sequentialization
The original Lal/Reps technique uses summarization to verify the sequential program: compute summaries relating the initial and final values of the global variables. This is an extremely powerful idea.
Advantages: it reduces the need to reason about the locals of different threads, and there is no need to reason explicitly about interleavings, since the interleavings are encoded into data (variables). It scales linearly with the number of threads.
34
Sequentialization and BMC
Currently, the best tools in the concurrency verification competitions use "sequentialization + BMC". The previous sequentialization technique is better suited to analysis techniques than to model checking: there is no additional advantage in introducing extra globals and then checking them for consistency; instead, just use non-determinism explicitly.
35
BMC for concurrency
First, rewrite the threads by unrolling loops and inlining function calls, so that there are no loops, no function calls, and only forward control flow. Then write a driver "main" function that schedules the threads one by one.
36
Naïve sequentialization for BMC

Main driver:
pc0 = 0, …, pcn = 0
main() {
    for (r = 0; r < K; r++)
        for (i = 0; i < n; i++)
            thread_i();
}

thread_i():
    switch (pc_i) { case 0: goto 0; case 1: goto 1; … }
    0: CS(0); stmt_0;
    1: CS(1); stmt_1;
    …
    M: CS(M); stmt_M;

CS(j) := if (*) { pc_i = j; return }

The resume mechanism jumps to the "right" spot in the thread; there is a potential context switch before each statement.
What's the problem? Lots of jumps in the control flow, which is bad for the SMT encoding.
37
Better sequentialization for BMC

Main driver:
pc0 = 0, …, pcn = 0
main() {
    for (r = 0; r < K; r++)
        for (i = 0; i < n; i++) {
            nextCS = *
            assume(nextCS >= pc_i)
            thread_i();
            pc_i = nextCS
        }
}

thread_i():
    0: CS(0); stmt_0;
    1: CS(1); stmt_1;
    …
    M: CS(M); stmt_M;

CS(j) := if (j = nextCS) { goto j+1; }

This avoids the multiple control-flow-breaking jumps and restricts the non-determinism to one spot.
38
Context bounding and Sequentialization: Summary
A host of related techniques that can be adapted for analysis, model checking, testing, etc.; different techniques need different kinds of tuning.
Basic idea: most bugs require few context switches to show up, and we can leverage standard sequential program-analysis techniques.
39
Odds and Ends: Things we didn't cover
40
Specification-free correctness
In many cases we don't want to write assertions; we just want the concurrent program to do the same thing as a sequential program.
Standard correctness conditions: linearizability [Herlihy/Wing 91] and serializability [Papadimitriou and others, 1970s].
(Figure: a concurrent execution of Method 0, Method 1, Method 2, Method 3 matched against an equivalent sequential execution.)
41
Testing for concurrency
Root causes of bugs: ordering violations, atomicity violations, data races.
Coverage metrics and coverage-guided search:
  Define-use pairs [Tasirin et al.]: find ordering violations based on define-use orderings.
  HaPSet [Wang et al.]: find interesting interleavings by trying to cover all "immediate histories" of events.
  Cute/JCute [Sen et al.]: concolic testing, accumulating constraints along a test run to guide future test runs.
42
(Symbolic) Predictive Analysis
Analyze variations of a given concurrent trace: run a test and record information, build a predictive model by relaxing the scheduling constraints, then analyze the predictive model for alternate interleavings. This can flag false bugs.
Symbolic predictive analysis: from a trace, build a precise predictive model (as an SMT formula); no false bugs.
43
This is the End
A brief overview of concurrent verification techniques: Lecture 1 covered full proof techniques; Lecture 2 covered incomplete techniques and bug finding.
What did we learn? Full verification is hard, and there are not many techniques for weak-memory architectures. Use lightweight and incomplete techniques to detect shallow bugs. Code that follows a strict concurrency discipline is more likely to be correct and easier to verify.