Presentation is loading. Please wait.

Presentation is loading. Please wait.

Context-bounded model checking of concurrent software Shaz Qadeer Microsoft Research Joint work with: Jakob Rehof, Microsoft Research Dinghao Wu, Princeton.

Similar presentations


Presentation on theme: "Context-bounded model checking of concurrent software Shaz Qadeer Microsoft Research Joint work with: Jakob Rehof, Microsoft Research Dinghao Wu, Princeton."— Presentation transcript:

1 Context-bounded model checking of concurrent software Shaz Qadeer Microsoft Research Joint work with: Jakob Rehof, Microsoft Research Dinghao Wu, Princeton University

2 Concurrent software Operating systems, device drivers Databases, web servers, browsers, GUIs,... Modern languages: C#, Java Processor 1 Processor 2                         Thread 1 Thread 2 Thread 3 Thread 4

3 Concurrency is increasingly important New classes of concurrent software –Web services –Workflows Single-chip multiprocessors are an architectural inflexion point –Software running on these chips will be even more concurrent

4 Reliable concurrent software? Correctness Problem –does program behaves correctly for all inputs and all interleavings? Bugs due to concurrency are insidious –non-deterministic, timing dependent –difficult to detect, reproduce, eliminate –coverage from testing very poor

5 Analysis of concurrent programs is difficult (1) Finite-data single-procedure program –n lines –m states for global data variables 1 thread –n * m states K threads –(n) K * m states

6 Analysis of concurrent programs is difficult (2) Finite-data program with procedures –n lines –m states for global data variables 1 thread –Infinite number of states –Can still decide assertions in O(n * m 3 ) –SLAM, ESP, BLAST implement this algorithm K  2 threads –Undecidable! (Ramalingam 00)

7 Context-bounded verification of concurrent software           Context Context switch Analyze all executions with small number of context switches !

8 Many subtle concurrency errors are manifested in executions with a small number of contexts Context-bounded analysis can be performed efficiently Why context-bounded analysis?

9 KISS: A static checker for concurrent software An implementation of context-bounded analysis –Technique to use any sequential checker to perform context-bounded concurrency analysis Has found a number of concurrency errors in NT device drivers

10 Sequential program Q KISS Sequential Checker Concurrent program P  No error found Error in Q indicates error in P KISS: A static checker for concurrent software

11 Sequential program Q KISS Concurrent program P KISS: A static checker for concurrent software  No error found Error in Q indicates error in P SDV

12 Sequential program Q KISS Concurrent program P KISS: A static checker for concurrent software  No error found Error in Q indicates error in P PREfix

13 Sequential program Q KISS Concurrent program P KISS: A static checker for concurrent software  No error found Error in Q indicates error in P ESP

14 Inside a static checker for sequential programs int x, y, z; void foo ( ) { if (x > y) { y = x; } if (y > z) { z = y; } assert (x ≤ z); } Symbolically analyze all paths Check the assertion for each path Interprocedural analysis – e.g., PREfix, ESP, SLAM, BLAST

15 KISS strategy Q encodes executions of P with small number of context switches –instrumentation introduces lots of extra paths to mimic context switches Leverage all-path analysis of sequential checkers Sequential program Q KISS Concurrent program P  SDV

16 PnpStop( ) { int t; de->stopping = T; t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); WaitEvent(& de->stopEvent); } DispatchRoutine( ) { int t; if (! de->stopping) { AtomicIncr(& de->count); // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); }

17 DispatchRoutine( ) { int t; if (! de->stopping) { AtomicIncr(& de->count); // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); } PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent); }

18 PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent); } DispatchRoutine( ) { int t; CODE; if (! de->stopping) { CODE; AtomicIncr(& de->count); // do useful work // … CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); } if ( !done ) { if ($) { done = T; PnpStop( ); } } CODE  bool done = F;

19 PnpStop( ) { int t; if ($) return; de->stopping = T; if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); if ($) return; WaitEvent(& de->stopEvent); } DispatchRoutine( ) { int t; CODE; if (! de->stopping) { CODE; AtomicIncr(& de->count); // do useful work // … CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); } if ( !done ) { if ($) { done = T; PnpStop( ); } } CODE  bool done = F; main( ) { DispatchRoutine( ); }

20 PnpStop( ) { int t; CODE; de->stopping = T; CODE; t = AtomicDecr(& de->count); CODE; if (t == 0) SetEvent(& de->stopEvent); CODE; WaitEvent(& de->stopEvent); } DispatchRoutine( ) { int t; if ($) return; if (! de->stopping) { if ($) return; AtomicIncr(& de->count); // do useful work // … if ($) return; t = AtomicDecr(& de->count); if ($) return; if (t == 0) SetEvent(& de->stopEvent); } if ( !done ) { if ($) { done = T; PnpStop( ); } } CODE  bool done = F; main( ) { PnpStop( ); }

21 KISS features KISS trades off soundness for scalability Cost of analyzing a concurrent program P = cost of analyzing a sequential program Q –Size of Q asymptotically same as size of P Unsoundness is precisely quantifiable –for 2-thread program, explores all executions with up to two context switches –for n-thread program, explores up to 2n-2 context switches Allows any sequential checker to analyze concurrency

22

23 Experimental Evaluation of KISS

24 Driver Stopping Error in Bluetooth Driver (1 KLOC) DispatchRoutine() { int t; if (! de->stopping) { AtomicIncr(& de->count); assert ! driverStopped; // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); } PnpStop() { int t; de->stopping = T; t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); WaitEvent(& de->stopEvent); driverStopped = T; }

25 int t; if (! de->stopping) { int t; de->stopping = T; t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); WaitEvent(& de->stopEvent); driverStopped = T; AtomicIncr(& de->count); assert ! driverStopped; // do useful work // … t = AtomicDecr(& de->count); if (t == 0) SetEvent(& de->stopEvent); } Assertion fails!

26 DispatchRoutine(IRP *irp) { … irp->CancelRoutine = PacketCancelRoutine; Enqueue(irp); IoMarkIrpPending(irp); … } IoCancelIrp(IRP *irp) { IoAcquireCancelSpinLock(); if (irp->CancelRoutine) { (irp->CancelRoutine)(irp); } … } PacketCancelRoutine(IRP *irp) { … Dequeue(irp); IoCompleteRequest(irp); IoReleaseCancelSpinLock(); … } IRP Cancellation Error in Packet Driver (2.5 KLOC)

27 … irp->CancelRoutine = PacketCancelRoutine; Enqueue(irp); IoAcquireCancelSpinLock(); if (irp->CancelRoutine) { // inline PacketCancelRoutine(irp) … Dequeue(irp); IoCompleteRequest(irp); IoReleaseCancelSpinLock(); IoMarkIrpPending(irp); Error: An irp should not be marked pending after it has been completed !

28 Data-race Conditions in DDK Sample Drivers Device extension shared among threads Data-races on device extension fields 18 sample DDK drivers –Range 0.5-9.2 KLOC –Total 70 KLOC Each field checked separately with resource limit of 20 minutes and 800MB Two threads: each calls nondeterministically chosen dispatch routine

29 DriverKLOC# Fields# Races Tracedrv0.530 Moufiltr1.0140 Kbfiltr1.1150 Imca1.151 Startio1.190 Toaster/toastmon1.481 Diskperf2.4160 1394diag2.7180 1394vdev2.8181 Fakemodem2.9396 Toaster/bus5.0300 Serenum5.9412 Toaster/func6.6245 Mouclass7.0341 Kbdclass7.4361 Mouser7.6341 Fdc9.2929 Total: 30 races

30 ToastMon_DispatchPnp( DEVICE_OBJECT *obj, IRP *irp) { … IoAcquireRemoveLock(); … case IRP_MN_QUERY_STOP_DEVICE: // Race: write access deviceExt->DevicePnPState = StopPending; … break; … IoReleaseRemoveLock(); … } ToastMon_DispatchPower( DEVICE_OBJECT *obj, IRP *irp) { … // Race: read access if (deviceExt->DevicePnpState == Deleted) { … } … } DevicePnpState Field in Toaster/toastmon

31 Acknowledgments Tom Ball Byron Cook John Henry Doron Holan Vladimir Levin Jakob Lichtenberg Adrian Oney Sriram Rajamani Peter Wieland …

32 Keep It Simple and Sequential Context-bounded analysis by leveraging existing sequential checkers Validates the hypothesis that many concurrency errors require few context switches to show up

33 However… Hard limit on number of explored contexts –e.g., two context switches for concurrent program with two threads Case study: Concurrent transaction management code written in C# (Naik-Rehof 04) –Analyzed by the Zing model checker after automatically translating to the Zing input language –Found three bugs each requiring between three and four context switches

34 Is a tuning knob possible? Given a concurrent boolean program P and a positive integer c, does P go wrong by failing an assertion via an execution with at most c contexts? Given a concurrent boolean program P, does P go wrong by failing an assertion? Undecidable Decidable Given a concurrent boolean program P with unbounded fork-join parallelism and a positive integer c, does P go wrong by failing an assertion via an execution with at most c contexts? Decidable

35           Context Context switch Problem: Unbounded computation possible within each context! Unbounded execution depth and reachable state space Different from bounded-depth model checking

36 Global store g, valuation to global variables Local store l, valuation to local variables Stack s, sequence of local stores State (g, s) Sequential pushdown system Transition relation: (g, s)  (g’, s’)

37 Reachability problem for sequential pushdown system Given (g, s), is there s’ such that (g, s)  * (error,s’)?

38 Aggregate state Set of stacks ss Aggregate state (g, ss) = { (g,s) | s  ss } Reach(g, ss, g’) = {s’ | (g’, s’)  Reach(g, ss)} Reach(g, ss) = { (g’, s’) | exists s  ss such that (g, s)  * (g’, s’) }

39 Theorem (Buchi, Schwoon00) If ss is regular, then Reach(g, ss, g’) is regular. If ss is given as a finite automaton A, then a finite automaton A’ for Reach(g, ss, g’) can be constructed from A in polynomial time.

40 Algorithm Solution: Compute automaton for Reach(g, {s}, error) and report error if it is nonempty. Problem: Given (g, s), is there s’ such that (g, s)  * (error,s’)?

41 Global store g, valuation to global variables Local store l, valuation to local variables Stack s, sequence of local stores State (g, s 1, s 2 ) Concurrent pushdown system Transition relation: (g, s 1 )  (g’, s’ 1 ) in thread 1 (g, s 1, s 2 )  1 (g, s’ 1, s 2 ) (g, s 2 )  (g’, s’ 2 ) in thread 2 (g, s 1, s 2 )  2 (g, s 1, s’ 2 )

42 Reachability problem for concurrent pushdown system Given (g, s 1, s 2 ), are there s’ 1 and s’ 2 such that (g, s 1, s 2 ) reaches (error, s’ 1, s’ 2 ) via an execution with at most c contexts?

43 Aggregate transition relation ss’ 1 = Reach 1 (g, ss 1, g’) (g, ss 1, ss 2 )  1 (g’, ss’ 1, ss 2 ) (g, ss 1, ss 2 )  2 (g’, ss 1, ss’ 2 ) ss’ 2 = Reach 2 (g, ss 2, g’)

44 Algorithm: 2 threads, c contexts   12   12   12 Depth c (g, {s 1 }, {s 2 }) Compute the set of reachable aggregate states. Report an error if (g, ss 1, ss 2 ) is reachable and g = error, ss 1 is nonempty, and ss 2 is nonempty.

45 Complexity: 2 threads, c contexts   12   12   12 Depth of tree = context bound c Branching factor bounded by G  2 (G = # of global stores) Number of edges bounded by (G  2) (c+1) Each edge computable in polynomial time Depth c (g, {s 1 }, {s 2 })

46 Unbounded fork-join parallelism Fork operation: x = fork Join operation: join(x) Copy thread identifier from one variable to another

47 Algorithm: unbounded fork-join parallelism, c contexts At most c threads may perform a transition Reduce to previously solved problem with c threads and c contexts –Nondeterministically pick c forked threads for execution

48 start : {1, …, c}  boolean, initialized to i. (i == 1) end : {1, …, c}  boolean, initialized to i. false x = fork translates to if ($) { assume(count < c); count = count + 1; x = count; start[count] = true; } else { x = c + 1; } join(x) translates to assume(x  c); assume(end[x]); count : {1, …, c}, initialized to 1 c statically created threads thread i starts execution when start[i] is true thread i sets end[i] to true on termination

49 Context-bounded analysis of concurrent software Many subtle concurrency errors are manifested in executions with few context switches –Experience with KISS on Windows drivers –Experience with Zing on transaction manager Algorithms for context-bounded analysis are more efficient than those for unbounded analysis –Reducibility to sequential checking with KISS –Decidability of assertion checking for concurrent boolean programs


Download ppt "Context-bounded model checking of concurrent software Shaz Qadeer Microsoft Research Joint work with: Jakob Rehof, Microsoft Research Dinghao Wu, Princeton."

Similar presentations


Ads by Google