
Abstract Transformers for Thread Correlation Analysis Michal Segalov, TAU Tal Lev-Ami, TAU Roman Manevich, TAU G. Ramalingam, MSR India Mooly Sagiv, TAU.


1 Abstract Transformers for Thread Correlation Analysis. Michal Segalov (TAU), Tal Lev-Ami (TAU), Roman Manevich (TAU), G. Ramalingam (MSR India), Mooly Sagiv (TAU)

2 Motivation: A novel approach for static analysis of highly concurrent algorithms. Goals: verify correctness; alert on (possible) bugs. Challenges: fine-grained synchronization, which requires subtle reasoning about thread interference; heap data structures, which give an unbounded state space.

3 Concurrent Set [M. Michael, SPAA'02]: a set implemented by a linked list, with heavy use of CAS (Compare and Swap) and fine-grained concurrency. add(node) { while (true) { = locate(node.key) if (found) return false; node.next = cur if (CAS(prev.next,, )) return true; } } remove(key) { while (true) { = locate(key) if (!found) return false; if (!CAS(cur.next,, )) continue; if (CAS(prev.next,, )) DeleteNode(curr); return true; else locate(key); } } locate(key) { restart: pred = Head; = pred.next; while (true) { if (curr == null) return ; = curr.next; ckey = curr.key; if (pred.next != ) goto restart; if (!cmark) { if (ckey >= key) return pred = curr; } else { if (CAS(pred.next,, )) DeleteNode(curr); else goto restart; } curr = next; } }

4 Concurrent Set, Ta: add(3) interleaved with Tr: remove(2): CAS fails due to mark bit. add(node1) { while (true) { = locate(node1.key) if (found) return false; node1.next = curr1 if (CAS(prev1.next,, )) return true; } } remove(key) { while (true) { = locate(key) if (!found) return false; if (!CAS(cur2.next,, )) continue; if (CAS(prev2.next,, )) DeleteNode(curr2); return true; else locate(key); } } [Figure: linked list starting at Head, with labels prev1, curr1, node1, prev2, curr2, next2, m, and 1 2 3 4]
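
The failing CAS on this slide can be replayed in a few lines. The following is only a sketch under stated assumptions: `MarkableRef`, `Node`, and the list contents (Head, 1, 2, 4) are hypothetical names and values chosen for illustration, a lock stands in for the hardware CAS, and the two threads' steps are run sequentially in one fixed order rather than concurrently.

```python
import threading


class MarkableRef:
    """A (reference, mark) pair updated atomically. The lock simulates CAS."""

    def __init__(self, ref=None, mark=False):
        self._ref, self._mark = ref, mark
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._ref, self._mark

    def cas(self, exp_ref, exp_mark, new_ref, new_mark):
        # Succeeds only if both the reference and the mark match.
        with self._lock:
            if self._ref is exp_ref and self._mark == exp_mark:
                self._ref, self._mark = new_ref, new_mark
                return True
            return False


class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = MarkableRef(nxt)


# Assumed initial list: Head -> 1 -> 2 -> 4
n4 = Node(4)
n2 = Node(2, n4)
n1 = Node(1, n2)
head = Node(None, n1)

# Ta: add(3) has located prev1 = node 2 and curr1 = node 4.
node3 = Node(3, n4)

# Tr: remove(2) first marks the next field of node 2 (logical deletion).
marked = n2.next.cas(n4, False, n4, True)

# Ta's CAS expects node 2's next field to be unmarked, so it fails
# and add(3) has to retry from locate.
added = n2.next.cas(n4, False, node3, False)

# Tr then unlinks node 2 from its predecessor.
unlinked = n1.next.cas(n2, False, n4, False)
```

Because the mark lives in the same word as the `next` pointer, marking node 2 invalidates any pending insertion after it, which is why the mark-then-unlink order matters.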

5 Detecting a Bug: A node is removed before it is marked. remove(key) { while (true) { = locate(key) if (!found) return false; if (!CAS(cur.next,, )) continue; if (CAS(prev.next,, )) DeleteNode(cur); else locate(key); } }

6 Concurrent Set [M. Michael, SPAA'02], Ta: add(3) interleaved with Tr: remove(2): a memory leak. add(node1) { while (true) { = locate(node1.key) if (found) return false; node1.next = cur1 if (CAS(prev1.next,, )) return true; } } remove(key) { while (true) { = locate(key) if (!found) return false; if (CAS(prev2.next,, )) DeleteNode(cur2); if (!CAS(cur2.next,, )) continue; else locate(key); } } [Figure: linked list starting at Head, with labels prev1, curr1, node1, prev2, curr2, next2, and 1 2 3 4]

7 Main Results: Thread-correlation analysis, a new kind of thread-modular analysis. Precise enough to prove properties of fine-grained concurrent programs that were not automatically proven before. Two transformer enhancements: Summarizing Effects and Summarizing Abstraction. On a concurrent set implementation, the speedup is 34x!

8 Thread-modular Abstraction: Abstraction from the point of view of one thread. Maintains the local store and the global store precisely; abstracts away the local stores of all other threads. Naturally handles an unbounded number of threads. Imprecise modeling of thread interactions (fine-grained concurrency). [Figure: program state, with precise information for the main thread]

9 Thread Correlation Abstraction: Refines the thread-modular abstraction to reason about thread interactions. Tracks correlations between the local stores of every two threads. 3 levels of abstraction: main thread (precise information), secondary thread (tracked less precisely), all other threads. Main and secondary threads are abstracted asymmetrically.
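
To make the difference between the two abstractions concrete, here is a minimal sketch over explicit finite sets of states. It is a simplification, not the analysis from the talk: the state encoding `(global, tuple_of_locals)`, the function names, and the sample states are all assumptions, and both threads of a pair are tracked with full precision, whereas the talk abstracts the secondary thread more coarsely.

```python
from itertools import permutations

# A concrete state is (g, locals), where g is the global store and
# locals is a tuple with one local store (pc, c) per thread.


def thread_modular(states):
    """Keep each thread's local store together with the global store."""
    return {(g, l) for g, locs in states for l in locs}


def thread_correlation(states):
    """Keep every ordered pair of local stores (2-thread factoids)."""
    return {(g, a, b) for g, locs in states for a, b in permutations(locs, 2)}


def admitted_modular(abstraction, state):
    g, locs = state
    return all((g, l) in abstraction for l in locs)


def admitted_correlation(abstraction, state):
    g, locs = state
    return all((g, a, b) in abstraction for a, b in permutations(locs, 2))


# Two reachable states: one consumer is at pc 6 holding object "o",
# the other is still waiting at pc 4.
reachable = {
    (True, ((6, "o"), (4, None))),
    (True, ((4, None), (6, "o"))),
}

# An unreachable state: both consumers about to dispose the same object.
double_free = (True, ((6, "o"), (6, "o")))

tm = thread_modular(reachable)
tc = thread_correlation(reachable)
modular_admits = admitted_modular(tm, double_free)
correlation_admits = admitted_correlation(tc, double_free)
```

In this toy setting the thread-modular abstraction cannot rule out the double-free state, because each thread on its own looks reachable, while the pairwise factoids exclude it.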

10 Singleton Buffer Example. Properties: safe dereference; no double free. boolean empty = true; Object b = null; produce() { 1: Object p = new(); 2: await (empty) then { b = p; empty = false; } 3: } consume() { Object c; 4: await (!empty) then { c = b; empty = true; } 5: use(c); 6: dispose(c); 7: }
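
For readers who want to run the example, the `await (cond) then { ... }` blocks can be approximated with a condition variable. This is a sketch under assumptions: the class name `SingletonBuffer` and the driver threads are illustrative, and `use(c)` and `dispose(c)` are omitted because Python is garbage collected.

```python
import threading


class SingletonBuffer:
    def __init__(self):
        self.empty = True
        self.b = None
        self._cv = threading.Condition()

    def produce(self, p):
        # await (empty) then { b = p; empty = false; }
        with self._cv:
            self._cv.wait_for(lambda: self.empty)
            self.b = p
            self.empty = False
            self._cv.notify_all()

    def consume(self):
        # await (!empty) then { c = b; empty = true; }
        with self._cv:
            self._cv.wait_for(lambda: not self.empty)
            c = self.b
            self.empty = True
            self._cv.notify_all()
        return c


buf = SingletonBuffer()
items = [object() for _ in range(3)]
received = []


def run_producer():
    for p in items:
        buf.produce(p)


def run_consumer():
    for _ in items:
        received.append(buf.consume())


consumer = threading.Thread(target=run_consumer)
producer = threading.Thread(target=run_producer)
consumer.start()
producer.start()
producer.join()
consumer.join()
```

Each object is handed to exactly one `consume` call, which is the runtime counterpart of the "no double free" property the analysis is meant to prove statically.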

11 Thread Modular Abstraction [Figure: states over the global empty and consumer threads C1 to C4 (locals c1 to c4, at PCs 4: and 6:), abstracted separately per thread]

12 Thread Modular Abstraction [Figure: the same per-thread abstraction for C1 to C4, together with a state over c1 to c4 at PCs 6: and 4:]

13 Thread Correlation Abstraction [Figure: the same states abstracted into 2-thread factoids for the pairs C1,C2; C1,C3; C1,C4; C2,C1; C2,C3; C2,C4; ...]

14 Concretization Example [Figure: 2-thread factoids for the pairs C1,C2; C1,C3; C1,C4; C2,C1; C2,C3; C2,C4; ... and a state over c1 to c4 at PCs 6: and 4:]

15 Abstractions Compared: Thread-modular abstraction has 2 levels of abstraction (main thread: precise information). Thread-correlation abstraction has 3 levels of abstraction (main thread: precise information; secondary thread: tracked less precisely).

16 Point-wise Transformer [Figure: statement 6: C1: dispose(c1) applied to a factoid over empty, c1, c3, b; C1 moves from PC 6: to PC 7:]

17 Point-wise Transformer: Is 6: C1: dispose(c1) safe? Single factoid: no. All factoids: yes! [Figure: factoids over empty, b, c1 to c4, with unknown PCs marked ?:]

18 Build 3-Thread Factoids (model effect C1 has on C2). C1: Executing, C2: Tracked, C3: Other. [Figure: 2-thread factoids for the pairs C1,C2; C1,C3; C1,C4; C2,C1; C2,C3; C2,C4; ... combined into 3-thread factoids]

19 3-Thread Factoids. C1: Executing, C2: Tracked, C3: Other. [Figure: 2-thread factoids for the pairs C1,C2; C1,C3; C2,C1; C2,C3 and the resulting 3-thread factoids over c1, c2, c3]

20 6: C1: dispose(c) (exec). C1: Executing, C2: Tracked, C3: Other. [Figure: 3-thread factoids before and after the statement; C1 moves from PC 6: to PC 7:]

21 6: C1: dispose(c) (project). C1: Executing, C2: Tracked, C3: Other. [Figure: the updated 3-thread factoids projected onto 2-thread factoids for the pairs C2,C3 and C2,C1]
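
The combine, exec, and project steps on slides 18 to 21 can be sketched over explicit sets. This is a simplified illustration, not the transformer from the paper: the factoid encoding `(g, main, secondary)`, the consistency check (all pairwise factoids among the three threads must already be present), and the names `baseline_transform` and `dispose_step` are assumptions made for the example.

```python
def dispose_step(g, local):
    """Assumed effect of the statement at pc 6: only the pc advances."""
    pc, c = local
    return g, (pc + 1, c)


def baseline_transform(factoids, pc, step):
    """Model the effect of an executing thread e on factoids (t, o)."""
    out = set()
    for g, e, t in factoids:                 # factoid C1,C2: e executes, t is tracked
        if e[0] != pc or (g, t, e) not in factoids:
            continue
        for g2, t2, o in factoids:           # factoid C2,C3: o is the other thread
            if g2 != g or t2 != t:
                continue
            if (g, e, o) not in factoids:    # factoid C1,C3 must exist as well
                continue
            g1, e1 = step(g, e)              # exec on the 3-thread substate
            out.add((g1, t, o))              # project onto C2,C3
            out.add((g1, t, e1))             # project onto C2,C1
    return out


# One consumer may be at pc 6 holding "o"; any others are at pc 4.
factoids = {
    (True, (6, "o"), (4, None)),
    (True, (4, None), (6, "o")),
    (True, (4, None), (4, None)),
}
updated = baseline_transform(factoids, 6, dispose_step)
```

The nested loop over pairs of factoids is where the quadratic blow-up comes from: the number of candidate 3-thread substates grows with the square of the number of 2-thread factoids.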

22 Transformers Spectrum (from precise to efficient): most-precise transformer (incomputable?); baseline transformer (precise enough, quadratic blow-ups); with Summarizing Effects (precise enough, more efficient); with Summary Abstraction (precise enough, efficient); point-wise transformer, thread-modular (efficient, imprecise).

23 Reducing Quadratic Blow-ups: the number of 3-thread factoids is O(|2-thread factoids|^2). Summarizing Effects: memoize computations on common substates; no over-approximation. Summary Abstraction: aggressive abstraction to the executing thread; crucial for performance; potential loss of precision.

24 Memoizing PCs, 6: C1: dispose(c). C1: Executing, C2: Tracked, C3: Other. [Figure: 2-thread factoids for the pairs C1,C2; C1,C3; C2,C1; C2,C3 combined into 3-thread factoids that differ only in the PCs (6:, 5:) of C2 and C3; exec moves C1 to PC 7:; proj yields factoids for C2,C1 and C2,C3]

25 Memoizing PCs, 6: C1: dispose(c). C1: Executing, C2: Tracked, C3: Other. These states are identical up to the PCs, which are invisible to the executing thread. [Figure: the same 2-thread factoids for the pairs C1,C2; C1,C3; C2,C1; C2,C3]

26 Memoizing PCs, 6: C1: dispose(c). C1: Executing, C2: Tracked, C3: Other. [Figure: the PCs (6:, 5:, 4:) of C2 and C3 are set aside as a frame; a single 3-thread factoid is executed (C1 moves to PC 7:) and projected onto C2,C1 and C2,C3, after which the frame is restored]
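
The framing idea on slides 24 to 26 can be sketched as a cache keyed on the part of a 3-thread substate that the executing thread can observe. This is a minimal sketch, assuming a substate encoding `(g, e, (t_pc, t_c), (o_pc, o_c))` and hypothetical names `exec_memoized` and `dispose_visible`; it is not the implementation evaluated in the talk.

```python
def dispose_visible(g, e, t_c, o_c):
    """Assumed statement effect on the visible part: only e's pc advances."""
    pc, c = e
    return g, (pc + 1, c), t_c, o_c


def exec_memoized(substates, step):
    """Run `step` once per distinct visible part, then restore the frames."""
    cache = {}
    out = set()
    for g, e, (t_pc, t_c), (o_pc, o_c) in substates:
        key = (g, e, t_c, o_c)            # PCs of the other threads are framed out
        if key not in cache:
            cache[key] = step(*key)
        g1, e1, t_c1, o_c1 = cache[key]
        out.add((g1, e1, (t_pc, t_c1), (o_pc, o_c1)))   # re-attach the frame
    return out, len(cache)


# Four substates that differ only in the PCs of the tracked and other threads.
substates = {
    (True, (6, "o1"), (t_pc, "o2"), (o_pc, "o3"))
    for t_pc in (5, 6)
    for o_pc in (5, 6)
}
result, step_calls = exec_memoized(substates, dispose_visible)
```

Since the cached result is exactly what a per-substate execution would have produced, this kind of memoization saves work without over-approximating, in line with the "No over-approximation" remark on slide 23.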

27 Evaluation: Implemented on top of TVLA. Handles an unbounded number of threads and an unbounded number of objects. Thread-modular analysis was not precise enough; thread-correlation analysis proved the required properties. Reproduced injected errors.

28 Speedups Relative to Baseline

29 Related Work: Thread-modular abstractions: finite-state model checking [Flanagan & Qadeer, SPIN'03]; environment abstraction [Clarke et al., VMCAI'06, TACAS'08]. Thread-modular shape analysis: coarse-grained concurrency [Gotsman et al., PLDI'07]; fine-grained concurrency [SAS'08, CAV'08].

30 Summary: New analysis for concurrent systems. Thread-correlation abstraction; handles an unbounded number of threads. Two important transformer enhancements, summarizing effects and summary abstraction, reduce quadratic blow-ups. Empirically evaluated.

31 Thanks!

32 Which Properties Did You Prove? (columns: Data Structure Invariants, Linearization) Hand Over Hand ✓; DCAS ✓; CAS ✓; Lazy List ✓; Maged ✓; Maged Opt ✓

33 Why 3 Levels of Abstraction? Generalizes naturally by maintaining k local stores in the second level; k=1 suffices for our benchmarks. The same principles apply for optimizing. More than 3 levels of abstraction complicate reasoning; usefulness is not obvious.

34 What is the Increment Relative to CAV'08? CAV'08 uses two levels of abstraction (thread-modular). Its baseline transformer is too expensive: it timed out on some of our benchmarks.

35 Does Baseline Transformer Make Sense? It is the transformer used by the earlier CAV'08 paper and the starting point of [Flanagan & Qadeer, SPIN'03]. We added optimizations by distinguishing 3 levels of abstraction.

36 Which Properties Did You Prove? (columns: Data Structure Invariants, Linearization) Hand Over Hand ✓; DCAS ✓ (and in other thread too); CAS ✓; Lazy List ✓; Michael ✓; Michael Opt ✓

37 Summary Abstraction: Sound approximation heuristics that reduce precision; details in paper. [Figure: baseline transformer, with precise and coarse reasoning, compared with summary abstraction, with coarse reasoning]

38 A Singleton Buffer - Modified. Boolean empty = true; Object b = null; produce() { 1: Object p = new(); 2: await (empty) then { b = p; empty = false; } 3: } consume() { Object c; Boolean x; 4: await (!empty) then { c = b; empty = true; } 5: x = f(c); 6: dispose(c); 7: use(x); 8: }

39 Example, 6: C1: dispose(c) [Figure: 2-thread factoids for the pairs C1,C2; C1,C3; C2,C1; C2,C3, distinguishing x1 from !x1 and carrying x2; the resulting substates over c1, c2, c3 are executed (C1 moves from PC 6: to PC 7:) and projected onto C2,C1 and C2,C3]

40 Example, 6: C1: dispose(c) [Figure: the same 2-thread factoids; a single substate over c1, c2, c3 is executed (C1 moves from PC 6: to PC 7:) and projected onto C2,C1 and C2,C3]

41 Running Times

42 Types of Algorithms (columns: Lock free, Wait free) Hand Over Hand ✓ No No; DCAS ✓; CAS No ✓ (locate); Lazy List ✓ ✓ (locate); Michael ✓ No; Michael Opt ✓ ✓ (locate)

43 Baseline Transformer [Figure: factoids are combined into 3-thread substates; statement st is executed by the executing (1st) thread and its effect on the tracked thread is computed]

44 Conditions Ensuring No Loss of Precision: The abstraction does not distinguish between local stores with the same footprint. Footprint is idempotent.

