Presentation is loading. Please wait.

Presentation is loading. Please wait.

T. Lev-Ami, R. Manevich, M. Sagiv TVLA: A System for Generating Abstract Interpreters A. Loginov, G. Ramalingam, E. Yahav.

Similar presentations


Presentation on theme: "T. Lev-Ami, R. Manevich, M. Sagiv TVLA: A System for Generating Abstract Interpreters A. Loginov, G. Ramalingam, E. Yahav."— Presentation transcript:

1 T. Lev-Ami, R. Manevich, M. Sagiv http://www.cs.tau.ac.il/~tvla TVLA: A System for Generating Abstract Interpreters A. Loginov, G. Ramalingam, E. Yahav T. Reps & R. Wilhelm

2 Object Oriented Programs Extensive use of the heap Dynamic allocation Recursive data structures Destructive reference updates x.f := y

3 Verification of OO Programs Verify properties about the heap No runtime errors –Null dereferences –Freed memory –Dangling pointers No memory leaks Partial correctness

4 A Common Approach to Pointer Analysis in Software Verification Break verification problem into two phases –Preprocessing phase – separate points-to analysis typically no distinction between objects allocated at same site –Verification phase – verification using points-to results May lose precision –May lose ability to perform strong updates –May produce false alarms Program Property Pointer Analysis Verification Analysis

5 Loss of Precision in Two-Phase Approach f = new InputStream(); f.read(); f.close(); Verify f is not read after it is closed Straightforward...

6 Loss of Precision in Two-Phase Approach while (?) { f = new InputStream(); f.read(); f.close(); } f1 f False alarm! “read may be erroneous” closed=1/2

7 The TVLA Approach Express concrete semantics using first order logic Abstract Interpretation using 3-valued Kleene’s logic 1.verification integrated with heap analysis  heap analysis (merging criterion) can adapt to verification Program Property Front end First-Order Transition Sys Abstract Interpretation

8 while (?) { f = new InputStream(); f.read(); f.close(); } The TVLA Approach: An Example f closed f f f … Concrete States f closed f f Abstract States

9 Outline Expressing concrete semantics using logic (TVLA input) Canonincal abstraction Abstract interpretation using Canonincal abstraction (TVLA engine) Applications & ongoing Work

10 Traditional Heap Interpretation States = Two level stores –env: Var  Values –fields: Loc  Values –Values=Loc  Atoms Example –env = [x  30, p  79] –next = [30  40, 40  50, 50  79, 79  90] –val = [30  1, 40  2, 50  3, 79  4, 90  5] 1 40 2 50 3 79 4 90 5 0 x p 30 40 50 79 90

11 Predicate Logic Vocabulary –A finite set of predicate symbols P each with a fixed arity Logical Structures S provide meaning for predicates –A set of individuals (nodes) U –p S : (U S ) k  {0, 1} FO TC over TC,  express logical structure properties

12 Representing Stores as Logical Structures Locations  Individuals Program variables  Unary predicates Fields  Binary predicates Example –U = {u1, u2, u3, u4, u5} –x = {u1}, p = {u3} –next = {,,, } u1u2u3u4u5 x n nn n p

13 Concrete Interpretation Rules (Lists) StatementSafety PreconditionPostcondition x =NULL1  v:  x ’ (v) x=malloc()  v 0 :  active(v 0 )  v: x ’ (v)  eq(v, v 0 )  v: active ’ (v)  active(v)  eq(v,v 0 ) x=y1  v:x ’ (v)  y(v) x=y  next  v 0 : y(v 0 )  active(v 0 )  v:x ’ (v)  next(v 0, v) x  next=y  v 0 : x(v 0 )  active(v 0 )  v 1,v 2 : next ’ (v 1, v 2 )  (  eq(v 1, v 0 )  next(v 1, v 2 ))  (eq(v 1, v 0 )  y(v 2 ))

14 Invariants No memory leaks  v: active(v)   {x  PVar}  v 1 : x(v 1 )  next*(v 1, v) Acyclic list(x)  v 1,v 2 : x(v 1 )  next*(v 1, v 2 )   next + (v 2, v 1 ) Reverse (x)  v 1,v 2, v 3 : x(v1)  next*(v 1, v 2 )  next(v 2, v 3 )  next ’ (v 3, v 2 )

15 1: True 0: False 1/2: Unknown A join semi-lattice: 0  1 = 1/2 Kleene 3-Valued Logic   1/2 Information order Logical order

16 Boolean Connectives [Kleene]

17 3-Valued Logical Structures A set of individuals (nodes) U Predicate meaning –p S : (U S ) k  {0, 1, 1/2}

18 Canonical Abstraction Partition the individuals into equivalence classes based on the values of their unary predicates –Every individual is mapped into its equivalence class Collapse predicates via  –p S (u ’ 1,..., u ’ k ) =  {p B (u 1,..., u k ) | f(u 1 )=u ’ 1,..., f(u ’ k )=u ’ k ) } At most 2 A abstract individuals

19 Canonical Abstraction x = NULL; while (…) do { t = malloc(); t  next=x; x = t } u1 x t u2 u3 u1 x t u2,3 n n n n u1 x t u2 u3 n n

20 x t n n u2 u1 u3 Canonical Abstraction x = NULL; while (…) do { t = malloc(); t  next=x; x = t } u1 x t u2,3 n n n   

21 Canonical Abstraction and Equality Summary nodes may represent more than one element (In)equality need not be preserved under abstraction Explicitly record equality Summary nodes are nodes with eq(u, u)=1/2

22 Canonical Abstraction and Equality x = NULL; while (…) do { t = malloc(); t  next=x; x = t } u1 x t u2 u3 u1 x t u2,3 eq n n n n  u2,3

23 Canonical Abstraction x = NULL; while (…) do { t = malloc(); t  next=x; x = t } u1 x t u2 u3 n n u1 x t u2,3 n n

24 Heap Sharing predicate is(v)=0 u1u1 x t u2u2 unun … u1 x t u 2..n n n  v:is(v)   v 1,v 2 : next(v 1,v)  next(v 2,v)   eq(v 1, v 2 ) is(v)=0 n n n

25 Heap Sharing predicate is(v)=0 u1u1 x t u2u2 unun … is(v)=1is(v)=0 n n n n u1 x t u2 n is(v)=0is(v)=1is(v)=0 n u 3..n n n  v:is(v)   v 1,v 2 : next(v 1,v)  next(v 2,v)   eq(v 1, v 2 )

26 Best Conservative Interpretation (CC79) Abstraction Concretization Concrete Representation Collecting Interpretation  st  c Concrete Representation Abstract Representation Abstract Representation Abstract Interpretation  st  #

27  y x y x...... Evaluate update formulas y x y x...... inverse embedding y x y x Canonincal Canonincal abstraction x y “ Focus ” - Based Transformer (x = x  n)

28 y x y x Evaluate update Formulas (Kleene) y x y x Canonincal y x y x Focus(x  n) “Partial  ” x y

29 Example: Abstract Interpretation t x n x t n x t n n x t t x n t t n t x t x t x empty x t n n x t n n n x t n t n x n x t n n return x x = t t =malloc(..); t  next=x; x = NULL T F

30 Summary

31 Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000) Static analysis of C programs manipulating linked lists Defects checked  References to freed memory  Null dereferences  Memory leaks Existing algorithms are inadequate  Miss errors  Report too many false alarms

32 Dereference of NULL pointers typedef struct element { int value; struct element *next; } Elements bool search(int value, Elements *c) { Elements *elem; for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE

33 Dereference of NULL pointers typedef struct element { int value; struct element *next; } Elements bool search(int value, Elements *c) { Elements *elem; for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE potential null dereference

34 Dereference of NULL pointers typedef struct element { int value; struct element *next; } Elements bool search(int value, Elements *c) { Elements *elem; for (elem = c; elem != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE potential null dereference

35 Memory leakage Elements* reverse(Elements *c) { Elements *h,*g; h = NULL; while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; } return h; typedef struct element { int value; struct element *next; } Elements

36 Memory leakage Elements* reverse(Elements *c) { Elements *h,*g; h = NULL; while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; } return h; leakage of address pointed-by h typedef struct element { int value; struct element *next; } Elements

37 Memory leakage Elements* reverse(Elements *c) { Elements *h,*g; h = NULL; while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; } return h; typedef struct element { int value; struct element *next; } Elements ✔ No memory leaks

38 Proving Correctness of Sorting Implementations (Lev-Ami, Reps, S, Wilhelm ISSTA 2000) Partial correctness  The elements are sorted  The list is a permutation of the original list Termination  At every loop iterations the set of elements reachable from the head is decreased

39 Example: InsertSort List InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r  n; pl = NULL; while (l != r) { if (l  data > r  data) { pr  n = rn; r  n = l; if (pl = = NULL) x = r; else pl  n = r; r = pr; break; } pl = l; l = l  n; } pr = r; r = rn; } return x; } typedef struct list_cell { int data; struct list_cell *n; } *List; assert sorted[x]; assert permutation[x,x’]  v:inOrder[dle, n](v)   v1: next(v, v1)  dle(v,v1)

40 Example: InsertSort Run Demo List InsertSort(List x) { if (x == NULL) return NULL pr = x; r = x->n; while (r != NULL) { pl = x; rn = r->n; l = x->n; while (l != r) { pr->n = rn ; r->n = l; pl->n = r; r = pr; break; } pl = l; l = l->n; } pr = r; r = rn; } typedef struct list_cell { int data; struct list_cell *n; } *List; 14

41 Example: Mark and Sweep void Sweep() { unexplored = Universe collected =  while (unexplored   ) { x = SelectAndRemove(unexplored) if (x  marked) collected = collected  {x} } assert(collected = = Universe – Reachset(root) ) } void Mark(Node root) { if (root != NULL) { pending =  pending = pending  {root} marked =  while (pending   ) { x = SelectAndRemove(pending) marked = marked  {x} t = x  left if (t  NULL) if (t  marked) pending = pending  {t} t = x  right if (t  NULL) if (t  marked) pending = pending  {t} } assert(marked = = Reachset(root)) }

42 Challenges: Heap & Concurrency [Yahav POPL ’ 01] Concurrency with the heap is evil… Java threads are just heap allocated objects Data and control are strongly related  Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue)  Heap analysis requires information about thread scheduling Thread t1 = new Thread(); Thread t2 = new Thread(); … t = t1; … t.start();

43 Configurations – Example at[l_C] rval[myLock] held_by at[l_1] rval[myLock] at[l_0] at[l_1] rval[myLock] blocked l_0: while (true) { l_1: synchronized(myLock) { l_C:// critical section l_2: } l_3: }

44 Concrete Configuration at[l_C] rval[myLock] held_by at[l_1] rval[myLock] at[l_0] at[l_1] rval[myLock] blocked

45 Abstract Configuration at[l_C] rval[myLock] held_by blocked at[l_1] rval[myLock] at[l_0]

46 Examples Verified ProgramProperty twoLock QNo interference No memory leaks Partial correctness Producer/consumerNo interference No memory leaks Apprentice Challenge Counter increasing Dining philosophers with resource ordering Absence of deadlock MutexMutual exclusion Web ServerNo interference

47 Compile-Time GC for Java (Ran Shaham, SAS ’ 03) The compiler can issue free when objects are no longer needed Analysis of Java/JavaCard programs Generic Java Bytecode Frontend Precise analysis but expensive (cost of procedures)

48 More Scalable Solution [Gilad Arnold] Combine backward and forward analysis  Forward analysis determine the shape  Backward analysis locates heap liveness  Use 

49 Lightweight Specification  "correct usage" rules a client must follow  "call open() before read()" Certification does the client program satisfy the lightweight specification? Verification of Safety Properties Component a library with cleanly encapsulated state Client a program that uses the library The Canvas Project (IBM Watson and Tel Aviv) (Component Annotation, Verification and Stuff)

50 E. Yahav School of Computer Science Tel-Aviv University Verifying Safety Properties using Separation and Heterogeneous Abstractions PLDI ’ 04 G. Ramalingam IBM T.J. Watson Research Center

51 Quick Overview: Fine Grained Heap Abstraction Heap H Precise...... but often too expensive

52 Separation & Heterogeneous Abstraction Fine-grained abstraction Coarse abstraction

53 Outline of Separation  Decompose verification problem into a set of subproblems  Adapt abstraction to each subproblem

54 Outline of Separation  Decompose verification problem into a set of subproblems Analysis-user specifies a separation strategy  Adapt abstraction to each subproblem  Heterogeneous abstraction

55 Prototype Implementation Implemented over TVLA Correctness comes from the embedding theorem Applied to several example programs  Up to 5000 lines of Java Used to verify  Absence of concurrent modification exception (CME)  JDBC API conformance  IOStreams API conformance Improved performance In some cases improved precision

56

57

58 Ongoing Work Temporal properties [Yahav, ESOP’03] Assume guarantee reasoning [Yorsh, VMCAI’04, TACAS’04] Correctness of collection implementation [Livshits] Automatic refinement [Loginov, ESOP’03] Integrating numeric domains [Gopan, TACAS’04] Procedure summaries [Jeannet, SAS’04] Sclalablity [Manevich SAS’04, Lev-Ami] Heap modularity [Bauer & Rinetzky]

59 TVLA Design Mistakes The operational semantics is written in too low level language TVP can be a high level language “instrumentation” = “derived” “Consistency Rules” = “Integrity Rules” Focus = Partial Concretization TVLA  3VLA

60 Conclusion FO TC is expressive for defining semantics TVLA is a very useful research tool  Try before you publish  Easy to try new algorithms Sometimes faster than existing shape analyzers But has high interpretation overhead  Currently improved

61 Summary

62 http://www.cs.tau.ac.il/~tvla


Download ppt "T. Lev-Ami, R. Manevich, M. Sagiv TVLA: A System for Generating Abstract Interpreters A. Loginov, G. Ramalingam, E. Yahav."

Similar presentations


Ads by Google