TVLA: A System for Generating Abstract Interpreters. T. Lev-Ami, R. Manevich, M. Sagiv, A. Loginov, G. Ramalingam, E. Yahav.

Presentation transcript:

TVLA: A System for Generating Abstract Interpreters
T. Lev-Ami, R. Manevich, M. Sagiv, A. Loginov, G. Ramalingam, E. Yahav, T. Reps & R. Wilhelm

Object Oriented Programs
Extensive use of the heap
Dynamic allocation
Recursive data structures
Destructive reference updates: x.f := y

Verification of OO Programs
Verify properties about the heap
No runtime errors
– Null dereferences
– Freed memory
– Dangling pointers
No memory leaks
Partial correctness

A Common Approach to Pointer Analysis in Software Verification
Break the verification problem into two phases
– Preprocessing phase: a separate points-to analysis, typically with no distinction between objects allocated at the same site
– Verification phase: verification using the points-to results
May lose precision
– May lose the ability to perform strong updates
– May produce false alarms
[Diagram: Program, Property, Pointer Analysis, Verification Analysis]

Loss of Precision in Two-Phase Approach
    f = new InputStream();
    f.read();
    f.close();
Verify f is not read after it is closed
Straightforward...

Loss of Precision in Two-Phase Approach
    while (?) {
        f = new InputStream();
        f.read();
        f.close();
    }
[Figure: f and the abstract object f1, with closed=1/2]
False alarm! "read may be erroneous"

The TVLA Approach
Express concrete semantics using first-order logic
Abstract interpretation using Kleene's 3-valued logic
Verification integrated with heap analysis
– Heap analysis (merging criterion) can adapt to verification
[Diagram: Program, Property, Front end, First-Order Transition System, Abstract Interpretation]

The TVLA Approach: An Example
    while (?) {
        f = new InputStream();
        f.read();
        f.close();
    }
[Figure: Concrete States and Abstract States, each showing f and the closed objects]

Outline
Expressing concrete semantics using logic (TVLA input)
Canonical abstraction
Abstract interpretation using canonical abstraction (TVLA engine)
Applications & ongoing work

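Traditional Heap Interpretation
States = Two-level stores
– env: Var $\to$ Values
– fields: Loc $\to$ Values
– Values = Loc $\cup$ Atoms
Example
– env = [x $\mapsto$ 30, p $\mapsto$ 79]
– next = [30 $\mapsto$ 40, 40 $\mapsto$ 50, 50 $\mapsto$ 79, 79 $\mapsto$ 90]
– val = [30 $\mapsto$ 1, 40 $\mapsto$ 2, 50 $\mapsto$ 3, 79 $\mapsto$ 4, 90 $\mapsto$ 5]
[Figure: the linked list, with x and p pointing into it]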

Predicate Logic
Vocabulary
– A finite set of predicate symbols P, each with a fixed arity
Logical structures S provide meaning for predicates
– A set of individuals (nodes) U
– $p^S : (U^S)^k \to \{0, 1\}$
$FO^{TC}$ over TC, $\forall$, $\exists$, $\neg$, $\wedge$, $\vee$ expresses logical structure properties

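Representing Stores as Logical Structures
Locations $\approx$ Individuals
Program variables $\approx$ Unary predicates
Fields $\approx$ Binary predicates
Example
– $U = \{u_1, u_2, u_3, u_4, u_5\}$
– $x = \{u_1\}$, $p = \{u_3\}$
– $next = \{\langle u_1, u_2\rangle, \langle u_2, u_3\rangle, \langle u_3, u_4\rangle, \langle u_4, u_5\rangle\}$
[Figure: nodes u1 to u5 linked by n edges, with x at u1 and p at u3]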
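
The encoding on this slide can be sketched in C as follows. This is only an illustration, not TVLA's actual data structure: the `Structure` type, the `MAX_NODES` bound, and the helpers `make_chain` and `is_successor_of_x` are hypothetical names introduced here. Individuals are array indices, program variables are boolean arrays (unary predicates), and the `next` field is a boolean matrix (binary predicate).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8  /* illustrative bound, not taken from the slides */

/* Hypothetical 2-valued logical structure for a list store. */
typedef struct {
    size_t n;                         /* |U|: individuals are 0..n-1 */
    bool x[MAX_NODES];                /* unary predicate for variable x */
    bool p[MAX_NODES];                /* unary predicate for variable p */
    bool next[MAX_NODES][MAX_NODES];  /* binary predicate for field next */
} Structure;

/* Build a chain u1 -> u2 -> ... -> un with x and p at the given indices. */
static Structure make_chain(size_t n, size_t x_at, size_t p_at) {
    Structure s = {0};
    assert(n <= MAX_NODES && x_at < n && p_at < n);
    s.n = n;
    s.x[x_at] = true;
    s.p[p_at] = true;
    for (size_t i = 0; i + 1 < n; i++)
        s.next[i][i + 1] = true;
    return s;
}

/* Evaluate the formula "exists v: x(v) and next(v, w)" on the structure. */
static bool is_successor_of_x(const Structure *s, size_t w) {
    for (size_t v = 0; v < s->n; v++)
        if (s->x[v] && s->next[v][w])
            return true;
    return false;
}
```

Calling `make_chain(5, 0, 2)` would reproduce the slide's example (index 0 standing for u1), and formulas over the vocabulary then become plain loops over the arrays.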

Concrete Interpretation Rules (Lists)
Statement | Safety Precondition | Postcondition
x = NULL | 1 | $\forall v: \neg x'(v)$
x = malloc() | $\exists v_0: \neg active(v_0)$ | $\forall v: x'(v) \Leftrightarrow eq(v, v_0)$ and $\forall v: active'(v) \Leftrightarrow active(v) \vee eq(v, v_0)$
x = y | 1 | $\forall v: x'(v) \Leftrightarrow y(v)$
x = y->next | $\exists v_0: y(v_0) \wedge active(v_0)$ | $\forall v: x'(v) \Leftrightarrow next(v_0, v)$
x->next = y | $\exists v_0: x(v_0) \wedge active(v_0)$ | $\forall v_1, v_2: next'(v_1, v_2) \Leftrightarrow (\neg eq(v_1, v_0) \wedge next(v_1, v_2)) \vee (eq(v_1, v_0) \wedge y(v_2))$

Invariants
No memory leaks
– $\forall v: active(v) \Rightarrow \bigvee_{x \in PVar} \exists v_1: x(v_1) \wedge next^*(v_1, v)$
Acyclic list(x)
– $\forall v_1, v_2: x(v_1) \wedge next^*(v_1, v_2) \Rightarrow \neg next^+(v_2, v_1)$
Reverse(x)
– $\forall v_1, v_2, v_3: x(v_1) \wedge next^*(v_1, v_2) \Rightarrow (next(v_2, v_3) \Leftrightarrow next'(v_3, v_2))$

Kleene 3-Valued Logic
1: True
0: False
1/2: Unknown
A join semi-lattice: $0 \sqcup 1 = 1/2$
Information order: $0 \sqsubseteq 1/2$ and $1 \sqsubseteq 1/2$
Logical order: $0 \le 1/2 \le 1$

Boolean Connectives [Kleene]
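
The connective tables for this slide did not survive extraction. As a minimal sketch, Kleene's connectives can be written in C by encoding 0, 1/2, 1 as the integers 0, 1, 2, so that conjunction is the minimum and disjunction the maximum in the logical order. The type `Kleene` and the function names are assumptions made for this illustration, not part of TVLA.

```c
#include <assert.h>

/* Encode 0, 1/2, 1 as 0, 1, 2 so the logical order is the integer order. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Conjunction: minimum in the logical order. */
static Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }

/* Disjunction: maximum in the logical order. */
static Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* Negation swaps 0 and 1 and leaves 1/2 unchanged. */
static Kleene k_not(Kleene a) {
    assert(a >= K_FALSE && a <= K_TRUE);
    return (Kleene)(K_TRUE - a);
}

/* Join in the information order: equal values stay, different ones become 1/2. */
static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }
```

With this encoding, `k_and(K_FALSE, K_HALF)` is definitely false while `k_and(K_TRUE, K_HALF)` stays unknown, which is what lets an analysis still draw definite conclusions from partially unknown structures.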

3-Valued Logical Structures
A set of individuals (nodes) U
Predicate meaning
– $p^S : (U^S)^k \to \{0, 1, 1/2\}$

Canonical Abstraction
Partition the individuals into equivalence classes based on the values of their unary predicates
– Every individual is mapped into its equivalence class
Collapse predicates via $\sqcup$
– $p^S(u'_1, \ldots, u'_k) = \bigsqcup \{ p^B(u_1, \ldots, u_k) \mid f(u_1) = u'_1, \ldots, f(u_k) = u'_k \}$
At most $2^{|A|}$ abstract individuals, where A is the set of unary abstraction predicates
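
The two steps on this slide can be sketched in C under simplifying assumptions: two unary predicates (standing for the variables x and t), one binary predicate `next`, and a fixed bound on the number of nodes. All type and function names are hypothetical and chosen for this illustration. The canonical name of a node is the bit vector of its unary predicate values, and each abstract `next` entry is the join of the concrete values it covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8                   /* illustrative bound */
#define NUM_UNARY 2                   /* abstraction predicates: x and t */
#define MAX_CLASSES (1 << NUM_UNARY)  /* at most 2^|A| abstract individuals */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct {
    size_t n;                              /* number of concrete nodes */
    bool unary[NUM_UNARY][MAX_NODES];      /* unary[i][u]: predicate i on u */
    bool next[MAX_NODES][MAX_NODES];
} Concrete;

typedef struct {
    bool used[MAX_CLASSES];                /* which canonical names occur */
    size_t size[MAX_CLASSES];              /* concrete members per class */
    Kleene next[MAX_CLASSES][MAX_CLASSES]; /* collapsed binary predicate */
} Abstract;

static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* Canonical name of u: bit i holds the value of unary predicate i on u. */
static size_t canonical_name(const Concrete *c, size_t u) {
    size_t name = 0;
    for (size_t i = 0; i < NUM_UNARY; i++)
        if (c->unary[i][u]) name |= (size_t)1 << i;
    return name;
}

static Abstract canonical_abstraction(const Concrete *c) {
    Abstract a = {0};
    bool seen[MAX_CLASSES][MAX_CLASSES] = {{false}};
    assert(c->n <= MAX_NODES);
    /* Step 1: map every individual to its equivalence class. */
    for (size_t u = 0; u < c->n; u++) {
        size_t cu = canonical_name(c, u);
        a.used[cu] = true;
        a.size[cu]++;
    }
    /* Step 2: collapse next by joining all concrete values per class pair. */
    for (size_t u = 0; u < c->n; u++) {
        for (size_t v = 0; v < c->n; v++) {
            size_t cu = canonical_name(c, u), cv = canonical_name(c, v);
            Kleene val = c->next[u][v] ? K_TRUE : K_FALSE;
            a.next[cu][cv] = seen[cu][cv] ? k_join(a.next[cu][cv], val) : val;
            seen[cu][cv] = true;
        }
    }
    return a;
}

/* A class with more than one member is a summary node. */
static bool is_summary(const Abstract *a, size_t name) {
    return a->size[name] > 1;
}
```

For a three-node list whose first node is pointed to by both x and t, this sketch yields two abstract nodes: the head, and a summary node for the other two cells with `next` values of 1/2 into and within it.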

Canonical Abstraction
    x = NULL;
    while (…) do {
        t = malloc();
        t->next = x;
        x = t
    }
[Figure: concrete list u1, u2, u3 linked by n edges, with x and t at u1; abstracted to u1 and the summary node u2,3]

Canonical Abstraction and Equality
Summary nodes may represent more than one element
(In)equality need not be preserved under abstraction
Explicitly record equality
Summary nodes are nodes with eq(u, u) = 1/2

Canonical Abstraction and Equality
    x = NULL;
    while (…) do {
        t = malloc();
        t->next = x;
        x = t
    }
[Figure: concrete list u1, u2, u3 with x and t at u1; abstracted to u1 and the summary node u2,3, which carries an eq self-loop]

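Heap Sharing predicate
$\forall v: is(v) \Leftrightarrow \exists v_1, v_2: next(v_1, v) \wedge next(v_2, v) \wedge \neg eq(v_1, v_2)$
[Figure: list u1, u2, …, un with x and t at u1 and is(v)=0 on every node; abstracted to u1 and the summary node u2..n, both with is(v)=0]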

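Heap Sharing predicate
$\forall v: is(v) \Leftrightarrow \exists v_1, v_2: next(v_1, v) \wedge next(v_2, v) \wedge \neg eq(v_1, v_2)$
[Figure: list u1, u2, …, un in which one node has is(v)=1 and the others is(v)=0; abstracted to u1 with is(v)=0, u2 with is(v)=1, and the summary node u3..n with is(v)=0]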
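
On a concrete (2-valued) structure the defining formula of is(v) can be evaluated directly. The sketch below assumes a hypothetical `Heap` type with a boolean `next` matrix; it counts the distinct nodes with a `next` edge into v, which is equivalent to asking for two different predecessors v1 and v2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8  /* illustrative bound */

/* Hypothetical concrete heap: next[u][v] holds iff u's next field points to v. */
typedef struct {
    size_t n;
    bool next[MAX_NODES][MAX_NODES];
} Heap;

/* is(v): there exist two distinct nodes v1 != v2 with next edges into v. */
static bool is_shared(const Heap *h, size_t v) {
    size_t preds = 0;
    assert(h->n <= MAX_NODES && v < h->n);
    for (size_t v1 = 0; v1 < h->n; v1++)
        if (h->next[v1][v])
            preds++;  /* each v1 is a different node, so 2 hits suffice */
    return preds >= 2;
}
```

Keeping such a predicate as an abstraction predicate is what lets the shared node stay separate from the summary node in the second figure.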

Best Conservative Interpretation (CC79)
[Diagram: Abstract Representation, via Concretization, to Concrete Representation; Collecting Interpretation $[\![st]\!]^c$ to Concrete Representation; via Abstraction, back to Abstract Representation. The resulting Abstract Interpretation is $[\![st]\!]^\#$]

"Focus"-Based Transformer (x = x->n)
[Figure: structures over x and y at each step: inverse embedding ($\gamma$), evaluate update formulas, canonical abstraction]

[Figure: structures over x and y at each step: Focus(x->n) ("partial $\gamma$"), evaluate update formulas (Kleene), canonical abstraction]

Example: Abstract Interpretation
[Figure: control-flow graph with nodes x = NULL; t = malloc(..); t->next = x; x = t; return x and T/F branch edges, each annotated with the abstract structures (over x, t, n) computed there, starting from the empty structure]

Summary

Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000)
Static analysis of C programs manipulating linked lists
Defects checked
– References to freed memory
– Null dereferences
– Memory leaks
Existing algorithms are inadequate
– Miss errors
– Report too many false alarms

Dereference of NULL pointers
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    bool search(int value, Elements *c) {
        Elements *elem;
        for (elem = c; c != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
potential null dereference

Dereference of NULL pointers
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    bool search(int value, Elements *c) {
        Elements *elem;
        for (elem = c; elem != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
✔ No null dereference

Memory leakage
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    Elements* reverse(Elements *c) {
        Elements *h, *g;
        h = NULL;
        while (c != NULL) {
            g = c->next;
            h = c;
            c->next = h;
            c = g;
        }
        return h;
    }
leakage of address pointed-by h

Memory leakage
    typedef struct element {
        int value;
        struct element *next;
    } Elements;

    Elements* reverse(Elements *c) {
        Elements *h, *g;
        h = NULL;
        while (c != NULL) {
            g = c->next;
            c->next = h;
            h = c;
            c = g;
        }
        return h;
    }
✔ No memory leaks

Proving Correctness of Sorting Implementations (Lev-Ami, Reps, Sagiv, Wilhelm, ISSTA 2000)
Partial correctness
– The elements are sorted
– The list is a permutation of the original list
Termination
– At every loop iteration the set of elements reachable from the head is decreased

Example: InsertSort
    typedef struct list_cell {
        int data;
        struct list_cell *n;
    } *List;

    List InsertSort(List x) {
        List r, pr, rn, l, pl;
        r = x; pr = NULL;
        while (r != NULL) {
            l = x; rn = r->n; pl = NULL;
            while (l != r) {
                if (l->data > r->data) {
                    pr->n = rn; r->n = l;
                    if (pl == NULL) x = r;
                    else pl->n = r;
                    r = pr;
                    break;
                }
                pl = l; l = l->n;
            }
            pr = r; r = rn;
        }
        return x;
    }
assert sorted[x]; assert permutation[x,x']
$\forall v: inOrder[dle, n](v) \Leftrightarrow \forall v_1: next(v, v_1) \Rightarrow dle(v, v_1)$

Example: InsertSort Run Demo
    typedef struct list_cell {
        int data;
        struct list_cell *n;
    } *List;

    List InsertSort(List x) {
        List r, pr, rn, l, pl;
        if (x == NULL) return NULL;
        pr = x; r = x->n;
        while (r != NULL) {
            pl = x; rn = r->n; l = x->n;
            while (l != r) {
                if (l->data > r->data) {
                    pr->n = rn; r->n = l; pl->n = r; r = pr;
                    break;
                }
                pl = l; l = l->n;
            }
            pr = r; r = rn;
        }
        return x;
    }

Example: Mark and Sweep
    void Mark(Node root) {
        if (root != NULL) {
            pending = ∅
            pending = pending ∪ {root}
            marked = ∅
            while (pending ≠ ∅) {
                x = SelectAndRemove(pending)
                marked = marked ∪ {x}
                t = x->left
                if (t ≠ NULL)
                    if (t ∉ marked) pending = pending ∪ {t}
                t = x->right
                if (t ≠ NULL)
                    if (t ∉ marked) pending = pending ∪ {t}
            }
        }
        assert(marked == Reachset(root))
    }

    void Sweep() {
        unexplored = Universe
        collected = ∅
        while (unexplored ≠ ∅) {
            x = SelectAndRemove(unexplored)
            if (x ∉ marked) collected = collected ∪ {x}
        }
        assert(collected == Universe – Reachset(root))
    }

Challenges: Heap & Concurrency [Yahav POPL '01]
Concurrency with the heap is evil…
Java threads are just heap-allocated objects
Data and control are strongly related
– Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue)
– Heap analysis requires information about thread scheduling
    Thread t1 = new Thread();
    Thread t2 = new Thread();
    …
    t = t1;
    …
    t.start();

Configurations – Example
    l_0: while (true) {
    l_1:   synchronized(myLock) {
    l_C:     // critical section
    l_2:   }
    l_3: }
[Figure: thread nodes labeled at[l_C], at[l_1] and at[l_0], with rval[myLock] edges to the lock, a held_by edge, and blocked edges]

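Concrete Configuration
[Figure: one thread at[l_C], two threads at[l_1] and one thread at[l_0]; rval[myLock] edges to the lock, with held_by and blocked edges]

Abstract Configuration
[Figure: thread nodes at[l_C], at[l_1] and at[l_0]; rval[myLock] edges to the lock, with held_by and blocked edges]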

Examples Verified
Program | Property
twoLock Q | No interference; No memory leaks; Partial correctness
Producer/consumer | No interference; No memory leaks
Apprentice Challenge | Counter increasing
Dining philosophers with resource ordering | Absence of deadlock
Mutex | Mutual exclusion
Web Server | No interference

Compile-Time GC for Java (Ran Shaham, SAS '03)
The compiler can issue free when objects are no longer needed
Analysis of Java/JavaCard programs
Generic Java bytecode frontend
Precise analysis but expensive (cost of procedures)

More Scalable Solution [Gilad Arnold]
Combine backward and forward analysis
– Forward analysis determines the shape
– Backward analysis locates heap liveness
– Use

Verification of Safety Properties
The Canvas Project (IBM Watson and Tel Aviv) (Component Annotation, Verification and Stuff)
Component: a library with cleanly encapsulated state
Client: a program that uses the library
Lightweight Specification
– "correct usage" rules a client must follow
– "call open() before read()"
Certification: does the client program satisfy the lightweight specification?

Verifying Safety Properties using Separation and Heterogeneous Abstractions (PLDI '04)
E. Yahav, School of Computer Science, Tel-Aviv University
G. Ramalingam, IBM T.J. Watson Research Center

Quick Overview: Fine Grained Heap Abstraction Heap H Precise but often too expensive

Separation & Heterogeneous Abstraction Fine-grained abstraction Coarse abstraction

Outline of Separation
Decompose verification problem into a set of subproblems
– Analysis-user specifies a separation strategy
Adapt abstraction to each subproblem
– Heterogeneous abstraction

Prototype Implementation
Implemented over TVLA
Correctness comes from the embedding theorem
Applied to several example programs
– Up to 5000 lines of Java
Used to verify
– Absence of concurrent modification exception (CME)
– JDBC API conformance
– IOStreams API conformance
Improved performance
In some cases improved precision

Ongoing Work
Temporal properties [Yahav, ESOP'03]
Assume-guarantee reasoning [Yorsh, VMCAI'04, TACAS'04]
Correctness of collection implementation [Livshits]
Automatic refinement [Loginov, ESOP'03]
Integrating numeric domains [Gopan, TACAS'04]
Procedure summaries [Jeannet, SAS'04]
Scalability [Manevich SAS'04, Lev-Ami]
Heap modularity [Bauer & Rinetzky]

TVLA Design Mistakes The operational semantics is written in too low level language TVP can be a high level language “instrumentation” = “derived” “Consistency Rules” = “Integrity Rules” Focus = Partial Concretization TVLA  3VLA

Conclusion
$FO^{TC}$ is expressive for defining semantics
TVLA is a very useful research tool
– Try before you publish
– Easy to try new algorithms
Sometimes faster than existing shape analyzers
But has high interpretation overhead
– Currently being improved

Summary