Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Abstract Interpretation Part II
Predicate Abstraction and Canonical Abstraction for Singly - linked Lists Roman Manevich Mooly Sagiv Tel Aviv University Eran Yahav G. Ramalingam IBM T.J.
Shape Analysis by Graph Decomposition R. Manevich M. Sagiv Tel Aviv University G. Ramalingam MSR India J. Berdine B. Cook MSR Cambridge.
Automatic Memory Management Noam Rinetzky Schreiber 123A /seminar/seminar1415a.html.
Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv.
3-Valued Logic Analyzer (TVP) Tal Lev-Ami and Mooly Sagiv.
1 A Logic of Reachable Patterns in Linked Data-Structures Greta Yorsh joint work with Alexander Rabinovich, Mooly Sagiv Tel Aviv University Antoine Meyer,
1 Lecture 07 – Shape Analysis Eran Yahav. Previously  LFP computation and join-over-all-paths  Inter-procedural analysis  call-string approach  functional.
1 Lecture 08(a) – Shape Analysis – continued Lecture 08(b) – Typestate Verification Lecture 08(c) – Predicate Abstraction Eran Yahav.
Compile-Time Verification of Properties of Heap Intensive Programs Mooly Sagiv Thomas Reps Reinhard Wilhelm
Local Heap Shape Analysis Noam Rinetzky Tel Aviv University Joint work with Jörg Bauer Universität des Saarlandes Thomas Reps University of Wisconsin Mooly.
Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University
Counterexample-Guided Focus TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAA A A A AA A A Thomas Wies Institute of.
Aliases in a bug finding tool Benjamin Chelf Seth Hallem June 5 th, 2002.
Establishing Local Temporal Heap Safety Properties with Applications to Compile-Time Memory Management Ran Shaham Eran Yahav Elliot Kolodner Mooly Sagiv.
Program analysis Mooly Sagiv html://
Finite Differencing of Logical Formulas for Static Analysis Thomas Reps University of Wisconsin Joint work with M. Sagiv and A. Loginov.
1 Motivation Dynamically allocated storage and pointers are an essential programming tools –Object oriented –Modularity –Data structure But –Error prone.
Model Checking of Concurrent Software: Current Projects Thomas Reps University of Wisconsin.
1 Eran Yahav and Mooly Sagiv School of Computer Science Tel-Aviv University Verifying Safety Properties.
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
Modular Shape Analysis for Dynamically Encapsulated Programs Noam Rinetzky Tel Aviv University Arnd Poetzsch-HeffterUniversität Kaiserlauten Ganesan RamalingamMicrosoft.
Overview of program analysis Mooly Sagiv html://
Detecting Memory Errors using Compile Time Techniques Nurit Dor Mooly Sagiv Tel-Aviv University.
Modular Shape Analysis for Dynamically Encapsulated Programs Noam Rinetzky Tel Aviv University Arnd Poetzsch-HeffterUniversität Kaiserlauten Ganesan RamalingamMicrosoft.
1 Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University Shape analysis with applications Chapter 4.6
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
A Semantics for Procedure Local Heaps and its Abstractions Noam Rinetzky Tel Aviv University Jörg Bauer Universität des Saarlandes Thomas Reps University.
Overview of program analysis Mooly Sagiv html://
Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
1 Tentative Schedule u Today: Theory of abstract interpretation u May 5 Procedures u May 15, Orna Grumberg u May 12 Yom Hatzamaut u May.
1/25 Pointer Logic Changki PSWLAB Pointer Logic Daniel Kroening and Ofer Strichman Decision Procedure.
Dagstuhl Seminar "Applied Deductive Verification" November Symbolically Computing Most-Precise Abstract Operations for Shape.
Program Analysis and Verification Noam Rinetzky Lecture 10: Shape Analysis 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
June 27, 2002 HornstrupCentret1 Using Compile-time Techniques to Generate and Visualize Invariants for Algorithm Explanation Thursday, 27 June :00-13:30.
T. Lev-Ami, R. Manevich, M. Sagiv TVLA: A System for Generating Abstract Interpreters A. Loginov, G. Ramalingam, E. Yahav.
TVLA: A system for inferring Quantified Invariants Tal Lev-Ami Tom Reps Mooly Sagiv Reinhard Wilhelm Greta Yorsh.
1 Employing decision procedures for automatic analysis and verification of heap-manipulating programs Greta Yorsh under the supervision of Mooly Sagiv.
Symbolic Implementation of the Best Transformer Thomas Reps University of Wisconsin Joint work with M. Sagiv and G. Yorsh (Tel-Aviv) [TR-1468, Comp. Sci.
Shape Analysis Overview presented by Greta Yorsh.
Shape Analysis via 3-Valued Logic Mooly Sagiv Thomas Reps Reinhard Wilhelm
Symbolically Computing Most-Precise Abstract Operations for Shape Analysis Greta Yorsh Thomas Reps Mooly Sagiv Tel Aviv University University of Wisconsin.
Symbolic Execution with Abstract Subsumption Checking Saswat Anand College of Computing, Georgia Institute of Technology Corina Păsăreanu QSS, NASA Ames.
1 Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University Shape analysis with applications Chapter 4.6
Pointer Analysis for Multithreaded Programs Radu Rugina and Martin Rinard M I T Laboratory for Computer Science.
Data Structures and Algorithms for Efficient Shape Analysis by Roman Manevich Prepared under the supervision of Dr. Shmuel (Mooly) Sagiv.
1 Program Analysis via 3-Valued Logic Mooly Sagiv, Tal Lev-Ami, Roman Manevich Tel Aviv University Thomas Reps, University of Wisconsin, Madison Reinhard.
Quantified Data Automata on Skinny Trees: an Abstract Domain for Lists Pranav Garg 1, P. Madhusudan 1 and Gennaro Parlato 2 1 University of Illinois at.
Program Analysis via 3-Valued Logic Thomas Reps University of Wisconsin Joint work with Mooly Sagiv and Reinhard Wilhelm.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Roman Manevich Ben-Gurion University Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 16: Shape Analysis.
Shape & Alias Analyses Jaehwang Kim and Jaeho Shin Programming Research Laboratory Seoul National University
1 Simulating Reachability using First-Order Logic with Applications to Verification of Linked Data Structures Tal Lev-Ami 1, Neil Immerman 2, Tom Reps.
Interprocedural shape analysis for cutpoint-free programs Noam Rinetzky Tel Aviv University Joint work with Mooly Sagiv Tel Aviv University Eran Yahav.
Putting Static Analysis to Work for Verification A Case Study Tal Lev-Ami Thomas Reps Mooly Sagiv Reinhard Wilhelm.
Interprocedural shape analysis for cutpoint-free programs
Partially Disjunctive Heap Abstraction
Compactly Representing First-Order Structures for Static Analysis
Spring 2016 Program Analysis and Verification
Program Analysis and Verification
Seminar in automatic tools for analyzing programs with dynamic memory
Compile-Time Verification of Properties of Heap Intensive Programs
Symbolic Implementation of the Best Transformer
Parametric Shape Analysis via 3-Valued Logic
Parametric Shape Analysis via 3-Valued Logic
Program Analysis and Verification
Symbolic Characterization of Heap Abstractions
A Semantics for Procedure Local Heaps and its Abstractions
Presentation transcript:

Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)

... and also IBM Research J. Field D. Goyal H. Kolodner G. Ramalingam M. Rodeh Clarkson W. Hesse Technical Univ. of Denmark H.R. Nielson F. Nielson IRISA B. Jeannet Weizmann Institute A. Pnueli MIT V. Kuncak M. Rinard University of Wisconsin F. DiMaio A. Loginov D. Gopan D. Melski A. Lal A. Mulhern Tel Aviv University G. Arnold N. Rinetzky N. Dor R. Shaham G. Erez A. Warshavsky T. Lev-Ami E. Yahav R. Manevich G. Yorsh A.Rabinovich Universität des Saarlandes J. Bauer R. Biber University of Massachusetts N. Immerman

In memoriam, Frank Anger (Deceased July 7, 2004)

The administrator of the U.S.S. Yorktown’s Standard Monitoring Control System entered 0 into a data field for the Remote Data Base Manager program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units. The Yorktown was dead in the water for about two hours and 45 minutes.

A sailor on the U.S.S. Yorktown entered a 0 into a data field in a kitchen-inventory program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units. The Yorktown was dead in the water for about two hours and 45 minutes. Analysis must track numeric information Full Employment for Verification Experts

x = 3; y = 1/(x-3); x = 3; px = &x; y = 1/(*px-3); x = 3; p = (int*)malloc(sizeof int); *p = x; q = p; y = 1/(*q-3); need to track values other than 0 need to track pointers need to track heap-allocated storage

Flow-Sensitive Points-To Analysis a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e 33 22 11 44 a = &e 55 c = &f *b = c b = ab = a d = *a p = &q; p = q; p = *q; *p = q; pq p r1r1 r2r2 q r1r1 r2r2 q s1s1 s2s2 s3s3 p p s1s1 s2s2 q r1r1 r2r2 pq p r1r1 r2r2 q r1r1 r2r2 q s1s1 s2s2 s3s3 p p s1s1 s2s2 q r1r1 r2r2

Flow-Insensitive Points-To Analysis [Andersen 94, Shapiro & Horwitz 97] p = &q; p = q; p = *q; *p = q; pq p r1r1 r2r2 q r1r1 r2r2 q s1s1 s2s2 s3s3 p p s1s1 s2s2 q r1r1 r2r2 a = &e; b = a; c = &f; *b = c; d = *a; a d b c f e

Flow-Insensitive Points-To Analysis 33 22 11 44 a = &e 55 c = &f *b = c b = ab = a d = *a 33 22 11 44 a = &e 55 c = &f *b = c b = ab = a d = *a 33 22 11 44 55 a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e

What About Malloc? Each malloc site  malloc-site variable r p int *p, *q, *r; a: p = (int*)malloc(sizeof(int)); b: q = (int*)malloc(sizeof(int)); r = p; q malloc$b malloc$a

What About Malloc? Each malloc site  malloc-site variable /* Create a List of length n */ List head; List *p = &head; for (int i = 0; i < n; i++) { *p = (List)malloc(sizeof(List*)); p = &((*p)  next); } typedef struct list_cell { int val; struct list_cell *next; } *List; p head

Malloc Sites as “Variables” Concrete: Assignments through pointers Abstract: Accumulate edges (“weak update”) Not very accurate q r head s

Shape Analysis [Jones and Muchnick 1981] Characterize dynamically allocated data –Identify may-alias relationships –x points to an acyclic list, cyclic list, tree, dag, … –“disjointedness” properties x and y point to structures that do not share cells –show that data-structure invariants hold Account for destructive updates through pointers

pointer analysis? points-to analysis? alias analysis? shape analysis? Dynamic storage allocation Destructive updating through pointers

Applications: Software Tools Static detection of memory errors –dereferencing NULL pointers –dereferencing dangling pointers –memory leaks Static detection of logical errors –Is a data-structure invariant restored?

Applications: Code Optimization Parallelization –Operate in parallel on disjoint structures Software prefetching “Compile-time garbage collection” –Insert storage-reclamation operations Eliminate or move “checking code”

Why is Shape Analysis Difficult? Destructive updating through pointers –p  next = q –Produces complicated aliasing relationships Dynamic storage allocation –No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations –Want to verify that data-structure invariants are re-established

Outline Background on pointer analysis Informal introduction to shape analysis Shape analysis via 3-valued logic A lot more than just shape analysis! Extensions, applications Relationships with model checking Wrapup

Good Abstractions?

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL Materialization

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x  next; y  next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL

Idea for a List Abstraction represents x y t NULL x y t x y t x y t

x yt x y t x y t return y t = y y  next = t y = x x = x  next x != NULL x yt NULL x yt x y t x y t x y t x y t x y t x y t x y t x y t x y t x y t

Properties of reverse(x) On entry, x points to an acyclic list On each iteration, x & y point to disjoint acyclic lists All the pointer dereferences are safe No memory leaks On exit, y points to an acyclic list On exit, x = = NULL All cells reachable from y on exit were reachable from x on entry, and vice versa On exit, the order between neighbors in the y-list is opposite to their order in the x-list on entry

A ‘Yacc’ for Shape Analysis: TVLA Parametric framework –Some instantiations  known analyses –Other instantiations  new analyses Applications beyond shape analysis –Partial correctness of sorting algorithms –Safety of mobile code –Deadlock detection in multi-threaded programs –Partial correctness of mark-and-sweep gc alg. –Correct usage of Java iterators

A ‘Yacc’ for Static Analysis: TVLA Parametric framework –Some instantiations  known analyses –Other instantiations  new analyses Applications beyond shape analysis –Partial correctness of sorting algorithms –Safety of mobile code –Deadlock detection in multi-threaded programs –Partial correctness of mark-and-sweep gc alg. –Correct usage of Java iterators

Formalizing “... ” Informal: x Formal: x Summary node

The French Recipe for Program Verification [Cousot & Cousot] Concrete operational semantics –  st  :    Collecting semantics –  st  c : 2   2  Abstract semantics –  : 2   A,  : A  2 ,  (  (a)) = a,  (  (C))  C –  st  # : A  A –Upper approximation Sound results But may produce false alarms Canonical Abstraction A family of abstractions for use in logic

Using Relations to Represent Linked Lists

u1u1 u2u2 u3u3 u4u4 x y

Formulas: Queries for Observing Properties Are x and y pointer aliases?  v: x(v)  y(v)

x y u1u1 u2u2 u3u3 u4u4 Are x and y Pointer Aliases?  v: x(v)  y(v) x y u1u1  1 Yes

Predicate-Update Formulas for “y = x” x’(v) = x(v) y’(v) = x(v) t’(v) = t(v) n’(v 1,v 2 ) = n(v 1,v 2 )

x u1u1 u2u2 u3u3 u4u4 y’(v) = x(v) y Predicate-Update Formulas for “y = x”

Predicate-Update Formulas for “x = x  n” x’(v) =  v 1 : x(v 1 )  n(v 1,v) y’(v) = y(v) t’(v) = t(v) n’(v 1, v 2 ) = n(v 1, v 2 )

x u1u1 u2u2 u3u3 u4u4 y x’(v) =  v 1 : x(v 1 )  n(v 1,v)   x Predicate-Update Formulas for “x = x  n”

Predicate-Update Formulas for “y  n = t” x’(v) = x(v) y’(v) = y(v) t’(v) = t(v) n’(v 1,v 2 ) =  y(v 1 )  n(v 1,v 2 )  y(v 1 )  t(v 2 )

Why is Shape Analysis Difficult? Destructive updating through pointers –p  next = q –Produces complicated aliasing relationships Dynamic storage allocation –No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations –Need to verify that data-structure invariants are re-established

Two- vs. Three-Valued Logic 01 Two-valued logic {0,1} {0}{1} Three-valued logic {0}  {0,1} {1}  {0,1}

Two- vs. Three-Valued Logic Two-valued logicThree-valued logic

Two- vs. Three-Valued Logic 01 Two-valued logic {0}{1} Three-valued logic {0,1}

Two- vs. Three-Valued Logic 01 Two-valued logic ½ 01 Three-valued logic 0  ½ 1  ½

Boolean Connectives [Kleene]

Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234        

Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234

Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234   

Canonical Abstraction Partition the individuals into equivalence classes based on the values of their unary predicates Collapse other predicates via 

Canonical Abstraction vs. Predicate Abstraction u1u1 u2u2 u3u3 u4u4 x  v: x(v) 1  v: y(v) 0  v: x(v)   v:  x(v) 1  v 1,v 2,v: n(v 1,v)  n(v 2,v)  (v 1  v 2 ) 0

Abstract Value Obtained via Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234 Abstract Value Obtained via Predicate Abstraction u1u1 u2u2 u3u3 u4u4 x  1,0,1,0 

Abstract Value Obtained via Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234  1,0,1,0  The predicate-abstraction abstract domain is a special case of the canonical- abstraction abstract domain: –map everything to a single summary individual –retain only the nullary predicates

 1,0,1,0 

Property-Extraction Principle Questions about a family of two-valued stores can be answered conservatively by evaluating a formula in a three-valued store Formula evaluates to 1  formula holds in every store in the family Formula evaluates to 0  formula does not hold in any store in the family Formula evaluates to 1/2  formula may hold in some; not hold in others  

Are x and y Pointer Aliases? u1u1 u x y  v: x(v)  y(v)    Yes 1

Is Cell u Heap-Shared?  v 1,v 2 : n(v 1,u)  n(v 2,u)  v 1  v 2 u Yes 1  

Maybe Is Cell u Heap-Shared?  v 1,v 2 : n(v 1,u)  n(v 2,u)  v 1  v 2 u1u1 u x y 1/2   1

The Embedding Theorem y x u1u1 u 34 u2u2 y x u1u1 u 234 y x u1u1 u3u3 u2u2 u4u4 x y u 1234  v: x(v)  y(v) Maybe No

The Embedding Theorem If a structure B can be embedded in a structure S by an onto function f, such that basic predicates are preserved, i.e., p B (u 1,.., u k )  p S (f(u 1 ),..., f(u k )) Then every formula  is preserved: –If    = 1 in S, then    = 1 in B –If    = 0 in S, then    = 0 in B –If    = 1/2 in S, then    could be 0 or 1 in B

Embedding u1u1 u2u2 u3u3 u4u4 x u5u5 u6u6 u 12 u 34 u 56 x u 123 u 456 x

Canonical Abstraction: An Embedding Whose Result is of Bounded Size u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234

Predicate-Update Formulas for “y = x” y ’ (v) = x(v) Old: u1u1 u x y New: u1u1 u x 

Predicate-Update Formulas for “x = x  n” x ’ (v) =  v 1 : x(v 1 )  n(v 1,v) y Old: u1u1 u x y New: u1u1 u x  

x yt NULL return y t = y y  next = t y = x x = x  next x != NULL x yt NULL x yt u1u1 u x y u1u1 u x y ’ (v) = x(v) 1010

x yt NULL x y t x y t return y t = y y  next = t y = x x = x  next x != NULL x yt NULL x yt x y t x y t x y t x y t x y t x y t x y t x y t x y t x y t

x’ (v) =  v 1 : x(v 1 )  n(v 1,v) Naïve Transformer (x = x  n) x y Evaluate update formulas y x  

Cyclic versus Acyclic Lists  x u1u1 u x u1u1 u x

How Are We Doing? Conservative Convenient But not very precise  –Advancing a pointer down a list loses precision –Cannot distinguish an acyclic list from a cyclic list

The Instrumentation Principle Increase precision by storing the truth-value of some chosen formulas

Is Cell u Heap-Shared?  v 1,v 2 : n(v 1,u)  n(v 2,u)  v 1  v 2 u

is = 0 Example: Heap Sharing  x is(v) =  v 1,v 2 : n(v 1,v)  n(v 2,v)  v 1  v 2 u1u1 u x u1u1 u x is = 0

Example: Heap Sharing  x is(v) =  v 1,v 2 : n(v 1,v)  n(v 2,v)  v 1  v 2 is = 0 is = 1 u1u1 u x u1u1 u x is = 0 is = 1

Example: Cyclicity  x c(v) =  v 1 : n(v,v 1 )  n * (v 1,v) c = 0 is = 0c = 1 u1u1 u x u1u1 u x c = 0 c = 1

Is Cell u Heap-Shared?  v 1,v 2 : n(v 1,u)  n(v 2,u)  v 1  v 2 u1u1 u x y is = 0 No! 1/2   1 Maybe

Formalizing “... ” Informal: x y Formal: x y

Formalizing “... ” Informal: x y t2t2 t1t1 Formal: x y t2t2 t1t1

Formalizing “... ” Informal: x y Formal: x y reachable from variable x reachable from variable y r[x]r[x] r[y]r[y] r[x]r[x] r[y]r[y]

Formalizing “... ” Informal: x y t2t2 t1t1 Formal: t2t2 t1t1 r[x],r[t 1 ] r[y],r[t 2 ] r[x],r[t 1 ] r[y],r[t 2 ] x y r[y]r[y] r[x]r[x]r[x]r[x] r[y]r[y]

Updating Auxiliary Information t1t1 {r[t 1 ],r[x],r[y]}{p[t 1 ],r[x],r[y]} x {p[x],p[y]} {r[x],r[y]} y t1t1 {r[t 1 ],r[x]}{p[t 1 ],r[x]} x {p[x]} {r[x]} y = NULL

Automatic Generation of Update Formulas for Instrumentation Predicates Where do we get the predicate-update formulas to update the extra predicates?

Automatic Generation of Update Formulas for Instrumentation Predicates Originally, user provided –Update formulas for core predicates:  c,st (v) –Definitions of instrumentation predicates:  p (v) –Update formulas for instrumentation predicates: p,st (v) Now: p,st created from  p and the  c,st dddd Consistently defined?

doubly-linked(v) reachable-from-variable-x(v) acyclic-along-dimension-d(v) tree(v) dag(v) AVL trees: –balanced(v), left-heavy(v), right-heavy(v) –... but not via height arithmetic Useful Instrumentation Predicates Need FO + TC

Materialization x = x  n Informal: x y y x x = x  n Formal: x y x y x y x = x  n y x

Naïve Transformer (x = x  n) x y Evaluate update formulas y x

Best Transformer (x = x  n) x y y x y x  y x y x  Evaluate update formulas y x y x......

“Focus”-Based Transformer (x = x  n) x y y x y x Focus(x  n) “Partial  ” y x y x Evaluate update formulas y x y x 

Reps, T., Sagiv, M., and Yorsh, G., Symbolic implementation of the best transformer, VMCAI Yorsh, G., Reps, T., and Sagiv, M., Symbolically computing most-precise abstract operations for shape analysis, TACAS Immerman, N, Rabinovitch, A., Reps, T., Sagiv, M., and Yorsh, G., Verification via structure simulation, CAV Immerman, N, Rabinovitch, A., Reps, T., Sagiv, M., and Yorsh, G., The boundary between decidability and undecidability for transitive-closure logics, CSL Best Transformers via Decision Procedures

Outline Background on pointer analysis Informal introduction to shape analysis Shape analysis via 3-valued logic A lot more than just shape analysis! Extensions, applications Relationships with model checking Wrapup

Threads and Concurrency A memory configuration: thread3 inCritical lock1 isAcquired thread1 atStart thread2 atStart thread4 atStart csLock heldBy

An abstract memory configuration: thread inCritical lock1 isAcquired thread ’ atStart csLock heldBy Threads and Concurrency

Static analysis using 3-valued logic provides a way to explore the (abstract) memory configurations that can arise. Threads and Concurrency

Shape + Numeric Abstractions x u1u1 u2u2 x u u’u’’u’’’ (1,2,3,4) u u’ u’’ u’’’ Position in list (1,2) {1} x [2,4] u1u1 u2u2 (1,4)

Shape + Numeric Abstractions y = x  next (1,2) {1} x [2,4] x u1u1 u2u2 u1u1 u2u2 (1,4) u1u1 u 2.1 u 2.0 x y (1,2,3) {1} x {2} x [3,4] u1u1 u 2.1 u 2.0 (1,2,4)

Example: Sortedness inOrder(v) =  v 1 : n(v, v 1 )  (  v    v 1  )  x inOrder = 1       xx   Yes sorted =  v: inOrder(v)

Example: Sortedness  x inOrder(v) =  v 1 : n(v, v 1 )  (  v    v 1  )       inOrder = 1inOrder = 0inOrder = 1 sorted =  v: inOrder(v) No      inOrder = 0inOrder = 1 xx

TVLA System Input (FO+TC) –Concrete operational semantics –Definitions of instrumentation predicates –Abstraction predicates –Program (as transition system) –Definitions of error conditions Output –Warnings (possible errors) –The 3-valued structures at every node New version available soon

Model Checking vs. TVLA Determine properties of a transition system State-space exploration State labels: 1 st -order structures 3-valued structures represent commonalities Properties checked: Formulas in FO+TC Determine properties of a transition system State-space exploration State labels: Propositions BDDs represent commonalities Properties checked: Formulas in temporal logic TVLAModel checking

RedGreen Yellow RedGreenYellow Red Go Clarke, Grumberg, Jha, Lu, Veith, CAV 00

RedGreen Yellow RedGreenYellow Red = 0 RedRed = 0 Red Go Clarke, Grumberg, Jha, Lu, Veith, CAV 00 Sagiv, Reps, Wilhelm, POPL 99 CTL *  ACTL * FO + TC  FO + TC   2   3

Why is Shape Analysis Difficult? Destructive updating through pointers –p  next = q –Produces complicated aliasing relationships –Track aliasing using 3-valued structures Dynamic storage allocation –No bound on the size of run-time data structures –Canonical abstraction  finite-sized 3-valued structures Data-structure invariants typically only hold at the beginning and end of operations –Need to verify that data-structure invariants are re- established –Query the 3-valued structures that arise at the exit

What to Take Away A ‘yacc’ for static analysis based on logic Broad scope of potential applicability –Not just linkage properties: predicates are not restricted to be links! –Discrete systems in which a relational (+ numeric) structure evolves –Transition: evolution of relational + numeric state

Canonical Abstraction A family of abstractions for use in logic

Questions?