Program Analysis via 3-Valued Logic Thomas Reps University of Wisconsin Joint work with Mooly Sagiv and Reinhard Wilhelm.

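Example: In-Situ List Reversal

typedef struct list_cell {
    int val;
    struct list_cell *next;
} *List;

List reverse(List x) {
    List y, t;
    y = NULL;
    while (x != NULL) {
        t = y;
        y = x;
        x = x->next;
        y->next = t;
    }
    return y;
}

[Figure: animation of the loop on the list 1, 2, 3, NULL, tracking the pointer variables x, y, and t; later frames include a "Materialization" step.]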

Original Problem: Shape Analysis
Characterize dynamically allocated data
  – x points to an acyclic list, cyclic list, tree, DAG, etc.
  – data-structure invariants
Identify may-alias relationships
Establish “disjointedness” properties
  – x and y point to structures that do not share cells

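Formalizing “...”
[Figure: the informal drawing of a list pointed to by x, ending in “...”, next to its formal counterpart, in which the unbounded tail is a summary node.]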

Why is Shape Analysis Difficult?
Destructive updating through pointers
  – p->next = q
  – Produces complicated aliasing relationships
Dynamic storage allocation
  – No bound on the size of run-time data structures
Data-structure invariants typically only hold at the beginning and end of operations
  – Need to verify that data-structure invariants are re-established

Applications: Code Optimization
Machine-independent optimizations
  – constant propagation
  – loop-invariant code motion
  – common subexpression elimination
Machine-dependent optimizations
  – register allocation
  – parallelization
  – software prefetching
Insert storage-reclamation operations
Eliminate or move “checking code”

Applications: Software Tools
Static detection of memory errors (cleanness)
  – dereferencing NULL pointers
  – dereferencing dangling pointers
  – memory leaks
Static detection of logical errors
  – Is a shape invariant restored?
What is in the heap?
  – list? doubly-linked list? tree? DAG?
  – disjoint? intertwined?

Properties of reverse(x)
On entry: x points to an acyclic list
On exit: y points to an acyclic list
On exit: x == NULL
On each iteration, x and y point to disjoint acyclic lists
All the pointer dereferences are safe
No memory leaks

A ‘Yacc’ for Shape Analysis: TVLA
Parametric framework
  – Some instantiations ⇒ known analyses
  – Other instantiations ⇒ new analyses
Applications beyond shape analysis
  – Partial correctness of sorting algorithms
  – Safety of mobile code
  – Deadlock detection in multi-threaded programs
  – Partial correctness of a mark-and-sweep GC algorithm

A ‘Yacc’ for Static Analysis: TVLA
Parametric framework
  – Some instantiations ⇒ known analyses
  – Other instantiations ⇒ new analyses
Applications beyond shape analysis
  – Partial correctness of sorting algorithms
  – Safety of mobile code
  – Deadlock detection in multi-threaded programs
  – Partial correctness of a mark-and-sweep GC algorithm

A ‘Yacc’ for Static Analysis (Using Logic)
“I learned many things – and equally important – I unlearned many things.” (S.K. Allison)
Correctness proofs via inductive-assertion method
Proof derivation via weakest-precondition (WP) calculus
“Annotate your loops with invariants!”

A ‘Yacc’ for Static Analysis (Using Logic)
First-order structures (= predicate tables)
  – hold recorded information
  – model-theoretic approach, not proof-theoretic
Formulae
  – means for observing information
Predicate-update formulae
  – operational semantics
  – update recorded information

Recorded Information (for reverse)

[Figure: a store with four list cells u1, u2, u3, u4 and the pointer variables x and y.]

Formulae for Observing Properties
Are x and y pointer aliases?
  ∃v: x(v) ∧ y(v)
Does x point to a cell with a self cycle?
  ∃v: x(v) ∧ n(v,v)
Is cell v heap-shared?
  ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2
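The three queries can be sketched as loops over predicate tables. This is only an illustration, assuming a store of four cells encoded as boolean arrays; the type `Store` and the function names are hypothetical and do not come from the slides or from TVLA.

```c
#include <assert.h>
#include <stdbool.h>

#define N 4  /* number of heap cells in this toy store (an assumption) */

/* A 2-valued logical structure: one truth table per predicate.
   Type and function names are illustrative only. */
typedef struct {
    bool x[N];     /* x(v): pointer variable x points to cell v */
    bool y[N];     /* y(v): pointer variable y points to cell v */
    bool n[N][N];  /* n(v1,v2): the n field of cell v1 points to cell v2 */
} Store;

/* exists v: x(v) and y(v) */
bool aliased(const Store *s) {
    for (int v = 0; v < N; v++)
        if (s->x[v] && s->y[v]) return true;
    return false;
}

/* exists v: x(v) and n(v,v) */
bool x_has_self_cycle(const Store *s) {
    for (int v = 0; v < N; v++)
        if (s->x[v] && s->n[v][v]) return true;
    return false;
}

/* exists v1,v2: n(v1,v) and n(v2,v) and v1 != v2 */
bool heap_shared(const Store *s, int v) {
    assert(v >= 0 && v < N);
    for (int v1 = 0; v1 < N; v1++)
        for (int v2 = 0; v2 < N; v2++)
            if (v1 != v2 && s->n[v1][v] && s->n[v2][v]) return true;
    return false;
}
```

Each existential quantifier becomes a loop over all cells, which is why the same formulae can later be re-evaluated unchanged over other structures.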

Are x and y Pointer Aliases?
∃v: x(v) ∧ y(v)
[Figure: a store with cells u1, u2, u3, u4 and pointer variables x and y; the formula is satisfied at u1.]
Yes

Predicate-Update Formulae for ‘y = NULL’
x'(v) = x(v)
y'(v) = 0
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)

y'(v) = 0
[Figure: the store with cells u1, u2, u3, u4 before and after ‘y = NULL’; afterwards y points to no cell.]

Predicate-Update Formulae for ‘y = x’
x'(v) = x(v)
y'(v) = x(v)
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)

Predicate-Update Formulae for ‘y = x’
y'(v) = x(v)
[Figure: the store with cells u1, u2, u3, u4; after the update y points to the cell that x points to.]

Predicate-Update Formulae for ‘x = x->n’
x'(v) = ∃v1: x(v1) ∧ n(v1,v)
y'(v) = y(v)
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)

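Predicate-Update Formulae for ‘x = x->n’
x'(v) = ∃v1: x(v1) ∧ n(v1,v)
[Figure: the store with cells u1, u2, u3, u4; after the update x points to the n-successor of the cell it pointed to before, while y is unchanged.]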

Predicate-Update Formulae for ‘y->n = t’
x'(v) = x(v)
y'(v) = y(v)
t'(v) = t(v)
n'(v1,v2) = (¬y(v1) ∧ n(v1,v2)) ∨ (y(v1) ∧ t(v2))
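To see the update formulae act as an operational semantics, the loop of reverse() can be run purely on predicate tables. The sketch below is an assumption-laden illustration: the `Store` type, the three-cell size, and all function names are hypothetical, and the update for ‘t = y’ is written by analogy with the one given for ‘y = x’.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N 3  /* toy store with three list cells (an assumption) */

/* Illustrative 2-valued structure: one table per predicate. */
typedef struct {
    bool x[N], y[N], t[N];  /* pointer variables */
    bool n[N][N];           /* n(v1,v2): next field */
} Store;

/* y = NULL:  y'(v) = 0 */
static void assign_null_y(Store *s) { memset(s->y, 0, sizeof s->y); }

/* t = y:  t'(v) = y(v)   (by analogy with the y = x update) */
static void copy_y_to_t(Store *s) { memcpy(s->t, s->y, sizeof s->t); }

/* y = x:  y'(v) = x(v) */
static void copy_x_to_y(Store *s) { memcpy(s->y, s->x, sizeof s->y); }

/* x = x->n:  x'(v) = exists v1: x(v1) and n(v1,v) */
static void advance_x(Store *s) {
    bool nx[N] = {false};
    for (int v = 0; v < N; v++)
        for (int v1 = 0; v1 < N; v1++)
            if (s->x[v1] && s->n[v1][v]) nx[v] = true;
    memcpy(s->x, nx, sizeof nx);
}

/* y->n = t:  n'(v1,v2) = (!y(v1) and n(v1,v2)) or (y(v1) and t(v2)) */
static void store_t_in_y_next(Store *s) {
    for (int v1 = 0; v1 < N; v1++)
        for (int v2 = 0; v2 < N; v2++)
            s->n[v1][v2] = (!s->y[v1] && s->n[v1][v2]) || (s->y[v1] && s->t[v2]);
}

static bool any(const bool *p) {
    for (int v = 0; v < N; v++)
        if (p[v]) return true;
    return false;
}

/* Executes the body of reverse() by applying one update per statement. */
void reverse_by_formulae(Store *s) {
    assign_null_y(s);
    while (any(s->x)) {          /* while (x != NULL) */
        copy_y_to_t(s);
        copy_x_to_y(s);
        advance_x(s);
        store_t_in_y_next(s);
    }
    assert(!any(s->x));          /* on exit: x == NULL */
}
```

Starting from x pointing to cell 0 with edges 0 to 1 and 1 to 2, the run ends with y pointing to cell 2 and the edges reversed.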

Outline
Logic and box/arrow diagrams
Kleene’s 3-valued logic
The abstraction principle
Using 3-valued structures to represent sets of stores
Conservative extraction of store properties
Abstract interpretation
More precise abstract interpretation

Two- vs. Three-Valued Logic
Two-valued logic: truth values 0 and 1, read as the sets {0} and {1}
Three-valued logic: adds a third value 1/2, read as the set {0,1}, with {0} ⊆ {0,1} and {1} ⊆ {0,1}
Information order of 3-valued logic: 0 ⊑ 1/2 and 1 ⊑ 1/2

Boolean Connectives [Kleene]
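Kleene's connectives can be written down in a few lines. This sketch assumes the common encoding in which the truth values are doubled so that 1/2 fits in an integer; the enum and function names are illustrative only.

```c
#include <assert.h>

/* Truth values scaled by two: 0, 1/2, 1 are stored as 0, 1, 2. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Conjunction is the minimum of the two values. */
Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }

/* Disjunction is the maximum of the two values. */
Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* Negation maps 0 to 1, 1 to 0, and leaves 1/2 unchanged. */
Kleene k_not(Kleene a) { return (Kleene)(K_TRUE - a); }

/* Embeds an ordinary Boolean (0 or 1) into the 3-valued domain. */
Kleene k_from_bool(int b) {
    assert(b == 0 || b == 1);
    return b ? K_TRUE : K_FALSE;
}
```

On the definite values 0 and 1 these functions agree with the ordinary Boolean connectives, so 2-valued logic is the special case in which 1/2 never occurs.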

Three-Valued Logic
1: True
0: False
1/2: Unknown
A join semi-lattice: 0 ⊔ 1 = 1/2
[Figure: the information order, with 0 and 1 below 1/2.]
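The join and the information order are small enough to state as code. The following is a sketch under the same doubled encoding of truth values; the names `k_join`, `k_leq_info`, and `k_join_all` are hypothetical helpers, not part of any tool.

```c
#include <assert.h>
#include <stdbool.h>

/* Truth values scaled by two: 0, 1/2, 1 are stored as 0, 1, 2. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Join in the information order: equal values are kept,
   any disagreement (including 0 joined with 1) yields 1/2. */
Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* a is below b in the information order when a == b or b is 1/2. */
bool k_leq_info(Kleene a, Kleene b) { return a == b || b == K_HALF; }

/* Join of a non-empty sequence of values. */
Kleene k_join_all(const Kleene *vals, int count) {
    assert(count > 0);
    Kleene acc = vals[0];
    for (int i = 1; i < count; i++)
        acc = k_join(acc, vals[i]);
    return acc;
}
```

Note that 0 and 1 are incomparable in this order: neither carries less information than the other, and 1/2 is the only value above both.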

The Abstraction Principle
[Figure: a list x → u1 → u2 → u3 → u4 and its abstraction x → u1 → u234, in which the cells u2, u3, and u4 are merged into the summary node u234.]

The Abstraction Principle
Partition the individuals into equivalence classes based on the values of their unary predicates
Collapse other predicates via ⊔
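The two steps can be sketched directly: group cells by their unary-predicate values, then join the binary predicate over each pair of groups. The code below assumes a four-cell store with unary predicates x and y and binary predicate n; `Store`, `Abstract`, and `abstract_store` are illustrative names, not TVLA's.

```c
#include <assert.h>
#include <stdbool.h>

#define N 4     /* concrete cells (an assumption) */
#define MAXC 4  /* two unary predicates give at most 2*2 classes */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct { bool x[N], y[N]; bool n[N][N]; } Store;

typedef struct {
    int count;             /* number of abstract nodes */
    int cls[N];            /* cls[v]: abstract node of concrete cell v */
    int size[MAXC];        /* cells per abstract node; more than 1 means summary node */
    Kleene n[MAXC][MAXC];  /* abstract n predicate */
} Abstract;

static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

Abstract abstract_store(const Store *s) {
    Abstract a = {0};
    int key_of[MAXC];

    /* Step 1: partition cells by the values of the unary predicates. */
    for (int v = 0; v < N; v++) {
        int key = (s->x[v] ? 1 : 0) | (s->y[v] ? 2 : 0);
        int c = 0;
        while (c < a.count && key_of[c] != key) c++;
        if (c == a.count) {          /* first cell with this key: new class */
            assert(c < MAXC);
            key_of[c] = key;
            a.count++;
        }
        a.cls[v] = c;
        a.size[c]++;
    }

    /* Step 2: collapse n by joining over all cell pairs in each class pair. */
    bool seen[MAXC][MAXC] = {{false}};
    for (int v1 = 0; v1 < N; v1++)
        for (int v2 = 0; v2 < N; v2++) {
            int c1 = a.cls[v1], c2 = a.cls[v2];
            Kleene val = s->n[v1][v2] ? K_TRUE : K_FALSE;
            a.n[c1][c2] = seen[c1][c2] ? k_join(a.n[c1][c2], val) : val;
            seen[c1][c2] = true;
        }
    return a;
}
```

For the list x → u1 → u2 → u3 → u4 this yields two abstract nodes, one for u1 and a summary node for the other three cells, with both the edge into the summary node and its self-edge equal to 1/2.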

What Stores Does a 3-Valued Structure Represent?
Example 3-valued structure
  – individuals: {u1}
  – predicates: [Table: predicate values]
  – graphical presentation: x pointing to u1
  – concrete stores represented: single cells pointed to by x

What Stores Does a 3-Valued Structure Represent?
[Figure: a second example 3-valued structure with individuals u1 and u and pointer variable x, shown with its graphical presentation and the concrete stores it represents.]

Property-Extraction Principle
Questions about store properties can be answered conservatively by evaluating formulae in three-valued logic
Formula evaluates to 1 ⇒ formula holds in every store
Formula evaluates to 0 ⇒ formula holds in no store
Formula evaluates to 1/2 ⇒ don’t know
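Evaluating a formula over a 3-valued structure replaces Boolean operations with Kleene's: conjunction becomes minimum and an existential quantifier becomes a maximum over all nodes. The sketch below assumes a two-node structure and treats v1 ≠ v2 on a summary node as 1/2; `Structure3` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define M 2  /* abstract nodes: index 0 plays u1, index 1 plays u (an assumption) */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct {
    Kleene x[M], y[M];  /* unary predicates */
    Kleene n[M][M];     /* binary predicate n */
    bool summary[M];    /* true if the node may stand for several cells */
} Structure3;

static Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }
static Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* v1 != v2: definite for distinct nodes, 1/2 for a summary node
   compared with itself, 0 for a non-summary node compared with itself. */
static Kleene k_neq(const Structure3 *a, int v1, int v2) {
    if (v1 != v2) return K_TRUE;
    return a->summary[v1] ? K_HALF : K_FALSE;
}

/* exists v: x(v) and y(v) */
Kleene eval_aliased(const Structure3 *a) {
    Kleene r = K_FALSE;
    for (int v = 0; v < M; v++)
        r = k_or(r, k_and(a->x[v], a->y[v]));
    return r;
}

/* exists v1,v2: n(v1,u) and n(v2,u) and v1 != v2 */
Kleene eval_heap_shared(const Structure3 *a, int u) {
    assert(u >= 0 && u < M);
    Kleene r = K_FALSE;
    for (int v1 = 0; v1 < M; v1++)
        for (int v2 = 0; v2 < M; v2++)
            r = k_or(r, k_and(k_and(a->n[v1][u], a->n[v2][u]), k_neq(a, v1, v2)));
    return r;
}
```

With x and y both pointing to node 0 and the edges into the summary node equal to 1/2, the alias query returns 1 (a definite yes) while the heap-sharing query on the summary node returns 1/2 (don't know).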

Are x and y Pointer Aliases?
∃v: x(v) ∧ y(v)
[Figure: a 3-valued structure with individuals u1 and u and pointer variables x and y; the formula evaluates to 1.]
Yes

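Is Cell u Heap-Shared?
∃v1,v2: n(v1,u) ∧ n(v2,u) ∧ v1 ≠ v2
[Figure: a 3-valued structure with individuals u1 and u and pointer variables x and y; the conjuncts are annotated with the values 1/2 and 1.]
Maybe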

Abstract Interpretation
f(a,b) = (16 * b + 3) * (2 * a + 1)
[Figure: the expression tree of f annotated with parities. The inputs a and b are unknown (?); the constants 16 and 2 and the products 16 * b and 2 * a are even (E); the constants 3 and 1, both sums, and the final product are odd (O).]
f : _ × _ → O
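The parity argument can be reproduced by interpreting + and * over an abstract domain of three values. This is a minimal sketch; the `Parity` type and the function names are chosen for illustration and are not taken from the slides.

```c
#include <assert.h>

/* Abstract domain: even, odd, or unknown parity. */
typedef enum { EVEN, ODD, UNKNOWN } Parity;

/* Abstraction of a concrete integer. */
Parity parity_of(long c) { return (c % 2 == 0) ? EVEN : ODD; }

/* Abstract addition: equal parities give even, different ones give odd. */
Parity parity_add(Parity a, Parity b) {
    if (a == UNKNOWN || b == UNKNOWN) return UNKNOWN;
    return a == b ? EVEN : ODD;
}

/* Abstract multiplication: an even factor forces an even product,
   even when the other factor is unknown. */
Parity parity_mul(Parity a, Parity b) {
    if (a == EVEN || b == EVEN) return EVEN;
    if (a == ODD && b == ODD) return ODD;
    return UNKNOWN;
}

/* Concrete function from the slide. */
long f_concrete(long a, long b) { return (16 * b + 3) * (2 * a + 1); }

/* The same expression evaluated over parities. */
Parity f_abstract(Parity a, Parity b) {
    Parity left  = parity_add(parity_mul(parity_of(16), b), parity_of(3));
    Parity right = parity_add(parity_mul(parity_of(2), a), parity_of(1));
    return parity_mul(left, right);
}

/* Spot check: the abstract result must describe the concrete result. */
void check_sound(long a, long b) {
    assert(parity_of(f_concrete(a, b)) == f_abstract(parity_of(a), parity_of(b)));
}
```

Calling `f_abstract(UNKNOWN, UNKNOWN)` yields `ODD`: the result is odd regardless of the inputs, which is the conclusion the annotated tree reaches.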

Shape Analysis via Abstract Interpretation
Iteratively compute a set of 3-valued structures for every program point
Every statement transforms structures according to the predicate-update formulae
  – use 3-valued logic instead of 2-valued logic
  – use exactly the predicate-update formulae of the concrete semantics!

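Predicate-Update Formulae for “y = x”
y'(v) = x(v)
[Figure: old and new 3-valued structures over u1 and u, with pointer variables x and y.]

Predicate-Update Formulae for “x = x->n”
x'(v) = ∃v1: x(v1) ∧ n(v1,v)
[Figure: old and new 3-valued structures over u1 and u, with pointer variables x and y.]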

Abstract Interpretation
[Figure: the concrete domain (sets of stores) and the abstract domain (descriptors of sets of stores), related by abstraction and concretization maps; the concrete transformer T acts on the concrete domain and the abstract transformer T# on the abstract domain.]
Ordinarily: must define both T and T#, with a complicated proof of correctness
Our approach: same formula for T and T#. No proof! We did it for you!

The Embedding Theorem
[Figure: a concrete store with cells u1, u2, u3, u4 and pointer variables x and y, together with successively coarser abstractions that use the summary nodes u34, u234, and u1234. The formula ∃v: x(v) ∧ y(v) is evaluated on the structures; the answers shown are No and Maybe.]

The Embedding Theorem
If a structure B can be embedded in a structure S via a surjective (onto) function f such that basic predicates are preserved, i.e., p^B(u1, ..., uk) ⊑ p^S(f(u1), ..., f(uk)),
then every formula φ is preserved:
  – If φ = 1 in S, then φ = 1 in B
  – If φ = 0 in S, then φ = 0 in B
  – If φ = 1/2 in S, then φ could be 0 or 1 in B
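The premise of the theorem is mechanical to check for small structures. The sketch below tests surjectivity of f and the ⊑ condition for one unary predicate x and one binary predicate n; it is a simplification under stated assumptions (fixed sizes, only two predicates), and `StructB`, `StructS`, and `embeds` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NB 4  /* nodes of the embedded structure B (an assumption) */
#define NS 2  /* nodes of the embedding structure S (an assumption) */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct { Kleene x[NB]; Kleene n[NB][NB]; } StructB;
typedef struct { Kleene x[NS]; Kleene n[NS][NS]; } StructS;

/* Information order: a is below b when a == b or b is 1/2. */
static bool leq_info(Kleene a, Kleene b) { return a == b || b == K_HALF; }

/* Returns true if f embeds B in S: f is onto and every predicate value
   in B is below the corresponding value in S. */
bool embeds(const StructB *b, const StructS *s, const int f[NB]) {
    bool hit[NS] = {false};
    for (int u = 0; u < NB; u++) {
        assert(f[u] >= 0 && f[u] < NS);
        hit[f[u]] = true;
    }
    for (int c = 0; c < NS; c++)
        if (!hit[c]) return false;               /* f is not surjective */

    for (int u1 = 0; u1 < NB; u1++) {
        if (!leq_info(b->x[u1], s->x[f[u1]])) return false;
        for (int u2 = 0; u2 < NB; u2++)
            if (!leq_info(b->n[u1][u2], s->n[f[u1]][f[u2]])) return false;
    }
    return true;
}
```

For example, the four-cell list with x on the first cell embeds in a two-node structure whose second node summarizes the remaining cells, provided the edges into that node are 1/2.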

How Are We Doing?
Conservative
Convenient
But not very precise
  – Advancing a pointer down a list loses precision
  – Cannot distinguish an acyclic list from a cyclic list

Cyclic versus Acyclic Lists
[Figure: lists pointed to by x and the 3-valued structures over u1 and u that represent them.]

The Instrumentation Principle
Increase precision by storing the truth-value of some chosen formulae
Introduce predicate-update formulae to update the extra predicates

Example: Heap Sharing
is(v) = ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2
[Figure: lists pointed to by x and their 3-valued structures over u1 and u, with nodes annotated is = 0 or is = 1.]

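Is Cell u Heap-Shared?
∃v1,v2: n(v1,u) ∧ n(v2,u) ∧ v1 ≠ v2
[Figure: a 3-valued structure with individuals u1 and u and pointer variables x and y, with is = 0 recorded.]
Evaluating the formula gives 1/2: Maybe
Reading the stored value is = 0: No!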

Predicate-Update Formulae for ‘y = NULL’
x'(v) = x(v)
y'(v) = 0
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)
is'(v) = is(v)

Predicate-Update Formulae for ‘y = x’
x'(v) = x(v)
y'(v) = x(v)
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)
is'(v) = is(v)

Example: Heap Sharing
is(v) = ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2
[Figure: lists pointed to by x and their 3-valued structures over u1 and u, with nodes annotated is = 0, is = 1, or is = 1/2.]

Predicate-Update Formulae for ‘x = x->n’
x'(v) = ∃v1: x(v1) ∧ n(v1,v)
y'(v) = y(v)
t'(v) = t(v)
n'(v1,v2) = n(v1,v2)
is'(v) = is(v)

Predicate-Update Formulae for ‘y->n = t’
x'(v) = x(v)
y'(v) = y(v)
t'(v) = t(v)
n'(v1,v2) = (¬y(v1) ∧ n(v1,v2)) ∨ (y(v1) ∧ t(v2))
is'(v) = ∃v1,v2: (is(v) ∧ n'(v1,v) ∧ n'(v2,v) ∧ v1 ≠ v2) ∨ (t(v) ∧ n(v1,v) ∧ ¬y(v1))

Predicate-Update Formulae for ‘y->n = t’
x'(v) = x(v)
y'(v) = y(v)
t'(v) = t(v)
n'(v1,v2) = (¬y(v1) ∧ n(v1,v2)) ∨ (y(v1) ∧ t(v2))
is'(v) = (¬((∃v1: y(v1) ∧ n(v1,v)) ∨ t(v)) ∧ is(v))
       ∨ (((∃v1: y(v1) ∧ n(v1,v)) ∨ t(v)) ∧ (is(v) ∨ t(v)) ∧ ∃v1,v2: n'(v1,v) ∧ n'(v2,v) ∧ v1 ≠ v2)

Additional Instrumentation Predicates
reachable-from-variable-x(v)
acyclic-along-dimension-d(v)
  – à la ADDS
doubly-linked(v)
tree(v)
dag(v)
AVL trees:
  – balanced(v), left-heavy(v), right-heavy(v)
  – ... but not via height arithmetic
Need FO + TC

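Materialization
[Figure: the effect of x = x->n on a list pointed to by x and y, drawn informally and formally.]

Materialization
[Figure: x = x->n applied to a structure over u1 and u2 with pointer variables x and y. In [Chase, Wegman, & Zadeck 90] the result still has the nodes u1 and u2; in [Sagiv, Reps, & Wilhelm 96, 98] the result has the nodes u1, u3, and u2, with u3 newly materialized.]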

The Focusing Principle
“Bring the structure into better focus”
  – Selectively force 1/2 to 0 or 1
  – Avoid indefiniteness
Then apply the predicate-update formulae

(1) Focus on ∃v1: x(v1) ∧ n(v1,v)
[Figure: the structure over u1 and u, with pointer variables x and y, is replaced by several structures; in one of them the node u is split into u.1 and u.0.]

(2) Evaluate Predicate-Update Formulae
x'(v) = ∃v1: x(v1) ∧ n(v1,v)
[Figure: the update for x is evaluated on each focused structure, including the one with nodes u1, u.1, and u.0.]

The Coercion Principle
Increase precision by exploiting some structural properties possessed by all stores
Structural properties captured by constraints
Apply a constraint solver

(3) Apply Constraint Solver
Constraints:
  – ¬is(v) ∧ n(v1,v) ∧ v1 ≠ v2 ⇒ ¬n(v2,v)
  – n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2 ⇒ is(v)
  – x(v1) ∧ x(v2) ⇒ v1 = v2
[Figure: the constraints are applied step by step to the structures over u1, u.1, and u.0, with pointer variables x and y, replacing indefinite values by definite ones where the constraints force them.]

Formalizing “...”
[Figure: informal and formal drawings of two lists pointed to by x and y; in the formal drawing the unbounded part of each list is a summary node.]

Formalizing “...”
[Figure: the same two lists with the additional pointer variables t1 and t2.]

Formalizing “...”
[Figure: the two lists with nodes labeled r[x] (reachable from variable x) and r[y] (reachable from variable y).]

Formalizing “...”
[Figure: the lists with x, y, t1, and t2, with nodes labeled r[x],r[t1] and r[y],r[t2].]

Additional Instrumentation Predicates
reachable-from-variable-x(v)
acyclic-following-field-f(v)
doubly-linked(v)
tree(v)
dag(v)
AVL trees:
  – balanced(v), left-heavy(v), right-heavy(v)
  – ... but not via height arithmetic

A ‘Yacc’ for Shape Analysis

%predicate-update formulae
stmt : $x = NULL         { is'(v) = is(v); }
     | $x = $t           { is'(v) = is(v); }
     | $x = $t->n        { is'(v) = is(v); }
     | $x->n = $t        { is'(v) = ∃v1,v2: (is(v) ∧ n'(v1,v) ∧ n'(v2,v) ∧ v1 ≠ v2) ∨ (t(v) ∧ n(v1,v) ∧ ¬y(v1)); }
     | $x = malloc(INT)  { is'(v) = is(v) ∧ ¬NEW(v); }
     ;

%pointer-field predicates
n(v1,v2)

%instrumentation-predicate definitions
is(v) = ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2

Why is Shape Analysis Difficult?
Destructive updating through pointers
  – p->next = q
  – Produces complicated aliasing relationships
  – Track aliasing on 3-valued structures
Dynamic storage allocation
  – No bound on the size of run-time data structures
  – Abstraction principle ⇒ finite-sized 3-valued structures
Data-structure invariants typically only hold at the beginning and end of operations
  – Need to verify that data-structure invariants are re-established
  – Evaluate formulas over 3-valued structures

Example: In-Situ List Reversal
[Demo: the analysis is run on reverse().]

Example: Mark and Sweep

void Mark(Node root) {
    if (root != NULL) {
        pending = ∅
        pending = pending ∪ {root}
        marked = ∅
        while (pending ≠ ∅) {
            x = SelectAndRemove(pending)
            marked = marked ∪ {x}
            t = x->left
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
            t = x->right
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
        }
        assert(marked == Reachset(root))
    }
}

void Sweep() {
    unexplored = Universe
    collected = ∅
    while (unexplored ≠ ∅) {
        x = SelectAndRemove(unexplored)
        if (x ∉ marked)
            collected = collected ∪ {x}
    }
    assert(collected == Universe - Reachset(root))
}

[Demo: the analysis is run on Mark and Sweep.]

TVLA vs. Model Checking
TVLA
  – Determine properties of a transition system
  – State-space exploration
  – State labels: 1st-order structures
  – 3-valued structures represent commonalities
  – Properties checked: formulas in FO+TC
Model checking
  – Determine properties of a transition system
  – State-space exploration
  – State labels: propositions
  – BDDs represent commonalities
  – Properties checked: formulas in temporal logic

Summary
1/2 arises from abstraction
  – One-sided analyses (e.g., 1 means “true”, 0 means “don’t know”) conflate 0 and 1/2
  – 1/2 essential; conflation not essential
For program analysis, 3-valued logic allows:
  – Materialization
  – Conservative extraction of properties

Cleanness Checking

typedef struct list_cell {
    int val;
    struct list_cell *next;
} *List;

bool member(int v, List c) {
    List e = c;
    while (e != NULL) {
        if (e->val == v)    /* potential dereference of NULL? */
            return TRUE;
        e = e->next;        /* potential dereference of NULL? */
    }
    return FALSE;
}

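Possibly Uninitialized Variables
[Figure: control-flow graph with the nodes Start, x = 3, if ..., y = x, y = w, w = 8, and printf(y); program points are annotated with sets of possibly uninitialized variables such as {w,x,y}, {w,y}, {w}, and {}.]
Transfer functions:
  Start:   λV. {w,x,y}
  x = 3:   λV. V – {x}
  if ...:  λV. V
  w = 8:   λV. V – {w}
  y = x:   λV. if x ∈ V then V ∪ {y} else V – {y}
  y = w:   λV. if w ∈ V then V ∪ {y} else V – {y}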
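The transfer functions translate naturally into bit-set operations. The sketch below assumes one bit per variable and uses set union as the merge at control-flow joins, which is the usual choice for a "possibly" analysis but is not stated on the slide; `VarSet` and the `tf_` names are hypothetical.

```c
#include <assert.h>

/* A set of variables as a bit mask: one bit per program variable. */
typedef unsigned VarSet;
enum { W = 1u << 0, X = 1u << 1, Y = 1u << 2 };

/* Start: every variable is possibly uninitialized, whatever came before. */
VarSet tf_start(VarSet v) { (void)v; return W | X | Y; }

/* target = constant: the target is now definitely initialized. */
VarSet tf_assign_const(VarSet v, VarSet target) { return v & ~target; }

/* target = source: the target inherits the status of the source. */
VarSet tf_copy(VarSet v, VarSet target, VarSet source) {
    assert(target != 0 && source != 0);
    return (v & source) ? (v | target) : (v & ~target);
}

/* Merge at a control-flow join (assumed): union of both branches. */
VarSet tf_join(VarSet a, VarSet b) { return a | b; }
```

Applying `tf_start`, then `tf_assign_const` for x = 3, then `tf_copy` for y = x leaves only w in the set, whereas taking the y = w branch instead keeps both w and y, in line with the sets annotated on the graph.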