Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)
x = 3; y = 1/(x-3); x = 3; px = &x; y = 1/(*px-3); x = 3; p = (int*)malloc(sizeof int); *p = x; q = p; y = 1/(*q-3); need to track values other than 0 need to track pointers need to track heap-allocated storage
Flow-Sensitive Points-To Analysis a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e a d b c f e 33 22 11 44 a = &e 55 c = &f *b = c b = ab = a d = *a p = &q; p = q; p = *q; *p = q; pq p r1r1 r2r2 q r1r1 r2r2 q s1s1 s2s2 s3s3 p p s1s1 s2s2 q r1r1 r2r2 pq p r1r1 r2r2 q r1r1 r2r2 q s1s1 s2s2 s3s3 p p s1s1 s2s2 q r1r1 r2r2
What About Malloc? Each malloc site malloc-site variable r p int *p, *q, *r; a: p = (int*)malloc(sizeof(int)); b: q = (int*)malloc(sizeof(int)); r = p; q malloc$b malloc$a
What About Malloc? Each malloc site malloc-site variable /* Create a List of length n */ List head; List *p = &head; for (int i = 0; i < n; i++) { *p = (List)malloc(sizeof(List*)); p = &((*p) next); } typedef struct list_cell { int val; struct list_cell *next; } *List; p head
Shape Analysis [Jones and Muchnick 1981] Characterize dynamically allocated data –Identify may-alias relationships –x points to an acyclic list, cyclic list, tree, dag, … –“disjointedness” properties x and y point to structures that do not share cells –show that data-structure invariants hold Account for destructive updates through pointers
Applications: Software Tools Static detection of memory errors –dereferencing NULL pointers –dereferencing dangling pointers –memory leaks Static detection of logical errors –Is a data-structure invariant restored?
Why is Shape Analysis Difficult? Destructive updating through pointers –p next = q –Produces complicated aliasing relationships Dynamic storage allocation –No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations –Want to verify that data-structure invariants are re-established
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; 123 NULL x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL Materialization
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Example: In-Situ List Reversal List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y; } typedef struct list_cell { int val; struct list_cell *next; } *List; x yt NULL
Idea for a List Abstraction represents x y t NULL x y t x y t x y t
x yt x y t x y t return y t = y y next = t y = x x = x next x != NULL x yt NULL x yt x y t x y t x y t x y t x y t x y t x y t x y t x y t x y t
Properties of reverse(x) On entry, x points to an acyclic list On each iteration, x & y point to disjoint acyclic lists All the pointer dereferences are safe No memory leaks On exit, y points to an acyclic list On exit, x = = NULL All cells reachable from y on exit were reachable from x on entry, and vice versa On exit, the order between neighbors in the y-list is opposite to their order in the x-list on entry
A ‘Yacc’ for Shape Analysis: TVLA Parametric framework –Some instantiations known analyses –Other instantiations new analyses Applications beyond shape analysis –Partial correctness of sorting algorithms –Safety of mobile code –Deadlock detection in multi-threaded programs –Partial correctness of mark-and-sweep gc alg. –Correct usage of Java iterators
A ‘Yacc’ for Static Analysis: TVLA Parametric framework –Some instantiations known analyses –Other instantiations new analyses Applications beyond shape analysis –Partial correctness of sorting algorithms –Safety of mobile code –Deadlock detection in multi-threaded programs –Partial correctness of mark-and-sweep gc alg. –Correct usage of Java iterators
Formalizing “... ” Informal: x Formal: x Summary node
The French Recipe for Program Verification [Cousot & Cousot] Concrete operational semantics – st : Collecting semantics – st c : 2 2 Abstract semantics – : 2 A, : A 2 , ( (a)) = a, ( (C)) C – st # : A A –Upper approximation Sound results But may produce false alarms
Using Relations to Represent Linked Lists
u1u1 u2u2 u3u3 u4u4 x y
Formulas: Queries for Observing Properties Are x and y pointer aliases? v: x(v) y(v)
x y u1u1 u2u2 u3u3 u4u4 Are x and y Pointer Aliases? v: x(v) y(v) x y u1u1 1 Yes
Predicate-Update Formulas for “y = x” x’(v) = x(v) y’(v) = x(v) t’(v) = t(v) n’(v 1,v 2 ) = n(v 1,v 2 )
x u1u1 u2u2 u3u3 u4u4 y’(v) = x(v) y Predicate-Update Formulas for “y = x”
Predicate-Update Formulas for “x = x n” x’(v) = v 1 : x(v 1 ) n(v 1,v) y’(v) = y(v) t’(v) = t(v) n’(v 1, v 2 ) = n(v 1, v 2 )
x u1u1 u2u2 u3u3 u4u4 y x’(v) = v 1 : x(v 1 ) n(v 1,v) x Predicate-Update Formulas for “x = x n”
Why is Shape Analysis Difficult? Destructive updating through pointers –p next = q –Produces complicated aliasing relationships Dynamic storage allocation –No bound on the size of run-time data structures Data-structure invariants typically only hold at the beginning and end of operations –Need to verify that data-structure invariants are re-established
Two- vs. Three-Valued Logic 01 Two-valued logic {0,1} {0}{1} Three-valued logic {0} {0,1} {1} {0,1}
Two- vs. Three-Valued Logic Two-valued logicThree-valued logic
Two- vs. Three-Valued Logic 01 Two-valued logic {0}{1} Three-valued logic {0,1}
Two- vs. Three-Valued Logic 01 Two-valued logic ½ 01 Three-valued logic 0 ½ 1 ½
Boolean Connectives [Kleene]
Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234
Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234
Canonical Abstraction u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234
Property-Extraction Principle Questions about a family of two-valued stores can be answered conservatively by evaluating a formula in a three-valued store Formula evaluates to 1 formula holds in every store in the family Formula evaluates to 0 formula does not hold in any store in the family Formula evaluates to 1/2 formula may hold in some; not hold in others
Are x and y Pointer Aliases? u1u1 u x y v: x(v) y(v) Yes 1
Is Cell u Heap-Shared? v 1,v 2 : n(v 1,u) n(v 2,u) v 1 v 2 u Yes 1
Maybe Is Cell u Heap-Shared? v 1,v 2 : n(v 1,u) n(v 2,u) v 1 v 2 u1u1 u x y 1/2 1
The Embedding Theorem y x u1u1 u 34 u2u2 y x u1u1 u 234 y x u1u1 u3u3 u2u2 u4u4 x y u 1234 v: x(v) y(v) Maybe No
The Embedding Theorem If a structure B can be embedded in a structure S by an onto function f, such that basic predicates are preserved, i.e., p B (u 1,.., u k ) p S (f(u 1 ),..., f(u k )) Then every formula is preserved: –If = 1 in S, then = 1 in B –If = 0 in S, then = 0 in B –If = 1/2 in S, then could be 0 or 1 in B
Embedding u1u1 u2u2 u3u3 u4u4 x u5u5 u6u6 u 12 u 34 u 56 x u 123 u 456 x
Canonical Abstraction: An Embedding Whose Result is of Bounded Size u1u1 u2u2 u3u3 u4u4 x u1u1 x u 234
Predicate-Update Formulas for “y = x” y ’ (v) = x(v) Old: u1u1 u x y New: u1u1 u x
Predicate-Update Formulas for “x = x n” x ’ (v) = v 1 : x(v 1 ) n(v 1,v) y Old: u1u1 u x y New: u1u1 u x
x yt NULL return y t = y y next = t y = x x = x next x != NULL x yt NULL x yt u1u1 u x y u1u1 u x y ’ (v) = x(v) 1010
x yt NULL x y t x y t return y t = y y next = t y = x x = x next x != NULL x yt NULL x yt x y t x y t x y t x y t x y t x y t x y t x y t x y t x y t
x’ (v) = v 1 : x(v 1 ) n(v 1,v) Naïve Transformer (x = x n) x y Evaluate update formulas y x
Cyclic versus Acyclic Lists x u1u1 u x u1u1 u x
How Are We Doing? Conservative Convenient But not very precise –Advancing a pointer down a list loses precision –Cannot distinguish an acyclic list from a cyclic list
The Instrumentation Principle Increase precision by storing the truth-value of some chosen formulas
Is Cell u Heap-Shared? v 1,v 2 : n(v 1,u) n(v 2,u) v 1 v 2 u
is = 0 Example: Heap Sharing x is(v) = v 1,v 2 : n(v 1,v) n(v 2,v) v 1 v 2 u1u1 u x u1u1 u x is = 0
Example: Heap Sharing x is(v) = v 1,v 2 : n(v 1,v) n(v 2,v) v 1 v 2 is = 0 is = 1 u1u1 u x u1u1 u x is = 0 is = 1
Example: Cyclicity x c(v) = v 1 : n(v,v 1 ) n * (v 1,v) c = 0 is = 0c = 1 u1u1 u x u1u1 u x c = 0 c = 1
Is Cell u Heap-Shared? v 1,v 2 : n(v 1,u) n(v 2,u) v 1 v 2 u1u1 u x y is = 0 No! 1/2 1 Maybe
Formalizing “... ” Informal: x y Formal: x y
Formalizing “... ” Informal: x y t2t2 t1t1 Formal: x y t2t2 t1t1
Formalizing “... ” Informal: x y Formal: x y reachable from variable x reachable from variable y r[x]r[x] r[y]r[y] r[x]r[x] r[y]r[y]
Formalizing “... ” Informal: x y t2t2 t1t1 Formal: t2t2 t1t1 r[x],r[t 1 ] r[y],r[t 2 ] r[x],r[t 1 ] r[y],r[t 2 ] x y r[y]r[y] r[x]r[x]r[x]r[x] r[y]r[y]
Updating Auxiliary Information t1t1 {r[t 1 ],r[x],r[y]}{p[t 1 ],r[x],r[y]} x {p[x],p[y]} {r[x],r[y]} y t1t1 {r[t 1 ],r[x]}{p[t 1 ],r[x]} x {p[x]} {r[x]} y = NULL
Automatic Generation of Update Formulas for Instrumentation Predicates Where do we get the predicate-update formulas to update the extra predicates?
Automatic Generation of Update Formulas for Instrumentation Predicates Originally, user provided –Update formulas for core predicates: c,st (v) –Definitions of instrumentation predicates: p (v) –Update formulas for instrumentation predicates: p,st (v) Now: p,st created from p and the c,st dddd Consistently defined?
doubly-linked(v) reachable-from-variable-x(v) acyclic-along-dimension-d(v) tree(v) dag(v) AVL trees: –balanced(v), left-heavy(v), right-heavy(v) –... but not via height arithmetic Useful Instrumentation Predicates Need FO + TC
Materialization x = x n Informal: x y y x x = x n Formal: x y x y x y x = x n y x
Naïve Transformer (x = x n) x y Evaluate update formulas y x
Best Transformer (x = x n) x y y x y x y x y x Evaluate update formulas y x y x......
“Focus”-Based Transformer (x = x n) x y y x y x Focus(x n) “Partial ” y x y x Evaluate update formulas y x y x