Effectively-Propositional Reasoning about Reachability in Linked Data Structures
Shachar Itzhaky (TAU), Anindya Banerjee (IMDEA), Neil Immerman (UMASS), Aleks Nanevski (IMDEA), Mooly Sagiv (TAU)
Motivation
Proving presence (absence) of pointer paths between memory-allocated objects in a given program
–Partial program correctness: memory safety, absence of memory leaks, data structure invariants (acyclicity, sortedness)
–Total program correctness
–Program equivalence
Program Termination
{x⟨n*⟩y}
traverse(Node x, Node y) {
    for (t = x; t != y; t = t.n) { … }
}
[Figure: singly linked list over n with pointers x and y, ending in null]
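Disjoint Parallelism
{∀α: α ≠ null → ¬(h⟨n*⟩α ∧ k⟨n*⟩α)}
for (x = h; x != null; x = x.n) { … }
for (y = k; y != null; y = y.n) { … }
[Figure: two separate n-linked lists with heads h and k, traversed by x and y, each ending in null]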
Challenges
Complexity of reasoning about reachability assertions
–Undecidability of reachability
[Inferring reachability properties from the code]
"there is a mismatch between the simple intuitions about the way pointer operations work and the complexity of their axiomatic treatments" (O'Hearn, Reynolds, Yang [CSL 2001])
Linked list manipulations are simple
Simple to reason about correctness
–Small counterexamples
“Simple” invariants
–Alternation-free + reachability
EA(∃*∀*) formulas: Bernays-Schönfinkel-Ramsey
t ::= var | constant   (Terms)
ap ::= t1 = t2 | r(t1, t2, …, tn)
qf ::= ap | qf1 ∧ qf2 | qf1 ∨ qf2 | ¬qf
ea ::= ∃α1, α2, …, αn: ∀β1, β2, …, βm: qf
Effectively propositional
–Skolemization yields finite models
–Equisatisfiable to a propositional formula
–Support from Z3
EA(∃*∀*) formulas: Bernays-Schönfinkel-Ramsey
∃α1, α2: ∀β1: r(α1, β1) ∨ r(β1, α2)
=sat ∀β1: r(c1, β1) ∨ r(β1, c2)
=sat (r(c1, c1) ∨ r(c1, c2)) ∧ (r(c1, c2) ∨ r(c2, c2))
=sat (P11 ∨ P12) ∧ (P12 ∨ P22)
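The grounding step on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: it assumes the connective inside the formula is a disjunction, and the names `CONSTANTS`, `ground_clauses`, and `propositional_models` are hypothetical. The existential variables become the Skolem constants c1 and c2, the universal variable is instantiated with each constant, and every ground atom is then treated as a propositional variable.

```python
from itertools import product

# Hypothetical Skolem constants standing in for the two existential variables.
CONSTANTS = ["c1", "c2"]


def ground_clauses():
    """Instantiate `forall b: r(c1, b) or r(b, c2)` with every constant b."""
    return [[("c1", b), (b, "c2")] for b in CONSTANTS]


def propositional_models():
    """Treat each ground atom r(x, y) as a Boolean variable and enumerate models."""
    clauses = ground_clauses()
    atoms = sorted({atom for clause in clauses for atom in clause})
    models = []
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        # A model must satisfy every ground clause (the clauses are conjoined).
        if all(any(assignment[atom] for atom in clause) for clause in clauses):
            models.append(assignment)
    return atoms, models
```

A real pipeline would hand the ground clauses to a SAT solver instead of enumerating assignments; the enumeration only keeps the sketch dependency-free.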
Alternation Free Reachability (AF^R)
“Extended subset” of EA
–Closed under negation
t ::= var | constant   (Terms)
ap ::= t1 = t2 | r(t1, t2, …, tn) | t1⟨f*⟩t2   (Reachability via sequences of f’s: ∃k: f^k(t1) = t2)
qf ::= ap | qf1 ∧ qf2 | qf1 ∨ qf2 | ¬qf
e ::= ∃α1, α2, …, αn: qf
a ::= ∀α1, α2, …, αm: qf
af^R ::= e | a | af^R1 ∧ af^R2 | af^R1 ∨ af^R2
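AF^R Program Properties
Acyclicity
–∀α,β: α⟨n*⟩β ∧ β⟨n*⟩α → α = β
Acyclic list with a head h
–∀α,β: h⟨n*⟩α ∧ h⟨n*⟩β ∧ α⟨n*⟩β ∧ β⟨n*⟩α → α = β
Sorted segment
–∀α,β: […] (formula over the data ordering)
[Figure: list with head h and n* edges; segment between u and v]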
AF^R Program Properties
Doubly linked lists
–∀α,β: α⟨f*⟩β ↔ β⟨b*⟩α
Disjoint lists with heads h and k
–∀α: α ≠ null → ¬(h⟨n*⟩α ∧ k⟨n*⟩α)
[Figure: a doubly linked list with f* and b* edges; two n*-lists with heads h and k]
List Reversal (isolated)
{ac[h] ∧ ∀α: h⟨n*⟩α}
Node reverse(Node h) {
    Node c = h;
    Node d = null;
    while {I} (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
{ac[d] ∧ ∀α,β: […] ∧ ∀α: d⟨n*⟩α}
Invariant: I = ∀α,β: […] (formula over c and d)
[Figure: list nodes linked by n and n* edges, with h, c, and d marked and null at the end]
List Reversal (ownership)
{ac[h] ∧ ∀α,β: […]} (formula over reachability from h)
Node reverse(Node h) {
    Node c = h;
    Node d = null;
    while {I} (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
Postcondition: ∀α,β: […], given as a table over h⟨n*⟩α, ¬h⟨n*⟩α, h⟨n*⟩β, ¬h⟨n*⟩β (Cases 1–4; one entry is false)
[Figures: Cases 1–4, each showing nodes α and β relative to the list headed by h before and after reversal]
Why AF^R?
Represents the invariants of simple linked list manipulations
Closed under Boolean connectives (¬, ∧, ∨, →)
Finite model property
Decidable for satisfiability/validity
AF^R → AF
Can be reduced to a propositional formula
–SAT solver is complete for verification/falsification
AF^R → AF
Introduce an auxiliary relation n*
t[α⟨n*⟩β] = n*(α, β)
Completely axiomatize n* by an AF formula
linOrd = ∀α,β: n*(α,β) ∧ n*(β,α) ↔ α = β
       ∧ ∀α,β,γ: n*(α,β) ∧ n*(β,γ) → n*(α,γ)
       ∧ ∀α,β,γ: n*(α,β) ∧ n*(α,γ) → (n*(β,γ) ∨ n*(γ,β))
φ is satisfiable ⇔ (linOrd ∧ t[φ]) is satisfiable
–AF formulas have the finite model property
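As a sanity check on the three linOrd axioms (reflexivity with antisymmetry, transitivity, and linearity of successors), the following sketch computes n* from a concrete successor map and tests the axioms by brute force. The helper names `reach_star` and `lin_ord` and the dictionary encoding of the heap are assumptions made for illustration; null is represented by `None` and is not treated as a node.

```python
def reach_star(n, nodes):
    """Return {(a, b)} such that b is reachable from a in zero or more n-steps.

    `n` maps each node to its successor, or to None for a null successor.
    """
    rel = set()
    for a in nodes:
        cur, seen = a, set()
        while cur is not None and cur not in seen:
            rel.add((a, cur))
            seen.add(cur)
            cur = n[cur]
    return rel


def lin_ord(rel, nodes):
    """Check the three conjuncts of linOrd on a finite relation."""
    refl_antisym = all(
        ((a, b) in rel and (b, a) in rel) == (a == b)
        for a in nodes for b in nodes
    )
    transitive = all(
        (a, c) in rel
        for a in nodes for b in nodes for c in nodes
        if (a, b) in rel and (b, c) in rel
    )
    linear = all(
        (b, c) in rel or (c, b) in rel
        for a in nodes for b in nodes for c in nodes
        if (a, b) in rel and (a, c) in rel
    )
    return refl_antisym and transitive and linear
```

On an acyclic heap such as 1 → 2 → 3 with 4 → 3 the axioms hold, while a two-node cycle violates antisymmetry.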
Inverting n* to n
Every finite model in which n* satisfies the order requirements
linOrd = ∀α,β: n*(α,β) ∧ n*(β,α) ↔ α = β
       ∧ ∀α,β,γ: n*(α,β) ∧ n*(β,γ) → n*(α,γ)
       ∧ ∀α,β,γ: n*(α,β) ∧ n*(α,γ) → (n*(β,γ) ∨ n*(γ,β))
n* uniquely determines n
Inverting n* to n
[Figure: nodes u, v, w, x, y connected by n* edges]
Inverting n* to n
n(α) = β ≡ α⟨n+⟩β ∧ ∀γ: α⟨n+⟩γ → β⟨n*⟩γ
[Figure: nodes u, v, w, x, y with n+ edges; the n edges are the n+ edges to the closest successor]
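The inversion can be illustrated with a small round-trip check: compute n* from a successor map, then recover n as "the closest strict successor", i.e. the node β with α n+ β such that every other strict successor of α is reachable from β. The function names and the dictionary heap encoding below are hypothetical, and α n+ β is taken to mean α n* β with α ≠ β, which is an assumption valid for acyclic heaps.

```python
def reach_star(n, nodes):
    """Reflexive-transitive closure of the successor map n (None encodes null)."""
    rel = set()
    for a in nodes:
        cur, seen = a, set()
        while cur is not None and cur not in seen:
            rel.add((a, cur))
            seen.add(cur)
            cur = n[cur]
    return rel


def invert(rel, nodes):
    """Recover n from n*: n(a) = b iff a n+ b and every c with a n+ c has b n* c."""
    def plus(a, b):
        return a != b and (a, b) in rel

    n = {}
    for a in nodes:
        candidates = [
            b for b in nodes
            if plus(a, b) and all((b, c) in rel for c in nodes if plus(a, c))
        ]
        # Under linOrd there is at most one candidate; no candidate means n(a) is null.
        n[a] = candidates[0] if candidates else None
    return n
```

For an acyclic heap, `invert(reach_star(n, nodes), nodes)` returns the original map `n`.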
Simple SAT Application
Determine if two clients are identical
–Produce isomorphic reachable stores
reverse(reverse(h)) = h
Verification Process
Program P + Assertions → VC gen → Verification Condition → SAT Solver → Counterexample / Proof
Weakest Precondition
wp: Stm → (Ass → Ass)
wp⟦S⟧(Q) – the weakest condition such that every terminating computation of S results in a state satisfying Q
σ ⊨ wp⟦S⟧(Q) ⇔ ∀σ’: σ ⟦S⟧ σ’ → σ’ ⊨ Q
Can be used to compute verification conditions
Hoare Assignment Rule
wp⟦x := e⟧(Q) = Q[e / x]
wp⟦x := 5⟧(x = 5) = (5 = 5) ≡ true
wp⟦x := 5⟧(x = 6) = (5 = 6) ≡ false
wp⟦x := x + 1⟧(x = 7) = (x + 1 = 7) ≡ x = 6
wp⟦c := d⟧(∀α,β: […]) = ∀α,β: […] with d substituted for c
WP Compound statements
wp⟦skip⟧(Q) = Q
wp⟦x := e⟧(Q) = Q[e / x]
wp⟦S1; S2⟧(Q) = wp⟦S1⟧(wp⟦S2⟧(Q))
wp⟦if B then S1 else S2⟧(Q) = (B → wp⟦S1⟧(Q)) ∧ (¬B → wp⟦S2⟧(Q))
wp⟦while B do {I} S⟧(Q) = I
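The rules above translate almost directly into a recursive function. The sketch below is a semantic version under assumed encodings: assertions are Python predicates over a state dictionary rather than formulas, statements are tuples with hypothetical tags (`"skip"`, `"assign"`, `"seq"`, `"if"`, `"while"`), and substitution Q[e / x] is modelled by evaluating Q in the updated state.

```python
def wp(stmt, Q):
    """Weakest precondition of `stmt` with respect to postcondition Q.

    Assertions are functions state -> bool; states are dicts from names to values.
    """
    kind = stmt[0]
    if kind == "skip":
        return Q
    if kind == "assign":
        _, x, e = stmt
        # Q[e / x]: evaluate Q in the state where x holds the value of e.
        return lambda s: Q({**s, x: e(s)})
    if kind == "seq":
        _, s1, s2 = stmt
        return wp(s1, wp(s2, Q))
    if kind == "if":
        _, b, s1, s2 = stmt
        w1, w2 = wp(s1, Q), wp(s2, Q)
        # (B -> wp S1 Q) and (not B -> wp S2 Q)
        return lambda s: w1(s) if b(s) else w2(s)
    if kind == "while":
        _, b, inv, body = stmt
        # The annotated invariant stands in for the loop's precondition.
        return inv
    raise ValueError(f"unknown statement kind: {kind}")
```

For instance, with `inc = ("assign", "x", lambda s: s["x"] + 1)` and `Q = lambda s: s["x"] == 7`, the predicate `wp(inc, Q)` holds exactly in states where x is 6, matching the x + 1 = 7 example of the assignment rule.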
VC rules
VCgen({P} S {Q}) = (P → wp⟦S⟧(Q)) ∧ ⋀ VCaux(S, Q)
VCaux(S, Q) = {} (for any atomic statement)
VCaux(S1; S2, Q) = VCaux(S1, wp⟦S2⟧(Q)) ∪ VCaux(S2, Q)
VCaux(if C then S1 else S2, Q) = VCaux(S1, Q) ∪ VCaux(S2, Q)
VCaux(while B do {I} S, Q) = VCaux(S, I) ∪ {I ∧ B → wp⟦S⟧(I)} ∪ {I ∧ ¬B → Q}
But how about heap mutations?
McCarthy assignment rule does not work
wp⟦c.n := null⟧(Q) = Q[n[c ↦ null] / n]
–Refers to n
–Does not explicitly update reachability
–Outside AF^R
Employ incremental updates
[Diagram: x.n := null takes n to n’; n* and n’* are the FO-TC closures of n and n’; n’* is obtained from n* by a QF update]
Dong & Su [SIGMOD’00]
DAG
[Figure: update of reachability in a DAG for an edge between c and d, with an accompanying formula over c and n]
Deterministic Graphs (function)
[Figure: four configurations of nodes c and d in a functional graph]
Mutating Singly Linked Lists
wp⟦c.n := null⟧(Q) = Q[(α⟨n*⟩β ∧ (¬α⟨n*⟩c ∨ β⟨n*⟩c)) / α⟨n*⟩β]
Can also enforce absence of null dereferences: c ≠ null
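The quantifier-free update for c.n := null can be checked exhaustively on small heaps. The sketch below is a brute-force test rather than a proof, and it assumes the update reads "α reaches β afterwards iff α reached β before and either α did not reach c or β reached c". It enumerates every acyclic successor map over four nodes (an arbitrary bound) and compares the predicted relation with the recomputed closure; all function names are hypothetical.

```python
from itertools import product

NODES = [0, 1, 2, 3]  # small arbitrary universe for the exhaustive check


def reach_star(n):
    """Reflexive-transitive closure of the successor map n (None encodes null)."""
    rel = set()
    for a in NODES:
        cur, seen = a, set()
        while cur is not None and cur not in seen:
            rel.add((a, cur))
            seen.add(cur)
            cur = n[cur]
    return rel


def is_acyclic(n):
    """True if following n from any node eventually reaches null."""
    for a in NODES:
        cur, steps = n[a], 0
        while cur is not None:
            steps += 1
            if steps > len(NODES):
                return False
            cur = n[cur]
    return True


def removal_update_is_exact():
    """Compare the QF update for c.n := null against recomputing n* from scratch."""
    for succs in product(NODES + [None], repeat=len(NODES)):
        n = dict(zip(NODES, succs))
        if not is_acyclic(n):
            continue
        before = reach_star(n)
        for c in NODES:
            after = reach_star({**n, c: None})
            predicted = {
                (a, b) for (a, b) in before
                if (a, c) not in before or (b, c) in before
            }
            if predicted != after:
                return False
    return True
```

The check passes for this bound, which matches the intuition that a path from α to β survives the mutation exactly when it does not use the edge leaving c.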
Circular Linked Lists
Slightly more complex but quantifier-free [Hesse’03, Reps, Lahiri & Qadeer POPL’08]
wp remains in QF
Single Mutation c.n := y (assuming c.n = null)
Simple for general graphs
AF^R for arbitrary data structures
wp⟦c.n := y⟧(Q) = Q[(α⟨n*⟩β ∨ (α⟨n*⟩c ∧ y⟨n*⟩β)) / α⟨n*⟩β]
Can also enforce acyclicity: ¬y⟨n*⟩c
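The edge-insertion update can be checked the same way. This sketch assumes the update reads "α reaches β afterwards iff α reached β before, or α reached c and y reached β", and it only considers mutations where c.n is null and y does not already reach c (so the heap stays acyclic). The four-node bound and all names are illustrative assumptions.

```python
from itertools import product

NODES = [0, 1, 2, 3]  # small arbitrary universe for the exhaustive check


def reach_star(n):
    """Reflexive-transitive closure of the successor map n (None encodes null)."""
    rel = set()
    for a in NODES:
        cur, seen = a, set()
        while cur is not None and cur not in seen:
            rel.add((a, cur))
            seen.add(cur)
            cur = n[cur]
    return rel


def is_acyclic(n):
    """True if following n from any node eventually reaches null."""
    for a in NODES:
        cur, steps = n[a], 0
        while cur is not None:
            steps += 1
            if steps > len(NODES):
                return False
            cur = n[cur]
    return True


def insertion_update_is_exact():
    """Compare the update for c.n := y against recomputing n* from scratch."""
    for succs in product(NODES + [None], repeat=len(NODES)):
        n = dict(zip(NODES, succs))
        if not is_acyclic(n):
            continue
        before = reach_star(n)
        for c in NODES:
            if n[c] is not None:
                continue  # the rule assumes c.n = null beforehand
            for y in NODES:
                if (y, c) in before:
                    continue  # adding c -> y would close a cycle
                after = reach_star({**n, c: y})
                predicted = before | {
                    (a, b) for a in NODES for b in NODES
                    if (a, c) in before and (y, b) in before
                }
                if predicted != after:
                    return False
    return True
```

Skipping the pairs with `(y, c) in before` is what the side condition ¬y⟨n*⟩c expresses.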
But what about pointer traversals?
x := x.n
Hoare assignment rule goes outside AF^R
wp⟦x := y.n⟧(Q) = Q[n(y) / x]
–Outside AF^R
Reason about list segments
Coincides with complications in pointer and shape analysis
Pointer Traversals
Observe that wp is only used positively in VCs (unlike invariants and preconditions)
Allows EA formulas with reachability (AE^R)
wp⟦x := y.n⟧(Q) = ∀α: ‘n(y) = α’ → Q[α / x]
–Replace n with n* using reachability inversions
Universal quantifications are also used for allocation x := new()
Backward Reasoning with WP
{a⟨n*⟩e ∧ c⟨n*⟩b ∧ disjoint(a,c)}
{∀α: ‘n(e) = α’ → ((a⟨n*⟩b ∧ (¬a⟨n*⟩α ∨ b⟨n*⟩α)) ∨ (a⟨n*⟩α ∧ c⟨n*⟩b ∧ (¬c⟨n*⟩α ∨ b⟨n*⟩α)))}
d := e.n;
{(a⟨n*⟩b ∧ (¬a⟨n*⟩d ∨ b⟨n*⟩d)) ∨ (a⟨n*⟩d ∧ (¬a⟨n*⟩d ∨ d⟨n*⟩d) ∧ c⟨n*⟩b ∧ (¬c⟨n*⟩d ∨ b⟨n*⟩d))}
    where (¬a⟨n*⟩d ∨ d⟨n*⟩d) simplifies to true
d.n := null;
{a⟨n*⟩b ∨ (a⟨n*⟩d ∧ c⟨n*⟩b)}
d.n := c;
{a⟨n*⟩b}
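Closure Properties
[Diagram: the formula classes QF, AF, EA, and AE, each annotated with the connectives and wp transformers it is closed under; the transformers shown are wp⟦x := y⟧, wp⟦x.n := y⟧, wp⟦x := y.n⟧, and wp⟦x := new()⟧]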
Benchmarks
Columns: Benchmark | Formula size (P,Q; I; VC) | Solving time in ms (Z3)
SLL: reverse, filter, create, delete, deleteAll, insert, find, last, merge, rotate, swap
DLL: fix, splice
[The numeric table entries are missing from the extracted text]
Disproving with SAT
Columns: Benchmark | Nature of defect | Formula size (P,Q; I; VC) | Solving time in ms (Z3) | Counterexample size (vertices)
SLL: find | null pointer dereference | counterexample size 2
SLL: deleteAll | loop invariant in annotation is too weak to prove the desired property | counterexample size 5
SLL: rotate | transient cycle introduced during execution | counterexample size 3
SLL: insert | unhandled corner case when an element with the same value already exists in the list; ordering violated | counterexample size 4
[Formula sizes and solving times are missing from the extracted text]
Example Bug
Node insert(Node h, Node e) {
    Node i = h, j = null;
    while {I} (i != null && e.val >= i.val) {
        j = i;
        i = i.n;
    }
    if (j != null) {
        j.n = e;
        e.n = i;
    } else {
        e.n = h;
        h = e;
    }
    return h;
}
I = ∀α: […] (formula over h, i, e, and the < ordering on val)
[Figure: counterexample list with nodes h, i, e, i’ linked by n, their val fields, and null]
Data Structures outside AF^R
–Lists with the same lengths
–DAGs
–Grids
–…
List Reversal (general)
{ac[h] ∧ ∀α,β: […]} (formula over reachability from h)
Node reverse(Node h) {
    Node c = h;
    Node d = null;
    while {I} (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
Postcondition: ∀α,β: […], given as a table over h⟨n*⟩α, ¬h⟨n*⟩α, h⟨n*⟩β, ¬h⟨n*⟩β, with entries that also refer to n(α), h, and null (one entry is false)
Related Work
Axiomatizing Reachability
–[Nelson POPL’83] Useful axioms
–[Lev-Ami’09] Useful axioms + completeness study
Descriptive Complexity [Hesse’03, Reps’03, Lahiri & Qadeer POPL’08]
Decidable Logics [Mona, STRAND, LRP]
Summary
Reduction to SAT
Works for many programs
Principles
–Restricted invariants
–Inversion of n*
–Incremental updates
–Two logics