Interprocedural shape analysis for cutpoint-free programs
Noam Rinetzky, Tel Aviv University
Joint work with Mooly Sagiv (Tel Aviv University) and Eran Yahav (IBM Watson)
Motivation
Conservative static program analysis of programs with
  (Recursive) procedures
  Dynamically allocated data structures
Applications
  Cleanness
  Verify data structure invariants
  Compile-time garbage collection
  …
A crash course in abstract interpretation
Static analysis: automatic derivation of properties that hold on every execution leading to a program point
  main() {
    int w=0, x=0, y=0, z=0;
    w = 2 + y - 1;
    x = 2 + z - 1;
    assert: w+x is even
  }
[Figure: parity of w, x, y, z at each program point (E = even, O = odd)]
How to handle procedures?
Pure functions
  Procedure input/output relation
  No side-effects
  main() {
    int w=0, x=0, y=0, z=0;
    w = inc(y);
    x = inc(z);
    assert: w+x is even
  }
  int inc(int p) { return 2 + p - 1; }
[Table: concrete input/output relation of inc, columns p and ret, one row per input value, …]
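How to handle procedures?
Pure functions
  Procedure input/output relation
  No side-effects
  main() {
    int w=0, x=0, y=0, z=0;
    w = inc(y);
    x = inc(z);
    assert: w+x is even
  }
  int inc(int p) { return 2 + p - 1; }
[Table: abstract input/output relation of inc, columns p and ret, with the row p = Even, ret = Odd]
[Figure: parity of w, x, y, z at each program point (E = even, O = odd)]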
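The tabulation idea on this slide can be sketched in Java as follows. This is a minimal illustration, not the analyzer's code: the names Parity, ParitySummary, incAbstract, tabulateInc and analyzeMain are hypothetical, and the abstract domain is the even/odd domain used in the example.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical parity domain for the slide's example (E = EVEN, O = ODD).
enum Parity {
    EVEN, ODD;

    // Abstract addition: the sum is even exactly when both parities agree.
    Parity plus(Parity other) { return this == other ? EVEN : ODD; }

    // Abstract subtraction has the same parity table as addition.
    Parity minus(Parity other) { return plus(other); }

    // Abstraction of a concrete integer.
    static Parity of(int n) { return n % 2 == 0 ? EVEN : ODD; }
}

class ParitySummary {
    // Abstract version of: int inc(int p) { return 2 + p - 1; }
    static Parity incAbstract(Parity p) {
        return Parity.of(2).plus(p).minus(Parity.of(1));
    }

    // Tabulate inc once: one row per abstract input value.
    static Map<Parity, Parity> tabulateInc() {
        Map<Parity, Parity> table = new EnumMap<>(Parity.class);
        for (Parity p : Parity.values()) table.put(p, incAbstract(p));
        return table;
    }

    // Abstract run of main: w = inc(y); x = inc(z); returns the parity of w + x.
    static Parity analyzeMain() {
        Map<Parity, Parity> inc = tabulateInc();
        Parity y = Parity.of(0), z = Parity.of(0);
        Parity w = inc.get(y);   // table lookup instead of re-analyzing inc
        Parity x = inc.get(z);
        return w.plus(x);
    }
}
```

Both calls to inc are answered from the same two-row table, which is what makes the input/output relation a reusable summary.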
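What about global variables?
Procedures have side-effects
Easy fix: treat the global as an extra input and output, so the tabulated relation maps (p, g) to (ret, g’)
  int g = 0;
  main() {
    int w=0, x=0, y=0, z=0;
    w = inc(y);
    x = inc(z);
    assert: w+x+g is even
  }
  int inc(int p) { g = p; return 2 + p - 1; }
[Table: concrete relation with columns p, g, ret, g’, one row per input value, …]
[Table: abstract relation with columns p, g, ret, g’, with the row p = Even, g = E/O, ret = Odd]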
But what about pointers and heap?
Aliasing
Destructive update
Heap
  Global resource
  Anonymous objects
[Figure: before the call, x.n.n ~ y; append(y,z) performs y.n=z; after the call, x.n.n.n ~ z]
How to tabulate append?
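The aliasing effect in the figure can be reproduced with a few lines of Java. This is a sketch under assumptions: ListNode and AppendAliasing are hypothetical names, and append is assumed to walk to the end of its first argument and link the second one there, which is consistent with the y.n=z update shown for a one-element list.

```java
// Hypothetical list cell with a single pointer field n, as in the slides.
class ListNode {
    ListNode n;
}

class AppendAliasing {
    // Destructive append: walks to the last node of p and links it to q.
    // Assumes p is non-null and acyclic.
    static void append(ListNode p, ListNode q) {
        ListNode last = p;
        while (last.n != null) last = last.n;
        last.n = q;
    }

    // Builds the slide's scenario and reports whether x.n.n.n aliases z
    // after the call, although the call never mentions x.
    static boolean demo() {
        ListNode y = new ListNode();
        ListNode x = new ListNode();
        x.n = new ListNode();
        x.n.n = y;                 // x.n.n ~ y
        ListNode z = new ListNode();
        append(y, z);              // performs y.n = z
        return x.n.n.n == z;       // x.n.n.n ~ z
    }
}
```

The caller's view through x changes even though x is not an argument, which is why a plain input/output table over the parameters is not enough for heap-manipulating procedures.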
How to tabulate procedures?
Procedure input/output relation
  Not reachable → not affected
  proc: local (reachable) heap → local heap
  main() { append(y,z); }
  append(List p, List q) { … }
[Figure: the heap of main (variables x, y, z, t) and the local heap passed to append (parameters p, q), before and after the call]
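How to handle sharing?
External sharing may break the functional view
  main() { append(y,z); }
  append(List p, List q) { … }
[Figure: the heap of main (variables x, y, z, t) and the local heap passed to append (parameters p, q), where an object inside the passed heap is also referenced from outside it]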
What’s the difference?
[Figure: 1st example and 2nd example side by side, each showing the heap of main (variables x, y, z, t) at the call append(y,z)]
Cutpoints
An object is a cutpoint for an invocation if it is
  Reachable from actual parameters
  Not pointed to by an actual parameter
  Reachable without going through a parameter
[Figure: two invocations of append(y,z) with variables x, y, z, t; one contains a cutpoint, the other does not]
Cutpoint freedom
Cutpoint-free
  Invocation: has no cutpoints
  Execution: every invocation is cutpoint-free
  Program: every execution is cutpoint-free
[Figure: two invocations of append(y,z) with variables x, y, z, t]
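The cutpoint definition can be checked directly on a concrete heap. The sketch below is an illustration only: Obj, Cutpoints and the demo scenario are hypothetical, objects have a single field n as in the slides, and "reachable without going through a parameter" is read as "pointed to by a non-parameter local of the caller, or by a field of an object that is not reachable from the actual parameters".

```java
import java.util.*;

// Hypothetical heap object with a single pointer field n.
class Obj {
    final String id;
    Obj n;
    Obj(String id) { this.id = id; }
}

class Cutpoints {
    // Objects reachable from the given roots via n-fields (roots included).
    static Set<Obj> reachable(Collection<Obj> roots) {
        Set<Obj> seen = new HashSet<>();
        for (Obj r : roots)
            for (Obj o = r; o != null && seen.add(o); o = o.n) { }
        return seen;
    }

    // Cutpoints of a call whose actual parameters are the given caller locals.
    static Set<Obj> cutpoints(List<Obj> heap, Map<String, Obj> locals, Set<String> actuals) {
        Set<Obj> params = new HashSet<>();
        for (String a : actuals)
            if (locals.get(a) != null) params.add(locals.get(a));
        Set<Obj> reach = reachable(params);

        // Objects entered from outside: via another local or via an unreachable object.
        Set<Obj> entered = new HashSet<>();
        for (Map.Entry<String, Obj> e : locals.entrySet())
            if (!actuals.contains(e.getKey()) && e.getValue() != null) entered.add(e.getValue());
        for (Obj o : heap)
            if (!reach.contains(o) && o.n != null) entered.add(o.n);

        Set<Obj> result = new HashSet<>(reach);
        result.removeAll(params);     // not pointed to by an actual parameter
        result.retainAll(entered);    // reachable without going through a parameter
        return result;
    }

    // y -> y1 -> y2, z -> z1, t -> z1; x points to y1 or into the middle of y's list.
    static Set<String> demo(boolean xPointsIntoList) {
        Obj y1 = new Obj("y1"), y2 = new Obj("y2"), z1 = new Obj("z1");
        y1.n = y2;
        Map<String, Obj> locals = new HashMap<>();
        locals.put("y", y1);
        locals.put("z", z1);
        locals.put("t", z1);
        locals.put("x", xPointsIntoList ? y2 : y1);
        Set<String> ids = new TreeSet<>();
        for (Obj o : cutpoints(Arrays.asList(y1, y2, z1), locals,
                               new HashSet<>(Arrays.asList("y", "z"))))
            ids.add(o.id);
        return ids;
    }
}
```

With x pointing to the head of y's list the invocation append(y,z) is cutpoint-free; with x pointing to the second cell, that cell is a cutpoint.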
Main results
Cutpoint freedom
Non-standard concrete semantics
  Verifies that an execution is cutpoint-free
  Local heaps
Interprocedural shape analysis
  Conservatively verifies that a program is cutpoint-free
  Desired properties
    Partial correctness of quicksort
  Procedure summaries
Prototype implementation
Plan
  Cutpoint freedom
  Non-standard concrete semantics
  Interprocedural shape analysis
  Prototype implementation
Programming model
Single threaded
Procedures
  Value parameters
  Formal parameters not modified
  Recursion
Heap
  Recursive data structures
  Destructive update
  No explicit addressing (&)
  No pointer arithmetic
Memory states
A memory state encodes a local heap
  Local variables of the current procedure invocation
  Relevant part of the heap
    Relevant = reachable
[Figure: the local heap of main (variables x, y, z, t) and the local heap of append (parameters p, q)]
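Memory states
Represented by first-order logical structures
  Predicate    Meaning
  x(v)         Variable x points to v
  n(v1,v2)     Field n of object v1 points to v2
Example with two objects u1 and u2: p(u1) = 1, q(u2) = 1, n(u1,u2) = 1
[Figure: p points to u1, q points to u2, and the n field of u1 points to u2]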
Operational semantics
Statements modify values of predicates
Specified by predicate-update formulae
  Formulae in first-order logic with transitive closure (FO-TC)
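A small Java sketch may help to see how a memory state is stored as a logical structure and how a statement becomes a predicate update. The class Structure and its methods are hypothetical; the update formula used for the statement x = y.n, namely x'(v) = ∃v1: y(v1) ∧ n(v1,v), is the usual one in this framework and is stated here as an assumption rather than taken from the slide.

```java
import java.util.*;

// A 2-valued logical structure: individuals are strings, unary predicates are
// sets of individuals, and the binary predicate n is a set of (v1, v2) pairs.
class Structure {
    final Set<String> universe = new TreeSet<>();
    final Map<String, Set<String>> unary = new HashMap<>();
    final Set<List<String>> nEdges = new HashSet<>();

    // Value of the unary predicate pred on individual v.
    boolean holds(String pred, String v) {
        return unary.getOrDefault(pred, Collections.emptySet()).contains(v);
    }

    // Value of the binary predicate n on (v1, v2).
    boolean n(String v1, String v2) {
        return nEdges.contains(Arrays.asList(v1, v2));
    }

    // Predicate update for the statement "x = y.n":
    //   x'(v) = exists v1 : y(v1) and n(v1, v)
    void assignNext(String x, String y) {
        Set<String> updated = new TreeSet<>();
        for (String v : universe)
            for (String v1 : universe)
                if (holds(y, v1) && n(v1, v)) updated.add(v);
        unary.put(x, updated);
    }

    // The two-object example: p(u1), q(u2), n(u1, u2).
    static Structure slideExample() {
        Structure s = new Structure();
        s.universe.addAll(Arrays.asList("u1", "u2"));
        s.unary.put("p", new TreeSet<>(Arrays.asList("u1")));
        s.unary.put("q", new TreeSet<>(Arrays.asList("u2")));
        s.nEdges.add(Arrays.asList("u1", "u2"));
        return s;
    }
}
```

For instance, calling assignNext("t", "p") on the example structure makes t hold exactly on u2, mirroring the statement t = p.n.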
Procedure calls
Call append(y,z) to the procedure append(p,q)
  1. Verify cutpoint freedom
  2. Compute input
  … Execute callee (append body) …
  3. Combine output
[Figure: the caller's state (variables x, y, z) and the callee's input state (parameters p, q)]
Procedure call: 1. Verifying cutpoint-freedom
An object is a cutpoint for an invocation if it is
  Reachable from actual parameters
  Not pointed to by an actual parameter
  Reachable without going through a parameter
[Figure: two invocations of append(y,z) with variables x, y, z, t; the first is cutpoint-free, the second is not cutpoint-free]
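Procedure call: 1. Verifying cutpoint-freedom
Invoking append(y,z) in main (main’s locals: x, y, z, t)
$R_{\{y,z\}}(v) = \exists v_1 : y(v_1) \land n^*(v_1, v) \;\lor\; \exists v_1 : z(v_1) \land n^*(v_1, v)$
$isCP_{main,\{y,z\}}(v) = R_{\{y,z\}}(v) \land (\lnot y(v) \land \lnot z(v)) \land (x(v) \lor t(v) \lor \exists v_1 : \lnot R_{\{y,z\}}(v_1) \land n(v_1, v))$
[Figure: two invocations of append(y,z) with variables x, y, z, t; the first is cutpoint-free, the second is not cutpoint-free]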
Procedure call: 2. Computing the input local heap
Retain only reachable objects
Bind formal parameters
[Figure: the call state (variables x, y, z, t) and the resulting input state (parameters p, q)]
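The two operations of this step can be sketched on a concrete heap as follows. Cell, LocalHeap and callInput are hypothetical names, and the sketch assumes objects with a single field n and a call that has already been verified to be cutpoint-free.

```java
import java.util.*;

// Hypothetical heap object with a single pointer field n.
class Cell {
    final String id;
    Cell n;
    Cell(String id) { this.id = id; }
}

// A local heap: the variables of one invocation plus the objects they reach.
class LocalHeap {
    final Map<String, Cell> vars = new HashMap<>();
    final Set<Cell> objects = new HashSet<>();

    // Keeps only the objects reachable from the actuals and binds the i-th
    // formal parameter to the value of the i-th actual parameter.
    static LocalHeap callInput(Map<String, Cell> callerVars,
                               List<String> actuals, List<String> formals) {
        LocalHeap in = new LocalHeap();
        for (int i = 0; i < formals.size(); i++) {
            Cell root = callerVars.get(actuals.get(i));
            in.vars.put(formals.get(i), root);
            for (Cell c = root; c != null && in.objects.add(c); c = c.n) { }
        }
        return in;
    }

    // Call append(y,z) from a state with locals x, y, z, t.
    static String demo() {
        Cell y1 = new Cell("y1"), y2 = new Cell("y2"), z1 = new Cell("z1");
        Cell x1 = new Cell("x1"), t1 = new Cell("t1");
        y1.n = y2;
        x1.n = y1;   // x reaches y's list only through the object y points to
        Map<String, Cell> caller = new HashMap<>();
        caller.put("x", x1);
        caller.put("y", y1);
        caller.put("z", z1);
        caller.put("t", t1);
        LocalHeap in = callInput(caller, Arrays.asList("y", "z"), Arrays.asList("p", "q"));
        Set<String> ids = new TreeSet<>();
        for (Cell c : in.objects) ids.add(c.id);
        return in.vars.get("p").id + "," + in.vars.get("q").id + " " + ids;
    }
}
```

In the demo the objects x1 and t1 stay behind in the caller: they are not reachable from the actual parameters, so the callee never sees them.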
Procedure body: append(p,q)
[Figure: the input state and the output state of append, over parameters p and q]
Procedure call: 3. Combine output
[Figure: the call state (variables x, y, z, t) and the callee's output state (parameters p, q) are combined into the caller's return state]
Auxiliary predicates
  inUc(v)
  inUx(v)
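Observational equivalence
$\sigma_{CPF} \in \Sigma_{CPF}$ (cutpoint-free semantics)
$\sigma_{GSB} \in \Sigma_{GSB}$ (standard semantics)
$\sigma_{CPF}$ and $\sigma_{GSB}$ are observationally equivalent when, for every pair of access paths $AP_1$, $AP_2$:
$[\![AP_1 = AP_2]\!](\sigma_{CPF}) \iff [\![AP_1 = AP_2]\!](\sigma_{GSB})$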
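Observational equivalence
For cutpoint-free programs:
$\sigma_{CPF} \in \Sigma_{CPF}$ (cutpoint-free semantics)
$\sigma_{GSB} \in \Sigma_{GSB}$ (standard semantics)
If $\sigma_{CPF}$ and $\sigma_{GSB}$ are observationally equivalent, then it holds that
$\langle st, \sigma_{CPF} \rangle \Rightarrow \sigma'_{CPF} \iff \langle st, \sigma_{GSB} \rangle \Rightarrow \sigma'_{GSB}$
and $\sigma'_{CPF}$ and $\sigma'_{GSB}$ are observationally equivalent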
Introducing local heap semantics
[Diagram: the standard operational semantics is related (~) to the local-heap operational semantics, from which the abstract transformer is derived]
Shape abstraction
Abstract memory states represent unbounded concrete memory states
  Conservatively
  In a bounded way
  Using 3-valued logical structures
3-Valued logic
1 = true
0 = false
1/2 = unknown
A join semi-lattice: $0 \sqcup 1 = 1/2$
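The three truth values and their join fit into a small Java enum. The name Kleene and the method names are hypothetical; the connectives follow Kleene's 3-valued logic, where conjunction is the minimum and disjunction the maximum in the order 0 < 1/2 < 1.

```java
// Hypothetical encoding of the three truth values 0, 1/2 and 1.
enum Kleene {
    FALSE, UNKNOWN, TRUE;   // declared in truth order: 0 < 1/2 < 1

    // Information-order join: equal values are kept, different values give 1/2.
    Kleene join(Kleene other) { return this == other ? this : UNKNOWN; }

    // Kleene conjunction: the minimum in truth order.
    Kleene and(Kleene other) { return compareTo(other) <= 0 ? this : other; }

    // Kleene disjunction: the maximum in truth order.
    Kleene or(Kleene other) { return compareTo(other) >= 0 ? this : other; }

    // Negation swaps 0 and 1 and leaves 1/2 unchanged.
    Kleene not() {
        if (this == UNKNOWN) return UNKNOWN;
        return this == TRUE ? FALSE : TRUE;
    }
}
```

A definite answer survives an unknown operand whenever the other operand already decides the result, for example 0 ∧ 1/2 = 0.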
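Canonical abstraction
[Figure: a concrete memory state with lists pointed to by x, y, z, t and its canonical abstraction, in which objects that agree on all unary predicates are merged]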
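Canonical abstraction can be sketched as grouping individuals by the unary predicates they satisfy and joining the binary predicate over each group. The code is an illustration under assumptions: CanonicalAbstraction and its methods are hypothetical, truth values are written as the strings "0", "1/2" and "1", and only the pointed-to-by-x predicate is used for naming.

```java
import java.util.*;

class CanonicalAbstraction {
    // Canonical name of an individual: the set of unary predicates it satisfies.
    static String canonicalName(String v, Map<String, Set<String>> unary) {
        Set<String> name = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : unary.entrySet())
            if (e.getValue().contains(v)) name.add(e.getKey());
        return name.toString();
    }

    // Groups individuals by canonical name; a group with several members is a summary node.
    static Map<String, List<String>> abstractNodes(List<String> universe,
                                                   Map<String, Set<String>> unary) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String v : universe)
            groups.computeIfAbsent(canonicalName(v, unary), k -> new ArrayList<>()).add(v);
        return groups;
    }

    // Abstract value of n between two abstract nodes: "1" if n holds for all
    // concrete pairs they represent, "0" if it holds for none, "1/2" otherwise.
    static String abstractEdge(List<String> from, List<String> to, Set<List<String>> n) {
        int hits = 0;
        for (String a : from)
            for (String b : to)
                if (n.contains(Arrays.asList(a, b))) hits++;
        if (hits == 0) return "0";
        return hits == from.size() * to.size() ? "1" : "1/2";
    }

    // Demo state: x -> u1 -> u2 -> u3 -> u4.
    static final List<String> UNIVERSE = Arrays.asList("u1", "u2", "u3", "u4");

    static Map<String, Set<String>> demoUnary() {
        Map<String, Set<String>> unary = new HashMap<>();
        unary.put("x", Collections.singleton("u1"));
        return unary;
    }

    static String demoNodes() {
        return abstractNodes(UNIVERSE, demoUnary()).toString();
    }

    static String demoEdge() {
        Set<List<String>> n = new HashSet<>();
        n.add(Arrays.asList("u1", "u2"));
        n.add(Arrays.asList("u2", "u3"));
        n.add(Arrays.asList("u3", "u4"));
        Map<String, List<String>> nodes = abstractNodes(UNIVERSE, demoUnary());
        return abstractEdge(nodes.get("[x]"), nodes.get("[]"), n);
    }
}
```

In the demo the four-cell list collapses to two abstract nodes, the cell pointed to by x and a summary node for the rest, and the n edge from the first to the second gets the value 1/2.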
Instrumentation predicates
Record derived properties
Refine the abstraction
  Instrumentation principle [SRW, TOPLAS ’02]
Reachability is central!
  Predicate       Meaning
  rx(v)           v is reachable from variable x
  robj(v1,v2)     v2 is reachable from v1
  ils(v)          v is heap-shared
  c(v)            v resides on a cycle
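On a concrete heap these derived properties are simple graph computations. The sketch below is hypothetical (the class Instrumentation and its methods are not part of the analyzer); it assumes a single field n, stores the heap as a map from each object to its n-successor, and reads "heap-shared" as "pointed to by the n-field of two or more objects".

```java
import java.util.*;

class Instrumentation {
    // rx(v): the objects reachable from root by following n-fields (root included).
    static Set<String> reachableFrom(String root, Map<String, String> next) {
        Set<String> seen = new TreeSet<>();
        for (String v = root; v != null && seen.add(v); v = next.get(v)) { }
        return seen;
    }

    // Heap-shared objects: targets of the n-fields of two or more objects.
    static Set<String> heapShared(Map<String, String> next) {
        Map<String, Integer> inDegree = new HashMap<>();
        for (String target : next.values()) inDegree.merge(target, 1, Integer::sum);
        Set<String> shared = new TreeSet<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet())
            if (e.getValue() >= 2) shared.add(e.getKey());
        return shared;
    }

    // c(v): v resides on a cycle, i.e. v is reachable from its own successor.
    static boolean onCycle(String v, Map<String, String> next) {
        return reachableFrom(next.get(v), next).contains(v);
    }

    // Two lists, x1 -> x2 -> s and y1 -> s, sharing the tail s -> e.
    static Map<String, String> demoHeap() {
        Map<String, String> next = new HashMap<>();
        next.put("x1", "x2");
        next.put("x2", "s");
        next.put("y1", "s");
        next.put("s", "e");
        return next;
    }
}
```

Recording such facts as extra unary predicates lets the abstraction keep objects apart that would otherwise be merged, for example a shared cell and an unshared one.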
Abstract memory states (with reachability)
[Figure: a concrete memory state with lists pointed to by x, y, z, t, each object annotated with its reachability predicates (rx, ry, rz, rt), and the corresponding abstract memory state]
The importance of reachability: Call append(y,z)
[Figure: abstract memory states around the call append(y,z), with nodes annotated by the reachability predicates rx, ry, rz, rt]
Abstract semantics
Conservatively apply statements on abstract memory states
  Same formulae as in concrete semantics
  Soundness guaranteed [SRW, TOPLAS ’02]
Procedure calls
Call append(y,z) to the procedure append(p,q)
  1. Verify cutpoint freedom
  2. Compute input
  … Execute callee (append body) …
  3. Combine output
[Figure: the caller's state (variables x, y, z) and the callee's input state (parameters p, q)]
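Conservative verification of cutpoint-freedom
Invoking append(y,z) in main
$R_{\{y,z\}}(v) = \exists v_1 : y(v_1) \land n^*(v_1, v) \;\lor\; \exists v_1 : z(v_1) \land n^*(v_1, v)$
$isCP_{main,\{y,z\}}(v) = R_{\{y,z\}}(v) \land (\lnot y(v) \land \lnot z(v)) \land (x(v) \lor t(v) \lor \exists v_1 : \lnot R_{\{y,z\}}(v_1) \land n(v_1, v))$
[Figure: two abstract memory states at the call append(y,z), with nodes annotated by rx, ry, rz, rt; the first is cutpoint-free, the second is not cutpoint-free]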
Interprocedural shape analysis
Tabulation
  Analyze f
[Figure: the caller's state at call f(x), with variables x and y, is turned into an input state over the formal parameter p; the states at f's exits are tabulated and propagated back to the caller]
Interprocedural shape analysis
Procedure input/output relation
[Table: pairs of abstract Input and Output states of append over parameters p and q, with nodes annotated by rp and rq, …]
Interprocedural shape analysis
Reusable procedure summaries
  Heap modularity
[Figure: the calls append(y,z), in contexts with locals x, y, z, and append(h,i), in a context with locals g, h, i, k, are served by the same summary over parameters p and q]
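The reuse of summaries can be pictured as a memo table from input local heaps to output local heaps. The sketch below is a strong simplification under stated assumptions: SummaryTable and SummaryDemo are hypothetical, abstract states are stood in for by descriptive strings, and the fixpoint iteration that a real tabulation needs for recursive procedures is left out.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical summary table for one procedure: input state -> output state.
class SummaryTable<I, O> {
    private final Map<I, O> table = new HashMap<>();
    private final Function<I, O> analyzeBody;
    int analyses = 0;   // how many times the body was actually analyzed

    SummaryTable(Function<I, O> analyzeBody) { this.analyzeBody = analyzeBody; }

    // Returns the tabulated output for this input, analyzing the body only on a miss.
    O apply(I input) {
        O out = table.get(input);
        if (out == null) {
            analyses++;
            out = analyzeBody.apply(input);
            table.put(input, out);
        }
        return out;
    }
}

class SummaryDemo {
    static int demo() {
        // Stand-in for the analysis of append's body (an assumption for illustration).
        SummaryTable<String, String> append = new SummaryTable<String, String>(
            in -> in + " => q's list linked after p's list");

        // append(y,z) in one context and append(h,i) in another: once the actuals
        // are bound to p and q and the unreachable part of the caller's heap is
        // dropped, both calls present the same input local heap.
        append.apply("p: acyclic list, q: acyclic list, disjoint");
        append.apply("p: acyclic list, q: acyclic list, disjoint");
        return append.analyses;
    }
}
```

Because the input contains only the local heap, the caller's other variables (x in one context, g and k in the other) do not appear in the key, so the second call is a table hit.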
Prototype implementation
TVLA-based analyzer
Soot-based Java front-end
Parametric abstraction
  Data structure           Verified properties
  Singly linked list       Cleanness, acyclicity
  Sorting (of SLL)         + Sortedness
  Unshared binary trees    Cleanness, tree-ness
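Iterative vs. Recursive (SLL)
[Chart comparing iterative and recursive procedures on singly linked lists; the only surviving label is 585]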
Inline vs. Procedural abstraction
  // Allocates a list of length 3
  List create3() { … }
  main() {
    List x1 = create3();
    List x2 = create3();
    List x3 = create3();
    List x4 = create3();
  }
Call string vs. Relational vs. CPF
Call string [Rinetzky and Sagiv, CC ’01]
Relational [Jeannet et al., SAS ’04]
Related work
Interprocedural shape analysis
  Rinetzky and Sagiv, CC ’01
  Chong and Rugina, SAS ’03
  Jeannet et al., SAS ’04
  Hackett and Rugina, POPL ’05
  Rinetzky et al., POPL ’05
Local reasoning
  Ishtiaq and O’Hearn, POPL ’01
  Reynolds, LICS ’02
Encapsulation
  Noble et al., IWACO ’03
  ...
Future work
Bounded number of cutpoints
False cutpoints
  Liveness analysis
  append(y,z);
  x = null;
[Figure: abstract state at the call append(y,z) with variables x, y, z, t and nodes annotated by rx, ry, rz, rt]
Summary
  Cutpoint freedom
  Non-standard operational semantics
  Interprocedural shape analysis
    Partial correctness of quicksort
  Prototype implementation
End
www.cs.tau.ac.il/~maon
A semantics for procedure local heaps and its abstraction. Noam Rinetzky, Jörg Bauer, Thomas Reps, Mooly Sagiv, and Reinhard Wilhelm. POPL, 2005
Interprocedural shape analysis for cutpoint-free programs. Noam Rinetzky, Mooly Sagiv, and Eran Yahav. To appear in SAS, 2005
quickSort(List p, List q)
[Figure: example run of quickSort on a list holding the values 1 to 8, showing the pointers p, hd, tl, low and high at successive stages]
Quicksort
  List quickSort(List p, List q) {
    if (p == q || q == null) return p;
    List h = partition(p, q);
    List x = p.n;
    p.n = null;
    List low = quickSort(h, p);
    List high = quickSort(x, null);
    p.n = high;
    return low;
  }
[Figure: the list 2 1 4 3 5 7 6 8 with the pointers hd, p and tl]
Quicksort
[Figure: the quickSort code next to the list 2 1 4 3 5 7 6 8, first with the pointers hd, p and tl and then with the pointers low, p, tl and hd]
Quicksort
[Figure: memory states of quickSort on lists of unbounded length (…), with the pointers hd, p, tl and low]
[Lev-Ami et al., ISSTA ’00]