Program Analysis as Constraint Solving
Sumit Gulwani (MSR Redmond), Ramarathnam Venkatesan (MSR Redmond), Saurabh Srivastava (Univ. of Maryland)

Introduction
The last decade has witnessed an engineering revolution in constraint solving.
We have developed techniques to reduce classic program analysis problems to constraint solving (in the context of linear arithmetic properties):
–Program Verification
–Inter-procedural Program Analysis
–Weakest Precondition Generation
–Strongest Postcondition Generation
We show applications of these techniques to:
–Proving termination/Bounds Analysis
–Finding preconditions for non-termination
–Most-general counterexamples

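Program Verification
Verify a Hoare triple (Pre, Program, Post).
For a program of the form
    Pre
    while (c) S
    Post
the verification condition is a second-order satisfiability constraint:
$\exists I\ \forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
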
Solving second-order satisfiability constraints
Constraint to solve: $\exists I\ \forall X\ \phi_1(I, X)$
Second-order to first-order
–Assume I has some form, e.g., $\sum_j a_j x_j \ge 0$
–$\exists I\ \forall X\ \phi_1(I, X)$ translates to $\exists a_j\ \forall X\ \phi_2(a_j, X)$
First-order to “only existentially quantified”
–Farkas' lemma helps translate $\forall$ to $\exists$
–$\forall X\ (\bigwedge_k (e_k \ge 0) \Rightarrow e \ge 0)$ iff $\exists \lambda, \lambda_k \ge 0\ \forall X\ (e \equiv \lambda + \sum_k \lambda_k e_k)$
–Eliminate X from the polynomial equality by equating coefficients
–$\exists a_j\ \forall X\ \phi_2(a_j, X)$ translates to $\exists a_j\ \exists \lambda_k\ \phi_3(a_j, \lambda_k)$
“Only existentially quantified” to SAT
–Bit-vector modeling for integer variables

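The Farkas step can be illustrated with a small checker. This is a minimal sketch, not the authors' tool: the names `combine` and `is_farkas_certificate` and the dictionary encoding of linear expressions (variable name to coefficient, with the key "1" holding the constant term) are assumptions made for illustration. It checks a proposed set of non-negative multipliers by equating coefficients, which is the same step that eliminates X.

```python
from fractions import Fraction

def combine(lam0, lams, premises):
    """Return lam0 + sum_k lams[k] * premises[k] as a coefficient dict."""
    result = {"1": Fraction(lam0)}
    for lam, expr in zip(lams, premises):
        for var, coeff in expr.items():
            result[var] = result.get(var, Fraction(0)) + Fraction(lam) * coeff
    # Drop zero coefficients so that dict equality means "same polynomial".
    return {v: c for v, c in result.items() if c != 0}

def is_farkas_certificate(premises, goal, lam0, lams):
    """Check that goal is identically lam0 + sum_k lams[k] * premises[k]
    with all multipliers non-negative (hypothetical helper)."""
    if lam0 < 0 or any(lam < 0 for lam in lams):
        return False
    normalized_goal = {v: Fraction(c) for v, c in goal.items() if c != 0}
    return combine(lam0, lams, premises) == normalized_goal

# Premises: y - x >= 0 and x - 100 >= 0.  Goal: y - 100 >= 0.
premises = [{"y": 1, "x": -1}, {"x": 1, "1": -100}]
goal = {"y": 1, "1": -100}
print(is_farkas_certificate(premises, goal, 0, [1, 1]))
```

Here the multipliers 1 and 1 witness that $y \ge x$ and $x \ge 100$ together imply $y \ge 100$, because $(y - x) + (x - 100)$ equals $y - 100$ coefficient by coefficient.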
Program Verification: Example
    [n = 1 and m = 1]
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]
Invariant template with one conjunct: $a_0 + a_1 x + a_2 y + a_3 n + a_4 m \ge 0$
–Result: UNSAT (invalid triple or imprecise template)
Invariant template with two conjuncts: $(a_0 + a_1 x + a_2 y + a_3 n + a_4 m \ge 0) \land (b_0 + b_1 x + b_2 y + b_3 n + b_4 m \ge 0)$
–Satisfying solution: $a_2 = b_4 = 1$, $a_1 = b_3 = -1$
–Loop invariant: $y \ge x \land m \ge n$
Invariant template with three conjuncts: $(a_0 + a_1 x + a_2 y + a_3 n + a_4 m \ge 0) \land (b_0 + b_1 x + b_2 y + b_3 n + b_4 m \ge 0) \land (c_0 + c_1 x + c_2 y + c_3 n + c_4 m \ge 0)$
–Satisfying solution: $a_2 = b_0 = c_4 = 1$, $a_1 = b_3 = c_0 = -1$
–Loop invariant: $y \ge x \land m \ge 1 \land n \le 1$

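The candidate invariants for this example can be sanity-checked by brute force. The sketch below is only a bounded test over a small box of integer states, not a proof and not the constraint-based method itself; the function name `first_failed_vc` and the chosen ranges are assumptions made for illustration. It evaluates the three verification conditions (initiation, consecution, exit) for the example loop.

```python
from itertools import product

def first_failed_vc(inv, xs=range(-2, 104), ns=range(-2, 3)):
    """Return the name of the first verification condition that a candidate
    invariant inv(x, y, n, m) violates on the sampled states, or None."""
    # Initiation: Pre (n = 1, m = 1) followed by x := 0; y := 0 must give inv.
    if not inv(0, 0, 1, 1):
        return "initiation"
    for x, y, n, m in product(xs, xs, ns, ns):
        if not inv(x, y, n, m):
            continue
        # Consecution: inv and the guard must re-establish inv after the body.
        if x < 100 and not inv(x + n, y + m, n, m):
            return "consecution"
        # Exit: inv and the negated guard must imply the postcondition.
        if x >= 100 and not y >= 100:
            return "exit"
    return None

two_conjuncts = lambda x, y, n, m: y >= x and m >= n
one_conjunct = lambda x, y, n, m: y >= x

print(first_failed_vc(two_conjuncts))
print(first_failed_vc(one_conjunct))
```

On these ranges the two-conjunct invariant passes all three checks, while $y \ge x$ alone fails consecution (for instance when n = -1 and m = -2), which is consistent with the single-conjunct template being too weak.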
Outline
–Program Verification
–Weakest Precondition Generation
–Termination/Bounds Analysis
–Most-general Counterexamples

Weakest Precondition
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    [m >= n]
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]

Weakest Precondition: Attempt 1
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    Pre
    while (c) S
    Post
Precondition encoding: $\exists Pre, I:\ VC(Pre)$, where $VC(Pre)$ is $\forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
“Weakest” precondition encoding: $\exists Pre, I:\ VC(Pre) \land \forall R:\ (R \text{ is weaker than } Pre \Rightarrow \lnot VC(R))$
Unfortunately, Farkas' lemma is no longer applicable, since the formula is non-linear in the universally quantified variables.

Weakest Precondition: Attempt 2
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    Pre
    while (c) S
    Post
Precondition encoding: $\exists Pre, I:\ VC(Pre)$, where $VC(Pre)$ is $\forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
Non-false precondition encoding: $\exists Pre, I:\ VC(Pre) \land (Pre \text{ is weaker than } false)$
    Pre
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]
$m \ge n + 127$ is a satisfying assignment for Pre. However, this is still not the “weakest” Pre.

Weakest Precondition: Attempt 3
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    Pre
    while (c) S
    Post
Precondition encoding: $\exists Pre, I:\ VC(Pre)$, where $VC(Pre)$ is $\forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
“Weaker than previous” precondition encoding: $\exists Pre, I:\ VC(Pre) \land (Pre \text{ is weaker than the previous } Pre)$
    Pre
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]
The sequence $m \ge n + 127$, $m \ge n + 126$, …, $m \ge n$ is admissible. This iterative refinement is too slow.

Weakest Precondition: Solution 1
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    Pre
    while (c) S
    Post
Precondition encoding: $\exists Pre, I:\ VC(Pre)$, where $VC(Pre)$ is $\forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
Locally-weakest precondition encoding: $\exists Pre, I:\ VC(Pre) \land (Pre \text{ is weaker than the previous } Pre) \land \lnot VC(R_1) \land \lnot VC(R_2) \land \dots$
where $R_1, R_2, \dots$ are the weaker neighbors of Pre.
    Pre
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]

Weaker Neighborhood Structure
Weaker neighbors of $e_1 \ge 0 \land e_2 \ge 0$ are:
–$e_1 + 1 \ge 0 \land e_2 \ge 0$
–$e_1 \ge 0 \land e_2 + 1 \ge 0$
–$e_1 + e_2 \ge 0 \land e_2 \ge 0$
–$e_1 \ge 0 \land e_2 + e_1 \ge 0$
Geometric interpretation
–Shift plane $e_i$ parallel to itself.
–Rotate plane $e_i$ along its intersection with another plane.

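The neighborhood structure above can be sketched as a small generator. This is an illustrative sketch, not the authors' implementation: the names `add` and `weaker_neighbors` and the dictionary encoding of each $e_i$ (variable name to coefficient, with the key "1" for the constant term) are assumptions. For more than two conjuncts, rotating each plane against every other plane is also an assumption, since the slide only lists the two-conjunct case.

```python
def add(e1, e2):
    """Return the coefficient-wise sum of two linear expressions."""
    out = dict(e1)
    for var, coeff in e2.items():
        out[var] = out.get(var, 0) + coeff
    return out

def weaker_neighbors(conjuncts):
    """List the weaker neighbors of a conjunction e_1 >= 0 and ... and e_k >= 0.

    Each neighbor replaces exactly one conjunct e_i by either
    e_i + 1 (shift) or e_i + e_j for some j != i (rotate).
    """
    neighbors = []
    for i, e in enumerate(conjuncts):
        shifted = add(e, {"1": 1})
        neighbors.append(conjuncts[:i] + [shifted] + conjuncts[i + 1:])
        for j, other in enumerate(conjuncts):
            if j != i:
                rotated = add(e, other)
                neighbors.append(conjuncts[:i] + [rotated] + conjuncts[i + 1:])
    return neighbors

e1 = {"x": 1}   # x >= 0
e2 = {"y": 1}   # y >= 0
for neighbor in weaker_neighbors([e1, e2]):
    print(neighbor)
```

For two conjuncts this yields exactly the four neighbors listed on the slide: two shifts and two rotations.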
Weakest Precondition: Solution 1
Find the “weakest” Pre such that the Hoare triple (Pre, Program, Post) is valid.
    Pre
    while (c) S
    Post
Precondition encoding: $\exists Pre, I:\ VC(Pre)$, where $VC(Pre)$ is $\forall X:\ (Pre \Rightarrow I) \land ((I \land c)[S] \Rightarrow I) \land (I \land \lnot c \Rightarrow Post)$
Locally-weakest precondition encoding: $\exists Pre, I:\ VC(Pre) \land (Pre \text{ is weaker than the previous } Pre) \land \lnot VC(R_1) \land \lnot VC(R_2) \land \dots$
    Pre
    x := 0; y := 0;
    while (x < 100)
        x := x+n; y := y+m;
    [y >= 100]
We directly obtain $m \ge n$.
–$m \ge n + c$ is no longer locally weakest for $c > 0$.
In fact, we obtain one more solution: $n \le 0$.
Generally, locally weakest $\ne$ weakest. Iteration may be required.

Termination/Bounds Analysis
The transformation introduces a counter to track loop iterations:
    while (c) S
becomes
    i := 0;
    while (c)
        i := i+1; Assert(i <= F(inputs)); S
Then we run the “Program Verification” algorithm.
–Besides the loop invariant, part F of the assertion is also unknown.
–Existence of a solution yields a termination proof.
–A solution for F gives an upper bound on the number of loop iterations.

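The counter instrumentation can be tried out on a concrete loop. The sketch below is an assumption-laden illustration rather than the tool's output: it reuses the x-part of the earlier example loop (x := 0; while (x < 100) x := x+n), the function name `instrumented_loop` is hypothetical, and the bound F(n) = 100 for n ≥ 1 is a hand-picked candidate that is tested at run time instead of being solved for.

```python
def instrumented_loop(n, F):
    """Run the loop with an iteration counter i and check i <= F(n) each time.

    Returns the number of iterations; raises AssertionError if the
    candidate bound F is violated.
    """
    x = 0
    i = 0                      # counter introduced by the transformation
    while x < 100:
        i += 1
        assert i <= F(n), f"bound violated at iteration {i}"
        x += n                 # original loop body S
    return i

bound = lambda n: 100          # candidate bound, assumed valid only for n >= 1
print(instrumented_loop(1, bound))
print(instrumented_loop(7, bound))
```

In the constraint-based setting F is an unknown template whose coefficients are solved for together with the loop invariant; here it is fixed in advance, so the run only tests one candidate.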
Most-general Counterexample
    x := 0;
    while (x < n)
        Assert(x < 200);
        x := x+y;
“n=356 and y=12” leads to a bug. However, it is more useful to say that “$y > 0 \land n \ge 200 + y$” leads to a bug.
We generate this by a 2-step transformation:
–Introduce an error variable err to track assertion violation.
–Introduce a counter i to track loop iterations/termination.
    Pre
    x := 0; err := 0; i := 0;
    while (x < n)
        i := i+1; Assert(i <= F(n,y));
        if (x >= 200) err := 1;
        x := x+y;
    Assert(err = 1)
Then we run the “weakest” precondition algorithm.

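The claimed counterexample condition can be tested against the original program by simulation. This is a bounded sanity check under stated assumptions, not the weakest-precondition algorithm: the function name `violates_assertion` and the iteration budget (needed because the loop does not terminate when y ≤ 0 and n > 0) are introduced here for illustration.

```python
def violates_assertion(n, y, max_iterations=10_000):
    """Simulate x := 0; while (x < n) { Assert(x < 200); x := x+y }.

    Returns True if the assertion fails, False if the loop exits normally
    or no violation is seen within the iteration budget.
    """
    x = 0
    for _ in range(max_iterations):
        if not x < n:
            return False       # loop exited without violating the assertion
        if not x < 200:
            return True        # Assert(x < 200) fails
        x += y
    return False               # gave up: budget exhausted

print(violates_assertion(356, 12))
print(all(violates_assertion(n, y)
          for y in range(1, 51)
          for n in range(200 + y, 300)))
```

Every sampled input with y > 0 and n ≥ 200 + y triggers the assertion failure, including the concrete case n=356, y=12. The check covers one direction only: it does not show that the condition is necessary.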
Experiments: Methodology
We ran our tool against academic/small benchmarks used by 10 earlier pieces of work that address a wide variety of program analyses related to linear arithmetic properties.
–Inter-procedural analysis, disjunctive invariant inference, termination/bounds analysis, non-termination, strongest postcondition generation.
Parameters:
–Templates: 2 conjuncts/1 disjunct
–Bit-vector modeling: 3 bits for unknown coefficients, 6 bits for unknown constants, 1 bit for multipliers $\lambda$ in Farkas' lemma
–Iteratively increased these quantities.

Experiments: Results
Clauses generated: 5K to 560K (most examples < 200K)
Time taken: 0.1 sec to 80 sec (most examples < 5 sec)
–Running times are comparable to those of the earlier techniques; however, the single tool demonstrates versatility across all of these analyses.

Related Work: Constraint-based techniques
Invariant discovery
–Linear [Colon, Sankaranarayanan, Sipma, CAV '03]
–Non-linear [Sankaranarayanan et al., POPL '04]; [Kapur, '05]
–Linear arithmetic + uninterpreted functions [Beyer et al., VMCAI '07]
–This work addresses only linear invariants, but with arbitrary (pre-specified) boolean structure, in an inter-procedural setting, and in a weakest precondition generation mode too.
Bug finding
–SATURN [Xie, Aiken, CAV '05]: works with loop-free programs
–This work addresses only programs with linear assignments, but finds most-general bugs in programs with loops.

Conclusion
Constraint-based techniques offer 2 advantages over techniques based on fixed-point computation:
–Goal-directed (may buy efficiency)
–Do not require widening (may buy precision)
This work showed how to reduce a wide variety of program analysis problems to constraint solving over the domain of linear arithmetic.
Future work includes extending these ideas to other domains such as pointers and quantifiers.