Symbolic Implementation of the Best Transformer
Thomas Reps, Mooly Sagiv, Greta Yorsh
University of Wisconsin; Tel Aviv University
The work I present here was done together with Thomas Reps from the University of Wisconsin, Madison, and Mooly Sagiv. We propose an algorithm for computing the best abstract transformer using a decision procedure and a symbolic representation of the abstract domain.
VMCAI'04 November 18
Motivation
- New approach to using symbolic techniques in abstract interpretation
  - for shape analysis
  - for other abstract domains
- What does it mean to harness a decision procedure for use in static analysis?
  - requirements
  - what does it buy us?
The motivation of this work is to do symbolic abstract interpretation, in particular for shape analysis, but the algorithm we suggest can be applied to other abstract domains. By the term symbolic technique we mean that abstract values are represented symbolically using formulas, and a decision procedure is used to check satisfiability of these formulas. The first question is: what are the requirements, the conditions that allow us to use a decision procedure? I'll define the requirements for our algorithm. We believe that these requirements are natural. The second question is: what does it buy us? It gives us an algorithm for the best transformer. But it gives us more, as you will see. I'll say a few words about the practical aspects at the end.
Goals
- Automatic verification of software
  - no loop invariants
  - abstraction
- Sound but incomplete methods
  - mathematically justified by abstract interpretation (CC79)
- Best possible precision for the given abstraction
  - no loss of information beyond abstraction
The purpose of our work is to allow automatic verification of programs without the need to write loop invariants, which is a non-trivial task in many cases. We use abstraction because the number of concrete stores is infinite. It is important for us to do a sound analysis: if the program has a bug (of the kind we analyze), our analysis will discover it. The abstract interpretation framework in which we operate provides the soundness. But the analysis may be imprecise, in the sense that we may warn about an error even if the program is correct, because abstract interpretation may consider infeasible paths. This is the cost of abstraction, which loses information. But we don't want to lose more information beyond the abstraction when we compute the effect of a statement. We want to produce the most precise result of a statement with respect to the given abstraction. This is different from being complete. It becomes more complicated when the abstractions are parametric, as happens, for example, in shape analysis. In this presentation I'll show you an algorithm that achieves the most precise result.
Plan
- The challenge
  - computing the best transformer
- Our solution
  - the idea behind our algorithm
  - example
- Conclusions
First, I'll introduce the assumptions under which the algorithm works. This also gives me an opportunity to present our notation and give a short reminder of abstract interpretation. Then I'll use this notation to define more precisely the problem we are solving, and present our solution.
Implementing Best Transformer
- For predicate-abstraction domains, an implementation of the best transformer is known
  - uses a decision procedure
- Our work: implement best transformers for non-predicate-abstraction domains
  - also uses a decision procedure
Do I need to convince you that the best transformer is useful? First of all, once we have determined the abstraction, it loses the information that we are ready to lose and it keeps the important information. If we have a best transformer, we are as precise as possible for a given abstraction. If we have a best transformer for a statement, we can compute a best abstract transformer for a sequence of statements (straight-line code).
The Best Transformer – abstract operation
[Figure: Concrete and Abstract domains. The abstract value a represents a set of concrete stores, T is applied to every store in that set, and the resulting set is abstracted to a1 = T#(a).]
... then we have a concrete transformer T that takes a concrete store and returns another concrete store. We apply T to all stores in this set and get another set of stores. Then we apply the abstraction function α to this set of stores and get another abstract value. This defines the best abstract transformer T#. The problem is that we cannot compute T# by going to the concrete domain, because this set of stores may be infinite. I hope that I don't have to convince you that we want to compute the best transformer...
The Best Transformer – abstract operation
Why is it the "best"?
- T(γ(a)) ⊆ γ(a1)
- ∀ a2 : T(γ(a)) ⊆ γ(a2) ⇒ γ(a1) ⊆ γ(a2)
[Figure: Concrete and Abstract domains, with a1 = T#(a) and a1 ⊑ a2.]
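The definition T#(a) = α(T(γ(a))) can be executed literally only when γ(a) is finite. Here is a minimal Python sketch, assuming a bounded store universe (integers in [-2, 2]) as a stand-in for the infinite set of stores. The names TOP, BOTTOM, gamma, alpha, transformer and best_bruteforce are illustrative and are not taken from the paper.

```python
from itertools import product

TOP = "TOP"      # abstract value: not known to be a constant
BOTTOM = None    # least abstract element: represents no store
VARS = ("x", "y", "z")
UNIVERSE = range(-2, 3)   # ASSUMPTION: bounded stand-in for the integers


def join(a, b):
    """Least upper bound of two abstract values."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma(a):
    """Enumerate the (bounded) stores represented by abstract value a."""
    if a is BOTTOM:
        return []
    choices = [UNIVERSE if a[v] == TOP else [a[v]] for v in VARS]
    return [dict(zip(VARS, vals)) for vals in product(*choices)]


def alpha(stores):
    """Join of beta(store) over all stores; beta is the identity here."""
    ans = BOTTOM
    for s in stores:
        ans = join(ans, dict(s))
    return ans


def transformer(store):
    """Concrete transformer T for the statement x := y * z."""
    return {**store, "x": store["y"] * store["z"]}


def best_bruteforce(a):
    """T#(a) = alpha(T(gamma(a))), computed by explicit enumeration."""
    return alpha(transformer(s) for s in gamma(a))


result = best_bruteforce({"x": TOP, "y": TOP, "z": 0})
print(result)
```

The enumeration in gamma is exactly the step that is unavailable when the set of stores is infinite, which is why a symbolic method is needed.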
The Best Transformer - abstract operation
[Figure: Concrete, Formulas and Abstract value spaces. In the Concrete space, T maps γ(a) into γ(a1); in the Abstract space, T# maps a to a1.]
The Best Transformer – symbolic operation
- Statement: T = [x := y * z]
- Transformer formula: [x' = y*z ∧ y' = y ∧ z' = z]
- Combined formula: γ̂(a) ∧ (x' = y*z ∧ y' = y ∧ z' = z)
[Figure: Concrete, Formulas and Abstract value spaces. In the Formulas space, γ̂(a) is combined with the transformer formula; Best(T, a) yields a1, whose symbolic concretization is γ̂(a1).]
We combine T and α in one step. The idea of the procedure Best is the same as the idea behind computing α̂. Therefore, I'll show you α̂; the procedure for Best is described in the paper. Note that in predicate abstraction the best transformer is computed using the abstract domain and the formulas. Our algorithm uses all three domains, as you will see from the example: in the abstract domain it uses the join to advance upwards, and in the concrete domain it uses a representative concrete element from the set not yet covered.
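The combination of T and α in one step can be sketched as follows. This is a hedged illustration of the idea stated in the notes (the same loop as α̂, applied to γ̂(a) conjoined with the transformer formula), not the procedure from the paper. In particular, find_model is a hypothetical brute-force search over small integers standing in for a real decision procedure, and all function names are assumptions.

```python
from itertools import product

TOP = "TOP"      # abstract value: not known to be a constant
BOTTOM = None    # least abstract element
VARS = ("x", "y", "z")


def join(a, b):
    """Least upper bound of two abstract values."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma_hat(a, store):
    """Evaluate the formula gamma_hat(a) on a store."""
    if a is BOTTOM:
        return False
    return all(a[v] == TOP or store[v] == a[v] for v in VARS)


def tau(pre, post):
    """Transformer formula for x := y * z: x' = y*z, y' = y, z' = z."""
    return (post["x"] == pre["y"] * pre["z"]
            and post["y"] == pre["y"]
            and post["z"] == pre["z"])


def find_model(phi, bound=2):
    """ASSUMPTION: brute-force stand-in for a decision procedure.

    Searches pre-stores in [-bound, bound] and post-stores in
    [-bound**2, bound**2]; returns a satisfying pair or None.
    """
    pre_range = range(-bound, bound + 1)
    post_range = range(-bound * bound, bound * bound + 1)
    for pre_vals in product(pre_range, repeat=len(VARS)):
        pre = dict(zip(VARS, pre_vals))
        for post_vals in product(post_range, repeat=len(VARS)):
            post = dict(zip(VARS, post_vals))
            if phi(pre, post):
                return pre, post
    return None


def best(a, transformer_formula):
    """Climb from BOTTOM, joining in post-stores not yet covered by ans."""
    ans = BOTTOM
    while True:
        covered = ans
        # gamma_hat(a) on the pre-store, tau, and "post-store not covered".
        # Rebuilding the formula from ans each round is equivalent to
        # accumulating negations, because ans only grows.
        model = find_model(
            lambda pre, post: gamma_hat(a, pre)
            and transformer_formula(pre, post)
            and not gamma_hat(covered, post)
        )
        if model is None:
            return ans
        _, post = model
        ans = join(ans, dict(post))


result = best({"x": TOP, "y": TOP, "z": 0}, tau)
print(result)
```

The bounded search only behaves like a decision procedure when every relevant constant lies inside the search range; a real implementation would call a solver that returns a satisfying structure.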
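The idea behind α̂(ψ)
[Figure: Concrete, Formulas and Abstract value spaces, relating the formula ψ, the stores that satisfy it (⊧), and the abstract value α̂(ψ).]
Predicate abstraction computes the best transformer using the abstract domain and the formulas. Our algorithm uses all three domains: in the abstract domain it uses the join to advance upwards, and in the concrete domain it uses a representative concrete element from the set not yet covered.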
The idea behind α̂(ψ)
- successive approximations
[Figure: Concrete, Formulas and Abstract value spaces. In the Abstract space, ans starts at ⊥ and moves upward through successive approximations.]
Remainder of the Talk
- Requirements
- Example
- Conclusions
Requirements
- Lattice L = (L, ⊑, ⊔, ⊓, ⊥, ⊤): abstract domain of finite height
  - a1 ⊑ a2 means a1 "represents fewer states than" a2
- β : Store → L, the best abstract value that represents a store
- α : 2^Store → L, the best abstract value for a set of stores
  - α(C) = ⊔{β(store) | store ∈ C}
- γ : L → 2^Store, the set of stores represented by an abstract value
  - γ(a) = {store | β(store) ⊑ a}
- Galois connection (α, γ)
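These requirements can be made concrete for the constant-propagation domain used later in the talk. The sketch below is an illustration only: the dictionary encoding of abstract values and the names TOP, BOTTOM, leq, join, beta, alpha and in_gamma are assumptions made for this example.

```python
TOP = "TOP"      # abstract value: not known to be a constant
BOTTOM = None    # least element of the lattice: represents no store
VARS = ("x", "y", "z")


def leq(a, b):
    """The order a ⊑ b: a represents no more stores than b."""
    if a is BOTTOM:
        return True
    if b is BOTTOM:
        return False
    return all(b[v] == TOP or a[v] == b[v] for v in VARS)


def join(a, b):
    """Least upper bound a ⊔ b."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def beta(store):
    """Best abstract value for one store (the identity for this domain)."""
    return dict(store)


def alpha(stores):
    """alpha(C) = join of beta(store) over all stores in the finite set C."""
    ans = BOTTOM
    for store in stores:
        ans = join(ans, beta(store))
    return ans


def in_gamma(store, a):
    """Membership test for gamma(a) = {store | beta(store) ⊑ a}."""
    return leq(beta(store), a)


stores = [
    {"x": 0, "y": 1, "z": 0},
    {"x": 0, "y": 0, "z": 0},
    {"x": 0, "y": 2, "z": 0},
]
abstract = alpha(stores)
print(abstract)
```

Here gamma is only exposed as a membership test, because the set it denotes is infinite whenever some variable maps to TOP.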
A Simple Example - Constant Propagation
    #include <stdio.h>

    int main(void) {
        int x, y, z;
        x = 3;
        if (getchar() > 116) {
            y = x;          /* y = 3, because x is the constant 3 */
        } else {
            z = 2;
            y = z + 1;      /* y = 2 + 1 = 3 */
        }
        printf("%d\n", y);
        return 0;
    }
[Figure: control flow graph. Start, then x = 3, then the if-condition; one branch executes y = x, the other executes z = 2 followed by y = z+1; both branches meet at printf(y). Annotations: x = 3 on both branches and z = 2 on the second; at first glance y = ? at printf(y); on closer inspection y = 3 on the first branch and y = 2 + 1 = 3 on the second, so y = 3.]
The running example of the paper and of this presentation is not shape analysis; it is constant propagation. It was chosen for methodological reasons. The program is on the left and its control flow graph is on the right. Which branch of the if-condition is taken depends on the input. We want to find the value of y at the end of the program. At first glance, the value of y is not constant, because it depends on the branch that is taken: here the value of y is the same as x, and here the value of y equals z + 1. However, on closer inspection we find that x has the constant value 3 in this branch, therefore y equals 3. And in this branch y equals 3 too, because z is the constant 2 and y is z + 1.
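The reasoning in the notes can be replayed as a small abstract interpretation of the example's control flow graph. This is a hand-written sketch for this one program, with hypothetical helper names (assign_const, assign_var, assign_plus), and it assumes that variables are unknown (TOP) before their first assignment.

```python
TOP = "TOP"   # abstract value: not known to be a constant


def join_env(e1, e2):
    """Pointwise join of two abstract environments at a control-flow merge."""
    return {v: e1[v] if e1[v] == e2[v] else TOP for v in e1}


def assign_const(env, var, c):
    """Abstract effect of `var = c`."""
    return {**env, var: c}


def assign_var(env, var, src):
    """Abstract effect of `var = src`."""
    return {**env, var: env[src]}


def assign_plus(env, var, src, c):
    """Abstract effect of `var = src + c`."""
    val = env[src]
    return {**env, var: TOP if val == TOP else val + c}


# ASSUMPTION: uninitialized variables are treated as unknown (TOP).
start = {"x": TOP, "y": TOP, "z": TOP}
after_x = assign_const(start, "x", 3)                  # x = 3
then_branch = assign_var(after_x, "y", "x")            # y = x
else_branch = assign_plus(                             # z = 2; y = z + 1
    assign_const(after_x, "z", 2), "y", "z", 1
)
at_printf = join_env(then_branch, else_branch)         # merge before printf
print(at_printf)
```

At the merge point y is the constant 3 on both branches, so the join keeps it, while z becomes TOP because it is assigned on only one branch.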
Abstract Domain for Constant Propagation
- Z^⊤: the integers . . . -2 -1 0 1 2 . . . with ⊤ above them
- L = (Var → Z^⊤)
- Infinite cardinality, but finite height
[Figure: part of the lattice L, from top to bottom: [x↦⊤, y↦⊤, z↦⊤]; [x↦⊤, y↦⊤, z↦0]; [x↦0, y↦⊤, z↦0]; then the elements [x↦0, y↦43, z↦0], [x↦0, y↦2, z↦0], [x↦0, y↦0, z↦0] and [x↦0, y↦1, z↦0] side by side.]
Top means that the value is not known to be a constant.
Function β_cp
[Figure: Concrete and Abstract domains, showing the elements [x↦0, y↦1, z↦0], [x↦0, y↦0, z↦0] and [x↦0, y↦2, z↦0].]
Function γ_cp
[Figure: Concrete and Abstract domains, showing the elements [x↦0, y↦0, z↦0], [x↦0, y↦1, z↦0] and [x↦0, y↦0, z↦0].]
Abstraction Function α_cp
[Figure: Concrete and Abstract domains. The stores [x↦0, y↦1, z↦0], [x↦0, y↦0, z↦0] and [x↦0, y↦2, z↦0] are abstracted together to [x↦0, y↦⊤, z↦0].]
So, I've just shown you how to define all of these for constant propagation. But there are two more requirements that I have to explain.
Three Value-Spaces
- Concrete, Formulas, Abstract
- S ⊧ γ̂(a) ⇔ S ∈ γ(a)
[Figure: the store [x↦0, y↦1, z↦0] in the Concrete space satisfies the formula (x=0) ∧ (z=0) in the Formulas space.]
Required Primitive Operations
- Abstraction
  - α(S) = ⊔{β(store) | store ∈ S}
  - β([x↦0, y↦2, z↦0]) = [x↦0, y↦2, z↦0]
- Symbolic concretization
  - S ⊧ γ̂(a) ⇔ S ∈ γ(a)
  - γ̂([x↦0, y↦⊤, z↦0]) = (x = 0) ∧ (z = 0)
- Decision procedure returning a satisfying structure S
  - [x↦0, y↦2, z↦0] ⊧ (z = 0) ∧ (x = y * z)
We have seen three requirements by now. (1) The definition of the abstraction and of the abstraction function α. In constant propagation these are easy; β is the identity. (2) The existence of a symbolic representation of an abstract value as a formula in some logic, such that the set of concrete stores that satisfy the formula is the set of concrete stores represented by the corresponding abstract value. In the case of constant propagation it is a quantifier-free formula. (3) The third requirement, not mentioned before, is that the logic used for the symbolic representation γ̂ is decidable and that the decision procedure returns a concrete satisfying assignment. This is a reasonable requirement if the logic is decidable. (If it is not, our approach allows using a theorem prover, provided that the theorem prover terminates and a concrete counter-example can be retrieved. If it does not terminate, this approach cannot be applied with a timeout, as you will see.) These are the only requirements of our approach. Before showing you how the algorithm works, I will define what it does in a declarative way.
Procedure α̂(ψ): Example
- ψ = (z = 0) ∧ (x = y * z)
- S = [x↦0, y↦43, z↦0], a store that satisfies ψ
- ans = [x↦0, y↦43, z↦0]
[Figure: Concrete, Formulas and Abstract value spaces.]
Procedure α̂(ψ): Example
- S = [x↦0, y↦43, z↦0], ans = [x↦0, y↦43, z↦0]
- γ̂(ans) = (x = 0) ∧ (y = 43) ∧ (z = 0)
- φ2 := ψ ∧ ¬γ̂(ans) = (z = 0) ∧ (x = y*z) ∧ ¬((x=0) ∧ (y=43) ∧ (z=0)) = (z = 0) ∧ (x = y*z) ∧ (y ≠ 43)
[Figure: Concrete, Formulas and Abstract value spaces.]
Procedure α̂(ψ): Example
- φ2 = (z = 0) ∧ (x = y * z) ∧ (y ≠ 43)
- S = [x↦0, y↦24, z↦0], a store that satisfies φ2
- ans := [x↦0, y↦43, z↦0] ⊔ [x↦0, y↦24, z↦0]
[Figure: Concrete, Formulas and Abstract value spaces.]
Procedure α̂(ψ): Example
- ans = [x↦0, y↦⊤, z↦0]
- γ̂(ans) = (x = 0) ∧ (z = 0)
- φ3 := φ2 ∧ ¬γ̂(ans) = (z=0) ∧ (x=y*z) ∧ (y≠43) ∧ ¬((x=0) ∧ (z=0)) = false
[Figure: Concrete, Formulas and Abstract value spaces.]
Procedure α̂(ψ): Example
- φ3 := φ2 ∧ ¬γ̂(ans) = (z=0) ∧ (x=y*z) ∧ (y≠43) ∧ ¬((x=0) ∧ (z=0)) = false
- φ3 is unsatisfiable, so the procedure stops
- α̂((z = 0) ∧ (x = y*z)) = ans = [x↦0, y↦⊤, z↦0]
[Figure: Concrete, Formulas and Abstract value spaces.]
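The steps of this example follow one loop: start from ⊥, ask for a store that satisfies the current formula, join its abstraction into ans, and conjoin the negation of γ̂(ans). A minimal Python sketch of that loop for constant propagation is given below. It assumes formulas are encoded as Python predicates on stores and uses a hypothetical brute-force find_model over small integers in place of a real decision procedure, so the stores it picks differ from the ones on the slides (y = 43, y = 24).

```python
from itertools import product

TOP = "TOP"      # abstract value: not known to be a constant
BOTTOM = None    # least abstract element
VARS = ("x", "y", "z")


def join(a, b):
    """Least upper bound of two abstract values."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def beta(store):
    """Best abstract value for one store (the identity for this domain)."""
    return dict(store)


def gamma_hat(a):
    """Symbolic concretization: a predicate satisfied exactly by gamma(a)."""
    if a is BOTTOM:
        return lambda store: False
    return lambda store: all(a[v] == TOP or store[v] == a[v] for v in VARS)


def find_model(phi, bound=3):
    """ASSUMPTION: brute-force stand-in for a decision procedure.

    Returns a store over [-bound, bound] that satisfies phi, or None.
    """
    for vals in product(range(-bound, bound + 1), repeat=len(VARS)):
        store = dict(zip(VARS, vals))
        if phi(store):
            return store
    return None


def and_not(phi, covered):
    """Build the formula phi ∧ ¬covered."""
    return lambda store: phi(store) and not covered(store)


def alpha_hat(psi):
    """Successive approximation from BOTTOM, as in the slide example."""
    ans = BOTTOM
    phi = psi
    while True:
        store = find_model(phi)
        if store is None:          # phi is unsatisfiable: ans covers psi
            return ans
        ans = join(ans, beta(store))
        phi = and_not(phi, gamma_hat(ans))


def psi(store):
    """The example formula (z = 0) ∧ (x = y * z)."""
    return store["z"] == 0 and store["x"] == store["y"] * store["z"]


result = alpha_hat(psi)
print(result)
```

Each iteration strictly raises ans in the lattice, which is why a finite-height domain bounds the number of calls to the decision procedure.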
Conclusions
What does it mean to harness a decision procedure for use in static analysis?
Requirements
- Finite-height abstract domain
- β(S) and join
- Symbolic concretization γ̂
- Decision procedure that returns a satisfying structure
Now we can answer the question: it means that the user only provides the abstraction, and we can compute α̂, Best, meet, and do assume-guarantee (a/g) reasoning.
Conclusions
What does it mean to harness a decision procedure for use in static analysis?
What does it buy us?
- α̂(ψ): the best abstract value that represents ψ
- Best(T, a): the best abstract transformer
  - parametric abstractions
- meet(a1, a2) = α̂(γ̂(a1) ∧ γ̂(a2))
- assume-guarantee reasoning
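The meet identity, meet(a1, a2) = α̂(γ̂(a1) ∧ γ̂(a2)), can be tried out on the constant-propagation domain. The sketch below is self-contained and therefore repeats a small α̂ loop. As before, formulas are encoded as Python predicates, and find_model is a hypothetical bounded search standing in for a decision procedure, so it is only meaningful when the constants involved lie within the search bound.

```python
from itertools import product

TOP = "TOP"      # abstract value: not known to be a constant
BOTTOM = None    # least abstract element
VARS = ("x", "y", "z")


def join(a, b):
    """Least upper bound of two abstract values."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma_hat(a):
    """Symbolic concretization: a predicate satisfied exactly by gamma(a)."""
    if a is BOTTOM:
        return lambda store: False
    return lambda store: all(a[v] == TOP or store[v] == a[v] for v in VARS)


def find_model(phi, bound=3):
    """ASSUMPTION: brute-force stand-in for a decision procedure."""
    for vals in product(range(-bound, bound + 1), repeat=len(VARS)):
        store = dict(zip(VARS, vals))
        if phi(store):
            return store
    return None


def alpha_hat(psi):
    """Best abstract value for the stores that satisfy psi."""
    ans = BOTTOM
    while True:
        covered = gamma_hat(ans)
        # psi ∧ ¬gamma_hat(ans); equivalent to accumulating negations,
        # because ans only grows.
        store = find_model(lambda s: psi(s) and not covered(s))
        if store is None:
            return ans
        ans = join(ans, dict(store))


def meet(a1, a2):
    """meet(a1, a2) = alpha_hat(gamma_hat(a1) ∧ gamma_hat(a2))."""
    g1, g2 = gamma_hat(a1), gamma_hat(a2)
    return alpha_hat(lambda s: g1(s) and g2(s))


m = meet({"x": 0, "y": TOP, "z": TOP}, {"x": TOP, "y": TOP, "z": 0})
print(m)
```

No domain-specific meet operator is written here; it falls out of γ̂, the join and the satisfiability search, which is the point made on the slide.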
Practical Considerations
- Number of calls to the decision procedure
  - linear in the height of the domain
- Termination
  - guaranteed for a decidable logic for expressing γ̂
  - use a timeout for a more expressive logic
- Concrete satisfying structure
We cannot return the intermediate value that we get during the algorithm, because it works from below. Therefore, the value is not sound before the end of the algorithm.
Different Algorithm - TACAS'04
Computes α̂(ψ) for the abstract domain of 3-valued structures
- Pros
  - goes down from ⊤
  - no counter examples
  - formulas do not grow
- Cons
  - works only for the finite domain of 3-valued structures
  - more complicated
[Figure: ans moves downward from ⊤ towards ⊥.]
Further Work
- Use symbolic counter examples
- Infinite-height domains
- Can we operate on formulas directly?
- Lower bounds on the problem of computing the best transformer
The current algorithm is conceptually simpler than the TACAS'04 one. The TACAS algorithm works for an abstract domain in which an abstract value is a three-valued logical structure, the domain used for shape analysis. The number of calls to the theorem prover is the same in the worst case. An even more ambitious goal is to try to operate directly on formulas.
The END
www.cs.tau.ac.il/~gretay