Abstraction of Source Code

(from Bandera lectures and talks)

Abstraction: the key to scaling up
[Figure: the original system, with property P, is mapped by abstraction to an abstract system with property P'; each symbolic state of the abstract system represents a set of states of the original system.]
Most people recognize that abstraction is the key to scaling finite-state verification technology to reasoning about realistic programs. Abstraction creates a new system with fewer states, where each state in the abstract system is like a "symbolic" state that actually represents multiple states. A symbolic state may even represent an infinite number of states.
When building abstractions, we want them to be safe. Safety allows us to infer that if a property is true for the abstract system, then the property is true for the original system. However, because of over-approximation, the ability to disprove properties is in general sacrificed.
Safety: the set of behaviors of the abstract system over-approximates the set of behaviors of the original system.

Data Abstraction
Abstraction proceeds component-wise, where variables are the components.
x:int is abstracted to Even = {…, -2, 0, 2, 4, …} and Odd = {…, -3, -1, 1, 3, …}.
y:int is abstracted to Neg = {…, -3, -2, -1}, Zero = {0}, and Pos = {1, 2, 3, …}.
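The two component-wise abstractions on this slide can be sketched as plain Java abstraction functions. This is only an illustration: the class, enum, and method names are hypothetical and are not part of Bandera.

```java
// Illustrative sketch; all names here are hypothetical, not Bandera's API.
final class DataAbstractions {
    enum Parity { EVEN, ODD }
    enum Sign { NEG, ZERO, POS }

    // Abstraction function for x: maps an int to its parity token.
    static Parity parity(int n) {
        return (n % 2 == 0) ? Parity.EVEN : Parity.ODD;
    }

    // Abstraction function for y: maps an int to its sign token.
    static Sign sign(int n) {
        if (n < 0) return Sign.NEG;
        if (n == 0) return Sign.ZERO;
        return Sign.POS;
    }
}
```

Each variable gets its own abstraction, so any relationship between x and y is lost once both are abstracted.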

Data Type Abstraction
Collapses data domains via abstract interpretation.
Data domains: int is mapped onto Signs = {NEG, ZERO, POS}, where (n<0) : NEG, (n==0) : ZERO, (n>0) : POS.
Code before abstraction:
    int x = 0;
    if (x == 0)
      x = x + 1;
Code after abstraction:
    Signs x = ZERO;
    if (Signs.eq(x,ZERO))
      x = Signs.add(x,POS);
The technique can be illustrated on this fragment of code. Assume we would like to keep only enough information about x to allow the conditional test to be decided, so we only need to know whether x is zero or not. For this we can use the Signs abstraction, which maps the integers onto a set of three symbolic values: NEG, ZERO, POS. We transform the code so that it operates on the abstract domain: the concrete type int is replaced by the abstract type Signs, the concrete constants 0 and 1 are replaced with the abstract constants ZERO and POS, and primitive operations on ints are replaced with calls to methods that implement the abstract operations on abstract values. For example, the equality operator is replaced with a call to the method Signs.eq, and + is replaced by Signs.add.
So, how do we apply this abstraction technique to the DEOS example? We have to decide which variables to abstract and which abstractions to use, and then we have to transform the system to encode the abstractions.

Hypothesis
Abstraction of data domains is necessary.
Automated support for
- defining abstract domains (and operators),
- selecting abstractions for program components,
- generating abstract program models, and
- interpreting abstract counter-examples
will make it possible to
- scale property verification to realistic systems, and
- ensure the safety of the verification process.
The DEOS experience makes it clear that abstraction is necessary for model checking of realistic Java programs. Furthermore, this experience also illustrates the need for several different forms of tool support to achieve abstraction of large programs.
Now that you have seen the basic steps in abstracting a program in Bandera, I'll give an overview of the abstraction components, starting with the user's interface.

Definition of Abstractions in BASL
    abstraction Signs abstracts int
    begin
      TOKENS = { NEG, ZERO, POS };
      abstract(n)
        n < 0  -> {NEG};
        n == 0 -> {ZERO};
        n > 0  -> {POS};
    end

    operator + add
    begin
      (NEG , NEG)  -> {NEG} ;
      (NEG , ZERO) -> {NEG} ;
      (ZERO, NEG)  -> {NEG} ;
      (ZERO, ZERO) -> {ZERO} ;
      (ZERO, POS)  -> {POS} ;
      (POS , ZERO) -> {POS} ;
      (POS , POS)  -> {POS} ;
      (_,_)        -> {NEG,ZERO,POS}; /* case (POS,NEG),(NEG,POS) */
    end

Automatic generation example. Start safe, then refine: +(NEG,NEG) = {NEG,ZERO,POS}
Proof obligations submitted to PVS:
    forall n1,n2: neg?(n1) and neg?(n2) implies not pos?(n1+n2)
    forall n1,n2: neg?(n1) and neg?(n2) implies not zero?(n1+n2)
    forall n1,n2: neg?(n1) and neg?(n2) implies not neg?(n1+n2)
In BASL, we can define abstract interpretations (AIs). An AI is a collection of three components: a domain of abstract values, an abstraction function, and a collection of abstract operations, one for each primitive operation in the program.
- For example, this is the BASL representation of the Signs AI, where the abstract domain is the powerset of the set of tokens.
- The abstract operations are automatically generated using the PVS theorem prover; for example, this is the abstract version of +.
- Note that abstract operations can return more than one token. This is because abstraction loses information about values. As you will see on the next slide, this imprecision is modeled as a non-deterministic choice over the values in the set.
- So how do we use the theorem prover? For example, this is how we generate the definition of the abstract operation + applied to two NEG values. We start with the set {NEG, ZERO, POS}, which is safe because it covers all the possible results of adding two negative numbers.
- We try to eliminate from the set the tokens that we can prove are not possible. We submit to PVS three implications, one for each token in the definition. The predicates neg?, zero? and pos? are defined similarly to one another.
- So we ask PVS: is it true that by adding two negative numbers we cannot get a positive number? PVS answers yes, so we eliminate POS from the definition. The same technique is used for all the cases.
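The "start safe, then refine" step can be sketched in Java. Note the substitution: the slides discharge each proof obligation with the PVS theorem prover, whereas this sketch only searches a bounded range of integers for a witness, so it illustrates the elimination loop but proves nothing. The class name, method names, and the `bound` parameter are hypothetical.

```java
import java.util.EnumSet;

// Illustrative sketch: bounded testing stands in for the PVS proof obligations.
final class AddTableSketch {
    enum Sign { NEG, ZERO, POS }

    static Sign abs(int n) {
        return n < 0 ? Sign.NEG : (n == 0 ? Sign.ZERO : Sign.POS);
    }

    // Start from the safe set {NEG, ZERO, POS} and drop every token t for which
    // no sampled pair (n1, n2) with signs (a, b) gives abs(n1 + n2) == t.
    static EnumSet<Sign> refine(Sign a, Sign b, int bound) {
        EnumSet<Sign> result = EnumSet.allOf(Sign.class);
        for (Sign t : Sign.values()) {
            if (!witnessExists(a, b, t, bound)) {
                result.remove(t);
            }
        }
        return result;
    }

    // Searches the sampled range for operands whose sum abstracts to token t.
    private static boolean witnessExists(Sign a, Sign b, Sign t, int bound) {
        for (int n1 = -bound; n1 <= bound; n1++) {
            for (int n2 = -bound; n2 <= bound; n2++) {
                if (abs(n1) == a && abs(n2) == b && abs(n1 + n2) == t) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

For the (NEG, NEG) case the sketch removes ZERO and POS, which agrees with the first row of the operator table; for (POS, NEG) all three tokens survive.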

Compiling BASL Definitions
BASL specifications are compiled into Java classes. The BASL definition of Signs (the abstraction function and the + operator, as on the previous slide) compiles to:

    public class Signs {
      public static final int NEG  = 0; // mask 1
      public static final int ZERO = 1; // mask 2
      public static final int POS  = 2; // mask 4

      public static int abs(int n) {
        if (n < 0) return NEG;
        if (n == 0) return ZERO;
        return POS; // n > 0
      }

      public static int add(int arg1, int arg2) {
        if (arg1==NEG && arg2==NEG) return NEG;
        if (arg1==NEG && arg2==ZERO) return NEG;
        if (arg1==ZERO && arg2==NEG) return NEG;
        if (arg1==ZERO && arg2==ZERO) return ZERO;
        if (arg1==ZERO && arg2==POS) return POS;
        if (arg1==POS && arg2==ZERO) return POS;
        if (arg1==POS && arg2==POS) return POS;
        return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */
      }
    }

For example, this is the Java class for the Signs abstraction. Abstract tokens are implemented as integer values. The abstraction function is straightforward. The method add implements the abstract version of +. We use a special method, Bandera.choose, to model the imprecision introduced by the abstraction. For example, Bandera.choose(7) will be interpreted by subsequent verification tools (i.e. JPF) as a non-deterministic choice between the tokens encoded in the bit-vector 7, that is, between NEG, ZERO and POS.
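To make the bit-vector encoding concrete, here is a minimal sketch of how a mask such as 7 can be decoded into token values, assuming (as the `// mask` comments suggest) that token t corresponds to bit 1 << t. The class and method names are hypothetical; this is not how Bandera or JPF implement choose.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the mask encoding; not Bandera's implementation.
final class ChooseMask {
    static final int NEG = 0;   // mask 1
    static final int ZERO = 1;  // mask 2
    static final int POS = 2;   // mask 4

    // Token t belongs to the set iff bit (1 << t) of the mask is set.
    static List<Integer> tokensIn(int mask) {
        List<Integer> tokens = new ArrayList<>();
        for (int t = NEG; t <= POS; t++) {
            if ((mask & (1 << t)) != 0) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

Under this assumption, mask 7 decodes to all three tokens, and mask 5 would decode to NEG and POS only.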

Interpreting Results
For an abstracted program, a counter-example may be infeasible because of the over-approximation introduced by abstraction.
Example:
    x = -2; if(x + 2 == 0) then ...
is abstracted to
    x = NEG; if(Signs.eq(Signs.add(x,POS),ZERO)) then ...
where Signs.add(x,POS) evaluates to {NEG,ZERO,POS}.
The abstract program has more behaviors than the original program, so a counter-example may be reported for a behavior in the abstract system that is not present in the original system.
We developed a technique that searches for feasible counter-examples during verification of the abstract system.
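The extra behavior in this example can be reproduced with a set-valued version of the abstract +. This is a sketch under assumptions: the names are hypothetical, and the abstract operator returns the whole set of possible tokens rather than calling Bandera.choose.

```java
import java.util.EnumSet;

// Illustrative sketch; set-valued abstract + following the BASL operator table.
final class SpuriousBranch {
    enum Sign { NEG, ZERO, POS }

    // Mixed POS/NEG operands may yield any sign; all other cases are exact.
    static EnumSet<Sign> add(Sign a, Sign b) {
        if (a == Sign.ZERO) return EnumSet.of(b);
        if (b == Sign.ZERO) return EnumSet.of(a);
        if (a == b) return EnumSet.of(a);
        return EnumSet.allOf(Sign.class);
    }

    // The abstract test "v == ZERO" can succeed if ZERO is a possible value.
    static boolean thenBranchReachable(EnumSet<Sign> v) {
        return v.contains(Sign.ZERO);
    }

    // The abstract test "v == ZERO" can fail if NEG or POS is a possible value.
    static boolean elseBranchReachable(EnumSet<Sign> v) {
        return v.contains(Sign.NEG) || v.contains(Sign.POS);
    }
}
```

Concretely, -2 + 2 == 0 always holds, so only the then-branch exists in the original program. In the sketch, add(NEG, POS) returns all three tokens, so both branches are reachable abstractly, and the else-branch is the source of a potentially infeasible counter-example.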

Choose-free state space search
Theorem [Saidi:SAS'00]: Every path in the abstracted program where all assignments are deterministic is a path in the concrete program.
Bias the model checker to look only at paths that do not include instructions that introduce non-determinism.
The JPF model checker is modified to detect non-deterministic choice (i.e. calls to Bandera.choose()) and to backtrack from those points.
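The backtracking idea can be sketched as a depth-first search over an explicit state graph. This is a toy model, not JPF: the graph encoding (`succ`, `isChoose`, `isError`) and all names are hypothetical, and a state is marked as a choose state when its next step would be a Bandera.choose() call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of a choose-free depth-first search on a toy state graph.
final class ChooseFreeSearch {
    // Returns a choose-free path from init to an error state, or null if none is found.
    static List<Integer> findViolation(int[][] succ, boolean[] isChoose,
                                       boolean[] isError, int init) {
        List<Integer> path = new ArrayList<>();
        boolean found = dfs(init, succ, isChoose, isError, new HashSet<>(), path);
        return found ? path : null;
    }

    private static boolean dfs(int s, int[][] succ, boolean[] isChoose,
                               boolean[] isError, Set<Integer> visited,
                               List<Integer> path) {
        if (!visited.add(s)) return false;   // already explored
        path.add(s);
        if (isError[s]) return true;         // violation on a choose-free path
        if (!isChoose[s]) {                  // at a choose state: backtrack instead
            for (int t : succ[s]) {
                if (dfs(t, succ, isChoose, isError, visited, path)) return true;
            }
        }
        path.remove(path.size() - 1);        // undo this step and backtrack
        return false;
    }
}
```

A violation that is only reachable through a choose state is never reported by this search, which is the price paid for the feasibility guarantee of the theorem.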

Choice-bounded Search
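[Figure: the state space searched, with the choose() points where the search backtracks marked X; one violation is detectable, another is undetectable.]
Verification on choose-free paths will report the detectable violation, but not violations on paths that pass through choose(), even though such a path may correspond to a feasible execution and indicate a real defect.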

Counter-example guided simulation
Use the abstract counter-example to guide simulation of the concrete program.
Why it works:
- Correspondence between the concrete and the abstracted program
- Unique initial concrete state

Example of Abstracted Code
Java Program:
    class App {
      public static void main(…) {
    [1]   new AThread().start();
          …
    [2]   int i=0;
    [3]   while(i<2) {
    [4]     assert(!Global.done);
    [5]     i++;
          }
      }
    }
    class AThread extends Thread {
      public void run() {
    [6]   Global.done=true;
      }
    }
Abstracted Program:
    class App {
      public static void main(…) {
    [1]   new AThread().start();
          …
    [2]   int i=Signs.ZERO;
    [3]   while(Signs.lt(i,Signs.POS)) {
    [4]     assert(!Global.done);
    [5]     i=Signs.add(i,Signs.POS);
          }
      }
    }
    class AThread extends Thread {
      public void run() {
    [6]   Global.done=true;
      }
    }
Nondeterminism! The loop starts with i=zero.
Problem: i is abstracted with Signs, so the loop condition is abstracted as well. The abstract loop test can be non-deterministic, which yields infeasible counter-examples in which the loop is executed more than 2 times.
Choose-free counter-example: [figure not preserved in the transcript]

Example of Abstracted Code
(Same Java program and abstracted program as on the previous slide.)
[Figure: the steps of the abstract counter-example (i=zero, i=pos, i=pos) are shown next to the steps of the guided concrete simulation (i=0, i=1, i=2).]
Abstract counter-example: Mismatch
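The mismatch check for this loop can be sketched as follows. This is a simplification under stated assumptions: the thread is ignored, only the loop `int i = 0; while (i < 2) { i++; }` is replayed, and the abstract counter-example is represented (hypothetically) as the sequence of branch decisions it takes at the loop test.

```java
// Illustrative sketch of counter-example guided simulation for the loop in App.main.
final class GuidedSimulation {
    // abstractTakesLoop[k] is the branch taken by the abstract counter-example at the
    // k-th evaluation of the loop test (true = enter the body). Returns the index of
    // the first step where the concrete test disagrees, or -1 if the trace is feasible.
    static int firstMismatch(boolean[] abstractTakesLoop) {
        int i = 0;                                   // unique initial concrete state
        for (int k = 0; k < abstractTakesLoop.length; k++) {
            boolean concreteTakesLoop = i < 2;       // concrete loop test
            if (concreteTakesLoop != abstractTakesLoop[k]) {
                return k;                            // abstract trace is infeasible here
            }
            if (concreteTakesLoop) {
                i++;                                 // concrete loop body
            }
        }
        return -1;
    }
}
```

An abstract trace that enters the loop body a third time is rejected at its third loop test, because the concrete value i=2 fails i<2; this is the situation the slide labels "Mismatch".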

Hybrid Approach
[Figure: workflow diagram. Labels recoverable from the transcript: Program & Property; Abstraction; Abstract Program & Property; Model Check; Property true!; Abstract counter-example; Choose-free Model Check; Guided Simulation; Property false! (counter-example); Mismatch; Refine selections.]

Property Abstraction
[Figure: the System is mapped to a Model by Program Abstraction (over-approximation); the Property is mapped to an abstract Property by Property Abstraction (under-approximation).]
If the abstract property holds on the abstract system, then the original property holds on the original system.

Property Abstraction
Properties are temporal logic formulas, written in negation normal form.
Abstract propositions under-approximate the truth of concrete propositions.
Examples (invariance properties):
$\Box (x > -1)$ is abstracted to $\Box ((x = zero) \lor (x = pos))$
$\Box (x > -2)$ is abstracted to $\Box ((x = zero) \lor (x = pos))$
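Why both propositions abstract to the same disjunction can be seen from a small sketch: a token is kept only if every concrete value it represents satisfies the proposition. The sketch checks a bounded sample of integers rather than all of them, and its names (`underApprox`, `bound`) are hypothetical.

```java
import java.util.EnumSet;
import java.util.function.IntPredicate;

// Illustrative sketch of under-approximating a proposition over the Signs tokens.
final class PropertyAbstractionSketch {
    enum Sign { NEG, ZERO, POS }

    static Sign abs(int n) {
        return n < 0 ? Sign.NEG : (n == 0 ? Sign.ZERO : Sign.POS);
    }

    // Keeps token t only if every sampled value that abstracts to t satisfies prop.
    static EnumSet<Sign> underApprox(IntPredicate prop, int bound) {
        EnumSet<Sign> tokens = EnumSet.allOf(Sign.class);
        for (int n = -bound; n <= bound; n++) {
            if (!prop.test(n)) {
                tokens.remove(abs(n));   // one failing value disqualifies the token
            }
        }
        return tokens;
    }
}
```

For x > -2, the value -1 satisfies the proposition but -2 does not, so NEG cannot be included; the abstract proposition is therefore strictly stronger than the concrete one.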

Predicate Abstraction
Use a boolean variable to hold the value of an associated predicate that expresses a relationship between variables.
Example: for the predicate x = y over int * int, the pairs (0, 0), (1, 1), (-1, -1), … map to true, and the pairs (1, 2), (-1, 3), (3, 2), … map to false.

An Example
    Init: x := 0; y := 0; z := 1; goto Body;
    Body: assert (z = 1); x := (x + 1); y := (y + 1); if (x = y) then Z1 else Z0;
    Z1: z := 1;
    Z0: z := 0;
x and y are unbounded.
Data abstraction does not work in this case: abstracting component-wise (per variable) cannot maintain the relationship between x and y.
We will use predicate abstraction in this example.

Predicate Abstraction Process
Add boolean variables to your program to represent the current state of particular predicates.
E.g., add a boolean variable [x=y] to represent whether the condition x=y holds or not.
These boolean variables are updated whenever program statements update variables mentioned in predicates.
E.g., add updates to [x=y] whenever x or y is assigned.

An Example
    Init: x := 0; y := 0; z := 1; goto Body;
    Body: assert (z = 1); x := (x + 1); y := (y + 1); if (x = y) then Z1 else Z0;
    Z1: z := 1;
    Z0: z := 0;
We will use the predicates listed below, and remove variables x and y since they are unbounded. Don't worry too much yet about how we arrive at this particular set of predicates; we will talk a little bit about that later.
    Predicates              Boolean Variables
    p1: (x = 0)             b1: [(x = 0)]
    p2: (y = 0)             b2: [(y = 0)]
    p3: (x = (y + 1))       b3: [(x = (y + 1))]
    p4: (x = y)             b4: [(x = y)]
The bracket notation is our new syntax for representing boolean variables; it helps make the correspondence to the predicates clear.

Transforming Programs
An example of how to transform an assignment statement.
Predicates: [(x = 0)], [(y = 0)], [(x = (y + 1))], [(x = y)]
Assignment statement:
    x := 0;
The statement above is replaced by the statements below:
    [(x=0)] := true;
    [(x=(y+1))] := if [$(y=0)] then false else top;
    [(x=y)] := if [$(y = 0)] then true else if ![$(y=0)] then false
Where:
- [$P] = previous value of [P]
- top is a non-deterministic choice between true and false
A more compact representation uses a helper function H (following SLAM notation):
    [(x=0)] := true;
    [(x=y)] := H([$(y=0)], ![$(y=0)]);
    [(x=(y+1))] := H(false, [$(y=0)]);
Where:
    H(e1, e2) = true,  if e1
                false, if e2
                top,   otherwise
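The helper function H and the abstracted form of x := 0 can be sketched in Java with a three-valued result type. The names (`Tri`, `h`, `assignXZero`) and the array layout are hypothetical; TOP stands for the non-deterministic choice that a model checker would resolve both ways.

```java
// Illustrative sketch of H and of the abstract update for "x := 0".
final class PredicateUpdate {
    enum Tri { TRUE, FALSE, TOP }   // TOP: non-deterministic choice of true/false

    // H(e1, e2): true if e1 holds, false if e2 holds, otherwise unknown.
    static Tri h(boolean e1, boolean e2) {
        if (e1) return Tri.TRUE;
        if (e2) return Tri.FALSE;
        return Tri.TOP;
    }

    // New values of [x=0], [y=0], [x=(y+1)], [x=y] after "x := 0",
    // where prevY0 is the previous value of [y=0], written [$(y=0)] on the slide.
    static Tri[] assignXZero(boolean prevY0) {
        return new Tri[] {
            Tri.TRUE,                          // [(x=0)] := true
            prevY0 ? Tri.TRUE : Tri.FALSE,     // [(y=0)] is unchanged: y is not assigned
            h(false, prevY0),                  // [(x=(y+1))] := H(false, [$(y=0)])
            h(prevY0, !prevY0)                 // [(x=y)] := H([$(y=0)], ![$(y=0)])
        };
    }
}
```

When [$(y=0)] is false, the sketch leaves [(x=(y+1))] at TOP: the available predicates cannot tell whether y was -1 before the assignment.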

State Simulation
Given a program abstracted by predicates E1, …, En, an abstract state simulates a concrete state if Ei holds on the concrete state iff the boolean variable [Ei] is true, and the remaining concrete variables and control points agree.
Examples:
- The abstract state $(n_2, [[x=0] \mapsto False, [y=0] \mapsto False, [x=(y+1)] \mapsto False, [x=y] \mapsto True, z \mapsto 0])$ simulates the concrete states $(n_2, [x \mapsto 2, y \mapsto 2, z \mapsto 0])$ and $(n_2, [x \mapsto 3, y \mapsto 3, z \mapsto 0])$.
- The abstract state $(n_2, [[x=0] \mapsto False, [y=0] \mapsto True, [x=(y+1)] \mapsto True, [x=y] \mapsto False, z \mapsto 1])$ simulates the concrete state $(n_2, [x \mapsto 1, y \mapsto 0, z \mapsto 1])$.
- The same abstract state does not simulate the concrete state $(n_2, [x \mapsto 3, y \mapsto 3, z \mapsto 1])$.
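The simulation condition for the four example predicates can be written as a direct check. The parameter layout is a hypothetical encoding chosen for this sketch: `b` holds the values of [x=0], [y=0], [x=(y+1)], [x=y] in that order.

```java
// Illustrative sketch of the "abstract state simulates concrete state" check.
final class StateSimulation {
    static boolean simulates(int absNode, boolean[] b, int absZ,
                             int node, int x, int y, int z) {
        return absNode == node              // control points agree
            && absZ == z                    // remaining concrete variable agrees
            && b[0] == (x == 0)             // [x=0]
            && b[1] == (y == 0)             // [y=0]
            && b[2] == (x == y + 1)         // [x=(y+1)]
            && b[3] == (x == y);            // [x=y]
    }
}
```

Applied to the slide's examples, the check accepts (x=2, y=2, z=0) and (x=3, y=3, z=0) for the first abstract state, and rejects (x=3, y=3, z=1) for the second because [y=0] is True there.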

Computing Abstracted Programs
[Figure: the concrete transition from (n1,s1) to (n2,s2) is simulated by the abstract transition from (n1, [[P1], [P2], …, [Pn]]) to (n2, [[P1]', [P2]', …, [Pn]']). The abstract source state holds the truth values of the predicates on s1; the abstract target state holds the truth values of the predicates on s2.]
For each statement s in the original program, we need to compute a new statement that describes how the given predicates change in value due to the effects of s. To do this for a given predicate Pi, we need to know whether we should set [Pi] to true or false in the abstract target state. Thus, we need to know the conditions at (n1, s1) that guarantee that [Pi] will be true in the target state and the conditions that guarantee that [Pi] will be false in the target state. These conditions will be used in the helper function H:
    [Pi] := H(e1, e2)
where e1 gives the conditions that make [Pi] true, e2 gives the conditions that make [Pi] false, and
    H(e1, e2) = true,  if e1
                false, if e2
                top,   otherwise

Computing Abstracted Programs
Example: the concrete transition from (n1,s1) to (n2,s2) executes $a \equiv x := 0$. The abstract transition from (n1, [[P1], [P2], …, [Pn]]) to (n2, [[P1]', [P2]', …, [Pn]']) must execute [x=y] := H(…?…, …?…).
What conditions have to hold before a is executed to guarantee that x=y is true (false) after a is executed?
Note: we want the least restrictive conditions.
The technical term for what we want is the "weakest pre-condition of a with respect to x=y".
Let's take a little detour to learn about weakest preconditions.

Floyd-Hoare Triples
{F1} C {F2}
- F1: pre-condition (boolean formula)
- C: command
- F2: post-condition (boolean formula)
A triple is a logical judgement that holds when the following condition is met: for all states s that satisfy F1 (i.e., $s \models F_1$), if executing C on s terminates with a state s', then $s' \models F_2$.

Weaker/Stronger Formulas
If $F' \Rightarrow F$ (F' implies F), we say that F is weaker than F'.
Intuitively, F' contains at least as much information as F, because whatever F says about the world can be derived from F'.
Intuitively, stronger formulas impose more restrictions on states.
Thinking in terms of sets of states, let $S_{F'} = \{s \mid s \models F'\}$ and $S_F = \{s \mid s \models F\}$. Note that $S_{F'} \subseteq S_F$, since F' imposes more restrictions than F.
Question: what formula is the weakest of all? (In other words, what formula describes the largest set of states? What formula imposes the fewest restrictions?)

Weakest Preconditions
The weakest precondition of C with respect to F2 (denoted WP(C,F2)) is a formula F1 such that {F1} C {F2} and, for all other F'1 such that {F'1} C {F2}, $F'_1 \Rightarrow F_1$ (F1 is weaker than F'1).
This notion is useful because it answers the question: "what formula F1 captures the fewest restrictions that I can impose on a state s so that when $s' = [[C]]s$ then $s' \models F_2$?"
WP is interesting for us when calculating predicate abstractions because, for a given command C and boolean variable [Pi], we want to know the least restrictive conditions that have to hold before C so that we can conclude that Pi is definitely true (false) after executing C.

Calculating Weakest Preconditions
Calculating WP for assignments is easy: $WP(x := e, F) = F[x \mapsto e]$
Intuition: x is going to get a new value from e, so if F has to hold after x := e, then $F[x \mapsto e]$ is required to hold before x := e is executed.
Examples:
$WP(x := 0, x = y) = (x = y)[x \mapsto 0] = (0 = y)$
$WP(x := 0, x = y + 1) = (x = y + 1)[x \mapsto 0] = (0 = y + 1)$
$WP(x := x+1, x = y + 1) = (x = y + 1)[x \mapsto x + 1] = (x + 1 = y + 1)$
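The substitution rule can also be read semantically: the precondition holds in a state exactly when F holds after x has been overwritten with the value of e. The sketch below takes that reading for states with two integer variables x and y; it represents formulas as Java predicates rather than syntax, and the names `Formula` and `wpAssignX` are hypothetical.

```java
import java.util.function.IntBinaryOperator;

// Illustrative sketch: semantic weakest precondition of "x := e" over states (x, y).
final class WpAssign {
    interface Formula {
        boolean holds(int x, int y);
    }

    // WP(x := e, F): evaluate F in the state where x is replaced by the value of e.
    static Formula wpAssignX(IntBinaryOperator e, Formula f) {
        return (x, y) -> f.holds(e.applyAsInt(x, y), y);
    }
}
```

For example, wpAssignX((x, y) -> 0, (x, y) -> x == y) behaves like the formula 0 = y: it holds in any state with y = 0, whatever the old value of x was.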

Calculating Weakest Preconditions
Calculating WP for other commands (state transformers):
$WP(skip, F) = F$
$WP(assert\ e, F) = e \Rightarrow F$, i.e. $\neg e \lor F$
$WP(assume\ e, F) = e \Rightarrow F$, i.e. $\neg e \lor F$
Skip: since the store is not modified, F will hold afterward iff it holds before.
Assert and assume: even though we have a different operational interpretation of assert and assume in the verifier, the definition of WP for these relies on the assumption that a violated assertion or assume condition is the same as the command "not completing".
Note that if e is false, then the triple $\{\neg e \lor F\}$ assert e $\{F\}$ always holds, since the command never completes for any state.

Assessment
Intuition: an assignment statement C in the source program is transformed, in the abstracted program, into an atomic { } block containing an assignment to each boolean variable [Pi], where each assignment has the form [Pi] := H(WP(C,Pi),WP(C,!Pi)).
But what's wrong with this?
Answer: the predicates Pi refer to concrete variables, and the entire purpose of the abstraction process is to remove those from the program.
The point is that the conditions in the H function should be stated in terms of the boolean variables [Pi] instead of the predicates Pi.

Assessment
In the case of x := 0 and the predicate x = y, we have
    WP(x := 0, x=y) = (0=y)
    WP(x := 0, !(x=y)) = !(0=y)
In this case, the information in the predicate variables is enough to decide whether 0=y holds or not. That is, we can simply generate the assignment statement
    [(x=y)] := H([$(y = 0)],![$(y=0)]);

Assessment
In the case of x := 0 and the predicate x = (y+1), we have
    WP(x := 0, (x=y+1)) = (0=y+1)
    WP(x := 0, !(x=y+1)) = !(0=y+1)
In this case, we don't have a predicate variable [0=y+1]. We must consider combinations of our existing predicate variables that imply the conditions above. That is, we consider stronger (more restrictive, less desirable but still useful) conditions formed using the predicate variables that we have.
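One way to picture the search for stronger conditions is to test, for each predicate variable and its negation, whether it implies the required condition. The sketch below makes two loud assumptions: it checks implications by enumerating a bounded range of (x, y) values instead of calling a theorem prover, and it considers only single literals, not conjunctions of them. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: which single literals over the predicate variables imply a target?
final class Strengthen {
    interface Formula {
        boolean holds(int x, int y);
    }

    static final String[] NAMES = { "[x=0]", "[y=0]", "[x=(y+1)]", "[x=y]" };
    static final Formula[] PREDS = {
        (x, y) -> x == 0,
        (x, y) -> y == 0,
        (x, y) -> x == y + 1,
        (x, y) -> x == y
    };

    // Returns the literals that imply target on every sampled state.
    static List<String> implyingLiterals(Formula target, int bound) {
        List<String> result = new ArrayList<>();
        for (int p = 0; p < PREDS.length; p++) {
            for (boolean positive : new boolean[] { true, false }) {
                if (implies(PREDS[p], positive, target, bound)) {
                    result.add((positive ? "" : "!") + NAMES[p]);
                }
            }
        }
        return result;
    }

    // Bounded check of "literal implies target"; a sample, not a proof.
    private static boolean implies(Formula pred, boolean positive,
                                   Formula target, int bound) {
        for (int x = -bound; x <= bound; x++) {
            for (int y = -bound; y <= bound; y++) {
                if (pred.holds(x, y) == positive && !target.holds(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With these restrictions, no literal implies 0=y+1, while [y=0] implies !(0=y+1), which is consistent with the assignment [(x=(y+1))] := H(false, [$(y=0)]) shown on the Transforming Programs slide.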

What Are Appropriate Predicates?
In general, a difficult question, and the subject of much research.
Research focuses on automatic discovery of predicates by processing (infeasible) counterexamples.
If a counterexample is infeasible, add predicates that allow infeasible branches to be avoided.
[Figure: a counterexample path reaches the test "If x > y …" and takes the infeasible branch instead of the feasible branch.]
Example: add the predicate x > y so that we have enough information to avoid the infeasible branch.

What Are Appropriate Predicates?
Some general heuristics that we will follow:
- Use the predicates mentioned in property P, if variables mentioned in those predicates are being removed by abstraction. At the least, we want to tell whether our property predicates are true or not.
- Use predicates that appear in conditionals along "important paths". E.g., (x=y)
- Use predicates that allow the predicates above to be decided. E.g., (x=0), (y=0), (x = (y + 1))
