Program Analysis and Verification 0368-4479

Program Analysis and Verification 0368-4479 http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html Noam Rinetzky Lecture 9: Abstract Interpretation II Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

From verification to analysis Manual program verification – Verifier provides assertions Loop invariants Program analysis – Automatic program verification – Tool automatically synthesize assertions Finds loop invariants

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis 3

Abstract Interpretation [Cousot’77] Mathematical framework for approximating semantics (aka abstraction) – Allows designing sound static analysis algorithms Usually compute by iterating to a fixed-point – Computes (loop) invariants Can be interpreted as axiomatic verification assertions Generalizes Hoare Logic & WP / SP calculus 4

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states ~ Assertions Join (  ) ~ Weakening – Transformer functions Abstract steps ~ Axioms – Chaotic iteration Structured Programs ~ Control-flow graphs Abstract computation ~ Loop invariants 5

Concrete Semantics 6 set of states operational semantics (concrete semantics) statement S set of states

Conservative Semantics 7 set of states operational semantics (concrete semantics) statement S set of states 

Abstract (conservative) interpretation 8 set of states operational semantics (concrete semantics) statement S abstract representation  abstract representation abstract semantics statement S abstract representation abstraction {P}{P}S{Q}{Q}sp(S, P)  generalizes axiomatic verification

Abstract (conservative) interpretation 9 set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

Abstract (conservative) interpretation 10 set of states operational semantics (concrete semantics) statement S set of states abstract state abstract semantics (transfer function) statement S abstract state concretization 

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states ~ Assertions Join (  ) ~ Weakening – Transformer functions Abstract steps ~ Axioms – Chaotic iteration Abstract computation ~ Loop invariants Structured Programs ~ Control-flow graphs 11 Lattices (D, , , , ,  ) Lattices (D, , , , ,  ) Monotonic functions Fixpoints

A taxonomy of semantic domain types 12 Complete Lattice (D, , , , ,  ) Lattice (D, , , , ,  ) Join semilattice (D, , ,  ) Meet semilattice (D, , ,  ) Complete partial order (CPO) (D, ,  ) Partial order (poset) (D,  ) Preorder (D,  )

Preorder 13 We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’

Preorder 14 d d d’ d’’        We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ Hasse Diagram

Preorder 15 We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ d d d’ d’’        Hasse Diagram

Partial order We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ – Anti-symmetric: d  d’ and d’  d implies d = d’ 16 d d d’ d’’        Hasse Diagram

Chains d  d’ means d  d’ and d  d’ An ascending chain is a sequence x 1  x 2  …  x k … A descending chain is a sequence x 1  x 2  …  x k … The height of a poset (D,  ) is the length of the maximal ascending chain in D 17

poset Hasse diagram (for CP) 18 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

Some posets-related terminology 19

Least upper bound (LUB) (D,  ) is a poset b ∊ D is an upper bound of A ⊆ D if ∀ a  A: a  b b ∊ D is the least upper bound of A ⊆ D if – b is an upper bound of A – If b’ is an upper bound of A then b  b’ – Join:  X = LUB of X x  y =  {x, y} May not exist

Join operator Properties of a join operator – Commutative: x  y = y  x – Associative: (x  y)  z = x  (y  z) – Idempotent: x  x = x A kind of abstract union (disjunction) operator Top element of (D,  ) is  =  D 21

Join Example 22 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

Greatest lower bound (GLB) (D,  ) is a poset b ∊ D is an lower bound of A ⊆ D if ∀ a  A: b  a b ∊ D is the greatest lower bound of A ⊆ D if – b is an lower bound of A – If b’ is an lower bound of A then b’  b – Meet:  X = GLB of X x  y =  {x, y} May not exist

Meet operator Properties of a meet operator – Commutative: x  y = y  x – Associative: (x  y)  z = x  (y  z) – Idempotent: x  x = x A kind of abstract intersection (conjunction) operator Bottom element of (D,  ) is  =  D 26

Complete partial order (CPO) A poset (D,  ) is a complete partial if every ascending chain x 1  x 2  …  x k … has a LUB 27

Meet Example 28   x=0 x0x0 x<0x>0 x0x0

Complete partial order (CPO) A poset (D,  ) is a complete partial if every ascending chain x 1  x 2  …  x k … has a LUB 31

Join semilattices (D, , ,  ) is a join semilattice – (D,  ) is a partial order – ∀ X  FIN D.  X is defined – A top element  32

Meet semilattices (D, , ,  ) is a meet semilattice – (D,  ) is a partial order – ∀ X  FIN D.  X is defined – A bottom element  33

Lattices (D, , , , ,  ) is a lattice if – (D, , ,  ) is a join semilattice – (D, , ,  ) is a meet semilattice A lattice (D, , , , ,  ) is a complete lattice if –  X and  Y are defined for arbitrary sets 34

Example: Powerset lattices (2 X, , , , , X) is the powerset lattice of X – A complete lattice 35

Example: Sign lattice 36   x=0 x0x0 x<0x>0 x0x0

A taxonomy of semantic domain types 37 Complete Lattice (D, , , , ,  ) Lattice (D, , , , ,  ) Join semilattice (D, , ,  ) Meet semilattice (D, , ,  ) Join/Meet exist for every subset of D Join/Meet exist for every finite subset of D (alternatively, binary join/meet) Join of the empty set Meet of the empty set Complete partial order (CPO) (D, ,  ) Partial order (poset) (D,  ) Preorder (D,  ) reflexive transitive anti-symmetric: d  d’ and d’  d implies d = d’ reflexive: d  d transitive: d  d’, d’  d’’ implies d  d’’ poset with LUB for all ascending chains

Towards a recipe for static analysis 38

Collecting semantics For a set of program states State, we define the collecting lattice (2 State, , , , , State) The collecting semantics accumulates the (possibly infinite) sets of states generated during the execution – Not computable in general 39

Abstract (conservative) interpretation 40 set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

Abstract (conservative) interpretation 41 {x ↦ 1, x ↦ 2, …}{x ↦ 0, x ↦ 1, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …}  0 < x abstract semantics x = x -1 0 ≤ x concretization

Abstract (conservative) interpretation 42 {x ↦ 1, x ↦ 2, …}{…, x ↦ 0, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …}  0 < x abstract semantics x = x -1  concretization

Abstract (non-conservative) interpretation 43 {x ↦ 1, x ↦ 2, …}{ x ↦ 1, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …} ⊈ 0 < x abstract semantics x = x -1 0 < x concretization

But … what if we have x & y? – Define lattice (semantics) for each variable – Compose lattices Goal: compositional definition What if we have more than 1 statement? – Define semantics for entire program via CFG – Different “abstract states” at every CFG node 44

One lattice per variable 45 true false x=0 x0x0 x<0x>0 x0x0 true false y=0 y0y0 y<0y>0 y0y0 How can we compose them?

Cartesian product of complete lattices For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the poset L cart = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) as follows: – (x 1, x 2 )  cart (y 1, y 2 ) iff x 1  1 y 1 x 2  2 y 2 –  cart = ?  cart = ?  cart = ?  cart = ? Lemma: L is a complete lattice Define the Cartesian constructor L cart = Cart(L 1, L 2 ) 46

Cartesian product example 47  =( ,  ) x<0,y<0x<0,y=0x 0x=0,y<0x=0,y=0x=0,y>0x>0,y<0x>0,y=0x>0,y>0 x  0,y< 0 x  0,y< 0 x  0,y= 0 x  0,y= 0 x  0,y> 0 x  0,y> 0 x>0,y  0 … … x  0,y  0 x  0,y  0 x  0,y  0 x  0,y  0 x0x0 x0x0 y0y0 y0y0 … ( false, false ) How does it represent (x 0  y>0)?  =( ,  )

Disjunctive completion For a complete lattice L = (D, , , , ,  ) Define the powerset lattice L  = (2 D,  ,  ,  ,  ,   )   = ?   = ?   = ?   = ?   = ? Lemma: L  is a complete lattice L  contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates Define the disjunctive completion constructor L  = Disj(L) 48

The base lattice CP 49 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

The disjunctive completion of CP 50 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false {x=-2  x=-1}{x=-2  x=0}{x=-2  x=1}{x=1  x=2} ……… {x=0  x=1  x=2}{x=-1  x=1  x=-2} ……… … What is the height of this lattice?

Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = ? 51

Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = Disj(Cart(L 1, L 2 )) Lemma: L is a complete lattice What does it buy us? 52

Cartesian product example 53 true false x<0,y<0x<0,y=0x 0x=0,y<0x=0,y=0x=0,y>0x>0,y<0x>0,y=0x>0,y>0 x  0,y< 0 x  0,y< 0 x  0,y= 0 x  0,y= 0 x  0,y> 0 x  0,y> 0 x>0,y  0 … … x  0,y  0 x  0,y  0 x  0,y  0 x  0,y  0 x0x0 x0x0 y0y0 y0y0 … How does it represent (x 0  y>0)? What is the height of this lattice?

Relational product example 54 true false (x 0  y>0) x0x0 x0x0 y0y0 y0y0 How does it represent (x 0  y>0)? (x 0  y=0)(x<0  y  0)  (x<0  y  0) … What is the height of this lattice?

Collecting semantics 1 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 2 3 4 5 if x > 0 x := x - 1 2 3 entry exit [x1][x1] [x1][x1] [x1][x1] [x0][x0] [x0][x0] [ x  -1] [x2][x2] [x2][x2] [x2][x2] [x2][x2] [x3][x3] [x3][x3] [x3][x3] … … … 55 [ x  -2] …

Defining the collecting semantics How should we represent the set of states at a given control-flow node by a lattice? How should we represent the sets of states at all control-flow nodes by a lattice? 56

Finite maps For a complete lattice L = (D, , , , ,  ) and finite set V Define the poset L V  L = (V  D,  V  L,  V  L,  V  L,  V  L,  V  L ) as follows: – f 1  V  L f 2 iff for all v  V f 1 (v)  f 2 (v) –  V  L = ?  V  L = ?  V  L = ?  V  L = ? Lemma: L is a complete lattice Define the map constructor L V  L = Map(V, L) 57

The collecting lattice Lattice for a given control-flow node v: ? Lattice for entire control-flow graph with nodes V: ? We will use this lattice as a baseline for static analysis and define abstractions of its elements 58

The collecting lattice Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements 59

Equational definition of the semantics Define variables of type set of states for each control-flow node Define constraints between them 60 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit]

Equational definition of the semantics R[2] = R[entry]   x:=x-1  R[3] R[3] = R[2]  {s | s(x) > 0} R[exit] = R[2]  {s | s(x)  0} A system of recursive equations How can we approximate it using what we have learned so far? 61 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit]

An abstract semantics R[2] = R[entry]   x:=x-1  # R[3] R[3] = R[2]  {s | s(x) > 0} # R[exit] = R[2]  {s | s(x)  0} # A system of recursive equations 62 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit] Abstract transformer for x:=x-1 Abstract representation of {s | s(x) < 0}

Abstract interpretation via abstraction 63 set of states collecting semantics statement S abstract representation of sets of states  abstract representation of sets of states abstract semantics statement S abstract representation of sets of states abstraction {P}{P}S{Q}{Q}sp(S, P)  generalizes axiomatic verification

Abstract interpretation via concretization 64 set of states collecting semantics statement S set of states  abstract representation of sets of states abstract semantics statement S abstract representation of sets of states concretization {P}{P}S{Q}{Q} models(P)models(sp(S, P))models(Q) 

Required knowledge Collecting semantics Abstract semantics Connection between collecting semantics and abstract semantics Algorithm to compute abstract semantics 65

The collecting lattice (sets of states) Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements 66

Collecting semantics example 1 2 3 4 5 67 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: if x > 0 x := x-1 entry exit 2 3 [x1][x1][x0][x0][ x  -1][x2][x2][x2][x2][x3][x3] … [x1][x1][x0][x0] [x2][x2][x2][x2][x3][x3] … [x1][x1] [x3][x3][x4][x4] [x2][x2] … [ x  -2] … [x0][x0] [x0][x0][x1][x1] [x2][x2][x3][x3] …

Shorthand notation 1 2 3 4 5 68 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: if x > 0 x := x-1 entry exit {xZ}{xZ} {xZ}{xZ} { x >0} { x  0} { x  0} 2 3

Equational definition example A vector of variables R[0, 1, 2, 3, 4] R[0] = { x  Z} // established input R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] A (recursive) system of equations 69 if x > 0 x := x-1 entry exit R[0] R[1] R[2] R[4] R[3] Semantic function for assume x>0 Semantic function for x:=x-1 lifted to sets of states

General definition A vector of variables R[0, …, k] one per input/output of a node – R[0] is for entry For node n with multiple predecessors add equation R[n] =  {R[k] | k is a predecessor of n} For an atomic operation node R[m] S R[n] add equation R[n] =  S  R[m] Treat true/false branches of conditions as atomic statements assume bexpr and assume  bexpr 70 if x > 0 x := x-1 entry exit R[0] R[1] R[2] R[4] R[3]

Equation systems in general Let L be a complete lattice (D, , , , ,  ) Let R be a vector of variables R[0, …, n]  D  …  D Let F be a vector of functions of the type F[i] : R[0, …, n]  R[0, …, n] A system of equations R[0] = f[0](R[0], …, R[n]) … R[n] = f[n](R[0], …, R[n]) In vector notation R = F(R) Questions: 1.Does a solution always exist? 2.If so, is it unique? 3.If so, is it computable? 71

Equation systems in general Let L be a complete lattice (D, , , , ,  ) Let R be a vector of variables R[0, …, n]  D  …  D Let F be a vector of functions of the type F[i] : R[0, …, n]  R[0, …, n] A system of equations R[0] = f[0](R[0], …, R[n]) … R[n] = f[n](R[0], …, R[n]) In vector notation R = F(R) Questions: 1.Does a solution always exist? 2.If so, is it unique? 3.If so, is it computable? 72

Monotone functions Let L 1 =(D 1,  ) and L 2 =(D 2,  ) be two posets A function f : D 1  D 2 is monotone if for every pair x, y  D 1 x  y implies f(x)  f(y) A special case: L 1 =L 2 =(D,  ) f : D  D 73

Important cases of monotonicity Join: f(X, Y) = X  Y Prove it! For a set X and any function g F(X) = { g(x) | x  X } Prove it! Notice that the collecting semantics function is defined in terms of – Join (set union) – Semantic function for atomic statements lifted to sets of states 74

Extensive/reductive functions Let L=(D,  ) be a poset A function f : D  D is extensive if for every pair x  D, we have that x  f(x) A function f : D  D is reductive if for every pair x  D, we have that x  f(x) 75

Fixed-points L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 76 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn() 1.Does a solution always exist? Yes 2.If so, is it unique? No, but it has least/greatest solutions 3.If so, is it computable? Under some conditions…

Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 77 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : Fixed-point = d

Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 78 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <-5} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : pre Fixed-point  d

Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 79 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <9} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : post Fixed-point  d

Continuity and ACC condition Let L = (D, , ,  ) be a complete partial order – Every ascending chain has an upper bound A function f is continuous if for every increasing chain Y  D*, f(  Y) =  { f(y) | y  Y } L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d 0  d 1  …  d n = d n+1 = … 80

Fixed-point theorem [Kleene] Let L = (D, , ,  ) be a complete partial order and a continuous function f: D  D then lfp(f) =  n  N f n (  ) Lemma: Monotone functions on posets satisfying ACC are continuous Proof: 1.Every ascending chain Y eventually stabilizes d 0  d 1  …  d n = d n+1 = … hence d n is the least upper bound of {d 0, d 1, …, d n }, thus f(  Y) = f(d n ) 2.From monotonicity of f f(d 0 )  f(d 1 )  …  f(d n ) = f(d n+1 ) = … Hence f(d n ) is the least upper bound of {f(d 0 ), f(d 1 ), …, f(d n )}, thus  { f(y) | y  Y } = f(d n ) 81

Resulting algorithm Kleene’s fixed point theorem gives a constructive method for computing the lfp 82   lfp fn()fn() f()f() f2()f2() … d :=  while f(d)  d do d := d  f(d) return d Algorithm lfp(f) =  n  N f n (  ) Mathematical definition

Chaotic iteration 83 Input: – A cpo L = (D, , ,  ) satisfying ACC – L n = L  L  …  L – A monotone function f : D n  D n – A system of equations { X[i] | f(X) | 1  i  n } Output: lfp(f) A worklist-based algorithm for i:=1 to n do X[i] :=  WL = {1,…,n} while WL   do j := pop WL // choose index non-deterministically N := F[i](X) if N  X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) return X

Chaotic iteration for static analysis Specialize chaotic iteration for programs Create a CFG for program Choose a cpo of properties for the static analysis to infer: L = (D, , ,  ) Define variables R[0,…,n] for input/output of each CFG node such that R[i]  D For each node v let v out be the variable at the output of that node: v out = F[v](  u | (u,v) is a CFG edge) – Make sure each F[v] is monotone Variable dependence determined by outgoing edges in CFG 84

Constant propagation example 85 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit x := 4; while (y  5) do z := x; x := 4

Constant propagation lattice For each variable x define L as For a set of program variables Var=x 1,…,x n L n = L  L  …  L 86  0-212...  no information not-a-constant

Write down variables 87 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit x := 4; while (y  5) do z := x; x := 4

Write down equations 88 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R1R1 R5R5 R0R0 x := 4; while (y  5) do z := x; x := 4

Collecting semantics equations 89 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R 0 = State R 1 =  x:=4  R 0 R 2 = R 1  R 5 R 3 =  assume y  5  R 2 R 4 =  z:=x  R 3 R 5 =  x:=4  R 4 R 6 =  assume y=5  R 2 R1R1 R5R5 R0R0

Constant propagation equations 90 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R1R1 R5R5 R0R0 abstract transformer

Abstract operations for CP 91 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 Lattice elements have the form: (v x, v y, v z )  x:=4  # (v x,v y,v z ) = (4, v y, v z )  z:=x  # (v x,v y,v z ) = (v x, v y, v x )  assume y  5  # (v x,v y,v z ) = (v x, v y, v x )  assume y=5  # (v x,v y,v z ) = if v y = k  5 then ( , ,  ) else (v x, 5, v z ) R 1  R 5 = (a 1, b 1, c 1 )  (a 5, b 5, c 5 ) = (a 1  a 5, b 1  b 5, c 1  c 5 )  0-212...  CP lattice for a single variable

Chaotic iteration for CP: initialization 92 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =( , ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 0, R 1, R 2, R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 93 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =( , ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 1, R 2, R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 94 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 2, R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 95 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  )  0-212...  3 4 WL = {R 2, R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 96 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =(4, ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  )  0-212...  3 4 WL = {R 2, R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 97 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =(4, ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 3, R 4, R 5, R 6 }

Chaotic iteration for CP 98 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 3 =(4, ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =( , ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 4, R 5, R 6 }

Chaotic iteration for CP 99 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 3 =(4, ,  ) R 4 =(4, , 4) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =( , ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 5, R 6 }

Chaotic iteration for CP 100 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 2, R 6 } added R 2 back to worklist since it depends on R 5

Chaotic iteration for CP 101 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =( , ,  ) R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 6 }

Chaotic iteration for CP 102 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =(4, 5,  ) R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) Fixed-point WL = {} In practice maintain a worklist of nodes

Chaotic iteration for static analysis Specialize chaotic iteration for programs Create a CFG for program Choose a cpo of properties for the static analysis to infer: L = (D, , ,  ) Define variables R[0,…,n] for input/output of each CFG node such that R[i]  D For each node v let v out be the variable at the output of that node: v out = F[v](  u | (u,v) is a CFG edge) – Make sure each F[v] is monotone Variable dependence determined by outgoing edges in CFG 103

Complexity of chaotic iteration Parameters: – n the number of CFG nodes – k is the maximum in-degree of edges – Height h of lattice L – c is the maximum cost of Applying F v  Checking fixed-point condition for lattice L Complexity: O(n  h  c  k) Incremental (worklist) algorithm reduces the n factor in factor – Implement worklist by priority queue and order nodes by reversed topological order 104

Required knowledge Collecting semantics Abstract semantics (over lattices) Algorithm to compute abstract semantics (chaotic iteration) Connection between collecting semantics and abstract semantics Abstract transformers 105

Recap We defined a reference semantics – the collecting semantics We defined an abstract semantics for a given lattice and abstract transformers We defined an algorithm to compute abstract least fixed-point when transformers are monotone and lattice obeys ACC Questions: 1.What is the connection between the two least fixed- points? 2.Transformer monotonicity is required for termination – what should we require for correctness? 106

Recap We defined a reference semantics – the collecting semantics We defined an abstract semantics for a given lattice and abstract transformers We defined an algorithm to compute abstract least fixed-point when transformers are monotone and lattice obeys ACC Questions: 1.What is the connection between the two least fixed- points? 2.Transformer monotonicity is required for termination – what should we require for correctness? 107

Galois Connection Given two complete lattices C = (D C,  C,  C,  C,  C,  C )– concrete domain A = (D A,  A,  A,  A,  A,  A )– abstract domain A Galois Connection (GC) is quadruple (C, , , A) that relates C and A via the monotone functions – The abstraction function  : D C  D A – The concretization function  : D A  D C for every concrete element c  D C and abstract element a  D A  (  (a))  a and c   (  (c)) Alternatively  (c)  a iff c   (a) 108

Galois Connection: c   (  (c)) 109 1   c 2 (c)(c) 3  (  (c))  The most precise (least) element in A representing c CA

Galois Connection:  (  (a))  a 110 1   3  (  (a)) 2 (a)(a) a  CA What a represents in C (its meaning)

Example: lattice of equalities Concrete lattice: C = (2 State, , , , , State) Abstract lattice: EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  ) – Treat elements of A as both formulas and sets of constraints Useful for copy propagation – a compiler optimization  (X) = ?  (Y) = ? 111

Example: lattice of equalities Concrete lattice: C = (2 State, , , , , State) Abstract lattice: EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  ) – Treat elements of A as both formulas and sets of constraints Useful for copy propagation – a compiler optimization  (s) =  ({s}) = { x=y | s x = s y} that is s  x=y  (X) =  {  (s) | s  X} =  A {  (s) | s  X}  (Y) = { s | s   Y } = models(  Y) 112

Galois Connection: c   (  (c)) 113 1   [x  5, y  5, z  5] 2 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 3 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] …  4 x=x, y=y, z=z  The most precise (least) element in A representing [x  5, y  5, z  5] CA

Most precise abstract representation 114 1 c 5  CA 4 6 2 73   8 9  (c)(c)  (c) =  {c’ | c   (c’)} 

Most precise abstract representation 115 1 c 5  CA 4 6 2 73   8 9   (c)= x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y  (c) =  {c’ | c   (c’)} [x  5, y  5, z  5] x=y, y=z x=y, z=y x=y

Galois Connection:  (  (a))  a 116 1   3 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 2 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] … x=y, y=z  What a represents in C (its meaning)    is called a semantic reduction CA

Galois Insertion  a:  (  (a))=a 117   1 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 2 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] … CA How can we obtain a Galois Insertion from a Galois Connection? All elements are reduced

Properties of a Galois Connection The abstraction and concretization functions uniquely determine each other:  (a) =  {c |  (c)  a}  (c) =  {a | c   (a)} 118

Abstracting (disjunctive) sets It is usually convenient to first define the abstraction of single elements  (s) =  ({s}) Then lift the abstraction to sets of elements  (X) =  A {  (s) | s  X} 119

The case of symbolic domains An important class of abstract domains are symbolic domains – domains of formulas C = (2 State, , , , , State) A = (D A,  A,  A,  A,  A,  A ) If D A is a set of formulas then the abstraction of a state is defined as  (s) =  ({s}) =  A {  | s   } the least formula from D A that s satisfies The abstraction of a set of states is  (X) =  A {  (s) | s  X} The concretization is  (  ) = { s | s   } = models(  ) 120

Inducing along the connections Assume the complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) M = (D M,  M,  M,  M,  M,  M ) and Galois connections GC C,A =(C,  C,A,  A,C, A) and GC A,M =(A,  A,M,  M,A, M) Lemma: both connections induce the GC C,M = (C,  C,M,  M,C, M) defined by  C,M =  C,A   A,M and  M,C =  M,A   A,C 121

Inducing along the connections 122 1 C,AC,A A,CA,C c 2 C,A(c)C,A(c) 5 CA 3 M A,MA,M 4 M,AM,A c’c’ a’ =  A,M (  C,A (c))

Sound abstract transformer Given two lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with A concrete transformer f : D C  D C an abstract transformer f # : D A  D A We say that f # is a sound transformer (w.r.t. f) if  c: f(c)=c’   (f # (c))   (c’) For every a and a’ such that  (f(  (a)))  A f # (a) 123

Transformer soundness condition 1 124 12 CA f 3 4 f#f# 5   c: f(c)=c’   (f # (c))   (c’)

Transformer soundness condition 2 125 CA 12 f#f# 3 5 f 4   a: f # (a)=a’  f(  (a))   (a’)

Best (induced) transformer 126 CA 2 3 f f # (a)=  (f(  (a))) 1 f#f# 4 Problem:  incomputable directly

Best abstract transformer [CC’77] Best in terms of precision – Most precise abstract transformer – May be too expensive to compute Constructively defined as f # =   f   – Induced by the GC Not directly computable because first step is concretization We often compromise for a “good enough” transformer – Useful tool: partial concretization 127

Transformer example C = (2 State, , , , , State) EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  )  (s) =  ({s}) = { x=y | s x = s y} that is s  x=y  (X) =  {  (s) | s  X} =  A {  (s) | s  X}  (  ) = { s | s   } = models(  ) Concrete:  x:=y  X = { s[x  s y] | s  X} Abstract:  x:=y  # X = ? 128

Developing a transformer for EQ - 1 Input has the form X =  {a=b} sp(x:=expr,  ) =  v. x=expr[v/x]   [v/x] sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Let’s define helper notations: – EQ(X, y) = {y=a, b=y  X} Subset of equalities containing y – EQc(X, y) = X \ EQ(X, y) Subset of equalities not containing y 129

Developing a transformer for EQ - 2 sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Two cases – x is y: sp(x:=y,  X) = X – x is different from y: sp(x:=y,  X) =  v. x=y  EQ)X, x)[v/x]  EQc(X, x)[v/x] = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  x=y  EQc(X, x) Vanilla transformer:  x:=y  #1 X = x=y  EQc(X, x) Example:  x:=y  #1  {x=p, q=x, m=n} =  {x=y, m=n} Is this the most precise result? 130

Developing a transformer for EQ - 3  x:=y  #1  {x=p, x=q, m=n} =  {x=y, m=n}   {x=y, m=n, p=q} – Where does the information p=q come from? sp(x:=y,  X) = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that? 131

Developing a transformer for EQ - 4 Define a reduction operator: Explicate(X) = if exist {a=b, b=c}  X but not {a=c}  X then Explicate(X  {a=c}) else X Define  x:=y  #2 =  x:=y  #1  Explicate  x:=y  #2 )  {x=p, x=q, m=n}) =  {x=y, m=n, p=q} is this the best transformer? 132

Developing a transformer for EQ - 5  x:=y  #2 )  {y=z}) = {x=y, y=z}  {x=y, y=z, x=z} Idea: apply reduction operator again after the vanilla transformer  x:=y  #3 = Explicate   x:=y  #1  Explicate Observation : after the first time we apply Explicate, all subsequent values will be in the image of the abstraction so really we only need to apply it once to the input Finally:  x:=y  # (X) = Explicate   x:=y  #1 – Best transformer for reduced elements (elements in the image of the abstraction) 133

Negative property of best transformers Let f # =   f   Best transformer does not compose  (f(f(  (a))))  f # (f # (a)) 134

 (f(f(  (a))))  f # (f # (a)) 135 CA 2 3 f 1 f#f# 6 5 4 f 7  f#f# 8 9 f

Soundness theorem 1 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  a  D A : f(  (a))   (f # (a)) Then lfp(f)   (lfp(f # ))  (lfp(f))  lfp(f # ) 136

Soundness theorem 1 137 CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   a  D A : f(  (a))   (f # (a))   a  D A : f n (  (a))   (f #n (a))   a  D A : lfp(f n )(  (a))   (lfp(f #n )(a))  lfp(f)   lfp(f # ) 

Soundness theorem 2 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  c  D C :  (f(c))  f # (  (c)) Then  (lfp(f))  lfp(f # ) lfp(f)   (lfp(f # )) 138

Soundness theorem 2 139 CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   c  D C :  (f(c))  f # (  (c))   c  D C :  (f n (c))  f #n (  (c))   c  D C :  (lfp(f)(c))  lfp(f # )(  (c))  lfp(f)   lfp(f # ) 

A recipe for a sound static analysis Define an “appropriate” operational semantics Define “collecting” structural operational semantics Establish a Galois connection between collecting states and abstract states Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics Global correctness: conclude that the analysis is sound 140

Completeness Local property: – forward complete:  c:  (f # (c)) =  (f(c)) – backward complete:  a: f(  (a)) =  (f # (a)) A property of domain and the (best) transformer Global property: –  (lfp(f)) = lfp(f # ) – lfp(f) =  (lfp(f # )) Very ideal but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties 141

Forward complete transformer 142 12 CA f 3 4 f#f#  c:  (f # (c)) =  (f(c))

Backward complete transformer 143 CA 12 f#f# 3 5 f  a: f(  (a)) =  (f # (a))

Global (backward) completeness 144 CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   a: f(  (a)) =  (f # (a))   a: f n (  (a)) =  (f #n (a))   a  D A : lfp(f n )(  (a)) =  (lfp(f #n )(a))  lfp(f)  = lfp(f # ) 

Global (forward) completeness 145 CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   c  D C :  (f(c)) = f # (  (c))   c  D C :  (f n (c)) = f #n (  (c))   c  D C :  (lfp(f)(c)) = lfp(f # )(  (c))  lfp(f)  = lfp(f # ) 

Three example analyses Abstract states are conjunctions of constraints Variable Equalities – VE-factoids = { x=y | x, y  Var}  false VE = (2 VE-factoids, , , , false,  ) Constant Propagation – CP-factoids = { x=c | x  Var, c  Z}  false CP = (2 CP-factoids, , , , false,  ) Available Expressions – AE-factoids = { x=y+z | x  Var, y,z  Var  Z}  false A = (2 AE-factoids, , , , false,  ) 146

Lattice combinators reminder Cartesian Product – L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) – Cart(L 1, L 2 ) = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) Disjunctive completion – L = (D, , , , ,  ) – Disj(L) = (2 D,  ,  ,  ,  ,   ) Relational Product – Rel(L 1, L 2 ) = Disj(Cart(L 1, L 2 )) 147

Cartesian product of complete lattices For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the poset L cart = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) as follows: – (x 1, x 2 )  cart (y 1, y 2 ) iff x 1  1 y 1 and x 2  2 y 2 –  cart = ?  cart = ?  cart = ?  cart = ? Lemma: L is a complete lattice Define the Cartesian constructor L cart = Cart(L 1, L 2 ) 148

Cartesian product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X)= ? –  A  B,C (Y) = ? 149

Cartesian product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) What about transformers? 150

Cartesian product transformers GC C,A =(C,  C,A,  A,C, A)F A [st] : A  A GC C,B =(C,  C,B,  B,C, B)F B [st] : B  B Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) How should we define F A  B [st] : A  B  A  B 151

Cartesian product transformers GC C,A =(C,  C,A,  A,C, A)F A [st] : A  A GC C,B =(C,  C,B,  B,C, B)F B [st] : B  B Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) How should we define F A  B [st] : A  B  A  B Idea: F A  B [st](a, b) = (F A [st] a, F B [st] b) Are component-wise transformers precise? 152

Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 153 a := 9; b := 9; c := a; CP analysisVE analysis {a=9} {a=9, b=9} {a=9, b=9, c=9} {} {} {c=a}

Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 154 CP analysis + VE analysis a := 9; b := 9; c := a; {a=9} {a=9, b=9} {a=9, b=9, c=9, c=a}

Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 155 CP  VE analysis Missing {a=b, b=c} a := 9; b := 9; c := a; {a=9} {a=9, b=9} {a=9, b=9, c=9, c=a}

Transformers for Cartesian product Naïve (component-wise) transformers do not utilize information from both components – Same as running analyses separately and then combining results Can we treat transformers from each analysis as black box and obtain best transformer for their combination? 156

Can we combine transformer modularly? No generic method for any abstract interpretations 157

Reducing values for CP  VE X = set of CP constraints of the form x=c (e.g., a=9 ) Y = set of VE constraints of the form x=y Reduce CP  VE (X, Y) = (X’, Y’) such that (X’, Y’)  (X’, Y’) Ideas? 158

Reducing values for CP  VE X = set of CP constraints of the form x=c (e.g., a=9 ) Y = set of VE constraints of the form x=y Reduce CP  VE (X, Y) = (X’, Y’) such that (X’, Y’)  (X’, Y’) ReduceRight: – if a=b  X and a=c  Y then add b=c to Y ReduceLeft: – If a=c and b=c  Y then add a=b to X Keep applying ReduceLeft and ReduceRight and reductions on each domain separately until reaching a fixed-point 159

Transformers for Cartesian product Do we get the best transformer by applying component-wise transformer followed by reduction? – Unfortunately, no (what’s the intuition?) – Can we do better? – Logical Product [Gulwani and Tiwari, PLDI 2006] 160

Product vs. reduced product 161 CP  VE lattice {a=9}{c=a}{c=9}{c=a} {a=9, c=9}{c=a} {[a  9, c  9]} collecting lattice {}    

Reduced product For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the reduced poset D 1  D 2 = {(d 1,d 2 )  D 1  D 2 | (d 1,d 2 ) =    (d 1,d 2 ) } L 1  L 2 = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) 162

Transformers for Cartesian product Do we get the best transformer by applying component-wise transformer followed by reduction? – Unfortunately, no (what’s the intuition?) – Can we do better? – Logical Product [Gulwani and Tiwari, PLDI 2006] 163

Logical product-- Assume A=(D,…) is an abstract domain that supports two operations: for x  D – inferEqualities(x) = { a=b |  (x)  a=b } returns a set of equalities between variables that are satisfied in all states given by x – refineFromEqualities(x, {a=b}) = y such that  (x)=  (y) y  x 165

Developing a transformer for EQ - 1 Input has the form X =  {a=b} sp(x:=expr,  ) =  v. x=expr[v/x]   [v/x] sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Let’s define helper notations: – EQ(X, y) = {y=a, b=y  X} Subset of equalities containing y – EQc(X, y) = X \ EQ(X, y) Subset of equalities not containing y 166

Developing a transformer for EQ - 2 sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Two cases – x is y: sp(x:=y,  X) = X – x is different from y: sp(x:=y,  X) =  v. x=y  EQ)X, x)[v/x]  EQc(X, x)[v/x] = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  x=y  EQc(X, x) Vanilla transformer:  x:=y  #1 X = x=y  EQc(X, x) Example:  x:=y  #1  {x=p, q=x, m=n} =  {x=y, m=n} Is this the most precise result? 167

Developing a transformer for EQ - 3  x:=y  #1  {x=p, x=q, m=n} =  {x=y, m=n}   {x=y, m=n, p=q} – Where does the information p=q come from? sp(x:=y,  X) = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that? 168

Developing a transformer for EQ - 4 Define a reduction operator: Explicate(X) = if exist {a=b, b=c}  X but not {a=c}  X then Explicate(X  {a=c}) else X Define  x:=y  #2 =  x:=y  #1  Explicate  x:=y  #2 )  {x=p, x=q, m=n}) =  {x=y, m=n, p=q} is this the best transformer? 169

Developing a transformer for EQ - 5  x:=y  #2 )  {y=z}) = {x=y, y=z}  {x=y, y=z, x=z} Idea: apply reduction operator again after the vanilla transformer  x:=y  #3 = Explicate   x:=y  #1  Explicate 170

Logical Product- 171 basically the strongest postcondition safely abstracting the existential quantifier

Abstracting the existential 172 Reduce the pair Abstract away existential quantifier for each domain

Example 173

Information loss example 174 if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b=  } can’t prove

Disjunctive completion of a lattice For a complete lattice L = (D, , , , ,  ) Define the powerset lattice L  = (2 D,  ,  ,  ,  ,   )   = ?   = ?   = ?   = ?   = ? Lemma: L  is a complete lattice L  contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates Define the disjunctive completion constructor L  = Disj(L) 175

Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = ? –  P(A),C (Y) = ? 176

Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = {  C,A ({x}) | x  X} –  P(A),C (Y) =  {  P(A) (y) | y  Y} What about transformers? 177

Information loss example 178 if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b= 5  b=-5 } {b= 0 } proved

The base lattice CP 179 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false

The disjunctive completion of CP 180 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false {x=-2  x=-1}{x=-2  x=0}{x=-2  x=1}{x=1  x=2} ……… {x=0  x=1  x=2}{x=-1  x=1  x=-2} ……… … What is the height of this lattice?

Taming disjunctive completion Disjunctive completion is very precise – Maintains correlations between states of different analyses – Helps handle conditions precisely – But very expensive – number of abstract states grows exponentially – May lead to non-termination Base analysis (usually product) is less precise – Analysis terminates if the analyses of each component terminates How can we combine them to get more precision yet ensure termination and state explosion? 181

Taming disjunctive completion Use different abstractions for different program locations – At loop heads use coarse abstraction (base) – At other points use disjunctive completion Termination is guaranteed (by base domain) Precision increased inside loop body 182

With Disj(CP) 183 while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 } Doesn’t terminate

With tamed Disj(CP) 184 while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 } terminates CP Disj(CP) What MultiCartDomain implements

Reducing disjunctive elements A disjunctive set X may contain within it an ascending chain Y=a  b  c… We only need max(Y) – remove all elements below 185

Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = ? 186

Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = Disj(Cart(L 1, L 2 )) Lemma: L is a complete lattice What does it buy us? – How is it relative to Cart(Disj(L 1 ), Disj(L 2 ))? What about transformers? 187

Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = ? –  P(A  B),C (Y) = ? 188

Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = {(  C,A ({x}),  C,B ({x})) | x  X} –  P(A  B),C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} 189

Cartesian product example 190 Correlations preserved

Function space GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Denote the set of monotone functions from A to B by A  B Define  for elements of A  B as follows (a 1, b 1 )  (a 2, b 2 ) = if a 1 =a 2 then {(a 1, b 1  B b 1 )} else {(a 1, b 1 ), (a 2, b 2 )} Reduced cardinal power GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) =  {(  C,A ({x}),  C,B ({x})) | x  X} –  A  B,C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} Useful when A is small and B is much larger – E.g., typestate verification 191

Widening/Narrowing 192

How can we prove this automatically? 193 RelProd(CP, VE)

Intervals domain One of the simplest numerical domains Maintain for each variable x an interval [L,H] – L is either an integer of -  – H is either an integer of +  A (non-relational) numeric domain 194

Intervals lattice for variable x 195  [0,0][-1,-1][-2,-2][1,1][2,2]... [- ,+  ] [0,1][1,2][2,3][-1,0][-2,-1] [-10,10] [1, +  ][ - ,0 ]... [2, +  ][0, +  ][ - ,-1 ]... [-20,10]

Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] ? – [1,4]  [1,3] ? – [1,3]  [1,4] ? – [1,3]  [- ,+  ] ? What is the lattice height? 196

Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] no – [1,4]  [1,3] no – [1,3]  [1,4] yes – [1,3]  [- ,+  ]yes What is the lattice height? Infinite 197

Joining/meeting intervals [a,b]  [c,d] = ? – [1,1]  [2,2] = ? – [1,1]  [2, +  ] = ? [a,b]  [c,d] = ? – [1,2]  [3,4] = ? – [1,4]  [3,4] = ? – [1,1]  [1,+  ] = ? Check that indeed x  y if and only if x  y=y 198

Joining/meeting intervals [a,b]  [c,d] = [min(a,c), max(b,d)] – [1,1]  [2,2] = [1,2] – [1,1]  [2,+  ] = [1,+  ] [a,b]  [c,d] = [max(a,c), min(b,d)] if a proper interval and otherwise  – [1,2]  [3,4] =  – [1,4]  [3,4] = [3,4] – [1,1]  [1,+  ] = [1,1] Check that indeed x  y if and only if x  y=y 199

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = ? 200

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas? 201

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas? – Two types of factoids x  c and x  c – Example: S =  {x  9, y  5, y  10} – Helper operations c + +  = +  remove(S, x) = S without any x-constraints lb(S, x) = 202

Assignment transformers  x := c  # S = ?  x := y  # S = ?  x := y+c  # S = ?  x := y+z  # S = ?  x := y*c  # S = ?  x := y*z  # S = ? 203

Assignment transformers  x := c  # S = remove(S,x)  {x  c, x  c}  x := y  # S = remove(S,x)  {x  lb(S,y), x  ub(S,y)}  x := y+c  # S = remove(S,x)  {x  lb(S,y)+c, x  ub(S,y)+c}  x := y+z  # S = remove(S,x)  {x  lb(S,y)+lb(S,z), x  ub(S,y)+ub(S,z)}  x := y*c  # S = remove(S,x)  if c>0 {x  lb(S,y)*c, x  ub(S,y)*c} else {x  ub(S,y)*-c, x  lb(S,y)*-c}  x := y*z  # S = remove(S,x)  ? 204

assume transformers  assume x=c  # S = ?  assume x<c  # S = ?  assume x=y  # S = ?  assume x  c  # S = ? 205

assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = ? 206

assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = (S  {x  c-1})  (S  {x  c+1}) 207

Effect of function f on lattice elements L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 208 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn()

Effect of function f on lattice elements L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 209 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn()

Continuity and ACC condition Let L = (D, , ,  ) be a complete partial order – Every ascending chain has an upper bound A function f is continuous if for every increasing chain Y  D*, f(  Y) =  { f(y) | y  Y } L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d 0  d 1  …  d n = d n+1 = … 210

Fixed-point theorem [Kleene] Let L = (D, , ,  ) be a complete partial order and a continuous function f: D  D then lfp(f) =  n  N f n (  ) 211

Resulting algorithm Kleene’s fixed point theorem gives a constructive method for computing the lfp 212   lfp fn()fn() f()f() f2()f2() … d :=  while f(d)  d do d := d  f(d) return d Algorithm lfp(f) =  n  N f n (  ) Mathematical definition

Chaotic iteration 213 Input: – A cpo L = (D, , ,  ) satisfying ACC – L n = L  L  …  L – A monotone function f : D n  D n – A system of equations { X[i] | f(X) | 1  i  n } Output: lfp(f) A worklist-based algorithm for i:=1 to n do X[i] :=  WL = {1,…,n} while WL   do j := pop WL // choose index non-deterministically N := F[i](X) if N  X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) return X

Concrete semantics equations 214 R[0] = { x  Z} R[1] =  x:=7  R[2] = R[1]  R[4] R[3] = R[2]  {s | s(x) < 1000} R[4] =  x:=x+1  R[3] R[5] = R[2]  {s | s(x)  1000} R[6] = R[5]  {s | s(x)  1001} R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Abstract semantics equations 215 R[0] =  ({ x  Z}) R[1] =  x:=7  # R[2] = R[1]  R[4] R[3] = R[2]   ({s | s(x) < 1000}) R[4] =  x:=x+1  # R[3] R[5] = R[2]   ({s | s(x)  1000}) R[6] = R[5]   ({s | s(x)  1001})  R[5]   ({s | s(x)  999}) R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Abstract semantics equations 216 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[3] = R[2]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Too many iterations to converge 217

How many iterations for this one? 218

Widening Introduce a new binary operator to ensure termination – A kind of extrapolation Enables static analysis to use infinite height lattices – Dynamically adapts to given program Tricky to design Precision less predictable then with finite- height domains (widening non-monotone) 219

Formal definition For all elements d 1  d 2  d 1  d 2 For all ascending chains d 0  d 1  d 2  … the following sequence is finite – y 0 = d 0 – y i+1 = y i  d i+1 For a monotone function f : D  D define – x 0 =  – x i+1 = x i  f(x i ) Theorem: – There exits k such that x k+1 = x k – x k  Red(f) = { d | d  D and f(d)  d } 220

Analysis with finite-height lattice 221 A  f #n  = lpf(f # )  … f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f)

Analysis with widening 222 A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

Widening for Intervals Analysis   [c, d] = [c, d] [a, b]  [c, d] = [ if a  c then a else - , if b  d then b else  223

Semantic equations with widening 224 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Choosing analysis with widening 225 Enable widening

Non monotonicity of widening [0,1]  [0,2] = ? [0,2]  [0,2] = ?

Non monotonicity of widening [0,1]  [0,2] = [0,  ] [0,2]  [0,2] = [0,2]

Analysis results with widening 228 Did we prove it?

Analysis with narrowing 229 A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

Formal definition of narrowing Improves the result of widening y  x  y  (x  y)  x For all decreasing chains x 0  x 1  … the following sequence is finite – y 0 = x 0 – y i+1 = y i  x i+1 For a monotone function f: D  D and x k  Red(f) = { d | d  D and f(d)  d } define – y 0 = x – y i+1 = y i  f(y i ) Theorem: – There exits k such that y k+1 =y k – y k  Red(f) = { d | d  D and f(d)  d }

Narrowing for Interval Analysis [a, b]   = [a, b] [a, b]  [c, d] = [ if a = -  then c else a, if b =  then d else b ]

Semantic equations with narrowing 232 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Analysis with widening/narrowing Two phases – Phase 1: analyze with widening until converging – Phase 2: use values to analyze with narrowing 233 Phase 2: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] Phase 1: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ]

Analysis with widening/narrowing 234

Analysis results widening/narrowing 235 Precise invariant

Program Analysis and Verification 0368-4479

Similar presentations

Presentation on theme: "Program Analysis and Verification 0368-4479"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Program Analysis and Verification 0368-4479

Similar presentations

Presentation on theme: "Program Analysis and Verification 0368-4479"— Presentation transcript:

Similar presentations

About project

Feedback