Presentation is loading. Please wait.

Presentation is loading. Please wait.

Program Analysis and Verification 0368-4479

Similar presentations


Presentation on theme: "Program Analysis and Verification 0368-4479"— Presentation transcript:

1 Program Analysis and Verification 0368-4479 http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html Noam Rinetzky Lecture 9: Abstract Interpretation II Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

2 From verification to analysis Manual program verification – Verifier provides assertions Loop invariants Program analysis – Automatic program verification – Tool automatically synthesize assertions Finds loop invariants

3 Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis 3

4 Abstract Interpretation [Cousot’77] Mathematical framework for approximating semantics (aka abstraction) – Allows designing sound static analysis algorithms Usually compute by iterating to a fixed-point – Computes (loop) invariants Can be interpreted as axiomatic verification assertions Generalizes Hoare Logic & WP / SP calculus 4

5 Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states ~ Assertions Join (  ) ~ Weakening – Transformer functions Abstract steps ~ Axioms – Chaotic iteration Structured Programs ~ Control-flow graphs Abstract computation ~ Loop invariants 5

6 Concrete Semantics 6 set of states operational semantics (concrete semantics) statement S set of states

7 Conservative Semantics 7 set of states operational semantics (concrete semantics) statement S set of states 

8 Abstract (conservative) interpretation 8 set of states operational semantics (concrete semantics) statement S abstract representation  abstract representation abstract semantics statement S abstract representation abstraction {P}{P}S{Q}{Q}sp(S, P)  generalizes axiomatic verification

9 Abstract (conservative) interpretation 9 set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

10 Abstract (conservative) interpretation 10 set of states operational semantics (concrete semantics) statement S set of states abstract state abstract semantics (transfer function) statement S abstract state concretization 

11 Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states ~ Assertions Join (  ) ~ Weakening – Transformer functions Abstract steps ~ Axioms – Chaotic iteration Abstract computation ~ Loop invariants Structured Programs ~ Control-flow graphs 11 Lattices (D, , , , ,  ) Lattices (D, , , , ,  ) Monotonic functions Fixpoints

12 A taxonomy of semantic domain types 12 Complete Lattice (D, , , , ,  ) Lattice (D, , , , ,  ) Join semilattice (D, , ,  ) Meet semilattice (D, , ,  ) Complete partial order (CPO) (D, ,  ) Partial order (poset) (D,  ) Preorder (D,  )

13 Preorder 13 We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’

14 Preorder 14 d d d’ d’’        We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ Hasse Diagram

15 Preorder 15 We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ d d d’ d’’        Hasse Diagram

16 Partial order We say that a binary order relation  over a set D is a preorder if the following conditions hold for every d, d’, d’’  D – Reflexive: d  d – Transitive: d  d’ and d’  d’’ implies d  d’’ – Anti-symmetric: d  d’ and d’  d implies d = d’ 16 d d d’ d’’        Hasse Diagram

17 Chains d  d’ means d  d’ and d  d’ An ascending chain is a sequence x 1  x 2  …  x k … A descending chain is a sequence x 1  x 2  …  x k … The height of a poset (D,  ) is the length of the maximal ascending chain in D 17

18 poset Hasse diagram (for CP) 18 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

19 Some posets-related terminology 19

20 Least upper bound (LUB) (D,  ) is a poset b ∊ D is an upper bound of A ⊆ D if ∀ a  A: a  b b ∊ D is the least upper bound of A ⊆ D if – b is an upper bound of A – If b’ is an upper bound of A then b  b’ – Join:  X = LUB of X x  y =  {x, y} May not exist

21 Join operator Properties of a join operator – Commutative: x  y = y  x – Associative: (x  y)  z = x  (y  z) – Idempotent: x  x = x A kind of abstract union (disjunction) operator Top element of (D,  ) is  =  D 21

22 Join Example 22 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

23 Join Example 23 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

24 Join Example 24 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

25 Greatest lower bound (GLB) (D,  ) is a poset b ∊ D is an lower bound of A ⊆ D if ∀ a  A: b  a b ∊ D is the greatest lower bound of A ⊆ D if – b is an lower bound of A – If b’ is an lower bound of A then b’  b – Meet:  X = GLB of X x  y =  {x, y} May not exist

26 Meet operator Properties of a meet operator – Commutative: x  y = y  x – Associative: (x  y)  z = x  (y  z) – Idempotent: x  x = x A kind of abstract intersection (conjunction) operator Bottom element of (D,  ) is  =  D 26

27 Complete partial order (CPO) A poset (D,  ) is a complete partial if every ascending chain x 1  x 2  …  x k … has a LUB 27

28 Meet Example 28   x=0 x0x0 x<0x>0 x0x0

29 Meet Example 29   x=0 x0x0 x<0x>0 x0x0

30 Meet Example 30   x=0 x0x0 x<0x>0 x0x0

31 Complete partial order (CPO) A poset (D,  ) is a complete partial if every ascending chain x 1  x 2  …  x k … has a LUB 31

32 Join semilattices (D, , ,  ) is a join semilattice – (D,  ) is a partial order – ∀ X  FIN D.  X is defined – A top element  32

33 Meet semilattices (D, , ,  ) is a meet semilattice – (D,  ) is a partial order – ∀ X  FIN D.  X is defined – A bottom element  33

34 Lattices (D, , , , ,  ) is a lattice if – (D, , ,  ) is a join semilattice – (D, , ,  ) is a meet semilattice A lattice (D, , , , ,  ) is a complete lattice if –  X and  Y are defined for arbitrary sets 34

35 Example: Powerset lattices (2 X, , , , , X) is the powerset lattice of X – A complete lattice 35

36 Example: Sign lattice 36   x=0 x0x0 x<0x>0 x0x0

37 A taxonomy of semantic domain types 37 Complete Lattice (D, , , , ,  ) Lattice (D, , , , ,  ) Join semilattice (D, , ,  ) Meet semilattice (D, , ,  ) Join/Meet exist for every subset of D Join/Meet exist for every finite subset of D (alternatively, binary join/meet) Join of the empty set Meet of the empty set Complete partial order (CPO) (D, ,  ) Partial order (poset) (D,  ) Preorder (D,  ) reflexive transitive anti-symmetric: d  d’ and d’  d implies d = d’ reflexive: d  d transitive: d  d’, d’  d’’ implies d  d’’ poset with LUB for all ascending chains

38 Towards a recipe for static analysis 38

39 Collecting semantics For a set of program states State, we define the collecting lattice (2 State, , , , , State) The collecting semantics accumulates the (possibly infinite) sets of states generated during the execution – Not computable in general 39

40 Abstract (conservative) interpretation 40 set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

41 Abstract (conservative) interpretation 41 {x ↦ 1, x ↦ 2, …}{x ↦ 0, x ↦ 1, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …}  0 < x abstract semantics x = x -1 0 ≤ x concretization

42 Abstract (conservative) interpretation 42 {x ↦ 1, x ↦ 2, …}{…, x ↦ 0, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …}  0 < x abstract semantics x = x -1  concretization

43 Abstract (non-conservative) interpretation 43 {x ↦ 1, x ↦ 2, …}{ x ↦ 1, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …} ⊈ 0 < x abstract semantics x = x -1 0 < x concretization

44 But … what if we have x & y? – Define lattice (semantics) for each variable – Compose lattices Goal: compositional definition What if we have more than 1 statement? – Define semantics for entire program via CFG – Different “abstract states” at every CFG node 44

45 One lattice per variable 45 true false x=0 x0x0 x<0x>0 x0x0 true false y=0 y0y0 y<0y>0 y0y0 How can we compose them?

46 Cartesian product of complete lattices For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the poset L cart = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) as follows: – (x 1, x 2 )  cart (y 1, y 2 ) iff x 1  1 y 1 x 2  2 y 2 –  cart = ?  cart = ?  cart = ?  cart = ? Lemma: L is a complete lattice Define the Cartesian constructor L cart = Cart(L 1, L 2 ) 46

47 Cartesian product example 47  =( ,  ) x<0,y<0x<0,y=0x 0x=0,y<0x=0,y=0x=0,y>0x>0,y<0x>0,y=0x>0,y>0 x  0,y< 0 x  0,y< 0 x  0,y= 0 x  0,y= 0 x  0,y> 0 x  0,y> 0 x>0,y  0 … … x  0,y  0 x  0,y  0 x  0,y  0 x  0,y  0 x0x0 x0x0 y0y0 y0y0 … ( false, false ) How does it represent (x 0  y>0)?  =( ,  )

48 Disjunctive completion For a complete lattice L = (D, , , , ,  ) Define the powerset lattice L  = (2 D,  ,  ,  ,  ,   )   = ?   = ?   = ?   = ?   = ? Lemma: L  is a complete lattice L  contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates Define the disjunctive completion constructor L  = Disj(L) 48

49 The base lattice CP 49 {x=0}  {x=-1}{x=-2}{x=1}{x=2} …… 

50 The disjunctive completion of CP 50 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false {x=-2  x=-1}{x=-2  x=0}{x=-2  x=1}{x=1  x=2} ……… {x=0  x=1  x=2}{x=-1  x=1  x=-2} ……… … What is the height of this lattice?

51 Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = ? 51

52 Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = Disj(Cart(L 1, L 2 )) Lemma: L is a complete lattice What does it buy us? 52

53 Cartesian product example 53 true false x<0,y<0x<0,y=0x 0x=0,y<0x=0,y=0x=0,y>0x>0,y<0x>0,y=0x>0,y>0 x  0,y< 0 x  0,y< 0 x  0,y= 0 x  0,y= 0 x  0,y> 0 x  0,y> 0 x>0,y  0 … … x  0,y  0 x  0,y  0 x  0,y  0 x  0,y  0 x0x0 x0x0 y0y0 y0y0 … How does it represent (x 0  y>0)? What is the height of this lattice?

54 Relational product example 54 true false (x 0  y>0) x0x0 x0x0 y0y0 y0y0 How does it represent (x 0  y>0)? (x 0  y=0)(x<0  y  0)  (x<0  y  0) … What is the height of this lattice?

55 Collecting semantics 1 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 2 3 4 5 if x > 0 x := x - 1 2 3 entry exit [x1][x1] [x1][x1] [x1][x1] [x0][x0] [x0][x0] [ x  -1] [x2][x2] [x2][x2] [x2][x2] [x2][x2] [x3][x3] [x3][x3] [x3][x3] … … … 55 [ x  -2] …

56 Defining the collecting semantics How should we represent the set of states at a given control-flow node by a lattice? How should we represent the sets of states at all control-flow nodes by a lattice? 56

57 Finite maps For a complete lattice L = (D, , , , ,  ) and finite set V Define the poset L V  L = (V  D,  V  L,  V  L,  V  L,  V  L,  V  L ) as follows: – f 1  V  L f 2 iff for all v  V f 1 (v)  f 2 (v) –  V  L = ?  V  L = ?  V  L = ?  V  L = ? Lemma: L is a complete lattice Define the map constructor L V  L = Map(V, L) 57

58 The collecting lattice Lattice for a given control-flow node v: ? Lattice for entire control-flow graph with nodes V: ? We will use this lattice as a baseline for static analysis and define abstractions of its elements 58

59 The collecting lattice Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements 59

60 Equational definition of the semantics Define variables of type set of states for each control-flow node Define constraints between them 60 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit]

61 Equational definition of the semantics R[2] = R[entry]   x:=x-1  R[3] R[3] = R[2]  {s | s(x) > 0} R[exit] = R[2]  {s | s(x)  0} A system of recursive equations How can we approximate it using what we have learned so far? 61 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit]

62 An abstract semantics R[2] = R[entry]   x:=x-1  # R[3] R[3] = R[2]  {s | s(x) > 0} # R[exit] = R[2]  {s | s(x)  0} # A system of recursive equations 62 if x > 0 x := x - 1 2 3 entry exit R[entry] R[2] R[3] R[exit] Abstract transformer for x:=x-1 Abstract representation of {s | s(x) < 0}

63 Abstract interpretation via abstraction 63 set of states collecting semantics statement S abstract representation of sets of states  abstract representation of sets of states abstract semantics statement S abstract representation of sets of states abstraction {P}{P}S{Q}{Q}sp(S, P)  generalizes axiomatic verification

64 Abstract interpretation via concretization 64 set of states collecting semantics statement S set of states  abstract representation of sets of states abstract semantics statement S abstract representation of sets of states concretization {P}{P}S{Q}{Q} models(P)models(sp(S, P))models(Q) 

65 Required knowledge Collecting semantics Abstract semantics Connection between collecting semantics and abstract semantics Algorithm to compute abstract semantics 65

66 The collecting lattice (sets of states) Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements 66

67 Collecting semantics example 1 2 3 4 5 67 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: if x > 0 x := x-1 entry exit 2 3 [x1][x1][x0][x0][ x  -1][x2][x2][x2][x2][x3][x3] … [x1][x1][x0][x0] [x2][x2][x2][x2][x3][x3] … [x1][x1] [x3][x3][x4][x4] [x2][x2] … [ x  -2] … [x0][x0] [x0][x0][x1][x1] [x2][x2][x3][x3] …

68 Shorthand notation 1 2 3 4 5 68 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: if x > 0 x := x-1 entry exit {xZ}{xZ} {xZ}{xZ} { x >0} { x  0} { x  0} 2 3

69 Equational definition example A vector of variables R[0, 1, 2, 3, 4] R[0] = { x  Z} // established input R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] A (recursive) system of equations 69 if x > 0 x := x-1 entry exit R[0] R[1] R[2] R[4] R[3] Semantic function for assume x>0 Semantic function for x:=x-1 lifted to sets of states

70 General definition A vector of variables R[0, …, k] one per input/output of a node – R[0] is for entry For node n with multiple predecessors add equation R[n] =  {R[k] | k is a predecessor of n} For an atomic operation node R[m] S R[n] add equation R[n] =  S  R[m] Treat true/false branches of conditions as atomic statements assume bexpr and assume  bexpr 70 if x > 0 x := x-1 entry exit R[0] R[1] R[2] R[4] R[3]

71 Equation systems in general Let L be a complete lattice (D, , , , ,  ) Let R be a vector of variables R[0, …, n]  D  …  D Let F be a vector of functions of the type F[i] : R[0, …, n]  R[0, …, n] A system of equations R[0] = f[0](R[0], …, R[n]) … R[n] = f[n](R[0], …, R[n]) In vector notation R = F(R) Questions: 1.Does a solution always exist? 2.If so, is it unique? 3.If so, is it computable? 71

72 Equation systems in general Let L be a complete lattice (D, , , , ,  ) Let R be a vector of variables R[0, …, n]  D  …  D Let F be a vector of functions of the type F[i] : R[0, …, n]  R[0, …, n] A system of equations R[0] = f[0](R[0], …, R[n]) … R[n] = f[n](R[0], …, R[n]) In vector notation R = F(R) Questions: 1.Does a solution always exist? 2.If so, is it unique? 3.If so, is it computable? 72

73 Monotone functions Let L 1 =(D 1,  ) and L 2 =(D 2,  ) be two posets A function f : D 1  D 2 is monotone if for every pair x, y  D 1 x  y implies f(x)  f(y) A special case: L 1 =L 2 =(D,  ) f : D  D 73

74 Important cases of monotonicity Join: f(X, Y) = X  Y Prove it! For a set X and any function g F(X) = { g(x) | x  X } Prove it! Notice that the collecting semantics function is defined in terms of – Join (set union) – Semantic function for atomic statements lifted to sets of states 74

75 Extensive/reductive functions Let L=(D,  ) be a poset A function f : D  D is extensive if for every pair x  D, we have that x  f(x) A function f : D  D is reductive if for every pair x  D, we have that x  f(x) 75

76 Fixed-points L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 76 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn() 1.Does a solution always exist? Yes 2.If so, is it unique? No, but it has least/greatest solutions 3.If so, is it computable? Under some conditions…

77 Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 77 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : Fixed-point = d

78 Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 78 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <-5} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : pre Fixed-point  d

79 Fixed point example for program R[0] = { x  Z} R[1] = R[0]  R[4] R[2] = R[1]  {s | s(x) > 0} R[3] = R[1]  {s | s(x)  0} R[4] =  x:=x-1  R[2] 79 if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <9} if x>0 x := x-1 2 3 entry exit xZxZ xZxZ { x >0}{ x <0} F(d) : post Fixed-point  d

80 Continuity and ACC condition Let L = (D, , ,  ) be a complete partial order – Every ascending chain has an upper bound A function f is continuous if for every increasing chain Y  D*, f(  Y) =  { f(y) | y  Y } L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d 0  d 1  …  d n = d n+1 = … 80

81 Fixed-point theorem [Kleene] Let L = (D, , ,  ) be a complete partial order and a continuous function f: D  D then lfp(f) =  n  N f n (  ) Lemma: Monotone functions on posets satisfying ACC are continuous Proof: 1.Every ascending chain Y eventually stabilizes d 0  d 1  …  d n = d n+1 = … hence d n is the least upper bound of {d 0, d 1, …, d n }, thus f(  Y) = f(d n ) 2.From monotonicity of f f(d 0 )  f(d 1 )  …  f(d n ) = f(d n+1 ) = … Hence f(d n ) is the least upper bound of {f(d 0 ), f(d 1 ), …, f(d n )}, thus  { f(y) | y  Y } = f(d n ) 81

82 Resulting algorithm Kleene’s fixed point theorem gives a constructive method for computing the lfp 82   lfp fn()fn() f()f() f2()f2() … d :=  while f(d)  d do d := d  f(d) return d Algorithm lfp(f) =  n  N f n (  ) Mathematical definition

83 Chaotic iteration 83 Input: – A cpo L = (D, , ,  ) satisfying ACC – L n = L  L  …  L – A monotone function f : D n  D n – A system of equations { X[i] | f(X) | 1  i  n } Output: lfp(f) A worklist-based algorithm for i:=1 to n do X[i] :=  WL = {1,…,n} while WL   do j := pop WL // choose index non-deterministically N := F[i](X) if N  X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) return X

84 Chaotic iteration for static analysis Specialize chaotic iteration for programs Create a CFG for program Choose a cpo of properties for the static analysis to infer: L = (D, , ,  ) Define variables R[0,…,n] for input/output of each CFG node such that R[i]  D For each node v let v out be the variable at the output of that node: v out = F[v](  u | (u,v) is a CFG edge) – Make sure each F[v] is monotone Variable dependence determined by outgoing edges in CFG 84

85 Constant propagation example 85 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit x := 4; while (y  5) do z := x; x := 4

86 Constant propagation lattice For each variable x define L as For a set of program variables Var=x 1,…,x n L n = L  L  …  L 86  0-212...  no information not-a-constant

87 Write down variables 87 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit x := 4; while (y  5) do z := x; x := 4

88 Write down equations 88 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R1R1 R5R5 R0R0 x := 4; while (y  5) do z := x; x := 4

89 Collecting semantics equations 89 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R 0 = State R 1 =  x:=4  R 0 R 2 = R 1  R 5 R 3 =  assume y  5  R 2 R 4 =  z:=x  R 3 R 5 =  x:=4  R 4 R 6 =  assume y=5  R 2 R1R1 R5R5 R0R0

90 Constant propagation equations 90 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R2R2 R3R3 R4R4 R6R6 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R1R1 R5R5 R0R0 abstract transformer

91 Abstract operations for CP 91 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 Lattice elements have the form: (v x, v y, v z )  x:=4  # (v x,v y,v z ) = (4, v y, v z )  z:=x  # (v x,v y,v z ) = (v x, v y, v x )  assume y  5  # (v x,v y,v z ) = (v x, v y, v x )  assume y=5  # (v x,v y,v z ) = if v y = k  5 then ( , ,  ) else (v x, 5, v z ) R 1  R 5 = (a 1, b 1, c 1 )  (a 5, b 5, c 5 ) = (a 1  a 5, b 1  b 5, c 1  c 5 )  0-212...  CP lattice for a single variable

92 Chaotic iteration for CP: initialization 92 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =( , ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 0, R 1, R 2, R 3, R 4, R 5, R 6 }

93 Chaotic iteration for CP 93 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =( , ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 1, R 2, R 3, R 4, R 5, R 6 }

94 Chaotic iteration for CP 94 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 2, R 3, R 4, R 5, R 6 }

95 Chaotic iteration for CP 95 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =( , ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  )  0-212...  3 4 WL = {R 2, R 3, R 4, R 5, R 6 }

96 Chaotic iteration for CP 96 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =(4, ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  )  0-212...  3 4 WL = {R 2, R 3, R 4, R 5, R 6 }

97 Chaotic iteration for CP 97 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R 2 =(4, ,  ) R2R2 R2R2 R 3 =( , ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 1 =(4, ,  ) R 5 =( , ,  ) R 0 =( , ,  ) WL = {R 3, R 4, R 5, R 6 }

98 Chaotic iteration for CP 98 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 3 =(4, ,  ) R 4 =( , ,  ) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =( , ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 4, R 5, R 6 }

99 Chaotic iteration for CP 99 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 3 =(4, ,  ) R 4 =(4, , 4) R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =( , ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 5, R 6 }

100 Chaotic iteration for CP 100 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =( , ,  ) R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 2, R 6 } added R 2 back to worklist since it depends on R 5

101 Chaotic iteration for CP 101 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =( , ,  ) R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) WL = {R 6 }

102 Chaotic iteration for CP 102 R 0 =  R 1 =  x:=4  # R 0 R 2 = R 1  R 5 R 3 =  assume y  5  # R 2 R 4 =  z:=x  # R 3 R 5 =  x:=4  # R 4 R 6 =  assume y=5  # R 2 x := 4 if (*) assume y  5 assume y=5 z := x x := 4 entry exit R2R2 R2R2 R 6 =(4, 5,  ) R 5 =(4, , 4) R 4 =(4, , 4) R 3 =(4, ,  ) R 1 =(4, ,  ) R 0 =( , ,  ) R 2 =(4, ,  ) Fixed-point WL = {} In practice maintain a worklist of nodes

103 Chaotic iteration for static analysis Specialize chaotic iteration for programs Create a CFG for program Choose a cpo of properties for the static analysis to infer: L = (D, , ,  ) Define variables R[0,…,n] for input/output of each CFG node such that R[i]  D For each node v let v out be the variable at the output of that node: v out = F[v](  u | (u,v) is a CFG edge) – Make sure each F[v] is monotone Variable dependence determined by outgoing edges in CFG 103

104 Complexity of chaotic iteration Parameters: – n the number of CFG nodes – k is the maximum in-degree of edges – Height h of lattice L – c is the maximum cost of Applying F v  Checking fixed-point condition for lattice L Complexity: O(n  h  c  k) Incremental (worklist) algorithm reduces the n factor in factor – Implement worklist by priority queue and order nodes by reversed topological order 104

105 Required knowledge Collecting semantics Abstract semantics (over lattices) Algorithm to compute abstract semantics (chaotic iteration) Connection between collecting semantics and abstract semantics Abstract transformers 105

106 Recap We defined a reference semantics – the collecting semantics We defined an abstract semantics for a given lattice and abstract transformers We defined an algorithm to compute abstract least fixed-point when transformers are monotone and lattice obeys ACC Questions: 1.What is the connection between the two least fixed- points? 2.Transformer monotonicity is required for termination – what should we require for correctness? 106

107 Recap We defined a reference semantics – the collecting semantics We defined an abstract semantics for a given lattice and abstract transformers We defined an algorithm to compute abstract least fixed-point when transformers are monotone and lattice obeys ACC Questions: 1.What is the connection between the two least fixed- points? 2.Transformer monotonicity is required for termination – what should we require for correctness? 107

108 Galois Connection Given two complete lattices C = (D C,  C,  C,  C,  C,  C )– concrete domain A = (D A,  A,  A,  A,  A,  A )– abstract domain A Galois Connection (GC) is quadruple (C, , , A) that relates C and A via the monotone functions – The abstraction function  : D C  D A – The concretization function  : D A  D C for every concrete element c  D C and abstract element a  D A  (  (a))  a and c   (  (c)) Alternatively  (c)  a iff c   (a) 108

109 Galois Connection: c   (  (c)) 109 1   c 2 (c)(c) 3  (  (c))  The most precise (least) element in A representing c CA

110 Galois Connection:  (  (a))  a 110 1   3  (  (a)) 2 (a)(a) a  CA What a represents in C (its meaning)

111 Example: lattice of equalities Concrete lattice: C = (2 State, , , , , State) Abstract lattice: EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  ) – Treat elements of A as both formulas and sets of constraints Useful for copy propagation – a compiler optimization  (X) = ?  (Y) = ? 111

112 Example: lattice of equalities Concrete lattice: C = (2 State, , , , , State) Abstract lattice: EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  ) – Treat elements of A as both formulas and sets of constraints Useful for copy propagation – a compiler optimization  (s) =  ({s}) = { x=y | s x = s y} that is s  x=y  (X) =  {  (s) | s  X} =  A {  (s) | s  X}  (Y) = { s | s   Y } = models(  Y) 112

113 Galois Connection: c   (  (c)) 113 1   [x  5, y  5, z  5] 2 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 3 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] …  4 x=x, y=y, z=z  The most precise (least) element in A representing [x  5, y  5, z  5] CA

114 Most precise abstract representation 114 1 c 5  CA 4 6 2 73   8 9  (c)(c)  (c) =  {c’ | c   (c’)} 

115 Most precise abstract representation 115 1 c 5  CA 4 6 2 73   8 9   (c)= x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y  (c) =  {c’ | c   (c’)} [x  5, y  5, z  5] x=y, y=z x=y, z=y x=y

116 Galois Connection:  (  (a))  a 116 1   3 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 2 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] … x=y, y=z  What a represents in C (its meaning)    is called a semantic reduction CA

117 Galois Insertion  a:  (  (a))=a 117   1 x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y 2 … [x  6, y  6, z  6] [x  5, y  5, z  5] [x  4, y  4, z  4] … CA How can we obtain a Galois Insertion from a Galois Connection? All elements are reduced

118 Properties of a Galois Connection The abstraction and concretization functions uniquely determine each other:  (a) =  {c |  (c)  a}  (c) =  {a | c   (a)} 118

119 Abstracting (disjunctive) sets It is usually convenient to first define the abstraction of single elements  (s) =  ({s}) Then lift the abstraction to sets of elements  (X) =  A {  (s) | s  X} 119

120 The case of symbolic domains An important class of abstract domains are symbolic domains – domains of formulas C = (2 State, , , , , State) A = (D A,  A,  A,  A,  A,  A ) If D A is a set of formulas then the abstraction of a state is defined as  (s) =  ({s}) =  A {  | s   } the least formula from D A that s satisfies The abstraction of a set of states is  (X) =  A {  (s) | s  X} The concretization is  (  ) = { s | s   } = models(  ) 120

121 Inducing along the connections Assume the complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) M = (D M,  M,  M,  M,  M,  M ) and Galois connections GC C,A =(C,  C,A,  A,C, A) and GC A,M =(A,  A,M,  M,A, M) Lemma: both connections induce the GC C,M = (C,  C,M,  M,C, M) defined by  C,M =  C,A   A,M and  M,C =  M,A   A,C 121

122 Inducing along the connections 122 1 C,AC,A A,CA,C c 2 C,A(c)C,A(c) 5 CA 3 M A,MA,M 4 M,AM,A c’c’ a’ =  A,M (  C,A (c))

123 Sound abstract transformer Given two lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with A concrete transformer f : D C  D C an abstract transformer f # : D A  D A We say that f # is a sound transformer (w.r.t. f) if  c: f(c)=c’   (f # (c))   (c’) For every a and a’ such that  (f(  (a)))  A f # (a) 123

124 Transformer soundness condition 1 124 12 CA f 3 4 f#f# 5   c: f(c)=c’   (f # (c))   (c’)

125 Transformer soundness condition 2 125 CA 12 f#f# 3 5 f 4   a: f # (a)=a’  f(  (a))   (a’)

126 Best (induced) transformer 126 CA 2 3 f f # (a)=  (f(  (a))) 1 f#f# 4 Problem:  incomputable directly

127 Best abstract transformer [CC’77] Best in terms of precision – Most precise abstract transformer – May be too expensive to compute Constructively defined as f # =   f   – Induced by the GC Not directly computable because first step is concretization We often compromise for a “good enough” transformer – Useful tool: partial concretization 127

128 Transformer example C = (2 State, , , , , State) EQ = { x=y | x, y  Var} A = (2 EQ, , , , EQ,  )  (s) =  ({s}) = { x=y | s x = s y} that is s  x=y  (X) =  {  (s) | s  X} =  A {  (s) | s  X}  (  ) = { s | s   } = models(  ) Concrete:  x:=y  X = { s[x  s y] | s  X} Abstract:  x:=y  # X = ? 128

129 Developing a transformer for EQ - 1 Input has the form X =  {a=b} sp(x:=expr,  ) =  v. x=expr[v/x]   [v/x] sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Let’s define helper notations: – EQ(X, y) = {y=a, b=y  X} Subset of equalities containing y – EQc(X, y) = X \ EQ(X, y) Subset of equalities not containing y 129

130 Developing a transformer for EQ - 2 sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Two cases – x is y: sp(x:=y,  X) = X – x is different from y: sp(x:=y,  X) =  v. x=y  EQ)X, x)[v/x]  EQc(X, x)[v/x] = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  x=y  EQc(X, x) Vanilla transformer:  x:=y  #1 X = x=y  EQc(X, x) Example:  x:=y  #1  {x=p, q=x, m=n} =  {x=y, m=n} Is this the most precise result? 130

131 Developing a transformer for EQ - 3  x:=y  #1  {x=p, x=q, m=n} =  {x=y, m=n}   {x=y, m=n, p=q} – Where does the information p=q come from? sp(x:=y,  X) = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that? 131

132 Developing a transformer for EQ - 4 Define a reduction operator: Explicate(X) = if exist {a=b, b=c}  X but not {a=c}  X then Explicate(X  {a=c}) else X Define  x:=y  #2 =  x:=y  #1  Explicate  x:=y  #2 )  {x=p, x=q, m=n}) =  {x=y, m=n, p=q} is this the best transformer? 132

133 Developing a transformer for EQ - 5  x:=y  #2 )  {y=z}) = {x=y, y=z}  {x=y, y=z, x=z} Idea: apply reduction operator again after the vanilla transformer  x:=y  #3 = Explicate   x:=y  #1  Explicate Observation : after the first time we apply Explicate, all subsequent values will be in the image of the abstraction so really we only need to apply it once to the input Finally:  x:=y  # (X) = Explicate   x:=y  #1 – Best transformer for reduced elements (elements in the image of the abstraction) 133

134 Negative property of best transformers Let f # =   f   Best transformer does not compose  (f(f(  (a))))  f # (f # (a)) 134

135  (f(f(  (a))))  f # (f # (a)) 135 CA 2 3 f 1 f#f# 6 5 4 f 7  f#f# 8 9 f

136 Soundness theorem 1 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  a  D A : f(  (a))   (f # (a)) Then lfp(f)   (lfp(f # ))  (lfp(f))  lfp(f # ) 136

137 Soundness theorem 1 137 CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   a  D A : f(  (a))   (f # (a))   a  D A : f n (  (a))   (f #n (a))   a  D A : lfp(f n )(  (a))   (lfp(f #n )(a))  lfp(f)   lfp(f # ) 

138 Soundness theorem 2 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  c  D C :  (f(c))  f # (  (c)) Then  (lfp(f))  lfp(f # ) lfp(f)   (lfp(f # )) 138

139 Soundness theorem 2 139 CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   c  D C :  (f(c))  f # (  (c))   c  D C :  (f n (c))  f #n (  (c))   c  D C :  (lfp(f)(c))  lfp(f # )(  (c))  lfp(f)   lfp(f # ) 

140 A recipe for a sound static analysis Define an “appropriate” operational semantics Define “collecting” structural operational semantics Establish a Galois connection between collecting states and abstract states Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics Global correctness: conclude that the analysis is sound 140

141 Completeness Local property: – forward complete:  c:  (f # (c)) =  (f(c)) – backward complete:  a: f(  (a)) =  (f # (a)) A property of domain and the (best) transformer Global property: –  (lfp(f)) = lfp(f # ) – lfp(f) =  (lfp(f # )) Very ideal but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties 141

142 Forward complete transformer 142 12 CA f 3 4 f#f#  c:  (f # (c)) =  (f(c))

143 Backward complete transformer 143 CA 12 f#f# 3 5 f  a: f(  (a)) =  (f # (a))

144 Global (backward) completeness 144 CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   a: f(  (a)) =  (f # (a))   a: f n (  (a)) =  (f #n (a))   a  D A : lfp(f n )(  (a)) =  (lfp(f #n )(a))  lfp(f)  = lfp(f # ) 

145 Global (forward) completeness 145 CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f#   c  D C :  (f(c)) = f # (  (c))   c  D C :  (f n (c)) = f #n (  (c))   c  D C :  (lfp(f)(c)) = lfp(f # )(  (c))  lfp(f)  = lfp(f # ) 

146 Three example analyses Abstract states are conjunctions of constraints Variable Equalities – VE-factoids = { x=y | x, y  Var}  false VE = (2 VE-factoids, , , , false,  ) Constant Propagation – CP-factoids = { x=c | x  Var, c  Z}  false CP = (2 CP-factoids, , , , false,  ) Available Expressions – AE-factoids = { x=y+z | x  Var, y,z  Var  Z}  false A = (2 AE-factoids, , , , false,  ) 146

147 Lattice combinators reminder Cartesian Product – L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) – Cart(L 1, L 2 ) = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) Disjunctive completion – L = (D, , , , ,  ) – Disj(L) = (2 D,  ,  ,  ,  ,   ) Relational Product – Rel(L 1, L 2 ) = Disj(Cart(L 1, L 2 )) 147

148 Cartesian product of complete lattices For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the poset L cart = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) as follows: – (x 1, x 2 )  cart (y 1, y 2 ) iff x 1  1 y 1 and x 2  2 y 2 –  cart = ?  cart = ?  cart = ?  cart = ? Lemma: L is a complete lattice Define the Cartesian constructor L cart = Cart(L 1, L 2 ) 148

149 Cartesian product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X)= ? –  A  B,C (Y) = ? 149

150 Cartesian product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) What about transformers? 150

151 Cartesian product transformers GC C,A =(C,  C,A,  A,C, A)F A [st] : A  A GC C,B =(C,  C,B,  B,C, B)F B [st] : B  B Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) How should we define F A  B [st] : A  B  A  B 151

152 Cartesian product transformers GC C,A =(C,  C,A,  A,C, A)F A [st] : A  A GC C,B =(C,  C,B,  B,C, B)F B [st] : B  B Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) How should we define F A  B [st] : A  B  A  B Idea: F A  B [st](a, b) = (F A [st] a, F B [st] b) Are component-wise transformers precise? 152

153 Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 153 a := 9; b := 9; c := a; CP analysisVE analysis {a=9} {a=9, b=9} {a=9, b=9, c=9} {} {} {c=a}

154 Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 154 CP analysis + VE analysis a := 9; b := 9; c := a; {a=9} {a=9, b=9} {a=9, b=9, c=9, c=a}

155 Cartesian product analysis example Abstract interpreter 1: Constant Propagation Abstract interpreter 2: Variable Equalities Let’s compare – Running them separately and combining results – Running the analysis with their Cartesian product 155 CP  VE analysis Missing {a=b, b=c} a := 9; b := 9; c := a; {a=9} {a=9, b=9} {a=9, b=9, c=9, c=a}

156 Transformers for Cartesian product Naïve (component-wise) transformers do not utilize information from both components – Same as running analyses separately and then combining results Can we treat transformers from each analysis as black box and obtain best transformer for their combination? 156

157 Can we combine transformer modularly? No generic method for any abstract interpretations 157

158 Reducing values for CP  VE X = set of CP constraints of the form x=c (e.g., a=9 ) Y = set of VE constraints of the form x=y Reduce CP  VE (X, Y) = (X’, Y’) such that (X’, Y’)  (X’, Y’) Ideas? 158

159 Reducing values for CP  VE X = set of CP constraints of the form x=c (e.g., a=9 ) Y = set of VE constraints of the form x=y Reduce CP  VE (X, Y) = (X’, Y’) such that (X’, Y’)  (X’, Y’) ReduceRight: – if a=b  X and a=c  Y then add b=c to Y ReduceLeft: – If a=c and b=c  Y then add a=b to X Keep applying ReduceLeft and ReduceRight and reductions on each domain separately until reaching a fixed-point 159

160 Transformers for Cartesian product Do we get the best transformer by applying component-wise transformer followed by reduction? – Unfortunately, no (what’s the intuition?) – Can we do better? – Logical Product [Gulwani and Tiwari, PLDI 2006] 160

161 Product vs. reduced product 161 CP  VE lattice {a=9}{c=a}{c=9}{c=a} {a=9, c=9}{c=a} {[a  9, c  9]} collecting lattice {}    

162 Reduced product For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the reduced poset D 1  D 2 = {(d 1,d 2 )  D 1  D 2 | (d 1,d 2 ) =    (d 1,d 2 ) } L 1  L 2 = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) 162

163 Transformers for Cartesian product Do we get the best transformer by applying component-wise transformer followed by reduction? – Unfortunately, no (what’s the intuition?) – Can we do better? – Logical Product [Gulwani and Tiwari, PLDI 2006] 163

164 164

165 Logical product-- Assume A=(D,…) is an abstract domain that supports two operations: for x  D – inferEqualities(x) = { a=b |  (x)  a=b } returns a set of equalities between variables that are satisfied in all states given by x – refineFromEqualities(x, {a=b}) = y such that  (x)=  (y) y  x 165

166 Developing a transformer for EQ - 1 Input has the form X =  {a=b} sp(x:=expr,  ) =  v. x=expr[v/x]   [v/x] sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Let’s define helper notations: – EQ(X, y) = {y=a, b=y  X} Subset of equalities containing y – EQc(X, y) = X \ EQ(X, y) Subset of equalities not containing y 166

167 Developing a transformer for EQ - 2 sp(x:=y,  X) =  v. x=y[v/x]   {a=b}[v/x] = … Two cases – x is y: sp(x:=y,  X) = X – x is different from y: sp(x:=y,  X) =  v. x=y  EQ)X, x)[v/x]  EQc(X, x)[v/x] = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  x=y  EQc(X, x) Vanilla transformer:  x:=y  #1 X = x=y  EQc(X, x) Example:  x:=y  #1  {x=p, q=x, m=n} =  {x=y, m=n} Is this the most precise result? 167

168 Developing a transformer for EQ - 3  x:=y  #1  {x=p, x=q, m=n} =  {x=y, m=n}   {x=y, m=n, p=q} – Where does the information p=q come from? sp(x:=y,  X) = x=y  EQc(X, x)   v. EQ)X, x)[v/x]  v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that? 168

169 Developing a transformer for EQ - 4 Define a reduction operator: Explicate(X) = if exist {a=b, b=c}  X but not {a=c}  X then Explicate(X  {a=c}) else X Define  x:=y  #2 =  x:=y  #1  Explicate  x:=y  #2 )  {x=p, x=q, m=n}) =  {x=y, m=n, p=q} is this the best transformer? 169

170 Developing a transformer for EQ - 5  x:=y  #2 )  {y=z}) = {x=y, y=z}  {x=y, y=z, x=z} Idea: apply reduction operator again after the vanilla transformer  x:=y  #3 = Explicate   x:=y  #1  Explicate 170

171 Logical Product- 171 basically the strongest postcondition safely abstracting the existential quantifier

172 Abstracting the existential 172 Reduce the pair Abstract away existential quantifier for each domain

173 Example 173

174 Information loss example 174 if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b=  } can’t prove

175 Disjunctive completion of a lattice For a complete lattice L = (D, , , , ,  ) Define the powerset lattice L  = (2 D,  ,  ,  ,  ,   )   = ?   = ?   = ?   = ?   = ? Lemma: L  is a complete lattice L  contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates Define the disjunctive completion constructor L  = Disj(L) 175

176 Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = ? –  P(A),C (Y) = ? 176

177 Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = {  C,A ({x}) | x  X} –  P(A),C (Y) =  {  P(A) (y) | y  Y} What about transformers? 177

178 Information loss example 178 if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b= 5  b=-5 } {b= 0 } proved

179 The base lattice CP 179 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false

180 The disjunctive completion of CP 180 {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false {x=-2  x=-1}{x=-2  x=0}{x=-2  x=1}{x=1  x=2} ……… {x=0  x=1  x=2}{x=-1  x=1  x=-2} ……… … What is the height of this lattice?

181 Taming disjunctive completion Disjunctive completion is very precise – Maintains correlations between states of different analyses – Helps handle conditions precisely – But very expensive – number of abstract states grows exponentially – May lead to non-termination Base analysis (usually product) is less precise – Analysis terminates if the analyses of each component terminates How can we combine them to get more precision yet ensure termination and state explosion? 181

182 Taming disjunctive completion Use different abstractions for different program locations – At loop heads use coarse abstraction (base) – At other points use disjunctive completion Termination is guaranteed (by base domain) Precision increased inside loop body 182

183 With Disj(CP) 183 while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 } Doesn’t terminate

184 With tamed Disj(CP) 184 while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 } terminates CP Disj(CP) What MultiCartDomain implements

185 Reducing disjunctive elements A disjunctive set X may contain within it an ascending chain Y=a  b  c… We only need max(Y) – remove all elements below 185

186 Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = ? 186

187 Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = Disj(Cart(L 1, L 2 )) Lemma: L is a complete lattice What does it buy us? – How is it relative to Cart(Disj(L 1 ), Disj(L 2 ))? What about transformers? 187

188 Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = ? –  P(A  B),C (Y) = ? 188

189 Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = {(  C,A ({x}),  C,B ({x})) | x  X} –  P(A  B),C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} 189

190 Cartesian product example 190 Correlations preserved

191 Function space GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Denote the set of monotone functions from A to B by A  B Define  for elements of A  B as follows (a 1, b 1 )  (a 2, b 2 ) = if a 1 =a 2 then {(a 1, b 1  B b 1 )} else {(a 1, b 1 ), (a 2, b 2 )} Reduced cardinal power GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) =  {(  C,A ({x}),  C,B ({x})) | x  X} –  A  B,C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} Useful when A is small and B is much larger – E.g., typestate verification 191

192 Widening/Narrowing 192

193 How can we prove this automatically? 193 RelProd(CP, VE)

194 Intervals domain One of the simplest numerical domains Maintain for each variable x an interval [L,H] – L is either an integer of -  – H is either an integer of +  A (non-relational) numeric domain 194

195 Intervals lattice for variable x 195  [0,0][-1,-1][-2,-2][1,1][2,2]... [- ,+  ] [0,1][1,2][2,3][-1,0][-2,-1] [-10,10] [1, +  ][ - ,0 ]... [2, +  ][0, +  ][ - ,-1 ]... [-20,10]

196 Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] ? – [1,4]  [1,3] ? – [1,3]  [1,4] ? – [1,3]  [- ,+  ] ? What is the lattice height? 196

197 Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] no – [1,4]  [1,3] no – [1,3]  [1,4] yes – [1,3]  [- ,+  ]yes What is the lattice height? Infinite 197

198 Joining/meeting intervals [a,b]  [c,d] = ? – [1,1]  [2,2] = ? – [1,1]  [2, +  ] = ? [a,b]  [c,d] = ? – [1,2]  [3,4] = ? – [1,4]  [3,4] = ? – [1,1]  [1,+  ] = ? Check that indeed x  y if and only if x  y=y 198

199 Joining/meeting intervals [a,b]  [c,d] = [min(a,c), max(b,d)] – [1,1]  [2,2] = [1,2] – [1,1]  [2,+  ] = [1,+  ] [a,b]  [c,d] = [max(a,c), min(b,d)] if a proper interval and otherwise  – [1,2]  [3,4] =  – [1,4]  [3,4] = [3,4] – [1,1]  [1,+  ] = [1,1] Check that indeed x  y if and only if x  y=y 199

200 Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = ? 200

201 Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas? 201

202 Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas? – Two types of factoids x  c and x  c – Example: S =  {x  9, y  5, y  10} – Helper operations c + +  = +  remove(S, x) = S without any x-constraints lb(S, x) = 202

203 Assignment transformers  x := c  # S = ?  x := y  # S = ?  x := y+c  # S = ?  x := y+z  # S = ?  x := y*c  # S = ?  x := y*z  # S = ? 203

204 Assignment transformers  x := c  # S = remove(S,x)  {x  c, x  c}  x := y  # S = remove(S,x)  {x  lb(S,y), x  ub(S,y)}  x := y+c  # S = remove(S,x)  {x  lb(S,y)+c, x  ub(S,y)+c}  x := y+z  # S = remove(S,x)  {x  lb(S,y)+lb(S,z), x  ub(S,y)+ub(S,z)}  x := y*c  # S = remove(S,x)  if c>0 {x  lb(S,y)*c, x  ub(S,y)*c} else {x  ub(S,y)*-c, x  lb(S,y)*-c}  x := y*z  # S = remove(S,x)  ? 204

205 assume transformers  assume x=c  # S = ?  assume x<c  # S = ?  assume x=y  # S = ?  assume x  c  # S = ? 205

206 assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = ? 206

207 assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = (S  {x  c-1})  (S  {x  c+1}) 207

208 Effect of function f on lattice elements L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 208 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn()

209 Effect of function f on lattice elements L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) 209 Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn()

210 Continuity and ACC condition Let L = (D, , ,  ) be a complete partial order – Every ascending chain has an upper bound A function f is continuous if for every increasing chain Y  D*, f(  Y) =  { f(y) | y  Y } L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d 0  d 1  …  d n = d n+1 = … 210

211 Fixed-point theorem [Kleene] Let L = (D, , ,  ) be a complete partial order and a continuous function f: D  D then lfp(f) =  n  N f n (  ) 211

212 Resulting algorithm Kleene’s fixed point theorem gives a constructive method for computing the lfp 212   lfp fn()fn() f()f() f2()f2() … d :=  while f(d)  d do d := d  f(d) return d Algorithm lfp(f) =  n  N f n (  ) Mathematical definition

213 Chaotic iteration 213 Input: – A cpo L = (D, , ,  ) satisfying ACC – L n = L  L  …  L – A monotone function f : D n  D n – A system of equations { X[i] | f(X) | 1  i  n } Output: lfp(f) A worklist-based algorithm for i:=1 to n do X[i] :=  WL = {1,…,n} while WL   do j := pop WL // choose index non-deterministically N := F[i](X) if N  X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) return X

214 Concrete semantics equations 214 R[0] = { x  Z} R[1] =  x:=7  R[2] = R[1]  R[4] R[3] = R[2]  {s | s(x) < 1000} R[4] =  x:=x+1  R[3] R[5] = R[2]  {s | s(x)  1000} R[6] = R[5]  {s | s(x)  1001} R[0] R[2] R[3] R[4] R[1] R[5] R[6]

215 Abstract semantics equations 215 R[0] =  ({ x  Z}) R[1] =  x:=7  # R[2] = R[1]  R[4] R[3] = R[2]   ({s | s(x) < 1000}) R[4] =  x:=x+1  # R[3] R[5] = R[2]   ({s | s(x)  1000}) R[6] = R[5]   ({s | s(x)  1001})  R[5]   ({s | s(x)  999}) R[0] R[2] R[3] R[4] R[1] R[5] R[6]

216 Abstract semantics equations 216 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[3] = R[2]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

217 Too many iterations to converge 217

218 How many iterations for this one? 218

219 Widening Introduce a new binary operator to ensure termination – A kind of extrapolation Enables static analysis to use infinite height lattices – Dynamically adapts to given program Tricky to design Precision less predictable then with finite- height domains (widening non-monotone) 219

220 Formal definition For all elements d 1  d 2  d 1  d 2 For all ascending chains d 0  d 1  d 2  … the following sequence is finite – y 0 = d 0 – y i+1 = y i  d i+1 For a monotone function f : D  D define – x 0 =  – x i+1 = x i  f(x i ) Theorem: – There exits k such that x k+1 = x k – x k  Red(f) = { d | d  D and f(d)  d } 220

221 Analysis with finite-height lattice 221 A  f #n  = lpf(f # )  … f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f)

222 Analysis with widening 222 A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

223 Widening for Intervals Analysis   [c, d] = [c, d] [a, b]  [c, d] = [ if a  c then a else - , if b  d then b else  223

224 Semantic equations with widening 224 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

225 Choosing analysis with widening 225 Enable widening

226 Non monotonicity of widening [0,1]  [0,2] = ? [0,2]  [0,2] = ?

227 Non monotonicity of widening [0,1]  [0,2] = [0,  ] [0,2]  [0,2] = [0,2]

228 Analysis results with widening 228 Did we prove it?

229 Analysis with narrowing 229 A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

230 Formal definition of narrowing Improves the result of widening y  x  y  (x  y)  x For all decreasing chains x 0  x 1  … the following sequence is finite – y 0 = x 0 – y i+1 = y i  x i+1 For a monotone function f: D  D and x k  Red(f) = { d | d  D and f(d)  d } define – y 0 = x – y i+1 = y i  f(y i ) Theorem: – There exits k such that y k+1 =y k – y k  Red(f) = { d | d  D and f(d)  d }

231 Narrowing for Interval Analysis [a, b]   = [a, b] [a, b]  [c, d] = [ if a = -  then c else a, if b =  then d else b ]

232 Semantic equations with narrowing 232 R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

233 Analysis with widening/narrowing Two phases – Phase 1: analyze with widening until converging – Phase 2: use values to analyze with narrowing 233 Phase 2: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] Phase 1: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ]

234 Analysis with widening/narrowing 234

235 Analysis results widening/narrowing 235 Precise invariant


Download ppt "Program Analysis and Verification 0368-4479"

Similar presentations


Ads by Google