Published by Marjory Russell. Modified over 8 years ago.

Program Analysis and Verification 0368-4479
Noam Rinetzky
Lecture 8: Abstract Interpretation
Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Abstract Interpretation [Cousot’77]
Mathematical foundation of static analysis

Order-Related Terminology
– Preorder
– Partial order
– Pointed posets
– Join/Meet
– Complete lattices

A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is
– A set of elements D
– A partial order x ⊑ y
– A join operator ⊔
– A meet operator ⊓
– A bottom element ⊥ = ⊔∅ = ⊓D
– A top element ⊤ = ⊔D = ⊓∅

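Abstract (conservative) interpretation
[Figure: the operational (concrete) semantics of statement S maps a set of states to a set of states; the abstract semantics of statement S maps an abstract representation to an abstract representation; concretization relates each abstract representation to the set of states it stands for.]
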
Domain Theory
– Monotone functions
– Chains
– Complete partial orders (CPO): every chain has a LUB
– Pointed CPOs
– Constructing CPOs: Cartesian product, relational product, disjunctive completion, …

Continuity
A monotonic function maps a chain of inputs into a chain of outputs:
  x_0 ⊑ x_1 ⊑ …  ⇒  f(x_0) ⊑ f(x_1) ⊑ …
It is always true that: ⊔_i f(x_i) ⊑ f(⊔_i x_i)
But f(⊔_i x_i) ⊑ ⊔_i f(x_i) is not always true

Fixed Points
Solve the equation
  W(σ) = W(S⟦s⟧σ)   if B⟦b⟧(σ) = true
  W(σ) = σ           if B⟦b⟧(σ) = false
  W(σ) = ⊥           if B⟦b⟧(σ) = ⊥
where W : Σ_⊥ → Σ_⊥ and W = S⟦while b do s⟧
Alternatively, W = F(W) where
  F(W) = λσ. W(S⟦s⟧σ)   if B⟦b⟧(σ) = true
             σ           if B⟦b⟧(σ) = false
             ⊥           if B⟦b⟧(σ) = ⊥

Fixed Point (cont)
Thus we are looking for a solution of W = F(W), that is, a fixed point of F
– Typically there are many fixed points
– We may argue that W ought to be continuous, W ∈ [Σ_⊥ → Σ_⊥]
– This cuts the number of solutions
We will see how to find the least fixed point for such an equation, provided that F itself is continuous

Fixed Point Theorem
Define F^k = λx. F(F(… F(x) …)) (F composed k times)
If D is a pointed cpo and F : D → D is continuous, then
– for any fixed point x of F and k ∈ N: F^k(⊥) ⊑ x
– the least of all fixed points is ⊔_k F^k(⊥)
Proof:
i. By induction on k.
   Base: F^0(⊥) = ⊥ ⊑ x
   Induction step: F^{k+1}(⊥) = F(F^k(⊥)) ⊑ F(x) = x
ii. It suffices to show that ⊔_k F^k(⊥) is a fixed point:
   F(⊔_k F^k(⊥)) = ⊔_k F^{k+1}(⊥) = ⊔_k F^k(⊥)

Fixed-Points (notes)
If F is continuous on a pointed cpo, we know how to find the least fixed point
All other fixed points can be regarded as refinements of the least one
– They contain more information, they are more precise
– In general, they are also more arbitrary
– They also make less sense for our purposes

Complete Lattice
Let (D, ⊑) be a partial order
D is a complete lattice if every subset has both a greatest lower bound and a least upper bound

Knaster-Tarski Theorem
Let f : L → L be a monotonic function on a complete lattice L
The least fixed point lfp(f) exists
– lfp(f) = ⊓{x ∈ L : f(x) ⊑ x}

Fixed Points
A monotone function f : L → L where (L, ⊑, ⊔, ⊓, ⊥, ⊤) is a complete lattice
– monotone: l_1 ⊑ l_2 ⇒ f(l_1) ⊑ f(l_2)
Fix(f) = { l : l ∈ L, f(l) = l }
Red(f) = { l : l ∈ L, f(l) ⊑ l }
Ext(f) = { l : l ∈ L, l ⊑ f(l) }
Tarski’s Theorem 1955: if f is monotone then
– lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f)
– gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f)
[Figure: the lattice with regions Red(f), Fix(f) and Ext(f); the chain ⊥, f(⊥), f^2(⊥), … climbs to lfp(f) and the chain ⊤, f(⊤), f^2(⊤), … descends to gfp(f).]

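Tarski's characterization can be checked by brute force on a small lattice. The sketch below is only an illustration: the powerset of {0, 1, 2} ordered by inclusion and the function `f` are assumptions chosen for the example, not part of the lecture.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # hypothetical carrier; the lattice is its powerset under subset

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def f(s):
    # Hypothetical monotone function: always add 0, and add 1 once 0 is present.
    out = set(s) | {0}
    if 0 in s:
        out.add(1)
    return frozenset(out)

def meet(sets):
    result = UNIVERSE            # meet of the empty family is top
    for s in sets:
        result &= s
    return result

def join(sets):
    result = frozenset()         # join of the empty family is bottom
    for s in sets:
        result |= s
    return result

D = powerset(UNIVERSE)
fix = [d for d in D if f(d) == d]
red = [d for d in D if f(d) <= d]   # f(d) below d
ext = [d for d in D if d <= f(d)]   # d below f(d)

lfp = meet(red)
gfp = join(ext)
```

Here `lfp` is {0, 1} and `gfp` is the whole universe; both are members of `fix`, as the theorem states.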
Collecting semantics
  label0: if x <= 0 goto label1
          x := x - 1
          goto label0
  label1:
[Figure: CFG with nodes entry, 2 (if x > 0), 3 (x := x - 1) and exit; each node is annotated with the states that reach it over time, such as [x↦1], [x↦2], [x↦3], … and [x↦0], [x↦-1], [x↦-2], …]

Defining the collecting semantics
How should we represent the set of states at a given control-flow node by a lattice?
How should we represent the sets of states at all control-flow nodes by a lattice?

Finite maps
For a complete lattice L = (D, ⊑, ⊔, ⊓, ⊥, ⊤) and a finite set V
Define the poset L_{V→L} = (V→D, ⊑_{V→L}, ⊔_{V→L}, ⊓_{V→L}, ⊥_{V→L}, ⊤_{V→L}) as follows:
– f_1 ⊑_{V→L} f_2 iff for all v ∈ V: f_1(v) ⊑ f_2(v)
– ⊔_{V→L} = ?  ⊓_{V→L} = ?  ⊥_{V→L} = ?  ⊤_{V→L} = ?
Lemma: L_{V→L} is a complete lattice
Define the map constructor L_{V→L} = Map(V, L)

The collecting lattice
Lattice for a given control-flow node v: L_v = (2^State, ⊆, ∪, ∩, ∅, State)
Lattice for entire control-flow graph with nodes V: L_CFG = Map(V, L_v)
We will use this lattice as a baseline for static analysis and define abstractions of its elements

Equational definition of the semantics
Define variables of type set of states for each control-flow node
Define constraints between them
[Figure: CFG with nodes entry, 2 (if x > 0), 3 (x := x - 1) and exit, labeled with the variables R[entry], R[2], R[3], R[exit].]

Equational definition of the semantics
  R[2] = R[entry] ∪ ⟦x:=x-1⟧ R[3]
  R[3] = R[2] ∩ {s | s(x) > 0}
  R[exit] = R[2] ∩ {s | s(x) ≤ 0}
A system of recursive equations
How can we approximate it using what we have learned so far?

An abstract semantics
  R[2] = R[entry] ⊔ ⟦x:=x-1⟧# R[3]
  R[3] = R[2] ⊓ {s | s(x) > 0}#
  R[exit] = R[2] ⊓ {s | s(x) ≤ 0}#
A system of recursive equations
– ⟦x:=x-1⟧# is the abstract transformer for x:=x-1
– {s | s(x) ≤ 0}# is the abstract representation of {s | s(x) ≤ 0}

Abstract interpretation via abstraction
[Figure: the collecting semantics of statement S maps a set of states to a set of states; the abstract semantics of statement S maps an abstract representation of sets of states to an abstract representation of sets of states; abstraction maps each set of states to its abstract representation.]
Generalizes axiomatic verification: {P} S {Q} with sp(S, P)

Abstract interpretation via concretization
[Figure: the same diagram read through concretization; the abstract elements P, sp(S, P) and Q denote the sets of states models(P), models(sp(S, P)) and models(Q).]

Required knowledge
– Collecting semantics
– Abstract semantics
– Connection between collecting semantics and abstract semantics
– Algorithm to compute abstract semantics

Equation systems in general
Let L be a complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤)
Let R be a vector of variables R[0, …, n] ∈ D × … × D
Let F be a vector of functions of the type F[i] : R[0, …, n] → R[0, …, n]
A system of equations
  R[0] = F[0](R[0], …, R[n])
  …
  R[n] = F[n](R[0], …, R[n])
In vector notation: R = F(R)
Questions:
1. Does a solution always exist?
2. If so, is it unique?
3. If so, is it computable?

Monotone functions
Let L_1 = (D_1, ⊑) and L_2 = (D_2, ⊑) be two posets
A function f : D_1 → D_2 is monotone if for every pair x, y ∈ D_1: x ⊑ y implies f(x) ⊑ f(y)
A special case: L_1 = L_2 = (D, ⊑) and f : D → D

Important cases of monotonicity
Join: f(X, Y) = X ⊔ Y. Prove it!
For a set X and any function g: F(X) = { g(x) | x ∈ X }. Prove it!
Notice that the collecting semantics function is defined in terms of
– Join (set union)
– Semantic function for atomic statements lifted to sets of states

Extensive/reductive functions
Let L = (D, ⊑) be a poset
A function f : D → D is extensive if for every x ∈ D we have x ⊑ f(x)
A function f : D → D is reductive if for every x ∈ D we have x ⊒ f(x)

Fixed-points
L = (D, ⊑, ⊔, ⊓, ⊥, ⊤), f : D → D monotone
Fix(f) = { d | f(d) = d }
Red(f) = { d | f(d) ⊑ d }
Ext(f) = { d | d ⊑ f(d) }
Theorem [Tarski 1955]
– lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f)
– gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f)
[Figure: regions Red(f), Fix(f), Ext(f); f^n(⊥) climbs to lfp and f^n(⊤) descends to gfp.]
1. Does a solution always exist? Yes
2. If so, is it unique? No, but it has least/greatest solutions
3. If so, is it computable? Under some conditions…

Fixed point example for program
  R[0] = {x ∈ Z}
  R[1] = R[0] ∪ R[4]
  R[2] = R[1] ∩ {s | s(x) > 0}
  R[3] = R[1] ∩ {s | s(x) ≤ 0}
  R[4] = ⟦x:=x-1⟧ R[2]
[Figure: the CFG (entry, if x>0, x := x-1, exit) annotated twice, once with an assignment d and once with F(d), in three variants:
– d annotates the exit with {x<0} and F(d) does too: F(d) = d, a fixed-point
– d annotates the exit with {x<-5} while F(d) gives {x<0}: d ⊑ F(d), a pre fixed-point
– d annotates the exit with {x<9} while F(d) gives {x<0}: F(d) ⊑ d, a post fixed-point
In all variants the other annotations are x∈Z, x∈Z and {x>0}.]

Continuity and ACC condition
Let L = (D, ⊑, ⊔, ⊥) be a complete partial order
– Every ascending chain has a least upper bound
A function f is continuous if for every increasing chain Y ⊆ D*: f(⊔Y) = ⊔{ f(y) | y ∈ Y }
L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes:
  d_0 ⊑ d_1 ⊑ … ⊑ d_n = d_{n+1} = …

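Fixed-point theorem [Kleene]
Let L = (D, ⊑, ⊔, ⊥) be a complete partial order and f : D → D a continuous function; then
  lfp(f) = ⊔_{n∈N} f^n(⊥)
Lemma: Monotone functions on posets satisfying ACC are continuous
Proof: by ACC an ascending chain Y stabilizes at some element y_max = ⊔Y. By monotonicity f(y) ⊑ f(y_max) for every y ∈ Y, so ⊔{ f(y) | y ∈ Y } = f(y_max) = f(⊔Y).
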
Resulting algorithm
Kleene’s fixed point theorem gives a constructive method for computing the lfp
Mathematical definition: lfp(f) = ⊔_{n∈N} f^n(⊥)
Algorithm:
  d := ⊥
  while f(d) ≠ d do
    d := d ⊔ f(d)
  return d
[Figure: the ascending chain ⊥, f(⊥), f^2(⊥), …, f^n(⊥) = lfp.]

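The iteration can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's implementation: the function name `kleene_lfp`, the `max_steps` guard and the graph-reachability example are assumptions made for the sketch.

```python
def kleene_lfp(f, bottom, join, max_steps=10_000):
    """Iterate d := d join f(d) starting from bottom until f(d) == d."""
    d = bottom
    for _ in range(max_steps):
        nxt = f(d)
        if nxt == d:
            return d
        d = join(d, nxt)
    raise RuntimeError("no fixed point reached; does the lattice satisfy ACC?")

# Hypothetical example: nodes reachable from "a"; the lattice is sets of nodes under subset.
EDGES = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}

def step(reached):
    # Monotone: the start node plus all successors of nodes reached so far.
    out = {"a"}
    for node in reached:
        out |= EDGES[node]
    return frozenset(out)

reachable = kleene_lfp(step, frozenset(), lambda x, y: x | y)
```

The loop terminates here because the powerset of a finite node set has finite height; on a lattice without ACC the guard would trip instead.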
Chaotic iteration
Input:
– A cpo L = (D, ⊑, ⊔, ⊥) satisfying ACC
– L^n = L × L × … × L
– A monotone function f : D^n → D^n
– A system of equations { X[i] = f(X)[i] | 1 ≤ i ≤ n }
Output: lfp(f)
A worklist-based algorithm:
  for i := 1 to n do X[i] := ⊥
  WL = {1, …, n}
  while WL ≠ ∅ do
    i := pop WL   // choose index non-deterministically
    N := F[i](X)
    if N ≠ X[i] then
      X[i] := N
      add all the indexes that directly depend on i to WL
      (X[j] depends on X[i] if F[j] contains X[i])
  return X

Chaotic iteration for static analysis
Specialize chaotic iteration for programs
– Create a CFG for the program
– Choose a cpo of properties for the static analysis to infer: L = (D, ⊑, ⊔, ⊥)
– Define variables R[0, …, n] for input/output of each CFG node such that R[i] ∈ D
– For each node v let v_out be the variable at the output of that node:
  v_out = F[v](⊔{ u_out | (u,v) is a CFG edge })
– Make sure each F[v] is monotone
– Variable dependence is determined by outgoing edges in the CFG

Constant propagation example
  x := 4;
  while (y ≠ 5) do
    z := x;
    x := 4
[Figure: CFG with nodes entry, x := 4, if (*), assume y≠5, z := x, x := 4, assume y=5, exit.]

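Constant propagation lattice
For each variable x define L as the flat lattice
[Figure: ⊤ (not-a-constant) on top, the constants …, -2, -1, 0, 1, 2, … in the middle, ⊥ (no information) at the bottom.]
For a set of program variables Var = x_1, …, x_n: L^n = L × L × … × L
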
Write down variables
Write down equations
[Figure: the CFG of the constant propagation example with one variable per edge: R_0 at entry, R_1 after the first x := 4, R_2 at the loop head, R_3 after assume y≠5, R_4 after z := x, R_5 after the second x := 4, R_6 after assume y=5.]

Collecting semantics equations
  R_0 = State
  R_1 = ⟦x:=4⟧ R_0
  R_2 = R_1 ∪ R_5
  R_3 = ⟦assume y≠5⟧ R_2
  R_4 = ⟦z:=x⟧ R_3
  R_5 = ⟦x:=4⟧ R_4
  R_6 = ⟦assume y=5⟧ R_2

Constant propagation equations
  R_0 = ⊤
  R_1 = ⟦x:=4⟧# R_0
  R_2 = R_1 ⊔ R_5
  R_3 = ⟦assume y≠5⟧# R_2
  R_4 = ⟦z:=x⟧# R_3
  R_5 = ⟦x:=4⟧# R_4
  R_6 = ⟦assume y=5⟧# R_2
Each ⟦·⟧# is an abstract transformer

Abstract operations for CP
Lattice elements have the form (v_x, v_y, v_z)
  ⟦x:=4⟧# (v_x, v_y, v_z) = (4, v_y, v_z)
  ⟦z:=x⟧# (v_x, v_y, v_z) = (v_x, v_y, v_x)
  ⟦assume y≠5⟧# (v_x, v_y, v_z) = (v_x, v_y, v_z)
  ⟦assume y=5⟧# (v_x, v_y, v_z) = if v_y = k ≠ 5 then (⊥, ⊥, ⊥) else (v_x, 5, v_z)
  R_1 ⊔ R_5 = (a_1, b_1, c_1) ⊔ (a_5, b_5, c_5) = (a_1 ⊔ a_5, b_1 ⊔ b_5, c_1 ⊔ c_5)
[Figure: CP lattice for a single variable.]

Chaotic iteration for CP
Initialization: R_0 = R_1 = … = R_6 = (⊥, ⊥, ⊥), WL = {R_0, R_1, R_2, R_3, R_4, R_5, R_6}
Steps (the variable processed, its new value, and the worklist afterwards):
1. R_0 = (⊤, ⊤, ⊤); WL = {R_1, R_2, R_3, R_4, R_5, R_6}
2. R_1 = ⟦x:=4⟧# R_0 = (4, ⊤, ⊤); WL = {R_2, R_3, R_4, R_5, R_6}
3. R_2 = R_1 ⊔ R_5 = (4, ⊤, ⊤) ⊔ (⊥, ⊥, ⊥) = (4, ⊤, ⊤); WL = {R_3, R_4, R_5, R_6}
4. R_3 = ⟦assume y≠5⟧# R_2 = (4, ⊤, ⊤); WL = {R_4, R_5, R_6}
5. R_4 = ⟦z:=x⟧# R_3 = (4, ⊤, 4); WL = {R_5, R_6}
6. R_5 = ⟦x:=4⟧# R_4 = (4, ⊤, 4); WL = {R_2, R_6} (added R_2 back to the worklist since it depends on R_5)
7. R_2 = R_1 ⊔ R_5 = (4, ⊤, ⊤) ⊔ (4, ⊤, 4) = (4, ⊤, ⊤), unchanged; WL = {R_6}
8. R_6 = ⟦assume y=5⟧# R_2 = (4, 5, ⊤); WL = {}
Fixed-point reached
In practice maintain a worklist of nodes

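The run above can be reproduced with a small worklist solver. The sketch below is an illustration under stated assumptions: states are triples (v_x, v_y, v_z), the strings "TOP" and "BOT" stand for ⊤ and ⊥, transformers are made strict on unreachable inputs, and all function names are hypothetical.

```python
TOP, BOT = "TOP", "BOT"          # not-a-constant / no information
BOTTOM = (BOT, BOT, BOT)

def join_val(a, b):
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def join(s, t):
    # Pointwise join of two (v_x, v_y, v_z) triples.
    return tuple(join_val(a, b) for a, b in zip(s, t))

def set_x4(s):                   # x := 4
    return BOTTOM if BOT in s else (4, s[1], s[2])

def copy_x_to_z(s):              # z := x
    return BOTTOM if BOT in s else (s[0], s[1], s[0])

def assume_y_ne_5(s):            # identity, as on the slide (sound but not most precise)
    return s

def assume_y_eq_5(s):
    if BOT in s or s[1] not in (TOP, 5):
        return BOTTOM
    return (s[0], 5, s[2])

# One equation per variable R[i]; each reads the current vector R.
F = {
    0: lambda R: (TOP, TOP, TOP),
    1: lambda R: set_x4(R[0]),
    2: lambda R: join(R[1], R[5]),
    3: lambda R: assume_y_ne_5(R[2]),
    4: lambda R: copy_x_to_z(R[3]),
    5: lambda R: set_x4(R[4]),
    6: lambda R: assume_y_eq_5(R[2]),
}
# DEPENDENTS[i] lists the equations that mention R[i].
DEPENDENTS = {0: [1], 1: [2], 2: [3, 6], 3: [4], 4: [5], 5: [2], 6: []}

def chaotic_iteration():
    R = [BOTTOM] * 7
    worklist = list(range(7))
    while worklist:
        i = worklist.pop(0)      # FIFO here; any order reaches the same result
        new = F[i](R)
        if new != R[i]:
            R[i] = new
            for j in DEPENDENTS[i]:
                if j not in worklist:
                    worklist.append(j)
    return R

R = chaotic_iteration()
```

With this processing order `R[6]` ends as `(4, 5, "TOP")`: x is the constant 4 and y is 5 on exit, while z is not a constant there.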
Complexity of chaotic iteration
Parameters:
– n is the number of CFG nodes
– k is the maximum in-degree of edges
– h is the height of lattice L
– c is the maximum cost of
  – applying F_v
  – checking the fixed-point condition for lattice L
Complexity: O(n · h · c · k)
The incremental (worklist) algorithm reduces the n factor in practice
– Implement the worklist by a priority queue and order nodes by reversed topological order

Required knowledge
– Collecting semantics
– Abstract semantics (over lattices)
– Algorithm to compute abstract semantics (chaotic iteration)
– Connection between collecting semantics and abstract semantics
– Abstract transformers

Recap
We defined a reference semantics: the collecting semantics
We defined an abstract semantics for a given lattice and abstract transformers
We defined an algorithm to compute the abstract least fixed-point when transformers are monotone and the lattice obeys ACC
Questions:
1. What is the connection between the two least fixed-points?
2. Transformer monotonicity is required for termination. What should we require for correctness?

Galois Connection
Given two complete lattices
  C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C) – concrete domain
  A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A) – abstract domain
A Galois Connection (GC) is a quadruple (C, α, γ, A) that relates C and A via the monotone functions
– The abstraction function α : D_C → D_A
– The concretization function γ : D_A → D_C
such that for every concrete element c ∈ D_C and abstract element a ∈ D_A
  α(γ(a)) ⊑ a and c ⊑ γ(α(c))
Alternatively: α(c) ⊑ a iff c ⊑ γ(a)

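Galois Connection: c ⊑ γ(α(c))
[Figure: c in C is mapped to α(c) in A, then back to γ(α(c)) in C, which lies above c.]
α(c) is the most precise (least) element in A representing c

Galois Connection: α(γ(a)) ⊑ a
[Figure: a in A is mapped to γ(a) in C, then back to α(γ(a)) in A, which lies below a.]
γ(a) is what a represents in C (its meaning)
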
Example: lattice of equalities
Concrete lattice: C = (2^State, ⊆, ∪, ∩, ∅, State)
Abstract lattice: EQ = { x=y | x, y ∈ Var }
  A = (2^EQ, ⊇, ∩, ∪, EQ, ∅)
– Treat elements of A as both formulas and sets of constraints
Useful for copy propagation, a compiler optimization
  β(s) = α({s}) = { x=y | s x = s y }, that is, s ⊨ x=y
  α(X) = ∩{ β(s) | s ∈ X } = ⊔_A { β(s) | s ∈ X }
  γ(Y) = { s | s ⊨ ∧Y } = models(∧Y)

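For a finite state space the abstraction and concretization of the equalities domain can be written down directly. This is a sketch under assumptions: three variables over the tiny value set {0, 1} (so that State is finite), equalities encoded as ordered pairs of names, and function names chosen for the example.

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = (0, 1)                  # hypothetical tiny value domain so that State is finite
STATES = [dict(zip(VARS, vals)) for vals in product(VALUES, repeat=len(VARS))]
EQ = frozenset((a, b) for a in VARS for b in VARS)   # the pair (a, b) encodes a=b

def beta(s):
    """Equalities that hold in the single state s."""
    return frozenset((a, b) for (a, b) in EQ if s[a] == s[b])

def alpha(X):
    """Abstraction: the equalities that hold in every state of X."""
    result = EQ                  # bottom of A, the abstraction of the empty set
    for s in X:
        result &= beta(s)
    return result

def gamma(Y):
    """Concretization: all states that satisfy every equality in Y."""
    return [s for s in STATES if all(s[a] == s[b] for (a, b) in Y)]

def leq_A(a1, a2):
    # The abstract order is reverse inclusion: more equalities is lower.
    return a1 >= a2

a = frozenset({("x", "y")})
reduced = alpha(gamma(a))
```

`reduced` contains x=y together with y=x and the reflexive equalities, so α(γ(a)) ⊑ a holds with a strict inequality: a was not a reduced element.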
Galois Connection: c ⊑ γ(α(c))
[Figure, lattice of equalities:
– c = [x↦5, y↦5, z↦5]
– α(c) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}
– γ(α(c)) = { …, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], … }
– another abstract element shown: {x=x, y=y, z=z}]
α(c) is the most precise (least) element in A representing [x↦5, y↦5, z↦5]

Most precise abstract representation
  α(c) = ⊓{ c’ | c ⊑ γ(c’) }
[Figure: c in C and the abstract elements whose concretization covers c; α(c) is their meet. In the equalities example, c = [x↦5, y↦5, z↦5] is covered by {x=y}, {x=y, z=y}, {x=y, y=z}, …, and α(c) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}.]

Galois Connection: α(γ(a)) ⊑ a
[Figure, lattice of equalities:
– a = {x=y, y=z}
– γ(a) = { …, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], … }
– α(γ(a)) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}]
γ(a) is what a represents in C (its meaning)
α∘γ is called a semantic reduction

Galois Insertion
  ∀a: α(γ(a)) = a
[Figure: the reduced element {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y} and its concretization { …, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], … }.]
All elements are reduced
How can we obtain a Galois Insertion from a Galois Connection?

Properties of a Galois Connection
The abstraction and concretization functions uniquely determine each other:
  γ(a) = ⊔{ c | α(c) ⊑ a }
  α(c) = ⊓{ a | c ⊑ γ(a) }

Abstracting (disjunctive) sets
It is usually convenient to first define the abstraction of single elements
  β(s) = α({s})
Then lift the abstraction to sets of elements
  α(X) = ⊔_A { β(s) | s ∈ X }

The case of symbolic domains
An important class of abstract domains are symbolic domains: domains of formulas
  C = (2^State, ⊆, ∪, ∩, ∅, State)
  A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A)
If D_A is a set of formulas then the abstraction of a state is defined as
  β(s) = α({s}) = ⊓_A { φ | s ⊨ φ }
the least formula from D_A that s satisfies
The abstraction of a set of states is
  α(X) = ⊔_A { β(s) | s ∈ X }
The concretization is
  γ(φ) = { s | s ⊨ φ } = models(φ)

Inducing along the connections
Assume the complete lattices
  C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C)
  A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A)
  M = (D_M, ⊑_M, ⊔_M, ⊓_M, ⊥_M, ⊤_M)
and Galois connections GC_{C,A} = (C, α_{C,A}, γ_{A,C}, A) and GC_{A,M} = (A, α_{A,M}, γ_{M,A}, M)
Lemma: both connections induce GC_{C,M} = (C, α_{C,M}, γ_{M,C}, M) defined by
  α_{C,M}(c) = α_{A,M}(α_{C,A}(c))
  γ_{M,C}(m) = γ_{A,C}(γ_{M,A}(m))
[Figure: c in C is mapped by α_{C,A} to α_{C,A}(c) in A and then by α_{A,M} to a’ = α_{A,M}(α_{C,A}(c)) in M; γ_{M,A} and γ_{A,C} map back to c’ in C.]

Sound abstract transformer
Given two lattices
  C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C)
  A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A)
and GC_{C,A} = (C, α, γ, A) with
– a concrete transformer f : D_C → D_C
– an abstract transformer f# : D_A → D_A
We say that f# is a sound transformer (w.r.t. f) if
  ∀c: f(c) = c’ ⇒ α(c’) ⊑ f#(α(c))
or, equivalently, for every a:
  α(f(γ(a))) ⊑_A f#(a)

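Transformer soundness condition 1
  ∀c: f(c) = c’ ⇒ α(c’) ⊑ f#(α(c))
[Figure: c is mapped by f to c’ in C; α(c) is mapped by f# in A to an element above α(c’).]

Transformer soundness condition 2
  ∀a: f#(a) = a’ ⇒ f(γ(a)) ⊑ γ(a’)
[Figure: a is mapped by f# to a’ in A; γ(a) is mapped by f in C to an element below γ(a’).]
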
Best (induced) transformer
  f#(a) = α(f(γ(a)))
[Figure: a is concretized to γ(a), transformed by f, and abstracted back to A.]
Problem: incomputable directly

Best abstract transformer [CC’77]
Best in terms of precision
– Most precise abstract transformer
– May be too expensive to compute
Constructively defined as f# = α ∘ f ∘ γ
– Induced by the GC
Not directly computable because the first step is concretization
We often compromise for a “good enough” transformer
– Useful tool: partial concretization

Transformer example
  C = (2^State, ⊆, ∪, ∩, ∅, State)
  EQ = { x=y | x, y ∈ Var }
  A = (2^EQ, ⊇, ∩, ∪, EQ, ∅)
  β(s) = α({s}) = { x=y | s x = s y }, that is, s ⊨ x=y
  α(X) = ∩{ β(s) | s ∈ X } = ⊔_A { β(s) | s ∈ X }
  γ(φ) = { s | s ⊨ φ } = models(φ)
Concrete: ⟦x:=y⟧ X = { s[x ↦ s y] | s ∈ X }
Abstract: ⟦x:=y⟧# X = ?

Developing a transformer for EQ - 1
Input has the form X = ∧{a=b}
  sp(x:=expr, φ) = ∃v. x=expr[v/x] ∧ φ[v/x]
  sp(x:=y, X) = ∃v. x=y[v/x] ∧ ∧{a=b}[v/x] = …
Let’s define helper notations:
– EQ(X, y) = {y=a, b=y ∈ X}: the subset of equalities containing y
– EQc(X, y) = X \ EQ(X, y): the subset of equalities not containing y

Developing a transformer for EQ - 2
  sp(x:=y, X) = ∃v. x=y[v/x] ∧ ∧{a=b}[v/x] = …
Two cases
– x is y: sp(x:=y, X) = X
– x is different from y:
  sp(x:=y, X) = ∃v. x=y ∧ EQ(X, x)[v/x] ∧ EQc(X, x)[v/x]
              = x=y ∧ EQc(X, x) ∧ ∃v. EQ(X, x)[v/x]
              ⇒ x=y ∧ EQc(X, x)
Vanilla transformer: ⟦x:=y⟧#1 X = x=y ∧ EQc(X, x)
Example: ⟦x:=y⟧#1 ∧{x=p, q=x, m=n} = ∧{x=y, m=n}
Is this the most precise result?

Developing a transformer for EQ - 3
  ⟦x:=y⟧#1 ∧{x=p, x=q, m=n} = ∧{x=y, m=n} ⊒ ∧{x=y, m=n, p=q}
– Where does the information p=q come from?
  sp(x:=y, X) = x=y ∧ EQc(X, x) ∧ ∃v. EQ(X, x)[v/x]
∃v. EQ(X, x)[v/x] holds possible equalities between different a’s and b’s. How can we account for that?

Developing a transformer for EQ - 4
Define a reduction operator:
  Explicate(X) = if there exist {a=b, b=c} ⊆ X but not {a=c} ⊆ X
                 then Explicate(X ∪ {a=c})
                 else X
Define ⟦x:=y⟧#2 = ⟦x:=y⟧#1 ∘ Explicate
  ⟦x:=y⟧#2 (∧{x=p, x=q, m=n}) = ∧{x=y, m=n, p=q}
Is this the best transformer?

Developing a transformer for EQ - 5
  ⟦x:=y⟧#2 (∧{y=z}) = ∧{x=y, y=z} ⊒ ∧{x=y, y=z, x=z}
Idea: apply the reduction operator again after the vanilla transformer
  ⟦x:=y⟧#3 = Explicate ∘ ⟦x:=y⟧#1 ∘ Explicate
Observation: after the first time we apply Explicate, all subsequent values will be in the image of the abstraction, so really we only need to apply it once to the input
Finally: ⟦x:=y⟧# (X) = Explicate ∘ ⟦x:=y⟧#1
– Best transformer for reduced elements (elements in the image of the abstraction)

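The construction can be sketched as follows. The encoding is an assumption made for the example: an equality a=b is an unordered pair `frozenset({a, b})`, which builds symmetry into the representation, reflexive equalities are left implicit, and the function names are hypothetical.

```python
def eq(a, b):
    """The equality a=b as an unordered pair."""
    return frozenset((a, b))

def explicate(eqs):
    """Close a set of equalities under transitivity (symmetry is built into the pairs)."""
    closed = {frozenset(e) for e in eqs if len(set(e)) == 2}
    changed = True
    while changed:
        changed = False
        for e1 in list(closed):
            for e2 in list(closed):
                # a=b and b=c share exactly one variable; derive a=c.
                if e1 != e2 and len(e1 & e2) == 1:
                    derived = e1 ^ e2
                    if derived not in closed:
                        closed.add(derived)
                        changed = True
    return frozenset(closed)

def assign(x, y, eqs):
    """Sketch of Explicate after the vanilla transformer after Explicate, for x := y."""
    reduced = explicate(eqs)
    if x == y:
        return reduced
    kept = {e for e in reduced if x not in e}   # EQc(X, x): drop equalities mentioning x
    kept.add(eq(x, y))                          # add the new fact x=y
    return explicate(kept)

first = assign("x", "y", [("x", "p"), ("x", "q"), ("m", "n")])
second = assign("x", "y", [("y", "z")])
```

Reducing the input is what recovers p=q in `first`, and reducing the output is what adds x=z in `second`; applying both is redundant only when the input is already reduced.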
Negative property of best transformers
Let f# = α ∘ f ∘ γ
Best transformer does not compose:
  α(f(f(γ(a)))) ⊑ f#(f#(a))
[Figure: applying f twice in C and abstracting once can land strictly below applying f# twice in A.]

Soundness theorem 1
1. Given two complete lattices
   C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C)
   A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A)
   and GC_{C,A} = (C, α, γ, A) with
2. Monotone concrete transformer f : D_C → D_C
3. Monotone abstract transformer f# : D_A → D_A
4. ∀a ∈ D_A: f(γ(a)) ⊑ γ(f#(a))
Then
  lfp(f) ⊑ γ(lfp(f#))
  α(lfp(f)) ⊑ lfp(f#)
[Figure: the chains f, f^2, f^3, …, f^n, …, lfp(f) in C and f#, f#^2, f#^3, …, f#^n, …, lfp(f#) in A, related step by step:
  ∀a ∈ D_A: f(γ(a)) ⊑ γ(f#(a))
  ∀a ∈ D_A: f^n(γ(a)) ⊑ γ(f#^n(a))
  hence lfp(f) ⊑ γ(lfp(f#))]

Soundness theorem 2
1. Given two complete lattices
   C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C)
   A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A)
   and GC_{C,A} = (C, α, γ, A) with
2. Monotone concrete transformer f : D_C → D_C
3. Monotone abstract transformer f# : D_A → D_A
4. ∀c ∈ D_C: α(f(c)) ⊑ f#(α(c))
Then
  α(lfp(f)) ⊑ lfp(f#)
  lfp(f) ⊑ γ(lfp(f#))
[Figure: the same two chains, related step by step:
  ∀c ∈ D_C: α(f(c)) ⊑ f#(α(c))
  ∀c ∈ D_C: α(f^n(c)) ⊑ f#^n(α(c))
  hence α(lfp(f)) ⊑ lfp(f#)]

A recipe for a sound static analysis
– Define an “appropriate” operational semantics
– Define a “collecting” structural operational semantics
– Establish a Galois connection between collecting states and abstract states
– Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
– Global correctness: conclude that the analysis is sound

Completeness
Local property:
– forward complete: ∀c: α(f(c)) = f#(α(c))
– backward complete: ∀a: f(γ(a)) = γ(f#(a))
A property of the domain and the (best) transformer
Global property:
– α(lfp(f)) = lfp(f#)
– lfp(f) = γ(lfp(f#))
Ideal, but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties

Forward complete transformer
  ∀c: α(f(c)) = f#(α(c))
[Figure: f in C and f# in A commute with α.]

Backward complete transformer
  ∀a: f(γ(a)) = γ(f#(a))
[Figure: f in C and f# in A commute with γ.]

Global (backward) completeness
  ∀a: f(γ(a)) = γ(f#(a))
  ∀a: f^n(γ(a)) = γ(f#^n(a))
  hence lfp(f) = γ(lfp(f#))
[Figure: the chains f, f^2, f^3, …, lfp(f) in C and f#, f#^2, f#^3, …, lfp(f#) in A.]

Global (forward) completeness
  ∀c ∈ D_C: α(f(c)) = f#(α(c))
  ∀c ∈ D_C: α(f^n(c)) = f#^n(α(c))
  hence α(lfp(f)) = lfp(f#)
[Figure: the same two chains.]

Widening/Narrowing

How can we prove this automatically?
RelProd(CP, VE)

Intervals domain
One of the simplest numerical domains
Maintain for each variable x an interval [L, H]
– L is either an integer or -∞
– H is either an integer or +∞
A (non-relational) numeric domain

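Intervals lattice for variable x
[Figure: Hasse diagram with the singletons …, [-2,-2], [-1,-1], [0,0], [1,1], [2,2], … at the lowest level, then [-2,-1], [-1,0], [0,1], [1,2], [2,3], …, then wider intervals such as [-10,10] and [-20,10], the half-lines [-∞,-1], [-∞,0], [0,+∞], [1,+∞], [2,+∞], …, and [-∞,+∞] on top.]
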
Intervals lattice for variable x
  D_int[x] = { (L, H) | L ∈ -∞,Z and H ∈ Z,+∞ and L ≤ H }
  ⊤ = [-∞, +∞]
  ⊥ = ?
– [1,2] ⊑ [3,4] ? no
– [1,4] ⊑ [1,3] ? no
– [1,3] ⊑ [1,4] ? yes
– [1,3] ⊑ [-∞,+∞] ? yes
What is the lattice height? Infinite

Joining/meeting intervals
  [a,b] ⊔ [c,d] = [min(a,c), max(b,d)]
– [1,1] ⊔ [2,2] = [1,2]
– [1,1] ⊔ [2,+∞] = [1,+∞]
  [a,b] ⊓ [c,d] = [max(a,c), min(b,d)] if this is a proper interval, and ⊥ otherwise
– [1,2] ⊓ [3,4] = ⊥
– [1,4] ⊓ [3,4] = [3,4]
– [1,1] ⊓ [1,+∞] = [1,1]
Check that indeed x ⊑ y if and only if x ⊔ y = y

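The interval operations translate directly into code. The sketch below is an illustration with assumed encodings: an interval is a pair `(lo, hi)` using `math.inf` for the infinite bounds, and `None` stands for the empty interval ⊥.

```python
import math

INF = math.inf
BOT = None                       # the empty interval (bottom)

def leq(a, b):
    """a is below b when a is empty or a lies inside b."""
    if a is BOT:
        return True
    if b is BOT:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(a, b):
    if a is BOT or b is BOT:
        return BOT
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else BOT   # an improper interval is empty
```

The closing exercise can be spot-checked with `leq(x, y) == (join(x, y) == y)` on a few sample intervals.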
Interval domain for programs
  D_int[x] = { (L, H) | L ∈ -∞,Z and H ∈ Z,+∞ and L ≤ H }
For a program with variables Var = {x_1, …, x_k}
  D_int[Var] = D_int[x_1] × … × D_int[x_k]
How can we represent it in terms of formulas?
– Two types of factoids: x ≥ c and x ≤ c
– Example: S = ∧{x ≥ 9, y ≥ 5, y ≤ 10}
– Helper operations
  c + +∞ = +∞
  remove(S, x) = S without any x-constraints
  lb(S, x) =

Assignment transformers
  ⟦x := c⟧# S = remove(S,x) ∪ {x ≥ c, x ≤ c}
  ⟦x := y⟧# S = remove(S,x) ∪ {x ≥ lb(S,y), x ≤ ub(S,y)}
  ⟦x := y+c⟧# S = remove(S,x) ∪ {x ≥ lb(S,y)+c, x ≤ ub(S,y)+c}
  ⟦x := y+z⟧# S = remove(S,x) ∪ {x ≥ lb(S,y)+lb(S,z), x ≤ ub(S,y)+ub(S,z)}
  ⟦x := y*c⟧# S = remove(S,x) ∪ if c>0 then {x ≥ lb(S,y)*c, x ≤ ub(S,y)*c}
                                  else {x ≥ ub(S,y)*c, x ≤ lb(S,y)*c}
  ⟦x := y*z⟧# S = remove(S,x) ∪ ?

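A few of these transformers can be sketched over an environment that maps each variable to its interval. The representation is an assumption made for the example (a dict of `(lo, hi)` pairs with `math.inf` bounds, all intervals non-empty), and the function names are hypothetical. For a negative constant this sketch multiplies both bounds by c itself and swaps them, which keeps the lower bound below the upper bound.

```python
import math

INF = math.inf

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def scale(a, c):
    """Interval of y*c for y in a; the bounds swap when c is negative."""
    if c == 0:
        return (0, 0)            # avoids inf * 0
    lo, hi = a[0] * c, a[1] * c
    return (lo, hi) if c > 0 else (hi, lo)

def assign_const(env, x, c):     # x := c
    return {**env, x: (c, c)}

def assign_var(env, x, y):       # x := y
    return {**env, x: env[y]}

def assign_add_const(env, x, y, c):   # x := y + c
    return {**env, x: add(env[y], (c, c))}

def assign_add(env, x, y, z):    # x := y + z
    return {**env, x: add(env[y], env[z])}

def assign_mul_const(env, x, y, c):   # x := y * c
    return {**env, x: scale(env[y], c)}

# The slide's example S = {x >= 9, y >= 5, y <= 10} as an environment.
env = {"x": (9, INF), "y": (5, 10)}
```

Each function returns a new environment, so the old binding of x is dropped exactly as remove(S, x) does on the slide.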
assume transformers
  ⟦assume x=c⟧# S = S ⊓ {x ≥ c, x ≤ c}
  ⟦assume x<c⟧# S = S ⊓ {x ≤ c-1}
  ⟦assume x=y⟧# S = S ⊓ {x ≥ lb(S,y), x ≤ ub(S,y)}
  ⟦assume x≠c⟧# S = (S ⊓ {x ≤ c-1}) ⊔ (S ⊓ {x ≥ c+1})

Concrete semantics equations
  R[0] = {x ∈ Z}
  R[1] = ⟦x:=7⟧
  R[2] = R[1] ∪ R[4]
  R[3] = R[2] ∩ {s | s(x) < 1000}
  R[4] = ⟦x:=x+1⟧ R[3]
  R[5] = R[2] ∩ {s | s(x) ≥ 1000}
  R[6] = R[5] ∩ {s | s(x) ≥ 1001}
[Figure: CFG annotated with R[0], R[1], R[2], R[3], R[4], R[5], R[6].]

Abstract semantics equations
  R[0] = α({x ∈ Z})
  R[1] = ⟦x:=7⟧#
  R[2] = R[1] ⊔ R[4]
  R[3] = R[2] ⊓ α({s | s(x) < 1000})
  R[4] = ⟦x:=x+1⟧# R[3]
  R[5] = R[2] ⊓ α({s | s(x) ≥ 1000})
  R[6] = R[5] ⊓ α({s | s(x) ≥ 1001}) ⊔ R[5] ⊓ α({s | s(x) ≤ 999})

Abstract semantics equations (intervals)
  R[0] = ⊤
  R[1] = [7,7]
  R[2] = R[1] ⊔ R[4]
  R[3] = R[2] ⊓ [-∞,999]
  R[4] = R[3] + [1,1]
  R[5] = R[2] ⊓ [1000,+∞]
  R[6] = R[5] ⊓ [-∞,999] ⊔ R[5] ⊓ [1001,+∞]

Too many iterations to converge

How many iterations for this one?

Widening
Introduce a new binary operator to ensure termination
– A kind of extrapolation
Enables static analysis to use infinite-height lattices
– Dynamically adapts to the given program
Tricky to design
Precision is less predictable than with finite-height domains (widening is non-monotone)

Formal definition
For all elements: d_1 ⊔ d_2 ⊑ d_1 ∇ d_2
For all ascending chains d_0 ⊑ d_1 ⊑ d_2 ⊑ … the following sequence is finite
– y_0 = d_0
– y_{i+1} = y_i ∇ d_{i+1}
For a monotone function f : D → D define
– x_0 = ⊥
– x_{i+1} = x_i ∇ f(x_i)
Theorem:
– There exists k such that x_{k+1} = x_k
– x_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d }

Analysis with finite-height lattice
[Figure: in A the chain f#, f#^2, f#^3, …, f#^n = lfp(f#) climbs until it reaches Fix(f), at the bottom of Red(f).]

Analysis with widening
[Figure: in A the chain f#, f#^2, f#^3, … is extrapolated by widening; it jumps above lfp(f#) and lands in Red(f).]

Widening for Intervals Analysis
  ⊥ ∇ [c, d] = [c, d]
  [a, b] ∇ [c, d] = [ if a ≤ c then a else -∞, if b ≥ d then b else +∞ ]

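The interval widening can be sketched directly from this definition. The encoding is an assumption for the example: intervals are `(lo, hi)` pairs with `math.inf` bounds, `None` stands for ⊥, and the case of an empty right argument (not covered by the definition above) is handled by returning the left argument.

```python
import math

INF = math.inf
BOT = None                       # the empty interval (bottom)

def widen(a, b):
    """Keep each bound of a that b respects; push any bound that b exceeds to infinity."""
    if a is BOT:
        return b
    if b is BOT:
        return a                 # assumption: nothing new to extrapolate from
    lo = a[0] if a[0] <= b[0] else -INF
    hi = a[1] if a[1] >= b[1] else INF
    return (lo, hi)

# The ascending chain [0,0], [0,1], [0,2], ... stabilizes after one widening step.
y = (0, 0)
history = [y]
for upper in range(1, 6):
    y = widen(y, (0, upper))
    history.append(y)
```

After the first step `y` is `(0, inf)` and stays there, which is the finiteness condition of the formal definition in action.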
Semantic equations with widening
  R[0] = ⊤
  R[1] = [7,7]
  R[2] = R[1] ⊔ R[4]
  R[2.1] = R[2.1] ∇ R[2]
  R[3] = R[2.1] ⊓ [-∞,999]
  R[4] = R[3] + [1,1]
  R[5] = R[2] ⊓ [1000,+∞]
  R[6] = R[5] ⊓ [-∞,999] ⊔ R[5] ⊓ [1001,+∞]

Choosing analysis with widening
Enable widening

Non monotonicity of widening
  [0,1] ∇ [0,2] = [0,+∞]
  [0,2] ∇ [0,2] = [0,2]
The left argument grew from [0,1] to [0,2], yet the result shrank from [0,+∞] to [0,2], so widening is not monotone in its first argument

Analysis results with widening
Did we prove it?

Analysis with narrowing
[Figure: in A, starting from the post fixed-point reached by widening inside Red(f), narrowing descends toward lfp(f#) while staying in Red(f).]

Formal definition of narrowing
Improves the result of widening
  y ⊑ x ⇒ y ⊑ (x Δ y) ⊑ x
For all decreasing chains x_0 ⊒ x_1 ⊒ … the following sequence is finite
– y_0 = x_0
– y_{i+1} = y_i Δ x_{i+1}
For a monotone function f : D → D and x_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d } define
– y_0 = x_k
– y_{i+1} = y_i Δ f(y_i)
Theorem:
– There exists k such that y_{k+1} = y_k
– y_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d }

Narrowing for Interval Analysis
  [a, b] Δ ⊥ = [a, b]
  [a, b] Δ [c, d] = [ if a = -∞ then c else a, if b = +∞ then d else b ]

Analysis with widening/narrowing
Two phases
– Phase 1: analyze with widening until converging
– Phase 2: use the values to analyze with narrowing
Phase 1 (semantic equations with widening):
  R[0] = ⊤
  R[1] = [7,7]
  R[2] = R[1] ⊔ R[4]
  R[2.1] = R[2.1] ∇ R[2]
  R[3] = R[2.1] ⊓ [-∞,999]
  R[4] = R[3] + [1,1]
  R[5] = R[2] ⊓ [1000,+∞]
  R[6] = R[5] ⊓ [-∞,999] ⊔ R[5] ⊓ [1001,+∞]
Phase 2 (semantic equations with narrowing):
  R[0] = ⊤
  R[1] = [7,7]
  R[2] = R[1] ⊔ R[4]
  R[2.1] = R[2.1] Δ R[2]
  R[3] = R[2.1] ⊓ [-∞,999]
  R[4] = R[3] + [1,1]
  R[5] = R[2] ⊓ [1000,+∞]
  R[6] = R[5] ⊓ [-∞,999] ⊔ R[5] ⊓ [1001,+∞]

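The two phases can be sketched for the loop `x := 7; while x < 1000 do x := x + 1`. This is a simplified illustration, not the equation system above: it tracks a single interval at the loop head, encodes intervals as `(lo, hi)` pairs with `None` for ⊥, and all names are assumptions made for the example.

```python
import math

INF = math.inf
BOT = None                       # the empty interval (bottom)

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(a, b):
    if a is BOT or b is BOT:
        return BOT
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else BOT

def widen(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (a[0] if a[0] <= b[0] else -INF, a[1] if a[1] >= b[1] else INF)

def narrow(a, b):
    if a is BOT or b is BOT:
        return a
    # Refine only the bounds that widening pushed to infinity.
    return (b[0] if a[0] == -INF else a[0], b[1] if a[1] == INF else a[1])

def f(head):
    """Loop-head equation: entry value [7,7] joined with the effect of one loop body."""
    guard = meet(head, (-INF, 999))                      # assume x < 1000
    body = BOT if guard is BOT else (guard[0] + 1, guard[1] + 1)   # x := x + 1
    return join((7, 7), body)

def iterate(op, start):
    cur = start
    while True:
        nxt = op(cur, f(cur))
        if nxt == cur:
            return cur
        cur = nxt

widened = iterate(widen, BOT)          # phase 1
narrowed = iterate(narrow, widened)    # phase 2
exit_state = meet(narrowed, (1000, INF))   # assume x >= 1000
```

In this sketch phase 1 stops at `(7, inf)` after three steps instead of roughly a thousand, and phase 2 recovers `(7, 1000)` at the loop head, so the exit interval is exactly `(1000, 1000)`.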
Analysis with widening/narrowing

Analysis results widening/narrowing
Precise invariant
