Program Analysis and Verification, Spring 2015
Lecture 13: Abstract Interpretation V
Roman Manevich, Ben-Gurion University
Syllabus
– Semantics: Natural Semantics, Structural Semantics, Axiomatic Verification
– Static Analysis: Automating Hoare Logic, Control Flow Graphs, Equation Systems, Collecting Semantics
– Abstract Interpretation fundamentals: Lattices, Fixed-Points, Chaotic Iteration, Galois Connections, Domain constructors, Widening/Narrowing
– Analysis Techniques: Numerical Domains, Alias analysis, Interprocedural Analysis, Shape Analysis, CEGAR
– Crafting your own: Soot, From proofs to abstractions, Systematically developing transformers
Previously
– Composing abstract domains (and GCs)
– Reduced product
– Implementing composition of analyses
Agenda
– Abstract interpretation for infinite-height domains via widening and narrowing
A motivating example
How can we prove this automatically? RelProd(CP, VE)
The interval domain
Interval domain
– One of the simplest numerical domains
– Maintain for each variable x an interval [L,H]
  – L is either an integer or -∞
  – H is either an integer or +∞
– A (non-relational) numeric domain
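Intervals lattice for variable x
[Figure: Hasse diagram of the interval lattice. Singleton intervals such as [-2,-2], [-1,-1], [0,0], [1,1], [2,2] form the lowest row shown; above them come wider intervals such as [-2,-1], [-1,0], [0,1], [1,2], [2,3], [-10,10] and [-20,10], then half-bounded intervals such as [-∞,-1], [-∞,0], [0,+∞], [1,+∞], [2,+∞], with [-∞,+∞] at the top.]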
Intervals lattice for variable x
D_int[x] = { (L,H) | L ∈ {-∞} ∪ Z and H ∈ Z ∪ {+∞} and L ≤ H }
⊤ = [-∞,+∞]
⊑ = ? Interval inclusion: [a,b] ⊑ [c,d] iff c ≤ a and b ≤ d
– [1,2] ⊑ [3,4]? no
– [1,4] ⊑ [1,3]? no
– [1,3] ⊑ [1,4]? yes
– [1,3] ⊑ [-∞,+∞]? yes
What is the lattice height? Infinite
Joining/meeting intervals
[a,b] ⊔ [c,d] = [min(a,c), max(b,d)]
– [1,1] ⊔ [2,2] = [1,2]
– [1,1] ⊔ [2,+∞] = [1,+∞]
[a,b] ⊓ [c,d] = [max(a,c), min(b,d)] if this is a proper interval (max(a,c) ≤ min(b,d)), and ⊥ otherwise
– [1,2] ⊓ [3,4] = ⊥
– [1,4] ⊓ [3,4] = [3,4]
– [1,1] ⊓ [1,+∞] = [1,1]
Check that indeed x ⊑ y if and only if x ⊔ y = y
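The join, meet, and ordering above can be tried out with a minimal Python sketch. The encoding is an assumption made for illustration, not something the slides prescribe: an interval is a pair `(lo, hi)`, infinite bounds are `math.inf`, and `BOT = None` stands for ⊥.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element (empty interval)

def leq(x, y):
    """x ⊑ y: interval inclusion, with BOT below every element."""
    if x is BOT:
        return True
    if y is BOT:
        return False
    return y[0] <= x[0] and x[1] <= y[1]

def join(x, y):
    """Least upper bound: the smallest interval containing both arguments."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    """Greatest lower bound: the intersection, or BOT when it is empty."""
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

# Spot-check the exercise: x ⊑ y holds exactly when x ⊔ y = y.
for x, y in [((1, 3), (1, 4)), ((1, 4), (1, 3)), ((1, 2), (3, 4))]:
    assert leq(x, y) == (join(x, y) == y)
```

The loop at the end only samples three pairs; it illustrates the exercise rather than proving it.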
Interval domain for programs
D_int[x] = { (L,H) | L ∈ {-∞} ∪ Z and H ∈ Z ∪ {+∞} and L ≤ H }
For a program with variables Var = {x_1,…,x_k}
D_int[Var] = D_int[x_1] × … × D_int[x_k]
How can we represent it in terms of formulas?
– Two types of factoids: x ≥ c and x ≤ c
– Example: S = {x ≥ 9, y ≥ 5, y ≤ 10}
– Helper operations
  – c + +∞ = +∞
  – remove(S, x) = S without any x-constraints
  – lb(S, x) = c if (x ≥ c) ∈ S, and -∞ otherwise
  – ub(S, x) = c if (x ≤ c) ∈ S, and +∞ otherwise
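One possible encoding of the factoid representation is sketched below in Python. The triple format `(variable, operator, constant)` and the reading of the example set as {x ≥ 9, y ≥ 5, y ≤ 10} are assumptions for illustration; only the names `remove`, `lb` and `ub` come from the slide.

```python
import math

def remove(S, x):
    """S without any constraint on variable x."""
    return {f for f in S if f[0] != x}

def lb(S, x):
    """Lower bound of x in S, or -infinity when S has no 'x >= c' factoid."""
    bounds = [c for (v, op, c) in S if v == x and op == ">="]
    return max(bounds) if bounds else -math.inf

def ub(S, x):
    """Upper bound of x in S, or +infinity when S has no 'x <= c' factoid."""
    bounds = [c for (v, op, c) in S if v == x and op == "<="]
    return min(bounds) if bounds else math.inf

# Assumed reading of the slide's example set.
S = {("x", ">=", 9), ("y", ">=", 5), ("y", "<=", 10)}
```

Taking `max` of the lower bounds and `min` of the upper bounds keeps the tightest constraint if a set happens to hold more than one factoid of the same kind for a variable.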
Interval domain transformers
Assignment transformers
⟦x := c⟧# S = remove(S,x) ∪ {x ≥ c, x ≤ c}
⟦x := y⟧# S = remove(S,x) ∪ {x ≥ lb(S,y), x ≤ ub(S,y)}
⟦x := y+c⟧# S = remove(S,x) ∪ {x ≥ lb(S,y)+c, x ≤ ub(S,y)+c}
⟦x := y+z⟧# S = remove(S,x) ∪ {x ≥ lb(S,y)+lb(S,z), x ≤ ub(S,y)+ub(S,z)}
⟦x := y*c⟧# S = remove(S,x) ∪
  if c>0: {x ≥ lb(S,y)*c, x ≤ ub(S,y)*c}
  if c<0: {x ≥ ub(S,y)*c, x ≤ lb(S,y)*c} (multiplying by a negative constant swaps the bounds)
  if c=0: {x ≥ 0, x ≤ 0}
⟦x := y*z⟧# S = remove(S,x) ∪ ?
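The assignment transformers can be sketched in Python as follows. As a simplifying assumption, the abstract state is a dictionary from variable names to `(lo, hi)` pairs rather than a set of factoids, so that `remove(S,x)` followed by adding the new bounds becomes a single dictionary update. The function names are hypothetical.

```python
import math

def assign_const(env, x, c):
    """x := c"""
    return {**env, x: (c, c)}

def assign_add_const(env, x, y, c):
    """x := y + c"""
    lo, hi = env[y]
    return {**env, x: (lo + c, hi + c)}

def assign_add(env, x, y, z):
    """x := y + z"""
    (ylo, yhi), (zlo, zhi) = env[y], env[z]
    return {**env, x: (ylo + zlo, yhi + zhi)}

def assign_mul_const(env, x, y, c):
    """x := y * c; a negative c swaps the roles of the two bounds."""
    lo, hi = env[y]
    if c == 0:
        return {**env, x: (0, 0)}  # avoids computing inf * 0
    if c > 0:
        return {**env, x: (lo * c, hi * c)}
    return {**env, x: (hi * c, lo * c)}

env = {"y": (1, 2), "z": (-math.inf, 5)}
```

For instance, with y in [1,2], `assign_mul_const(env, "x", "y", -3)` binds x to `(-6, -3)`, which shows why the bounds must be swapped for a negative constant.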
Assume transformers
⟦assume x=c⟧# S = S ⊓ {x ≥ c, x ≤ c}
⟦assume x<c⟧# S = S ⊓ {x ≤ c-1}
⟦assume x=y⟧# S = S ⊓ {x ≥ lb(S,y), x ≤ ub(S,y)} ⊓ {y ≥ lb(S,x), y ≤ ub(S,x)}
⟦assume x≠c⟧# S = (S ⊓ {x ≤ c-1}) ⊔ (S ⊓ {x ≥ c+1})
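A minimal Python sketch of the assume transformers for a single variable follows. It assumes intervals encoded as `(lo, hi)` pairs with `BOT = None` for ⊥; the function names are hypothetical.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def assume_eq(iv, c):
    """assume x = c"""
    return meet(iv, (c, c))

def assume_lt(iv, c):
    """assume x < c, which over the integers is x <= c - 1"""
    return meet(iv, (-INF, c - 1))

def assume_neq(iv, c):
    """assume x != c: join of the part below c and the part above c."""
    return join(meet(iv, (-INF, c - 1)), meet(iv, (c + 1, INF)))
```

Since an interval cannot represent a hole, `assume_neq((0, 10), 5)` returns `(0, 10)` unchanged; the transformer only gains precision when c is an endpoint, as in `assume_neq((5, 10), 5)`, or when the interval is exactly [c,c].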
Analysis with interval domain
Concrete semantics equations
R[0] = { x ∈ Z }
R[1] = ⟦x:=7⟧ R[0]
R[2] = R[1] ∪ R[4]
R[3] = ⟦assume x<1000⟧ R[2]
R[4] = ⟦x:=x+1⟧ R[3]
R[5] = ⟦assume x≥1000⟧ R[2]
R[6] = ⟦assume x≠1000⟧ R[5]
[Figure: control flow graph with program points R[0] to R[6]]
Abstract semantics equations
R[0] = ⊤
R[1] = [7,7]
R[2] = R[1] ⊔ R[4]
R[3] = R[2] ⊓ [-∞,999]
R[4] = R[3] + [1,1]
R[5] = R[2] ⊓ [1000,+∞]
R[6] = (R[5] ⊓ [-∞,999]) ⊔ (R[5] ⊓ [1001,+∞])
[Figure: control flow graph with program points R[0] to R[6]]
Too many iterations to converge
How many iterations for this one?
– Problem: need infinite-height domain
– Basic fixed-point analysis may not terminate when ACC does not hold
– Solution: come up with a new fixed-point finding algorithm
Revisiting the basic static analysis fixed-point algorithm
Effect of function f on lattice elements
L = (D, ⊑, ⊔, ⊓, ⊥, ⊤) a complete lattice, f : D → D monotone
Fix(f) = { d | f(d) = d }
Red(f) = { d | f(d) ⊑ d }
Ext(f) = { d | d ⊑ f(d) }
Theorem [Tarski 1955]
– lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f)
– gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f)
[Figure: the lattice with the regions Red(f) and Ext(f), whose intersection is Fix(f); lfp is the least and gfp the greatest element of Fix(f); the iterates f^n(⊥) ascend towards lfp and f^n(⊤) descend towards gfp]
Continuity and ACC condition
Let L = (D, ⊑, ⊔, ⊥) be a complete partial order
– Every ascending chain has a least upper bound
A function f is continuous if for every ascending chain Y ⊆ D, f(⊔Y) = ⊔{ f(y) | y ∈ Y }
L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d_0 ⊑ d_1 ⊑ … ⊑ d_n = d_{n+1} = …
Fixed-point theorem [Kleene]
Let L = (D, ⊑, ⊔, ⊥) be a complete partial order and f : D → D a continuous function. Then
lfp(f) = ⊔_{n∈N} f^n(⊥)
When ACC holds and f is monotone, f is continuous
Resulting algorithm
Kleene's fixed-point theorem gives a constructive method for computing the lfp
Mathematical definition: lfp(f) = ⊔_{n∈N} f^n(⊥)
Algorithm:
d := ⊥
while f(d) ≠ d do
  d := f(d)
return d
[Figure: the ascending chain ⊥, f(⊥), f^2(⊥), …, f^n(⊥) reaching lfp]
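The Kleene iteration can be sketched in Python and run on the loop head of the running example (x := 7; while x < 1000 do x := x + 1). The helper `loop_head` is a hypothetical one-variable encoding that folds the equations for R[2], R[3] and R[4] into a single function; intervals are `(lo, hi)` pairs and `BOT = None` stands for ⊥.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def kleene_lfp(f, bottom):
    """Iterate d := f(d) from bottom until f(d) = d; also count the steps."""
    d, steps = bottom, 0
    while f(d) != d:
        d = f(d)
        steps += 1
    return d, steps

def loop_head(r2):
    """R[2] = [7,7] ⊔ ((R[2] ⊓ [-inf,999]) + [1,1])"""
    r3 = meet(r2, (-INF, 999))
    r4 = BOT if r3 is BOT else (r3[0] + 1, r3[1] + 1)
    return join((7, 7), r4)

invariant, steps = kleene_lfp(loop_head, BOT)
```

The iteration terminates here only because the loop bound is finite. The upper bound grows by one per step, so the number of steps is proportional to the bound, and a loop without an upper bound would make the iteration run forever.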
Chaotic iteration
Input:
– A cpo L = (D, ⊑, ⊔, ⊥) satisfying ACC
– L^n = L × L × … × L
– A monotone function f : D^n → D^n
– A system of equations { X[i] = F[i](X) | 1 ≤ i ≤ n }, where F[i] is the i-th component of f
Output: lfp(f)
A worklist-based algorithm:
for i := 1 to n do X[i] := ⊥
WL := {1,…,n}
while WL ≠ ∅ do
  j := pop WL // choose index non-deterministically
  N := F[j](X)
  if N ≠ X[j] then
    X[j] := N
    add all the indexes that directly depend on j to WL
    (X[k] depends on X[j] if F[k] contains X[j])
return X
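The worklist algorithm can be sketched in Python as below. The three-equation system over sets is made up for illustration and is not from the lecture; indices are 0-based here, unlike the 1-based pseudocode, and the dependency map `deps` is supplied by hand rather than derived from the equations.

```python
def chaotic_iteration(F, deps, bottom):
    """Solve X[i] = F[i](X) by re-evaluating equations from a worklist.

    F    : list of functions, F[i] maps the whole vector X to a new X[i]
    deps : deps[j] is the set of indexes k whose equation F[k] reads X[j]
    """
    X = [bottom] * len(F)
    worklist = set(range(len(F)))
    while worklist:
        j = worklist.pop()        # any choice of index is allowed
        new = F[j](X)
        if new != X[j]:
            X[j] = new
            worklist |= deps[j]   # re-evaluate the equations that read X[j]
    return X

# Hypothetical system over the powerset lattice of {"a", "b"}:
#   X[0] = {"a"},  X[1] = X[0] ∪ X[2],  X[2] = X[1] ∪ {"b"}
F = [
    lambda X: frozenset({"a"}),
    lambda X: X[0] | X[2],
    lambda X: X[1] | frozenset({"b"}),
]
deps = [{1}, {2}, {1}]
solution = chaotic_iteration(F, deps, frozenset())
```

Because the equations are monotone and the lattice is finite, the result does not depend on the order in which `set.pop` happens to return indexes.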
Widening and narrowing
Widening
– Introduce a new binary operator ∇ to ensure termination
  – A kind of extrapolation
– Enables static analysis to use infinite-height lattices
  – Dynamically adapts to given program
– Tricky to design
– Precision less predictable than with finite-height domains (widening is non-monotone)
Formal definition
– For all elements d_1, d_2: d_1 ⊔ d_2 ⊑ d_1 ∇ d_2
– For all ascending chains d_0 ⊑ d_1 ⊑ d_2 ⊑ … the following sequence stabilizes after finitely many steps
  – y_0 = d_0
  – y_{i+1} = y_i ∇ d_{i+1}
– For a monotone function f : D → D define
  – x_0 = ⊥
  – x_{i+1} = x_i ∇ f(x_i)
– Theorem:
  – There exists k such that x_{k+1} = x_k
  – x_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d }
Analysis with finite-height lattice
[Figure: in the abstract domain A, the iterates f#(⊥), f#^2(⊥), f#^3(⊥), … climb until f#^n(⊥) = lfp(f#), the least element of Fix(f) inside Red(f)]
Analysis with widening
[Figure: in the abstract domain A, the widened iterates overshoot lfp(f#) and stop at a post-fixed point inside Red(f), above Fix(f)]
A widening for the interval domain
Widening for Intervals Analysis
⊥ ∇ [c, d] = [c, d]
[a, b] ∇ [c, d] = [ if a ≤ c then a else -∞, if b ≥ d then b else +∞ ]
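A minimal Python sketch of this widening follows, assuming intervals encoded as `(lo, hi)` pairs with `BOT = None` for ⊥. The case `x ∇ ⊥ = x` and the helper `widened_chain` are additions for illustration; the latter computes the sequence y_0 = d_0, y_{i+1} = y_i ∇ d_{i+1} for a given ascending chain.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def widen(x, y):
    """Keep a bound of x if y does not exceed it; otherwise jump to infinity."""
    if x is BOT:
        return y
    if y is BOT:
        return x  # assumed convention, not stated on the slide
    (a, b), (c, d) = x, y
    return (a if a <= c else -INF, b if b >= d else INF)

def widened_chain(chain):
    """y_0 = d_0, y_{i+1} = y_i ∇ d_{i+1} for an ascending chain d_0, d_1, ..."""
    y = chain[0]
    ys = [y]
    for d in chain[1:]:
        y = widen(y, d)
        ys.append(y)
    return ys

# The chain [7,7] ⊑ [7,8] ⊑ [7,9] ⊑ ... that arises at the loop head.
chain = [(7, 7 + i) for i in range(5)]
ys = widened_chain(chain)
```

On this chain the widened sequence reaches `(7, inf)` at its second element and stays there, however long the chain is made.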
Semantic equations with widening
R[0] = ⊤
R[1] = [7,7]
R[2] = R[1] ⊔ R[4]
R[2.1] = R[2.1] ∇ R[2]
R[3] = R[2.1] ⊓ [-∞,999]
R[4] = R[3] + [1,1]
R[5] = R[2] ⊓ [1000,+∞]
R[6] = (R[5] ⊓ [-∞,999]) ⊔ (R[5] ⊓ [1001,+∞])
[Figure: control flow graph with program points R[0] to R[6]]
Choosing analysis with widening: enable widening
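Non-monotonicity of widening
– [0,1] ∇ [0,2] = [0,+∞]
– [0,2] ∇ [0,2] = [0,2]
Here [0,1] ⊑ [0,2], yet [0,1] ∇ [0,2] = [0,+∞] ⋢ [0,2] = [0,2] ∇ [0,2]: enlarging the left argument made the result smaller, so ∇ is not monotone in its first argument
What is the impact of non-monotonicity?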
Analysis results with widening
Did we prove it?
Narrowing
Analysis with narrowing
[Figure: starting from the post-fixed point reached by widening, the narrowing iterates descend inside Red(f) towards lfp(f#) without going below it]
Formal definition of narrowing
– Improves the result of widening
– y ⊑ x ⟹ y ⊑ (x Δ y) ⊑ x
– For all decreasing chains x_0 ⊒ x_1 ⊒ … the following sequence stabilizes after finitely many steps
  – y_0 = x_0
  – y_{i+1} = y_i Δ x_{i+1}
– For a monotone function f : D → D and x_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d } define
  – y_0 = x_k
  – y_{i+1} = y_i Δ f(y_i)
– Theorem:
  – There exists k such that y_{k+1} = y_k
  – y_k ∈ Red(f) = { d | d ∈ D and f(d) ⊑ d }
A narrowing for the interval domain
Narrowing for Interval Analysis
[a, b] Δ ⊥ = [a, b]
[a, b] Δ [c, d] = [ if a = -∞ then c else a, if b = +∞ then d else b ]
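A minimal Python sketch of this narrowing, assuming intervals encoded as `(lo, hi)` pairs with `BOT = None` for ⊥. Returning the left argument when it is ⊥ is an assumed convention for a case the slide does not cover.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def narrow(x, y):
    """Refine only the infinite bounds of x, using the bounds of y."""
    if x is BOT or y is BOT:
        return x
    (a, b), (c, d) = x, y
    return (c if a == -INF else a, d if b == INF else b)
```

For example, `narrow((7, INF), (7, 1000))` gives `(7, 1000)`, while `narrow((7, 1000), (7, 500))` leaves `(7, 1000)` untouched: finite bounds are never refined, which is what keeps the narrowing sequence from descending forever.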
Semantic equations with narrowing
R[0] = ⊤
R[1] = [7,7]
R[2] = R[1] ⊔ R[4]
R[2.1] = R[2.1] Δ R[2]
R[3] = R[2.1] ⊓ [-∞,999]
R[4] = R[3] + [1,1]
R[5] = R[2] ⊓ [1000,+∞]
R[6] = (R[5] ⊓ [-∞,999]) ⊔ (R[5] ⊓ [1001,+∞])
[Figure: control flow graph with program points R[0] to R[6]]
Combining widening and narrowing
Analysis with widening/narrowing
Two phases
– Phase 1: analyze with widening until converging
– Phase 2: use the values from phase 1 to analyze with narrowing
Phase 1:
R[0] = ⊤
R[1] = [7,7]
R[2] = R[1] ⊔ R[4]
R[2.1] = R[2.1] ∇ R[2]
R[3] = R[2.1] ⊓ [-∞,999]
R[4] = R[3] + [1,1]
R[5] = R[2] ⊓ [1000,+∞]
R[6] = (R[5] ⊓ [-∞,999]) ⊔ (R[5] ⊓ [1001,+∞])
Phase 2:
R[0] = ⊤
R[1] = [7,7]
R[2] = R[1] ⊔ R[4]
R[2.1] = R[2.1] Δ R[2]
R[3] = R[2.1] ⊓ [-∞,999]
R[4] = R[3] + [1,1]
R[5] = R[2] ⊓ [1000,+∞]
R[6] = (R[5] ⊓ [-∞,999]) ⊔ (R[5] ⊓ [1001,+∞])
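The two phases can be sketched end to end in Python for the running example. This is a simplification under stated assumptions: the loop head is a single value (R[2] and R[2.1] are collapsed into one), `loop_head` folds the equations for R[3] and R[4] into one function, and the exit state is computed from the loop-head value. All function names are hypothetical.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def widen(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (x[0] if x[0] <= y[0] else -INF, x[1] if x[1] >= y[1] else INF)

def narrow(x, y):
    if x is BOT or y is BOT:
        return x
    return (y[0] if x[0] == -INF else x[0], y[1] if x[1] == INF else x[1])

def loop_head(r):
    """[7,7] ⊔ ((r ⊓ [-inf,999]) + [1,1]) for: x := 7; while x < 1000: x := x + 1"""
    r3 = meet(r, (-INF, 999))
    r4 = BOT if r3 is BOT else (r3[0] + 1, r3[1] + 1)
    return join((7, 7), r4)

def iterate(op, f, start):
    """x_{i+1} = x_i op f(x_i), repeated until the value stops changing."""
    x = start
    while True:
        nxt = op(x, f(x))
        if nxt == x:
            return x
        x = nxt

widened = iterate(widen, loop_head, BOT)         # phase 1
narrowed = iterate(narrow, loop_head, widened)   # phase 2
exit_after_widening = meet(widened, (1000, INF))
exit_after_narrowing = meet(narrowed, (1000, INF))
```

In this sketch phase 1 stops at `(7, inf)`, so the exit state only satisfies x ≥ 1000. Phase 2 recovers `(7, 1000)` at the loop head, and the exit state becomes `(1000, 1000)`, which is enough to show that x ≠ 1000 is unreachable after the loop.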
Analysis with widening/narrowing
Analysis results with widening/narrowing
Precise invariant
Recent development
How to Combine Widening and Narrowing for Non-monotonic Systems of Equations
Kalmer Apinis, Helmut Seidl, Vesal Vojdani, PLDI 2013
– Define a single operator that combines widening and narrowing: a ⊟ b = if b ⊑ a then a Δ b else a ∇ b
– More precise than the two-phase approach
– Ensuring termination requires the solver to process equations in a particular order
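The combined operator can be sketched for intervals in Python. The interval widening and narrowing are the ones from this lecture and the name `combined` is hypothetical. The sketch shows the operator only, not the solver or the evaluation order that the paper relies on for termination.

```python
import math

INF = math.inf
BOT = None  # assumed encoding of the bottom element

def leq(x, y):
    if x is BOT:
        return True
    if y is BOT:
        return False
    return y[0] <= x[0] and x[1] <= y[1]

def widen(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (x[0] if x[0] <= y[0] else -INF, x[1] if x[1] >= y[1] else INF)

def narrow(x, y):
    if x is BOT or y is BOT:
        return x
    return (y[0] if x[0] == -INF else x[0], y[1] if x[1] == INF else x[1])

def combined(a, b):
    """Narrow when the new value b is below the old value a; widen otherwise."""
    return narrow(a, b) if leq(b, a) else widen(a, b)
```

When the new value grows, as in `combined((7, 7), (7, 8))`, the operator widens to `(7, inf)`; when it shrinks, as in `combined((7, INF), (7, 1000))`, it narrows to `(7, 1000)`.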
Next lecture: numerical abstractions