Spring 2017 Program Analysis and Verification Lecture 11: Abstract Interpretation III Galois Connections Roman Manevich Ben-Gurion University
Tentative syllabus
Program Verification: Operational semantics, Hoare Logic, Predicate Calculus, Data Structures, Termination
Program Analysis Basics: Control Flow Graphs, Equation Systems, Collecting Semantics
Abstract Interpretation fundamentals: Lattices, Fixed-Points, Chaotic Iteration, Galois Connections, Domain constructors, Widening/Narrowing
Analysis Techniques: Numerical Domains, Alias analysis, Interprocedural Analysis, Shape Analysis, CEGAR
Previously
- Solving monotone systems
- Vanilla static analysis algorithm
- Chaotic iteration
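Static analysis
Given a system of equations for the collecting semantics
  $R[0] =$ (the set of initial states)
  $R[1] = R[0] \cup R[4]$
  $R[2] = [\![\text{assume } x>0]\!]\; R[1]$
  $R[3] = [\![\text{assume } x \le 0]\!]\; R[1]$
  $R[4] = [\![x:=x-1]\!]\; R[2]$
a static analysis solves a corresponding system of equations over an abstract domain
  $R[0]^\# =$ (the abstraction of the initial states)
  $R[1]^\# = R[0]^\# \sqcup R[4]^\#$
  $R[2]^\# = [\![\text{assume } x>0]\!]^\#\; R[1]^\#$
  $R[3]^\# = [\![\text{assume } x \le 0]\!]^\#\; R[1]^\#$
  $R[4]^\# = [\![x:=x-1]\!]^\#\; R[2]^\#$
Questions:
- What is the relation between the solutions? (this lecture)
- How do you solve the second system? (previous lecture)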
Required knowledge
- Collecting semantics
- Abstract semantics (over lattices)
- Algorithm to compute abstract semantics
  - Vector iteration
  - Chaotic iteration
- Connection between collecting semantics and abstract semantics
  - Abstract transformers
Agenda
- Galois connections
- Abstract transformers
- Global soundness
Recap 1/2
- We defined a reference semantics: the collecting semantics
- We defined an abstract semantics by
  - Choosing an abstract domain (lattice)
  - Developing algorithms for:
    - Testing the partial order
    - Join
    - Abstract transformers
Recap 2/2
- We defined an algorithm to compute the abstract least fixed-point when the transformers are monotone and the lattice obeys the ascending chain condition (ACC)
Questions:
- What is the connection between the two least fixed-points?
- Transformer monotonicity is required for termination. What should we require for correctness?
Relating the abstract domain to the concrete domain
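Galois Connection
Given two complete lattices
  $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$ (concrete domain)
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$ (abstract domain)
a Galois Connection (GC) is a quadruple $(C, \alpha, \gamma, A)$ that relates $C$ and $A$ via the monotone functions
- the abstraction function $\alpha : D_C \to D_A$
- the concretization function $\gamma : D_A \to D_C$
For every concrete element $c \in D_C$ and abstract element $a \in D_A$:
1) $\alpha(\gamma(a)) \sqsubseteq_A a$ and $c \sqsubseteq_C \gamma(\alpha(c))$
2) $\alpha(c) \sqsubseteq_A a$ iff $c \sqsubseteq_C \gamma(a)$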
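The two conditions can be checked by brute force on a small example. The sketch below is not from the lecture: it assumes a five-element sign lattice (the names `bot`, `neg`, `zero`, `pos`, `top` are illustrative) and a tiny finite universe `U` standing in for the integers, and it tests conditions 1) and 2) for every pair of concrete and abstract elements.

```python
from itertools import combinations

U = range(-2, 3)  # assumption: a small finite universe standing in for the integers
SIGNS = ("bot", "neg", "zero", "pos", "top")  # hypothetical element names

def leq(a, b):
    """Partial order of the flat sign lattice: bot <= {neg, zero, pos} <= top."""
    return a == b or a == "bot" or b == "top"

def join(a, b):
    """Least upper bound in the flat sign lattice."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return "top"

def sign(x):
    return "neg" if x < 0 else "zero" if x == 0 else "pos"

def alpha(c):
    """Abstraction: join of the signs of all elements of c."""
    result = "bot"
    for x in c:
        result = join(result, sign(x))
    return result

def gamma(a):
    """Concretization: all universe elements whose sign is below a."""
    return frozenset(x for x in U if leq(sign(x), a))

def subsets(xs):
    xs = list(xs)
    return [frozenset(s) for r in range(len(xs) + 1) for s in combinations(xs, r)]

def check_galois_connection():
    """Return (condition 1 holds, condition 2 holds) over all of 2^U and SIGNS."""
    cs = subsets(U)
    cond1 = (all(leq(alpha(gamma(a)), a) for a in SIGNS)
             and all(c <= gamma(alpha(c)) for c in cs))
    cond2 = all(leq(alpha(c), a) == (c <= gamma(a)) for c in cs for a in SIGNS)
    return cond1, cond2
```

Calling `check_galois_connection()` exercises all 32 subsets of `U` against all five abstract elements. An exhaustive check like this is only possible because the example is finite.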
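(1.1) Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$
$\alpha(c)$ is the most precise (least) element in $A$ representing $c$
[Figure: $c$ in $C$ (1) is mapped by $\alpha$ to $\alpha(c)$ in $A$ (2), which $\gamma$ maps back to $\gamma(\alpha(c))$ in $C$ (3), above $c$]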
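(1.2) Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$
$\gamma(a)$ is what $a$ represents in $C$ (its meaning)
[Figure: $a$ in $A$ is mapped by $\gamma$ to $\gamma(a)$ in $C$, which $\alpha$ maps back to $\alpha(\gamma(a))$ in $A$, below $a$]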
Example: lattice of equalities
Concrete lattice: $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
Abstract lattice: $EQ = \{ x=y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
- Treat elements of $A$ as both formulas and sets of constraints
- Useful for copy propagation, a compiler optimization
$\alpha(X) = ?$
$\gamma(Y) = ?$
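Example: lattice of equalities
Concrete lattice: $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
Abstract lattice: $EQ = \{ x=y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
- Treat elements of $A$ as both formulas and sets of constraints
- Useful for copy propagation, a compiler optimization
$\beta(\sigma) = \alpha(\{\sigma\}) = \{ x=y \mid \sigma(x) = \sigma(y) \}$, that is, $\sigma \models x=y$
$\alpha(X) = \bigcap \{ \beta(\sigma) \mid \sigma \in X \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
$\gamma(Y) = \{ \sigma \mid \sigma \models Y \} = models(Y)$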
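A minimal sketch of these three definitions, assuming three variables and a two-value range so that the set of states is finite (the names `VARS`, `VALUES` and `STATES` are illustrative, not from the slides). States are dictionaries and an equality `a=b` is the pair `(a, b)`.

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = (0, 1)  # assumption: tiny value range so that State is finite
EQ = frozenset((a, b) for a in VARS for b in VARS)
STATES = [dict(zip(VARS, vals)) for vals in product(VALUES, repeat=len(VARS))]

def beta(state):
    """Abstraction of one state: every equality the state satisfies."""
    return frozenset((a, b) for (a, b) in EQ if state[a] == state[b])

def alpha(states):
    """Abstraction of a set of states: the join in A, i.e. set intersection."""
    result = EQ  # bottom of A, the abstraction of the empty set
    for s in states:
        result &= beta(s)
    return result

def gamma(constraints):
    """Concretization: all states (over VALUES) that model every constraint."""
    return [s for s in STATES if all(s[a] == s[b] for (a, b) in constraints)]
```

For instance, `alpha([{"x": 1, "y": 1, "z": 0}, {"x": 0, "y": 0, "z": 0}])` keeps `x=y` but drops `x=z`, because the first state violates it.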
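Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$
[Figure: the state $[x \mapsto 5, y \mapsto 5, z \mapsto 5]$ (1) is abstracted to $\{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$ (2), the most precise (least) element in $A$ representing it. Concretizing (2) gives $\{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$ (3). The element $\{x=x, y=y, z=z\}$ (4) also represents the state, but less precisely.]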
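Most precise abstract representation
Lemma: $\alpha(c) = \sqcap_A \{ a \mid c \sqsubseteq_C \gamma(a) \}$
[Figure: $c$ in $C$ and the abstract elements in $A$ whose concretization covers $c$; $\alpha(c)$ is the least of them]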
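Most precise abstract representation
Lemma: $\alpha(c) = \sqcap_A \{ a \mid c \sqsubseteq_C \gamma(a) \}$
[Figure: for $c = [x \mapsto 5, y \mapsto 5, z \mapsto 5]$, abstract elements such as $\{x=y\}$, $\{x=y, z=y\}$ and $\{x=y, y=z\}$ all represent $c$; the meet of all such elements is $\alpha(c) = \{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$]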
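Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$
$\gamma(a)$ is what $a$ represents in $C$ (its meaning)
$\alpha \circ \gamma$ is called a semantic reduction
[Figure: $a = \{x=y, y=z\}$ (1) is concretized to $\{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$ (2), which is abstracted to $\{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$ (3)]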
Partial/full reduction
- The operator $\alpha \circ \gamma$ is called a semantic reduction (or full reduction), since $\alpha(\gamma(a))$ means the same as $a$ but is a reduced, more precise version of $a$
- An operator $reduce : D_A \to D_A$ is a partial reduction if $reduce(a) \sqsubseteq_A a$ and $\gamma(a) = \gamma(reduce(a))$
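Galois Insertion
$\forall a : \alpha(\gamma(a)) = a$
All elements are reduced
How can we obtain a Galois Insertion from a Galois Connection?
[Figure: the reduced element $\{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$ (1) in $A$ and its concretization $\{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$ (2) in $C$]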
Special cases
Properties of a Galois Connection
Theorem: the abstraction and concretization functions uniquely determine each other:
$\gamma(a) = \bigsqcup_C \{ c \mid \alpha(c) \sqsubseteq_A a \}$
$\alpha(c) = \sqcap_A \{ a \mid c \sqsubseteq_C \gamma(a) \}$
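To see the theorem at work, the sketch below starts from an abstraction function only and recovers the concretization from it, then recovers the abstraction back from that. It assumes a four-element parity lattice over the finite universe {0, 1, 2, 3}; the element names and helper names are illustrative.

```python
from itertools import combinations

U = range(4)  # assumption: finite universe {0, 1, 2, 3}
PARITIES = ("bot", "even", "odd", "top")  # hypothetical element names

def leq(a, b):
    """Partial order of the flat parity lattice."""
    return a == b or a == "bot" or b == "top"

def alpha(c):
    """Abstraction: the parity shared by all elements of c, or top if mixed."""
    kinds = {"even" if x % 2 == 0 else "odd" for x in c}
    if not kinds:
        return "bot"
    return kinds.pop() if len(kinds) == 1 else "top"

def subsets(xs):
    xs = list(xs)
    return [frozenset(s) for r in range(len(xs) + 1) for s in combinations(xs, r)]

def gamma_from_alpha(a):
    """gamma(a) = union of every concrete c with alpha(c) <= a."""
    result = frozenset()
    for c in subsets(U):
        if leq(alpha(c), a):
            result |= c
    return result

def alpha_from_gamma(c):
    """alpha(c) = meet of every a with c <= gamma(a); here that meet is the least candidate."""
    candidates = [a for a in PARITIES if c <= gamma_from_alpha(a)]
    return next(a for a in candidates if all(leq(a, b) for b in candidates))
```

On this example `gamma_from_alpha("even")` yields `{0, 2}`, and `alpha_from_gamma` agrees with `alpha` on every subset of `U`.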
Abstracting (disjunctive) sets
It is usually convenient to first define the abstraction of single elements
  $\beta(s) = \alpha(\{s\})$
Then lift the abstraction to sets of elements
  $\alpha(X) = \bigsqcup_A \{ \beta(s) \mid s \in X \}$
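The case of symbolic domains
An important class of abstract domains are symbolic domains: domains of formulas
  $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
If $D_A$ is a set of formulas then the abstraction of a state is defined as
  $\beta(\sigma) = \alpha(\{\sigma\}) = \sqcap_A \{ \varphi \mid \sigma \models \varphi \}$, the least formula from $D_A$ that $\sigma$ satisfies
The abstraction of a set of states is
  $\alpha(X) = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
The concretization is
  $\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = models(\varphi)$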
Composing Galois connections
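Inducing along the connections
Assume the complete lattices
  $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
  $M = (D_M, \sqsubseteq_M, \sqcup_M, \sqcap_M, \bot_M, \top_M)$
and Galois connections $GC_{C,A} = (C, \alpha_{C,A}, \gamma_{A,C}, A)$ and $GC_{A,M} = (A, \alpha_{A,M}, \gamma_{M,A}, M)$
Lemma: both Galois connections induce the $GC_{C,M} = (C, \alpha_{C,M}, \gamma_{M,C}, M)$ defined by
  $\alpha_{C,M}(c) = \alpha_{A,M}(\alpha_{C,A}(c))$ and $\gamma_{M,C}(m) = \gamma_{A,C}(\gamma_{M,A}(m))$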
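One way to see why the composed pair is again a Galois connection is to chain the "iff" characterization of the two given connections. The derivation below is a sketch in the subscript notation $\alpha_{C,A}$, $\gamma_{A,C}$, $\alpha_{A,M}$, $\gamma_{M,A}$ for the two connections; it holds for every $c \in D_C$ and $m \in D_M$.

```latex
\begin{align*}
\alpha_{A,M}(\alpha_{C,A}(c)) \sqsubseteq_M m
  &\iff \alpha_{C,A}(c) \sqsubseteq_A \gamma_{M,A}(m)
  && \text{(by $GC_{A,M}$)} \\
  &\iff c \sqsubseteq_C \gamma_{A,C}(\gamma_{M,A}(m))
  && \text{(by $GC_{C,A}$)}
\end{align*}
```

Monotonicity of the composed functions follows because a composition of monotone functions is monotone.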
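Inducing along the connections
[Figure: $c$ in $C$ (1) is mapped by $\alpha_{C,A}$ to $\alpha_{C,A}(c)$ in $A$ (2) and then by $\alpha_{A,M}$ to $\alpha_{A,M}(\alpha_{C,A}(c))$ in $M$ (3); $\gamma_{M,A}$ maps back to $a'$ in $A$ (4) and $\gamma_{A,C}$ maps back to $c'$ in $C$ (5)]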
Relating abstract transformers to concrete transformers
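Sound abstract transformer
Given two lattices
  $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
- a concrete transformer $f : D_C \to D_C$
- an abstract transformer $f^\# : D_A \to D_A$
We say that $f^\#$ is a sound transformer (w.r.t. $f$) if
- for every $c$: $f(c) = c' \Rightarrow f^\#(\alpha(c)) \sqsupseteq_A \alpha(c')$
- for every $a$: $\alpha(f(\gamma(a))) \sqsubseteq_A f^\#(a)$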
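Transformer soundness condition 1
$\forall c : f(c) = c' \Rightarrow f^\#(\alpha(c)) \sqsupseteq_A \alpha(c')$
[Figure: $c$ in $C$ is mapped by $f$ to $c'$ and by $\alpha$ to $\alpha(c)$ in $A$; $f^\#(\alpha(c))$ lies above $\alpha(c')$]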
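Transformer soundness condition 2
$\forall a : f^\#(a) = a' \Rightarrow f(\gamma(a)) \sqsubseteq_C \gamma(a')$
[Figure: $a$ in $A$ is mapped by $f^\#$ to $a'$ and by $\gamma$ to $\gamma(a)$ in $C$; $f(\gamma(a))$ lies below $\gamma(a')$]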
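Condition 2 can be tested mechanically on a sample of concrete values. The sketch below is an illustration under assumptions: a sign lattice, the statement `x := x + 1`, a finite sample `U` of the integers, and two hand-written candidate transformers named `SOUND` and `UNSOUND`. Passing the test on a sample does not prove soundness; failing it does disprove it.

```python
U = range(-3, 4)  # assumption: finite sample of the integers

def leq(a, b):
    """Partial order of the flat sign lattice: bot <= {neg, zero, pos} <= top."""
    return a == b or a == "bot" or b == "top"

def sign(x):
    return "neg" if x < 0 else "zero" if x == 0 else "pos"

def in_gamma(x, a):
    """Membership test for gamma(a), so gamma never has to be enumerated."""
    return leq(sign(x), a)

def f(x):
    """Concrete effect of the statement x := x + 1 on a single value."""
    return x + 1

# Two hypothetical abstract transformers for x := x + 1, given as tables.
SOUND = {"bot": "bot", "neg": "top", "zero": "pos", "pos": "pos", "top": "top"}
UNSOUND = dict(SOUND, neg="neg")  # wrongly claims a negative x stays negative

def is_sound_on_sample(f_sharp):
    """Condition 2 on the sample: f(gamma(a)) is contained in gamma(f_sharp(a))."""
    return all(in_gamma(f(x), f_sharp[a])
               for a in f_sharp for x in U if in_gamma(x, a))
```

`UNSOUND` is rejected because the sample contains -1, which is negative, while its successor 0 is not.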
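Best (induced) transformer
$f^\#(a) = \alpha(f(\gamma(a)))$
Problem: incomputable directly
[Figure: $a$ in $A$ (1) is concretized to $\gamma(a)$ in $C$ (2), transformed by $f$ to $f(\gamma(a))$ (3), and abstracted to $f^\#(a)$ in $A$ (4)]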
Best abstract transformer [CC’77]
- Best in terms of precision: the most precise abstract transformer
- May be too expensive to compute
- Constructively defined as $f^\# = \alpha \circ f \circ \gamma$ (induced by the GC)
- Not directly computable because the first step is concretization
- We often settle for a "good enough" transformer
- Useful tool: partial concretization
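On a finite toy domain the induced transformer can be computed literally, by concretizing, applying the concrete transformer, and abstracting. The sketch below assumes a parity lattice over the universe {0, 1, 2, 3} and the statement `x := x - 1`; all names are illustrative. Enumerating `gamma(a)` is exactly the step that is infeasible for realistic domains.

```python
U = range(4)  # assumption: finite universe so that gamma(a) can be enumerated
PARITIES = ("bot", "even", "odd", "top")  # hypothetical element names

def alpha(c):
    """Abstraction: the parity shared by all elements of c, or top if mixed."""
    kinds = {"even" if x % 2 == 0 else "odd" for x in c}
    if not kinds:
        return "bot"
    return kinds.pop() if len(kinds) == 1 else "top"

def gamma(a):
    """Concretization restricted to the finite universe U."""
    return {x for x in U
            if a == "top"
            or (a == "even" and x % 2 == 0)
            or (a == "odd" and x % 2 == 1)}

def f(c):
    """Concrete transformer of x := x - 1, lifted to sets of values."""
    return {x - 1 for x in c}

def best_transformer(a):
    """The induced transformer alpha . f . gamma."""
    return alpha(f(gamma(a)))

BEST = {a: best_transformer(a) for a in PARITIES}
```

The resulting table swaps `even` and `odd` and leaves `bot` and `top` unchanged, which is what one would write by hand for this statement.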
Developing a sound abstract transformer by example
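Transformer example
$C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
$EQ = \{ x=y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
$\beta(\sigma) = \alpha(\{\sigma\}) = \{ x=y \mid \sigma(x) = \sigma(y) \}$, that is, $\sigma \models x=y$
$\alpha(S) = \bigcap \{ \beta(\sigma) \mid \sigma \in S \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in S \}$
$\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = models(\varphi)$
Concrete: $[\![x:=y]\!]\; S = \{ \sigma[x \mapsto \sigma(y)] \mid \sigma \in S \}$
Abstract: $[\![x:=y]\!]^\#\; S = ?$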
Developing a transformer for EQ - 1
Input has the form $S = \bigwedge \{a=b\}$
$sp(x:=expr, \varphi) = \exists v.\ x=expr[v/x] \wedge \varphi[v/x]$
$sp(x:=y, S) = \exists v.\ x=y[v/x] \wedge \bigwedge S[v/x] = \ldots$
Let's define helper notations:
- $Mod(x:=y, S) = \{ x=a,\ b=x \in S \}$: the subset of equalities containing $x$ (will be modified)
- $Frame(x:=y, S) = S \setminus Mod(x:=y, S)$: the subset of equalities not containing $x$ (i.e., the frame)
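Developing a transformer for EQ - 2
$sp(x:=y, S) = \exists v.\ x=y[v/x] \wedge \bigwedge \{a=b\}[v/x] = \ldots$
Two cases
- $x$ is $y$: $sp(x:=x, S) = S$
- $x$ is different from $y$:
  $sp(x:=y, S) = \exists v.\ x=y \wedge \bigwedge Mod(x:=y, S)[v/x] \wedge \bigwedge Frame(x:=y, S)[v/x]$
  $= x=y \wedge \bigwedge Frame(x:=y, S) \wedge \exists v.\ \bigwedge Mod(x:=y, S)[v/x]$
  $\Rightarrow x=y \wedge \bigwedge Frame(x:=y, S)$
Vanilla transformer: $[\![x:=y]\!]^{\#1}\; S = \{x=y\} \cup Frame(x:=y, S)$
Example: $[\![x:=y]\!]^{\#1}\; \{x=p, q=x, m=n\} = \{x=y, m=n\}$
Is this the most precise result?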
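The vanilla transformer translates almost directly into code. In the sketch below an equality `a=b` is represented as the pair `(a, b)` and an abstract element as a Python set of such pairs; the function names `mod`, `frame` and `assign_vanilla` are illustrative, not from the slides.

```python
def mod(x, S):
    """Mod(x:=y, S): the equalities of S that mention x on either side."""
    return {(a, b) for (a, b) in S if x in (a, b)}

def frame(x, S):
    """Frame(x:=y, S): the equalities of S that do not mention x."""
    return set(S) - mod(x, S)

def assign_vanilla(x, y, S):
    """Vanilla transformer for x := y: keep the frame and add x = y."""
    if x == y:
        return set(S)  # the assignment x := x changes nothing
    return {(x, y)} | frame(x, S)

# The example from the slide: x := y applied to {x=p, q=x, m=n}.
example = assign_vanilla("x", "y", {("x", "p"), ("q", "x"), ("m", "n")})
```

Here `example` is the set containing `("x", "y")` and `("m", "n")`, matching the slide's result.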
Developing a transformer for EQ - 3
$[\![x:=y]\!]^{\#1}\; \{x=p, x=q, m=n\} = \{x=y, m=n\} \sqsupseteq_A \{x=y, m=n, p=q\}$
Where does the information $p=q$ come from?
$sp(x:=y, S) = x=y \wedge \bigwedge Frame(x:=y, S) \wedge \exists v.\ \bigwedge Mod(x:=y, S)[v/x]$
$\exists v.\ \bigwedge Mod(x:=y, S)[v/x]$ holds possible equalities between different $a$'s and $b$'s. How can we account for that?
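Developing a transformer for EQ - 4
Define a reduction operator:
  $reduce(S)$ =
    if $\{a=b, b=c\} \subseteq S$ and $\{a=c\} \not\subseteq S$ then $reduce(S \cup \{a=c\})$
    if $\{a=b\} \subseteq S$ and $\{b=a\} \not\subseteq S$ then $reduce(S \cup \{b=a\})$
    else $S$
Define $[\![x:=y]\!]^{\#2} = [\![x:=y]\!]^{\#1} \circ reduce$
$[\![x:=y]\!]^{\#2}\; \{x=p, x=q, m=n\} = \{x=y, m=n, p=q\}$
Is this the best transformer?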
Developing a transformer for EQ - 5
$[\![x:=y]\!]^{\#2}\; \{y=z\} = \{x=y, y=z\} \sqsupseteq_A \{x=y, y=z, x=z\}$
Solution: apply the reduction operator again after the vanilla transformer
  $[\![x:=y]\!]^{\#3} = reduce \circ [\![x:=y]\!]^{\#1} \circ reduce$
Observation: after the first time we apply $reduce$, all subsequent values will be in the image of the abstraction, so we only need to apply it once to the initial input
Finally: $[\![x:=y]\!]^\# = reduce \circ [\![x:=y]\!]^{\#1}$
This is the best transformer for reduced elements (elements in the image of the abstraction)
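The reduction operator and the final transformer can be sketched as follows. This is an illustration, not the lecture's code: equalities are pairs, `reduce` is written as an iterative closure under symmetry and transitivity instead of the recursive definition, and the names `assign_vanilla` and `assign` are assumptions.

```python
def reduce(S):
    """Close S under symmetry (a=b gives b=a) and transitivity (a=b, b=c give a=c)."""
    S = set(S)
    changed = True
    while changed:
        new = {(b, a) for (a, b) in S}
        new |= {(a, c) for (a, b) in S for (b2, c) in S if b == b2}
        changed = not new <= S
        S |= new
    return S

def assign_vanilla(x, y, S):
    """Vanilla transformer for x := y: drop equalities mentioning x, add x = y."""
    if x == y:
        return set(S)
    return {(x, y)} | {(a, b) for (a, b) in S if x not in (a, b)}

def assign(x, y, S):
    """Final transformer for x := y, assuming the input S is already reduced."""
    return reduce(assign_vanilla(x, y, S))

# Reducing the input first recovers p=q, as in step 4 of the development.
step4 = assign_vanilla("x", "y", reduce({("x", "p"), ("x", "q"), ("m", "n")}))
# Reducing the output recovers x=z, as in step 5 of the development.
step5 = assign("x", "y", reduce({("y", "z")}))
```

Unlike the abbreviated sets on the slides, the computed sets also contain the symmetric and derived reflexive equalities, since the closure adds them explicitly.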
Properties of abstract transformers
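Negative property of best transformers
Let $f^\# = \alpha \circ f \circ \gamma$
The best transformer does not compose: $\alpha(f(f(\gamma(a)))) \sqsubseteq_A f^\#(f^\#(a))$
- Best transformer of the composed operation: $(f^2)^\# = (f \circ f)^\# = \alpha \circ f \circ f \circ \gamma$
- Composition of best transformers: $(f^\#)^2 = f^\# \circ f^\# = \alpha \circ f \circ \gamma \circ \alpha \circ f \circ \gamma$
The intermediate $\gamma \circ \alpha$ is the source of precision loss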
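$\alpha(f(f(\gamma(a)))) \sqsubseteq_A f^\#(f^\#(a))$
[Figure: starting from $a$ in $A$, one path concretizes once, applies $f$ twice in $C$, and abstracts; the other path applies $f^\#$ twice, concretizing and abstracting around each application of $f$. The second path ends at or above the first.]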
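A concrete instance where the inequality is strict may help. The sketch below assumes a sign lattice, a finite sample `U` of the integers, and the hypothetical statement `x := 1 - x`; none of these come from the slides. Starting from `zero`, applying the statement twice returns to 0, but the intermediate abstraction to `pos` forgets that the value was exactly 1.

```python
U = range(-3, 4)  # assumption: finite sample standing in for the integers

def leq(a, b):
    """Partial order of the flat sign lattice: bot <= {neg, zero, pos} <= top."""
    return a == b or a == "bot" or b == "top"

def sign(x):
    return "neg" if x < 0 else "zero" if x == 0 else "pos"

def alpha(c):
    """Abstraction: the sign shared by all elements of c, or top if mixed."""
    signs = {sign(x) for x in c}
    if not signs:
        return "bot"
    return signs.pop() if len(signs) == 1 else "top"

def gamma(a):
    """Concretization restricted to the sample U."""
    return {x for x in U if leq(sign(x), a)}

def f(c):
    """Concrete transformer of the hypothetical statement x := 1 - x, lifted to sets."""
    return {1 - x for x in c}

def best(a):
    """Best transformer alpha . f . gamma, computable here only because U is finite."""
    return alpha(f(gamma(a)))

best_of_composition = alpha(f(f(gamma("zero"))))  # abstract only at the end
composition_of_best = best(best("zero"))          # abstract after every step
```

With these definitions `best_of_composition` is `zero` while `composition_of_best` is `top`.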
Global (fixed point) Soundness theorems
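Soundness theorem 1
Given two complete lattices
  $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
- a monotone concrete transformer $f : D_C \to D_C$
- a monotone abstract transformer $f^\# : D_A \to D_A$
- local soundness: $\forall a \in D_A : f(\gamma(a)) \sqsubseteq_C \gamma(f^\#(a))$
Then global soundness follows:
  $lfp(f) \sqsubseteq_C \gamma(lfp(f^\#))$
  $\alpha(lfp(f)) \sqsubseteq_A lfp(f^\#)$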
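Soundness theorem 1
$\forall a \in D_A : f(\gamma(a)) \sqsubseteq_C \gamma(f^\#(a))$
$\Rightarrow \forall a \in D_A : f^n(\gamma(a)) \sqsubseteq_C \gamma((f^\#)^n(a))$
$\Rightarrow \forall a \in D_A : lfp(f^n)(\gamma(a)) \sqsubseteq_C \gamma(lfp((f^\#)^n)(a))$
$\Rightarrow lfp(f) \sqsubseteq_C \gamma(lfp(f^\#))$
[Figure: the iterates $f, f^2, f^3, \ldots, f^n$ leading up to $lfp(f)$ in $C$, alongside the iterates $f^\#, (f^\#)^2, (f^\#)^3, \ldots, (f^\#)^n$ leading up to $lfp(f^\#)$ in $A$]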
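The theorem can be observed on a toy program by computing both least fixed-points with Kleene iteration. The sketch below is built on assumptions: the program `x := 1; while x < 3: x := x + 1`, a sign lattice, and a hand-written abstract transformer that ignores the loop guard. All names are illustrative.

```python
def leq(a, b):
    """Partial order of the flat sign lattice: bot <= {neg, zero, pos} <= top."""
    return a == b or a == "bot" or b == "top"

def join(a, b):
    """Least upper bound in the flat sign lattice."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return "top"

def sign(x):
    return "neg" if x < 0 else "zero" if x == 0 else "pos"

def alpha(c):
    """Abstraction: join of the signs of all elements of c."""
    result = "bot"
    for x in c:
        result = join(result, sign(x))
    return result

# Hypothetical abstract transformer for x := x + 1 on signs.
INC = {"bot": "bot", "neg": "top", "zero": "pos", "pos": "pos", "top": "top"}

def f(X):
    """Collecting semantics of: x := 1; while x < 3: x := x + 1 (assumed program)."""
    return frozenset({1}) | frozenset(x + 1 for x in X if x < 3)

def f_sharp(a):
    """Abstract counterpart; the guard x < 3 is soundly ignored."""
    return join("pos", INC[a])

def lfp(step, bottom):
    """Kleene iteration from bottom; it terminates here because both chains are finite."""
    current = bottom
    while step(current) != current:
        current = step(current)
    return current

concrete = lfp(f, frozenset())  # reachable values of x
abstract = lfp(f_sharp, "bot")  # their sign approximation
```

The concrete fixed-point is `{1, 2, 3}` and the abstract one is `pos`, so every reachable value lies in the concretization of the abstract result.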
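Soundness theorem 2
Given two complete lattices
  $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
  $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
- a monotone concrete transformer $f : D_C \to D_C$
- a monotone abstract transformer $f^\# : D_A \to D_A$
- local soundness: $\forall c \in D_C : \alpha(f(c)) \sqsubseteq_A f^\#(\alpha(c))$
Then global soundness follows:
  $\alpha(lfp(f)) \sqsubseteq_A lfp(f^\#)$
  $lfp(f) \sqsubseteq_C \gamma(lfp(f^\#))$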
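Soundness theorem 2
$\forall c \in D_C : \alpha(f(c)) \sqsubseteq_A f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(f^n(c)) \sqsubseteq_A (f^\#)^n(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(lfp(f)(c)) \sqsubseteq_A lfp(f^\#)(\alpha(c))$
$\Rightarrow \alpha(lfp(f)) \sqsubseteq_A lfp(f^\#)$
[Figure: the iterates $f, f^2, f^3, \ldots, f^n$ leading up to $lfp(f)$ in $C$, alongside the iterates $f^\#, (f^\#)^2, (f^\#)^3, \ldots, (f^\#)^n$ leading up to $lfp(f^\#)$ in $A$]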
A recipe for a sound static analysis
1) Define an "appropriate" operational semantics
2) Define a "collecting" structural operational semantics
3) Establish a Galois connection between collecting states and abstract states
4) Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
5) Global correctness: conclude that the analysis is sound
Completeness
Completeness
Local property:
- forward complete: $\forall c : f^\#(\alpha(c)) = \alpha(f(c))$
- backward complete: $\forall a : f(\gamma(a)) = \gamma(f^\#(a))$
A property of the domain (assuming the best transformer)
Global property:
- $\alpha(lfp(f)) = lfp(f^\#)$
- $lfp(f) = \gamma(lfp(f^\#))$
Ideal, but usually not possible unless we change the program model:
- Apply a very coarse abstraction and/or
- Aim for very simple properties
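Forward complete transformer
$\forall c : f^\#(\alpha(c)) = \alpha(f(c))$
[Figure: $c$ in $C$ is mapped by $f$ and then $\alpha$, and by $\alpha$ and then $f^\#$; both paths reach the same element of $A$]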
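Whether forward completeness holds depends on the domain, which a brute-force comparison makes visible. The sketch below is an illustration under assumptions: the statement `x := x + 1`, a parity lattice and a sign lattice, hand-written transformer tables for each, and a small sample universe. The helper `make_alpha` and both tables are hypothetical names.

```python
from itertools import combinations

def subsets(xs):
    xs = list(xs)
    return [frozenset(s) for r in range(len(xs) + 1) for s in combinations(xs, r)]

def make_alpha(classify):
    """Build the abstraction of a flat lattice from a per-value classifier."""
    def alpha(c):
        kinds = {classify(x) for x in c}
        if not kinds:
            return "bot"
        return kinds.pop() if len(kinds) == 1 else "top"
    return alpha

parity_alpha = make_alpha(lambda x: "even" if x % 2 == 0 else "odd")
sign_alpha = make_alpha(lambda x: "neg" if x < 0 else "zero" if x == 0 else "pos")

# Hypothetical transformer tables for x := x + 1 in each domain.
PARITY_INC = {"bot": "bot", "even": "odd", "odd": "even", "top": "top"}
SIGN_INC = {"bot": "bot", "neg": "top", "zero": "pos", "pos": "pos", "top": "top"}

def forward_complete(alpha, f_sharp, universe):
    """Check alpha(f(c)) == f_sharp(alpha(c)) for every subset c of the universe."""
    return all(alpha({x + 1 for x in c}) == f_sharp[alpha(c)]
               for c in subsets(universe))
```

On `range(-2, 3)` the parity domain passes the check and the sign domain fails it: for `c = {-1}` the concrete result `{0}` abstracts to `zero`, while the table maps `neg` to `top`.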
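Backward complete transformer
$\forall a : f(\gamma(a)) = \gamma(f^\#(a))$
[Figure: $a$ in $A$ is mapped by $\gamma$ and then $f$, and by $f^\#$ and then $\gamma$; both paths reach the same element of $C$]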
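Global (backward) completeness
$\forall a : f(\gamma(a)) = \gamma(f^\#(a))$
$\Rightarrow \forall a : f^n(\gamma(a)) = \gamma((f^\#)^n(a))$
$\Rightarrow \forall a \in D_A : lfp(f^n)(\gamma(a)) = \gamma(lfp((f^\#)^n)(a))$
$\Rightarrow lfp(f) = \gamma(lfp(f^\#))$
[Figure: the iterates $f, f^2, f^3, \ldots, f^n$ leading up to $lfp(f)$ in $C$, alongside the iterates $f^\#, (f^\#)^2, (f^\#)^3, \ldots, (f^\#)^n$ leading up to $lfp(f^\#)$ in $A$]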
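Global (forward) completeness
$\forall c \in D_C : \alpha(f(c)) = f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(f^n(c)) = (f^\#)^n(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(lfp(f)(c)) = lfp(f^\#)(\alpha(c))$
$\Rightarrow \alpha(lfp(f)) = lfp(f^\#)$
[Figure: the iterates $f, f^2, f^3, \ldots, f^n$ leading up to $lfp(f)$ in $C$, alongside the iterates $f^\#, (f^\#)^2, (f^\#)^3, \ldots, (f^\#)^n$ leading up to $lfp(f^\#)$ in $A$]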
see you next time