Program Analysis and Verification, Spring 2014. Lecture 11: Abstract Interpretation III. Roman Manevich, Ben-Gurion University.



Syllabus
– Semantics: natural semantics, structural semantics, axiomatic verification
– Static analysis: automating Hoare logic, control flow graphs, equation systems, collecting semantics
– Abstract interpretation fundamentals: lattices, fixed-points, chaotic iteration, Galois connections, domain constructors, widening/narrowing
– Analysis techniques: numerical domains, CEGAR, alias analysis, interprocedural analysis, shape analysis
– Crafting your own: Soot, from proofs to abstractions, systematically developing transformers

Previously
– Solving monotone systems
– Fixed-points
– Vanilla static analysis algorithm
– Chaotic iteration

Static analysis
Given a system of equations for the collecting semantics, a static analysis solves a corresponding system of equations over an abstract domain.
Questions:
– How do you solve the second system? Chaotic iteration
– What is the relation between the solutions? This lecture
Collecting semantics:
$R[0] = \{ x \in \mathbb{Z} \}$ // established input
$R[1] = R[0] \cup R[4]$
$R[2] = [\![ \text{assume } x > 0 ]\!]\, R[1]$
$R[3] = [\![ \text{assume } x \le 0 ]\!]\, R[1]$
$R[4] = [\![ x := x - 1 ]\!]\, R[2]$
Abstract semantics:
$R[0]^\# = \{ x \in \mathbb{Z} \}^\#$
$R[1]^\# = R[0]^\# \sqcup R[4]^\#$
$R[2]^\# = [\![ \text{assume } x > 0 ]\!]^\#\, R[1]^\#$
$R[3]^\# = [\![ \text{assume } x \le 0 ]\!]^\#\, R[1]^\#$
$R[4]^\# = [\![ x := x - 1 ]\!]^\#\, R[2]^\#$

Monotone function
[Figure: a function $f$ from lattice $L_1$ to lattice $L_2$; $x \sqsubseteq y$ in $L_1$ is mapped to $f(x) \sqsubseteq f(y)$ in $L_2$]

Important cases of monotonicity
– Join: $f(X, Y) = X \sqcup Y$ is monotone in each operand. Prove it!
– Set lifting function: for a set $X$ and any function $g$, $F(X) = \{ g(x) \mid x \in X \}$ is monotone w.r.t. $\subseteq$. Prove it!
– Notice that the collecting semantics function is defined in terms of join (set union) and the semantic function for atomic statements lifted to sets of states
– Conclusion: the collecting semantics function is monotone

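Fixed points
$L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$, $f : D \to D$ monotone
– $\mathrm{Fix}(f) = \{ d \mid f(d) = d \}$
– $\mathrm{Red}(f) = \{ d \mid f(d) \sqsubseteq d \}$
– $\mathrm{Ext}(f) = \{ d \mid d \sqsubseteq f(d) \}$
Theorem [Tarski 1955]
– $\mathrm{lfp}(f) = \sqcap\, \mathrm{Fix}(f) = \sqcap\, \mathrm{Red}(f) \in \mathrm{Fix}(f)$
– $\mathrm{gfp}(f) = \bigsqcup \mathrm{Fix}(f) = \bigsqcup \mathrm{Ext}(f) \in \mathrm{Fix}(f)$
[Figure: the lattice between $\bot$ and $\top$ with the regions Red(f), Ext(f) and Fix(f); lfp and gfp are marked, and the iterates $f^n(\bot)$ climb toward lfp]
1. Does a solution always exist? Yes
2. If so, is it unique? No, but it has least/greatest solutions
3. If so, is it computable? Under some conditions…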

Continuous functions
Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
A function $f$ is continuous if for every increasing chain $Y \subseteq D^*$, $f(\bigsqcup Y) = \bigsqcup \{ f(y) \mid y \in Y \}$
Lemma: if $f$ is continuous then $f$ is monotone
Proof: assume $x \sqsubseteq y$. Then $x, y$ form a chain with $x \sqcup y = y$, so $f(y) = f(x \sqcup y) = f(x) \sqcup f(y)$, which means $f(x) \sqsubseteq f(y)$


Kleene’s fixed point theorem
Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order and $f : D \to D$ a continuous function. Then $\mathrm{lfp}(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$
That is, take the ascending chain $\bot \sqsubseteq f(\bot) \sqsubseteq f(f(\bot)) \sqsubseteq \ldots \sqsubseteq f^n(\bot) \sqsubseteq \ldots$ and return the supremum
– Why is this an ascending chain?
But how do you know if a function $f$ is continuous?

Continuity and ACC condition
Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
$L$ satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: $d_0 \sqsubseteq d_1 \sqsubseteq \ldots \sqsubseteq d_n = d_{n+1} = d_{n+2} = \ldots$
Lemma: monotone functions on posets satisfying ACC are continuous

Resulting algorithm
Kleene’s fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone
Mathematical definition: $\mathrm{lfp}(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$
Algorithm:
  $d := \bot$
  while $f(d) \neq d$ do $d := f(d)$
  return $d$
[Figure: the chain $\bot, f(\bot), f^2(\bot), \ldots, f^n(\bot)$ climbing from $\bot$ to lfp]
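The loop above can be sketched directly in Python. This is a minimal illustration, not the course's code: the name `kleene_lfp`, the graph `EDGES`, and the function `step` are all assumptions made up for the example. The lattice is the powerset of graph nodes ordered by inclusion, which has finite height, and `step` is monotone.

```python
def kleene_lfp(f, bottom):
    """Iterate f from bottom until the value stops changing.

    Terminates when f is monotone and the lattice satisfies ACC.
    """
    d = bottom
    while True:
        nxt = f(d)
        if nxt == d:
            return d
        d = nxt


# Hypothetical example: which nodes are reachable from node 0?
EDGES = {0: {1}, 1: {2}, 2: {1}, 3: {0}}


def step(reached):
    """Monotone function on sets of nodes: the start node plus all successors."""
    successors = {m for n in reached for m in EDGES.get(n, set())}
    return frozenset({0} | successors)


reachable = kleene_lfp(step, frozenset())
print(sorted(reachable))
```

Node 3 has an edge into node 0 but is not reachable from it, so it stays out of the least fixed point.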

Vanilla algorithm
Problem definition:
1. Lattice of properties $L$ of finite height (ACC)
2. For each statement define a monotone transformer
Preparation:
1. Parse program into AST
2. Convert AST into CFG
3. Generate system of equations from CFG
Analysis:
1. Initialize each analysis variable with $\bot$
2. Update all analysis variables of each equation until reaching a fixed point
Drawback: non-incremental. Most variables don’t change.

Chaotic iteration
Input:
– A cpo $L = (D, \sqsubseteq, \sqcup, \bot)$ satisfying ACC
– $L^n = L \times L \times \ldots \times L$
– A monotone function $f : D^n \to D^n$
– A system of equations $\{ X[i] = F[i](X) \mid 1 \le i \le n \}$, where $F[i]$ is the $i$-th component of $f$
Output: lfp(f)
A worklist-based algorithm:
  for i := 1 to n do X[i] := $\bot$
  WL := {1,…,n}
  while WL $\neq \emptyset$ do
    i := pop WL // choose index non-deterministically
    N := F[i](X)
    if N $\neq$ X[i] then
      X[i] := N
      add all the indexes that directly depend on i to WL
      (X[j] depends on X[i] if F[j] contains X[i])
  return X
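The worklist algorithm can be sketched in Python as follows. Everything concrete here is an assumption for illustration: the function name `chaotic_iteration`, the explicit `DEPENDENTS` table, and the abstract domain (sets of signs `-`, `0`, `+`) chosen to instantiate the five equations R[0]..R[4] of the loop example from the beginning of the lecture.

```python
def chaotic_iteration(equations, dependents, bottom):
    """Solve X[i] = equations[i](X) by worklist iteration starting from bottom."""
    values = [bottom] * len(equations)
    worklist = set(range(len(equations)))
    while worklist:
        i = worklist.pop()                    # arbitrary choice of index
        new_value = equations[i](values)
        if new_value != values[i]:
            values[i] = new_value
            worklist.update(dependents[i])    # re-evaluate equations that read X[i]
    return values


# Hypothetical abstract domain: sets of signs, ordered by inclusion.
NEG, ZERO, POS = "-", "0", "+"
ALL = frozenset({NEG, ZERO, POS})


def decrement(signs):
    """Abstract x := x - 1 on sets of signs."""
    out = set()
    for s in signs:
        out |= {ZERO, POS} if s == POS else {NEG}
    return frozenset(out)


EQUATIONS = [
    lambda R: ALL,                  # R[0]: any integer on entry
    lambda R: R[0] | R[4],          # R[1]: join at the loop head
    lambda R: R[1] & {POS},         # R[2]: assume x > 0
    lambda R: R[1] & {NEG, ZERO},   # R[3]: assume x <= 0
    lambda R: decrement(R[2]),      # R[4]: x := x - 1
]
# DEPENDENTS[i] lists the equations whose right-hand side mentions R[i].
DEPENDENTS = {0: [1], 1: [2, 3], 2: [4], 3: [], 4: [1]}

solution = chaotic_iteration(EQUATIONS, DEPENDENTS, frozenset())
for index, value in enumerate(solution):
    print(index, sorted(value))
```

The order in which indexes are popped does not affect the result, since every fair order reaches the same least fixed point for monotone equations.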

Required knowledge
– Collecting semantics
– Abstract semantics (over lattices)
– Algorithm to compute abstract semantics (chaotic iteration)
– Connection between collecting semantics and abstract semantics
– Abstract transformers

Today
– Galois connections
– Abstract transformers
– Global soundness


Recap
– We defined a reference semantics: the collecting semantics
– We defined an abstract semantics for a given lattice and abstract transformers
– We defined an algorithm to compute the abstract least fixed-point when transformers are monotone and the lattice obeys ACC
Questions:
1. Does the algorithm terminate?
2. What is the connection between the two least fixed-points?
3. Transformer monotonicity is required for termination. What should we require for correctness?

Handling non-monotone transformers
Kleene’s fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone
Monotonicity ensures that $\bot \sqsubseteq f(\bot) \sqsubseteq \ldots \sqsubseteq f^n(\bot) \sqsubseteq \ldots$ is an ascending chain
What if f is not necessarily monotone? How can we ensure termination?
Mathematical definition: $\mathrm{lfp}(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$
Algorithm:
  $d := \bot$
  while $f(d) \neq d$ do $d := f(d)$
  return $d$

Handling non-monotone transformers
Define $f'(d) = d \sqcup f(d)$
Now $f'$ is extensive: $d \sqsubseteq d \sqcup f(d) = f'(d)$, and so $\bot \sqsubseteq f'(\bot) \sqsubseteq \ldots \sqsubseteq (f')^n(\bot) \sqsubseteq \ldots$ is an ascending chain
The result is not necessarily the least fixed point, but we get a (post-)fixed point in finite time (ACC)
Mathematical definition: $\mathrm{lfp}(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot) \sqsubseteq \bigsqcup_{n \in \mathbb{N}} (f')^n(\bot)$
Revised algorithm:
  $d := \bot$
  while $f'(d) \neq d$ do $d := f'(d)$
  return $d$
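A minimal Python sketch of the revised algorithm follows. The names `extensive_iteration` and `f` are hypothetical, and the function `f` is a deliberately non-monotone toy on subsets of {0, 1}: iterating it directly from the empty set would cycle forever, while iterating $d \sqcup f(d)$ stops at a post-fixed point.

```python
def extensive_iteration(f, join, bottom):
    """Iterate d := d join f(d) from bottom until it stabilizes."""
    d = bottom
    while True:
        nxt = join(d, f(d))
        if nxt == d:
            return d
        d = nxt


def f(s):
    """Non-monotone toy transformer on sets (assumed for this example only)."""
    if s == frozenset():
        return frozenset({0})
    if s == frozenset({0}):
        return frozenset({1})
    return frozenset()


result = extensive_iteration(f, lambda a, b: a | b, frozenset())
print(sorted(result))
```

The returned value `d` satisfies `f(d) <= d`, which is the post-fixed point property mentioned above; it need not be a fixed point of `f` itself.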

Relating the concrete domain and the abstract domain

Galois Connection
Given two complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$ – concrete domain
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$ – abstract domain
A Galois Connection (GC) is a quadruple $(C, \alpha, \gamma, A)$ that relates C and A via the monotone functions
– The abstraction function $\alpha : D_C \to D_A$
– The concretization function $\gamma : D_A \to D_C$
such that for every concrete element $c \in D_C$ and abstract element $a \in D_A$: $\alpha(\gamma(a)) \sqsubseteq_A a$ and $c \sqsubseteq_C \gamma(\alpha(c))$
Alternatively: $\alpha(c) \sqsubseteq_A a$ iff $c \sqsubseteq_C \gamma(a)$

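Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$
[Figure: $c$ in C is mapped by $\alpha$ to $\alpha(c)$, the most precise (least) element in A representing $c$; $\gamma$ maps it back to $\gamma(\alpha(c))$, which lies above $c$ in C]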

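Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$
[Figure: $a$ in A is mapped by $\gamma$ to $\gamma(a)$, what $a$ represents in C (its meaning); $\alpha$ maps it back to $\alpha(\gamma(a))$, which lies below $a$ in A]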

Example: lattice of equalities
Concrete lattice: $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
Abstract lattice: $EQ = \{ x = y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
– Treat elements of A as both formulas and sets of constraints
Useful for copy propagation, a compiler optimization
$\alpha(X) = ?$
$\gamma(Y) = ?$

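Example: lattice of equalities
Concrete lattice: $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
Abstract lattice: $EQ = \{ x = y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
– Treat elements of A as both formulas and sets of constraints
Useful for copy propagation, a compiler optimization
$\beta(\sigma) = \alpha(\{\sigma\}) = \{ x = y \mid \sigma\, x = \sigma\, y \}$, that is, $\sigma \models x = y$
$\alpha(X) = \bigcap \{ \beta(\sigma) \mid \sigma \in X \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
$\gamma(Y) = \{ \sigma \mid \sigma \models \bigwedge Y \} = \mathrm{models}(\bigwedge Y)$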
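The abstraction and concretization of the equalities domain can be tried out on a tiny state space. The sketch below is an assumption-laden illustration: it fixes three variables and two values so that both lattices are finite, represents a state as a tuple of values, and exhaustively checks the adjunction $\alpha(c) \sqsubseteq_A a \iff c \sqsubseteq_C \gamma(a)$. All function names are hypothetical.

```python
from itertools import combinations, product

VARS = ("x", "y", "z")
VALUES = (0, 1)                                   # tiny value set, assumed for the sketch
STATES = list(product(VALUES, repeat=len(VARS)))  # a state is a tuple of values
EQ = frozenset(product(VARS, repeat=2))           # (a, b) stands for the equality a=b


def holds(state, eq):
    """Does the state satisfy the equality eq?"""
    a, b = eq
    return state[VARS.index(a)] == state[VARS.index(b)]


def beta(state):
    """Abstraction of a single state: all equalities it satisfies."""
    return frozenset(eq for eq in EQ if holds(state, eq))


def alpha(states):
    """Abstraction of a set of states: equalities that hold in every state."""
    result = EQ  # bottom of A, the abstraction of the empty set
    for s in states:
        result = result & beta(s)
    return result


def gamma(eqs):
    """Concretization: all states that satisfy every equality in eqs."""
    return frozenset(s for s in STATES if all(holds(s, eq) for eq in eqs))


def powerset(items):
    items = list(items)
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))


def check_galois_connection():
    """Exhaustively check alpha(c) ⊑_A a  iff  c ⊆ gamma(a), with ⊑_A being ⊇."""
    abstract = [(a, gamma(a)) for a in powerset(EQ)]
    for c in powerset(STATES):
        abstraction = alpha(c)
        for a, meaning in abstract:
            if (abstraction >= a) != (c <= meaning):
                return False
    return True


print(check_galois_connection())
print(sorted(gamma(alpha([(1, 1, 1)]))))
```

With only two values, the state where all variables are equal abstracts to the full set EQ, and concretizing that set gives back both all-equal states, which illustrates $c \sqsubseteq_C \gamma(\alpha(c))$.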

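Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$
[Figure: example in the equalities domain. $c = \{[x \mapsto 5, y \mapsto 5, z \mapsto 5]\}$; $\alpha(c) = \{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$ is the most precise (least) element in A representing $c$; $\gamma(\alpha(c)) = \{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$; the element $\{x=x, y=y, z=z\}$ is also shown in A]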

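Most precise abstract representation
$\alpha(c) = \sqcap \{ c' \mid c \sqsubseteq \gamma(c') \}$
[Figure: $c$ in C and the abstract elements $c'$ in A whose concretization lies above $c$; $\alpha(c)$ is their meet]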

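Most precise abstract representation
$\alpha(c) = \sqcap \{ c' \mid c \sqsubseteq \gamma(c') \}$
[Figure: example with $c = \{[x \mapsto 5, y \mapsto 5, z \mapsto 5]\}$; abstract elements representing $c$ include $\{x=y\}$, $\{x=y, z=y\}$ and $\{x=y, y=z\}$; their meet is $\alpha(c) = \{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$]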

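Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$
[Figure: example with $a = \{x=y, y=z\}$; $\gamma(a) = \{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$ is what $a$ represents in C (its meaning); $\alpha(\gamma(a)) = \{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$]
$\alpha \circ \gamma$ is called a semantic reduction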

Partial reduction
The operator $\alpha \circ \gamma$ is called a semantic reduction, since $\alpha(\gamma(a))$ means the same as $a$ but is a reduced (more precise) version of $a$
An operator $reduce : D_A \to D_A$ is a partial reduction if
1. $reduce(a) \sqsubseteq_A a$ and
2. $\gamma(a) = \gamma(reduce(a))$

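Galois Insertion: $\forall a: \alpha(\gamma(a)) = a$
[Figure: example with $a = \{x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y\}$ and $\gamma(a) = \{\ldots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \ldots\}$]
All elements are reduced
How can we obtain a Galois Insertion from a Galois Connection?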

Special cases

Properties of a Galois Connection
The abstraction and concretization functions uniquely determine each other:
$\gamma(a) = \bigsqcup \{ c \mid \alpha(c) \sqsubseteq a \}$
$\alpha(c) = \sqcap \{ a \mid c \sqsubseteq \gamma(a) \}$

Abstracting (disjunctive) sets
It is usually convenient to first define the abstraction of single elements: $\beta(s) = \alpha(\{s\})$
Then lift the abstraction to sets of elements: $\alpha(X) = \bigsqcup_A \{ \beta(s) \mid s \in X \}$

The case of symbolic domains
An important class of abstract domains are symbolic domains, that is, domains of formulas
$C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
If $D_A$ is a set of formulas then the abstraction of a state is defined as $\beta(\sigma) = \alpha(\{\sigma\}) = \sqcap_A \{ \varphi \mid \sigma \models \varphi \}$, the least formula from $D_A$ that $\sigma$ satisfies
The abstraction of a set of states is $\alpha(X) = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
The concretization is $\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = \mathrm{models}(\varphi)$

Composing Galois connections

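Inducing along the connections
Assume the complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
$M = (D_M, \sqsubseteq_M, \sqcup_M, \sqcap_M, \bot_M, \top_M)$
and Galois connections $GC_{C,A} = (C, \alpha_{C,A}, \gamma_{A,C}, A)$ and $GC_{A,M} = (A, \alpha_{A,M}, \gamma_{M,A}, M)$
Lemma: both connections induce $GC_{C,M} = (C, \alpha_{C,M}, \gamma_{M,C}, M)$, defined by applying the abstractions (respectively concretizations) one after the other: $\alpha_{C,M}(c) = \alpha_{A,M}(\alpha_{C,A}(c))$ and $\gamma_{M,C}(m) = \gamma_{A,C}(\gamma_{M,A}(m))$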

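Inducing along the connections
[Figure: the three domains C, A and M connected by $\alpha_{C,A}$, $\gamma_{A,C}$, $\alpha_{A,M}$ and $\gamma_{M,A}$; an element $c$ in C is mapped to $\alpha_{C,A}(c)$ in A and then to $a' = \alpha_{A,M}(\alpha_{C,A}(c))$ in M, which concretizes back to $c'$ in C]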

Relating abstract transformers to concrete transformers

Sound abstract transformer
Given two lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
– a concrete transformer $f : D_C \to D_C$
– an abstract transformer $f^\# : D_A \to D_A$
We say that $f^\#$ is a sound transformer (w.r.t. $f$) if
– for every $c$: $f(c) = c' \Rightarrow \alpha(c') \sqsubseteq_A f^\#(\alpha(c))$
– equivalently, for every $a$: $\alpha(f(\gamma(a))) \sqsubseteq_A f^\#(a)$

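Transformer soundness condition 1
$\forall c: f(c) = c' \Rightarrow \alpha(c') \sqsubseteq_A f^\#(\alpha(c))$
[Figure: $f$ applied in C and $f^\#$ applied in A; abstracting the concrete result gives an element below the abstract result]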

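Transformer soundness condition 2
$\forall a: f^\#(a) = a' \Rightarrow f(\gamma(a)) \sqsubseteq_C \gamma(a')$
[Figure: $f^\#$ applied in A and $f$ applied in C; the concrete result lies below the concretization of the abstract result]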

Best (induced) transformer
$f^\#(a) = \alpha(f(\gamma(a)))$
[Figure: concretize $a$, apply $f$ in C, then abstract the result back into A]
Problem: $\gamma$ is not directly computable

Best abstract transformer [CC’77]
Best in terms of precision
– Most precise abstract transformer
– May be too expensive to compute
Constructively defined as $f^\# = \alpha \circ f \circ \gamma$
– Induced by the GC
Not directly computable because the first step is concretization
We often compromise for a “good enough” transformer
– Useful tool: partial concretization

Developing a sound abstract transformer by example

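Transformer example
$C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
$EQ = \{ x = y \mid x, y \in Var \}$, $A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
$\beta(\sigma) = \alpha(\{\sigma\}) = \{ x = y \mid \sigma\, x = \sigma\, y \}$, that is, $\sigma \models x = y$
$\alpha(S) = \bigcap \{ \beta(\sigma) \mid \sigma \in S \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in S \}$
$\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = \mathrm{models}(\varphi)$
Concrete: $[\![ x := y ]\!]\, S = \{ \sigma[x \mapsto \sigma\, y] \mid \sigma \in S \}$
Abstract: $[\![ x := y ]\!]^\#\, S = ?$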

Developing a transformer for EQ - 1
Input has the form $S = \bigwedge \{a = b\}$
$sp(x := expr, \varphi) = \exists v.\; x = expr[v/x] \land \varphi[v/x]$
$sp(x := y, \bigwedge S) = \exists v.\; x = y[v/x] \land \bigwedge S[v/x] = \ldots$
Let’s define helper notations:
– $Mod(x := y, S) = \{ x = a,\; b = x \in S \}$: subset of equalities containing x (will be modified)
– $Frame(x := y, S) = S \setminus Mod(x := y, S)$: subset of equalities not containing x (i.e., the frame)

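Developing a transformer for EQ - 2
$sp(x := y, \bigwedge S) = \exists v.\; x = y[v/x] \land \bigwedge \{a = b\}[v/x] = \ldots$
Two cases:
– x is y: $sp(x := x, \bigwedge S) = \bigwedge S$
– x is different from y: $sp(x := y, \bigwedge S) = \exists v.\; x = y \land \bigwedge Mod(x := y, S)[v/x] \land \bigwedge Frame(x := y, S)[v/x] = x = y \land \bigwedge Frame(x := y, S) \land \exists v.\; \bigwedge Mod(x := y, S)[v/x] \Rightarrow x = y \land \bigwedge Frame(x := y, S)$
Vanilla transformer: $[\![ x := y ]\!]^{\#1}\, S = x = y \land \bigwedge Frame(x := y, S)$
Example: $[\![ x := y ]\!]^{\#1} \bigwedge \{x = p, q = x, m = n\} = \bigwedge \{x = y, m = n\}$
Is this the most precise result?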

Developing a transformer for EQ - 3
$[\![ x := y ]\!]^{\#1} \bigwedge \{x = p, x = q, m = n\} = \bigwedge \{x = y, m = n\} \sqsupseteq \bigwedge \{x = y, m = n, p = q\}$
– Where does the information p=q come from?
$sp(x := y, \bigwedge S) = x = y \land \bigwedge Frame(x := y, S) \land \exists v.\; \bigwedge Mod(x := y, S)[v/x]$
$\exists v.\; \bigwedge Mod(x := y, S)[v/x]$ holds possible equalities between different a’s and b’s. How can we account for that?

Developing a transformer for EQ - 4
Define a reduction operator:
reduce(S) =
  if $\{a = b, b = c\} \subseteq S$ and $\{a = c\} \not\subseteq S$ then reduce($S \cup \{a = c\}$)
  if $\{a = b\} \subseteq S$ and $\{b = a\} \not\subseteq S$ then reduce($S \cup \{b = a\}$)
  else S
Define $[\![ x := y ]\!]^{\#2} = [\![ x := y ]\!]^{\#1} \circ reduce$
$[\![ x := y ]\!]^{\#2} (\bigwedge \{x = p, x = q, m = n\}) = \bigwedge \{x = y, m = n, p = q\}$
Is this the best transformer?

Developing a transformer for EQ - 5
$[\![ x := y ]\!]^{\#2} (\bigwedge \{y = z\}) = \{x = y, y = z\} \sqsupseteq \{x = y, y = z, x = z\}$
Idea: apply the reduction operator again after the vanilla transformer: $[\![ x := y ]\!]^{\#3} = reduce \circ [\![ x := y ]\!]^{\#1} \circ reduce$
Observation: after the first time we apply reduce, all subsequent values will be in the image of the abstraction, so really we only need to apply it once to the input
Finally: $[\![ x := y ]\!]^{\#} = reduce \circ [\![ x := y ]\!]^{\#1}$
– Best transformer for reduced elements (elements in the image of the abstraction)
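The three stages of this development can be replayed in a few lines of Python. This is a sketch under assumptions, not the lecture's implementation: an equality a=b is encoded as the pair `(a, b)`, and the names `reduce_eqs`, `assign_vanilla` and `assign` are hypothetical. `reduce_eqs` computes the closure under the two rules of the reduction operator (symmetry and transitivity) iteratively instead of recursively.

```python
def reduce_eqs(eqs):
    """Close a set of equalities under symmetry and transitivity."""
    closed = set(eqs)
    while True:
        new = {(b, a) for (a, b) in closed}
        new |= {(a, d) for (a, b) in closed for (c, d) in closed if b == c}
        if new <= closed:
            return frozenset(closed)
        closed |= new


def assign_vanilla(x, y, eqs):
    """Vanilla transformer for x := y: keep the frame and add x=y."""
    if x == y:
        return frozenset(eqs)
    frame = {eq for eq in eqs if x not in eq}
    return frozenset(frame | {(x, y)})


def assign(x, y, eqs):
    """Final transformer: vanilla transformer followed by reduce.

    Intended for inputs that are already reduced.
    """
    return reduce_eqs(assign_vanilla(x, y, eqs))


S = frozenset({("x", "p"), ("x", "q"), ("m", "n")})
step1 = assign_vanilla("x", "y", S)              # loses p=q
step2 = assign_vanilla("x", "y", reduce_eqs(S))  # keeps p=q
step3 = assign("x", "y", frozenset({("y", "z")}))  # derives x=z
print(sorted(step1))
print(("p", "q") in step2, ("x", "z") in step3)
```

Because the closure also adds symmetric and reflexive facts such as q=p and p=p, the printed sets are larger than the abbreviated ones on the slides, which list only one representative per equality.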

Properties of abstract transformers

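Negative property of best transformers
Let $f^\# = \alpha \circ f \circ \gamma$
Best transformer does not compose: $\alpha(f(f(\gamma(a)))) \sqsubseteq f^\#(f^\#(a))$
Best transformer of composed operation: $(f^2)^\# = (f \circ f)^\# = \alpha \circ f \circ f \circ \gamma$
Composition of best transformers: $(f^\#)^2 = f^\# \circ f^\# = \alpha \circ f \circ \gamma \circ \alpha \circ f \circ \gamma$
– The inner $\gamma \circ \alpha$ is the source of precision loss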
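A small finite example makes the precision loss concrete. Everything in this sketch is assumed for illustration: the concrete domain is the powerset of {0, 1, 2}, the abstract domain has the four elements `bot`, `zero`, `pos`, `top` (each identified with its concretization), and `f` is a saturating decrement.

```python
# Abstract elements and their concretizations (a hypothetical four-element domain).
ABSTRACT = {
    "bot": frozenset(),
    "zero": frozenset({0}),
    "pos": frozenset({1, 2}),
    "top": frozenset({0, 1, 2}),
}


def gamma(a):
    return ABSTRACT[a]


def alpha(c):
    """Least abstract element whose concretization contains c."""
    candidates = [name for name, conc in ABSTRACT.items() if c <= conc]
    return min(candidates, key=lambda name: len(ABSTRACT[name]))


def f(c):
    """Concrete transformer: decrement, saturating at 0."""
    return frozenset(max(v - 1, 0) for v in c)


def best(g):
    """Best abstract transformer induced by the concrete transformer g."""
    return lambda a: alpha(g(gamma(a)))


f_sharp = best(f)
composed_then_abstracted = best(lambda c: f(f(c)))("pos")  # (f o f)# applied to pos
abstracted_then_composed = f_sharp(f_sharp("pos"))         # f# applied twice to pos
print(composed_then_abstracted, abstracted_then_composed)
```

Starting from `pos`, two concrete decrements always reach 0, so the best transformer of the composition returns `zero`. Applying the best transformer twice passes through `top` after the first step and can no longer recover that fact.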

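$\alpha(f(f(\gamma(a)))) \sqsubseteq f^\#(f^\#(a))$
[Figure: applying $f$ twice in C and then abstracting, compared with applying $f^\#$ twice in A; the intermediate $\gamma \circ \alpha$ step enlarges the concrete set before the second application of $f$]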

Global (fixed point) soundness theorems

Soundness theorem 1
1. Given two complete lattices $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$ and $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$, and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
2. Monotone concrete transformer $f : D_C \to D_C$
3. Monotone abstract transformer $f^\# : D_A \to D_A$
4. Local soundness: $\forall a \in D_A: f(\gamma(a)) \sqsubseteq \gamma(f^\#(a))$
Then, global soundness:
$\mathrm{lfp}(f) \sqsubseteq \gamma(\mathrm{lfp}(f^\#))$
$\alpha(\mathrm{lfp}(f)) \sqsubseteq \mathrm{lfp}(f^\#)$
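One possible short argument for this theorem goes through Tarski's characterization $\mathrm{lfp}(f) = \sqcap\, \mathrm{Red}(f)$ from earlier in the lecture. It is offered here as a sketch and may differ from the proof intended in the lecture, which argues via the iterates $f^n$ and $f^{\#n}$. The symbol $a^*$ is introduced only for this derivation.

```latex
\begin{align*}
& \text{Let } a^* = \mathrm{lfp}(f^\#), \text{ so } f^\#(a^*) = a^*. \\
& f(\gamma(a^*)) \sqsubseteq_C \gamma(f^\#(a^*)) = \gamma(a^*)
    && \text{(local soundness)} \\
& \Rightarrow\ \gamma(a^*) \in \mathrm{Red}(f) \\
& \Rightarrow\ \mathrm{lfp}(f) = \sqcap\, \mathrm{Red}(f) \sqsubseteq_C \gamma(a^*)
    && \text{(Tarski)} \\
& \Rightarrow\ \alpha(\mathrm{lfp}(f)) \sqsubseteq_A a^*
    && \text{(Galois connection: } \alpha(c) \sqsubseteq_A a \iff c \sqsubseteq_C \gamma(a)\text{)}
\end{align*}
```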

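Soundness theorem 1
[Figure: the chain of iterates $f, f^2, f^3, \ldots, f^n, \ldots$ up to lfp(f) in C, next to the chain $f^\#, f^{\#2}, f^{\#3}, \ldots, f^{\#n}, \ldots$ up to lfp($f^\#$) in A, related by $\gamma$]
$\forall a \in D_A: f(\gamma(a)) \sqsubseteq \gamma(f^\#(a))$
$\Rightarrow \forall a \in D_A: f^n(\gamma(a)) \sqsubseteq \gamma(f^{\#n}(a))$
$\Rightarrow \forall a \in D_A: \mathrm{lfp}(f^n)(\gamma(a)) \sqsubseteq \gamma(\mathrm{lfp}(f^{\#n})(a))$
$\Rightarrow \mathrm{lfp}(f) \sqsubseteq \gamma(\mathrm{lfp}(f^\#))$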

Soundness theorem 2
1. Given two complete lattices $C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$ and $A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$, and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
2. Monotone concrete transformer $f : D_C \to D_C$
3. Monotone abstract transformer $f^\# : D_A \to D_A$
4. Local soundness: $\forall c \in D_C: \alpha(f(c)) \sqsubseteq f^\#(\alpha(c))$
Then, global soundness:
$\alpha(\mathrm{lfp}(f)) \sqsubseteq \mathrm{lfp}(f^\#)$
$\mathrm{lfp}(f) \sqsubseteq \gamma(\mathrm{lfp}(f^\#))$

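Soundness theorem 2
[Figure: the chain of iterates $f, f^2, f^3, \ldots, f^n, \ldots$ up to lfp(f) in C, next to the chain $f^\#, f^{\#2}, f^{\#3}, \ldots, f^{\#n}, \ldots$ up to lfp($f^\#$) in A, related by $\alpha$]
$\forall c \in D_C: \alpha(f(c)) \sqsubseteq f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C: \alpha(f^n(c)) \sqsubseteq f^{\#n}(\alpha(c))$
$\Rightarrow \forall c \in D_C: \alpha(\mathrm{lfp}(f)(c)) \sqsubseteq \mathrm{lfp}(f^\#)(\alpha(c))$
$\Rightarrow \alpha(\mathrm{lfp}(f)) \sqsubseteq \mathrm{lfp}(f^\#)$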

A recipe for a sound static analysis
– Define an “appropriate” operational semantics
– Define a “collecting” structural operational semantics
– Establish a Galois connection between collecting states and abstract states
– Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
– Global correctness: conclude that the analysis is sound

Completeness

Completeness
Local property:
– forward complete: $\forall c: f^\#(\alpha(c)) = \alpha(f(c))$
– backward complete: $\forall a: f(\gamma(a)) = \gamma(f^\#(a))$
A property of the domain and the (best) transformer
Global property:
– $\alpha(\mathrm{lfp}(f)) = \mathrm{lfp}(f^\#)$
– $\mathrm{lfp}(f) = \gamma(\mathrm{lfp}(f^\#))$
Ideal, but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties

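Forward complete transformer
$\forall c: f^\#(\alpha(c)) = \alpha(f(c))$
[Figure: applying $f$ in C and then abstracting gives the same element as abstracting and then applying $f^\#$ in A]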

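Backward complete transformer
$\forall a: f(\gamma(a)) = \gamma(f^\#(a))$
[Figure: concretizing and then applying $f$ in C gives the same element as applying $f^\#$ in A and then concretizing]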

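Global (backward) completeness
[Figure: the chain of iterates $f, f^2, f^3, \ldots, f^n, \ldots$ up to lfp(f) in C, next to the chain $f^\#, f^{\#2}, f^{\#3}, \ldots, f^{\#n}, \ldots$ up to lfp($f^\#$) in A, related by $\gamma$]
$\forall a: f(\gamma(a)) = \gamma(f^\#(a))$
$\Rightarrow \forall a: f^n(\gamma(a)) = \gamma(f^{\#n}(a))$
$\Rightarrow \forall a \in D_A: \mathrm{lfp}(f^n)(\gamma(a)) = \gamma(\mathrm{lfp}(f^{\#n})(a))$
$\Rightarrow \mathrm{lfp}(f) = \gamma(\mathrm{lfp}(f^\#))$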

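Global (forward) completeness
[Figure: the chain of iterates $f, f^2, f^3, \ldots, f^n, \ldots$ up to lfp(f) in C, next to the chain $f^\#, f^{\#2}, f^{\#3}, \ldots, f^{\#n}, \ldots$ up to lfp($f^\#$) in A, related by $\alpha$]
$\forall c \in D_C: \alpha(f(c)) = f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C: \alpha(f^n(c)) = f^{\#n}(\alpha(c))$
$\Rightarrow \forall c \in D_C: \alpha(\mathrm{lfp}(f)(c)) = \mathrm{lfp}(f^\#)(\alpha(c))$
$\Rightarrow \alpha(\mathrm{lfp}(f)) = \mathrm{lfp}(f^\#)$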

Implementation

Implementing the VE analysis
1. Implement the abstract states in the VE domain
2. Implement the VE abstract domain operations
3. Connect analysis to Soot
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$

VE factoids
– Don’t forget to override toString(), which is used for displaying analysis results

VE abstract state implementation
– Implement bottom as a special object, which must be handled appropriately in all abstract operations

Abstract domain base class

VE domain operations
– used to match the right transformer for a given statement
– Handle bottom as a special corner case

VE abstract transformers
– Find transformer by matching statement
– Improved transformer with reduction operator
– which branch of the condition

Matching transformers to statements
– Extend matcher to easily pattern match statements
– It is usually safe to ignore many statements (skip)
– Handle assume x==y
– Handle assume x!=y
– Handle x=y
– Special case: x=x (what happens if we use the general case here?)

Matching transformers to statements
– Handling x=expr as identity is unsound
– Handle soundly by removing all factoids containing x (that is, lhs)

Example transformer $[\![ x = y ]\!]^\#$
– Check the precondition for applying this transformer
– Handle bottom as a special corner case
– A transformer is a unary operation parameterized by the type of abstract domain elements
– Override the apply method with one argument

Example transformer $[\![ x = expr ]\!]^\#$
– As a convention never mutate states: create new ones and modify them (functional programming style)
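The implementation conventions from these slides (bottom as a special object, a readable string form, never mutating states, and handling x=expr by dropping factoids that mention x) can be illustrated with a short Python sketch. The course implementation is in Java on top of Soot and its code is not reproduced in this transcript, so the class `VEState`, the constant `BOTTOM` and the function `forget` below are hypothetical stand-ins, not the actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VEState:
    """Immutable abstract state: a set of (a, b) pairs meaning a=b."""
    factoids: frozenset = frozenset()
    is_bottom: bool = False

    def __str__(self):
        # Counterpart of overriding toString() for displaying results.
        if self.is_bottom:
            return "bottom"
        return "{" + ", ".join(f"{a}={b}" for a, b in sorted(self.factoids)) + "}"


BOTTOM = VEState(is_bottom=True)  # special object for the unreachable state


def forget(state, lhs):
    """Sound transformer sketch for `lhs = expr`: drop factoids mentioning lhs."""
    if state.is_bottom:           # bottom is handled as a corner case
        return state
    kept = frozenset(f for f in state.factoids if lhs not in f)
    return VEState(kept)          # build a new state instead of mutating


before = VEState(frozenset({("x", "y"), ("a", "b")}))
after = forget(before, "x")
print(before, after, forget(BOTTOM, "x"))
```

Freezing the dataclass makes accidental mutation an error, which matches the functional style recommended on the slide.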

Next lecture: abstract interpretation IV