Program Analysis and Verification 0368-4479

Slides:



Advertisements
Similar presentations
Abstract Interpretation Part II
Advertisements

Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Tutorial on Widening (and Narrowing) Hongseok Yang Seoul National University.
Lecture 03 – Background + intro AI + dataflow and connection to AI Eran Yahav 1.
Foundations of Data-Flow Analysis. Basic Questions Under what circumstances is the iterative algorithm used in the data-flow analysis correct? How precise.
1 Basic abstract interpretation theory. 2 The general idea §a semantics l any definition style, from a denotational definition to a detailed interpreter.
Worklist algorithm Initialize all d i to the empty set Store all nodes onto a worklist while worklist is not empty: –remove node n from worklist –apply.
Programming Language Semantics Denotational Semantics Chapter 5 Based on a lecture by Martin Abadi.
Program analysis Mooly Sagiv html://
1 Iterative Program Analysis Part I Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
Data Flow Analysis Compiler Design Nov. 3, 2005.
Program analysis Mooly Sagiv html://
1 Iterative Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
Administrative stuff Office hours: After class on Tuesday.
Data Flow Analysis Compiler Design Nov. 8, 2005.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
1 Program Analysis Systematic Domain Design Mooly Sagiv Tel Aviv University Textbook: Principles.
Programming Language Semantics Denotational Semantics Chapter 5 Part III Based on a lecture by Martin Abadi.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Claus Brabrand, ITU, Denmark DATA-FLOW ANALYSISMar 25, 2009 Static Analysis: Data-Flow Analysis II Claus Brabrand IT University of Copenhagen (
Claus Brabrand, UFPE, Brazil Aug 09, 2010DATA-FLOW ANALYSIS Claus Brabrand ((( ))) Associate Professor, Ph.D. ((( Programming, Logic, and.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
1 Tentative Schedule u Today: Theory of abstract interpretation u May 5 Procedures u May 15, Orna Grumberg u May 12 Yom Hatzamaut u May.
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 10: Abstract Interpretation II Roman Manevich Ben-Gurion University.
MIT Foundations of Dataflow Analysis Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Solving fixpoint equations
Compiler Construction Lecture 16 Data-Flow Analysis.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 11: Abstract Interpretation III Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 14: Numerical Abstractions Roman Manevich Ben-Gurion University.
Program Analysis and Verification Noam Rinetzky Lecture 6: Abstract Interpretation 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Program Analysis and Verification
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 14: Numerical Abstractions Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 9: Abstract Interpretation I Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 10: Abstract Interpretation II Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 12: Abstract Interpretation IV Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 13: Abstract Interpretation V Roman Manevich Ben-Gurion University.
Compiler Principles Fall Compiler Principles Lecture 11: Loop Optimizations Roman Manevich Ben-Gurion University.
1 Combining Abstract Interpreters Mooly Sagiv Tel Aviv University
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 12: Abstract Interpretation IV Roman Manevich Ben-Gurion University.
Program Analysis and Verification
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
1 Numeric Abstract Domains Mooly Sagiv Tel Aviv University Adapted from Antoine Mine.
Operational Semantics Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.
Program Analysis and Verification Noam Rinetzky Lecture 8: Abstract Interpretation 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
1 Iterative Program Analysis Part II Mathematical Background Mooly Sagiv Tel Aviv University
Chaotic Iterations Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Chaotic Iterations Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Program Analysis and Verification
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.
Spring 2017 Program Analysis and Verification
Program Analysis and Verification
Spring 2016 Program Analysis and Verification
Spring 2016 Program Analysis and Verification
Textbook: Principles of Program Analysis
Spring 2016 Program Analysis and Verification
Combining Abstract Interpreters
Symbolic Implementation of the Best Transformer
Iterative Program Analysis Abstract Interpretation
Program Analysis and Verification
Program Analysis and Verification
Program Analysis and Verification
Program Analysis and Verification
((a)) A a and c C ((c))
Program Analysis and Verification
Presentation transcript:

Program Analysis and Verification Noam Rinetzky Lecture 12: Abstract Interpretation IV Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav, Ganesan Ramalingam

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states Join (  ) – Transformer functions Abstract steps – Chaotic iteration Structured Programs Abstract computation

Abstract (conservative) interpretation set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states Join (  ) – Transformer functions Abstract steps – Chaotic iteration Abstract computation Structured Programs Lattices (D, , , , ,  ) Lattices (D, , , , ,  ) Monotonic functions Fixpoints

Chains d  d’ means d  d’ and d  d’ An ascending chain is a sequence x 1  x 2  …  x k … A descending chain is a sequence x 1  x 2  …  x k … The height of a poset (D,  ) is the length of the maximal ascending chain in D

Complete partial order (CPO) A poset (D,  ) is a complete partial if every ascending chain x 1  x 2  …  x k … has a LUB

Lattices (D, , , , ,  ) is a lattice if – (D,  ) is a partial order – ∀ X  FIN D.  X is defined – A top element  – ∀ X  FIN D.  X is defined – A bottom element  A lattice (D, , , , ,  ) is a complete lattice if –  X and  Y are defined for arbitrary sets

Example I: Sign lattice   x=0 x0x0 x<0x>0 x0x0

Example II: Powerset lattices (2 X, , , , , X) is the powerset lattice of X – A complete lattice

Domain Constructors Cartesian products: 0 < x ∧ 0 < y: Disjunctive completion: x < 0 ∨ 0 < x Relational product: (0 < x ∧ 0 < y) ∨ (0 < x ∧ y < 0) Finite maps: [L1 ↦ x= ⊤, L2 ↦ x=0] – |{L1, L2}| finite

Concrete Semantics

Collecting semantics For a set of program states State, we define the collecting lattice (2 State, , , , , State) The collecting semantics accumulates the (possibly infinite) sets of states generated during the execution – Not computable in general

The collecting lattice Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements

Equational definition of the semantics R[2] = R[entry]   x:=x-1  R[3] R[3] = R[2]  {s | s(x) > 0} R[exit] = R[2]  {s | s(x)  0} A system of recursive equations Solution: concrete meaning of the program if x > 0 x := x - 1 entry exit R[entry] R[2] R[3] R[exit]

An abstract semantics R[2] = R[entry]   x:=x-1  # R[3] R[3] = R[2]  {s | s(x) > 0} # R[exit] = R[2]  {s | s(x)  0} # A system of recursive equations Solution: abstract meaning of the program – Over-approximation of the concrete semantics if x > 0 x := x - 1 entry exit R[entry] R[2] R[3] R[exit] Abstract transformer for x:=x-1 Abstract representation of {s | s(x) < 0}

Abstract interpretation via concretization set of states collecting semantics statement S set of states  abstract representation of sets of states abstract semantics statement S abstract representation of sets of states concretization

Can we always find a solution?

Monotone functions Let L 1 =(D 1,  ) and L 2 =(D 2,  ) be two posets A function f : D 1  D 2 is monotone if for every pair x, y  D 1 x  y implies f(x)  f(y) A special case: L 1 =L 2 =(D,  ) f : D  D

Fixed-points L = (D, , , , ,  ) f : D  D monotone Fix(f) = { d | f(d) = d } Red(f) = { d | f(d)  d } Ext(f) = { d | d  f(d) } Theorem [Tarski 1955] – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f) Red(f) Ext(f) Fix(f)   lfp gfp fn()fn() fn()fn() 1. A solution always exist 2. It unique 3. Not always computable

Continuity and ACC condition Let L = (D, , ,  ) be a complete partial order – Every ascending chain has an upper bound A function f is continuous if for every increasing chain Y  D*, f(  Y) =  { f(y) | y  Y } L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d 0  d 1  …  d n = d n+1 = …

Fixed-point theorem [Kleene] Let L = (D, , ,  ) be a complete partial order and a continuous function f: D  D then lfp(f) =  n  N f n (  ) Lemma: Monotone functions on posets satisfying ACC are continuous

Resulting algorithm Kleene’s fixed point theorem gives a constructive method for computing the lfp   lfp fn()fn() f()f() f2()f2() … d :=  while f(d)  d do d := d  f(d) return d Algorithm lfp(f) =  n  N f n (  ) Mathematical definition

Correctness Transformer monotonicity is required for termination – what should we require for correctness? What is the connection between the two least fixed-points of the concrete and abstract semantics?

Abstraction/Concretization Given two complete lattices C = (D C,  C,  C,  C,  C,  C )– concrete domain A = (D A,  A,  A,  A,  A,  A )– abstract domain A Galois Connection (GC) is quadruple (C, , , A) that relates C and A via the monotone functions – The abstraction function  : D C  D A – The concretization function  : D A  D C for every concrete element c  D C and abstract element a  D A  (  (a))  a and c   (  (c)) Alternatively  (c)  a iff c   (a) – Homework

Galois Connection: c   (  (c)) 1   c 2 (c)(c) 3  (  (c))  The most precise (least) element in A representing c CA

Galois Connection:  (  (a))  a 1   3  (  (a)) 2 (a)(a) a  CA What a represents in C (its meaning)

Properties of a Galois Connection The abstraction and concretization functions uniquely determine each other:  (a) =  {c |  (c)  a}  (c) =  {a | c   (a)}

Abstracting sets It is usually convenient to first define the abstraction of single elements  (s) =  ({s}) Then lift the abstraction to sets of elements  (X) =  A {  (s) | s  X}

Inducing along the connections 1 C,AC,A A,CA,C c 2 C,A(c)C,A(c) 5 CA 3 M A,MA,M 4 M,AM,A c’c’ a’ =  A,M (  C,A (c))

Sound abstract transformer Given two lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with A concrete transformer f : D C  D C an abstract transformer f # : D A  D A We say that f # is a sound transformer (w.r.t. f) if  c:  (f(c))  f # (  (c))  a:  (f(  (a)))  A f # (a)

Transformer soundness condition 1 12 CA f 3 4 f#f# 5   c: f(c)=c’  f # (  (c))   (c’)

Abstract (conservative) interpretation set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

Transformer soundness condition 2 CA 12 f#f# 3 5 f 4   a: f # (a)=a’  f(  (a))   (a’)

Abstract (conservative) interpretation {x ↦ 1, x ↦ 2, …}{x ↦ 0, x ↦ 1, …} operational semantics (concrete semantics) x=x-1 {x ↦ 0, x ↦ 1, …}  0 < x abstract semantics x = x -1 0 ≤ x concretization

Best (induced) transformer CA 2 3 f f # (a)=  (f(  (a))) 1 f#f# 4 Problem:  incomputable directly

Best abstract transformer [CC’77] Best in terms of precision – Most precise abstract transformer – May be too expensive to compute Constructively defined as f # =   f   – Induced by the GC Not directly computable because first step is concretization We often compromise for a “good enough” transformer – Useful tool: partial concretization

Negative property of best transformers Let f # =   f   Best transformer does not compose  (f(f(  (a))))  f # (f # (a))

 (f(f(  (a))))  f # (f # (a)) CA 2 3 f 1 f#f# f 7  f#f# 8 9 f

Soundness theorem 1 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  a  D A : f(  (a))   (f # (a)) Then lfp(f)   (lfp(f # ))  (lfp(f))  lfp(f # )

Soundness theorem 1 CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f# 

Soundness theorem 2 1.Given two complete lattices C = (D C,  C,  C,  C,  C,  C ) A = (D A,  A,  A,  A,  A,  A ) and GC C,A =(C, , , A) with 2.Monotone concrete transformer f : D C  D C 3.Monotone abstract transformer f # : D A  D A 4.  c  D C :  (f(c))  f # (  (c)) Then  (lfp(f))  lfp(f # ) lfp(f)   (lfp(f # ))

Soundness theorem 2 CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f# 

A recipe for a sound static analysis Define an “appropriate” operational semantics Define “collecting” structural operational semantics Establish a Galois connection between collecting states and abstract states Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics Global correctness: conclude that the analysis is sound

Completeness Local property: – forward complete:  c:  (f # (c)) =  (f(c)) – backward complete:  a: f(  (a)) =  (f # (a)) A property of domain and the (best) transformer Global property: –  (lfp(f)) = lfp(f # ) – lfp(f) =  (lfp(f # )) Very ideal but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties

Forward complete transformer 12 CA f 3 4 f#f#  c:  (f # (c)) =  (f(c))

Global (forward) completeness CA   f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f# 

Backward complete transformer CA 12 f#f# 3 5 f  a: f(  (a)) =  (f # (a))

Global (backward) completeness CA  f  fn fn  … lpf(f)  f2 f2  f3f3 f #n  … lpf(f # )  f#2 f#2  f#3f#3 f# f# 

Three example analyses Variable Equalities Constant Propagation Available Expressions

Combining GC and Transformers Cartesian Product – L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) – Cart(L 1, L 2 ) = (D 1  D 2,  cart,  cart,  cart,  cart,  cart ) Disjunctive completion – L = (D, , , , ,  ) – Disj(L) = (2 D,  ,  ,  ,  ,   ) Relational Product – Rel(L 1, L 2 ) = Disj(Cart(L 1, L 2 ))

Cartesian product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X)

Cartesian product transformers GC C,A =(C,  C,A,  A,C, A)F A [st] : A  A GC C,B =(C,  C,B,  B,C, B)F B [st] : B  B Cartesian Product GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) = (  C,A (X),  C,B (X)) –  A  B,C (Y) =  A,C (X)   B,C (X) F A  B [st](a, b) = (F A [st] a, F B [st] b) Is this the best we can do?

Product vs. reduced product CP  VE lattice {a=9}{c=a}{c=9}{c=a} {a=9, c=9}{c=a} {[a  9, c  9]} collecting lattice {}    

Reduced product For two complete lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) Define the reduced poset D 1  D 2 = {(d 1,d 2 )  D 1  D 2 | (d 1,d 2 ) =    (d 1,d 2 ) } L 1  L 2 = (D 1  D 2,  cart,  cart,  cart,  cart,  cart )

Transformers for Cartesian product Do we get the best transformer by applying component-wise transformer followed by reduction? – Unfortunately, no (what’s the intuition?) – Can we do better? – Logical Product [Gulwani and Tiwari, PLDI 2006]

Logical product-- Assume A=(D,…) is an abstract domain that supports two operations: for x  D – inferEqualities(x) = { a=b |  (x)  a=b } returns a set of equalities between variables that are satisfied in all states given by x – refineFromEqualities(x, {a=b}) = y such that  (x)=  (y) y  x

Example

Information loss example if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b=  } can’t prove

Disjunctive completion of a lattice For a complete lattice L = (D, , , , ,  ) Define the powerset lattice L  = (2 D,  ,  ,  ,  ,   )   = ?   = ?   = ?   = ?   = ? Lemma: L  is a complete lattice L  contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates Define the disjunctive completion constructor L  = Disj(L)

Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = ? –  P(A),C (Y) = ?

Disjunctive completion for GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Disjunctive completion GC C,P(A) = (C,  P(A),  P(A), P(A)) –  C,P(A) (X) = {  C,A ({x}) | x  X} –  P(A),C (Y) =  {  P(A) (y) | y  Y} What about transformers?

Information loss example if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 {} {b=5} {b=-5} {b= 5  b=-5 } {b= 0 } proved

The base lattice CP {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false

The disjunctive completion of CP {x=0} true {x=-1}{x=-2}{x=1}{x=2} …… false {x=-2  x=-1}{x=-2  x=0}{x=-2  x=1}{x=1  x=2} ……… {x=0  x=1  x=2}{x=-1  x=1  x=-2} ……… … What is the height of this lattice?

Taming disjunctive completion Disjunctive completion is very precise – Maintains correlations between states of different analyses – Helps handle conditions precisely – But very expensive – number of abstract states grows exponentially – May lead to non-termination Base analysis (usually product) is less precise – Analysis terminates if the analyses of each component terminates How can we combine them to get more precision yet ensure termination and state explosion?

Taming disjunctive completion Use different abstractions for different program locations – At loop heads use coarse abstraction (base) – At other points use disjunctive completion Termination is guaranteed (by base domain) Precision increased inside loop body

With Disj(CP) while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 }

With tamed Disj(CP) while (…) { if (…) b := 5 else b := -5 if (b>0) b := b-5 else b := b+5 assert b==0 } CP Disj(CP)

Reducing disjunctive elements A disjunctive set X may contain within it an ascending chain Y=a  b  c… We only need max(Y) – remove all elements below

Relational product of lattices L 1 = (D 1,  1,  1,  1,  1,  1 ) L 2 = (D 2,  2,  2,  2,  2,  2 ) L rel = (2 D 1  D 2,  rel,  rel,  rel,  rel,  rel ) as follows: – L rel = Disj(Cart(L 1, L 2 )) Lemma: L is a complete lattice What does it buy us? – How is it relative to Cart(Disj(L 1 ), Disj(L 2 ))? What about transformers?

Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = ? –  P(A  B),C (Y) = ?

Relational product of GCs GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Relational Product GC C,P(A  B) = (C,  C,P(A  B),  P(A  B),C, P(A  B)) –  C,P(A  B) (X) = {(  C,A ({x}),  C,B ({x})) | x  X} –  P(A  B),C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y}

Function space GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Denote the set of monotone functions from A to B by A  B Define  for elements of A  B as follows (a 1, b 1 )  (a 2, b 2 ) = if a 1 =a 2 then {(a 1, b 1  B b 1 )} else {(a 1, b 1 ), (a 2, b 2 )} Reduced cardinal power GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) =  {(  C,A ({x}),  C,B ({x})) | x  X} –  A  B,C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} Useful when A is small and B is much larger – E.g., typestate verification

Function space GC C,A =(C,  C,A,  A,C, A) GC C,B =(C,  C,B,  B,C, B) Denote the set of monotone functions from A to B by A  B Define  for elements of A  B as follows (a 1, b 1 )  (a 2, b 2 ) = if a 1 =a 2 then {(a 1, b 1  B b 1 )} else {(a 1, b 1 ), (a 2, b 2 )} Reduced cardinal power GC C,A  B = (C,  C,A  B,  A  B,C, A  B) –  C,A  B (X) =  {(  C,A ({x}),  C,B ({x})) | x  X} –  A  B,C (Y) =  {  A,C (y A )   B,C (y B ) | (y A,y B )  Y} Useful when A is small and B is much larger – E.g., typestate verification 75

Widening/Narrowing

How can we prove this automatically? RelProd(CP, VE)

Intervals domain One of the simplest numerical domains Maintain for each variable x an interval [L,H] – L is either an integer of -  – H is either an integer of +  A (non-relational) numeric domain

Intervals lattice for variable x  [0,0][-1,-1][-2,-2][1,1][2,2]... [- ,+  ] [0,1][1,2][2,3][-1,0][-2,-1] [-10,10] [1, +  ][ - ,0 ]... [2, +  ][0, +  ][ - ,-1 ]... [-20,10]

Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] ? – [1,4]  [1,3] ? – [1,3]  [1,4] ? – [1,3]  [- ,+  ] ? What is the lattice height?

Intervals lattice for variable x D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H}   =[- ,+  ]  = ? – [1,2]  [3,4] no – [1,4]  [1,3] no – [1,3]  [1,4] yes – [1,3]  [- ,+  ]yes What is the lattice height? Infinite

Joining/meeting intervals [a,b]  [c,d] = ? – [1,1]  [2,2] = ? – [1,1]  [2, +  ] = ? [a,b]  [c,d] = ? – [1,2]  [3,4] = ? – [1,4]  [3,4] = ? – [1,1]  [1,+  ] = ? Check that indeed x  y if and only if x  y=y

Joining/meeting intervals [a,b]  [c,d] = [min(a,c), max(b,d)] – [1,1]  [2,2] = [1,2] – [1,1]  [2,+  ] = [1,+  ] [a,b]  [c,d] = [max(a,c), min(b,d)] if a proper interval and otherwise  – [1,2]  [3,4] =  – [1,4]  [3,4] = [3,4] – [1,1]  [1,+  ] = [1,1] Check that indeed x  y if and only if x  y=y

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = ?

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas?

Interval domain for programs D int [x] = { (L,H) | L  - ,Z and H  Z,+  and L  H} For a program with variables Var={x 1,…,x k } D int [Var] = D int [x 1 ]  …  D int [x k ] How can we represent it in terms of formulas? – Two types of factoids x  c and x  c – Example: S =  {x  9, y  5, y  10} – Helper operations c + +  = +  remove(S, x) = S without any x-constraints lb(S, x) = k if k ≤ x ≤ m ub(S,x) = m if k ≤ x ≤ m

Assignment transformers  x := c  # S = ?  x := y  # S = ?  x := y+c  # S = ?  x := y+z  # S = ?  x := y*c  # S = ?  x := y*z  # S = ?

Assignment transformers  x := c  # S = remove(S,x)  {x  c, x  c}  x := y  # S = remove(S,x)  {x  lb(S,y), x  ub(S,y)}  x := y+c  # S = remove(S,x)  {x  lb(S,y)+c, x  ub(S,y)+c}  x := y+z  # S = remove(S,x)  {x  lb(S,y)+lb(S,z), x  ub(S,y)+ub(S,z)}  x := y*c  # S = remove(S,x)  if c>0 {x  lb(S,y)*c, x  ub(S,y)*c} else {x  ub(S,y)*-c, x  lb(S,y)*-c}  x := y*z  # S = remove(S,x)  ?

assume transformers  assume x=c  # S = ?  assume x<c  # S = ?  assume x=y  # S = ?  assume x  c  # S = ?

assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = ?

assume transformers  assume x=c  # S = S  {x  c, x  c}  assume x<c  # S = S  {x  c-1}  assume x=y  # S = S  {x  lb(S,y), x  ub(S,y)}  assume x  c  # S = (S  {x  c-1})  (S  {x  c+1})

Chaotic iteration Input: – A cpo L = (D, , ,  ) satisfying ACC – L n = L  L  …  L – A monotone function f : D n  D n – A system of equations { X[i] | f(X) | 1  i  n } Output: lfp(f) A worklist-based algorithm for i:=1 to n do X[i] :=  WL = {1,…,n} while WL   do j := pop WL // choose index non-deterministically N := F[i](X) if N  X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) return X

Concrete semantics equations R[0] = { x  Z} R[1] =  x:=7  R[2] = R[1]  R[4] R[3] = R[2]  {s | s(x) < 1000} R[4] =  x:=x+1  R[3] R[5] = R[2]  {s | s(x)  1000} R[6] = R[5]  {s | s(x)  1001} R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Abstract semantics equations R[0] =  ({ x  Z}) R[1] =  x:=7  # R[2] = R[1]  R[4] R[3] = R[2]   ({s | s(x) < 1000}) R[4] =  x:=x+1  # R[3] R[5] = R[2]   ({s | s(x)  1000}) R[6] = R[5]   ({s | s(x)  1001})  R[5]   ({s | s(x)  999}) R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Abstract semantics equations R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[3] = R[2]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Too many iterations to converge

How many iterations for this one?

Widening Introduce a new binary operator to ensure termination – A kind of extrapolation Enables static analysis to use infinite height lattices – Dynamically adapts to given program Tricky to design Precision less predictable then with finite- height domains (widening non-monotone)

Formal definition For all elements d 1  d 2  d 1  d 2 For all ascending chains d 0  d 1  d 2  … the following sequence is finite (stabilizes) – y 0 = d 0 – y i+1 = y i  d i+1 For a monotone function f : D  D define – x 0 =  – x i+1 = x i  f(x i ) Theorem: – There exits k such that x k+1 = x k – x k  Red(f) = { d | d  D and f(d)  d }

Analysis with finite-height lattice A  f #n  = lpf(f # )  … f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f)

Analysis with widening A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

Widening for Intervals Analysis   [c, d] = [c, d] [a, b]  [c, d] = [ if a  c then a else - , if b  d then b else 

Semantic equations with widening R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Non monotonicity of widening [0,1]  [0,2] = ? [0,2]  [0,2] = ?

Non monotonicity of widening [0,1]  [0,2] = [0,  ] [0,2]  [0,2] = [0,2]

Analysis with narrowing A  f#2  f#3f#2  f#3 f#2 f#2  f#3f#3 f# f#  Red(f) Fix(f) lpf(f # ) 

Formal definition of narrowing Improves the result of widening y  x  y  (x  y)  x For all decreasing chains x 0  x 1  … the following sequence is finite (stabilizes) – y 0 = x 0 – y i+1 = y i  x i+1 For a monotone function f: D  D and x k  Red(f) = { d | d  D and f(d)  d } define – y 0 = x – y i+1 = y i  f(y i ) Theorem: – There exits k such that y k+1 =y k

Formal definition of narrowing Improves the result of widening y  x  y  (x  y)  x For all decreasing chains x 0  x 1  … the following sequence is finite – y 0 = x 0 – y i+1 = y i  x i+1 For a monotone function f: D  D and x k  Red(f) = { d | d  D and f(d)  d } define – y 0 = x – y i+1 = y i  f(y i ) Theorem: – There exits k such that y k+1 =y k – y k  Red(f) = { d | d  D and f(d)  d }

Narrowing for Interval Analysis [a, b]   = [a, b] [a, b]  [c, d] = [ if a = -  then c else a, if b =  then d else b ]

Semantic equations with narrowing R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] R[0] R[2] R[3] R[4] R[1] R[5] R[6]

Analysis with widening/narrowing Two phases – Phase 1: analyze with widening until converging – Phase 2: use values to analyze with narrowing Phase 2: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3]+[1,1] R[5] = R[2] #  [1000,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ] Phase 1: R[0] =  R[1] = [7,7] R[2] = R[1]  R[4] R[2.1] = R[2.1]  R[2] R[3] = R[2.1]  [- ,999] R[4] = R[3] + [1,1] R[5] = R[2]  [1001,+  ] R[6] = R[5]  [999,+  ]  R[5]  [1001,+  ]

Analysis with widening/narrowing

Analysis results widening/narrowing Precise invariant