Program Analysis and Verification (0368-4479), Noam Rinetzky. Lecture 6: Abstract Interpretation. Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.


Previously: Operational Semantics – large step (natural), small step (SOS); Denotational Semantics – aka mathematical semantics; Axiomatic Semantics – aka Hoare Logic, aka axiomatic (manual) verification.

From verification to analysis: In manual program verification, the verifier provides the assertions (e.g., loop invariants). In automatic program verification (program analysis), a tool automatically synthesizes the assertions (finds loop invariants).

Manual proof for max: nums : array, N : int { N > 0 } x := 0 { N > 0 ∧ x=0 } res := nums[0] { x=0 } Inv = { x ≤ N } while x < N do if nums[x] > res then { x=k ∧ k<N } res := nums[x] { x=k ∧ k<N } { x=k ∧ k<N } x := x + 1 { x=k+1 ∧ k<N } { x ≥ N ∧ x ≤ N } { x=N }

Manual proof for max: nums : array, N : unsigned int { N > 0 } x := 0 { N > 0 ∧ x=0 } res := nums[0] { x=0 } Inv = { x ≤ N } while x < N do if nums[x] > res then { x=k ∧ k<N } res := nums[x] { x=k ∧ k<N } { x=k ∧ k<N } x := x + 1 { x=k+1 ∧ k<N } { x ≥ N ∧ x ≤ N } { x=N } We only prove no buffer (array) overflow.

Can we find this proof automatically? nums : array, N : unsigned int { N > 0 } x := 0 { N > 0 ∧ x=0 } res := nums[0] { x=0 } Inv = { x ≤ N } while x < N do if nums[x] > res then { x=k ∧ k<N } res := nums[x] { x=k ∧ k<N } { x=k ∧ k<N } x := x + 1 { x=k+1 ∧ k<N } { x ≥ N ∧ x ≤ N } { x=N } Observation: predicates in the proof have the general form of a conjunction of constraints, where each constraint has the form X − Y ≤ c or ±X ≤ c.

Zone Abstract Domain (Analysis): Developed by Antoine Miné in his Ph.D. thesis. Uses constraints of the form X − Y ≤ c and ±X ≤ c. Built on top of Difference Bound Matrices (DBMs) and shortest-path algorithms – O(n³) time, O(n²) space.
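The DBM representation and its shortest-path closure can be sketched as follows. This is a minimal illustrative sketch, not Miné's full domain: the matrix encoding (m[i][j] = c for xᵢ − xⱼ ≤ c, with index 0 standing for the constant 0) is the standard one, but all names and the example constraints are invented here.

```python
# Minimal sketch: a zone as a Difference Bound Matrix, where m[i][j] = c
# encodes x_i - x_j <= c and INF means "unbounded". Index 0 plays the role
# of the constant 0, so m[i][0] is an upper bound on x_i and m[0][i]
# encodes -x_i <= c, i.e. a lower bound.
INF = float('inf')

def dbm_close(m):
    """Tighten all bounds via shortest paths (Floyd-Warshall): O(n^3) time."""
    n = len(m)
    m = [row[:] for row in m]  # keep the input matrix intact
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    return m  # a negative m[i][i] would mean the zone is empty (bottom)

# Zone for 0 <= x <= 5 and y - x <= 2, with indices 0 = "zero", 1 = x, 2 = y:
m = [[0,   0,   INF],
     [5,   0,   INF],
     [INF, 2,   0]]
closed = dbm_close(m)  # derives the implied bound y <= 7 (from y-x <= 2, x <= 5)
```

The closure makes every implied constraint explicit, which is what lets the domain compare and join zones precisely.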

Analysis with the Zone abstract domain (the static analysis result, to be compared with the manual proof above): nums : array, N : unsigned int { N > 0 } x := 0 { N > 0 ∧ x=0 } res := nums[0] { N > 0 ∧ x=0 } Inv = { N > 0 ∧ 0 ≤ x ≤ N } while x < N do if nums[x] > res then { N > 0 ∧ 0 ≤ x < N } res := nums[x] { N > 0 ∧ 0 ≤ x < N } { N > 0 ∧ 0 ≤ x < N } x := x + 1 { N > 0 ∧ 0 < x ≤ N } { N > 0 ∧ 0 ≤ x ∧ x=N }

Abstract Interpretation [Cousot’77]: Mathematical foundation of static analysis – abstract (semantic) domains (“abstract states”), transformer functions (“abstract steps”), chaotic iteration (“abstract computation”).

Abstract Interpretation [CC77]: A very general mathematical framework for approximating semantics – generalizes Hoare Logic and the weakest-precondition calculus. It allows designing sound static analysis algorithms – usually computed by iterating to a fixed point – and is not specific to any programming language style. The results of an abstract interpretation are (loop) invariants – they can be interpreted as axiomatic verification assertions and used for verification.

Abstract Interpretation in 5 Slides. Disclaimer: do not worry if you feel that you do not understand the next 5 slides – you are not expected to. This is just to give you a view of the land.

Collecting semantics: For a set of program states State, we define the collecting lattice (2^State, ⊆, ∪, ∩, ∅, State). The collecting semantics accumulates the (possibly infinite) sets of states generated during the execution – not computable in general.

Abstract (conservative) interpretation: a statement S maps a set of states to a set of states under the operational (concrete) semantics, and maps an abstract representation to an abstract representation under the abstract semantics; soundness requires the concrete result to be included in the concretization of the abstract result.

Abstract (conservative) interpretation, example: concretely, x=x-1 maps {x ↦ 1, x ↦ 2, …} to {x ↦ 0, x ↦ 1, …}; abstractly, it maps 0 < x to 0 ≤ x. This is sound since {x ↦ 0, x ↦ 1, …} ⊆ γ(0 ≤ x).

Abstract (conservative) interpretation, a less precise example: the abstract result of x=x-1 may be coarser; it remains conservative as long as the concrete result {x ↦ 0, x ↦ 1, …} is still included in the concretization of the abstract result.

Abstract (non-conservative) interpretation: if the abstract semantics maps 0 < x to 0 < x for x=x-1, the analysis is unsound: the concrete result {x ↦ 0, x ↦ 1, …} ⊈ γ(0 < x).

Abstract Interpretation by Example

Motivating Application: Optimization. A compiler optimization is defined by a program transformation T : Prog → Prog. The transformation is semantics-preserving: ∀s ∈ State. S_sos⟦C⟧ s = S_sos⟦T(C)⟧ s. The transformation is applied to the program only if an enabling condition is met. We use static analysis for inferring enabling conditions.

Common Subexpression Elimination: If we have two variable assignments x := a op b … y := a op b, and the values of x, a, and b have not changed between the assignments, rewrite the code as x := a op b … y := x. This eliminates a useless recalculation and paves the way for more optimizations, e.g., dead code elimination. Here op ∈ {+, -, *, ==, <=}.
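The rewrite above can be sketched as a toy pass over straight-line three-address instructions. This is an illustrative sketch only – the tuple encoding, the `cse` name, and the example program are invented here; the enabling condition ("the values have not changed") is tracked by the kill step, which a real compiler would derive from the available-expressions analysis discussed below.

```python
# Toy sketch of CSE over a straight-line sequence of three-address
# instructions (dst, op, arg1, arg2); encoding invented for illustration.
def cse(code):
    avail = {}  # (op, a, b) -> variable currently known to hold that value
    out = []
    for dst, op, a, b in code:
        key = (op, a, b)
        if key in avail and avail[key] != dst:
            out.append((dst, 'copy', avail[key], None))  # y := x
        else:
            out.append((dst, op, a, b))
        # kill: dst is overwritten, so drop every fact mentioning it
        avail = {k: v for k, v in avail.items()
                 if v != dst and dst not in k[1:]}
        if dst not in (a, b):  # gen: record the newly available expression
            avail[key] = dst
    return out

code = [('x', '+', 'a', 'b'),
        ('z', '+', 'a', 'c'),
        ('y', '+', 'a', 'b')]
optimized = cse(code)  # the third instruction becomes a copy y := x
```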

What do we need to prove? 21 { true } C 1 x := a op b C 2 { x = a op b } y := a op b C 3 { true } C 1 x := a op b C 2 { x = a op b } y := x C 3 CSE

Available Expressions Analysis: A static analysis that infers for every program point a set of facts of the form AV = { x = y | x, y ∈ Var } ∪ { x = -y | x, y ∈ Var } ∪ { x = y op z | y, z ∈ Var, op ∈ {+, -, *, <=} }. For every program with n=|Var| variables the number of possible facts is finite: |AV| = O(n³). This yields a trivial algorithm… but is it efficient?

Which proof is more desirable? { true } x := a + b { x=a+b } z := a + c { x=a+b } y := a + b … versus { true } x := a + b { x=a+b } z := a + c { z=a+c } y := a + b … versus { true } x := a + b { x=a+b } z := a + c { x=a+b ∧ z=a+c } y := a + b … A more detailed predicate means more optimization opportunities: x=a+b ∧ z=a+c ⇒ x=a+b and x=a+b ∧ z=a+c ⇒ z=a+c. Implication formalizes the “more detailed” relation between predicates.

Developing a theory of approximation Formulae are suitable for many analysis-based proofs but we may want to represent predicates in other ways: – Sets of “facts” – Automata – Linear (in)equalities – … ad-hoc representation Wanted: a uniform theory to represent semantic values and approximations 25

Preorder: We say that a binary order relation ⊑ over a set D is a preorder if the following conditions hold for every d, d’, d’’ ∈ D – Reflexive: d ⊑ d – Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’. There may exist d, d’ such that d ⊑ d’ and d’ ⊑ d yet d ≠ d’.

Preorder example, Simple Available Expressions: Define SAV = { x = y | x, y ∈ Var } ∪ { x = y + z | y, z ∈ Var }. For D = 2^SAV (sets of available expressions) define, for two elements A₁, A₂ ∈ D: A₁ ⊑imp A₂ if and only if ⋀A₁ ⇒ ⋀A₂. A₁ is “more detailed” if it implies all facts of A₂. Compare {x=y ∧ x=a+b} with {x=y ∧ y=a+b} – which one should we choose? Can we decide A₁ ⊑imp A₂?

The meaning of implication: A predicate P represents the set of states models(P) = { s | s ⊨ P }. P ⇒ Q means models(P) ⊆ models(Q).

Partially ordered sets: A partially ordered set (poset) is a pair (D, ⊑) – D is a set of elements, a (semantic) domain – ⊑ is a partial order between pairs of elements of D, that is ⊑ ⊆ D × D, with the following properties for all d, d’, d’’ in D: Reflexive: d ⊑ d; Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’; Anti-symmetric: d ⊑ d’ and d’ ⊑ d implies d = d’ (a unique “most detailed” element). Notation: if d ⊑ d’ and d ≠ d’ we write d ⊏ d’.

From preorders to partial orders: We can transform a preorder into a poset by 1. coarsening the ordering, or 2. switching to a canonical form by choosing a representative d* for each set of equivalent elements { d’ | d ⊑ d’ and d’ ⊑ d }.

Coarsening for SAV: For D = 2^SAV (sets of available expressions) define (for two elements A₁, A₂ ∈ D): A₁ ⊑coarse A₂ if and only if A₁ ⊇ A₂. Notice that if A₁ ⊇ A₂ then ⋀A₁ ⇒ ⋀A₂. Compare {x=y ∧ x=a+b} with {x=y ∧ y=a+b}. How about {x=y ∧ x=a+b ∧ y=a+b}?

Canonical form for SAV: For an available-expressions element A define Explicate(A) = the minimal set B such that: 1. A ⊆ B; 2. x=y ∈ B implies y=x ∈ B; 3. x=y ∈ B and y=z ∈ B implies x=z ∈ B; 4. x=y+z ∈ B implies x=z+y ∈ B; 5. x=y ∈ B and x=z+w ∈ B implies y=z+w ∈ B; 6. x=y ∈ B and z=x+w ∈ B implies z=y+w ∈ B; 7. x=z+w ∈ B and y=z+w ∈ B implies x=y ∈ B. This makes all implicit facts explicit. Define A* = Explicate(A), and (for two elements A₁, A₂ ∈ D) A₁ ⊑exp A₂ if and only if A₁* ⊇ A₂*. Lemma: A₁ ⊑exp A₂ if and only if A₁ ⊑imp A₂. Therefore A₁ ⊑imp A₂ is decidable.
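The saturation defined by rules 1–7 can be sketched as a fixed-point loop over a fact set. A hedged sketch: facts are encoded as `('eq', x, y)` for x=y and `('add', x, y, z)` for x=y+z, an encoding invented here for illustration.

```python
# Sketch of the Explicate reduction: saturate a set of SAV factoids under
# rules 1-7 until no new fact can be derived.
def explicate(facts):
    facts = set(facts)
    while True:
        new = set()
        eqs = [f for f in facts if f[0] == 'eq']
        adds = [f for f in facts if f[0] == 'add']
        for _, x, y in eqs:
            new.add(('eq', y, x))                      # rule 2: symmetry
            for _, a, b in eqs:
                if a == y:
                    new.add(('eq', x, b))              # rule 3: transitivity
            for _, a, b, c in adds:
                if a == x: new.add(('add', y, b, c))   # rule 5: substitute lhs
                if b == x: new.add(('add', a, y, c))   # rule 6: substitute operand
                if c == x: new.add(('add', a, b, y))   # rule 6: other operand
        for _, a, b, c in adds:
            new.add(('add', a, c, b))                  # rule 4: commutativity
            for _, d, e, f in adds:
                if (e, f) == (b, c) and d != a:
                    new.add(('eq', a, d))              # rule 7: same rhs
        if new <= facts:                               # fixed point reached
            return facts
        facts |= new

closure = explicate({('eq', 'x', 'y'), ('add', 'x', 'a', 'b')})
```

Termination is guaranteed because every derived fact mentions only variables already present, so the universe of possible facts is finite.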

Some poset-related terminology: If x ⊑ y we can say – x is lower than y, x is more precise than y, x is more concrete than y, x under-approximates y – y is greater than x, y is less precise than x, y is more abstract than x, y over-approximates x.

Visualizing the ordering for SAV, from lower (more precise) to greater (more abstract): {false} ⊑ {x=y ∧ y=z+a}*, {x=y ∧ p=q}*, {y=z+a ∧ p=q}* ⊑ {x=y}*, {y=z+a}*, {p=q}* ⊑ {true}, over D={x=y, y=x, p=q, q=p, y=z+a, y=a+z, z=y+z, x=z+a}.

Pointed poset: A poset (D, ⊑) with a least element ⊥ is called a pointed poset – for all d ∈ D we have ⊥ ⊑ d. The pointed poset is denoted by (D, ⊑, ⊥). We can always transform a poset (D, ⊑) into a pointed poset by adding a special bottom element: (D ∪ {⊥}, ⊑ ∪ { ⊥ ⊑ d | d ∈ D }, ⊥). Greatest element for SAV = {true = ?}; least element for SAV = {false = ?}

Annotating conditions: { P } if b then { b ∧ P } S1 { Q1 } else { ¬b ∧ P } S2 { Q2 } { Q }. Q approximates Q1 and Q2. We need a general way to approximate a set of semantic elements by a single semantic element.

Join operator: Assume a poset (D, ⊑) and let X ⊆ D be a (finite or infinite) subset of D. The join of X is defined as ⊔X = the least upper bound (LUB) of all elements in X, if it exists: ⊔X = min⊑ { b | for all x ∈ X we have x ⊑ b } – the supremum of the elements in X, a kind of abstract union (disjunction) operator. Properties of a join operator – Commutative: x ⊔ y = y ⊔ x; Associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z); Idempotent: x ⊔ x = x.

Meet operator: Assume a poset (D, ⊑) and let X ⊆ D be a (finite or infinite) subset of D. The meet of X is defined as ⊓X = the greatest lower bound (GLB) of all elements in X, if it exists: ⊓X = max⊑ { b | for all x ∈ X we have b ⊑ x } – the infimum of the elements in X, a kind of abstract intersection (conjunction) operator. Properties of a meet operator – Commutative: x ⊓ y = y ⊓ x; Associative: (x ⊓ y) ⊓ z = x ⊓ (y ⊓ z); Idempotent: x ⊓ x = x.

Complete lattices: A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is: a set of elements D; a partial order x ⊑ y; a join operator ⊔; a meet operator ⊓; a bottom element ⊥ = ⊔∅; a top element ⊤ = ⊔D.
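The simplest complete lattice – and the shape of the collecting lattice 2^State defined earlier – is the powerset of a finite set ordered by inclusion. A minimal sketch (the universe {'p','q','r'} is an arbitrary example):

```python
# Sketch: the powerset of a finite set is a complete lattice under inclusion,
# with union as join, intersection as meet, {} as bottom, the full set as top.
universe = frozenset({'p', 'q', 'r'})

bot = frozenset()   # join of the empty collection of elements
top = universe      # meet of the empty collection of elements

def lub(xs):
    """Least upper bound of a collection of subsets: their union."""
    return frozenset().union(*xs) if xs else bot

def glb(xs):
    """Greatest lower bound of a collection of subsets: their intersection."""
    return universe.intersection(*xs) if xs else top
```

Note that every subset of elements has a join and a meet, including the empty one – that is exactly what makes the lattice complete.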

Transfer Functions Mathematical foundations

Towards an automatic proof. Goal: automatically compute an annotated program proving as many facts of the form x = y + z as possible. Decision 1: develop a forward-going proof. Decision 2: draw predicates from a finite set D – “looking under the light of the lamp” – a compromise that simplifies the problem by focusing attention, possibly missing some facts that hold. Challenge 1: handle straight-line code. Challenge 2: handle conditions. Challenge 3: handle loops.

Domain for SAV: Define atomic facts (for SAV) as Σ = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var } – for n=|Var| the number of atomic facts is O(n³). Define SAV-predicates as Π = 2^Σ. For D ∈ Π, Conj(D) = ⋀D – e.g., Conj({a=b, c=b+d, b=c}) = (a=b) ∧ (c=b+d) ∧ (b=c). Note: Conj(D1 ∪ D2) = Conj(D1) ∧ Conj(D2), and Conj({}) = true.

Challenge 1: handling straight-line code

Handling straight-line code: Goal. Given a program of the form x1 := a1; … xn := an, find predicates P0, …, Pn such that 1. {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof, i.e., sp(xi := ai, Pi-1) ⇒ Pi, and 2. Pi = Conj(Di) where Di is a set of simple (SAV) facts.

Example Find a proof that satisfies both conditions 46 { } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c }

Example: Can we make this into an algorithm? { } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c } (each step is justified by sp, framing, and the consequence rule)

Algorithm for straight-line code. Goal: find predicates P0, …, Pn such that 1. {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof, that is, sp(xi := ai, Pi-1) ⇒ Pi, and 2. each Pi has the form Conj(Di) where Di is a set of simple (SAV) facts. Idea: define a function FSAV[x:=a] : Π → Π s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’) – we call F the abstract transformer for x:=a. Initialize D0 = {}; for each i compute Di+1 = FSAV[xi+1 := ai+1](Di); finally Pi = Conj(Di).

Defining an SAV abstract transformer. Goal: define a function FSAV[x:=a] : Π → Π s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’). Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule. Below, α is either a variable v or an addition expression v+w: { x=α } x:=a { } [kill-lhs]; { y=x+w } x:=a { } [kill-rhs-1]; { y=w+x } x:=a { } [kill-rhs-2]; { } x:=α { x=α } [gen] (when x does not occur in α); { y=z+w } x:=a { y=z+w } [preserve] (when x is none of y, z, w).
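The kill/gen rules above can be sketched directly as a set transformer. A hedged sketch: facts are `('eq', a, b)` for a=b and `('add', a, b, c)` for a=b+c, and the right-hand side of an assignment is a tuple `('v',)` for a variable or `('v', 'w')` for v+w – all encodings invented here.

```python
# Sketch of the abstract transformer F_SAV[x := a] from the kill/gen rules:
# drop every fact mentioning the assigned variable, then add x = a when the
# right-hand side is simple and does not mention x.
def f_sav(x, aexpr, facts):
    # kill-lhs / kill-rhs-1 / kill-rhs-2: x is overwritten
    out = {f for f in facts if x not in f[1:]}
    # gen: only sound when aexpr does not mention x (old value is lost)
    if x not in aexpr:
        if len(aexpr) == 1:
            out.add(('eq', x, aexpr[0]))             # x := v
        else:
            out.add(('add', x, aexpr[0], aexpr[1]))  # x := v + w
    return out

pre = {('eq', 'y', 'z'), ('add', 'w', 'x', 'c')}
post = f_sav('x', ('a', 'b'), pre)  # kills w=x+c, keeps y=z, gens x=a+b
```

Facts not mentioning x fall under [preserve] and pass through unchanged.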

SAV abstract transformer example: { } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c } – the assignment to b kills x=a+b (b occurs in its right-hand side, [kill-rhs]) and generates nothing, since a * c is not an SAV expression.

Problem 1: large expressions Large expressions on the right hand sides of assignments are problematic – Can miss optimization opportunities – Require complex transformers Solution: transform code to normal form where right-hand sides have bounded size 53 Missed CSE opportunity { } x := a + b + c { } y := a + b + c { }

Solution: Simplify the programming language. Main idea: simplify expressions by storing intermediate results in new temporary variables (three-address code); the number of variables in simplified statements is ≤ 3. { } x := a + b + c { } y := a + b + c { } becomes { } i1 := a + b { i1=a+b } x := i1 + c { i1=a+b, x=i1+c } i2 := a + b { i1=a+b, x=i1+c, i2=a+b } y := i2 + c { i1=a+b, x=i1+c, i2=a+b, y=i2+c }. We still need to infer i1=i2.

Problem 2: Transformer Precision Our transformer only infers syntactically available expressions – ones that appear in the code explicitly We want a transformer that looks deeper into the semantics of the predicates – Takes equalities into account 56 { } i1 := a + b { i1=a+b } x := i1 + c { i1=a+b, x=i1+c } i2 := a + b { i1=a+b, x=i1+c, i2=a+b } y := i2 + c { i1=a+b, x=i1+c, i2=a+b, y=i2+c } Need to infer i1=i2

Solution: Use a canonical form. Idea: make as many implicit facts as possible explicit by using symmetry and transitivity of equality, commutativity of addition, and the meaning of equality – equal variables can be substituted for one another. For P=Conj(D) let Explicate(D) = the minimal set D* such that: 1. D ⊆ D*; 2. x=y ∈ D* implies y=x ∈ D*; 3. x=y ∈ D* and y=z ∈ D* implies x=z ∈ D*; 4. x=y+z ∈ D* implies x=z+y ∈ D*; 5. x=y ∈ D* and x=z+w ∈ D* implies y=z+w ∈ D*; 6. x=y ∈ D* and z=x+w ∈ D* implies z=y+w ∈ D*; 7. x=z+w ∈ D* and y=z+w ∈ D* implies x=y ∈ D*. Notice that Explicate(D) is logically equivalent to D – Explicate is a special case of a reduction operator.

Sharpening the transformer. Define: F*[x:=a] = Explicate ∘ FSAV[x:=a]. { } i1 := a + b { i1=a+b, i1=b+a } x := i1 + c { i1=a+b, i1=b+a, x=i1+c, x=c+i1 } i2 := a + b { i1=a+b, i1=b+a, x=i1+c, x=c+i1, i2=a+b, i2=b+a, i1=i2, i2=i1, x=i2+c, x=c+i2 } y := i2 + c { … }. Since sets of facts and their conjunctions are isomorphic, we will use them interchangeably.

An algorithm for annotating straight-line programs: Annotate(P, x:=a) = {P} x:=a {F*[x:=a](P)}. Annotate(P, S1; S2) = {P} S1 {Q1} S2 {Q2}, where Annotate(P, S1) = {P} S1 {Q1} and Annotate(Q1, S2) = {Q1} S2 {Q2}.

Challenge 2: handling conditions

Handling conditions: Goal. Annotate a program if b then S1 else S2 with predicates from Π: { P } if b then { b ∧ P } S1 { Q1 } else { ¬b ∧ P } S2 { Q2 } { Q }. Assumption 1: P is given (otherwise use true). Assumption 2: b is a simple binary expression, e.g., x=y, x≠y, x<y (why?).

Joining predicates: 1. Start with P or {b ∧ P} (b is possibly an SAV-fact) and annotate S1, yielding Q1. 2. Start with P or {¬b ∧ P} and annotate S2, yielding Q2. 3. How do we infer a Q such that Q1 ⇒ Q and Q2 ⇒ Q? With Q1=Conj(D1) and Q2=Conj(D2), define Q = Q1 ⊔ Q2 = Conj(D1 ∩ D2) – the join operator for SAV.

Joining predicates: Q1=Conj(D1), Q2=Conj(D2). We want to soundly approximate Q1 ∨ Q2 in Π. Define: Q = Q1 ⊔ Q2 = Conj(D1 ∩ D2). Notice that Q1 ⇒ Q and Q2 ⇒ Q, meaning Q1 ∨ Q2 ⇒ Q.

Handling conditional expressions: Let D be a set of facts and b be an expression. Goal: find elements of Π that soundly approximate D ∧ bexpr and D ∧ ¬bexpr. Technique: add the statement assume bexpr, with ⟨assume bexpr, s⟩ →sos s if B⟦bexpr⟧ s = tt, and find a function F[assume bexpr] : Π → Π such that Conj(D) ∧ bexpr ⇒ Conj(F[assume bexpr](D)).

Handling conditional expressions: F[assume bexpr] : Π → Π such that Conj(D) ∧ bexpr ⇒ Conj(F[assume bexpr](D)). Let β(bexpr) = {bexpr} if bexpr is an SAV-fact, else {} – notice bexpr ⇒ Conj(β(bexpr)). Examples: β(y=z) = {y=z}, β(y<z) = {}. Define F[assume bexpr](D) = D ∪ β(bexpr).
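The condition handling and the SAV join can be sketched together in a few lines. A hedged sketch using the same invented fact encoding as before: only facts tagged 'eq' or 'add' count as SAV-facts, so conditions like x<y contribute nothing.

```python
# Sketch of condition handling: beta keeps bexpr only when it is itself an
# SAV fact, assume adds it to the incoming facts, and the SAV join of two
# branch postconditions keeps exactly the facts they agree on.
def beta(bexpr):
    return {bexpr} if bexpr[0] in ('eq', 'add') else set()

def assume(bexpr, facts):
    return set(facts) | beta(bexpr)

def join(d1, d2):
    # Conj(D1) implies Conj(D1 & D2), and so does Conj(D2): sound join
    return d1 & d2

then_in = assume(('eq', 'x', 'y'), set())   # the guard x=y becomes a fact
else_in = assume(('lt', 'x', 'y'), set())   # x<y is not an SAV fact: nothing
merged = join({('eq', 'a', 'b'), ('eq', 'x', 'y')}, {('eq', 'a', 'b')})
```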

Example 68 { } if (x = y) { x=y, y=x } a := b + c { x=y, y=x, a=b+c, a=c+b } d := b – c { x=y, y=x, a=b+c, a=c+b } else { } a := b + c { a=b+c, a=c+b } d := b + c { a=b+c, a=c+b, d=b+c, d=c+b, a=d, d=a } { a=b+c, a=c+b }

Recap We now have an algorithm for soundly annotating loop-free code Generates forward-going proofs Algorithm operates on abstract syntax tree of code – Handles straight-line code by applying F * – Handles conditions by recursively annotating true and false branches and then intersecting their postconditions 70

An algorithm for conditions: Annotate(P, if bexpr then S1 else S2) = {P} if bexpr then S1 else S2 {Q1 ⊔ Q2}, where Annotate(F[assume bexpr](P), S1) = {F[assume bexpr](P)} S1 {Q1} and Annotate(F[assume ¬bexpr](P), S2) = {F[assume ¬bexpr](P)} S2 {Q2}.

Challenge 3: handling loops

Handling loops: Goal. Annotate a program while bexpr do S with predicates from Π: { P } Inv = { N } while bexpr do { bexpr ∧ N } S { Q } { ¬bexpr ∧ N }, such that P ⇒ N. The main challenge is finding N. Assumption 1: P is given (otherwise use true). Assumption 2: bexpr is a simple binary expression.

Example: annotate this program. { y=x+a, y=a+x, w=d, d=w } Inv = { y=x+a, y=a+x } while (x ≠ z) do { z=x+a, z=a+x, w=d, d=w } x := x + 1 { w=d, d=w } y := x + a { y=x+a, y=a+x, w=d, d=w } d := x + a { y=x+a, y=a+x, d=x+a, d=a+x, y=d, d=y } { y=x+a, y=a+x, x=z, z=x }

Example: annotate this program. { y=x+a, y=a+x, w=d, d=w } Inv = { y=x+a, y=a+x } while (x ≠ z) do { y=x+a, y=a+x } x := x + 1 { } y := x + a { y=x+a, y=a+x } d := x + a { y=x+a, y=a+x, d=x+a, d=a+x, y=d, d=y } { y=x+a, y=a+x, x=z, z=x }

Handling loops: Idea. Try to guess a loop invariant from a small number of loop unrollings – we know how to annotate S (by induction). { P } Inv = { N } while bexpr do { bexpr ∧ N } S { Q } { ¬bexpr ∧ N }

k-loop unrolling: { P } if (x ≠ z) { x := x + 1; y := x + a; d := x + a } Q1 = { y=x+a, y=a+x }; if (x ≠ z) { x := x + 1; y := x + a; d := x + a } Q2 = { y=x+a, y=a+x }; … – versus the loop itself: { P } Inv = { N } while (x ≠ z) do x := x + 1; y := x + a; d := x + a { y=x+a, y=a+x, w=d, d=w }

k-loop unrolling: the following must hold: P ⇒ N, Q1 ⇒ N, Q2 ⇒ N, …, Qk ⇒ N, where Qi is the postcondition after i unrollings of the loop body, as above.

k-loop unrolling: the following must hold: P ⇒ N, Q1 ⇒ N, Q2 ⇒ N, …, Qk ⇒ N. We can compute the following sequence: N0 = P, N1 = N0 ⊔ Q1, N2 = N1 ⊔ Q2, …, Nk = Nk-1 ⊔ Qk. Observation 1: No need to explicitly unroll the loop – we can reuse the postcondition from unrolling k-1 for k.

k-loop unrolling: computing N0 = P, N1 = N0 ⊔ Q1, …, Nk = Nk-1 ⊔ Qk. Observation 2: Nk monotonically decreases the set of facts. Question: does it stabilize for some k?

Algorithm for annotating a loop: Annotate(P, while bexpr do S) = Initialize Nc := P; repeat: let Annotate(Nc, if bexpr then S else skip) be {Nc} if bexpr then S else skip {N’}; Nc := Nc ⊔ N’; until N’ = Nc. Return {P} Inv = {Nc} while bexpr do {F[assume bexpr](Nc)} Annotate(F[assume bexpr](Nc), S) {F[assume ¬bexpr](Nc)}.
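The fixed-point core of this algorithm can be sketched in a few lines. A hedged sketch: `body_post` stands in for "annotate the loop body and return its postcondition" and is a parameter here; for SAV fact sets the join is intersection, so the candidate invariant only shrinks, which guarantees termination over a finite fact universe.

```python
# Sketch of the loop-annotation fixed point: start from the precondition and
# repeatedly join (intersect, for SAV fact sets) with the body's
# postcondition until nothing changes.
def loop_invariant(pre, body_post):
    inv = set(pre)
    while True:
        new = inv & body_post(inv)   # N_c := N_c join Q
        if new == inv:               # stabilized: N' = N_c
            return inv
        inv = new

# Toy body that destroys fact 'b' but preserves everything else:
inv = loop_invariant({'a', 'b'}, lambda d: d - {'b'})
```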

A technical issue Unrolling loops is quite inconvenient and inefficient (but we can avoid it as we just saw) How do we handle more complex control-flow constructs, e.g., goto, break, exceptions…? Solution: model control-flow by labels and goto statements Would like a dedicated data structure to explicitly encode control flow in support of the analysis Solution: control-flow graphs (CFGs) 86

Intermediate language example: the program while (x ≠ z) do x := x + 1; y := x + a; d := x + a, followed by a := b, compiles to: label0: if x = z goto label1; x := x + 1; y := x + a; d := x + a; goto label0; label1: a := b

Control-flow graph example 88

1 label0: if x = z goto label1
2   x := x + 1
3   y := x + a
4   d := x + a
5   goto label0
6 label1: a := b

(Figure: the CFG for this program; each node is labeled by its line number.)

Control-flow graph example 89

1 label0: if x = z goto label1
2   x := x + 1
3   y := x + a
4   d := x + a
5   goto label0
6 label1: a := b

(Figure: the same CFG extended with dedicated entry and exit nodes.)

Control-flow graph 90

Nodes are statements or labels
Special nodes for entry/exit
An edge from node v to node w means that after executing the statement of v, control passes to w
– Conditions are represented by split and join nodes
– Loops create cycles
Can be generated from the abstract syntax tree in linear time
– Automatically taken care of by the front-end
Usage: store analysis results in CFG nodes
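A rough sketch of CFG construction from the goto-based intermediate language (the tuple encoding of statements is ours, not fixed by the slides): nodes are statement indices, and edges follow fall-through and jump targets.

```python
def build_cfg(stmts):
    """stmts: list of (kind, arg) pairs, e.g. ('label', 'l0'),
    ('if-goto', 'l1'), ('goto', 'l0'), ('assign', 'x:=x+1').
    Returns the edge set over statement indices."""
    labels = {a: i for i, (k, a) in enumerate(stmts) if k == 'label'}
    edges = set()
    for i, (kind, arg) in enumerate(stmts):
        if kind == 'goto':
            edges.add((i, labels[arg]))      # unconditional jump
        elif kind == 'if-goto':
            edges.add((i, labels[arg]))      # branch taken
            if i + 1 < len(stmts):
                edges.add((i, i + 1))        # branch not taken: fall-through
        else:
            if i + 1 < len(stmts):
                edges.add((i, i + 1))        # plain fall-through
    return edges

# the loop example from the previous slide, in this encoding
cfg = build_cfg([('label', 'l0'), ('if-goto', 'l1'),
                 ('assign', 'x:=x+1'), ('goto', 'l0'),
                 ('label', 'l1'), ('assign', 'a:=b')])
```

Note the back-edge from the `goto` to the loop header: this is the cycle that forces the fixed-point iteration.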

CFG with Basic Blocks 91

Stores basic blocks in a single node
Extended basic blocks – maximal connected loop-free subgraphs

(Figure: the example CFG with the straight-line statements x := x + 1; y := x + a; d := x + a collapsed into one basic-block node between the condition and the back-edge.)
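Grouping statements into basic blocks can be sketched with the classic leader rule (a sketch under our own statement encoding): leaders are the first statement, every label, and every statement following a jump.

```python
def basic_blocks(stmts):
    """stmts: list of (kind, arg) pairs as in the goto-based IL.
    Returns the list of basic blocks (maximal straight-line runs)."""
    leaders = {0}
    for i, (kind, _) in enumerate(stmts):
        if kind == 'label':                      # jump target starts a block
            leaders.add(i)
        if kind in ('goto', 'if-goto') and i + 1 < len(stmts):
            leaders.add(i + 1)                   # statement after a jump
    cuts = sorted(leaders) + [len(stmts)]
    return [stmts[a:b] for a, b in zip(cuts, cuts[1:])]

blocks = basic_blocks([('label', 'l0'), ('if-goto', 'l1'),
                       ('assign', 'x:=x+1'), ('assign', 'y:=x+a'),
                       ('goto', 'l0'), ('label', 'l1')])
```

The two assignments and the back-jump form a single block, matching the collapsed node in the figure.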

Next lecture: abstract interpretation

Previously 93

Available Expressions analysis
– Abstract transformer for assignments
– Processing serial composition
– Processing conditions
– Processing loops
Control-flow graphs

CSE optimization 94

{ x = a+b }
y := a + b

is transformed by common subexpression elimination (CSE) into

{ x = a+b }
y := x

Semantic domain 95

Define factoids: Σ = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var }
Define predicates as D = 2^Σ
Treat conjunctive formulas as sets of factoids:
{ a=b, c=b+d, b=c } ~ (a=b) ∧ (c=b+d) ∧ (b=c)
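The domain can be sketched concretely (the tuple encoding is ours, not fixed by the slides): a factoid is a tuple, a predicate is a set of factoids, and since a larger set is a stronger conjunction, implication is superset inclusion and the meet is set intersection.

```python
# ('eq', 'x', 'y') stands for x=y; ('add', 'x', 'y', 'z') for x=y+z
D = {('eq', 'a', 'b'), ('add', 'c', 'b', 'd'), ('eq', 'b', 'c')}

def implies(D1, D2):
    """D1 => D2 iff every factoid of D2 is already asserted by D1."""
    return D2 <= D1

def meet(D1, D2):
    """Weakest predicate implied by both: the common factoids."""
    return D1 & D2
```

This is why the loop iteration in the exercise only ever shrinks the annotation sets.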

Defining an SAV abstract transformer 96

Goal: define a function F_SAV[x := aexpr] : D → D s.t.
if F_SAV[x := aexpr](D) = D' then sp(x := aexpr, D) ⇒ D'
Idea: define rules for individual factoids and generalize to sets of factoids by the conjunction rule.

e is either a variable v or an addition expression v+w

{ x = e } x := aexpr { }             [kill-lhs]
{ y = x+w } x := aexpr { }           [kill-rhs-1]
{ y = w+x } x := aexpr { }           [kill-rhs-2]
{ } x := e { x = e }                 [gen]       (provided x does not occur in e)
{ y = z+w } x := aexpr { y = z+w }   [preserve]  (provided x ∉ {y, z, w})
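A minimal sketch of these kill/gen rules as a function on factoid sets (tuple encoding and helper names are ours): drop every factoid mentioning the assigned variable, then record the new equality when sound.

```python
def vars_of(f):
    """Variables occurring in a factoid tuple ('eq',x,y) / ('add',x,y,z)."""
    return set(f[1:])

def f_sav_assign(x, rhs, D):
    """F_SAV[x := rhs](D). rhs is a variable name (string) or a
    pair (v, w) standing for the addition v+w."""
    # kill-lhs / kill-rhs-1 / kill-rhs-2: drop factoids mentioning x
    D2 = {f for f in D if x not in vars_of(f)}
    # gen: add the new factoid, unless x occurs in its own rhs
    if isinstance(rhs, tuple):
        v, w = rhs
        if x not in (v, w):
            D2.add(('add', x, v, w))
    elif rhs != x:
        D2.add(('eq', x, rhs))
    return D2
    # preserve is implicit: surviving factoids pass through unchanged

post = f_sav_assign('y', ('a', 'b'), {('eq', 'y', 'z'), ('eq', 'x', 'w')})
```

The stale fact y=z is killed, x=w is preserved, and y=a+b is generated.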

Defining a reduction 97

For an SAV-predicate D define Explicate(D) = the minimal set D* such that:
1. D ⊆ D*
2. x=y ∈ D* implies y=x ∈ D*
3. x=y ∈ D* and y=z ∈ D* implies x=z ∈ D*
4. x=y+z ∈ D* implies x=z+y ∈ D*
5. x=y ∈ D* and x=z+w ∈ D* implies y=z+w ∈ D*
6. x=y ∈ D* and z=x+w ∈ D* implies z=y+w ∈ D*
7. x=z+w ∈ D* and y=z+w ∈ D* implies x=y ∈ D*

Define: F*[x := aexpr] = Explicate ∘ F_SAV[x := aexpr]
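Explicate can be sketched as a fixed-point closure over the rules above (tuple encoding ours; rule 1 is implicit in starting from D):

```python
def explicate(D):
    """Close a factoid set under rules 2-7 by iterating to a fixed point.
    ('eq', x, y) is x=y; ('add', x, y, z) is x=y+z."""
    D = set(D)
    while True:
        new = set(D)
        for f in D:
            if f[0] == 'eq':
                _, x, y = f
                new.add(('eq', y, x))                     # rule 2: symmetry
                for g in D:
                    if g[0] == 'eq' and g[1] == y:
                        new.add(('eq', x, g[2]))          # rule 3: transitivity
                    if g[0] == 'add' and g[1] == x:
                        new.add(('add', y, g[2], g[3]))   # rule 5: subst lhs
                    if g[0] == 'add' and g[2] == x:
                        new.add(('add', g[1], y, g[3]))   # rule 6: subst operand
            else:
                _, x, y, z = f
                new.add(('add', x, z, y))                 # rule 4: commutativity
                for g in D:
                    if g[0] == 'add' and g[1] != x and g[2:] == f[2:]:
                        new.add(('eq', x, g[1]))          # rule 7: same rhs
        if new == D:
            return D
        D = new

Dstar = explicate({('eq', 'a', 'b'), ('add', 'c', 'a', 'd')})
```

From a=b and c=a+d the closure derives c=b+d (rule 6) plus all symmetric and commuted variants.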

Recap

An algorithm for annotating SLP 99

Annotate(P, S₁; S₂) =
  let Annotate(P, S₁) be {P} A₁ {Q₁}
  let Annotate(Q₁, S₂) be {Q₁} A₂ {Q₂}
  return {P} A₁; {Q₁} A₂ {Q₂}

Handling conditional expressions 100

We want to soundly approximate D ∧ bexpr and D ∧ ¬bexpr in D
Define an artificial statement assume bexpr:
⟨assume bexpr, s⟩ →_sos s  if B⟦bexpr⟧ s = tt
Define β(bexpr) = if bexpr is a factoid then {bexpr} else {}
Define F[assume bexpr](D) = D ∪ β(bexpr)
Can sharpen: F*[assume bexpr] = Explicate ∘ F_SAV[assume bexpr]
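A minimal sketch of F[assume bexpr] (representation ours): when the condition is itself a factoid we conjoin it; otherwise we soundly keep D unchanged, since D ∧ bexpr ⇒ D.

```python
def f_assume(bexpr_factoid, D):
    """bexpr_factoid: a factoid tuple such as ('eq', 'x', 'z'),
    or None when bexpr is not expressible as a single factoid."""
    if bexpr_factoid is None:
        return set(D)              # sound over-approximation
    return set(D) | {bexpr_factoid}

# at the exit of while (x != z), the negated condition x = z is a factoid:
post = f_assume(('eq', 'x', 'z'), {('eq', 'w', 'x')})
```

This is exactly how the final annotation { x=z } appears after the loop in the exercise.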

An algorithm for annotating conditions 101

Annotate(P, if bexpr then S₁ else S₂) =
  let P_t = F*[assume bexpr] P
  let P_f = F*[assume ¬bexpr] P
  let Annotate(P_t, S₁) be {P_t} A₁ {Q₁}
  let Annotate(P_f, S₂) be {P_f} A₂ {Q₂}
  return {P} if bexpr then {P_t} A₁ {Q₁} else {P_f} A₂ {Q₂} {Q₁ ∩ Q₂}

Algorithm for annotating loops 102

Annotate(P, while bexpr do S) =
  N' := N_c := P  // Initialize
  repeat
    let P_t = F[assume bexpr] N_c
    let Annotate(P_t, S) be {P_t} A_body {N'}
    N_c := N_c ∩ N'
  until N' = N_c
  return {P}
         INV = {N'}
         while bexpr do
           {P_t} A_body
         {F[assume ¬bexpr](N')}

Algorithm for annotating a program 103

Annotate(P, S) =
case S is x := aexpr
  return {P} x := aexpr {F*[x := aexpr] P}
case S is S₁; S₂
  let Annotate(P, S₁) be {P} A₁ {Q₁}
  let Annotate(Q₁, S₂) be {Q₁} A₂ {Q₂}
  return {P} A₁; {Q₁} A₂ {Q₂}
case S is if bexpr then S₁ else S₂
  let P_t = F[assume bexpr] P
  let P_f = F[assume ¬bexpr] P
  let Annotate(P_t, S₁) be {P_t} A₁ {Q₁}
  let Annotate(P_f, S₂) be {P_f} A₂ {Q₂}
  return {P} if bexpr then {P_t} A₁ {Q₁} else {P_f} A₂ {Q₂} {Q₁ ∩ Q₂}
case S is while bexpr do S_body
  N' := N_c := P  // Initialize
  repeat
    let P_t = F[assume bexpr] N_c
    let Annotate(P_t, S_body) be {P_t} A_body {N'}
    N_c := N_c ∩ N'
  until N' = N_c
  return {P} INV = {N'} while bexpr do {P_t} A_body {F[assume ¬bexpr](N')}
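The while-case above can be sketched as a driver over opaque abstract transformers (all function parameters here are hypothetical stand-ins for F[assume bexpr], the annotated body, and F[assume ¬bexpr]):

```python
def annotate_while(P, f_assume_b, f_body, f_assume_not_b):
    """Iterate the loop body's transformer from precondition P,
    meeting (intersecting) successive postconditions until the
    invariant candidate N_c stabilizes."""
    Nc = set(P)
    while True:
        Pt = f_assume_b(Nc)        # entering the body: bexpr holds
        Nprime = f_body(Pt)        # postcondition of one body pass
        Nc2 = Nc & Nprime          # N_c := N_c meet N'
        if Nc2 == Nc:              # until N' = N_c
            break
        Nc = Nc2
    return Nc, f_assume_not_b(Nc)  # invariant, loop postcondition

# toy run: body kills facts about x; exit adds the negated condition x=z
inv, post = annotate_while(
    {'x=y', 'w=d'},
    lambda D: D,                               # assume bexpr: identity here
    lambda D: {f for f in D if 'x' not in f},  # body transformer
    lambda D: D | {'x=z'})                     # assume not bexpr
```

Termination follows because `Nc` only shrinks inside a finite set of factoids.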

Exercise: apply algorithm 104

{ }
y := a+b
{ }
x := y
{ }
while (x ≠ z) do
  { }
  w := a+b
  { }
  x := a+b
  { }
  a := z
  { }

Step 1/18

{ }
y := a+b
{ y=a+b }*
x := y
while (x ≠ z) do
  w := a+b
  x := a+b
  a := z

Not all factoids are shown – apply Explicate to get all factoids.

Step 2/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  w := a+b
  x := a+b
  a := z

Step 3/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv' = { y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  w := a+b
  x := a+b
  a := z

Step 4/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv' = { y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  { y=a+b, x=y, x=a+b }*
  w := a+b
  x := a+b
  a := z

Step 5/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv' = { y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  { y=a+b, x=y, x=a+b }*
  w := a+b
  { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }*
  x := a+b
  a := z

Step 6/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv' = { y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  { y=a+b, x=y, x=a+b }*
  w := a+b
  { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }*
  x := a+b
  { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }*
  a := z

Step 7/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv' = { y=a+b, x=y, x=a+b }*
while (x ≠ z) do
  { y=a+b, x=y, x=a+b }*
  w := a+b
  { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }*
  x := a+b
  { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }*
  a := z
  { w=y, w=x, x=y, a=z }*

Step 8/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv'' = { x=y }*
while (x ≠ z) do
  { y=a+b, x=y, x=a+b }*
  w := a+b
  { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }*
  x := a+b
  { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }*
  a := z
  { w=y, w=x, x=y, a=z }*

Step 9/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv'' = { x=y }*
while (x ≠ z) do
  { x=y }*
  w := a+b
  { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }*
  x := a+b
  { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }*
  a := z
  { w=y, w=x, x=y, a=z }*

Step 10/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv'' = { x=y }*
while (x ≠ z) do
  { x=y }*
  w := a+b
  { x=y, w=a+b }*
  x := a+b
  { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }*
  a := z
  { w=y, w=x, x=y, a=z }*

Step 11/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv'' = { x=y }*
while (x ≠ z) do
  { x=y }*
  w := a+b
  { x=y, w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=y, w=x, x=y, a=z }*

Step 12/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv'' = { x=y }*
while (x ≠ z) do
  { x=y }*
  w := a+b
  { x=y, w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 13/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv''' = { }
while (x ≠ z) do
  { x=y }*
  w := a+b
  { x=y, w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 14/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv''' = { }
while (x ≠ z) do
  { }
  w := a+b
  { x=y, w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 15/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv''' = { }
while (x ≠ z) do
  { }
  w := a+b
  { w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 16/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv''' = { }
while (x ≠ z) do
  { }
  w := a+b
  { w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 17/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv''' = { }
while (x ≠ z) do
  { }
  w := a+b
  { w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*

Step 18/18

{ }
y := a+b
{ y=a+b }*
x := y
{ y=a+b, x=y, x=a+b }*
Inv = { }
while (x ≠ z) do
  { }
  w := a+b
  { w=a+b }*
  x := a+b
  { x=a+b, w=a+b, w=x }*
  a := z
  { w=x, a=z }*
{ x=z }