Download presentation
Presentation is loading. Please wait.
1
Spring 2016 Program Analysis and Verification
Lecture 7: Static Analysis I Roman Manevich Ben-Gurion University
2
Tentative syllabus Program Verification Program Analysis Basics
Operational semantics Hoare Logic Applying Hoare Logic Weakest Precondition Calculus Proving Termination Data structures Automated Verification Program Analysis Basics From Hoare Logic to Static Analysis Control Flow Graphs Equation Systems Collecting Semantics Using Soot Abstract Interpretation fundamentals Lattices Fixed-Points Chaotic Iteration Galois Connections Domain constructors Widening/ Narrowing Analysis Techniques Numerical Domains Alias analysis Interprocedural Analysis Shape Analysis CEGAR
3
Previously Axiomatic verification Weakest precondition calculus
Strongest postcondition calculus Handling data structures Total correctness
4
Agenda Static analysis for compiler optimization
Common Subexpression Elimination Available Expression domain Develop a static analysis: Simple Available Expressions Constant Propagation Basic concepts in static analysis Control flow graphs Equation systems Collecting semantics
5
Array-max example: Post1
nums : array N : int // N stands for num’s length { N0 } x := 0 { N0 x=0 } res := nums[0] { x=0 } Inv = { xN } while x < N { x=k k<N } if nums[x] > res then res := nums[x] { x=k k<N } x := x { x=k+1 k<N } { xN xN } { x=N }
6
Can we find this proof automatically?
nums : array N : int { N0 } x := 0 { N0 x=0 } res := nums[0] { x=0 } Inv = { xN } while x < N { x=k k<N } if nums[x] > res then { x=k k<N } res := nums[x] { x=k k<N } { x=k k<N } x := x { x=k+1 k<N } { xN xN } { x=N } Observation: predicates in proof have the general form constraint where constraint has the form X - Y c or X c
7
Look under the street lamp
…We may move lamp a bit By Infopablo00 (Own work) [CC-BY-SA-3.0 ( via Wikimedia Commons
8
Zone Abstract Domain Developed by Antoine Mine in his Ph.D. thesis
Uses constraints of the form X - Y c and X c
9
Analysis with Zone abstract domain
Static Analysis with Zone Abstraction Manual Proof nums : array N : int { N0 } x := 0 { N0 x=0 } res := nums[0] { N0 x=0 } Inv = { N0 0xN } while x < N { N0 0x<N } if nums[x] > res then { N0 0x<N } res := nums[x] { N0 0x<N } { N0 0x<N } x := x { N0 0<x<N } {N0 0x x=N } nums : array N : int { N0 } x := 0 { N0 x=0 } res := nums[0] { x=0 } Inv = { xN } while x < N { x=k kN } if nums[x] > res then { x=k k<N } res := nums[x] { x=k k<N } { x=k k<N } x := x { x=k+1 k<N } { xN xN } { x=N }
10
Array-max example: Post3
nums : array { N0 0m<N } // N stands for num’s length x := 0 { x=0 } res := nums[0] { x=0 res=nums(0) } Inv = { 0m<x nums(m)res } while x < N { x=k res=oRes 0m<k nums(m)oRes } if nums[x] > res then { nums(x)>oRes res=oRes x=k 0m<k nums(m)oRes } res := nums[x] { res=nums(x) nums(x)>oRes x=k 0m<k nums(m)oRes } { x=k 0mk nums(m)<res } { (x=k 0m<k nums(m)<res) (res≥nums(x) x=k res=oRes 0m<k nums(m)oRes)} { x=k 0m<k nums(m)res } x := x { x=k+1 0mx-1 nums(m)res } { 0m<x nums(m)res } { x=N 0m<x nums(m)res} [univp]{ m. 0m<N nums(m)res }
11
Can we find this proof automatically?
Various static analysis techniques can A framework for numeric analysis of array operations [Gopan et al. in POPL 2015] Discovering properties about arrays in simple programs [Halbwachs & Péron in PLDI 2008]
12
Static analysis for compiler optimizations
13
Motivating problem: optimization
A compiler optimization is defined by a program transformation: T : Stmt Stmt The transformation is semantics-preserving: s. Ssos C s = Ssos T(C) s The transformation is applied to the program only if an enabling condition is met We use static analysis for inferring enabling conditions
14
Common Subexpression Elimination
If we have two variable assignments x := a op b … y := a op b and the values of x, a, and b have not changed between the assignments, rewrite the code as x = a op b … y := x Eliminates useless recalculation Paves the way for more optimizations (e.g., dead code elimination) op {+, -, *, ==, <=}
15
What do we need to prove? CSE { true } C1
x := a op b C2 { x = a op b } y := a op b C3 { true } C1 x := a op b C2 { x = a op b } y := x C3 CSE Assertion localizes decision
16
A simplified problem CSE { true } C1 x := a + b C2 { x = a + b }
y := a + b C3 { true } C1 x := a + b C2 { x = a + b } y := x C3 CSE
17
Available Expressions analysis
A static analysis that infers for every program point a set of facts of the form AV = { x = y | x, y Var } { x = op y | x, y Var, op {-, !} } { x = y op z | y, z Var, op {+, -, *, <=} } For every program with n=|Var| variables number of possible facts is finite: |AV|=O(n3) Yields a trivial algorithm … Is it efficient?
18
Simple Available Expressions
Define atomic facts (for SAV) as = { x = y | x, y Var } { x = y + z | x, y, z Var } For n=|Var| number of atomic facts is O(n3) Define sav-predicates as = 2
19
Notation for conjunctive sets of facts
For a set of atomic facts D , we define Conj(D) = D E.g., if D={a=b, c=b+d, b=c} then Conj(D) = (a=b) (c=b+d) (b=c) Notice that for two sets of facts D1 and D2 Conj(D1 D2) = Conj(D1) Conj(D1) What does Conj({}) stand for…?
20
Towards an automatic proof
Goal: automatically compute an annotated program proving as many facts as possible of the form x = y and x = y + z Decision 1: develop a forward-going proof Decision 2: draw predicates from a finite set D “looking under the light of the lamp” A compromise that simplifies problem by focusing attention – possibly miss some facts that hold Challenge 1: handle straight-line code Challenge 2: handle conditions Challenge 3: handle loops
21
Challenge 1: handling straight-line code
By Zachary Dylan Tax (Zachary Dylan Tax) [GFDL ( or CC-BY-3.0 ( via Wikimedia Commons
22
Straight line code example
{ } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c } Find a proof that satisfies both conditions
23
Straight line code example
sp { } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c } cons Frame Can we turn this into an algorithm? What should we ensure for each triple?
24
Goal Given a program of the form x1 := a1; … xn := an
Find predicates P0, …, Pn such that {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof That is: sp(xi := ai, Pi-1) Pi Each Pi has the form Conj(Di) where Di is a set of atomic
25
Algorithm for straight-line code
Goal: find predicates P0, …, Pn such that {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof That is: sp(xi := ai, Pi-1) Pi Each Pi has the form Conj(Di) where Di is a set of atomic facts Idea: define a function FSAV[x:=a] : s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) Conj(D’) We call F the abstract transformer of x:=a Unless D0 is given, initialize D0={} (why?) For each i: compute Di+1 = Conj(FSAV[xi := ai] Di) Finally Pi = Conj(Di)
26
Defining an SAV abstract transformer
Goal: define a function FSAV[x:=a] : s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) Conj(D’) Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule
27
Defining an SAV abstract transformer
Goal: define a function FSAV[x:=a] : s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) Conj(D’) Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule
28
Defining an SAV abstract transformer
Goal: define a function FSAV[x:=a] : s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) Conj(D’) Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule { x= } x:=a { } [kill-lhs] Is either a variable v or an addition expression v+w { y=x+w } x:=a { } [kill-rhs-1] { y=w+x } x:=a { } [kill-rhs-2] { } x:= { x= } [gen] { y=z+w } x:=a { y=z+w } [preserve]
29
SAV abstract transformer example
{ } x := a + b { x=a+b } z := a + c { x=a+b, z=a+c } b := a * c { z=a+c } Is either a variable v or an addition expression v+w { x= } x:= aexpr { } [kill-lhs] { y=x+w } x:= aexpr { } [kill-rhs-1] { y=w+x } x:= aexpr { } [kill-rhs-2] { } x:= { x= } [gen] { y=z+w } x:= aexpr { y=z+w } [preserve]
30
Problem 1: large expressions
{ } x := a + b + c { } y := a + b + c { } Missed CSE opportunity Large expressions on the right hand sides of assignments are problematic Can miss optimization opportunities Require complex transformers Solution: …?
31
Problem 1: large expressions
{ } x := a + b + c { } y := a + b + c { } Missed CSE opportunity Large expressions on the right hand sides of assignments are problematic Can miss optimization opportunities Require complex transformers Solution: transform code to normal form where right-hand sides have bounded size Standard compiler transformation – lowering into three address code
32
Three-address code { } x := a + b + c { } y := a + b + c { } { } i1 := a + b { i1=a+b } x := i1 + c { i1=a+b, x=i1+c } i2 := a + b { i1=a+b, x=i1+c, i2=a+b } y := i2 + c { i1=a+b, x=i1+c, i2=a+b, y=i2+c } Main idea: simplify expressions by storing intermediate results in new temporary variables Number of variables in simplified statements 3
33
Three-address code { } x := a + b + c { } y := a + b + c { } { } i1 := a + b { i1=a+b } x := i1 + c { i1=a+b, x=i1+c } i2 := a + b { i1=a+b, x=i1+c, i2=a+b } y := i2 + c { i1=a+b, x=i1+c, i2=a+b, y=i2+c } Need to infer i1=i2 Main idea: simplify expressions by storing intermediate results in new temporary variables Number of variables in simplified statements 3
34
Problem 2: transformer precision
{ } i1 := a + b { i1=a+b } x := i1 + c { i1=a+b, x=i1+c } i2 := a + b { i1=a+b, x=i1+c, i2=a+b } y := i2 + c { i1=a+b, x=i1+c, i2=a+b, y=i2+c } Need to infer i1=i2 Our transformer only infers syntactically available expressions – ones that appear in the code explicitly We want a transformer that considers the meaning of the predicates Takes equalities into account
35
Defining a semantic reduction
Idea: make as many implicit facts explicit by Using symmetry and transitivity of equality Commutativity of addition Meaning of equality – can substitute equal variables For an SAV-predicate P=Conj(D) define reduce(D) = minimal set D* such that: D D* x=y D* implies y=x D* x=y D* y=z D* implies x=z D* x=y+z D* implies x=z+y D* x=y D* and x=z+w D* implies y=z+w D* x=y D* and z=x+w D* implies z=y+w D* x=z+w D* and y=z+w D* implies x=y D* Notice that reduce(D) D reduce is a special case of a semantic reduction
36
Sharpening the transformer
Define: F*[x:=aexpr] = reduce FSAV[x:= aexpr] { } i1 := a + b { i1=a+b, i1=b+a } x := i1 + c { i1=a+b, i1=b+a, x=i1+c, x=c+i1 } i2 := a + b { i1=a+b, i1=b+a, x=i1+c, x=c+i1, i2=a+b, i2=b+a, i1=i2, i2=i1, x=i2+c, x=c+i2, } y := i2 + c { ... } Since sets of facts and their conjunction are isomorphic we will use them interchangeably
37
An algorithm for annotating SLP
Annotate(P, x:=aexpr) = {P} x:=aexpr F*[x:= aexpr](P) Annotate(P, S1; S2) = let Annotate(P, S1) be {P} A1 {Q1} let Annotate(Q1, S2) be {Q1} A2 {Q2} return {P} A1; {Q1} A2 {Q2}
38
Challenge 2: handling conditions
39
Goal {bexpr P } S1 { Q }, { bexpr P } S2 { Q } { P } if bexpr then S1 else S2 { Q } [ifp] Annotate a program if bexpr then S1 else S2 with predicates from Assumption 1: P is given (otherwise use true) Assumption 2: bexpr is a simple binary expression e.g., x=y, xy, x<y (why?) { P } if bexpr then { bexpr P } S1 { Q1 } else { bexpr P } S2 { Q2 } { Q }
40
Joining predicates [ifp]
{bexpr P } S1 { Q }, { bexpr P } S2 { Q } { P } if bexpr then S1 else S2 { Q } [ifp] Possibly an SAV-fact Start with P or {bexpr P} and annotate S1 (yielding Q1) Start with P or {bexpr P} and annotate S2 (yielding Q2) How do we infer a Q such that Q1Q and Q2Q? Q1=Conj(D1), Q2=Conj(D2) Define: Q = Q1 Q = Conj(D1 D2) { P } if bexpr then { bexpr P } S1 { Q1 } else { bexpr P } S2 { Q2 } { Q } Possibly an SAV-fact
41
Joining predicates [ifp]
{bexpr P } S1 { Q }, { bexpr P } S2 { Q } { P } if bexpr then S1 else S2 { Q } [ifp] Start with P or {bexpr P} and annotate S1 (yielding Q1) Start with P or {bexpr P} and annotate S2 (yielding Q2) How do we infer a Q such that Q1Q and Q2Q? Q1=Conj(D1), Q2=Conj(D2) Define: Q = Q1 Q = Conj(D1 D2) { P } if bexpr then { bexpr P } S1 { Q1 } else { bexpr P } S2 { Q2 } { Q } The join operator for SAV
42
Joining predicates Q1=Conj(D1), Q2=Conj(D2)
We want to soundly approximate Q1 Q2 in Define: Q = Q1 Q = Conj(D1 D2) Notice that Q1Q and Q2Q meaning Q1 Q2 Q
43
Simplifying handling of conditions
Extend While with Non-determinism (or) and An assume statement assume b, s sos s if B b s = tt Use the fact that the following two statements are equivalent if b then S1 else S2 (assume b; S1) or (assume b; S2)
44
Handling conditional expressions
We want to soundly approximate D bexpr and D bexpr in Define (bexpr) = if bexpr is factoid {bexpr} else {} Define F[assume bexpr](D) = D (bexpr) Can sharpen F*[assume bexpr] = reduce FSAV[assume bexpr]
45
Handling conditional expressions
Notice bexpr (bexpr) Examples (y=z) = {y=z} (y<z) = {}
46
An algorithm for annotating conditions
let Pt = F*[assume bexpr] P let Pf = F*[assume bexpr] P let Annotate(Pt, S1) be {Pt} A1 {Q1} let Annotate(Pf, S2) be {Pf} A2 {Q2} return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 Q2}
47
Example { } if (x = y) { x=y, y=x } a := b + c { x=y, y=x, a=b+c, a=c+b } d := b – c { x=y, y=x, a=b+c, a=c+b } else { } a := b + c { a=b+c, a=c+b } d := b + c { a=b+c, a=c+b, d=b+c, d=c+b, a=d, d=a } { a=b+c, a=c+b }
48
Example { } if (x = y) { x=y, y=x } a := b + c { x=y, y=x, a=b+c, a=c+b } d := b – c { x=y, y=x, a=b+c, a=c+b } else { } a := b + c { a=b+c, a=c+b } d := b + c { a=b+c, a=c+b, d=b+c, d=c+b, a=d, d=a } { a=b+c, a=c+b }
49
Recap We now have an algorithm for soundly annotating loop-free code
Generates forward-going proofs Algorithm operates on abstract syntax tree of code Handles straight-line code by applying F* Handles conditions by recursively annotating true and false branches and then intersecting their postconditions
50
Example { } if (x = y) { x=y, y=x } a := b + c { x=y, y=x, a=b+c, a=c+b } d := b – c { x=y, y=x, a=b+c, a=c+b } else { } a := b + c { a=b+c, a=c+b } d := b + c { a=b+c, a=c+b, d=b+c, d=c+b, a=d, d=a } { a=b+c, a=c+b }
51
Challenge 2: handling loops
By Stefan Scheer (Own work (Own Photo)) [GFDL ( CC-BY-SA-3.0 ( or CC-BY-SA ( via Wikimedia Commons
52
{bexpr P } S { P } { P } while b do S {bexpr P }
Goal {bexpr P } S { P } { P } while b do S {bexpr P } [whilep] Annotate a program while bexpr do S with predicates from s.t. P N Main challenge: find N Assumption 1: P is given (otherwise use true) Assumption 2: bexpr is a simple binary expression { P } Inv = { N } while bexpr do { bexpr N } S { Q } {bexpr N }
53
Example: annotate this program
{ y=x+a, y=a+x, w=d, d=w } Inv = { y=x+a, y=a+x } while (x z) do { z=x+a, z=a+x, w=d, d=w } x := x { w=d, d=w } y := x + a { y=x+a, y=a+x, w=d, d=w } d := x + a { y=x+a, y=a+x, d=x+a, d=a+x, y=d, d=y } { y=x+a, y=a+x, x=z, z=x }
54
Example: annotate this program
{ y=x+a, y=a+x, w=d, d=w } Inv = { y=x+a, y=a+x } while (x z) do { y=x+a, y=a+x } x := x { } y := x + a { y=x+a, y=a+x } d := x + a { y=x+a, y=a+x, d=x+a, d=a+x, y=d, d=y } { y=x+a, y=a+x, x=z, z=x }
55
{bexpr P } S { P } { P } while b do S {bexpr P }
Goal {bexpr P } S { P } { P } while b do S {bexpr P } [whilep] Idea: try to guess a loop invariant from a small number of loop unrollings We know how to annotate S (by induction) { P } Inv = { N } while bexpr do { bexpr N } S { Q } {bexpr N }
56
k-loop unrolling { y=x+a, y=a+x, w=d, d=w } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a } { P } Inv = { N } while (x z) do x := x y := x + a d := x + a { P } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a } if (x z) x := x y := x + a d := x + a Q2 = { y=x+a } …
57
k-loop unrolling { y=x+a, y=a+x, w=d, d=w } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } { P } Inv = { N } while (x z) do x := x y := x + a d := x + a { P } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } if (x z) x := x y := x + a d := x + a Q2 = { y=x+a, y=a+x } …
58
k-loop unrolling { y=x+a, y=a+x, w=d, d=w } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } { P } Inv = { N } while (x z) do x := x y := x + a d := x + a { P } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } if (x z) x := x y := x + a d := x + a Q2 = { y=x+a, y=a+x } The following must hold: P N Q1 N Q2 N … Qk N …
59
k-loop unrolling { y=x+a, y=a+x, w=d, d=w } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } { P } Inv = { N } while (x z) do x := x y := x + a d := x + a { P } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } if (x z) x := x y := x + a d := x + a Q2 = { y=x+a, y=a+x } The following must hold: P N Q1 N Q2 N … Qk N … Observation 1: No need to explicitly unroll loop – we can reuse postcondition from unrolling k-1 for k We can compute the following sequence: N0 = P N1 = N0 Q1 N2 = N1 Q2 … Nk = Nk-1 Qk …
60
k-loop unrolling { y=x+a, y=a+x, w=d, d=w } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } { P } Inv = { N } while (x z) do x := x y := x + a d := x + a { P } if (x z) x := x y := x + a d := x + a Q1 = { y=x+a, y=a+x } if (x z) x := x y := x + a d := x + a Q2 = { y=x+a, y=a+x } The following must hold: P N Q1 N Q2 N … Qk N … Observation 2: Nk monotonically decreases set of facts. Question: does it stabilizes for some k? We can compute the following sequence: N0 = P N1 = N1 Q1 N2 = N1 Q2 … Nk = Nk-1 Qk …
61
Algorithm for annotating a loop
Annotate(P, while bexpr do S) = Initialize N := Nc := P repeat let Annotate(P, if b then S else skip) be {Nc} if bexpr then S else skip {N} Nc := Nc N until N = Nc return {P} INV= N while bexpr do F[assume bexpr](N) Annotate(F[assume bexpr](N), S) F[assume bexpr](N)
62
Putting it together
63
Algorithm for annotating a program
Annotate(P, S) = case S is x:=aexpr return {P} x:=aexpr {F*[x:=aexpr] P} case S is S1; S2 let Annotate(P, S1) be {P} A1 {Q1} let Annotate(Q1, S2) be {Q1} A2 {Q2} return {P} A1; {Q1} A2 {Q2} case S is if bexpr then S1 else S2 let Pt = F[assume bexpr] P let Pf = F[assume bexpr] P let Annotate(Pt, S1) be {Pt} A1 {Q1} let Annotate(Pf, S2) be {Pf} A2 {Q2} return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 Q2} case S is while bexpr do S N := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N} Nc := Nc N until N = Nc return {P} INV= {N} while bexpr do {Pt} Abody {F[assume bexpr](N)}
64
Exercise: apply algorithm
{ } y := a+b { } x := y { } while (xz) do { } w := a+b { } x := a+b { } a := z { }
65
Step 1/18 {} y := a+b { y=a+b }*
Not all factoids are shown – apply reduce to get all factoids {} y := a+b { y=a+b }* x := y while (xz) do w := a+b x := a+b a := z
66
Step 2/18 {} y := a+b { y=a+b }* x := y { y=a+b, x=y, x=a+b }*
while (xz) do w := a+b x := a+b a := z
67
Step 3/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’ = { y=a+b, x=y, x=a+b }* while (xz) do w := a+b x := a+b a := z
68
Step 4/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’ = { y=a+b, x=y, x=a+b }* while (xz) do { y=a+b, x=y, x=a+b }* w := a+b x := a+b a := z
69
Step 5/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’ = { y=a+b, x=y, x=a+b }* while (xz) do { y=a+b, x=y, x=a+b }* w := a+b { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }* x := a+b a := z
70
Step 6/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’ = { y=a+b, x=y, x=a+b }* while (xz) do { y=a+b, x=y, x=a+b }* w := a+b { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }* x := a+b { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }* a := z
71
Step 7/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’ = { y=a+b, x=y, x=a+b }* while (xz) do { y=a+b, x=y, x=a+b }* w := a+b { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }* x := a+b { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }* a := z { w=y, w=x, x=y, a=z }*
72
Step 8/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’ = { x=y }* while (xz) do { y=a+b, x=y, x=a+b }* w := a+b { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }* x := a+b { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }* a := z { w=y, w=x, x=y, a=z }*
73
Step 9/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’ = { x=y }* while (xz) do { x=y }* w := a+b { y=a+b, x=y, x=a+b, w=a+b, w=x, w=y }* x := a+b { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }* a := z { w=y, w=x, x=y, a=z }*
74
Step 10/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’ = { x=y }* while (xz) do { x=y }* w := a+b { x=y, w=a+b }* x := a+b { y=a+b, w=a+b, w=y, x=a+b, w=x, x=y }* a := z { w=y, w=x, x=y, a=z }*
75
Step 11/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’ = { x=y }* while (xz) do { x=y }* w := a+b { x=y, w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=y, w=x, x=y, a=z }*
76
Step 12/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’ = { x=y }* while (xz) do { x=y }* w := a+b { x=y, w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
77
Step 13/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’’ = { } while (xz) do { x=y }* w := a+b { x=y, w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
78
Step 14/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’’ = { } while (xz) do { } w := a+b { x=y, w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
79
Step 15/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’’ = { } while (xz) do { } w := a+b { w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
80
Step 16/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’’ = { } while (xz) do { } w := a+b { w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
81
Step 17/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv’’’ = { } while (xz) do { } w := a+b { w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }*
82
Step 18/18 {} y := a+b { y=a+b }*
x := y { y=a+b, x=y, x=a+b }* Inv = { } while (xz) do { } w := a+b { w=a+b }* x := a+b { x=a+b, w=a+b, w=x }* a := z { w=x, a=z }* { x=z }*
83
Constant propagation
84
Second static analysis example
Optimization: constant folding Example: x:=7; y:=x*9 transformed to: x:=7; y:=7*9 and then to: x:=7; y:=63 Analysis: constant propagation (CP) Infers facts of the form x=c simplifies constant expressions constant folding { x=c } y := aexpr y := eval(aexpr[c/x])
85
Plan Define domain – set of allowed assertions Handle assignments
Handle composition Handle conditions Handle loops
86
Constant propagation domain
87
CP semantic domain ?
88
CP semantic domain Define CP-factoids: = { x = c | x Var, c Z }
How many factoids are there? Define predicates as = 2 How many predicates are there? Do all predicates make sense? (x=5) (x=7) Treat conjunctive formulas as sets of factoids {x=5, y=7} ~ (x=5) (y=7)
89
Handling assignments
90
CP abstract transformer
Goal: define a function FCP[x:=aexpr] : such that if FCP[x:=aexpr] P = P’ then sp(x:=aexpr, P) P’ ?
91
CP abstract transformer
Goal: define a function FCP[x:=aexpr] : such that if FCP[x:=aexpr] P = P’ then sp(x:=aexpr, P) P’ { x=c } x:=aexpr { } [kill] { } x:=c { x=c } [gen-1] { y=c1, z=c2 } x:=y op z { x=c} and c=c1 op c2 [gen-2] { y=c } x:=aexpr { y=c } [preserve]
92
Gen-kill formulation of transformers
Suited for analysis propagating sets of factoids Available expressions, Constant propagation, etc. For each statement, define a set of killed factoids and a set of generated factoids F[S] P = (P \ kill(S)) gen(S) FCP[x:=aexpr] P = (P \ {x=c}) aexpr is not a constant FCP[x:=k] P = (P \ {x=c}) {x=k} Used in dataflow analysis – a special case of abstract interpretation
93
Handling composition
94
Does this still work? Annotate(P, S1; S2) = let Annotate(P, S1) be {P} A1 {Q1} let Annotate(Q1, S2) be {Q1} A2 {Q2} return {P} A1; {Q1} A2 {Q2}
95
Handling conditions
96
Handling conditional expressions
We want to soundly approximate D bexpr and D bexpr in Define (bexpr) = if bexpr is CP-factoid {bexpr} else {} Define F[assume bexpr](D) = D (bexpr)
97
Does this still work? let Pt = F[assume bexpr] P let Pf = F[assume bexpr] P let Annotate(Pt, S1) be {Pt} A1 {Q1} let Annotate(Pf, S2) be {Pf} A2 {Q2} return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 Q2} How do we define join for CP?
98
Join example {x=5, y=7} {x=3, y=7, z=9} =
99
Handling loops
100
Does this still work? What about correctness? What about termination?
Annotate(P, while bexpr do S) = N := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N} Nc := Nc N until N = Nc return {P} INV= {N} while bexpr do {Pt} Abody {F[assume bexpr](N)} What about correctness? What about termination?
101
Does this still work? What about correctness? What about termination?
Annotate(P, while bexpr do S) = N := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N} Nc := Nc N until N = Nc return {P} INV= {N} while bexpr do {Pt} Abody {F[assume bexpr](N)} What about correctness? If loop terminates then is N a loop invariant? What about termination?
102
A termination principle
g : X X is a function How can we determine whether the sequence x0, x1 = g(x0), …, xk+1=g(xk),… stabilizes? Technique: Find ranking function rank : X N (that is show that rank(x) 0 for all x) Show that if xg(x) then rank(g(x)) < rank(x)
103
Rank function for available expressions
rank(P) = ?
104
Rank function for available expressions
Annotate(P, while bexpr do S) = N := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N} Nc := Nc N until N = Nc return {P} INV= {N} while bexpr do {Pt} Abody {F[assume bexpr](N)} rank(P) = |P| number of factoids Prove that either Nc = Nc N or rank(Nc N) <? rank(Nc)
105
Rank function for constant propagation
Annotate(P, while bexpr do S) = N := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N} Nc := Nc N until N = Nc return {P} INV= {N} while bexpr do {Pt} Abody {F[assume bexpr](N)} rank(P) = ? Prove that either Nc = Nc N or rank(Nc) >? rank(Nc N)
106
Rank function for constant propagation
Annotate(P, while bexpr do S) = N’ := Nc := P // Initialize repeat let Pt = F[assume bexpr] Nc let Annotate(Pt, S) be {Nc} Abody {N’} Nc := Nc N’ until N’ = Nc return {P} INV= {N’} while bexpr do {Pt} Abody {F[assume bexpr](N)} rank(P) = |P| number of factoids Prove that either Nc = Nc N’ or rank(Nc) >? rank(Nc N’)
107
Available Expressions Abstract Interpretation
Generalizing 1 Available Expressions Constant Propagation By NMZ (Photoshop) [CC0], via Wikimedia Commons Abstract Interpretation
108
Towards a recipe for static analysis
Two static analyses Available Expressions (extended with equalities) Constant Propagation Semantic domain – a family of formulas Join operator approximates pairs of formulas Abstract transformers for basic statements Assignments assume statements Initial precondition
109
Control flow graphs
110
A technical issue Unrolling loops is quite inconvenient and inefficient (but we can avoid it as we just saw) How do we handle more complex control-flow constructs, e.g., goto , break, exceptions…? The problem: non-inductive control flow constructs Solution: model control-flow by labels and goto statements Would like a dedicated data structure to explicitly encode control flow in support of the analysis Solution: control-flow graphs (CFGs)
111
Modeling control flow with labels
while (x z) do x := x y := x + a d := x + a a := b label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b
112
Control-flow graph example
line number label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b 1 2 3 4 1 label0: 5 6 2 if x z 7 8 label1: x := x + 1 7 3 a := b y := x + a 8 4 d := x + a 5 goto label0 6
113
Control-flow graph example
label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b 1 entry 2 3 4 1 label0: 5 6 2 if x z 7 8 label1: x := x + 1 7 3 a := b y := x + a 8 4 exit d := x + a 5 goto label0 6
114
Control-flow graph Node are statements or labels
Special nodes for entry/exit A edge from node v to node w means that after executing the statement of v control passes to w Conditions represented by splits and join node Loops create cycles Can be generated from abstract syntax tree in linear time Automatically taken care of by the front-end Usage: store analysis results (assertions) in CFG nodes
115
Control-flow graph example
label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b 1 entry 2 3 4 1 label0: 5 6 2 if x z 7 8 label1: x := x + 1 7 3 a := b y := x + a 8 4 exit d := x + a 5 goto label0 6
116
Eliminating labels We can use edges to point to the nodes following labels and remove all label nodes (other than entry/exit)
117
Control-flow graph example
label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b 1 entry 2 3 4 1 label0: 5 6 2 if x z 7 8 label1: x := x + 1 7 3 a := b y := x + a 8 4 exit d := x + a 5 goto label0 6
118
Control-flow graph example
label0: if x z goto label1 x := x y := x + a d := x + a goto label0 label1: a := b 1 entry 2 3 4 5 6 2 if x z 7 8 x := x + 1 3 a := b y := x + a 8 4 exit d := x + a 5
119
Basic blocks A basic block is a chain of nodes with a single entry point and a single exit point Entry/exit nodes are separate blocks entry 2 if x z x := x + 1 3 a := b y := x + a 8 4 exit d := x + a 5
120
Blocked CFG Stores basic blocks in a single node
Extended blocks – maximal connected loop-free subgraphs entry 2 if x z x := x + 1 y := x + a d := x + a 3 4 a := b 5 8 exit
121
Collecting semantics
122
Why need another semantics?
Operational semantics explains how to compute output from a given input Useful for implementing an interpreter/compiler Less useful for reasoning about safety properties Not suitable for analysis purposes – does not explicitly show how assertions in different program points influence each other Need a more explicit semantics Over a control flow graph
123
Control-flow graph example
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 entry 2 3 label0: 4 1 5 2 if x > 0 x := x - 1 3 label1: goto label0: 5 4 exit
124
Trimmed CFG label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 3 entry 4 5 2 if x > 0 exit x := x - 1 3
125
Collecting semantics example: input 1
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 … 3 [x3] [x2] [x1] entry 4 5 [x0] [x1] 2 if x > 0 [x0] exit [x1] x := x - 1 3
126
Collecting semantics example: input 2
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 … 3 [x3] [x2] [x1] entry 4 5 [x2] [x0] [x1] 2 if x > 0 [x0] exit [x1] x := x - 1 3 [x2]
127
Collecting semantics example: input 3
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 … 3 [x3] [x2] [x1] entry 4 5 [x3] [x2] [x0] [x1] 2 if x > 0 [x0] exit [x1] x := x - 1 3 [x3] [x2]
128
ad infinitum – fixed point
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 … 3 [x3] [x2] [x1] entry 4 5 … [x3] [x2] [x0] [x1] 2 if x > 0 [x0] exit [x1] x := x - 1 … 3 [x-2] [x-1] [x3] [x2] …
129
Predicates at fixed point
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 3 {true} entry 4 5 {?} 2 if x > 0 {?} exit {?} x := x - 1 3
130
Predicates at fixed point
label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: 1 2 3 {true} entry 4 5 {true} 2 if x > 0 {x0} exit {x>0} x := x - 1 3 {x0}
131
Collecting semantics Accumulates for each control-flow node the (possibly infinite) sets of states that can reach there by executing the program from some given set of input states Not computable in general A reference point for static analysis (An abstraction of the trace semantics) We will define it formally
132
Collecting semantics in equational form
133
Math reference: function lifting
Let f : X Y be a function The lifted function f’ : 2X 2Y is defined as f’(XS) = { f(x) | x XS } We will sometimes use the same symbol for both functions when it is clear from the context which one is used
134
Equational definition example
A vector of variables R[0, 1, 2, 3, 4] R[0] = {xZ} // established input R[1] = R[0] R[4] R[2] = assume x>0 R[1] R[3] = assume (x>0) R[1] R[4] = x:=x-1 R[2] A (recursive) system of equations Semantic function for x:=x-1 lifted to sets of states entry R[0] R[1] if x > 0 R[3] R[2] R[4] exit x := x-1
135
General definition A vector of variables R[0, …, k] one per input/output of a node R[0] is for entry For node n with multiple predecessors add equation R[n] = {R[k] | k is a predecessor of n} For an atomic operation node R[m] S R[n] add equation R[n] = S R[m] Transform if b then S1 else S2 to (assume b; S1) or (assume b; S2) entry R[0] R[1] if x > 0 R[3] R[2] R[4] exit x := x-1
136
see you next time
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.