Program Analysis and Verification, Spring 2015
Lecture 8: Static Analysis II
Roman Manevich, Ben-Gurion University


Tentative syllabus
– Semantics: Natural Semantics, Structural Semantics, Axiomatic Verification
– Static Analysis: Automating Hoare Logic, Control Flow Graphs, Equation Systems, Collecting Semantics
– Abstract Interpretation fundamentals: Lattices, Fixed-Points, Chaotic Iteration, Galois Connections, Domain constructors, Widening/Narrowing
– Analysis Techniques: Numerical Domains, Alias analysis, Interprocedural Analysis, Shape Analysis, CEGAR
– Crafting your own: Soot, From proofs to abstractions, Systematically developing transformers

Previously
Static Analysis by example
– Simple Available Expressions analysis
– Abstract transformer for assignments
– Three-address code
– Processing serial composition
– Processing conditions
– Processing loops

Agenda
Another static analysis example: Constant Propagation
Basic concepts in static analysis
– Control flow graphs
– Equation systems
– Collecting semantics
– (Trace semantics)

Constant propagation

Second static analysis example
Optimization: constant folding
– Example: x:=7; y:=x*9 transformed to: x:=7; y:=7*9 and then to: x:=7; y:=63
Analysis: constant propagation (CP)
– Infers facts of the form x = c
Given { x = c } y := aexpr, constant folding rewrites the statement to y := eval(aexpr[c/x]), which simplifies constant expressions
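The two steps of the example (substitute the known constant, then evaluate) can be sketched in a few lines of Python. This is only an illustration: the tuple-based expression encoding and the names `OPS`, `substitute`, and `fold` are assumptions made here, not part of the lecture.

```python
import operator

# Hypothetical mini-AST: an expression is an int, a variable name (str),
# or a tuple (op, left, right) with op in OPS.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def substitute(expr, facts):
    """Replace each variable that has a known constant in `facts`, i.e. aexpr[c/x]."""
    if isinstance(expr, str):
        return facts.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, facts), substitute(right, facts))
    return expr

def fold(expr):
    """Evaluate every sub-expression whose operands are both integer constants."""
    if isinstance(expr, tuple):
        op, left, right = expr
        left, right = fold(left), fold(right)
        if isinstance(left, int) and isinstance(right, int):
            return OPS[op](left, right)
        return (op, left, right)
    return expr

facts = {"x": 7}                                # the CP fact { x = 7 }
propagated = substitute(("*", "x", 9), facts)   # y := 7*9
folded = fold(propagated)                       # y := 63
```

An expression that mentions a variable without a known constant, such as `("+", "y", 1)`, passes through both functions unchanged.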

Plan
Define domain: set of allowed assertions
Handle assignments
Handle composition
Handle conditions
Handle loops

Constant propagation domain

CP semantic domain ?

Define CP-factoids: Φ = { x = c | x ∈ Var, c ∈ Z }
– How many factoids are there?
Define predicates as Π = 2^Φ
– How many predicates are there?
– Do all predicates make sense? (x=5) ∧ (x=7)
Treat conjunctive formulas as sets of factoids: {x=5, y=7} ~ (x=5) ∧ (y=7)

Handling assignments

CP abstract transformer
Goal: define a function F_CP[x:=aexpr] : Π → Π such that if F_CP[x:=aexpr] P = P' then sp(x:=aexpr, P) ⇒ P'
?

CP abstract transformer
Goal: define a function F_CP[x:=aexpr] : Π → Π such that if F_CP[x:=aexpr] P = P' then sp(x:=aexpr, P) ⇒ P'
– [kill] { x=c } x:=aexpr { }
– [gen-1] { } x:=c { x=c }
– [gen-2] { y=c1, z=c2 } x:=y op z { x=c } where c = c1 op c2
– [preserve] { y=c } x:=aexpr { y=c }

Gen-kill formulation of transformers
Suited for analysis propagating sets of factoids
– Available expressions,
– Constant propagation, etc.
For each statement, define a set of killed factoids and a set of generated factoids
F[S] P = (P \ kill(S)) ∪ gen(S)
F_CP[x:=aexpr] P = (P \ {x=c}) if aexpr is not a constant
F_CP[x:=k] P = (P \ {x=c}) ∪ {x=k}
Used in dataflow analysis, a special case of abstract interpretation
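A minimal Python sketch of the gen-kill transformer for constant propagation follows. It assumes a predicate is a set of (variable, constant) pairs and that the right-hand side is either an int constant or an opaque expression string. The names `kill` and `transfer_assign` are illustrative, not from the lecture.

```python
def kill(var, facts):
    """All factoids var=c currently in the set, for any constant c."""
    return {(v, c) for (v, c) in facts if v == var}

def transfer_assign(var, rhs, facts):
    """F_CP[var := rhs] P, computed as (P minus kill) union gen.

    Assumption: rhs is an int when it is a constant; anything else is
    treated as a non-constant expression that generates no factoid.
    """
    gen = {(var, rhs)} if isinstance(rhs, int) else set()
    return (facts - kill(var, facts)) | gen

P = frozenset({("x", 5), ("y", 7)})
after_const = transfer_assign("x", 3, P)        # kills x=5, generates x=3
after_expr = transfer_assign("y", "x + a", P)   # kills y=7, generates nothing
```

Like the two gen-kill equations, this sketch does not evaluate `x := y op z` when both operands are known constants. Handling that case would need the [gen-2] style of rule.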

Handling composition

Does this still work?
Annotate(P, S1; S2) =
  let Annotate(P, S1) be {P} A1 {Q1}
  let Annotate(Q1, S2) be {Q1} A2 {Q2}
  return {P} A1; {Q1} A2 {Q2}

Handling conditions

Handling conditional expressions
We want to soundly approximate D ∧ bexpr and D ∧ ¬bexpr in Π
Define ρ(bexpr) = if bexpr is CP-factoid {bexpr} else {}
Define F[assume bexpr](D) = D ∪ ρ(bexpr)
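The assume transformer can be sketched as below. The encoding is an assumption made for illustration: a condition is a tuple `(lhs, op, rhs)`, and only conditions of the shape `(variable, "==", int)` count as CP-factoids. The helper name `factoid_of` is hypothetical.

```python
def factoid_of(bexpr):
    """Return {bexpr} as a factoid set if bexpr has the form var == constant, else {}."""
    is_factoid = (
        isinstance(bexpr, tuple) and len(bexpr) == 3
        and isinstance(bexpr[0], str) and bexpr[1] == "=="
        and isinstance(bexpr[2], int)
    )
    return {(bexpr[0], bexpr[2])} if is_factoid else set()

def assume(bexpr, facts):
    """F[assume bexpr](D): add the condition when it is expressible in the domain."""
    return set(facts) | factoid_of(bexpr)

D = {("y", 7)}
on_equality = assume(("x", "==", 5), D)   # the condition is a factoid, so it is added
on_other = assume(("x", "<", 5), D)       # not expressible, D is returned unchanged
```

Dropping an inexpressible condition loses precision but stays sound, because the result still holds in every state that satisfies both D and the condition.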

Does this still work?
  let Pt = F[assume bexpr] P
  let Pf = F[assume ¬bexpr] P
  let Annotate(Pt, S1) be {Pt} A1 {Q1}
  let Annotate(Pf, S2) be {Pf} A2 {Q2}
  return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
How do we define join for CP?

Join example
{x=5, y=7} ⊔ {x=3, y=7, z=9} =
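One natural answer, assumed here rather than taken from the slide, is to define the join as set intersection. A set of factoids is a conjunction, so keeping only the factoids that hold on both branches gives a predicate implied by either branch. A minimal sketch:

```python
def join(p1, p2):
    """Join of two CP predicates, assumed to be the factoids common to both."""
    return set(p1) & set(p2)

left = {("x", 5), ("y", 7)}
right = {("x", 3), ("y", 7), ("z", 9)}
result = join(left, right)   # only y=7 holds on both branches
```

Under this assumption the example evaluates to {y=7}: x has different values on the two sides, and z is known on only one of them.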

Handling loops

Does this still work?
What about correctness?
What about termination?
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

Does this still work?
What about correctness?
– If the loop terminates, then is N a loop invariant?
What about termination?
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

A termination principle
g : X → X is a function
How can we determine whether the sequence x0, x1 = g(x0), …, x(k+1) = g(xk), … stabilizes?
Technique:
1. Find a ranking function rank : X → N (that is, show that rank(x) ≥ 0 for all x)
2. Show that if x ≠ g(x) then rank(g(x)) < rank(x)
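The principle can be checked at run time with a small Python sketch: iterate g and assert at every non-stable step that the rank is a natural number that strictly decreases. The function name `iterate_until_stable` and the example g are illustrative assumptions.

```python
def iterate_until_stable(g, x0, rank):
    """Apply g repeatedly until g(x) == x, checking the ranking-function conditions."""
    x, steps = x0, 0
    while True:
        nxt = g(x)
        if nxt == x:
            return x, steps
        # Condition 2: whenever x != g(x), the rank must strictly decrease.
        assert 0 <= rank(nxt) < rank(x), "rank must strictly decrease"
        x, steps = nxt, steps + 1

# Example: g intersects a set of factoids with a fixed set, and rank is set size.
x0 = frozenset({("x", 5), ("y", 7)})
g = lambda s: s & frozenset({("y", 7)})
stable, steps = iterate_until_stable(g, x0, rank=len)
```

Because the rank is a natural number that drops on every non-stable step, the number of steps is bounded by rank(x0).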

Rank function for available expressions
rank(P) = ?

Rank function for available expressions
rank(P) = |P| (number of factoids)
Prove that either Nc = Nc ⊔ N or rank(Nc ⊔ N) <? rank(Nc)
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

Rank function for constant propagation
rank(P) = ?
Prove that either Nc = Nc ⊔ N or rank(Nc) >? rank(Nc ⊔ N)
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

Rank function for constant propagation
rank(P) = |P| (number of factoids)
Prove that either Nc = Nc ⊔ N' or rank(Nc) >? rank(Nc ⊔ N')
Annotate(P, while bexpr do S) =
  N' := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N'}
    Nc := Nc ⊔ N'
  until N' = Nc
  return {P} INV = {N'} while bexpr do {Pt} Abody {F[assume ¬bexpr](N')}
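The loop iteration and the rank argument can be tried out with the following Python sketch. It makes several simplifying assumptions that are not in the slides: the loop body is a straight-line list of assignments, the loop condition is not a CP-factoid (so assume adds nothing and is omitted), the join is set intersection, and the loop stops when joining no longer changes the candidate invariant, which is a variant of the `until` test in the pseudocode. The names `transfer` and `analyze_loop` are hypothetical.

```python
def transfer(stmt, facts):
    """Apply one assignment (var, rhs); rhs is an int constant or an opaque expression."""
    var, rhs = stmt
    kept = {(v, c) for (v, c) in facts if v != var}
    return kept | ({(var, rhs)} if isinstance(rhs, int) else set())

def analyze_loop(pre, body):
    """Shrink the candidate invariant until the loop body preserves it."""
    inv, iterations = set(pre), 0
    while True:
        post = inv
        for stmt in body:
            post = transfer(stmt, post)
        new_inv = inv & post              # join, assumed to be intersection
        iterations += 1
        if new_inv == inv:                # joining changed nothing: stable
            return inv, iterations
        assert len(new_inv) < len(inv)    # rank(P) = |P| strictly decreases
        inv = new_inv

pre = {("x", 0), ("y", 7), ("z", 1)}
body = [("x", "x + 1"), ("z", 1)]         # x := x + 1; z := 1
invariant, iterations = analyze_loop(pre, body)
```

For this input the first pass drops x=0, and the second pass confirms that {y=7, z=1} is preserved by the body. The intersection can only remove factoids, so the number of passes is at most |P| + 1.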

Generalizing
[Figure: Available Expressions and Constant Propagation generalize to Abstract Interpretation. Image by NMZ (Photoshop) [CC0], via Wikimedia Commons]

Towards a recipe for static analysis
Two static analyses
– Available Expressions (extended with equalities)
– Constant Propagation
Semantic domain: a family of formulas
– Join operator approximates pairs of formulas
Abstract transformers for basic statements
– Assignments
– assume statements
Initial precondition

Control flow graphs

A technical issue
Unrolling loops is quite inconvenient and inefficient (but we can avoid it, as we just saw)
How do we handle more complex control-flow constructs, e.g., goto, break, exceptions…?
– The problem: non-inductive control flow constructs
Solution: model control flow by labels and goto statements
Would like a dedicated data structure to explicitly encode control flow in support of the analysis
Solution: control-flow graphs (CFGs)

Modeling control flow with labels
Structured program:
while (x  z) do
  x := x + 1
  y := x + a
  d := x + a
a := b
With labels and gotos:
label0:
  if x  z goto label1
  x := x + 1
  y := x + a
  d := x + a
  goto label0
label1:
  a := b

Control-flow graph example
[Figure: the labeled program, annotated with line numbers, next to its CFG; nodes: label0:, if x  z, x := x + 1, y := x + a, d := x + a, goto label0, label1:, a := b]

Control-flow graph example
[Figure: the labeled program next to its CFG, now with entry and exit nodes; nodes: entry, label0:, if x  z, x := x + 1, y := x + a, d := x + a, goto label0, label1:, a := b, exit]

Control-flow graph
Nodes are statements or labels
Special nodes for entry/exit
An edge from node v to node w means that after executing the statement of v control passes to w
– Conditions are represented by split and join nodes
– Loops create cycles
Can be generated from the abstract syntax tree in linear time
– Automatically taken care of by the front-end
Usage: store analysis results (assertions) in CFG nodes
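As an illustration of the data structure, the sketch below builds a CFG from a list of label/goto lines in one pass over the instructions. The textual instruction format, the function name `build_cfg`, and the choice to resolve labels directly to the instruction that follows them are assumptions made for this example.

```python
def build_cfg(lines):
    """Return (instrs, edges); nodes are 'entry', instruction indices, and 'exit'."""
    labels = {}    # label name -> index of the instruction that follows the label
    instrs = []
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)
        else:
            instrs.append(line)

    def target(i):
        # Running past the last instruction means reaching the exit node.
        return i if i < len(instrs) else "exit"

    edges = [("entry", target(0))]
    for i, ins in enumerate(instrs):
        words = ins.split()
        if words[0] == "goto":
            edges.append((i, target(labels[words[1]])))
        elif words[0] == "if":
            edges.append((i, target(labels[words[-1]])))   # branch taken
            edges.append((i, target(i + 1)))               # fall through
        else:
            edges.append((i, target(i + 1)))
    return instrs, edges

program = ["label0:", "if x <= 0 goto label1", "x := x - 1", "goto label0", "label1:"]
instrs, edges = build_cfg(program)
```

The conditional node has two outgoing edges (the split), and the edge from `goto label0` back to node 0 is the cycle created by the loop.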

Eliminating labels
We can use edges to point to the nodes following labels and remove all label nodes (other than entry/exit)

Control-flow graph example
[Figure: the labeled program next to its CFG after label elimination; the remaining nodes are entry, if x  z, x := x + 1, y := x + a, d := x + a, a := b, and exit]

Basic blocks
A basic block is a chain of nodes with a single entry point and a single exit point
Entry/exit nodes are separate blocks
[Figure: CFG with nodes entry, if x  z, x := x + 1, y := x + a, d := x + a, a := b, and exit, grouped into basic blocks]

Blocked CFG
Stores each basic block in a single node
Extended blocks: maximal connected loop-free subgraphs
[Figure: blocked CFG with nodes entry, if x  z, the block x := x + 1; y := x + a; d := x + a, a := b, and exit]

Collecting semantics

Why do we need another semantics?
Operational semantics explains how to compute output from a given input
– Useful for implementing an interpreter/compiler
– Less useful for reasoning about safety properties
– Not suitable for analysis purposes: it does not explicitly show how assertions at different program points influence each other
Need a more explicit semantics
– Over a control flow graph

Control-flow graph example
label0:
  if x <= 0 goto label1
  x := x - 1
  goto label0
label1:
[Figure: CFG of this program with nodes entry, label0:, if x > 0, x := x - 1, goto label0, label1:, and exit]

Trimmed CFG
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
[Figure: trimmed CFG of this program with nodes entry, if x > 0, x := x - 1, and exit]

Collecting semantics example: input
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
Input states: [x↦1], [x↦2], [x↦3], …
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) with the states [x↦1] and [x↦0] propagated along its edges]

Collecting semantics example: input
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
Input states: [x↦1], [x↦2], [x↦3], …
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) with the states [x↦0], [x↦1], and [x↦2] propagated along its edges]

Collecting semantics example: input
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
Input states: [x↦1], [x↦2], [x↦3], …
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) with the states [x↦0], [x↦1], [x↦2], and [x↦3] propagated along its edges]

ad infinitum: fixed point
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) with the state sets [x↦1], [x↦2], [x↦3], … accumulated along its edges, together with [x↦0], [x↦-1], [x↦-2], …]

Predicates at fixed point
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) annotated with { true } at entry and {?} at the remaining program points]

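Predicates at fixed point
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
[Figure: trimmed CFG (entry, if x > 0, x := x - 1, exit) annotated with { true } at entry, { x>0 } before x := x - 1, { x≥0 } after x := x - 1, and { x≤0 } at exit]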

Collecting semantics
Accumulates for each control-flow node the (possibly infinite) set of states that can reach it by executing the program from some given set of input states
Not computable in general
A reference point for static analysis
(An abstraction of the trace semantics)
We will define it formally

Collecting semantics in equational form

Math reference: function lifting
Let f : X → Y be a function
The lifted function f' : 2^X → 2^Y is defined as f'(XS) = { f(x) | x ∈ XS }
We will sometimes use the same symbol for both functions when it is clear from the context which one is used
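Lifting translates directly into code. In this Python sketch the name `lift` is an illustrative choice; it takes a function on elements and returns the corresponding function on sets.

```python
def lift(f):
    """Return f', the function on sets with f'(XS) = { f(x) | x in XS }."""
    return lambda xs: {f(x) for x in xs}

decrement = lambda x: x - 1
lifted_decrement = lift(decrement)
image = lifted_decrement({1, 2, 3})     # the set {0, 1, 2}

# The result can be smaller than the input when f is not injective.
collapsed = lift(abs)({-1, 1})          # the set {1}
```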

Equational definition example
A vector of variables R[0, 1, 2, 3, 4]
R[0] = { x ∈ Z } // established input
R[1] = R[0] ∪ R[4]
R[2] = ⟦assume x>0⟧ R[1]
R[3] = ⟦assume ¬(x>0)⟧ R[1]
R[4] = ⟦x:=x-1⟧ R[2]
A (recursive) system of equations
⟦x:=x-1⟧ is the semantic function for x:=x-1 lifted to sets of states
[Figure: CFG with nodes entry, if x > 0, x := x-1, and exit; its edges are labeled R[0], R[1], R[2], R[3], R[4]]
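The collecting semantics is not computable in general, but the equation system can be solved by plain iteration when the input set is finite. The sketch below makes that assumption: it replaces the input set of all integers with a finite set and represents a state by the value of x. It starts from empty sets and re-evaluates the five equations until nothing changes. The name `solve` is illustrative.

```python
def solve(inputs):
    """Iterate the five equations from empty sets until a fixed point is reached.

    Assumption: `inputs` is a finite set of integers standing in for R[0];
    a state is represented by the value of x alone.
    """
    R = [set() for _ in range(5)]
    while True:
        new = [
            set(inputs),                       # R[0]: established input
            R[0] | R[4],                       # R[1] = R[0] union R[4]
            {x for x in R[1] if x > 0},        # R[2] = [[assume x>0]] R[1]
            {x for x in R[1] if not x > 0},    # R[3] = [[assume not(x>0)]] R[1]
            {x - 1 for x in R[2]},             # R[4] = [[x:=x-1]] R[2], lifted to sets
        ]
        if new == R:
            return R
        R = new

R = solve({1, 2, 3})
```

For the inputs {1, 2, 3} every run leaves the loop with x = 0, so R[3] is {0}, while R[1] collects every value x takes at the loop head.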

General definition
A vector of variables R[0, …, k], one per input/output of a node
– R[0] is for entry
For node n with multiple predecessors add equation R[n] = ∪ {R[k] | k is a predecessor of n}
For an atomic operation node R[m] S R[n] add equation R[n] = ⟦S⟧ R[m]
Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2)
[Figure: CFG with nodes entry, if x > 0, x := x-1, and exit; its edges are labeled R[0], R[1], R[2], R[3], R[4]]

Next lecture: abstract interpretation fundamentals