1 Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program Analysis.


1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis Chapter (modified)

Outline
- Mathematical Background
- Abstract Interpretation
- Type systems
- Conclusions

Mathematical Background
- Declaratively define
  - The result of the analysis
  - The exact solution
  - Allow comparison

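Posets
- A partial ordering is a binary relation $\sqsubseteq : L \times L \to \{\text{false}, \text{true}\}$
  - For all $l \in L$: $l \sqsubseteq l$ (reflexive)
  - For all $l_1, l_2, l_3 \in L$: $l_1 \sqsubseteq l_2 \land l_2 \sqsubseteq l_3 \Rightarrow l_1 \sqsubseteq l_3$ (transitive)
  - For all $l_1, l_2 \in L$: $l_1 \sqsubseteq l_2 \land l_2 \sqsubseteq l_1 \Rightarrow l_1 = l_2$ (anti-symmetric)
- Denoted by $(L, \sqsubseteq)$
- In program analysis
  - $l_1 \sqsubseteq l_2$ means that $l_1$ is more precise than $l_2$, i.e., $l_1$ represents fewer concrete states than $l_2$
- Examples
  - Total orders $(\mathbb{N}, \le)$
  - Powersets $(P(S), \subseteq)$
  - Powersets $(P(S), \supseteq)$
- More notations
  - $l_1 \sqsupseteq l_2 \iff l_2 \sqsubseteq l_1$
  - $l_1 \sqsubset l_2 \iff l_1 \sqsubseteq l_2 \land l_1 \ne l_2$
  - $l_1 \sqsupset l_2 \iff l_2 \sqsubset l_1$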
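
The three poset laws can be checked mechanically on a finite carrier. The following is a minimal sketch, not part of the original slides; the helper names `is_partial_order` and `powerset` are illustrative assumptions.

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check that leq is reflexive, transitive and anti-symmetric on elements."""
    elems = list(elements)
    reflexive = all(leq(a, a) for a in elems)
    transitive = all(
        leq(a, c)
        for a, b, c in product(elems, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    antisymmetric = all(
        a == b
        for a, b in product(elems, repeat=2)
        if leq(a, b) and leq(b, a)
    )
    return reflexive and transitive and antisymmetric

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(x for i, x in enumerate(items) if (mask >> i) & 1)
            for mask in range(1 << len(items))]

subsets = powerset({"a", "b", "c"})
subset_ok = is_partial_order(subsets, lambda x, y: x <= y)    # (P(S), subset)
superset_ok = is_partial_order(subsets, lambda x, y: x >= y)  # (P(S), superset)
strict_ok = is_partial_order(range(5), lambda x, y: x < y)    # strict order: not reflexive
print(subset_ok, superset_ok, strict_ok)
```

Both powerset orders pass the check, while the strict order on numbers fails reflexivity.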

Upper and Lower Bounds
- Consider a poset $(L, \sqsubseteq)$
- A subset $L' \subseteq L$ has a lower bound $l \in L$ if $l \sqsubseteq l'$ for all $l' \in L'$
- A subset $L' \subseteq L$ has an upper bound $u \in L$ if $l' \sqsubseteq u$ for all $l' \in L'$
- A greatest lower bound of a subset $L' \subseteq L$ is a lower bound $l_0 \in L$ such that $l \sqsubseteq l_0$ for any lower bound $l$ of $L'$
- A least upper bound of a subset $L' \subseteq L$ is an upper bound $u_0 \in L$ such that $u_0 \sqsubseteq u$ for any upper bound $u$ of $L'$
- For every subset $L' \subseteq L$:
  - The greatest lower bound of $L'$ is unique if it exists; it is written $\sqcap L'$ (meet), or $a \sqcap b$ for two elements
  - The least upper bound of $L'$ is unique if it exists; it is written $\sqcup L'$ (join), or $a \sqcup b$ for two elements

Complete Lattices
- A poset $(L, \sqsubseteq)$ is a complete lattice if every subset has a least upper bound and a greatest lower bound
- $L = (L, \sqsubseteq) = (L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
  - $\bot = \sqcup \emptyset = \sqcap L$
  - $\top = \sqcup L = \sqcap \emptyset$
- Lemma: for every poset $(L, \sqsubseteq)$ the following conditions are equivalent
  - $L$ is a complete lattice
  - Every subset of $L$ has a least upper bound
  - Every subset of $L$ has a greatest lower bound

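Cartesian Products
- A complete lattice $(L_1, \sqsubseteq_1) = (L_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
- A complete lattice $(L_2, \sqsubseteq_2) = (L_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$
- Define a poset $L = (L_1 \times L_2, \sqsubseteq)$ where $(x_1, x_2) \sqsubseteq (y_1, y_2)$ if
  - $x_1 \sqsubseteq_1 y_1$ and
  - $x_2 \sqsubseteq_2 y_2$
- $L$ is a complete lattice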
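
The componentwise order is easy to try out on a product of two powerset lattices. This is a small sketch under assumptions: `product_leq` and `product_join` are hypothetical helper names, and the join is taken componentwise as the product construction suggests.

```python
def product_leq(leq1, leq2):
    """Order on pairs: compare first components with leq1, second with leq2."""
    return lambda x, y: leq1(x[0], y[0]) and leq2(x[1], y[1])

def product_join(join1, join2):
    """Join on pairs, computed component by component."""
    return lambda x, y: (join1(x[0], y[0]), join2(x[1], y[1]))

subset = lambda a, b: a <= b
union = lambda a, b: a | b

leq = product_leq(subset, subset)
join = product_join(union, union)

x = (frozenset({1}), frozenset({"p"}))
y = (frozenset({1, 2}), frozenset({"p", "q"}))
z = (frozenset({3}), frozenset())

print(leq(x, y))             # x is below y in both components
print(leq(x, z), leq(z, x))  # x and z are incomparable
print(join(x, z))            # an upper bound of both x and z
```

Note that two pairs can be incomparable even when each component lattice is small; the join still bounds both from above.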

Chains
- A subset $Y \subseteq L$ in a poset $(L, \sqsubseteq)$ is a chain if every two elements in $Y$ are ordered
  - For all $l_1, l_2 \in Y$: $l_1 \sqsubseteq l_2$ or $l_2 \sqsubseteq l_1$
- An ascending chain is a sequence of values
  - $l_1 \sqsubseteq l_2 \sqsubseteq l_3 \sqsubseteq \dots$
- A strictly ascending chain is a sequence of values
  - $l_1 \sqsubset l_2 \sqsubset l_3 \sqsubset \dots$
- A descending chain is a sequence of values
  - $l_1 \sqsupseteq l_2 \sqsupseteq l_3 \sqsupseteq \dots$
- A strictly descending chain is a sequence of values
  - $l_1 \sqsupset l_2 \sqsupset l_3 \sqsupset \dots$
- $L$ has finite height if every chain in $L$ is finite
- Lemma: a poset $(L, \sqsubseteq)$ has finite height if and only if every strictly descending chain and every strictly ascending chain is finite

Monotone Functions
- A poset $(L, \sqsubseteq)$
- A function $f: L \to L$ is monotone if for every $l_1, l_2 \in L$:
  - $l_1 \sqsubseteq l_2 \Rightarrow f(l_1) \sqsubseteq f(l_2)$

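Fixed Points
- A monotone function $f: L \to L$ where $(L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ is a complete lattice
- $Fix(f) = \{l \in L : f(l) = l\}$
- $Red(f) = \{l \in L : f(l) \sqsubseteq l\}$
- $Ext(f) = \{l \in L : l \sqsubseteq f(l)\}$
  - Recall monotonicity: $l_1 \sqsubseteq l_2 \Rightarrow f(l_1) \sqsubseteq f(l_2)$
- Tarski's Theorem (1955): if $f$ is monotone then
  - $lfp(f) = \sqcap Fix(f) = \sqcap Red(f) \in Fix(f)$
  - $gfp(f) = \sqcup Fix(f) = \sqcup Ext(f) \in Fix(f)$
- [Figure: the lattice between $\bot$ and $\top$, showing the iterates $f(\bot), f^2(\bot), \dots$ and $f(\top), f^2(\top), \dots$, the regions $Red(f)$, $Fix(f)$ and $Ext(f)$, and the points $lfp(f)$ and $gfp(f)$]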
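
On a finite lattice, Tarski's characterization can be compared against plain iteration from $\bot$ and from $\top$. The sketch below uses the powerset of a five-element set; the monotone function `f` is a made-up example and all names are assumptions, not taken from the slides.

```python
from functools import reduce

UNIVERSE = frozenset(range(5))

def f(xs):
    """A hypothetical monotone function on the powerset of UNIVERSE."""
    return frozenset({0}) | frozenset(n + 2 for n in xs if n in (0, 2)) | (xs & {1})

def iterate(func, start):
    """Apply func repeatedly until the value stops changing."""
    x = start
    while func(x) != x:
        x = func(x)
    return x

def powerset(s):
    items = list(s)
    return [frozenset(x for i, x in enumerate(items) if (mask >> i) & 1)
            for mask in range(1 << len(items))]

subsets = powerset(UNIVERSE)
red = [x for x in subsets if f(x) <= x]   # Red(f): f(l) below l
ext = [x for x in subsets if x <= f(x)]   # Ext(f): l below f(l)

lfp = iterate(f, frozenset())             # ascend from bottom
gfp = iterate(f, UNIVERSE)                # descend from top
meet_red = reduce(frozenset.intersection, red, UNIVERSE)
join_ext = reduce(frozenset.union, ext, frozenset())

print(sorted(lfp), sorted(meet_red))
print(sorted(gfp), sorted(join_ext))
```

For this particular `f` the least and greatest fixed points differ, because the element 1 sustains itself once present but is never introduced from below.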

Chaotic Iterations
- A lattice $L = (L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ with finite strictly increasing chains
- $L^n = L \times L \times \dots \times L$
- A monotone function $f: L^n \to L^n$
- Compute $lfp(f)$
- The simultaneous least fixed point of the system $\{x[i] = f_i(x) : 1 \le i \le n\}$
- Naive iteration:
      x := (⊥, ⊥, …, ⊥)
      while (f(x) ≠ x) do
          x := f(x)
- Worklist iteration:
      for i := 1 to n do
          x[i] := ⊥
      WL := {1, 2, …, n}
      while (WL ≠ ∅) do
          select and remove an element i ∈ WL
          new := f_i(x)
          if (new ≠ x[i]) then
              x[i] := new
              add all the indexes that directly depend on i to WL
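
Both iteration schemes can be written directly in Python. This is a minimal sketch under assumptions: components are indexed from 0 rather than 1, the dependency lists `deps` are supplied by the caller, and the three-equation system at the bottom is a made-up example.

```python
def naive_iteration(fs, bottom):
    """Recompute every component until the whole vector is stable."""
    x = [bottom] * len(fs)
    while True:
        new = [fi(x) for fi in fs]
        if new == x:
            return x
        x = new

def chaotic_iteration(fs, deps, bottom):
    """Worklist iteration: fs[i](x) computes component i from the vector x,
    and deps[i] lists the indexes whose equations read x[i]."""
    x = [bottom] * len(fs)
    worklist = set(range(len(fs)))
    while worklist:
        i = worklist.pop()            # select and remove an element
        new = fs[i](x)
        if new != x[i]:
            x[i] = new
            worklist |= set(deps[i])  # revisit everything that depends on i
    return x

# Hypothetical system: x0 = {a}, x1 = x0 U x2, x2 = x1 U {b}
fs = [
    lambda x: frozenset({"a"}),
    lambda x: x[0] | x[2],
    lambda x: x[1] | frozenset({"b"}),
]
deps = [[1], [2], [1]]

naive_result = naive_iteration(fs, frozenset())
worklist_result = chaotic_iteration(fs, deps, frozenset())
print(worklist_result)
```

The worklist version reaches the same least solution regardless of the order in which `set.pop` picks indexes, since every equation is monotone and each change re-queues its dependents.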

The Abstract Interpretation Technique
- The foundation of program analysis
- Goals
  - Establish soundness of (find faults in) a given program analysis algorithm
  - Design new program analysis algorithms
- The main ideas:
  - Relate each step in the algorithm to a step in a structural semantics
  - Establish global correctness using a general theorem
- Not limited to a particular form of analysis

Soundness in Reaching Definitions
- Every reachable definition is detected
- May include more definitions
  - Fewer constants may be identified
  - Not all the loop-invariant code will be identified
  - May warn against uninitialized variables that are in fact initialized
- At every elementary block $l$, $RD_{entry}(l)$ includes all the definitions that may reach $l$
- At every elementary block $l$, $RD_{entry}(l)$ "represents" all the possible concrete states arising when the structural operational semantics reaches $l$

Proof of Soundness
- Define an "appropriate" structural operational semantics
- Define a "collecting" structural operational semantics
- Establish a Galois connection between collecting states and reaching definitions
- (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
- (Global correctness) Conclude that the analysis is sound [Cousot & Cousot 1976]

Structural Operational Semantics to Justify Reaching Definitions
- Normal states $[Var_* \to \mathbb{Z}]$ are not enough
- Instrumented states $[Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*]$
- For an instrumented state $(s, def)$ and a variable $x$, $def(x)$ holds the last definition of $x$

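Instrumented Structural Semantics for While
- Axioms
  - [ass_sos] $\langle [x := a]^l, (s, d) \rangle \Rightarrow (s[x \mapsto A[\![a]\!]s],\ d[x \mapsto l])$
  - [skip_sos] $\langle [skip]^l, (s, d) \rangle \Rightarrow (s, d)$
- Rules
  - [comp1_sos] $\dfrac{\langle S_1, (s, d) \rangle \Rightarrow \langle S_1', (s', d') \rangle}{\langle S_1; S_2, (s, d) \rangle \Rightarrow \langle S_1'; S_2, (s', d') \rangle}$
  - [comp2_sos] $\dfrac{\langle S_1, (s, d) \rangle \Rightarrow (s', d')}{\langle S_1; S_2, (s, d) \rangle \Rightarrow \langle S_2, (s', d') \rangle}$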

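Instrumented Structural Semantics: if construct
- [if_tt_sos] $\langle \text{if } [b]^l \text{ then } S_1 \text{ else } S_2, (s, d) \rangle \Rightarrow \langle S_1, (s, d) \rangle$ if $B[\![b]\!]s = tt$
- [if_ff_sos] $\langle \text{if } [b]^l \text{ then } S_1 \text{ else } S_2, (s, d) \rangle \Rightarrow \langle S_2, (s, d) \rangle$ if $B[\![b]\!]s = ff$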

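Instrumented Structural Semantics: while construct
- [while_sos] $\langle \text{while } [b]^l \text{ do } S, (s, d) \rangle \Rightarrow \langle \text{if } [b]^l \text{ then } (S; \text{while } [b]^l \text{ do } S) \text{ else skip}, (s, d) \rangle$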

The Factorial Program
    [y := x]^1;
    [z := 1]^2;
    while [y > 1]^3 do (
        [z := z * y]^4;
        [y := y - 1]^5
    );
    [y := 0]^6
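
The instrumented semantics can be exercised on the factorial program with a small step-by-step interpreter. This is a sketch under assumptions: statements are encoded as tuples of my own design, expressions are Python lambdas over the store, the while rule unfolds the loop into a conditional, and the label given to the residual skip is an arbitrary choice.

```python
def step(stmt, s, d):
    """Perform one transition; return (remaining statement or None, s', d')."""
    kind = stmt[0]
    if kind == "ass":                      # [ass_sos]: update store and last definition
        _, x, a, label = stmt
        return None, {**s, x: a(s)}, {**d, x: label}
    if kind == "skip":                     # [skip_sos]
        return None, s, d
    if kind == "seq":                      # [comp1_sos] / [comp2_sos]
        _, s1, s2 = stmt
        rest, s_new, d_new = step(s1, s, d)
        return (s2 if rest is None else ("seq", rest, s2)), s_new, d_new
    if kind == "if":                       # [if_tt_sos] / [if_ff_sos]
        _, b, _label, s1, s2 = stmt
        return (s1 if b(s) else s2), s, d
    if kind == "while":                    # [while_sos]: unfold the loop once
        _, b, label, body = stmt
        return ("if", b, label, ("seq", body, stmt), ("skip", label)), s, d
    raise ValueError(f"unknown statement kind: {kind!r}")

def run(stmt, s):
    """Run to completion from store s, with every variable initially defined at '?'."""
    d = {x: "?" for x in s}
    while stmt is not None:
        stmt, s, d = step(stmt, s, d)
    return s, d

FACTORIAL = (
    "seq", ("ass", "y", lambda s: s["x"], 1),
    ("seq", ("ass", "z", lambda s: 1, 2),
     ("seq", ("while", lambda s: s["y"] > 1, 3,
              ("seq", ("ass", "z", lambda s: s["z"] * s["y"], 4),
                      ("ass", "y", lambda s: s["y"] - 1, 5))),
      ("ass", "y", lambda s: 0, 6))))

final_store, final_defs = run(FACTORIAL, {"x": 3, "y": 0, "z": 0})
print(final_store, final_defs)
```

For input x = 3 the loop body runs twice, so the last definition of z is at label 4; for x = 1 the loop is skipped and z keeps its definition from label 2.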

Code Instrumentation
- Alternative instrumentation
- Generate an equivalent program which maintains more information
- Use standard structural operational semantics

Other Consumers of Instrumentation
- Specialized interpreters
- Code instrumentation
  - Performance analysis (qpt)
    - Count the number of executions of basic blocks or the number of calls to a function
  - Profiling tools
    - Find "hot" paths (paths that are executed often) by remembering which edge in the control flow graph was executed
  - Cleanness tools (Purify, Insure)
    - Identify uninitialized objects

Collecting (Instrumented) Semantics
- The input state is not known at compile time
- "Collect" all the (instrumented) states for all possible inputs to the program
- No loss of precision

Flow Information for While
- Associate labels with program statements describing when statements begin and end
- $init : Stm \to Lab_*$
  - $init([x := a]^l) = l$
  - $init([skip]^l) = l$
  - $init(S_1; S_2) = init(S_1)$
  - $init(\text{if } [b]^l \text{ then } S_1 \text{ else } S_2) = l$
  - $init(\text{while } [b]^l \text{ do } S) = l$
- $final : Stm \to P(Lab_*)$
  - $final([x := a]^l) = \{l\}$
  - $final([skip]^l) = \{l\}$
  - $final(S_1; S_2) = final(S_2)$
  - $final(\text{if } [b]^l \text{ then } S_1 \text{ else } S_2) = final(S_1) \cup final(S_2)$
  - $final(\text{while } [b]^l \text{ do } S) = \{l\}$
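
The two label functions translate almost line by line into recursive functions over a statement tree. The tuple encoding below is an assumption made for this sketch (the label sits at index 1 of every labelled construct); it is not notation from the slides.

```python
# Hypothetical encoding:
#   ("ass", l, x)  ("skip", l)  ("seq", S1, S2)  ("if", l, S1, S2)  ("while", l, S)

def init(stmt):
    """Label of the first elementary block executed by stmt."""
    if stmt[0] == "seq":
        return init(stmt[1])
    return stmt[1]            # ass, skip, if and while all start at their own label

def final(stmt):
    """Set of labels of the elementary blocks at which stmt may finish."""
    kind = stmt[0]
    if kind == "seq":
        return final(stmt[2])
    if kind == "if":
        return final(stmt[2]) | final(stmt[3])
    return {stmt[1]}          # ass, skip; a while loop always exits at its test

LOOP = ("while", 3, ("seq", ("ass", 4, "z"), ("ass", 5, "y")))
FACTORIAL = ("seq", ("ass", 1, "y"),
             ("seq", ("ass", 2, "z"),
              ("seq", LOOP, ("ass", 6, "y"))))
BRANCH = ("if", 1, ("skip", 2), ("skip", 3))

print(init(FACTORIAL), final(FACTORIAL))
print(init(LOOP), final(LOOP))
print(init(BRANCH), final(BRANCH))
```

Only the conditional yields more than one final label, because either branch may be the last thing executed.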

Collecting (Instrumented) Semantics (Cont.)
- The input state is not known at compile time
- "Collect" all the (instrumented) states for all possible inputs to the program
- Define $d_? : Var_* \to Lab_*$ by $d_?(x) = {?}$
- $CS_{entry}(l) = \{(s', d') \mid \exists s_0 : \langle P, (s_0, d_?) \rangle \Rightarrow^* \langle S', (s', d') \rangle,\ init(S') = l\}$
- Soundness w.r.t. operational semantics: for all $(s', d') \in CS_{entry}(l)$ and for every variable $x$, $(x, d'(x)) \in RD_{entry}(l)$
- Optimality w.r.t. operational semantics

The Factorial Program
    [y := x]^1;
    [z := 1]^2;
    while [y > 1]^3 do (
        [z := z * y]^4;
        [y := y - 1]^5
    );
    [y := 0]^6

An "Iterative" Definition
- Generate a system of monotonic equations
- The least solution is well-defined
- The least solution is the collecting interpretation

Equations Generated for Collecting Interpretation
- Equations for elementary statements
  - $[skip]^l$: $CS_{exit}(l) = CS_{entry}(l)$
  - $[b]^l$: $CS_{exit}(l) = CS_{entry}(l)$
  - $[x := a]^l$: $CS_{exit}(l) = \{(s[x \mapsto A[\![a]\!]s],\ d[x \mapsto l]) \mid (s, d) \in CS_{entry}(l)\}$
- Equations for control flow constructs
  - $CS_{entry}(l) = \bigcup \{CS_{exit}(l') \mid l' \text{ immediately precedes } l \text{ in the control flow graph}\}$
- An equation for the entry
  - $CS_{entry}(1) = \{(s_0, d_?) \mid s_0 \in [Var_* \to \mathbb{Z}]\}$

The Least Solution
- 12 sets defined by the equations: $CS_{entry}(1), \dots, CS_{exit}(6)$
- Can be written in vectorial form
- The least solution $lfp(F_{cs})$ is well-defined
- Every component is minimal
- Since $F_{cs}$ is monotonic, such a solution always exists
- $CS_{entry}(l) = \{(s', d') \mid \exists s_0 : \langle P, (s_0, d_?) \rangle \Rightarrow^* \langle S', (s', d') \rangle,\ init(S') = l\}$
- Simplifies the soundness criteria

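Abstract (Conservative) Interpretation
- [Figure: a statement s maps a set of states to a set of states under the operational semantics, and an abstract representation to an abstract representation under the abstract semantics; abstraction maps sets of states to abstract representations, and concretization maps abstract representations back to sets of states]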

The Abstraction Function
- Map collecting states into reaching definitions
- The abstraction of an individual state
  - $\beta : [Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*] \to P(Var_* \times Lab_*)$
  - $\beta(s, d) = \{(x, d(x)) \mid x \in Var_*\}$
- The abstraction of a set of states
  - $\alpha : P([Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*]) \to P(Var_* \times Lab_*)$
  - $\alpha(CS) = \bigcup_{(s, d) \in CS} \beta(s, d) = \{(x, d(x)) \mid (s, d) \in CS,\ x \in Var_*\}$
- Soundness: $\alpha(CS_{entry}(l)) \subseteq RD_{entry}(l)$
- Optimality

The Concretization Function
- Map reaching definitions into collecting states
- The formal meaning of reaching definitions
- The concretization
  - $\gamma : P(Var_* \times Lab_*) \to P([Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*])$
  - $\gamma(RD) = \{(s, d) \mid \forall x \in Var_* : (x, d(x)) \in RD\} = \{(s, d) \mid \beta(s, d) \subseteq RD\}$
- Soundness: $CS_{entry}(l) \subseteq \gamma(RD_{entry}(l))$
- Optimality

Galois Connections
- The pair of functions $(\alpha, \gamma)$ forms a Galois connection if, for all $CS \in P([Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*])$ and all $RD \in P(Var_* \times Lab_*)$:
  - $\alpha(CS) \subseteq RD$ iff $CS \subseteq \gamma(RD)$
- Alternatively, for monotone $\alpha$ and $\gamma$, for all such $CS$ and $RD$:
  - $\alpha(\gamma(RD)) \subseteq RD$ and $CS \subseteq \gamma(\alpha(CS))$
- $\alpha$ and $\gamma$ uniquely determine each other
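
The Galois condition can be checked exhaustively once the concrete domain is cut down to a finite sample. In the sketch below, the variables, labels and the two sample stores are arbitrary assumptions; the real concrete domain ranges over all integer stores and cannot be enumerated.

```python
from itertools import combinations, product

VARS = ("x", "y")
LABELS = ("?", 1)
STORES = ((0, 0), (1, 0))                        # finite sample of stores (s(x), s(y))
DEFS = tuple(product(LABELS, repeat=len(VARS)))  # d as the tuple (d(x), d(y))
STATES = tuple(product(STORES, DEFS))            # instrumented states (s, d)
PAIRS = tuple(product(VARS, LABELS))             # candidate reaching definitions

def alpha(cs):
    """Abstraction: collect (x, d(x)) over every state, forgetting the store."""
    return frozenset((x, d[i]) for (_s, d) in cs for i, x in enumerate(VARS))

def gamma(rd):
    """Concretization: the states whose every last definition is allowed by rd."""
    return frozenset((s, d) for (s, d) in STATES
                     if all((x, d[i]) in rd for i, x in enumerate(VARS)))

def subsets(items):
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))

galois = all((alpha(cs) <= rd) == (cs <= gamma(rd))
             for cs in subsets(STATES) for rd in subsets(PAIRS))
reductive = all(alpha(gamma(rd)) <= rd for rd in subsets(PAIRS))
extensive = all(cs <= gamma(alpha(cs)) for cs in subsets(STATES))
print(galois, reductive, extensive)

only_x = frozenset({("x", "?")})
print(gamma(only_x))   # empty: no state can omit a definition for y
```

The last line shows that $\alpha(\gamma(RD))$ can be strictly smaller than $RD$: a set that mentions only x concretizes to no states at all.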

Local Concrete Semantics
- For every atomic statement $S$
  - $[\![S]\!] : [Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*] \to [Var_* \to \mathbb{Z}] \times [Var_* \to Lab_*]$
  - $[\![\,[x := a]^l\,]\!](s, d) = (s[x \mapsto A[\![a]\!]s],\ d[x \mapsto l])$
  - $[\![\,[skip]^l\,]\!](s, d) = (s, d)$
  - $[\![\,[b]^l\,]\!](s, d) = (s, d)$

Local Abstract Semantics
- For every atomic statement $S$
  - $[\![S]\!]^\# : P(Var_* \times Lab_*) \to P(Var_* \times Lab_*)$
  - $[\![\,[x := a]^l\,]\!]^\#(RD) = (RD - \{(x, l') \mid l' \in Lab_*\}) \cup \{(x, l)\}$
  - $[\![\,[skip]^l\,]\!]^\#(RD) = RD$
  - $[\![\,[b]^l\,]\!]^\#(RD) = RD$
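
These abstract transfer functions are enough to run reaching definitions on the factorial program. The sketch below hard-codes that program's blocks and control flow edges (my own encoding, not from the slides) and uses simple round-robin iteration rather than a worklist.

```python
# Variable assigned at each label of the factorial program (None for the test [y>1]^3).
BLOCKS = {1: "y", 2: "z", 3: None, 4: "z", 5: "y", 6: "y"}
EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (3, 6)]
VARS = ("x", "y", "z")

def transfer(label, rd):
    """Abstract semantics of one elementary block."""
    x = BLOCKS[label]
    if x is None:                       # tests (and skip) leave RD unchanged
        return rd
    # kill every earlier definition of x, then generate (x, label)
    return frozenset((v, l) for (v, l) in rd if v != x) | {(x, label)}

def reaching_definitions():
    entry = {l: frozenset() for l in BLOCKS}
    exit_ = {l: frozenset() for l in BLOCKS}
    initial = frozenset((v, "?") for v in VARS)
    changed = True
    while changed:                      # iterate until nothing changes
        changed = False
        for l in BLOCKS:
            preds = [exit_[p] for (p, q) in EDGES if q == l]
            new_entry = initial if l == 1 else frozenset().union(*preds)
            new_exit = transfer(l, new_entry)
            if (new_entry, new_exit) != (entry[l], exit_[l]):
                entry[l], exit_[l] = new_entry, new_exit
                changed = True
    return entry, exit_

rd_entry, rd_exit = reaching_definitions()
for l in sorted(BLOCKS):
    print(l, sorted(rd_entry[l], key=str), sorted(rd_exit[l], key=str))
```

At the loop test (label 3) both definitions of y and both definitions of z arrive, one pair from before the loop and one from the loop body.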

Local Soundness
- For every atomic statement $S$ show one of the following
  - $\alpha(\{[\![S]\!](s, d) \mid (s, d) \in CS\}) \subseteq [\![S]\!]^\#(\alpha(CS))$
  - $\{[\![S]\!](s, d) \mid (s, d) \in \gamma(RD)\} \subseteq \gamma([\![S]\!]^\#(RD))$
  - $\alpha(\{[\![S]\!](s, d) \mid (s, d) \in \gamma(RD)\}) \subseteq [\![S]\!]^\#(RD)$
- The above condition implies global soundness [Cousot & Cousot 1976]
  - $\alpha(CS_{entry}(l)) \subseteq RD_{entry}(l)$
  - $CS_{entry}(l) \subseteq \gamma(RD_{entry}(l))$

Proof of Soundness (Summary)
- Define an "appropriate" structural operational semantics
- Define a "collecting" structural operational semantics
- Establish a Galois connection between collecting states and reaching definitions
- (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
- (Global correctness) Conclude that the analysis is sound

Induced Analysis (Relatively Optimal)
- It is sometimes possible to show that a given analysis is not only sound but also optimal w.r.t. the chosen abstraction (though not necessarily optimal in absolute terms)
- Define $[\![S]\!]^\#(RD) = \alpha(\{[\![S]\!](s, d) \mid (s, d) \in \gamma(RD)\})$
- But this $[\![S]\!]^\#$ may not be computable
- Derive (at compiler-generation time) an alternative form for $[\![S]\!]^\#$
- A useful measure to decide if the abstraction must lead to overly imprecise results

Type and Effect Systems
- The type of a program expression at a given program point provides a conservative estimate of its value in all the execution paths
- A type system provides syntax-directed rules for annotating expressions with types
  - Simple type inference algorithms are linear
  - But in Ada, ML, ABC…
- But types can also include implementation information such as reaching definitions

Annotated Type Base for Reaching Definitions
- $S : RD_1 \to RD_2$: if $S$ is executed when the reaching definitions are $RD_1$, it produces the reaching definitions $RD_2$
- Similar to the constraint-based approach

Annotated Type Base for Reaching Definitions
- Axioms
  - [ass] $[x := a]^{l'} : RD \to (RD - \{(x, l) \mid l \in Lab_*\}) \cup \{(x, l')\}$
  - [skip] $[skip]^l : RD \to RD$
- Rules
  - [seq] $\dfrac{S_1 : RD_1 \to RD_2 \quad S_2 : RD_2 \to RD_3}{S_1; S_2 : RD_1 \to RD_3}$
  - [if] $\dfrac{S_1 : RD_1 \to RD_2 \quad S_2 : RD_1 \to RD_2}{\text{if } [b]^l \text{ then } S_1 \text{ else } S_2 : RD_1 \to RD_2}$

Annotated Type Base for While: while construct
- [wh] $\dfrac{S : RD \to RD}{\text{while } [b]^l \text{ do } S : RD \to RD}$

Annotated Type Base for While: subsumption rule
- [sub] $\dfrac{S : RD_2 \to RD_3}{S : RD_1 \to RD_4}$ if $RD_1 \subseteq RD_2$ and $RD_3 \subseteq RD_4$
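
One way to use the annotated type rules algorithmically is to compute, for a statement S and a given RD_1, some RD_2 for which S : RD_1 → RD_2 is derivable. The sketch below is one possible inference strategy, not something stated in the slides: the branches of a conditional are weakened to their union by subsumption, and a loop invariant is found by growing RD until the body maps it into itself. The tuple encoding of statements is an assumption.

```python
# Hypothetical encoding:
#   ("ass", l, x)  ("skip", l)  ("seq", S1, S2)  ("if", l, S1, S2)  ("while", l, S)

def infer(stmt, rd):
    """Return some RD2 such that stmt : rd -> RD2 is derivable."""
    kind = stmt[0]
    if kind == "ass":                                    # [ass]
        _, label, x = stmt
        return frozenset((v, l) for (v, l) in rd if v != x) | {(x, label)}
    if kind == "skip":                                   # [skip]
        return rd
    if kind == "seq":                                    # [seq]
        return infer(stmt[2], infer(stmt[1], rd))
    if kind == "if":                                     # [if], after [sub] on each branch
        return infer(stmt[2], rd) | infer(stmt[3], rd)
    if kind == "while":                                  # [wh], with [sub] on both sides
        invariant = rd
        while True:
            out = infer(stmt[2], invariant)
            if out <= invariant:                         # body : invariant -> invariant
                return invariant
            invariant = invariant | out                  # weaken the precondition
    raise ValueError(f"unknown statement kind: {kind!r}")

FACTORIAL = ("seq", ("ass", 1, "y"),
             ("seq", ("ass", 2, "z"),
              ("seq", ("while", 3, ("seq", ("ass", 4, "z"), ("ass", 5, "y"))),
               ("ass", 6, "y"))))

start = frozenset({("x", "?"), ("y", "?"), ("z", "?")})
result = infer(FACTORIAL, start)
print(sorted(result, key=str))
```

The loop in the while case terminates because the invariant only grows and there are finitely many (variable, label) pairs.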

Not Covered
- Effect systems
- Transformations

Conclusions
- Three similar techniques
  - Dataflow analysis
  - Constraint-based approach (a generalization)
  - Type and effect system (directly deals with the syntax)
- Abstract interpretation can be used to show soundness of these methods
- But it is more convenient in the dataflow setting
- We are ready for more sophisticated analyses