Program analysis Mooly Sagiv http://www.cs.tau.ac.il/~msagiv/courses/wcc08.html


Outline
What is (static) program analysis
Example
Undecidability
An Iterative Algorithm
Properties of the algorithm
The theory of Abstract Interpretation

Abstract Interpretation Static analysis
Automatically identify program properties
  – No user-provided loop invariants
Sound but incomplete methods
  – But can be rather precise
Non-standard interpretation of the program's operational semantics
Applications
  – Compiler optimization
  – Code quality tools
      Identify potential bugs
      Prove the absence of runtime errors
      Partial correctness

Control Flow Graph (CFG)
  z = 3
  while (x > 0) {
    if (x == 1) y = 7;
    else y = z + 4;
    assert y == 7
  }
[Figure: CFG with nodes z = 3; while (x > 0); if (x == 1); y = 7; y = z + 4; assert y == 7]

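Constant Propagation
[Figure: the CFG with nodes z = 3; while (x > 0); if (x == 1); y = 7; y = z + 4; assert y == 7, annotated with abstract states, listed here in extraction order]
  [x ↦ ?, y ↦ ?, z ↦ ?]
  [x ↦ ?, y ↦ ?, z ↦ 3]
  [x ↦ 1, y ↦ ?, z ↦ 3]
  [x ↦ 1, y ↦ 7, z ↦ 3]
  [x ↦ ?, y ↦ 7, z ↦ 3]
  [x ↦ ?, y ↦ ?, z ↦ 3]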

Memory Leakage
  List reverse(Element *head) {
    List rev, n;
    rev = NULL;
    while (head != NULL) {
      n = head->next;
      head->next = rev;
      head = n;
      rev = head;
    }
    return rev;
  }
Potential leakage of the address pointed to by head

Memory Leakage
  Element *reverse(Element *head) {
    Element *rev, *n;
    rev = NULL;
    while (head != NULL) {
      n = head->next;
      head->next = rev;
      rev = head;
      head = n;
    }
    return rev;
  }
No memory leaks

A Simple Example
  void foo(char *s) {
    while (*s != ' ')
      s++;
    *s = 0;
  }
Potential buffer overrun: offset(s) ≥ alloc(base(s))

A Simple Example
  void foo(char *s)  /* assuming string(s): s points to a null-terminated string */
  {
    while (*s != ' ' && *s != 0)
      s++;
    *s = 0;
  }
No buffer overruns

Example Static Analysis Problem
Find variables which are live at a given program location
  – A variable is live if it is used before it is set on some execution path from the current program point

A Simple Example
  /* c */   L0: a := 0
  /* ac */  L1: b := a + 1
  /* bc */      c := c + b
  /* bc */      a := b * 2
  /* ac */      if c < N goto L1
  /* c */       return c
[Figure: graph over the variables a, b, c]

Compiler Scheme
[Figure: compiler pipeline]
  source program (String) → Scanner → tokens → Parser → AST → Semantic Analysis → Code Generator → IR → Static analysis → IR + information → Transformations

Undecidability issues
It is impossible to compute exact static information
  – Finding whether a program point is reachable
  – Difficulty of interesting data properties

Undecidability
A variable is live at a given point in the program
  – if its current value is used after this point, prior to a definition, on some execution path
It is undecidable whether a variable is live at a given program location

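Proof Sketch
  Pr
  L: x := y
Is y live at L?
The value of y matters at L only if execution can reach L, that is, only if the arbitrary program Pr terminates, so an exact answer would decide the halting problem.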

Conservative (Sound)
The compiler need not generate the optimal code
  – Can use more registers ("spill code") than necessary
Find an upper approximation of the live variables
Err on the safe side
  – A superset of edges in the interference graph
Not too many superfluous live variables

Conservative (Sound) Software Quality Tools
Can never miss an error
But may produce false alarms
  – Warnings about non-existent errors

Data Flow Values
Order data flow values
  – a ⊑ b ⇔ a "is more precise than" b
  – In live variables a ⊑ b ⇔ a ⊆ b
  – In constant propagation a ⊑ b ⇔ a includes more constants than b
Compute the least solution
Merge control flow paths optimistically
  – a ⊔ b
  – In live variables a ⊔ b = a ∪ b

Transfer Functions
Program statements operate on data flow values conservatively

Transfer Functions (Constant Propagation)
Program statements operate on data flow values conservatively
If a = 3 and b = 7 before "z = a + b;"
  – then a = 3, b = 7, and z = 10 after
If a = ? and b = 7 before "z = a + b;"
  – then a = ?, b = 7, and z = ? after
For x = exp
  – CpOut = CpIn[x ↦ [[exp]](CpIn)]
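The rule CpOut = CpIn[x ↦ [[exp]](CpIn)] can be sketched in Python as follows. This is only an illustration: the tuple-based expression encoding, the names `eval_abstract` and `cp_transfer`, and the use of the string "?" for the unknown value are assumptions made here, not part of the slides.

```python
UNKNOWN = "?"  # assumed encoding of the "not a known constant" value

def eval_abstract(expr, state):
    """Evaluate expr over abstract values.

    expr is a hypothetical tiny AST: an int literal, a variable name (str),
    or a tuple ("+", left, right).
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return state.get(expr, UNKNOWN)
    op, left, right = expr
    lhs = eval_abstract(left, state)
    rhs = eval_abstract(right, state)
    if lhs == UNKNOWN or rhs == UNKNOWN:
        return UNKNOWN  # conservative: any unknown operand gives an unknown result
    if op == "+":
        return lhs + rhs
    raise ValueError(f"unsupported operator: {op}")

def cp_transfer(var, expr, cp_in):
    """CpOut = CpIn[var -> [[expr]](CpIn)]; cp_in itself is not modified."""
    cp_out = dict(cp_in)
    cp_out[var] = eval_abstract(expr, cp_in)
    return cp_out

print(cp_transfer("z", ("+", "a", "b"), {"a": 3, "b": 7}))
print(cp_transfer("z", ("+", "a", "b"), {"a": UNKNOWN, "b": 7}))
```

The two calls mirror the two cases on the slide: with a = 3 and b = 7 the result maps z to 10, and with a unknown it maps z to "?".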

Transfer Functions (Live Variables)
If a and c are potentially live after "a = b * 2"
  – then b and c are potentially live before
For "x = exp;"
  – LiveIn = (LiveOut – {x}) ∪ arg(exp)
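A minimal Python sketch of the live-variables rule for an assignment, assuming variables are represented as strings and the arguments of the right-hand side are passed in as a set. The function name `live_transfer` is made up for this illustration.

```python
def live_transfer(defined, used, live_out):
    """LiveIn = (LiveOut - {defined}) | used, for a statement `defined = exp`.

    `used` plays the role of arg(exp): the variables read by the right-hand side.
    """
    return (set(live_out) - {defined}) | set(used)

# "a = b * 2" defines a and uses b; a and c are live afterwards.
print(sorted(live_transfer("a", {"b"}, {"a", "c"})))
```

This reproduces the case on the slide: b and c are potentially live before the statement.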

Iterative computation of conservative static information
Construct a control flow graph (CFG)
Optimistically start with the best value at every node
"Interpret" every statement in a conservative way
Forward/Backward traversal of CFG
Stop when no changes occur

Pseudo Code (forward)
  forward(G(V, E): CFG, start: CFG node, initial: value) {
    // initialization
    value[start] := initial
    for each v ∈ V – {start} do value[v] := ⊥
    // iteration
    WL := V
    while WL != {} do
      select and remove a node v ∈ WL
      for each u ∈ V such that (v, u) ∈ E do
        value[u] := value[u] ⊔ f(v, u)(value[v])
        if value[u] was changed
          WL := WL ∪ {u}
  }
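The forward pseudo code can be turned into a small runnable Python sketch, instantiated here for constant propagation. The node numbering, the `ASSIGN` table, the extra node 6 for the assert, the treatment of conditions as identity transfer functions, and the omission of the loop-exit edge are all assumptions made for this illustration.

```python
UNKNOWN = "?"    # variable not known to be a constant
BOTTOM = None    # most optimistic value: node not reached yet

def join(s1, s2):
    """Least upper bound of two abstract states."""
    if s1 is BOTTOM:
        return s2
    if s2 is BOTTOM:
        return s1
    return {v: s1[v] if s1[v] == s2[v] else UNKNOWN for v in s1}

def forward(nodes, edges, start, initial, transfer):
    """Worklist iteration; value[v] is the state holding before node v."""
    value = {v: BOTTOM for v in nodes}
    value[start] = initial
    worklist = list(nodes)
    while worklist:
        v = worklist.pop(0)
        for src, u in edges:
            if src != v:
                continue
            new = join(value[u], transfer(v, value[v]))
            if new != value[u]:
                value[u] = new
                if u not in worklist:
                    worklist.append(u)
    return value

# Hypothetical encoding of the CFG: 1: z = 3, 2: while (x > 0), 3: if (x == 1),
# 4: y = 7, 5: y = z + 4, 6: assert y == 7. Conditions leave the state unchanged.
ASSIGN = {
    1: ("z", lambda s: 3),
    4: ("y", lambda s: 7),
    5: ("y", lambda s: UNKNOWN if s["z"] == UNKNOWN else s["z"] + 4),
}

def transfer(node, state):
    if state is BOTTOM or node not in ASSIGN:
        return state
    var, rhs = ASSIGN[node]
    return {**state, var: rhs(state)}

nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6), (6, 2)]
values = forward(nodes, edges, 1,
                 {"x": UNKNOWN, "y": UNKNOWN, "z": UNKNOWN}, transfer)
print(values[6])
```

Under these assumptions the state before the assert maps y to 7 on both branches, so the analysis can establish `assert y == 7` even though x remains unknown.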

Constant Propagation
  1: z = 3
  2: while (x > 0)
  3: if (x == 1)
  4: y = 7
  5: y = z + 4

  N   Val                               WL
      v[1] = [x ↦ ?, y ↦ ?, z ↦ ?]      {1, 2, 3, 4, 5}
  1   v[2] = [x ↦ ?, y ↦ ?, z ↦ 3]      {2, 3, 4, 5}
  2   v[3] = [x ↦ ?, y ↦ ?, z ↦ 3]      {3, 4, 5}
  3   v[4] = [x ↦ ?, y ↦ ?, z ↦ 3]      {4, 5}
      v[5] = [x ↦ ?, y ↦ ?, z ↦ 3]
  4                                     {5}
  5                                     {}
Only the values before CFG nodes are shown

Pseudo Code (backward)
  backward(G(V, E): CFG, exit: CFG node, initial: value) {
    // initialization
    value[exit] := initial
    for each v ∈ V – {exit} do value[v] := ⊥
    // iteration
    WL := V
    while WL != {} do
      select and remove a node v ∈ WL
      for each u ∈ V such that (u, v) ∈ E do
        value[u] := value[u] ⊔ f(v, u)(value[v])
        if value[u] was changed
          WL := WL ∪ {u}
  }

  /* c */   L0: a := 0
  /* ac */  L1: b := a + 1
  /* bc */      c := c + b
  /* bc */      a := b * 2
  /* ac */      if c < N goto L1
  /* c */       return c
[Figure: CFG with nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c]

Liveness iteration on the CFG above (animation frames; each frame lists the values shown in the figure next to the nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c, in extraction order)
  Frame 1: ∅ ∅ ∅ ∅ ∅ ∅
  Frame 2: ∅ {c} ∅ ∅ ∅ ∅
  Frame 3: ∅ {c} ∅ ∅ ∅
  Frame 4: ∅ {c} {c, b} ∅ ∅
  Frame 5: ∅ {c} {c, b} ∅
  Frame 6: ∅ {c} {c, b} {c, a} {c, b}
  Frame 7: ∅ {c, a} {c, b} {c, a} {c, b}
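The liveness computation on this example can be reproduced with a short Python sketch. It is a variation on the backward pseudo code rather than a literal transcription: it recomputes each node's live-in set from its successors. The node numbers, the `defs`/`uses` tables, and the decision to treat N as a constant rather than a variable are assumptions made for this illustration.

```python
def liveness(nodes, edges, defs, uses):
    """Backward worklist iteration; returns the live-in set of every node."""
    succs = {n: [v for (u, v) in edges if u == n] for n in nodes}
    preds = {n: [u for (u, v) in edges if v == n] for n in nodes}
    live_in = {n: set() for n in nodes}   # optimistic start: nothing is live
    worklist = list(nodes)
    while worklist:
        n = worklist.pop()
        live_out = set()
        for s in succs[n]:
            live_out |= live_in[s]        # merge over all successors
        new_in = (live_out - defs[n]) | uses[n]
        if new_in != live_in[n]:
            live_in[n] = new_in
            for p in preds[n]:            # predecessors must be revisited
                if p not in worklist:
                    worklist.append(p)
    return live_in

# 1: a := 0   2: b := a + 1   3: c := c + b
# 4: a := b * 2   5: if c < N goto L1   6: return c
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 2), (5, 6)]
defs = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}
uses = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"c"}, 6: {"c"}}

live_in = liveness(nodes, edges, defs, uses)
for n in nodes:
    print(n, sorted(live_in[n]))
```

The resulting live-in sets agree with the comments in the listing: c before L0, a and c before L1, b and c before the next two statements, a and c before the branch, and c before the return.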

Summary Iterative Procedure
Analyze one procedure at a time
  – More precise solutions exist
Construct a control flow graph for the procedure
Initialize the values at every node to the most optimistic value
Iterate until convergence

Abstract Interpretation
The mathematical foundations of program analysis
Established by Cousot and Cousot 1979
Relates static and runtime values

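Abstract (Conservative) interpretation
[Diagram: a set of states is mapped by abstraction to an abstract representation; the abstract semantics of statement s yields a new abstract representation, which is mapped by concretization back to a set of states; in parallel, the operational semantics of statement s maps the original set of states to a set of states]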

Example rule of signs
Safely identify the sign of variables at every program location
Abstract representation {P, N, ?}
Abstract (conservative) semantics of *

  *#   P   N   ?
  P    P   N   ?
  N    N   P   ?
  ?    ?   ?   ?

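Abstract (conservative) interpretation
[Diagram: a set of concrete states {…, …} is mapped by abstraction to an abstract value; the abstract semantics x := x *# y is applied and the result is mapped back by concretization; in parallel, the operational semantics x := x * y maps the original set to a set of states {…, …}]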

Example rule of signs
Safely identify the sign of variables at every program location
Abstract representation {P, N, ?}
α(C) = if all elements in C are positive then return P
       else if all elements in C are negative then return N
       else return ?
γ(a) = if (a == P) then return {1, 2, 3, …}
       else if (a == N) then return {-1, -2, -3, …}
       else return Z
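The rule of signs is small enough to write out in Python. The sketch below assumes that P means strictly positive and N strictly negative, so that 0 is only covered by "?". The names `alpha`, `in_gamma`, and `times` are chosen here for illustration, and `in_gamma` is a membership test because the concretizations themselves are infinite sets.

```python
P, N, TOP = "P", "N", "?"   # TOP stands for the "?" value of the slides

def alpha(concrete):
    """Abstraction of a non-empty set of integers."""
    if all(c > 0 for c in concrete):
        return P
    if all(c < 0 for c in concrete):
        return N
    return TOP

def in_gamma(c, a):
    """True if the integer c belongs to the concretization of a."""
    return (a == P and c > 0) or (a == N and c < 0) or a == TOP

def times(a, b):
    """Abstract multiplication *#, following the sign table."""
    if TOP in (a, b):
        return TOP
    return P if a == b else N

print(alpha({3, 8}), alpha({-1, -7}), alpha({-2, 5}))
print(times(P, N), times(N, N), times(P, TOP))
```

For instance, 3 * (-7) = -21 is in the concretization of `times(P, N)`, which is N.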

Example Constant Propagation
Abstract representation
  – set of integer values and an extra value "?" denoting variables not known to be constants
Conservative interpretation of +

Example Program
  x = 5;
  y = 7;
  if (getc())
    y = x + 2;
  z = x + y;

Example Program (2)
  if (getc()) {
    x = 3;
    y = 2;
  } else {
    x = 2;
    y = 3;
  }
  z = x + y;
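Both example programs can be worked through by hand with a few lines of Python. The sketch assumes a pointwise merge of the two branch states and a conservative `plus`; these helper names and the dictionary encoding of states are assumptions made for this illustration.

```python
UNKNOWN = "?"  # assumed encoding of "not known to be constant"

def plus(a, b):
    """Conservative interpretation of +."""
    return UNKNOWN if UNKNOWN in (a, b) else a + b

def join(s1, s2):
    """Merge two branch states variable by variable."""
    return {v: s1[v] if s1[v] == s2[v] else UNKNOWN for v in s1}

# Example Program: x = 5; y = 7; if (getc()) y = x + 2; z = x + y;
before = {"x": 5, "y": 7}
taken = dict(before, y=plus(before["x"], 2))   # branch executed
merged = join(taken, before)                   # branch skipped: state unchanged
z1 = plus(merged["x"], merged["y"])

# Example Program (2): the branches swap the constants 3 and 2.
merged2 = join({"x": 3, "y": 2}, {"x": 2, "y": 3})
z2 = plus(merged2["x"], merged2["y"])

print(z1, z2)
```

In the first program both branches agree on y = 7, so z is found to be the constant 12. In the second program z equals 5 on every execution, but x and y each become "?" at the merge, so the analysis reports z as "?". This is sound but imprecise.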

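Local Soundness of Abstract Interpretation
[Diagram: starting from an abstract value, applying concretization and then the operational semantics of the statement gives a set of states that is contained in the concretization of the result of the abstract semantics statement#]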

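Local Soundness of Abstract Interpretation
[Diagram: starting from a set of states, applying the operational semantics of the statement and then abstraction gives an abstract value that is ⊑ the result of applying abstraction and then the abstract semantics statement#]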
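Local soundness can be spot-checked by brute force on a finite sample. The Python sketch below does this for sign multiplication on single values: abstracting the concrete product must give something at or below the abstract product. It is a sampled test rather than a proof, and every name in it is an assumption made for this illustration. A second abstraction whose P also admits 0 is included only to show the check catching an unsound combination.

```python
P, N, TOP = "P", "N", "?"

def times(a, b):
    """Abstract multiplication from the sign table."""
    if TOP in (a, b):
        return TOP
    return P if a == b else N

def leq(a, b):
    """a is at least as precise as b in the sign domain."""
    return a == b or b == TOP

def alpha_strict(c):
    """P is strictly positive, N strictly negative, 0 maps to '?'."""
    return P if c > 0 else N if c < 0 else TOP

def alpha_with_zero(c):
    """Variant in which P also covers 0."""
    return P if c >= 0 else N

def violations(alpha, samples):
    """Pairs (x, y) where alpha(x * y) is not below alpha(x) *# alpha(y)."""
    return [(x, y) for x in samples for y in samples
            if not leq(alpha(x * y), times(alpha(x), alpha(y)))]

samples = range(-3, 4)
strict_violations = violations(alpha_strict, samples)
zero_violations = violations(alpha_with_zero, samples)
print(strict_violations)
print(zero_violations)
```

With the strict reading no sampled pair violates the condition. With the variant, pairs such as (0, -1) fail: the table gives P *# N = N, yet 0 * (-1) = 0 abstracts to P.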

Some Success Stories: Software Quality Tools
The PREfix tool identified interesting bugs in Windows
The Microsoft SLAM tool checks correctness of device drivers
  – Driver correctness rules
Polyspace checks ANSI C conformance
  – Flags potential runtime errors

Summary
Program analysis provides non-trivial insights on the runtime executions of the program
  – Degenerate case: types (flow insensitive)
Mathematically justified
  – Operational semantics
  – Abstract interpretation (lattice theory)
Employed in compilers
Will be employed in software quality tools