An Overview on Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program.

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Semantics Static semantics Dynamic semantics attribute grammars
Introduction to Memory Management. 2 General Structure of Run-Time Memory.
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
3-Valued Logic Analyzer (TVP) Tal Lev-Ami and Mooly Sagiv.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 1: Introduction Roman Manevich Ben-Gurion University.
Optimization Compiler Baojian Hua
Introduction to Advanced Topics Chapter 1 Mooly Sagiv Schrierber
ISBN Chapter 3 Describing Syntax and Semantics.
1 Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications.
Establishing Local Temporal Heap Safety Properties with Applications to Compile-Time Memory Management Ran Shaham Eran Yahav Elliot Kolodner Mooly Sagiv.
Program analysis Mooly Sagiv html://
Control Flow Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
An Overview on Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
An Overview on Static Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of.
Program analysis Mooly Sagiv html://
1 Control Flow Analysis Mooly Sagiv Tel Aviv University Textbook Chapter 3
Programming Language Semantics Mooly SagivEran Yahav Schrirber 317Open space html://
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Overview of program analysis Mooly Sagiv html://
Semantics with Applications Mooly Sagiv Schrirber html:// Textbooks:Winskel The.
Detecting Memory Errors using Compile Time Techniques Nurit Dor Mooly Sagiv Tel-Aviv University.
1 Program Analysis Systematic Domain Design Mooly Sagiv Tel Aviv University Textbook: Principles.
1 Languages and Compilers (SProg og Oversættere) Lecture 15 (1) Compiler Optimizations Bent Thomsen Department of Computer Science Aalborg University With.
Describing Syntax and Semantics
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Overview of program analysis Mooly Sagiv html://
Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)
1 Tentative Schedule u Today: Theory of abstract interpretation u May 5 Procedures u May 15, Orna Grumberg u May 12 Yom Hatzamaut u May.
Introduction to Program Analysis
An Overview on Static Program Analysis Mooly Sagiv.
CUTE: A Concolic Unit Testing Engine for C Technical Report Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Chapter 0.2 – Pointers and Memory. Type Specifiers  const  may be initialised but not used in any subsequent assignment  common and useful  volatile.
Computer Science and Software Engineering University of Wisconsin - Platteville 2. Pointer Yan Shi CS/SE2630 Lecture Notes.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
Constants Numeric Constants Integer Constants Floating Point Constants Character Constants Expressions Arithmetic Operators Assignment Operators Relational.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
Program Analysis and Verification Spring 2013 Program Analysis and Verification Lecture 1: Introduction Roman Manevich Ben-Gurion University.
Page 1 5/2/2007  Kestrel Technology LLC A Tutorial on Abstract Interpretation as the Theoretical Foundation of CodeHawk  Arnaud Venet Kestrel Technology.
Introduction to Program Analysis CS 8803 FPL Aug 20, 2013.
Chapter 3 Part II Describing Syntax and Semantics.
Semantics In Text: Chapter 3.
Chapter 3 Syntax, Errors, and Debugging Fundamentals of Java.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
An Undergraduate Course on Software Bug Detection Tools and Techniques Eric Larson Seattle University March 3, 2006.
Program Analysis Mooly Sagiv
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
1 Languages and Compilers (SProg og Oversættere) Compiler Optimizations Bent Thomsen Department of Computer Science Aalborg University With acknowledgement.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Introduction to Software Analysis CS Why Take This Course? Learn methods to improve software quality – reliability, security, performance, etc.
Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications Chapter.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
1 Numeric Abstract Domains Mooly Sagiv Tel Aviv University Adapted from Antoine Mine.
CSC3315 (Spring 2009)1 CSC 3315 Languages & Compilers Hamid Harroud School of Science and Engineering, Akhawayn University
Operational Semantics Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
Putting Static Analysis to Work for Verification A Case Study Tal Lev-Ami Thomas Reps Mooly Sagiv Reinhard Wilhelm.
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
A Static Analyzer for Large Safety-­Critical Software Presented by Dario Bösch, ETH Zürich Research Topics in Software Engineering Dario.
Spring 2017 Program Analysis and Verification
Functional Programming
Symbolic Implementation of the Best Transformer
Languages and Compilers (SProg og Oversættere) Compiler Optimizations
CUTE: A Concolic Unit Testing Engine for C
Ras Bodik WF 11-12:30 slides adapted from Mooly Sagiv
Presentation transcript:

An Overview on Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis F. Nielson, H. Nielson, C.L. Hankin

Prerequisites u Compiler construction course

Course Requirements u Course Notes 15% u Assignments 35% u Exam 50%

Class Notes u Prepare a document with (word,latex) –Original material covered in class –Explanations –Questions and answers –Extra examples –Self contained u Send class notes by Sunday night to u Incorporate changes u Available next class

Subjects u What is dynamic analysis u What is static analysis u Usage in compilers u Other clients u Why is it called ``abstract interpretation''? u Undecidability u Handling Undecidability u Soundness of abstract interpretation u Relation to program verification u Origins u Some program analysis tools –SLAM –ASTREE –TVLA u Tentative schedule

Dynamic Program Analysis u Automatically infer properties of the program while it is being executed u Examples –Dynamic Array bound checking »Purify »Valgrind –Memory leaks –Likely invariants »Daikon

Static Analysis u Automatic inference of static properties which hold on every execution leading to a program location

Example Static Analysis Problem u Find variables with constant value at a given program location u Example program int p(int x){ return x *x ; void main()} { int z; if (getc()) z = p(6) + 8; else z = p(-7) -5; printf (z); } 44

Recursive Program int x void p(a) { read (c); if c > 0 { a = a -2; p(a); a = a + 2; } x = -2 * a + 5; print (x); } void main { p(7); print(x); }

y :=z+4 Iterative Approximation [x  ?, y  ?, z  ? ] [x  ?, y  ?, z  3 ] [x  1, y  7, z  3 ] [x  ?, y  7, z  3 ] [x  ?, y  ?, z  3 ] z :=3 z <=0 z >0 x==1 x!=1 y :=7 assert y==7 [x  1, y  7, z  3 ] [x  ?, y  7, z  3 ]

List reverse(Element  head) { List rev, n; rev = NULL; while (head != NULL) { n = head  next; head  next = rev; head = n; rev = head; } return rev; } Memory Leakage potential leakage of address pointed to by head

Memory Leakage Element  reverse(Element  head) { Element  rev,  n; rev = NULL; while (head != NULL) { n = head  next; head  next = rev; rev = head; head = n; } return rev; }  No memory leaks

A Simple Example void foo(char *s ) { while ( *s != ‘ ‘ ) s++; *s = 0; } Potential buffer overrun: offset(s)  alloc(base(s))

A Simple Example void foo(char string(s) { while ( *s != ‘ ‘&& *s != 0) s++; *s = 0; }  No buffer overruns

Buffer Overrun Exploits int check_authentication(char *password) { int auth_flag = 0; char password_buffer[16]; strcpy(password_buffer, password); if(strcmp(password_buffer, "brillig") == 0) auth_flag = 1; if(strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1; return auth_flag; } int main(int argc, char *argv[]) { if(check_authentication(argv[1])) { printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); printf(" Access Granted.\n"); printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");} else printf("\nAccess Denied.\n"); } (source: “hacking – the art of exploitation, 2 nd Ed”)

Example Static Analysis Problem u Find variables which are live at a given program location u Used before set on some execution paths from the current program point

A Simple Example ab c a :=0 b :=a+1 c :=c+b a :=b+2 c < N c >=N c c b, c a, c, a

Compiler Scheme String Scanner Parser Semantic Analysis Code Generator Static analysis Transformations Tokens AST LIR source-program tokens AST IR IR +information

Other Example Program Analyses u Reaching definitions u Expressions that are ``available'' u Dead code u Pointer variables never point into the same location u Points in the program in which it is safe to free an object u An invocation of virtual method whose address is unique u Statements that can be executed in parallel u An access to a variable which must be in cache u Integer intervals u The termination problem

The Program Termination Problem u Determine if the program terminates on all possible inputs

Program Termination Simple Examples z := 3; while z > 0 do { if (x == 1) z := z +3; else z := z + 1; while z > 0 do { if (x == 1) z := z -1; else z := z -2;

Program Termination Complicated Example while (x !=1) do { if (x %2) == 0 { x := x / 2; } else { x := x * 3 + 1; } }

Summary Program Termination u Very hard in theory u Most programs terminate for simple reasons u But termination may involve proving intricate program invariants u Tools exist –MSR Terminator us/um/cambridge/projects/terminator/ –ARMC

The Need for Static Analysis u Compilers –Advanced computer architectures –High level programming languages (functional, OO, garbage collected, concurrent) u Software Productivity Tools –Compile time debugging »Stronger type Checking for C »Array bound violations »Identify dangling pointers »Generate test cases »Generate certification proofs u Program Understanding

Challenges in Static Analysis u Non-trivial u Correctness u Precision u Efficiency of the analysis u Scaling

C Compilers u The language was designed to reduce the need for optimizations and static analysis u The programmer has control over performance (order of evaluation, storage, registers) u C compilers nowadays spend most of the compilation time in static analysis u Sometimes C compilers have to work harder!

Software Quality Tools u Detecting hazards (lint) –Uninitialized variables a = malloc() ; b = a; cfree (a); c = malloc (); if (b == c) printf(“unexpected equality”); u References outside array bounds u Memory leaks (occurs even in Java!)

Foundation of Static Analysis u Static analysis can be viewed as interpreting the program over an “abstract domain” u Execute the program over larger set of execution paths u Guarantee sound results –Every identified constant is indeed a constant –But not every constant is identified as such

Example Abstract Interpretation Casting Out Nines u Check soundness of arithmetic using 9 values 0, 1, 2, 3, 4, 5, 6, 7, 8 u Whenever an intermediate result exceeds 8, replace by the sum of its digits (recursively) u Report an error if the values do not match u Example query “123 * = $?” –Left 123* = 6 * =6 + 7 = 4 –Right 3 –Report an error u Soundness (10a + b) mod 9 = (a + b) mod 9 (a+b) mod 9 = (a mod 9) + (b mod 9) (a*b) mod 9 = (a mod 9) * (b mod 9)

Even/Odd Abstract Interpretation u Determine if an integer variable is even or odd at a given program point

Example Program while (x !=1) do { if (x %2) == 0 { x := x / 2; } else { x := x * 3 + 1; assert (x %2 ==0); } } /* x=? */ /* x=E */ /* x=O */ /* x=? */ /* x=E */ /* x=O*/

  Abstract Abstract Interpretation Concrete  Sets of stores Descriptors of sets of stores 

Odd/Even Abstract Interpretation  {-2, 1, 5} {0,2} {2}{0}  EO ? All concrete states   {x: x  Even}

Odd/Even Abstract Interpretation  {-2, 1, 5} {0,2} {2}{0}  EO ?    All concrete states   {x: x  Even}

Odd/Even Abstract Interpretation  {-2, 1, 5} {0,2} {2}{0}  EO ?  All concrete states {x: x  Even}  

Example Program while (x !=1) do { if (x %2) == 0 { x := x / 2; } else { x := x * 3 + 1; assert (x %2 ==0); } } /* x=O *//* x=E */

(Best) Abstract Transformer Concrete Representation ConcretizationAbstraction Operational Semantics St Abstract Representation Abstract Semantics St

Concrete and Abstract Interpretation

Runtime vs. Static Testing RuntimeAbstract EffectivenessMissed ErrorsFalse alarms Locate rare errors CostProportional to program’s execution Proportional to program’s size No need to efficiently handle rare cases Can handle limited classes of programs and still be useful

Abstract (Conservative) interpretation abstract representation Set of states concretization Abstract semantics statement s abstract representation abstraction Operational semantics statement s Set of states

Example rule of signs u Safely identify the sign of variables at every program location u Abstract representation {P, N, ?} u Abstract (conservative) semantics of *

Abstract (conservative) interpretation {…,,…} concretization Abstract semantics x := x*#y abstraction Operational semantics x := x*y {…, …}

Example rule of signs (cont) u Safely identify the sign of variables at every program location u Abstract representation {P, N, ?} u  (C) = if all elements in C are positive then return P else if all elements in C are negative then return N else return ? u  (a) = if (a==P) then return{0, 1, 2, … } else if (a==N) return {-1, -2, -3, …, } else return Z

Example Constant Propagation u Abstract representation set of integer values and and extra value “?” denoting variables not known to be constants u Conservative interpretation of +

Example Constant Propagation(Cont) u Conservative interpretation of *

Example Program x = 5; y = 7; if (getc()) y = x + 2; z = x +y;

Example Program (2) if (getc()) x= 3 ; y = 2; else x =2; y = 3; z = x +y;

Undecidability Issues u It is undecidable if a program point is reachable in some execution u Some static analysis problems are undecidable even if the program conditions are ignored

The Constant Propagation Example while (getc()) { if (getc()) x_1 = x_1 + 1; if (getc()) x_2 = x_2 + 1;... if (getc()) x_n = x_n + 1; } y = truncate (1/ (1 + p 2 (x_1, x_2,..., x_n)) /* Is y=0 here? */

Coping with undecidabilty u Loop free programs u Simple static properties u Interactive solutions u Conservative estimations –Every enabled transformation cannot change the meaning of the code but some transformations are no enabled –Non optimal code –Every potential error is caught but some “false alarms” may be issued

Analogies with Numerical Analysis u Approximate the exact semantics u More precision can be obtained at greater u computational costs

Violation of soundness u Loop invariant code motion u Dead code elimination u Overflow ((x+y)+z) != (x + (y+z)) u Quality checking tools may decide to ignore certain kinds of errors

Abstract interpretation cannot be always homomorphic (rules of signs) abstraction abstraction Operational semantics x := x+y Abstract semantics x := x+#y

Local Soundness of Abstract Interpretation abstraction Operational semantics statement Abstract semantics statement# 

Optimality Criteria u Precise (with respect to a subset of the programs) u Precise under the assumption that all paths are executable (statically exact) u Relatively optimal with respect to the chosen abstract domain u Good enough

Relation to Program Verification u Fully automatic u Applicable to a programming language u Can be very imprecise u May yield false alarms u Requires specification and loop invariants u Program specific u Relative complete u Provide counter examples u Provide useful documentation u Can be mechanized using theorem provers Program AnalysisProgram Verification

Origins of Abstract Interpretation u [Naur 1965] The Gier Algol compiler “A process which combines the operators and operands of the source text in the manner in which an actual evaluation would have to do it, but which operates on descriptions of the operands, not their value” u [Reynolds 1969] Interesting analysis which includes infinite domains (context free grammars) u [Syntzoff 1972] Well foudedness of programs and termination u [Cousot and Cousot 1976,77,79] The general theory u [Kamm and Ullman, Kildall 1977] Algorithmic foundations u [Tarjan 1981] Reductions to semi-ring problems u [Sharir and Pnueli 1981] Foundation of the interprocedural case u [Allen, Kennedy, Cock, Jones, Muchnick and Scwartz]

Driver’s Source Code in C Precise API Usage Rules (SLIC) Defects 100% path coverage Rules Static Driver Verifier Environment model Static Driver Verifier

Bill Gates’ Quote "Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we’re building tools that can do actual proof about the software and how it works in order to guarantee the reliability." Bill Gates, April 18, Keynote address at WinHec 2002 Keynote addressWinHec 2002

SLAM Dataflow C program SLIC rules Boolean Program C2BP BEBOP Abstract Trace NEWTON Concrete Program Trace 

The ASTRÉE Static Analyzer Patrick Cousot Radhia Cousot Jérôme Feret Laurent Mauborgne Antoine Miné Xavier Rival ENS France|

Goals u Prove absence of errors in safety critical C code u ASTRÉE was able to prove completely automatically the absence of any RTE in the primary flight control software of the Airbus A340 fly-by-wire system –a program of 132,000 lines of C analyzed

A Simple Example /* boolean.c */ typedef enum {FALSE = 0, TRUE = 1} BOOLEAN; void main () { unsigned int x, y; BOOLEAN b; while (1) { … b = (x == 0); /* b == 0  x > 0 */ if (!b) /* x > 0 */ { y = 1 / x; }; … }; }

Another Example /* float-error.c */ void main () { float x, y, z, r; x = e+38; y = x + 1.0e21; z = x - 1.0e21; r = y - z; printf("%f\n", r); } % gcc float-error.c %./a.out

Another Example void main () { float x, y, z, r; scanf(“%f”, &x); if ((x 1.0e38)) return; /* -1.0e38  x  1.0e38 */ y = x + 1.0e21; z = x - 1.0e21; r = y - z; /* r == 2.0 e21 */ printf("%f\n", r); }

TVLA: A system for generating shape analyzers Tal Lev-Ami Alexey Loginov Roman Manevich

Example: Concrete Interpretation x t n n t x n x t n x t n n x t n n x t t x n t t n t x t x t x empty return x x = t t =malloc(..); t  next=x; x = NULL T F

Example: Shape Analysis t x n x t n x t n n x t t x n t t n t x t x t x empty x t n n x t n n n x t n t n x n x t n n return x x = t t =malloc(..); t  next=x; x = NULL T F

TVLA: A parametric system for Shape Analysis A research tool Parameters –Concrete Semantics States Interpretation Rules –Abstraction –Transformers Iteratively compute a probably sound solution Specialize the shape analysis algorithm to class of programs

Partial Correctness List InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r  n; pl = NULL; while (l != r) { if (l  data > r  data) { pr  n = rn; r  n = l; if (pl = = NULL) x = r; else pl  n = r; r = pr; break; } pl = l; l = l  n; } pr = r; r = rn; } assert sorted[x,n]; //assert perm[x, n, x, n]; return x; } typedef struct list_cell { int data; struct list_cell *n; } *List;