Program Analysis with Set Constraints
Ravi Chugh

Set-constraint based analysis

Another technique for computing information about program variables.

Phase 1: constraint generation
  - Create set variables corresponding to the program
  - Add inclusion constraints between these sets
  - Usually a local, syntax-directed process (ASTs vs. CFGs)

Phase 2: constraint resolution
  - Solve for the values of all set variables

Extends naturally to inter-procedural analysis.
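A minimal Python sketch of the two phases, assuming a toy program that consists only of assignments of the form dst = src. The statement encoding and the names generate and resolve are illustrative assumptions, not part of the slides.

```python
def generate(stmts):
    """Phase 1: one inclusion constraint (src, dst) per assignment dst = src."""
    return [(src, dst) for (dst, src) in stmts]

def resolve(constraints):
    """Phase 2: propagate constants along inclusions until nothing changes."""
    sol = {}
    for src, dst in constraints:
        sol.setdefault(dst, set())
        if isinstance(src, str):          # a str is a set variable
            sol.setdefault(src, set())
    changed = True
    while changed:
        changed = False
        for src, dst in constraints:
            # A variable contributes its current set; a constant contributes itself.
            vals = sol[src] if isinstance(src, str) else {src}
            if not vals <= sol[dst]:
                sol[dst] |= vals
                changed = True
    return sol

# Hypothetical program: a = 1; b = a; b = 2; c = b
program = [("a", 1), ("b", "a"), ("b", 2), ("c", "b")]
solution = resolve(generate(program))
print(solution)
```

Because the sketch ignores statement order, b and c both end up with the set {1, 2}: generation is local to each statement, and resolution is a global fixpoint.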

Constant propagation

    int abs(int i) { if (...) { return i; } else { return -i; } }
    int id(int j) { return j; }
    void main() {
      int a = 1, b = 2;
      int x = abs(a);
      int y = id(b);
      ...
      use x
      use y
      ...
    }

Want to determine whether x and y are constant values when they are used.
We will build a flow-insensitive analysis.

Set constraints

Terms
    t := c                (constant)
       | X                (set variable)
       | C(t_1,...,t_n)   (constructed term)

Constraints
    t_1 ⊆ t_2             (set inclusion)

Constructors
  - C(v_1,...,v_n) is an n-arg ctor C with variances v_i
  - v_i is either + (covariant) or - (contravariant)
  - Covariance corresponds to "forwards flow"
  - Contravariance corresponds to "backwards flow"

Additional constraints

Implicit constraints are added by the following rules:

1) Transitivity
     if t_1 ⊆ t_2 and t_2 ⊆ t_3 then t_1 ⊆ t_3

2) Variance through constructed terms
     if C(...,t_i,...) ⊆ C(...,u_i,...) then
       t_i ⊆ u_i for covariant positions of C
       u_i ⊆ t_i for contravariant positions of C

Constraint graphs

    1 ⊆ X
    X ⊆ Y
    Ctor(A,B,C) ⊆ Ctor(D,E,F)   where Ctor(+,-,+)

[Figure: constraint graph over the nodes 1, X, Y and the two Ctor terms with arguments A, B, C and D, E, F]
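The transitivity and variance rules can be sketched as a naive closure over a set of edges, applied here to the three constraints on this slide. The tuple encoding of terms and the function name close are assumptions made for illustration; a practical solver would use a worklist rather than this quadratic loop.

```python
# Encoding (an assumption of this sketch): a str is a set variable, an int is
# a constant, and a tuple (ctor_name, arg_1, ..., arg_n) is a constructed term.
VARIANCE = {"Ctor": "+-+"}

def close(constraints, variance):
    """Return every inclusion (lhs, rhs) implied by the two closure rules."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            # Variance rule: C(..t_i..) <= C(..u_i..) splits per argument.
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            # Transitivity rule: a <= b and b <= d give a <= d.
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

derived = close([
    (1, "X"),
    ("X", "Y"),
    (("Ctor", "A", "B", "C"), ("Ctor", "D", "E", "F")),
], VARIANCE)
print((1, "Y") in derived, ("A", "D") in derived, ("E", "B") in derived)
```

The covariant positions yield A ⊆ D and C ⊆ F, while the contravariant middle position yields E ⊆ B, the reverse direction.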

Function calls

Define ctor Fun(-,+) for one input/one output.

To encode a function def/call:

    int z = id(2);

    Fun(i,r) ⊆ id ⊆ Fun(2,z)

By contravariance, the actual 2 flows to i.
By covariance, the return value of id flows to z.

[Figure: constraint graph connecting Fun(i,r), id, and Fun(2,z)]
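A small Python sketch of this encoding, assuming the same tuple representation of terms (str for variables, int for constants, tuples for constructed terms) and a naive closure function named close. Both are illustrative choices, not the slides' implementation.

```python
VARIANCE = {"Fun": "-+"}  # one contravariant input, one covariant output

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

solved = close([
    (("Fun", "i", "r"), "id"),   # definition: Fun(i,r) <= id
    ("id", ("Fun", 2, "z")),     # call site:  id <= Fun(2,z)
], VARIANCE)

print((2, "i") in solved)    # the actual 2 flows to the formal i
print(("r", "z") in solved)  # the return value r flows to z
```

Transitivity first connects the two Fun terms through id; the variance rule then splits that single constraint into the argument flow and the return flow.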

    int abs(int i) { if (...) { return i; } else { return -i; } }
    int id(int j) { return j; }
    void main() {
      int a = 1, b = 2;
      int x = abs(a);
      int y = id(b);
      ...
      use x
      use y
      ...
    }

Constraints:

    Fun(i,r1) ⊆ abs
    i ⊆ r1
    ⊤ ⊆ r1
    Fun(j,r2) ⊆ id
    j ⊆ r2
    1 ⊆ a
    2 ⊆ b
    abs ⊆ Fun(a,x)
    id ⊆ Fun(b,y)

Solution:

    {1,⊤} ⊆ x
    {2} ⊆ y

[Figure: constraint graph over 1, 2, a, b, abs, id, x, y, ⊤ and the Fun terms Fun(i,r1), Fun(j,r2), Fun(a,x), Fun(b,y)]
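The nine constraints for this example can be fed to a small solver to reproduce the result for x and y. The sketch below is one possible Python rendering under assumed conventions: variables are strings, constants are ints, constructed terms are tuples, and TOP is a hypothetical marker standing for the non-constant value flowing out of "return -i".

```python
TOP = ("Top",)            # assumed marker for "not a constant"
VARIANCE = {"Fun": "-+"}

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    """Constants (or TOP) that the closed system forces into var."""
    return {a for a, b in edges
            if b == var and (isinstance(a, int) or a == TOP)}

solved = close([
    (("Fun", "i", "r1"), "abs"),  # int abs(int i)
    ("i", "r1"),                  # return i
    (TOP, "r1"),                  # return -i
    (("Fun", "j", "r2"), "id"),   # int id(int j)
    ("j", "r2"),                  # return j
    (1, "a"),                     # a = 1
    (2, "b"),                     # b = 2
    ("abs", ("Fun", "a", "x")),   # x = abs(a)
    ("id", ("Fun", "b", "y")),    # y = id(b)
], VARIANCE)

x_vals = constants(solved, "x")
y_vals = constants(solved, "y")
print(x_vals, y_vals)
```

Here x resolves to {1, TOP}, so it is not a known constant, while y resolves to {2} and can be treated as the constant 2.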

Pointers

Handle pointers with a Ref(-,+) constructor.
Two args correspond to set and get operations.

    int i = 1;
    int *p = &i;
    *p = 2;
    int j = *p;

[Figure: constraint graph with nodes 1, 2, i, j, p and a Ref term]
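The slide gives the constructor but not the constraints, so the following Python sketch fills them in under stated assumptions: &i is encoded as Ref(i,i), a write *p = e as p ⊆ Ref(e, w) and a read v = *p as p ⊆ Ref(u, v), where w and u are fresh, otherwise unused variables (hypothetical names "w_unused" and "u_unused"). Other formulations use the universal and empty sets in those positions.

```python
VARIANCE = {"Ref": "-+"}  # first arg: set (contravariant); second: get (covariant)

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    return {a for a, b in edges if b == var and isinstance(a, int)}

solved = close([
    (1, "i"),                           # int i = 1;
    (("Ref", "i", "i"), "p"),           # int *p = &i;
    ("p", ("Ref", 2, "w_unused")),      # *p = 2;      (set position)
    ("p", ("Ref", "u_unused", "j")),    # int j = *p;  (get position)
], VARIANCE)

i_vals = constants(solved, "i")
j_vals = constants(solved, "j")
print(i_vals, j_vals)
```

Contravariance of the set position sends 2 into i, and covariance of the get position sends the contents of i into j. Since the analysis is flow-insensitive, both i and j resolve to {1, 2}.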

More on functions

This encoding supports higher-order functions
  - Passing around Fun terms just like constants

Function pointers also work

    int (*funcPtr)(int);
    int id(int j) { return j; }
    funcPtr = &id;
    int x = (*funcPtr)(0);

[Figure: constraint graph with nodes Fun, j, id, Ref, funcPtr, 0, x]
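A Python sketch of the function-pointer example, combining Fun(-,+) and Ref(-,+). The exact constraints are assumptions of this sketch: &id is encoded as Ref(id,id), the dereference *funcPtr as funcPtr ⊆ Ref(u, f) with a fresh unused variable u, and the call through f as f ⊆ Fun(0,x).

```python
VARIANCE = {"Fun": "-+", "Ref": "-+"}

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    return {a for a, b in edges if b == var and isinstance(a, int)}

solved = close([
    (("Fun", "j", "r"), "id"),               # int id(int j)
    ("j", "r"),                              # return j
    (("Ref", "id", "id"), "funcPtr"),        # funcPtr = &id;
    ("funcPtr", ("Ref", "u_unused", "f")),   # f = *funcPtr
    ("f", ("Fun", 0, "x")),                  # int x = f(0);
], VARIANCE)

x_vals = constants(solved, "x")
print(x_vals)
```

The Fun term for id travels through the Ref term like any other value, so the call through the pointer is resolved by the same two closure rules and x resolves to {0}.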

Context (in)sensitivity

Multiple call sites

    int x = id(1);
    int y = id(2);

    {1,2} ⊆ x
    {1,2} ⊆ y

[Figure: constraint graph in which both call-site terms Fun(1,x) and Fun(2,y) connect to the single Fun(j,r) term of id]
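The smearing can be reproduced with a small Python sketch. The term encoding (str variables, int constants, tuple terms) and the function names are assumptions for illustration.

```python
VARIANCE = {"Fun": "-+"}

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    return {a for a, b in edges if b == var and isinstance(a, int)}

solved = close([
    (("Fun", "j", "r"), "id"),   # int id(int j)
    ("j", "r"),                  # return j
    ("id", ("Fun", 1, "x")),     # int x = id(1);
    ("id", ("Fun", 2, "y")),     # int y = id(2);
], VARIANCE)

x_vals = constants(solved, "x")
y_vals = constants(solved, "y")
print(x_vals, y_vals)
```

Both actuals flow into the single formal j, and the single return variable r flows to both x and y, so each call site sees the other's argument.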

Context sensitivity

Multiple call sites

    int x = id_1(1);
    int y = id_2(2);

Option 1: Specialization
  - Each call id_i gets a new copy of id
  - Eliminates smearing but increases graph size

    {1} ⊆ x
    {2} ⊆ y

[Figure: constraint graph with two copies of id, Fun(j_1,r_1) for id_1 and Fun(j_2,r_2) for id_2, each connected to its own call site]
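Specialization can be sketched in Python by copying the function's constraints once per call site and renaming every variable with the site index. The helper names rename and specialize and the template representation are hypothetical.

```python
VARIANCE = {"Fun": "-+"}

def close(constraints, variance):
    """Naive closure under transitivity and the variance rule."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    return {a for a, b in edges if b == var and isinstance(a, int)}

# Constraints for the body of id, written once.
TEMPLATE = [(("Fun", "j", "r"), "id"), ("j", "r")]

def rename(term, site):
    """Give every variable in a term a per-call-site name."""
    if isinstance(term, str):
        return f"{term}_{site}"
    if isinstance(term, tuple):
        return (term[0],) + tuple(rename(a, site) for a in term[1:])
    return term  # constants are shared

def specialize(template, site):
    return [(rename(a, site), rename(b, site)) for a, b in template]

constraints = (
    specialize(TEMPLATE, 1) + [("id_1", ("Fun", 1, "x"))]    # x = id_1(1)
    + specialize(TEMPLATE, 2) + [("id_2", ("Fun", 2, "y"))]  # y = id_2(2)
)
solved = close(constraints, VARIANCE)
x_vals = constants(solved, "x")
y_vals = constants(solved, "y")
print(x_vals, y_vals)
```

Each call now has its own formal and return variable, so x resolves to {1} and y to {2}, at the cost of one copy of the function's constraints per call site.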

Context sensitivity

Option 2: Unique labeled edges for each call site
  - Not using Fun constructor
  - There is flow only if there is a path that spells a substring of a well-bracketed string
      [_a [_b ]_b ]_a and [_a ]_a [_b are valid; [_a [_b ]_a ]_b is not

For both options, if there are higher-order functions or function pointers, need a first pass to compute pointer targets.

[Figure: graph over 1, 2, j, r, x, y with call edges labeled [_1 and [_2 and return edges labeled ]_1 and ]_2]
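The bracket-matching condition can be sketched in Python as a stack check along each path. The edge list below is an assumed reading of the figure (call edges into j, an unlabeled edge j to r for "return j", return edges out of r), and the search assumes an acyclic graph; on a cyclic graph the stack would need to be bounded.

```python
# (source, target, label); label is None, ("open", site) or ("close", site).
EDGES = [
    (1, "j", ("open", 1)),     # argument of call site 1
    (2, "j", ("open", 2)),     # argument of call site 2
    ("j", "r", None),          # return j
    ("r", "x", ("close", 1)),  # result of call site 1
    ("r", "y", ("close", 2)),  # result of call site 2
]

def step(stack, label):
    """New stack after one edge, or None if the brackets mismatch."""
    if label is None:
        return stack
    kind, site = label
    if kind == "open":
        return stack + (site,)
    if not stack:   # unmatched close: its open lies before the path starts
        return stack
    return stack[:-1] if stack[-1] == site else None

def valid(labels):
    """True if the label sequence is a substring of a well-bracketed string."""
    stack = ()
    for label in labels:
        stack = step(stack, label)
        if stack is None:
            return False
    return True

def reaches(src, dst, edges=EDGES):
    """Search over (node, stack) states, following only matched paths."""
    todo = [(src, ())]
    while todo:
        node, stack = todo.pop()
        if node == dst:
            return True
        for a, b, label in edges:
            if a == node:
                nxt = step(stack, label)
                if nxt is not None:
                    todo.append((b, nxt))
    return False

print(reaches(1, "x"), reaches(2, "x"), reaches(1, "y"), reaches(2, "y"))
```

The constant 1 reaches x because its path spells [_1 ]_1, but it does not reach y because [_1 ]_2 is mismatched. This gives the same precision as specialization without copying the function.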

Field sensitivity

For each field f, define Fld_f(-,+) constructor.

    obj o = { f:3; g:4 };
    int readG(obj p) { return p.g; }
    int w = id(o.f);
    int z = readG(o);

[Figure: constraint graph with nodes o, o.f, o.g, 3, 4, id, readG, w, z, the terms Fun(j,r) and Fun(p,r3), and Fld_f / Fld_g terms for the object and for each field access]
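The slide shows only the graph, so this Python sketch picks one plausible set of constraints: the object literal contributes Fld_f(o.f,o.f) ⊆ o and Fld_g(o.g,o.g) ⊆ o, and a read of field g from p into v is p ⊆ Fld_g(u, v) with a fresh unused u. These choices, and the decision to ignore inclusions between different constructors, are assumptions of the sketch.

```python
VARIANCE = {"Fun": "-+", "Fld_f": "-+", "Fld_g": "-+"}

def close(constraints, variance):
    """Naive closure; inclusions between different ctors add nothing."""
    edges = set(constraints)
    while True:
        new = set()
        for a, b in edges:
            if (isinstance(a, tuple) and isinstance(b, tuple)
                    and a[0] == b[0] and a[0] in variance):
                for v, t, u in zip(variance[a[0]], a[1:], b[1:]):
                    new.add((t, u) if v == "+" else (u, t))
            for c, d in edges:
                if b == c:
                    new.add((a, d))
        if new <= edges:
            return edges
        edges |= new

def constants(edges, var):
    return {a for a, b in edges if b == var and isinstance(a, int)}

solved = close([
    (3, "o.f"), (4, "o.g"),               # obj o = { f:3; g:4 };
    (("Fld_f", "o.f", "o.f"), "o"),
    (("Fld_g", "o.g", "o.g"), "o"),
    (("Fun", "j", "r"), "id"),            # int id(int j)
    ("j", "r"),                           # return j
    (("Fun", "p", "r3"), "readG"),        # int readG(obj p)
    ("p", ("Fld_g", "u1_unused", "r3")),  # return p.g
    ("o", ("Fld_f", "u2_unused", "t")),   # t = o.f
    ("id", ("Fun", "t", "w")),            # int w = id(t);
    ("readG", ("Fun", "o", "z")),         # int z = readG(o);
], VARIANCE)

w_vals = constants(solved, "w")
z_vals = constants(solved, "z")
print(w_vals, z_vals)
```

Because each field has its own constructor, the value 3 stored in f never reaches z and the value 4 stored in g never reaches w: w resolves to {3} and z to {4}.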

Scalability

Constraint graph for the entire program is in memory.
Even for flow-insensitive analyses, this can become a bottleneck.
Even worse for flow-sensitive analyses.

Techniques for analyzing parts of the program in isolation and storing summaries of their observable effects.

Summary

Set constraints are a natural way to express various program analyses
  - Constant propagation, pointer analysis
  - Closure analysis
  - Receiver class analysis, prototype-based inheritance
  - Information flow

Rich literature on solving systems of constraints.

Non-trivial to extend to flow-sensitive or summary-based analyses.

Interference between functions and references.