Efficient Field-Sensitive Pointer Analysis for C David J. Pearce, Paul H.J. Kelly and Chris Hankin Imperial College, London, UK


What is Pointer Analysis?
• Determine pointer targets without running the program
• What is flow-insensitive pointer analysis?
  > One solution for all statements, so precision is lost
  > This trades precision for efficiency
  > This work considers flow-insensitive pointer analysis only

    int a,b,*p,*q = NULL;
    p = &a;
    if(…) q = p;   // p ⊇ {a,b}, q ⊇ {a,NULL}
    p = &b;

Pointer analysis via set-constraints
• Generate set-constraints from the program and solve them
  > Use a constraint graph for efficient solving

(program and constraints)
    int a,b,c,*p,*q,*r;
    p = &a;          // p ⊇ { a }
    r = &b;          // r ⊇ { b }
    q = &c;          // q ⊇ { c }
    if(...) q = p;   // q ⊇ p
    else    q = r;   // q ⊇ r

(constraint graph)
    Nodes p, q, r, with an edge into q from each of p and r (one per subset constraint)
    Initial sets: p {a}, r {b}, q {c}
    Solved sets:  p {a}, r {b}, q {a,b,c}

Field-Sensitivity
• How to deal with aggregate types?
  > Standard approach treats them as single variables

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r; t1 x;
    p = &a;     // p ⊇ { a }
    q = &b;     // q ⊇ { b }
    x.f1 = p;   // x ⊇ p
    x.f2 = q;   // x ⊇ q
    r = x.f1;   // r ⊇ x

(constraint graph)
    Nodes p, q, x, r
    Initial sets: p {a}, q {b}, x {}, r {}
    Solved sets:  p {a}, q {b}, x {a,b}, r {a,b}

Field-Sensitivity: a simple solution
• Use a separate node per field for each aggregate
  > Node “x” split in two

    typedef struct { int *f1; int *f2; } t1;
    int a,b,*p,*q,*r; t1 x;
    p = &a;     // p ⊇ { a }
    q = &b;     // q ⊇ { b }
    x.f1 = p;   // x_f1 ⊇ p
    x.f2 = q;   // x_f2 ⊇ q
    r = x.f1;   // r ⊇ x_f1

(constraint graph)
    Nodes p, q, x_f1, x_f2, r
    Initial sets: p {a}, q {b}, x_f1 {}, x_f2 {}, r {}
    Solved sets:  p {a}, q {b}, x_f1 {a}, x_f2 {b}, r {a}

Problem: can take address of field in C
• System thus far has no mechanism for this
• First idea: use string concatenation operator ||
  > Works well for this example

    typedef struct { int *f1; int *f2; } t1;
    int **p; t1 x,*s;
    s = &x;          // s ⊇ { x }
    p = &(s->f2);    // p ⊇ (*s) || f2  ⇒  p ⊇ { x } || f2  ⇒  p ⊇ { x_f2 }

(constraint graph)
    Nodes x_f1 {..} and x_f2 {..}

Problem: compatible types
• First idea: use string concatenation operator ||
  > Casting between types that are identical except for field names
  > Derivation same as before, but node x_f2 no longer exists!

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p; t1 *s; t2 x;
    s = (t1*) &x;    // s ⊇ { x }
    p = &(s->f2);    // p ⊇ (*s) || f2  ⇒  p ⊇ { x } || f2  ⇒  p ⊇ { x_f2 }

(constraint graph)
    Nodes x_f3 {..} and x_f4 {..}

Field-Sensitivity: our solution
• Our solution: map variables to integers
  > Solution sets become integer sets
  > Use integer addition to model taking address of field
  > Address of aggregate modelled by address of its first field

    typedef struct { int *f1; int *f2; } t1;
    typedef struct { int *f3; int *f4; } t2;
    int **p; t1 *s; t2 x;
    s = (t1*) &x;    // s ⊇ { x_f3 }  ⇒  s ⊇ { 2 }
    p = &(s->f2);    // p ⊇ s + 1  ⇒  p ⊇ { 2 } + 1  ⇒  p ⊇ { 3 }

(variable numbering)
    p = 0, s = 1, x_f3 = 2, x_f4 = 3

Experimental Study
Columns: Time (s), Avg Deref Size
Rows (each benchmark measured field-insensitive and field-sensitive):
    bash (55324 LOC)
    emacs (93151 LOC)
    sendmail (49053 LOC)
    named (75599 LOC)
    ghostscript ( LOC)

Conclusion
• Field-sensitive Pointer Analysis
  > Presented new technique for C language
  > Elegantly copes with language features
    - Taking address of field
    - Compatible types and casting
    - Technique also handles function pointers without modification
  > Experimental evaluation over 7 common C programs
    - Considerable improvements in precision obtained
    - But, much higher solving times
    - And, relative gains appear to diminish with larger benchmarks

Constraint Graphs (continued)
• What about statements involving a pointer dereference?
  > Cannot be represented in the constraint graph
  > Instead, add edges as solution of q becomes known
  > Thus, computation similar to dynamic transitive closure

(program and constraints)
    int a,*r,*s,**p,**q;
    p = &r;    // p ⊇ { r }
    s = &a;    // s ⊇ { a }
    q = p;     // q ⊇ p
    *q = s;    // *q ⊇ s  ⇒  r ⊇ s

(constraint graph)
    Nodes p, q, s, r
    Initial sets: p {r}, q {}, s {a}, r {}
    After propagating q ⊇ p: q {r}, which yields the new constraint r ⊇ s
    Solved sets:  p {r}, q {r}, s {a}, r {a}