

Rational XL C/C++ Compiler Development © 2007 IBM Corporation

Identifying Aliasing Violations in Source Code
A Points-to Analysis Approach

Ettore Tiotto, XL C/C++ Compiler Development, TPO
Chris Bowler, XL C/C++ Compiler Development, C++FE

Overview

• What is Aliasing?
• Aliasing Violations in Type-Based Aliasing
• A Simple Approach to Detection of Aliasing Violations
• A Better Approach to Detection of Aliasing Violations

Aliasing – The Purpose

Definition: aliasing analysis identifies the memory locations that operations may refer to
  – Enables optimization

Example:

    void f1(float* pf1, float* pf2)
    {
      *pf1 = 5.0;
      float f = *pf2; // can this be moved up?
      //...more code
    }

    void f2(int* pi, float* pf)
    {
      *pi = 42;
      float f = *pf; // how about here?
      //...more code
    }

Aliasing – Can You Spot a Potential Problem?

    void f2(int* pi, float* pf)
    {
      *pi = 42;
      float f = *pf; // value of f may be undefined unless int* and float* alias
      // …more code
    }

    int main()
    {
      float mf = 5.0;
      f2((int*)&mf, &mf);
      // …more code
    }

Both arguments refer to mf, so the store through pi and the load through pf access the same object.

Type-Based Aliasing

• Permits optimization within a compilation unit
  – Contract between programmer and compiler
      Guaranteed by the programmer
      Assumed by the compiler
  – See the C++ Standard regarding lvalues and rvalues (ISO/IEC 14882:2003(E), section 3.10)
• Indirect reads/writes should not permit “unnatural” access to an object.
• C/C++ char pointers may point to anything
  – char pointers are used to manipulate raw memory
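The char-pointer exception on the slide above is what makes a byte-wise copy a conforming way to look at an object's representation. The following is a minimal sketch, not part of the original deck; `floatBits` is a hypothetical helper name, and the sketch assumes a 32-bit `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float and std::uint32_t have the same size.
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: returns the object representation of a float as a
// 32-bit integer. std::memcpy copies the bytes as raw memory, so no int
// lvalue is ever used to access the float object.
std::uint32_t floatBits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Compared with `*(std::uint32_t*)&value`, this form gives the compiler no indirect access through an incompatible pointer type, so type-based aliasing assumptions stay valid.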

Type-Based Aliasing

• Why would a programmer violate the type-based aliasing rules?
  – Processing byte-stream data in different ways is problematic (database, network packets, etc.)
  – Casts inserted to bypass type checking diagnostics
• Type-based aliasing can be relaxed in most compilers:
  – xlC -qalias=noansi
  – May result in performance degradation
• Optimization can change the behaviour of the program:
  – Costly to debug
  – Migration difficulty
  – New optimizations may expose aliasing violations

A Simple Approach – Diagnose Invalid Type Casts

  – The real problem is the indirect (the dereference), not the cast:

    float toFloat(int *pi)
    {
      return *(float*)pi; // diagnose cast
    }

  – Useful, but often proves to be excessively verbose
      False positive if the indirect uses an aliased type.
  – No dataflow from address-taken to indirect

A Better Approach Using Points-to Analysis

• Emit a diagnostic when:
  – An indirect operation may point to a memory location that is not aliased to the indirect type.
• The diagnostic provides:
  – Type of indirect (pointer type)
  – Name and type of object (points-to entry)
  – A possible dataflow path for the invalid points-to entry (source location sequence)
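The rule on the slide above can be sketched as a check over a pointer's points-to set. This is an illustrative model only: `PointsToEntry`, `mayAlias` and `diagnose` are hypothetical names, types are modelled as plain strings, and the compatibility test is deliberately simplified (identical types alias, and char-typed accesses may touch anything).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one points-to entry: the object a pointer may refer
// to, its declared type, and the source locations that led to the entry.
struct PointsToEntry {
    std::string objectName;
    std::string objectType;
    std::vector<std::string> path;  // line.column locations, e.g. {"4.11"}
};

// Simplified compatibility test. Real aliasing rules also cover
// signed/unsigned variants, cv-qualification and aggregates.
bool mayAlias(const std::string& indirectType, const std::string& objectType) {
    return indirectType == objectType || indirectType == "char" ||
           indirectType == "unsigned char";
}

// Returns one message per points-to entry whose type is incompatible with
// the type of the dereferenced expression.
std::vector<std::string> diagnose(const std::string& pointerName,
                                  const std::string& indirectType,
                                  const std::vector<PointsToEntry>& pointsTo) {
    std::vector<std::string> messages;
    for (const PointsToEntry& entry : pointsTo) {
        if (mayAlias(indirectType, entry.objectType)) continue;
        std::string message = "\"" + pointerName + "\" (indirect type \"" + indirectType +
                              "\") may point to \"" + entry.objectName +
                              "\" of incompatible type \"" + entry.objectType + "\"; path:";
        for (const std::string& location : entry.path) message += " " + location;
        messages.push_back(message);
    }
    return messages;
}
```

Each message carries the three items the slide lists: the indirect type, the name and type of the offending object, and the source location sequence.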

A Better Approach Using Points-to Analysis

• Problem:
  – What objects may be referred to by a given pointer?
• One Approach:
  – Static analysis
  – Undecidable in most cases
  – Context and flow insensitive
  – Compilation unit, linker view, or more…

Example 1: The Basics

    void f()
    {
      int i;
      int j;
      int* pi = &i;
      int* pj = &j;
      pi = pj;
    }

The points-to sets are built up in steps. Each entry carries the (line.column) position in the listing that introduced it:

    1. &i { i (5.13) }
    2. &j { j (6.13) }
    3. pi { i (5.11) }
    4. pj { j (6.11) }
    5. pi { i (5.11) j (7.6) }

• Traceback for pi points-to i (5.11, 5.13)
• Traceback for pi points-to j (7.6, 6.11, 6.13)
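The steps of Example 1 can be reproduced with a small flow-insensitive solver. This is a sketch under stated assumptions rather than the prototype's implementation: `Entry`, `Analysis`, `addressOf`, `copy`, `solve` and `traceback` are hypothetical names, nodes are identified by strings, and only address-of nodes and copy assignments are modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One points-to entry: where it was introduced and which node it came from
// (empty for an address-of node, which is the origin of the entry).
struct Entry {
    std::string location;
    std::string from;
};
using PointsToSet = std::map<std::string, Entry>;  // pointee name -> entry

struct Analysis {
    struct Copy { std::string dst, src, location; };
    std::map<std::string, PointsToSet> sets;  // node name -> points-to set
    std::vector<Copy> copies;

    // Creates the node "&object", which points to the object itself.
    void addressOf(const std::string& object, const std::string& location) {
        sets["&" + object].emplace(object, Entry{location, ""});
    }

    // Records the assignment dst = src seen at the given location.
    void copy(const std::string& dst, const std::string& src, const std::string& location) {
        copies.push_back(Copy{dst, src, location});
    }

    // Flow-insensitive: re-applies every copy until no set grows.
    void solve() {
        bool changed = true;
        while (changed) {
            changed = false;
            for (const Copy& c : copies) {
                const PointsToSet source = sets[c.src];  // snapshot of the source set
                for (const auto& item : source) {
                    const bool added =
                        sets[c.dst].emplace(item.first, Entry{c.location, c.src}).second;
                    changed = changed || added;
                }
            }
        }
    }

    // Follows the recorded origins back to the address-of node.
    std::vector<std::string> traceback(const std::string& pointer,
                                       const std::string& object) const {
        std::vector<std::string> path;
        std::string node = pointer;
        while (!node.empty()) {
            auto setIt = sets.find(node);
            if (setIt == sets.end()) break;
            auto entryIt = setIt->second.find(object);
            if (entryIt == setIt->second.end()) break;
            path.push_back(entryIt->second.location);
            node = entryIt->second.from;
        }
        return path;
    }
};
```

Feeding it `addressOf("i", "5.13")`, `addressOf("j", "6.13")`, `copy("pi", "&i", "5.11")`, `copy("pj", "&j", "6.11")` and `copy("pi", "pj", "7.6")`, then calling `solve()`, yields the sets listed above, and `traceback("pi", "j")` returns the location sequence 7.6, 6.11, 6.13.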

Example 2: Function Calls

    void f(int *p);

    void g()
    {
      int i;
      f(&i);
    }

    void h()
    {
      int i;
      f(&i);
    }

The points-to sets are built up in steps. Each entry carries the (line.column) position in the listing that introduced it:

    1. &g::i { g::i (6.5) }
    2. &h::i { h::i (12.5) }
    3. f::p { g::i (6.4) }
    4. f::p { g::i (6.4) h::i (12.4) }

• Note node sharing for the second call
• Traceback f::p points-to g::i (6.4, 6.5)
• Traceback f::p points-to h::i (12.4, 12.5)
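The node sharing in Example 2 follows from binding arguments to parameters without regard to calling context. A minimal sketch of that binding step, assuming each call passes the address of a named object; `CallSite` and `bindParameters` are hypothetical names:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A call site that passes the address of an object to a pointer parameter.
struct CallSite {
    std::string callee;     // e.g. "f"
    std::string parameter;  // e.g. "p"
    std::string argument;   // qualified name of the object whose address is passed
    std::string location;   // line.column of the call
};

// Parameter node -> (object -> call-site location).
using ParameterSets = std::map<std::string, std::map<std::string, std::string>>;

// Context-insensitive binding: every call to the same function feeds the
// same parameter node, so the arguments of all call sites are merged.
ParameterSets bindParameters(const std::vector<CallSite>& calls) {
    ParameterSets sets;
    for (const CallSite& call : calls)
        sets[call.callee + "::" + call.parameter].emplace(call.argument, call.location);
    return sets;
}
```

With the two calls from the listing, the single node `f::p` ends up holding both `g::i` and `h::i`. A context-sensitive analysis would keep them apart at higher cost.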

Example 3: Handling Indirects

    int main()
    {
      int i, j;
      float f;
      int* pi = &i;
      int* pf = (int*)&f;
      int** pp = &pi;
      pp = &pf;
      *pp = &j;
      *pi = 42;
      return 0;
    }

The points-to sets are built up in steps. Each entry carries the (line.column) position in the listing that introduced it:

    1. &i { i (5.13) }
       pi { i (5.11) }
    2. &f { f (6.19) }
       pf { f (6.11) }
    3. &pi { pi (7.14) }
       pp { pi (7.12) }
    4. &pf { pf (8.8) }
       pp { pi (7.12) pf (8.6) }
    5. *pp { i (5.11) f (6.11) }
    6. &j { j (9.9) }
       *pp { i (5.11) f (6.11) j (9.7) }
       pf { f (6.11) j (9.7) }
       pi { i (5.11) j (9.7) }

• After step 5: why can’t pi point to f?
• After step 6: how to traceback pi points to j?
  – *pp points-to j + pp points-to pi
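The store through `*pp` in Example 3 can be modelled with one extra constraint kind on top of address-of and copy. This is a sketch of a generic inclusion-based solver, not the prototype's algorithm; `Kind`, `Constraint` and `solve` are hypothetical names, source locations are omitted, and `*pp = &j` is split into a temporary (`tmp = &j`) followed by a store (`*pp = tmp`).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

enum class Kind { AddressOf, Copy, Store };

// AddressOf: lhs = &rhs    Copy: lhs = rhs    Store: *lhs = rhs
struct Constraint {
    Kind kind;
    std::string lhs, rhs;
};

using PointsTo = std::map<std::string, std::set<std::string>>;

PointsTo solve(const std::vector<Constraint>& constraints) {
    PointsTo pts;
    // Adds every element of `from` to pts[to]; reports whether pts[to] grew.
    auto merge = [&pts](const std::string& to, const std::set<std::string>& from) {
        std::set<std::string>& target = pts[to];
        const std::size_t before = target.size();
        target.insert(from.begin(), from.end());
        return target.size() != before;
    };
    bool changed = true;
    while (changed) {  // iterate to a fixed point; statement order is ignored
        changed = false;
        for (const Constraint& c : constraints) {
            switch (c.kind) {
            case Kind::AddressOf:
                changed |= merge(c.lhs, {c.rhs});
                break;
            case Kind::Copy:
                changed |= merge(c.lhs, std::set<std::string>(pts[c.rhs]));
                break;
            case Kind::Store: {
                // Everything rhs may point to flows into every object
                // that lhs may point to.
                const std::set<std::string> targets = pts[c.lhs];
                const std::set<std::string> values = pts[c.rhs];
                for (const std::string& t : targets) changed |= merge(t, values);
                break;
            }
            }
        }
    }
    return pts;
}
```

For the listing above, the store adds j to both pi and pf because pp may point to either. Nothing ever copies the contents of pf into pi, which is why f never appears in the set of pi.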

Prototype Implementation Example 1

    1  int main()
    2  {
    3    short s = 42;
    4    int *pi = (int*)&s;
    5    *pi = 63;
    6    return 0;
    7  }

    line 5.3: (I) Dereference may not conform to the current aliasing rules.
    line 5.3: (I) The dereferenced expression has type "int". "pi" may point to "s" which has incompatible type "short".
    line 5.3: (I) Check assignment at line 4 column 11 of t.C.

Prototype Implementation Example 2

     1  struct Foo
     2  {
     3    Foo() : _p(0) { }
     4    int foo() { return *_p; }
     5    void setP(int& p) { _p =&p; }
     6    int* _p;
     7  };
     8  int main()
     9  {
    10    Foo foo1;
    11    short s;
    12    foo1.setP((int&)s);
    13    return foo1.foo();
    14  }

    line 4.22: (I) Dereference may not conform to the current aliasing rules.
    line 4.22: (I) The dereferenced expression has type "int". "_p" may point to "s" which has incompatible type "short".
    line 4.22: (I) Check assignment at line 12 column 13 of t.C.
    line 4.22: (I) Check assignment at line 5 column 26 of t.C.

A Better Approach Using Points-to Analysis

• Other complications:
  – Aggregates, arrays, nameless objects
• Inter-procedural analysis is ideal; without it, points-to analysis has significant limitations:
  – int *p = f(); // can’t see definition of f
  – Flow- and context-insensitive analysis works well; otherwise space and time complexity can be an issue
  – Shared libraries are still a problem

IBM Patents Pending

• This presentation contains material which has Patents Pending
  – XL C/C++ Compiler Development: Ettore Tiotto, Enrique Varillas, Raymond Mak, Sean Perry, Chris Bowler