Rational XL C/C++ Compiler Development © 2007 IBM Corporation

Identifying Aliasing Violations in Source Code
A Points-to Analysis Approach

Ettore Tiotto, XL C/C++ Compiler Development, TPO
Chris Bowler, XL C/C++ Compiler Development, C++FE
Overview
  What is Aliasing?
  Aliasing Violations in Type-Based Aliasing
  A Simple Approach to Detection of Aliasing Violations
  A Better Approach to Detection of Aliasing Violations
Aliasing – The Purpose
  Definition: Identifies the memory locations that operations may refer to
    – Enables optimization
  Example:
    void f1(float* pf1, float* pf2)
    {
      *pf1 = 5.0;
      float f = *pf2; // can this be moved up?
      // ...more code
    }

    void f2(int* pi, float* pf)
    {
      *pi = 42;
      float f = *pf; // how about here?
      // ...more code
    }
Aliasing – Can You Spot a Potential Problem?
    void f2(int* pi, float* pf)
    {
      *pi = 42;
      float f = *pf; // value of f may be undefined unless int* and float* are assumed to alias
      // …more code
    }

    int main()
    {
      float mf = 5.0;
      f2((int*)&mf, &mf);
      // …more code
    }
  Problem: both arguments of f2 refer to mf, once as an int* and once as a float*.
Type-Based Aliasing
  Permits optimization within a compilation unit
    – Contract between programmer and compiler
        Guaranteed by the programmer
        Assumed by the compiler
    – See C++ Standard regarding lvalues and rvalues (ISO/IEC 14882:2003(E), section 3.10)
  Indirect reads/writes should not permit “unnatural” access to an object.
  C/C++ char pointers may point to anything
    – char pointers are used to manipulate raw memory
Type-Based Aliasing
  Why would a programmer violate the type-based aliasing rules?
    – Byte-stream data (database records, network packets, etc.) has to be processed as several different types
    – Casts inserted to bypass type checking diagnostics
  Type-based aliasing can be relaxed in most compilers:
    – xlC -qalias=noansi
    – May result in performance degradation
  When the rules are violated, optimization can change the behaviour of the program:
    – Costly to debug
    – Migration difficulty
    – New optimizations may expose aliasing violations
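For the byte-stream case, one commonly used way to stay within the type-based aliasing rules is to copy the bytes into an object of the target type instead of dereferencing a cast pointer. The sketch below is only an illustration of that idea, not something taken from the presentation: `read_float` and `write_float` are hypothetical helper names, and the buffer layout is assumed to use the host's native float representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read a float stored at `offset` in a raw byte buffer.
// std::memcpy accesses the buffer as bytes, which the aliasing rules permit,
// so no float* is ever dereferenced on memory that holds another type.
float read_float(const unsigned char* buf, std::size_t len, std::size_t offset)
{
    assert(offset + sizeof(float) <= len);  // caller must supply enough bytes
    float value;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}

// Hypothetical helper: the inverse operation, storing a float into the buffer.
void write_float(unsigned char* buf, std::size_t len, std::size_t offset, float value)
{
    assert(offset + sizeof(float) <= len);
    std::memcpy(buf + offset, &value, sizeof value);
}
```

Because the buffer is only ever touched through character-typed accesses, the offset does not need to be aligned for float either.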
A Simple Approach – Diagnose Invalid Type Casts
    – The real problem is the indirect (the dereference), not the cast:
        float toFloat(int *pi)
        {
          return *(float*)pi; // diagnose cast
        }
    – Useful, but often proves to be excessively verbose
        False positive if the indirect uses a type that may legally alias the object.
    – No dataflow is tracked from the address-taken object to the indirect
A Better Approach Using Points-to Analysis
  Emit a diagnostic when:
    – An indirect operation may refer to a memory location whose type does not alias the type of the indirect.
    – Diagnostic provides:
        Type of indirect (pointer type)
        Name and type of object (points-to entry)
        A possible dataflow path for the invalid points-to entry (source location sequence)
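The diagnostic condition above can be sketched as a check of every points-to entry against the type of the indirect. This is a heavily simplified illustration, not the prototype's logic: types are plain strings, `Object`, `strip_sign`, `access_allowed`, and `check_dereference` are hypothetical names, and only two of the rules from section 3.10 are modelled (character-typed access and signed/unsigned counterparts). The message wording follows the prototype output shown elsewhere in the deck.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A named object and its declared type (type names are plain strings here).
struct Object {
    std::string name;
    std::string type;
};

// Remove a leading "unsigned " or "signed " so that a type and its
// signed/unsigned counterpart compare equal.
std::string strip_sign(const std::string& type)
{
    const std::string prefixes[] = {"unsigned ", "signed "};
    for (const std::string& prefix : prefixes)
        if (type.compare(0, prefix.size(), prefix) == 0)
            return type.substr(prefix.size());
    return type;
}

// Simplified assumption: an access is allowed if it uses char or
// unsigned char, or if both types match once signedness is ignored.
bool access_allowed(const std::string& access_type, const std::string& object_type)
{
    if (access_type == "char" || access_type == "unsigned char")
        return true;  // character types may inspect any object
    return strip_sign(access_type) == strip_sign(object_type);
}

// Return one message per points-to entry whose type is incompatible
// with the type of the dereference.
std::vector<std::string> check_dereference(const std::string& pointer,
                                           const std::string& access_type,
                                           const std::vector<Object>& points_to)
{
    std::vector<std::string> messages;
    for (const Object& obj : points_to)
        if (!access_allowed(access_type, obj.type))
            messages.push_back("\"" + pointer + "\" may point to \"" + obj.name +
                               "\" which has incompatible type \"" + obj.type + "\".");
    return messages;
}
```

A real checker would work on the compiler's type representation and would also have to cover cv-qualification, aggregates, and base classes.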
A Better Approach Using Points-to Analysis
  Problem:
    – What objects may be referred to by a given pointer?
  One Approach:
    – Static analysis
    – Undecidable in the general case, so the result is a conservative approximation
    – Context and flow insensitive
    – Compilation unit, linker view, or more…
Example 1: The Basics
    void f()
    {
      int i;
      int j;
      int* pi = &i;
      int* pj = &j;
      pi = pj;
    }
  The points-to sets are built one entry at a time; each entry carries the line.column source position that produced it:
    Step 1: &i { i (5.13) }
    Step 2: &j { j (6.13) }
    Step 3: pi { i (5.11) }
    Step 4: pj { j (6.11) }
    Step 5: pi { i (5.11) j (7.6) }
  Final sets:
    &i { i (5.13) }
    &j { j (6.13) }
    pi { i (5.11) j (7.6) }
    pj { j (6.11) }
  Traceback for pi points-to i (5.11, 5.13)
  Traceback for pi points-to j (7.6, 6.11, 6.13)
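The bookkeeping in Example 1 can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the XL C/C++ implementation: `Trace`, `PointsTo`, `assign_address`, and `copy_assign` are hypothetical names, source positions are passed in as strings rather than taken from a parser, and the repetition to a fixed point that a full flow-insensitive solver needs is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A traceback: the source positions ("line.column") that explain an entry.
using Trace = std::vector<std::string>;

// pointer name -> (object it may point to -> traceback for that entry)
using PointsTo = std::map<std::string, std::map<std::string, Trace>>;

// Models `p = &x`: the assignment is at assign_pos, the address-of at addr_pos.
void assign_address(PointsTo& pts, const std::string& p, const std::string& x,
                    const std::string& assign_pos, const std::string& addr_pos)
{
    pts[p].insert(std::make_pair(x, Trace{assign_pos, addr_pos}));
}

// Models `p = q`: p inherits every entry of q, and each traceback is
// extended at the front with the position of this assignment.
void copy_assign(PointsTo& pts, const std::string& p, const std::string& q,
                 const std::string& assign_pos)
{
    const std::map<std::string, Trace> source = pts[q];  // snapshot of q's set
    for (const auto& entry : source) {
        Trace trace{assign_pos};
        trace.insert(trace.end(), entry.second.begin(), entry.second.end());
        pts[p].insert(std::make_pair(entry.first, trace));  // keeps an existing entry
    }
}
```

Feeding it the three assignments of `f()` with the positions from the slides gives `pi` two entries, and the traceback stored for `j` is the sequence 7.6, 6.11, 6.13.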
Example 2: Function Calls
    void f(int *p);

    void g()
    {
      int i;
      f(&i);
    }

    void h()
    {
      int i;
      f(&i);
    }
  Build order:
    Step 1: &g::i { g::i (6.5) }
    Step 2: &h::i { h::i (12.5) }
    Step 3: f::p { g::i (6.4) }
    Step 4: f::p { g::i (6.4) h::i (12.4) }
  Note node sharing for the second call: both call sites feed the same f::p node.
  Final sets:
    &g::i { g::i (6.5) }
    &h::i { h::i (12.5) }
    f::p { g::i (6.4) h::i (12.4) }
  Traceback f::p points-to g::i (6.4, 6.5)
  Traceback f::p points-to h::i (12.4, 12.5)
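The node sharing in Example 2 follows from treating every call as an assignment of the argument to the single parameter node, regardless of the caller. A minimal sketch of that binding step follows; the names `Entry`, `ParamSets`, and `bind_address_argument` are hypothetical and the code is not taken from the prototype.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Entry {
    std::string target;              // object the parameter may point to
    std::vector<std::string> trace;  // positions: call site, then address-of
};

// parameter name (for example "f::p") -> entries merged over all call sites
using ParamSets = std::map<std::string, std::vector<Entry>>;

// Models a call that passes `&target` for `param`. Context-insensitive:
// every call site adds to the same set for the parameter.
void bind_address_argument(ParamSets& sets, const std::string& param,
                           const std::string& target,
                           const std::string& call_pos,
                           const std::string& addr_pos)
{
    std::vector<Entry>& entries = sets[param];
    for (const Entry& e : entries)
        if (e.target == target)
            return;  // entry already present, keep its first traceback
    entries.push_back(Entry{target, {call_pos, addr_pos}});
}
```

With the two calls from `g` and `h`, the map ends up with a single `f::p` key holding both `g::i` and `h::i`, which is the merged node the slide points out.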
Example 3: Handling Indirects
    int main()
    {
      int i, j;
      float f;
      int* pi = &i;
      int* pf = (int*)&f;
      int** pp = &pi;
      pp = &pf;
      *pp = &j;
      *pi = 42;
      return 0;
    }
  Build order:
    Step 1: &i { i (5.13) }, pi { i (5.11) }
    Step 2: &f { f (6.19) }, pf { f (6.11) }
    Step 3: &pi { pi (7.14) }, pp { pi (7.12) }
    Step 4: &pf { pf (8.8) }, pp { pi (7.12) pf (8.6) }
    Step 5: *pp { i (5.11) f (6.11) }
            Why can’t pi point to f?
    Step 6: &j { j (9.9) }, *pp { i (5.11) f (6.11) j (9.7) }, pf { f (6.11) j (9.7) }, pi { i (5.11) j (9.7) }
  Final sets:
    &i { i (5.13) }
    &f { f (6.19) }
    &j { j (9.9) }
    &pi { pi (7.14) }
    &pf { pf (8.8) }
    pi { i (5.11) j (9.7) }
    pf { f (6.11) j (9.7) }
    pp { pi (7.12) pf (8.6) }
    *pp { i (5.11) f (6.11) j (9.7) }
  How to trace back "pi points-to j"? Combine "*pp points-to j" with "pp points-to pi".
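The handling of the store through `pp` in Example 3 can be sketched as a small flow-insensitive solver that reapplies every statement until no points-to set grows. This is an illustration under stated assumptions rather than the prototype's algorithm: `Kind`, `Stmt`, `Sets`, and `solve` are hypothetical names, tracebacks are omitted for brevity, and `*pi = 42` is not modelled because it stores an integer, not an address.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Sets = std::map<std::string, std::set<std::string>>;

// Statement forms of a tiny pointer language (hypothetical encoding):
//   AddrOf: lhs = &rhs     Copy: lhs = rhs     Store: *lhs = &rhs
enum class Kind { AddrOf, Copy, Store };

struct Stmt {
    Kind kind;
    std::string lhs;
    std::string rhs;
};

// Flow-insensitive: statement order carries no meaning, so all rules are
// applied repeatedly until a full pass adds nothing.
Sets solve(const std::vector<Stmt>& program)
{
    Sets pts;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Stmt& s : program) {
            if (s.kind == Kind::AddrOf) {
                if (pts[s.lhs].insert(s.rhs).second) changed = true;
            } else if (s.kind == Kind::Copy) {
                const std::set<std::string> source = pts[s.rhs];
                for (const std::string& target : source)
                    if (pts[s.lhs].insert(target).second) changed = true;
            } else {
                // Store: every object lhs may point to may now point to rhs.
                const std::set<std::string> pointees = pts[s.lhs];
                for (const std::string& obj : pointees)
                    if (pts[obj].insert(s.rhs).second) changed = true;
            }
        }
    }
    return pts;
}
```

Encoding the five pointer statements of `main` and calling `solve` yields `pi` = { i, j } and `pf` = { f, j }, matching the final sets on the slide: the store through `pp` reaches both objects that `pp` may point to.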
Prototype Implementation Example 1
    1 int main()
    2 {
    3   short s = 42;
    4   int *pi = (int*)&s;
    5   *pi = 63;
    6   return 0;
    7 }
  Diagnostics:
    line 5.3: (I) Dereference may not conform to the current aliasing rules.
    line 5.3: (I) The dereferenced expression has type "int". "pi" may point to "s" which has incompatible type "short".
    line 5.3: (I) Check assignment at line 4 column 11 of t.C.
Prototype Implementation Example 2
     1 struct Foo
     2 {
     3   Foo() : _p(0) { }
     4   int foo() { return *_p; }
     5   void setP(int& p) { _p =&p; }
     6   int* _p;
     7 };
     8 int main()
     9 {
    10   Foo foo1;
    11   short s;
    12   foo1.setP((int&)s);
    13   return foo1.foo();
    14 }
  Diagnostics:
    line 4.22: (I) Dereference may not conform to the current aliasing rules.
    line 4.22: (I) The dereferenced expression has type "int". "_p" may point to "s" which has incompatible type "short".
    line 4.22: (I) Check assignment at line 12 column 13 of t.C.
    line 4.22: (I) Check assignment at line 5 column 26 of t.C.
A Better Approach Using Points-to Analysis
  Other complications:
    – Aggregates, arrays, nameless objects
  Inter-procedural analysis is ideal; without it, points-to analysis is significantly limited:
    – int *p = f(); // can’t see definition of f
    – Flow- and context-insensitive analysis works well; more precise analyses can run into space and time complexity issues
    – Shared libraries are still a problem
IBM Patents Pending
  This presentation contains material which has Patents Pending
    – XL C/C++ Compiler Development:
        Ettore Tiotto
        Enrique Varillas
        Raymond Mak
        Sean Perry
        Chris Bowler