Pointer Analysis – Part I CS 6340. Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Pointer Analysis.
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Programming Languages and Paradigms
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Pointer Analysis.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
1 Practical Object-sensitive Points-to Analysis for Java Ana Milanova Atanas Rountev Barbara Ryder Rutgers University.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
The Ant and The Grasshopper Fast and Accurate Pointer Analysis for Millions of Lines of Code Ben Hardekopf and Calvin Lin PLDI 2007 (Best Paper & Best.
Semi-Sparse Flow-Sensitive Pointer Analysis Ben Hardekopf Calvin Lin The University of Texas at Austin POPL ’09 Simplified by Eric Villasenor.
Interprocedural analysis © Marcelo d’Amorim 2010.
Pointer and Shape Analysis Seminar Context-sensitive points-to analysis: is it worth it? Article by Ondřej Lhoták & Laurie Hendren from McGill University.
5/2/2008Prof. P. N. Hilfinger CS164 Lecture 401 Pointer Analysis Lecture 40 (adapted from notes by R. Bodik)
Next Section: Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis (Wilson & Lam) –Unification.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Mayur Naik Alex Aiken John Whaley Stanford University Effective Static Race Detection for Java.
Control Flow Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Approach #1 to context-sensitivity Keep information for different call sites separate In this case: context is the call site from which the procedure is.
Interprocedural pointer analysis for C We’ll look at Wilson & Lam PLDI 95, and focus on two problems solved by this paper: –how to represent pointer information.
1 Control Flow Analysis Mooly Sagiv Tel Aviv University Textbook Chapter 3
Scaling CFL-Reachability-Based Points- To Analysis Using Context-Sensitive Must-Not-Alias Analysis Guoqing Xu, Atanas Rountev, Manu Sridharan Ohio State.
Range Analysis. Intraprocedural Points-to Analysis Want to compute may-points-to information Lattice:
1 Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006.
A Context-Sensitive Pointer Analysis Phase in Open64 Compiler Tianwei Sheng, Wenguang Chen, Weimin Zheng Tsinghua University.
Intraprocedural Points-to Analysis Flow functions:
Java Alias Analysis for Online Environments Manu Sridharan 2004 OSQ Retreat Joint work with Rastislav Bodik, Denis Gopan, Jong-Deok Choi.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Recap from last time g() { lock; } h() { unlock; } f() { h(); if (...) { main(); } } main() { g(); f(); lock; unlock; } mainfgh ;;;;;;; u u ” ”””” ” ”
ESP [Das et al PLDI 2002] Interface usage rules in documentation –Order of operations, data access –Resource management –Incomplete, wordy, not checked.
Overview of program analysis Mooly Sagiv html://
Approach #1 to context-sensitivity Keep information for different call sites separate In this case: context is the call site from which the procedure is.
Comparison Caller precisionCallee precisionCode bloat Inlining context-insensitive interproc Context sensitive interproc Specialization.
Pointer and Shape Analysis Seminar Mooly Sagiv Schriber 317 Office Hours Thursday
Reps Horwitz and Sagiv 95 (RHS) Another approach to context-sensitive interprocedural analysis Express the problem as a graph reachability query Works.
An Efficient Inclusion-Based Points-To Analysis for Strictly-Typed Languages John Whaley Monica S. Lam Computer Systems Laboratory Stanford University.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University STATIC ANALYSES FOR JAVA IN THE PRESENCE OF DISTRIBUTED COMPONENTS AND.
PRESTO Research Group, Ohio State University Interprocedural Dataflow Analysis in the Presence of Large Libraries Atanas (Nasko) Rountev Scott Kagan Ohio.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Context-Sensitivity Analysis Literature Review by José Nelson Amaral University of Alberta.
Type Systems CS Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.
Fast Points-to Analysis for Languages with Structured Types Michael Jung and Sorin A. Huss Integrated Circuits and Systems Lab. Department of Computer.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
Mark Marron 1, Deepak Kapur 2, Manuel Hermenegildo 1 1 Imdea-Software (Spain) 2 University of New Mexico 1.
Mark Marron IMDEA-Software (Madrid, Spain) 1.
Featherweight X10: A Core Calculus for Async-Finish Parallelism Jonathan K. Lee, Jens Palsberg Presented By- Vasvi Kakkad.
ESEC/FSE-99 1 Data-Flow Analysis of Program Fragments Atanas Rountev 1 Barbara G. Ryder 1 William Landi 2 1 Department of Computer Science, Rutgers University.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Using Types to Analyze and Optimize Object-Oriented Programs By: Amer Diwan Presented By: Jess Martin, Noah Wallace, and Will von Rosenberg.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India & K. V. Raghavan.
Pointer Analysis for Multithreaded Programs Radu Rugina and Martin Rinard M I T Laboratory for Computer Science.
Data Flow Analysis for Software Prefetching Linked Data Structures in Java Brendon Cahoon Dept. of Computer Science University of Massachusetts Amherst,
CS 343 presentation Concrete Type Inference Department of Computer Science Stanford University.
Escape Analysis for Java Will von Rosenberg Noah Wallace.
Pointer Analysis. Rupesh Nasre. Advisor: Prof R Govindarajan. Apr 05, 2008.
5/7/03ICSE Fragment Class Analysis for Testing of Polymorphism in Java Software Atanas (Nasko) Rountev Ohio State University Ana Milanova Barbara.
Sept 12ICSM'041 Precise Identification of Side-Effect-Free Methods in Java Atanas (Nasko) Rountev Ohio State University.
Pointer Analysis CS Alias Analysis  Aliases: two expressions that denote the same memory location.  Aliases are introduced by:  pointers  call-by-reference.
Inter-procedural analysis
Abstract Interpretation and Future Program Analysis Problems Martin Rinard Alexandru Salcianu Laboratory for Computer Science Massachusetts Institute of.
Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting Tian Tan, Yue Li and Jingling Xue SAS 2016 September,
Compositional Pointer and Escape Analysis for Java Programs
Pointer Analysis Lecture 2
Topic 17: Memory Analysis
Pointer Analysis Lecture 2
Pointer analysis.
자바 언어를 위한 정적 분석 (Static Analyses for Java) ‘99 한국정보과학회 가을학술발표회 튜토리얼
Pointer Analysis Jeff Da Silva Sept 20, 2004 CARG.
Presentation transcript:

Pointer Analysis – Part I CS 6340

Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization & verification problems Problem is undecidable –No exact (i.e. both sound & complete) solution But many conservative (i.e. sound) approximate solutions exist –Determine which pointers may point to which locations –All incomplete but differ in precision (i.e. false-positive rate) Continues to be active area of research –hundreds of papers and tens of theses

Example Java Program class Link { T data; Link next; } class List { Link tail; void append(T c) { Link k = new Link (); k.data = c; Link t = this.tail; if ( t.next = k; this.tail = k; } t != null) static void main() { String[] a = new String[] { “ a1 ”, “ a2 ” }; String[] b = new String[] { “ b1 ”, “ b2 ” }; List l; l = new List (); for ( String v1 = a[i]; l.append(v1); } print(l); l = new List (); for ( String v2 = b[i]; l.append(v2); } print(l); } int i = 0; i < a.length; i++) { int i = 0; i < b.length; i++) {

Flow sensitivity –flow-insensitive: ignores intra-procedural control flow Call graph construction Heap abstraction Aggregate modeling Context sensitivity 0-CFA Pointer Analysis for Java

* i static void main() { String[] a = new String[] { “ a1 ”, “ a2 ” } String[] b = new String[] { “ b1 ”, “ b2 ” } List l l = new List () for ( String v1 = a[ ] l.append(v1) l = new List () for ( String v2 = b[ ] l.append(v2) } ; ; * int i = 0; i < a.length; i++) { int i = 0; i < b.length; i++) { Flow Insensitivity: Example class Link { T data; Link next; } class List { T tail; void append(T c) { Link k = new Link () k.data = c Link t = this.tail if ( t.next = k this.tail = k } } ; ; ; ; ; ; ; ; ; ; *) { t != null) *) i ; ; } }

class List { T tail; void append(T c) { Link k = new Link () k.data = c Link t = this.tail t.next = k this.tail = k } static void main() { String[] a = new String[] { “ a1 ”, “ a2 ” } String[] b = new String[] { “ b1 ”, “ b2 ” } List l l = new List () String v1 = a[*] l.append(v1) l = new List () String v2 = b[*] l.append(v2) } Flow Insensitivity: Example

Flow sensitivity –flow-insensitive: ignores intra-procedural control flow Call graph construction –“ on-the-fly ” : mutually recursively with pointer analysis Heap abstraction Aggregate modeling Context sensitivity 0-CFA Pointer Analysis for Java

static void main() { String[] a = new String[] { “ a1 ”, “ a2 ” } String[] b = new String[] { “ b1 ”, “ b2 ” } List l l = new List () String v1 = a[*] l.append(v1) l = new List () String v2 = b[*] l.append(v2) } Call Graph (Base Case): Example Code deemed reachable so far … class List { T tail; void append(T c) { Link k = new Link () k.data = c Link t = this.tail t.next = k this.tail = k }

Flow sensitivity –flow-insensitive: ignores intra-procedural control flow Call graph construction –“ on-the-fly ” : mutually recursively with pointer analysis Heap abstraction –object alloc. sites: does not distinguish objects created at same site Aggregate modeling Context sensitivity 0-CFA Pointer Analysis for Java

static void main() { String[] a = new String[] { “ a1 ”, “ a2 ” } String[] b = new String[] { “ b1 ”, “ b2 ” } List l l = new List () String v1 = a[*] l.append(v1) l = new List () String v2 = b[*] l.append(v2) } Heap Abstraction: Example class List { T tail; void append(T c) { Link k = new Link () k.data = c Link t = this.tail t.next = k this.tail = k }

static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } Heap Abstraction: Example class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k }

Heap Abstraction: Example class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } Note: Pointer analyses for Java typically do not distinguish between string literals (like “a1”, “a2”, “b1”, “b2” above): they use a single location to abstract them all static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) }

v = new i … Rule for Object Alloc. Sites Before: After: Note: This and each subsequent rule involving assignment is a “weak update” as opposed to a “strong update” (i.e., it accumulates as opposed to updates the points-to information for the l.h.s.), a hallmark of flow-insensitivity v new j … … v new i new j … …

Rule for Object Alloc. Sites: Example static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } l new 4 new 3 new 1 ba new 2

Flow sensitivity –flow-insensitive: ignores intra-procedural control flow Call graph construction –“ on-the-fly ” : mutually recursively with pointer analysis Heap abstraction –object alloc. sites: does not distinguish objects created at same site Aggregate modeling –does not distinguish elements of same array –field-sensitive for instance fields Context sensitivity 0-CFA Pointer Analysis for Java

v1.f = v2 v1 Rule for Heap Writes Before: After: new i … … v2 new j … … v2 new j … … new k new i … … f new j new k … … … … v1 new i … … f f f is instance field or [*] (array element)

Rule for Heap Writes: Example static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ”

v1 = v2.f v1 Rule for Heap Reads Before: After: new i v1 new k new i … … … … … … v2 new j … … v2 new j … … new k new j … … f new k new j … … f f is instance field or [*] (array element)

Rule for Heap Reads: Example static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ” v1v2

Flow sensitivity –flow-insensitive: ignores intra-procedural control flow Call graph construction –“ on-the-fly ” : mutually recursively with pointer analysis Heap abstraction –object alloc. sites: does not distinguish objects created at same site Aggregate modeling –does not distinguish elements of same array –field-sensitive for instance fields Context sensitivity –context-insensitive: ignores inter-procedural control flow (analyzes each function in a single context) 0-CFA Pointer Analysis for Java

CHA(T j, foo) = T m ::foo() { …; return r; } v1 = v2.foo() Rule for Dynamic Dispatching Calls Before: After: v1 new l new i v1 new i … … … … … … v2 new j … … v2 new j … … this new k … … r new l … … TjTj TjTj r … … this new j new k … … { …; ; …; } T n ::bar() T m ::foo() c c T n ::bar() T m ::foo() { } …; return r;

Call Graph (Inductive Step): Example l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ” v1v2 static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } c this new 5 k data tail t next

Classifying Pointer Analyses Heap abstraction Alias representation Aggregate modeling Flow sensitivity Context sensitivity Call graph construction Compositionality Adaptivity

Heap Abstraction Single node for entire heap –Cannot distinguish between heap-directed pointers –Popular in stack-directed pointer analyses for C Object allocation sites ( “ 0-CFA ” ) –Cannot distinguish objects allocated at same site –Predominant pointer analysis for Java String of call sites ( “ k-CFA with heap specialization/cloning ” ) –Distinguishes objects allocated at same site using finitely many strings of call sites –Predominant heap-directed pointer analysis for C Strings of object allocation sites in object-oriented languages ( “ k-object-sensitivity ” ) –Distinguishes objects allocated at same site using finitely many strings of object allocation sites

Example l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ” v1v2 static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } c this new 5 k data tail t next

Alias Representation Points-to Analysis: Computes the set of memory locations that a pointer may point to –Points-to graph represented explicitly or symbolically (e.g. using Binary Decision Diagrams) –Predominant kind of pointer analysis Alias Analysis: Computes pairs of pointers that may point to the same memory location –Used primarily by older pointer analyses for C –Can be computed using a points-to analysis: may-alias(v 1,v 2 ) if points-to(v 1 ) ∩ points-to(v 2 ) ≠ Ø

Aggregate Modeling Arrays –Single field ([*]) representing all array elements –Cannot distinguish elements of same array –But array dependence analysis used in parallelizing compilers is capable of making such distinctions Records/Structs –Field-insensitive/field-independent: merge all fields of each abstract record object –Field-based: merge each field of all record objects –Field-sensitive: model each field of each abstract record object (most precise)

Flow Sensitivity Flow-insensitive –Ignores intra-procedural control-flow (i.e. order of statements within a function) –Computes one solution for whole program or per function –Usually combined with Static Single Assignment (SSA) transformation to get limited flow sensitivity –Two kinds: Steensgaard ’ s or equality-based: almost linear time Anderson ’s or subset-based: cubic time Flow-sensitive –Computes one solution per program point –Typically performs “ strong updates ” for assignments –More precise but typically less scalable

Example l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ” v1v2 static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } c this new 5 k data tail t next

Context Sensitivity Context-insensitive –Ignores inter-procedural control-flow (does not match calls/returns) –Analyzes each function in a single abstract context Context-sensitive –Two kinds: Cloning-based (k-limited) –k-CFA or k-object-sensitive (for object-oriented languages) Summary-based –Top-down or bottom-up –Analyzes each function in multiple abstract contexts (cloning-based or top-down summary-based) or in a single parametric context (bottom- up summary-based) –More precise but typically less scalable

Example l new 4 new 3 new 1 ba new 2 [*] “ a1 ” “ a2 ” [*] “ b2 ” “ b1 ” v1v2 static void main() { String[] a = new 1 String[] { “ a1 ”, “ a2 ” } String[] b = new 2 String[] { “ b1 ”, “ b2 ” } List l l = new 3 List () String v1 = a[*] l.append(v1) l = new 4 List () String v2 = b[*] l.append(v2) } class List { T tail; void append(T c) { Link k = new 5 Link () k.data = c Link t = this.tail t.next = k this.tail = k } c this new 5 k data tail t next

Call Graph Construction Significant issue for OO languages like C++/Java Answers which methods a dynamically-dispatching call site can invoke at run-time Problem is undecidable –No exact (i.e. both sound and complete) solution But many conservative (i.e. sound) approximate solutions exist –Find which methods each dynamically-dispatching call site may invoke –All incomplete but differ in precision (i.e. false-positive rate)

Call Graph Construction (contd.) Two variants –in advance: prior to pointer analysis –on-the-fly: mutually recursively with pointer analysis Context insensitive algorithms –CHA: Class Hierarchy Analysis Use class hierarchy alone to compute which methods each dynamically- dispatching call site may invoke –RTA: Rapid Type Analysis Restrict CHA to classes instantiated in reachable object allocation sites –0-CFA Use pointer analysis to compute object allocation sites “ this ” argument of dynamically-dispatching call site may point to Context-sensitive algorithms –k-CFA –k-object-sensitive

Compositionality Whole-program –Cannot analyze open programs (e.g. libraries) –Predominant kind of pointer analysis Compositional/modular –Can analyze program fragments Missing callers (does not need “ harness ” ) Missing callees (does not need “ stubs ” ) –Solution is parameterized to accommodate unknown facts from the missing parts –Solution is instantiated to yield less parameterized (or fully instantiated) solution when missing parts are encountered –Parameterization harder in presence of dynamic dispatching Existing approaches rely on call graph computed by a whole-program analysis but can be highly imprecise –Open problem

Adaptivity Non-adaptive –Computes exhaustive solution of fixed precision regardless of client Demand-driven –Computes partial solution, depending upon a query from a client, but of fixed precision Client-driven –Computes exhaustive solution but can use different precision in different parts of the solution, depending upon client Iterative/Refinement-based –Starts with an imprecise solution and refines it in successive iterations depending upon client