The Ant and The Grasshopper Fast and Accurate Pointer Analysis for Millions of Lines of Code Ben Hardekopf and Calvin Lin PLDI 2007 (Best Paper & Best.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Hybrid BDD and All-SAT Method for Model Checking Orna Grumberg Joint work with Assaf Schuster and Avi Yadgar Technion – Israel Institute of Technology.
Garbage collection David Walker CS 320. Where are we? Last time: A survey of common garbage collection techniques –Manual memory management –Reference.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Planar point location -- example
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability Jakob Rehof and Manuel Fähndrich Microsoft Research.
Demand-driven Alias Analysis Implementation Based on Open64 Xiaomi An
Whole-Program Linear-Constant Analysis with Applications to Link-Time Optimization Ludo Van Put – Dominique Chanet – Koen De Bosschere Ghent University.
Code Compaction of an Operating System Kernel Haifeng He, John Trimble, Somu Perianayagam, Saumya Debray, Gregory Andrews Computer Science Department.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
Parallel Inclusion-based Points-to Analysis Mario Méndez-Lojo Augustine Mathew Keshav Pingali The University of Texas at Austin (USA) 1.
ABCD: Eliminating Array-Bounds Checks on Demand Rastislav Bodík Rajiv Gupta Vivek Sarkar U of Wisconsin U of Arizona IBM TJ Watson recent experiments.
Semi-Sparse Flow-Sensitive Pointer Analysis Ben Hardekopf Calvin Lin The University of Texas at Austin POPL ’09 Simplified by Eric Villasenor.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
Eliminating Stack Overflow by Abstract Interpretation John Regehr Alastair Reid Kirk Webb University of Utah.
Speeding Up Dataflow Analysis Using Flow- Insensitive Pointer Analysis Stephen Adams, Tom Ball, Manuvir Das Sorin Lerner, Mark Seigle Westley Weimer Microsoft.
Range Analysis. Intraprocedural Points-to Analysis Want to compute may-points-to information Lattice:
1 Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006.
CSE 421 Algorithms Richard Anderson Lecture 4. What does it mean for an algorithm to be efficient?
Intraprocedural Points-to Analysis Flow functions:
1 An Efficient On-the-Fly Cycle Collection Harel Paz, Erez Petrank - Technion, Israel David F. Bacon, V. T. Rajan - IBM T.J. Watson Research Center Elliot.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Swerve: Semester in Review. Topics  Symbolic pointer analysis  Model checking –C programs –Abstract counterexamples  Symbolic simulation and execution.
Comparison Caller precisionCallee precisionCode bloat Inlining context-insensitive interproc Context sensitive interproc Specialization.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Data Flow Analysis Compiler Design Nov. 8, 2005.
An Efficient Inclusion-Based Points-To Analysis for Strictly-Typed Languages John Whaley Monica S. Lam Computer Systems Laboratory Stanford University.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Identifying Reversible Functions From an ROBDD Adam MacDonald.
Putting Pointer Analysis to Work Rakesh Ghiya and Laurie J. Hendren Presented by Shey Liggett & Jason Bartkowiak.
A GPU Implementation of Inclusion-based Points-to Analysis Mario Méndez-Lojo (AMD) Martin Burtscher (Texas State University, USA) Keshav Pingali (U.T.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Fast Points-to Analysis for Languages with Structured Types Michael Jung and Sorin A. Huss Integrated Circuits and Systems Lab. Department of Computer.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
ABCD: Eliminating Array-Bounds Checks on Demand Rastislav Bodík Rajiv Gupta Vivek Sarkar U of Wisconsin U of Arizona IBM TJ Watson recent experiments.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
11th Nov 2004PLDI Region Inference for an Object-Oriented Language Wei Ngan Chin 1,2 Joint work with Florin Craciun 1, Shengchao Qin 1,2, Martin.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India & K. V. Raghavan.
Pointer Analysis for Multithreaded Programs Radu Rugina and Martin Rinard M I T Laboratory for Computer Science.
Pointer Analysis – Part II CS Unification vs. Inclusion Earlier scalable pointer analysis was context- insensitive unification-based [Steensgaard.
A dynamic algorithm for topologically sorting directed acyclic graphs David J. Pearce and Paul H.J. Kelly Imperial College, London, UK
CS321 Data Structures Jan Lecture 2 Introduction.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Static Timing Analysis
Pointer Analysis – Part I CS Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.
Points-to Analysis as a System of Linear Equations Rupesh Nasre. Computer Science and Automation Indian Institute of Science Advisor: Prof. R. Govindarajan.
1PLDI 2000 Off-line Variable Substitution for Scaling Points-to Analysis Atanas (Nasko) Rountev PROLANGS Group Rutgers University Satish Chandra Bell Labs.
Binary Decision Diagrams Prof. Shobha Vasudevan ECE, UIUC ECE 462.
ECE 750 Topic 8 Meta-programming languages, systems, and applications Automatic Program Specialization for J ava – U. P. Schultz, J. L. Lawall, C. Consel.
INFORMATION-FLOW ANALYSIS OF ANDROID APPLICATIONS IN DROIDSAFE JARED YOUNG.
Online Cycle Detection and Difference Propagation for Pointer Analysis David J. Pearce, Paul H.J. Kelly and Chris Hankin Imperial College, London
Optimistic Hybrid Analysis
Optimizing Parallel Algorithms for All Pairs Similarity Search
Inference and search for the propositional satisfiability problem
Harry Xu University of California, Irvine & Microsoft Research
Pointer Analysis Lecture 2
Degree-aware Hybrid Graph Traversal on FPGA-HMC Platform
Pointer Analysis Lecture 2
Pointer analysis.
CUTE: A Concolic Unit Testing Engine for C
Using decision trees to improve signature-based intrusion detection
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Multidisciplinary Optimization
Presentation transcript:

The Ant and The Grasshopper Fast and Accurate Pointer Analysis for Millions of Lines of Code Ben Hardekopf and Calvin Lin PLDI 2007 (Best Paper & Best Presentation Award)

Contributions of the Paper Identify current state-of-the-art  Compare 3 well-known algorithms  Fastest takes ½ hour to analyze 1M lines of C Advance the state-of-the-art  Two techniques(The Ant and The Grasshopper) for inclusion-based analysis  Over 3x faster, same precision  Will be incorporated into GCC

The Agenda Background Lazy Cycle Detection Hybrid Cycle Detection Evaluation Q & A

Why Pointer Analysis ? Pointer information - vital for most program analyses like program verification and program understanding. Precise pointer analysis is NP hard. The most precise analyses are flow sensitive and context sensitive, but do not scale to large programs.

Pointer Analysis - Simplified Flow-sensitive analysis computes a different graph at each program point. But this can be quite expensive. Solution: Flow-insensitive analysis - compute a points-to relation which is the least upper bound of all the points-to relations computed by the flow-sensitive analysis Compute a SINGLE points-to relation that holds regardless of the order in which assignment statements are actually executed “consider all the assignment statements together, replacing strong updates in dataflow equations with weak updates” Lets see some equations!

Dataflow Equations – Flow-Sensitive x := &y G G’ = G with pt’(x)  {y} x := y G G’ = G with pt’(x)  pt(y) x := *y G G’ = G with pt’(x)  U pt(a) for all a in pt(y) *x := y G G’ = G with pt’(a) U  pt(y) for all a in pt(x) strong updates weak update

Dataflow Equations – Flow-Insensitive Statements x := &y G G = G with pt(x) U  {y} x := y G G = G with pt(x) U  pt(y) x := *y G G = G with pt(x) U  pt(a) for all a in pt(y) *x := y G G = G with pt(a) U  pt(y) for all a in pt(x) weak updates only

Set Constraints Statements x := &y x := y x := *y *x := y Base Simple Complex1 Complex2

Background: Inclusion-based Analysis Generate constraints from the code Build a constraint graph Nodes: variables Edges: inclusion constraints Add indirect constraints Pointer dereference Recursively compute transitive closure of the graph

Background: Inclusion-based Analysis c = &f; e = &c; g = &a; a = d; b = a; d = *e; *e = b; *g = e;

Background: Inclusion-based Analysis c = &f; e = &c; g = &a; a = d; b = a; d = *e; *e = b; *g = e; c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Background: Inclusion-based Analysis c = &f; e = &c; g = &a; a = d; b = a; d = *e; *e = b; *g = e; c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Background: Inclusion-based Analysis c = &f; e = &c; g = &a; a = d; b = a; d = *e; *e = b; *g = e; c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Background: Inclusion-based Analysis c = &f; e = &c; g = &a; a = d; b = a; d = *e; *e = b; *g = e; c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e g f d a c b Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e g f d a b Constraint Graph cfcf

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f d a cfcf b Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f d a cfcf b Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b dfdf a cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf a f,c cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b f,c dfdf a f,c cfcf Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b f,c dfdf a f,c c f,c Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b f,c d f,c a f,c c f,c Constraint Graph

Background: Inclusion-based Analysis c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b f,c d f,c a f,c c f,c Constraint Graph

Background: Online Cycle Detection Inclusion-based analysis is O(n 3 ) Optimize with online cycle detection  All nodes in the same cycle will have identical points-to sets  Most cycles appear during the analysis as new edges are added

Background: Online Cycle Detection Cycle detection mechanism will largely determine performance of analysis Must carefully balance aggression versus overhead  Too aggressive → too much graph traversal  Too conservative → cycles found too late

Contributions Two new techniques for cycle detection  Lazy Cycle Detection  Hybrid Cycle Detection Techniques are complementary  Hybrid Cycle Detection can be composed with any other cycle detection technique

Contributions Two new techniques for cycle detection  Lazy Cycle Detection  Hybrid Cycle Detection Techniques are complementary  Hybrid Cycle Detection can be composed with any other cycle detection technique

Lazy Cycle Detection Well-known fact: A cycle forces nodes to have identical points-to sets Key Insight: Nodes with identical points-to sets indicate possible cycles  Balance aggression and overhead by waiting for the effect of the cycle (identical points-to sets) to become obvious

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b dfdf a cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f b dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bfbf dfdf afaf cfcf Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f,c Constraint Graph

Lazy Cycle Detection c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f,c Constraint Graph

Lazy Cycle Detection IS LAZY because cycles are detected only while propagating constraints to them (well after they are created) Nodes with identical points-to sets MAY NOT be part of a cycle An Additional heuristic is needed to stop this wasteful search : Don’t trigger cycles detection on the same edge twice => Cycle detection is not guaranteed to find all cycles

Contributions Two new techniques for cycle detection  Lazy Cycle Detection  Hybrid Cycle Detection Techniques are complementary  Hybrid Cycle Detection can be composed with any other cycle detection technique

Hybrid Cycle Detection finds few cycles finds many cycles cheap expensive Offline Cycle Detection Rountev and Chandra, Off-line Variable Substitution for Scaling Points-to Analysis, in PLDI 2000.

Hybrid Cycle Detection finds few cycles finds many cycles cheap expensive Offline Cycle Detection Online Cycle Detection Fähndrich et al, Partial Online Cycle Elimination in Inclusion Constraint Graphs, in PLDI 1998.

Hybrid Cycle Detection Key Insight: combining offline and online techniques can give us the best of both worlds PS: Offline – before actual constraint graph traversal finds few cycles finds many cycles cheap expensive Offline Cycle Detection Online Cycle Detection Hybrid Cycle Detection

Hybrid Cycle Detection Eagerly finds cycles without traversing constraint graph Not guaranteed to find all cycles  46 ‒ 74% in these benchmarks Can be combined with other cycle detection techniques

Hybrid Cycle Detection Offline component Online component

Hybrid Cycle Detection ‒ Offline Linear time static analysis prior to actual pointer analysis. Uses a simpler offline constraint graph a node for each variable a ref node for variable dereference an edge for each simple, complex constr. ignore base constraints Detect cycles in this graph using Tarjan’s Algorithm Key: No need to perform transitive closure here!

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e Ignore Base Constraints!

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b Offline Constraint Graph

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b Offline Constraint Graph

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b Offline Constraint Graph

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b Offline Constraint Graph

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b Offline Constraint Graph

Hybrid Cycle Detection ‒ Offline c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e e *gd a *e b e → {a,b,d} Offline Constraint Graph

Hybrid Cycle Detection Offline component Online component

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f bd a cfcf Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f,c Constraint Graph e → {a,b,d}

Hybrid Cycle Detection ‒ Online c  {f} e  {c} g  {a} a  d b  a d  *e *e  b *g  e ecec gaga f a/b/c/d f,c Constraint Graph e → {a,b,d}

Evaluation Compare against 3 well-known algorithms Heintze and Tardieu [PLDI'01] Pearce et al [PASTE'04] Berndl et al [PLDI'03] All algorithms compute the exact same solution Six benchmarks: 100K—2M LOC Emacs, Ghostscript, Gimp, Insight, Wine, Linux Kernel

Performance Comparison 59

Performance Comparison 59

Concluding Remarks 20x faster than Berndl et al, 6x faster than Pearce et al and 3x faster than Heintze et al The paper also looks at different data structures to efficiently represent points-to sets Sparse-bitmap (used in GCC) Binary Decision Trees (BDD) BDD implementation is 2x slower on average, but used 5.5X less memory Question: They DO NOT compare with IDEAL. Is there further opportunity here ?

Backup: Transitive Closure Algorithm

Lazy Cycle Detection Algorithm

Heintze and Tardieu [PLDI'01] Online algorithm During construction, as new inclusion edges are added to the graph, the transitive edges are NOT added During analysis, indirect constraints are resolved through REACHABILITY queries. => lots of redundant queries In their paper, they reported results for field based implementation. When fields are expanded, the algorithm is dramatically slow.

Pearce et al [PASTE'04] Two variants First: Maintain topological order of the graph A newly inserted edge that violates this ordering COULD create a cycle, so check whenever this happens. Second Periodic sweep of the constraint graph to detect and collapse cycles.

Berndl et al [PLDI'03] Field sensitive inclusion-based pointer analysis for JAVA programs Uses BDDs to represent graph and points-to sets This paper extends this algorithm by Making it field insensitive Handle indirect function calls

Benchmarks Constraints reduced using offline variable substitution (60-77%)

Memory Consumption

Observations Number of nodes collapsed o Reduces nodes and edges in graph Number of nodes searched in DFS o Overhead due to cycle detection Number of points-to info propogation o Expensive operation