Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code. Hongtao Yu, Zhaoqing Zhang, Xiaobing Feng, Wei Huo, Jingling Xue.

Presentation transcript:

Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code
Hongtao Yu, Zhaoqing Zhang, Xiaobing Feng, Wei Huo
Institute of Computing Technology, Chinese Academy of Sciences
{ htyu, zqzhang, fxb, huowei
Jingling Xue
University of New South Wales

INSTITUTE OF COMPUTING TECHNOLOGY

Outline
Introduction
Framework
Analyzing a Level
Experiments
Conclusion

Introduction
Motivation
  – Who needs flow- and context-sensitive (FSCS) pointer analysis?
      Software checking tools
      Program understanding
      Parallelization tools
      Hardware synthesis
  – Existing methods cannot scale to large real programs
  – Aiming at millions of lines of C code

Improve scalability
For flow-sensitivity
  – Fewer iterations in the dataflow analysis
  – Less space for the points-to graph
For context-sensitivity
  – Summary-based
  – Low cost to store summaries
  – Low cost to apply summaries

Idea
Level by Level analysis
  – Analyze the pointers in decreasing order of their points-to levels
      Suppose int **q, *p, x;
      q has level 2, p has level 1 and x has level 0.
  – Fast flow-sensitive analysis on full sparse SSA
  – Fast and accurate context-sensitive analysis using a full transfer function

Contribution
  – Performs a full-sparse flow-sensitive pointer analysis using a flow-insensitive algorithm
  – Performs a context-sensitive pointer analysis efficiently with precise full transfer functions
  – Yields a flow- and context-sensitive interprocedural may/must mod/ref on a compact SSA form
  – Analyzes a million lines of code in minutes, faster than the state-of-the-art FSCS pointer analysis algorithms

Framework
Figure 1. Level-by-level pointer analysis (LevPA).
Compute points-to levels
For each points-to level, from the highest to the lowest
  – Bottom-up: evaluate transfer functions
  – Top-down: propagate points-to sets
Incrementally build the call graph

Points-to level
Property 1. If a variable x is possibly pointed to by a pointer y, then ptl(x) ≤ ptl(y).
Property 2. If a variable y is possibly assigned to x, then ptl(x) = ptl(y).
Compute points-to levels by a unification-based pointer analysis
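
As a rough illustration of Property 1, the sketch below assigns levels on a tiny points-to graph for the declaration int **q, *p, x. It assumes the hypothetical assignments q = &p and p = &x, assumes the graph is acyclic, and uses made-up names (points_to, ptl). The slides compute levels with a unification-based analysis, which this sketch does not reproduce.

```c
#include <assert.h>

/* Hypothetical variables for: int **q, *p, x;  q = &p;  p = &x; */
enum { Q, P, X, NVARS };

/* points_to[v][w] != 0 means v may point to w (assumed acyclic). */
static const int points_to[NVARS][NVARS] = {
    [Q] = { [P] = 1 },
    [P] = { [X] = 1 },
};

/* Level 0 for a variable with no targets, else 1 + the highest target level. */
int ptl(int v)
{
    assert(v >= 0 && v < NVARS);
    int level = 0;
    for (int w = 0; w < NVARS; w++) {
        if (points_to[v][w]) {
            int candidate = 1 + ptl(w);
            if (candidate > level)
                level = candidate;
        }
    }
    return level;
}
```

With these assumptions, every pointed-to variable ends up at a level no higher than its pointer, which is the ordering the level-by-level traversal relies on.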

Example
    int obj, t;
    main() {
    L1:   int **x, **y;
    L2:   int *a, *b, *c, *d, *e;
    L3:   x = &a; y = &b;
    L4:   foo(x, y);
    L5:   *b = 5;
    L6:   if ( … ) { x = &c; y = &e; }
    L7:   else { x = &d; y = &d; }
    L8:   c = &t;
    L9:   foo(x, y);
    L10:  *e = 10;
    }
    void foo(int **p, int **q) {
    L11:  *p = *q;
    L12:  *q = &obj;
    }

ptl(x, y, p, q) = 2
ptl(a, b, c, d, e) = 1
ptl(t, obj) = 0
Analyze first { x, y, p, q }, then { a, b, c, d, e }, and last { t, obj }.

Bottom-up analyze level 2
    void foo(int **p, int **q) {
    L11:  *p = *q;
    L12:  *q = &obj;
    }
    main() {
    L1:   int **x, **y;
    L2:   int *a, *b, *c, *d, *e;
    L3:   x = &a; y = &b;
    L4:   foo(x, y);
    L5:   *b = 5;
    L6:   if ( … ) { x = &c; y = &e; }
    L7:   else { x = &d; y = &d; }
    L8:   c = &t;
    L9:   foo(x, y);
    L10:  *e = 10;
    }

Bottom-up analyze level 2
    void foo(int **p, int **q) {
    L11:  *p1 = *q1;
    L12:  *q1 = &obj;
    }
    main() {
    L1:   int **x, **y;
    L2:   int *a, *b, *c, *d, *e;
    L3:   x = &a; y = &b;
    L4:   foo(x, y);
    L5:   *b = 5;
    L6:   if ( … ) { x = &c; y = &e; }
    L7:   else { x = &d; y = &d; }
    L8:   c = &t;
    L9:   foo(x, y);
    L10:  *e = 10;
    }
p1's points-to set depends on formal-in p
q1's points-to set depends on formal-in q

Bottom-up analyze level 2
    void foo(int **p, int **q) {
    L11:  *p1 = *q1;
    L12:  *q1 = &obj;
    }
    main() {
    L1:   int **x, **y;
    L2:   int *a, *b, *c, *d, *e;
    L3:   x1 = &a; y1 = &b;
    L4:   foo(x1, y1);
    L5:   *b = 5;
    L6:   if ( … ) { x2 = &c; y2 = &e; }
    L7:   else { x3 = &d; y3 = &d; }
          x4 = ϕ(x2, x3); y4 = ϕ(y2, y3)
    L8:   c = &t;
    L9:   foo(x4, y4);
    L10:  *e = 10;
    }
p1's points-to set depends on formal-in p
q1's points-to set depends on formal-in q
x1 → { a }      y1 → { b }
x2 → { c }      y2 → { e }
x3 → { d }      y3 → { d }
x4 → { c, d }   y4 → { e, d }

Full-sparse Analysis
Achieve flow-sensitivity flow-insensitively
  – Regard each SSA name as a unique variable
  – Set constraint-based pointer analysis
Full sparse
  – Saving time
  – Saving space
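
The idea of treating each SSA name as its own variable can be sketched with a small inclusion-constraint solver. The code below is only an illustration under stated assumptions: points-to sets are bitmasks, the only constraints are the ϕ copies for x4 from the running example, and all identifiers (solve, struct copy, pts_after_solving) are hypothetical rather than taken from LevPA.

```c
#include <assert.h>

/* Abstract locations, one bit each in a points-to set. */
enum { LOC_C = 1u << 0, LOC_D = 1u << 1 };

/* SSA names from main() in the running example, each a separate variable. */
enum { X2, X3, X4, NNAMES };

struct copy { int dst, src; };  /* constraint: pts(dst) includes pts(src) */

/* Solve the copy constraints to a fixpoint, ignoring statement order. */
void solve(unsigned pts[], const struct copy *copies, int ncopies)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < ncopies; i++) {
            assert(copies[i].dst >= 0 && copies[i].dst < NNAMES);
            assert(copies[i].src >= 0 && copies[i].src < NNAMES);
            unsigned merged = pts[copies[i].dst] | pts[copies[i].src];
            if (merged != pts[copies[i].dst]) {
                pts[copies[i].dst] = merged;
                changed = 1;
            }
        }
    }
}

/* x2 = &c; x3 = &d; x4 = phi(x2, x3). Returns the solved set of one name. */
unsigned pts_after_solving(int name)
{
    assert(name >= 0 && name < NNAMES);
    unsigned pts[NNAMES] = { [X2] = LOC_C, [X3] = LOC_D };
    const struct copy phi[] = { { X4, X2 }, { X4, X3 } };
    solve(pts, phi, 2);
    return pts[name];
}
```

Because x2, x3 and x4 are distinct variables, the solver never has to track program points: x2 keeps { c }, x3 keeps { d }, and only x4 receives the merged set { c, d }.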

Top-down analyze level 2
main: Propagate to callsite
  – L4: foo.p → { a }, foo.q → { b }
  – L9: foo.p → { c, d }, foo.q → { d, e }
  – Over both call sites: foo.p → { a, c, d }, foo.q → { b, d, e }

Top-down analyze level 2
foo: Expand pointer dereferences (calling contexts are merged here)
    void foo(int **p, int **q) {
          μ(b, d, e)
    L11:  *p1 = *q1;
          χ(a, c, d)
    L12:  *q1 = &obj;
          χ(b, d, e)
    }

Context Condition
To be context-sensitive
Points-to relation c_i
  – p ⇒ v (p → v): p must (may) point to v, where p is a formal parameter.
Context condition ℂ(c_1,…,c_k)
  – a Boolean function consisting of higher-level points-to relations
Context-sensitive μ and χ
  – μ(v_i, ℂ(c_1,…,c_k))
  – v_{i+1} = χ(v_i, M, ℂ(c_1,…,c_k)), where M ∈ {may, must} indicates a weak/strong update
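
One way to read v_{i+1} = χ(v_i, M, ℂ) is as a guarded update of a points-to set. The following sketch is an interpretation, not the authors' implementation: sets are bitmasks, the context condition is passed in already evaluated, and the names chi, mod_kind and stored are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { MAY, MUST } mod_kind;

/*
 * Evaluate v_{i+1} = chi(v_i, M, C) on bitmask points-to sets.
 * old_pts:    points-to set of v_i
 * stored:     set written by the indirect store
 * cond_holds: value of the context condition C in the current context
 */
unsigned chi(unsigned old_pts, unsigned stored, mod_kind m, bool cond_holds)
{
    if (!cond_holds)
        return old_pts;        /* the store cannot reach v in this context */
    if (m == MUST)
        return stored;         /* strong update: the old value is killed */
    assert(m == MAY);
    return old_pts | stored;   /* weak update: the old value survives */
}
```

The context condition decides whether the store is visible at all, while M decides whether the previous value is replaced or merged.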

Context-sensitive μ and χ
    void foo(int **p, int **q) {
          μ(b, q ⇒ b)
          μ(d, q → d)
          μ(e, q → e)
    L11:  *p1 = *q1;
          a = χ(a, must, p ⇒ a)
          c = χ(c, may, p → c)
          d = χ(d, may, p → d)
    L12:  *q1 = &obj;
          b = χ(b, must, q ⇒ b)
          d = χ(d, may, q → d)
          e = χ(e, may, q → e)
    }

Bottom-up analyze level 1
    void foo(int **p, int **q) {
          μ(b1, q ⇒ b)
          μ(d1, q → d)
          μ(e1, q → e)
    L11:  *p1 = *q1;
          a2 = χ(a1, must, p ⇒ a)
          c2 = χ(c1, may, p → c)
          d2 = χ(d1, may, p → d)
    L12:  *q1 = &obj;
          b2 = χ(b1, must, q ⇒ b)
          d3 = χ(d2, may, q → d)
          e2 = χ(e1, may, q → e)
    }

Points-to Set
Local Points-to Set
  – Loc(p) = { <v, ℂ(c_1,…,c_k)> | ℂ(c_1,…,c_k) is a context condition }.
  – p can point to v if and only if ℂ(c_1,…,c_k) holds.
  – It is computed explicitly during the bottom-up analysis.
Dependence Set
  – Dep(p) = { <q, ℂ(c_1,…,c_k)> | q is a formal-in parameter of level lev and ℂ(c_1,…,c_k) is a context condition }.
  – Ptr(p) includes Ptr(q) if and only if ℂ(c_1,…,c_k) holds.

Transfer function
Trans(proc, v) = <Loc(v), Dep(v), ℂ(c_1,…,c_k), M>
  – v is a formal-out parameter and ℂ(c_1,…,c_k) is a context condition.
  – v can be modified at a callsite invoking proc only if ℂ(c_1,…,c_k) holds at the callsite.
  – M ∈ {may, must} indicates the may/must mod effect.
Trans(proc)
  – the set of all individual transfer functions Trans(proc, v).

Bottom-up analyze level 1
    void foo(int **p, int **q) {
          μ(b1, q ⇒ b)
          μ(d1, q → d)
          μ(e1, q → e)
    L11:  *p1 = *q1;
          a2 = χ(a1, must, p ⇒ a)
          c2 = χ(c1, may, p → c)
          d2 = χ(d1, may, p → d)
    L12:  *q1 = &obj;
          b2 = χ(b1, must, q ⇒ b)
          d3 = χ(d2, may, q → d)
          e2 = χ(e1, may, q → e)
    }
Trans(foo, a) = < { }, { … }, p ⇒ a, must >
Trans(foo, c) = < { }, { … }, p → c, may >
Trans(foo, b) = < { … }, { }, q ⇒ b, must >
Trans(foo, e) = < { … }, { }, q → e, may >
Trans(foo, d) = < { … }, { … }, p → d ∨ q → d, may >

Bottom-up analyze level 1
    int obj, t;
    main() {
    L1:   int **x, **y;
    L2:   int *a, *b, *c, *d, *e;
    L3:   x1 = &a; y1 = &b;
          μ(b1, true)
    L4:   foo(x1, y1);
          a2 = χ(a1, must, true)
          b2 = χ(b1, must, true)
    L5:   *b1 = 5;
    L6:   if ( … ) { x2 = &c; y2 = &e; }
    L7:   else { x3 = &d; y3 = &d; }
          x4 = ϕ(x2, x3)
          y4 = ϕ(y2, y3)
    L8:   c1 = &t;
          μ(d1, true)
          μ(e1, true)
    L9:   foo(x4, y4);
          c2 = χ(c1, may, true)
          d2 = χ(d1, may, true)
          e2 = χ(e1, may, true)
    L10:  *e1 = 10;
    }
At L4, p ⇒ a and q ⇒ b hold.
At L9, p → c, p → d, q → e and q → d hold.

BDD and context condition
Context conditions are implemented using BDDs
  – Compact representation
  – Efficient Boolean operations
Example: BDD for ℂ = (p → a ∧ q → a) ∨ p → b
  – variable x1 represents p → a
  – variable x2 represents q → a
  – variable x3 represents p → b
If only p → b holds at a call site, we can evaluate ℂ | x1=0; x2=0; x3=1 to see whether ℂ holds at the call site.
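
To make the call-site check concrete, here is a hand-built decision diagram for ℂ = (x1 ∧ x2) ∨ x3 and a routine that evaluates it under a full assignment. It is a minimal sketch: the node layout, the variable order x1 < x2 < x3 and the names bdd_node and holds are assumptions made for illustration, and a real analysis would use a BDD library rather than a fixed table.

```c
#include <assert.h>
#include <stdbool.h>

/* One decision node: test variable `var`, follow `lo` on 0 and `hi` on 1. */
struct bdd_node { int var, lo, hi; };

enum { FALSE_LEAF = 0, TRUE_LEAF = 1, N_X3 = 2, N_X2 = 3, N_X1 = 4 };

/* Reduced, ordered BDD for C = (x1 AND x2) OR x3, variable order x1 < x2 < x3.
 * Entries 0 and 1 are the terminal nodes and carry no variable. */
static const struct bdd_node bdd[] = {
    [FALSE_LEAF] = { -1, FALSE_LEAF, FALSE_LEAF },
    [TRUE_LEAF]  = { -1, TRUE_LEAF, TRUE_LEAF },
    [N_X3]       = { 2, FALSE_LEAF, TRUE_LEAF },
    [N_X2]       = { 1, N_X3, TRUE_LEAF },
    [N_X1]       = { 0, N_X3, N_X2 },
};

/* Walk from the root to a leaf; x1, x2, x3 are the relations that hold. */
bool holds(bool x1, bool x2, bool x3)
{
    const bool assignment[3] = { x1, x2, x3 };
    int n = N_X1;
    while (n != FALSE_LEAF && n != TRUE_LEAF) {
        assert(bdd[n].var >= 0 && bdd[n].var < 3);
        n = assignment[bdd[n].var] ? bdd[n].hi : bdd[n].lo;
    }
    return n == TRUE_LEAF;
}
```

Calling holds(false, false, true) corresponds to the slide's case where only p → b holds at the call site, and the walk reaches the true leaf after testing x1 and x3.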

Experiment
  – Analyzes a million lines of code in minutes
  – Faster than the state-of-the-art FSCS pointer analysis algorithms
Table 2. Performance (secs).
Columns: Benchmark, KLOC, LevPA (64bit, 32bit), Bootstrapping (PLDI'08)
Benchmarks: Icecast, sendmail, httpd, gombk, wine, wireshark

Conclusion
We present a scalable method for flow- and context-sensitive pointer analysis.
It analyzes the pointers in a program level by level, in terms of their points-to levels.
  – Fast flow-sensitive analysis on full sparse SSA form
  – Fast and accurate context-sensitive analysis using full transfer functions represented by BDDs
It can analyze a million lines of C code in minutes, faster than the state-of-the-art methods.

Thanks