Presentation is loading. Please wait.

Presentation is loading. Please wait.

Incrementalized Pointer and Escape Analysis Martin Rinard MIT LCS.

Similar presentations


Presentation on theme: "Incrementalized Pointer and Escape Analysis Martin Rinard MIT LCS."— Presentation transcript:

1 Incrementalized Pointer and Escape Analysis Martin Rinard MIT LCS

2 Context Unsafe languages (C, C++, …) Safe languages (Java, ML, Scheme, …) Big difference: garbage collection Goal: analyze program to safely allocate objects on the call stack instead of in garbage- collected heap Advantages No dangling references No memory leaks Disadvantages Collection overhead Collection pauses

3 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites

4 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs

5 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs

6 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs It allocates objects in heap

7 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs It allocates objects in heap

8 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs It allocates objects in heap

9 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When program runs It allocates objects in heap

10 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When objects become unreachable

11 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Program With Allocation Sites When objects become unreachable Garbage collector (eventually) reclaims memory

12 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Stack Allocation Correlate lifetimes of objects with lifetimes of procedures

13 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Stack Allocation Allocate object on activation record of procedure

14 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Stack Allocation When procedure returns

15 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Stack Allocation When procedure returns Object automatically deallocated

16 Problem How does compiler determine if it is safe to allocate objects on stack? Classic problem in program analysis Standard approach Analyze whole program Use information to find “captured” objects Allocate captured objects on stack

17 Stack Allocation Percentage of Memory Allocated on Stack 0 20 40 60 80 100 barneswaterjlexdbraytracecompress Whole-Program Analysis

18 Normalized Execution Times 0 20 40 60 80 100 barneswaterjlexdbraytracecompress Whole-Program Analysis Reference: execution time without optimization

19 Analysis Times 0 25 50 75 100 125 150 barneswaterjlexdbraytracecompress Analysis Time (seconds) Whole-Program Analysis 223 645

20 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Key Observation Number One: Most optimizations require only the analysis of a small part of program surrounding the object allocation site

21 void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— Key Observation Number Two: Most of the optimization benefit comes from a small percentage of the allocation sites 99% of objects allocated at these two sites

22 Intuition for Better Analysis Locate important allocation sites Use demand-driven approach to analyze region surrounding site Somehow avoid sinking analysis resources into unprofitable sites void compute(d,e) ———— void multiplyAdd(a,b,c) ————————— void multiply(m) ———— void add(u,v) —————— void main(i,j) ——————— void evaluate(i,j) —————— void abs(r) ———— void scale(n,m) —————— 99% of objects allocated at these two sites

23 How we do it Develop an incremental, demand-driven analysis Use profiling to identify important allocation sites Monitor analysis results to estimate Chance of optimizing site Analysis cost of realizing optimization Use estimates to direct analysis to profitable sites Result Usually get almost all of the benefit of whole-program analysis But at a fraction of the analysis cost

24 What This Talk is About How we turned this intuition into an algorithm that usually 1)obtains almost all the benefit of the whole program analysis 2) analyzes a small fraction of program 3) consumes a small fraction of whole program analysis time

25 Basic Idea Incremental, demand-driven analysis Profiling identifies important allocation sites Monitor analysis results to estimate Chance of optimizing site Analysis cost of realizing optimization Estimates direct analysis to profitable sites

26 Structure of Talk Motivating Example Base whole program analysis (Whaley and Rinard, OOPSLA 99) Incrementalized analysis Analysis policy Experimental results

27 Example

28 Employee Database Example Traverse database, extract max salary Name Salary John Doe $45,000 Ben Bit $30,000 Jane Roe $55,000 Vector max salary = $55,000 highest paid = Jane Roe

29 Coding Max Computation (in Java) class EmployeeDatabase { Vector database = new Vector(); Employee highestPaid; void computeMax() { int max = 0; Enumeration enum = database.elements(); while (enum.hasMoreElements()) { Employee e = enum.nextElement(); if (max < e.salary()) { max = e.salary(); highestPaid = e; }

30 Coding Max Computation (in Java) class EmployeeDatabase { Vector database = new Vector(); Employee highestPaid; void computeMax() { int max = 0; Enumeration enum = database.elements(); while (enum.hasMoreElements()) { Employee e = enum.nextElement(); if (max < e.salary()) { max = e.salary(); highestPaid = e; } Would like to allocate enum object on stack, not on the heap

31 Whole Program Analysis

32 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Bottom Up and Compositional Currently analyzed procedure Currently analyzed part of the program

33 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Bottom Up and Compositional Currently analyzed procedure Currently analyzed part of the program

34 Base Analysis Basic Abstraction: Points-to Escape Graph Intraprocedural Analysis Flow sensitive abstract interpretation Produces points-to escape graph at each program point Interprocedural Analysis Bottom up and compositional Analyzes each procedure once to obtain a single parameterized analysis result Result is specialized for use at each call site

35 Points-to Graph in Example void computeMax() { int max = 0; Enumeration enum = database.elements(); while (enum.hasMoreElements()) { Employee e = enum.nextElement(); if (max < e.salary()) { max = e.salary(); highestPaid = e; } } vectorelementData [ ] this enum e database highestPaid

36 Edge Types Inside Edges: Represent references created in currently analyzed part of program Outside Edges: Represent references created outside currently analyzed part of program vectorelementData [ ] this enum e database highestPaid dashed = outside solid = inside

37 Node Types Inside Nodes: Represent objects created in currently analyzed part of program Outside Nodes: Parameter nodes – represent parameters Load nodes - represent objects accessed via pointers created outside analyzed part vectorelementData [ ] this enum e database highestPaid dashed = outside solid = inside

38 Escape Information Escaped nodes parameter nodes returned nodes nodes reachable from other escaped nodes Captured is the opposite of escaped vectorelementData [ ] this enum e database highestPaid green = escaped white = captured

39 Stack Allocation Optimization Examine graph from end of procedure If a node is captured in this graph Allocate corresponding objects on stack (may need to inline procedures to apply optimization) vectorelementData [ ] this enum e database highestPaid green = escaped white = captured Can allocate enum object on stack

40 Interprocedural Analysis void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); }

41 Start with graph before call site void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database

42 Retrieve graph from end of callee void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid Enum object is not present because it was captured in the callee

43 Map formals to actuals void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid

44 Match corresponding inside and outside edges to complete mapping void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid

45 Match corresponding inside and outside edges to complete mapping void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid

46 Match corresponding inside and outside edges to complete mapping void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid

47 Match corresponding inside and outside edges to complete mapping void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid

48 Combine graphs to obtain new graph after call site void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database elementData [ ] this database highestPaid graph after call site e elementData [ ] highestPaid

49 Continue analysis after call site void printStatistics(BufferedReader r) { EmployeeDatabase e = new EmployeeDatabase(r); e.computeMax(); System.out.println(“max salary = “ + e.highestPaid); } e graph before call site elementData [ ] database graph after call site e elementData [ ] highestPaid

50 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

51 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

52 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

53 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

54 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

55 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

56 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

57 Whole Program Analysis void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() ——————

58 Incrementalized Analysis

59 Incrementalized Analysis Requirements Must be able to Analyze procedure independently of callers Whole-program analysis is compositional Already does this Skip analysis of invoked procedures But later incrementally integrate analysis results if desirable to do so

60 First Extension to Whole-Program Analysis Skip the analysis of invoked procedures Parameters are marked as escaping into skipped call site Assume analysis skips enum.nextElement() vector 1 this enum e database highestPaid Node 1 escapes into enum.nextElement()

61 First Extension Almost Works Can skip analysis of invoked procedures If allocation site is captured, great! If not, escape information tells you what procedures you should have analyzed… Should have analyzed enum.nextElement() vector 1 this enum e database highestPaid Node 1 escapes into enum.nextElement()

62 Second Extension to Base Algorithm Record enough information to undo skip and incorporate analysis into existing result Parameter mapping at call site Ordering information for call sites

63 void compute() { —————— foo(x,y); ——————— } Graph before call site Graph after call site Graph at end of procedure Graphs from Whole-Program Analysis Generated during analysis Used to apply stack allocation optimization

64 void compute() { —————— foo(x,y); ——————— } Graph before call site Graph after skipped call site Graph at end of procedure Graphs from Incrementalized Analysis Generated during analysis Used to apply stack allocation optimization

65 Incorporating Result from Skipped Call Site Naive approach: use mapping algorithm directly on graph from end of caller procedure Before incorporating result from skipped call site After incorporating result from skipped call site

66 Incorporating Result from Skipped Call Site Naive approach: use mapping algorithm directly on graph from end of caller procedure Graph from whole program analysis After incorporating result from skipped call site

67 Basic Problem and Solution Problem: additional edges in graph from end of method make result less precise Solution: augment abstraction For each call skipped call site, record Edges which were present in the graph before the call site Edges which were present after call site Use this information when incorporating results from skipped call sites

68 After Augmenting Abstraction Graph from whole program analysis After incorporating result from skipped call site

69 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis

70 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Attempt to stack allocate Enumeration object from elements

71 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze elements (intraprocedurally)

72 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze elements (intraprocedurally)

73 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze elements (intraprocedurally) Escapes only into the caller

74 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze computeMax (intraprocedurally)

75 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze computeMax (intraprocedurally)

76 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze computeMax (intraprocedurally) Escapes to hasMoreElements nextElement

77 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Analyze hasMoreElements and nextElement (intraprocedurally)

78 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Incorporate Results

79 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Enumeration object Captured in computeMax

80 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis Enumeration object Captured in computeMax Inline elements Stack allocate enumeration object

81 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis We skipped the analysis of some procedures

82 void computeMax() ———— Enumeration elements() ——— Vector elementData() ———— boolean hasMoreElements() —————— void printStatistics() ——————— Employee nextElement() —————— int salary() —————— Incrementalized Analysis We skipped the analysis of some procedures We ignored some other procedures

83 Result We can incrementally analyze Only what is needed For whatever allocation site we want And even temporarily suspend analysis part of the way through!

84 New Issue We can incrementally analyze Only what is needed For whatever allocation site we want And even temporarily suspend analysis part of the way through! But… Lots of analysis opportunities Not all opportunities are profitable Where to invest analysis resources? How much resources to invest?

85 Analysis Policy Formulate policy as solution to an investment problem Goal Maximize optimization payoff from invested analysis resources

86 Analysis Policy Implementation For each allocation site, estimate marginal return on invested analysis resources Loop Invest a unit of analysis resources (time) in site that offers best return Expand analyzed region surrounding site When unit expires, recompute marginal returns (best site may change)

87 Marginal Return Estimate N · P(d) C · T N = Number of objects allocated at the site P(d) = Probability of capturing the site, knowing we explored a region of call depth d C = Number of skipped call sites the allocation site escapes through T = Average time needed to analyze a call site

88 Marginal Return Estimate N · P(d) C · T As invest analysis resources explore larger regions around allocation sites get more information about sites marginal return estimates improve analysis makes better investment decisions!

89 Usage Scenarios Ahead of time compiler Give algorithm an analysis budget Algorithm spends budget Takes whatever optimizations it uncovered Dynamic compiler Algorithm acquires analysis budget as a percentage of run time Periodically spends budget, delivers additional optimizations Longer program runs, more optimizations

90 Experimental Results

91 Methodology Implemented analysis in MIT Flex System Obtained several benchmarks Scientific computations: barnes, water Our lexical analyzer: jlex Spec benchmarks: db, raytrace, compress

92 Analysis Times 223645 jdk 1.2

93 Allocation Sites Analyzed

94

95 Distribution of Allocated Objects

96 Analysis Time Payoff Stack allocated by incrementalized analysis Decided Stack allocated by whole-program analysis % of Objects Analysis Time (seconds) compressraytracedb Analysis Time (seconds) barneswaterjlex

97 Stack Allocation

98 Normalized Execution Times Reference: execution time without optimization

99 Experimental Summary Key Application Properties Most objects allocated at very few allocation sites Can capture objects with an incremental analysis of region surrounding allocation sites Consequence Most of the benefits of whole-program analysis Fraction of the cost of whole-program analysis

100 Other Analyses and Uses Whole-program pointer and escape analysis (OOPSLA 1999) Stack allocation Synchronization elimination Pointer and escape analysis for multithreaded programs (PLDI 1999, PPoPP 2001) Correct use of region-based allocation Elimination of dynamic region checks Synchronization elimination Memory bank disambiguation (RAW, DeepC) Foundation for other analyses Bitwidth analysis (MIT and CMU) Symbolic accessed region analysis

101 Other Analyses and Uses Symbolic analysis of accessed regions of memory blocks (PPoPP 1999, PLDI 2000) Automatic parallelization of sequential divide and conquer programs Data race freedom for parallel divide and conquer programs Absence of array bounds violations for both Bitwidth analysis

102 Future Directions Information from designers and developers Type systems for program properties Automatic parallelization and data race freedom for divide and conquer programs (CC 2001) Data race freedom for OO programs (OOPSLA 2001) Application-specific design properties Shapes of recursive data structures Object role transitions and constraints Distributed systems, component-based systems Interaction patterns Failure propagation properties Global view of system

103 Related Work Demand-driven Analyses Horwitz, Reps, Sagiv (FSE 1995) Duesterwald, Soffa, Gupta (TOPLAS 1997) Heintze, Tardieu (PLDI 2001) Key differences Integration of escape information enables incrementalized algorithms to suspend partially completed analyses Maintain accurate marginal payoff estimates Avoid overly costly analyses

104 Related Work Previous Escape Analyses Blanchet (OOPSLA 1999) Bogda, Hoelzle (OOPSLA 1999) Choi, Gupta, Serrano, Sreedhar, Midkiff (OOPSLA 1999) Whaley, Rinard (OOPSLA 1999) Ruf (PLDI 2000) Key Differences Previous algorithms analyze whole program But could incrementalize other analyses Get a range of algorithms with varying analysis time/precision tradeoffs

105 Conclusion Whole-Program Analysis Incrementalized Analysis Properties Uses escape information to incrementally analyze relevant regions of program Analysis policy driven by estimates of optimization benefits and costs Results Most of benefits of whole program analysis Fraction of the cost


Download ppt "Incrementalized Pointer and Escape Analysis Martin Rinard MIT LCS."

Similar presentations


Ads by Google