Interprocedural Path Profiling David Melski Thomas Reps University of Wisconsin
Outline Introduction and Applications The Ball-Larus Intraprocedural Technique Overview of Interprocedural Path-profiling Conclusion
Introduction What is path profiling? –Counts the number of times particular path fragments are executed Our work: extensions of Ball-Larus –New interprocedural techniques –New intraprocedural techniques
Applications Program Optimization –Path-qualified dataflow analysis (Ammons, Larus) Software Maintenance –Path spectra (Reps et al.) –“Oddball” paths –Debugger applications
The Ball-Larus Technique “Remove” cycles from the CFG Label each vertex v with numPaths[v], the number of paths from v to Exit. Label each edge with an integer such that: –For each path p, p’s path number---the sum of the values on p’s edges---is unique value –Each path number is in [0..numpaths[Entry]-1] Add Instrumentation
Ball-Larus Tech. “Remove” cycles –Add surrogate edges –Remove backedges Left with a DAG: have a finite number of acyclic paths
Ball-Larus Tech. Label each vertex v with numPaths[v], (the number of paths from v to Exit.) Use bottom-up traversal
Ball-Larus Tech. Label edges such that: For each path p, p’s path number—the sum of p’s edges—is a unique value pathNum = 0 pathNum = 3 pathNum = 1 pathNum = 2
Ball-Larus Tech. Add Instrumentation: Var. pathNum Example: start w/ pathNum=0 pathNum+=0 (pathNum=0) Profile[pathNum] ++ pathNum=2
Edge Labels
Overview Introductory example program The program supergraph G* Modifying G* to create G*-fin Parameterized instrumentation
Introductory Example (main)
Introductory Example (pow function)
Supergraph G* Unique vertices Entry global and Exit global CFG for each procedure P –Unique Entry P and Exit P call and return-site vertices for each procedure call call-edges and return-edges connect calls to procedures
Invalid Path Example Do not want to consider invalid paths for profiling
Valid Paths Label interprocedural edges with parens Don’t accept paths with mismatched parens ( ) [ } { ] Invalid: ( { ] ) Same-Level: ( { } ) Unbalanced-Left: ( {
Interprocedural Cycles Complicates interprocedural profiling
Creating G*-fin Modify G*: –In each procedure: Add Gexit Remove Backedges –(Removes cycles in control-flow graphs) u8: Gexit pow
Creating G*-fin Modify G* (cont): –Remove recursive edges –(Removes cycles in call graph)
Observable Paths In G*-fin: –A finite number of unbalanced-left paths. –Each unbalanced-left path defines an observable path—an item that we log in a profile. –(observable paths are unbalanced-left because they may end in the middle of a procedure)
Context-Prefix and Active-Suffix Each path has –a context-prefix –an active-suffix –a counter The counter counts the executions of the active-suffix in the context of the context- prefix
Overview: and functions For vertex v in procedure P, assign fn v – v takes the number of paths from Exit P and returns the number of paths from v For edge e in procedure P, assign fn e – e takes the number of paths from Exit P and returns an integer value for e
Overview: Instrumentation Each procedure Q takes additional parameters: –pathNum (passed by reference) –numPathsExitQ (passed by value) On a procedure call to Q from P, calculate numPathsExitQ for current context: –numPathsExitQ = r ( numPathsExitP )
Overview: Instrumentation numPathsExitPow = r (numPathsExitMain) pathNumOnEntryPow = pathNum E.g., Function call:
Overview: Instrumentation pathNum += e (numPathsExitPow) E.g., Edge Traversal:
Overview: Instrumentation pathNum += e (numPathsExitPow) Profile[pathNum] ++ pathNum = pathNumOnEntryPow pathNum += e (numPathsExitPow) E.g., Backedge Traversal:
Assigning functions Solve the following equations:
Assigning call functions numPaths[ExitPow] = rtn(numPaths[ExitMain]) numPaths[EntryPow] = entry( rtn(numPaths[ExitMain]) numPaths[Call] = entry( rtn(numPaths[ExitMain]) call = entry rtn
Discussion Relationship of functions to Interprocedural DFA (e.g., Sharir and Pnueli’s functions):
Assigning functions Let the vertex w have successors w 1 … w k :
Conclusion Interprocedural Context Path Profiling –still some difficulties: Doubly exponential observable paths (but can “prune” paths) instrumentation is somewhat more costly ( 2 ops per edge instead of 1) –Need static analysis to find and functions
Conclusion Developed a toolkit of path-profiling techniques: –Interprocedural vs. intraprocedural –Edge functions vs. edge values –Context vs. piecewise
Possible Approaches Remove most call-edges and return-edges Inline every non-recursive procedure Duplicate every non-recursive procedure Parameterize instrumentation in each procedure to behave differently for different contexts
Handling Recursion Assign values to each edge Entry global Entry R i :
Handling Recursion Before a recursive call to R: –save pathNum (in pathNumBeforeCall ) –set pathNum to value on Entry global Entry R After a recursive call –update profile with pathNum –restore pathNum to pre-call value (from pathNumBeforeCall )
Theory: Context-Free DAGs Let L be a context-free language over Let G be a directed graph whose edges are labeled with members of A path in G is an L-path if its word is in L (L,G) is a context-free DAG if the number of L-paths through G is finite
Interprocedural Piecewise Profiling Modification of context profiling: –For each procedure P: Add the vertex GEntry P Add the edge Entry global GEntry P –Replace each surrogate edge Entry P v with GEntry P v Use Unbalanced-Right-Left paths in G*-fin (must handle unbalanced-right paths)
Other Techniques Intraprocedural context path profiling –Context indicates the path taken to a loop header Hybrid techniques –Exploit parameterization of the instrumentation
Discussion Keeping the numbering dense: –the number of paths can be exponential in the size of the graph –might require O(n) bits for pathNum –dense numbering important –piecewise or hybrid may be more practical