ESEC/FSE-99
Data-Flow Analysis of Program Fragments
  Atanas Rountev (1), Barbara G. Ryder (1), William Landi (2)
  (1) Department of Computer Science, Rutgers University
  (2) Siemens Corporate Research
  Funded by NSF grants CCR , CCR and Siemens Corporate Research
Overview
  Motivation
  Theoretical model
  Application for pointer alias analysis
  Experimental results
Data-Flow Analysis
  Information about program behavior
  Defines:
    – Graph for the control-flow structure
    – Lattice L of data-flow values
    – Transfer functions f_i : L → L
  Flow sensitivity: propagate data-flow values by respecting execution order of statements
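The three ingredients above (graph, lattice, transfer functions) can be sketched as a small worklist solver. This is only an illustration, assuming a powerset lattice with union as the join and a reaching-definitions problem; the node numbers, the definition labels `d1`/`d3`, and the names `solve` and `gen_kill` are hypothetical and do not come from the slides.

```python
from collections import deque


def solve(nodes, succs, transfer, entry, entry_value):
    """Forward worklist solver over a powerset lattice (join = set union)."""
    preds = {n: [] for n in nodes}
    for n in nodes:
        for s in succs.get(n, []):
            preds[s].append(n)
    out = {n: frozenset() for n in nodes}  # start every node at bottom
    work = deque(nodes)
    while work:
        n = work.popleft()
        # Join the values flowing in along all incoming edges.
        in_val = frozenset(entry_value) if n == entry else frozenset()
        for p in preds[n]:
            in_val |= out[p]
        new_out = transfer[n](in_val)  # apply f_n : L -> L
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succs.get(n, []))  # successors must be revisited
    return out


def gen_kill(gen, kill):
    """Build a transfer function of the form f(v) = (v - kill) | gen."""
    return lambda v: (v - frozenset(kill)) | frozenset(gen)


# Hypothetical program:  1: x = 1;  2: if (...)  3: x = 2;  4: use(x)
nodes = [1, 2, 3, 4]
succs = {1: [2], 2: [3, 4], 3: [4]}
transfer = {
    1: gen_kill({"d1"}, {"d3"}),
    2: gen_kill(set(), set()),
    3: gen_kill({"d3"}, {"d1"}),
    4: gen_kill(set(), set()),
}
solution = solve(nodes, succs, transfer, entry=1, entry_value=set())
```

Because values are propagated along control-flow edges, the solution is flow-sensitive: only `d3` reaches the exit of node 3, while both `d1` and `d3` reach node 4.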
Limitations of Whole-Program Analysis
  Traditionally designed as whole-program analysis
  Precise analyses do not scale for large programs
  Incomplete programs cannot be analyzed: e.g., programs with libraries
  Information may be needed only for a small part of a large program
Fragment Data-Flow Analysis
  Idea: analyze a program fragment instead of a whole program
  Use summary information about the rest of the program
  Advantages:
    – Analyze fragments of large programs
    – Analyze incomplete programs
    – Analyze only the “interesting part” of the program
Questions
  What is the analysis structure?
  What is the relationship to whole-program analysis?
  How to define and ensure safety?
  What factors affect analysis cost and precision?
Model of Whole-Program Analysis
  Consider only flow-sensitive analysis
  Interprocedural control-flow graph
  Lattice L of data-flow values
  Node transfer functions f_i : L → L
  Solutions and safety
  [Figure: interprocedural control-flow graph of procedures with entry, exit, call, and return nodes]
Fragment Analysis Structure
  Input: fragment + whole-program information
  Graph, lattice, node transfer functions
  Boundary nodes: entry, call, return
  Boundary entry: summary value from
  Boundary call: summary function
  [Figure: fragment with entry, exit, call, and return nodes]
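One way to picture the boundary nodes is a fragment solver that seeds the boundary entry with a summary value and replaces each boundary call with a summary function. The sketch below assumes a toy "may be null" problem over sets of variable names, merges each call/return pair into a single node, and uses hypothetical names (`analyze_fragment`, `call_g`, the variables `a`, `b`, `c`); none of this is taken from the slides.

```python
def analyze_fragment(nodes, edges, transfer, boundary_entry, entry_summary,
                     call_summaries):
    """Round-robin fixed point over the fragment graph only.

    entry_summary  : lattice value assumed to hold at the boundary entry
    call_summaries : boundary call node -> summary function for the
                     callee, which lies outside the fragment
    """
    preds = {n: [p for (p, s) in edges if s == n] for n in nodes}
    funcs = dict(transfer)
    funcs.update(call_summaries)  # summary function stands in for the callee
    out = {n: frozenset() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            in_val = (frozenset(entry_summary) if n == boundary_entry
                      else frozenset())
            for p in preds[n]:
                in_val |= out[p]
            new_out = funcs[n](in_val)
            if new_out != out[n]:
                out[n], changed = new_out, True
    return out


def identity(v):
    return v


def assign(lhs, rhs):
    """`lhs = rhs`: lhs may be null exactly when rhs may be null."""
    return lambda v: (v | {lhs}) if rhs in v else (v - {lhs})


def assign_fresh(lhs):
    """`lhs = <new object>`: lhs is definitely not null afterwards."""
    return lambda v: v - {lhs}


# Hypothetical fragment:  entry; a = b; g(); b = new; exit
nodes = ["entry", "n1", "call_g", "n2", "exit"]
edges = [("entry", "n1"), ("n1", "call_g"), ("call_g", "n2"), ("n2", "exit")]
transfer = {
    "entry": identity,
    "n1": assign("a", "b"),
    "n2": assign_fresh("b"),
    "exit": identity,
}
fragment_solution = analyze_fragment(
    nodes, edges, transfer,
    boundary_entry="entry",
    entry_summary={"b"},                         # assumed: b may be null on entry
    call_summaries={"call_g": lambda v: v | {"c"}},  # assumed: g may null c
)
```

The code of `g` is never inspected; everything the fragment analysis knows about the rest of the program enters through `entry_summary` and `call_summaries`.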
Fragment Analysis Safety
  All possible containing programs: p ∈ Progs
  Abstraction relation
  If, then safely abstracts x
  A safe solution safely abstracts the most precise whole-program solution for every p
  Sufficient requirements for analysis safety: transfer functions, boundary summaries
An Application
  Initial whole-program flow-insensitive analysis
  Fragment analysis input
    – Flow-insensitive solution
    – Call graph
  Use flow-insensitive solution at the boundary
  Two fragment pointer alias analyses
Pointer Alias Analysis
  Aliases refer to the same memory location
  Example: p = &x; creates the alias pair (*p,x)
  Whole-program flow- and context-sensitive analysis [Landi-Ryder]
  Fixed and non-fixed locations: x, s.f, *p, p->g
  Resolution of through-deref assignments
  Example: *p = 0;
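Resolving a through-deref assignment means finding which fixed locations the assignment may modify, given the alias pairs that hold at that statement. A minimal sketch, assuming alias pairs are stored as unordered string pairs and that `x`, `y`, and `s.f` are the fixed locations; the function name and the sample alias set are hypothetical.

```python
def modified_fixed_locations(deref_name, aliases, fixed):
    """Fixed locations that an assignment through deref_name may modify."""
    targets = set()
    for a, b in aliases:
        # Alias pairs are unordered, so check both orientations.
        if a == deref_name and b in fixed:
            targets.add(b)
        if b == deref_name and a in fixed:
            targets.add(a)
    return targets


# Assumed alias pairs holding just before the statement `*p = 0;`
aliases = {("*p", "x"), ("*p", "y"), ("*q", "y")}
fixed = {"x", "y", "s.f"}
resolved = modified_fixed_locations("*p", aliases, fixed)
```

Here `*p = 0;` resolves to two fixed locations, `x` and `y`. A more precise alias solution yields a smaller set for the same statement.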
Fragment Alias Analyses
  Input: whole-program flow-insensitive solution
    – Flow-insensitive analysis: almost linear time [Steensgaard, Zhang-Ryder-Landi]
  Basic analysis: assumptions at boundary
  Extended analysis: include called procedures; no boundary calls
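The almost-linear cost of such flow-insensitive analyses comes from unifying locations with a union-find structure instead of propagating sets. The sketch below is a simplified, Steensgaard-style unification written for illustration only: it always unifies on copies (the published algorithm is more careful), handles only address-of and copy statements, and the class name `Unifier` and the sample program are assumptions.

```python
class Unifier:
    """Union-find over abstract locations; each class has at most one target."""

    def __init__(self):
        self.parent = {}
        self.target = {}  # class representative -> location it points to
        self.fresh = 0

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]  # path halving
            n = self.parent[n]
        return n

    def pts(self, n):
        """Class that n points to, created lazily if n has no target yet."""
        r = self.find(n)
        if r not in self.target:
            self.fresh += 1
            self.target[r] = f"$loc{self.fresh}"
        return self.find(self.target[r])

    def join(self, a, b):
        """Merge two classes and, recursively, their pointed-to classes."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        ta, tb = self.target.pop(a, None), self.target.pop(b, None)
        self.parent[b] = a
        if ta is not None and tb is not None:
            self.target[a] = ta
            self.join(ta, tb)
        elif ta is not None or tb is not None:
            self.target[a] = ta if ta is not None else tb

    def addr_of(self, p, x):  # p = &x
        self.join(self.pts(p), x)

    def copy(self, p, q):  # p = q
        self.join(self.pts(p), self.pts(q))

    def may_alias_deref(self, p, x):  # may *p and x be the same location?
        return self.pts(p) == self.find(x)


# Hypothetical program:  p = &x;  q = &y;  p = q;  r = &z;
u = Unifier()
u.addr_of("p", "x")
u.addr_of("q", "y")
u.copy("p", "q")
u.addr_of("r", "z")
```

Statement order is ignored, and the copy `p = q` merges `x` and `y` into one class, so `*q` is reported as a possible alias of `x` even though `q` never points to `x`. This is the kind of imprecision the flow-sensitive fragment analyses are meant to reduce.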
Experiments
  Sun Sparc-20, 75 MHz, 352 MB
  6 data programs: 8K-25K LOC
  12 fragments:
    – Cohesive subsets of procedures implementing certain functionality
    – Size: 2%-22% of program size, median 7%
  Resolved through-deref assignments
    – Metric: average number of modified fixed locations
Analysis Precision
  [Figure]
Analysis Time
  Flow-insensitive analysis
    – Range: 2-9 s
    – Median: 7 s
  Basic analysis
    – Range: s
    – Median: 52 s
  Extended analysis
    – Range: s
    – Median: 85 s
Summary
  Fragment analysis as an alternative to whole-program analysis
  Theoretical issues of safety and feasibility
  Application using inexpensive whole-program analysis
  Initial experiments
    – Extended analysis: significant precision increase at a practical cost
  Ongoing work: scalability, incomplete programs
The New Lattice
  What is the set of names?
  Number of names should not depend on the size of the whole program
  Each whole-program name is:
    – preserved
    – ignored
    – represented by a placeholder
  One placeholder name per equivalence class
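The three cases above can be pictured as a mapping from whole-program names to fragment-lattice names. The slide does not state the criteria for each case, so the sketch below assumes them: names used in the fragment are preserved, other names in an equivalence class visible to the fragment share one placeholder, and the rest are ignored. The function name, the class numbering, and the sample data are all hypothetical.

```python
def project_name(name, fragment_names, class_of, visible_classes):
    """Map a whole-program name to a fragment-lattice name, or None."""
    if name in fragment_names:
        return name                  # preserved
    cls = class_of.get(name)
    if cls in visible_classes:
        return f"placeholder#{cls}"  # one placeholder per equivalence class
    return None                      # ignored


# Assumed inputs: equivalence classes as a flow-insensitive analysis
# might report them, and the classes the fragment can reach.
whole_program_names = ["x", "p", "g1", "g2", "g3", "tmp"]
fragment_names = {"x", "p"}
class_of = {"x": 0, "p": 1, "g1": 2, "g2": 2, "g3": 2, "tmp": 3}
visible_classes = {2}

projected = {
    n: project_name(n, fragment_names, class_of, visible_classes)
    for n in whole_program_names
}
lattice_names = {v for v in projected.values() if v is not None}
```

Under these assumptions the six whole-program names collapse to three lattice names. Adding more outside names to class 2 leaves that count unchanged, which is the property the slide asks for.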
Fragment Sizes
  [Figure]