Evaluating the Precision of Static Reference Analysis Using Profiling
Maikel Pennings, Donglin Liang, Mary Jean Harrold
Georgia Institute of Technology
Supported by NSF CCR , CCR , EIA , Boeing Aerospace Corp, State of Georgia Yamacraw Mission

Static Reference Analysis
Task 1: Identify instances using static names (naming scheme)

    1. A createA() {
    2.   return new A();
    3. }
    4. …
    5. p = createA();
    6. q = createA();

1. Use allocation site (level 0): p and q point to a2
2. Use allocation site + N topmost call sites (level N): p points to a2-5, q points to a2-6
3. Use allocation site + the allocation site for the receiver of the most recent call site

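The first two naming schemes on this slide can be sketched in a few lines of Java. This is only an illustration: the class `NamingScheme`, the method `staticName`, and the representation of a call stack as a list of line numbers (most recent call first) are assumptions made for the example, not part of the presented tool.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the slides): builds the static name of an
// instance from its allocation site and the call sites active at allocation.
class NamingScheme {
    // allocLine: line of the `new` expression.
    // callSites: lines of the active call sites, most recent first.
    // level: number of most recent call sites to append (0 = allocation site only).
    static String staticName(int allocLine, List<Integer> callSites, int level) {
        StringBuilder name = new StringBuilder("a" + allocLine);
        int n = Math.min(level, callSites.size());
        for (int k = 0; k < n; k++) {
            name.append('-').append(callSites.get(k));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        List<Integer> stackForP = Arrays.asList(5); // createA() called at line 5
        List<Integer> stackForQ = Arrays.asList(6); // createA() called at line 6
        System.out.println(staticName(2, stackForP, 0)); // a2
        System.out.println(staticName(2, stackForQ, 0)); // a2
        System.out.println(staticName(2, stackForP, 1)); // a2-5
        System.out.println(staticName(2, stackForQ, 1)); // a2-6
    }
}
```

At level 0 both instances collapse into the single name a2; at level 1 the call site separates them, matching the names shown on the slide.
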
Static Reference Analysis
Task 2: Determine the points-to set for each reference field or reference variable

    1. p = new A();
    2. p.m();
    3. p = new A();

Flow-sensitive vs. flow-insensitive
Calling-context-sensitive vs. calling-context-insensitive
Object-sensitive vs. object-insensitive

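To make the flow-sensitive vs. flow-insensitive distinction concrete, the sketch below computes the points-to set of p at the call on line 2 of the slide's fragment. It is a deliberately tiny model that assumes straight-line code and a single variable; the class and method names are invented for this example.

```java
import java.util.Set;
import java.util.TreeSet;

// Simplified sketch: straight-line code, one variable p.
// allocLines lists the lines where `p = new A()` occurs (1 and 3 on the slide).
class FlowSensitivityDemo {
    // Flow-insensitive: every allocation ever assigned to p is in p's set at every line.
    static Set<String> flowInsensitive(int[] allocLines) {
        Set<String> pts = new TreeSet<>();
        for (int line : allocLines) {
            pts.add("a" + line);
        }
        return pts;
    }

    // Flow-sensitive: at useLine, p refers only to the latest allocation before that line.
    static Set<String> flowSensitive(int[] allocLines, int useLine) {
        int latest = -1;
        for (int line : allocLines) {
            if (line < useLine && line > latest) {
                latest = line;
            }
        }
        Set<String> pts = new TreeSet<>();
        if (latest >= 0) {
            pts.add("a" + latest);
        }
        return pts;
    }

    public static void main(String[] args) {
        int[] allocLines = {1, 3};
        System.out.println(flowInsensitive(allocLines));  // [a1, a3]
        System.out.println(flowSensitive(allocLines, 2)); // [a1]
    }
}
```

The flow-insensitive result includes a3 even though that instance does not yet exist when line 2 executes, which is the kind of imprecision the later studies measure.
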
Evaluation Framework
Subject program -> JVM -> events -> Profiler -> dynamic information -> Mapping instances to static names -> reference profile
Subject program -> Static reference analysis -> reference information

Comparison of the Information

    …
    p.m();                 // C1
    for (I=0;I<2;I++) {
      p = new A();         // C2
      if (I==0) p.m1();    // C3
      else p.m2();         // C4
    }

                                  C1   C2   C3   C4
    Dynamic information     i1         X    X
                            i2         X         X
    Reference profile       N          X    X    X
    Reference information   N     X    X    X    X

Instances are identified using allocation site only
Study 1: Effectiveness (dynamic information vs. reference profile)
Study 2: Precision (reference profile vs. reference information)

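The step from dynamic information to a reference profile amounts to replacing each instance by its static name and merging the rows that share a name. The sketch below illustrates that merge. It assumes the table reads as follows: i1 is referenced at C2 and C3, i2 at C2 and C4, and both instances map to the single name N. The class `ReferenceProfileSketch` and its methods are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: merge per-instance rows into per-name rows.
class ReferenceProfileSketch {
    // dynamicInfo: instance id -> statements at which the profiler saw it referenced.
    // nameOf: instance id -> static name under the chosen naming scheme.
    static Map<String, Set<String>> build(Map<String, Set<String>> dynamicInfo,
                                          Map<String, String> nameOf) {
        Map<String, Set<String>> profile = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : dynamicInfo.entrySet()) {
            String name = nameOf.get(entry.getKey());
            profile.computeIfAbsent(name, k -> new TreeSet<>()).addAll(entry.getValue());
        }
        return profile;
    }

    // Assumed reading of the slide's table (see the text above).
    static Map<String, Set<String>> slideExample() {
        Map<String, Set<String>> dynamicInfo = new TreeMap<>();
        dynamicInfo.put("i1", new TreeSet<>(Arrays.asList("C2", "C3")));
        dynamicInfo.put("i2", new TreeSet<>(Arrays.asList("C2", "C4")));
        Map<String, String> nameOf = new TreeMap<>();
        nameOf.put("i1", "N");
        nameOf.put("i2", "N");
        return build(dynamicInfo, nameOf);
    }

    public static void main(String[] args) {
        System.out.println(slideExample()); // {N=[C2, C3, C4]}
    }
}
```
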
Subjects

    Program    LOC    #Classes    #Reached methods    #Covered methods
    Java_cup                                          (77%)
    Jess                                              (78%)
    Sablecc                                           (86%)

1: Evaluate Naming Schemes
Compute a precision reference value PRV[i] for each instance i

                               C1   C2   C3   C4   PRV
    Dynamic information   i1        X    X         0.67
                          i2        X         X
    Reference profile     N         X    X    X

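The slide gives only the value 0.67 for i1 and does not spell out the formula. One reading consistent with that value is the ratio of the number of statements at which the instance is actually referenced to the number of statements attributed to its static name in the reference profile (2/3 for i1). The sketch below computes PRV under that assumption; the definition, the class name, and the set contents are inferred, not confirmed by the slides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch under an ASSUMED definition:
// PRV[i] = |statements referencing i| / |statements in the profile row of i's name|.
class InstancePrvSketch {
    static double prv(Set<String> dynamicStmts, Set<String> profileStmts) {
        if (profileStmts.isEmpty()) {
            throw new IllegalArgumentException("reference profile row must not be empty");
        }
        return (double) dynamicStmts.size() / profileStmts.size();
    }

    // Assumed reading of the slide: i1 at C2 and C3; its name N at C2, C3, C4.
    static double slideExample() {
        Set<String> i1 = new HashSet<>(Arrays.asList("C2", "C3"));
        Set<String> n = new HashSet<>(Arrays.asList("C2", "C3", "C4"));
        return prv(i1, n);
    }

    public static void main(String[] args) {
        // Round to two decimals for display.
        System.out.println(Math.round(slideExample() * 100) / 100.0); // 0.67
    }
}
```

A PRV of 1.0 would mean the static name describes exactly what the instance does; lower values mean the name also covers statements that this instance never reaches.
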
Distribution of PRVs for Instances
[Figure: one panel each for Java_cup, Jess, and Sablecc]

Effectiveness at Allocation Sites
Compute the average of the PRVs for instances allocated at each allocation site
Let I(a) = {instances allocated at allocation site a}

\[ \text{Average PRV for allocation site } a = \frac{\sum_{i \in I(a)} \mathrm{PRV}[i]}{|I(a)|} \]

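The average for one allocation site is a plain arithmetic mean over the PRVs of the instances in I(a). A minimal sketch follows; the class name and the input values 0.5 and 1.0 are invented for illustration and are not measurements from the study.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: mean PRV over the instances allocated at one allocation site.
class AllocationSiteAverage {
    static double averagePrv(List<Double> prvsOfInstances) {
        if (prvsOfInstances.isEmpty()) {
            throw new IllegalArgumentException("I(a) must contain at least one instance");
        }
        double sum = 0.0;
        for (double prv : prvsOfInstances) {
            sum += prv;
        }
        return sum / prvsOfInstances.size();
    }

    public static void main(String[] args) {
        // Hypothetical PRVs for two instances allocated at the same site.
        System.out.println(averagePrv(Arrays.asList(0.5, 1.0))); // 0.75
    }
}
```
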
Average PRVs for Allocation Sites
[Table: allocation sites per average-PRV range (0.0,0.2), [0.2,0.4), [0.4,0.6), [0.6,0.8), [0.8,1.0), 1.0; rows: four naming-scheme levels for each of Java_cup, Jess, and Sablecc]

Case Study
[Table: allocation sites per average-PRV range (0.0,0.2), [0.2,0.4), [0.4,0.6), [0.6,0.8), [0.8,1.0), 1.0, for four naming-scheme levels of Java_cup]
Java_cup.lalr_item
Java_cup.lalr_item_set
Java_cup.terminal_set
java.util.Vector

Andersen’s Algorithm
Overview
    Identifies instances using only allocation sites (level-0 naming scheme)
    Context-insensitive and flow-insensitive
Our implementation
    Context-sensitive and model-based approach for calls to methods of library classes (e.g., Vector, Map, Set)
    Avoids analyzing library methods (more efficient)
    Computes more precise information for programs that use library classes

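For readers unfamiliar with the analysis, the sketch below shows the flow-insensitive, inclusion-based core in the spirit of Andersen's algorithm, restricted to two statement kinds: allocations (`p = new A()`) and copies (`q = p`). It is a simplified illustration, not the implementation described on the slide: it has no fields, no method calls, and none of the library models. The class name and the extra statement `q = p` are assumptions added for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified inclusion-based points-to sketch: allocations and copies only.
class AndersenSketch {
    private final Map<String, Set<String>> pts = new TreeMap<>();
    private final List<String[]> copies = new ArrayList<>(); // each entry is {dst, src}

    // var = new ... at allocation site `site` (level-0 name).
    void alloc(String var, String site) {
        pts.computeIfAbsent(var, k -> new TreeSet<>()).add(site);
    }

    // dst = src: pts(dst) must include pts(src).
    void copy(String dst, String src) {
        copies.add(new String[] {dst, src});
    }

    // Statement order is ignored (flow-insensitive); iterate until nothing changes.
    Map<String, Set<String>> solve() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] c : copies) {
                Set<String> src = pts.getOrDefault(c[1], Collections.<String>emptySet());
                Set<String> dst = pts.computeIfAbsent(c[0], k -> new TreeSet<>());
                if (dst.addAll(src)) {
                    changed = true;
                }
            }
        }
        return pts;
    }

    // p = new A() at line 1; q = p (hypothetical extra statement); p = new A() at line 3.
    static Map<String, Set<String>> example() {
        AndersenSketch analysis = new AndersenSketch();
        analysis.alloc("p", "a1");
        analysis.copy("q", "p");
        analysis.alloc("p", "a3");
        return analysis.solve();
    }

    public static void main(String[] args) {
        System.out.println(example()); // {p=[a1, a3], q=[a1, a3]}
    }
}
```

Because statement order is ignored, q is reported as possibly pointing to a3 even though the copy precedes that allocation.
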
2: Evaluate Andersen’s Algorithm
Compute a precision reference value PRV[A] for each allocation site A

                                 C1   C2   C3   C4   PRV
    Reference profile       A         X    X    X    0.75
    Reference information   A    X    X    X    X

Distribution of PRVs
Case Study
Total 74 allocation sites whose precision reference values are in (0,0.1)

    Class                 Allocation sites
    Jess.Context           3
    Jess.Funcall           1
    Jess.FuncallValue      6
    Jess.IntArrayValue     1
    Jess.Value            51
    Jess.ValueVector       6
    Jess.Variable          6

Conclusion
Study 1: Effectiveness of naming schemes
    Using allocation sites may be effective for instances allocated at many allocation sites
    Using allocation sites + N most recent call sites may increase the effectiveness
    The naming schemes may be ineffective for instances used for recursive data structures
Study 2: Precision of Andersen’s algorithm
    Andersen’s algorithm can be imprecise for many allocation sites, especially when instances are allocated to construct recursive data structures

Future work
Perform further empirical and case studies
    More subjects
    More test cases
Develop more effective naming schemes
Use insight to develop better reference analysis techniques

QUESTIONS?