Rethinking Soot for Summary-Based Whole-Program Analysis
Dacong Yan (1), Guoqing Xu (2), Atanas Rountev (1)
(1) Ohio State University
(2) University of California, Irvine
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University
Overview
Programs are built with reusable components
– Standard libraries in Java, C++, C#
– Domain-specific libraries and frameworks
Whole-program analysis
– Analysis of both application and library code
– Methods are analyzed under different contexts
Summary-based analysis
– Pre-analysis of library code (summary generation)
– Reuse result of pre-analysis (summary application)
Challenges
– Carefully designed abstractions and algorithms
– Infrastructure for summary generation and application
Case Study: An Alias Analysis [ISSTA’11]
m(a) {
  c = new …; // o1
  a.f = c;
  return c.g;
}
d = new …; // o2
b = m(d); // call m
Case Study: An Alias Analysis [ISSTA’11]
[Figure: graph representation of the example above. Labels in the figure: o1, o2, s_a, s_f, s_m, a, b, c, d, f, g, ret, entry_m, exit_m. Query shown on the graph: are b and d aliases?]
Case Study: An Alias Analysis [ISSTA’11]
[Figure: the same graph for the example above, annotated with summary(m), which relates s_a and s_g with the label "f, g"]
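The idea of generating a summary for m once and applying it at call sites can be illustrated with a minimal sketch. It assumes plain, unlabeled graph reachability, which is much simpler than the context-sensitive, field-sensitive alias analysis of the case study, and all node names ("m.a", "m.ret", "t1", and so on) are hypothetical. The summary keeps only reachability between the boundary nodes of a method, so a caller never has to traverse the method's internal nodes.

```python
from collections import deque


def reachable(edges, start):
    """Return the set of nodes reachable from start, including start."""
    seen = {start}
    work = deque([start])
    while work:
        node = work.popleft()
        for succ in edges.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen


def generate_summary(method_edges, boundary):
    """Summary generation: keep only boundary-to-boundary reachability."""
    summary = {}
    for node in boundary:
        targets = (reachable(method_edges, node) & boundary) - {node}
        if targets:
            summary[node] = targets
    return summary


def apply_summary(caller_edges, summary):
    """Summary application: add the summary edges to a copy of the caller graph."""
    merged = {node: set(succs) for node, succs in caller_edges.items()}
    for src, dsts in summary.items():
        merged.setdefault(src, set()).update(dsts)
    return merged


# Hypothetical graph of a library method m: formal "m.a", return node "m.ret",
# and internal nodes t1, t2, t3 that the caller should never need to visit.
m_edges = {"m.a": {"t1"}, "t1": {"t2", "t3"}, "t2": {"m.ret"}}
m_summary = generate_summary(m_edges, {"m.a", "m.ret"})

# Hypothetical caller: actual d flows into the formal, the return flows into b.
caller_edges = {"d": {"m.a"}, "m.ret": {"b"}}
with_summary = apply_summary(caller_edges, m_summary)
from_d = reachable(with_summary, "d")
```

In this sketch, `from_d` contains `b` but none of the internal nodes of m, which is the source of the savings: the internals are traversed once, during summary generation, rather than once per query.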
Experimental Evaluation
19 Java programs
– For all programs, more than 50% of nodes in the call graph are methods in the Java standard library
– For most programs, the percentage exceeds 80%
Two experiments
– Summary-based construction of program representation
– Summary-based computation of graph reachability
Results
– Significant potential savings in analysis running time
– Additional savings limited by infrastructure
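The library percentages above are ratios over call graph nodes. A minimal sketch of how such a ratio could be computed follows. It assumes nodes are given as fully qualified method names and that library methods can be recognized by a package prefix. The prefixes and the example nodes are illustrative assumptions, not the setup used in the evaluation.

```python
def library_fraction(call_graph_methods, library_prefixes=("java.", "javax.")):
    """Fraction of call graph nodes whose name starts with a library prefix."""
    methods = list(call_graph_methods)
    if not methods:
        return 0.0
    in_library = sum(1 for name in methods if name.startswith(library_prefixes))
    return in_library / len(methods)


# Hypothetical call graph nodes, written as "package.Class.method".
example_nodes = [
    "java.util.ArrayList.add",
    "java.lang.String.length",
    "javax.swing.JFrame.pack",
    "app.Main.main",
]
example_fraction = library_fraction(example_nodes)
```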
Discussion
Goal: support summary-based analysis in Soot
Problems with ad-hoc extensions
– Difficulty in code maintenance
– Difficulty in comparing analyses
– Limited benefits in summarization
Issues to consider
– Configuration mechanisms
– Management of summary information
– Verification of summary information
Discussion
Configuration mechanisms
– Customization to allow summary-based analysis
– Dependence between analyses
Management of summary information
– Unified summary APIs
– Techniques of data structure persistence
– Mapping back to program entities
Verification of summary information
– Consistency between summary and Jimple
– Code changes
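The management and verification points can be made concrete with a small sketch of a summary store. It is not a Soot API. The class `SummaryStore`, the signature string, and the use of a SHA-256 fingerprint of the method body text are all assumptions made for illustration. Summaries are keyed by method signature so they can be mapped back to program entities, persisted as JSON, and rejected when the body they were computed from has changed.

```python
import hashlib
import json


def fingerprint(body_text):
    """Hash of a method body's text, used to detect code changes."""
    return hashlib.sha256(body_text.encode("utf-8")).hexdigest()


class SummaryStore:
    """Hypothetical store: method signature -> (body fingerprint, summary)."""

    def __init__(self):
        self._entries = {}

    def put(self, signature, body_text, summary):
        # The summary must be JSON-serializable (dicts, lists, strings).
        self._entries[signature] = {
            "fingerprint": fingerprint(body_text),
            "summary": summary,
        }

    def get(self, signature, current_body_text):
        """Return the summary only if it is still consistent with the body."""
        entry = self._entries.get(signature)
        if entry is None:
            return None
        if entry["fingerprint"] != fingerprint(current_body_text):
            return None  # stale: the method changed after summarization
        return entry["summary"]

    def dumps(self):
        """Persist all entries as a JSON string."""
        return json.dumps(self._entries, sort_keys=True)

    @classmethod
    def loads(cls, text):
        """Rebuild a store from a JSON string produced by dumps()."""
        store = cls()
        store._entries = json.loads(text)
        return store


# Hypothetical signature and body text standing in for a library method.
sig = "<Lib: Object m(Object)>"
body = "c = new Object; a.f = c; return c.g;"
store = SummaryStore()
store.put(sig, body, {"m.a": ["m.ret"]})

restored = SummaryStore.loads(store.dumps())
fresh = restored.get(sig, body)
stale = restored.get(sig, body + " // edited")
```

Here `fresh` is the stored summary and `stale` is `None`, so an analysis using this store would fall back to re-analyzing the changed method. Hashing the body text is only one possible consistency check; comparing against the intermediate representation directly would be another.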
Conclusion
Case study on an alias analysis
Potential savings in analysis running time with summarization
Discussion on supporting summary-based analysis
– Configuration mechanisms
– Management of summary information
– Verification of summary information
Thank you