Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities
Luca Della Toffola – ETH Zurich
Michael Pradel – TU Darmstadt
Thomas R. Gross – ETH Zurich
October 30th, OOPSLA15

MemoizeIt
Dynamic analysis
Memoization opportunities
Automatic
9 new real-world memoization opportunities
Apache POI – Issue Performance Issue
Apache POI – Issue
public boolean DateUtil.isADateFormat(int idx, String format) {
    StringBuilder sb = new StringBuilder(format.length());
    // Iterate over the input format; sb is still empty at this point
    for (int i = 0; i < format.length(); i++) {
        // Modify format and write to sb
    }
    String f = sb.toString();
    // Process f using date pattern matching
    return date_ptrn.matcher(f).matches();
}
Apache POI – Issue
Java profiler: ranked 10 (189), 4000 calls
Java profiler: no additional bottleneck info
Apache POI – Issue
Research tools: symptoms are not there*
No nested loops
No memory bloat
* [Nistor, ICSE13], [Xu, OOPSLA12]
Apache POI – Issue
Observation: many calls have the same input and output values!
Input: parameters + accessed fields
Output: returned value
0, “m/d/yy” → true
1, “h:mm” → false
Memoization?
Apache POI – Issue
Purity analysis? Too conservative!
Side effects
Ignore side effects!
Apache POI – Issue
MemoizeIt: 1st ranked method!
MemoizeIt finds calls with the same input and output values.
Memoization!
Apache POI – Issue
Single-entry instance cache: up to 25% speed-up!
boolean cache_value;
int cache_key1;
String cache_key2;

public boolean isADateFormatSlow(int idx, String format) {
    // Slow isADateFormat code
}

public boolean isADateFormat(int idx, String format) {
    // format.equals(null) is false, so an empty cache never reports a hit
    if (cache_key1 == idx && format.equals(cache_key2)) {
        return cache_value;
    }
    // Update cache keys and value
    boolean result = isADateFormatSlow(idx, format);
    cache_key1 = idx;
    cache_key2 = format;
    cache_value = result;
    return result;
}
MemoizeIt – Contributions
1. Automatic analysis to find memoization opportunities
2. Suggest fix configurations for candidate methods
MemoizeIt – Contributions
Challenge: inputs can be complex objects on the heap, e.g. boolean DateUtil.isADateFormat(int idx, MyClass format)
MemoizeIt – Contributions
MemoizeIt == Memoization + Iterative
MemoizeIt
Program + Input → CPU-Time Profiling → Initial method candidates
Filtering of methods:
1. Number of executions
2. Average execution time
3. Relative execution time
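The three filters above could be applied along the following lines. This is only a sketch: `MethodProfile`, `CandidateFilter`, and the threshold parameters are hypothetical names introduced for illustration, and the talk does not state which threshold values MemoizeIt uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-method profile; field names are illustrative only.
final class MethodProfile {
    final String name;
    final long calls;          // number of executions
    final double totalMillis;  // CPU time spent in the method

    MethodProfile(String name, long calls, double totalMillis) {
        this.name = name;
        this.calls = calls;
        this.totalMillis = totalMillis;
    }

    double averageMillis() {
        return calls == 0 ? 0.0 : totalMillis / calls;
    }
}

final class CandidateFilter {
    // All thresholds are assumed parameters, not values from the talk.
    static List<MethodProfile> select(List<MethodProfile> profiles,
                                      double programMillis,
                                      long minCalls,
                                      double minAvgMillis,
                                      double minRelativeTime) {
        List<MethodProfile> candidates = new ArrayList<>();
        for (MethodProfile p : profiles) {
            // Share of the whole program's CPU time spent in this method.
            double relative = programMillis == 0 ? 0.0 : p.totalMillis / programMillis;
            if (p.calls >= minCalls
                    && p.averageMillis() >= minAvgMillis
                    && relative >= minRelativeTime) {
                candidates.add(p);
            }
        }
        return candidates;
    }
}
```

A method called only once cannot benefit from a cache, and a method that is very cheap per call or contributes little to total time is unlikely to repay the lookup cost, which is presumably why all three criteria are combined.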
MemoizeIt
Program + Input → CPU-Time Profiling → Input-Output Profiling
Input-Output Profiling
Input: parameters + accessed fields
Output: returned value
Input-output tuple (T)
1. For each call of a candidate method
2. Trace method input-output values
3. Select method candidates
Example: boolean DateUtil.isADateFormat(int idx, String format)
0, “m/d/yy” → true
1, “h:mm” → false
multiplicity(T1) = 3, multiplicity(T2) = 2
Repeated input-output → memoization
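Counting tuple multiplicities can be sketched as follows. `IoTrace` and its methods are hypothetical names, and the sketch assumes that inputs and outputs are plain values with stable `equals`/`hashCode`; it does not capture accessed fields or complex heap objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracer; class and method names are illustrative only.
final class IoTrace {
    // Input-output tuple (inputs followed by the output) -> multiplicity.
    private final Map<List<Object>, Integer> multiplicity = new HashMap<>();

    private static List<Object> tuple(Object output, Object[] inputs) {
        List<Object> t = new ArrayList<>(Arrays.asList(inputs));
        t.add(output);
        return t;
    }

    // Record one observed call of the candidate method.
    void record(Object output, Object... inputs) {
        multiplicity.merge(tuple(output, inputs), 1, Integer::sum);
    }

    int multiplicityOf(Object output, Object... inputs) {
        return multiplicity.getOrDefault(tuple(output, inputs), 0);
    }

    // A method stays a candidate only if some tuple was observed more than once.
    boolean hasRepeatedTuple() {
        for (int m : multiplicity.values()) {
            if (m > 1) {
                return true;
            }
        }
        return false;
    }
}
```

Recording three calls with inputs (0, “m/d/yy”) and two calls with inputs (1, “h:mm”) reproduces the multiplicities 3 and 2 from the slide.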
Challenge – Complex Objects
boolean DateUtil.isADateFormat(int idx, MyClass format)
Challenge – Complex Objects
Structural and content equivalence: equals?
MyClass { x: 45, y: 1, z: B { a: … } }
MyClass { x: 45, y: 0, z: B { a: … } }
Challenge – Complex Objects
MyClass { x: 45, y: 1, z: B { a: … } }
flat(object) = (MyClass 1, [45, 1, (B 1, [...])])
Challenge – Complex Objects
Heap: MyClass { x: 45, y: 1, z: B { a: … } }
Can’t keep everything!
Challenge – Complex Objects
Heap, ref 1 and ref 2: equals?
MyClass { x: 45, y: 0, z: B { a: … } }
MyClass { x: 45, y: 1, z: B { a: … } }
Traversal levels: depth = 1, depth = 2
Exhaustive traversal is expensive!
Solution – Iterative Profiling
Heap, ref 1 and ref 2: equals?
MyClass { x: 45, y: 0, z: B { a: … } }
MyClass { x: 45, y: 1, z: B { a: … } }
Traversal levels: depth = 1, depth = 2
Iterative approach can analyze programs with complex structures
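A depth-bounded flattening of an object can be sketched with reflection. This is an illustration under assumptions rather than MemoizeIt's implementation: `Flattener` is a hypothetical name, the output format only loosely follows the flat(object) notation, and `MyClass` and `B` are demo classes mirroring the slide.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: flattens an object's fields down to a given depth.
final class Flattener {
    static String flat(Object o, int depth) {
        if (o == null) {
            return "null";
        }
        // Treat boxed primitives and strings as leaf values.
        if (o instanceof Number || o instanceof Boolean
                || o instanceof Character || o instanceof String) {
            return String.valueOf(o);
        }
        String type = o.getClass().getSimpleName();
        if (depth == 0) {
            return "(" + type + ", [...])"; // depth budget exhausted: do not look inside
        }
        List<String> parts = new ArrayList<>();
        for (Class<?> c = o.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                try {
                    f.setAccessible(true);
                    parts.add(flat(f.get(o), depth - 1));
                } catch (IllegalAccessException | RuntimeException e) {
                    parts.add("?"); // field not accessible: treat as opaque
                }
            }
        }
        return "(" + type + ", [" + String.join(", ", parts) + "])";
    }
}

// Demo classes mirroring the slide's object graph.
final class B {
    int a;
    B(int a) { this.a = a; }
}

final class MyClass {
    int x;
    int y;
    B z;
    MyClass(int x, int y, B z) { this.x = x; this.y = y; this.z = z; }
}
```

Two `MyClass` objects that differ only in `z.a` flatten to the same string at depth 1 and to different strings at depth 2, which is the kind of difference a deeper profiling iteration can expose.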
MemoizeIt
Program + Input → CPU-Time Profiling → Input-Output Profiling → Candidates ranking → Fix suggestions
Iterative input-output profiling:
d = 1
Input-Output Profiling of the initial method candidates
Filter method candidates
if max depth || time limit: exit()
otherwise: new candidates, depth++
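The loop in this diagram can be sketched as a small driver. `IterativeDriver` and the `IoProfiler` hook are hypothetical; the hook stands in for one profiling run of the program at a given traversal depth and is assumed to return the candidates that still look memoizable.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class IterativeDriver {
    // Hypothetical hook: profile the candidates at the given depth and
    // return those whose input-output tuples still repeat.
    interface IoProfiler {
        Set<String> profile(Set<String> candidates, int depth);
    }

    static Set<String> run(Set<String> initialCandidates,
                           IoProfiler profiler,
                           int maxDepth,
                           long timeLimitMillis) {
        long deadline = System.currentTimeMillis() + timeLimitMillis;
        Set<String> candidates = new LinkedHashSet<>(initialCandidates);
        // Start at depth 1 and deepen until a limit is hit or nothing is left.
        for (int depth = 1; !candidates.isEmpty(); depth++) {
            candidates = new LinkedHashSet<>(profiler.profile(candidates, depth));
            if (depth >= maxDepth || System.currentTimeMillis() >= deadline) {
                break;
            }
        }
        return candidates;
    }
}
```

Because each round only profiles the survivors of the previous one, methods that are ruled out at a shallow depth never pay for deep traversal of their inputs.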
MemoizeIt
Program + Input → CPU-Time Profiling → Input-Output Profiling → Ranking of Candidates → Ranked candidate methods
Ranking based on:
1. Estimated saved time
2. Estimated hit ratio
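One way such a ranking could be computed is sketched below. The formulas are assumptions made for illustration, not the definitions used by MemoizeIt: the hit ratio is estimated as the fraction of calls whose input was already seen, and the saved time as the method's total time multiplied by that ratio. `Candidate` and `Ranking` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical candidate summary derived from the profiling data.
final class Candidate {
    final String name;
    final long calls;           // observed calls
    final long distinctInputs;  // distinct input values among those calls
    final double totalMillis;   // time spent in the method

    Candidate(String name, long calls, long distinctInputs, double totalMillis) {
        this.name = name;
        this.calls = calls;
        this.distinctInputs = distinctInputs;
        this.totalMillis = totalMillis;
    }

    // Assumed estimate: every call after the first per distinct input is a hit.
    double hitRatio() {
        return calls == 0 ? 0.0 : (double) (calls - distinctInputs) / calls;
    }

    // Assumed estimate: hits cost nothing, misses cost the average call time.
    double savedMillis() {
        return totalMillis * hitRatio();
    }
}

final class Ranking {
    // Highest estimated saving first.
    static List<Candidate> rank(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.savedMillis(), a.savedMillis()));
        return sorted;
    }
}
```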
MemoizeIt
Program + Input → CPU-Time Profiling → Input-Output Profiling → Ranking of Candidates → Fix Suggestions
Ranked candidate methods → Optimal cache configuration
Suggests a configuration among:
Single Instance
Single Global
Multi Instance
Multi Global
+ need for invalidation
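To contrast with a single-entry instance cache, a multi-entry global cache with explicit invalidation might look like the sketch below. `MultiGlobalCache` is a hypothetical name, the slow method body is a stand-in regular expression rather than Apache POI's logic, and the plain `HashMap` makes the sketch unsuitable for concurrent use.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;

final class MultiGlobalCache {
    // "Global": one static cache shared by all callers.
    // "Multi": one entry per distinct (idx, format) input pair.
    private static final Map<SimpleImmutableEntry<Integer, String>, Boolean> CACHE =
            new HashMap<>();

    // Counts executions of the slow path, for demonstration only.
    static int slowCalls = 0;

    // Stand-in for the expensive computation; NOT the real POI logic.
    static boolean isADateFormatSlow(int idx, String format) {
        slowCalls++;
        return format.matches("[dmyhs/:\\- ]+");
    }

    static boolean isADateFormat(int idx, String format) {
        SimpleImmutableEntry<Integer, String> key = new SimpleImmutableEntry<>(idx, format);
        return CACHE.computeIfAbsent(key, k -> isADateFormatSlow(k.getKey(), k.getValue()));
    }

    // Call when state the method depends on changes, so stale results are dropped.
    static void invalidate() {
        CACHE.clear();
    }
}
```

An instance variant would hold the map in a non-static field, and a single-entry variant would keep only the most recent key and value, as in the Apache POI fix shown earlier.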
Experimental Setup
Program                  Description
DaCapo 2006 MR2          antlr, bloat, chart, fop, luindex, pmd
Checkstyle - 5.6         Source-code style checker
Soot – ae0cec69c0        Static program analysis / manipulation
Apache Tika - 1.3        Content analysis toolkit
Apache POI - 3.9         MS Office documents manipulation
Evaluation – Research Question
Is MemoizeIt effective at finding new memoization opportunities?
1. Manually select realistic input
2. Execute MemoizeIt
3. Manually inspect methods
4. Implement MemoizeIt’s suggestions
Timeout for profiling: 1 hour
Evaluation – Results
Is MemoizeIt effective at finding new memoization opportunities?
31 memoization opportunities
9 new opportunities: DaCapo-antlr, DaCapo-bloat, DaCapo-fop, Soot, Apache-Tika, Apache-POI, Checkstyle
1 duplicate method in Apache-Tika, Apache-POI
Evaluation – Results
Columns: Small workload [speed-up], Large workload [speed-up]
DaCapo-antlr 1.04 ± ± 0.02
DaCapo-bloat 1.08 ±
DaCapo-fop 1.05 ± 0.01 NA
Checkstyle ± 0.10
Soot 1.27 ± ± 0.05
Apache-Tika Excel ± 0.02
Apache-Tika Jar 1.09 ± ± 0.02
Apache-POI (1) 1.11 ± ± 0.01
Apache-POI (2) 1.07 ± ± 0.01
Evaluation – Research Question
Is the iterative or exhaustive approach more efficient?
Evaluation – Results
Is the iterative or exhaustive approach more efficient?
Columns: Iterative Time [minutes], Exhaustive Time [minutes]
DaCapo-antlr timeout
DaCapo-bloat timeout
DaCapo-chart 22
DaCapo-fop 18 timeout
DaCapo-luindex 32 timeout
DaCapo-pmd timeout
Checkstyle 622
Soot timeout
Apache-Tika Excel 5856
Apache-Tika Jar 4135
Apache-POI 2337
Legend: Iterative wins, Exhaustive wins
Related Work
Performance problems
Detecting: [Xu, OOPSLA12], [Zaparanuks, PLDI12]
Understanding: [Song, OOPSLA14], [Yu, ASPLOS14]
Fixing: [Nistor, ICSE15]
Compiler optimizations: [Ding, CGO04], [Costa, CGO13], [St-Amour, OOPSLA12]
Incremental computations: [Pugh, POPL89]
Other caching techniques: [Ma, WWW15]
Conclusions
Profiling of memoization opportunities
New real-world opportunities
Relevant speed-ups
Iterative strategy beneficial
Suggests cache configurations
Suggestions easy to implement
Artifact evaluated