Interpreting Java Program Runtimes Stuart Hansen UW - Parkside
The Problem: The JVM influences program runtimes in complex and mysterious ways The Real Problem: It is VERY frustrating to obtain runtimes that you can’t explain
Algorithm Analysis One goal of algorithm analysis is to predict the resources (e.g. time) that an implementation will require There are many confounding influences: speed of CPU, implementation language, load on the system, compiler, size of physical memory, operating system
Algorithm Analysis cont. We use the confounding influences as pedagogic tools CPU time vs. elapsed time Physical memory vs. virtual memory Compiler optimizations
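The CPU time vs. elapsed time distinction can be shown with a few lines of code. This is a minimal sketch, assuming Java 5 or later (java.lang.management.ThreadMXBean and System.nanoTime() are not available in SDK 1.4.2); the class name CpuVsElapsed and the busyWork workload are hypothetical and chosen only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVsElapsed {
    // Keeps results alive so the workload is not optimized away.
    static volatile long sink;

    // Hypothetical CPU-bound workload.
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    // Returns {elapsedNanos, cpuNanos}; cpuNanos is -1 if the JVM cannot report it.
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();
        long cpuStart = cpuSupported ? bean.getCurrentThreadCpuTime() : -1;
        long wallStart = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - wallStart;
        long cpu = cpuSupported ? bean.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] { elapsed, cpu };
    }

    public static void main(String[] args) {
        long[] busy = measure(new Runnable() {
            public void run() { sink = busyWork(20000000); }
        });
        // Sleeping adds to elapsed time but uses almost no CPU time.
        long[] idle = measure(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        System.out.println("busy: elapsed=" + busy[0] + " ns, cpu=" + busy[1] + " ns");
        System.out.println("idle: elapsed=" + idle[0] + " ns, cpu=" + idle[1] + " ns");
    }
}
```

On a lightly loaded machine the two numbers for the busy task should be close, while the sleeping task should show a large elapsed time and a small CPU time.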
Java VM Presents New Issues What are they? Can we control them? Can we use them as teaching aids?
Lack of Documentation Lots of material on Java performance with lots of practical advice None explains why the runtimes of my BubbleSort code don’t graph as a parabola
Three Examples Calls to Arrays.sort Modified Mergesort Rehashing All experiments performed using Sun’s Java SDK 1.4.2 under Debian Linux
Arrays.sort( ) Compare runtimes of CS1 sorts to Arrays.sort()
Dynamic Class Loading Java loads a class only when it is first referenced Disk IO is slow Can force pre-loading using Class.forName()
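Pre-loading with Class.forName() might look like the sketch below. The class name PreloadDemo, the preload helper, and the array size are hypothetical. Which classes a sort actually touches depends on the JDK version, so the list passed to preload is an assumption to adjust for the system under test.

```java
import java.util.Arrays;
import java.util.Random;

class PreloadDemo {
    // Asks the JVM to load and initialize each named class before timing begins.
    // Returns how many of the names were loaded successfully.
    static int preload(String[] classNames) {
        int loaded = 0;
        for (String name : classNames) {
            try {
                Class.forName(name);
                loaded++;
            } catch (ClassNotFoundException e) {
                System.err.println("Could not preload " + name);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        // Assumption: java.util.Arrays is the class whose loading we want
        // to keep out of the timed region.
        preload(new String[] { "java.util.Arrays" });

        int[] data = new int[100000];
        Random rng = new Random(42);
        for (int i = 0; i < data.length; i++) {
            data[i] = rng.nextInt();
        }

        long start = System.currentTimeMillis();
        Arrays.sort(data);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Arrays.sort took " + elapsed + " ms");
    }
}
```

Running the timing with and without the preload call is one way to see how much of the first measurement is class loading rather than sorting.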
Side Issue – Java Startup Simplest Java application public class Simple { public static void main (String [] args) { } } Loads 280 classes
Mergesort Requires an auxiliary array for merging Should only be as big as the subarrays to be merged Too large an array degrades performance to O(n^2) Let students work with and improve bad code
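One way an oversized auxiliary array can arise is sketched below. This is not the code used in the experiments; the class name MergeSortAux and the oversized flag are hypothetical. When the flag is set, every merge allocates an array as long as the whole input. There are n - 1 merges and each allocation costs time proportional to n, so the total work grows as n^2.

```java
import java.util.Random;

class MergeSortAux {
    // Sorts a in place. If oversized is true, each merge allocates a
    // full-length auxiliary array instead of one sized to the subarrays.
    static void sort(int[] a, boolean oversized) {
        if (a.length > 1) {
            sort(a, 0, a.length - 1, oversized);
        }
    }

    private static void sort(int[] a, int lo, int hi, boolean oversized) {
        if (lo >= hi) {
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid, oversized);
        sort(a, mid + 1, hi, oversized);
        merge(a, lo, mid, hi, oversized);
    }

    private static void merge(int[] a, int lo, int mid, int hi, boolean oversized) {
        int size = hi - lo + 1;
        // The "bad" variant: the JVM must allocate and zero a.length ints per merge.
        int[] aux = new int[oversized ? a.length : size];
        int i = lo, j = mid + 1, k = 0;
        while (i <= mid && j <= hi) {
            aux[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i <= mid) {
            aux[k++] = a[i++];
        }
        while (j <= hi) {
            aux[k++] = a[j++];
        }
        System.arraycopy(aux, 0, a, lo, size);
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        for (int n = 5000; n <= 20000; n *= 2) {
            int[] data = new int[n];
            for (int i = 0; i < n; i++) {
                data[i] = rng.nextInt();
            }
            int[] copy = data.clone();

            long start = System.currentTimeMillis();
            sort(data, true);
            long bad = System.currentTimeMillis() - start;

            start = System.currentTimeMillis();
            sort(copy, false);
            long good = System.currentTimeMillis() - start;

            System.out.println("n=" + n + " oversized=" + bad + " ms, fitted=" + good + " ms");
        }
    }
}
```

Students can be given the oversized variant and asked to find and fix the allocation.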
Results
Finding the Parabola Limit the graph to the smaller array sizes
Heap Management Generational Garbage Collector Major and minor garbage collections Alternative is incremental garbage collection Controlling Heap size Grows as needed by default -Xms and -Xmx options
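Heap growth can be observed from inside a program with java.lang.Runtime. The sketch below uses a hypothetical class name, HeapReport, and an arbitrary allocation pattern. Launching it as "java -Xms256m -Xmx256m HeapReport" (256m is just an example value) should keep the total heap fixed, while launching it with no options lets the total grow as needed. Adding -verbose:gc to the command line prints a line for each collection.

```java
class HeapReport {
    // Returns {totalBytes, freeBytes, maxBytes} as reported by the JVM.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.totalMemory(), rt.freeMemory(), rt.maxMemory() };
    }

    static void print(String label, long[] s) {
        System.out.println(label + ": total=" + s[0] / 1024 + " KB, free="
                + s[1] / 1024 + " KB, max=" + s[2] / 1024 + " KB");
    }

    public static void main(String[] args) {
        print("start", snapshot());

        // Arbitrary workload: hold on to 32 blocks of one million ints each
        // (about 128 MB in total), so the heap has to accommodate them.
        int[][] blocks = new int[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new int[1000000];
            if (i % 8 == 7) {
                print("after " + (i + 1) + " blocks", snapshot());
            }
        }
        // Use the array so it stays reachable until the end of the loop.
        System.out.println("blocks allocated: " + blocks.length);
    }
}
```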
Incremental GC
Things to Note Still not perfectly smooth Incremental GC creates a performance hit of about 10%
Rehashing Suggested by Mike Clancy during panel discussion at SIGCSE 2002 Create a Hashtable of a fixed size Load with different amounts of data Explicitly invoke rehash()
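Hashtable.rehash() is a protected method, so invoking it explicitly requires a subclass. The sketch below is one way to set up the experiment, not the code used for the results; the class name TimedHashtable, the timeRehash helper, and the table and data sizes are hypothetical. It assumes Java 5 or later for generics and System.nanoTime().

```java
import java.util.Hashtable;

class TimedHashtable extends Hashtable<Integer, Integer> {
    private static final long serialVersionUID = 1L;

    TimedHashtable(int initialCapacity) {
        super(initialCapacity);
    }

    // Calls the protected rehash() and returns how long it took in nanoseconds.
    long timeRehash() {
        long start = System.nanoTime();
        rehash();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Assumption: a capacity large enough that loading the data never
        // triggers an automatic rehash (default load factor is 0.75).
        int capacity = 1 << 20;
        for (int n = 100000; n <= 400000; n *= 2) {
            TimedHashtable table = new TimedHashtable(capacity);
            for (int i = 0; i < n; i++) {
                table.put(i, i);
            }
            long nanos = table.timeRehash();
            System.out.println("n=" + n + " rehash=" + nanos / 1000 + " us");
        }
    }
}
```

Because the table size is fixed, the rehash time is a sum of a cost proportional to the capacity and a cost proportional to the number of entries, which is the relationship the experiment tries to make visible.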
Rehashing Results
Optimizations in Java Dynamic Compilation: byte code to native code Dynamic Optimization: e.g. method inlining Both are hard to control
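The effect of dynamic compilation can often be seen by timing the same method several times in one run. The sketch below uses a hypothetical class name, WarmUpDemo, and an arbitrary workload, and assumes Java 5 or later for System.nanoTime(). Early rounds are frequently slower than later ones, but the exact pattern depends on the JVM and is not guaranteed. Running with the -Xint option, which disables compilation to native code, gives a baseline for comparison.

```java
class WarmUpDemo {
    // Keeps results alive so the JVM cannot discard the workload.
    static volatile long sink;

    // Arbitrary workload: the same computation on every call.
    static long work(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i % 3;
        }
        return sum;
    }

    // Times `rounds` identical calls to work(n); returns nanoseconds per round.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = work(n);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 5000000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] / 1000 + " us");
        }
    }
}
```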
Rehashing with Large Initial Heap Lets more time be dedicated to rehashing
Remaining Issues Each data set is still analyzed on an ad hoc basis No consistent set of instructions to give to students to get meaningful runtimes No handle on Java 5
Conclusion Highly recommend experimentation in classes Need tenacity to explain the results Questions?