SOOT By Joe Palmer Information taken from
General Overview
  Developed by the Sable Research Group at McGill University
  Used to optimize Java bytecode
  4 source languages
  4 intermediate representations used
Source Languages
  Primarily takes Java source as its input
  Can also take:
    SML
    Scheme
    Eiffel
I.R.’s
  Baf:
    Streamlined, stack-based representation of bytecode
    Abstracts type-dependent variations of expressions into a single expression
  Jimple:
    Stack-less, typed, 3-address representation of bytecode
    Mix between Java source and Java bytecode
    Linearization of a single expression into 3 separate statements
    Only refers to 3 local variables or constants at once
    Only 15 Jimple instructions are used
    Compared to 200 possible instructions in Java bytecode!
  Shimple:
    SSA-form version of Jimple
    Each local variable has a single static point of definition (never reassigned)
    Uses Phi-nodes for control flow
  Grimp:
    Similar to Jimple but allows trees of expressions together with a representation of a “new” operator
    Expressions are “aggregated”
    Main IR used!!
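The Jimple idea of linearizing one nested expression into 3-address statements can be illustrated with a small sketch. This is not Soot's API and the output is not real Jimple syntax: the class `ThreeAddressSketch`, its `Leaf` and `BinOp` node types, and the `$t0`-style temporary names are all hypothetical, chosen only to show how each emitted statement ends up referring to at most three locals or constants.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: flattens an expression tree into 3-address statements.
// It only imitates the shape of a Jimple-like IR; it does not use Soot.
public class ThreeAddressSketch {
    interface Expr {}

    // A local variable or a constant.
    static final class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    // A binary operation over two sub-expressions.
    static final class BinOp implements Expr {
        final String op;
        final Expr left;
        final Expr right;
        BinOp(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> statements = new ArrayList<>();
    private int tempCounter = 0;

    // Returns a name holding the value of e, emitting a temporary
    // assignment for every nested binary operation.
    private String operand(Expr e) {
        if (e instanceof Leaf) {
            return ((Leaf) e).name;
        }
        BinOp b = (BinOp) e;
        String left = operand(b.left);
        String right = operand(b.right);
        String temp = "$t" + tempCounter++;
        statements.add(temp + " = " + left + " " + b.op + " " + right);
        return temp;
    }

    // Linearizes "target = e" into statements of the form "x = y op z".
    static List<String> linearize(String target, Expr e) {
        ThreeAddressSketch s = new ThreeAddressSketch();
        if (e instanceof BinOp) {
            BinOp b = (BinOp) e;
            String left = s.operand(b.left);
            String right = s.operand(b.right);
            s.statements.add(target + " = " + left + " " + b.op + " " + right);
        } else {
            s.statements.add(target + " = " + s.operand(e));
        }
        return s.statements;
    }

    public static void main(String[] args) {
        // x = (a + b) * (c - 1)
        Expr e = new BinOp("*",
                new BinOp("+", new Leaf("a"), new Leaf("b")),
                new BinOp("-", new Leaf("c"), new Leaf("1")));
        for (String stmt : linearize("x", e)) {
            System.out.println(stmt);
        }
        // prints:
        // $t0 = a + b
        // $t1 = c - 1
        // x = $t0 * $t1
    }
}
```

In this sketch the single source expression becomes three separate statements, each naming one destination and two operands. A stack-based form such as Baf would instead express the same computation as pushes and pops with no named temporaries.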
Phases of the Optimization
Analysis
  Tested using 8 SPECjvm98 benchmarks running on JDK 1.2
  Showed:
    8% improvement when optimized bytecode is run using an interpreter
    21% improvement when optimized bytecode is run using a JIT compiler
  Used in research with traditional compiler analyses, analyses for software engineering, analysis for distributed programs, and software verification
    Ptolemy Project
    Bandera
    Canvas Project
Strengths and Future Enhancements
  Used as a common infrastructure with which researchers could compare common analyses
  Enhancements coming:
    Attribute management
    Attribute legends
    Improved visual attributes in source
    Interactive CFGs
    Growable graphical callgraph
    Making conversion from Java to Jimple more stable and complete