Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011
Where Are We? Last two lectures: SSA Construction Insert ϕ -functions Rename Lots of optimizations This lecture Out-of-SSA Translation Using SSA: Dead Code Elimination COMP 512, Rice University2
3 SSA Deconstruction At some point, we need executable code Few machines implement ϕ operations Need to fix up the flow of values Basic idea Insert copies in ϕ -function pred’s Simple algorithm Works in most cases Adds lots of copies Most of them coalesce away X 17 ϕ (x 10,x 11 )... x x 17 X 17 x 10 X 17 x 11 *
Translation Out of SSA Form Two “classic” problems arise in SSA deconstruction Lost-copy problem Swap problem In each case, simple copy insertion produces incorrect code COMP 512, Rice University4 These problems were identified by Cliff Click and first published in “Improvements to the Construction and Destruction of Static Single Assignment Form”, (Briggs, Cooper, Harvey, & Simpson, SP&E, 1997). That paper presented ad-hoc ways of solving the problems.
The assignment to z now receives the wrong value. The Lost Copy Problem COMP 512, Rice University5 i ← 1 y ← i i ← i + 1 d ← y + … Original code i 0 ← 1 i 1 ← Φ(i 0,i 2 ) y 0 ← i 1 i 2 ← i z 0 ← y 0 + … In SSA form i 0 ← 1 i 1 ← Φ(i 0,i 2 ) i 2 ← i z 0 ← i 1 + … With copies folded i 0 ← 1 i 1 ← i 0 i 2 ← i i 1 ← i 2 z 0 ← i 1 + … Copies naïvely inserted To fix this problem, the compiler needs to create a temporary name to hold the penultimate value of i. Copy folding has nothing to do with either i or SSA form. Translation Out of SSA Form
Code is incorrect The Swap Problem COMP 512, Rice University6 x ← … y ← … t ← x x ← y y ← t Original code x 0 ← … y 0 ← … x 1 ← Φ(x 0,x 2 ) y 1 ← Φ(y 0,y 2 ) t 0 ← x 1 x 2 ← y 1 y 2 ← t 0 In SSA Form x 0 ← … y 0 ← … x 1 ← Φ(x 0,y 1 ) y 1 ← Φ(y 0,x 1 ) Copies folded x 0 ← … y 0 ← … x 1 ← x 0 y 1 ← y 0 x 1 ← y 1 y 1 ← x 1 Copies naïvely inserted This problem arises when a Φ -function argument is defined by a Φ -function in the same block. Requires one or more copies & temporary names. Translation Out of SSA Form
The Big Picture Insert parallel copies to convert SSA to conventional SSA In CSSA, we can just drop the subscripts on SSA name Coalesce aggressively, using a strong notion of interference x and y are copy connected & don’t interfere ⇒ coalesce Translate out of CSSA by renaming Sequentialize parallel copies ( may introduce new temporaries ) COMP 512, Rice University7
Convert Optimized SSA to CSSA For a ϕ -function a 0 ← ϕ (a 1,a 2, …, a n ) : 1.Insert a copy a 1 ’ ← a 1 at the end of the block corresponding to a 1 2.Replace a 0 ← ϕ (a 1,a 2, …, a n ) with a 0 ’ ← ϕ (a 1 ’,a 2 ’, …, a n ’ ) 3.Insert a copy a 0 ← a 0 ’ after the ϕ -function At this point, the compiler can replace all the a i ’ names with a single name and eliminate the ϕ -function. [ Sreedhar et al, SAS 99 ] Why does this work? Creates a new name for right-hand side of the ϕ -function Inserts copies into new name in each predecessor block Copies out of that name after all the ϕ -functions execute Treat multiple inserted copies in a block as parallel copies. COMP 512, Rice University8 Sketch of ideas from “Revisiting Out-of-SSA Translation for Correctness, Efficiency, and Speed”, Boissinot, Darte, de Dinechin, Guillon, and Rastello, Proceedings of CGO Isolates the merge
Coalesce Aggressively Can build interference graph & coalesce ( as in Chaitin ) Building the graph is expensive SSA allows us to detect value equivalence Boissinot et al. introduce an efficient set of tests to determine, for two sets of coalesced live ranges, if they interfere Fast test for liveness, fast test for value equivalence Given these two tests, they built an interference-graph free coalescing tool that is quite fast x ← y and x and y do not otherwise interfere ⇒ coalesce them Conceptually, however, you can think of the algorithm as using any strong coalescing algorithm ( not conservative coalescing ) COMP 512, Rice University9
Coalesce Aggressively Can build interference graph & coalesce ( as in Chaitin ) Building the graph is expensive SSA allows us to detect value equivalence Boissinot et al. introduce an efficient set of tests to determine, for two sets of coalesced live ranges, if they interfere Fast test for liveness, fast test for value equivalence Given these two tests, they built an interference-graph free coalescing tool that is quite fast Conceptually, however, you can think of the algorithm as using any strong coalescing algorithm ( not conservative coalescing ) COMP 512, Rice University10 Worklist ← copy operations while (Worklist is nonempty) pull x ← y from Worklist if x & y do not interfere combine them Difficulties in coalescing: 1.Order of coalescing matters 2.Impact of coalesce on accuracy of test for interference
Sequentializing Parallel Copies A set of parallel copies forms a graph Graph is either a tree or a cycle Schedule a tree, bottom up from the leaves Must break each cycle with an extra copy May require a new name Other copies may avoid the extra name Break on the value preserved in a non-cycle copy COMP 512, Rice University11 a←b; b←c; c←a; d←a ad b c Parallel copies Corresponding graph d←a a←b b←c c←d Serialized copies