Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.

Slides:



Advertisements
Similar presentations
Code Optimization, Part II Regional Techniques Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Advertisements

Comparison and Evaluation of Back Translation Algorithms for Static Single Assignment Form Masataka Sassa #, Masaki Kohama + and Yo Ito # # Dept. of Mathematical.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Operator Strength Reduction From Cooper, Simpson, & Vick, “Operator Strength Reduction”, ACM TOPLAS, 23(5), See also § of EaC2e. 1COMP 512,
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
SSA.
Stanford University CS243 Winter 2006 Wei Li 1 Register Allocation.
Register Allocation CS 671 March 27, CS 671 – Spring Register Allocation - Motivation Consider adding two numbers together: Advantages: Fewer.
A Deeper Look at Data-flow Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University.
Code Motion of Control Structures From the paper by Cytron, Lowry, and Zadeck, COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda.
SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon,
The Last Lecture Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission.
Lazy Code Motion Comp 512 Spring 2011
Loop Invariant Code Motion — classical approaches — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
Code Shape III Booleans, Relationals, & Control flow Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
CMPUT Compiler Design and Optimization1 CMPUT680 - Winter 2006 Topic H: SSA for Predicated Code José Nelson Amaral
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Dynamic Optimization as typified by the Dynamo System See “Dynamo: A Transparent Dynamic Optimization System”, V. Bala, E. Duesterwald, and S. Banerjia,
CSE P501 – Compiler Construction
Global Common Subexpression Elimination with Data-flow Analysis Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Structural Data-flow Analysis Algorithms: Allen-Cocke Interval Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Optimization, II Value Numbering & Larger Scopes Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
Top Down Parsing - Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Lexical Analysis: DFA Minimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
Static Single Assignment Form in the COINS Compiler Infrastructure Masataka Sassa, Toshiharu Nakaya, Masaki Kohama, Takeaki Fukuoka and Masahito Takahashi.
Global Redundancy Elimination: Computing Available Expressions Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Algebraic Reassociation of Expressions Briggs & Cooper, “Effective Partial Redundancy Elimination,” Proceedings of the ACM SIGPLAN 1994 Conference on Programming.
Local Instruction Scheduling — A Primer for Lab 3 — Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled.
Introduction to Code Generation Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
Building SSA Form (A mildly abridged account) For the full story, see the lecture notes for COMP 512 (lecture 8) and.
Terminology, Principles, and Concerns, III With examples from DOM (Ch 9) and DVNT (Ch 10) Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
Boolean & Relational Values Control-flow Constructs Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in.
Terminology, Principles, and Concerns, II With examples from superlocal value numbering (Ch 8 in EaC2e) Copyright 2011, Keith D. Cooper & Linda Torczon,
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
Profile-Guided Code Positioning See paper of the same name by Karl Pettis & Robert C. Hansen in PLDI 90, SIGPLAN Notices 25(6), pages 16–27 Copyright 2011,
Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at.
Profile Guided Code Positioning C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Definition-Use Chains
Introduction to Optimization
Code Optimization.
Building SSA Form (A mildly abridged account)
Efficiently Computing SSA
Finding Global Redundancies with Hopcroft’s DFA Minimization Algorithm
Princeton University Spring 2016
Local Instruction Scheduling
Introduction to Optimization
Introduction to Code Generation
Wrapping Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit.
Instruction Scheduling: Beyond Basic Blocks
Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
Local Instruction Scheduling — A Primer for Lab 3 —
Code Shape III Booleans, Relationals, & Control flow
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
The Last Lecture COMP 512 Rice University Houston, Texas Fall 2003
Introduction to Optimization
Instruction Scheduling: Beyond Basic Blocks
Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Algebraic Reassociation of Expressions COMP 512 Rice University Houston, Texas Fall 2003 P. Briggs & K.D. Cooper, “Effective Partial Redundancy Elimination,”
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

Where Are We? Last two lectures: SSA Construction Insert ϕ -functions Rename Lots of optimizations This lecture Out-of-SSA Translation Using SSA: Dead Code Elimination COMP 512, Rice University2

3 SSA Deconstruction At some point, we need executable code Few machines implement ϕ operations Need to fix up the flow of values Basic idea Insert copies in ϕ -function pred’s Simple algorithm  Works in most cases Adds lots of copies  Most of them coalesce away X 17  ϕ (x 10,x 11 )...  x  x 17 X 17  x 10 X 17  x 11 *

Translation Out of SSA Form Two “classic” problems arise in SSA deconstruction Lost-copy problem Swap problem In each case, simple copy insertion produces incorrect code COMP 512, Rice University4 These problems were identified by Cliff Click and first published in “Improvements to the Construction and Destruction of Static Single Assignment Form”, (Briggs, Cooper, Harvey, & Simpson, SP&E, 1997). That paper presented ad-hoc ways of solving the problems.

The assignment to z now receives the wrong value. The Lost Copy Problem COMP 512, Rice University5 i ← 1 y ← i i ← i + 1 d ← y + … Original code i 0 ← 1 i 1 ← Φ(i 0,i 2 ) y 0 ← i 1 i 2 ← i z 0 ← y 0 + … In SSA form i 0 ← 1 i 1 ← Φ(i 0,i 2 ) i 2 ← i z 0 ← i 1 + … With copies folded i 0 ← 1 i 1 ← i 0 i 2 ← i i 1 ← i 2 z 0 ← i 1 + … Copies naïvely inserted To fix this problem, the compiler needs to create a temporary name to hold the penultimate value of i. Copy folding has nothing to do with either i or SSA form. Translation Out of SSA Form

Code is incorrect The Swap Problem COMP 512, Rice University6 x ← … y ← … t ← x x ← y y ← t Original code x 0 ← … y 0 ← … x 1 ← Φ(x 0,x 2 ) y 1 ← Φ(y 0,y 2 ) t 0 ← x 1 x 2 ← y 1 y 2 ← t 0 In SSA Form x 0 ← … y 0 ← … x 1 ← Φ(x 0,y 1 ) y 1 ← Φ(y 0,x 1 ) Copies folded x 0 ← … y 0 ← … x 1 ← x 0 y 1 ← y 0 x 1 ← y 1 y 1 ← x 1 Copies naïvely inserted This problem arises when a Φ -function argument is defined by a Φ -function in the same block. Requires one or more copies & temporary names. Translation Out of SSA Form

The Big Picture Insert parallel copies to convert SSA to conventional SSA  In CSSA, we can just drop the subscripts on SSA name Coalesce aggressively, using a strong notion of interference  x and y are copy connected & don’t interfere ⇒ coalesce Translate out of CSSA by renaming Sequentialize parallel copies ( may introduce new temporaries ) COMP 512, Rice University7

Convert Optimized SSA to CSSA For a ϕ -function a 0 ← ϕ (a 1,a 2, …, a n ) : 1.Insert a copy a 1 ’ ← a 1 at the end of the block corresponding to a 1 2.Replace a 0 ← ϕ (a 1,a 2, …, a n ) with a 0 ’ ← ϕ (a 1 ’,a 2 ’, …, a n ’ ) 3.Insert a copy a 0 ← a 0 ’ after the ϕ -function At this point, the compiler can replace all the a i ’ names with a single name and eliminate the ϕ -function. [ Sreedhar et al, SAS 99 ] Why does this work? Creates a new name for right-hand side of the ϕ -function Inserts copies into new name in each predecessor block Copies out of that name after all the ϕ -functions execute Treat multiple inserted copies in a block as parallel copies. COMP 512, Rice University8 Sketch of ideas from “Revisiting Out-of-SSA Translation for Correctness, Efficiency, and Speed”, Boissinot, Darte, de Dinechin, Guillon, and Rastello, Proceedings of CGO Isolates the merge

Coalesce Aggressively Can build interference graph & coalesce ( as in Chaitin )  Building the graph is expensive  SSA allows us to detect value equivalence Boissinot et al. introduce an efficient set of tests to determine, for two sets of coalesced live ranges, if they interfere  Fast test for liveness, fast test for value equivalence Given these two tests, they built an interference-graph free coalescing tool that is quite fast  x ← y and x and y do not otherwise interfere ⇒ coalesce them Conceptually, however, you can think of the algorithm as using any strong coalescing algorithm ( not conservative coalescing ) COMP 512, Rice University9

Coalesce Aggressively Can build interference graph & coalesce ( as in Chaitin )  Building the graph is expensive  SSA allows us to detect value equivalence Boissinot et al. introduce an efficient set of tests to determine, for two sets of coalesced live ranges, if they interfere  Fast test for liveness, fast test for value equivalence Given these two tests, they built an interference-graph free coalescing tool that is quite fast Conceptually, however, you can think of the algorithm as using any strong coalescing algorithm ( not conservative coalescing ) COMP 512, Rice University10 Worklist ← copy operations while (Worklist is nonempty) pull x ← y from Worklist if x & y do not interfere combine them Difficulties in coalescing: 1.Order of coalescing matters 2.Impact of coalesce on accuracy of test for interference

Sequentializing Parallel Copies A set of parallel copies forms a graph Graph is either a tree or a cycle Schedule a tree, bottom up from the leaves Must break each cycle with an extra copy  May require a new name  Other copies may avoid the extra name  Break on the value preserved in a non-cycle copy COMP 512, Rice University11 a←b; b←c; c←a; d←a ad b c Parallel copies Corresponding graph d←a a←b b←c c←d Serialized copies