CMPUT 680 - Compiler Design and Optimization, Fall 2003
Topic 7: Register Allocation and Instruction Scheduling
José Nelson Amaral



Reading List
- Tiger book: chapters 10 and 11
- Dragon book: chapter 10
- Other papers as assigned in class or homework

Register Allocation
- Motivation
- Live ranges and interference graphs
- Problem formulation
- Solution methods

Goals of Optimized Register Allocation
- To select the variables that should be assigned to registers.
- To use the same register for multiple variables when it is legal, and profitable, to do so.

Liveness
Intuitively, a variable v is live if it holds a value that may be needed in the future. In other words, v is live at a point p_i if:
(i) v has been defined in a statement that precedes p_i on some path,
(ii) v may be used by a statement s_j, and there is a path from p_i to s_j, and
(iii) v is not killed (redefined) between p_i and s_j on that path.

Live Variables
    a: s1 = ld(x)
    b: s2 = s1 + 4
    c: s3 = s1 * 8
    d: s4 = s1 - 4
    e: s5 = s1 / 2
    f: s6 = s2 * s3
    g: s7 = s4 - s5
    h: s8 = s6 * s7
A variable v is live between the point p_i immediately after its definition and the point p_j immediately after its last use. The interval [p_i, p_j] is the live range of the variable v.
Which variables have the longest live range in the example? Variables s1 and s2 each have a live range of four statements.

Register Allocation
How can we find the minimum number of registers that this basic block requires to avoid spilling values to memory? We have to compute the live ranges of all variables and find the "fattest" statement, that is, the statement at which the most variables are live simultaneously.

Register Allocation
[Figure: the live ranges of s1 through s7 drawn beside the basic block.]
At statement d, variables s1, s2, s3, and s4 are live, and at statement e, variables s2, s3, s4, and s5 are live, so this block needs four registers. To compute this systematically rather than by inspection, we use liveness analysis.

Live-in and Live-out
live-in(r): the set of variables that are live at the point immediately before statement r.
live-out(r): the set of variables that are live at the point immediately after statement r.

Live-in and Live-out: Program Example
What are live-in(e) and live-out(e) in the basic block above?
    live-in(e) = {s1, s2, s3, s4}
    live-out(e) = {s2, s3, s4, s5}
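The live-in and live-out sets for a single basic block can be computed with one backward scan. The sketch below is only an illustration under stated assumptions: the tuple encoding, the names `BLOCK`, `block_liveness`, `LIVENESS`, and `MAX_LIVE` are hypothetical, `x` is treated as a memory operand rather than a register candidate, and nothing is assumed live at the end of the block.

```python
# Each statement: (label, variable it defines, variables it reads).
# Assumption: x is a memory operand, so it is not tracked here.
BLOCK = [
    ("a", "s1", []),
    ("b", "s2", ["s1"]),
    ("c", "s3", ["s1"]),
    ("d", "s4", ["s1"]),
    ("e", "s5", ["s1"]),
    ("f", "s6", ["s2", "s3"]),
    ("g", "s7", ["s4", "s5"]),
    ("h", "s8", ["s6", "s7"]),
]


def block_liveness(block, live_at_exit=()):
    """Return {label: (live_in, live_out)} for a straight-line block."""
    live = set(live_at_exit)          # variables live after the last statement
    table = {}
    for label, dst, srcs in reversed(block):
        live_out = set(live)
        # live-in = (live-out minus the definition) plus the uses
        live = (live - {dst}) | set(srcs)
        table[label] = (set(live), live_out)
    return table


LIVENESS = block_liveness(BLOCK)
# The "fattest" point of the block gives the number of registers it needs.
MAX_LIVE = max(len(live_out) for _, live_out in LIVENESS.values())

if __name__ == "__main__":
    for label, (live_in, live_out) in LIVENESS.items():
        print(label, sorted(live_in), sorted(live_out))
    print("registers needed:", MAX_LIVE)
```

Scanning backward means each statement is visited once, which is why liveness inside a basic block is cheap; the iterative dataflow machinery is only needed across blocks.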

Live-in and Live-out in Control Flow Graphs
The entry point of a basic block B is the point before its first statement. The exit point is the point after its last statement.
live-in(B): the set of variables that are live at the entry point of the basic block B.
live-out(B): the set of variables that are live at the exit point of the basic block B.

Live-in and Live-out of Basic Blocks
Compute live-in and live-out for each basic block (Aho-Sethi-Ullman, p. 544).

    Block  Statements                               live-in      live-out
    B1     a := b + c;  d := d - b;  e := a + f     {b,c,d,f}    {a,c,d,e,f}
    B2     f := a - d                               {a,c,d,e}    {c,d,e,f}
    B3     b := d + f;  e := a - c                  {a,c,d,f}    {b,c,d,e,f}
    B4     b := d + c                               {c,d,e,f}    {b,c,d,e,f}

[Figure: control flow graph of B1 to B4, with exit edges annotated "b, d, e, f live" and "b, c, d, e, f live".]
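The sets above can be sanity-checked against the usual block-level equation live-in(B) = use(B) ∪ (live-out(B) − def(B)). The following sketch assumes the statement-to-block grouping shown in the table (the flow-graph edges themselves are not needed for this check); `BLOCKS`, `use_def`, and `consistent` are hypothetical names.

```python
# Statements per block as (defined variable, variables read).
BLOCKS = {
    "B1": [("a", ["b", "c"]), ("d", ["d", "b"]), ("e", ["a", "f"])],
    "B2": [("f", ["a", "d"])],
    "B3": [("b", ["d", "f"]), ("e", ["a", "c"])],
    "B4": [("b", ["d", "c"])],
}
LIVE_IN = {
    "B1": set("bcdf"), "B2": set("acde"),
    "B3": set("acdf"), "B4": set("cdef"),
}
LIVE_OUT = {
    "B1": set("acdef"), "B2": set("cdef"),
    "B3": set("bcdef"), "B4": set("bcdef"),
}


def use_def(stmts):
    """use = variables read before any definition in the block; def = variables written."""
    use, defs = set(), set()
    for dst, srcs in stmts:
        use |= {v for v in srcs if v not in defs}
        defs.add(dst)
    return use, defs


def consistent(name):
    """Check live-in(B) == use(B) | (live-out(B) - def(B)) for one block."""
    use, defs = use_def(BLOCKS[name])
    return LIVE_IN[name] == use | (LIVE_OUT[name] - defs)


ALL_CONSISTENT = all(consistent(name) for name in BLOCKS)

if __name__ == "__main__":
    for name in BLOCKS:
        print(name, use_def(BLOCKS[name]), consistent(name))
```

A full analysis would also iterate live-out(B) as the union of live-in over the successors of B until nothing changes; that part needs the edges of the flow graph, which only appear in the figure.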

Register-Interference Graph
A register-interference graph is an undirected graph that summarizes liveness analysis at the variable level as follows:
- A node is a variable/temporary that is a candidate for register allocation.
- An edge connects nodes v1 and v2 if there is some statement in the program where variables v1 and v2 are live simultaneously. (Variables v1 and v2 are said to interfere in this case.)

Register Interference Graph: Program Example
[Figure: the basic block a-h with the live ranges of s1 to s7, and the resulting interference graph over the nodes s1 to s7.]
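Since the graph itself survives only as a figure, here is a rough sketch of how its edges could be derived from the definition above: two variables interfere if they appear together in some live set. The encoding and the names `STMTS` and `interference_edges` are assumptions made for illustration, and nothing is taken to be live at the end of the block.

```python
from itertools import combinations

# The basic block a-h as (defined variable, variables read).
STMTS = [
    ("s1", []),
    ("s2", ["s1"]),
    ("s3", ["s1"]),
    ("s4", ["s1"]),
    ("s5", ["s1"]),
    ("s6", ["s2", "s3"]),
    ("s7", ["s4", "s5"]),
    ("s8", ["s6", "s7"]),
]


def interference_edges(stmts, live_at_exit=()):
    """Connect every pair of variables that is live at the same program point."""
    live = set(live_at_exit)
    edges = set()
    for dst, srcs in reversed(stmts):
        # 'live' is the set of variables live immediately after this statement.
        edges |= {frozenset(pair) for pair in combinations(sorted(live), 2)}
        live = (live - {dst}) | set(srcs)
    # Variables live on entry to the block also interfere with each other.
    edges |= {frozenset(pair) for pair in combinations(sorted(live), 2)}
    return edges


EDGES = interference_edges(STMTS)

if __name__ == "__main__":
    for edge in sorted(tuple(sorted(e)) for e in EDGES):
        print(edge)
```

Under these assumptions s1 and s5 do not interfere: s1 dies at statement e, exactly where s5 is born, so the two can share a register.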

Register Allocation by Graph Coloring
Background: A graph is k-colorable if each node can be assigned one of k colors in such a way that no two adjacent nodes have the same color.
Basic idea: A k-coloring of the interference graph can be directly mapped to a legal register allocation by mapping each color to a distinct register. The coloring property ensures that no two variables that interfere with each other are assigned the same register.

Register Allocation by Graph Coloring
The basic idea behind register allocation by graph coloring is to:
1. Build the register interference graph.
2. Attempt to find a k-coloring for the interference graph.

Complexity of the Graph Coloring Problem
- The problem of determining whether an undirected graph is k-colorable is NP-hard for k ≥ 3.
- It is also hard to find approximate solutions to the graph coloring problem.

Register Allocation
Question: What should we do if a register-interference graph is not k-colorable? Or if the compiler cannot efficiently find a k-coloring even though the graph is k-colorable?
Answer: Repeatedly select less profitable variables for "spilling" (i.e., not to be assigned to registers) and remove them from the interference graph until the graph becomes k-colorable.

Estimating Register Profitability

Heuristic Solution for Graph Coloring
Key observation: Let G be an undirected graph, and let x be a node of G such that degree(x) < k. Remove node x and all its associated edges from G to obtain the graph G'. Then G is k-colorable if G' is k-colorable: x has fewer than k neighbors, so whatever colors they receive in G', at least one of the k colors is still free for x.

A 2-Phase Register Allocation Algorithm
[Figure: Build IG, Simplify, Select and Spill, organized into a forward pass and a reverse pass.]

Heuristic "Optimistic" Algorithm

    /* neighbors(v) contains a list of the neighbors of v. */

    /* Build step */
    Build G, the register-interference graph;

    /* Forward pass */
    Initialize an empty stack;
    repeat
        while G has a node v such that |neighbors(v)| < k do
            /* Simplify step */
            Push (v, neighbors(v), no-spill)
            Delete v and its edges from G
        end while
        if G is non-empty then
            /* Spill step */
            Choose "least profitable" node v as a potential spill node;
            Push (v, neighbors(v), may-spill)
            Delete v and its edges from G
        end if
    until G is an empty graph;

    /* Reverse pass */
    while the stack is non-empty do
        Pop (v, neighbors(v), tag)
        N := set of nodes in neighbors(v);
        if (tag = no-spill) then
            /* Select step */
            Select a register R for v such that R is not assigned to nodes in N;
            Insert v as a new node in G;
            Insert edges in G from v to each node in N;
        else /* tag = may-spill */
            if v can be assigned a register R such that R is not assigned to nodes in N then
                /* Optimism paid off: need not spill */
                Assign register R to v;
                Insert v as a new node in G;
                Insert edges in G from v to each node in N;
            else
                /* Need to spill v */
                Mark v as a node that needs spill
            end if
        end if
    end while
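The two passes above can be sketched in a few lines of Python. This is a minimal illustration, not a production allocator: the function names, the uniform default spill cost, the cost/degree choice of spill candidate, and the alphabetical tie-breaking are all assumptions, and k is assumed to be at least 1. The edge list is one reading of the interference graph of the basic block a-h, derived from its live ranges.

```python
def make_graph(edges):
    """Build an undirected graph as {node: set of neighbors}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def optimistic_color(graph, k, spill_cost=None):
    """Return (register assignment, list of nodes that must be spilled)."""
    work = {v: set(ns) for v, ns in graph.items()}   # graph that gets simplified
    cost = spill_cost or {v: 1.0 for v in graph}     # hypothetical profitability
    stack = []

    # Forward pass: simplify while possible, otherwise push a potential spill.
    while work:
        low = [v for v in work if len(work[v]) < k]
        if low:
            v, tag = min(low), "no-spill"
        else:
            # Assumed heuristic: cheapest spill cost per unit of degree.
            v = min(work, key=lambda n: (cost[n] / len(work[n]), n))
            tag = "may-spill"
        stack.append((v, tag))
        for n in work[v]:
            work[n].discard(v)
        del work[v]

    # Reverse pass: pop nodes and give each one a register if any is free.
    color, spilled = {}, []
    while stack:
        v, _tag = stack.pop()
        used = {color[n] for n in graph[v] if n in color}
        free = [r for r in range(k) if r not in used]
        if free:
            color[v] = free[0]      # for may-spill nodes: optimism paid off
        else:
            spilled.append(v)       # only a may-spill node can end up here
    return color, spilled


EDGES = [
    ("s1", "s2"), ("s1", "s3"), ("s1", "s4"), ("s2", "s3"),
    ("s2", "s4"), ("s3", "s4"), ("s2", "s5"), ("s3", "s5"),
    ("s4", "s5"), ("s4", "s6"), ("s5", "s6"), ("s6", "s7"),
]

if __name__ == "__main__":
    g = make_graph(EDGES)
    print(optimistic_color(g, 4))
    print(optimistic_color(g, 3))
```

Instead of storing neighbors(v) on the stack, this sketch looks up the neighbors in the original graph and ignores those not yet colored; at pop time the colored neighbors are exactly the ones that were still in the graph when v was removed, minus any spilled nodes.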

Remarks
This register allocation algorithm, based on graph coloring, is both efficient (the simplify and select passes take time roughly linear in the size of the interference graph) and effective (it produces good assignments). It has been used in many industrial-strength compilers to obtain significant improvements over simpler register allocation heuristics.

Extensions
- Coalescing
- Live range splitting

Coalescing
In the sequence of intermediate-level instructions below, which contains a copy statement, assume that registers are allocated to both variables x and y.
    x := ...
    ...
    y := x
    ...
    ... := y
There is an opportunity for further optimization by eliminating the copy statement if x and y are assigned the same register. The constraint that x and y receive the same register can be modeled by coalescing the nodes for x and y in the interference graph, i.e., by treating them as the same variable.

An Extension with Coalesce
[Figure: Build IG, Simplify, Coalesce, Select and Spill.]

Register Allocation with Coalescing (Appel, p. 240)
1. Build: build the register interference graph G and categorize nodes as move-related or non-move-related.
2. Simplify: one at a time, remove non-move-related nodes of low (< k) degree from G.
3. Coalesce: conservatively coalesce G: only coalesce nodes a and b if the resulting a-b node has fewer than k neighbors of significant degree (degree ≥ k).
4. Freeze: if neither coalesce nor simplify applies, freeze a move-related node of low degree, making it non-move-related and available for simplify.
5. Spill: if there are no low-degree nodes, select a node for potential spilling.
6. Select: pop each element of the stack, assigning colors.
[Figure: the phases (re)build, simplify, coalesce, freeze, potential spill, select, and actual spill.]
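The conservative test in the Coalesce step can be illustrated with a small predicate. The sketch below follows the criterion attributed to Briggs in Appel's book (the merged node must have fewer than k neighbors of degree at least k); the function name and the toy graph are hypothetical and are not the graph of the upcoming example.

```python
def can_coalesce(graph, a, b, k):
    """Conservative test: may the move-related nodes a and b be merged?"""
    if b in graph[a]:
        return False                      # interfering nodes can never share a register
    merged_neighbors = (graph[a] | graph[b]) - {a, b}

    def degree_after_merge(n):
        degree = len(graph[n])
        if a in graph[n] and b in graph[n]:
            degree -= 1                   # edges to a and to b collapse into one
        return degree

    significant = [n for n in merged_neighbors if degree_after_merge(n) >= k]
    return len(significant) < k


# Toy graph (assumed for illustration only).
GRAPH = {
    "a": {"x", "y"},
    "b": {"y", "z"},
    "x": {"a", "y"},
    "y": {"a", "b", "x", "z"},
    "z": {"b", "y"},
}

if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, can_coalesce(GRAPH, "a", "b", k))
```

The point of the test is safety: a merge that passes it cannot turn a k-colorable graph into one that the simplify phase gets stuck on, because the merged node can always be removed after its low-degree neighbors are gone.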

Example: Step 1: Compute Live Ranges
    LIVE-IN: k j
    g := mem[j+12]
    h := k - 1
    f := g + h
    e := mem[j+8]
    m := mem[j+16]
    b := mem[f]
    c := e + 8
    d := c
    j := b
    k := m + 4
    LIVE-OUT: d k j
[Figure: the live ranges of m, e, f, h, g, k, j, b, c, and d drawn beside the code.]
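A possible way to derive both the interference edges and the move-related pairs from this listing is sketched below. It processes the statements exactly as listed above and uses the common rule that a definition interferes with everything live after it, except that the source of a move does not interfere with its destination. The tuple encoding and the names `PROGRAM` and `build` are assumptions for illustration; the figure in the original slides may have been drawn from a slightly different statement order.

```python
# (defined variable, variables read, is this a register-to-register move?)
PROGRAM = [
    ("g", ["j"], False),       # g := mem[j+12]
    ("h", ["k"], False),       # h := k - 1
    ("f", ["g", "h"], False),  # f := g + h
    ("e", ["j"], False),       # e := mem[j+8]
    ("m", ["j"], False),       # m := mem[j+16]
    ("b", ["f"], False),       # b := mem[f]
    ("c", ["e"], False),       # c := e + 8
    ("d", ["c"], True),        # d := c
    ("j", ["b"], True),        # j := b
    ("k", ["m"], False),       # k := m + 4
]
LIVE_OUT = {"d", "k", "j"}


def build(program, live_out):
    """Return (live-in of the block, interference edges, move-related pairs)."""
    live = set(live_out)
    edges, moves = set(), set()
    for dst, srcs, is_move in reversed(program):
        for other in live:
            # A move's source may share a register with its destination.
            if other != dst and not (is_move and other in srcs):
                edges.add(frozenset((dst, other)))
        if is_move:
            moves.add(frozenset((dst, srcs[0])))
        live = (live - {dst}) | set(srcs)
    return live, edges, moves


LIVE_IN, EDGES, MOVES = build(PROGRAM, LIVE_OUT)

if __name__ == "__main__":
    print("live-in:", sorted(LIVE_IN))
    print("moves:", sorted(tuple(sorted(m)) for m in MOVES))
    print("edges:", sorted(tuple(sorted(e)) for e in EDGES))
```

Leaving out the edge between a move's source and destination is what makes c-d and b-j candidates for coalescing later in the example.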

Example: Simplify and Coalesce (K=4) (Appel, p. 237)
[Figure: interference graph over the nodes b, m, k, j, g, h, d, c, e, f, shown next to the stack, which is initially empty.]
1. Simplify: push (h, no-spill). Remaining nodes: b, m, k, j, g, d, c, e, f.
2. Simplify: push (g, no-spill). Remaining nodes: b, m, k, j, d, c, e, f.
3. Simplify: push (k, no-spill). Remaining nodes: b, m, j, d, c, e, f.
4. Simplify: push (f, no-spill). Remaining nodes: b, m, j, d, c, e.
5. Simplify: push (e, no-spill). Remaining nodes: b, m, j, d, c.
6. Simplify: push (m, no-spill). Remaining nodes: b, j, d, c.
7. Coalesce: why can we not simplify any further? The remaining nodes are all move-related, and move-related nodes cannot be simplified. Coalesce c and d into c-d. Remaining nodes: b, j, c-d.
8. Simplify: push (c-d, no-spill). Remaining nodes: b, j.
9. Coalesce: coalesce b and j into b-j. Remaining node: b-j.
10. Simplify: push (b-j, no-spill). The graph is now empty.
Final stack, top first: (b-j, no-spill), (c-d, no-spill), (m, no-spill), (e, no-spill), (f, no-spill), (k, no-spill), (g, no-spill), (h, no-spill).

Example: Select (K=4) (Appel, p. 237)
The stack is popped from the top, in the order b-j, c-d, m, e, f, k, g, h. Each popped node is reinserted into the graph and assigned one of the registers R1, R2, R3, R4 that is not used by its already-colored neighbors. Every entry is tagged no-spill, so every node receives a register.
[Figure: the full interference graph (b, m, k, j, g, h, d, c, e, f) colored one node per slide with R1 to R4.]

Could we do the allocation in the previous example with 3 registers?

Example: Simplify, Freeze, Spill, and Coalesce (K=3) (Appel, p. 237)
Starting again from the full graph (b, m, k, j, g, h, d, c, e, f) and an empty stack:
1. Simplify: push (h, no-spill). Remaining nodes: b, m, k, j, g, d, c, e, f.
2. Simplify: push (g, no-spill). Remaining nodes: b, m, k, j, d, c, e, f.
3. Freeze: coalescing would make things worse. We can freeze the move d-c.
4. Simplify: push (c, no-spill). Remaining nodes: b, m, k, j, d, e, f.
5. Spill: neither coalescing nor freezing helps us. At this point we should use some profitability analysis to choose a node as may-spill. Push (e, may-spill). Remaining nodes: b, m, k, j, d, f.
6. Simplify: push (f, no-spill). Remaining nodes: b, m, k, j, d.
7. Simplify: push (m, no-spill). Remaining nodes: b, k, j, d.
8. Coalesce: coalesce j and b into j-b. Remaining nodes: k, j-b, d.
9. Simplify: push (d, no-spill). Remaining nodes: k, j-b.
10. Simplify: push (k, no-spill). Remaining node: j-b.
11. Simplify: push (j-b, no-spill). The graph is now empty.
Final stack, top first: (j-b, no-spill), (k, no-spill), (d, no-spill), (m, no-spill), (f, no-spill), (e, may-spill), (c, no-spill), (g, no-spill), (h, no-spill).

Example: Select (K=3) (Appel, p. 237)
The stack is popped from the top, in the order j-b, k, d, m, f, e, c, g, h, and each popped node is assigned one of the registers R1, R2, R3 that is not used by its already-colored neighbors.
When the may-spill node e is popped: this is when our optimism could have paid off.
[Figure: the full interference graph (b, m, k, j, g, h, d, c, e, f) colored one node per slide with R1 to R3.]

Live Range Splitting
The basic coloring algorithm does not consider cases in which a variable can be allocated to a register for only part of its live range. Some compilers split live ranges within the iteration structure of the coloring algorithm. When a variable is split into two new variables, one of the new variables might be profitably assigned to a register while the other is not.

Length of Live Ranges
The interference graph does not contain information about where in the CFG variables interfere, or about how long a variable's live range is. For example, if only a few registers were available in the following intermediate-code example, the right choice would be to spill variable w because it has the longest live range:
    x = w + 1
    c = a - 2
    y = x * 3
    z = w + y

Effect of Instruction Reordering on Register Pressure
The coloring algorithm does not take into account the fact that reordering IL instructions can reduce interference. Consider the following example:

    Original Ordering        Optimized Ordering
    (needs 3 registers)      (needs 2 registers)
    t1 := A[i]               t2 := A[j]
    t2 := A[j]               t3 := A[k]
    t3 := A[k]               t4 := t2 * t3
    t4 := t2 * t3            t1 := A[i]
    t5 := t1 + t4            t5 := t1 + t4
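The register counts quoted for the two orderings can be reproduced with a short backward scan that tracks how many temporaries are live at once. In this sketch only the temporaries t1 to t5 are counted; the array A and the indices i, j, k are assumed to live in memory or in registers outside this count, and the names `max_pressure`, `ORIGINAL`, and `REORDERED` are hypothetical.

```python
def max_pressure(stmts, live_at_exit=()):
    """Largest number of temporaries live at any point of a straight-line sequence."""
    live = set(live_at_exit)
    peak = len(live)
    for dst, srcs in reversed(stmts):
        # Right after the statement, dst and everything still live need registers.
        peak = max(peak, len(live | {dst}))
        live = (live - {dst}) | set(srcs)
    return max(peak, len(live))


# (defined temporary, temporaries read); loads from A[...] read no temporaries.
ORIGINAL = [
    ("t1", []),
    ("t2", []),
    ("t3", []),
    ("t4", ["t2", "t3"]),
    ("t5", ["t1", "t4"]),
]
REORDERED = [
    ("t2", []),
    ("t3", []),
    ("t4", ["t2", "t3"]),
    ("t1", []),
    ("t5", ["t1", "t4"]),
]

if __name__ == "__main__":
    print("original:", max_pressure(ORIGINAL))
    print("reordered:", max_pressure(REORDERED))
```

Moving the load of t1 next to its only use shortens its live range so that it no longer overlaps t2 and t3, which is the whole effect of the reordering.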

Brief History of Register Allocation
- Chaitin (ACM SIGPLAN Notices, 1982): Use the simple stack heuristic for register allocation. Spill/no-spill decisions are made during the stack construction phase of the algorithm.
- Briggs (PLDI 1989): Finds out that Chaitin's algorithm spills even when there are available registers. Solution: the optimistic approach: may-spill during stack construction, decide at spilling time.

Brief History of Register Allocation
- Callahan (PLDI 1991): Hierarchical coloring graph, register preferencing, profitability of spilling.
- Chow-Hennessy (SIGPLAN 1984; ASPLOS): Priority-based coloring. Integrate spilling decisions in the coloring decisions: spill a variable for a limited life range. Favor dense over sparse use regions. Consider parameter passing convention.