1 Liveness analysis and Register Allocation Cheng-Chia Chen.


1 Liveness analysis and Register Allocation Cheng-Chia Chen

2 The Memory Hierarchy
Level: access time; typical size
Registers: 1 cycle; 256-8000 bytes
Cache: 3 cycles; 256k-1M
Main memory: 20-100 cycles; 32M-1G
Disk: 0.5-5M cycles; 4G-1T

3 Managing the Memory Hierarchy
Programs are written as if there are only two kinds of memory: main memory and disk.
The programmer is responsible for moving data from disk to memory (e.g., file I/O).
The hardware is responsible for moving data between memory and caches.
The compiler is responsible for moving data between memory and registers.

4 Current Trends
Cache and register sizes are growing slowly.
Processor speed improves faster than memory speed and disk speed.
–The cost of a cache miss is growing
–The widening gap is bridged with more caches
It is very important to:
–Manage registers properly
–Manage caches properly
Compilers are good at managing registers.

5 The Register Allocation Problem
Recall that intermediate code uses as many temporaries as necessary.
–This complicates final translation to assembly
–But simplifies code generation and optimization
–Typical intermediate code uses too many temporaries
The register allocation problem:
–Rewrite the intermediate code to use no more temporaries than there are machine registers
–Method: assign multiple temporaries to the same register
–But without changing the program behavior

6 History
Register allocation is as old as intermediate code.
Register allocation was used in the original FORTRAN compiler in the 1950s.
–Very crude algorithms
A breakthrough was not achieved until 1980, when Chaitin invented a register allocation scheme based on graph coloring.
–Relatively simple, global, and works well in practice

7 An Example
Consider the program
a := c + d
e := a + b
f := e - 1
–with the assumption that a and e die after use
Temporary a can be "reused" after e := a + b. The same holds for temporary e.
We can allocate a, e, and f all to one register (r1):
r1 := r2 + r3
r1 := r1 + r4
r1 := r1 - 1

8 Basic Register Allocation Idea
The value in a dead temporary is not needed for the rest of the computation.
–A dead temporary can be reused
Basic rule:
–Temporaries t1 and t2 can share the same register if at any point in the program at most one of t1 or t2 is live!

9 Algorithm: Part I
Compute live variables for each point.
[Figure: flow graph with the statements a := b + c; d := -a; e := d + f; f := 2 * e; b := d + e; e := e - 1; b := f + c, with program points annotated by the live sets {b,c,f}, {a,c,f}, {c,d,f}, {c,d,e,f}, {c,e}, {c,f}, {b,c,e,f}, {b}]

10 Dataflow Equations
Control-flow graph
–Out-edge, in-edge, successor, predecessor
Use and Def sets
–Use: an occurrence on the right-hand side of an assignment
–Def: an assignment to a variable defines that variable
Liveness
–A variable is "live" on an edge if there is a directed path from that edge to a use of the variable that does not go through any def.

11 Control Flow Graph (CFG)
A control flow graph for a program (or subroutine) is a digraph such that
–Each statement (or basic block) in the program is a node in the CFG.
–If statement X can be followed by statement Y, then there is an edge from X to Y.
–e.g., Figure 10.1.
A variable is live if its current value may be used in the future.
–The analysis is backward (from the future to the past).
–e.g., b is used in S4, so it is live on edges 3→4 and 2→3.
–Since S2 assigns a new value to b, the earlier content of b is not used, so b is not live on edge 1→2.
–The live range of b is {2→3, 3→4}.

12 10.1 CFG of a program
Program:
a ← 0
L1: b ← a + 1
c ← c + b
a ← b * 2
if a < N goto L1
return c
CFG nodes: 1: a := 0; 2: b := a + 1; 3: c := c + b; 4: a := b * 2; 5: a < N; 6: return c

13 Liveness of variables a and b
[Figure: two copies of the CFG of Figure 10.1, one showing the live range of a and one showing the live range of b. Legend: variable defined, variable used.]

14 Liveness of variable c
[Figure: the CFG of Figure 10.1 (nodes 1 to 6) showing the live range of c. Legend: variable defined/used, variable used.]
An uninitialized c is detected if c is not a formal parameter.

15 Predecessor and successor
If there is an edge E from node X to node Y, then E is an in-edge of Y and an out-edge of X; Y is said to be a successor of X and X a predecessor of Y.
For each node n, define
–pred[n] = the set of all predecessors of node n, and
–succ[n] = the set of all successors of node n.
–e.g., in Figure 10.1,
–out-edges of node 5 are: 5→6, 5→2,
–in-edges of node 2 are: 5→2, 1→2,
–pred[2] = {1, 5};
–succ[2] = {3}.
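As a small illustration of these definitions, the sketch below stores succ[n] for the six nodes of Figure 10.1 and derives pred[n] by inverting the edges. The dictionary layout and the function name `predecessors` are illustrative choices, not part of the slides.

```python
# Successor sets for Figure 10.1, using the node numbers from the slides.
succ = {1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {2, 6}, 6: set()}


def predecessors(succ):
    """Invert a successor map: x is in pred[y] whenever y is in succ[x]."""
    pred = {n: set() for n in succ}
    for x, targets in succ.items():
        for y in targets:
            pred[y].add(x)
    return pred


pred = predecessors(succ)
print(sorted(pred[2]))  # predecessors of node 2
print(sorted(succ[5]))  # successors of node 5
```

Keeping only succ and deriving pred on demand avoids the two maps drifting apart when the graph is edited.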

16 Uses and defs
An assignment to a variable or temporary defines the variable.
–x ← y – 2; b ← a + 3; c ← f(a,c,b) + 2 (these define x, b, and c)
An occurrence of a variable on the right-hand side of an assignment (or in another expression) uses the variable.
–x ← y – 2; b ← a + 3; c ← f(a,c,b) + 2 (these use y, a, c, and b)
Definition:
–def[x] = { n | var x is defined in node n }
–def[n] = { x | var x is defined in node n }
–use[x] = { n | var x is used in node n }
–use[n] = { x | var x is used in node n }
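The two views of each set (per node and per variable) can be sketched as follows for the program of Figure 10.1. The table layout and the helper `invert` are illustrative, and N is assumed to be a constant, so node 5 uses only a.

```python
# (defined variable or None, used variables) for each node of Figure 10.1.
nodes = {
    1: ("a", set()),        # a := 0
    2: ("b", {"a"}),        # b := a + 1
    3: ("c", {"c", "b"}),   # c := c + b
    4: ("a", {"b"}),        # a := b * 2
    5: (None, {"a"}),       # a < N   (assumption: N is a constant)
    6: (None, {"c"}),       # return c
}

# Per-node views: def[n] and use[n].
def_n = {n: ({d} if d else set()) for n, (d, _) in nodes.items()}
use_n = {n: set(u) for n, (_, u) in nodes.items()}


def invert(per_node):
    """Turn a node -> variables map into a variable -> nodes map."""
    per_var = {}
    for n, variables in per_node.items():
        for x in variables:
            per_var.setdefault(x, set()).add(n)
    return per_var


# Per-variable views: def[x] and use[x].
def_x = invert(def_n)
use_x = invert(use_n)
print(sorted(def_x["a"]), sorted(use_x["b"]))
```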

17 Liveness
A variable is live on an edge if there is a directed path from that edge to a use of the variable without going through any def.
A variable is live-in at a node if it is live on any of the in-edges of that node.
A variable is live-out at a node if it is live on any of the out-edges of that node.
[Figure: two nodes, one annotated "X is live at this edge" on an in-edge and "X is live-in at this node", the other annotated "X is live at this edge" on an out-edge and "X is live-out at this node".]

18 Example node n: (y1,y2,x3) ← f(x1,x2,x3)
X is live on all in-edges of n iff
–X is used at n (i.e., X is x1, x2, or x3), or
–X is live on some out-edge of n and X is not defined at n (i.e., X is not y1, y2, or x3).
[Figure: node n with out-edges labeled {y1,x2}, {z1}, and {x3}]
out[n] = {y1, x2, x3, z1}
in[n] = use[n] ∪ (out[n] − def[n]) = {x1,x2,x3} ∪ {x2,z1} = {x1, x2, x3, z1}

19 Dataflow equations for liveness analysis
Definitions:
–in[n] = set of variables that are live-in at node n.
–out[n] = set of variables that are live-out at node n.
Equations for in and out:
1. in[n] = use[n] ∪ (out[n] − def[n])
–var x is live on an (and every) edge into n iff either
– 1. x is used at n, or
– 2. x is live on an out-edge of n and n does not define x.
2. out[n] = ∪ { in[s] | s ∈ succ[n] }
–var x is live-out at node n iff x is live on an edge leaving n.

20 Computing liveness
for each node n
  in[n] ← {}; out[n] ← {}
repeat
  for each node n
    in′[n] ← in[n]; out′[n] ← out[n]   // save old values
    in[n] ← use[n] ∪ (out[n] − def[n])   // propagate
    out[n] ← ∪ { in[s] | s ∈ succ[n] }   // propagate
until in[n] = in′[n] and out[n] = out′[n] for all n   // i.e., no variable was added during a whole iteration
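The iteration above can be sketched in Python and run on the graph of Figure 10.1. The function name `liveness` and the dictionary encoding are illustrative choices, and N is assumed to be a constant, so use[5] = {a}.

```python
def liveness(succ, use, defs):
    """Iterate in[n] = use[n] | (out[n] - def[n]) and
    out[n] = union of in[s] over s in succ[n] until nothing changes."""
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            old_in, old_out = live_in[n], live_out[n]  # save old values
            live_in[n] = use[n] | (old_out - defs[n])
            live_out[n] = set().union(*(live_in[s] for s in succ[n]))
            if live_in[n] != old_in or live_out[n] != old_out:
                changed = True
    return live_in, live_out


# Figure 10.1: 1: a := 0; 2: b := a + 1; 3: c := c + b;
#              4: a := b * 2; 5: a < N; 6: return c
succ = {1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {2, 6}, 6: set()}
use = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
defs = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

live_in, live_out = liveness(succ, use, defs)
for n in sorted(succ):
    print(n, sorted(live_in[n]), sorted(live_out[n]))
```

Because the analysis is backward, visiting the nodes in reverse order (6 down to 1) would usually reach the fixed point in fewer passes; the result is the same either way.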

21 Algorithm: Example
Compute live variables for each point.
[Figure: flow graph with the statements a := b + c; d := -a; e := d + f; f := 2 * e; b := d + e; e := e - 1; b := f + c; return b, with program points annotated by the live sets {b,c,f}, {a,c,f}, {c,d,f}, {c,d,e,f}, {c,e}, {c,f}, {b,c,e,f}, {b}]

22 The Register Interference Graph
Two temporaries that are live simultaneously cannot be allocated in the same register.
We construct an undirected graph:
–A node for each temporary
–An edge between t1 and t2 if they are live simultaneously at some point in the program
This is the register interference graph (RIG).
–Two temporaries can be allocated to the same register if there is no edge connecting them
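A minimal sketch of this construction, assuming the live sets have already been computed: every pair of temporaries that appear together in some live set gets an edge. The input below is the list of live sets shown for the running example; the function name `build_rig` is illustrative.

```python
from itertools import combinations

# Live sets at the program points of the running example (from the slides).
live_sets = [
    {"b", "c", "f"}, {"a", "c", "f"}, {"c", "d", "f"}, {"c", "d", "e", "f"},
    {"c", "e"}, {"c", "f"}, {"b", "c", "e", "f"}, {"b"},
]


def build_rig(live_sets):
    """Return the RIG as an adjacency map: temporary -> set of neighbors."""
    rig = {}
    for live in live_sets:
        for t in live:
            rig.setdefault(t, set())
        # Temporaries that are live at the same point interfere.
        for t1, t2 in combinations(sorted(live), 2):
            rig[t1].add(t2)
            rig[t2].add(t1)
    return rig


rig = build_rig(live_sets)
for t in sorted(rig):
    print(t, sorted(rig[t]))
```

In the resulting graph b and c are neighbors while b and d are not, so b and d may share a register.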

23 Register Interference Graph. Example.
For our example:
[Figure: the RIG over the nodes a, b, c, d, e, f]
E.g., b and c cannot be in the same register.
E.g., b and d can be in the same register.

24 Register Interference Graph. Properties.
It extracts exactly the information needed to characterize legal register assignments.
It gives a global (i.e., over the entire flow graph) picture of the register requirements.
After RIG construction, the register allocation algorithm is architecture independent.

25 Graph Coloring. Definitions.
A coloring of a graph is an assignment of colors to nodes, such that nodes connected by an edge have different colors.
A graph is k-colorable if it has a coloring with k colors.

26 Register Allocation Through Graph Coloring
In our problem, colors = registers.
–We need to assign colors (registers) to graph nodes (temporaries)
Let k = number of machine registers.
If the RIG is k-colorable, then there is a register assignment that uses no more than k registers.

27 Graph Coloring. Example.
Consider the example RIG.
[Figure: the RIG over a, b, c, d, e, f, with each node labeled by one of the registers r1, r2, r3, r4]
There is no coloring with fewer than 4 colors.
There are 4-colorings of this graph.

28 Graph Coloring. Example.
Under this coloring the code becomes:
r2 := r3 + r4
r3 := -r2
r2 := r3 + r1
r1 := 2 * r2
r3 := r3 + r2
r2 := r2 - 1
r3 := r1 + r4
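The sketch below checks that a coloring is legal and rewrites the example code with it. The assignment a: r2, b: r3, c: r4, d: r3, e: r2, f: r1 is read off by comparing the rewritten code above with the original statements; the edge list is an assumption derived from the live sets of the running example, and the helper names are illustrative.

```python
import re

# RIG edges of the running example (assumed, derived from its live sets).
EDGES = ["ac", "af", "bc", "be", "bf", "cd", "ce", "cf", "de", "df", "ef"]

coloring = {"a": "r2", "b": "r3", "c": "r4", "d": "r3", "e": "r2", "f": "r1"}


def is_valid_coloring(edges, coloring, k):
    """True if at most k colors are used and no edge joins equal colors."""
    if len(set(coloring.values())) > k:
        return False
    return all(coloring[x] != coloring[y] for x, y in edges)


def rename(stmt, coloring):
    """Replace each single-letter temporary a..f by its register."""
    return re.sub(r"\b[a-f]\b", lambda m: coloring[m.group()], stmt)


code = ["a := b + c", "d := -a", "e := d + f", "f := 2 * e",
        "b := d + e", "e := e - 1", "b := f + c"]

print(is_valid_coloring(EDGES, coloring, 4))
allocated = [rename(s, coloring) for s in code]
for line in allocated:
    print(line)
```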

29 Computing Graph Colorings
The remaining problem is to compute a coloring for the interference graph.
But:
1. This problem is very hard (NP-hard). No efficient algorithms are known.
2. A coloring might not exist for a given number of registers.
The solution to (1) is to use heuristics.
We'll consider the other problem later.

30 Graph Coloring Heuristic
Observation:
–Pick a node t with fewer than k neighbors in the RIG
–Eliminate t and its edges from the RIG
–If the resulting graph has a k-coloring, then so does the original graph
Why:
–Let c1, …, cn be the colors assigned to the neighbors of t in the reduced graph
–Since n < k, we can pick some color for t that is different from those of its neighbors

31 Graph Coloring Heuristic
The following works well in practice:
–Pick a node t with fewer than k neighbors
–Put t on a stack and remove it from the RIG
–Repeat until the graph has one node
Then start assigning colors to nodes on the stack (starting with the last node added).
–At each step pick a color different from those assigned to already colored neighbors
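A minimal sketch of this simplify-then-select procedure follows. It assumes the RIG edges of the running example (derived from its live sets), breaks ties alphabetically (an arbitrary choice, so the removal order can differ from the slides), and simply returns None when every remaining node has k or more neighbors.

```python
def color_graph(rig, k):
    """Return a map node -> color in 1..k, or None if simplification gets stuck."""
    graph = {n: set(adj) for n, adj in rig.items()}  # working copy
    stack = []
    while graph:
        candidates = sorted(n for n in graph if len(graph[n]) < k)
        if not candidates:
            return None  # all remaining nodes have k or more neighbors
        t = candidates[0]
        for nb in graph.pop(t):  # remove t and its edges
            graph[nb].discard(t)
        stack.append(t)
    coloring = {}
    while stack:
        t = stack.pop()  # last node added is colored first
        used = {coloring[nb] for nb in rig[t] if nb in coloring}
        # t had fewer than k neighbors when removed, so a color is free.
        coloring[t] = min(c for c in range(1, k + 1) if c not in used)
    return coloring


# RIG of the running example (assumed edge list).
rig = {n: set() for n in "abcdef"}
for x, y in ["ac", "af", "bc", "be", "bf", "cd", "ce", "cf", "de", "df", "ef"]:
    rig[x].add(y)
    rig[y].add(x)

four = color_graph(rig, 4)
three = color_graph(rig, 3)
print(four)
print(three)
```

The loop runs until the graph is empty rather than until one node remains; the last node always has zero neighbors, so the two formulations behave the same.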

32 Graph Coloring Example (1)
Start with the RIG and with k = 4:
[Figure: the RIG over a, b, c, d, e, f]
Stack: {}
Remove a and then d.

33 Graph Coloring Example (2)
Now all nodes have fewer than 4 neighbors and can be removed: c, b, e, f
[Figure: the remaining RIG over b, c, e, f]
Stack: {d, a}

34 Graph Coloring Example (2)
Start assigning colors to: f, e, b, c, d, a
[Figure: the RIG over a, b, c, d, e, f, with each node labeled by one of the registers r1, r2, r3, r4]

35 What if the Heuristic Fails?
What if during simplification we get to a state where all nodes have k or more neighbors?
Example: try to find a 3-coloring of the RIG:
[Figure: the RIG over a, b, c, d, e, f]

36 What if the Heuristic Fails?
Remove a and get stuck (as shown below).
[Figure: the remaining RIG over b, c, d, e, f]
Pick a node as a candidate for spilling.
–A spilled temporary "lives" in memory
Assume that f is picked as a candidate.

37 What if the Heuristic Fails?
Remove f and continue the simplification.
–Simplification now succeeds: b, d, e, c
[Figure: the remaining RIG over b, c, d, e]

38 What if the Heuristic Fails?
In the assignment phase we get to the point where we have to assign a color to f.
We hope that among the 4 neighbors of f we use fewer than 3 colors ⇒ optimistic coloring.
[Figure: the RIG over b, c, d, e, f, with the neighbors of f colored using r1, r2, and r3 (r3 appears twice) and f marked "?"]
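Optimistic coloring can be sketched by extending the simplify-and-select loop: when simplification is stuck, a spill candidate is pushed on the stack anyway, and it only becomes an actual spill if no color is free when it is popped. The candidate rule used here (highest degree, ties broken by the later name) is a hypothetical choice made so that f is picked as on the slides; the edge list is assumed from the live sets of the running example.

```python
def optimistic_color(rig, k):
    """Return (coloring, spilled): colors 1..k for the nodes that could be
    colored, and the list of nodes that must actually be spilled."""
    graph = {n: set(adj) for n, adj in rig.items()}  # working copy
    stack = []
    while graph:
        easy = sorted(n for n in graph if len(graph[n]) < k)
        if easy:
            t = easy[0]
        else:
            # Stuck: pick a spill candidate (hypothetical rule, see lead-in).
            t = max(graph, key=lambda n: (len(graph[n]), n))
        for nb in graph.pop(t):
            graph[nb].discard(t)
        stack.append(t)
    coloring, spilled = {}, []
    while stack:
        t = stack.pop()
        used = {coloring[nb] for nb in rig[t] if nb in coloring}
        free = [c for c in range(1, k + 1) if c not in used]
        if free:
            coloring[t] = free[0]  # the optimistic hope paid off
        else:
            spilled.append(t)      # no color left: t must be spilled
    return coloring, spilled


# RIG of the running example (assumed edge list).
rig = {n: set() for n in "abcdef"}
for x, y in ["ac", "af", "bc", "be", "bf", "cd", "ce", "cf", "de", "df", "ef"]:
    rig[x].add(y)
    rig[y].add(x)

coloring, spilled = optimistic_color(rig, 3)
print(coloring, spilled)
```

With k = 3 the four neighbors of f end up using all three colors, so f is reported as an actual spill, matching the outcome described on the slide.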

39 Spilling
Since optimistic coloring failed, we must spill temporary f.
We must allocate a memory location as the home of f.
–Typically this is in the current stack frame
–Call this address fa
Before each operation that uses f, insert f := load fa
After each operation that defines f, insert store f, fa

40 Spilling. Example.
This is the new code after spilling f:
a := b + c
d := -a
f := load fa
e := d + f
f := 2 * e
store f, fa
b := d + e
e := e - 1
f := load fa
b := f + c
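The rewriting rule for spill code can be sketched as a single pass over the instruction list. The tuple encoding (destination, sources, text) and the function name `insert_spill_code` are illustrative assumptions; the instructions are those of the running example.

```python
def insert_spill_code(instrs, var, addr):
    """Insert a load before every use of var and a store after every def."""
    out = []
    for dest, srcs, text in instrs:
        if var in srcs:
            out.append(f"{var} := load {addr}")
        out.append(text)
        if dest == var:
            out.append(f"store {var}, {addr}")
    return out


# (destination, sources, text) for each instruction of the example.
program = [
    ("a", {"b", "c"}, "a := b + c"),
    ("d", {"a"}, "d := -a"),
    ("e", {"d", "f"}, "e := d + f"),
    ("f", {"e"}, "f := 2 * e"),
    ("b", {"d", "e"}, "b := d + e"),
    ("e", {"e"}, "e := e - 1"),
    ("b", {"f", "c"}, "b := f + c"),
]

spilled_code = insert_spill_code(program, "f", "fa")
for line in spilled_code:
    print(line)
```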

41 Recomputing Liveness Information
The new liveness information after spilling:
[Figure: the flow graph of the spilled code (a := b + c; d := -a; f := load fa; e := d + f; f := 2 * e; store f, fa; b := d + e; e := e - 1; f := load fa; b := f + c), annotated with the live sets {b} {c,e} {b} {c,f} {b,c,e,f} {c,d,e,f} {b,c,f} {c,d,f} {a,c,f} {c,d,f} {c,f}]

42 Recomputing Liveness Information
The new liveness information is almost as before.
f is live only
–Between an f := load fa and the next instruction
–Between a store f, fa and the preceding instruction
Spilling reduces the live range of f, and thus reduces its interferences, which results in fewer neighbors for f in the RIG.

43 Recompute RIG After Spilling
The only changes are in removing some of the edges of the spilled node.
In our case f still interferes only with c and d.
And the resulting RIG is 3-colorable.
[Figure: the new RIG over a, b, c, d, e, f]

44 Spilling (Cont.)
Additional spills might be required before a coloring is found.
The tricky part is deciding what to spill.
Possible heuristics:
–Spill temporaries with the most conflicts
–Spill temporaries with few definitions and uses
–Avoid spilling in inner loops
Any heuristic is correct: the choice affects only the performance of the generated code, not its behavior.
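One way to combine the three heuristics into a single score is sketched below. The formula (occurrences weighted by 10 to the power of the loop depth, divided by the number of conflicts) is an assumption chosen for illustration, not something the slides prescribe, and the loop depths of 0 are hypothetical. Occurrence counts come from the running example's code, and degrees are those of the RIG after a has been removed.

```python
def spill_priority(occurrences, loop_depth, degree):
    """Lower is a better spill candidate: few defs and uses, not inside
    loops, and many conflicts (assumed formula, for illustration only)."""
    return occurrences * 10 ** loop_depth / degree


# temporary -> (defs + uses, loop depth, conflicts) in the stuck state.
candidates = {
    "b": (3, 0, 3),
    "c": (2, 0, 4),
    "d": (3, 0, 3),
    "e": (5, 0, 4),
    "f": (3, 0, 4),
}

scores = {t: spill_priority(*info) for t, info in candidates.items()}
choice = min(scores, key=scores.get)
print(scores, choice)
```

Under these assumptions the score picks c rather than the f used on the slides, which is fine: either choice leads to correct code and only the amount of inserted load and store traffic differs.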

45 Conclusions
Register allocation is a "must have" optimization in most compilers:
–Because intermediate code uses too many temporaries
–Because it makes a big difference in performance
Graph coloring is a powerful register allocation scheme.
Register allocation is more complicated for CISC machines.

