Published by Georgia Barber. Modified over 9 years ago.
1
Instruction Selection
Mooly Sagiv
Schreiber 317
03-640-7606
Wed 10:00-12:00
http://www.math.tau.ac.il/~msagiv/courses/wcc.html
2
Already Studied
Source program (string) → lexical analysis → Tokens → syntax analysis → Abstract syntax tree → semantic analysis → Abstract syntax tree → Translate → Tree IR → Canon → Canonical Tree IR
3
Instruction Selection
Input:
–Canonical IR
–Description of translation rules from IR into machine language
Output:
–Machine code
  Unbounded number of registers
  Some prologue and epilogue instructions are missing
4
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL(l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
5
l3: beq t128, $0, l0
l1: or t131, $0, t128
    addi t132, t128, -1
    or $4, $0, t132
    jal nfactor
    or t130, $0, $2
    or t133, $0, t131
    mult t133, t130
    mflo t133
    or t129, $0, t133
l2: or t103, $0, t129
    b lend
l0: addi t129, $0, 1
    b l2
6
The Challenge
“Clumps” of trees can be translated into a single machine instruction
MOVE(TEMP t1, MEM(BINOP(PLUS, TEMP t2, CONST c)))  →  lw t1, c(t2)
7
Outline
The “Tiling” problem
An optimal solution
An optimum solution (via dynamic programming)
Tree grammars
The Pentium architecture
Instruction selection for Tiger
–Abstract data type for machine instructions
8
Instruction set in the Jouette Machine
ADD    r_i ← r_j + r_k
MUL    r_i ← r_j * r_k
SUB    r_i ← r_j - r_k
DIV    r_i ← r_j / r_k
ADDI   r_i ← r_j + c
SUBI   r_i ← r_j - c
LOAD   r_i ← M[r_j + c]
STORE  M[r_i + c] ← r_j
MOVEM  M[r_i] ← M[r_j]
9
Tree Patterns for Jouette Machine
Name    Effect             Trees
(none)  r_j                TEMP
ADD     r_i ← r_j + r_k    +(e1, e2)
MUL     r_i ← r_j * r_k    *(e1, e2)
SUB     r_i ← r_j - r_k    -(e1, e2)
DIV     r_i ← r_j / r_k    /(e1, e2)
10
Tree Patterns for Jouette Machine (cont.)
Name    Effect              Trees
ADDI    r_i ← r_j + c       +(e, CONST c)
                            +(CONST c, e)
                            CONST c
SUBI    r_i ← r_j - c       -(e, CONST c)
LOAD    r_i ← M[r_j + c]    MEM(+(e, CONST c))
                            MEM(+(CONST c, e))
                            MEM(CONST c)
                            MEM(e)
STORE   M[r_i + c] ← r_j    MOVE(MEM(+(e1, CONST c)), e2)
                            MOVE(MEM(+(CONST c, e1)), e2)
                            MOVE(MEM(CONST c), e2)
                            MOVE(MEM(e1), e2)
MOVEM   M[r_i] ← M[r_j]     MOVE(MEM(e1), MEM(e2))
11
The Tiling Problem
Cover the tree with non-overlapping tiles from the tree patterns
Minimize “the cost” of the generated code
12
Example
Tiger input: a[e] := x
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
13
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Tiles:
ADDI  r2 ← r0 + 4
LOAD  r1 ← M[FP + -8]
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
LOAD  r2 ← M[FP + -4]
STORE M[r1 + 0] ← r2
14
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Generated code:
LOAD  r1 ← M[FP + -8]
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
LOAD  r2 ← M[FP + -4]
STORE M[r1 + 0] ← r2
15
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Tiles:
ADDI  r2 ← r0 + 4
LOAD  r1 ← M[FP + -8]
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
MOVEM M[r1] ← M[r2]
ADDI  r2 ← FP + -4
16
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Generated code:
LOAD  r1 ← M[FP + -8]
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
ADDI  r2 ← FP + -4
MOVEM M[r1] ← M[r2]
17
The Tiling Problem
Cover the tree with non-overlapping tiles from the tree patterns
Minimize “the cost” of the generated code
Ensure that every tree can be covered
–Tree patterns for all the “tiny” tiles
18
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Tiles (tiny tiles only):
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
ADDI  r2 ← FP + -4
ADDI  r1 ← r0 + -8
ADD   r1 ← FP + r1
LOAD  r1 ← M[r1 + 0]
LOAD  r2 ← M[r2 + 0]
STORE M[r1 + 0] ← r2
19
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Generated code (tiny tiles only):
ADDI  r1 ← r0 + -8
ADD   r1 ← FP + r1
LOAD  r1 ← M[r1 + 0]
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
ADDI  r2 ← r0 + -4
ADD   r2 ← FP + r2
LOAD  r2 ← M[r2 + 0]
STORE M[r1 + 0] ← r2
20
Optimal vs. Optimum Tiling
Optimum Tiling
–Minimum cost of tile sum
Optimal Tiling
–No two adjacent tiles can be combined to reduce the cost
21
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Tiles:
ADDI  r2 ← r0 + 4
LOAD  r1 ← M[FP + -8]
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
LOAD  r2 ← M[FP + -4]
STORE M[r1 + 0] ← r2
22
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
Tiles (tiny tiles only):
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
ADDI  r2 ← FP + -4
ADDI  r1 ← r0 + -8
ADD   r1 ← FP + r1
LOAD  r1 ← M[r1 + 0]
LOAD  r2 ← M[r2 + 0]
STORE M[r1 + 0] ← r2
23
Optimum Tiling
First optimum tiling:
LOAD  r1 ← M[FP + -8]
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
LOAD  r2 ← M[FP + -4]
STORE M[r1 + 0] ← r2
Second optimum tiling:
LOAD  r1 ← M[FP + -8]
ADDI  r2 ← r0 + 4
MUL   r2 ← te * r2
ADD   r1 ← r1 + r2
ADDI  r2 ← FP + -4
MOVEM M[r1] ← M[r2]
24
RISC vs. CISC Machines
Feature              RISC              CISC
Registers            32                6, 8, 16
Register Classes     One               Some
Arithmetic Operands  Registers         Memory + Registers
Instructions         3-addr            2-addr
Addressing Modes     r M[r+c] (l,s)    several
Instruction Length   32 bits           Variable
Side-effects         None              Some
Instruction-Cost     “Uniform”         Varied
25
Architecture and Tiling Algorithm
RISC
–Cost of operations is uniform
–Optimal tiling usually suffices
CISC
–Optimum tiling may be significantly better
26
Optimal Tiling using “Maximal Munch”
Top-down traversal of the IR tree
At every node try the relevant tree patterns in “cost-order”
Generate assembly code in reverse order
Tiny tiles guarantee that we can never get stuck
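The traversal described above can be sketched in a few lines of C. This is a minimal illustration, not the Tiger compiler's code: the `Exp` type, the `K_*` node kinds, and the `munch`/`emit` helpers are hypothetical stand-ins for `T_exp` and the real emitter, and only instruction names (no operands) are recorded. At every node the largest matching Jouette tile is tried first, and operands are munched before the instruction that consumes them is emitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature IR; not the Tiger T_exp type. */
typedef enum { K_CONST, K_TEMP, K_PLUS, K_MEM } Kind;

typedef struct Exp {
    Kind kind;
    int value;                 /* constant value or temp number */
    const struct Exp *left;    /* PLUS: left operand; MEM: address */
    const struct Exp *right;   /* PLUS: right operand */
} Exp;

static char out[256];          /* emitted instruction names, space separated */

static void emit(const char *name) {
    assert(strlen(out) + strlen(name) + 2 < sizeof out);
    strcat(out, name);
    strcat(out, " ");
}

/* Try the biggest tile first at each node. */
static void munch_exp(const Exp *e) {
    switch (e->kind) {
    case K_MEM: {
        const Exp *a = e->left;
        if (a->kind == K_PLUS && a->right->kind == K_CONST) {
            munch_exp(a->left);  emit("LOAD");   /* MEM(+(e, CONST c)) */
        } else if (a->kind == K_PLUS && a->left->kind == K_CONST) {
            munch_exp(a->right); emit("LOAD");   /* MEM(+(CONST c, e)) */
        } else if (a->kind == K_CONST) {
            emit("LOAD");                        /* MEM(CONST c) */
        } else {
            munch_exp(a);        emit("LOAD");   /* MEM(e) */
        }
        break;
    }
    case K_PLUS:
        if (e->right->kind == K_CONST) {
            munch_exp(e->left);  emit("ADDI");   /* +(e, CONST c) */
        } else if (e->left->kind == K_CONST) {
            munch_exp(e->right); emit("ADDI");   /* +(CONST c, e) */
        } else {
            munch_exp(e->left);
            munch_exp(e->right);
            emit("ADD");                         /* +(e1, e2) */
        }
        break;
    case K_CONST:
        emit("ADDI");                            /* r <- r0 + c */
        break;
    case K_TEMP:
        break;                                   /* already in a register */
    }
}

/* Returns the instruction names selected for e, in execution order. */
const char *munch(const Exp *e) {
    out[0] = '\0';
    munch_exp(e);
    return out;
}
```

For MEM(BINOP(PLUS, CONST 1, CONST 2)) this sketch selects an ADDI for one constant and folds the other into the LOAD, so two instructions are emitted.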
27
static void munchStm(T_stm s) {
  switch (s->kind) {
  case T_MOVE: {
    T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
    if (dst->kind == T_MEM) {
      if (dst->u.MEM->kind == T_BINOP &&
          dst->u.MEM->u.BINOP.op == T_PLUS &&
          dst->u.MEM->u.BINOP.right->kind == T_CONST) {
        T_exp e1 = dst->u.MEM->u.BINOP.left, e2 = src;
        /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      } else if (dst->u.MEM->kind == T_BINOP &&
                 dst->u.MEM->u.BINOP.op == T_PLUS &&
                 dst->u.MEM->u.BINOP.left->kind == T_CONST) {
        T_exp e1 = dst->u.MEM->u.BINOP.right, e2 = src;
        /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      }
      /* remaining MOVE(MEM(...), ...) cases */
    }
    break;
  }
  /* other statement kinds */
  }
}
28
static void munchStm(T_stm s) {
  switch (s->kind) {
  case T_MOVE: {
    T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
    if (dst->kind == T_MEM) {
      if (…) {
        /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      } else if (…) {
        /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      } else if (src->kind == T_MEM) {
        T_exp e1 = dst->u.MEM, e2 = src->u.MEM;
        /* MOVE(MEM(e1), MEM(e2)) */
        munchExp(e1); munchExp(e2); emit("MOVEM");
      } else {
        T_exp e1 = dst->u.MEM, e2 = src;
        /* MOVE(MEM(e1), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      }
    }
    break;
  }
  /* other statement kinds */
  }
}
29
  case T_MOVE: {
    T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
    if (dst->kind == T_MEM) {
      if (…) {
        /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      } else if (…) {
        /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      } else if (…) {
        /* MOVE(MEM(e1), MEM(e2)) */
        munchExp(e1); munchExp(e2); emit("MOVEM");
      } else {
        /* MOVE(MEM(e1), e2) */
        munchExp(e1); munchExp(e2); emit("STORE");
      }
    } else if (dst->kind == T_TEMP) {
      T_exp e = src;
      /* MOVE(TEMP t, e) */
      munchExp(e); emit("ADD");
    } else {
      assert(0);
    }
    break;
  }
30
static void munchStm(T_stm s) {
  MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2)   munchExp(e1); munchExp(e2); emit("STORE");
  MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2)   munchExp(e1); munchExp(e2); emit("STORE");
  MOVE(MEM(e1), MEM(e2))                    munchExp(e1); munchExp(e2); emit("MOVEM");
  MOVE(TEMP t, e)                           munchExp(e); emit("ADD");
  JUMP(e)                                   …
  CJUMP(e)                                  …
  LABEL(l)
}
31
static void munchExp(T_exp e) {
  MEM(BINOP(PLUS, e, CONST c))   munchExp(e); emit("LOAD");
  MEM(BINOP(PLUS, CONST c, e))   munchExp(e); emit("LOAD");
  MEM(CONST c)                   emit("LOAD");
  MEM(e)                         munchExp(e); emit("LOAD");
  BINOP(PLUS, e, CONST c)        munchExp(e); emit("ADDI");
  BINOP(PLUS, CONST c, e)        munchExp(e); emit("ADDI");
  CONST c                        emit("ADDI");
  BINOP(PLUS, e1, e2)            munchExp(e1); munchExp(e2); emit("ADD");
  …
  TEMP t
}
32
Example
Tiger input: a[e] := x
MOVE(MEM(BINOP(PLUS, MEM(BINOP(PLUS, TEMP FP, CONST -8)), BINOP(TIMES, TEMP te, CONST 4))), MEM(BINOP(PLUS, TEMP FP, CONST -4)))
33
Optimum Tiling
Maximal munch does not necessarily produce optimum results
The number of potential code sequences is quite big
But Dynamic Programming yields an optimum solution in linear time
Assign optimum cost to every subtree
Two-phase solution
–Find the optimum cost for every subtree in a bottom-up traversal
–Generate the optimum solution in a top-down traversal
  Skip subtrees
34
Dynamic Programming
For each subtree with root n
–For each tile t of cost c which matches n
  Calculate the cost of t as: c + Σ c_i, where c_i is the optimum cost of the i-th subtree left uncovered at the leaves of t
–The cost of the subtree rooted at n is the minimum over all matching tiles
Generate the optimum code during top-down traversal
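The bottom-up cost phase can be sketched as follows. This is an illustrative sketch only: the `Exp` type, the `K_*` kinds, and `label_costs` are hypothetical names, every Jouette instruction is assumed to cost 1, and only the PLUS, MEM, CONST, and TEMP patterns are covered. Each node is visited once and stores its optimum cost, so the pass is linear in the size of the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature IR with a slot for the optimum cost. */
typedef enum { K_CONST, K_TEMP, K_PLUS, K_MEM } Kind;

typedef struct Exp {
    Kind kind;
    struct Exp *left;    /* PLUS: left operand; MEM: address */
    struct Exp *right;   /* PLUS: right operand */
    int cost;            /* filled in by label_costs */
} Exp;

static int min2(int a, int b) { return a < b ? a : b; }

/* Bottom-up pass: children are labelled before their parent, and each
   matching tile contributes (tile cost) + (costs of subtrees at its leaves). */
void label_costs(Exp *e) {
    if (e == NULL) return;
    label_costs(e->left);
    label_costs(e->right);
    switch (e->kind) {
    case K_TEMP:
        e->cost = 0;                                   /* already in a register */
        break;
    case K_CONST:
        e->cost = 1;                                   /* ADDI r <- r0 + c */
        break;
    case K_PLUS:
        assert(e->left != NULL && e->right != NULL);
        e->cost = 1 + e->left->cost + e->right->cost;  /* ADD */
        if (e->right->kind == K_CONST)                 /* ADDI, +(e, CONST c) */
            e->cost = min2(e->cost, 1 + e->left->cost);
        if (e->left->kind == K_CONST)                  /* ADDI, +(CONST c, e) */
            e->cost = min2(e->cost, 1 + e->right->cost);
        break;
    case K_MEM: {
        Exp *a = e->left;
        assert(a != NULL);
        e->cost = 1 + a->cost;                         /* LOAD, MEM(e) */
        if (a->kind == K_CONST)                        /* LOAD, MEM(CONST c) */
            e->cost = min2(e->cost, 1);
        if (a->kind == K_PLUS && a->right->kind == K_CONST)
            e->cost = min2(e->cost, 1 + a->left->cost);   /* MEM(+(e, CONST c)) */
        if (a->kind == K_PLUS && a->left->kind == K_CONST)
            e->cost = min2(e->cost, 1 + a->right->cost);  /* MEM(+(CONST c, e)) */
        break;
    }
    }
}
```

A second, top-down pass would then pick at each node a tile that achieves the stored minimum and recurse only into the subtrees at that tile's leaves.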
35
Example
MEM(BINOP(PLUS, CONST 1, CONST 2))
36
CONST 1
Tile      Instruction  Tile Cost  Leaves Cost  Total Cost
CONST c   ADDI         1          0            1
37
CONST 2
Tile      Instruction  Tile Cost  Leaves Cost  Total Cost
CONST c   ADDI         1          0            1
38
BINOP(PLUS, CONST 1, CONST 2)
Tile                      Inst.  Tile Cost  Leaves Cost  Total Cost
BINOP(PLUS, e1, e2)       ADD    1          1+1          3
BINOP(PLUS, e, CONST c)   ADDI   1          1            2
BINOP(PLUS, CONST c, e)   ADDI   1          1            2
39
MEM(BINOP(PLUS, CONST 1, CONST 2))
40
MEM(BINOP(PLUS, CONST 1, CONST 2))
Tile                           Inst.  Tile Cost  Leaves Cost  Total Cost
MEM(e)                         LOAD   1          2            3
BINOP(PLUS, e1, e2)                   1          1+1          3
MEM(BINOP(PLUS, e, CONST c))   LOAD   1          1            2
MEM(BINOP(PLUS, CONST c, e))   LOAD   1          1            2
41
Top-Down Code Generation
MEM(BINOP(PLUS, CONST 1, CONST 2))
Node annotations: ADDI(1), ADDI(2), LOAD(2)
Generated code:
ADDI r1 ← r0 + 1
LOAD r1 ← M[r1 + 2]
42
The “Schizo”-Jouette Machine
In the spirit of Motorola 68000
Two types of registers
–data registers
–address registers
Arithmetic performed on data registers
Load and Store using address registers
Machine instruction to convert between addresses and data
43
Tree Patterns for Schizo-Jouette
Name    Effect             Trees
(none)  d_j                d TEMP
(none)  a_j                a TEMP
ADD     d_i ← d_j + d_k    d +(e1, e2)
MUL     d_i ← d_j * d_k    d *(e1, e2)
SUB     d_i ← d_j - d_k    d -(e1, e2)
DIV     d_i ← d_j / d_k    d /(e1, e2)
44
Tree Patterns for Schizo-Jouette Machine
Name    Effect              Trees
ADDI    d_i ← d_j + c       d +(e, CONST c)
                            d +(CONST c, e)
                            d CONST c
SUBI    d_i ← d_j - c       d -(e, CONST c)
LOAD    d_i ← M[a_j + c]    d MEM(+(ae, CONST c))
                            d MEM(+(CONST c, ae))
                            d MEM(CONST c)
                            d MEM(ae)
STORE   M[a_i + c] ← d_j    MOVE(MEM(+(ae1, CONST c)), de2)
                            MOVE(MEM(+(CONST c, ae1)), de2)
                            MOVE(MEM(CONST c), de2)
                            MOVE(MEM(ae1), de2)
MOVEM   M[a_i] ← M[a_j]     MOVE(MEM(a1), MEM(a2))
45
Tree Patterns for Schizo-Jouette
Name    Effect        Trees
MOVEA   d_i ← a_j     d → a
MOVED   a_i ← d_j     a → d
46
Tree Grammars
A generalization of dynamic programming
Input
–A (usually ambiguous) context-free grammar describing the machine tree patterns
  non-terminals correspond to machine types
  every production has a machine cost
–A linearized IR tree
Output
–A parse tree with the minimum cost
47
Partial Grammar for Schizo-Jouette
d → TEMP t
a → TEMP t
d → +(d, d)
d → +(d, CONST)
d → +(CONST, d)
d → MEM(+(a, CONST))
d → MEM(+(CONST, a))
d → MEM(CONST)
d → MEM(a)
d → a
a → d
Example to parse as d: MEM(+(CONST 1, CONST 2))
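One way to read this grammar is as dynamic programming with one cost per non-terminal at every node. The sketch below is an illustration under stated assumptions, not a parser generator: the `Exp` type, `NT_D`/`NT_A`, `label`, and `relax` are hypothetical names, every instruction is assumed to cost 1, the rule d → CONST (ADDI) is taken from the Schizo-Jouette pattern table, and the chain rules d → a (MOVEA) and a → d (MOVED) are applied after the base rules.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_CONST, K_TEMP, K_PLUS, K_MEM } Kind;
enum { NT_D, NT_A, NT_COUNT };   /* data and address non-terminals */
enum { INF = 1000000 };          /* "no derivation found" */

typedef struct Exp {
    Kind kind;
    struct Exp *left, *right;    /* PLUS: operands; MEM: address in left */
    int cost[NT_COUNT];          /* cheapest derivation per non-terminal */
} Exp;

static void relax(int *slot, int candidate) {
    if (candidate < *slot) *slot = candidate;
}

void label(Exp *e) {
    if (e == NULL) return;
    label(e->left);
    label(e->right);
    int *c = e->cost;
    c[NT_D] = c[NT_A] = INF;
    switch (e->kind) {
    case K_TEMP:                               /* d -> TEMP t, a -> TEMP t */
        c[NT_D] = 0;
        c[NT_A] = 0;
        break;
    case K_CONST:                              /* d -> CONST (ADDI) */
        relax(&c[NT_D], 1);
        break;
    case K_PLUS:
        assert(e->left != NULL && e->right != NULL);
        relax(&c[NT_D], 1 + e->left->cost[NT_D] + e->right->cost[NT_D]);
        if (e->right->kind == K_CONST)         /* d -> +(d, CONST) */
            relax(&c[NT_D], 1 + e->left->cost[NT_D]);
        if (e->left->kind == K_CONST)          /* d -> +(CONST, d) */
            relax(&c[NT_D], 1 + e->right->cost[NT_D]);
        break;
    case K_MEM: {
        Exp *a = e->left;
        assert(a != NULL);
        relax(&c[NT_D], 1 + a->cost[NT_A]);    /* d -> MEM(a) */
        if (a->kind == K_CONST)                /* d -> MEM(CONST) */
            relax(&c[NT_D], 1);
        if (a->kind == K_PLUS && a->right->kind == K_CONST)
            relax(&c[NT_D], 1 + a->left->cost[NT_A]);   /* d -> MEM(+(a, CONST)) */
        if (a->kind == K_PLUS && a->left->kind == K_CONST)
            relax(&c[NT_D], 1 + a->right->cost[NT_A]);  /* d -> MEM(+(CONST, a)) */
        break;
    }
    }
    relax(&c[NT_A], c[NT_D] + 1);              /* a -> d (MOVED) */
    relax(&c[NT_D], c[NT_A] + 1);              /* d -> a (MOVEA) */
}
```

Under these assumptions, MEM(+(CONST 1, CONST 2)) derives d at cost 3: an ADDI to materialize one constant in a data register, a MOVED to turn it into an address, and a LOAD that folds the other constant into its offset.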
48
Simple Instruction-Selection in the Pentium Architecture
Six general purpose registers
–Good register allocation
The multiply requires that the left arg. is eax
–For t1 ← t2 * t3
  mov eax, t2
  mul t3
  mov t1, eax
Two-address instructions
–For t1 ← t2 + t3
  mov t1, t2
  add t1, t3
Arithmetic on memory
–add [ebp-8], ecx is equivalent to
  mov eax, [ebp-8]
  add eax, ecx
  mov [ebp-8], eax
Several addressing modes
Variable-length instructions
Instructions with side-effects
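The two-address expansion for addition can be sketched as a small text-emitting helper. The function name `expand_add` and its buffer interface are hypothetical and chosen only for illustration. One case the slide does not spell out is handled explicitly here: if the destination is also the second operand, the initial mov would overwrite it, so the sketch relies on commutativity of add and emits a single instruction instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the two-address code for t1 <- t2 + t3 into buf.
   Returns the number of characters written, or -1 if buf is too small. */
int expand_add(char *buf, size_t n,
               const char *t1, const char *t2, const char *t3) {
    assert(buf != NULL && n > 0);
    int len;
    if (strcmp(t1, t3) == 0) {
        /* t1 <- t2 + t1: "mov t1, t2" would destroy the second operand,
           so add t2 into t1 directly. */
        len = snprintf(buf, n, "add %s, %s\n", t1, t2);
    } else {
        len = snprintf(buf, n, "mov %s, %s\nadd %s, %s\n", t1, t2, t1, t3);
    }
    return (len < 0 || (size_t)len >= n) ? -1 : len;
}
```

The extra mov is a natural candidate for removal by the register allocator when t1 and t2 end up in the same register.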
49
Instruction-Selection in the Tiger Compiler
Use maximal munch
Store the generated code in an abstract data type
–The following phases are machine-independent
–Control flow of the program is explicitly represented
–Special representation of MOVE
  Register allocation can remove redundant moves
50
/* assem.h */
typedef struct {Temp_labelList labels;} AS_targets;
AS_targets AS_Targets(Temp_labelList labels);

typedef struct AS_instr_ *AS_instr;
typedef enum {I_OPER, I_LABEL, I_MOVE} AS_instr_kind;

struct AS_instr_ {
  AS_instr_kind kind;
  union {
    struct {string assem; Temp_tempList dst, src; AS_targets jumps;} OPER;
    struct {string assem; Temp_label label;} LABEL;
    struct {string assem; Temp_tempList dst, src;} MOVE;
  } u;
};

AS_instr AS_Oper(string a, Temp_tempList d, Temp_tempList s, AS_targets j);
AS_instr AS_Label(string a, Temp_label label);
AS_instr AS_Move(string a, Temp_tempList d, Temp_tempList s);
51
Summary
Types of Machines
–CISC (Pentium, MC68000, IBM 370)
–RISC (MIPS, Sparc) 1990-
–Other: VLIW (Itanium) 2000-
Types of Instruction-Selection Algorithms
–Ad hoc
–Optimal using Maximal-Munch
–Optimum using Dynamic Programming
–Optimum using Tree-Grammars and ambiguous parsers