Intermediate Code Generation CS308 Compiler Theory1.


Intermediate Code Generation

Intermediate codes are machine-independent, but they are close to machine instructions.
The intermediate code generator converts the given source-language program into an equivalent program in an intermediate language.
Many different languages can serve as the intermediate language; the compiler designer chooses it.
– Syntax trees can be used as an intermediate language.
– Postfix notation can be used as an intermediate language.
– Three-address code (quadruples) can be used as an intermediate language.
  We will use quadruples to discuss intermediate code generation.
  Quadruples are close to machine instructions, but they are not actual machine instructions.
– Some programming languages have well-defined intermediate languages.
  Java – Java Virtual Machine
  Prolog – Warren Abstract Machine
  In fact, there are byte-code emulators that execute instructions in these intermediate languages.

Three-Address Code (Quadruples)

A quadruple is:
  x := y op z
where x, y and z are names, constants or compiler-generated temporaries, and op is any operator.
We may also use the following notation for quadruples (a much better notation, because it looks like a machine-code instruction):
  op y,z,x
which applies operator op to y and z and stores the result in x.
We use the term “three-address code” because each statement usually contains three addresses (two for the operands, one for the result).
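As a rough illustration of the "op y,z,x" notation, a quadruple can be held in a small record with one operator and three address fields. The class name Quad and its field names are assumptions made for this sketch; they do not come from the slides.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One three-address statement: an operator plus up to three addresses."""
    op: str
    arg1: Optional[str] = None
    arg2: Optional[str] = None
    result: Optional[str] = None

    def __str__(self) -> str:
        # Missing addresses are printed as empty fields, e.g. "uminus a,,c".
        fields = ["" if f is None else str(f) for f in (self.arg1, self.arg2, self.result)]
        sep = " " if self.arg1 is not None else ""
        return f"{self.op}{sep}" + ",".join(fields)


binary = Quad("add", "a", "b", "c")        # c := a + b
unary = Quad("uminus", "a", None, "c")     # c := -a
jump = Quad("jmp", result="L1")            # goto L1
```

Printing these three values gives the same text as the slide notation: `add a,b,c`, `uminus a,,c` and `jmp,,L1`.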

Three-Address Statements

Binary Operator:
  op y,z,result   or   result := y op z
where op is a binary arithmetic or logical operator. The operator is applied to y and z, and the result of the operation is stored in result.
Ex:
  add a,b,c
  gt a,b,c
  addr a,b,c
  addi a,b,c

Unary Operator:
  op y,,result   or   result := op y
where op is a unary arithmetic or logical operator. The operator is applied to y, and the result of the operation is stored in result.
Ex:
  uminus a,,c
  not a,,c
  inttoreal a,,c

Three-Address Statements (cont.)

Move Operator:
  mov y,,result   or   result := y
where the content of y is copied into result.
Ex:
  mov a,,c
  movi a,,c
  movr a,,c

Unconditional Jumps:
  jmp,,L   or   goto L
Control jumps to the three-address statement with the label L, and execution continues from that statement.
Ex:
  jmp,,L1   // jump to L1
  jmp,,7    // jump to statement 7

Three-Address Statements (cont.)

Conditional Jumps:
  jmprelop y,z,L   or   if y relop z goto L
Control jumps to the three-address statement with the label L if the result of y relop z is true, and execution continues from that statement. If the result is false, execution continues from the statement following this conditional jump.
Ex:
  jmpgt y,z,L1    // jump to L1 if y>z
  jmpgte y,z,L1   // jump to L1 if y>=z
  jmpe y,z,L1     // jump to L1 if y==z
  jmpne y,z,L1    // jump to L1 if y!=z
The relational operator can also be a unary operator:
  jmpnz y,,L1     // jump to L1 if y is not zero
  jmpz y,,L1      // jump to L1 if y is zero
  jmpt y,,L1      // jump to L1 if y is true
  jmpf y,,L1      // jump to L1 if y is false

Three-Address Statements (cont.)

Procedure Parameters:
  param x,,   or   param x
Procedure Calls:
  call p,n,   or   call p,n
where x is an actual parameter, and the procedure p is invoked with n parameters.
Ex: p(x1,...,xn) is translated into
  param x1,,
  param x2,,
  ...
  param xn,,
  call p,n,
Ex: f(x+1,y) is translated into
  add x,1,t1
  param t1,,
  param y,,
  call f,2,

Three-Address Statements (cont.)

Indexed Assignments:
  move y[i],,x   or   x := y[i]
  move x,,y[i]   or   y[i] := x
Address and Pointer Assignments:
  moveaddr y,,x   or   x := &y
  movecont y,,x   or   x := *y

Syntax-Directed Translation into Three-Address Code

S → id := E
    S.code = E.code || gen(‘mov’ E.place ‘,,’ id.place)
E → E1 + E2
    E.place = newtemp();
    E.code = E1.code || E2.code || gen(‘add’ E1.place ‘,’ E2.place ‘,’ E.place)
E → E1 * E2
    E.place = newtemp();
    E.code = E1.code || E2.code || gen(‘mult’ E1.place ‘,’ E2.place ‘,’ E.place)
E → - E1
    E.place = newtemp();
    E.code = E1.code || gen(‘uminus’ E1.place ‘,,’ E.place)
E → ( E1 )
    E.place = E1.place;
    E.code = E1.code
E → id
    E.place = id.place;
    E.code = null

Syntax-Directed Translation (cont.)

S → while E do S1
    S.begin = newlabel();
    S.after = newlabel();
    S.code = gen(S.begin ‘:’) || E.code || gen(‘jmpf’ E.place ‘,,’ S.after) || S1.code || gen(‘jmp’ ‘,,’ S.begin) || gen(S.after ‘:’)
S → if E then S1 else S2
    S.else = newlabel();
    S.after = newlabel();
    S.code = E.code || gen(‘jmpf’ E.place ‘,,’ S.else) || S1.code || gen(‘jmp’ ‘,,’ S.after) || gen(S.else ‘:’) || S2.code || gen(S.after ‘:’)

Translation Scheme to Produce Three-Address Code

S → id := E
    { p = lookup(id.name);
      if (p is not nil) then emit(‘mov’ E.place ‘,,’ p)
      else error(“undefined-variable”) }
E → E1 + E2
    { E.place = newtemp();
      emit(‘add’ E1.place ‘,’ E2.place ‘,’ E.place) }
E → E1 * E2
    { E.place = newtemp();
      emit(‘mult’ E1.place ‘,’ E2.place ‘,’ E.place) }
E → - E1
    { E.place = newtemp();
      emit(‘uminus’ E1.place ‘,,’ E.place) }
E → ( E1 )
    { E.place = E1.place; }
E → id
    { p = lookup(id.name);
      if (p is not nil) then E.place = id.place
      else error(“undefined-variable”) }
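The translation scheme above can be sketched as a small recursive code generator. This is only an illustration under stated assumptions: expressions arrive as nested tuples rather than from a parser, the symbol table is a plain set, and the class name CodeGen and the node tags "id", "neg", "+" and "*" are invented for the example. As in the scheme, a temporary is allocated only after the operands have been translated.

```python
class CodeGen:
    """Emits three-address code for assignments, following the emit-based scheme."""

    def __init__(self, symbols):
        self.symbols = set(symbols)   # stands in for the symbol table behind lookup()
        self.code = []                # emitted three-address statements, in order
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def emit(self, text):
        self.code.append(text)

    def lookup(self, name):
        if name not in self.symbols:
            raise NameError(f"undefined-variable: {name}")
        return name

    def expr(self, node):
        """Return E.place for the node; code is emitted as a side effect."""
        kind = node[0]
        if kind == "id":                          # E -> id
            return self.lookup(node[1])
        if kind == "neg":                         # E -> - E1
            operand = self.expr(node[1])
            place = self.newtemp()
            self.emit(f"uminus {operand},,{place}")
            return place
        opcode = {"+": "add", "*": "mult"}[kind]  # E -> E1 + E2 | E1 * E2
        left = self.expr(node[1])
        right = self.expr(node[2])
        place = self.newtemp()
        self.emit(f"{opcode} {left},{right},{place}")
        return place

    def assign(self, target, node):               # S -> id := E
        place = self.expr(node)
        self.emit(f"mov {place},,{self.lookup(target)}")


gen = CodeGen({"a", "b", "c", "x"})
# x := a + b * (-c)
gen.assign("x", ("+", ("id", "a"), ("*", ("id", "b"), ("neg", ("id", "c")))))
```

After the call, gen.code holds four statements: the unary minus, the multiplication, the addition, and the final mov into x. Parentheses need no case of their own here because the tuple nesting already encodes them, which mirrors E → ( E1 ) emitting no code.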

Translation Scheme with Locations

S → id := { E.inloc = S.inloc } E
    { p = lookup(id.name);
      if (p is not nil) then { emit(E.outloc ‘mov’ E.place ‘,,’ p); S.outloc = E.outloc+1 }
      else { error(“undefined-variable”); S.outloc = E.outloc } }
E → { E1.inloc = E.inloc } E1 + { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘add’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc = E2.outloc+1 }
E → { E1.inloc = E.inloc } E1 * { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘mult’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc = E2.outloc+1 }
E → - { E1.inloc = E.inloc } E1
    { E.place = newtemp(); emit(E1.outloc ‘uminus’ E1.place ‘,,’ E.place); E.outloc = E1.outloc+1 }
E → ( { E1.inloc = E.inloc } E1 )
    { E.place = E1.place; E.outloc = E1.outloc }
E → id
    { E.outloc = E.inloc; p = lookup(id.name);
      if (p is not nil) then E.place = id.place
      else error(“undefined-variable”) }

Note: the parenthesized production emits no instruction, so it passes E1.outloc on unchanged.

Boolean Expressions

E → { E1.inloc = E.inloc } E1 and { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘and’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc = E2.outloc+1 }
E → { E1.inloc = E.inloc } E1 or { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘or’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc = E2.outloc+1 }
E → not { E1.inloc = E.inloc } E1
    { E.place = newtemp(); emit(E1.outloc ‘not’ E1.place ‘,,’ E.place); E.outloc = E1.outloc+1 }
E → { E1.inloc = E.inloc } E1 relop { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc relop.code E1.place ‘,’ E2.place ‘,’ E.place); E.outloc = E2.outloc+1 }

Translation Scheme (cont.)

S → while { E.inloc = S.inloc } E do
      { emit(E.outloc ‘jmpf’ E.place ‘,,’ ‘NOTKNOWN’); S1.inloc = E.outloc+1; }
    S1
      { emit(S1.outloc ‘jmp’ ‘,,’ S.inloc); S.outloc = S1.outloc+1; backpatch(E.outloc, S.outloc); }
S → if { E.inloc = S.inloc } E then
      { emit(E.outloc ‘jmpf’ E.place ‘,,’ ‘NOTKNOWN’); S1.inloc = E.outloc+1; }
    S1 else
      { emit(S1.outloc ‘jmp’ ‘,,’ ‘NOTKNOWN’); S2.inloc = S1.outloc+1; backpatch(E.outloc, S2.inloc); }
    S2
      { S.outloc = S2.outloc; backpatch(S1.outloc, S.outloc); }

Three Address Codes - Example

x:=1;
y:=x+10;
while (x<y) {
  x:=x+1;
  if (x%2==1) then y:=y+1;
  else y:=y-2;
}

01: mov 1,,x
02: add x,10,t1
03: mov t1,,y
04: lt x,y,t2
05: jmpf t2,,17
06: add x,1,t3
07: mov t3,,x
08: mod x,2,t4
09: eq t4,1,t5
10: jmpf t5,,14
11: add y,1,t6
12: mov t6,,y
13: jmp,,16
14: sub y,2,t7
15: mov t7,,y
16: jmp,,4
17:
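One way to check a listing like this is to run it. The following is a minimal interpreter sketch for just the operators used in the example (mov, add, sub, mod, lt, eq, jmp, jmpf). The tuple encoding (op, arg1, arg2, result) and the function name run are assumptions of the sketch, not part of the slides.

```python
def run(program):
    """Interpret {location: (op, arg1, arg2, result)}, starting at location 1."""
    env = {}

    def value(operand):
        # An operand is either an integer constant or a variable/temporary name.
        return operand if isinstance(operand, int) else env[operand]

    binary = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mod": lambda a, b: a % b,
        "lt": lambda a, b: a < b,
        "eq": lambda a, b: a == b,
    }
    pc = 1
    while pc in program:              # location 17 holds no statement, so we stop there
        op, arg1, arg2, result = program[pc]
        pc += 1
        if op == "mov":
            env[result] = value(arg1)
        elif op == "jmp":
            pc = result
        elif op == "jmpf":
            if not value(arg1):
                pc = result
        else:
            env[result] = binary[op](value(arg1), value(arg2))
    return env


program = {
    1: ("mov", 1, None, "x"),
    2: ("add", "x", 10, "t1"),
    3: ("mov", "t1", None, "y"),
    4: ("lt", "x", "y", "t2"),
    5: ("jmpf", "t2", None, 17),
    6: ("add", "x", 1, "t3"),
    7: ("mov", "t3", None, "x"),
    8: ("mod", "x", 2, "t4"),
    9: ("eq", "t4", 1, "t5"),
    10: ("jmpf", "t5", None, 14),
    11: ("add", "y", 1, "t6"),
    12: ("mov", "t6", None, "y"),
    13: ("jmp", None, None, 16),
    14: ("sub", "y", 2, "t7"),
    15: ("mov", "t7", None, "y"),
    16: ("jmp", None, None, 4),
}
env = run(program)
```

Tracing the source program by hand gives the (x, y) pairs (1,11), (2,9), (3,10), (4,8), (5,9), (6,7), (7,8), (8,6) at the loop test, so the loop exits with x = 8 and y = 6, and the interpreted quadruples end in the same state.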

Arrays

Elements of an array can be accessed quickly if they are stored in a block of consecutive locations.
A one-dimensional array A:
  [figure: the block of consecutive locations of A, marking base_A, the element indices from low to i, and the element width]
  base_A is the address of the first location of the array A,
  width is the width of each array element,
  low is the index of the first array element.
  location of A[i] → base_A + (i-low)*width

Arrays (cont.)

  base_A + (i-low)*width
can be rewritten as
  i*width + (base_A - low*width)
where i*width must be computed at run time, and (base_A - low*width) can be computed at compile time.
So the location of A[i] can be computed at run time by evaluating the formula i*width+c, where c = base_A - low*width is evaluated at compile time.
The intermediate code generator should produce code to evaluate this formula i*width+c (one multiplication and one addition).

Two-Dimensional Arrays

A two-dimensional array can be stored in
– either row-major (row-by-row) order, or
– column-major (column-by-column) order.
Most programming languages use the row-major method.
Row-major representation of a two-dimensional array:
  row 1 | row 2 | ... | row n

Two-Dimensional Arrays (cont.)

The location of A[i1,i2] is
  base_A + ((i1-low1)*n2 + i2-low2)*width
where
  base_A is the location of the array A,
  low1 is the index of the first row,
  low2 is the index of the first column,
  n2 is the number of elements in each row,
  width is the width of each array element.
Again, this formula can be rewritten as
  ((i1*n2)+i2)*width + (base_A - ((low1*n2)+low2)*width)
where ((i1*n2)+i2)*width must be computed at run time, and (base_A - ((low1*n2)+low2)*width) can be computed at compile time.

Multi-Dimensional Arrays

In general, the location of A[i1,i2,...,ik] is
  ((... ((i1*n2)+i2) ...)*nk + ik)*width + (base_A - ((... ((low1*n2)+low2) ...)*nk + lowk)*width)
So the intermediate code generator should produce code to evaluate the following formula (to find the location of A[i1,i2,...,ik]):
  ((... ((i1*n2)+i2) ...)*nk + ik)*width + c
To evaluate the ((... ((i1*n2)+i2) ...)*nk + ik) portion of this formula, we can use the recurrence:
  e1 = i1
  em = e(m-1) * nm + im
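The split into a compile-time constant c and a run-time recurrence can be checked numerically. In the sketch below, the function names and the base address 1000 are assumptions chosen for illustration; direct_address computes the location from the zero-based offsets (i - low) as an independent reference.

```python
def constant_part(base, lows, dims, width):
    """Compile-time part: c = base - ((...(low1*n2 + low2)...)*nk + lowk)*width."""
    acc = lows[0]
    for low, n in zip(lows[1:], dims[1:]):
        acc = acc * n + low
    return base - acc * width


def address(indices, dims, width, c):
    """Run-time part: e1 = i1, em = e(m-1)*nm + im, then ek*width + c."""
    e = indices[0]
    for i, n in zip(indices[1:], dims[1:]):
        e = e * n + i
    return e * width + c


def direct_address(base, indices, lows, dims, width):
    """Reference computation using the row-major element number of (i - low) offsets."""
    pos = 0
    for i, low, n in zip(indices, lows, dims):
        pos = pos * n + (i - low)
    return base + pos * width


# A two-dimensional int array A : 1..10 x 1..20, width 4, assumed to start at address 1000.
dims, lows, width, base = [10, 20], [1, 1], 4, 1000
c = constant_part(base, lows, dims, width)   # 1000 - (1*20 + 1)*4
first = address([1, 1], dims, width, c)
other = address([3, 5], dims, width, c)
```

Note that n1 never appears in either formula: only n2 through nk are multiplied in, which is why the loops skip the first entry of dims.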

Translation Scheme for Arrays

If we use the following grammar to calculate the addresses of array elements, we need inherited attributes:
  L → id | id [ Elist ]
  Elist → Elist , E | E
Instead, we will use the following grammar, which needs only synthesized attributes:
  L → id | Elist ]
  Elist → Elist , E | id [ E

Translation Scheme for Arrays (cont.)

S → L := E
    { if (L.offset is null) emit(‘mov’ E.place ‘,,’ L.place)
      else emit(‘mov’ E.place ‘,,’ L.place ‘[’ L.offset ‘]’) }
E → E1 + E2
    { E.place = newtemp();
      emit(‘add’ E1.place ‘,’ E2.place ‘,’ E.place) }
E → ( E1 )
    { E.place = E1.place; }
E → L
    { if (L.offset is null) E.place = L.place
      else { E.place = newtemp(); emit(‘mov’ L.place ‘[’ L.offset ‘]’ ‘,,’ E.place) } }

Translation Scheme for Arrays (cont.)

L → id
    { L.place = id.place; L.offset = null; }
L → Elist ]
    { L.place = newtemp(); L.offset = newtemp();
      emit(‘mov’ c(Elist.array) ‘,,’ L.place);
      emit(‘mult’ Elist.place ‘,’ width(Elist.array) ‘,’ L.offset) }
Elist → Elist1 , E
    { Elist.array = Elist1.array; Elist.place = newtemp(); Elist.ndim = Elist1.ndim + 1;
      emit(‘mult’ Elist1.place ‘,’ limit(Elist.array,Elist.ndim) ‘,’ Elist.place);
      emit(‘add’ Elist.place ‘,’ E.place ‘,’ Elist.place); }
Elist → id [ E
    { Elist.array = id.place; Elist.place = E.place; Elist.ndim = 1; }

Translation Scheme for Arrays – Example1

A one-dimensional double array A : 5..100
  → n1=96  width=8 (double)  low1=5
Intermediate codes corresponding to x := A[y]
  mov c,,t1        // where c = base_A - (5)*8
  mult y,8,t2
  mov t1[t2],,t3
  mov t3,,x

Translation Scheme for Arrays – Example2

A two-dimensional int array A : 1..10x1..20
  → n1=10  n2=20  width=4 (integers)  low1=1  low2=1
Intermediate codes corresponding to x := A[y,z]
  mult y,20,t1
  add t1,z,t1
  mov c,,t2        // where c = base_A - (1*20+1)*4
  mult t1,4,t3
  mov t2[t3],,t4
  mov t4,,x

Translation Scheme for Arrays – Example3

A three-dimensional int array A : 0..9x0..19x0..29
  → n1=10  n2=20  n3=30  width=4 (integers)  low1=0  low2=0  low3=0
Intermediate codes corresponding to x := A[w,y,z]
  mult w,20,t1
  add t1,y,t1
  mult t1,30,t2
  add t2,z,t2
  mov c,,t3        // where c = base_A - ((0*20+0)*30+0)*4
  mult t2,4,t4
  mov t3[t4],,t5
  mov t5,,x

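Test Yourself

for i:=1 to M do
  for j:=1 to N do
    A[i,j]:=B[i,j]

Assume that the lower bound of every array dimension is 1, that each array element has size 1, and that arrays are stored in row-major order. Write the quadruple intermediate code.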

B1: I := 1                  (:=, 1, _, I)
B2: if I > M goto B7        (J>, I, M, B7)
B3: J := 1                  (:=, 1, _, J)
B4: if J > N goto B6        (J>, J, N, B6)
B5: T1 := I * N             (*, I, N, T1)
    T2 := T1 + J            (+, T1, J, T2)
    T3 := addr(A) - C       (-, addr(A), C, T3)     /* C = N + 1 */
    T4 := I * N             (*, I, N, T4)
    T5 := T4 + J            (+, T4, J, T5)
    T6 := addr(B) - C       (-, addr(B), C, T6)
    T7 := T6[T5]            (=[], T6[T5], _, T7)
    T3[T2] := T7            ([]=, T7, _, T3[T2])    /* A[I,J] := B[I,J] */
    J := J + 1              (+, J, 1, J)
    goto B4                 (J, _, _, B4)
B6: I := I + 1              (+, I, 1, I)
    goto B2                 (J, _, _, B2)
B7:

boolean expressions

Boolean expressions serve two purposes:
  they compute a logical value;
  they are used as conditional expressions in flow-of-control statements.
E → E || E | E && E | ! E | ( E ) | E relop E | true | false

methods of translating boolean expressions

  numerical computation, e.g. 1 denotes true and 0 denotes false
  position in a program (flow of control), e.g. in if-then-else statements

numerical representation – complete evaluation

E → E1 || E2
    { E.place := newtemp;
      emit(E.place, ‘:=’, E1.place, ‘or’, E2.place) }
E → id1 relop id2
    { E.place := newtemp;
      emit(‘if’, id1.place, relop.op, id2.place, ‘goto’, nextstat+3);
      emit(E.place, ‘:=’, ‘0’);
      emit(‘goto’, nextstat+2);
      emit(E.place, ‘:=’, ‘1’) }

translation of a<b || c<d && e<f

100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1
104: if c < d goto 107
105: t2 := 0
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4

flow of control – short-circuit evaluation

E → E1 || E2
    { E1.true := E.true;   /* E.true: the label to which control flows if E is true */
      E1.false := newlabel();
      E2.true := E.true;
      E2.false := E.false;
      E.code := E1.code || gen(E1.false, ‘:’) || E2.code }

E → E1 && E2
    { E1.true := newlabel();
      E1.false := E.false;
      E2.true := E.true;
      E2.false := E.false;
      E.code := E1.code || gen(E1.true, ‘:’) || E2.code }

E → ! E1
    { E1.true := E.false;
      E1.false := E.true;
      E.code := E1.code }

E → E1 relop E2
    { E.code := E1.code || E2.code || gen(‘if’, E1.place, relop.op, E2.place, ‘goto’, E.true) || gen(‘goto’, E.false) }
E → true
    { E.code := gen(‘goto’, E.true) }
E → false
    { E.code := gen(‘goto’, E.false) }

translation of a<b or c<d and e<f

      if a < b goto Ltrue
      goto L1
L1:   if c < d goto L2
      goto Lfalse
L2:   if e < f goto Ltrue
      goto Lfalse

There are redundant goto statements!
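The short-circuit rules can be sketched as a recursive function that receives the inherited labels E.true and E.false as arguments. This is an illustrative sketch under assumptions: expressions are nested tuples tagged "relop", "not", "or" and "and", operands are plain names, and the function name jumping_code is invented here.

```python
import itertools


def jumping_code(node, true, false, newlabel):
    """Return the list of statements for node, given its true and false target labels."""
    kind = node[0]
    if kind == "relop":
        _, left, op, right = node
        return [f"if {left} {op} {right} goto {true}", f"goto {false}"]
    if kind == "not":
        # Swap the targets; no code of its own.
        return jumping_code(node[1], false, true, newlabel)
    if kind == "or":
        e1_false = newlabel()          # E1.false := newlabel()
        return (jumping_code(node[1], true, e1_false, newlabel)
                + [f"{e1_false}:"]
                + jumping_code(node[2], true, false, newlabel))
    if kind == "and":
        e1_true = newlabel()           # E1.true := newlabel()
        return (jumping_code(node[1], e1_true, false, newlabel)
                + [f"{e1_true}:"]
                + jumping_code(node[2], true, false, newlabel))
    raise ValueError(f"unknown node kind: {kind}")


counter = itertools.count(1)


def newlabel():
    return f"L{next(counter)}"


# a<b or c<d and e<f, with "and" binding tighter than "or"
expr = ("or", ("relop", "a", "<", "b"),
        ("and", ("relop", "c", "<", "d"), ("relop", "e", "<", "f")))
code = jumping_code(expr, "Ltrue", "Lfalse", newlabel)
```

The result reproduces the listing above, including the redundant `goto L1` that jumps to the very next statement.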

translation of flow-of-control statements

S → if (E) S1 | if (E) S1 else S2 | while (E) S1 | S1 S2
S.next: the label attached to the first three-address statement to be executed after the code for S

code for flow-of-control statements

[figure: code layouts for (a) if-then, (b) if-then-else and (c) while-do. In each layout E.code jumps to E.true or to E.false, and E.true labels S1.code. In (a), E.false is S.next. In (b), S1.code is followed by goto S.next, and E.false labels S2.code. In (c), S.begin labels E.code, S1.code is followed by goto S.begin, and E.false is S.next.]

S → if (E) S1
    { E.true := newlabel();
      E.false := S.next;
      S1.next := S.next;
      S.code := E.code || gen(E.true, ‘:’) || S1.code }

S → if (E) S1 else S2
    { E.true := newlabel();
      E.false := newlabel();
      S1.next := S.next;
      S2.next := S.next;
      S.code := E.code || gen(E.true, ‘:’) || S1.code || gen(‘goto’, S.next) || gen(E.false, ‘:’) || S2.code }

S → while (E) S1
    { S.begin := newlabel();
      E.true := newlabel();
      E.false := S.next;
      S1.next := S.begin;
      S.code := gen(S.begin, ‘:’) || E.code || gen(E.true, ‘:’) || S1.code || gen(‘goto’, S.begin) }

S → S1 S2
    { S1.next := newlabel();
      S2.next := S.next;
      S.code := S1.code || gen(S1.next, ‘:’) || S2.code }

Example

if (x < 100 || x > 200 && x != y) x = 0;
=>
      if x < 100 goto L2
      goto L3
L3:   if x > 200 goto L4
      goto L1
L4:   if x != y goto L2
      goto L1
L2:   x = 0
L1:

How to generate more efficient code

      if x > 200 goto L4
      goto L1
L4:   …
→
      if x <= 200 goto L1
L4:   …   (fall through)

S → if (E) S1
    { E.true := fall;   // not newlabel
      E.false := S.next;
      S1.next := S.next;
      S.code := E.code || S1.code }
For if (E) S else S and while (E) S, E.true is likewise set to fall.

Using fall through

E → E1 relop E2
    { test = E1 relop E2
      s = if E.true != fall and E.false != fall then
            gen(‘if’ test ‘goto’, E.true) || gen(‘goto’, E.false)
          else if (E.true != fall) then gen(‘if’ test ‘goto’, E.true)
          else if (E.false != fall) then gen(‘if’ ! test ‘goto’, E.false)
          else ‘’
      E.code := E1.code || E2.code || s }

E → E1 || E2
    { E1.true := if (E.true = fall) newlabel() else E.true;
      E1.false := fall;
      E2.true := E.true;
      E2.false := E.false;
      E.code := if (E.true = fall) then E1.code || E2.code || gen(E1.true, ‘:’)
                else E1.code || E2.code }

E → E1 && E2
    { E1.false := if (E.false = fall) newlabel() else E.false;
      E1.true := fall;
      E2.true := E.true;
      E2.false := E.false;
      E.code := if (E.false = fall) then E1.code || E2.code || gen(E1.false, ‘:’)
                else E1.code || E2.code }
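The fall-through rules can be sketched in the same style as the plain short-circuit version. Assumptions in this sketch: FALL is represented by Python's None, the relational operator is negated through a small lookup table rather than by emitting `! test`, and the names cond and if_stmt are invented for the example.

```python
import itertools

FALL = None  # assumed marker meaning "no jump needed, control falls through"
NEGATE = {"<": ">=", "<=": ">", ">": "<=", ">=": "<", "==": "!=", "!=": "=="}


def cond(node, true, false, newlabel):
    """Jumping code for a boolean expression whose targets may be FALL."""
    kind = node[0]
    if kind == "relop":
        _, left, op, right = node
        if true is not FALL and false is not FALL:
            return [f"if {left} {op} {right} goto {true}", f"goto {false}"]
        if true is not FALL:
            return [f"if {left} {op} {right} goto {true}"]
        if false is not FALL:
            # Jump on the negated test and fall through when the test holds.
            return [f"if {left} {NEGATE[op]} {right} goto {false}"]
        return []
    if kind == "or":
        e1_true = true if true is not FALL else newlabel()
        code = cond(node[1], e1_true, FALL, newlabel) + cond(node[2], true, false, newlabel)
        return code + ([f"{e1_true}:"] if true is FALL else [])
    if kind == "and":
        e1_false = false if false is not FALL else newlabel()
        code = cond(node[1], FALL, e1_false, newlabel) + cond(node[2], true, false, newlabel)
        return code + ([f"{e1_false}:"] if false is FALL else [])
    raise ValueError(f"unknown node kind: {kind}")


def if_stmt(test, body, s_next, newlabel):
    """S -> if (E) S1 with E.true := fall and E.false := S.next."""
    return cond(test, FALL, s_next, newlabel) + body


counter = itertools.count(2)   # start at L2 so that L1 can serve as S.next


def newlabel():
    return f"L{next(counter)}"


# if (x < 100 || x > 200 && x != y) x = 0;
test = ("or", ("relop", "x", "<", "100"),
        ("and", ("relop", "x", ">", "200"), ("relop", "x", "!=", "y")))
code = if_stmt(test, ["x = 0"], "L1", newlabel) + ["L1:"]
```

For this statement the sketch produces three conditional jumps and no unconditional goto at all, compared with three of each in the version without fall through.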

case statements

switch E
begin
  case V1: S1
  case V2: S2
  ...
  case Vn-1: Sn-1
  default: Sn
end

To facilitate case optimization, we need to provide special IL instructions so that compilers can recognize the case construct and perform appropriate optimizations:
  case V1 L1
  case V2 L2
  ...
  case Vn-1 Ln-1
  case t Ln
next:

backpatching

Backpatching allows intermediate code to be generated in one pass. (The problem with the previous translation scheme is that it uses inherited attributes such as S.next, which are not suitable for implementation in bottom-up parsers.)
Idea: the labels (in the three-address code) are filled in once the target locations are known.
Attributes: E.truelist (the list of true exits), E.falselist (the list of false exits)

We represent all the generated intermediate code (assumed to be quadruples) as an array of instructions, so a label can be regarded as an index into this array. The variable nextquad holds the index of the next quadruple; the function emit() generates a quadruple and increments nextquad.

Three auxiliary functions:
  makelist(i): creates a list containing i (an index into the quadruples)
  merge(p1,p2): returns the concatenation of the lists p1 and p2
  backpatch(p,i): inserts i as the target label for each of the statements on list p

boolean expressions

For E1 or E2, we know we must evaluate E2 if E1 is false. We use a marker nonterminal M.
E → E1 or M E2
E → E1 and M E2
E → not E1
E → ( E1 )
E → id1 relop id2 | true | false
M → ε

The attribute M.quad records the number of the first statement (quadruple) of E2.code.
E → E1 or M E2
    { backpatch(E1.falselist, M.quad);
      E.truelist := merge(E1.truelist, E2.truelist);
      E.falselist := E2.falselist }
M → ε
    { M.quad := nextquad }

E → E1 and M E2
    { backpatch(E1.truelist, M.quad);
      E.falselist := merge(E1.falselist, E2.falselist);
      E.truelist := E2.truelist }
E → not E1
    { E.truelist := E1.falselist;
      E.falselist := E1.truelist; }
E → ( E1 )
    { E.truelist := E1.truelist;
      E.falselist := E1.falselist; }

E → id1 relop id2
    { E.truelist := makelist(nextquad);
      E.falselist := makelist(nextquad+1);
      emit(‘if’ id1.place relop.op id2.place ‘goto _’);
      emit(‘goto _’); }
E → true
    { E.truelist := makelist(nextquad);
      emit(‘goto _’); }
E → false
    { E.falselist := makelist(nextquad);
      emit(‘goto _’); }
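The backpatching rules for boolean expressions can be sketched as follows. Assumptions: quadruples are numbered from 100, each is stored as a [text, target] pair whose target stays None (printed as `_`) until it is backpatched, plain Python lists play the role of makelist and merge, and the class name Backpatcher is invented for the sketch. The marker M is simply the value of nextquad read between translating E1 and E2.

```python
class Backpatcher:
    """One-pass translation of boolean expressions with truelist/falselist."""

    def __init__(self, start=100):
        self.start = start
        self.quads = []                      # each entry is [text, target or None]

    @property
    def nextquad(self):
        return self.start + len(self.quads)

    def emit(self, text):
        self.quads.append([text, None])      # emitting also advances nextquad

    def backpatch(self, indices, target):
        for i in indices:
            self.quads[i - self.start][1] = target

    def translate(self, node):
        """Return (truelist, falselist) for the node."""
        kind = node[0]
        if kind == "relop":
            _, left, op, right = node
            truelist, falselist = [self.nextquad], [self.nextquad + 1]
            self.emit(f"if {left} {op} {right} goto")
            self.emit("goto")
            return truelist, falselist
        if kind == "not":
            t, f = self.translate(node[1])
            return f, t
        t1, f1 = self.translate(node[1])
        m_quad = self.nextquad               # M -> epsilon { M.quad := nextquad }
        t2, f2 = self.translate(node[2])
        if kind == "or":
            self.backpatch(f1, m_quad)
            return t1 + t2, f2
        if kind == "and":
            self.backpatch(t1, m_quad)
            return t2, f1 + f2
        raise ValueError(f"unknown node kind: {kind}")

    def listing(self):
        return [f"{self.start + k}: {text} {'_' if tgt is None else tgt}"
                for k, (text, tgt) in enumerate(self.quads)]


bp = Backpatcher()
expr = ("or", ("relop", "a", "<", "b"),
        ("and", ("relop", "c", "<", "d"), ("relop", "e", "<", "f")))
truelist, falselist = bp.translate(expr)
```

After translation, the jumps at 100 and 104 are still open on the true list and those at 103 and 105 on the false list; they are filled in later by whichever statement rule encloses the expression.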

We use an attribute S.nextlist for the list of jumps to the quadruple that follows S in execution order. We define L.nextlist similarly.
The need for .nextlist: in
  if E then S1 else S2
we must emit a ‘goto _’ after the code for S1 to skip over the code for S2.

S → if E then M S1
    { backpatch(E.truelist, M.quad);
      S.nextlist := merge(E.falselist, S1.nextlist) }
S → if E then M1 S1 N else M2 S2
    { backpatch(E.truelist, M1.quad);
      backpatch(E.falselist, M2.quad);
      S.nextlist := merge(S1.nextlist, N.nextlist, S2.nextlist) }

N → ε
    { N.nextlist := makelist(nextquad);
      emit(‘goto _’); }
M → ε
    { M.quad := nextquad; }
S → while M1 E do M2 S1
    { backpatch(E.truelist, M2.quad);
      backpatch(S1.nextlist, M1.quad);
      S.nextlist := E.falselist;
      emit(‘goto’ M1.quad) }

S → begin L end
    { S.nextlist := L.nextlist; }
S → A
    { S.nextlist := nil; }
L → L1 ; M S
    { backpatch(L1.nextlist, M.quad);
      L.nextlist := S.nextlist; }
L → S
    { L.nextlist := S.nextlist; }

Labels and Gotos

For goto L, we have to change L into the address of the three-address statement to which L is attached.
When L’s address is already known, we can do this easily with the information in the symbol table.
Otherwise, we have to use backpatching: we keep the list of addresses to be filled in the symbol table.

Break and continue

We must remember the enclosing statement S (while, for, switch) that contains the break or continue.
Generate code such as goto _, and add the index of this quadruple to S.nextlist so that it is backpatched later.

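procedure calls

Assume procedure calls are generated by the grammar:
  S → call id ( Elist )
  Elist → Elist , E
  Elist → E
One translation scheme is to evaluate the parameters first and then pass them together. We have to use a queue to store the evaluated results.

S → call id ( Elist )
    { for each item p on queue do emit(‘param’, p);
      emit(‘call’, id.place) }
Elist → Elist , E
    { append(E.place, queue) }
Elist → E
    { queue := initqueue(E.place) }
queue is a global variable.

Another approach is to use a stack:
  S → Elistp )
  Elistp → Elistp , E | call id ( E
We can add the synthesized attributes Elistp.procname and Elistp.paramstack.
This is relatively easy to implement in an LR parser. When the closing ) is finally seen, type checking can be done.

In a real compiler the problem is more complicated. First, we have to handle functions as well as procedures, and function calls can be nested, so a global variable cannot be used. In addition, there is no need to pass all the parameters together; each one can be passed as soon as it has been computed. (param.c)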

Declarations

P → M D
M → ε
    { offset=0 }
D → D ; D
D → id : T
    { enter(id.name,T.type,offset); offset=offset+T.width }
T → int
    { T.type=int; T.width=4 }
T → real
    { T.type=real; T.width=8 }
T → array[num] of T1
    { T.type=array(num.val,T1.type); T.width=num.val*T1.width }
T → ↑ T1
    { T.type=pointer(T1.type); T.width=4 }
where enter creates a symbol table entry with the given values.
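The offset bookkeeping in these declaration rules can be sketched in a few lines. Assumptions: types are written as the strings "int" and "real" or as tuples ("array", n, elem) and ("pointer", elem), the symbol table is a dictionary, and the function names width and declare are invented here; the widths 4 and 8 are the ones given in the rules above.

```python
def width(t):
    """T.width for a type expression."""
    if t == "int":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":                 # array[num] of T1: num.val * T1.width
        return t[1] * width(t[2])
    if t[0] == "pointer":               # pointer to T1: fixed width 4
        return 4
    raise ValueError(f"unknown type: {t!r}")


def declare(decls):
    """Process 'id : T' declarations in order, as enter(...) and offset += T.width."""
    table = {}
    offset = 0                          # M -> epsilon { offset = 0 }
    for name, t in decls:
        table[name] = (t, offset)       # enter(id.name, T.type, offset)
        offset += width(t)
    return table, offset


table, total = declare([
    ("i", "int"),
    ("x", "real"),
    ("a", ("array", 10, "int")),
    ("p", ("pointer", "real")),
])
```

Here i, x, a and p receive the offsets 0, 4, 12 and 52, and the declarations occupy 56 bytes in total.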

Nested Procedure Declarations

For each procedure we should create a symbol table.
  mktable(previous) – creates a new symbol table, where previous is the parent symbol table of this new symbol table
  enter(symtable,name,type,offset) – creates a new entry for a variable in the given symbol table
  enterproc(symtable,name,newsymbtable) – creates a new entry for the procedure in the symbol table of its parent
  addwidth(symtable,width) – puts the total width of all entries in the symbol table into the header of that table
We will have two stacks:
– tblptr – to hold the pointers to the symbol tables
– offset – to hold the current offsets in the symbol tables in the tblptr stack

Nested Procedure Declarations

P → M D
    { addwidth(top(tblptr),top(offset)); pop(tblptr); pop(offset) }
M → ε
    { t=mktable(nil); push(t,tblptr); push(0,offset) }
D → D ; D
D → proc id N D ; S
    { t=top(tblptr); addwidth(t,top(offset)); pop(tblptr); pop(offset);
      enterproc(top(tblptr),id.name,t) }
D → id : T
    { enter(top(tblptr),id.name,T.type,top(offset));
      top(offset)=top(offset)+T.width }
N → ε
    { t=mktable(top(tblptr)); push(t,tblptr); push(0,offset) }

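Test Yourself

S → do S(1) While E
Its semantics are:
  [figure: the code for S(1) followed by the code for E, with exits labelled true and false]
For a bottom-up parser, construct a translation scheme for this statement as follows:
(1) write productions suitable for syntax-directed translation;
(2) write the semantic action for each production.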

Answer 1:
G(S):
  R → do
  U → R S(1) While
  S → U E

  R → do             { R.QUAD := NXQ }
  U → R S(1) While   { U.QUAD := R.QUAD; BACKPATCH(S(1).CHAIN, NXQ) }
  S → U E            { BACKPATCH(E.TC, U.QUAD); S.CHAIN := E.FC }

Answer 2:
(1) S → do M1 S(1) While M2 E
    M → ε                                        (3 points)
(2) M → ε { M.QUAD := NXQ }                      (6 points)
    S → do M1 S(1) While M2 E
        { BACKPATCH(S(1).CHAIN, M2.QUAD); BACKPATCH(E.TC, M1.QUAD); S.CHAIN := E.FC }

