Intermediate Code

- Generate a machine-independent intermediate form.
- This decouples the back end from the front end and facilitates retargeting.
- A machine-independent code optimizer can be applied at this stage.

Position of the intermediate code generator:

    parser -> static semantic analysis -> intermediate code generation -> machine-independent code optimization -> target code generation -> machine-specific code optimization
Intermediate languages: many kinds, for different purposes

- High-level representation, used for source-to-source translation because it keeps the program structure:
    abstract syntax tree
- Low-level representation, used when compiling for a target machine. It is an intermediate form that is close to low-level machine language:
    three-address code (more later)
    gcc uses RTL, a variation of three-address code
- Other commonly used intermediate representations:
    control flow graph, program dependence graph (PDG), DAG (directed acyclic graph)
Three-address code: a sequence of statements of the form x := y op z

Three-address statements are close to assembly statements (OP src1 src2 dst).

Example: a := b * -c + b * -c

Translating each operation separately:

    t1 := -c
    t2 := b * t1
    t3 := -c
    t4 := b * t3
    t5 := t2 + t4
    a := t5

Computing the repeated subexpression b * -c only once:

    t1 := -c
    t2 := b * t1
    t3 := t2 + t2
    a := t3
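The two listings above can be reproduced with a short generator. This is only a minimal sketch: the tuple encoding of expressions, the function name gen_tac, and the share flag are assumptions made for illustration, not part of the original slides. With share=True, identical subexpressions reuse one temporary, which roughly corresponds to generating code from a DAG instead of a tree.

```python
import itertools


def gen_tac(target, expr, share=False):
    """Return a list of three-address statements that assign expr to target.

    Hypothetical encoding: expr is a variable name (str), ("neg", e) for
    unary minus, or (op, left, right) for a binary operation.
    """
    code = []
    counter = itertools.count(1)   # numbers the temporaries t1, t2, ...
    cache = {}                     # subexpression -> temporary (share=True only)

    def visit(node):
        if isinstance(node, str):          # a plain variable needs no code
            return node
        if share and node in cache:        # already computed: reuse the temp
            return cache[node]
        if node[0] == "neg":
            rhs = "-" + visit(node[1])
        else:
            op, left, right = node
            left_name = visit(left)
            right_name = visit(right)
            rhs = f"{left_name} {op} {right_name}"
        temp = f"t{next(counter)}"
        code.append(f"{temp} := {rhs}")
        if share:
            cache[node] = temp
        return temp

    result = visit(expr)
    code.append(f"{target} := {result}")
    return code


# a := b * -c + b * -c
product = ("*", "b", ("neg", "c"))
example = ("+", product, product)

for line in gen_tac("a", example):
    print(line)
print()
for line in gen_tac("a", example, share=True):
    print(line)
```

The first loop prints the six-statement version and the second prints the four-statement version. A real compiler would typically detect the shared subexpression in a separate optimization pass or while building the DAG; caching inside the generator is just the shortest way to show the effect.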
Some three-address statements that will be used later:

- Assignment statements
    with a binary operation: x := y op z
    with a unary operation: x := op y
    with no operation (copy): x := y
- Branch statements
    unconditional jump: goto L
    conditional jump: if x relop y goto L
- Statements for procedure calls
    param x      set a parameter for a procedure call
    call p, n    call procedure p with n parameters
    return y     return from a procedure with return value y
Example: instructions for the procedure call p(x1, x2, x3, …, xn):

    param x1
    param x2
    …
    param xn
    call p, n

- Indexed assignments: x := y[i] and x[i] := y
- Address and pointer assignments: x := &y and x := *y
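To make the meaning of these statement kinds concrete, here is a small interpreter for the assignment, branch, and indexed-assignment forms. It is a sketch under stated assumptions: the tuple encoding, the explicit "label" pseudo-statement, and the function name run are all hypothetical, and procedure-call and pointer statements are left out to keep it short.

```python
import operator

BINOPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
RELOPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
          ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def run(program, env):
    """Execute tuple-encoded three-address statements; return the final env.

    Hypothetical encoding (strings are variable names, ints are constants):
      ("label", L)              marks a jump target
      ("copy", x, y)            x := y
      ("binop", x, y, op, z)    x := y op z
      ("goto", L)               goto L
      ("if", x, relop, y, L)    if x relop y goto L
      ("load", x, y, i)         x := y[i]
      ("store", x, i, y)        x[i] := y
    """
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}

    def val(operand):
        return env[operand] if isinstance(operand, str) else operand

    pc = 0
    while pc < len(program):
        ins = program[pc]
        kind = ins[0]
        pc += 1                      # default: fall through to next statement
        if kind == "label":
            continue
        elif kind == "copy":
            env[ins[1]] = val(ins[2])
        elif kind == "binop":
            _, x, y, op, z = ins
            env[x] = BINOPS[op](val(y), val(z))
        elif kind == "goto":
            pc = labels[ins[1]]
        elif kind == "if":
            _, x, relop, y, target = ins
            if RELOPS[relop](val(x), val(y)):
                pc = labels[target]
        elif kind == "load":
            _, x, y, i = ins
            env[x] = env[y][val(i)]
        elif kind == "store":
            _, x, i, y = ins
            env[x][val(i)] = val(y)
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return env


# Sum the n elements of array a, then store the sum in a[0].
program = [
    ("copy", "s", 0),
    ("copy", "i", 0),
    ("label", "L1"),
    ("if", "i", ">=", "n", "L2"),
    ("load", "t", "a", "i"),
    ("binop", "s", "s", "+", "t"),
    ("binop", "i", "i", "+", 1),
    ("goto", "L1"),
    ("label", "L2"),
    ("store", "a", 0, "s"),
]
result = run(program, {"n": 3, "a": [5, 6, 7]})
print(result["s"], result["a"])
```

The example program shows how a source-level loop reduces to one conditional jump that exits the loop and one unconditional jump that repeats it.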