Compiler Chapter 9. Intermediate Languages Sung-Dong Kim Dept. of Computer Engineering, Hansung University
1. Introduction (1)
Compiler model
  Source Program -> Front-End -> IL -> Back-End -> Object Program
  Front-End: Source Program -> Lexical Analyzer -> (tokens) -> Syntax Analyzer -> (AST) -> Semantic Analyzer -> Intermediate Code Generator -> IL
  Back-End: IL -> Code Optimizer -> Target Code Generator -> Object Program
  Front-End - language-dependent part
  Back-End - machine-dependent part
(2011-1) Compiler
1. Introduction (2)
Intermediate language
  - Links the front end and the back end of the compiler
Advantages
  - Modularity
  - Portability
  - Bridges the semantic gap between the source language and the target machine
  - Allows machine-independent code optimization
  - Can be executed directly by an interpreter in an interpretive compiling system
1. Introduction (3)
Disadvantages
  - Longer compile time
  - Inefficient code (addressed by code optimization)
1. Introduction (4)
Types of IL
  - Polish notation: postfix representation, IR
  - 3-address code: Triple, Quadruple, Indirect Triple
  - Tree structure code: AST, TCOL, Diana
  - Virtual machine code: P-code, EM-code, U-code, Bytecode
2. Polish notation (1)
  - Developed by Łukasiewicz
  - Usage
      - Intermediate code for arithmetic expressions in Fortran compilers
      - Intermediate code for interpreters such as BASIC
2. Polish notation (2)
Postfix notation
  - Operands precede their operator
  - No parentheses needed
  - Conversion with an operator stack is easy and fast
  - Appropriate for interpreters
  - Not appropriate for optimization
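The operator-stack conversion mentioned above can be sketched as follows. This is a minimal illustration, not code from the course: the function name `to_postfix`, the `PRECEDENCE` table, and the choice of four left-associative binary operators are all assumptions made for the example.

```python
# Operator precedence for the binary operators handled in this sketch.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(tokens):
    """Convert a list of infix tokens into a list of postfix tokens."""
    output = []      # postfix result
    operators = []   # operator stack
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of higher or equal precedence (left-associative).
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                output.append(operators.pop())
            operators.append(tok)
        elif tok == "(":
            operators.append(tok)
        elif tok == ")":
            # Pop until the matching left parenthesis, then discard it.
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()
        else:
            output.append(tok)  # operand goes straight to the output
    while operators:
        output.append(operators.pop())
    return output


print(" ".join(to_postfix("b * ( c + d )".split())))  # prints: b c d + *
```

The parentheses of the infix form disappear in the output because the operand order and operator positions already encode the evaluation order.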
3. 3-address code (1)
  - Most widely used form; used in optimizing compilers
  - Structure: one operator and two operands
      A := B op C
  - Ex 4: a = b * (c + d)
      t1 := c + d
      t2 := b * t1
      a := t2
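One way to produce the instruction sequence of Ex 4 is a post-order walk over the expression tree that allocates a fresh temporary for each operator. The sketch below borrows Python's standard `ast` module as a stand-in parser; the function name `gen_tac`, the temporary naming scheme `t1, t2, ...`, and the restriction to simple assignments are assumptions for illustration only.

```python
import ast

# Mapping from Python AST operator classes to their printed symbols.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def gen_tac(source):
    """Translate one assignment statement into three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def visit(node):
        # Returns the name that holds the value of the expression `node`.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = visit(node.left)
            right = visit(node.right)
            temp = new_temp()
            code.append(f"{temp} := {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression in this sketch")

    stmt = ast.parse(source).body[0]
    if not isinstance(stmt, ast.Assign):
        raise ValueError("expected a single assignment statement")
    value = visit(stmt.value)
    code.append(f"{stmt.targets[0].id} := {value}")
    return code


for line in gen_tac("a = b * (c + d)"):
    print(line)
```

Each emitted line has at most one operator, which is what makes the form convenient for later analysis.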
3. 3-address code (2)
Implementation
  - Triple
      - (operator, operand1, operand2)
      - Not appropriate for optimization
  - Indirect triple
      - Adds a table that gives the execution order of the triples
  - Quadruple
      - (operator, operand1, operand2, result)
      - Appropriate for optimization
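The three representations can be compared on the statement a = b * (c + d). The sketch below is an illustration under stated assumptions: tuples stand in for table rows, an integer operand in a triple means "the result of the triple at that index", and the helper `to_triples` is a hypothetical name.

```python
# Quadruples for a = b * (c + d): (operator, operand1, operand2, result)
quadruples = [
    ("+", "c", "d", "t1"),
    ("*", "b", "t1", "t2"),
    (":=", "t2", None, "a"),
]


def to_triples(quads):
    """Drop the result field; refer to earlier results by triple index."""
    position = {}   # temporary name -> index of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = position.get(arg1, arg1)
        arg2 = position.get(arg2, arg2)
        if op == ":=":
            # Assignment keeps its target as an operand: (:=, target, value)
            triples.append((op, result, arg1))
        else:
            position[result] = len(triples)
            triples.append((op, arg1, arg2))
    return triples


triples = to_triples(quadruples)

# Indirect triples: a separate table lists the triples in execution order.
execution_order = [0, 1, 2]

for index in execution_order:
    print(index, triples[index])
```

Because triples refer to each other by position, moving one changes every reference to it. With the separate order table, an optimizer can reorder entries in `execution_order` and leave the triples themselves untouched; quadruples avoid the issue by naming results explicitly.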
3. 3-address code (3)
RTL (Register Transfer Language)
  - IL for the GNU C and C++ compilers
  - All operations use registers
  - Based on the list concept of LISP
Ex 7:
  (set (reg:SI 69) (mem:SI (plus:SI (reg:SI 65) (const_int -4))))
  (set (reg:SI 70) (mem:SI (plus:SI (reg:SI 65) (const_int -8))))
  (set (reg:SI 68) (plus:SI (reg:SI 69) (reg:SI 70)))
  (set (mem:SF (plus:SI (reg:SI 65) (const_int -12))) (float:SF (reg:SI 68)))
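Because RTL is written as LISP-style lists, an expression like those in Ex 7 maps naturally onto nested lists. The following sketch reads one such expression into Python lists. It is not GCC's own reader: `parse_sexpr` is a hypothetical helper that only handles the simple parenthesized form shown on the slide.

```python
def parse_sexpr(text):
    """Parse one parenthesized expression into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok            # atom such as "reg:SI" or "-4"
        items = []
        while tokens[pos] != ")":
            items.append(read())
        pos += 1                  # skip the closing ")"
        return items

    return read()


insn = parse_sexpr("(set (reg:SI 68) (plus:SI (reg:SI 69) (reg:SI 70)))")
print(insn[0])   # operation of the outer list
print(insn[1])   # destination
print(insn[2])   # source expression
```

Read this way, the third instruction of Ex 7 is a `set` whose destination is register 68 and whose source is the sum of registers 69 and 70.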
4. Tree structure code (1)
Advantages
  - Represents the program's meaning (semantics)
  - Easy to construct
  - Most appropriate for optimizing compilers
Parse tree
  - Syntax-directed
  - Contains much unnecessary information
4. Tree structure code (2)
Abstract syntax tree (AST)
  - Very effective
  - Distinguishes terminal and non-terminal nodes
  - Compiler-compiler project
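A small sketch of an AST with terminal and non-terminal nodes, using the statement a = b * (c + d) from the 3-address code section. The class names `Leaf` and `Node` and the function `postorder` are assumptions for illustration; a real compiler would carry more information per node (types, source positions, and so on).

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Terminal node: an identifier or constant."""
    name: str


@dataclass
class Node:
    """Non-terminal node: an operator with its operand subtrees."""
    op: str
    left: object
    right: object


def postorder(tree):
    """Visit operands before their operator, yielding postfix order."""
    if isinstance(tree, Leaf):
        return [tree.name]
    return postorder(tree.left) + postorder(tree.right) + [tree.op]


# AST for a = b * (c + d); the parentheses are implied by the tree shape.
tree = Node("=", Leaf("a"), Node("*", Leaf("b"), Node("+", Leaf("c"), Leaf("d"))))
print(" ".join(postorder(tree)))  # prints: a b c d + * =
```

Unlike a parse tree, this tree has no nodes for parentheses or for intermediate grammar symbols; only operators and operands remain.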
5. Virtual machine code (1)
Virtual machine code
  - Used for portable compilers
  - The virtual machine's instruction set connects the front end and the back end
  - Abstract machine
  - No general-purpose registers: a stack machine
5. Virtual machine code (1)
P-code
  - Intermediate output of the Pascal-P compiler
  - P machine: a stack machine
  - 4 registers: PC, SP, MP, NP
  - Memory: CODE and STORE (stack, heap, constant area)
  [Figure: P machine memory layout. CODE with PC; STORE with MP, SP, NP]
5. Virtual machine code (2)
U-code
  - Intermediate code for the portable Pascal compiler developed at Stanford
  - Based on a virtual stack machine
  - All operations are performed on the stack
5. Virtual machine code (3)
Types of operations
  - Unary instructions: notop, neg
  - Binary instructions: add, sub, …
  - Stack instructions: lod, str, ldc, ldr
  - Control instructions: ujp, tjp, fjp
  - Range check instructions: chkh, chkl
  - Indirect-address instructions: ixa, sta
  - Procedure instructions: cal, ret, …
  - Other instructions: bgn, sym
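To show how a stack machine executes such instructions, here is a toy interpreter that borrows a few of the mnemonics above. It is a sketch under loud assumptions, not the U-code specification: operands are simplified to variable names and instruction indices, the semantics of each opcode are guessed from its name, and only lod, ldc, str, neg, add, sub, ujp, tjp, and fjp are handled.

```python
def run(program, memory):
    """Execute a list of (opcode, operand...) tuples on an operand stack."""
    stack = []
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "ldc":                      # push a constant
            stack.append(args[0])
        elif op == "lod":                    # push a variable's value
            stack.append(memory[args[0]])
        elif op == "str":                    # pop into a variable
            memory[args[0]] = stack.pop()
        elif op == "neg":                    # unary minus on the top of stack
            stack.append(-stack.pop())
        elif op in ("add", "sub"):           # binary ops pop two, push one
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "add" else left - right)
        elif op == "ujp":                    # unconditional jump
            pc = args[0]
        elif op == "tjp":                    # jump if popped value is true
            if stack.pop():
                pc = args[0]
        elif op == "fjp":                    # jump if popped value is false
            if not stack.pop():
                pc = args[0]
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return memory


# a = b + c - d
program = [
    ("lod", "b"), ("lod", "c"), ("add",),
    ("lod", "d"), ("sub",),
    ("str", "a"),
]
result = run(program, {"b": 5, "c": 3, "d": 2})
print(result["a"])  # prints: 6
```

No instruction names a register: every operator finds its operands on the stack and leaves its result there, which is why the instruction sequence is the postfix form of the expression.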
5. Virtual machine code (4)
Bytecode
  - Intermediate language for Java
  - Instructions for the Java Virtual Machine (JVM)
  - Portability
  - Executed by an interpreter or a JIT (Just-In-Time) compiler
5. Virtual machine code (5)
Features
  - Small and simple: can be transferred over the network
  - Instructions for arrays, classes, exceptions, threads, …
  - Instructions per data type (iadd, fadd, …)
  - Instructions with composite functions (pop2)
6. Selection of IL (1)
Recent approach: several ILs
ILS
  - Obtained automatically from the source
  - Source-language dependent and high level
Source -> Front-End -> ILS -> ILS-ILT -> ILT -> Back-End -> Target
6. Selection of IL (2)
ILT
  - Easy to translate into target machine code
  - Target-machine dependent and low level
ILS to ILT
  - Translates from ILS into ILT