7/15/2015\course\cpeg621-10F\Topic-1a.ppt1 Intermediate Code Generation Reading List: Aho-Sethi-Ullman: Chapter 2.3 Chapter 6.1 ~ 6.2 Chapter 6.3 ~ 6.10.

Presentation transcript:

Slide 1: Intermediate Code Generation
Reading List: Aho-Sethi-Ullman:
- Chapter 2.3
- Chapter 6.1 ~ 6.2
- Chapter 6.3 ~ 6.10 (Note: Glance through it only for intuitive understanding.)

Slide 2: Summary of Front End
[Diagram] Front End: Lexical Analyzer (Scanner) + Syntax Analyzer (Parser) + Semantic Analyzer -> Abstract Syntax Tree w/ Attributes -> Intermediate-code Generator -> Non-optimized Intermediate Code. Side output: Error Message.

Slide 3: Component-Based Approach to Building Compilers
[Diagram] Source program in Language-1 -> Language-1 Front End; Source program in Language-2 -> Language-2 Front End. Front ends -> Non-optimized Intermediate Code -> Intermediate-code Optimizer -> Optimized Intermediate Code. Target-1 Code Generator -> Target-1 machine code; Target-2 Code Generator -> Target-2 machine code.

Slide 4: Intermediate Representation (IR)
A kind of abstract machine language that can express the target machine operations without committing to too many machine details.
- Why IR?

Slide 5: Without IR
[Diagram] Source languages: C, Pascal, FORTRAN, C++. Target machines: SPARC, HP PA, x86, IBM PPC.

Slide 6: With IR
[Diagram] C, Pascal, FORTRAN, C++ -> IR -> SPARC, HP PA, x86, IBM PPC

Slide 7: With IR
[Diagram] C, Pascal, FORTRAN, C++ -> IR -> Common Backend?

Slide 8: Advantages of Using an Intermediate Language
1. Retargeting - build a compiler for a new machine by attaching a new code generator to an existing front end.
2. Optimization - reuse intermediate-code optimizers in compilers for different languages and different machines.
Note: the terms “intermediate code”, “intermediate language”, and “intermediate representation” are all used interchangeably.
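A rough way to quantify the retargeting advantage, using the four languages and four machines named on the "Without IR" and "With IR" slides. The sketch assumes that, without a shared IR, every language/machine pair needs its own dedicated translator; the function names are illustrative and not part of the original deck.

```python
LANGUAGES = ["C", "Pascal", "FORTRAN", "C++"]
TARGETS = ["SPARC", "HP PA", "x86", "IBM PPC"]


def translators_without_ir(languages, targets):
    """Assumption: one dedicated compiler per (language, target) pair."""
    return [(lang, tgt) for lang in languages for tgt in targets]


def components_with_ir(languages, targets):
    """One front end per language plus one code generator per target."""
    front_ends = [(lang, "IR") for lang in languages]
    back_ends = [("IR", tgt) for tgt in targets]
    return front_ends + back_ends


print(len(translators_without_ir(LANGUAGES, TARGETS)))  # 16
print(len(components_with_ir(LANGUAGES, TARGETS)))      # 8
```

Adding a fifth machine then costs one new code generator rather than four new compilers.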

Slide 9: Issues in Designing an IR
- Whether to use an existing IR
  - if the target machine architecture is similar
  - if the new language is similar
- Whether the IR is appropriate for the kind of optimizations to be performed
  - e.g. speculation and predication
  - some transformations may take much longer than they would on a different IR

Slide 10: Issues in Designing an IR
- Designing a new IR requires considering
  - Level (how machine dependent it is)
  - Structure
  - Expressiveness
  - Appropriateness for general and special optimizations
  - Appropriateness for code generation
  - Whether multiple IRs should be used

Slide 11: Multiple-Level IR
[Diagram] Source Program -> High-level IR -> Low-level IR -> Target code. Stage labels: Semantic Check, High-level Optimization, Low-level Optimization, …

Slide 12: Using Multiple-level IR
Translating from one level to another in the compilation process
- Preserving an existing technology investment
- Some representations may be more appropriate for a particular task.

Slide 13: Commonly Used IR
- Possible IR forms
  - Graphical representations, such as syntax trees, ASTs (Abstract Syntax Trees), DAGs
  - Postfix notation
  - Three-address code
  - SSA (Static Single Assignment) form
- IR should have individual components that describe simple things

Slide 14: DAG Representation
A variant of the syntax tree. DAG: Directed Acyclic Graph.
Example: D = ((A+B*C) + (A*B*C)) / -C
[Diagram: DAG for the example, with operator nodes /, +, +, *, *, unary -, and leaves A, B, C]

Slide 15: Postfix Notation (PN)
A mathematical notation wherein every operator follows all of its operands.
Examples:
- The PN of the expression 9 * (5 + 2) is 9 5 2 + *
- How about (a+b)/(c-d)? a b + c d - /

Slide 16: Postfix Notation (PN) - Cont'd
Form Rules:
1. If E is a variable/constant, the PN of E is E itself.
2. If E is an expression of the form E1 op E2, the PN of E is E1' E2' op (E1' and E2' are the PN of E1 and E2, respectively).
3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1.
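The three form rules map directly onto a recursive function. The sketch below is only an illustration: it borrows Python's own `ast` parser to read the infix expression (so parentheses disappear during parsing, which is rule 3), and the helper names `postfix` and `to_postfix` are made up for this example.

```python
import ast

# Operator symbols for the binary operators handled by this sketch.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def postfix(node):
    """Return the postfix token list for a parsed expression node."""
    if isinstance(node, ast.Name):          # rule 1: a variable
        return [node.id]
    if isinstance(node, ast.Constant):      # rule 1: a constant
        return [str(node.value)]
    if isinstance(node, ast.BinOp):         # rule 2: E1 op E2 -> E1' E2' op
        return postfix(node.left) + postfix(node.right) + [OPS[type(node.op)]]
    raise ValueError("unsupported expression")


def to_postfix(source):
    """Parse an infix expression string and return its postfix form."""
    tree = ast.parse(source, mode="eval").body
    return " ".join(postfix(tree))


print(to_postfix("9*(5+2)"))       # 9 5 2 + *
print(to_postfix("(a+b)/(c-d)"))   # a b + c d - /
```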

Slide 17: Three-Address Statements
A popular form of intermediate code used in optimizing compilers is three-address statements.
Source statement: x = a + b * c + d
Three-address statements with temporaries t1 and t2:
t1 = b * c
t2 = a + t1
x = t2 + d

Slide 18: Three Address Code
The general form: x := y op z
- x, y, and z are names, constants, or compiler-generated temporaries
- op stands for any operator such as +, -, ...
x*5-y might be translated as
t1 := x * 5
t2 := t1 - y

Slide 19: Syntax-Directed Translation Into Three-Address Code
Temporaries: In general, when generating three-address statements, the compiler has to create new temporary variables (temporaries) as needed. We use a function newtemp( ) that returns a new temporary each time it is called.
Recall Topic-2 when talking about this topic.

Slide 20: Syntax-Directed Translation Into Three-Address Code
The syntax-directed definition for E in a production id := E has two attributes:
1. E.place - the location (variable name or offset) that holds the value corresponding to the nonterminal
2. E.code - the sequence of three-address statements representing the code for the nonterminal

Slide 21: Example Syntax-Directed Definition
term ::= ID
  { term.place := ID.place; term.code := "" }
term1 ::= term2 * ID
  { term1.place := newtemp( ); term1.code := term2.code || ID.code || gen(term1.place ':=' term2.place '*' ID.place) }
expr ::= term
  { expr.place := term.place; expr.code := term.code }
expr1 ::= expr2 + term
  { expr1.place := newtemp( ); expr1.code := expr2.code || term.code || gen(expr1.place ':=' expr2.place '+' term.place) }
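A minimal sketch of how the place and code attributes and newtemp( ) could be computed in practice. It is not the deck's implementation: it walks an expression tree produced by Python's `ast` parser instead of attaching actions to grammar productions, and `make_newtemp`, `translate`, and `three_address` are hypothetical names. Each call returns the pair (place, code), with code held as a list so that `||` becomes list concatenation.

```python
import ast
import itertools

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def make_newtemp():
    """Return a newtemp() function that yields t1, t2, ... on successive calls."""
    counter = itertools.count(1)
    return lambda: f"t{next(counter)}"


def translate(node, newtemp):
    """Return (place, code) for an expression node."""
    if isinstance(node, ast.Name):        # like term ::= ID, no code needed
        return node.id, []
    if isinstance(node, ast.Constant):
        return str(node.value), []
    if isinstance(node, ast.BinOp):
        left_place, left_code = translate(node.left, newtemp)
        right_place, right_code = translate(node.right, newtemp)
        place = newtemp()
        stmt = f"{place} := {left_place} {OPS[type(node.op)]} {right_place}"
        return place, left_code + right_code + [stmt]   # code1 || code2 || gen(...)
    raise ValueError("unsupported expression")


def three_address(source):
    """Translate an infix expression string into three-address statements."""
    return translate(ast.parse(source, mode="eval").body, make_newtemp())


place, code = three_address("x*5-y")
print(code)   # ['t1 := x * 5', 't2 := t1 - y']
```

For `a + b*c + d` the same function yields `t1 := b * c`, `t2 := a + t1`, `t3 := t2 + d`; the final value sits in one extra temporary because this sketch does not handle the enclosing assignment.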

Slide 22: Syntax tree vs. Three address code
Expression: (A+B*C) + (-B*A) - B
[Diagram: syntax tree for the expression]
T1 := B * C
T2 := A + T1
T3 := - B
T4 := T3 * A
T5 := T2 + T4
T6 := T5 - B
Three address code is a linearized representation of a syntax tree (or a DAG) in which explicit names (temporaries) correspond to the interior nodes of the graph.

Slide 23: DAG vs. Three address code
Expression: D = ((A+B*C) + (A*B*C)) / -C
[Diagram: DAG for the expression]
Code sequence 1:
T1 := A
T2 := C
T3 := B * T2
T4 := T1 + T3
T5 := T1 * T3
T6 := T4 + T5
T7 := - T2
T8 := T6 / T7
D := T8
Code sequence 2:
T1 := B * C
T2 := A + T1
T3 := A * T1
T4 := T2 + T3
T5 := - C
T6 := T4 / T5
D := T6
Question: Which IR code sequence is better?
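One way to obtain the shorter sequence is to build the DAG while translating: before creating a node, look it up by (operator, operand places) and reuse the existing temporary if one is found. The sketch below is an illustration under assumptions, not the deck's algorithm. It uses Python's `ast` parser, the name `dag_code` is hypothetical, and the product is written as A*(B*C) so that B*C appears as a subtree (Python would otherwise group A*B*C as (A*B)*C).

```python
import ast

BIN = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def dag_code(source):
    """Return (place, code), emitting each distinct subexpression only once."""
    code = []
    nodes = {}   # (op, operand places...) -> temporary that holds the value

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            key = (BIN[type(node.op)], visit(node.left), visit(node.right))
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            key = ("-", visit(node.operand))
        else:
            raise ValueError("unsupported expression")
        if key not in nodes:               # first time this node is seen
            temp = f"T{len(nodes) + 1}"
            nodes[key] = temp
            if len(key) == 3:
                code.append(f"{temp} := {key[1]} {key[0]} {key[2]}")
            else:
                code.append(f"{temp} := {key[0]} {key[1]}")
        return nodes[key]                  # shared nodes reuse their temporary

    place = visit(ast.parse(source, mode="eval").body)
    return place, code


place, code = dag_code("((A + B*C) + (A*(B*C))) / -C")
for stmt in code:
    print(stmt)
print(f"D := {place}")
```

Because B*C is looked up rather than recomputed, the output has six temporaries, the same shape as the second code sequence on the slide.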

Slide 24: Implementation of Three Address Code
- Quadruples
  - Four fields: op, arg1, arg2, result
    - Array of struct {op, *arg1, *arg2, *result}
  - x := y op z is represented as op y, z, x
  - arg1, arg2, and result are usually pointers to symbol table entries.
  - May need to use many temporary names.
  - Many assembly instructions are like quadruples, but arg1, arg2, and result are real registers.
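A small sketch of the quadruple layout, assuming plain strings stand in for the symbol-table pointers and a dictionary stands in for the symbol table. The `Quad` type and `run_quads` interpreter are illustrative names, and only `+`, `*`, and a copy operation called `assign` are supported.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]   # None for unary/copy operations
    result: str


# d = a + (b*c), with explicit temporaries t1 and t2
QUADS = [
    Quad("*", "b", "c", "t1"),
    Quad("+", "a", "t1", "t2"),
    Quad("assign", "t2", None, "d"),
]


def run_quads(quads, env):
    """Interpret a list of quadruples over a copy of the environment."""
    env = dict(env)
    for q in quads:
        if q.op == "assign":
            env[q.result] = env[q.arg1]
        elif q.op == "+":
            env[q.result] = env[q.arg1] + env[q.arg2]
        elif q.op == "*":
            env[q.result] = env[q.arg1] * env[q.arg2]
        else:
            raise ValueError(f"unsupported op: {q.op}")
    return env


print(run_quads(QUADS, {"a": 1, "b": 2, "c": 3})["d"])  # 7
```

Since every quadruple names its result explicitly, statements can be moved or deleted without touching the others, at the cost of the extra temporaries.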

Slide 25: Implementation of Three Address Code (Cont'd)
- Triples
  - Three fields: op, arg1, and arg2. The result becomes implicit.
  - arg1 and arg2 are either pointers to the symbol table or indexes/pointers into the triple structure.
  - Example: d = a + (b*c)
      1  *       b, c
      2  +       a, (1)
      3  assign  d, (2)
  - No explicit temporary names are used.
  - More than one entry is needed for ternary operations such as x := y[i], a = b + c, x[i] = y, etc.
  - Problem in reordering the code?
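The triple form of the same statement can be sketched as follows. This is an assumption-laden illustration: names are plain strings, a reference to an earlier triple is an integer index, and the indexes are 0-based (Python style) rather than the 1-based numbering on the slide. `Triple` and `run_triples` are hypothetical names.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    op: str
    arg1: Union[str, int]   # a name, or the index of an earlier triple
    arg2: Union[str, int]


# d = a + (b*c)
TRIPLES = [
    Triple("*", "b", "c"),       # (0)
    Triple("+", "a", 0),         # (1) uses the result of triple (0)
    Triple("assign", "d", 1),    # (2) stores the result of triple (1) in d
]


def run_triples(triples, env):
    """Interpret triples; results[i] holds the implicit result of triple i."""
    env = dict(env)
    results = []

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for t in triples:
        if t.op == "assign":
            env[t.arg1] = value(t.arg2)
            results.append(None)
        elif t.op == "+":
            results.append(value(t.arg1) + value(t.arg2))
        elif t.op == "*":
            results.append(value(t.arg1) * value(t.arg2))
        else:
            raise ValueError(f"unsupported op: {t.op}")
    return env


print(run_triples(TRIPLES, {"a": 1, "b": 2, "c": 3})["d"])  # 7
```

The reordering problem is visible here: operands refer to positions, so moving or deleting a triple means every index that points at or past it has to be renumbered.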

Slide 26: IR Example in Open64 - WHIRL
Open64 uses a tree-based intermediate representation called WHIRL, which stands for Winning Hierarchical Intermediate Representation Language.

Slide 27: WHIRL
- Abstract syntax tree based
- Symbol table links, map annotations
- Base representation is simple and efficient
- Used through several phases with lowering
- Designed for multiple target architectures

Slide 28: From WHIRL to CGIR - An Example
(c) WHIRL
[Diagram: tree with nodes ST aa, LD, +, a, *, 4, i]
(d) CGIR
T1 = sp + &a;
T2 = ld T1
T3 = sp + &i;
T4 = ld T3
T6 = T4 << 2
T7 = T6
T8 = T2 + T7
T9 = ld T8
T10 = sp + &aa
:= st T10 T9

Slide 29: From WHIRL to CGIR - An Example
(a) Source
int *a;
int i;
int aa;
aa = a[i];
(b) WHIRL
U4U4LDID 0 T
U4INTCONST 4 (0x4)
U4MPY
U4ADD
I4I4ILOAD 0 T T
I4STID 0 T

Slide 30
WHIRL:
U4U4LDID 0 T
U4INTCONST 4 (0x4)
U4MPY
U4ADD
I4I4ILOAD 0 T T
I4STID 0 T
GCC RTL:
(insn (set (reg:SI 61 [ i.0 ]) (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -8 [0xfffffffffffffff8])) [0 i+0 S4 A32])) -1 (nil) (nil))
(insn (parallel [ (set (reg:SI 60 [ D.1282 ]) (ashift:SI (reg:SI 61 [ i.0 ]) (const_int 2 [0x2]))) (clobber (reg:CC 17 flags)) ]) -1 (nil) (nil))
(insn (set (reg:SI 59 [ D.1283 ]) (reg:SI 60 [ D.1282 ])) -1 (nil) (nil))
(insn (parallel [ (set (reg:SI 58 [ D.1284 ]) (plus:SI (reg:SI 59 [ D.1283 ]) (mem/f/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -12 [0xfffffffffffffff4])) [0 a+0 S4 A32]))) (clobber (reg:CC 17 flags)) ]) -1 (nil) (nil))
(insn (set (reg:SI 62) (mem:SI (reg:SI 58 [ D.1284 ]) [0 S4 A32])) -1 (nil) (nil))
(insn (set (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -4 [0xfffffffffffffffc])) [0 aa+0 S4 A32]) (reg:SI 62)) -1 (nil) (nil))

Slide 31: Differences
- GCC RTL describes more details than WHIRL.
- GCC RTL already assigns variables to the stack.
- WHIRL, in contrast, needs separate symbol tables to describe the properties of each variable. Separating the IR from the symbol tables makes WHIRL simpler.
- WHIRL represents program constructs at multiple levels, so it has more opportunities for optimization.

Slide 32: Summary of Front End
[Diagram] Front End: Lexical Analyzer (Scanner) + Syntax Analyzer (Parser) + Semantic Analyzer -> Abstract Syntax Tree w/ Attributes -> Intermediate-code Generator -> Non-optimized Intermediate Code. Side output: Error Message.

Slide 33: The Phases of a Compiler
Source: position := initial + rate * 60
lexical analyzer
  id1 := id2 + id3 * 60
syntax analyzer
  [Tree] := with children id1 and +; + with children id2 and *; * with children id3 and 60
semantic analyzer
  [Tree] := with children id1 and +; + with children id2 and *; * with children id3 and inttoreal; inttoreal with child 60
intermediate code generator
  temp1 := inttoreal(60)
  temp2 := id3 * temp1
  temp3 := id2 + temp2
  id1 := temp3
code optimizer
  temp1 := id3 * 60.0
  id1 := id2 + temp1
code generator
  MOVF id3, R2
  MULF #60.0, R2
  MOVF id2, R1
  ADDF R2, R1
  MOVF R1, id1
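The code optimizer step can be imitated with two tiny passes over the intermediate code. This is a sketch under stated assumptions, not the optimizer the deck describes: instructions are (dest, op, args) tuples, the pass names are made up, and the second pass assumes a temporary defined immediately before the final copy is not needed afterwards.

```python
# Intermediate code from the slide, as (dest, op, args) tuples.
CODE = [
    ("temp1", "inttoreal", ("60",)),
    ("temp2", "*", ("id3", "temp1")),
    ("temp3", "+", ("id2", "temp2")),
    ("id1", "copy", ("temp3",)),
]


def fold_inttoreal(code):
    """Evaluate inttoreal on integer literals at compile time."""
    constants = {}   # temporary -> real literal that replaces it
    out = []
    for dest, op, args in code:
        args = tuple(constants.get(a, a) for a in args)
        if op == "inttoreal" and args[0].isdigit():
            constants[dest] = f"{int(args[0])}.0"
            continue   # the conversion instruction itself disappears
        out.append((dest, op, args))
    return out


def merge_final_copy(code):
    """Rewrite 't := y op z; x := t' as 'x := y op z' when the copy is last."""
    if len(code) >= 2:
        (t, op, args), (x, copy_op, copy_args) = code[-2], code[-1]
        if copy_op == "copy" and copy_args == (t,):
            return code[:-2] + [(x, op, args)]
    return code


def optimize(code):
    return merge_final_copy(fold_inttoreal(code))


for dest, op, args in optimize(CODE):
    print(f"{dest} := {args[0]} {op} {args[1]}")
```

The result matches the slide's optimized code up to the name of the remaining temporary (temp2 here, temp1 on the slide).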

Slide 34: Summary
1. Why IR
2. Commonly used IR
3. IRs of Open64 and GCC