241-437 Compilers: IC/10 Compiler Structures. Objectives: describe intermediate code generation; explain a stack-based intermediate code for the expression language.

Presentation transcript:

Compilers: IC/10 1 Compiler Structures
Intermediate Code Generation (Semester 1)
Objectives:
- describe intermediate code generation
- explain a stack-based intermediate code for the expression language

Compilers: IC/10 2 Overview
1. Intermediate Code (IC) Generation
2. IC Examples
3. Expression Translation in SPIM
4. The Expressions Language

Compilers: IC/10 3 In this lecture
[Figure: compiler pipeline. Source Program -> Front End (Lexical Analyzer, Syntax Analyzer, Semantic Analyzer) -> Int. Code Generator -> Intermediate Code -> Back End (Code Optimizer, Target Code Generator) -> Target Lang. Prog.]

Compilers: IC/10 4 1. Intermediate Code (IC) Generation
- Helps with retargeting: e.g. a back end for a new machine can easily be attached to an existing front end.
- Enables machine-independent code optimization.
[Figure: Front end -> Intermediate code -> Back end -> Target machine code]

Compilers: IC/10 5 Graphical IC Representations
- Abstract Syntax Trees (AST): retain the basic parse tree structure, but with unneeded nodes removed
- Directed Acyclic Graphs (DAG): a compacted AST that avoids duplication, so it needs less memory
- Control Flow Graphs (CFG): used to model control flow

Compilers: IC/10 6 Linear (text-based) ICs
- Stack-based (postfix), e.g. the JVM
- Three-address code: x := y op z
- Two-address code: x := op y (the same as x := x op y)

Compilers: IC/10 7 2. IC Examples
- ASTs and DAGs
- Stack-based (postfix)
- Three-address Code
- SPIM

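Compilers: IC/10 8 ASTs and DAGs
Example: a := b * -c + b * -c
[Figure: the AST for the assignment, in which the subtree b * -c appears twice, beside the DAG, in which both occurrences share one b * -c subgraph.]
Pros: easy restructuring of code and/or expressions for intermediate code optimization
Cons: memory intensive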
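The node sharing that turns an AST into a DAG can be sketched in C. This is only an illustration, not code from the lecture: the `Dag` and `DagNode` types and the `dagNode()` function are hypothetical names, and a linear search stands in for the hash table a production compiler would more likely use.

```c
#include <string.h>

#define MAX_NODES 64

/* Hypothetical DAG node: op is '+', '*', 'u' (unary minus) or 0 for a leaf.
   Children are indices into the node array; -1 means "no child". */
typedef struct {
    char op;
    char name[8];   /* identifier for a leaf, "" otherwise */
    int left;
    int right;
} DagNode;

typedef struct {
    DagNode nodes[MAX_NODES];
    int count;
} Dag;

/* Return the index of an existing identical node, or append a new one.
   Reusing identical nodes is what makes the structure a DAG. */
static int dagNode(Dag *d, char op, const char *name, int left, int right)
{
    for (int i = 0; i < d->count; i++) {
        const DagNode *n = &d->nodes[i];
        if (n->op == op && n->left == left && n->right == right &&
            strcmp(n->name, name) == 0)
            return i;               /* share the existing node */
    }
    if (d->count == MAX_NODES)
        return -1;                  /* table full */
    DagNode *n = &d->nodes[d->count];
    n->op = op;
    n->left = left;
    n->right = right;
    strncpy(n->name, name, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    return d->count++;
}
```

Building b * -c twice through `dagNode()` returns the same index both times, so the right-hand side of a := b * -c + b * -c needs 5 nodes rather than the 9 of the AST.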

Compilers: IC/10 9 Stack-based (postfix)
Example: a := b * -c + b * -c
Postfix form: b c uminus * b c uminus * + a assign
JVM stack instructions:
  iload 2    // push b
  iload 3    // push c
  ineg       // uminus
  imul       // *
  iload 2    // push b
  iload 3    // push c
  ineg       // uminus
  imul       // *
  iadd       // +
  istore 1   // store a
Postfix notation represents operations on a stack.
Pro: easy to generate
Con: stack operations are more difficult to optimize
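Why postfix is "easy to generate" can be seen in a small C sketch: a post-order walk of the expression tree emits the operands first and the operator last. The `Node` type and `emitPostfix()` are hypothetical names chosen for this illustration, and only the right-hand side of the assignment is handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node: label is the text to emit. */
typedef struct Node {
    const char *label;          /* "b", "uminus", "*", "+", ... */
    const struct Node *left;    /* NULL for a leaf */
    const struct Node *right;   /* NULL for a leaf or a unary operator */
} Node;

/* Post-order walk: left operand, right operand, then the operator.
   Tokens are appended to out (capacity cap), separated by spaces. */
static void emitPostfix(const Node *n, char *out, size_t cap)
{
    if (n == NULL)
        return;
    emitPostfix(n->left, out, cap);
    emitPostfix(n->right, out, cap);
    size_t len = strlen(out);
    snprintf(out + len, cap - len, "%s%s", len > 0 ? " " : "", n->label);
}
```

Called on the tree for b * -c + b * -c with an empty buffer, this produces the operand and operator order shown on the slide, up to the final `+`.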

Compilers: IC/10 10 Three-Address Code
Example: a := b * -c + b * -c
Translated from the AST:
  t1 := - c
  t2 := b * t1
  t3 := - c
  t4 := b * t3
  t5 := t2 + t4
  a := t5
Translated from the DAG:
  t1 := - c
  t2 := b * t1
  t5 := t2 + t2
  a := t5
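The AST translation above can be sketched in C as a recursive walk that allocates a fresh temporary for every operator node. This is a hedged illustration only: `Node`, `Tac`, `genTac()` and `PLACE_LEN` are hypothetical names, not part of the lecture's code, and the final `a := t5` copy is left to the caller.

```c
#include <stdio.h>
#include <string.h>

#define PLACE_LEN 16   /* assumed maximum length of a variable or temp name */

/* Hypothetical expression node: op is '+', '*', '-' (unary minus when
   right == NULL), or 0 for a leaf that carries a variable name. */
typedef struct Node {
    char op;
    const char *name;
    const struct Node *left;
    const struct Node *right;
} Node;

typedef struct {
    char text[512];   /* generated code, one instruction per line */
    int nextTemp;     /* number of the next unused temporary */
} Tac;

/* Generate code for n; on return, place names where its value lives. */
static void genTac(const Node *n, Tac *tac, char *place)
{
    if (n->op == 0) {                       /* leaf: the variable itself */
        snprintf(place, PLACE_LEN, "%s", n->name);
        return;
    }
    char l[PLACE_LEN], r[PLACE_LEN] = "";
    genTac(n->left, tac, l);
    if (n->right != NULL)
        genTac(n->right, tac, r);
    snprintf(place, PLACE_LEN, "t%d", tac->nextTemp++);
    size_t len = strlen(tac->text);
    if (n->right == NULL)                   /* unary operator */
        snprintf(tac->text + len, sizeof tac->text - len,
                 "%s := %c %s\n", place, n->op, l);
    else                                    /* binary operator */
        snprintf(tac->text + len, sizeof tac->text - len,
                 "%s := %s %c %s\n", place, l, n->op, r);
}
```

Because the walk has no memory of subtrees it has already translated, b * -c is computed twice, which is exactly the difference between the AST and DAG listings.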

Compilers: IC/10 11 SPIM
Three-address code for a simulator that runs MIPS32 assembly language programs.
Loading/Storing:
- lw register,var : loads the value of var into register
- sw register,var : stores the value from register into var
- many, many others
continued

Compilers: IC/10 12 SPIM (continued)
Registers: $t0 - $t7
Binary math ops (reg1 = reg2 op reg3):
- add reg1,reg2,reg3
- sub reg1,reg2,reg3
- mul reg1,reg2,reg3
- div reg1,reg2,reg3
Unary minus (reg1 = - reg2):
- neg reg1,reg2

Compilers: IC/10 13 "a := b * -c + b * -c" in SPIM
[Figure: the AST, with each node labelled by the register ($t0, $t1 or $t2) that holds its value.]
Code generated from the AST:
  lw  $t0,c
  neg $t1,$t0
  lw  $t0,b
  mul $t2,$t1,$t0
  lw  $t0,c
  neg $t1,$t0
  lw  $t0,b
  mul $t1,$t1,$t0
  add $t1,$t2,$t1
  sw  $t1,a

Compilers: IC/10 14 a := b * -c + b * -c
[Figure: the DAG, with each node labelled by the register ($t0, $t1 or $t2) that holds its value.]
Code generated from the DAG:
  lw  $t0,c
  neg $t1,$t0
  lw  $t0,b
  mul $t1,$t1,$t0
  add $t2,$t1,$t1
  sw  $t2,a

Compilers: IC/10 15 3. Expression Translation in SPIM
Grammar:
  S => id := E
  E => E + E
  E => id
Input: a := b + c + d + e
[Figure: parse tree for the input, with the first leaf annotated with temporary 1.]
Generate:
  lw $t1,b
As we parse, use attributes to pass information about the temporary variables up the tree.
parse tree --> code using bottom-up evaluation

Compilers: IC/10 16 (parse tree for a := b + c + d + e, continued)
Generate:
  lw $t1,b
  lw $t2,c
Each number in the tree corresponds to a temporary variable.

Compilers: IC/10 17
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
Each number in the tree corresponds to a temporary variable.

Compilers: IC/10 18
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
  lw $t4,d

Compilers: IC/10 19
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
  lw $t4,d
  add $t5,$t3,$t4

Compilers: IC/10 20
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
  lw $t4,d
  add $t5,$t3,$t4
  lw $t6,e

Compilers: IC/10 21
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
  lw $t4,d
  add $t5,$t3,$t4
  lw $t6,e
  add $t7,$t5,$t6

Compilers: IC/10 22
Generate:
  lw $t1,b
  lw $t2,c
  add $t3,$t1,$t2
  lw $t4,d
  add $t5,$t3,$t4
  lw $t6,e
  add $t7,$t5,$t6
  sw $t7,a
Pro: easy to rearrange code for global optimization
Con: lots of temporaries
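The bottom-up scheme traced on these slides can be sketched in C: each leaf loads its variable into a new temporary, each + node adds its children's temporaries into another new one, and the temporary number is passed up the tree as the function's return value. The names `Node`, `Gen`, `emit()`, `genExpr()` and `genAssign()` are hypothetical, and the sketch assumes the tree is grouped as ((b + c) + d) + e and that temporaries are never reused.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parse tree node for E => E + E | id. */
typedef struct Node {
    const char *name;           /* identifier for a leaf, NULL for '+' */
    const struct Node *left;
    const struct Node *right;
} Node;

typedef struct {
    char text[512];   /* generated SPIM code, one instruction per line */
    int nextTemp;     /* number of the next unused $t register */
} Gen;

/* Append one formatted instruction to the generated code. */
static void emit(Gen *g, const char *fmt, ...)
{
    size_t len = strlen(g->text);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(g->text + len, sizeof g->text - len, fmt, ap);
    va_end(ap);
}

/* Returns the number of the temporary holding the value of n. */
static int genExpr(const Node *n, Gen *g)
{
    if (n->left == NULL) {                  /* E => id */
        int t = g->nextTemp++;
        emit(g, "lw $t%d,%s\n", t, n->name);
        return t;
    }
    int l = genExpr(n->left, g);            /* E => E + E */
    int r = genExpr(n->right, g);
    int t = g->nextTemp++;
    emit(g, "add $t%d,$t%d,$t%d\n", t, l, r);
    return t;
}

/* S => id := E */
static void genAssign(const char *target, const Node *e, Gen *g)
{
    int t = genExpr(e, g);
    emit(g, "sw $t%d,%s\n", t, target);
}
```

With `nextTemp` starting at 1, this reproduces the seven temporaries of the slides, which also illustrates the "lots of temporaries" drawback.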

Compilers: IC/10 23 Issues when Processing Expressions
- Type checking/conversion.
- Address calculation for more complex types (arrays, records, etc.).
- Expressions in control structures, such as loops and if tests.

Compilers: IC/10 24 4. The Expressions Language
- exprParse3.c builds a parse tree for the input file (reuses code from exprParse2.c).
- An intermediate code is generated from the parse tree, and saved to an output file.
- The input file is not executed by exprParse3.c; that is done by a separate emulator.

Compilers: IC/10 25 Usage
> gcc -Wall -o exprParse3 exprParse3.c
> ./exprParse3 < test1.txt
> cat codeGen.txt
PUSH 2
STORE x
WRITE
PUSH 3
LOAD x
ADD
STORE y
WRITE
STOP

test1.txt:
let x = 2
let y = 3 + x

exprParse3 reads test1.txt and stores the intermediate code in codeGen.txt.
[Figure: test1.txt -> exprParse3 -> codeGen.txt]

Compilers: IC/10 26 Emulator Usage
> ./emulator codeGen.txt
Reading code from codeGen.txt
== 2
== 5
Stop

The emulator runs the intermediate code.
[Figure: codeGen.txt -> emulator]

Compilers: IC/10 27 The Instruction Set
- The instructions in codeGen.txt are executed by an emulator, which emulates (simulates) real hardware.
- The instructions refer to two data structures used in the emulator.

Compilers: IC/10 28 The Emulator's Data Structures
- a symbol table of IDs and their integer values
- a stack of integers for evaluating the expressions
[Figure: an example stack holding 2, and an example symbol table holding x with value 4.]

Compilers: IC/10 29 The Instructions
  WRITE      // pop top element off stack and print
  STOP       // exit code emulation
  LOAD ID    // get ID value from symbol table, and push onto stack
  STORE ID   // copy stack top into symbol table for ID
continued

Compilers: IC/10 30
  PUSH integer     // push integer onto stack
  STORE0 ID        // push 0 onto stack, and save to table as value for ID
                   // (same as PUSH 0; STORE ID)
  MULT             // pop two stack values, multiply them, push result back
  ADD, MINUS, DIV  // same for those ops

Compilers: IC/10 31 Intermediate Code Type
Since the intermediate code uses a stack to store values rather than registers, it is a stack-based (postfix) representation.

Compilers: IC/10 32 exprParse3.c Coding
- All the parsing code in exprParse3.c is the same as in exprParse2.c.
- The difference is that the parse tree is passed to a generateCode() function to convert it to intermediate code; see main().

Compilers: IC/10 33 main()
#define CODE_FNM "codeGen.txt"   // where to store generated code

int main(void)
/* parse, print the tree, then generate code
   which is stored in CODE_FNM */
{
  Tree *t;
  nextToken();
  t = statements();
  match(SCANEOF);
  printTree(t, 0);
  generateCode(CODE_FNM, t);
  return 0;
}

Compilers: IC/10 34 Generating the Code
void generateCode(char *fnm, Tree *t)
/* Open the intermediate code file, fnm, and write to it. */
{
  FILE *fp;
  if ((fp = fopen(fnm, "w")) == NULL) {
    printf("Could not write to %s\n", fnm);
    exit(1);
  }
  else {
    printf("Writing code to %s\n", fnm);
    cgTree(fp, t);
    fprintf(fp, "STOP\n");   // last instruction in file
    fclose(fp);
  }
} // end of generateCode()

Compilers: IC/10 35
void cgTree(FILE *fp, Tree *t)
/* Recurse over the parse tree looking for non-NEWLINE
   subtrees to convert into code.
   Each block of code generated for a non-NEWLINE subtree
   ends with a WRITE instruction, to print out the value
   of the line. */
{
  if (t == NULL)
    return;
  Token tok = TreeOper(t);
  if (tok == NEWLINE) {
    cgTree(fp, TreeLeft(t));
    cgTree(fp, TreeRight(t));
  }
  else {
    codeGen(fp, t);
    fprintf(fp, "WRITE\n");   // print value at EOL
  }
} // end of cgTree()

Compilers: IC/10 36
void codeGen(FILE *fp, Tree *t)
/* Convert the tree nodes for ID, INT, ASSIGNOP, PLUSOP,
   MINUSOP, MULTOP, DIVOP into instructions.
   The load/store instructions: LOAD ID, STORE ID, STORE0 ID, PUSH integer
   The math instructions: MULT, ADD, MINUS, DIV */
{
  if (t == NULL)
    return;
  :
continued

Compilers: IC/10 37
  Token tok = TreeOper(t);
  if (tok == ID)
    codeGenID(fp, TreeID(t));
  else if (tok == INT)
    fprintf(fp, "PUSH %d\n", TreeValue(t));
  else if (tok == ASSIGNOP) {   // id = expr
    char *id = TreeID(TreeLeft(t));
    getIDEntry(id);             // don't use Symbol info
    codeGen(fp, TreeRight(t));
    fprintf(fp, "STORE %s\n", id);
  }
  :
continued

Compilers: IC/10 38
  else if (tok == PLUSOP) {
    codeGen(fp, TreeLeft(t));
    codeGen(fp, TreeRight(t));
    fprintf(fp, "ADD\n");
  }
  else if (tok == MINUSOP) {
    codeGen(fp, TreeLeft(t));
    codeGen(fp, TreeRight(t));
    fprintf(fp, "MINUS\n");
  }
  :
continued

Compilers: IC/10 39
  else if (tok == MULTOP) {
    codeGen(fp, TreeLeft(t));
    codeGen(fp, TreeRight(t));
    fprintf(fp, "MULT\n");
  }
  else if (tok == DIVOP) {
    codeGen(fp, TreeLeft(t));
    codeGen(fp, TreeRight(t));
    fprintf(fp, "DIV\n");
  }
} // end of codeGen()

Compilers: IC/10 40
void codeGenID(FILE *fp, char *id)
/* An ID that is already in the symbol table is converted
   into a LOAD instruction; a new ID is added to the table
   and converted into a STORE0 instruction. */
{
  SymbolInfo *si = NULL;
  if ((si = lookupID(id)) != NULL)   // already declared
    fprintf(fp, "LOAD %s\n", id);
  else {   // new, so add to table
    addID(id, 0);   // 0 is default value
    fprintf(fp, "STORE0 %s\n", id);
  }
} // end of codeGenID()

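Compilers: IC/10 41 From Tree to Code
Input:
let x = 2
let y = 3 + x

[Figure: the parse tree, built from NEWLINE (\n) nodes whose subtrees are the assignments x = 2 and y = 3 + x, beside the symbol table in exprParse3.c, which holds x 0 and y 0.]

Generated code:
PUSH 2
STORE x
WRITE
PUSH 3
LOAD x
ADD
STORE y
WRITE
STOP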

Compilers: IC/10 42 The Emulator
> gcc -Wall -o emulator emulator.c
> ./emulator codeGen.txt
Reading code from codeGen.txt
== 2
== 5
Stop

Compilers: IC/10 43 Emulator Data Structures
#define MAX_SYMS 15     // max no of vars
#define STACK_SIZE 10

// stack data structure
int stack[STACK_SIZE];
int stackTop = -1;

// symbol table data structures
typedef struct SymInfo {
  char *id;
  int value;
} SymbolInfo;

int symNum = 0;   // number of symbols stored
SymbolInfo syms[MAX_SYMS];
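The transcript uses push(), pop() and topOf() on this stack but does not show them. A minimal sketch of how they might look follows; the bodies and the error handling are assumptions made for illustration, not the emulator's actual code.

```c
#include <stdio.h>
#include <stdlib.h>

#define STACK_SIZE 10

static int stack[STACK_SIZE];
static int stackTop = -1;   /* index of the top element; -1 when empty */

/* Push v, stopping the program if the stack is full (assumed policy). */
static void push(int v)
{
    if (stackTop == STACK_SIZE - 1) {
        fprintf(stderr, "Error: stack overflow\n");
        exit(1);
    }
    stack[++stackTop] = v;
}

/* Remove and return the top element. */
static int pop(void)
{
    if (stackTop < 0) {
        fprintf(stderr, "Error: pop from empty stack\n");
        exit(1);
    }
    return stack[stackTop--];
}

/* Return the top element without removing it (used by STORE). */
static int topOf(void)
{
    if (stackTop < 0) {
        fprintf(stderr, "Error: empty stack\n");
        exit(1);
    }
    return stack[stackTop];
}
```

With `stackTop` starting at -1, the first push stores into `stack[0]`, and at most `STACK_SIZE` values can be held at once.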

Compilers: IC/10 44 Evaluating Input Lines
void eval(FILE *fp)
/* Read in the code file a line at a time and process the lines.
   An instruction on a line may be a single command (e.g. WRITE)
   or an instruction name and an argument (e.g. LOAD x). */
{
  char buf[BUFSIZ];
  char cmd[MAX_LEN], arg[MAX_LEN];
  int no;
  :
continued

Compilers: IC/10 45
  while (fgets(buf, sizeof(buf), fp) != NULL) {
    no = sscanf(buf, "%s %s\n", cmd, arg);
    if ((no < 1) || (no > 2))
      printf("Unknown format: %s\n", buf);
    else
      processCmd(cmd, arg);   // process commands as they are read in
  }
} // end of eval()
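The loop relies on the value returned by sscanf(), which is the number of fields it managed to fill. A small sketch of that behaviour is given below. `splitInstr()` is a hypothetical helper, and `MAX_LEN` is assumed to be 32 because the transcript does not show its definition; the field widths also guard against overlong tokens, which the original format string does not.

```c
#include <stdio.h>

#define MAX_LEN 32   /* assumed value; not shown in the transcript */

/* Split one instruction line into a command and an optional argument.
   Returns the number of fields found: 0, 1 or 2.
   cmd and arg must each have room for MAX_LEN characters. */
static int splitInstr(const char *line, char *cmd, char *arg)
{
    cmd[0] = arg[0] = '\0';
    /* %31s limits each field to MAX_LEN - 1 characters plus the '\0' */
    int no = sscanf(line, "%31s %31s", cmd, arg);
    return no < 0 ? 0 : no;   /* sscanf returns EOF for a blank line */
}
```

A line such as "LOAD x" yields 2 fields, "WRITE" yields 1, and a blank line yields 0, which is the case the "Unknown format" branch reports.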

Compilers: IC/10 46 Processing an Instruction
void processCmd(char *cmd, char *arg)
{
  SymbolInfo *si;
  if (strcmp(cmd, "LOAD") == 0) {
    if ((si = lookupID(arg)) == NULL) {
      printf("Error: load cannot find %s\n", arg);
      exit(1);
    }
    push(si->value);
  }
  else if (strcmp(cmd, "STORE") == 0)
    addID(arg, topOf());
  else if (strcmp(cmd, "STORE0") == 0) {
    push(0);
    addID(arg, 0);
  }
continued

Compilers: IC/10 47
  else if (strcmp(cmd, "PUSH") == 0)
    push( atoi(arg) );
  else if (strcmp(cmd, "MULT") == 0) {
    int v2 = pop();
    int v1 = pop();
    push( v1*v2 );
  }
  else if (strcmp(cmd, "ADD") == 0) {
    int v2 = pop();
    int v1 = pop();
    push( v1+v2 );
  }
  else if (strcmp(cmd, "MINUS") == 0) {
    int v2 = pop();
    int v1 = pop();
    push( v1-v2 );
  }
continued

Compilers: IC/10 48
  else if (strcmp(cmd, "DIV") == 0) {
    int v2 = pop();
    if (v2 == 0) {
      printf("Error: div by 0; using 1\n");
      v2 = 1;
    }
    int v1 = pop();
    push( v1/v2 );
  }
  else if (strcmp(cmd, "WRITE") == 0)
    printf("== %d\n", pop());
  else if (strcmp(cmd, "STOP") == 0) {
    printf("Stop\n");
    exit(0);   // normal termination, so report success
  }
continued

Compilers: IC/10 49
  else
    printf("Unknown instruction: %s\n", cmd);
} // end of processCmd()
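processCmd() depends on lookupID() and addID(), whose bodies are not part of the transcript. One plausible shape for them, using the SymbolInfo array declared earlier, is sketched here. The linear search, the update-if-present behaviour of addID() and the error handling are all assumptions; in particular, STORE only works on an existing variable if addID() overwrites its value.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMS 15

typedef struct SymInfo {
    char *id;
    int value;
} SymbolInfo;

static int symNum = 0;              /* number of symbols stored */
static SymbolInfo syms[MAX_SYMS];

/* Return the entry for id, or NULL if it is not in the table. */
static SymbolInfo *lookupID(const char *id)
{
    for (int i = 0; i < symNum; i++)
        if (strcmp(syms[i].id, id) == 0)
            return &syms[i];
    return NULL;
}

/* Set id to value, adding a new entry if id is not yet known. */
static void addID(const char *id, int value)
{
    SymbolInfo *si = lookupID(id);
    if (si != NULL) {               /* known ID: overwrite its value */
        si->value = value;
        return;
    }
    if (symNum == MAX_SYMS) {
        fprintf(stderr, "Error: symbol table full\n");
        exit(1);
    }
    char *copy = malloc(strlen(id) + 1);   /* keep a private copy */
    if (copy == NULL) {
        fprintf(stderr, "Error: out of memory\n");
        exit(1);
    }
    strcpy(copy, id);
    syms[symNum].id = copy;
    syms[symNum].value = value;
    symNum++;
}
```

Copying the name matters in the emulator because `arg` is a buffer that eval() overwrites on every line it reads.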

Compilers: IC/10 50 Evaluating the Code for test1.txt
test1.txt:
let x = 2
let y = 3 + x

codeGen.txt:
PUSH 2
STORE x
WRITE
PUSH 3
LOAD x
ADD
STORE y
WRITE
STOP
continued

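Compilers: IC/10 51
Instruction   Stack (bottom to top)   Symbol table
PUSH 2        2                       (empty)
STORE x       2                       x 2
WRITE         (empty)                 x 2            prints == 2
PUSH 3        3                       x 2
continued

Compilers: IC/10 52
Instruction   Stack (bottom to top)   Symbol table
LOAD x        3 2                     x 2
ADD           5                       x 2
STORE y       5                       x 2, y 5
WRITE         (empty)                 x 2, y 5       prints == 5
STOP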