8 VM code generation Aspects of code generation Address allocation

Slides:



Advertisements
Similar presentations
Shared-Memory Model and Threads Intel Software College Introduction to Parallel Programming – Part 2.
Advertisements

Simplifications of Context-Free Grammars
JavaCUP JavaCUP (Construct Useful Parser) is a parser generator
CS Data Structures I Chapter 6 Stacks I 2 Topics ADT Stack Stack Operations Using ADT Stack Line editor Bracket checking Special-Palindromes Implementation.
Chapter 7 Constructors and Other Tools. Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 7-2 Learning Objectives Constructors Definitions.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Author: Julia Richards and R. Scott Hawley
1 Copyright © 2013 Elsevier Inc. All rights reserved. Appendix 01.
1 Copyright © 2013 Elsevier Inc. All rights reserved. Chapter 3 CPUs.
UNITED NATIONS Shipment Details Report – January 2006.
1 Hyades Command Routing Message flow and data translation.
7 Copyright © 2005, Oracle. All rights reserved. Creating Classes and Objects.
10 Copyright © 2005, Oracle. All rights reserved. Reusing Code with Inheritance and Polymorphism.
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Year 6 mental test 5 second questions
Year 6 mental test 10 second questions
Programming Language Concepts
Excel Functions. Part 1. Introduction 2 An Excel function is a formula or a procedure that is performed in the Visual Basic environment, outside the.
Lesson 6 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
REVIEW: Arthropod ID. 1. Name the subphylum. 2. Name the subphylum. 3. Name the order.
Intel VTune Yukai Hong Department of Mathematics National Taiwan University July 24, 2008.
Chapter 17 Linked Lists.
Chapter 24 Lists, Stacks, and Queues
ITEC200 Week04 Lists and the Collection Interface.
ABC Technology Project
EU market situation for eggs and poultry Management Committee 20 October 2011.
EU Market Situation for Eggs and Poultry Management Committee 21 June 2012.
Semantic Analysis and Symbol Tables
© Copyright by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. 1 Tutorial 12 – Security Panel Application Introducing.
Green Eggs and Ham.
VOORBLAD.
Name Convolutional codes Tomashevich Victor. Name- 2 - Introduction Convolutional codes map information to code bits sequentially by convolving a sequence.
1 public class Newton { public static double sqrt(double c) { double epsilon = 1E-15; if (c < 0) return Double.NaN; double t = c; while (Math.abs(t - c/t)
BIOLOGY AUGUST 2013 OPENING ASSIGNMENTS. AUGUST 7, 2013  Question goes here!
Factor P 16 8(8-5ab) 4(d² + 4) 3rs(2r – s) 15cd(1 + 2cd) 8(4a² + 3b²)
Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.
1..
CONTROL VISION Set-up. Step 1 Step 2 Step 3 Step 5 Step 4.
© 2012 National Heart Foundation of Australia. Slide 2.
Understanding Generalist Practice, 5e, Kirst-Ashman/Hull
Chapter 5 Test Review Sections 5-1 through 5-4.
25 seconds left…...
Januar MDMDFSSMDMDFSSS
Analyzing Genes and Genomes
Chapter 9 Interactive Multimedia Authoring with Flash Introduction to Programming 1.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
Intracellular Compartments and Transport
PSSA Preparation.
Essential Cell Biology
Copyright 2013 – Noah Mendelsohn Compiling C Programs Noah Mendelsohn Tufts University Web:
1 Chapter 13 Nuclear Magnetic Resonance Spectroscopy.
Energy Generation in Mitochondria and Chlorplasts
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
Chapter 9: Using Classes and Objects. Understanding Class Concepts Types of classes – Classes that are only application programs with a Main() method.
1 Programming Languages (CS 550) Mini Language Interpreter Jeremy R. Johnson.
Code Generation.
1 Languages and Compilers (SProg og Oversættere) Code Generation.
6-1 6 Syntactic analysis  Aspects of syntactic analysis  Tokens  Lexer  Parser  Applications of syntactic analysis  Compiler generation tool ANTLR.
7-1 7 Contextual analysis  Aspects of contextual analysis  Scope checking  Type checking  Case study: Fun contextual analyser  Representing types.
4-1 4 Interpretation  Overview  Virtual machine interpretation  Case study: SVM  Case study: SVM interpreter in Java Programming Languages 3 © 2012.
5-1 5 Compilation  Overview  Compilation phases –syntactic analysis –contextual analysis –code generation  Abstract syntax trees  Case study: Fun language.
Presentation transcript:

8 VM code generation Aspects of code generation Address allocation Code selection Example: Fun code generator Representing addresses Handling jumps Programming Languages 3 © 2012 David A Watt, University of Glasgow

Aspects of code generation (1) Code generation translates the source program (represented by an AST) into equivalent object code. In general, code generation can be broken down into: address allocation (deciding the representation and address of each variable in the source program) code selection (selecting and generating object code) register allocation (where applicable) (assigning registers to local and temporary variables).

Aspects of code generation (2) Here we cover code generation for stack-based VMs: address allocation is straightforward code selection is straightforward register allocation is not an issue! Later we will cover code generation for real machines, where register allocation is an issue (see §15).

Example: Fun compilation (1) Source program: int n = 15 # pointless program proc main (): while n > 1: n = n/2 . .

Example: Fun compilation (2) AST after syntactic analysis (slightly simplified): PROG NOFORMAL DIV ASSN PROC WHILE GT VAR ID ‘n’ INT ‘int’ NUM ‘1’ NUM ‘2’ NUM ‘15’ ID ‘main’

Example: Fun compilation (3) SVM object code after code generation: 0: 3: 6: 7: 10: 13: 14: 17: 20: 23: 24: 27: 30: LOADC 15 CALL 7 HALT LOADG 0 LOADC 1 COMPGT JUMPF 30 LOADG 0 LOADC 2 DIV STOREG 0 JUMP 7 RETURN 0 Address table (simplified) ‘n’ 0 (global) ‘main’ 7 (code) code for procedure main() code to evaluate“n>1” code to execute “while n>1: n=n/2.” code to execute “n=n/2”

Address allocation (1) Address allocation requires collection and dissemination of information about declared variables, procedures, etc. The code generator employs an address table. This contains the address of each declared variable, procedure, etc. E.g.: ‘x’ 0 (global) ‘y’ 2 (global) ‘fac’ 0 (code) ‘main’ 7 (code) variables procedures

Address allocation (2) At each variable declaration, allocate a suitable address, and put the identifier and address into the address table. Wherever a variable is used (e.g., in a command or expression), retrieve its address. At each procedure declaration, note the address of its entry point, and put the identifier and address into the address table. Wherever a procedure is called, retrieve its address.

Note: Fun is simpler: all variables are of size 1. Address allocation (3) Allocate consecutive addresses to variables, taking account of their sizes. E.g.: 1 2 3 4 5 6 … variable of size 4 variable of size 1 variable of size 2 Note: Fun is simpler: all variables are of size 1.

Code selection The code generator will walk the AST. For each construct (expression, command, etc.) in the AST, the code generator must emit suitable object code. The developer must plan what object code will be selected by the code generator.

Code templates For each construct in the source language, the developer should devise a code template. This specifies what object code will be selected. The code template to evaluate an expression should include code to evaluate any sub-expressions, together with any other necessary instructions. The code template to execute a command should include code to evaluate any sub-expressions and code to execute any sub-commands, together with any other necessary instructions.

Example: Fun → SVM code templates (1) Code template for binary operator: code to evaluate expr1 code to evaluate expr2 ADD expr1 PLUS expr2 E.g., code to evaluate “m+(7*n)”: code to evaluate “m” LOADG 3 LOADC 7 LOADG 4 MULT ADD PLUS TIMES NUM ‘7’ ID ‘n’ ID ‘m’ code to evaluate “7*n” We are assuming that m and n are global variables at addresses 3 and 4, respectively.

Example: Fun → SVM code templates (2) Code generator action for binary operator: PLUS expr1 expr2 walk expr1 generating code; walk expr2 generating code; emit instruction “ADD” Compare: The code template specifies what code should be selected. The action specifies what the code generator will actually do to generate the selected code.

Example: Fun → SVM code templates (3) Code template for assignment-command: code to evaluate expr STOREG d or STOREL d ASSN expr ID ‘x’ where d is the address offset of ‘x’ E.g., code to execute “m = n-9”: LOADG 4 LOADC 9 SUB STOREG 3 ASSN MINUS NUM ‘9’ ID ‘n’ ID ‘m’ code to evaluate “n-9”

Example: Fun → SVM code templates (4) Code generator action for assignment-command: walk expr generating code; lookup ‘x’ and retrieve its address d; emit instruction “STOREG d” (if x is global) or “STOREL d” (if x is local) ASSN expr ID ‘x’

Example: Fun → SVM code templates (5) Code template for if-command: IF com expr code to evaluate expr JUMPF code to execute com E.g., code to execute “if m>n: m = n.”: LOADG 3 LOADG 4 CMPGT JUMPF LOADG 4 STOREG 3 IF ASSN ID ‘n’ ID ‘m’ GT code to eval-uate “m>n” code to exec-ute “m = n”

Example: Fun → SVM code templates (6) Code generator action for if-command: IF com expr walk expr, generating code; emit instruction “JUMPF 0”; walk com, generating code; patch the correct address into the above JUMPF instruction

Code generation with ANTLR Recall: In ANTLR we can write a “tree grammar” which describes the ASTs. Each rule in the tree grammar is a pattern match for part of the AST. From the tree grammar, ANTLR generates a depth-first left-to-right tree walker. To build a code generator, we enhance the tree grammar with actions to perform address allocation and code selection. ANTLR inserts those actions into the tree walker.

Case study: Fun tree grammar with code generation actions (1) Fun tree grammar with actions (outline): tree grammar FunEncoder; options { tokenVocab = Fun; …; } @members { private SVM obj = new SVM(); private int varaddr = 0; private SymbolTable<Address> addrTable; … } Creates an instance of the SVM. The code generator will emit instructions directly into its code store.

Case study: Fun tree grammar with code generation actions (2) Fun tree grammar with actions (continued): expr : NUM { let n = value of the numeral; emit “LOADC n”; } | ID { lookup the identifier in addrTable and retrieve its address d; emit “LOADG d” or “LOADL d”; } | …

Case study: Fun tree grammar with code generation actions (3) Fun tree grammar with actions (continued): | ^(EQ expr //generate code for left expr expr //generate code for right expr ) { emit “CMPEQ”; } | ^(PLUS expr //generate code for left expr expr //generate code for right expr ) { emit “ADD”; } | ^(NOT expr //generate code for expr ) { emit “INV”; } | … ;

Case study: Fun tree grammar with code generation actions (4) Fun tree grammar with actions (continued): com : ^(ASSN ID expr //generate code for expr ) { lookup the identifier in addrTable and retrieve its address d; emit “STOREG d” or “STOREL d”; } | …

Case study: Fun tree grammar with code generation actions (5) Fun tree grammar with actions (continued): | ^(IF expr //generate code for expr { emit “JUMPF 0” (incomplete); } com //generate code for com ) { let c = next instruction address; patch c into the incomplete “JUMPF” instruction; } | ^(SEQ com* //generate code for com* ) | … ;

Case study: Fun tree grammar with code generation actions (6) Fun tree grammar with actions (continued): var_decl : ^(VAR type ID expr //generate code for expr ) { put the identifier into addr- Table along with varaddr; increment varaddr; } ; type : BOOL | INT ;

Case study: Fun tree grammar with code generation actions (7) Fun tree grammar with actions (continued): prog returns [SVM objprog] : ^(PROG { put ‘read’ and ‘write’ into addrTable; } var_decl* //generate code for var_decl* { emit “CALL 0” (incomplete); emit “HALT”; } proc_decl+//generate code for proc_decl* ) { lookup ‘main’ in addrTable and retrieve its address c; patch c into the incomplete CALL instruction; set $objprog to obj; } ;

Case study: Fun compiler (1) Put the above tree grammar in a file named FunEncoder.g. Feed this as input to ANTLR: …$ java org.antlr.Tool FunEncoder.g ANTLR generates a class FunEncoder containing methods that walk the AST and perform the code generation actions.

Case study: Fun compiler (2) Program to run the Fun syntactic analyser and code generator: public class FunRun { public static void main (String[] args) { // Syntactic analysis: … CommonTree ast = (CommonTree) parser.prog().getTree(); // Code generation: FunEncoder encoder = new FunEncoder( new CommonTreeNodeStream(ast)); SVM objcode = encoder.prog(); } }

Representing addresses The code generator must distinguish between three kinds of addresses: A code address refers to an instruction within the space allocated to the object code. A global address refers to a location within the space allocated to global variables. A local address refers to a location within a space allocated to a group of local variables.

Case study: implementation of Fun addresses Implementation in Java: public class Address { public static final int CODE = 0, GLOBAL = 1, LOCAL = 2; public int offset; public int locale; // CODE, GLOBAL, or LOCAL public Address (int off, int loc) { offset = off; locale = loc; } }

Handling jumps (1) The code generator emits instructions one by one. When an instruction is emitted, it is added to the end of the object code. At the destination of a jump instruction, the code generator must note the destination address and incorporate it into the jump instruction.

Handling jumps (2) For a backward jump, the destination address is already known when the jump instruction is emitted. For a forward jump, the destination address is unknown when the jump instruction is emitted. Solution: Emit an incomplete jump instruction (with 0 in its address field), and note its address. When the destination address becomes known later, patch that address into the jump instruction.

Example: Fun while-command (1) Code template for while-command: code to evaluate expr JUMPF code to execute com JUMP … WHILE expr com

Example: Fun while-command (2) AST of while-command “while n>1: n=n/2.”: DIV ASSN WHILE GT ID ‘n’ NUM ‘1’ NUM ‘2’ Assume that the while-command’s object code will start at address 7.

Example: Fun while-command (3) Code generator action (animated): … … LOADG 0 LOADC 1 COMPGT JUMPF 0 LOADG 0 LOADC 2 DIV STOREG 0 JUMP 7 0: 7: 10: 13: 14: 17: 20: 23: 24: 27: 30: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 14 c2 7 c1 … … LOADG 0 LOADC 1 COMPGT JUMPF 30 LOADG 0 LOADC 2 DIV STOREG 0 JUMP 7 0: 7: 10: 13: 14: 17: 20: 23: 24: 27: 30: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 30 c3 14 c2 7 c1 … … LOADG 0 LOADC 1 COMPGT JUMPF 0 LOADG 0 LOADC 2 DIV STOREG 0 0: 7: 10: 13: 14: 17: 20: 23: 24: 27: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 14 c2 7 c1 … … LOADG 0 LOADC 1 COMPGT JUMPF 0 LOADG 0 LOADC 2 DIV STOREG 0 JUMP 7 0: 7: 10: 13: 14: 17: 20: 23: 24: 27: 30: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 30 c3 14 c2 7 c1 … … LOADG 0 LOADC 1 COMPGT 0: 7: 10: 13: 14: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 14 c2 7 c1 … … 0: 7: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 … … 0: 7: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 7 c1 … … LOADG 0 LOADC 1 COMPGT 0: 7: 10: 13: 14: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 7 c1 … … LOADG 0 LOADC 1 COMPGT JUMPF 0 0: 7: 10: 13: 14: 17: note the current instruction address c1 walk expr, generating code note the current instruction address c2 emit “JUMPF 0” walk com, generating code emit “JUMP c1” note the current instruction address c3 patch c3 into the jump at c2 14 c2 7 c1