Running MiniJava Basic Code Generation and Bootstrapping

Slides:



Advertisements
Similar presentations
Intermediate Code Generation
Advertisements

Chapter 8 ICS 412. Code Generation Final phase of a compiler construction. It generates executable code for a target machine. A compiler may instead generate.
CSI 3120, Implementing subprograms, page 1 Implementing subprograms The environment in block-structured languages The structure of the activation stack.
Program Representations. Representing programs Goals.
Intermediate code generation. Code Generation Create linear representation of program Result can be machine code, assembly code, code for an abstract.
1 Chapter 7: Runtime Environments. int * larger (int a, int b) { if (a > b) return &a; //wrong else return &b; //wrong } int * larger (int *a, int *b)
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
CS 536 Spring Code generation I Lecture 20.
10/14/2002© 2002 Hal Perkins & UW CSEM-1 CSE 582 – Compilers Running JFlat Basic Code Generation and Bootstrapping Hal Perkins Autumn 2002.
Run time vs. Compile time
4/6/08Prof. Hilfinger CS164 Lecture 291 Code Generation Lecture 29 (based on slides by R. Bodik)
8/9/2015© Hal Perkins & UW CSEK-1 CSE P 501 – Compilers Code Shape I – Basic Constructs Hal Perkins Winter 2008.
1 Stacks Chapter 4 2 Introduction Consider a program to model a switching yard –Has main line and siding –Cars may be shunted, removed at any time.
Dr. José M. Reyes Álamo 1.  The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access ◦ Variables ◦ Arrays ◦
Compiler Construction
CPSC 388 – Compiler Design and Construction Runtime Environments.
Activation Records (in Tiger) CS 471 October 24, 2007.
11/12/2015© Hal Perkins & UW CSEL-1 CSE P 501 – Compilers Code Shape II – Objects & Classes Hal Perkins Winter 2008.
Prof. Fateman CS 164 Lecture 281 Virtual Machine Structure Lecture 28.
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
CSC 8505 Compiler Construction Runtime Environments.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 11: Functions and stack frames.
1 Assembly Language: Function Calls Jennifer Rexford.
3/5/2016© Hal Perkins & UW CSEM-1 CSE P 501 – Compilers Running MiniJava Basic Code Generation and Bootstrapping Hal Perkins Winter 2008.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
CS 536 © CS 536 Spring Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 15.
7-Nov Fall 2001: copyright ©T. Pearce, D. Hutchinson, L. Marshall Oct lecture23-24-hll-interrupts 1 High Level Language vs. Assembly.
Run-Time Environments Presented By: Seema Gupta 09MCA102.
Code Generation Instruction Selection Higher level instruction -> Low level instruction Register Allocation Which register to assign to hold which items?
Practical Session 3.
Lecture 3 Translation.
CS 404 Introduction to Compiler Design
Writing Functions in Assembly
Run-Time Environments Chapter 7
User-Written Functions
Assembly language.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Dr. Bernard Chen Ph.D. University of Central Arkansas
Microprocessor and Assembly Language
Introduction to Compilers Tim Teitelbaum
Writing Functions in Assembly
Stacks Chapter 4.
Stack Data Structure, Reverse Polish Notation, Homework 7
Basic Code Generation and Bootstrapping
Chapter 9 :: Subroutines and Control Abstraction
CS 3304 Comparative Languages
Lecture 4: MIPS Instruction Set
Variables ICS2O.
Chapter 6 Intermediate-Code Generation
Code Shape II – Objects & Classes Hal Perkins Autumn 2011
Lecture 30 (based on slides by R. Bodik)
File I/O in C Lecture 7 Narrator: Lecture 7: File I/O in C.
Code Shape I – Basic Constructs Hal Perkins Autumn 2011
Multi-modules programming
Running MiniJava Basic Code Generation and Bootstrapping
Code Shape II – Objects & Classes Hal Perkins Autumn 2005
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
UNIT V Run Time Environments.
Code Shape II – Objects & Classes Hal Perkins Summer 2004
Course Overview PART I: overview material PART II: inside a compiler
Run Time Environments 薛智文
Runtime Environments What is in the memory?.
Running MiniJava Basic Code Generation and Bootstrapping
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Computer Organization and Assembly Language
Procedure Linkages Standard procedure linkage Procedure has
Code Shape II – Objects & Classes Hal Perkins Autumn 2009
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Topic 2b ISA Support for High-Level Languages
Presentation transcript:

Running MiniJava Basic Code Generation and Bootstrapping CSE P 501 – Compilers Running MiniJava Basic Code Generation and Bootstrapping Hal Perkins Summer 2004 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Agenda What we need to finish the project Assembler source file format A basic code generation strategy Interfacing with the bootstrap program Implementing the system interface 1/1/2019 © 2002-05 Hal Perkins & UW CSE

What We Need To run a MiniJava program Space needs to be allocated for a stack and a heap ESP and other registers need to have sensible initial values We need some way to allocate storage and communicate with the outside world 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Bootstraping from C Idea: take advantage of the existing C runtime library Write a small C main program that calls the MiniJava main method as if it were a C function C’s standard library provides the execution environment and we can call C functions from compiled code for I/O, malloc, etc. 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Assembler File Format Here is a skeleton for the .asm file to be produced by MiniJava compilers (MASM syntax) .386 ; use 386 extensions .model flat,c ; use 32-bit flat address space with ; C linkage conventions for ; external labels public asm_main ; start of compiled static main extern put:near,get:near,mjmalloc:near ; external C routines .code ;; generated code repeat .code/.data as needed .data ;; generated method tables … end 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Generating .asm Code Suggestion: isolate the actual output operations in a handful of routines Modularity & saves some typing Possibilities // write code string s to .asm output void gen(String s) { … } // write “op src,dst” to .asm output void genbin(String op, String src, String dst) { … } // write label L to .asm output as “L:” void genLabel(String L) { … } A handful of these methods should do it 1/1/2019 © 2002-05 Hal Perkins & UW CSE

A Simple Code Generation Strategy Priority: quick ‘n dirty correct code first, optimize later if time Traverse AST primarily in execution order and emit code during the traversal May need to control the traversal from inside the visitor methods, or have both bottom-up and top-down visitors Treat the x86 as a 1-register stack machine Alternative strategy: produce lower-level linear IR and generate from that (after possible optimizations) We’ll cover this in lecture, but may be more ambitious than we can do in 10 weeks 1/1/2019 © 2002-05 Hal Perkins & UW CSE

x86 as a Stack Machine Idea: Use x86 stack for expression evaluation with eax as the “top” of the stack Whenever an expression (or part of one) is evaluated at runtime, the result is in eax If a value needs to be preserved while another expression is evaluated, push eax, evaluate, then pop when needed Remember: always pop what you push Will produce lots of redundant, but correct, code Examples below follow code shape examples, but with some details about where code generation fits 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: Generate Code for Constants and Identifiers Integer constants, say 17 gen(mov eax,17) leaves value in eax Variables (whether int, boolean, or reference type) gen(mov eax,[appropriate base register+ appropriate offset]) also leaves value in eax 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: Generate Code for exp1 + exp1 Visit exp1 generates code to evaluate exp1 and put result in eax gen(push eax) generate a push instruction Visit exp2 generates code for exp2; result in eax gen(pop edx) pop left argument into edx; cleans up stack gen(add eax,edx) perform the addition; result in eax 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: var = exp; (1) Assuming that var is a local variable visit node for exp Generates code that leaves the result of evaluating exp in eax gen(mov [ebp+offset of variable],eax) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: var = exp; (2) If var is a more complex expression visit var gen(push eax) push reference to variable or object containing variable onto stack visit exp gen(pop edx) gen(mov [edx+appropriate_offset],eax) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: Generate Code for obj.f(e1,e2,…en) Visit en leaves argument in eax gen(push eax) … Repeat until all arguments pushed Visit obj leaves reference to object in eax Note: this isn’t quite right if evaluating obj has side effects – ignore for simplicity for now gen(mov ecx,eax) copy “this” pointer to ecx generate code to load method table pointer generate call instruction with indirect jump gen(add esp,numberOfBytesOfArguments) Pop arguments 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Method Definitions Generate label for method Generate method prologue Visit statements in order Method epilogue will be generated as part of each return statement (next) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: return exp; Visit exp; leaves result in eax where it should be Generate method epilogue to unwind the stack frame; end with ret instruction 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Control Flow: Unique Labels Needed: a String-valued method that returns a different label each time it is called (e.g., L1, L2, L3, …) Variation: a set of methods that generate different kinds of labels for different constructs (can really help readability of the generated code) (while1, while2, while3, …; if1, if2, …; else1, else2, …; fi1, fi2, … .) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Control Flow: Tests Recall that the context for compiling a boolean expression is Jump target Whether to jump if true or false So visitor for a boolean expression needs this information from parent node 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example: while(exp) body Assuming we want the test at the bottom of the generated loop… gen(jmp testLabel) gen(bodyLabel:) visit body gen(testLabel:) visit exp (condition) with target=bodyLabel and sense=“jump if true” 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example exp1 < exp2 Similar to other binary operators Difference: context is a target label and whether to jump if true or false Code visit exp1 gen(push eax) visit exp2 gen(pop edx) gen(cmp eax,edx) gen(condjump targetLabel) appropriate conditional jump depending on sense of test 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Boolean Operators && and || !exp Create label needed to skip around second operand when appropriate Generate subexpressions with appropriate target labels and conditions !exp Generate exp with same target label, but reverse the sense of the condition 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Join Points Loops and conditional statements have join points where execution paths merge Generated code must ensure that machine state will be consistent regardless of which path is taken to reach a join point i.e., the paths through an if-else statement must not leave a different number of bytes pushed onto the stack If we want a particular value in a particular register at a join point, both paths must put it there, or we need to generate additional code to get value in the right register With the simple 1-accumulator model of code generation, this should generally be true without needing extra work 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Bootstrap Program The bootstrap will be a tiny C program that calls your compiled code as if it were an ordinary C function It also contains some functions that compiled code can call as needed Mini “runtime library” You can add to this if you like Sometimes simpler to generate a call to a newly written library routine instead of generating in-line code – implementor tradeoff 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Example Bootstrap Program #include <stdio.h> extern void asm_main(); /* compiled code */ /* execute compiled program */ void main() { asm_main(); } /* return next integer from standard input */ int get() { … } /* write x to standard output */ void put(int x) { … } /* return a pointer to a block of memory at least nBytes large (or null if insufficient memory available) */ void * runtimealloc(int nBytes) { return malloc(nBytes); } 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Interfacing to External Code Recall that the .asm file includes these declarations at the top public asm_main ; start of compiled static main extern put:near,get:near,mjmalloc:near ; external C routines “public” means that the label is defined in the .asm file and can be linked from external files Jargon: also known as an entry point “extern” declares labels used in the .asm file that must be found in another file at link time “near” means in same segment (as opposed to multi-segment MS-DOS programs of ancient times) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Main Program Label Compiler needs special handling for the static main method Label must be the same as the one declared extern in the C bootstrap program and declared public in the .asm file asm_main used above Can be changed if you wish Why not “main”? (Hint: what is the real main function here?) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

Interfacing to “Library” code To call “behind the scenes” library routines: Must be declared extern in generated code Call using normal C language conventions 1/1/2019 © 2002-05 Hal Perkins & UW CSE

System.out.println(exp) Can handle in an ad-hoc way (particularly since this is a “reserved word” in MiniJava) <compile exp; result in eax> push eax ; push parameter call put ; call external put routine add esp,4 ; pop parameter A more general solution Hand-code (in asm) classes to act as a bridge between compiled code and the C runtime Put information about these classes in the symbol table at compiler initialization Calls to these routines compile normally – no other special case code needed in the compiler(!) 1/1/2019 © 2002-05 Hal Perkins & UW CSE

And That’s It… We’ve now got enough on the table to complete the compiler project Coming Attractions Lower-level IR Back end (instruction selection and scheduling, register allocation) Middle (optimizations) 1/1/2019 © 2002-05 Hal Perkins & UW CSE