Prof. Fateman CS 164 Lecture 191 Introduction to Intermediate Code, virtual machine implementation Lecture 19.

Slides:



Advertisements
Similar presentations
Chapter 11 Introduction to Programming in C
Advertisements

Chapter 16 Java Virtual Machine. To compile a java program in Simple.java, enter javac Simple.java javac outputs Simple.class, a file that contains bytecode.
1 Lecture 3: MIPS Instruction Set Today’s topic:  More MIPS instructions  Procedure call/return Reminder: Assignment 1 is on the class web-page (due.
1 Lecture 10 Intermediate Representations. 2 front end »produces an intermediate representation (IR) for the program. optimizer »transforms the code in.
The University of Adelaide, School of Computer Science
10/9: Lecture Topics Starting a Program Exercise 3.2 from H+P Review of Assembly Language RISC vs. CISC.
1 Compiler Construction Intermediate Code Generation.
Wannabe Lecturer Alexandre Joly inst.eecs.berkeley.edu/~cs61c-te
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /17/2013 Lecture 12: Procedures Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE CENTRAL.
Ch. 8 Functions.
The University of Adelaide, School of Computer Science
1 COMS 361 Computer Organization Title: Instructions Date: 9/28/2004 Lecture Number: 10.
The Assembly Language Level
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
Program Representations. Representing programs Goals.
CPSC Compiler Tutorial 8 Code Generator (unoptimized)
Prof. Fateman CS 164 Lecture 151 Language definition by interpreter, translator, continued Lecture 15.
1 Storage Registers vs. memory Access to registers is much faster than access to memory Goal: store as much data as possible in registers Limitations/considerations:
Memory Allocation. Three kinds of memory Fixed memory Stack memory Heap memory.
CS 536 Spring Run-time organization Lecture 19.
3/17/2008Prof. Hilfinger CS 164 Lecture 231 Run-time organization Lecture 23.
CS 536 Spring Code generation I Lecture 20.
Assemblers Dr. Monther Aldwairi 10/21/20071Dr. Monther Aldwairi.
CSCE 121, Sec 200, 507, 508 Fall 2010 Prof. Jennifer L. Welch.
Run time vs. Compile time
Prof. Fateman CS 164 Lecture 201 Virtual Machine Structure Lecture 20.
Chapter 4 H1 Assembly Language: Part 2. Direct instruction Contains the absolute address of the memory location it accesses. ld instruction:
Chapter 2: Impact of Machine Architectures What is the Relationship Between Programs, Programming Languages, and Computers.
Prof. Fateman CS164 Lecture 211 Local Optimizations Lecture 21.
4/6/08Prof. Hilfinger CS164 Lecture 291 Code Generation Lecture 29 (based on slides by R. Bodik)
IT253: Computer Organization Lecture 4: Instruction Set Architecture Tonga Institute of Higher Education.
Computer Architecture Instruction Set Architecture Lynn Choi Korea University.
CSC 3210 Computer Organization and Programming Chapter 1 THE COMPUTER D.M. Rasanjalee Himali.
5-1 Chapter 5 - Languages and the Machine Department of Information Technology, Radford University ITEC 352 Computer Organization Principles of Computer.
Copyright © 2005 Elsevier Chapter 8 :: Subroutines and Control Abstraction Programming Language Pragmatics Michael L. Scott.
Languages and the Machine Chapter 5 CS221. Topics The Compilation Process The Assembly Process Linking and Loading Macros We will skip –Case Study: Extensions.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
April 23, 2001Systems Architecture I1 Systems Architecture I (CS ) Lecture 9: Assemblers, Linkers, and Loaders * Jeremy R. Johnson Mon. April 23,
5-1 Chapter 5 - Languages and the Machine Principles of Computer Architecture by M. Murdocca and V. Heuring © 1999 M. Murdocca and V. Heuring Principles.
Prof. Fateman CS 164 Lecture 111 Syntax  Simple Semantics Lecture 11.
Copyright 2006 by Timothy J. McGuire, Ph.D. 1 MIPS Assembly Language CS 333 Sam Houston State University Dr. Tim McGuire.
Prof. Fateman CS 164 Lecture 281 Virtual Machine Structure Lecture 28.
Microprocessors The ia32 User Instruction Set Jan 31st, 2002.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 11: Functions and stack frames.
Code Generation CPSC 388 Ellen Walker Hiram College.
1 CS/COE0447 Computer Organization & Assembly Language Chapter 2 Part 3.
Copyright 2006 by Timothy J. McGuire, Ph.D. 1 MIPS Assembly Language CS 333 Sam Houston State University Dr. Tim McGuire.
Programming Language Concepts (CIS 635) Elsa L Gunter 4303 GITC NJIT,
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Lecture 3 Translation.
Component 1.6.
Computer Architecture Instruction Set Architecture
CS2100 Computer Organisation
Computer Architecture Instruction Set Architecture
Run-time organization
Introduction to Compilers Tim Teitelbaum
Chapter 9 :: Subroutines and Control Abstraction
Instructions - Type and Format
Chapter 11 Introduction to Programming in C
Chap. 8 :: Subroutines and Control Abstraction
Chap. 8 :: Subroutines and Control Abstraction
CSCE Fall 2013 Prof. Jennifer L. Welch.
The Metacircular Evaluator
The University of Adelaide, School of Computer Science
Chapter 11 Introduction to Programming in C
CSCE Fall 2012 Prof. Jennifer L. Welch.
Chapter 11 Introduction to Programming in C
Introduction to Intermediate Code, virtual machine implementation
CSc 453 Interpreters & Interpretation
Presentation transcript:

Prof. Fateman CS 164 Lecture 191 Introduction to Intermediate Code, virtual machine implementation Lecture 19

Prof. Fateman CS 164 Lecture 192 Where do we go from here? AST generation Type checking Cleanup?Interpreter intermediate code/ assmblr testing Virtual machine in out

Prof. Fateman CS 164 Lecture 193 Details of MJ  what the VM must support Details of the VM  what IC to generate intermediate code/ assmblr Virtual machine in out

Prof. Fateman CS 164 Lecture 194 What is Intermediate Code? No single answer… Any encoding further from the source and closer to the machine. Possibilities: –Same except remove line/column numbers! –Change all parameter or local NAMES to stack offset counts –Change all global references (vars, methods) to vtable counts –Possible spew out Lisp as an IC. [My favorite: map every language you encounter into Common Lisp. Macro defs make it possible to define an intermediate level language “especially suited to IC” in Lisp. ] Not (usually) machine code itself Usually some extra “abstraction” –Imagine you have arbitrary numbers of registers for calculations. –Imagine you have “macro” instructions like x=a[i,j]

Prof. Fateman CS 164 Lecture 195 What is Intermediate Code? Advantages Generally relatively portable between machines One IC may support several languages (a typical set might be C, C++, Pascal, Fortran) where the resources needed are similar. (If you can support C, the rest of them are pretty easy.) Languages with widely-differing semantics will not fit in this restricted set and may require an extended IC. E.g. Java IC supporting Scheme is hard because Scheme has 1 st -class functions. Scheme IC supporting Java is plausible structurally but might not be as efficient.

Prof. Fateman CS 164 Lecture 196 Forms of Intermediate Code Virtual stack machine simplified “macro” machine 3-address code A :=B op C Register machine models with unlimited number of registers, mapped to real registers later Tree form (= lisp program, more or less)

Prof. Fateman CS 164 Lecture 197 Using Intermediate Code We need to make some progress towards the machine we have in mind –Subject to manipulation, “optimization” –Perhaps several passes –Or, we can generate assembler –Or, we could plop down in absolute memory locations the binary instructions needed to execute the program “compile-and-go” system. This was a common “student compiler” technique for Pascal and Fortran.

Prof. Fateman CS 164 Lecture 198 Reminder of what we are doing.. From a tree representation like our AST.. Instead of typechecking or interpreting the code, we are traversing it and putting together a list of instructions… generating IC … that if run on a machine -- would execute the program. In other words instead of interpreting or typechecking a loop,or going through a loop executing it, we write it out in assembler.

Prof. Fateman CS 164 Lecture 199 The simplest example Compiling the MJ program segment … 3+4… The string "3+4" [too simple to have line/column numbers] parses to (Plus (IntegerLiteral 3) (IntegerLiteral 4)) typechecker approves and says it is of type int. A program translating to lisp produces (+ 3 4) Which could be executed… But we don’t really want Lisp, we want machine code. We could start from the Lisp or from the AST, i.e. …. (Plus …)…

Prof. Fateman CS 164 Lecture 1910 Just a simple stack machine (oversimplified) Compiling (+ 3 4) (some-kind-of-compiler '(+ 3 4))  ;; I made-up name ((pushi 3) (pushi 4) (+)) Result is a list of assembly language instructions push immediate constant 3 on stack push immediate constant 4 on stack + = take 2 args off stack and push sum on stack Conventions: result is left on top of stack. Location of these instructions unspecified. Lengthy notes in simple-machine file describe virtual machine.

Prof. Fateman CS 164 Lecture 1911 It doesn’t have to look like lisp if we write a printing program (pprint-code compiled-vector) ;; prints out… pushi 3 pushi 4 +

Prof. Fateman CS 164 Lecture 1912 Consider the file Simple-compiler.fasl Load this file and you have functions for compiling methods, exps, statements. You can trace them. mj-c-method compiles one method mj-c-exp compiles one expression mj-c-statement compiles one statement Each program calls “emit” to add to the generated program some sequence of instructions. Typically these are consed on to the front of a list intended to become the “body” of a method. This list which is then reversed, embroidered with other instructions, and is ready to assemble.

Prof. Fateman CS 164 Lecture 1913 Some FAQ. 1. Redundant work? Q: Going back to the AST for compiling, it seems to me that I am re-doing things I already did (or did 95%) for type-checking. A. You are right. Next time you write a compiler (hehe) you will remember and maybe you will save the information some way. E.g. save the environment / inheritance hierarchy, offsets for variables, type data used to determine assembler instructions, etc.

Prof. Fateman CS 164 Lecture 1914 FAQ 2. Where do the programs live? You might ask this question; how are the methods placed in memory? For now we do not have to say, but we could let a “loader” determine where to put each code segment in memory. Each method can refer to instructions in its body by a relative address; a call sets the program counter (PC) to the top of the method’s code. In a “real” program, a loader would resolve the references to methods /classes/ etc defined elsewhere. MJ has no such problems. (discuss why?) Or we could let all this live in some world where there is a symbol table (like Lisp) and lets us just grab the definition when we want to get it. (Dynamic loading, too)

Prof. Fateman CS 164 Lecture 1915 FAQ 3: Too many layers? Q. There are too many pieces up in the air. Where do I start? A. Yes, we are faced with a multi-level target for understanding. That’s why we took it in steps up to here. Two more levels, closely tied together. We produce code for the Assembler –Understand the assembler input / output –Requires understanding VM The VM determines what instructions make sense to generate to accomplish tasks (esp. call/return, get/set data The programming language definition determines what we need. E.g. Where does “this” object come from? –Look at output of translator –You see what to generate (or equivalent…)

Prof. Fateman CS 164 Lecture 1916 The 2-pass assembler… (50 lines of code?) What does the assembler program do with a list of symbolic instructions – the reversed list mentioned previously? 1.Turn a list of instructions in a function into a vector. ;; count up the instructions and keep track of the labels ;; make a note of where the labels are (program counter ;; relative to start of function). ;; Create a vector of instructions. ;; The transformed "assembled" program can support fast ;; jumps forward and back. (multiple-value-bind (length labels) ;extract 2 items from 1 st pass (asm-first-pass (fn-code fn)) (setf (fn-code fn) ;; put vector back instead of list (asm-second-pass (fn-code fn) length labels)) fn))

Prof. Fateman CS 164 Lecture 1917 What does the assembler do? Transforms a list of symbolic instructions and labels into a vector. –Turn a list of instructions in a function into a vector. –When there is a jump, replace with a jump where the is the location of that First pass merely counts up the instructions and keeps track of the labels. Produces a lookup-table (e.g. assoc. list) for label  integer mapping. Second pass creates the vector with substitutions

Prof. Fateman CS 164 Lecture 1918 The 1 st pass counts up locations (defun asm-first-pass (code) "Return the label assoc list" (let ((length 0) (labels nil)) (dolist (inst code) (if (label-p inst) (push (cons inst length) labels) (incf length))) labels)) ;; could return (values length labels)

Prof. Fateman CS 164 Lecture 1919 The 2 nd pass resolves labels, makes vector (defun asm-second-pass (code labels) "Put code into code-vector, adjusting for labels." (coerce (map 'list #'(lambda (inst) (if (member (car inst) '(jump jumpn jumpz pusha call)) `(,(car inst),(cdr (assoc (second inst) inst)) inst)) (remove-if #'label-p code)) 'vector))

Prof. Fateman CS 164 Lecture 1920 The MJ Virtual Machine The easy parts

Prof. Fateman CS 164 Lecture 1921 It is a stack-based architecture simulated by a Lisp program. How does this differ from the MIPS architecture in CS61c? 1.No registers for values of variables 2.(there are some implicit registers fp, sp, pc)

Prof. Fateman CS 164 Lecture 1922 It is a stack-based architecture simulated by a Lisp program Some differences from CS61c MIPS/SPIM architecture –Registers not used for passing arguments –No “caller saved” or “callee saved” registers –Only implicit registers (e.g. program counter, frame pointer, stack pointer) –Debugging in the VM –I/O much different –No interrupts –No jump delay slot –No floating point co-processor –Undoubtedly more differences

Prof. Fateman CS 164 Lecture 1923 We set up an update-program-counter/ execute loop (defun run-vm (vm) ;; set up small utilities (loop (vm-fetch vm) ; Fetch instruction / update PC (case (car (vm-state-inst vm)) ;; Variable/stack manipulation instructions (lvar (vpush (frame (a1)))) (lset (set-frame (a1) (vpop))) ;;; etc etc (pushi (vpush (a1))) (pusha (vpush (a1))) ;; Branching instructions: (jump (set-pc (a1))) ;;; etc ;; Function call/return instructions: ;; ….. ;; Arithmetic and logical operations: ;; ….. ;; Other: ;;;…. (t (error "Unknown opcode: ~a" (vm-state-inst vm)))))

Prof. Fateman CS 164 Lecture 1924 The outer exception handling controller (defun run-vm (vm) ;; set up small utilities (handler-case (loop ;;process stuff ) (error (pe) (format t "~%Caught error: ~a" pe) (if (vm-state-crash-hook vm) (funcall (vm-state-crash-hook vm) vm) (vm-state-print vm)))))

Prof. Fateman CS 164 Lecture 1925 List of Opcodes (I) DETAILS in simple-machine.lisp ;; the EASY ones ;; +, *, -, <, and - Operations on two variables (e.g. pop a, b, push a-b) ;; not - Operates on one variable ;; ;; print - Pops an integer and prints it ;; read - Reads an integer and pushes it ;; exit s - Exits with status code s ;; ;; debug - Ignored by the machine; may be used for debugging hooks ;; break - Aborts execution; may be useful for debugging

Prof. Fateman CS 164 Lecture 1926 List of Opcodes (II) ;; lvar i - Gets the ith variable from the stack frame ;; lset i - Sets the ith variable from the stack frame ;; pop - Pops the stack ;; swap - Swaps the top two elements ;; dup i - Duplicates the ith entry from top of stack ;; addi i - Adds an immediate value to the top of stack ;; alloc - Pops a size from the stack; allocates a new array of that size ;; alen - Pops an array, pushes the length ;; mem - Pops index and array; pushes array[index]. ;; smem - Pops value, index, and array; sets array[index] = value. ;; pushi i - Push an immediate value ;; pusha i - Push an address ;; ;; jump a - Jumps to a ;; jumpz a - Pops val, jumps to a if it's zero ;; jumpn a - Pops val, jumps to a if it isn't zero ;; jumpi - Pops an address and jumps to it ;; ;; call f n - Calls function f with n arguments. ;; calli n - Calls a function with n arguments. Address popped from stack. ;; frame m - Push zeros onto the stack until there are m slots in the frame. ;; return - Returns from a call. Stack should contain just the return val.

Prof. Fateman CS 164 Lecture 1927 Architecture for arithmetic is based on underlying lisp arithmetic, e.g. + ´ #’+ Machine arithmetic can be defined as anything we wish (arbitrary precision? 16 bit, 32 bit, 64 bit? )

Prof. Fateman CS 164 Lecture 1928 Machine + assembler in simple-machine.lisp 250 lines of code, including comments (defun run (exp) ;;exp is (cleaned-up-perhaps ast) "compile and run MJ code" (machine (assemble (mj-compile exp)))) (machine(assemble(mj-compile(cleanup(mj-parse filename))))) is equivalent to run-mj, though we should check for semantic errors in there, too. Details, lots, in the on-line stuff; Discussion in sections.