Download presentation
Presentation is loading. Please wait.
Published byStine Mikalsen Modified over 5 years ago
1
Introduction to Intermediate Code, virtual machine implementation
Lecture 27 Prof. Fateman CS 164 Lecture 27
2
Prof. Fateman CS 164 Lecture 27
Where do we go from here? testing AST generation Type checking Cleanup Interpreter in intermediate code/ assmblr Virtual machine out Prof. Fateman CS 164 Lecture 27
3
Prof. Fateman CS 164 Lecture 27
Details of Tiger what the VM must support Details of the VM what IC to generate in intermediate code/ assmblr Virtual machine out Prof. Fateman CS 164 Lecture 27
4
What is Intermediate Code? No single answer…
Any encoding further from the source and closer to the machine (in our case, no line/column numbers is one step!) Generally relatively portable May support several languages (C, C++, Pascal, Fortran) where the resources needed are similar. Languages with widely differing semantics may require different IC. E.g. Java supporting Scheme is hard. Prof. Fateman CS 164 Lecture 27
5
Forms of Intermediate Code
Virtual stack machine simplified “macro” machine 3-address code A :=B op C Register machine models with unlimited number of registers, mapped to real registers later Tree form (= lisp program, more or less) Prof. Fateman CS 164 Lecture 27
6
Using Intermediate Code
We need to make some progress towards the machine we have in mind Subject to manipulation, “optimization” Perhaps several passes Or, we can generate assembler Or, we could plop down in absolute memory locations the binary instructions needed to execute the program “compile-and-go” system Prof. Fateman CS 164 Lecture 27
7
In practice, given a tree representation like an AST
Instead of typechecking or interpreting the code, we are traversing it and putting together a list of instructions… generating code… that if run on a machine -- would execute the program. In other words instead of typechecking a loop,or going through a loop executing it, we write it out in assembler. Prof. Fateman CS 164 Lecture 27
8
Prof. Fateman CS 164 Lecture 27
The simplest example Compiling the Tiger program 3+4 The string “3+4” parses to (OpExp (1 . 2) + (IntExp (1 . 1) 3) (IntExp (1 . 3) 4) typechecker approves and says it is of type int. cleanup program produces (+ 3 4) We now want to produce code. Prof. Fateman CS 164 Lecture 27
9
Just a simple stack machine
Compiling (+ 3 4) (comp '(+ 3 4)) ((pushi 3) (pushi 4) (+)) Result is a list of assembly language instructions push immediate constant 3 on stack push immediate constant 4 on stack + = take 2 args off stack and push sum on stack Conventions: result is left on top of stack. Location of these instructions unspecified. Lengthy notes in ic file describe virtual machine. Prof. Fateman CS 164 Lecture 27
10
It doesn’t have to look like lisp if we write a printing program
(show-comp '(+ 3 4)) ;; prints out… pushi 3 pushi 4 + Prof. Fateman CS 164 Lecture 27
11
We need to have interacting programs, so lets make functions
;; We are shuffling around fns all the time in this compiler (defstruct fn name params code env) (top-comp1 '(+ 3 4)) #S(fn :name main :params nil :code ((args 0) (pushi 3) (pushi 4) (+) (exit 0)) :env (#(…))) (show-fn *) ;;prints this out args 0 pushi 3 pushi 4 + exit 0 nil ;;return value Conventions (args 0): no params. Exit is opcode to return to supervisor level of simulator. Not return from subroutine. Prof. Fateman CS 164 Lecture 27
12
How do we convert to machine code?
(setf r (top-comp1 '(+ 3 4))) #S(fn :name main :params nil :code ((args 0) (pushi 3) (pushi 4) (+) (exit 0)) :env (#(…))) Assembler does this… makes list into vector/array.. (assemble r) ;; MODIFIES :code part of r :code #((args 0) (pushi 3) (pushi 4) (+) (exit 0)) We have ignored :env. It is a pointer to an array of Global functions… print, chr, ord, flush, … Prof. Fateman CS 164 Lecture 27
13
A slightly more interesting example
(setf r (top-comp1 '(IfExp 1 "yes" "no")) ) #S(fn :name main :params nil …..) (show-fn r) ;; slightly more sophisticated show-fn; byte counts 0: args 0 4: pushi 1 8: jumpz L0003 12: pusha ="yes" 16: jump L0004 L0003: 24: pusha ="no" L0004: 32: exit 0 nil Prof. Fateman CS 164 Lecture 27
14
That example,assembled, has no labels
(assemble r) #S(fn :name main :params nil …..) (show-fn r) 0: args 0 4: pushi 1 8: jumpz 20 12: pusha ="yes" 16: jump 24 20: pusha ="no" 24: exit 0 nil Prof. Fateman CS 164 Lecture 27
15
The 2-pass assembler… (50 lines of lisp)
(in-package :tiger) (defun label-p(x)(symbolp x)) ; is x a label? (defun opcode (instr) (if (label-p instr) :label (first instr))) (defun args (instr) (if (listp instr) (rest instr))) (defun arg1 (instr) (if (listp instr) (second instr))) (defun arg2 (instr) (if (listp instr) (third instr))) (defsetf arg1 (instr) (val) `(setf (second ,instr) ,val)) ;; defsetf method makes (setf (arg1 x) foo) ;; mean the same as (setf (second x) foo) ;; we want to change arg1 in (opcode arg1 …) Prof. Fateman CS 164 Lecture 27
16
Prof. Fateman CS 164 Lecture 27
The 2-pass assembler… (defun assemble (fn) "Turn a list of instructions in a function into a vector." ;; count up the instructions and keep track of the labels ;; make a note of where the labels are (program counter ;; relative to start of function). Create a vector of instructions. ;; The transformed "assembled" program can support fast ;; jumps forward and back. (multiple-value-bind (length labels);extract 2 items from 1st pass (asm-first-pass (fn-code fn)) (setf (fn-code fn) ;; put vector back instead of list (asm-second-pass (fn-code fn) length labels)) fn)) Prof. Fateman CS 164 Lecture 27
17
The 1st pass counts up locations
(defun asm-first-pass (code) ;;return 2 values: the total code length ;;and the labels assoc list (let ((length 0) (labels nil)) (dolist (instr code) (if (label-p instr) (push (cons instr length) labels) ;;assume instructions are 4 bytes (incf length 4))) (values (/ length 4) labels))) Prof. Fateman CS 164 Lecture 27
18
The 2nd pass resolves labels, makes vector
(defun asm-second-pass (code length labels) ;;Put code into code-vector, adjusting for labels. ;;For each instruction that refers to a label, ;;put the position in for the label. (let ((addr 0) (code-vector (make-array length))) (dolist (instr code) (unless (label-p instr) (if (is instr '(jump jumpn jumpz save)) ;; a setf method that changes label to number (setf (arg1 instr) (cdr (assoc (arg1 instr) labels)))) (setf (aref code-vector addr) instr) (incf addr ))) code-vector)); returns a vector of code! Prof. Fateman CS 164 Lecture 27
19
The Tiger Virtual Machine
The easy parts Prof. Fateman CS 164 Lecture 27
20
It is a stack-based architecture simulated by a lisp program
The machine is started up running a particular function, f The program counter in that function is set to 0. In general the environment will look like a stack of arrays. This is a peculiarly lispish way of doing things, and could be done other, more efficient ways. On calling a new function a “macro” operation ARGS is used to copy arguments from the stack to the environment. This makes calling rather pricey. However, it makes return very neat. n-args is the number of args supplied to this function instr is a temp variable. Prof. Fateman CS 164 Lecture 27
21
We set up an update-program-counter/ execute loop
(defun machine (f) (let* ((code (fn-code f)) ;an array of instructions (pc 0) (env (fn-env f)) (stack nil) (n-args 0) ;main program has no arguments (instr nil)) (loop ;; pick out the instruction at the given pc (setf instr (elt code (ash pc -2))) ; address ¥ 4 (if *debug-pc* (format t "~%Executing ~s at pc ~s; code is ~s" (fn-name f) pc instr)) (incf pc 4) ;; we assume 4 bytes per instruction (case (opcode instr) ;; ALL THE OPCODES GO HERE Prof. Fateman CS 164 Lecture 27
22
Prof. Fateman CS 164 Lecture 27
Sample Opcodes (defun machine (f) ;;; … stuff left out (case (opcode instr) ;; Variable/stack manipulation instructions: (lvar ; (lvar arg1 arg2) ;; take the value in the environment at offset arg2 ;; in offset arg1, and push on the computation stack (push (elt (elt env (arg1 instr)) (arg2 instr)) stack)) (lset (setf (elt (elt env (arg1 instr)) (arg2 instr)) (top stack))) (pop (pop stack))…. Prof. Fateman CS 164 Lecture 27
23
Prof. Fateman CS 164 Lecture 27
Architecture for arithmetic is based on underlying lisp arithmetic, e.g. + ´ #’+ (defun machine (f) ;;; … stuff left out (case (opcode instr) ;; … more stuff left out ;; arithmetic: use defn from the interpreter ((+ - * / < > <= >= = =) (setf stack (cons (funcall (cadr (assoc (opcode instr) tig-init-env)) (second stack) (first stack)) (cddr stack)))) ;; Other: exit provides exit code.. ((exit nil) (return (top stack))) Prof. Fateman CS 164 Lecture 27
24
Machine + assembler in assemble.cl
250 lines of code, including comments (defun run (exp) ;;exp is (cleanup ast) "compile and run tiger code" (machine (assemble (top-comp1 exp)))) Prof. Fateman CS 164 Lecture 27
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.