1
Computer Science 210 Computer Organization
Building an Assembler Part III: The First Pass and Scope Analysis
2
The First Pass
[Diagram: Text file -> CharacterIO -> Line stream -> Scanner -> Token stream -> First Pass. Tools: opcode table, symbol table. Outputs: source program listing and error messages (file and/or terminal), Sym file.]
3
What the First Pass Does
Enters each distinct label into a symbol table.
If a duplicate label is encountered at the beginning of an instruction, that is an error.
The instruction’s address is also entered with the label.
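A minimal sketch of the label bookkeeping described on this slide. The names symbolEntry, symbolTable, enterSymbol, lookupSymbol, MAX_SYMBOLS, and MAX_LABEL_LEN are hypothetical, and the fixed-size array is an assumption made for brevity; the course's own symbol table module is not of fixed size and may look quite different.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64     /* hypothetical capacity for this sketch */
#define MAX_LABEL_LEN 32   /* hypothetical maximum label length, including '\0' */

/* One label together with the address of the instruction it marks. */
typedef struct {
    char label[MAX_LABEL_LEN];
    int address;
} symbolEntry;

typedef struct {
    symbolEntry entries[MAX_SYMBOLS];
    int count;
} symbolTable;

/* Returns the address stored for label, or -1 if the label is absent. */
int lookupSymbol(const symbolTable *table, const char *label) {
    for (int i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].label, label) == 0) {
            return table->entries[i].address;
        }
    }
    return -1;
}

/* Enters label with its address. Returns 1 on success, or 0 if the
   label is a duplicate (a scope error), is too long, or the table is full. */
int enterSymbol(symbolTable *table, const char *label, int address) {
    assert(table != NULL && label != NULL);
    if (lookupSymbol(table, label) != -1) return 0;
    if (table->count >= MAX_SYMBOLS) return 0;
    if (strlen(label) >= MAX_LABEL_LEN) return 0;
    strcpy(table->entries[table->count].label, label);
    table->entries[table->count].address = address;
    table->count++;
    return 1;
}
```

The first pass would call enterSymbol for each label it meets and report an error whenever the call returns 0.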
4
Complications
The default start address is x3000, but .ORIG can override this.
.BLKW and .STRINGZ might each specify an array of many cells, so we can’t just increment the address counter by one after each line of code.
Must treat both as special cases during the first pass.
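The address arithmetic behind these special cases can be sketched as follows. The lineKind enum and cellsOccupied function are hypothetical names, and the sketch assumes the scanner has already delivered the .BLKW count as an int and the .STRINGZ text without its quotes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a source line for address counting. */
typedef enum { LINE_ORDINARY, LINE_BLKW, LINE_STRINGZ } lineKind;

/* Returns how many memory cells a line occupies:
   .BLKW n reserves n cells, .STRINGZ reserves one cell per character
   plus one for the terminating zero, and anything else takes one cell. */
int cellsOccupied(lineKind kind, int blkwCount, const char *stringText) {
    switch (kind) {
    case LINE_BLKW:
        assert(blkwCount >= 0);
        return blkwCount;
    case LINE_STRINGZ:
        assert(stringText != NULL);
        return (int)strlen(stringText) + 1;
    default:
        return 1;
    }
}
```

For example, starting at x3000, an ordinary instruction followed by .BLKW 10 and .STRINGZ "Hi" leaves the address counter at x300E.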
5
The Token Set
The token set of a language is its vocabulary, or set of words and their types.
Includes keywords, opcode symbols, labels, register names, integer literals, commas, directive symbols, etc.
6
LC-3 Tokens in token.h
    int TC_ADD = 0;
    int TC_AND = 1;
    int TC_BR = 2;
    int TC_NOT = 3;
    etc.
Define a set of symbols to be returned by the scanner.
The assembler examines these to make choices during scope analysis in pass one and syntax analysis in pass two.
7
LC-3 Tokens in token.h
    typedef enum {
        TC_ADD, TC_AND, TC_BLKW, TC_BR, TC_BRZ, TC_BRN, TC_BRP, TC_BRNZ,
        TC_BRZP, TC_BRNZP, TC_BRNP, TC_COMMA, TC_END, TC_ERROR, TC_FILL,
        TC_GETC, TC_HALT, TC_IN, TC_INT, TC_JMP, TC_JSR, TC_JSRR, TC_LABEL,
        TC_LD, TC_LDI, TC_LDR, TC_LEA, TC_NEWLINE, TC_NOT, TC_ORIG, TC_OUT,
        TC_PUTS, TC_REG, TC_RET, TC_RTI, TC_ST, TC_STI, TC_STR,
        TC_STRING_LIT, TC_STRINGZ, TC_TRAP
    } tokenType;

    // A token has a type, a source string (in uppercase),
    // a binary string, and an int value.
    typedef struct token {
        tokenType type;
        char* source;
        char* binary;
        int intValue;
    } token;
Use a C enum type to introduce a set of symbolic values of a new type.
8
The Scanner Interface
    void initScanner(FILE* infile, FILE* outfile);

    // Must be run to get the first instruction, and after a token
    // is a newline.
    // Can be run after the first token is scanned to advance to
    // the next instruction on the assembler's first pass.
    // Returns 0 if the end of file has been reached.
    int nextInstruction();

    // Advances to the next token, if there is one, in the
    // instruction, and returns the token just scanned.
    // Returns the end of line token when the instruction has
    // been completely scanned.
    // Ignores program comments.
    token nextToken();
9
Setting Up the Scanner
    static token opcodeTable[MAX_OPCODES];
    static token registerTable[MAX_REGISTERS];
    static token directiveTable[MAX_DIRECTIVES];

    static char* codeLine;
    static int codeLineNumber;
    static int codeLineColumn;

    void initScanner(FILE* infile, FILE* outfile){
        initOpcodeTable();
        initRegisterTable();
        initDirectiveTable();
        codeLineNumber = 0;
        codeLineColumn = 0;
        initChario(infile, outfile);
    }
Maintains lookup tables of opcodes, registers, and directives.
10
The Interface for Token Tables
    // Initializes token, inserts it at index, and increments index.
    // Assumes that index < size of table array.
    void initToken(tokenType t, char* src, char* bin, int intValue,
                   int *index, token table[]);

    // Returns the index of the token associated with src or -1
    // if it is not there.
    int findToken(char* src, token table[], int length);

    // Prints the table in three columns
    void printTokenTable(token table[], int length);
Unlike the symbol tables, these tables are of fixed size.
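One way the two core functions of this interface might be implemented is sketched below. The signatures follow the slide, but the bodies are assumptions: a plain linear search is used here, and the tokenType enum is abbreviated to three codes so the block stands alone.

```c
#include <assert.h>
#include <string.h>

/* Abbreviated for this sketch; the full enum lives in token.h. */
typedef enum { TC_ADD, TC_AND, TC_REG } tokenType;

typedef struct token {
    tokenType type;
    char *source;
    char *binary;
    int intValue;
} token;

/* Fills the slot at *index and advances *index.
   The caller must ensure *index is within the table's bounds. */
void initToken(tokenType t, char *src, char *bin, int intValue,
               int *index, token table[]) {
    assert(index != NULL && *index >= 0);
    token *slot = &table[*index];
    slot->type = t;
    slot->source = src;
    slot->binary = bin;
    slot->intValue = intValue;
    (*index)++;
}

/* Linear search by source string (an assumed strategy).
   Returns the matching index, or -1 if src is not in the table. */
int findToken(char *src, token table[], int length) {
    for (int i = 0; i < length; i++) {
        if (strcmp(table[i].source, src) == 0) {
            return i;
        }
    }
    return -1;
}
```

Because the tables hold only a few dozen entries, a linear search is likely adequate; a sorted table with binary search would be an alternative design.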
11
Initializing a Token Table
    // Initializes the register table with the eight LC-3 registers.
    void initRegisterTable(){
        int index = 0;
        initToken(TC_REG, "R0", "000", 0, &index, registerTable);
        initToken(TC_REG, "R1", "001", 0, &index, registerTable);
        initToken(TC_REG, "R2", "010", 0, &index, registerTable);
        initToken(TC_REG, "R3", "011", 0, &index, registerTable);
        initToken(TC_REG, "R4", "100", 0, &index, registerTable);
        initToken(TC_REG, "R5", "101", 0, &index, registerTable);
        initToken(TC_REG, "R6", "110", 0, &index, registerTable);
        initToken(TC_REG, "R7", "111", 0, &index, registerTable);
    }
Other modules can access a register’s 3-bit address in its token object.
12
First Pass Strategy
Handle a possible .ORIG first.
Then get instructions, look at the first token, and if it’s a label, take some action.
Look ahead for a .BLKW and handle that case.
Repeat the last two steps until .END is reached or there are no more instructions.
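The strategy above can be sketched as a loop over already-scanned lines. Everything here is hypothetical scaffolding rather than the course's code: sourceLine, lineKind, labelTable, findLabel, and firstPass are invented names, the real first pass pulls tokens from nextInstruction() and nextToken() instead of an array, and .STRINGZ is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { LINE_ORIG, LINE_BLKW, LINE_END, LINE_OTHER } lineKind;

/* A pre-scanned source line (hypothetical stand-in for the token stream). */
typedef struct {
    const char *label;   /* NULL when the line has no label */
    lineKind kind;
    int operand;         /* origin for LINE_ORIG, cell count for LINE_BLKW */
} sourceLine;

#define MAX_LABELS 16    /* hypothetical capacity */

typedef struct {
    const char *names[MAX_LABELS];
    int addresses[MAX_LABELS];
    int count;
} labelTable;

/* Returns the index of name in the table, or -1 if absent. */
static int findLabel(const labelTable *t, const char *name) {
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) return i;
    }
    return -1;
}

/* Runs the first pass and returns the number of label errors
   (duplicates, or labels that did not fit in the table). */
int firstPass(const sourceLine lines[], int lineCount, labelTable *table) {
    assert(lines != NULL && table != NULL);
    int address = 0x3000;          /* default start address */
    int errors = 0;
    int i = 0;
    table->count = 0;

    /* Step 1: a leading .ORIG overrides the default start address. */
    if (lineCount > 0 && lines[0].kind == LINE_ORIG) {
        address = lines[0].operand;
        i = 1;
    }

    /* Steps 2-4: enter labels and advance the address until .END. */
    for (; i < lineCount && lines[i].kind != LINE_END; i++) {
        const sourceLine *line = &lines[i];
        if (line->label != NULL) {
            if (findLabel(table, line->label) != -1 ||
                table->count >= MAX_LABELS) {
                errors++;
            } else {
                table->names[table->count] = line->label;
                table->addresses[table->count] = address;
                table->count++;
            }
        }
        /* .BLKW reserves several cells; every other line takes one. */
        address += (line->kind == LINE_BLKW) ? line->operand : 1;
    }
    return errors;
}
```

With a program that starts at x4000 and places a label on a .BLKW 5 line at x4001, the next label lands at x4006, and a repeated label is counted as one error.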
13
Trying Out the First Pass
When you have completed and thoroughly tested your symbol table module, copy it to the firstpass directory and run make.
Then, run ./testfirstpass with your favorite assembly language program.
Be sure to try it with a program that has scope errors as well!
14
For Monday
Grammar and Syntax Analysis