Computer Science 210 Computer Organization

Building an Assembler Part IV: The Second Pass, Syntax Analysis, and Code Generation

The Second Pass
[Diagram of the assembler's components. Labels: Tools, Opcode table, Text file, Line stream, CharacterIO, Scanner, Token stream, First Pass, Symbol table, Second Pass, Source program listing and error messages (file and/or terminal), Sym file, Bin file]

What the Second Pass Does
Scans through the lines of code and performs syntax analysis
Translates each line of code to a 16-bit binary instruction (or to data values when .FILL, .STRINGZ, or .BLKW appears)

Implementation: The Data
    #define FILL_ZERO "0000000000000000"   // One 16-bit word of zeros
    #define SIX_ZEROS "000000"

    static int spAddress;           // The address counter
    static int spNotDone;           // More instructions?
    static token spToken;           // The current token
    static FILE* binfile;           // The output file
    static char outputBuffer[17];   // The current binary instruction

The Top-Level Function
    // Initializes the data, gets the first instruction, gets the
    // first token, and calls program()
    void secondPass(FILE* infile, FILE* outfile, FILE* bfile){
        binfile = bfile;
        outputBuffer[16] = 0;
        initScanner(infile, outfile);
        spAddress = DEFAULT_START_ADDRESS;
        spNotDone = nextInstruction();
        spToken = nextToken();
        program();
    }

Implementation: Second Pass Tools
Define some utility functions to:
Output a line of binary code
Process a label reference
Finish an instruction (scans to end of line, increments the address counter, gets the next token)
Check a token’s type and output an error message if it’s unexpected
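The slides do not show the token-checking utility, accept. A minimal sketch of what it might look like follows, assuming it only inspects the current token and leaves advancing to the caller. The tokenType values, the one-field token struct, errorCount, and this version of putError are stand-ins invented for the example, not the course's actual definitions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the scanner's types. */
typedef enum { TC_REG, TC_COMMA, TC_INT, TC_NEWLINE, TC_END } tokenType;
typedef struct { tokenType type; } token;

static token spToken;        /* The current token */
static int errorCount = 0;   /* Hypothetical: errors reported so far */

/* Hypothetical error reporter: prints the message and counts it. */
static void putError(const char* message){
    assert(message != NULL);
    fprintf(stderr, "Error: %s\n", message);
    errorCount++;
}

/* Checks the current token's type without consuming the token.
   Returns 1 if the type matches, otherwise reports the error
   and returns 0. */
static int accept(tokenType expected, const char* errorMessage){
    if (spToken.type == expected)
        return 1;
    putError(errorMessage);
    return 0;
}
```

Returning a flag is a design choice of this sketch; it lets a caller skip code generation for a malformed operand instead of appending bits from the wrong kind of token.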

Finishing an Instruction
    // The purported end of an instruction has been reached, so
    // check for the newline, get the next instruction, get
    // its first token, and increment the address counter.
    void finishInstruction(){
        spToken = nextToken();
        accept(TC_NEWLINE, "Too many tokens in instruction.");
        spNotDone = nextInstruction();
        if (spNotDone){
            spAddress++;
            spToken = nextToken();   // First token of the next instruction
        }
    }

The Parsing Functions
Each syntax rule in the EBNF grammar translates to a parsing function
Each function assumes that the current token is its start symbol
Each function calls finishInstruction as its last step
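To illustrate how a grammar rule with alternatives maps to a function, here is a rough sketch of a dispatcher in the style of instruction(), which the slides do not show. It assumes the scanner tags each leading token with a distinct type; the token types, the ruleId enum, and the stub parsing functions are all hypothetical, and the optional label is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical token categories for three leading tokens. */
typedef enum { TC_NOT, TC_FILL, TC_BLKW, TC_NEWLINE } tokenType;
typedef struct { tokenType type; } token;

/* Hypothetical: records which rule ran, so the sketch can be checked. */
typedef enum { RULE_NONE, RULE_NOT, RULE_FILL, RULE_BLKW, RULE_ERROR } ruleId;

static token spToken;               /* The current token */
static ruleId lastRule = RULE_NONE;

/* Stubs standing in for the real parsing functions. */
static void not_ins(void)  { lastRule = RULE_NOT; }
static void fill_ins(void) { lastRule = RULE_FILL; }
static void blkw_ins(void) { lastRule = RULE_BLKW; }

static void putError(const char* message){
    assert(message != NULL);
    fprintf(stderr, "Error: %s\n", message);
    lastRule = RULE_ERROR;
}

/* instruction = not-ins | fill-ins | blkw-ins
   One alternative per case; the current token selects the rule. */
static void instruction(void){
    switch (spToken.type){
        case TC_NOT:  not_ins();  break;
        case TC_FILL: fill_ins(); break;
        case TC_BLKW: blkw_ins(); break;
        default:      putError("Instruction expected.");
    }
}
```

Because every parsing function assumes the current token is its start symbol, the dispatcher does not consume the token before calling the selected function.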

The Prototypes
    // Parsing function prototypes
    void program();
    void instruction();
    void orig_ins();
    void add_or_and_ins();
    void blkw_ins();
    void br_ins();
    void fill_ins();
    void jmp_ins();
    void jsr_ins();
    void jsrr_ins();
    void ld_ldi_st_sti_ins();
    void ldr_or_str_ins();
    void lea_ins();
    void not_ins();
    void ret_or_rti_ins();
    void stringz_ins();
    void trap_ins();
Instructions with the same format differ only in the leading token

Parsing with the Top-Level Rule
    // program = [ orig-directive ] { [ label ] instruction } ".END"
    void program(){
        orig_ins();
        while (spNotDone && spToken.type != TC_END)
            instruction();
        accept(TC_END, ".END expected.");
    }
We stop when .END is reached or there are no more instructions
accept checks the current token’s type for possible error

Parsing the not Instruction
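    // not-ins = "NOT" register "," register
    void not_ins(){
        strcpy(outputBuffer, spToken.binary);    // Opcode
        spToken = nextToken();
        accept(TC_REG, "Register expected.");
        strcat(outputBuffer, spToken.binary);    // Destination register
        spToken = nextToken();
        accept(TC_COMMA, "Comma expected.");
        spToken = nextToken();
        accept(TC_REG, "Register expected.");
        strcat(outputBuffer, spToken.binary);    // Source register
        strcat(outputBuffer, "111111");          // NOT ends in six 1 bits
        outputBinary();
        finishInstruction();
    }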

Parsing the .FILL Directive
    // fill-ins = ".FILL" integer-literal
    void fill_ins(){
        spToken = nextToken();
        accept(TC_INT, "Integer literal expected.");
        strcpy(outputBuffer, signedBinary(spToken.intValue, 16));
        outputBinary();
        finishInstruction();
    }
Should add a check on the bounds of an integer fill value!
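One possible shape for the bounds check is sketched below, together with a stand-in for signedBinary, whose real implementation is not shown in the slides. The name fitsSigned is hypothetical, and the sketch assumes only values in the signed two's complement range are accepted; an assembler that also allows unsigned fill values up to xFFFF would need a wider test.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if value fits in a two's
   complement field of numBits bits, 0 otherwise. */
static int fitsSigned(int value, int numBits){
    assert(numBits >= 1 && numBits <= 16);
    int min = -(1 << (numBits - 1));
    int max = (1 << (numBits - 1)) - 1;
    return value >= min && value <= max;
}

/* Stand-in for signedBinary: returns the low numBits bits of
   value as a string of '0' and '1' characters. The result lives
   in a static buffer that the next call overwrites. */
static const char* signedBinary(int value, int numBits){
    static char bits[17];
    assert(numBits >= 1 && numBits <= 16);
    unsigned int pattern = (unsigned int)value;   /* Two's complement bit pattern */
    for (int i = 0; i < numBits; i++)
        bits[i] = ((pattern >> (numBits - 1 - i)) & 1u) ? '1' : '0';
    bits[numBits] = '\0';
    return bits;
}
```

In fill_ins, a call such as fitsSigned(spToken.intValue, 16) could guard the strcpy, with putError reporting a literal that is out of range.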

Parsing the .BLKW Directive
    // blkw-ins = ".BLKW" integer-literal
    void blkw_ins(){
        strcpy(outputBuffer, FILL_ZERO);
        spToken = nextToken();
        accept(TC_INT, "Integer literal expected.");
        int i;
        for (i = 1; i <= spToken.intValue; i++)
            outputBinary();
        spAddress += spToken.intValue - 1;
        finishInstruction();
    }
Should add a check on the memory available for the given number of words!
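A memory check could be as simple as the sketch below. It assumes the block must lie entirely within a 16-bit address space ending at xFFFF; MAX_ADDRESS and blockFits are hypothetical names, and a real assembler might choose a lower limit to keep blocks out of reserved regions.

```c
#include <assert.h>

#define MAX_ADDRESS 0xFFFF   /* Assumed highest usable address */

/* Hypothetical helper: returns 1 if numWords words starting at
   address all lie at or below MAX_ADDRESS, 0 otherwise.
   A count below 1 is rejected as well. */
static int blockFits(int address, int numWords){
    assert(address >= 0 && address <= MAX_ADDRESS);
    if (numWords < 1)
        return 0;
    return numWords <= MAX_ADDRESS - address + 1;
}
```

blkw_ins could call blockFits(spAddress, spToken.intValue) before its output loop and report an error with putError when the result is 0.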

Parsing the .STRINGZ Directive
    // stringz-ins = ".STRINGZ" string-literal
    void stringz_ins(){
        spToken = nextToken();
        accept(TC_STRING_LIT, "String literal expected.");
        char* lit = spToken.source;
        int i;
        for (i = 0; i < spToken.intValue; i++){
            char ch = lit[i];
            strcpy(outputBuffer, unsignedBinary(ch, 16));
            outputBinary();
        }
        strcpy(outputBuffer, FILL_ZERO);
        outputBinary();                      // The terminating zero word
        spAddress += spToken.intValue - 2;
        finishInstruction();
    }
Should add a check on the memory available for the characters!

Parsing the LD Instruction
    // ld-ins = "LD" register "," label
    void ld_ldi_lea_st_sti_ins(){
        strcpy(outputBuffer, spToken.binary);
        spToken = nextToken();
        accept(TC_REG, "Register expected.");
        strcat(outputBuffer, spToken.binary);
        spToken = nextToken();
        accept(TC_COMMA, "Comma expected.");
        spToken = nextToken();
        processLabel(9);
        outputBinary();
        finishInstruction();
    }
LD, LDI, LEA, ST, and STI all have the same format

Processing a Reference to a Label
    // Checks that the current token is a declared label, computes
    // its PC-relative offset, and appends the offset to the output
    // buffer as a signed bit string of numBits bits
    void processLabel(int numBits){
        accept(TC_LABEL, "Label expected.");
        if (spToken.type == TC_LABEL){
            int labelAddress = findSymbol(spToken.source);
            if (labelAddress == -1)
                putError("Undeclared label.");
            else{
                int offset = labelAddress - (spAddress + 1);
                strcat(outputBuffer, signedBinary(offset, numBits));
            }
        }
    }
Make sure there is a label, make sure it’s declared, and use its address and the incremented PC (spAddress + 1) to compute an offset numBits bits long
Should add a check on the limits of the offset!
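As a worked example of the offset arithmetic: an LD at x3000 that refers to a label at x3005 gets the offset x3005 - (x3000 + 1) = 4. With a 9-bit field the offset must lie between -256 and 255. The sketch below shows one way the limit check might be added; pcOffset is a hypothetical helper, not a function from the slides.

```c
#include <assert.h>

/* Hypothetical helper: computes the PC-relative offset from the
   instruction at instructionAddress to labelAddress. If the offset
   fits in a two's complement field of numBits bits, stores it in
   *offset and returns 1; otherwise returns 0 and leaves *offset
   unchanged. */
static int pcOffset(int labelAddress, int instructionAddress,
                    int numBits, int* offset){
    assert(numBits >= 1 && numBits <= 16);
    assert(offset != 0);
    int value = labelAddress - (instructionAddress + 1);
    int min = -(1 << (numBits - 1));
    int max = (1 << (numBits - 1)) - 1;
    if (value < min || value > max)
        return 0;
    *offset = value;
    return 1;
}
```

processLabel could call this in place of its subtraction and report an error with putError when the label is too far away to reach.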

For Friday
Review and Wrap-up
