Assembler Design Options CSCI/CMPE 3334 David Egle
Design Options
- We have studied the two-pass assembler, but it is not the only design possible
- One-pass assemblers
  - useful when it is necessary or desirable to avoid a second pass over the source
- Multi-pass assemblers
  - an extension of the two-pass assembler that allows it to handle forward references in symbolic expressions
One Pass Assembler
- Does everything in one pass
- Problem: how do we handle forward references?
- Could eliminate forward references
  - easy for data: just define the data areas before they are referenced
  - not easy in code: how do we handle selection or loop statements, which have forward jumps?
- Two types of one-pass assemblers
  - produce object code directly in memory
  - produce an object file for later linking/loading/execution
Load and go assembler
- Generates object code in memory for immediate execution
- No loader is needed
- More efficient
  - avoids the overhead of writing the object program out, then reading it and loading it back in
  - avoids the overhead of an additional pass over the source
- Also, storage devices for the intermediate file might be unavailable, slow, or inconvenient
Operation of load and go
- Assembler generates object code instructions as it scans the source program
- If an operand symbol has not yet been defined
  - operand address is set to 0 in the instruction
  - symbol is entered into the symbol table (unless it is already present)
  - entry is flagged to indicate the symbol is undefined
  - address of the instruction is added to the list of forward references associated with this symbol
- When the symbol definition is encountered
  - forward reference list is scanned, and the proper address is inserted in any instructions previously generated (in memory)
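The bookkeeping described on this slide can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the names Symbol, reference and define are hypothetical, and memory is modeled as a dictionary mapping an instruction address to a two-element list [mnemonic, operand address] rather than as real bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Symbol:
    value: Optional[int] = None                      # None flags the symbol as undefined
    refs: List[int] = field(default_factory=list)    # addresses of instructions awaiting the value


def reference(symtab: Dict[str, Symbol], name: str, instr_addr: int) -> int:
    """Return the operand address to assemble; record a forward reference if needed."""
    sym = symtab.setdefault(name, Symbol())          # enter the symbol unless already present
    if sym.value is None:
        sym.refs.append(instr_addr)                  # remember which instruction to fix later
        return 0                                     # placeholder operand address
    return sym.value


def define(symtab: Dict[str, Symbol], memory: dict, name: str, value: int) -> None:
    """Define a symbol and patch every instruction that referenced it earlier."""
    sym = symtab.setdefault(name, Symbol())
    if sym.value is not None:
        raise ValueError(f"duplicate label {name}")
    sym.value = value
    for addr in sym.refs:                            # scan the forward reference list
        memory[addr][1] = value                      # insert the proper address in memory
    sym.refs.clear()


# Usage: a jump to EXIT is assembled before EXIT is defined.
symtab: Dict[str, Symbol] = {}
memory: dict = {}
memory[0x1000] = ["JEQ", reference(symtab, "EXIT", 0x1000)]   # operand is 0 for now
define(symtab, memory, "EXIT", 0x1009)                        # patches memory[0x1000]
```

Keeping the reference list inside the symbol table entry means the fix-up needs no second look at the source, which is the point of the load-and-go design.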
One pass assembler - Algorithm
read first input line
if OPCODE = ‘START’ then {
    save #{OPERAND} as starting address
    initialize LOCCTR to starting address
    read next input line
}
else
    set starting address and LOCCTR to 0
while OPCODE != ‘END’ do {
    if this is not a comment line then
        handle line
    read next input line
}
save (LOCCTR – starting address) as program length
put value of OPERAND on END directive in PC
Algorithm - 2
if there is a symbol in the LABEL field then
    search SYMTAB for LABEL
    if found then
        if undefined then
            mark defined with value LOCCTR; go through reference list and fix up memory locations
        else
            set error flag (duplicate label)
    else
        insert (LABEL, LOCCTR) into SYMTAB
verify mnemonic and get value
if operand is present then
    search SYMTAB for OPERAND
    if found then
        if defined then
            get the address
        else
            set address to 0 and add current value of LOCCTR to reference list
    else
        insert OPERAND in SYMTAB, marked undefined; set address to 0 and add LOCCTR to reference list
build machine code
handle directives as appropriate
insert code in memory
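The two algorithm slides can be combined into one small runnable sketch. Several things here are assumptions made for illustration only: the source is already split into (label, opcode, operand) tuples with comment lines removed, every instruction occupies 3 bytes as on SIC, OPTAB holds only a handful of SIC opcodes, the only directives handled are START, END, WORD and RESW, and memory is a dictionary rather than real storage. The function name assemble is hypothetical.

```python
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "COMP": 0x28, "JEQ": 0x30, "J": 0x3C}


def assemble(lines):
    """One-pass, load-and-go style assembly of (label, opcode, operand) tuples."""
    symtab = {}    # name -> {"value": int or None, "refs": [instruction addresses]}
    memory = {}    # address -> [opcode, operand address], or an int for WORD
    errors = []
    it = iter(lines)
    label, opcode, operand = next(it)
    if opcode == "START":
        start = locctr = int(operand, 16)       # START operand is hexadecimal
        label, opcode, operand = next(it)
    else:
        start = locctr = 0
    while opcode != "END":                      # assumes the program ends with END
        if label:
            entry = symtab.get(label)
            if entry is None:
                symtab[label] = {"value": locctr, "refs": []}
            elif entry["value"] is None:        # referenced earlier: define and fix up
                entry["value"] = locctr
                for addr in entry["refs"]:
                    memory[addr][1] = locctr
                entry["refs"] = []
            else:
                errors.append(f"duplicate label {label}")
        if opcode in OPTAB:
            address = 0
            if operand:
                entry = symtab.setdefault(operand, {"value": None, "refs": []})
                if entry["value"] is None:
                    entry["refs"].append(locctr)    # forward reference, operand stays 0
                else:
                    address = entry["value"]
            memory[locctr] = [OPTAB[opcode], address]
            locctr += 3
        elif opcode == "WORD":
            memory[locctr] = int(operand)
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        else:
            errors.append(f"invalid opcode {opcode}")
        label, opcode, operand = next(it)
    errors.extend(f"undefined symbol {n}" for n, e in symtab.items() if e["value"] is None)
    entry = symtab.get(operand)                 # operand of END names the entry point
    entry_point = entry["value"] if entry and entry["value"] is not None else start
    return memory, locctr - start, entry_point, errors


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "LDA", "ALPHA"),     # forward reference to ALPHA
    (None, "COMP", "ZERO"),        # forward reference to ZERO
    (None, "JEQ", "DONE"),         # forward jump
    (None, "ADD", "ALPHA"),
    ("DONE", "STA", "BETA"),
    ("ALPHA", "WORD", "5"),
    ("ZERO", "WORD", "0"),
    ("BETA", "RESW", "1"),
    (None, "END", "FIRST"),
]
memory, length, entry_point, errors = assemble(program)
```

Any symbol still undefined when END is reached is reported as an error, since no later definition can arrive to patch it.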
One pass assembler that produces object programs
- Use the same procedure
- When the definition of a symbol is encountered
  - if the instruction that made the forward reference is still in memory, fix it there
  - if not, the instruction has already been written out in a Text record, so generate a new Text record with the correct operand address
  - the loader will fix up the address field when it loads the new Text record
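A minimal sketch of the two cases follows. It assumes SIC-style Text records (T, a 6-digit hex start address, a 2-digit hex byte count, then the object code) and 3-byte instructions whose last two bytes hold the operand address. The class ObjectWriter and its methods are hypothetical names, and writing one record per instruction is a simplification; a real assembler packs many instructions into each record.

```python
def text_record(address: int, data: bytes) -> str:
    """Format a SIC-style Text record: T, start address, length in bytes, object code."""
    return f"T{address:06X}{len(data):02X}{data.hex().upper()}"


class ObjectWriter:
    def __init__(self):
        self.records = []    # Text records already written out
        self.buffer = {}     # instruction address -> bytearray still held in memory

    def emit(self, address: int, opcode: int, operand_addr: int) -> None:
        """Hold a 3-byte instruction in memory until the next flush."""
        self.buffer[address] = bytearray([opcode]) + operand_addr.to_bytes(2, "big")

    def flush(self) -> None:
        """Write out everything in the buffer (one record per instruction here)."""
        for address in sorted(self.buffer):
            self.records.append(text_record(address, bytes(self.buffer[address])))
        self.buffer.clear()

    def resolve(self, instr_address: int, value: int) -> None:
        """Supply the operand address for an instruction that made a forward reference."""
        patch = value.to_bytes(2, "big")
        if instr_address in self.buffer:
            # Still in memory: fix the instruction directly.
            self.buffer[instr_address][1:3] = patch
        else:
            # Already written out: emit a new Text record covering only the
            # address field, so the loader overwrites it at load time.
            self.records.append(text_record(instr_address + 1, patch))


writer = ObjectWriter()
writer.emit(0x2012, 0x30, 0)       # JEQ with a forward reference
writer.flush()                     # the instruction leaves memory
writer.emit(0x2015, 0x3C, 0)       # J with a forward reference, still buffered
writer.resolve(0x2012, 0x2024)     # produces a new 2-byte Text record
writer.resolve(0x2015, 0x2024)     # patched in the buffer
writer.flush()
```

The patch record works because the loader processes Text records in order, so the later record replaces the zero address loaded by the earlier one.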
Multi-pass assemblers
- Required if we remove the restriction that all symbols on the right-hand side of an EQU or ORG directive must already be defined
- Note that removing this restriction is not really a convenience for the programmer, as it leads to unreadable code!
- But if it is removed, the multiple passes are used to gradually evaluate the expressions
- The number of passes depends on the depth of forward referencing
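The idea of gradually evaluating expressions can be illustrated with a small sketch. It is a simplification under stated assumptions: each pending definition is a single binary expression (left, operator, right), and each pass is modeled as a sweep over a table of pending definitions in source order rather than a re-read of the source. The function name resolve_symbols and the sample symbols and addresses are chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.floordiv}


def resolve_symbols(known, pending):
    """Evaluate EQU-style definitions that may refer to symbols defined later.

    known:   name -> value for symbols that already have addresses
    pending: name -> (left, op, right); operands are ints or symbol names
    Returns the full symbol table and the number of passes that were needed.
    """
    values = dict(known)
    pending = dict(pending)
    passes = 0
    while pending:
        passes += 1
        progress = False
        for name, (left, op, right) in list(pending.items()):
            operands = [values.get(x) if isinstance(x, str) else x for x in (left, right)]
            if None not in operands:            # every operand is now known
                values[name] = OPS[op](*operands)
                del pending[name]
                progress = True
        if not progress:                        # circular or truly undefined symbols
            raise ValueError(f"unresolvable symbols: {sorted(pending)}")
    return values, passes


# HALFSZ refers forward to MAXLEN, which is defined on a later line.
values, passes = resolve_symbols(
    {"BUFFER": 0x1034, "BUFEND": 0x2034},
    {
        "HALFSZ": ("MAXLEN", "/", 2),
        "MAXLEN": ("BUFEND", "-", "BUFFER"),
        "PREVBT": ("BUFFER", "-", 1),
    },
)
```

Here HALFSZ cannot be evaluated on the first sweep because MAXLEN is not yet known, so a second sweep is needed; a deeper chain of forward references would need more sweeps.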
Implementation examples
- MASM assembler
- SPARC assembler
- AIX assembler
Review problems – Chapter 2
- Section 2.1: 2, 7
- Section 2.2: 1, 2, 3
- Section 2.3: 4, 7, 8, 11, 16, 17, 20, 22
- Section 2.4: 6, 8, 9