
1 Assembly Process

2 Machine Code Generation
Assembling a program entails translating the assembly language into binary machine code. This requires more than simply mapping assembly instructions to machine instructions:
- Each instruction is bound to an address.
- Labels are bound to addresses.
- Assembly instructions that refer to labels generate machine instructions that contain the label's address.
- Pseudo-instructions are translated into one or more machine instructions.

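3 Instruction Format
I-type example: addi $13,$7,50
    opcode   rs       rt       immediate operand
    6 bits   5 bits   5 bits   16 bits
    001000   00111    01101    0000 0000 0011 0010
R-type example: add $13,$7,$8
    opcode   rs       rt       rd       shamt    extended opcode
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
    000000   00111    01000    01101    00000    100000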
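The two encodings on this slide can be reproduced with a few shifts and masks. The following is a minimal sketch, not part of the original slides; the helper names `encode_i_type` and `encode_r_type` are hypothetical, and the opcode and function-code constants are the standard MIPS values for addi and add.

```python
# Hypothetical helpers that pack MIPS fields into a 32-bit word.

ADDI_OPCODE = 0b001000  # standard MIPS opcode for addi
ADD_FUNCT = 0b100000    # standard MIPS function code ("extended opcode") for add


def encode_i_type(opcode, rs, rt, imm):
    """I-type layout: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r_type(rs, rt, rd, shamt, funct):
    """R-type layout: opcode 0, then rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# addi $13,$7,50: the destination is rt, the source is rs
addi_word = encode_i_type(ADDI_OPCODE, rs=7, rt=13, imm=50)
# add $13,$7,$8: the destination is rd, the sources are rs and rt
add_word = encode_r_type(rs=7, rt=8, rd=13, shamt=0, funct=ADD_FUNCT)

print(f"{addi_word:032b}  {addi_word:#010x}")
print(f"{add_word:032b}  {add_word:#010x}")
```

Note that the register order in the assembly text differs from the field order in the word: the destination register comes first in the source but is stored in rt (I-type) or rd (R-type).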

4 The symbol table
The assembler scans the source code and generates the appropriate bit string for each line encountered. The assembler must remember
- what memory locations have been allocated
- to which address each label is bound
A symbol table is a list of (label, address) pairs.
When the data and text segments have been generated, they are stored as an executable file. The file is used by a program called the loader to initialize memory to the appropriate state before execution.

5 Instructions
The .text directive tells the assembler that the lines which follow are instructions.
- By default, the text segment starts at 0x00400000.
In some cases, a symbol may not have an assigned address yet when the assembler scans a line that refers to it.
- A second pass through the code can update instructions containing unresolved labels.
- Maintain a list of the addresses at which each unresolved label appears.
When the label is added to the symbol table, all locations in the corresponding list are updated to hold the address associated with the label.
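The list-of-unresolved-labels idea described on this slide can be sketched as follows. This is an illustrative simplification, not the slides' own code: the function name `resolve_labels` and the input format (one `(label, referenced_label)` pair per instruction, each instruction occupying one word) are assumptions made for the example.

```python
TEXT_BASE = 0x00400000  # default start of the text segment, as stated on the slide


def resolve_labels(lines, base=TEXT_BASE):
    """Bind labels to addresses and patch forward references.

    lines: list of (label, referenced_label) pairs; either element may be None.
    Returns (symbols, operands), where operands maps the address of each
    instruction that refers to a label to that label's address.
    """
    symbols = {}   # label -> address
    pending = {}   # unresolved label -> addresses of instructions that use it
    operands = {}  # instruction address -> resolved label address
    address = base
    for label, referenced in lines:
        if label is not None:
            symbols[label] = address
            # Patch every earlier instruction that was waiting for this label.
            for waiting in pending.pop(label, []):
                operands[waiting] = address
        if referenced is not None:
            if referenced in symbols:
                operands[address] = symbols[referenced]
            else:
                pending.setdefault(referenced, []).append(address)
        address += 4  # every instruction occupies one word
    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return symbols, operands


symbols, operands = resolve_labels([
    ("start", "done"),  # forward reference, patched when "done" is defined
    (None, "start"),    # backward reference, resolved immediately
    ("done", None),
])
```

Because every pending location is patched as soon as its label is defined, this variant needs only one scan of the source; a label that is never defined is reported at the end.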

6 Branch offset in the MIPS R2000
In machine code, the target address in a branch must be specified as an offset from the address of the branch. During execution, this offset is added to the program counter to fetch the next instruction.
- The PC contains the address to which the offset is added.
- The offset is measured in words, not bytes.
    PC_NEW = offset*4 + PC_OLD
To calculate the offset, the assembler uses the formula:
    offset = (target instruction address - branch instruction address)/4

7 Branch offset calculation
The offset is stored in the instruction as a word offset rather than a byte offset.
- Instructions are only stored at word boundaries.
- For both the target and the branch instruction, the two least-significant bits of the address are zero.
An offset may be negative
- if the target instruction precedes the branch instruction.
The offset is stored in the 16-bit immediate field.
- This means the branch can only jump about 2^15 instructions before or after the current address.
- 2^15 instructions (words) = 2^17 bytes
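The offset formula and the 16-bit range limit can be combined in one small function. This is a sketch under the slides' convention, in which the offset is measured from the branch instruction itself and the PC increment is ignored; on real MIPS hardware the offset is added to the address of the instruction following the branch, which would shift every result by one word. The name `branch_offset_field` is hypothetical.

```python
def branch_offset_field(branch_addr, target_addr):
    """Return the 16-bit immediate field for a branch (slides' convention)."""
    if branch_addr % 4 or target_addr % 4:
        raise ValueError("instructions must sit on word boundaries")
    # Word offset; exact division because both addresses are multiples of 4.
    offset = (target_addr - branch_addr) // 4
    # A 16-bit two's-complement field holds -2^15 .. 2^15 - 1.
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("branch target out of range")
    return offset & 0xFFFF  # two's-complement encoding of a negative offset


backward = branch_offset_field(0x00400068, 0x00400000)  # 26 words back
forward = branch_offset_field(0x00400000, 0x00400010)   # 4 words ahead
print(f"{backward:#06x} {forward:#06x}")
```

A target that is too far away raises an error here; an actual assembler would typically report it or fall back to a jump instruction.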

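8 Branch offset calculation
An entry in the SPIM instruction list:
    [0x00400068] 0x1440ffe6 bne $2, $0, -104 [__start-0x00400068]; 44: bnez $v0, __start
The fields of the entry, from left to right:
- [0x00400068]: instruction address
- 0x1440ffe6: machine code
- bne $2, $0, -104: the instruction, with the offset in bytes
- [__start-0x00400068]: offset calculation, in bytes (ignores the PC increment)
- 44: line number in source file
- bnez $v0, __start: original assembly code
With __start = 0x00400000, the byte offset is 0x00400000 - 0x00400068 = -104. The stored offset is ffe6 = -26 = -104/4.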
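The machine word in this SPIM entry can be taken apart to confirm the numbers on the slide. The sketch below is illustrative only; `decode_branch` is a hypothetical helper that splits an I-type word into its fields and sign-extends the immediate.

```python
def decode_branch(word):
    """Split an I-type word into (opcode, rs, rt, signed 16-bit immediate)."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:       # sign bit set: interpret as a negative number
        imm -= 0x10000
    return opcode, rs, rt, imm


opcode, rs, rt, word_offset = decode_branch(0x1440FFE6)
byte_offset = word_offset * 4
# The slide's calculation ignores the PC increment, so the offset is
# applied to the address of the branch itself.
target = 0x00400068 + byte_offset

print(opcode, rs, rt, word_offset, byte_offset, hex(target))
```

The decoded fields are opcode 5 (bne), registers $2 and $0, and a word offset of -26, which is -104 bytes and leads back to __start at 0x00400000.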

9 Jump target calculation
The jump instruction has two forms:
- Pseudo-direct, for j and jal
- Register direct, for jr and jalr
jr and jalr specify a register containing the address to be loaded into the PC.
j and jal specify most of the address of the target within the instruction.
- However, they have a range of at most one-sixteenth of the memory space.

10 Jump target calculation
The target address is a 32-bit quantity.
- Since all word addresses are multiples of 4, there is no need to store the last two bits.
- The jump instruction format has 26 bits for the target address.
The remaining 6 bits of the instruction are used for the opcode.
- The highest-order 4 bits of the target are taken from the address currently stored in the program counter.
[Figure: the 32-bit target is formed from the upper 4 bits of the PC, the 26 jump target bits from the instruction, and 00 as the two low-order bits.]
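The pseudo-direct scheme described on this slide amounts to a pair of bit operations. The following sketch is an illustration rather than the slides' own material; the function names are hypothetical, and the opcode value 2 is the standard MIPS opcode for j.

```python
J_OPCODE = 0b000010  # standard MIPS opcode for j


def jump_field(target_addr):
    """Drop the two low zero bits and keep the next 26 bits of the target."""
    return (target_addr >> 2) & 0x03FFFFFF


def encode_j(target_addr):
    """Build the full 32-bit j instruction: opcode(6) followed by target(26)."""
    return (J_OPCODE << 26) | jump_field(target_addr)


def jump_target(pc, field):
    """Rebuild the 32-bit target: upper 4 bits of the PC, 26-bit field, then 00."""
    return (pc & 0xF0000000) | ((field & 0x03FFFFFF) << 2)


word = encode_j(0x00400008)
rebuilt = jump_target(0x00400014, word & 0x03FFFFFF)
print(f"{word:#010x} -> {rebuilt:#010x}")
```

The round trip only recovers the original target when the target and the PC share the same upper 4 bits, which is exactly the one-sixteenth range limit.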

11 Jump Target Calculation
Jump instructions have a range of 2^26 words, or 2^26 x 2^2 = 2^28 bytes.
- This range is NOT symmetric about the jump instruction.
[Figure: for a jump at 0x80000080, the reachable range extends 0x00000080 bytes below the instruction and 0x0fffff7c bytes above it.]
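The asymmetry follows directly from the fixed upper 4 bits: the reachable region is the aligned 2^28-byte block that contains the PC, wherever the jump happens to sit inside it. A minimal sketch, assuming the PC value used is the address 0x80000080 from the slide's figure and using the hypothetical name `jump_range`:

```python
def jump_range(pc):
    """Return the lowest and highest word addresses a j/jal at this PC can reach."""
    low = pc & 0xF0000000    # 26-bit field all zeros
    high = low | 0x0FFFFFFC  # 26-bit field all ones, low two bits 00
    return low, high


pc = 0x80000080
low, high = jump_range(pc)
reach_back = pc - low
reach_forward = high - pc
print(f"{low:#010x}..{high:#010x}  back {reach_back:#x}, forward {reach_forward:#x}")
```

For this address the jump can go back only 0x80 bytes but forward 0x0fffff7c bytes, matching the figure.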

12 Program relocation
Program modules may be developed separately by individual programmers. When these modules are loaded into memory, they must not be assigned overlapping memory space. Thus, the modules have to be relocated.
- Relative addresses are relocatable.
- Any absolute references must be "fixed" by the loader.
Use a logical base address known at load time. Absolute addresses are stored as offsets from this TBD base.
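The loader's "fixing" step can be sketched as adding the load-time base to every word that holds an absolute reference. This is a deliberately simplified illustration: the function name `relocate` and the fix-up list format are assumptions, and real object-file formats record relocation entries per instruction field rather than per whole word.

```python
def relocate(words, fixups, base):
    """Return a copy of a module's words with absolute references fixed.

    words:  32-bit words in which absolute addresses are stored as
            offsets from a base that is not known until load time
    fixups: indices of the words that hold such offsets
    base:   the base address chosen by the loader
    """
    loaded = list(words)
    for index in fixups:
        loaded[index] = (loaded[index] + base) & 0xFFFFFFFF
    return loaded


module = [0x00000010, 0x00001234, 0x00000000]
# Words 0 and 2 hold absolute references; word 1 is ordinary data.
loaded = relocate(module, fixups=[0, 2], base=0x10000000)
print([f"{w:#010x}" for w in loaded])
```

Branch offsets never appear in the fix-up list, because a relative offset stays valid wherever the module is placed.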

13 From source to executable
[Figure: high-level source code -> compiler -> asm -> assembler -> obj -> linker (together with lib) -> exe -> loader -> memory]

14 Some examples of assembling code
            .data
    a1:     .word 3
    a2:     .word 16, 16, 16, 16
    a3:     .word 5
            .text
    __start:
            la $6, a2
    loop:
            lw $7, 4($6)
            mul $9, $10, $7
            b loop
            li $v0, 10
            syscall

15 Some examples of assembling code
Symbol Table
    symbol     address
    a1         1000 0000
    a2         1000 0004
    a3         1000 0014
    __start    0040 0000
    loop       0040 0008
Memory map of data section
    address      contents
    1000 0000    0000 0003
    1000 0004    0000 0010
    1000 0008    0000 0010
    1000 000c    0000 0010
    1000 0010    0000 0010
    1000 0014    0000 0005
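The data half of this symbol table and the memory map can be derived mechanically from the .word directives. The sketch below is a minimal illustration; `layout_data` and its input format are hypothetical, and the base address 0x10000000 is taken from the addresses shown on the slide.

```python
DATA_BASE = 0x10000000  # start of the data section as used on the slide


def layout_data(directives, base=DATA_BASE):
    """Assign addresses to .word data.

    directives: list of (label, values) pairs, one per .word directive.
    Returns (symbols, memory): label -> address and address -> 32-bit word.
    """
    symbols = {}
    memory = {}
    address = base
    for label, values in directives:
        symbols[label] = address  # the label is bound to the first word
        for value in values:
            memory[address] = value & 0xFFFFFFFF
            address += 4          # each .word value occupies four bytes
    return symbols, memory


symbols, memory = layout_data([
    ("a1", [3]),
    ("a2", [16, 16, 16, 16]),
    ("a3", [5]),
])
for addr in sorted(memory):
    print(f"{addr:08x}  {memory[addr]:08x}")
```

a3 lands at 0x10000014 because a1 takes one word and a2 takes four.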

16 Translate pseudo-instructions
    original                  translated
    la $6, a2                 lui $6, 0x1000
                              ori $6, $6, 0x0004
    loop: lw $7, 4($6)        lw $7, 4($6)
    mul $9, $10, $7           mult $10, $7
                              mflo $9
    b loop                    b loop
    li $v0, 10                ori $v0, $0, 10
    syscall                   syscall
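The three expansions on this slide can be written as a small rewrite function. This is a sketch of the slide's translations only, not of any particular assembler: the name `expand` and the tuple representation of instructions are assumptions, and `li` is handled only for values that fit in 16 bits, as in the example.

```python
def expand(op, args, symbols):
    """Translate one pseudo-instruction into a list of (op, args) machine instructions."""
    if op == "la":
        reg, label = args
        addr = symbols[label]
        return [
            ("lui", [reg, addr >> 16]),          # upper 16 bits of the address
            ("ori", [reg, reg, addr & 0xFFFF]),  # lower 16 bits of the address
        ]
    if op == "li":
        reg, value = args
        if not 0 <= value <= 0xFFFF:
            raise ValueError("this sketch only handles 16-bit constants")
        return [("ori", [reg, "$0", value])]
    if op == "mul":
        rd, rs, rt = args
        return [("mult", [rs, rt]), ("mflo", [rd])]
    return [(op, list(args))]  # already a machine instruction


symbols = {"a2": 0x10000004}
program = [("la", ["$6", "a2"]), ("mul", ["$9", "$10", "$7"]), ("li", ["$v0", 10])]
expanded = [instr for op, args in program for instr in expand(op, args, symbols)]
for instr in expanded:
    print(instr)
```

Since la and mul each become two instructions, expansion has to happen before labels such as loop are bound to addresses.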

17 Translate to machine code
    address     contents     instruction
    00400000    3c06 1000    lui $6, 0x1000
    00400004    34c6 0004    ori $6, $6, 0x0004
    00400008    8cc7 0004    lw $7, 4($6)
    0040000c    0147 0018    mult $10, $7
    00400010    0000 4812    mflo $9
    00400014    1000 xxxx    b loop (assembled as beq)
    00400018    3402 000a    ori $v0, $0, 10
    0040001c    0000 000c    syscall

18 Resolve relative references
    address     contents     instruction
    00400000    3c06 1000    lui $6, 0x1000
    00400004    34c6 0004    ori $6, $6, 0x0004
    00400008    8cc7 0004    lw $7, 4($6)
    0040000c    0147 0018    mult $10, $7
    00400010    0000 4812    mflo $9
    00400014    1000 fffd    b loop (offset -3)
    00400018    3402 000a    ori $v0, $0, 10
    0040001c    0000 000c    syscall
(0x400008 - 0x400014)/4 = -12/4 = -3 = 0xfffd
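Filling in the placeholder for `b loop` is a matter of overwriting the low 16 bits of the word once the address of loop is known. A minimal sketch, using the slides' convention that the offset is measured from the branch instruction itself; `patch_branch` is a hypothetical name.

```python
def patch_branch(word, branch_addr, target_addr):
    """Replace the 16-bit immediate of a branch word with the resolved word offset."""
    offset = (target_addr - branch_addr) // 4      # word offset, may be negative
    return (word & 0xFFFF0000) | (offset & 0xFFFF)  # keep opcode and registers


symbols = {"loop": 0x00400008}
unresolved = 0x10000000  # beq $0, $0 with the offset field still empty
patched = patch_branch(unresolved, 0x00400014, symbols["loop"])
print(f"{patched:#010x}")
```

Only the immediate field changes; the upper half of the word, which holds the opcode and the two register numbers, is left as assembled.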

