Chapter 5 Assembly Language
The Level-ISA3 language is machine language, sequences of 1’s and 0’s sometimes abbreviated to hexadecimal (last chapter)
Two types of bit patterns:
Instructions: mnemonics for opcodes, letters for addressing modes
Data: pseudo-ops, also called dot commands
Example machine language instruction: C0009A = 1100-0000 0000-0000 1001-1010
The instruction specifier has the form 1100-raaa: load a word into register r (r=0 selects the accumulator) with addressing mode aaa (aaa=000 is immediate addressing).
This instruction is written in the Pep/9 assembly language as LDWA 0x009A,i
LDWA is the mnemonic (load word accumulator), 0x marks a hexadecimal constant, and i is the addressing mode.
Figure 5.1
QUESTION Convert the following machine language instructions to assembly language:
1100 0011 0000 0000 1001 1010
1100 0110 0000 0000 1001 1010
1100 1011 0000 0000 1001 1010
1100 1110 0000 0000 1001 1010
ANSWER
1100 0011 0000 0000 1001 1010 is LDWA 0x009A,s
1100 0110 0000 0000 1001 1010 is LDWA 0x009A,sx
1100 1011 0000 0000 1001 1010 is LDWX 0x009A,s
1100 1110 0000 0000 1001 1010 is LDWX 0x009A,sx
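The conversions above can be checked mechanically. The following is a minimal Python sketch, not part of the Pep/9 tools, that splits a 1100-raaa instruction into its register bit, addressing-mode field, and operand specifier. The function name decode_ldw and the table name ADDR_MODES are hypothetical; the order of the addressing-mode letters is the Pep/9 order for the three-bit aaa field.

```python
# Addressing-mode letters indexed by the 3-bit aaa field (Pep/9 order).
ADDR_MODES = ["i", "d", "n", "s", "sf", "x", "sx", "sfx"]


def decode_ldw(bits: str) -> str:
    """Decode a 24-bit load-word instruction (1100-raaa) given as 0s and 1s."""
    bits = bits.replace(" ", "").replace("-", "")
    if len(bits) != 24 or bits[:4] != "1100":
        raise ValueError("not a 3-byte 1100-raaa instruction")
    register = "A" if bits[4] == "0" else "X"   # r = 0: accumulator, r = 1: index register
    mode = ADDR_MODES[int(bits[5:8], 2)]        # aaa selects the addressing mode
    operand = int(bits[8:], 2)                  # 16-bit operand specifier
    return f"LDW{register} 0x{operand:04X},{mode}"


for pattern in (
    "1100 0000 0000 0000 1001 1010",
    "1100 0011 0000 0000 1001 1010",
    "1100 0110 0000 0000 1001 1010",
    "1100 1011 0000 0000 1001 1010",
    "1100 1110 0000 0000 1001 1010",
):
    print(pattern, "->", decode_ldw(pattern))
```

Only the load-word opcode is handled here; other opcodes would need their own entries.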
Figure 5.2 The 40 instructions of the Pep/9 instruction set at Level Asmb5. U marks an instruction that is unary.
Figure 5.2 (continued)
Figure 5.2 (continued)
Figure 5.2 (continued)
The unimplemented opcode instructions
NOPn Unary no-operation trap
NOP Non-unary no-operation trap
DECI Decimal input trap
DECO Decimal output trap
HEXO Hexadecimal output trap
STRO String output trap
These new instructions are available to the assembly language programmer at Level Asmb5, but they are not part of the instruction set at Level ISA3. The operating system at Level OS4 provides them with its trap handler. Chapter 8 shows in detail how the operating system provides these instructions. You do not need to know the details of how the instructions are implemented to program with them.
Pseudo-operations (pseudo-ops, dot commands)
Pseudo-ops are assembly language statements. They do not have opcodes and do not correspond to any of the 40 instructions in the Pep/9 instruction set.
.ADDRSS The address of a symbol
.ALIGN Padding to align at a memory boundary
.ASCII A string of ASCII bytes
.BLOCK A block of zero bytes
.BURN Initiate ROM burn
.BYTE A byte value
.END The sentinel for the assembler
.EQUATE Equate a symbol to a constant value
.WORD A word value
Pseudo means false. Pseudo-ops are so called because the bits that they generate do not correspond to opcodes, as do the bits generated by the 40 instruction mnemonics. Pseudo-ops are also called assembler directives or dot commands because each must be preceded by a . in assembly language. All the pseudo-ops except .BURN, .END, and .EQUATE insert data bits into the machine language program.
Question Convert the following machine language instructions into assembly language, assuming that they were not generated by pseudo-ops: (a) 9AEF2A (b) 03 (c) D7003D
Answer (a) 9AEF2A = 1001-1010 1110-1111 0010-1010
Instruction specifier 1001-raaa: r=1 selects the index register X, giving ORX; aaa=010 is indirect addressing (n).
Operand specifier = EF2A
ANSWER: ORX 0xEF2A,n
Answer (b) 03 = 0000-0011
ANSWER: MOVSPA
Answer (c) D7003D = 1101-0111 0000-0000 0011-1101
Instruction specifier 1101-raaa: r=0 selects the accumulator A, giving LDBA; aaa=111 is addressing mode sfx.
Operand specifier = 003D
ANSWER: LDBA 0x003D,sfx
Question Convert the following assembly language instructions into hexadecimal machine language: (a) ASLA (b) DECI 0x000F,s (c) BRNE 0x01E6,i
Answer (a) ASLA
r = A = 0, so the instruction specifier 0000-101r is 0000-1010
ANSWER: 0A
Answer (b) DECI 0x000F,s
s addressing mode = 011, so the instruction is 0011-0011 0000-0000 0000-1111
ANSWER: 33 00 0F
Answer (c) BRNE 0x01E6,i
i addressing mode = 0, so the instruction is 0001-1010 0000-0001 1110-0110
ANSWER: 1A 01 E6
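The translation in the other direction, from assembly language to hexadecimal, follows the same bit fields. The Python sketch below handles only the three mnemonics from this exercise (plus ASLX); the function name assemble_line is hypothetical, and this is an illustration of the bit packing, not how the Pep/9 assembler itself is written.

```python
# Addressing-mode letters indexed by the 3-bit aaa field (Pep/9 order).
ADDR_MODES = ["i", "d", "n", "s", "sf", "x", "sx", "sfx"]


def assemble_line(line: str) -> str:
    """Translate one assembly statement into hexadecimal object code."""
    parts = line.split()
    mnemonic = parts[0].upper()

    # Unary instructions: 0000 101r, where r = 0 for A and r = 1 for X.
    if mnemonic in ("ASLA", "ASLX"):
        return f"{0b00001010 | (mnemonic[-1] == 'X'):02X}"

    operand_text, mode = parts[1].split(",")
    operand = int(operand_text, 16)            # accepts the 0x prefix
    if mnemonic == "DECI":                     # 0011 0aaa
        specifier = 0b00110000 | ADDR_MODES.index(mode)
    elif mnemonic == "BRNE":                   # 0001 101a: a = 0 is i, a = 1 is x
        specifier = 0b00011010 | {"i": 0, "x": 1}[mode]
    else:
        raise ValueError(f"mnemonic not covered by this sketch: {mnemonic}")
    return f"{specifier:02X} {operand >> 8:02X} {operand & 0xFF:02X}"


for statement in ("ASLA", "DECI 0x000F,s", "BRNE 0x01E6,i"):
    print(statement, "->", assemble_line(statement))
```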
QUESTION Convert the following program into Pep/9 assembly language.
ANSWER The .ASCII and .END Pseudo-ops
Running a program: the Build menu offers Assemble, Load, and Execute.
Build > Assemble produces the object code and the assembler listing.
Build > Load, then Build > Execute.
Run Source (hi.pep): Assemble, Load, and Execute are combined in the single option called Run Source.
Step through
The .ASCII pseudo-op: the backslash prefix
To include a double quote in your string, you must prefix it with a backslash \. E.g. "She said, \"Hello\"."
To include a backslash, prefix it with a backslash. E.g. "My bash is \\."
You can put a newline character in your string by prefixing the letter n with a backslash, and a tab character by prefixing the letter t with a backslash. E.g. "\nThis sentence will output on a new line."
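The escape rules can be made concrete with a short Python sketch that expands the text between the quotes of a .ASCII string into the bytes it stands for. The function name ascii_bytes is hypothetical. The \xHH form is included on the assumption that it denotes one byte given in hexadecimal, as in the string "Bear\x00" used in an exercise in these slides.

```python
def ascii_bytes(literal: str) -> bytes:
    """Expand backslash escapes in the text between the quotes of a .ASCII string."""
    simple = {'"': '"', "\\": "\\", "n": "\n", "t": "\t"}
    out = bytearray()
    i = 0
    while i < len(literal):
        ch = literal[i]
        if ch != "\\":                       # ordinary character: one byte
            out.append(ord(ch))
            i += 1
        elif literal[i + 1] == "x":          # \xHH: one byte written as two hex digits
            out.append(int(literal[i + 2:i + 4], 16))
            i += 4
        else:                                # \" \\ \n \t
            out.append(ord(simple[literal[i + 1]]))
            i += 2
    return bytes(out)


# Raw strings keep the backslashes exactly as they would appear in the source file.
print(ascii_bytes(r'She said, \"Hello\".'))
print(ascii_bytes(r"My bash is \\."))
print(ascii_bytes(r"Bear\x00").hex(" ").upper())
```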
QUESTION Write an assembly language program that prints your first name on the screen. Use the .ASCII pseudo-op to store the characters at the bottom of your program. Use the LDBA instruction with direct addressing to output the characters from the string. The name you print must contain more than two letters.
ANSWER
Figure 5.4 ASSEMBLER
Figure 5.5
.BLOCK pseudo-op: the assembler generates the next byte of 0’s for storage. E.g. .BLOCK 1
The assembler interprets any number not prefixed with 0x as a decimal integer (e.g. 1 here).
QUESTION Convert the following program from machine code to assembly language.
ANSWER Figure 5.6 The .BLOCK pseudo-op: the assembler generates the next byte of 0’s for storage.
The .WORD and .BYTE Pseudo-ops
.WORD is like the .BLOCK command, with two differences:
It always generates one word (two bytes) of code, not an arbitrary number of bytes.
The programmer can specify the content of the word.
The dot command .WORD 5 means “Generate one word with a value of 5 (dec).” .WORD 0x0030 means “Generate one word with a value of 0030 (hex).”
The .BYTE command works like the .WORD command except that it generates a byte value instead of a word value. In this program, you could replace .WORD 0x0030 with .BYTE 0x00 .BYTE 0x30 and generate the same machine language.
Question Convert the following assembly language pseudo-ops into hexadecimal machine language: (a) .ASCII "Bear\x00" (b) .BYTE 0xF8 (c) .WORD 790
Answer (a) .ASCII "Bear\x00" ANSWER: 42 65 61 72 00 (b) .BYTE 0xF8 ANSWER: F8 (c) .WORD 790 790 (dec) = 316 (hex) ANSWER: 03 16
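As a quick check of the .WORD and .BYTE conversions, here is a small Python sketch with two hypothetical helpers, word and byte, that format a value the way the object code shows it: a word as two bytes with the most significant byte first, and a byte as a single pair of hex digits.

```python
def word(value: int) -> str:
    """Object code for .WORD: two bytes, most significant byte first."""
    value &= 0xFFFF                      # keep 16 bits (negative values wrap)
    return f"{value >> 8:02X} {value & 0xFF:02X}"


def byte(value: int) -> str:
    """Object code for .BYTE: one byte."""
    return f"{value & 0xFF:02X}"


print(word(790))                         # .WORD 790
print(byte(0xF8))                        # .BYTE 0xF8
print(word(0x0030), "==", byte(0x00), byte(0x30))
```

The last line illustrates the point made about replacing .WORD 0x0030 with two .BYTE commands: both produce the same two bytes.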
Question Convert following program to assembly:
review
Figure 5.7 Answer
Figure 5.8 Using the Pep/9 Assembler. First the assembler is loaded into main memory and the application program is taken as the input file. The output from this run is the machine language version of the application program. It is then loaded into main memory for the second run. All the programs in the center boxes must be in machine language.
Figure 5.9 When writing an assembly language program, you must place at least one space after the mnemonic or dot command. Other than that, there are no restrictions on spacing. Your source program may be in any combination of uppercase or lowercase letters.
QUESTION Predict the output of the following assembly language program:
ANSWER Output: gum
Addresses (hex) of the statements: 0000 0003 0006 0009 000C 000F 0012 0013
Each character of the string takes one byte, at addresses 0013, 0014, and 0015.
Annotations from the figure: Load “m” to accumulator, output “m”; load “u” to accumulator, output “u”; load “g” to accumulator, output “g”.
Question Predict the output of the following assembly language program if the input is g. Predict the output if the input is A. Explain the difference between the two results:
LDBA 0xFC15,d
ANDA 0x000A,d
STBA 0xFC16,d
STOP
.WORD 0x00DF
.END
Answer
0000 LDBA 0xFC15,d ; get char from user
0003 ANDA 0x000A,d ; AND with 00DF = 0000-0000-1101-1111
0006 STBA 0xFC16,d
0009 STOP
000A .WORD 0x00DF
000C .END
The output is G when the input is g. The output is A when the input is A. The program converts a lowercase letter to its uppercase equivalent, but keeps uppercase letters the same. Uppercase and lowercase letters differ by a single bit, which is 0 for uppercase and 1 for the corresponding lowercase letter. The AND mask forces that bit to zero and leaves all other bits unchanged.
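The effect of the mask can be reproduced in a few lines of Python. This is only an illustration of the bitwise AND performed by ANDA, using a hypothetical helper named mask_char; it does not simulate the Pep/9 registers.

```python
MASK = 0x00DF  # 0000-0000-1101-1111: every bit kept except the 0x20 bit


def mask_char(ch: str) -> str:
    """Apply the AND mask to the ASCII code of one character."""
    return chr(ord(ch) & MASK)


for ch in ("g", "A"):
    before = ord(ch)
    after = before & MASK
    print(f"{ch!r} {before:08b} AND {MASK & 0xFF:08b} = {after:08b} {chr(after)!r}")
```

The binary columns show that only the bit worth 0x20 changes, which is the one bit that separates g (67 hex) from G (47 hex).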
Direct addressing Oprnd = Mem[OprndSpec] Asmb5 letter: d The operand specifier is the address in memory of the operand.
Immediate addressing The operand specifier is the operand. Oprnd = OprndSpec Asmb5 letter: i
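The two definitions, Oprnd = Mem[OprndSpec] and Oprnd = OprndSpec, can be compared side by side in a small Python sketch. The function name fetch_operand and the memory contents are made up for illustration; the only assumption about Pep/9 is that a word occupies two consecutive bytes with the most significant byte first.

```python
def fetch_operand(mem: dict, oprnd_spec: int, mode: str) -> int:
    """Return the 16-bit operand for immediate (i) or direct (d) addressing."""
    if mode == "i":
        return oprnd_spec                     # Oprnd = OprndSpec
    if mode == "d":
        high = mem[oprnd_spec]                # Oprnd = Mem[OprndSpec]
        low = mem[oprnd_spec + 1]
        return (high << 8) | low
    raise ValueError(f"mode not covered by this sketch: {mode}")


# Hypothetical memory: the word 0x1234 stored at address 0x009A.
memory = {0x009A: 0x12, 0x009B: 0x34}

print(f"LDWA 0x009A,i loads {fetch_operand(memory, 0x009A, 'i'):04X}")
print(f"LDWA 0x009A,d loads {fetch_operand(memory, 0x009A, 'd'):04X}")
```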
Question Convert the following program to use immediate addressing. How will the machine code change?
Figure 5.10 Program 5.3 (print ‘Hi’) using immediate addressing, compared with the previous program.
Character constants are enclosed in single quotes and always generate one byte of code.
Immediate addressing has two advantages over direct addressing:
- The program is shorter because the ASCII string does not need to be stored separately from the instruction.
- The instruction also executes faster because the operand is immediately available to the CPU in the instruction register.
The decimal input instruction
Instruction specifier: 0011 0aaa
Mnemonic: DECI
Converts a string of ASCII characters from the input device into a 16-bit signed integer and stores it into memory.
Because the input device at Mem[FC15] can input only one byte at a time, as a single ASCII character, it is difficult to perform I/O on decimal values that require more than one digit for their ASCII representation.
The decimal output instruction
Instruction specifier: 0011 1aaa
Mnemonic: DECO
Converts a 16-bit signed integer from memory into a string of ASCII characters and sends the string to the output device.
Because the output device at Mem[FC16] can output only one byte at a time, as a single ASCII character, it is difficult to perform I/O on decimal values that require more than one digit for their ASCII representation.
The unconditional branch instruction
Instruction specifier: 0001 001a
Mnemonic: BR
Skips to a different memory location for the next instruction to be executed. Branch instructions almost always use immediate addressing.
Question Write a program to accept a decimal value from the keyboard and print the input value + 1. Sample output: for user input 7, the output is 7 + 1 = 8
Algorithm (each load of a character is followed by a store byte A -> output):
Branch around data
Storage for one integer
Input number
Output number
Load A <- ' '
Store byte A -> output
Load A <- '+'
Load A <- '1'
Load A <- '='
Load A <- input stored in memory
Add A <- A + 1
Store sum in A to memory
Output sum in memory
Stop
Figure 5.11 Requires seven pairs of LDBA and STBA instructions to output the string " + 1 = ", one pair for each ASCII character that is output.
Figure 5.11 (continued) If you do not specify the addressing mode for a branch instruction, the assembler will assume immediate addressing
The string output instruction Instruction specifier: 0100 1aaa Mnemonic: STRO Send a string of null-terminated ASCII characters to the output device It lets you output the entire string of multiple characters with only one instruction.
Figure 5.12
The hexadecimal output instruction Instruction specifier: 0100 0aaa Mnemonic: HEXO Convert a 2-byte word from memory into four hexadecimal digits and send the string to the output device
Figure 5.13 Annotations: the first word is output in decimal as -2. The bytes 00 (hex) and 55 (hex) together form 0055 (hex) = 85 (dec), and 55 (hex) is ‘U’. 1136 (dec) = 0470 (hex), and its low byte 70 (hex) = ‘p’.
Figure 5.13 (continued)
QUESTION Predict the output of the program in Figure 5.13 if the dot commands are changed to .WORD 0xFFC7 ;First .BYTE 0x00 ;Second .BYTE 'H' ;Third .WORD 873 ;Fourth
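ANSWER FFC7 (hex) is the negative of 0039 (hex) = 57 (dec), so it is output in decimal as -57. The bytes 00 (hex) and ‘H’ together form 0048 (hex) = 72 (dec), output in hexadecimal as 0048. 48 (hex) in the ASCII table = ‘H’. 873 (dec) = 0369 (hex), and 69 (hex) in the ASCII table = ‘i’.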
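The same bits give different output depending on which instruction interprets them. The Python sketch below, with hypothetical helper names, shows the three interpretations involved here: a signed 16-bit decimal (as DECO prints), four hexadecimal digits (as HEXO prints), and a single ASCII character taken from one byte.

```python
def as_signed(word: int) -> int:
    """Interpret a 16-bit pattern as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def as_hex(word: int) -> str:
    """Four hexadecimal digits."""
    return f"{word:04X}"


def low_char(word: int) -> str:
    """ASCII character stored in the low-order byte of a word."""
    return chr(word & 0xFF)


print(as_signed(0xFFC7))                 # first word as a decimal
print(as_signed(0x0048), as_hex(0x0048)) # bytes 00 and 'H' read as one word
print(low_char(0x0048))                  # the byte 48 (hex) as a character
print(as_hex(873), low_char(873))        # 873 (dec) and its low-order byte
```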
ASCII Chart
Symbols
A symbol, similar to a C identifier, is associated with a memory address.
A symbol is defined by an identifier followed by a colon at the start of a statement.
The value of a symbol is the address of the object code generated by the statement.
When the assembler detects a symbol definition, it stores the symbol and its value in a symbol table.
Let’s use symbols …
Figure 5.15 identifier Use symbol
Figure 5.15 (continued)
QUESTION In the following code, determine the values of the symbols here and there. Write the object code in hexadecimal. (Do not predict the output.)
BR there
here: .WORD 9
there: DECO here,d
STOP
.END
ANSWER
Address Object code
0000 12 00 05 BR there
0003 00 09 here: .WORD 9
0005 39 00 03 there: DECO here,d
0008 00 STOP
.END
BR with immediate addressing: 0001-0010 (bin) = 12 (hex)
DECO with direct addressing: 0011-1001 (bin) = 39 (hex)
Object code is: 12 00 05 00 09 39 00 03 00 zz
Symbol here has value 0003 (hex). Symbol there has value 0005 (hex).
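The way symbol values fall out of statement sizes can be sketched in Python. The code below makes one pass to record the address of each symbol in a table and a second pass to emit object code. It covers only the four statements of this exercise; the names SOURCE, SIZE, OPCODE, and assemble are hypothetical, and the two-pass structure is an illustration, not a description of how the Pep/9 assembler is implemented.

```python
# (symbol, mnemonic or dot command, operand) for each statement of the exercise.
SOURCE = [
    ("", "BR", "there"),
    ("here", ".WORD", "9"),
    ("there", "DECO", "here"),
    ("", "STOP", ""),
]
SIZE = {"BR": 3, ".WORD": 2, "DECO": 3, "STOP": 1}   # bytes generated
OPCODE = {"BR": 0x12, "DECO": 0x39, "STOP": 0x00}    # BR immediate, DECO direct


def assemble(source):
    # Pass 1: a symbol's value is the address of the code its statement generates.
    symbols, address = {}, 0
    for symbol, op, _ in source:
        if symbol:
            symbols[symbol] = address
        address += SIZE[op]

    # Pass 2: emit bytes, replacing each symbol with its value.
    code = []
    for _, op, arg in source:
        if op == ".WORD":
            value = int(arg)
            code += [value >> 8, value & 0xFF]
        elif SIZE[op] == 1:
            code.append(OPCODE[op])
        else:
            target = symbols[arg]
            code += [OPCODE[op], target >> 8, target & 0xFF]
    return symbols, " ".join(f"{b:02X}" for b in code)


symbols, object_code = assemble(SOURCE)
print({name: f"{value:04X}" for name, value in symbols.items()})
print(object_code)
```

The forward reference to there in the first statement is why the symbol table is built before any code is emitted.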
Figure 5.16 Symbols:
- relieve you of the burden of calculating addresses manually
- make your programs easier to read
num is easier on the eyes than 0x0003. Good programmers are careful to select meaningful symbols for their programs to enhance readability.
Figure 5.18 Translating from Level HOL6. Some compilers translate directly into machine language (Level ISA3). Other compilers translate into assembly language (Level Asmb5). An assembler then must translate the assembly language program into machine language before it can be loaded and executed.
C to Pep/9: This section describes the translation process from C to Pep/9 assembly language. It shows how a compiler translates scanf(), printf(), and assignment statements, and how it enforces the concept of type at the C level.
Translating printf() Translate string output with STRO Translate integer output with DECO
Question Convert following program into Pep/9 Assembly
Figure 5.19 does not appear in the assembly language program at all
Figure 5.20
Variables and Types Compiler uses a symbol table to make the connection between variable names and addresses.
Global variables
Allocated at a fixed location in memory with .BLOCK
Accessed with direct addressing (d)
Assignment statements
Load the accumulator from the right-hand side of the assignment with LDWA (or LDBA for a byte).
Compute the value of the right-hand side of the assignment if necessary.
Store the value to the variable on the left-hand side of the assignment with STWA (or STBA for a byte).
Input and output device names
If you modify the operating system, the input device may no longer be at Mem[FC15]. However, the location of the input device will still be in the machine vector at FFF8. Similarly, the location of the output device will always be in the machine vector at FFFA.
Mem[FFF8] has the value of charIn
Mem[FFFA] has the value of charOut
During execution, the virtual machine uses these vectors to know where the input and output devices are in the memory map. From now on, you should use the symbols charIn and charOut when accessing the memory-mapped I/O devices, because they will always map to the correct locations in memory regardless of any modifications to the operating system.
Replace LDBA 0xFC15,d with LDBA charIn,d
Replace STBA 0xFC16,d with STBA charOut,d
Programming Question Convert following C program to Pep/9 Assembly
Figure 5.22 (continued) Input device Output device
Figure 5.22 (continued)
Figure 5.23 Entries in the symbol table for this program (the assembly language version carries trace tags):
#include <stdio.h>
char ch;
int j;
int main() {
    scanf("%c %d", &ch, &j);
    j += 5;
    ch++;
    printf("%c\n%d\n", ch, j);
    return 0;
}
QUESTION Write an assembly language program that corresponds to the following C program:
int num1;
int num2;
int main () {
    scanf("%d %d", &num1, &num2);
    printf("%d\n%d\n", num2, num1);
    return 0;
}
ANSWER 5.4 Q.24
Type Compatibility
Suppose you have two variables, integer j and floating-point y, in a C program. Is modulus type compatible? What is the assembly equivalent?
Type Compatibility
Modulus: j % 8 sets all the bits except the rightmost three bits to 0.
The compiler would consult the symbol table and determine that the kind for the variable j is sInt. It would also recognize 8 as an integer constant and determine that the % operation is legal. It would then generate the object code
LDWA j,d
ANDA 0x0007,i
STWA j,d
Figure 5.24
#include <stdio.h>
int j;
float y;
int main () {
    ...
    j = j % 8;
    y = y % 8; // Compile error
}
The compiler would consult the symbol table and determine that the kind for the variable y is sFloat. It would determine that the % operation is not legal because it can be applied only to integer types. It would then generate the error message (TYPE CHECKING).
Having the compiler check for type compatibility is a tremendous help. It keeps you from writing meaningless statements, such as performing a % operation on a float variable. When you program directly in assembly language at Level Asmb5, there are no type compatibility checks. All data consists of bits. When bugs occur due to incorrect data movements, they can be detected only at run time, not at translation time. That is, they are logical errors instead of syntax errors. Logical errors are notoriously more difficult to locate than syntax errors.
Question Convert the following C program to Pep/9 assembly language:
#include <stdio.h>
int j;
int main() {
    scanf("%d", &j);
    printf("%d\n", j);
    j = j % 8;
    printf("%d", j);
}
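The reason an AND with 0x0007 can stand in for % 8 is that 8 is a power of two, so the remainder is exactly the rightmost three bits. A short Python check follows, using a hypothetical helper named mod8. It is restricted to non-negative values on purpose: for a negative j, the C % operator yields a negative or zero remainder, while the AND mask always yields a value from 0 to 7.

```python
def mod8(j: int) -> int:
    """Keep the rightmost three bits, as ANDA 0x0007,i does."""
    return j & 0x0007


for j in (5, 8, 13, 1000):
    print(f"{j:5d} = {j:016b}  ->  {mod8(j)} (j % 8 = {j % 8})")
```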
Answer
Question 31. Write an assembly language program that corresponds to the following C program:
int num;
int main () {
    scanf("%d", &num);
    num = num % 16;
    printf("num = %d\n", num);
    return 0;
}
Answer
Trace tags
Pep/9 has three symbolic trace features:
global tracer for global variables
stack tracer for parameters and local variables
heap tracer for dynamically allocated variables
Trace tags To trace a variable, the programmer embeds trace tags in the comments associated with the variables and single steps through the program. The Pep/9 integrated development environment shows the runtime values of the variables. http://computersystemsbook.com/video-tutorials/pep9/ Pep/9 Assembly Language Programming
Trace tags:
are contained in assembly language comments and have no effect on generated object code.
begin with the # character.
supply information to the symbol tracer on how to format and label the memory cell in the trace window.
Trace tag errors show up as warnings when the code is assembled, allowing program execution without tracing turned on. However, they do prevent tracing until they are corrected.
Trace tags
There are two kinds of trace tags:
Format trace tags: required for global and local variables
Symbol trace tags: NOT required for global variables
The global tracer allows the user to specify which global symbol to trace by placing a format trace tag in the comment of the .BLOCK line where the global variable is declared. E.g. format trace tags #1c and #2d
Format trace tags
#1c One-byte character
#1d One-byte decimal
#2d Two-byte decimal
#1h One-byte hexadecimal
#2h Two-byte hexadecimal
Question 28. Write an assembly language program that corresponds to the following C program:
char ch;
int main () {
    scanf("%c", &ch);
    ch--;
    printf("%c\n", ch);
    return 0;
}
Answer
Question 29. Write an assembly language program that corresponds to the following C program:
int num1;
int num2;
int main () {
    scanf("%d", &num1);
    num2 = -num1;
    printf("num1 = %d\n", num1);
    printf("num2 = %d\n", num2);
    return 0;
}
Answer
The Shift and Rotate Instructions
Pep/9 has two arithmetic shift instructions and two rotate instructions. All four are unary, so they have no operand specifier. Each one operates on either the accumulator or the index register, depending on the value of r.
The arithmetic shift right instruction
Divides a signed integer by 2
Instruction specifier: 0000 110r
Mnemonic: ASRr (ASRA, ASRX)
Performs a one-bit arithmetic shift right on a 16-bit register. The C bit receives the LSB before the shift.
Figure 5.25, 5.26 The value 0000-0000-1001-1000 (bin) shifts right to 0000-0000-0100-1100 (bin). The C bit is 0 because the least significant bit was 0 before the shift occurred.
The arithmetic shift left instruction
Multiplies a signed integer by 2
Instruction specifier: 0000 101r
Mnemonic: ASLr (ASLA, ASLX)
Performs a one-bit arithmetic shift left on a 16-bit register. The C bit receives the MSB before the shift.
The rotate left instruction Rotates each bit to the left by one bit, sending the most significant bit into C and C into the least significant bit Instruction specifier: 0000 111r Mnemonic: ROLr (ROLA, ROLX) Performs a one-bit rotate left on a 16-bit register
The rotate right instruction Rotates each bit to the right by one bit, sending the least significant bit into C and C into the most significant bit Instruction specifier: 0001 000r Mnemonic: RORr (RORA, RORX) Performs a one-bit rotate right on a 16-bit register
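The four operations can be modeled on a 16-bit value in Python. The sketch below tracks only the register contents and the C bit as described in these slides; the other status bits are left out, and the function names are hypothetical. Each function returns the new register value and the new C bit.

```python
def asl(r: int):
    """Arithmetic shift left: C gets the MSB before the shift."""
    return (r << 1) & 0xFFFF, (r >> 15) & 1


def asr(r: int):
    """Arithmetic shift right: the sign bit is kept, C gets the LSB before the shift."""
    return (r >> 1) | (r & 0x8000), r & 1


def rol(r: int, c: int):
    """Rotate left: MSB goes into C, old C goes into the LSB."""
    return ((r << 1) & 0xFFFF) | c, (r >> 15) & 1


def ror(r: int, c: int):
    """Rotate right: LSB goes into C, old C goes into the MSB."""
    return (r >> 1) | (c << 15), r & 1


value, carry = asr(0x0098)
print(f"ASR 0098 -> {value:04X}, C = {carry}")
value, carry = asl(0x004C)
print(f"ASL 004C -> {value:04X}, C = {carry}")
value, carry = rol(0x8000, 0)
print(f"ROL 8000 with C = 0 -> {value:04X}, C = {carry}")
```

Because the rotates pass through C, sixteen rotations do not restore the register; seventeen do.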
QUESTION Write an assembly language program that corresponds to the following C program:
int width;
int length;
int perim;
int main () {
    scanf("%d %d", &width, &length);
    perim = (width + length) * 2;
    printf("width = %d\n", width);
    printf("length = %d\n\n", length);
    printf("perim = %d\n", perim);
    return 0;
}
ANSWER 5.4 Q27
Question 30. Write an assembly language program that corresponds to the following C program:
int num;
int main () {
    scanf("%d", &num);
    num = num / 16;
    printf("num = %d\n", num);
    return 0;
}
Answer
Constants
Equate the constant to its value with .EQUATE
.EQUATE does not generate object code.
The value of the constant symbol is not an address.
Question Convert following program to Pep/9 Assembly:
Example C constant /2
.EQUATE does not generate code (no machine code and no address). Access the constant with immediate addressing.
Figure 5.27 (continued) 10 (dec). address
Question Write an assembly language program that corresponds to the following C program:
const char chConst = 'a';
char ch1;
char ch2;
int main () {
    scanf("%c%c", &ch1, &ch2);
    printf("%c%c%c\n", ch1, chConst, ch2);
    return 0;
}
Answer
Question Write an assembly language program that corresponds to the following C program:
const int amount = 20000;
int num;
int sum;
int main () {
    scanf("%d", &num);
    sum = num + amount;
    printf("sum = %d\n", sum);
    return 0;
}
Test your program twice. The first time, enter a value for num to make the sum within the allowed range for the Pep/9 computer. The second time, enter a value that is in range but that makes sum outside the range. Note that the out-of-range condition does not cause an error message but just gives an incorrect value. Explain the value.
Answer
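The incorrect value in the out-of-range case comes from keeping only 16 bits of the sum and reading them as a two's complement integer. The Python sketch below models that wraparound with hypothetical helpers to_int16 and pep9_sum; it illustrates the arithmetic only and is not the assembly language answer to the question.

```python
AMOUNT = 20000  # the constant from the C program


def to_int16(value: int) -> int:
    """Keep the low 16 bits and interpret them as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pep9_sum(num: int) -> int:
    """Sum as a 16-bit register would hold it."""
    return to_int16(num + AMOUNT)


for num in (100, 12767, 12768, 20000):
    true_sum = num + AMOUNT
    print(f"num = {num:6d}  true sum = {true_sum:6d}  16-bit sum = {pep9_sum(num):7d}")
```

Any true sum above 32767 appears 65536 too small, which is why 40000 shows up as a negative number.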