Chapter 2 Instructions: Language of the Computer CprE 381 Computer Organization and Assembly Level Programming, Fall 2013 Zhao Zhang Iowa State University Revised from original slides provided by MKP
Chapter 2 — Instructions: Language of the Computer — 2 Procedure/Function Calling Steps required 1.Place parameters in registers 2.Transfer control to procedure 3.Acquire storage for procedure 4.Perform procedure’s operations 5.Place result in register for caller 6.Return to place of call §2.8 Supporting Procedures in Computer Hardware
Chapter 2 — Instructions: Language of the Computer — 3 Register Usage Review $a0 – $a3: arguments (reg’s 4 – 7) $v0, $v1: result values (reg’s 2 and 3) $t0 – $t9: temporaries Can be overwritten by callee $s0 – $s7: saved Must be saved/restored by callee $gp: global pointer for static data (reg 28) $sp: stack pointer (reg 29) $fp: frame pointer (reg 30) $ra: return address (reg 31) Note: There are additional rules for floating point registers
Chapter 2 — Instructions: Language of the Computer — 4 Procedure Call Instructions Procedure call: jump and link jal ProcedureLabel Address of following instruction put in $ra Jumps to target address Procedure return: jump register jr $ra Copies $ra to program counter Can also be used for computed jumps e.g., for case/switch statements
Chapter 2 — Instructions: Language of the Computer — 5 Leaf Procedure Example C code: int leaf_example (int g, h, i, j) { int f; f = (g + h) - (i + j); return f; } Arguments g, …, j in $a0, …, $a3 f in $s0 (hence, need to save $s0 on stack) Result in $v0
Chapter 2 — Instructions: Language of the Computer — 6 Leaf Procedure Example MIPS code: leaf_example: addi $sp, $sp, -4 sw $s0, 0($sp) add $t0, $a0, $a1 add $t1, $a2, $a3 sub $s0, $t0, $t1 add $v0, $s0, $zero lw $s0, 0($sp) addi $sp, $sp, 4 jr $ra Save $s0 on stack Procedure body Restore $s0 Result Return
Exercise Write MIPS code for int add2(int x, int y) { return x + y; } Chapter 1 — Computer Abstractions and Technology — 7
Exercise First version, with stack frame # x in $a0, y in $a1, return in $v0 add2: addi $sp, $sp, -4# alloc frame sw$s0, 0($sp)# save $s0 add$s0, $a0, $a1# tmp = x + y add$v0, $s0, $zero # $v0 = tmp lw$s0, 0($sp)# restore $s0 addi $sp, $sp, 4# release frame jr$ra Chapter 1 — Computer Abstractions and Technology — 8
Exercise Optimized version, w/o stack frame # x in $a0, y in $a1, return in $v0 add2: add$v0, $a0, $a1# $v0 = x + y jr$ra In this case, we have nothing to store in stack frame Chapter 1 — Computer Abstractions and Technology — 9
Exercise Write MIPS code for int max(int x, int y) { if (x > y) return x; else return y; } Chapter 1 — Computer Abstractions and Technology — 10
Exercise # x in $a0, y in $a1, return in $v0 max: slt$t0, $a1, $a0# y < x? beqelse# no, do else add$v0, $a0, $zero # to return x jal$ra# return else: add$v0, $a1, $zero # to return y jal$ra Chapter 1 — Computer Abstractions and Technology — 11
Chapter 2 — Instructions: Language of the Computer — 12 Non-Leaf Procedures Procedures that call other procedures For nested call, caller needs to save on the stack: Its return address Any arguments and temporaries needed after the call Restore from the stack after the call
Stack Frame Contents A complete stack frame may hold Extra arguments exceeding $a0-$a3 Save registers ($s0-$s7) that will be overwritten Return address ($ra) Local, automatic variables A non-leaf function must have a stack frame, because $ra has to be saved Chapter 1 — Computer Abstractions and Technology — 13
Chapter 2 — Instructions: Language of the Computer — 14 Local Data on the Stack Local data allocated by callee e.g., C automatic variables Procedure frame (activation record) Used by some compilers to manage stack storage Our examples do not use $fp
Non-Leaf Procedure Example Write MIPS code for int max3(int x, int y, int z) { return max(max(x, y), z); } We have to use a procedure frame in stack (stack frame) Chapter 1 — Computer Abstractions and Technology — 15
Non-Leaf Procedure Example # x in $a0, y in $a1, z in $a2, ret in $v0 max3: addi $sp, $sp, -8# alloc stack frame sw$ra, 4($sp)# preserve $ra sw$a2, 0($sp)# preserve z jal max# call max(x, y) add$a0, $v0, $zero # $a0 = max(x, y) lw$a1, 0($sp)# $a1 = z jal max# 2 nd call max(…) lw$ra, 4($sp)# restore $ra addi$sp, $sp, 8# free stack frame jr$ra# return Chapter 1 — Computer Abstractions and Technology — 16
Non-Leaf Procedure Example Write MIPS code for int add3(int x, int y, int z) { return add2(add2(x, y), z); } Chapter 1 — Computer Abstractions and Technology — 17
Chapter 2 — Instructions: Language of the Computer — 18 Non-Leaf Procedure Example C code: int fact (int n) { if (n < 1) return f; else return n * fact(n - 1); } Argument n in $a0 Result in $v0
Chapter 2 — Instructions: Language of the Computer — 19 Non-Leaf Procedure Example MIPS code: fact: addi $sp, $sp, -8 # adjust stack for 2 items sw $ra, 4($sp) # save return address sw $a0, 0($sp) # save argument slti $t0, $a0, 1 # test for n < 1 beq $t0, $zero, L1 addi $v0, $zero, 1 # if so, result is 1 addi $sp, $sp, 8 # pop 2 items from stack jr $ra # and return L1: addi $a0, $a0, -1 # else decrement n jal fact # recursive call lw $a0, 0($sp) # restore original n lw $ra, 4($sp) # and return address addi $sp, $sp, 8 # pop 2 items from stack mul $v0, $a0, $v0 # multiply to get result jr $ra # and return
Chapter 2 — Instructions: Language of the Computer — 20 Memory Layout Text: program code Static data: global variables e.g., static variables in C, constant arrays and strings $gp initialized to address allowing ±offsets into this segment Dynamic data: heap E.g., malloc in C, new in Java Stack: automatic storage
Chapter 2 — Instructions: Language of the Computer — 21 Character Data Byte-encoded character sets ASCII: 128 characters 95 graphic, 33 control Latin-1: 256 characters ASCII, +96 more graphic characters Unicode: 32-bit character set Used in Java, C++ wide characters, … Most of the world’s alphabets, plus symbols UTF-8, UTF-16: variable-length encodings §2.9 Communicating with People
Chapter 2 — Instructions: Language of the Computer — 22 Byte/Halfword Operations Could use bitwise operations MIPS byte/halfword load/store String processing is a common case lb rt, offset(rs) lh rt, offset(rs) Sign extend to 32 bits in rt lbu rt, offset(rs) lhu rt, offset(rs) Zero extend to 32 bits in rt sb rt, offset(rs) sh rt, offset(rs) Store just rightmost byte/halfword
Chapter 2 — Instructions: Language of the Computer — 23 String Copy Example C code (array-based version) Null-terminated string void strcpy (char x[], char y[]) { int i = 0; while ((x[i] = y[i]) != '\0') i++; } Addresses of x, y in $a0, $a1 i in $s0
Chapter 2 — Instructions: Language of the Computer — 24 String Copy Example MIPS code: strcpy: addi $sp, $sp, -4 # adjust stack for 1 item sw $s0, 0($sp) # save $s0 add $s0, $zero, $zero # i = 0 L1: add $t1, $s0, $a1 # addr of y[i] in $t1 lbu $t2, 0($t1) # $t2 = y[i] add $t3, $s0, $a0 # addr of x[i] in $t3 sb $t2, 0($t3) # x[i] = y[i] beq $t2, $zero, L2 # exit loop if y[i] == 0 addi $s0, $s0, 1 # i = i + 1 j L1 # next iteration of loop L2: lw $s0, 0($sp) # restore saved $s0 addi $sp, $sp, 4 # pop 1 item from stack jr $ra # and return
Chapter 2 — Instructions: Language of the Computer — 25 String Copy Example C code, pointer-based version void strcpy (char *x, char *y) { while ((*x++ = *y++) != '\0') { } } A good optimizing compiler may generate the same, efficient code for both versions (see next)
Strcpy: Optimized Version strcpy: # reg: x in $a0, y in $a1, *y in $t0 Loop: lbu $t0, 0($a1) # load *y sb $t0, 0($a0)# store to *x addi $a0, $a0, 1# x++ addi $a1, $a1, 1# y++ bne $t0, $zero, Loop# *y != 0? jr $ra # return 5 vs. 7 instructions in the loop 6 vs. 13 instructions in the function Chapter 1 — Computer Abstractions and Technology — 26
Chapter 2 — Instructions: Language of the Computer — 27 Arrays vs. Pointers Array indexing involves Multiplying index by element size Adding to array base address Pointers correspond directly to memory addresses Can avoid indexing complexity §2.14 Arrays versus Pointers
Another Example Clear an array, Array access clear1(int array[], int size) { int i; for (i = 0; i < size; i++) { array[i] = 0; } Chapter 1 — Computer Abstractions and Technology — 28
Array Access MIPS Code # array in $a0, size in $a1 clear1: move $t0,$zero # i = 0 loop1: sll $t1, $t0, 2 # $t1 = i * 4 add $t2, $a0, $t1 # $t2 = &array[i] sw $zero, 0($t2) # array[i] = 0 addi $t0, $t0, 1 # i = i + 1 slt $t3, $t0, $a1 # $t3 = (i < size) bne $t3, $zero, loop1 # if true, repeat Chapter 1 — Computer Abstractions and Technology — 29
Pointer Access Clear an array, array access clear2(int *array, int size) { int *p; for (p = array; p < array + size; p++) { *p = 0; } Chapter 1 — Computer Abstractions and Technology — 30
Pointer Access MIPS Code clear2: move $t0, $a0 # p = array sll $t1, $a1, 2 # $t1 = size * 4 add $t2, $a0, $t1 # $t2 = &array[size] j loop2_cond loop2: sw $zero, 0($t0) # *p = 0 addi $t0, $t0, 4 # p++ loop2_cond: slt $t3, $t0, $t2 # p < &array[size]? bne $t3, $zero, loop2 $jr $ra Chapter 1 — Computer Abstractions and Technology — 31
Chapter 2 — Instructions: Language of the Computer — 32 Comparison of Array vs. Ptr Multiply “strength reduced” to shift Array version requires shift to be inside loop Part of index calculation for incremented i c.f. incrementing pointer Compiler can achieve same effect as manual use of pointers Induction variable elimination Better to make program clearer and safer
For-Loop Example Calculate the sum of array int array_sum(int X[], int size) { int sum = 0; for (int i = 0; i < size; i++) sum += X[i]; return sum; } Chapter 1 — Computer Abstractions and Technology — 33
FOR Loop Control and Data Flow Graph Linear Code Layout (Optional: prologue and epilogue) 34 For-body Incr-expr Branch if true For-body Cond T Init-expr Incr-expr F Init-expr Jump Test cond
For-Loop MIPS Code # X in $a0, size in $a1, return in $v0 array_sum: add $v0, $zero, $zero # sum = 0 add $t0, $zero, $zero # i = 0 j for_cond for_loop: sll $t1, $t0, 2 # $t1 = i*4 add $t1, $a0, $t1 # $t1 = &X[i] lw $t1, 0($t1) # $t1 = X[i] add $v0, $v0, $t1 # sum += X[i] addi $t0, $t0, 1 # i++ for_cond: slt $t1, $t0, $a1 # i < size? bne $t1, $zero, for_loop # if true, repeat jr $ra Chapter 1 — Computer Abstractions and Technology — 35
For-Loop: Pointer Version Calculate the sum of array int array_sum(int X[], int size) { int *p, sum = 0; for (p = X; p < &X[size]; p++) sum += *p; return sum; } Again, do not write pointer version for performance – A good compiler will take care of it. Chapter 1 — Computer Abstractions and Technology — 36
Optimized MIPS Code # X in $a0, size in $a1, return in $v0 array_sum: add $v0, $zero, $zero # sum = 0 add $t0, $a0, $zero # p = X sll $a1, $a1, 2 # $a1 = 4*size add $a1, $a0, $a1 # $a1 = &X[size] j for_cond for_loop: lw $t1, 0($t0) # $t1 = *p add $v0, $v0, $t1 # sum += *p addi $t0, $t0, 4 # p++ for_cond: slt $t1, $t0, $a1 # p < &X[size]? bne $t1, $zero, for_loop # if true, repeat jr $ra Chapter 1 — Computer Abstractions and Technology — 37
Chapter 2 — Instructions: Language of the Computer — bit Constants Most constants are small 16-bit immediate is sufficient For the occasional 32-bit constant lui rt, constant Copies 16-bit constant to left 16 bits of rt Clears right 16 bits of rt to 0 lhi $s0, ori $s0, $s0, 2304 §2.10 MIPS Addressing for 32-Bit Immediates and Addresses
32-bit Constants Translate C to MIPS f = 0x ; # Assume f in $s0 lui $s0, 0x1020 ori $s0, $s0, 0x3040 Chapter 1 — Computer Abstractions and Technology — 39
32-bit Constants Load a big value in MIPS int *p = array; # assume p in $s0 la $s0, array MIPS assembly supports pseudo instruction “la”, equivalent to lui $s0, upper_of_array ori $s0, $s0, lower_of_array The assembler decides the value for upper_of_array and lower_of_array Chapter 1 — Computer Abstractions and Technology — 40
Shift Instructions sllrd, rt, shamt; shift left logic Ex:sll$s0, $s0, 4; sll by 4 bits sllvrd, rt, rs; SLL variable Ex:sllv$s0, $s0, $t0; ssl by $t0 bits Source: textbook B-55, B56 Chapter 1 — Computer Abstractions and Technology — rtrdshamt0 6 bits 5 bits 0rsrtrd04 6 bits 5 bits
Shift Instructions Other shift instructions srlrd, rt, shamt# shift right logic srlvrd, rt, rs# SRL varaible srard, rt, shamt# shift right arithmetic sravrd, rt, rs# SRA variable Chapter 1 — Computer Abstractions and Technology — 42
Chapter 2 — Instructions: Language of the Computer — 43 Branch Addressing Branch instructions specify Opcode, two registers, target address Most branch targets are near branch Forward or backward oprsrtconstant or address 6 bits5 bits 16 bits PC-relative addressing Target address = PC + offset × 4 PC already incremented by 4 by this time
Chapter 2 — Instructions: Language of the Computer — 44 Jump Addressing Jump ( j and jal ) targets could be anywhere in text segment Encode full address in instruction opaddress 6 bits 26 bits (Pseudo)Direct jump addressing Target address = PC 31…28 : (address × 4)
Chapter 2 — Instructions: Language of the Computer — 45 Target Addressing Example Loop code from earlier example Assume Loop at location Loop: sll $t1, $s3, add $t1, $t1, $s lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, j Loop Exit: … 80024
Chapter 2 — Instructions: Language of the Computer — 46 Branching Far Away If branch target is too far to encode with 16-bit offset, assembler rewrites the code Example beq $s0,$s1, L1 ↓ bne $s0,$s1, L2 j L1 L2:…
Chapter 2 — Instructions: Language of the Computer — 47 Addressing Mode Summary