Morgan Kaufmann Publishers Dr. Zhao Zhang Iowa State University


1 CprE 381 Computer Organization and Assembly Level Programming, Fall 2013
Exam 1 Review
Dr. Zhao Zhang, Iowa State University
Morgan Kaufmann Publishers, April 27, 2017
Chapter 1 — Computer Abstractions and Technology

2 What We Have Learned
Ch. 1: Computer Abstraction and Technology
Technology trends
CPU performance
Instruction count, CPI, and cycle time
Processor power efficiency
Processor manufacturing and cost

3 Question Styles and Coverage
Short conceptual questions
Calculation questions
    Performance improvement (speedup)
    Power rate and energy saving
    CPU time, CPI, instruction count, cycle time
CPU time = # Cycles × CT = IC × CPI × CT
Speedup = Old Time / New Time
The coverage excludes manufacturing and cost

4 Question 1
A MIPS processor runs at 1.0 GHz, and for a given benchmark program its CPI is 1.5. A design optimization will improve the clock rate to 1.5 GHz and increase the CPI to 1.8. What is the speedup from the optimization?
Instruction count remains the same
Clock rate change: 1.5/1.0 = 1.5x
    Cycle time improvement factor is 1.50x
CPI change: 1.8/1.5 = 1.2x
    Improvement factor is 1/1.2 = 0.83x (degradation)
Overall performance improvement is 1.50 × 0.83 = 1.25x
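A minimal C sketch of the Question 1 arithmetic, using CPU time = IC × CPI × CT with CT = 1 / clock rate. The function names and the instruction count of 1e9 are assumptions made for illustration; the slide only states that the instruction count stays the same, so any positive value cancels out.

```c
#include <assert.h>

/* CPU time in seconds: IC x CPI x CT, where CT = 1 / clock rate. */
double cpu_time(double instr_count, double cpi, double clock_hz) {
    assert(clock_hz > 0.0);
    return instr_count * cpi / clock_hz;
}

/* Speedup = old time / new time. */
double speedup(double old_time, double new_time) {
    assert(new_time > 0.0);
    return old_time / new_time;
}

/* Question 1: 1.0 GHz at CPI 1.5 versus 1.5 GHz at CPI 1.8. */
double question1_speedup(void) {
    const double ic = 1e9;                    /* assumed; not given on the slide */
    double old_t = cpu_time(ic, 1.5, 1.0e9);  /* before the optimization */
    double new_t = cpu_time(ic, 1.8, 1.5e9);  /* after the optimization */
    return speedup(old_t, new_t);
}
```

Calling question1_speedup() should give about 1.25, matching the hand calculation.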

5 Question 2
A processor spends 60% of its time on load/store instructions. A new design improves load/store performance by 2.0 times. What is the overall performance improvement?
Amdahl’s Law: Speedup = 1/((1-f)+f/s)
    f: Fraction of time that the optimization applies to
    s: The improvement factor of the optimization
Speedup = 1/((1-0.6) + 0.6/2.0) = 1/0.7 = 1.43
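Amdahl's Law as stated above can be checked with a short C sketch. The function name amdahl_speedup is an assumption for illustration; f and s are the fraction and improvement factor defined on the slide.

```c
#include <assert.h>

/* Amdahl's Law: overall speedup when a fraction f of the time
   is improved by a factor s and the rest is unchanged. */
double amdahl_speedup(double f, double s) {
    assert(f >= 0.0 && f <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - f) + f / s);
}
```

For Question 2, amdahl_speedup(0.6, 2.0) evaluates 1/0.7, about 1.43. Note that even an unbounded s cannot push the result past 1/(1-f) = 2.5 here.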

6 What We Have Learned
Ch. 2, Instructions: Language of the Computer
Instruction set architecture
MIPS binary instruction format
Plus floating-point instructions

7 Question 3
Translate the following C statement into MIPS. Variables f, g, h are global and located at 100($gp), 104($gp) and 108($gp), respectively.
extern int f, g, h;
f = g + 4 * h;
Try to predict how many instructions you will need.

8 Question 3
# Load g, load h, multiply, add, store
lw  $t0, 104($gp)     # load g
lw  $t1, 108($gp)     # load h
sll $t1, $t1, 2       # 4*h
add $t0, $t0, $t1     # g+4*h
sw  $t0, 100($gp)     # store f

9 Exam Strategy
In your exam, write comments with the MIPS code
    It helps you write the code
    It helps the grader understand your code
You may get more partial credit
    In case your code is not 100% correct

10 Load and Store
Three factors: address, size and extension
Load/store word: lw, sw
Half word: lh, lhu, sh
Byte: lb, lbu, sb
Choose sign extension or zero extension when loading a half word or a byte
Floating-point load and store
    Single precision: lwc1, swc1
    Double precision: ldc1, sdc1
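The difference between sign extension (lb, lh) and zero extension (lbu, lhu) can be sketched in C. This only models how the loaded bits are widened to 32 bits, not the memory access itself; the function names are assumptions for illustration.

```c
#include <stdint.h>

/* Like lb: widen 8 bits to 32 bits, copying bit 7 into the upper bits. */
int32_t sign_extend_byte(uint8_t b) {
    return ((int32_t)b ^ 0x80) - 0x80;
}

/* Like lbu: widen 8 bits to 32 bits, filling the upper bits with zeros. */
uint32_t zero_extend_byte(uint8_t b) {
    return (uint32_t)b;
}

/* Like lh: widen 16 bits to 32 bits, copying bit 15 into the upper bits. */
int32_t sign_extend_half(uint16_t h) {
    return ((int32_t)h ^ 0x8000) - 0x8000;
}

/* Like lhu: widen 16 bits to 32 bits, filling the upper bits with zeros. */
uint32_t zero_extend_half(uint16_t h) {
    return (uint32_t)h;
}
```

The byte 0xF0 becomes -16 under sign extension and 240 under zero extension, which is why signed C types pair with lb/lh and unsigned types with lbu/lhu.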

11 Array Access
Load from an array element
extern unsigned short X[];
h = X[i];
Assume h in $s2, X in $s0, i in $s1.
sll $t0, $s1, 1       # $t0=i*2
add $t0, $s0, $t0     # $t0=&X[i]
lhu $s2, 0($t0)       # h=X[i]

12 Array Access
Store to an array element
extern int Y[];
Y[j] = g;
Assume g in $s2, Y in $s0, j in $s1.
sll $t0, $s1, 2       # $t0=j*4
add $t0, $s0, $t0     # $t0=&Y[j]
sw  $s2, 0($t0)       # Y[j]=g

13 Array Access
Load and store floating point numbers
extern double X[], Y[];
Y[i] = X[i];
Assume i in $s0, X in $a0, Y in $a1
sll  $t0, $s0, 3      # $t0=8*i
add  $t1, $a0, $t0    # $t1=&X[i]
ldc1 $f0, 0($t1)      # $f0:$f1=X[i]
add  $t1, $a1, $t0    # $t1=&Y[i]
sdc1 $f0, 0($t1)      # Y[i]=$f0:$f1
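The i*2, j*4, and 8*i scaling in the array access slides follows one rule: element address = base + index × element size, where a power-of-two size becomes a left shift. A rough C sketch, with hypothetical function names and the assumptions that addresses fit in uintptr_t and that double is 8 bytes:

```c
#include <stddef.h>
#include <stdint.h>

/* Byte address of element `index` when each element is 2^log2_size bytes.
   Mirrors the sll (scale the index) followed by add (add the base). */
uintptr_t element_address(uintptr_t base, size_t index, unsigned log2_size) {
    return base + ((uintptr_t)index << log2_size);
}

/* Cross-check against the compiler's own indexing for a double array.
   Assumes sizeof(double) == 8, which holds on common platforms. */
int matches_c_indexing(void) {
    double X[4] = {0.0, 0.0, 0.0, 0.0};
    return element_address((uintptr_t)X, 2, 3) == (uintptr_t)&X[2];
}
```

The shift amount is 1 for short, 2 for int, and 3 for double.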

14 16-bit and 32-bit Constants
Load a 16-bit immediate
f = 0x1000;           // f in $s0
addi $s0, $zero, 0x1000
Load a 32-bit immediate
f = 0xFFFF1000;
lui $s0, 0xFFFF
ori $s0, $s0, 0x1000
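A small C model of the lui/ori pair for the 32-bit constant above. The model_* names are assumptions for illustration and only mimic the arithmetic of each instruction. The sketch also shows why ori is the usual choice for the lower half: addi sign-extends its immediate, so a lower half with bit 15 set would subtract from the upper half.

```c
#include <stdint.h>

/* lui: place the 16-bit immediate in the upper half, zero the lower half. */
uint32_t model_lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* ori: OR with the zero-extended 16-bit immediate. */
uint32_t model_ori(uint32_t rs, uint16_t imm) {
    return rs | (uint32_t)imm;
}

/* addi: add the sign-extended 16-bit immediate (overflow trap not modeled). */
uint32_t model_addi(uint32_t rs, uint16_t imm) {
    int32_t extended = ((int32_t)imm ^ 0x8000) - 0x8000;
    return rs + (uint32_t)extended;
}

/* lui followed by ori builds any 32-bit constant from two halves. */
uint32_t load_const32(uint16_t upper, uint16_t lower) {
    return model_ori(model_lui(upper), lower);
}
```

load_const32(0xFFFF, 0x1000) gives 0xFFFF1000. With a lower half of 0x8000, model_ori keeps the upper half intact while model_addi lowers it by one.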

15 Pointer Access
int h, *p;
Assume h in $t0, p in $s0.
h = *p;
lw $t0, 0($s0)        # h = *p
*p = h;
sw $t0, 0($s0)        # *p = h

16 Branches
Only two branches in the original MIPS
    beq rs, rt, label
    bne rs, rt, label
Branch if true/non-zero
    bne rs, $zero, label
Branch if false/zero
    beq rs, $zero, label

17 If Statement
Evaluate condition, branch if false
if (a < 0)
    a = -a;
Assume a in $s0
    slt $t0, $s0, $zero     # a < 0?
    beq $t0, $zero, endif   # false? skip
    sub $s0, $zero, $s0     # a = -a
endif:

18 If-else Structure
Evaluate condition, branch if false
if (a > b)
    max = a;
else
    max = b;
Assume max in $s2, a in $s0, b in $s1
    slt $t0, $s1, $s0       # b < a
    beq $t0, $zero, else    # false?
    add $s2, $s0, $zero     # max = a
    j   endif
else:
    add $s2, $s1, $zero     # max = b
endif:

19 FOR Loop
[Figure: two panels. Control and Data Flow Graph: Init-expr, Cond (T/F), For-body, Incr-expr. Linear Code Layout, listed below.]
Linear Code Layout
(Optional: prologue and epilogue)
    Init-expr
    Jump
    For-body
    Incr-expr
    Test cond
    Branch if true

20 Function with For-loop
Translate the following C function into MIPS
short checksum(short X[], int N)
{
    int i;
    short checksum = 0;
    for (i = 0; i < N; i++)
        checksum = checksum ^ X[i];
    return checksum;
}

21 Function with For-loop
checksum:
    # X=>$a0, N=>$a1, i=>$t0,
    # checksum=>$v0
    addi $v0, $zero, 0      # checksum = 0
    addi $t0, $zero, 0      # i = 0
    j    loop_cond
loop:
    sll  $t1, $t0, 1        # i*2
    add  $t1, $a0, $t1      # &X[i]
    lh   $t1, 0($t1)        # load X[i]
    xor  $v0, $v0, $t1      # checksum ^= X[i]
    addi $t0, $t0, 1        # i++
loop_cond:
    slt  $t1, $t0, $a1      # i < N
    bne  $t1, $zero, loop   # loop
    jr   $ra
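To see how the for-loop maps onto the MIPS layout (init, jump to the test, body, increment, test, branch if true), the same function can be written in C with explicit gotos. This is only an illustrative rewrite with a hypothetical name, checksum_goto, not code from the slides; the labels match the MIPS labels.

```c
/* checksum() rewritten so each C statement lines up with one step
   of the linear code layout used in the MIPS translation. */
short checksum_goto(const short X[], int N) {
    int i;
    short checksum = 0;                    /* checksum = 0 */
    i = 0;                                 /* Init-expr */
    goto loop_cond;                        /* Jump to the test */
loop:
    checksum = (short)(checksum ^ X[i]);   /* For-body */
    i++;                                   /* Incr-expr */
loop_cond:
    if (i < N) goto loop;                  /* Test cond, branch if true */
    return checksum;
}
```

Placing the test at the bottom means each iteration executes a single branch, and the initial jump handles the case where N is zero or negative.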

22 Leaf and Non-Leaf Functions
Leaf function doesn’t call another function
    Stack frame is not necessary
    Prefer to use temp registers (t-registers)
Non-leaf function calls some other function(s)
    Must use a stack frame, has to save $ra
    Usually has to use saved registers (s-registers)

23 Non-Leaf Function
What is the size of the frame?
extern short xor(short, short);
short checksum(short X[], int N)
{
    int i;
    short checksum = 0;
    for (i = 0; i < N; i++)
        checksum = xor(checksum, X[i]);
    return checksum;
}

24 Non-Leaf Function
X, N, i, and $ra must be preserved
Need a stack frame of 16 bytes
    addi $sp, $sp, -16
    sw   $ra, 12($sp)       # for return address
    sw   $s2, 8($sp)
    sw   $s1, 4($sp)
    sw   $s0, 0($sp)
    add  $s0, $a0, $zero    # $s0 = X
    add  $s1, $a1, $zero    # $s1 = N
    addi $s2, $zero, 0      # i = 0

25 Non-Leaf Function
    …                       # function body
    lw   $s0, 0($sp)
    lw   $s1, 4($sp)
    lw   $s2, 8($sp)
    lw   $ra, 12($sp)
    addi $sp, $sp, 16
    jr   $ra

26 Register Name and Call Convention
Name      Number  Use                                                     Preserved?
$zero     0       Constant value 0                                        N/A
$at       1       Assembler temporary                                     No
$v0-$v1   2-3     Values for function results and expression evaluation   No
$a0-$a3   4-7     Arguments                                               No
$t0-$t7   8-15    Temporaries                                             No
$s0-$s7   16-23   Saved temporaries                                       Yes
$t8-$t9   24-25   Temporaries                                             No
$k0-$k1   26-27   Saved for OS kernel                                     No
$gp       28      Global pointer                                          Yes
$sp       29      Stack pointer                                           Yes
$fp       30      Frame pointer                                           Yes
$ra       31      Return address                                          Yes

27 MIPS Call Convention: FP
The first two FP parameters in registers
    1st parameter in $f12 or $f12:$f13
        A double-precision parameter takes two registers
    2nd FP parameter in $f14 or $f14:$f15
    Extra parameters in stack
$f0 stores single-precision FP return value
$f0:$f1 stores double-precision FP return value
$f0-$f19 are FP temporary registers
$f20-$f31 are FP saved temporary registers

28 FP Example: Call a Function
extern double a, b, c;
extern double max(double, double);
c = max(a, b);
Assume a, b, c assigned to 100($gp), 108($gp), and 116($gp)
    ldc1 $f12, 100($gp)     # $f12:$f13 = a
    ldc1 $f14, 108($gp)     # $f14:$f15 = b
    jal  max
    sdc1 $f0, 116($gp)      # c = $f0:$f1

29 FP Instructions in MIPS
Single-precision arithmetic
    add.s, sub.s, mul.s, div.s
    e.g., add.s $f0, $f1, $f6
Double-precision arithmetic
    add.d, sub.d, mul.d, div.d
    e.g., mul.d $f4, $f4, $f6
Chapter 3 — Arithmetic for Computers

30 FP Instructions in MIPS
Single- and double-precision comparison
    c.xx.s, c.xx.d (xx is eq, lt, le, …)
    Sets or clears FP condition-code bit
    e.g., c.lt.s $f3, $f4
Branch on FP condition code true or false
    bc1t, bc1f
    e.g., bc1t TargetLabel

