EI 209 Computer Organization, Fall 2015 (CSE, 2015). Chapter 3: Arithmetic for Computers. Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/)


1 EI 209 Chapter 3.1, CSE, 2015. EI 209 Computer Organization, Fall 2015. Chapter 3: Arithmetic for Computers. Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/) [Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK]

2 Review: MIPS (RISC) Design Principles
- Simplicity favors regularity
  - fixed-size instructions
  - small number of instruction formats
  - opcode always the first 6 bits
- Smaller is faster
  - limited instruction set
  - limited number of registers in the register file
  - limited number of addressing modes
- Make the common case fast
  - arithmetic operands from the register file (load-store machine)
  - allow instructions to contain immediate operands
- Good design demands good compromises
  - three instruction formats

3 Specifying Branch Destinations
- Use a register (as in lw and sw) added to the 16-bit offset
  - which register? The Instruction Address Register (the PC)
    - its use is automatically implied by the instruction
    - the PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
  - limits the branch distance to -2^15 to +2^15-1 (word) instructions from the instruction after the branch instruction, but most branches are local anyway
[Figure: the low-order 16 bits of the branch instruction are sign-extended to 32 bits, shifted left by two (00 appended), and added to PC+4 to form the branch destination address.]
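The branch-address datapath on this slide can be sketched in a few lines of Python. This is only an illustration under the slide's description (sign-extend the 16-bit field, append 00, add to PC+4); the function names `sign_extend` and `branch_target` are made up for the example.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def branch_target(pc: int, offset16: int) -> int:
    """PC-relative target: (PC + 4) + sign_extend(offset) * 4, wrapped to 32 bits."""
    word_offset = sign_extend(offset16, 16)
    return (pc + 4 + (word_offset << 2)) & 0xFFFFFFFF


# A forward branch over three instructions, and a branch back to itself.
print(hex(branch_target(0x00400000, 0x0003)))
print(hex(branch_target(0x00400010, 0xFFFF)))
```

An offset of 0xFFFF is -1 words, so the second call lands back on the branch instruction itself.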

4 Other Control Flow Instructions
- MIPS also has an unconditional branch instruction, or jump instruction:
    j label    # go to label
- Instruction format (J format): opcode 0x02 | 26-bit address
[Figure: the low-order 26 bits of the jump instruction are shifted left by two (00 appended) and concatenated with the upper 4 bits of PC+4 to form the 32-bit jump destination address.]
- Why shift left by two bits?
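As a rough sketch of the jump-address formation described here (the name `jump_target` is an assumption for illustration): instructions are word-aligned, so the two low address bits are always 00 and need not be stored, which is why the 26-bit field is shifted left by two.

```python
def jump_target(pc: int, addr26: int) -> int:
    """Pseudo-direct address: upper 4 bits of PC+4, then the 26-bit field, then 00."""
    upper = (pc + 4) & 0xF0000000          # top 4 bits come from PC+4
    return upper | ((addr26 & 0x03FFFFFF) << 2)


print(hex(jump_target(0x00400000, 0x0100000)))
```

With the shift, the 26-bit field reaches any instruction inside the current 256 MB (2^28 byte) region.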

5 Review: MIPS Addressing Modes Illustrated
1. Register addressing: op | rs | rt | rd | funct; the operand is a register word
2. Base (displacement) addressing: op | rs | rt | offset; base register plus offset addresses a memory word or byte operand
3. Immediate addressing: op | rs | rt | operand
4. PC-relative addressing: op | rs | rt | offset; Program Counter (PC) plus offset addresses the branch destination instruction in memory
5. Pseudo-direct addressing: op | jump address; the jump address concatenated (||) with the upper bits of the Program Counter (PC) addresses the jump destination instruction in memory

6 Number Representations
- 32-bit signed numbers (2's complement), MSB on the left and LSB on the right:
    0000 0000 0000 0000 0000 0000 0000 0000 two = 0 ten
    0000 0000 0000 0000 0000 0000 0000 0001 two = +1 ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110 two = +2,147,483,646 ten
    0111 1111 1111 1111 1111 1111 1111 1111 two = +2,147,483,647 ten   (maxint)
    1000 0000 0000 0000 0000 0000 0000 0000 two = -2,147,483,648 ten   (minint)
    1000 0000 0000 0000 0000 0000 0000 0001 two = -2,147,483,647 ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1110 two = -2 ten
    1111 1111 1111 1111 1111 1111 1111 1111 two = -1 ten
- Converting <32-bit values into 32-bit values
  - copy the most significant bit (the sign bit) into the "empty" bits
      0010 -> 0000 0010
      1010 -> 1111 1010
  - sign extend versus zero extend (lb vs. lbu)
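The difference between sign extension (as lb does) and zero extension (as lbu does) can be illustrated with a small sketch. The helper names `zero_extend_to_32` and `sign_extend_to_32` are assumptions made for this example, and results are shown as unsigned 32-bit patterns.

```python
def zero_extend_to_32(value: int, from_bits: int) -> int:
    """Fill the upper bits with 0 (what lbu does for a byte)."""
    return value & ((1 << from_bits) - 1)


def sign_extend_to_32(value: int, from_bits: int) -> int:
    """Copy the sign bit into the upper bits (what lb does for a byte)."""
    value &= (1 << from_bits) - 1
    if (value >> (from_bits - 1)) & 1:
        value |= (0xFFFFFFFF << from_bits) & 0xFFFFFFFF
    return value


byte = 0xFA                                   # 1111 1010
print(hex(sign_extend_to_32(byte, 8)))        # upper bits filled with 1s
print(hex(zero_extend_to_32(byte, 8)))        # upper bits filled with 0s
```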

7 MIPS Arithmetic Logic Unit (ALU)
- Must support the arithmetic/logic operations of the ISA
    add, addi, addiu, addu
    sub, subu
    mult, multu, div, divu
    sqrt
    and, andi, nor, or, ori, xor, xori
    beq, bne, slt, slti, sltiu, sltu
- With special handling for
  - sign extend: addi, addiu, slti, sltiu
  - zero extend: andi, ori, xori
  - overflow detection: add, addi, sub
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs.]

8 Dealing with Overflow
- Overflow occurs when the result of an operation cannot be represented in 32 bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit
- When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

    Operation | Operand A | Operand B | Result indicating overflow
    A + B     | ≥ 0       | ≥ 0       | < 0
    A + B     | < 0       | < 0       | ≥ 0
    A - B     | ≥ 0       | < 0       | < 0
    A - B     | < 0       | ≥ 0       | ≥ 0

- MIPS signals overflow with an exception (aka interrupt): an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception
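The sign rules in the overflow table can be checked with a short sketch. Operands are treated as raw 32-bit patterns, and the names `add_overflows` and `sub_overflows` are invented for illustration; this models only the detection, not the exception mechanism.

```python
MASK = 0xFFFFFFFF


def sign(x: int) -> int:
    """Sign bit (bit 31) of a 32-bit pattern."""
    return (x >> 31) & 1


def add_overflows(a: int, b: int) -> bool:
    """Overflow on A + B: operands share a sign and the result's sign differs."""
    result = (a + b) & MASK
    return sign(a) == sign(b) and sign(result) != sign(a)


def sub_overflows(a: int, b: int) -> bool:
    """Overflow on A - B: operand signs differ and the result's sign differs from A's."""
    result = (a - b) & MASK
    return sign(a) != sign(b) and sign(result) != sign(a)


print(add_overflows(0x7FFFFFFF, 1))           # maxint + 1
print(sub_overflows(0x80000000, 1))           # minint - 1
print(add_overflows(0x7FFFFFFF, 0xFFFFFFFF))  # maxint + (-1), different signs
```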

9 Addition & Subtraction
- Just like in grade school (carry/borrow 1s)
      0111      0111      0110
    + 0110    - 0110    - 0101
- Two's complement operations are easy
  - do subtraction by negating and then adding
      0111   ->   0111
    - 0110      + 1010
- Overflow (result too large for finite computer word)
  - e.g., adding two n-bit numbers does not yield an n-bit number
      0111
    + 0001

10 Addition & Subtraction
- Just like in grade school (carry/borrow 1s)
      0111      0111      0110
    + 0110    - 0110    - 0101
      1101      0001      0001
- Two's complement operations are easy
  - do subtraction by negating and then adding
      0111   ->   0111
    - 0110      + 1010
      0001      1 0001
- Overflow (result too large for finite computer word)
  - e.g., adding two n-bit numbers does not yield an n-bit number
      0111
    + 0001
      1000

11 Building a 1-bit Binary Adder
[Figure: 1-bit full adder with inputs A, B, carry_in and outputs S, carry_out.]
    S = A xor B xor carry_in
    carry_out = A&B | A&carry_in | B&carry_in    (majority function)

    A  B  carry_in | carry_out  S
    0  0  0        | 0          0
    0  0  1        | 0          1
    0  1  0        | 0          1
    0  1  1        | 1          0
    1  0  0        | 0          1
    1  0  1        | 1          0
    1  1  0        | 1          0
    1  1  1        | 1          1
- How can we use it to build a 32-bit adder?
- How can we modify it easily to build an adder/subtractor?
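The two full-adder equations translate directly into bitwise operators. The sketch below (function name `full_adder` chosen for illustration) prints the truth table in the slide's column order A, B, carry_in, carry_out, S.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (S, carry_out) for three 1-bit inputs."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)  # majority function
    return s, carry_out


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, cout = full_adder(a, b, c)
            print(a, b, c, cout, s)
```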

12 Building 32-bit Adder
[Figure: 32 1-bit full adders (FA) in a chain; FA i takes A_i, B_i and c_i and produces S_i and c_(i+1), with c_0 = carry_in and c_32 = carry_out.]
- Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect...
- Ripple Carry Adder (RCA)
  - advantage: simple logic, so small (low cost)
  - disadvantage: slow and lots of glitching (so lots of energy consumption)

13 A 32-bit Ripple Carry Adder/Subtractor
- Remember 2's complement is just
  - complement all the bits
  - add a 1 in the least significant bit
      A   0111   ->   0111
      B - 0110      +
[Figure: the 32-bit ripple carry adder with an add/sub control line (0=add, 1=sub). Each B_i passes through a selector: B_i if control = 0, !B_i if control = 1. The control line also drives c_0 = carry_in.]

14 A 32-bit Ripple Carry Adder/Subtractor
- Remember 2's complement is just
  - complement all the bits
  - add a 1 in the least significant bit
      A   0111   ->   0111
      B - 0110      + 1001
                         1   (carry_in)
          0001      1 0001
[Figure: the 32-bit ripple carry adder with an add/sub control line (0=add, 1=sub). Each B_i passes through a selector: B_i if control = 0, !B_i if control = 1. The control line also drives c_0 = carry_in.]
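A bit-level sketch of the adder/subtractor follows. It assumes the wiring described on the slide (each B bit is inverted when control = 1, and the control line feeds the first carry-in); the names `add_sub` and `width` are illustrative only.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (S, carry_out) for three 1-bit inputs."""
    return a ^ b ^ carry_in, (a & b) | (a & carry_in) | (b & carry_in)


def add_sub(a: int, b: int, control: int, width: int = 32) -> tuple[int, int]:
    """Ripple carry adder/subtractor: control 0 = add, 1 = subtract.

    Returns (result, carry_out of the most significant full adder).
    """
    carry = control                        # c_0: supplies the "+1" when subtracting
    result = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control     # B_i if control = 0, !B_i if control = 1
        s, carry = full_adder(a_i, b_i, carry)
        result |= s << i
    return result, carry


# The 4-bit example from the slide: 0111 - 0110
result, carry_out = add_sub(0b0111, 0b0110, control=1, width=4)
print(format(result, "04b"), carry_out)
```

The carry ripples through one full adder per bit, which is the source of the delay noted for the ripple carry design.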

15 Overflow Detection Logic
- Carry into MSB != Carry out of MSB
  - For an N-bit ALU: Overflow = CarryIn[N-1] XOR CarryOut[N-1]

    X  Y | X XOR Y
    0  0 | 0
    0  1 | 1
    1  0 | 1
    1  1 | 0
[Figure: four 1-bit ALUs chained from bit 0 to bit 3, with CarryIn0 = 0; Overflow is the XOR of CarryIn3 and CarryOut3.]
- why?
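One way to see why the XOR works: if both sign bits are 0, the carry out of the MSB is 0, and overflow happens exactly when a carry arrives and flips the result's sign to 1; if both sign bits are 1, the carry out is 1, and overflow happens exactly when no carry arrives; if the signs differ, the carry out equals the carry in and there is no overflow. The sketch below (names `add_with_overflow` and `width` are assumptions) tracks both carries in a small ripple adder.

```python
def add_with_overflow(a: int, b: int, width: int = 4) -> tuple[int, int]:
    """Ripple-add two width-bit patterns; return (result, overflow flag)."""
    carry = 0
    carry_into_msb = 0
    result = 0
    for i in range(width):
        if i == width - 1:
            carry_into_msb = carry         # CarryIn[N-1]
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        result |= (a_i ^ b_i ^ carry) << i
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry)
    return result, carry_into_msb ^ carry  # CarryIn[N-1] XOR CarryOut[N-1]


print(add_with_overflow(0b0111, 0b0001))   # 7 + 1 does not fit in 4 signed bits
print(add_with_overflow(0b1111, 0b0001))   # -1 + 1 fits
```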

16 Multiply
- Binary multiplication is just a bunch of right shifts and adds
[Figure: an n-bit multiplicand times an n-bit multiplier forms a partial product array that sums to a 2n-bit double precision product.]
- The partial products can be formed in parallel and added in parallel for faster multiplication

17 Multiplication
- More complicated than addition
  - Can be accomplished via shifting and adding
        0010    (multiplicand)
    x   1011    (multiplier)
        0010
       0010     (partial product
      0000       array)
     0010
    00010110    (product)
- In every step
  - multiplicand is shifted
  - next bit of multiplier is examined (also a shifting step)
  - if this bit is 1, shifted multiplicand is added to the product

18 Multiplication Algorithm 1
- In every step
  - multiplicand is shifted
  - next bit of multiplier is examined (also a shifting step)
  - if this bit is 1, shifted multiplicand is added to the product
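The three steps above can be sketched as a loop. This is a software model under the assumption that the multiplicand lives in a 2n-bit register that shifts left while the multiplier shifts right; the name `multiply_v1` is invented for the example.

```python
def multiply_v1(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Shift-and-add multiplication of two unsigned n-bit values (2n-bit product)."""
    product = 0
    for _ in range(n):
        if multiplier & 1:                 # examine the next multiplier bit
            product += multiplicand        # add the shifted multiplicand
        multiplicand <<= 1                 # shift multiplicand left
        multiplier >>= 1                   # shift multiplier right
    return product & ((1 << (2 * n)) - 1)


# The 4-bit example from the previous slide: 0010 x 1011
print(format(multiply_v1(0b0010, 0b1011, n=4), "08b"))
```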

19 EI 209 Chapter 3.19CSE, 2015

20 Comments on Multiplication Algorithm 1
- Performance
  - Three basic steps for each bit
  - If each step took a clock cycle, it would require almost 100 clock cycles (3 x 32 = 96) to multiply two 32-bit numbers
- How to improve it?
  - Motivation: performing the operations in parallel
  - Putting the multiplier and the product together
  - Shift them together

21 Refined Multiplication Algorithm 2
[Figure: multiplicand register feeding a 32-bit ALU; a combined product/multiplier register with add and shift-right control.]
- 32-bit ALU, and the multiplicand is untouched
- the sum keeps shifting right
- at every step, number of bits in product + multiplier = 64; hence, they share a single 64-bit register

22 Add and Right Shift Multiplier Hardware
[Figure: multiplicand register feeding a 32-bit ALU; a combined product/multiplier register with add and shift-right control.]
Example: multiplicand 0110 (= 6), multiplier 0101 (= 5)

    step           product | multiplier register
    initial        0000 0101    (multiplier = 5 in the right half)
    add            0110 0101
    shift right    0011 0010
    add (0)        0011 0010
    shift right    0001 1001
    add            0111 1001
    shift right    0011 1100
    add (0)        0011 1100
    shift right    0001 1110    = 30
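The shared-register scheme can be modelled as follows. This is a sketch, assuming the multiplier starts in the right half of a 2n-bit register and the multiplicand is added into the left half; `multiply_v2` is a hypothetical name, and Python's unbounded integers stand in for the extra carry bit the hardware keeps.

```python
def multiply_v2(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Add-and-shift-right multiplication with a shared product/multiplier register."""
    mask = (1 << n) - 1
    product = multiplier & mask            # right half holds the multiplier
    for _ in range(n):
        if product & 1:                    # current multiplier bit
            product += (multiplicand & mask) << n   # add into the left half
        product >>= 1                      # product and multiplier shift together
    return product


# The 4-bit example from the slide: 0110 (6) x 0101 (5)
print(format(multiply_v2(0b0110, 0b0101, n=4), "08b"))
```

Compared with the first algorithm, the multiplicand never moves and only an n-bit addition is needed per step.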

23 Exercise
- Using 4-bit numbers to save space, multiply 2 ten * 3 ten, or 0010 two * 0011 two

24 Division
- Division is just a bunch of quotient digit guesses and left shifts and subtracts
    dividend = quotient x divisor + remainder
[Figure: a dividend and an n-bit divisor form a partial remainder array, producing an n-bit quotient and an n-bit remainder.]

25 Division
                      1001        Quotient (ten)
    Divisor 1000 | 1001010        Dividend (ten)
                  -1000
                      10
                      101
                      1010
                     -1000
                        10        Remainder (ten)
- At every step,
  - shift the divisor right and compare it with the current dividend
  - if the divisor is larger, shift 0 in as the next bit of the quotient
  - if the divisor is smaller or equal, subtract to get the new dividend and shift 1 in as the next bit of the quotient

26 First Version of Hardware for Division
- A comparison requires a subtract; the sign of the result is examined; if the result is negative, the divisor must be added back

27 Divide Algorithm
Start
1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.
Test Remainder:
  2a. Remainder >= 0: Shift the Quotient register to the left, setting the new rightmost bit to 1.
  2b. Remainder < 0: Restore the original value by adding the Divisor register to the Remainder register and placing the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.
3. Shift the Divisor register right 1 bit.
33rd repetition?
  No (< 33 repetitions): go back to step 1.
  Yes (33 repetitions): Done.
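The flowchart can be followed step by step in a short sketch. It assumes the divisor starts in the left half of a 2n-bit register and that the loop runs n + 1 times (33 for n = 32); the name `divide_v1` is made up for illustration.

```python
def divide_v1(dividend: int, divisor: int, n: int = 32) -> tuple[int, int]:
    """First-version restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor")
    remainder = dividend
    divisor_reg = divisor << n             # divisor in the left half
    quotient = 0
    for _ in range(n + 1):
        remainder -= divisor_reg           # step 1: subtract
        if remainder >= 0:
            quotient = (quotient << 1) | 1 # step 2a: shift in a 1
        else:
            remainder += divisor_reg       # step 2b: restore, shift in a 0
            quotient <<= 1
        divisor_reg >>= 1                  # step 3: shift divisor right
    return quotient & ((1 << n) - 1), remainder


# 7 / 2 with 4-bit values, as in the divide example that follows
print(divide_v1(7, 2, n=4))
```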

28 Divide Example
- Divide 7 ten (0000 0111 two) by 2 ten (0010 two)

    Iter | Step           | Quot | Divisor | Remainder
    0    | Initial values |      |         |
    1    |                |      |         |
    2    |                |      |         |
    3    |                |      |         |
    4    |                |      |         |
    5    |                |      |         |

29 Divide Example
- Divide 7 ten (0000 0111 two) by 2 ten (0010 two)

    Iter | Step                             | Quot | Divisor   | Remainder
    0    | Initial values                   | 0000 | 0010 0000 | 0000 0111
    1    | Rem = Rem - Div                  | 0000 | 0010 0000 | 1110 0111
         | Rem < 0 -> +Div, shift 0 into Q  | 0000 | 0010 0000 | 0000 0111
         | Shift Div right                  | 0000 | 0001 0000 | 0000 0111
    2    | Same steps as 1                  | 0000 | 0001 0000 | 1111 0111
         |                                  | 0000 | 0001 0000 | 0000 0111
         |                                  | 0000 | 0000 1000 | 0000 0111
    3    | Same steps as 1                  | 0000 | 0000 0100 | 0000 0111
    4    | Rem = Rem - Div                  | 0000 | 0000 0100 | 0000 0011
         | Rem >= 0 -> shift 1 into Q       | 0001 | 0000 0100 | 0000 0011
         | Shift Div right                  | 0001 | 0000 0010 | 0000 0011
    5    | Same steps as 4                  | 0011 | 0000 0001 | 0000 0001

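30 Efficient Division
[Figure: the first-version hardware (Divisor register with shift right, 64-bit ALU, Quotient register with shift left, Remainder register with write, and Control; register widths 32 and 64 bits) next to the refined hardware: divisor, 32-bit ALU, and a combined remainder (dividend)/quotient register with subtract and shift-left control.]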

31 Left Shift and Subtract Division Hardware
[Figure: divisor register feeding a 32-bit ALU; a combined remainder (dividend)/quotient register with subtract and shift-left control.]
Example: divisor 0010 (= 2), dividend 0110 (= 6)

    step          remainder | quotient register
    initial       0000 0110    (dividend = 6)
    shift left    0000 1100
    sub           1110 1100    rem neg, so quotient bit = 0
    restore       0000 1100    restore remainder
    shift left    0001 1000
    sub           1111 1000    rem neg, so quotient bit = 0
    restore       0001 1000    restore remainder
    shift left    0011 0000
    sub           0001 0001    rem pos, so quotient bit = 1
    shift left    0010 0010
    sub           0000 0011    rem pos, so quotient bit = 1
    result: quotient 0011 (= 3) with 0 remainder

32 Restoring Unsigned Integer Division
    s(0) = z
    for j = 1 to k
        if 2 s(j-1) - 2^k d ≥ 0
            q_(k-j) = 1
            s(j) = 2 s(j-1) - 2^k d
        else
            q_(k-j) = 0
            s(j) = 2 s(j-1)
- No need to restore the remainder in the case of R - D ≥ 0
- Restore the remainder in the case of R - D < 0
- The remainder is shifted left by 1 bit at every step
- k = 32; put the divisor in the left 32 bits of the register
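The recurrence can be run directly in software. The sketch below assumes z is the dividend, d the divisor, and k the word width, with z < 2^k d so the quotient fits in k bits; it accepts a trial difference of exactly zero as non-negative so that exact divisions produce a 1 bit. The name `restoring_divide` is invented for the example.

```python
def restoring_divide(z: int, d: int, k: int = 32) -> tuple[int, int]:
    """Restoring division via s(j) = 2 s(j-1) - q_(k-j) 2^k d; returns (q, remainder)."""
    if d == 0:
        raise ZeroDivisionError("software must check the divisor")
    s = z                                   # s(0) = z
    q = 0
    for j in range(1, k + 1):
        trial = 2 * s - (d << k)            # shift left, then trial subtraction
        if trial >= 0:
            q |= 1 << (k - j)               # q_(k-j) = 1, keep the difference
            s = trial
        else:
            s = 2 * s                       # q_(k-j) = 0, "restore" by not subtracting
    return q, s >> k                        # s(k) holds 2^k times the remainder


print(restoring_divide(6, 2, k=4))
print(restoring_divide(26, 5, k=8))
```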

33 Non-Restoring Unsigned Integer Division
    s(1) = 2 z - 2^k d
    for j = 2 to k
        if s(j-1) ≥ 0
            q_(k-(j-1)) = 1
            s(j) = 2 s(j-1) - 2^k d
        else
            q_(k-(j-1)) = 0
            s(j) = 2 s(j-1) + 2^k d
    end for
    if s(k) ≥ 0
        q_0 = 1
    else
        q_0 = 0
        Correction step (add 2^k d back to s(k))
- If in the last step remainder - divisor ≥ 0, perform subtraction
- If in the last step remainder - divisor < 0, perform addition
- why?
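A software sketch of the non-restoring recurrence follows, under the same assumptions as the restoring form (z < 2^k d, unsigned operands). The correction step is taken here to mean adding 2^k d back when the final partial remainder is negative; the name `nonrestoring_divide` is hypothetical.

```python
def nonrestoring_divide(z: int, d: int, k: int = 32) -> tuple[int, int]:
    """Non-restoring division; returns (quotient, remainder)."""
    if d == 0:
        raise ZeroDivisionError("software must check the divisor")
    dk = d << k                             # 2^k d
    s = 2 * z - dk                          # s(1)
    q = 0
    for j in range(2, k + 1):
        if s >= 0:
            q |= 1 << (k - (j - 1))         # previous subtraction succeeded
            s = 2 * s - dk
        else:
            s = 2 * s + dk                  # add instead of restoring
    if s >= 0:
        q |= 1                              # q_0 = 1
    else:
        s += dk                             # correction step, q_0 stays 0
    return q, s >> k


print(nonrestoring_divide(26, 5, k=8))
print(nonrestoring_divide(8, 3, k=4))       # ends with a negative s, so it is corrected
```

Each iteration performs exactly one addition or subtraction, whereas the restoring form may need a subtraction followed by a restoring addition.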

34 Restoring vs. Non-Restoring Unsigned Integer Division: equal. Why?
[The slide shows the restoring and the non-restoring pseudocode of the two preceding slides side by side.]
- Consider two consecutive steps j-1 and j, in particular the case 2 s(j-2) - 2^k d < 0
- In step j-1
  - the restoring algorithm computes q_(k-(j-1)) = 0 and s(j-1) = 2 s(j-2)
  - the non-restoring algorithm computes s(j-1) = 2 s(j-2) - 2^k d
- In the subsequent step j
  - the restoring algorithm computes 2 s(j-1) - 2^k d = 2*2 s(j-2) - 2^k d
  - the non-restoring algorithm computes 2 s(j-1) + 2^k d = 2*2 s(j-2) - 2*2^k d + 2^k d = 2*2 s(j-2) - 2^k d
- Both reach the same value, because 2x - y = 2(x - y) + y

35 Non-restoring algorithm
Set subtract_bit to true.
1. If subtract_bit is true: subtract the Divisor register from the Remainder and place the result in the Remainder register.
   Else: add the Divisor register to the Remainder and place the result in the Remainder register.
2. If Remainder >= 0: shift the Quotient register to the left, setting the rightmost bit to 1, and set subtract_bit to true.
   Else: shift the Quotient register to the left, setting the rightmost bit to 0, and set subtract_bit to false.
3. Shift the Divisor register right 1 bit.
   If fewer than 33 repetitions have been done, go to 1.
   Else, if the Remainder is negative, add back the Divisor value used in the last repetition and place the sum in the Remainder register; exit.

36 Example: Perform n + 1 iterations for n bits

    Initial:                   Rem 0000 1011               Divisor 0011 0000
    Iteration 1 (subtract):    Rem 1101 1011    Q 0        Divisor 0001 1000
    Iteration 2 (add):         Rem 1111 0011    Q 00       Divisor 0000 1100
    Iteration 3 (add):         Rem 1111 1111    Q 000      Divisor 0000 0110
    Iteration 4 (add):         Rem 0000 0101    Q 0001     Divisor 0000 0011
    Iteration 5 (subtract):    Rem 0000 0010    Q 00011    Divisor 0000 0001

Since the remainder is positive, done. Q = 0011 and Rem = 0010

37 Exercise
- Calculate A divided by B using restoring and non-restoring division. A = 26, B = 5

38 MIPS Divide Instruction
- Divide (div and divu) generates the remainder in hi and the quotient in lo
    div $s0, $s1    # lo = $s0 / $s1
                    # hi = $s0 mod $s1
  Encoding (R format): op = 0 | rs = 16 | rt = 17 | rd = 0 | shamt = 0 | funct = 0x1A
- Instructions mfhi rd and mflo rd are provided to move the remainder and the quotient to (user accessible) registers in the register file
- As with multiply, divide ignores overflow, so software must determine if the quotient is too large. Software must also check the divisor to avoid division by 0.
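As a minimal model of the unsigned case only, the sketch below returns the pair that divu would leave in hi and lo. The function name `divu` mirrors the instruction, but the `None` return for a zero divisor is purely an assumption of this sketch, standing in for the check the slide says software must perform.

```python
def divu(rs: int, rt: int):
    """Model of divu on 32-bit unsigned operands: returns (hi, lo) = (remainder, quotient)."""
    rs &= 0xFFFFFFFF
    rt &= 0xFFFFFFFF
    if rt == 0:
        return None                        # sketch-only convention: caller must check
    return rs % rt, rs // rt


hi, lo = divu(7, 2)
print(hi, lo)                              # remainder (mfhi), quotient (mflo)
```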

39 Lecture 1

