Chapter 6-2 Multiplier Multiplier Next Lecture Divider Floating Point Numbers
Multiplication of Positive Numbers using usual algorithm for multiplying integers Algorithm applies to unsigned numbers and to positive numbers Result of the product of two n-digit numbers can be accommodated in 2n digits Binary multiplication of positive operands can be implemented in a purely combinational, two dimensional logic array 1 1 0 1 (13) Multiplicand M 1 0 1 1 (11) Multiplier Q Partial Products (143) Product P
Formal Representation
Multiplier Implementation Multiplicand m 3 2 1 q p 4 5 6 7 Partial product (PP0) PP1 PP2 PP3 PP4 = p , p , ... p = Product 7 6 Bit of incoming partial product PPi m j q i Typical cell Carry-out F A Carry-in Bit of outgoing partial product PP(i+1)
Array Multiplier q 1 m 3 2 HA FA P 6 7 5 4 P P m m 2 1 FA FA HA
Ripple-Carry Array Multiplier For the multiplication operation M Q = P for 4-bit operands M: m3m2m1m0 Q: q3q2q1q0 P: p7p6p5p4p3p2p1p0 miqj = mi·qj FA p 7 6 5 4 3 1 2 m q
The MxN Array Multiplier Critical Path HA FA FA HA FA FA FA HA Critical Path 1 Critical Path 2 Critical Path 1 & 2 FA FA FA HA Dmult=[(M-1)+(N-2)]Dcarry +(N-1)Dsum+1Dand
Multiplier Implementation The main component in each cell is an adder circuitry Each AND gate determines whether a multiplicand bit mj is added to the incoming partial product bit, based on the value of the multiplier bit qj For each row i ( 0 ≤ i ≤ 3) where qi = 1, adds the multiplicand appropriately shifted, to the incoming partial product, PPi, to generate PPi+1 If qi = 0, PPi is passed vertically downward unchanged PP0 is all 0s PP4 is the desired product The multiplicand is shifted left one position per row by the diagonal signal path
Another Method of Multiplier Design The previous algorithm may be impractical for large numbers because it uses many gates Multiplication can be performed using a mixture of combinational array techniques and sequential techniques that require less combinational logic In early computers, because of the cost of logic gates, the adder circuitry in the ALU was used to perform multiplication sequentially Called sequential circuit binary multiplier
Register A (initially 0) q n 1 - m -bit Multiplicand M Control sequencer Multiplier Q C Shift right Register A (initially 0) adder Add/Noadd control a MUX 1 1 1 1 1 0 1 1 1 1 1 0 1 1 0 1 Initial configuration Add M C First cycle Second cycle Third cycle Fourth cycle No add Shift 1 0 0 0 0 0 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 Q A Product
Sequential Circuit Binary Multiplier This circuit performs multiplication by using a single adder n times to implement the spatial addition performed by the n rows of ripple carry adders Registers A and Q combined hold PPi while multiplier bit qi generates the signal Add/Noadd Add/Noadd controls the addition of the multiplicand M to PPi to generate PPi+1 The product is computed in n cycles The partial product grows in length 1 bit per cycle from the initial vector PP0 of n 0s in register A The carry-out from the adder is stored in Flip-Flop C At the start, the multiplier is loaded into register Q, the multiplicand into register M, and C as well as A are cleared to 0
Sequential Circuit Binary Multiplier At the end of each cycle, C, A and Q are shifted right by one bit position to allow for the growth of the partial product as the multiplier is shifted out of register Q Because of this shifting, multiplier bit qi appears in the LSB position of Q to generate the Add/Noadd signal at the correct time, starting with q0 during the first cycle, q1 during the second cycle, etc... If the adder has a delay of 10 ns The control setting and the shift operations take another 10ns each A hardwired multiply in a 32-bit word-length computer would take about 640ns Multiply instructions took much longer to execute than Add instructions in early computers
Signed Operand Multiplication Multiplication of signed operands generates a double length product in the 2's complement number system Consider the case of a positive multiplier and a negative multiplicand When we add a negative multiplicand to a partial product, we must extend the sign bit value of the multiplicand to the left as far as the product will extend The previous hardware can be used for negative multiplicands if it provides for sign extension of the partial products
Sign Extension of Negative Multiplicand 1 ´ 13 - ( ) 143 11 + Sign extension is shown in blue Negative number must be the multiplicand and the positive number is the multiplier
Booth Algorithm A powerful algorithm for signed-number multiplication treats positive and negative numbers uniformly So far, the number of additions equals the number of 1s in the multiplier Consider a multiplication in which the multiplier is positive and has a single block of 1s (e.g., 00111102 = 3010) To derive the product, we could add four appropriately shifted versions of the multiplicand (i.e., for four 1s) We can reduce the number of operations by regarding the multiplier as the difference between two numbers, i.e., 3210-210 or 01000002-00000102 This suggests that the product can be generated by adding 25 times the multiplicand to the 2's complement of 21 times the multiplicand The sequence of required operations can be recoded as 0+1000-10
Booth Algorithm -1 times the shifted multiplicand is selected when changing multiplier from 0 to 1 +1 times the shifted multiplicand is selected when changing multiplier from 1 to 0 The multiplier is scanned form right to left
Normal and Booth Multiplication Schemes 1 1 1 1 + 1 + 1 + 1 + 1 Normal 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + 1 - 1 2's complement of 1 1 1 1 1 1 1 1 1 1 the multiplicand Booth 1 1 1 1 1 1 1 1 1
Booth Recoding of a Multiplier 1 1 1 1 1 1 1 1 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 When the least significant bit is 1 , assume an implied 0 lies to its right
Booth Multiplication with a Negative Multiplier 1 6 - ( ) 13 + ´ 78 +1 Handles both positive and negative multipliers uniformly
Correctness of Booth Technique for Negative Multipliers Let the leftmost zero of a negative number, X, be at bit position k X = 11…10xk-1….x0 The value of X is given by V(X)= -2k+1 + xk-12k-1 +….+x020 Example V(X) 11000 (-8) 11001 (-7) = -23 = -23 + 1 For example, 1101102(-1010) is recoded as 0-1+10-10 -24+23-2 = -1010 X= -2k+1 =
Booth Multiplier Recoding Scheme Version of multiplicand selected by bit i Bit i Bit i - 1 ´ M 1 + 1 ´ M 1 1 ´ M 1 1 ´ M
Booth Recoded Multipliers 1 1 1 1 1 1 1 1 Worst-case multiplier + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 1 1 1 1 1 1 1 1 1 Ordinary multiplier - 1 + 1 - 1 + 1 - 1 + 1 - 1 1 1 1 1 1 1 1 1 Good multiplier + 1 - 1 + 1 - 1 Achieves some efficiency in the number of additions required when the multiplier has a few large blocks of 1s
Fast Multiplication Bit-pair recoding of multipliers Halves the maximum number of summands Derived from the booth algorithm (+1 -1) is equivalent to (0 +1) Because (+1 -1) is (+102 + -12) = +2M + -M = +M = (0 +1) Instead of adding +1×M at position i+1 to -1 times the multiplicand M at a shift position i The same result can be obtained by adding +1×M at position i (+1 0) is equivalent to (0 +2) (-1 +1) is equivalent to (0 -1) The booth-recoded multiplier is examined two bits at a time, starting from the right
Multiplier Bit-Pair Recoding Sign extension Implied 0 to right of LSB 1 1 1 1 (a) Example of bit-pair recoding derived from Booth recoding 1 + 1 1 1 2 i 1 + (b) Table of multiplicand selection decisions selected at position Multiplicand Multiplier bit-pair Multiplier bit on the right M ´ 2
Multiplication Requiring only n/2 Summands Example 1 1 1 ( + 13 ) ´ 1 1 1 ( - 6 ) 1 1 1 - 1 + 1 - 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ( - 78 ) Multiplication Requiring only n/2 Summands 1 1 1 - 1 - 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Ripple-Carry Array Disadvantage Multiplication requires many additions Using Ripple-Carry Array is slow Consider the addition of three n-bit numbers W, X, Y to produce the sum Z We can first add W to X to generate a number A Then we can add A to Y to produce Z This can be done by using two ripple carry adders
A Different Approach Instead of adding W to X to produce A in the upper ripple carry adder, let’s introduce the bits of Y into the inputs This generates the vectors S and the saved carries C as the outputs In the second row, S and C are added in a a ripple carry adder to produce Z Carry save addition can speedup this process
Q: Do you see any saving here? Carry Save Array For the multiplication operation M Q = P for 4-bit operands M: m3m2m1m0 Q: q3q2q1q0 P: p7p6p5p4p3p2p1p0 FA p 7 6 5 4 3 1 2 m q Q: Do you see any saving here?
Example Ripple-Carry vs. Carry-Save
Carry-Save Addition Approach 1 1 1 1 (45) M X 1 1 1 1 1 1 (63) Q 1 1 1 1 A 1 1 1 1 B 1 1 1 1 C 1 1 1 1 D 1 1 1 1 E 1 1 1 1 F 1 1 1 1 1 1 (2,835) Product
Complete Example 1 + M Q A B C S D E F 2 3 4 Product x
Schematic Representation of C.S.A. D C B A Level 1 CSA C S C S 2 2 1 1 Level 2 CSA C C S 2 3 3 Level 3 CSA C S 4 4 Final addition + Product Tmult = (N-1)Tcarry + (N-1)Tand + Tmerge 1.7log2k – 1.7 steps, where k is the number of summands
Example Ripple-Carry vs. Carry-Save Carry-save addition transforms W, X and Y into S and C Advantages: all bits of S and C are produced in a short fixed amount of time after W, X, and Y are applied Each row approximately takes one full-adder delay Carry propagation takes place only in the last row Carry lookahead adder could be used effectively to add the S and C vectors because all bits of S and C are available in parallel Consider the addition of many summands We can group the summands in threes and perform the carry save addition on each of these groups in parallel to generate S and C Next, group all the S and C vectors into threes and perform carry save addition on them Continue this process until there are only two vectors remaining These remaining vectors can be added in a ripple carry or a carry lookahead adder to produce the sum