Sequential Multipliers Lecture 9
Required Reading Chapter 9, Basic Multiplication Scheme Chapter 10, High-Radix Multipliers Chapter 12.3, Bit-Serial Multipliers Chapter 12.4, Modular Multipliers Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
Notation a Multiplicand a k-1 a k-2... a 1 a 0 x Multiplier x k-1 x k-2... x 1 x 0 p Product (a x) p 2k-1 p 2k-2... p 2 p 1 p 0 If multiplicand and multiplier are of different sizes, usually multiplier has the smaller size
Multiplication of two 4-bit unsigned binary numbers in dot notation Partial Product 0 Partial Product 1 Partial Product 2 Partial Product 3 Number of partial products = number of bits in multiplier x Bit-width of each partial product = bit-width of multiplicand a
Basic Multiplication Equations x = x i 2 i i=0 k-1 p = a x p = a x = a x i 2 i = = x 0 a2 0 + x 1 a2 1 + x 2 a2 2 + … + x k-1 a2 k-1 i=0 k-1
Shift/Add Algorithm Right-shift version
Shift/Add Algorithms Right-shift algorithm p = a x = x 0 a2 0 + x 1 a2 1 + x 2 a2 2 + … + x k-1 a2 k-1 = (...((0 + x 0 a2 k )/2 + x 1 a2 k )/ x k-1 a2 k )/2 = k times = p (0) = 0 p = p (k) p (j+1) = (p (j) + x j a 2 k ) / 2 j=0..k-1
Sequential shift-and-add multiplier for right-shift algorithm
Right-shift multiplication algorithm: Example
Area optimization for the sequential shift-and-add multiplier with the right-shift algorithm
Shift/Add Algorithms Right-shift algorithm: multiply-add = (...((y2 k + x 0 a2 k )/2 + x 1 a2 k )/ x k-1 a2 k )/2 = k times p (0) = y2 k p = p (k) p (j+1) = (p (j) + x j a 2 k ) / 2 j=0..k-1 = y + x 0 a2 0 + x 1 a2 1 + x 2 a2 2 + … + x k-1 a2 k-1 = y + a x
Signed Multiplication Previous sequential multipliers are for unsigned multiplication For signed multiplication: –assume sign-extended operation for p (j) + x j a – if 2's complement multiplier is POSITIVE right-shift sequential algorithms (shift-add) will work directly –if 2's complement multiplier is NEGATIVE than we must use "negative weight” for x k-1 and subtract x k-1 a in the last cycle Slight increase in area due to control and one-bit sign extension on inputs of adder –Unsigned: k bit number + k bit number k+1 bit number –Signed: k+1 bit sign extended number + k+1 bit sign extended number k+1 bit number
Sequential multiplication of 2’s-complement numbers with right shifts (positive multiplier)
Sequential multiplication of 2’s-complement numbers with right shifts (negative multiplier)
Shift/Add Algorithm Left-shift version
Shift/Add Algorithms Left-shift algorithm p = a x = x 0 a2 0 + x 1 a2 1 + x 2 a2 2 + … + x k-1 a2 k-1 = (...((0 2 + x k-1 a) 2 + x k-2 a) x 1 a) 2 + x 0 a= k times = p (0) = 0 p = p (k) p (j+1) = (p (j) 2 + x k-1-j a) j=0..k-1
Sequential shift-and-add multiplier for left-shift algorithm Left shifts are not as efficient for two's complement because must sign extend multiplicand by k bits
Left-shift multiplication algorithm: Example
p (0) = y2 -k p = p (k) p (j+1) = (p (j) 2 + x k-(j+1) a) j=0..k-1 Shift/Add Algorithms Left-shift algorithm: multiply-add = (...((y2 -k 2 + x k-1 a) 2 + x k-2 a) x 1 a) 2 + x 0 a = k times = y + x k-1 a2 k-1 + x k-2 a2 k-2 + … + x 1 a2 1 + x 0 a = y + a x
Shift/Add Algorithm Right-shift version with Carry-Save Adder
Sequential shift-and-add multiplier with a carry save adder
High-Radix Sequential Multipliers
High-Radix Notation a Multiplicand (a n-1 a n-2... a 1 a 0 ) r x Multiplier (x n-1 x n-2... x 1 x 0 ) r p Product (a x) (p 2n-1 p 2n-2... p 2 p 1 p 0 ) r
Radix-4, or two-bit-at-a-time, multiplication in dot notation
Basic Multiplication Equations x = x i r i i=0 n-1 p = a x p = a x = a x i r i = = x 0 ar 0 + x 1 a r 1 + x 2 a r 2 + … + x n-1 a r n-1 i=0 n-1
High-Radix Shift/Add Algorithms Right-shift high-radix algorithm p = a x = x 0 ar 0 + x 1 ar 1 + x 2 ar 2 + … + x n-1 ar n-1 = (...((0 + x 0 ar n )/r + x 1 ar n )/r x n-1 ar n )/r = n times = p (0) = 0 p = p (n) p (j+1) = (p (j) + x j a r n ) / r j=0..n-1
High-Radix Shift/Add Algorithms Left-shift high-radix algorithm p = a x = x 0 ar 0 + x 1 ar 1 + x 2 ar 2 + … + x n-1 ar n-1 = (...((0 r + x n-1 a) r + x n-2 a) r x 1 a) r + x 0 a= n times = p (0) = 0 p = p (n) p (j+1) = (p (j) r + x n-1-j a) j=0..n-1
The multiple generation part of a radix-4 multiplier with precomputation of 3a
Example of radix-4 multiplication using the 3a multiple
The multiple generation part of a radix-4 multiplier based on replacing 3a with 4a (carry into next higher radix-4 multiplier digit) and -a
Higher Radix Multiplication In radix-8, one must precompute 3a, 5a, 7a –Overhead becomes prohibitive and does not help However, when we discuss CSA this may be useful
Radix-2 Booth Recoding i j j+1
Radix-2 Booth Recoding y i = -x i + x i-1
Sequential multiplication of 2’s-complement numbers with right shifts using Booth’s recoding
Notation Y Multiplicand y m-1 y m-2... y 1 y 0 X Multiplier x m-1 x m-2... x 1 x 0 P Product (Y X ) p 2m-1 p 2m-2... p 2 p 1 p 0 If multiplicand and multiplier are of different sizes, usually multiplier has the smaller size
Radix-2 Booth Multiplier Basic Step
Radix-2 Booth Multiplier Basic Step in Xilinx FPGAs
Radix-2 Booth Multiplier in Xilinx FPGAs
Radix-4 Booth Recoding (1)
z i/2 = -2x i+1 + x i + x i-1
Example radix-4 multiplication with modified Booth’s recoding of the 2’s-complement multiplier
The multiple generation part of a radix-4 multiplier based on Booth’s recoding
Notation Y Multiplicand y m-1 y m-2... y 1 y 0 X Multiplier x m-1 x m-2... x 1 x 0 P Product (Y X ) p 2m-1 p 2m-2... p 2 p 1 p 0 If multiplicand and multiplier are of different sizes, usually multiplier has the smaller size
Radix-4 Booth Multiplier Basic Step
Radix-4 Booth Multiplier: Left Shifter & Control
High-Radix Multipliers with Carry-Save Adder
Radix-4 multiplication with a carry-save adder used to combine the cumulative partial product, x i a, and 2x i+1 a into two numbers
Radix-4 multiplier with a carry-save adder and Booth’s recoding
Booth recoding and multiple selection logic for high-radix multiplication
Radix-4 multiplier with two carry-save adders
Radix-16 multiplier with carry-save adders
Bit-Serial Multipliers
Bit Serial Multipliers Advantages small area reduced pin count reduced wire length high clock rate
Systolic Array Systolic array: synchronous arrays of processing elements that are interconnected by only short, local wires thus allowing very high clock rates
Semisystolic Bit-Serial Multiplier (1)
Semisystolic Bit-Serial Multiplier (2) a 3 x 0 a 2 x 0 a 1 x 0 a 0 x 0 a 3 x 1 a 2 x 1 a 1 x 1 a 0 x 1 a 3 x 2 a 2 x 2 a 1 x 2 a 0 x 2 a 3 x 3 a 2 x 3 a 1 x 3 a 0 x 3 a 3 0 a 2 0 a 1 0 a 0 0 p0p0 p1p1 p2p2 p3p3 p4p4 p5p5 p6p6 p7p7
Retiming d kk k+n k+n+d d k k+d k+d+n
Retimed Semisystolic Bit-Serial Multiplier (1)
Retimed Semisystolic Bit-Serial Multiplier (2) a 3 0 a 2 0 a 1 0 a 0 x 0 a 3 0 a 2 0 a 1 x 0 a 0 x 1 a 3 0 a 2 x 0 a 1 x 1 a 0 x 2 a 3 x 0 a 2 x 1 a 1 x 2 a 0 x 3 a 3 x 1 a 2 x 2 a 1 x 3 a 0 0 a 3 x 2 a 2 x 3 a 1 0 a 0 0 a 3 x 3 a 2 0 a 1 0 a 0 0 a 3 0 a 2 0 a 1 0 a 0 0 p0p0 p1p1 p2p2 p3p3 p4p4 p5p5 p6p6 p7p7
Systolic Bit-Serial Multiplier
Modular Multipliers
Modular Multiplication Special Cases a x pHpH pLpL a x = p = p H 2 k + p L k bits a x mod 2 k = p L a x mod 2 k -1 = p L + p H + carry p a x a x mod 2 k +1 = p L - p H - borrow
Modular Multiplication Special Case (1) a x mod 2 k -1 = (p H 2 k + p L ) mod (2 k -1) = = (p H (2 k mod (2 k -1)) + p L ) mod (2 k -1) = = p H + p L mod (2 k -1) = = p H + p L if p H + p L < 2 k - 1 p H + p L - (2 k -1) if p H + p L 2 k - 1 = p L + p H + carry carry = carry from addition p L + p H
Modular Multiplication Special Case (2) a x mod 2 k +1 = (p H 2 k + p L ) mod (2 k +1) = = (p H (2 k +1-1) + p L ) mod (2 k +1) = = p L - p H mod (2 k +1) = = p L - p H if p L - p H 0 p L - p H + (2 k +1) if p L - p H < 0 = p L - p H + borrow borrow = borrow from subtraction p L + p H
Modulo (2 b -1) Carry Save Adder
4 x 4 Modulo 15 Multiplier
4 x 4 Modulo 13 Multiplier