Tree and Array Multipliers Lecture 8
Required Reading: Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design. Chapter 11, Tree and Array Multipliers. Chapter 12.5, The special case of squaring. Note errata at:
Notation
$a$: Multiplicand, $a_{k-1} a_{k-2} \dots a_1 a_0$
$x$: Multiplier, $x_{k-1} x_{k-2} \dots x_1 x_0$
$p$: Product ($a \times x$), $p_{2k-1} p_{2k-2} \dots p_2 p_1 p_0$
Multiplication of two 4-bit unsigned binary numbers in dot notation
Basic Multiplication Equations
$$x = \sum_{i=0}^{k-1} x_i 2^i$$
$$p = a \cdot x = \sum_{i=0}^{k-1} a\,x_i 2^i = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_{k-1} a 2^{k-1}$$
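The sum above can be evaluated directly, one multiplier bit at a time. The following is a minimal sketch in Python; the function name `multiply_shift_add` is an illustrative choice, not something taken from the lecture.

```python
def multiply_shift_add(a: int, x: int, k: int) -> int:
    """Return a * x for an unsigned k-bit multiplier x by summing x_i * a * 2**i."""
    p = 0
    for i in range(k):
        x_i = (x >> i) & 1       # bit i of the multiplier
        p += x_i * (a << i)      # partial product x_i * a * 2**i
    return p


print(multiply_shift_add(11, 13, 4))  # 143
```

Each loop iteration corresponds to one row of the dot-notation diagram.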
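Unsigned Multiplication
$$
\begin{array}{lcccccccccc}
 & & & & & & a_4 & a_3 & a_2 & a_1 & a_0 \\
\times & & & & & & x_4 & x_3 & x_2 & x_1 & x_0 \\
\hline
a\,x_0 2^0 & & & & & & a_4x_0 & a_3x_0 & a_2x_0 & a_1x_0 & a_0x_0 \\
a\,x_1 2^1 & & & & & a_4x_1 & a_3x_1 & a_2x_1 & a_1x_1 & a_0x_1 & \\
a\,x_2 2^2 & & & & a_4x_2 & a_3x_2 & a_2x_2 & a_1x_2 & a_0x_2 & & \\
a\,x_3 2^3 & & & a_4x_3 & a_3x_3 & a_2x_3 & a_1x_3 & a_0x_3 & & & \\
a\,x_4 2^4 & & a_4x_4 & a_3x_4 & a_2x_4 & a_1x_4 & a_0x_4 & & & & \\
\hline
 & p_9 & p_8 & p_7 & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0
\end{array}
$$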
Full tree multiplier - general structure
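As a rough illustration of the general structure, the sketch below forms all partial products at once, reduces them with levels of 3-to-2 carry-save adders, and finishes with a single carry-propagate addition. It models word-level behavior only. The grouping of operands at each level is an assumption made for simplicity and does not reproduce the wiring of any particular figure; the names `csa` and `tree_multiply` are hypothetical.

```python
def csa(x: int, y: int, z: int):
    """3-to-2 carry-save adder on whole words: returns (sum word, carry word)."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def tree_multiply(a: int, x: int, k: int) -> int:
    # Multiple-forming step: one shifted partial product per multiplier bit.
    operands = [((x >> i) & 1) * (a << i) for i in range(k)]
    # Reduction tree: each level turns every group of three operands into two.
    while len(operands) > 2:
        reduced = []
        full = len(operands) - len(operands) % 3
        for g in range(0, full, 3):
            reduced.extend(csa(*operands[g:g + 3]))
        reduced.extend(operands[full:])  # leftover operands pass to the next level
        operands = reduced
    # Final carry-propagate adder on the remaining (at most two) operands.
    return sum(operands)


print(tree_multiply(100, 77, 7))  # 7700
```

Python integers are unbounded, so no carries are lost; a hardware tree would size each level to the 2k-bit product width.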
7 x 7 tree multiplier
A slice of a balanced-delay tree for 11 inputs
Tree multiplier with a more regular structure
Unsigned vs. Signed Multiplication (figure: a worked example with the bit pattern 1111, shown side by side for the unsigned and signed interpretations)
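To make the contrast concrete, the sketch below multiplies the bit pattern 1111 by itself under both interpretations. The choice of 1111 for both operands and the helper name `to_signed` are assumptions made for illustration.

```python
def to_signed(bits: int, k: int) -> int:
    """Interpret the low k bits of `bits` as a 2's-complement number."""
    bits &= (1 << k) - 1
    return bits - (1 << k) if bits >> (k - 1) else bits


k = 4
pattern = 0b1111
unsigned_product = pattern * pattern                             # 15 * 15
signed_product = to_signed(pattern, k) * to_signed(pattern, k)   # (-1) * (-1)
mask = (1 << (2 * k)) - 1                                        # keep 2k product bits

print(format(unsigned_product & mask, "08b"))  # 11100001
print(format(signed_product & mask, "08b"))    # 00000001
```

The same input bits give different 8-bit products, which is why the partial-product array has to be adjusted for signed operands.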
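2’s Complement Multiplication (1)
Operands as bit strings: $a_4 a_3 a_2 a_1 a_0 \times x_4 x_3 x_2 x_1 x_0$
With the negative weight of the sign bits made explicit: $(-a_4)\, a_3 a_2 a_1 a_0 \times (-x_4)\, x_3 x_2 x_1 x_0$, that is,
$$a = -a_4 2^4 + \sum_{j=0}^{3} a_j 2^j, \qquad x = -x_4 2^4 + \sum_{i=0}^{3} x_i 2^i$$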
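2’s Complement Multiplication (2)
$$
\begin{array}{cccccccccc}
 & & & & & -a_4 & a_3 & a_2 & a_1 & a_0 \\
 & & & & \times & -x_4 & x_3 & x_2 & x_1 & x_0 \\
\hline
 & & & & & -a_4x_0 & a_3x_0 & a_2x_0 & a_1x_0 & a_0x_0 \\
 & & & & -a_4x_1 & a_3x_1 & a_2x_1 & a_1x_1 & a_0x_1 & \\
 & & & -a_4x_2 & a_3x_2 & a_2x_2 & a_1x_2 & a_0x_2 & & \\
 & & -a_4x_3 & a_3x_3 & a_2x_3 & a_1x_3 & a_0x_3 & & & \\
 & a_4x_4 & -a_3x_4 & -a_2x_4 & -a_1x_4 & -a_0x_4 & & & & \\
\hline
-p_9 & p_8 & p_7 & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0
\end{array}
$$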
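The signed array can be checked numerically: every term that involves exactly one sign bit is subtracted, and all others are added. This is a small sketch under that reading of the array; the function names are hypothetical.

```python
def signed_value(bits: int, k: int) -> int:
    """Value of a k-bit 2's-complement pattern: the sign bit has weight -2**(k-1)."""
    low = bits & ((1 << (k - 1)) - 1)
    sign = (bits >> (k - 1)) & 1
    return low - (sign << (k - 1))


def signed_partial_product_sum(a: int, x: int, k: int = 5) -> int:
    """Sum the bit products a_j * x_i * 2**(i+j) with the signs shown in the array."""
    total = 0
    for i in range(k):
        for j in range(k):
            term = (((a >> j) & 1) & ((x >> i) & 1)) << (i + j)
            # Exactly one sign bit involved -> negatively weighted term.
            negative = (i == k - 1) != (j == k - 1)
            total += -term if negative else term
    return total


print(signed_partial_product_sum(0b10110, 0b00111))  # -70, i.e. (-10) * 7
```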
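2’s Complement Multiplication (3)
The sum of the signed partial products has a negatively weighted top bit, $(-p_9)\, p_8 p_7 p_6 p_5 p_4 p_3 p_2 p_1 p_0$, which is exactly the 2's complement bit string $p_9 p_8 p_7 p_6 p_5 p_4 p_3 p_2 p_1 p_0$:
$$p = -p_9 2^9 + \sum_{i=0}^{8} p_i 2^i$$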
2’s Complement Multiplication (4)
With $\bar{z} = 1 - z$:
$$-a_j x_i = -a_j (1 - \bar{x}_i) = a_j \bar{x}_i - a_j = a_j \bar{x}_i + a_j - 2a_j$$
$$-a_j x_i = -(1 - \bar{a}_j)\, x_i = \bar{a}_j x_i - x_i = \bar{a}_j x_i + x_i - 2x_i$$
$$-a_j x_i = -(1 - \overline{a_j x_i}) = \overline{a_j x_i} - 1 = \overline{a_j x_i} + 1 - 2$$
$$-a_j = -(1 - \bar{a}_j) = \bar{a}_j - 1 = \bar{a}_j + 1 - 2$$
$$-x_i = -(1 - \bar{x}_i) = \bar{x}_i - 1 = \bar{x}_i + 1 - 2$$
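Negative terms in the $a_4$ diagonal: each $-a_4 x_i$ is replaced by $a_4 \bar{x}_i + a_4 - 2a_4$. The $-2a_4$ of one column cancels the $+a_4$ of the next, leaving one $a_4$ in column 4 and one $-a_4$ in column 8:
$$-\sum_{i=0}^{3} a_4 x_i\, 2^{4+i} = \sum_{i=0}^{3} a_4 \bar{x}_i\, 2^{4+i} + a_4 2^4 - a_4 2^8$$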
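Negative terms in the $x_4$ row: each $-a_j x_4$ is replaced by $\bar{a}_j x_4 + x_4 - 2x_4$, leaving one $x_4$ in column 4 and one $-x_4$ in column 8:
$$-\sum_{j=0}^{3} a_j x_4\, 2^{4+j} = \sum_{j=0}^{3} \bar{a}_j x_4\, 2^{4+j} + x_4 2^4 - x_4 2^8$$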
Combining the two sets of terms: the leftover $-a_4$ and $-x_4$ in column 8 become $\bar{a}_4$ and $\bar{x}_4$ plus a constant $-2^9$, and $-2^9$ is replaced by a 1 in column 9 because the product is kept modulo $2^{10}$:
$$-a_4 2^8 - x_4 2^8 = \bar{a}_4 2^8 + \bar{x}_4 2^8 - 2^9, \qquad -2^9 \equiv 2^9 \pmod{2^{10}}$$
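Baugh-Wooley 2’s Complement Multiplier
$$
\begin{array}{cccccccccc}
 & & & & & -a_4 & a_3 & a_2 & a_1 & a_0 \\
 & & & & \times & -x_4 & x_3 & x_2 & x_1 & x_0 \\
\hline
 & & & & & a_4\bar{x}_0 & a_3x_0 & a_2x_0 & a_1x_0 & a_0x_0 \\
 & & & & a_4\bar{x}_1 & a_3x_1 & a_2x_1 & a_1x_1 & a_0x_1 & \\
 & & & a_4\bar{x}_2 & a_3x_2 & a_2x_2 & a_1x_2 & a_0x_2 & & \\
 & & a_4\bar{x}_3 & a_3x_3 & a_2x_3 & a_1x_3 & a_0x_3 & & & \\
 & a_4x_4 & \bar{a}_3x_4 & \bar{a}_2x_4 & \bar{a}_1x_4 & \bar{a}_0x_4 & & & & \\
 & \bar{a}_4 & & & & a_4 & & & & \\
1 & \bar{x}_4 & & & & x_4 & & & & \\
\hline
p_9 & p_8 & p_7 & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0
\end{array}
$$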
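The Baugh-Wooley array contains only non-negative entries, so it can be summed with ordinary unsigned addition. The sketch below adds up the entries for 5-bit operands and keeps the low 10 bits. The function and helper names are hypothetical, and the code models the arithmetic only, not an adder structure.

```python
def signed5(v: int) -> int:
    """Interpret a 5-bit pattern as a 2's-complement value."""
    return v - 32 if v & 16 else v


def baugh_wooley_5x5(a: int, x: int) -> int:
    """Sum the Baugh-Wooley bit matrix for 5-bit operands; return bits p_9 ... p_0."""
    A = [(a >> j) & 1 for j in range(5)]
    X = [(x >> i) & 1 for i in range(5)]
    total = 0
    for i in range(4):
        for j in range(4):
            total += (A[j] & X[i]) << (i + j)      # ordinary terms a_j x_i
    for i in range(4):
        total += (A[4] & (1 - X[i])) << (4 + i)    # a_4 * not(x_i)
    for j in range(4):
        total += ((1 - A[j]) & X[4]) << (4 + j)    # not(a_j) * x_4
    total += (A[4] & X[4]) << 8                    # a_4 x_4
    total += (A[4] + X[4]) << 4                    # extra a_4 and x_4 in column 4
    total += ((1 - A[4]) + (1 - X[4])) << 8        # not(a_4) and not(x_4) in column 8
    total += 1 << 9                                # constant 1 in column 9
    return total & 0x3FF                           # product modulo 2**10


print(baugh_wooley_5x5(0b11111, 0b11111))  # 1, i.e. (-1) * (-1)
```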
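Modified form, negative terms in the $a_4$ diagonal: each $-a_4 x_i$ is replaced by $\overline{a_4 x_i} + 1 - 2$. The $-2$ of one column cancels the $+1$ of the next, leaving a 1 in column 4 and a $-1$ in column 8:
$$-\sum_{i=0}^{3} a_4 x_i\, 2^{4+i} = \sum_{i=0}^{3} \overline{a_4 x_i}\, 2^{4+i} + 2^4 - 2^8$$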
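Modified form, negative terms in the $x_4$ row: each $-a_j x_4$ is replaced by $\overline{a_j x_4} + 1 - 2$, again leaving a 1 in column 4 and a $-1$ in column 8:
$$-\sum_{j=0}^{3} a_j x_4\, 2^{4+j} = \sum_{j=0}^{3} \overline{a_j x_4}\, 2^{4+j} + 2^4 - 2^8$$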
Combining the two sets of terms: the two 1s in column 4 merge into a single 1 in column 5, and the two $-1$s in column 8 give $-2^9$, which is replaced by a 1 in column 9 because the product is kept modulo $2^{10}$:
$$2^4 + 2^4 = 2^5, \qquad -2^8 - 2^8 = -2^9 \equiv 2^9 \pmod{2^{10}}$$
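Modified Baugh-Wooley Multiplier
$$
\begin{array}{cccccccccc}
 & & & & & -a_4 & a_3 & a_2 & a_1 & a_0 \\
 & & & & \times & -x_4 & x_3 & x_2 & x_1 & x_0 \\
\hline
 & & & & & \overline{a_4x_0} & a_3x_0 & a_2x_0 & a_1x_0 & a_0x_0 \\
 & & & & \overline{a_4x_1} & a_3x_1 & a_2x_1 & a_1x_1 & a_0x_1 & \\
 & & & \overline{a_4x_2} & a_3x_2 & a_2x_2 & a_1x_2 & a_0x_2 & & \\
 & & \overline{a_4x_3} & a_3x_3 & a_2x_3 & a_1x_3 & a_0x_3 & & & \\
 & a_4x_4 & \overline{a_3x_4} & \overline{a_2x_4} & \overline{a_1x_4} & \overline{a_0x_4} & & & & \\
1 & & & & 1 & & & & & \\
\hline
p_9 & p_8 & p_7 & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0
\end{array}
$$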
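The modified array can be checked the same way: complement the bit products that involve exactly one sign bit, then add constant 1s in columns 5 and 9. This is a sketch of the arithmetic only, with hypothetical function names.

```python
def signed5(v: int) -> int:
    """Interpret a 5-bit pattern as a 2's-complement value."""
    return v - 32 if v & 16 else v


def modified_baugh_wooley_5x5(a: int, x: int) -> int:
    """Sum the modified Baugh-Wooley bit matrix; return bits p_9 ... p_0."""
    A = [(a >> j) & 1 for j in range(5)]
    X = [(x >> i) & 1 for i in range(5)]
    total = 0
    for i in range(5):
        for j in range(5):
            bit = A[j] & X[i]
            if (i == 4) != (j == 4):
                bit ^= 1                  # complemented term not(a_j x_i)
            total += bit << (i + j)
    total += (1 << 5) + (1 << 9)          # constant 1s in columns 5 and 9
    return total & 0x3FF                  # product modulo 2**10


print(modified_baugh_wooley_5x5(0b10110, 0b00111))  # 954, i.e. -70 mod 1024
```

Compared with the original Baugh-Wooley form, the complemented terms need a NAND instead of an AND, and only two constant bits are added.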
Basic array multiplier
5 x 5 Array Multiplier
Array Multiplier - Basic Cell: a full adder (FA) with inputs $x$, $y$, $c_{in}$ and outputs $c_{out}$, $s$
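A bit-level sketch of an unsigned array multiplier built only from this full-adder cell follows. It assumes a common organization: one carry-save row of cells per multiplier bit, followed by a ripple-carry row for the upper product bits. The exact wiring of the 5 x 5 figure may differ, and the names are hypothetical.

```python
def full_adder(x: int, y: int, c_in: int):
    """Basic cell: returns (s, c_out) for three input bits."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out


def array_multiply(a: int, x: int, k: int = 5) -> int:
    A = [(a >> j) & 1 for j in range(k)]
    X = [(x >> i) & 1 for i in range(k)]
    s = [0] * (2 * k)   # sum bit held in each column
    c = [0] * (2 * k)   # carry bit saved for each column between rows
    for i in range(k):                       # one row of cells per multiplier bit
        next_c = [0] * (2 * k)
        for j in range(k):
            col = i + j
            # Cell adds the bit product a_j x_i to the incoming sum and carry.
            s[col], next_c[col + 1] = full_adder(A[j] & X[i], s[col], c[col])
        c = next_c
    ripple = 0
    for col in range(k, 2 * k):              # final ripple-carry row
        s[col], ripple = full_adder(s[col], c[col], ripple)
    return sum(bit << col for col, bit in enumerate(s))


print(array_multiply(27, 19))  # 513
```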
Modifications in a 5 x 5 multiplier
Array Multiplier – Modified Basic Cell: a full adder (FA) with inputs $a_m$, $x_n$, $s_{i-1}$, $c_i$ and outputs $s_i$, $c_{i+1}$
5 x 5 Array Multiplier with modified cells
Pipelined 5 x 5 Multiplier
Xilinx FPGA Implementation Equations
$$Z = (2x_{m-1} + x_{m-2})\,Y\,2^{m-2} + \dots + (2x_{i+1} + x_i)\,Y\,2^{i} + \dots + (2x_3 + x_2)\,Y\,2^{2} + (2x_1 + x_0)\,Y\,2^{0}$$
$$(2x_{i+1} + x_i)\,Y = p_{i(k+1)}\, p_{ik}\, p_{i(k-1)} \dots p_{i2}\, p_{i1}\, p_{i0}$$
$$p_{ij} = x_i y_j \oplus x_{i+1} y_{j-1} \oplus c_j$$
$$c_{j+1} = (x_i y_j)(x_{i+1} y_{j-1}) + (x_i y_j)\, c_j + (x_{i+1} y_{j-1})\, c_j$$
$$c_0 = c_1 = 0$$
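These equations can be exercised in software. The sketch below computes one row $(2x_{i+1} + x_i)\,Y$ bit by bit from the $p_{ij}$ and $c_{j+1}$ recurrences, then sums the rows with weights $2^i$ for even $i$. It assumes $Y$ has $k$ bits, with $y_j = 0$ outside $0 \le j < k$. The function names are hypothetical, and no FPGA primitives are modeled.

```python
def two_bit_row(x_i: int, x_i1: int, y: int, k: int) -> int:
    """Return (2*x_{i+1} + x_i) * Y as the (k+2)-bit row p_{i(k+1)} ... p_{i0}."""
    def y_bit(j: int) -> int:
        return (y >> j) & 1 if 0 <= j < k else 0   # y_j is 0 outside 0..k-1

    c = 0                                  # c_0 = 0
    row = 0
    for j in range(k + 2):
        u = x_i & y_bit(j)                 # x_i y_j
        v = x_i1 & y_bit(j - 1)            # x_{i+1} y_{j-1}
        row |= (u ^ v ^ c) << j            # p_ij
        c = (u & v) | (u & c) | (v & c)    # c_{j+1}
    return row


def multiply_by_bit_pairs(x: int, y: int, m: int, k: int) -> int:
    """Z = sum over even i of (2*x_{i+1} + x_i) * Y * 2**i, for an m-bit X."""
    z = 0
    for i in range(0, m, 2):
        z += two_bit_row((x >> i) & 1, (x >> (i + 1)) & 1, y, k) << i
    return z


print(two_bit_row(1, 1, 0b1011, 4))          # 33, i.e. 3 * 11
print(multiply_by_bit_pairs(13, 11, 4, 4))   # 143
```

Because $v = 0$ at $j = 0$, the first carry out is always 0, which matches $c_0 = c_1 = 0$.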
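Modified Basic Cell, Xilinx FPGA Implementation: a full adder (FA) with inputs $x_i y_j$, $x_{i+1} y_{j-1}$, $c_j$ and outputs $p_{ij}$, $c_{j+1}$
Modified Basic Cell, Xilinx FPGA Implementation (LUT and carry multiplexer): inputs $x_i$, $y_j$, $x_{i+1}$, $y_{j-1}$, $c_j$; outputs $p_{ij}$, $c_{j+1}$; the multiplexer (inputs labeled 0 and 1) produces $c_{j+1}$
LUT: $x_i y_j \oplus x_{i+1} y_{j-1}$
$$p_{ij} = x_i y_j \oplus x_{i+1} y_{j-1} \oplus c_j$$
$$c_{j+1} = (x_i y_j)(x_{i+1} y_{j-1}) + (x_i y_j)\, c_j + (x_{i+1} y_{j-1})\, c_j$$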
Xilinx FPGA Multiplier
Optimizations for Squaring (1)
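Optimizations for Squaring (2)
The terms $x_i x_j$ and $x_j x_i$ fall in the same column and merge into a single term in the next column:
$$x_i x_j + x_j x_i = 2\, x_i x_j$$
The diagonal terms simplify:
$$x_i x_i = x_i$$
When $x_i x_j$ and $x_i$ share a column:
$$x_i x_j + x_i = 2\, x_i x_j - x_i x_j + x_i = 2\, x_i x_j + x_i (1 - x_j) = 2\, x_i x_j + x_i \bar{x}_j$$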
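The first two identities already give a folded partial-product array with roughly half as many terms. The sketch below sums that folded array; the function name `square_folded` is hypothetical.

```python
def square_folded(x: int, k: int) -> int:
    """Square a k-bit unsigned x using x_i x_i = x_i and x_i x_j + x_j x_i = 2 x_i x_j."""
    X = [(x >> i) & 1 for i in range(k)]
    total = 0
    for i in range(k):
        total += X[i] << (2 * i)                      # diagonal term x_i in column 2i
        for j in range(i + 1, k):
            total += (X[i] & X[j]) << (i + j + 1)     # merged pair, moved one column left
    return total


print(square_folded(0b10111, 5))  # 529, i.e. 23 ** 2
```

This uses k diagonal bits plus k(k-1)/2 merged terms instead of the k * k terms of a general multiplier.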
Squaring Using Look-Up Tables (for relatively small values of $k$): input $= a$ ($k$ bits), output $= a^2$ ($2k$ bits); entry $i$ holds $i^2$, for $i = 0, \dots, 2^k - 1$; $2^k$ words, $2k$ bits each
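A table of this kind is easy to sketch in software. The example below builds it for an assumed k = 4; the function name is hypothetical.

```python
def build_square_table(k: int) -> list:
    """Return a table with 2**k entries; entry i holds i**2, which fits in 2k bits."""
    return [i * i for i in range(1 << k)]


table = build_square_table(4)
print(len(table), table[13])  # 16 169
```

The table size doubles with every added input bit, which is why the approach is practical only for small k.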
Multiplication Using Squaring
$$a\,x = \frac{(a + x)^2 - (a - x)^2}{4}$$
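The identity turns a multiplication into two squarings, a subtraction, and a 2-bit shift. The sketch below assumes unsigned k-bit operands and takes the squares from a table indexed by a + x and |a - x|, so the table needs 2^(k+1) entries. The function name and the table layout are illustrative assumptions.

```python
def multiply_via_squares(a: int, x: int, k: int) -> int:
    """Compute a * x as ((a + x)**2 - (a - x)**2) / 4 using table lookups."""
    squares = [i * i for i in range(1 << (k + 1))]   # covers sums up to 2**(k+1) - 2
    # (a - x)**2 == |a - x|**2, so the absolute value serves as the table index.
    return (squares[a + x] - squares[abs(a - x)]) >> 2


print(multiply_via_squares(11, 13, 4))  # 143
```

The difference of the two squares equals 4ax exactly, so the right shift by two bits loses nothing.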