1 Integer Multipliers
2 Multipliers: a must-have circuit in most DSP applications. A variety of multipliers exists, and one can be chosen based on the required performance: Serial, Serial/Parallel, Shift and Add, Array, Booth, Wallace Tree, …
3 16x16 multiplier [Figure: block diagram with registers RA, RB and RC, each with reset and en inputs, and converter blocks]
4 Multiplication Algorithm
X = Xn-1 Xn-2 … X0 (Multiplicand)
Y = Yn-1 Yn-2 … Y0 (Multiplier)
Partial-product rows (the row for Xj is shifted left by j bit positions):
Yn-1X0 Yn-2X0 Yn-3X0 … Y1X0 Y0X0
Yn-1X1 Yn-2X1 Yn-3X1 … Y1X1 Y0X1
Yn-1X2 Yn-2X2 Yn-3X2 … Y1X2 Y0X2
…
Yn-1Xn-2 Yn-2Xn-2 Yn-3Xn-2 … Y1Xn-2 Y0Xn-2
Yn-1Xn-1 Yn-2Xn-1 Yn-3Xn-1 … Y1Xn-1 Y0Xn-1
Product: P2n-1 P2n-2 P2n-3 … P2 P1 P0
5 1. Multiplication Algorithms
Implementing the multiplication of binary numbers boils down to how the additions are done. Consider two 8-bit numbers A and B that generate the 16-bit product P. First generate the 64 partial products, then add them up.
A = A7 A6 A5 A4 A3 A2 A1 A0
B = B7 B6 B5 B4 B3 B2 B1 B0
Partial-product rows (the row for Bj is shifted left by j bit positions):
A7.B0 A6.B0 A5.B0 A4.B0 A3.B0 A2.B0 A1.B0 A0.B0
A7.B1 A6.B1 A5.B1 A4.B1 A3.B1 A2.B1 A1.B1 A0.B1
A7.B2 A6.B2 A5.B2 A4.B2 A3.B2 A2.B2 A1.B2 A0.B2
A7.B3 A6.B3 A5.B3 A4.B3 A3.B3 A2.B3 A1.B3 A0.B3
A7.B4 A6.B4 A5.B4 A4.B4 A3.B4 A2.B4 A1.B4 A0.B4
A7.B5 A6.B5 A5.B5 A4.B5 A3.B5 A2.B5 A1.B5 A0.B5
A7.B6 A6.B6 A5.B6 A4.B6 A3.B6 A2.B6 A1.B6 A0.B6
A7.B7 A6.B7 A5.B7 A4.B7 A3.B7 A2.B7 A1.B7 A0.B7
Product: P15 P14 P13 P12 P11 P10 P9 P8 P7 P6 P5 P4 P3 P2 P1 P0
The equation is:
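For unsigned operands, adding the partial products amounts to P = sum over i and j of Ai.Bj * 2^(i+j). The following is a minimal Python sketch of that procedure, assuming unsigned 8-bit operands; the function names `partial_products` and `multiply_8x8` are hypothetical and do not come from the slides.

```python
def partial_products(a: int, b: int, n: int = 8) -> list[list[int]]:
    """Return the n x n matrix pp[j][i] = A_i AND B_j for unsigned n-bit a and b."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for i in range(n)] for j in range(n)]


def multiply_8x8(a: int, b: int) -> int:
    """Add the 64 partial products, weighting A_i.B_j by 2**(i + j)."""
    pp = partial_products(a, b)
    return sum(pp[j][i] << (i + j) for j in range(8) for i in range(8))


print(multiply_8x8(13, 11))  # 143
```

The double loop mirrors the matrix layout: index j selects the row (multiplier bit) and i the column within that row.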
6 Multiplier Design [Figure: block diagram with a Control Unit, Storage, input registers (REGIN1), output registers (REG OUT), and the MU (16x16 Multiplier Unit)]
7-28 Animation, Slides 1 to 21 (one step per slide; every slide repeats the same inputs and legend)
X: x3 x2 x1 x0
Y: y3 y2 y1 y0
Signals shown: Input Sequence for G1, and Reset.
Input Sequence for G1:
00 x3 x2 x1 x0 0 x3 x2 x1 x0 0 x3 x2 x1 x0 0 x3 x2 x1 x0
00 y3 y3 y3 y3 0 y2 y2 y2 y2 0 y1 y1 y1 y1 0 y0 y0 y0 y0
Legend:
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i
29-36 Animation, Slides 1 to 8 (one step per slide; the legend is repeated on every slide)
Legend:
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i
37 Shift and Add Multiplier Design Implementation [Figure: datapath with inputs Ain (7 downto 0) and Bin (7 downto 0), registers REGA, REGB and REGC, a MUX with a 0 input, an 8-bit adder, CLOCK, and outputs Result (15 downto 8) and Result (7 downto 0)]
38 Synchronous Shift and Add Multiplier Controller. The multiplication process uses 5 states: Idle, Init, Test, Add, and Shift&Count. Idle: the machine starts when it receives the Start signal. Init: the multiplicand and the multiplier are loaded into a load register and a shift register, respectively. Test: the LSB of the shift register that holds the multiplier is tested to decide the next state.
39 Synchronous Shift and Add Multiplier Controller Design. Add: if the LSB is '1', the next state adds the new partial product to the accumulated result, and the state machine then moves to the Shift&Count state. Shift&Count: entered directly from Test if the LSB is '0'; the two shift registers shift their contents one bit to the right, and the counter counts up by one. After that, the state machine returns to the Test state. When the counter reaches N, a Stop signal is asserted and the state machine goes to the Idle state. Idle: in the Idle state, a Done signal is asserted to indicate the end of the multiplication.
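The five-state controller described above can be modeled in software. This is a minimal Python sketch, assuming unsigned N-bit operands and a Start signal that is asserted once; the names `State` and `shift_add_fsm` are hypothetical, and the sketch is not the VHDL used in the lecture.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    INIT = auto()
    TEST = auto()
    ADD = auto()
    SHIFT_COUNT = auto()


def shift_add_fsm(multiplicand: int, multiplier: int, n: int = 8):
    """Run the controller for unsigned n-bit operands; return (product, visited states)."""
    state, trace = State.IDLE, []
    start = True                      # assumption: Start is asserted exactly once
    mcand = mplier = acc = count = 0
    while True:
        trace.append(state)
        if state is State.IDLE:
            if not start:
                break                 # back in Idle: Done is asserted
            start = False
            state = State.INIT
        elif state is State.INIT:
            mcand, mplier = multiplicand, multiplier   # load register, shift register
            acc = count = 0
            state = State.TEST
        elif state is State.TEST:
            state = State.ADD if mplier & 1 else State.SHIFT_COUNT
        elif state is State.ADD:
            acc += mcand << n         # add into the upper half of the accumulator
            state = State.SHIFT_COUNT
        else:                         # State.SHIFT_COUNT
            acc >>= 1                 # both shift registers move one bit right
            mplier >>= 1
            count += 1
            state = State.IDLE if count == n else State.TEST   # Stop when count == N
    return acc, trace


product, trace = shift_add_fsm(13, 11, n=4)
print(product, [s.name for s in trace])
```

Recording the visited states makes it easy to check that Add is skipped whenever the tested LSB is '0'.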
40 Slide 1 n-bit Multiplier: if Q0 = 1, the multiplicand is added to register A and the result is stored in register A; then registers C, A, Q are shifted to the right one bit. If Q0 = 0, registers C, A, Q are only shifted to the right one bit.
41 Slide 2 Example: 4-bit Multiplier, Initial Values
42 Slide 3 Example: 4-bit Multiplier, First Cycle: Add
43 Slide 4 Example: 4-bit Multiplier, First Cycle: Shift
44 Slide 5 Example: 4-bit Multiplier, Second Cycle: Shift
45 Slide 6 Example: 4-bit Multiplier, Third Cycle: Add
46 Slide 7 Example: 4-bit Multiplier, Third Cycle: Shift
47 Slide 8 Example: 4-bit Multiplier, Fourth Cycle: Add
48 Slide 9 Example: 4-bit Multiplier, Fourth Cycle: Shift
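The register values of this example did not survive in the slide text, so the following Python sketch reproduces the same kind of trace under stated assumptions. The add/shift order on the slides (add, shift; shift; add, shift; add, shift) is consistent with a multiplier Q = 1101; the multiplicand 1011 is purely an assumed value, and `caq_multiply` is a hypothetical name.

```python
def caq_multiply(m: int, q: int, n: int = 4):
    """Multiply unsigned n-bit m (multiplicand) by q (multiplier) with registers C, A, Q.

    Returns (product, log); log holds (step, C, A, Q) after the initial load
    and after every add or shift.
    """
    mask = (1 << n) - 1
    c, a = 0, 0
    log = [("init", c, a, q)]
    for cycle in range(1, n + 1):
        if q & 1:                                  # Q0 = 1: A <- A + M, carry goes to C
            total = a + m
            c, a = total >> n, total & mask
            log.append((f"cycle {cycle} add", c, a, q))
        # shift C, A, Q right one bit as a single (2n + 1)-bit register
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
        log.append((f"cycle {cycle} shift", c, a, q))
    return (a << n) | q, log


product, log = caq_multiply(0b1011, 0b1101)
for step, c, a, q in log:
    print(f"{step:14s} C={c} A={a:04b} Q={q:04b}")
print(product)  # 143
```

After the last shift the product sits in A (upper half) and Q (lower half).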
49 4*4 Synchronous Shift and Add Multiplier Design: layout design and floor plan of the 4*4 synchronous shift and add multiplier
50 Comparison between Synchronous and Asynchronous Approaches.
51 Example (simulated by Ovais Ahmed, Fall_03 project): Multiplicand = 89 (base 16), Multiplier = AB (base 16), Expected Result = 5B83 (base 16)
52 Array Multiplier: Regular structure based on the add-and-shift algorithm. Addition is mainly done with the carry-save algorithm. Sign-bit extension results in a higher capacitive load and slows down the circuit.
53 Addition with CLA
54 Array Multiplier with CSA
55 Critical Path with Array Multipliers [Figure: two of the possible critical paths through the half adders (HA) and full adders (FA) of the ripple-carry based 4*4 multiplier]
Area = (N*N) AND gates + (N-1)*N full adders
Delay = τ_HA + (2N-1) τ_FA
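The two formulas on this slide can be evaluated directly. This is a small Python sketch; the function name `array_multiplier_cost` and the unit delays in the example call are assumptions chosen only for illustration, not measured values.

```python
def array_multiplier_cost(n: int, t_ha: float, t_fa: float) -> dict:
    """Evaluate the slide's area and delay formulas for an N x N array multiplier."""
    return {
        "and_gates": n * n,                    # N*N AND gates
        "full_adders": (n - 1) * n,            # (N-1)*N full adders
        "delay": t_ha + (2 * n - 1) * t_fa,    # tau_HA + (2N-1) * tau_FA
    }


# Hypothetical delays in arbitrary units: tau_HA = 1.0, tau_FA = 2.0
print(array_multiplier_cost(4, t_ha=1.0, t_fa=2.0))
```

The delay grows linearly with N while the gate count grows quadratically.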
56
57 Wallace Tree
58 Array Multiplier + Wallace Tree
59 Background: Baugh-Wooley Algorithm (4/12/2015, Concordia VLSI Lab). Converts negative partial products to a positive representation. No sign extension required.
60 Examples of 5-by-5 Baugh-Wooley
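The idea of replacing negative partial products by positive ones can be checked numerically. The sketch below uses the commonly taught "modified" Baugh-Wooley formulation, which may differ in detail from the 5-by-5 figures on the slide: partial products that involve exactly one sign bit are complemented, and a constant 1 is added at bit positions n and 2n-1. The name `baugh_wooley` is hypothetical.

```python
def baugh_wooley(a: int, b: int, n: int = 5) -> int:
    """Multiply two n-bit two's complement integers using only positive partial products."""
    mask = (1 << n) - 1
    abit = [((a & mask) >> i) & 1 for i in range(n)]
    bbit = [((b & mask) >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            pp = abit[i] & bbit[j]
            if (i == n - 1) != (j == n - 1):   # exactly one sign bit involved
                pp ^= 1                        # complement instead of subtracting
            total += pp << (i + j)
    total += (1 << n) + (1 << (2 * n - 1))     # correction constants
    total &= (1 << (2 * n)) - 1                # keep the 2n-bit result
    # reinterpret the 2n-bit pattern as a signed value
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total


print(baugh_wooley(-7, 5))  # -35
```

No row is sign-extended: every term added is 0 or 1, and the two constants absorb the corrections.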
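61 Squarer partial products for the 8-bit operand a7 a6 a5 a4 a3 a2 a1 a0 multiplied by itself.
Full partial-product matrix (the row for ak is shifted left by k bit positions):
a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 a0*a0
a7*a1 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1 a0*a1
a7*a2 a6*a2 a5*a2 a4*a2 a3*a2 a2*a2 a1*a2 a0*a2
a7*a3 a6*a3 a5*a3 a4*a3 a3*a3 a2*a3 a1*a3 a0*a3
a7*a4 a6*a4 a5*a4 a4*a4 a3*a4 a2*a4 a1*a4 a0*a4
a7*a5 a6*a5 a5*a5 a4*a5 a3*a5 a2*a5 a1*a5 a0*a5
a7*a6 a6*a6 a5*a6 a4*a6 a3*a6 a2*a6 a1*a6 a0*a6
a7*a7 a6*a7 a5*a7 a4*a7 a3*a7 a2*a7 a1*a7 a0*a7
Folded matrix: each pair ai*aj + aj*ai is replaced by a single ai*aj moved one column to the left, and ai*ai reduces to ai. Entries are listed left to right; ai*aj (i > j) sits in column i+j+1 and ai*ai in column 2i.
a7*a6 a7*a5 a7*a4 a7*a3 a7*a2 a7*a1 a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 '0' a0
a7*a7 a6*a5 a6*a4 a6*a3 a6*a2 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1
a6*a6 a5*a4 a5*a3 a5*a2 a4*a2 a3*a2 a2*a2
a5*a5 a4*a3 a3*a3
a4*a4
Result: S15 S14 S13 S12 S11 S10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0
62 Example of an 8-bit squarer (N*N, N = 8 bits)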
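The folding of the squarer matrix can be verified with a short Python sketch. It assumes an unsigned 8-bit operand; `square_folded` is a hypothetical name. Each cross term ai*aj (i > j) is added once at weight 2^(i+j+1), and each diagonal term ai*ai is added as ai at weight 2^(2i).

```python
def square_folded(a: int, n: int = 8) -> int:
    """Square an unsigned n-bit number using the folded partial-product matrix."""
    bits = [(a >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        total += bits[i] << (2 * i)                      # ai*ai = ai
        for j in range(i):
            total += (bits[i] & bits[j]) << (i + j + 1)  # ai*aj + aj*ai = 2*ai*aj
    return total


def folded_term_count(n: int = 8) -> int:
    """Number of terms left after folding: n diagonal terms plus n(n-1)/2 cross terms."""
    return n + n * (n - 1) // 2


print(square_folded(0xAB), folded_term_count())  # 29241 36
```

For N = 8 the folded matrix has 36 terms instead of the 64 of a general multiplier.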
63 Array Multiplier: 32-bit by 32-bit multiplier
64 Booth (Radix-4) Multiplier
Radix-4 (3-bit recoding) reduces the number of partial products to be added by half, which gives a large saving in area and an increase in speed.
$A = -a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + a_{n-3} 2^{n-3} + \dots + a_1 2 + a_0$
$B = -b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + b_{n-3} 2^{n-3} + \dots + b_1 2 + b_0$
The base-4 redundant signed-digit representation of B is
$B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i$
65 $K_i$ is calculated by the following equation:
$K_i = -2 b_{2i+1} + b_{2i} + b_{2i-1}, \quad i = 0, 1, 2, \dots, (n-2)/2$
Three bits of the multiplier B, $b_{2i+1}$, $b_{2i}$, $b_{2i-1}$, are examined and the corresponding $K_i$ is calculated. B is always appended on the right with a zero ($b_{-1} = 0$), and n is always even (B is sign-extended if needed). The product AB is then obtained by adding n/2 partial products:
$AB = P = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i A$
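The recoding equation for K_i can be exercised with a short Python sketch. It assumes two's complement operands and an even n, as stated above; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(b: int, n: int) -> list[int]:
    """Return K_0 .. K_(n/2 - 1) for the n-bit two's complement value b (n even)."""
    assert n % 2 == 0, "n must be even; sign-extend B if needed"
    pattern = b & ((1 << n) - 1)

    def bit(k: int) -> int:
        return 0 if k < 0 else (pattern >> k) & 1   # b_(-1) = 0

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1) for i in range(n // 2)]


def booth_multiply(a: int, b: int, n: int) -> int:
    """Add the n/2 partial products K_i * A * 2**(2i)."""
    return sum(k * a * 4 ** i for i, k in enumerate(booth_radix4_digits(b, n)))


print(booth_radix4_digits(6, 4))   # [-2, 2]
print(booth_multiply(7, -3, 4))    # -21
```

Every digit lies in {-2, -1, 0, +1, +2}, so each partial product needs only a shift and an optional negation of A.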
66 Booth Algorithm: decoding of the multiplier to generate signals for hardware use. [Table with columns Xi+1, Xi, Xi-1, OP, NEG, ZERO, TWO]
67 Booth Algorithm: A Booth-recoded multiplier examines three bits of the multiplier at a time. It determines whether to add 0, +1, -1, +2, or -2 times the multiplicand at that rank. The operation to be performed is based on the current two bits of the multiplier and the previous bit. [Figure: bits Xi+1, Xi, Xi-1 recoded into the digit Z]
68 Booth recoding table
BIT (Xi Xi+1 Xi+2) | OPERATION | M is multiplied by
000 | add zero (no string) | +0
001 | add multiplicand (end of string) | +X
010 | add multiplicand (a string) | +X
011 | add twice the multiplicand (end of string) | +2X
100 | subtract twice the multiplicand (beginning of string) | -2X
101 | subtract the multiplicand (-2X and +X) | -X
110 | subtract the multiplicand (beginning of string) | -X
111 | subtract zero (center of string) | -0
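The table above maps directly onto a lookup. In the Python sketch below, the three control signals NEG, ZERO and TWO are named after the column headers of the decoding slide, but their exact meaning is an assumption made here (NEG = subtract, ZERO = partial product is zero, TWO = use twice the multiplicand); `booth_signals` and `partial_product` are hypothetical names.

```python
# 3-bit group (most significant bit first) -> multiple of the multiplicand
BOOTH_TABLE = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_signals(group: int) -> tuple[int, int, int]:
    """Return assumed control signals (NEG, ZERO, TWO) for one 3-bit group."""
    k = BOOTH_TABLE[group]
    neg = group >> 2             # every "subtract" row has its first bit set
    zero = int(k == 0)
    two = int(abs(k) == 2)
    return neg, zero, two


def partial_product(x: int, group: int) -> int:
    """Build the partial product from the control signals only."""
    neg, zero, two = booth_signals(group)
    pp = 0 if zero else (2 * x if two else x)
    return -pp if neg else pp


for g in range(8):
    print(f"{g:03b}", booth_signals(g), partial_product(5, g))
```

Row 111 asserts both NEG and ZERO, which matches the table entry "subtract zero".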
69 Booth Algorithm: a higher-radix multiplication
Multiplicand A = ● ● ● ●
Multiplier B = (●●)(●●)
Partial product bits ● ● ● ● : (B1 B0)_2 A 4^0
Partial product bits ● ● ● ● : (B3 B2)_2 A 4^1
Product P = ● ● ● ● ● ● ● ●
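70 Example: The following example shows how the calculation is done. Multiplicand X = [value not preserved], Multiplier Y = [value not preserved]. After Booth decoding, Y is recoded so that X is multiplied by +2, -1, and +1 separately; each partial product is then shifted two bits relative to the previous one, and the partial products are added together. [Figure annotations: the three products of X, and a bit labeled "Added to the multiplier"]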
71 Sign Extension
72 Sign extension: traditional sign-extension scheme. (1) Segment the input operands based on the size of the embedded blocks. (2) Multiply the segmented inputs and extend the sign bit of each partial product. (3) Sum all partial products. [Figure: segmented input operands, partial products with sign extension, final result]
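The three steps of the traditional scheme can be illustrated with a Python sketch. It assumes 16-bit signed operands split into two 8-bit segments (low segment unsigned, high segment signed); the block size and the names `to_signed`, `sign_extend` and `segmented_multiply` are assumptions for illustration only.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sign_extend(value: int, width: int) -> int:
    """Return the width-bit two's complement pattern of a signed integer."""
    return value & ((1 << width) - 1)


def segmented_multiply(a: int, b: int, n: int = 16, seg: int = 8) -> int:
    """Signed n x n multiply from seg x seg blocks (assumes n == 2 * seg)."""
    mask = (1 << seg) - 1
    a_parts = ((a & mask, 0), (to_signed(a >> seg, seg), seg))   # (segment, shift)
    b_parts = ((b & mask, 0), (to_signed(b >> seg, seg), seg))
    width = 2 * n
    total = 0
    for pa, sa in a_parts:
        for pb, sb in b_parts:
            pp = pa * pb                                         # one block multiplication
            shift = sa + sb
            total += sign_extend(pp, width - shift) << shift     # extend to full width
    return to_signed(total, width)                               # sum of all partial products


print(segmented_multiply(-1234, 5678))  # -7006652
```

Every partial product that involves a high segment can be negative, so its sign bit has to be replicated up to bit 2n-1 before the final sum.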
73 Booth Algorithm: Example 1
74 Booth Algorithm Example 2 Notice sign extensions
75 Booth Algorithm-Example 3 Notice the sign extensions
76 Comparison of Booth and parallel multiplier shift and Add
77 Template to reduce sign extensions for the Booth algorithm (for hardware implementation). Note that each operand is 17 bits wide, i.e. the 17th bit is the sign bit. Negative numbers are entered in 1's complement, which is why the S bits must be added on the right-hand side of the diagram. If 2's complement is used instead, the S's on the right side of the diagram can be removed.
78 Comparison of Template and the sign extension
79 Partial Product matrix generated for a 16 * 16 bit multiplication, Using booth and the template given in previous slide
80 Using the Template: example 25 * -35, with -35 as the multiplier, using 8-bit representation. [Figure annotations: sign bit; add S S; add inverted S; add inverted sign bit; no sign bit] The raw result is a negative number; converting it gives 875, i.e. 25 * -35 = -875.
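The template figure itself is not preserved in the slide text, so the following Python sketch uses one common way of avoiding sign extension; it may differ in detail from the slide's template. Each Booth row is kept n+1 bits wide, a negative row is entered in 1's complement with an S bit added at its LSB, its sign bit is inverted, and a constant replaces the extension bits. The names `booth_digits` and `booth_no_sign_extension` are hypothetical.

```python
def booth_digits(b: int, n: int) -> list[int]:
    """Radix-4 Booth digits of the n-bit two's complement value b (n even)."""
    pattern = b & ((1 << n) - 1)
    bit = lambda k: 0 if k < 0 else (pattern >> k) & 1
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1) for i in range(n // 2)]


def booth_no_sign_extension(x: int, y: int, n: int = 8) -> int:
    """Signed n-bit x * y with Booth rows that are never sign-extended."""
    w = n + 1                                   # +/-2x needs n + 1 bits
    total = 0
    for i, k in enumerate(booth_digits(y, n)):
        row = (abs(k) * x) & ((1 << w) - 1)     # w-bit pattern of |k| * x
        s = 1 if k < 0 else 0
        if s:
            row ^= (1 << w) - 1                 # 1's complement for a negative digit
        sign = row >> (w - 1)
        row = (row & ((1 << n) - 1)) | ((sign ^ 1) << n)   # store the inverted sign bit
        total += (row + s) << (2 * i)           # S is added at the row's LSB
        total -= 1 << (n + 2 * i)               # hardware would precompute these constants
    total &= (1 << (2 * n)) - 1
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total


print(booth_digits(-35, 8))              # [1, -1, 2, -1]
print(booth_no_sign_extension(25, -35))  # -875
```

In hardware the per-row constants are summed once at design time, so they cost a few fixed bits in the partial-product matrix rather than extra adders per row.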
81 Booth Multiplier Components: Multiplier and Multiplicand inputs, Booth Encoder, PPU (partial products unit), PPA (partial products adding unit), Product output
82 Wallace Tree and Ripple-Carry Adder structure of an 8*8 multiplier, with pipeline
83 Hardware implementation of Booth with shift and add
84 Simulation Plan
85 Testing the Design
86 Simulation For Parallel Multipliers Signed Number: Unsigned Number:
87 Simulation for Signed S/P Multipliers: There is a 340 ns delay between the operands and the result because of the delay of the D flip-flops.
88 FPGA after implementation, areas of programming shown clearly
89 Another implementation of the above after pipelining; place and route has placed the design in different locations.
90 Spartacus FPGA board
91 Testing the multiplication system
92 Comparison of Multipliers. Table 7. Performance comparison for two's complement multipliers (by Chen Yaoquan, M.Eng)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Rows: Area, total CLBs (#); Maximum Delay D (ns); Total Dynamic Power P (W); Delay·Power Product DP (ns W); Area·Power Product AP (# W); Area·Delay Product AD (# ns); Area·Delay² Product AD² (# ns²)
[Cell values are not preserved, apart from the Maximum Delay fragments "(3.36x32)" and "49.33"]
93 Comparison of Multipliers. Table 7. Performance comparison for unsigned multipliers (by Chen Yaoquan, M.Eng)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Rows: Area, total CLBs (#); Maximum Delay D (ns); Total Dynamic Power P (W); Delay·Power Product DP (ns W); Area·Power Product AP (# W); Area·Delay Product AD (# ns); Area·Delay² Product AD² (# ns²)
[Cell values are not preserved]
94 Comparison of Multipliers: the relation between area and delay for the behavioral multiplier (the "banana curve"), obtained by changing the value of "set_max_delay" in the script file. [Table rows: set_max_delay (ns), up to >60; Area (#); Power (W); Delay (ns)]
95 Comparison of Multipliers (by Chen Yaoquan, M.Eng)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area: Medium | Small | Large | Small | Smallest | Medium
Critical Delay: Medium | Fast | Very Fast | Fastest | Very Large | Large
Power Consumption: Large | Medium | Large | Medium | Smallest | Medium
Complexity (five entries survive): Simple | Complex | More Complex | Simple | Simplest
Implement (five entries survive): Easy | Medium | Difficult | Easy | Easiest
96 Pipelining Simulation
97 Synthesis for Signed Multipliers: Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P, Behavioral
98 Synthesis for Unsigned Multipliers: Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P, Behavioral
99 Conclusion: Modified Booth and Wallace Tree are the best techniques for high-speed multiplication. The Wallace Tree has the best performance, but it is hard to implement. Booth-algorithm-based multipliers have a lower area than the other parallel multipliers. For behavioral multipliers, the area increases as the delay decreases.
100 Comparison
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace Tree Multiplier | Modified Booth & Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier
Area, total CLBs (#): [values not preserved]
Maximum Delay (ns): … ns | … ns | … ns | … ns | 22.58 ns (722.56 ns)
Power Consumption at highest speed (mW): … mW (at 188 ns) | … mW (at 140 ns) | 30.95 mW (at … ns) | … mW (at … ns) | 2.089 mW (at … ns)
Delay·Power Product DP (ns mW): [values not preserved]
Area·Power Product AP (# mW): [values not preserved]
Area·Delay Product AD (# ns): [mantissas not preserved; last entry is × 10^3]
Area·Delay² Product AD² (# ns²): [mantissas not preserved; last entry is × 10^6]
101 NOTICE The rest of these slides are for extra information only and are not part of the lecture
102 Array Addition
103 Addition of 8 binary numbers using the Wallace tree principle
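The Wallace tree principle can be sketched with carry-save (3:2) compression on whole words. This Python sketch assumes non-negative integer operands; `csa` and `wallace_sum` are hypothetical names, and the final `sum` stands in for the carry-propagate adder.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compressor on whole words: return (sum word, carry word)."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def wallace_sum(operands: list[int]) -> tuple[int, int]:
    """Reduce the operands to two rows with layers of 3:2 compressors.

    Returns (total, number of compressor levels).
    """
    rows, levels = list(operands), 0
    while len(rows) > 2:
        grouped = len(rows) - len(rows) % 3
        nxt = []
        for i in range(0, grouped, 3):
            nxt.extend(csa(*rows[i:i + 3]))    # three rows become two
        nxt.extend(rows[grouped:])             # leftover rows pass through
        rows, levels = nxt, levels + 1
    return sum(rows), levels                   # final carry-propagate addition


print(wallace_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 4)
```

Eight operands shrink as 8, 6, 4, 3, 2 rows, so four compressor levels precede the final adder.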
104
105
106
107 Baugh-Wooley two's complement multiplier:
108
109 Cluster Multipliers Divide the multiplier into smaller multipliers
110 Cluster Multipliers 8-bit cluster low power multiplier The circuit used to generate the enable signal
111 Cluster Multipliers
–Divide the multiplication circuit into clusters (blocks) of smaller multipliers
–Apply clock gating techniques to disable the blocks that are producing a zero result
Features
–Low power (claims 13.4% savings)
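The clustering idea can be modeled in software. The enable circuit of the slide is not preserved, so the Python sketch below simply assumes that a block is disabled whenever one of its operand segments is zero, since it would then produce a zero result. It also assumes unsigned 8-bit operands and 4-bit clusters; `cluster_multiply` is a hypothetical name.

```python
def cluster_multiply(a: int, b: int, n: int = 8, k: int = 4) -> tuple[int, int]:
    """Unsigned n x n multiply from (n/k)**2 blocks of size k x k.

    Returns (product, number of blocks that were enabled).
    """
    mask = (1 << k) - 1
    product, active = 0, 0
    for i in range(0, n, k):
        a_seg = (a >> i) & mask
        for j in range(0, n, k):
            b_seg = (b >> j) & mask
            enable = a_seg != 0 and b_seg != 0   # assumed gating condition
            if enable:
                active += 1
                product += (a_seg * b_seg) << (i + j)
    return product, active


print(cluster_multiply(0x0F, 0x03))  # (45, 1)
print(cluster_multiply(0xFF, 0xFF))  # (65025, 4)
```

With small operands only one of the four blocks switches, which is where the power saving would come from.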
112 Multiplexer-Based Array Multipliers [Figure: cells labeled Z j and x j y j]
113 Multiplexer-Based Array Multipliers. Two types of cells: Cell 1 produces the terms Z i j 2 j and includes a full adder of the carry-save adder array. Cell 2 produces the terms x j y j 2 j and includes a full adder of the carry-save adder array.
114 Multiplexer-Based Array Multipliers: Characteristics
–Faster than Modified Booth
–Unlike Booth, does not require encoding logic
–Requires approximately N^2/2 cells
–Has a zigzag shape, thus not layout-friendly
115 Multiplexer-Based Array Multipliers: Improvement
–More rectangular layout
–Saves up to 40 percent area without penalties
–Outperforms the modified Booth multiplier in both speed and power by 13% to 26%
116 Gray-Encoded Array Multiplier [Table: decimal values (Dec) and their hybrid codes (Hyb), compared with 2's complement]
Hybrid Coding
–Has a single bit different for consecutive values
–Reduces the number of transitions, and thus power (for highly correlated streams)
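The hybrid code table is not preserved, so the following Python sketch uses the standard binary-reflected Gray code instead, only to illustrate the single-bit-change property and its effect on switching activity. The ramp input is a hypothetical, highly correlated stream; `to_gray` and `transitions` are hypothetical names.

```python
def to_gray(v: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return v ^ (v >> 1)


def transitions(codes: list[int]) -> int:
    """Total number of bit flips between consecutive code words."""
    return sum(bin(p ^ q).count("1") for p, q in zip(codes, codes[1:]))


ramp = list(range(16))                       # slowly varying input stream
binary_toggles = transitions(ramp)
gray_toggles = transitions([to_gray(v) for v in ramp])
print(binary_toggles, gray_toggles)          # 26 15
```

On this ramp the Gray-coded stream flips exactly one bit per step, while plain binary flips up to four bits at once (for example from 7 to 8).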
117 Gray-Encoded Array Multiplier An 8-bit wide 2’s complement radix-4 array multiplier
118 Gray-Encoded Array Multiplier: Characteristics
–Uses Gray code to reduce the switching activity of the multiplier
–Uses 45.6% less power than Modified Booth
–Uses 26.4% more area than Modified Booth
119 Ultra-high Speed Parallel Multiplier: How is ultra-high speed achieved?
–Based on the Modified Booth Algorithm and a tree structure (column compression)
–Chooses efficient counters (3:2 and 5:3)
–Uses the new compressor (20% faster)
–Uses the First Partial product Addition (FPA) algorithm (reducing the bits of the CLA by 50%)
120 Ultra-high Speed Parallel Multiplier: Calculate the partial products as soon as possible. The final CLA is only 16 bits wide instead of 32 bits. Divide into 3 rows or 5 rows only (most efficient). [Figure: calculation process using parallel counters for the 16x16 case] In total, the delay is reduced by about 30%.
121 ULLRLF Multiplier: ULLRLF stands for Upper/Lower Left-to-Right Leapfrog. It combines the following techniques:
–Signal flow optimization in the [3:2] adder array for partial product reduction,
–Left-to-right leapfrog (LRLF) signal flow,
–Splitting of the reduction array into upper/lower parts.
122 ULLRLF Multiplier 1) Signal flow optimization in the [3:2] adder array
-- For n = 32, the delay is reduced by 30 percent.
-- Power is also saved.
PPij is always connected to pin A. Sin/Cin are connected to B/C; most Sin signals are connected to C.
123 ULLRLF Multiplier 2) Left-to-Right Leapfrog (LRLF) Structure
-- Signal delays are better balanced.
-- Low power.
The sum signals skip over alternate rows.
124 ULLRLF Multiplier 3) Upper/Lower Split Structure
-- The long data path is broken into parallel short paths, which saves power.
-- The delay of the partial product reduction is reduced.
[Figure annotation: only n+2 bits]
125 ULLRLF Multiplier: Floorplan of ULLRLF (n = 32). ULLRLF multipliers consume less power than optimized tree multipliers for n ≤ 32 while keeping similar delay and area. With more regularity and inherently shorter interconnects, the ULLRLF structure presents a competitive alternative to tree structures.
126 Signed Array Multiplier
127 Unsigned Array Multiplier
128 Signed Modified Booth Multiplier
129 Signed Modified Booth Multiplier
130 Unsigned Modified Booth Multiplier
131 Unsigned Modified Booth Multiplier
132 Wallace Tree multipliers
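133 Wallace Tree multipliers: Use 3:2 counters and 2:2 counters. Number of levels: the estimate log(32/2) / log(3/2) ≈ 6.8 is a lower bound; reducing 32 partial products to 2 rows with 3:2 counters takes 8 levels. Irregular structure. Fast.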
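The level count can be checked by iterating the 3:2 reduction. This Python sketch assumes that every group of three rows is compressed to two per level and that leftover rows pass through unchanged; `wallace_levels` and `log_estimate` are hypothetical names.

```python
import math


def wallace_levels(rows: int) -> int:
    """Count 3:2 counter levels needed to reduce `rows` partial products to 2."""
    levels = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3   # each full group of 3 rows becomes 2
        levels += 1
    return levels


def log_estimate(rows: int) -> float:
    """The logarithmic estimate log(rows/2) / log(3/2)."""
    return math.log(rows / 2) / math.log(3 / 2)


for rows in (32, 16):
    print(rows, wallace_levels(rows), round(log_estimate(rows), 2))
# 32 8 6.84
# 16 6 5.13
```

The 16-row case corresponds to a 32-bit multiplier whose partial products were first halved by radix-4 Booth recoding.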
134 Wallace Tree multipliers 2-level hierarchical
135 Modified Booth-Wallace Tree Multipliers
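136 Modified Booth-Wallace Tree Multipliers: Use 3:2 counters and 2:2 counters. Number of levels: the estimate log(16/2) / log(3/2) ≈ 5.1 is a lower bound; reducing 16 partial products to 2 rows with 3:2 counters takes 6 levels. Irregular structure. Fast. Less area.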
137 Twin pipe serial-parallel multipliers
138 Signed twin pipe serial-parallel multipliers “Sign” control line and the sign-change hardware
139 Unsigned twin pipe serial-parallel multipliers Don’t need the “Sign” control line and the sign-change hardware