Integer Multipliers.

Presentation on theme: "Integer Multipliers."— Presentation transcript:

1 Integer Multipliers

2 Multipliers
A multiplier is a must-have circuit in most DSP applications.
A variety of multipliers exists, and one can be chosen based on its performance: Serial, Serial/Parallel, Shift and Add, Array, Booth, Wallace Tree, ...

3 16x16 multiplier
Block diagram labels: two converter blocks; registers RA, RB, RC.

4 Multiplication Algorithm
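X = X_{n-1} X_{n-2} ... X_0    (Multiplicand)
Y = Y_{n-1} Y_{n-2} ... Y_0    (Multiplier)

Partial products (each row is shifted one position to the left of the row above):
Y_{n-1}X_0      Y_{n-2}X_0      Y_{n-3}X_0      ...  Y_1X_0      Y_0X_0
Y_{n-1}X_1      Y_{n-2}X_1      Y_{n-3}X_1      ...  Y_1X_1      Y_0X_1
Y_{n-1}X_2      Y_{n-2}X_2      Y_{n-3}X_2      ...  Y_1X_2      Y_0X_2
...
Y_{n-1}X_{n-2}  Y_{n-2}X_{n-2}  Y_{n-3}X_{n-2}  ...  Y_1X_{n-2}  Y_0X_{n-2}
Y_{n-1}X_{n-1}  Y_{n-2}X_{n-1}  Y_{n-3}X_{n-1}  ...  Y_1X_{n-1}  Y_0X_{n-1}

Product: P_{2n-1} P_{2n-2} ... P_2 P_1 P_0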

5 1. Multiplication Algorithms
Implementing multiplication of binary numbers boils down to how the additions are done. Consider two 8-bit numbers A and B that generate the 16-bit product P: first generate the 64 partial-product bits, then add them up.
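The two steps on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the operands 89 and AB (hex) are borrowed from the simulated example later in the deck.

```python
def partial_products(a: int, b: int, n: int = 8):
    """Return the n*n partial-product bits as (column_weight, bit) pairs."""
    bits = []
    for i in range(n):              # bit i of A
        for j in range(n):          # bit j of B
            bit = ((a >> i) & 1) & ((b >> j) & 1)   # one AND gate per bit pair
            bits.append((i + j, bit))               # this bit has weight 2**(i+j)
    return bits


def multiply(a: int, b: int, n: int = 8) -> int:
    """Add up all partial-product bits according to their column weights."""
    return sum(bit << weight for weight, bit in partial_products(a, b, n))


pp = partial_products(0x89, 0xAB)
print(len(pp))                      # 64 partial-product bits for 8 x 8
print(hex(multiply(0x89, 0xAB)))    # 0x5b83
```

Every multiplier architecture in the rest of the deck produces the same bits; they differ only in how the final summation is organised.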

6 Multiplier Design
Block diagram: input register (REG IN), MU (Multiplier Unit), output register (REG OUT), Storage, Control Unit.

7 Serial Multiplier
X: x3 x2 x1 x0    Y: y3 y2 y1 y0
Input sequence for G1:
00 x3x2x1x0   0 x3x2x1x0   0 x3x2x1x0   0 x3x2x1x0
00 y3y3y3y3   0 y2y2y2y2   0 y1y1y1y1   0 y0y0y0y0
Reset: (shown in the figure)
Slides 7 to 28 show the same inputs frame by frame (Slide 1 to Slide 21), with this notation:
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i

29 Serial / Parallel Multiplier
Slides 29 to 36 show the figure frame by frame (Slide 1 to Slide 8), with this notation:
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i

37 Shift AND Add Multiplier
Block diagram components: INPUT Ain (7 downto 0) and REGA; INPUT Bin (7 downto 0) and REGB; 8-bit Adder; MUX; REGC; CLOCK; outputs Result (15 downto 8) and Result (7 downto 0).

38 Synchronous Shift and Add Multiplier Controller
The multiplication process has 5 states: Idle, Init, Test, Add, and Shift&Count.
Idle: multiplication starts on receiving the Start signal.
Init: the multiplicand and multiplier are loaded into a load register and a shift register, respectively.
Test: the LSB of the shift register that contains the multiplier is tested to decide the next state.

39 Synchronous Shift and Add Multiplier Controller Design
Add: if the LSB is '1', the new partial product is added to the accumulated result, and the state machine moves to the Shift&Count state.
Shift&Count: if the LSB is '0' (or after an Add), the two shift registers shift their contents one bit to the right and the counter counts up by one. The state machine then returns to the Test state.
When the counter reaches N, a Stop signal is asserted and the state machine goes to the Idle state.
Idle: in the Idle state, a Done signal is asserted to indicate the end of the multiplication.
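The five-state controller described above can be modelled in software as a small state machine. The sketch below is an assumption-laden illustration, not the deck's VHDL: the function name, the variable names, and the way the accumulator's LSB is shifted into the multiplier register are choices made for the example.

```python
def shift_add_fsm(multiplicand: int, multiplier: int, n: int = 4):
    """Walk the Idle/Init/Test/Add/Shift&Count states; return (product, trace)."""
    state, trace = "Idle", []
    acc = mult = count = 0
    start = True                                  # Start signal, seen once
    while True:
        trace.append(state)
        if state == "Idle":
            if not start:                         # back in Idle: Done
                break
            start = False
            state = "Init"
        elif state == "Init":
            acc, mult, count = 0, multiplier, 0   # load the registers
            state = "Test"
        elif state == "Test":
            state = "Add" if mult & 1 else "Shift&Count"
        elif state == "Add":
            acc += multiplicand                   # accumulate partial product
            state = "Shift&Count"
        elif state == "Shift&Count":
            # Shift the accumulator/multiplier pair right by one bit.
            mult = (mult >> 1) | ((acc & 1) << (n - 1))
            acc >>= 1
            count += 1
            state = "Idle" if count == n else "Test"   # Stop when count == N
    return (acc << n) | mult, trace


product, trace = shift_add_fsm(0b1011, 0b1101)
print(product)   # 143
print(trace)
```

With the assumed operands 1011 and 1101, the trace visits Add three times and Shift&Count four times before returning to Idle.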

40 n-bit Multiplier (Slide 1)
Q0 = 1: the multiplicand is added to register A and the result is stored in register A; then registers C, A, Q are shifted right one bit.
Q0 = 0: registers C, A, Q are shifted right one bit.
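A register-level sketch of this rule follows, recording the contents of C, A, and Q after every add and every shift. The function name and the operands (multiplicand 1011, multiplier 1101) are assumptions chosen for illustration; the slides' own initial values are in the figures and are not reproduced here.

```python
def caq_multiply(m: int, q: int, n: int = 4):
    """Multiply with carry bit C, accumulator A and multiplier register Q."""
    mask = (1 << n) - 1
    c, a = 0, 0
    steps = [(c, a, q)]                      # initial values
    for _ in range(n):
        if q & 1:                            # Q0 = 1: A <- A + M, carry into C
            total = a + m
            c, a = total >> n, total & mask
            steps.append((c, a, q))
        # Shift C, A, Q right one bit as a single long register.
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
        steps.append((c, a, q))
    return (a << n) | q, steps


product, steps = caq_multiply(0b1011, 0b1101)
for c, a, q in steps:
    print(c, format(a, "04b"), format(q, "04b"))
print(product)   # 143
```

For these operands the snapshot list has eight entries: the initial values, then add/shift, shift, add/shift, add/shift.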

41 Example: 4-bit Multiplier
Initial Values Slide 2

42 Example: 4-bit Multiplier
First Cycle--Add Slide 3

43 Example: 4-bit Multiplier
First Cycle--Shift Slide 4

44 Example: 4-bit Multiplier
Second Cycle--Shift Slide 5

45 Example: 4-bit Multiplier
Third Cycle--Add Slide 6

46 Example: 4-bit Multiplier
Third Cycle--Shift Slide 7

47 Example: 4-bit Multiplier
Fourth Cycle--Add Slide 8

48 Example: 4-bit Multiplier
Fourth Cycle--Shift Slide 9

49 4*4 Synchronous Shift and Add Multiplier Design Layout Design
Floor plan of the 4*4 Synchronous Shift and Add Multiplier

50 Comparison between Synchronous and Asynchronous Approaches

51 Example : (simulated by Ovais Ahmed)
Multiplicand = 10001001_2 = 89_16
Multiplier = 10101011_2 = AB_16
Expected Result = 0101101110000011_2 = 5B83_16

52 Array Multiplier
·     Regular structure based on the add and shift algorithm.
·     Addition is mainly done by the carry save algorithm.
·     Sign bit extension results in a higher capacitive load and slows down the circuit.
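To make the carry-save idea concrete, here is a minimal word-level sketch: each row of the array takes the running sum word, the running carry word, and one shifted copy of the multiplicand, and produces a new sum/carry pair without propagating carries. Only the last step uses a carry-propagate addition. The function names are hypothetical, and the model ignores sign handling (unsigned operands only).

```python
def csa(x: int, y: int, z: int):
    """3:2 carry-save adder on whole words: returns (sum_word, carry_word)."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries, moved one column left
    return s, c


def array_multiply(a: int, b: int, n: int = 8) -> int:
    """Unsigned n-bit multiply: one carry-save row per multiplier bit."""
    s, c = 0, 0
    for j in range(n):
        row = (a << j) if (b >> j) & 1 else 0    # partial product for bit j
        s, c = csa(s, c, row)                    # no carry ripple inside the array
    return s + c                                 # final carry-propagate addition


print(hex(array_multiply(0x89, 0xAB)))   # 0x5b83
```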

53 Addition with CLA

54 Array Multiplier with CSA

55 Critical Path with Array Multipliers
Figure: two of the possible critical paths through the HA/FA cells of the ripple-carry based 4*4 multiplier.
Area = (N*N) AND gates + (N-1)*N full adders
Delay = \tau_{HA} + (2N-1) \tau_{FA}

56

57 Wallace Tree

58 Array Multiplier + Wallace Tree

59 Baugh-Wooley Algorithm
Convert negative partial products to a positive representation.
No sign extension required.
(12/7/2018, Concordia VLSI Lab)
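One common textbook formulation of this idea replaces each negative partial-product bit by its complement and adds fixed correction constants. The sketch below follows that formulation; it is not necessarily the exact arrangement drawn on these slides, and the function name is hypothetical.

```python
def baugh_wooley(a: int, b: int, n: int = 5) -> int:
    """Signed n-bit multiply using only positive partial-product bits."""
    abit = [(a >> i) & 1 for i in range(n)]      # two's complement bits of a
    bbit = [(b >> i) & 1 for i in range(n)]      # two's complement bits of b
    p = 0
    for i in range(n - 1):                       # ordinary positive terms
        for j in range(n - 1):
            p += (abit[i] & bbit[j]) << (i + j)
    p += (abit[n - 1] & bbit[n - 1]) << (2 * n - 2)
    for i in range(n - 1):                       # negative terms, complemented
        p += (1 - (abit[n - 1] & bbit[i])) << (n - 1 + i)
        p += (1 - (abit[i] & bbit[n - 1])) << (n - 1 + i)
    p += (1 << n) + (1 << (2 * n - 1))           # correction constants
    p &= (1 << (2 * n)) - 1                      # keep 2n result bits
    # Reinterpret the 2n-bit word as a signed number.
    return p - (1 << (2 * n)) if p >> (2 * n - 1) else p


print(baugh_wooley(-7, 13))    # -91
print(baugh_wooley(-16, -16))  # 256
```

All summed bits are non-negative, so the array needs no sign-extended rows.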

60 examples of 5-by-5 Baugh-Wooley

61 Squarer using Baugh-Wooley Algorithm
Partial-product matrix (each row is shifted one position to the left of the row above):
a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 a0*a0
a7*a1 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1 a0*a1
a7*a2 a6*a2 a5*a2 a4*a2 a3*a2 a2*a2 a1*a2 a0*a2
a7*a3 a6*a3 a5*a3 a4*a3 a3*a3 a2*a3 a1*a3 a0*a3
a7*a4 a6*a4 a5*a4 a4*a4 a3*a4 a2*a4 a1*a4 a0*a4
a7*a5 a6*a5 a5*a5 a4*a5 a3*a5 a2*a5 a1*a5 a0*a5
a7*a6 a6*a6 a5*a6 a4*a6 a3*a6 a2*a6 a1*a6 a0*a6
a7*a7 a6*a7 a5*a7 a4*a7 a3*a7 a2*a7 a1*a7 a0*a7
Result: '0' S15, S14 S13 S12 S11 S10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0

62 Example of an 8bit squarer

63 Array Multiplier 32bits by 32bits multiplier

64 Booth (Radix-4) Multiplier
·     Radix-4 (3-bit recoding) reduces the number of partial products to be added by half.
·     Great saving in area and increased speed.
A = -a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + a_{n-3} 2^{n-3} + ... + a_1 2 + a_0
B = -b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + b_{n-3} 2^{n-3} + ... + b_1 2 + b_0
·     The base-4 redundant signed-digit representation of B is
B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i

65 K_i is calculated by the following equation
K_i = -2 b_{2i+1} + b_{2i} + b_{2i-1},    i = 0, 1, 2, ..., (n-2)/2
·     Three bits of the multiplier B (b_{2i+1}, b_{2i}, b_{2i-1}) are examined and the corresponding K_i is calculated.
·     B is always appended on the right with a zero (b_{-1} = 0), and n is always even (B is sign extended if needed).
·     The product AB is then obtained by adding n/2 partial products:
AB = P = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i A
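The recoding rule and the product sum can be checked with a short sketch. The function names are hypothetical; the operands 25 and -35 with n = 8 are taken from the template example later in the deck. Python's right shift on negative integers is arithmetic, so `(b >> k) & 1` yields the two's complement bits directly.

```python
def booth_digits(b: int, n: int):
    """Radix-4 Booth digits K_i (least significant first) of n-bit b, n even."""
    def bit(k: int) -> int:
        return 0 if k < 0 else (b >> k) & 1      # b_{-1} = 0 by definition
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(n // 2)]


def booth_multiply(a: int, b: int, n: int = 8) -> int:
    """Sum the n/2 partial products 2^(2i) * K_i * A."""
    return sum((4 ** i) * k * a for i, k in enumerate(booth_digits(b, n)))


print(booth_digits(-35, 8))      # [1, -1, 2, -1]
print(booth_multiply(25, -35))   # -875
```

Each digit is in {-2, -1, 0, 1, 2}, so every partial product is a shifted, possibly negated, copy of A or 2A.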

66 Booth Algorithm Decoding of multiplier to generate signals for hardware use
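Table columns: Xi+1, Xi, Xi-1 (multiplier bits); OP (operation: a multiple of 1 or 2); NEG, ZERO, TWO (control signals for the hardware).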

67 Booth Algorithm
A Booth recoded multiplier examines three bits of the multiplier at a time. It determines whether to add zero, 1, -1, 2, or -2 times the multiplicand at that rank. The operation to be performed is based on the current two bits of the multiplier and the previous bit.
Xi+1  Xi  Xi-1   Zi/2
 0    0    0      0
 0    0    1      1
 0    1    0      1
 0    1    1      2
 1    0    0     -2
 1    0    1     -1
 1    1    0     -1
 1    1    1      0

68 Booth Algorithm: bit patterns and operations
Bit weights 2^1, 2^0, 2^-1 (labelled Xi, Xi+1, Xi+2 on the slide); M is the multiplicand.
Bits    Operation                                                M is multiplied by
0 0 0   add zero (no string)                                     +0
0 0 1   add multiplicand (end of string)                         +X
0 1 0   add multiplicand (a string)                              +X
0 1 1   add twice the multiplicand (end of string)               +2X
1 0 0   subtract twice the multiplicand (beginning of string)    -2X
1 0 1   subtract the multiplicand (-2X and +X)                   -X
1 1 0   subtract the multiplicand (beginning of string)          -X
1 1 1   subtract zero (center of string)                         -0

69 Booth Algorithm- dot notation
Multiplicand A = ● ● ● ●
Multiplier B = (●●)(●●)
Partial product bits ● ● ● ●    (B1B0)_2 A 4^0
Partial product bits ● ● ● ●    (B3B2)_2 A 4^1
Product P = ● ● ● ● ● ● ● ●

70 Example
The following example shows how the calculation is done. The slide gives the multiplicand X and the multiplier Y, with a bit marked "Added to the multiplier". After Booth decoding, Y directs multiplying X by +2, -1, and +1 separately; each partial product is then shifted two bits relative to the previous one, and they are added together.

71 Sign Extension

72 Sign extension
Traditional sign-extension scheme:
·     Segment the input operands based on the size of the embedded blocks.
·     Multiply the segmented inputs and extend the sign bit of each partial product.
·     Sum all partial products.
Figure labels: segmented input operands, sign extension, partial products, final result.

73 Booth Algorithm-Example 1
Example 1:

74 Booth Algorithm Example 2
Notice sign extensions

75 Booth Algorithm-Example 3
Notice the sign extensions

76 Comparison of Booth and parallel multiplier shift and Add

77 Template to reduce sign extensions for Booth Algorithm
Please note that each operand is 17 bits, i.e. the 17th bit is the sign bit. Also, negative numbers are entered in 1's complement, which is why the S must be added on the right-hand side of the diagram. If 2's complement is used, the S's on the right side of the diagram can be removed.

78 Comparison of Template and the sign extension

79 Example of using the template
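25 * -35, with -35 as the multiplier, using 8-bit representation and the template.
Booth multiples applied to 25, in the order shown on the slide: * 1, * -1, * 2, * -1.
Template annotations: sign bit; add SS; add inverted S; add inverted sign and add 1; add inverted sign bit; no sign bit.
The sum is a negative number. Converting it gives a magnitude of 875.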

80 Booth Multiplier Components
Multiplicand Booth Encoder PPU (Partial products unit) PPA (Partial products adding unit) Product

81 Wallace Tree and Ripple Carry Adder Structure.
Of 8*8 multiplier With Pipeline

82 Hardware implementation of Booth with shift and add

83 Simulation Plan

84 Testing the Design

85 Simulation For Parallel Multipliers
Signed Number: Unsigned Number:

86 Simulation For Signed S/P Multipliers
There is a 340 ns delay between the operands and the result because of the D flip-flop delays.

87 FPGA after implementation, areas of programming shown clearly

88 Another implementation of the above after pipelining; place and route has placed the design in different locations.

89 Spartacus FPGA board

90 Testing the multiplication system

91 Comparison of Multipliers
Table 7. Performance comparison for two's complement multipliers (By Chen Yaoquan, M.Eng. 2005)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area, total CLBs (#): 490.00 (only this entry survives)
Maximum Delay D (ns): 35.78 | 24.43 | 18.93 | 18.53 | (3.36x32) | 49.33
Total Dynamic Power P (W): 7.52 | 6.33 | 7.46 | 6.41 | 0.28 | 6.24
Delay·Power Product DP (ns W): 268.98 | 154.64 | 141.14 | 118.76 | 30.62 | 307.58
Area·Power Product AP (# W): 139.54 (only this entry survives)
Area·Delay Product AD (# ns): 1.10E+05 | 6.47E+04 | 6.30E+04 | 4.95E+04 | 5.27E+04 | 1.48E+05
Area·Delay^2 Product AD2 (# ns^2): 3.94E+06 | 1.58E+06 | 1.19E+06 | 9.18E+05 | 5.66E+06 | 7.28E+06

92 Comparison of Multipliers
Table 7. Performance comparison for unsigned multipliers (By Chen Yaoquan, M.Eng. 2005)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area, total CLBs (#): 487.00 (only this entry survives)
Maximum Delay D (ns): 37.23 | 25.33 | 18.93 | 18.33 | 107.52 | 44.50
Total Dynamic Power P (W): 7.57, 6.66, 7.32, 0.29, 6.26 (five of the six entries survive)
Delay·Power Product DP (ns W): 281.88 | 168.77 | 138.60 | 122.13 | 30.66 | 278.53
Area·Power Product AP (# W): 138.89 (only this entry survives)
Area·Delay Product AD (# ns): 1.22E+05 | 7.09E+04 | 6.29E+04 | 5.22E+04 | 5.24E+04 | 1.34E+05
Area·Delay^2 Product AD2 (# ns^2): 4.55E+06 | 1.80E+06 | 1.19E+06 | 9.56E+05 | 5.63E+06 | 5.95E+06

93 Comparison of Multipliers
Change the value of "set_max_delay" in the script file.
set_max_delay (ns): 10 | 20 | 30 | 40 | 50 | 60 | >60
Area (#): 3014.5 | 3013.0 | 3110.0 | 3193.5 | 3019.5 | 2999.5 | 2978.5
Power (W): 6.6499 | 6.6470 | 7.5683 | 8.1878 | 8.0645 | 8.0419 | 8.0156
Delay (ns): 31.98, 30.93, 30.08, 39.93, 49.88, 59.63 (six of the seven entries survive)
The relation of area and delay for the behavioral multiplier: the "banana curve".

94 Comparison of Multipliers
Qualitative comparison (By Chen Yaoquan, M.Eng. 2005)
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area: Medium, Small, Large, Smallest (surviving entries)
Critical Delay: Fast, Very Fast, Fastest, Very Large (surviving entries)
Power Consumption: (no entries survive)
Complexity: Simple, Complex, More Complex, Simplest (surviving entries)
Implement: Easy, Difficult, Easiest (surviving entries)

95 Pipelining Simulation

96 Synthesis for Signed Multipliers
Array Modified Booth Wallace Tree Modified Booth -Wallace Tree Twin Pipe S/P Behavioral

97 Synthesis for Unsigned Multipliers
Array Modified Booth Wallace Tree Modified Booth -Wallace Tree Twin Pipe S/P Behavioral

98 Conclusion
Modified Booth and Wallace Tree are the best techniques for high-speed multiplication.
Wallace Tree has the best performance, but it is hard to implement.
Booth-algorithm-based multipliers have lower area among parallel multipliers.
For behavioral multipliers, the area increases as the delay decreases.

99 Comparison
Columns, in order: Array Multiplier | Modified Booth Multiplier | Wallace Tree Multiplier | Modified Booth & Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier
Area, total CLBs (#): 1165 | 1292 | 1659 | 1239 | 133
Maximum Delay (ns): 187.87 | 139.41 | 101.14 | 101.43 | 22.58 (722.56)
Power Consumption at highest speed (mW): (value missing) (at 188 ns) | 23.136 (at 140 ns) | 30.95 | 30.862 | 2.089
Delay·Power Product DP (ns mW): (no entries survive)
Area·Power Product AP (# mW), x 10^3: (no entries survive)
Area·Delay Product AD (# ns), x 10^3: (no entries survive)
Area·Delay^2 Product AD2 (# ns^2), x 10^6: (no entries survive)

100 NOTICE · The rest of these slides are for extra information only
and are not part of the lecture 

101 Array Addition

102 Addition of 8 binary numbers using the Wallace tree principle

103

104

105

106 Baugh-Wooley two's complement multiplier:

107

108 Cluster Multipliers Divide the multiplier into smaller multipliers

109 Cluster Multipliers The circuit used to generate the enable signal
8-bit cluster low power multiplier

110 Cluster Multipliers
Divide the multiplication circuit into clusters (blocks) of smaller multipliers.
Apply clock gating techniques to disable the blocks that are producing a zero result.
Features: low power (claims 13.4% savings).

111 Multiplexer-Based Array Multipliers
Z j xjyj

112 Multiplexer-Based Array Multipliers
Two types of cells: Cell 1: produce the terms Zij2j and includes a full adder of carry save adder array Cell 2: produce the terms xjyj 2j and includes a full adder of carry save adder array

113 Multiplexer-Based Array Multipliers
Characteristics:
·     Faster than Modified Booth
·     Unlike Booth, does not require encoding logic
·     Requires approximately N^2/2 cells
·     Has a zigzag shape, thus not layout-friendly

114 Multiplexer-Based Array Multipliers
Improvement:
·     More rectangular layout
·     Saves up to 40 percent area without penalties
·     Outperforms the modified Booth multiplier in both speed and power by 13% to 26%

115 Gray-Encoded Array Multiplier
2's complement value (Dec) and its hybrid code (Hyb):
Dec  Hyb     Dec  Hyb     Dec  Hyb     Dec  Hyb
 0   0000     4   0100    -8   1100    -4   1000
 1   0001     5   0101    -7   1101    -3   1001
 2   0011     6   0111    -6   1111    -2   1011
 3   0010     7   0110    -5   1110    -1   1010
Hybrid coding: a single bit differs between consecutive values, which reduces the number of transitions and thus power (for highly correlated streams).
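The table is consistent with Gray-coding each 2-bit group of the 2's complement word separately. That rule is inferred from the table entries, not stated on the slide, so the sketch below should be read as an assumption; the function name is hypothetical.

```python
def hybrid_encode(value: int, n: int = 4) -> int:
    """Gray-code every 2-bit group of the n-bit two's complement word."""
    word = value & ((1 << n) - 1)            # n-bit two's complement pattern
    out = 0
    for k in range(0, n, 2):
        group = (word >> k) & 0b11           # one radix-4 digit
        out |= (group ^ (group >> 1)) << k   # 2-bit binary-to-Gray conversion
    return out


for v in (0, 1, 2, 3, -8, -1):
    print(v, format(hybrid_encode(v), "04b"))
```

Within a group the codes run 00, 01, 11, 10, so counting through the four values of one radix-4 digit flips one bit at a time.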

116 Gray-Encoded Array Multiplier
An 8-bit wide 2’s complement radix-4 array multiplier

117 Gray-Encoded Array Multiplier
Characteristics:
·     Uses Gray code to reduce the switching activity of the multiplier
·     Saves 45.6% power compared with Modified Booth
·     Uses 26.4% more area than Modified Booth

118 Ultra-high Speed Parallel Multiplier
How is ultra-high speed achieved?
·     Based on the Modified Booth Algorithm and a tree structure (column compression)
·     Chooses efficient counters (3:2 and 5:3)
·     Uses the new compressor (20% faster)
·     Uses the First Partial product Addition (FPA) algorithm (reducing the bits of the CLA by 50%)

119 Ultra-high Speed Parallel Multiplier
Divide into 3 rows or 5 rows only (most efficient).
Calculate the partial products as soon as possible.
The final CLA is only 16-bit instead of 32-bit.
Calculation process using parallel counters in the 16x16 case: overall, the delay is reduced by about 30%.

120 ULLRLF Multiplier
ULLRLF stands for Upper/Lower Left-to-Right Leapfrog. It combines the following techniques:
·     Signal flow optimization in the [3:2] adder array for partial product reduction
·     Left-to-right leapfrog (LRLF) signal flow
·     Splitting of the reduction array into upper/lower parts

121 ULLRLF Multiplier
1) Signal flow optimization in [3:2] adder array
PPij is always connected to pin A.
Sin/Cin are connected to B/C; most Sin signals are connected to C.
-- For n = 32, the delay is reduced by 30 percent.
-- Power is also saved.

122 ULLRLF Multiplier
2) Left-to-Right Leapfrog (LRLF) Structure
The sum signals skip over alternate rows.
-- The signal delays are better balanced.
-- Low power.

123 ULLRLF Multiplier
3) Upper/Lower Split Structure
Only n+2 bits.
-- The long data path is broken into parallel short paths, which saves power.
-- The delay of partial product reduction is reduced.

124 ULLRLF Multiplier ULLRLF multipliers have less power than optimized tree multipliers for n ≤ 32 while keeping similar delay and area. With more regularity and inherently shorter interconnects, the ULLRLF structure presents a competitive alternative to tree structures. Floorplan of ULLRLF (n = 32)

125 Signed Array Multiplier

126 Unsigned Array Multiplier

127 Signed Modified Booth Multiplier

128 Signed Modified Booth Multiplier

129 Unsigned Modified Booth Multiplier

130 Unsigned Modified Booth Multiplier

131 Wallace Tree multipliers

132 Wallace Tree multipliers
·     Uses 3:2 counters and 2:2 counters
·     Number of levels: the estimate log(32/2) / log(3/2) ≈ 6.8; reducing 32 rows to 2 with whole 3:2 stages takes 8 levels
·     Irregular structure
·     Fast
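The level count can be checked by simulating the row reduction directly. The sketch assumes the usual Wallace scheme in which every full group of three rows goes through a 3:2 counter and leftover rows pass to the next level unchanged; the function name is hypothetical. The 16-row case corresponds to the Modified Booth-Wallace Tree multiplier, which starts from half as many partial products.

```python
import math


def wallace_levels(rows: int) -> int:
    """Count 3:2 reduction levels needed to bring `rows` operands down to 2."""
    levels = 0
    while rows > 2:
        # Each group of 3 rows becomes 2 rows; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


print(wallace_levels(32))                          # 8
print(wallace_levels(16))                          # 6
print(round(math.log(32 / 2) / math.log(3 / 2), 2))  # 6.84, the log estimate
```

The logarithmic formula underestimates slightly because it ignores the rows left over at each level.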

133 Wallace Tree multipliers
2-level hierarchical

134 Modified Booth-Wallace Tree Multipliers

135 Modified Booth-Wallace Tree Multipliers
·     Uses 3:2 counters and 2:2 counters
·     Number of levels: the estimate log(16/2) / log(3/2) ≈ 5.1; reducing 16 rows to 2 with whole 3:2 stages takes 6 levels
·     Irregular structure
·     Fast
·     Less area

136 Twin pipe serial-parallel multipliers

137 Signed twin pipe serial-parallel multipliers
“Sign” control line and the sign-change hardware

138 Unsigned twin pipe serial-parallel multipliers
Don’t need the “Sign” control line and the sign-change hardware

