Low-power, High-speed Multiplier Architectures

Presentation transcript:

Low-power, High-speed Multiplier Architectures Shawn Nicholl ELEC-5705y March 7, 2005

Agenda/Overview
  Design Abstraction
  Numbering Systems
  Addition and Subtraction
  Adder Architectures
  Multiplication
  Traditional Multiplier Architectures
  Advanced Multiplier Architectures
2005/03/07 Low-Power, High-Speed Multiplier Architectures

Levels of Abstraction in Digital ICs
Low-power, high-speed techniques can be used at many levels of abstraction, listed here from most to least abstract:
  Systems
  Modules (multiplier architectures are at this level)
  Logic Gates
  Circuits
  Devices
Higher levels of abstraction have a greater effect on overall system performance.

Numbering Systems: A Quick Review
Some common numbering systems, for n digits:
  Decimal: range 0 to 10^n - 1
  Unsigned binary: range 0 to 2^n - 1
  Two's complement: range -2^(n-1) to +(2^(n-1) - 1)
Example values (8 bits):
  Signed decimal   Unsigned binary (magnitude)   Two's complement
  +10              0000 1010                     N/A (positive, so same as unsigned)
  -45              0010 1101                     1101 0011
45d = 0 + 0 + 2^5 + 0 + 2^3 + 2^2 + 0 + 2^0, i.e. the bits 0 0 1 0 1 1 0 1.
E.g., to form -45d: invert the bits of 45d (1101 0010), then add 1 to get the two's complement 1101 0011.
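The conversion on this slide can be checked with a short sketch. The function names `to_twos_complement` and `from_twos_complement` are illustrative only and do not come from the presentation; an 8-bit width is assumed to match the examples.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit string of `value` using `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative Python int yields its two's-complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit string as a signed two's-complement number."""
    raw = int(pattern, 2)
    # If the sign bit is set, the pattern stands for raw - 2^bits.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos_complement(10))            # 00001010
print(to_twos_complement(-45))           # 11010011
print(from_twos_complement("11010011"))  # -45
```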

Adding and Subtracting
The two's-complement algorithm is consistent: addition and subtraction behave the same, and negative numbers are treated the same as positive numbers.
Example: add -45d to 10d.
Signed decimal method:
  Step 1) Initialize: augend = 10d, addend = -45d.
  Step 2) Compare and swap so that the augend holds the larger magnitude: augend = -45d, addend = 10d.
  Step 3) Treat as a subtraction: 45d - 10d.
  Step 4) Do the subtraction (borrows may be required): 45d - 10d = 35d.
  Step 5) Negate the result (knowing that the augend was negative): -35d.
Two's-complement method:
  Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b.
  Step 2) Add (no special rules): 0000 1010b + 1101 0011b = 1101 1101b.
  Converting two's complement back to decimal: 1101 1101b = -35d.

Adding and Subtracting (Example 2)
Example 2: subtract -45d from 10d.
Signed decimal method:
  Step 1) Initialize: minuend = 10d, subtrahend = -45d.
  Step 2) The subtrahend is negative, so negate it and do an addition: 10d - (-45d) = 10d + 45d = 55d.
Two's-complement method:
  Step 1) Initialize: 10d = 0000 1010b, -45d = 1101 0011b.
  Step 2) Invert the subtrahend and set CIN = 1: 0000 1010b + 0010 1100b + 1b = 0011 0111b.
  Converting two's complement back to decimal: 0011 0111b = 55d.
Subtraction logic can be shared with addition logic!
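A minimal sketch of the shared add/subtract datapath from the two examples, assuming an 8-bit width. The names `add_sub` and `signed` are hypothetical helpers, not part of the presentation.

```python
MASK = 0xFF  # 8-bit datapath, matching the slide examples


def add_sub(a: int, b: int, subtract: bool = False) -> int:
    """Add or subtract two 8-bit patterns using a single adder.

    For subtraction the second operand is inverted and the carry-in is set
    to 1, which is the same as adding its two's complement.
    """
    operand = (~b & MASK) if subtract else (b & MASK)
    carry_in = 1 if subtract else 0
    return ((a & MASK) + operand + carry_in) & MASK  # carry-out is dropped


def signed(pattern: int) -> int:
    """Read an 8-bit pattern as a two's-complement value."""
    return pattern - 256 if pattern & 0x80 else pattern


ten, minus_45 = 0b0000_1010, 0b1101_0011
total = add_sub(ten, minus_45)
diff = add_sub(ten, minus_45, subtract=True)
print(format(total, "08b"), signed(total))  # 11011101 -35
print(format(diff, "08b"), signed(diff))    # 00110111 55
```

The only difference between the two operations is the inverter on the second operand and the value of the carry-in, which is why the slide notes that the logic can be shared.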

Adder Building Blocks
Half Adder:
  S_n = A_n ⊕ B_n
  CO_n = A_n • B_n
Full Adder:
  S_n = A_n ⊕ B_n ⊕ CIN_n
  COUT_n = A_n • B_n + A_n • CIN_n + B_n • CIN_n
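The building blocks can be modelled at the bit level as follows. This is a sketch, not the presentation's circuit: it assumes the standard construction of a full adder from two half adders plus an OR gate, and the function names are illustrative.

```python
from typing import Tuple


def half_adder(a: int, b: int) -> Tuple[int, int]:
    """Return (sum, carry_out) for two input bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Return (sum, carry_out) for two input bits and a carry-in.

    Built from two half adders; the two partial carries can never both be 1,
    so an OR gate combines them.
    """
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


# Print the full-adder truth table: A B CIN -> COUT SUM
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            print(a, b, cin, "->", cout, s)
```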

Adder Architectures (CRA)
Carry Ripple Adder (CRA)
  Gate count ∝ N, so area ∝ N
  Delay ∝ N
  Power ∝ N
  Layout friendly (low fan-in/fan-out; regular structure)
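A behavioural sketch of the carry ripple adder, assuming one full adder per bit with each carry-out feeding the next stage. The loop makes the linear delay visible: stage i cannot finish until stage i-1 has produced its carry. The names below are illustrative.

```python
from typing import Tuple


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Return (sum, carry_out) for three input bits."""
    total = a + b + cin
    return total & 1, total >> 1


def ripple_carry_add(a: int, b: int, n: int = 8, cin: int = 0) -> Tuple[int, int]:
    """Add two n-bit patterns; return (n-bit sum, carry_out)."""
    carry = cin
    result = 0
    for i in range(n):  # the carry ripples from bit 0 up to bit n-1
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


total, cout = ripple_carry_add(0b0000_1010, 0b1101_0011)
print(format(total, "08b"), cout)  # 11011101 0
```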

Adder Architectures (CLA)
Carry Lookahead Adder (CLA)
  Generate: G_n = A_n • B_n
  Propagate: P_n = A_n + B_n
  Recursive relationship: CIN_n = G_{n-1} + P_{n-1} • CIN_{n-1}
  Expanded: CIN_n = G_{n-1} + P_{n-1} • G_{n-2} + ... + P_{n-1} • P_{n-2} ... P_1 • G_0 + P_{n-1} • P_{n-2} ... P_0 • CIN_0
Reading the recursion: if the previous stage generates a carry, there is a carry-in to the current stage; or, if the previous stage has a carry-in and propagates it, there is a carry-in to the current stage.
The expanded form shows how parallelism makes the circuit faster: every carry is computed directly from the G and P signals and CIN_0.
  Delay ∝ log2 N (if built right)
  Gate count and power are greater than for the CRA
  Not layout friendly (high fan-in; difficult to route)
Source: Patterson and Hennessy, Figure A.14
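The expanded carry equation can be evaluated directly, as in the sketch below. It is a software model under stated assumptions: the inner loop only enumerates the product terms of the expanded sum, whereas hardware would compute all of them in parallel. `lookahead_carries` and `cla_add` are hypothetical names.

```python
from typing import List, Tuple


def lookahead_carries(a: int, b: int, n: int = 8, cin0: int = 0) -> List[int]:
    """Return [CIN_0, CIN_1, ..., CIN_n] computed from G, P and CIN_0 only."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # generate
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]  # propagate
    carries = [cin0]
    for i in range(1, n + 1):
        c = 0
        prefix = 1  # running product P_{i-1} ... P_{j+1}
        for j in range(i - 1, -1, -1):
            c |= prefix & g[j]  # term P_{i-1} ... P_{j+1} G_j
            prefix &= p[j]
        c |= prefix & cin0      # term P_{i-1} ... P_0 CIN_0
        carries.append(c)
    return carries


def cla_add(a: int, b: int, n: int = 8, cin0: int = 0) -> Tuple[int, int]:
    """Add two n-bit patterns; return (n-bit sum, carry_out)."""
    c = lookahead_carries(a, b, n, cin0)
    total = 0
    for i in range(n):
        total |= (((a >> i) ^ (b >> i) ^ c[i]) & 1) << i
    return total, c[n]


print(cla_add(0b0000_1010, 0b1101_0011))  # (221, 0)
```

No carry in this model depends on another computed carry, which is the property that lets a well-built CLA reach logarithmic delay at the cost of wide gates.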

Adder Architectures (CSA)
Carry Save Adder (CSA)
  The full adders work independently (no carry ripples between bit positions), so it is very fast
  A pipelined architecture adds flops and control logic, which increase area and latency
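A word-level sketch of one carry-save stage, assuming the usual 3:2 formulation: three operands go in, and a sum word plus a carry word come out, with every bit position computed independently. The function name is illustrative, and the operands are three of the shifted partial products from the 118d x 99d example.

```python
from typing import Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """Reduce three operands to (sum_word, carry_word) with no carry ripple.

    Each bit position is a full adder: the sum bit stays in place and the
    carry bit moves one position to the left.
    """
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word


x, y, z = 118, 118 << 1, 118 << 5
s, c = carry_save_add(x, y, z)
# A single carry-propagating addition is still needed at the very end.
print(s + c == x + y + z)  # True
```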

Unsigned Multiplication
Example: multiply 118d by 99d.
Both methods use the same steps: Step 1) Initialize; Step 2) Find partial products; Step 3) Sum up the shifted partial products.
Decimal method:
     118d   multiplicand
  x   99d   multiplier
    1062d
   1062 d
   11682d
Binary method (shift-and-add algorithm), with 118d = 0111 0110b and 99d = 0110 0011b:
         01110110   multiplier bit 0 = 1
        01110110    multiplier bit 1 = 1
       00000000     multiplier bit 2 = 0
      00000000      multiplier bit 3 = 0
     00000000       multiplier bit 4 = 0
    01110110        multiplier bit 5 = 1
   01110110         multiplier bit 6 = 1
  00000000          multiplier bit 7 = 0
  010110110100010   sum
Convert back to decimal: 0010 1101 1010 0010b = 11682d

Shift-and-Add Multiplier
  B: multiplicand
  A: multiplier
  P: product (P = B x A)
Takes N cycles to complete: T_Lat = (T_N-bitADD + T_shift) x N
Requires minimal logic (most logic is in the adder)
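The shift-and-add algorithm can be sketched as a loop with one iteration per clock cycle. This is a behavioural model for unsigned operands only; the function name and the choice to shift the multiplicand left (rather than the product right) are assumptions made for readability.

```python
def shift_and_add(multiplicand: int, multiplier: int, n: int = 8) -> int:
    """Multiply two unsigned n-bit numbers in n add/shift steps."""
    product = 0
    for _ in range(n):              # one iteration models one clock cycle
        if multiplier & 1:          # low-order multiplier bit selects B or 0
            product += multiplicand
        multiplicand <<= 1          # align the next partial product
        multiplier >>= 1            # expose the next multiplier bit
    return product


p = shift_and_add(0b0111_0110, 0b0110_0011)
print(p, format(p, "016b"))  # 11682 0010110110100010
```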

Basic Signed Multiplication
Basic idea:
  1) Convert to unsigned
  2) Use shift-and-add multiplier
  3) Convert to signed
Extra hardware!

Signed Multiplication: Booth Recoding
Reduce the number of partial products by recoding the multiplier operand. Works for signed numbers.
  Low-order bit (A_n)   Last bit shifted out (A_{n-1})   Partial product
  0                     0                                0
  0                     1                                +B
  1                     0                                -B
  1                     1                                0
Example: multiply -118d by -99d.
Recall 99d = 0110 0011b. Invert (1001 1100b) and add 1b: -99d = 1001 1101b.
Radix-2 Booth recoding of -99d (most significant digit first): -1 0 +1 0 0 -1 +1 -1

Radix-2 Booth Multiplication
Example: multiply -118d by -99d using radix-2 Booth.
Step 1) Initialize: B = -118d = 1000 1010b, -B = 118d = 0111 0110b, A = -99d = 1001 1101b.
Step 2) Find partial products, one per recoded multiplier digit, starting from the least significant digit. Rows holding +B are sign-extended.
Step 3) Sum up the shifted partial products.
          01110110   -B
        110001010    +B (sign-extended)
        01110110     -B
       00000000      0
      00000000       0
   1110001010        +B (sign-extended)
   000000000         0
   01110110          -B
  0010110110100010   product
Convert two's complement back to decimal: 0010 1101 1010 0010b = 11682d
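The radix-2 recoding rule and the resulting multiplication can be checked with the sketch below. It works on Python integers rather than fixed-width registers, so sign extension is implicit; the function names are hypothetical.

```python
from typing import List


def booth_radix2_digits(a: int, n: int = 8) -> List[int]:
    """Recode an n-bit two's-complement multiplier into digits in {-1, 0, +1}.

    Digits are returned least significant first. Each digit is
    (last bit shifted out) - (current low-order bit): 01 -> +1, 10 -> -1,
    and 00 or 11 -> 0.
    """
    pattern = a & ((1 << n) - 1)
    digits = []
    prev = 0  # the bit to the right of bit 0 is taken as 0
    for i in range(n):
        bit = (pattern >> i) & 1
        digits.append(prev - bit)
        prev = bit
    return digits


def booth_radix2_multiply(b: int, a: int, n: int = 8) -> int:
    """Sum the shifted partial products (+B, -B or 0) selected by each digit."""
    return sum(d * b * (1 << i) for i, d in enumerate(booth_radix2_digits(a, n)))


print(booth_radix2_digits(-99))          # [-1, 1, -1, 0, 0, 1, 0, -1]
print(booth_radix2_multiply(-118, -99))  # 11682
```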

Array Multiplier
(Figure: the partial products of the -118d x -99d radix-2 Booth example, repeated as the rows of the array.)
  Combinational, so it is very fast: delay ∝ N
  Can be pipelined
  Very regular structure

Array Multiplier Structure
(Figure: array multiplier structure.)
Source: J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, 1999

Radix-4 Booth Multiplication
Similar to radix-2, but looks at two low-order bits at a time (instead of one), together with the last bit shifted out.
  A_{2n+1}   A_{2n}   A_{2n-1} (last bit shifted out)   Partial product
  0          0        0                                 0
  0          0        1                                 +B
  0          1        0                                 +B
  0          1        1                                 +2B
  1          0        0                                 -2B
  1          0        1                                 -B
  1          1        0                                 -B
  1          1        1                                 0
Recall 99d = 0110 0011b. Invert (1001 1100b) and add 1b: -99d = 1001 1101b.
Radix-4 Booth recoding of -99d (most significant digit first): -2 +2 -1 +1

Radix-4 Booth Multiplication
Example: multiply -118d by -99d using radix-4 Booth.
Step 1) Initialize: B = -118d = 1000 1010b, -B = 118d = 0111 0110b, 2B = -236d = 1 0001 0100b, -2B = 236d = 0 1110 1100b, A = -99d = 1001 1101b.
Step 2) Find partial products, one per recoded multiplier digit, starting from the least significant digit. Negative rows are sign-extended.
Step 3) Sum up the shifted partial products.
   111111110001010   +B (sign-extended)
        01110110     -B
   11100010100       +2B (sign-extended)
   011101100         -2B
  0010110110100010   product
Convert two's complement back to decimal: 0010 1101 1010 0010b = 11682d
Reduces the number of partial products by half!
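A sketch of radix-4 recoding under the same assumptions as the slide: the multiplier is scanned in overlapping triplets (two low-order bits plus the last bit shifted out), and each triplet selects one of 0, +B, +2B, -2B or -B. The names are illustrative, and Python integers stand in for sign-extended registers.

```python
from typing import List


def booth_radix4_digits(a: int, n: int = 8) -> List[int]:
    """Recode an n-bit two's-complement multiplier into n/2 digits in -2..+2.

    Digits are returned least significant first.
    """
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    # Append the implicit 0 to the right of bit 0 (the 'last bit shifted out').
    extended = (a & ((1 << n) - 1)) << 1
    digits = []
    for i in range(0, n, 2):
        triplet = (extended >> i) & 0b111
        hi, mid, lo = (triplet >> 2) & 1, (triplet >> 1) & 1, triplet & 1
        digits.append(-2 * hi + mid + lo)
    return digits


def booth_radix4_multiply(b: int, a: int, n: int = 8) -> int:
    """Sum the partial products; digit k carries a weight of 4**k."""
    return sum(d * b * (4 ** k) for k, d in enumerate(booth_radix4_digits(a, n)))


print(booth_radix4_digits(-99))          # [1, -1, 2, -2]
print(booth_radix4_multiply(-118, -99))  # 11682
```

Four partial products are summed here instead of the eight needed by radix-2, which is the halving the slide refers to.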

Tree Multiplier
Wallace Tree
  Reduces the total number of full adders
  Uses 3:2 compressors (a.k.a. full adders)
  Delay ∝ log_{3/2} N
  Irregular structure is difficult to lay out
(Figure: original structure compared with tree structure.)
Source: J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, 1999
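The tree reduction can be illustrated with a word-level model: groups of three rows go through a 3:2 compressor stage until only two rows remain, and one ordinary addition finishes the job. This is a simplification and an assumption on my part, since a real Wallace tree compresses individual bit columns; it only shows the reduction pattern and the number of levels for unsigned operands.

```python
from typing import Tuple


def csa(x: int, y: int, z: int) -> Tuple[int, int]:
    """3:2 compressor applied to whole words: returns (sum_word, carry_word)."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def wallace_multiply(b: int, a: int, n: int = 8) -> Tuple[int, int]:
    """Return (product, number of compressor levels) for unsigned operands."""
    rows = [(b << i) if (a >> i) & 1 else 0 for i in range(n)]
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3   # rows that form complete triples
        reduced = []
        for i in range(0, full, 3):
            reduced.extend(csa(*rows[i:i + 3]))
        reduced.extend(rows[full:])        # leftover rows pass straight through
        rows = reduced
        levels += 1
    return sum(rows), levels               # final carry-propagating addition


print(wallace_multiply(118, 99))  # (11682, 4)
```

Eight partial products shrink to 6, 4, 3 and then 2 rows, so four compressor levels are needed; the count grows logarithmically with N, in line with the delay estimate on the slide.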

Twin Pipe Serial-Parallel Multiplier
One operand is fed in parallel; the other is fed serially, with even data bits on the rising clock edge and odd data bits on the falling clock edge.
Features:
  Low area
  High latency
  Low power
Source: S. Shah et al., “Comparison of 32-bit Multipliers for Various Performance Measures”, 2000.

Cluster Multiplication
Divide the circuit into clusters of nibble-wide multiplications.
If all bits in a nibble are zeroes, use clock gating to gate the multiplication for that nibble.
Features:
  Low power (claims 13% savings)
Source: A. Fayed, M. Bayoumi, “A Novel Architecture for Low-Power Design of Parallel Multipliers”, 2001.
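The arithmetic behind the clustering idea can be sketched as follows. This is only an illustration under stated assumptions: the slide does not describe the cluster structure in detail, so the model simply splits both unsigned operands into nibbles, multiplies nibble pairs, and skips any pair with an all-zero nibble (the case where hardware would gate the clock). `cluster_multiply` is a hypothetical name.

```python
from typing import Tuple


def cluster_multiply(a: int, b: int, n: int = 8) -> Tuple[int, int]:
    """Return (product, number of active nibble clusters) for unsigned operands."""
    nibbles = n // 4
    product = 0
    active = 0
    for i in range(nibbles):
        a_nib = (a >> (4 * i)) & 0xF
        for j in range(nibbles):
            b_nib = (b >> (4 * j)) & 0xF
            if a_nib == 0 or b_nib == 0:
                continue  # zero nibble: this cluster would be clock-gated
            active += 1
            product += (a_nib * b_nib) << (4 * (i + j))
    return product, active


print(cluster_multiply(0x76, 0x63))  # (11682, 4): no zero nibbles
print(cluster_multiply(0x06, 0x63))  # (594, 2): upper nibble of a is zero
```

Skipping a cluster does not change the result because its contribution is zero; the saving in hardware comes from not toggling that logic.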

Multiplexer-Based Array Multiplier
Characteristics:
  Fast (because it is array-based)
  Unlike Booth, does not require encoding logic
  Processes 1 bit of the multiplier and 1 bit of the multiplicand at a time, thus it is symmetric
  Has a zigzag shape, thus not layout-friendly
Source: K. Pekmestzi, “Multiplexer-Based Array Multipliers”, 1999.

Area-Efficient Multiplexer-Based Multiplier
Characteristics:
  Increases each row to have N+1 cells (instead of N)
  Depth is cut in half (increases “squareness”)
Source: Y. Wang, Y. Jiang, E. Sha, “On Area-Efficient Low Power Array Multipliers”, 2001.

Low Latency Booth-Encoding-based Pipeline Multiplier
Features:
  Delay ∝ N/4
  Needs an (N+N/2)-bit addition at the end
  Uses CLAs instead of CSAs because the longest stage (i.e. the adder at the end) determines the fastest operating frequency
Source: X. Wu, H. Chen, S. Wei, “Design of a Low Latency High Speed Pipelining Multiplier”, 2001.

Two’s Complement Gray-Encoded Array Multiplier
Characteristics:
  Uses Gray code to reduce the switching activity of the multiplier
  Claims that traditional Booth uses 45% more power
  Greater area than traditional Booth
Source: E. Costa et al., “A New Architecture for 2’s Complement Gray Encoded Array Multiplier”, 2002.

Project Plan
  Start   End     Task
  -       03/05   Research multiplier circuits
  03/06   03/12   Code multipliers in Verilog HDL
  03/13   03/19   Synthesize all multiplier circuits
  03/20   03/26   Analyze results (delay/power/area)
  03/27   04/02   Prepare report
  04/03   04/09   Prepare for final exam
  04/10   04/16   Complete report and submit

References
S. Shah, A.J. Al-Khalili, D. Al-Khalili, “Comparison of 32-bit Multipliers for Various Performance Measures”, Proc. 2000 Int’l Conf. Microelectronics, pp. 75-80, 2000.
D. Patterson, J. Hennessy, Computer Architecture: A Quantitative Approach, 2nd ed., San Francisco, CA: Morgan Kaufmann Publishers, Inc., 1996.
X. Wu, H. Chen, S. Wei, “Design of a Low Latency High Speed Pipelining Multiplier”, Proc. 2001 Int’l Conf. on ASIC, pp. 551-554, 2001.
J. Wakerly, Digital Design: Principles and Practices, 2nd ed., Englewood Cliffs, NJ: Prentice Hall, 1994.
J. Kuo and J. Lou, Low-Voltage CMOS VLSI Circuits, New York, NY: John Wiley & Sons, Inc., 1999.
K. Pekmestzi, “Multiplexer-Based Array Multipliers”, IEEE Trans. on Computers, vol. 48, pp. 15-23, 1999.
A. Fayed, M. Bayoumi, “A Novel Architecture for Low-Power Design of Parallel Multipliers”, Proc. 2001 IEEE Computer Society Workshop on VLSI, pp. 149-154, 2001.
Y. Wang, Y. Jiang, E. Sha, “On Area-Efficient Low Power Array Multipliers”, Proc. 2001 IEEE Int’l Conf. on Electronics, Circuits and Systems, vol. 3, pp. 1429-1432, 2001.