Digital Integrated Circuits Chpt. 5Lec. 01- 08/29/2006 CSE477 VLSI Digital Circuits Fall 2002 Lecture 21: Multiplier Design Mary Jane Irwin ( www.cse.psu.edu/~mji.

Presentation transcript:

CSE477 VLSI Digital Circuits Fall 2002 Lecture 21: Multiplier Design
Mary Jane Irwin
[Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]

Review: Basic Building Blocks
• Datapath
  – Execution units
    » Adder, multiplier, divider, shifter, etc.
  – Register file and pipeline registers
  – Multiplexers, decoders
• Control
  – Finite state machines (PLA, ROM, random logic)
• Interconnect
  – Switches, arbiters, buses
• Memory
  – Caches (SRAMs), TLBs, DRAMs, buffers

Review: Binary Adder Landscape
[Figure: taxonomy of synchronous word parallel adders. Adder types named: ripple carry adders (RCA), carry prop min adders, signed-digit adders, fast carry prop adders, residue adders, Manchester carry chain, parallel prefix, conditional sum, carry select, carry skip. Complexity labels in the figure: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(√N), A = O(N); T = O(N), A = O(N).]

Multiply Operation
• Multiplication as repeated additions
[Figure: an N-bit multiplicand and an N-bit multiplier give an N-row partial product array, which can be formed in parallel; the result is a 2N-bit double precision product.]

Shift & Add Multiplication
• Right shift and add
  – Partial product array rows are accumulated from top to bottom on an N-bit adder
  – After each addition, right shift (by one bit) the accumulated partial product to align it with the next row to add
  – Time for N bits: T_serial_mult = O(N · T_adder) = O(N²) for an RCA
• Making it faster
  – Use a faster adder
  – Use higher radix (e.g., base 4) multiplication
    » Use multiplier recoding to simplify multiple formation
  – Form partial product array in parallel and add it in parallel
• Making it smaller (i.e., slower)
  – Use an array multiplier
    » Very regular structure with only short wires to nearest neighbor cells. Thus, very simple and efficient layout in VLSI
    » Can be easily and efficiently pipelined
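The right-shift-and-add procedure can be sketched in Python. This is a minimal illustration for unsigned operands, not code from the lecture; the function name `shift_add_multiply` and the register names `high` and `low` are assumptions introduced here.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Multiply two unsigned n-bit integers by right shift and add."""
    mask = (1 << n) - 1
    assert 0 <= multiplicand <= mask and 0 <= multiplier <= mask

    high = 0           # accumulated partial product (upper half, plus carry bit)
    low = multiplier   # multiplier bits, replaced one by one by low product bits

    for _ in range(n):
        # The current multiplier bit decides whether this row is added.
        if low & 1:
            high += multiplicand
        # Right shift the (high, low) register pair by one bit so the
        # accumulated partial product lines up with the next row.
        low = (low >> 1) | ((high & 1) << (n - 1))
        high >>= 1

    return (high << n) | low


print(shift_add_multiply(13, 11, 4))
```

The loop runs N times and performs at most one N-bit addition per iteration, which is where the O(N · T_adder) time on the slide comes from.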

Tree Multiplier Structure
[Figure: multiple forming circuits take D (‘icand) and are selected by Q (‘ier) through a mux to form the partial product array; a reduction tree (log N) feeds a fast carry propagate adder (CPA, log N), which produces P (product). Delay path: mux + reduction tree (log N) + CPA (log N).]

(4,2) Counter
• Built out of two (3,2) counters (just FA’s!)
  – All of the inputs (4 external plus one internal) have the same weight (i.e., are in the same bit position)
  – The internal output is carried to the next higher weight position (indicated in the figure)
• Note: Two carry outs, one “internal” and one “external”
[Figure: two (3,2) counters connected to form a (4,2) counter.]
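A bit-level sketch of a (4,2) counter built from two full adders follows. The wiring shown (first full adder on three of the external inputs, second on its sum, the fourth input, and the internal carry in) is one common arrangement and is an assumption here; the function and signal names are hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """(3,2) counter: returns (sum, carry) for three bits of equal weight."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def counter_4_2(x0: int, x1: int, x2: int, x3: int, cin: int) -> tuple[int, int, int]:
    """(4,2) counter from two full adders.

    Returns (sum, carry, cout_internal). The sum has the input weight;
    both carries have the next higher weight.
    """
    s1, cout_internal = full_adder(x0, x1, x2)   # does not depend on cin
    s, carry = full_adder(s1, x3, cin)
    return s, carry, cout_internal


# Check the counting property for every input combination.
for bits in product((0, 1), repeat=5):
    s, carry, cout_internal = counter_4_2(*bits)
    assert sum(bits) == s + 2 * (carry + cout_internal)
```

Because `cout_internal` is computed without looking at `cin`, the internal carry passes through at most one neighboring cell instead of rippling along the row.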

Tiling (4,2) Counters
• Reduces columns four high to columns only two high
  – Tiles with neighboring (4,2) counters
  – Internal carry in at same “level” (i.e., bit position weight) as the internal carry out
[Figure: neighboring (4,2) counters, each built from two (3,2) counters.]

4x4 Partial Product Array Reduction
• Fast 4x4 multiplication using (4,2) counters
[Figure: multiplicand and multiplier form the partial product array, which is reduced to the reduced pp array (to CPA), giving the double precision product.]

8x8 Partial Product Array Reduction
• How many (4,2) counters minimum are needed to reduce it to 2 rows?
  – Answer: 24
[Figure: ‘icand and ‘ier form the partial product array, which is reduced to the reduced partial product array.]

Alternate 8x8 Partial Product Array Reduction
• More (4,2) counters, so what is the advantage?
[Figure: ‘icand and ‘ier form the partial product array, which is reduced to the reduced partial product array.]

Array Reduction Layout Approach
[Figure: multiple generators, driven by the multiplicand and the multiple selection signals (‘ier), feed (4,2) counter slices, followed by the CPA.]

Next Lecture and Reminders
• Next lecture
  – Shifters, decoders, and multiplexers
    » Reading assignment – Rabaey, et al,
• Reminders
  – Project final reports due December 5th
  – HW5 (last one!) due November 19th
  – Final grading negotiations/corrections (except for the final exam) must be concluded by December 10th
  – Final exam scheduled
    » Monday, December 16th from 10:10 to noon in 118 and 121 Thomas

Topics
• Adders and ALUs (§6.4, §6.5)
• Multipliers (§6.6)
  – Array multiplier
  – Baugh-Wooley multiplier
  – Booth encoding
  – Wallace tree multiplier
• Subsystem design principles (§6.2)

Elementary School Algorithm
[Figure: multiplicand × multiplier worked by hand, showing the partial products.]

Combinational Multiplier
[Figure: combinational multiplier.] Each bit of the multiplier controls whether an addition occurs.

Array Multiplier
• Regular layout
  – An n × m cell layout
  – Easy to pipeline
  – Used frequently in FPGAs and ASICs
• Critical path
  – Less than (n+m-1) bit adder delay
• Handles unsigned multiplication ONLY

A 4 × 4 Unsigned Array Multiplier
[Figure: 4 × 4 unsigned array multiplier.] Skew the array for a rectangular layout.

Unsigned Array Multiplier
[Figure: 4 × 4 array of adder cells (inputs a, b, Cin; outputs Cout, Sum) fed by the partial product bits x0y0 through x3y3, with zeros on the unused inputs, producing the product bits P0 through P7.]

Signed Multiplication
• Signed number representation
  – Two’s complement (as in the example below, where (1110)₂ = -2)
• Signed n×n multiplication
  – (1110)₂ × (0011)₂ = (1010)₂, i.e., (-2) × 3 = (-6)
  – No difference from unsigned multiplication if the result has the same bit-width as the input
• But what if we want the result to be 2n bits?
  – Use sign-bit extension
  – Needs a 2n × 2n array multiplier
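The two cases on this slide can be checked with a short Python sketch using the slide's own operands, (1110)₂ and (0011)₂. The helper names `to_signed` and `sign_extend` are hypothetical, introduced only for this illustration.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Replicate the sign bit of a from_bits-wide pattern up to to_bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


n = 4
a, b = 0b1110, 0b0011            # -2 and 3 as 4-bit patterns

# n-bit result: the unsigned product truncated to n bits is already correct.
low = (a * b) & ((1 << n) - 1)
print(format(low, "04b"), to_signed(low, n))

# 2n-bit result: the raw unsigned product is wrong as a signed value ...
raw = (a * b) & ((1 << 2 * n) - 1)
print(to_signed(raw, 2 * n))

# ... but multiplying the sign-extended operands and keeping 2n bits works.
wide = (sign_extend(a, n, 2 * n) * sign_extend(b, n, 2 * n)) & ((1 << 2 * n) - 1)
print(format(wide, "08b"), to_signed(wide, 2 * n))
```

Sign-extending both operands to 2n bits is what drives the 2n × 2n array size noted on the slide.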

Baugh-Wooley Multiplier: Principle

Baugh-Wooley Multiplier: Structure
[Figure: array of adder cells with inputs a, b, Cin and outputs Cout, Sum.]

Booth Multiplier
• Utilizes the Booth encoding scheme
• Booth encoding scheme
  – Handles signed multiplication
  – Reduces the number of partial products by half
  – Small area and fast
  – Encoding scheme cannot be applied hierarchically
    » Often used as the first stage of partial product reduction

Booth Encoding: Principle
• Two’s-complement form of multiplier y
• Consider first two terms
  – By looking at three bits of y, we can determine whether to add (or subtract) x or 2x, or nothing, to form the partial product.

Booth Actions

y_i  y_i-1  y_i-2   increment
 0     0      0       0
 0     0      1       x
 0     1      0       x
 0     1      1       2x
 1     0      0      -2x
 1     0      1      -x
 1     1      0      -x
 1     1      1       0

Booth Example
• Don’t forget to sign-extend the encoded values when adding them together
  – Only have to extend by 2 bits, though
• x = (011001)₂ (25₁₀), y = (101110)₂ (-18₁₀)
  – y1 y0 y-1 = 100, P1 = P0 - (10)₂ · x = -50
  – y3 y2 y1 = 111, P2 = P1 + 0 = -50
  – y5 y4 y3 = 101, P3 = P2 - x · 2⁴ = -450
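The three-bit lookup can be sketched in Python as a radix-4 Booth recoder. This is an illustration under stated assumptions, not the lecture's implementation: the function names are hypothetical, and the multiplier value `0b101110` is reconstructed from the three bit groups quoted in the example (100, 111, 101), with x = 25 taken from the slide.

```python
def booth_radix4_digits(y: int, n: int) -> list[int]:
    """Recode an n-bit two's-complement multiplier into radix-4 Booth digits.

    Digit k examines bits y[2k+1], y[2k], y[2k-1] (with y[-1] = 0) and takes
    a value in {-2, -1, 0, 1, 2}. Least significant digit first.
    """
    assert n % 2 == 0, "this sketch assumes an even bit width"
    y &= (1 << n) - 1

    def bit(i: int) -> int:
        return 0 if i < 0 else (y >> i) & 1

    return [-2 * bit(2 * k + 1) + bit(2 * k) + bit(2 * k - 1) for k in range(n // 2)]


def booth_multiply(x: int, y: int, n: int) -> int:
    """Sum the selected multiples of x, each weighted by 4**k."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(y, n)):
        product += digit * x * 4 ** k   # digit selects 0, +/-x, or +/-2x
    return product


digits = booth_radix4_digits(0b101110, 6)
print(digits)
print(booth_multiply(25, 0b101110, 6))
```

Only n/2 multiples are accumulated for an n-bit multiplier, which is the halving of partial products that Booth encoding is used for.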

Wallace Tree
• Reduces the number of partial products
• Built from carry-save adders:
  – Three inputs: a, b, c
  – Two outputs: y, z such that y + z = a + b + c
• Carry-save equations:
  – y_i = a_i ⊕ b_i ⊕ c_i
  – z_i+1 = a_i b_i + b_i c_i + c_i a_i
  – What’s the difference from a carry-ripple adder?
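The carry-save equations translate directly into word-level bitwise operations. The sketch below is illustrative only; the function name `carry_save_add` is an assumption, and Python integers stand in for bit vectors of any width.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three non-negative words to two words with the same total.

    y holds the per-bit sums (y_i = a_i XOR b_i XOR c_i) and z holds the
    per-bit carries shifted up one position (z_i+1 = majority of a_i, b_i, c_i).
    No carry moves between bit positions inside this function.
    """
    y = a ^ b ^ c
    z = ((a & b) | (b & c) | (c & a)) << 1
    return y, z


y, z = carry_save_add(0b1011, 0b0110, 0b1101)
print(bin(y), bin(z), y + z)
```

Unlike a carry-ripple adder, every bit position is computed independently, so the delay is one full adder regardless of word width; the carry propagation is deferred to a single final addition of y and z.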

Wallace Tree Structure
[Figure: a carry-ripple adder built from full adders (inputs a0 to a2, b0 to b2, c0 to c2; outputs s0, s1, s2) shown next to a carry-save adder built from full adders (same inputs; outputs y0, y1, y2 and z1, z2, z3).]

Wallace Tree Operation
• n additions are reduced to (2n/3) additions after each level
  – Sum of inputs = Sum of outputs
  – Can apply the reduction hierarchically
  – More efficient design uses 4-2 adders to reduce n additions to (n/2) additions after each level
• Need final adder to add the last two numbers
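The level-by-level reduction can be sketched as follows. This is a simplified word-level model, assuming rows are grouped three at a time and leftover rows pass through unchanged; the names `wallace_reduce` and `carry_save_add` are hypothetical, and the 8-bit operands are arbitrary example values.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Three words in, two words out, same total."""
    return a ^ b ^ c, ((a & b) | (b & c) | (c & a)) << 1


def wallace_reduce(operands: list[int]) -> tuple[list[int], int]:
    """Apply carry-save levels until at most two rows remain.

    Returns the remaining rows and the number of levels used.
    """
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        grouped = len(rows) // 3 * 3          # rows that fit in groups of three
        next_rows = []
        for i in range(0, grouped, 3):
            next_rows.extend(carry_save_add(*rows[i:i + 3]))
        next_rows.extend(rows[grouped:])       # leftovers pass to the next level
        rows = next_rows
        levels += 1
    return rows, levels


# Partial products of an unsigned 8 x 8 multiplication.
a, b = 0b10110101, 0b01101011
partial_products = [(a << i) if (b >> i) & 1 else 0 for i in range(8)]

rows, levels = wallace_reduce(partial_products)
print(len(rows), levels, sum(rows) == a * b)
```

The eight rows shrink as 8, 6, 4, 3, 2, so four carry-save levels are followed by one final carry-propagate addition of the last two rows.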

A Booth-Wallace Tree Multiplier
[Figure: Booth-Wallace tree multiplier.] Most commonly used high-performance multiplier.

Topics
• Adders and ALUs (§6.4, §6.5)
• Multipliers (§6.6)
• Subsystem design principles (§6.2)

Pipelining
• Pipelining can be used to reduce clock period at the expense of latency
[Figure: combinational logic 1 and combinational logic 2 as successive pipeline stages.]

Cycle Time and Latency
[Figure: two plots, cycle time versus # stages and latency versus # stages.]
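The trade-off between cycle time and latency can be illustrated with a first-order model. The model itself is an assumption, not taken from the slides: the logic delay is split evenly across the stages and each stage pays a fixed register overhead. The delay values (1200 ps of logic, 100 ps per register) are made-up example numbers.

```python
def pipeline_timing(logic_delay_ps: float, register_overhead_ps: float,
                    stages: int) -> tuple[float, float]:
    """Return (cycle_time, latency) in ps for an evenly split pipeline."""
    cycle_time = logic_delay_ps / stages + register_overhead_ps
    latency = stages * cycle_time
    return cycle_time, latency


for stages in (1, 2, 4, 8):
    cycle_time, latency = pipeline_timing(1200, 100, stages)
    print(stages, cycle_time, latency)
```

Under this model the cycle time falls toward the register overhead as stages are added, while the latency grows by one register overhead per extra stage.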

Data Paths
• A data path is a logical and physical structure:
  – bit-wise logical organization
  – bit-wise physical structure
• Data paths generally use busses to pass data between function units.

Bit Slice Organization
[Figure: registers, shifter, and ALU arranged as bit slices from bit n-1 down to bit 0, connected by a bus, with control signals running across the slices.]

Data Path Cell Design
• Connections may be made by:
  – abutment, requiring stretching cells;
  – river routing, requiring a routing channel between function units.

Project
• Due 10/26
  – Schematic
  – Verilog/Spectre simulation results
  – 10/27 presentation (10-15 PowerPoint slides)
• Important (efficiency-related)
  – How to add an array of instances