ECE 645 – Computer Arithmetic Lecture 6: Multi-Operand Addition ECE 645—Computer Arithmetic 3/5/08.

Slides:



Advertisements
Similar presentations
Introduction So far, we have studied the basic skills of designing combinational and sequential logic using schematic and Verilog-HDL Now, we are going.
Advertisements

Multioperand Addition Lecture 6. Required Reading Chapter 8, Multioperand Addition Note errata at:
Using Carry-Save Adders For Radix- 4, Can Be Used to Generate 3a – No Booth’s Slight Delay Penalty from CSA – 3 Gates.
UNIVERSITY OF MASSACHUSETTS Dept
Multiplication Schemes Continued
Part II Addition / Subtraction
ECE 331 – Digital System Design
UNIVERSITY OF MASSACHUSETTS Dept
EE141 © Digital Integrated Circuits 2nd Arithmetic Circuits 1 [Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
1 CS 140 Lecture 14 Standard Combinational Modules Professor CK Cheng CSE Dept. UC San Diego Some slides from Harris and Harris.
1 A Timing-Driven Synthesis Approach of a Fast Four-Stage Hybrid Adder in Sum-of-Products Sabyasachi Das University of Colorado, Boulder Sunil P. Khatri.
Copyright 2008 Koren ECE666/Koren Part.6b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer.
Lecture 9 Sept 28 Chapter 3 Arithmetic for Computers.
EECS Components and Design Techniques for Digital Systems Lec 18 – Arithmetic II (Multiplication) David Culler Electrical Engineering and Computer.
VLSI Design Spring03 UCSC By Prof Scott Wakefield Final Project By Shaoming Ding Jun Hu
ECE C03 Lecture 61 Lecture 6 Arithmetic Logic Circuits Hai Zhou ECE 303 Advanced Digital Design Spring 2002.
UNIVERSITY OF MASSACHUSETTS Dept
Fall 2008EE VLSI Design I - © Kia Bazargan 1 EE 5323 – VLSI Design I Kia Bazargan University of Minnesota Adders.
ECE 301 – Digital Electronics
Copyright 2008 Koren ECE666/Koren Part.6a.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer.
Chapter 5 Arithmetic Logic Functions. Page 2 This Chapter..  We will be looking at multi-valued arithmetic and logic functions  Bitwise AND, OR, EXOR,
VLSI Digital System Design 1.Carry-Save, 2.Pass-Gate, 3.Carry-Lookahead, and 4.Manchester Adders.
Aug Shift Operations Source: David Harris. Aug Shifter Implementation Regular layout, can be compact, use transmission gates to avoid threshold.
Chapter 6-2 Multiplier Multiplier Next Lecture Divider
Philip Brisk 2 Paolo Ienne 2 Hadi Parandeh-Afshar 1,2 1: University of Tehran, ECE Department 2: EPFL, School of Computer and Communication Sciences Efficient.
Part II Addition / Subtraction
Digital Arithmetic and Arithmetic Circuits
Digital Integrated Circuits Chpt. 5Lec /29/2006 CSE477 VLSI Digital Circuits Fall 2002 Lecture 21: Multiplier Design Mary Jane Irwin (
ECE 645 – Computer Arithmetic Lecture 7: Tree and Array Multipliers ECE 645—Computer Arithmetic 3/18/08.
Chapter 4 – Arithmetic Functions and HDLs Logic and Computer Design Fundamentals.
Arithmetic Building Blocks
Digital Kommunikationselektronik TNE027 Lecture 2 1 FA x n –1 c n c n1- y n1– s n1– FA x 1 c 2 y 1 s 1 c 1 x 0 y 0 s 0 c 0 MSB positionLSB position Ripple-Carry.
Multi-operand Addition
Advanced VLSI Design Unit 05: Datapath Units. Slide 2 Outline  Adders  Comparators  Shifters  Multi-input Adders  Multipliers.
EECS Components and Design Techniques for Digital Systems Lec 16 – Arithmetic II (Multiplication) David Culler Electrical Engineering and Computer.
1 Using 2-opr adder Carry-save adder Wallace Tree Dadda Tree Parallel Counters Multi-Operand Addition.
1 CPSC3850 Adders and Simple ALUs Simple Adders Figures 10.1/10.2 Binary half-adder (HA) and full-adder (FA). Digit-set interpretation: {0, 1}
Nov 10, 2008ECE 561 Lecture 151 Adders. Nov 10, 2008ECE 561 Lecture 152 Adders Basic Ripple Adders Faster Adders Sequential Adders.
July 2005Computer Architecture, The Arithmetic/Logic UnitSlide 1 Part III The Arithmetic/Logic Unit.
Computer Architecture, The Arithmetic/Logic UnitSlide 1 Part III The Arithmetic/Logic Unit.
Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders ECE 645: Lecture 5.
FPGA-Based System Design: Chapter 4 Copyright  2003 Prentice Hall PTR Topics n Number representation. n Shifters. n Adders and ALUs.
Spring C:160/55:132 Page 1 Lecture 19 - Computer Arithmetic March 30, 2004 Sukumar Ghosh.
1 CS 151: Digital Design Chapter 4: Arithmetic Functions and Circuits 4-1,2: Iterative Combinational Circuits and Binary Adders.
1 Arithmetic I Instructor: Mozafar Bag-Mohammadi Ilam University.
Lecture 4. Adder & Subtractor Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education & Research.
N, Z, C, V in CPSR with Adder & Subtractor Prof. Taeweon Suh Computer Science Education Korea University.
Kavita Bala CS 3410, Spring 2014 Computer Science Cornell University.
Wallace Tree Previous Example is 7 Input Wallace Tree
Conditional-Sum Adders Parallel Prefix Network Adders
Digital Integrated Circuits 2e: Chapter Copyright  2002 Prentice Hall PTR, Adapted by Yunsi Fei ECE 300 Advanced VLSI Design Fall 2006 Lecture.
Addition, Subtraction, Logic Operations and ALU Design
CEC 220 Digital Circuit Design
Full Tree Multipliers All k PPs Produced Simultaneously Input to k-input Multioperand Tree Multiples of a (Binary, High-Radix or Recoded) Formed at Top.
ECE/CS 552: Arithmetic I Instructor:Mikko H Lipasti Fall 2010 University of Wisconsin-Madison Lecture notes partially based on set created by Mark Hill.
Addition and multiplication Arithmetic is the most basic thing you can do with a computer, but it’s not as easy as you might expect! These next few lectures.
Multioperand Addition
CSE 8351 Computer Arithmetic Fall 2005 Instructors: Peter-Michael Seidel.
Application of Addition Algorithms Joe Cavallaro.
Carry-Lookahead, Carry-Select, & Hybrid Adders ECE 645: Lecture 2.
EE141 Arithmetic Circuits 1 Chapter 14 Arithmetic Circuits Rev /12/2003 Rev /05/2003.
Reconfigurable Computing - Options in Circuit Design John Morris Chung-Ang University The University of Auckland ‘Iolanthe’ at 13 knots on Cockburn Sound,
Topic: N-Bit parallel and Serial adder
ETE 204 – Digital Electronics Combinational Logic Design Single-bit and Multiple-bit Adder Circuits [Lecture: 9] Instructor: Sajib Roy Lecturer, ETE,ULAB.
Addition and multiplication1 Arithmetic is the most basic thing you can do with a computer, but it’s not as easy as you might expect! These next few lectures.
EE141 Arithmetic Circuits 1 Chapter 14 Arithmetic Circuits Rev /12/2003.
Full Adder Truth Table Conjugate Symmetry A B C CARRY SUM
Conditional-Sum Adders Parallel Prefix Network Adders
Multioperand Addition
UNIVERSITY OF MASSACHUSETTS Dept
Presentation transcript:

ECE 645 – Computer Arithmetic Lecture 6: Multi-Operand Addition ECE 645—Computer Arithmetic 3/5/08

2 Lecture Roadmap Multi-Operand Addition Adder Trees Carry-Save Adders and Carry-Save Adder Trees Wallace and Dadda Trees 5:3 Counter and Generalized Counters

3 Required Reading B. Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 8, Multioperand Addition Note errata at:

Multi-Operand Addition ECE 645 – Computer Arithmetic

5 Applications of Multi-Operand Addition Multiplication Inner product s =  x (i) y (i) =  i=0 n-1 i=0 n-1 p (i) p=a·x

6 Applications of Multi-Operand Addition x y h(0) h(1)h(2)h(3) Finite Impulse Response (FIR) filter 4-tap filter requires 4 numbers to be summed T-tap filter requires T numbers to be summed In the following slides, begin by considering multi-operand addition using two-input k- bit adders O is the "big O" notation, representing an order of magnitude approximation without units Helps as rough approximation for comparisons, ignoring minor components (small constants, etc.) Ripple carry: propagation delay is O(k) Fast adders (carry-select, carry-lookahead, prefix, etc): propagation delay is O(log 2 (k)) k k+2

7 S =  x (i) i=0 n-1 x (i)  [0..2 k -1] S max = n (2 k -1)S min = 0 # of bits of S = log 2 (S max + 1)= = log 2 (n (2 k -1) + 1)  log 2 n 2 k = = k + log 2 n Number of Bits of the Result (Unsigned Binary)

8 T serial-multi-add = O(n log(k + log n)) = O(n log k + n log log n) Therefore, addition time grows superlinearly with n when k is fixed and logarithmically with k for a given n Fig. 8.1 Serial implementation of multi-operand addition with a single 2-operand adder. Serial Implementation of Multi-Operand Addition assume fast adder propagation time

9 Fig. 8.1 Serial multi-operand addition when each adder is a 4-stage pipeline. Problem to think about: Ignoring start-up and other overheads, this scheme achieves a speedup of 4 with 3 adders. How is this possible? Pipelined Implementation for Higher Throughput

Adder Trees ECE 645 – Computer Arithmetic

11 Parallel Addition via Adder Trees We will look at three methods of addition trees 1) fast adder tree (carry-lookahead, carry-select, etc.) 2) ripple carry adder tree 3) carry adder tree  log 2 n  adder levels using two- input adders n – 1 adders

12 Binary Adder Trees T tree-fast-multi-add = O(log k + log(k + 1) log(k +  log 2 n  – 1)) = O(log n log k + log n log log n) explanation: log 2 (n) stages, each stage increasing by one bit T tree-ripple-multi-add = O(k + (k + 1) (k +  log 2 n  – 1)) explanation: log 2 (n) stages, each stage increasing by one bit This can be optimized to: = O(k + log n) [Justified on the next slide]  log 2 n  adder levels using two- input adders n – 1 adders

13 T tree-ripple-multi-add = O(k + log n) Fig. 8.5 Ripple-carry adders at levels i and i + 1 in the tree of adders used for multi-operand addition. Elaboration on Tree of Ripple-Carry Adders Sometimes ripple-carry tree can have advantage over fast adder in that the computations can "overlap" in time between levels May or may not be true with fast adders, depending on architecture In general, fast adders need previous stage to complete before going to next stage

14 Example: Adding 8 three bit numbers x y sum c o HA x y sum c o FA c i x y sum c o FA c i x y sum c o HA x y sum c o FA c i x y sum c o FA c i recall: carry-out becomes the msb of the new sum in unsigned arithmetic A 2 B 2 A 1 B 1 A 0 B 0 C 2 D 2 C 1 D 1 C 0 D 0 x sum c o HA yx sum c o FA c i y sum c o FA c i xy sum c o FA c i xy x sum c o HA yx sum c o FA c i y sum c o FA c i xy sum c o FA c i xy sum c o FA c i xy t+1 t+2 t+1t+2 t+1 t+2 t+3 t+4t+5 tttttttttttt t+6 t+7 k= 3 bits n = 8 operands Delay: 3 + log2(8) + 1 = 7 (must add +1 to be exact) t+3 t+4

15 Latency of Multi-Operand Addition in Parallel T tree-fast-multi-add = O(log k + log(k + 1) log(k +  log 2 n  – 1)) = O(log n log k + log n log log n) T tree-ripple-multi-add = O(k + log n) In the next slides we will present a carry-save adder tree: T carry-save-multi-add = O(tree height + T CPA ) = O(log n + log k)

16 Example Calculations of Latency (Propagation Delay) n=4 k=8 n=4 k=32 n=4 k=128 n=16 k=8 n=16 k=32 n=16 k=128 n=128 k=8 n=128 k=32 n=128 k=128 Fast Adder Tree Ripple Carry Tree Carry Save Tree These are very rough approximations of order of magnitude accuracy only! They are used to show trends. Conclusions of latency/propagation delay: Comparing fast adder tree versus ripple-carry tree: Fast adder usually superior for larger values of k Ripple carry can be superior for large values of n with small values of k Carry-save tree is superior to both Latency for different adder trees (in units of 1 full-adder delay)

Carry-Save Adders ECE 645 – Computer Arithmetic

18 a1a1 b1b1 FA c2c2 s1s1 a0a0 b0b0 c0c0 c1c1 s0s0 a2a2 b2b2 c3c3 s2s2 a n-1 b n-1 FA cncn s n-1 c n-1... Ripple-Carry Carry Propagate Adder (CPA) Critical path is n full adders

19 FA c2c2 s1s1 a0a0 b0b0 c1c1 s0s0 c3c3 s2s2 cncn s n-1 c n-1... c0c0 s3s3 a1a1 b1b1 c1c1 a2a2 b2b2 c2c2 a n-1 b n-1 c n-1 Carry Save Adder Critical path is 1 full adder, regardless of the number of bits n!

20 A Ripple-Carry vs. Carry-Save Adder

xyzxyz scsc Example x+y+z = s + c Operation of a Carry Save Adder (CSA)

22 Carry Propagate and Carry-Save Adders in Dot Notation Fig. 8.7 Carry-propagate adder (CPA) and carry-save adder (CSA) functions in dot notation.

23 Specifying Full- and Half-Adder Blocks in Dot Notation Full adder

Carry-Save Adder Trees ECE 645 – Computer Arithmetic

25 Carry-Save Adder Tree CSA adder tree uses very roughly log 2 (n) stages of carry-save adders This is why use "big O" notation Each stage is of 1 latency unit (i.e. one full adder cell), so these comprise log 2 (n) latency units We will find exact number of stages in upcoming slides One fast k-bit carry-propagate adder Assuming you use a fast adder, this comprises approximately log 2 (k) latency units Carry-propagate adder T carry-save-multi-add = O(tree height + T CPA ) = O(log n + log k) C carry-save-multi-add = (n – 2)C CSA + C CPA

26 x 3 x 2 x 1 x 0 y 3 y 2 y 1 y 0 z 3 z 2 z 1 z 0 w 3 w 2 w 1 w 0 s 3 s 2 s 1 s 0 c 4 c 3 c 2 c 1 w 3 w 2 w 1 w 0 c 4 s 3 s 2 s 1 s 0 c 4 c 3 c 2 c 1 ’’’’ ’ ’ ’’ S 5 S 4 S 3 S 2 S 1 S 0 Carry-Save Adder for Four Operands

27 Carry-Save Adder for Four Operands s0s0 s1s1 s2s2 s3s3 c1c1 c2c2 c3c3 c4c4 s0s0 s1s1 s2s2 s3s3 c1c1 c2c2 c3c3 c4c4 ’ ’ ’ ’ ’ ’ ’ ’

28 x y z CSA 4 w CPA sc s’ c’ S Carry-Save Adder for Four Operands

29 Carry-Save Adder for Six Operands CSA tree Implementation of one-bit slice

30 Tree of Carry-Save Adders Reducing Seven Numbers to Two

31 Addition of Seven Six-Bit Numbers in Dot Notation

32 Width of Adders in a CSA Tree Due to the gradual retirement (dropping out) of some of the result bits, CSA widths do not vary much as we go down the tree levels

33 Serial Carry Save Adder

34 Latency Latency CSA = h(n)  T FA + Latency CPA (k, n) Tree height for n operands Widths CSA CPA k.. k + log 2 n typically close to k bits  k + log 2 n Component Adders Parameters of Tree Carry-Save Adders

35 Relationship Between Number of Inputs and Tree Height

36 Maximum number of inputs that can be reduced to two by an h-level tree, n(h) n(0) = 2 n(h) = n(h-1) 3 2 n(1) = 3 n(2) = 4 n(3) = 6 n(4) = 9 n(5) = 13 n(6) = 19 2 ( ) h-1 < n(h)  2 ( ) h Number of Inputs as Function of Tree Height

37 Maximum number of input n(h)

38 Smallest height of the tree carry save adder for n operands, h(n) h(n) = 1 + h ( ) 2 3 n h(n)  log ( ) h(2) = 0 n Tree Height as Function of Number of Inputs

Wallace and Dadda Trees ECE 645 – Computer Arithmetic

40 Wallace Trees So far all examples of carry-save adder trees have been Wallace Trees Wallace Trees Reduce the size of the final Carry Propagate Adder (CPA) Generally optimum from the point of view of speed Dadda Trees Reduce the cost of the carry save tree Generally optimum (among the CSA trees) from the point of view of area

41 Wallace and Dadda Trees Wallace reduces number of operands at earliest opportunity Goal of this is to have smallest number of bits for CPA adder However, sometimes having a few bits longer CPA adder does not affect the propagation delay significantly (i.e. carry-lookahead) Dadda seeks to reduce the number of FA and HA units May be at the cost of a slightly larger final CPA

42 Wallace Tree

43 Dadda Tree Optimization Step 11 FAs 3 FAs + 2 HAs

5:3 Counter and Generalized Counters ECE 645 – Computer Arithmetic

45 General Compressors So far we have seen a 3 input to 2 output reduction circuit. Often called: Carry-save adder 3:2 compressor 3:2 counter This was implemented as a full adder Can also have other compression ratios 5:3 is an example

abcdeabcde s0 s1 s2 a+b+c+d+e = s0+s1+s2 5-to-3 Parallel Counter (or Compressor)

47 Implementation of 1-bit of 5-to-3 Parallel Counter using Single CLB Slice of a Virtex FPGA

48 ww ww w w CSA CPA y=a+b+c+d+e mod 2 w abcde wwwww w PC CSA CPA y=a+b+c+d+e mod 2 w ab cde s2 s1 s0 Carry Save Adder vs. 5-to-3 Parallel Counter

49 (5, 5; 4)-counter Fig Dot notation for a (5, 5; 4)-counter and the use of such counters for reducing five numbers to two numbers. Multicolumn reduction (2, 3; 3)-counter Unequal columns Generalized parallel counter = Parallel compressor Generalized Parallel Counters 1 st column 2 nd column output