Instructor: Prof. Chung-Kuan Cheng

Presentation transcript:

CSE246 Adder – Part II. Instructor: Prof. Chung-Kuan Cheng

Zimmermann's Heuristic Approach: Problem formulation: given a depth constraint, generate a parallel prefix adder of minimum size. Two-step heuristic: start with a serial prefix adder. Step 1: compress to the fastest prefix structure at the cost of increased size, working from LSB to MSB and from low level to high level. Step 2: expand to reduce size, subject to the depth constraint, working from MSB to LSB and from high level to low level. 2018/12/2

Zimmermann's Heuristic Approach: Local compression/expansion operation: up/down shift. The compression oper

Zimmermann's Heuristic Approach: Advantages: simple and fast; produces depth-size optimal results in many cases; handles non-uniform input arrival times. Disadvantage: no guarantee of optimality.

Prefix Adder with an Arbitrary Input Arrival Time Profile: Non-uniform input arrival times are represented as real numbers. How do we construct the fastest prefix adder under an arbitrary input arrival time profile?

Cont'd: Timing model. All (G,P) generators have the same delay C. Denote the output timing of generator (G,P)[i:j] as t[i:j]. If, in the prefix graph, (G,P)[i:j] is generated from (G,P)[i:k] and (G,P)[k-1:j], then t[i:j] = max{t[i:k], t[k-1:j]} + C.

Dynamic Programming: The Idea. Imagine a full array of partial prefix results. All (G,P) signals of length i are on level i, and the rightmost signals are the wanted prefix results. Generate all the (G,P) signals row by row, from lower level to higher level. For each (G,P) signal, find the scheme that leads to the best timing, i.e., find the partition point k such that (G,P)[i:j] = (G,P)[i:k] ∘ (G,P)[k-1:j] and t[i:j] = min over k of {max{t[i:k], t[k-1:j]} + C}.
Level 1: t[n:n], t[n-1:n-1], ..., t[2:2], t[1:1]
Level 2: t[n:n-1], ..., t[2:1]
Level 3: t[n:n-2], ..., t[3:1]
...
Level n-1: t[n:2], t[n-1:1]
Level n: t[n:1]

Dynamic Programming: A 5-bit example.
Level 1: 2 (g4p4), 4 (g3p3), 3 (g2p2), 1 (g1p1)
Level 2: 6, 5, 3 (GP[1,0])
Level 3: 8, 7, 5 (GP[2,0])
Level 4: 8, 7 (GP[3,0])
Level 5: 7, 8, 8 (GP[4,0])
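The recurrence t[i:j] = min over k of {max{t[i:k], t[k-1:j]} + C} can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code. The function name `fastest_prefix_timing` is hypothetical. The arrival times for bits 4 to 1 are the ones labeled in the 5-bit example; the arrival time of bit 0 (taken as 1) and the generator delay C = 2 do not appear in the transcript and are assumptions, chosen because they reproduce the labeled values 3, 5, 7, 8 for GP[1,0] to GP[4,0].

```python
def fastest_prefix_timing(arrival, C):
    """Dynamic program over all (G,P)[i:j] signals, level by level.

    arrival[b] is the arrival time of (g_b, p_b).
    Returns (t, split): t[(i, j)] is the best output time of (G,P)[i:j],
    split[(i, j)] is one partition point k achieving it.
    """
    n = len(arrival)
    t = {(b, b): arrival[b] for b in range(n)}   # level 1: single bits
    split = {}
    for length in range(2, n + 1):               # level = signal length
        for j in range(0, n - length + 1):
            i = j + length - 1
            best_k, best_t = None, None
            # (G,P)[i:j] is built from (G,P)[i:k] and (G,P)[k-1:j]
            for k in range(j + 1, i + 1):
                cand = max(t[(i, k)], t[(k - 1, j)]) + C
                if best_t is None or cand < best_t:
                    best_k, best_t = k, cand
            t[(i, j)] = best_t
            split[(i, j)] = best_k
    return t, split


# Index = bit position. Bit 0's arrival time and C are assumed values.
arrival = [1, 1, 3, 4, 2]
t, split = fastest_prefix_timing(arrival, C=2)
print([t[(i, 0)] for i in range(5)])  # [1, 3, 5, 7, 8]
```

The triple loop searches (i-j) partition points for each of the O(n^2) signals, which is the straightforward O(n^3) version of the method.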

Dynamic Programming: Complexity. For (G,P)[i:j], search (i-j) combinations; overall O(n^3). Hints for reducing complexity: for (G,P)[i:j] there may be more than one optimal partition point, but only one is needed. At least one optimal partition point of (G,P)[i:j] is bounded by the optimal partition points of (G,P)[i-1:j] and (G,P)[i:j+1].

Backward Reduction I: Some of the partial prefix results are not used and can therefore be removed. [Figure: panels (a) and (b), each showing Levels 1 to 5.]

Backward Reduction II: Some nodes may be over-tightened and can be relaxed to reduce area. [Figure: 4-bit prefix graph with inputs 3 (g4p4), 6 (g3p3), 7 (g2p2), 11 (g1p1) and nodes (G,P)[2,1], (G,P)[3,1], (G,P)[4,1], (G,P)[4,2], annotated with timing values.]

A Missing Detail: (G,P) signals are allowed to overlap, so the search space increases. However, allowing overlap does not produce better timing. (G,P)[i:j] = (G,P)[i:k] ∘ (G,P)[l:j], l ≥ k.

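Function-Level Optimization: Carry Skip Adder. [Figure: three adder blocks, A0 (inputs a3,0, b3,0, cin; outputs c4, p3,0), A1 (inputs a7,4, b7,4; outputs c8, p7,4), A2 (inputs a11,8, b11,8; outputs c12, p11,8), with multiplexer output x.] If p3,0 = p3 p2 p1 p0 = 1, then x = cin.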
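The skip rule can be illustrated with a small functional model in Python. This is a sketch under assumptions: the names `ripple_block` and `carry_skip_add` are hypothetical, bit propagate is taken as p = a XOR b, and the model covers a 12-bit adder made of three 4-bit blocks.

```python
def ripple_block(a, b, cin, width=4):
    """Ripple-carry add of one block.

    Returns (sum_bits, carry_out, block_propagate), where
    block_propagate is the AND of all bit propagates p_i = a_i XOR b_i.
    """
    carry, s, prop = cin, 0, 1
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p, g = ai ^ bi, ai & bi
        s |= (p ^ carry) << i
        carry = g | (p & carry)
        prop &= p
    return s, carry, prop


def carry_skip_add(a, b, cin=0, n=12, width=4):
    """Add two n-bit numbers block by block with a skip multiplexer."""
    mask = (1 << width) - 1
    result, carry = 0, cin
    for lo in range(0, n, width):
        s, ripple_cout, prop = ripple_block(
            (a >> lo) & mask, (b >> lo) & mask, carry, width)
        result |= s << lo
        # Skip mux: if every bit of the block propagates, the block's
        # carry-out is simply its carry-in.
        carry = carry if prop else ripple_cout
    return result, carry


s, cout = carry_skip_add(0x0FF, 0x001)
print(hex(s), cout)  # 0x100 0
```

The model only checks function. The speed benefit of the multiplexer is a timing property that a functional model does not capture.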

False Path: A1 <- MUX <- A0 <- cin is a false path. If the carry comes from cin, the block must have p3 p2 p1 p0 = 1. Since p3,0 = 1, g3,0 must be 0, so the carry is not generated in A0. The carry does not need to propagate through A0; it goes through the MUX.
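The claim that p3,0 = 1 forces g3,0 = 0 can be checked exhaustively for a 4-bit block. The sketch below assumes the bit propagate is defined as p = a XOR b (with an OR-based propagate the claim would not hold); the name `block_signals` is hypothetical.

```python
from itertools import product


def block_signals(a, b, width=4):
    """Return (P, G) for one block: block propagate and block generate."""
    P, G = 1, 0
    for i in range(width):          # LSB to MSB
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p, g = ai ^ bi, ai & bi
        G = g | (p & G)             # generate here, or pass a lower generate
        P &= p
    return P, G


# Collect every input pair where the block both propagates and generates.
violations = [(a, b) for a, b in product(range(16), repeat=2)
              if block_signals(a, b) == (1, 1)]
print(violations)  # []
```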

False Path: Cycles. [Figure: adder with inputs A3,0 and B3,0, output S3,0, and Cout connected back to Cin.] Cycles of false paths, e.g. 1's complement number addition. Positive: x. Negative: (2^n - 1) - x. Addition: ((2^n - 1) - x) + ((2^n - 1) - y) = 2^n + (2^n - 1) - (x + y) - 1.

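Example: -3 - 5 = -8
   11100   (-3)
 + 11010   (-5)
  110110   (carry out = 1)
  110111   (with the carry out fed back as carry in; sum bits 10111 = -8)
Example: 0 + 0 = 0
   11111   (0)
 + 11111   (0)
  111110   (carry out = 1)
  111111   (with the carry out fed back as carry in; sum bits 11111 = 0)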
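The two examples can be reproduced with a short Python sketch of 1's complement addition with an end-around carry. The helper names `encode`, `decode`, and `ones_complement_add` are hypothetical, and the 5-bit width matches the examples.

```python
def encode(x, n=5):
    """1's complement encoding: x if x >= 0, else (2^n - 1) - |x|."""
    mask = (1 << n) - 1
    return x if x >= 0 else mask - (-x)


def decode(v, n=5):
    """Inverse of encode; both 00000 and 11111 decode to 0."""
    mask = (1 << n) - 1
    return v if (v >> (n - 1)) == 0 else -(mask - v)


def ones_complement_add(a, b, n=5):
    """Add two n-bit 1's complement words with end-around carry."""
    mask = (1 << n) - 1
    total = a + b
    # Feed the carry-out back into the least significant position.
    return ((total & mask) + (total >> n)) & mask


r = ones_complement_add(encode(-3), encode(-5))
print(format(r, "05b"), decode(r))  # 10111 -8
```

Adding the carry-out back in cannot produce a second carry, because when the carry-out is 1 the low n bits are at most 2^n - 2.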

Multi-Operand Addition: Carry-save adder: a (3,2) counter.

Example: A (3,2) counter compresses X rows to (2/3)X rows each time. The implementation uses a tree structure.
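A minimal Python sketch of this reduction, treating each row as a non-negative integer: every group of three rows passes through a bitwise (3,2) counter and becomes two rows, until two rows remain for a final carry-propagate addition. The names `carry_save_add` and `multi_operand_add` are hypothetical.

```python
def carry_save_add(x, y, z):
    """Bitwise (3,2) counter: three rows in, a sum row and a carry row out."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1   # majority, shifted to its weight
    return s, c


def multi_operand_add(operands):
    """Reduce rows three at a time until two remain, then add them."""
    rows = list(operands)
    while len(rows) > 2:
        nxt = []
        for i in range(0, len(rows) - 2, 3):
            nxt.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fill a group of three pass through unchanged.
        nxt.extend(rows[len(rows) - len(rows) % 3:])
        rows = nxt
    return sum(rows)                          # final carry-propagate add


print(carry_save_add(5, 3, 6))                      # (0, 14)
print(multi_operand_add([3, 5, 7, 9, 11, 13, 15]))  # 63
```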

Other Counters: (7,3) counter, with outputs S2, S1, S0; (5,3) counter, with outputs Ca, Cb, S0. [Figure: design of a (5,3) counter using full adders.]
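As an illustration of building a larger counter from full adders, the sketch below wires a (7,3) counter out of four full adders. This is a common textbook construction and is not taken from the slide's figure, whose (5,3) wiring is not recoverable here; the names `full_adder` and `counter_7_3` are hypothetical.

```python
from itertools import product


def full_adder(a, b, c):
    """(3,2) counter on single bits: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_7_3(bits):
    """Count the ones among 7 input bits; returns (S2, S1, S0)."""
    x0, x1, x2, x3, x4, x5, x6 = bits
    s_a, c_a = full_adder(x0, x1, x2)      # weight-1 sum, weight-2 carry
    s_b, c_b = full_adder(x3, x4, x5)
    s0, c_c = full_adder(s_a, s_b, x6)     # combine the weight-1 bits
    s1, s2 = full_adder(c_a, c_b, c_c)     # combine the weight-2 bits
    return s2, s1, s0


# Exhaustive check against a plain population count.
ok = all(4 * s2 + 2 * s1 + s0 == sum(bits)
         for bits in product((0, 1), repeat=7)
         for s2, s1, s0 in [counter_7_3(bits)])
print(ok)  # True
```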