Lecture 11 Advanced Dividers.


Lecture 11 Advanced Dividers

Required Reading
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
  Chapter 15, Variations in Dividers: 15.3, Combinational and Array Dividers
  Chapter 16, Division by Convergence

Division versus Multiplication

Division is more complex than multiplication:
  Need for quotient digit selection or estimation
  Overflow possibility: the high-order k bits of z must be strictly less than d; this overflow check also detects the divide-by-zero condition.

Pentium III latencies

  Instruction                  Latency   Cycles/Issue
  Load / Store                    3           1
  Integer Multiply                4           1
  Integer Divide                 36          36
  Double/Single FP Multiply       5           2
  Double/Single FP Add            3           1
  Double/Single FP Divide        38          38

The ratios haven't changed much in later Pentiums, Atom, or AMD products.*
*Source: T. Granlund, "Instruction Latencies and Throughput for AMD and Intel x86 Processors," Feb. 2012

May 2012 Computer Arithmetic, Division

Classification of Dividers
  Sequential: radix-2 or high-radix; restoring or non-restoring; regular or SRT; with or without carry-save adders
  Array Dividers
  Dividers by Convergence

Fractional Division

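Unsigned Fractional Division
  $z_{frac}$  Dividend   $.z_{-1} z_{-2} \ldots z_{-(2k-1)} z_{-2k}$
  $d_{frac}$  Divisor    $.d_{-1} d_{-2} \ldots d_{-(k-1)} d_{-k}$
  $q_{frac}$  Quotient   $.q_{-1} q_{-2} \ldots q_{-(k-1)} q_{-k}$
  $s_{frac}$  Remainder  $.000 \ldots 0\, s_{-(k+1)} \ldots s_{-(2k-1)} s_{-2k}$  (the leading zeros occupy k bits)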

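Integer vs. Fractional Division
For integers: $z = q d + s$. Multiplying both sides by $2^{-2k}$ gives
$z\,2^{-2k} = (q\,2^{-k})(d\,2^{-k}) + s\,2^{-2k}$
For fractions: $z_{frac} = q_{frac}\, d_{frac} + s_{frac}$, where
$z_{frac} = z\,2^{-2k}$, $d_{frac} = d\,2^{-k}$, $q_{frac} = q\,2^{-k}$, $s_{frac} = s\,2^{-2k}$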
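The scaling between the integer and fractional views can be checked numerically. The following is a minimal sketch using exact rationals; the helper name `to_fractional` and the operand values (k = 4, z = 117, d = 10) are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction

def to_fractional(z, d, k):
    """Scale a 2k-bit dividend and a k-bit divisor into fractions in [0, 1)."""
    return Fraction(z, 2 ** (2 * k)), Fraction(d, 2 ** k)

k = 4
z, d = 117, 10                       # illustrative integers: 117 = 11 * 10 + 7
q, s = divmod(z, d)                  # integer quotient and remainder

z_frac, d_frac = to_fractional(z, d, k)
q_frac = Fraction(q, 2 ** k)         # quotient scaled by 2^-k
s_frac = Fraction(s, 2 ** (2 * k))   # remainder scaled by 2^-2k

# The integer identity z = q d + s survives the scaling by 2^-2k.
assert z_frac == q_frac * d_frac + s_frac
```

The same bit patterns are used in both views; only the position of the implied binary point changes.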

Unsigned Fractional Division Overflow
Condition for no overflow: $z_{frac} < d_{frac}$

Sequential Fractional Division Basic Equations
$s^{(0)} = z_{frac}$
$s^{(j)} = 2 s^{(j-1)} - q_{-j}\, d_{frac}$ for $j = 1, \ldots, k$
$2^k \cdot s_{frac} = s^{(k)}$, so $s_{frac} = 2^{-k} \cdot s^{(k)}$

Fig. 13.2 Examples of sequential division with integer and fractional operands.

Array Dividers

Sequential Fractional Division Basic Equations
$s^{(0)} = z_{frac}$
$s^{(j)} = 2 s^{(j-1)} - q_{-j}\, d_{frac}$
$s^{(k)} = 2^k s_{frac}$

Restoring Unsigned Fractional Division
s(0) = z
for j = 1 to k
    if 2 s(j-1) - d >= 0
        q_{-j} = 1
        s(j) = 2 s(j-1) - d
    else
        q_{-j} = 0
        s(j) = 2 s(j-1)
end for
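The restoring loop can be sketched in Python with exact rationals. This is an illustration under assumptions, not the lecture's reference code: the function name `restoring_divide` is hypothetical, and a trial difference of exactly zero is treated as a successful subtraction so that the final remainder stays below d.

```python
from fractions import Fraction

def restoring_divide(z, d, k):
    """Restoring radix-2 division of fractions; returns (q_frac, s_frac)."""
    if not 0 <= z < d:
        raise ValueError("overflow: need z < d")
    s = z                                # s(0) = z
    q = Fraction(0)
    for j in range(1, k + 1):
        trial = 2 * s - d                # trial subtraction
        if trial >= 0:
            q += Fraction(1, 2 ** j)     # q_{-j} = 1
            s = trial
        else:
            s = 2 * s                    # q_{-j} = 0, keep the shifted value
    return q, s / 2 ** k                 # s_frac = 2^-k * s(k)

q, s = restoring_divide(Fraction(117, 256), Fraction(10, 16), 4)
print(q, s)                              # 11/16 7/256
```

With z = 117/256 and d = 10/16, the digits come out as .1011, which is the integer example 117 = 11 * 10 + 7 scaled to fractions.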

Restoring Array Divider

Non-Restoring Unsigned Fractional Division
s(-1) = z - d
for j = 0 to k-1
    if s(j-1) >= 0
        q_{-j} = 1
        s(j) = 2 s(j-1) - d
    else
        q_{-j} = 0
        s(j) = 2 s(j-1) + d
end for
if s(k-1) >= 0
    q_{-k} = 1
else
    q_{-k} = 0
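A minimal Python sketch of the non-restoring loop follows. It assumes z < d (no overflow), treats a zero partial remainder as non-negative, and omits the final remainder correction; the function name `nonrestoring_divide` is hypothetical.

```python
from fractions import Fraction

def nonrestoring_divide(z, d, k):
    """Non-restoring radix-2 division; returns q = q_0 . q_-1 ... q_-k."""
    s = z - d                            # s(-1) = z - d
    q = Fraction(0)
    for j in range(0, k):
        if s >= 0:
            q += Fraction(1, 2 ** j)     # q_{-j} = 1
            s = 2 * s - d                # subtract after the shift
        else:
            s = 2 * s + d                # q_{-j} = 0, add back after the shift
    if s >= 0:                           # sign of s(k-1) decides q_{-k}
        q += Fraction(1, 2 ** k)
    return q

print(nonrestoring_divide(Fraction(117, 256), Fraction(10, 16), 4))   # 11/16
```

Each partial remainder here equals the trial difference that a restoring divider would compute at the same step, so both schemes produce the same quotient bits; q_0 is 0 whenever z < d.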

Nonrestoring Array Divider
Similarity to array multiplier is deceiving. [Figure label: critical path]

Division by Convergence

Division by Convergence
Chapter Goals
  Show how, by using multiplication as the basic operation in each division step, the number of iterations can be reduced
Chapter Highlights
  Digit-recurrence as convergence method
  Convergence by Newton-Raphson iteration
  Computing the reciprocal of a number
  Hardware implementation and fine tuning

16.1 General Convergence Methods
Sequential digit-at-a-time (binary or high-radix) division can be viewed as a convergence scheme.
As each new digit of $q = z / d$ is determined, the quotient value is refined until it reaches the final correct value.
Convergence is from below in restoring division and oscillating in nonrestoring division.
[Figure: quotient value q refined digit by digit, e.g. 0.101101]
Meanwhile, the remainder $s = z - q \times d$ approaches 0; the scaled remainder is kept in a certain range, such as $[-d, d)$.

Elaboration on Scaled Remainder in Division
The partial remainder $s^{(j)}$ in the division recurrence isn't the true remainder but a version scaled by $2^j$.
Division with left shifts:
$s^{(j)} = 2 s^{(j-1)} - q_{k-j}\,(2^k d)$, with $s^{(0)} = z$ and $s^{(k)} = 2^k s$
(the factor of 2 is the shift; the full right-hand side is the subtract step)
Quotient digit selection keeps the scaled remainder bounded (say, in the range $-d$ to $d$) to ensure the convergence of the true remainder to 0.
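The left-shift recurrence on integers can be sketched as follows. This is an illustrative restoring-style version with hypothetical names (`shift_subtract_divide`, `s_scaled`); it shows that the value left after k steps is the true remainder scaled by 2^k.

```python
def shift_subtract_divide(z, d, k):
    """Divide a 2k-bit integer z by a k-bit integer d; returns (q, s)."""
    # Overflow check: high-order k bits of z must be less than d.
    # For d == 0 this always fails, so divide-by-zero is caught too.
    if (z >> k) >= d:
        raise OverflowError("high-order k bits of z must be less than d")
    s_scaled = z                             # s(0) = z
    q = 0
    for _ in range(k):
        s_scaled *= 2                        # shift left
        digit = 1 if s_scaled >= (d << k) else 0
        s_scaled -= digit * (d << k)         # subtract q_{k-j} * (2^k d)
        q = 2 * q + digit                    # append the new quotient bit
    assert s_scaled % (1 << k) == 0          # s(k) = 2^k * s
    return q, s_scaled >> k

print(shift_subtract_divide(117, 10, 4))     # (11, 7)
```

Keeping the divisor aligned at 2^k d and shifting the partial remainder left is what lets hardware reuse one fixed-position subtractor for every step.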

Recurrence Formulas for Convergence Methods
Two-variable form:
$u^{(i+1)} = f(u^{(i)}, v^{(i)})$
$v^{(i+1)} = g(u^{(i)}, v^{(i)})$
Three-variable form:
$u^{(i+1)} = f(u^{(i)}, v^{(i)}, w^{(i)})$
$v^{(i+1)} = g(u^{(i)}, v^{(i)}, w^{(i)})$
$w^{(i+1)} = h(u^{(i)}, v^{(i)}, w^{(i)})$
Guide the iteration such that one of the values converges to a constant (usually 0 or 1). The other value then converges to the desired function.
The complexity of this method depends on two factors:
  a. Ease of evaluating f and g (and h)
  b. Rate of convergence (number of iterations needed)

16.2 Division by Repeated Multiplications
Motivation: Suppose add takes 1 clock and multiply 3 clocks. A 64-bit divide takes 64 clocks in radix 2, 32 in radix 4. Division via multiplications is therefore faster if 10 or fewer multiplications are needed.
Idea: $q = \dfrac{z}{d} = \dfrac{z\, x^{(0)} x^{(1)} \cdots x^{(m-1)}}{d\, x^{(0)} x^{(1)} \cdots x^{(m-1)}}$, where the numerator converges to q and the denominator is forced to 1.
Remainder often not needed, but can be obtained by another multiplication if desired: $s = z - qd$
To turn the identity into a division algorithm, we face three questions:
  1. How to select the multipliers $x^{(i)}$?
  2. How many iterations (pairs of multiplications)?
  3. How to implement in hardware?

Formulation as a Convergence Computation
Idea: the denominator is forced to 1 while the numerator converges to q.
$d^{(i+1)} = d^{(i)} x^{(i)}$    Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
$z^{(i+1)} = z^{(i)} x^{(i)}$    Set $z^{(0)} = z$; obtain $z/d = q \approx z^{(m)}$
Question 1: How to select the multipliers $x^{(i)}$?
$x^{(i)} = 2 - d^{(i)}$
This choice transforms the recurrence equations into:
$d^{(i+1)} = d^{(i)} (2 - d^{(i)})$    Set $d^{(0)} = d$; iterate until $d^{(m)} \approx 1$
$z^{(i+1)} = z^{(i)} (2 - d^{(i)})$    Set $z^{(0)} = z$; obtain $z/d = q \approx z^{(m)}$
This fits the general form $u^{(i+1)} = f(u^{(i)}, v^{(i)})$, $v^{(i+1)} = g(u^{(i)}, v^{(i)})$.

Determining the Rate of Convergence
$d^{(i+1)} = d^{(i)} x^{(i)}$    Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
$z^{(i+1)} = z^{(i)} x^{(i)}$    Set $z^{(0)} = z$; obtain $z/d = q \approx z^{(m)}$
Question 2: How quickly does $d^{(i)}$ converge to 1?
We can relate the error in step i + 1 to the error in step i:
$d^{(i+1)} = d^{(i)} (2 - d^{(i)}) = 1 - (1 - d^{(i)})^2$
$1 - d^{(i+1)} = (1 - d^{(i)})^2$
For $1 - d^{(i)} \le \varepsilon$, we get $1 - d^{(i+1)} \le \varepsilon^2$: quadratic convergence.
In general, for k-bit operands, we need $2m - 1$ multiplications and m 2's complementations, where $m = \lceil \log_2 k \rceil$.
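The recurrences and the error-squaring behaviour can be observed directly. The sketch below uses exact rationals, so it does not model the truncation of a real k-bit multiplier; the function name and the sample operands (z = 1/4, d = 3/4) are assumptions made for illustration.

```python
from fractions import Fraction

def divide_by_repeated_multiplication(z, d, m):
    """Run m iterations; returns (z(m), [1 - d(0), ..., 1 - d(m)])."""
    if not Fraction(1, 2) <= d < 1:
        raise ValueError("expects a normalized divisor, 1/2 <= d < 1")
    errors = [1 - d]
    for _ in range(m):
        x = 2 - d                        # x(i) = 2 - d(i), a 2's complement
        d, z = d * x, z * x              # the pair of multiplications
        errors.append(1 - d)
    return z, errors

q, errors = divide_by_repeated_multiplication(Fraction(1, 4), Fraction(3, 4), 4)
print(float(q))                          # close to 1/3
print([float(e) for e in errors])        # each error is the square of the last
```

Starting from 1 - d = 1/4, the errors are 1/4, 1/16, 1/256, 1/65536 and 2^-32, so four iterations already give about 32 bits of agreement with 1/3.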

Quadratic Convergence
Table 16.1 Quadratic convergence in computing z/d by repeated multiplications, where 1/2 ≤ d = 1 - y < 1

  i   d(i) = d(i-1) x(i-1), with d(0) = d                  x(i) = 2 - d(i)
  0   1 - y    = (.1xxx xxxx xxxx xxxx)two ≥ 1/2           1 + y
  1   1 - y^2  = (.11xx xxxx xxxx xxxx)two ≥ 3/4           1 + y^2
  2   1 - y^4  = (.1111 xxxx xxxx xxxx)two ≥ 15/16         1 + y^4
  3   1 - y^8  = (.1111 1111 xxxx xxxx)two ≥ 255/256       1 + y^8
  4   1 - y^16 = (.1111 1111 1111 1111)two = 1 - ulp

Each iteration doubles the number of guaranteed leading 1s (convergence to 1 is from below).
Beginning with a single 1 (d ≥ 1/2), after log2 k iterations we get as close to 1 as is possible in a fractional representation.

Graphical Depiction of Convergence to q
Fig. 16.1 Graphical representation of convergence in division by repeated multiplications.

16.5 Hardware Implementation
Repeated multiplications: each pair of operations involves the same multiplier
$d^{(i+1)} = d^{(i)} (2 - d^{(i)})$    Set $d^{(0)} = d$; iterate until $d^{(m)} \approx 1$
$z^{(i+1)} = z^{(i)} (2 - d^{(i)})$    Set $z^{(0)} = z$; obtain $z/d = q \approx z^{(m)}$
Fig. 16.6 Two multiplications fully overlapped in a 2-stage pipelined multiplier.

16.3 Division by Reciprocation
The Newton-Raphson method can be used for finding a root of $f(x) = 0$.
Start with an initial estimate $x^{(0)}$ for the root.
Iteratively refine the estimate via the recurrence
$x^{(i+1)} = x^{(i)} - f(x^{(i)}) / f'(x^{(i)})$
Justification: $\tan \alpha^{(i)} = f'(x^{(i)}) = f(x^{(i)}) / (x^{(i)} - x^{(i+1)})$
Fig. 16.2 Convergence to a root of f(x) = 0 in the Newton-Raphson method.

Computing 1/d by Convergence
1/d is the root of $f(x) = 1/x - d$, with $f'(x) = -1/x^2$.
Substitute in the Newton-Raphson recurrence $x^{(i+1)} = x^{(i)} - f(x^{(i)}) / f'(x^{(i)})$ to get:
$x^{(i+1)} = x^{(i)} (2 - x^{(i)} d)$
One iteration = two multiplications + one 2's complementation
Error analysis: Let $\delta^{(i)} = 1/d - x^{(i)}$ be the error at the ith iteration.
$\delta^{(i+1)} = 1/d - x^{(i+1)} = 1/d - x^{(i)} (2 - x^{(i)} d) = d\,(1/d - x^{(i)})^2 = d\,(\delta^{(i)})^2$
Because $d < 1$, we have $\delta^{(i+1)} < (\delta^{(i)})^2$.
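A short sketch of the reciprocal iteration, again with exact rationals so the error relation can be checked exactly rather than approximately. The function name `reciprocal_newton` and the sample values (d = 3/4, starting estimate 3/2) are illustrative assumptions.

```python
from fractions import Fraction

def reciprocal_newton(d, x0, iterations):
    """Return the iterates x(0), x(1), ... approximating 1/d."""
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(x * (2 - x * d))       # two multiplications, one complement
    return xs

d = Fraction(3, 4)
xs = reciprocal_newton(d, Fraction(3, 2), 4)
deltas = [1 / d - x for x in xs]         # error at each iteration
print([float(x) for x in xs])            # approaches 1/d = 1.333...
```

For this input the errors are -1/6, 1/48, 1/3072 and so on, each equal to d times the square of the previous one; after the first step the estimates approach 1/d from below.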

Choosing the Initial Approximation to 1/d
With $x^{(0)}$ in the range $0 < x^{(0)} < 2/d$, convergence is guaranteed.
Justification: $|\delta^{(0)}| = |x^{(0)} - 1/d| < 1/d$, so
$|\delta^{(1)}| = |x^{(1)} - 1/d| = d\,(\delta^{(0)})^2 = (d\,|\delta^{(0)}|)\,|\delta^{(0)}| < |\delta^{(0)}|$
For d in [1/2, 1):
  Simple choice: $x^{(0)} = 1.5$; max error = 0.5 < 1/d
  Better approximation: $x^{(0)} = 4(\sqrt{3} - 1) - 2d = 2.9282 - 2d$; max error ≈ 0.1
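The two error bounds quoted for d in [1/2, 1) can be reproduced by sampling. This is a rough floating-point scan, not a proof; the helper name `max_error` and the sample count are assumptions.

```python
import math

def max_error(x0_of_d, samples=1000):
    """Largest |x(0) - 1/d| over evenly sampled d in [1/2, 1)."""
    worst = 0.0
    for n in range(samples):
        d = 0.5 + 0.5 * n / samples
        worst = max(worst, abs(x0_of_d(d) - 1 / d))
    return worst

simple = max_error(lambda d: 1.5)                             # constant estimate
linear = max_error(lambda d: 4 * (math.sqrt(3) - 1) - 2 * d)  # linear estimate
print(simple, linear)
```

The constant estimate is worst at d = 1/2, while the linear estimate is worst near d = 1/sqrt(2), where its error is just under 0.1.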

16.4 Speedup of Convergence Division
Compute $y = 1/d$, then do the multiplication $yz$.
Division can be performed via $2 \log_2 k - 1$ multiplications.
This is not yet very impressive: 64-bit numbers, 3-ns multiplier ⇒ 33-ns division
Three types of speedup are possible:
  Fewer multiplications (reduce m)
  Narrower multiplications (reduce the width of some $x^{(i)}$s)
  Faster multiplications
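The latency figure follows from the multiplication count. A small sketch, assuming the iteration count is the ceiling of log2 k and that multiplications are not overlapped; the function name is hypothetical.

```python
import math

def convergence_division_ns(k, multiply_ns):
    """Latency of k-bit division done as 2*log2(k) - 1 multiplications."""
    m = math.ceil(math.log2(k))          # number of iterations
    return (2 * m - 1) * multiply_ns

print(convergence_division_ns(64, 3))    # 33, the 64-bit, 3-ns example
```

Each of the three speedups listed above attacks one factor in this product: the iteration count m, the width of the operands being multiplied, or the time per multiplication.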

Initial Approximation via Table Lookup
Convergence is slow in the beginning: it takes 6 multiplications to get 8 bits of convergence and another 5 to go from 8 bits to 64 bits.
$d\, x^{(0)} x^{(1)} x^{(2)} = (0.1111\ 1111 \ldots)_{two}$
Here $x^{(0)}$ is an approximation to 1/d, and the product $x^{(0)} x^{(1)} x^{(2)}$ is a better approximation, denoted $x^{(0+)}$.
Read this value, $x^{(0+)}$, directly from a table, thereby reducing 6 multiplications to 2.
A $2^w \times w$ lookup table is necessary and sufficient for w bits of convergence after 2 multiplications.
Example with 4-bit lookup: d = 0.1011 xxxx . . . (11/16 ≤ d < 12/16)
Inverses of the two extremes are 16/11 ≈ 1.0111 and 16/12 ≈ 1.0101, so 1.0110 is a good estimate for 1/d:
  1.0110 × 0.1011 = (11/8) × (11/16) = 121/128 = 0.1111001
  1.0110 × 0.1100 = (11/8) × (3/4) = 33/32 = 1.000010
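The 4-bit lookup example can be checked with exact rationals. This sketch only verifies the two products from the slide; the variable names are illustrative, and no claim is made about how a real table would be generated.

```python
from fractions import Fraction

d_low, d_high = Fraction(11, 16), Fraction(12, 16)   # d = 0.1011 xxxx...
x0 = Fraction(22, 16)                                # table entry 1.0110 = 11/8

low_product = d_low * x0       # 121/128 = (0.1111001)two
high_product = d_high * x0     # 33/32   = (1.00001)two

# Worst-case distance of d * x(0) from 1 over the interval
worst = max(abs(1 - low_product), abs(1 - high_product))
print(low_product, high_product, worst)              # 121/128 33/32 7/128
```

The worst-case distance from 1 is 7/128, just under 2^-4, and the upper end of the interval overshoots 1, so convergence with a table-based start need not be from below.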

Visualizing the Convergence with Table Lookup
Fig. 16.3 Convergence in division by repeated multiplications with initial table lookup.

Convergence Does Not Have to Be from Below
Fig. 16.4 Convergence in division by repeated multiplications with initial table lookup and the use of truncated multiplicative factors.

Sequential Dividers with Carry-Save Adders

Block diagram of a radix-2 SRT divider with partial remainder in stored-carry form

Pentium bug (1)
October 1994: Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations and traces it back to the Pentium processor
November 7, 1994: First press announcement, Electronic Engineering Times
Late 1994: Tim Coe, Vitesse Semiconductor, presents an example with the worst-case error
  c = 4 195 835 / 3 145 727
  Pentium = 1.333 739 06...
  Correct result = 1.333 820 44...

Pentium bug (2)
Intel admits "subtle flaw"
November 30, 1994: Intel's white paper about the bug and its possible consequences
  Intel: average spreadsheet user affected once in 27,000 years
  IBM: average spreadsheet user affected once every 24 days
Replacements based on customer needs
December 20, 1994: Announcement of no-questions-asked replacements

Pentium bug (3)
Error traced back to the look-up table used by the radix-4 SRT division algorithm
  2048 cells, 1066 non-zero values {-2, -1, 1, 2}
  5 non-zero values not downloaded correctly to the lookup table due to an error in the C script

Follow-up Courses

DIGITAL SYSTEMS DESIGN
1. ECE 681 VLSI Design for ASICs (Fall semesters) H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
2. ECE 699 Digital Signal Processing Hardware Architectures (Spring semesters) A. Cohen, project, FPGA design for DSP
3. ECE 682 VLSI Test Concepts (Spring semesters) T. Storey, homework

NETWORK AND SYSTEM SECURITY
1. ECE 646 Cryptography and Computer Network Security (Fall semesters) K. Gaj, hardware, software, or analytical project
2. ECE 746 Advanced Applied Cryptography (Spring semesters) J.-P. Kaps, hardware, software, or analytical project
3. ECE 899 Cryptographic Engineering (Spring semesters) J.-P. Kaps, research-oriented project