June 2007 Computer Arithmetic, Function EvaluationSlide 1 VI Function Evaluation Topics in This Part Chapter 21 Square-Rooting Methods Chapter 22 The CORDIC.

Slides:

Advertisements

Similar presentations

Roundoff and truncation errors

Advertisements

ECE 645 – Computer Arithmetic Lecture 11: Advanced Topics and Final Review ECE 645—Computer Arithmetic 4/22/08.

CENG536 Computer Engineering department Çankaya University.

Chapter 3. Interpolation and Extrapolation Hui Pan, Yunfei Duan.

Radix Conversion Given a value X represented in source system with radix  s, represent the same number in a destination system with radix  d Consider.

Floating Point Numbers

INFINITE SEQUENCES AND SERIES

EE 382 Processor DesignWinter 98/99Michael Flynn 1 AT Arithmetic Most concern has gone into creating fast implementation of (especially) FP Arith. Under.

Copyright 2008 Koren ECE666/Koren Part.9b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer.

Algorithm Design Techniques: Induction Chapter 5 (Except Section 5.6)

Level ISA3: Information Representation

TECHNIQUES OF INTEGRATION

CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Fall 2006 Lecture 11 Cordic, Log, Square, Exponential Functions.

UNIVERSITY OF MASSACHUSETTS Dept

UNIVERSITY OF MASSACHUSETTS Dept

Digital Kommunikationselektronik TNE027 Lecture 3 1 Multiply-Accumulator (MAC) Compute Sum of Product (SOP) Linear convolution y[n] = f[n]*x[n] = Σ f[k]

Spring 2006EE VLSI Design II - © Kia Bazargan 368 EE 5324 – VLSI Design II Kia Bazargan University of Minnesota Part IX: CORDIC Algorithms.

June 2007Computer Arithmetic, Function EvaluationSlide 1 Part VI Function Evaluation.

May 2012Computer Arithmetic, Function EvaluationSlide 1 Part VI Function Evaluation 28. Reconfigurable Arithmetic Appendix: Past, Present, and Future.

NUMERICAL METHODS WITH C++ PROGRAMMING

COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.

Computer Arithmetic Integers: signed / unsigned (can overflow) Fixed point (can overflow) Floating point (can overflow, underflow) (Boolean / Character)

Ch. 21. Square-rootingSlide 1 VI Function Evaluation Topics in This Part Chapter 21 Square-Rooting Methods Chapter 22 The CORDIC Algorithms Chapter 23.

ECE 645 – Computer Arithmetic Lecture 10: Fast Dividers ECE 645—Computer Arithmetic 4/15/08.

ECE 645 – Computer Arithmetic Lecture 9: Basic Dividers ECE 645—Computer Arithmetic 4/1/08.

Binary Representation and Computer Arithmetic

Preview of Calculus.

Copyright © Cengage Learning. All rights reserved. CHAPTER 11 ANALYSIS OF ALGORITHM EFFICIENCY ANALYSIS OF ALGORITHM EFFICIENCY.

CpE- 310B Engineering Computation and Simulation Dr. Manal Al-Bzoor

1 Calculating Polynomials We will use a generic polynomial form of: where the coefficient values are known constants The value of x will be the input and.

(2.1) Fundamentals  Terms for magnitudes – logarithms and logarithmic graphs  Digital representations – Binary numbers – Text – Analog information 

Data Representation – Binary Numbers

Taylor Series & Error. Series and Iterative methods Any series ∑ x n can be turned into an iterative method by considering the sequence of partial sums.

Department of Computer Systems Engineering, N-W.F.P. University of Engineering & Technology. DSP Presentation Computing Multiplication & division using.

Floating Point (a brief look) We need a way to represent –numbers with fractions, e.g., –very small numbers, e.g., –very large numbers,

Analysis of Algorithms

Integer Conversion Between Decimal and Binary Bases Conversion of decimal to binary more complicated Task accomplished by –Repeated division of decimal.

Chapter 3 Whole Numbers Section 3.6 Algorithms for Whole-Number Multiplication and Division.

Digital Kommunikationselektronik TNE027 Lecture 2 1 FA x n –1 c n c n1- y n1– s n1– FA x 1 c 2 y 1 s 1 c 1 x 0 y 0 s 0 c 0 MSB positionLSB position Ripple-Carry.

Computer Arithmetic and the Arithmetic Unit Lesson 2 - Ioan Despi.

Fixed and Floating Point Numbers Lesson 3 Ioan Despi.

ECE 8053 Introduction to Computer Arithmetic (Website: Course & Text Content: Part 1: Number Representation.

In section 11.9, we were able to find power series representations for a certain restricted class of functions. Here, we investigate more general problems.

Lecture notes Reading: Section 3.4, 3.5, 3.6 Multiplication

CORDIC Algorithm COordinate Rotation DIgital Computer Method for Elementary Function Evaluation (e.g., sin(z), cos(z), tan -1 (y)) Originally Used for.

Advanced Dividers Lecture 10. Required Reading Chapter 13, Basic Division Schemes 13.4, Non-Restoring and Signed Division Chapter 15 Variation in Dividers.

FAMU-FSU College of Engineering 1 Part III The Arithmetic/Logic Unit.

Lecture 11 Advanced Dividers.

FAMU-FSU College of Engineering 1 Computer Architecture EEL 4713/5764, Spring 2006 Dr. Michael Frank Module #9 – Number Representations.

CORDIC Algorithm COordinate Rotation DIgital Computer

Recursive Architectures for 2DLNS Multiplication RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS - UNIVERSITY OF WINDSOR 11 Recursive Architectures for 2DLNS.

CSE 8351 Computer Arithmetic Fall 2005 Instructors: Peter-Michael Seidel.

ELEC692 VLSI Signal Processing Architecture Lecture 12 Numerical Strength Reduction.

Digital logic COMP214  Lecture 2 Dr. Sarah M.Eljack Chapter 1 1.

NUMERICAL ANALYSIS I. Introduction Numerical analysis is concerned with the process by which mathematical problems are solved by the operations.

Part III The Arithmetic/Logic Unit

UNIVERSITY OF MASSACHUSETTS Dept

Morgan Kaufmann Publishers

CSE 575 Computer Arithmetic Spring 2003 Mary Jane Irwin (www. cse. psu

UNIVERSITY OF MASSACHUSETTS Dept

CSE 575 Computer Arithmetic Spring 2003 Mary Jane Irwin (www. cse. psu

CSCE 350 Computer Architecture

A Case for Table-Based Approximate Computing

Part VI Function Evaluation

UNIVERSITY OF MASSACHUSETTS Dept

UNIVERSITY OF MASSACHUSETTS Dept

UNIVERSITY OF MASSACHUSETTS Dept

CSE 575 Computer Arithmetic Spring 2003 Mary Jane Irwin (www. cse. psu

CISE-301: Numerical Methods Topic 1: Introduction to Numerical Methods and Taylor Series Lectures 1-4: KFUPM CISE301_Topic1.

Presentation transcript:

June 2007 Computer Arithmetic, Function EvaluationSlide 1 VI Function Evaluation Topics in This Part Chapter 21 Square-Rooting Methods Chapter 22 The CORDIC Algorithms Chapter 23 Variation in Function Evaluation Chapter 24 Arithmetic by Table Lookup Learn hardware algorithms for evaluating useful functions Divisionlike square-rooting algorithms Evaluating sin x, tanh x, ln x,... by series expansion Function evaluation via convergence computation Use of tables: the ultimate in simplicity and flexibility

June 2007 Computer Arithmetic, Function EvaluationSlide 2 23 Variations in Function Evaluation Chapter Goals Learning alternate computation methods (convergence and otherwise) for some functions computable through CORDIC Chapter Highlights Reasons for needing alternate methods: Achieve higher performance or precision Allow speed/cost tradeoffs Optimizations, fit to diverse technologies

June 2007 Computer Arithmetic, Function EvaluationSlide 3 Variations in Function Evaluation: Topics Topics in This Chapter Additive / Multiplicative Normalization Computing Logarithms Exponentiation Division and Square-Rooting, Again Use of Approximating Functions Merged Arithmetic

June 2007 Computer Arithmetic, Function EvaluationSlide Additive / Multiplicative Normalization u (i+1) = f(u (i), v (i), w (i) ) v (i+1) = g(u (i), v (i), w (i) ) w (i+1) = h(u (i), v (i), w (i) ) u (i+1) = f(u (i), v (i) ) v (i+1) = g(u (i), v (i) ) Additive normalization: Normalize u via addition of terms to it Constant Desired function Guide the iteration such that one of the values converges to a constant (usually 0 or 1); this is known as normalization The other value then converges to the desired function Multiplicative normalization: Normalize u via multiplication of terms Additive normalization is more desirable, unless the multiplicative terms are of the form 1 ± 2 a (shift-add) or multiplication leads to much faster convergence compared with addition

June 2007 Computer Arithmetic, Function EvaluationSlide 5 Convergence Methods You Already Know CORDIC Example of additive normalization x (i+1) = x (i) –   d i y (i) 2 –i y (i+1) = y (i) + d i x (i) 2 –i z (i+1) = z (i) – d i e (i) Division by repeated multiplications Example of multiplicative normalization d (i+1) = d (i) (2  d (i) )Set d (0) = d; iterate until d (m)  1 z (i+1) = z (i) (2  d (i) ) Set z (0) = z; obtain z/d = q  z (m) Force y or z to 0 by adding terms to it Force d to 1 by multiplying terms with it

June 2007 Computer Arithmetic, Function EvaluationSlide Computing Logarithms x (i+1) = x (i) c (i) = x (i) (1 + d i 2 –i ) y (i+1) = y (i) – ln c (i) = y (i) – ln(1 + d i 2 –i ) Read out from table Force x (m) to 1 y (m) converges to y + ln x d i  {  1, 0, 1} 0 Why does this multiplicative normalization method work? x (m) = x  c (i)  1   c (i)  1/x y (m) = y –  ln c (i) = y – ln (  c (i) ) = y – ln(1/x)  y + ln x If starting with y = 0, y (m) = lnx x (0) = x, y (0) =y

June 2007 Computer Arithmetic, Function EvaluationSlide Computing Logarithms Convergence domain:1/  (1 + 2 –i )  x  1/  (1 – 2 –i ) or 0.21  x  3.45 Number of iterations:k, for k bits of precision; for large i, ln(1  2 –i )   2 –i Use directly for x  [1, 2). For x = 2 q s, we have: ln x = q ln 2 + ln s = q + ln s Radix-4 version can be devised For log 2 x = log 2 (2 q s) = q+log 2 s = q +log 2 e x ln s =q x ln s

June 2007 Computer Arithmetic, Function EvaluationSlide 8 Computing Binary Logarithms via Squaring 1 For x  [1, 2), log 2 x is a fractional number y = (. y –1 y –2 y –3... y –l ) two x = 2 y = 2 x 2 = 2 2y = 2  y –1 = 1 iff x 2  2 (. y –1 y –2 y –3... y –l ) two (y –1. y –2 y –3... y –l ) two Once y –1 has been determined, if y –1 = 0, we are back at the original situation; otherwise, divide both sides of the equation above by 2 to get: x 2 /2 = 2 /2 = 2 (1. y –2 y –3... y –l ) two (. y –2 y –3... y –l ) two

June 2007 Computer Arithmetic, Function EvaluationSlide 9 Computing Binary Logarithms via Squaring 2 The complete procedure for computing log 2 x for 1 ≦ x ≦ 2 is : For i from 1 to l do x ← x 2 if x ≧ 2 then y -i = 1 ; x ← x/2 else y -i = 0 endif endfor

June 2007 Computer Arithmetic, Function EvaluationSlide 10 Computing Binary Logarithms via Squaring 3 Fig Hardware elements needed for computing log 2 x. Generalization to base b: x = b, y –1 = 1 iff x 2  b (. y –1 y –2 y –3... y –l ) two

June 2007 Computer Arithmetic, Function EvaluationSlide Exponentiation 1 x (i+1) = x (i) – ln c (i) = x (i) – ln(1 + d i 2 –i ) y (i+1) = y (i) c (i) = y (i) (1 + d i 2 –i ) Read out from table Force x (m) to 0 y (m) converges to y e x d i  {  1, 0, 1} 1 Why does this additive normalization method work? x (m) = x –  ln c (i)  0   ln c (i)  x y (m) = y  c (i) = y exp(ln  c (i) ) = y exp(  ln c (i) )  y e x If starting from y = 1 y (m) = e x Computing e x

June 2007 Computer Arithmetic, Function EvaluationSlide Exponentiation 2 Convergence domain:  ln (1 – 2 –i )  x   ln (1 + 2 –i ) or –1.24  x  1.56 Number of iterations:k, for k bits of precision; for large i, ln(1  2 –i )   2 –i Can eliminate half the iterations because ln(1 +  ) =  –  2 /2 +  3 /3 –...   for  2 < ulp and we may write y (k) = y (k/2) (1 + x (k/2) ) Radix-4 version can be devised

June 2007 Computer Arithmetic, Function EvaluationSlide 13 General Exponentiation, or Computing x y 1 x y = (e ln x ) y = e y ln x So, compute natural log, multiply, exponentiate When y is an integer, we can exponentiate by repeated multiplication (need to consider only positive y; for negative y, compute reciprocal) In particular, when y is a constant, the methods used are reminiscent of multiplication by constants (Section 9.5)

June 2007 Computer Arithmetic, Function EvaluationSlide 14 General Exponentiation, or Computing x y 2 Example: x 25 = ((((x) 2 x) 2 ) 2 ) 2 x [4 squarings and 2 multiplications] Noting that 25 = ( ) two, leads to a general procedure Computing x y, when y is an unsigned integer Initialize the partial result to 1 Scan the binary representation of y, starting at its MSB, and repeat If the current bit is 1, multiply the partial result by x If the current bit is 0, do not change the partial result Square the partial result before the next step (if any) 25 = ( ) two x 25 = ((((1  x) 2  x) 2 ) 2 ) 2  x

June 2007 Computer Arithmetic, Function EvaluationSlide 15 Faster Exponentiation via Recoding 1 Example: x 31 = ((((x) 2 x) 2 x) 2 x) 2 x [4 squarings and 4 multiplications] Note that 31 = ( ) two = (  1) two x 31 = (((((1  x) 2 ) 2 ) 2 ) 2 ) 2 / x [5 squarings and 1 division] Computing x y, when y is an integer encoded in BSD format Initialize the partial result to 1 Scan the binary representation of y, starting at its MSB, and repeat If the current digit is 1, multiply the partial result by x If the current digit is 0, do not change the partial result If the current digit is  1, divide the partial result by x Square the partial result before the next step (if any)  1

June 2007 Computer Arithmetic, Function EvaluationSlide 16 Faster Exponentiation via Recoding 2 Radix-4 example: 31 = ( ) two = (  1) two = (2 0  1) four x 31 = ((x 2 ) 4 ) 4 / x (2 0  1) four x 31 = ((1  x 2 ) 4 ) 4 / x

June 2007 Computer Arithmetic, Function EvaluationSlide Division and Square-Rooting, Again s (i+1) = s (i) –  (i) d q (i+1) = q (i) +  (i) Computing q = z / d In digit-recurrence division,  (i) is the next quotient digit and the addition for q turns into concatenation; more generally,  (i) can be any estimate for the difference between the partial quotient q (i) and the final quotient q Because s (i) becomes successively smaller as it converges to 0, scaled versions of the recurrences above are usually preferred. In the following, s (i) stands for s (i) r i and q (i) for q (i) r i : s (i+1) = rs (i) –  (i) d Set s (0) = z and keep s (i) bounded q (i+1) = rq (i) +  (i) Set q (0) = 0 and find q * = q (m) r –m In the scaled version,  (i) is an estimate for r (r i–m q – q (i) ) = r (r i q * - q (i) ), where q * = r –m q represents the true quotient

June 2007 Computer Arithmetic, Function EvaluationSlide 18 Square-Rooting via Multiplicative Normalization Idea: If z is multiplied by a sequence of values (c (i) ) 2, chosen so that the product z  (c (i) ) 2 converges to 1, then z  c (i) converges to  z x (i+1) = x (i) (1 + d i 2 –i ) 2 = x (i) (1 + 2d i 2 –i + d i 2 2 –2i ) x (0) = z, x (m)  1 y (i+1) = y (i) (1 + d i 2 –i ) y (0) = z, y (m)   z What remains is to devise a scheme for choosing d i values in {–1, 0, 1} d i = 1 for x (i) 1 +  = 1 +  2 –i To avoid the need for comparison with a different constant in each step, a scaled version of the first recurrence is used in which u (i) = 2 i (x (i) – 1): u (i+1) = 2(u (i) + 2d i ) + 2 –i+1 (2d i u (i) + d i 2 ) + 2 –2i+1 d i 2 u (i) u (0) = z – 1, u (m)  0 y (i+1) = y (i) (1 + d i 2 –i ) y (0) = z, y (m)   z Radix-4 version can be devised: Digit set [–2, 2] or {–1, –½, 0, ½, 1}

June 2007 Computer Arithmetic, Function EvaluationSlide 19 Square-Rooting via Additive Normalization Idea: If a sequence of values c (i) can be obtained such that z – (  c (i) ) 2 converges to 0, then  c (i) converges to  z x (i+1) = z – (y (i+1) ) 2 = z – (y (i) + c (i) ) 2 = x (i) + 2d i y (i) 2 –i – d i 2 2 –2i x (0) = z, x (m)  0 y (i+1) = y (i) + c (i) = y (i) – d i 2 –i y (0) = 0, y (m)   z What remains is to devise a scheme for choosing d i values in {–1, 0, 1} d i = 1 for x (i) +  = +  2 –i To avoid the need for comparison with a different constant in each step, a scaled version of the first recurrence may be used in which u (i) = 2 i x (i) : u (i+1) = 2(u (i) + 2d i y (i) – d i 2 2 –i ) u (0) = z, u (i) bounded y (i+1) = y (i) – d i 2 –i y (0) = 0, y (m)   z Radix-4 version can be devised: Digit set [–2, 2] or {–1, –½, 0, ½, 1}

June 2007 Computer Arithmetic, Function EvaluationSlide Use of Approximating Functions Convert the problem of evaluating the function f to that of function g approximating f, perhaps with a few pre- and postprocessing operations Approximating polynomials need only additions and multiplications Polynomial approximations can be derived from various schemes The Taylor-series expansion of f(x) about x = a is f(x) =  j=0 to  f (j) (a) (x – a) j / j! The error due to omitting terms of degree > m is: f (m+1) (a +  (x – a)) (x – a) m+1 / (m + 1)!0 <  < 1 Setting a = 0 yields the Maclaurin-series expansion f(x) =  j=0 to  f (j) (0) x j / j! and its corresponding error bound: f (m+1) (  x) x m+1 / (m + 1)!0 <  < 1 Efficiency in computation can be gained via Horner’s method and incremental evaluation

June 2007 Computer Arithmetic, Function EvaluationSlide 21 Some Polynomial Approximations (Table 23.1) ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––– FuncPolynomial approximationConditions ––––––––––––––––––––––––––––––––––––––––––––––––––––––––––– 1/x1 + y + y 2 + y y i < x < 2, y = 1 – x e x 1 + x /1! + x 2 /2! + x 3 /3! x i /i ! +... ln x–y – y 2 /2 – y 3 /3 – y 4 /4 –... – y i /i –... 0 < x  2, y = 1 – x ln x2 [z + z 3 /3 + z 5 / z 2i+1 /(2i + 1) +... ] x > 0, z = x–1 x+1 sin xx – x 3 /3! + x 5 /5! – x 7 /7! (–1) i x 2i+1 /(2i + 1)! +... cos x1 – x 2 /2! + x 4 /4! – x 6 /6! (–1) i x 2i /(2i )! +... tan –1 xx – x 3 /3 + x 5 /5 – x 7 / (–1) i x 2i+1 /(2i + 1) +... –1 < x < 1 sinh xx + x 3 /3! + x 5 /5! + x 7 /7! x 2i+1 /(2i + 1)! +... cosh x1 + x 2 /2! + x 4 /4! + x 6 /6! x 2i /(2i )! +... tanh –1 x x + x 3 /3 + x 5 /5 + x 7 / x 2i+1 /(2i + 1) +... –1 < x < 1 –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

June 2007 Computer Arithmetic, Function EvaluationSlide 22 Function Evaluation via Divide-and-Conquer Let x in [0, 4) be the (l + 2)-bit significand of a floating-point number or its shifted version. Divide x into two chunks x H and x L : x = x H + 2 –t x L 0  x H < 4 t + 2 bits 0  x L < 1 l – t bits t bits x H in [0, 4)x L in [0, 1) The Taylor-series expansion of f(x) about x = x H is f(x) =  j=0 to  f (j) (x H ) (2 –t x L ) j / j! A linear approximation is obtained by taking only the first two terms f(x)  f (x H ) + 2 –t x L f (x H ) If t is not too large, f and/or f (and other derivatives of f, if needed) can be evaluated via table lookup

June 2007 Computer Arithmetic, Function EvaluationSlide 23 Approximation by the Ratio of Two Polynomials Example, yielding good results for many elementary functions f(x)  Using Horner’s method, such a “rational approximation” needs 10 multiplications, 10 additions, and 1 division

June 2007 Computer Arithmetic, Function EvaluationSlide Merged Arithmetic Our methods thus far rely on word-level building-block operations such as addition, multiplication, shifting,... Sometimes, we can compute a function of interest directly without breaking it down into conventional operations Example: merged arithmetic for inner product computation z = z (0) + x (1) y (1) + x (2) y (2) + x (3) y (3)                                x (1) y (1) x (3) y (3) x (2) y (2) z (0) Fig Merged-arithmetic computation of an inner product followed by accumulation.

June 2007 Computer Arithmetic, Function EvaluationSlide 25 Example of Merged Arithmetic Implementation Example: Inner product computation z = z (0) + x (1) y (1) + x (2) y (2) + x (3) y (3)                                x (1) y (1) x (3) y (3) x (2) y (2) z (0) Fig Tabular representation of the dot matrix for inner-product computation and its reduction FAs FAs + 1 HA FAs FAs + 1 HA FAs + 2 HAs bit CPA Fig. 23.2

June 2007 Computer Arithmetic, Function EvaluationSlide 26 Another Merged Arithmetic Example Approximation of reciprocal (1/x) and reciprocal square root (1/  x) functions with bits of precision, so that a long floating-point result can be obtained with just one iteration at the end [Pine02] f(x) = c + bv + av 2 1 square Comparable to a multiplier 2 mult’s 2 adds