CSE 575 Computer Arithmetic Spring 2003 Mary Jane Irwin (www. cse. psu


1 CSE 575 Computer Arithmetic Spring 2003 Mary Jane Irwin (www. cse. psu

2 Student Presentations - April 29
1   9:45  Fast adders        Matt & Ki-Yong
2  10:10  Array multipliers  Jahan & Jaehyun
3  10:35  Divider            Wei-Lun & Ing-Chao
   11:00  BREAK
4  11:15  Log PE             Swapna, Greg, Eric, & Theo
5  12:05  Nanosensor PE      Aman

3 Cordic Algorithms Review
CORDIC = COordinate Rotation DIgital Computer
A family of convergence algorithms for trig (and other) functions
Generalized CORDIC defined as
  $X_{i+1} = X_i - \mu d_i Y_i 2^{-i}$
  $Y_{i+1} = Y_i + d_i X_i 2^{-i}$
  $Z_{i+1} = Z_i - d_i E_i$
Simple hardware: shifters, adders, lookup table
When the number of iterations is fixed, K and K' are constants
Thus, we always need k iterations for k digits of precision
Can be extended to higher radices (e.g., for base 4, $d_i \in \{-2,-1,1,2\}$ and the number of iterations will be cut in half with essentially the same hardware)

4 Cordic Hardware Review
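$X_{i+1} = X_i - \mu d_i Y_i 2^{-i}$
$Y_{i+1} = Y_i + d_i X_i 2^{-i}$
$Z_{i+1} = Z_i - d_i E_i$
[Figure: CORDIC datapath with X, Y, and Z registers, two shifters driven by the iteration counter i, a lookup table supplying $E_i$ ($\tan^{-1} 2^{-i}$, $2^{-i}$, or $\tanh^{-1} 2^{-i}$), and control logic producing $d_i$.]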

5 Cordic Computation Unit
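Rotation mode: $d_i = \mathrm{sign}(Z_i)$; $Z_i \to 0$
Vector mode: $d_i = -\mathrm{sign}(X_i Y_i)$; $Y_i \to 0$

Circular: $\mu = 1$, $E_i = \tan^{-1} 2^{-i}$
  Rotation: $X \to K(X \cos Z - Y \sin Z)$, $Y \to K(Y \cos Z + X \sin Z)$, $Z \to 0$
    For cos & sin, set X = 1/K, Y = 0; $\tan Z = \sin Z / \cos Z$
  Vector: $X \to K\sqrt{X^2 + Y^2}$, $Y \to 0$, $Z \to Z + \tan^{-1}(Y/X)$
    For $\tan^{-1}$, set X = 1, Z = 0
    $\cos^{-1} W = \tan^{-1}[\sqrt{1 - W^2}/W]$; $\sin^{-1} W = \tan^{-1}[W/\sqrt{1 - W^2}]$

Linear: $\mu = 0$, $E_i = 2^{-i}$
  Rotation: $X \to X$, $Y \to Y + X \cdot Z$, $Z \to 0$
    For multiply, set Y = 0
  Vector: $X \to X$, $Y \to 0$, $Z \to Z + Y/X$
    For divide, set Z = 0

Hyperbolic: $\mu = -1$, $E_i = \tanh^{-1} 2^{-i}$
  Rotation: $X \to K'(X \cosh Z + Y \sinh Z)$, $Y \to K'(Y \cosh Z + X \sinh Z)$, $Z \to 0$
    For cosh & sinh, set X = 1/K', Y = 0
    $\tanh Z = \sinh Z / \cosh Z$; $e^Z = \sinh Z + \cosh Z$; $W^t = e^{t \ln W}$
  Vector: $X \to K'\sqrt{X^2 - Y^2}$, $Y \to 0$, $Z \to Z + \tanh^{-1}(Y/X)$
    For $\tanh^{-1}$, set X = 1, Z = 0
    $\cosh^{-1} W = \ln[W + \sqrt{W^2 - 1}]$; $\sinh^{-1} W = \ln[W + \sqrt{1 + W^2}]$
    $\ln W = 2 \tanh^{-1}|(W-1)/(W+1)|$; $\sqrt{W} = \sqrt{(W + 1/4)^2 - (W - 1/4)^2}$
  In executing the iterations for $\mu = -1$, steps 4, 13, 40, 121, ..., j, 3j+1, ... must be repeated.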
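The rotation-mode rows of the table can be tried out with a small floating-point sketch. It is only an illustration under stated assumptions: the function names (`cordic_rotate`, `circular_gain`, `cos_sin`, `multiply`) are made up for this example, multiplications by $2^{-i}$ stand in for hardware shifts, iterations start at i = 0, and only the circular ($\mu = 1$) and linear ($\mu = 0$) cases are covered (the hyperbolic case with its repeated steps is left out).

```python
import math

def cordic_rotate(x, y, z, mu=1, iterations=32):
    """Rotation mode: pick d_i = sign(z_i) so that z is driven toward 0.

    mu = 1 selects circular rotations (E_i = atan(2^-i)),
    mu = 0 selects linear rotations (E_i = 2^-i).
    """
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i                      # stands in for a hardware shift
        e = math.atan(shift) if mu == 1 else shift
        x, y, z = x - mu * d * y * shift, y + d * x * shift, z - d * e
    return x, y, z

def circular_gain(iterations=32):
    """Constant K for a fixed number of circular iterations."""
    k = 1.0
    for i in range(iterations):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

def cos_sin(angle, iterations=32):
    """Set X = 1/K, Y = 0, Z = angle; X and Y converge to cos and sin."""
    x, y, _ = cordic_rotate(1.0 / circular_gain(iterations), 0.0, angle, 1, iterations)
    return x, y

def multiply(a, b, iterations=32):
    """Linear mode with Y = 0: Y converges to a * b (needs |b| < 2 here)."""
    _, y, _ = cordic_rotate(a, 0.0, b, 0, iterations)
    return y
```

Because the iteration count is fixed, `circular_gain` returns a constant that could be precomputed, which is the point made on the review slide about K.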

6 Continued Sums/Products
Characterized by
  $Q = \dfrac{P}{M} = \dfrac{P \prod n_i}{M \prod n_i} \to \dfrac{Q}{1}$
  Denominator $M \prod n_i$: Approximation Term (Normalizing Term)
  Numerator $P \prod n_i$: Generation Term (Result Term)
Goals
  Family of algorithms mapping onto the same hardware
  Simple iterations (add/subtract, shift)
  Easy selection rules (resulting in simple operations)
  Linear (or quadratic) convergence
Goals are similar to the ones we saw with CORDIC

7 Approximation Terms Approximation Term – Normalizing Term
Normalizing to 0 or 1: either
  form the multiplicative inverse of one of the independent variables by a product of approximating terms, e.g., $M \prod n_i$ (can only be computed approximately)
or
  form the additive inverse of one of the independent variables by a sum of approximating terms (can be computed exactly)

8 Generation Terms Generation Term – Result Term
generate the result (e.g., $P \prod n_i$) by either a product or sum of generating terms which are functionally dependent on the approximating terms

9 CS Multiplication Mult for unsigned, normalized operands
$Z = X \cdot Y = (X - \sum m_i + \sum m_i) \cdot Y = (X - \sum m_i) \cdot Y + \sum m_i \cdot Y$
  approx. term: $(X - \sum m_i) \cdot Y \to 0$
  gener. term: $\sum m_i \cdot Y \to Z$
where the $m_i$'s are chosen appropriately: $m_i = s_i 2^{-i}$ where $s_i \in \{0,1\}$
$Z = (X - \sum s_i 2^{-i}) \cdot Y + \sum s_i 2^{-i} \cdot Y$
Linear convergence (exactly): k iterations for k digits of precision
Simple operations: add/subtract, shift
For lecture: Consider the $s_i$ selection rules. They can be viewed as just scanning X from left (msd) to right (lsd) and subtracting off the 1 digits (forming the additive inverse). With $X = .x_1 x_2 x_3 \dots x_n$ and $M = .s_1 s_2 s_3 \dots s_n$, if $x_i = 1$ then $s_i = 1$, so the sum of $s_i 2^{-i}$ converges to X and we obviously get the product from the generating term. SAME AS REGULAR ADD AND SHIFT MULTIPLY (except working from the ms end of the multiplier). What if $s_i$ is chosen from the digit set -1, 0, 1?

10 CS Multiplication Approx. Term
X - mi  Xi = Xi-1 – si2-i e.g., X1 = X0 – s12-1 X2 = (X0 – s12-1 ) – s22-2 And to guarantee convergence X0 = X ½ for ½  X < ¾ X1 = X0 – m1 where m1 = 1 for ¾  X < 1 so if X  [½ , 1)  X1  [- ¼ , ¼) Operations in Xi iteration are subtract/add and barrel shift Note that X1 is forced to be symmetric about zero – REQUIRED since we are normalizing to zero

11 CS Multiplication Gener. Term
Gener. Term → Product Z
  $\sum m_i \cdot Y \to \sum s_i 2^{-i} \cdot \sum y_i 2^{-i}$
  $Z_0 = 0$
  $Z_1 = Z_0 + Y m_1$ ($m_1 = 1/2$ or 1)
  $Z_i = Z_{i-1} + Y s_i 2^{-i}$ (barrel shift and add)
Selection Rules
  Need $|X_1| \le 2/3$ to ensure convergence
  $s_i = 1$ if $\frac{2}{3} 2^{-(i-1)} \ge X_{i-1} \ge \frac{1}{3} 2^{-(i-1)}$
  $s_i = 0$ otherwise
  $s_i = -1$ if $-\frac{2}{3} 2^{-(i-1)} \le X_{i-1} < -\frac{1}{3} 2^{-(i-1)}$
For lecture: Get the $X_1$ bound via the previous slide's initialization step (when we force $X_1$ to be symmetric about zero). In general, just pick an "easy to implement" initialization step to get this condition. The choice of the selection values 2/3 and 1/3 will become clear on the next two slides!
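The approximation and generating recurrences with the ±1/3 selection rules can be exercised with exact rational arithmetic. This is a minimal sketch, not the lecture's implementation: the name `cs_multiply` is made up, the special initialization step is skipped (the operand is assumed to already satisfy |X| ≤ 2/3), and `Fraction` replaces the fixed-point registers.

```python
from fractions import Fraction

def cs_multiply(x, y, n):
    """Continued-sums multiply with digits s_i in {-1, 0, 1}.

    Returns (digits, product, residual) after n iterations, where
    residual is the approximation term X_n that is being forced to 0.
    """
    x, y = Fraction(x), Fraction(y)
    if abs(x) > Fraction(2, 3):
        raise ValueError("sketch assumes |X| <= 2/3 (no initialization step)")
    residual, product, digits = x, Fraction(0), []
    for i in range(1, n + 1):
        bound = Fraction(1, 3 * 2 ** (i - 1))   # 1/3 * 2^-(i-1)
        if residual >= bound:
            s = 1
        elif residual < -bound:
            s = -1
        else:
            s = 0
        residual -= Fraction(s, 2 ** i)          # X_i = X_{i-1} - s_i 2^-i
        product += y * Fraction(s, 2 ** i)       # Z_i = Z_{i-1} + Y s_i 2^-i
        digits.append(s)
    return digits, product, residual
```

For example, `cs_multiply(Fraction(7, 16), Fraction(13, 16), 4)` selects the digits 1, 0, 0, -1 and returns the exact product 91/256 with a zero residual.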

12 Selection Rules si picked to force Xi to 0
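And (as in SRT division) can guarantee convergence since if $|X_1| \le \frac{2}{3} \cdot 2^0$ then $|X_i| \le \frac{2}{3} \cdot 2^{-(i-1)}$
[Figure: number line for $X_{i-1}$ with limits at $\pm\frac{2}{3} 2^{-(i-1)}$ and breakpoints at $\pm\frac{1}{3} 2^{-(i-1)}$. Above the upper breakpoint $s_i = 1$, so subtract; between the breakpoints $s_i = 0$; below the lower breakpoint $s_i = -1$, so add.]
For lecture: choosing $s_i = \pm 1$ kicks the next selection into the zero range (can actually be either side of the zero breakpoint)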

13 Convergence Proof PROOF (by induction)
For i = 1: by initialization
For i-1:
  1) if $X_{i-1} \ge \frac{1}{3} 2^{-(i-1)}$ then choose $s_i = 1$, so
     $X_i = X_{i-1} - 2^{-i} \ge \frac{1}{3} 2^{-(i-1)} - 2^{-i} = \frac{2}{3} 2^{-i} - 2^{-i} = -\frac{1}{3} 2^{-i}$
     so $X_i \ge -\frac{1}{3} 2^{-i}$, so that $s_{i+1} = 0$ and
     $X_{i+1} = X_i \ge -\frac{1}{3} 2^{-i} = -\frac{2}{3} 2^{-(i+1)}$, so that $s_{i+2} = -1$ or 0
  2) if $X_{i-1} < -\frac{1}{3} 2^{-(i-1)}$ then choose $s_i = -1$, so etc.
QED
Note that no two 1's (or -1's) for $s_i$ are ever selected in a row → no two adjacent 1's of the same sign → canonical recoding of multiplier
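The induction argument can also be checked by brute force. The sketch below is an illustration only, with made-up names (`select`, `check_bound_and_recoding`): it runs the ±1/3 selection rule over every 8-bit fraction X with |X| ≤ 2/3 and confirms that the residual stays within $\frac{2}{3} 2^{-i}$ after step i, that no two nonzero digits are adjacent, and that the residual reaches zero. Inputs are restricted to binary fractions, so the exact breakpoint values (multiples of 1/3) never occur.

```python
from fractions import Fraction

def select(x_prev, i):
    """Selection rule: compare X_{i-1} against +-1/3 * 2^-(i-1)."""
    bound = Fraction(1, 3 * 2 ** (i - 1))
    if x_prev >= bound:
        return 1
    if x_prev < -bound:
        return -1
    return 0

def check_bound_and_recoding(bits=8):
    """Return True if every bits-wide X with |X| <= 2/3 behaves as the proof claims."""
    scale = 2 ** bits
    for k in range(-scale, scale + 1):
        x = Fraction(k, scale)
        if abs(x) > Fraction(2, 3):
            continue
        prev_digit = 0
        for i in range(1, bits + 1):
            s = select(x, i)
            x -= Fraction(s, 2 ** i)
            if abs(x) > Fraction(2, 3 * 2 ** i):   # |X_i| <= 2/3 * 2^-i
                return False
            if s != 0 and prev_digit != 0:         # adjacent nonzero digits
                return False
            prev_digit = s
        if x != 0:                                 # residual fully removed
            return False
    return True
```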

14 CS Multiply Example
Z = X * Y    X = 0.0111 (< 2/3)    Y = 0.1101

i   s_i   X_i = X_{i-1} - s_i 2^-i                        Z_i = Z_{i-1} + Y s_i 2^-i
0         X0 = 0.0111 >= 1/3 * 2^0                        Z0 = 0
1    1    X1 = 0.0111 - 0.1 = -0.0001 > -1/3 * 2^-1       Z1 = 0 + 0.1101 * 1 * 2^-1 = 0.01101
2    0    X2 = -0.0001 > -1/3 * 2^-2                      Z2 = 0.01101 + 0.1101 * 0 * 2^-2 = 0.01101
3    0    X3 = -0.0001 < -1/3 * 2^-3                      Z3 = 0.01101 + 0.1101 * 0 * 2^-3 = 0.01101
4   -1    X4 = -0.0001 + 0.0001 = 0 (done)                Z4 = 0.01101 + 0.1101 * (-1) * 2^-4 = 0.01011011

For lecture: Multiplier NOT normalized, so can skip the special initialization step. Notice that the multiplier, X, has been canonically recoded: 0111 -> 100-1. Have a msd multiplication algorithm with X recoded on-the-fly to canonical form. How can this be??? (since we know canonical recoding is left directed) Also gives insight into why the Selection Rules use ±1/3: X0 = 0.0111 > 1/3 * 2^0 says choose s1 = 1 (note that X0 would recode to 100-1).

15 "Easy" Selection Rules
To choose the $s_i$'s, have to compare $X_{i-1}$ to the stored constants $\frac{1}{3} 2^{-(i-1)}$, $-\frac{1}{3} 2^{-(i-1)}$
  Constants vary depending on iteration i
  Full precision comparison: too expensive
Can reduce the number of selection constants by scaling $X_i$ by $2^i$, making the selection rules independent of the iteration
Can eliminate the full precision comparison by picking a selection value close to ±1/3 that needs only a few bits to represent, like ±3/8
Selection constant of 3/8 gives a 3 bit comparison and near CANONICAL recoding !!

16 Scaled CS Multiplication
Scaled Approx. Term → 0
  $U_0 = X_0 = X \in [1/2, 1)$
  $U_1 = 2X_1 = (X_0 - m_1) \cdot 2 \in [-1/2, 1/2)$
  $U_i = X_i 2^i = (X_{i-1} - s_i 2^{-i}) 2^i = 2^i X_{i-1} - s_i = 2 \cdot 2^{i-1} X_{i-1} - s_i$
  $U_i = 2U_{i-1} - s_i$
Generating Term → Product, Z
  $Z_0 = 0$ and $Z_1 = Z_0 + Y m_1$
  $Z = \sum m_i \cdot Y$ where $m_i = s_i 2^{-i}$, so $Z_i = Z_{i-1} + Y s_i 2^{-i}$
Selection Rules
  $s_i = 1$ if $U_{i-1} \ge 1/3 \approx 3/8$
  $s_i = 0$ otherwise
  $s_i = -1$ if $U_{i-1} < -1/3 \approx -3/8$
Now have only two comparison constants to store: +1/3 and -1/3
  Two's complement comparison values: 0.100, 0.011 = +3/8, 0.010, 0.001, 0.000, 1.111, 1.110, 1.101 = -3/8
Linear convergence; simple operations; easy selection rules
For lecture: Continued sums approach, so $m_i = s_i 2^{-i}$. Forcing $X_i$ to zero, so scale by $U_i = 2^i X_i$.
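A sketch of the scaled recurrence shows why scaling helps: the comparison constants are the same ±3/8 on every iteration. As before this is an assumption-laden illustration: `select` and `scaled_cs_multiply` are made-up names, the initialization step is skipped (the operand is assumed to satisfy |X| ≤ 3/4), and exact `Fraction` arithmetic stands in for the registers.

```python
from fractions import Fraction

THRESHOLD = Fraction(3, 8)

def select(u):
    """Iteration-independent selection: compare U_{i-1} against +-3/8."""
    if u >= THRESHOLD:
        return 1
    if u < -THRESHOLD:
        return -1
    return 0

def scaled_cs_multiply(x, y, n):
    """Scaled continued-sums multiply; returns (digits, Z_n, U_n)."""
    u, y, z, digits = Fraction(x), Fraction(y), Fraction(0), []
    if abs(u) > Fraction(3, 4):
        raise ValueError("sketch assumes |X| <= 3/4 (no initialization step)")
    for i in range(1, n + 1):
        s = select(u)
        u = 2 * u - s                      # U_i = 2 U_{i-1} - s_i
        z += y * Fraction(s, 2 ** i)       # Z_i = Z_{i-1} + Y s_i 2^-i
        digits.append(s)
    return digits, z, u
```

At every step $Z_i + Y U_i 2^{-i} = X \cdot Y$ holds exactly, so the error after n iterations is $Y U_n 2^{-n}$ with $|U_n| \le 3/4$.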

17 Scaled CS Multiply Example
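Z = X * Y    X = 0.0111 (< 2/3)    Y = 0.1101

i   s_i   U_i = 2U_{i-1} - s_i                         Z_i = Z_{i-1} + Y s_i 2^-i
0         U0 = 0.0111 >= 3/8                           Z0 = 0
1    1    U1 = 2(0.0111) - 1 = -0.001 > -3/8           Z1 = 0 + 0.1101 * 1 * 2^-1 = 0.01101
2    0    U2 = 2(-0.001) - 0 = -0.01 > -3/8            Z2 = 0.01101 + 0.1101 * 0 * 2^-2 = 0.01101
3    0    U3 = 2(-0.01) - 0 = -0.1 < -3/8              Z3 = 0.01101 + 0.1101 * 0 * 2^-3 = 0.01101
4   -1    U4 = 2(-0.1) + 1 = 0 (done)                  Z4 = 0.01101 + 0.1101 * (-1) * 2^-4 = 0.01011011

For lecture: Selection rules on the ms bits s.a1 a2 a3 a4 of U_{i-1}
  0.1xx x    > 3/8     ->  1
  0.011 1    > 3/8     ->  1
  0.011 0    = 3/8     ->  1
  0.010 x    < 3/8     ->  0
  0.00x x    < 3/8     ->  0
  1.11x x    > -3/8    ->  0
  1.101 1    > -3/8    ->  0
  1.101 0    = -3/8    ->  0
  1.100 x    < -3/8    -> -1
  1.0xx x    < -3/8    -> -1
  s_i = 1 when  !s a1 + !s !a1 a2 a3
  s_i = -1 when  s a1 !a2 !a3 + s !a1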

18 CS Multiply Hardware
$U_i = 2U_{i-1} - s_i$
$Z_i = Z_{i-1} + Y s_i 2^{-i}$
[Figure: CS multiply datapath. U bank: n-b register holding $U_{i-1}$, 1-b shifter, and mini-adder producing $U_i$; a ROM compares the ms 4 bits of $U_{i-1}$ against ±3/8 to produce $s_i$. Z bank: n-b register holding $Z_{i-1}$, barrel shifter (shift amount i), complement unit controlled by $s_i$, and n-b adder producing $Z_i$.]
For lecture: What is involved in a mini-adder design? What you are doing is adding 1 to a fraction, so, depending on the representation, it can be as simple as complementing the high order (sign) bit. The complement box is selective and controlled by $s_i$ (xor word gate). What if the results are generated in carry-save form (to speed up addition)?

19 CP Division Division for unsigned, normalized operands
$Z = Y/X = (Y \prod d_i) / (X \prod d_i)$
  gener. term: $Y \prod d_i \to Z$
  approx. term: $X \prod d_i \to 1$
where the $d_i$'s are chosen appropriately: $d_i = (1 + s_i 2^{-i})$ where $s_i \in \{-1,0,1\}$
$Z = (Y \prod (1 + s_i 2^{-i})) / (X \prod (1 + s_i 2^{-i}))$
Linear convergence (approximately): k iterations for k digits of precision
Simple operations: add/subtract, shift
For lecture: Consider the $s_i$ selection rules (forming the multiplicative inverse). For $s_i = 1$, $X_i = X_{i-1} + 2^{-i} X_{i-1}$, i.e., $X_{i-1} = .x_1 x_2 x_3 \dots x_n$ is added to a copy of itself shifted right by i places. So a simple rule would be: if $X_{i-1} < 1$ pick $s_i = 1$, or if $X_{i-1} \ge 1$ pick $s_i = 0$. We will see that this could give us problems with convergence (to 1): what to do if $X_{i-1} > 1$? We would really like to pick $s_i = -1$. What if $s_i$ is chosen from the digit set -1, 0, 1? Easier to get convergence.

20 CP Division Approximating Term
Approx. Term → 1
  $X \prod (1 + s_i 2^{-i})$: $X_i = X_{i-1}(1 + s_i 2^{-i})$
  $X_0 = X$
  $X_1 = X_0 d_1$ where $d_1 = 2$ for $1/2 \le X < 3/4$ and $d_1 = 1$ for $3/4 \le X < 1$
  so if $X \in [1/2, 1)$ then $X_1 \in [3/4, 6/4)$
Remember we are trying to force $X_i \to 1$, so need to initialize it that way
Operations in the $X_i$ iteration are subtract/add and barrel shift
Note that $X_1$ is forced to be symmetric about one: REQUIRED since we are normalizing to one
For lecture: could have gotten it "more symmetric" around one by using $d_1 = 2$ for $X \in [1/2, 2/3)$ and $d_1 = 1$ for $X \in [2/3, 1)$, resulting in $X_1 \in [2/3, 4/3)$, but the selection rule for $d_1$ would require a full precision comparison, so we compromise on the simpler rule given in the slide

21 CP Division Generating Term
Gener. Term → Quotient Z
  $Y \prod (1 + s_i 2^{-i})$: $Z_i = Z_{i-1}(1 + s_i 2^{-i})$
  $Z_0 = Y$
  $Z_1 = Z_0 d_1$
Selection Rules
  $s_i = -1$ if $X_{i-1} \ge 1 + \frac{1}{3} 2^{-(i-1)}$
  $s_i = 0$ otherwise
  $s_i = 1$ if $X_{i-1} < 1 - \frac{1}{3} 2^{-(i-1)}$
  → again unwieldy selection comparison constants
The choice of the selection values +1/3 and -1/3 is once again to get canonical recoding on $s_i$

22 Scaled CP Division
Scaled Approx. Term → 0
  Scale factor when normalizing to 1 is $U_i = 2^i(X_i - 1)$; forcing $U_i$ to 0 will force $X_i$ to 1
  $U_0 = 2^0(X_0 - 1) = X_0 - 1 \in [-1/2, 0)$
  $U_1 = 2^1(X_1 - 1) = 2(X_0 d_1 - 1) \in [-1/2, 1)$
  $U_i = 2^i(X_{i-1}(1 + s_i 2^{-i}) - 1)$
  $\quad = 2 \cdot 2^{i-1}(X_{i-1} - 1) + X_{i-1} s_i = 2U_{i-1} + X_{i-1} s_i$
  $\quad = 2U_{i-1} + 2 \cdot 2^{i-1}(X_{i-1} - 1) s_i 2^{-i} + s_i$
  $\quad = 2U_{i-1} + 2U_{i-1} s_i 2^{-i} + s_i$
  $U_i = 2U_{i-1}(1 + s_i 2^{-i}) + s_i$
Selection Rules
  $s_i = -1$ if $U_{i-1} \ge 1/3 \approx 3/8$
  $s_i = 0$ otherwise
  $s_i = 1$ if $U_{i-1} < -1/3 \approx -3/8$
For lecture: For the $d_1$ shown two slides ago, $X_1 \in [3/4, 6/4)$ (used to kick $X_1$ symmetric about one) gives $U_1 = 2([3/4, 6/4) - 1) = [-1/2, 1)$, not quite symmetric about zero (maybe should have just chosen $d_1 = 1$). Note that when $s_i = 0$, which hopefully will be 2/3rds of the time, only have to do a simple shift.
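The scaled division recurrences can be sketched the same way. This is an illustration under stated assumptions, not the lecture's hardware: `scaled_cp_divide` is a made-up name, `Fraction` keeps the arithmetic exact, and the initialization uses $d_1 = 2$ for $1/2 \le X < 3/4$ and $d_1 = 1$ otherwise, as on the approximating-term slide.

```python
from fractions import Fraction

def scaled_cp_divide(x, y, n):
    """Continued-products divide Y/X for X in [1/2, 1); returns (Z_n, U_n)."""
    x, y = Fraction(x), Fraction(y)
    if not Fraction(1, 2) <= x < 1:
        raise ValueError("divisor must be normalized to [1/2, 1)")
    d1 = 2 if x < Fraction(3, 4) else 1      # initialization step
    u = 2 * (x * d1 - 1)                     # U_1 = 2 (X_0 d_1 - 1)
    z = y * d1                               # Z_1 = Z_0 d_1
    for i in range(2, n + 1):
        if u >= Fraction(3, 8):
            s = -1
        elif u < Fraction(-3, 8):
            s = 1
        else:
            s = 0
        factor = 1 + Fraction(s, 2 ** i)     # d_i = 1 + s_i 2^-i
        u = 2 * u * factor + s               # U_i = 2 U_{i-1} (1 + s_i 2^-i) + s_i
        z = z * factor                       # Z_i = Z_{i-1} (1 + s_i 2^-i)
    return z, u
```

With X = 3/5 and Y = 1/2, six iterations give Z = 1701/2048 (about 0.8306) against the true quotient 5/6; in general $Z_n = (Y/X)(1 + U_n 2^{-n})$, so the error shrinks by about one bit per iteration.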

23 Scaled CP Division Example
Z = Y / X    X = 0.6    Y = 0.5    (d1 = 2 since 1/2 <= X < 3/4)

i   s_i   U_i = 2U_{i-1}(1 + s_i 2^-i) + s_i               Z_i = Z_{i-1}(1 + s_i 2^-i)
0         U0 = 0.6 - 1 = -0.4                              Z0 = 0.5
1         U1 = 2(2*0.6 - 1) = 0.4 >= 3/8                   Z1 = 2*0.5 = 1.0
2   -1    U2 = 2*0.4(1 - 1/4) - 1 = -0.4 < -3/8            Z2 = 1.0(1 - 1/4) = 0.75
3    1    U3 = 2*(-0.4)(1 + 1/8) + 1 = 0.1 > -3/8          Z3 = 0.75(1 + 1/8) = 0.84375
4    0    U4 = 2*0.1 = 0.2 < 3/8                           Z4 = 0.84375
5    0    U5 = 2*0.2 = 0.4 >= 3/8                          Z5 = 0.84375
6   -1    U6 = 2*0.4(1 - 1/64) - 1 = -0.2125               Z6 = 0.84375(1 - 1/64) ≈ 0.8306

24 CP Divide Hardware
$U_i = 2U_{i-1}(1 + s_i 2^{-i}) + s_i$
$Z_i = Z_{i-1}(1 + s_i 2^{-i})$
[Figure: CP divide datapath. Two banks (U and Z), each built from an n-b register, barrel shifter (shift amount i), complement unit, n-b adder, mini-adder, and 1-b shifter. A ROM compares the ms 4 bits of $U_{i-1}$ against ±3/8 to produce $s_i$.]
For lecture: look closer at the design of the mini-adder. What you are doing is adding 1 to a fraction, so, depending on the representation, it can be as simple as complementing the high order (sign) bit. What if results are generated in carry-save form?

25 CS/P Mult/Divide Hardware
Multiply: $U_i = 2U_{i-1} - s_i$, $Z_i = Z_{i-1} + Y s_i 2^{-i}$
Divide: $U_i = 2U_{i-1}(1 + s_i 2^{-i}) + s_i$, $Z_i = Z_{i-1}(1 + s_i 2^{-i})$
[Figure: merged multiply/divide datapath. U and Z banks, each with an n-b register, barrel shifter (shift amount i), complement unit, n-b adder, mini-adder, and 1-b shifter; Y is an additional input to the Z bank. A ROM compares the ms 4 bits of $U_{i-1}$ against ±3/8 to produce $s_i$.]
For lecture: merged mult/divide interconnections. Show where muxes have to go if * and / are combined.

26 Bit Slice Design Approach
[Figure: the same U and Z banks (n-b register, barrel shifter, complement unit, n-b adder, mini-adder, 1-b shifter, plus the ±3/8 ROM on $U_{i-1}$ producing $s_i$) laid out as bit slices.]
Leave routing channels over the bit slice to handle all necessary interconnects instead of "routing around"!

27 Continued Sums/Products Review
Approximation Term (Normalizing term)
  form the additive inverse of one of the independent variables by a sum of approximating terms ($s_i 2^{-i}$)
  form the multiplicative inverse of one of the independent variables by a product of approximating terms ($1 + s_i 2^{-i}$)
Generation term
  generates the result by either a sum or a product of generating terms which are functionally dependent on the approximating terms
Goals
  Family of algorithms mapping onto the same hardware
  Simple iterations (add/subtract, shift)
  Easy selection rules (resulting in simple operations)
  Linear (or quadratic) convergence

28 CP Exponentiation
$Z = e^X$ where $1/2 \le |X| < \ln 2$ so $e^X \in [1/2, 2)$
$X = (X - \ln(\prod e_i)) + \ln(\prod e_i)$ where $e_i = (1 + s_i 2^{-i})$ for $s_i \in \{-1,0,1\}$
so that $e^X = e^{X - \ln(\prod e_i)} \cdot e^{\ln(\prod e_i)}$
  approx. term: $e^{X - \ln(\prod e_i)} \to 1$
  gener. term: $e^{\ln(\prod e_i)} \to Z$
$e^X = e^{X + \sum(-\ln e_i)} \cdot \prod e_i$
Linear convergence (approximately): k iterations for k digits of precision
Simple operations: add/subtract, shift
Have to precompute and store (along with the comparison constants) the values of $-\ln e_i = -\ln(1 + s_i 2^{-i})$
For lecture: Tricks: $e^{\ln \mathrm{value}} = \mathrm{value}$; $-\ln(\prod e_i) = \sum(-\ln e_i)$

29 CP Exponentiation, con’t
Scaled Approx. Term → 0
  $X + \sum(-\ln e_i)$: $X_i = X_{i-1} + (-\ln(1 + s_i 2^{-i}))$
  $X_0 = X$ and $X_1 = X_0 - \ln e_1$
  and applying the normalize-to-0 scale factor $U_i = X_i 2^i$
  $U_0 = 2^0 X_0 = X_0 \in [-1/2, \ln 2)$
  $U_1 = 2^1 X_1 = 2(X_0 - \ln e_1)$
  $U_i = 2^i X_i = 2^i(X_{i-1} + (-\ln e_i))$
  $\quad = 2 \cdot 2^{i-1} X_{i-1} + 2^i(-\ln e_i) = 2U_{i-1} + 2^i(-\ln e_i)$
  $U_i = 2U_{i-1} + 2^i(-\ln(1 + s_i 2^{-i}))$
Operations in the $X_i$ iteration are subtract/add and barrel shift, plus have to have access to the extra stored constants
  stored constants for i = 2, ..., n (when $s_i = 0$ the constant is $-\ln 1 = 0$)

30 CS/P Exponentiation, con’t
Gener. Term → Exponential Z
  $\prod e_i$: $Z_i = Z_{i-1}(1 + s_i 2^{-i})$
  $Z_0 = 1$
  $Z_1 = Z_0 e_1$
Selection Rules
  $s_i = 1$ if $U_{i-1} \ge 3/8$
  $s_i = 0$ otherwise
  $s_i = -1$ if $U_{i-1} < -3/8$
The choice of the selection values +3/8 and -3/8 is once again to get close to canonical recoding on $s_i$ with easy comparison constants

31 Exp Initialization Rules
If we have $|X| \in [1/2, \ln 2)$ and we want $X_1 \in [-1/2, 1/2)$ since we are normalizing to 0

X range            e1         ln e1     X1 = X0 - ln e1
[1/2, ln 2)        e^(1/2)     1/2      [0, ln 2 - 1/2)
[1/4, 1/2)         e^(1/4)     1/4      [0, 1/4)
[-1/4, 1/4)        1           0        [-1/4, 1/4)
[-1/2, -1/4)       e^(-1/4)   -1/4      [-1/4, 0)
[-ln 2, -1/2)      e^(-1/2)   -1/2      [-ln 2 + 1/2, 0)
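Putting the initialization table, the scaled approximation term, and the generating term together gives the following floating-point sketch. It is an illustration only: `cp_exp` is a made-up name, the argument is assumed to lie in [-ln 2, ln 2), `math.exp` and `math.log1p` stand in for the stored constants $e_1$ and $-\ln(1 + s_i 2^{-i})$, and multiplications by powers of two stand in for shifts.

```python
import math

def cp_exp(x, n=30):
    """Continued-products e^x for x in [-ln 2, ln 2) with s_i in {-1, 0, 1}."""
    if not -math.log(2) <= x < math.log(2):
        raise ValueError("sketch assumes -ln 2 <= x < ln 2")
    # Initialization: pick ln(e_1) from {1/2, 1/4, 0, -1/4, -1/2} by range.
    if x >= 0.5:
        ln_e1 = 0.5
    elif x >= 0.25:
        ln_e1 = 0.25
    elif x >= -0.25:
        ln_e1 = 0.0
    elif x >= -0.5:
        ln_e1 = -0.25
    else:
        ln_e1 = -0.5
    u = 2.0 * (x - ln_e1)          # U_1 = 2 (X_0 - ln e_1)
    z = math.exp(ln_e1)            # Z_1 = e_1 (a stored constant in hardware)
    for i in range(2, n + 1):
        if u >= 0.375:
            s = 1
        elif u < -0.375:
            s = -1
        else:
            s = 0
        if s:
            # U_i = 2 U_{i-1} + 2^i (-ln(1 + s_i 2^-i)); Z_i = Z_{i-1} (1 + s_i 2^-i)
            u = 2.0 * u - (2.0 ** i) * math.log1p(s * 2.0 ** -i)
            z *= 1.0 + s * 2.0 ** -i
        else:
            u = 2.0 * u            # -ln 1 = 0, so only a shift is needed
    return z
```

The quantity $Z_i e^{U_i 2^{-i}}$ equals $e^X$ at every step, so keeping $U_i$ bounded is what makes $Z_i$ converge.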

32 Stored Constants
How many stored constants do we need?
  ±3/8 for $s_i$ selection
  1/2, 1/4, -1/2, -1/4 for $e_1$ selection; $e^{1/2}$, $e^{1/4}$, $e^{-1/2}$, $e^{-1/4}$ to form $Z_1$
  $-\ln(1 + s_i 2^{-i})$ in support of the $U_i$ computation
    when i is large, the values converge: $\ln(1+\epsilon) = \epsilon - \frac{1}{2}\epsilon^2 + \frac{1}{3}\epsilon^3 - \frac{1}{4}\epsilon^4 + \dots$ for $-1 < \epsilon \le 1$
    if $i > (n-3)/2$, where n is the precision, then $\ln(1+\epsilon) = \epsilon = s_i 2^{-i}$ to machine accuracy
    so only need to store $-\ln(1 \pm 2^{-i})$ for $i = 2, \dots, (n-3)/2$
    e.g., for n = 16, i = 2, 3, 4, 5, 6 → 10 values
  Remember $-\ln 1 = 0$ !!
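The count of logarithm constants can be reproduced with a few lines. This is a sketch with a made-up helper name (`stored_log_constants`); it builds the table of $-\ln(1 \pm 2^{-i})$ for i = 2 up to the integer part of (n-3)/2 and leaves larger i to the approximation $\ln(1+\epsilon) \approx \epsilon$.

```python
import math

def stored_log_constants(n):
    """Table of -ln(1 + s * 2^-i) for s = +1, -1 and i = 2 .. floor((n - 3) / 2)."""
    last = (n - 3) // 2
    table = {}
    for i in range(2, last + 1):
        for s in (1, -1):
            table[(i, s)] = -math.log1p(s * 2.0 ** -i)
    return table
```

For n = 16 the table has entries for i = 2 through 6 with both signs, i.e., the 10 values mentioned on the slide.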

33 CS/P Exponentiation Hardware
$U_i = 2U_{i-1} + 2^i(-\ln(1 + s_i 2^{-i}))$
$Z_i = Z_{i-1}(1 + s_i 2^{-i})$
[Figure: CS/P exponentiation datapath. Banks built from an n-b register, barrel shifter (shift amount i), complement unit, n-b adder, mini-adder, and 1-b shifter, with inputs $U_{i-1}$, $Z_{i-1}$ and outputs $U_i$, $Z_i$. Stored constants: ±3/8 for selecting $s_i$ from $U_{i-1}$; $-\ln(1 \pm 2^{-i})$; 1/2, -1/2, 1/4, -1/4 for initialization.]

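34 CS/P Square Root Derivation
$Z = \sqrt{X}$ where $X \in [1/4, 1)$
Try $Z = \sqrt{X \prod r_i^2 \cdot \prod r_i^{-2}} = \sqrt{X \prod r_i^2} \cdot \prod r_i^{-1}$ where $r_i = (1 + s_i 2^{-i})$ for $s_i \in \{-1,0,1\}$
  $\sqrt{X \prod r_i^2} \to 1$ and $\prod r_i^{-1} \to Z$
  but forming $\prod r_i^{-1}$ would be hard, so try again!
For lecture: $Z = X/\sqrt{X} = (X \prod r_i) / \sqrt{X \prod r_i^2}$
  gener. term: $X \prod r_i \to Z$
  approx. term: $\sqrt{X \prod r_i^2} \to 1$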

35 CS/P Square Root
Scaled Approx. Term ($U_i = 2^i(X_i - 1)$) → 0
  $\sqrt{X \prod r_i^2}$: $U_i = 2U_{i-1} + 2s_i(1 + U_{i-1} 2^{-i+1}) + 2^{-i} s_i^2 (1 + U_{i-1} 2^{-i+1})$
  $U_0 = 2^0(X_0 - 1) = X - 1$ and $U_1 = 2^1(X r_1 - 1)$
Gener. Term → Square root Z
  $X \prod r_i$: $Z_i = Z_{i-1}(1 + s_i 2^{-i})$
  $Z_0 = X$ and $Z_1 = Z_0 r_1$
Selection Rules
  $s_i = -1$ if $U_{i-1} \ge 3/8$
  $s_i = 0$ otherwise
  $s_i = 1$ if $U_{i-1} < -3/8$
Initialization Rule
  $X \in [1/4, 1)$, want $X_1 \in [1/2, 2)$ (want $X_1$ symmetric about 1)
  $r_1 = $ … if $1/4 \le X_0 < 1/2$
  $r_1 = 1$ if $1/2 \le X_0 < 1$
For lecture, approx. term derivation:
  $X_0 = X$; $X_1 = X_0 r_1$; $X_i = X_{i-1}(1 + s_i 2^{-i})^2 \to 1$, so $U_i = 2^i(X_i - 1) \to 0$
  $U_0 = 2^0(X_0 - 1) = X - 1$ and $U_1 = 2^1(X_0 r_1 - 1) = 2 X r_1 - 2 = 2(X_1 - 1)$
  $U_i = 2^i(X_i - 1) = 2^i(X_{i-1}(1 + s_i 2^{-i})^2 - 1)$
  $\quad = 2 \cdot 2^{i-1}(X_{i-1} - 1) + 2^i X_{i-1}(2 s_i 2^{-i} + s_i^2 2^{-2i})$
  $\quad = 2U_{i-1} + 2 s_i X_{i-1} + s_i^2 X_{i-1} 2^{-i}$
  $\quad = \{2U_{i-1}\} + \{2 \cdot 2 \cdot 2^{i-1}(X_{i-1} - 1) s_i 2^{-i} + 2 s_i\} + \{2^{-i} \cdot 2 \cdot 2^{i-1}(X_{i-1} - 1) s_i^2 2^{-i} + s_i^2 2^{-i}\}$ (1st, 2nd, and 3rd terms)
  $\quad = 2U_{i-1} + 2^{-i+2} s_i U_{i-1} + 2 s_i + 2^{-2i+1} s_i^2 U_{i-1} + 2^{-i} s_i^2$
  $\quad = 2U_{i-1} + 2 s_i (1 + U_{i-1} 2^{-i+1}) + 2^{-i} s_i^2 (1 + U_{i-1} 2^{-i+1})$

36 CS/P Square Root Hardware
$U_i = 2U_{i-1} + 2s_i(1 + U_{i-1} 2^{-(i-1)}) + 2^{-i} s_i^2 (1 + U_{i-1} 2^{-(i-1)})$
$Z_i = Z_{i-1}(1 + s_i 2^{-i})$
[Figure: CS/P square root datapath. Banks built from an n-b register, barrel shifter, complement unit, n-b adder, (mini-)adder, and 1-b shifter, with inputs $U_{i-1}$, $Z_{i-1}$, control signals $s_i$ and $s_i^2$, and outputs $U_i$, $Z_i$; ±3/8 comparison on $U_{i-1}$ produces $s_i$.]
For lecture: Requires two cycles!! And a MUCH bigger mini-adder. Could use a third bank, maybe. But note that for large i (like i > n/2) we don't have to worry about the $2^{-i} s_i^2 (1 + U_{i-1} 2^{-(i-1)})$ term (shown in red on the slide) since it doesn't affect result precision.

37 CS/P Sin&Cos Hardware
$U_i = 2U_{i-1} + s_i 2^i (-\tan^{-1} 2^{-i})$
$R_i = R_{i-1} - s_i 2^{-i} I_{i-1}$
$I_i = I_{i-1} + s_i 2^{-i} R_{i-1}$
R = cos X and I = sin X
[Figure: CS/P sin & cos datapath. Three banks (U, R, I), each built from an n-b register, barrel shifter (shift amount i), complement unit (controlled by $s_i$ or $!s_i$), n-b adder, mini-adder, and 1-b shifter, with inputs $U_{i-1}$, $R_{i-1}$, $I_{i-1}$ and outputs $U_i$, $R_i$, $I_i$. Stored constants: ±3/8 for selecting $s_i$ from $U_{i-1}$; $-\tan^{-1} 2^{-i}$; $\tan \pi/4$.]

38 Continued Sums/Products Review
Approximation Term (Normalizing term)
  form the additive inverse of one of the independent variables by a sum of approximating terms ($s_i 2^{-i}$)
  form the multiplicative inverse of one of the independent variables by a product of approximating terms ($1 + s_i 2^{-i}$)
Generation term
  generates the result by either a sum or a product of generating terms which are functionally dependent on the approximating terms
Goals
  Family of algorithms mapping onto the same hardware
  Simple iterations (add/subtract, shift, table lookup)
  Easy selection rules (requires scaling of approx. term)
  Linear convergence


