CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Fall 2006 Lecture 10 Floating Point Number Rounding, Polynomial Expression
Topics: Rounding floating-point numbers; polynomial expression
Rounding the numbers
Why we need the guard bit, the round bit, and the sticky bit.
Example
Normalize according to exponent, then renormalize: $\dots \times 2^3$. Result = $\dots \times 2^3$.
Take 5 bits after the binary point, followed by the round bit and the sticky bit.
Rounding
We need only one guard bit for normalization after addition.
Assumption: operands are normalized. Why?
Example
Normalize according to exponent, then renormalize. Result = …
Take 5 bits after the binary point.
Round bit: the bit on the boundary. Non-zero => round up.
Theory behind it
Keep a guard bit (g) and a round bit (r); OR all the other lower bits together into the sticky bit.
When shifting right, we don't need to remember anything more than these 3 bits below the least significant bit.
This is a necessary and sufficient condition.
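The three-bit claim above can be checked with a small sketch. This assumes round-to-nearest with ties to even and integer significands; the function names (`to_grs`, `round_grs`, `round_exact`) are hypothetical and not from the lecture. It compares rounding from only g, r, and s against rounding from every discarded bit, both with and without a one-position left shift during renormalization.

```python
def to_grs(sig, extra):
    """Split sig into kept bits plus guard, round, sticky. Assumes extra >= 2."""
    kept = sig >> extra
    g = (sig >> (extra - 1)) & 1                   # first bit below the kept LSB
    r = (sig >> (extra - 2)) & 1                   # second bit below the kept LSB
    s = int(sig & ((1 << (extra - 2)) - 1) != 0)   # OR of all remaining bits
    return kept, g, r, s


def round_grs(kept, g, r, s, shift_left=False):
    """Round to nearest, ties to even, using only g, r and s."""
    if shift_left:
        # Renormalization shifted left by one: the guard bit becomes the new LSB.
        kept = (kept << 1) | g
        first, rest = r, s
    else:
        first, rest = g, r | s
    if first and (rest or (kept & 1)):
        kept += 1          # a carry out here would need one more renormalization
    return kept


def round_exact(sig, extra):
    """Reference rounding that looks at every discarded bit."""
    kept = sig >> extra
    rem = sig & ((1 << extra) - 1)
    half = 1 << (extra - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1
    return kept
```

For example, `round_grs(*to_grs(sig, 5))` agrees with `round_exact(sig, 5)`, and with `shift_left=True` it agrees with `round_exact(sig, 4)`, for every 10-bit `sig`.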
Polynomial Approximation of Functions
Taylor Series
$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \dots$
Example: $\sin(x) = x - x^3/3! + x^5/5! - x^7/7! + \dots$
Taylor Series
How to calculate the value of the function? Group common factors.
Given: $P_N(x) = \sum_{i=0}^{N} c_i x^i = c_0 + x(c_1 + x(c_2 + \dots + x(c_{N-1} + x c_N)))$
Recursively: $R(N) = c_N$, $R(i-1) = c_{i-1} + x R(i)$, …, $P_N(x) = R(0)$
=> N multiplies and N adds.
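The recurrence above is Horner's rule. A minimal sketch, assuming the coefficients are given lowest order first; the name `horner` and the list `SIN_COEFFS` are illustrative, not from the lecture:

```python
import math


def horner(coeffs, x):
    """Evaluate c0 + c1*x + ... + cN*x**N with N multiplies and N adds."""
    result = coeffs[-1]                 # R(N) = c_N
    for c in reversed(coeffs[:-1]):
        result = c + x * result         # R(i-1) = c_{i-1} + x * R(i)
    return result                       # P_N(x) = R(0)


# Taylor coefficients of sin(x) about 0, truncated after the x**7 term.
SIN_COEFFS = [0, 1, 0, -1 / math.factorial(3),
              0, 1 / math.factorial(5), 0, -1 / math.factorial(7)]
```

Calling `horner(SIN_COEFFS, 0.5)` gives a value close to `math.sin(0.5)`; the remaining gap comes from truncating the series.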
Taylor Series
1 adder => do it in series. Given more components => can we go faster?
Take N = 7 as an example: $c_7x^7 + c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0$
How to accelerate?
Taylor Series
$c_7x^7 + c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0$
Use 3 stages to generate $x^k$. Use $x^k$ to generate the polynomial expression.
[Figure: multipliers (×) generate $x^2, x^3, \dots, x^7$ from x and feed an adder (+); labels: carry-save = constant time, log n.]
Taylor Series
Pair the terms: $c_7x + c_6$, $c_5x + c_4$, $c_3x + c_2$, $c_1x + c_0$
Combine with $x^2$: $x^2(c_7x + c_6) + c_5x + c_4$ and $x^2(c_3x + c_2) + c_1x + c_0$
Combine with $x^4$: $x^4[x^2(c_7x + c_6) + c_5x + c_4] + x^2(c_3x + c_2) + c_1x + c_0$
[Figure: multipliers generating $x^2$ and $x^4$ from x.]
This is a bit faster: only 2 stages.
But what is the fastest way to produce the result? And the most energy-efficient? => minimize the number of multiplies.
All of this uses +'s and ×'s. We need to get rid of them. => Let's try table look-up.
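The pairing above can be written out directly for N = 7. A minimal sketch, with a hypothetical function name (`estrin7`) and coefficients given lowest order first. The four pair products are independent of each other, as are the two $x^2$ combinations, so with enough multipliers each level could run in parallel:

```python
def estrin7(c, x):
    """Evaluate c[0] + c[1]*x + ... + c[7]*x**7 by pairing terms."""
    x2 = x * x                  # stage 1 of power generation
    x4 = x2 * x2                # stage 2 of power generation
    p01 = c[1] * x + c[0]       # the four pairs are independent of each other
    p23 = c[3] * x + c[2]
    p45 = c[5] * x + c[4]
    p67 = c[7] * x + c[6]
    low = x2 * p23 + p01        # x^2 (c3 x + c2) + c1 x + c0
    high = x2 * p67 + p45       # x^2 (c7 x + c6) + c5 x + c4
    return x4 * high + low
```

The total number of multiplies is higher than in the serial recurrence (9 here versus 7), which is why the slide goes on to ask about minimizing multiplies.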
Taylor Series: Table look-up
SRAM/DRAM => eats power. ROM => better option.
Suppose there is a table organized as a binary tree.
Let $x = x_H + x_L$ and $x_0 = x_H$, so that $f(x) = f(x_H + x_L)$.
Example: x = …, $x_H$ = …, $x_L$ = …
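Taylor Series: Table look-up
1st order: $f(x_H + x_L) \approx f(x_H) + x_L f'(x_H)$ => only 1 multiplication!!!
[Figure: $x_H$ addresses Table-1, which returns $f(x_H)$, and Table-2, which returns $f'(x_H)$; a multiplier forms $x_L f'(x_H)$ and an adder produces $f(x_H + x_L)$.]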
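A software sketch of the two-table scheme, under stated assumptions: x lies in [0, 1), the high part keeps `H_BITS` fraction bits, and the tables hold floating-point values rather than fixed-width ROM words. All names (`H_BITS`, `build_tables`, `lookup`, `TABLE_F`, `TABLE_DF`) are hypothetical.

```python
import math

H_BITS = 6  # assumed number of fraction bits kept in x_H


def build_tables(f, df, h_bits=H_BITS):
    """Table-1 holds f(x_H), Table-2 holds f'(x_H), one entry per x_H."""
    n = 1 << h_bits
    table_f = [f(i / n) for i in range(n)]
    table_df = [df(i / n) for i in range(n)]
    return table_f, table_df


def lookup(x, table_f, table_df, h_bits=H_BITS):
    """Approximate f(x) for 0 <= x < 1 as f(x_H) + x_L * f'(x_H)."""
    n = 1 << h_bits
    index = int(x * n)            # x_H, used as the table address
    x_l = x - index / n           # x_L, the part below x_H
    return table_f[index] + x_l * table_df[index]   # one multiply, one add


# Example tables for sin(x), whose derivative is cos(x).
TABLE_F, TABLE_DF = build_tables(math.sin, math.cos)
```

Changing the function only means rebuilding the tables; `lookup` itself stays the same.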
Taylor Series
With an extra order => 1 extra table and 1 extra multiplier.
If you wish to change the function, all you have to do is change the contents of the table.
Problem? => Now it's the size of the table! An L-bit address needs $2^L$ entries.
Taylor Series
Let's split x into 3 sections (instead of the previous 2, high and low):
$x = x_1 + x_2 2^{-k} + x_k$
=> $f(x) = f(x_1 + x_2 2^{-k}) + x_k f'(x_1) + \varepsilon$, with $\varepsilon \approx 2^{-3k}$
f(x) requires a $2^n \times V_n$ table, where n = number of bits of x and $V_n$ = number of bits of f(x).
32-bit x => $2^{32}$ entries -> HUGE!! But do we really need all those numbers in the table?
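Taylor Series
Let $E = \epsilon$ be the bit dropped by the floor (0 or 1, the same for x+y and x-y), and let $\lfloor \cdot \rfloor$ denote the floor (lower limit).
$x \cdot y = (x+y)^2/4 - (x-y)^2/4$
$= (\lfloor (x+y)/2 \rfloor + E/2)^2 - (\lfloor (x-y)/2 \rfloor + E/2)^2$
$= \lfloor (x+y)/2 \rfloor^2 - \lfloor (x-y)/2 \rfloor^2 + E \cdot y$
The content of the lower bits of x determines the lower bits of the result, but not the other bits!!
[Figure: table mapping x to $x^2$.]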
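The identity can be exercised with a table of squares. A minimal sketch, assuming unsigned operands of a hypothetical width `N_BITS`; the names `SQUARES` and `table_multiply` are illustrative. Expanding the two squared terms leaves a correction of E·y that is added back, and since E is a single bit that correction is a select rather than a real multiply.

```python
N_BITS = 8                                        # assumed operand width
SQUARES = [z * z for z in range(1 << N_BITS)]     # look-up table of z**2


def table_multiply(x, y):
    """Multiply two N_BITS-wide unsigned integers using two table reads."""
    if x < y:
        x, y = y, x               # keep x - y non-negative
    e = (x + y) & 1               # bit dropped by the floor; same parity as x - y
    a = (x + y) >> 1              # floor((x + y) / 2)
    b = (x - y) >> 1              # floor((x - y) / 2)
    return SQUARES[a] - SQUARES[b] + (y if e else 0)
```

For instance, with x = 3 and y = 2 the table reads give 4 - 0, and the dropped bit adds y = 2, for a product of 6.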
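Taylor Series
$2^n \times v$ vs. $2^n \times (v-w) + 2^L \times w$
$= 2^n \times v - (2^n \times w - 2^L \times w)$
$= 2^n \times v - w(2^n - 2^L)$
Size of the table is reduced by $w(2^n - 2^L)$ bits.
[Figure: a single $2^n \times v$ table with n-bit input x and v-bit output f(x), versus a $2^n \times (v-w)$ table for the upper $v-w$ output bits plus a $2^L \times w$ table, addressed by L bits, for the lower w output bits.]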
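To make the size comparison concrete, the sketch below splits a squaring table in this way. The sizes (`N`, `V`, `W`, `L`) and all names are assumptions chosen for illustration. It relies on the fact that the low W bits of $x^2$ depend only on the low W bits of x, so L = W is enough here.

```python
N, V, W, L = 8, 16, 4, 4        # assumed: 8-bit x, 16-bit x**2, low 4 bits split off

# Upper V-W output bits, addressed by all N input bits.
HIGH = [(x * x) >> W for x in range(1 << N)]
# Lower W output bits, addressed by only the low L input bits.
LOW = [(x * x) & ((1 << W) - 1) for x in range(1 << L)]


def square(x):
    """Reassemble x**2 from the two smaller tables."""
    return (HIGH[x] << W) | LOW[x & ((1 << L) - 1)]


def full_table_bits(n, v):
    return (1 << n) * v


def split_table_bits(n, v, w, l):
    return (1 << n) * (v - w) + (1 << l) * w


def saved_bits(n, v, w, l):
    return w * ((1 << n) - (1 << l))
```

With these sizes the single table needs 4096 bits and the split pair needs 3136 bits, a saving of 960 bits, which matches $w(2^n - 2^L)$.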