Published by Logan Bryan. Modified over 9 years ago.
1
A Fast Hardware Approach for Approximate, Efficient Logarithm and Anti-logarithm Computation Suganth Paul Nikhil Jayakumar Sunil P. Khatri Department of Electrical and Computer Engineering Texas A&M University, College Station
2
Introduction The fast generation of functions such as the logarithm and antilogarithm is important in areas such as DSP, computer graphics, scientific computing, artificial neural networks, and logarithmic number systems. Over the years, authors have proposed various hardware approaches to accurately approximate the logarithm and antilogarithm functions. Of these approaches, look-up table (LUT) based methods such as Brubaker, Maenner, Kmetz, and SBTM are widely used. Some hardware approaches combine LUTs with polynomial approximations, but these need multiplications or divisions. Our approach combines an LUT with linear interpolation implemented in an area- and delay-efficient manner. The novelty of our approach lies in the fact that we do not need a multiplier or divider to perform the interpolation. We also use the same hardware structure to implement both log and antilog. The number format used for the computation is $N = 2^{E}(1 + m)$. Here $m$, with $0 < m < 1$, is the mantissa and $E$ is the exponent.
3
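Mitchell Approximation With $N = 2^{E}(1 + m)$, where $E$ is the exponent and $m$ the mantissa, the logarithm of a number is found as $\log_2 N = E + \log_2(1 + m)$. Mitchell's approximation is given by $\log_2(1 + m) \approx m$, so that $\log_2 N \approx E + m$. The error due to this approximation is $\varepsilon_M(m) = \log_2(1 + m) - m$. The error is plotted on the right.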
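The approximation on this slide can be tried out in a few lines of software. The sketch below is only an illustration in floating point, not the hardware datapath; the function names `mitchell_log2` and `mitchell_error` are hypothetical, and the number is split into exponent and mantissa with Python's `math.frexp`.

```python
import math


def mitchell_log2(value: float) -> float:
    """Approximate log2(value), value > 0, as E + m where value = 2**E * (1 + m)."""
    frac, exp = math.frexp(value)   # value = frac * 2**exp with 0.5 <= frac < 1
    exponent = exp - 1              # renormalise to value = 2**exponent * (1 + m)
    mantissa = 2.0 * frac - 1.0     # mantissa m in [0, 1)
    return exponent + mantissa


def mitchell_error(mantissa: float) -> float:
    """Error of Mitchell's approximation for a given mantissa: log2(1 + m) - m."""
    return math.log2(1.0 + mantissa) - mantissa


# Largest error over a grid of 4096 mantissa values.
worst = max(mitchell_error(i / 4096) for i in range(4096))
print(mitchell_log2(12.0), math.log2(12.0), worst)
```

The error is zero at both ends of the mantissa range and peaks at roughly 0.086 near $m \approx 0.44$, which is the curve the LUT-based methods on the following slides set out to correct.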
4
Kmetz Approximation In the Kmetz method, the Mitchell error curve shown above is sampled at $2^{t}$ points and stored in an LUT. Here the LUT is indexed by the first $t$ bits of the mantissa. If the error value looked up from the LUT is $a$, the logarithm is found as $\log_2 N \approx E + m + a$, where $E$ is the exponent and $m$ is the mantissa. The error in this case, due to approximating the logarithm of the mantissa portion, is given by $\log_2(1 + m) - (m + a)$.
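A minimal software sketch of this table-correction idea follows. It assumes the error curve is sampled at the left edge of each interval (the slide does not say where the samples are taken), and the names `build_error_lut` and `kmetz_log2_mantissa` are hypothetical. The mantissa is passed as a `k`-bit integer so that the table index is simply its leading `t` bits.

```python
import math


def build_error_lut(t: int) -> list[float]:
    """Sample log2(1 + m) - m at m = i / 2**t for i = 0 .. 2**t - 1 (assumed sampling)."""
    size = 1 << t
    return [math.log2(1.0 + i / size) - i / size for i in range(size)]


def kmetz_log2_mantissa(m_bits: int, k: int, t: int, lut: list[float]) -> float:
    """Approximate log2(1 + m) for a k-bit mantissa m = m_bits / 2**k."""
    index = m_bits >> (k - t)   # leading t bits of the mantissa select the entry
    m = m_bits / (1 << k)
    return m + lut[index]       # Mitchell term plus the stored error value


K, T = 12, 6                    # 12-bit mantissa, 64-entry table (illustrative sizes)
lut = build_error_lut(T)
worst = max(
    abs(math.log2(1.0 + i / (1 << K)) - kmetz_log2_mantissa(i, K, T, lut))
    for i in range(1 << K)
)
print(worst)
```

With these illustrative sizes the worst-case error drops by roughly an order of magnitude compared with the uncorrected Mitchell approximation.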
5
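Our Approach In our method we interpolate between values stored in the LUT to get a more accurate result. The logarithm of the mantissa part of the number is obtained as $\log_2(1 + m) \approx m + a + (b - a)\,n / 2^{k-t}$, where $a$ is the error value from the LUT at location $i$ (the value of the leading $t$ bits of the mantissa), $t$ is the number of leading bits in the mantissa indexing the table, $b$ is the next value in the LUT at location $i + 1$, $k$ is the total number of bits used to represent the mantissa, and $n$ is the decimal value of the last $k - t$ bits of the mantissa. The multiplication step $(b - a)\,n$ is found without a multiplier, by adding approximate logarithms of the two factors and taking the approximate antilogarithm of the sum. The term [...] is found by using the same LUT as above. We consider the following approximations to find the logarithm and the antilogarithm.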
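The interpolation between two adjacent table entries can be sketched as follows. This is an illustration under stated assumptions, not the hardware design: it uses an ordinary multiplication for the interpolation term (which the presented method avoids), it assumes the table stores the Mitchell error at the interval edges plus one extra closing entry, and the function names are hypothetical.

```python
import math


def build_error_lut(t: int) -> list[float]:
    """Mitchell error log2(1 + m) - m at m = i / 2**t, i = 0 .. 2**t (one extra end entry)."""
    size = 1 << t
    return [math.log2(1.0 + i / size) - i / size for i in range(size + 1)]


def interpolated_log2_mantissa(m_bits: int, k: int, t: int, lut: list[float]) -> float:
    """Approximate log2(1 + m) by linear interpolation between adjacent LUT entries."""
    index = m_bits >> (k - t)              # leading t bits index the table
    n = m_bits & ((1 << (k - t)) - 1)      # value of the last k - t bits
    a = lut[index]                         # error value at the indexed location
    b = lut[index + 1]                     # next value in the table
    m = m_bits / (1 << k)
    return m + a + (b - a) * n / (1 << (k - t))


K, T = 12, 6                               # illustrative sizes, not taken from the slides
lut = build_error_lut(T)
worst = max(
    abs(math.log2(1.0 + i / (1 << K)) - interpolated_log2_mantissa(i, K, T, lut))
    for i in range(1 << K)
)
print(worst)
```

For the same 64-interval table, interpolating brings the worst-case error down to a few times $10^{-5}$, compared with several times $10^{-3}$ when the table value is used on its own.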
6
Errors for Various Interpolation Methods and Table Sizes 1. The logarithm is found by a) Mitchell approximation or b) Kmetz approximation using another LUT. 2. The antilogarithm is found by a) Mitchell approximation or b) Kmetz approximation using another LUT. We find from the table below that the combination 1.b) with 2.b) has the best error performance, and hence we use LUTs to approximate the multiplication. Max Error is in [...]
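To illustrate how a product can be formed with only shifts and an addition, here is a sketch of the Mitchell-only variant (option a) for both steps), assuming the product is obtained by adding approximate base-2 logarithms and taking the approximate antilogarithm. The fixed-point width `FRAC_BITS` and all function names are assumptions made for the example; the LUT-corrected variant that the slide reports as more accurate is not shown.

```python
FRAC_BITS = 16  # fractional bits of the fixed-point logarithm (assumed width)


def mitchell_log2_int(n: int) -> int:
    """Fixed-point Mitchell log2 of a positive integer n < 2**FRAC_BITS."""
    pos = n.bit_length() - 1               # position of the leading one
    rest = n - (1 << pos)                  # bits below the leading one
    frac = rest << (FRAC_BITS - pos)       # align them as the fraction
    return (pos << FRAC_BITS) | frac


def mitchell_antilog2_int(x: int) -> int:
    """Fixed-point Mitchell antilog: 2**I * (1 + f) for x = I + f."""
    pos = x >> FRAC_BITS
    frac = x & ((1 << FRAC_BITS) - 1)
    mant = (1 << FRAC_BITS) + frac         # the value 1.f in fixed point
    if pos >= FRAC_BITS:
        return mant << (pos - FRAC_BITS)
    return mant >> (FRAC_BITS - pos)


def mitchell_multiply(p: int, q: int) -> int:
    """Approximate p * q using only shifts and one addition."""
    return mitchell_antilog2_int(mitchell_log2_int(p) + mitchell_log2_int(q))


print(mitchell_multiply(4, 8), mitchell_multiply(3, 3), mitchell_multiply(5, 7))
```

Powers of two multiply exactly; in general this uncorrected form underestimates the product by up to one ninth, which is why correcting the logarithm and antilogarithm with LUT values is attractive.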
7
Block Diagram of the Log Engine The block diagram shows the implementation of $\log_2 N \approx E + \log_2(1 + m)$, where $E$ is the exponent and $m$ is the 23-bit mantissa. The number of leading bits of the mantissa going to the interpolator depends on the size of the LUTs used in the interpolator. In this case we are using an LUT that holds 64 values, and 13 bits of the mantissa are required. The Interpolator block is shown below.
8
Interpolator Block Diagram The implementation can be pipelined to get better throughput. The COMPARE block determines whether the final stage performs an add or a subtract. The LOD (leading-one detector) block finds the position of the leading one, and the remaining bits are used to access the LUT. The same LUT is used for both looked-up quantities and is implemented as a dual-port ROM.
9
Antilog Computation Let $x = I + f$, where $I$ is the integer part and $f$ is the fractional part of the number. The antilogarithm of this number is found as $2^{x} = 2^{I} \cdot 2^{f}$. Using Mitchell's method we make the following approximation: $2^{f} \approx 1 + f$. A Kmetz approximation can be made by storing the error due to this approximation in an LUT and adding the error value to the above equation for the antilogarithm. In our approach, we compute the antilogarithm by interpolating efficiently between two adjacent table values stored in the LUT, without needing a multiplier. We follow the same flow used for computing the logarithm. The error incurred while using different table sizes for computing the antilogarithm is shown below.
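The antilogarithm flow can be sketched in software in the same way as the logarithm. The example below is a floating-point illustration only: it assumes the table stores the error $2^{f} - (1 + f)$ at evenly spaced points with one extra closing entry, it uses an ordinary multiplication for the interpolation (the presented hardware avoids one), and all function names are hypothetical.

```python
import math


def mitchell_antilog2(x: float) -> float:
    """Approximate 2**x as 2**I * (1 + f), with I = floor(x) and f = x - I."""
    integer = math.floor(x)
    fraction = x - integer
    return math.ldexp(1.0 + fraction, integer)


def build_antilog_error_lut(t: int) -> list[float]:
    """Error 2**f - (1 + f) sampled at f = i / 2**t, i = 0 .. 2**t (assumed layout)."""
    size = 1 << t
    return [2.0 ** (i / size) - (1.0 + i / size) for i in range(size + 1)]


def interpolated_antilog2(x: float, t: int, lut: list[float]) -> float:
    """Mitchell antilog corrected by linear interpolation between adjacent LUT entries."""
    integer = math.floor(x)
    fraction = x - integer
    scaled = fraction * (1 << t)
    index = int(scaled)                    # leading t bits of the fraction
    weight = scaled - index                # position between the two entries
    a, b = lut[index], lut[index + 1]
    mantissa = 1.0 + fraction + a + (b - a) * weight
    return math.ldexp(mantissa, integer)


lut = build_antilog_error_lut(6)           # 64 intervals, illustrative size
worst = max(
    abs(interpolated_antilog2(i / 1000, 6, lut) / 2.0 ** (i / 1000) - 1.0)
    for i in range(-3000, 3000)
)
print(mitchell_antilog2(3.5), 2.0 ** 3.5, worst)
```

The uncorrected Mitchell antilogarithm is exact at integer inputs and overestimates in between; the interpolated correction reduces the relative error to the order of $10^{-5}$ for this table size.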
10
Comparison of FPGA Resources used by the Log Engine We implemented our method and the Symmetric Bipartite Table Method (SBTM) on a Virtex2P FPGA. Our method requires smaller on-chip block RAMs. Both methods occupied less than 1% of the FPGA resources. Both methods were able to support clock speeds of a little over 350 MHz.
11
Comparison of LUT Size used and Accuracy of the Log Computation
12
Conclusion Our approach has a low memory requirement compared with other methods while providing better accuracy. When compared to the SBTM, for every two extra bits of accuracy, we need a factor of 2 increase in the LUT size, whereas the SBTM needs a factor of 3 increase in the LUT size. Hence our method scales well to higher accuracy in bits. We are area-efficient compared with polynomial interpolation methods, as we do not need a multiplier or divider to perform interpolation. The implementation can be pipelined, and the number of stages in the pipeline can be varied depending on the throughput required. We have presented an approach to efficiently compute the logarithm and antilogarithm of a number in hardware.
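The scaling claim in the conclusion can be made concrete with a short calculation. The symbol $j$ (the number of two-bit accuracy steps) is introduced here only for illustration, and the growth factors are taken at face value from the slide.

```latex
\[
  \text{LUT growth after } 2j \text{ extra bits:}\qquad
  \underbrace{2^{\,j}}_{\text{interpolated LUT}}
  \quad\text{versus}\quad
  \underbrace{3^{\,j}}_{\text{SBTM}},
  \qquad
  \frac{3^{\,j}}{2^{\,j}} = \left(\tfrac{3}{2}\right)^{j}.
\]
\[
  \text{For 8 extra bits } (j = 4):\qquad 2^{4} = 16
  \quad\text{versus}\quad 3^{4} = 81 .
\]
```

Under that reading, the gap between the two table sizes widens by a factor of 1.5 for every two additional bits of accuracy.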