Published by Godfrey Flynn. Modified over 9 years ago.
1
Ch.5 Fixed-Point vs. Floating Point
2
5.1 Q-format Number Representation on Fixed-Point DSPs
2's Complement Number
– $B = b_{N-1} \dots b_1 b_0$
– Decimal value: $D = -b_{N-1} 2^{N-1} + \dots + b_1 2^{1} + b_0 2^{0}$
– There is a dynamic range limitation. The Q-format can be used to help prevent overflow in multiplication.
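The decimal-value formula can be checked with a few lines of Python. This is only an illustrative sketch; the function name `twos_complement_value` and the use of bit strings are assumptions made for the example, not something the slides define.

```python
def twos_complement_value(bits):
    """Decimal value D of a two's-complement bit string, MSB first."""
    n = len(bits)
    # The sign bit b_{N-1} carries the negative weight -2^(N-1).
    value = -int(bits[0]) * 2 ** (n - 1)
    # Every remaining bit b_i carries the positive weight 2^i.
    for position, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - position)
    return value


# Range of a 4-bit word: -8 .. 7
smallest = twos_complement_value("1000")
largest = twos_complement_value("0111")
```

The limited range between `smallest` and `largest` is the dynamic range limitation the slide refers to.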
3
5.1 Q-format
Q-format or fractional representation
– The implied binary point is moved to the left, immediately after the sign bit.
– $F(B) = -b_{N-1} 2^{0} + b_{N-2} 2^{-1} + \dots + b_1 2^{-(N-2)} + b_0 2^{-(N-1)}$
– The programmer keeps track of the binary point.
Example: Q-15
– 16-bit numbers: 1 sign bit and 15 fractional bits.
– Multiplying two such numbers gives a Q-30 number: a 32-bit product with 30 fractional bits and an extended (duplicate) sign bit.
– The result can be truncated back to Q-15 by dropping the extended sign bit and keeping the most significant 15 fractional bits. See Fig. 5.2.
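The Q-15 multiplication described above can be sketched with plain Python integers standing in for 16-bit and 32-bit registers. The helper names (`to_q15`, `from_q15`, `q15_mul`) are hypothetical, and the final saturation step is an added assumption to handle the single corner case (-1 times -1) that does not fit in Q-15.

```python
Q15_ONE = 1 << 15  # 2^15; a Q-15 integer v represents the value v / 2^15


def to_q15(x):
    """Quantize x in [-1, 1) to a Q-15 integer, rounding and saturating."""
    v = int(round(x * Q15_ONE))
    return max(-Q15_ONE, min(Q15_ONE - 1, v))


def from_q15(v):
    """Convert a Q-15 integer back to a float."""
    return v / Q15_ONE


def q15_mul(a, b):
    """Multiply two Q-15 integers and return a Q-15 result."""
    product_q30 = a * b            # 30 fractional bits
    result = product_q30 >> 15     # arithmetic shift: drop the 15 low bits
    # Assumption: saturate, since (-1) * (-1) = +1 is not representable.
    return max(-Q15_ONE, min(Q15_ONE - 1, result))


half = to_q15(0.5)
quarter = to_q15(0.25)
eighth = q15_mul(half, quarter)    # 0.5 * 0.25 = 0.125
```

Because both operands are fractions with magnitude at most 1, the product also has magnitude at most 1, which is why the Q-format helps avoid overflow in multiplication.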
4
Problems with Q Format
There can be precision loss with the Q-format; Figure 5-5 illustrates the concept with the Q-12 example. Addition and subtraction can still overflow; scaling can be used to help.
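Both problems can be seen numerically. The sketch below is not a reproduction of Figure 5-5; it assumes a 16-bit word, compares the rounding error of one value in Q-12 and Q-15, and then shows a Q-15 addition that overflows unless the operands are scaled first. The value 0.1 and the operand pair are arbitrary choices for the example.

```python
def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2^-frac_bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


# Precision loss: fewer fractional bits means a coarser grid.
x = 0.1
err_q12 = abs(quantize(x, 12) - x)   # grid spacing 2^-12
err_q15 = abs(quantize(x, 15) - x)   # grid spacing 2^-15

# Addition overflow in Q-15: 0.75 + 0.5 = 1.25 is not representable.
a = 24576                            # 0.75 in Q-15
b = 16384                            # 0.5 in Q-15
raw_sum = a + b                      # 40960, above the 16-bit maximum 32767
# Scaling both operands down by 2 first keeps the sum in range;
# the result then represents (0.75 + 0.5) / 2 = 0.625.
scaled_sum = (a >> 1) + (b >> 1)
```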
5
5.2 Finite Word Length Effects on Fixed-Point DSPs
In a fixed-point DSP implementation, digital filter coefficients are stored in a fixed-point format. The resulting finite word length quantization effect is similar to the input data quantization introduced by an A/D converter.
6
5.2 Finite Word Length Effects (p.2)
In IIR filters, the fixed-point representation of the coefficients can cause the poles to shift in the z-plane. The amount of shift due to the quantization of a single coefficient is influenced by the positions of all the other poles. To reduce this effect, IIR filters are often implemented as a cascade of 2nd-order systems.
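The pole shift can be illustrated on a single 2nd-order section with denominator $1 + a_1 z^{-1} + a_2 z^{-2}$. The sketch below is a minimal example under assumed values: the pole radius 0.95, the angle $\pi/8$, and the deliberately coarse 6 fractional bits are chosen only to make the shift visible.

```python
import cmath
import math


def section_poles(a1, a2):
    """Roots of z^2 + a1*z + a2, the poles of a 2nd-order section."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2^-frac_bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


# Complex-conjugate pole pair at radius r and angle theta.
r, theta = 0.95, math.pi / 8
a1 = -2 * r * math.cos(theta)
a2 = r * r

pole_exact = section_poles(a1, a2)[0]
pole_quantized = section_poles(quantize(a1, 6), quantize(a2, 6))[0]

# Distance the pole moved in the z-plane because of coefficient rounding.
shift = abs(pole_quantized - pole_exact)
```

In a 2nd-order section each pole pair depends on only two coefficients, so an error in one coefficient cannot disturb the poles of the other sections in the cascade.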
7
5.2 Finite Word Length Effects (p.3)
The frequency response of the implemented system is also affected by the quantization of the coefficients in the difference equation. Finally, coefficient quantization can also lead to limit cycles in IIR filters: once the input has gone to zero, for example after a unit impulse, the output of a nominally stable system can settle into undamped oscillations instead of decaying.
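A limit cycle is easy to reproduce in a first-order recursion $y[n] = -0.9\,y[n-1]$ with zero input after an initial value. The sketch below models the finite word length by rounding each product to an integer; that rounding model, the coefficient -0.9, and the starting value 10 are assumptions of this example, not details taken from the slides.

```python
def round_half_up_div(num, den):
    """Integer round(num / den), halves rounded toward +infinity (den > 0)."""
    return (2 * num + den) // (2 * den)


def simulate(y0, steps):
    """Iterate y[n] = Q(-0.9 * y[n-1]) with zero input, starting from y0."""
    out = [y0]
    for _ in range(steps):
        # -0.9 is applied as the exact ratio -9/10 to avoid float error.
        out.append(round_half_up_div(-9 * out[-1], 10))
    return out


response = simulate(10, 14)
tail = response[-4:]   # the output no longer decays here
```

With exact arithmetic the response would decay toward zero, since the pole magnitude is 0.9. With rounding, the output gets stuck alternating between -4 and 4.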
8
5.3 Floating-Point Number Representation
The C67x processor supports single-precision and double-precision floating-point representations. The formats are shown in Figures 5.6 and 5.7.
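Figures 5.6 and 5.7 are not reproduced in this transcript. Assuming they show the standard IEEE 754 layouts (1 sign bit, 8 exponent bits and 23 fraction bits for single precision; 1, 11 and 52 for double precision), the fields can be inspected as follows. The function names are hypothetical.

```python
import struct


def float32_fields(x):
    """Return (sign, biased exponent, fraction) of x as an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF        # bias is 127
    fraction = bits & 0x7FFFFF            # 23 fraction bits
    return sign, exponent, fraction


def float64_fields(x):
    """Return (sign, biased exponent, fraction) of x as an IEEE 754 double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # bias is 1023
    fraction = bits & ((1 << 52) - 1)     # 52 fraction bits
    return sign, exponent, fraction


single_one = float32_fields(1.0)
double_one = float64_fields(1.0)
```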
9
5.4 Overflow and Scaling
Scaling is the simplest correction method for overflows in fixed-point implementations. It can be applied in most filtering and transform applications. The input is scaled down for processing and the output is then scaled back up. Right shifting (each one-bit shift divides by 2) is an easy way to implement scaling. The shift count can be increased until overflows no longer occur in the computations.
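The procedure of shifting until the overflow disappears can be sketched as below. The function name `accumulate_with_scaling`, the 16-bit accumulator and the sample values are assumptions made for illustration; a real DSP routine would typically rely on the hardware overflow flag instead of an explicit range check.

```python
INT16_MIN = -(1 << 15)
INT16_MAX = (1 << 15) - 1


def accumulate_with_scaling(samples):
    """Sum 16-bit samples, right-shifting the inputs until nothing overflows.

    Returns (accumulated value, shift count used).
    """
    for shift in range(16):
        acc = 0
        overflowed = False
        for sample in samples:
            acc += sample >> shift            # scale the input down
            if not INT16_MIN <= acc <= INT16_MAX:
                overflowed = True
                break
        if not overflowed:
            return acc, shift
    raise OverflowError("no shift count avoids overflow")


acc, shift = accumulate_with_scaling([30000, 20000, 10000])
# Scale the output back up in a wider word to recover the true sum.
restored = acc << shift
```

Each extra shift discards one low-order bit of every input, so the shift count should be kept as small as the overflow condition allows.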
10
5.4 Overflow and Scaling (p.2)
Scaling of filter coefficients can also be used to avoid overflows. It can be shown that the condition to prevent overflow is
– $\sum_{k=0}^{N} |h[k]| \le 1$
For IIR filters, N is taken large enough so that the remaining values are negligible.
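A minimal sketch of coefficient scaling under this condition follows. It assumes the input samples are bounded by 1 in magnitude, so the worst-case output magnitude equals the sum of the absolute coefficient values. The function name and the example coefficients are hypothetical.

```python
def scale_for_no_overflow(h):
    """Scale coefficients h so that sum(|h[k]|) <= 1.

    Returns (scaled coefficients, gain removed). The output must be
    multiplied by the returned gain to undo the scaling.
    """
    gain = sum(abs(c) for c in h)
    if gain <= 1.0:
        return list(h), 1.0                   # already satisfies the bound
    return [c / gain for c in h], gain


h = [0.5, 1.0, -0.5]                          # sum of magnitudes is 2.0
scaled, gain = scale_for_no_overflow(h)
worst_case_output = sum(abs(c) for c in scaled)
```

Here the gain is a power of two, so the scaling could also be done with a one-bit right shift of each coefficient.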