Presentation is loading. Please wait.

Presentation is loading. Please wait.

Representing fractions – Fixed point The problem: How to represent fractions with finite number of bits ?

Similar presentations


Presentation on theme: "Representing fractions – Fixed point The problem: How to represent fractions with finite number of bits ?"— Presentation transcript:

1 Representing fractions – Fixed point The problem: How to represent fractions with finite number of bits ?

2 Representing fractions – Fixed point a 1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a 10 A number with 10 bits

3 Representing fractions – Fixed point a 1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a 10 A number with 10 bits a 1 a 2 a 3 a 4 a 5 a 6 a 7 a 8. a 9 a 10 Fixing the point

4 Representing fractions – Fixed point NumberFraction Signed (2 complement) -2 7 …2 7 -1Quanta of 0.25 Unsigned0…2 8 -1Quanta of 0.25 Range of representation:

5 Fixed point : the problem Cannot represent wide ranges of numbers. In scientific applications.

6 Representing Fractions – Floating point 10 1 * 10 1 -1.23 * 10 -2 Base (radix) - r -0.123

7 Representing Fractions – Floating point 10 1 * 10 1 -1.23 * 10 -1 exponent -0.123

8 Representing Fractions – Floating point 10 1 * 10 1 -1.23 * 10 -1 Number (Mantissa) -0.123

9 Representing Fractions – Floating point 10 -0.123 (-1) 0 *1 * 10 1 (-1) 1 *1.23 * 10 -1 Sign bit

10 Problem of uniqueness 0.1 100*10 -4 0.001*10 2 Representation is not Unique

11 Problem of uniqueness - Normalization 0.61 610*10 -4 0.0061*10 2 Standardization 6.1*10 -1 One digit to the Left of the point

12 Normalized Binary Floating point D = (-1) a 0 * (1.a 1 a 2 a 3 …)*2 b 1 b 2 b 3 … a 0 b 1 b 2 …b n a 1 a 2 a 3 …a m String of bits

13 Floating point - Questions Representing the (signed) exponent How to represent zero? And Nan, infinity ? How to add, subtract and multiply? Rounding Errors.

14 Floating point – Representing the exponent How to represent singed number ? Sign bit 2-Complement

15 Floating point – Representing the exponent How to represent singed number ? Sign bit 2-Complement Neither

16 Floating point – Representing the exponent We want the exponent to be binary ordered: 0000 < 0001 < …. < 1000 < … < 1111

17 Floating point – Representing the exponent Number = Number - B 000…0001 111…1110 e min e max Usually B = 2 n-1 -1 We define the following sizes like this:

18 Floating point – Representing zero,NAN,  ±  ExponentMantissaRepresent e = e min -1M = 0±0 e = e min -1M ≠ 00.M*2 e min e min ≤ e ≤ e max 1.M*2 e e = e max +1M = 0 ±± e = e max +1M ≠ 0NaN IEEE754 special values Denormalized number normalized number

19 IEEE 754 ParameterSingleDouble Mantissa2453 e max 1271023 e min -126-1022 Exponent width 811 Format width 3264 (Including the sign Bit)

20 What is NaN (not a number) OperationNaN produced by +  + (-  ) * 0 *  / 0/0,  /  Partial list Operation X±NaN X*NaN X/NaN

21 Infinity Provide a safe was to continue calculation when overflow is encountered.

22 Calculations with Floating Point numbers Addition: Equalize the exponents (smaller  larger exponent) Sum the mantissa Renormalize if necessary

23 Calculations with Floating Point numbers Example (in base 10): 91  9.10*10 1 |E| = 1, |M| = 3 9.7  9.70*10 0

24 Calculations with Floating Point numbers 9.10*10 1 + 9.70*10 0 Not The same Order.

25 Calculations with Floating Point numbers 9.10*10 1 + 9.70*10 0 9.10*10 1 + 0.97*10 1 10.7*10 1 renormalize 1.07*10 2

26 Calculations with Floating Point numbers Example II (in base 10): 91  9.10*10 1 |E| = 1, |M| = 3 9.75  9.75*10 0

27 Calculations with Floating Point numbers 9.10*10 1 + 9.75*10 0 Not The same Order.

28 Calculations with Floating Point numbers 9.10*10 1 + 9.75*10 0 9.10 *10 1 + 0.975*10 1 10.75*10 1 renormalize 1.07*10 2 5 (rounding error)

29 Rounding Errors The Problem: Squeezing infinite many real numbers into a finite number of bits

30 Measuring Rounding Errors Units in last place (Ulps) Relative Error

31 Measuring Rounding Errors – ULP error = |d.dddd – (z/r e )|*r p-1 If d.dddd*r e represent z p digits

32 Measuring Rounding Errors – ULP Example I: r = 10, p = 3 The number 3.14*10 -2 represents 0.0314159 Error = 0.159

33 Measuring Rounding Errors – ULP What is the maximum ULP if the rounding is toward the nearest number? 0.5 ULP

34 Measuring Rounding Errors – Relative Error Relative error = |d.dddd*r e – z|/z If d.dddd*r e represent z p digits

35 Measuring Rounding Errors – Relative errors Example I: r = 10, p = 3 The number 3.14*10 -2 represents 0.0314159 Relative Error ~ 0.0005


Download ppt "Representing fractions – Fixed point The problem: How to represent fractions with finite number of bits ?"

Similar presentations


Ads by Google