CSCI206 - Computer Organization & Programming

Floating Point Limits zyBook: 10.9, 10.10

IEEE 754 Standard (1985)
value = (-1)^S x (1 + M) x 2^(E - B)   (normalized)
S = sign, E = biased exponent, M = mantissa (fraction), B = bias
S, E, and M are encoded in the binary word.
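As a rough illustration of how S, E, and M sit inside the binary word, the sketch below splits a float into its three fields. It assumes that `float` is a 32-bit IEEE 754 single on the host; the names `ieee_fields` and `decode_float` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision word (names are illustrative). */
struct ieee_fields {
    uint32_t sign;      /* S: 1 bit */
    uint32_t exponent;  /* E: 8 bits, biased by 127 */
    uint32_t fraction;  /* M: 23 stored bits, leading 1 not included */
};

/* Split a float into S, E and M.
   Assumption: float is a 32-bit IEEE 754 single on this machine. */
struct ieee_fields decode_float(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);   /* copy the bit pattern, no numeric conversion */

    struct ieee_fields out;
    out.sign     = bits >> 31;            /* top bit */
    out.exponent = (bits >> 23) & 0xFFu;  /* next 8 bits */
    out.fraction = bits & 0x7FFFFFu;      /* low 23 bits */
    return out;
}
```

For example, `decode_float(1.0f)` should give sign 0, exponent 127 (that is, 0 + bias) and fraction 0.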

IEEE754 - Reserved Values
NaN = Not a Number

IEEE754 - Example Show 3.14 as a single precision float

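3.14 - step 1 write in binary
3.14 == 3 + 0.14
3 == 11 (binary)
For the fraction, multiply by 2 repeatedly; the integer part of each product is the next bit:
0.14*2 = 0.28 -> 0
0.28*2 = 0.56 -> 0
0.56*2 = 1.12 -> 1
0.12*2 = 0.24 -> 0
......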

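step 1 write in binary
Need 24 bits for single (53 for double). In this example, 2 bits before the point, 22 bits after.
3.14 == 11.0010001111010111000011 (binary; the last bit is rounded up because the discarded bits, 1000111..., are more than half a unit in the last place)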
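The repeated doubling in step 1 can be sketched in C as follows. This is only an illustration: `fraction_bits` is a hypothetical helper, it works on a `double` approximation of the fraction, and it truncates rather than rounds.

```c
#include <assert.h>

/* Write the first nbits binary digits of frac (0 <= frac < 1) into out
   as the characters '0' and '1'. out must have room for nbits + 1 chars.
   The digits are truncated, not rounded. */
void fraction_bits(double frac, int nbits, char *out)
{
    assert(frac >= 0.0 && frac < 1.0);
    for (int i = 0; i < nbits; i++) {
        frac *= 2.0;              /* shift one binary place to the left */
        if (frac >= 1.0) {        /* integer part is 1: emit a 1 bit */
            out[i] = '1';
            frac -= 1.0;
        } else {                  /* integer part is 0: emit a 0 bit */
            out[i] = '0';
        }
    }
    out[nbits] = '\0';
}
```

Calling `fraction_bits(0.14, 22, buf)` should produce the truncated digits 0010001111010111000010; rounding to nearest turns the final 0 into 1, which is why the encoded value ends in ...c3 rather than ...c2.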

3.14 - step 2 normalize binary
Normalized form is 1.yyyyy x 2^e
3.14 == 11.0010001111010111000011 == 1.10010001111010111000011 x 2^1
Note a total of 24 bits.

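3.14 - step 3 write mantissa & sign
3.14 == 1.10010001111010111000011 x 2^1
M = 10010001111010111000011
S = 0 (positive)
Note that the mantissa keeps only 23 bits: the leading bit of a normalized number is always 1, so it is omitted from the stored representation only (it still counts in the value).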

3.14 - step 4 encode exponent
3.14 == 1.10010001111010111000011 x 2^1
Exponent = 1, B = 127
E (biased exponent) = 1 + 127 = 128 = 10000000 (8 bits)

step 5 write result
S = 0 (positive)
E = 10000000
M = 10010001111010111000011
0 10000000 10010001111010111000011 to hex = 0x4048f5c3
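One way to check step 5 is to assemble the word from the three fields in C and compare it with the compiler's own encoding of 3.14. This is a sketch that assumes a 32-bit IEEE 754 `float`; `pack_fields` and `bits_to_float` are illustrative names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble a 32-bit word from sign (1 bit), biased exponent (8 bits)
   and fraction (23 bits). */
uint32_t pack_fields(uint32_t sign, uint32_t exponent, uint32_t fraction)
{
    assert(sign <= 1u && exponent <= 0xFFu && fraction <= 0x7FFFFFu);
    return (sign << 31) | (exponent << 23) | fraction;
}

/* Reinterpret a 32-bit pattern as a float.
   Assumption: float is a 32-bit IEEE 754 single on this machine. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With S = 0, E = 128 and M = 0x48f5c3 (the 23 bits 10010001111010111000011), `pack_fields(0, 128, 0x48F5C3)` should return 0x4048f5c3, and `bits_to_float` of that word should compare equal to `3.14f`.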

Endianness
On a little-endian system (Intel, etc.), the four bytes of the IEEE754 value are stored in reverse order:
0x4048 f5c3 (big endian)
0xc3f5 4840 (little endian)
Swap the bytes within each 16-bit word, and swap the two words!
    #include <stdio.h>
    float f = 3.14f;
    unsigned char *p = (unsigned char *)&f;   // view the float as raw bytes
    printf("%02x%02x %02x%02x\n", p[0], p[1], p[2], p[3]);
    // result on Intel (little endian): c3f5 4840, on big-endian MIPS: 4048 f5c3
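To get the same byte order on either kind of machine, the bytes can be reversed when the host turns out to be little endian. The sketch below assumes a 4-byte IEEE 754 `float` whose byte order matches the host's integer byte order (true on common platforms, but not guaranteed by C); both function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the least significant byte of an integer is stored first. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Copy the bytes of f into out, most significant byte first,
   regardless of the host byte order.
   Assumption: floats use the same byte order as integers. */
void float_bytes_msb_first(float f, unsigned char out[4])
{
    assert(sizeof f == 4);
    memcpy(out, &f, 4);
    if (is_little_endian()) {
        /* reverse the four bytes: swap 0<->3 and 1<->2 */
        unsigned char t;
        t = out[0]; out[0] = out[3]; out[3] = t;
        t = out[1]; out[1] = out[2]; out[2] = t;
    }
}
```

For `3.14f` the output bytes should be 40 48 f5 c3 on both little-endian and big-endian hosts.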

Review IEEE754
Fields: S | Exponent | Mantissa
Special values, else normalized numbers:
Exponent    Mantissa (fraction)    Value
0           0                      +/- zero
0           nonzero                denormalized number
all 1’s     0                      +/- infinity
all 1’s     nonzero                NaN (not a number)
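The review table translates almost directly into a classifier. The following is a minimal sketch, assuming 32-bit IEEE 754 singles; the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum float_kind {
    KIND_ZERO, KIND_DENORMALIZED, KIND_NORMALIZED, KIND_INFINITY, KIND_NAN
};

/* Classify a raw single-precision bit pattern using the table:
   exponent 0 or all 1's selects a special value, anything else is normalized. */
enum float_kind classify_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;

    if (exponent == 0)
        return fraction == 0 ? KIND_ZERO : KIND_DENORMALIZED;
    if (exponent == 0xFFu)
        return fraction == 0 ? KIND_INFINITY : KIND_NAN;
    return KIND_NORMALIZED;
}

/* Same classification for a float value.
   Assumption: float is a 32-bit IEEE 754 single on this machine. */
enum float_kind classify_float(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return classify_bits(bits);
}
```

The sign bit is ignored on purpose, so +0 and -0 both classify as zero and both infinities as infinity.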

Largest Single Precision Float
8 bit exponent (bias = 127), 23 bit fraction
All 1’s in the exponent is reserved for NaN and infinity
Maximum biased exponent is 11111110 = 254
Maximum fraction is 23 1’s

E = 11111110 = 254, so exponent = 254 - 127 = 127
Largest value = 1.11111111111111111111111 x 2^127

Move the binary point 23 digits to the right and subtract 23 from the exponent:
1.11111111111111111111111 x 2^127 == 111111111111111111111111 x 2^104

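Convert mantissa
111111111111111111111111 (binary) = 2^24 - 1 = 16777215
Largest single = 16777215 x 2^104, approximately 3.4028235 x 10^38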
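The largest single can be cross-checked in C against `FLT_MAX` from `<float.h>`. The sketch below builds the value twice: once from the bit fields (biased exponent 254, fraction all 1's) and once from the integer form (2^24 - 1) x 2^104. The function names are illustrative, and a 32-bit IEEE 754 `float` is assumed.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Build the largest finite single from its fields:
   biased exponent 254, fraction = 23 ones (bit pattern 0x7F7FFFFF). */
float largest_single(void)
{
    uint32_t bits = (254u << 23) | 0x7FFFFFu;
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Compute (2^24 - 1) x 2^104 in double precision.
   Each doubling is exact, so the result is exact. */
double largest_single_by_formula(void)
{
    double v = 16777215.0;        /* 2^24 - 1, i.e. 24 ones in binary */
    for (int i = 0; i < 104; i++)
        v *= 2.0;
    return v;
}
```

Both results should compare equal to `FLT_MAX`.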

Smallest Nonzero Single
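What we want is: 1.0 x 2^-127
But that has exponent & fraction = 0. That value is reserved for zero!
Therefore, the closest we can get is either
1.00000000000000000000001 x 2^-127 (exponent field = 0, fraction = 000...01)
or
1.0 x 2^-126 (exponent field = 1, fraction = 0)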

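Normalized: 1.00000000000000000000001 x 2^-127
In this case, using a normalized number is not ideal. If we could use a denormalized number (leading bit 0 instead of 1), the same fields would give a much smaller value:
Denormalized: 0.00000000000000000000001 x 2^-126
This is equivalent to 1.0 x 2^-149, which reaches 22 binary places below the 2^-127 we originally wanted.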

The IEEE 754 designers realized this: when the exponent field is zero and the fraction is > 0, the value is treated as a denormalized number.
The smallest nonzero normalized: exp = 1, m = 0000….0
The smallest nonzero denormalized: exp = 0, m = 0000….1

Smallest Nonzero Normalized Single
biased exponent = 1
fraction = 0
value = 1.0 x 2^(1 - 127) = 2^-126, approximately 1.175 x 10^-38

Smallest Nonzero Denormalized Single
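biased exponent = 0
fraction = 00000000000000000000001
value = 0.00000000000000000000001 x 2^-126 = 2^-149, approximately 1.4 x 10^-45
*Note: even though an exponent field of 0 would decode as 0 - 127 = -127, a denormalized value is computed using the smallest “valid” exponent, which is -126.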
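Both smallest values can be built from their bit patterns and compared with `FLT_MIN`. This is a sketch under the assumptions that `float` is a 32-bit IEEE 754 single and that denormalized numbers are not flushed to zero (some fast-math compiler modes do flush them); the function names are invented for the example.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes 32-bit IEEE 754 float). */
static float from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Biased exponent = 1, fraction = 0: 1.0 x 2^-126. */
float smallest_normalized(void)
{
    return from_bits(1u << 23);    /* 0x00800000 */
}

/* Biased exponent = 0, fraction = 000...01: 2^-23 x 2^-126 = 2^-149. */
float smallest_denormalized(void)
{
    return from_bits(1u);          /* 0x00000001 */
}
```

`smallest_normalized()` should equal `FLT_MIN`, and multiplying `smallest_denormalized()` by 2^23 (8388608) should give the same value, since 2^-149 x 2^23 = 2^-126.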
