CS 286 Computer Architecture & Organization

Presentation on theme: "CS 286 Computer Architecture & Organization"— Presentation transcript:

1 CS 286 Computer Architecture & Organization
Floating-Point Numbers & IEEE-754
Department of Computer Science, Southern Illinois University Edwardsville
Fall 2018, Dr. Hiroshi Fujinoki
IEEE754/001

2 CS 286 Computer Architecture & Organization
Introduction
Two’s complement signed integers are an efficient representation for integers. Two’s complement signed integers, however, cannot:
- Handle fractions
- Handle huge numbers (numbers > 2^(N-1) - 1 for an N-bit two’s complement integer). Astronomy needs those huge numbers.
- Handle really small numbers (numbers < -2^(N-1) for an N-bit two’s complement integer). Physics needs those small numbers.
Fractions used in computers:
- Fixed-point notations: can handle fractions, but can’t handle huge and small numbers
- Floating-point notations: can handle fractions and can handle huge and small numbers; used for fractions in most computers
IEEE754/002

3 CS 286 Computer Architecture & Organization
Fixed-point notations
The decimal point is fixed at a certain bit position.
16-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | integer bits | fraction bits
Fraction bit weights (MSB to LSB): 2^-1 (1/2), 2^-2 (1/4), 2^-3 (1/8), 2^-4 (1/16), 2^-5 (1/32), 2^-6 (1/64), 2^-7 (1/128), 2^-8 (1/256)
Example: fraction bits 01101010 = 1/4 + 1/8 + 1/32 + 1/128
IEEE754/003
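
As a minimal sketch of the fixed-point layout on this slide, the Python snippet below decodes a 16-bit word. It assumes 1 sign bit, 7 integer bits, and 8 fraction bits in sign-magnitude form; that split and the name `decode_fixed16` are assumptions made for illustration, since the slide only says the point sits at a fixed bit position.

```python
def decode_fixed16(word: int) -> float:
    """Decode a 16-bit sign-magnitude fixed-point word (assumed 1/7/8 split)."""
    sign = -1.0 if (word >> 15) & 1 else 1.0
    integer_part = (word >> 8) & 0x7F      # 7 integer bits
    fraction_bits = word & 0xFF            # 8 fraction bits, weights 2^-1 .. 2^-8
    return sign * (integer_part + fraction_bits / 256)

# Fraction bits 0110 1010 select 1/4 + 1/8 + 1/32 + 1/128.
example = decode_fixed16(0b0_0000000_01101010)
largest = decode_fixed16(0b0_1111111_11111111)   # sign 0, every other bit 1
smallest = decode_fixed16(0b1_1111111_11111111)  # sign 1, every other bit 1
print(example, largest, smallest)
```

Under these assumptions the extremes land just inside +128 and -128, which is why a 16-bit fixed-point word covers only a modest range.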

4 CS 286 Computer Architecture & Organization
Fixed-point notations
The decimal point is fixed at a certain bit position.
16-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | integer bits | fraction bits
The largest number (sign bit 0, all other bits 1): approximately +128. Is this really a large number? Not quite large.
The smallest number (sign bit 1, all other bits 1): approximately -128. Is this really a small number? Not quite small.
IEEE754/004

5 CS 286 Computer Architecture & Organization
Floating-point notations
General implementation format: ± S × 10^(± E), where ± is the sign, S is the significand (mantissa), and E is the exponent.
The significand consists of an integer part, a fraction part, and a decimal-point locator.
Example: 976,000,000,000,000 = 9.76 × 10^14 = +9.76 × 10^(+14)
All of the following are the same value:
- 0.00976 × 10^17
- 0.976 × 10^15 (move the point to the left by one position, increase the exponent by one)
- 97.6 × 10^13 (move the point to the right by one position, decrease the exponent by one)
- 976,000,000,000,000 × 10^0
Why is this notation efficient?
- Can handle fractions: YES
- Can handle huge numbers (by having a large positive exponent)
- Can handle tiny or really small numbers
IEEE754/005
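
The rule that moving the point and adjusting the exponent leaves the value unchanged can be checked with exact decimal arithmetic. This is only an illustrative sketch; the list name `forms` is made up for the example.

```python
from decimal import Decimal

# Each pair is (significand, exponent) for significand x 10^exponent.
forms = [
    (Decimal("9.76"), 14),
    (Decimal("97.6"), 13),   # point moved right, exponent decreased by one
    (Decimal("0.976"), 15),  # point moved left, exponent increased by one
]
values = [s.scaleb(e) for s, e in forms]
all_equal = all(v == Decimal(976_000_000_000_000) for v in values)
print(all_equal)
```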

6 CS 286 Computer Architecture & Organization
Binary implementation of floating-point numbers
This is how computers implement floating-point numbers internally.
General implementation format: ± S × 2^(± E), where ± is the sign, S is the significand (mantissa), and E is the exponent. Only 0 and 1 (binary numbers) are used, including in the exponent, e.g. 2^(101011).
Examples: the following representations are all the same value, +11000 in binary (“24” in decimal):
- +11000 × 2^0
- +0.11 × 2^5
- +0.0011 × 2^7
- +11000000 × 2^(-3)
IEEE754/006
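
A small sketch can confirm that different binary significand and exponent pairs name the same value, 24. The helper names `binary_to_fraction` and `scaled` are hypothetical, and the significands paired with 2^7 and 2^(-3) are worked out here from the value 24 rather than read off the slide.

```python
from fractions import Fraction

def binary_to_fraction(digits: str) -> Fraction:
    """Convert an unsigned binary string such as '0.11' to an exact Fraction."""
    whole, _, frac = digits.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    for position, bit in enumerate(frac, start=1):
        value += Fraction(int(bit), 2 ** position)
    return value

def scaled(digits: str, exponent: int) -> Fraction:
    """Return significand x 2^exponent exactly."""
    return binary_to_fraction(digits) * Fraction(2) ** exponent

results = [
    scaled("11000", 0),
    scaled("0.11", 5),
    scaled("0.0011", 7),
    scaled("11000000", -3),
]
print(results)
```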

7 CS 286 Computer Architecture & Organization
IEEE 754 Floating-Point Standard
The standard that most processors use today.
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127) | significand (23 bits)
The IEEE-754 Floating-Point Standard uses a “bias of 127”: the 8-bit exponent field stores the exponent plus 127, so its 256 bit patterns cover exponents from -127 to 128.
[Table: bit patterns of an 8-bit number for decimal values from 255 down to -127 under a bias of 0, a bias of 2, and a bias of 127; values a given bias cannot represent are marked N.A.]
IEEE754/007
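
The bias of 127 amounts to adding 127 before storing the exponent and subtracting it when reading the field back. The sketch below shows this; the function names are illustrative only.

```python
BIAS = 127  # bias for the 8-bit single-precision exponent field

def encode_exponent(exponent: int) -> str:
    """Return the 8-bit pattern that stores `exponent` under a bias of 127."""
    stored = exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("exponent does not fit in 8 bits")
    return format(stored, "08b")

def decode_exponent(pattern: str) -> int:
    """Recover the exponent from an 8-bit biased pattern."""
    return int(pattern, 2) - BIAS

print(encode_exponent(5), decode_exponent("00000000"), decode_exponent("11111111"))
```

The all-0 and all-1 patterns decode to -127 and 128, the two ends of the raw range.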

8 CS 286 Computer Architecture & Organization
IEEE 754 Floating-Point Standard
The standard that most processors use today.
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -127 to 128) | significand (23 bits)
Example for “significand”:
18.5(10) = 16(10) + 2(10) + 0.5(10) = 10010.1(2) = 10010.1(2) × 2^0
This format is NOT acceptable: there is more than one digit above the decimal point!
Make sure that we have only one digit above the decimal point:
10010.1(2) × 2^0 = 1.00101(2) × 2^4 = 0.100101(2) × 2^5 = 0.0100101(2) × 2^6
IEEE754/008
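
One way to find the form with a single “1” above the point is Python's `math.frexp`, which returns a mantissa in [0.5, 1) and an exponent; doubling the mantissa and decrementing the exponent moves the leading 1 above the point. The wrapper `normalize` is a hypothetical helper and only handles positive inputs.

```python
import math

def normalize(x):
    """Return (significand, exponent) with 1 <= significand < 2, for x > 0."""
    mantissa, exponent = math.frexp(x)   # x = mantissa * 2**exponent, 0.5 <= mantissa < 1
    return mantissa * 2, exponent - 1

significand, exponent = normalize(18.5)
print(significand, exponent)
```

For 18.5 the significand is 1 + 1/8 + 1/32, that is 1.00101 in binary, with an exponent of 4.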

9 CS 286 Computer Architecture & Organization
Three Exceptions in “Exponent Bit”
[Table: 8-bit patterns for decimal values from 255 down to -127 under a bias of 0, a bias of 2, and a bias of 127; unrepresentable values are marked N.A.]
- The pattern of all 1’s (for “128”) is NOT used as an exponent, so the max. exponent is 127. It is used for the infinity and specific error codes.
- The pattern of all 0’s (“-127”) is NOT used as an exponent, so the min. exponent is -126.
- The pattern of all 0’s (“-127”) with all 0’s in the significand bits is reserved for “0”.
IEEE754/009

10 CS 286 Computer Architecture & Organization
Three Exceptions in “Exponent Bit”
- The pattern of all 0’s (“-127”) with all 0’s in the significand bits is reserved for “0”.
This means that “000 … 00” in the biased exponent (8 bits) and significand (23 bits) fields of the 32-bit word represents zero.
IEEE754/010
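
The three reserved cases can be told apart by looking only at the exponent and significand fields. The following sketch uses Python's `struct` module to get the 32-bit pattern of a value; `classify` and `bits_of` are illustrative names. The label for an all-0 exponent with a nonzero significand (“subnormal”) comes from IEEE 754 itself and is not covered by the slides.

```python
import struct

def bits_of(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def classify(bits: int) -> str:
    """Classify a 32-bit pattern by its exponent and significand fields."""
    exponent_field = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent_field == 0xFF:            # all 1's: infinity or an error code (NaN)
        return "infinity" if fraction == 0 else "NaN"
    if exponent_field == 0:               # all 0's: zero, or a subnormal (not in the slides)
        return "zero" if fraction == 0 else "subnormal"
    return "normalized"

print(classify(bits_of(0.0)), classify(bits_of(float("inf"))), classify(bits_of(1.0)))
```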

11 CS 286 Computer Architecture & Organization
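Examples
Encoding a positive number with exponent 5 (+ S × 2^5) in the 32-bit layout: sign bit | biased exponent (8 bits, a bias of 127) | significand (23 bits)
- Sign bit: 0 (0 = positive, 1 = negative), so the number is “+”
- Exponent “5” in bias of 127: 5 + 127 = 132 = 10000100
[Table: 8-bit patterns for decimal values from 255 down to -127 under a bias of 0 and a bias of 127; unrepresentable values are marked N.A.]
IEEE754/011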
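
To see the three fields of a stored number, a value can be packed as a 32-bit float and its bit string split at the field boundaries. The value 40.0 (1.01 in binary × 2^5) is chosen here purely as an illustration with exponent 5; it is not necessarily the number on the slide, and `fields` is a hypothetical helper.

```python
import struct

def fields(x: float):
    """Split x (stored as a 32-bit float) into sign, exponent, and fraction bit strings."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    pattern = format(bits, "032b")
    return pattern[0], pattern[1:9], pattern[9:]

sign, exponent, fraction = fields(40.0)   # 40 = 1.01 (binary) x 2^5
print(sign, exponent, fraction)
```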

12 CS 286 Computer Architecture & Organization
Normalized Notation
Always align the first left-most “1” in the significand at the first digit above the decimal point, so that “1.” always appears.
[Table: unnormalized forms (with exponents 2^5, 2^81, 2^-8, 2^-2) next to normalized forms (+1.1 × 2^4, +1.01 × 2^79, … × 2^-3, +1 × 2^-7)]
Why normalized notation?
IEEE754/012

13 CS 286 Computer Architecture & Organization
Normalized Notation
Always align the first left-most “1” in the significand at the first digit above the decimal point.
For any normalized floating-point number, the bit above the decimal point is ALWAYS “1”, so it does not have to be stored.
Example: +1.10 × 2^4 in the 32-bit layout: sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127) | significand (23 bits)
[Figure: the bit pattern shown three times, ending with the leading “1” dropped from the significand field]
You got one more significand bit!
IEEE754/013
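
The effect of leaving the leading “1” unstored can be sketched by building a 32-bit pattern from its fields and letting `struct` interpret it as a float. The helper `assemble` is hypothetical; it takes the true exponent and only the bits after “1.”.

```python
import struct

def assemble(sign: int, exponent: int, fraction_bits: str) -> float:
    """Build a 32-bit float from a sign, a true exponent, and the bits after '1.'."""
    stored_exponent = exponent + 127                 # apply the bias of 127
    fraction = int(fraction_bits.ljust(23, "0"), 2)  # the leading 1 is not stored
    bits = (sign << 31) | (stored_exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]

value = assemble(0, 4, "10")   # +1.10 (binary) x 2^4
print(value)
```

Only “10” followed by zeros goes into the 23-bit field, yet the decoded value is 1.5 × 16, because the hardware supplies the implied leading 1.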

14 CS 286 Computer Architecture & Organization
Format of typical floating-point notations
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
[Figure: number line from smaller (negative) to larger (positive), marking the smallest negative, largest negative, smallest positive, and largest positive numbers]
IEEE754/014

15 CS 286 Computer Architecture & Organization
Format of typical floating-point notations
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
The four end points of the number line (exponents given as true values; -126 is stored as 1 and 127 as 254):
- Smallest negative: sign = 1, exponent = 127, significand = all “1”
- Largest negative: sign = 1, exponent = -126, significand = all “0”
- Smallest positive: sign = 0, exponent = -126, significand = all “0”
- Largest positive: sign = 0, exponent = 127, significand = all “1”
IEEE754/015
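
The four end points can be evaluated by writing out their bit patterns and decoding them. This is a sketch using Python's `struct` module, with `from_bits` as an illustrative helper name.

```python
import struct

def from_bits(bits: int) -> float:
    """Interpret a 32-bit integer pattern as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

largest_positive = from_bits(0x7F7FFFFF)    # sign 0, exponent 127, significand all 1s
smallest_positive = from_bits(0x00800000)   # sign 0, exponent -126, significand all 0s
smallest_negative = from_bits(0xFF7FFFFF)   # sign 1, exponent 127, significand all 1s
largest_negative = from_bits(0x80800000)    # sign 1, exponent -126, significand all 0s
print(largest_positive, smallest_positive)
```

With the hidden bit included, the largest magnitude is (2 - 2^-23) × 2^127 and the smallest normalized magnitude is 2^-126.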

16 CS 286 Computer Architecture & Organization
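Format of typical floating-point notations
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
Regions outside the representable ranges (exponents given as true values):
- Negative overflow: below the smallest negative number (sign = 1, exponent = 127, significand = all “1”)
- Negative underflow: between the largest negative number (sign = 1, exponent = -126, significand = all “0”) and zero
- Positive underflow: between zero and the smallest positive number (sign = 0, exponent = -126, significand = all “0”)
- Positive overflow: above the largest positive number (sign = 0, exponent = 127, significand = all “1”)
IEEE754/016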
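
The overflow and underflow regions can be expressed as comparisons against the largest and smallest normalized magnitudes. The sketch below is a simplification that ignores rounding at the boundaries; `region` and the two constants are names introduced for the example.

```python
MAX_NORMAL = (2 - 2.0 ** -23) * 2.0 ** 127   # largest normalized magnitude
MIN_NORMAL = 2.0 ** -126                     # smallest normalized magnitude

def region(x: float) -> str:
    """Name the region of the single-precision number line that x falls in."""
    if x == 0:
        return "zero"
    side = "positive" if x > 0 else "negative"
    magnitude = abs(x)
    if magnitude > MAX_NORMAL:
        return side + " overflow"
    if magnitude < MIN_NORMAL:
        return side + " underflow"
    return side + " representable range"

print(region(1e39), region(-1e-40), region(3.5))
```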

17 CS 286 Computer Architecture & Organization
Format of typical floating-point notations
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
[Figure: number line with a negative side and a positive side, each labeled with 2^(-148) and 2^(104)]
IEEE754/017

18 CS 286 Computer Architecture & Organization
Format of typical floating-point notations
32-bit layout (MSB to LSB): sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
[Figure: number line with a negative side and a positive side]
- Near zero, on both sides: we got high accuracy for tiny numbers.
- Toward both ends: we can represent huge numbers, but we lost accuracy.
IEEE754/018
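
The loss of accuracy for huge numbers shows up as a growing gap between neighbouring representable values. The sketch below measures that gap by stepping the 32-bit pattern by one, on the assumption that the 2^(104) and 2^(-148) labels on slide 17 are meant as such gaps; the helper names are illustrative.

```python
import struct

def from_bits(bits: int) -> float:
    """Interpret a 32-bit integer pattern as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def gap_above(bits: int) -> float:
    """Distance from the float with this pattern to the next larger pattern."""
    return from_bits(bits + 1) - from_bits(bits)

gap_near_smallest = gap_above(0x00800000)   # just above 2^-126
gap_near_one = gap_above(0x3F800000)        # just above 1.0
gap_near_largest = gap_above(0x7F7FFFFE)    # just below the largest value
print(gap_near_smallest, gap_near_one, gap_near_largest)
```

With a 23-bit fraction the gap at the top of the range comes out as 2^104, while the gap just above 2^-126 comes out as 2^-149, one binary order below the 2^(-148) label.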

19 CS 286 Computer Architecture & Organization
IEEE-754 Floating-Point Notations
- Single-Float (single-precision floating-point), “float” in C/C++: 32 bits = sign bit (0 = positive, 1 = negative) | biased exponent (8 bits, a bias of 127, exponents -126 to 127) | significand (23 bits)
- Double-Float (double-precision floating-point), “double” in C/C++: 64 bits = sign bit (0 = positive, 1 = negative) | biased exponent (11 bits) | significand (52 bits)
IEEE754/019
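
The practical difference between the two formats is precision. As a rough sketch, the snippet below rounds 0.1 to single precision and compares it with the double-precision value, relying on the fact that a CPython `float` is a 64-bit double; `as_single` is a hypothetical helper.

```python
import struct

def as_single(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit single."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

single_bits = struct.calcsize(">f") * 8   # storage size of a single, in bits
double_bits = struct.calcsize(">d") * 8   # storage size of a double, in bits
single_tenth = as_single(0.1)
error = abs(single_tenth - 0.1)           # what is lost with only 23 significand bits
print(single_bits, double_bits, single_tenth, error)
```

Values such as 0.5 that need few significand bits survive the round trip unchanged, while 0.1 does not.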


