CS 286 Computer Architecture & Organization
Floating-Point Numbers & IEEE-754
Department of Computer Science, Southern Illinois University Edwardsville
Fall 2018, Dr. Hiroshi Fujinoki
Introduction
Two’s complement is an efficient representation for signed integers. However, an N-bit two’s complement integer cannot:
- Handle fractions
- Handle huge numbers (numbers > 2^(N-1) - 1); astronomy needs those huge numbers
- Handle really small numbers (numbers < -2^(N-1)); physics needs those small numbers
Fractions are represented in computers in one of two ways:
- Fixed-point notations: can handle fractions, but cannot handle huge and small numbers
- Floating-point notations: can handle fractions, and can handle huge and small numbers; used for fractions in most computers
Fixed-point notations
The decimal point is fixed at a certain bit position.
16-bit format, from MSB to LSB:
- Sign bit (0 = positive, 1 = negative)
- Integer bits
- Fraction bits
Weights of the eight fraction bits, from MSB to LSB:
2^-1 (1/2), 2^-2 (1/4), 2^-3 (1/8), 2^-4 (1/16), 2^-5 (1/32), 2^-6 (1/64), 2^-7 (1/128), 2^-8 (1/256)
Example: fraction bits 01101010 = 1/4 + 1/8 + 1/32 + 1/128
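The fraction example can be checked with a few lines of Python. This is only a sketch: the layout (1 sign bit, 7 integer bits, 8 fraction bits, sign-magnitude) is an assumption read off the slide, and `decode_fixed_point` is a hypothetical helper name.

```python
def decode_fixed_point(word: int, frac_bits: int = 8) -> float:
    """Decode a 16-bit fixed-point word.

    Assumed layout (not confirmed by the slides): bit 15 is the sign,
    the remaining 15 bits are a magnitude with `frac_bits` fraction bits.
    """
    sign = -1.0 if (word >> 15) & 1 else 1.0
    magnitude = word & 0x7FFF            # integer bits + fraction bits
    return sign * magnitude / (1 << frac_bits)

# sign 0, integer bits 0000000, fraction bits 01101010
print(decode_fixed_point(0b0_0000000_01101010))   # 0.4140625
```

Dividing by 2^8 is what "fixes" the point: the same integer pattern always has its point eight bits from the right.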
Fixed-point notations
The decimal point is fixed at a certain bit position.
- The largest number in the 16-bit format (sign bit 0, all integer and fraction bits 1) is approximately +128. Is this really a large number? Not quite large.
- The smallest number (sign bit 1, all integer and fraction bits 1) is approximately -128. Is this really a small number? Not quite small.
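The "approximately +128 / -128" limits follow directly from the bit widths. The sketch below assumes the same sign-magnitude layout with 8 fraction bits; `fixed_point_range` is a hypothetical helper, not something from the slides.

```python
def fixed_point_range(total_bits: int = 16, frac_bits: int = 8):
    """Return (smallest, largest) for an assumed sign-magnitude fixed-point format."""
    magnitude_bits = total_bits - 1                  # everything except the sign bit
    largest = ((1 << magnitude_bits) - 1) / (1 << frac_bits)
    return -largest, largest

smallest, largest = fixed_point_range()
print(smallest, largest)   # -127.99609375 127.99609375
```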
Floating-point notations
General implementation format: ± S × 10^(± E)
- Sign
- Significand (mantissa) S: integer part and fraction part
- Exponent E: the decimal-point locator
Example: 976,000,000,000,000 = +9.76 × 10^(+14)
All of the following have the same value:
- 9.76 × 10^14
- 97.6 × 10^13 (move the point to the right by one position, decrease the exponent by one)
- 0.976 × 10^15 (move the point to the left by one position, increase the exponent by one)
- 0.00976 × 10^17
- 976,000,000,000,000 × 10^0
Why is this notation efficient?
- Can handle fractions: YES
- Can handle huge numbers: YES (by having a large positive exponent)
- Can handle tiny (really small) numbers: YES (by having a large negative exponent)
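The "move the point, adjust the exponent" rule can be tried out with Python's `decimal` module, which keeps decimal digits exact. The helper names `value` and `move_point_right` are illustrative only.

```python
from decimal import Decimal

def value(significand: str, exponent: int) -> Decimal:
    """Evaluate significand x 10^exponent exactly."""
    return Decimal(significand) * Decimal(10) ** exponent

def move_point_right(significand: str, exponent: int, places: int = 1):
    """Move the decimal point right by `places` (negative = left).

    The exponent moves the opposite way so the value stays the same.
    """
    shifted = Decimal(significand).scaleb(places)
    return shifted, exponent - places

print(move_point_right("9.76", 14))       # 97.6 with exponent 13
print(move_point_right("9.76", 14, -1))   # 0.976 with exponent 15
```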
Binary implementation of floating-point numbers
This is how computers implement floating-point numbers internally.
General implementation format: ± S × 2^(± E)
- Sign
- Significand (mantissa) S: only 0 and 1 (binary numbers)
- Exponent E: also a binary number, e.g. 2^(101011)
Examples: the following representations all have the same value, +11000 in binary (“24” in decimal):
- +11000 × 2^0
- +0.11 × 2^5
- +0.0011 × 2^7
- +11000000 × 2^(-3)
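To confirm that different significand/exponent pairs denote the same number, a binary significand can be evaluated exactly with `fractions.Fraction`. `binary_value` is a hypothetical helper written for this check.

```python
from fractions import Fraction

def binary_value(significand: str, exponent: int) -> Fraction:
    """Evaluate a binary significand such as '+0.11' times 2^exponent, exactly."""
    sign = -1 if significand.startswith("-") else 1
    digits = significand.lstrip("+-")
    integer_part, _, fraction_part = digits.partition(".")
    # Read all digits as one integer, then put the point back by dividing.
    mantissa = Fraction(int(integer_part + fraction_part, 2), 2 ** len(fraction_part))
    return sign * mantissa * Fraction(2) ** exponent

print(binary_value("+11000", 0))   # 24
print(binary_value("+0.11", 5))    # 24
```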
IEEE 754 Floating-Point Standard
The standard that most processors use today.
32-bit format, from MSB to LSB:
- Sign bit (1 bit): 0 = positive, 1 = negative
- Biased exponent (8 bits): a bias of 127, so the 8-bit patterns cover -127 to 128
- Significand (23 bits)
The IEEE-754 Floating-Point Standard uses a “bias of 127”: the stored 8-bit pattern is the actual exponent plus 127.
[Table: bit patterns for 8-bit numbers and the decimals (255 ... 128 ... -127) they represent under a bias of 0, a bias of 127, and a bias of 2; some entries are N. A.]
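A bias is just a fixed offset added before storing the exponent. The following sketch shows the mapping for the 8-bit field; `to_biased` and `from_biased` are illustrative names.

```python
BIAS = 127

def to_biased(exponent: int) -> int:
    """Return the 8-bit pattern (as an int) that stores `exponent`."""
    if not -127 <= exponent <= 128:
        raise ValueError("exponent does not fit in 8 bits with a bias of 127")
    return exponent + BIAS

def from_biased(pattern: int) -> int:
    """Return the exponent stored in an 8-bit pattern."""
    return pattern - BIAS

print(format(to_biased(5), "08b"))   # 10000100
print(from_biased(0b00000000))       # -127
print(from_biased(0b11111111))       # 128
```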
IEEE 754 Floating-Point Standard
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -127 to 128), significand (23 bits).
Example for the “significand”:
18.5 (decimal) = 16 + 2 + 0.5 (decimal) = 10010.1 (binary) = 10010.1 (binary) × 2^0
This format is NOT acceptable: there is more than one digit above the decimal point!
Make sure that we have only one digit above the decimal point:
= 1.00101 (binary) × 2^4 = 0.100101 (binary) × 2^5 = 0.0100101 (binary) × 2^6
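The first step of the example, writing 18.5 in binary, can be sketched as repeated doubling of the fraction part. `to_binary` is a hypothetical helper and handles non-negative values only.

```python
def to_binary(value: float, max_frac_digits: int = 23) -> str:
    """Return the binary expansion of a non-negative number as a string."""
    integer_part = int(value)
    fraction = value - integer_part
    digits = []
    # Each doubling pushes the next fraction bit above the point.
    while fraction and len(digits) < max_frac_digits:
        fraction *= 2
        bit = int(fraction)
        digits.append(str(bit))
        fraction -= bit
    text = bin(integer_part)[2:]
    return text + "." + "".join(digits) if digits else text

print(to_binary(18.5))   # 10010.1
```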
Three Exceptions in “Exponent Bit”
- The pattern of all 1’s (for “128”) is NOT used for ordinary numbers. It is used for infinity and specific error codes, so the max. exponent is 127.
- The pattern of all 0’s (“-127”) is NOT used for ordinary (normalized) numbers, so the min. exponent is -126.
- The pattern of all 0’s (“-127”) with all 0’s in the significand bits is reserved for “0”.
Three Exceptions in “Exponent Bit”
The pattern of all 0’s (“-127”) with all 0’s in the significand bits is reserved for “0”.
This means “00 … 00” in the 32-bit format: the 8 biased-exponent bits and the 23 significand bits are all 0.
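The reserved exponent patterns can be inspected from Python by viewing a value's 32-bit pattern through the `struct` module. This is a sketch: `classify` is a hypothetical helper, and the "subnormal" and "NaN" labels come from the IEEE 754 standard rather than from these slides.

```python
import struct

def classify(x: float) -> str:
    """Classify a value by the exponent and significand fields of its float32 pattern."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exponent == 0xFF:              # all 1's: infinity or an error code
        return "infinity" if fraction == 0 else "NaN"
    if biased_exponent == 0:                 # all 0's: zero (or a subnormal)
        return "zero" if fraction == 0 else "subnormal"
    return "normalized"

print(classify(0.0))            # zero
print(classify(float("inf")))   # infinity
print(classify(18.5))           # normalized
```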
Examples
A positive number with exponent 2^5 in the 32-bit format:
- Sign bit: 0 (0 = positive, 1 = negative), so the number is “+”
- Biased exponent (8 bits, a bias of 127): exponent “5” in bias of 127 is 5 + 127 = 132 = 10000100
- Significand (23 bits)
Normalized Notation
Always align the left-most “1” in the significand at the first digit above the decimal point, so that “1.” always appears.
[Table: unnormalized forms (exponents 2^5, 2^81, 2^-8, 2^-2) next to normalized forms (+1.1 × 2^4, +1.01 × 2^79, × 2^-3, +1 × 2^-7)]
Why normalized notation?
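Normalization can be sketched with `math.frexp`, which splits a number into a mantissa in [0.5, 1) and a power of two; doubling the mantissa and lowering the exponent by one gives the "1.xxx" form. `normalize` is a hypothetical helper name.

```python
import math

def normalize(value: float):
    """Return (significand, exponent) with 1 <= |significand| < 2."""
    if value == 0 or math.isinf(value) or math.isnan(value):
        raise ValueError("only finite non-zero values can be normalized")
    mantissa, exponent = math.frexp(abs(value))     # mantissa in [0.5, 1)
    significand = math.copysign(mantissa * 2, value)
    return significand, exponent - 1

print(normalize(24.0))   # (1.5, 4), i.e. +1.1 (binary) x 2^4
```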
Normalized Notation
Always align the left-most “1” in the significand at the first digit above the decimal point.
For any normalized floating-point number, this bit is ALWAYS “1”, so it does not need to be stored.
Example: +1.10 × 2^4 in the 32-bit format (sign bit: 0 = positive, 1 = negative; biased exponent: 8 bits with a bias of 127; significand: 23 bits).
You got one more significand bit!
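As a rough check of the example, the three fields of +1.10 × 2^4 can be assembled by hand and compared with what `struct` produces for 24.0. `pack_normalized` and `bits_to_float` are hypothetical helpers; only the digits after the leading "1." are stored.

```python
import struct

def pack_normalized(sign: int, exponent: int, fraction_bits: str) -> int:
    """Build a float32 bit pattern from a sign, an exponent, and the digits after '1.'."""
    fraction = int(fraction_bits.ljust(23, "0"), 2)   # pad to 23 stored bits
    return (sign << 31) | ((exponent + 127) << 23) | fraction

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

bits = pack_normalized(0, 4, "10")     # +1.10 (binary) x 2^4
print(hex(bits), bits_to_float(bits))  # 0x41c00000 24.0
```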
Format of typical floating-point notations
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -126 to 127), significand (23 bits).
[Figure: number line running from smaller (negative) to larger (positive) values, marking the smallest negative, largest negative, smallest positive, and largest positive numbers]
Format of typical floating-point notations
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -126 to 127), significand (23 bits).
The four extreme values on the number line:
- Smallest negative: Sign = 1, BE = 127, Significand = all “1”
- Largest negative: Sign = 1, BE = -126, Significand = all “0”
- Smallest positive: Sign = 0, BE = -126, Significand = all “0”
- Largest positive: Sign = 0, BE = 127, Significand = all “1”
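The extreme values can be computed by building the corresponding 32-bit patterns and decoding them. This is a sketch: `bits_to_float` is a hypothetical helper, and the hex constants are the field settings listed for each extreme written out as one word.

```python
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

largest_positive = bits_to_float(0x7F7FFFFF)    # sign 0, exponent 127, significand all 1
smallest_positive = bits_to_float(0x00800000)   # sign 0, exponent -126, significand all 0
largest_negative = bits_to_float(0x80800000)    # sign 1, exponent -126, significand all 0
smallest_negative = bits_to_float(0xFF7FFFFF)   # sign 1, exponent 127, significand all 1

print(largest_positive, smallest_positive)
```

The largest value is (2 - 2^-23) × 2^127, roughly 3.4 × 10^38, and the smallest positive normalized value is 2^-126, roughly 1.2 × 10^-38.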
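Format of typical floating-point notations
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -126 to 127), significand (23 bits).
The four out-of-range regions on the number line:
- Negative overflow: below the smallest negative number (Sign = 1, BE = 127, Significand = all “1”)
- Negative underflow: between the largest negative number (Sign = 1, BE = -126, Significand = all “0”) and zero
- Positive underflow: between zero and the smallest positive number (Sign = 0, BE = -126, Significand = all “0”)
- Positive overflow: above the largest positive number (Sign = 0, BE = 127, Significand = all “1”)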
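The four regions can be expressed as simple comparisons against the largest and smallest normalized magnitudes. This sketch follows the slide's model and ignores subnormal numbers, which real IEEE 754 hardware uses to fill part of the underflow gap; `region` is a hypothetical helper.

```python
LARGEST = (2 - 2.0 ** -23) * 2.0 ** 127   # largest normalized float32 magnitude
SMALLEST = 2.0 ** -126                    # smallest normalized float32 magnitude

def region(x: float) -> str:
    """Say where x falls relative to the normalized single-precision range."""
    if x > LARGEST:
        return "positive overflow"
    if x < -LARGEST:
        return "negative overflow"
    if 0 < x < SMALLEST:
        return "positive underflow"
    if -SMALLEST < x < 0:
        return "negative underflow"
    return "in range"

print(region(1e39))    # positive overflow
print(region(1e-40))   # positive underflow
```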
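Format of typical floating-point notations
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -126 to 127), significand (23 bits).
Gap between adjacent representable numbers on the number line:
- Near the smallest positive and the largest negative numbers: 2^(-126) × 2^(-23) = 2^(-149)
- Near the largest positive and the smallest negative numbers: 2^(127) × 2^(-23) = 2^(104)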
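One way to measure the spacing of single-precision numbers is to decode two consecutive bit patterns and subtract. The helpers `bits_to_float` and `gap_above` are hypothetical; the two patterns probed are the smallest normalized number and the number just below the largest one.

```python
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def gap_above(bits: int) -> float:
    """Distance from the float32 with this pattern to the next larger one."""
    return bits_to_float(bits + 1) - bits_to_float(bits)

gap_near_zero = gap_above(0x00800000)   # just above 2^-126
gap_near_max = gap_above(0x7F7FFFFE)    # just below the largest number

print(gap_near_zero == 2.0 ** -149, gap_near_max == 2.0 ** 104)   # True True
```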
Format of typical floating-point notations
32-bit format: sign bit (0 = positive, 1 = negative), biased exponent (8 bits, a bias of 127, -126 to 127), significand (23 bits).
- For tiny numbers (near zero), we got high accuracy.
- We can represent huge numbers, but we lost accuracy for them, on both the negative side and the positive side.
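The loss of accuracy for large values shows up as soon as a number needs more than 24 significant bits. A minimal sketch, using a hypothetical `round_to_float32` helper that round-trips a Python float through the 32-bit format:

```python
import struct

def round_to_float32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# 2^24 + 1 needs 25 significant bits, so the +1 is lost.
print(round_to_float32(16777217.0))        # 16777216.0
# Near 0.5 the spacing is 2^-24, so a tiny increment survives.
print(round_to_float32(0.5 + 2.0 ** -24) == 0.5 + 2.0 ** -24)   # True
```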
IEEE-754 Floating-Point Notations
Single-Float (single-precision floating-point), “float” in C/C++: 32 bits
- Sign bit (1 bit): 0 = positive, 1 = negative
- Biased exponent (8 bits): a bias of 127, -126 to 127
- Significand (23 bits)
Double-Float (double-precision floating-point), “double” in C/C++: 64 bits
- Sign bit (1 bit): 0 = positive, 1 = negative
- Biased exponent (11 bits)
- Significand (52 bits)
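The two layouts can be compared by splitting the same value into its fields in each format. This is a sketch with hypothetical helper names; the double-precision bias of 1023 comes from the IEEE 754 standard and is not stated on the slide.

```python
import struct

def float32_fields(x: float):
    """Return (sign, biased exponent, stored significand) of x as a float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & ((1 << 23) - 1)

def float64_fields(x: float):
    """Return (sign, biased exponent, stored significand) of x as a float64."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

# 18.5 = 1.00101 (binary) x 2^4
print(float32_fields(18.5))   # biased exponent 4 + 127 = 131
print(float64_fields(18.5))   # biased exponent 4 + 1023 = 1027
```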