1
COMS 361 Computer Organization
Title: Real Numbers Date: 11/16/2004 Lecture Number: 20
2
Announcements
Homework 9: due Thursday, 11/18/2004
Exam 2: Tuesday, 11/23/2004
3
Review
Memory elements: SR flip-flop (unclocked), D latch (clocked), D flip-flop (clocked), master-slave flip-flops, clock
Multiplication
4
Outline
Multiplication
Real numbers
IEEE standard 754 (floating point): single precision, double precision
Decimal to floating point conversion
Limitations
5
Multiplication (version 1)
6
Multiplication (version 1)
The multiplicand is shifted left on every step.
The multiplicand register must be at least (multiplicand bits + multiplier bits) long.
The ALU is the same size as the multiplicand register.
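The version 1 scheme can be sketched in C++ as below. This is only an illustration for 32-bit unsigned operands; the function name `multiply_v1` and the variable names are hypothetical, and the loop stands in for the hardware control steps.

```cpp
#include <cassert>
#include <cstdint>

// Version 1 shift-and-add multiply (sketch, unsigned 32-bit operands).
// The multiplicand lives in a 64-bit register and shifts left each step;
// the multiplier shifts right so its low bit decides whether to add.
std::uint64_t multiply_v1(std::uint32_t multiplicand, std::uint32_t multiplier) {
    std::uint64_t multiplicand_reg = multiplicand;  // 32 + 32 bits wide
    std::uint64_t product = 0;                      // 64-bit product register
    for (int step = 0; step < 32; ++step) {
        if (multiplier & 1u) {
            product += multiplicand_reg;            // 64-bit ALU add
        }
        multiplicand_reg <<= 1;                     // shift multiplicand left
        multiplier >>= 1;                           // shift multiplier right
    }
    assert(multiplier == 0);                        // every multiplier bit was consumed
    return product;
}
```

With the operands from the worked example, `multiply_v1(2, 11)` (0010 x 1011) returns 22.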
7
Multiplication (version 1)
[Figure: worked example, 0010 x 1011, with the multiplier shifting right through 1011, 0101, 0010, 0001]
8
Multiplication (version 2)
At least half of the bits in the 64-bit multiplicand register are always zero, so a 64-bit adder is wasteful.
A left shift inserts zeros on the right end of the multiplicand, so the right part of the product will not change once it has been formed.
Shift the product register right instead.
This fixes the multiplicand in place, so the multiplicand register and the ALU need only 32 bits.
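A minimal C++ sketch of the version 2 idea follows, assuming 32-bit unsigned operands. The names `multiply_v2`, `high`, and `low` are hypothetical; `high` and `low` model the two halves of the 64-bit product register, and the carry out of the 32-bit add is kept so it can be shifted back in.

```cpp
#include <cassert>
#include <cstdint>

// Version 2 multiply (sketch): the multiplicand stays fixed in a 32-bit
// register, the ALU is 32 bits wide, and the product register shifts right.
std::uint64_t multiply_v2(std::uint32_t multiplicand, std::uint32_t multiplier) {
    std::uint32_t high = 0;  // left half of the product register
    std::uint32_t low = 0;   // right half of the product register
    for (int step = 0; step < 32; ++step) {
        std::uint32_t carry = 0;
        if (multiplier & 1u) {
            std::uint32_t sum = high + multiplicand;  // 32-bit ALU add
            carry = (sum < high) ? 1u : 0u;           // carry out of the add
            high = sum;
        }
        // Shift {carry, high, low} right by one bit.
        low = (low >> 1) | (high << 31);
        high = (high >> 1) | (carry << 31);
        multiplier >>= 1;
    }
    assert(multiplier == 0);  // every multiplier bit was consumed
    return (static_cast<std::uint64_t>(high) << 32) | low;
}
```

Each partial product is added at the left half and then drifts right, which is why bits that reach the right half never change again.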
9
Multiplication (version 2)
10
Multiplication (version 2)
[Figure: worked example, 0010 x 1011, with the multiplicand fixed at 0010 and the multiplier shifting right through 1011, 0101, 0010, 0001]
11
Multiplication (version 3)
The product register has unused space, large enough to hold the multiplier.
As the unused bits are filled with product bits, the multiplier requires fewer bits.
Initialize the product register with zeros in the left half and the multiplier in the right half.
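The version 3 refinement might look like this in C++. It is a sketch for 32-bit unsigned operands; `multiply_v3` is a hypothetical name, and a single 64-bit variable plays the role of the shared product/multiplier register.

```cpp
#include <cassert>
#include <cstdint>

// Version 3 multiply (sketch): the multiplier starts in the right half of
// the product register and is shifted out as product bits shift in.
std::uint64_t multiply_v3(std::uint32_t multiplicand, std::uint32_t multiplier) {
    std::uint64_t product = multiplier;  // zeros in the left half, multiplier in the right
    for (int step = 0; step < 32; ++step) {
        std::uint64_t carry = 0;
        if (product & 1u) {              // low bit is the current multiplier bit
            std::uint64_t high = (product >> 32) + multiplicand;
            assert(high <= 0x1FFFFFFFEull);  // the sum fits in 33 bits
            carry = high >> 32;              // carry out of the 32-bit add
            product = (high << 32) | (product & 0xFFFFFFFFull);
        }
        product = (product >> 1) | (carry << 63);  // shift right, carry enters at the top
    }
    return product;
}
```

After 32 steps every multiplier bit has been shifted out and the register holds only the product.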
12
Multiplication (version 3)
13
Real Numbers Are these real numbers or are they Sears numbers?
14
Real Numbers
Scientific notation: a number times 10 to a power, for example … $\times 10^{3}$
Normalized scientific notation: zero to the left of the decimal point, for example … $\times 10^{6}$
Advantageous to store real numbers in 32 bits (word size)
15
Real Numbers
$0.101_2 = 0 \times 2^{0} + 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}$
$= 0 + 0.5 + 0 + 0.125 = 0.625_{10}$
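The positional expansion above can be checked with a few lines of C++. This is a sketch only; `binary_fraction_value` is a hypothetical helper that takes the digits to the right of the binary point as a string.

```cpp
#include <cassert>
#include <string>

// Evaluates the digits to the right of the binary point,
// e.g. "101" stands for 0.101 in base 2.
double binary_fraction_value(const std::string& bits) {
    double value = 0.0;
    double weight = 0.5;  // weight of the first digit, 2^-1
    for (char bit : bits) {
        assert(bit == '0' || bit == '1');
        if (bit == '1') {
            value += weight;
        }
        weight /= 2.0;    // next digit is worth half as much
    }
    return value;
}
```

For instance, `binary_fraction_value("101")` gives 0.625, matching 0.5 + 0.125.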
16
Real Numbers Decimal to binary conversion algorithm
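$12_{10} = 0 \times 2^{4} + 1 \times 2^{3} + \dots$, then $12_{10} - 8_{10} = 4_{10}$
$4_{10} = 1 \times 2^{2}$, then $4_{10} - 4_{10} = 0_{10}$
The remaining powers $2^{1}$ and $2^{0}$ get 0, so $12_{10} = 1100_2$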
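The subtract-the-largest-power procedure can be sketched in C++ as follows. The helper name `to_binary` is an assumption made for illustration; it handles unsigned integers only.

```cpp
#include <cassert>
#include <string>

// Converts an unsigned integer to a binary string by repeatedly
// subtracting the largest remaining power of two.
std::string to_binary(unsigned value) {
    if (value == 0) {
        return "0";
    }
    unsigned power = 1;
    while (power <= value / 2) {
        power *= 2;           // largest power of two not exceeding value
    }
    std::string bits;
    for (; power > 0; power /= 2) {
        if (value >= power) {
            bits += '1';      // this power is part of the number
            value -= power;
        } else {
            bits += '0';
        }
    }
    assert(value == 0);       // nothing is left over
    return bits;
}
```

Calling `to_binary(12)` yields "1100": 8 is subtracted, then 4, and the last two positions stay 0.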
17
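Real Numbers
$0.875_{10} = 0.?????_2$
Powers of two: $2^{0} = 1$, $2^{-1} = 0.5$, $2^{-2} = 0.25$, $2^{-3} = 0.125$, $2^{-4} = 0.0625$, $2^{-5} = 0.03125$, $2^{-6} = 0.015625$, $2^{-7} = 0.0078125$, $2^{-8} = 0.00390625$
First bit: $1 \times 2^{-1}$, then $0.875 - 0.5 = 0.375$
Second bit: $1 \times 2^{-2}$, then $0.375 - 0.25 = 0.125$
Third bit: $1 \times 2^{-3}$, then $0.125 - 0.125 = 0_{10}$
Result: $0.875_{10} = 0.111_2$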
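The same procedure for the fractional part might be sketched like this. `fraction_to_binary` and its `max_bits` cutoff are hypothetical; the cutoff is needed because many fractions never reach a remainder of zero.

```cpp
#include <cassert>
#include <string>

// Converts a fraction in [0, 1) to the digits after the binary point by
// subtracting 2^-1, 2^-2, ... whenever they fit. Stops when the remainder
// is zero or after max_bits digits.
std::string fraction_to_binary(double fraction, int max_bits) {
    assert(fraction >= 0.0 && fraction < 1.0);
    std::string bits;
    double weight = 0.5;  // 2^-1
    for (int i = 0; i < max_bits && fraction > 0.0; ++i) {
        if (fraction >= weight) {
            bits += '1';
            fraction -= weight;
        } else {
            bits += '0';
        }
        weight /= 2.0;
    }
    return bits;
}
```

A value such as 0.875 terminates after three digits ("111"), while 0.1 is still unfinished after eight ("00011001"), which previews the precision limits discussed later in the lecture.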
18
Real Number Storage
Real numbers are stored in floating point representation.
IEEE Standard 754: issued 1985; allows using data on different machines.
A floating point number has three parts:
A sign
An exponent
A mantissa, also called a significand (normalized decimal fraction), with a single digit to the left of the decimal point
19
IEEE Standard 754
Provides two floating point types:
Single precision: 24 bits of significand precision
Double precision: 53 bits of significand precision
20
IEEE Single Precision
IEEE standard 754 floating point number representation, 32 bits: s eeeeeeee fffffff ffffffffffffffff
Bit 31: s, the sign bit (1 bit); 0 means positive, 1 means negative
Bits 30 to 23: exponent (8 bits)
Bits 22 to 0: significand (23 bits)
21
IEEE Single Precision s eeeeeeee fffffff ffffffffffffffff
e: (8) exponent bits, actual exponent range [-126 … 127]
A bias of 127 is added to the exponent.
An exponent of 0 is stored as 127; a stored exponent of 200 means the actual exponent is (200 – 127) = 73.
Stored exponents of all zeros and all ones are reserved for special numbers.
f: (24) fractional part [23 stored bits + 1 implied bit]
Since the digit to the left of the binary point of a normalized number is not zero, in binary it must be a one.
This saves a bit: the leading one is implied and does not need to be explicitly stored.
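To make the field layout concrete, here is a small C++ sketch that pulls the three fields out of a `float`. It assumes the platform's `float` is the 32-bit IEEE 754 single format (true on most current machines, but not guaranteed by the language); `SingleFields`, `decode_single`, and `actual_exponent` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct SingleFields {
    std::uint32_t sign;             // 1 bit
    std::uint32_t stored_exponent;  // 8 bits, biased by 127
    std::uint32_t fraction;         // 23 stored bits (the leading 1 is implied)
};

// Splits a float into its sign, stored exponent, and fraction fields.
SingleFields decode_single(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // reinterpret the bytes safely
    SingleFields fields;
    fields.sign = bits >> 31;
    fields.stored_exponent = (bits >> 23) & 0xFFu;
    fields.fraction = bits & 0x7FFFFFu;
    return fields;
}

// Removes the bias. Stored values 0 and 255 are reserved for special cases.
int actual_exponent(std::uint32_t stored_exponent) {
    assert(stored_exponent >= 1 && stored_exponent <= 254);
    return static_cast<int>(stored_exponent) - 127;
}
```

For example, `decode_single(1.0f)` gives sign 0, stored exponent 127, and fraction 0, and `actual_exponent(200)` is 73 as on the slide.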
22
Special Single Cases
Two zeros (signed zero)
e = 0, f = 0 (exponent and fractional bits are all 0)
$(-1)^{s} \times 0.0$
0x0000 0000 (+0)
0x8000 0000 (-0)
23
Special Single Cases
Positive infinity (+INF): s = 0, e = 255, f = 0 (fractional bits are all 0), 0x7f80 0000
Negative infinity (-INF): s = 1, e = 255, f = 0 (fractional bits are all 0), 0xff80 0000
24
Special Single Cases
Not-A-Number (NaN): s = 0 | 1, e = 255, f != 0 (at least one fractional bit is NOT 0)
There are many representations for NaN. Here is one example: 0x7fc0 0000
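The special-case rules for zero, infinity, and NaN translate directly into a few comparisons on the stored fields. The sketch below works on raw 32-bit patterns; `pack_single`, `classify_single`, and `SingleKind` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

enum class SingleKind { Zero, Infinity, NaN, Other };

// Builds a 32-bit pattern from the s, e, and f fields used on the slides.
std::uint32_t pack_single(std::uint32_t s, std::uint32_t e, std::uint32_t f) {
    assert(s <= 1u && e <= 255u && f <= 0x7FFFFFu);
    return (s << 31) | (e << 23) | f;
}

// Classifies a pattern using only the stored exponent and fraction.
SingleKind classify_single(std::uint32_t bits) {
    std::uint32_t e = (bits >> 23) & 0xFFu;
    std::uint32_t f = bits & 0x7FFFFFu;
    if (e == 255u) {
        return (f == 0u) ? SingleKind::Infinity : SingleKind::NaN;
    }
    if (e == 0u && f == 0u) {
        return SingleKind::Zero;  // +0 or -0, depending on the sign bit
    }
    return SingleKind::Other;
}
```

`pack_single(0, 255, 0)` reproduces 0x7f800000 (+INF), and `classify_single(0x7fc00000)` reports NaN.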
25
Special Single Cases
The smallest incremental value in IEEE single precision representation is a single one bit in the least significant bit of the significand.
With 23 fraction bits, that bit represents $2^{-23} \approx 1.19 \times 10^{-7}$ relative to the leading 1.
Small, but a finite number.
It determines the accuracy with which a number can be represented.
26
Special Single Cases Example:
No digits to the right of the decimal point: 1.3 is rounded to 1.0 and 1.7 is rounded to 2.0, so the error is no more than 1/2.
23 digits to the right of the binary point: the error is no more than $2^{-24} \approx 5.96 \times 10^{-8}$.
Accurate to about 7 decimal places
Compute 1
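The spacing and rounding-error figures can be checked against the standard library. This sketch assumes `float` is IEEE single precision with round-to-nearest; the function names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Gap between 1.0f and the next larger float: one unit in the last
// of the 23 stored fraction bits.
float spacing_above_one() {
    return std::nextafter(1.0f, 2.0f) - 1.0f;
}

// With round-to-nearest, the error is at most half of that gap.
float max_rounding_error_near_one() {
    return spacing_above_one() / 2.0f;
}

// The library's epsilon for float is defined as the same gap.
bool matches_library_epsilon() {
    return spacing_above_one() == std::numeric_limits<float>::epsilon();
}
```

On such a platform `spacing_above_one()` equals 2^-23 and `max_rounding_error_near_one()` equals 2^-24.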
27
IEEE Floating Point Numbers
Minimum representable (positive, normalized) number: minimum exponent (-126), minimum significand (1.0)
Minimum single precision floating point number: $1.0 \times 2^{-126} \approx 1.175 \times 10^{-38}$, 0x0080 0000
28
IEEE Floating Point Numbers
Maximum representable number: maximum exponent (127), maximum significand ($2 - 2^{-23}$)
Maximum single precision floating point number: $(2 - 2^{-23}) \times 2^{127} \approx 3.403 \times 10^{38}$, 0x7f7f ffff
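One way to confirm the minimum and maximum patterns is to turn them back into `float` values and compare with the library's limits. The sketch assumes a 32-bit IEEE `float`; `single_from_bits` and `limits_match` are hypothetical helpers, and 0x00800000 is the pattern with stored exponent 1 and fraction 0.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterprets a 32-bit pattern as a float.
float single_from_bits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// True when the two extreme normalized patterns agree with <limits>.
bool limits_match() {
    bool max_ok = single_from_bits(0x7f7fffffu) == std::numeric_limits<float>::max();
    bool min_ok = single_from_bits(0x00800000u) == std::numeric_limits<float>::min();
    return max_ok && min_ok;
}
```

Note that `std::numeric_limits<float>::min()` is the smallest positive normalized value, not the most negative one.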
29
IEEE Double Precision
IEEE standard 754 floating point number representation, 64 bits
Bit 63: s, the sign bit (1 bit); 0 means positive, 1 means negative
Bits 62 to 52: exponent (11 bits)
Bits 51 to 0: significand (52 bits); bits 51 to 32 share the upper word with the sign and exponent, bits 31 to 0 fill the lower word
30
Double Precision
e: (11) exponent bits, actual exponent range [-1022 … 1023]
A bias of 1023 is added to the exponent.
An exponent of 0 is stored as 1023; a stored exponent of 2000 means the actual exponent is (2000 – 1023) = 977.
Stored exponents of all zeros and all ones are reserved for special numbers.
f: (53) fractional part [52 stored bits + 1 implied bit]
Since the digit to the left of the binary point of a normalized number is not zero, in binary it must be a one.
This saves a bit: the leading one is implied and does not need to be explicitly stored.
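The double-precision fields can be extracted the same way as the single-precision ones. The sketch assumes `double` is the 64-bit IEEE 754 format; `DoubleFields`, `decode_double`, and `unbiased_exponent` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct DoubleFields {
    std::uint64_t sign;             // 1 bit
    std::uint64_t stored_exponent;  // 11 bits, biased by 1023
    std::uint64_t fraction;         // 52 stored bits
};

// Splits a double into its sign, stored exponent, and fraction fields.
DoubleFields decode_double(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    DoubleFields fields;
    fields.sign = bits >> 63;
    fields.stored_exponent = (bits >> 52) & 0x7FFu;
    fields.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return fields;
}

// Removes the bias. Stored values 0 and 2047 are reserved for special cases.
int unbiased_exponent(std::uint64_t stored_exponent) {
    assert(stored_exponent >= 1 && stored_exponent <= 2046);
    return static_cast<int>(stored_exponent) - 1023;
}
```

`unbiased_exponent(2000)` returns 977, the slide's example.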
31
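Real (Decimal) Number Storage
Double precision floating point numbers
s: (1) sign bit
e: (11) exponent bits [-1022 … 1023]
f: (53) fractional part [52 bits + 1 implied bit]
Byte layout: the first byte holds the sign bit and the upper 7 exponent bits (seeeeeee), the second byte holds the remaining 4 exponent bits and the first 4 fraction bits, and the other 6 bytes hold fraction bits.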
32
Special Double Cases
Two zeros (signed zero)
e = 0, f = 0 (exponent and fractional bits are all 0)
$(-1)^{s} \times 0.0$, 64 bits
0x0000 0000 0000 0000 (+0)
0x8000 0000 0000 0000 (-0)
33
Special Double Cases
Positive infinity (+INF): s = 0, e = 2047, f = 0 (fractional bits are all 0), 0x7ff0 0000 0000 0000
Negative infinity (-INF): s = 1, e = 2047, f = 0 (fractional bits are all 0), 0xfff0 0000 0000 0000
34
Special Double Cases
Not-A-Number (NaN): s = 0 | 1, e = 2047, f != 0 (at least one fractional bit is NOT 0)
There are many representations for NaN. Here is one example: 0x7ff8 0000 0000 0000, the double precision counterpart of 0x7fc0 0000
35
Special Double Cases
Maximum double number: 0x7fef ffff ffff ffff, approximately $1.798 \times 10^{308}$
Minimum positive (normalized) double number: 0x0010 0000 0000 0000, approximately $2.225 \times 10^{-308}$
36
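Decimal to Float Conversion
Show –24.125 in IEEE single precision format
First, save the sign (negative, so 1) and convert the magnitude to binary: $24.125_{10} = 11000.001_2 \times 2^{0}$
Normalize: $11000.001_2 \times 2^{0} = 1.1000001_2 \times 2^{4}$
Strip the leading 1 off the mantissa and extend to 23 bits to form the significand: 1000 0010 0000 0000 0000 000
Bias the exponent: Exp + Bias = 4 + 127 = 131 = $10000011_2$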
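The conversion steps (save the sign, normalize, strip the leading 1, bias the exponent) can be followed in code. This is a simplified sketch, not a full encoder: `encode_single` is a hypothetical name, it accepts only nonzero finite values whose exponent lies in the normalized range, and it truncates any fraction bits beyond 23 instead of rounding.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Encodes a value as an IEEE single precision bit pattern, step by step.
std::uint32_t encode_single(double value) {
    assert(value != 0.0 && std::isfinite(value));
    std::uint32_t sign = (value < 0.0) ? 1u : 0u;   // save the sign
    double magnitude = std::fabs(value);

    // Normalize to 1.xxxx * 2^exponent.
    int exponent = 0;
    while (magnitude >= 2.0) { magnitude /= 2.0; ++exponent; }
    while (magnitude < 1.0)  { magnitude *= 2.0; --exponent; }
    assert(exponent >= -126 && exponent <= 127);

    // Strip the leading 1 and keep 23 fraction bits (extra bits are dropped).
    std::uint32_t fraction =
        static_cast<std::uint32_t>((magnitude - 1.0) * 8388608.0);  // times 2^23

    // Bias the exponent and assemble the three fields.
    std::uint32_t stored_exponent = static_cast<std::uint32_t>(exponent + 127);
    return (sign << 31) | (stored_exponent << 23) | fraction;
}
```

For –24.125 the sketch normalizes to 1.5078125 x 2^4, stores exponent 131, and returns 0xC1C10000.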
37
Real Number Storage
Hex value: 0xC1C10000
38
Real Number Storage Numbers have limited precision Compute 1
39
Real Number Storage
#include <iostream>
using namespace std;

int main() {
    cout << "precision example" << endl;
    cout << "Number of bytes in a float: " << sizeof(float) << endl;

    // Halve epsilon until adding it to 1 no longer changes the float result.
    float epsilon = 1.0f, value = 0.0f;
    int iteration = 0;
    const int maxIteration = 100;
    while (iteration < maxIteration) {
        epsilon /= 2.0f;
        value = 1.0f + epsilon;
        if (value == 1.0f) break;
        iteration++;
    } // end while(...)
    cout << "Iteration: " << iteration << " Epsilon: " << epsilon
         << " Value: " << value << endl << endl;

    // Repeat the experiment in double precision.
    cout << "Number of bytes in a double: " << sizeof(double) << endl;
    double epsilonD = 1.0, valueD = 0.0;
    iteration = 0;
    while (iteration < maxIteration) {
        epsilonD /= 2.0;
        valueD = 1.0 + epsilonD;
        if (valueD == 1.0) break;
        iteration++;
    } // end while(...)
    cout << "Iteration: " << iteration << " Epsilon: " << epsilonD
         << " Value: " << valueD << endl;
    return 0;
}
40
Real Number Storage Numbers have limited precision
Most real numbers have an infinite decimal expansion
41
Real Number Storage Limited Range and Precision
There are three categories of numbers left out when floating point representation is used:
1. Numbers out of range because their absolute value is too large (similar to integer overflow)
2. Numbers out of range because their absolute value is too small (numbers too near zero to be stored given the precision available)
3. Numbers whose binary representations require either an infinite number of binary digits or more binary digits than the bits available
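Each of the three categories can be observed with a short experiment. The sketch below assumes IEEE 754 arithmetic with the default rounding mode; the function names are hypothetical and each returns `true` when the effect shows up.

```cpp
#include <cmath>
#include <limits>

// Category 1: a result too large in magnitude overflows to infinity.
bool too_large_becomes_infinity() {
    float largest = std::numeric_limits<float>::max();
    float doubled = largest * 2.0f;
    return std::isinf(doubled);
}

// Category 2: a result too close to zero is stored as zero.
bool too_small_becomes_zero() {
    float tiny = std::numeric_limits<float>::min();  // about 1.2e-38
    float product = tiny * tiny;                     // about 1.4e-76, far out of range
    return product == 0.0f;
}

// Category 3: 0.1, 0.2 and 0.3 need infinitely many binary digits,
// so the stored approximations do not add up exactly.
bool tenths_are_inexact() {
    double sum = 0.1 + 0.2;
    return sum != 0.3;
}
```

All three functions return `true` on a typical IEEE 754 machine.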
42
Real Number Storage Limited Range and Precision Illustrated
With one bit to the right of the binary point, the only nonzero fraction that can be represented is 0.5.
43
Real Number Storage Limited Range and Precision Illustrated
Real numbers that can be represented with two bits: 0.25, 0.5, 0.75
Real numbers that can be represented with three bits: 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875
The holes correspond to all the unrepresented numbers: 0.126, 0.255, 0.3, …
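The lists above follow a simple pattern: with n bits to the right of the binary point, the nonzero representable fractions are the multiples of 2^-n below 1. A small sketch, with the hypothetical helper `representable_fractions`, enumerates them.

```cpp
#include <cassert>
#include <vector>

// Lists every nonzero fraction below 1 that fits in the given number
// of bits to the right of the binary point: k / 2^bits for k = 1, 2, ...
std::vector<double> representable_fractions(int bits) {
    assert(bits >= 1 && bits <= 20);  // keep the list small
    const int count = 1 << bits;      // 2^bits
    std::vector<double> values;
    for (int k = 1; k < count; ++k) {
        values.push_back(static_cast<double>(k) / count);
    }
    return values;
}
```

`representable_fractions(2)` returns 0.25, 0.5, 0.75, and `representable_fractions(3)` returns the seven values from 0.125 to 0.875; anything between two neighbours, such as 0.3, falls into a hole.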