1
Floating Points & IEEE 754
2
Column Pattern
What goes to the right of the 1's column?
2^3  2^2  2^1  2^0
8    4    2    1
3
Column Pattern
Negative powers of two: 2^-3 = 1/2^3 = 1/8 = 0.125
2^3  2^2  2^1  2^0  2^-1  2^-2  2^-3   2^-4
8    4    2    1    0.5   0.25  0.125  0.0625
4
Binary decimal number: 10.11 (base 2) = 2 + 0.5 + 0.25 = 2.75 (base 10)
2^3  2^2  2^1  2^0  2^-1  2^-2  2^-3   2^-4
8    4    2    1    0.5   0.25  0.125  0.0625
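The column pattern above can be turned into a short conversion routine. This is a minimal sketch in Python; the function name `binary_to_decimal` is made up for illustration and is not part of the slides.

```python
def binary_to_decimal(bits: str) -> float:
    """Convert a binary string such as '10.11' to its decimal value."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Columns to the left of the point are worth 2^0, 2^1, 2^2, ...
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Columns to the right of the point are worth 2^-1, 2^-2, 2^-3, ...
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("10.11"))  # 2 + 0.5 + 0.25 = 2.75
```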
5
Fixed Decimal Problems
Fixed decimal points waste space: 400,000,000,000,000, vs x 10^17 vs x 10^-14
In computers, space is precious
Computers use a floating decimal point (like scientific notation)
6
Floating Point
Bits used to represent 3 parts:
Sign
Exponent
Fraction (or Mantissa)
Layout: Sign | Exponent | Mantissa
7
Sign
0 = positive, 1 = negative
8
Exponent
Binary integer in excess notation
Gives the power of 2 to multiply by: 100 = 0, so 2^0
Binary  Value
000     -4
001     -3
010     -2
011     -1
100      0
101      1
110      2
111      3
9
Mantissa
Fractional value, always a fraction of the form 0.XXXX: 1000 = 0.5
2^-1  2^-2  2^-3   2^-4
0.5   0.25  0.125  0.0625
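Putting the three parts together, a toy float can be decoded in a few lines. The sketch below assumes an 8-bit layout of 1 sign bit, 3 exponent bits in excess-4 notation, and 4 mantissa bits read as 0.XXXX with no hidden leading 1. The field widths are inferred from the tables above and the function name `decode_toy_float` is hypothetical.

```python
def decode_toy_float(bits: str) -> float:
    """Decode an 8-character bit string laid out as sign | exponent | mantissa."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    sign = -1 if bits[0] == "1" else 1
    # Assumed excess-4 notation: stored 100 means exponent 0.
    exponent = int(bits[1:4], 2) - 4
    # Assumed 0.XXXX mantissa: 1000 means 0.5.
    mantissa = int(bits[4:], 2) / 16
    return sign * mantissa * 2 ** exponent


print(decode_toy_float("01011000"))  # +0.5 * 2^1 = 1.0
print(decode_toy_float("11101100"))  # -0.75 * 2^2 = -3.0
```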
10
IEEE 754
Standards for 32-bit and 64-bit floats
32 bit: float or single
64 bit: double
11
The Range
The floating point number range:
[Figure: number line marking the NaN, normalized, denormalized, and zero regions]
12
Sign
Sign bit: 1 = negative, 0 = positive
13
Exponent
X bits = 2^X different values
8 bits = 256 values = 0-255
Stored as biased value
Bias value subtracted to get exponent
IEEE 32-bit float, bias 127: Exponent = value - 127
Binary    Value  Exponent
00000001  1      -126
10000000  128    1
10001001  137    10
11111111  255    (special)
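The biased exponent can be inspected directly by reinterpreting a float's bits. This is a rough sketch using Python's standard `struct` module; the helper name `float32_fields` is an illustrative choice, not something from the slides.

```python
import struct


def float32_fields(x: float):
    """Return (sign, stored_exponent, fraction) of x as a 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                      # 1 bit
    stored_exponent = (raw >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = raw & 0x7FFFFF             # 23 bits
    return sign, stored_exponent, fraction


sign, stored, fraction = float32_fields(1024.0)  # 1024 = 2^10
print(stored, stored - 127)  # 137 10
```

The stored value 137 minus the bias 127 gives the true exponent 10, matching the table row above.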
14
Mantissa
Normalized mantissa: assumed 1.XXXXX
Value in range [1, 2)
Binary  Represents  Value
…       …           1
…       …           1.5
…       …           1.25
…       …           1.375
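As a small illustration of the assumed leading 1, the sketch below computes the value of a normalized mantissa from the leading stored bits. The function name `normalized_mantissa` and the bit strings passed to it are illustrative choices, not the patterns elided in the table above.

```python
def normalized_mantissa(fraction_bits: str) -> float:
    """Value of 1.XXXXX given the stored fraction bits (leading 1 is implied)."""
    value = 1.0  # the assumed, unstored leading 1
    for position, bit in enumerate(fraction_bits, start=1):
        value += int(bit) * 2.0 ** -position
    return value


print(normalized_mantissa("000"))  # 1.0
print(normalized_mantissa("100"))  # 1.5
print(normalized_mantissa("010"))  # 1.25
print(normalized_mantissa("011"))  # 1.375
```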
15
Special Patterns
Exponent of all 1's/0's is special:
16
Mantissa
If exponent is 0, mantissa is denormalized: assumed 0.XXXXX
Value in range [0, 1)
Binary  Represents  Value
…       …           0
…       …           0.5
…       …           0.25
…       …           0.375
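A denormalized 32-bit pattern can be decoded by hand and checked against the hardware interpretation. The sketch below assumes the IEEE 754 single-precision rule that denormalized values use a fixed exponent of -126; that detail is not stated on the slide, and both function names are hypothetical.

```python
import struct


def decode_denormalized(raw: int) -> float:
    """Decode a 32-bit pattern whose exponent field is all zeros."""
    if (raw >> 23) & 0xFF != 0:
        raise ValueError("exponent field is not zero")
    sign = -1.0 if raw >> 31 else 1.0
    fraction = raw & 0x7FFFFF
    # No assumed leading 1: the mantissa is 0.XXXXX, scaled by 2^-126.
    return sign * (fraction / 2 ** 23) * 2.0 ** -126


def reinterpret(raw: int) -> float:
    """Let the struct module interpret the same 32 bits as a float."""
    return struct.unpack(">f", struct.pack(">I", raw))[0]


pattern = 0x00400000  # exponent 0, mantissa 100...0 = 0.5
print(decode_denormalized(pattern) == reinterpret(pattern))  # True
```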
17
Special Patterns
Exponent of all 1's/0's is special:
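One way to see the special exponent patterns is to classify values by their bit fields. The category rules in this sketch follow the IEEE 754 single-precision layout (all-zero exponent means zero or denormalized, all-one exponent means infinity or NaN); the function name `classify_float32` is made up for illustration.

```python
import struct


def classify_float32(x: float) -> str:
    """Name the IEEE 754 single-precision category that x falls into."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    exponent = (raw >> 23) & 0xFF
    fraction = raw & 0x7FFFFF
    if exponent == 0:       # exponent bits all 0
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:    # exponent bits all 1
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for sample in (0.0, 1e-40, 1.0, float("inf"), float("nan")):
    print(sample, classify_float32(sample))
```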
18
Issues
Can't count on absolute precision:
19
Issues
Small values are closer together than large values
Accuracy is expressed in digits, not decimal places
32 bit: 7-8 decimal digits
64 bit: 15-16 decimal digits
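The uneven spacing can be observed by asking how far it is from a value to the next representable one. This sketch assumes Python 3.9 or later for `math.ulp`, and it works on Python's 64-bit floats rather than 32-bit ones; the wrapper name `gap_to_next` is hypothetical.

```python
import math


def gap_to_next(x: float) -> float:
    """Distance from x to the next larger representable 64-bit float."""
    return math.ulp(x)


# The gap grows with the magnitude of the value.
for x in (1.0, 1000.0, 1e6, 1e15):
    print(x, gap_to_next(x))
```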
20
Issues
Can't count on absolute precision:
Proper epsilon depends on magnitude of x
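A common way to handle this is to compare with a tolerance that scales with the operands. The sketch below is one possible approach, not a rule from the slides; the name `nearly_equal` and the default tolerance `1e-9` are arbitrary illustrative choices. Python's standard `math.isclose` offers a similar relative comparison.

```python
import math


def nearly_equal(a: float, b: float, rel_eps: float = 1e-9) -> bool:
    """Compare with an epsilon scaled by the magnitude of the operands."""
    return abs(a - b) <= rel_eps * max(abs(a), abs(b))


print(0.1 + 0.2 == 0.3)                   # False: exact comparison fails
print(nearly_equal(0.1 + 0.2, 0.3))       # True
print(nearly_equal(1e16, 1e16 + 2.0))     # True: a fixed tiny epsilon would reject this
print(math.isclose(0.1 + 0.2, 0.3))       # True: standard library equivalent
```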
21
Issues
No associative property in floating point math (a single a*b or a+b is still commutative, but regrouping or reordering a chain of operations is not safe)
a*(b*c) and (a*b)*c can give different results
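The effect of grouping is easy to reproduce with 64-bit floats. In this sketch the specific operand values are chosen only for illustration: the addition case differs through rounding, and the multiplication case differs because one grouping overflows to infinity.

```python
import math

# Addition: the two groupings round differently.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Multiplication: (a*b) overflows before c can scale it back down.
a, b, c = 1e308, 10.0, 0.1
print((a * b) * c)  # inf
print(a * (b * c))  # 1e+308
```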
22
Issues
Errors compound with repeated calculations
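A running sum shows how small rounding errors build up. This is a minimal sketch with a hypothetical helper `naive_sum`; `math.fsum` from the standard library is shown as one way to limit the drift, not as the only remedy.

```python
import math


def naive_sum(step: float, count: int) -> float:
    """Add `step` to a running total `count` times."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


print(naive_sum(0.1, 10) == 1.0)         # False: ten additions already drift
print(math.fsum([0.1] * 10) == 1.0)      # True: fsum tracks the lost low-order bits

# The drift grows as the number of additions grows.
print(abs(naive_sum(0.1, 10) - 1.0))
print(abs(naive_sum(0.1, 1_000_000) - 100_000.0))
```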
23
Floating Point ARM
VFP unit optional in lower-end chips
24
Floating Point ARM
Special registers for coprocessor
32 32-bit registers s0-s31
Pairs can be used as 64-bit registers
25
Floating Point Instructions
Can declare float/double data values
VLDR to load VFP registers from address in ARM register
26
Moving and Converting
VMOV can move bits from regular to VFP registers:
Special instructions to:
Convert between float and word
Convert between float and double
27
x86
x86 processors had an optional floating point coprocessor (x87)
All floating point functions are stack based:
28
SSE
SSE: Intel extension to Pentium chips
Added addressable registers for floating point values
29
Integer vs FP Performance
Intel Haswell architecture:
30
Integer vs FP Performance
Intel Haswell architecture:
31
Awesomeness
Fast Inverse Square Root
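For readers who have not seen it, the trick exploits the IEEE 754 bit layout: treating a float's bits as an integer gives a rough logarithm, so a shift and a subtraction yield a first guess at 1/sqrt(x). The slide gives no code, so the following is only a Python sketch of the widely circulated version with the constant 0x5F3759DF and one Newton-Raphson step; the original was written in C on 32-bit floats, and the function name here is an illustrative choice.

```python
import struct


def fast_inverse_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using float32 bit manipulation."""
    # Reinterpret the 32-bit float pattern of x as an unsigned integer.
    (i,) = struct.unpack(">I", struct.pack(">f", x))
    # Halving the bits roughly halves the exponent; the constant corrects the bias.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the adjusted integer as a float: the initial guess.
    (y,) = struct.unpack(">f", struct.pack(">I", i))
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


for x in (0.25, 2.0, 100.0):
    print(x, fast_inverse_sqrt(x), x ** -0.5)
```

The approximation is typically within a fraction of a percent of the exact value, which was an acceptable trade for speed on hardware where division and square root were slow.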