Floating Point Numbers

It's all just 1s and 0s
Computers are fundamentally driven by logic, and thus by bits of data.
Manipulation of bits can be done incredibly quickly.
Given n bits of information, there are 2^n possible combinations.
These 2^n representations can encode pretty much anything you want: letters, numbers, instructions, ...
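The 2^n count can be checked by brute force. The following is only a small sketch in Python (not the MATLAB used elsewhere in these slides), and the helper name bit_patterns is made up for this illustration.

```python
from itertools import product

def bit_patterns(n):
    """Return every distinct string of n bits."""
    return ["".join(bits) for bits in product("01", repeat=n)]

patterns = bit_patterns(3)
print(patterns)        # all 3-bit patterns, from 000 to 111
print(len(patterns))   # 8, which is 2**3
```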

Bases of number systems
Base 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
3107 = 3×10^3 + 1×10^2 + 0×10^1 + 7×10^0
Base 2 digits: 0, 1
Place values: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048
3107 = 1×2^11 + 1×2^10 + 0×2^9 + 0×2^8 + 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4 + 0×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 110000100011 (binary)
In MATLAB, dec2bin(3107) performs this conversion.
Addition, multiplication, etc. all proceed the same way as in base 10.
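The positional expansion above amounts to repeated division by the base. Here is a minimal Python sketch of that idea; to_base is a hypothetical helper written for this example, not a library function.

```python
def to_base(value, base):
    """Convert a non-negative integer to a digit string in the given base (2..10)."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        value, remainder = divmod(value, base)   # peel off the lowest digit
        digits.append(str(remainder))
    return "".join(reversed(digits))

print(to_base(3107, 2))          # 110000100011
print(to_base(3107, 10))         # 3107
print(int("110000100011", 2))    # 3107, converting back
```

Python's built-in bin(3107) gives the same digits with a 0b prefix.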

Base Notation
What does 10 mean?
10 in binary = 2 decimal
10 in octal (base 8) = 8 decimal
10 in decimal = 10 decimal
We need some method of differentiating between these possibilities.
To avoid confusion, where necessary we write the base as a subscript: 10_10 = ten, 10_2 = two.
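Programming languages face the same ambiguity and resolve it with an explicit base argument or a literal prefix. A short Python sketch (the dictionary name readings is just for illustration):

```python
# Interpret the same two characters "10" in three different bases
readings = {base: int("10", base) for base in (2, 8, 10)}
for base, value in readings.items():
    print(f'"10" read in base {base} = {value} decimal')

# Literal prefixes play the role of the subscript
print(0b10, 0o10, 10)   # 2 8 10
```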

Integer Representation
Integers fit naturally into this base-2 notation.
The remaining challenge is representing negative numbers. Two common schemes: 2s complement and excess-N.
An extra choice is the order in which the bits are stored.
That choice is made chip by chip, which is a portability concern.
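To make the two schemes concrete, here is a hedged Python sketch. The helper names twos_complement and excess_n, the 8-bit width, and the offset of 127 are all choices made for this illustration. Storage order is shown at byte granularity, which is how it usually surfaces in practice.

```python
def twos_complement(value, bits):
    """Bit pattern of a signed integer in 2s complement, as a string."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def excess_n(value, bits, offset):
    """Bit pattern of a signed integer stored as value + offset (excess-N)."""
    return format(value + offset, f"0{bits}b")

print(twos_complement(5, 8))    # 00000101
print(twos_complement(-5, 8))   # 11111011
print(excess_n(-5, 8, 127))     # 01111010  (-5 + 127 = 122)

# The same integer laid out in the two common byte orders
print((1).to_bytes(4, "big").hex())      # 00000001
print((1).to_bytes(4, "little").hex())   # 01000000
```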

Floating Point Representation
Computers represent floating point numbers in binary form.
For generality, they use a binary form of scientific notation.
Decimal scientific notation scales by powers of 10; in binary, we can use powers of 2.
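A number can be split into a mantissa and a power of 2 with Python's math.frexp. The wrapper binary_scientific below is a hypothetical helper that rescales the result so the mantissa lies in [1, 2), which is one common convention; treat it as a sketch.

```python
import math

def binary_scientific(x):
    """Split x into (m, e) with x == m * 2**e and 1 <= |m| < 2 for nonzero x."""
    m, e = math.frexp(x)      # frexp returns 0.5 <= |m| < 1
    return m * 2, e - 1

for x in (6.5, 0.15625, 3107.0):
    m, e = binary_scientific(x)
    print(f"{x} = {m} * 2**{e}")
```

For example, 6.5 comes out as 1.625 * 2**2.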

Floating Point Size
In IEEE.h (sizes in bytes):
#define IEEE_FLOAT_SIZE 4
#define IEEE_DOUBLE_SIZE 8
#define IEEE_QUAD_SIZE 16
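The single and double sizes can be confirmed from Python with the struct module. This is only a quick check; struct has no quad-precision format, so the 16-byte case is not covered here.

```python
import struct

# "<" selects the module's standard sizes rather than platform-native ones
float_size = struct.calcsize("<f")
double_size = struct.calcsize("<d")
print(float_size, double_size)   # 4 8
```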

Distribution
Precision   # bits   Mantissa bits   Exponent bits   Sign bit
Single      32       23              8               1
Double      64       52              11              1
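The double row of the table can be seen directly by pulling the three fields out of a stored value. The sketch below is in Python; double_fields is a made-up helper, and the bias of 1023 is the IEEE 754 value for doubles rather than something stated on the slide.

```python
import struct

def double_fields(x):
    """Return (sign, exponent_bits, mantissa_bits) of a 64-bit double."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = raw >> 63                      # 1 bit
    exponent = (raw >> 52) & 0x7FF        # 11 bits
    mantissa = raw & ((1 << 52) - 1)      # 52 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = double_fields(-6.5)
print(sign, exponent, format(mantissa, "052b"))
print(exponent - 1023)   # 2, since 6.5 = 1.625 * 2**2
```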

In Decimal Terms
Each binary floating point double holds roughly 16 decimal digits.
Technically, the relative precision is 2^(-52), about 2.2e-16.
MATLAB example: eps returns 2.2204e-16, which is 2^(-52).
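The same quantity is exposed in Python as sys.float_info.epsilon. A brief sketch of what it means in practice:

```python
import sys

eps = sys.float_info.epsilon
print(eps == 2.0 ** -52)       # True
print(1.0 + eps > 1.0)         # True: eps is the gap from 1.0 to the next double
print(1.0 + eps / 2 == 1.0)    # True: a smaller increment is rounded away
print(sys.float_info.dig)      # 15 decimal digits are always preserved
```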

Advantages
Scientific notation can work on any scale (all handled by the exponent).
So long as errors are small relative to the scale of the data values, calculations are accurate, right?

Example 1
1e12 + 0.2 - 1e12
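Evaluating this expression in double precision does not give back 0.2. The Python sketch below shows the result and relates it to the spacing of doubles near 1e12 (math.ulp needs Python 3.9 or later).

```python
import math

result = 1e12 + 0.2 - 1e12
print(result)           # 0.199951171875
print(result == 0.2)    # False

spacing = math.ulp(1e12)    # gap between adjacent doubles near 1e12
print(spacing)              # 0.0001220703125, i.e. 2**-13
print(result / spacing)     # 1638.0: 0.2 was rounded to a whole number of gaps
```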

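Problem
Nice decimal numbers (such as 0.2) have continuing binary representations.
Just as 1/3 = 0.3333333... in decimal, 0.2 in binary is 0.0011 0011 0011 0011...
The analogy is adding and then subtracting a large number in decimal with a fixed number of digits: the trailing digits of the small number are cut off.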
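The repeating pattern can be generated by doubling the fraction and reading off the integer part each time. A minimal Python sketch using exact rational arithmetic; binary_fraction_digits is a hypothetical helper for this example.

```python
from fractions import Fraction

def binary_fraction_digits(value, count):
    """First `count` binary digits after the point of a fraction in [0, 1)."""
    digits = []
    for _ in range(count):
        value *= 2
        bit = int(value)       # 1 if doubling reached 1 or more, else 0
        digits.append(str(bit))
        value -= bit
    return "".join(digits)

print(binary_fraction_digits(Fraction(1, 5), 16))   # 0011001100110011
print((0.2).hex())                                  # 0x1.999999999999ap-3
print(Fraction(0.2) == Fraction(1, 5))              # False: the stored 0.2 is rounded
```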

Roundoff Error
Round-off error will always be present.
It is more significant when you are subtracting two almost equal quantities, e.g. in decimal, 255.67 - 255.69.
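The subtraction above can be tried directly. In this Python sketch the variable names are illustrative; the point is that the relative error of the small difference is far larger than the relative precision of the operands.

```python
import sys

a, b = 255.67, 255.69
diff = a - b
print(diff)                           # close to, but not exactly, -0.02

abs_error = abs(diff - (-0.02))
relative_error = abs_error / 0.02
print(relative_error)                 # relative error of the difference
print(sys.float_info.epsilon)         # relative precision of each operand
```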

Example 2
A = 112000000
B = 100000
C = 0.0009
X = A - B / C
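One way to see how much error this expression picks up is to compare the floating point result with exact rational arithmetic. The Python sketch below assumes C is meant as the decimal 9/10000; the names x_float, x_exact and rel_error are made up for this illustration, and it runs in double precision, which may differ from what the original demo used.

```python
from fractions import Fraction

A, B, C = 112000000, 100000, 0.0009
x_float = A - B / C

# Exact reference: treat C as the decimal 9/10000 rather than its nearest double
x_exact = Fraction(A) - Fraction(B) / Fraction(9, 10000)

print(x_float)
print(float(x_exact))
rel_error = abs(Fraction(x_float) - x_exact) / x_exact
print(float(rel_error))   # relative error of the computed X
```

Dividing by the small C produces a value close to A, so the subtraction cancels most of the leading digits.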

Common occurrence
Delta x in finite element methods
Numerical differentiation
Places where more closely packed data gives

Example 3: Numerical Diff.
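The slide content for this example did not survive in the transcript, so the following is an assumed illustration rather than the original demo: a forward difference of sin at x = 1, written in Python, with forward_difference as a hypothetical helper. Shrinking h first reduces the error and then makes it worse, because f(x+h) - f(x) subtracts two almost equal quantities and the result is divided by a small number.

```python
import math

def forward_difference(f, x, h):
    """Approximate f'(x) with the one-sided difference (f(x+h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

x = 1.0
exact = math.cos(x)   # derivative of sin
for h in (1e-1, 1e-4, 1e-8, 1e-12, 1e-15):
    approx = forward_difference(math.sin, x, h)
    print(f"h = {h:.0e}   error = {abs(approx - exact):.3e}")
```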

Example 4: Recursion
Comparing sum of delta x and real sum
t = 0; N = 2^12; dx = 1/N;
for I = 1:N
  t = t + dx;
end
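The same loop can be tried for several values of N. This is a Python sketch rather than the slide's MATLAB, and accumulate is a name chosen for the example. With N = 2^12 the step 1/N is a power of 2 and every partial sum is exact, so the accumulated error only shows up for steps that are not exactly representable, such as N = 10.

```python
import math

def accumulate(n):
    """Add dx = 1/n to a running total n times; the real sum is exactly 1."""
    t = 0.0
    dx = 1 / n
    for _ in range(n):
        t = t + dx
    return t

for n in (2**12, 10, 10**6):
    total = accumulate(n)
    print(n, total, total - 1.0)

# A correctly rounded sum of the same terms, for comparison
print(math.fsum([0.1] * 10))   # 1.0
```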

Avoiding (Large) Roundoff Error
Avoid subtracting almost-equal quantities.
Avoid dividing by small quantities.
Avoid sums over large loops, especially with different orders of magnitude in the sum.
Avoid recursive calculations, where errors will accumulate.
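The advice about sums with different orders of magnitude can be illustrated by adding the same numbers in two orders. This is a Python sketch with made-up names; an explicit loop is used so that the accumulation is plain left-to-right floating point addition.

```python
def running_sum(values):
    """Plain left-to-right floating point accumulation."""
    total = 0.0
    for v in values:
        total = total + v
    return total

small = [1.0] * 10
large = 1e16

large_first = running_sum([large] + small)
small_first = running_sum(small + [large])
print(large_first)   # 1e+16: each 1.0 is lost against the large total
print(small_first)   # 1.000000000000001e+16: the small terms were combined first
```

Adding the small terms first lets them build up to something large enough to register.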
