Round-Off and Truncation Errors


1 Round-Off and Truncation Errors
CHAPTER 4 Round-Off and Truncation Errors

2 Numerical Accuracy
Truncation error: method dependent. Errors that result from using an approximation rather than an exact procedure.
Round-off error: machine dependent. Errors that result from not being able to adequately represent the true value, that is, from using an approximate number to represent an exact number.

3 Taylor Series Expansion
Construction of finite-difference formulas
Numerical accuracy: discretization error
[Figure: expansion about the base point x = a]

4 Taylor series expansions

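5 Taylor Series and Remainder
Taylor series (base point x = a):
$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n$
Remainder:
$R_n = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}$, with $\xi$ between $a$ and $x$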

6 Truncation Error
Taylor series expansion with $x_i = 0$ and $h = x$, so that $x_{i+1} = x$
Example (higher-order terms truncated)

7 Power series Polynomials
The function becomes more nonlinear as m increases

8 A MATLAB Script
Filename: fun_exp.m
% Evaluate exponential function exp(x)
% by Taylor series expansion
% f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
% Written as a script: x and n are read interactively, so no function header is needed.
x = input('enter the value of x = ');
n = input('enter the order n = ');
term = 1;
total = term;            % named total so the built-in sum is not shadowed
for i = 1:n
    term = term*x/i;     % x^i/i! built from the previous term
    total = total + term;
end

9 MATLAB For Loops
Filename: fun_exp2.m
% Evaluate exponential function exp(x)
% by Taylor series expansion
% f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
x = input('enter the value of x = ');
n = input('enter the order n = ');
term = zeros(1, n+1);    % term(i+1) holds x^i/i!
total = zeros(1, n+1);   % total(i+1) holds the partial sum through order i
term(1) = 1;
total(1) = term(1);
for i = 1:n
    term(i+1) = term(i)*x/i;
    total(i+1) = total(i) + term(i+1);
end
% Display the results
disp('i term(i) sum(i)')
a = 1:n+1;
[a' term' total']

10 Truncation Error
[Table: n, term, sum; values not recoverable]

11 Truncation Error
[Table: n, term, sum; values not recoverable]
How to reduce error?
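
A table of this shape can be regenerated from the Taylor-series loop in the scripts above. This is a minimal sketch, assuming hypothetical inputs x = 1 and n = 10 (the slide's own inputs are not recoverable); the error column is an addition for illustration and is not part of the original table.

```matlab
x = 1;                        % hypothetical input, not from the slides
n = 10;                       % hypothetical highest order
term = zeros(n+1, 1);         % term(i+1) holds x^i/i!
total = zeros(n+1, 1);        % total(i+1) holds the partial sum through order i
term(1) = 1;
total(1) = 1;
for i = 1:n
    term(i+1) = term(i)*x/i;
    total(i+1) = total(i) + term(i+1);
end
err = abs(exp(x) - total);    % truncation error of each partial sum
disp('     n      term       sum     error')
disp([(0:n)' term total err])
```

For positive x every added term shrinks the truncation error, which suggests one answer to the slide's question: keep more terms of the series.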

12 Round-off Errors
Computers can represent numbers only to a finite precision.
This matters most for real numbers; integer math can be exact, but over a limited range.
How do computers represent numbers? By a binary representation of the integers and real numbers in computer memory.

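13 MATLAB uses double precision
Single precision: 32 bits (23 for the digits, 8 for the signed exponent, 1 for the sign); 2^8 = 256
Double precision: 64 bits (52 for the digits, 11 for the signed exponent, 1 for the sign); 2^11 = 2048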

14 Order of operation
Addition problem: exact result compared with the result from 3-digit arithmetic [worked numbers not recoverable]
The difference is round-off error.
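
The slide's worked numbers are lost, so the following is only an illustrative sketch in double precision rather than 3-digit arithmetic. It shows that floating-point addition is not associative: the same operands summed in a different order can give different results. All values are hypothetical.

```matlab
a = (0.1 + 0.2) + 0.3;      % left-to-right grouping
b = 0.1 + (0.2 + 0.3);      % right-to-left grouping
fprintf('%.17g\n%.17g\n', a, b);

big = 1e16;                 % adjacent doubles near 1e16 are 2 apart
c = (big + 1) - big;        % the 1 is rounded away before the subtraction
d = (big - big) + 1;        % subtracting first keeps the 1
fprintf('c = %g, d = %g\n', c, d);
```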

15 Cancellation error
If b is large, r is close to b.
The difference of two numbers very close to each other has potential for greater error!
Remedy: rationalize. [Equations not recoverable]

16 Try b = 97
Compute x2 with r = 96.9794 (3 sig. figs.)
exact: 0.01031
standard: [value lost]
rationalized: [value lost]
Corresponding to "cancellation, critical arithmetic"
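
The lost equations can be approximated by a sketch under stated assumptions. The numbers b = 97, r = 96.9794, and exact x2 = 0.01031 are consistent with the smaller root of x^2 - b*x + 1 = 0, with r = sqrt(b^2 - 4), standard form x2 = (b - r)/2, and rationalized form x2 = 2/(b + r). That reading is an inference, not something the slides state. The helper `r3`, which rounds to 3 significant figures, is hypothetical.

```matlab
% Hypothetical helper: round a nonzero value to 3 significant figures
r3 = @(v) round(v .* 10.^(2 - floor(log10(abs(v))))) ./ 10.^(2 - floor(log10(abs(v))));

b = 97;
r = sqrt(b^2 - 4);              % assumed form of r; evaluates to about 96.9794
x2_exact = (b - r)/2;           % full double precision, about 0.01031

r_3 = r3(r);                    % r rounds to 97.0 with 3 significant figures
x2_standard = (b - r_3)/2;      % subtraction of equal numbers: all digits cancel
x2_rational = r3(2/(b + r_3));  % no subtraction, so the digits survive

fprintf('exact        %.5f\n', x2_exact);
fprintf('standard     %.5f\n', x2_standard);
fprintf('rationalized %.5f\n', x2_rational);
```

With only three digits the standard form loses the root entirely, while the rationalized form keeps all three digits because it adds two nearly equal numbers instead of subtracting them.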

17 Significant Figures
48.9 mph? [second reading lost] mph?

18 Significant Digits
The places which can be used with confidence
32-bit (single precision): about 7 significant digits
64-bit (double precision): about 16 significant digits
Double precision: reduces round-off error, but increases CPU time
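
The digit counts can be estimated from the machine epsilon of each type. A minimal sketch using MATLAB's built-in `eps`; taking -log10(eps) as a rough digit count is a common rule of thumb, not a claim from the slides.

```matlab
eps_single = double(eps('single'));   % gap between 1 and the next single: 2^-23
eps_double = eps;                     % gap between 1 and the next double: 2^-52
digits_single = -log10(eps_single);   % rough count of reliable decimal digits
digits_double = -log10(eps_double);
fprintf('single: eps = %.3g, about %.1f digits\n', eps_single, digits_single);
fprintf('double: eps = %.3g, about %.1f digits\n', eps_double, digits_double);
```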

19 False Significant Figures
3.25/1.96 = 1.658163265306122 (from MATLAB)
But in practice report only 1.65 (chopping) or 1.66 (rounding)!
Why? Because we don't know what is beyond the second decimal place of the inputs.

20

21 Accuracy and precision
Accuracy: how closely a measured or computed value agrees with the true value.
Precision: how closely individual measured or computed values agree with each other.
Accuracy is getting all your shots near the target. Precision is getting them close together.
[Figure axes: More Accurate, More Precise]

22 Numerical Errors
The difference between the true value and the approximation
True value = approximation + true error
$E_t = \text{true value} - \text{approximation} = x^* - x$
or in percent: $\varepsilon_t = \frac{x^* - x}{x^*} \times 100\%$

23 Approximate Error
But the true value is not known. If we knew it, we wouldn't have a problem.
Use the approximate error instead.
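
A sketch of how an approximate error can drive a stopping test, assuming the usual definition: the change between the present and previous approximations, relative to the present one, in percent. The slide's own formula is lost, so that definition is an assumption. The inputs x = 0.5 and the tolerance of 0.05% are hypothetical, and the true error is computed only because exp(x) is available for comparison.

```matlab
x = 0.5;                    % hypothetical input
tol = 0.05;                 % hypothetical stopping tolerance, in percent
term = 1;
approx = 1;                 % zero-order approximation of exp(x)
ea = 100;                   % approximate percent error, forced large to start
n = 0;
while ea > tol
    n = n + 1;
    prev = approx;
    term = term*x/n;                        % next Taylor term x^n/n!
    approx = approx + term;
    ea = abs((approx - prev)/approx)*100;   % needs no knowledge of the true value
end
et = abs((exp(x) - approx)/exp(x))*100;     % true percent error, for comparison
fprintf('n = %d, approx = %.9f, ea = %.4f%%, et = %.5f%%\n', n, approx, ea, et);
```

In this run the approximate error is larger than the true error, so stopping on it is conservative here.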

24 Number Systems
Base-10 (Decimal): 0,1,2,3,4,5,6,7,8,9
Base-8 (Octal): 0,1,2,3,4,5,6,7
Base-2 (Binary): 0,1 (off/on, close/open, negative/positive charge)
Other non-decimal systems: 1 lb = 16 oz, 1 ft = 12 in, ½”, ¼”, …..

25 Decimal System (base 10) Binary System (base 2)
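
Both systems are positional: each digit is weighted by a power of the base. A minimal sketch with hypothetical digits (the slide's own example is not recoverable), checked against MATLAB's built-in `dec2bin` and `bin2dec`.

```matlab
% Decimal: 173 = 1*10^2 + 7*10^1 + 3*10^0
dec_digits = [1 7 3];
dec_value = sum(dec_digits .* 10.^(numel(dec_digits)-1:-1:0));

% Binary: 10101101 = 1*2^7 + 0*2^6 + 1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0
bits = [1 0 1 0 1 1 0 1];                          % most significant bit first
bin_value = sum(bits .* 2.^(numel(bits)-1:-1:0));

disp(dec2bin(dec_value))        % built-in conversion to a binary string
disp(bin2dec('10101101'))       % and back to a decimal number
```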

26 Integer Representation
Signed magnitude method
Use the first bit of a word to indicate the sign: 0 for positive, 1 for negative.
The remaining bits are used to store the number.
[Figure: word layout with a sign bit followed by number bits; bit states off/on, close/open, negative/positive]

27 Integer Representation
8-bit word
+0 and -0 are the same, therefore we may use "-0" to represent "-128".
Total numbers = 2^8 = 256 (-128 to 127)
[Figure: word layout with sign bit and number bits]

28 Integer Representation
16-bit word
Range: -32,768 to 32,767
Overflow: > 32,767 (cannot represent 43,000 A&M students)
Underflow: < -32,768 (magnitude too large)
32-bit word
Range: -2,147,483,648 to 2,147,483,647 (9 significant digits)
Overflow: world population, 6 billion
Underflow: budget deficit, -$100 billion
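
These ranges can be checked with MATLAB's `intmin` and `intmax`. A minimal sketch; note that MATLAB's integer classes saturate at the nearest limit instead of reporting an error, so the slide's out-of-range examples come back clipped.

```matlab
lo16 = intmin('int16');     % -32768
hi16 = intmax('int16');     %  32767
lo32 = intmin('int32');     % -2147483648
hi32 = intmax('int32');     %  2147483647

students = int16(43000);    % too large for 16 bits: clipped to hi16
population = int32(6e9);    % too large for 32 bits: clipped to hi32
deficit = int32(-100e9);    % too negative for 32 bits: clipped to lo32
disp([students hi16])
disp([population hi32; deficit lo32])
```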

29 Integer Operations
Integer arithmetic can be exact as long as you don't get remainders in division (7/2 = 3 in integer math) or overflow the maximum integer.
For an 8-bit computer, max = 127 (or -128).
So 123 + 45 = overflow and -74 * 2 = underflow.
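
The same operations can be tried with MATLAB's `int8` class. This is a sketch of MATLAB's behavior, which differs in two ways from the idealized 8-bit machine described here: out-of-range results saturate rather than raise an overflow, and the `/` operator rounds to the nearest integer, so the truncating 7/2 = 3 needs `idivide` with the `'fix'` option.

```matlab
a = int8(123) + int8(45);                   % 168 exceeds 127: saturates to 127
b = int8(-74) * int8(2);                    % -148 is below -128: saturates to -128
q_round = int8(7) / int8(2);                % MATLAB rounds 3.5 to 4
q_fix = idivide(int8(7), int8(2), 'fix');   % truncating division gives 3
exact = int8(100) - int8(58);               % in range, so the result is exact
disp([a b q_round q_fix exact])
```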

30 Floating-Point Representation
Real numbers (also called floating-point numbers) are represented differently, to handle fractions and very large numbers.
Stored as: sign | signed exponent | mantissa
sign is 1 or 0 for negative or positive
exponent is the power (positive or negative) of the base
mantissa contains the significant digits

31 Floating-Point Representation
Word layout: sign of number | signed exponent | mantissa
$x = m \times B^{e}$
m: mantissa
B: base of the number system
e: "signed" exponent
Note: the mantissa is usually "normalized" if the leading digit is zero

32 Integer representation
Floating-point number representation

33 Decimal Representation
8-bit word: sign | signed exponent | number
1|095|[mantissa digits lost] (base: B = 10)
mantissa: m = -(1*10^-1 + [three terms lost, ending at 10^-4]) = [value lost]
signed exponent: e = +(9*10^1 + 5*10^0) = 95

34 Floating-Point Representation
8-bit word (without normalization): sign | signed exponent | number
0|111|0101 (base: B = 2)
mantissa: m = +(0*2^-1 + 1*2^-2 + 0*2^-3 + 1*2^-4) = 5/16
signed exponent: e = -(1*2^1 + 1*2^0) = -3

35 Normalization
[Equations lost: an unnormalized form labeled "less accurate" and its normalized form]
Remove the leading zero by lowering the exponent (d1 = 1 for all numbers).
If m < 1/2, multiply by 2 to remove the leading 0.
Floating-point allows fractions and very large numbers to be represented, but takes up more memory and CPU time.
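
MATLAB's two-output `log2` returns exactly this kind of normalized pair, a mantissa with 1/2 <= |m| < 1 and an integer exponent. A minimal sketch applying it to the unnormalized example m = 5/16, e = -3 from the previous slide; the variable names are illustrative.

```matlab
x = (5/16) * 2^(-3);        % unnormalized: m = 5/16 = 0.0101 (binary), e = -3
[m, e] = log2(x);           % normalized: x = m * 2^e with 0.5 <= |m| < 1
fprintf('x = %g = %g * 2^%d\n', x, m, e);
```

The mantissa doubles from 5/16 to 5/8 and the exponent drops by one, so the leading binary digit becomes 1 and one more bit is free for significant digits.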

36 Binary Representation
8-bit word (with normalization): sign | signed exponent | number
1|011|1001 (base: B = 2)
mantissa: m = -(1*2^-1 + 0*2^-2 + 0*2^-3 + 1*2^-4) = -9/16
signed exponent: e = +(1*2^1 + 1*2^0) = 3
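
The two 8-bit examples can be decoded mechanically. This sketch assumes the layout used on these slides: bit 1 is the sign of the number, bit 2 is the sign of the exponent, bits 3 and 4 are the exponent magnitude, and bits 5 to 8 are the mantissa. The mantissa bits 0101 and 1001 are the only 4-bit patterns consistent with the stated 5/16 and -9/16. The variable names are illustrative.

```matlab
words = {'01110101', '10111001'};     % unnormalized and normalized examples
values = zeros(1, numel(words));
for k = 1:numel(words)
    bits = words{k} - '0';            % character string to a vector of 0s and 1s
    sign_m = 1 - 2*bits(1);           % 0 -> +1, 1 -> -1
    sign_e = 1 - 2*bits(2);
    e = sign_e * (bits(3)*2^1 + bits(4)*2^0);
    m = sign_m * sum(bits(5:8) .* 2.^(-(1:4)));
    values(k) = m * 2^e;
    fprintf('%s: m = %g, e = %d, value = %g\n', words{k}, m, e, values(k));
end
```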

37 Single Precision
A real variable (number) is stored in four 8-bit bytes, or 32 bits (64 bits for supercomputers)
bit (binary digit): 0 or 1
4 bits: 2^4 = 16 possible values
byte: 8 bits, 2^8 = 256 possible values
32 bits: 23 for the digits, 8 for the signed exponent, 1 for the sign

38 Double Precision
A real variable is stored in eight 8-bit bytes, or 64 bits (128 bits for supercomputers)
64 bits: 52 for the digits, 11 for the signed exponent, 1 for the sign
signed exponent: ±2^10 = ±1024
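
The resulting range can be inspected with MATLAB's `realmax`, `realmin`, and `num2hex`. A minimal sketch; IEEE 754, which MATLAB follows, stores the exponent with a bias rather than a separate sign bit, so the actual limits differ slightly from the ±1024 picture.

```matlab
disp(num2hex(1))                  % raw 64-bit pattern of 1.0, in hexadecimal
big_d = realmax;                  % largest finite double, about 1.8e308
small_d = realmin;                % smallest normalized double, about 2.2e-308
big_s = realmax('single');        % largest finite single, about 3.4e38
over = realmax * 2;               % beyond the range: overflows to Inf
fprintf('%g %g %g %g\n', big_d, small_d, big_s, over);
```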

39 Round-off Errors
Floating-point characteristics contribute to round-off error (limited bits for storage):
A limited range of quantities can be represented
A finite number of quantities can be represented
The interval between numbers increases as the numbers grow
Example, three significant digits: [range lost] ([increment lost]); [range lost] (0.001 increment); [range lost] (0.01 increment)
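
The growing interval can be observed directly in double precision: `eps(x)` returns the distance from x to the next larger representable number. A minimal sketch with hypothetical sample points.

```matlab
xs = [1 1000 1e6 1e12];           % hypothetical sample magnitudes
gaps = eps(xs);                   % spacing to the next double at each magnitude
for k = 1:numel(xs)
    fprintf('x = %-8g spacing = %.3g\n', xs(k), gaps(k));
end
```

The relative spacing stays near 2^-52 while the absolute spacing grows with the magnitude, the same pattern as the three-significant-digit example.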

40 MATLAB
A finite number of real quantities (integers, real numbers or text) can be represented
For 8-bit, 2^8 = 256 quantities
For 16-bit, 2^16 = 65,536 quantities
MATLAB uses double precision: 8 bytes = 64 bits, more than 10^19 (2^64) quantities

