Numerical Analysis
EE, NCKU
Tien-Hao Chang (Darby Chang)
In the previous slide
Why numerical methods?
–differences between human and computer
–a very simple numerical method
What is an algorithm?
–definition and components
–three problems and three algorithms
Convergence
–compare rates of convergence
In this slide
Error (motivation)
Floating point number system
–difference to real number system
–problem of roundoff
Introduced/propagated error
Focus on numerical methods
–three bugs
Let's start from error
Numerical methods are generally designed to determine approximate solutions
3 categories of error types
–modeling: made when you decide the algorithm
–discretization/truncation: conversion from continuous to discrete and/or truncation of an infinite series
–roundoff/data: not due to the formulation of a numerical method, caused by the data representation (in computer)
Some of these errors can be analyzed; others should be prevented
1.3 Mathematics on the Computer: Floating Point Number Systems
Restriction of d_1
d_1 must not be zero (except when the number being represented is 0)
Floating point vs. real number
Discrete vs. continuous
–continuous means that between any two numbers, there are infinitely many other numbers
Finite vs. infinite
–number of elements and range of values
–a floating point number system contains a smallest and a largest element, hence underflow/overflow
Any Questions?
Floating point vs. real number
Nonuniform vs. uniform
–real numbers are uniformly distributed
–in a floating point number system, where are the elements more closely spaced?
–hint: think about the difference between two adjacent elements while the exponent changes
Floating point vs. real number
Nonuniform vs. uniform
–real numbers are uniformly distributed
–in a floating point number system, the elements near zero are more closely spaced
–think about the difference between two adjacent elements while the exponent changes
Floating point system is discrete, finite and nonuniform
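All three properties can be seen by listing a very small system. The sketch below assumes the normalized form 0.d_1 d_2 … d_k × β^e with d_1 ≠ 0 and uses a hypothetical toy system F(2,3,-1,2); the function name `toy_floats` and the parameter values are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

def toy_floats(beta=2, k=3, m=-1, M=2):
    """Positive elements 0.d1...dk * beta**e of F(beta, k, m, M), with d1 != 0."""
    values = []
    for e in range(m, M + 1):
        # d1 != 0 means the k-digit mantissa, read as an integer,
        # runs from beta**(k-1) up to beta**k - 1.
        for mantissa in range(beta ** (k - 1), beta ** k):
            values.append(Fraction(mantissa, beta ** k) * Fraction(beta) ** e)
    return sorted(values)

elements = toy_floats()
gaps = [b - a for a, b in zip(elements, elements[1:])]

print(len(elements), "positive elements, from", elements[0], "to", elements[-1])
print("smallest gap:", min(gaps), " largest gap:", max(gaps))
```

For this toy system there are only 16 positive elements (finite, discrete), lying between 1/4 and 7/2, and the gap between neighbours grows from 1/16 near zero to 1/2 at the top of the range (nonuniform).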
Roundoff error
When the number is outside the system
Select an element to represent the number
–chop
–round
A number to its floating point equivalent
–y → fl(y)
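A minimal sketch of y → fl(y) for both selection rules, assuming a k-digit decimal system and ignoring the exponent range. The helper name `fl` and the default k = 4 are illustrative; Python's `decimal` module is used only as a convenient way to keep k significant digits.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP

def fl(y, k=4, mode="round"):
    """Floating point equivalent of y with k decimal digits (exponent range ignored)."""
    # ROUND_DOWN truncates toward zero, which is chopping.
    rounding = ROUND_DOWN if mode == "chop" else ROUND_HALF_UP
    return Context(prec=k, rounding=rounding).plus(Decimal(y))

y = "3.14159265"
print(fl(y, mode="chop"))   # 3.141
print(fl(y, mode="round"))  # 3.142
```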
Formal definition
An example
In the general case (chopped)
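The standard argument for the chopped case can be sketched as follows. It assumes the normalized form with d_1 ≠ 0; the exponent symbol e and the name fl_chop are notation chosen here for illustration.

```latex
\begin{align*}
y &= \pm (0.d_1 d_2 \dots d_k d_{k+1} \dots)_\beta \times \beta^{e}, \qquad d_1 \neq 0,\\
fl_{\text{chop}}(y) &= \pm (0.d_1 d_2 \dots d_k)_\beta \times \beta^{e},\\
|y - fl_{\text{chop}}(y)| &= (0.d_{k+1} d_{k+2} \dots)_\beta \times \beta^{e-k} \le \beta^{e-k},\\
|y| &\ge (0.1)_\beta \times \beta^{e} = \beta^{e-1},\\
\frac{|y - fl_{\text{chop}}(y)|}{|y|} &\le \frac{\beta^{e-k}}{\beta^{e-1}} = \beta^{1-k}.
\end{align*}
```

Rounding halves the worst-case absolute error, which gives the bound (1/2) β^{1-k}. In both cases the exponent e cancels, so the relative bound involves only β and k.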
Machine precision/epsilon
The error bound is independent of the number, y
It depends on
–base (β)
–the number of digits (k)
The bound is a function of the hardware implementation
Cause of roundoff error
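Because the bound depends only on the hardware's β and k, it can be probed experimentally. The sketch below assumes Python floats are IEEE double precision (β = 2, k = 53); the function name `machine_epsilon` is illustrative.

```python
import sys

def machine_epsilon():
    """Smallest power of two eps with 1 + eps > 1 in double precision."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:   # stop once adding eps/2 no longer changes 1.0
        eps /= 2
    return eps

eps = machine_epsilon()
print(eps, eps == sys.float_info.epsilon)
```

The loop stops at 2^-52, which is β^(1-k) for β = 2 and k = 53, and it never looks at any particular y.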
Formal definition
Another term about precision
So far, we talked about floating point number systems in the abstract
Then, what systems are we likely to encounter in practice?
Real floating point system
1970s
–work began on a standard for binary floating point numbers, to eliminate inconsistencies
1985
–IEEE Standard 754 for Binary Floating-Point Arithmetic
The IEEE Standard
–F(2,24,-125,128), single precision
–F(2,53,-1021,1024), double precision
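The double precision parameters can be read directly from a running interpreter. This sketch assumes a platform whose Python `float` is an IEEE 754 double, which is the usual case; the variable names are illustrative.

```python
import sys

info = sys.float_info
# (base, number of digits, smallest exponent, largest exponent)
double_system = (info.radix, info.mant_dig, info.min_exp, info.max_exp)
print(double_system)

# Rounding bound (1/2) * beta**(1 - k) for single precision (beta = 2, k = 24)
single_bound = 0.5 * 2.0 ** (1 - 24)
print(single_bound)
```

On IEEE 754 hardware the tuple matches F(2,53,-1021,1024), and the single precision rounding bound is 2^-24, roughly 5.96 × 10^-8.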
IEEE standard single precision
Mathematics on the Computer: Floating Point Arithmetic
Motivation
Floating point arithmetic is the mathematics on the computer, but why should we know about it?
The IEEE Standard
–5.96 × 10^-8 (2^-24, the rounding bound for single precision)
–seems pretty accurate
However,
Numerical methods perform a sequence of calculations on a computer, where each operation introduces some roundoff error
The trouble comes when these errors are accumulated
Typical arithmetic
Three steps
–operand → its floating point equivalent
–the exact arithmetic
–result → its floating point equivalent
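The three steps can be sketched in a 4-digit decimal system. The names `fl` and `fl_op`, the choice K = 4 and the operands are illustrative assumptions. Step 2 relies on `decimal`'s default 28-digit context, which is exact for sums, differences and products of 4-digit operands of similar magnitude.

```python
import operator
from decimal import Context, Decimal, ROUND_HALF_UP

K = 4  # digits kept; an illustrative choice

def fl(y):
    """Round y to K significant decimal digits."""
    return Context(prec=K, rounding=ROUND_HALF_UP).plus(Decimal(y))

def fl_op(op, x, y):
    x_fl, y_fl = fl(x), fl(y)   # step 1: operands -> floating point equivalents
    exact = op(x_fl, y_fl)      # step 2: exact arithmetic on the rounded operands
    return fl(exact)            # step 3: result -> floating point equivalent

print(fl_op(operator.mul, "1.23456", "2.34567"))  # 2.897
```

The exact product of the original operands is about 2.8959, so even one operation leaves an error in the fourth digit.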
Not associative
[Worked example: the same sum evaluated with two groupings, ( … ) + 23.21 = … = … versus … ( … ) = … = 24.88]
In which order should we perform the arithmetic to obtain the most accurate result?
All intermediate results have been rounded
Any Questions?
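Not associative
[Worked example: the same sum evaluated with two groupings, ( … ) + 23.21 = … = … versus … ( … ) = … = 24.88]
We should perform the arithmetic in ascending order (smallest magnitudes first) to obtain the most accurate result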
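The effect of summation order can be reproduced with a small experiment. The numbers below are hypothetical, not the ones from the slide, and the 4-digit rounding context `CTX` and the helper `fl_sum` are illustrative choices.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)

def fl_sum(terms):
    """Left-to-right sum where every intermediate result is rounded to 4 digits."""
    total = Decimal(0)
    for term in terms:
        total = CTX.add(total, CTX.plus(Decimal(term)))
    return total

terms = ["1000", "0.4", "0.4", "0.4"]   # exact sum: 1001.2
large_first = fl_sum(terms)
small_first = fl_sum(sorted(terms, key=lambda t: abs(Decimal(t))))
print(large_first, small_first)         # 1000 1001
```

Adding the large term first swallows each 0.4 in turn, while adding the small terms first lets them accumulate to 1.2 before they meet the large term.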
In FP arithmetic, always pay attention to the number of significant digits and the least significant bits
Not distributive
Accumulation of roundoff error
Introduced/propagated error
Propagated error can be large even if the introduced error is small
A notation in the analysis
In multiplication
In division
The relative error propagates slowly
The absolute error can grow rapidly when multiplying by a large number or dividing by a small number
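Both statements can be checked with exact rational arithmetic, so that the only errors present are the ones put in by hand. The values of x, y and the relative errors dx, dy are hypothetical, and the helper names are illustrative.

```python
from fractions import Fraction

def abs_err(approx, exact):
    return abs(approx - exact)

def rel_err(approx, exact):
    return abs(approx - exact) / abs(exact)

x, y = Fraction(3), Fraction(7)
dx, dy = Fraction(1, 1000), Fraction(-1, 2000)   # hypothetical relative errors
x_approx, y_approx = x * (1 + dx), y * (1 + dy)

# Multiplication: the relative error of the product is |dx + dy + dx*dy|.
print(float(rel_err(x_approx * y_approx, x * y)))

# Multiplying by a large number: relative error unchanged, absolute error scaled up.
big = Fraction(10 ** 6)
print(float(rel_err(x_approx * big, x * big)), float(abs_err(x_approx * big, x * big)))

# Dividing by a small number has the same effect on the absolute error.
small = Fraction(1, 10 ** 6)
print(float(abs_err(x_approx / small, x / small)))
```

The relative error stays near 10^-3 throughout, while the absolute error of x grows from 0.003 to 3000.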
Propagated error in addition and subtraction
Absolute vs. relative error
Multiplication and division may result in large absolute error
Addition and subtraction may result in large relative error
–more crucial
–cancellation error: two nearly equal numbers are subtracted
Algorithms should avoid the subtraction of nearly equal numbers
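Cancellation error is easy to reproduce in 4-digit arithmetic. The two operands below are hypothetical and the rounding context `CTX` is an illustrative choice.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)

x, y = Decimal("1.23456"), Decimal("1.23411")   # hypothetical, nearly equal
x_fl, y_fl = CTX.plus(x), CTX.plus(y)           # 1.235 and 1.234

exact_diff = x - y                      # 0.00045
fl_diff = CTX.subtract(x_fl, y_fl)      # 0.001

rel_operand = abs(x_fl - x) / x                        # error introduced by rounding x
rel_result = abs(fl_diff - exact_diff) / exact_diff    # error after the subtraction
print(float(rel_operand), float(rel_result))
```

Each operand is accurate to better than 0.1%, yet the computed difference is off by more than 100%: the leading digits cancel and only the rounded trailing digit survives.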
Recall that
Should be prevented
–roundoff/data: not due to the formulation of a numerical method, caused by the data representation (in computer)
To prevent, we need to know the floating point system
Bug 1
Be careful with ±
In action
Analysis
The larger root
–… (actual root: …)
–is the floating point equivalent of the actual root
The smaller root
–0.15 (actual root: …)
–nearly 20% relative error
Any Questions?
An intuitive question
How to solve the quadratic formula problem?
–hint: reformulate the calculation of the smaller root
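One common reformulation computes the larger-magnitude root first, where no cancellation occurs, and then obtains the smaller root from the product of the roots, x1 · x2 = c/a. The sketch below uses 4-digit decimal arithmetic and a hypothetical equation x² + 62.10x + 1 = 0 (not the one from the slides); `fl_roots` and `CTX` are illustrative names, and real roots with b ≠ 0 are assumed.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # every operation rounded to 4 digits

def fl_roots(a, b, c, stable):
    """Return (smaller-magnitude root, larger-magnitude root); assumes real roots, b != 0."""
    a, b, c = (CTX.plus(Decimal(v)) for v in (a, b, c))
    disc = CTX.subtract(CTX.multiply(b, b),
                        CTX.multiply(Decimal(4), CTX.multiply(a, c)))
    s = CTX.sqrt(disc).copy_sign(b)      # sqrt(b^2 - 4ac), given the sign of b
    two_a = CTX.multiply(Decimal(2), a)
    # -(b + s) / 2a adds two numbers of the same sign: no cancellation.
    large = CTX.divide(CTX.minus(CTX.add(b, s)), two_a)
    if stable:
        small = CTX.divide(c, CTX.multiply(a, large))    # from x1 * x2 = c / a
    else:
        small = CTX.divide(CTX.subtract(s, b), two_a)    # (-b + s) / 2a: cancellation
    return small, large

print(fl_roots("1", "62.10", "1", stable=False))
print(fl_roots("1", "62.10", "1", stable=True))
```

For this equation the smaller root is about -0.0161072. The direct formula returns -0.02 (roughly 24% relative error), while the reformulated version returns -0.01610.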
Bug 2
Multiplier -1/6
The world is cruel :p You got
After one pass of Gaussian elimination
The next multiplier: fl(-3.333/0.0001)
Cascade of effects
Cancellation error led to a small pivot element
A small pivot led to a large multiplier
A large multiplier then led to loss of significant digits
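The last two links of the cascade can be reproduced on a hypothetical 2×2 system in 4-digit arithmetic (these are not the numbers from the slides). The function `eliminate_2x2` is an illustrative name. The second call, which exchanges the rows before eliminating, is the usual partial pivoting remedy and is shown here only for comparison.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)

def eliminate_2x2(rows):
    """Gaussian elimination without row exchanges on [[a11, a12, b1], [a21, a22, b2]]."""
    (a11, a12, b1), (a21, a22, b2) = [[Decimal(v) for v in row] for row in rows]
    m = CTX.divide(a21, a11)                          # multiplier
    a22 = CTX.subtract(a22, CTX.multiply(m, a12))     # a large m swamps the original a22
    b2 = CTX.subtract(b2, CTX.multiply(m, b1))
    y = CTX.divide(b2, a22)
    x = CTX.divide(CTX.subtract(b1, CTX.multiply(a12, y)), a11)
    return m, x, y

# Hypothetical system: 0.00001 x + y = 1 and x + y = 2 (x ≈ 1.00001, y ≈ 0.99999)
system = [["0.00001", "1", "1"], ["1", "1", "2"]]
print(eliminate_2x2(system))         # tiny pivot, multiplier 1E+5
print(eliminate_2x2(system[::-1]))   # rows exchanged, multiplier 0.00001
```

With the tiny pivot the multiplier is 100000, the original entries of the second row disappear in the rounding, and the computed x is 0. With the rows exchanged the same arithmetic returns x = 1 and y = 1.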
4.167 disappeared
Bug 3
Values of a function
Even evaluating a function can prove difficult
f(x) = e^x - cos x - x, where x → 0
–e^x → 1
–cos x → 1
–so e^x - cos x subtracts two nearly equal numbers
How to reformulate?
When seeing cos x, sin x and e^x, think of Taylor series
Reformulating with Taylor series
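Expanding e^x and cos x about 0 and collecting terms gives e^x - cos x - x = x² + x³/6 + x⁵/120 + … (the constant, linear and x⁴ terms cancel). The sketch below compares the direct evaluation with this truncated series in ordinary double precision. The function names and the choice of three terms are illustrative, and the series form is only appropriate for small |x|.

```python
import math

def f_naive(x):
    """Direct evaluation: suffers cancellation as x -> 0."""
    return math.exp(x) - math.cos(x) - x

def f_taylor(x):
    """Leading Taylor terms about 0: x^2 + x^3/6 + x^5/120."""
    return x * x * (1 + x / 6 + x ** 3 / 120)

x = 1e-9
print(f_naive(x), f_taylor(x))
```

At x = 10^-9 the true value is very close to 10^-18. The series reproduces it, while the direct evaluation returns a number dominated by roundoff.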
More precision
These bugs are under F(10,4,-,-)
Just add more precision
–FORTRAN REAL*8 → REAL*16
–C/C++ float → double
Does not always work
–an example introduced by Rump and reconsidered by Aberth, Precise Numerical Methods Using C++
Need at least 37 digits
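A sketch of such an experiment, assuming the example meant here is Rump's well-known expression 333.75 b⁶ + a²(11a²b² - b⁶ - 121b⁴ - 2) + 5.5 b⁸ + a/(2b) at a = 77617, b = 33096. The function name `rump` and the `num` parameter are illustrative. The same formula is evaluated once in exact rational arithmetic and once in double precision.

```python
from fractions import Fraction

def rump(a, b, num):
    """Rump's expression evaluated in the number type `num` (float or Fraction)."""
    a, b = num(a), num(b)
    return (num("333.75") * b ** 6
            + a ** 2 * (11 * a ** 2 * b ** 2 - b ** 6 - 121 * b ** 4 - 2)
            + num("5.5") * b ** 8
            + a / (2 * b))

exact = rump(77617, 33096, Fraction)   # rational arithmetic, no roundoff
approx = rump(77617, 33096, float)     # IEEE double precision
print(float(exact), approx)
```

The exact value is -54767/66192, about -0.8274. The double precision result does not share even its leading digit, because the individual terms are of order 10^36 and cancel almost completely.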
Any Questions?
Good, that means we would like to have exercises
Exercise
Due /3/25 9:00am: send to … or hand in during class.
You may arbitrarily pick one problem among the first three, which means this exercise contains only five problems