Published by Leslie Blankenship. Modified over 9 years ago.
1
Numerical Analysis
EE, NCKU
Tien-Hao Chang (Darby Chang)
2
In the previous slide
Why numerical methods?
– differences between human and computer
– a very simple numerical method
What is an algorithm?
– definition and components
– three problems and three algorithms
Convergence
– comparing rates of convergence
3
In this slide
Error (motivation)
Floating point number system
– differences from the real number system
– the problem of roundoff
Introduced/propagated error
Focus on numerical methods
– three bugs
4
Let’s start from error
Numerical methods are generally designed to determine approximate solutions
3 categories of error
– modeling: made when you decide the algorithm
– discretization/truncation: conversion from continuous to discrete and/or truncation of an infinite series
– roundoff/data: not due to the formulation of a numerical method, but caused by the data representation (in the computer)
5
Can be analyzed
(same three categories of error as on slide 4)
6
Should be prevented
(same three categories of error as on slide 4)
7
1.3
Mathematics on the Computer: Floating Point Number Systems
9
Restriction on d_1
d_1 must not be zero (except when the number being represented is 0)
10
Floating point vs. real number
Discrete vs. continuous
– continuous means that between any two numbers, there are infinitely many other numbers
Finite vs. infinite
– number of elements and range of values
– a floating point number system has a smallest and a largest element, which leads to underflow/overflow
11
Any Questions?
12
Floating point vs. real number
Nonuniform vs. uniform
– real numbers are uniformly distributed
– in a floating point number system, the elements **** *** **** are more closely spaced
Hint: think about the difference between two adjacent elements as the exponent changes
14
Floating point vs. real number
Nonuniform vs. uniform
– real numbers are uniformly distributed
– in a floating point number system, the elements near zero are more closely spaced
think about the difference between two adjacent elements as the exponent changes
15
A floating point system is discrete, finite and nonuniform
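A minimal sketch of these three properties, enumerating the positive elements of a toy system F(2,3,-1,2). It assumes the normalized form 0.d_1d_2...d_k × β^e with d_1 ≠ 0 and m ≤ e ≤ M; the function name and the toy parameters are illustrative and not from the slides.

```python
from fractions import Fraction
from itertools import product

def positive_elements(beta, k, m, M):
    """Positive elements of F(beta, k, m, M), assuming the normalized
    form 0.d1 d2 ... dk * beta**e with d1 != 0 and m <= e <= M."""
    elements = set()
    for e in range(m, M + 1):
        for d1 in range(1, beta):                      # leading digit is nonzero
            for rest in product(range(beta), repeat=k - 1):
                digits = (d1, *rest)
                mantissa = sum(Fraction(d, beta ** (i + 1))
                               for i, d in enumerate(digits))
                elements.add(mantissa * Fraction(beta) ** e)
    return sorted(elements)

elements = positive_elements(2, 3, -1, 2)
gaps = [b - a for a, b in zip(elements, elements[1:])]

print(len(elements))                        # finitely many elements
print([float(x) for x in elements])         # discrete values
print(float(min(gaps)), float(max(gaps)))   # spacing is not uniform
```

The smallest gap sits next to the smallest element and the gap doubles each time the exponent increases by one, which is the nonuniform spacing described above.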
16
Roundoff error
Occurs when a number is not an element of the system
Select an element to represent the number
– chop
– round
A number to its floating point equivalent
– y → fl(y)
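A small sketch of y → fl(y) in a base-10 system with k digits, using Python's decimal module. The helper name fl and the sample values are assumptions for illustration; "round" here means round-half-up.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP

def fl(y, k, mode="round"):
    """Floating point equivalent of y in a base-10 system with k digits.
    mode="chop" discards the extra digits; mode="round" rounds half up."""
    rounding = ROUND_DOWN if mode == "chop" else ROUND_HALF_UP
    return Context(prec=k, rounding=rounding).create_decimal(y)

print(fl("3.14159265", 4, "chop"))    # 3.141
print(fl("3.14159265", 4, "round"))   # 3.142
```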
20
Formal definition
21
An example
22
In the general case (chopped)
24
Machine precision/epsilon
The error bound is independent of the number y
It depends on
– the base (β)
– the number of digits (k)
The bound is a function of the hardware implementation
Cause of roundoff error
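The formula itself did not survive in this transcript. As an assumption about what the slide showed, the sketch below uses the standard bounds on the relative roundoff error: β^(1-k) for chopping and (1/2)β^(1-k) for rounding. The function name unit_roundoff is illustrative.

```python
import sys

def unit_roundoff(beta, k, mode="round"):
    """Standard bound on the relative roundoff error |y - fl(y)| / |y|."""
    bound = float(beta) ** (1 - k)
    return bound / 2 if mode == "round" else bound

print(unit_roundoff(2, 24))   # 24 binary digits, rounding: about 5.96e-08
print(unit_roundoff(2, 53))   # 53 binary digits, rounding: about 1.11e-16

# Python floats have 53 binary digits, so the bounds can be checked directly.
print(sys.float_info.epsilon == unit_roundoff(2, 53, mode="chop"))  # True
print(1.0 + unit_roundoff(2, 53) == 1.0)                            # True
```

The bound depends only on β and k, never on y, which is the point of the slide.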
25
Formal definition
26
Another term about precision
28
So far, we have talked about floating point number systems in the abstract
29
Then, what systems are we likely to encounter in practice?
30
Real floating point system
1970s
– work began on a standard for binary floating point numbers to eliminate inconsistencies
1985
– IEEE Standard 754 for Binary Floating Point Arithmetic
The IEEE Standard
– F(2,24,-125,128), single precision
– F(2,53,-1021,1024), double precision
31
IEEE standard single precision
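The layout figure is missing from this transcript. As a hedged sketch, the code below splits a value stored in IEEE 754 single precision into its sign, exponent and fraction fields; the helper name decode_single is an assumption for illustration.

```python
import struct

def decode_single(x):
    """Split x, stored as IEEE 754 single precision, into its three bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits; the leading 1 is implicit
    return sign, exponent, fraction

print(decode_single(1.0))        # (0, 127, 0)
print(decode_single(-0.15625))   # (1, 124, 2097152)
```

The 23 stored bits plus the implicit leading bit give the 24 digits of F(2,24,-125,128).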
32
1.4
Mathematics on the Computer: Floating Point Arithmetic
33
Motivation
Floating point arithmetic is how mathematics is carried out on the computer, but why should we know about it?
The IEEE Standard
– machine precision of 5.96 × 10^-8 (single precision)
– seems pretty accurate
However,
34
Numerical methods perform a sequence of calculations on the computer, where each operation introduces some roundoff error
35
when they are accumulated
http://www.radgraphics.net/images/main/atomic%20explosion%20-%204.jpg
36
Typical arithmetic
Three steps
– operand → its floating point equivalent
– the exact arithmetic
– result → its floating point equivalent
38
Not associative
(0.1329+1.543)+23.21 = 1.676+23.21 = 24.89
0.1329+(1.543+23.21) = 0.1329+24.75 = 24.88
Question: we should perform the arithmetic in ********* order to obtain the most accurate result
39
All intermediate results have been rounded
40
Any Questions?
42
Not associative
(0.1329+1.543)+23.21 = 1.676+23.21 = 24.89
0.1329+(1.543+23.21) = 0.1329+24.75 = 24.88
We should perform the arithmetic in ascending order (smallest magnitude first) to obtain the most accurate result
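The two sums can be reproduced with a short sketch that rounds every intermediate result to four decimal digits, i.e. F(10,4,-,-). It uses Python's decimal module with round-half-up; the names F10_4 and fl_sum are illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

F10_4 = Context(prec=4, rounding=ROUND_HALF_UP)  # rounds every result to 4 digits

def fl_sum(terms):
    """Add the terms left to right, rounding after every addition."""
    total = Decimal(0)
    for term in terms:
        total = F10_4.add(total, Decimal(term))
    return total

terms = ["0.1329", "1.543", "23.21"]
print(fl_sum(terms))             # ascending order:  24.89
print(fl_sum(reversed(terms)))   # descending order: 24.88
```

The exact sum is 24.8859, so the ascending order gives the correctly rounded answer.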
43
In FP arithmetic, always pay attention to the number of significant digits and the least significant bits
44
Not distributive
45
Accumulation of roundoff error
47
Introduced/propagated error
48
Propagated error can be large even if the introduced error is small
49
A notation in the analysis
50
In multiplication
51
In division
52
The relative error propagates slowly
The absolute error can grow rapidly when multiplying by a large number or dividing by a small number
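A small numerical illustration of these two statements, using exact rational arithmetic so that only the propagated error is visible. The values are made up for the example and are not from the slides.

```python
from fractions import Fraction

def abs_err(approx, exact):
    return abs(approx - exact)

def rel_err(approx, exact):
    return abs(approx - exact) / abs(exact)

exact = Fraction(1, 3)
approx = Fraction(3333, 10000)   # 1/3 rounded to four digits
small = Fraction(1, 10000)       # a small divisor, assumed to be exact

# Before the division: small absolute error, small relative error.
print(float(abs_err(approx, exact)), float(rel_err(approx, exact)))

# After dividing by a small number: the absolute error is 10000 times larger,
# while the relative error is unchanged.
print(float(abs_err(approx / small, exact / small)),
      float(rel_err(approx / small, exact / small)))
```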
53
Propagated error in addition and subtraction
54
In addition and subtraction
55
Absolute vs. relative error
Multiplication and division may result in a large absolute error
Addition and subtraction may result in a large relative error
– more crucial
– cancellation error: two nearly equal numbers are subtracted
Algorithms should avoid the subtraction of nearly equal numbers
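A sketch of cancellation error in four-digit decimal arithmetic. The two operands are illustrative values chosen to be nearly equal; they are not taken from the slides.

```python
from decimal import Context, Decimal, ROUND_HALF_UP
from fractions import Fraction

F10_4 = Context(prec=4, rounding=ROUND_HALF_UP)

def rel_err(approx, exact):
    """Exact relative error |approx - exact| / |exact|."""
    return abs(Fraction(approx) - Fraction(exact)) / abs(Fraction(exact))

p, q = Decimal("0.54617"), Decimal("0.54601")   # illustrative, nearly equal
fl_p, fl_q = F10_4.plus(p), F10_4.plus(q)       # 0.5462 and 0.5460
diff = F10_4.subtract(fl_p, fl_q)               # 0.0002
exact_diff = p - q                              # 0.00016

print(float(rel_err(fl_p, p)), float(rel_err(fl_q, q)))  # both below 1e-4
print(float(rel_err(diff, exact_diff)))                  # 0.25
```

Each operand is accurate to better than one part in ten thousand, yet their difference is off by 25%: the leading digits cancel and only the rounded trailing digits remain.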
56
Recall that
http://www.dianadepasquale.com/ThinkingMonkey.jpg
57
Should be prevented
Numerical methods are generally designed to determine approximate solutions
3 categories of error
– modeling: made when you decide the algorithm
– discretization/truncation: conversion from continuous to discrete and/or truncation of an infinite series
– roundoff/data: not due to the formulation of a numerical method, but caused by the data representation (in the computer)
58
To prevent it, we need to know the floating point system
59
Bug 1
http://rinat.relcom.net/Gallery/slides/bug.jpg
61
±: be careful
62
In action
http://thomashawk.com/hello/209/1017/1024/Jackson%20Running.jpg
64
Analysis
The larger root
– 239.4 (actual root: 239.4246996)
– is the floating point equivalent of the actual root
The smaller root
– 0.15 (actual root: 0.1253003555)
– nearly 20% relative error
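The equation on the slide is missing from this transcript. As an assumption, the sketch below uses 0.2x^2 - 47.91x + 6 = 0, whose exact roots match the "actual" values quoted above, and applies the quadratic formula with every operation rounded to four digits. The function name is illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def quadratic_roots_naive(a, b, c, digits=4):
    """Both roots from the quadratic formula, rounding every operation."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    a, b, c = Decimal(a), Decimal(b), Decimal(c)
    four_ac = ctx.multiply(ctx.multiply(Decimal(4), a), c)
    disc = ctx.subtract(ctx.multiply(b, b), four_ac)
    root = ctx.sqrt(disc)
    two_a = ctx.multiply(Decimal(2), a)
    minus_b = ctx.minus(b)
    return (ctx.divide(ctx.add(minus_b, root), two_a),
            ctx.divide(ctx.subtract(minus_b, root), two_a))

# Assumed coefficients; they are not stated in the transcript.
larger, smaller = quadratic_roots_naive("0.2", "-47.91", "6")
print(larger, smaller)   # 239.4 0.15
```

The smaller root comes from 47.91 - 47.85 = 0.06, a subtraction of nearly equal numbers that leaves a single significant digit.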
65
Any Questions?
66
An intuitive question
How to solve the quadratic formula problem?
Hint: reformulate the calculation of the smaller root
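The slides with the reformulation are missing from this transcript. One common reformulation, sketched here as an assumption, computes the larger-magnitude root without cancellation and then obtains the other root from the product of the roots, x1·x2 = c/a, which gives 2c / (-b ± sqrt(b^2 - 4ac)). The coefficients 0.2, -47.91, 6 are the same assumed example and the function name is illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def quadratic_roots_stable(a, b, c, digits=4):
    """Roots computed so that no two nearly equal numbers are subtracted."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    a, b, c = Decimal(a), Decimal(b), Decimal(c)
    four_ac = ctx.multiply(ctx.multiply(Decimal(4), a), c)
    root = ctx.sqrt(ctx.subtract(ctx.multiply(b, b), four_ac))
    minus_b = ctx.minus(b)
    # Pick the sign for which -b and the square root add in magnitude.
    s = ctx.add(minus_b, root) if b < 0 else ctx.subtract(minus_b, root)
    big = ctx.divide(s, ctx.multiply(Decimal(2), a))
    small = ctx.divide(ctx.multiply(Decimal(2), c), s)   # equals c / (a * big)
    return big, small

print(quadratic_roots_stable("0.2", "-47.91", "6"))   # 239.4 and 0.1253
```

With the same four-digit arithmetic, the smaller root is now the floating point equivalent of 0.1253003555.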
69
Bug 2
http://rinat.relcom.net/Gallery/slides/bug.jpg
71
Multiplier -1/6
The world is cruel :p You got -1.667
73
After one pass of Gaussian elimination
http://i5.tinypic.com/4yqudc7.jpg
75
The next multiplier
fl(-3.333/0.0001)
76
-33330
http://www.radgraphics.net/images/main/atomic%20explosion%20-%204.jpg
78
Cascade of effects
Cancellation error led to a small pivot element
A small pivot led to a large multiplier
A large multiplier then led to loss of significant digits
79
4.167 disappeared
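The system used on the slides did not survive in this transcript. As a stand-in, the sketch below runs Gaussian elimination without row exchanges on a hypothetical 2×2 system with a small pivot, 0.003000x + 59.14y = 59.17 and 5.291x - 6.130y = 46.78 (exact solution x = 10, y = 1), rounding every operation to four digits. The function name is illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

def solve_2x2_no_pivoting(a11, a12, b1, a21, a22, b2, digits=4):
    """Gaussian elimination on a 2x2 system, rounding every operation."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    a11, a12, b1, a21, a22, b2 = (Decimal(v) for v in (a11, a12, b1, a21, a22, b2))
    m = ctx.divide(a21, a11)                             # small pivot, large multiplier
    a22_new = ctx.subtract(a22, ctx.multiply(m, a12))    # a22 is swamped here
    b2_new = ctx.subtract(b2, ctx.multiply(m, b1))       # and so is b2
    y = ctx.divide(b2_new, a22_new)
    x = ctx.divide(ctx.subtract(b1, ctx.multiply(a12, y)), a11)
    return m, x, y

# Hypothetical system, not the one from the slides.
m, x, y = solve_2x2_no_pivoting("0.003000", "59.14", "59.17",
                                "5.291", "-6.130", "46.78")
print(m, x, y)   # multiplier 1764, x = -10 instead of 10, y = 1.001 instead of 1
```

The original entries -6.130 and 46.78 vanish when they are added to numbers of size 10^5, which is the same kind of loss as the disappearing 4.167.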
80
Bug 3
http://rinat.relcom.net/Gallery/slides/bug.jpg
81
Values of a function
Even evaluating a function can prove difficult
f(x) = e^x - cos x - x, where x → 0
– e^x → 1
– cos x → 1
so e^x - cos x is a subtraction of two nearly equal numbers
84
How to reformulate
When you see cos x, sin x and e^x, think of Taylor series
85
Reformulating with Taylor series
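The derivation on the slide is missing from this transcript. Expanding about 0 gives e^x - cos x - x = x^2 + x^3/6 + x^5/120 + ... (the x^4 terms cancel). The sketch below compares the direct formula with this truncated series in ordinary double precision rather than the four-digit system of the slides; the function names are illustrative.

```python
import math

def f_naive(x):
    """Direct evaluation; suffers from cancellation for small x."""
    return math.exp(x) - math.cos(x) - x

def f_series(x):
    """Leading Taylor terms about 0: x^2 + x^3/6 + x^5/120."""
    return x * x * (1.0 + x / 6.0 + x ** 3 / 120.0)

x = 1e-9
print(f_naive(x))    # dominated by roundoff
print(f_series(x))   # close to x^2 = 1e-18
```

The series is only accurate for small x (the first omitted term is x^6/360), so it replaces the direct formula near 0 rather than everywhere.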
87
More precision
These bugs occur under F(10,4,-,-)
Just add more precision
– FORTRAN: REAL*8 → REAL*16
– C/C++: float → double
This does not always work
– an example introduced by Rump and reconsidered by Aberth, Precise Numerical Methods Using C++, 1998
89
Need at least 37 digits
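The formula on the slide did not survive in this transcript. Assuming it is the well-known Rump example, f(a, b) = 333.75b^6 + a^2(11a^2b^2 - b^6 - 121b^4 - 2) + 5.5b^8 + a/(2b) at a = 77617, b = 33096, the sketch below evaluates it once in double precision and once exactly with rational arithmetic. The function name and the parameter names c333 and c55 are illustrative.

```python
from fractions import Fraction

def rump(a, b, c333, c55):
    """Rump's expression; c333 and c55 stand for the constants 333.75 and 5.5."""
    return (c333 * b**6 + a**2 * (11 * a**2 * b**2 - b**6 - 121 * b**4 - 2)
            + c55 * b**8 + a / (2 * b))

as_float = rump(77617.0, 33096.0, 333.75, 5.5)
exact = rump(Fraction(77617), Fraction(33096), Fraction(1335, 4), Fraction(11, 2))

print(as_float)              # double precision: not even the right sign or size
print(exact, float(exact))   # -54767/66192, about -0.827396
```

The individual terms are of size 10^36 and cancel almost completely, which is consistent with needing about 37 digits before any correct digit appears.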
90
Any Questions?
91
Good, that means we would like to have exercises
92
Exercise
2010/3/25 9:00am
Email to darby@ee.ncku.edu.tw or hand in during class. You may pick any one problem among the first three, which means this exercise contains only five problems.