Recall our hypothetical computer Marc-32

Sign: 1 bit; Mantissa: 23 bits; Exponent: 8 bits
Normalized floating-point representation: $0 \neq x = \pm q \times 2^{m}$ $(1 \le q < 2,\ -126 \le m \le 127)$
Unit roundoff error: $\varepsilon = 2^{-24}$
Floating-point machine number $fl(x)$: $fl(x) = x(1+\delta)$, $|\delta| \le \varepsilon$

Example 1 (P46) What is the binary form of $x = 2/3$? What are the two nearby machine numbers $x_-$ and $x_+$ in the Marc-32? Which one is taken to be $fl(x)$? What are the absolute roundoff error and relative roundoff error in representing $x$ by $fl(x)$?

Solution. First, we write 2/3 in binary form:
$2/3 = (0.a_1a_2a_3\ldots)_2$ (1)
where each $a_i$ is either 0 or 1. Multiplying both sides by 2 gives
$4/3 = (a_1.a_2a_3\ldots)_2$
Taking the integer part of both sides, we get $a_1 = 1$. So
$1/3 = 4/3 - 1 = (a_1.a_2a_3\ldots)_2 - 1 = (0.a_2a_3a_4\ldots)_2$
and, multiplying by 2 again,
$2/3 = (a_2.a_3a_4a_5\ldots)_2$ (2)

Taking integer parts in (2) gives $a_2 = 0$, and comparing (1) with (2) then gives
$1 = a_1 = a_3 = a_5 = a_7 = \ldots$
$0 = a_2 = a_4 = a_6 = a_8 = \ldots$
Thus,
$x = 2/3 = (0.101010\ldots)_2 = (1.010101\ldots)_2 \times 2^{-1}$
In Marc-32, the two nearby machine numbers, each with 23 bits after the binary point, are
$x_- = (1.0101\ldots010)_2 \times 2^{-1}$ (by chopping)
$x_+ = (1.0101\ldots011)_2 \times 2^{-1}$ (by rounding up)
Recall: $x_- < x < x_+$

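Next,
$x - x_- = (1.01010\ldots)_2 \times 2^{-24} \times 2^{-1} = (0.101010\ldots)_2 \times 2^{-23} \times 2^{-1} = \tfrac{2}{3} \times 2^{-24}$
$x_+ - x = (x_+ - x_-) - (x - x_-) = 2^{-24} - \tfrac{2}{3} \times 2^{-24} = \tfrac{1}{3} \times 2^{-24}$
So, $fl(x) = ?$ Since $x_+$ is the nearer machine number, $fl(x) = x_+$. The absolute roundoff error is $|fl(x) - x| = \tfrac{1}{3} \times 2^{-24}$, and the relative roundoff error is $(\tfrac{1}{3} \times 2^{-24}) / (\tfrac{2}{3}) = 2^{-25}$.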
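The rounding in this example can be checked numerically. The sketch below is only an illustration under one assumption: IEEE single precision (23 stored fraction bits, round to nearest) is used as a stand-in for Marc-32, which is a hypothetical machine. The helper name `fl32` is made up for this sketch.

```python
import struct
from fractions import Fraction


def fl32(v: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value.

    Assumption: IEEE single precision stands in for Marc-32 here.
    """
    return struct.unpack("f", struct.pack("f", v))[0]


x = Fraction(2, 3)              # the exact real number 2/3
fl_x = Fraction(fl32(2 / 3))    # exact rational value of the stored number

abs_err = abs(fl_x - x)         # absolute roundoff error
rel_err = abs_err / x           # relative roundoff error

print("rounded up:", fl_x > x)
print("absolute error:", abs_err)
print("relative error:", rel_err)
```

Exact rational arithmetic (`fractions.Fraction`) is used for the errors so that the check itself introduces no further roundoff.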

Input numbers a, b, c, … → Normalized → Rounded off → Stored as machine numbers fl(a), fl(b), … → Do one arithmetic operation/calculation → Obtain a number (result), e.g. $fl(a) \odot fl(b)$

The computer with 5 decimal digits stores those results in rounded form as
The relative errors are respectively

Let $\odot$ denote one of the four basic arithmetic operations: $+$, $-$, $\times$, $\div$.
Assume $x, y$ are machine numbers. Then there is some constant $\delta$ such that
$fl(x \odot y) = [x \odot y](1+\delta)$, where $|\delta| \le \varepsilon$;
here, $\varepsilon$ can be taken to be the unit roundoff error for the machine. In Marc-32, $\varepsilon = 2^{-24}$.
Q: How to compute $x \odot y$ if $x, y$ are not machine numbers?
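The bound on a single operation can be observed directly. This is a minimal sketch, again assuming IEEE single precision as a stand-in for Marc-32. The names `fl32` and `mul_delta` are hypothetical helpers. For single-precision inputs the double-precision product is exact, so rounding it once to single precision gives the correctly rounded result.

```python
import struct
from fractions import Fraction

EPS = Fraction(1, 2**24)  # unit roundoff quoted for Marc-32


def fl32(v: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def mul_delta(x: float, y: float) -> Fraction:
    """Return delta such that fl(x*y) = (x*y)(1 + delta).

    x and y are assumed to be single-precision machine numbers already.
    """
    exact = Fraction(x) * Fraction(y)       # exact real product
    computed = Fraction(fl32(x * y))        # product rounded to single
    return (computed - exact) / exact


x, y = fl32(2 / 3), fl32(0.1)               # two machine numbers
delta = mul_delta(x, y)
print("delta =", float(delta), "within bound:", abs(delta) <= EPS)
```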

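If $x, y$ are not machine numbers, then there are still constants $\delta_1, \delta_2, \delta_3$ such that
$fl(x) = x(1+\delta_1)$
$fl(y) = y(1+\delta_2)$
$fl(x \odot y) = fl(fl(x) \odot fl(y)) = (fl(x) \odot fl(y))(1+\delta_3) = [(x(1+\delta_1)) \odot (y(1+\delta_2))](1+\delta_3)$
For multiplication this becomes
$(x \times y)(1+\delta_1+\delta_2+\delta_1\delta_2)(1+\delta_3) \approx x \times y$
where $|\delta_1|, |\delta_2|, |\delta_3| \le \varepsilon$; still, $\varepsilon$ can be taken to be the unit roundoff error for the machine.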

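Q: How about compound arithmetic operations?
Assume $x, y, z \in A = \{\text{machine numbers of Marc-32}\}$.
$fl(x(y+z)) = [x \cdot fl(y+z)](1+\delta_1)$, $|\delta_1| \le 2^{-24}$
$= [x(y+z)(1+\delta_2)](1+\delta_1)$, $|\delta_2| \le 2^{-24}$
$= x(y+z)(1+\delta_2+\delta_1+\delta_1\delta_2)$
$\approx x(y+z)(1+\delta_2+\delta_1)$
$= x(y+z)(1+\delta_3)$, $|\delta_3| \le\ ?$
Here $\delta_3 = \delta_2+\delta_1$, so $|\delta_3| \le |\delta_1| + |\delta_2| \le 2^{-23}$.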
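The accumulated error of the two-step computation can be measured the same way. The sketch below assumes IEEE single precision in place of Marc-32 and rounds after each operation. `fl32` and `compound_delta` are hypothetical helper names, and the input values are arbitrary illustrations.

```python
import struct
from fractions import Fraction


def fl32(v: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def compound_delta(x: float, y: float, z: float) -> Fraction:
    """Return delta3 with fl(x(y+z)) = x(y+z)(1 + delta3).

    x, y, z are assumed to be single-precision machine numbers.
    """
    s = fl32(y + z)                          # first rounding: fl(y+z)
    p = fl32(x * s)                          # second rounding: fl(x * fl(y+z))
    exact = Fraction(x) * (Fraction(y) + Fraction(z))
    return (Fraction(p) - exact) / exact


x, y, z = fl32(1 / 3), fl32(0.7), fl32(0.2)
delta3 = compound_delta(x, y, z)
print("delta3 =", float(delta3))
print("within 2**-23:", abs(delta3) <= Fraction(1, 2**23))
```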

Exercise. Find $fl(x(y+z))$ for $x, y, z \in A = \{\text{machine numbers of Marc-32}\}$.

Theorem on Relative Roundoff Error in Adding

2.2 Absolute & Relative Errors: Loss of Significance/Precision
Assume a real number $x$ is approximated by another number $x^*$; the error is $x - x^*$.
The absolute error: $|x - x^*|$
The relative error: $|x - x^*| / |x|$

The relative error involved in representing a real number $x$ by a nearby floating-point machine number $fl(x)$ is bounded by the unit roundoff error $\varepsilon$:
$|x - fl(x)| / |x| \le \varepsilon$
Roundoff errors are inevitable and difficult to control.

Loss of Significance
The subject of numerical analysis is largely concerned with understanding and controlling errors of various kinds.

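For example, take two nearly equal numbers $x$ and $y$ and form $x - y$. If this calculation were performed in a decimal computer having a five-digit mantissa, $fl(x)$ and $fl(y)$ would each be rounded to five digits before forming $fl(x) - fl(y)$. The relative error is
$|x - y - [fl(x) - fl(y)]| / |x - y| \approx 4\%$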

Loss of Significance
The result is usually stored as a normalized floating-point number, i.e.,
$fl(x) - fl(y) = 0.00013 = 0.13000 \times 10^{-3}$
The three added 0's above do NOT represent additional accuracy, i.e., those three additional 0's are NOT significant digits.
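A five-digit decimal machine can be imitated with Python's `decimal` module. The values of x and y are not recoverable from the slide text, so the two numbers below are assumed, illustrative inputs, chosen only because they give the difference 0.00013 and a relative error of roughly 4% as quoted above.

```python
from decimal import Context, Decimal

ctx = Context(prec=5)            # five significant decimal digits

# Assumed illustrative inputs (not taken from the slides).
x = Decimal("0.3721478693")
y = Decimal("0.3720230572")

fl_x = ctx.plus(x)               # x rounded to five digits
fl_y = ctx.plus(y)               # y rounded to five digits
computed = ctx.subtract(fl_x, fl_y)

exact = x - y                    # exact under the default 28-digit context
rel_err = abs(exact - computed) / exact

print("fl(x) =", fl_x, " fl(y) =", fl_y)
print("fl(x) - fl(y) =", computed)
print("relative error =", rel_err)
```

Each input is accurate to five digits, yet the difference retains only two meaningful digits.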

Subtraction of Nearly Equal Quantities
Example 1. The assignment statement
$y \leftarrow \sqrt{x^2+1} - 1$
can cause loss of significance for small values of $x$. How to avoid this trouble?
Solution. The statement can be replaced by
$y \leftarrow x^2 / (\sqrt{x^2+1} + 1)$
in programming to avoid such trouble.
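The two forms of the statement can be compared in ordinary double precision. This is a minimal sketch. The function names and the test value x = 1e-8 are choices made here for illustration.

```python
import math


def f_naive(x: float) -> float:
    """sqrt(x^2 + 1) - 1, subtracting two nearly equal numbers for small x."""
    return math.sqrt(x * x + 1.0) - 1.0


def f_stable(x: float) -> float:
    """Algebraically equivalent form x^2 / (sqrt(x^2 + 1) + 1), no cancellation."""
    return x * x / (math.sqrt(x * x + 1.0) + 1.0)


x = 1e-8
print("naive :", f_naive(x))
print("stable:", f_stable(x))
```

For this x the term x^2 is smaller than half the spacing of doubles near 1, so x^2 + 1 rounds to exactly 1 and the naive form loses every significant digit, while the rewritten form stays close to x^2/2.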

Ex. 2: Find the roots (keep 10 digits after the decimal point) e-4

Loss of Precision
Theorem 1 (P57) Theorem on Loss of Precision
If $x$ and $y$ are positive normalized floating-point binary machine numbers such that $x > y$ and
$2^{-q} \le 1 - y/x \le 2^{-p}$
then at most $q$ and at least $p$ significant binary bits are lost in the subtraction $x - y$.
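The integers p and q in the theorem can be read off from 1 - y/x. The sketch below is one possible way to do this, taking the tightest pair. The function name `lost_bits_bounds` and the sample values of x and y are illustrative assumptions, not taken from the slides.

```python
import math


def lost_bits_bounds(x: float, y: float) -> tuple:
    """Return (p, q) with 2**-q <= 1 - y/x <= 2**-p, for x > y > 0.

    By the theorem, at least p and at most q significant bits
    are lost in the subtraction x - y.
    """
    if not (x > y > 0):
        raise ValueError("requires x > y > 0")
    t = -math.log2(1.0 - y / x)      # 1 - y/x == 2**(-t)
    return math.floor(t), math.ceil(t)


# Illustrative values only.
p, q = lost_bits_bounds(37.593621, 37.584216)
print("at least", p, "and at most", q, "bits are lost")
```

Here 1 - y/x is about 0.00025, which lies between 2^-12 and 2^-11.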

Proof. We prove only the lower bound and leave the upper bound as an after-class exercise.
The normalized binary floating-point forms for $x, y$ are
$x = r \times 2^{n}$, $y = s \times 2^{m}$ $(\tfrac{1}{2} \le r, s < 1)$
Since $x > y$, the computer may have to shift $y$ so that $y$ has the same exponent as $x$ before performing $x - y$. So, we must write $y$ as
$y = (s \times 2^{m-n}) \times 2^{n}$
and then
$x - y = (r - s \times 2^{m-n}) \times 2^{n}$

By assumption, we have
$r - s \times 2^{m-n} = r\left(1 - \frac{s \times 2^{m}}{r \times 2^{n}}\right) = r\left(1 - \frac{y}{x}\right) < 1 - \frac{y}{x} \le 2^{-p}$
WLOG, assume the mantissa in the computer has $p+k$ digits $(k \ge 1)$; then
$r - s \times 2^{m-n} = (0.\underbrace{00\ldots0}_{p}\,a_1a_2\ldots a_k)_2$
The normalized floating-point form of $x - y$ is
$x - y = (0.a_1a_2a_3\ldots a_k00\ldots0)_2 \times 2^{n-p}$ if $a_1 \ne 0$
$x - y = (0.a_2a_3\ldots a_k000\ldots0)_2 \times 2^{n-p-1}$ if $a_1 = 0,\ a_2 \ne 0$
and so on,

i.e., a shift of at least 𝑝 bits to the left is required; meanwhile, at least 𝑝 spurious 0’s are attached to the right end of the mantissa, which means that at least 𝑝 bits of precision have been lost.

Example 3. Consider the assignment statement
$y \leftarrow x - \sin(x)$
This calculation involves a loss of significance for small values of $x$. How to avoid this trouble?

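Solution. By the Taylor series for $\sin(x)$, we have
$y = x - \sin x = x - \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots\right) = \frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \frac{x^9}{9!} + \cdots$
If $x$ is near 0, a truncated series can be used, e.g.,
$y \leftarrow (x^3/6)(1 - (x^2/20)(1 - (x^2/42)(1 - x^2/72)))$
Note that both assignment statements are needed to cover a wide range of values of $x$: the truncated series when $x$ is near 0, and the original statement otherwise.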
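Both assignment statements are easy to try side by side in double precision. This is only a sketch. The function names are hypothetical, and the sample point x = 1e-3 is an arbitrary small value chosen for illustration.

```python
import math


def x_minus_sin_naive(x: float) -> float:
    """Direct evaluation; x and sin(x) nearly cancel when x is small."""
    return x - math.sin(x)


def x_minus_sin_series(x: float) -> float:
    """Truncated Taylor series in nested form, intended for x near 0.

    Expands to x^3/3! - x^5/5! + x^7/7! - x^9/9!.
    """
    x2 = x * x
    return (x ** 3 / 6.0) * (
        1.0 - (x2 / 20.0) * (1.0 - (x2 / 42.0) * (1.0 - x2 / 72.0))
    )


x = 1e-3
print("naive :", x_minus_sin_naive(x))
print("series:", x_minus_sin_series(x))
```

The nested form involves no subtraction of nearly equal quantities, so its relative accuracy near 0 is limited only by the truncation of the series and ordinary roundoff.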

Homework & Programming
Check the course’s webpage for Homework # Due Thursday, 9. 29 Programming #1 Due Thursday, 9. 29
