Recall our hypothetical computer Marc-32

Recall our hypothetical computer Marc-32
Sign: 1 bit; Mantissa: 23 bits; Exponent: 8 bits.
Normalized floating-point representation of x ≠ 0:
x = ±q × 2^m, with 1 ≤ q < 2 and −126 ≤ m ≤ 127.
Unit roundoff error: ε = 2^−24.
Floating-point machine number fl(x): fl(x) = x(1 + δ), |δ| ≤ ε.

Example 1 (P46). What is the binary form of x = 2/3? What are the two nearby machine numbers x− and x+ in the Marc-32? Which one is taken to be fl(x)? What are the absolute roundoff error and relative roundoff error in representing x by fl(x)?

Solution. First, write 2/3 in binary form:
2/3 = (0.a1a2a3…)2,    (1)
where each ai is either 0 or 1. Multiplying both sides by 2 gives
4/3 = (a1.a2a3…)2.
Taking the integer part of both sides gives a1 = 1. Then
1/3 = 4/3 − 1 = (a1.a2a3…)2 − 1 = (0.a2a3a4…)2,
and multiplying by 2 once more,
2/3 = (a2.a3a4a5…)2.    (2)
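The repeated-doubling step used above can be carried out mechanically. The following is a minimal Python sketch (not from the slides) that extracts binary digits of a fraction this way; Fraction keeps the arithmetic exact.

```python
from fractions import Fraction

def binary_fraction_digits(x, n):
    """First n binary digits a1, a2, ... of a fraction 0 < x < 1, by repeated doubling."""
    digits = []
    for _ in range(n):
        x *= 2
        a = int(x)       # the integer part of 2*x is the next binary digit
        digits.append(a)
        x -= a           # keep only the fractional part and repeat
    return digits

print(binary_fraction_digits(Fraction(2, 3), 12))   # [1, 0, 1, 0, 1, 0, ...]
```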

Comparing (1) and (2) shows that a2 = 0 and that the digits repeat with period 2, so
a1 = a3 = a5 = a7 = … = 1,  a2 = a4 = a6 = a8 = … = 0.
Thus x = 2/3 = (0.101010…)2 = (1.01010…)2 × 2^−1.
In Marc-32 (23-bit mantissa), the two nearby machine numbers are
x− = (1.0101…010)2 × 2^−1  (by chopping),
x+ = (1.0101…011)2 × 2^−1  (by rounding up).
Recall: x− < x < x+.

Next,
x − x− = (1.01010…)2 × 2^−24 × 2^−1 = (0.101010…)2 × 2^−23 × 2^−1 = (2/3) × 2^−24,
x+ − x = (x+ − x−) − (x − x−) = 2^−24 − (2/3) × 2^−24 = (1/3) × 2^−24.
Since x+ − x < x − x−, the nearer machine number is x+, so fl(x) = x+.

The absolute roundoff error:
|fl(x) − x| = (1/3) × 2^−24.
The relative roundoff error:
|fl(x) − x| / |x| = ((1/3) × 2^−24) / (2/3) = 2^−25.
Check your textbook (P47) for the corresponding decimal forms.
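Marc-32's 24-bit significand and round-to-nearest behavior match IEEE single precision, so NumPy's float32 can be used as a stand-in to check these numbers; this is a verification sketch, not part of the original example.

```python
import numpy as np

x_true = 2.0 / 3.0            # double precision stands in for the exact value here
x_fl = np.float32(x_true)     # round to a 24-bit significand, as Marc-32 would

abs_err = abs(float(x_fl) - x_true)
rel_err = abs_err / abs(x_true)
print(abs_err, (1 / 3) * 2**-24)   # both ≈ 1.99e-8
print(rel_err, 2**-25)             # both ≈ 2.98e-8
```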

 

How a computation proceeds on the machine:
Input numbers a, b, c, …
→ normalized
→ rounded off
→ stored as machine numbers fl(a), fl(b), …
→ one arithmetic operation/calculation is performed
→ a result is obtained, e.g. fl(a) ⊙ fl(b).

   

The computer with 5 decimal digits stores those results in rounded form as   The relative errors are respectively    

Let ⊙ denote one of the four basic arithmetic operations +, −, ×, ÷.
Assume x, y are machine numbers. Then there is some constant δ such that
fl(x ⊙ y) = (x ⊙ y)(1 + δ), where |δ| ≤ ε;
here ε can be taken to be the unit roundoff error of the machine. In Marc-32, ε = 2^−24.
Q: How do we compute fl(x ⊙ y) if x, y are not machine numbers?
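The model fl(x ⊙ y) = (x ⊙ y)(1 + δ), |δ| ≤ ε, can be checked empirically. Below is a small sketch (again using IEEE single precision as a stand-in for Marc-32) that measures the relative error of single float32 operations against double-precision references.

```python
import numpy as np

rng = np.random.default_rng(0)
u = 2.0 ** -24   # unit roundoff of a 24-bit significand

worst = 0.0
for _ in range(100_000):
    x, y = (np.float32(v) for v in rng.uniform(0.1, 10.0, size=2))  # machine numbers
    for op in (np.add, np.subtract, np.multiply, np.divide):
        computed = op(x, y)                # rounded float32 result: fl(x ⊙ y)
        exact = op(float(x), float(y))     # (nearly) exact reference in double
        if exact == 0.0:
            continue
        worst = max(worst, abs(float(computed) - exact) / abs(exact))

print(worst, u)   # the observed worst-case relative error stays at or below u
```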

If x, y are not machine numbers, there are still constants δ1, δ2, δ3 such that
fl(x) = x(1 + δ1),  fl(y) = y(1 + δ2),
fl(x ⊙ y) = fl(fl(x) ⊙ fl(y)) = (fl(x) ⊙ fl(y))(1 + δ3)
          = [(x(1 + δ1)) ⊙ (y(1 + δ2))](1 + δ3) ≈ x ⊙ y.
For example, when ⊙ is multiplication this becomes
(xy)(1 + δ1 + δ2 + δ1δ2)(1 + δ3) ≈ xy.
Here |δ1|, |δ2|, |δ3| ≤ ε; again ε can be taken to be the unit roundoff error of the machine.

Q: How about compound arithmetic operations?
Assume x, y, z ∈ A = {machine numbers of Marc-32}. Then
fl(x(y + z)) = [x · fl(y + z)](1 + δ1),        |δ1| ≤ 2^−24
             = [x(y + z)(1 + δ2)](1 + δ1),     |δ2| ≤ 2^−24
             = x(y + z)(1 + δ2 + δ1 + δ1δ2)
             ≈ x(y + z)(1 + δ2 + δ1)
             = x(y + z)(1 + δ3),               |δ3| ≤ 2 × 2^−24 = 2^−23,
where δ3 = δ1 + δ2.
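A short sketch of the same kind (float32 standing in for Marc-32) confirms that the relative error of the compound expression x(y + z) stays below roughly 2ε:

```python
import numpy as np

rng = np.random.default_rng(1)
u = 2.0 ** -24   # unit roundoff of a 24-bit significand

worst = 0.0
for _ in range(100_000):
    x, y, z = (np.float32(v) for v in rng.uniform(0.1, 10.0, size=3))  # machine numbers
    computed = x * (y + z)                      # fl(x * fl(y + z)), evaluated in float32
    exact = float(x) * (float(y) + float(z))    # reference computed in double precision
    worst = max(worst, abs(float(computed) - exact) / exact)

print(worst, 2 * u)   # observed worst-case relative error is below about 2u
```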

Exercise. Find fl(x(y + z)) for x, y, z ∈ A = {machine numbers of Marc-32}.

Theorem on Relative Roundoff Error in Adding

2.2 Absolute & Relative Errors; Loss of Significance/Precision
Assume a real number x is approximated by another number x*; the error is x − x*.
The absolute error: |x − x*|.
The relative error: |x − x*| / |x|.
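As a minimal illustration (helper names are ours, not from the text), these definitions translate directly into code:

```python
def absolute_error(x, x_star):
    """|x - x*|: the absolute error of the approximation x* to x."""
    return abs(x - x_star)

def relative_error(x, x_star):
    """|x - x*| / |x|: the relative error, which is scale-independent."""
    return abs(x - x_star) / abs(x)

print(absolute_error(2 / 3, 0.66667), relative_error(2 / 3, 0.66667))
```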

The relative error involved in representing a real number x by a nearby floating-point machine number fl(x) is bounded by the unit roundoff error ε:
|x − fl(x)| / |x| ≤ ε.
Roundoff errors are inevitable and difficult to control.

Loss of Significance
The subject of numerical analysis is largely concerned with understanding and controlling errors of various kinds.

For example, take
x = 0.3721478693,  y = 0.3720230572,  x − y = 0.0001248121.
If this calculation were performed on a decimal computer having a five-digit mantissa, we would have
fl(x) = 0.37215,  fl(y) = 0.37202,  fl(x) − fl(y) = 0.00013.
The relative error is
|(x − y) − [fl(x) − fl(y)]| / |x − y| ≈ 4%.
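Python's decimal module can emulate the hypothetical five-digit decimal computer; a sketch reproducing the numbers above (round-to-nearest is assumed as the rounding mode):

```python
from decimal import Decimal, getcontext

getcontext().prec = 5                        # five significant decimal digits

x = Decimal("0.3721478693")
y = Decimal("0.3720230572")
fx, fy = +x, +y                              # unary + rounds to the current precision
diff_fl = fx - fy                            # 0.00013
diff_true = Decimal("0.0001248121")

print(fx, fy, diff_fl)                       # 0.37215 0.37202 0.00013
print(abs(diff_true - diff_fl) / diff_true)  # relative error ≈ 0.042 (about 4%)
```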

Loss of Significance
The result is usually stored as a normalized floating-point number, i.e.,
fl(x) − fl(y) = 0.00013 = 0.13000 × 10^−3.
The three appended 0's do NOT represent additional accuracy, i.e., they are NOT significant digits.

Subtraction of Nearly Equal Quantities
Example 1. The assignment statement
y ← √(x² + 1) − 1
can cause loss of significance for small values of x. How can this trouble be avoided?
Solution. The statement can be replaced by
y ← x² / (√(x² + 1) + 1)
in programming to avoid the trouble.
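A quick sketch in single precision (float32, standing in for Marc-32) shows the difference between the two forms for a small x:

```python
import numpy as np

def y_naive(x):
    # direct form: sqrt(x^2 + 1) - 1, which suffers cancellation for small x
    return np.sqrt(x * x + np.float32(1)) - np.float32(1)

def y_stable(x):
    # algebraically equivalent form that avoids the subtraction
    return x * x / (np.sqrt(x * x + np.float32(1)) + np.float32(1))

x = np.float32(1e-4)
print(y_naive(x))    # 0.0   (all significant digits lost)
print(y_stable(x))   # ≈ 5.0e-9, close to the true value x^2 / 2
```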

Ex. 2: Find the roots (keep 10 digits after the decimal point).
5.00000125000062500039e-4

Loss of Precision
Theorem 1 (P57, Theorem on Loss of Precision). If x and y are positive normalized floating-point binary machine numbers such that x > y and
2^−q ≤ 1 − y/x ≤ 2^−p,
then at most q and at least p significant binary bits are lost in the subtraction x − y.
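For a given pair x > y > 0, the exponents p and q in the theorem can be computed directly; a small sketch (the helper name is ours, not from the text):

```python
import math

def lost_bits_bounds(x, y):
    """Return (p, q) with 2**-q <= 1 - y/x <= 2**-p: between p and q bits are lost."""
    r = 1.0 - y / x                 # assumes 0 < y < x
    t = -math.log2(r)
    p = math.floor(t)               # largest p with r <= 2**-p
    q = math.ceil(t)                # smallest q with 2**-q <= r
    return p, q

print(lost_bits_bounds(0.3721478693, 0.3720230572))   # (11, 12): roughly 11-12 bits lost
```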

Proof. We prove only the lower bound and leave the upper bound as an after-class exercise.
The normalized binary floating-point forms of x and y are
x = r × 2^n,  y = s × 2^m,  with 1/2 ≤ r, s < 1.
Since x > y, the computer may have to shift y so that it has the same exponent as x before performing x − y. So we write
y = (s × 2^{m−n}) × 2^n,
and then
x − y = (r − s × 2^{m−n}) × 2^n.

By assumption,
r − s × 2^{m−n} = r(1 − (s × 2^m)/(r × 2^n)) = r(1 − y/x) ≤ r × 2^−p < 2^−p.
WLOG, assume the mantissa in the computer has p + k digits (k ≥ 1). Then
r − s × 2^{m−n} = (0.00…0 a1a2…ak)2,  with at least p zeros after the binary point,
and the normalized floating-point form of x − y is
x − y = (0.a1a2a3…ak00…0)2 × 2^{n−p}      if a1 ≠ 0,
x − y = (0.a2a3…ak00…0)2 × 2^{n−p−1}      if a1 = 0, a2 ≠ 0,
and so on.

i.e., a shift of at least 𝑝 bits to the left is required; meanwhile, at least 𝑝 spurious 0’s are attached to the right end of the mantissa, which means that at least 𝑝 bits of precision have been lost.

𝑦𝑥−sin⁡(𝑥) Example 3. Consider the assignment statement This calculation involves a loss of significance for small values of 𝑥. How to avoid this trouble?

Solution. By the Taylor series for sin(x), we have
y = x − sin(x) = x − (x − x³/3! + x⁵/5! − x⁷/7! + …)
  = x³/3! − x⁵/5! + x⁷/7! − x⁹/9! + …
If x is near 0, a truncated series can be used, e.g.,
y ← (x³/6)(1 − (x²/20)(1 − (x²/42)(1 − x²/72))).
Together, the two assignment statements cover a wide range of values of x: the original statement when |x| is not small, and the truncated series when x is near 0.
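The two statements can be compared numerically; a brief sketch in double precision:

```python
import math

def x_minus_sin_direct(x):
    # direct evaluation: loses significance when x is near 0
    return x - math.sin(x)

def x_minus_sin_series(x):
    # truncated, nested Taylor series from the solution above; accurate near 0
    return (x**3 / 6.0) * (1 - (x**2 / 20.0) * (1 - (x**2 / 42.0) * (1 - x**2 / 72.0)))

x = 1e-4
print(x_minus_sin_direct(x))   # ≈ 1.666666667e-13, but only roughly 7 digits are reliable
print(x_minus_sin_series(x))   # ≈ 1.6666666666658333e-13, essentially full accuracy
```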

Homework & Programming Check the course’s webpage for Homework #4 Due Thursday, 9. 29 Programming #1 Due Thursday, 9. 29