Floating Point Math & Representation

Floating Point Math & Representation Intro to Computer Org.

Motivation Integers are a very limited subset of the real numbers, even if we could represent all integers. Much of the time, we wish to use fractional numbers, be they rational or irrational.

A Human Perspective Consider a standard fractional number such as 105.8712. The decimal indicates the fractional component. We could try storing this one as an integer and mark the decimal place…

A Human Perspective But consider a number such as 1.058712 * 10^46. This one can’t be easily represented by integers due to its magnitude. What about 1.058712 * 10^-46? All of these are fairly easy to represent in our writing system.

A Human Perspective Note that the last two numbers presented were written in scientific notation. Commonly used shorthand that preserves the most significant details of any number.

Binary Scientific Notation Just as we have scientific notation in the decimal number system, we can have it in a binary system. Ex: 1.11010001_2 * 2^5 = 111010.001_2 = 2^5 + 2^4 + 2^3 + 2^1 + 2^-3 = 32 + 16 + 8 + 2 + .125 = 58.125
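
The positional sum in the example can be checked mechanically. The following is a minimal sketch in Python; the helper name `binary_to_decimal` is made up for illustration and is not part of the slides. Moving the binary point five places to the left corresponds to the factor 2^5.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '111010.001' as a sum of powers of two."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Digits left of the binary point weigh 2^0, 2^1, ... reading right to left.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2.0 ** power
    # Digits right of the binary point weigh 2^-1, 2^-2, ... reading left to right.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2.0 ** -power
    return value


print(binary_to_decimal("111010.001"))           # 58.125
print(binary_to_decimal("1.11010001") * 2 ** 5)  # 58.125
```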

Decimal -> Binary We already learned how to convert integers to binary. Divide the number by two. Take the remainder as the next digit, working right-to-left. If quotient = 0, stop. Otherwise, take the quotient as the new number, return to step 1.
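
The repeated-division steps above can be sketched in Python as follows. The function name `int_to_binary` is an illustrative choice, and the sketch assumes a non-negative integer.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by two."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        # The remainder is the next digit, working right-to-left;
        # the quotient becomes the new number.
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


print(int_to_binary(14))  # 1110
```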

Decimal -> Binary To convert the fractional part from decimal to binary: Multiply the number by two. Remove the integer part and save it as the next digit, working left-to-right. If remainder of number = 0, stop. Otherwise, take the remainder as the new number, return to step 1.

Decimal -> Binary Note that the process is almost a direct inversion of the process for whole numbers.

Decimal -> Binary 0.625 * 2 = 1.250 => .1___ 0.250 * 2 = 0.500 => .10__ 0.500 * 2 = 1.000 => .101_ 0 is left, so we’re done!
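
The repeated-multiplication procedure for the fractional part can be sketched the same way. This is a minimal Python sketch, not the course's code: `fraction_to_binary` and its `max_bits` cutoff are assumptions added here, and `fractions.Fraction` is used so that the decimal input is handled exactly rather than through a binary float.

```python
from fractions import Fraction


def fraction_to_binary(x, max_bits: int = 32) -> str:
    """Convert a fraction 0 <= x < 1 to binary digits by repeated multiplication by two."""
    x = Fraction(x)
    if not 0 <= x < 1:
        raise ValueError("expects a value in [0, 1)")
    digits = []
    while x != 0 and len(digits) < max_bits:
        x *= 2
        bit = int(x)             # the integer part (0 or 1) is the next digit
        digits.append(str(bit))
        x -= bit                 # what remains becomes the new number
    return "." + ("".join(digits) or "0")


print(fraction_to_binary("0.625"))            # .101
print(fraction_to_binary("0.2", max_bits=8))  # .00110011
```

The `max_bits` cutoff matters because some inputs never reach a remainder of zero.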

Decimal -> Binary Because we’re using base 2, some fractions become repeating. 0.2 * 2 = 0.4 => .0______ 0.4 * 2 = 0.80 => .00_____ 0.8 * 2 = 1.60 => .001____ 0.6 * 2 = 1.20 => .0011___ 0.2 * 2…
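
A fraction terminates in binary only when its denominator, in lowest terms, is a power of two; otherwise the remainders eventually revisit an earlier value and the digits cycle. The sketch below detects that cycle. The name `binary_expansion` and the `limit` parameter are illustrative assumptions, not something from the slides.

```python
from fractions import Fraction


def binary_expansion(x, limit: int = 64):
    """Return (prefix_bits, repeating_bits) for a fraction 0 <= x < 1."""
    x = Fraction(x)
    seen = {}    # remainder -> index of the digit produced from it
    digits = []
    while x != 0 and x not in seen and len(digits) < limit:
        seen[x] = len(digits)
        x *= 2
        bit = int(x)
        digits.append(str(bit))
        x -= bit
    if x == 0 or x not in seen:
        # The expansion terminated, or the digit limit was reached first.
        return "".join(digits), ""
    start = seen[x]  # the cycle begins where this remainder first appeared
    return "".join(digits[:start]), "".join(digits[start:])


print(binary_expansion("0.2"))    # ('', '0011')
print(binary_expansion("0.625"))  # ('101', '')
```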

Decimal -> Binary For conversions in this class, it’s highly advised to handle each side of a decimal point separately.

Decimal -> Binary Let’s now consider 14.625. 14 => 1110_2. 0.625 => 0.101_2. So, 14.625 => 1110.101_2. Final step: normalization. 14.625 = 1.110101_2 * 2^3. Scientific notation allows only one leading digit.
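
Normalization can be checked with Python's standard `math.frexp`, which returns a mantissa in [0.5, 1) rather than [1, 2), so the sketch rescales by one power of two. The wrapper name `normalize` is an illustrative assumption.

```python
import math


def normalize(x: float):
    """Return (significand, exponent) with x == significand * 2**exponent and 1 <= significand < 2."""
    if x <= 0 or not math.isfinite(x):
        raise ValueError("this sketch expects a positive finite number")
    mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent, 0.5 <= mantissa < 1
    return mantissa * 2, exponent - 1


significand, exponent = normalize(14.625)
print(significand, exponent)  # 1.828125 3
```

Here 1.828125 is the decimal value of 1.110101_2 (1 + 1/2 + 1/4 + 1/16 + 1/64).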

Binary Scientific Notation All the basic mathematical operations operate in a similar fashion to their decimal scientific notation counterparts. Read starting in page 195 for more details on the operations. Part of section 3.6, the introduction to floating point.

Binary Scientific Notation Notable exception – sometimes division is performed by taking the reciprocal of the divisor, then performing multiplication.

Floating Point The floating point specification has three pieces: a sign bit, the fractional part of the number, and an exponent field. The entire number fits within 32 bits if single-precision, 64 bits if double-precision.

Floating Point – Sign Bit Floating point represents sign in a manner identical to that of the sign-magnitude representation of integers. Leading bit = 1 if negative, 0 if positive. Denotable as S in formulas, where (-1)^S gives the sign of the floating-point number.

Floating Point The second piece, the fractional part of the number, is also called the significand.

Floating Point - Significand With one exception, all binary numbers in scientific notation share the same leading digit – 1. Definition of the notation says leading digit must be non-zero. As a result, the leading 1 is not stored – only the parts after the “binary point” are.

Floating Point - Significand Thus, formulas write the significand as (1 + Fraction): the stored fraction bits must be prefixed with the implicit leading 1 to recover the full significand.

Floating Point – Exponents Floating point numbers store their exponents in yet another representation, one that is biased. Take desired exponent and add the bias to it. Single-precision: Add 127 (2^7 - 1). Double-precision: Add 1023 (2^10 - 1).
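
Both biases follow the pattern 2^(k-1) - 1 for a k-bit exponent field. A minimal sketch, assuming the IEEE 754 field widths of 8 bits (single) and 11 bits (double); the function names are illustrative.

```python
def exponent_bias(exponent_bits: int) -> int:
    """Bias for a k-bit exponent field: 2^(k-1) - 1."""
    return 2 ** (exponent_bits - 1) - 1


def bias_exponent(exponent: int, exponent_bits: int) -> int:
    """Add the bias to the desired (unbiased) exponent."""
    return exponent + exponent_bias(exponent_bits)


print(exponent_bias(8))                    # 127
print(exponent_bias(11))                   # 1023
print(format(bias_exponent(3, 8), "08b"))  # 10000010
```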

Floating Point – Exponents Represent the result as an unsigned integer. Why might one wish to do this?
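
One commonly cited benefit is that, with the exponent stored as an unsigned biased value above the fraction bits, non-negative floats sort in the same order as their bit patterns read as unsigned integers. The sketch below only demonstrates that property on a handful of sample values chosen for illustration; it uses Python's standard `struct` module, and `float32_bits` is a made-up helper name.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


# Sample non-negative values, listed in increasing numeric order.
values = [0.0, 2.0 ** -10, 0.75, 1.0, 14.625, 1.0e10]
patterns = [float32_bits(v) for v in values]

for value, pattern in zip(values, patterns):
    print(f"{value:>16} -> {pattern:032b}")

# The integer patterns come out in increasing order as well.
print(patterns == sorted(patterns))  # True
```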

Floating Point – Exponents Special cases After biasing the exponent, two values indicate special conditions. The minimum possible value (zero) The maximum possible value (all ones) 255 in single-precision 2047 in double-precision The corresponding unbiased exponents are thus out-of-range.

Floating Point – Exponents Before covering the special cases, let’s take a look at standard floating-point numbers. We’ll use single-precision for our examples.

Single-Precision Floats Fields: Sign (1 bit), Exponent (8 bits), Fraction/Significand (23 bits). Let’s take a look at 14.625 = 1.110101_2 * 2^3. Sign = positive, so we’d represent it with a zero.

Single-Precision Floats Fields: Sign = 0, Exponent (8 bits), Fraction/Significand (23 bits). Let’s take a look at 14.625 = 1.110101_2 * 2^3. Exponent (unbiased) = 3. Exponent (biased) = 3 + 127 = 130. 130 = 10000010_2. Sidenote: 2^(8-1) - 1 = 127. The pattern holds for double-precision.

Single-Precision Floats Fields: Sign = 0, Exponent = 10000010, Fraction/Significand (23 bits). Let’s take a look at 14.625 = 1.110101_2 * 2^3. We ignore the leading 1. The part to the right of the binary point is 110101, so we store this.

Single-Precision Floats Fields: Sign = 0, Exponent = 10000010, Fraction/Significand = 1101 0100 0000 0000 0000 000. Let’s take a look at 14.625 = 1.110101_2 * 2^3. We ignore the leading 1. The part to the right of the binary point is 110101, so we store this and append zeros to the end to fill out the remaining bits.
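
The three fields for 14.625 can be read back from the machine's own encoding. This is a sketch using Python's standard `struct` module; the helper names `float32_fields` and `rebuild_normalized` are illustrative, and the rebuild formula (-1)^S * (1 + Fraction) * 2^(Exponent - 127) applies to normalized numbers only.

```python
import struct


def float32_fields(x: float):
    """Split the single-precision encoding of x into (sign, biased exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, still biased
    fraction = bits & 0x7FFFFF       # 23 bits, implicit leading 1 not stored
    return sign, exponent, fraction


def rebuild_normalized(sign: int, exponent: int, fraction: int) -> float:
    """Apply (-1)^S * (1 + Fraction) * 2^(Exponent - 127); normalized numbers only."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


sign, exponent, fraction = float32_fields(14.625)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
# 0 10000010 11010100000000000000000
print(rebuild_normalized(sign, exponent, fraction))  # 14.625
```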

Single-Precision Floats How about (136.111…) * 2^-13? Let’s work this one on the board.

Floating Point - Zero One of the special cases of floating point is when all bits of the float/double are zero. A very convenient representation of zero!

Imprecision of Floating Point All floating-point numbers are stored in a scientific notation format with limited bits. This gives a tradeoff between range of numbers representable and the precision on those numbers. Single-precision gives 23 bits for the significand (24 counting the implicit leading 1), giving accuracy to a little over seven decimal digits.

Imprecision of Floating Point What is the smallest possible difference between any two numbers with the same (binary) magnitude? That is, the same power of two in binary scientific notation. It depends on the power of two!

Imprecision of Floating Point Fields: Sign, Exponent = 3 (unbiased), Fraction/Significand (23 bits). Let’s take a look at numbers between 8 and 16. (16 not included.) Unbiased exponent = 3. What would be the smallest possible difference, looking at the bit patterns?

Imprecision of Floating Point Fields: Sign, Exponent = 3 (unbiased), Fraction/Significand = 0000 0000 0000 0000 0000 001. The answer, in floating point format, is all zeros with a one at the end. Since the exponent is the same for both numbers, the difference is only in the significand. The above difference is the smallest non-zero difference possible. Warning: the above is before normalization, so there is no leading 1!

Imprecision of Floating Point Fields: Sign, Exponent = 3 (unbiased), Fraction/Significand = 0000 0000 0000 0000 0000 001. What would this translate to in decimal? (-1)^0 * 2^-23 * 2^3 = 2^(-23 + 3) = 2^-20. Note that we don’t add the 1 to the fraction because we’re dealing with a difference between two floating point numbers here, before normalization. 2^-20 = 0.00000095367431640625, or 9.5367431640625 * 10^-7.
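
The spacing between neighbouring single-precision values can be measured directly by adding one to the bit pattern. This sketch assumes the IEEE 754 single-precision layout with a 23-bit fraction field, so the gap at unbiased exponent e is 2^(e - 23); `next_float32_up` is an illustrative helper and only handles positive finite inputs.

```python
import struct


def next_float32_up(x: float) -> float:
    """Return the next larger single-precision value after a positive finite x."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Incrementing the pattern changes the last fraction bit (or carries into the exponent).
    return struct.unpack(">f", struct.pack(">I", bits + 1))[0]


for start in (1.0, 8.0, 1024.0):
    gap = next_float32_up(start) - start
    print(start, gap)

print(next_float32_up(8.0) - 8.0 == 2.0 ** -20)  # True
```

The gap grows with the power of two: it is 2^-23 just above 1, 2^-20 between 8 and 16, and 2^-13 just above 1024.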

Imprecision of Floating Point Note that this number is not precise to seven actual decimal places when the number is not in scientific notation! Numerical Analysis studies the consequences of discrete representations of real numbers, equations, and the like.

Other Special Cases Special case: biased exponent is all ones. Is the significand all zero bits? If so, the bit pattern represents ±∞. If not, it represents NaN (Not a Number). Useful for when illegal operations are performed: dividing a nonzero number by zero gives ±∞, and 0/0 gives NaN.

Other Special Cases Last special case – biased exponent is zero, but the significand is non-zero. This represents a denormalized number. Allows a gradual loss of precision at the limits of the exponent range. For more details, read section 3.6 of the text. We won’t worry about it much here.
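
The special cases can be gathered into a single decision on the exponent and fraction fields. A minimal sketch for single-precision patterns; `classify_float32` and the sample patterns are illustrative choices, and `struct` is used only to show the value Python assigns to each pattern.

```python
import struct


def classify_float32(bits: int) -> str:
    """Classify a 32-bit single-precision pattern by its exponent and fraction fields."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:   # biased exponent is all ones
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0:      # biased exponent is zero
        return "zero" if fraction == 0 else "denormalized"
    return "normalized"


samples = (0x00000000, 0x00000001, 0x416A0000, 0x7F800000, 0xFF800000, 0x7FC00000)
for pattern in samples:
    value = struct.unpack(">f", struct.pack(">I", pattern))[0]
    print(f"{pattern:08X}  {classify_float32(pattern):<12}  {value}")
```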

More examples (On board) 6,706,993,152: approximate population of the world. In binary: 110001111110001001001100000000000_2. 1110001.01_2 + 101011.1001_2: answer in single-precision and in decimal. 1110001.01_2 * 101011.1001_2: answer in single-precision and in decimal notation.
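
Board work on these examples can be checked with a short script. This is a sketch, not the worked solution from the lecture: `parse_binary` and `float32_hex` are made-up helpers, exact arithmetic is done with `fractions.Fraction`, and the single-precision pattern is taken from Python's `struct` packing (printed as hexadecimal).

```python
import struct
from fractions import Fraction


def parse_binary(text: str) -> Fraction:
    """Parse a binary string with an optional binary point into an exact fraction."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    if frac:
        value += Fraction(int(frac, 2), 2 ** len(frac))
    return value


def float32_hex(x) -> str:
    """Single-precision bit pattern of x, as eight hexadecimal digits."""
    bits = struct.unpack(">I", struct.pack(">f", float(x)))[0]
    return f"{bits:08X}"


a = parse_binary("1110001.01")
b = parse_binary("101011.1001")
population = parse_binary("110001111110001001001100000000000")

for label, value in (("population", population), ("a + b", a + b), ("a * b", a * b)):
    print(label, float(value), float32_hex(value))
```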