CSCI206 - Computer Organization & Programming

CSCI206 - Computer Organization & Programming Floating Point Limits zyBook: 10.9, 10.10

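IEEE 754 Standard (1985)
Value of a normalized number: $(-1)^{S} \times 1.M \times 2^{E - B}$
S is the sign bit, E is the biased exponent, M is the mantissa (fraction), and B is the bias.
S, E, and M are encoded in the binary word.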
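
The field layout can be inspected directly from C. The following is a minimal sketch, assuming a 32-bit `float` stored in IEEE 754 single precision format (1 sign bit, 8 exponent bits, 23 mantissa bits); the names `float_fields` and `split_float` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a single precision word. */
typedef struct {
    uint32_t sign;      /* S: 1 bit */
    uint32_t exponent;  /* E: 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* M: 23 fraction bits, leading 1 not stored */
} float_fields;

/* Split a float into S, E, and M by copying its bits into an integer. */
float_fields split_float(float f) {
    uint32_t bits;
    float_fields out;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits */
    memcpy(&bits, &f, sizeof bits);    /* copy the raw bits, no conversion */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For example, `split_float(1.0f)` should give S = 0, E = 127 (true exponent 0), and M = 0.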

IEEE754 - Reserved Values
NaN = Not a Number

IEEE754 - Example Show 3.14 as a single precision float

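3.14 - step 1 write in binary
3.14 == 3 + 0.14
Convert the fraction by repeated doubling; the integer part of each product is the next bit:
0.14*2 = 0.28 -> 0
0.28*2 = 0.56 -> 0
0.56*2 = 1.12 -> 1
0.12*2 = 0.24 -> 0
......
0.14 == 0.0010......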

3.14 - step 1 write in binary
Need 24 bits for single (53 for double), counting the leading 1.
In this example, 2 bits before the point, 22 bits after (the last bit is rounded to nearest).
3.14 == 11.0010001111010111000011
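
The repeated doubling in step 1 can be sketched in C. This is an illustrative sketch, not part of the original slides: the function name `fraction_bits` is hypothetical, and the fraction is passed as an integer ratio (14/100 for 0.14) so that the doubling is exact. It truncates rather than rounds, so the last bit can differ from the rounded value on the slide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the first nbits binary digits of the fraction num/den,
 * packed into an integer with the first digit as the most significant bit.
 * Assumes 0 <= num < den and den < 2^31 so that doubling cannot overflow.
 * The result is truncated, not rounded.
 */
uint32_t fraction_bits(uint32_t num, uint32_t den, unsigned nbits) {
    uint32_t bits = 0;
    unsigned i;
    assert(den > 0 && den < 0x80000000u && num < den && nbits <= 32);
    for (i = 0; i < nbits; i++) {
        num *= 2;            /* double the fraction */
        bits <<= 1;          /* make room for the next digit */
        if (num >= den) {    /* integer part is 1 */
            bits |= 1u;
            num -= den;      /* keep only the fractional part */
        }
    }
    return bits;
}
```

With these assumptions, `fraction_bits(14, 100, 4)` yields binary 0010, matching the four doubling steps shown in step 1, and `fraction_bits(14, 100, 22)` yields 0010001111010111000010. The next binary digits are 1000..., which is more than half a unit in the last place, so rounding to nearest turns the final 0 into 1.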

3.14 - step 2 normalize binary
Normalized form is $1.yyyyy \times 2^{e}$
3.14 == 11.0010001111010111000011 == $1.10010001111010111000011_2 \times 2^{1}$
Note a total of 24 bits.

3.14 - step 3 write mantissa & sign
3.14 == $1.10010001111010111000011_2 \times 2^{1}$
M = 10010001111010111000011
S = 0 (positive)
Note that the mantissa keeps only 23 bits. The leading bit is always 1, so it is omitted, but only in the stored representation; it still counts toward the value.

3.14 - step 4 encode exponent
3.14 == $1.10010001111010111000011_2 \times 2^{1}$
Exponent = 1, B = 127 (8 bits)
E (biased exponent) = 1 + 127 = 128 = 1000 0000

3.14 - step 5 write result
S = 0 (positive)
E = 1000 0000
M = 10010001111010111000011
0 1000 0000 10010001111010111000011
to hex = 0x4048f5c3
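
The final assembly step can be checked in C. This is a small sketch under the assumption that `float` is a 32-bit IEEE 754 single; `pack_fields` and `float_to_bits` are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble a single precision word from its sign, biased exponent,
 * and 23-bit mantissa fields. */
uint32_t pack_fields(uint32_t s, uint32_t e, uint32_t m) {
    assert(s <= 1u && e <= 0xFFu && m <= 0x7FFFFFu);
    return (s << 31) | (e << 23) | m;
}

/* Return the raw bit pattern the compiler stores for a float. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

With S = 0, E = 128, and M = 10010001111010111000011 (0x48f5c3), `pack_fields(0, 128, 0x48F5C3)` gives 0x4048f5c3, and `float_to_bits(3.14f)` should give the same word on an IEEE 754 machine.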

Endianness
On a little-endian system (Intel, etc.), the bytes of the IEEE754 value are stored in reverse order, which looks like a byte & word swap:
0x 4048 f5c3 (big endian)
   4840 c3f5 (bytes swapped within each 16-bit word)
0x c3f5 4840 (little endian, after also swapping the two words)
Swap bytes and words!
float f = 3.14;
unsigned char* p = (unsigned char*)&f;
printf("%02x%02x %02x%02x\n", *p, *(p+1), *(p+2), *(p+3));
// result on Intel: c3f5 4840, on MIPS (big endian): 4048 f5c3
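
The byte order of the host can also be tested at run time. The sketch below is illustrative only: `is_little_endian` and `float_bytes` are hypothetical names, and it assumes a 32-bit `float` whose byte order matches the integer byte order, which holds on common platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the lowest-addressed byte of an integer holds its
 * least significant byte (little endian), 0 otherwise. */
int is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Copy the four bytes of f into out[0..3] in memory order. */
void float_bytes(float f, unsigned char out[4]) {
    assert(sizeof f == 4);   /* assumption: float is 32 bits */
    memcpy(out, &f, 4);
}
```

For 3.14f, the bytes in memory order should be c3 f5 48 40 on a little-endian host and 40 48 f5 c3 on a big-endian one.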

Review IEEE754
Word layout: S | Exponent | Mantissa
Special values, else normalized numbers:
Exponent    Mantissa (fraction)    Value
0           0                      +/- zero
0           nonzero                denormalized number
all 1’s     0                      +/- infinity
all 1’s     nonzero                NaN (not a number)
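
The special-value rules translate into a short classification routine. This is a sketch, assuming 32-bit IEEE 754 single precision; the enum `fp_kind` and the functions `classify_bits` and `classify` are hypothetical names, not a standard API (the C library's own `fpclassify` in `<math.h>` serves the same purpose).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the rows of the special-values table. */
typedef enum {
    KIND_ZERO, KIND_DENORMAL, KIND_NORMAL, KIND_INFINITY, KIND_NAN
} fp_kind;

/* Classify a single precision bit pattern by its exponent and
 * fraction fields; the sign bit does not affect the class. */
fp_kind classify_bits(uint32_t bits) {
    uint32_t e = (bits >> 23) & 0xFFu;   /* biased exponent */
    uint32_t m = bits & 0x7FFFFFu;       /* fraction */
    if (e == 0)
        return m == 0 ? KIND_ZERO : KIND_DENORMAL;
    if (e == 0xFFu)
        return m == 0 ? KIND_INFINITY : KIND_NAN;
    return KIND_NORMAL;
}

/* Convenience wrapper that takes a float instead of raw bits. */
fp_kind classify(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits */
    memcpy(&bits, &f, sizeof bits);
    return classify_bits(bits);
}
```

For instance, the pattern 0x7f800000 (exponent all 1’s, fraction 0) classifies as infinity, and 0x00000001 (exponent 0, fraction nonzero) as a denormalized number.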

Largest Single Precision Float
8 bit exponent (bias = 127), 23 bit fraction
All 1’s in the exponent is reserved for NaN and infinity
Maximum biased exponent is 1111 1110 = 254
Maximum fraction is 23 1’s

Largest Single Precision Float
1111 1110 = 254
254-127 = 127
Largest value == $1.11111111111111111111111_2 \times 2^{127}$

Largest Single Precision Float
Move the binary point 23 digits to the right and subtract 23 from the exponent:
$1.11111111111111111111111_2 \times 2^{127} = 111111111111111111111111_2 \times 2^{104}$

Largest Single Precision Float
Convert mantissa: 24 1’s == $2^{24} - 1 = 16777215$
$16777215 \times 2^{104} \approx 3.4028235 \times 10^{38}$
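
The result can be cross-checked against the `FLT_MAX` constant from `<float.h>`. This is a sketch assuming IEEE 754 single and double precision; `bits_to_float` and `largest_single_by_formula` are hypothetical helper names. The product is built in `double`, where both $2^{24} - 1$ and the repeated doubling are exact.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float. */
float bits_to_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Compute (2^24 - 1) * 2^104 in double precision. */
double largest_single_by_formula(void) {
    double value = 16777215.0;   /* 2^24 - 1, i.e. 24 one bits */
    int i;
    for (i = 0; i < 104; i++)
        value *= 2.0;            /* each doubling adds 1 to the exponent */
    return value;
}
```

The encoding with biased exponent 1111 1110 and 23 1’s in the fraction is 0x7f7fffff, so `bits_to_float(0x7F7FFFFF)` and `largest_single_by_formula()` should both equal `FLT_MAX`.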

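Smallest Nonzero Single
What we want is: $1.0 \times 2^{-127}$
But that has exponent & fraction = 0. That value is reserved for zero!
Therefore, the closest we can get is either
$1.0 \times 2^{-126}$ (biased exponent = 1, fraction = 0)
or
$1.00000000000000000000001_2 \times 2^{-127}$ (biased exponent = 0, smallest nonzero fraction)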

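Smallest Nonzero Single
Normalized: $1.00000000000000000000001_2 \times 2^{-127}$
In this case, using a normalized number is not ideal. If we could use a denormalized number, we could get a much smaller value:
Denormalized: $0.00000000000000000000001_2 \times 2^{-126}$
This is equivalent to: $1.0 \times 2^{-149}$
An extra 22 bits of precision!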

Smallest Nonzero Single
The IEEE realized this: when the exponent is zero and the fraction is > 0, the value is treated as a denormalized number.
The smallest nonzero normalized: exp = 0000 0001, m = 0000….0
The smallest nonzero denormalized: exp = 0000 0000, m = 0000….1

Smallest Nonzero Normalized Single
biased exponent = 1, fraction = 0
0000 0001 00000000000000000000000
Value == $1.0 \times 2^{1 - 127} = 2^{-126} \approx 1.18 \times 10^{-38}$

Smallest Nonzero Denormalized Single
biased exponent = 0, fraction = 0.00000000000000000000001
0000 0000 00000000000000000000001
Value == $2^{-23} \times 2^{-126} = 2^{-149} \approx 1.4 \times 10^{-45}$
*Note: even though the exponent is encoded as -127, the value is computed using the smallest “valid” exponent, which is -126.
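
Both smallest values can be rebuilt from their bit patterns and compared with exact powers of two. The sketch assumes IEEE 754 single and double precision and hardware that is not configured to flush denormalized numbers to zero; `bits_to_float` and `pow2_neg` are hypothetical helper names.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float. */
float bits_to_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Compute 2^(-n) in double precision by repeated halving.
 * Exact for the small n used here (n well below 1022). */
double pow2_neg(int n) {
    double value = 1.0;
    while (n-- > 0)
        value *= 0.5;
    return value;
}
```

The pattern 0x00800000 (biased exponent 1, fraction 0) should equal `FLT_MIN`, which is $2^{-126}$, and the pattern 0x00000001 (biased exponent 0, fraction 0…01) should equal $2^{-149}$, that is `pow2_neg(149)` after widening to `double`.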