CSCI206 - Computer Organization & Programming


CSCI206 - Computer Organization & Programming Floating Point Limits zyBook: 10.9, 10.10

IEEE 754 Standard (1985)
Value (normalized) = (-1)^S * 1.M * 2^(E - bias)
S = sign bit, E = biased exponent, M = mantissa (fraction)
S, E, and M are encoded in the binary word.

IEEE754 - Reserved Values NaN = Not a Number

IEEE754 - Example Show 3.14 as a single precision float

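3.14 - step 1 write in binary
3.14 == 3 + 0.14 (3 == 11 in binary)
Multiply the fraction by 2 repeatedly; the integer part of each result is the next bit:
0.14*2 = 0.28 -> 0
0.28*2 = 0.56 -> 0
0.56*2 = 1.12 -> 1
0.12*2 = 0.24 -> 0
......
0.14 == 0.0010......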

3.14 - step 1 write in binary
Need 24 bits for single (53 for double), counting the leading 1. In this example, 2 bits before the point, 22 bits after.
3.14 == 11.0010001111010111000011

3.14 - step 2 normalize binary
Normalized form is 1.yyyyy * 2^e
3.14 == 11.0010001111010111000011 == 1.10010001111010111000011 * 2^1
Note a total of 24 bits.

3.14 - step 3 write mantissa & sign
3.14 == 1.10010001111010111000011 * 2^1
M = 10010001111010111000011
S = 0 (positive)
Note that the mantissa keeps only 23 bits. The leading bit of a normalized number is always 1, so it is omitted from the stored representation only; it still counts in the value.

3.14 - step 4 encode exponent
3.14 == 1.10010001111010111000011 * 2^1
Exponent = 1, B (bias) = 127
E (biased exponent, 8 bits) = 1 + 127 = 128 = 1000 0000

3.14 - step 5 write result
S = 0 (positive)
E = 1000 0000
M = 10010001111010111000011
0 1000 0000 10010001111010111000011
to hex = 0x4048f5c3

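Endianness
On a little-endian system (Intel, etc.), the bytes of the IEEE754 value are stored in memory least significant byte first, so the value appears byte & word swapped:
0x 40 48 f5 c3 (big endian)
4840 c3f5 (bytes swapped within each 16-bit word)
0x c3f5 4840 (little endian: words swapped as well)
Swap bytes and words!
    #include <stdio.h>
    int main(void) {
        float f = 3.14f;
        unsigned char *p = (unsigned char *)&f;   // view the float as raw bytes, in memory order
        printf("%02x%02x %02x%02x\n", *p, *(p+1), *(p+2), *(p+3));
    }
    // result on Intel: c3f5 4840, on big-endian MIPS: 4048 f5c3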

Review IEEE754
Fields: S | Exponent | Mantissa
Special values, else normalized numbers:
Exponent    Mantissa (fraction)    Value
0           0                      +/- zero
0           nonzero                denormalized number
all 1’s     0                      +/- infinity
all 1’s     nonzero                NaN (not a number)

Largest Single Precision Float
8 bit exponent (bias = 127), 23 bit fraction
All 1’s in the exponent is reserved for NaN and infinity
Maximum biased exponent is 1111 1110 = 254
Maximum fraction is 23 1’s

Largest Single Precision Float
Biased exponent 1111 1110 = 254, so the actual exponent is 254-127 = 127
Value = 1.11111111111111111111111 * 2^127

Largest Single Precision Float
Move the binary point 23 digits to the right and subtract 23 from the exponent:
1.11111111111111111111111 * 2^127 == 111111111111111111111111 * 2^104

Largest Single Precision Float
Convert mantissa: 24 1’s == 2^24 - 1 = 16777215
Largest value = 16777215 * 2^104, approximately 3.4028235 * 10^38

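Smallest Nonzero Single
What we want is the smallest exponent with the smallest fraction: 1.0 * 2^-127
But that has exponent & fraction = 0. That value is reserved for zero!
Therefore, the closest we can get is either exponent = 0000 0001 with fraction = 0, or exponent = 0000 0000 with fraction = 0000….1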

Smallest Nonzero Single
In this case, using a normalized number is not ideal. If we could use a denormalized number, we could get a much smaller value: 0.00000000000000000000001 * 2^-126
This is equivalent to: 1.0 * 2^-149
An extra 22 bits of precision!

Smallest Nonzero Single
The IEEE realized this: when the exponent field is zero and the fraction is > 0, the value is treated as a denormalized number.
The smallest nonzero normalized: exp = 0000 0001, m = 0000….0
The smallest nonzero denormalized: exp = 0000 0000, m = 0000….1

Smallest Nonzero Normalized Single
biased exponent = 1, fraction = 0
0000 0001 00000000000000000000000
Value = 1.0 * 2^(1-127) = 2^-126, approximately 1.18 * 10^-38

Smallest Nonzero Denormalized Single
biased exponent = 0, fraction = 0.00000000000000000000001
0000 0000 00000000000000000000001
Value = 2^-23 * 2^-126 = 2^-149, approximately 1.4 * 10^-45
*Note: an exponent field of 0 would correspond to 0 - 127 = -127, but a denormalized value is computed using the smallest “valid” exponent, which is -126.