Floating Point (a brief look)


We need a way to represent
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., 0.000000001
– very large numbers, e.g., numbers on the order of 10^9
– A notation that renders numbers with a single digit to the left of the decimal point is called scientific notation.
– A number in scientific notation that has no leading zeros is called a normalized number. E.g., a form that starts with 0.1 or 10.0 is not in normalized scientific notation, while the form of the same value that starts with 1.0 is.
– If we let the decimal point move, we can easily create other representations of the same value. Each alternative representation is created by shifting the decimal point from its original location. Since each shift of one place multiplies or divides the mantissa by the base, we increase or decrease the exponent by one to compensate for the shift.
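
The shift-and-compensate rule can be sketched in a few lines of Python. This is only an illustration: the function name normalize and the exponents -8 and -10 used below are assumptions made for the example, not values taken from the slides.

```python
from decimal import Decimal

def normalize(mantissa, exponent):
    """Return (m, e) with 1 <= |m| < 10 and m * 10**e equal to the input value."""
    if mantissa == 0:
        return mantissa, 0
    # Point too far right: shift it left one place, add 1 to the exponent.
    while abs(mantissa) >= 10:
        mantissa = mantissa / 10
        exponent += 1
    # Leading zero: shift the point right one place, subtract 1 from the exponent.
    while abs(mantissa) < 1:
        mantissa = mantissa * 10
        exponent -= 1
    return mantissa, exponent

# Two non-normalized spellings of the same value end up in the same normalized form.
print(normalize(Decimal("0.1"), -8))
print(normalize(Decimal("10.0"), -10))
```

Decimal is used instead of float so that each shift by a power of ten is exact.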

Thus, defining a number in exponential (scientific) notation requires the specification of four separate components:
a) the sign of the number
b) the magnitude of the number, known as the MANTISSA
c) the sign of the exponent
d) the magnitude of the exponent
Two additional pieces of information are required to complete the picture:
e) the base of the exponent (in this case 10; for computers it is 2)
f) the location of the radix point. The radix point is set at a particular location in the number, most commonly the beginning or the end of the number. Since its location never changes, it is not necessary to actually store the point; its location is implied.
A designer of a floating point representation must find a compromise between the size of the fraction and the size of the exponent, because a fixed word size means you must take a bit from one to add a bit to the other. This trade-off is between precision and range: increasing the size of the fraction improves precision, while increasing the size of the exponent increases the range of numbers that can be represented.

For floating point numbers, we might assign the digits as follows: SEEMMMMM
– The sign digit (S) stores the sign of the mantissa. Most commonly, the mantissa is stored in sign-magnitude format. A few computers use complementary notation.
– This layout makes no specific provision for the sign of the exponent, so the sign of the exponent must be included within the exponent digits (EE) themselves.
– The method of storing the exponent is known as Excess-N notation, where N is the chosen middle value. Say the exponent field can take values from 00 to 99 (two decimal digits). We pick a value near the middle, 50, and declare that it corresponds to the exponent 0. Every stored value below 50 then represents a negative exponent, and every stored value above 50 a positive one.
– What we have done is offset, or bias, the value of the exponent by our chosen amount. Thus, to convert from exponential format to the format used in our example, we add the offset to the exponent and store the result.
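
A minimal sketch of the SEEMMMMM layout with Excess-50 exponents follows. The function names encode and decode, the convention that sign digit 0 means positive and 1 means negative, and passing the mantissa as a five-character digit string are all assumptions of this sketch; the slide does not fix them. The position of the radix point is left implied, as in the text.

```python
EXCESS = 50  # stored exponent 50 corresponds to the true exponent 0

def encode(sign, exponent, mantissa_digits):
    """Pack a sign digit, a true exponent, and five mantissa digits into SEEMMMMM."""
    if sign not in (0, 1):
        raise ValueError("sign digit must be 0 or 1 in this sketch")
    if len(mantissa_digits) != 5 or not mantissa_digits.isdigit():
        raise ValueError("mantissa must be exactly five decimal digits")
    stored = exponent + EXCESS          # add the offset before storing
    if not 0 <= stored <= 99:
        raise OverflowError("exponent does not fit in two decimal digits")
    return f"{sign}{stored:02d}{mantissa_digits}"

def decode(word):
    """Split SEEMMMMM back into (sign, true exponent, mantissa digits)."""
    sign = int(word[0])
    exponent = int(word[1:3]) - EXCESS  # subtract the offset to recover the exponent
    return sign, exponent, word[3:]

print(encode(0, -3, "12345"))   # exponent -3 is stored as 47
print(decode("15231416"))       # stored 52 decodes to exponent +2
```

With this scheme the true exponent can range from -50 to +49, and no separate digit is spent on the exponent's sign.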

MIPS floating point representation
Representation:
– In normalized scientific notation, a number is written with a single nonzero digit to the left of the binary point, e.g., 1.xxxxxx × 2^yyyy.
– Out of the 32 bits, 1 bit is reserved for the sign, 8 bits for the exponent, and 23 bits for the fraction.
– Thus the representable magnitudes run from almost as small as 2.0 times a large negative power of ten to almost as large as 2.0 times a large positive power of ten.
– Overflow occurs when the exponent is too large to fit in the exponent field; underflow occurs when the negative exponent is too large to fit in the exponent field.
IEEE 754 floating point standard:
– single precision: 8 bit exponent, 23 bit fraction
– double precision: 11 bit exponent, 52 bit fraction
– MIPS double precision allows numbers of much smaller and much larger magnitude than single precision, again bounded by 2.0 times a power of ten at each end.
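
The single-precision limits can be worked out from the field widths. The sketch below assumes the IEEE 754 conventions of a bias of 127 and of reserving the all-zeros and all-ones exponent patterns, and it uses Python's struct module to round a value to single precision. The helper name to_single is hypothetical.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Largest finite single: all-ones 23 bit fraction, stored exponent 254 (254 - 127 = 127).
largest_single = (2.0 - 2.0 ** -23) * 2.0 ** 127
# Smallest normalized single: zero fraction, stored exponent 1 (1 - 127 = -126).
smallest_normal_single = 2.0 ** -126

print(largest_single, smallest_normal_single)

# Overflow: the exponent of 1e39 does not fit in the 8 bit field.
try:
    to_single(1e39)
    overflowed = False
except OverflowError:
    overflowed = True

# Underflow: 1e-50 is too small for single precision and rounds to zero.
underflowed = to_single(1e-50)
print(overflowed, underflowed)
```

Note that struct reports overflow as an exception, while the underflow case silently becomes 0.0; hardware may signal either condition differently.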

IEEE 754 floating-point standard
– The leading “1” bit of the significand is implicit. Hence the significand is 24 bits long in single precision (implied 1 and a 23 bit fraction) and 53 bits long in double precision (1 + 52).
– We use the term “significand” for the 24 or 53 bit number that is 1 plus the fraction, and “fraction” for the 23 or 52 bit number.
– The exponent is “biased” to make sorting easier
  – all 0s is the smallest exponent, all 1s is the largest
  – bias of 127 for single precision and 1023 for double precision
  – summary: (-1)^sign × (1 + Fraction) × 2^(exponent - bias)
– Refer to the figure on page 194 of the textbook.
Example:
– decimal: -0.75 = -(½ + ¼)
– binary: -0.11 = -1.1 × 2^-1
– floating point: exponent = 126 = 01111110 in binary, since -1 + 127 = 126
– IEEE single precision: 1 01111110 10000000000000000000000 (sign, exponent, fraction)
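
The -0.75 example can be checked mechanically. The sketch below uses Python's struct module to obtain the 32 bit pattern of a single-precision value and then splits it into the three fields. The helper names single_fields and single_value are hypothetical, and single_value handles normalized numbers only.

```python
import struct

BIAS = 127  # single-precision exponent bias

def single_fields(x):
    """Return (sign, stored exponent, fraction) of x as a single-precision value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8 bit biased exponent
    fraction = bits & 0x7FFFFF       # 23 bit fraction, implicit leading 1 not stored
    return sign, exponent, fraction

def single_value(sign, exponent, fraction):
    """Apply (-1)^sign * (1 + Fraction) * 2^(exponent - bias) to the three fields."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - BIAS)

sign, exponent, fraction = single_fields(-0.75)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
print(single_value(sign, exponent, fraction))
```

Because the exponent is biased and sits above the fraction, two positive values compare in the same order as their bit patterns read as unsigned integers, which is the sorting benefit the slide mentions.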

Floating point addition
1. Start.
2. Compare the exponents of the two numbers; shift the smaller number to the right until its exponent matches the larger exponent.
3. Add the significands.
4. Normalize the sum, either shifting right and incrementing the exponent, or shifting left and decrementing the exponent.
5. Overflow or underflow? If yes, raise an exception; if no, go to step 6.
6. Round the significand to the appropriate number of bits.
7. Still normalized? If yes, done; if no, go back to step 4.
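
The addition steps can be sketched in Python as follows. This is a simplified model, not the hardware algorithm: numbers are (sig, exp) pairs with a signed 24 bit integer significand, three extra low-order bits are kept during the add, bits shifted out during alignment are simply dropped, and rounding is round-half-up. All of these choices, and the names fp_add, check_range and to_float, are assumptions of the sketch, so results may differ from true IEEE 754 rounding in corner cases.

```python
P = 24                     # significand bits: implicit 1 plus 23 bit fraction
G = 3                      # extra low-order bits kept while adding (sketch assumption)
E_MIN, E_MAX = -126, 127   # normalized single-precision exponent range

def to_float(x):
    """Value of a (sig, exp) pair: sig * 2**(exp - (P - 1))."""
    sig, exp = x
    return sig * 2.0 ** (exp - (P - 1))

def check_range(exp):
    # Step 5: raise an exception if the exponent no longer fits.
    if exp > E_MAX:
        raise OverflowError("overflow: exponent too large")
    if exp < E_MIN:
        raise ArithmeticError("underflow: exponent too small")

def fp_add(a, b):
    """Add two (sig, exp) pairs with 2**(P-1) <= |sig| < 2**P, or sig == 0."""
    (sa, ea), (sb, eb) = a, b
    if sa == 0:
        return b
    if sb == 0:
        return a
    # Step 2: make `a` the operand with the larger exponent, then shift the
    # smaller operand's magnitude right until the exponents match.
    if ea < eb:
        sa, ea, sb, eb = sb, eb, sa, ea
    mag_a = abs(sa) << G
    mag_b = (abs(sb) << G) >> (ea - eb)   # shifted-out bits are dropped
    # Step 3: add the signed significands.
    total = (mag_a if sa > 0 else -mag_a) + (mag_b if sb > 0 else -mag_b)
    if total == 0:
        return (0, 0)
    sign, mag, exp = (1 if total > 0 else -1), abs(total), ea
    while True:
        # Step 4: normalize by shifting right or left and adjusting the exponent.
        while mag >= 1 << (P + G):
            mag >>= 1
            exp += 1
        while mag < 1 << (P + G - 1):
            mag <<= 1
            exp -= 1
        check_range(exp)                      # Step 5
        mag = (mag + (1 << (G - 1))) >> G     # Step 6: round to P bits
        if mag < 1 << P:                      # Step 7: still normalized?
            return (sign * mag, exp)
        mag <<= G                             # rounding carried out: back to step 4

# 0.5 + (-0.4375): 1.0 x 2^-1 plus -1.11 x 2^-2 in binary.
result = fp_add((1 << 23, -1), (-(7 << 21), -2))
print(result, to_float(result))
```

The loop back to step 4 matters only when rounding carries out of the top bit, for example when the significand is all ones before rounding.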
