CSE 140: Computer Arithmetic Algorithms and Hardware Design

Slides:



Advertisements
Similar presentations
Fixed Point Numbers The binary integer arithmetic you are used to is known by the more general term of Fixed Point arithmetic. Fixed Point means that we.
Advertisements

CSCE 212 Chapter 3: Arithmetic for Computers Instructor: Jason D. Bakos.
Fabián E. Bustamante, Spring 2007 Floating point Today IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Next time.
Computer Engineering FloatingPoint page 1 Floating Point Number system corresponding to the decimal notation 1,837 * 10 significand exponent A great number.
Carnegie Mellon Instructors: Dave O’Hallaron, Greg Ganger, and Greg Kesden Floating Point : Introduction to Computer Systems 4 th Lecture, Sep 6,
1 Binghamton University Floating Point CS220 Computer Systems II 3rd Lecture.
Carnegie Mellon Instructors: Randy Bryant & Dave O’Hallaron Floating Point : Introduction to Computer Systems 3 rd Lecture, Aug. 31, 2010.
Copyright 2008 Koren ECE666/Koren Part.4b.1 Israel Koren Spring 2008 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Digital Computer.
Datorteknik FloatingPoint bild 1 Floating point Number system corresponding to the decimal notation 1,837 * 10 significand exponent a great number of corresponding.
University of Washington Today Topics: Floating Point Background: Fractional binary numbers IEEE floating point standard: Definition Example and properties.
1 CSE1301 Computer Programming Lecture 30: Real Number Representation.
CSE1301 Computer Programming Lecture 33: Real Number Representation
CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Winter 2004 Lecture 10 Thursday 02/19/02.
Integer Arithmetic Floating Point Representation Floating Point Arithmetic Topics.
CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Fall 2006 Lecture 10 Floating Point Number Rounding, Polynomial.
CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Winter 2004 Lecture 9.
CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Fall 2006 Lecture 9: Floating Point Numbers.
CSE 378 Floating-point1 How to represent real numbers In decimal scientific notation –sign –fraction –base (i.e., 10) to some power Most of the time, usual.
Computer Organization and Architecture Computer Arithmetic Chapter 9.
Computing Systems Basic arithmetic for computers.
Floating Point Numbers Topics –IEEE Floating Point Standard –Rounding –Floating Point Operations –Mathematical properties.
Instructor: Andrew Case Floating Point Slides adapted from Randy Bryant & Dave O’Hallaron.
Carnegie Mellon Instructors: Greg Ganger, Greg Kesden, and Dave O’Hallaron Floating Point : Introduction to Computer Systems 4 th Lecture, Sep 4,
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Carnegie Mellon Instructor: San Skulrattanakulchai Floating Point MCS-284:
Computer Arithmetic Floating Point. We need a way to represent –numbers with fractions, e.g., –very small numbers, e.g., –very large.
Computer Engineering FloatingPoint page 1 Floating Point Number system corresponding to the decimal notation 1,837 * 10 significand exponent A great number.
Floating Point Numbers Representation, Operations, and Accuracy CS223 Digital Design.
Carnegie Mellon Instructors: Randy Bryant, Dave O’Hallaron, and Greg Kesden Floating Point : Introduction to Computer Systems 4 th Lecture, Sep 5,
CS 232: Computer Architecture II Prof. Laxmikant (Sanjay) Kale Floating point arithmetic.
Number Representation Fixed and Floating Point
1 Floating Point. 2 Topics Fractional Binary Numbers IEEE 754 Standard Rounding Mode FP Operations Floating Point in C Suggested Reading: 2.4.
Lecture 4 IEEE 754 Floating Point Topics IEEE 754 standard Tiny float January 25, 2016 CSCE 212 Computer Architecture.
Floating Point Topics IEEE Floating-Point Standard Rounding Floating-Point Operations Mathematical Properties CS 105 “Tour of the Black Holes of Computing!”
Floating Point CSE 238/2038/2138: Systems Programming
Floating Point Borrowed from the Carnegie Mellon (Thanks, guys!)
Floating Point Numbers
CS 105 “Tour of the Black Holes of Computing!”
2.4. Floating Point Numbers
Topics IEEE Floating Point Standard Rounding Floating Point Operations
CS 105 “Tour of the Black Holes of Computing!”
Floating Point Number system corresponding to the decimal notation
CS 232: Computer Architecture II
CS 367 Floating Point Topics (Ch 2.4) IEEE Floating Point Standard
PRESENTED BY J.SARAVANAN. Introduction: Objective: To provide hardware support for floating point arithmetic. To understand how to represent floating.
Arithmetic for Computers
ECE/CS 552: Floating Point
CSCE 350 Computer Architecture
Number Representations
CSCI206 - Computer Organization & Programming
CS 105 “Tour of the Black Holes of Computing!”
How to represent real numbers
CS 105 “Tour of the Black Holes of Computing!”
Floating Point Arithmetic August 31, 2009
CS 105 “Tour of the Black Holes of Computing!”
COMS 361 Computer Organization
Faculty of Cybernetics, Statistics and Economic Informatics –
October 17 Chapter 4 – Floating Point Read 5.1 through 5.3 1/16/2019
UNIVERSITY OF MASSACHUSETTS Dept
UNIVERSITY OF MASSACHUSETTS Dept
UNIVERSITY OF MASSACHUSETTS Dept
CS 105 “Tour of the Black Holes of Computing!”
CS213 Floating Point Topics IEEE Floating Point Standard Rounding
UNIVERSITY OF MASSACHUSETTS Dept
CS 105 “Tour of the Black Holes of Computing!”
“The course that gives CMU its Zip!”
Number Representations
CSE140 Final Review Xinyuan Wang 06/06/2019.
CS 105 “Tour of the Black Holes of Computing!”
Presentation transcript:

CSE 140: Computer Arithmetic Algorithms and Hardware Design Lecture 19: Floating Point Numbers Instructor: Prof. Chung-Kuan Cheng

Motivation Maximal information with given bit numbers. Arithmetic with proper precision. Fairness of rounding. Features at the expenses of the complexity of the operations.

Topics: Floating Point Numbers (IEEE P754) Standard Operations Exceptional Situations Rounding Modes Numerical Computing with IEEE Floating Point Arithmetic, Michael L. Overton, SIAM

Standard 232  Typically Goal: Dynamic Range: largest #/ smallest # If too large, holes between #’s

Standard ulp (unit in the last place) Difference between two consecutive values of the significand. 3 Parts x = ~s be:sign, significand, exponent Sign Bit 23-bit Significand 8-bit exponent

Standard: nomalization ±e1e2e3e4e5e6e7e8s1s2s3…s22s23 1.s1s2s3…s22s23 normalized number 0.s1s2s3…s22s23 denormalized number Id e1e2e3e4e5e6e7e8 0 0 0 0 0 0 0 0 0 x=0.s1s2s3…s22s23 2-126 1 0 0 0 0 0 0 0 1 x=1.s1s2s3…s22s23 2-126 2 0 0 0 0 0 0 1 0 x=1.s1s2s3…s22s23 2-125 . 126 0 1 1 1 1 1 1 0 x=1.s1s2s3…s22s23 2-1 127 0 1 1 1 1 1 1 1 x=1.s1s2s3…s22s23 20 128 1 0 0 0 0 0 0 0 x=1.s1s2s3…s22s23 21 253 1 1 1 1 1 1 0 1 x=1.s1s2s3…s22s23 2126 254 1 1 1 1 1 1 1 0 x=1.s1s2s3…s22s23 2127 255 1 1 1 1 1 1 1 1 x= Inf if (s1 …s23)= 0, else NaN NaN  Not a Number

Standard: normalization 0.01x2-3 = 0.001x2-2 Same number, so normalize to remove redundancy Use a default 1 in front for one more bit precision. Smallest Number 0.00…01x2-126 = 1.0x2-23x2-126 = 1x2-149

Standard - Example ±eeeeeeee sssss sssss sssss sssss sss 1 00000000 00000000000000000000000 =-0.000…0x2-126 minimum 0 00000000 00000000000000000000001 = 1x2-149 normalized minimum 0 00000001 00000000000000000000000 = 1.000…0x2-126 0 00000001 00000000000000000000001 = 1.000…1x2-126 . 0 01111111 00000000000000000000000 = 1.000…0x20 0 01111111 00000000000000000000001 = 1.000…1x20 0 10000000 00000000000000000000001 = 1.000…1x21

Standard – Example Cont. - Normalized Maximum 0 11111111 00000000000000000000000 = Inf Tiniest= 1 x 2-149 Nmin = 1.0 x 2-126 Nmax = (2 – 2-23)2127

Double Floating Point ± e1e2…e11 s1s2…s52 0 00…000 s1s2…s52 x=0.s1s2…s52 2-1022 0 00…001 s1s2…s52 x=1.s1s2…s52 2-1022 . 0 01…111 s1s2…s52 x=1.s1s2…s52 20 0 10…000 s1s2…s52 x=1.s1s2…s52 21 0 11…110 s1s2…s52 x=1.s1s2…s52 21023 0 11…111 s1s2…s52 x=Inf if (s1…s52)=0

Overflow/Underflow Underflow Denser Sparser Overflow Nmin Nmax

Addition/Multiplication ~s1xbe1 + (~s2xbe2) = ~sxbe = ~s1xbe1 + ~s2/be1-e2 x be1 = (~s1 + ~s2/be1-e2) x be1 (~s1xbe1) x (~s2xbe2) = ~(s1xs2)be1+e2

Exceptions a/0 = Inf if a > 0 a/Inf = 0 if a != 0 a·0 = 0 a·Inf = Inf if a > 0 a + Inf = Inf 0·Inf = invalid operation (NaN) 0/0 = invalid operation (NaN) Inf - Inf = NaN NaP op a = NaN

Rounding Mode Adder Output = Cout z1z0.z-1z-2…z-l GRS Guard Bit Round Bit Sticky Bit, OR of all bits below bit R 1.101 x 23 +1.110 x 23 11.011 x 23 1.1011x24 Normalize – need to round or

Rounding 1.110 23 - 1.101 23 0.001 23 1.000 20 normalize 1.101 23 - 1.101 22 - 0.1101 23 0.1101 23 1.101 22 Guard bit

Rounding Round to the nearest even 1.10111 toward 0 1.1011 Toward +Inf 1.1100 Toward -Inf 1.1011

Conventional Rounding Error 1.10100  1.101 = 0 1.10101  1.101 = -0.25 1.10110  1.110 = +0.5 1.10111  1.110 = +0.25 Average Error = 0.5/4 = 0.125 Assuming 3 bit precision.

Conventional Rounding Error 1.10100  1.101 = 0 1.10101  1.101 = -0.25 1.10110  1.110 = +0.5 1.10111  1.110 = +0.25 Average Error = 0.5/4 = 0.125 Correction: round up only if there is one or more 1s at the right. 1.101100  1.101 = -0.5 1.101101  1.110 = +0.375

Rounding: round to even Round up conditions: 1.bbb…bbbGRXXXX Round = 1, Sticky = 1  > 0.5 Guard = 1, Round = 1, Sticky = 0  Round to even Value Fraction GRS Incr? Rounded 128 1.0000000 000 N 1.000 15 1.1010000 100 N 1.101 17 1.0001000 010 N 1.000 19 1.0011000 110 Y 1.010 138 1.0001010 011 Y 1.001 63 1.1111100 111 Y 10.000 Guard bit: LSB of result Round bit: 1st bit removed Sticky bit: OR of remaining bits Round up= R(G+S)

Rounding: Round to even 1.bbb…bbbGRS Round up = R(G+S) GRS Rouding Error 000 0 0 001 0 -0.25 010 0 -0.5 011 +1 +0.25 100 0 0 101 0 -0.25 110 +1 +0.5 111 +1 +0.25

Rounding We need only one guard bit for normalization after addition. Assumption: Operands are normalized.

Example 2 1.00001 23 -1.01011 2-1 Normalize according to exponent (assuming 5 bit precision) 1.00001 23 -0.000101011 23 0.111100101 23 Renormalize 1.11100101 22 Result = 1.11101 22 Bit on the boundary Round bit Take 5 bits after decimal Non-zero => round-up

Theory behind it g r guard Other bits round OR Sticky bit When shifting right, don’t need to remember anything more than 3 bits below

Reference Michael L. Overton, Numerical Computing with IEEE Floating Point Arithmetic, SIAM, 2001 Behrooz Parhami, Computer Arithmetic, second edition, Oxford, 2010 Israel Koren, Computer Arithmetic Algorithms, second edition, A K Peters, Ltd., 2002