Floating-Point Considerations
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2010, ECE408, University of Illinois, Urbana-Champaign


© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 1 Floating-Point Considerations

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 2 Objective
To understand the fundamentals of floating-point representation
To know the IEEE-754 Floating Point Standard
CUDA GPU floating-point speed, accuracy and precision:
– Deviations from IEEE-754
– Accuracy of device runtime functions
– The -use_fast_math compiler option
– Future performance considerations

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 3 Motivation
High-speed floating-point arithmetic has become a standard feature for microprocessors, so it is important for application programmers to understand and take advantage of floating-point arithmetic when developing applications.
Focus on:
– Accuracy of floating-point arithmetic
– Precision of floating-point number representation
– How the accuracy and precision of floating-point representation should be taken into consideration in parallel computation

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 4 What is the IEEE floating-point format?
A floating-point binary number consists of three parts: sign (S), exponent (E), and mantissa (M). Each (S, E, M) pattern uniquely identifies a floating-point number.
For each bit pattern, its IEEE floating-point value is derived as:
value = (-1)^S * M * 2^E, where 1.0_B ≤ M < 10.0_B (and 10.0_B = (10.0)_2 = (2.0)_10)
The interpretation of S is simple: S = 0 gives a positive number and S = 1 a negative number.
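For the real 32-bit IEEE-754 format (1 sign bit, 8 exponent bits with an excess of 127, 23 mantissa bits), the three fields can be pulled apart with a few bit operations. A minimal host-side sketch, not from the original slides:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float f = -0.5f;
        unsigned int bits;
        memcpy(&bits, &f, sizeof bits);        /* reinterpret the bit pattern */

        unsigned int s = bits >> 31;           /* 1 sign bit                     */
        unsigned int e = (bits >> 23) & 0xFF;  /* 8 exponent bits, excess-127    */
        unsigned int m = bits & 0x7FFFFF;      /* 23 mantissa bits, "1." omitted */

        /* -0.5 = (-1)^1 * 1.0 * 2^-1  ->  S=1, E=126, M=0 */
        printf("S=%u  E=%u (actual %d)  M=0x%06X\n", s, e, (int)e - 127, m);
        return 0;
    }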

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 5 Normalized Representation
Specifying that 1.0_B ≤ M < 10.0_B makes the mantissa value for each floating-point number unique. For example, the only mantissa value allowed for 0.5_D is M = 1.0:
0.5_D = 1.0_B * 2^-1
Neither 10.0_B * 2^-2 nor 0.1_B * 2^0 qualifies.
Because all mantissa values are of the form 1.XX…, one can omit the "1." part from the representation. The mantissa value of 0.5_D in a 2-bit mantissa is therefore 00, derived by omitting the "1." from 1.00.

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 6 Exponent Representation
In an n-bit exponent representation, 2^(n-1) - 1 is added to the exponent's 2's complement value to form its excess representation (see the 3-bit example below).
A simple unsigned integer comparator can then be used to compare the magnitudes of two FP numbers.
The range is symmetric for +/- exponents, with the excess pattern 111 reserved.

2's complement | Actual decimal | Excess-3
000            | 0              | 011
001            | 1              | 100
010            | 2              | 101
011            | 3              | 110
100            | (reserved pattern) | 111
101            | -3             | 000
110            | -2             | 001
111            | -1             | 010
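The unsigned-comparator claim can be checked directly: because the biased (excess) exponent sits above the mantissa in the bit layout, the integer ordering of two non-negative floats' bit patterns matches their floating-point ordering. A small host-side sketch, not from the slides:

    #include <stdio.h>
    #include <string.h>

    static unsigned int bits_of(float f) {
        unsigned int u;
        memcpy(&u, &f, sizeof u);   /* raw bit pattern of the float */
        return u;
    }

    int main(void) {
        float a = 0.75f, b = 2.5f;   /* two non-negative floats */
        printf("float order: %d   integer order: %d\n",
               a < b, bits_of(a) < bits_of(b));   /* both print 1 */
        return 0;
    }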

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 7 A simple, hypothetical 5-bit FP format
Assume a 1-bit S, 2-bit E, and 2-bit M:
0.5_D = 1.00_B * 2^-1
0.5_D = 0 00 00, where S = 0, E = 00, and M = (1.)00

2's complement | Actual decimal | Excess-1
00             | 0              | 01
01             | 1              | 10
10             | (reserved pattern) | 11
11             | -1             | 00

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 8 Representable Numbers
The representable numbers of a given format are the set of all numbers that can be exactly represented in the format. An unsigned 3-bit integer format, for example, represents exactly the integers 0 (000) through 7 (111) and nothing in between.

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 9 Representable Numbers of a 5-bit Hypothetical IEEE Format
Positive (S = 0) values shown; S = 1 gives the same magnitudes negated. E = 11 is the reserved pattern.

E  | M  | No-zero        | Abrupt underflow | Gradual underflow
00 | 00 | 2^-1           | 0                | 0
00 | 01 | 2^-1 + 1*2^-3  | 0                | 1*2^-2
00 | 10 | 2^-1 + 2*2^-3  | 0                | 2*2^-2
00 | 11 | 2^-1 + 3*2^-3  | 0                | 3*2^-2
01 | 00 | 2^0            | 2^0              | 2^0
01 | 01 | 2^0 + 1*2^-2   | 2^0 + 1*2^-2     | 2^0 + 1*2^-2
01 | 10 | 2^0 + 2*2^-2   | 2^0 + 2*2^-2     | 2^0 + 2*2^-2
01 | 11 | 2^0 + 3*2^-2   | 2^0 + 3*2^-2     | 2^0 + 3*2^-2
10 | 00 | 2^1            | 2^1              | 2^1
10 | 01 | 2^1 + 1*2^-1   | 2^1 + 1*2^-1     | 2^1 + 1*2^-1
10 | 10 | 2^1 + 2*2^-1   | 2^1 + 2*2^-1     | 2^1 + 2*2^-1
10 | 11 | 2^1 + 3*2^-1   | 2^1 + 3*2^-1     | 2^1 + 3*2^-1
11 | -- | Reserved pattern

The no-zero format cannot represent zero!
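The table can be regenerated mechanically. Below is a small host-side sketch (not part of the original slides) that assumes the excess-1 exponent of this 2-bit-exponent format and the denormal rule from the later slides (E = 00 is read as 0.M * 2^0); it prints the no-zero and gradual-underflow columns for S = 0:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        for (int E = 0; E <= 2; ++E) {        /* E = 3 (binary 11) is reserved */
            for (int M = 0; M < 4; ++M) {
                double nozero  = ldexp(1.0 + M / 4.0, E - 1);  /* 1.M * 2^(E-1) */
                double gradual = (E == 0) ? M / 4.0            /* 0.M * 2^0     */
                                          : nozero;
                printf("E=%d M=%d  no-zero=%-7g gradual=%g\n", E, M, nozero, gradual);
            }
        }
        return 0;
    }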

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 10 Representable Numbers: Observations
The exponent bits define the major intervals of representable numbers.
The mantissa bits define the number of representable numbers in each interval.
0 is not representable in this (no-zero) format.
Representable numbers get closer to each other toward 0.

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 11 Flush to Zero
Treat all bit patterns with E = 0 as 0.0.
This takes away several representable numbers near zero and lumps them all into 0.0.
For a representation with a large M, a large number of representable numbers is removed.
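To see flush-to-zero in action on a device, one can feed a single-precision denormal through an arithmetic instruction. A hedged sketch, assuming a GPU/nvcc combination where the -ftz compile option controls single-precision denormal handling (on G80-class hardware, single-precision denormals are always flushed):

    #include <cstdio>

    __global__ void scale_denormal(float *out, float s) {
        float denorm = 1.0e-40f;   // below FLT_MIN (~1.18e-38), so denormal
        *out = denorm * s;         // arithmetic on a denormal operand
    }

    int main() {
        float *d_out, h_out;
        cudaMalloc(&d_out, sizeof(float));
        scale_denormal<<<1, 1>>>(d_out, 1.0f);
        cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
        printf("result = %g\n", h_out);   // 0 when denormals are flushed
        cudaFree(d_out);
        return 0;
    }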

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 12 Flush to Zero
(The table here repeats the 5-bit representable-numbers table from slide 9, with the middle column relabeled "Flush to Zero": every E = 00 bit pattern is read as 0, while the no-zero and denormalized columns are unchanged.)

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 13 Denormalized Numbers
The actual method adopted by the IEEE standard is called denormalized numbers, or gradual underflow. It relaxes the normalization requirement for numbers very close to 0: whenever E = 0, the mantissa is no longer assumed to be of the form 1.XX; rather, it is assumed to be 0.XX.
In general, if the n-bit exponent field is 0, the value is 0.M * 2^(-2^(n-1) + 2).
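For real IEEE single precision, n = 8, so E = 0 encodes 0.M * 2^(-2^7 + 2) = 0.M * 2^-126, and the smallest positive denormal sets only the lowest mantissa bit: 2^-23 * 2^-126 = 2^-149. A quick host-side check, a sketch rather than slide material:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        float smallest = nextafterf(0.0f, 1.0f);   /* first float above zero */
        printf("%g vs 2^-149 = %g\n", smallest, ldexp(1.0, -149));
        return 0;
    }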

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 14 Denormalization
(The table here repeats the 5-bit representable-numbers table from slide 9, highlighting the denormalized column: the E = 00 patterns are read as 0, 1*2^-2, 2*2^-2 and 3*2^-2 instead of all collapsing to 0.)

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 15 Arithmetic Instruction Throughput
int and float add, shift, min, max and float mul, mad: 4 cycles per warp.
– int multiply (*) is 32-bit by default and requires multiple cycles per warp.
– Use the __mul24() / __umul24() intrinsics for a 4-cycle 24-bit int multiply.
Integer divide and modulo are expensive.
– The compiler will convert literal power-of-2 divides to shifts.
– Be explicit in cases where the compiler can't tell that the divisor is a power of 2!
– Useful trick: foo % n == foo & (n-1) if n is a power of 2 (see the sketch below).
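A hedged kernel sketch of the two tricks above; the launch shape and the power-of-two n = 16 are arbitrary illustration values:

    #include <cstdio>

    __global__ void index_math(int *out, int n) {   // n must be a power of 2
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        // __umul24 multiplies the low 24 bits of each operand -- enough for
        // typical grid sizes, and a 4-cycle operation on G80 where the full
        // 32-bit multiply takes multiple cycles per warp.
        int j = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
        out[i] = j & (n - 1);    // foo % n == foo & (n-1) for power-of-2 n
    }

    int main() {
        const int N = 256;
        int *d_out, h_out[N];
        cudaMalloc(&d_out, N * sizeof(int));
        index_math<<<2, 128>>>(d_out, 16);
        cudaMemcpy(h_out, d_out, N * sizeof(int), cudaMemcpyDeviceToHost);
        printf("out[130] = %d\n", h_out[130]);   // 130 % 16 == 2
        cudaFree(d_out);
        return 0;
    }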

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 16 Arithmetic Instruction Throughput
Reciprocal, reciprocal square root, sin/cos, log, exp: 16 cycles per warp.
– These are the versions prefixed with "__", e.g. __rcp(), __sin(), __exp().
Other functions are combinations of the above:
– y / x == rcp(x) * y takes 20 cycles per warp
– sqrt(x) == rcp(rsqrt(x)) takes 32 cycles per warp

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 17 Runtime Math Library
There are two types of runtime math operations:
– __func(): maps directly to the hardware ISA. Fast but lower accuracy (see the programming guide for details). Examples: __sin(x), __exp(x), __pow(x,y).
– func(): compiles to multiple instructions. Slower but higher accuracy (5 ulp, units in the last place, or less). Examples: sin(x), exp(x), pow(x,y).
The -use_fast_math compiler option forces every func() to compile to __func().
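A hedged sketch contrasting the two flavors in a kernel. Note that in the CUDA headers the single-precision fast intrinsics are spelled __sinf()/__expf()/__powf(); the slide's __sin()/__exp()/__pow() is shorthand for these:

    #include <cstdio>

    __global__ void two_sines(float x, float *accurate, float *fast) {
        *accurate = sinf(x);    // multiple instructions, a few ulp of error
        *fast = __sinf(x);      // direct hardware instruction, lower accuracy
    }

    int main() {
        float *d, h[2];
        cudaMalloc(&d, 2 * sizeof(float));
        two_sines<<<1, 1>>>(1.0f, d, d + 1);
        cudaMemcpy(h, d, 2 * sizeof(float), cudaMemcpyDeviceToHost);
        printf("sinf: %.8f   __sinf: %.8f\n", h[0], h[1]);
        cudaFree(d);
        return 0;
    }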

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 18 Make your program float-safe!
Future hardware will have double-precision support.
– G80 is single-precision only.
– Double precision will have an additional performance cost.
– Careless use of double or undeclared types may run more slowly on G80 and beyond.
It is important to be float-safe (be explicit whenever you want single precision) to avoid using double precision where it is not needed.
– Add the 'f' suffix to float literals:
    foo = bar * 0.123;    // double assumed
    foo = bar * 0.123f;   // float explicit
– Use the float versions of standard library functions:
    foo = sin(bar);       // double assumed
    foo = sinf(bar);      // single precision explicit

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 19 Deviations from IEEE-754
Addition and multiplication are IEEE-754 compliant: maximum 0.5 ulp (units in the last place) of error.
However, they are often combined into a single multiply-add (FMAD), whose intermediate result is truncated.
Division is non-compliant (2 ulp).
Not all rounding modes are supported.
Denormalized numbers are not supported.
There is no mechanism to detect floating-point exceptions.
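Fusing a multiply and an add changes results even before considering G80's truncated intermediate. The host-side sketch below (an illustration of fusion sensitivity, not of the G80 FMAD itself) uses C99 fmaf(), which keeps the exact product before adding, to show a case where fused and unfused arithmetic disagree:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        float a = 1.0f + ldexpf(1.0f, -12);     /* 1 + 2^-12 */
        float c = -(1.0f + ldexpf(1.0f, -11));  /* -(1 + 2^-11) */
        /* a*a = 1 + 2^-11 + 2^-24; the separate multiply rounds the 2^-24
           term away before the add, while fmaf() keeps the exact product. */
        printf("mul+add: %g   fma: %g\n", a * a + c, fmaf(a, a, c));
        return 0;
    }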

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 20 GPU Floating Point Features

Feature                            | G80                          | SSE                                    | IBM Altivec                 | Cell SPE
Precision                          | IEEE 754                     | IEEE 754                               | IEEE 754                    | IEEE 754
Rounding modes for FADD and FMUL   | Round to nearest and to zero | All 4 IEEE: nearest, zero, +inf, -inf  | Round to nearest only       | Round to zero/truncate only
Denormal handling                  | Flush to zero                | Supported, 1000's of cycles            | Supported, 1000's of cycles | Flush to zero
NaN support                        | Yes                          | Yes                                    | Yes                         | No
Overflow and infinity support      | Yes, only clamps to max norm | Yes                                    | Yes                         | No
Flags                              | No                           | Yes                                    | Yes                         | Some
Square root                        | Software only                | Hardware                               | Software only               | Software only
Division                           | Software only                | Hardware                               | Software only               | Software only
Reciprocal estimate accuracy       | 24 bit                       | 12 bit                                 | 12 bit                      | 12 bit
Reciprocal sqrt estimate accuracy  | 23 bit                       | 12 bit                                 | 12 bit                      | 12 bit
log2(x) and 2^x estimates accuracy | 23 bit                       | No                                     | 12 bit                      | No

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 21 Floating-Point Calculation Results Can Depend on Execution Order
Order 1 (sequential):
1.00*2^0 + 1.00*2^0 + 1.00*2^-2 + 1.00*2^-2
= 1.00*2^1 + 1.00*2^-2 + 1.00*2^-2
= 1.00*2^1 + 1.00*2^-2
= 1.00*2^1
Order 2 (pairwise):
(1.00*2^0 + 1.00*2^0) + (1.00*2^-2 + 1.00*2^-2)
= 1.00*2^1 + 1.00*2^-1
= 1.01*2^1
In Order 1, each 1.00*2^-2 term is shifted out of the 2-bit mantissa when added to the much larger 1.00*2^1, so the small operands are lost. In Order 2 the small operands are combined first, and their sum is large enough to survive the final addition. Pre-sorting is often used to increase the stability of floating-point results.
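The same effect is reproducible in ordinary 32-bit floats. In this hedged host-side sketch, 2^-23 is exactly half an ulp of 2.0, so the sequential order loses both small terms (each tie rounds back to the even 2.0) while the pairwise order first combines them into a fully representable 2^-22:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        float big = 1.0f, small = ldexpf(1.0f, -23);

        float seq = ((big + big) + small) + small;   /* Order 1: 2.0         */
        float par = (big + big) + (small + small);   /* Order 2: 2.0 + 2^-22 */

        printf("sequential: %.9g   pairwise: %.9g\n", seq, par);
        return 0;
    }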

© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408, University of Illinois, Urbana-Champaign 22 Representable Numbers of a 5-bit Hypothetical IEEE Format
(Recap slide: the same representable-numbers table as slide 9, with its no-zero, abrupt-underflow and gradual-underflow columns. The no-zero format cannot represent zero!)