Floating Point Arithmetics

Slides:

Advertisements

Similar presentations

Spring 2013 Advising Starts this week! CS2710 Computer Organization1.

Advertisements

Arithmetic in Computers Chapter 4 Arithmetic in Computers2 Outline Data representation integers Unsigned integers Signed integers Floating-points.

CSCE 212 Chapter 3: Arithmetic for Computers Instructor: Jason D. Bakos.

1 CS61C L9 Fl. Pt. © UC Regents CS61C - Machine Structures Lecture 9 - Floating Point, Part I September 27, 2000 David Patterson

Integer Multiplication (1/3)

CS61C L11 Floating Point II (1) Garcia, Fall 2005 © UCB Lecturer PSOE, new dad Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C.

1 Lecture 9: Floating Point Today’s topics:  Division  IEEE 754 representations  FP arithmetic Reminder: assignment 4 will be posted later today.

Chapter 3 Arithmetic for Computers. Multiplication More complicated than addition accomplished via shifting and addition More time and more area Let's.

CS 61C L12 Pseudo (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #12: FP II & Pseudo Instructions.

CS 61C L16 : Floating Point II (1) Kusalo, Spring 2005 © UCB Lecturer NSOE Steven Kusalo inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.

CS 61C L16 : Floating Point II (1) Garcia, Spring 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C.

CS61C L14 MIPS Instruction Representation II (1) Garcia, Spring 2007 © UCB Google takes on Office!  Google Apps: premium “services” ( , instant messaging,

CS61C L15 Floating Point I (1) Garcia, Fall 2011 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.

CS61C L16 Floating Point II (1) Garcia, Fall 2006 © UCB Prof Smoot given Nobel!!  Prof George Smoot joins the ranks of the most distinguished faculty.

CS61C L16 Floating Point II (1)Mowery, Spring 2008 © UCB TA Oddinaire Keaton Mowery inst.eecs.berkeley.edu/~cs61c UC Berkeley.

CS61C L11 Floating Point II (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #11.

COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.

CPSC 321 Computer Architecture ALU Design – Integer Addition, Multiplication & Division Copyright 2002 David H. Albonesi and the University of Rochester.

CS 61C L16 : Floating Point II (1) Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C.

Computer Organization and Architecture Computer Arithmetic Chapter 9.

Computer Arithmetic Nizamettin AYDIN

Csci 136 Computer Architecture II – Floating-Point Arithmetic

CEN 316 Computer Organization and Design Computer Arithmetic Floating Point Dr. Mansour AL Zuair.

CS61C L11 Floating Point (1) Pearce, Summer 2010 © UCB inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 11 Floating Point

CPS3340 COMPUTER ARCHITECTURE Fall Semester, /14/2013 Lecture 16: Floating Point Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.

Computer Arithmetic II Instructor: Mozafar Bag-Mohammadi Spring 2006 University of Ilam.

Lecture 9: Floating Point

Computer Arithmetic II Instructor: Mozafar Bag-Mohammadi Ilam University.

FLOATING POINT ARITHMETIC P&H Slides ECE 411 – Spring 2005.

CS 61C L3.2.1 Floating Point 1 (1) K. Meinz, Summer 2004 © UCB CS61C : Machine Structures Lecture Floating Point Kurt Meinz inst.eecs.berkeley.edu/~cs61c.

CHAPTER 3 Floating Point.

CDA 3101 Fall 2013 Introduction to Computer Organization

10/7/2004Comp 120 Fall October 7 Read 5.1 through 5.3 Register! Questions? Chapter 4 – Floating Point.

Chapter 9 Computer Arithmetic

William Stallings Computer Organization and Architecture 8th Edition

Floating Point Representations

Morgan Kaufmann Publishers Arithmetic for Computers

Computer Architecture & Operations I

Computer Architecture & Operations I

Integer Division.

Morgan Kaufmann Publishers Arithmetic for Computers

Lecture 9: Floating Point

Topics IEEE Floating Point Standard Rounding Floating Point Operations

CS/COE0447 Computer Organization & Assembly Language

Morgan Kaufmann Publishers

CDA 3101 Summer 2007 Introduction to Computer Organization

William Stallings Computer Organization and Architecture 7th Edition

ECE/CS 552: Floating Point

CSCE 350 Computer Architecture

inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture Floating Point Arithmetic

CS 105 “Tour of the Black Holes of Computing!”

How to represent real numbers

Computer Arithmetic Multiplication, Floating Point

ECEG-3202 Computer Architecture and Organization

CS 105 “Tour of the Black Holes of Computing!”

October 17 Chapter 4 – Floating Point Read 5.1 through 5.3 1/16/2019

cnn.com/2004/ALLPOLITICS/10/05/debate.main/

Floating Point Numbers

Morgan Kaufmann Publishers Arithmetic for Computers

CS213 Floating Point Topics IEEE Floating Point Standard Rounding

Inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #11: FP II & Pseudo Instructions Andy Carle Greet class.

CS 105 “Tour of the Black Holes of Computing!”

Number Representation

CDA 3101 Spring 2016 Introduction to Computer Organization

Presentation transcript:

Floating Point Arithmetics Prof. Giancarlo Succi, Ph.D., P.Eng. E-mail: Giancarlo.Succi@unibz.it Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Floating Point Summary 31 S Exponent 30 23 22 Significand 1 bit 8 bits 23 bits (-1)S x (1 + Significand) x 2(Exponent-127) Double precision identical, except with exponent bias of 1023 Interpretation of value in each position extends beyond the decimal/binary point 1.1001 = (1x20) + (1x2-1) + (0x2-2) + (0x2-3) + (1x2-4) Significand or mantissa 27=128 Double means with more bits… Prof. G. Succi, Ph.D., P.Eng.

Special numbers review What have we defined so far? (Single Precision) Exponent Significand Object 0 0 0 0 nonzero ??? 1-254 anything +/- fl. pt. # 255 0 +/- infinity 255 nonzero ??? Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Multiplication Paper and pencil example (unsigned): Multiplicand 1000 8 Multiplier x1001 9 1000 0000 0000 +1000 01001000 m bits x n bits = m + n bit product Multiplication is therefore a sequence of additions with shifts. I keep my multiplier and then I shift and add the multiplicand –show it on a slide. Step if last bit of multiplicator = 1; Prod = Prod + Multiplicand shift left multiplicand shift right multiplicator if not done, goto step We need 64 bits for the multiplicand and for the product A simpler approach would be to shift the product and keep untouched the multiplicand… shift right product Last optimization-> see the next slide. Prof. G. Succi, Ph.D., P.Eng.

Example of multiplication Multiplicand: 0011, Multiplicator: 1100 Result: requires 8 bits We combine Result with Multiplicator: Multiplicand Result with Multiplicator Start 0011 00001100 [1] shift 00000110 [2] shift 00000011 [3] add&shift 00011001 [4] add&shift 00100100 Negative numbers => (a) compute the sign, and then (b) do mult on positive Prof. G. Succi, Ph.D., P.Eng.

Multiplication: Instructions In MIPS, we multiply registers, so: 32-bit value x 32-bit value = 64-bit value Syntax of Multiplication (signed): mult register1, register2 Multiplies 32-bit values in those registers & puts 64-bit product in special result regs: puts product upper half in hi, lower half in lo hi and lo are 2 registers separate from the 32 general purpose registers Use mfhi register & mflo register to move from hi, lo to another register Prof. G. Succi, Ph.D., P.Eng.

Multiplication: Example in Java: a = b * c; in MIPS: let b be $s2; let c be $s3; and let a be $s0 and $s1 (since it may be up to 64 bits) mult $s2,$s3 # b*c mfhi $s0 # upper half of # product into $s0 mflo $s1 # lower half of # product into $s1 Note: Often, we only care about the lower half of the product –the operands are not so big  Prof. G. Succi, Ph.D., P.Eng.

Division Paper and pencil example (unsigned): 1001 Quotient Divisor 1000 |1001010 Dividend -1000 10 101 1010 -1000 10 Remainder (or Modulo result) Dividend = Quotient x Divisor + Remainder The numbers here are 1000=8 and 1001010=2+8+64=74. We know that 74:8=9 with the remainder of 2… Prof. G. Succi, Ph.D., P.Eng.

Division: Instruction Syntax of Division (signed): div register1, register2 Divides 32-bit values in register 1 by 32-bit value in register 2: puts remainder of division in hi puts quotient of division in lo Notice that this can be used to implement both the Java division operator (/) and the Java modulo operator (%) Prof. G. Succi, Ph.D., P.Eng.

Division: Example Example: in Java: a = c / d; b = c % d; in MIPS: let a be $s0; let b be $s1; let c be $s2; and let d be $s3 div $s2,$s3 # lo=c/d, hi=c%d mflo $s0 # get quotient mfhi $s1 # get remainder Prof. G. Succi, Ph.D., P.Eng.

Unsigned Instructions & Overflow MIPS also has versions of these two arithmetic instructions for unsigned operands: multu divu Determines whether or not the product and quotient are changed if the operands are signed or unsigned. MIPS does not check overflow on ANY signed/unsigned multiply, divide instructions Up to the software to check hi Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

FP Addition & Subtraction Much more difficult than with integers Can’t just add significands How do we do it? De-normalize to match larger exponent Add significands to get resulting one Normalize (& check for under/overflow) Round if needed (may need to goto 3) Note: If signs differ, just perform a subtract instead. The subtraction is similar Prof. G. Succi, Ph.D., P.Eng.

FP Problems Problems in implementing FP add/sub: If signs differ for add (or same for sub), what will be the sign of the result? Question: How do we integrate this into the integer arithmetic unit? Answer: We don’t! Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Floating Point Instructions Separate floating point instructions: Single Precision: add.s, sub.s, mul.s, div.s Double Precision: add.d, sub.d, mul.d, div.d These instructions are far more complex than their integer counterparts, so they can take much longer to execute. Prof. G. Succi, Ph.D., P.Eng.

Architectural Problems It’s inefficient to have different instructions taking vastly differing amounts of time. Generally, a particular piece of data will not change from FP to int, or vice versa, within a program. So only one type of instruction will be used on it. Some programs do no floating point calculations It takes lots of hardware relative to integers to do Floating Point fast Prof. G. Succi, Ph.D., P.Eng.

Solution: Make a completely separate chip that handles only FP Coprocessor 1: FP chip contains 32 32-bit registers: $f0, $f1, … most of the registers specified in .s and .d instruction refer to this set separate load and store: lwc1 and swc1 (“load word coprocessor 1”, “store …”) Double Precision: by convention, even/odd pair contain one DP FP number: $f0/$f1, $f2/$f3, … , $f30/$f31 Even register is the name Prof. G. Succi, Ph.D., P.Eng.

Actual Architectures 1990 Computers actually contained multiple separate chips: Processor: handles all the normal stuff Coprocessor 1: handles FP and only FP; more coprocessors?… Yes, later Today, FP coprocessor integrated with CPU, or cheap chips may leave out FP HW Instructions to move data between main processor and coprocessors: mfc0, mtc0, mfc1, mtc1, etc. Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Representation for Not a Number What do I get if I calculate sqrt(-4.0)or 0/0? If infinity is not an error, these shouldn’t be either. Called Not a Number (NaN) Exponent = 255, Significand nonzero Why is this useful? Hope NaNs help with debugging? They contaminate: op(NaN,X) = NaN Op = any operation. It is the concept of the bottom. Prof. G. Succi, Ph.D., P.Eng.

Special Numbers (now) What have we defined so far? (Single Precision)? Exponent Significand Object 0 0 0 0 nonzero ??? 1-254 anything +/- fl. pt. # 255 0 +/- infinity 255 nonzero NaN Prof. G. Succi, Ph.D., P.Eng.

Covered Ranges Problem: There’s a gap among representable FP numbers around 0 Smallest representable pos num: a = 1.0… 2 * 2-126 = 2-126 Second smallest representable pos num: b = 1.000……1 2 * 2-126 = 2-126 + 2-149 a - 0 = 2-126 b - a = 2-149 Normalization and implicit 1 is to blame! Gaps! b + - a Prof. G. Succi, Ph.D., P.Eng.

Denormalized numbers Solution: - + We still haven’t used Exponent = 0, Significand nonzero Denormalized number: no leading 1, implicit exponent = -126. Smallest representable pos num: a = 2-149 Second smallest representable pos num: b = 2-148 + - Prof. G. Succi, Ph.D., P.Eng.

Special Numbers (finally) What have we defined so far? (Single Precision)? Exponent Significand Object 0 0 0 0 nonzero denorm. 1-254 anything +/- fl. pt. # 255 0 +/- infinity 255 nonzero NaN Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Rounding When we perform math on real numbers, we have to worry about rounding to fit the result in the significand field. The FP hardware carries two extra bits of precision, and then round to get the proper value Rounding also occurs when converting a double to a single precision value, or converting a floating point number to an integer Prof. G. Succi, Ph.D., P.Eng.

IEEE Rounding modes Round towards +infinity Round towards -infinity ALWAYS round “up”: 2.001 -> 3 -2.001 -> -2 Round towards -infinity ALWAYS round “down”: 1.999 -> 1, -1.999 -> -2 Truncate Just drop the last bits (round towards 0) Round to (nearest) even Normal rounding, almost Prof. G. Succi, Ph.D., P.Eng.

Round to even Round like you learned in grade school Except if the value is right on the borderline, in which case we round to the nearest EVEN number 2.5 -> 2 3.5 -> 4 Insures fairness on calculation This way, half the time we round up on tie, the other half time we round down Ask statisticians… This is the default rounding mode Prof. G. Succi, Ph.D., P.Eng.

Content Review Fixed Point Multiplication, Division FP Addition, Subtraction FP Architecture Special numbers Rounding Conversion Issues Prof. G. Succi, Ph.D., P.Eng.

Proposed Issues Converting a float -> int -> float produces same float number? Converting a int -> float -> int produces same int number? FP add is associative ? (x+y)+z = x+(y+z) Prof. G. Succi, Ph.D., P.Eng.

Converting float -> int -> float if (f == (float)((int) f)) { printf(“true”); } Will not always print “true” Small floating point numbers (<1) don’t have integer representations For other numbers, rounding errors Prof. G. Succi, Ph.D., P.Eng.

Converting int -> float -> int if (i == (int)((float) i)) { printf(“true”); } Will not always print “true” Large values of integers don’t have exact floating point representations What about double? Prof. G. Succi, Ph.D., P.Eng.

 FP + and -are NOT associative! FP associative FP add, subtract associative: FALSE! x = – 1.5 x 1038, y = 1.5 x 1038, and z = 1.0 x + (y + z) = –1.5x1038 + (1.5x1038 + 1.0) = –1.5x1038 + (1.5x1038) = 0.0 (x + y) + z = (–1.5x1038 + 1.5x1038) + 1.0 = (0.0) + 1.0 = 1.0  FP + and -are NOT associative! Prof. G. Succi, Ph.D., P.Eng.

Why? FP results approximate the real result! In this example: 1.5 x 1038 is so much larger than 1.0 that 1.5 x 1038 + 1.0 in floating point representation is still 1.5 x 1038 Prof. G. Succi, Ph.D., P.Eng.