Number Representation Fixed and Floating Point


Number Representation: Fixed and Floating Point
No method is capable of representing ALL real numbers using finite register lengths.
Approximations must be used to represent values.
Concentrate on two forms: fixed point and floating point.
Others are:
Rational number systems – use ratios of integers
Logarithmic number systems – use signs and logarithms of values

Fixed Versus Floating Point
Fixed point values: any two adjacent representable values differ by 1 unit in the last place (ulp), so the spacing between numbers is equal.
Floating point values use two multi-bit words: mantissa and exponent.
Both forms must be capable of representing signed quantities.
Fixed point values CAN be used to represent fractional quantities.
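To make the fixed point claim concrete, here is a minimal sketch of a fixed point format with 8 fractional bits. The names `FRAC_BITS`, `to_fixed` and `from_fixed` are hypothetical, chosen only for illustration; the slides do not prescribe a particular format.

```python
# Hypothetical fixed point format: integers scaled by 2**FRAC_BITS.
FRAC_BITS = 8  # assumed number of fractional bits, so 1 ulp = 2**-8


def to_fixed(x: float) -> int:
    """Encode x as an integer count of ulps (rounded to the nearest ulp)."""
    return round(x * (1 << FRAC_BITS))


def from_fixed(q: int) -> float:
    """Decode an integer count of ulps back to a real value."""
    return q / (1 << FRAC_BITS)


q = to_fixed(3.25)                          # stored integer
step = from_fixed(q + 1) - from_fixed(q)    # spacing to the next value
```

The spacing `step` is the same for every `q`, which is the equal-spacing property; fractional quantities are represented simply by agreeing on where the binary point sits.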

Floating Point Characteristics
Total number of representations = total number of bit strings; for an n-bit register we have $2^n$.
Range of values is larger than fixed point.
Precision of values is smaller.
Distance between two consecutive values increases with magnitude.
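The growing distance between consecutive values can be observed directly. The sketch below assumes Python 3.9 or later (for `math.ulp`) and the usual 64-bit IEEE double behind Python's `float`.

```python
import math

# math.ulp(x) is the gap between x and the next larger representable double.
gap_near_one = math.ulp(1.0)        # spacing just above 1.0
gap_near_2_52 = math.ulp(2.0 ** 52) # spacing just above 2**52
gap_near_1e16 = math.ulp(1e16)      # spacing just above 10**16
```

With a 52-bit fraction, the gap is `2**-52` near 1.0, exactly 1.0 near `2**52`, and 2.0 near `1e16`, so odd integers above `2**53` are no longer representable.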

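Floating Point
Fields: s | e | m
s – sign bit (signed magnitude)
e – exponent (true exponent in 2's complement form, stored with a bias added)
m – mantissa (significand or fraction); m ∈ [0, 1), so m_MAX = 1 - ulp; a hidden bit supplies the leading 1
float – BIAS = 127 (32 bits: 23 for m and 8 for e)
double – BIAS = 1023 (64 bits: 52 for m and 11 for e)
With a bias of $2^{k-1}$ for a k-bit exponent, the sign of the exponent is the complement of its MSb, so adding or subtracting the bias is just complementation of the MSb. The IEEE biases (127 and 1023) are one less than that power of two.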
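The MSb-complement remark can be checked with a short sketch. It assumes a k-bit exponent field and a bias of exactly 2^(k-1) (excess-128 for k = 8); the function name is hypothetical. The IEEE single-precision bias of 127 is one less, so its stored exponent is this value minus one.

```python
def excess_from_twos_complement(e: int, k: int) -> int:
    """Return e + 2**(k-1) by flipping the MSb of e's k-bit two's complement."""
    mask = (1 << k) - 1
    pattern = e & mask              # k-bit two's complement bit pattern of e
    return pattern ^ (1 << (k - 1)) # complement the most significant bit


K = 8
# Flipping the MSb matches adding the bias 128 for every 8-bit exponent.
all_match = all(
    excess_from_twos_complement(e, K) == e + 128 for e in range(-128, 128)
)
# IEEE single precision stores e + 127, i.e. one less than excess-128.
ieee_stored_for_zero = excess_from_twos_complement(0, K) - 1
```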

Floating Point Example
double = 00000000 bfe80000
The most significant word (bfe80000) has the higher address, which is little-endian word order.
s | e | m = 1 | 011 1111 1110 | 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
s = 1; e = 1022; m = 0.5
Value = $(-1)^{1} \times 1.5 \times 2^{(1022-1023)}$
Value = $-(1.5)(0.5) = -0.75$
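The field extraction in this example can be reproduced with a short sketch. It assumes the 64-bit pattern is assembled most significant word first (0xBFE80000 followed by 0x00000000); `decode_double` is a hypothetical helper name.

```python
import struct


def decode_double(bits: int):
    """Split a 64-bit pattern into sign, stored exponent, and fraction m in [0, 1)."""
    s = bits >> 63
    e = (bits >> 52) & 0x7FF          # 11-bit biased exponent
    f = bits & ((1 << 52) - 1)        # 52-bit fraction field
    return s, e, f / 2 ** 52


bits = 0xBFE8000000000000
s, e, m = decode_double(bits)
# Normalized value: hidden bit 1 plus fraction, scaled by 2**(e - BIAS).
value = (-1) ** s * (1 + m) * 2.0 ** (e - 1023)
# Cross-check against the machine's own interpretation of the same bits.
native = struct.unpack(">d", bits.to_bytes(8, "big"))[0]
```

Both `value` and `native` come out as -0.75, matching the hand calculation.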

Floating Point Normalization
Redundant representations are possible! The hidden bit helps.
Out of all possible representations, choose the one with the fewest leading zeros in the significand; this is normalization.
After performing arithmetic, renormalization may be needed.

Floating Point Special Numbers
Value v when exponent e and fraction f take special values (IEEE standard).
Note: NaN = Not a Number
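As a rough illustration of how the special values are distinguished, the sketch below classifies a 64-bit double by its exponent and fraction fields, following the IEEE 754 encoding rules (all-ones exponent for infinities and NaNs, all-zeros exponent for zeros and denormals). The helper names are hypothetical.

```python
import math
import struct


def to_bits(x: float) -> int:
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def classify_double(bits: int) -> str:
    """Classify a double from its stored exponent e and fraction f."""
    e = (bits >> 52) & 0x7FF
    f = bits & ((1 << 52) - 1)
    if e == 0x7FF:                       # exponent field all ones
        return "infinity" if f == 0 else "NaN"
    if e == 0:                           # exponent field all zeros
        return "zero" if f == 0 else "denormal"
    return "normal"


kinds = [classify_double(to_bits(x))
         for x in (math.inf, -math.inf, math.nan, 0.0, -0.0, 5e-324, 1.0)]
```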

IEEE/ANSI 754/854 Standard

Denormalized Numbers
Allow for gradual degradation on underflow (gradual underflow).
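A small sketch of what gradual underflow means for 64-bit doubles: below the smallest normalized value, the denormals continue down to zero in equal steps instead of jumping straight to zero. The bit patterns follow the IEEE 754 double layout; `from_bits` is a hypothetical helper.

```python
import math
import struct


def from_bits(bits: int) -> float:
    """Interpret a 64-bit pattern as a double."""
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


smallest_normal = from_bits(0x0010000000000000)    # e = 1, f = 0
largest_denormal = from_bits(0x000FFFFFFFFFFFFF)   # e = 0, f all ones
smallest_denormal = from_bits(0x0000000000000001)  # e = 0, f = 1

# The gap between the largest denormal and the smallest normal is one
# denormal step, so there is no hole around zero.
gap = smallest_normal - largest_denormal
```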

Denormals

Operations – Internal Precision

Floating Point Addition/Subtraction

Floating Point Multiplication/Division

Conversions and Roundings

Exceptions

Rounding Schemes Signed Magnitude Two’s Complement

Round to Nearest (Signed Magnitude)

Rounding Comments

Round to Nearest Even/Odd Round to Nearest Odd (R*)

Jamming/von Neumann Rounding

ROM Rounding

Rounding

Rounding Examples Round Towards +∞ Downward Directed Rounding

Floating Point Operations

Adders/Subtractors

Operand Packing/Unpacking

Other Key Parts of FP Add/Sub Unit

Pre-Shifting

Four-stage Combinational Shifter Pre-shifts Operand by 0 to 15 Bits
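One common way to build such a shifter is a logarithmic arrangement in which stage i shifts by 2^i positions when bit i of the shift amount is set. The sketch below models that structure in software; it assumes a right shift (as used when aligning the smaller operand) and is not taken from the slide's circuit.

```python
def preshift_right(operand: int, amount: int) -> int:
    """Model a four-stage shifter: stages shift by 1, 2, 4 and 8 bits."""
    if not 0 <= amount <= 15:
        raise ValueError("a four-stage shifter handles shifts of 0 to 15 bits")
    for stage in range(4):
        # Each stage is a row of 2-to-1 multiplexers controlled by one
        # bit of the shift amount.
        if (amount >> stage) & 1:
            operand >>= 1 << stage
    return operand


aligned = preshift_right(0xABCD, 5)  # stage 0 (by 1) and stage 2 (by 4) active
```

Four stages suffice because every amount from 0 to 15 is a sum of distinct powers of two below 16.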

Leading Zeros/Ones – Counting vs. Prediction

Leading Zeros Prediction

Guard Digits
What is the smallest number of extra digits needed for rounding? For post-normalization?
Multiplication – double-length result.
Add/Sub with differing exponents – can have a double-length result.
The FP unit provides a single-length result.

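Significand Ranges
Assume significand $M \in (0, 1-\text{ulp}]$.
Then normalized $M$ (radix $\beta$, nonzero leading digit) ranges as: $1/\beta \le M \le 1-\text{ulp}$.
Multiplication: $\text{prod} = M_1 \times M_2$, so $1/\beta^2 \le \text{prod} < 1$.
For postnormalization need at most one shift left to get: $1/\beta \le \text{prod} < 1$.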

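Significand Ranges (cont)
Division: $\text{quot} = M_1 / M_2$, so for normalized operands in radix $\beta$, $1/\beta < \text{quot} < \beta$.
Need at most one shift right to get: $1/\beta \le \text{quot} < 1$.
Conclusion:
1 extra digit needed for postnormalization
1 extra digit needed for round-to-nearest
2 extra digits needed: G – guard, R – round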
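The one-shift bounds can be checked exhaustively for a small case. The sketch below assumes binary significands that are normalized fractions in [1/2, 1) with 6 fraction bits; the width `P` and the helper names are illustrative choices, not values from the slides.

```python
from fractions import Fraction

P = 6  # assumed significand width in bits
# All normalized significands k / 2**P with the leading bit set.
normalized = [Fraction(k, 2 ** P) for k in range(2 ** (P - 1), 2 ** P)]


def left_shifts_needed(x: Fraction) -> int:
    """Count left shifts (doublings) until x is at least 1/2."""
    n = 0
    while x < Fraction(1, 2):
        x *= 2
        n += 1
    return n


def right_shifts_needed(x: Fraction) -> int:
    """Count right shifts (halvings) until x is below 1."""
    n = 0
    while x >= 1:
        x /= 2
        n += 1
    return n


max_left = max(left_shifts_needed(a * b) for a in normalized for b in normalized)
max_right = max(right_shifts_needed(a / b) for a in normalized for b in normalized)
```

Both maxima come out as 1: a product never needs more than one left shift and a quotient never needs more than one right shift.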

“Sticky Bit” in std754
Round-to-nearest-even requires 1 extra bit: the “sticky bit”, S.
S turns out to be the logical OR of the other additional bits.
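A minimal sketch of how the sticky bit enters the rounding decision follows. It treats the significand as an integer with `extra` low-order bits to be discarded, takes the first discarded bit as the round bit, and ORs the remaining discarded bits into S. The function name and this integer framing are assumptions made for illustration.

```python
def round_nearest_even(sig: int, extra: int) -> int:
    """Drop the low `extra` bits of sig, rounding to nearest, ties to even."""
    if extra <= 0:
        return sig
    kept = sig >> extra
    round_bit = (sig >> (extra - 1)) & 1                  # first discarded bit
    sticky = int(sig & ((1 << (extra - 1)) - 1) != 0)     # OR of the rest
    lsb = kept & 1
    # Round up when above the halfway point (round_bit and sticky), or
    # exactly halfway with an odd kept value (round_bit and lsb).
    if round_bit and (sticky or lsb):
        kept += 1  # may carry out; a real unit would then renormalize
    return kept


tie_odd = round_nearest_even(0b1011_100, 3)    # halfway, kept is odd
tie_even = round_nearest_even(0b1010_100, 3)   # halfway, kept is even
above_half = round_nearest_even(0b1010_101, 3) # sticky set
below_half = round_nearest_even(0b1010_011, 3) # round bit clear
```

Without S, the two cases `0b1010_100` and `0b1010_101` would be indistinguishable once the low bits are shifted out, which is why a single OR-ed bit is enough.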

Floating Point Multiplier

Floating Point Divider