Double-Precision Floating-Point Numbers Douglas Wilhelm Harder Department of Electrical and Computer Engineering University of Waterloo Copyright © 2007.


Double-Precision Floating-Point Numbers Douglas Wilhelm Harder Department of Electrical and Computer Engineering University of Waterloo Copyright © 2007 by Douglas Wilhelm Harder. All rights reserved. ECE 204 Numerical Methods for Computer Engineers

This topic introduces binary numbers:
– requirements
– a poor means of storage
– a good means of storage

We will now use this same floating-point format, but we will apply it to binary numbers.

In our example, we used six decimal digits. The double-precision floating-point format uses 64 bits (or eight bytes). Like our format, these bits are broken up into
– a leading sign bit,
– an exponent (with a bias), and
– a mantissa.

Like our six-digit version, the bits are stored in the order:
SEEEEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
that is, 1 sign bit, 11 exponent bits, and 52 mantissa bits.
The bias is 01111111111 in binary, or 1023.
With stored exponents from 0 to 2047, this allows us to represent numbers in the range 2^-1023 to 2^1024, though the floating-point standard IEEE 754 reserves the use of the lowest (all zeros) and highest (all ones) exponents.
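The layout above can be checked on any machine. The following is a minimal Python sketch, not part of the original slides; the helper name decompose is hypothetical, and it assumes Python's float is an IEEE 754 double, which holds on common platforms.

```python
import struct

BIAS = 1023  # the bias stated on the slide


def decompose(x):
    """Split a double into its (sign, stored exponent, mantissa) bit fields."""
    # Reinterpret the eight bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 leading sign bit
    exponent = (bits >> 52) & 0x7FF     # next 11 bits, stored with the bias
    mantissa = bits & ((1 << 52) - 1)   # final 52 bits
    return sign, exponent, mantissa


sign, exponent, mantissa = decompose(-6.5)
# -6.5 = -1.625 x 2^2, so the unbiased exponent printed here is 2.
print(sign, exponent - BIAS, hex(mantissa))
```

Subtracting BIAS from the stored exponent recovers the true power of two.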

Recall that the leading digit in a floating-point representation must be non-zero; in binary, that bit must therefore be 1. We therefore do not store the leading bit, and the mantissa actually represents
1.MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
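To see the implicit leading 1 at work, here is a hedged Python sketch that rebuilds a value from its three fields using exact rational arithmetic. The function name value_from_fields is an assumption for illustration, and the formula applies only to normalized numbers (stored exponents 1 through 2046), not to the reserved exponents.

```python
from fractions import Fraction


def value_from_fields(sign, exponent, mantissa):
    """Value of a normalized double: (-1)^S x 1.M x 2^(E - 1023)."""
    # Restore the leading 1 that is not stored in the 52 mantissa bits.
    significand = 1 + Fraction(mantissa, 2**52)
    return (-1) ** sign * significand * Fraction(2) ** (exponent - 1023)


# Sign 1, stored exponent 1025, mantissa bits 1010 0000 ... 0000
print(value_from_fields(1, 1025, 0xA000000000000))  # -13/2, i.e. -6.5
```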

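Rather than printing out a lot of 1s and 0s, we will instead use hexadecimal numbers:

hex  decimal  binary
0    0        0000
1    1        0001
2    2        0010
3    3        0011
4    4        0100
5    5        0101
6    6        0110
7    7        0111
8    8        1000
9    9        1001
a    10       1010
b    11       1011
c    12       1100
d    13       1101
e    14       1110
f    15       1111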

To convert a binary number into hexadecimal, simply group the bits into groups of four (starting at the radix point, if it exists) and replace each group with the corresponding hexadecimal digit. To convert from hexadecimal to binary, replace each hexadecimal digit with its four-bit equivalent (including leading zeros).
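The grouping rule can be sketched in a few lines of Python. The function names binary_to_hex and hex_to_binary are hypothetical, and the sketch assumes inputs are strings of digits with at most one radix point.

```python
HEX_DIGITS = "0123456789abcdef"


def _groups_to_hex(bits):
    """Replace each group of four bits with one hexadecimal digit."""
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


def binary_to_hex(bits):
    whole, _, frac = bits.partition(".")
    # Group outward from the radix point: pad the integer part on the left
    # and the fractional part on the right to a multiple of four bits.
    whole = whole.zfill(-(-len(whole) // 4) * 4)
    frac = frac.ljust(-(-len(frac) // 4) * 4, "0")
    result = _groups_to_hex(whole)
    return result + "." + _groups_to_hex(frac) if frac else result


def hex_to_binary(digits):
    # Each hexadecimal digit becomes four bits, leading zeros included.
    return "".join("." if d == "." else format(int(d, 16), "04b") for d in digits)


print(binary_to_hex("1111111111"))  # 3ff
print(hex_to_binary("3ff"))         # 001111111111
```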

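Some of the more common numbers are:
>> format hex
>> 1
ans = 3ff0000000000000
>> 2
ans = 4000000000000000
>> -1
ans = bff0000000000000
>> -2
ans = c000000000000000
Recall that 3ff (base 16) = 001111111111 (base 2): a sign bit of 0 followed by the exponent 01111111111, which is our bias.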
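The same hexadecimal patterns can be reproduced outside MATLAB. This is a small Python sketch under the assumption that Python's float is an IEEE 754 double; the helper name format_hex is made up here to echo the MATLAB command and is not a library function.

```python
import struct


def format_hex(x):
    """Return the 16 hexadecimal digits of the double x, sign bit first."""
    # ">d" packs the double in big-endian order, so the sign and exponent come first.
    return struct.pack(">d", x).hex()


for x in (1.0, 2.0, -1.0, -2.0):
    print(x, format_hex(x))
# 1.0 3ff0000000000000
# 2.0 4000000000000000
# -1.0 bff0000000000000
# -2.0 c000000000000000
```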

Some operations are quite straightforward:
– multiplication by 2 adds 1 to the exponent and leaves the mantissa unchanged
– division by 2 subtracts 1 from the exponent and leaves the mantissa unchanged
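A quick way to confirm this is to compare the bit fields before and after scaling. The sketch below is illustrative only: the helper name exponent_and_mantissa is hypothetical, and the check is meant for normalized values whose results neither overflow nor fall into the reserved exponents.

```python
import struct


def exponent_and_mantissa(x):
    """Return the stored 11-bit exponent and 52-bit mantissa of a double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


x = 0.1
e, m = exponent_and_mantissa(x)
e_double, m_double = exponent_and_mantissa(2 * x)
e_half, m_half = exponent_and_mantissa(x / 2)

# The exponent moves by one in each direction; the mantissa stays the same.
print(e_double - e, m_double == m)
print(e_half - e, m_half == m)
```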

Rounding rules are simplified. Given a binary number which has more than 53 bits of precision, round it to a 53-bit number as follows:
– if the 54th bit is 0, truncate (round down),
– if the bits after the 53rd bit are exactly 1000··· (a tie), round up if the 53rd bit is 1, otherwise truncate, and
– otherwise, round up (add 1 to the 53rd bit).
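The three rules can be written out directly on integers. This is a sketch under stated assumptions: the function name round_to_53_bits is hypothetical, the input is a positive Python integer standing in for the bit string, and the result keeps the original magnitude with the discarded bits set to zero.

```python
def round_to_53_bits(n):
    """Round a positive integer to 53 significant bits using the three rules."""
    extra = n.bit_length() - 53          # number of bits beyond the 53rd
    if extra <= 0:
        return n                         # already fits in 53 bits
    kept = n >> extra                    # the leading 53 bits
    rest = n & ((1 << extra) - 1)        # the bits being discarded
    half = 1 << (extra - 1)              # the pattern 1000...
    if rest < half:
        pass                             # 54th bit is 0: truncate
    elif rest == half:
        kept += kept & 1                 # tie: round up only if the 53rd bit is 1
    else:
        kept += 1                        # otherwise: round up
    return kept << extra


for n in (2**53 + 1, 2**53 + 3, 2**60 + 2**7 + 1):
    print(n, round_to_53_bits(n))
```

In CPython, converting an integer to float should apply the same round-to-nearest, ties-to-even rule, so int(float(n)) gives a convenient cross-check for values within the range of a double.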

Remember, we deal with 53 bits because we store 52 bits together with the implicit leading 1.

Usage Notes
These slides are made publicly available on the web for anyone to use. If you choose to use them, or a part thereof, for a course at another institution, I ask only three things:
– that you inform me that you are using the slides,
– that you acknowledge my work, and
– that you alert me of any mistakes which I made or changes which you make, and allow me the option of incorporating such changes (with an acknowledgment) in my set of slides.
Sincerely, Douglas Wilhelm Harder, MMath