Chapter 6 Floating Point


Outline
1. Floating Point Representation
2. Floating Point Arithmetic
3. The Numeric Coprocessor

Floating Point Representation
Non-integral binary numbers
0.123 = 1 × 10^-1 + 2 × 10^-2 + 3 × 10^-3
0.101_2 = 1 × 2^-1 + 0 × 2^-2 + 1 × 2^-3 = 0.625
110.011_2 = 4 + 2 + 0.25 + 0.125 = 6.375

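Decimal to binary (integers): the division-remainder method
139 ÷ 2 = 69 remainder 1
69 ÷ 2 = 34 remainder 1
34 ÷ 2 = 17 remainder 0
17 ÷ 2 = 8 remainder 1
8 ÷ 2 = 4 remainder 0
4 ÷ 2 = 2 remainder 0
2 ÷ 2 = 1 remainder 0
1 ÷ 2 = 0 remainder 1
Reading the remainders from last to first: (139)_10 = (10001011)_2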
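The division-remainder method for integers can be sketched in C as follows. The function name to_binary and its buffer interface are assumptions made for illustration, not part of the slides.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write the binary digits of n into buf, most significant digit first,
 * using repeated division by 2. Returns buf, or NULL if buf is too small.
 * (to_binary is a hypothetical helper name.) */
char *to_binary(unsigned n, char *buf, size_t size)
{
    char tmp[sizeof(unsigned) * CHAR_BIT];  /* remainders, last digit first */
    size_t len = 0;

    assert(buf != NULL);
    do {
        tmp[len++] = (char)('0' + n % 2);   /* the remainder is the next bit */
        n /= 2;                             /* the quotient feeds the next step */
    } while (n != 0);

    if (size < len + 1)
        return NULL;

    for (size_t i = 0; i < len; i++)        /* reverse: last remainder first */
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return buf;
}
```

Calling to_binary(139, buf, sizeof buf) with a large enough buffer fills buf with the digits derived on the slide.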

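Decimal to binary (fractions): repeated multiplication by 2
0.6875 × 2 = 1.3750, integer part 1
0.375 × 2 = 0.750, integer part 0
0.750 × 2 = 1.500, integer part 1
0.500 × 2 = 1.0, integer part 1, fractional part 0 (stop)
Reading the integer parts from first to last: (0.6875)_10 = (0.1011)_2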

Converting 0.85 to binary
0.85 × 2 = 1.7
0.7 × 2 = 1.4
0.4 × 2 = 0.8
0.8 × 2 = 1.6
0.6 × 2 = 1.2
0.2 × 2 = 0.4
The fraction 0.4 has already appeared, so the digits repeat forever: 0.85 = 0.11011001100110..._2
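The repeated-multiplication method for fractions can be sketched in C like this. The name frac_to_binary and the digit limit max_bits are assumptions for illustration. Because a double holds only a close approximation of 0.85, only the leading digits are meaningful for that input.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a fraction 0 <= x < 1 into at most max_bits binary digits by
 * repeated multiplication by 2. Stops early once the fraction reaches 0.
 * buf must hold max_bits + 1 chars. Returns the number of digits written.
 * (frac_to_binary is a hypothetical helper name.) */
int frac_to_binary(double x, char *buf, int max_bits)
{
    int n = 0;

    assert(buf != NULL && max_bits >= 0 && x >= 0.0 && x < 1.0);
    while (n < max_bits && x != 0.0) {
        x *= 2.0;                 /* exact in binary floating point */
        if (x >= 1.0) {
            buf[n++] = '1';       /* integer part is 1 */
            x -= 1.0;             /* keep only the fraction */
        } else {
            buf[n++] = '0';       /* integer part is 0 */
        }
    }
    buf[n] = '\0';
    return n;
}
```

For 0.6875 the loop terminates after four digits; for 0.85 it never reaches zero and runs until max_bits digits have been produced.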

A consistent format
e.g., 23.85 or 10111.11011001100110..._2 would be stored as:
1.011111011001100110... × 2^100 (the exponent 100 is written in binary, i.e. 4)
A normalized floating point number has the form:
1.ssssssssssssssss × 2^eeeeeee
where 1.ssssssssssssssss is the significand and eeeeeee is the exponent.
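As a rough illustration of normalization, the standard C function frexp splits a value into a fraction and a power of two. frexp returns a fraction in [0.5, 1), so the sketch below rescales it to the 1.sss form used on the slide. The wrapper name normalize is an assumption.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Split a positive finite x into a significand in [1, 2) and an exponent e
 * such that x == significand * 2^e. (normalize is a hypothetical name.) */
double normalize(double x, int *e)
{
    double m;

    assert(x > 0.0 && e != NULL);
    m = frexp(x, e);   /* x == m * 2^(*e) with m in [0.5, 1) */
    *e -= 1;           /* move one factor of 2 into the significand */
    return m * 2.0;    /* now in [1, 2) */
}
```

For 23.85 this yields exponent 4, matching the 2^100 (binary) shown on the slide.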

IEEE floating point representation
The IEEE (Institute of Electrical and Electronics Engineers) is an international organization that has designed specific binary formats for storing floating point numbers. The IEEE defines two formats with different precisions: single and double precision. In C, float variables use single precision and double variables use double precision.
Intel's math coprocessor also uses a third, higher precision called extended precision. All data inside the coprocessor is held in this precision. When a value is stored to memory from the coprocessor, it is converted to single or double precision automatically.

IEEE single precision
A single precision number occupies 32 bits: the sign in bit 31, the biased exponent in bits 23 to 30, and the fraction in bits 0 to 22.
The binary exponent is not stored directly. Instead, the sum of the exponent and 7F (hex) is stored in bits 23 to 30. This biased exponent is always non-negative.
The fraction part assumes a normalized significand (of the form 1.sssssssss). Since the first bit is always a one, the leading one is not stored. This frees one more bit at the end and so increases the precision slightly. This idea is known as the hidden one representation.
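The field layout can be inspected from C with a small sketch such as the one below. It assumes that float is the 32-bit IEEE single precision format, which is true on common platforms but not guaranteed by the C standard. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a single precision number (hypothetical struct). */
struct float_fields {
    unsigned sign;        /* bit 31 */
    unsigned biased_exp;  /* bits 23 to 30 */
    int      true_exp;    /* biased_exp - 0x7F, meaningful for normalized values */
    uint32_t fraction;    /* bits 0 to 22, hidden leading one not included */
};

struct float_fields decode_float(float f)
{
    struct float_fields r;
    uint32_t bits;

    assert(sizeof bits == sizeof f);   /* assumes a 32-bit float */
    memcpy(&bits, &f, sizeof bits);    /* copy the raw bit pattern */

    r.sign       = bits >> 31;
    r.biased_exp = (bits >> 23) & 0xFF;
    r.true_exp   = (int)r.biased_exp - 0x7F;
    r.fraction   = bits & 0x7FFFFF;
    return r;
}
```

memcpy is used instead of a pointer cast so that the sketch stays within defined behavior.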

How would 23.85 be stored?
First, it is positive, so the sign bit is 0. Next, the true exponent is 4, so the biased exponent is 7F + 4 = 83 (hex). Finally, the fraction is 01111101100110011001100 (remember the leading one is hidden). Because the first bit truncated from the exact value is a 1, the last fraction bit is rounded up to 1, which gives 41 BE CC CD in hex.
How would -23.85 be represented? Just change the sign bit: C1 BE CC CD. Do not take the two's complement!
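The sign-bit rule can be checked with a short C sketch, again assuming a 32-bit IEEE float. The helper names float_bits and flip_sign are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of f (hypothetical helper). */
uint32_t float_bits(float f)
{
    uint32_t bits;

    assert(sizeof bits == sizeof f);   /* assumes a 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Negate f by toggling bit 31 only; the exponent and fraction are untouched. */
float flip_sign(float f)
{
    uint32_t bits = float_bits(f) ^ 0x80000000u;
    float r;

    memcpy(&r, &bits, sizeof r);
    return r;
}
```

On such a platform float_bits(23.85f) is 0x41BECCCD and float_bits(-23.85f) is 0xC1BECCCD. The two's complement of the first pattern would be a different and unrelated value.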

Special meanings for IEEE floats
An infinity is produced by an overflow or by division by zero. An undefined result (NaN, not a number) is produced by an invalid operation such as trying to find the square root of a negative number, adding two infinities of opposite sign, etc.
Normalized single precision numbers can range in magnitude from 1.0 × 2^-126 (≈ 1.1755 × 10^-38) to 1.11111... × 2^127 (≈ 3.4028 × 10^38).
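In C, the standard header math.h provides the fpclassify macro for telling these cases apart, and float.h provides FLT_MIN and FLT_MAX for the limits of the normalized range. The sketch below is one possible way to use them; the function name describe and its return strings are assumptions.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Return a short label for the kind of value f holds
 * (describe is a hypothetical helper name). */
const char *describe(float f)
{
    switch (fpclassify(f)) {
    case FP_NAN:       return "NaN";
    case FP_INFINITE:  return "infinity";
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "denormalized";
    default:
        assert(fpclassify(f) == FP_NORMAL);
        return "normalized";   /* magnitude between FLT_MIN and FLT_MAX */
    }
}
```

For example, doubling FLT_MAX overflows to an infinity, and subtracting that infinity from itself gives a NaN.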

Denormalized numbers
Denormalized numbers can be used to represent numbers with magnitudes too small to normalize (i.e. below 1.0 × 2^-126). For example, consider 1.001_2 × 2^-129 (≈ 1.6530 × 10^-39). Written in unnormalized form with exponent -127, this is 0.01001_2 × 2^-127. To store this number, the biased exponent is set to 0 and the fraction is the complete significand of the number written as a product with 2^-127, including the bit to the left of the binary point: 001 0010 0000 0000 0000 0000.
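A denormalized bit pattern can be decoded by hand, as in this sketch. It works on the raw 32-bit pattern, so it does not depend on how the platform handles denormals. The function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* A pattern is denormalized when the exponent field is 0 and the
 * fraction field is not 0. */
int is_denormal_bits(uint32_t bits)
{
    return ((bits >> 23) & 0xFF) == 0 && (bits & 0x7FFFFF) != 0;
}

/* Value of a denormalized single precision pattern, computed in double. */
double denormal_value(uint32_t bits)
{
    uint32_t fraction = bits & 0x7FFFFF;
    double v;

    assert(is_denormal_bits(bits));
    /* The 23 stored bits are one bit left of the binary point plus 22 bits
     * right of it, times 2^-127; as an integer that is fraction * 2^-149. */
    v = ldexp((double)fraction, -149);
    return (bits >> 31) ? -v : v;
}
```

The pattern from the slide, 0x00120000, decodes to 1.001_2 × 2^-129.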

IEEE double precision
IEEE double precision uses 64 bits to represent numbers and is usually accurate to about 15 significant decimal digits. It uses 11 bits for the exponent and 52 bits for the mantissa (fraction). Double precision has the same special values as single precision.
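The 11-bit and 52-bit fields can be pulled out of a double in the same way as for single precision. This sketch assumes a 64-bit IEEE double. The exponent bias of 3FF (hex) is a property of the IEEE format that the slide does not state, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Biased exponent: the 11 bits just below the sign bit (bits 52 to 62). */
unsigned double_biased_exp(double d)
{
    uint64_t bits;

    assert(sizeof bits == sizeof d);   /* assumes a 64-bit double */
    memcpy(&bits, &d, sizeof bits);
    return (unsigned)((bits >> 52) & 0x7FF);
}

/* Fraction: the low 52 bits, hidden leading one not included. */
uint64_t double_fraction(double d)
{
    uint64_t bits;

    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);
    return bits & 0xFFFFFFFFFFFFFULL;
}
```

For 23.85 the biased exponent is 3FF + 4 = 403 (hex), the same true exponent as in the single precision example.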

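2. Floating Point Arithmetic
Floating point arithmetic on a computer is different from arithmetic in continuous mathematics. In mathematics, all numbers can be considered exact. On a computer, many numbers cannot be represented exactly with a finite number of bits, and all calculations are performed with limited precision.
Do not use floating point values to represent exact values. Some non-integer values, such as amounts in dollars and cents, need to be exact. Floating point numbers are not exact, so using them introduces rounding errors. Representing exact quantities such as money in floating point is therefore a bad idea, and doing dollars-and-cents calculations with it can have disastrous results. Floating point is best suited to values such as measurements, which are not very precise to begin with.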
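A small sketch of the money problem: adding ten cents repeatedly in floating point versus counting whole cents in an integer. The function names are made up for illustration, and the exact floating point result depends on the platform evaluating in plain double precision, as is usual on x86-64.

```c
#include <assert.h>

/* Add $0.10 n times using floating point dollars. */
double sum_dollars(int n)
{
    double total = 0.0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += 0.10;      /* 0.10 has no exact binary representation */
    return total;
}

/* Add 10 cents n times using integer cents. */
long sum_cents(int n)
{
    long total = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += 10;        /* exact */
    return total;
}
```

With double precision arithmetic, sum_dollars(10) is slightly less than 1.0, while sum_cents(10) is exactly 100.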

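It is important to realize that floating point arithmetic on a computer (or calculator) is always an approximation.
Addition
To add two floating point numbers, the exponents must be equal. If they are not already equal, they must be made equal by shifting the significand of the number with the smaller exponent. E.g., 10.375 + 6.34375 = 16.71875, using 8-bit significands:
  1.0100110 × 2^3
+ 1.1001011 × 2^2
Shifting the second number to exponent 3 drops its trailing bit and, after rounding, gives 0.1100110 × 2^3:
  1.0100110 × 2^3
+ 0.1100110 × 2^3
-----------------
 10.0001100 × 2^3
The sum 10.0001100 × 2^3 = 10000.110_2 = 16.75, which is not the exact answer 16.71875.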
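The alignment step can be imitated with integers. The sketch below is a toy model, not how real hardware is specified: each significand is an 8-bit integer holding 1.sssssss scaled by 2^7, and rounding is simple round-half-up rather than the IEEE default. The name add8 is an assumption.

```c
#include <assert.h>
#include <math.h>

/* Add two positive toy floats with 8-bit significands.
 * sa and sb hold 1.sssssss scaled by 2^7 (so 0x80..0xFF); ea and eb are
 * the true exponents. Returns the rounded sum as a double. */
double add8(unsigned sa, int ea, unsigned sb, int eb)
{
    unsigned sum;
    int shift;

    assert(sa >= 0x80 && sa <= 0xFF && sb >= 0x80 && sb <= 0xFF);
    if (ea < eb) {                       /* make a the larger exponent */
        unsigned ts = sa; int te = ea;
        sa = sb; ea = eb; sb = ts; eb = te;
    }
    shift = ea - eb;
    if (shift > 8)
        sb = 0;                          /* too small to matter */
    else if (shift > 0)
        sb = (sb + (1u << (shift - 1))) >> shift;  /* align, round half up */

    sum = sa + sb;                       /* may need 9 bits */
    if (sum > 0xFF) {                    /* renormalize to 8 bits */
        sum = (sum + 1) >> 1;
        ea += 1;
    }
    return ldexp((double)sum, ea - 7);   /* undo the 2^7 scaling */
}
```

With 10.375 encoded as (0xA6, 3) and 6.34375 as (0xCB, 2), add8 returns 16.75, the same inexact result as the worked example.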

Subtraction

Multiplication and division
For multiplication, the significands are multiplied and the exponents are added. Consider 10.375 × 2.5 = 25.9375. Division is more complicated, but has similar problems with round-off errors.
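Multiplication can be imitated with the same kind of toy model: 8-bit significands stored as integers scaled by 2^7, with round-half-up when the product is cut back to 8 bits. This is a sketch under those assumptions, and the name mul8 is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Multiply two positive toy floats with 8-bit significands.
 * sa and sb hold 1.sssssss scaled by 2^7; ea and eb are true exponents. */
double mul8(unsigned sa, int ea, unsigned sb, int eb)
{
    unsigned prod;
    int e, shift;

    assert(sa >= 0x80 && sa <= 0xFF && sb >= 0x80 && sb <= 0xFF);
    prod = sa * sb;                  /* scaled by 2^14, value in [1, 4) */
    e = ea + eb;                     /* exponents are added */

    shift = 7;                       /* product in [1, 2): drop 7 bits */
    if (prod >= (1u << 15)) {        /* product in [2, 4): drop 8 bits */
        shift = 8;
        e += 1;
    }
    prod = (prod + (1u << (shift - 1))) >> shift;   /* round half up */
    if (prod > 0xFF) {               /* rounding carried into a 9th bit */
        prod >>= 1;
        e += 1;
    }
    return ldexp((double)prod, e - 7);
}
```

With 10.375 encoded as (0xA6, 3) and 2.5 as (0xA0, 1), the exact significand product 1.10011111000000 is rounded to 1.1010000, so mul8 returns 26 rather than 25.9375.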

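Comparing floating point values (epsilon)
The main point of this section is that floating point calculations are not exact. The programmer needs to be aware of this.
Testing for exact equality is an error:
  if ( f(x) == 0.0 )
Test against a small tolerance instead, where EPS is a macro defined to be a very small positive value:
  if ( fabs( f(x) ) < EPS )
To compare a floating point value (say x) to another (y), use the relative difference:
  if ( fabs(x - y) / fabs(y) < EPS )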
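The two tests can be wrapped in small helper functions, as sketched below. The value chosen for EPS, the function names, and the fallback to an absolute test when y is zero are all assumptions; a suitable tolerance depends on the application.

```c
#include <assert.h>
#include <math.h>

#define EPS 1e-9   /* assumed tolerance; choose one that suits the data */

/* True if v is close enough to zero to be treated as zero. */
int nearly_zero(double v)
{
    return fabs(v) < EPS;
}

/* True if x and y agree to within a relative difference of eps. */
int nearly_equal(double x, double y, double eps)
{
    assert(eps > 0.0);
    if (y == 0.0)
        return fabs(x) < eps;            /* avoid dividing by zero */
    return fabs(x - y) / fabs(y) < eps;
}
```

For example, 0.1 + 0.2 and 0.3 usually differ under == in double precision, yet nearly_equal(0.1 + 0.2, 0.3, EPS) reports them as equal.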

3. The Numeric Coprocessor
Hardware
Instructions
Examples
  Quadratic formula
  Reading array from file
  Finding primes

Hardware
A math coprocessor has machine instructions that perform many floating point operations much faster than a software procedure can. Since the Pentium, all generations of 80x86 processors have a built-in math coprocessor.
The numeric coprocessor has eight floating point registers. Each register holds 80 bits of data. The registers are named ST0, ST1, ST2, ... ST7 and are organized as a stack.
There is also a status register in the numeric coprocessor. It has several flags. Only the 4 flags used for comparisons will be covered: C0, C1, C2 and C3.

Instructions, at Page 123
Loading and storing
Addition and subtraction
Array sum example
Multiplication and division
Comparisons

Quadratic formula

Reading array from file
readt.c read.asm

Finding primes fprime.c prime2.asm

Summary Floating Point Representation Floating Point Arithmetic The Numeric Coprocessor