Introduction to Computer Systems Lecturer: Steve Maybank Department of Computer Science and Information Systems sjmaybank@dcs.bbk.ac.uk Autumn 2013 Week 4a: Floating Point Notation for Binary Fractions 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London Recap: Binary Numbers In the standard notation for binary numbers a string of binary digits such as 1001 stands for a sum of powers of 2: 1x23+0x22+0x21+1x20 Binary[1001] and Decimal[9] are different names for the same number 22 October 2013 Birkbeck College, U. London
Recap: Binary Addition column: 3 2 1 0 1 0 0 1 1 1 ===== 1 1 0 0 There is a carry from column 0 to column 1 and from column 1 to column 2. In tests or examinations, always show the carries. 22 October 2013 Brookshear, Section 1.5
Recap: Two’s Complement Two’s complement representations can be added as if they were standard binary numbers. -3 1101 2 0010 == ==== -1 1111 22 October 2013 Brookshear, Section 1.6
Birkbeck College, U. London Numbers in Computing cells in table: all numbers *: numbers that can be stored in memory #: numbers that can be referred to in a program * # *# Example: 0.1 cannot be stored in memory in IEEE double precision floating point, but the following is a correct Java statement t = 0.1; 22 October 2013 Birkbeck College, U. London
Spacing Between Numbers Two’s complement: equally spaced numbers Floating point: big gaps between big numbers, small gaps between small numbers. 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London The Key: Exponents -4 -3 -2 -1 0 1 2 3 2-4 2-3 2-2 2-1 20 21 22 23 1/16 1/8 ¼ ½ 1 2 4 8 big gaps between big numbers small gaps between small numbers 22 October 2013 Birkbeck College, U. London
Example of a Binary Fraction -101.11001 The binary fraction has three parts: The sign – The position of the radix point The bit string 10111001 22 October 2013 Brookshear, Section 1.7
Reconstruction of a Binary Fraction The sign is + The position of the radix point is just to the right of the second bit from the left The bit string is 101101 What is the binary fraction? 22 October 2013 Brookshear, Section 1.7
Summary To represent a binary fraction three pieces of information are needed: Sign Position of the radix point Bit string 22 October 2013 Brookshear, Section 1.7
Standard Form for a Binary Fraction Any non-zero binary fraction can be written in the form ±2r x 0.t where t is a bit string beginning with 1. Examples 11.001 = +22 x 0.11001 -0.011011 = -2-1 x 0.11011 22 October 2013 Brookshear, Section 1.7
Floating Point Representation Write a non-zero binary fraction in the form ± 2r x 0.t Record the sign – bit string s1 Record r – bit string s2 Record t – bit string s3 Output s1||s2||s3 22 October 2013 Brookshear, Section 1.7
Floating Point Notation 8 bit floating point: s e1 e2 e3 m1 m2 m3 m4 sign exponent mantissa 1 bit 3 bits 4 bits radix r bit string t The exponent is in 3 bit excess notation 22 October 2013 Brookshear, Section 1.7
To Find the Floating Point Notation Write the non-zero number as ± 2r x 0.t If sign = -1, then s1=1, else s1=0. s2 = 3 bit excess notation for r. s3= rightmost four bits of t. 22 October 2013 Brookshear, Section 1.7
Birkbeck College, U. London Example b= - 0.00101011101 s=1 b= -2-2 x 0.101011101 exponent = -2, s2 =010 Floating point notation 10101010 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London Second Example Floating point notation: 10111100 s1=1, therefore negative. s2 = 011, exponent=-1 s3 = 1100 Binary fraction -0.011 = -3/8 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London Class Examples Find the floating point representation of the decimal number -1 1/8 Find the decimal number which has the floating point representation 01101101 22 October 2013 Birkbeck College, U. London
Round-Off Error 2+5/8= 10.101 2 ½ = 10.100 The 8 bit floating point notations for 2 5/8 and 2 ½ are the same: 01101010 The error in approximating 2+5/8 with 10.100 is round-off error or truncation error. 22 October 2013 Brookshear, Section 1.7
Floating Point Addition Let [x] be the floating point number closest to the number x. Floating point addition, x@y is defined by x@y=[[x]+[y]] Each operation w |-> [w] may introduce round off. 22 October 2013 Birkbeck College, U. London
Examples of Floating Point Addition 2 ½: 01101010 1/8: 00101000 ¼: 00111000 2 ¾: 01101011 2 ½ @(1/8 @ 1/8)=2 ½ @ 1/4=2 ¾ (2 ½ @ 1/8) @ 1/8=2 ½ @ 1/8=2 ½ 22 October 2013 Birkbeck College, U. London
Round-Off in Decimal and Binary 1/5=0.2 exactly in decimal notation 1/5=0.0011001100110011….. in binary notation 1/5 cannot be represented exactly in binary floating point no matter how many bits are used. Round-off is unavoidable but it is reduced by using more bits. 22 October 2013 Birkbeck College, U. London
Size of Round-Off Error E(x) E(x)/x≈α where α is constant. If x > 0, y > 0 and |x-y|<α x, then x-y cannot be found accurately using floating point arithmetic. 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London Examples 4 1/4: 100.01, fpoint = 01111000 4: 100.00, fpoint = 01111000 4 ¼-4 -> fpoint = 00000000 ½: 0.1, fpoint = 01001000 ¼: 0.01, fpoint = 00111000 ½-¼ -> fpoint = 00111000 22 October 2013 Birkbeck College, U. London
Birkbeck College, U. London Floating Point Errors Overflow: number too large to be represented. Underflow: number <>0 and too small to be represented. Invalid operation: e.g. SquareRoot[-1]. See http://en.wikipedia.org/wiki/Floating_point 22 October 2013 Birkbeck College, U. London
IEEE Standard for Floating Point Arithmetic Single precision, 32 bits. 1 … 8 9 31 Mantissa m bits 9-31 Sign s bit 0 Exponent e bits 1-8 If 0<e<255, then value = (-1)s x 2e-127 x 1.m If e=0, s=0, m=0, then value = 0 If e=0, s=1, m=0, then value = -0 For a general discussion of fp arithmetic see http://www.ee.columbia.edu/~marios/matlab/Fall96Cleve.pdf 22 October 2013 Birkbeck College, U. London