Download presentation
Presentation is loading. Please wait.
1
Topic 3d Representation of Real Numbers
Introduction to Computer Systems Engineering (CPEG 323) 2018/9/19 cpeg323-08F\Topic3d-323
2
Recap What can be represented in n bits? But, what about?
Unsigned 0 to 2n - 1 2’s Complement -2n-1 to 2n-1 - 1 BCD 0 to 10n/4 - 1 But, what about? very large numbers 9,349,398,989,787,762,244,859,087,678 very small number rationals /3 Irrationals …., ….., 2018/9/19 cpeg323-08F\Topic3d-323
3
Conceptual Overview: Finite-precision numbers
Negative underflow Positive underflow Expressible negative numbers zero Expressible positive numbers Negative overflow Positive overflow ………………. Is there any “underflow” region in integer representation? Between two adjacent integer numbers, there is no other integer. 2018/9/19 cpeg323-08F\Topic3d-323
4
Representing Real Numbers
How to represent fractional parts to the right of the ``decimal'' point? A number like 0.12 (i.e., (1/2)10) not represented well by integers 0 or 1! Two ways to represent real numbers better: Fixed point Floating point 2018/9/19 cpeg323-08F\Topic3d-323
5
Fixed-Point Data Format
4 2 1 1/2 1/4 …… 1/8 S 1 1 1 . 1 Sign Integer Fractional When representing a real value, you have two choices: the fixed-point format or the floating-point format. Any way, to avoid nonlinear effects introduced by overflow, underflow, saturation, and wrapping during computation, control of each variable’s range has to be done by appropriate scaling of operands. In the floating-point case, the exponent of a variable is part of the run-time representation and is computed automatically by the floating-point ALU. But in the fixed-point case, the exponent of each variable is implicit and determined by the programmer off-line using the predicted dynamic range of the variable. Therefore, the scaling of fixed-point operands has to be done explicitly by the programmer. This is generally a tedious and error-prone process. In fact, you may regard the fixed-point as a form of integer with an additional hypothetical binary point. In our research, the number of bits to the left of the binary-point excluding the sign bit is called Integer Word-Length or IWL. hypothetical binary point 2018/9/19 cpeg323-08F\Topic3d-323
6
Fixed Point Pros Cons Add two reals just by adding the integers
Much less logic than floating point Faster Often used in digital signal processing Cons The range of numbers is narrow number= It is much more economical to represent as 4*1026 2018/9/19 cpeg323-08F\Topic3d-323
7
Recall Scientific Notation
decimal point exponent -23 6.02 x 10 Significand radix (base) Issues: Arithmetic (+, -, *, / ) Representation, Normal form Range and Precision(Determined by?) Rounding and errors Exceptions (e.g., divide by zero, overflow, underflow) Properties 2018/9/19 cpeg323-08F\Topic3d-323
8
Scientific Notation: Normalized
12.35 x 10^-9 ? 1.235 x 10^-8 ? scientific notation: has a single digit to the left of the decimal point Normalized scientific notation: such a single digit must be non-zero. 2018/9/19 cpeg323-08F\Topic3d-323
9
Floating Point Representation
Numerical Form: (–1)s M 2E Sign bit s determines whether number is negative or positive Significand M normally a fractional value in range [1.0,2.0). Exponent E weights value by power of two Encoding MSB is sign bit: S=0/1 exp field encodes E frac field encodes M s exp frac 2018/9/19 cpeg323-08F\Topic3d-323
10
IEEE 754 standard Three formats: single/double/extended precision (32,64,80 bits). Single precision: s exp frac 1 bit bits bits Double precision: see page 192 in P&H book 2018/9/19 cpeg323-08F\Topic3d-323
11
IEEE 754 Standard Floating Point Representation
the representation: (–1)s (1+ Fraction) 2E-bias where E is the exponent part in the notation Sign bit s determines whether number is negative or positive Fraction is normally a fractional value in range between 0 and 1 A leading 1 added to the fraction is “implicit” Exponent using a “biased” notation” For single precision – the bias is 127 That is, when you convert the notation to the number it represents, the exponential part used should be reading E out from the IEEE 754 notation, and calculated as E-127. 2018/9/19 cpeg323-08F\Topic3d-323
12
Advantage of using the biased notation for exponents
Under the single-precision IEEE 754 standard: bias = 127 If the real exponent is +1, what is the biased exponent ? Answer: = 128! (Note = +1!) How about if the real exponent is -1 ? Answer: = 126! (Note = -1!) How about if the real exponent is -127 ? E -127 = -127, then E = ? 2018/9/19 cpeg323-08F\Topic3d-323
13
Observations Using biased notation, the exponential part in the single-precision IEEE 754 notation itself never get nagative. 2018/9/19 cpeg323-08F\Topic3d-323
14
Normalized Encoding Example
Value: Float F = ; = X 213 Significand M = frac = (23 bits! With leading 1 hiding!) Exponent E = 1310, Bias = Exp = = Floating Point Representation: Binary: exp frac 2018/9/19 cpeg323-08F\Topic3d-323
15
Denormalized Values Condition Significand value M = 0.xxx…x Cases
exp = 000…0 Significand value M = 0.xxx…x xxx…x: bits of frac Cases frac = 000…0 Represents value 0 Note that have distinct values +0 and –0 frac 000…0 Numbers very close to 0.0 Lose precision as get smaller “Gradual underflow” 2018/9/19 cpeg323-08F\Topic3d-323
16
Special Values (infinity) Condition Not-a-Number (NaN) exp = 111…1
Operation that overflows Both positive and negative E.g., 1.0/0.0 = 1.0/0.0 = +, 1.0/0.0 = Not-a-Number (NaN) Represents case when no numeric value can be determined E.g., sqrt(–1), 2018/9/19 cpeg323-08F\Topic3d-323
17
IEEE 754: Summary Normalized: ± 0<exp<max Any bit pattern ±
Denormalized: Any nonzero bit pattern zero: Infinite: 11…1 NaN: 11…1 Any nonzero bit pattern 2018/9/19 cpeg323-08F\Topic3d-323
18
IEEE754: Summary (Cont.) + 0 +0 -Normalized -Denorm +Denorm
NaN NaN 0 +0 Negative overflow Positive underflow Negative underflow Positive overflow 2018/9/19 cpeg323-08F\Topic3d-323
19
FP Addition Operands Exact Result: (–1)s M 2E (–1)s1 M1 2E1
Assume E1 > E2 Exact Result: (–1)s M 2E Exponent E: E1 Sign s, significand M: Result of signed align & add E1–E2 (–1)s1 M1 (–1)s2 M2 + (–1)s M 2018/9/19 cpeg323-08F\Topic3d-323
20
Decimal Number Conversion
Convert Binary to Decimal (base 2 to base 10) x x x x x. d d d d d d …2 … … = 1*23+1*22+0*21+1*20+0*2-1+1*2-2+1*2-3 = 2018/9/19 cpeg323-08F\Topic3d-323
21
Decimal Number Conversion (Cont.)
Convert Decimal Integer to binary Integer divide the decimal value by 2 and then write down the remainder from bottom to top (Divide 2 and Get the Remainders) 3710 = ?2 Quotient reminder 37÷2 = … 1 18÷2 = … 0 9÷2 = … 1 4÷2 = … 0 2÷2= … 0 1÷2 = … 1 Can it always be converted into an accurate binary number? 2018/9/19 YES! cpeg323-08F\Topic3d-323
22
Decimal Number Conversion (Cont.)
Convert Decimal Fraction to Binary Fraction multiply the decimal value by 2 and then write down the integer number from top to bottom (Multiply 2 and Get the Integers) . = ?2 .375 * 2 = .75 * 2 = .5 * 2 = Where to put the binary point? Prior to the First number! 0.0112 Can it always be converted into an accurate binary number? 2018/9/19 NO! cpeg323-08F\Topic3d-323
23
Decimal Number Conversion (Cont.)
Convert Decimal Number to Binary Binary Put Together: Integer Part . Fraction Part = 2018/9/19 cpeg323-08F\Topic3d-323
24
Summary Computer arithmetic is constrained by limited precision
Bit patterns have no inherent meaning but standards do exist two’s complement IEEE 754 floating point OPcode determines “meaning” of the bit patterns Performance and accuracy are important so there are many complexities in real machines (i.e., algorithms and implementation). 2018/9/19 cpeg323-08F\Topic3d-323
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.