Number Representation

ECE 645: Lecture 5
Number Representation Part 2
Little-Endian vs. Big-Endian Representations
Floating Point Representations
Rounding
Representation of the Galois Field elements

Required Reading
Endianness, from Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Endianness
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design, Chapter 17, Floating-Point Representations

Little-Endian vs. Big-Endian Representation of Integers

Little-Endian vs. Big-Endian Representation

The 64-bit number A0B1C2D3E4F56789 (hexadecimal), with MSB = A0 and LSB = 89, stored in memory from the lowest address (0) to the highest (MAX):

Little-Endian: 89 67 F5 E4 D3 C2 B1 A0 (LSB stored at the lowest address)
Big-Endian:    A0 B1 C2 D3 E4 F5 67 89 (MSB stored at the lowest address)

Little-Endian vs. Big-Endian Camps

Little-Endian (LSB at the lowest address): Intel, AMD, DEC VAX, RS 232
Big-Endian (MSB at the lowest address): Motorola 68xx, 680x0, IBM, Hewlett-Packard, Sun SuperSPARC, Internet TCP/IP
Bi-Endian (configurable): Motorola Power PC, Silicon Graphics MIPS

Little-Endian vs. Big-Endian Origin of the Terms

Jonathan Swift, Gulliver's Travels:
A law required all citizens of Lilliput to break their soft-boiled eggs at the little ends only.
A civil war broke out between the Little-Endians and the Big-Endians, and the Big-Endians took refuge on a nearby island, the kingdom of Blefuscu.
The story is a satire on the holy wars between the Protestant Church of England and the Catholic Church of France.

Little-Endian vs. Big-Endian Advantages and Disadvantages

Big-Endian:
- easier to determine the sign of a number
- easier to compare two numbers
- easier to divide two numbers
- easier to print

Little-Endian:
- easier addition and multiplication of multiprecision numbers

Pointers (1)

Memory holds the bytes 89 67 F5 E4 D3 C2 B1 A0, from the lowest address to the highest (MAX). A pointer to a 2-byte int at the lowest address reads:

int * iptr;
Big-Endian:    (* iptr) = 8967
Little-Endian: (* iptr) = 6789

iptr + 1 advances by sizeof(int) = 2 bytes to the next int in memory.

Pointers (2)

With the same memory contents, a pointer to a 4-byte long int at the lowest address reads:

long int * lptr;
Big-Endian:    (* lptr) = 8967F5E4
Little-Endian: (* lptr) = E4F56789

lptr + 1 advances by sizeof(long int) = 4 bytes to the next long int in memory.

Floating Point Representations

The ANSI/IEEE standard floating-point number representation formats. Originally defined in ANSI/IEEE 754-1985; superseded by the IEEE 754-2008 standard.

Table 17.1 Some features of the ANSI/IEEE standard floating-point number representation formats

Exponent encoding in 8 bits for the single/short (32-bit) ANSI/IEEE format

Decimal code:    0    1   ...  126  127  128  ...  254  255
Hex code:        00   01  ...  7E   7F   80   ...  FE   FF
Exponent value:      -126 ...  -1    0   +1   ...  +127

Code 00: f = 0 is the representation of 0; f ≠ 0 is the representation of denormals, 0.f × 2^-126
Codes 01-FE: normalized numbers, 1.f × 2^e
Code FF: f = 0 is the representation of ±∞; f ≠ 0 is the representation of NaNs

Fig. 17.4 Denormals in the IEEE single-precision format.

New IEEE 754-2008 Standard Basic Formats

New IEEE 754-2008 Standard Binary Interchange Formats

Requirements for Arithmetic

Results of the 4 basic arithmetic operations (+, -, ×, ÷) as well as square-rooting must match those obtained if all intermediate computations were infinitely precise.
That is, a floating-point arithmetic operation should introduce no more imprecision than the error attributable to the final rounding of a result that has no exact representation (this is the best possible).

Example: (1 + 2^-1) × (1 + 2^-23)
Exact result:   1 + 2^-1 + 2^-23 + 2^-24
Rounded result: 1 + 2^-1 + 2^-22
Error = ½ ulp

Rounding 101

Rounding Modes The IEEE 754-2008 standard includes five rounding modes:
Default: Round to nearest, ties to even (rtne)
Optional: Round to nearest, ties away from 0 (rtna)
Round toward zero (inward)
Round toward +∞ (upward)
Round toward -∞ (downward)

Required Reading Parhami, Chapter 17.5, Rounding schemes Rounding Algorithms 101 http://www.diycalculator.com/popup-m-round.shtml

Rounding Rounding occurs when we want to approximate a more precise number (i.e., one with more fractional bits, L) with a less precise number (one with fewer fractional bits, L').
Example 1: old: 000110.11010001 (K=6, L=8) → new: 000110.11 (K'=6, L'=2)
Example 2: old: 000110.11010001 (K=6, L=8) → new: 000111. (K'=6, L'=0)
The following pages show rounding from L>0 fractional bits to L'=0 bits, but the mathematics holds true for any L' < L.
Usually we keep the number of integral bits the same: K' = K.

Rounding Equation

y = round(x)

x = x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-l}   (whole part . fractional part)
y = y_{k-1} y_{k-2} ... y_1 y_0

Rounding Techniques There are different rounding techniques:
1) truncation
   - results in round toward zero in signed-magnitude
   - results in round toward -∞ in two's complement
2) round to nearest number
3) round to nearest even number (or odd number)
4) round toward +∞
Other rounding techniques:
5) jamming or von Neumann
6) ROM rounding
Each of these techniques differs in its error depending on the representation of numbers, i.e., signed-magnitude versus two's complement.
Error = round(x) - x

1) Truncation

The simplest possible rounding scheme: chopping or truncation. The fractional digits are simply dropped:

x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-l}  →  x_{k-1} x_{k-2} ... x_1 x_0

Truncation in signed-magnitude results in a number chop(x) that is always of smaller magnitude than x. This is called round toward zero or inward rounding.
011.10 (3.5)_10 → 011 (3)_10, Error = -0.5
111.10 (-3.5)_10 → 111 (-3)_10, Error = +0.5

Truncation in two's complement results in a number chop(x) that is always smaller than x. This is called round toward -∞ or downward-directed rounding.
100.01 (-3.75)_10 → 100 (-4)_10, Error = -0.25

Truncation Function Graphs: chop(x)

Fig. 17.5 Truncation or chopping of a signed-magnitude number (same as round toward 0).
Fig. 17.6 Truncation or chopping of a 2's-complement number (same as round toward -∞).

Bias in two's complement truncation

X (binary)  X (decimal)  chop(x) (binary)  chop(x) (decimal)  Error
011.00        3           011                3                  0
011.01        3.25        011                3                 -0.25
011.10        3.5         011                3                 -0.5
011.11        3.75        011                3                 -0.75
100.01       -3.75        100               -4                 -0.25
100.10       -3.5         100               -4                 -0.5
100.11       -3.25        100               -4                 -0.75
101.00       -3           101               -3                  0

Assuming all combinations of positive and negative values of x are equally probable, the average error is -0.375.
In general, average error = -(2^-L' - 2^-L)/2, where L' = new number of fractional bits.

Implementing truncation in hardware

Easy: just ignore (i.e., truncate) the fractional digits in positions -(L'+1) through -L:

x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-L}  =  y_{k-1} y_{k-2} ... y_1 y_0 .  (ignore, i.e., truncate the rest)

2) Round to nearest number

Rounding to the nearest number is what we normally think of when we say "round":
010.01 (2.25)_10 → 010 (2)_10, Error = -0.25
010.11 (2.75)_10 → 011 (3)_10, Error = +0.25
010.00 (2.00)_10 → 010 (2)_10, Error = +0.00
010.10 (2.5)_10 → 011 (3)_10, Error = +0.5 [round-half-up (arithmetic rounding)]
010.10 (2.5)_10 → 010 (2)_10, Error = -0.5 [round-half-down]

Round-half-up: dealing with negative numbers
101.11 (-2.25)_10 → 110 (-2)_10, Error = +0.25
101.01 (-2.75)_10 → 101 (-3)_10, Error = -0.25
110.00 (-2.00)_10 → 110 (-2)_10, Error = +0.00
101.10 (-2.5)_10 → 110 (-2)_10, Error = +0.5 [asymmetric implementation]
101.10 (-2.5)_10 → 101 (-3)_10, Error = -0.5 [symmetric implementation]

Round to Nearest Function Graph: rtn(x)

Round-half-up version: asymmetric and symmetric implementations.

Bias in two's complement round to nearest Round-half-up, asymmetric implementation

X (binary)  X (decimal)  rtn(x) (binary)  rtn(x) (decimal)  Error
010.00        2           010               2                 0
010.01        2.25        010               2                -0.25
010.10        2.5         011               3                +0.5
010.11        2.75        011               3                +0.25
101.01       -2.75        101              -3                -0.25
101.10       -2.5         110              -2                +0.5
101.11       -2.25        110              -2                +0.25
110.00       -2           110              -2                 0

Assuming all combinations of positive and negative values of x are equally probable, the average error is +0.125.
This is a smaller average error than truncation, but the error is still not symmetric: the midway values (exactly 2.5 or -2.5) always lead to a positive error bias.
There is also the problem that overflow can occur if only K' = K integral bits are allocated.
Example: rtn(011.10) → overflow (3.5 rounds to 4, which does not fit in 3 integral bits)
This overflow occurs only on positive numbers near the maximum positive value, not on negative numbers.

Implementing round to nearest (rtn) in hardware Round-half-up, asymmetric implementation

Two methods:

Method 1: Add 1 in the position one digit to the right of the new LSB (i.e., digit -(L'+1)) and keep only L' fractional bits:
  x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-L}
+                               1
= y_{k-1} y_{k-2} ... y_1 y_0 . (ignore, i.e., truncate the rest)

Method 2: Add the value of the digit one position to the right of the new LSB (i.e., digit -(L'+1)) into the new LSB digit (i.e., digit -L') and keep only L' fractional bits:
  x_{k-1} x_{k-2} ... x_1 x_0 . x_{-1} x_{-2} ... x_{-L}
+                          x_{-1}
= y_{k-1} y_{k-2} ... y_1 y_0 . (ignore, i.e., truncate the rest)

Round to Nearest Even Function Graph: rtne(x)

To solve the problem with the midway value, we round to the nearest even number (or, alternatively, to the nearest odd number).

Fig. 17.8 Rounding to the nearest even number.
Fig. 17.9 R* rounding, or rounding to the nearest odd number.

Bias in two's complement round to nearest even (rtne)

X (binary)  X (decimal)  rtne(x) (binary)  rtne(x) (decimal)  Error
000.10        0.5         000                 0                -0.5
001.10        1.5         010                 2                +0.5
010.10        2.5         010                 2                -0.5
011.10        3.5         0100 (overflow)     4                +0.5
100.10       -3.5         100                -4                -0.5
101.10       -2.5         110                -2                +0.5
110.10       -1.5         110                -2                -0.5
111.10       -0.5         000                 0                +0.5

The average error is now 0 (ignoring the overflow).
Cost: more hardware.

4) Rounding towards infinity We may need computation errors to be in a known direction Example: in computing upper bounds, larger results are acceptable, but results that are smaller than correct values could invalidate upper bound Use upward-directed rounding (round toward +∞) up(x) always larger than or equal to x Similarly for lower bounds, use downward-directed rounding (round toward -∞) down(x) always smaller than or equal to x We have already seen that round toward -∞ in two's complement can be implemented by truncation

Rounding Toward Infinity Function Graph: up(x) and down(x) down(x) can be implemented by chop(x) in two's complement

Two's Complement Round to Zero

Two's complement round to zero (inward rounding) also exists: inward(x).

Other Methods Note that in two's complement, round to nearest (rtn) involves an addition which may have a carry propagation from the LSB to the MSB, so rounding may take as long as an adder. The adder chain can be broken using the following two techniques: Jamming (von Neumann rounding) and ROM-based rounding.

5) Jamming or von Neumann Chop and force the LSB of the result to 1. This has the simplicity of chopping, with the near-symmetry of ordinary rounding. The maximum error is comparable to chopping (double that of rounding).

6) ROM Rounding

Fig. 17.11 ROM rounding with an 8 × 2 table.

Example: rounding with a 32 × 4 table. The low-order bits x3 x2 x1 x0 . x-1 form the ROM address; the ROM data y3 y2 y1 y0 replaces x3 x2 x1 x0, and the remaining fractional bits are truncated:

x_{k-1} ... x4 x3 x2 x1 x0 . x_{-1} x_{-2} ... x_{-l}  →  x_{k-1} ... x4 y3 y2 y1 y0

The rounding result is the same as that of the round to nearest scheme in 31 of the 32 possible cases, but a larger error is introduced when x3 = x2 = x1 = x0 = x-1 = 1.

Representation of the Galois Field elements

Evariste Galois (1811-1832)

Evariste Galois (1811-1832) Studied the problem of finding algebraic solutions for general equations of degree ≥ 5, e.g., f(x) = a5x^5 + a4x^4 + a3x^3 + a2x^2 + a1x + a0 = 0. Answered definitively the question of which specific equations of a given degree have algebraic solutions. Along the way, he developed group theory, one of the most important branches of modern mathematics.

Evariste Galois (1811-1832)

1829: Galois submits his results for the first time to the French Academy of Sciences.
Reviewer 1: Augustin-Louis Cauchy forgot or lost the communication.
1830: Galois submits a revised version of his manuscript, hoping to enter the competition for the Grand Prize in mathematics.
Reviewer 2: Joseph Fourier died shortly after receiving the manuscript.
1831: Third submission to the French Academy of Sciences.
Reviewer 3: Simeon-Denis Poisson did not understand the manuscript and rejected it.

Evariste Galois (1811-1832)

May 1832: Galois is provoked into a duel. The night before the duel he writes a letter to his friend containing a summary of his discoveries. The letter ends with a plea: "Eventually there will be, I hope, some people who will find it profitable to decipher this mess."
May 30, 1832: Galois is grievously wounded in the duel and dies in the hospital the following day.
Galois' manuscript is later rediscovered by Joseph Liouville.
1846: Galois' manuscript is published for the first time in a mathematical journal.

Field A field is a set F together with two operations, typically denoted by (but not necessarily equivalent to) + and *. The set F and the definitions of these two operations must fulfill special conditions (the field axioms).

Examples of fields

Infinite fields:
{ R = set of real numbers, + addition of real numbers, * multiplication of real numbers }

Finite fields:
{ set Z_p = {0, 1, 2, ..., p-1}, + (mod p): addition modulo p, * (mod p): multiplication modulo p }

Finite Fields = Galois Fields GF(p^m)

p – prime; p^m – number of elements in the field

Most significant special cases:
GF(p) – arithmetic operations present in many libraries
GF(2^m) – polynomial basis representation (fast in hardware) and normal basis representation (fast squaring)

Quotient and remainder

Given integers a and n, n > 0, there exist unique q, r ∈ Z such that
a = q·n + r  and  0 ≤ r < n

q – quotient, r – remainder (of a divided by n):
q = ⌊a/n⌋ = a div n
r = a - q·n = a - ⌊a/n⌋·n = a mod n

Exercise: -32 mod 5 = ?

Integers congruent modulo n Two integers a and b are congruent modulo n (equivalent modulo n), written a ≡ b (mod n), iff
a mod n = b mod n, or
a = b + k·n for some k ∈ Z, or
n | a - b

Laws of modular arithmetic

Rules of addition, subtraction and multiplication modulo n
(a + b) mod n = ((a mod n) + (b mod n)) mod n
(a - b) mod n = ((a mod n) - (b mod n)) mod n
(a · b) mod n = ((a mod n) · (b mod n)) mod n

Exercises: 9 · 13 mod 5 = ?     25 · 25 mod 26 = ?

Laws of modular arithmetic

Regular addition: a + b = a + c iff b = c
Modular addition: a + b ≡ a + c (mod n) iff b ≡ c (mod n)

Regular multiplication: if a · b = a · c and a ≠ 0, then b = c
Modular multiplication: if a · b ≡ a · c (mod n) and gcd(a, n) = 1, then b ≡ c (mod n)

Modular Multiplication: Example

x:           0 1 2 3 4 5 6 7
6 · x mod 8: 0 6 4 2 0 6 4 2

x:           0 1 2 3 4 5 6 7
5 · x mod 8: 0 5 2 7 4 1 6 3

Elements of the Galois Field GF(2^m)

Binary representation (used for storing and processing in computer systems):
A = (a_{m-1}, a_{m-2}, ..., a_2, a_1, a_0),  a_i ∈ {0, 1}

Polynomial representation (used for the definition of basic arithmetic operations):
A(x) = Σ_{i=0}^{m-1} a_i x^i = a_{m-1} x^{m-1} + a_{m-2} x^{m-2} + ... + a_2 x^2 + a_1 x + a_0
where coefficient multiplication and addition are performed modulo 2 (addition is XOR)

Addition and Multiplication in the Galois Field GF(2^m)

Inputs: A = (a_{m-1}, a_{m-2}, ..., a_2, a_1, a_0), B = (b_{m-1}, b_{m-2}, ..., b_2, b_1, b_0), a_i, b_i ∈ {0, 1}
Output: C = (c_{m-1}, c_{m-2}, ..., c_2, c_1, c_0), c_i ∈ {0, 1}

Addition in the Galois Field GF(2^m)

A ↔ A(x), B ↔ B(x)
C ↔ C(x) = A(x) + B(x) =
= (a_{m-1} + b_{m-1}) x^{m-1} + (a_{m-2} + b_{m-2}) x^{m-2} + ... + (a_1 + b_1) x + (a_0 + b_0) =
= c_{m-1} x^{m-1} + c_{m-2} x^{m-2} + ... + c_1 x + c_0

with coefficient addition performed modulo 2 (XOR):
c_i = a_i + b_i = a_i XOR b_i, i.e., C = A XOR B

Multiplication in the Galois Field GF(2^m)

A ↔ A(x), B ↔ B(x)
C ↔ C(x) = A(x) · B(x) mod P(x) = c_{m-1} x^{m-1} + c_{m-2} x^{m-2} + ... + c_2 x^2 + c_1 x + c_0

P(x) – irreducible polynomial of degree m:
P(x) = p_m x^m + p_{m-1} x^{m-1} + ... + p_2 x^2 + p_1 x + p_0