ELEN 5346/4304 DSP and Filter Design, Fall 2008. Lecture 12: Number representation and Quantization effects. Instructor: Dr. Gleb V. Tcheslavski.

Contact: Office Hours: Room 2030. Class web site: p/index.htm

Representation of numbers. Up to this point, we have considered implementations of discrete-time systems without regard to the finite-word-length effects that are inherent in any digital realization, whether in hardware or software. Let us first consider two different representations of numbers.
1. Fixed-point representation. A real number $X$ is represented as
$$X = (b_{-A}, \ldots, b_{-1}, b_0, b_1, \ldots, b_B)_r = \sum_{i=-A}^{B} b_i r^{-i}, \quad 0 \le b_i \le r - 1, \tag{12.2.1}$$
where $b_i$ represents the digit, $r$ is the radix or base, $A$ is the number of integer digits, and $B$ is the number of fractional digits. For example, $(101.01)_2 = 1 \cdot 2^{2} + 0 \cdot 2^{1} + 1 \cdot 2^{0} + 0 \cdot 2^{-1} + 1 \cdot 2^{-2} = 5.25$.
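The positional sum above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `fixed_point_value` and the convention of passing the digits most significant first are assumptions made for illustration.

```python
from fractions import Fraction

def fixed_point_value(digits, A, r=2):
    """Value of the digit string b_{-A}, ..., b_0, b_1, ..., b_B in radix r.

    `digits` lists the digits most significant first; digit number k in the
    list has index i = k - A and weight r**(-i).
    """
    total = Fraction(0)
    for k, d in enumerate(digits):
        if not 0 <= d < r:
            raise ValueError("digit out of range for this radix")
        i = k - A  # index runs from -A (MSB) up to B (LSB)
        total += d * Fraction(r) ** (-i)
    return total

print(fixed_point_value([1, 0, 1, 0, 1], A=2))        # binary 101.01 -> 21/4
print(fixed_point_value([1, 2, 3, 4, 5], A=2, r=10))  # decimal 123.45 -> 2469/20
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the result is not itself affected by the quantization effects discussed in this lecture.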

Representation of numbers. We will focus on the binary representation, the most important one for DSP. In this case $r = 2$ and the digits $\{b_i\}$ are called binary digits or bits; they take the values $\{0, 1\}$. The binary digit $b_{-A}$ is called the most significant bit (MSB) of the number, and the binary digit $b_B$ is called the least significant bit (LSB). The "binary point" between the digits $b_0$ and $b_1$ does not exist physically; the logic simply assumes its location. Using an $n$-bit integer format ($A = n-1$, $B = 0$), we can represent unsigned integers from 0 to $2^n - 1$. More frequently, the fractional format ($A = 0$, $B = n-1$) is used, with the binary point between $b_0$ and $b_1$; with $b_0 = 0$ it represents numbers from 0 to $1 - 2^{-(n-1)}$. Any integer or mixed number can be represented in the fractional format by factoring out the term $r^A$. There are three formats for representing negative numbers. The format for positive numbers is the same in all three: the MSB is set to zero,
$$X = 0.b_1 b_2 \cdots b_B = \sum_{i=1}^{B} b_i 2^{-i}, \quad X \ge 0. \tag{12.3.1}$$

Representation of numbers. Negative numbers can be represented in one of three formats:
1) Sign-magnitude format: the MSB is set to 1 to represent "-":
$$X_{SM} = 1.b_1 b_2 \cdots b_B, \quad X \le 0. \tag{12.4.1}$$
2) One's-complement format:
$$X_{1C} = 1.\bar{b}_1 \bar{b}_2 \cdots \bar{b}_B, \quad X \le 0, \tag{12.4.2}$$
where $\bar{b}_i = 1 - b_i$ is the complement of $b_i$ (i.e., we replace ones by zeros and zeros by ones for all bits).
3) Two's-complement format:
$$X_{2C} = 1.\bar{b}_1 \bar{b}_2 \cdots \bar{b}_B \oplus 0.00 \cdots 01, \quad X < 0, \tag{12.4.3}$$
where $\oplus$ is modulo-2 addition that ignores any carry out of the sign bit. For example, $-3/8$ is obtained by complementing 0011 (3/8) to obtain 1100 and then adding 0001, which yields 1101 as the two's-complement representation of $-3/8$.
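The three formats can be compared side by side on the $-3/8$ example. This is a small sketch rather than anything from the slides; the function `encode`, its format names, and the choice of returning the word as a bit string (sign bit first) are illustrative assumptions.

```python
from fractions import Fraction

def encode(x, B, fmt):
    """Return the (B+1)-bit word, sign bit first, of a fraction x with |x| < 1."""
    x = Fraction(x)
    scaled = abs(x) * 2**B
    if scaled.denominator != 1 or scaled >= 2**B:
        raise ValueError("magnitude must be a multiple of 2**-B below 1")
    mag = int(scaled)                 # magnitude bits b_1 ... b_B as an integer
    mask = (1 << (B + 1)) - 1         # keeps exactly B+1 bits
    if x >= 0:
        word = mag                    # positive numbers: sign bit 0 in all formats
    elif fmt == "sign-magnitude":
        word = (1 << B) | mag         # set the sign bit, keep the magnitude
    elif fmt == "ones-complement":
        word = ~mag & mask            # complement every bit of the positive word
    elif fmt == "twos-complement":
        word = (~mag + 1) & mask      # complement, then add one LSB
    else:
        raise ValueError("unknown format")
    return format(word, f"0{B + 1}b")

for fmt in ("sign-magnitude", "ones-complement", "twos-complement"):
    print(fmt, encode(Fraction(-3, 8), 3, fmt))
# sign-magnitude 1011
# ones-complement 1100
# twos-complement 1101
```

The sketch does not cover the value $-1$, which only the two's-complement format can represent.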

Representation of numbers. The basic operations of addition and multiplication depend on the format used. Most fixed-point digital signal processors use two's-complement arithmetic; therefore, a $(B+1)$-bit number ranges from $-1$ to $1 - 2^{-B}$. In general, multiplying two fixed-point numbers, each $b$ bits long, yields a product $2b$ bits long. The product is either truncated or rounded back to $b$ bits, which introduces a truncation or rounding error. A fixed-point representation covers a range of numbers, say $x_{max} - x_{min}$, with a fixed resolution
$$\Delta = \frac{x_{max} - x_{min}}{m - 1}, \tag{12.5.1}$$
where $m = 2^b$ is the number of levels and $b$ is the number of bits.
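As an illustration of the double-length product and of the resolution formula, here is a hedged sketch. It assumes two's-complement fractions stored as Python integers scaled by $2^B$ with $B = 7$; the names `multiply` and `resolution` and the operand values are made up for the example.

```python
from fractions import Fraction

B = 7  # assumed number of fractional bits; a word has B + 1 bits with the sign

def multiply(a, b, mode="truncate"):
    """Multiply two two's-complement fractions stored as integers scaled by 2**B."""
    full = a * b  # exact product, scaled by 2**(2*B): twice as many fractional bits
    if mode == "truncate":
        return full >> B  # drop the B low bits (arithmetic shift rounds toward -infinity)
    return (full + (1 << (B - 1))) >> B  # add half an LSB first: round to nearest

def resolution(x_min, x_max, bits):
    """Delta = (x_max - x_min) / (m - 1) with m = 2**bits levels."""
    return Fraction(x_max - x_min) / (2**bits - 1)

a, b = 45, -77  # the fractions 45/128 and -77/128
exact = Fraction(a * b, 2**(2 * B))
for mode in ("truncate", "round"):
    q = Fraction(multiply(a, b, mode), 2**B)
    print(mode, q, "error:", q - exact)

# Two's-complement range -1 ... 1 - 2**-B with B + 1 bits gives Delta = 2**-B.
print(resolution(-1, 1 - Fraction(1, 2**B), B + 1))  # 1/128
```

For these operands the truncated product is $-28/128$ and the rounded product is $-27/128$, so the two options differ by one LSB.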

Representation of numbers. 2. Floating-point representation. It covers a larger dynamic range by representing the number $X$ as
$$X = M \cdot 2^{E}, \tag{12.6.1}$$
where $M$ is the mantissa, the fractional part of the number, with $0.5 \le M < 1$, and the exponent $E$ is either a negative or a positive number. Both the mantissa and the exponent require an additional sign bit to represent negative numbers. For example, $5 = 0.625 \cdot 2^{3}$, i.e., $M = (0.101)_2$ and $E = (011)_2$. Two floating-point numbers are multiplied by multiplying their mantissas and adding their exponents. Adding two floating-point numbers requires equal exponents, which is achieved by shifting the mantissa of the smaller number to the right and compensating by increasing its exponent. Because the shifted-out bits are discarded, this may in general lead to a loss of precision.
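The multiplication and addition rules can be traced on a toy example. This sketch is an illustration under assumptions: it handles positive inputs only, uses Python's `math.frexp` (which returns a mantissa in $[0.5, 1)$) to split a number, and the helper names `decompose`, `fl_multiply`, and `fl_add` and the `bits` parameter (fractional mantissa bits kept after alignment) are hypothetical.

```python
import math
from fractions import Fraction

def decompose(x):
    """Split x > 0 into (M, E) with x = M * 2**E and 0.5 <= M < 1."""
    m, e = math.frexp(x)
    return Fraction(m), e

def fl_multiply(x, y):
    """Multiply mantissas, add exponents, then renormalise the mantissa."""
    (mx, ex), (my, ey) = decompose(x), decompose(y)
    m, e = mx * my, ex + ey          # product mantissa lies in [0.25, 1)
    if m < Fraction(1, 2):
        m, e = m * 2, e - 1          # shift left once to restore 0.5 <= M < 1
    return m, e

def fl_add(x, y, bits):
    """Align the smaller operand to the larger exponent, keeping `bits` mantissa bits."""
    (mx, ex), (my, ey) = decompose(x), decompose(y)
    if ex < ey:
        (mx, ex), (my, ey) = (my, ey), (mx, ex)
    shift = ex - ey
    # Right shift of the smaller mantissa; bits shifted past position `bits` are lost.
    my_aligned = Fraction(math.floor(my * 2**bits / 2**shift), 2**bits)
    return mx + my_aligned, ex       # result is left unnormalised for clarity

print(fl_multiply(5.0, 0.375))   # mantissa 15/16, exponent 1, i.e. 1.875
print(fl_add(5.0, 0.375, 8))     # mantissa 43/64, exponent 3, i.e. 5.375
print(fl_add(5.0, 0.375, 4))     # mantissa 5/8, exponent 3: the small operand is lost
```

With only 4 mantissa bits, the right shift by 4 positions pushes every bit of 0.375 out of the word, which is the loss of precision mentioned above.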

Representation of numbers. Overflow occurs in the multiplication of two floating-point numbers when the sum of the exponents exceeds the dynamic range of the fixed-point representation of the exponent. The floating-point representation covers a larger dynamic range than the fixed-point representation by varying the resolution across the range: the distance between two successive floating-point numbers increases as the numbers increase in size. It therefore provides finer resolution for small numbers and coarser resolution for large numbers.
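The growth of the spacing with magnitude can be observed directly on Python floats. The sketch below assumes IEEE 754 double precision (53 mantissa bits) and positive normal numbers; the helper name `spacing` is made up for the example.

```python
import math

MANTISSA_BITS = 53  # precision of a Python float (IEEE 754 double), an assumption

def spacing(x):
    """Gap between x and the next larger representable float, for normal x > 0."""
    _, e = math.frexp(x)                        # x = m * 2**e with 0.5 <= m < 1
    return math.ldexp(1.0, e - MANTISSA_BITS)   # one LSB of the mantissa, scaled by 2**e

for x in (1.0, 1024.0, 1.0e6):
    print(x, spacing(x))

print(1.0 + spacing(1.0) / 2 == 1.0)  # True: an increment below the spacing is lost
```

The absolute spacing doubles with every increment of the exponent, while the spacing relative to the number itself stays roughly constant.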

Quantization. 1. Fixed-point: truncation. To truncate a fixed-point number from $(\beta + 1)$ bits to $(b + 1)$ bits, we simply discard the least significant $(\beta - b)$ bits. The truncation error is denoted by
$$\varepsilon_t = Q(X) - X. \tag{12.8.1}$$
Here $Q(X)$ is the truncated version of the number $X$. For a positive $X$, the error is zero if all discarded bits are zeros and is largest in magnitude if all discarded bits are ones:
$$-(2^{-b} - 2^{-\beta}) \le \varepsilon_t \le 0. \tag{12.8.2}$$

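Quantization. For a negative $X$ truncated from $(\beta + 1)$ bits to $(b + 1)$ bits, the truncation error $\varepsilon_t = Q(X) - X$ differs among the three formats:
1) Sign-magnitude:
$$0 \le \varepsilon_t \le 2^{-b} - 2^{-\beta}. \tag{12.9.1}$$
2) One's-complement:
$$0 \le \varepsilon_t \le 2^{-b} - 2^{-\beta}. \tag{12.9.2}$$
3) Two's-complement:
$$-(2^{-b} - 2^{-\beta}) \le \varepsilon_t \le 0. \tag{12.9.3}$$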
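The truncation-error ranges can be verified by brute force over every word of a short format. This is a sketch under assumptions: `beta` and `b` stand for the number of fractional bits before and after truncation, the error is taken as $Q(X) - X$, and the helper names and the format tags `"sm"`, `"1c"`, `"2c"` are illustrative.

```python
from fractions import Fraction

def value(word, n_frac, fmt):
    """Value of an (n_frac + 1)-bit word (sign bit first) in the given format."""
    sign = word >> n_frac
    frac = word & ((1 << n_frac) - 1)
    if sign == 0:
        return Fraction(frac, 2**n_frac)                 # positive: same in all formats
    if fmt == "sm":
        return -Fraction(frac, 2**n_frac)                # sign-magnitude
    if fmt == "1c":
        return -Fraction((2**n_frac - 1) - frac, 2**n_frac)  # one's complement
    return Fraction(frac, 2**n_frac) - 1                 # two's complement

def truncation_error_range(beta, b, fmt, negative):
    """Smallest and largest Q(X) - X over all words with the requested sign."""
    errors = []
    for word in range(1 << (beta + 1)):
        if ((word >> beta) == 1) != negative:
            continue
        x = value(word, beta, fmt)
        q = value(word >> (beta - b), b, fmt)   # discard the beta - b low bits
        errors.append(q - x)
    return min(errors), max(errors)

beta, b = 6, 3
print("bound:", Fraction(1, 2**b) - Fraction(1, 2**beta))     # 7/64
print("positive:", truncation_error_range(beta, b, "2c", False))
for fmt in ("sm", "1c", "2c"):
    print(fmt, "negative:", truncation_error_range(beta, b, fmt, True))
```

For $\beta = 6$ and $b = 3$ the extreme error magnitude is $7/64 = 2^{-3} - 2^{-6}$ in every case; only the sign of the error depends on the format.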

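Quantization. 2. Fixed-point: rounding. In rounding, the number is quantized to the nearest quantization level. The rounding error $\varepsilon_r = Q(X) - X$ does not depend on the format used to represent negative numbers; its magnitude is at most half a quantization step:
$$-\tfrac{1}{2}\, 2^{-b} \le \varepsilon_r \le \tfrac{1}{2}\, 2^{-b}.$$
In practice $\beta \gg b$, where $(\beta + 1)$ is the word length before quantization; therefore, $2^{-\beta} \approx 0$ in all the expressions considered.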
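The half-step bound on the rounding error can also be checked exhaustively. This sketch assumes ties are rounded upward (one of several possible tie rules) and ignores the overflow that occurs when a value just below 1 rounds up to 1; `beta` and `b` denote the fractional bits before and after rounding, and the function names are made up.

```python
import math
from fractions import Fraction

def round_fixed(x, b):
    """Round x to the nearest multiple of 2**-b (ties go upward, an assumed rule)."""
    return Fraction(math.floor(x * 2**b + Fraction(1, 2)), 2**b)

def rounding_error_range(beta, b):
    """Smallest and largest Q(X) - X over all multiples of 2**-beta in [-1, 1)."""
    errors = []
    for k in range(-2**beta, 2**beta):
        x = Fraction(k, 2**beta)
        errors.append(round_fixed(x, b) - x)
    return min(errors), max(errors)

low, high = rounding_error_range(6, 3)
print(low, high)  # -3/64 1/16
```

Because rounding acts on the value rather than on the bit pattern, the same range results whichever format stores the negative numbers. With this tie rule the exact range is $-(2^{-(b+1)} - 2^{-\beta}) \le \varepsilon_r \le 2^{-(b+1)}$, which stays within half a quantization step.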

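Quantization. 3. Floating-point. For floating-point numbers, quantization is carried out on the mantissa only. It is therefore more reasonable to consider the relative error. For a floating-point number
$$X = M \cdot 2^{E},$$
the quantized value is $Q(X) = Q(M) \cdot 2^{E}$, and the relative error is
$$\varepsilon = \frac{Q(X) - X}{X} = \frac{Q(M) - M}{M}.$$
In practice, a rounding quantizer can be modeled as the exact value plus an error term, $Q(X) = X + \varepsilon_r$, or, in terms of the relative error, $Q(X) = X(1 + \varepsilon)$.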
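Mantissa-only quantization and the resulting relative error can be sketched as follows. The example assumes a nonzero input, a mantissa normalised to $0.5 \le |M| < 1$ (as returned by Python's `math.frexp`), and rounding to `b` fractional mantissa bits; the function names are hypothetical.

```python
import math
from fractions import Fraction

def quantize_float(x, b):
    """Round the mantissa of x to b fractional bits; the exponent is untouched."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    qm = Fraction(math.floor(Fraction(m) * 2**b + Fraction(1, 2)), 2**b)
    return qm, e

def relative_error(x, b):
    """epsilon = (Q(M) - M) / M, which equals (Q(X) - X) / X for nonzero x."""
    m = Fraction(math.frexp(x)[0])
    qm, _ = quantize_float(x, b)
    return (qm - m) / m

b = 8
for x in (0.1, 3.7, -123.456, 1.0e-5):
    eps = relative_error(x, b)
    print(x, float(eps), abs(eps) <= Fraction(1, 2**b))
```

Since the mantissa error is at most half a step, $2^{-b}/2$, and $|M| \ge 0.5$, the relative error of this rounding scheme cannot exceed $2^{-b}$ in magnitude, independent of the exponent.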