Recitation 4&5 and review 1 & 2 & 3

Presentation transcript:

Recitation 4&5 and review 1 & 2 & 3 9-26/28-2017

Recitation 4

IEEE Floating Point Representation
V = (-1)^s * M * 2^E
s: sign bit; s = 0 is positive, s = 1 is negative.
M: significand; 1 ≤ M ≤ 2 - ε for normalized values, 0 ≤ M ≤ 1 - ε for denormalized values, where ε = 2^(-n) is the weight of the least significant fraction bit (the smallest fraction greater than 0).
E: exponent; it may be negative.

IEEE Floating Point Representation
Layout: s (1 bit) | exponent (k bits) | fraction (n bits)
32 bit (single precision): 1 sign bit, k = 8, n = 23
64 bit (double precision): 1 sign bit, k = 11, n = 52

IEEE Floating Point Representation: value categories (32 bits)
Normalized:    exponent ≠ 0 and exponent ≠ 255, any fraction
Denormalized:  exponent = 0, any fraction
Infinity:      exponent = 255, fraction = 0
NaN:           exponent = 255, fraction ≠ 0

IEEE Floating Point Representation
V = (-1)^s * M * 2^E
s: sign bit
E = exponent - Bias (normalized), E = 1 - Bias (denormalized)
Bias = 2^(k-1) - 1
M = 1 + Fraction (normalized), M = Fraction (denormalized)
Fraction = 0.f_(n-1) f_(n-2) … f_1 f_0 in binary, i.e. the n fraction bits read as an unsigned integer and multiplied by 2^(-n)
Layout: s (1 bit) | exponent (k bits) | fraction (n bits)
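The formulas on this slide can be sketched as a small decoder. This is only an illustration, not part of the recitation: the function name `decode_ieee` and the use of exact `Fraction` arithmetic are assumptions made here.

```python
from fractions import Fraction

def decode_ieee(bits: int, k: int, n: int):
    """Decode a (1 + k + n)-bit pattern using the slide's formulas.

    Returns an exact Fraction for finite values, or a float for inf/NaN.
    (Hypothetical helper written for this example.)
    """
    frac_field = bits & ((1 << n) - 1)        # low n bits
    exp_field = (bits >> n) & ((1 << k) - 1)  # next k bits
    sign = (bits >> (n + k)) & 1              # top bit
    bias = (1 << (k - 1)) - 1                 # Bias = 2^(k-1) - 1
    fraction = Fraction(frac_field, 1 << n)   # fraction bits * 2^(-n)

    if exp_field == (1 << k) - 1:             # exponent all ones: infinity or NaN
        if frac_field != 0:
            return float("nan")
        return float("-inf") if sign else float("inf")
    if exp_field == 0:                        # denormalized
        m, e = fraction, 1 - bias
    else:                                     # normalized
        m, e = 1 + fraction, exp_field - bias
    return (-1) ** sign * m * Fraction(2) ** e

# Single precision uses k = 8, n = 23.
print(float(decode_ieee(0x4220A000, 8, 23)))  # 40.15625
```

Because `k` and `n` are parameters, the same sketch also handles double precision (k = 11, n = 52) or a small teaching format.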

Normalized
32 bit (single precision): 1 sign bit, k = 8, n = 23
Bias = 2^(8-1) - 1 = 127
Exponent field ranges from 1 to 254 (0 and 255 are reserved)
E = exponent - Bias = -126 to +127
64 bit (double precision): 1 sign bit, k = 11, n = 52
Bias = 2^(11-1) - 1 = 1023
Exponent field ranges from 1 to 2046 (0 and 2047 are reserved)
E = exponent - Bias = -1022 to +1023

Normalized
M = 1 + Fraction
Fraction = 0.f_(n-1) f_(n-2) … f_1 f_0 in binary; its largest value is 0.11…1 = 1 - ε < 1, with ε = 2^(-n)
Therefore M = 1 + Fraction < 2, that is, 1 ≤ M ≤ 2 - ε

Denormalized
Exponent field = 0 (all k bits are 0)
32 bit (single precision): 1 sign bit, k = 8, n = 23
Bias = 2^(8-1) - 1 = 127
E = 1 - Bias = -126
64 bit (double precision): 1 sign bit, k = 11, n = 52
Bias = 2^(11-1) - 1 = 1023
E = 1 - Bias = -1022

Denormalized
M = Fraction
Fraction = 0.f_(n-1) f_(n-2) … f_1 f_0 in binary; its largest value is 0.11…1 = 1 - ε < 1, with ε = 2^(-n)
Therefore M = Fraction < 1, that is, 0 ≤ M ≤ 1 - ε

Example: FP representation of (40.15625)_10 in 32 bits
Sign bit: s = 0
(40.15625)_10 = (101000.00101)_2
Normalize: 1.0100000101 * 2^5
Convert the exponent to biased form: 127 + 5 = 132, and (132)_10 = (10000100)_2
Result: 0 10000100 01000001010…0 (sign bit, k = 8 exponent bits, n = 23 fraction bits)
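The hand conversion can be checked with Python's standard `struct` module. This is a minimal sketch; the helper name `float32_fields` is an assumption made for this example.

```python
import struct

def float32_fields(x: float):
    """Return (sign, exponent bits, fraction bits) of x stored as a 32-bit float."""
    # Pack as big-endian single precision, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # k = 8 exponent bits
    fraction = bits & 0x7FFFFF       # n = 23 fraction bits
    return sign, format(exponent, "08b"), format(fraction, "023b")

print(float32_fields(40.15625))
# (0, '10000100', '01000001010000000000000')
```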

Example: decode 0 10001001 11010001101000010100110 (sign bit, k = 8 exponent bits, n = 23 fraction bits)
s = 0, so the value is positive
exp = (10001001)_2 = 137, which is neither 0 nor 255, so the value is normalized
E = exp - Bias = 137 - (2^(8-1) - 1) = 137 - 127 = 10
M = 1 + Fraction = 1 + 0.f_(n-1) f_(n-2) … f_1 f_0 = 1.11010001101000010100110

Example (contd.)
V = (-1)^s * M * 2^E
  = (-1)^0 * M * 2^10
  = 1.11010001101000010100110 * 2^10
  = (11101000110.1000010100110)_2
  = 1862.520263671875

Examples of Positive Floating Point Numbers
Description        Exponent   Fraction
Smallest denorm.   000…000    000…001
Largest denorm.    000…000    111…111
Smallest norm.     000…001    000…000
Largest norm.      111…110    111…111
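For concreteness, the four bit patterns can be evaluated in single precision (k = 8, n = 23). Single precision is an assumption made here, since the table itself is width-independent, and `float32_from_bits` is a helper name invented for this sketch.

```python
import struct

def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# exponent field | fraction field, sign bit 0
EXAMPLES = {
    "smallest denorm": 0x00000001,  # exp 000...000, frac 000...001
    "largest denorm":  0x007FFFFF,  # exp 000...000, frac 111...111
    "smallest norm":   0x00800000,  # exp 000...001, frac 000...000
    "largest norm":    0x7F7FFFFF,  # exp 111...110, frac 111...111
}

for name, bits in EXAMPLES.items():
    print(f"{name:16s} {bits:032b} {float32_from_bits(bits)!r}")
```

The printed values correspond to 2^(-149), (1 - 2^(-23)) * 2^(-126), 2^(-126) and (2 - 2^(-23)) * 2^127.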

Rounding Modes
Mode                1.40   1.60   1.50   2.50   -1.50
Round-to-even       1      2      2      2      -2
Round-toward-zero   1      1      1      2      -1
Round-down          1      1      1      2      -2
Round-up            2      2      2      3      -1
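The four modes map onto Python built-ins, which gives a quick way to reproduce the table. This is a sketch for rounding to whole numbers only; it relies on Python 3's `round` rounding halfway cases to the nearest even integer.

```python
import math

VALUES = [1.40, 1.60, 1.50, 2.50, -1.50]

MODES = {
    "round-to-even":     round,       # ties go to the even neighbour
    "round-toward-zero": math.trunc,  # drop the fractional part
    "round-down":        math.floor,  # toward negative infinity
    "round-up":          math.ceil,   # toward positive infinity
}

table = {name: [fn(v) for v in VALUES] for name, fn in MODES.items()}
for name, row in table.items():
    print(f"{name:18s}", row)
```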

Recitation 5

Rounding Binary Numbers
Binary fractional numbers
– "Even" when the least significant bit is 0
– "Half way" when the bits to the right of the rounding position are 100…_2
Examples
– Round to the nearest 1/4 (2 bits right of the binary point)
Value    Binary       Rounded   Action          Rounded value
2 3/32   10.00011_2   10.00_2   (< 1/2: down)   2
2 3/16   10.00110_2   10.01_2   (> 1/2: up)     2 1/4
2 7/8    10.11100_2   11.00_2   (1/2: up)       3
2 5/8    10.10100_2   10.10_2   (1/2: down)     2 1/2

Round to Even
When rounding to even, consider the two possible choices and choose the one with a 0 in the final position.
Example: round to even at the 1/4 position:
1.1010000 (1 5/8) lies halfway between 1.10 (1 1/2) and 1.11 (1 3/4); it rounds to 1.10 (1 1/2).
1.1110000 (1 7/8) lies halfway between 1.11 (1 3/4) and 10.00 (2.0); it rounds to 10.00 (2.0).
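Rounding to the nearest 1/4 with ties going to even can be reproduced with exact fractions. This is a sketch: `round_to_quarter` is a name chosen here, and it relies on `round()` of a `Fraction` resolving ties to the even integer.

```python
from fractions import Fraction

def round_to_quarter(x: Fraction) -> Fraction:
    """Round x to the nearest multiple of 1/4, ties to even."""
    # Scale so the rounding position becomes the units position, round, scale back.
    return Fraction(round(x * 4), 4)

cases = [
    Fraction(2) + Fraction(3, 32),  # below half way: rounds down
    Fraction(2) + Fraction(3, 16),  # above half way: rounds up
    Fraction(2) + Fraction(7, 8),   # half way: rounds up to the even choice
    Fraction(2) + Fraction(5, 8),   # half way: rounds down to the even choice
]
for x in cases:
    print(x, "->", round_to_quarter(x))
```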

Rounding practice

Floating Point Representation

An IEEE floating point representation uses 4 exp bits and 5 frac bits.
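As a rough illustration of what these widths imply, the bias and the extreme positive values follow from the general formulas. The sketch assumes one additional sign bit (a 10-bit format) and the usual IEEE conventions; the variable names are chosen here and are not from the slide.

```python
from fractions import Fraction

K, N = 4, 5                        # exponent and fraction widths from the slide
BIAS = 2 ** (K - 1) - 1            # 2^(k-1) - 1
EPS = Fraction(1, 2 ** N)          # weight of the last fraction bit

e_denorm = 1 - BIAS                # exponent used when the exponent field is 0
e_min_norm = 1 - BIAS              # exponent field = 1
e_max_norm = (2 ** K - 2) - BIAS   # exponent field = 1110

smallest_denorm = EPS * Fraction(2) ** e_denorm
largest_denorm = (1 - EPS) * Fraction(2) ** e_denorm
smallest_norm = Fraction(2) ** e_min_norm
largest_norm = (2 - EPS) * Fraction(2) ** e_max_norm

print(BIAS, e_min_norm, e_max_norm)
print(smallest_denorm, largest_denorm, smallest_norm, largest_norm)
```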

Review 1 & 2 & 3
– Endianness
– Bitwise and shift operations
– Conversion between binary, decimal, and hexadecimal

Recitation 1: 10 problems
Problem 2: Convert to binary: 0x3A6B
0x3A6B = (0011 1010 0110 1011)_2

Recitation 1: 10 problems
Problem 3: Convert from decimal to binary: 935
Problem 4: Convert from decimal to hexadecimal: 935
935 = a * 16^2 + b * 16^1 + c
    = a * 256 + b * 16 + c
    = 3 * 256 + 10 * 16 + 7
    = 0x3A7
    = (0011 1010 0111)_2

Recitation 1: 10 problems
Problem 5: Convert from binary to hexadecimal: 1011011101
(1011011101)_2 = (10 1101 1101)_2 = (2 D D)_16 = 0x2DD
Problem 6: Convert from binary to decimal: 1011011101
(1011011101)_2 = 0x2DD = 2 * 16^2 + 13 * 16 + 13 = 733
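The conversions in Problems 2 to 6 can be checked with Python's built-in formatting, and the a * 16^2 + b * 16 + c decomposition corresponds to repeated division by 16. A small sketch follows; `hex_digits` is a helper name made up for this example.

```python
def hex_digits(n: int) -> list[int]:
    """Return the base-16 digits of n >= 0, most significant first."""
    digits = []
    while True:
        n, remainder = divmod(n, 16)  # peel off the lowest hex digit
        digits.append(remainder)
        if n == 0:
            break
    return digits[::-1]

print(format(0x3A6B, "016b"))   # 0011101001101011
print(hex_digits(935))          # [3, 10, 7], i.e. 0x3A7
print(format(935, "012b"))      # 001110100111
print(hex(int("1011011101", 2)), int("1011011101", 2))  # 0x2dd 733
```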

Little Endian vs Big Endian
Little endian: store the least significant byte at the smallest address and the most significant byte at the largest address.
Big endian: store the most significant byte at the smallest address and the least significant byte at the largest address.
So we know how the 4-byte value 0x01234567 is stored in memory:
Address         1000   1001   1002   1003
Little endian   67     45     23     01
Big endian      01     23     45     67
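The two byte orders can be reproduced with `int.to_bytes`. This is a sketch assuming a 4-byte word; index 0 of each result plays the role of address 1000.

```python
value = 0x01234567

little = value.to_bytes(4, "little")  # least significant byte first
big = value.to_bytes(4, "big")        # most significant byte first

print([f"{b:02x}" for b in little])   # ['67', '45', '23', '01']
print([f"{b:02x}" for b in big])      # ['01', '23', '45', '67']

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(little, "little") == value
```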

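Two’s Complement Addition
n = 5 bits: (-3)_10 + (-10)_10
N = (3)_10 = (00011)_2, so N* = -N = (11101)_2
N = (10)_10 = (01010)_2, so N* = -N = (10110)_2
  11101
+ 10110
 110011   (the carry out of the 5-bit word, the leading 1, is ignored)
Result: N* = (10011)_2 = (-13)_10, because N = (01101)_2 = (13)_10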
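The same addition can be simulated by masking Python's unbounded integers to 5 bits. This is a sketch; the helper names `to_twos` and `from_twos` are assumptions made here.

```python
N_BITS = 5
MASK = (1 << N_BITS) - 1          # 0b11111

def to_twos(value: int) -> int:
    """Encode a signed integer as an N_BITS-wide two's complement pattern."""
    return value & MASK

def from_twos(bits: int) -> int:
    """Interpret an N_BITS-wide pattern as a signed integer."""
    return bits - (1 << N_BITS) if bits & (1 << (N_BITS - 1)) else bits

a = to_twos(-3)                   # 0b11101
b = to_twos(-10)                  # 0b10110
raw = a + b                       # 0b110011, six bits including the carry out
result = raw & MASK               # drop the carry: 0b10011

print(format(a, "05b"), format(b, "05b"), format(raw, "b"), format(result, "05b"))
print(from_twos(result))          # -13
```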

Bitwise Operators
Or: A|B = 1 when either A = 1 or B = 1
And: A&B = 1 when both A = 1 and B = 1
Not: ~A = 1 when A = 0
Exclusive-Or (Xor): A^B = 1 when either A = 1 or B = 1, but not both

Bitwise Operators: Operate on Bit Vectors
Operations are applied bitwise:
01101001 & 01010101 = 01000001
01101001 | 01010101 = 01111101
01101001 ^ 01010101 = 00111100
~ 01010101 = 10101010
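These results can be reproduced directly. The only subtlety in this sketch is that Python integers have no fixed width, so the complement is masked to 8 bits to match the slide.

```python
a = 0b01101001
b = 0b01010101
MASK = 0xFF                       # keep results 8 bits wide

results = {
    "a & b": a & b,
    "a | b": a | b,
    "a ^ b": a ^ b,
    "~b":    ~b & MASK,           # mask because ~b is negative in Python
}
for name, value in results.items():
    print(f"{name:6s} {value:08b}")
```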

Shift Operations
Left shift: x << y
– Shift bit-vector x left y positions
– Throw away extra bits on the left
– Fill with 0's on the right
Right shift: x >> y
– Shift bit-vector x right y positions
– Throw away extra bits on the right
– Logical shift: fill with 0's on the left
– Arithmetic shift: replicate the most significant bit on the left
Undefined behavior
– Shift amount < 0 or ≥ word size
Examples
Argument x    01100010   10100010
x << 3        00010000   00010000
Log. >> 2     00011000   00101000
Arith. >> 2   00011000   11101000
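The three shifts can be emulated on 8-bit vectors as follows. This is a sketch with helper names invented here. The undefined-behavior note is about fixed-width languages such as C; Python instead raises `ValueError` for a negative shift count and never discards bits on its own, hence the explicit masking.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def shl(x: int, y: int) -> int:
    """Left shift: bits pushed past the top are thrown away."""
    return (x << y) & MASK

def shr_logical(x: int, y: int) -> int:
    """Logical right shift: fill with 0s on the left."""
    return (x & MASK) >> y

def shr_arith(x: int, y: int) -> int:
    """Arithmetic right shift: replicate the most significant bit."""
    signed = x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x
    return (signed >> y) & MASK   # Python's >> on negative ints sign-extends

for x in (0b01100010, 0b10100010):
    print(f"{x:08b}: {shl(x, 3):08b} {shr_logical(x, 2):08b} {shr_arith(x, 2):08b}")
```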