CS 224 Computer Organization Spring 2011 Arithmetic for Computers With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide contents.

Slides:



Advertisements
Similar presentations
The MIPS 32 1)Project 1 Discussion? 1)HW 2 Discussion? 2)We want to get some feel for programming in an assembly language - MIPS 32 We want to fully understand.
Advertisements

CSE431 Chapter 3.1Irwin, PSU, 2008 CSE 431 Computer Architecture Fall 2008 Chapter 3: Arithmetic for Computers Mary Jane Irwin ( )
Spring 2013 Advising Starts this week! CS2710 Computer Organization1.
Chapter 3 Arithmetic for Computers. Chapter 3 — Arithmetic for Computers — 2 Arithmetic for Computers Operations on integers Addition and subtraction.
Chapter Three.
Computer Organization CS224 Fall 2012 Lesson 19. Floating-Point Example  What number is represented by the single-precision float …00 
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Arithmetic in Computers Chapter 4 Arithmetic in Computers2 Outline Data representation integers Unsigned integers Signed integers Floating-points.
CS/ECE 3330 Computer Architecture Chapter 3 Arithmetic.
Chapter 3 Arithmetic for Computers. Exam 1 CSCE
CSCE 212 Chapter 3: Arithmetic for Computers Instructor: Jason D. Bakos.
©UCB CPSC 161 Lecture 6 Prof. L.N. Bhuyan
Lecture 16: Computer Arithmetic Today’s topic –Floating point numbers –IEEE 754 representations –FP arithmetic Reminder –HW 4 due Monday 1.
1 Lecture 9: Floating Point Today’s topics:  Division  IEEE 754 representations  FP arithmetic Reminder: assignment 4 will be posted later today.
Chapter 3 Arithmetic for Computers. Multiplication More complicated than addition accomplished via shifting and addition More time and more area Let's.
1  2004 Morgan Kaufmann Publishers Chapter Three.
CSE431 L03 MIPS Arithmetic Review.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 03: MIPS Arithmetic Review Mary Jane Irwin (
Computer Systems Organization: Lecture 3
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 11: Floating Point Partially adapted from Computer Organization and Design, 4 th edition,
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 6: Logic/Shift Instructions Partially adapted from Computer Organization and Design, 4.
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.
CS/COE0447 Computer Organization & Assembly Language
Systems Architecture Lecture 14: Floating Point Arithmetic
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Chapter 3 -Arithmetic for Computers Operations on integers – Addition and subtraction – Multiplication and division – Dealing with overflow Floating-point.
Computer Organization and Architecture Computer Arithmetic Chapter 9.
Computer Architecture ALU Design : Division and Floating Point
Computing Systems Basic arithmetic for computers.
ECE232: Hardware Organization and Design
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /14/2013 Lecture 16: Floating Point Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
1 ECE369 Sections 3.5, 3.6 and ECE369 Number Systems Fixed Point: Binary point of a real number in a certain position –Can treat real numbers as.
Oct. 18, 2007SYSC 2001* - Fall SYSC2001-Ch9.ppt1 See Stallings Chapter 9 Computer Arithmetic.
Computer Arithmetic II Instructor: Mozafar Bag-Mohammadi Spring 2006 University of Ilam.
CSF 2009 Arithmetic for Computers Chapter 3. Arithmetic for Computers Operations on integers Addition and subtraction Multiplication and division Dealing.
Lecture 9: Floating Point
Computer Arithmetic II Instructor: Mozafar Bag-Mohammadi Ilam University.
Floating Point Representation for non-integral numbers – Including very small and very large numbers Like scientific notation – –2.34 × –
Lecture 12: Integer Arithmetic and Floating Point CS 2011 Fall 2014, Dr. Rozier.
Computer Architecture Chapter 3 Instructions: Arithmetic for Computer Yu-Lun Kuo 郭育倫 Department of Computer Science and Information Engineering Tunghai.
CS.305 Computer Architecture Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available by Dr Mary.
Lecture notes Reading: Section 3.4, 3.5, 3.6 Multiplication
CDA 3101 Fall 2013 Introduction to Computer Organization
Chapter 3 Arithmetic for Computers. Chapter 3 — Arithmetic for Computers — 2 Arithmetic for Computers Operations on integers Addition and subtraction.
Floating Point Numbers Representation, Operations, and Accuracy CS223 Digital Design.
Chapter 3 Arithmetic for Computers CprE 381 Computer Organization and Assembly Level Programming, Fall 2012 Revised from original slides provided by MKP.
EI 209 Chapter 3.1CSE, 2015 EI 209 Computer Organization Fall 2015 Chapter 3: Arithmetic for Computers Haojin Zhu ( )
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell CS352H: Computer Systems Architecture Lecture 6: MIPS Floating.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 3 Arithmetic for Computers.
순천향대학교 정보기술공학부 이 상 정 1 3. Arithmetic for Computers.
COMPUTER ORGANIZATION ARITHMETIC YASSER MOHAMMAD.
CSE 340 Computer Architecture Spring 2016 MIPS Arithmetic Review.
/ Computer Architecture and Design
Morgan Kaufmann Publishers Arithmetic for Computers
Computer Architecture & Operations I
Morgan Kaufmann Publishers Arithmetic for Computers
Chapter Three : Arithmetic for Computers
CS 314 Computer Organization Fall Chapter 3: Arithmetic for Computers
CS/COE0447 Computer Organization & Assembly Language
Morgan Kaufmann Publishers Arithmetic for Computers
Morgan Kaufmann Publishers
/ Computer Architecture and Design
Arithmetic for Computers
Computer Architecture Computer Science & Engineering
Computer Arithmetic Multiplication, Floating Point
Chapter 3 Arithmetic for Computers
CSC3050 – Computer Architecture
Morgan Kaufmann Publishers Arithmetic for Computers
Chapter 3 Arithmetic for Computers
Number Representation
Presentation transcript:

CS 224 Computer Organization Spring 2011 Arithmetic for Computers With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide contents

CS224 Spring 2010 Chapter 3 2 Bits are just bits (no inherent meaning) — conventions define relationship between bits and numbers Binary numbers (base 2) decimal: n -1 Of course it gets more complicated: numbers are finite (overflow) fractions and real numbers negative numbers e.g., no MIPS subi instruction; addi can add a negative number How do we represent negative numbers? i.e., which bit patterns will represent which numbers ? Numbers

CS224 Spring 2010 Chapter bit signed numbers (2’s complement): two = 0 ten two = + 1 ten two = + 2,147,483,646 ten two = + 2,147,483,647 ten two = – 2,147,483,648 ten two = – 2,147,483,647 ten two = – 2 ten two = – 1 ten Number Representations maxint minint Converting <32-bit values into 32-bit values  copy the most significant bit (the sign bit) into the “empty” bits > >  sign extend versus zero extend (lb vs. lbu) MSB LSB

CS224 Spring 2010 Chapter 3 4 MIPS Arithmetic Logic Unit (ALU) Must support the Arithmetic/Logic operations of the ISA -add, addi, addiu, addu -sub, subu -mult, multu, div, divu -sqrt -and, andi, nor, or, ori, xor, xori -beq, bne, slt, slti, sltu, sltiu 32 m (operation) result A B ALU 4 zeroovf 1 1 With special handling for sign extend – addi, addiu, slti, sltiu zero extend – andi, ori, xori overflow detection – add, addi, sub

CS224 Spring 2010 Chapter 3 5 Dealing with Overflow OperationOperand AOperand BResult indicating overflow A + B≥ 0 < 0 A + B< 0 ≥ 0 A - B≥ 0< 0 A - B< 0≥ 0  Overflow occurs when the result of an operation cannot be represented in 32-bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit  When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur  MIPS signals overflow with an exception (a.k.a. interrupt) – an unscheduled procedure call to the OS, where the Exception Program Counter (EPC) contains the address of the instruction that caused the exception

CS224 Spring 2010 Chapter 3 6 A MIPS ALU Implementation + A1A1 B1B1 result 1 less + A0A0 B0B0 result 0 less + A 31 B 31 result 31 less set  Enable overflow bit setting for signed arithmetic (add, addi, sub) add/subt op ovf zero...  Zero detect (slt, slti, sltiu, sltu, beq, bne)

CS224 Spring 2010 Chapter 3 7 But What about Performance ? Critical path of n-bit ripple-carry adder is n*CP Design trick – compute carries in parallel w/ Carry Lookahead Unit (CLU) A0A0 B0B0 1-bit ALU result 0 CarryIn 0 CarryOut 0 A1A1 B1B1 1-bit ALU result 1 CarryIn 1 CarryOut 1 A2A2 B2B2 1-bit ALU result 2 CarryIn 2 CarryOut 2 A3A3 B3B3 1-bit ALU result 3 CarryIn 3 CarryOut 3

CS224 Spring 2010 Chapter 3 8 Multiply Binary multiplication is just a bunch of right shifts and adds multiplicand multiplier partial product array double precision product n 2n n can be formed in parallel and added in parallel for faster multiplication

CS224 Spring 2010 Chapter 3 9 Multiplication Hardware Initially 0

CS224 Spring 2010 Chapter 3 10 Optimized Multiplier Perform steps in parallel: add/shift One cycle per partial-product addition That’s ok, if frequency of multiplications is low

CS224 Spring 2010 Chapter 3 11 Fast Multiplication Hardware Can build a faster multiplier by using a parallel tree of adders with one 32-bit adder for each bit of the multiplier at the base Rather than use a single 32-bit adder 31 times, this hardware “unrolls the loop” to use 31 adders, organized in a “tree” to minimize delay

CS224 Spring 2010 Chapter 3 12 Multiply in MIPS Product stored in two 32-bit registers called Hi and Low mult $s0, $s1# $s0 * $s1 -> high/low multu $s0, $s1# $s0 * $s1 (unsigned) Results are moved from Hi/Low mfhi $t0# $t0 = Hi mflo $t0# $t0 = Low Pseudoinstruction mul $t0, $s0, $s1# $t0 = $s0 * $s1

CS224 Spring 2010 Chapter 3 13 Divide Long-hand division (108 ÷ 12 = 9) (n bit Quotient) | (Divisor) | (Dividend) (Remainder) N + 1 Steps

CS224 Spring 2010 Chapter 3 14 Solution #1: Operations Test Shift Quot Left and set bit 1 Shift divisor right 33rd? Done Start Remainder  0 Remainder < 0 < 33 repetitions 33 repetitions 1. 2a. 3. Remainder = Rem - Div Restore Rem; sll Quot left 0 2b.

CS224 Spring 2010 Chapter 3 15 Solution #1: Hardware Remainder Divisor 64-bit ALU Shift Right Shift Left Write Control 32 bits 64 bits Quotient MSB

CS224 Spring 2010 Chapter 3 16 Issues and Alternative Notice that:  Half of the divisor bits are always 0 Either high or low bits are 0, even as shifted  Half of the 64 bit adder is therefore wasted Solution #2  Instead of shifting divisor right, shift the remainder left  Adder only needs to be 32 bits wide  Can also remove an iteration by switching order to shift and then subtract  Remainder is in left half of register

CS224 Spring 2010 Chapter 3 17 Solution #2: Hardware Remainder Divisor 32-bit ALU Shift Left Control 32 bits 64 bits Quotient Write Shift Left MSB

CS224 Spring 2010 Chapter 3 18 Solution #3: Operations Test Shift Rem Left and set bit 1 32nd? Done – Shift Left Rem right Start Remainder  0 Remainder < 0 < repetitions 2. 3a. 1. Rem = Rem – Div (left half) Restore Rem; sll Rem set 0 3b. Shift Remainder left

CS224 Spring 2010 Chapter 3 19 Solution #3: Hardware Remainder Divisor 32-bit ALU Control 32 bits 64 bits Write Shift Left MSB

CS224 Spring 2010 Chapter 3 20 Divide in MIPS Quotient and remainder stored in two 32-bit registers called Hi and Low div $s0, $s1# $s0/$s1 -> high/low divu $s0, $s1# $s0/$s1 (unsigned) Results are moved from Hi/Low mfhi $t0# $t0 = Hi (remainder) mflo $t0# $t0 = Low (quotient) Pseudoinstructions div $t0, $s0, $s1# $t0 = $s0 / $s1 rem $t0, $s0, $s1# $t0 = $s0 % $s1

CS224 Spring 2010 Chapter 3 21 Floating Point Representation for non-integer numbers  Including very small and very large numbers Like scientific notation  –2.34 ×  × 10 –4  × 10 9 In binary  ±1.xxxxxxx 2 × 2 yyyy Types float and double in C normalized not normalized §3.5 Floating Point

CS224 Spring 2010 Chapter 3 22 Floating Point Standard Defined by IEEE Std Developed in response to divergence of representations  Portability issues for scientific code Now almost universally adopted Two representations  Single precision (32-bit)  Double precision (64-bit)

CS224 Spring 2010 Chapter 3 23 IEEE Floating-Point Format S: sign bit (0  non-negative, 1  negative) Normalize significand: 1.0 ≤ |significand| < 2.0  Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)  Significand is Fraction with the “1.” restored Exponent: excess representation: actual exponent + Bias  Ensures exponent is unsigned  Single: Bias = 127; Double: Bias = 1023 SExponentFraction single: 8 bits double: 11 bits single: 23 bits double: 52 bits

CS224 Spring 2010 Chapter 3 24 Single-Precision Range Exponents and reserved Smallest value  Exponent:  actual exponent = 1 – 127 = –126  Fraction: 000…00  significand = 1.0  ±1.0 × 2 –126 ≈ ±1.2 × 10 –38 Largest value  exponent:  actual exponent = 254 – 127 = +127  Fraction: 111…11  significand ≈ 2.0  ±2.0 × ≈ ±3.4 ×

CS224 Spring 2010 Chapter 3 25 IEEE 754 Double Precision Double precision number represented in 64 bits MIPS Format (-1) S × S × 2 E or (-1) S × (1 + Fraction) × 2 (Exponent-Bias) sign Exponent: bias 1023 binary integer 0 < E < 2047 Significand: magnitude, normalized binary significand with hidden bit (1): 1.F SE F F 32

CS224 Spring 2010 Chapter 3 26 Double-Precision Range Exponents 0000…00 and 1111…11 reserved Smallest value  Exponent:  actual exponent = 1 – 1023 = –1022  Fraction: 000…00  significand = 1.0  ±1.0 × 2 –1022 ≈ ±2.2 × 10 –308 Largest value  Exponent:  actual exponent = 2046 – 1023 =  Fraction: 111…11  significand ≈ 2.0  ±2.0 × ≈ ±1.8 ×

IEEE 754 FP Standard Encoding Special encodings are used to represent unusual events –± infinity for division by zero –NAN (not a number) for the results of invalid operations such as 0/0 –True zero is the bit string all zero Single PrecisionDouble PrecisionObject Represented E (8)F (23)E (11)F (52) …00000true zero (0) 0000 nonzero0000…0000nonzero± denormalized number to anything0000…0001 to 1111 …1110 anything± floating point number … ± infinity 1111 nonzero1111 … 1111nonzeronot a number (NaN)

CS224 Spring 2010 Chapter 3 28 Floating-Point Precision Relative precision  all fraction bits are significant  Single: approx 2 –23 Equivalent to 23 × log 10 2 ≈ 23 × 0.3 ≈ 6 decimal digits of precision  Double: approx 2 –52 Equivalent to 52 × log 10 2 ≈ 52 × 0.3 ≈ 16 decimal digits of precision

CS224 Spring 2010 Chapter 3 29 Floating-Point Example What number is represented by the single- precision float …00  S = 1  Fraction = 01000…00 2  Exponent = = 129 x = (–1) 1 × ( ) × 2 (129 – 127) = (–1) × 1.25 × 2 2 = –5.0

CS224 Spring 2010 Chapter 3 30 Floating-Point Addition Consider a 4-digit decimal example  × × 10 –1 1. Align decimal points  Shift number with smaller exponent  × × Add significands  × × 10 1 = × Normalize result & check for over/underflow  × Round and renormalize if necessary  × 10 2

CS224 Spring 2010 Chapter 3 31 Floating-Point Addition Now consider a 4-digit binary example  × 2 –1 + – × 2 –2 (0.5 + –0.4375) 1. Align binary points  Shift number with smaller exponent  × 2 –1 + – × 2 –1 2. Add significands  × 2 –1 + – × 2 – 1 = × 2 –1 3. Normalize result & check for over/underflow  × 2 –4, with no over/underflow 4. Round and renormalize if necessary  × 2 –4 (no change) =

CS224 Spring 2010 Chapter 3 32 FP Adder Hardware Much more complex than integer adder Doing it in one clock cycle would take too long – Much longer than integer operations – Slower clock would penalize all instructions FP adder usually takes several cycles – Can be pipelined

CS224 Spring 2010 Chapter 3 33 FP Adder Hardware Step 1 Step 2 Step 3 Step 4

CS224 Spring 2010 Chapter 3 34 Floating-Point Multiplication Consider a 4-digit decimal example  × × × 10 –5 1. Add exponents  For biased exponents, subtract bias from sum  New exponent = 10 + –5 = 5 2. Multiply significands  × =  × Normalize result & check for over/underflow  × Round and renormalize if necessary  × Determine sign of result from signs of operands  × 10 6

CS224 Spring 2010 Chapter 3 35 Floating-Point Multipl ication Now consider a 4-digit binary example  × 2 –1 × – × 2 –2 (0.5 × –0.4375) 1. Add exponents  Unbiased: –1 + –2 = –3  Biased: (– ) + (– ) = – – 127 = – Multiply significands  × =  × 2 –3 3. Normalize result & check for over/underflow  × 2 –3 (no change) with no over/underflow 4. Round and renormalize if necessary  × 2 –3 (no change) 5. Determine sign: if same, +; else, -  – × 2 –3 = –

CS224 Spring 2010 Chapter 3 36 FP Arithmetic Hardware FP multiplier is of similar complexity to FP adder o But uses a multiplier for significands instead of an adder FP arithmetic hardware usually does o Addition, subtraction, multiplication, division, reciprocal, square-root o FP  integer conversion Operations usually takes several cycles o Can be pipelined

CS224 Spring 2010 Chapter 3 37 FP Instructions in MIPS FP hardware is coprocessor 1 o Adjunct processor that extends the ISA Separate FP registers o 32 single-precision: $f0, $f1, … $f31 o Paired for double-precision: $f0/$f1, $f2/$f3, … Release 2 of MIPs ISA supports 32 × 64-bit FP reg’s FP instructions operate only on FP registers o Programs generally don’t do integer ops on FP data, or vice versa o Gives us more registers w/ minimal code-size impact FP load and store instructions o lwc1, ldc1, swc1, sdc1 e.g., ldc1 $f8, 32($sp )

CS224 Spring 2010 Chapter 3 38 FP Instructions in MIPS Single-precision arithmetic add.s, sub.s, mul.s, div.s e.g.: add.s $f0, $f1, $f6 Double-precision arithmetic add.d, sub.d, mul.d, div.d e.g.: mul.d $f4, $f4, $f6 Single- and double-precision comparison c.xx.s, c.xx.d (xx is eq, lt, le, …) Sets or clears FP condition-code bit e.g.: c.lt.s $f3, $f4 Branch on FP condition code true or false bc1t, bc1f e.g.: bc1t TargetLabel

CS224 Spring 2010 Chapter 3 39 FP Example: °F to °C C code: float f2c (float fahr) { return ((5.0/9.0)*(fahr )); } fahr in $f12, result in $f0, literals in global memory space Compiled MIPS code: f2c: lwc1 $f16, const5($gp) lwc1 $f18, const9($gp) div.s $f16, $f16, $f18 lwc1 $f18, const32($gp) sub.s $f18, $f12, $f18 mul.s $f0, $f16, $f18 jr $ra

CS224 Spring 2010 Chapter 3 40 Associativity Parallel programs may interleave operations in unexpected orders §3.6 Parallelism and Computer Arithmetic: Associativity Assumptions of associativity may fail, since FP operations are not associative ! Need to validate parallel programs under varying degrees of parallelism

CS224 Spring 2010 Chapter 3 41 Support for Accurate Arithmetic Rounding (except for truncation) requires the hardware to include extra F bits during calculations Guard bit – used to provide one F bit when shifting left to normalize a result (e.g., when normalizing F after division or subtraction) G Round bit – used to improve rounding accuracy R Sticky bit – used to support Round to nearest even; is set to a 1 whenever a 1 bit shifts (right) through it (e.g., when aligning F during addition/subtraction) S IEEE 754 FP rounding modes Always round up (toward +∞) Always round down (toward -∞) Truncate (toward 0) Round to nearest even (when the Guard || Round || Sticky are 100) – always creates a 0 in the least significant (kept ) bit of F F = 1. xxxxxxxxxxxxxxxxxxxxxxx G R S