1
Computer Organization
Lecture Set – 03 Chapter 3 Huei-Yung Lin Lectures 1 & 2
2
Roadmap for the Term: Major Topics
Computer Systems Overview
Technology Trends
Instruction Sets (and Software)
Logic and Arithmetic
Performance
Processor Implementation
Memory Systems
Input/Output
CCUEE Computer Organization
3
Review: Positional Notation of Numbers
Example: Binary (base 2) numbers
  Base = 2
  Digits = {0,1}
  Note: “bit” == “Binary digit”
  N = 1001two = 1 × 2^3 + 0 × 2^2 + 0 × 2^1 + 1 × 2^0 = 8ten + 1ten = 9ten
Example: Hexadecimal (base 16) numbers
  Base = 16
  Digits = {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}
  N = 1A3Fhex = 1 × 16^3 + 10 × 16^2 + 3 × 16^1 + 15 × 16^0 = 4096ten + 2560ten + 48ten + 15ten = 6719ten = 0001 1010 0011 1111two
4
Range of Unsigned Binary Numbers
5
Review: Unsigned vs. Signed Numbers
Basic binary allows representation of non-negative numbers only
  In C: unsigned int x; (Java has no unsigned int type)
  Useful for unsigned quantities like addresses
Most of us need negative numbers, too!
  In C, Java, etc.: int x;
How can we do this? Use a signed representation
6
Signed Number Representations
Sign/Magnitude
Two’s Complement - the one almost everyone uses
One’s Complement
Biased - used for exponent sign in Floating Point
7
Sign/Magnitude Representation
Approach: use a binary number plus an added sign bit
  Example: sign bit 1 with magnitude 11001two (25ten) represents -25ten
Problems:
  Two values of zero
  Difficult to implement in hardware. Consider addition:
    Must first check signs of operands
    Then compute value
    Then compute sign of result
8
Two’s Complement Representation
Goal: make the hardware easy to design
Approach: explicitly represent result of “borrow” in subtract
  Borrow results in “leading 1’s”
  Weight leftmost “sign bit” with -2^(n-1)
  Use sign bit to represent “last borrow”
Examples (4 bits):
  N = 1111tc = 1 × (-2^3) + 1 × 2^2 + 1 × 2^1 + 1 × 2^0 = -8ten + 7ten = -1ten
  N = 1001tc = 1 × (-2^3) + 0 × 2^2 + 0 × 2^1 + 1 × 2^0 = -8ten + 1ten = -7ten
  N = 0101tc = 0 × (-2^3) + 1 × 2^2 + 0 × 2^1 + 1 × 2^0 = 4ten + 1ten = 5ten
All negative numbers have a “1” in the sign bit
Single representation of zero
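The weighting rule above can be checked with a short sketch. The function name `twos_complement_value` and the bit-string input format are illustrative assumptions, not part of the slides.

```python
def twos_complement_value(bits: str) -> int:
    """Interpret a bit string (MSB first) as an n-bit two's complement number."""
    n = len(bits)
    # The leftmost (sign) bit carries weight -2^(n-1).
    value = -int(bits[0]) * (1 << (n - 1))
    # Every remaining bit carries its usual positive weight 2^i.
    for i, b in enumerate(bits[1:]):
        value += int(b) * (1 << (n - 2 - i))
    return value


for pattern in ("1111", "1001", "0101"):
    print(pattern, "=", twos_complement_value(pattern))
```

The three patterns are the ones used on the slide.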
9
Range of Two’s Complement Numbers
10
Negating Two’s Complement Numbers
Important shortcut:
  1. Invert the individual bits
  2. Add 1
Result: two’s complement representation of the negated number!
Examples (with 4 bits):
  -(0111) = 1000 + 1 = 1001
  -(1100) = 0011 + 1 = 0100
  -(1111) = 0000 + 1 = 0001
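A minimal sketch of the invert-and-add-1 shortcut, assuming values are held as n-bit patterns in ordinary Python integers (the name `negate` and the default width of 4 are illustrative choices):

```python
def negate(x: int, n: int = 4) -> int:
    """Return the n-bit two's complement pattern of -x."""
    mask = (1 << n) - 1
    inverted = ~x & mask           # step 1: invert the individual bits
    return (inverted + 1) & mask   # step 2: add 1, discarding any carry out


for pattern in (0b0111, 0b1100, 0b1111):
    print(f"-({pattern:04b}) = {negate(pattern):04b}")
```

Negating twice returns the original pattern, which is a quick sanity check on the shortcut.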
11
Other Signed Binary Representations
One’s Complement
  Use one’s complement (inverted bits) to represent negated numbers
  +1 = 0001, -1 = Invert(0001) = 1110
  Problem: two values of zero (0000, 1111)
Biased
  Add a bias (offset) to all numbers; with a bias of 2^(n-1):
  Most negative number: 00…0 = -2^(n-1)
  Zero: 10…0 = 0
  Most positive number: 11…1 = +2^(n-1)-1
  Used for exponent in IEEE floating point representation (more about this later)
12
Sign Extension
To convert a “narrower” signed number to a “wider” one:
  Copy the bits of the narrower number into the lower bits
  Copy the sign bit from the narrower number into all of the remaining bits of the result
Example: converting a signed 8-bit byte to a 32-bit word:
  Original byte: 11111010 (-6)
  Result word: 11111111 11111111 11111111 11111010 (-6)
  Original byte: 00101101 (45)
  Result word: 00000000 00000000 00000000 00101101 (45)
13
Zero-Padding - for Unsigned Numbers
To convert a “narrower” unsigned number to a “wider” one:
  Copy the bits of the narrower number into the lower bits
  Copy “zeros” into upper bits of wider number
Example: converting an unsigned 8-bit byte to a 32-bit word:
  Original byte: 00101101 (45)
  Result word: 00000000 00000000 00000000 00101101 (45)
  Original byte: 11000001 (193)
  Result word: 00000000 00000000 00000000 11000001 (193)
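The two widening rules (sign extension for signed values, zero padding for unsigned values) can be sketched as follows. The function names and default widths are assumptions chosen for illustration; values are treated as raw bit patterns.

```python
def sign_extend(value: int, from_bits: int = 8, to_bits: int = 32) -> int:
    """Widen a signed bit pattern by copying its sign bit into the upper bits."""
    sign = (value >> (from_bits - 1)) & 1
    # Mask with 1s in every bit position above the narrow value.
    upper = ((1 << (to_bits - from_bits)) - 1) << from_bits
    return (value | upper) if sign else value


def zero_extend(value: int, from_bits: int = 8) -> int:
    """Widen an unsigned bit pattern: the upper bits are simply zero."""
    return value & ((1 << from_bits) - 1)


print(f"{sign_extend(0b11111010):032b}")  # byte holding -6
print(f"{zero_extend(0b11000001):032b}")  # byte holding 193
```

The same byte 11000001 widens to two different words depending on which rule is applied, which is why MIPS needs both lb and lbu.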
14
Sign Extension in MIPS
Load-byte (lb) instruction
  Loads an 8-bit signed number from memory
  Performs sign extension before placing in 32-bit register
Load-byte unsigned (lbu)
  Loads an 8-bit unsigned number (e.g., ASCII character) from memory
  No sign extension: places byte with leading “0’s” in 32-bit register
15
Sign Extension in MIPS I-Format Instructions
I-Format instructions have a 16-bit immediate field
MIPS operations are defined on 32-bit registers
Sign extension performed on immediate operands “when it makes sense”
  Sign extension used for addi, beq, bne, ...
  Zero-padding used for andi, ori, ...
16
Signed & Unsigned Comparisons
MIPS provides two versions of “set less than”
  slt - signed comparison - useful when comparing signed numbers
  sltu - unsigned comparison - useful when comparing unsigned numbers (e.g. addresses)
Example: suppose
  $s0 = 1111 1111 1111 1111 1111 1111 1111 1111two
  $s1 = 0000 0000 0000 0000 0000 0000 0000 0001two
What do the following instructions do?
  sltu $t0, $s0, $s1    unsigned: 2^32-1 > 1, so $t0 = 0
  slt $t1, $s0, $s1     signed: -1 < 1, so $t1 = 1
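A small sketch of the difference between the two comparisons, modeling registers as 32-bit patterns. The helper names are hypothetical, and the register values (all ones and one) are assumed to match the slide's answers of -1 < 1 for slt.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & (1 << 31) else x


def slt(rs: int, rt: int) -> int:
    """Signed set-less-than: 1 if rs < rt as signed numbers, else 0."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Unsigned set-less-than: 1 if rs < rt as unsigned numbers, else 0."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


s0, s1 = 0xFFFFFFFF, 0x00000001  # assumed example values
print("sltu:", sltu(s0, s1), "slt:", slt(s0, s1))
```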
17
Review: Binary Addition
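Key building block: Full Adder
  Inputs: Ai, Bi, Ci (carry in). Outputs: Si (sum), Ci+1 (carry out)

  Ai  Bi  Ci  |  Si  Ci+1
  0   0   0   |  0   0
  0   0   1   |  1   0
  0   1   0   |  1   0
  0   1   1   |  0   1
  1   0   0   |  1   0
  1   0   1   |  0   1
  1   1   0   |  0   1
  1   1   1   |  1   1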
18
Multiple-Bit Adders
String together Full Adders to form a Ripple Adder
[Figure: 4-bit ripple adder. Four full adders take inputs A0/B0 through A3/B3 and produce S0 through S3; the carry out of each stage (C1, C2, C3) feeds the carry in of the next, from C0 up to C4]
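The ripple structure can be sketched in software by chaining a one-bit full adder, using the standard full adder equations (sum = a XOR b XOR c, carry = majority of a, b, c). The function names and the default 4-bit width are illustrative assumptions.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def ripple_add(a: int, b: int, n: int = 4, c0: int = 0) -> tuple[int, int]:
    """Add two n-bit patterns by rippling the carry from bit 0 upward."""
    carry = c0
    total = 0
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry  # n-bit sum and the final carry out


print(ripple_add(0b0110, 0b0011))
```

Each loop iteration depends on the carry from the previous one, which is the software analogue of the long carry chain in the hardware.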
19
How to Subtract with an Adder
Recall
  Definition of subtraction: A - B = A + (-B)
  Two’s complement negation shortcut: -B = bit_invert(B) + 1
[Figure: 4-bit ripple adder with each B input inverted and the carry in C0 set to 1]
20
Designing an Adder/Subtractor
Recall
  Definition of subtraction: A - B = A + (-B)
  Two’s complement negation shortcut: -B = bit_invert(B) + 1
[Figure: 4-bit ripple adder with an Add/Sub control input: 0 to add, 1 to subtract]
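A sketch of one way the Add/Sub control line can work. It assumes the common arrangement in which the control bit is XORed with each B input and also drives the carry in C0; the slide's figure is not reproduced here, so treat that wiring as an assumption. The name `add_sub` is illustrative.

```python
def add_sub(a: int, b: int, sub: int, n: int = 4) -> tuple[int, int]:
    """n-bit adder/subtractor: sub=0 computes a+b, sub=1 computes a-b."""
    carry = sub  # assumed wiring: control line supplies the "+1" via C0
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub  # XOR inverts each B bit when sub = 1
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry


print(add_sub(7, 3, sub=1))  # 7 - 3
print(add_sub(3, 7, sub=1))  # 3 - 7, result is the 4-bit pattern for -4
```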
21
Overflow in Addition & Subtraction
Overflow occurs when not enough bits are available to represent the result
  Example: unsigned 32-bit result ≥ 2^32
  Example: signed 32-bit result < -2^31 or ≥ 2^31
Detecting overflow: look for different signs in operands vs. result:

  Operation  Operand A  Operand B  Result indicating overflow
  A + B      ≥ 0        ≥ 0        < 0
  A + B      < 0        < 0        ≥ 0
  A - B      ≥ 0        < 0        < 0
  A - B      < 0        ≥ 0        ≥ 0
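The sign-based overflow test can be sketched on 32-bit patterns as below. The function names are hypothetical; the rule implemented is the one in the table: addition overflows when both operands share a sign that the result lacks, and subtraction overflows when the operands differ in sign and the result's sign differs from A's.

```python
MASK32 = 0xFFFFFFFF


def sign(x: int) -> int:
    """Sign bit (bit 31) of a 32-bit pattern."""
    return (x >> 31) & 1


def add32(a: int, b: int) -> tuple[int, bool]:
    """32-bit signed add: returns (result pattern, overflow flag)."""
    result = (a + b) & MASK32
    overflow = sign(a) == sign(b) and sign(result) != sign(a)
    return result, overflow


def sub32(a: int, b: int) -> tuple[int, bool]:
    """32-bit signed subtract: returns (result pattern, overflow flag)."""
    result = (a - b) & MASK32
    overflow = sign(a) != sign(b) and sign(result) != sign(a)
    return result, overflow


print(add32(0x7FFFFFFF, 1))           # largest positive + 1
print(sub32(0x7FFFFFFF, 0xFFFFFFFF))  # largest positive - (-1)
```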
22
What to Do When Overflow Occurs?
In some languages (e.g., C, Java): nothing (“responsibility left to the programmer”)
In other languages (e.g., Ada, Fortran): “notify programmer” through runtime exception
How MIPS handles overflow:
  add, sub, addi - runtime exception on overflow
  addu, subu, addiu - no runtime exception on overflow
  Note: addu, subu, addiu are otherwise identical to add, sub, addi … including sign extension in addiu!
23
Delay in Adders
Review: full adder equations
  Sum: s_i = a_i XOR b_i XOR c_i
  Carry: c_(i+1) = a_i b_i + a_i c_i + b_i c_i
Delay estimate: 32-bit ripple add
  Worst case: A0 or B0 to C32 (or S31)
  Carry delay - each stage: 2 gate delays
  Total delay: 32 × 2 = 64 gate delays - too high!
[Figure: 4-bit ripple adder showing the carry chain from C0 through C4]
24
Speeding up Carry - Carry Lookahead
Key idea: trade off delay against amount of logic used
  Benefit: faster addition
  Cost: much more logic
Define two signals for each adder stage:
  Generate: g_i = a_i b_i
  Propagate: p_i = a_i + b_i
Why use these names?
  Adder i will always generate a carry if a_i, b_i are both true
  Adder i will propagate a carry input if either or both of a_i, b_i are true
25
Carry Lookahead (cont’d)
Now rewrite carry output as a function of a_i, b_i, p_i, g_i
  Original eqn: c_(i+1) = a_i b_i + a_i c_i + b_i c_i
  New eqn: c_(i+1) = g_i + p_i c_i
"Flatten" carry function in terms of g_i, p_i
  c1 = g0 + p0 c0
  c2 = g1 + p1 c1 = g1 + p1 (g0 + p0 c0) = g1 + p1 g0 + p1 p0 c0
  c3 = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0
  c4 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 + p3 p2 p1 p0 c0
Add carry lookahead logic that computes c1-c4 in terms of p0-p3 and g0-g3
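The flattened carry equations can be exercised with a short sketch of a 4-bit carry-lookahead adder. The function name `cla4` is an illustrative assumption; each carry below is written directly from the equations above rather than rippled from the previous stage.

```python
def cla4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry-lookahead add: returns (4-bit sum, carry out c4)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abit, bbit)]  # generate:  g_i = a_i b_i
    p = [x | y for x, y in zip(abit, bbit)]  # propagate: p_i = a_i + b_i

    # Flattened carries: each depends only on g, p, and c0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (abit[i] ^ bbit[i] ^ carries[i]) << i  # s_i = a_i XOR b_i XOR c_i
    return total, c4


print(cla4(0b1011, 0b0110))
```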
26
Using Carry Lookahead
Practical computation for 4-bit adders, but...
Too expensive for 16 bits or 32 bits!
[Figure: 4-bit carry-lookahead adder. Each bit slice takes a_i, b_i, c_i and produces s_i, g_i, p_i; a Carry Lookahead Unit takes c0 (Carry In) and g0-g3, p0-p3 and produces c1-c3, c4 (Carry Out), and group signals G0, P0]
27
Using Carry Lookahead
Cost is a limiting factor
  Practical computation for 4-bit adders, but...
  Too expensive for 16 bits or 32 bits!
Alternative: combine 4-bit Carry-Lookahead Adders (CLAs)
  Ripple/Lookahead - string together CLAs
  Group-Lookahead - add another level of lookahead
28
Ripple/Lookahead Adder
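String together CLAs
  Faster than ripple adder, but…
  Still long delays
[Figure: 16-bit ripple/lookahead adder. Four 4-bit CLA blocks add a3-a0/b3-b0, a7-a4/b7-b4, a11-a8/b11-b8, a15-a12/b15-b12 to produce s3-s0, s7-s4, s11-s8, s15-s12; the carry out (c4) of each block ripples into the carry in of the next]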
29
Group Carry-Lookahead
Approach: use carry lookahead for 4-bit groups
“Super Propagate” equations:
  P0 = p3*p2*p1*p0
  P1 = p7*p6*p5*p4
  P2 = p11*p10*p9*p8
  P3 = p15*p14*p13*p12
“Super Generate” equations:
  G0 = g3 + (p3*g2) + (p3*p2*g1) + (p3*p2*p1*g0)
  G1 = g7 + (p7*g6) + (p7*p6*g5) + (p7*p6*p5*g4)
  G2 = g11 + (p11*g10) + (p11*p10*g9) + (p11*p10*p9*g8)
  G3 = g15 + (p15*g14) + (p15*p14*g13) + (p15*p14*p13*g12)
Combine groups using second level
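A sketch of a 16-bit group carry-lookahead adder built from the super propagate and super generate equations. Two things here are assumptions rather than slide content: the second-level carry equations are taken to have the same form as the bit-level ones with P and G substituted for p and g, and, for brevity, the carries inside each group are computed with c_(i+1) = g_i + p_i c_i iteratively, which is logically equal to the flattened form.

```python
def group_cla16(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """16-bit add using 4-bit groups: returns (16-bit sum, carry out c16)."""
    abit = [(a >> i) & 1 for i in range(16)]
    bbit = [(b >> i) & 1 for i in range(16)]
    g = [x & y for x, y in zip(abit, bbit)]
    p = [x | y for x, y in zip(abit, bbit)]

    P, G = [], []
    for k in range(4):  # one super propagate / generate pair per 4-bit group
        p0, p1, p2, p3 = p[4 * k:4 * k + 4]
        g0, g1, g2, g3 = g[4 * k:4 * k + 4]
        P.append(p3 & p2 & p1 & p0)
        G.append(g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0))

    # Second-level lookahead (assumed form): group carries C1..C4 from P, G, c0.
    C = [c0]
    C.append(G[0] | (P[0] & c0))
    C.append(G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0))
    C.append(G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0))
    C.append(G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
             | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))

    total = 0
    for k in range(4):
        c = C[k]  # each group starts from its own lookahead carry
        for i in range(4 * k, 4 * k + 4):
            total |= (abit[i] ^ bbit[i] ^ c) << i
            c = g[i] | (p[i] & c)
    return total, C[4]


print(group_cla16(0x1234, 0x0FCD))
```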
30
Group Carry-Lookahead
[Figure: 16-bit group carry-lookahead adder. Four 4-bit CLA blocks add a3-a0/b3-b0 through a15-a12/b15-b12 and produce s3-s0 through s15-s12 plus group signals P0/G0 through P3/G3; a Group Carry Lookahead Unit takes c0 and the P, G signals and produces C1, C2, C3, and C4 (c16)]
31
Delay Comparison - 16 Bit Adder
Ripple Adder
  2 gate delays per bit × 16 bits
  Total: 32 gate delays
Group Lookahead Adder
  Generating C4 (c16): 2 gate delays from Pi, Gi
  Generating Pi, Gi: 2 gate delays from pi, gi
  Generating pi, gi: 1 gate delay from ai, bi
  Total: 2 + 2 + 1 = 5 gate delays
32
Arithmetic-Logic Units
Combinational logic element that performs multiple functions:
  Arithmetic: add, subtract
  Logical: AND, OR
[Figure: ALU symbol with inputs A and B, an Operation Select input, and output F(A,B)]
33
Constructing an ALU - First Cut
Construct in bit slices, like the ripple adder
Add gates, multiplexer for logic functions, subtract
34
ALU Design - Putting it Together
CCUEE Computer Organization
35
Overflow Detection in ALUs
Overflow occurs when conditions in Fig. 3.3 are met
Problem B.25: equivalent to testing c_(msb+1) ≠ c_msb (carry out of the MSB differs from carry into the MSB)
36
Supporting the MIPS slt Instruction
Want result of 000…001 when A < B
Modify bit slice hardware
37
Supporting the MIPS slt Instruction
Add additional multiplexer input, “Less”, to each slice
Add “Set” output (MSB only)
Desired result:
  Bit 31: 0
  …
  Bit 1: 0
  Bit 0: 1 if A<B
38
Supporting the MIPS slt Instruction
Feed “Set” to “Less” input of LSB
It’s actually more complicated than this because of overflow - see text
39
Supporting the MIPS slt Instruction
Set “less” to “00….01” when result less than zero
  Details - see Fig. B.5.10, B.5.11, pp. B-33 - B-34
  Use sign bit - “pass around” to LSB of “less”
  Complicated by overflow conditions
40
Final Result: ALU Function
[Figure: ALU symbol with inputs A and B, an ALU Operation control input, and outputs Result, Zero, Overflow, CarryOut]

  ALU control input  Function
  000                AND
  001                OR
  010                add
  110                subtract
  111                set on less than
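The control encoding in the table can be modeled with a small behavioral sketch. This is a functional model only, not the bit-slice hardware: the name `alu`, the returned (result, zero) pair, and the use of a direct signed comparison for set-on-less-than (which sidesteps the overflow complication the slides mention) are all assumptions made for illustration.

```python
def alu(a: int, b: int, op: int, n: int = 32) -> tuple[int, int]:
    """Behavioral ALU model: returns (result pattern, zero flag)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask

    def signed(x: int) -> int:
        return x - (1 << n) if x & (1 << (n - 1)) else x

    if op == 0b000:
        result = a & b                                # AND
    elif op == 0b001:
        result = a | b                                # OR
    elif op == 0b010:
        result = (a + b) & mask                       # add
    elif op == 0b110:
        result = (a - b) & mask                       # subtract
    elif op == 0b111:
        result = 1 if signed(a) < signed(b) else 0    # set on less than
    else:
        raise ValueError(f"unsupported ALU control input: {op:03b}")
    return result, int(result == 0)


print(alu(5, 3, 0b110))           # subtract
print(alu(0xFFFFFFFF, 1, 0b111))  # -1 < 1
```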
41
Multiplication
Basic algorithm analogous to decimal multiplication
  Break multiplier into digits
  Multiply one digit at a time; shift multiplicand to form partial products
  Create product as sum of partial products
n-bit multiplicand × m-bit multiplier = (n+m)-bit product

      0110    Multiplicand (6)
    × 0011    Multiplier (3)
    ------
      0110    Partial products
     0110
    0000
   0000
  --------
  00010010    Product (18)
42
Multiplier Hardware
Sequential
Combinational
43
Sequential Multiplier - First Version
Multiplicand shifts left
Multiplier shifts right
Sample LSB (least significant bit) of multiplier to decide whether to add
[Figure: Multiplicand register (64 bits, shift left) and Product register (64 bits, write) feed a 64-bit ALU; Multiplier register (32 bits, shift right) supplies its LSB to Control]
44
Algorithm - 1st Cut Multiplier
START
1. Test MPY0 (LSB of multiplier)
   1a. Multiplier0 = 1: add MCND to PROD; place result in PROD
       Multiplier0 = 0: no addition
2. Shift MCND left 1 bit
3. Shift MPY right 1 bit
4. 32nd repetition? No: go to step 1. Yes: DONE
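The first-cut algorithm can be sketched as a loop over register-like integers. The name `multiply_v1` and the width parameter `n` are illustrative; the 2n-bit multiplicand and product registers are modeled by masking.

```python
def multiply_v1(mcnd: int, mpy: int, n: int = 32) -> int:
    """First-version sequential multiplier for unsigned n-bit operands."""
    wide_mask = (1 << (2 * n)) - 1  # MCND and PROD are 2n-bit registers
    prod = 0
    for _ in range(n):
        if mpy & 1:                          # 1. test MPY0
            prod = (prod + mcnd) & wide_mask  # 1a. add MCND to PROD
        mcnd = (mcnd << 1) & wide_mask       # 2. shift MCND left 1 bit
        mpy >>= 1                            # 3. shift MPY right 1 bit
    return prod


print(multiply_v1(6, 3, n=4))
```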
45
Animation - 1st Cut Multiplier
Multiplicand shifts left
Multiplier shifts right
Sample LSB of multiplier to decide whether to add
[Animation: the Multiplicand moves one position left on each step while the Multiplier LSB drives Control; a 64-bit ALU writes the 64-bit Product]
46
Sequential Multiplier - 2nd Version
Observation: we’re only adding 32 bits at a time
Clever idea: why not...
  Hold the multiplicand still and…
  Shift the product right!
[Figure: Multiplicand (32 bits) feeds a 32-bit ALU; the ALU writes LHPROD (32 bits), the left half of the 64-bit Product register, which shifts right into RHPROD (32 bits); Multiplier (32 bits) shifts right and its LSB drives Control]
47
Algorithm - 2nd Version Multiplier
START
1. Test MPY0 (LSB of multiplier)
   1a. Multiplier0 = 1: add MCND to left half of PROD; place result in left half of PROD
       Multiplier0 = 0: no addition
2. Shift PROD right 1 bit
3. Shift MPY right 1 bit
4. 32nd repetition? No (<32 repetitions): go to step 1. Yes (32 repetitions): DONE
48
Sequential Multiplier - 3rd Version
Observation: we can store the multiplier and product in the same register!
  As multiplier shifts out…
  Product shifts in
[Figure: Multiplicand (32 bits) feeds a 32-bit ALU; the ALU writes LHPROD (32 bits) of the 64-bit Product register, which shifts right; the right half initially holds MPY (32 bits) and becomes RHPROD as the product shifts in; the register’s LSB drives Control]
49
Algorithm - 3rd Version Multiplier
START
0. Load MPY in right half of PROD
1. Test PROD0 (LSB of product register)
   1a. Product0 = 1: add MCND to left half of PROD; place result in left half of PROD
       Product0 = 0: no addition
2. Shift PROD right 1 bit
3. 32nd repetition? No (<32 repetitions): go to step 1. Yes (32 repetitions): DONE
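A sketch of the third-version algorithm, in which the multiplier starts in the right half of the product register and is shifted out as the product shifts in. The name `multiply_v3` is illustrative. One modeling assumption: the add into the left half can produce a carry out of the top bit, and the unbounded Python integer simply keeps that extra bit until the following right shift brings it back into range.

```python
def multiply_v3(mcnd: int, mpy: int, n: int = 32) -> int:
    """Third-version sequential multiplier for unsigned n-bit operands."""
    prod = mpy & ((1 << n) - 1)   # 0. load MPY in right half of PROD
    for _ in range(n):
        if prod & 1:              # 1. test PROD0
            prod += mcnd << n     # 1a. add MCND to left half of PROD
        prod >>= 1                # 2. shift PROD right 1 bit
    return prod


print(multiply_v3(6, 3, n=4))
```

Compared with the first version, this needs only an n-bit adder and one 2n-bit register, which is the point of the refinement.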
50
Multiply Instructions in MIPS
MIPS adds new registers for product result:
  Hi - upper 32 bits of product
  Lo - lower 32 bits of product
MIPS multiply instructions
  mult $s0, $s1
  multu $s0, $s1
Accessing Hi, Lo registers
  mfhi $s1
  mflo $s1
51
Division Overview
Grammar school algorithm: long division
  Subtract shifted divisor from dividend when it “fits”
  Quotient bit: 1 or 0
Question: how can hardware tell “when it fits?”

            1001      Quotient
  1000 | 1001010      Dividend (divisor = 1000)
        -1000
            1010
           -1000
              10      Remainder

Dividend = Quotient × Divisor + Remainder
52
Division Hardware - 1st Version
Shift register moves divisor (DIVR) to right
ALU subtracts DIVR, then restores (adds back) if REM < 0 (i.e. divisor was “too big”)
[Figure: Divisor register DIVR (64 bits, shift right) and Remainder register REM (64 bits, write) feed a 64-bit ALU (ADD/SUB); the sign bit (REM<0) goes to Control, which sets the LSB of the Quotient register QUOT (32 bits, shift left)]
53
Division Algorithm - First Version
START: Place Dividend in REM
1. REM = REM - DIVR
2. Test REM: REM ≥ 0?
   2a. REM ≥ 0: shift QUOT left 1 bit; LSB = 1
   2b. REM < 0: restore with REM = REM + DIVR; shift QUOT left 1 bit; LSB = 0
3. Shift DIVR right 1 bit
4. 33rd repetition? No (<33 repetitions): go to step 1. Yes (33 repetitions): DONE
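The first-version restoring division loop can be sketched as below. The name `divide_v1` and the width parameter `n` are illustrative, and the sketch assumes the divisor starts in the left half of its 2n-bit register, which is why n + 1 repetitions are needed. Negative intermediate remainders are represented directly by Python's signed integers rather than by a sign bit.

```python
def divide_v1(dividend: int, divisor: int, n: int = 32) -> tuple[int, int]:
    """First-version restoring divider for unsigned n-bit operands."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    rem = dividend           # START: place dividend in REM
    divr = divisor << n      # assumed: divisor begins in the left half of DIVR
    quot = 0
    for _ in range(n + 1):   # 33 repetitions when n = 32
        rem -= divr                  # 1. REM = REM - DIVR
        if rem >= 0:
            quot = (quot << 1) | 1   # 2a. shift QUOT left, LSB = 1
        else:
            rem += divr              # 2b. restore
            quot <<= 1               #     shift QUOT left, LSB = 0
        divr >>= 1                   # 3. shift DIVR right 1 bit
    return quot, rem


print(divide_v1(7, 2, n=4))
```

The first pass through the loop can never produce a 1 in the quotient, which is the waste the later versions remove.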
54
Divide 1st Version - Observations
We only subtract 32 bits in each iteration
  Idea: instead of shifting divisor to right, shift remainder to left
First step cannot produce a 1 in quotient bit
  Switch order to shift first, then subtract
  Save 1 iteration
55
Divide Hardware - 2nd Version
Divisor holds still
Dividend/Remainder shifts left
End result: remainder in upper half of register
[Figure: DIVR (32 bits) and the upper half of REM (64 bits, shift left, write) feed a 32-bit ALU (ADD/SUB); the sign bit (REM<0) goes to Control, which sets the LSB of QUOT (32 bits, shift left)]
56
Divide Hardware - 3rd Version
Combine quotient with remainder register
[Figure: DIVR (32 bits) and the upper half of REM (64 bits, shift left, shift right, write) feed a 32-bit ALU (ADD/SUB); the sign bit (REM<0) goes to Control, which sets the LSB of REM]
57
Divide Algorithm - 3rd Version
START: Place Dividend in REM
1. Shift REM left 1 bit
2. LHREM = LHREM - DIVR
3. Test REM: REM ≥ 0?
   3a. REM ≥ 0: shift REM left 1 bit; LSB = 1
   3b. REM < 0: LHREM = LHREM + DIVR; shift REM left 1 bit; LSB = 0
4. 32nd repetition? No (<32 repetitions): go to step 2. Yes (32 repetitions): DONE (shift LHREM right 1 bit)
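A sketch of the third-version divider, where quotient bits shift into the right half of REM as the remainder shifts left. The name `divide_v3` is illustrative. Two assumptions are made: the initial left shift happens once before the loop (the loop repeats from the subtract step), and the left half is allowed one extra bit so that the final over-shift of the remainder is not lost before it is shifted back.

```python
def divide_v3(dividend: int, divisor: int, n: int = 32) -> tuple[int, int]:
    """Third-version restoring divider for unsigned n-bit operands."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    low_mask = (1 << n) - 1
    rem = dividend & low_mask        # START: dividend in right half of REM
    rem <<= 1                        # 1. shift REM left 1 bit
    for _ in range(n):
        left = (rem >> n) - divisor  # 2. LHREM = LHREM - DIVR
        if left >= 0:
            rem = (left << n) | (rem & low_mask)
            rem = (rem << 1) | 1     # 3a. shift REM left, LSB = 1
        else:
            rem <<= 1                # 3b. restore (LHREM unchanged), LSB = 0
    quotient = rem & low_mask
    remainder = (rem >> n) >> 1      # DONE: shift LHREM right 1 bit
    return quotient, remainder


print(divide_v3(7, 2, n=4))
```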
58
Dividing Signed Numbers
Check sign of divisor, dividend
Negate quotient if signs of operands are opposite
Make remainder sign match dividend (if nonzero)
59
MIPS Divide Instructions
Divide instructions
  div $s2, $s3
  divu $s2, $s3
Results in Lo, Hi registers
  Hi: remainder
  Lo: quotient
Divide pseudoinstructions
  div $s3, $s2, $s1    # $s3 = $s2 / $s1
  divu $s3, $s2, $s1
Software must check for overflow, divide-by-zero
60
Summary - Multiplication and Division
Sequential multipliers - efficient but slow
Combinational multipliers - fast but expensive
Division is more complex and problematic
  What about divide by zero?
  Restore step needed to undo unwanted subtractions
61
Floating Point - Motivation
Review: n-bit integer representations
  Unsigned: 0 to 2^n - 1
  Signed Two’s Complement: -2^(n-1) to 2^(n-1) - 1
  Biased (excess-b): -b to 2^n - 1 - b
Problem: how do we represent:
  Very large numbers, e.g. 9,345,524,282,135,672
  Very small numbers
  Rational numbers, e.g. 2/3
  Irrational numbers, e.g. sqrt(2)
  Transcendental numbers, e.g. e, π
62
Fixed Point Representation
Idea: fixed-point numbers with fractions
  Decimal point (binary point) marks start of fraction
  Decimal: digits right of the point carry weights 10^-1, 10^-2, 10^-3, 10^-4, …
  Binary: bits right of the point carry weights 2^-1, 2^-2, …, 2^-7, …
Problems
  Limited locations for “decimal point” (“binary point”)
  Won’t work for very small or very large numbers
63
Another Approach: Scientific Notation
Represent a number as a combination of
  Mantissa (significand): normalized number, AND
  Exponent (base 10)
Example: 6.02 × 10^23
  Significand (mantissa): 6.02
  Radix (base): 10
  Exponent: 23
64
Floating Point
Key idea: adapt scientific notation to binary
  Fixed-width binary number for significand
  Fixed-width binary number for exponent (base 2)
Idea: represent a number as 1.xxxxxxxtwo × 2^yyyy
  Leading ‘1’: implicit
  Significand (mantissa): xxxxxxx
  Radix: 2
  Exponent: yyyy
Important points:
  This is a tradeoff between precision and range
  Arithmetic is approximate - error is inevitable!
65
IEEE 754 Floating Point
Single precision (C/C++/Java float type): 1 sign bit S, 8-bit exponent E, 23-bit fraction F
  Value N = (-1)^S × 1.F × 2^(E-127)      (bias = 127)
Double precision (C/C++/Java double type): 1 sign bit S, 11-bit exponent E, 52-bit fraction F
  Value N = (-1)^S × 1.F × 2^(E-1023)     (bias = 1023)
66
Floating Point Examples
8.75ten = 1 × 2^3 + 1 × 2^-1 + 1 × 2^-2 = 1000.11two = 1.00011two × 2^3
Single precision:
  Significand: 00011000…0 (note leading 1 is implied)
  Exponent: 3 + 127 = 130 = 10000010two
Double precision:
  Significand: 00011000…0
  Exponent: 3 + 1023 = 1026 = 10000000010two
67
Floating Point Examples
-0.375ten = -(1 × 2^-2 + 1 × 2^-3) = -0.011two = -1.1two × 2^-2
Single precision:
  Sign: 1
  Significand: 1000…0
  Exponent: -2 + 127 = 125 = 01111101two
Double precision:
  Sign: 1
  Significand: 1000…0
  Exponent: -2 + 1023 = 1021 = 01111111101two
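The single-precision encodings worked out above can be checked against the machine's own representation using Python's standard `struct` module. The helper names `float_fields` and `fields_to_value` are illustrative; the value formula applies to normalized numbers only.

```python
import struct


def float_fields(x: float) -> tuple[int, int, int]:
    """Split a value, stored as IEEE 754 single precision, into (S, E, F)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction, leading 1 not stored
    return sign, exponent, fraction


def fields_to_value(sign: int, exponent: int, fraction: int) -> float:
    """N = (-1)^S x 1.F x 2^(E-127), valid for normalized numbers."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


for x in (8.75, -0.375):
    s, e, f = float_fields(x)
    print(f"{x}: S={s} E={e} ({e:08b}) F={f:023b}")
```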
68
Floating Point Examples
Q: What is the value of the following single-precision word?
  Significand = 1.F (leading 1 implied)
  Exponent = 8 - 127 = -119
  Final result = 1.F × 2^-119 ≈ … × 10^-36
69
Special Values in IEEE Floating Point
Exponent 00…0 (all zeros)
  Zero: reserved value with all bits zero
  “Denormalized numbers”: drop the “1.”
    Used for “very small” numbers … “gradual underflow”
    Smallest denormalized number (single precision): 2^-23 × 2^-126 = 2^-149
Exponent 11…1 (all ones)
  Infinity: zero significand
  NaN (Not a Number): nonzero significand
70
Floating Point Range and Precision
The tradeoff: range in exchange for uniformity
“Tiny” example: floating point with:
  3 exponent bits
  2 significand bits
[Figure: number line of representable values from -∞ to +∞ (ticks at -10, -5, +5, +10) and a zoomed view from -1 to +1 (ticks every 0.2, with -0 and +0), marking Denormalized, Normalized, and Infinity values]
Graphic and Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002
71
Visualizing Floating Point - “Small” FP Representation
8-bit Floating Point Representation
  The sign bit is in the most significant bit
  The next four bits are the exponent, with a bias of 7
  The last three bits are the frac
  Layout: s (bit 7) | exp (bits 6-3) | significand (bits 2-0)
Same general form as IEEE format
  Normalized, denormalized
  Representation of 0, NaN, infinity
Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002
72
Small FP - Values Related to Exponent
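  Exp   exp    E     2^E
  0     0000   -6    1/64   (denorms)
  1     0001   -6    1/64
  2     0010   -5    1/32
  3     0011   -4    1/16
  4     0100   -3    1/8
  5     0101   -2    1/4
  6     0110   -1    1/2
  7     0111    0    1
  8     1000   +1    2
  9     1001   +2    4
  10    1010   +3    8
  11    1011   +4    16
  12    1100   +5    32
  13    1101   +6    64
  14    1110   +7    128
  15    1111   n/a   (inf, NaN)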
73
Small FP Example - Dynamic Range
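  s  exp   frac  E    Value
Denormalized numbers
  0  0000  000   -6   0
  0  0000  001   -6   1/8 * 1/64 = 1/512      closest to zero
  0  0000  010   -6   2/8 * 1/64 = 2/512
  …
  0  0000  110   -6   6/8 * 1/64 = 6/512
  0  0000  111   -6   7/8 * 1/64 = 7/512      largest denorm
Normalized numbers
  0  0001  000   -6   8/8 * 1/64 = 8/512      smallest norm
  0  0001  001   -6   9/8 * 1/64 = 9/512
  …
  0  0110  110   -1   14/8 * 1/2 = 14/16
  0  0110  111   -1   15/8 * 1/2 = 15/16      closest to 1 below
  0  0111  000    0   8/8 * 1 = 1
  0  0111  001    0   9/8 * 1 = 9/8           closest to 1 above
  0  0111  010    0   10/8 * 1 = 10/8
  …
  0  1110  110    7   14/8 * 128 = 224
  0  1110  111    7   15/8 * 128 = 240        largest norm
  0  1111  000   n/a  inf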
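The table rows can be reproduced with a short decoder for the 8-bit format (1 sign bit, 4 exponent bits with bias 7, 3 fraction bits). The function name `small_fp_value` is an illustrative assumption.

```python
import math


def small_fp_value(byte: int) -> float:
    """Decode the 8-bit 'small FP' format: s | exp (4 bits, bias 7) | frac (3 bits)."""
    s = (byte >> 7) & 1
    exp = (byte >> 3) & 0xF
    frac = byte & 0x7
    if exp == 0xF:                       # all-ones exponent: infinity or NaN
        magnitude = math.inf if frac == 0 else math.nan
    elif exp == 0:                       # denormalized: no implied 1, E = 1 - bias = -6
        magnitude = (frac / 8) * 2.0 ** -6
    else:                                # normalized: implied leading 1, E = exp - bias
        magnitude = (1 + frac / 8) * 2.0 ** (exp - 7)
    return -magnitude if s else magnitude


for pattern in (0b00000001, 0b00000111, 0b00001000, 0b00111000, 0b01110111):
    print(f"{pattern:08b} -> {small_fp_value(pattern)}")
```

The step from the largest denormalized value (7/512) to the smallest normalized value (8/512) has the same spacing as the denormalized values before it, which is the "gradual underflow" behavior.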
74
Learning from Tiny & Small FP
Non-uniform spacing of numbers
  Very small spacing for large negative exponents
  Very large spacing for large positive exponents
Exact representation: sums of powers of 2
Approximate representation: everything else
75
Summary: IEEE Floating Point Values
Source: book p. 301 CCUEE Computer Organization
76
IEEE Floating Point - Interesting Numbers
  Description               exp     frac    Numeric Value
  Zero                      00…00   00…00   0.0
  Smallest Pos. Denorm.     00…00   00…01   2^-{23,52} × 2^-{126,1022}
                                            Single ≈ 1.4 × 10^-45, Double ≈ 4.9 × 10^-324
  Largest Denormalized      00…00   11…11   (1.0 - ε) × 2^-{126,1022}
                                            Single ≈ 1.18 × 10^-38, Double ≈ 2.2 × 10^-308
  Smallest Pos. Normalized  00…01   00…00   1.0 × 2^-{126,1022}
                                            Just larger than largest denormalized
  One                       01…11   00…00   1.0
  Largest Normalized        11…10   11…11   (2.0 - ε) × 2^{127,1023}
                                            Single ≈ 3.4 × 10^38, Double ≈ 1.8 × 10^308
77
Floating Point Addition (Fig. 3.16)
1. Align binary point to number with larger exponent
2. Add significands
3. Normalize result and adjust exponent
4. If overflow/underflow, throw exception
5. Round result (go to 3 if normalization needed again)
Example of step 3: a sum of 10.00two × 2^0 is normalized to 1.000two × 2^1
Hardware - Fig. 3.17, p. 201
78
Floating Point Multiplication (Fig. 3.18)
1. Add 2 exponents together to get new exponent (subtract 127 to get proper biased value)
2. Multiply significands
3. Normalize result if necessary (shift right) & adjust exponent
4. If overflow/underflow, throw exception
5. Round result (go to 3 if normalization needed again)
6. Set sign of result using sign of X, Y
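The multiplication steps can be sketched on single-precision bit patterns. This is a deliberately simplified model, not the textbook hardware: it assumes both inputs are normalized, it truncates instead of rounding (step 5), and it omits the overflow/underflow check (step 4). The function names are hypothetical.

```python
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x stored as IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(bits: int) -> float:
    """Value of a single-precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp_mul(x_bits: int, y_bits: int) -> int:
    """Multiply two normalized single-precision patterns (truncating)."""
    sx, ex, fx = x_bits >> 31, (x_bits >> 23) & 0xFF, x_bits & 0x7FFFFF
    sy, ey, fy = y_bits >> 31, (y_bits >> 23) & 0xFF, y_bits & 0x7FFFFF

    exp = ex + ey - 127                          # 1. add exponents, remove one bias
    sig = ((1 << 23) | fx) * ((1 << 23) | fy)    # 2. multiply 24-bit significands
    if sig >> 47:                                # 3. product in [2, 4): shift right
        sig >>= 1
        exp += 1
    frac = (sig >> 23) & 0x7FFFFF                # truncate (no rounding), drop implied 1
    sign = sx ^ sy                               # 6. sign from signs of X, Y
    return (sign << 31) | (exp << 23) | frac


print(bits_f32(fp_mul(f32_bits(1.5), f32_bits(2.5))))
```

The test values are chosen so the exact product fits in 24 significand bits, where truncation and rounding agree.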
79
MIPS Floating Point Instructions
Organized as a coprocessor
  Separate registers $f0-$f31
  Separate operations
  Separate data transfer (to same memory)
Basic operations
  add.s - single    add.d - double
  sub.s - single    sub.d - double
  mul.s - single    mul.d - double
  div.s - single    div.d - double
80
MIPS Floating Point Instructions (cont’d)
Data transfer
  lwc1, swc1 (l.s, s.s) - load/store float to fp reg
  l.d, s.d - load/store double to fp reg pair
Testing / branching
  c.lt.s, c.lt.d, c.eq.s, c.eq.d, … - compare and set condition bit if true
  bc1t - branch if condition true
  bc1f - branch if condition false
81
Rounding
Extra bits allow rounding after computation
  Guard digit (may shift into number during normalization)
  Round digit - used to round when guard bit shifted during normalization
  Sticky bit - used when there are 1’s to the right of the round digit (e.g., for round to nearest even)
IEEE 754 supports four rounding modes
  Always round up
  Always round down
  Truncate
  Round to nearest even (most common)
82
Limitations on Floating-Point Math
Most numbers are approximate
Roundoff error is inevitable
Range (and accuracy) vary depending on exponent
“Normal” math properties not guaranteed:
  Inverse: (1/r)*r ≠ 1
  Associative: (A+B) + C ≠ A + (B+C), (A*B) * C ≠ A * (B*C)
  Distributive: (A+B) * C ≠ A*C + B*C
Scientific calculations require error management
  Take a numerical analysis course for more info
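A quick demonstration that floating-point addition is not associative, using double precision (Python's float). The specific values are chosen for illustration: 1.0 is far below the spacing between representable numbers near 1e20, so it is lost when added to -1e20 first.

```python
a, b, c = 1e20, -1e20, 1.0

left = (a + b) + c    # a and b cancel exactly, then c is added
right = a + (b + c)   # c is absorbed into b by rounding, then a cancels b

print(left, right)
```

Both orders are valid evaluations of the same mathematical sum; only the order of rounding differs.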
83
IEEE Floating Point - Special Properties
Floating point 0 is the same as integer 0
  All bits = 0
Can (almost) use unsigned integer comparison
  A > B if: A.EXP > B.EXP, or A.EXP = B.EXP and A.SIG > B.SIG
    This is equivalent to unsigned comparison!
  But, must first compare sign bits
  Must consider -0 == 0
  NaNs are problematic
    Will be greater than any other values
    What should comparison yield?
84
Summary - Chapter 3
Important Topics
  Signed & Unsigned Numbers (3.2)
  Addition and Subtraction (3.3)
  Carry Lookahead (B.6)
  Constructing an ALU (B.5)
  Multiplication and Division (3.4, 4.5)
  Floating Point (3.6)
Coming Up: Performance (Chapter 4)
85
References
Portions of these slides are derived from:
  Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
  Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
  Dave Patterson’s CS 152 Slides – Fall 1997 © UCB
  Rob Rutenbar’s Slides – Fall 1999 CMU
  John Nestor’s ECE 313 Slides – Fall 2004 LC
  Other sources as noted