Modified from 1998 Morgan Kaufmann Publishers
Chapter Three: Arithmetic for Computers, Section 2
Copyright 1998 Morgan Kaufmann Publishers. Permission is granted to alter and distribute this material provided that the following credit line is included: 'Adapted from Computer Organization & Design, The Hardware/Software Interface, Patterson and Hennessy, second edition, Copyright 1998 Morgan Kaufmann Publishers.' This material may not be copied or distributed for commercial purposes without express written permission of the copyright holder.

Computer Arithmetic: Overall Outline
The outline for the entire subject of Computer Arithmetic is given below. However, for the sake of readability, the subject is divided into logical sections that can be presented together. The outline for each section will indicate what is covered in that specific section.
Introduction
Numbers and their representation
2’s Complement
Detecting Overflow
Basic Review
–Binary Conversion
–Binary Arithmetic
–Hex & Octal Numbers
–Basic Boolean Algebra
Design Process
Design of a “Fast” ALU for MIPS ISA
Faster Design, Carry-Look-Ahead Adder

Computer Arithmetic: Overall Outline (continued)
Additional MIPS Requirements
Elements of Design Process
Summary of Design Process
MIPS Arithmetic Instructions
Multiplication Methods
Division Methods
Floating Point
Summary

Computer Arithmetic: Section 2 Outline
In this unit we will cover:
Design Process
Design a “Fast” ALU for the MIPS ISA
–Requirements
–ALU to do one-bit andi, ori instructions
–32-bit ALU (Ripple Carry)
–Expand ALU to support other operations
–Test for equality
–Overflow
–Performance Question

The Design Process
"To Design Is To Represent"
Design activity yields description/representation of an object
-- Traditional craftsman does not distinguish between the conceptualization and the artifact
-- Separation comes about because of complexity
-- The concept is captured in one or more representation languages
-- This process IS design
Design Begins With Requirements
-- Functional Capabilities: what it will do
-- Performance Characteristics: Speed, Power, Area, Cost,...

Design Process (cont.)
Design Finishes As Assembly
-- Design understood in terms of components and how they have been assembled
-- Top-down decomposition of complex functions (behaviors) into more primitive functions
-- Bottom-up composition of primitive building blocks into more complex assemblies
[Figure: component hierarchy: CPU; Datapath, Control; ALU, Regs, Shifter; Nand Gate]
Design is a "creative process," not a simple method

Design Refinement
Informal System Requirement
Initial Specification
Intermediate Specification
Final Architectural Description
Intermediate Specification of Implementation
Final Internal Specification
Physical Implementation
[Figure: arrow labeled "refinement", pointing toward increasing level of detail]

Design as Search
Design involves educated guesses and verification
-- Given the goals, how should these be prioritized?
-- Given alternative design pieces, which should be selected?
-- Given design space of components & assemblies, which part will yield the best solution?
Feasible (good) choices vs. Optimal choices
[Figure: search tree: Problem A; Strategy 1, Strategy 2; SubProb 1, SubProb 2, SubProb 3; building blocks BB1, BB2, BB3, ..., BBn]

Problem: Design a “fast” ALU for the MIPS ISA
Requirements?
Must support the Arithmetic / Logic operations
Tradeoffs of cost and speed based on frequency of occurrence, hardware budget

MIPS ALU requirements
Add, AddU, Sub, SubU, AddI, AddIU
–=> 2’s complement adder/subtractor with overflow detection
And, Or, AndI, OrI, Xor, XorI, Nor
–=> Logical AND, logical OR, XOR, NOR
SLTI, SLTIU (set less than)
–=> 2’s complement adder with inverter, check sign bit of result
ALU from P&H book chapter 3 (Appendix B) supports these operations

MIPS arithmetic instruction format
R-type: op | Rs | Rt | Rd | funct
I-type: op | Rs | Rt | Immed 16
(op and funct values below are in octal)

Type   op  funct
ADDI   10  xx
ADDIU  11  xx
SLTI   12  xx
SLTIU  13  xx
ANDI   14  xx
ORI    15  xx
XORI   16  xx
LUI    17  xx

Type   op  funct
ADD    00  40
ADDU   00  41
SUB    00  42
SUBU   00  43
AND    00  44
OR     00  45
XOR    00  46
NOR    00  47

Type   op  funct
SLT    00  52
SLTU   00  53

Signed arithmetic generates overflow, no carry

Design Trick: divide & conquer
Break the problem into simpler problems, solve them, and glue together the solution
Example: assume the immediates have been taken care of before the ALU
–10 operations (4 bits)
00 add
01 addU
02 sub
03 subU
04 and
05 or
06 xor
07 nor
12 slt
13 sltU

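One way to read the 4-bit operation codes above is that they match the low four bits of the funct values on the instruction format slide (both written in octal). The sketch below assumes that reading; `FUNCT` and `alu_mode` are illustrative names, not part of the original material.

```python
# funct values (octal) for the R-type instructions, as listed in the slides
FUNCT = {
    "add": 0o40, "addu": 0o41, "sub": 0o42, "subu": 0o43,
    "and": 0o44, "or": 0o45, "xor": 0o46, "nor": 0o47,
    "slt": 0o52, "sltu": 0o53,
}


def alu_mode(funct: int) -> int:
    """Assumed mapping: the 4-bit ALU mode is the low four bits of funct."""
    return funct & 0b1111


# Print each operation with its funct and derived mode, both in octal
for name, funct in FUNCT.items():
    print(f"{name:5s} funct={funct:02o} mode={alu_mode(funct):02o}")
```

Under this assumption the ten operations receive ten distinct 4-bit codes, so the ALU control needs no wider field.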
Refined Requirements
(1) Functional Specification
inputs: 2 x 32-bit operands A, B, 4-bit mode (control)
outputs: 32-bit result S, 1-bit carry, 1-bit overflow
10 Operations (m): add, addu, sub, subu, and, or, xor, nor, slt, sltU
(2) Block Diagram
[Figure: ALU block with 32-bit inputs A and B, 4-bit mode m, 32-bit output S, and 1-bit outputs c (carry) and ovf (overflow)]

Different Implementations
Not easy to decide the “best” way to build something
–Don't want too many inputs to a single gate
–Don’t want to have to go through too many gates
–For our purposes, ease of comprehension is important
Let's look at a 1-bit ALU for addition:
–Let's represent our 1-bit full adders with the following block diagram
[Figure: 1-bit full adder block diagram]
$c_{out} = a b + a c_{in} + b c_{in}$
$sum = a \oplus b \oplus c_{in}$
How could we build a 1-bit ALU for add, and, and or?
How could we build a 32-bit ALU?

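The two full-adder equations can be checked directly. This is a minimal sketch on single bits; the function name `full_adder` is chosen here for illustration.

```python
def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Return (sum, c_out) for one-bit inputs a, b, c_in."""
    s = a ^ b ^ c_in                            # sum = a XOR b XOR c_in
    c_out = (a & b) | (a & c_in) | (b & c_in)   # c_out = ab + a c_in + b c_in
    return s, c_out


# Exhaustive check: the two output bits together encode a + b + c_in
for a in (0, 1):
    for b in (0, 1):
        for c_in in (0, 1):
            s, c_out = full_adder(a, b, c_in)
            assert 2 * c_out + s == a + b + c_in
```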
Review: The Multiplexor
Selects one of the inputs to be the output, based on a control input
[Figure: 2-input mux with data inputs A (selected by 0) and B (selected by 1), select input S, output C]
note: we call this a 2-input mux even though it has 3 inputs! The S input, select, is control
Let's build our ALU using a MUX:

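A gate-level reading of the 2-input mux, as a small sketch. It assumes that S = 0 selects A and S = 1 selects B, which is how the figure labels appear to pair up; `mux2` is an illustrative name.

```python
def mux2(a: int, b: int, s: int) -> int:
    """2-input multiplexor on single bits: output a when s == 0, b when s == 1."""
    not_s = s ^ 1
    return (a & not_s) | (b & s)   # C = A·S' + B·S
```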
Building a 32-bit ALU (Ripple Carry)
[Figure: 1-bit ALU doing add, and, or]
[Figure: 32-bit ALU capable of add, and, or]
The adder designed by directly linking the carries of 1-bit adders in this fashion is called a ripple carry adder

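The ripple-carry construction can be sketched by chaining 32 copies of a 1-bit ALU, each passing its carry out to the next slice. The operation encoding (0 = and, 1 = or, 2 = add) follows the control table given later in these slides; the names `alu1` and `alu32` are assumptions made for this example.

```python
def alu1(a: int, b: int, c_in: int, op: int) -> tuple[int, int]:
    """1-bit ALU slice: op 0 = and, 1 = or, 2 = add. Returns (result, c_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (a & c_in) | (b & c_in)
    result = (a & b, a | b, s)[op]   # the mux picks one of the three outputs
    return result, c_out


def alu32(a: int, b: int, op: int, width: int = 32) -> tuple[int, int]:
    """Chain `width` 1-bit slices; the carry ripples from bit 0 upward."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = alu1((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result, carry   # final carry is the carry out of the top slice
```

For example, `alu32(5, 3, 2)` adds the two operands, while `alu32(0xFFFFFFFF, 1, 2)` wraps to 0 with a carry out of 1.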
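What about subtraction (a – b)?
Two's complement approach: just invert b and add 1 (that is, negate b and add).
How do we negate in control logic?
–A solution:
[Figure: 1-bit ALU with a Binvert input selecting b or b’, plus CarryIn and Operation inputs; the CarryIn of the least significant bit is set to 1 for subtraction]
Result: a + b’ + 1 (= a – b) when Binvert = 1 and CarryIn = 1; a + b otherwise
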
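The subtraction trick can be tried on 32-bit values. This is a sketch under the assumption that operands are passed as unsigned 32-bit patterns; `add_sub` and `to_signed` are illustrative helper names.

```python
MASK = 0xFFFFFFFF  # keep values to 32 bits


def add_sub(a: int, b: int, binvert: int, carry_in: int) -> int:
    """Compute a + b, or a + b' + 1 = a - b when binvert = carry_in = 1."""
    if binvert:
        b = ~b & MASK              # bitwise inversion of b
    return (a + b + carry_in) & MASK


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a 2's complement number."""
    return x - (1 << 32) if x & 0x80000000 else x
```

With `binvert = 1` and `carry_in = 1`, `to_signed(add_sub(7, 10, 1, 1))` gives the difference 7 - 10 as a signed value.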
Tailoring the ALU to the MIPS
Add, and, or, and subtract are operations common to all ALUs. Now let's try an operation more specific to MIPS.
Need to support the set-on-less-than instruction (slt)
–remember: slt is an arithmetic instruction
–produces a 1 in the least significant bit if rs < rt and 0 otherwise
–use subtraction: (a-b) < 0 implies a < b
Need to support test for equality (beq $t5, $t6, Label)
–use subtraction: (a-b) = 0 implies a = b
So now we use a new input for our ALU called “Less”.
–The least significant bit of Less is driven by “Set”, the most significant bit (bit 31, i.e. the 32nd bit) of the adder output when it computes a – b in 2’s complement. All other bits of Less are 0.
–Let's see:

Supporting slt
To perform slt: Binvert = 1, CarryIn = 1, Operation = 3
[Figure: 1-bit block diagram; the Less input of the least significant bit comes from Set of block 32 (the most significant bit)]
Least significant bit of Less = the most significant bit (Set) of the output of the adder; all other bits of Less are 0s

32-bit ALU Block Diagram
[Figure: 32-bit ALU built from 32 1-bit ALUs]
Only the Less input of the least significant bit is connected to Set, which is bit 32 (the most significant bit) of the result of the 2’s complement subtraction
All other Less inputs are 0

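The slt wiring can be sketched at word level: subtract, take the sign bit of the difference (Set), and route it to the least significant result bit. The function name `slt` and the 32-bit masking are assumptions for this example. Like the slides at this point, the sketch ignores overflow, so it can give the wrong answer when a - b overflows.

```python
MASK = 0xFFFFFFFF


def slt(a: int, b: int) -> int:
    """Set-on-less-than for 32-bit patterns, ignoring overflow in a - b."""
    diff = (a + (~b & MASK) + 1) & MASK   # Binvert = 1, CarryIn = 1
    set_bit = diff >> 31                  # most significant bit of the adder output
    return set_bit                        # bit 0 = Set, all other result bits are 0
```

Negative operands are passed as 32-bit patterns, for example `slt(-2 & MASK, 1)`.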
Some Improvement
Note that when the ALU operation requires subtraction, the Binvert and CarryIn inputs are both set to one for the 2’s complement of b.
We can simplify the control by combining CarryIn and Binvert into one control called “Bnegate”.
Bnegate is 1 for all ALU operations requiring subtraction.
Bnegate is 0 for additions and logical operations.

Test for equality
Notice control lines (Bnegate, then 2-bit Operation):
000 = and
001 = or
010 = add
110 = subtract
111 = slt
Note: Zero is a 1 when the result is 0!
$Zero = (Result_{31} + \dots + Result_{0})'$
So now our ALU can also test for a = b

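The control table and the Zero output can be combined into one word-level sketch. It assumes the 3-bit control is Bnegate followed by the 2-bit Operation, as listed above; the function name `alu` and the returned `(result, zero)` pair are illustrative choices.

```python
MASK = 0xFFFFFFFF


def alu(a: int, b: int, control: int) -> tuple[int, int]:
    """Return (result, zero) for control = Bnegate (1 bit) + Operation (2 bits)."""
    bnegate = (control >> 2) & 1
    operation = control & 0b11
    if bnegate:
        b = ~b & MASK                      # Binvert half of Bnegate
    total = (a + b + bnegate) & MASK       # CarryIn half of Bnegate
    if operation == 0b00:
        result = a & b
    elif operation == 0b01:
        result = a | b
    elif operation == 0b10:
        result = total                     # add, or subtract when Bnegate = 1
    else:
        result = total >> 31               # slt: sign bit of a - b into bit 0
    zero = 1 if result == 0 else 0         # NOR of all result bits
    return result, zero
```

An equality test is then a subtract followed by a look at Zero: `alu(5, 5, 0b110)` returns a result of 0 with Zero set.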
Revised Diagram
LSB and MSB need to do a little extra
[Figure: 32-bit ALU with inputs A, B (32 bits each) and M (4 bits), outputs S (32 bits) and Ovflw, built from 1-bit slices ALU0 (a0, b0, cin, co, s0) through ALU31 (a31, b31, cin, co, s31); a C/L (combinational logic) block produces select, comp, c-in; a "?" marks logic still to be filled in]

Overflow (Remember)
Examples (4-bit 2’s complement): a sum of 10 wraps around to – 6, and – 4 – 5 = – 9 wraps around to 7
[Table: 4-bit patterns with columns Decimal, Binary, Decimal, 2’s Complement]

Overflow Detection
Overflow: the result is too large (or too small) to represent properly
–Example: – 8 <= 4-bit binary number <= 7
When adding operands with different signs, overflow cannot occur!
Overflow occurs when adding:
–2 positive numbers and the sum is negative
–2 negative numbers and the sum is positive
On your own: Prove you can detect overflow by:
–Carry into MSB ≠ Carry out of MSB
[Figure: 4-bit addition examples with results – 6 and 7 (from – 4 – 5)]

Overflow Detection Logic
Carry into MSB ≠ Carry out of MSB
–For an N-bit ALU: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
[Figure: four chained 1-bit ALUs (A0/B0 through A3/B3, Result0 through Result3); CarryIn3 and CarryOut3 feed an XOR gate that produces Overflow]
[Table: truth table for X XOR Y]

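The XOR rule can be checked exhaustively for a small word size. The sketch below uses N = 4 and compares the carry-based test against a direct range check; `add_with_overflow` is an illustrative name, and operands are given as ordinary Python integers in the representable range.

```python
def add_with_overflow(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Add two n-bit 2's complement values; return (n-bit result, overflow)."""
    mask = (1 << n) - 1
    low_mask = (1 << (n - 1)) - 1
    ua, ub = a & mask, b & mask                                 # n-bit patterns
    carry_in_msb = ((ua & low_mask) + (ub & low_mask)) >> (n - 1)
    carry_out_msb = (ua + ub) >> n
    result = (ua + ub) & mask
    return result, carry_in_msb ^ carry_out_msb                 # CarryIn[N-1] XOR CarryOut[N-1]


# Compare against the definition: overflow iff the true sum is outside [-8, 7]
for a in range(-8, 8):
    for b in range(-8, 8):
        _, ovf = add_with_overflow(a, b)
        assert ovf == (0 if -8 <= a + b <= 7 else 1)
```

For instance, `add_with_overflow(-4, -5)` reports overflow and leaves the bit pattern 0111, which reads as 7.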
More Revised Diagram
LSB and MSB need to do a little extra
[Figure: 32-bit ALU with inputs A, B (32 bits each) and M = control (4 bits), outputs S (32 bits) and Ovflw, built from 1-bit slices ALU0 (a0, b0, cin, co, s0) through ALU31 (a31, b31, cin, co, s31); C/L produces select, comp, c-in; the overflow logic is inserted at the MSB: signed-arith and (cin xor co)]

But What about Performance?
Critical Path (CP) of an n-bit ripple-carry adder is n * CP of the 1-bit ALU
[Figure: four 1-bit ALUs (inputs A0/B0 through A3/B3, outputs Result0 through Result3) chained through CarryIn0 ... CarryIn3 and CarryOut0 ... CarryOut3]

Problem: ripple carry adder is slow
Is a 32-bit ALU as fast as a 1-bit ALU?
Is there more than one way to do addition?
–two extremes: ripple carry and sum-of-products (only requires 2 levels of logic)
Can you see the ripple? How could you get rid of it?
$c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$
$c_2 = b_1 c_1 + a_1 c_1 + a_1 b_1$    $c_2 =$
$c_3 = b_2 c_2 + a_2 c_2 + a_2 b_2$    $c_3 =$
$c_4 = b_3 c_3 + a_3 c_3 + a_3 b_3$    $c_4 =$
Not feasible! Why?
Design Trick: throw hardware at it

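To see what getting rid of the ripple costs, one can substitute the expression for c1 into c2 and multiply out. The sketch below checks one such two-level form of c2 against the rippled version and counts how many product terms this direct substitution produces for later carries. The expansion is not minimized, and the function names are illustrative.

```python
from itertools import product


def c2_ripple(a0: int, b0: int, a1: int, b1: int, c0: int) -> int:
    """c2 computed through c1, as in the ripple-carry equations."""
    c1 = (b0 & c0) | (a0 & c0) | (a0 & b0)
    return (b1 & c1) | (a1 & c1) | (a1 & b1)


def c2_two_level(a0: int, b0: int, a1: int, b1: int, c0: int) -> int:
    """c2 with c1 substituted away: a single OR of seven AND terms."""
    return ((a1 & b1)
            | (a1 & a0 & b0) | (a1 & a0 & c0) | (a1 & b0 & c0)
            | (b1 & a0 & b0) | (b1 & a0 & c0) | (b1 & b0 & c0))


def term_count(k: int) -> int:
    """Product terms in c_k after direct substitution: T(1) = 3, T(k) = 2*T(k-1) + 1."""
    return 3 if k == 1 else 2 * term_count(k - 1) + 1


# The two forms agree on all 32 input combinations
assert all(c2_ripple(*bits) == c2_two_level(*bits)
           for bits in product((0, 1), repeat=5))

print([term_count(k) for k in (1, 2, 3, 4, 32)])
```

The term count roughly doubles with every bit position, which suggests one answer to "Not feasible! Why?": a two-level carry for bit 32 would need gates with an impractical number of inputs and terms.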