ECE 232 L8.Arithm.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 8 Computer Arithmetic ALU, Adders Maciej Ciesielski
ECE 232 L8.Arithm.2 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Outline °Number representation Signed and unsigned numbers Comparisons, sign extensions Overflow exception, detection °Computer arithmetic ALU Adders Overflow detection °Speed: Carry-Look-Ahead adder
ECE 232 L8.Arithm.3 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Sign and magnitude Sign bit = 0 pos 1 neg 0n-2n-1 n-1 bit integer Problem: two zeros (+0, –0)
ECE 232 L8.Arithm.4 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Two’s complement representation Sign bit = 0 pos 1 neg 0n-2n-1 Formula: -X two = 2 n - X X two = -2 n-1 x n n-2 x n-2 + … + 2 x 1 + x 0
ECE 232 L8.Arithm.5 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Signed vs. Unsigned Comparison °Instruction slt: set on less-then unsigned (ignore sign bit) R1 = 0… = 1 twos R2 = 0… = 2 twos R3 = 1… = -1 twos °After executing these instructions: slt r4,r2,r1 ; if (r2 < r1) r4=1; else r4=0 slt r5,r3,r1 ; if (r3 < r1) r5=1; else r5=0 sltu r6,r2,r1 ; if (r2 < r1) r6=1; else r6=0 sltu r7,r3,r1 ; if (r3 < r1) r7=1; else r7=0 °What are values of registers r4 - r7? Why? r4 = ; r5 = ; r6 = ; r7 = ;
ECE 232 L8.Arithm.6 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Sign Extension °Extend (left) – why do we need it * ? °Extend the MS-bit all the way to the left °Example: 4bits:5 10 = bits:5 10 = bits: = bits: = * To fill in 32-bit register with shorter constant
ECE 232 L8.Arithm.7 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Data Path Diagram Program Counter (PC) Instruction Register Register File ALU Cache Memory Data In Address 4 Out Rs Rt Rd Control Logic
ECE 232 L8.Arithm.8 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Design Process Design finishes as assembly - Design understood in terms of components and how they have been assembled - Top Down decomposition of complex functions (behaviors) into more primitive functions - bottom-up composition of primitive building blocks into more complex assemblies CPU DatapathControl ALURegsShifter Gates Design is a "creative process," not a simple method
ECE 232 L8.Arithm.9 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers MIPS ALU requirements °ALU operations: Arithmetic: add, addu, sub, subu, addi, addiu 2’s complement adder/sub with overflow detection Logical: AND, ANDi, OR, ORi, XOr, XOri, NOR Logical AND, logical OR, XOR, NOR Decision: slti, sltiu (set less than), beq, bne 2’s complement adder with inverter, check sign bit of result °ALU from textbook Chapter 4 supports these OPs
ECE 232 L8.Arithm.10 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers MIPS arithmetic instruction format °Signed arithmetic generate overflow, no carry R-type: I-Type: opRsRtRdfunct opRsRtImmed 16 Typeopfunct ADDI10xx ADDIU11xx SLTI12xx SLTIU13xx ANDI14xx ORI15xx XORI16xx LUI17xx Typeopfunct ADD0040 ADDU0041 SUB0042 SUBU0043 AND0044 OR0045 XOR0046 NOR0047 Typeopfunct SLT0052 SLTU0053
ECE 232 L8.Arithm.11 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Refined Requirements Functional Specification inputs: two 32-bit operands A, B; 4-bit mode M outputs:32-bit result S; 1-bit carry c; 1 bit overflow operations:add, addu, sub, subu, and, or, xor, nor, slt, sltu Block Diagram (symbol) ALU AB M overflow S 32 4 c
ECE 232 L8.Arithm.12 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Refined Diagram: bit-slice ALU AB M S 32 4 Overflow ALU0 a0b0 m cinco s0 ALU0 a31b31 m cinco s31
ECE 232 L8.Arithm.13 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ALU – bit slice design °Basic ALU functions Add, AND, OR A B 1-bit Full Adder Cout MUX Cin Result ADD AND ORS-select
ECE 232 L8.Arithm.14 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ALU - Additional operations °Subtract: A - B = A + (– B) form two’s complement by invert and +1 (C in ) °Set-less-than? – left as an exercise (see text, 4.5) A B 1-bit Full Adder Cout MUX Cin Resul t add and or S-select invert
ECE 232 L8.Arithm.15 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Revised Diagram °LSB and MSB need to do a little extra less than, set, overflow, zero – see Fig in text AB M S 32 4 Overflow ALU0 a0b0 cinco s0 ALU0 a31b31 cinco s31 C/L to produce select, comp, c-in ?
ECE 232 L8.Arithm.16 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Overflow °Examples: = 10 but = - 9 but... 2’s ComplementBinaryDecimal Decimal = 7 = 3 1 = – = – 4 = – 5 = 7
ECE 232 L8.Arithm.17 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Overflow Detection °Overflow: the result is too large (or too small) to represent properly Example: - 8 < = 4-bit binary number <= 7 °When adding operands with different signs, overflow cannot occur! °Overflow occurs when adding: 2 positive numbers and the sum is negative 2 negative numbers and the sum is positive °On your own: Prove you can detect overflow by: Carry into MSB carry out of MSB = 7 = 3 1 = – 6 10 = – 4 = – 5 = 7=
ECE 232 L8.Arithm.18 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Overflow Detection Logic °Carry into MSB Carry out of MSB For a n-bit ALU: overflow = Cin[N - 1] XOR Cout[N - 1] A0 B0 1-bit ALU Result0 Cin0 Cout0 A1 B1 1-bit ALU Result1 Cin1 Cout1 A2 B2 1-bit ALU Result2 Cin2 A3 B3 1-bit ALU Result3 Cin3 Cout3 Overflow XYX XOR Y
ECE 232 L8.Arithm.19 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers More Revised Diagram °LSB and MSB need to do a little extra M 4 C/L to produce select, comp, c-in Signed-arithmetic, Cin xor Cout AB S 32 Overflow ALU0 a0b0 cinco s0 ALU0 a31b31 cin co s31
ECE 232 L8.Arithm.20 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers What about Performance? °Critical path of n-bit Ripple-Carry (RC) adder is nCP Cout3 1-bit ALU Result0 Cin0 Cout0 1-bit ALU Result1 Cin1 Cout1 1-bit ALU Result2 CiIn2 Cout2 1-bit ALU A0 B0 A1 B1 A2 B2 A3 B3 Result3 Cin3 Design trick: throw hardware at it
ECE 232 L8.Arithm.21 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Carry Look Ahead adder - Principle °Examine the Full Adder table a b Cin Cout S Cout = a b + Cin (a + b) S = a’b’c + a’bc’ + ab’c’ + abc = a b c a b Cin Cout S In general, for bit i: c i+1 = a i b i + c i (a i +b i ) where c i = Cout, c i-1 = Cin
ECE 232 L8.Arithm.22 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Designing Carry Look Ahead adder °Compute carry-out C i in terms of the primary inputs: c i+2 = a i+1 b i+1 + c i+1 (a i+1 + b i+1 ) = a i+1 b i+1 + (c i (a i + b i ) + a i b i ) (a i+1 + b i+1 ) a i+1 b i+1 c i+2 S i+1 FA i+1 c i+1 SiSi aiai bibi cici FA i °Create auxiliary functions: Generate: g i = a i b i and Propagate: p i = a i + b i c 1 = a 0 b 0 + c 0 (a 0 + b 0 ) = g 0 + (p 0 c 0 ) c 2 = a 1 b 1 + (a 1 + b 1 ) (a 0 b 0 + c 0 (a 0 + b 0 )) = g 1 + p 1 g 0 + p 1 p 0 c 0
ECE 232 L8.Arithm.23 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Carry Look Ahead (CLA) adder AB C-out 000“kill” 01C-in“propagate” 10C-in“propagate” 111“generate” P = A and B G = A xor B Cin C1 =G0 + C0 P0 C2 = G1 + G0 P1 + C0 P0 P1 C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2 G C4 =... P A0 B0 S0 G0G0 P0P0 A1 B1 S1 G1G1 P1P1 A2 B2 S2 G2G2 P2P2 A3 B3 S3 G3G3 P3P3 These can be used in a hybrid 4x4-bit adder
ECE 232 L8.Arithm.24 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Plumbing as Carry Look Ahead analogy c1 = g0 + c0 p0 c2 = g1 + g0 p1 + c0 p0 p1 c4 = g3 + g2 p3 + g1 p2 + g0 p1 p2 + c0 p0 p1 p2 p3
ECE 232 L8.Arithm.25 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Cascaded Carry Look Ahead (16-bit): Abstraction C2 = G1 + G0 P1 + C0 P0 P1 C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2 C1 =G0 + C0 P0 G P C4 =... CLACLA 4-bit Adder 4-bit Adder 4-bit Adder G0 P0 C0 Carries are generated by CLA, not RC adder
ECE 232 L8.Arithm.26 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers 2nd level Carry, Propagate as Plumbing
ECE 232 L8.Arithm.27 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Carry Select Adder (CSA) °Design trick: guess n-bit adder Carry propagate delay CP(2n) = 2*CP(n) n-bit adder 1 0 Cout CP(2n) = CP(n) + CP(mux) Carry-select adder Compute both, select one
ECE 232 L8.Arithm.28 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Carry Skip Adder: reduce worst case delay 4-bit Ripple Adder A0B S P0 P1 P2 P3 4-bit Ripple Adder A4B S P0 P1 P2 P3 Exercise: optimal design uses variable block sizes Just speed up the slowest case for each block
ECE 232 L8.Arithm.29 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Additional MIPS ALU requirements °Multiply: mult, multu (next lecture), divide: div, divu (?) need 32-bit multiply and divide, signed and unsigned °Shift: sll, srl, sra (next lecture) need left shift, right shift, right shift arithmetic by 0 to 31 bits °NOR (leave as exercise to reader) logical NOR or use 2 steps: (A OR B) XOR
ECE 232 L8.Arithm.30 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Elements of the Design Process °Divide and Conquer (e.g., ALU) Formulate a solution in terms of simpler components. Design each of the components (subproblems) °Generate and Test (e.g., ALU) Given a collection of building blocks, look for ways of putting them together that meets requirement °Successive Refinement (e.g., carry lookahead) Solve "most" of the problem (i.e., ignore some constraints or special cases), examine and correct shortcomings. °Formulate High-Level Alternatives (e.g., carry select) Articulate many strategies to "keep in mind" while pursuing any one approach. °Work on the Things you Know How to Do The unknown will become “obvious” as you make progress.
ECE 232 L8.Arithm.31 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Summary of the Design Process Hierarchical Design to manage complexity Top Down vs. Bottom Up vs. Successive Refinement Importance of Design Representations: Block Diagrams Decomposition into Bit Slices Truth Tables, K-Maps Circuit Diagrams Other Descriptions: state diagrams, timing diagrams, reg xfer,... Optimization Criteria: Gate Count [Package Count] Logic Levels Fan-in/Fan-out Power top down bottom up Area Delay mux design meets at TT CostDesign timePin Out
ECE 232 L8.Arithm.32 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Lecture Summary °Computer arithmetic Number systems Consequences for computer organization Example: overflow detection °An Overview of the Design Process Design is an iterative process, multiple approaches to get started Do NOT wait until you know everything before you start Example: Instruction Set drives the ALU design