CS 35101 Computer Architecture Spring 2006 Week 6/7 Paul Durand (www.cs.kent.edu/~durand) Course url: www.cs.kent.edu/~durand/cs35101.htm.

Heads Up
- Week 6 & 7 material
  - Digital logic design
  - Processor organization / description
  - MIPS arithmetic operations
  - PH 3.1, 3.2, 3.3
- Reminders
  - Midterm #1: Thursday, February 23rd
- Next week's material
  - MIPS arithmetic operations
  - Reading assignment: PH 3.4 through 3.5

To make the architect’s crucial task even conceivable, it is necessary to separate the architecture, the definition of the product as perceivable by the user, from its implementation. Architecture versus implementation defines a clean boundary between parts of the design task, and there is plenty of work on each side of it. The Mythical Man-Month, Brooks, pg. 256

Review: MIPS ISA

Instr                        Op Code    Example               Meaning

Arithmetic (R & I format)
add                          0 and 32   add  $s1, $s2, $s3    $s1 = $s2 + $s3
subtract                     0 and 34   sub  $s1, $s2, $s3    $s1 = $s2 - $s3
add immediate                8          addi $s1, $s2, 6      $s1 = $s2 + 6
or immediate                 13         ori  $s1, $s2, 6      $s1 = $s2 | 6

Data Transfer (I format)
load word                    35         lw   $s1, 24($s2)     $s1 = Memory($s2+24)
store word                   43         sw   $s1, 24($s2)     Memory($s2+24) = $s1
load byte                    32         lb   $s1, 25($s2)     $s1 = Memory($s2+25)
store byte                   40         sb   $s1, 25($s2)     Memory($s2+25) = $s1
load upper imm               15         lui  $s1, 6           $s1 = 6 * 2^16

Cond. Branch (I & R format)
br on equal                  4          beq  $s1, $s2, L      if ($s1==$s2) go to L
br on not equal              5          bne  $s1, $s2, L      if ($s1 !=$s2) go to L
set on less than             0 and 42   slt  $s1, $s2, $s3    if ($s2<$s3) $s1=1 else $s1=0
set on less than immediate   10         slti $s1, $s2, 6      if ($s2<6) $s1=1 else $s1=0

Uncond. Jump (J & R format)
jump                         2          j    2500             go to 10000
jump register                0 and 8    jr   $t1              go to $t1
jump and link                3          jal  2500             go to 10000; $ra=PC+4

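Review: MIPS Organization, so far

[Figure: block diagram of the processor and memory.
 Processor: Register File with 32 registers ($zero - $ra), 32 bits each; 5-bit src1 addr, src2 addr, and dst addr inputs; 32-bit write data input; 32-bit src1 data and src2 data outputs. A 32-bit ALU. A PC with two 32-bit adders, one adding 4 (PC = PC+4) and one adding the br offset. Stages labeled Fetch, Decode, Exec.
 Memory: 2^30 words of 32 bits; read/write addr, read data, and write data ports; word addresses (binary) 0…0000, 0…0100, 0…1000, 0…1100, ..., 1…1100; byte addresses labeled within each word (big Endian).]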

Processor Organization
- Processor control needs to have the
  - Ability to input instructions from memory
  - Logic to control instruction sequencing and to issue signals that control the way information flows between the datapath components and the operations performed by them
- Processor datapath needs to have the
  - Ability to load data from and store data to memory
  - Interconnected components, functional units (e.g., ALU) and storage units (e.g., Register File), for executing the ISA
- Need a way to describe the organization
  - High level (block diagram) description
  - Schematic (gate level) description
  - Textual (simulation/synthesis level) description

Levels of Description of a Digital System

Listed from most abstract to least abstract; moving down the list, models are less abstract, more accurate, and slower to simulate.
- Architectural: models programmer's view at a high level; written in your favorite programming language
- Functional/Behavioral: more detailed model, like the block diagram view
- Register Transfer: model is in terms of datapath FUs, registers, busses; register xfer operations are clock phase accurate
- Logic: model is in terms of logic gates; delay information can be specified for gates; digital waveforms
- Circuit: model is in terms of circuits (electrical behavior); accurate analog waveforms

Tools
- Special languages + simulation systems for describing the inherent parallel activity in hardware (VHDL and Verilog)
- Schematic capture + logic simulation package like LogicWorks

Why Simulate First?
- Physical breadboarding
  - discrete components/lower scale integration precedes actual construction of the prototype
  - verification of the initial design
- No longer possible as designs reach higher levels of integration!
- Simulation before construction, aka functional verification
  - high level constructs mean faster to design and test
  - can play "what if" more easily
  - however, limited performance (can't usually simulate all possible input transitions) and accuracy (can't usually model wiring delays accurately)

Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design. The Mythical Man-Month, Brooks, pg. 43


Arithmetic
- Where we've been:
  - Abstractions:
    - Instruction Set Architecture (ISA)
    - Assembly and machine language
- What's up ahead:
  - Implementing the architecture

[Figure: ALU block with 32-bit inputs A and B, a 4-bit m (operation) input, a 32-bit result output, and 1-bit zero and ovf outputs.]

Number Representation
- Bits are just bits (have no inherent meaning)
  - conventions define the relationships between bits and numbers
- Binary numbers (base 2), integers: 0000 -> 0001 -> 0010 -> 0011 -> 0100 -> 0101 -> 0110 -> 0111 -> 1000 -> 1001 ...
  - in decimal from 0 to 2^n - 1 for n bits
- Of course, it gets more complicated
  - storage locations (e.g., register file words) are finite, so have to worry about overflow (i.e., when the number is too big to fit into 32 bits)
  - have to be able to represent negative numbers, e.g., how do we specify -8 in
      addi $sp, $sp, -8    # $sp = $sp - 8
  - in real systems have to provide for more than just integers, e.g., fractions and real numbers (and floating point)

Possible Representations

Value   Sign Mag.   Two's Comp.   One's Comp.
 -8                   1000
 -7       1111        1001          1000
 -6       1110        1010          1001
 -5       1101        1011          1010
 -4       1100        1100          1011
 -3       1011        1101          1100
 -2       1010        1110          1101
 -1       1001        1111          1110
 -0       1000                      1111
  0       0000 (+0)   0000 (0)      0000 (+0)
 +1       0001        0001          0001
 +2       0010        0010          0010
 +3       0011        0011          0011
 +4       0100        0100          0100
 +5       0101        0101          0101
 +6       0110        0110          0110
 +7       0111        0111          0111

- Issues: balance, number of zeros, ease of operations
- Which one is best? Why?
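As a rough way to check rows of the table above, the following sketch encodes a small integer in all three schemes. The function name `encode` and its return convention (None where a scheme has no encoding for the value) are made up for this illustration; negative zero cannot be requested because Python integers have no -0.

```python
def encode(value, n_bits=4):
    """Return (sign-magnitude, two's complement, one's complement) bit strings.

    An entry is None when the value has no n-bit encoding in that scheme.
    Hypothetical helper for illustration only.
    """
    mask = (1 << n_bits) - 1
    max_mag = (1 << (n_bits - 1)) - 1            # 7 when n_bits == 4

    def bits(x):
        return format(x & mask, "0{}b".format(n_bits))

    sign_mag = ones = twos = None
    if -max_mag <= value <= max_mag:
        magnitude = abs(value)
        sign_bit = (1 << (n_bits - 1)) if value < 0 else 0
        sign_mag = bits(sign_bit | magnitude)     # sign bit plus magnitude
        # one's complement: flip every bit of the magnitude for negatives
        ones = bits(magnitude ^ mask) if value < 0 else bits(magnitude)
    if -(max_mag + 1) <= value <= max_mag:
        twos = bits(value)                        # masking a Python int yields two's complement bits
    return sign_mag, twos, ones


print(encode(-3))   # ('1011', '1101', '1100')
print(encode(-8))   # (None, '1000', None)
```

Only two's complement has a single zero and an encoding for -8, which is the "balance" and "number of zeros" trade-off the slide asks about.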

MIPS Representations
- 32-bit signed numbers (2's complement):

    0000 0000 0000 0000 0000 0000 0000 0000 two =               0 ten
    0000 0000 0000 0000 0000 0000 0000 0001 two = + 1 ten
    0000 0000 0000 0000 0000 0000 0000 0010 two = + 2 ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110 two = + 2,147,483,646 ten
    0111 1111 1111 1111 1111 1111 1111 1111 two = + 2,147,483,647 ten   (maxint)
    1000 0000 0000 0000 0000 0000 0000 0000 two = - 2,147,483,648 ten   (minint)
    1000 0000 0000 0000 0000 0000 0000 0001 two = - 2,147,483,647 ten
    1000 0000 0000 0000 0000 0000 0000 0010 two = - 2,147,483,646 ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1101 two = - 3 ten
    1111 1111 1111 1111 1111 1111 1111 1110 two = - 2 ten
    1111 1111 1111 1111 1111 1111 1111 1111 two = - 1 ten

- What if the bit string represented addresses?
  - need operations that also deal with only positive (unsigned) integers
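The same 32-bit pattern can be read as a signed or an unsigned value. A minimal sketch of the two readings follows; `as_signed` and `as_unsigned` are names invented here, not MIPS terminology.

```python
def as_unsigned(bits32):
    """Read a 32-bit pattern as an unsigned integer (e.g., an address)."""
    return bits32 & 0xFFFFFFFF


def as_signed(bits32):
    """Read a 32-bit pattern as a two's complement signed integer."""
    bits32 &= 0xFFFFFFFF
    # If the sign bit (bit 31) is set, the value is the pattern minus 2^32.
    return bits32 - (1 << 32) if bits32 & 0x80000000 else bits32


print(as_signed(0x7FFFFFFF))    # 2147483647  (maxint)
print(as_signed(0x80000000))    # -2147483648 (minint)
print(as_signed(0xFFFFFFFF))    # -1
print(as_unsigned(0xFFFFFFFF))  # 4294967295
```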

Review: Signed Binary Representation

2's comp   decimal
1000       -8        -2^3
1001       -7        -(2^3 - 1)
1010       -6
1011       -5
1100       -4
1101       -3
1110       -2
1111       -1
0000        0
0001        1
0010        2
0011        3
0100        4
0101        5
0110        6
0111        7        2^3 - 1

Negation example: 0101 -> complement all the bits -> 1010 -> then add a 1 -> 1011

Two's Complement Operations
- Negating a two's complement number: complement all the bits and add a 1
  - remember: "negate" and "invert" are quite different!
- Converting n-bit numbers into numbers with more than n bits:
  - MIPS 16-bit immediate gets converted to 32 bits for arithmetic
  - copy the most significant bit (the sign bit) into the other bits
      0010 -> 0000 0010
      1010 -> 1111 1010
  - sign extension versus zero extend (lb vs. lbu)
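The two operations above can be sketched on plain Python integers treated as fixed-width bit patterns. The helper names `negate`, `sign_extend`, and `zero_extend` are assumptions made for this example.

```python
def negate(x, n_bits):
    """Two's complement negation: complement all the bits, then add a 1."""
    mask = (1 << n_bits) - 1
    inverted = ~x & mask              # "invert": flip every bit
    return (inverted + 1) & mask      # "negate": invert, then add 1


def sign_extend(x, from_bits, to_bits):
    """Widen by copying the sign bit into the new upper bits."""
    x &= (1 << from_bits) - 1
    if x & (1 << (from_bits - 1)):    # sign bit set: fill upper bits with 1s
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x


def zero_extend(x, from_bits, to_bits):
    """Widen by filling the new upper bits with 0s."""
    return x & ((1 << from_bits) - 1)


print(format(negate(0b0101, 4), "04b"))             # 1011
print(format(sign_extend(0b1010, 4, 8), "08b"))     # 11111010
print(format(zero_extend(0b1010, 4, 8), "08b"))     # 00001010
```

Note that negating the most negative value (1000 in 4 bits) returns the same pattern, since +8 has no 4-bit two's complement encoding.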

Goal: Design an ALU for the MIPS ISA
- Must support the Arithmetic/Logic operations of the ISA
- Tradeoffs of cost and speed based on frequency of occurrence, hardware budget

MIPS Arithmetic and Logic Instructions

Instruction formats (bit positions marked on the slide: 31, 25, 20, 15, 5, 0)
  R-type:  op | Rs | Rt | Rd | funct
  I-type:  op | Rs | Rt | Immed 16

Type    op       funct
ADDI    001000   xx
ADDIU   001001   xx
SLTI    001010   xx
SLTIU   001011   xx
ANDI    001100   xx
ORI     001101   xx
XORI    001110   xx
LUI     001111   xx

Type    op       funct
ADD     000000   100000
ADDU    000000   100001
SUB     000000   100010
SUBU    000000   100011
AND     000000   100100
OR      000000   100101
XOR     000000   100110
NOR     000000   100111

Type          op       funct
(unlabeled)   000000   101000
(unlabeled)   000000   101001
SLT           000000   101010
SLTU          000000   101011
(unlabeled)   000000   101100

- Signed arithmetic generates overflow, but no carry out
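To make the field layout concrete, here is a small sketch that splits a 32-bit instruction word into its fields with shifts and masks. The function names are invented for this example. The 5-bit `shamt` field between Rd and funct is the standard MIPS R-type field that the slide leaves unnamed, and the register numbers used in the comments ($s1 = 17, $s2 = 18, $s3 = 19) follow the usual MIPS convention.

```python
def decode_r(word):
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31..26
        "rs":    (word >> 21) & 0x1F,   # bits 25..21
        "rt":    (word >> 16) & 0x1F,   # bits 20..16
        "rd":    (word >> 11) & 0x1F,   # bits 15..11
        "shamt": (word >> 6) & 0x1F,    # bits 10..6
        "funct": word & 0x3F,           # bits 5..0
    }


def decode_i(word):
    """Split a 32-bit I-type instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,
        "rs":    (word >> 21) & 0x1F,
        "rt":    (word >> 16) & 0x1F,
        "immed": word & 0xFFFF,         # 16-bit immediate, not yet extended
    }


# add $s1, $s2, $s3  ->  op=0, rs=18, rt=19, rd=17, funct=32
print(decode_r(0x02538820))
# addi $s1, $s2, 6   ->  op=8, rs=18, rt=17, immed=6
print(decode_i(0x22510006))
```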

Design Trick: Divide & Conquer
- Break the problem into simpler problems, solve them and glue together the solution
- Example: assume the immediates have been taken care of before the ALU
  - now down to 10 operations
  - can encode in 4 bits

Code   Operation
00     add
01     addu
02     sub
03     subu
04     and
05     or
06     xor
07     nor
12     slt
13     sltu
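A behavioral sketch of such a 10-operation ALU is given below. It assumes the codes in the table are octal numerals (so 12 and 13 are binary 1010 and 1011, which would match the low four bits of the SLT and SLTU funct values); that reading is an interpretation, not something the slide states. Overflow detection is left out, so add/addu and sub/subu behave identically here. The names `alu`, `ALU_OPS`, and `to_signed` are made up for the example.

```python
MASK = 0xFFFFFFFF  # keep every result to 32 bits


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


# Operation codes written as octal literals (assumed reading of the table).
ALU_OPS = {
    0o00: lambda a, b: a + b,                               # add
    0o01: lambda a, b: a + b,                               # addu
    0o02: lambda a, b: a - b,                               # sub
    0o03: lambda a, b: a - b,                               # subu
    0o04: lambda a, b: a & b,                               # and
    0o05: lambda a, b: a | b,                               # or
    0o06: lambda a, b: a ^ b,                               # xor
    0o07: lambda a, b: ~(a | b),                            # nor
    0o12: lambda a, b: int(to_signed(a) < to_signed(b)),    # slt (signed compare)
    0o13: lambda a, b: int(a < b),                          # sltu (unsigned compare)
}


def alu(op, a, b):
    """Apply the selected operation to two 32-bit operands."""
    return ALU_OPS[op](a & MASK, b & MASK) & MASK


print(alu(0o02, 7, 6))             # 1
print(alu(0o12, 0xFFFFFFFF, 1))    # 1, because -1 < 1 when signed
print(alu(0o13, 0xFFFFFFFF, 1))    # 0, because 4294967295 > 1 when unsigned
```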

Addition & Subtraction
- Just like in grade school (carry/borrow 1s)

      0111        0111        0110
    + 0110      - 0110      - 0101
    ------      ------      ------
      1101        0001        0001

- Two's complement operations are easy: subtraction using addition of negative numbers

      0111              0111
    - 0110     ->     + 1010
    ------            ------
      0001            1 0001

- Overflow (result too large for finite computer word): e.g., adding two n-bit numbers does not yield an n-bit number

      0111
    + 0001
    ------
      1000

Building a 1-bit Binary Adder

[Figure: 1-bit Full Adder block with inputs A, B, carry_in and outputs S, carry_out.]

S = A xor B xor carry_in
carry_out = A&B | A&carry_in | B&carry_in   (majority function)

A   B   carry_in   carry_out   S
0   0   0          0           0
0   0   1          0           1
0   1   0          0           1
0   1   1          1           0
1   0   0          0           1
1   0   1          1           0
1   1   0          1           0
1   1   1          1           1

- How can we use it to build a 32-bit adder?
- How can we modify it easily to build an adder/subtractor?
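The two equations translate directly into bitwise operations. The sketch below (the function name `full_adder` is chosen here for illustration) prints the same truth table as above.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder on inputs that are each 0 or 1."""
    s = a ^ b ^ carry_in                                      # sum bit
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)     # majority function
    return s, carry_out


# Enumerate all eight input combinations.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, cout = full_adder(a, b, c)
            print(a, b, c, cout, s)
```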

Building 32-bit Adder

[Figure: chain of 32 1-bit full adders. Stage i takes A_i, B_i, and c_i and produces S_i and c_(i+1); c_0 = carry_in and c_32 = carry_out.]

- Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect...
- Ripple Carry Adder (RCA)
  - advantage: simple logic, so small (low cost)
  - disadvantage: slow and lots of glitching (so lots of energy consumption)
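A software model of the ripple carry chain might look like the following. The loop visits the bits from least to most significant, mirroring how the carry ripples through the stages; the names `full_adder` and `ripple_carry_add` are assumptions for this sketch.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_carry_add(a, b, carry_in=0, n_bits=32):
    """Add two n-bit patterns by chaining n full adders.

    Returns (n-bit sum, final carry_out).
    """
    result = 0
    carry = carry_in                       # c0
    for i in range(n_bits):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i                   # place S_i in bit position i
    return result, carry                   # carry is c_n after the last stage


print(ripple_carry_add(0b0111, 0b0110, n_bits=4))   # (13, 0), i.e. 1101
print(ripple_carry_add(0xFFFFFFFF, 1))              # (0, 1)
```

The sequential loop is also a reminder of why the hardware is slow: in the worst case the last sum bit cannot settle until the carry has passed through every earlier stage.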

Building 32-bit Adder/Subtractor
- Remember 2's complement is just
  - complement all the bits
  - add a 1 in the least significant bit

      A   0111              0111
      B - 0110     ->     + 1010

[Figure: the 32-bit ripple carry adder (inputs A_0..A_31 and B_0..B_31, outputs S_0..S_31, c_0 = carry_in, c_32 = carry_out) with an add/subt control line (0 = add, 1 = subt). Each B input passes through a selector that outputs B_i if control = 0 and !B_i if control = 1.]
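One way to model this circuit in software is sketched below. It assumes the add/subt control line also drives c_0, which supplies the "add a 1" step, and it uses an XOR with the control bit to choose between B_i and !B_i; the slide only specifies the selection behavior, so the XOR is one possible realization. The name `add_sub` is invented here.

```python
def add_sub(a, b, control, n_bits=32):
    """Ripple carry adder/subtractor: control = 0 adds, control = 1 subtracts.

    Returns (n-bit result, carry_out).
    """
    carry = control                        # assumed: control feeds c0, giving the "+1"
    result = 0
    for i in range(n_bits):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control     # B_i if control = 0, !B_i if control = 1
        s = a_i ^ b_i ^ carry
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry)
        result |= s << i
    return result, carry


print(add_sub(0b0111, 0b0110, 0, n_bits=4))   # (13, 0): 0111 + 0110 = 1101
print(add_sub(0b0111, 0b0110, 1, n_bits=4))   # (1, 1):  0111 - 0110 = 0001, carry out 1
```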

Overflow Detection and Effects
- Overflow: the result is too large to represent in the number of bits allocated
- When adding operands with different signs, overflow cannot occur! Overflow occurs when
  - adding two positives yields a negative
  - or, adding two negatives gives a positive
  - or, subtracting a negative from a positive gives a negative
  - or, subtracting a positive from a negative gives a positive
- On overflow, an exception (interrupt) occurs
  - Control jumps to predefined address for exception
  - Interrupted address (address of instruction causing the overflow) is saved for possible resumption
- Don't always want to detect (interrupt on) overflow
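The four overflow cases reduce to sign-bit comparisons, as in this sketch. The function names are made up for the example, and raising an exception is left out; the functions only report whether overflow occurred.

```python
def add_with_overflow(a, b, n_bits=32):
    """Signed add on n-bit patterns; returns (result, overflow flag)."""
    mask = (1 << n_bits) - 1
    sign = 1 << (n_bits - 1)
    a &= mask
    b &= mask
    result = (a + b) & mask
    # Overflow: operands have the same sign, and the result's sign differs.
    overflow = bool(~(a ^ b) & (a ^ result) & sign)
    return result, overflow


def sub_with_overflow(a, b, n_bits=32):
    """Signed subtract (a - b) on n-bit patterns; returns (result, overflow flag)."""
    mask = (1 << n_bits) - 1
    sign = 1 << (n_bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    # Overflow: operands have different signs, and the result's sign differs from a's.
    overflow = bool((a ^ b) & (a ^ result) & sign)
    return result, overflow


print(add_with_overflow(0b0111, 0b0001, n_bits=4))   # (8, True): 7 + 1 reads as -8
print(add_with_overflow(0b0111, 0b1010, n_bits=4))   # (1, False): 7 + (-6) = 1
print(sub_with_overflow(0b0111, 0b1111, n_bits=4))   # (8, True): 7 - (-1) reads as -8
```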

New MIPS Instructions

Instr                            Op Code    Example               Meaning

Arithmetic (R & I format)
add unsigned                     0 and 33   addu  $s1, $s2, $s3   $s1 = $s2 + $s3
subt unsigned                    0 and 35   subu  $s1, $s2, $s3   $s1 = $s2 - $s3
add imm. unsigned                9          addiu $s1, $s2, 6     $s1 = $s2 + 6

Data Transfer
load byte unsigned               36         lbu   $s1, 25($s2)    $s1 = Memory($s2+25)

Cond. Branch (I & R format)
set on less than unsigned        0 and 43   sltu  $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
set on less than imm. unsigned   11         sltiu $s1, $s2, 6     if ($s2<6) $s1=1 else $s1=0

- Sign extend: addiu, sltiu
- Zero extend: lbu
- No overflow detected: addu, subu, addiu, sltu, sltiu
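The extension rules are easy to mix up, so here is a behavioral sketch contrasting slti with sltiu and lb with lbu. These Python functions only model the register result and are named after the instructions for readability; the byte argument stands in for the value fetched from memory.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm):
    """Sign extend a 16-bit immediate to a 32-bit pattern."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if imm & 0x8000 else imm


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slti(rs, imm16):
    """Signed compare against the sign-extended immediate."""
    return int(to_signed32(rs) < to_signed32(sign_extend16(imm16)))


def sltiu(rs, imm16):
    """Unsigned compare; the immediate is still sign extended first."""
    return int((rs & MASK32) < sign_extend16(imm16))


def lb(byte):
    """Load byte: sign extend the 8-bit value to 32 bits."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte


def lbu(byte):
    """Load byte unsigned: zero extend the 8-bit value to 32 bits."""
    return byte & 0xFF


print(slti(5, 0xFFFF))    # 0: 5 < -1 is false
print(sltiu(5, 0xFFFF))   # 1: 5 < 0xFFFFFFFF is true
print(hex(lb(0x80)))      # 0xffffff80
print(hex(lbu(0x80)))     # 0x80
```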

Conclusion
- We can build an ALU to support the MIPS ISA
  - we can efficiently perform subtraction using two's complement
  - we can replicate a 1-bit ALU to produce a 32-bit ALU
- Important points about hardware
  - all of the gates are always working (concurrent)
  - the speed of a gate is affected by the number of inputs to the gate (fan-in) and the number of gates that the output is connected to (fan-out)
  - the speed of a circuit is affected by the number of gates in series (on the "critical path" or the "number of levels of logic")
- Our primary focus is comprehension; however,
  - clever changes to organization can improve performance (similar to using better algorithms in software)
