Presentation is loading. Please wait.

Presentation is loading. Please wait.

331 W06.1Fall 2003 14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 6 [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane.

Similar presentations


Presentation on theme: "331 W06.1Fall 2003 14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 6 [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane."— Presentation transcript:

1 331 W06.1Fall 2003 14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 6 [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]

2 331 W06.2Fall 2003 Head’s Up  This week’s material l VHDL modeling -Reading assignment – Y, Chapter 4 and 5 l MIPS arithmetic operations -Reading assignment – PH 4.1 through 4.3  Next week’s material l MIPS logic and multiply instructions -Reading assignment – PH 4.4 l MIPS ALU design -Reading assignment – PH 4.5

3 331 W06.3Fall 2003 Review: Entity-Architecture Features  Entity defines externally visible characteristics l Ports: channels of communication  Architecture defines the internal behavior or structure l Declaration of internal signals l Description of behavior -concurrent behavioral description: collection of CSA’s -process behavioral description: CSAs and variable assignment statements within a process description -structural description: system described in terms of the interconnections of its components Design Entity-Architecture == Hardware Component Entity == External Characteristics Architecture (Body ) == Internal Behavior or Structure

4 331 W06.4Fall 2003 Review: Model of Execution  CSA’s are executed concurrently - textural order of the statements is irrelevant to the correct operation  Two stage model of circuit execution l first stage -all CSA’s with events occurring at the current time on signals on their right hand side (RHS) are evaluated -all future events that are generated from this evaluation are scheduled l second stage -time is advanced to the time of the next event  VHDL programmer specifies l events - with CSA’s l delays - with CSA’s with delay annotation l concurrency - by having a distinct CSA for each signal

5 331 W06.5Fall 2003 Review: Signal Resolution  Resolving values of pairs of std_logic type signals l When a signal has multiple drivers (e.g., a bus), the value of the resulting signal is determined by a resolution function U unknow n X forcing unknow n 01 Z high imped W weak unknow n L weak 0 H weak 1 - don’t care UUUUUUUUUU XUXXXXXXXX 0UX0X0000X 1UXX11111X ZUX01ZWLHX WUX01WWWWX LUX01LWLWX HUX01HWWHX -UXXXXXXXX

6 331 W06.6Fall 2003 Motivation for Process Construct  How would you build the logic (and the VHDL code) for a 32 by 2 multiplexor given inverters and 2 input nands? 1 0 SEL A DOUT B

7 331 W06.7Fall 2003 MUX CSA Description 1 0 SEL A DOUT B  How can we describe the circuit in VHDL if we don’t know what primitive gates we will be designing with? entity MUX32X2 is port(A,B: in std_logic_vector(31 downto 0); DOUT: out std_logic_vector(31 downto 0); SEL: in std_logic); end MUX32X2;

8 331 W06.8Fall 2003 Mux Process Description  Process fires whenever a signal in the “sensitivity list” changes entity MUX32X2 is port(A,B: in std_logic_vector(31 downto 0); DOUT: out std_logic_vector(31 downto 0); SEL: in std_logic); end MUX32X2; architecture process_behavior of MUX32X2 is begin mux32x2_process: process(A, B, SEL) begin if (SEL = ‘0’) then DOUT <= A after 5 ns; else DOUT <= B after 4 ns; end if; end process mux32x2_process; end process_behavior; 1 0 SEL A DOUT B

9 331 W06.9Fall 2003 VHDL Process Features  Process body is executed sequentially to completion in zero (simulation) time  Delays are associated only with assignment of values to signals l marked by CSAs <= operator  Variable assignments take effect immediately l marked by := operator  Upon initialization all processes are executed once  After initialization processes are data-driven l activated by events on signals in sensitivity list l waiting for the occurrence of specific events using wait statements

10 331 W06.10Fall 2003 Process Programming Constructs  if-then-else l Boolean valued expressions are evaluated sequentially until first true is encountered  case l branches must cover all possible values for the case expression  for loop l loop index declared (locally) by virtue of use in loop stmt l loop index cannot be assigned a value or altered in loop body  while loop l condition may involve variables modified within the loop while (condition) loop for index in value1 to value2 loop case (expression) is when ‘value0’ =>... end case; if (expression1 = ‘value1’) then... elsif (expression2 = ‘value2’) then... end if;

11 331 W06.11Fall 2003 Behavioral Description of a Register File library IEEE; use IEEE.std_logic_1164.all; use IEEE.std_logic_arith.all; entity regfile is port(write_data: in std_logic_vector(31 downto 0); dst_addr,src1_addr,src2_addr: in UNSIGNED(4 downto 0); write_cntrl: in std_logic; src1_data,src2_data: out std_logic_vector(31 downto 0)); end regfile; Register File src1_addr src2_addr dst_addr write_data 32 bits src1_data src2_data 32 words write_cntrl

12 331 W06.12Fall 2003 Behavioral Description of a Register File, con’t architecture process_behavior of regfile is type reg_array is array(0 to 31) of std_logic_vector (31 downto 0); begin regfile_process: process(src1_addr,src2_addr,write_cntrl) variable data_array: reg_array := ( (X”00000000”),... (X”00000000”)); variable addrofsrc1, addrofsrc2, addrofdst: integer; begin addrofsrc1 := conv_integer(src1_addr); addrofsrc2 := conv_integer(src2_addr); addrofdst := conv_integer(dst_addr); if write_cntrl = ‘1’ then data_array(addrofdst) := write_data; end if; src1_data <= data_array(addrofsrc1) after 10 ns; src2_data <= data_array(addrofsrc2) after 10 ns; end process regfile_process; end process_behavior;

13 331 W06.13Fall 2003 Process Construct with Wait Statement library IEEE; use IEEE.std_logic_1164.all; use IEEE.std_logic_arith.all; entity dff is port(D,clk: in std_logic; Q,Qbar: out std_logic); end dff; architecture dff_behavior of dff is begin output: process begin wait until (clk’event and clk = ‘1’); Q <= D after 5 ns; Qbar <= not D after 5 ns; end process output; end dff_behavior; D Q clk Qbar positive edge-triggered dff

14 331 W06.14Fall 2003 Wait Statement Types  Wait statements specify conditions under which a process may resume execution after suspension l wait for time expression -suspends process for a period of time defined by the time expression l wait on signal -suspends process until an event occurs on one (or more) of the signals l wait until condition -suspends process until condition evaluates to specified Boolean l wait  Process resumes execution at the first statement following the wait statement wait until (clk’event and clk = ‘1’); wait for (20 ns); wait on clk, reset, status;

15 331 W06.15Fall 2003 Signal Attributes Function attributeFunction signal_name’event Boolean value signifying a change in value on this signal signal_name’active Boolean value singifying an assignment made to this signal (may not be a new value!) signal_name’last_event Time since the last event on this signal signal_name’last_active Time since the signal was last active signal_name’last_value Previous value of this signal  Attributes are used to return various types of information about a signal

16 331 W06.16Fall 2003 Things to Remember About Processes  A process must have either a sensitivity list or at least one wait statement  A process cannot have both a sensitivity list and a wait statement  Remember, all processes are executed once when the simulation is started  Don’t confuse signals and variables. l Signals are declared either in the port definitions in the entity description or as internal signals in the architecture description. They are used in CSAs. Signals will be updated only after the next simulation cycle. l Variable exist only inside architecture process descriptions. They are used in variable assignment statements. Variables are updated immediately.

17 331 W06.17Fall 2003 Finite State Machine “Structure” D(0) Q(0) a clk b z comb dff D(1) Q(1) Fetch PC = PC+4 DecodeExec

18 331 W06.18Fall 2003 Structural VHDL Model in1 Qbar(0) clk in2 out1  System is described by its component interconnections l assumes we have previously designed entity-architecture descriptions for both comb and dff with behavioral models comb c_state(1)nxt_state(1) b a z clk D(0) Q(0) dff D(1) Q(1) Qbar(1) nxt_state(0)c_state(0)

19 331 W06.19Fall 2003 Finite State Machine Structural VHDL entity seq_circuit is port(in1,in2,clk: in std_logic; out1: out std_logic); end seq_circuit; architecture structural of seq_circuit is component comb port(a,b: in std_logic; z: out std_logic; c_state: in std_logic_vector (1 downto 0); nxt_state: out std_logic_vector (1 downto 0)); end component; component dff port(D,clk: in std_logic; Q,Qbar: out std_logic); end component; for all: comb use entity work.comb(comb_behavior); for all: dff use entity work.dff(dff_behavior); signal s1,s2: std_logic_vector (1 downto 0); begin C0:comb port map(a=>in1,b=>in2,c_state=>s1,z=>out1, nxt_state=>s2); D0:dffport map(D=>s2(0),clk=>clk,Q=>s1(0),Qbar=>open); D1:dffport map(D=>s2(1),clk=>clk,Q=>s1(1),Qbar=>open); end structural;

20 331 W06.20Fall 2003 Summary  Introduction to VHDL l A language to describe hardware -entity = symbol, architecture ~ schematic, signals = wires l Inherently concurrent (parallel) l Has time as concept l Behavioral descriptions of a component -can be specified using CSAs -can be specified using one or more processes and sequential statements l Structural descriptions of a system are specified in terms of its interconnections -behavioral models of each component must be provided

21 331 W06.21Fall 2003 Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design. The Mythical Man-Month, Brooks, pg. 43

22 331 W06.22Fall 2003 Review: MIPS ISA CategoryInstrOp CodeExampleMeaning Arithmetic (R & I format) add0 and 32add $s1, $s2, $s3$s1 = $s2 + $s3 subtract0 and 34sub $s1, $s2, $s3$s1 = $s2 - $s3 add immediate8addi $s1, $s2, 6$s1 = $s2 + 6 or immediate13ori $s1, $s2, 6$s1 = $s2 v 6 Data Transfer (I format) load word35lw $s1, 24($s2)$s1 = Memory($s2+24) store word43sw $s1, 24($s2)Memory($s2+24) = $s1 load byte32lb $s1, 25($s2)$s1 = Memory($s2+25) store byte40sb $s1, 25($s2)Memory($s2+25) = $s1 load upper imm15lui $s1, 6$s1 = 6 * 2 16 Cond. Branch (I & R format) br on equal4beq $s1, $s2, Lif ($s1==$s2) go to L br on not equal5bne $s1, $s2, Lif ($s1 !=$s2) go to L set on less than0 and 42slt $s1, $s2, $s3if ($s2<$s3) $s1=1 else $s1=0 set on less than immediate 10slti $s1, $s2, 6if ($s2<6) $s1=1 else $s1=0 Uncond. Jump (J & R format) jump2j 2500go to 10000 jump register0 and 8jr $t1go to $t1 jump and link3jal 2500go to 10000; $ra=PC+4

23 331 W06.23Fall 2003 Review: MIPS Organization, so far Processor Memory 32 bits 2 30 words read/write addr read data write data word address (binary) 0…0000 0…0100 0…1000 0…1100 1…1100 Register File src1 addr src2 addr dst addr write data 32 bits src1 data src2 data 32 registers ($zero - $ra) 32 5 5 5 PC ALU 32 0123 7654 byte address (big Endian) Fetch PC = PC+4 DecodeExec Add 32 4 Add 32 br offset

24 331 W06.24Fall 2003 Arithmetic  Where we've been: l Abstractions: -Instruction Set Architecture (ISA) -Assembly and machine language  What's up ahead: l Implementing the architecture (in VHDL) 32 m (operation) result A B ALU 4 zeroovf 1 1

25 331 W06.25Fall 2003 ALU VHDL Representation entity ALU is port(A, B: in std_logic_vector (31 downto 0); m: in std_logic_vector (3 downto 0); result: out std_logic_vector (31 downto 0); zero: out std_logic; ovf: out std_logic) end ALU; architecture process_behavior of ALU is... begin ALU: process begin... result := A + B;... end process ALU; end process_behavior;

26 331 W06.26Fall 2003 Number Representation  Bits are just bits (have no inherent meaning) l conventions define the relationships between bits and numbers  Binary numbers (base 2) - integers 0000  0001  0010  0011  0100  0101  0110  0111  1000  1001 ... l in decimal from 0 to 2 n -1 for n bits  Of course, it gets more complicated l storage locations (e.g., register file words) are finite, so have to worry about overflow (i.e., when the number is too big to fit into 32 bits) l have to be able to represent negative numbers, e.g., how do we specify -8 in addi$sp, $sp, -8#$sp = $sp - 8 l in real systems have to provide for more that just integers, e.g., fractions and real numbers (and floating point)

27 331 W06.27Fall 2003 Possible Representations Sign Mag.Two’s Comp.One’s Comp. 1000 = -8 1111 = -71001= -71000 = -7 1110 = -61010 = -61001 = -6 1101 = -51011 = -51010 = -5 1100 = -4 1011 = -4 1011 = -31101 = -31100 = -3 1010 = -21110 = -21101 = -2 1001 = -11111 = -11110 = -1 1000 = -01111 = -0 0000 = +00000 = 00000 = +0 0001 = +1 0010 = +2 0011 = +3 0100 = +4 0101 = +5 0110 = +6 0111 = +7  Issues: balance number of zeros ease of operations  Which one is best? Why?

28 331 W06.28Fall 2003  32-bit signed numbers (2’s complement): 0000 0000 0000 0000 0000 0000 0000 0000 two = 0 ten 0000 0000 0000 0000 0000 0000 0000 0001 two = + 1 ten 0000 0000 0000 0000 0000 0000 0000 0010 two = + 2 ten... 0111 1111 1111 1111 1111 1111 1111 1110 two = + 2,147,483,646 ten 0111 1111 1111 1111 1111 1111 1111 1111 two = + 2,147,483,647 ten 1000 0000 0000 0000 0000 0000 0000 0000 two = – 2,147,483,648 ten 1000 0000 0000 0000 0000 0000 0000 0001 two = – 2,147,483,647 ten 1000 0000 0000 0000 0000 0000 0000 0010 two = – 2,147,483,646 ten... 1111 1111 1111 1111 1111 1111 1111 1101 two = – 3 ten 1111 1111 1111 1111 1111 1111 1111 1110 two = – 2 ten 1111 1111 1111 1111 1111 1111 1111 1111 two = – 1 ten  What if the bit string represented addresses? l need operations that also deal with only positive (unsigned) integers maxint minint MIPS Representations

29 331 W06.29Fall 2003 Review: Signed Binary Representation 2’s compdecimal 1000-8 1001-7 1010-6 1011-5 1100-4 1101-3 1110-2 1111 00000 00011 00102 00113 01004 01015 01106 01117 2 3 - 1 = 1011 then add a 1 1010 complement all the bits -(2 3 - 1) = -2 3 =

30 331 W06.30Fall 2003  Negating a two's complement number: complement all the bits and add a 1 l remember: “negate” and “invert” are quite different!  Converting n-bit numbers into numbers with more than n bits: l MIPS 16-bit immediate gets converted to 32 bits for arithmetic copy the most significant bit (the sign bit) into the other bits 0010 -> 0000 0010 1010 -> 1111 1010 sign extension versus zero extend ( lb vs. lbu ) Two's Complement Operations

31 331 W06.31Fall 2003 Goal: Design a ALU for the MIPS ISA  Must support the Arithmetic/Logic operations of the ISA  Tradeoffs of cost and speed based on frequency of occurrence, hardware budget

32 331 W06.32Fall 2003 MIPS Arithmetic and Logic Instructions  Signed arithmetic generates overflow, but no carry out R-type: I-Type: 3125201550 opRsRtRdfunct opRsRtImmed 16 Type opfunct ADDI001000xx ADDIU001001xx SLTI001010xx SLTIU001011xx ANDI001100xx ORI001101xx XORI001110xx LUI001111xx Type op funct ADD000000100000 ADDU000000100001 SUB000000100010 SUBU000000100011 AND000000100100 OR000000100101 XOR000000100110 NOR000000100111 Type opfunct 000000101000 000000101001 SLT000000101010 SLTU000000101011 000000101100

33 331 W06.33Fall 2003 Design Trick: Divide & Conquer  Break the problem into simpler problems, solve them and glue together the solution  Example: assume the immediates have been taken care of before the ALU l now down to 10 operations l can encode in 4 bits 00add 01addu 02sub 03subu 04and 05or 06xor 07nor 12slt 13sltu

34 331 W06.34Fall 2003  Just like in grade school (carry/borrow 1s) 0111 0111 0110 + 0110- 0110- 0101  Two's complement operations easy subtraction using addition of negative numbers 0111  0111 - 0110  + 1010  Overflow (result too large for finite computer word): e.g., adding two n-bit numbers does not yield an n-bit number 0111 + 0001 1000 Addition & Subtraction

35 331 W06.35Fall 2003 Building a 1-bit Binary Adder 1 bit Full Adder A B S carry_in carry_out S = A xor B xor carry_in carry_out = A  B v A  carry_in v B  carry_in (majority function)  How can we use it to build a 32-bit adder?  How can we modify it easily to build an adder/subtractor? ABcarry_incarry_outS 00000 00101 01001 01110 10001 10110 11010 11111

36 331 W06.36Fall 2003 Building 32-bit Adder 1-bit FA A0A0 B0B0 S0S0 c 0 =carry_in c1c1 1-bit FA A1A1 B1B1 S1S1 c2c2 A2A2 B2B2 S2S2 c3c3 c 32 =carry_out 1-bit FA A 31 B 31 S 31 c 31...  Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect...  Ripple Carry Adder (RCA) advantage: simple logic, so small (low cost) disadvantage: slow and lots of glitching (so lots of energy consumption)

37 331 W06.37Fall 2003 Building 32-bit Adder/Subtractor  Remember 2’s complement is just complement all the bits add a 1 in the least significant bit A 0111  0111 B - 0110  + 1010 1-bit FA S0S0 c 0 =carry_in c1c1 1-bit FA S1S1 c2c2 S2S2 c3c3 c 32 =carry_out 1-bit FA S 31 c 31... A0A0 A1A1 A2A2 A 31 B0B0 B1B1 B2B2 B 31 add/subt B0B0 control (0=add,1=subt) B 0 if control = 0, !B 0 if control = 1

38 331 W06.38Fall 2003 Overflow Detection and Effects  Overflow: the result is too large to represent in the number of bits allocated  When adding operands with different signs, overflow cannot occur! Overflow occurs when l adding two positives yields a negative l or, adding two negatives gives a positive l or, subtract a negative from a positive gives a negative l or, subtract a positive from a negative gives a positive  On overflow, an exception (interrupt) occurs l Control jumps to predefined address for exception l Interrupted address (address of instruction causing the overflow) is saved for possible resumption  Don't always want to detect (interrupt on) overflow

39 331 W06.39Fall 2003 New MIPS Instructions CategoryInstrOp CodeExampleMeaning Arithmetic (R & I format) add unsigned0 and 33addu $s1, $s2, $s3$s1 = $s2 + $s3 subt unsigned0 and 35subu $s1, $s2, $s3$s1 = $s2 - $s3 add imm. unsigned 9addiu $s1, $s2, 6$s1 = $s2 + 6 Data Transfer load byte unsigned 36lbu $s1, 25($s2)$s1 = Memory($s2+25) Cond. Branch (I & R format) set on less than unsigned 0 and 43sltu $s1, $s2, $s3if ($s2<$s3) $s1=1 else $s1=0 set on less than imm. unsigned 11sltiu $s1, $s2, 6if ($s2<6) $s1=1 else $s1=0  Sign extend - addiu, sltiu  Zero extend - lbu  No overflow detected - addu, subu, addiu, sltu, sltiu

40 331 W06.40Fall 2003 Conclusion  We can build an ALU to support the MIPS ISA l we can efficiently perform subtraction using two’s complement l we can replicate a 1-bit ALU to produce a 32-bit ALU  Important points about hardware l all of the gates are always working (concurrent) l the speed of a gate is affected by the number of inputs to the gate (fan-in) and the number of gates that the output is connected to (fan-out) l the speed of a circuit is affected by the number of gates in series (on the “critical path” or the “number of levels of logic”)  Our primary focus: comprehension, however, l Clever changes to organization can improve performance (similar to using better algorithms in software)


Download ppt "331 W06.1Fall 2003 14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 6 [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane."

Similar presentations


Ads by Google