CSE 331 Computer Organization and Design Fall 2007 Week 6 Section 1: Mary Jane Irwin (www.cse.psu.edu/~mji) Section 2: Krishna Narayanan Course material on ANGEL: cms.psu.edu [adapted from D. Patterson slides]
Heads Up
Last week's material: addressing modes; assemblers, linkers and loaders; Exam 1 taken
This week's material: VHDL. Reading assignment: Y Chp 1-5, PH B.4
Next week's material: MIPS arithmetic and ALU design. Reading assignment: PH 3.1-3.5, B.5-B.6
Reminders: HW 5 is due Wednesday, Oct 10th (by 11:55pm). Quiz 4 will be closed Thursday, Oct 11th (by 11:55pm). Exam #2 is Thursday, Nov 8th, 6:30 to 7:45pm
Processor Organization Processor control needs to have the Ability to input instructions from memory Logic to control instruction sequencing and to issue signals that control the way information flows between the datapath components and the operations performed by them Processor datapath needs to have the Ability to load data from and store data to memory Interconnected components - functional units (e.g., ALU) and storage units (e.g., Register File) - for executing the ISA Need a way to describe the organization High level (block diagram) description Schematic (gate level) description Textual (simulation/synthesis level) description
Why Simulate First? Physical breadboarding (as in CSE 275) discrete components/lower scale integration precedes actual construction of the prototype verification of the initial design No longer possible as designs reach higher levels of integration! Simulation before construction - aka functional verification high level constructs means faster to design and test can play “what if” more easily limited performance (can’t usually simulate all possible input transitions) and limited accuracy (can’t usually model wiring delays accurately) Talk about special topics course this spring
Levels of Description of a Digital System (listed from most abstract to least abstract; lower levels are more accurate but slower to simulate)
Architectural: model is a high level programmer's view; written in your favorite programming language
Functional/Behavioral: model is a block diagram view
Register Transfer: model is in terms of datapath FUs, registers, busses; register xfer operations are clock phase accurate
Logic: model is in terms of logic gates; delay information can be specified for gates; digital waveforms
Circuit: model is in terms of circuits (electrical behavior); accurate analog waveforms
Tools: special languages + simulation systems for describing the inherent parallel activity in hardware (VHDL and Verilog); schematic capture + logic simulation package like LogicWorks
In some form, nearly all design activities of a microprocessor development are aimed at getting the SRTL [structural RTL] model right. We started writing a C program to model the essentials of the P6 engine … We called this program the grassman, because it wasn’t mature enough even for a strawman … We would write everything behaviorally at first, and then gradually substitute the SRTL versions in whatever order they became available. The Pentium Chronicles, Colwell, pg. 49
VHDL (VHSIC Hardware Description Language) Support design, documentation, simulation & verification, and synthesis of hardware Allows integrated design at behavioral and structural (gate) levels Concepts Design entity-architecture descriptions Time-based execution (discrete event simulation) model Entity == External Characteristics Design Entity-Architecture == Hardware Component Architecture (Body) == Internal Behavior or Structure
Entity Interface Externally visible characteristics Ports: channels of communication (inputs, outputs, clocks, control) Generic parameters: define class of components (timing characteristics, size, fan-out) entity name_of_component is port(a,b: in std_logic; y: out std_logic); end entity name_of_component;
Architecture Body
Internal behavior or structure of circuit
Declaration of module's internal signals
Description of behavior of circuit
concurrent behavioral description - collection of Concurrent Signal Assignment (CSA) statements executed concurrently
process behavioral description - CSAs and variable assignment statements within a process description
Description of structure of the circuit - system described in terms of the interconnections of its components
architecture behavior of name_of_component is
  signal s1,s2: std_logic;
begin
  -- description of behavior of ports and signals
end architecture behavior;
VHDL Example: nor-nor gate b y c entity nor_nor_logic is port (a,b,c: in std_logic; y: out std_logic); end entity nor_nor_logic; architecture concurrent_behavior of nor_nor_logic is signal t0: std_logic; begin t0 <= a nor b; y <= t0 nor c; end architecture concurrent_behavior; <= indicates a Concurrent Signal Assignment (CSA) like “real” logic nor_nor “process” is in an infinite loop
Modeling Delays Can model temporal, as well as functional behavior, with delays in CSAs t0 changes 1 ns after a or b changes entity nor_nor_logic is port (a,b,c: in std_logic; y: out std_logic); end entity nor_nor_logic; architecture concurrent_behavior of nor_nor_logic is signal t0: std_logic; begin t0 <= (a nor b) after 1 ns; y <= (t0 nor c) after 1 ns; end architecture concurrent_behavior; Ask them to think about what happens if the after annotation for y is 2 ns
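Waveforms and Timing
[Figure: nor-nor circuit (inputs a, b, c; internal signal t0; output y) with waveforms for t0 and y as abc changes from 101 to 000, drawn on a 1 ns grid]
Assumes unit delay model (gates have non-zero delay)
Shaded area is a glitch on y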
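The glitch on y can be reproduced in simulation. The following is a minimal self-checking sketch, not part of the original slides: the testbench name nor_nor_tb, the sample times, and the report strings are assumptions made for illustration. It also assumes the simulator's time resolution is finer than 1 ns, since it samples y at 6.5 ns.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity nor_nor_logic is
  port (a, b, c: in std_logic; y: out std_logic);
end entity nor_nor_logic;

architecture concurrent_behavior of nor_nor_logic is
  signal t0: std_logic;
begin
  t0 <= (a nor b) after 1 ns;
  y  <= (t0 nor c) after 1 ns;
end architecture concurrent_behavior;

library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical testbench: drives abc from 101 to 000 at 5 ns
entity nor_nor_tb is
end entity nor_nor_tb;

architecture test of nor_nor_tb is
  signal a: std_logic := '1';
  signal b: std_logic := '0';
  signal c: std_logic := '1';
  signal y: std_logic;
begin
  dut: entity work.nor_nor_logic(concurrent_behavior)
    port map (a => a, b => b, c => c, y => y);

  stimulus: process
  begin
    wait for 5 ns;              -- abc = 101 has settled: t0 = '0', y = '0'
    assert y = '0' report "y should be 0 for abc = 101" severity error;
    a <= '0'; c <= '0';         -- abc becomes 000 at 5 ns
    wait for 1500 ps;           -- now 6.5 ns: y was computed from the old t0
    assert y = '1' report "expected the glitch (y = 1) at 6.5 ns" severity error;
    wait for 1 ns;              -- now 7.5 ns: y has seen the new t0 = '1'
    assert y = '0' report "y should settle to 0 by 7.5 ns" severity error;
    wait;                       -- suspend forever; simulation ends
  end process stimulus;
end architecture test;
```

If all three assertions pass silently, y went 0, 1, 0 as in the waveform: the 1 ns pulse is the glitch caused by c reaching the second gate before the new t0 does.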
Signals
Digital systems are about signals, not variables
signal <= value expressions after time expression
signals (like t0 and y) are analogous to wires and change whenever their inputs change (resulting in a waveform)
std_logic conforms to the IEEE 1164 standard
library IEEE;
use IEEE.std_logic_1164.all;
entity nor_nor_logic is ...
Signal Resolution for IEEE 1164 Standard
When a signal has multiple drivers (e.g., a bus), the value of the resulting signal is determined by a resolution function
std_logic values:
U  Uninitialized
X  Forcing unknown
0  Forcing 0
1  Forcing 1
Z  High impedance
W  Weak unknown
L  Weak 0
H  Weak 1
-  Don't care
e.g., if one source is driving the shared signal to 1 and the other source to 0, the resulting value will be X
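A small sketch of resolution in action, assuming nothing beyond the standard std_logic_1164 package. The entity and signal names (resolution_demo, contested, shared, pulled) are made up for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical demo entity: each signal below has two drivers
entity resolution_demo is
end entity resolution_demo;

architecture test of resolution_demo is
  signal contested: std_logic;
  signal shared:    std_logic;
  signal pulled:    std_logic;
begin
  -- Two forcing drivers that disagree
  contested <= '1';
  contested <= '0';

  -- One active driver, one released (high impedance) driver
  shared <= '1';
  shared <= 'Z';

  -- A weak pull-up overridden by a forcing 0
  pulled <= 'H';
  pulled <= '0';

  check: process
  begin
    wait for 1 ns;   -- let the drivers take effect
    assert contested = 'X' report "1 against 0 should resolve to X" severity error;
    assert shared = '1'    report "1 against Z should resolve to 1" severity error;
    assert pulled = '0'    report "H against 0 should resolve to 0" severity error;
    wait;
  end process check;
end architecture test;
```

Two concurrent assignments to the same signal are legal here only because std_logic is a resolved type; the same architecture with an unresolved type such as bit would be rejected by the analyzer.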
A 2-input AND Gate with Delay = 2 ns
[Figure: AND gate with inputs a, b and output y]
(and is a reserved word in VHDL, so the entity is named and2 here)
entity and2 is
  port(a,b: in std_logic;
       y: out std_logic);
end entity and2;
architecture behavior of and2 is
begin
  y <= (a and b) after 2 ns;
end architecture behavior;
Bit-Vector Data Types Simple way to represent sets of signals (e.g. a 32-bit bus) std_logic_vector (31 downto 0) entity nand32 is port(a,b: in std_logic_vector (31 downto 0); y: out std_logic_vector (31 downto 0)); end entity nand32; architecture concurrent_behavior of nand32 is begin y <= a nand b; end architecture concurrent_behavior; The analyzer (compiler) expands the architecture description into thirty-two 2-input nand gates with the inputs connected appropriately
Model of Execution
CSAs are executed concurrently - textual order of the statements is irrelevant to correct operation
Two stage model of circuit execution
first stage: all CSAs with events occurring at the current time on signals on their right hand side are evaluated; all future events that are generated from this evaluation are scheduled on an events list
second stage: time is advanced to the time of the next event
VHDL programmer specifies
events - with CSAs
delays - with CSAs with delay annotation
concurrency - by having a distinct CSA for each signal
Discrete Event Simulation Model
Methodology for modeling the generation of events in physical systems
models the passage of time and the occurrence of events at various points in time
[Figure: event list for nor_nor_logic. With the simulator clock at 5 ns the list holds a,b,c@5ns (101 to 000), y,t0@6ns (0 to 1) and y@7ns (1 to 0); with the simulator clock at 6 ns it holds y,t0@6ns (0 to 1) and y@7ns (1 to 0)]
Discrete Event Simulation Steps Advance simulation time to that of the event with the smallest timestamp in the event list. Execute all events at this timestamp by updating signal values Execute the simulation models of all components affected by the new signal values Schedule any future events Repeat until the event list is empty, or a preset simulation time has expired
Computer design has moved far past the point where people can keep complex microarchitectures in their heads. Behavioral modeling is mandatory. Eventually, the project will migrate from a basis that is mostly behavioral modeling to one that is mostly structural. The Pentium Chronicles, Colwell, pg. 51 & 55
IBM on campus…
IBM Coop Night: open to all students. Meet recruiters from Software, Systems and Technology, Sales, Finance and Extreme Blue. October 16, 7-8:30pm, Wartik 110
IBM Day: Meet recruiters from IBM Consulting, Software, Systems and Technology, Sales and Extreme Blue. October 17, 11am-3pm, HUB, Alumni Hall
We look forward to talking with you about fulltime and co-op opportunities. Bring your resume. Interviews will be held on campus 10/18.
Review: VHDL Support design, documentation, simulation & verification, and synthesis of hardware Allows integrated design at behavioral and structural (gate) levels Concepts Design entity-architecture descriptions Time-based execution (discrete event simulation) model Entity == External Characteristics Design Entity-Architecture == Hardware Component Architecture (Body ) == Internal Behavior or Structure
Review: Entity-Architecture Features Entity defines externally visible characteristics Ports: channels of communication signal names for inputs, outputs, clocks, control Generic parameters: define class of components timing characteristics, size (fan-in), fan-out Architecture defines the internal behavior or structure of the circuit Declaration of internal signals Description of behavior collection of Concurrent Signal Assignment (CSA) statements (indicated by <=); can also model temporal behavior with the delay annotation one or more processes containing CSAs and (sequential) variable assignment statements (indicated by :=) Description of structure interconnections of components; underlying behavioral models of each component must be specified
Review: An Entity-Architecture Example b y c entity nor_nor_logic is port(a,b,c: in std_logic; y: out std_logic); end entity nor_nor_logic; architecture concurrent_behavior of nor_nor_logic is signal t0: std_logic; begin t0 <= (a nor b) after 1 ns; y <= (t0 nor c) after 1 ns; end architecture concurrent_behavior;
Signal Modes Signals in port declarations can be input signals (in), output signals (out) or bidirectional signals (inout) R S Q Qbar entity RS_latch is port(R,S: in std_logic; Q, Qbar: inout std_logic); end entity RS_latch; architecture concurrent_behavior of RS_latch is begin Q <= (Qbar nor R) after 1 ns; Qbar <= (Q nor S) after 1 ns; end architecture concurrent_behavior;
Constant Objects Constant parameters provide default values may be overridden on each instance attach value to symbol as attribute entity nor_nor_logic is port(a,b,c: in std_logic; y: out std_logic); end entity nor_nor_logic; architecture concurrent_behavior of nor_nor_logic is signal t0: std_logic; constant gate_delay: Time := 1 ns; begin t0 <= (a nor b) after gate_delay; y <= (t0 nor c) after gate_delay; end architecture concurrent_behavior;
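A constant declared inside an architecture is fixed for every instance. To get a default that each instance can override, VHDL provides generics in the entity declaration. The sketch below is an illustrative variant of the slide's circuit, not the original: the names nor_nor_generic and generic_tb and the 3 ns override are assumptions.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical variant of nor_nor_logic with the delay as a generic
entity nor_nor_generic is
  generic (gate_delay: time := 1 ns);   -- default value
  port (a, b, c: in std_logic; y: out std_logic);
end entity nor_nor_generic;

architecture concurrent_behavior of nor_nor_generic is
  signal t0: std_logic;
begin
  t0 <= (a nor b) after gate_delay;
  y  <= (t0 nor c) after gate_delay;
end architecture concurrent_behavior;

library IEEE;
use IEEE.std_logic_1164.all;

entity generic_tb is
end entity generic_tb;

architecture test of generic_tb is
  signal a, b, c: std_logic := '0';
  signal y_fast, y_slow: std_logic;
begin
  -- Uses the default gate_delay of 1 ns
  fast: entity work.nor_nor_generic
    port map (a => a, b => b, c => c, y => y_fast);

  -- Overrides gate_delay for this instance only
  slow: entity work.nor_nor_generic
    generic map (gate_delay => 3 ns)
    port map (a => a, b => b, c => c, y => y_slow);

  check: process
  begin
    wait for 4 ns;   -- fast path (2 gate delays = 2 ns) has settled
    assert y_fast = '0' report "fast instance should have settled" severity error;
    assert y_slow = 'U' report "slow instance should still be uninitialized" severity error;
    wait for 3 ns;   -- now 7 ns: slow path (2 x 3 ns) has settled too
    assert y_slow = '0' report "slow instance should have settled" severity error;
    wait;
  end process check;
end architecture test;
```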
Conditional Signal Assignment Statement
Conditional CSA: order is important - the first conditional expression that evaluates to true determines the output signal
[Figure: 4-to-1 mux with data inputs In0, In1, In2, In3 selected by S0 S1 = 00, 01, 10, 11, and output Z]
entity mux4 is
  port(In0,In1,In2,In3: in std_logic_vector (7 downto 0);
       S0,S1: in std_logic;
       Z: out std_logic_vector (7 downto 0));
end entity mux4;
architecture behavior of mux4 is
begin
  Z <= In0 after 5 ns when S0 = '0' and S1 = '0' else
       In1 after 5 ns when S0 = '0' and S1 = '1' else
       In2 after 5 ns when S0 = '1' and S1 = '0' else
       In3 after 5 ns when S0 = '1' and S1 = '1' else
       "00000000" after 5 ns;
end architecture behavior;
Selected Signal Assignment Statement
Selected CSA: all choices are evaluated, and exactly one must match (choices are mutually exclusive and must cover every value, hence when others)
entity reg_file is
  port(addr1,addr2: in std_logic_vector (1 downto 0);
       dout1, dout2: out std_logic_vector (31 downto 0));
end entity reg_file;
architecture behavior of reg_file is
  signal reg0: std_logic_vector (31 downto 0) := to_stdlogicvector (x"00000000");
  signal reg1,reg2: std_logic_vector (31 downto 0) := to_stdlogicvector (x"ffffffff");
begin
  with addr1 select
    dout1 <= reg0 after 5 ns when "00",
             reg1 after 5 ns when "01",
             reg2 after 5 ns when others;
  -- the second port is truncated on the slide; completed here to mirror the first
  with addr2 select
    dout2 <= reg0 after 5 ns when "00",
             reg1 after 5 ns when "01",
             reg2 after 5 ns when others;
end architecture behavior;
Motivation for Process Construct How would you build the logic for a 32x2 multiplexor given inverters and 2 input nands? SEL A[0] DOUT[0] DOUT[1] A[1] B[1] B[0] . . . TA(0) TA(1) TB(1) TB(0) SELbar 1 SEL A DOUT B Given the logic schematic, can you write the VHDL code? For lecture
MUX CSA Description
[Figure: 32x2 mux with inputs A, B, select SEL and output DOUT]
entity MUX32X2 is
  port(A,B: in std_logic_vector(31 downto 0);
       DOUT: out std_logic_vector(31 downto 0);
       SEL: in std_logic);
end entity MUX32X2;
architecture conc_behavior of MUX32X2 is
  signal TA,TB: std_logic_vector (31 downto 0);
  signal SELbar: std_logic;
begin
  SELbar <= not SEL after 1 ns;
  -- nand of a vector with a single std_logic needs VHDL-2008 (array/scalar logical operators)
  TA <= A nand SELbar after 2 ns;    -- expands to 32 gates
  TB <= B nand SEL after 2 ns;       -- expands to 32 gates
  DOUT <= TA nand TB after 2 ns;     -- expands to 32 gates
end architecture conc_behavior;
How can we describe the circuit in VHDL if we don't know what primitive gates we will be designing with?
Mux Process Description
[Figure: 32x2 mux with inputs A, B, select SEL and output DOUT]
entity MUX32X2 is
  port(A,B: in std_logic_vector(31 downto 0);
       DOUT: out std_logic_vector(31 downto 0);
       SEL: in std_logic);
end entity MUX32X2;
architecture process_behavior of MUX32X2 is
begin
  mux32x2_process: process(A, B, SEL)
  begin
    if (SEL = '0') then
      DOUT <= A after 5 ns;
    else
      DOUT <= B after 4 ns;
    end if;
  end process mux32x2_process;
end architecture process_behavior;
Process fires whenever a signal in the "sensitivity list" changes
Why a sensitivity list: the mux is combinational logic, so whenever one of the inputs changes (either the select input or one or both of the two data inputs) the process should reevaluate. If one of the signals on the sensitivity list changes, the change is reflected at the output signal after the delay given in the branch that executes (5 ns or 4 ns).
VHDL Process Features Process body is executed sequentially to completion in zero (simulation) time Delays are associated only with assignment of values to signals marked by CSAs <= operator Variable assignments take effect immediately marked by := operator Upon initialization all processes are executed once After initialization processes are data-driven activated by events on signals in sensitivity list waiting for the occurrence of specific events using wait statements
Process Programming Constructs
if-then-else: Boolean valued expressions are evaluated sequentially until first true is encountered
  if (expression1 = 'value1') then . . . elsif (expression2 = 'value2') then end if;
case: branches must cover all possible values for the case expression
  case (expression) is when 'value0' => . . . end case;
for loop: loop index declared (locally) by virtue of use in loop stmt; loop index cannot be assigned a value or altered in loop body; loop indices cannot be provided as parameters via a procedure call or as an input port
  for index in value1 to value2 loop
while loop: condition may involve variables modified within the loop
  while (condition) loop
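To see a case statement and a for loop together in one process, here is a small sketch. The entity mini_alu, its ports, the operation encoding, and the testbench are all hypothetical, chosen only to exercise the constructs listed above.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical 4-bit logic unit with a parity output
entity mini_alu is
  port (a, b:   in  std_logic_vector(3 downto 0);
        op:     in  std_logic_vector(1 downto 0);
        y:      out std_logic_vector(3 downto 0);
        parity: out std_logic);
end entity mini_alu;

architecture process_behavior of mini_alu is
begin
  alu_process: process (a, b, op)
    variable r: std_logic_vector(3 downto 0);
    variable p: std_logic;
  begin
    -- case: "when others" covers the remaining std_logic combinations
    case op is
      when "00"   => r := a and b;
      when "01"   => r := a or b;
      when "10"   => r := a xor b;
      when others => r := not a;
    end case;

    -- for loop: index i is declared implicitly and is read-only in the body
    p := '0';
    for i in r'range loop
      p := p xor r(i);
    end loop;

    y      <= r after 5 ns;
    parity <= p after 5 ns;
  end process alu_process;
end architecture process_behavior;

library IEEE;
use IEEE.std_logic_1164.all;

entity mini_alu_tb is
end entity mini_alu_tb;

architecture test of mini_alu_tb is
  signal a:  std_logic_vector(3 downto 0) := "1100";
  signal b:  std_logic_vector(3 downto 0) := "1010";
  signal op: std_logic_vector(1 downto 0) := "10";
  signal y:  std_logic_vector(3 downto 0);
  signal parity: std_logic;
begin
  dut: entity work.mini_alu
    port map (a => a, b => b, op => op, y => y, parity => parity);

  check: process
  begin
    wait for 6 ns;   -- xor result has appeared (5 ns delay)
    assert y = "0110" and parity = '0' report "xor case failed" severity error;
    op <= "00";
    wait for 6 ns;   -- and result has appeared
    assert y = "1000" and parity = '1' report "and case failed" severity error;
    wait;
  end process check;
end architecture test;
```

The variables r and p are updated immediately inside the process, so the parity loop sees the value just chosen by the case statement; only the final signal assignments carry a delay.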
Behavioral Description of a Register File write_cntrl src1_addr src1_data src2_addr 32 words dst_addr src2_data write_data 32 bits library IEEE; use IEEE.std_logic_1164.all; use IEEE.std_logic_arith.all; entity regfile is port(write_data: in std_logic_vector(31 downto 0); dst_addr,src1_addr,src2_addr: in UNSIGNED(4 downto 0); write_cntrl: in std_logic; src1_data,src2_data: out std_logic_vector(31 downto 0)); end entity regfile;
Behavioral Description of a Register File, con't
architecture process_behavior of regfile is
  type reg_array is array(0 to 31) of std_logic_vector (31 downto 0);
begin
  regfile_process: process(src1_addr,src2_addr,write_cntrl)
    variable data_array: reg_array := ( (X"00000000"), . . . (X"00000000"));
    variable addrofsrc1, addrofsrc2, addrofdst: integer;
  begin
    addrofsrc1 := conv_integer(src1_addr);
    addrofsrc2 := conv_integer(src2_addr);
    addrofdst := conv_integer(dst_addr);
    if write_cntrl = '1' then
      data_array(addrofdst) := write_data;
    end if;
    src1_data <= data_array(addrofsrc1) after 10 ns;
    src2_data <= data_array(addrofsrc2) after 10 ns;
  end process regfile_process;
end architecture process_behavior;
Process Construct with Wait Statement
[Figure: dff symbol with inputs D and clk and outputs Q and Qbar; positive edge-triggered]
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
entity dff is
  port(D,clk: in std_logic;
       Q,Qbar: out std_logic);
end entity dff;
architecture dff_behavior of dff is
begin
  output: process
  begin
    wait until (clk'event and clk = '1');
    Q <= D after 5 ns;
    Qbar <= not D after 5 ns;
  end process output;
end architecture dff_behavior;
Wait Statement Types
Wait statements specify conditions under which a process may resume execution after suspension
wait for time expression: suspends process for a period of time defined by the time expression
  wait for (20 ns);
wait on signal: suspends process until an event occurs on one (or more) of the signals
  wait on clk, reset, status;
wait until condition: suspends process until condition evaluates to specified Boolean
  wait until (clk'event and clk = '1');
wait: suspends process indefinitely
Process resumes execution at the first statement following the wait statement
waits allow us to build processes where we can suspend operation at multiple points within the process - not just at the beginning with a sensitivity list
Signal Attributes
Attributes are used to return various types of information about a signal
Function attribute: Function
signal_name'event: Boolean value signifying a change in value on this signal
signal_name'active: Boolean value signifying an assignment made to this signal (may not be a new value!)
signal_name'last_event: Time since the last event on this signal
signal_name'last_active: Time since the signal was last active
signal_name'last_value: Previous value of this signal
Things to Remember About Processes
A process must have either a sensitivity list or at least one wait statement
A process cannot have both a sensitivity list and a wait statement
Remember, all processes are executed once when the simulation is started
Don't confuse signals and variables.
Signals are declared either in the port definitions in the entity description or as internal signals in the architecture description. They are used in CSAs. Signals will be updated only after the next simulation cycle.
Variables exist only inside architecture process descriptions. They are used in variable assignment statements. Variables are updated immediately.
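The wait forms can be combined in one small testbench. This is a sketch under assumed names and timings (wait_demo, a 20 ns clock period, a 5 ns register delay), not code from the slides.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical demo: one process per wait style
entity wait_demo is
end entity wait_demo;

architecture test of wait_demo is
  signal clk: std_logic := '0';
  signal d:   std_logic := '0';
  signal q:   std_logic := '0';
begin
  -- "wait for": toggle clk every 10 ns (rising edges at 0, 20, 40, ... ns)
  clock_gen: process
  begin
    for i in 1 to 10 loop
      clk <= not clk;
      wait for 10 ns;
    end loop;
    wait;                       -- plain "wait": suspend forever
  end process clock_gen;

  -- "wait until": positive edge-triggered register with 5 ns delay
  reg: process
  begin
    wait until (clk'event and clk = '1');
    q <= d after 5 ns;
  end process reg;

  stimulus: process
  begin
    wait for 15 ns;
    d <= '1';                   -- changes between clock edges
    wait on q;                  -- "wait on": resume at the next event on q
    assert q = '1' report "q should follow d" severity error;
    assert now = 25 ns
      report "q should change 5 ns after the 20 ns edge" severity error;
    wait;
  end process stimulus;
end architecture test;
```

Note that stimulus suspends at three different points, which a sensitivity list cannot express.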
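The signal versus variable distinction is easy to check in a simulator. A minimal sketch follows; the entity name sig_var_demo and the report strings are made up for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical demo of update timing for signals and variables
entity sig_var_demo is
end entity sig_var_demo;

architecture test of sig_var_demo is
  signal s: std_logic := '0';
begin
  demo: process
    variable v: std_logic := '0';
  begin
    v := '1';                   -- variable: takes effect immediately
    s <= '1';                   -- signal: only scheduled for the next cycle

    assert v = '1' report "variable should already be 1" severity error;
    assert s = '0' report "signal should still hold its old value" severity error;

    wait for 0 ns;              -- suspend for one delta cycle
    assert s = '1' report "signal should be updated after the delta cycle" severity error;

    wait;
  end process demo;
end architecture test;
```

Even with no "after" clause, the signal assignment does not become visible until the process suspends and the simulator runs its next cycle.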
Finite State Machine “Structure” z comb b Fetch PC = PC+4 Decode Exec Q(0) dff D(0) Q(1) dff D(1) clk
Structural VHDL Model System is described by its component interconnections assumes we have previously designed entity-architecture descriptions for both comb and dff with behavioral models a in1 z out1 b comb in2 c_state(1) nxt_state(1) c_state(0) nxt_state(0) s1(0) s1(1) s2(0) s2(1) Q(0) dff D(0) For lecture Qbar(0) Q(1) dff D(1) Qbar(1) clk clk
Finite State Machine Structural VHDL entity seq_circuit is port(in1,in2,clk: in std_logic; out1: out std_logic); end entity seq_circuit; architecture structural of seq_circuit is component comb is port(a,b: in std_logic; z: out std_logic; c_state: in std_logic_vector (1 downto 0); nxt_state: out std_logic_vector (1 downto 0)); end component comb; component dff is port(D,clk: in std_logic; Q,Qbar: out std_logic); end component dff; for all: comb use entity work.comb(comb_behavior); for all: dff use entity work.dff(dff_behavior); signal s1,s2: std_logic_vector (1 downto 0); begin C0:comb port map(a=>in1,b=>in2,c_state=>s1,z=>out1, nxt_state=>s2); D0:dff port map(D=>s2(0),clk=>clk,Q=>s1(0),Qbar=>open); D1:dff port map(D=>s2(1),clk=>clk,Q=>s1(1),Qbar=>open); end architecture structural; go back and label class handout diagram to show structural relationship
Summary Introduction to VHDL A language to describe hardware entity = symbol, architecture ~ schematic, signals = wires Inherently concurrent (parallel) Has time as concept Behavioral descriptions of a component can be specified using CSAs can be specified using one or more processes and sequential statements Structural descriptions of a system are specified in terms of its interconnections behavioral models of each component must be provided