Modeling VHDL in POSE
Overview
- Motivation
- Quick Introduction to VHDL
- Mapping VHDL to POSE (the Translator)
- Simulator Overview

Motivation
- Circuit Complexity on the Rise (# Transistors)
  - P4 (Prescott = 125 million, EE = 178 million)
  - Athlon 64 (Venice = 114 million, X2 = million)
  - NVidia 7800 GTX – 302 million
  - ATI X1800 Series – 321 million
- Simulation is becoming a bottleneck in the design cycle (it can take days to run a simulation)
- Use multiple processors to speed up simulation

VHDL - Intro
- VHSIC Hardware Description Language
  - VHSIC: Very High Speed Integrated Circuit
- Language for modeling ICs
- Two ways of describing ICs
  - Structural – Model the circuit using gate operators
  - Behavioral – Using code (ifs, loops, etc.)
- Using simulation, logic designs can be verified

VHDL – Example

    entity AND_GATE is
      port (A, B: in BIT; Y: out BIT);
    end AND_GATE;

    architecture AND_GATE_struct of AND_GATE is
    begin
      Y <= A and B;
    end AND_GATE_struct;

VHDL – Example (II)

    entity AND_GATE is
      port (A, B: in BIT; Y: out BIT);
    end AND_GATE;

    architecture AND_GATE_behav of AND_GATE is
    begin
      process (A, B) begin
        if (A = '1') and (B = '1') then
          Y <= '1';
        else
          Y <= '0';
        end if;
      end process;
    end AND_GATE_behav;

VHDL – Example (III)

    entity NOT_GATE is
      port (A: in BIT; Y: out BIT);
    end NOT_GATE;

    entity AND_GATE is
      port (A, B: in BIT; Y: out BIT);
    end AND_GATE;

    entity NAND_GATE is
      port (A, B: in BIT; Y: out BIT);
    end NAND_GATE;

    architecture NAND_GATE_struct of NAND_GATE is
      signal andOutput : BIT;
      component NOT_GATE port (A: in BIT; Y: out BIT); end component;
      component AND_GATE port (A, B: in BIT; Y: out BIT); end component;
    begin
      myAndGate : AND_GATE port map (A=>A, B=>B, Y=>andOutput);
      myNotGate : NOT_GATE port map (A=>andOutput, Y=>Y);
    end NAND_GATE_struct;

VHDL Kernel
- Continuously loop through simulation cycles
- Simulation Cycles
  - Computation/Processing
  - Calculate simulation time for next simulation cycle based on ‘earliest next event’
  - If time advances, move on to the next simulation cycle
  - Otherwise, enter a delta cycle
- Delta Cycle
  - Similar to a simulation cycle, but with less computation (e.g., no postponed processes), and simulation time has not moved forward from the last cycle

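The loop described on this slide can be sketched in a few lines of plain C++. This is only an illustration under simplifying assumptions: `Kernel`, `schedule`, `run`, `timedCycles`, `deltaCycles`, and `demoTrace` are hypothetical names, not part of POSE or of any real VHDL kernel, and postponed processes, signal drivers, and resolution are left out. A zero-delay event scheduled during a cycle keeps the time unchanged and is counted as a delta cycle.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical miniature kernel; not POSE and not a real VHDL simulator.
struct Kernel {
    using Time = std::uint64_t;
    using Action = std::function<void(Kernel&)>;

    Time now = 0;              // current simulation time
    unsigned delta = 0;        // delta cycles run so far at the current time
    unsigned timedCycles = 0;  // cycles in which simulation time advanced
    unsigned deltaCycles = 0;  // cycles in which it did not
    std::multimap<Time, Action> pending;  // events ordered by time

    // Schedule an action 'delay' units from now; a delay of 0 means the
    // action runs in the next delta cycle at the same simulation time.
    void schedule(Time delay, Action a) {
        pending.emplace(now + delay, std::move(a));
    }

    void run() {
        bool first = true;
        while (!pending.empty()) {
            Time next = pending.begin()->first;  // earliest next event
            if (first || next > now) {           // time advances: new cycle
                now = next;
                delta = 0;
                ++timedCycles;
            } else {                             // same time: delta cycle
                ++delta;
                ++deltaCycles;
            }
            first = false;
            // Take every event of this cycle out of the queue before running
            // any of them, so events they schedule land in a later cycle.
            auto range = pending.equal_range(next);
            std::vector<Action> batch;
            for (auto it = range.first; it != range.second; ++it)
                batch.push_back(it->second);
            pending.erase(range.first, range.second);
            for (auto& a : batch) a(*this);
        }
    }
};

// Usage sketch: an event at time 10 triggers a zero-delay follow-up, which
// in turn schedules an event 5 time units later. Returns (time, delta) pairs.
std::vector<std::pair<Kernel::Time, unsigned>> demoTrace(Kernel& k) {
    std::vector<std::pair<Kernel::Time, unsigned>> trace;
    auto record = [&trace](Kernel& s) { trace.emplace_back(s.now, s.delta); };
    k.schedule(10, [&](Kernel& s) {
        record(s);
        s.schedule(0, [&](Kernel& s2) {
            record(s2);
            s2.schedule(5, [&](Kernel& s3) { record(s3); });
        });
    });
    k.run();
    return trace;
}
```

In this sketch the follow-up runs at the same simulation time as the event that caused it, but in a later delta cycle, which is the ordering the slide describes.
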
Translator
- Source (VHDL) to Source (POSE/C++)
- Only a subset of VHDL so far
- Generates POSE/Charm++/C++ Code
- Simple checks (identifiers are valid, etc.)
- C++ provides Portable Code

Mapping VHDL to C++
- Each Component becomes a Poser (Object in C++)
  - Granularity of the simulation can be controlled by the user
- Signals and shared variables become member variables
- Processes become member functions
  - Contexts created for each process to hold locals
- POSE Event Queues hold pending wire changes

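As a rough picture of this mapping, the AND_GATE entity from the earlier example might come out looking something like the plain C++ class below. This is a hand-written sketch, not translator output: a real poser is declared through the POSE/Charm++ machinery, and every name here (`AndGate`, `ProcessContext`, `process`, `setA`, `setB`) is hypothetical. It also applies the assignment to Y immediately, whereas a VHDL signal assignment takes effect in a later cycle.

```cpp
#include <cassert>

// Hypothetical shape of a translated component; plain C++, not a POSE poser.
class AndGate {
public:
    // Ports and signals become member variables.
    bool A = false;
    bool B = false;
    bool Y = false;

    // Locals of a process live in a context object so that they survive
    // between invocations of the member function.
    struct ProcessContext {
        unsigned activations = 0;
    };
    ProcessContext ctx;

    // The process sensitive to (A, B) becomes a member function.
    void process() {
        ++ctx.activations;
        Y = A && B;  // simplification: applied immediately, not next cycle
    }

    // Event handlers: a new value arrives on an input port. The process is
    // only triggered when the value actually changes.
    void setA(bool v) { if (v != A) { A = v; process(); } }
    void setB(bool v) { if (v != B) { B = v; process(); } }
};
```

Driving both inputs high, for example with `g.setA(true); g.setB(true);`, runs the process twice and leaves `g.Y` set.
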
Interesting Problems
- Load Balancing
- Zero Time (Delta Cycles)
- Pausing Execution of Processes

Load Balancing
- The components in VHDL form a tree structure
  - Components in the tree closer to the root become locations with high communication
  - Optimization: If a value just “passes through” a parent node, forward the value directly to all targets
- Score-based mapping
- Bursts of Activity
  - Entire portions of the simulation may go “inactive” for several simulation cycles and then suddenly become highly active because of some trigger
  - Phase-based load balancing (hard, future work)

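One way to picture the pass-through optimization is as a flattening of the wiring: starting from the port that drives a value, follow connections through ports that merely relay it, and collect the ports where it is finally consumed. The sketch below does this for the NAND example, where `myAndGate.Y` reaches `myNotGate.A` only via the parent's `andOutput` signal. The `Port` and `Wiring` types, the "component.port" naming, and `finalTargets` are assumptions made for illustration; how the translator actually represents connections is not stated on the slide.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Port = std::string;  // hypothetical "component.port" naming
using Wiring = std::map<Port, std::vector<Port>>;  // port -> ports it feeds

// Returns the ports that finally consume a value driven on 'src', skipping
// every intermediate port that only passes the value along.
std::set<Port> finalTargets(const Wiring& w, const Port& src) {
    std::set<Port> targets;
    std::set<Port> seen;
    std::vector<Port> stack{src};
    while (!stack.empty()) {
        Port p = stack.back();
        stack.pop_back();
        if (!seen.insert(p).second) continue;  // already visited
        auto it = w.find(p);
        if (it == w.end() || it->second.empty()) {
            // Nothing downstream: this port is an end point of the value.
            if (p != src) targets.insert(p);
            continue;
        }
        for (const Port& q : it->second) stack.push_back(q);
    }
    return targets;
}

// Wiring of the NAND example: the parent's andOutput signal is a pure relay.
Wiring nandWiring() {
    return {
        {"myAndGate.Y", {"nand.andOutput"}},
        {"nand.andOutput", {"myNotGate.A"}},
    };
}
```

With the targets resolved ahead of time, the AND gate could send its output straight to the NOT gate, so the parent no longer sits on the communication path for that value.
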
VHDL Structure
Zero Time (Delta Cycles)
- Forward progression without moving simulation time forward
- Ordering of “Events” within a single simulation cycle
- Object Virtual Time (OVT) in POSE is not sufficient for this task (at least not as it was intended to be used)

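The ordering problem can be made concrete with a two-part timestamp: events are compared first by simulation time and then by delta cycle. The sketch below also shows one conceivable way of folding both parts into a single integer, in case a single virtual-time value is all that is available. Whether the simulator described here does anything of the kind is not stated on the slide; `Stamp`, `pack`, `unpack`, and the 16-bit delta field are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical two-part timestamp: simulation time plus delta-cycle index.
struct Stamp {
    std::uint64_t time;
    std::uint32_t delta;
};

// Lexicographic order: earlier time first, then earlier delta cycle.
inline bool operator<(const Stamp& a, const Stamp& b) {
    return std::tie(a.time, a.delta) < std::tie(b.time, b.delta);
}

// Assumed layout: the low 16 bits hold the delta index, the rest the time.
// This only preserves the order if delta < 2^16 and time < 2^48.
constexpr unsigned kDeltaBits = 16;

inline std::uint64_t pack(const Stamp& s) {
    return (s.time << kDeltaBits) | s.delta;
}

inline Stamp unpack(std::uint64_t v) {
    const std::uint64_t mask = (std::uint64_t{1} << kDeltaBits) - 1;
    return {v >> kDeltaBits, static_cast<std::uint32_t>(v & mask)};
}
```

Under those limits, comparing packed values gives the same answer as comparing the `Stamp` pairs, so two events at simulation time 5 in delta cycles 0 and 1 stay ordered, and both precede anything at time 6.
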
Pausing Execution of Processes
- Sequential Statements (Behavioral Code)
  - Most of the sequential statements map well
    - if-then-else, loops, etc.
- The “Wait” Statement
  - A sequence of statements may need to suspend
  - Cannot use POSE’s “elapse( )” function to elapse time since other activity may be taking place
- Solution: Allow the C++ functions that represent the processes to suspend execution
  - Return at some statement and then restart after that statement
  - To do this efficiently… use ‘goto’s and ‘switch’s… :(

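The "return at some statement and restart after it" idea can be illustrated with a switch whose case labels sit at the resume points, in the style of Duff's device. The sketch assumes a VHDL process that sets Y to '1', waits 10 ns, sets Y to '0', waits 5 ns, and repeats. `ClockProcess`, `resumePoint`, and `step` are hypothetical names, and the code the translator really generates may look quite different (the slide also mentions gotos, which this sketch does not need).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical resumable process. Each call to step() runs up to the next
// wait statement and returns the delay after which it should be called again.
struct ClockProcess {
    int resumePoint = 0;  // which wait statement to resume after
    bool Y = false;       // the signal driven by the process

    std::uint64_t step() {
        switch (resumePoint) {
        case 0:
            for (;;) {            // a VHDL process body repeats forever
                Y = true;
                resumePoint = 1;
                return 10;        // wait for 10 ns
        case 1:                   // execution resumes here on the next call
                Y = false;
                resumePoint = 2;
                return 5;         // wait for 5 ns
        case 2:;                  // resume here, then loop back to the top
            }
        }
        return 0;  // not reached; keeps compilers quiet about the return path
    }
};
```

Because the function returns instead of blocking, the component stays free to handle other events between two calls. The cost is that any local variable that must survive a wait has to live outside the function, which is what the per-process contexts are for.
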
Simulation Structure
- Each Component Becomes a Poser
  - Control messages
- Signals and variables become local to Poser
  - Send message to self if assignment will be in the future
- Port values are transmitted between components
  - Send messages to parent/child component(s)
  - Communication intensive (especially if modeled to the gate level)
  - Load balancer needs to be aware of communication costs and/or patterns
- One VCDGenerator per Processor
  - Handles output

VHDL – Simulation Cycle
VHDL - Simulation
NAND Example
VHDL – Simulation Cycle (II)
- Components do not need to be synchronized
- Components can be optimistic
  - Portions of the simulation can be ahead (in time) of other portions
  - “Optimism” is limited by strategy/POSE-configuration and GVT calculation
- Components can skip cycles where they are idle
  - Wakeup event allows components to skip the simulation cycles where they do no work
  - Separate drive/process events allow delta cycles to be skipped

Future Work
- Additional support for VHDL
- Add support for Verilog
  - Create a second front end that can parse Verilog
  - Allow the AST or a second IR to represent the superset of VHDL and Verilog
- Load Balancing
  - Score-Based Refinement
  - Phase-Based
- Automatic Granularity Detection: Break apart large components or combine smaller components without making the user modify code
- And so much more…