SMASH: The C++ Layer
Krste Asanovic
krste@mit.edu
MIT Computer Science and Artificial Intelligence Laboratory
http://cag.csail.mit.edu/scale
RAMP Retreat, UC Berkeley
January 11, 2007
SMASH: SiMulation And SyntHesis
- Goal: One framework for both architectural exploration and chip design and verification
- Approach: High-level design discipline in which a design is expressed as a network of transactors (transactional actors)
  - Transactors (aka units) are refined down to RTL implementations
  - Design structure is preserved during refinement
- From my perspective, RDL and RAMP are pieces of SMASH
Transactor Anatomy
[Figure: a transactor with input queues, output queues, transactions, scheduler, and architectural state]
Transactor unit comprises:
- Architectural state (registers + RAMs)
- Input queues and output queues connected to other units
- Transactions (guarded atomic actions on state and queues)
- Scheduler (selects next ready transaction to run)
Advantages:
- Handles non-deterministic inputs
- Allows concurrent operations on mutable state within unit
- Natural representation for formal verification
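The anatomy above can be sketched in plain C++. This is a minimal illustration, not SMASH code: the `Queue`, `Transaction`, and `Accumulator` names are hypothetical, and the fixed-priority scheduler is only one possible policy. Each transaction pairs a guard with an action, and the scheduler fires the first transaction whose guard holds.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical bounded FIFO standing in for a transactor's input/output queue.
template <typename T>
class Queue {
 public:
  explicit Queue(std::size_t capacity) : m_capacity(capacity) {}
  bool enqRdy() const { return m_items.size() < m_capacity; }
  bool deqRdy() const { return !m_items.empty(); }
  const T& first() const { assert(deqRdy()); return m_items.front(); }
  void enq(const T& v) { assert(enqRdy()); m_items.push_back(v); }
  void deq() { assert(deqRdy()); m_items.pop_front(); }
 private:
  std::size_t m_capacity;
  std::deque<T> m_items;
};

// A transaction is a guard plus an action that runs atomically when fired.
struct Transaction {
  std::string name;
  std::function<bool()> guard;
  std::function<void()> action;
};

// Hypothetical transactor: sums nonzero inputs, emits the sum on a zero input.
class Accumulator {
 public:
  Accumulator() : in(4), out(4), m_sum(0) {
    // "add": consume a nonzero input and fold it into the architectural state.
    m_xacts.push_back({"add",
        [this] { return in.deqRdy() && in.first() != 0; },
        [this] { m_sum += in.first(); in.deq(); }});
    // "flush": on a zero input, emit the running sum and reset it.
    m_xacts.push_back({"flush",
        [this] { return in.deqRdy() && in.first() == 0 && out.enqRdy(); },
        [this] { out.enq(m_sum); m_sum = 0; in.deq(); }});
  }

  // Scheduler: fire the first ready transaction; return its name, or "" if none.
  std::string step() {
    for (auto& x : m_xacts) {
      if (x.guard()) { x.action(); return x.name; }
    }
    return "";
  }

  Queue<int> in;   // input queue
  Queue<int> out;  // output queue

 private:
  int m_sum;                         // architectural state
  std::vector<Transaction> m_xacts;  // guarded atomic actions
};
```

Because every action runs only after its guard has been checked, a transaction either fires completely or not at all, which is the property that makes the unit easy to reason about.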
RAMP Design Framework Overview
Target System: the machine being emulated
- Describe structure as transactor netlist in RAMP Description Language (RDL)
- Describe behavior of each leaf unit in favorite language (Verilog, VHDL, Bluespec, C/C++, Java)
[Figure labels: CPU, Interconnect Network, DRAM; 2VP70 FPGA; RDL Compiled to FPGA Emulation; BEE2 Host Platform; RDL Compiled to Software Simulation; Workstation Host Platform]
Host Platforms: systems that run the emulation or simulation
- Can have part of target mapped to FPGA emulation and part mapped to software simulation
SMASH/C++ is the way to write leaf units in C++, either for use with RDL or for standalone C++ simulation
What’s in SMASH/C++?
- A C++ class library plus conventions for writing transactor leaf units
  - These should work within an RDL-generated C++ harness
- In addition, libraries for channels, configuration, and parameter-passing code to support standalone C++ elaboration and simulation
- Also, can convert HDL modules into C++ units for co-simulation
  - Verilog -> C++ using either Verilator or Tenison VTOC
  - Bluespec -> C++ using Bluespec Csim
Why C++? I thought RAMP was FPGAs?
- Initial design in C++, eventually mapped into RTL
  - Much faster to spin C++ design than to spin FPGA design
- Hardware verification needs golden model
- Some units might only ever be software
  - Power/temperature models
  - Disk models
SMASH/C++ Code Example – Leaf Unit
struct Increment : public smash::IUnit_LeafImpl
{
  // Parameters
  static const smash::Parameter<int> inc_amount;

  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    m_inc = plist.get( Increment::inc_amount, 1 );
  }

  // Try each transaction in turn; returns true if one fired
  bool tick()
  {
    if ( xactInc() )
      return true;
    else if ( xactBumpInc() )
      return true;
    return false;
  }

 private:

  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Private state
  int m_inc;

  // Private transactions…
Example Leaf Unit Transactions
  // Nonzero input: forward it, incremented by m_inc
  bool xactInc()
  {
    bool xactIncFired = m_in.deqRdy() && m_out.enqRdy()
                        && (m_in.first() != 0);
    if ( !xactIncFired )
      return false;

    m_out.enq( m_in.first() + m_inc );
    m_in.deq();
    return true;
  }

  // Zero input: consume it and bump the increment amount
  bool xactBumpInc()
  {
    bool xactBumpIncFired = m_in.deqRdy() && (m_in.first() == 0);
    if ( !xactBumpIncFired )
      return false;

    m_inc += 1;
    m_in.deq();
    return true;
  }
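The behavior of these two transactions can be tried without the SMASH library. The sketch below is a standalone mock, not SMASH code: `MockPort`, `MockIncrement`, and `runMockIncrement` are hypothetical names, the ports are unbounded FIFOs, and it assumes that `xactBumpInc` consumes the zero message it reacts to.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for smash::InputPort / smash::OutputPort (unbounded FIFO).
struct MockPort {
  std::deque<int> q;
  bool deqRdy() const { return !q.empty(); }
  bool enqRdy() const { return true; }
  int first() const { assert(deqRdy()); return q.front(); }
  void enq(int v) { q.push_back(v); }
  void deq() { assert(deqRdy()); q.pop_front(); }
};

// Mock of the Increment leaf unit with the same two guarded transactions.
struct MockIncrement {
  MockPort in;
  MockPort out;
  int inc = 1;  // private state in the real unit; public here for brevity

  // Nonzero input: forward it, incremented by inc.
  bool xactInc() {
    bool fired = in.deqRdy() && out.enqRdy() && (in.first() != 0);
    if (!fired) return false;
    out.enq(in.first() + inc);
    in.deq();
    return true;
  }

  // Zero input: consume it and bump the increment amount (assumed behavior).
  bool xactBumpInc() {
    bool fired = in.deqRdy() && (in.first() == 0);
    if (!fired) return false;
    inc += 1;
    in.deq();
    return true;
  }

  // At most one transaction fires per tick.
  bool tick() { return xactInc() || xactBumpInc(); }
};

// Feed all inputs, tick until no transaction can fire, and return the outputs.
std::vector<int> runMockIncrement(const std::vector<int>& inputs) {
  MockIncrement unit;
  for (int v : inputs) unit.in.enq(v);
  while (unit.tick()) {}
  return std::vector<int>(unit.out.q.begin(), unit.out.q.end());
}
```

For the inputs 1, 2, 0, 3 the mock produces 2, 3, 5: the zero is absorbed and raises the increment from 1 to 2 before the 3 arrives.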
SMASH/C++ Example: Structural Unit
struct IncPipe : public smash::IUnit_StructuralImpl
{
  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    // Register ports, child units, and channels by name
    regPort    ( "in",      &m_in );
    regUnit    ( "incA",    &m_incA );
    regChannel ( "inc2inc", &m_inc2inc );
    regUnit    ( "incB",    &m_incB );
    regPort    ( "out",     &m_out );
    elaborateChildUnits(plist);

    // Connect child units and channels
    smash::connect( m_in, m_incA.in() );
    smash::connect( m_incA.out(), m_inc2inc, m_incB.in() );
    smash::connect( m_incB.out(), m_out );
  }

 private:

  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Child units and channels
  Increment m_incA;
  Increment m_incB;
  smash::SimpleChannel<IntMsg> m_inc2inc;
};
[Figure: IncPipe with InputPort "in", Incrementer "incA", SimpleChannel "inc2inc", Incrementer "incB", OutputPort "out"]
SMASH/C++ Example: Simulation Loop
#include <iostream>

int main( int argc, char* argv[] )
{
  // Toplevel channels and unit
  smash::SimpleChannel<IntMsg> iChannel("iChannel",32,3,7);
  smash::SimpleChannel<IntMsg> oChannel("oChannel",32,3,7);
  IncPipe incPipe;
  incPipe.setName("top");

  // Set some parameters and elaborate the design
  smash::ParameterList plist;
  plist.set("top.incB", Increment::inc_amount, 2);
  plist.set("top.inc2inc", smash::SimpleChannel<IntMsg>::bandwidth, 32);
  plist.set("top.inc2inc", smash::SimpleChannel<IntMsg>::latency, 3);
  plist.set("top.inc2inc", smash::SimpleChannel<IntMsg>::buffering, 7);
  incPipe.elaborate(plist);

  // Connect the toplevel channels to the toplevel unit
  smash::connect( iChannel, incPipe.in() );
  smash::connect( incPipe.out(), oChannel );

  // Simulation loop
  int testInputs[] = { 1, 2, 0, 3, 4, 0, 1, 2, 3, 4 };
  int inputIndex = 0;
  for ( int cycle = 0; cycle < 20; cycle++ ) {

    // Feed the next test input when the input channel has room
    if ( iChannel.enqRdy() && (inputIndex < 10) )
      iChannel.enq( IntMsg(testInputs[inputIndex++]) );

    // Print and consume any result on the output channel
    if ( oChannel.deqRdy() ) {
      std::cout << oChannel.first() << std::endl;
      oChannel.deq();
    }

    // Always tick units before channels
    incPipe.tick(); // Hierarchical tick
    iChannel.tick();
    oChannel.tick();
  }
  return 0;
}
[Figure: "iChannel", IncPipe "top" with Incrementer "incA", SimpleChannel "inc2inc", Incrementer "incB", and "oChannel"]
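The slide does not show how a channel uses its latency and buffering parameters or what its `tick()` does. One plausible model, offered only as an assumption, is a FIFO that timestamps each message on enqueue and releases it once the channel's own cycle counter has advanced by the latency. `LatencyChannel` and `firstVisibleCycle` below are hypothetical names, and bandwidth is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical fixed-latency channel; not the smash::SimpleChannel implementation.
template <typename T>
class LatencyChannel {
 public:
  LatencyChannel(int latency, std::size_t buffering)
      : m_latency(latency), m_buffering(buffering), m_now(0) {}

  // Room for another message (in-flight messages count against buffering).
  bool enqRdy() const { return m_inFlight.size() < m_buffering; }

  // The oldest message is visible only once its delivery cycle has arrived.
  bool deqRdy() const {
    return !m_inFlight.empty() && m_inFlight.front().first <= m_now;
  }

  const T& first() const { assert(deqRdy()); return m_inFlight.front().second; }

  // Timestamp the message with the cycle at which it becomes visible.
  void enq(const T& v) {
    assert(enqRdy());
    m_inFlight.emplace_back(m_now + m_latency, v);
  }

  void deq() { assert(deqRdy()); m_inFlight.pop_front(); }

  // Advance the channel's notion of time by one cycle.
  void tick() { ++m_now; }

 private:
  int m_latency;
  std::size_t m_buffering;
  long m_now;
  std::deque<std::pair<long, T> > m_inFlight;
};

// Cycle at which a message enqueued in cycle 0 first becomes dequeuable,
// or -1 if it never does within 100 cycles.
int firstVisibleCycle(int latency) {
  LatencyChannel<int> ch(latency, 7);
  ch.enq(42);
  for (int cycle = 0; cycle < 100; ++cycle) {
    if (ch.deqRdy()) return cycle;
    ch.tick();  // channel ticks last in the cycle, after units have acted
  }
  return -1;
}
```

In this model, ticking the channel after the units means everything a unit enqueues during a cycle is stamped with that same cycle, so the result does not depend on the order in which units within the cycle ran.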
Why didn’t we just use SystemC?
- If you’re asking, you haven’t read the SystemC standard
- Ugly semantics
  - Too many ways of doing the same thing
- Fundamental assumption is that host is sequential
  - SMASH/C++ designed to support parallel hosts
- Even worse, simulator is a global object (can’t have two engines in one executable)
- In industry, architects use SystemC, but hardware designers ignore it when building chips
Issues
- Need to figure out flexible type system and bindings from RDL into C++/Bluespec/Verilog
- Need to figure out common (across C++/RDL/Bluespec) interfaces/syntax for:
  - Elaboration
  - Configuration
  - Debugging
  - Monitoring/Tracing