SystemC and Levels of System Abstraction: Part I

Levels of System Abstraction

Executable specification
  - Translation of design specification into SystemC
  - Independent of the proposed implementation
  - Time delays, if present, denote timing constraints

Untimed functional model (CP)
  - Separation of the specification into modules
  - Communication between modules is point to point
  - Communication usually via bounded FIFO
      - Blocking read, blocking write
  - No time delays present in the model

Levels of System Abstraction

Timed functional model (CP + T)
  - Processes communicate via point-to-point links
  - Timing delays added to processes to reflect
      - Timing constraints
      - Latency of a particular implementation
  - Timing delays added to FIFOs to model communication latencies
  - Initial model for hardware/software trade-off analysis

Levels of System Abstraction

Transaction level model or programmer's view (PV)
  - Communication modeled by function calls
      - Similar to transport level specification
      - Bus burst mode of read/write
      - FIFOs replaced by actual communication protocol
      - Bus contention modeled but not with realistic timing
  - Instruction set simulator for SW code (HW in timed process)
      - Register accurate
  - Memory map is defined
  - Timing defined in terms of instruction set cycles

Platform transaction model (or PV + T)
  - Interconnection architecture fully defined
      - Multiple clock cycles for bus communication
      - Can model bus arbitration overhead
  - SW on ISS, HW in timed process
  - Model is not pin accurate
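
To make "communication modeled by function calls" and "memory map is defined" concrete, here is a minimal plain C++ sketch of a PV-style bus. It is not SystemC and not any TLM library API: `BusTarget`, `Memory`, `Bus`, `map`, `burst_read`, and `transactions` are hypothetical names chosen for illustration. The bus decodes addresses against a memory map and counts transactions, but models no timing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical target interface: communication is a plain function call.
struct BusTarget {
    virtual ~BusTarget() = default;
    virtual std::uint32_t read(std::uint32_t offset) = 0;
    virtual void write(std::uint32_t offset, std::uint32_t data) = 0;
};

// Memory target; offsets are byte offsets, 4 bytes per word.
class Memory : public BusTarget {
public:
    explicit Memory(std::size_t words) : words_(words, 0) {}
    std::uint32_t read(std::uint32_t offset) override { return words_.at(offset / 4); }
    void write(std::uint32_t offset, std::uint32_t data) override { words_.at(offset / 4) = data; }
private:
    std::vector<std::uint32_t> words_;
};

// Bus with a memory map. It counts transactions but carries no timing.
class Bus {
    struct Region { std::uint32_t base; std::uint32_t size; BusTarget* target; };
public:
    // Register a target for the address range [base, base + size).
    void map(std::uint32_t base, std::uint32_t size, BusTarget& target) {
        assert(size > 0);
        regions_.push_back(Region{base, size, &target});
    }
    std::uint32_t read(std::uint32_t addr) {
        const Region& r = decode(addr);
        ++transactions_;
        return r.target->read(addr - r.base);
    }
    void write(std::uint32_t addr, std::uint32_t data) {
        const Region& r = decode(addr);
        ++transactions_;
        r.target->write(addr - r.base, data);
    }
    // A burst moves several consecutive words in a single transaction.
    std::vector<std::uint32_t> burst_read(std::uint32_t addr, std::size_t count) {
        const Region& r = decode(addr);
        ++transactions_;
        std::vector<std::uint32_t> data;
        for (std::size_t i = 0; i < count; ++i)
            data.push_back(r.target->read(addr - r.base + 4 * static_cast<std::uint32_t>(i)));
        return data;
    }
    unsigned transactions() const { return transactions_; }
private:
    // Find the mapped region that contains addr.
    const Region& decode(std::uint32_t addr) const {
        for (const Region& r : regions_)
            if (addr >= r.base && addr - r.base < r.size) return r;
        throw std::out_of_range("unmapped address");
    }
    std::vector<Region> regions_;
    unsigned transactions_ = 0;
};
```

A PV + T refinement would, under the same assumptions, attach a cycle count to each call (for example arbitration plus per-word transfer cycles) instead of only counting transactions.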

Levels of System Abstraction

Behavioral hardware model (mainly for ASIC cores)
  - Pin accurate and functionally accurate at its boundaries
  - Not clock accurate
  - No internal structure
  - Input to behavioral synthesis tool

Pin accurate cycle accurate model (CA)
  - In addition to being pin accurate, it is also cycle accurate
  - Does not necessarily have the internal structure
  - SW ISS replaced by model that includes cache

Levels of System Abstraction

Register transfer level model
  - Pin accurate
  - Cycle accurate
  - Internal structure is defined

System Design Styles

Application specific standard product (ASSP)
  - Intel IXP series network processors
  - TI OMAP platform

System Design Styles

Programming for application specific standard product (ASSP)
  - Architecture is fixed
      - Multiple programmable units
      - Multiple application specific cores
      - Multiple memory interfaces
  - Objective: map application on target platform
      - Maximize performance
  - Design proceeds from CP, CP+T, PV to final implementation

System Design Styles

Structured ASIC: IP based design style
  - Xilinx platform FPGA
      - PowerPC hard cores
      - MicroBlaze soft cores
      - 18-bit multipliers
      - Block RAMs
      - RocketIO transceivers

System Design Styles

Structured ASIC: IP based design style
  - Architecture is fixed to a large extent
      - Flexibility in introducing HW cores
          - IP blocks provided by vendor
          - Designer generates new cores
      - Programmable units are fixed
      - Memory interfaces are fixed
  - Objective: refine micro-architecture to
      - Maximize: performance
      - Minimize: cost, power
  - Design proceeds from CP, CP+T, PV, PV+T, CA to final implementation

System Design Styles

Generic multiprocessor SoC design
  - Architecture is yet to be defined
      - HW cores can consist of
          - IP blocks provided by vendor
          - Designer generates new cores
      - Multiple programmable units
      - Multiple memory interfaces
  - Objective: design micro-architecture to
      - Maximize: performance
      - Minimize: cost, power
  - Design proceeds from CP, CP+T, PV, PV+T, CA to final implementation

Executable Specification
  - Any code in C or C++ can act as the executable specification
  - The code is implementation agnostic
  - The code acts as a functional benchmark for verification
  - Timing specified at input/output boundaries
      - Specified as design directive
      - Code might be specified as one SystemC process with timing

Untimed Functional Model
  - Design is specified in terms of its functional components
  - Functional components specify possible structural boundaries of the final implementation
  - Some functional components might be merged while others might become discrete cores

Untimed Functional Model
  - Dataflow MoC is the most common form of specification
      - Alternatively Kahn process networks, multi-rate dataflow
  - Communication between components is through directed point-to-point FIFOs
  - FIFOs are bounded
      - Blocking read, blocking write
  - Time delays might exist in the model at the boundaries
      - Processes in which the model interacts with the environment
      - Time delays denote constraints for the system

SystemC Untimed Functional Model
  - Modules contain SC_THREAD processes
  - Modules communicate through sc_fifo channels
      - Initial values in FIFOs should be specified
  - Stop simulation when
      - Finite time has elapsed
          - Requires at least one module with time delay
      - Processes terminate through data backlog
          - Thread returns when input FIFO is empty
      - sc_stop() is called when termination condition is reached
          - Easiest to implement
          - For example, consumption of finite set of input stimuli
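
SystemC Untimed Functional Model

[Figure: dataflow graph with constant, adder, fork, and printer blocks and a Z^-1 delay element]

    #include <systemc.h>

    template <class T> SC_MODULE(DF_Const) {
        sc_fifo_out<T> output;
        T constant_;

        // Emit the constant forever; write() blocks while the FIFO is full.
        void process() {
            while (1) output.write(constant_);
        }

        SC_HAS_PROCESS(DF_Const);
        DF_Const(sc_module_name N, const T& C)
            : sc_module(N), constant_(C) {
            SC_THREAD(process);
        }
    };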

SystemC Untimed Functional Model

    template <class T> SC_MODULE(DF_Fork) {
        sc_fifo_in<T> input;
        sc_fifo_out<T> output1;
        sc_fifo_out<T> output2;

        // Copy every input token to both outputs.
        void process() {
            while (1) {
                T value = input.read();
                output1.write(value);
                output2.write(value);
            }
        }

        SC_CTOR(DF_Fork) { SC_THREAD(process); }
    };

    template <class T> SC_MODULE(DF_Adder) {
        sc_fifo_in<T> input1, input2;
        sc_fifo_out<T> output;

        // Blocks until one token is available on each input.
        void process() {
            while (1) {
                output.write(input1.read() + input2.read());
            }
        }

        SC_CTOR(DF_Adder) { SC_THREAD(process); }
    };

SystemC Untimed Functional Model

    template <class T> SC_MODULE(DF_Printer) {
        sc_fifo_in<T> input;
        unsigned n_iterations_;
        bool done_;

        // Print a fixed number of tokens, then let the thread terminate.
        void process() {
            for (unsigned i = 0; i < n_iterations_; i++) {
                T value = input.read();
                cout << name() << " " << value << endl;
            }
            done_ = true;
        }

        SC_HAS_PROCESS(DF_Printer);
        DF_Printer(sc_module_name NAME, unsigned N_ITER)
            : sc_module(NAME), n_iterations_(N_ITER), done_(false) {
            SC_THREAD(process);
        }

        ~DF_Printer() {
            if (!done_) cout << name() << " not done yet" << endl;
        }
    };

SystemC Untimed Functional Model

    int sc_main(int argc, char** argv) {
        DF_Const<int> constant("constant", 1);
        DF_Adder<int> adder("adder");
        DF_Fork<int> fork("fork");
        DF_Printer<int> printer("printer", 10);

        sc_fifo<int> const_out("const_out", 5);
        sc_fifo<int> adder_out("adder_out", 1);
        sc_fifo<int> feedback("feedback", 1);
        sc_fifo<int> to_printer("to_printer", 1);

        // Initial value on the feedback FIFO.
        feedback.write(42);

        constant.output(const_out);
        adder.input1(feedback);
        adder.input2(const_out);
        fork.input(adder_out);
        fork.output1(feedback);
        fork.output2(to_printer);
        printer.input(to_printer);

        // Run without a time limit (older SystemC releases wrote sc_start(-1)).
        sc_start();
        return 0;
    }

  - Simulation starts without a time limit
  - Simulation stops when there are no more events
  - After the printer thread exits, the FIFOs fill up, the remaining threads block, and simulation stops
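
The token flow of this example can be traced without a SystemC installation. The following is a plain C++ sketch, not SystemC: `Fifo` is a hypothetical stand-in for a bounded channel, and `simulate_dataflow` is an illustrative helper that fires each block only when its inputs hold a token and its outputs have space. That firing rule is an assumption that simplifies the thread semantics of `sc_fifo`, but it yields the same sequence of printed values.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a bounded FIFO channel (not sc_fifo).
struct Fifo {
    std::size_t capacity;
    std::deque<int> items;
    bool can_read() const { return !items.empty(); }
    bool can_write() const { return items.size() < capacity; }
    int read() {
        assert(can_read());
        int v = items.front();
        items.pop_front();
        return v;
    }
    void write(int v) {
        assert(can_write());
        items.push_back(v);
    }
};

// Runs the constant -> adder -> fork -> printer graph with the FIFO sizes
// used in sc_main and returns the values the printer would print.
std::vector<int> simulate_dataflow(int initial, int constant, unsigned n_iterations) {
    Fifo const_out{5, {}}, adder_out{1, {}}, feedback{1, {}}, to_printer{1, {}};
    feedback.write(initial);  // initial token on the feedback path
    std::vector<int> printed;
    bool progress = true;
    while (progress && printed.size() < n_iterations) {
        progress = false;
        // constant: emit whenever the output FIFO has space
        if (const_out.can_write()) {
            const_out.write(constant);
            progress = true;
        }
        // adder: needs one token on each input and space on its output
        if (feedback.can_read() && const_out.can_read() && adder_out.can_write()) {
            adder_out.write(feedback.read() + const_out.read());
            progress = true;
        }
        // fork: copies one token to the feedback path and to the printer
        if (adder_out.can_read() && feedback.can_write() && to_printer.can_write()) {
            int v = adder_out.read();
            feedback.write(v);
            to_printer.write(v);
            progress = true;
        }
        // printer: consumes one token per firing
        if (to_printer.can_read()) {
            printed.push_back(to_printer.read());
            progress = true;
        }
    }
    return printed;
}
```

With the values from sc_main (initial token 42, constant 1, 10 iterations), the helper returns the running sums 43 through 52.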

SystemC Timed Functional Model

    template <class T> SC_MODULE(DF_Const) {
        sc_fifo_out<T> output;
        T constant_;

        // Produce one token every 200 ns.
        void process() {
            while (1) {
                wait(200, SC_NS);
                output.write(constant_);
            }
        }

        SC_HAS_PROCESS(DF_Const);
        DF_Const(sc_module_name N, const T& C)
            : sc_module(N), constant_(C) {
            SC_THREAD(process);
        }
    };

    template <class T> SC_MODULE(DF_Adder) {
        sc_fifo_in<T> input1, input2;
        sc_fifo_out<T> output;

        // Each addition is followed by a 200 ns delay.
        void process() {
            while (1) {
                output.write(input1.read() + input2.read());
                wait(200, SC_NS);
            }
        }

        SC_CTOR(DF_Adder) { SC_THREAD(process); }
    };

Stopping Dataflow Simulation
  - Simulate with a fixed number of input stimuli and return
      - If a thread is blocked due to a bug then simulation will not stop
  - Each process consumes a fixed amount of stimuli and sets a "done" signal
      - A "terminator" process issues sc_stop() when all processes have issued "done"
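
The "terminator" idea can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the SystemC implementation: `Terminator` and `report_done` are hypothetical names, and the stop action is passed in as a callback that stands in for the call to sc_stop().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical terminator: collects "done" flags from a fixed set of
// processes and invokes the stop action once every flag is set.
class Terminator {
public:
    Terminator(std::size_t n_processes, std::function<void()> stop)
        : done_(n_processes, false), stop_(std::move(stop)) {
        assert(n_processes > 0);
    }

    // Called by process `id` after it has consumed its fixed amount of stimuli.
    void report_done(std::size_t id) {
        done_.at(id) = true;
        for (bool d : done_)
            if (!d) return;      // at least one process is still running
        if (!stopped_) {
            stopped_ = true;     // issue the stop action only once
            stop_();             // stands in for sc_stop()
        }
    }

    bool stopped() const { return stopped_; }

private:
    std::vector<bool> done_;
    std::function<void()> stop_;
    bool stopped_ = false;
};
```

In a SystemC model the same role would typically be played by a process that waits on the "done" signals; the callback here only keeps the sketch independent of the simulator.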

Levels of System Abstraction
  - To be continued
  - Detour into
      - Hardware performance estimation
      - System-level performance estimation