Synthesis of Transaction-Level Models to FPGAs Prof. Jason Cong Yiping Fan, Guoling Han, Wei Jiang, Zhiru Zhang VLSI CAD Lab Computer Science Department.


1 Synthesis of Transaction-Level Models to FPGAs Prof. Jason Cong Yiping Fan, Guoling Han, Wei Jiang, Zhiru Zhang VLSI CAD Lab Computer Science Department University of California, Los Angeles

2 Outline
• Transaction-level model (TLM)
  - SystemC TLM
  - Metropolis Meta Model
• Synthesis from TLM
  - RDR/MCAS: our existing architectural synthesis approach
  - xPilot: ongoing synthesis infrastructure for TLM

3 Outline
• Transaction-level model (TLM)
  - SystemC TLM
  - Metropolis Meta Model
• Synthesis from TLM
  - RDR/MCAS: our existing architectural synthesis approach
  - xPilot: ongoing synthesis infrastructure for TLM

4 SystemC Framework
• SystemC history
  - OO system/HW modeling and simulation
  - SystemC under development by CAD vendors/researchers
    - Synopsys
    - Frontier Design
    - CoWare (Belgium)
  - Released to public Sept. ’99
    - Open source distribution @ www.systemc.org
    - Version 2 out July ’01

5 Channels and Modules
• Basic building blocks:
  - Module (class) instances, communicating via channel (class) instances
  - Modules’ functionality coded as concurrent processes
    - Processes communicate via channels or events

6 Communication Modeling in SystemC

7 Primitive Channels in SystemC Library
• Ordinary signal (wire) of a template type
  - Fill in data type T when instantiated
  - Point-to-point or multi-point (1 writer, n readers)
• Signal bus (arbitrary width)
• FIFO, for producer/consumer connection
• Pseudo-channels
  - Mutex & semaphore, for interprocess sync
  - Accessed using channel syntax
• Complex “hierarchical” channels composed of primitive channels, processes, modules
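The FIFO channel for producer/consumer connections can be pictured with a minimal sketch in plain C++. This is not the SystemC library class: the name `BoundedFifo` and its `try_write`/`try_read` methods are hypothetical, and the sketch is single-threaded and non-blocking, whereas a simulator would suspend the calling process instead of returning `false`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bounded FIFO channel; illustrative only, not the SystemC API.
template <typename T>
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Producer side: returns false when the FIFO is full.
    bool try_write(const T& value) {
        if (items_.size() >= capacity_) return false;
        items_.push_back(value);
        return true;
    }

    // Consumer side: returns false when the FIFO is empty; otherwise
    // stores the oldest element in `out` and removes it.
    bool try_read(T& out) {
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::size_t capacity_;
    std::deque<T> items_;
};
```

A producer would call `try_write` until it returns `false`, and a consumer would drain the channel with `try_read`; elements come out in the order they were written.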

8 Events and Processes
• Events: abstract occurrences used for
  - Process triggering (like VHDL sensitivity list)
  - Channel communication
  - Interprocess synchronization
• Process can call wait() to block on event
• Event occurrence tells simulator to schedule simulation of relevant process
• Process execution
  - Not called directly from your code
  - Triggered for simulation by events on ports, channels, or explicit named events
  - Registered in constructor of enclosing module (associate method with events)
• Thread process → infinite loop
  - Must call wait() to yield control
• Method process → runs to completion
  - Less scheduling overhead
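The registration pattern for method processes (a module associates a method with an event in its constructor, and the method runs to completion whenever the event occurs) can be sketched in plain C++. The `Event` and `Counter` classes below are hypothetical stand-ins, not SystemC classes, and `notify()` calls the processes immediately, where a real simulation kernel would schedule them.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical event: keeps the list of processes sensitive to it.
class Event {
public:
    // Register a method-style process (runs to completion when triggered).
    void add_sensitive(std::function<void()> process) {
        assert(static_cast<bool>(process));
        processes_.push_back(std::move(process));
    }

    // Stand-in for the scheduler: run every sensitive process once.
    void notify() {
        for (auto& process : processes_) process();
    }

private:
    std::vector<std::function<void()>> processes_;
};

// Hypothetical module: registers its process in the constructor and is
// never called directly by user code.
class Counter {
public:
    explicit Counter(Event& clk) {
        clk.add_sensitive([this] { ++count_; });
    }
    Counter(const Counter&) = delete;             // the lambda captures `this`
    Counter& operator=(const Counter&) = delete;

    int count() const { return count_; }

private:
    int count_ = 0;
};
```

Constructing `Counter c(clk)` and calling `clk.notify()` twice leaves `c.count()` at 2. A thread process would instead need a coroutine or thread per process so that `wait()` can suspend it mid-body, which is why method processes carry less scheduling overhead.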

9 Data Types in SystemC
• SystemC supports
  - Native C/C++ types
  - SystemC types
• SystemC types
  - Data types for system modeling
  - 2-value (‘0’, ‘1’) logic/logic vector
  - 4-value (‘0’, ‘1’, ‘Z’, ‘X’) logic/logic vector
  - Arbitrary-sized integer (signed/unsigned)
  - Fixed-point types (templated/untemplated)
• Objective: to reflect HW registers & ALU operations
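To make the 4-value logic concrete, here is a small sketch in plain C++ of how two drivers on one wire resolve, and of a 4-value AND. The enum `Logic` and the function names are hypothetical and do not come from the SystemC library; the truth tables follow the usual convention that high impedance (Z) yields to any driven value and that conflicts produce unknown (X).

```cpp
// Hypothetical 4-value logic type: 0, 1, high impedance, unknown.
enum class Logic { Zero, One, Z, X };

// Resolve two drivers on the same wire.
// Z yields to the other driver; equal values pass through; anything else is X.
inline Logic resolve(Logic a, Logic b) {
    if (a == Logic::Z) return b;
    if (b == Logic::Z) return a;
    if (a == b) return a;
    return Logic::X;
}

// 4-value AND: a 0 on either input forces 0, two 1s give 1,
// and every other combination is unknown.
inline Logic logic_and(Logic a, Logic b) {
    if (a == Logic::Zero || b == Logic::Zero) return Logic::Zero;
    if (a == Logic::One && b == Logic::One) return Logic::One;
    return Logic::X;
}
```

For example, `resolve(Logic::Zero, Logic::One)` is `Logic::X` (a bus conflict), while `logic_and(Logic::Zero, Logic::X)` is still `Logic::Zero` because the known 0 dominates.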

10 Functional Level and RTL Modeling in SystemC
• Functional level
  - Sequential, algorithmic, software-like
  - Explore HW/SW architectures, proof of algorithms, performance modeling & analysis
• Register transfer level
  - Complete detailed functional description of hardware
    - Every register, bus, bit for every clock cycle
    - Use C++ switch/case for FSM implementation
  - At this point, can switch to HDL, but staying in SystemC leverages test benches
  - Prepare for HW synthesis step by using only synthesizable constructs
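The switch/case style of FSM coding mentioned for RTL models might look like the following plain C++ sketch. The states, the `start` input, and the two-cycle busy period are invented for illustration; each call to `clock()` stands for one rising clock edge, and the member variables play the role of registers.

```cpp
// Hypothetical three-state controller.
enum class State { Idle, Busy, Done };

struct Fsm {
    State state = State::Idle;  // state register
    int remaining = 0;          // register: busy cycles left

    // One call models one rising clock edge.
    void clock(bool start) {
        switch (state) {
        case State::Idle:
            if (start) {
                remaining = 2;          // assumed fixed two-cycle operation
                state = State::Busy;
            }
            break;
        case State::Busy:
            if (--remaining == 0) state = State::Done;
            break;
        case State::Done:
            state = State::Idle;        // one-cycle completion pulse
            break;
        }
    }
};
```

Starting from `Idle`, asserting `start` for one edge moves the machine to `Busy`; two further edges bring it to `Done`, and the next edge returns it to `Idle`.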

11 Transaction Level Modeling in SystemC
• Transaction level
  - Model includes architectural components
  - Maintain component interface accuracy
    - E.g., buses modeled as channels (read/write operations)
  - Behavioral style inside a component
  - Simulates 100-10,000x faster than RTL
  - Provide execution platform for SW development

12 TLM – Raise the Level of Architectural Modeling
• What is TLM?
  - Communication uses function calls
    - burst_read(char* buf, int addr, int len);
• Why is TLM interesting?
  - Simulation: fast and compact
  - Integrate HW and SW models
  - Early platform for SW development
  - Early system exploration and verification
  - Verification reuse
  - Synthesis …
• Reference: www.systemc.org
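A transaction-level bus can be sketched as an abstract interface whose operations are plain function calls. Only the `burst_read(char* buf, int addr, int len)` signature comes from the slide; the `BusIf` interface, the matching `burst_write`, and the `SimpleMemory` target are assumptions made for this illustration, written in plain C++ without any SystemC dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical transaction-level bus interface.
struct BusIf {
    virtual ~BusIf() = default;
    virtual void burst_read(char* buf, int addr, int len) = 0;
    virtual void burst_write(const char* buf, int addr, int len) = 0;
};

// Hypothetical target: a byte-addressed memory behind the bus interface.
class SimpleMemory : public BusIf {
public:
    explicit SimpleMemory(int size)
        : data_(static_cast<std::size_t>(size), '\0') {}

    // A whole burst is one function call, with no per-cycle signal activity.
    void burst_read(char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        std::memcpy(buf, data_.data() + addr, static_cast<std::size_t>(len));
    }

    void burst_write(const char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        std::memcpy(data_.data() + addr, buf, static_cast<std::size_t>(len));
    }

private:
    bool in_range(int addr, int len) const {
        return addr >= 0 && len >= 0 &&
               addr + len <= static_cast<int>(data_.size());
    }

    std::vector<char> data_;
};
```

An initiator holds only a `BusIf&`, so the memory can later be swapped for a more detailed bus model without changing the caller. Collapsing a burst into one call instead of many clocked signal updates is the kind of abstraction behind the simulation speedup TLM claims over RTL.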

13 Typical Design Flow Using TLM
• Functional model
  - Captures system behaviour
• TLM, Transaction Level Model
  - Bus transactions
  - Accurate interaction with SW portion
  - Simulates rapidly
• Can create TLM model initially

14 Introduction of Metropolis
• A UCB and GSRC project, http://www.gigascale.org/metropolis/
• Platform-based design [ASV]
  - Platforms have sufficient flexibility to support a series of applications/products
  - Choose a platform by design space exploration
  - The two points above require models to be reusable
• Orthogonalization of concerns
  - Computation vs. communication
  - Behavior vs. coordination
  - Behavior vs. architecture
  - Capability vs. cost

15 Metropolis Meta Model
• A combination of imperative program and declarative constraints
• Imperative program:
  - objects (process, media, quantity, statemedia)
  - netlist
  - await
  - block and label
  - interface function call
  - quantity annotation
• Declarative constraints
  - Linear Temporal Logic (LTL)
  - (synch)
  - Logic of Constraints (LOC)

16 A Metropolis Design Tutorial
[Figure: netlist diagram with labels MyMapNetlist, MyFncNetlist, M, P1, P2, Env1, Env2]

17 A Metropolis Design Tutorial
[Figure: netlist diagram with labels MyMapNetlist; MyFncNetlist (M, P1, P2, Env1, Env2); MyArchNetlist (mP1, mP2, Bus, Bus Arbiter, Mem, Cpu, OsSched); Y2T write(), T2Y read(), Th,Wk]
Mapping constraints, pairing each function-netlist event with an architecture-netlist event:
  B(P1, M.write)   B(mP1, mP1.writeCpu);
  E(P1, M.write)   E(mP1, mP1.writeCpu);
  B(P1, P1.f)      B(mP1, mP1.mapf);
  E(P1, P1.f)      E(mP1, mP1.mapf);
  B(P2, M.read)    B(mP2, mP2.readCpu);
  E(P2, M.read)    E(mP2, mP2.readCpu);
  B(P2, P2.f)      B(mP2, mP2.mapf);
  E(P2, P2.f)      E(mP2, mP2.mapf);
  …

18 Outlook of the First Metropolis Release
• Meta model infrastructure
  - Meta model language
  - Front end
  - Abstract syntax trees
  - Back end 1, Back end 2, Back end 3, … Back end N
• SPIN interface
• LOC checking
• SystemC simulation
• Meta model debugger
• Sample architectural libraries: coarse-simple cpu, bus, memory, arbiters; time quantity
• Sample MoC: multi-media (Yapi, TTL), Synchronous
• A design tutorial
• http://www.gigascale.org/metropolis/

19 TLM Conclusions
• SystemC is the de facto system-level design standard
  - Pushed by many CAD tool vendors
  - Used widely in industry and academia
    - E.g., Intel handheld system project [ICCAD’04]
  - Unified language to model a system at different levels
  - Improving path to HW synthesis from SystemC source code
  - Fits with the trend to take system design to a higher level
• Metropolis is a novel academic framework for models of computation
  - Capable of representing TLM as well
  - Provides a comprehensive starting point for synthesis

20 Outline
• Transaction-level model (TLM)
  - SystemC TLM
  - Metropolis Meta Model
• Synthesis from TLM
  - xPilot: our ongoing synthesis infrastructure for TLM
  - RDR/MCAS: our existing architectural synthesis approach

21 xPilot: TLM to RTL Synthesis Flow
[Figure: flow diagram with labels TLM in SystemC/Metropolis, Frontend, SSDM, RTL, FPGAs]
• Arch-generation passes: RTL/constraints generation
  - Verilog/VHDL/SystemC
  - Altera/Xilinx
  - General/Synopsys/Magma …
• Arch-dependent passes
  - Memory analysis/allocation
  - Scheduling/Binding/Memory analysis/allocation
  - Register/port binding
  - Traditional/Low power/RDR-pipe or Placement driven …
• Arch-independent passes
  - SSDM checking
  - Loop unrolling/pipelining
  - Strength reduction/Bitwidth analysis
  - Speculative-execution transformation …

22 Integration of xPilot with Metropolis
[Figure: block diagram with the following labels]
• Meta model infrastructure: Front end, Meta model language, Abstract syntax trees
• SystemC Simulation, LOC Checking, SPIN Interface, Synthesis, …
• xPilot/SSDM
• HW implementation: FPGA, ASICs, …
  - IP Assembly: Latency Insensitive Design, GALS, RDR/MCAS, IP Library
  - Predictable RTL Synthesis: RTL Timing Constraints, Physical Constraints, RTL Handoff
• Compilation for RP: Simulation, Extended Instruction, Reconfigurable Interconnect, Reconfigurable Coprocessor, …

23 SSDM Zoomed In – CDFG
  if (cond1) bb1(); else bb2();
  bb3();
  switch (test1) {
    case c1: bb4(); break;
    case c2: bb5(); break;
    case c3: bb6(); break;
  }
  bb7();
[Figure: control flow graph of the code above, with nodes cond1 (T/F edges), bb1(), bb2(), bb3(), test1 (c1/c2/c3 edges), bb4(), bb5(), bb6(), bb7()]
• 2-level CDFG representation
  - 1st level: control flow graph
  - 2nd level: data flow graph
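A two-level CDFG can be pictured as a control flow graph whose nodes each own a data flow graph. The sketch below is an assumption about how such a structure might be laid out in plain C++; the type names `DfgOp`, `BasicBlock`, and `Cdfg` are hypothetical and say nothing about how SSDM actually stores its graphs. `build_example()` fills in only the first level (control flow) for the code fragment on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Second level: one data-flow operation inside a basic block.
struct DfgOp {
    std::string opcode;
    std::vector<int> operands;  // indices of producer ops in the same block
};

// First level: a control-flow node that owns its data flow graph.
struct BasicBlock {
    std::string name;
    std::vector<DfgOp> ops;
    std::vector<int> successors;  // indices into Cdfg::blocks
};

struct Cdfg {
    std::vector<BasicBlock> blocks;

    int add_block(const std::string& name) {
        blocks.push_back(BasicBlock{name, {}, {}});
        return static_cast<int>(blocks.size()) - 1;
    }

    void add_edge(int from, int to) {
        assert(from >= 0 && static_cast<std::size_t>(from) < blocks.size());
        assert(to >= 0 && static_cast<std::size_t>(to) < blocks.size());
        blocks[static_cast<std::size_t>(from)].successors.push_back(to);
    }
};

// Control-flow level of the slide's example:
// if/else into bb3, then a three-way switch that joins at bb7.
inline Cdfg build_example() {
    Cdfg g;
    int cond1 = g.add_block("cond1");
    int bb1 = g.add_block("bb1");
    int bb2 = g.add_block("bb2");
    int bb3 = g.add_block("bb3");
    int test1 = g.add_block("test1");
    int bb4 = g.add_block("bb4");
    int bb5 = g.add_block("bb5");
    int bb6 = g.add_block("bb6");
    int bb7 = g.add_block("bb7");
    g.add_edge(cond1, bb1);  // T
    g.add_edge(cond1, bb2);  // F
    g.add_edge(bb1, bb3);
    g.add_edge(bb2, bb3);
    g.add_edge(bb3, test1);
    for (int c : {bb4, bb5, bb6}) {  // cases c1, c2, c3
        g.add_edge(test1, c);
        g.add_edge(c, bb7);
    }
    return g;
}
```

The resulting graph has nine control-flow nodes; `cond1` has two successors, `test1` has three, and `bb7` has none. Each block's `ops` vector is where the second-level data flow graph for that block would be stored.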

24 SSDM Features Different from Software IR
• Top-level: netlist of concurrent processes
• Process port/interface semantics
  - FIFO: FifoRead() / FifoWrite()
  - BUFF: BuffRead() / BuffWrite()
  - Memory: MemRead() / MemWrite()
• Bit vector manipulation
  - Bit extraction / concatenation / insertion
  - Bit-width property for every value
• Cycle-level notation
  - Scheduling / binding information / delay
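The bit-vector operations listed above (extraction, concatenation, insertion) can be illustrated on 64-bit words in plain C++. The helper names `low_mask`, `bit_extract`, `bit_concat`, and `bit_insert` are hypothetical and are not SSDM operations; the sketch only shows what each operation computes, with bit ranges written `[hi:lo]` and both ends inclusive.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `width` bits set (width in 1..64).
inline std::uint64_t low_mask(unsigned width) {
    assert(width >= 1 && width <= 64);
    return width == 64 ? ~std::uint64_t{0}
                       : ((std::uint64_t{1} << width) - 1);
}

// Extraction: bits [hi:lo] of `value`, right-aligned.
inline std::uint64_t bit_extract(std::uint64_t value, unsigned hi, unsigned lo) {
    assert(hi >= lo && hi < 64);
    return (value >> lo) & low_mask(hi - lo + 1);
}

// Concatenation: {high, low}, where `low` is `low_width` bits wide.
inline std::uint64_t bit_concat(std::uint64_t high, std::uint64_t low,
                                unsigned low_width) {
    assert(low_width >= 1 && low_width < 64);
    return (high << low_width) | (low & low_mask(low_width));
}

// Insertion: overwrite bits [hi:lo] of `value` with `field`.
inline std::uint64_t bit_insert(std::uint64_t value, std::uint64_t field,
                                unsigned hi, unsigned lo) {
    assert(hi >= lo && hi < 64);
    const std::uint64_t mask = low_mask(hi - lo + 1) << lo;
    return (value & ~mask) | ((field << lo) & mask);
}
```

For instance, `bit_extract(0xABCD, 11, 4)` yields `0xBC`, `bit_concat(0xA, 0x5, 4)` yields `0xA5`, and `bit_insert(0xFF00, 0x3, 3, 2)` yields `0xFF0C`. A hardware IR would additionally carry the bit-width of every value, so that an adder or register can be sized exactly instead of defaulting to a machine word.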

25 Our Architectural Synthesis Approaches – RDR / MCAS
• Consideration of multi-cycle communication during architectural (or behavioral) synthesis
  - Regular Distributed Register (RDR) micro-architecture [Cong et al, ISPD’03]
    - Highly regular
    - Direct support of multi-cycle on-chip communication
  - MCAS: Architectural Synthesis for Multi-cycle Communication
    - Efficiently maps the behavioral descriptions to RDR uArch
    - Integrates architectural synthesis (e.g. resource binding, scheduling) with physical planning

26 RDR/MCAS: Support for Heterogeneous Integration with Multi-cycle Communication & Automatic Interconnect Pipelining
• Distribute registers to each “island”
• Choose the island size such that
  - Single cycle for intra-island computation and communication
  - Multi-cycle communication between islands
• Support interconnect pipelining
  - Inter-island pipeline register station (PRS) for global communications
  - PRS performs autonomous store-and-forward
• MCAS: Multi-cycle architectural synthesis integrated with global placement
• Experimental results
  - MCAS vs. conventional flow:
    - 36% reduction in clock period
    - 30% reduction in total latency
  - MCAS-Pipe vs. MCAS:
    - 28.8% long global wirelength reduction
    - 19.3% total wirelength reduction
• Can also support IP integration using latency insensitive technique [Carloni, ICCAD’99]
[Figure: RDR island grid with labels Pipeline Register Station (PRS), LCC, FSM, IP Library, Adaptor, Reg. File, V channel, H channel; islands numbered 1-4]
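The island idea (single-cycle communication inside an island, multi-cycle communication between islands) can be made concrete with a toy cost model. The slides do not give the actual delay model, so the one below is purely an assumption for illustration: islands sit on a grid, and each island boundary crossed costs one extra clock cycle, i.e. the Manhattan distance between the two islands. The names `Island` and `inter_island_cycles` are hypothetical.

```cpp
#include <cstdlib>

// Hypothetical island coordinate on the RDR grid.
struct Island {
    int row;
    int col;
};

// Assumed cost model (not taken from the slides): one extra clock cycle per
// island boundary crossed, measured as Manhattan distance. A result of 0
// means the transfer stays inside one island and fits in the same cycle as
// the computation.
inline int inter_island_cycles(Island src, Island dst) {
    return std::abs(src.row - dst.row) + std::abs(src.col - dst.col);
}
```

Under this assumption, a transfer from island (0, 0) to island (2, 1) takes three extra cycles, which a scheduler would have to budget between the producing and consuming operations. This is the reason the flow integrates scheduling with placement: the cycle count of a transfer is unknown until both endpoints have locations.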

27 Synthesis Flow: MCAS-Pipe System
[Figure: flow diagram with stages labeled C / VHDL, CDFG generation, CDFG, Resource allocation & Functional unit binding, ICG, Scheduling-driven placement, Locations, Placement-driven rescheduling & rebinding, Global interconnect sharing, Register and port binding, Datapath & FSM generation, RTL VHDL & Floorplan constraints]
• Global interconnect sharing
  - Enable multiple data communications to share one physical link (a wire with pipeline registers)
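One way to reason about sharing a physical link among several data communications is as an interval-partitioning problem: transfers whose occupancy intervals do not overlap can use the same link. The sketch below is a generic greedy algorithm in plain C++, not the MCAS-Pipe algorithm. It assumes each transfer occupies a link for a half-open cycle interval `[start, end)`, and the names `Transfer` and `links_needed` are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical transfer: occupies a link during cycles [start, end).
struct Transfer {
    int start;
    int end;
};

// Greedy interval partitioning: process transfers by start cycle and reuse
// the link that frees up earliest whenever it is free in time. Returns the
// number of physical links needed under the stated assumption.
inline int links_needed(std::vector<Transfer> transfers) {
    std::sort(transfers.begin(), transfers.end(),
              [](const Transfer& a, const Transfer& b) {
                  return a.start < b.start;
              });

    // Min-heap of the cycle at which each allocated link becomes free.
    std::priority_queue<int, std::vector<int>, std::greater<int>> busy_until;

    for (const Transfer& t : transfers) {
        if (!busy_until.empty() && busy_until.top() <= t.start) {
            busy_until.pop();  // reuse this link
        }
        busy_until.push(t.end);
    }
    return static_cast<int>(busy_until.size());
}
```

With transfers `[0, 2)`, `[1, 3)`, and `[2, 4)`, two links suffice: the third transfer reuses the link freed by the first. A real flow would also weigh the multiplexing cost of sharing and the positions of the endpoints, which this sketch ignores.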

28 Related Publications
• Regular distributed register (RDR) architecture and MCAS synthesis algorithms
  - ISPD’03, ICCAD’03
• RDR-Pipe and MCAS-Pipe synthesis algorithms
  - DAC’04
• Lopass: high-level synthesis for low-power FPGAs
  - ISLPED’03
• Multiplexor optimization through register/port binding
  - ASPDAC’04
• Bitwidth-aware scheduling and binding algorithms
  - ASPDAC’05

29 Conclusions
• Higher-level abstraction is needed in the current SO(P)C design flow
  - SystemC is becoming the SLD standard; TLM in particular is widely used
  - Metropolis is a platform-based design framework
  - It is time to build a new generation of behavioral synthesis systems starting from TLM
• xPilot:
  - Ongoing project
  - An architectural synthesis infrastructure from TLM to RTL (FPGAs)

