1
GanesanP91
Synthesis for Partially Reconfigurable Computing Systems
Satish Ganesan, Abhijit Ghosh, Ranga Vemuri
Digital Design Environments Laboratory, Dept of ECECS, University of Cincinnati
[satish, ranga]@ececs.uc.edu
This work is sponsored in part by the US Air Force, Wright Laboratory, WPAFB, under contract number F33615-97-C-1043.
2
Synthesis System Overview
Diagram: Input Specification (VHDL / C), Translator, High-level Synthesis, Dynamic Reconfiguration Set Generation, Logic Elaboration, Layout Synthesis, Host-side Controller, Partially Reconfigurable FPGA.
3
Target Architecture Model
Diagram: a device divided into two portions, P1 and P2.
Features:
Partially reconfigurable device, where a portion of the device can be reconfigured while the remaining part is still operational.
Target device is split into two parts: P1 and P2.
Design is split into sequential blocks and loaded on the two portions of the device.
Reconfiguration of one block is overlapped with execution of another.
4
Input Specification
Behavior specification in a VHDL/C subset, translated into an intermediate representation.
Intermediate representation (diagram): a graph of behavior blocks, Block 1 through Block 6.
Behavior block input format:
Single thread of control.
Each block performs a set of computations.
Data transfer through branch interface.
Supports control constructs.
5
High-level Synthesis (HLS)
Diagram: the high-level synthesis engine takes the input specification (behavior blocks), the RTL component library, and area / timing constraints, and produces a register-transfer level design (RTL blocks).
Engine steps: scheduling, allocation, binding.
6
High-level Synthesis (HLS)
Each behavior block in the block graph is synthesized separately.
Diagram: HLS maps Block 1 through Block 6 to RTL Blk 1 through RTL Blk 6.
7
RTL Model
Diagram: a DESIGN made up of a DATAPATH (net-list of components) and a CONTROLLER (finite state machine), with Flags and Controls between them and external signals I/O, Clock, Reset, Start, and Finish.
Glushkovian model.
Components in the datapath implement the operations specified in the behavior.
The controller (FSM) provides the necessary controls for execution.
HLS generates 4 signals: Clock (in), Reset (in), Start (in), Finish (out).
8
Dynamic Reconfiguration
Diagram: DR takes the RTL block graph (RTL Blk 1 through RTL Blk 6) and produces the sequence RTL Blk 1, RTL Blk 2, RTL Blk 3|4, RTL Blk 5, RTL Blk 6.
Input: RTL block graph, with each block having been separately synthesized.
Output: sequence of reconfiguration sets.
Each reconfiguration set has two blocks: one is reconfigured while the other executes.
Intermediate data between blocks is stored in board registers.
9
Dynamic Reconfiguration: Example
Step 1: RTL Block 1 is loaded on the device.
Step 2: RTL Block 1 is executed; RTL Block 2 is configured.
Step 3: RTL Block 1 completes execution; RTL Block 3 is reconfigured in place of RTL Block 1; RTL Block 2 is executed.
Step 4: Repeat Steps 2 and 3 until all RTL blocks have been loaded and executed.
Diagram: RTL Blk 1 through RTL Blk 5.
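The steps above can be sketched as a small generator of reconfiguration sets. This is a minimal illustration, assuming a linear block order with no conditionals. The names `reconfiguration_sets` and `region_of` are hypothetical and are not part of the authors' DRSG tool, and the strict P1/P2 alternation is an assumption drawn from the two-part target model.

```python
from typing import List, Optional, Tuple

# One reconfiguration set: (block executing, block being configured).
# None marks the first set (nothing executing yet) and the last set
# (nothing left to configure).
ReconfigSet = Tuple[Optional[str], Optional[str]]


def reconfiguration_sets(blocks: List[str]) -> List[ReconfigSet]:
    """Pair each executing block with the block configured alongside it."""
    sets: List[ReconfigSet] = []
    for i in range(len(blocks) + 1):
        executing = blocks[i - 1] if i > 0 else None
        configuring = blocks[i] if i < len(blocks) else None
        sets.append((executing, configuring))
    return sets


def region_of(position: int) -> str:
    """Assumed placement: blocks alternate between the two device portions."""
    return ("P1", "P2")[position % 2]


if __name__ == "__main__":
    order = ["RTL Blk 1", "RTL Blk 2", "RTL Blk 3", "RTL Blk 4", "RTL Blk 5"]
    for executing, configuring in reconfiguration_sets(order):
        print(f"execute: {executing}, configure: {configuring}")
```

Under this alternation the third block lands in the same portion as the first, which matches Step 3, where RTL Block 3 is reconfigured in place of RTL Block 1.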
10
Latency Improvement
Latency of the design without the DRSG approach:
$$L_1 = \sum_{i=1}^{n} (R_i + E_i)$$
Latency of the design with the DRSG approach:
$$L_2 = R_1 + \sum_{i=1}^{n} \max(R_{i+1}, E_i)$$
where R_i is the reconfiguration time of the i-th block and E_i is the execution time of the i-th block. No block follows the last one, so R_{n+1} is taken as 0.
It is easily seen that L_2 <= L_1, since max(a, b) <= a + b for non-negative times.
Diagram: RTL Blk 1 through RTL Blk 5.
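The two latency expressions can be checked numerically. The sketch below assumes the slide's formulas are sums over the n blocks and that no reconfiguration follows the last block. The function names and the sample timings are illustrative only and do not come from the presentation.

```python
from typing import Sequence


def latency_sequential(rec: Sequence[float], exe: Sequence[float]) -> float:
    """L1: every block is fully reconfigured, then executed, one after another."""
    if len(rec) != len(exe):
        raise ValueError("rec and exe must have one entry per block")
    return sum(r + e for r, e in zip(rec, exe))


def latency_overlapped(rec: Sequence[float], exe: Sequence[float]) -> float:
    """L2: reconfiguration of block i+1 overlaps execution of block i."""
    if len(rec) != len(exe) or not rec:
        raise ValueError("rec and exe must be non-empty and of equal length")
    total = rec[0]  # the first block cannot be overlapped with anything
    for i in range(len(rec)):
        next_rec = rec[i + 1] if i + 1 < len(rec) else 0  # assumed R_{n+1} = 0
        total += max(next_rec, exe[i])
    return total


if __name__ == "__main__":
    rec = [100, 80, 120]  # made-up reconfiguration times
    exe = [90, 150, 60]   # made-up execution times
    print(latency_sequential(rec, exe), latency_overlapped(rec, exe))
```

With these made-up timings the sequential latency is 600 and the overlapped latency is 400; the gap is the reconfiguration time hidden behind execution.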
11
Handling Conditional Constructs
RTL Block 1 is a conditional block.
Either RTL Block 2 or RTL Block 3 is executed, because there is a single thread of control.
There are two approaches to handling conditional branching.
Approach I: host polling
The host waits for the conditional predicate to be evaluated, then loads the appropriate branch.
$$L_1 = R_1 + \sum_{i=1}^{n} \max(R_{i+1}, E_i) + R_j$$
where R_j is the reconfiguration time of the branch that is executed.
Diagram: RTL Blk 1 branching to RTL Blk 2 or RTL Blk 3, followed by RTL Blk 4.
12
Handling Conditional Constructs
Approach II: branch prediction
The host loads one of the branches based on a user-supplied profile.
Latency of the design if the correct branch was loaded:
$$L_1 = R_1 + \sum_{i=1}^{n} \max(R_{i+1}, E_i)$$
If the wrong branch was loaded:
$$L_2 = R_1 + \sum_{i=1}^{n} \max(R_{i+1}, E_i) + R_j$$
where R_j is the reconfiguration time of the branch.
L_1 <= L_2, always.
Diagram: RTL Blk 1 branching to RTL Blk 2 or RTL Blk 3, followed by RTL Blk 4.
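The two conditional-handling approaches can be compared with a short calculation. The sketch below takes the slide formulas at face value: both share the overlapped base latency, and an extra branch reconfiguration R_j is paid either always (host polling) or only on a misprediction (branch prediction). The hit probability `p_hit` is an assumption standing in for the user profile, and all function names and timings are illustrative.

```python
from typing import Sequence


def base_latency(rec: Sequence[float], exe: Sequence[float]) -> float:
    """R_1 plus the sum of max(R_{i+1}, E_i) over the executed path."""
    total = rec[0]
    for i in range(len(rec)):
        next_rec = rec[i + 1] if i + 1 < len(rec) else 0  # assumed R_{n+1} = 0
        total += max(next_rec, exe[i])
    return total


def polling_latency(rec, exe, branch_rec: float) -> float:
    """Approach I: the branch is loaded only after the predicate is known."""
    return base_latency(rec, exe) + branch_rec


def predicted_latency(rec, exe, branch_rec: float, hit: bool) -> float:
    """Approach II: the penalty R_j is paid only when the guess was wrong."""
    return base_latency(rec, exe) + (0 if hit else branch_rec)


def expected_predicted_latency(rec, exe, branch_rec: float, p_hit: float) -> float:
    """Average latency if the profile is right with probability p_hit (assumed)."""
    return base_latency(rec, exe) + (1.0 - p_hit) * branch_rec


if __name__ == "__main__":
    rec, exe = [100, 80, 120], [90, 150, 60]  # made-up timings
    print(polling_latency(rec, exe, 80))
    print(expected_predicted_latency(rec, exe, 80, 0.75))
```

Under these assumptions prediction never does worse than polling, and it matches polling only when the profile is always wrong.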
13
Logic Elaboration (VELAB)
Diagram: the input RTL specification and the RTL component library feed logic elaboration (VELAB), which produces an elaborated net-list file in EDIF format.
Features:
Pre-placed component library to aid layout synthesis.
RTL specification obtained from the HLS tool ASSERTA.
Net-list produced in EDIF format.
14
Layout Synthesis (XACT6000)
Diagram: the input net-list specification feeds layout synthesis (XACT6000), which produces the FPGA bit-stream.
Features:
Manual placement is required to ensure place and route using XACT6000.
Replacement blocks are placed in the same location as the blocks they substitute.
Bitmap files are produced in cal format.
15
Host-side Controller
Diagram: the bitmap files and the reconfiguration set sequence feed the host-side controller, giving an RTR (run-time reconfigurable) implementation of the design.
Features:
Manages the partially reconfigurable FPGA device.
Loads and executes bitmap files based on the reconfiguration sequence generated by the DRSG phase.
Device used is the Xilinx 6200.
16
Results: Percentage Configuration Time

Design        4x4 2D FFT   4x4 1D DCT   16-tap FIR
Total rec.    929 us       1416 us      338 us
Total exec.   1025 us      2008 us      200 us
Overlap       678 us       1161 us      0 us
Latency       1276 us      2263 us      538 us
% conf.       19.7         11.2         62.8

The table presents the percentage of total time spent only in configuration using the synthesis flow.
The examples show significant improvements in overall latency.
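The reported figures appear to follow two simple relationships, which the slides do not state and which are inferred here from the numbers: latency equals total reconfiguration plus total execution minus overlap, and the configuration percentage is the non-overlapped reconfiguration time divided by latency. A minimal check, with the data copied from the results and the helper name `derived` chosen for illustration:

```python
# (total rec, total exec, overlap, latency, % conf); times in microseconds,
# copied from the results table.
RESULTS = {
    "4x4 2D FFT": (929, 1025, 678, 1276, 19.7),
    "4x4 1D DCT": (1416, 2008, 1161, 2263, 11.2),
    "16-tap FIR": (338, 200, 0, 538, 62.8),
}


def derived(rec_us: float, exec_us: float, overlap_us: float):
    """Recompute latency and % configuration under the inferred relationships."""
    latency = rec_us + exec_us - overlap_us
    pct_conf = 100.0 * (rec_us - overlap_us) / latency
    return latency, pct_conf


for name, (rec, exe, ovl, lat, pct) in RESULTS.items():
    got_lat, got_pct = derived(rec, exe, ovl)
    print(f"{name}: latency {got_lat} us (reported {lat} us), "
          f"% conf {got_pct:.2f} (reported {pct})")
```

The recomputed latencies match the reported ones exactly, and the percentages agree to within 0.1. The FIR design shows no overlap, which is why it spends the largest share of its latency in configuration.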
17
Conclusions and Future Work
Conclusions:
Presented a synthesis system for partially reconfigurable FPGAs.
Proposed a dynamic reconfiguration set generation strategy to improve overall design latency by reducing reconfiguration time.
Results showed a considerable decrease in reconfiguration times.
Future work:
Automate the procedure of generating run-time reconfigurable designs for partially reconfigurable FPGAs.