May 2004 Department of Electrical and Computer Engineering
A NEW GRAPH STRUCTURE FOR HARDWARE-SOFTWARE PARTITIONING OF HETEROGENEOUS SYSTEMS
G. N. Khan and M. Jin
System-on-Chip Research Group, Electrical & Computer Engineering
Ryerson University, Toronto ON M5B 2K3
Hardware-Software (HW/SW) Co-design
- Objective: to design hardware and software together early in the design cycle, producing a more reliable, efficient, first-time-right design within a reasonable time.
Hardware-Software Partitioning
- Assignment of system parts to heterogeneous implementation units (hardware and software)
- Meet constraints (timing) and minimize cost (area, time to market)
- Directly affects the cost and performance of the final system
Specification
- Traditionally written in plain English
- MSC, SDL and SystemC were developed for specification
- Both textual and graphical representations, such as a DAG (Directed Acyclic Graph), are used to describe the system
What is DADGP
- The Directed Acyclic Data dependency Graph with Precedence (DADGP) is an extension of the DAG
- DADGP is a superset of DAG
- Two types of edges:
  1) Weighted dependency edge
  2) Precedence edge
DADGP Example
- An arrow represents a data dependency relationship
- A precedence edge is represented with a line
- A precedence dependency captures only the order of execution between nodes, so such nodes can still be executed in parallel
- Only necessary parallelism is exposed
[Figure: example DADGP with nodes A, B, C and D]
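A minimal sketch of how a DADGP could be held in memory, assuming nodes carry an execution time, data-dependency edges carry a transfer weight, and precedence edges carry only an ordering. The class name `DADGP`, its methods, and the four-node topology are hypothetical; the slide's actual figure is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DADGP:
    """Nodes plus two edge kinds: weighted data dependency and precedence."""
    exec_time: dict = field(default_factory=dict)   # node -> execution time
    data_edges: dict = field(default_factory=dict)  # (src, dst) -> transfer weight
    prec_edges: set = field(default_factory=set)    # (first, second), ordering only

    def add_node(self, name, time):
        self.exec_time[name] = time

    def add_data_edge(self, src, dst, weight):
        self.data_edges[(src, dst)] = weight

    def add_precedence_edge(self, first, second):
        self.prec_edges.add((first, second))

    def successors(self, node):
        """Nodes ordered after `node` by either kind of edge."""
        edges = list(self.data_edges) + list(self.prec_edges)
        return sorted(dst for src, dst in edges if src == node)

    def is_plain_dag(self):
        """Without precedence edges the structure is an ordinary DAG."""
        return not self.prec_edges


# Hypothetical example: B and C both consume A's output, and profiling
# observed B running before C, so B-C is joined by a precedence edge only.
g = DADGP()
for name, t in [("A", 2), ("B", 5), ("C", 3), ("D", 1)]:
    g.add_node(name, t)
g.add_data_edge("A", "B", 4)
g.add_data_edge("A", "C", 4)
g.add_data_edge("B", "D", 1)
g.add_data_edge("C", "D", 1)
g.add_precedence_edge("B", "C")
```

Keeping the two edge kinds in separate containers makes it cheap to drop a precedence edge later without touching the data dependencies.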
Overall System Partitioning Structure
[Flowchart: Specification -> Profiling -> LD Path Search -> Mapping -> Scheduling -> Valid Mapping? (No: back to Mapping; Yes: continue) -> Constraint Satisfied? (No: back to LD Path Search; Yes: Finish)]
System Partitioning Algorithm
i. Profile the specification and build an initial DADGP.
ii. Find the LD path (longest delay path) in the DADGP.
iii. Map LD path nodes to hardware.
iv. Schedule; if the mapping is invalid, go to Step iii.
v. Update the DADGP and calculate the total execution time of the target system.
vi. If the system constraints (specified by the user) are not met, go to Step ii; otherwise quit.
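The loop formed by Steps ii, iii, v and vi can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a plain DAG, omits the scheduling check of Step iv, and takes the total execution time to be the delay of the LD path. The function names `ld_path` and `partition` and all timing numbers are hypothetical.

```python
from functools import lru_cache


def ld_path(succ, time):
    """Return (delay, path) for the longest delay path in a DAG."""
    @lru_cache(maxsize=None)
    def best(node):
        tails = [best(s) for s in succ.get(node, ())]
        delay, path = max(tails, default=(0, ()))
        return delay + time[node], (node,) + path
    return max(best(n) for n in time)


def partition(succ, sw_time, hw_time, deadline):
    """Move LD-path nodes to hardware until the deadline is met.

    Returns (hardware node set, final delay), or None if even an
    all-hardware LD path misses the deadline.
    """
    in_hw = set()
    while True:
        time = {n: hw_time[n] if n in in_hw else sw_time[n] for n in sw_time}
        delay, path = ld_path(succ, time)           # Step ii
        if delay <= deadline:                       # Step vi
            return in_hw, delay
        candidates = [n for n in path if n not in in_hw]
        if not candidates:
            return None

        def delay_if_moved(node):
            trial = dict(time)
            trial[node] = hw_time[node]
            return ld_path(succ, trial)[0]

        # Step iii: pick the node whose move gives the shortest new LD path.
        in_hw.add(min(candidates, key=delay_if_moved))


# Hypothetical task graph and timings (ms).
succ = {"Gx": ["SqX"], "Gy": ["SqY"], "SqX": ["Add"], "SqY": ["Add"]}
sw_time = {"Gx": 10, "Gy": 10, "SqX": 4, "SqY": 4, "Add": 2}
hw_time = {"Gx": 2, "Gy": 2, "SqX": 1, "SqY": 1, "Add": 1}
result = partition(succ, sw_time, hw_time, deadline=10)
```

Restricting the candidates to nodes on the current LD path is what keeps the search space small: nodes off that path cannot shorten the total delay on their own.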
Profiling
The profiler collects the following data:
- Execution time
- Amount of data transfer
- Execution order
- Data dependencies between nodes
Longest Delay Path Search
- Finding the longest delay path in the DADGP is like finding the bottleneck of the system
- It minimizes the search space for mapping
- The longest delay path is the path with the longest execution time
Mapping
- Maps a node to hardware
- Mapping can change the longest delay path, as well as the DADGP
- A mapping is valid if moving that node to hardware gives the shortest longest delay path
Scheduling
- A very simple list scheduling approach
- Schedules the earliest node first without violating the resource limit
- Exposes parallelism and changes the DADGP accordingly
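A generic list scheduler in this spirit might look like the sketch below. It covers only the "earliest node first, within the resource limit" rule and does not model the DADGP changes; the function name `list_schedule`, the `num_pes` parameter, and the sample graph are assumptions made for illustration.

```python
import heapq


def list_schedule(preds, duration, num_pes):
    """Return node -> (start, finish) using at most num_pes processing elements."""
    remaining = {n: len(preds.get(n, ())) for n in duration}
    succs = {n: [] for n in duration}
    for node, ps in preds.items():
        for p in ps:
            succs[p].append(node)

    earliest = {n: 0 for n in duration}
    # Ready list ordered by earliest possible start time.
    ready = [(0, n) for n in sorted(duration) if remaining[n] == 0]
    heapq.heapify(ready)
    pe_free = [0] * num_pes  # min-heap of times at which each PE becomes free
    schedule = {}

    while ready:
        est, node = heapq.heappop(ready)
        start = max(est, heapq.heappop(pe_free))  # wait for data and for a PE
        finish = start + duration[node]
        heapq.heappush(pe_free, finish)
        schedule[node] = (start, finish)
        for s in succs[node]:
            earliest[s] = max(earliest[s], finish)
            remaining[s] -= 1
            if remaining[s] == 0:                 # all predecessors scheduled
                heapq.heappush(ready, (earliest[s], s))
    return schedule


# Hypothetical graph: A feeds B and C, which both feed D.
preds = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
duration = {"A": 2, "B": 5, "C": 3, "D": 1}
two_pe = list_schedule(preds, duration, num_pes=2)
one_pe = list_schedule(preds, duration, num_pes=1)
```

Comparing the finish time of the last node for different `num_pes` values shows how much parallelism the resource limit actually lets the schedule use.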
Summary of DADGP Scheduling
- Start scheduling from the root of the DADGP
- Traverse down the graph and schedule the node with the earliest starting time
- If the node is connected by a precedence dependency edge, check whether exposing parallelism can eliminate that edge
- When an edge is eliminated, the DADGP may split into two DADGPs; their roots are combined under a dummy root node to form a single DADGP
- In case of multiple descendants, schedule them forcibly by adding PEs
- Update the PE resource (HW-SW) library
Constraints
- Deadline and cost constraints are given by the designer
- Hardware cost is calculated by gate count
- A different granularity level should be explored if no solution is found
Edge Detection Example
- A pair of 3 x 3 masks is convolved to estimate the gradients (Gx and Gy) in the x and y directions
[Figure: DADGP for edge detection with nodes Gx, Gy, Gx^2, Gy^2 and Add; legend: data dependency, precedence dependency]
HW-SW Library
- Columns: Operation, SW EXE (ms), HW EXE (ms), HW Area (gates)
- Operations: Gradient (Gx or Gy), Square, Add
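The three operations in the library (Gradient, Square, Add) can be illustrated for a single pixel as below. The slide does not name the masks, so the Sobel pair is used here as an assumption; the function names are hypothetical. The mask is applied without flipping, which for these masks changes only the sign of the gradient and therefore not the squared result.

```python
# Assumed masks: the Sobel pair, a common choice of 3 x 3 gradient masks.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def gradient(image, mask, row, col):
    """Apply a 3 x 3 mask centred on (row, col) of a 2-D list of pixels."""
    return sum(mask[i][j] * image[row - 1 + i][col - 1 + j]
               for i in range(3) for j in range(3))


def edge_strength(image, row, col):
    """Gx^2 + Gy^2 at one interior pixel."""
    gx = gradient(image, SOBEL_X, row, col)  # Gradient node Gx
    gy = gradient(image, SOBEL_Y, row, col)  # Gradient node Gy
    return gx * gx + gy * gy                 # two Square nodes, then Add


# A vertical edge: dark columns on the left, a bright column on the right.
vertical_edge = [[0, 0, 10],
                 [0, 0, 10],
                 [0, 0, 10]]
flat = [[7, 7, 7],
        [7, 7, 7],
        [7, 7, 7]]
```

The two gradient computations share no data with each other, which is the parallelism a partitioner can exploit when deciding which nodes go to hardware.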
Edge Detection Solutions
[Figure: partitioning solutions drawn as DADGPs over the nodes Gx, Gy, Sq X, Sq Y and Add, with edges labelled 0.1]
Performance improvement vs. HW area
Conclusion
- HW-SW partitioning is an NP-hard problem
- Finding the optimal hardware-software partition is very difficult because many factors affect the partitioning decision
- The DADGP structure exposes parallelism
- The complexity of the DADGP partitioning algorithm is approximately n^2 log(n)