A New Graph Structure for Hardware-Software Partitioning of Heterogeneous Systems

Presentation transcript:

A NEW GRAPH STRUCTURE FOR HARDWARE-SOFTWARE PARTITIONING OF HETEROGENEOUS SYSTEMS
G. N. Khan and M. Jin
System-on-Chip Research Group, Electrical & Computer Engineering
Ryerson University, Toronto ON M5B 2K3
May 2004, Department of Electrical and Computer Engineering

Hardware-Software (HW/SW) Co-design
Objective: design the hardware and software together, early in the design cycle, to produce a more reliable, more efficient, first-time-right design within a reasonable time.

Hardware-Software Partitioning
Assignment of system parts to heterogeneous implementation units (hardware and software).
Meet constraints (timing) and minimize cost (area, time to market).
Directly affects the cost and performance of the final system.

Specification
Traditionally written in plain English.
MSC, SDL and SystemC were developed.
Both textual and graphical representations, such as a DAG (Directed Acyclic Graph), are used to describe the system.

What is DADGP
Directed Acyclic Data dependency Graph with Precedence (DADGP) is an extension of the DAG.
DADGP is a superset of DAG.
Two types of edges:
1) Weighted dependency edge
2) Precedence edge
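As a rough illustration of the two edge types, the sketch below stores a DADGP as one node table plus two separate edge collections. The class layout, the method names and every number in the example are assumptions made for illustration; the slides do not specify a representation or the edges between nodes A to D.

```python
from dataclasses import dataclass, field


@dataclass
class DADGP:
    """Hypothetical container: a DAG with weighted and precedence edges."""
    # node name -> execution time (made-up units)
    exec_time: dict = field(default_factory=dict)
    # (src, dst) -> weight of a data dependency edge
    dependency: dict = field(default_factory=dict)
    # (src, dst) pairs that only record execution order
    precedence: set = field(default_factory=set)

    def add_node(self, name, time):
        self.exec_time[name] = time

    def add_dependency(self, src, dst, weight):
        self.dependency[(src, dst)] = weight

    def add_precedence(self, src, dst):
        self.precedence.add((src, dst))

    def successors(self, name):
        """Nodes reachable from `name` over one edge of either kind."""
        edges = set(self.dependency) | self.precedence
        return sorted(dst for src, dst in edges if src == name)

    def as_plain_dag(self):
        """Dropping the precedence edges leaves an ordinary weighted DAG."""
        return dict(self.dependency)


# Made-up example graph; only the node names echo the slides.
g = DADGP()
for name, t in [("A", 2.0), ("B", 1.0), ("C", 3.0), ("D", 1.5)]:
    g.add_node(name, t)
g.add_dependency("A", "B", 4)
g.add_dependency("A", "C", 2)
g.add_precedence("B", "C")
g.add_dependency("C", "D", 1)

print(g.successors("A"))   # ['B', 'C']
print(g.successors("B"))   # ['C']  (via the precedence edge)
```

Keeping the precedence edges in their own set makes it cheap to delete one later without touching the data dependencies, and `as_plain_dag` shows the superset relation: a DADGP with no precedence edges is just a DAG.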

DADGP Example
An arrow represents a data dependence relationship.
A precedence edge is represented with a line.
A precedence dependency captures the order of execution between nodes; such nodes can be executed in parallel.
Only necessary parallelism is exposed.
[Figure: example DADGP with nodes A, B, C and D]

Overall System Partitioning Structure
[Flowchart: Specification, Profiling, LD Path Search, Mapping, Scheduling, decision "Valid Mapping" (Yes/No), decision "Constraint Satisfied" (Yes/No), Finish]

System Partitioning Algorithm
i. Profile the system and build an initial DADGP.
ii. Find the LD path (longest delay path) in the DADGP.
iii. Map LD path nodes to hardware.
iv. Schedule; if the mapping is invalid, go to Step iii.
v. Update the DADGP and calculate the total execution time of the target system.
vi. If the system constraints (specified by the user) are not met, go to Step ii; otherwise quit.
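The steps above can be sketched as a loop. This is a simplified outline, not the authors' implementation: the path search and the timing model are passed in as functions, steps iii to v are folded into "try each software node on the path and keep the move that gives the shortest time", and the function names, the node table layout and all numbers are hypothetical.

```python
def partition(nodes, deadline, find_ld_path, total_time):
    """Move nodes on the longest delay path to hardware until the deadline holds.

    nodes: name -> {"sw": time, "hw": time, "area": gates}
    find_ld_path(nodes, mapping) -> list of node names        (step ii)
    total_time(nodes, mapping)   -> execution time of mapping (steps iv-v)
    Returns the final mapping, or None if the deadline cannot be met.
    """
    mapping = {name: "SW" for name in nodes}        # step i: all software

    def time_if_moved(name):
        return total_time(nodes, {**mapping, name: "HW"})

    while total_time(nodes, mapping) > deadline:    # step vi
        path = find_ld_path(nodes, mapping)         # step ii
        candidates = [n for n in path if mapping[n] == "SW"]
        if not candidates:
            return None                             # nothing left to move
        # step iii: keep the move that gives the shortest resulting time
        best = min(candidates, key=lambda n: (time_if_moved(n), n))
        mapping[best] = "HW"
    return mapping


# Toy stand-ins: a purely serial chain, so every node is on the LD path
# and the total time is the sum of the node times. All numbers are made up.
def whole_chain(nodes, mapping):
    return list(nodes)


def serial_time(nodes, mapping):
    return sum(nodes[n]["hw" if mapping[n] == "HW" else "sw"] for n in nodes)


library = {
    "A": {"sw": 5, "hw": 1, "area": 100},
    "B": {"sw": 3, "hw": 2, "area": 50},
    "C": {"sw": 4, "hw": 1, "area": 80},
}
result = partition(library, deadline=6, find_ld_path=whole_chain,
                   total_time=serial_time)
print(result)   # {'A': 'HW', 'B': 'SW', 'C': 'HW'}
```

In a fuller version, `find_ld_path` would search the current DADGP and `total_time` would come from the scheduler, since both the graph and the longest delay path can change after each mapping.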

Profiling
The profiler collects the following data:
Execution time
Amount of data transfer
Execution order
Data dependencies between nodes

Longest Delay Path Search
Finding the longest delay path in the DADGP is like finding the bottleneck of the system.
It minimizes the search space for mapping.
The longest delay path is the longest execution path.
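One common way to find a longest path in a DAG is dynamic programming over a topological order, sketched below. The slides do not say which search the authors use, so treat this as an assumed technique; the delay model (node execution time plus an edge delay) and the example numbers are also assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def longest_delay_path(exec_time, edges):
    """Return (total_delay, path) for the heaviest path in a DAG.

    exec_time: node -> execution time
    edges: (src, dst) -> delay charged on that edge
    """
    preds = {n: set() for n in exec_time}
    for src, dst in edges:
        preds[dst].add(src)

    best = {}        # node -> largest delay of any path ending at node
    came_from = {}   # node -> previous node on that path (None at a start)
    # static_order() yields every node after all of its predecessors
    for node in TopologicalSorter(preds).static_order():
        best[node], came_from[node] = exec_time[node], None
        for p in sorted(preds[node]):
            candidate = best[p] + edges[(p, node)] + exec_time[node]
            if candidate > best[node]:
                best[node], came_from[node] = candidate, p

    # Walk back from the node with the largest accumulated delay.
    node = max(best, key=best.get)
    total, path = best[node], []
    while node is not None:
        path.append(node)
        node = came_from[node]
    return total, path[::-1]


# Made-up graph: A feeds B and C, both feed D.
times = {"A": 2, "B": 1, "C": 3, "D": 1}
delays = {("A", "B"): 1, ("A", "C"): 0, ("B", "D"): 1, ("C", "D"): 2}
print(longest_delay_path(times, delays))   # (8, ['A', 'C', 'D'])
```

Each node and edge is visited once, so this search is linear in the size of the graph.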

Mapping
Maps a node to hardware.
Mapping can change the longest delay path, as well as the DADGP.
A mapping is valid if mapping that node to hardware gives the shortest longest delay path.

Scheduling
Very simple list scheduling approach.
Schedules the earliest node first without violating the resource limit.
Exposes parallelism and changes the DADGP accordingly.
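A minimal list scheduler in this spirit might look like the sketch below. It is an assumed generic version: the resource limit is modelled as a fixed number of identical processing elements (PEs), ties are broken by node name, and the example graph is made up. It does not update a DADGP or remove precedence edges.

```python
def list_schedule(exec_time, preds, num_pes):
    """Greedy list scheduling; returns node -> (start, finish, pe)."""
    pe_free = [0] * num_pes   # time at which each PE becomes idle
    done = {}                 # node -> (start, finish, pe)
    remaining = set(exec_time)

    def inputs_ready(node):
        # Latest finish time among the node's predecessors (0 if none).
        return max((done[p][1] for p in preds.get(node, ())), default=0)

    while remaining:
        # A node is ready once all of its predecessors are scheduled.
        ready = [n for n in remaining
                 if all(p in done for p in preds.get(n, ()))]
        if not ready:
            raise ValueError("graph contains a cycle")
        # Take the PE that frees up first, then the ready node that can
        # start earliest on it.
        pe = min(range(num_pes), key=lambda i: pe_free[i])
        node = min(ready, key=lambda n: (max(inputs_ready(n), pe_free[pe]), n))
        start = max(inputs_ready(node), pe_free[pe])
        finish = start + exec_time[node]
        done[node] = (start, finish, pe)
        pe_free[pe] = finish
        remaining.remove(node)
    return done


times = {"A": 2, "B": 1, "C": 3, "D": 1}
preds = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

two_pes = list_schedule(times, preds, num_pes=2)
one_pe = list_schedule(times, preds, num_pes=1)
print(max(f for _, f, _ in two_pes.values()))   # 6
print(max(f for _, f, _ in one_pe.values()))    # 7
```

With two PEs, B and C overlap after A finishes, which is the parallelism the scheduler exposes; with one PE they run back to back.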

Summary of DADGP Scheduling
Start scheduling from the root of the DADGP.
Traverse down the tree and schedule the node with the earliest starting time.
If the node is connected by a precedence dependency edge, check whether exposing parallelism can eliminate that edge.
When an edge is eliminated, the DADGP structure may split into two DADGPs. The roots of the two DADGPs are combined to form a single DADGP with a dummy root node.
In case of multiple descendants, schedule them forcibly by adding PEs.
Update the PE resource (HW-SW) library.

Constraints
Deadline and cost constraints are given by the designer.
Hardware cost is calculated by gate count.
A different granularity level should be explored if no solution is found.

Edge Detection Example
A pair of 3 x 3 masks is convolved with the image to estimate the gradients (Gx and Gy) in the x and y directions.
[Figure: DADGP for edge detection with nodes Gx, Gy, Gx^2, Gy^2 and Add; legend: data dependency, precedence dependency]
[Table: HW-SW Library with columns Operation, SW EXE (ms), HW EXE (ms), HW Area (gates); rows Gradient (Gx or Gy), Square, Add]
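The three operations in the library (Gradient, Square, Add) can be illustrated in plain Python as below. The slides do not name the masks, so the Sobel masks used here are an assumption, as are the function names and the test image.

```python
# Assumed 3 x 3 masks (Sobel); the slides do not say which pair is used.
MASK_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
MASK_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_mask(image, mask, row, col):
    """Weighted sum of the 3 x 3 neighbourhood centred on (row, col).

    The mask is not flipped, so strictly this is a correlation. For these
    masks flipping only changes the sign, which the squaring step removes.
    """
    return sum(
        mask[i][j] * image[row - 1 + i][col - 1 + j]
        for i in range(3)
        for j in range(3)
    )


def gradient_squared(image):
    """Gx^2 + Gy^2 for every interior pixel; border pixels are skipped."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            gx = apply_mask(image, MASK_X, r, c)   # Gradient (Gx)
            gy = apply_mask(image, MASK_Y, r, c)   # Gradient (Gy)
            out_row.append(gx * gx + gy * gy)      # Square, Square, Add
        out.append(out_row)
    return out


# Made-up 4 x 4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
print(gradient_squared(image))   # [[1600, 1600], [1600, 1600]]
```

Inside the loop, `gx` and `gy` do not depend on each other and neither do their squares; only the final addition needs both results. That matches the node set of the example graph, where the two gradient branches are candidates for parallel execution.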

Edge Detection Solutions
[Figure: alternative partitioning solutions for the edge detection DADGP, each drawn with nodes Gx, Gy, Sq X, Sq Y and Add; an edge weight of 0.1 is marked]

[Figure: Performance improvement vs. HW area]

Conclusion
HW-SW partitioning is an NP-hard problem.
Finding the optimal hardware-software partition is very difficult because many factors affect the partitioning decision.
The DADGP structure exposes parallelism.
The complexity of the DADGP partitioning algorithm is approximately n^2 log(n).