Chapter 5B: Hardware/Software Codesign / Partitioning EECE **** Embedded System Design.

Slides:



Advertisements
Similar presentations
© 2004 Wayne Wolf Topics Task-level partitioning. Hardware/software partitioning.  Bus-based systems.
Advertisements

Hardware/ Software Partitioning 2011 年 12 月 09 日 Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 These.
ECE-777 System Level Design and Automation Hardware/Software Co-design
EECE **** Embedded System Design
ECOE 560 Design Methodologies and Tools for Software/Hardware Systems Spring 2004 Serdar Taşıran.
1 EE5900 Advanced Embedded System For Smart Infrastructure Static Scheduling.
ISBN Chapter 3 Describing Syntax and Semantics.
Fault Detection in a HW/SW CoDesign Environment Prepared by A. Gaye Soykök.
- 1 -  P. Marwedel, Univ. Dortmund, Informatik 12, 05/06 Universität Dortmund Hardware/Software Codesign.
1 ENERGY: THE ROOT OF ALL PERVASIVENESS Anthony Ephremides University of Maryland April 29, 2004.
1 HW/SW Partitioning Embedded Systems Design. 2 Hardware/Software Codesign “Exploration of the system design space formed by combinations of hardware.
Compiler Challenges, Introduction to Data Dependences Allen and Kennedy, Chapter 1, 2.
CS244-Introduction to Embedded Systems and Ubiquitous Computing Instructor: Eli Bozorgzadeh Computer Science Department UC Irvine Winter 2010.
ECE Synthesis & Verification - Lecture 3 1 ECE 697B (667) Spring 2006 ECE 697B (667) Spring 2006 Synthesis and Verification of Digital Circuits Scheduling.
Courseware High-Level Synthesis an introduction Prof. Jan Madsen Informatics and Mathematical Modelling Technical University of Denmark Richard Petersens.
Process Scheduling for Performance Estimation and Synthesis of Hardware/Software Systems Slide 1 Process Scheduling for Performance Estimation and Synthesis.
CS 104 Introduction to Computer Science and Graphics Problems
Chapter 13 Embedded Systems
Scheduling with Optimized Communication for Time-Triggered Embedded Systems Slide 1 Scheduling with Optimized Communication for Time-Triggered Embedded.
System Partitioning Kris Kuchcinski
Recovering Articulated Object Models from 3D Range Data Dragomir Anguelov Daphne Koller Hoi-Cheung Pang Praveen Srinivasan Sebastian Thrun Computer Science.
Mahapatra-Texas A&M-Fall'001 Partitioning - I Introduction to Partitioning.
Planning operation start times for the manufacture of capital products with uncertain processing times and resource constraints D.P. Song, Dr. C.Hicks.
A Tool for Partitioning and Pipelined Scheduling of Hardware-Software Systems Karam S Chatha and Ranga Vemuri Department of ECECS University of Cincinnati.
Describing Syntax and Semantics
Winter-Spring 2001Codesign of Embedded Systems1 Introduction to HW/SW Co-Synthesis Algorithms Part of HW/SW Codesign of Embedded Systems Course (CE )
A New Approach for Task Level Computational Resource Bi-Partitioning Gang Wang, Wenrui Gong, Ryan Kastner Express Lab, Dept. of ECE, University of California,
Hardware/Software Partitioning Witawas Srisa-an Embedded Systems Design and Implementation.
Technische Universität Dortmund Classical scheduling algorithms for periodic systems Peter Marwedel TU Dortmund, Informatik 12 Germany 2007/12/14.
Universität Dortmund  P. Marwedel, Univ. Dortmund, Informatik 12, 2003 Hardware/software partitioning  Functionality to be implemented in software.
- 1 - EE898-HW/SW co-design Hardware/Software Codesign “Finding right combination of HW/SW resulting in the most efficient product meeting the specification”
Technion – Israel Institute of Technology Department of Electrical Engineering High Speed Digital Systems Lab Spring 2009.
Task Alloc. In Dist. Embed. Systems Murat Semerci A.Yasin Çitkaya CMPE 511 COMPUTER ARCHITECTURE.
Lecture 13 Introduction to Embedded Systems Graduate Computer Architecture Fall 2005 Shih-Hao Hung Dept. of Computer Science and Information Engineering.
May 2004 Department of Electrical and Computer Engineering 1 ANEW GRAPH STRUCTURE FOR HARDWARE- SOFTWARE PARTITIONING OF HETEROGENEOUS SYSTEMS A NEW GRAPH.
SOFTWARE / HARDWARE PARTITIONING TECHNIQUES SHaPES: A New Approach.
Hardware-Software Co-partitioning for Distributed Embedded Systems.
Hardware/Software Co-design Design of Hardware/Software Systems A Class Presentation for VLSI Course by : Akbar Sharifi Based on the work presented in.
1 Nasser Alsaedi. The ultimate goal for any computer system design are reliable execution of task and on time delivery of service. To increase system.
- 1 - EE898_HW/SW Partitioning Hardware/software partitioning  Functionality to be implemented in software or in hardware? No need to consider special.
Hardware/Software Codesign of Embedded Systems
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
CS244-Introduction to Embedded Systems and Ubiquitous Computing Instructor: Eli Bozorgzadeh Computer Science Department UC Irvine Winter 2010.
Resource Mapping and Scheduling for Heterogeneous Network Processor Systems Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea.
C OMPARING T HREE H EURISTIC S EARCH M ETHODS FOR F UNCTIONAL P ARTITIONING IN H ARDWARE -S OFTWARE C ODESIGN Theerayod Wiangtong, Peter Y. K. Cheung and.
Universität Dortmund Chapter 6A: Validation Simulation and test pattern generation (TPG) EECE **** Embedded System Design.
6. A PPLICATION MAPPING 6.3 HW/SW partitioning 6.4 Mapping to heterogeneous multi-processors 1 6. Application mapping (part 2)
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 March 03, 2005 Session 15.
Vertex Coloring Distributed Algorithms for Multi-Agent Networks
Liang, Introduction to Java Programming, Sixth Edition, (c) 2007 Pearson Education, Inc. All rights reserved Chapter 23 Algorithm Efficiency.
Pipelined and Parallel Computing Partition for 1 Hongtao Du AICIP Research Dec 1, 2005 Part 2.
Winter-Spring 2001Codesign of Embedded Systems1 Co-Synthesis Algorithms: Distributed System Co- Synthesis Part of HW/SW Codesign of Embedded Systems Course.
Physically Aware HW/SW Partitioning for Reconfigurable Architectures with Partial Dynamic Reconfiguration Sudarshan Banarjee, Elaheh Bozorgzadeh, Nikil.
ICS 353: Design and Analysis of Algorithms Backtracking King Fahd University of Petroleum & Minerals Information & Computer Science Department.
Jamie Unger-Fink John David Eriksen.  Allocation and Scheduling Problem  Better MPSoC optimization tool needed  IP and CP alone not good enough  Communication.
Multi-cellular paradigm The molecular level can support self- replication (and self- repair). But we also need cells that can be designed to fit the specific.
Pradeep Konduri Static Process Scheduling:  Proceedance process model  Communication system model  Application  Dicussion.
Dynamo: A Runtime Codesign Environment
The Hardware / Software Tradeoff -John Burnette-
A Methodology for System-on-a-Programmable-Chip Resources Utilization
Chapter 1: Introduction
CSCI1600: Embedded and Real Time Software
Algorithm An algorithm is a finite set of steps required to solve a problem. An algorithm must have following properties: Input: An algorithm must have.
Integer Programming (정수계획법)
Richard Anderson Lecture 6 Greedy Algorithms
Integer Programming (정수계획법)
Richard Anderson Lecture 7 Greedy Algorithms
Richard Anderson Winter 2019 Lecture 7
CSCI1600: Embedded and Real Time Software
Richard Anderson Autumn 2019 Lecture 7
Presentation transcript:

Chapter 5B: Hardware/Software Codesign / Partitioning EECE **** Embedded System Design

- 2 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Hardware/Software Codesign

- 3 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Hardware/software partitioning  Functionality to be implemented in software or in hardware? No need to consider special purpose hardware in the long run? Correct for fixed functionality, but wrong in general, since “By the time MPEG-n can be implemented in software, MPEG-n+1 has been invented” [de Man]

- 4 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Functionality to be implemented in software or in hardware? Decision based on hardware/ software partitioning, a special case of hardware/ software codesign.

- 5 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Hardware/software codesign: approach Processor P1 Processor P2 Hardware Specification Mapping

- 6 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science An IP model for HW/SW partitioning Notation: Index set I denotes task graph nodes. Index set L denotes task graph node types e.g. square root, DCT or FFT Index set KH denotes hardware component types. e.g. hardware components for the DCT or the FFT. Index set J of hardware component instances Index set KP denotes processors. All processors are assumed to be of the same type Notation: Index set I denotes task graph nodes. Index set L denotes task graph node types e.g. square root, DCT or FFT Index set KH denotes hardware component types. e.g. hardware components for the DCT or the FFT. Index set J of hardware component instances Index set KP denotes processors. All processors are assumed to be of the same type

- 7 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science An IP model for HW/SW partitioning X i,k : =1 if node v i is mapped to hardware component type k  KH and 0 otherwise. Y i,k : =1 if node v i is mapped to processor k  KP and 0 otherwise. NY ℓ,k =1 if at least one node of type ℓ is mapped to processor k  KP and 0 otherwise. T is a mapping from task graph nodes to their types: T: I  L The cost function accumulates the cost of hardware units: C = cost(processors) + cost(memories) + cost(application specific hardware) X i,k : =1 if node v i is mapped to hardware component type k  KH and 0 otherwise. Y i,k : =1 if node v i is mapped to processor k  KP and 0 otherwise. NY ℓ,k =1 if at least one node of type ℓ is mapped to processor k  KP and 0 otherwise. T is a mapping from task graph nodes to their types: T: I  L The cost function accumulates the cost of hardware units: C = cost(processors) + cost(memories) + cost(application specific hardware)

- 8 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Scheduling Processor p 1 ASIC h 1 FIR 1 FIR 2 v1v1 v2v2 v3v3 v4v4 v9v9 v 10 v 11 v5v5 v6v6 v7v7 v8v8 e3e3 e4e4 t p1p1 v8v8 v7v7 v7v7 v8v8 or... t c1c1 or... e3e3 e3e3 e4e4 e4e4 t FIR 2 on h 1 v4v4 v3v3 v3v3 v4v4 or... Communication channel c 1

- 9 - Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Scheduling / precedence constraints For all nodes v i1 and v i2 that are potentially mapped to the same processor or hardware component instance, introduce a binary decision variable b i1,i2 with b i1,i2 =1 if v i1 is executed before v i2 and = 0 otherwise. Define constraints of the type (end-time of v i1 )  (start time of v i2 ) if b i1,i2 =1 and (end-time of v i2 )  (start time of v i1 ) if b i1,i2 =0 Ensure that the schedule for executing operations is consistent with the precedence constraints in the task graph. For all nodes v i1 and v i2 that are potentially mapped to the same processor or hardware component instance, introduce a binary decision variable b i1,i2 with b i1,i2 =1 if v i1 is executed before v i2 and = 0 otherwise. Define constraints of the type (end-time of v i1 )  (start time of v i2 ) if b i1,i2 =1 and (end-time of v i2 )  (start time of v i1 ) if b i1,i2 =0 Ensure that the schedule for executing operations is consistent with the precedence constraints in the task graph.

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Example HW types H1, H2 and H3 with costs of 20, 25, and 30. Processors of type P. Tasks T1 to T5. Execution times: HW types H1, H2 and H3 with costs of 20, 25, and 30. Processors of type P. Tasks T1 to T5. Execution times: TH1H2H3P

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Operation assignment constraints (1) TH1H2H3P X 1,1 +Y 1,1 =1 (task 1 mapped to H1 or to P) X 2,2 +Y 2,1 =1 X 3,3 +Y 3,1 =1 X 4,3 +Y 4,1 =1 X 5,1 +Y 5,1 =1

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Operation assignment constraints (2) Assume types of tasks are ℓ =1, 2, 3, 3, and 1.  ℓ  L,  i:T(v i )=c ℓ,  k  KP: NY ℓ,k  Y i,k Assume types of tasks are ℓ =1, 2, 3, 3, and 1.  ℓ  L,  i:T(v i )=c ℓ,  k  KP: NY ℓ,k  Y i,k Functionality 3 to be implemented on processor if node 4 is mapped to it.

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Other equations Time constraints leading to: Application specific hardware required for time constraints under 100 time units. TH1H2H3P Cost function: C=20 #(H1) + 25 #(H2) + 30 # (H3) + cost(processor) + cost(memory)

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Result For a time constraint of 100 time units and cost(P)<cost(H3): TH1H2H3P Solution (educated guessing) : T1  H1 T2  H2 T3  P T4  P T5  H1

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Separation of scheduling and partitioning Combined scheduling/partitioning very complex;  Heuristic: Compute estimated schedule Perform partitioning for estimated schedule Perform final scheduling If final schedule does not meet time constraint, go to 1 using a reduced overall timing constraint. Combined scheduling/partitioning very complex;  Heuristic: Compute estimated schedule Perform partitioning for estimated schedule Perform final scheduling If final schedule does not meet time constraint, go to 1 using a reduced overall timing constraint. 2 nd Iteration t specification Actual execution time 1 st Iteration approx. execution time t Actual execution time approx. execution time New specification

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Application example Audio lab (mixer, fader, echo, equalizer, balance units); slow SPARC processor 1µ ASIC library Allowable delay of µs (~ 44.1 kHz) Audio lab (mixer, fader, echo, equalizer, balance units); slow SPARC processor 1µ ASIC library Allowable delay of µs (~ 44.1 kHz) SPARC processor ASIC (Compass, 1 µ) External memory Outdated technology; just a proof of concept.

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Running time for COOL optimization  Only simple models can be solved optimally.

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Deviation from optimal design  Hardly any loss in design quality.

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Running time for heuristic

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Design space for audio lab Everything in software: 72.9 µs, 0 2 Everything in hardware: 3.06 µs, 457.9x Lowest cost for given sample rate: 18.6 µs, 78.4x10 6 2,

Universität Dortmund Department of Electrical and Computer Engineering College of Engineering, Technology and Computer Science Final remarks COOL approach:  shows that formal model of hardware/SW codesign is beneficial; IP modeling can lead to useful implementation even if optimal result is available only for small designs. Other approaches for HW/SW partitioning:  starting with everything mapped to hardware; gradually moving to software as long as timing constraint is met.  starting with everything mapped to software; gradually moving to hardware until timing constraint is met.  Binary search. COOL approach:  shows that formal model of hardware/SW codesign is beneficial; IP modeling can lead to useful implementation even if optimal result is available only for small designs. Other approaches for HW/SW partitioning:  starting with everything mapped to hardware; gradually moving to software as long as timing constraint is met.  starting with everything mapped to software; gradually moving to hardware until timing constraint is met.  Binary search.