- 1 -  P. Marwedel, Univ. Dortmund, Informatik 12, 2003 Universität Dortmund Hardware/Software Codesign.

Presentation transcript:

- 1 - Hardware/Software Codesign

- 2 - Design productivity gap

- 3 - © Lauro Rizzatti, Marketing Vice President, Emulation & Verification Engineering (EVE)

- 4 - Reuse as a way out
Pre-designed standard components to be used:
- Standard software components
- Standard hardware components
This leads to platform-based design.

- 5 - Platform-based design
A platform is a family of architectures satisfying a set of constraints imposed to allow the reuse of hardware and software components. However, a hardware platform is not enough. Quick, reliable, derivative design requires using a platform application programming interface (API) to extend the platform toward application software. In general, a platform is an abstraction layer that covers many possible refinements to a lower level. Platform-based design is a meet-in-the-middle approach: In the top-down design flow, designers map an instance of the upper platform to an instance of the lower, and propagate design constraints [Sangiovanni-Vincentelli, 2002].

- 6 - Iterative approach (1): guided by performance evaluation

- 7 - Essentially the same as our flow … Mapping

- 8 - Iterative approach: SpecC model

- 9 - Overview of design activities
Task-level concurrency management: Which tasks in the final system?
High-level transformations: Transformations that are outside the scope of traditional compilers
Hardware/software partitioning: Which operations are mapped to hardware, which to software?
Compilation: Hardware-aware compilation
Scheduling: Performed several times, with varying precision
Design space exploration: Set of possible designs, not just one

Task-level concurrency management
Granularity: size of tasks (e.g. in instructions)
Readable specifications and efficient implementations may require different task structures. This calls for granularity changes.

Merging of tasks
- Reduced overhead of context switches
- More global optimization of machine code
- Reduced overhead for inter-process/task communication

Splitting of tasks
- No blocking of resources while waiting for input
- More flexibility for scheduling, possibly improved result

Merging and splitting of tasks
The most appropriate task graph granularity depends upon the context, so merging and splitting may be required. Merging and splitting of tasks should be done automatically, depending upon the context.

Automated rewriting of the task system - Example -

Attributes of a system that needs rewriting: tasks blocking after they have already started running

Work by Cortadella et al.
1. Transform each of the tasks into a Petri net,
2. Generate one global Petri net from the nets of the tasks,
3. Partition the global net into "sequences of transitions",
4. Generate one task from each such sequence.
A mature, commercial approach is not yet available.

Result, as published by Cortadella
[Figure annotations: Reads only at the beginning; Initialization task; Always true; Never true]

Optimized version of Tin

    Tin () {
        READ (IN, sample, 1);
        sum += sample; i++;
        DATA = sample; d = DATA;
    L0: if (i < N) return;
        DATA = sum/N; d = DATA;
        d = d*c;
        WRITE(OUT,d,1);
        sum = 0; i = 0;
        return;
    }

[Figure annotations: Always true; j==i-1; j  i; Never true]
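The behavior of the merged task can be tried out on a host with a small harness. The sketch below is only an illustration: READ and WRITE are modeled as hypothetical macros over arrays, and the window N = 4, the coefficient c = 2, the input samples, and the int types are made-up assumptions, none of which come from the slide. The unused label L0 is dropped.

```c
#include <assert.h>

#define N 4                      /* assumed averaging window */
#define IN_LEN 8                 /* assumed number of test samples */

static const int c = 2;          /* assumed scaling coefficient */
static const int in_buf[IN_LEN] = {1, 2, 3, 4, 5, 6, 7, 8};
static int out_buf[IN_LEN];
static int in_pos, out_pos;

/* Hypothetical stand-ins for the port primitives: one item per call. */
#define READ(port, var, n) \
    do { assert(in_pos < IN_LEN); (var) = in_buf[in_pos++]; } while (0)
#define WRITE(port, var, n) \
    do { assert(out_pos < IN_LEN); out_buf[out_pos++] = (var); } while (0)

static int sample, sum, i, DATA, d;

/* Body as on the slide: accumulate N samples, then emit their scaled mean. */
static void Tin(void)
{
    READ(IN, sample, 1);
    sum += sample; i++;
    DATA = sample; d = DATA;
    if (i < N) return;           /* keep accumulating until N samples are in */
    DATA = sum / N; d = DATA;    /* integer mean of the last N samples */
    d = d * c;
    WRITE(OUT, d, 1);
    sum = 0; i = 0;
}
```

Calling Tin() once per input sample produces one output for every N inputs, which is the point of the merged task: there is no blocking read in the middle of the task body.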

Floating-point to fixed-point conversion
Pros
- Lower cost
- Faster
- Lower power consumption
- Sufficient SQNR, if properly scaled
- Suitable for portable applications
Cons
- Decreased dynamic range
- Finite word-length effect, unless properly scaled: overflow and excessive quantization noise
- Extra programming effort
© Ki-Il Kum, et al. (Seoul National University): A Floating-point To Fixed-point C Converter For Fixed-point Digital Signal Processors, 2nd SUIF Workshop, 1996

Fixed-Point Data Format
Floating-Point vs. Fixed-Point
- exponent, mantissa
- Floating-Point: automatic computation and update of each exponent at run-time
- Fixed-Point: implicit exponent determined off-line
Integer vs. Fixed-Point
[Figure: (a) integer word with sign bit S; (b) fixed-point word with sign bit S, hypothetical binary point, IWL=3, and FWL]
© Ki-Il Kum, et al.
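As a rough illustration of the format, the following sketch converts between real values and fixed-point words. It assumes a 16-bit word made up of one sign bit, IWL integer bits, and FWL = 15 - IWL fractional bits; the word length, the saturating behavior, and the names fwl_of, to_fixed, and to_real are assumptions of this sketch, not part of the slide.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 16   /* assumed word length: 1 sign bit + IWL + FWL */

/* Fractional word length implied by a given integer word length. */
static int fwl_of(int iwl)
{
    assert(iwl >= 0 && iwl < WORD_BITS);
    return WORD_BITS - 1 - iwl;
}

/* Quantize a real value to a fixed-point word with the given IWL.
   Truncates toward minus infinity and saturates when out of range. */
static int16_t to_fixed(double v, int iwl)
{
    double scaled = v * (double)(1L << fwl_of(iwl));
    if (scaled >= 32767.0) return INT16_MAX;
    if (scaled <= -32768.0) return INT16_MIN;
    long q = (long)scaled;          /* cast truncates toward zero */
    if ((double)q > scaled) q--;    /* step down to the floor for negatives */
    return (int16_t)q;
}

/* Interpret a fixed-point word with the given IWL as a real value. */
static double to_real(int16_t w, int iwl)
{
    return (double)w / (double)(1L << fwl_of(iwl));
}
```

With IWL = 3 the representable range is roughly -8 to +8 in steps of 2^-12. With IWL = 0, the value 0.9 maps to the word 29491.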

Assignment and Addition/Subtraction
Assume y = x, with
- x (IWL=2) and
- y (IWL=3):
[Figure: x is shifted right by one bit (x>>1) to match the format of y]
Let result = x + y: equalizing each IWL
[Figure: x>>1 and y are added to give result]
© Ki-Il Kum, et al.
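The alignment rule can be sketched as follows, again assuming 16-bit words. The helpers align_iwl and fixed_add are hypothetical names. The sketch only covers the case shown on the slide, where the operand with the smaller IWL is shifted right, and it does not handle overflow of the sum.

```c
#include <assert.h>
#include <stdint.h>

/* Re-express a word stored with from_iwl integer bits in a format with
   to_iwl >= from_iwl integer bits. Each extra integer bit costs one
   fractional bit, hence the right shift. Right-shifting a negative value
   is implementation-defined in C; an arithmetic shift is assumed. */
static int16_t align_iwl(int16_t w, int from_iwl, int to_iwl)
{
    assert(from_iwl >= 0 && to_iwl >= from_iwl && to_iwl < 16);
    return (int16_t)(w >> (to_iwl - from_iwl));
}

/* Add two fixed-point words after bringing both to the larger IWL.
   The IWL of the result is returned through res_iwl. */
static int16_t fixed_add(int16_t x, int x_iwl, int16_t y, int y_iwl,
                         int *res_iwl)
{
    int iwl = x_iwl > y_iwl ? x_iwl : y_iwl;
    *res_iwl = iwl;
    return (int16_t)(align_iwl(x, x_iwl, iwl) + align_iwl(y, y_iwl, iwl));
}
```

For example, 1.5 stored with IWL=2 is the word 12288 and 2.25 stored with IWL=3 is the word 9216. Aligning x gives 6144, and the sum 15360 represents 3.75 with IWL=3.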

Development Procedure
[Figure: Floating-Point C Program -> Range Estimator -> Range Estimation C Program -> Execution -> IWL information (plus manual specification) -> Floating-Point to Fixed-Point C Program Converter -> Fixed-Point C Program]
© Ki-Il Kum, et al.

Range Estimator
[Figure: Floating-Point C Program -> C pre-processor -> C front-end -> ID assignment -> Subroutine call insertion -> SUIF-to-C converter -> Range Estimation C Program -> Execution -> IWL Information]
Range Estimation C Program:

    float iir1(float x)
    {
        static float s = 0;
        float y;
        y = 0.9 * s + x;
        range(y, 0);
        s = y;
        range(s, 1);
        return y;
    }

© Ki-Il Kum, et al.
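One simple way the inserted range() calls could collect IWL information is sketched below. It merely tracks the largest magnitude seen per variable ID and picks the smallest IWL that covers it. The published estimator may use different statistics, and MAX_IDS, max_abs, and iwl_for are hypothetical names introduced here.

```c
#include <assert.h>

#define MAX_IDS 16                 /* assumed bound on instrumented variables */

static double max_abs[MAX_IDS];    /* largest magnitude seen per variable ID */

/* Stand-in for the inserted range() call: record |v| for variable id. */
static void range(double v, int id)
{
    assert(id >= 0 && id < MAX_IDS);
    double a = v < 0 ? -v : v;
    if (a > max_abs[id]) max_abs[id] = a;
}

/* Smallest IWL such that every observed value satisfies |v| < 2^IWL. */
static int iwl_for(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    int iwl = 0;
    double limit = 1.0;
    while (max_abs[id] >= limit) { limit *= 2.0; iwl++; }
    return iwl;
}

/* Instrumented filter, following the slide (with a float literal). */
static float iir1(float x)
{
    static float s = 0;
    float y;
    y = 0.9f * s + x;
    range(y, 0);
    s = y;
    range(s, 1);
    return y;
}
```

Driving the filter with a constant input of 1.0 lets y approach 10, so this sketch reports an IWL of 4 for both y and s.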

Floating-Point to Fixed-Point Program Converter
Fixed-Point C Program:

    int iir1(int x)
    {
        static int s = 0;
        int y;
        y = sll(mulh(29491, s) + (x >> 5), 1);
        s = y;
        return y;
    }

mulh
- to access the upper half of the multiplied result
- target dependent implementation
sll
- to check runtime overflows
© Ki-Il Kum, et al.
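The two helpers might look as follows on a host, assuming a 16-bit target word (suggested by the constant 29491, which is approximately 0.9 · 2^15). The saturation in sll and the overflow_count counter are choices of this sketch, not taken from the slide. One consistent reading of the shifts, inferred rather than stated on the slide, is that x has IWL=0 and s and y have IWL=4.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits of the 32-bit product of two 16-bit words.
   An arithmetic right shift is assumed for negative products. */
static int16_t mulh(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (int16_t)(p >> 16);
}

static int overflow_count = 0;   /* hypothetical record of runtime overflows */

/* Shift left by n bits. Results that no longer fit in 16 bits are counted
   and saturated (saturation is this sketch's choice). */
static int16_t sll(int16_t w, int n)
{
    assert(n >= 0 && n < 16);
    int32_t r = (int32_t)w * (1 << n);   /* multiply: defined for negative w */
    if (r > INT16_MAX) { overflow_count++; return INT16_MAX; }
    if (r < INT16_MIN) { overflow_count++; return INT16_MIN; }
    return (int16_t)r;
}

/* Fixed-point filter from the slide, with explicit 16-bit types. */
static int16_t iir1(int16_t x)
{
    static int16_t s = 0;
    int16_t y;
    y = sll((int16_t)(mulh(29491, s) + (x >> 5)), 1);
    s = y;
    return y;
}
```

Under that reading, an input of 16384 (0.5) yields 1024 (0.5 with 11 fractional bits) on the first call and 1944 (about 0.949, against the exact 0.95) on the second.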

Performance Comparison - Machine Cycles - © Ki-Il Kum, et al.

Performance Comparison - Machine Cycles - (continued) © Ki-Il Kum, et al.

Performance Comparison - SNR - © Ki-Il Kum, et al.

Fundamental considerations of tradeoffs by Brodersen (Berkeley)

Fridge
RWTH Aachen, commercialized by Synopsys as part of the CoCentric tool suite.
Used type definition features of C++ to define types Fixed and fixed.
Using types in declarations: fixed a, *b, c[8]
Defining types in assignments: a = fixed(5,4,wt,*b)
[Figure labels for the arguments of fixed(5,4,wt,*b): word-length, fractional word-length, wrap-around, truncation]
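The effect of such a type annotation can be imitated in plain C, as in the sketch below. It assumes that 5 is the total word length, 4 the fractional word length, and that "wt" selects wrap-around on overflow and truncation of excess fractional bits. That mapping of the arguments is an inference from the figure labels, and quantize_wt is a hypothetical helper, not part of Fridge.

```c
#include <assert.h>
#include <stdint.h>

/* Quantize v to a signed format with wl bits in total, fwl of them
   fractional. Excess fractional bits are truncated (toward minus infinity)
   and out-of-range values wrap around. Inputs are assumed to be moderate
   enough for the scaled value to fit in 64 bits. */
static double quantize_wt(double v, int wl, int fwl)
{
    assert(wl > 0 && wl < 31 && fwl >= 0 && fwl < 31);
    double scaled = v * (double)(1L << fwl);
    int64_t q = (int64_t)scaled;                   /* truncates toward zero */
    if ((double)q > scaled) q--;                   /* floor for negatives */
    int64_t span = (int64_t)1 << wl;               /* number of codes */
    int64_t half = span / 2;
    q = ((q + half) % span + span) % span - half;  /* wrap into [-half, half) */
    return (double)q / (double)(1L << fwl);
}
```

With wl = 5 and fwl = 4 the representable values run from -1 to 0.9375 in steps of 1/16, so 0.3 becomes 0.25 and 1.0 wraps around to -1.0.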

Other work on the topic
- Fridge (RWTH Aachen), commercialized by Synopsys
- Some support in Simulink (MATLAB toolbox)
- Hundreds of papers on the topic