Hardware-Software Codesign
Elvira Kitsis, Hermawan Ho, Alex Papadimoulis

HW/SW Codesign Introduction
- Unified design of hardware and software systems
- All design is based on a logical model with no HW/SW partition
- The logical model is maintained throughout the design process
- Concurrent design: HW and SW are optimized for peak performance

HW/SW Codesign Origins
- Originated in the field of embedded systems
- Demand for consumer information appliances (cell phones, PDAs)
- Demand for specialized industrial products
- Designers developed new tools and techniques to satisfy this demand
- These became HW/SW codesign

Traditional Systems Design
- Early, key decision: the HW/SW partition
- The partition must be kept; changes require extensive redesign of both HW and SW
- Lacks a well-defined HW/SW interface data flow
- Leads to suboptimal designs and a longer design-to-market time

HW/SW Codesign - A Solution
- HW/SW codesign alleviates traditional design issues
- Maps the system specification to a mixed HW/SW implementation:
  - Conventional SW on a RISC processor
  - ASICs (Application-Specific Integrated Circuits)

Practical Implementation of Hardware Software Codesign
Elvira Kitsis

Practical implementation of hardware/software co-design
- The purpose of hardware/software co-design
- Four common approaches to the task of hardware/software co-design:
  - unbiased
  - hardware-biased
  - software-biased
  - hardware acceleration

Co-design development routine
- Objectives of the development routine
- The first stage is to determine the performance-critical section of a C program using a profiler tool and routine system, as described in Fig. 1.

Next step
- The next step in the development routine is to implement the critical section in hardware, as shown in Fig. 2.

Limitations on the type of C code
- All C types must be mapped to 16/32-bit signed integers in HardwareC
- Type qualifiers, enumerated types, unions and structures are NOT permitted
- Global variables are NOT allowed
- Function parameters may consist of simple types, pointer types and data arrays only
- No support for "gotos"
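
As an illustration of these restrictions, here is a minimal sketch of a critical section written in the C subset they allow. The function `sum_abs_diff` is a hypothetical example, not taken from the presentation: it uses only `int`, array parameters, local variables and structured control flow, with no qualifiers, structs, globals or gotos. The check function is ordinary host-side code and is not subject to the restrictions.

```cpp
#include <cassert>

// Hypothetical critical section: sum of absolute differences of two arrays.
// Only int values, array parameters and local variables are used, so every
// type can be mapped to a 16/32-bit signed integer.
int sum_abs_diff(int a[], int b[], int n) {
    int total = 0;
    int i;
    for (i = 0; i < n; i++) {
        int d = a[i] - b[i];
        if (d < 0) {
            d = -d;  // absolute value without a library call
        }
        total = total + d;
    }
    return total;
}

// Host-side sanity check (plain C++, outside the restricted subset).
void check_sum_abs_diff() {
    int a[4] = {1, 5, 3, 9};
    int b[4] = {2, 3, 3, 4};
    assert(sum_abs_diff(a, b, 4) == 8);  // 1 + 2 + 0 + 5
}
```

A function of this shape can be profiled in software first and then handed to the hardware flow unchanged, which is the point of keeping the critical section inside the restricted subset.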

Test Results: execution time

                     Example 1   Example 2
Software-only        80 ms       114 ms
Software-hardware    47 ms       80 ms
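
To make the comparison concrete, the measured times can be turned into a speedup and a fraction of time saved. The helper names below are assumptions introduced for this sketch; only the four timings come from the slide.

```cpp
#include <cassert>

// Speedup of the mixed implementation: software-only time divided by
// software-hardware time.
double speedup(double software_only_ms, double software_hardware_ms) {
    assert(software_hardware_ms > 0.0);
    return software_only_ms / software_hardware_ms;
}

// Fraction of the software-only execution time that is saved.
double time_saved_fraction(double software_only_ms, double software_hardware_ms) {
    assert(software_only_ms > 0.0);
    return (software_only_ms - software_hardware_ms) / software_only_ms;
}
```

With the figures above, `speedup(80, 47)` is about 1.70 and `speedup(114, 80)` is 1.425, which corresponds to roughly 41% and 30% less execution time.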

Hardware or software?
- Performance
- Cost
- Form factor
- Flexibility
- Safety
- Architectural cleanness and simplicity

System Level Memory Optimization for Hardware-Software Co-Design
Hermawan Ho

Intro
- Multimedia applications require a considerable amount of memory.
- The goal is to reduce this dominant cost.
- Example: a quadtree-based image coding application.

Design Model
- If flexibility is not needed, one or more dedicated hardware processors can be designed to perform the functions in the cycle.
- When flexibility is needed, we can use data-level parallelism. This approach is simple to program, but the memory overhead is high.

Design Model
- Alternatively, we can use task-level parallelism.
- Advantage: the code size per processor is relatively low.
- Disadvantage: the design time is much higher because of the complex processor partitioning and memory management.

System Level Memory Optimization
- All functions are taken together in one big function.
- The algorithm operates block by block.
- All computations are done on the first block before the next block is started.
- Buffer memory for only one block is required between the submodules.
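
The effect on buffer size can be sketched as follows. The two stages `stage_a` and `stage_b` are hypothetical placeholders for the submodules, and the functions report how many intermediate elements each schedule needs; this is an illustration under those assumptions, not the optimization flow from the presentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Two hypothetical submodules of the pipeline.
int stage_a(int x) { return x * 2; }
int stage_b(int x) { return x + 1; }

// Stage by stage: the whole frame passes through stage A before stage B
// starts, so the intermediate buffer is as large as the frame.
std::vector<int> run_frame_at_a_time(const std::vector<int>& frame,
                                     std::size_t& buffer_elems) {
    std::vector<int> intermediate(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) intermediate[i] = stage_a(frame[i]);
    buffer_elems = intermediate.size();
    std::vector<int> out(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) out[i] = stage_b(intermediate[i]);
    return out;
}

// Block by block: both stages finish one block before the next block is
// touched, so the intermediate buffer holds a single block.
std::vector<int> run_block_at_a_time(const std::vector<int>& frame,
                                     std::size_t block_size,
                                     std::size_t& buffer_elems) {
    assert(block_size > 0);
    std::vector<int> block_buffer(block_size);
    buffer_elems = block_buffer.size();
    std::vector<int> out(frame.size());
    for (std::size_t start = 0; start < frame.size(); start += block_size) {
        const std::size_t len = std::min(block_size, frame.size() - start);
        for (std::size_t i = 0; i < len; ++i) block_buffer[i] = stage_a(frame[start + i]);
        for (std::size_t i = 0; i < len; ++i) out[start + i] = stage_b(block_buffer[i]);
    }
    return out;
}
```

Both schedules produce the same output; for a 64-element frame and 16-element blocks the intermediate storage drops from 64 elements to 16.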

QSDPCM
- QSDPCM (Quadtree Structured Difference Pulse Code Modulation) is a compression technique for video.
- The algorithm optimizes the displacement vector and the quadtree mean decomposition jointly.
- The displacement that requires the minimum number of bits for the quadtree decomposition is selected.
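
The selection rule in the last point can be sketched as a search over candidate displacements, scoring each by the bit cost of a quadtree mean decomposition of the difference block. Everything specific here is an assumption made for illustration: the cost model (9 bits per leaf, 1 bit per split), the `tolerance` parameter and the names `quadtree_bits` and `best_displacement`. It is not the published QSDPCM coder.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using Image = std::vector<std::vector<int>>;  // indexed as image[y][x]

// Toy cost model (assumption): a leaf costs 9 bits (1 flag + 8-bit mean),
// a split costs 1 flag bit plus the cost of its four quadrants.
int quadtree_bits(const Image& diff, int x0, int y0, int size, int tolerance) {
    long sum = 0;
    for (int y = y0; y < y0 + size; ++y)
        for (int x = x0; x < x0 + size; ++x) sum += diff[y][x];
    const int mean = static_cast<int>(sum / (size * size));
    bool flat = true;
    for (int y = y0; y < y0 + size; ++y)
        for (int x = x0; x < x0 + size; ++x)
            if (std::abs(diff[y][x] - mean) > tolerance) flat = false;
    if (flat || size == 1) return 9;
    const int half = size / 2;
    return 1 + quadtree_bits(diff, x0, y0, half, tolerance)
             + quadtree_bits(diff, x0 + half, y0, half, tolerance)
             + quadtree_bits(diff, x0, y0 + half, half, tolerance)
             + quadtree_bits(diff, x0 + half, y0 + half, half, tolerance);
}

struct Displacement { int dx; int dy; int bits; };

// Tries every displacement in [-range, range] x [-range, range] and keeps the
// one whose difference block is cheapest to code. n must be a power of two.
Displacement best_displacement(const Image& cur, const Image& ref,
                               int bx, int by, int n, int range, int tolerance) {
    const int h = static_cast<int>(ref.size());
    const int w = static_cast<int>(ref[0].size());
    Displacement best = {0, 0, -1};  // bits < 0 means "nothing found yet"
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            // Skip candidates that would read outside the reference frame.
            if (bx + dx < 0 || by + dy < 0 || bx + dx + n > w || by + dy + n > h) continue;
            Image diff(n, std::vector<int>(n));
            for (int y = 0; y < n; ++y)
                for (int x = 0; x < n; ++x)
                    diff[y][x] = cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx];
            const int bits = quadtree_bits(diff, 0, 0, n, tolerance);
            if (best.bits < 0 || bits < best.bits) best = {dx, dy, bits};
        }
    }
    return best;
}
```

The design choice to note is that the displacement is judged by coding cost rather than by a separate error measure, which is what makes the two decisions joint.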

Summary
- If HW/SW partitioning is performed first, the remaining buffers can no longer be optimized away afterwards.
- For the QSDPCM application, memory optimization does much better when it is done before HW/SW partitioning.

The Design of Mixed Hardware/Software Systems

Mixed Hardware/Software Systems
- Many digital systems contain both hardware and software.
- Combining hardware and software design tasks has several advantages:
  - It may accelerate the design process.
  - It may enable hardware/software trade-offs to be made dynamically, as the design progresses.

Mixed Hardware/Software Systems
- Unless the hardware and software are designed together, we do not think of the result as a mixed hardware/software system.
- The distinguishing factor is whether the boundary between hardware and software is a logical boundary or a physical boundary.

Simulation of Hardware/Software Systems
- Presents the problem of modeling the behavior of a system based on the behavior of its hardware and software components.
- Requires a simulation environment that understands the semantics of both the software and the hardware components.

Automated Hardware/Software Co-Synthesis
- Allows the designer to explore more of the design space by dynamically reconfiguring the hardware and software.
- A challenge for hardware/software co-synthesis is that hardware and software are often described using different languages and formalisms.

Automated Hardware/Software Co-Synthesis
- May include hardware/software partitioning.
- Some of the considerations are:
  - Performance requirements
  - Implementation cost
  - Modifiability
  - Nature of computation
  - Concurrency
  - Communication
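
As a rough illustration of how two of these considerations (performance and implementation cost) can drive a partition, the sketch below uses a simple greedy heuristic: tasks are moved to hardware in order of time saved per unit of hardware area until an area budget is exhausted. The heuristic, the `Task` fields and the function name `partition` are assumptions chosen for this example. The presentation does not name a partitioning algorithm, and modifiability, concurrency and communication are ignored here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Task {
    std::string name;
    double sw_ms;    // execution time when implemented in software
    double hw_ms;    // execution time when implemented in hardware
    double hw_area;  // hardware cost in arbitrary area units
    bool in_hw;      // result of the partitioning
};

// Greedy partitioning: returns the total execution time after moving the
// most profitable tasks (time saved per area unit) into hardware.
double partition(std::vector<Task>& tasks, double area_budget) {
    std::vector<Task*> candidates;
    for (Task& t : tasks) {
        t.in_hw = false;
        if (t.hw_area > 0.0 && t.sw_ms > t.hw_ms) candidates.push_back(&t);
    }
    std::sort(candidates.begin(), candidates.end(),
              [](const Task* a, const Task* b) {
                  return (a->sw_ms - a->hw_ms) / a->hw_area >
                         (b->sw_ms - b->hw_ms) / b->hw_area;
              });
    double used_area = 0.0;
    for (Task* t : candidates) {
        if (used_area + t->hw_area <= area_budget) {
            t->in_hw = true;
            used_area += t->hw_area;
        }
    }
    double total_ms = 0.0;
    for (const Task& t : tasks) total_ms += t.in_hw ? t.hw_ms : t.sw_ms;
    return total_ms;
}
```

A greedy pass like this is cheap enough to rerun whenever an estimate changes, which fits the idea of revisiting the partition as the design progresses, but it does not guarantee an optimal result.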

Several Examples of Hardware/Software Co-Design
- Embedded microprocessor systems
- Heterogeneous multiprocessing systems
- Application-specific instruction set processors
- Special-purpose functional units
- Application-specific co-processors

Using HW/SW Codesign
Alex Papadimoulis

OOP & HW/SW Codesign
- Develop the entire system in an object-oriented programming language
- Treat hardware as an object
- Allows for a unified design environment
- HW functions can be simulated in a SW object and implemented concurrently

Problems with OOP
- Synchronizing sequential code
- Interleaved SW and HW functions
- HW needs to know exactly when a data object is ready to be worked on
- The same holds true for SW

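C++ Class Library – Cylib
- Handles this synchronization problem
- Clock function and Done flag:

    // Start the hardware operation on the data object.
    objHardware.Modify(objData, blnDone);
    // Clock the hardware until it raises the Done flag.
    while (!blnDone) {
        objHardware.clock();
    }
    // The data object is now ready for software.
    SoftwareFunction(objData);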
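
A self-contained sketch of the clock-and-done-flag pattern is given below. The class `HardwareModel` and its behavior (incrementing one element per clock cycle) are hypothetical stand-ins. The actual Cylib interface is not shown in the presentation, so only the calling pattern is taken from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical software model of a hardware block that adds 1 to one
// element of the data object per clock cycle.
class HardwareModel {
public:
    // Starts an operation; `done` is set to true by clock() when it finishes.
    void modify(std::vector<int>& data, bool& done) {
        data_ = &data;
        done_ = &done;
        index_ = 0;
        done = data.empty();
    }

    // Advances the model by one clock cycle.
    void clock() {
        if (data_ == nullptr || *done_) return;
        (*data_)[index_] += 1;
        ++index_;
        ++cycles_;
        if (index_ == data_->size()) *done_ = true;
    }

    unsigned long cycles() const { return cycles_; }

private:
    std::vector<int>* data_ = nullptr;
    bool* done_ = nullptr;
    std::size_t index_ = 0;
    unsigned long cycles_ = 0;
};

// Software step that must not run until the hardware is finished.
int software_function(const std::vector<int>& data) {
    int sum = 0;
    for (int v : data) sum += v;
    return sum;
}

// Same calling pattern as on the slide: start, clock until done, continue.
int run_codesign_step(std::vector<int>& data) {
    HardwareModel hw;
    bool done = false;
    hw.modify(data, done);
    while (!done) {
        hw.clock();
    }
    return software_function(data);
}
```

Because the software side only sees `modify`, `clock` and the done flag, the body of the model could be replaced by a different hardware design without touching the calling code.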

C++ Class Library – Cylib 2
- The approach is similar to interrupts
- Complexity is greatly reduced
- The interface allows HW/SW objects to work hierarchically and in parallel
- Modifying the HW design requires changing only the class library

Another Approach: Compiler Generation
- The OOP approach won't work for all cases
- Example: MPU architecture changes
- Traditional MPU replacement offers two options:
  - Backwards-compatible hardware: simply increase the speed of functions, with no new functionality
  - Rewrite the compiler, which is very costly

Compiler Generation
- Theory: a third option, generate the compiler
- With radical architecture changes, compilers wouldn't need time to catch up
- Ideal for user-defined processors
- Extract HW architecture information, then generate optimized executable code from a high-level language

Compiler Generation
- Retargetable compilers exist
  - They require significant human skill
  - They are simply a superset of all CPU instructions
- A compiler generator would
  - Overcome retargetable compiler limitations
  - Maintain the quality (speed, size, compilation time) of a conventional compiler

How it works 1
- Optimize front-end code
- Architecture-independent step
- Performed by conventional compilers
- Passes an optimized grammar tree structure to the next step

How it works 2
- Get parameterized architecture info
  - Number of general registers, memory word size, instruction behavior, etc.
- Modify tree branches
  - Using an existing language ("twig")
- Translate into pattern functions
- Allocate registers
- Generate executable code
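
The back-end steps above can be sketched for a toy case: an expression tree is walked, registers are allocated from a pool whose size comes from a parameterized architecture description, and pseudo-assembly is emitted. The `ArchInfo` struct, the three-operand mnemonics and the stack-style register allocator are all assumptions made for this sketch. It does not use twig or reproduce the generator described in the presentation.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Parameterized architecture info (assumption: only the register count).
struct ArchInfo { int num_registers; };

struct Node {
    char op;    // '+', '*', or 0 for a constant leaf
    int value;  // used only by leaves
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> leaf(int v) {
    std::unique_ptr<Node> n(new Node());
    n->op = 0;
    n->value = v;
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    std::unique_ptr<Node> n(new Node());
    n->op = op;
    n->value = 0;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

class CodeGen {
public:
    explicit CodeGen(ArchInfo arch) : arch_(arch), next_free_(0) {}

    // Emits code for the subtree and returns the register holding its value.
    int emit(const Node& n) {
        if (n.op == 0) {
            const int r = alloc();
            code_.push_back("LOADI r" + std::to_string(r) + ", " + std::to_string(n.value));
            return r;
        }
        const int l = emit(*n.left);
        const int r = emit(*n.right);
        const std::string mnemonic = (n.op == '+') ? "ADD" : "MUL";
        code_.push_back(mnemonic + " r" + std::to_string(l) + ", r" +
                        std::to_string(l) + ", r" + std::to_string(r));
        --next_free_;  // the right operand's register is the most recent one
        return l;
    }

    const std::vector<std::string>& code() const { return code_; }

private:
    int alloc() {
        if (next_free_ >= arch_.num_registers) throw std::runtime_error("out of registers");
        return next_free_++;
    }

    ArchInfo arch_;
    int next_free_;
    std::vector<std::string> code_;
};
```

For the tree of `(2 + 3) * 4` on a two-register machine, `emit` produces five instructions ending in `MUL r0, r0, r1`; with a single register it throws, which is where a real generator would have to spill or reorder.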

Compiler Gen. Conclusion
- Requires a lot of work:
  - Pipelined compilation techniques
  - Automated architecture information extraction (perhaps from HDL, etc.)
- Experiments indicate that the concept could be used in HW/SW codesign in the future